I am trying to run DeepSeek locally according to their instructions, but it fails with what looks like a trivial error (shown below). This is what I am doing.
- Download the smallest model (3.5GB) from here .5B
- Follow the steps from here: #6-how-to-run-locally
2.1 Get this project .git
2.2 Run a Docker container like this, with a pre-created volume to hold the model:
docker run --gpus all -it --name deepseek01 --rm --mount source=deepseekv3,target=/root/deepseekv3 python:3.10-slim bash
I am using python:3.10-slim because here (#6-how-to-run-locally) it is written "Linux with Python 3.10 only. Mac and Windows are not supported."
2.3 Install the latest updates: apt-get update
2.4 Get this file .txt and install the requirements:
pip install -r requirements.txt
2.5 Copy the model into the volume mounted in the Docker container: these 5 files from here .5B
config.json
generation_config.json
model.safetensors
tokenizer.json
tokenizer_config.json
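As a sanity check before the conversion step, the copy above can be verified with a short stdlib-only sketch (the directory path and file names are the ones assumed in these steps):

```python
# Sketch: verify that all five files landed in the mounted volume.
# The path matches the --hf-ckpt-path used in the conversion step below.
from pathlib import Path

REQUIRED = ["config.json", "generation_config.json", "model.safetensors",
            "tokenizer.json", "tokenizer_config.json"]

def missing_files(ckpt_dir="/root/deepseekv3/source_model"):
    """Return the names of required files that are absent from ckpt_dir."""
    d = Path(ckpt_dir)
    return [f for f in REQUIRED if not (d / f).exists()]

# An empty list means the checkpoint folder is complete.
```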
2.6 Convert the model as described here #model-weights-conversion with this command:
python convert.py --hf-ckpt-path /root/deepseekv3/source_model --save-path /root/deepseekv3/converted_model --n-experts 256 --model-parallel 16
In this step (converting the model) I get this error:
Traceback (most recent call last):
File "/root/deepseekv3/inference/convert.py", line 96, in <module>
main(args.hf_ckpt_path, args.save_path, args.n_experts, args.model_parallel)
File "/root/deepseekv3/inference/convert.py", line 63, in main
assert key in mapping
AssertionError
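To make the failure concrete: convert.py keeps a dict that maps checkpoint weight names to its own names, and asserts that every key it extracts from the checkpoint is in that dict. A stripped-down illustration (the two entries here are hypothetical, not the real mapping):

```python
# Toy illustration of the failing check; the real mapping in convert.py is
# larger and specific to the DeepSeek-V3 weight layout.
mapping = {"known_weight_a": "a", "known_weight_b": "b"}  # hypothetical entries

def unknown_keys(checkpoint_keys):
    # Same information as `assert key in mapping`, but collects all
    # offenders instead of aborting on the first one.
    return [k for k in checkpoint_keys if k not in mapping]

# unknown_keys(["known_weight_a", "embed_tokens"]) -> ["embed_tokens"]
```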
So the next steps do not make sense, as this is the essential step.
My questions:
- What am I doing wrong?
- There are some videos on YouTube where DeepSeek is installed with Ollama. Is Ollama really required? Shouldn't I be able to run the model without it, as described here: #6-how-to-run-locally?
UPDATE 1
To debug a bit, I added these 2 lines before the failing assertion.
print("Missing key:", key)
print("Available keys:", list(mapping.keys()))
The missing keys were identified as these:
embed_tokens
input_layernorm
down_proj
gate_proj
up_proj
post_attention_layernorm
k_proj
However, all of them do exist inside the model.safetensors file.
Also, @Hans Kilian mentioned in the comments that I might have put a file that is not needed into the source_model folder. I checked line 11 in convert.py: some of the keys there do not exist inside model.safetensors, but the logging reports different keys.
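To cross-check which names are actually stored in model.safetensors without loading the tensors, the file header can be read directly: the format starts with an 8-byte little-endian header length followed by a JSON header whose keys are the tensor names. Note that the stored names are full dotted paths (e.g. something like model.layers.0...), while the keys logged above are single components, so convert.py appears to be matching only one component of each name — which may explain why the logged keys look different from what is in the file. A stdlib-only sketch (the path is the one assumed in the steps above):

```python
# Sketch: list the tensor names stored in a .safetensors file by reading
# its JSON header directly (no torch/safetensors packages needed).
import json
import struct

def safetensors_names(path):
    """Return the sorted tensor names stored in a .safetensors file."""
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]   # 8-byte LE length
        header = json.loads(f.read(header_len).decode("utf-8"))
    return sorted(k for k in header if k != "__metadata__")

# e.g. safetensors_names("/root/deepseekv3/source_model/model.safetensors")
```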