Ollama supports importing models and adapters from various formats, allowing you to use custom fine-tuned models or convert models from other frameworks.

Supported Formats

Ollama can import:
  • Safetensors models - Full models in Safetensors format
  • Safetensors adapters - Fine-tuned LoRA adapters
  • GGUF files - Pre-quantized models and adapters from llama.cpp

Supported Architectures

Ollama supports importing models based on several architectures:

Llama

Llama 2, Llama 3, Llama 3.1, Llama 3.2, and Llama 4

Mistral

Mistral 1, Mistral 2, Mistral 3, and Mixtral

Gemma

Gemma 1, Gemma 2, and Gemma 3

Other

Phi3, Qwen2, Qwen3, Command R, and more

Importing Safetensors Models

Import full models from Safetensors weights into Ollama.
1. Create a Modelfile

Create a Modelfile pointing to your Safetensors directory:
FROM /path/to/safetensors/directory
If the Modelfile is in the same directory as the weights:
FROM .
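
As a point of reference, a Safetensors checkpoint directory (for example, one downloaded from Hugging Face) typically contains the weight shards along with the config and tokenizer files needed for import. The file names below are illustrative; shard counts vary by model:

```
model/
├── config.json                         # architecture and hyperparameters
├── model-00001-of-00002.safetensors    # weight shards
├── model-00002-of-00002.safetensors
├── tokenizer.json                      # tokenizer definition
└── tokenizer_config.json
```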
2. Create the model

Run ollama create from the directory containing your Modelfile:
ollama create my-model
3. Test the model

Verify the model works:
ollama run my-model
This method works for foundation models and fine-tuned models that have been fused with a foundation model.

Importing Fine-Tuned Adapters

Import LoRA adapters created through fine-tuning frameworks.
1. Create a Modelfile with an adapter

Specify the base model and adapter path:
FROM <base model name>
ADAPTER /path/to/safetensors/adapter/directory
If your adapter is in the same directory:
FROM llama3.2
ADAPTER .
2. Create the model

ollama create my-finetuned-model
3. Run the model

ollama run my-finetuned-model
Use the same base model in the FROM command that was used to create the adapter. Mismatched base models will produce erratic results.

Supported Fine-Tuning Frameworks

Safetensors adapters can be created with most popular fine-tuning frameworks. Use non-quantized (non-QLoRA) adapters for best results, as different frameworks use different quantization methods.

Importing GGUF Models

Import pre-quantized GGUF models or adapters from llama.cpp or Hugging Face.

GGUF Model

Create a Modelfile pointing to the GGUF file:
FROM /path/to/model.gguf
Then create the model:
ollama create my-gguf-model

GGUF Adapter

For GGUF adapters, specify both the base model and adapter:
FROM <model name>
ADAPTER /path/to/adapter.gguf
The base model can be:
  • An Ollama model (e.g., llama3.2)
  • A GGUF file
  • A Safetensors model
When importing GGUF adapters, ensure the base model matches the one used to create the adapter.
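
For example, a Modelfile that layers a GGUF adapter on top of a GGUF base model might look like the following (both paths are illustrative):

```
FROM ./llama-3.2-base.gguf
ADAPTER ./my-lora-adapter.gguf
```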

Converting to GGUF

You can convert models to GGUF format using llama.cpp tools:
  • Models: Use convert_hf_to_gguf.py from llama.cpp
  • Adapters: Use convert_lora_to_gguf.py from llama.cpp
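
Both conversion scripts live in the llama.cpp repository. A typical invocation looks like the sketch below; the flags shown are the common ones, but they can change between llama.cpp versions, so check each script's `--help` output:

```shell
# Clone llama.cpp to get the conversion scripts and their dependencies
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert a full Hugging Face model directory to a GGUF file (FP16)
python llama.cpp/convert_hf_to_gguf.py ./model \
    --outfile model.gguf --outtype f16

# Convert a LoRA adapter to GGUF, pointing at the base model it was trained on
python llama.cpp/convert_lora_to_gguf.py ./adapter \
    --base ./model --outfile adapter.gguf
```

The resulting `model.gguf` or `adapter.gguf` can then be referenced from a Modelfile with FROM or ADAPTER as described above.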

Quantizing During Import

Quantize FP16 or FP32 models during import to reduce memory usage and improve inference speed, at a small cost in quality.
1. Create a Modelfile with the FP16/FP32 model

FROM /path/to/my/model/fp16
2. Create with quantization

Use the --quantize or -q flag:
ollama create --quantize q4_K_M mymodel

Supported Quantization Levels

  • q4_K_M - 4-bit K-means quantization (medium); recommended balance of size and quality
  • q4_K_S - 4-bit K-means quantization (small); smaller size, lower quality
  • q8_0 - 8-bit quantization; higher quality, larger size
See the Model Quantization page for detailed information about quantization levels and performance trade-offs.

Sharing Models

After creating your model, share it on ollama.com so others can use it.
1. Create an account

Visit ollama.com/signup to create an account. Your username will be part of your model’s name (e.g., username/modelname).
2. Add your public key

Go to Ollama Keys Settings and add your Ollama public key. The page will show you where to find your public key on your system.
3. Copy and push your model

Rename your model to include your username:
ollama cp mymodel myusername/mymodel
ollama push myusername/mymodel
Once pushed, others can use your model:
ollama run myusername/mymodel

Examples

Import a Hugging Face Model

# Download a model from Hugging Face
huggingface-cli download username/model-name --local-dir ./model

# Create Modelfile
echo "FROM ./model" > Modelfile

# Import to Ollama
ollama create hf-model

Import with Custom Parameters

FROM ./llama-model
TEMPLATE """{{- if .System }}<|system|>{{ .System }}<|end|>{{- end }}
{{- if .Prompt }}<|user|>{{ .Prompt }}<|end|>{{- end }}
<|assistant|>"""
PARAMETER temperature 0.8
PARAMETER top_p 0.9

Import a Fine-Tuned Adapter

FROM llama3.2
ADAPTER ./my-finetuned-adapter
SYSTEM "You are a helpful coding assistant specialized in Python."
Your imported model is ready to use with ollama run, the API, or any Ollama client library!
