How to Import Models from Hugging Face to Ollama



Introduction

Ollama is a powerful tool that simplifies the process of creating, running, and managing large language models (LLMs). Hugging Face is a machine learning platform that's home to nearly 500,000 open source models. This tutorial will guide you through the steps to import a new model from Hugging Face and create a custom Ollama model. Get access to the latest and greatest without having to wait for it to be published to Ollama's model library. Let's get started!

Prerequisites

Before getting started, make sure you have the following:

Ollama installed on your system

Hugging Face account (to download models)

Enough RAM/VRAM to load the model (the 7B example in this tutorial is a roughly 6 GB file at Q6_K; 16GB of memory is recommended)

4 Steps to Import Models from Hugging Face to Ollama

To download a model from the Hugging Face model hub and run it locally using Ollama on your GPU server, you can follow these steps:

Step 1: Download GGUF File

First, you need to download the GGUF file of the model you want from Hugging Face. For this tutorial, we’ll use the bartowski/Starling-LM-7B-beta-GGUF model as an example.

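You can use git to clone the repository. Cloning downloads every GGUF file in the repository, not only the quantization you need, so make sure you have enough disk space: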

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install

git clone https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
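If you only need one quantization, you can fetch that single file instead of cloning everything. The following is a minimal sketch, assuming curl is installed and that the Hub serves files under the usual resolve/main URL pattern. The variable names and the download_gguf helper are illustrative and not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: fetch a single GGUF file rather than cloning the whole repository.
# REPO and FILE match this tutorial's example; download_gguf is an illustrative helper.
REPO="bartowski/Starling-LM-7B-beta-GGUF"
FILE="Starling-LM-7B-beta-Q6_K.gguf"

# Assumption: files on the Hub are served from <repo>/resolve/<branch>/<file>.
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"

download_gguf() {
  # -L follows redirects to the storage backend; -C - resumes a partial download.
  curl -L -C - -o "$FILE" "$URL"
}

# Print the URL so you can check it before starting a multi-gigabyte download.
echo "$URL"
```

Run download_gguf after sourcing the script to start the download. The file is saved in the current directory under the name in FILE.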

Step 2: Create the Modelfile

Next, create a Modelfile configuration that defines the model's behavior. Here's an example:

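# Modelfile
FROM "./Starling-LM-7B-beta-Q6_K.gguf"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
TEMPLATE """
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

Replace ./Starling-LM-7B-beta-Q6_K.gguf with the path to the GGUF file you downloaded (if you cloned the repository, the file is inside the Starling-LM-7B-beta-GGUF directory). The TEMPLATE block defines the prompt format using system, user, and assistant roles; {{ .System }} and {{ .Prompt }} are Ollama template variables that are filled in with the system message and the user's prompt at run time. The two stop parameters tell the model to stop generating when it emits either marker. Adjust the template to match the chat format documented on the model card of the model you import, since a mismatched format tends to degrade output quality.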
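If you script the import, the Modelfile can be generated from the shell so that the GGUF path is set in one place. This is a minimal sketch under stated assumptions: GGUF_PATH is a hypothetical variable defaulting to the tutorial's filename, and {{ .System }} and {{ .Prompt }} are Ollama's template variables for the system message and user prompt.

```shell
#!/usr/bin/env bash
# Sketch: write a Modelfile for a downloaded GGUF file.
# GGUF_PATH is an assumed location; point it at your own file.
GGUF_PATH="${GGUF_PATH:-./Starling-LM-7B-beta-Q6_K.gguf}"

# Unquoted heredoc so $GGUF_PATH expands; the template itself contains no
# other shell-special characters, so it is written out unchanged.
cat > Modelfile <<EOF
FROM "$GGUF_PATH"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
TEMPLATE """
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
EOF

# Show the result for a quick visual check.
cat Modelfile
```

Set GGUF_PATH in the environment before running the script to point at a file in another directory.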

Step 3: Build the Model

Now, build the Ollama model using the ollama create command:

ollama create "Starling-LM-7B-beta-Q6_K" -f Modelfile

Replace Starling-LM-7B-beta-Q6_K with the name you want to give your model, and Modelfile with the path to your Modelfile.
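The build step fails with a less obvious message if the Modelfile path is wrong or Ollama is not installed, so a small wrapper can check both first. This is a sketch only: build_model, MODEL_NAME, and MODELFILE are illustrative names, while ollama create and ollama list are the actual CLI commands.

```shell
#!/usr/bin/env bash
# Sketch: build the model only after basic preconditions are checked.
MODEL_NAME="${MODEL_NAME:-Starling-LM-7B-beta-Q6_K}"
MODELFILE="${MODELFILE:-Modelfile}"

build_model() {
  # Fail early if the Modelfile is missing.
  if [ ! -f "$MODELFILE" ]; then
    echo "Modelfile not found: $MODELFILE" >&2
    return 1
  fi
  # Fail early if the ollama CLI is not installed or not on PATH.
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama is not on PATH" >&2
    return 1
  fi
  # Create the model, then list local models to confirm it was registered.
  ollama create "$MODEL_NAME" -f "$MODELFILE" && ollama list
}
```

Call build_model after sourcing the script. If the build succeeds, the new model name should appear in the ollama list output.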

Step 4: Run and Test the Model

Finally, you can run and try your model using the ollama run command:

ollama run Starling-LM-7B-beta-Q6_K:latest

The :latest tag is the default that Ollama assigns when you create a model without specifying a tag, so you can also omit it. That's it! You have successfully imported a Hugging Face model and created a custom Ollama model.
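Besides the interactive prompt, a running Ollama server can be queried over its local REST API, which is handy for a scripted smoke test. This is a minimal sketch, assuming the server is listening on its default address (localhost:11434). The ask_model helper and the sample prompt are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: send one prompt to the imported model through Ollama's REST API.
MODEL="Starling-LM-7B-beta-Q6_K"
PROMPT="Why is the sky blue?"

# Assumption: Ollama is serving on its default address.
OLLAMA_URL="http://localhost:11434/api/generate"

# "stream": false asks for a single JSON response instead of streamed chunks.
# Note: this naive formatting does not escape quotes inside PROMPT.
PAYLOAD=$(printf '{"model": "%s", "prompt": "%s", "stream": false}' "$MODEL" "$PROMPT")

ask_model() {
  curl -s "$OLLAMA_URL" -d "$PAYLOAD"
}

# Print the request body so it can be inspected before sending.
echo "$PAYLOAD"
```

Call ask_model once the server is running. The generated text is returned in the response field of the JSON reply.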

Additional Tips

Explore the Ollama model library to find other models to try beyond Starling-LM.

Further customize model behavior by modifying the Modelfile:

Change the prompt template.

Set parameters such as temperature and num_predict (the maximum number of tokens to generate).
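As a sketch of the parameter tip above, the following appends two PARAMETER lines to an existing Modelfile. The values are illustrative rather than recommendations, and MODELFILE is a hypothetical variable naming the file to edit.

```shell
#!/usr/bin/env bash
# Sketch: append sampling parameters to a Modelfile.
MODELFILE="${MODELFILE:-Modelfile}"

# Quoted heredoc: the lines are written exactly as shown.
# Running this more than once appends duplicate lines, so run it once per file.
cat >> "$MODELFILE" <<'EOF'
PARAMETER temperature 0.7
PARAMETER num_predict 256
EOF

# Show the parameters now present in the file.
grep '^PARAMETER' "$MODELFILE"
```

After editing the Modelfile, run ollama create again with the same model name so the changes take effect.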

For more information, refer to the Ollama documentation and the Hugging Face model hub.