How to Install and Run DeepSeek R1 Locally With Ollama

Learn how to install and run DeepSeek R1 locally using Ollama. Our detailed guide provides all the necessary steps for a seamless setup experience.

What is DeepSeek R1?

DeepSeek R1 is a state-of-the-art reasoning model optimized for tasks like text generation, summarization, and question answering. It is distributed in a range of sizes, from lightweight distilled variants up to the full model, making it an excellent choice for developers and researchers who want to experiment with AI without relying on cloud-based APIs.

What is Ollama?

Ollama is a user-friendly platform that simplifies the process of downloading, managing, and running AI models locally. It supports a wide range of models, including DeepSeek R1, and provides an intuitive interface for interacting with them. Whether you’re a beginner or an experienced developer, Ollama makes it easy to get started with AI models.

Prerequisites

A compatible machine: DeepSeek R1 requires a decent amount of computational power. A machine with a modern CPU (or GPU for faster performance) and at least 16GB of RAM is recommended.

Python installed: Python 3.8 or later is only needed if you want to follow the Python API examples later in this guide; Ollama itself does not require it.

Enough storage space: DeepSeek R1 is a large model, so ensure you have sufficient disk space (at least 10GB free for the smaller variants; the larger ones need considerably more).

GPU (optional): An NVIDIA GPU with 16GB+ VRAM and CUDA 11.x or 12.x is recommended for faster inference, especially on the larger variants. A quick scripted check of these requirements is sketched below.
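
If you want to verify these prerequisites programmatically before installing anything, here is a minimal Python sketch for Linux. It uses only the standard library plus the nvidia-smi CLI, and the thresholds simply mirror the recommendations above:

import shutil
import subprocess

# Rough pre-flight check mirroring the prerequisites above (Linux only).

# At least 10 GB of free disk space for the model weights.
free_gb = shutil.disk_usage("/").free / 1e9
print(f"Free disk: {free_gb:.1f} GB ({'OK' if free_gb >= 10 else 'low'})")

# At least 16 GB of RAM, read from /proc/meminfo (first line is MemTotal in kB).
with open("/proc/meminfo") as f:
    mem_gb = int(f.readline().split()[1]) / 1e6
print(f"RAM: {mem_gb:.1f} GB ({'OK' if mem_gb >= 16 else 'low'})")

# Optional: detect an NVIDIA GPU via nvidia-smi.
try:
    subprocess.run(["nvidia-smi"], check=True, capture_output=True)
    print("NVIDIA GPU detected")
except (FileNotFoundError, subprocess.CalledProcessError):
    print("No NVIDIA GPU detected; CPU inference will still work")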

How to Install and Run DeepSeek R1 with Ollama?

Ollama simplifies running LLMs locally by handling model downloads, quantization, and execution seamlessly. Note that the exact process may vary depending on the specific DeepSeek R1 model you are using.

Step 1 - Install Ollama

Visit the official Ollama GitHub repository: https://github.com/ollama/ollama. Download the latest release for your operating system (Windows, macOS, or Linux). Follow the installation instructions provided in the repository.

For example, on Ubuntu 22.04, install with one command:

curl -fsSL https://ollama.com/install.sh | sh

Once installed, verify that Ollama is working by running the following command in your terminal:

ollama --version

This should display the installed version of Ollama, for example:

ollama version is 0.5.7

Step 2 - Download DeepSeek R1

With Ollama installed, the next step is to download the DeepSeek R1 model. Ollama makes this process straightforward. Open your terminal or command prompt. Run the following command to download DeepSeek R1:

ollama pull deepseek-r1

This command will fetch the default DeepSeek R1 model and store it locally on your machine. Depending on your internet speed, this may take a few minutes. To download a specific size instead, append a tag, for example ollama pull deepseek-r1:7b.

Note: You can run deepseek-r1 1.5b, 7b, 8b, and 14b on our GPU VPS (A4000), run deepseek-r1 32b on a GPU dedicated server (A5000), and run deepseek-r1 70b on a GPU dedicated server (A6000).

Step 3 - Run DeepSeek R1 Locally

Once the model is downloaded, you can start using it. Ollama provides a simple interface to interact with the model. Run the following command to start DeepSeek R1:

ollama run deepseek-r1

You’ll be greeted with a prompt where you can input text and receive responses from the model. Because DeepSeek R1 is a reasoning model, replies typically open with a <think>...</think> block showing the model's chain of thought before the final answer. For example:

User: What is the capital of France?
DeepSeek R1: The capital of France is Paris.

Experiment with different prompts to explore the capabilities of DeepSeek R1.

To run DeepSeek-R1 continuously and serve it via an API, start the Ollama server (note that on Linux, the install script typically registers Ollama as a background service, so the server may already be running):

ollama serve

This will make the model available for integration with other applications.
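
As a quick sanity check, you can confirm from Python that the server is reachable and that the model was downloaded. This minimal sketch assumes the default Ollama address (localhost:11434) and the third-party requests library:

import requests

# List the models the local Ollama server has available.
# Assumes Ollama is listening on its default port, 11434.
resp = requests.get("http://localhost:11434/api/tags")
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "deepseek-r1:latest"

If deepseek-r1 appears in the list, the server is ready to accept requests.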

How to Use DeepSeek R1 with Ollama?

1. Running inference via CLI

Once the model is downloaded, you can interact with DeepSeek-R1 directly in the terminal.

ollama run deepseek-r1:Xb

Replace Xb with the parameter size you downloaded (1.5b, 7b, 8b, 14b, 32b, or 70b). For example:

ollama run deepseek-r1:70b

2. Accessing DeepSeek-R1 via API

To integrate DeepSeek-R1 into applications, call the Ollama REST API, for example with curl:

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1",
  "messages": [{ "role": "user", "content": "Solve: 25 * 25" }],
  "stream": false
}'
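
The example above sets "stream": false so the API returns one complete JSON object. If you omit that field (streaming is the API's default), Ollama sends the reply incrementally as newline-delimited JSON. A minimal Python sketch of consuming that stream, again assuming the requests library and the default port:

import json
import requests

# Stream a chat completion from the local Ollama server.
# Each response line is a JSON object with a partial "message";
# the final object has "done": true.
payload = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Solve: 25 * 25"}],
    "stream": True,
}
with requests.post("http://localhost:11434/api/chat", json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk["message"]["content"], end="", flush=True)
        if chunk.get("done"):
            break
    print()
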
3. Accessing DeepSeek-R1 via Python

Ollama allows you to integrate DeepSeek R1 into your own applications. Here’s a simple example using Python. Install the Ollama Python client:

pip install ollama

Use the following code snippet to interact with DeepSeek R1 programmatically.

import ollama

# Send a single prompt to the local DeepSeek R1 model and print the reply.
response = ollama.generate(model='deepseek-r1', prompt='Explain quantum computing in simple terms.')
print(response['response'])

This script sends a prompt to DeepSeek R1 and prints the generated response. Alternatively, you can use the ollama.chat() function, which takes the model name and a list of messages and processes them as a conversational exchange.

import ollama

# messages is a list of role/content dicts, so prior turns can be
# included for multi-turn conversations.
response = ollama.chat(
    model="deepseek-r1",
    messages=[
        {"role": "user", "content": "Explain Newton's second law of motion"},
    ],
)
print(response["message"]["content"])
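
For long responses, the Python client can also stream tokens as they are generated rather than waiting for the full reply. A small sketch using the same ollama package:

import ollama

# stream=True returns a generator of partial responses instead of a
# single completed message.
stream = ollama.chat(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Explain Newton's second law of motion"}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()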

Conclusion

Running DeepSeek R1 locally with Ollama is a powerful way to harness the capabilities of AI without relying on external services. By following the steps outlined in this blog, you can easily install, run, and integrate DeepSeek R1 into your projects. Whether you’re a developer, researcher, or AI enthusiast, this setup provides a flexible and efficient way to experiment with cutting-edge language models. So, what are you waiting for? Dive into the world of local AI with DeepSeek R1 and Ollama today!