NVIDIA GPU
Choose Your Ollama Hosting Plans
Lite GPU Dedicated Server - K620
- 16GB RAM
- GPU: Nvidia Quadro K620
- Quad-Core Xeon E3-1270v3
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Maxwell
- CUDA Cores: 384
- GPU Memory: 2GB DDR3
- FP32 Performance: 0.863 TFLOPS
Express GPU Dedicated Server - P600
- 32GB RAM
- GPU: Nvidia Quadro P600
- Quad-Core Xeon E5-2643
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Pascal
- CUDA Cores: 384
- GPU Memory: 2GB GDDR5
- FP32 Performance: 1.2 TFLOPS
Express GPU Dedicated Server - P620
- 32GB RAM
- GPU: Nvidia Quadro P620
- Eight-Core Xeon E5-2670
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Pascal
- CUDA Cores: 512
- GPU Memory: 2GB GDDR5
- FP32 Performance: 1.5 TFLOPS
Express GPU Dedicated Server - P1000
- 32GB RAM
- GPU: Nvidia Quadro P1000
- Eight-Core Xeon E5-2690
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Pascal
- CUDA Cores: 640
- GPU Memory: 4GB GDDR5
- FP32 Performance: 1.894 TFLOPS
Basic GPU Dedicated Server - GTX 1650
- 64GB RAM
- GPU: Nvidia GeForce GTX 1650
- Eight-Core Xeon E5-2667v3
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Turing
- CUDA Cores: 896
- GPU Memory: 4GB GDDR5
- FP32 Performance: 3.0 TFLOPS
Basic GPU Dedicated Server - T1000
- 64GB RAM
- GPU: Nvidia Quadro T1000
- Eight-Core Xeon E5-2690
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Turing
- CUDA Cores: 896
- GPU Memory: 8GB GDDR6
- FP32 Performance: 2.5 TFLOPS
Professional GPU VPS - A4000
- 32GB RAM
- 24 CPU Cores
- 320GB SSD
- 300Mbps Unmetered Bandwidth
- Once per 2 Weeks Backup
- OS: Linux / Windows 10 / Windows 11
- Dedicated GPU: Quadro RTX A4000
- CUDA Cores: 6,144
- Tensor Cores: 192
- GPU Memory: 16GB GDDR6
- FP32 Performance: 19.2 TFLOPS
Basic GPU Dedicated Server - GTX 1660
- 64GB RAM
- GPU: Nvidia GeForce GTX 1660
- Dual 8-Core Xeon E5-2660
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Turing
- CUDA Cores: 1408
- GPU Memory: 6GB GDDR6
- FP32 Performance: 5.0 TFLOPS
Basic GPU Dedicated Server - RTX 4060
- 64GB RAM
- GPU: Nvidia GeForce RTX 4060
- Eight-Core E5-2690
- 120GB SSD + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ada Lovelace
- CUDA Cores: 3072
- Tensor Cores: 96
- GPU Memory: 8GB GDDR6
- FP32 Performance: 15.11 TFLOPS
Basic GPU Dedicated Server - RTX 5060
- 64GB RAM
- GPU: Nvidia GeForce RTX 5060
- 24-Core Platinum 8160
- 120GB SSD + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Blackwell 2.0
- CUDA Cores: 4608
- Tensor Cores: 144
- GPU Memory: 8GB GDDR7
- FP32 Performance: 23.22 TFLOPS
Professional GPU Dedicated Server - RTX 2060
- 128GB RAM
- GPU: Nvidia GeForce RTX 2060
- Dual 8-Core E5-2660
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Turing
- CUDA Cores: 1920
- Tensor Cores: 240
- GPU Memory: 6GB GDDR6
- FP32 Performance: 6.5 TFLOPS
Professional GPU Dedicated Server - P100
- 128GB RAM
- GPU: Nvidia Tesla P100
- Dual 8-Core E5-2660
- 120GB + 960GB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Pascal
- CUDA Cores: 3584
- GPU Memory: 16 GB HBM2
- FP32 Performance: 9.5 TFLOPS
Advanced GPU Dedicated Server - RTX 3060 Ti
- 128GB RAM
- GPU: GeForce RTX 3060 Ti
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 4864
- Tensor Cores: 152
- GPU Memory: 8GB GDDR6
- FP32 Performance: 16.2 TFLOPS
Advanced GPU Dedicated Server - A4000
- 128GB RAM
- GPU: Nvidia Quadro RTX A4000
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6144
- Tensor Cores: 192
- GPU Memory: 16GB GDDR6
- FP32 Performance: 19.2 TFLOPS
Advanced GPU Dedicated Server - V100
- 128GB RAM
- GPU: Nvidia V100
- Dual 12-Core E5-2690v3
- 240GB SSD + 2TB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Volta
- CUDA Cores: 5,120
- Tensor Cores: 640
- GPU Memory: 16GB HBM2
- FP32 Performance: 14 TFLOPS
Multi-GPU Dedicated Server - 2xRTX 4060
- 64GB RAM
- GPU: 2 x Nvidia GeForce RTX 4060
- Eight-Core E5-2690
- 120GB SSD + 960GB SSD
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ada Lovelace
- CUDA Cores: 3072
- Tensor Cores: 96
- GPU Memory: 8GB GDDR6
- FP32 Performance: 15.11 TFLOPS
Advanced GPU Dedicated Server - A5000
- 128GB RAM
- GPU: Nvidia Quadro RTX A5000
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 8192
- Tensor Cores: 256
- GPU Memory: 24GB GDDR6
- FP32 Performance: 27.8 TFLOPS
Multi-GPU Dedicated Server - 2xRTX 3060 Ti
- 128GB RAM
- GPU: 2 x GeForce RTX 3060 Ti
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 4864
- Tensor Cores: 152
- GPU Memory: 8GB GDDR6
- FP32 Performance: 16.2 TFLOPS
Multi-GPU Dedicated Server - 2xRTX A4000
- 128GB RAM
- GPU: 2 x Nvidia RTX A4000
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6144
- Tensor Cores: 192
- GPU Memory: 16GB GDDR6
- FP32 Performance: 19.2 TFLOPS
Multi-GPU Dedicated Server - 3xRTX 3060 Ti
- 256GB RAM
- GPU: 3 x GeForce RTX 3060 Ti
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 4864
- Tensor Cores: 152
- GPU Memory: 8GB GDDR6
- FP32 Performance: 16.2 TFLOPS
Advanced GPU VPS - RTX 5090
- 96GB RAM
- 32 CPU Cores
- 400GB SSD
- 500Mbps Unmetered Bandwidth
- Once per 2 Weeks Backup
- OS: Linux / Windows 10 / Windows 11
- Dedicated GPU: GeForce RTX 5090
- CUDA Cores: 21,760
- Tensor Cores: 680
- GPU Memory: 32GB GDDR7
- FP32 Performance: 109.7 TFLOPS
Enterprise GPU Dedicated Server - RTX 4090
- 256GB RAM
- GPU: GeForce RTX 4090
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ada Lovelace
- CUDA Cores: 16,384
- Tensor Cores: 512
- GPU Memory: 24 GB GDDR6X
- FP32 Performance: 82.6 TFLOPS
Enterprise GPU Dedicated Server - RTX A6000
- 256GB RAM
- GPU: Nvidia Quadro RTX A6000
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 38.71 TFLOPS
Enterprise GPU Dedicated Server - A40
- 256GB RAM
- GPU: Nvidia A40
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 37.48 TFLOPS
Enterprise GPU Dedicated Server - RTX 5090
- 256GB RAM
- GPU: GeForce RTX 5090
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Blackwell 2.0
- CUDA Cores: 21,760
- Tensor Cores: 680
- GPU Memory: 32 GB GDDR7
- FP32 Performance: 109.7 TFLOPS
Multi-GPU Dedicated Server - 2xRTX A5000
- 128GB RAM
- GPU: 2 x Quadro RTX A5000
- Dual 12-Core E5-2697v2
- 240GB SSD + 2TB SSD
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 8192
- Tensor Cores: 256
- GPU Memory: 24GB GDDR6
- FP32 Performance: 27.8 TFLOPS
Multi-GPU Dedicated Server - 3xV100
- 256GB RAM
- GPU: 3 x Nvidia V100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Volta
- CUDA Cores: 5,120
- Tensor Cores: 640
- GPU Memory: 16GB HBM2
- FP32 Performance: 14 TFLOPS
Multi-GPU Dedicated Server - 3xRTX A5000
- 256GB RAM
- GPU: 3 x Quadro RTX A5000
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 8192
- Tensor Cores: 256
- GPU Memory: 24GB GDDR6
- FP32 Performance: 27.8 TFLOPS
Enterprise GPU Dedicated Server - A100
- 256GB RAM
- GPU: Nvidia A100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6912
- Tensor Cores: 432
- GPU Memory: 40GB HBM2
- FP32 Performance: 19.5 TFLOPS
- 50% off the first month, 25% off every renewal.
Multi-GPU Dedicated Server - 2xRTX 4090
- 256GB RAM
- GPU: 2 x GeForce RTX 4090
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ada Lovelace
- CUDA Cores: 16,384
- Tensor Cores: 512
- GPU Memory: 24 GB GDDR6X
- FP32 Performance: 82.6 TFLOPS
Multi-GPU Dedicated Server - 3xRTX A6000
- 256GB RAM
- GPU: 3 x Quadro RTX A6000
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 38.71 TFLOPS
Multi-GPU Dedicated Server - 2xRTX 5090
- 256GB RAM
- GPU: 2 x GeForce RTX 5090
- Dual 22-Core E5-2699v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Blackwell 2.0
- CUDA Cores: 21,760
- Tensor Cores: 680
- GPU Memory: 32 GB GDDR7
- FP32 Performance: 109.7 TFLOPS
Multi-GPU Dedicated Server - 2xA100
- 256GB RAM
- GPU: 2 x Nvidia A100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6912
- Tensor Cores: 432
- GPU Memory: 40GB HBM2
- FP32 Performance: 19.5 TFLOPS
- Free NVLink Included
Multi-GPU Dedicated Server - 4xRTX A6000
- 512GB RAM
- GPU: 4 x Quadro RTX A6000
- Dual 22-Core E5-2699v4
- 240GB SSD + 4TB NVMe + 16TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 38.71 TFLOPS
Enterprise GPU Dedicated Server - A100(80GB)
- 256GB RAM
- GPU: Nvidia A100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6912
- Tensor Cores: 432
- GPU Memory: 80GB HBM2e
- FP32 Performance: 19.5 TFLOPS
Multi-GPU Dedicated Server - 8xV100
- 512GB RAM
- GPU: 8 x Nvidia Tesla V100
- Dual 22-Core E5-2699v4
- 240GB SSD + 4TB NVMe + 16TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Volta
- CUDA Cores: 5,120
- Tensor Cores: 640
- GPU Memory: 16GB HBM2
- FP32 Performance: 14 TFLOPS
Multi-GPU Dedicated Server - 4xA100
- 512GB RAM
- GPU: 4 x Nvidia A100
- Dual 22-Core E5-2699v4
- 240GB SSD + 4TB NVMe + 16TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 6912
- Tensor Cores: 432
- GPU Memory: 40GB HBM2
- FP32 Performance: 19.5 TFLOPS
Enterprise GPU Dedicated Server - H100
- 256GB RAM
- GPU: Nvidia H100
- Dual 18-Core E5-2697v4
- 240GB SSD + 2TB NVMe + 8TB SATA
- 100Mbps-1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Hopper
- CUDA Cores: 14,592
- Tensor Cores: 456
- GPU Memory: 80GB HBM2e
- FP32 Performance: 183 TFLOPS
Multi-GPU Dedicated Server - 8xRTX A6000
- 512GB RAM
- GPU: 8 x Quadro RTX A6000
- Dual 22-Core E5-2699v4
- 240GB SSD + 4TB NVMe + 16TB SATA
- 1Gbps
- OS: Windows / Linux
- Single GPU Specifications:
- Microarchitecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- GPU Memory: 48GB GDDR6
- FP32 Performance: 38.71 TFLOPS
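The CUDA core counts and FP32 figures in the plans above can be related with a common rule of thumb: peak FP32 throughput is roughly 2 FLOPs (one fused multiply-add) per CUDA core per clock cycle. The sketch below is an illustration only. The page does not say how its FP32 numbers were derived, and the function names are hypothetical. It uses the listed cores and TFLOPS to back out the clock speed each figure implies.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Rule-of-thumb peak: 2 FLOPs (one FMA) per core per cycle."""
    return 2 * cuda_cores * clock_ghz * 1e9 / 1e12


def implied_clock_ghz(cuda_cores: int, fp32_tflops: float) -> float:
    """Invert the rule of thumb to get the clock a listed figure implies."""
    return fp32_tflops * 1e12 / (2 * cuda_cores) / 1e9


# (CUDA cores, FP32 TFLOPS) copied from the plan listings above.
LISTED = {
    "RTX 4090": (16384, 82.6),
    "RTX A4000": (6144, 19.2),
    "RTX 3060 Ti": (4864, 16.2),
}

for name, (cores, tflops) in LISTED.items():
    print(f"{name}: implied clock ~{implied_clock_ghz(cores, tflops):.2f} GHz")
```

These are theoretical peaks. Sustained LLM inference speed usually depends more on memory capacity and bandwidth than on raw FP32 throughput.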
Popular LLMs and GPU Recommendations
Model Name | Params | Model Size | Recommended GPU cards |
DeepSeek R1 | 7B | 4.7GB | GTX 1660 6GB or higher |
DeepSeek R1 | 8B | 4.9GB | GTX 1660 6GB or higher |
DeepSeek R1 | 14B | 9.0GB | RTX A4000 16GB or higher |
DeepSeek R1 | 32B | 20GB | RTX 4090, RTX A5000 24GB, A100 40GB |
DeepSeek R1 | 70B | 43GB | RTX A6000, A40 48GB |
DeepSeek R1 | 671B | 404GB | Not supported yet |
Deepseek-coder-v2 | 16B | 8.9GB | RTX A4000 16GB or higher |
Deepseek-coder-v2 | 236B | 133GB | 2xA100 80GB, 4xA100 40GB |
Model Name | Params | Model Size | Recommended GPU cards |
Llama 3.3 | 70B | 43GB | A6000 48GB, A40 48GB, or higher |
Llama 3.1 | 8B | 4.9GB | GTX 1660 6GB or higher |
Llama 3.1 | 70B | 43GB | A6000 48GB, A40 48GB, or higher |
Llama 3.1 | 405B | 243GB | 4xA100 80GB, or higher |
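The recommendations in the tables above are consistent with a simple sizing rule: the model file, plus some working headroom, has to fit in the combined GPU memory. The sketch below is one way to express that check, not the provider's actual method. The 10% headroom, the `GpuPlan` type, and the function names are assumptions made for illustration. The GPU configurations are taken from the "Recommended GPU cards" column.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuPlan:
    name: str
    gpu_count: int
    vram_gb_per_gpu: int

    @property
    def total_vram_gb(self) -> int:
        return self.gpu_count * self.vram_gb_per_gpu


# Assumed margin for KV cache and runtime overhead (hypothetical value).
HEADROOM = 1.1

# Configurations named in the recommendation tables above.
PLANS = [
    GpuPlan("GTX 1660", 1, 6),
    GpuPlan("RTX A4000", 1, 16),
    GpuPlan("RTX 4090", 1, 24),
    GpuPlan("RTX A6000", 1, 48),
    GpuPlan("2xA100 80GB", 2, 80),
    GpuPlan("4xA100 80GB", 4, 80),
]


def fits(model_size_gb: float, plan: GpuPlan, headroom: float = HEADROOM) -> bool:
    """True if the model plus headroom fits in the plan's combined VRAM."""
    return model_size_gb * headroom <= plan.total_vram_gb


def smallest_fitting_plan(model_size_gb: float) -> Optional[GpuPlan]:
    """Return the plan with the least total VRAM that still fits, or None."""
    candidates = [p for p in PLANS if fits(model_size_gb, p)]
    return min(candidates, key=lambda p: p.total_vram_gb, default=None)


# Model sizes copied from the tables above.
for label, size_gb in [("DeepSeek R1 14B", 9.0), ("Llama 3.3 70B", 43), ("DeepSeek R1 671B", 404)]:
    plan = smallest_fitting_plan(size_gb)
    print(label, "->", plan.name if plan else "no listed configuration fits")
```

Longer context windows and concurrent requests increase memory use, so treat the headroom factor as a starting point rather than a guarantee.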
6 Reasons to Choose our Ollama Hosting
SSD-Based Drives
Full Root/Admin Access
99.9% Uptime Guarantee
Dedicated IP
24/7/365 Technical Support
How to Run LLMs Locally with Ollama AI
Order a GPU Server
Alternatively, choose a standard OS and manually install Ollama after deployment.
Install Ollama AI
Download an LLM Model
Chat with the Model
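Once Ollama is installed and a model has been downloaded, the final step can also be done programmatically through Ollama's local REST API. The following is a minimal sketch using only the Python standard library. It assumes the Ollama server is running on its default address (`http://localhost:11434`) and that a model tag such as `llama3.1:8b` has already been pulled. The helper names are hypothetical.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text.

    Requires a running Ollama server with the model already downloaded.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, `ask("llama3.1:8b", "Why is the sky blue?")` returns the model's reply as a string. Setting `"stream": False` keeps the example simple by returning one JSON object instead of a stream of partial responses.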
Key Features of Ollama
Ease of Use
Flexibility
Powerful LLMs
Community Support
Advantages of Ollama over ChatGPT
Quick-Start Guides
Ollama GPU Benchmarks – Model Performance
Ollama Hosting FAQs
What is Ollama?
What Nvidia GPUs are good for running Ollama?
Examples of minimum supported cards for each series: Quadro K620/P600, Tesla P100, GeForce GTX 1650, Nvidia V100, RTX 4000.
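To see which card a server actually has before choosing a model, the GPU name and memory can be read from `nvidia-smi`. The sketch below assumes the NVIDIA driver is installed and `nvidia-smi` is on the PATH. The helper names are hypothetical, and the sample line in the comment is illustrative rather than captured output.

```python
import subprocess

# Query GPU name and total memory (MiB) as headerless, unit-free CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_list(csv_text: str) -> list[tuple[str, int]]:
    """Parse lines such as 'NVIDIA RTX A4000, 16376' into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so the memory column is always isolated.
        name, memory_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory_mib.strip())))
    return gpus


def detect_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi and return one (name, MiB) pair per installed GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_list(result.stdout)
```

On a multi-GPU server, `detect_gpus()` returns one entry per card, which makes it easy to add up the total memory available for a model.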