Installing DeepSeek Open Source on Your Server

Written by Jerome Henry

DeepSeek is a family of powerful open-source AI models for code generation and advanced data analysis. This guide provides a detailed walkthrough of installing it on your own infrastructure, giving you full control over security and customization.

Why Choose a Local Installation of DeepSeek?

Installing DeepSeek locally offers significant advantages over cloud-based solutions.

First, you maintain complete control over your sensitive data, which greatly enhances privacy. Second, you can customize every aspect of the environment to meet your specific needs. Companies looking to optimize costs will find this approach particularly appealing.

Cloud solutions typically charge based on usage, which can quickly become expensive for intensive AI workloads.

“Open source is the royal road to sustainable technological innovation and digital sovereignty.” – Linus Torvalds

Installing DeepSeek locally allows you to fully leverage its capabilities without the financial or technical constraints of cloud platforms. However, this approach requires adequate hardware and specific technical skills.

Essential Technical Prerequisites

Before starting the installation, carefully verify that your infrastructure meets the minimum requirements. The success of your DeepSeek deployment depends directly on this.

To further explore projects and discussions related to DeepSeek, you can visit the issues page on GitHub.

Recommended Hardware Configuration

Component | Minimum Specifications | Recommended Specifications
CPU | 8 cores (16 threads) | 16 cores (32 threads) or more
RAM | 32 GB | 64 GB or more
Storage | 500 GB SSD | 1 TB NVMe SSD
GPU | NVIDIA RTX 3080 (10 GB) | NVIDIA A100 (40/80 GB)
OS | Ubuntu 20.04 LTS | Ubuntu 22.04 LTS
Network | 1 Gbps | 10 Gbps

Software and Dependencies

To ensure a smooth installation, several software components must be pre-installed:

  • Python: Version 3.8 or higher (3.10 recommended)
  • CUDA Toolkit: Version 11.8 or higher for GPU acceleration
  • cuDNN: Compatible with your CUDA version
  • Git: To clone the source repository
  • Docker & Docker Compose: Optional but highly recommended

Make sure to install these components before proceeding with the main installation. A stable internet connection will be required to download the various packages and models.
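
Before moving on, a quick check script can confirm that the core tools are in place. The sketch below is a minimal example: it reports the Python version and verifies that git, nvcc (CUDA Toolkit), and docker are on the PATH. Anything reported as missing should be installed before you continue (Docker being optional, as noted above).

  # check_prereqs.py - sanity-check the prerequisites listed above
  import shutil
  import subprocess
  import sys

  def version_of(cmd):
      """Return the first line of `cmd --version`, or None if cmd is missing."""
      if shutil.which(cmd) is None:
          return None
      out = subprocess.run([cmd, "--version"], capture_output=True, text=True)
      lines = (out.stdout or out.stderr).splitlines()
      return lines[0] if lines else "unknown version"

  print("Python:", sys.version.split()[0])  # should be 3.8+, ideally 3.10
  for tool in ("git", "nvcc", "docker"):
      print(f"{tool}:", version_of(tool) or "NOT FOUND")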

Step-by-Step Installation Process

Step 1: Preparing the System Environment

Start by updating your system and installing essential packages. Open a terminal and run:


  # Update the system
  sudo apt update && sudo apt upgrade -y

  # Install system dependencies
  sudo apt install -y build-essential python3-dev python3-pip git wget curl

Verify that Python 3.8+ is correctly installed:

python3 --version

Step 2: Configuring the Python Environment

Creating a virtual environment is highly recommended to avoid dependency conflicts. Proceed as follows:


  # Install virtualenv
  pip3 install virtualenv

  # Create the environment
  virtualenv deepseek-env

  # Activate the environment
  source deepseek-env/bin/activate

Your terminal should now display the prefix (deepseek-env), indicating that the environment is active.

Step 3: Installing Necessary AI Frameworks

DeepSeek primarily relies on PyTorch. Install the CUDA-compatible version to benefit from GPU acceleration:


  # Install PyTorch with CUDA 11.8 support
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

  # Verify the installation
  python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

Also, install the additional libraries:

pip install transformers accelerate bitsandbytes sentencepiece protobuf
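
A quick import test (a minimal sketch) confirms that the stack is functional inside the virtual environment:

  # verify_stack.py - confirm the AI libraries import and see the GPU
  import torch
  import transformers

  print("PyTorch:", torch.__version__)
  print("Transformers:", transformers.__version__)
  print("CUDA available:", torch.cuda.is_available())
  if torch.cuda.is_available():
      print("GPU:", torch.cuda.get_device_name(0))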

Step 4: Downloading the DeepSeek Source Code

Clone the official repository and navigate to the project directory:


  git clone https://github.com/deepseek-ai/deepseek-coder.git
  cd deepseek-coder

Then, install the project-specific dependencies:

pip install -e .

“DeepSeek radically transforms our approach to data analysis and code generation, offering capabilities previously reserved for proprietary platforms.” – Experienced User

Step 5: Downloading Pre-trained Models

DeepSeek offers several model variants depending on your needs. Download the one that matches your use case:


  # Create the directory for the models
  mkdir -p models
  cd models

  # Download the model (example with deepseek-coder-6.7b-base)
  git lfs install  # requires the git-lfs package (sudo apt install git-lfs)
  git clone https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base

If your GPU has less than 24 GB of VRAM, prefer the quantized versions:

git clone https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GPTQ
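
Once the download finishes, you can check that the model loads from its local path using the transformers API installed in Step 3. This is a minimal sketch assuming the full-precision model from Step 5 and a GPU with roughly 14 GB of free VRAM; GPTQ checkpoints require additional tooling (such as auto-gptq) and are not covered here:

  # smoke_test.py - load the downloaded model and generate a few tokens
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  MODEL_PATH = "./models/deepseek-coder-6.7b-base"  # directory cloned above

  tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
  model = AutoModelForCausalLM.from_pretrained(
      MODEL_PATH,
      torch_dtype=torch.float16,  # halves memory usage vs. float32
  ).to("cuda")

  inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to("cuda")
  outputs = model.generate(**inputs, max_new_tokens=40)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))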

Step 6: Configuring the Inference Server

Now, create a configuration file for the DeepSeek server:


  cd ..
  nano config.yaml

Add the following content, adapting the paths according to your installation:


  model:
    name: "deepseek-coder"
    path: "./models/deepseek-coder-6.7b-base"
    type: "llm"
    quantization: "none"  # or "4bit", "8bit" depending on your needs

  server:
    host: "0.0.0.0"
    port: 8000
    workers: 4

  inference:
    max_tokens: 2048
    temperature: 0.7
    top_p: 0.95

Step 7: Launching the DeepSeek Server

Finally, start the server with the command:

python -m deepseek.server --config config.yaml

Your DeepSeek server should now be accessible at http://localhost:8000. You can test its functionality with a curl request:


  curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    -d '{
      "prompt": "Explain how reinforcement learning works",
      "max_tokens": 100
    }'
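
The same test can be scripted in Python (requires pip install requests); this sketch assumes the OpenAI-style /v1/completions endpoint shown in the curl example above:

  # test_server.py - send the same completion request as the curl above
  import requests

  resp = requests.post(
      "http://localhost:8000/v1/completions",
      json={
          "prompt": "Explain how reinforcement learning works",
          "max_tokens": 100,
      },
      timeout=60,
  )
  resp.raise_for_status()  # fail loudly on HTTP errors
  print(resp.json())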

Performance Optimization

To get the most out of your local installation, several optimization techniques are particularly effective.

Multi-GPU Parallelization

If you have multiple GPUs, DeepSeek can leverage them simultaneously. Modify your configuration as follows:


  model:
    # Previous configuration...
    device_map: "auto"  # Automatic distribution across available GPUs
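
If you load the model directly through transformers rather than the configuration file, the equivalent option is the device_map argument; here is a minimal sketch:

  # multi_gpu.py - let accelerate shard the model across all visible GPUs
  import torch
  from transformers import AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained(
      "./models/deepseek-coder-6.7b-base",
      torch_dtype=torch.float16,
      device_map="auto",  # accelerate splits layers across GPU 0, GPU 1, ...
  )
  print(model.hf_device_map)  # shows which device each layer landed on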

Memory Optimization

Efficient memory management is crucial for large AI models. Add these parameters:


  inference:
    # Previous configuration...
    offload_to_cpu: true  # Offload unused layers to RAM
    cpu_offload_threshold: 0.3  # Offloading threshold

With these settings, even a modest GPU can run larger models, and combining them with quantization reduces the memory footprint further.
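
For example, with the bitsandbytes library installed in Step 3, a full-precision checkpoint can be loaded in 4-bit directly through transformers. This sketch assumes a recent transformers version:

  # quantized_load.py - load the model in 4-bit to shrink its memory footprint
  import torch
  from transformers import AutoModelForCausalLM, BitsAndBytesConfig

  bnb_config = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_compute_dtype=torch.float16,  # 4-bit weights, fp16 compute
  )

  model = AutoModelForCausalLM.from_pretrained(
      "./models/deepseek-coder-6.7b-base",
      quantization_config=bnb_config,
      device_map="auto",
  )
  print(f"Footprint: {model.get_memory_footprint() / 1e9:.1f} GB")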

“Customization is the fundamental key to maximizing the performance of open-source AI models in a local environment.” – AI Infrastructure Expert

Practical Use Cases

Advanced Predictive Analysis

DeepSeek excels in predictive analysis of financial data. For example, you can use it to:

  • Detect transaction anomalies in real time
  • Predict market trends with increased accuracy
  • Optimize investment strategies based on historical data

To set up such a system, connect your data stream to the DeepSeek API and configure the appropriate models.
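
As an illustration, the hypothetical sketch below sends a batch of transaction records to the completions endpoint from Step 7 and asks the model to flag anomalies; the prompt format and field names are purely illustrative:

  # anomaly_check.py - illustrative: ask the model to flag odd transactions
  import json
  import requests

  transactions = [
      {"id": 1, "amount": 42.50, "country": "FR"},
      {"id": 2, "amount": 98000.00, "country": "KP"},  # deliberately suspicious
  ]

  prompt = (
      "Review these transactions and list the IDs that look anomalous, "
      "with a one-line reason for each:\n" + json.dumps(transactions, indent=2)
  )

  resp = requests.post(
      "http://localhost:8000/v1/completions",
      json={"prompt": prompt, "max_tokens": 150},
      timeout=60,
  )
  print(resp.json())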

Automated Natural Language Processing

DeepSeek’s NLP capabilities automate many linguistic tasks:

  • Generate customized reports from raw data
  • Perform sentiment analysis on customer feedback or social media
  • Intelligently extract information from unstructured documents

These applications can be easily integrated into your existing systems via the DeepSeek REST API.
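
For instance, a sentiment-analysis call can reuse the completions endpoint from Step 7; the prompt below is a hypothetical sketch, not a dedicated API:

  # sentiment.py - illustrative sentiment classification via the local server
  import requests

  review = "Delivery was late and the box was damaged, but support was helpful."
  resp = requests.post(
      "http://localhost:8000/v1/completions",
      json={
          "prompt": "Classify the sentiment of this review as positive, "
                    f"negative, or mixed:\n{review}",
          "max_tokens": 20,
      },
      timeout=60,
  )
  print(resp.json())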

Software Development Assistance

DeepSeek Coder greatly facilitates software development:

  • Generate code from natural language descriptions
  • Automatically detect and fix bugs
  • Intelligently document existing code

These features accelerate the development cycle and improve the quality of the code produced.
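
Code generation can also bypass the HTTP server entirely and call the model through a transformers pipeline; here is a minimal sketch using the model downloaded in Step 5:

  # codegen.py - generate code locally with a transformers pipeline
  from transformers import pipeline

  generator = pipeline(
      "text-generation",
      model="./models/deepseek-coder-6.7b-base",
      torch_dtype="auto",
      device_map="auto",
  )

  result = generator(
      "# Python function that validates an IPv4 address\ndef is_valid_ipv4(addr):",
      max_new_tokens=80,
  )
  print(result[0]["generated_text"])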

Troubleshooting and Common Solutions

Despite careful installation, some issues may arise. Here are solutions to the most common errors:

CUDA Out of Memory Errors

If you encounter GPU memory errors, try these solutions (a quick VRAM check is sketched after the list):

  1. Reduce the batch size in your configuration
  2. Use a quantized version of the model (4-bit or 8-bit)
  3. Enable the CPU offloading mentioned previously
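
To confirm how much VRAM is actually free before and after these changes, a short check (a minimal sketch) helps:

  # vram_check.py - report free vs. total GPU memory on the current device
  import torch

  free, total = torch.cuda.mem_get_info()  # values in bytes
  print(f"GPU VRAM: {free / 1e9:.1f} GB free / {total / 1e9:.1f} GB total")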

Performance Issues

Insufficient performance can have several causes (a GPU diagnostic sketch follows the list):

  1. Verify that CUDA is correctly installed and recognized
  2. Make sure your SSD is not saturated (space and IOPS)
  3. Increase the system RAM available for offloading operations
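
For point 1, the following sketch verifies that PyTorch actually recognizes the GPU and reports its basic properties:

  # gpu_diag.py - verify CUDA is recognized and list device properties
  import torch

  if not torch.cuda.is_available():
      print("CUDA not available: check the driver and CUDA Toolkit installation")
  else:
      for i in range(torch.cuda.device_count()):
          props = torch.cuda.get_device_properties(i)
          print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.1f} GB, "
                f"compute capability {props.major}.{props.minor}")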