Using Code Llama with Continue

With Continue, you can use Code Llama as a drop-in replacement for GPT-4, either by running it locally with Ollama or GGML, or through a hosted provider like Together or Replicate.

If you haven't already installed Continue, you can do that here. For more general information on customizing Continue, read our customization docs.

TogetherAI

  1. Create an account here
  2. Copy your API key that appears on the welcome screen
  3. Update your Continue config file like this:
~/.continue/config.json
{
  "models": [
    {
      "title": "Code Llama",
      "provider": "together",
      "model": "togethercomputer/CodeLlama-13b-Instruct",
      "apiKey": "<API_KEY>"
    }
  ]
}
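
If you want to sanity-check the key before pointing Continue at it, a minimal sketch like the one below works; it assumes Together's OpenAI-compatible chat completions endpoint at https://api.together.xyz/v1/chat/completions, so adjust the endpoint or model name if your account's docs show something different:

# Quick sanity check of a Together API key outside of Continue.
# Assumes Together's OpenAI-compatible endpoint; adjust the model name if needed.
import requests

API_KEY = "<API_KEY>"  # the key from your Together welcome screen

resp = requests.post(
    "https://api.together.xyz/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "togethercomputer/CodeLlama-13b-Instruct",
        "messages": [{"role": "user", "content": "Write a Python hello world."}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])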

Ollama

  1. Download Ollama here (it should walk you through the rest of these steps)
  2. Open a terminal and run ollama run codellama
  3. Change your Continue config file like this:
~/.continue/config.json
{
  "models": [
    {
      "title": "Code Llama",
      "provider": "ollama",
      "model": "codellama-7b"
    }
  ]
}
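
To confirm the local model is reachable before Continue connects to it, you can hit Ollama's HTTP API directly. The sketch below assumes Ollama is listening on its default port (11434) and that the codellama model was pulled in step 2:

# Minimal check that the local Ollama server can generate with Code Llama.
# Assumes Ollama is running on its default port (11434) and the model was
# pulled with `ollama run codellama`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codellama",
        "prompt": "Write a function that reverses a string in Python.",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])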

Replicate

  1. Get your Replicate API key here
  2. Change your Continue config file like this:
~/.continue/config.json
{
  "models": [
    {
      "title": "Code Llama",
      "provider": "replicate",
      "model": "codellama-7b",
      "apiKey": "<API_KEY>"
    }
  ]
}
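
You can also call a Code Llama model through Replicate's Python client to make sure the key works. This is only a sketch: the model identifier below is illustrative, so copy the current slug from the model's page on replicate.com:

# Sketch: verify a Replicate API key by running a Code Llama model directly.
import replicate  # pip install replicate

client = replicate.Client(api_token="<API_KEY>")

output = client.run(
    "meta/codellama-7b-instruct",  # illustrative identifier; check replicate.com
    input={"prompt": "Write a Python hello world."},
)
# Language models on Replicate return their output as chunks of text.
print("".join(output))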

FastChat API

  1. Set up the FastChat API (https://github.com/lm-sys/FastChat) to use one of the Code Llama models on Hugging Face (e.g., codellama/CodeLlama-7b-Instruct-hf).
  2. Start the OpenAI-compatible API server (see https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md).
  3. Change your Continue config file like this:
~/.continue/config.json
{
  "models": [
    {
      "title": "Code Llama",
      "provider": "openai",
      "model": "codellama-7b",
      "apiBase": "http://localhost:8000/v1/"
    }
  ]
}
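
Because FastChat exposes an OpenAI-compatible server, any OpenAI-style client can talk to it. The sketch below uses plain HTTP against the apiBase above; it assumes the server is on localhost:8000, and the model name must match whatever name the FastChat worker registered (check GET /v1/models), which may differ from codellama-7b:

# Sketch: query the FastChat OpenAI-compatible server started in step 2.
import requests

base = "http://localhost:8000/v1"

# List the model ids the server actually exposes.
models = requests.get(f"{base}/models", timeout=30).json()
print([m["id"] for m in models["data"]])

resp = requests.post(
    f"{base}/chat/completions",
    json={
        "model": "codellama-7b",  # replace with an id from the list above
        "messages": [{"role": "user", "content": "Write a Python hello world."}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])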