
Overview

Ollama enables you to run AI models locally on your machine or on a private network endpoint. This gives you complete control over your data, eliminates API costs, and enables offline development. Ollama can automatically download and manage models for you, making local inference accessible and straightforward.

Prerequisites

Before configuring Ollama in cmd, you need to have Ollama installed and running on your system.
1. Install Ollama

Download and install Ollama from the official website or follow the Quick Start guide.
2. Start Ollama

Ensure Ollama is running. The default installation sets up autostart, but you can also start it manually:
ollama serve
Ollama will run on http://localhost:11434 by default.
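If you want to confirm the server is reachable before continuing, you can query Ollama's HTTP API directly. A quick check, assuming the default host and port:
# Ask the running Ollama server for its version
curl http://localhost:11434/api/version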
3. Install Models

Install one or more AI models. You can pull them in advance with the commands below, or let Ollama download a model the first time it is requested:
# Example: Install a coding-focused model
ollama pull devstral-small-2:24b

# Or install a small model with lower system requirements
ollama pull qwen2.5-coder:7b

# Or install a reasoning model
ollama pull deepseek-r1:14b
You can browse available models at https://ollama.com/search.
4. Verify Installation

Verify Ollama is working correctly:
ollama list
This should show your installed models.
Ollama automatically loads models on-demand. You don’t need to manually load models before using them with cmd.
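To see which models are currently loaded into memory (for example, after your first request from cmd), you can also run:
# List models currently loaded in memory
ollama ps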

Configuring Ollama in cmd

Once Ollama is installed and running, you can enable it in cmd:
1. Open AI Provider Settings

Go to Settings > AI Providers in cmd.
2. Add Ollama Provider

Click “Add Provider” or “Configure” for Ollama. By default, cmd connects to Ollama at http://localhost:11434. If you’re running Ollama on a different port or on a remote server, you can customize the base URL.
  • For a standard local Ollama installation, simply toggle “Enable”.
  • If you’re running Ollama on a different port, set the Base URL accordingly, for example: http://localhost:8080
  • If you’re running Ollama on a remote server in your private network, point the Base URL at that machine, for example: http://192.168.1.100:11434 (see the example below for making Ollama reachable from other machines)
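A default Ollama installation typically listens only on localhost, so a remote cmd instance cannot reach it until the server listens on a network interface as well. A minimal sketch, assuming you start Ollama manually and that 192.168.1.100 is the Ollama host's address:
# On the Ollama host: listen on all interfaces instead of only localhost
OLLAMA_HOST=0.0.0.0 ollama serve

# From the machine running cmd: confirm the server is reachable
curl http://192.168.1.100:11434/api/version
If Ollama runs as a system service, set OLLAMA_HOST in the service's environment instead; see the Ollama documentation for your platform.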
3. Model Discovery

cmd will automatically discover the available models from your Ollama installation when the provider is configured.
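If you want to preview what cmd should discover, you can list the models your Ollama installation exposes over its API. A quick check against the standard model-listing endpoint, assuming the default address:
# List locally installed models as JSON
curl http://localhost:11434/api/tags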
4. Select Models

Go to Settings > Models and enable the models you want to use in cmd. All models installed in Ollama will be available for selection.
If you install new models in Ollama after configuring the provider, you can refresh the model list by disabling and re-enabling the Ollama provider in Settings > AI Providers.

Using Ollama

Once the provider is enabled and models are selected, you can use Ollama models just like any other AI provider:
1. Select an Ollama Model

In the cmd interface, open the model selector and choose one of your Ollama models (e.g., “devstral-small-2:24b”).
2. Start Using

You can now use Ollama models for:
  • Chat and code assistance
  • Agent mode for autonomous tasks (if supported by the model)
  • All other cmd features such as code completion
All inference happens locally on your machine or on your private server; no data leaves your network.
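To confirm that responses really are generated by your local Ollama instance, independent of cmd, you can send a request straight to Ollama's generate endpoint. A minimal example, assuming qwen2.5-coder:7b is installed (substitute any model you have pulled):
# Generate a single, non-streamed response from the local model
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a function that reverses a string in Python.",
  "stream": false
}'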

Tested Models

The following models have been successfully tested with cmd:
  • qwen3-coder:30b - High-performance coding model
  • devstral-small-2:24b - Efficient development assistant
  • deepseek-r1:14b - Reasoning-focused model
  • qwen2.5-coder:7b - Lightweight coding model
You can find more models at https://ollama.com/search.

Benefits of Local Inference

Using Ollama with cmd provides several advantages:

Data Privacy

All processing happens locally; your code never leaves your machine.

No API Costs

Run unlimited inference without per-token charges

Offline Development

Work without internet connectivity

Custom Models

Use specialized or fine-tuned models for your specific needs
Local models require significant computational resources. Larger models provide better results but need more RAM and processing power. Ensure your system meets the requirements for the models you want to run.
For the best coding experience, consider models specifically trained for code. These models understand programming contexts better than general-purpose models.
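To gauge whether a model fits your hardware before you rely on it, you can inspect what is installed and how large each model is (the model name below is one of the tested models above; substitute your own):
# Show details such as parameter count, quantization, and context length
ollama show qwen2.5-coder:7b

# Show the on-disk size of all installed models
ollama list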