Alex Sidebar supports multiple AI models to suit different development needs and preferences. This guide explains the available models and how to configure them.

Available Models

Claude 3.5 Sonnet

Anthropic’s model with a 200,000-token context window, offering exceptional code understanding and generation. Best for handling large codebases, complex tasks, and detailed explanations.

Claude 3.5 Haiku

A streamlined version of Claude optimized for fast response times and efficient code completions. Ideal for real-time coding assistance and quick iterative development.

Gemini 2.0 Flash

Google’s latest optimized model designed for speed and efficiency. Available in both chat and Cmd+K interfaces, perfect for quick code completions and rapid prototyping.

Grok Beta

Grok Beta is a frontier language model with state-of-the-art reasoning capabilities. It is suitable for chat, coding, and reasoning tasks.

Perplexity

A conversational search engine providing real-time, sourced answers. While not tailored for coding, it’s great for accessing coding documentation, examples, and swift explanations.

GPT-4

OpenAI’s model known for versatile code generation and reliable outputs. Excellent for general coding tasks, debugging, and refactoring.

GPT-4 Mini

A compact version of GPT-4, optimized for speed while retaining the core capabilities of the main model. Perfect for faster responses in routine coding tasks.

o1 Preview

Designed for advanced reasoning in coding, math, and science. With a 128,000-token context window, it excels in maintaining extensive context and solving intricate problems.

o1 Mini

A lightweight version of o1, optimized for speed and cost-effectiveness. Ideal for quick code completions and everyday coding tasks with fast responses.

Model Selection

You can switch between models in two ways:

Model Selector Menu

  1. Click the default model in the bottom-left corner of the chat input view
  2. Select the model you want to use from the dropdown menu

Keyboard Shortcut

Press Command + / to quickly cycle through your enabled models during a chat session.

Note that the o1 model is limited to 50 credits. To purchase additional credits, join our Discord community and message @DanielEdrisian.

API Key Configuration

You can add your API keys directly in the Model Settings screen. Simply click the settings icon in the top-right corner of the sidebar and look for the API key input fields for each provider under the “Model Settings” section.

Your API keys are stored securely and only used to authenticate with the respective AI providers. You can update or remove them at any time from the settings screen.

Custom Model Setup

You can add custom models that are compatible with the OpenAI API format. Follow these steps to configure a custom model:

Step 1: Add Custom Model

  1. Navigate to “Settings” by selecting the gear icon in the top-right corner of the sidebar
  2. Select “Models” and find the “Custom Models” section
  3. Click the “Add New Model” button to create a new custom model configuration
Step 2: Configure Model Details

  1. Enter the Model ID (e.g., qwen2.5-coder-32b-instruct, deepseek-chat)
  2. Provide the Base URL for your model’s API endpoint
  3. Add your API Key for authentication
  4. (Optional) Specify if the model supports image inputs
Step 3: Example (DeepSeek V3 Model)

To run the DeepSeek V3 model, use the following settings (a quick endpoint check is sketched after the list):

  • Model ID: deepseek-chat
  • Base URL: https://api.deepseek.com/v1
  • Enter your DeepSeek API Key in the provided field
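
Before saving the configuration, you can confirm from the command line that the endpoint and key respond as expected. This is a minimal sketch, assuming the endpoint follows the standard OpenAI chat completions format and that your key is stored in a DEEPSEEK_API_KEY environment variable (a placeholder, not something Alex Sidebar requires):

# Send a one-line test prompt to the DeepSeek endpoint
curl https://api.deepseek.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{"model": "deepseek-chat", "messages": [{"role": "user", "content": "Hello"}]}'

A JSON response containing a choices array indicates the Model ID, Base URL, and API Key are all valid.
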
Step 4: Finalize Setup

Go back to the chat screen by clicking the close icon in the top-right corner of the sidebar. The custom model will now appear in the model selection options.

Custom model selection is currently only available in normal chat mode, not in agent mode.

You can toggle between modes using Command + Shift + A.

Running Local Models

Alex Sidebar supports running local AI models through Ollama, providing a free and privacy-focused alternative to cloud-based models. Here is an example of how to set up a powerful local model like Qwen2.5-Coder:

Step 1: Install Prerequisites

  1. Install Ollama to manage and serve the local model
  2. Install ngrok to create a secure tunnel (a temporary requirement until direct localhost support is added)
  3. Create a free ngrok account at ngrok.com to get an authentication token
Step 2: Set Up the Model

# Pull the Qwen2.5-Coder model
ollama pull qwen2.5-coder:32b

# Start the Ollama server
ollama serve
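
Once the server is running, you can confirm that the model was pulled and that the endpoint is reachable. A minimal check, assuming your Ollama version exposes the OpenAI-compatible /v1 routes on the default port 11434:

# List available models through the OpenAI-compatible endpoint
curl http://localhost:11434/v1/models

The response should include qwen2.5-coder:32b in the model list.
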
Step 3: Configure ngrok

# Install ngrok
brew install ngrok

# Authenticate with your token
ngrok config add-authtoken YOUR_AUTH_TOKEN

# Create tunnel to Ollama server
ngrok http 11434 --host-header="localhost:11434"
Step 4: Configure in Alex Sidebar

Add a custom model with these settings (a quick tunnel check is sketched after the list):

  • Model ID: qwen2.5-coder:32b
  • Base URL: Your ngrok URL + /v1 (e.g., https://your-subdomain.ngrok-free.app/v1)
  • API Key: Your ngrok authentication token

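To verify the full path before switching models in chat, you can query the tunnel the same way. A minimal sketch, assuming your forwarding URL is https://your-subdomain.ngrok-free.app (substitute the URL that ngrok prints when the tunnel starts):

# The same model list, this time routed through the ngrok tunnel
curl https://your-subdomain.ngrok-free.app/v1/models
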
Direct localhost support is coming soon to Alex Sidebar, which will eliminate the need for ngrok tunneling.

Local models may run slower than cloud-based alternatives, especially on less powerful hardware. Consider your performance requirements when choosing between local and cloud models.

Credit: This setup process was documented by Daniel Raffel, who tested and validated the local model configuration with Alex Sidebar.

Best Practices

Model Selection Tips

• Use Claude 3.5 Sonnet or GPT-4 for complex architectural decisions
• Use Claude 3.5 Haiku or GPT-4 Mini for quick code completions

Performance Optimization

• Start a new chat when a conversation gets long to maintain accuracy
• Match model capabilities to task complexity

Troubleshooting

If you encounter issues with model responses:

  1. Check your API key configuration (a quick command-line check is sketched below)
  2. Verify your internet connection
  3. Ensure you’re within the model’s context limit
  4. Try switching to a different model
  5. Restart Alex Sidebar if issues persist
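
For the first item, a quick way to rule out a bad key is to call the provider directly. This is a minimal sketch using OpenAI’s models endpoint and a placeholder OPENAI_API_KEY environment variable; OpenAI-compatible providers expose a similar route, so adjust the URL for your provider:

# A valid key returns a JSON list of models; a bad key returns a 401 error
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"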

Need help? Join our Discord community for support and tips from other developers.

Code Apply View Position

Bottom Position

Keep the code apply interface fixed at the bottom for easy access to changes.

Improved Workflow

Review and apply code changes without scrolling through long conversations.
