AI Models Guide

AI Models & Selection

SOHAM routes your queries across 13+ specialized models from Groq, Google, HuggingFace, and OpenRouter. Here's what's available and when to use each.

Auto Mode — Recommended for Most Users

SOHAM analyzes your query and automatically routes it to the model best suited to that type of task.

🧮

Math problems

Routed to math-capable models (Qwen, Llama)

💻

Code questions

Routed to DeepSeek V3.2 or Llama 3.3 70B

🖼️

Image tasks

Routed to Gemini multimodal models

🔍

Web queries

Auto-triggers DuckDuckGo search + AI synthesis

💬

General chat

Routed to fast conversational models

📄

Document tasks

Routed to large-context models
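The routing categories above can be pictured as a small lookup: classify the query, then look up the destination. The sketch below is illustrative only. The guide does not describe how SOHAM actually classifies queries, so the keyword patterns and the names `ROUTES`, `PATTERNS`, `classify`, and `route` are all assumptions; only the category-to-destination text comes from the list above.

```python
import re

# Destination text copied from the Auto Mode list above.
ROUTES = {
    "math": "math-capable models (Qwen, Llama)",
    "code": "DeepSeek V3.2 or Llama 3.3 70B",
    "image": "Gemini multimodal models",
    "web": "DuckDuckGo search + AI synthesis",
    "document": "large-context models",
    "chat": "fast conversational models",
}

# Hypothetical keyword heuristics (NOT SOHAM's real classifier).
# Order matters: the first matching pattern wins.
PATTERNS = [
    ("image", re.compile(r"\b(image|photo|picture|screenshot)\b", re.I)),
    ("code", re.compile(r"\b(code|bug|function|debug|python|javascript)\b", re.I)),
    ("math", re.compile(
        r"\b(solve|integral|derivative|equation|calculate)\b|\d\s*[-+*/^=]\s*\d",
        re.I,
    )),
    ("web", re.compile(r"\b(latest|news|today|current|search)\b", re.I)),
    ("document", re.compile(r"\b(document|pdf|summarize|summarise|report)\b", re.I)),
]


def classify(query: str, has_image: bool = False) -> str:
    """Return a routing category for the query; falls back to 'chat'."""
    if has_image:
        return "image"
    for category, pattern in PATTERNS:
        if pattern.search(query):
            return category
    return "chat"


def route(query: str, has_image: bool = False) -> str:
    """Return the destination described in the guide for this query."""
    return ROUTES[classify(query, has_image)]
```

For example, `route("Why does this Python function throw an error?")` falls into the code category, while a query that matches no pattern is treated as general chat. A real router would likely use something more robust than keywords, but the two-step shape (classify, then look up) is the idea Auto mode describes.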

Available Models

General Purpose

Fast, versatile models for everyday tasks

Llama 3.1 8B Instant

Groq: Ultra-fast inference, great for quick answers and general chat

Fastest response
Streaming
Low latency

Llama 3.2 1B Instruct
Free

HuggingFace: Compact free model, good for lightweight tasks

Free
Lightweight
No limits

Llama 3.1 70B Instruct
Free

HuggingFace: Large free model with strong reasoning

Free
High quality
Complex reasoning

Qwen 2.5 7B Instruct
Free

HuggingFace: Strong multilingual model, excellent at math and code

Free
Multilingual
Math & code

Coding Specialists

Models fine-tuned for programming, debugging, and architecture

DeepSeek V3.2

HuggingFace: State-of-the-art coding model, best for complex programming tasks

Code generation
Debugging
Architecture advice

Conversational

Models optimized for natural dialogue

RNJ-1 Instruct

HuggingFace: Efficient conversational model with good personality

Natural dialogue
Context awareness
Friendly tone

Gemini 2.5 Flash Lite

Google: Lightweight Google model, fast and helpful

Fast responses
Casual chat
Helpful tone

Multimodal (Vision)

Models that understand and analyze images

Gemini 2.5 Flash

Google: Google's latest multimodal model, best for image analysis

Image analysis
Visual Q&A
Large context

Gemini Flash Latest

Google: Latest Gemini Flash with improved vision capabilities

Image understanding
Text + vision
Fast processing

Free / OpenRouter

Completely free models with no usage limits

OpenRouter Auto (Free)
Free

OpenRouter: Automatically selects the best available free model from OpenRouter

Auto selection
No limits
Multiple providers
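To compare the models listed above at a glance, it can help to treat the list as a small catalog and filter it. The sketch below is only an illustration: the `Model` class, `CATALOG`, and `find_models` are hypothetical names, not part of SOHAM. The entries mirror the model cards in this section, and `free=True` marks only the models that carry a "Free" badge here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    category: str
    free: bool = False  # True only where this guide shows a "Free" badge


# Entries transcribed from the model cards in this section.
CATALOG = [
    Model("Llama 3.1 8B Instant", "Groq", "General Purpose"),
    Model("Llama 3.2 1B Instruct", "HuggingFace", "General Purpose", free=True),
    Model("Llama 3.1 70B Instruct", "HuggingFace", "General Purpose", free=True),
    Model("Qwen 2.5 7B Instruct", "HuggingFace", "General Purpose", free=True),
    Model("DeepSeek V3.2", "HuggingFace", "Coding Specialists"),
    Model("RNJ-1 Instruct", "HuggingFace", "Conversational"),
    Model("Gemini 2.5 Flash Lite", "Google", "Conversational"),
    Model("Gemini 2.5 Flash", "Google", "Multimodal (Vision)"),
    Model("Gemini Flash Latest", "Google", "Multimodal (Vision)"),
    Model("OpenRouter Auto (Free)", "OpenRouter", "Free / OpenRouter", free=True),
]


def find_models(
    provider: Optional[str] = None,
    category: Optional[str] = None,
    free_only: bool = False,
) -> List[Model]:
    """Return catalog entries matching every filter that was supplied."""
    return [
        m for m in CATALOG
        if (provider is None or m.provider == provider)
        and (category is None or m.category == category)
        and (not free_only or m.free)
    ]
```

Calling `find_models(free_only=True)` returns the three free HuggingFace models plus OpenRouter Auto, and `find_models(category="Multimodal (Vision)")` returns the two Gemini vision models.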

How to Select a Model

Desktop

  1. Click the Settings icon (⚙️) in the header
  2. Find the "AI Model" dropdown
  3. Select "Auto" or a specific model
  4. Settings save automatically

Mobile

  1. Tap the model button in the chat header
  2. A bottom sheet opens with all models
  3. Tap to select your preferred model
  4. The sheet closes and the model is applied

💡 Model Selection Tips

For coding tasks

Use DeepSeek V3.2 for complex programming, debugging, and architecture questions.

For math problems

Auto mode routes math problems to Qwen 2.5 or Llama models, which handle math well.

For image analysis

Use Gemini 2.5 Flash for image understanding and visual Q&A.

For speed

Groq's Llama 3.1 8B Instant is the fastest model available.

For free unlimited use

HuggingFace models (Llama, Qwen) and OpenRouter Auto have no usage limits.

For general chat

Auto mode and RNJ-1 Instruct both work well for everyday conversations.

Model Providers

Groq

Ultra-fast inference with streaming

2 models available
Fastest responses
Streaming
Low latency

HuggingFace

Open-source models, many free

6 models available
Free models
Open source
Diverse selection

Google

Advanced multimodal capabilities

3 models available
Vision support
Large context
Multimodal

OpenRouter

Aggregated access to many models

1 model available
Auto selection
Free tier
Multiple providers