
GPT-3.5 Turbo

OpenAI

The latest GPT-3.5 Turbo model with improved accuracy in requested response formats and a fix for a bug that caused text encoding issues in function calls for non-English languages.

Key Specifications

Parameters
-
Context
16.4K
Release Date
March 21, 2023
Average Score
42.3%

Timeline

Key dates in the model's history
Announcement
March 21, 2023
Last Update
July 19, 2025

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
September 30, 2021
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$0.50
Output (per 1M tokens)
$1.50
Max Input Tokens
16.4K
Max Output Tokens
4.1K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
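
As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal Python sketch. The prices and token limits are copied from this listing; the function name `estimate_cost_usd` and the reading of "16.4K" and "4.1K" as 16,400 and 4,100 tokens are assumptions made for the example, and the exact limits enforced by the API may differ.

```python
# Prices and limits as shown in the listing above.
# Assumption: "16.4K" and "4.1K" are read as 16,400 and 4,100 tokens.
INPUT_USD_PER_1M_TOKENS = 0.50
OUTPUT_USD_PER_1M_TOKENS = 1.50
MAX_INPUT_TOKENS = 16_400
MAX_OUTPUT_TOKENS = 4_100


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError("input exceeds the listed maximum input tokens")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed maximum output tokens")
    return (
        input_tokens * INPUT_USD_PER_1M_TOKENS
        + output_tokens * OUTPUT_USD_PER_1M_TOKENS
    ) / 1_000_000


if __name__ == "__main__":
    # 10,000 input tokens and 1,000 output tokens
    print(f"${estimate_cost_usd(10_000, 1_000):.4f}")
```

With these figures, a request with 10,000 input tokens and 1,000 output tokens comes to about $0.0065. Output tokens cost three times as much as input tokens, so long completions dominate the bill faster than long prompts.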

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
MMLU
Accuracy (Verified)
69.8%

Programming

Programming skills tests
HumanEval
Accuracy (Verified)
68.0%

Mathematics

Mathematical problems and computations
MATH
Accuracy (Verified)
43.1%
MGSM
Accuracy (Verified)
56.3%

Reasoning

Logical reasoning and analysis
DROP
Accuracy (Verified)
70.2%
GPQA
Accuracy (Verified)
30.8%

Multimodal

Working with images and visual data
MathVista
Accuracy (Verified)
-
MMMU
Accuracy (Verified)
-
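
The page does not say how the 42.3% Average Score in Key Specifications is derived. The Python sketch below simply averages the scores listed in this section in two ways: over the six benchmarks that have a score, and over all eight with the two missing scores ("-") counted as zero. The variable names are illustrative, and treating a missing score as zero is an assumption made for the comparison, not a documented method.

```python
from statistics import mean

# Scores as listed above; None marks a benchmark shown with "-" (no score).
scores = {
    "MMLU": 69.8,
    "HumanEval": 68.0,
    "MATH": 43.1,
    "MGSM": 56.3,
    "DROP": 70.2,
    "GPQA": 30.8,
    "MathVista": None,
    "MMMU": None,
}

reported = [s for s in scores.values() if s is not None]

# Mean over the six benchmarks that have a score.
mean_reported = mean(reported)

# Mean over all eight benchmarks, counting a missing score as 0 (assumption).
mean_missing_as_zero = sum(reported) / len(scores)

print(f"reported only:   {mean_reported:.1f}%")
print(f"missing as zero: {mean_missing_as_zero:.1f}%")
```

The first figure comes out at 56.4% and the second at 42.3%. The second matches the listed Average Score, which suggests, but does not confirm, that benchmarks without a score are counted as zero in that average.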

License & Metadata

License
proprietary
Announcement Date
March 21, 2023
Last Updated
July 19, 2025
