text-embedding-3-small Dimensions Explained: 1536 vs 1024 vs 512
If you use text-embedding-3-small, one small setting can quietly affect your whole retrieval system: embedding dimensions.
The default vector length is 1536 dimensions. That is a good default. But it is not always the cheapest or fastest choice once you store millions of chunks in a vector database.
This guide explains what text-embedding-3-small dimensions are, when to keep 1536, when to test smaller vectors, and how to call an OpenAI-compatible embeddings endpoint with real code.
What are text-embedding-3-small dimensions?
An embedding turns text into a list of numbers. That list is a vector.
For text-embedding-3-small, the default vector has 1536 numbers. If you embed the sentence:
“API gateways help developers route model calls.”
The model returns one vector that represents the meaning of that whole input. The vector is not one number per word. It is one semantic representation for the input text you send.
You then store that vector in a vector database such as pgvector, Pinecone, Milvus, Weaviate, Chroma, or Qdrant. When a user searches, you embed the query and compare it against stored vectors.
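To make the compare step concrete, here is a minimal sketch of a cosine-similarity search in plain Python. The function names (cosine_similarity, top_k) and the toy 3-dimensional vectors are illustrative assumptions, not part of any vector database API. A real system would delegate this to the database's index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=3):
    """Return the ids of the k stored vectors most similar to the query.

    `stored` maps a chunk id to its embedding vector.
    """
    scored = [(cosine_similarity(query_vector, vec), chunk_id)
              for chunk_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy 3-dimensional vectors stand in for real 1536-dimensional embeddings.
stored = {
    "gateway-doc": [0.9, 0.1, 0.0],
    "billing-doc": [0.0, 0.2, 0.9],
    "routing-doc": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
print(top_k(query, stored, k=2))  # ['gateway-doc', 'routing-doc']
```

Note that the query and the stored vectors must have the same length, which is why the dimension choice has to be made once for the whole index.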
OpenAI's documentation states that text-embedding-3-small defaults to 1536 dimensions and text-embedding-3-large defaults to 3072. Both models accept a dimensions parameter that shortens the output vector.
External references:
OpenAI embeddings guide
OpenAI text-embedding-3-small model page
Stack Overflow discussion on embedding dimensions
Default text-embedding-3-small dimensions: why 1536 is common
1536 dimensions is popular because it is the default. It is also a practical balance between quality and cost for many semantic search and RAG workloads.
Use the default 1536 dimensions when:
You are building your first retrieval system.
You do not have evaluation data yet.
Your dataset is small enough that vector storage is not painful.
Search quality matters more than a few gigabytes of storage.
You want fewer moving parts during the first launch.
That last point matters. If your app is still early, the biggest risk is usually not vector size. It is bad chunking, weak retrieval evaluation, missing metadata filters, or poor prompts.
Start simple. Then optimize.
The dimensions parameter: what changes and what does not
The dimensions parameter lets you request a shorter embedding vector.
For example, instead of asking for the default 1536-dimensional vector, you can request 1024, 768, or 512 dimensions if your provider supports it for that model.
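As a rough sketch, a request with a shorter vector could look like the code below. It uses only the Python standard library and assumes the standard OpenAI embeddings endpoint and an OPENAI_API_KEY environment variable. The helper names build_payload and embed are hypothetical, and an OpenAI-compatible provider may use a different base URL or ignore the dimensions field.

```python
import json
import os
import urllib.request

# Assumption: the standard OpenAI endpoint; swap in your provider's base URL.
EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"


def build_payload(text, dimensions=None, model="text-embedding-3-small"):
    """Build the JSON body for an embeddings request.

    `dimensions` is left out when None so the provider applies its default.
    """
    payload = {"model": model, "input": text}
    if dimensions is not None:
        if dimensions <= 0:
            raise ValueError("dimensions must be a positive integer")
        payload["dimensions"] = dimensions
    return payload


def embed(text, dimensions=None, url=EMBEDDINGS_URL):
    """Send the request and return the embedding as a list of floats."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text, dimensions)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is provided via the environment.
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The response carries one entry in "data" per input string.
    return body["data"][0]["embedding"]
```

Calling embed("API gateways help developers route model calls.", dimensions=512) should return a list of 512 floats if the provider honors the parameter. It is worth checking len() on the result before writing to your index.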
What changes:
| Area | 1536 dimensions | 1024 / 768 / 512 dimensions |
| --- | --- | --- |
| Vector storage | Larger | Smaller |
| Index memory | Larger | Smaller |
| Search latency | Often higher | Often lower |
| Retrieval quality | Strong baseline | Must be tested |
| API input token cost | Usually unchanged | Usually unchanged |
What does not usually change: the number of input tokens you send. Embedding API pricing is normally based on input tokens, not the final vector size.
That means smaller dimensions mainly help with storage, index memory, and retrieval speed. They are not a magic way to reduce the embedding generation bill.
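Because retrieval quality at smaller sizes must be tested, one cheap first check is to compare the top-k results of a smaller index against the 1536-dimensional baseline. The sketch below is one possible way to do that, not a standard metric: overlap_at_k and rank are hypothetical helper names, and the toy vectors are hand-made rather than real embeddings. If you have labeled relevance judgments, measuring recall against those is a stronger test.

```python
import math


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, corpus, k):
    """Ids of the k corpus vectors closest to the query, best first."""
    ordered = sorted(corpus, key=lambda cid: _cosine(query, corpus[cid]),
                     reverse=True)
    return ordered[:k]


def overlap_at_k(baseline, candidate, k):
    """Mean fraction of baseline top-k ids that the candidate also returns.

    `baseline` and `candidate` are (queries, corpus) pairs embedded at two
    different sizes. Both dicts map an id to a vector, with the same ids
    on each side.
    """
    base_queries, base_corpus = baseline
    cand_queries, cand_corpus = candidate
    scores = []
    for qid in base_queries:
        base_top = set(rank(base_queries[qid], base_corpus, k))
        cand_top = set(rank(cand_queries[qid], cand_corpus, k))
        scores.append(len(base_top & cand_top) / k)
    return sum(scores) / len(scores)


# Toy data: 3-dimensional "baseline" vs 2-dimensional "candidate" vectors.
base = ({"q1": [0.9, 0.1, 0.0]},
        {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.0, 0.0, 1.0]})
cand = ({"q1": [0.9, 0.1]},
        {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.1, 0.1]})
print(overlap_at_k(base, cand, k=2))  # 0.5
```

A score near 1.0 suggests the smaller vectors return mostly the same chunks as the baseline. A low score is a signal to look closer before switching.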
Storage math: 1536 vs 1024 vs 512 dimensions
A float32 number uses 4 bytes. So the raw vector size is:
vector_size_bytes = dimensions × 4
For one vector:
| Dimensions | Bytes per vector | Storage vs 1536 |
| --- | --- | --- |
| 1536 | 6,144 bytes | Baseline |
| 1024 | 4,096 bytes | ~33% smaller |
| 768 | 3,072 bytes | ~50% smaller |
| 512 | 2,048 bytes | ~67% smaller |
For 1 million chunks, raw float32 vector storage looks like this:
| Dimensions | Raw vector storage | With rough 35% index overhead |
| --- | --- | --- |
| 1536 | ~5.72 GiB | ~7.72 GiB |
| 1024 | ~3.81 GiB | ~5.15 GiB |
| 768 | ~2.86 GiB | ~3.86 GiB |
| 512 | ~1.91 GiB | ~2.57 GiB |
This is why dimensions start to matter at scale. A small difference per vector becomes real infrastructure cost when you store millions of chunks.
Quick calculator for embedding dimensions
Here is a small Python tool you can use to estimate storage and rough generation cost.
    #!/usr/bin/env python3
    import argparse


    def gib(n):
        """Convert a byte count to GiB."""
        return n / (1024 ** 3)


    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--documents", type=int, required=True)
        parser.add_argument("--avg-tokens", type=int, required=True)
        parser.add_argument("--dimensions", type=int, nargs="+",
                            default=[1536, 1024, 768, 512])
        parser.add_argument("--price-per-million", type=float, default=0.02)
        args = parser.parse_args()

        # Generation cost depends on input tokens, not on vector length.
        total_tokens = args.documents * args.avg_tokens
        estimated_cost = total_tokens / 1_000_000 * args.price_per_million

        print(f"Documents: {args.documents:,}")
        print(f"Estimated input tokens: {total_tokens:,}")
        print(f"Embedding generation cost: ${estimated_cost:,.2f}")
        print()
        print("Dims     Raw GiB   With 35% index overhead")
        for dim in args.dimensions:
            # Same math as the tables above: one float32 vector per
            # document, 4 bytes per dimension, plus a rough 35% overhead.
            raw_bytes = args.documents * dim * 4
            print(f"{dim:<8} {gib(raw_bytes):>7.2f}   {gib(raw_bytes * 1.35):>7.2f}")


    if __name__ == "__main__":
        main()