Embeddings RAG

| Category | Segment | Platform / Tool | Plan / License | Monthly Price USD | Pricing Model | Free Tier / OSS | Included Usage / Limits | Embedding Models / Dimensions | Reranking / Retrieval | RAG / Search Features | Integrations / Frameworks | Deployment / Hosting | Security / Privacy | Team / Governance | Best Fit | Main Limits / Caveats |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Embeddings RAG | Full RAG application platform | RAGFlow | Apache-2.0 / open source | $0 software | Open-source app/platform; hosting/model costs separate | | Open-source RAG engine/app that combines document understanding, retrieval and generation workflows | Uses external/local embedding models depending on configuration | Retrieval pipeline, document parsing and answer generation workflow; rerank support depends on setup | Document Q&A, knowledge base apps, parsing-heavy RAG and UI workflows | Docker/self-host, document parsers, vector/search backends and LLM providers | Self-hosted OSS; commercial/cloud options should be checked separately | Data stays in self-hosted deployment unless external model APIs are used | Self-managed governance in OSS | Teams wanting a full RAG app surface instead of only a library | Heavier than a library; production deployment and model/provider costs remain |
| Embeddings RAG | Embedding API | OpenAI | API pay-as-you-go | $0.02 / 1M tokens | Usage-based per input token; Batch API pricing may differ | No durable free tier captured on pricing page | Small embedding model; official model page lists $0.02 per 1M tokens | 1536 dimensions by default; supports shortening via dimensions parameter per OpenAI embeddings docs | No native reranker endpoint in OpenAI API row; pair with vector DB/search provider | Semantic search, clustering, recommendations, anomaly detection and classification | OpenAI SDKs, LangChain, LlamaIndex, vector DB integrations | Hosted OpenAI API | OpenAI API data handling terms apply; enterprise/data settings should be checked per org | Organization/API-key governance in OpenAI platform | Cost-sensitive general-purpose semantic search and RAG | No built-in vector store or reranker; no free quota captured for production API use |
| Embeddings RAG | Embedding API | OpenAI | API pay-as-you-go | $0.13 / 1M tokens | Usage-based per input token; Batch API pricing may differ | No durable free tier captured on pricing page | Most capable current OpenAI embedding model row; official page lists $0.13 per 1M tokens | 3072 dimensions by default; supports shortening via dimensions parameter per OpenAI embeddings docs | No native reranker endpoint in OpenAI API row; pair with vector DB/search provider | Higher-quality multilingual retrieval, clustering and classification | OpenAI SDKs, LangChain, LlamaIndex, vector DB integrations | Hosted OpenAI API | OpenAI API data handling terms apply; enterprise/data settings should be checked per org | Organization/API-key governance in OpenAI platform | Higher-quality retrieval when embedding cost is still secondary to answer model cost | More expensive than small model; no built-in vector storage/reranking |
| Embeddings RAG | Embedding API | Google Gemini API | Free / Paid | $0 free tier; $0.15 / 1M tokens paid; $0.075 / 1M batch | Free tier plus paid per-token pricing | | Free tier input price is free of charge; paid tier $0.15 per 1M tokens; batch paid tier $0.075 per 1M tokens | gemini-embedding-001; flexible output size 128 to 3072, recommended 768/1536/3072; 2,048 input token limit in embeddings docs | No standalone reranker captured; Gemini File Search charges embeddings plus regular model tokens for retrieved document tokens | Embeddings for semantic search, classification, clustering and RAG; File Search is available as a Gemini tool | Google GenAI SDK, REST, LangChain/LlamaIndex integrations, Google AI Studio | Gemini Developer API; Vertex AI for enterprise deployment path | Free tier content may be used to improve products; paid tier says content not used to improve products on pricing table | Google project/API-key governance; enterprise via Vertex AI | Developers wanting a free-start embedding API with strong Google ecosystem integration | Free tier rate limits and product-improvement terms matter; paid production needs billing |
| Embeddings RAG | Embedding and rerank API | Voyage AI | API pay-as-you-go with free allocation | $0 platform fee; usage after free allocation | Per-token embedding pricing by model | | Pricing page lists free token allocations by model; multimodal row explicitly gives 200M free text tokens and 150B pixels for voyage-multimodal-3.5/3 | Voyage embedding family for text, code, multilingual, law and finance; dimensions vary by model | Separate Voyage reranker endpoint available | High-quality domain-specific retrieval and RAG embeddings | Python/REST APIs, LangChain/LlamaIndex/vector DB integrations | Hosted Voyage API | Data handling terms depend on account/enterprise contract | API-key/account governance; enterprise options by sales | Teams optimizing retrieval quality in specific domains like code, law or finance | Free allocations and exact per-model prices vary; check the pricing table before high-volume use |
| Embeddings RAG | Rerank and multimodal retrieval API | Voyage AI | API pay-as-you-go with free allocation | $0 platform fee; rerank after first 200M processed tokens | Rerank billed by processed tokens; multimodal billed by text tokens and pixels | | First 200M processed rerank tokens free for rerank-2.5/rerank-2.5-lite/rerank-2/rerank-2-lite; multimodal has 200M text tokens and 150B pixels free | voyage-multimodal-3.5/3 supports text/image/video retrieval pricing row; text embedding models separate | Rerank endpoint calculates processed tokens as query tokens times document count plus document tokens | Second-stage reranking, multimodal retrieval and RAG quality improvement | API usage with vector DBs, LangChain/LlamaIndex and custom RAG stacks | Hosted Voyage API | Data handling terms depend on account/enterprise contract | API-key/account governance; enterprise options by sales | RAG teams needing strong reranking or multimodal retrieval | Rerank billing scales with candidate document count; multimodal image/video pixel billing needs estimation |
| Embeddings RAG | Embedding API | Cohere | Trial / Production API | Usage-based; current prices on Cohere pricing page | Embedding models billed by tokens embedded | Yes, via evaluation/trial keys | Docs distinguish limited evaluation keys from paid production keys; embed rate limit examples list 100/min evaluation and 2,000/min production | Embed 4 and other Cohere embed models; dimensions/model details vary | Rerank is a separate Cohere model family | Enterprise-grade multilingual semantic search and retrieval | Cohere API, LangChain, LlamaIndex, vector DB integrations; Cohere Compass for managed search | Hosted API or Model Vault private deployment | Enterprise/private deployment via Cohere Model Vault; data/security terms by plan | Production keys and enterprise contracts; Model Vault dedicated deployment options | Companies wanting Cohere retrieval models with enterprise deployment options | Public pricing page emphasizes enterprise/private deployments; exact hosted API unit prices should be rechecked at checkout/docs |
| Embeddings RAG | Rerank API | Cohere | Enterprise / Model Vault | $3,250/mo for Rerank 3.5 or Rerank 4 Fast Medium Model Vault; $6,500/mo Rerank 4 Pro Large | Dedicated instance pricing; hosted API pricing separate | Trial/evaluation keys exist for API access | Pricing page lists Model Vault hourly/monthly rates for Rerank 3.5, Rerank 4 Fast and Rerank 4 Pro; rate-limit docs list rerank eval and production key limits | Not an embedding row; pairs with Cohere Embed or third-party embeddings | Rerank 3.5/4 models for reordering retrieved candidates; Rerank docs price hosted endpoint by searches | Two-stage RAG, enterprise search and relevance tuning | Cohere API, Compass, vector DB and framework integrations | Hosted API or dedicated Model Vault deployment | Model Vault is dedicated/fully managed with no shared resources per pricing page | Enterprise procurement, dedicated deployment and support | Enterprise search teams that need reranking and private deployment controls | Monthly Model Vault rates are high for small teams; hosted API unit prices still need current pricing-table check |
| Embeddings RAG | Embedding, rerank and search API | Jina AI | Free API key | $0 | Free token quota plus rate limits | | API docs say new users receive 10M free tokens; Embedding/Reranker free-key limits show 100 RPM and 100,000 TPM | Jina embeddings include text and multimodal models; dimensions/model choice vary | Reranker API is available with same free-key rate-limit shape as embedding API | Embeddings, reranking, classification, reader/search APIs and batch embeddings | REST API, OpenAI-compatible metadata endpoint, LangChain/LlamaIndex style integrations | Hosted Jina API; local/open model usage depends on model license | Commercial model license and API terms apply; some model licenses are not fully open | API-key tiers: free, paid, premium; dashboard key manager | Developers needing generous free multilingual/multimodal retrieval API | Free tokens are finite; commercial model license details can differ by model |
| Embeddings RAG | Embedding, rerank and search API | Jina AI | Paid / Premium | Usage-based; rate limits scale by tier | Token-counted API usage | Free tier exists | Paid-key limits for Embedding/Reranker show 500 RPM and 2,000,000 TPM; Premium shows 5,000 RPM and 50,000,000 TPM | Text, multimodal, multi-vector/ColBERT and classifier model families | Reranker endpoint and search foundation API | Search AI for multilingual and multimodal data, batching and high-throughput retrieval | REST/OpenAPI, batch jobs and common RAG frameworks | Hosted Jina API | Commercial terms and data handling by Jina account/tier | Tiered API keys; premium/enterprise support path | Teams scaling retrieval workloads after free quota validation | Exact per-token price is not visible in the captured rate-limit table; confirm dashboard billing before production |
| Embeddings RAG | Embedding API | Mistral AI | API pay-as-you-go | $0.10 / 1M tokens | Per-token API pricing | No free tier captured in official model card | Model card lists $0.1 per million tokens and 8k context | mistral-embed; text/code semantic representations; 8k context on model card | No first-party reranker captured in this row | Semantic search, clustering, classification and RAG quickstarts | Mistral SDK/API, LangChain/LlamaIndex and Mistral knowledge/RAG toolkit | Hosted Mistral API; enterprise deployment options should be checked separately | Mistral API legal/privacy terms apply | Workspace/API-key governance in Mistral console | Teams already using Mistral for generation and wanting same-vendor embeddings | Single embedding model family; no native reranker row captured |
| Embeddings RAG | Inference provider for embeddings/ranking | Hugging Face Inference Providers | Free credits / PRO / Team | $0.10 monthly credits Free; $2/mo credits PRO; $2/seat/mo credits Team/Enterprise | Monthly credits plus pay-as-you-go by provider/hardware | | Pricing docs list $0.10 monthly credits for free users, $2 for PRO, and $2/seat for Team or Enterprise organizations | Access to many embedding and text-ranking models through providers; hf-inference focuses mostly on CPU tasks including embedding/text-ranking | Text-ranking models available depending on provider/model | Model hub, widgets, inference playground and serverless provider routing | Hugging Face Hub, transformers, sentence-transformers, LangChain/LlamaIndex integrations | Hosted Inference Providers or custom provider key; self-host models separately | Data/provider handling depends on selected routed provider or custom provider key | User, PRO, Team and Enterprise org billing/governance | Experimenting across many OSS embedding/rerank models without separate provider setup | Free credits are tiny; production cost depends on model/provider compute time |
| Embeddings RAG | Developer API and domain search platform | Nomic | Business / Enterprise | $40/user/mo annual, 25-seat minimum; $1,000/mo minimum | Seat subscription with included AI usage | No free developer API tier captured on pricing page | Each $40 seat includes $20 of included AI usage; usage can apply to Developer API, document ingestion and platform tools | Nomic Embed model family; developer API tools built on top of Nomic | Search/research queries and document ingestion are included AI-usage categories | Project data search, document indexing, workflows and domain-specific retrieval | Developer API, Nomic platform, document/project data sources | Cloud SaaS; Enterprise adds VPC/on-prem options | Org-wide privacy controls on Business; Enterprise adds SCIM, audit logs and deployment controls | Business has SAML/OIDC SSO; Enterprise custom governance | Architecture/engineering firms and teams using Nomic's document-search platform plus API | Annual 25-seat minimum makes it a poor fit for hobby embedding-only usage |
| Embeddings RAG | Hosted model inference inside vector DB | Pinecone | Starter / paid plans | $0 Starter minimum; hosted embedding usage priced per model | Plan minimums plus per-token inference/model pricing | | Starter has $0/month minimum; Pinecone limits page lists 5M embedding tokens/month/model on Starter; model gallery lists hosted model prices such as multilingual-e5-large $0.08/1M tokens and llama-text-embed-v2 $0.16/1M tokens | Hosted embedding models include llama-text-embed-v2, multilingual-e5-large and sparse encoder rows; dimensions depend on model | Pinecone has hosted rerank models separately | Integrated inference with upsert/query, dense/sparse retrieval and vector search | Pinecone SDKs, LangChain/LlamaIndex, hosted vector index integrations | Managed Pinecone cloud | Pinecone account/project security and enterprise controls by plan | Starter/Builder/Standard/Enterprise plan governance | Teams wanting embeddings and vector storage/search managed in one place | Starter monthly token limit is low; hosted model pricing and vector DB usage both contribute to cost |
| Embeddings RAG | Hosted rerank inside vector DB | Pinecone | Starter / paid plans | $2.00 / 1k rerank requests for listed models | Per-request rerank pricing plus plan limits | Yes, selected rerank models on Starter | Model gallery lists cohere-rerank-3.5, bge-reranker-v2-m3 and pinecone-rerank-v0 at $2.00 per 1k requests; limits page shows 60 RPM Starter for bge/pinecone rerank, cohere-rerank not available on Starter | Not an embedding row; pairs with Pinecone hosted or external embeddings | Reranks candidate documents after vector, keyword or hybrid retrieval | Two-stage retrieval, integrated search and relevance improvement | Pinecone SDK inference.rerank, vector DB query flows, framework integrations | Managed Pinecone cloud | Pinecone account/project security and enterprise controls by plan | Plan-based rate limits and enterprise governance | RAG teams already using Pinecone who want one vendor for vector search and rerank | Rerank priced per request, so candidate-count and query volume need monitoring |
| Embeddings RAG | Embedding, rerank and vector store API | Mixedbread | Free | $0 with $5 one-time credits | Free credits plus request limits | | Starter includes $5 one-time credits, 3 workspace users, 10 stores and 100 requests/min; pricing page advertises up to $250 in free credits separately | Mixedbread embedding models and vector/store platform; exact dimensions depend on selected model | Rerank listed at $7.50 per 1k queries in pricing snippet | Embeddings, reranking, vector stores and retrieval APIs | API, stores, RAG integrations and custom apps | Hosted Mixedbread platform | Data/security terms depend on plan; Enterprise has dedicated infrastructure/BYOC | 3 users on Starter; Enterprise custom | Developers testing embedding/rerank/store workflows without card | One-time free credits are limited; exact model prices should be checked for selected route |
| Embeddings RAG | Embedding, rerank and vector store API | Mixedbread | Scale / Enterprise | $20/mo Scale; Enterprise custom | Subscription with included credits plus pay-as-you-go usage | Starter free plan exists | Scale includes $20/month credits, unlimited workspace users, 10,000 stores, 1,200 queries/min and 360 ingestion/min; Enterprise adds custom limits and BYOC | Embedding models and store-backed retrieval workflows | Rerank price listed as $7.50 per 1k queries; higher rate limits on Scale/Enterprise | Managed stores, ingestion, search, rerank and retrieval workflows | API and platform integrations | Hosted platform; Enterprise dedicated infrastructure/BYOC | Enterprise offers dedicated infrastructure and white-glove support | Unlimited users on Scale; Enterprise custom | Teams needing integrated retrieval store plus embeddings/reranking | Stores and query limits are plan-specific; overage model should be watched |
| Embeddings RAG | Hosted OSS embedding/rerank models | Fireworks AI | Serverless pay-as-you-go | $1 free credits; embeddings from $0.008 to $0.10 / 1M input tokens | Per-token serverless inference | | Pricing page says get started with $1 in free credits; embedding price table lists up to 150M params at $0.008/1M tokens, 150M-350M at $0.016, Qwen3 8B at $0.10 | Hosts Qwen3 embedding models and other embedding/rerank models; context/model dimensions vary | Fireworks docs include embeddings and reranking service with OpenAI-compatible embeddings endpoint | Semantic search, RAG and reranking using hosted open models | OpenAI-compatible endpoint, Python/REST, LangChain/LlamaIndex style integrations | Serverless API; on-demand deployments for dedicated GPUs | Fireworks account/API-key terms; enterprise deployments available | Project/API key governance; enterprise custom | Teams wanting cheap hosted open embedding models without managing GPUs | Model page snippets can show inconsistent library prices; use pricing page and model page before production |
| Embeddings RAG | Cloud embedding API | Amazon Bedrock | AWS pay-as-you-go | $0.02 / 1M tokens for Titan Text Embeddings V2 commonly listed; Nova multimodal differs | Bedrock on-demand token pricing | No always-free Bedrock model tier captured | Bedrock pricing docs list Amazon embedding models including Titan Text Embeddings V2, Titan Multimodal and Nova Multimodal; current region/model pricing should be checked in AWS pricing table | Titan Text Embeddings V2 and multimodal embedding models; dimensions/model capabilities vary | No first-party reranker row captured; can pair with OpenSearch/Kendra/vector DBs | AWS-native RAG, semantic search and knowledge base ingestion | Bedrock Knowledge Bases, OpenSearch, Aurora pgvector, LangChain/LlamaIndex, AWS SDK | AWS Bedrock regional service | AWS IAM/VPC/compliance controls; data terms by Bedrock model/provider | AWS account/IAM/org governance | AWS-heavy teams needing embeddings inside existing cloud/compliance perimeter | Pricing varies by region/model and AWS tables are harder to scrape; verify exact region before cost modeling |
| Embeddings RAG | Open-source embedding library | Sentence Transformers | Apache-2.0 / open source | $0 software | Self-hosted open-source library; compute/model hosting separate | | Python framework for state-of-the-art sentence, text and image embeddings; no hosted quota because it runs locally or on your infrastructure | Supports many pretrained embedding models and cross-encoders; dimensions vary by model | Cross-encoder reranking supported through sentence-transformers/cross-encoder workflows | Semantic search, clustering, retrieval, paraphrase mining and model fine-tuning | Hugging Face models, PyTorch, transformers, LangChain/LlamaIndex integrations | Local, server, GPU, HF Inference or custom hosting | Data stays local if self-hosted; model licenses vary | No SaaS governance unless wrapped by your platform | Teams that want maximum model choice and control over embeddings | Requires infra, batching, monitoring and model-license checks |
| Embeddings RAG | Open-source lightweight embedding/rerank library | FastEmbed | Apache-2.0 / open source | $0 software | Self-hosted open-source library; compute/model hosting separate | | Lightweight ONNX Runtime-based library; default examples use BAAI/bge-small-en-v1.5 and support dense, sparse, late-interaction multimodal and rerankers | Dense/sparse/late-interaction embeddings; dimensions vary by model; example BGE small vector is 384 dimensions | TextCrossEncoder rerankers supported | Local semantic retrieval, Qdrant integration, serverless-friendly embeddings | Qdrant, Python, ONNX Runtime, custom HF model sources | Local/self-hosted; can run in serverless runtimes more easily than heavier PyTorch stacks | Data stays local; model licenses vary | No SaaS governance unless wrapped by your platform | Developers wanting fast local embeddings/rerank without PyTorch weight | Model support is curated; still needs vector DB/storage and production ops |
| Embeddings RAG | Open-source embedding models | BAAI BGE / E5 / Qwen Embeddings | Open model licenses vary | $0 software; hosting/API costs separate | Local or hosted through HF/Fireworks/Together/etc. | Yes, if local/open weights license permits | Local resource highlights BGE-Large-EN-v1.5, E5-Mistral and Nomic Embed as free/local choices; HF model cards define exact licenses and dimensions | BGE, E5, Qwen and Nomic families; dimensions/context vary by model | Open rerankers such as BGE reranker or Qwen reranker can be paired separately | Custom semantic search, multilingual retrieval and domain-tuned RAG | Sentence Transformers, FastEmbed, TEI, vLLM, HF Inference, vector DBs | Local, cloud GPU, HF Inference or provider APIs | Data privacy depends on local vs hosted execution; licenses vary per model | Governance is self-managed unless using a platform | Cost-controlled teams willing to manage models for retrieval quality | Licenses, quantization quality, pooling strategy and hardware requirements vary widely |
| Embeddings RAG | RAG orchestration framework | LangChain | MIT / open source | $0 software | Open-source framework; provider/vector DB costs separate | | Official LangChain page says LangChain is MIT-licensed open source and free to use | Supports many embedding providers and vector stores through integrations | Retriever, contextual compression and reranker integrations through ecosystem | Chains, retrievers, tools, agents, document loaders and RAG app patterns | OpenAI, Gemini, Cohere, Hugging Face, vector DBs, LangGraph and LangSmith | Local/server app framework; LangSmith/hosting separate | Data handling depends on providers and whether LangSmith is used | No SaaS governance in OSS; LangSmith adds team governance/pricing | Developers wanting the broadest RAG integration ecosystem | Framework complexity and version churn can be nontrivial; observability/hosted features are separate |
| Embeddings RAG | RAG pipeline framework | Haystack by deepset | Apache-2.0 / open source | $0 software | Open-source framework; deepset platform custom/priced separately | | Local resources list Haystack as open-source RAG framework; commercial deepset platform handles infrastructure/collaboration separately | Integrates embedding retrievers and document stores; model dimensions depend on provider | Supports rankers/retrievers and pipeline components for retrieval quality | Composable pipelines for search, Q&A, RAG, agents and evaluation | Elasticsearch/OpenSearch, vector DBs, model APIs and Python ecosystem | Self-hosted framework; deepset enterprise/cloud platform available | Self-host keeps data in your infra; enterprise platform terms separate | OSS has no SaaS governance; enterprise platform adds roles/collaboration | Python teams building production retrieval pipelines with explicit components | Cloud/platform pricing is not public in captured official/local sources; OSS requires ops |
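
The per-token prices and the rerank billing rules listed above lend themselves to a quick back-of-the-envelope estimate. The sketch below is illustrative only: the dictionary keys and function names are made up for this example, the prices are copied from the rows above and may be stale, and free allocations, batch discounts and rate limits are ignored. The processed-token formula follows the wording of the Voyage AI rerank row (query tokens times document count, plus document tokens), and the per-request price is the $2.00 per 1k requests figure from the Pinecone rerank row.

```python
from typing import Sequence

# Snapshot of per-1M-token embedding prices (USD) taken from the table above.
# The keys are hypothetical labels for this sketch, not official model IDs.
EMBEDDING_PRICE_PER_1M_TOKENS = {
    "openai-small": 0.02,
    "openai-large": 0.13,
    "gemini-paid": 0.15,
    "gemini-batch": 0.075,
    "mistral-embed": 0.10,
}


def embedding_cost_usd(total_tokens: int, price_per_1m: float) -> float:
    """Cost of embedding `total_tokens` at a flat per-1M-token price."""
    return total_tokens / 1_000_000 * price_per_1m


def rerank_processed_tokens(query_tokens: int, doc_tokens: Sequence[int]) -> int:
    """Processed tokens for one rerank call as described in the Voyage AI row:
    query tokens times document count, plus the sum of document tokens."""
    return query_tokens * len(doc_tokens) + sum(doc_tokens)


def per_request_rerank_cost_usd(requests: int, price_per_1k: float = 2.00) -> float:
    """Cost of per-request rerank billing (default: $2.00 per 1k requests)."""
    return requests / 1_000 * price_per_1k


if __name__ == "__main__":
    corpus_tokens = 50_000_000  # assumed corpus size for illustration
    for label, price in EMBEDDING_PRICE_PER_1M_TOKENS.items():
        print(f"{label}: ${embedding_cost_usd(corpus_tokens, price):.2f}")

    # One query of 20 tokens reranking 50 candidates of 300 tokens each.
    print(rerank_processed_tokens(20, [300] * 50), "processed tokens per query")
    print(f"${per_request_rerank_cost_usd(10_000):.2f} for 10,000 rerank requests")
```

Because processed tokens grow with the number of candidates, halving the candidate list roughly halves token-billed rerank cost, whereas per-request billing is unaffected by candidate count and scales only with query volume.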