Straightforward pricing that rewards you as you grow

Whether you’re just exploring or ready for commitment, our pricing plans make it easy to get started. All plans give you access to Speech-to-Text and Audio Intelligence models and endpoints.

Pay As You Go

$200 in free credit

Then pay-as-you-go. No minimums. No expiration. No credit card required.

Access all endpoints and public models
Up to 100 concurrent requests for Deepgram models and 5 for Deepgram Whisper Cloud
Discord and community support

Growth

$4k-10k / year

Save 20%

With pre-paid credits for the year. Credits are redeemed against actual usage. 

Access all endpoints and public models at favorable discounts
Up to 100 concurrent requests for Deepgram models and 5 for Deepgram Whisper Cloud
Discord and community support

Enterprise

Exclusive

For businesses with large volumes, data or deployment requirements, or support needs.

Contact Sales

Access all endpoints and public models with our best discounts
Access to custom-trained Speech-to-Text models
Priority access to new endpoints and models
Highest concurrency support
Private Cloud or On-Prem Deployments
Premium SLAs, dedicated support teams, and email support
Discord and community support

Speech-to-Text

Power your apps with world-class speech recognition in 30+ languages.

Includes: Speaker Diarization, Smart Formatting, Automatic Language Detection, Deep Search, Keyword Boosting, Multichannel Support, and Callbacks.

For detailed model, language, and feature availability, please refer to our Developer Documentation.

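Pricing is listed separately for Pre-Recorded and Streaming audio.

Pre-Recorded

Model | Pay As You Go | Growth | Enterprise
Nova-2 | $0.0043/min | $0.0036/min | Contact Sales
Nova-1 | $0.0043/min | $0.0036/min | Contact Sales
Enhanced | $0.0145/min | $0.0115/min | Contact Sales
Base | $0.0125/min | $0.0095/min | Contact Sales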

$0.0048/min

$0.0042/min

$0.0035/min

$0.0038/min

$0.0032/min

$0.0033/min

$0.0027/min

$0.0035/min

$0.0028/min

Custom

Streaming

Model | Pay As You Go | Growth | Enterprise
Nova-2 | $0.0059/min | $0.0049/min | Contact Sales
Nova-1 | $0.0059/min | $0.0049/min | Contact Sales
Enhanced | $0.0165/min | $0.0136/min | Contact Sales
Base | $0.0145/min | $0.0105/min | Contact Sales

Custom

Audio Intelligence

Powered by task-specific language models. Works with or without transcription. Handles text or audio.

Model | Pay As You Go | Growth | Enterprise
Summarization | $0.0003/1k input tokens, $0.0006/1k output tokens | $0.00024/1k input tokens, $0.00048/1k output tokens | Contact Sales
Redaction | $0.26/hr | $0.21/hr | Contact Sales

Topic Detection

Get Early Access

Sentiment Analysis

Intent Recognition
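
As a rough illustration of how the token-based Summarization rates above add up, here is a minimal sketch. The function name, the plan keys, and the token counts are made up for the example; only the per-1k-token rates come from this page. It uses Decimal so that very small dollar amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# (input rate, output rate) in dollars per 1,000 tokens, copied from this page.
# The plan keys are hypothetical labels chosen for this sketch.
SUMMARIZATION_RATES = {
    "pay_as_you_go": (Decimal("0.0003"), Decimal("0.0006")),
    "growth": (Decimal("0.00024"), Decimal("0.00048")),
}


def summarization_cost(input_tokens: int, output_tokens: int,
                       plan: str = "pay_as_you_go") -> Decimal:
    """Estimate the Summarization charge for one request, in dollars."""
    input_rate, output_rate = SUMMARIZATION_RATES[plan]
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / Decimal(1000)


# Example: a 10,000-token transcript summarized into 500 tokens.
print(summarization_cost(10_000, 500, "pay_as_you_go"))
print(summarization_cost(10_000, 500, "growth"))
```

Under these assumptions the request costs about a third of a cent on Pay As You Go, and 20% less on Growth.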

Frequently Asked Questions

How is multichannel billed?

When you opt into using the multichannel feature, each channel is transcribed and billed separately. The total cost when using multichannel is the single-channel cost multiplied by the number of channels. When you do not enable multichannel, Deepgram converts your multichannel audio into mono single-channel audio, and it is transcribed and billed as one channel. We especially recommend using the multichannel feature on multichannel audio in which there is cross-talk (voices overlapping or talking over each other) for the most accurate transcription and speaker detection.
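
The multichannel rule above can be sketched as a small calculation. This is an illustration only: the function and parameter names are hypothetical, and the rate used is the lower of the two Nova-2 Pay As You Go rates listed on this page.

```python
from decimal import Decimal


def multichannel_cost(duration_seconds: int, rate_per_minute: Decimal,
                      channels: int, multichannel: bool) -> Decimal:
    """Estimate the transcription cost for one audio file, in dollars."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    # With multichannel enabled, each channel is billed separately.
    # Without it, the audio is mixed down to mono and billed as one channel.
    billed_channels = channels if multichannel else 1
    minutes = Decimal(duration_seconds) / Decimal(60)
    return minutes * rate_per_minute * billed_channels


NOVA_2_PAYG_RATE = Decimal("0.0043")  # dollars per minute, from this page

# A 10-minute stereo call, with and without the multichannel feature.
stereo_on = multichannel_cost(600, NOVA_2_PAYG_RATE, channels=2, multichannel=True)
stereo_off = multichannel_cost(600, NOVA_2_PAYG_RATE, channels=2, multichannel=False)
print(stereo_on, stereo_off)
```

Enabling multichannel on the stereo file doubles the charge, which is the trade-off to weigh against the accuracy gain on audio with cross-talk.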

What's the difference between Nova, Enhanced and Base models?

Nova is our newest and most powerful model, offering the best balance between accuracy and cost-effectiveness. Enhanced is a powerful ASR model that performs especially well with uncommon words. Base is our signature model, with a solid combination of accuracy and cost-effectiveness. Some languages are only supported by Enhanced and Base. See our Models Overview and Model docs.

Which file types can you transcribe?

We support over 40 audio and video formats. See our Developer Documentation for the full list.

How does billing work?

You purchase credit upfront with a credit card. Credit is deducted from your balance as you use our API. Pay As You Go credit never expires. Growth plan credit expires 1 year from purchase unless you renew or upgrade.

What unit of time is billed, minutes or seconds?

Deepgram bills by the second of audio. For instance, if you transcribe 61 seconds of audio, we bill you for 61 seconds of usage, not 2 minutes (120 seconds).
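
To make the 61-second example concrete, here is a minimal sketch comparing per-second billing with a hypothetical scheme that rounds up to whole minutes. Both function names are invented for illustration, the rounded-up variant is shown only as a contrast, and the rate is one of the Nova-2 Pay As You Go rates from this page.

```python
from decimal import Decimal


def per_second_cost(duration_seconds: int, rate_per_minute: Decimal) -> Decimal:
    """Cost when every second is billed at 1/60 of the per-minute rate."""
    return Decimal(duration_seconds) * rate_per_minute / Decimal(60)


def rounded_up_minute_cost(duration_seconds: int, rate_per_minute: Decimal) -> Decimal:
    """Hypothetical contrast: cost if usage were rounded up to whole minutes."""
    whole_minutes = -(-duration_seconds // 60)  # ceiling division
    return whole_minutes * rate_per_minute


rate = Decimal("0.0043")  # dollars per minute

print(per_second_cost(61, rate))         # billed for 61 seconds
print(rounded_up_minute_cost(61, rate))  # would be billed for 120 seconds
```

For a single short file the difference is tiny, but across many short requests per-second billing keeps the total close to the audio actually processed.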

Can you transcribe live streaming audio?

Definitely. In fact, we’ve got the fastest real-time transcription in the biz with latency times of under 300 milliseconds.

Can Deepgram support real-time conversations?

Yes! Our streaming API is designed for low latency and will return incremental transcripts as a speaker’s sentence unfolds. You can stay up to date on our latest Conversational AI technology by subscribing to our newsletter.

What happens if I run out of credit before my Growth plan expires?

If you’re on the Growth plan and have saved a credit card, you can continue to use our API with a 10% overage fee billed at the start of each month. This is still less than Pay As You Go rates. You can renew or upgrade your plan at any point to prevent an overage fee.
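
The claim that overage usage still costs less than Pay As You Go can be checked with a short sketch. It assumes the 10% fee is applied on top of the Growth rate for usage beyond your prepaid credit; that reading, the function name, and the dictionary layout are assumptions for illustration. The rates are copied from the first Speech-to-Text table on this page.

```python
from decimal import Decimal

OVERAGE_FEE = Decimal("0.10")  # 10% overage fee from the answer above

# (Pay As You Go, Growth) rates in dollars per minute, from this page.
RATES = {
    "Nova-2": (Decimal("0.0043"), Decimal("0.0036")),
    "Nova-1": (Decimal("0.0043"), Decimal("0.0036")),
    "Enhanced": (Decimal("0.0145"), Decimal("0.0115")),
    "Base": (Decimal("0.0125"), Decimal("0.0095")),
}


def overage_rate(growth_rate: Decimal) -> Decimal:
    """Effective per-minute rate, assuming the fee is added to the Growth rate."""
    return growth_rate * (1 + OVERAGE_FEE)


for model, (payg, growth) in RATES.items():
    effective = overage_rate(growth)
    print(f"{model}: overage ${effective}/min vs Pay As You Go ${payg}/min")
```

Under this assumption, every model's overage rate stays below its Pay As You Go rate.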

What languages do you support?

We support over 30 languages and dialects for transcription. See our languages overview for a complete list of languages, as well as which models support each language.

Can I get human support?

Sure thing. You can get help from our community over at Discord or GitHub Discussions. If you are subscribed to our Enterprise plan, reach out to our support team via email or Slack for assistance.

How many seats do I get?

We bill based on usage, not users. Add as many team members and collaborators as you wish!

Can I deploy Deepgram on-premises or in a VPC?

Yes, our Enterprise plan offers on-prem and VPC deployments. Contact us about getting on an Enterprise plan to expand your deployment capabilities.

Are the Audio Intelligence features affected by the model I select?

Our Audio Intelligence features, such as topic detection and summarization, do not depend on the selected model. However, Audio Intelligence results are primarily affected by the quality of the transcript, so for best results use the most accurate model available for your use case.
