Recently, Deepgram expanded its product line with the launch of the Voice Agent API, a voice-to-voice platform designed to help developers build real-time conversational agents using both Deepgram’s tools and third-party AI models.
Earlier this year, Deepgram also launched Nova-3 Medical, an AI speech-to-text model designed for healthcare transcription. These releases reflect the company’s ongoing efforts to make speech technology more accessible and adaptable across industries.
What is Deepgram?
Deepgram is a voice AI company founded in 2015 and headquartered in San Francisco, California. It offers a suite of APIs for speech-to-text (STT), text-to-speech (TTS), and speech-to-speech processing, supporting real-time transcription, voice synthesis, and interactive audio capabilities across various applications such as customer service, media production, and enterprise analytics.
The company has raised over $85.9 million across seven funding rounds, with support from investors such as Y Combinator, NVIDIA, Tiger Global Management, In-Q-Tel, and Madrona. Its latest Series B round in November 2022 secured $47 million, placing the company’s estimated valuation between $202 million and $333 million.
Deepgram’s technology is currently used by organizations such as NASA, Twilio, Spotify, and Citi. Following our previous features on speech innovation, such as Meta’s SeamlessM4T and Baidu’s translation technology, we now take a look at Deepgram’s patent portfolio and its role in the evolving landscape of voice AI.
Deepgram: Patenting Activity
Deepgram currently has 10 active patent filings in its portfolio, a mix of granted patents and published applications. The filings span from 2018 to 2024, with recent innovations focused on hardware efficiency, transformer-based architectures, and domain-specific language modeling.
Patent Number | Title | Priority Date | Filing Date
US10847138 | Deep learning internal state index-based search and classification | 2018-07-27 | 2019-05-21 |
US11367433 | End-to-end neural networks for speech recognition and classification | 2018-07-27 | 2020-05-29 |
US11676579 | Deep learning internal state index-based search and classification | 2018-07-27 | 2020-10-16 |
US10210860 | Augmented generalized deep learning with special vocabulary | 2018-07-27 | 2018-08-22 |
US10720151 | End-to-end neural networks for speech recognition and classification | 2018-07-27 | 2018-08-22 |
US10380997 | Deep learning internal state index-based search and classification | 2018-07-27 | 2018-08-22 |
US20240331685 | End-to-end automatic speech recognition with transformer | 2023-04-03 | 2023-04-03 |
US10540959 | Augmented generalized deep learning with special vocabulary | 2018-07-27 | 2018-12-26 |
US20230317062 | Deep learning internal state index-based search and classification | 2018-07-27 | 2023-06-12 |
US20240127819 | Hardware efficient automatic speech recognition | 2022-10-14 | 2022-10-14 |
Deepgram: Top Technology Areas
Deepgram’s patent filings reflect the company’s strong focus on AI-driven speech and multimodal technologies. The leading classification is G10L, which covers innovations in speech recognition and signal processing. The presence of the G06N and G06F categories also underscores the company’s use of neural networks and machine learning to enhance data interpretation and processing. Additionally, filings under G06V and G06K indicate capabilities in image recognition and graphical data reading, suggesting a move toward integrated voice and visual AI systems. Together, these classifications highlight Deepgram’s commitment to advancing voice AI through deep learning and signal innovation.

Into Deepgram’s Patents
To better understand Deepgram’s technical direction and contributions to the field of voice AI, it is helpful to examine its recent patent activity. The following patents illustrate Deepgram’s efforts to address key challenges in speech processing, from adapting models for specialized vocabulary to optimizing system efficiency for large-scale deployment.
Custom speech models for different industries
Speech recognition models trained on general datasets often perform poorly when transcribing domain-specific language. Specialized terms from legal, medical, or financial vocabularies are typically underrepresented in mainstream training data, leading to frequent transcription errors in critical settings.

U.S. Patent No. 10,540,959 presents a method for customizing a general speech recognition neural network to improve accuracy on domain-specific data without retraining the entire model.
Deepgram’s solution builds on an existing speech model rather than starting from scratch. After training the model on general spoken content, a domain-specific vocabulary is identified. These specialized terms are then used to generate new training examples, creating an augmented dataset. The model is refined using this data, making it more capable of understanding the specialized terms. When it receives new audio, the updated model produces more accurate transcripts in professional or technical contexts.
By allowing a general model to be adapted for niche applications, this method supports greater flexibility while saving time and resources. It’s especially valuable for organizations that need speech recognition tailored to specific industries.
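To make the augmentation step more concrete, here is a minimal sketch, in Python, of how specialized terms might be folded into synthetic training transcripts. It is an illustration only, not Deepgram’s implementation: the medical terms, carrier templates, and function names are hypothetical, and in the pipeline described by the ’959 patent these examples would be paired with audio and used to refine the existing general model.

```python
import random

# Hypothetical illustration of the data augmentation step: insert
# domain-specific terms into generic carrier sentences to create
# synthetic transcripts that over-represent the specialized vocabulary.

DOMAIN_TERMS = ["myocardial infarction", "tachycardia", "metoprolol"]  # assumed medical vocabulary

CARRIER_TEMPLATES = [  # assumed sentence templates
    "the patient presented with {term}",
    "please note {term} in the chart",
    "history is significant for {term}",
]

def build_augmented_examples(terms, templates, n_examples=50, seed=0):
    """Generate synthetic transcripts containing the specialized terms."""
    rng = random.Random(seed)
    return [rng.choice(templates).format(term=rng.choice(terms))
            for _ in range(n_examples)]

if __name__ == "__main__":
    for line in build_augmented_examples(DOMAIN_TERMS, CARRIER_TEMPLATES, n_examples=5):
        print(line)
    # The augmented examples would then be added to the original training
    # data before fine-tuning the general speech model.
```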
The patent, titled “Augmented generalized deep learning with special vocabulary”, was filed on December 26, 2018 and granted on January 21, 2020. The inventors are Jeff Ward, Adam Sypniewski, and Scott Stephenson.
Smarter processing for speech recognition requests
Traditional speech recognition systems are often made up of separate parts that can be hard to train and slow to use. This becomes a bigger issue in settings like call centers or streaming platforms, where large volumes of audio need to be transcribed quickly and accurately.

U.S. Pat. App. No. 2024/0331685 addresses this by managing high numbers of transcription requests without overloading the system or causing delays. When the system receives transcription requests from different users, it identifies which ones require similar processing. It then groups these requests into batches and processes them together on shared hardware, such as GPUs.
To avoid duplication, the system loads only the models it needs and distributes them intelligently across the available computing units. This approach streamlines performance, minimizes resource waste, and reduces latency, especially when many users are accessing the service at once.
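The grouping idea can be pictured with a short sketch. The Python below is only an approximation of the batching described in the ’685 application: the request fields, model names, and batch size are assumptions, and a real scheduler would involve far more machinery (model caching, queueing, and GPU load balancing).

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class TranscriptionRequest:
    request_id: str
    model_name: str   # illustrative model identifiers, e.g. "general-en", "medical-en"
    audio_path: str

def group_into_batches(requests, max_batch_size=8):
    """Group requests that need the same model so each batch can share hardware."""
    by_model = defaultdict(list)
    for req in requests:
        by_model[req.model_name].append(req)

    batches = []
    for model_name, reqs in by_model.items():
        for i in range(0, len(reqs), max_batch_size):
            batches.append((model_name, reqs[i:i + max_batch_size]))
    return batches

if __name__ == "__main__":
    incoming = [
        TranscriptionRequest("r1", "general-en", "call_01.wav"),
        TranscriptionRequest("r2", "medical-en", "visit_07.wav"),
        TranscriptionRequest("r3", "general-en", "call_02.wav"),
    ]
    for model_name, batch in group_into_batches(incoming, max_batch_size=2):
        # In a real deployment, each model would be loaded once (or cached)
        # and the batch dispatched to an available GPU.
        print(model_name, [r.request_id for r in batch])
```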
The patent application, titled “End-to-end automatic speech recognition with transformer”, was filed on April 3, 2023 and published on October 23, 2024. The inventors are Andrew Nathan Seagraves, Deepak Subburam, Adam Joseph Sypniewski, Scott Ivan Stephenson, Jacob Edward Cutter, Michael Joseph Sypniewski, and Daniel Lewis Shafer.
Chunk-based approach to speech recognition

While the ‘685 patent application batches similar transcription requests and processes them together on shared hardware, U.S. Pat. App. No. 2024/0127819 breaks audio into smaller parts that are easier to manage.
Once an audio request is received, the system divides the file into several chunks. These are then transcribed separately, either sequentially or in parallel, and combined to produce the full transcript. This chunking strategy allows the system to begin processing more quickly and to manage large recordings or sudden spikes in demand more efficiently.
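A minimal sketch of this chunk-and-merge flow is shown below, in Python. The chunk length, the placeholder transcribe_chunk function, and the thread-based parallelism are illustrative assumptions rather than details taken from the application.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SECONDS = 30  # assumed chunk length

def split_into_chunks(duration_seconds, chunk_seconds=CHUNK_SECONDS):
    """Return (start, end) offsets that cover the full recording."""
    chunks, start = [], 0
    while start < duration_seconds:
        end = min(start + chunk_seconds, duration_seconds)
        chunks.append((start, end))
        start = end
    return chunks

def transcribe_chunk(chunk):
    # Placeholder for a real speech-to-text call on this slice of audio.
    start, end = chunk
    return f"[transcript of {start}s-{end}s]"

def transcribe_recording(duration_seconds):
    chunks = split_into_chunks(duration_seconds)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() preserves chunk order, so the partial transcripts can simply be joined.
        partials = list(pool.map(transcribe_chunk, chunks))
    return " ".join(partials)

if __name__ == "__main__":
    print(transcribe_recording(95))  # a 95-second recording becomes four chunks
```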
The patent application, titled “Hardware efficient automatic speech recognition”, was filed on October 14, 2022 and was published on April 18, 2024. The inventors are Andrew Nathan Seagraves, Deepak Subburam, Adam Joseph Sypniewski, Scott Ivan Stephenson, Jacob Edward Cutter, Michael Joseph Sypniewski, and Daniel Lewis Shafer.
Deepgram’s Top Representative
Deepgram’s IP strategy reflects a consistent legal approach through an exclusive partnership with Cognition IP, a U.S. law firm co-founded by Bryant Lee. All of Deepgram’s published patents and applications list attorneys from the firm, including Edward Steakley, Justin White, Rajesh Fotedar, Saleh Kaihani, and Schiller Hill. These individuals help manage Deepgram’s filings across its core technologies.