
NVIDIA unveils new personal AI supercomputers

DGX Spark and DGX Station Personal AI Computers
March 27, 2025

At the recently concluded GTC 2025, NVIDIA officially unveiled its personal AI supercomputer product line, including the “world’s smallest AI supercomputer,” DGX Spark, alongside the DGX Station, a desktop system capable of data center-level performance.

These desktop AI systems are designed to further democratize AI development, bringing petaflop-level performance (one quadrillion calculations per second) to individual users in a compact form and making high-speed computing more accessible.

The DGX Spark, formerly known as Project DIGITS, is powered by the GB10 Grace Blackwell Superchip. This system pairs a 20-core Arm-based Grace CPU (NVIDIA’s first data center CPU) with a Blackwell GPU, connected via NVIDIA’s high-speed NVLink-C2C interconnect. It supports 128 GB of unified memory and up to 4 TB of NVMe storage, enabling AI models with up to 200 billion parameters to run on a single unit. Designed in collaboration with MediaTek, the DGX Spark emphasizes power efficiency alongside high performance.

Complementing the DGX Spark, NVIDIA also introduced the latest iteration of the DGX Station, a more powerful desktop AI supercomputer built around the GB300 Grace Blackwell Ultra Desktop Superchip. The system delivers up to 20 petaflops of AI performance and 784 GB of unified system memory, and its NVIDIA ConnectX-8 SuperNIC supports accelerated networking at speeds of up to 800 Gb/s, allowing multiple DGX Stations to be linked for larger workloads. NVIDIA’s NVLink-C2C interconnect provides coherent memory sharing between the Grace CPU and the Blackwell Ultra GPU, creating a large, unified memory space optimized for a variety of AI applications, including multistep reasoning and agentic AI.

 

NVIDIA’s patent portfolio

With all the excitement surrounding these newly announced AI hardware technologies, we took a closer look at NVIDIA’s patent filings to examine the patents powering their AI supercomputers and hardware architectures.

Interestingly, the law firms Nixon & Vanderhye, P.C. and Davis Wright Tremaine LLP frequently appear as the designated attorney, agent, or firm on the patent publications covering various aspects of NVIDIA’s new releases. Their work plays a crucial role in securing intellectual property protection for these innovations.

 

Scalable and coherent AI memory

Even with powerful multi-GPU setups, modern AI workloads often require massive memory capacity and high bandwidth to keep up with increasingly large-scale AI applications and user demands. Traditional high-bandwidth memory attached to GPUs offers low latency but limited capacity, while system memory and non-volatile storage (e.g., NVMe SSDs) provide larger capacity but can suffer from bandwidth limitations and inefficiencies when accessed by GPUs. Expanding GPU memory by adding more GPUs is costly and power-intensive, creating a need for a scalable, high-performance memory solution that can be decoupled from GPU compute resources.


U.S. Patent No. 11,822,491 (‘491 patent), titled “Techniques for an efficient fabric attached memory,” discloses a Fabric Attached Memory (FAM) system that expands GPU memory capacity independently of compute resources by using high-speed interconnects like NVIDIA NVLink and NVSwitch. The ‘491 patent describes how GPUs can treat FAM as an extension of their own memory, significantly increasing capacity (up to hundreds of TBs), bidirectional bandwidth (multiple TB/s), and power efficiency without requiring additional full-featured GPUs. Extensions to CUDA memory management APIs (or other APIs) can enable direct access to FAM as peer GPU memory, while Unified Virtual Memory (UVM) and page migration provide seamless, on-demand data movement between local GPU memory and FAM, ensuring compatibility and performance optimization for memory-intensive workloads.
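The patent’s FAM-specific extensions to the CUDA memory APIs are not publicly documented, but the UVM-style on-demand paging they build on shows up in everyday GPU code. As a loose illustration only, here is a minimal CuPy sketch (the sizes and degree of oversubscription are our own choices, and oversubscription requires a platform where CUDA unified memory supports it, such as Linux) in which allocations are routed through CUDA managed memory, so an array larger than free GPU memory can still be allocated and paged in on demand:

```python
# Minimal sketch of UVM-style on-demand migration using CuPy managed
# memory. This illustrates the unified-memory mechanism the '491 patent
# builds on, not the patent's FAM-specific API extensions.
import cupy as cp

# Route all CuPy allocations through cudaMallocManaged (unified memory),
# letting the CUDA driver page data between host and device on demand.
cp.cuda.set_allocator(cp.cuda.malloc_managed)

free_bytes, total_bytes = cp.cuda.Device(0).mem_info
print(f"GPU free/total: {free_bytes >> 30} / {total_bytes >> 30} GiB")

# With managed memory, this allocation may exceed free GPU memory;
# pages migrate to the GPU only as kernels touch them.
n = free_bytes // 8 + (1 << 20)        # slightly oversubscribe (float64)
x = cp.ones(n, dtype=cp.float64)

# The reduction touches every page, triggering on-demand migration.
print(float(x.sum()) == n)
```

In the ‘491 patent’s scheme, the “far” memory tier is fabric-attached memory reached over NVLink/NVSwitch rather than host DRAM, but the page-migration programming model is the same.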

The ‘491 patent also discloses capabilities for interconnecting with CPUs that support cache coherence and unified memory architectures, much like the NVIDIA Grace CPU with its Scalable Coherency Fabric (SCF). The technology makes it easy to envision highly efficient, high-throughput, high-bandwidth, low-latency multi-GPU setups built on the Grace Blackwell architecture, whether in the data center or on the desktop.

The ‘491 patent was filed on October 20, 2021, and was granted on November 21, 2023. Nixon & Vanderhye, P.C. represented NVIDIA in the patent filing. The listed inventors are John Feehrer, Denis Foley, Mark Hummel, Vyas Venkataraman, Ram Gummadi, Samuel H. Duncan, Glenn Dearth, and Brian Kelleher.

 

Secure AI acceleration and confidential computing

As GPUs become increasingly powerful and virtualization enables multi-tenant environments, securing shared computing resources, including parallel processing units (PPUs) such as GPUs along with the CPUs that host them, presents a significant challenge.

Traditional security mechanisms struggle to protect sensitive data and execution integrity when GPUs are accessed by virtual machines (VMs), particularly in environments where hypervisors or physical attackers could intercept data. The key challenge is securing GPU execution without sacrificing performance, especially in cloud computing and AI applications.

Alongside this challenge, confidential computing techniques have been developed to protect data while in use, typically by executing workloads within a CPU-based Trusted Execution Environment (TEE). However, modern applications—such as those leveraging machine learning (ML) and artificial intelligence (AI)—require GPU acceleration to meet performance demands. Conventional confidential computing approaches lack mechanisms to extend security protections to PPUs, forcing organizations to choose between security and computational efficiency. This limitation is especially problematic for industries handling large amounts of sensitive data, such as healthcare and finance, where both confidentiality and high-performance computing are essential.

 

Enhancing GPU security


U.S. Patent No. 12,141,268 (‘268 patent), titled “Secure execution for multiple processor devices using trusted executing environments,” introduces a TEE that extends security to GPUs operating in virtualized environments. It establishes a protected memory region within the GPU to prevent unauthorized memory access from compute engines or external devices. Additionally, a secure microcontroller within the GPU acts as a root of trust, facilitating encryption key negotiation with the VM to protect data transfers. The system employs cryptographic techniques to secure direct memory access (DMA) between CPUs and GPUs, utilizing encrypted bounce buffers to prevent data exposure. These mechanisms enable GPUs to execute workloads securely in multi-tenant data centers without compromising confidentiality or execution integrity.
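NVIDIA’s hardware implementation cannot, of course, be reduced to a few lines, but the encrypted bounce-buffer pattern itself is easy to sketch. In the toy Python below (using the cryptography package; the function names are hypothetical, and the GPU’s secure microcontroller is modeled simply as an endpoint holding a key negotiated with the VM), an observer of the shared staging buffer sees only ciphertext, and tampering is detected on decryption:

```python
# Toy sketch of the encrypted bounce-buffer idea in the '268 patent, not
# NVIDIA's implementation. The VM encrypts data into a staging ("bounce")
# buffer that untrusted software could observe; only the GPU-side
# endpoint, holding the key negotiated with the VM, can decrypt it into
# the protected memory region. All names here are hypothetical.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Key negotiation between the VM and the GPU's secure microcontroller is
# modeled as a pre-shared AES-GCM session key.
session_key = AESGCM.generate_key(bit_length=256)

def vm_write_bounce_buffer(payload: bytes) -> bytes:
    """VM side: encrypt and authenticate before DMA to the bounce buffer."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(session_key).encrypt(nonce, payload, b"dma-v1")
    return nonce + ciphertext      # all that a hypervisor could observe

def gpu_read_bounce_buffer(buffer: bytes) -> bytes:
    """GPU side: decrypt into protected memory; a tampered buffer raises
    InvalidTag rather than silently yielding corrupted plaintext."""
    nonce, ciphertext = buffer[:12], buffer[12:]
    return AESGCM(session_key).decrypt(nonce, ciphertext, b"dma-v1")

bounce = vm_write_bounce_buffer(b"model weights / activations")
assert b"weights" not in bounce    # observer sees only ciphertext
assert gpu_read_bounce_buffer(bounce) == b"model weights / activations"
```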

 

Confidential computing


U.S. Pat. App. Pub. No. 2023/0297696 (‘696 patent application), titled “Confidential computing using parallel processors with code and data protection,” further advances confidential computing by integrating GPUs into a TEE established by CPUs. This approach ensures that virtual machines (VMs) with GPU acceleration can execute securely while preventing unauthorized access from hypervisors, privileged software, or external threats. A key innovation is the Compute Protected Region (CPR) in GPU memory, which restricts access to sensitive data, ensuring that only designated compute engines can interact with protected workloads. Additionally, encrypted communication channels (e.g., NVLink/NVSwitch) between the CPU and GPU secure data in transit, while secure processors within the GPU handle cryptographic operations and attest to the integrity of the TEE.

These innovations enable secure, high-performance GPU-accelerated computing in virtualized environments without sacrificing confidentiality or execution integrity. This is especially relevant for AI applications across various industries, where concerns over code and data security, privacy, and performance frequently arise.

Secure AI capabilities, such as those in NVIDIA’s Blackwell architecture, are designed to protect AI models and customer data without compromising performance. Blackwell’s native interface encryption enhances privacy-sensitive applications in industries like healthcare and finance. By enabling GPUs to operate within confidential computing environments, the innovations described in the ‘268 patent and the ‘696 patent application provide a foundation for secure AI-driven workloads, ensuring that organizations can leverage GPU acceleration while maintaining strict security guarantees.

The ‘268 patent was filed on September 24, 2021, and was granted on November 12, 2024. Davis Wright Tremaine LLP was the prosecuting law firm. The patent lists the following inventors: Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, and Sudeshna Guha.

The ‘696 patent application was filed on March 17, 2023. The patent application lists the following inventors: Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, Mark Hairgrove, and Michael Woodmansee, several of whom are also named on the now-issued ‘268 patent.

 

Accurate and energy-efficient low-precision computing

Deep learning inference workloads demand high computational performance, but existing GPU architectures struggle to balance accuracy and energy efficiency. Lower-precision formats like INT8 enhance efficiency but sacrifice accuracy, while floating-point formats such as FP8 improve accuracy at the cost of higher energy consumption. Additionally, conventional quantization techniques, which map floating-point data to lower-bit integer values, often degrade accuracy—especially when scale factors are applied too broadly across tensor dimensions. As AI models grow in complexity, achieving an optimal tradeoff between computational precision and energy efficiency remains a critical challenge in GPU-accelerated deep learning.


U.S. Pat. App. Pub. No. 2024/0160406 (‘406 patent application), titled “Low-precision floating-point datapath in a computer processor,” addresses this challenge by introducing microarchitectural enhancements to GPU data paths that improve energy efficiency while preserving accuracy in deep learning inference workloads. The invention features energy-efficient floating-point data paths that utilize integer accumulation and per-vector scaled quantization (VS-Quant). Key innovations include specialized hardware mechanisms that perform low-precision floating-point multiplications while accumulating results in integer format, supporting formats such as FP4, FP8, and VS-Quant LOG4. Unlike traditional quantization approaches, the VS-Quant method applies per-vector scale factors at a fine granularity, reducing quantization-related accuracy loss without excessive computational overhead. Additionally, exponent biasing techniques optimize floating-point representation efficiency, enabling high-speed computations using fixed-point hardware.

These innovations allow GPUs to strike a better energy-accuracy tradeoff, reducing power consumption without significantly degrading model precision and thereby improving the efficiency of AI inference workloads. This aligns with NVIDIA’s ongoing push for more efficient AI acceleration using lower-precision formats, as demonstrated by the Blackwell architecture, which achieves petaflop-level performance at FP4 precision. Notably, the ‘406 patent application builds upon concepts introduced in NVIDIA’s original VS-Quant paper from 2021, further refining hardware implementations to support lower-bit quantization with minimal accuracy tradeoff. Several of the paper’s original authors are also listed as inventors on the ‘406 patent application.
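As a rough numerical illustration of why fine-grained scale factors help, the NumPy sketch below compares per-tensor scaling against per-vector scaling at the same 4-bit integer width (this follows the publicly described VS-Quant idea rather than the patented hardware datapath; the vector length and data distribution are our own choices):

```python
# Sketch of per-vector scaled quantization (VS-Quant) vs. per-tensor
# scaling, following the published idea rather than the patented hardware
# datapath. The 16-element vectors and signed 4-bit range are illustrative.
import numpy as np

rng = np.random.default_rng(0)
# A tensor of small values with one large outlier, as is common in
# DNN weights and activations.
x = rng.normal(scale=0.05, size=(256, 16))
x[0, 0] = 4.0                                  # outlier

QMAX = 7                                       # signed 4-bit: [-8, 7]

def fake_quantize(t, scale):
    """Quantize to integers at the given scale, then dequantize."""
    q = np.clip(np.round(t / scale), -QMAX - 1, QMAX)
    return q * scale

# Per-tensor: a single scale; the outlier inflates it, crushing all the
# small values toward zero.
per_tensor = fake_quantize(x, np.abs(x).max() / QMAX)

# Per-vector: one scale per 16-element vector, so only the outlier's own
# vector pays for its large dynamic range.
scales = np.abs(x).max(axis=1, keepdims=True) / QMAX
per_vector = fake_quantize(x, scales)

print("per-tensor MSE:", np.mean((x - per_tensor) ** 2))
print("per-vector MSE:", np.mean((x - per_vector) ** 2))
# At the same bit width, the per-vector error is roughly two orders of
# magnitude lower on this example.
```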

The ‘406 patent application was filed on October 11, 2023, and is currently pending. The patent application lists the following inventors: Rangharajan Venkatesan, Reena Elangovan, Charbel Sakr, Brucek Kurdo Khailany, Ming Y. Siu, Ilyas Elkin, and Brent Ralph Boswell.

 

GPU-accelerated dedicated decompression engine

Decompressing large volumes of variable-length-coded data is traditionally performed sequentially on CPUs, a process that can be time-consuming due to inherent data dependencies and limited by processor speed, bandwidth, and resource constraints. This sequential approach poses significant inefficiencies for high-throughput workloads, such as GPU-accelerated computing and AI applications.


U.S. Pat. App. Pub. No. 2023/0229630 (‘630 patent application), titled “Processing variable-length data,” addresses this challenge by introducing a method for parallel decompression of variable-length-coded data streams on GPUs. This method employs speculative decoding to identify overlapping portions of the data stream, determining valid decoding start points and enabling parallel execution. By offloading decompression tasks from the CPU to the GPU, the system enhances performance and throughput. It supports various coding schemes, including Huffman coding and LZ77-based compression formats like Deflate, and incorporates optimizations such as fused GPU kernels to streamline decoding operations. This approach improves decompression efficiency, reduces data transfer bottlenecks, and allows GPUs to process compressed data more effectively in AI and high-performance computing environments.
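To make the speculative idea concrete, here is a toy CPU-side sketch in Python (our own simplification, using a three-symbol prefix code; the application targets Huffman- and Deflate-style streams decoded by GPU kernels). Every chunk of the bit stream is decoded from every plausible starting bit offset, work that can proceed in parallel, and a cheap sequential pass then stitches together the single chain of offsets that is self-consistent:

```python
# Toy sketch of speculative parallel decoding of variable-length-coded
# data, in the spirit of the '630 application; the simplified scheme and
# three-symbol code here are our own.

CODE = {"a": "0", "b": "10", "c": "11"}        # toy prefix-free code
DECODE = {v: k for k, v in CODE.items()}
MAX_LEN = max(len(c) for c in CODE.values())

def decode_span(bits, start, stop):
    """Decode every codeword starting in [start, stop), running past
    `stop` to finish a straddling codeword. Returns (symbols, end_pos),
    or None if `start` turns out not to be a codeword boundary."""
    out, pos = [], start
    while pos < stop:
        for width in range(1, MAX_LEN + 1):
            if bits[pos:pos + width] in DECODE:
                out.append(DECODE[bits[pos:pos + width]])
                pos += width
                break
        else:
            return None                        # speculation failed
    return out, pos

def speculative_decode(bits, chunk_bits=32):
    starts = range(0, len(bits), chunk_bits)
    # "Parallel" phase (a loop here, GPU threads in the application):
    # decode every (chunk, candidate start offset) pair independently.
    table = {(s, off): decode_span(bits, s + off, min(s + chunk_bits, len(bits)))
             for s in starts for off in range(MAX_LEN)}
    # Sequential stitch: chunk 0 starts at offset 0; each chunk's end
    # position selects the valid start offset of the next chunk.
    symbols, pos = [], 0
    for s in starts:
        syms, pos = table[(s, pos - s)]
        symbols.extend(syms)
    return "".join(symbols)

message = "abacabbccba" * 40
encoded = "".join(CODE[ch] for ch in message)
assert speculative_decode(encoded) == message
```

The symbol-by-symbol decoding, which dominates the cost, is what parallelizes across chunks and offsets; only the stitch, a quick walk from one chunk’s end position to the next chunk’s start offset, remains sequential.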

This innovation is in line with NVIDIA’s advancements in GPU architectures, particularly the Blackwell architecture, which features a dedicated “decompression engine” capable of achieving extremely high decompression throughput while supporting the latest compression formats, such as LZ4, Snappy, and Deflate. The decompression engine enables fused copy-decompress operations and allows decompression to overlap with compute tasks, minimizing latency and improving efficiency. Additionally, the Blackwell architecture’s ability to access massive amounts of memory over high-speed links accelerates the full pipeline of database queries, delivering exceptional performance in data analytics and data science. These developments highlight a significant shift toward integrating advanced decompression capabilities within GPU architectures in conjunction with NVIDIA’s own high-speed data compression and decompression library, nvCOMP, which is optimized for NVIDIA GPUs to meet the growing demands of data-intensive applications.

The ‘630 patent application was filed on January 19, 2022, and is currently pending. The patent application lists the following inventors: Eyal Soha, Elias Stehle, and Nikolay Sakharnykh.

In addition to these AI desktop supercomputers, NVIDIA has also unveiled its next-generation AI superchips, codenamed Vera Rubin, named after astrophysicist Vera Rubin. These new superchips promise up to 50 petaflops of FP4 performance, advanced memory integration, enhanced CPU-GPU synergy, and greater scalability, among other improvements. Another GPU architecture briefly teased at the conference was called Feynman, named after American theoretical physicist Richard Feynman. Both Vera Rubin and Feynman are set to be released in the coming years. A closer look at their technology from a patent perspective may be in store for us as more details emerge in the future.

