Direct-to-chip (D2C) cooling is emerging as a core solution for AI-driven heat loads, with further R&D moving toward fully integrated on-chip approaches as thermal management collapses inward: from room-level air systems to rack-level liquid, then to direct-to-chip cooling, and ultimately into the package and silicon itself.
AI hardware has already exceeded the practical limits of traditional air cooling, pushing thermal management closer to the heat source. In large-scale data centers today, this takes the form of D2C liquid cooling: metal "cold plates" mounted atop CPUs and GPUs that transfer heat through closed liquid loops. As advanced packaging increases heat density and localization, some companies are exploring cooling integrated at the package level or etched directly into silicon.
Why AI is breaking conventional cooling
Rack densities are rising rapidly from the typical 10 kW to 30 kW toward 100 kW, 200 kW and beyond, with projections reaching 600 kW per rack. Conventional air-cooling systems like fans and chillers are insufficient at these densities, generally topping out at 20 kW to 30 kW per rack, even with advanced containment. Rear-door heat exchangers (RDHx) can only extend that range to about 50 kW to 80 kW, well below modern AI requirements.
Liquid cooling becomes necessary beyond 50 kW to 100 kW to maintain reliability, avoid thermal throttling and reduce overall power consumption. Among the available methods, D2C leads due to its compatibility with existing server architectures and ability to scale rack density, often removing an estimated 70% to 80% of rack-level heat.
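To make the arithmetic concrete, the sketch below splits a rack's heat load between the liquid loop and the room air, using the 70% to 80% capture range and the 20 kW to 30 kW air-cooling ceiling cited above. The function name, the 75% midpoint and the 30 kW air limit are illustrative assumptions, not vendor figures.

```python
def split_rack_heat(rack_kw: float, liquid_fraction: float = 0.75) -> tuple[float, float]:
    """Return (liquid_kw, air_kw) for a rack with direct-to-chip cooling."""
    if not 0.0 <= liquid_fraction <= 1.0:
        raise ValueError("liquid_fraction must be between 0 and 1")
    liquid_kw = rack_kw * liquid_fraction
    return liquid_kw, rack_kw - liquid_kw


# Assumed: upper end of the air-cooling range quoted in the article.
AIR_LIMIT_KW = 30.0

for rack_kw in (100.0, 200.0, 600.0):
    liquid_kw, air_kw = split_rack_heat(rack_kw)
    status = "within" if air_kw <= AIR_LIMIT_KW else "above"
    print(f"{rack_kw:.0f} kW rack: {liquid_kw:.0f} kW to liquid, "
          f"{air_kw:.0f} kW to air ({status} the assumed air limit)")
```

Under these assumptions, a 100 kW rack leaves about 25 kW for air, while a 200 kW rack leaves about 50 kW. That suggests why higher densities tend to pair D2C with supplemental air-side heat capture.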
D2C captured over 42% of the data center liquid cooling market last year, according to Mordor Intelligence. Other approaches remain in development, including immersion cooling for high-density deployments where submerging hardware improves heat transfer and efficiency. Hybrid air-liquid systems are also used to retrofit existing environments, often incorporating RDHx units that capture exhaust air heat before it re-enters the data center.
At the component level, power density is increasing just as quickly. High-end AI accelerators, particularly NVIDIA Blackwell (B200) and Blackwell Ultra (B300) GPUs, operate in the 1,000 W to 1,400 W range per device, making D2C liquid cooling increasingly necessary.
This trend is reflected across the industry. "In 2017, a high‑end GPU contained roughly 21 billion transistors and operated at about 300 watts. By 2025, the transistor count has increased to over 100 billion transistors and 1,000 watts per package," said Rajiv Mongia, chief thermal architect at Intel.
Industry roadmaps indicate this trajectory will continue, with "package power moving into the 3 to 5 kilowatt range and transistor counts approaching 1 trillion by around 2030," Mongia said, adding that these scaling dynamics are fundamentally reshaping how Intel's engineers think about processor, package and thermal design.
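A quick way to see what that roadmap implies is to compare compound annual growth rates. The sketch below uses only the figures Mongia quotes; treating "around 2030" as exactly 2030 and the helper name `annual_growth` are assumptions made for illustration.

```python
def annual_growth(start_w: float, end_w: float, years: int) -> float:
    """Compound annual growth rate between two power levels."""
    return (end_w / start_w) ** (1.0 / years) - 1.0


# Quoted figures: about 300 W (2017), about 1,000 W (2025), 3 to 5 kW (around 2030).
historic = annual_growth(300.0, 1000.0, 2025 - 2017)
low = annual_growth(1000.0, 3000.0, 2030 - 2025)   # assumes 3 kW by 2030
high = annual_growth(1000.0, 5000.0, 2030 - 2025)  # assumes 5 kW by 2030

print(f"2017-2025: {historic:.1%} per year")
print(f"2025-2030 (3 kW): {low:.1%} per year")
print(f"2025-2030 (5 kW): {high:.1%} per year")
```

On these numbers, package power grew roughly 16% a year from 2017 to 2025, and reaching 3 kW to 5 kW by 2030 would require roughly 25% to 38% a year, so the roadmap implies an acceleration rather than a continuation of the past trend.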
An example of Microsoft's microfluidic cooling technology. Source: Microsoft
D2C as the industry's immediate response
D2C systems target heat at the component level. Cold plates mounted on GPUs, CPUs and sometimes memory modules contain microchannels through which coolant flows, absorbing heat before circulating through rack manifolds and coolant distribution units (CDUs) in a closed loop. This approach allows more precise thermal control than room- or rack-level cooling, particularly as chips develop localized hotspots.
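The coolant flow a cold plate needs follows from the basic heat balance Q = m_dot * cp * dT. The sketch below applies it to plain water; the 10 K coolant temperature rise, the function name and the rounded water properties are illustrative assumptions, and real loops using glycol mixes such as PG25 would need somewhat more flow because their specific heat is lower.

```python
WATER_CP_J_PER_KG_K = 4186.0   # specific heat of liquid water, approximate
WATER_DENSITY_KG_PER_L = 1.0   # approximate, near room temperature


def coolant_flow_lpm(heat_w: float, delta_t_k: float,
                     cp: float = WATER_CP_J_PER_KG_K,
                     density: float = WATER_DENSITY_KG_PER_L) -> float:
    """Flow in litres per minute needed to absorb heat_w with a delta_t_k coolant rise."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = heat_w / (cp * delta_t_k)  # from Q = m_dot * cp * dT
    return mass_flow_kg_s / density * 60.0


# Chip powers taken from the article; the 10 K rise is an assumed design point.
for chip_w in (1000.0, 1400.0, 4000.0):
    print(f"{chip_w:.0f} W at 10 K rise: {coolant_flow_lpm(chip_w, 10.0):.2f} L/min")
```

Because flow scales linearly with power at a fixed temperature rise, a 4,000 W device needs four times the flow of a 1,000 W device unless the loop tolerates a larger rise.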
Suppliers are scaling D2C solutions to match increasing heat density. CoolIT Systems, for example, designs cold plates, heat exchangers and CDUs for high-power processors, with deployments across millions of processors and AI accelerators, including seven supercomputers. It manufactures tens of thousands of server cold plate loops every month, and this year it's ramping up production across Canada, Vietnam and China.
The company has expanded into 4,000 W-class AI cold plates and infrastructure supporting AI server racks of up to 500 kW. Its latest 4,000 W cold plate design shows more than 97% heat removal at industry-aligned flow rates for high-power chips. CoolIT's Split-Flow architecture directs coolant through microchannels toward hotspot regions of the chip, which the company says reduces pressure drop and energy use. By shortening flow paths and distributing coolant more uniformly, the design delivers 30% better thermal and flow performance than conventional cold plates, according to CoolIT.
That level of precision is becoming mandatory. "At a certain point, these chips have such dense microarchitectures that they cannot operate effectively unless they're [liquid] cooled," said CoolIT director of marketing Charles Robison. "With the NVIDIA current generation, we're moving to the point where liquid cooling cannot be avoided. From the GB300 onward, and based on what's in the NVIDIA roadmap, there's no possibility that these units can be air cooled anymore for them to operate."
Steve Madara, vice president of thermal engineering at Ohio-based Vertiv, said that when cooling new AI workloads, the biggest technical hurdle "isn't just removing heat [but] doing it reliably at scale," given the heightened reliability requirements of the "technical fluid loop," the network of pipes, manifolds, pumps and heat exchangers that supports modern data center cooling. "That includes managing extreme thermal loads from dense GPU clusters, maintaining stability during load transitions or maintenance, and [ensuring] everything from power to cooling and controls operates as one coordinated system," Madara added.
Vertiv said it differentiates by delivering fully integrated thermal and power systems rather than standalone cooling components, emphasizing faster deployment through prefabricated modules and improved reliability through built-in redundancy and thermal backup systems. The company's CoolChip CDUs are used in reference architectures for today's AI systems, supporting liquid-to-liquid heat exchange from 100 kW to 1,350 kW or liquid-to-air up to 70 kW.
The company said it has added significant manufacturing capacity and introduced a full line of coolant distribution units and fluid distribution systems, including SmartRun and MegaMod HDX, which pair D2C liquid cooling with air systems in hybrid, prefabricated architectures. Madara said these combined systems allow customers to "deploy high-density AI environments faster and scale from single pods to multi-megawatt deployments, while keeping power, cooling, and monitoring tightly integrated."
Vertiv’s CoolChip liquid-to-liquid CDUs. Source: Vertiv
Challenges: Where D2C starts to strain
D2C remains effective but still has limits.
Intel's Mongia notes that while placing cold plates directly on GPUs and AI accelerators has become standard industry practice, today's main challenges are "optimizing these solutions for the specific characteristics of each package — such as power density, hotspot locations, and workload behavior — and taking a more holistic view across the full stack, from silicon through system and facility infrastructure."
At Intel Foundry, that systems‑level view now includes integrating D2C cooling solutions into the package itself to optimize thermal performance. The group has developed an integrated cold plate design that incorporates microfluidic features within the package lid, with specifics depending on customer design requirements and product constraints. In suitable use cases, it can improve heat dissipation by about 20% to 25%, but Mongia said broader adoption depends on factors like "product architecture, manufacturing complexity, reliability requirements, and overall cost‑benefit at the system level."
Fluid limitations are also emerging. While most chips rely on single-phase fluids such as water or PG25 for cooling, Vertiv’s Madara notes that chip density is expected to "increase to the point that a single-phase fluid can't remove all the heat effectively."
However, he added that the timeline remains uncertain, particularly as silicon and cold plate technologies continue to evolve. In the meantime, Madara said the industry is already researching and testing two-phase fluids "as the next means to remove the heat from more dense and high-powered chips."
Robison argues that while two-phase cooling is exciting from a physics standpoint, it becomes difficult to deploy at the system level. "From a systems engineering perspective, it makes that dream a nightmare," Robison said, because systems must be designed around specialized refrigerants, which risks locking customers into single-vendor ecosystems.
CoolIT is ramping up its production to meet demand for its D2C cooling. Source: CoolIT
"Single-phase direct liquid cooling can handle all the heat; not just the processors, but the peripherals," Robison said, noting that it also enables broader interoperability and coolant flexibility.
He frames cooling as a layered problem that starts at the processor, where thermal limits, power density and hotspots are most acute, and extends through the server, rack and data center. At each level, the challenge shifts from managing localized heat on the chip to capturing it at the system level, a progression that single-phase systems are designed to handle across the full stack.
On-chip cooling: Silicon-level integration
As AI workloads scale further, cooling can extend beyond cold plates and into the chip itself. Research is exploring microfluidic channels embedded directly into silicon or package structures, bringing coolant closer to heat sources than traditional cold plates, while reducing thermal resistance.
Microsoft's microfluidic cooling technology debuted in 2025. Source: Microsoft
In a major on-chip breakthrough in September 2025, Microsoft demonstrated microfluidic cooling channels etched into silicon, allowing coolant to flow closer to chip hotspots. In lab tests, the approach removed heat up to three times better than cold plates, depending on workload and configuration. It also reduced the maximum temperature rise of the silicon by up to 65%, depending on the type of chip.
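One way to read the 65% figure is through a simple lumped thermal-resistance model, in which the silicon's temperature rise above the coolant equals chip power times thermal resistance. The sketch below assumes the 65% reduction applies to that temperature rise; the coolant temperature, chip power and baseline resistance are made-up illustrative values, not Microsoft's test conditions.

```python
def junction_temp_c(coolant_c: float, power_w: float, r_th_k_per_w: float) -> float:
    """Lumped model: T_junction = T_coolant + P * R_th."""
    return coolant_c + power_w * r_th_k_per_w


COOLANT_C = 30.0       # assumed coolant inlet temperature, deg C
POWER_W = 1000.0       # assumed chip power, W
R_COLD_PLATE = 0.05    # assumed baseline thermal resistance, K/W
RISE_REDUCTION = 0.65  # the "up to 65%" figure, read as a cut in temperature rise

baseline = junction_temp_c(COOLANT_C, POWER_W, R_COLD_PLATE)
improved = junction_temp_c(COOLANT_C, POWER_W, R_COLD_PLATE * (1.0 - RISE_REDUCTION))

print(f"Cold plate (assumed):   {baseline:.1f} C")
print(f"With 65% smaller rise:  {improved:.1f} C")
```

With these assumed inputs the rise above the coolant shrinks from 50 K to 17.5 K, so the silicon runs at 47.5 C instead of 80 C. The extra headroom could be spent on higher power, warmer coolant or both.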
Similar work is underway through the U.S. Department of Energy’s COOLERCHIPS program, including projects led by HP and NVIDIA exploring more integrated cooling architectures. HP is developing embedded microfluidic cooling to reduce thermal interfaces at the package level, while NVIDIA is designing a hybrid system that combines D2C, two-phase heat transfer and immersion cooling within a single rack-level architecture.
Microfluidic cooling on a server. Source: Microsoft
Intel Foundry’s latest work in microfluidic designs sits between D2C and those fully integrated on-chip approaches, as the company looks to diversify cooling strategies.
"There is rarely a one-size-fits-all solution when it comes to thermal management, so Intel continues to invest across multiple cooling approaches," Mongia said. "For AI and GPU products specifically, our priorities include higher-performance thermal interface materials, advanced cold plates, both integrated and non-integrated, and other chip-level cooling techniques."
How data center advancements can reach everyday computing
Improvements in thermal efficiency at the data center level tend to propagate across the broader computing ecosystem. As cooling becomes more efficient and power density increases, these same innovations gradually influence edge systems, PCs and mobile devices.
For now, D2C remains the primary solution for high-density AI deployments, balancing performance, scalability and compatibility with existing infrastructure. However, as chip power reaches the multi-kilowatt range, effective thermal management will require integration at the package and silicon level.
