Intel Mask – Illustrative purposes only. (Image credit: Intel)

At GTC 2023, Nvidia announced cuLitho, a new software library that targets a critical bottleneck in the semiconductor manufacturing workflow. The library speeds up computational lithography, a technique used to create photomasks for chip production. Nvidia claims its approach enables 500 DGX H100 systems wielding 4,000 Hopper GPUs to do the same amount of work as 40,000 CPU-based servers, but 40X faster and with 9X less power. According to the company, this cuts the computational lithography workload for a photomask from several weeks to eight hours.
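To put those claims in proportion, here is a back-of-the-envelope sketch in Python that uses only the figures quoted above. The variable names are ours, and the implied CPU runtime assumes the 40X speedup applies to the same job that is said to take eight hours on GPUs, which Nvidia does not spell out.

```python
# Figures quoted in the article (Nvidia's claims, not independent measurements).
DGX_SYSTEMS = 500
HOPPER_GPUS = 4_000
CPU_SERVERS = 40_000
SPEEDUP = 40
GPU_RUN_HOURS = 8

# How the hardware counts relate to each other.
gpus_per_system = HOPPER_GPUS // DGX_SYSTEMS           # 8 GPUs per DGX H100
servers_per_system = CPU_SERVERS // DGX_SYSTEMS        # 80 CPU servers per DGX
servers_per_gpu = CPU_SERVERS // HOPPER_GPUS           # 10 CPU servers per GPU

# Assumption: the 40X speedup applies to the eight-hour GPU run.
implied_cpu_hours = GPU_RUN_HOURS * SPEEDUP            # 320 hours
implied_cpu_days = implied_cpu_hours / 24              # roughly 13.3 days

print(f"{gpus_per_system} GPUs per system, {servers_per_gpu} CPU servers per GPU")
print(f"Implied CPU runtime: {implied_cpu_hours} hours (~{implied_cpu_days:.1f} days)")
```

Under that assumption the CPU-based run works out to a little under two weeks, which is in the same range as the multi-week figure Nvidia cites.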

Chipmaking leaders TSMC, ASML, and Synopsys have all signed on for the new tech, with Synopsys already integrating it into its software design tools. Over time, Nvidia expects the new approach to enable higher chip density and yield, better design rules, and AI-powered lithography.

Nvidia scientists created new algorithms that allow increasingly complex computational lithography workflows to execute on GPUs in parallel, exhibiting a 40X speedup on Hopper GPUs. The algorithms are packaged in the new cuLitho acceleration library, which mask makers (typically foundries or chip designers) can integrate into their software. The library is also compatible with Ampere and Volta GPUs, though Hopper is the fastest solution.



Printing the small features on a chip starts with a piece of quartz called a photomask. This transparent quartz carries the pattern of a chip design and works much like a stencil: shining light through the mask projects the design onto the wafer, which is how the billions of 3D transistors and wire structures that comprise a modern chip are created. Each chip requires multiple exposures to build up its design in layers, so the number of photomasks used during the chipmaking process varies by chip and can even exceed 100. For instance, Nvidia says it takes 89 masks to create the H100, and Intel cites '50+' masks for its 14nm chips.

New techniques have emerged that now allow etching features smaller than the wavelength of the light used to create them. However, the continued shrinkage of the features has led to issues with diffraction, which essentially ‘blurs’ the design that’s being printed onto the silicon. The field of computational lithography counteracts the impact of diffraction through complex mathematical operations that optimize the mask layout. However, this task is becoming increasingly compute-intensive as features shrink even further, thus enabling billions more transistors per design.
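The idea can be illustrated with a deliberately simplified one-dimensional toy in Python. It is not how cuLitho or any production tool works: the Gaussian blur standing in for diffraction, the 0.5 print threshold, and every function name below are assumptions made for illustration. The sketch blurs a single mask line, checks how wide the line "prints," and then searches for a mask width whose printed result lands closest to the intended width.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights, a stand-in for optical blur (assumption)."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(mask, kernel):
    """Convolve the 0/1 mask with the kernel; outside the array counts as dark."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(mask)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(mask):
                acc += w * mask[j]
        out.append(acc)
    return out

def line_mask(length, width):
    """A single centered line of the given width on an otherwise dark mask."""
    start = (length - width) // 2
    return [1 if start <= i < start + width else 0 for i in range(length)]

def printed_width(mask, kernel, threshold=0.5):
    """Count pixels whose blurred intensity reaches the print threshold."""
    return sum(1 for v in blur(mask, kernel) if v >= threshold)

def best_mask_width(target, kernel, length=64):
    """Brute-force search for the mask width that prints closest to target."""
    return min(
        range(1, length + 1),
        key=lambda w: abs(printed_width(line_mask(length, w), kernel) - target),
    )

kernel = gaussian_kernel(sigma=6.0, radius=18)
target = 10
naive = printed_width(line_mask(64, target), kernel)
corrected = best_mask_width(target, kernel)
print(f"target {target}, naive mask prints {naive}")
print(f"corrected mask width {corrected} prints "
      f"{printed_width(line_mask(64, corrected), kernel)}")
```

Even in this toy, a line drawn at its intended width prints too narrow, and the mask has to be drawn wider to compensate. Real mask optimization has to solve a far harder version of this across an entire chip layout in two dimensions, which is where the compute cost comes from.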

These complex problems require large clusters of computers, often numbering tens of thousands of servers (Nvidia cites 40,000), that crunch through the numbers in parallel on CPUs. The workload can take weeks to process a single photomask, though the time varies with chip complexity; Intel says it takes its team five days to create a single mask.

Nvidia contends that the number of servers required to design a modern mask is increasing at the same rate as Moore's Law, thus pushing the server requirements and the amount of power needed to operate them into unsustainable territory. In fact, the incredible compute requirements for new mask tech, like Inverse Lithography Technology (ILT), which uses Inverse Curvilinear Masks (ILM), have already hampered the adoption of these more advanced techniques. Additionally, High-NA EUV and ILT are expected to increase the amount of data processing for masks by 10X in the coming years.

That's where Nvidia's cuLitho steps in, reducing the computational lithography workload to eight hours. The cuLitho library can be integrated into computational lithography software that leverages ILT (curvilinear shapes) or Optical Proximity Correction (OPC, which uses 'Manhattan' shapes) techniques, and is already integrated into Synopsys' tools. TSMC and ASML are also adopting the tech. Given the sensitivity of this kind of software, US export controls will govern any distribution of the software to China and other regions subject to sanctions.

Intel has long used its own proprietary software tools but is slowly shifting to industry-standard tools, particularly as it begins implementing its own external IDM 2.0 foundry operations. As such, it remains to be seen if other big fabs, like Intel and Samsung, will adopt the new software for their own internal tools. Regardless, the support from Synopsys, ASML, and TSMC assures broad uptake of the cuLitho library and Nvidia's GPU-based solutions with leading semiconductor manufacturers over the coming years.


Paul Alcorn is the Deputy Managing Editor for Tom’s Hardware US. He writes news and reviews on CPUs, storage and enterprise hardware.
