Lumai has successfully run billion-parameter large language models (LLMs) in real time using its optical computing system, called Lumai Iris.
The company claims it is the first time an optical compute system has been proven to run large-scale AI inference workloads. The Lumai Iris inference server accelerates workloads using light instead of silicon-based processing.
The optical compute system enables fast and efficient performance with up to 90% lower energy consumption than conventional architectures. The chips are also more sustainable than traditional GPU-based systems, Lumai said.
Lumai Iris is a family of servers. The first, the Iris Nova, is available now for evaluation by hyperscalers, enterprises, research institutions and neo-clouds.
Light benefits
According to Lumai, data centers are running into hard limits on power and scalability, which is putting pressure on silicon-based architectures to keep pace.
According to data from the International Energy Agency, global power demand from data centers will more than double by 2030. Light-based computing can deliver more performance per kilowatt to enable AI scaling without energy and cost burdens, Lumai said.
“As the industry transitions into the inference era, we are simultaneously crossing the threshold into the post-silicon era,” said Dr. Xianxin Guo, CEO and co-founder of Lumai. “By shifting the computation paradigm from electrons to photons, Lumai can deliver an order-of-magnitude increase in performance with significant energy savings.”
The optical computing technology uses light in a 3D volume to overcome 2D constraints. Millions of operations are executed simultaneously through massive spatial parallelism, which results in low-cost token throughput for compute-bound workloads, the company said.
The Iris Nova server combines digital processing for system control and software with an optical tensor engine that performs the core mathematical operations.
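To make that division of labor concrete, here is a minimal Python sketch of a hybrid pipeline. It is illustrative only and is not Lumai's software or API: the function names are hypothetical, and it assumes that the "core mathematical operations" handled optically are matrix-vector products while control flow and the nonlinearity stay on the digital side.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def optical_matvec(weights: Matrix, x: Vector) -> Vector:
    """Hypothetical stand-in for an optical tensor engine.

    Computed sequentially here. The assumption is that optical hardware
    would evaluate all rows * cols multiply-accumulates in a single pass.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v: Vector) -> Vector:
    """Nonlinearity kept on the digital side in this sketch (an assumption)."""
    return [max(0.0, a) for a in v]


def run_layers(layers: List[Matrix], x: Vector) -> Vector:
    """Digital control loop: hand each layer's matrix product to the engine."""
    for weights in layers:
        x = relu(optical_matvec(weights, x))
    return x


def macs_per_pass(rows: int, cols: int) -> int:
    """Multiply-accumulate operations in one matrix-vector product."""
    return rows * cols
```

Under these assumptions, a single 1024 x 1024 weight matrix accounts for `macs_per_pass(1024, 1024)`, or 1,048,576, multiply-accumulates per pass, which gives a sense of the scale behind the "millions of operations" figure.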
