Celestial AI, a developer of optical interconnect technology, has announced a $100 million Series B funding round for its Photonic Fabric technology platform. IAG Capital Partners, Koch Disruptive Technologies (KDT) and Temasek’s Xora Innovation fund led the investment.
Other participants included Samsung Catalyst, Smart Global Holdings (SGH), Porsche Automobil Holding SE, The Engine Fund, ImecXpand, M Ventures and Tyche Partners.
According to Celestial AI, its Photonic Fabric platform represents a significant advance in optical connectivity performance, surpassing existing technologies. The company has raised $165 million in total, from seed funding through Series B.
Tackling the “memory wall” challenge
Advanced artificial intelligence (AI) models, such as GPT-4 (which powers ChatGPT) and recommendation engines, require exponentially increasing memory capacity and bandwidth. However, cloud service providers (CSPs) and hyperscale data centers face challenges because memory scaling and computing are interdependent, a constraint commonly called the “memory wall.”
The limitations of electrical interconnects, such as restricted bandwidth, high latency and high power consumption, hinder the growth of AI business models and further advances in AI.
To address these challenges, Celestial AI has collaborated with hyperscalers and with AI computing and memory providers to develop Photonic Fabric, an optical interconnect designed for disaggregated, exascale computing and memory clusters.
The company asserts that its proprietary Optical Compute Interconnect (OCI) technology enables both the disaggregation of scalable data center memory and accelerated computing.
Memory capacity a key problem
Celestial AI CEO Dave Lazovsky told VentureBeat: “The key problem going forward is memory capacity, bandwidth and data movement (chip-to-chip interconnectivity) for large language models (LLMs) and recommendation engine workloads. Our Photonic Fabric technology allows you to integrate photonics directly into your silicon die. A key advantage is that our solution allows you to deliver data at any point on the silicon die to the point of computing. Competitive solutions such as Co-Packaged Optics (CPO) cannot do this as they only deliver data to the edge of the die.”
Lazovsky claims that Photonic Fabric has successfully addressed the challenging “beachfront” problem (the limited die-edge area available for I/O) by providing significantly higher bandwidth (1.8 Tbps/mm²) at nanosecond latencies. As a result, the platform offers fully photonic compute-to-compute and compute-to-memory links. It supports industry-standard protocols including CXL and JEDEC HBM, and is compatible with interfaces such as PCIe, UCIe and other proprietary interconnects.
The recent funding round has also garnered the attention of Broadcom, which is collaborating on the development of Photonic Fabric prototypes based on Celestial AI’s designs. The company expects these prototypes to be ready to ship to customers within the next 18 months.
Enabling accelerated computing through optical interconnect
Lazovsky said that data rates must rise along with the growing volume of data being transferred within data centers. As those rates increase, he explained, electrical interconnects run into problems such as loss of signal fidelity and bandwidth that fails to scale with data growth, restricting overall system throughput.
According to Celestial AI, Photonic Fabric’s low-latency data transmission makes it possible to connect and disaggregate far more servers than traditional electrical interconnects allow. That low latency also lets latency-sensitive applications use remote memory, something previously unattainable over electrical interconnects.
“We enable hyperscalers and data centers to disaggregate their memory and compute resources without compromising power, latency and performance,” Lazovsky told VentureBeat. “Inefficient usage of server DRAM memory translates to hundreds of millions of dollars (if not billions) of waste across hyperscalers and enterprises. By enabling memory disaggregation and memory pooling, we not only help reduce memory spend but also improve memory utilization.”
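To see why pooling lifts utilization, consider a toy model: a server with its own DRAM must be provisioned for its individual peak demand, while a shared pool only has to cover the aggregate peak, which is relatively smaller because individual peaks rarely coincide. The sketch below illustrates that statistical effect; the server count, capacities and demand figures are hypothetical assumptions, not Celestial AI’s.

```python
import random

random.seed(0)

N_SERVERS = 100
TICKS = 1_000
LOCAL_GB = 512  # hypothetical DRAM provisioned per server

# Hypothetical per-server memory demand (GB), varying over time.
demand = [[random.uniform(50, 450) for _ in range(N_SERVERS)]
          for _ in range(TICKS)]

# Local provisioning: every server carries its own worst case.
local_total = N_SERVERS * LOCAL_GB

# Pooled provisioning: size one shared pool for the worst *aggregate* demand.
pooled_total = max(sum(tick) for tick in demand)

avg_demand = sum(sum(tick) for tick in demand) / TICKS
print(f"local : {local_total:8,.0f} GB provisioned, "
      f"{avg_demand / local_total:5.0%} average utilization")
print(f"pooled: {pooled_total:8,.0f} GB provisioned, "
      f"{avg_demand / pooled_total:5.0%} average utilization")
```

Under these toy numbers, the pooled configuration provisions far less DRAM for the same workload while running at much higher average utilization, which is the waste Lazovsky is pointing at.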
Storing and processing larger sets of data
The company asserts that its new offering can deliver data from any point on the silicon directly to the point of computing. Celestial AI says Photonic Fabric surpasses the limitations of silicon-edge connectivity, providing a package bandwidth of 1.8 Tbps/mm², 25 times greater than that offered by CPO. Furthermore, by delivering data directly to the point of computing rather than to the die edge, the company claims Photonic Fabric achieves latency that is 10 times lower.
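Taking those multipliers at face value, the implied CPO baseline can be backed out from the article’s own figures; the CPO number below is derived, not quoted.

```python
# Derive the implied CPO baseline from the claimed 25x advantage.
pf_bandwidth = 1.8               # Tbps/mm^2, claimed for Photonic Fabric
implied_cpo = pf_bandwidth / 25  # => 0.072 Tbps/mm^2
print(f"implied CPO package bandwidth: {implied_cpo * 1000:.0f} Gbps/mm^2")
```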
Celestial AI aims to simplify enterprise computation for LLMs such as GPT-4 and PaLM, and for deep learning recommendation models (DLRMs), which can range in size from 100 billion to more than 1 trillion parameters.
Lazovsky explained that because AI processors (GPUs, ASICs) carry a limited amount of high-bandwidth memory (32GB to 128GB), enterprises today must connect hundreds to thousands of these processors to handle such models. That approach diminishes system efficiency and drives up costs.
“By increasing the addressable memory capacity of each processor at high bandwidth, Photonic Fabric allows each processor to store and process larger chunks of data, reducing the number of processors needed,” he added. “Providing fast chip-to-chip links allows the connected processor to process the model faster, increasing the throughput while reducing costs.”
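To put rough numbers on that argument, the sketch below estimates the minimum number of accelerators needed just to hold a model’s weights in HBM, assuming 16-bit weights (2 bytes per parameter) and ignoring activations, KV caches and optimizer state; the assumptions are illustrative, not Celestial AI’s. Real deployments need many more processors than this floor, since they must also provision for bandwidth and serving throughput, which is where the hundreds-to-thousands figure comes from.

```python
import math

BYTES_PER_PARAM = 2  # 16-bit weights; an illustrative assumption

def min_processors(n_params: float, hbm_gb: float) -> int:
    """Lower bound on processor count from weight storage alone."""
    return math.ceil(n_params * BYTES_PER_PARAM / (hbm_gb * 1e9))

# Model sizes and HBM capacities cited in the article.
for n_params in (100e9, 1e12):
    for hbm_gb in (32, 128):
        print(f"{n_params / 1e9:6,.0f}B params on {hbm_gb:3d} GB HBM "
              f"-> at least {min_processors(n_params, hbm_gb)} processors")
```

Raising the addressable memory per processor shrinks this floor directly, and the fewer processors a model must be sharded across, the less chip-to-chip traffic it generates.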
What’s next for Celestial AI?
Lazovsky said that the money raised in this round will be used to accelerate the productization and commercialization of the Photonic Fabric technology platform by expanding Celestial AI’s engineering, sales and technical marketing teams.
“Given the growth in generative AI workloads due to LLMs and the pressures it puts on current data center architectures, demand is increasing rapidly for optical connectivity to support the transition from general computing data center infrastructure to accelerated computing,” Lazovsky told VentureBeat. “We expect to grow headcount by about 30% by the end of 2023, to 130 employees.”
He said that as the utilization of LLMs expands across various applications, infrastructure costs will also increase proportionally, leading to negative margins for many internet-scale software applications. Moreover, data centers are reaching power limitations, restricting the amount of computing that can be added.
To address these challenges, Lazovsky aims to minimize reliance on expensive processors by providing high-bandwidth, low-latency chip-to-chip and chip-to-memory interconnect solutions. He said this approach is intended to reduce enterprises’ capital expenditures and improve the efficiency of their existing infrastructure.
“By shattering the memory wall and helping improve systems efficiencies, we aim to help shape the future direction of AI model progress and adoption through our new offerings,” he said. “If memory capacity and bandwidth are no longer a limiting factor, it will enable data scientists to experiment with larger or different model architectures to unlock new applications and use cases. We believe that by lowering the cost of adopting large models, more businesses and applications would be able to adopt LLMs faster.”