
Training a frontier AI model means convincing hundreds of thousands of chips to act like one giant computer. The hard part isn't the chips; it's the wiring.

In Memphis, Tennessee, there's a building where 200,000 of the world's most powerful processors think in unison. xAI's Colossus supercomputer went from bare warehouse floor to 100,000 liquid-cooled NVIDIA H100 GPUs in 122 days, then doubled to 200,000 in another 92, with a stated roadmap toward one million [5][6][7]. Together with SpaceX's ambitions, the buildout behind machines like this is projected to consume more than $300 billion in capital spending by the end of the decade [8][9].

Here's the part nobody puts in the keynote: the GPUs aren't the hard part. You can buy GPUs. The hard part is the network: the millions of fiber-optic strands, exotic switch chips, and physics-defying packaging tricks that let 200,000 processors behave like a single brain instead of a warehouse full of very expensive space heaters.

A traditional cloud data center is like a city: millions of independent jobs, each minding its own business, sharing roads that were never meant to handle everyone driving to the same place at once. An AI training cluster is the opposite. It's one job, spread across the entire city, and every few milliseconds every single processor has to swap notes with thousands of others before anyone can take the next step. The technical term for these synchronized data exchanges is collective communication (operations with names like AllReduce, AllGather, and ReduceScatter), and they are brutally unforgiving [1]. If one GPU's data arrives late because a network link hiccuped, tens of thousands of others sit idle, burning megawatts while they wait.

It gets worse. The newest models use a Mixture of Experts (MoE) design, where each token of text gets routed to specialized sub-networks that may live on accelerators hundreds of racks away [3].
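
To see why expert routing turns into network load, here is a minimal Python sketch. Everything in it is an illustrative assumption: the one-expert-per-device placement, the fixed token-to-expert assignments, and the function name `count_cross_device_tokens` are made up for the example, and real routers pick experts from learned gating scores rather than a hard-coded list.

```python
from collections import Counter

def count_cross_device_tokens(token_home, token_expert, expert_device):
    """Count tokens that must leave their home device to reach their expert.

    token_home[i]   -- device currently holding token i (hypothetical layout)
    token_expert[i] -- expert chosen for token i by the router
    expert_device   -- mapping of expert id -> device hosting that expert
    """
    traffic = Counter()
    for home, expert in zip(token_home, token_expert):
        dest = expert_device[expert]
        if dest != home:
            traffic[(home, dest)] += 1  # one device-to-device transfer
    return traffic

# Toy setup (assumed): 4 devices, one expert per device, 8 tokens.
expert_device = {0: 0, 1: 1, 2: 2, 3: 3}
token_home = [0, 0, 1, 1, 2, 2, 3, 3]
token_expert = [0, 3, 2, 1, 0, 1, 3, 2]

traffic = count_cross_device_tokens(token_home, token_expert, expert_device)
cross = sum(traffic.values())
print(f"{cross} of {len(token_home)} tokens cross devices")  # 5 of 8 tokens cross devices
```

Even in this toy run, most tokens end up needing an expert on a different device from the one they started on, and every such hop is a transfer the backend network has to carry.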
That routing produces torrents of "east-west" traffic (machine-to-machine, inside the building) at a scale the internet's underlying technology was never designed for.

This is the story of how the world's biggest tech companies are solving the million-GPU problem, and why the answer runs through rotating micro-mirrors, lasers fused directly onto switch chips, and a French company most people have never heard of.

## Five Companies, Five Wildly Different Answers

Faced with the same physics, the hyperscalers split into camps. The core fight: do you build on Ethernet, the mature, ubiquitous standard that runs everything, or on specialized fabrics like NVIDIA's InfiniBand, or on something stranger?

### Meta: Making Ordinary Ethernet Do Extraordinary Things

Meta bet that plain-vanilla Ethernet, sufficiently hot-rodded, could match InfiniBand, and then proved it by training Llama 3.1, a 405-billion-parameter model, on it [2]. The trick is a technology called RoCEv2 (RDMA over Converged Ethernet), which lets one server write directly into another server's memory across the network, skipping the operating system entirely. Meta built dedicated "backend" training networks, physically separate from the networks that serve your Instagram feed, using a non-blocking two-stage Clos topology (think: a telephone exchange where every line can always reach every other line) built from Arista 7800R3 switches running Broadcom's Tomahawk 5 chip [2].

Then the weird problems started. AI traffic doesn't look like internet traffic. It's a small number of enormous "elephant flows" (sustained firehoses of gradient data that can saturate a network card instantly) with almost no statistical variety ("low flow entropy," in the jargon) for routers to spread across parallel paths [2]. Meta's early path-pinning approach caused so much uplink congestion that cluster performance dropped as much as 30 percent when racks were only partially assigned to a job [2]. The fix was twofold.
First, Meta modified its collective communications library (NCCL) to split traffic across more Queue Pairs, and reprogrammed the switch ASICs, using User Defined Fields, to route based on the destination Queue Pair inside each RoCE packet. That manufactured the missing entropy, and collective operations completed up to 40 percent faster [2].

Second, Meta did something genuinely surprising: it turned congestion control off. The industry-standard mechanism, DCQCN, was misbehaving at 400G speeds (firmware bugs and poor visibility into its notification packets), so Meta disabled it entirely [2]. In its place: Priority Flow Control plus a receiver-driven admission scheme. A sending GPU copies its tensor data into high-bandwidth memory and then waits. It isn't allowed to transmit until the receiving node sends back a Clear-to-Send packet, the network equivalent of air traffic control refusing to let planes take off until there's a gate available. In-flight traffic stays bounded, the deep buffers in spine switches never overflow, and the catastrophic many-senders-one-receiver pileups known as "incast" simply can't happen [2].

### Google: Routing Light with Microscopic Mirrors

Google looked at packet switching, the basic idea underlying both Ethernet and InfiniBand, and decided to abandon it altogether in the core of its TPU supercomputers. Its Apollo platform uses Optical Circuit Switching (OCS). Instead of a chip reading each packet's address and forwarding it hop by hop, paying a latency, jitter, and power tax at every electrical-to-optical conversion, an Apollo switch contains arrays of MEMS micro-mirrors: microscopic, physically tilting mirrors that steer a beam of light from the transmitter on one TPU directly into the receiver on another [1]. No packet inspection. No buffering. The data path is literally an unbroken beam of light, with essentially zero added latency beyond the speed of light in glass fiber [1].
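
One way to picture the difference from packet switching is to model an optical circuit switch as nothing more than a table of port-to-port patches. The Python sketch below is a toy model under that assumption; the class name `OpticalCircuitSwitch` and its methods are hypothetical and do not correspond to any Google API or to Apollo's actual control software.

```python
class OpticalCircuitSwitch:
    """Toy model: each input port is patched to at most one output port."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.circuits = {}  # input port -> output port

    def connect(self, in_port, out_port):
        """Aim the (imaginary) mirror for in_port at out_port."""
        for port in (in_port, out_port):
            if not 0 <= port < self.num_ports:
                raise ValueError(f"port {port} out of range")
        if out_port in self.circuits.values():
            raise ValueError(f"output port {out_port} already in use")
        self.circuits[in_port] = out_port

    def disconnect(self, in_port):
        """Tear down the circuit for in_port, if one exists."""
        self.circuits.pop(in_port, None)

    def forward(self, in_port, payload):
        """Light entering in_port exits the patched output unchanged.

        There is no header lookup and no queue: the payload is never examined.
        """
        return self.circuits[in_port], payload


ocs = OpticalCircuitSwitch(num_ports=4)
ocs.connect(0, 2)
ocs.connect(1, 3)
print(ocs.forward(0, b"gradients"))  # (2, b'gradients')

# Re-patching changes the topology without touching any cable.
ocs.disconnect(0)
ocs.connect(0, 1)
print(ocs.forward(0, b"gradients"))  # (1, b'gradients')
```

The point of the sketch is what `forward` does not do: it never inspects the payload or holds it in a buffer, so all the decision-making happens when a circuit is set up, not per packet.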
The mirrors can reportedly re-aim in under ten nanoseconds, which means the physical topology of the supercomputer is software-defined [1]. Google can carve out custom sub-clusters whose wiring diagram exactly matches the communication pattern of the neural network being trained, and when a chip dies, the system routes light around it, no technician with a fiber cable required [1].

Underneath the optics, Google's TPUs talk over a proprietary Inter-Chip Interconnect (ICI) that implements collective-reduction math directly in hardware, bypassing the software overhead of Ethernet-style RDMA [1]. The sixth-generation TPU v6e (Trillium) pushes 800 gigabytes per second of bidirectional ICI bandwidth per chip across 256-chip pods in a 2D torus [13]. The seventh-generation TPU v7 (Ironwood) extends that to a 3D torus of up to 9,216 chips per superpod, a scale that demands custom optical circulators and wavelength-division multiplexing transceivers built specifically for Apollo [1][14].

### Microsoft and Oracle: InfiniBand in a Hostile World

Microsoft Azure and Oracle Cloud Infrastructure rent supercomputers to strangers, which means they have to run NVIDIA's InfiniBand fabric inside multi-tenant clouds where a paying customer might have bare-metal root access and bad intentions.

Oracle's OCI Superclusters split everything three ways: a front-end network for ordinary traffic, a back-end network reserved for RDMA collectives, and an ultra-low-latency in-node interconnect (NVLink) [15][16]. The backend runs NVIDIA Quantum-2 InfiniBand, with sub-500-nanosecond per-switch latency and hop-by-hop, credit-based flow control [15]. But InfiniBand grew up in trusting national-lab environments; it has no 802.1X-style port authentication.
So Oracle hardened it layer by layer: tenants are isolated with 16-bit Partition Keys (Pkeys, roughly InfiniBand's answer to VLANs); the Host Channel Adapter firmware is locked so a root-level tenant can't tamper with routing; static topology specifications ("topospec") instantly disable a port if a device's hardware GUID doesn't match what's supposed to be plugged in there; and the all-powerful Subnet Manager is shielded with rotating MAD keys and rate limiting against denial-of-service [15].

Azure's signature move is the "rail-optimized" topology in its ND H100 v5 and GB200 fleets: GPU #3 in every server connects to its own dedicated leaf-to-spine "rail," as do GPU #1, #2, and so on. AllReduce traffic between corresponding GPUs stays on its rail and never takes a performance-killing detour through the spine layer [19]. And in a lineage that traces back to Project Catapult, Azure offloads its software-defined networking to SmartNICs, currently the Microsoft Azure Network Adapter (MANA, also branded Azure Boost 2), built on Altera Agilex 7 FPGAs and ARM cores and sitting as a transparent "bump in the wire" between the host and the network. The card enforces virtualization isolation and accelerates GPUDirect storage without stealing a single host CPU cycle [20].

### AWS: Spray and Pray (Scientifically)

Amazon sidestepped the Ethernet-versus-InfiniBand war by inventing its own transport protocol. TCP, the internet's workhorse, funnels each connection down a single path, which is exactly wrong for a data center with thousands of equally good parallel routes [21]. AWS's Scalable Reliable Datagram (SRD) protocol, baked into its Elastic Fabric Adapter (EFAv3) and Nitro cards, shreds every flow into fragments and sprays them across all available paths simultaneously, reassembling on arrival [21]. The payoff: EC2 UltraCluster 2.0 networks couple more than 20,000 GPUs with kernel-bypass and GPU-direct RDMA, with latency down 25 percent versus the previous generation [21].
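
The spray-and-reassemble idea can be sketched in a few lines of Python. This is a toy illustration only: the round-robin path choice, the tiny `mtu`, the function names `spray` and `reassemble`, and the simulated reordering are assumptions made for the example, and SRD's real path selection, retransmission, and congestion handling are not modeled.

```python
def spray(message: bytes, num_paths: int, mtu: int = 4):
    """Split a message into sequence-numbered packets, one path per packet."""
    paths = [[] for _ in range(num_paths)]
    for seq, offset in enumerate(range(0, len(message), mtu)):
        packet = (seq, message[offset:offset + mtu])
        paths[seq % num_paths].append(packet)  # round-robin across paths
    return paths


def reassemble(arrivals):
    """Rebuild the message from packets that arrived in any order."""
    return b"".join(chunk for _, chunk in sorted(arrivals))


message = b"gradient shard 0017"
paths = spray(message, num_paths=3)

# Simulate paths delivering at different times: drain the last path first.
arrivals = [pkt for path in reversed(paths) for pkt in path]

# The packets really did arrive out of order...
assert [seq for seq, _ in arrivals] != sorted(seq for seq, _ in arrivals)
# ...yet the sequence numbers let the receiver restore the original bytes.
print(reassemble(arrivals) == message)  # True
```

The design choice worth noticing is that ordering is restored at the receiver from sequence numbers, which is what frees the sender to use every path at once instead of pinning a flow to one.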
The same SRD fabric links Amazon's home-grown Trainium and Inferentia AI chips, a fully vertically integrated stack that owes nothing to anyone else's networking standard [22][23].

### Scorecard

| Hyperscaler | Backend Fabric | Signature Moves | Key Hardware |
|----|----|----|----|
| Meta | Ethernet (RoCEv2) | Receiver-driven flow control, E-ECMP, 2-stage Clos, QP scaling | Arista 7800R3 (Tomahawk 5) |
| Google | Optical Circuit Switching | MEMS mirror re-patching, 3D torus, sub-10 ns switching | Apollo OCS, Inter-Chip Interconnect |
| Oracle | InfiniBand / RoCEv2 | Pkey tenant isolation, locked firmware, anti-spoofing | ConnectX-7, Quantum-2 |
| Microsoft | InfiniBand | Rail-optimized topology, FPGA SmartNIC bump-in-the-wire | Azure Boost 2 (MANA), Altera Agilex |
| AWS | Custom Ethernet (SRD) | Packet spraying across all paths, kernel bypass | EFAv3, Nitro card |
| xAI | Ethernet & InfiniBand | Liquid-cooled 64-GPU racks, hybrid RDMA/RoCE | Spectrum-X, Quantum-2 |

## The Switch Chips: 102.4 Trillion Bits per Second, No Dropped Packets

Every one of those architectures rests on a generation of switch silicon doing things that would have sounded like fantasy five years ago. Three companies dominate.

Broadcom owns the hyperscale Ethernet heartland. Its Tomahawk 5 moves 51.2 terabits per second through a single chip and powers the Arista chassis in Meta's clusters [2][12]. Its successor, Tomahawk 6 ("Davisson"), doubles that to 102.4 Tbps and integrates co-packaged optics to get power down to 3.8 picojoules per bit [25][27]. For scale: that single chip switches roughly the equivalent of 4 million simultaneous Netflix 4K streams (assuming about 25 Mb/s per stream).

Cisco is mounting its challenge with Silicon One. The G200 (51.2 Tbps) and new G300 (102.4 Tbps) are purpose-built for AI clusters [28][29].
The G300's party trick is a 252-megabyte fully shared packet buffer on-die: when a synchronized AllReduce sends micro-bursts crashing into the switch, any packet from any port can occupy any free byte of buffer, instead of overflowing a statically assigned slice [30]. Cisco claims up to 28 percent faster end-to-end job completion and 2.5× the burst absorption of industry averages [30][32].

NVIDIA owns InfiniBand outright and is invading Ethernet. The Quantum-X800 switch delivers 144 ports of 800 Gb/s, double the speed and five times the throughput of the prior Quantum-2 generation [33]. Its unfair advantage is SHARP (Scalable Hierarchical Aggregation and Reduction Protocol): the switch itself performs the gradient-averaging math inside the network, so mountains of data never need to make the round trip between servers at all [17][35]. On the Ethernet side, Spectrum-X delivers 1.6 Tb/s per port and up to 400 Tbps aggregate, tightly coupled with BlueField DPUs and ConnectX SuperNICs [33][34].

| Vendor | Flagship | Throughput | Differentiator |
|----|----|----|----|
| Broadcom | Tomahawk 6 (Davisson) | 102.4 Tbps | CPO integration, 3.8 pJ/bit, extreme radix |
| Cisco | Silicon One G300 | 102.4 Tbps | 252 MB fully shared buffer, Intelligent Collective Networking |
| NVIDIA | Quantum-X800 (InfiniBand) | >100 Tbps aggregate | SHARP in-network computing, 144× 800 Gb/s ports |
| NVIDIA | Spectrum-X (Ethernet) | 400 Tbps aggregate | 1.6 Tb/s per port, end-to-end telemetry, SuperNICs |

## The Light Problem: Why Your Data Center Is Melting

Here's the dirty secret of the optical age: at 800-gigabit and 1.6-terabit speeds, copper traces are finished (too much signal loss, too much crosstalk), so everything beyond a couple of meters must become light. But the way we've been converting electrons to photons is itself hitting a wall.

Today's standard is the pluggable optical transceiver, the QSFP-DD and OSFP modules technicians snap into a switch's front panel [24].
Each one burns 14 to 20 watts [36]. Load a 64-port switch and the transceivers alone are a one-kilowatt space heater (64 modules at 14 to 20 W each is roughly 0.9 to 1.3 kW) bolted to the front of the box [36]. Worse, the electrical signal has to travel from the switch ASIC across the circuit board to reach those modules, which requires a chain of retimers and digital signal processors just to keep it intelligible, every one adding power draw and latency [37]. Across a full AI cluster, an estimated 30 to 50 percent of total power consumption goes to communication and data movement, not computation [27].

The industry's escape hatch is Co-Packaged Optics (CPO): move the entire optical engine (lasers, modulators, photodetectors) onto the same package as the switch chip itself, with Linear Pluggable Optics (LPO) as a halfway house [27]. When the optics sit micrometers from the ASIC instead of centimeters, the DSPs and retimers disappear, and the power math transforms. Broadcom's Tomahawk 6 Davisson CPO switch cuts interconnect power 70 percent versus conventional pluggables [27]. NVIDIA's Spectrum-X Photonics and Quantum-X Photonics switches, built on integrated optical chiplets, claim 3.5× better overall power efficiency, 10× better network resilience (fewer discrete parts to fail), and one-quarter the laser count for cleaner signals [34].

### Welding Chips Together with Atomic Forces

There's a catch: putting a photonic engine on a switch package requires connecting two different chips with a density no solder bump can deliver. The answer is one of the most quietly remarkable manufacturing processes in the industry, hybrid bonding, and TSMC has built its dominance on it with a platform called COUPE (Compact Universal Photonic Engine) [38][39].

Hybrid bonding throws out solder entirely [40]. Both chips, the photonic integrated circuit and the electronic one, are polished by chemical-mechanical planarization until their faces are flat to within nanometers, then plasma-activated [41].
Press them together at room temperature and the dielectric surfaces grab each other through van der Waals forces, the same intermolecular attraction that lets a gecko walk up glass, before forming covalent Si–O–Si bonds at the interface [41]. A thermal annealing step then expands the slightly recessed copper pads until they fuse, copper to copper, into solid metal connections at a pitch below 10 micrometers [41].

The result is a seam so intimate it behaves like a single chip: no solder-bump parasitics, no mechanical weak points, none of the insertion loss of legacy edge or grating couplers [38]. TSMC says COUPE cuts laser power consumption 40 percent and boosts power-delivery bandwidth 25 percent versus microbump approaches, and it scales to full Wafer-Level System Integration, which is how NVIDIA and Broadcom are fusing photonics onto their logic dies at mass-production volume [38][40][42].

## Follow the Supply Chain: Lasers, Wafers, and a Quiet French Monopoly

Solve the architecture and the physics, and you smash headfirst into the supply chain. The optical transceiver market is headed past $15 billion in 2025 and $17 billion in 2026, almost entirely on AI demand [43], and several of its critical inputs come from astonishingly few places.

The laser shortage. The chokepoint for 800G and 1.6T module production is the supply of Electroabsorption Modulated Lasers (EMLs) and high-power Continuous Wave laser chips [36]. The cheap, efficient VCSELs used for short hops max out around 100 meters; spanning a hyperscale building at high baud rates demands EMLs, whose manufacture requires exotic epitaxial growth processes with brutal barriers to entry [36]. Analysts expect the shortfall to persist through 2026 [36].
NVIDIA found the situation alarming enough to invest $4 billion into Lumentum and Coherent (guaranteeing purchase commitments and bankrolling new fabs, including a new Lumentum device fabrication facility) because the entire Blackwell deployment timeline hinges on these parts [27][45]. Assemblers like Fabrinet and InnoLight (Zhongji Xuchuang) are riding the surge as they package the scarce chips into finished modules [36][43].

The wafer monopoly. Go one layer deeper and it gets narrower still. Silicon photonics chips can't be made on ordinary bulk silicon. They require specialized Silicon-on-Insulator (SOI) wafers, and the French firm Soitec holds a near-monopoly on Photonics-SOI via its proprietary "Smart Cut" process [47]. TSMC, Intel, and GlobalFoundries all start from Soitec material [47]. Five demand waves are converging on this single supplier: ever-bigger clusters, more optical lanes per accelerator, the substitution of legacy Indium Phosphide modules with silicon photonics, the CPO transition (which dramatically increases photonic silicon area per system), and the coming wave of optically disaggregated memory pooling [47].

The geopolitical wildcard. Above ~200 GBaud modulation, the territory beyond 1.6T, silicon itself struggles, and the industry leans on Indium Phosphide (InP) and Thin-Film Lithium Niobate (TFLN) [44]. Indium is recovered almost exclusively as a byproduct of zinc refining, with production concentrated in China [43]. In response, Coherent is standing up 6-inch InP wafer lines to improve yields and costs, while TFLN matures as a high-bandwidth, low-loss alternative [44].

## The Next Battlefront: Light Between the Chips

The network keeps moving closer to the silicon. The next frontier isn't rack to rack. It's chip to chip, and chip to memory, attacking the "memory wall": the hard limit on how fast data can be fed from memory into a processor's logic.

A cluster of startups is racing to put optical I/O inside the package.
Ayar Labs builds in-package optical I/O chiplets for direct chip-to-chip links [50][51]. Celestial AI's Photonic Fabric connects compute to remote memory through silicon photonics chiplets [51]. POET Technologies takes a contrarian route: an optical interposer on plain bulk silicon (no specialty substrate needed), integrating athermal waveguides and demultiplexers while staying CMOS-compatible, and supplying external light sources to others [47][50][53]. The shared endgame: disaggregated computing, where any GPU in a cluster can reach vast remote memory pools at latencies that feel local.

Meanwhile, inside the rack, a war is on for the "scale-up" domain: the memory-coherent interconnect that welds GPUs in a chassis into one logical processor. NVIDIA's NVLink 5 delivers 1.8 TB/s of bidirectional bandwidth per GPU in the GB200 NVL72, dwarfing any backend fabric, and it is arguably NVIDIA's deepest moat, since matching its training performance means buying NVIDIA's integrated racks [19]. In response, essentially everyone else (AMD, Broadcom, Intel, Google, Microsoft, Meta, Cisco, and AWS) formed the Ultra Accelerator Link (UALink) Consortium to define an open scale-up standard, one that could someday let an operator pool AMD Instinct GPUs, AWS Trainium, and Google TPUs in the same coherent memory domain [54][55][56]. It's a vital effort, but between CUDA's gravitational pull and NVIDIA's multi-generation hardware lead, proprietary interconnects will likely keep ruling frontier training for the immediate future.

## Why It Matters: The Bottleneck Has Moved

Step back and the pattern is unmistakable. For decades, computing was compute-bound. The AI era is interconnect-bound, and that inversion is redirecting hundreds of billions of dollars. Five structural shifts stand out:

Packaging is the new kingmaker. As optics move into the package, value shifts from discrete module assemblers to the foundries that can execute sub-10-micrometer hybrid bonding at scale.
TSMC's COUPE position makes it the chokepoint of the entire next-generation networking ecosystem.

Raw-material monopolies are load-bearing. Soitec's Photonics-SOI franchise, the EML/CW laser crunch, and indium's geographic concentration mean the AI buildout is physically gated by a handful of suppliers, and de-risking them (domestic InP fabs, TFLN, allied-nation III-V capacity) is now strategic policy as much as business.

Switch silicon is a real fight. Broadcom's CPO lead and Cisco's shared-buffer architecture prove the non-NVIDIA networking world is fiercely competitive, especially in the Ethernet fabrics that cost-conscious hyperscalers like Meta champion. And NVIDIA's $4 billion laser investment shows that optics, not GPUs, currently pace AI factory construction.

Hyperscalers refuse to be commoditized. AWS wrote its own protocol, Google switches light with mirrors, Meta rewrote Ethernet's rules, and Microsoft hides FPGAs in its network cards. The biggest hardware buyers on Earth are also becoming their own networking vendors, sustaining demand for programmable silicon like the Altera FPGAs inside Azure's SmartNICs.

In-package optics is the moonshot. Ayar Labs, Celestial AI, and POET are high-risk, high-reward bets on solving the memory wall, the prerequisite for AI's scaling laws to keep holding.

The zettascale era of artificial intelligence is, at bottom, an interconnect engineering problem. The companies that conquer its thermal, density, and latency walls, whether by monopolizing engineered wafers, mastering copper-to-copper bonding, or steering laser beams with microscopic mirrors, will quietly own the foundations of the machine intelligence age.

Title image credit: inside Colossus, aisle after aisle of liquid-cooled Supermicro racks in Memphis, Tennessee. (© Super Micro Computer, Inc. / xAI Corp., Supermicro case study)

## References

1. Introl Blog. *Google TPU Architecture: 7 Generations Explained*. https://introl.com/blog/google-tpu-architecture-complete-guide-7-generations
2. Engineering at Meta. *RoCE networks for distributed AI training at scale* (Aug 2024). https://engineering.fb.com/2024/08/05/data-center-engineering/roce-network-distributed-ai-training-at-scale/
3. arXiv. *An Extensible Software Transport Layer for GPU Networking*. https://arxiv.org/html/2504.17307v2
4. VIAVI Solutions. *Validating High-Speed Ethernet for Next-Gen AI Networking* (white paper). https://www.viavisolutions.com/en-us/literature/validating-high-speed-ethernet-next-generation-ai-networking-white-paper-book-en.pdf
5. xAI. *Colossus: The World's Largest AI Supercomputer*. https://x.ai/colossus
6. Supermicro. *Inside the 100K GPU xAI Colossus Cluster* (case study). https://www.supermicro.com/CaseStudies/Success_Story_xAI_Colossus_Cluster.pdf
7. Introl Blog. *xAI's Memphis Colossus*. https://introl.com/blog/xai-memphis-colossus-100000-gpu-supercomputer-infrastructure
8. ETF Database. *SpaceX: The AI IPO Wearing a Spacesuit*. https://etfdb.com/artificial-intelligence-content-hub/ai-ipo-spacesuit/
9. Financial Times. *SpaceX's $1.78tn IPO asks investors to buy Musk's moonshots*. https://www.ft.com/content/70fa49e3-1014-4412-890f-c7fe91497db9
10. Intelligent Visibility. *Arista AI Networking: Building Lossless Ethernet Fabrics for AI*. https://intelligentvisibility.com/ai-networking-solutions/arista-ai-networking-lossless-ethernet
11. Engineering at Meta. *Building Meta's GenAI Infrastructure* (Mar 2024). https://engineering.fb.com/2024/03/12/data-center-engineering/building-metas-genai-infrastructure/
12. Hector Weyl. *The AI-Driven Revolution in Optical Networking*. https://www.hectorweyl.com/blogs/blog/the-ai-driven-revolution-in-optical-networking-powering-the-next-era-of-high-speed-energy-efficient-connectivity
13. Google Cloud Documentation. *TPU v6e*. https://docs.cloud.google.com/tpu/docs/v6e
14. TrendForce. *Google's High-Speed Interconnect Architecture to Push 800G+ Optical Transceiver Share Past 60% by 2026*. https://www.trendforce.com/presscenter/news/20260210-12919.html
15. Oracle Blogs. *Behind the Scenes: Securing OCI InfiniBand SuperClusters*. https://blogs.oracle.com/cloud-infrastructure/securing-oci-infiniband-superclusters
16. Oracle. *Accelerating AI Workloads with OCI Supercluster*. https://www.oracle.com/a/ocom/docs/cloud/accelerate-ai-with-oci-supercluster.pdf
17. Oracle Blogs. *Announcing World's Largest, First Zettascale AI Supercomputer in the Cloud*. https://blogs.oracle.com/cloud-infrastructure/worlds-largest-ai-supercomputer-in-the-cloud
18. NVIDIA. *NVIDIA and Oracle Partner to Accelerate AI and Data Processing for Enterprises*. https://resources.nvidia.com/en-us-nvidia-oci-fastrack/oracle-ai-data-processing
19. WiFi Hotshots. *AI Networking Fabric Comparison: NVIDIA, Arista, Cisco*. https://wifihotshots.com/manufacturer-comparisons/ai-networking-fabrics/
20. Glenn K. Lockwood. *Azure SmartNICs*. https://glennklockwood.com/garden/Azure-SmartNIC
21. Gary Koys (Medium). *AWS EC2 UltraCluster 2.0: The Backbone of Next-Gen AI/ML Networking*. https://medium.com/@garykoys/aws-ec2-ultracluster-2-0-the-backbone-of-next-gen-ai-ml-networking-59fe79ceb344
22. AWS re:Invent 2024. *AWS-accelerated computing enables customer success with generative AI* (CMP207 slides). https://d1.awsstatic.com/onedam/marketing-channels/website/aws/en_US/events/approved/reinvent-2025/reinvent/2024/slides/cmp/CMP207_AWS-accelerated-computing-enables-customer-success-with-generative-AI.pdf
23. Amazon News. *4 ways AWS is engineering infrastructure to power generative AI*. https://www.aboutamazon.com/news/aws/aws-infrastructure-generative-ai
24. Arista Networks. *INFN Workshop 2024* (slides). https://agenda.infn.it/event/40160/contributions/230918/attachments/120118/174533/INFN Workshop 2024 - Arista Networks.pdf
25. Broadcom. *Broadcom Delivers the Future of AI Infrastructure with End-to-End AI Networking Solutions at 2025 OCP Global Summit*. https://investors.broadcom.com/news-releases/news-release-details/broadcom-delivers-future-ai-infrastructure-end-end-ai-networking
26. Soitec. *Enabling AI with Engineered Substrates* (investor presentation, Jan 2026). https://www.soitec.com/docs/default-source/financial-reports/2025-2026/en/soitec---enabling-ai-with-engineered-substrates-2026-01-06.pdf
27. iqoys (note.com). *[Supply Chain Anatomy] What is the CPO bottleneck?* https://note.com/iqoys/n/n0a3a4605bf1a
28. Cisco. *Silicon One G200 Data Sheet*. https://www.cisco.com/c/en/us/solutions/collateral/silicon-one/silicon-one-g200-ds.html
29. Cisco. *Silicon One G300 Data Sheet*. https://www.cisco.com/c/en/us/solutions/collateral/silicon-one/silicon-one-g300-ds.html
30. CloudWifiWorks. *Cisco Silicon One G300: 102.4 Tbps AI ASIC*. https://www.cloudwifiworks.com/cisco-silicon-one-g300.asp
31. SemiWiki. *Cisco launched its Silicon One G300 AI networking chip*. https://semiwiki.com/forum/threads/cisco-launched-its-silicon-one-g300-ai-networking-chip-in-a-move-that-aims-to-compete-with-nvidia-and-broadcom.24521/
32. Cisco Newsroom. *Cisco Announces New Silicon One G300, Advanced Systems and Optics to Power and Scale AI Data Centers for the Agentic Era* (Feb 2026). https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m02/cisco-announces-new-silicon-one-g300.html
33. NVIDIA. *Quantum-X800 InfiniBand Platform*. https://www.nvidia.com/en-us/networking/products/infiniband/quantum-x800/
34. AMAX. *Spectrum-X Photonics and Quantum-X Photonics Switches*. https://www.amax.com/spectrum-x-photonics-and-quantum-x-photonics-switches/
35. NVIDIA. *InfiniBand Switches*. https://www.nvidia.com/en-us/networking/infiniband-switching/
36. Wiitek. *1.6T OSFP DR4 FR4, 800G OSFP DAC AOC news*. http://www.wiitek.com/show_news.asp?id=492&b=71
37. NVIDIA (YouTube). *Quantum-X Photonics InfiniBand Switch Systems: Light-speed AI Networking*. https://www.youtube.com/watch?v=HsUOXa41WUk
38. TSMC Research. *On-chip Interconnect*. https://research.tsmc.com/english/research/interconnect/on-chip-interconnect/publish-time-1.html
39. Bits&Chips. *TSMC pushes silicon photonics platform to mass production*. https://bits-chips.com/article/tsmc-pushes-silicon-photonics-platform-to-mass-production/
40. DARPA. *The Quest for 3D Hybrid Bonding: Challenges for the Next Steps* (Dec 2025). https://www.darpa.mil/sites/default/files/attachment/2025-12/mto-ngmm-summit-presentation-3d-hybrid-bonding.pdf
41. PatSnap. *Hybrid bonding in 3D IC packaging: Cu-to-Cu explained*. https://www.patsnap.com/resources/blog/articles/hybrid-bonding-in-3d-ic-packaging-cu-to-cu-explained/
42. TSMC Research. *Heterogeneous Integration of a Compact Universal Photonic Engine for Silicon Photonics Applications in HPC*. https://research.tsmc.com/page/on-chip-interconnect/14.html
43. Seeking Alpha. *Lumentum, Coherent, Fabrinet to benefit from growing optical transceiver market*. https://seekingalpha.com/news/4466881-lumentum-coherent-fabrinet-benefit-growing-optical-transceiver-market
44. IPEC / ECOC Exhibition. *Development trend of optical interconnection in AI era*. https://www.ecocexhibition.com/wp-content/uploads/Development-trend-of-optical-interconnection-in-AI-era-IPEC.pdf
45. optics.org. *Nvidia backs Lumentum and Coherent with $4BN cash investment*. https://optics.org/news/nvidia-backs-lumentum-and-coherent-with-4bn-cash-investment
46. Reddit r/ETFs. *Pure-play photonics ETF (FOTO)* (discussion). https://www.reddit.com/r/ETFs/comments/1u0rwkz/alright_degenerates_theres_now_a_pureplay/
47. Convequity. *Soitec: Where Silicon Photonics Begins*. https://www.convequity.com/soitec-where-silicon-photonics-begins/
48. SEMI / Yole Group. *Martin Vallo presentation* (Nov 2024). https://www.semi.org/sites/semi.org/files/2024-11/07 Martin Vallo - Yole Group.pdf
49. IDTechEx. *Silicon Photonics and Photonic Integrated Circuits 2026–2036*. https://www.idtechex.com/en/research-report/silicon-photonics-and-photonic-integrated-circuits/1151
50. Grokipedia. *Ayar Labs and Celestial AI*. https://grokipedia.com/page/Ayar_Labs_and_Celestial_AI
51. Contrary Research. *Ayar Labs Business Breakdown & Founding Story*. https://research.contrary.com/company/ayar-labs
52. Cignal AI. *Optical Component Startup Tracker* (Nov 2025). https://cignal.ai/2025/11/optical-component-startup-tracker/
53. Reddit r/POETTechnologiesInc. *DD: Who are their competitors?* (discussion). https://www.reddit.com/r/POETTechnologiesInc/comments/1oohror/dd_who_are_their_competitors_what_will_be_the/
54. Introl Blog. *UALink and CXL 4.0*. https://introl.com/blog/ualink-cxl-4-gpu-interconnect-memory-pooling-guide-2025
55. UALink Consortium. *Ultra Accelerator Link is an open-standard interconnect for AI accelerators*. https://ualinkconsortium.org/news/ultra-accelerator-link-is-an-open-standard-interconnect-for-ai-accelerators-being-developed-by-amd-broadcom-intel-google-microsoft-others/
56. Reddit r/AMD_Stock. *UALink Roadmap Insights* (discussion). https://www.reddit.com/r/AMD_Stock/comments/1rnyz1m/ualink_roadmap_insights_accelerating_open/
View original source — Hacker Noon ↗



