Discover Top Posts Tagged with #accelerated computing

Nvidia DGX Station = 400 CPUs

https://youtu.be/6iKV1AQEScc

NVIDIA and South Korea align on sovereign AI at APEC CEO Summit

New Post has been published on https://thedigitalinsider.com/nvidia-and-south-korea-align-on-sovereign-ai-at-apec-ceo-summit/

At the APEC CEO Summit, NVIDIA said it is working with public agencies and private companies to build sovereign AI infrastructure across South Korea. The plan includes hundreds of thousands of NVIDIA GPUs across sovereign clouds and AI factories for areas like automotive, manufacturing and telecommunications.

“Korea’s leadership in technology and manufacturing positions it at the heart of the AI industrial revolution — where accelerated computing infrastructure becomes as vital as power grids and broadband,” said Jensen Huang, founder and CEO of NVIDIA. “Just as Korea’s physical factories have inspired the world with sophisticated ships, cars, chips and electronics, the nation can now produce intelligence as a new export that will drive global transformation.”

“Now that AI has gone beyond mere innovation and become the foundation of future industries, South Korea stands at the threshold of transformation,” said Bae Kyung-hoon, South Korea’s deputy prime minister and minister of science and ICT.

The government plans to deploy up to 50,000 new NVIDIA GPUs to support sovereign AI programs for businesses and research groups. The first phase includes 13,000 NVIDIA Blackwell and other GPUs through providers such as NAVER Cloud, NHN Cloud and Kakao. The expansion includes efforts to build a National AI Computing Center. Startups, researchers and other organisations will be able to use this sovereign infrastructure to train models and build new applications.

NVIDIA is also working with Samsung, SK Telecom, ETRI, KT, LGU+ and Yonsei University on AI-RAN and 6G network research. The work focuses on shifting some computing tasks from devices to network base stations, which may reduce battery drain and lower computing costs across sovereign AI services.

Major companies build sovereign AI factories

Large corporations in Korea are investing in advanced AI infrastructure for chip production, network operations and digital manufacturing tools that support the country’s sovereign computing goals.

NVIDIA and Samsung plan to build a new AI factory that connects chip manufacturing with accelerated computing. The system will run more than 50,000 NVIDIA GPUs and support data-driven production methods, including predictive maintenance and process improvements across chip fabs.

“We are at the dawn of the AI industrial revolution — a new era that will redefine how the world designs, builds and manufactures,” said Jensen Huang. Jay Y. Lee, executive chairman of Samsung Electronics, added, “From Samsung’s DRAM for NVIDIA’s game-changing graphics card in 1995 to our new AI factory, we are thrilled to continue our longstanding journey with NVIDIA in leading this transformation.”

Samsung plans to use NVIDIA CUDA-X libraries, along with software from Synopsys, Cadence and Siemens, to speed circuit design and manufacturing workflows. It will also use NVIDIA Omniverse to create digital twins of factories and equipment for real-time simulation, testing and logistics planning — all supporting wider sovereign AI adoption.

NVIDIA’s cuLitho library is being integrated into Samsung’s computational lithography tools. The collaboration has led to major gains in performance, supporting faster scaling in chip production.

Samsung is also developing large language models that run across hundreds of millions of Samsung devices, supporting translation and other reasoning tasks. The company plans to expand into robotics using NVIDIA Isaac Sim, NVIDIA Cosmos and the Jetson Thor edge platform, which may strengthen its position in sovereign AI systems.

SK Group expands AI capacity

SK Group is building an AI factory that will include more than 50,000 NVIDIA GPUs, with completion expected by late 2027. The facility will support SK subsidiaries and outside clients through GPU-as-a-service offerings that align with South Korea’s sovereign AI strategy. NVIDIA and SK are also working together on next-generation high-bandwidth memory for GPUs.

“SK Group is working with NVIDIA to make AI the engine of a profound transformation that will enable industries across Korea to transcend traditional limits of scale, speed and precision,” said Chey Tae-Won, chairman of SK Group.

SK Telecom plans to build an industrial AI cloud using NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The platform will support semiconductor manufacturing, digital twins and internal AI agents.

SK hynix is using NVIDIA PhysicsNeMo tools to support chip design simulations, aiming to improve accuracy and speed. It is also testing NVIDIA Blackwell GPUs with Synopsys software and building autonomous fab digital twins.

To support workers, SKT is developing a foundation model called A.X., built with NVIDIA NIM microservices and NVIDIA AI Enterprise. The model will power internal agents to assist thousands of employees across chip development and operations.

Hyundai Motor Group plans new AI factory

NVIDIA and Hyundai Motor Group are expanding their partnership to support autonomous driving, factory automation and robotics. Hyundai plans to build an AI factory using NVIDIA Blackwell GPUs for integrated training, simulation and deployment.

“AI is revolutionising every facet of every industry, and in transportation alone — from vehicle design and manufacturing to robotics and autonomous driving — NVIDIA’s AI and computing platforms are transforming how the world moves,” said Jensen Huang.

The companies expect joint investment of about $3 billion to grow national physical AI capabilities. The plan includes an NVIDIA AI Technology Center, Hyundai’s Physical AI Application Center and new data centres. These programs aim to help train a new generation of AI talent.

Hyundai will use NVIDIA Omniverse Enterprise to build digital twins of factories, supporting virtual testing, robot integration and predictive maintenance. It will also use NVIDIA DRIVE AGX Thor for in-vehicle AI systems, including driver assistance and infotainment features.

Growth of sovereign AI models

NAVER Cloud plans to deploy more than 60,000 GPUs for sovereign and physical AI work. The company will build industry-targeted models for areas such as shipbuilding and public safety.

The Ministry of Science and ICT is also leading a Sovereign AI Foundation Models project using NVIDIA NeMo and open Nemotron datasets. Partners include LG AI Research, NC AI, SK Telecom and Upstage. These models will support language and reasoning tasks.

LG is working with NVIDIA on physical AI research and will support startups and researchers using its EXAONE models, including healthcare applications.

Quantum and scientific research

KISTI plans to use NVIDIA accelerated computing in its sixth national supercomputer, HANGANG. The institute will support NVQLink, an open architecture for connecting quantum processors with GPU clusters. It will also develop scientific foundation models and explore physics-informed AI tools using NVIDIA PhysicsNeMo.

NVIDIA and local partners are forming a startup alliance through the NVIDIA Inception program. Members will gain access to accelerated computing resources from cloud partners like SK Telecom, along with support from venture firms. NVIDIA also plans to take part in the N-Up AI startup incubation program from the Ministry of SMEs and Startups.

(Photo by Nvidia)

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

NVIDIA GPUs to power Oracle's next-gen enterprise AI services

New Post has been published on https://thedigitalinsider.com/nvidia-gpus-to-power-oracles-next-gen-enterprise-ai-services/

Oracle and NVIDIA have expanded their partnership to make enterprise AI services more available, powerful, and practical. The announcements, made during Oracle AI World, cover everything from monstrously powerful new hardware to deeply integrated software that aims to put AI at the very core of a company’s data.

Ian Buck, VP of Hyperscale and High-Performance Computing at NVIDIA, said: “Through this latest collaboration, Oracle and NVIDIA are marking new frontiers in cutting-edge accelerated computing—streamlining database AI pipelines, speeding data processing, powering enterprise use cases and making inference easier to deploy and scale on OCI.”

The headline announcement is the new OCI Zettascale10 computing cluster. This platform is accelerated by NVIDIA GPUs and engineered for the kind of AI training and inference workloads that would make a normal server weep.

OCI Zettascale10 promises a mighty 16 zettaflops of peak AI compute performance and is knitted together with NVIDIA’s Spectrum-X Ethernet, a networking fabric designed specifically to stop GPUs from sitting around waiting for data, allowing organisations to scale up to millions of processors efficiently.

But raw power is only half the story. The real substance of this partnership lies in the software integrations that aim to weave AI into every layer of a business’s operations.

Mahesh Thiagarajan, Executive VP of Oracle Cloud Infrastructure, commented: “OCI Zettascale10 delivers multi‑gigawatt capacity for the most challenging AI workloads with NVIDIA’s next-generation GPU platform.

“In addition, the native availability of NVIDIA AI Enterprise on OCI gives our joint customers a leading AI toolset close at hand to OCI’s 200+ cloud services, supporting a long tail of customer innovation.”

Giving your Oracle database a brain with AI

The foundation of this new strategy is the Oracle AI Database 26ai. For years, the conventional wisdom was to move your data to where the AI models are. Oracle is flipping that on its head, arguing that it’s far more secure and efficient to bring the AI to your data. This latest database release is the embodiment of that “AI for Data” vision.

Juan Loaiza, Executive VP of Oracle Database Technologies at Oracle, said: “By architecting AI and data together, Oracle AI Database makes ‘AI for Data’ simple to learn and simple to use. We enable our customers to easily deliver trusted AI insights, innovations, and productivity for all their data, everywhere, including both operational systems and analytic data lakes.”

One of the standout features is the ability to run agentic AI workflows inside your database. The AI agents can tackle complex questions by combining your enterprise’s private, sensitive data with public information, all without ever having to move that private data outside your secure environment. This is made possible by features like a Unified Hybrid Vector Search, which lets the AI look for context across all your data types, whether it’s in a relational table, a JSON file, or a spatial map.

Oracle is also clearly thinking about the long game with security. The new database implements NIST-approved quantum-resistant algorithms for data both in-flight and at-rest. It’s a defence against “harvest now, decrypt later” attacks, where hackers steal encrypted data today with the hope of breaking it with future quantum computers.

Holger Mueller, VP and Principal Analyst at Constellation Research, commented: “Great AI needs great data. With Oracle AI Database 26ai, customers get both. It’s the single place where their business data lives—current, consistent, and secure. And it’s the best place to use AI on that data without moving it.

“To help simplify and accelerate AI adoption, AI Database 26ai includes impressive new AI features that go beyond AI Vector Search. A highlight is Oracle’s architecting agentic AI into the database, enabling customers to build, deploy, and manage their own in-database AI agents using a no-code visual platform that includes pre-built agents.”

The new database is designed to work with NVIDIA’s toolset. Its programming interfaces can now plug directly into NVIDIA NeMo Retriever, a collection of microservices that handle the complicated plumbing of modern AI for an enterprise.

This makes it far easier for developers to implement things like retrieval-augmented generation, or RAG. In simple terms, RAG allows a language model to look up relevant facts in your company documents before it answers a question, making its responses far more accurate and useful.
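The retrieval step described above can be sketched in a few lines of Python. This is a toy illustration, not NeMo Retriever or Oracle's API: the `embed` function is a hypothetical stand-in that counts words instead of calling a real embedding model, and `DOCS`, `retrieve` and `build_prompt` are names invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Prepend the retrieved passages so a language model can answer from them."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical company documents used only for this example.
DOCS = [
    "Refunds are processed within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", DOCS))
```

In a production setup the word counts would be replaced by dense vectors from an embedding model and the linear scan by a vector index, but the shape of the pipeline (embed, rank, prepend context) stays the same.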

The Oracle Private AI Services Container will also get a GPU-powered boost. This container lets businesses run AI models in their own secure environment. Soon, it will be able to offload the heavy lifting of creating vector embeddings – a core task for AI search – to powerful NVIDIA GPUs using the cuVS library. This promises to slash the time it takes to prepare data for AI applications.

Democratising enterprise AI

Beyond the database, the partnership aims to simplify the entire AI pipeline. The new Oracle AI Data Platform now includes a built-in NVIDIA GPU option and the NVIDIA RAPIDS Accelerator for Apache Spark. For data scientists and engineers, this is a big deal. It means they can speed up their data processing and machine learning workflows using GPUs, often without having to change a single line of their existing code.
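The "no code changes" claim rests on the fact that the RAPIDS Accelerator is switched on through Spark configuration rather than through the job's source. A minimal sketch of those settings follows, assuming a generic Spark cluster with the plugin jar already on the classpath; how Oracle AI Data Platform exposes these options is not described in the article, and the helper names and default values here are hypothetical.

```python
def rapids_spark_conf(gpus_per_executor=1, tasks_per_gpu=4):
    """Spark settings that enable the RAPIDS Accelerator plugin.

    The default values are illustrative assumptions, not tuning advice.
    """
    return {
        # Load the RAPIDS Accelerator SQL plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Allow supported SQL/DataFrame operations to run on the GPU.
        "spark.rapids.sql.enabled": "true",
        # Standard Spark GPU scheduling: GPUs per executor, GPU share per task.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }

def as_submit_args(conf):
    """Flatten a config dict into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args

if __name__ == "__main__":
    print(" ".join(as_submit_args(rapids_spark_conf())))
```

The existing DataFrame or SQL job is submitted unchanged; operations the plugin does not support fall back to the CPU.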

All of these tools and capabilities are being consolidated within the Oracle AI Hub. The idea is to give organisations a single place to build, deploy, and manage their AI solutions. From the hub, users can deploy NVIDIA’s NIM microservices – which are like pre-packaged AI skills – through a simple, no-code interface.

To lower the barrier to entry even further, the full NVIDIA AI Enterprise software suite is now natively available within the OCI Console. This means that a developer can spin up a GPU instance and enable all the necessary NVIDIA tools with a few clicks, rather than going through a separate procurement process. It’s a small change that makes a big difference in how quickly teams can get started.

It’s clear that this collaboration is aimed at solving the real-world challenges businesses face when trying to adopt AI. By bringing the hardware, the data, and the software tools into one cohesive ecosystem, Oracle and NVIDIA are making a case that the era of practical, secure, and scalable enterprise AI has well and truly arrived.

Meta and Oracle choose NVIDIA Spectrum-X for AI data centres

New Post has been published on https://thedigitalinsider.com/meta-and-oracle-choose-nvidia-spectrum-x-for-ai-data-centres/

Meta and Oracle are upgrading their AI data centres with NVIDIA’s Spectrum-X Ethernet networking switches — technology built to handle the growing demands of large-scale AI systems. Both companies are adopting Spectrum-X as part of an open networking framework designed to improve AI training efficiency and accelerate deployment across massive compute clusters.

Jensen Huang, NVIDIA’s founder and CEO, said trillion-parameter models are transforming data centres into “giga-scale AI factories,” adding that Spectrum-X acts as the “nervous system” connecting millions of GPUs to train the largest models ever built.

Oracle plans to use Spectrum-X Ethernet with NVIDIA’s Vera Rubin architecture to build large-scale AI factories. Mahesh Thiagarajan, Oracle Cloud Infrastructure’s executive vice president, said the new setup will allow the company to connect millions of GPUs more efficiently, helping customers train and deploy new AI models faster.

Meta, meanwhile, is expanding its AI infrastructure by integrating Spectrum-X Ethernet switches into the Facebook Open Switching System (FBOSS), its in-house platform for managing network switches at scale. According to Gaya Nagarajan, Meta’s vice president of networking engineering, the company’s next-generation network must be open and efficient to support ever-larger AI models and deliver services to billions of users.

Building flexible AI systems

According to Joe DeLaere, who leads NVIDIA’s Accelerated Computing Solution Portfolio for Data Centre, flexibility is key as data centres grow more complex. He explained that NVIDIA’s MGX system offers a modular, building-block design that lets partners combine different CPUs, GPUs, storage, and networking components as needed.

The system also promotes interoperability, allowing organisations to use the same design across multiple generations of hardware. “It offers flexibility, faster time to market, and future readiness,” DeLaere told the media.

As AI models become larger, power efficiency has become a central challenge for data centres. DeLaere said NVIDIA is working “from chip to grid” to improve energy use and scalability, collaborating closely with power and cooling vendors to maximise performance per watt.

One example is the shift to 800-volt DC power delivery, which reduces heat loss and improves efficiency. The company is also introducing power-smoothing technology to reduce spikes on the electrical grid — an approach that can cut maximum power needs by up to 30 per cent, allowing more compute capacity within the same footprint.
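The arithmetic behind both claims can be sketched as follows. The numbers are hypothetical (a 1,000 kW budget, 100 kW racks, a 1 milliohm conductor); only the 30 per cent figure and the 800-volt level come from the article. The first function shows how a lower provisioned peak frees room for more racks, and the second shows why a higher distribution voltage cuts resistive loss: for the same delivered power the current falls, and loss scales with the square of the current.

```python
def racks_within_budget(budget_kw, peak_kw_per_rack, peak_reduction=0.0):
    """How many racks fit in a fixed power budget once peaks are smoothed.

    peak_reduction is the fractional cut in provisioned peak power per rack
    (0.30 for the article's "up to 30 per cent").
    """
    effective_peak = peak_kw_per_rack * (1.0 - peak_reduction)
    return int(budget_kw // effective_peak)

def resistive_loss_w(power_w, voltage_v, resistance_ohm):
    """I^2 * R loss in a conductor delivering power_w at voltage_v."""
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm

if __name__ == "__main__":
    # Hypothetical site: 1,000 kW budget, racks provisioned at 100 kW peak.
    print(racks_within_budget(1000, 100))        # without smoothing
    print(racks_within_budget(1000, 100, 0.30))  # with a 30% lower peak
    # Same 100 kW load over the same conductor at two voltages.
    print(resistive_loss_w(100_000, 400, 0.001))
    print(resistive_loss_w(100_000, 800, 0.001))
```

Doubling the voltage quarters the resistive loss in this model, which is the basic reason higher-voltage DC distribution wastes less energy as heat.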

Scaling up, out, and across

NVIDIA’s MGX system also plays a role in how data centres are scaled. Gilad Shainer, the company’s senior vice president of networking, told the media that MGX racks host both compute and switching components, supporting NVLink for scale-up connectivity and Spectrum-X Ethernet for scale-out growth.

He added that MGX can connect multiple AI data centres together as a unified system — what companies like Meta need to support massive distributed AI training operations. Depending on distance, they can link sites through dark fibre or additional MGX-based switches, enabling high-speed connections across regions.

Meta’s adoption of Spectrum-X reflects the growing importance of open networking. Shainer said the company will use FBOSS as its network operating system but noted that Spectrum-X supports several others, including Cumulus, SONiC, and Cisco’s NOS through partnerships. This flexibility allows hyperscalers and enterprises to standardise their infrastructure using the systems that best fit their environments.

Expanding the AI ecosystem

NVIDIA sees Spectrum-X as a way to make AI infrastructure more efficient and accessible across different scales. Shainer said the Ethernet platform was designed specifically for AI workloads like training and inference, offering up to 95 per cent effective bandwidth and outperforming traditional Ethernet by a wide margin.

He added that NVIDIA’s partnerships with companies such as Cisco, xAI, Meta, and Oracle Cloud Infrastructure are helping to bring Spectrum-X to a broader range of environments — from hyperscalers to enterprises.

Preparing for Vera Rubin and beyond

DeLaere said NVIDIA’s upcoming Vera Rubin architecture is expected to be commercially available in the second half of 2026, with the Rubin CPX product arriving by year’s end. Both will work alongside Spectrum-X networking and MGX systems to support the next generation of AI factories.

He also clarified that Spectrum-X and XGS share the same core hardware but use different algorithms for varying distances — Spectrum-X for inside data centres and XGS for inter–data centre communication. This approach minimises latency and allows multiple sites to operate together as a single large AI supercomputer.

Collaborating across the power chain

To support the 800-volt DC transition, NVIDIA is working with partners from chip level to grid. The company is collaborating with Onsemi and Infineon on power components, with Delta, Flex, and Lite-On at the rack level, and with Schneider Electric and Siemens on data centre designs. A technical white paper detailing this approach will be released at the OCP Summit.

DeLaere described this as a “holistic design from silicon to power delivery,” ensuring all systems work seamlessly together in the high-density AI environments that companies like Meta and Oracle operate.

Performance advantages for hyperscalers

Spectrum-X Ethernet was built specifically for distributed computing and AI workloads. Shainer said it offers adaptive routing and telemetry-based congestion control to eliminate network hotspots and deliver stable performance. These features enable higher training and inference speeds while allowing multiple workloads to run simultaneously without interference.
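A toy model helps show why load-aware routing avoids hotspots. The sketch below is an assumption-laden simplification, not Spectrum-X's actual mechanism: it places whole flows rather than packets, and the "telemetry" is just the running load on each link. It compares static hashing, where a flow's link is fixed by a hash of its identifier, with a placement that always picks the currently least-loaded link.

```python
import zlib

def static_hash_placement(flows, n_links):
    """Assign each flow to a link chosen by hashing its ID (load-unaware)."""
    loads = [0.0] * n_links
    for flow_id, size in flows:
        link = zlib.crc32(flow_id.encode()) % n_links
        loads[link] += size
    return loads

def adaptive_placement(flows, n_links):
    """Assign each flow to whichever link currently carries the least load."""
    loads = [0.0] * n_links
    for _, size in flows:
        link = min(range(n_links), key=loads.__getitem__)
        loads[link] += size
    return loads

if __name__ == "__main__":
    # Eight equal-sized hypothetical flows across four parallel links.
    flows = [(f"flow-{i}", 1.0) for i in range(8)]
    print("static hash:", static_hash_placement(flows, 4))
    print("adaptive:   ", adaptive_placement(flows, 4))
```

With equal-sized flows the load-aware placement always ends up perfectly balanced, while hashing can stack several flows on one link and leave another idle. The busiest link sets the pace for a synchronised training step, which is why collisions hurt.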

He added that Spectrum-X is the only Ethernet technology proven to scale at extreme levels, helping organisations get the best performance and return on their GPU investments. For hyperscalers such as Meta, that scalability helps manage growing AI training demands and keep infrastructure efficient.

Hardware and software working together

While NVIDIA’s focus is often on hardware, DeLaere said software optimisation is equally important. The company continues to improve performance through co-design — aligning hardware and software development to maximise efficiency for AI systems.

NVIDIA is investing in FP4 kernels, frameworks such as Dynamo and TensorRT-LLM, and algorithms like speculative decoding to improve throughput and AI model performance. These updates, he said, ensure that systems like Blackwell continue to deliver better results over time for hyperscalers such as Meta that rely on consistent AI performance.
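Speculative decoding is the most algorithmic item on that list, and a toy version fits in a short sketch. The code below is a greedy-only illustration with hypothetical stand-in "models" (plain functions from a token list to the next token); it is not how TensorRT-LLM or Dynamo implement it. A cheap draft model guesses several tokens ahead, the expensive target model checks them, and the longest agreeing prefix is kept, so the output matches what the target alone would have produced.

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy speculative decoding with toy next-token functions.

    Returns (tokens, rounds). Each round corresponds to what would be one
    batched verification pass of the target model in a real system; this
    toy calls target_next once per position instead.
    """
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The draft model proposes k tokens autoregressively.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model verifies the proposal position by position.
        rounds += 1
        accepted = []
        for guess in proposal:
            expected = target_next(tokens + accepted)
            if guess == expected:
                accepted.append(guess)
            else:
                accepted.append(expected)  # replace the first wrong guess
                break
        else:
            # Every guess matched, so the target contributes one extra token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens], rounds

def greedy_decode(next_token, prompt, max_new_tokens):
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens))
    return tokens

# Hypothetical toy models: the target counts upward modulo 10,
# and the draft agrees except right after the token 2.
def toy_target(seq):
    return (seq[-1] + 1) % 10

def toy_draft(seq):
    return 9 if seq[-1] == 2 else (seq[-1] + 1) % 10

if __name__ == "__main__":
    out, rounds = speculative_decode(toy_target, toy_draft, [0], 12)
    print(out, "verification rounds:", rounds)
    print(greedy_decode(toy_target, [0], 12))
```

The speed-up comes from the round count: when the draft is usually right, the target runs far fewer sequential passes than one per token, while a wrong guess costs only the discarded draft work.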

Networking for the trillion-parameter era

The Spectrum-X platform — which includes Ethernet switches and SuperNICs — is NVIDIA’s first Ethernet system purpose-built for AI workloads. It’s designed to link millions of GPUs efficiently while maintaining predictable performance across AI data centres.

With congestion-control technology achieving up to 95 per cent data throughput, Spectrum-X marks a major leap over standard Ethernet, which typically reaches only about 60 per cent due to flow collisions. Its XGS technology also supports long-distance AI data centre links, connecting facilities across regions into unified “AI super factories.”
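To put those two percentages in concrete terms, the sketch below converts them into usable bandwidth and transfer time. The 400 Gb/s line rate and the 1,000 GB payload are hypothetical inputs chosen for illustration; only the 95 and 60 per cent figures come from the article.

```python
def effective_gbps(line_rate_gbps, efficiency):
    """Usable throughput given the fraction of line rate actually achieved."""
    return line_rate_gbps * efficiency

def transfer_seconds(gigabytes, line_rate_gbps, efficiency):
    """Time to move a payload of the given size at the effective rate."""
    gigabits = gigabytes * 8
    return gigabits / effective_gbps(line_rate_gbps, efficiency)

if __name__ == "__main__":
    line_rate = 400  # Gb/s, hypothetical port speed
    payload = 1000   # GB, hypothetical gradient or checkpoint exchange
    for label, eff in (("95%", 0.95), ("60%", 0.60)):
        rate = effective_gbps(line_rate, eff)
        secs = transfer_seconds(payload, line_rate, eff)
        print(f"{label}: {rate:.0f} Gb/s usable, {secs:.1f} s to move {payload} GB")
```

On identical hardware, the ratio 0.95 / 0.60 means the same transfer takes roughly 1.6 times as long at the lower efficiency.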

By tying together NVIDIA’s full stack — GPUs, CPUs, NVLink, and software — Spectrum-X provides the consistent performance needed to support trillion-parameter models and the next wave of generative AI workloads.

(Photo by Nvidia)

Trump jokes about AI while US and UK sign new tech deal

New Post has been published on https://thedigitalinsider.com/trump-jokes-about-ai-while-us-and-uk-sign-new-tech-deal/

US President Donald Trump said on Thursday that AI was “taking over the world,” and joked that he hoped tech executives understood it better than he did.

The comment came as Trump and UK Prime Minister Keir Starmer hosted a gathering of business and technology leaders in London during the president’s second state visit to Britain. Among those present was Nvidia CEO Jensen Huang, whose company has become central to the global AI boom.

Breaking from a prepared speech about US-UK ties, fresh partnerships, and billions of dollars in pledged investments, Trump admitted he had little knowledge of AI.

“This will create new government, academic, and private sector cooperation in areas like AI, which is taking over the world […] I’m looking at you guys. You’re taking over the world. Jensen, I don’t know what you’re doing here,” Trump said, drawing laughter from Starmer and the audience.

“I hope you’re right. All I can say is, we both hope you’re right.”

Trump and Starmer sign AI-focused tech deal

The highlight of the event was the signing of a “Tech Prosperity Deal,” which sets out plans for the two countries to deepen cooperation in emerging technologies. The deal covers projects like developing AI models for healthcare, advancing quantum computing, and modernising nuclear programmes.

As part of the agreement, Nvidia committed to deploying 120,000 GPUs in Britain. It will be the company’s largest rollout in Europe.

Nvidia’s parallel move with Intel

Separately on Thursday, Nvidia revealed a $5 billion investment in Intel, coupled with a collaboration with Intel on new products. The two companies will work together on custom data centres that support AI systems and on processors for personal computers.

Nvidia said it plans to buy Intel stock at $23.28 a share, subject to regulatory approval. The investment comes as Intel looks to regain ground after years of decline.

“The historic collaboration tightly couples Nvidia’s AI and accelerated computing stack with Intel’s CPUs and the vast x86 ecosystem – a fusion of two world-class platforms,” Huang said. “Together, we will expand our ecosystems and lay the foundation for the next era of computing.”

For data centres, Intel will design chips that support Nvidia’s AI infrastructure. For desktop PCs, Intel will manufacture processors that integrate Nvidia’s technology, giving the firm a chance to push into areas where it has lost momentum.

A lifeline for Intel

The partnership offers a boost for Intel, once the backbone of personal computers but now struggling to keep pace. The company missed the shift to smartphones after Apple’s iPhone transformed the market in 2007, and more recently it has lagged behind in the AI hardware race. Nvidia, meanwhile, has become the world’s most valuable company.

Investors reacted quickly: Intel’s shares jumped 30% in premarket trading, while Nvidia’s stock rose nearly 3%.

Before signing the deal, Trump added a dose of humour, turning to Treasury Secretary Scott Bessent and asking, “Should I sign this? Are you sure, Scott? If the deal’s no good, I’m blaming you.”

During the visit to the UK, Trump also said he hoped AI would be handled wisely by the experts leading its development, since he admitted it was beyond his understanding.

Trump administration monitors AI competition closely

While partnerships expand, US regulators are also tightening their focus on competition in AI. Speaking at a conference in New York, Assistant Attorney General Gail Slater said the US Justice Department is on alert for anticompetitive behaviour in the sector. “The competitive dynamics of each layer of the AI stack and how they interrelate, with a particular eye towards exclusionary behaviour that forecloses access to key inputs and distribution channels, are legitimate areas for antitrust inquiry,” she said. “Of course, a truly open-source model must be one that is not unilaterally maintained by a single vendor that exerts unwarranted influence and impose restrictions.”

One key area is access to data. A federal judge in Washington recently ordered Google to share some of its search data with rivals, including AI companies, to help level the playing field in online search. Google has said it will appeal that ruling.

Slater’s remarks reflect a continuity of concern. Antitrust officials under President Joe Biden also examined big tech’s ties to AI startups, showing that both administrations see competition as central to the future of AI.

(Photo by History in HD)

What happens when AI data centres run out of space? NVIDIA's new solution explained

New Post has been published on https://thedigitalinsider.com/what-happens-when-ai-data-centres-run-out-of-space-nvidias-new-solution-explained/

When AI data centres run out of space, they face a costly dilemma: build bigger facilities or find ways to make multiple locations work together seamlessly. NVIDIA’s latest Spectrum-XGS Ethernet technology promises to solve this challenge by connecting AI data centres across vast distances into what the company calls “giga-scale AI super-factories.”

Announced ahead of Hot Chips 2025, this networking innovation represents the company’s answer to a growing problem that’s forcing the AI industry to rethink how computational power gets distributed.

The problem: When one building isn’t enough

As artificial intelligence models become more sophisticated and demanding, they require enormous computational power that often exceeds what any single facility can provide. Traditional AI data centres face constraints in power capacity, physical space, and cooling capabilities.

When companies need more processing power, they typically have to build entirely new facilities—but coordinating work between separate locations has been problematic due to networking limitations. The issue lies in standard Ethernet infrastructure, which suffers from high latency, unpredictable performance fluctuations (called “jitter”), and inconsistent data transfer speeds when connecting distant locations.

These problems make it difficult for AI systems to efficiently distribute complex calculations across multiple sites.

NVIDIA’s solution: Scale-across technology

Spectrum-XGS Ethernet introduces what NVIDIA terms “scale-across” capability—a third approach to AI computing that complements existing “scale-up” (making individual processors more powerful) and “scale-out” (adding more processors within the same location) strategies.

The technology integrates into NVIDIA’s existing Spectrum-X Ethernet platform and includes several key innovations:

Distance-adaptive algorithms that automatically adjust network behaviour based on the physical distance between facilities

Advanced congestion control that prevents data bottlenecks during long-distance transmission

Precision latency management to ensure predictable response times

End-to-end telemetry for real-time network monitoring and optimisation

According to NVIDIA’s announcement, these improvements can “nearly double the performance of the NVIDIA Collective Communications Library,” which handles communication between multiple graphics processing units (GPUs) and computing nodes.
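For readers unfamiliar with collective communication, the sketch below simulates an all-reduce, the operation that sums a buffer (such as gradients) across every GPU and leaves each with the full result. It uses the classic ring algorithm in plain Python lists. It is an illustration of the concept only and makes no claim about which algorithm the library selects in any given deployment; the function name and data are hypothetical.

```python
def ring_all_reduce(buffers):
    """Simulate a ring all-reduce (sum) over equally sized per-rank buffers.

    buffers[r] is rank r's local vector. Every rank ends with the
    element-wise sum. The vector length must be divisible by the rank count.
    """
    n = len(buffers)
    length = len(buffers[0])
    if length % n != 0:
        raise ValueError("vector length must be divisible by number of ranks")
    chunk = length // n
    data = [list(b) for b in buffers]

    def span(c):
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: after n-1 steps, rank r holds the fully
    # summed chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing chunks first so sends within a step are simultaneous.
        sends = [((r - step) % n, data[r][span((r - step) % n)]) for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            data[dst][span(c)] = [a + b for a, b in zip(data[dst][span(c)], payload)]

    # Phase 2, all-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        sends = [((r + 1 - step) % n, data[r][span((r + 1 - step) % n)]) for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            data[dst][span(c)] = payload

    return data

if __name__ == "__main__":
    # Three hypothetical ranks, each holding a three-element gradient.
    print(ring_all_reduce([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Each rank only ever talks to its neighbour, and every step waits on the slowest link, which is why latency and jitter between distant sites matter so much for this kind of traffic.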

Real-world implementation

CoreWeave, a cloud infrastructure company specialising in GPU-accelerated computing, plans to be among the first adopters of Spectrum-XGS Ethernet.

“With NVIDIA Spectrum-XGS, we can connect our data centres into a single, unified supercomputer, giving our customers access to giga-scale AI that will accelerate breakthroughs across every industry,” said Peter Salanki, CoreWeave’s cofounder and chief technology officer.

This deployment will serve as a practical test case for whether the technology can deliver on its promises in real-world conditions.

Industry context and implications

The announcement follows a series of networking-focused releases from NVIDIA, including the original Spectrum-X platform and Quantum-X silicon photonics switches. This pattern suggests the company recognises networking infrastructure as a critical bottleneck in AI development.

“The AI industrial revolution is here, and giant-scale AI factories are the essential infrastructure,” said Jensen Huang, NVIDIA’s founder and CEO, in the press release. While Huang’s characterisation reflects NVIDIA’s marketing perspective, the underlying challenge he describes—the need for more computational capacity—is acknowledged across the AI industry.

The technology could change how AI data centres are planned and operated. Instead of building massive single facilities that strain local power grids and real estate markets, companies might distribute their infrastructure across multiple smaller locations while maintaining performance levels.

Technical considerations and limitations

However, several factors could influence Spectrum-XGS Ethernet’s practical effectiveness. Network performance across long distances remains subject to physical limitations, including the speed of light and the quality of the underlying internet infrastructure between locations. The technology’s success will largely depend on how well it can work within these constraints.
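The speed-of-light constraint can be made concrete with a rough calculation. The sketch below is illustrative only: it assumes light travels through optical fibre at roughly two-thirds of its vacuum speed (a refractive index of about 1.47), a straight fibre path, and no switching or queuing delay. The distances and the 400 Gb/s link rate are hypothetical values, not figures from NVIDIA.

```python
# Rough lower bound on latency between two data centres linked by fibre.
# Assumptions (not from the article): refractive index of about 1.47,
# a straight fibre path, and no switching or queuing delay.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBRE_REFRACTIVE_INDEX = 1.47       # typical value for silica fibre (assumed)


def one_way_delay_ms(distance_km: float) -> float:
    """Propagation delay over fibre, in milliseconds."""
    speed_in_fibre = SPEED_OF_LIGHT_KM_S / FIBRE_REFRACTIVE_INDEX
    return distance_km / speed_in_fibre * 1000.0


def bytes_in_flight(distance_km: float, link_gbps: float) -> float:
    """Bandwidth-delay product: bytes in flight during one round trip on a full link."""
    round_trip_s = 2 * one_way_delay_ms(distance_km) / 1000.0
    return link_gbps * 1e9 / 8 * round_trip_s


if __name__ == "__main__":
    for km in (10, 100, 1000):      # illustrative distances
        delay = one_way_delay_ms(km)
        in_flight_mb = bytes_in_flight(km, 400) / 1e6
        print(f"{km:>5} km: one-way {delay:.3f} ms, {in_flight_mb:.1f} MB in flight")
```

Under these assumptions, a 1,000 km link adds just under 5 ms each way before any equipment delay is counted, which is the floor that distance-aware congestion control has to work around rather than remove.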

Additionally, the complexity of managing distributed AI data centres extends beyond networking to include data synchronisation, fault tolerance, and regulatory compliance across different jurisdictions—challenges that networking improvements alone cannot solve.

Availability and market impact

NVIDIA states that Spectrum-XGS Ethernet is “available now” as part of the Spectrum-X platform, though pricing and specific deployment timelines haven’t been disclosed. The technology’s adoption rate will likely depend on cost-effectiveness compared to alternative approaches, such as building larger single-site facilities or using existing networking solutions.

The bottom line for consumers and businesses is this: if NVIDIA’s technology works as promised, we could see faster AI services, more powerful applications, and potentially lower costs as companies gain efficiency through distributed computing. However, if the technology fails to deliver in real-world conditions, AI companies will continue facing the expensive choice between building ever-larger single facilities or accepting performance compromises.

CoreWeave’s upcoming deployment will serve as the first major test of whether connecting AI data centres across distances can truly work at scale. The results will likely determine whether other companies follow suit or stick with traditional approaches. For now, NVIDIA has presented an ambitious vision—but the AI industry is still waiting to see if the reality matches the promise.

See also: New Nvidia Blackwell chip for China may outpace H20 model

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

UK tackles AI skills gap through NVIDIA partnership

New Post has been published on https://thedigitalinsider.com/uk-tackles-ai-skills-gap-through-nvidia-partnership/

UK tackles AI skills gap through NVIDIA partnership

The UK is cementing its position as Europe’s AI powerhouse through partnerships with players like NVIDIA to tackle issues like the skills gap.

The UK continued to outpace continental rivals both in freshly funded AI startups and overall private investment throughout 2024. Since 2013, UK AI ventures have managed to attract £22 billion in private funding, suggesting investors are continuing to bet on the home of industry giants like DeepMind, Stability AI, and Wayve.

Research unveiled during the recent London Tech Week showed something many tech observers have long suspected: regions blessed with robust AI and data centre infrastructure tend to enjoy stronger economic growth across the board. The analysis, by Public First, suggested even modest bumps in AI data centre capacity could pump nearly £5 billion into the nation’s coffers. More ambitious expansion – doubling current access levels, for instance – might deliver annual economic windfalls approaching £36.5 billion.

Cloud provider Nscale chose London Tech Week to pledge to deploy 10,000 NVIDIA Blackwell GPUs in the country by late 2026. Not to be outdone, cloud outfit Nebius revealed plans for its first AI factory in the UK, which is set to bring a further 4,000 NVIDIA Blackwell GPUs online, providing much-needed computational muscle for research bodies, universities, and public services including our perpetually cash-strapped NHS.

But having the hardware is only half the battle. As anyone in tech recruitment will tell you, finding people who can actually take advantage of it remains a challenge.

NVIDIA is throwing its considerable weight behind the UK government’s national skills push, with plans for a dedicated AI Technology Center on British soil. This centre promises hands-on training in AI, data science, and the increasingly critical field of accelerated computing.

“A new NVIDIA AI Technology Center in the UK will provide hands-on training in AI, data science and accelerated computing, focusing on foundation model builders, embodied AI, materials science and earth systems modeling,” explained NVIDIA.

The financial sector – the UK’s crown jewel – stands to benefit too. A new AI-powered sandbox from the Financial Conduct Authority will allow for safer experimentation with AI in banking and finance, with NayaOne providing infrastructure and NVIDIA supplying the technological backbone.

Sumant Kumar, CTO for Banking & Financial Markets at NTT DATA UK&I, said: “In a sandbox, every action leaves a mark. This supercharged sandbox may help banks get to a viable AI proof-of-concept faster, but it doesn’t reduce their regulatory obligations. If anything, it adds new layers of responsibility. As soon as a firm begins developing models in the sandbox, it needs to be ready to explain how they work, why they produce certain outcomes, and how they’ve been built.

“In financial services, the main bottleneck is often about ensuring the right governance is in place. The FCA will still expect clear documentation and strong controls around data provenance and auditability – even in a controlled environment.

“That’s why this is such an important opportunity. For firms, it’s a chance to build and refine the internal capabilities that will let them scale AI responsibly. For the government, it’s a chance to maintain the UK’s competitive edge and advance innovation while promoting balanced regulation and consumer safeguards. Those who approach the sandbox with the right structure will be in the best position to move quickly and safely when it comes to deployment.”

Barclays Eagle Labs is opening an Innovation Hub in London that could serve as a launching pad for promising AI and deep tech startups. Those who make the cut will gain a pathway into NVIDIA’s Inception programme, unlocking access to cutting-edge tools and targeted training that might otherwise remain frustratingly out of reach.

Mark Boost, CEO of Civo, said: “This feels like a real step forward. We’ve spent years talking about being a leader in AI, but investing in compute infrastructure, developer training, and serious R&D is how we actually start to deliver it.

“NVIDIA’s AI Technology Center is an important initiative. Giving UK developers better access to hands-on training in accelerated computing, AI engineering and model development will help close critical skills gaps and support the next generation of homegrown talent.”

Boost also touched on a point that’s increasingly occupying minds in Whitehall and boardrooms alike: technological sovereignty.

“Building long-term resilience in the UK means looking carefully at our reliance on external compute. As the AI stack becomes more strategic, the UK should be complementing global partnerships with greater investment in local infrastructure, open standards, and technologies we can help shape. That’s what keeps us competitive—staying flexible and able to shape our own path.”

Rather than just government announcements or corporate PR, this UK AI initiative with NVIDIA appears to promise genuine coordination between public institutions, industry heavyweights, and educational bodies. The focus on both immediate needs and longer-term foundations suggests lessons have been learned from previous tech booms.

Whether this approach delivers the projected economic windfall remains to be seen. But, for once, the UK seems to be playing to its strengths—combining world-class research institutions, a vibrant financial sector, and pragmatic regulation with the computational muscle and skills development needed to turn AI potential into economic reality.

(Photo by Charles Postiaux)

See also: Anthropic launches Claude AI models for US national security

NVIDIA Dynamo: Scaling AI inference with open-source efficiency

New Post has been published on https://thedigitalinsider.com/nvidia-dynamo-scaling-ai-inference-with-open-source-efficiency/

NVIDIA Dynamo: Scaling AI inference with open-source efficiency

NVIDIA has launched Dynamo, open-source inference software designed to accelerate and scale reasoning models within AI factories.

Efficiently coordinating AI inference requests across a fleet of GPUs is critical if AI factories are to run cost-effectively and maximise token revenue.

As AI reasoning becomes increasingly prevalent, each AI model is expected to generate tens of thousands of tokens with every prompt, essentially representing its “thinking” process. Enhancing inference performance while simultaneously reducing its cost is therefore crucial for accelerating growth and boosting revenue opportunities for service providers.

A new generation of AI inference software

NVIDIA Dynamo, which succeeds the NVIDIA Triton Inference Server, represents a new generation of AI inference software specifically engineered to maximise token revenue generation for AI factories deploying reasoning AI models.

Dynamo orchestrates and accelerates inference communication across potentially thousands of GPUs. It employs disaggregated serving, a technique that separates the processing and generation phases of large language models (LLMs) onto distinct GPUs. This approach allows each phase to be optimised independently, catering to its specific computational needs and ensuring maximum utilisation of GPU resources.

“Industries around the world are training AI models to think and learn in different ways, making them more sophisticated over time,” stated Jensen Huang, founder and CEO of NVIDIA. “To enable a future of custom reasoning AI, NVIDIA Dynamo helps serve these models at scale, driving cost savings and efficiencies across AI factories.”

Using the same number of GPUs, Dynamo has demonstrated the ability to double the performance and revenue of AI factories serving Llama models on NVIDIA’s current Hopper platform. Furthermore, when running the DeepSeek-R1 model on a large cluster of GB200 NVL72 racks, NVIDIA Dynamo’s intelligent inference optimisations have been shown to boost the number of tokens generated per GPU by more than 30 times.

To achieve these improvements in inference performance, NVIDIA Dynamo incorporates several key features designed to increase throughput and reduce operational costs.

Dynamo can add, remove, and reallocate GPUs in real time as request volumes and types fluctuate. It can also identify the GPUs in a large cluster that can answer a query with the least recomputation and route the query to them. In addition, it can offload inference data to more cost-effective memory and storage devices and retrieve it rapidly when required, lowering overall inference costs.
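The idea of adding and releasing GPUs as demand changes can be illustrated with a small capacity calculation. This is a minimal sketch of a generic autoscaling rule, not Dynamo's actual planner: `PlannerConfig`, `target_gpu_count`, `scaling_action`, and every number in them are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PlannerConfig:
    # All values are hypothetical placeholders for illustration.
    requests_per_gpu: float = 50.0   # sustained requests/s one GPU can serve
    headroom: float = 0.2            # spare capacity kept for bursts
    min_gpus: int = 1
    max_gpus: int = 64


def target_gpu_count(request_rate: float, cfg: PlannerConfig) -> int:
    """GPU count needed for the observed request rate, clamped to the pool limits."""
    needed = math.ceil(request_rate * (1.0 + cfg.headroom) / cfg.requests_per_gpu)
    return max(cfg.min_gpus, min(cfg.max_gpus, needed))


def scaling_action(current_gpus: int, request_rate: float, cfg: PlannerConfig) -> int:
    """Positive: GPUs to add. Negative: GPUs to release. Zero: no change."""
    return target_gpu_count(request_rate, cfg) - current_gpus


if __name__ == "__main__":
    cfg = PlannerConfig()
    for rate in (20, 400, 5000):     # illustrative request rates
        print(rate, "req/s ->", target_gpu_count(rate, cfg), "GPUs")
```

A production planner would also have to weigh request types, start-up time for new workers, and the cost of moving in-flight work, none of which this sketch models.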

NVIDIA Dynamo is being released as a fully open-source project, offering broad compatibility with popular frameworks such as PyTorch, SGLang, NVIDIA TensorRT-LLM, and vLLM. This open approach supports enterprises, startups, and researchers in developing and optimising novel methods for serving AI models across disaggregated inference infrastructures.

NVIDIA expects Dynamo to accelerate the adoption of AI inference across a wide range of organisations, including major cloud providers and AI innovators like AWS, Cohere, CoreWeave, Dell, Fireworks, Google Cloud, Lambda, Meta, Microsoft Azure, Nebius, NetApp, OCI, Perplexity, Together AI, and VAST.

NVIDIA Dynamo: Supercharging inference and agentic AI

A key innovation of NVIDIA Dynamo lies in its ability to map the knowledge that inference systems hold in memory from serving previous requests, known as the KV cache, across potentially thousands of GPUs.

The software then intelligently routes new inference requests to the GPUs that possess the best knowledge match, effectively avoiding costly recomputations and freeing up other GPUs to handle new incoming requests. This smart routing mechanism significantly enhances efficiency and reduces latency.
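The routing idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Dynamo's router: the `Worker` class and `route` function are hypothetical, prompts are represented as tuples of token IDs, and the "best knowledge match" is approximated as the longest shared token prefix, with ties broken by current load.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical GPU worker that remembers which token sequences it has served."""
    name: str
    active_requests: int = 0
    cached_prefixes: list = field(default_factory=list)

    def longest_cached_prefix(self, tokens: tuple) -> int:
        """Number of leading tokens of `tokens` already covered by a cached sequence."""
        best = 0
        for cached in self.cached_prefixes:
            shared = 0
            for a, b in zip(cached, tokens):
                if a != b:
                    break
                shared += 1
            best = max(best, shared)
        return best


def route(tokens: tuple, workers: list) -> Worker:
    """Pick the worker with the most reusable cache; break ties by lowest load."""
    chosen = max(
        workers,
        key=lambda w: (w.longest_cached_prefix(tokens), -w.active_requests),
    )
    chosen.active_requests += 1
    chosen.cached_prefixes.append(tokens)   # the request's state is now cached there
    return chosen


if __name__ == "__main__":
    gpu_a = Worker("gpu-a", cached_prefixes=[(1, 2, 3, 4)])
    gpu_b = Worker("gpu-b")
    print(route((1, 2, 3, 9), [gpu_a, gpu_b]).name)   # shares a three-token prefix with gpu-a
    print(route((7, 8), [gpu_a, gpu_b]).name)         # no match anywhere, so load decides
```

Real serving systems typically track cached state in fixed-size blocks and evict old entries; the linear scan here is only meant to show why matching requests to existing cache avoids recomputation.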

“To handle hundreds of millions of requests monthly, we rely on NVIDIA GPUs and inference software to deliver the performance, reliability and scale our business and users demand,” said Denis Yarats, CTO of Perplexity AI.

“We look forward to leveraging Dynamo, with its enhanced distributed serving capabilities, to drive even more inference-serving efficiencies and meet the compute demands of new AI reasoning models.”

AI platform Cohere is already planning to leverage NVIDIA Dynamo to enhance the agentic AI capabilities within its Command series of models.

“Scaling advanced AI models requires sophisticated multi-GPU scheduling, seamless coordination and low-latency communication libraries that transfer reasoning contexts seamlessly across memory and storage,” explained Saurabh Baji, SVP of engineering at Cohere.

“We expect NVIDIA Dynamo will help us deliver a premier user experience to our enterprise customers.”

Support for disaggregated serving

The NVIDIA Dynamo inference platform also features robust support for disaggregated serving. This advanced technique assigns the different computational phases of LLMs – including the crucial steps of understanding the user query and then generating the most appropriate response – to different GPUs within the infrastructure.

Disaggregated serving is particularly well-suited for reasoning models, such as the new NVIDIA Llama Nemotron model family, which employs advanced inference techniques for improved contextual understanding and response generation. By allowing each phase to be fine-tuned and resourced independently, disaggregated serving improves overall throughput and delivers faster response times to users.
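A toy model may help show what separating the two phases looks like structurally. The sketch below is an assumption-laden illustration, not Dynamo code: `PrefillWorker`, `DecodeWorker`, `KVCache`, and `DisaggregatedServer` are hypothetical names, the "model" simply echoes prompt words in reverse, and round-robin assignment stands in for real scheduling.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class KVCache:
    """Stand-in for the attention state produced while reading the prompt."""
    prompt_tokens: list


class PrefillWorker:
    def __init__(self, name: str):
        self.name = name

    def prefill(self, prompt: str) -> KVCache:
        # Query-understanding phase: the whole prompt is processed in one pass.
        return KVCache(prompt_tokens=prompt.split())


class DecodeWorker:
    def __init__(self, name: str):
        self.name = name

    def decode(self, cache: KVCache, max_new_tokens: int) -> list:
        # Generation phase: tokens are produced one at a time from the cache.
        # A real model would sample tokens; this toy echoes the prompt in reverse.
        return list(reversed(cache.prompt_tokens))[:max_new_tokens]


class DisaggregatedServer:
    """Sends each phase to its own pool so the pools can be sized independently."""

    def __init__(self, prefill_workers: list, decode_workers: list):
        self._prefill = cycle(prefill_workers)
        self._decode = cycle(decode_workers)

    def serve(self, prompt: str, max_new_tokens: int = 4) -> dict:
        prefill_worker = next(self._prefill)
        decode_worker = next(self._decode)
        cache = prefill_worker.prefill(prompt)   # in practice moved GPU-to-GPU
        tokens = decode_worker.decode(cache, max_new_tokens)
        return {"prefill": prefill_worker.name, "decode": decode_worker.name, "tokens": tokens}


if __name__ == "__main__":
    server = DisaggregatedServer(
        [PrefillWorker("prefill-0")],
        [DecodeWorker("decode-0"), DecodeWorker("decode-1")],
    )
    print(server.serve("what is accelerated computing"))
```

Here one prefill worker feeds two decode workers, which mirrors the point in the text: because the phases have different resource needs, each pool can be scaled on its own.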

Together AI, a prominent player in the AI Acceleration Cloud space, is also looking to integrate its proprietary Together Inference Engine with NVIDIA Dynamo. This integration aims to enable seamless scaling of inference workloads across multiple GPU nodes. Furthermore, it will allow Together AI to dynamically address traffic bottlenecks that may arise at various stages of the model pipeline.

“Scaling reasoning models cost effectively requires new advanced inference techniques, including disaggregated serving and context-aware routing,” stated Ce Zhang, CTO of Together AI.

“The openness and modularity of NVIDIA Dynamo will allow us to seamlessly plug its components into our engine to serve more requests while optimising resource utilisation—maximising our accelerated computing investment. We’re excited to leverage the platform’s breakthrough capabilities to cost-effectively bring open-source reasoning models to our users.”

Four key innovations of NVIDIA Dynamo

NVIDIA has highlighted four key innovations within Dynamo that contribute to reducing inference serving costs and enhancing the overall user experience:

GPU Planner: A sophisticated planning engine that dynamically adds and removes GPUs based on fluctuating user demand. This ensures optimal resource allocation, preventing both over-provisioning and under-provisioning of GPU capacity.

Smart Router: An intelligent, LLM-aware router that directs inference requests across large fleets of GPUs. Its primary function is to minimise costly GPU recomputations of repeat or overlapping requests, thereby freeing up valuable GPU resources to handle new incoming requests more efficiently.

Low-Latency Communication Library: An inference-optimised library designed to support state-of-the-art GPU-to-GPU communication. It abstracts the complexities of data exchange across heterogeneous devices, significantly accelerating data transfer speeds.

Memory Manager: An intelligent engine that manages the offloading and reloading of inference data to and from lower-cost memory and storage devices. This process is designed to be seamless, ensuring no negative impact on the user experience.
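The offloading behaviour described for the Memory Manager resembles a two-tier cache. The following is a minimal sketch of that general pattern, assuming a least-recently-used eviction policy; `TieredCache` is a hypothetical class, and plain Python dictionaries stand in for GPU memory and cheaper storage.

```python
from collections import OrderedDict


class TieredCache:
    """Keeps recent entries in a small fast tier and offloads the rest to a slow tier."""

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # stand-in for GPU memory, least recently used first
        self.slow = {}              # stand-in for lower-cost memory or storage

    def put(self, key, value) -> None:
        self.slow.pop(key, None)    # avoid holding the same entry in both tiers
        self.fast[key] = value
        self.fast.move_to_end(key)
        self._offload_excess()

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)       # mark as recently used
            return self.fast[key]
        if key in self.slow:
            value = self.slow.pop(key)       # reload into the fast tier on demand
            self.put(key, value)
            return value
        return None

    def _offload_excess(self) -> None:
        # Move the least recently used entries to the slow tier instead of dropping them.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value


if __name__ == "__main__":
    cache = TieredCache(fast_capacity=2)
    for request_id in ("a", "b", "c"):
        cache.put(request_id, f"kv-state-{request_id}")
    print(sorted(cache.slow))    # entries that were offloaded
    print(cache.get("a"))        # reloaded transparently
```

The caller never needs to know which tier held the entry, which is the "seamless" property the text describes; a real system would additionally have to hide the transfer latency.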

NVIDIA Dynamo will be made available within NIM microservices and will be supported in a future release of the company’s AI Enterprise software platform.

See also: LG EXAONE Deep is a maths, science, and coding buff
