Artificial intelligence (AI) is transforming how we build and think about digital systems. What were once conventional data centres are becoming AI Factories: massive, software-defined computing hubs whose primary output is AI, much as traditional factories produce physical goods.
NVIDIA CEO Jensen Huang captures the shift: "AI is now infrastructure, and this infrastructure, just like the internet, just like electricity, needs factories... They're not data centres of the past... They are, in fact, AI factories. You apply energy to it, and it produces something incredibly valuable... called tokens."
These new facilities take in power and data and produce machine learning models, predictions, and digital agents at extraordinary scale. Unlike traditional computing environments, AI Factories have:
AI Factories are a new class of digital infrastructure built specifically to support extremely powerful GPU clusters, with rack power densities reaching 600 kW or more. Beyond dense power and cooling, they also require:
As AI transforms industries and economies, building the physical foundation for this change is no longer optional; it is essential. This blueprint explains what is involved in engineering next-generation, hyperscale AI Factories from the ground up.
The following sections cover the key pillars of this blueprint:
The computing needs of modern AI, especially deep learning, aren't just demanding; they're fundamentally different and growing at an incredible pace. Traditional data centres, built for an older era of computing, simply can't scale to meet these unique requirements due to several built-in limitations.
Traditional data centres were designed for distributed, general-purpose computing, running websites, business applications, and office software. They were never built for the tightly coupled, high-bandwidth, low-latency processing that AI supercomputing clusters require, where thousands of accelerators must exchange enormous volumes of data with almost no delay.
Ultimately, traditional data centres fall short not just in capacity, but in capability. They were not built to sustain the thermal loads, floor densities, or real-time data exchange requirements of modern AI workloads. Meeting these demands requires a wholesale reinvention of digital infrastructure, one optimised for AI’s computational intensity, energy consumption, physical footprint, and operational complexity. The AI Factory emerges from this need: a purpose-built architecture that reimagines everything from the rack to the grid.
In the realm of AI Factories, traditional metrics such as storage capacity or standalone network bandwidth are insufficient to gauge performance. A more pertinent metric has emerged: AI token throughput, the rate at which an AI system generates output tokens during inference. This metric encapsulates the system's ability to deliver real-time predictions and content generation, serving as a direct indicator of its intelligence production capacity.
What is a Token?
In AI, particularly with large language models (LLMs) like ChatGPT, a token is the fundamental unit of text or code that the model processes. It is often a word, but it can also be part of a word, a punctuation mark, or even a space. For example, in the request "Tell me a story about a brave knight," the words "Tell," "me," "a," "story," "about," "a," "brave," and "knight" would likely each be treated as individual tokens. Tokenisation isn't always one-to-one with words, though: depending on the tokeniser, a word like "running" might be split into two tokens (such as "run" and "ning"), while a common phrase might be represented by a single token.
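To make this concrete, the short sketch below counts and displays the tokens for that example prompt. It assumes the open-source tiktoken library and its cl100k_base encoding; exact token boundaries vary from model to model, so treat the output as illustrative rather than definitive.

```python
# Illustrative only: exact token splits depend on the model's tokeniser.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a common BPE vocabulary

prompt = "Tell me a story about a brave knight"
token_ids = enc.encode(prompt)

print(f"{len(token_ids)} tokens")
for tid in token_ids:
    # Show the text fragment each token ID maps back to.
    print(tid, enc.decode_single_token_bytes(tid))
```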
To provide a comprehensive view of AI system performance, token throughput is often considered alongside other key indicators:
Elevated token throughput directly correlates with an AI Factory's capacity to handle extensive, concurrent inference workloads efficiently. Achieving this necessitates optimised hardware configurations, such as high-performance GPUs or TPUs, and advanced software strategies, including model parallelism and efficient batching techniques.
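As a rough way to picture this metric, the sketch below measures output tokens per second for a batch of requests. The generate function is a hypothetical stand-in for whatever inference endpoint an AI Factory exposes; it is assumed to return the number of output tokens produced for each prompt.

```python
import time
from typing import Callable, Sequence

def token_throughput(generate: Callable[[Sequence[str]], Sequence[int]],
                     prompts: Sequence[str]) -> float:
    """Output tokens per second for one batch of prompts.

    `generate` stands in for a real inference call and is assumed to return
    the number of output tokens produced for each prompt in the batch.
    """
    start = time.perf_counter()
    tokens_out = generate(prompts)
    elapsed = time.perf_counter() - start
    return sum(tokens_out) / elapsed

if __name__ == "__main__":
    # Dummy generator: pretends each prompt yields 128 tokens after 50 ms of work.
    def dummy(batch):
        time.sleep(0.05)
        return [128] * len(batch)

    print(f"{token_throughput(dummy, ['prompt'] * 32):,.0f} tokens/sec")
```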
While factors like energy efficiency, cost management, and scalability are crucial, AI token throughput stands out as the definitive measure of an AI Factory's effectiveness. It encapsulates the facility's core mission: transforming data into actionable intelligence at scale, thereby driving innovation and competitive advantage across industries.
The rise of AI Factories is making IT and data centre leaders completely rethink their priorities. In the past, data centres were planned around things like overall size (square metres, total power) and how cheap they were to run. Today, five key priorities, which we call the 5S, have become crucial for the large cloud providers (hyperscalers) building and running AI infrastructure:
Speed: In the world of AI Factories, 'speed' means several things. First, it's about time-to-value – how quickly can you train a new AI model or add more capacity when demand suddenly increases? Hyperscalers now compete on how fast they can set up new GPU clusters or launch AI services. Cloud-native AI platforms focus on quick setup and minimal hassle; for example, offering GPU capacity by the hour, ready with AI frameworks, so development teams can innovate rapidly. Executives need to ensure their infrastructure (and partners) can deploy at "hyperspeed" both in getting hardware ready and moving data quickly. High-performance connections (low-delay networks, locations close to users) are also vital, as model training and AI predictions happen in real-time. Simply put, if your AI Factory can't keep up with the speed of experimentation and user demand, innovation will move elsewhere.
Scale: AI workloads that used to run on a few servers now need thousands of GPUs working at the same time. 'Scale' isn't just about having big data centres; it's about smoothly expanding within and across facilities. Hyperscale AI Factories must support huge amounts of computing power (petaflops to exaflops), millions of simultaneous AI model queries, and training runs involving trillions of parameters. This requires modular designs that can be replicated easily. For instance, NVIDIA’s reference AI Factories are built from "pods", blocks of GPUs that can be cloned and interconnected by the hundreds. Cloud providers now talk about "availability zones" dedicated to AI, and "AI regions" emerging wherever power is plentiful. The goal is to expand AI computing almost like a utility, adding more AI Factory capacity with minimal disruption. Scale also means a global presence: hyperscalers like AWS, Google, and Alibaba are expanding AI infrastructure to more regions to serve local needs while balancing workloads worldwide. If an AI service suddenly needs ten times more capacity due to a popular app or a breakthrough model, the infrastructure should be able to expand within days, not months. As Huang revealed, NVIDIA even gave partners five-year roadmap visibility because building AI-ready power and space takes a long time. Leading data centre operators are now proactively planning 100+ MW expansions to ensure scale never slows down innovation.
Sovereignty: Data sovereignty and infrastructure sovereignty have become critical in the age of AI. As AI systems are used in sensitive areas, from healthcare diagnoses to national security, where data and models are stored, and under whose laws, is a major concern. Hyperscalers must navigate a complex set of regulations that increasingly demand that certain data remain within national borders, or that AI workloads be processed in locally controlled facilities for privacy and strategic reasons. The recent push for "sovereign cloud" offerings in Europe and elsewhere reflects this trend. For AI Factories, sovereignty can mean choosing data centre locations to meet legal requirements and customer trust. It is no longer just about technical specifications, but also about geopolitical and compliance positioning. For example, European cloud users might prefer (or be required by law) to use AI infrastructure hosted in the EU by EU-based providers. In China, AI infrastructure must be locally hosted due to strict data laws. Even within countries, some government or enterprise workloads demand sovereign-certified facilities: those accredited to handle classified data or critical infrastructure roles. Where your AI infrastructure lives isn't just a technical choice; it's a competitive one. Latency, compliance, and sustainability are all shaped by location. Leading data centre operators choose sites based on a strategic mix of low latency, data sovereignty, and energy resilience. In practice, this means hyperscalers are investing in regions they previously left to partners and partnering with local data centre specialists to ensure sovereign coverage. The AI Factory revolution won't be a one-size-fits-all global solution; it will be a network of regionally tailored hubs that balance global scale with local control.
Sustainability: The power-hungry nature of AI has put sustainability at the heart of the conversation. Company boards and governments are increasingly scrutinising the energy and carbon footprint of AI operations. A single large AI training run can use as much electricity as hundreds of homes; scaled across many runs, AI could significantly impact company and national energy goals. Hyperscalers are acutely aware that any perception of AI as "wasteful" or environmentally harmful could lead to regulatory or public backlash, not to mention the direct impact on energy costs. Therefore, the new mantra is "performance per watt" and designing for efficiency from scratch. Leading cloud data centres are committing to 100% renewable energy (through solar, wind, hydro, or even emerging nuclear partnerships) to power AI Factories. They're also adopting advanced cooling to reduce waste; for example, liquid cooling can drastically cut cooling power overhead and even allow heat reuse, improving PUE (Power Usage Effectiveness) dramatically. Every aspect of facility design is under the microscope for sustainability, from using sustainable building materials to implementing circular economy principles for hardware (recycling and reusing components). Importantly, hyperscalers are now reporting metrics like "carbon per AI inference" or "energy per training run" as key performance indicators. The next generation of data centres will be judged not just on capacity, but on efficiency. As a recent report put it, "the next generation of data centres won't just be measured by performance alone; they'll be judged by efficiency… boards, regulators and customers are asking: Where is the energy coming from? How efficient is your data centre? What is the carbon impact per GPU-hour?". To remain competitive (and compliant), AI Factories must be sustainable by design, aligning with global net-zero ambitions and corporate ESG commitments. Sustainability is no longer a nice-to-have Corporate Social Responsibility (CSR) item; it's a core design principle and differentiator in the AI era.
Security: With AI becoming a backbone for everything from financial services to autonomous vehicles, the security of AI infrastructure is paramount. Here we mean both cybersecurity and physical security and resilience. On the cyber side, AI workloads often involve valuable training data (which could include personal data or proprietary information) and models that are intellectual property worth billions. Protecting these from breaches is critical; a compromised AI model or a disrupted AI service can cause immense damage. Hyperscale AI Factories are targets for attackers ranging from lone hackers to state-sponsored groups, all seeking to steal AI technology or sabotage services. This means investing in robust encryption (for data at rest and in transit), secure access controls, continuous monitoring powered by AI itself, and isolated compute environments (to prevent one client’s AI environment from affecting another’s in multi-tenant clouds). On the physical side, downtime is unacceptable; an AI Factory outage could halt operations for a business or even knock out critical infrastructure (imagine an AI-driven power grid or hospital network failing). Therefore, AI data centres are built with extreme redundancy and hardened against threats. Many pursue Tier IV certification for fault tolerance and add features like days of on-site backup power, multi-factor access controls, and, in some cases, even EMP or natural disaster protection. Additionally, supply chain security has emerged as a concern: ensuring that the chips and software powering AI are free from backdoors or vulnerabilities (which also links back to sovereignty). Security by design is a must. As one NEXTDC customer put it, their clients "rely on the ability to run AI-powered applications without interruption, for as long as they need," so having a partner that can guarantee uptime and flexibility is crucial. In practice, hyperscalers are choosing colocation providers and designs that emphasise robust risk management – from certified physical security controls to comprehensive compliance with standards (ISO 27001, SOC 2, etc.). In the AI Factory age, a security breach or prolonged outage isn't just an IT issue; it's a business-critical incident. Therefore, security and resilience permeate every layer of the 5S model, underpinning speed, scale, sovereignty, and sustainability goals with a foundation of trust and reliability.
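The Sustainability priority above cites PUE (Power Usage Effectiveness). As a quick reference, PUE is simply total facility power divided by the power delivered to IT equipment, so a value closer to 1.0 means less energy spent on cooling and other overheads. The figures below are illustrative only, not measurements from any particular facility.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_load_kw

# Hypothetical numbers: an air-cooled hall vs. a liquid-cooled AI hall.
print(pue(total_facility_kw=15_000, it_load_kw=10_000))  # 1.5
print(pue(total_facility_kw=11_000, it_load_kw=10_000))  # 1.1
```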
In summary, these 5S priorities are shaping decisions at the highest levels. Hyperscaler CIOs and CTOs are now asking:
The AI Factory era demands a holistic approach. Success will come from excelling across all five dimensions, rather than optimising for just one. In practice, this means designing data centre solutions that are agile and fast, massively scalable, locally available and compliant, green and efficient, and rock-solid secure. That’s a tall order – but it’s exactly what the leading innovators are now building.
Building an AI Factory requires a holistic rethinking of digital infrastructure, focusing on highly specialised components:
Beyond its components, an AI Factory possesses distinct characteristics that enable accelerated AI development:
Investing in AI Factories unlocks significant strategic advantages for organisations:
The blueprint for today's AI Factory was largely created by pioneers in large-scale internet services like Google, AWS, Alibaba, Tencent, and ByteDance. Their huge investments and innovative methods have turned traditional data centres into powerhouses of intelligence, setting the standard for others to follow.
Beyond these tech giants, the influence of AI Factories is expanding, with various organisations showing the power of specially built AI infrastructure:
Uber – Michelangelo: Uber's ML platform, Michelangelo, is integral to its operations, enabling real-time predictions and optimisations across the platform. It supports over 10 million predictions per second, facilitating tasks such as:
Netflix – Metaflow: Netflix developed Metaflow, a human-centric framework designed to streamline the development and deployment of ML models (a minimal example flow is sketched at the end of this section). Metaflow empowers data scientists and engineers to:
Airbnb – Bighead: Airbnb's Bighead is an end-to-end ML platform that supports various applications across the company, including:
While webscale companies currently lead the way, the strategic advantages of AI Factories are undeniable. As more organisations recognise the transformative potential of AI, we can expect a widespread adoption of these specialised facilities across diverse industries in the near future.
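To give a flavour of what such platforms look like from a developer's seat, here is a minimal Metaflow flow based on the framework's public API. The steps and data are purely illustrative and are not Netflix's actual pipelines.

```python
# A minimal Metaflow flow; run with `python train_flow.py run`.
from metaflow import FlowSpec, step

class TrainFlow(FlowSpec):

    @step
    def start(self):
        # Placeholder training data; a real flow would load it from storage.
        self.data = [1.0, 2.0, 3.0, 4.0]
        self.next(self.train)

    @step
    def train(self):
        # Stand-in for model training: compute a trivial "model" parameter.
        self.model = sum(self.data) / len(self.data)
        self.next(self.end)

    @step
    def end(self):
        print(f"trained parameter: {self.model}")

if __name__ == "__main__":
    TrainFlow()
```

Frameworks like this let teams express a pipeline as ordinary code while the platform handles orchestration, artifact tracking, and scaling out to larger compute.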
The shift from traditional data centres to purpose-built AI Factories marks a critical inflection point for organisations seeking to fully realise the potential of artificial intelligence. This is not a linear upgrade; it is a foundational transformation of compute infrastructure, network architecture, and operational readiness to support the demands of next-generation intelligence workloads.
By investing in AI Factories, forward-looking enterprises equip themselves to handle the exponential growth in AI models, data volumes, and power density. They gain the strategic capability to innovate faster, compete smarter, and lead in an economy defined by intelligence.
This new era requires decisive action. Success hinges on infrastructure partners who can deliver across the “5S” dimensions—Speed, Scale, Sovereignty, Sustainability, and Security—without compromise.
The intelligence economy isn't on the horizon; it's already transforming industries. To lead in this new era, you need infrastructure that's not just ready for AI, but purpose-built for it.
NEXTDC’s high-performance, AI-optimised data centre platform delivers the density, sovereignty, and sustainability your workloads demand today and tomorrow.
Let’s build your AI Factory.
Connect with our specialists today to design the infrastructure that defines your next competitive advantage.