
Quick View: Tokens=revenues – NVIDIA’s case for scalable AI returns

Portfolio Manager Richard Clode summarises the main highlights from NVIDIA GTC 2026. Agentic AI, token-based monetisation to drive sustainable AI returns, and rack-scale co-design are reshaping both the technology and the economics of AI infrastructure.

18 Mar 2026
4 minute read

Key takeaways:

  • Agentic AI inflection point: OpenClaw is dramatically expanding agentic AI usage intensity and compute demand beyond simple conversational interfaces.
  • Tokens drive revenues: NVIDIA argues that tiered token monetisation can deliver attractive and scalable returns on AI capex; Vera Rubin could significantly increase revenue potential per gigawatt.
  • Competitive moat through extreme co‑design: By tightly integrating CPUs, GPUs, LPUs, networking and software, NVIDIA aims to defend its competitive position as AI inference shifts towards low‑latency, high‑throughput workloads.

The diversity of AI is also its resilience. The span of reach of AI is its resilience. There is no question this is not a one-app technology. This is now fundamental. This is absolutely a new computing platform shift.

 

NVIDIA CEO Jensen Huang

NVIDIA’s main event of the year, GTC (GPU Technology Conference), is focused on developers rather than the investment community, but serves as an important broader AI ‘state of the union,’ highlighting key developments in the industry. CEO Jensen Huang provided important updates on the profound impact of OpenClaw, the open-source autonomous AI agent, as a catalyst for agentic AI proliferation. Jensen also laid out a powerful case for token monetisation to drive revenues, arguing that attractive returns on AI capital spending are possible, and that those returns can be durable and resilient.

Jensen’s keynote speech also featured an in-depth defence of NVIDIA’s competitive moat given recent competitor concerns. He outlined an impressive scale and speed of innovation across the rack – this included the integration of recent quasi-acquisition Groq into the company’s roadmap from later this year (NVIDIA has a non-exclusive inference technology licensing agreement with Groq aimed at accelerating AI inference at global scale), complementing the ‘extreme co-design’ of Vera Rubin, the rack-scale AI supercomputer built for agentic AI and reasoning. In a world of constrained powered shell datacentres (facilities built to meet explosive demand for AI and cloud computing, but limited by available power), the AI infrastructure provider that delivers the most tokens, and hence revenues, per gigawatt is king.

In our view, there are four key highlights for investors from GTC 2026:

1. OpenClaw is a ‘Windows moment’ for agentic AI

[OpenClaw is] the most popular open-source project in the history of humanity, and it did so in just a few weeks. It exceeded what Linux did in 30 years, and it’s that important.

 

Jensen Huang

Achieving in weeks what Linux managed in 30 years, OpenClaw is to agentic computers what Windows was to the personal computer: the operating system that makes them usable. This free, open-source autonomous AI agent allows users to move beyond AI chat to ‘do actual work’ (e.g. calendar management, sending emails, checking flights) by connecting with apps like WhatsApp, WeChat, Microsoft Teams, Telegram, and web browsers.

Everyone can now create personal agents, and every company in the world should have an OpenClaw strategy. While compute demand has already increased 1,000,000x in only two years, Jensen believes we are on the cusp of another exponential leap, given the compute intensity of agentic AI and the explosive usage of this new technology.

2. Tokens = Revenues

Given the ongoing market debate on the sustainability of AI capital expenditure and its potential for monetisation and return on investment (ROI), Jensen laid out his maths in more detail. In his view, AI companies should charge for tokens in tiers: a free tier to attract users, beyond which token monetisation scales rapidly as interaction with AI increases. Using the new Vera Rubin infrastructure to illustrate, a company could potentially generate as much as US$150 billion in revenues from a 1GW (gigawatt) datacentre that costs US$100 billion to build, presenting an attractive ROI opportunity.
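The arithmetic behind that claim can be sketched in a few lines. Only the US$100 billion build cost and the up-to-US$150 billion revenue-per-gigawatt figures come from the keynote; the `simple_roi` helper is an illustrative calculation, not NVIDIA’s methodology.

```python
# Illustrative sketch of the revenue-per-gigawatt ROI arithmetic cited in
# the keynote. The helper function is ours; the dollar figures are the
# ones quoted for a 1GW Vera Rubin datacentre.

def simple_roi(revenue_bn: float, cost_bn: float) -> float:
    """Return on investment as (revenue - cost) / cost."""
    return (revenue_bn - cost_bn) / cost_bn

build_cost_bn = 100.0   # US$100bn to build a 1GW datacentre
revenue_bn = 150.0      # up to US$150bn potential revenue

roi = simple_roi(revenue_bn, build_cost_bn)
print(f"ROI: {roi:.0%}")  # ROI: 50%
```

On the cited figures, each gigawatt of capacity returns roughly 50% over its build cost, which is the basis of the ‘tokens = revenues’ case.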

3. Extreme co-design competitive moat

NVIDIA’s CEO articulated that AI is a full-stack problem requiring a full-stack solution. Vera Rubin, launching later this year, includes seven brand-new chips co-designed to maximise performance, incorporating technology from the recent quasi-acquisition Groq, which brings a new capability in extreme low-latency (fast) token generation. Jensen laid out how Groq’s LLM inference technology would be integrated by disaggregating inference: playing to a GPU’s strengths in throughput for attention over the prompt (the question), while looking to Groq’s LPU for decode generation (the answer), given its bandwidth advantages. Alongside new CPUs, GPUs, DPUs and storage, Vera Rubin can deliver 350x the token generation that Hopper (GPU) delivered only two years ago. Given the recent market enthusiasm for optical networking stocks (suppliers of high-speed optical links for AI datacentres), it is worth noting that Jensen reiterated copper still has a long runway in NVIDIA’s roadmap, with optical and co-packaged optics layered in over time.

4. US$1 trillion sales expected in 2025-27

Every single SaaS company will become an AaaS company, an agentic-as-a-service company.

 

Jensen Huang

NVIDIA expects to generate more than US$1 trillion in Blackwell and Rubin revenues in 2025 to 2027. That number does not include Hopper, standalone CPU or Groq LPU sales. The supply backlog is expected to continue to build through 2026, and that estimate could very well ramp up over time.

Unless otherwise specified, NVIDIA GTC information is sourced from Investing.com; the NVIDIA GTC keynote speech transcript, 16 March 2026; and NVIDIA.com.

Glossary

AaaS: Agentic as a service is a subscription-based cloud model for deploying autonomous AI agents that can make decisions and execute tasks with limited supervision, often powered by large language models (LLMs).

Agentic AI: An AI system that uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems. Vast amounts of data from multiple data sources and third-party applications are used to independently analyse challenges, develop strategies and execute tasks.

Capital expenditure: Money a business spends on major, long-term assets such as property and equipment (tangible assets) or technology, software, trademarks, patents etc (intangible assets) to facilitate new projects or investments that support business growth and expansion.

Constrained powered shell datacentres: Data‑centre facilities where the physical building (“shell”) exists, but the amount of electrical power available to run IT equipment is limited or partially unavailable.

CPU: The central processing unit is the control centre that runs a machine’s operating system and apps by interpreting, processing and executing instructions from hardware and software programmes.

DPU: A Data Processing Unit is a specialised processor designed to offload networking, storage, and security tasks from the CPU. It accelerates data transfer and infrastructure services in modern data centres, improving efficiency and scalability, which are crucial for running modern AI workloads. 

Full-rack solution: Refers to renting or purchasing a complete rack of server equipment and services in a data centre.

Full-stack solution: Refers to a comprehensive approach to software development that covers all layers of an application or project. This includes both the front-end and back-end components, as well as any other layers necessary for the application to function fully.

GPU: A graphics processing unit performs complex mathematical and geometric calculations that are necessary for graphics rendering and are also used in gaming, content creation and machine learning.

Low‑latency token generation: How quickly a generative AI model (large language model) can produce each successive unit of output (“token”) after receiving a prompt.

LPU or Language Processing Unit: Groq’s proprietary and specialised chip designed specifically to handle the unique speed and memory demands of large language models (LLMs).

Open source software: Code that is designed to be publicly accessible, in terms of viewing, modifying and distributing.

ROI (return on investment): A financial ratio used to measure the performance of an investment, calculated by dividing net profit/loss by the initial cost of the investment.

SaaS: A cloud-based software delivery model where applications are accessed over the internet, with the cloud service provider responsible for infrastructure, security, and updates; applications live on the software provider’s servers.

These are the views of the author at the time of publication and may differ from the views of other individuals/teams at Janus Henderson Investors. References made to individual securities do not constitute a recommendation to buy, sell or hold any security, investment strategy or market sector, and should not be assumed to be profitable. Janus Henderson Investors, its affiliated advisor, or its employees, may have a position in the securities mentioned.

 

Past performance does not predict future returns. The value of an investment and the income from it can fall as well as rise and you may not get back the amount originally invested.

 

The information in this article does not qualify as an investment recommendation.

 

There is no guarantee that past trends will continue, or forecasts will be realised.

 

Marketing Communication.

 
