<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Tech Fundamentals]]></title><description><![CDATA[This newsletter breaks down complex tech fundamentals for investors, traders & VCs. You’ll find deep dives, risk analysis, and short and mid-term market context without hype.]]></description><link>https://www.techfundamentals.blog</link><image><url>https://substackcdn.com/image/fetch/$s_!xWDB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0870fc2b-6ca8-452b-aea6-8572d56e44de_670x670.png</url><title>Tech Fundamentals</title><link>https://www.techfundamentals.blog</link></image><generator>Substack</generator><lastBuildDate>Wed, 15 Apr 2026 20:55:58 GMT</lastBuildDate><atom:link href="https://www.techfundamentals.blog/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Daniel Kolb]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[techfundamentals@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[techfundamentals@substack.com]]></itunes:email><itunes:name><![CDATA[Daniel Kolb]]></itunes:name></itunes:owner><itunes:author><![CDATA[Daniel Kolb]]></itunes:author><googleplay:owner><![CDATA[techfundamentals@substack.com]]></googleplay:owner><googleplay:email><![CDATA[techfundamentals@substack.com]]></googleplay:email><googleplay:author><![CDATA[Daniel Kolb]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Apple Using Google’s AI Isn’t a Surrender But a Pattern]]></title><description><![CDATA[Why Gemini in Siri looks more like Intel-in-Macs than Apple falling 
behind]]></description><link>https://www.techfundamentals.blog/p/apple-using-googles-ai-isnt-a-surrender</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/apple-using-googles-ai-isnt-a-surrender</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Sun, 18 Jan 2026 15:02:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QZ8j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QZ8j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QZ8j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QZ8j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QZ8j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QZ8j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!QZ8j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7891366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.techfundamentals.blog/i/184956767?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QZ8j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QZ8j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QZ8j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!QZ8j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e70f9fe-a49e-44e1-a453-f96d33297e54_5184x3456.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Koshiro K - stock.adobe.com</figcaption></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice.
Do your own research.</p></blockquote><p>Apple and Google have confirmed a multi-year partnership in which Apple will use Google&#8217;s Gemini models to power parts of future versions of Siri and Apple Intelligence. The confirmation came via a rare joint statement from both companies <a href="https://blog.google/company-news/inside-google/company-announcements/joint-statement-google-apple/">https://blog.google/company-news/inside-google/company-announcements/joint-statement-google-apple/</a>.</p><p>The immediate reaction framed this as another checkpoint in the AI race. Google &#8220;wins&#8221; Apple. Apple &#8220;falls behind&#8221; hyperscalers. Stocks move accordingly.</p><p>That framing is understandable, but it misses a recurring Apple pattern. Apple often uses external technology as a bridge, not an endpoint. That pattern is worth keeping in mind, because Apple&#8217;s history with new technologies shows how partnerships like this tend to evolve.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techfundamentals.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech Fundamentals is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><div><hr></div><h2><strong>Apple: Using External Models Without Giving Up Control</strong></h2><p>Apple has never optimized for owning every layer from day one.
Historically, Apple has been comfortable relying on partners while it builds internal capability, and then switching once it can control the stack end to end.</p><p>The most obvious example is Intel. Apple shipped Macs on Intel CPUs for years before transitioning to Apple Silicon. That move wasn&#8217;t rushed. It happened only once Apple could deliver better performance per watt, tighter integration, and more predictable economics with its own M-series chips. The same pattern showed up with Google Maps before Apple Maps, and with GPUs before Apple&#8217;s custom graphics pipelines matured.</p><p>Seen in that context, Gemini looks like a familiar Apple move. Apple is accessing state-of-the-art reasoning capability now, while continuing to invest in its own foundation models and silicon-level acceleration. <strong>Apple was explicit that Gemini will not behave like a generic cloud API bolted onto Siri</strong>. Apple Intelligence still runs primarily on-device, with fallback to Apple&#8217;s own Private Cloud Compute when needed. Apple and Google emphasized that user data is not broadly shared and that Apple&#8217;s privacy guarantees remain intact <a href="https://observervoice.com/google-offers-privacy-assurance-to-iphone-users-following-gemini-partnership-with-apple-173649/">https://observervoice.com/google-offers-privacy-assurance-to-iphone-users-following-gemini-partnership-with-apple-173649/</a>.</p><p>From a systems perspective, Apple is borrowing capability without surrendering orchestration. Gemini provides model intelligence today. Apple Silicon, the Neural Engine, Core ML, and the OS decide where inference runs, how often, and under what constraints. If and when Apple&#8217;s own models reach sufficient maturity, swapping out the backend becomes a product decision rather than a structural rewrite.</p><p>For Apple stock, that matters. Temporary dependency is very different from structural dependency.
Apple is not locking itself into a cost structure or vendor relationship it can&#8217;t later unwind.</p><div><hr></div><h2><strong>Google: Model Validation and Strategic Positioning</strong></h2><p>Google still gets something meaningful out of this arrangement, even if it isn&#8217;t permanent: validation.</p><p>In the joint statement, Apple concluded that Gemini provides &#8220;the most capable foundation&#8221; for its next generation of AI features <a href="https://blog.google/company-news/inside-google/company-announcements/joint-statement-google-apple/">https://blog.google/company-news/inside-google/company-announcements/joint-statement-google-apple/</a>. That endorsement carries weight precisely because Apple is selective and historically willing to walk away once it has an internal alternative.</p><p>For Google, this is less about locking Apple in and more about proving Gemini is good enough to sit at the core of a massive consumer platform. Even if Apple eventually transitions to its own models, Google benefits from the reputational signal and near-term strategic relevance.</p><p>For Alphabet investors, this looks like a positioning win rather than a long-term capture of Apple&#8217;s AI economics. Google strengthens its claim to frontier leadership, but without controlling Apple&#8217;s destiny.</p><div><hr></div><h2><strong>Microsoft: Still Central, But Not the Default Choice</strong></h2><p>Microsoft isn&#8217;t directly affected by the Gemini deal, but it does lose the assumption that Azure plus OpenAI automatically becomes the foundation for every major consumer assistant.</p><p>Apple already integrates ChatGPT for certain Siri queries, but it chose Gemini as the broader foundation layer. That suggests Apple evaluated multiple options and picked what worked best for its current constraints, not what aligned best with a long-term dependency.</p><p>This mirrors Apple&#8217;s past behavior with Intel. 
Apple partnered deeply, extracted value, and then moved on once it had a better internal alternative. Microsoft&#8217;s strength remains enterprise AI and developer ecosystems, but this deal reinforces that Apple will always keep optionality.</p><div><hr></div><h2><strong>Amazon: Indirect Exposure and Long-Term Implications</strong></h2><p>Amazon sits slightly outside this specific partnership, but the broader implication still applies.</p><p>Always-on consumer AI struggles with cloud-first economics. Apple&#8217;s decision to keep inference mostly on-device while temporarily sourcing model capability externally reinforces that lesson. It&#8217;s a reminder that centralized inference scales poorly when AI becomes ambient rather than occasional.</p><p><strong>For Amazon, this echoes some of the challenges Alexa faced</strong>. It also hints that future consumer AI systems may blend local execution with selective cloud augmentation, rather than relying entirely on hyperscaler infrastructure.</p><div><hr></div><h2><strong>OpenAI: Important, but Potentially Transitional</strong></h2><p>OpenAI remains part of Apple&#8217;s AI stack. Apple continues to use ChatGPT for certain requests that require broad world knowledge or creative generation <a href="https://wandb.ai/byyoung3/ml-news/reports/Apple-turns-to-Google-for-AI--VmlldzoxNTYxNDk2Nw">https://wandb.ai/byyoung3/ml-news/reports/Apple-turns-to-Google-for-AI--VmlldzoxNTYxNDk2Nw</a>.</p><p>But just like Gemini, OpenAI&#8217;s role looks more like a component than a permanent foundation.
Apple&#8217;s architecture already supports multiple models, and its tooling is designed to swap backends as capabilities evolve.</p><p><strong>If Apple follows its historical playbook, both Gemini and OpenAI could eventually be replaced, not abruptly, but once Apple&#8217;s internal models are good enough for its specific use cases.</strong></p><div><hr></div><h2><strong>How This Fits Apple&#8217;s Broader AI Architecture</strong></h2><p>This partnership fits cleanly into Apple&#8217;s long-term approach: control the stack, even if you don&#8217;t own every piece yet.</p><p>Apple isn&#8217;t trying to outspend hyperscalers on centralized compute. It&#8217;s trying to avoid inheriting a variable cost structure that grows with every user interaction. By pushing inference onto devices it already sells, Apple turns AI execution into a fixed, amortized cost tied to hardware cycles.</p><p>External models like Gemini fill the gap while Apple&#8217;s own foundation models mature. Once they do, Apple can internalize more of the stack without retraining users or rethinking the product.</p><p>This is the same transition Apple executed with CPUs, GPUs, and mapping infrastructure. The difference is that AI is more visible, but the strategy is the same.</p><div><hr></div><h2><strong>Implications for Apple, Hyperscalers, and AI Platforms</strong></h2><p>This deal doesn&#8217;t redraw the AI landscape overnight, but it clarifies incentives.</p><p>Apple buys time and capability without committing long term. Google gains validation and near-term relevance. Microsoft and Amazon see confirmation that platform control is never guaranteed. OpenAI remains important, but likely not permanent.</p><p>More broadly, this reinforces that AI is becoming infrastructure.
As that happens, temporary partnerships, modular architectures, and cost containment matter more than public dominance narratives.</p><div><hr></div><h2><strong>What This Means for the Stocks Involved</strong></h2><p>There&#8217;s no single winner here.</p><p>Apple reduces risk around Siri while preserving optionality. Google gains credibility. The hyperscalers are reminded that consumer AI is expensive and that Apple will not accept open-ended cost exposure.</p><p>For investors, the key insight is that Apple is behaving exactly as it has before: rely on partners when necessary, build internally in parallel, and switch once control and economics improve.</p><div><hr></div><h2><strong>Closing Perspective</strong></h2><p>Apple&#8217;s decision to use Gemini for Siri looks less surprising when you view it through the company&#8217;s own history.</p><p>Apple has repeatedly used external technology as a bridge rather than a destination. Intel CPUs, Google Maps, even early GPUs followed the same pattern. Gemini fits neatly into that lineage.</p><p>Apple isn&#8217;t conceding the AI stack. It&#8217;s buying time while keeping control of where costs, latency, and privacy live. If history is a guide, this partnership may last only as long as it takes Apple to ship something better for its own constraints.</p><p>That&#8217;s not an exciting story. 
But in technology and investing, those are often the stories that matter.</p><p>If you want to learn more about Apple&#8217;s approach to AI, feel free to check out my article from a few weeks ago: <a href="https://substack.com/@techfundamentals/p-183697737">https://substack.com/@techfundamentals/p-183697737</a></p>]]></content:encoded></item><item><title><![CDATA[AI Agents Are About to Spend Real Money]]></title><description><![CDATA[Agentic Commerce and the Bull Case for Shopify, Amazon and eBay]]></description><link>https://www.techfundamentals.blog/p/ai-agents-are-about-to-spend-real</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/ai-agents-are-about-to-spend-real</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Tue, 13 Jan 2026 13:06:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jeAr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jeAr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jeAr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 424w, https://substackcdn.com/image/fetch/$s_!jeAr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 848w, 
https://substackcdn.com/image/fetch/$s_!jeAr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 1272w, https://substackcdn.com/image/fetch/$s_!jeAr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jeAr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp" width="1456" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1089704,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.techfundamentals.blog/i/184430774?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jeAr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 424w, 
https://substackcdn.com/image/fetch/$s_!jeAr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 848w, https://substackcdn.com/image/fetch/$s_!jeAr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 1272w, https://substackcdn.com/image/fetch/$s_!jeAr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aa79dbc-ae19-4caa-bd36-5b4471f47774_4640x2000.webp 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>Agentic commerce sounds abstract, but the underlying idea is straightforward. Instead of software that helps you shop by surfacing options or recommendations, you now have software that can actually act. An AI agent like ChatGPT or Gemini can search, compare, decide, and complete a purchase based on rules the user sets ahead of time.</p><p>That shift matters because ecommerce has always been organized around human-driven steps. Search, browse, click, compare, checkout. Once software starts collapsing those steps into a single automated flow, the question becomes less about user experience and more about which systems the agents trust to execute reliably.</p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techfundamentals.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech Fundamentals is a reader-supported publication.
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>Shopify: Designing for AI as the Customer</strong></h2><p>Shopify has been unusually clear about where it wants to sit in this new setup. In its announcement on <strong>AI commerce at scale</strong>, Shopify focused less on storefront design and more on making commerce primitives readable and usable by AI systems.</p><p>The idea is that an agent should be able to access a merchant&#8217;s catalog, understand pricing and availability, and complete a transaction without forcing the merchant to rebuild anything or manage one-off integrations. Shopify wants to handle that complexity once, at the platform level, and let everyone else plug in.</p><p>As Shopify VP Vanessa Lee put it:</p><blockquote><p>&#8220;Agentic commerce has so much potential to redefine shopping, and we want to make sure it can scale to every product a customer might want to purchase.&#8221;<br><br>- Shopify News, <em>AI commerce at scale</em> Source: <a href="https://www.shopify.com/news/ai-commerce-at-scale">https://www.shopify.com/news/ai-commerce-at-scale</a></p></blockquote><p>What&#8217;s interesting here is that Shopify isn&#8217;t trying to own the AI interface itself. It&#8217;s positioning itself as the execution layer that sits underneath whatever interface the user prefers. From an investment perspective, that matters because it changes Shopify&#8217;s role over time. Early on, it&#8217;s a merchant tool. 
At scale, it starts to look more like shared infrastructure that other systems depend on.</p><p>If agentic commerce grows, merchants won&#8217;t want to think about which AI model, search engine, or assistant their customers are using. They&#8217;ll want a backend that &#8220;just works&#8221; across all of them, and that&#8217;s the slot Shopify is aiming for.</p><div><hr></div><h2><strong>Adobe (Magento): Capable, but Harder to Standardize</strong></h2><p>Adobe&#8217;s exposure to ecommerce comes mainly through Magento, which serves a different part of the market. Magento is flexible and powerful, especially for large brands with custom needs, but that flexibility comes at the cost of uniformity.</p><p>That tradeoff becomes more visible in an agent-driven world. Autonomous systems work best with predictable, standardized surfaces. Magento deployments tend to vary widely, with integrations handled by agencies or internal teams and business logic spread across multiple layers.</p><p>Adobe&#8217;s broader AI push is focused on content, personalization, and analytics, which are valuable but sit upstream from the actual transaction. For agentic commerce, the bottleneck isn&#8217;t creativity or targeting, it&#8217;s execution.</p><p>As a result, agentic commerce is more of an indirect tailwind for Adobe. It supports demand for orchestration tools, but unless Adobe aggressively simplifies how Magento exposes commerce functionality to agents, it&#8217;s unlikely to become the default transaction backend in the same way Shopify is trying to be.</p><div><hr></div><h2><strong>Amazon: Built for This, but With Different Tradeoffs</strong></h2><p>Amazon already operates close to what agentic commerce looks like in practice. The company has spent years optimizing for speed, convenience, and minimal user effort, all backed by a tightly integrated logistics and payment system.</p><p>The potential tension isn&#8217;t about whether AI agents increase ecommerce volume. 
It&#8217;s about who controls the interaction layer. If purchasing increasingly happens inside AI assistants or search tools run by others, Amazon may still fulfill the order but lose some control over the customer relationship.</p><p>That helps explain why Amazon has been cautious and sometimes defensive around third-party AI shopping tools. It suggests the company sees agentic commerce as strategically important, but also as something that needs to be shaped carefully to avoid turning Amazon into a commoditized backend.</p><p>From an investor&#8217;s point of view, this points to durability rather than disruption. Amazon still owns the hardest assets in ecommerce, especially logistics. Agentic interfaces change how demand is routed, but they don&#8217;t make those assets less valuable.</p><div><hr></div><h2><strong>eBay: A Neutral Marketplace That Fits Agents Well</strong></h2><p>eBay sits in an interesting middle ground when it comes to agentic commerce. It doesn&#8217;t own logistics like Amazon, and it doesn&#8217;t act as a full-stack merchant backend like Shopify. Instead, it runs a large, relatively neutral marketplace with standardized listings, pricing, and checkout flows.</p><p>That neutrality may actually be an advantage in an agent-driven world. AI agents are good at structured comparison, and eBay&#8217;s inventory is already organized around searchable attributes, condition, price ranges, and seller reputation. From an agent&#8217;s perspective, that&#8217;s a clean environment to operate in.</p><p>eBay management has hinted in the past that AI-driven discovery and automation are areas of focus, especially around search relevance and buyer efficiency. The company&#8217;s incentives are also simpler: more completed transactions, regardless of where discovery happens. 
Unlike Amazon, eBay has less reason to protect a single dominant interface, and unlike Shopify, it doesn&#8217;t need to serve millions of independent backends.</p><p>For eBay stock, agentic commerce isn&#8217;t a dramatic growth narrative, but it could quietly improve conversion efficiency. If agents increasingly act as comparison engines that route demand toward the best-value option, eBay&#8217;s broad, price-transparent marketplace could benefit without requiring heavy reinvention. The risk is mostly one of execution: eBay needs to make sure its APIs, listings, and checkout flows are easy for agents to interact with, or it risks being bypassed in favor of more standardized platforms.</p><p>In that sense, eBay feels less like a headline winner and more like a system that could age well if agentic commerce grows gradually rather than all at once.</p><div><hr></div><h2><strong>Infrastructure Starts to Matter More Than Storefronts</strong></h2><p>One side effect of agentic commerce is that attention shifts away from front-end design and toward the pipes underneath. Payments, identity, fraud prevention, and authorization become more central when purchases happen with fewer explicit human actions.</p><p>Market observers have started to describe this as a convergence of decision-making and execution, where systems need to be both intelligent and trustworthy. Analysts at McKinsey have highlighted agentic commerce as an area where standardized transaction infrastructure becomes increasingly important as autonomy increases.</p><p>This tends to reward platforms that are consistent and boring in the right ways. Clean APIs, stable data models, and predictable behavior matter more than flashy features when software is the one doing the buying.</p><div><hr></div><h2><strong>What to Take Away</strong></h2><p>Agentic commerce doesn&#8217;t remove ecommerce platforms from the picture.
It changes which layers capture leverage.</p><p>Interfaces become more fluid and interchangeable, while execution concentrates in systems that agents can rely on without special handling. Shopify is leaning directly into that shift by trying to be the neutral execution layer. Adobe has strong pieces but less cohesion at the transaction boundary. Amazon remains dominant, though more constrained by its own scale and incentives.</p><p>For investors, it&#8217;s worth thinking of agentic commerce less as a feature upgrade and more as an interface transition. Over time, the companies that quietly control how intent turns into a paid order are the ones that tend to compound, even if they don&#8217;t own the most visible part of the experience.</p>]]></content:encoded></item><item><title><![CDATA[After AI, Investors Are Eyeing Quantum Computing. Here’s What Actually Matters.]]></title><description><![CDATA[Which stocks to watch and why the next great computing breakthrough could reward patience more than hype]]></description><link>https://www.techfundamentals.blog/p/after-ai-investors-are-eyeing-quantum</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/after-ai-investors-are-eyeing-quantum</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Fri, 09 Jan 2026 16:16:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gI0G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gI0G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gI0G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gI0G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1969386,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.techfundamentals.blog/i/184036405?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gI0G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gI0G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f979cb5-a474-464d-9058-700c5ecc6d77_3840x2160.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p><br>Quantum computing sits in a strange place. It&#8217;s early, it&#8217;s hard, and it&#8217;s absolutely not following a clean startup-to-scale story. At the same time, it&#8217;s one of the few areas in tech where the upside isn&#8217;t just better software, but access to entirely new classes of problems.</p><p>That combination makes it frustrating and compelling at the same time. 
If you&#8217;re an investor, the question usually isn&#8217;t &#8220;does this work?&#8221; but &#8220;where does value realistically accumulate while this is still messy?&#8221;</p><p>I&#8217;m going to explain quantum computing in simple terms for you and will describe the risks and opportunities from a technical perspective. Of course, I&#8217;ll also mention some stocks that are worth looking into if you want participate in the future of quantum computing.</p><div><hr></div><h2><strong>What quantum computing actually is</strong></h2><p>Quantum computing is an attempt to compute using physical systems that behave fundamentally differently from classical electronics. Instead of abstracting everything down to bits, quantum machines lean directly into quantum mechanical behaviour.</p><p>The important implication is that progress is constrained by nature, not just engineering talent. If quantum computing ever becomes commercially useful at scale, it won&#8217;t be because someone shipped a clever app. It will be because a small number of teams solved problems that most companies cannot even meaningfully attempt.</p><p>From an investment perspective, that matters. <strong>Hard problems tend to concentrate value.</strong></p><div><hr></div><h2><strong>How quantum computing works, in very simple terms</strong></h2><p>A classical computer explores one path at a time, very quickly.</p><p>A quantum computer explores many possible paths at once, but only for very specific kinds of problems. It uses &#8220;Qubits&#8221; for that. 
Qubits can exist in overlapping states, and groups of qubits can be linked together in ways that allow correlations classical systems cannot replicate.</p><p>When everything lines up, this lets quantum machines search complex solution spaces far more efficiently than brute-force classical approaches.</p><p><strong>That&#8217;s why areas like materials science, chemistry simulation, cryptography, and optimization keep coming up</strong>.</p><p>The downside is: Qubits don&#8217;t like the real world. Keeping them coherent requires extreme conditions and constant correction. This makes quantum computing very tricky to control and develop and many of the brightest minds are working actively on its problems.</p><div><hr></div><h2><strong>Quantum computing and AI: complementary, not competitive</strong></h2><p>Quantum computing isn&#8217;t going to replace GPUs for training AI models. That&#8217;s not its role.</p><p>Where quantum could matter is in areas adjacent to AI: optimization, sampling, and simulation problems that are hard even for large classical clusters. Think better materials for chips, better batteries, more efficient logistics, or improved models for physical systems.</p><p>There&#8217;s also a feedback loop forming. AI is increasingly used to design quantum experiments, tune control systems, and search for better quantum algorithms. As quantum hardware improves, that loop tightens.</p><p>This is less about one technology eating the other, and more about both expanding the set of problems we can realistically attempt.</p><div><hr></div><h2><strong>Why quantum computing is still risky, but less fragile than it looks</strong></h2><p>Investing in Quantum Computing is risky, but not in the way most people might think.</p><p>The risk isn&#8217;t that quantum is fake or that progress will suddenly stop. 
The real risk is mistaking technical progress for commercial readiness, or assuming timelines compress the way software timelines do.</p><p>That said, there&#8217;s a more positive angle that often gets overlooked. <strong>Much of the foundational work has already been done</strong>. Error rates are improving. Tooling is better and hybrid classical-quantum workflows are becoming more practical. Governments and large enterprises are no longer just funding research, they&#8217;re testing early use cases.</p><p>This doesn&#8217;t mean revenue explodes this or next year. It does mean the probability that quantum becomes <em>useful</em> keeps going up, even if adoption stays slow and uneven.</p><p>For <strong>patient value investors</strong>, that distinction matters.</p><div><hr></div><h2><strong>The real risk: quantum becoming the next place speculation hides</strong></h2><p>The biggest risk around quantum computing isn&#8217;t that the technology is imaginary, it&#8217;s that capital is very good at outrunning reality. Over the last few decades, markets have repeatedly shown that hype often produces better short-term returns than legitimate businesses, at least until the story collapses under its own weight. Dot-coms, crypto, NFTs, and now generative AI all followed a similar pattern: plausible technology, enormous expectations, weak economics, and capital piling in long before usefulness was proven.</p><p>AI is now starting to show those cracks. Despite hundreds of billions in investment, productivity gains remain narrow and difficult to generalize, while costs scale aggressively. Several studies (like the one from MIT) suggest that most AI pilots fail to deliver meaningful profitability or output improvements, and even widely adopted products struggle to break even without optimistic pricing assumptions. 
As improvement curves flatten and unit economics remain uncomfortable, capital naturally looks for the next narrative that can absorb belief without immediate verification.</p><p>Quantum computing fits that role almost too well. It is real, genuinely hard, deeply technical, and extremely difficult for outsiders to falsify. Hardware progress is visible and easy to market, while the true bottleneck, useful software and algorithms, remains slow, uncertain, and largely resistant to brute-force investment. Even with functional quantum hardware, quantum computers cannot run conventional code and only outperform classical systems on very specific problem classes. For many of the areas they are most often associated with, including AI training and general machine learning, there is still no clear evidence that relevant quantum algorithms even exist.</p><p>This creates a familiar mismatch. Scientific progress continues, but commercial usefulness lags far behind expectations. Timelines stretch into decades, while public markets price outcomes years in advance. In that gap, speculation thrives. The danger for investors isn&#8217;t that quantum fails, but that it gets used as an escape hatch for disappointed AI expectations, pulling future possibilities forward to justify present valuations.</p><p>Quantum computing can still become important, even transformative, without rewarding most early capital. That&#8217;s the uncomfortable part. The technology doesn&#8217;t owe investors a clean adoption curve, and history suggests it rarely provides one.</p><div><hr></div><h2><strong>How investors can participate</strong></h2><p>This is where nuance helps. The most visible quantum stocks are not interchangeable. They represent very different philosophies. Here are some of the more famous ones:</p><h3><strong>IonQ (IONQ): betting on physics elegance and long-term scalability</strong></h3><p><strong>IonQ</strong> is often seen as the &#8220;cleanest&#8221; quantum story. 
Their approach uses trapped ions, which tend to have very high qubit fidelity and long coherence times.</p><p>From the outside, IonQ looks slow. From the inside, they&#8217;re optimising for scalability and error reduction rather than rushing qubit counts. That&#8217;s appealing if you believe stable qubits matter more than flashy numbers.</p><p>IonQ also benefits from a relatively hardware-light model compared to some competitors. Their systems integrate well with cloud platforms, which makes experimentation easier for customers and lowers friction for adoption.</p><p>As an investment, IonQ is a long-duration bet on doing things the hard way early so scaling hurts less later. It won&#8217;t pay off quickly, but if trapped-ion architectures win, the payoff could be meaningful.</p><div><hr></div><h3><strong>Rigetti Computing (RGTI): vertical integration and iteration speed</strong></h3><p><strong>Rigetti Computing</strong> takes a different approach. They build superconducting qubit systems and control much more of the stack themselves, from chip fabrication to software.</p><p>That vertical integration gives Rigetti speed and flexibility. They can iterate quickly, experiment with architectures, and optimize across layers. The downside is cost and complexity. Superconducting systems are demanding, and scaling them reliably is non-trivial.</p><p>Rigetti feels more like a traditional hardware startup: faster feedback loops, higher burn, more visible stumbles. For investors, this is a higher-volatility bet, but also one with clearer signals along the way as systems improve or don&#8217;t.</p><p>If quantum adoption accelerates through cloud-accessible, general-purpose systems, Rigetti is positioned to benefit.</p><div><hr></div><h3><strong>D-Wave Quantum (QBTS): the most commercially grounded path</strong></h3><p><strong>D-Wave Quantum</strong> is often misunderstood because it doesn&#8217;t build universal quantum computers in the same way as IonQ or Rigetti. 
Instead, D-Wave focuses on quantum annealing, which is specialized but useful for optimization problems.</p><p>This specialization is a feature, not a flaw. D-Wave already has customers using their systems for real-world optimization tasks. Revenue exists. Use cases are concrete. The tradeoff is that annealing won&#8217;t solve everything quantum can theoretically do.</p><p>From an investor&#8217;s perspective, D-Wave is less about moonshots and more about pragmatic adoption. It&#8217;s a bet that early, narrow utility can compound into broader relevance over time.</p><div><hr></div><h3><strong>NVIDIA (NVDA): quantum doesn&#8217;t replace GPUs, it leans on them</strong></h3><p>NVIDIA is not a quantum computing company in the traditional sense, but it&#8217;s deeply embedded in the quantum ecosystem.</p><p>Most quantum work today still depends heavily on classical computing. You simulate quantum systems, design quantum circuits, run error-correction layers, and coordinate experiments using massive amounts of classical compute. GPUs sit right in the middle of that workflow.</p><p>NVIDIA&#8217;s role is subtle but powerful. They sell the picks and shovels for quantum research, from simulation libraries to high-performance hardware used by labs, startups, and hyperscalers. Even if large-scale quantum computing takes longer than expected, NVIDIA still benefits as long as money keeps flowing into research and experimentation.</p><p>From an investor perspective, this is attractive because NVIDIA doesn&#8217;t need a quantum breakthrough. 
Incremental progress is enough to justify continued spending on tooling and infrastructure.</p><div><hr></div><h3><strong>Microsoft (MSFT): quantum as a platform feature, not a product</strong></h3><p>Microsoft approaches quantum the same way it approaches most infrastructure problems: abstract it, wrap it in tooling, and make it accessible through a platform.</p><p><a href="https://azure.microsoft.com/en-us/solutions/quantum-computing">Azure Quantum</a> is less about owning the &#8220;best&#8221; quantum hardware and more about being the place where quantum experiments happen. Microsoft connects multiple quantum hardware providers, classical compute, developer tools, and enterprise workflows under one roof.</p><p>What&#8217;s interesting here is the incentive structure. Microsoft doesn&#8217;t need to guess which quantum architecture wins. If quantum becomes useful at all, customers are likely to access it through existing cloud relationships. That gives Microsoft leverage without forcing them to bet everything on one technical path.</p><p><strong>This makes quantum at Microsoft feel less like a moonshot and more like a long-dated platform option that fits naturally into their cloud strategy.</strong></p><div><hr></div><h3><strong>IBM (IBM): slow, steady, and deeply committed</strong></h3><p>IBM has been working on quantum longer than almost anyone else, and it shows in how methodical their approach is.</p><p>IBM treats quantum as an extension of its research-first DNA. They publish roadmaps, expose systems to developers, and push heavily on education and ecosystem building. Their superconducting qubit approach is well understood, even if it&#8217;s technically demanding.</p><p>From an investment lens, IBM&#8217;s strength is credibility. Enterprises, governments, and universities already trust IBM with long-term infrastructure projects. 
Quantum fits naturally into that relationship model.</p><p>IBM is unlikely to surprise anyone with sudden breakthroughs, but it&#8217;s also unlikely to abandon the field. If quantum adoption happens through conservative, institutional channels, IBM is positioned to be a core beneficiary.</p><div><hr></div><h3><strong>Google (GOOG): quantum as a long-term research advantage</strong></h3><p>Google treats quantum computing as fundamental research with potential strategic upside, not as a near-term business line.</p><p>Google&#8217;s quantum work is deeply tied to their strengths in physics, systems engineering, and large-scale infrastructure. Their focus has been on proving hard technical milestones, even when those milestones don&#8217;t immediately translate into products.</p><p>What makes Google interesting is optionality. If quantum ends up mattering for materials science, optimization, or future compute workloads, Google has the talent and infrastructure to integrate it internally before it ever becomes a commercial offering.</p><p>For investors, this means quantum success would likely show up indirectly, through better internal tools, improved efficiency, or future services layered into Google Cloud rather than as a standalone revenue line.</p><div><hr></div><h2><strong>Closing thoughts</strong></h2><p>Quantum computing is neither a punchline nor a saviour. It&#8217;s a real technological frontier with legitimate long-term potential, operating under constraints that don&#8217;t bend to capital or narrative pressure. The physics is difficult, the engineering is slow, and the software problem is still largely unsolved. None of that makes quantum uninteresting. It just makes it incompatible with the way markets like to price stories.</p><p>What complicates things now is timing. As confidence in near-term AI payoffs softens, quantum risks absorbing expectations that have nowhere else to go. 
That doesn&#8217;t change what quantum can eventually become, but it does increase the chance that valuations get ahead of usefulness, especially for companies whose entire identity rests on long-dated breakthroughs.</p><p>For investors, the challenge isn&#8217;t deciding whether quantum matters. It&#8217;s recognising where value can realistically accrue while timelines stretch and narratives rotate.</p><p>Quantum computing will likely matter in ways that don&#8217;t look obvious today. The mistake is assuming that inevitability translates cleanly into investable returns, or that the market will wait patiently for reality to catch up.</p>]]></content:encoded></item><item><title><![CDATA[Apple’s AI Strategy Isn’t Loud. It’s Durable.]]></title><description><![CDATA[Why on-device intelligence favors control over spectacle and why this is a good thing in an AI first world]]></description><link>https://www.techfundamentals.blog/p/apples-ai-strategy-isnt-loud-its</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/apples-ai-strategy-isnt-loud-its</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Tue, 06 Jan 2026 19:34:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9DAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9DAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!9DAR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9DAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2435279,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183697737?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9DAR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9DAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798fec17-d778-4f41-ac28-e1fcc8dc3e26_7996x4957.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">gguy - stock.adobe.com</figcaption></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>Most of the AI conversation today is framed as a race. Bigger models, more parameters, faster benchmarks, larger clusters. If you look at it that way, Apple often seems like it&#8217;s lagging behind the hyperscalers like Google, Amazon or Microsoft.</p><p>But that framing misses what Apple is actually optimizing for.</p><p>Apple isn&#8217;t trying to build the smartest model in the data center. 
It&#8217;s trying to make intelligence cheap, predictable, and always available on devices that already sit in people&#8217;s pockets. That leads to very different technical choices, especially around where inference runs, how costs scale, and which risks are acceptable.</p><p>Once you stop treating AI as a product category and start treating it as infrastructure, Apple&#8217;s focus on on-device AI starts to look less conservative. This article looks at those choices from a systems and fundamentals perspective, and what they imply for Apple, hyperscalers (like Google / Amazon / Microsoft), and investors.</p><div><hr></div><h2><strong>How is Apple&#8217;s AI approach different from other Big Tech companies?</strong></h2><p>Apple&#8217;s AI strategy emphasizes <strong>on-device intelligence</strong> rather than relying on massive cloud inference. This approach is rooted in real <strong>technical and systems trade-offs</strong> &#8212; latency, privacy, bandwidth, and hardware economics &#8212; that become especially important at consumer scale.</p><div><hr></div><h2><strong>The Hardware Foundation: Apple Silicon and the Neural Engine</strong></h2><p>Central to Apple&#8217;s strategy is the integration of <strong>specialized neural acceleration in Apple Silicon</strong>, beginning with the A-series and continuing through the famous M-series chips. If you own an Apple device you&#8217;ve surely come across Apple Intelligence and that it runs and is secured on your device - that&#8217;s what I&#8217;m referring to here.</p><p>The <strong>Apple Neural Engine (ANE)</strong> is a dedicated neural processing unit designed to accelerate neural network inference efficiently and with low power consumption. 
Although Apple doesn&#8217;t publish full architectural details, projects analyzing the ANE describe it as a <strong>neural processing unit tuned for matrix operations and machine learning tasks</strong>, distinct from GPUs or CPUs.</p><p>Research and developer documentation from Apple shows that machine learning models &#8212; including <strong>transformers and other AI workloads &#8212; can be optimized and deployed locally</strong> using the Neural Engine, CPU, and GPU together.</p><p>Apple has also released posts showing <strong>on-device execution of LLMs</strong>, such as an example where a mid-sized 8B-parameter model runs locally on Apple silicon at usable real-time throughput. (<strong><a href="https://machinelearning.apple.com/research/core-ml-on-device-llama">Source</a></strong>)</p><div><hr></div><h2><strong>Delivering Low and Predictable Latency</strong></h2><p>One of the biggest practical advantages of on-device inference is <strong>latency elimination</strong>. This means the time it takes the AI to execute a task.</p><p>Cloud-based AI like ChatGPT and Gemini inevitably on the other hand introduces network hops and queuing delays, which are particularly noticeable in interactive UI features: If you&#8217;ve been using ChatGPT before, you probably know the typical loading spinner that appears while ChatGPT is processing your chat - that&#8217;s the latency.</p><p>When <strong>inference</strong> (meaning the execution of the AI model like processing a chat instead of training it on new things) runs locally, there&#8217;s <strong>no network dependency</strong>, so response time becomes driven purely by local hardware performance rather than internet speed or server load. 
This matters most when AI functionality is tied to the <strong>user experience itself</strong> &#8212; for example, in instant text completion or real-time image analysis &#8212; where even small delays are perceptible and degrade UX.</p><div><hr></div><h2><strong>Privacy as an Architectural Constraint</strong></h2><p>Apple&#8217;s privacy commitments are not just marketing language; they affect how the entire AI stack is designed. According to Apple&#8217;s privacy materials:</p><blockquote><p><em>Apple products, including Apple Intelligence, are engineered to protect user privacy by keeping as much processing on device as possible and minimizing the data sent to servers. (<strong><a href="https://www.apple.com/privacy/features">Source</a></strong>)</em></p></blockquote><p>Technical documentation shows that <strong>Core ML and the machine learning frameworks are explicitly designed to run models entirely on device</strong>, with no network connection needed, which supports both latency and privacy goals. (<strong><a href="https://developer.apple.com/machine-learning/core-ml">Source</a></strong>)</p><p>Developer-focused analyses similarly emphasize that <strong>&#8220;every inference happens on your device using the Neural Engine, CPU, and GPU in concert,&#8221;</strong> meaning personal data stays local and isn&#8217;t sent to remote servers for processing. (<strong><a href="https://medium.com/%40cognidownunder/apple-just-handed-every-developer-a-3-billion-parameter-ai-model-no-cloud-required-b2118eaac574">Source: Medium</a></strong>)</p><p>Apple&#8217;s broader AI architecture &#8212; including its Private Cloud Compute infrastructure &#8212; is built around the dual modalities of on-device processing and secure offload only when needed, ensuring that data remains private and ephemeral even in cloud interactions. 
(<strong><a href="https://security.apple.com/documentation/private-cloud-compute">Source: Apple Security Research</a></strong>)</p><div><hr></div><h2><strong>Bandwidth and Cost: The Economics of Inference</strong></h2><p>At scale - in Apple&#8217;s case that means <strong>hundreds of millions of active devices</strong> - the economics of AI shift dramatically. In cloud-first systems, each <strong>inference</strong> call (meaning executing an AI task) adds cost: compute, bandwidth, and storage. For every photo analyzed, text processed, or voice snippet interpreted, there&#8217;s a per-request cost in elastic cloud infrastructure.</p><p>When <strong>inference</strong> happens on the device itself rather than in cloud-based services like those from OpenAI or Google, the results are:</p><ul><li><p><strong>Bandwidth costs drop to near zero</strong> after models are shipped or updated.</p></li><li><p><strong>Per-request variable costs are eliminated.</strong></p></li><li><p><strong>Network usage is only triggered for non-AI data or optional syncs.</strong></p></li></ul><p>This turns AI execution into a <strong>fixed cost</strong> (upfront hardware and model distribution) rather than a <strong>continuously variable cloud bill</strong>, which is preferable when the scale is enormous and the unit economics matter.</p><p><strong>This means Apple is preparing for a future where AI runs at scale on every one of its devices.</strong></p><div><hr></div><h2><strong>Software Infrastructure: Core ML and Developer Toolchains</strong></h2><p>Apple doesn&#8217;t just provide hardware; it provides optimized tooling. 
The <strong>Core ML framework</strong> is engineered to run models directly on the device with <strong>automated optimization and efficient hardware utilization</strong>, abstracting over CPU, GPU, and Neural Engine resources.</p><p>Core ML explicitly supports <strong>advanced AI models with compression and efficient execution</strong>, which makes it feasible to run larger transformer-style models locally without network dependency.</p><p>Additionally, Apple&#8217;s <strong>Foundation Models framework</strong> &#8212; released as part of its developer ecosystem &#8212; enables on-device large language models to power intelligent app features while still preserving privacy and offline capability.</p><div><hr></div><h2><strong>Offline and Reliability Benefits</strong></h2><p>On-device AI <strong>doesn&#8217;t break when connectivity does</strong>. Features stay responsive whether the user is on Wi-Fi, on cellular, or completely offline. This robustness isn&#8217;t just a convenience; it&#8217;s a <strong>reliability constraint baked into product design</strong>, especially for critical features (e.g., local text understanding, image analysis) that are expected to work anywhere.</p><p>This offline capability expands where and how AI can be used - from high-latency rural networks to airplanes in flight - without degrading quality or predictability.</p><div><hr></div><h2><strong>Tradeoffs and Strategic Implications</strong></h2><p>On-device AI isn&#8217;t a completely free lunch; otherwise everyone would do it. Apple is effectively trading <strong>cloud flexibility and massive model scale</strong> for <strong>predictability, privacy, and consistent UX</strong>. 
Smaller parameter counts and optimized models constrain peak generative capabilities compared to cloud LLMs, but this trade-off aligns with Apple&#8217;s design philosophy: <strong>intelligence that&#8217;s ambient, private, and reliable on the device itself</strong>.</p><p>This architectural choice shapes the unit economics and operational risk profile dramatically: rather than paying per inference or relying on external infrastructure providers, Apple compounds <strong>hardware amortization and device-level execution</strong> &#8212; costs paid once and leveraged per user for years.</p><div><hr></div><h2><strong>Recap Of Why Apple Runs AI On-Device</strong></h2><p>Apple&#8217;s emphasis on on-device AI is not a superficial positioning statement &#8212; it&#8217;s a <strong>systems-level engineering choice</strong> that tightly aligns hardware, software, and privacy constraints. By localizing inference, Apple minimizes latency, reduces dependence on network bandwidth and cloud resources, preserves privacy, and shifts AI costs to predictable, device-specific investments rather than ongoing cloud bills. These factors fundamentally shape how AI powers everyday features on Apple platforms and define a very different set of economic constraints than cloud-centric alternatives.</p><div><hr></div><h2><strong>What This Means for Apple vs. Hyperscalers in Today&#8217;s AI Race</strong></h2><p>This means that Apple&#8217;s on-device AI approach puts it on a very different path than the hyperscalers, and the gap matters far more at the <strong>systems and cost level</strong>.</p><h3><strong>Two Very Different Ideas of Where Intelligence Lives</strong></h3><p>Hyperscalers like Google, Microsoft, and Amazon are building what are essentially centralized intelligence factories. 
The assumption is simple: intelligence lives in the data center, users and companies call it remotely, and the economics work because costs can be passed on through usage fees, subscriptions, or platform lock-in.</p><p>That model makes a lot of sense when AI is scarce, expensive, and clearly differentiated. It also fits naturally with how cloud businesses already operate, where variable cost and elastic pricing are accepted facts of life.</p><p><strong>Apple is betting on a different future. One where intelligence becomes frequent, embedded, and mostly invisible. In that world, sending every small AI interaction back to a data center starts to look wasteful, slow, and hard to justify economically.</strong></p><h3><strong>Centralized Compute vs. Intelligence at the Edge</strong></h3><p>From a technical perspective, this is really a story about where compute sits.</p><p>Hyperscalers concentrate intelligence where power, GPUs, and networking already exist. That gives them huge advantages in model scale, iteration speed, and enterprise-grade workloads. If you need massive models or fast experimentation, centralized infrastructure wins.</p><p>Apple flips the problem around. It pushes intelligence out to the edge, onto devices people already own, and spreads compute across hundreds of millions of phones and laptops. That gives Apple consistency in latency, privacy by default, and very predictable costs once the hardware is shipped.</p><p>Neither approach is universally better. They&#8217;re optimized for different bottlenecks. Hyperscalers optimize for capability and flexibility. Apple optimizes for scale without surprises.</p><h3><strong>When Cost Curves Start to Matter</strong></h3><p><strong>Early in the AI cycle</strong>, hyperscalers look clearly ahead. 
Bigger models, faster progress, and obvious monetization paths through cloud APIs and enterprise deals.</p><p>But as AI moves from &#8220;cool feature&#8221; to something people use constantly, the cost curves start to change. Hyperscalers pay for inference every single time a model runs. Apple pays once, upfront, when it designs the chip and sells the device. Basically, Apple makes its customers pay the AI bill upfront.</p><p>At low usage, that difference barely matters. At high, habitual usage, it becomes the whole game. This is where hyperscalers obsess over batching, utilization, and efficiency, while Apple benefits from inference that is effectively free at the margin.</p><h3><strong>Control Versus Ongoing Exposure</strong></h3><p>The other big difference is where risk sits.</p><p>Hyperscalers are continuously exposed to GPU supply cycles, energy prices, and the need to keep building and upgrading data centers. Those costs don&#8217;t stop once a model ships. They show up every quarter.</p><p>Apple takes on a different kind of risk. It concentrates it upfront in chip design, silicon investment, and longer iteration cycles. That risk is real, but it&#8217;s bounded. Once the hardware is out in the world, the economics are largely locked in.</p><p>From an investor&#8217;s point of view, that means Apple&#8217;s AI risk is front-loaded and familiar, while hyperscaler AI risk is ongoing and variable.</p><h3><strong>What &#8220;Winning&#8221; the AI Race Actually Means</strong></h3><p>If winning means the biggest models, the most parameters, and the flashiest benchmarks, Apple will often look like it&#8217;s behind.</p><p>If winning means stable margins at massive scale, AI quietly embedded into everyday behavior, and fewer operational surprises, Apple looks much more competitive than it gets credit for.</p><p>The key point is that Apple isn&#8217;t trying to beat hyperscalers at their own game. 
It&#8217;s opting out of that game and playing one where distribution, hardware amortization, and control over the stack matter more than raw model size.</p><div><hr></div><h2><strong>What This Means for You If You Hold Apple Stock</strong></h2><h3><strong>AI Isn&#8217;t Changing Apple&#8217;s Business Model. That&#8217;s the Point.</strong></h3><p>If you zoom out a bit, Apple&#8217;s on-device AI push doesn&#8217;t really change <em>what</em> the company is. It changes <em>how much risk</em> it&#8217;s willing to take on.</p><p>Apple isn&#8217;t trying to bolt a new AI business onto the side of its P&amp;L. It&#8217;s folding AI into the same (very profitable) machine it&#8217;s been running for years: sell premium hardware, spread silicon and R&amp;D costs over massive volume, and use software to make each new generation feel meaningfully better than the last one.</p><p>On-device inference fits that model cleanly. There&#8217;s no new per-request cost that scales with usage, and no sudden dependence on cloud pricing, GPUs, or energy markets. That&#8217;s a very Apple move.</p><h3><strong>This Is About Protecting Margins, Not Creating a New Growth Story</strong></h3><p>If you&#8217;re looking for an obvious &#8220;AI revenue line&#8221; in Apple&#8217;s numbers, you&#8217;re probably going to be disappointed. That&#8217;s not what this is.</p><p>Apple&#8217;s AI strategy is mostly about <em>not</em> letting margins get chipped away over time. If everyday intelligence had to run through the cloud, Apple would eventually be paying a variable cost every time users typed, searched, spoke, or edited photos. At Apple&#8217;s scale, that would add up fast.</p><p>By keeping inference on the device, Apple avoids that slow margin bleed. It&#8217;s defensive, but in a good way. 
For a company this large, keeping margins stable often matters more than finding a new flashy growth lever.</p><h3><strong>Apple Is Avoiding the Messy Part of the AI Stack</strong></h3><p>A lot of companies jumping into AI are signing up for things they don&#8217;t fully control: GPU supply cycles, cloud vendor pricing, energy costs, and constant infrastructure upgrades.</p><p>Apple mostly sidesteps that. It puts the risk where it&#8217;s already comfortable: chip design, long product cycles, and tight integration. That doesn&#8217;t mean zero risk, but it&#8217;s familiar risk. Investors usually underestimate how valuable that is.</p><p>This is Apple choosing the boring problems it already knows how to solve instead of the exciting ones that blow up later.</p><h3><strong>AI Helps Sell Hardware, Quietly</strong></h3><p>The real upside from AI for Apple isn&#8217;t some standalone AI service. It&#8217;s hardware upgrades.</p><p>When AI features are tied to newer chips, older devices don&#8217;t just feel slower; they feel outdated. That nudges replacement cycles without Apple having to say much about it. Over time, that shows up as better product mix, more high-end devices sold, and stronger ecosystem lock-in.</p><p>None of that makes headlines, but it compounds.</p><h3><strong>Control Keeps the Numbers Predictable</strong></h3><p>Because Apple controls the silicon, the OS, and distribution, it can decide how capable its AI features are and how expensive they&#8217;re allowed to be. That matters in a world where AI costs are still moving targets.</p><p>Predictability doesn&#8217;t sound exciting, but it&#8217;s exactly what long-term investors want. Fewer surprises mean steadier cash flows. 
Steadier cash flows mean Apple can keep doing what it&#8217;s always done: buy back stock, pay dividends, and invest without drama.</p><p>In my opinion, this aligns with a future where AI will ultimately be a boring infrastructure business.</p><h3><strong>The Bet You&#8217;re Really Making as an Investor</strong></h3><p>Owning Apple in the AI era isn&#8217;t a bet on winning benchmarks or shipping the biggest models. It&#8217;s a bet that AI ends up everywhere, that costs matter more than raw capability once that happens, and that tight integration beats brute-force compute over time.</p><p>Apple isn&#8217;t trying to win the AI race loudly. It&#8217;s trying to not lose quietly. From a fundamentals point of view, that&#8217;s usually a better position than it looks at first glance.</p><div><hr></div><h2><strong>Closing Thoughts</strong></h2><p>Apple&#8217;s AI strategy makes a lot more sense once you stop treating AI as a product category and start treating it as infrastructure.</p><p>Hyperscalers are building centralized intelligence because that&#8217;s where their strengths already are. They&#8217;re optimized for scale, iteration speed, and selling compute as a service. Apple is optimizing for something else entirely: making intelligence cheap, predictable, and quietly embedded into devices people already use all day.</p><p>From the outside, that can look conservative, even underwhelming. From the inside, it&#8217;s a very deliberate choice about where costs should live, which risks are acceptable, and which ones are better avoided altogether. Apple is choosing to pay upfront, lock in economics early, and minimize ongoing exposure as AI usage grows.</p><p>For investors, the mistake would be to judge Apple by the same metrics used for hyperscalers like Google / Amazon / Microsoft. Apple doesn&#8217;t need to win benchmarks or dominate cloud APIs to &#8220;win&#8221; in AI. 
It needs AI to strengthen its existing flywheel without destabilizing margins or introducing new dependencies.</p><p>If AI becomes everywhere, the loudest players won&#8217;t necessarily be the best-positioned ones. The companies that quietly control costs, distribution, and failure modes tend to age better. Apple&#8217;s on-device approach is less about racing ahead and more about making sure it doesn&#8217;t get dragged into a cost structure it can&#8217;t control later.</p><p>That&#8217;s not an exciting ending. But in tech, and especially in investing, boring endings are often the ones that compound.</p>]]></content:encoded></item><item><title><![CDATA[Why Google Built TPUs and Why Investors Should Care]]></title><description><![CDATA[Inference costs, custom silicon, and what it signals for NVIDIA, Nebius, and AI investors]]></description><link>https://www.techfundamentals.blog/p/why-google-built-tpus-and-why-investors</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/why-google-built-tpus-and-why-investors</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Mon, 05 Jan 2026 16:33:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hJQO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hJQO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!hJQO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hJQO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6116729,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183480558?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hJQO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hJQO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo Agency - stock.adobe.com</figcaption></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>There&#8217;s currently a lot of buzz around Google&#8217;s <strong>TPU</strong>, especially among investors and here on Substack. 
Because of this - and given your interest in my last post about AI hardware - I think it&#8217;s a good idea to elaborate a bit in today&#8217;s post on what TPUs are, their history, and their influence on AI.</p><div><hr></div><h2>A short history of TPUs: why they were built</h2><p>To understand why <strong>Tensor Processing Units (TPUs)</strong> matter for AI infrastructure investing, it helps to start with the problem they were designed to solve.</p><p>By the early 2010s, Google was already running neural networks in production across search ranking, ads, speech recognition, and image classification. These models worked, but the cost of running them at scale was becoming a structural issue. CPUs were too slow and inefficient. GPUs helped, but they were designed for graphics first and carried a lot of flexibility that translated into wasted power and higher operating costs when used continuously for inference.</p><p>This is the context in which TPUs emerged.</p><h3>Training vs inference: What you need to know to understand TPUs</h3><p>Before talking about TPUs in detail, it is worth slowing down on a distinction that shapes almost every AI cost curve: <strong>training versus inference</strong>.</p><p><strong>Training</strong> is the phase where a model learns. It is compute-intensive, episodic, and highly visible. Large clusters spin up, capital is deployed in bursts, and progress is easy to market. This is the phase that produces headlines, benchmark charts, and capex shock numbers. From the outside, it looks like the center of gravity in AI.</p><p>A concrete example is large language models like <strong>OpenAI</strong>&#8217;s GPT series. Training a new generation involves running massive datasets through thousands of GPUs or accelerators for weeks or months. This is where you see headline numbers around capex, energy usage, and hardware shortages. 
Once the run is finished, the cluster can be reallocated or powered down.</p><p>Another example is recommendation models at companies like <strong>Netflix</strong> or <strong>Meta</strong>. These models are retrained periodically as user behavior shifts. The training jobs are large but scheduled. They are part of an operational cycle, not a constant drain.</p><p><strong>Inference</strong> is what happens after the model exists. Every prompt, every search ranking, every recommendation, every generated token runs through inference. It is continuous, predictable, and quietly expensive. Inference does not happen once per quarter. It happens millions or billions of times per day, and it shows up directly in cost of revenue.</p><p>Every time someone types a query into a search engine at <strong>Google</strong>, multiple models run inference to rank results, filter spam, translate languages, and personalize outcomes. Each individual query is cheap. Billions of them are not.</p><p>When a user asks a question in a chatbot powered by models from <strong>OpenAI</strong>, inference runs for every token generated. The longer the answer, the higher the cost. This happens continuously, regardless of whether a new model is being trained that week.</p><p>In e-commerce, companies like <strong>Amazon</strong> run inference constantly to power recommendations, search ranking, fraud detection, and pricing systems. These models may be retrained occasionally, but inference never stops. It is part of the baseline cost of doing business.</p><p>At small scale, this distinction barely matters. Training dominates costs, inference is negligible, and general-purpose hardware feels flexible enough. At large scale, the balance flips. 
Training becomes a smaller share of lifetime cost, while inference becomes the permanent load the business has to carry.</p><p>This is the point where incentives change and why TPUs were invented:</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K24O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K24O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 424w, https://substackcdn.com/image/fetch/$s_!K24O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 848w, https://substackcdn.com/image/fetch/$s_!K24O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 1272w, https://substackcdn.com/image/fetch/$s_!K24O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K24O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp" width="1456" height="986" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:986,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121788,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183480558?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K24O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 424w, https://substackcdn.com/image/fetch/$s_!K24O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 848w, https://substackcdn.com/image/fetch/$s_!K24O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 1272w, https://substackcdn.com/image/fetch/$s_!K24O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98cd0e12-39ee-4725-95b8-4efafaf6fc0e_2400x1626.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Google presenting TPU v4 in 2021</figcaption></figure></div><h3>What a TPU actually is</h3><p>A <strong>Tensor Processing Unit (TPU)</strong> is a custom-built <strong>application-specific integrated circuit (ASIC)</strong> designed specifically to accelerate machine learning workloads, especially dense tensor operations such as matrix multiplication. Unlike CPUs, which are general-purpose processors, or GPUs, which are broad parallel accelerators, TPUs are intentionally narrow. 
They remove features that are not useful for neural networks in order to maximize efficiency per watt and per dollar.</p><p>Google&#8217;s own documentation describes them this way:</p><blockquote><p>&#8220;Tensor Processing Units (TPUs) are Google&#8217;s custom-developed, application-specific integrated circuits (ASICs) used to accelerate machine learning workloads.&#8221;</p><p>Source: Google Cloud TPU documentation<br><a href="https://docs.cloud.google.com/tpu/docs/intro-to-tpu">https://docs.cloud.google.com/tpu/docs/intro-to-tpu</a></p></blockquote><p>TPUs are not consumer chips. They are deployed inside Google data centers and offered through Google Cloud, typically in large groups called TPU pods, because their economic advantage only appears at scale.</p><h3>Why Google built TPUs in the first place</h3><p>Google began developing TPUs internally around 2013, long before AI became a mainstream investment theme. The motivation was not performance leadership for its own sake, but cost containment. Neural network inference was growing faster than hardware efficiency, and relying on general-purpose chips meant accepting a cost curve that would only get worse with scale.</p><p>An early Google engineering reflection makes this explicit:</p><blockquote><p>&#8220;We thought we&#8217;d maybe build under 10,000 of them. We ended up building over 100,000 to support all kinds of great stuff including Ads, Search, speech projects, AlphaGo.&#8221;<br>Andy Swing, Principal Engineer, Google TPU team<br>Source: Google Cloud blog<br><a href="https://cloud.google.com/transform/ai-specialized-chips-tpu-history-gen-ai">https://cloud.google.com/transform/ai-specialized-chips-tpu-history-gen-ai</a></p></blockquote><p>The first publicly disclosed TPU, announced in 2016, was built primarily for <strong>inference</strong>, not training. That detail is easy to overlook, but it matters. Inference is the always-on cost center. Training is episodic. 
Google was optimizing the part of the AI lifecycle that shows up every second in operating expenses.</p><h3>What the efficiency gains looked like</h3><p>In Google&#8217;s original academic paper introducing TPUs, the authors quantified just how inefficient general-purpose hardware had become for neural network workloads:</p><blockquote><p>&#8220;The TPU delivered 15&#8211;30&#215; higher performance and 30&#8211;80&#215; higher performance-per-watt than contemporary CPUs and GPUs.&#8221;<br>Norman P. Jouppi et al., &#8220;In-Datacenter Performance Analysis of a Tensor Processing Unit&#8221;<br>Source: <a href="https://en.wikipedia.org/wiki/Tensor_Processing_Unit">https://en.wikipedia.org/wiki/Tensor_Processing_Unit</a></p></blockquote><p>Those numbers were not about peak benchmarks. They reflected real production inference workloads inside Google&#8217;s data centers.</p><p>As TPUs evolved, later generations expanded into training and large-scale model deployment, but the design center remained the same: predictable performance, high utilization, and lower marginal cost at scale. The most recent generations, such as Google&#8217;s inference-focused TPU platforms, continue to emphasize efficiency rather than raw flexibility.</p><p>Industry analysts like UncoverAlpha on Substack have summarized this positioning succinctly:</p><blockquote><p>&#8220;TPUs are not faster GPUs. 
They are cheaper answers to a narrower question.&#8221;<br>Source: Uncover Alpha</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:179815720,&quot;url&quot;:&quot;https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference&quot;,&quot;publication_id&quot;:237593,&quot;publication_name&quot;:&quot;UncoverAlpha&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!YsyF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28f3c972-e191-4eec-857d-3ccfda10a107_227x227.png&quot;,&quot;title&quot;:&quot;The chip made for the AI inference era &#8211; the Google TPU&quot;,&quot;truncated_body_text&quot;:&quot;&quot;,&quot;date&quot;:&quot;2025-11-24T13:54:19.214Z&quot;,&quot;like_count&quot;:141,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:22294341,&quot;name&quot;:&quot;UncoverAlpha&quot;,&quot;handle&quot;:&quot;uncoveralpha&quot;,&quot;previous_name&quot;:&quot;Rihard Jarc&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5dbed507-03bf-4b83-9728-ce17e75bf4b8_227x227.png&quot;,&quot;bio&quot;:&quot;Tech investor, CIO, and former CEO &amp; co-founder of an AI startup (sold to a public software company). &quot;,&quot;profile_set_up_at&quot;:&quot;2022-04-17T09:19:27.899Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-08-20T05:17:29.956Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:238524,&quot;user_id&quot;:22294341,&quot;publication_id&quot;:237593,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:237593,&quot;name&quot;:&quot;UncoverAlpha&quot;,&quot;subdomain&quot;:&quot;uncoveralpha&quot;,&quot;custom_domain&quot;:&quot;www.uncoveralpha.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Deep dives/analyses of the technology companies and tech sub-industries. 
Mostly about AI, semiconductor, cloud, software, and ad tech sectors.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28f3c972-e191-4eec-857d-3ccfda10a107_227x227.png&quot;,&quot;author_id&quot;:22294341,&quot;primary_user_id&quot;:22294341,&quot;theme_var_background_pop&quot;:&quot;#00C2FF&quot;,&quot;created_at&quot;:&quot;2020-12-12T11:35:53.306Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Rihard Jarc&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:1917668,&quot;user_id&quot;:22294341,&quot;publication_id&quot;:1927580,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:1927580,&quot;name&quot;:&quot;Uncover Tech Insights&quot;,&quot;subdomain&quot;:&quot;uncovertechinsights&quot;,&quot;custom_domain&quot;:&quot;www.uncovertechinsights.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A weekly newsletter with the most impactful news in tech. 
From financial news outlets to alternative sources such as interviews with former employees, customers, and industry experts.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fec638f8-c833-41ca-927a-32745f6cff59_465x465.png&quot;,&quot;author_id&quot;:22294341,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#25BD65&quot;,&quot;created_at&quot;:&quot;2023-09-05T11:33:12.700Z&quot;,&quot;email_from_name&quot;:&quot;Uncover Tech Insights&quot;,&quot;copyright&quot;:&quot;Rihard Jarc&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[6349492],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!YsyF!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28f3c972-e191-4eec-857d-3ccfda10a107_227x227.png" loading="lazy"><span class="embedded-post-publication-name">UncoverAlpha</span></div><div class="embedded-post-title-wrapper"><div 
class="embedded-post-title">The chip made for the AI inference era &#8211; the Google TPU</div></div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">5 months ago &#183; 141 likes &#183; 3 comments &#183; UncoverAlpha</div></a></div></blockquote><h3>Why this history matters for investors</h3><p>This history clarifies intent: <strong>TPUs were not a reaction to hype, nor an attempt to compete in the merchant chip market</strong>. They were a response to an internal cost curve that threatened long-term scalability.</p><p>From an investing perspective, that distinction is critical. When a company commits to custom silicon, it is usually because renting general-purpose infrastructure no longer works economically. TPUs are a visible example of how AI, once it reaches scale, stops behaving like software and starts behaving like infrastructure.</p><div><hr></div><h2>AI looks like software, but behaves like infrastructure</h2><p>At small scale, AI behaves like software. You rent GPUs, ship features, and costs feel elastic. You can always optimize later. This phase is intoxicating, and it produces most of the narratives investors hear.</p><p>At scale, AI behaves like infrastructure. Power, memory bandwidth, utilization, and latency move from engineering trivia into margin drivers. Inference does not spike once a quarter like training runs. It runs every minute of every day, and it shows up directly in cost of revenue.</p><blockquote><p>TPUs exist because this transition happens faster than people expect. Google did not build them to win benchmarks. It built them because renting general-purpose accelerators at hyperscale turns variable costs into a permanent tax.</p></blockquote><div><hr></div><h2>Training gets the headlines, inference pays the bills</h2><p>Training dominates public discussion because it is visible. Giant clusters, massive capex numbers, new model releases. 
Inference is quieter and far more important economically.</p><p>Once a model is deployed, inference becomes the long tail of cost. Every user query, every autocomplete, every recommendation request burns compute. Even modest inefficiencies compound brutally when multiplied by billions of requests.</p><p>TPUs are increasingly tuned for this reality. They are optimized around throughput per watt, stable latency, and predictable scaling rather than raw flexibility. That tells you something about where Google sees the long-term cost center.</p><p>For investors, the implication is simple. Any AI business that cannot structurally reduce inference cost over time is running uphill. Pricing pressure eventually arrives, and when it does, gross margin is the only shock absorber.</p><div><hr></div><h2>Control is the real asset</h2><p>From the outside, TPUs look like a technical preference. From the inside, they are about control.</p><p>By building its own silicon, <strong>Alphabet</strong> controls its cost curve, its hardware roadmap, and its deployment cadence. It is not exposed to supplier pricing power in the same way a pure renter is. That does not mean TPUs are cheaper in every scenario. It means the company decides when and how cost improvements show up.</p><p>This distinction matters more than people realize. Companies that rent their entire AI stack inherit someone else&#8217;s incentives. When demand spikes, they pay more. When supply tightens, they wait. When pricing changes, margins move whether they want them to or not.</p><div><hr></div><h2>The GPU monoculture problem</h2><p>The current AI stack has a monoculture risk. A single architecture dominates, a single ecosystem defines tooling, and a single supplier captures most of the value. From a momentum and economic perspective, this looks unbeatable.</p><p>From a systems perspective, it is fragile.</p><p>When everyone depends on the same supply chain, pricing power concentrates. 
When demand overshoots supply, margins compress downstream. When innovation slows, the entire ecosystem feels it.</p><p>TPUs exist partly as an escape hatch. Not a replacement for GPUs everywhere, but a pressure valve. They remind investors that diversification at the infrastructure layer is not about performance bragging rights. It is about resilience.</p><p>This is where understanding the role of <strong>NVIDIA</strong> becomes more nuanced. Owning the toll road is incredibly lucrative, but toll roads are cyclical assets when alternatives mature.</p><div><hr></div><h2>How TPUs could break out beyond Google&#8217;s environment</h2><p>Historically, TPUs have lived almost entirely inside Google&#8217;s own walls. Unlike GPUs, which are sold broadly and show up in every major cloud and data center, TPUs have been tightly coupled to Google&#8217;s internal infrastructure and to Google Cloud. That constraint is not accidental. It reflects a deliberate strategy: TPUs were designed first to solve Google&#8217;s own cost problems, not to become a merchant chip.</p><p>What has changed is not the architecture, but the <strong>pressure on the rest of the ecosystem</strong>.</p><h3>From internal cost lever to external option</h3><p>Google has already crossed one important threshold by offering TPUs to third parties via Google Cloud. 
That move alone shifted TPUs from &#8220;internal optimization&#8221; to &#8220;commercial infrastructure,&#8221; even if still within a controlled environment.</p><p>As Google itself describes it:</p><blockquote><p>&#8220;Cloud TPUs are designed to accelerate machine learning workloads and make Google&#8217;s internal ML infrastructure available to customers.&#8221;<br>Source: Google Cloud TPU documentation<br><a href="https://docs.cloud.google.com/tpu/docs/intro-to-tpu">https://docs.cloud.google.com/tpu/docs/intro-to-tpu</a></p></blockquote><p>The more interesting question for investors is whether TPUs remain a Google-only cloud feature or evolve into something closer to a <strong>multi-tenant, multi-hyperscaler option</strong>.</p><h3>Signals from hyperscaler-scale customers</h3><p>In late 2024 and 2025, reports began to surface suggesting that very large AI users were actively exploring TPUs as an alternative to exclusive GPU dependence.</p><p>Reuters reported that Google was working with Meta to reduce Nvidia&#8217;s software advantage by improving TPU support for widely used frameworks:</p><blockquote><p>&#8220;Google is working with Meta Platforms to improve support for AI software on its in-house Tensor Processing Units, a move aimed at reducing Nvidia&#8217;s dominance in AI chips.&#8221;<br>Source: Reuters<br><a href="https://www.reuters.com/business/google-works-erode-nvidias-software-advantage-with-metas-help-2025-12-17/">https://www.reuters.com/business/google-works-erode-nvidias-software-advantage-with-metas-help-2025-12-17/</a></p></blockquote><p>That article matters less for the Meta angle and more for what it implies: Google is investing in <strong>ecosystem compatibility</strong>, not just hardware performance. 
That is a prerequisite for TPUs to travel beyond Google-native workloads.</p><p>Separate industry reporting has also noted that Meta has explored large-scale TPU usage through Google Cloud as part of its broader effort to diversify away from single-vendor GPU exposure.</p><blockquote><p>&#8220;Meta is reportedly considering Google&#8217;s TPU chips as part of its effort to reduce reliance on Nvidia for AI infrastructure.&#8221;<br>Source: AI Certs industry report<br><a href="https://www.aicerts.ai/news/meta-eyes-google-tpu-chips-in-high-stakes-ai-partnership/">https://www.aicerts.ai/news/meta-eyes-google-tpu-chips-in-high-stakes-ai-partnership/</a></p></blockquote><p>None of this means TPUs are about to be sold like off-the-shelf GPUs. But it does suggest that, at hyperscaler scale, <strong>the desire for architectural leverage is growing</strong>.</p><h3>Software is the real gatekeeper</h3><p>One of the biggest historical blockers for TPUs outside Google was tooling. NVIDIA&#8217;s CUDA became the de facto standard for AI development, and most production pipelines were built around it. TPUs, by contrast, were initially optimized for TensorFlow and Google-internal stacks.</p><p>Google has been explicit about trying to lower this barrier:</p><blockquote><p>&#8220;We&#8217;re investing heavily in making TPUs easier to use with popular frameworks so customers don&#8217;t have to rewrite their models.&#8221;<br>Source: Google Cloud blog<br><a href="https://cloud.google.com/blog/products/ai-machine-learning/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu">https://cloud.google.com/blog/products/ai-machine-learning/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu</a></p></blockquote><p>From an investor perspective, this is the most important signal. If TPUs remain software-isolated, they stay niche. 
If they become genuinely first-class citizens in frameworks like PyTorch, they become an <strong>economic alternative</strong>, not just a technical one.</p><h3>Why a TPU breakout would matter</h3><p>If TPUs ever move beyond Google Cloud into colocated environments, joint ventures, or tightly integrated partnerships with other hyperscalers, the impact would not be about performance leadership. It would be about <strong>negotiating power</strong>.</p><p>For the AI infrastructure market, that would mean:</p><ul><li><p>More leverage against single-vendor pricing like NVIDIA</p></li><li><p>More pressure on GPU margins at the largest scale</p></li><li><p>More incentive for diversified compute strategies</p></li></ul><p>This does not &#8220;kill&#8221; <strong>NVIDIA</strong>. But it does introduce a credible reminder that its largest customers are not passive. They are actively looking for ways to internalize costs and reduce dependency.</p><p>TPUs escaping Google&#8217;s silo would not be a sudden disruption. It would be a slow, structural shift, visible first in contracts and capex decisions rather than benchmarks. For investors, those are exactly the signals that tend to matter most over the long run.</p><div><hr></div><h2>How TPUs could affect NVIDIA and Nebius</h2><p>Understanding TPUs is useful not because they replace GPUs everywhere, but because they change the <strong>power balance</strong> in the AI infrastructure stack. That shift has very different implications for companies like <strong>NVIDIA</strong> and <strong>Nebius</strong>.</p><h3>NVIDIA: pricing power with an expiration date, not a cliff</h3><p>NVIDIA&#8217;s position today is extraordinary. It controls the dominant hardware platform, the dominant software ecosystem, and most of the marginal dollars flowing into AI compute. From an investor perspective, that looks like a textbook toll booth.</p><p>TPUs do not invalidate this. In fact, in the near term they arguably reinforce it. 
The existence of TPUs signals just how expensive and strategically important AI compute has become. That validates NVIDIA&#8217;s margins rather than undermining them.</p><p>Every large-scale TPU deployment is a reminder that NVIDIA&#8217;s biggest customers are also its most motivated future competitors. Hyperscalers do not build custom silicon because GPUs are bad products. They do it because paying a perpetual margin to an external supplier becomes painful once inference volume stabilizes and grows predictably.</p><p>For NVIDIA, this creates a subtle but important dynamic: revenue growth can remain strong while pricing power slowly caps out, and unit volumes can rise even as hyperscalers work to reduce long-term dependence.</p><p>This means the risk is not sudden displacement, but <strong>margin compression</strong>, starting with the largest, most sophisticated buyers.</p><p>From the outside, NVIDIA still looks untouchable. From the inside, TPUs represent the point where customers stop asking &#8220;how fast can we scale?&#8221; and start asking &#8220;how much control do we really have?&#8221;</p><p>That does not break the NVIDIA story, but it reframes it as a cyclical infrastructure business with unusually strong near-term leverage rather than an infinitely compounding software-like asset.</p><h3>Nebius: caught between simplicity and dependence</h3><p>Nebius sits on the opposite side of the control equation.</p><p>As a cloud and AI infrastructure provider without proprietary silicon, Nebius benefits from NVIDIA&#8217;s ecosystem and performance leadership, but it also inherits NVIDIA&#8217;s cost structure. Every improvement in GPU pricing or efficiency helps Nebius. 
Every supply constraint or pricing shift flows directly into its margins.</p><p>TPUs matter here not because Nebius will deploy them tomorrow, but because they highlight the <strong>structural disadvantage</strong> of being a compute renter in a world where AI inference, not model training, becomes the dominant workload.</p><p>If hyperscalers can run inference more cheaply on their own silicon, they gain room to compete more aggressively on price for large customers and absorb cost volatility internally.</p><p>Nebius, by contrast, must either pass costs through or accept margin pressure. At small scale, this is manageable. At large scale, it becomes a strategic ceiling.</p><p>From an investor perspective, this does not mean Nebius cannot succeed. It means its upside is more tightly coupled to NVIDIA&#8217;s roadmap and pricing discipline. TPUs expose that dependency clearly. Nebius is building on top of someone else&#8217;s product.</p><div><hr></div><h3>The broader takeaway for investors in NVIDIA &amp; Nebius</h3><p>TPUs do not &#8220;kill&#8221; GPU companies, and they do not automatically doom GPU-based clouds. What they do is draw a clean line between <strong>infrastructure owners</strong> and <strong>infrastructure renters</strong>.</p><ul><li><p>NVIDIA sits upstream, monetizing everyone&#8217;s urgency to deploy AI.</p></li><li><p>Hyperscalers with TPUs are trying to internalize that urgency into controllable costs.</p></li><li><p>Providers like Nebius live in the middle, exposed to both sides.</p></li></ul><p>For investors, the mistake is treating these businesses as if they compound in the same way. 
TPUs make it clear that AI infrastructure rewards control first, efficiency second, and flexibility last.</p><p>Once AI becomes boring and inference becomes the dominant cost center, the companies that own their cost curves tend to age better than the ones that rent them.</p><div><hr></div><h2>Closing thoughts</h2><p>Once you see TPUs as an economic tool rather than a chip, several things shift.</p><p>You stop evaluating AI companies solely on model quality and start asking who controls their cost base. You become skeptical of businesses whose gross margins depend on someone else&#8217;s roadmap. You treat infrastructure ownership as a form of pricing power, even when it does not show up immediately.</p><p>Most importantly, you stop assuming that AI compounding is automatic. Technology overwrites itself constantly. Cost structures are what survive.</p><p>TPUs matter because they sit at that intersection. They are not the future of all AI compute. They are proof that, at scale, fundamentals reassert themselves.</p>]]></content:encoded></item><item><title><![CDATA[Intro To AI Hardware For Investors: GPUs Won. That’s Why Everyone Is Hedging Them]]></title><description><![CDATA[GPUs, CPUs, RAM, Silicon, ARM explained. 
Plus: Margins, and the quiet shift in AI hardware]]></description><link>https://www.techfundamentals.blog/p/intro-to-ai-hardware-for-investors</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/intro-to-ai-hardware-for-investors</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Sun, 04 Jan 2026 16:06:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!McA6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!McA6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!McA6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 424w, https://substackcdn.com/image/fetch/$s_!McA6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 848w, https://substackcdn.com/image/fetch/$s_!McA6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!McA6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!McA6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg" width="1456" height="1090" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1090,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9135354,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183448524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!McA6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 424w, https://substackcdn.com/image/fetch/$s_!McA6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 848w, https://substackcdn.com/image/fetch/$s_!McA6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!McA6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F173760ca-b260-4cd7-8b21-aa3f382fb345_5264x3942.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>An investor&#8217;s lens on AI hardware</h2><p>Most AI investment narratives start in the wrong place. They focus on models, benchmarks, and headline-grabbing chip launches, as if the main question is who has the fastest technology. 
That&#8217;s rarely where long-term outcomes are decided.</p><p>For investors, AI hardware matters for a simpler reason: it&#8217;s becoming one of the largest and least flexible cost lines in modern tech businesses. Compute, memory, power, and supply constraints don&#8217;t scale like software, and once AI usage moves from experimentation to production, those constraints start showing up directly in margins, pricing power, and strategic optionality.</p><p>This is where a lot of confusion creeps in. Terms like GPUs, ARM, and &#8220;custom silicon&#8221; get thrown around as if they&#8217;re competing bets on the same axis, when in reality they sit at very different layers of the system and serve very different economic purposes. Some choices buy flexibility, others buy efficiency, and a few quietly buy leverage.</p><p>This article isn&#8217;t about predicting which chip wins or which company ships the fastest accelerator. It&#8217;s about understanding how AI hardware actually behaves once it&#8217;s embedded into real products at scale, who controls the economics when that happens, and why many of the most important moves in this space are defensive rather than disruptive.</p><div><hr></div><h2>What AI hardware actually is</h2><p>When people talk about AI hardware, they are almost always talking about GPUs, even if they don&#8217;t explicitly say it, and that&#8217;s already a bit misleading because GPUs were never built for AI in the first place. They were built for graphics, games, video rendering, and other visual workloads, but it turns out that the kind of math you need to draw millions of pixels on a screen looks very similar to the math modern neural networks rely on.</p><p>At a high level, AI models are doing massive amounts of repetitive math, mostly matrix multiplications, nothing conceptually complicated, just a huge volume of it. 
That&#8217;s where GPUs happen to shine, because instead of being optimized to do a few complex tasks very quickly, they are built to do the same simple operation thousands of times in parallel.</p><p>This is the key difference compared to CPUs. A CPU is great at decision-making, branching, and juggling many different tasks at once, which is why it&#8217;s perfect for running an operating system. A GPU is more like an assembly line, where every worker repeats the same motion over and over again, which sounds limiting until you realize that this is exactly what AI workloads want.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zsk3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zsk3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zsk3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zsk3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Zsk3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Zsk3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4460522,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183448524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zsk3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zsk3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zsk3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Zsk3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7067bc0-0fad-4c03-9dfa-5688135497fe_5376x3584.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>How GPUs quietly became the backbone of AI</h2><p>No one sat down and designed a &#8220;perfect AI chip&#8221; at the beginning. 
Researchers simply used the hardware that was already available and eventually realized that GPUs could run neural network workloads much faster than CPUs, which triggered a chain reaction that still shapes the industry today.</p><p>Once AI frameworks started targeting GPUs, model architectures adapted to GPU behavior, tooling improved, and developers learned to think in GPU-friendly ways. Over time, the GPU stopped being an optimization choice and became an assumption, something everything else was built around rather than something you consciously evaluated.</p><p>In other words, GPUs didn&#8217;t win because they were theoretically ideal. They won because they were good enough early on and then accumulated an ecosystem so large that replacing them became a tooling, cultural, and economic problem, not just a technical one.</p><div><hr></div><h2>About training and inference</h2><p>From the outside, all AI compute looks roughly the same, but once you&#8217;re paying for it, the difference between <strong>training</strong> and <strong>inference</strong> becomes very obvious. Training is expensive and painful, but it usually happens in bursts, while inference runs continuously in the background and scales directly with how successful your product becomes.</p><p>Every chatbot reply, every generated image, every recommendation is <strong>inference</strong>, and once users expect those features to work instantly and all the time, you don&#8217;t get to pause or batch that workload later. 
This is where hardware flexibility starts to matter more than peak efficiency.</p><p>GPUs ended up in a sweet spot because they can handle both training and inference reasonably well, even as models and workloads keep changing, and that adaptability turns out to be more valuable in practice than squeezing out a bit more theoretical performance.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4nOm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4nOm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4nOm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7820401,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183448524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4nOm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4nOm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F124f7693-eb39-4faf-857c-840a586cff69_5568x3712.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Memory: the part that quietly dominates everything</h2><p>If GPUs are the muscle of AI systems, memory is the thing that actually keeps them alive, and this is the part many non-hardware people underestimate. AI models are physically large, not just conceptually complex, because all of those parameters have to live somewhere and be moved around constantly while the model runs.</p><p>If that movement is slow, the GPU ends up waiting, and a GPU that&#8217;s waiting is still drawing power, still occupying rack space, and still showing up on the bill. 
In many real-world systems, the bottleneck isn&#8217;t how fast the GPU can calculate but how fast it can get data.</p><p>That&#8217;s why memory decisions matter just as much as compute decisions, even though they usually get far less attention.</p><div><hr></div><h2>System RAM and GPU memory</h2><p>There&#8217;s system RAM, which sits next to the CPU and is relatively cheap and flexible, and then there&#8217;s GPU memory, often called VRAM or high-bandwidth memory, which is much faster, much closer to the GPU cores, and dramatically more expensive per gigabyte.</p><p>AI workloads strongly prefer to keep the entire model inside GPU memory. The moment parts of it spill into system RAM, performance drops sharply, latency increases, and power usage creeps up in ways that don&#8217;t show up in benchmarks but absolutely show up in cloud invoices.</p><p>This is why AI GPUs are expensive. You&#8217;re not just paying for compute, you&#8217;re paying for fast, scarce memory that&#8217;s tightly coupled to that compute, and once you hit memory limits you can&#8217;t fix the problem cheaply without changing the whole performance profile of the system.</p><div><hr></div><h2>Memory bandwidth: the quiet limiter</h2><p>In practice, many AI workloads are memory-bound rather than compute-bound, meaning the GPU could theoretically do more work but spends a lot of time waiting for data to arrive. That unused capacity is still paid for in electricity and capital, which is why it becomes an economic problem, not just a technical one.</p><p>This also explains why newer generations of GPUs focus so heavily on memory bandwidth and interconnects. Faster memory leads to higher real utilization, which lowers cost per request and often matters more than raw compute improvements.</p><p>From the outside, this looks like a minor spec detail. 
From the inside, it&#8217;s one of the main drivers of margins.</p><div><hr></div><h2>Why memory makes AI infrastructure sticky</h2><p>Once an AI system is built around a certain memory setup, it tends to stick, because models get tuned to specific sizes, engineers optimize around bandwidth assumptions, and lots of small, invisible decisions accumulate over time.</p><p>Switching hardware later is rarely a clean swap. It usually means retuning models, rewriting performance-critical code, and accepting new edge cases and failure modes, which is why hardware choices in AI tend to last longer than software choices and why GPUs, with their tightly integrated compute and memory, became so deeply entrenched.</p><p>This stickiness is also what makes any serious attempt to replace GPUs much harder than it looks at first glance.</p><p>With GPUs and memory covered, I want to talk about CPUs next, which in the AI space are often erroneously referred to as &#8220;silicon&#8221;.</p><div><hr></div><h2>What &#8220;silicon&#8221; actually means in AI conversations</h2><p>When people say &#8220;silicon&#8221; in AI discussions, they&#8217;re usually not talking about a specific kind of chip at all. It&#8217;s shorthand for hardware a company designs itself instead of buying off the shelf. Almost every modern processor is made of silicon, whether it&#8217;s a CPU, a GPU, or an AI accelerator, so the material isn&#8217;t the point. What matters is who controls the design and what assumptions are baked into it.</p><p>When you hear phrases like &#8220;Apple&#8217;s silicon&#8221; or &#8220;custom silicon for AI,&#8221; what&#8217;s really being described is a decision to take ownership over part of the hardware stack rather than accept someone else&#8217;s roadmap, pricing, and constraints. Therefore silicon isn&#8217;t a category of hardware. It&#8217;s a statement about who controls the design. 
</p><p>I often hear ARM come up as an equally misunderstood term in discussions, so let&#8217;s tackle that next:</p><div><hr></div><h2>ARM is not a chip, it&#8217;s the language CPUs speak</h2><p>ARM is often described as a chip company, but that&#8217;s slightly misleading. ARM mostly doesn&#8217;t build chips. It defines an instruction set, essentially the language a CPU speaks. That language specifies how software talks to hardware, how instructions are structured, and how memory is accessed. ARM became popular because it&#8217;s efficient, modular, and easy to license, which is why it dominates phones and is now spreading into laptops and servers.</p><p>When companies design ARM-based chips, they aren&#8217;t inventing CPUs from scratch. They&#8217;re starting from a shared language and building their own cores, memory systems, and accelerators around it. ARM answers one specific question: how the CPU part of a chip behaves. That&#8217;s important, but it&#8217;s not where most AI computation happens.</p><div><hr></div><h2>CPUs still matter, just not for the heavy AI lifting</h2><p>CPUs, whether ARM-based or not, are still essential. They run the operating system, handle networking, schedule workloads, and keep the system stable. What they don&#8217;t do well is the massive parallel math modern AI models rely on, which is why GPUs took over training and large-scale inference.</p><p>When people talk about &#8220;ARM for AI,&#8221; they usually mean ARM CPUs coordinating AI workloads, not replacing GPUs. The CPU is the conductor, not the orchestra.</p><div><hr></div><h2>Custom AI chips are purpose-built accelerators</h2><p>Custom AI chips live in a different category altogether. These are purpose-built accelerators designed to run neural networks efficiently under specific assumptions. 
They support a narrower set of operations, trade flexibility for efficiency, and are usually aimed at inference workloads where models are stable and traffic is predictable.</p><p>They don&#8217;t replace CPUs or GPUs. They sit alongside them and handle the parts of the system where efficiency matters more than adaptability.</p><div><hr></div><h2>How these pieces fit together in real systems</h2><p>In real AI systems, these components stack rather than compete. A <strong>CPU</strong> handles orchestration and system logic, a <strong>GPU</strong> handles flexible high-performance compute, and a custom accelerator handles narrow, high-volume inference paths.</p><blockquote><p>Once you look at the system this way, the current hardware strategies become much easier to understand. Companies aren&#8217;t choosing between ARM, GPUs, or custom chips. They&#8217;re deciding which layers of the stack they want to control directly and which ones they&#8217;re willing to rent.</p></blockquote><div><hr></div><h2>Why all of this quietly pushes companies toward custom silicon</h2><p>Once you really sit with how tightly compute and memory are tied together in modern AI systems, a lot of current hardware decisions start to look less mysterious. GPUs didn&#8217;t just win because they were fast, they won because they bundled compute, memory, software tooling, and developer habits into one package that&#8217;s extremely hard to unwind without breaking things in subtle, expensive ways.</p><p>That stickiness is great when you&#8217;re building on top of it and terrible once you&#8217;re dependent on it.</p><p>At small scale, none of this feels dramatic. You spin up instances, pay the bill, and focus on shipping product. 
But as usage grows, inference runs constantly, memory inefficiencies start showing up in real money, and suddenly hardware stops being an abstract infrastructure layer and starts shaping what you can and can&#8217;t afford to do.</p><p>At that point, GPUs stop being &#8220;just the best option&#8221; and start becoming a structural dependency. Pricing is set somewhere else. Supply constraints are out of your control. Hardware roadmaps begin to leak into product roadmaps in ways software teams usually don&#8217;t like to admit.</p><p>This is where the conversation shifts, not toward replacing GPUs outright, because that&#8217;s far harder than it sounds, but toward asking a quieter question: <strong>how much of this dependency are we actually comfortable with over the long term?</strong></p><p>That&#8217;s where custom silicon shows up.</p><p>The mistake is to think custom chips are about building something fundamentally better than GPUs. Most of the time they aren&#8217;t. They&#8217;re about creating an escape hatch, even if it&#8217;s narrow, limited to certain workloads, or only economically sensible at very large scale.</p><p>Once a company can say that not every workload needs to run on GPUs, the balance of power changes a bit. Pricing discussions look different. Shortages hurt less; if you&#8217;re into gaming or AI infrastructure you surely know what I&#8217;m talking about. Roadmaps become something you negotiate around instead of something you simply accept.</p><p>In that sense, custom silicon isn&#8217;t a revolution or a bold bet on a new future. It&#8217;s what happens when GPUs and memory become important enough that relying on a single external supplier starts to feel risky.</p><div><hr></div><h2>This isn&#8217;t theoretical. Big tech is already acting this way.</h2><p>Once you look at AI hardware through the lens of dependency and leverage, the current moves by big tech stop looking experimental and start looking inevitable. 
None of the major players are behaving as if the GPU era is about to end, but none of them are behaving as if full dependency on a single hardware supplier is acceptable either.</p><p>They&#8217;re all converging on the same strategy from different angles: keep GPUs central, but reduce how much power any one supplier has over the system as a whole.</p><div><hr></div><h2>Apple: control the stack or don&#8217;t play the game</h2><p><strong>Apple</strong> is often held up as the example of how custom silicon can be transformative, and that&#8217;s true, but it&#8217;s also why Apple is such a bad comparison for almost everyone else.</p><p>Apple didn&#8217;t build its own chips because GPUs were too expensive or because it wanted better benchmarks. In my opinion, it built them because Apple hates being dependent on anyone else&#8217;s roadmap. Performance per watt, battery life, thermals, form factor, software APIs, developer tooling, all of that becomes much easier to reason about once the silicon is yours.</p><p>What&#8217;s interesting is how directly this thinking maps to AI, even though Apple&#8217;s AI story looks very different from cloud providers. Running more inference on-device only makes sense if you control compute and memory tightly, because latency, power usage, and privacy constraints all collapse into hardware decisions.</p><p>Apple&#8217;s takeaway isn&#8217;t &#8220;custom silicon beats GPUs.&#8221;<br>It&#8217;s &#8220;control compounds when you own the whole system.&#8221;</p><div><hr></div><h2>Microsoft: hedge the dependency, don&#8217;t fight it</h2><p><strong>Microsoft</strong> sits in a very different position. 
It doesn&#8217;t control consumer hardware at scale, but it does run one of the largest cloud platforms in the world, and that makes GPU dependency a balance sheet problem rather than a technical curiosity.</p><p>Azure&#8217;s AI push is deeply GPU-centric, and Microsoft is one of the biggest customers <strong>NVIDIA</strong> will ever have. At the same time, Microsoft is very clearly investing in its own silicon, not because it believes it can replace GPUs overnight, but because relying exclusively on someone else&#8217;s pricing, supply, and roadmap is a long-term risk.</p><p>This is the hedge in its purest form.</p><p>Some workloads stay on GPUs because flexibility matters. Others slowly move to internal accelerators where performance is predictable and margins matter more than raw capability. The goal isn&#8217;t technical dominance; it&#8217;s optionality.</p><p>Microsoft doesn&#8217;t need custom silicon to win. It needs it so that GPUs don&#8217;t get to set all the rules.</p><div><hr></div><h2>NVIDIA: leaning into dominance without pretending it&#8217;s forever</h2><p>What makes <strong>NVIDIA</strong> interesting is that it&#8217;s not behaving like a company that thinks this dependency will never be challenged. NVIDIA knows better than anyone how fragile dominance can be in hardware.</p><p>Instead of just selling chips, NVIDIA keeps expanding the surface area of what &#8220;a GPU&#8221; even means. Software stacks, networking, interconnects, developer tooling, and increasingly entire data center reference architectures. The goal is to make the GPU not just a component but the default way AI systems are built.</p><p>At the same time, NVIDIA is pushing aggressively into higher memory bandwidth, tighter integration, and system-level solutions, which is exactly what you&#8217;d expect if you understand that memory and utilization, not just compute, are where the real leverage lives.</p><p>NVIDIA isn&#8217;t ignoring custom silicon. 
It&#8217;s pricing, bundling, and integrating in a way that makes leaving as expensive and risky as possible.</p><div><hr></div><h2>Different strategies, same direction</h2><p>Apple, Microsoft, and NVIDIA are doing very different things on the surface, but underneath, the incentives line up neatly.</p><ul><li><p><strong>Apple</strong> wants total control, because it can afford to build everything itself.</p></li><li><p><strong>Microsoft</strong> wants leverage and margin protection at massive scale.</p></li><li><p><strong>NVIDIA</strong> wants to remain unavoidable, even as customers hedge against it.</p></li></ul><p>None of these companies are betting on a clean break from GPUs. None of them are acting like custom silicon is a silver bullet. They&#8217;re all behaving as if AI hardware is now core infrastructure, closer to energy or networking than to normal software, and infrastructure rewards control, redundancy, and leverage.</p><p>Custom silicon isn&#8217;t showing up because GPUs failed. It&#8217;s showing up because GPUs succeeded so completely that depending on them alone became risky.</p><p>From here, the remaining question isn&#8217;t whether custom silicon &#8220;wins,&#8221; but how much dependency reduction is enough before the complexity stops being worth it.</p><div><hr></div><h2>Closing thoughts</h2><p>If you strip away the hype, the benchmarks, and the chip launch headlines, what&#8217;s happening in AI hardware follows a familiar pattern for new systems in tech. Systems that start out flexible and cheap eventually become critical infrastructure, and once that happens, the conversation shifts from performance to control, cost, and risk.</p><p>GPUs didn&#8217;t become central to AI because they were perfect. They became central because they were available, flexible, and good enough early on, and then everything else quietly aligned around them. 
Memory, tooling, developer habits, pricing models, and product expectations all piled on top until walking away stopped being a technical decision and started being an economic one.</p><p>Custom silicon is a reaction to that. Not a rebellion against GPUs, and not because everyone suddenly discovered a better way to do math, but because depending entirely on someone else&#8217;s hardware, memory roadmap, and pricing power feels fine at small scale and dangerous at large scale. The bigger the AI footprint gets, the more that risk matters.</p><p>That&#8217;s why the most interesting signal right now isn&#8217;t who claims to have the fastest chip, but who is quietly building options. </p><p>From an investing perspective, this is where a lot of narratives break. Custom silicon doesn&#8217;t automatically mean disruption, and GPU dominance doesn&#8217;t automatically mean permanent pricing power. What matters is who controls enough of the stack to survive margin pressure (see my other articles), supply shocks, and the slow grind of inference costs over time.</p><p>AI hardware is starting to look less like software and more like utilities. Boring, expensive, politically constrained, and incredibly hard to unwind once you&#8217;re in too deep (unless you read and subscribe to my Substack, of course).</p><p>I hope this article gave you a good overview of the hardware involved in AI and that hardware news will make more sense to you now. Don&#8217;t forget to like and share this article. If you have any questions or comments, don&#8217;t hesitate to post them.</p><p><strong>Update: </strong>If you want to learn more about a specific piece of custom silicon from Google called the <strong>TPU</strong>, I got you covered:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b392b2c9-5da8-48a5-b855-29c5d25d9b88&quot;,&quot;caption&quot;:&quot;Disclaimer: This publication and its authors are not licensed investment professionals. 
Nothing posted on this blog should be construed as investment advice. Do your own research.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Why Google Built TPUs and Why Investors Should Care&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:428975524,&quot;name&quot;:&quot;Daniel Kolb&quot;,&quot;bio&quot;:&quot;Software engineer investing in tech stocks and crypto with a focus on fundamentals, system design, and real-world usage.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/129784c0-a568-4531-b741-47c6984b24da_638x638.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-01-05T16:33:48.988Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!hJQO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ec762d-b6ab-4acd-93b0-c2eb24aa5f23_9216x5184.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://techfundamentals.substack.com/p/why-google-built-tpus-and-why-investors&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:183480558,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:1,&quot;publication_id&quot;:7381602,&quot;publication_name&quot;:&quot;Tech Fundamentals&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!xWDB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0870fc2b-6ca8-452b-aea6-8572d56e44de_670x670.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div>]]></content:encoded></item><item><title><![CDATA[AI in 2026: When the Demos Stop Carrying the 
Story]]></title><description><![CDATA[From wow moments to economic reality]]></description><link>https://www.techfundamentals.blog/p/ai-in-2026-when-the-demos-stop-carrying</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/ai-in-2026-when-the-demos-stop-carrying</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Thu, 01 Jan 2026 07:00:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cW8L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cW8L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cW8L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cW8L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cW8L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!cW8L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cW8L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg" width="1456" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1439769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183107616?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cW8L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cW8L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!cW8L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cW8L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bfc30d7-1c3a-40a7-8ef1-c6b4cf9308dc_3046x1380.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. 
Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p><br><strong>Happy New Year, everyone!</strong> <br><br>For the last few years, AI has mostly been judged by how impressive it looks in public. First it was <strong>OpenAI</strong> and <strong>ChatGPT</strong> suddenly sounding smarter than anyone expected, then <strong>Sora</strong> arrived and made it feel like video production itself was about to be rewritten. Capital (and NVIDIA&#8217;s stock in particular) reacted immediately. Big Tech spent hundreds of billions on GPUs, data centers, long-term energy contracts, and custom chips, all on the assumption that intelligence would naturally translate into revenue at scale.</p><p>From where I sit, that assumption has always felt fragile. As a software engineer working with these systems daily, and as someone who has built startups and runs an agency that ships production software for paying clients, I&#8217;ve learned that technical capability and economic leverage rarely move in sync. AI looks magical in isolation, but once it lands inside real environments with SLAs, compliance constraints, edge cases, and accountability, the story becomes far less clean.</p><p>So far, the numbers reflect that. AI revenue remains small relative to the balance sheets now supporting it. Even at companies like <strong>Apple</strong> or <strong>Alphabet</strong>, AI is strategically important but still not a standalone cash engine. And almost none of the infrastructure spend accrues to the average AI startup. Most live one layer above the models, packaging the same intelligence in slightly different forms and hoping focus or UX creates leverage.</p><h2>Adoption looks big until it meets real workflows</h2><p>This is why 2026 feels like a turning point. Not because the models stop improving (they won&#8217;t), but because attention is shifting from demos to deployment. From the outside, adoption looks ubiquitous. 
Engineers rely on AI daily, marketers treat it as default tooling, students grow up with it, and many people quietly get more done because of it.</p><p>Inside organizations, progress is slower and messier. Formal adoption is uneven, pilots are common, and many fade out without drama, sometimes because speaking openly about AI failures might reflect badly on employees. I see this constantly in agency work. Teams build a proof of concept, show early promise (especially among euphoric team members), then hit harder questions than expected. Who signs off on outputs that are sometimes wrong? How are failures handled when automation breaks something subtle? How do probabilistic systems fit into organizations built around certainty and blame?</p><p>AI often gets close enough to be useful without ever fully owning the outcome. Responsibility does not transfer along with the output. Legal risk, on-call duty, customer fallout, and reputational damage still sit with humans. That gap between assistance and ownership is where most AI ROI disappears, and it is also where many AI startups quietly stall.</p><p>Even focused vertical tools like <strong>Harvey AI</strong> (AI for lawyers) or <strong>Sierra</strong> (AI for customer service) run into this reality. Their success depends less on model quality and more on whether organizations are willing to change workflows, incentives, and responsibility boundaries. That kind of change is slow, political, and consistently underestimated.</p><h2>AI wrappers: Startups that don&#8217;t own their tech &amp; what it means for Big Tech</h2><p>There is a seductive belief that AI finally restores the compounding effect to software. Smarter tools, faster iteration, and more leverage per person. 
In my experience, AI dramatically lowers the cost of building while simultaneously shortening the half-life of advantage for anyone who does not control their economics.</p><p>This is why <strong>AI wrapper economics</strong> should feel familiar to anyone who lived through previous platform shifts. It resembles businesses built entirely on APIs whose fate depended on someone else&#8217;s roadmap. Many of those companies were well executed. Some became case studies.</p><p>All of this matters because AI is no longer a side theme in markets. It is a core assumption. By late 2025, AI-related companies made up a massive share of the <strong>S&amp;P 500</strong>, and current valuations assume that adoption accelerates and margins expand.</p><p>A large share of what gets labeled an AI company today is effectively a wrapper. The core intelligence comes from <strong>OpenAI</strong>, <strong>Anthropic</strong>, or a hyperscaler. The startup builds a thin product layer on top. From the outside, it looks like SaaS. Under the hood, it behaves like a variable-cost service.</p><p>When these systems run at scale, the tension becomes obvious. Usage does not follow neat pricing assumptions. Customers expect more volume for less money. Model behavior changes without warning, and margins compress in ways pitch decks never modeled. In 2026, more investors will realize that many AI startups are not failing because adoption is slow, but because the economics were fragile from the start.</p><p>This puts a lot of pressure on the big AI providers. To live up to those expectations, they need to push into real production. According to MIT research from July 2025, 95% of businesses&#8217; AI pilots failed to generate a return at all. That means Big Tech needs to act fast.</p><h2>Might agents be the solution to a stalling adoption curve?</h2><p>To push past the adoption friction, the industry is leaning into agents. 
Systems that take tasks and execute them end to end. From a buyer&#8217;s perspective, outcomes are easier to understand than prompts. From a founder&#8217;s perspective, results are easier to sell than assistance. The pitch is as simple as it gets: replace humans instead of merely augmenting their capabilities.</p><p>From an operator&#8217;s perspective, agents surface the hardest questions immediately. Responsibility, failure modes, and substitution stop being abstract. Startups like <strong>Artisan</strong> that frame agents as replacements for human beings rather than tools may gain attention, but they also turn technical products into cultural statements.</p><h2>What 2026 actually sorts out</h2><p>By the end of 2026, it will not be clear whether AI &#8220;won,&#8221; but it will be clear which advantages survived contact with a reset. Some companies will have restructured workflows deeply enough to make AI stick. Some startups will prove they actually control their cost curves. Many others will turn out to be compounding inside someone else&#8217;s system, on borrowed time.</p><p>From my perspective, AI does not need to fail to disappoint. It only needs to scale more slowly than expected, cost more than hoped, and concentrate value higher up the stack than most narratives assume.</p><p>That is why 2026 feels less like a breakthrough year and more like a sorting year. Not between AI and non-AI companies, but between businesses that can survive repeated resets and those that quietly plateau while the story moves on.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techfundamentals.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Tech Fundamentals! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Why VC Math Encourages Fragile AI Startups]]></title><description><![CDATA[I recently reread The Power Law, and it clicked in a very uncomfortable way with what I see every day working on AI products.]]></description><link>https://www.techfundamentals.blog/p/why-vc-math-encourages-fragile-ai</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/why-vc-math-encourages-fragile-ai</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Wed, 31 Dec 2025 14:46:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!OKRQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OKRQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OKRQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!OKRQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OKRQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OKRQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OKRQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg" width="1456" height="816" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:816,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2649573,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183058689?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!OKRQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 424w, https://substackcdn.com/image/fetch/$s_!OKRQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OKRQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OKRQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F246123d1-131d-416f-a1e7-36c398e0cdcf_5824x3264.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>For context, I&#8217;m a software engineer by background, but I&#8217;ve also built startups, run a boutique agency for years, and spend most of my days designing and shipping software products, nowadays often with AI components. I&#8217;m not watching this market from the sidelines. I&#8217;m inside the systems, the invoices, the latency issues, the model limits, and the cost surprises. That proximity changes how you interpret the stories we tell about AI startups.</p><p>Especially the venture-backed ones.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techfundamentals.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Tech Fundamentals! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3>Power law returns shape everything downstream</h3><p>The core argument of <em>The Power Law</em> is not just that VC returns are uneven. It&#8217;s that they are violently uneven. One or two companies can return an entire fund, while most others go to zero or limp to an acquihire.</p><p>Once you internalize that, a lot of otherwise confusing behavior starts to make sense.</p><p>VCs are not optimizing for average outcomes. They are optimizing for outliers. That means they are structurally indifferent to fragility as long as there is a credible path to extreme upside.</p><p>This matters because founders adapt to incentives. Consciously or unconsciously, you build what gets funded. And in AI right now, what gets funded is speed, surface-level magic, and the appearance of leverage.</p><h3>Why &#8220;just build on top of the best model&#8221; is rational</h3><p>From a purely technical perspective, building an AI wrapper can feel irresponsible. You don&#8217;t control the model. You don&#8217;t control pricing. You don&#8217;t control rate limits. You don&#8217;t control whether a new API release makes your core feature obsolete overnight.</p><p>But for a founder trying to raise venture capital, it&#8217;s often the most rational move available.</p><p>Wrappers let you move fast. They let you demo something impressive with a small team, and they let you show usage graphs before you&#8217;ve solved any hard infrastructure problems. 
But most importantly, they let you tell a story that fits power law thinking.</p><p>&#8220;If this category explodes, we could be the category leader.&#8221;</p><p>That sentence matters more than unit economics in early-stage VC. And AI wrappers are excellent vehicles for that narrative.</p><p>I don&#8217;t say this as an outsider criticizing naive founders. I&#8217;ve been on both sides of this. I know how tempting it is to defer hard problems when momentum is rewarded more than resilience.</p><h3>The illusion of leverage in AI products</h3><p>AI makes this even trickier because it creates the illusion of leverage everywhere.</p><p>A small team can now ship something that would have taken a large organization a few years ago. That&#8217;s real. But the leverage often belongs to the model provider, not the startup.</p><p>When working on AI products for my company and clients, I see the same pattern over and over. Early costs look trivial. Latency feels acceptable. The system works well enough. Then usage or requirements grow, edge cases pile up, prompts get more complex, and suddenly the cost curve shows up.</p><p>At that point, if you don&#8217;t control the underlying system, your margins are someone else&#8217;s variable.</p><p>From a VC perspective, that&#8217;s fine. If the company breaks, it breaks. The fund only needs one winner. From a founder or operator perspective, it&#8217;s terrifying.</p><h3>Fragility is not accidental, it&#8217;s selected for</h3><p>One thing <em>The Power Law</em> makes very clear is that venture capital doesn&#8217;t accidentally produce fragile companies. It selects for them.</p><p>Speed beats correctness. Growth beats efficiency. Story beats control.</p><p>If you try to build something slower, more robust, and more boring, you often get punished in fundraising. Your charts look worse and your pitch sounds less exciting. 
Your upside feels capped, even if your downside is far better managed.</p><p>Again, I&#8217;ve felt this tension personally. Running an agency forces you to think in terms of cash flow, trust, and systems that don&#8217;t fall apart when something upstream changes. Those instincts actively work against the kind of risk-taking VC wants to see.</p><p>AI just amplifies that mismatch.</p><h3>The dependency nobody prices correctly</h3><p>The quiet problem with many AI startups is not that they will fail. Failure is normal. It&#8217;s that their dependency structure caps their long-term value even if they succeed.</p><p>If your core capability is rented, your differentiation has a ceiling. You can have great UX, strong distribution, and excellent branding, but if the underlying intelligence is commoditized and controlled by someone else, your negotiating power erodes over time and you can easily be replaced by competitors or evolving AI agents.</p><p>I see a lot of founders assume they&#8217;ll &#8220;figure it out later.&#8221; Replace the model. Negotiate pricing. Build proprietary data moats. Maybe all of that happens. But maybe it doesn&#8217;t.</p><p>From the VC side, that uncertainty is acceptable. From the founder side, it&#8217;s existential.</p><h3>How this changed how I invest and build</h3><p>As an investor in, or customer of, an AI product, I&#8217;ve become much more skeptical of AI companies that look impressive quickly. I don&#8217;t ask whether the demo works. I ask whether the team behind the product is able to adapt swiftly. That question matters most when I meet AI companies without a technical cofounder, or worse, vibe-coded AI products built by marketing and sales people who clearly don&#8217;t know what&#8217;s happening under the hood.</p><p>As a builder, I&#8217;m increasingly biased toward projects that look boring in the early days. Things with real constraints. Things that don&#8217;t explode on Twitter. 
Things where the hardest work happens before the story gets interesting.</p><p>Power law math encourages founders to chase optionality over durability. That&#8217;s not a moral failure. It&#8217;s a structural one.</p><p>But if you&#8217;re an operator, an employee, or an investor who actually cares about long-term outcomes, you need to see that clearly. Most AI startups are not designed to last. They&#8217;re designed to be asymmetric bets.</p><p>Once you see that, the current AI boom looks less like a gold rush and more like a very efficient machine for producing impressive demos, fragile businesses, and a small number of enormous winners.</p><p>And the winners probably won&#8217;t look like wrappers at all. They&#8217;ll look slow, capital-intensive, and unsexy for a very long time. Which is exactly why most people will miss them.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techfundamentals.blog/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Tech Fundamentals! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How Compounding Breaks When Technology Resets Every Cycle]]></title><description><![CDATA[A guide for stock traders & VCs inspired by my career spent rebuilding on moving ground and adapting when the wind changes]]></description><link>https://www.techfundamentals.blog/p/how-compounding-breaks-when-technology</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/how-compounding-breaks-when-technology</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Wed, 31 Dec 2025 10:19:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-AR5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-AR5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-AR5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!-AR5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-AR5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-AR5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-AR5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg" width="1456" height="888" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:888,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2437660,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/183040788?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!-AR5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-AR5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-AR5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-AR5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fda343f-4943-465b-98de-4740d143d93d_3612x2204.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>The longer my career as a founder and engineer gets, the more one idea from <em>The Psychology of Money</em> keeps resurfacing: long-term success isn&#8217;t about brilliance, timing, or even hard work in isolation. It&#8217;s about staying in the game long enough for compounding to matter. That&#8217;s also a theme I read about a lot here on Substack.</p><p><strong>Compounding</strong> is slow, boring, and deeply unfair in the short term. That&#8217;s why it works.</p><p>But the more years you spend inside technology - not just investing in it, but building on top of it - the clearer it becomes that tech is one of the most hostile environments for clean compounding.</p><p>Not because growth isn&#8217;t real, but because the rules reset far more often than our mental models assume.</p><div><hr></div><h3>Technology doesn&#8217;t compound, it overwrites</h3><p>Most technological progress isn&#8217;t additive. It replaces what came before.</p><p>Desktop computing didn&#8217;t smoothly compound into mobile. Mobile shifted where value was created and erased entire categories of software businesses. On-prem infrastructure didn&#8217;t evolve gracefully into cloud services; it became stranded capital almost overnight. 
Traditional software isn&#8217;t slowly absorbing AI - it&#8217;s being repriced, re-scoped, and in some cases made irrelevant.</p><blockquote><p>Each cycle quietly wipes out part of the previous advantage stack. Experience becomes table stakes and tooling turns into legacy debt. Moats shrink into features, and what once differentiated you becomes the minimum requirement just to stay in the conversation.</p></blockquote><p>You don&#8217;t go back to zero, but you also don&#8217;t start from where your intuition tells you that you should. And that gap, between perceived leverage and actual leverage, is where compounding quietly dies.</p><div><hr></div><h3>I&#8217;ve lived this personally as a consultancy founder</h3><p>Running an IT consultancy looks, on paper, like a textbook compounding business. Experience accumulates. Reputation spreads. Client relationships deepen. Every year should make the next one easier.</p><p>That&#8217;s the story. The lived reality is far less linear.</p><p>Every few years, the technical foundation underneath the business shifts. And when that happens, a meaningful portion of accumulated advantage simply loses relevance.</p><p>I&#8217;ve watched this play out repeatedly. From LAMP stacks to MERN. From server-rendered applications to API-driven frontends. From Magento-heavy projects to Shopify-centric ecosystems. From owning and tuning infrastructure to living entirely on managed platforms.</p><p>Each transition forces uncomfortable questions. Who do we hire now? What do we even sell? How do we price work when platforms abstract away complexity we used to bill for?</p><p>The business doesn&#8217;t reset to zero - but it absolutely does not compound smoothly. It feels more like climbing a staircase in poor lighting. You take steady steps for a while, then suddenly the geometry changes. Miss one transition and you don&#8217;t crash dramatically. 
You just stop ascending while others quietly move ahead.</p><p>That&#8217;s what broken compounding looks like from the inside, at least in my experience.</p><div><hr></div><h3>Public tech companies aren&#8217;t immune, they just hide it better</h3><p>It&#8217;s tempting to believe this problem only applies to small and mid-sized companies or agencies like mine. Public tech companies, we tell ourselves, are different. They have scale, capital, and optionality.</p><p>But even the best examples show how fragile compounding really is.</p><p><strong>Microsoft</strong> didn&#8217;t smoothly compound from Windows dominance into cloud leadership. There was a long stretch where the company looked culturally and strategically stuck. Azure wasn&#8217;t an incremental extension; it was a full reset that happened just in time.</p><p><strong>IBM</strong> survived multiple computing eras, but survival isn&#8217;t the same as compounding. 
Its influence and growth flattened long before the narrative caught up, precisely because each platform shift diluted prior advantages.</p><p><strong>Adobe</strong> is often held up as a compounding success story thanks to its subscription transition. And it deserves credit. But that transition wasn&#8217;t optional. Without it, decades of dominance would&#8217;ve turned into structural obsolescence very quickly (looking at you, Sketch, Affinity, and Figma).</p><p>Even companies often seen as perpetual compounders tell the same story under the surface. <strong>Amazon</strong> didn&#8217;t compound by staying still. It repeatedly rebuilt itself, using cash flow from one layer to finance reinvention in another. <strong>Meta</strong> rode social platforms to enormous scale, only to discover how fragile that compounding becomes when user behavior or platform paradigms shift.</p><p>From the outside, these stories look inevitable, but from the inside, they are a sequence of narrowly avoided dead ends.</p><div><hr></div><h3>Survivorship bias is brutal in technology</h3><p>What we mostly see are the winners who crossed multiple resets. What we don&#8217;t see are the far more common outcomes.</p><p>Companies that didn&#8217;t fail loudly. They just stalled.</p><p>The product still works, and the team is still competent. Revenue still grows a little, but margins compress. Hiring gets harder. Strategy becomes reactive instead of intentional. The company turns into a maintenance machine for decisions made in a previous cycle.</p><p>This is the most dangerous failure mode for founders because it doesn&#8217;t feel like failure. And it&#8217;s the most dangerous failure mode for investors because it looks like stability.</p><p>Compounding doesn&#8217;t always end with a crash. Often it ends with a long, quiet plateau.</p><div><hr></div><h3>AI doesn&#8217;t fix this - it accelerates it</h3><p>There&#8217;s a seductive belief that AI finally restores compounding to software. 
Smarter tools, faster iteration, more leverage per person - what could go wrong?</p><p>Working with AI systems on a daily basis tells a different story.</p><p>AI dramatically lowers the cost of building, which is real progress. But it also dramatically shortens the half-life of advantage for anyone who doesn&#8217;t control their own economics. Many AI products grow quickly while sitting on top of models, infrastructure, and pricing decisions owned by someone else.</p><div><hr></div><h3>This looks familiar if you&#8217;ve lived through platform shifts</h3><p>If you&#8217;ve been building long enough, AI wrapper economics should feel uncomfortably familiar. </p><blockquote><p>Want to know what AI wrappers are? I&#8217;ve written another post about it: <a href="https://substack.com/home/post/p-182769365">https://substack.com/home/post/p-182769365</a></p></blockquote><p>It feels like the businesses built entirely on social media APIs, or the mobile apps whose fate depended on App Store policy changes. Or the SaaS products whose distribution vanished when a platform reprioritized its roadmap.</p><p>Those companies weren&#8217;t stupid. Many were well executed. Some even became case studies for a while. But when the platform shifted, past effort stopped compounding. Execution quality mattered less than position in the stack.</p><p>AI is replaying this pattern at a much faster pace, with far more capital and far less patience.</p><div><hr></div><h3>The real compounding question founders should ask</h3><p>Re-reading <em>The Psychology of Money</em> pushed me to reframe the core question.</p><p>It&#8217;s not &#8220;how fast can this grow?&#8221;<br>It&#8217;s &#8220;what survives when the reset happens?&#8221;</p><p>What still compounds when the underlying platform changes? What advantage carries forward? And what quietly evaporates?</p><p>The companies that manage to compound across cycles aren&#8217;t the ones that avoid change. 
They&#8217;re the ones that can absorb it without destroying their economics. They own something fundamental enough that resets don&#8217;t erase their leverage.</p><p>Everyone else is compounding inside someone else&#8217;s system, on borrowed time, under assumptions they don&#8217;t control.</p><div><hr></div><h3>A more honest model of tech compounding</h3><p>Compounding in technology isn&#8217;t a smooth exponential curve. It&#8217;s a staircase.</p><p>You accumulate quietly for years, then hit sharp transitions where the rules change. Some founders step up. Some slip. Some realize, too late, that the staircase moved while they were still climbing the previous step.</p><p>For a founder, this means staying paranoid about relevance even when things are going well. Comfort is often a lagging indicator of decay. For an investor, it means distrusting stories that project yesterday&#8217;s advantage too cleanly into tomorrow.</p><p>Compounding still matters. But in technology, it only belongs to those who survive, and that group is far smaller than the success stories make it seem.</p><div><hr></div><h3>What this means for investing in tech companies</h3><p>This compounding problem doesn&#8217;t stop at founders. It leaks directly into how tech gets funded - and why so much capital underperforms expectations across cycles.</p><p>For <strong>VCs</strong>, the core mistake is often assuming that early traction plus a big market automatically implies a long compounding runway. In reality, a lot of venture-scale outcomes are really bets on <em>timing a cycle</em>, not owning a durable advantage.</p><p>Many startups look like compounders only because the platform underneath them is still stable. When that platform shifts - cloud primitives change, distribution moves, AI pricing resets - the fund isn&#8217;t underwriting growth anymore. 
It&#8217;s underwriting the company&#8217;s ability to survive a reset it doesn&#8217;t control.</p><p>This is why so many VC portfolios end up with a strange shape: a few extreme winners that managed to align with or control a platform transition, and a long tail of companies that never quite die but also never deliver venture returns. They didn&#8217;t fail at execution. They failed at carrying leverage across cycles.</p><p>For <strong>public market investors</strong>, the problem shows up differently but just as painfully.</p><p>Public tech investing is full of implicit compounding assumptions. Revenue growth gets extrapolated. Margins are modeled as improving over time. Moats are assumed to widen with scale. But in technology, scale often <em>increases exposure</em> to resets instead of protecting against them.</p><p>A SaaS company compounding at 30% looks amazing, until a platform change turns pricing power into a negotiation. An AI-heavy product looks defensible, until inference costs or model access terms change. Suddenly the long-term margin story that justified the multiple doesn&#8217;t exist anymore.</p><p>This is why so many tech stocks don&#8217;t collapse; they just de-rate. Growth slows a bit. Margins disappoint slightly. Guidance gets conservative. The stock goes sideways for years while investors wait for compounding that never really resumes.</p><p>The uncomfortable truth is that many tech investments aren&#8217;t long-term compounding bets at all. They&#8217;re <strong>cycle bets disguised as forever businesses</strong>.</p><p>The investors who tend to do better over long horizons are usually asking different questions. 
Not &#8220;how big can this get?&#8221; but &#8220;what does this company still control after the next reset?&#8221; Not &#8220;is this growing fast?&#8221; but &#8220;where does the cost curve live, and who owns it?&#8221;</p><p>Because in technology, returns don&#8217;t accrue to the companies that grow the fastest inside a cycle. They accrue to the ones that still matter when the cycle ends.</p><p>And that&#8217;s a much rarer and much harder thing to underwrite.</p><div><hr></div><h3>Closing thoughts</h3><p><strong>The biggest mistake we make with technology, whether as founders, operators, or investors, is assuming that time automatically works in our favor.</strong></p><p>In many industries, it does. Experience stacks. Advantages harden. Compounding feels almost mechanical. In technology, time is far more conditional. It only helps if you are aligned with where the system is going next, not where it has already been.</p><p>Building in tech means repeatedly letting go of things that once worked. Investing in tech means accepting that many &#8220;great&#8221; businesses are only great within the boundaries of a specific cycle. When those boundaries move, the compounding story often breaks long before the narrative does.</p><blockquote><p>The uncomfortable reality is that technology rewards adaptability more than consistency, and positioning more than effort. That doesn&#8217;t make compounding impossible, but it does make it rare, fragile, and uneven.</p></blockquote><p><strong>If there is a single mental shift worth making, it is this: stop asking whether something is a good business today. Start asking whether its advantages survive a reset it does not control.</strong></p><p>Because in technology, the future rarely belongs to the best operators of the current system. 
It belongs to the ones who are still standing, and still relevant, after the system forgets its own past.</p>]]></content:encoded></item><item><title><![CDATA[The AI Boom Has a Graveyard No One Talks About]]></title><description><![CDATA[A Software Engineer's Perspective - The Hidden Dependency Behind Most AI Products And Why Many Will Fail In The Long Run]]></description><link>https://www.techfundamentals.blog/p/the-ai-boom-has-a-graveyard-no-one</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/the-ai-boom-has-a-graveyard-no-one</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Sun, 28 Dec 2025 14:20:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JcXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!JcXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JcXP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JcXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg" width="1456" height="867" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:867,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2876348,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techfundamentals.substack.com/i/182769365?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JcXP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JcXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5e77b5b-95ac-4c0f-9d3a-5c65c162e399_4486x2671.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p><br>I&#8217;ve been rereading <strong>The Psychology of Money</strong> by <strong>Morgan Housel</strong> recently, and it&#8217;s been hard not to see the parallels with how people are investing in AI right now. Housel&#8217;s core point is that most mistakes aren&#8217;t about missing information. They&#8217;re about how we interpret success after the fact.</p><p>That hit a little closer to home because I spend my days working as a software engineer, building and integrating AI products. I&#8217;m inside these systems constantly. 
I see where the power lives, where the costs show up, and how much of what looks like &#8220;AI innovation&#8221; is actually dependency dressed up as product.</p><p>AI investing feels like a textbook case of psychology getting ahead of system reality.</p><div><hr></div><h3>The Part of the AI Story That Feels Too Obvious</h3><p>Most AI conversations today sound strangely settled. People point to a few winners, talk about adoption curves, and speak as if the market structure was inevitable. Of course these companies won. Of course this is where the value landed.</p><p>That sense of inevitability is exactly what Housel warns about. When outcomes feel obvious in hindsight, we stop asking what had to go right and what risks are being quietly ignored.</p><div><hr></div><h3>Most AI Products Don&#8217;t Own the Intelligence</h3><p>This is something that becomes very clear when you actually work with AI systems every day.</p><p>Most AI products do not own the intelligence that makes them work. They sit on top of it.</p><p>A large share of AI startups are thin layers on top of a small number of foundational providers, most commonly OpenAI. They add a workflow, a UI, some prompt logic, maybe light fine-tuning, and wrap it into something usable. That can still be valuable. But it&#8217;s very different from owning the core system.</p><p>When people say &#8220;AI company,&#8221; what they often mean is &#8220;company that calls an API and makes it feel nice.&#8221;</p><div><hr></div><h3>Dependency Is Not a Phase You Grow Out Of</h3><p>From a software perspective, dependency always has a cost. In AI, it defines the business.</p><p>If you rely on an upstream model provider, you don&#8217;t control pricing, inference efficiency, performance improvements, or roadmap direction. Your margins and differentiation are downstream of decisions you don&#8217;t make.</p><p>That&#8217;s not a scaling issue you solve later. 
That&#8217;s the structure you&#8217;re building on.</p><p>One thing Housel emphasizes is how easy it is to underestimate risks that don&#8217;t show up immediately. Dependency feels fine early. It only becomes painful once you&#8217;re successful enough for it to matter.</p><div><hr></div><h3>Why So Many AI Companies Stall Instead of Fail</h3><p>This is why most AI companies don&#8217;t fail dramatically.</p><p>Early on, costs are falling, credits are generous, and competition is light. Usage grows and everything looks validated. Then scale arrives. Inference costs become very real. Customers start questioning pricing. Features you thought were differentiators turn into baseline expectations as models improve.</p><p>From the inside, nothing breaks. From the outside, growth just stops being impressive.</p><p>That&#8217;s survivorship bias at work. We remember the few that broke through and forget the many that quietly stalled.</p><div><hr></div><h3>The AI Graveyard Is Quiet</h3><p>Working with AI products daily makes this especially obvious. Most projects don&#8217;t collapse. They just hit economic ceilings.</p><p>Margins never show up. The product remains useful but not defensible. Eventually the company becomes strategically irrelevant or gets absorbed. These outcomes don&#8217;t make headlines, which is why the graveyard is easy to miss.</p><p>But it&#8217;s full.</p><div><hr></div><h3>The Wrapper Problem Isn&#8217;t an Insult, It&#8217;s a Constraint</h3><p>Calling something an AI &#8220;wrapper&#8221; isn&#8217;t an insult. It&#8217;s a description of where power sits.</p><p>If the underlying model improves, your product gets commoditized. If pricing goes up, your margins compress. If the provider launches something adjacent, your differentiation disappears. Switching costs are lower than they look once customers understand what&#8217;s underneath.</p><p>From a software standpoint, you&#8217;re competing on UX, speed, and distribution, not on system control. 
That can work for a while. It rarely compounds.</p><div><hr></div><h3>AI Feels Like Software, but Behaves Like Infrastructure</h3><p>This is where a lot of expectations break.</p><p>AI feels like software because you can ship fast and iterate. But once usage grows, it behaves like infrastructure. Hardware efficiency, energy costs, networking, and inference economics dominate outcomes. You can&#8217;t refactor your way out of those constraints.</p><p>If you don&#8217;t sit close to the cost curve, scale amplifies the problem instead of solving it.</p><div><hr></div><h3>Why Infrastructure Owners Sit in a Different Position</h3><p>This is why companies closer to the metal operate in a different reality.</p><p>NVIDIA and large cloud platforms like Amazon Web Services, Microsoft Azure, and Google Cloud sit where costs are set, not just passed through. That doesn&#8217;t mean they&#8217;re guaranteed great returns. It does mean they&#8217;re not structurally dependent in the same way most AI products are.</p><p>From an engineering perspective, owning infrastructure is painful. From an investing perspective, not owning it caps upside.</p><div><hr></div><h3>A Simple Test I Keep Coming Back To</h3><p>One simple mental model I keep using, both as an engineer and as an investor, is this:</p><p>What happens to this business if its core AI provider changes pricing or launches a competing feature?</p><p>If the answer is &#8220;that would really hurt,&#8221; then you&#8217;re not looking at a moat. You&#8217;re looking at a dependency that only feels manageable while conditions are unusually friendly.</p><div><hr></div><h3>Closing Thoughts</h3><p>AI is real. The technology is powerful. Working with these systems daily makes that very clear.</p><p>What <em>The Psychology of Money</em> helped reinforce for me is how much of today&#8217;s AI market is shaped by survivorship bias and hidden dependencies. A small group of companies own critical layers of the stack. 
A much larger group depends on them and hopes the economics work out.</p><p>Survivorship bias makes it feel like everyone has a shot.<br>In reality, most of the graveyard is made up of companies that never failed loudly. They just never escaped the stack they were built on.</p><p>The hardest part of AI isn&#8217;t building something impressive.<br>It&#8217;s building something that still works when the psychology shifts and the easy phase is over.</p>]]></content:encoded></item><item><title><![CDATA[Shopify: A Software Platform Disguised as a Retail Stock]]></title><description><![CDATA[Why Shopify is still misunderstood by the market]]></description><link>https://www.techfundamentals.blog/p/shopify-a-software-platform-disguised</link><guid isPermaLink="false">https://www.techfundamentals.blog/p/shopify-a-software-platform-disguised</guid><dc:creator><![CDATA[Daniel Kolb]]></dc:creator><pubDate>Fri, 26 Dec 2025 14:54:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vva8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vva8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vva8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vva8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vva8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vva8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vva8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg" width="1456" height="858" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:858,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!vva8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vva8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vva8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vva8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54a4a08e-9043-4d44-af21-9e7768e14ad7_2000x1179.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><strong>Disclaimer: </strong>This publication and its authors are not licensed investment professionals. Nothing posted on this blog should be construed as investment advice. Do your own research.</p></blockquote><p>At first glance, <strong>Shopify</strong> looks like a leveraged bet on consumer spending. When retail weakens, the stock sells off. When e-commerce optimism returns, it rallies. That surface-level framing misses what Shopify actually is.</p><p>Shopify is not a retailer. It is not a marketplace. It is not a logistics company in the Amazon sense. Shopify is an operating system for commerce. 
It is a modular, API-driven software platform that lets millions of small and mid-sized businesses run online commerce without building infrastructure themselves.</p><p>Understanding Shopify as a software platform, not a retail proxy, changes how you should think about its durability, margins, risks, and long-term upside.</p><p>This post breaks Shopify down from a technical and systems perspective and explains why the stock continues to confuse both bulls and bears.</p><div><hr></div><h2>Shopify&#8217;s core product is abstraction</h2><p>Shopify&#8217;s real product is not storefront templates or checkout buttons. Its real product is abstraction.</p><p>It abstracts away:</p><ul><li><p>Payments infrastructure</p></li><li><p>Tax compliance</p></li><li><p>Hosting and scalability</p></li><li><p>Fraud prevention</p></li><li><p>Inventory logic</p></li><li><p>App integrations</p></li><li><p>Cross-border commerce complexity</p></li></ul><p>For a merchant, Shopify replaces what would otherwise require multiple vendors, custom engineering, and ongoing operational risk. For a developer, Shopify provides a stable, opinionated platform with APIs that rarely break and documentation that actually works.</p><p>This abstraction is where the moat begins. Once a merchant builds workflows, themes, apps, and operational habits around Shopify, switching costs become real. Not contractual, but practical.</p><div><hr></div><h2>The merchant base is fragmented by design</h2><p>A common criticism of Shopify is merchant quality. Many stores fail. Many merchants are small. Average revenue per merchant looks unimpressive compared to enterprise SaaS.</p><p>This critique misunderstands Shopify&#8217;s strategy.</p><p>Shopify intentionally targets a fragmented, long-tail market. That fragmentation protects Shopify from customer concentration risk and from enterprise procurement cycles. A single merchant failing does not matter. 
Millions of merchants experimenting does.</p><p>Shopify is closer to a protocol than a traditional SaaS vendor. It monetizes experimentation at scale.</p><p>This also explains why Shopify optimizes for self-serve onboarding and product-led growth rather than sales-led expansion. The platform is designed to let merchants try, fail, retry, and grow without human intervention.</p><p>That design choice limits short-term monetization but increases long-term surface area.</p><div><hr></div><h2>Payments is the economic engine</h2><p>Shopify&#8217;s most important business decision was pushing hard into payments.</p><p>Shopify Payments is not just a convenience feature. It is the primary mechanism for monetization expansion. Every dollar of Gross Merchandise Volume that flows through Shopify Payments generates revenue with minimal incremental cost.</p><p>This shifts Shopify&#8217;s revenue mix away from flat subscription fees toward usage-based economics.</p><p>From an investor perspective, this matters for two reasons.</p><p>First, Shopify&#8217;s take rate improves naturally as merchants grow. No upsell call required.</p><p>Second, Shopify becomes structurally tied to merchant success rather than merchant count. This aligns incentives in a way many SaaS companies fail to achieve.</p><p>The trade-off is exposure to consumer spending cycles. When GMV slows, revenue growth slows. That volatility is real and should not be ignored.</p><div><hr></div><h2>Shopify versus Amazon is the wrong comparison</h2><p>Shopify is often compared to <strong>Amazon</strong>. This comparison is mostly useless.</p><p>Amazon is a centralized marketplace. Shopify is decentralized infrastructure.</p><p>Amazon owns the customer relationship, the discovery layer, the fulfillment stack, and the data. Merchants rent access.</p><p>Shopify explicitly does the opposite. Merchants own the customer, the brand, the data, and the traffic acquisition strategy. 
Shopify stays in the background.</p><p>This difference matters because it defines the ceiling and the risks.</p><p>Amazon can extract more value per transaction but faces regulatory scrutiny, merchant resentment, and internal conflicts of interest.</p><p>Shopify monetizes less aggressively but avoids platform abuse accusations and antitrust pressure. It is harder to regulate infrastructure than a marketplace that sets prices and competes with its own sellers.</p><p>Shopify&#8217;s model scales more quietly and more globally.</p><div><hr></div><h2>Fulfillment was a strategic misstep and a valuable lesson</h2><p>Shopify&#8217;s attempt to build a first-party fulfillment network was widely criticized, and correctly so.</p><p>Logistics is capital-intensive, margin-thin, and operationally complex. It conflicts with Shopify&#8217;s asset-light software DNA.</p><p>The decision to divest the fulfillment business was not a failure. It was a course correction.</p><p>What Shopify learned is important. Merchants want integration, not ownership. They want software orchestration, not Shopify-branded warehouses.</p><p>Shopify&#8217;s current approach focuses on connecting merchants to third-party logistics providers through software rather than competing with them.</p><p>This keeps Shopify aligned with its core competency while still participating in the value chain.</p><div><hr></div><h2>Operating leverage is real but delayed</h2><p>Shopify&#8217;s income statement has frustrated investors for years. Revenue grows, but margins lag. Stock-based compensation remains high. Profitability appears optional.</p><p>This is partially true, and partially misleading.</p><p>Shopify deliberately reinvested heavily in product, international expansion, and ecosystem tooling during a period of unusually cheap capital. That era is over.</p><p>The important question is not whether Shopify can be profitable. 
It is whether profitability scales faster than revenue once investment slows.</p><p>Because Shopify&#8217;s infrastructure costs scale sublinearly relative to GMV, operating leverage should emerge over time. Payments, apps, and services carry much higher incremental margins than subscriptions.</p><p>The risk is timing. Investors expecting near-term margin expansion may be disappointed. Investors with a longer horizon should focus on unit economics rather than quarterly optics.</p><div><hr></div><h2>The developer ecosystem is the quiet moat</h2><p>Shopify&#8217;s app ecosystem is one of its least discussed advantages.</p><p>Thousands of developers build specialized tools on top of Shopify. These apps solve niche problems that Shopify itself should never prioritize. The result is a marketplace of functionality that increases platform stickiness without increasing internal headcount.</p><p>This is classic platform leverage.</p><p>Every successful app makes Shopify more valuable. Every failed app costs Shopify nothing.</p><p>From a systems perspective, this is a compounding advantage that is difficult to replicate. Competing platforms need both scale and trust to attract developers. Shopify already has both.</p><div><hr></div><h2>Where Shopify can fail</h2><p>A serious analysis requires acknowledging failure modes.</p><p>Shopify is exposed to:</p><ul><li><p>Prolonged consumer spending weakness</p></li><li><p>Payment margin compression from competitors</p></li><li><p>Increased regulatory scrutiny on payments and data</p></li><li><p>Merchant acquisition costs rising due to ad platform consolidation</p></li><li><p>Platform fatigue if complexity grows faster than usability</p></li></ul><p>Shopify also sits downstream from platforms like <strong>Meta</strong> and <strong>Google</strong> for traffic. 
Changes in ad pricing or tracking rules indirectly affect merchant success and therefore Shopify&#8217;s GMV.</p><p>None of these risks are existential, but they cap short-term optimism.</p><div><hr></div><h2>Why Shopify remains interesting as a long-term investment</h2><p>Shopify is not a clean story. It is cyclical, volatile, and often expensive on traditional multiples.</p><p>But from a product and systems perspective, Shopify has built something rare. A globally scalable commerce operating system with embedded payments, strong developer gravity, and aligned incentives.</p><p>It benefits from:</p><ul><li><p>Long-term shift toward independent brands</p></li><li><p>Increasing complexity of global commerce</p></li><li><p>Merchant preference for ownership over marketplaces</p></li><li><p>Software-driven operating leverage</p></li></ul><p>Shopify is unlikely to dominate headlines the way AI infrastructure companies do. It compounds quietly, tied to real economic activity rather than speculative demand.</p><p>For investors willing to tolerate volatility and think in systems rather than narratives, Shopify remains one of the more structurally interesting platforms in public markets.</p><p>Not because it is perfect, but because it is hard to replace.</p>]]></content:encoded></item></channel></rss>