Note: I wrote this during my time as CTO at MongoDB and titled the original draft “A blog about multi-cloud I never published” — because I never quite hit publish. Came out of conversations with many CTOs whom I was trying to steer away from the false promise of cloud-agnostic frameworks.
I’m worried that many engineering teams are being led down to the river of lower productivity by the Pied Piper of multi-cloud agnostic frameworks. If you’re worried too, read on.
Over many decades, trends in computing have swung – pendulum-like – from centralized mainframes, to decentralized client-server architectures, and then back to more centralized browser-based cloud computing. And now it’s swinging again toward decentralization, with multiple clouds and multiple layers of clouds (central, edge, etc.).
With each switch, businesses optimize for what’s important to them; sometimes it’s cost, or security, or agility. One overriding impetus today is the speed of innovation they can achieve. Every time the pendulum swings, people try to get the best of the new thing without losing the benefits of the old.
In this article, I’m most interested in the cloud-distribution pendulum. There are now many major clouds, and north of 70% of companies have a multi-cloud presence. As if computing wasn’t complicated enough, now your company is on multiple clouds and probably still partially on-premises as well. This often leads companies to try to abstract everything to a single set of tools and interfaces, high level enough that all of their applications and operations can be “cloud agnostic”. Your CFO loves this idea, especially when it’s your CFO who probably negotiated the new cloud deal, or acquired the company with massive infrastructure on a cloud your company has never seen before; and yet it’s you who ends up asking for more budget or time to make your apps run across clouds.
I’ve been creating abstraction layers my whole career, starting with the operating-system-dependent layer at Oracle that allowed the Oracle RDBMS to run well on 50+ operating systems. Abstraction is a careful balance. If you go too high, it’s unproductive, wastes time, and damages your business. If you go too low, you’re in the weeds and can’t move forward. More on this in a minute.
First off, before I explain (and trash) this naive approach to multi-cloud, maybe we should validate why companies are moving to multi-cloud in the first place. Why wouldn’t one cloud be good enough for a company? Lots of reasons. We are moving from a world composed of the three hyperscalers you know to a much more diverse one. Providers like Tencent, Alibaba, and OVH are challenging the three major players in both features and geographic footprint. Countries are passing data-sovereignty laws that encourage the creation of clouds guaranteeing that their citizens’ data (and the management of it) doesn’t extend past their physical borders. And everybody is expanding capabilities at the edge - to address the high bandwidth and ultra-low latency of 5G mobile networks. To complicate matters further, many companies are adopting a hybrid approach, keeping parts of their infrastructure on premises for extended periods of time while also taking advantage of the cloud. So if you want to be at the leading edge of where the world is going, you have to be cloud-agile.

What’s an engineer to do with this multi-cloud imperative? There are multiple approaches. It’s tempting to design a layer that abstracts away the capabilities of every cloud to the basics - compute, storage, networking.
In my opinion, the over-zealous aspirations around containers and multi-cloud frameworks are the (il)logical conclusion of this trend. They take us too far away from actual great engineering. By going down this path, you lose the ability to achieve remarkable results - every cloud provider is innovating crazy-fast to provide better economics and amazing managed services. You’ll end up paying the cloud providers a premium price while only using the basic commodity features - and helping them maintain their high margins while doing it. For example, if you just use generic “compute”, you can’t take advantage of spot instances, nor use the different instance types and sizes efficiently. Same for storage. It’s like modeling a car as “something with a gas pedal, four wheels, and a windshield” and asking your team to win Le Mans.
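To make the “compute reduced to the basics” problem concrete, here’s a minimal sketch. All the names and fields are hypothetical (they don’t correspond to any specific cloud’s API); the point is only to show how a lowest-common-denominator layer silently discards the knobs that drive real savings:

```python
# Illustrative sketch: a "cloud-agnostic" compute request next to what a
# single provider's API actually lets you tune. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class GenericCompute:
    # The only knobs a lowest-common-denominator layer can safely expose,
    # because every cloud supports them.
    vcpus: int
    memory_gb: int

@dataclass
class ProviderCompute:
    # Example provider-specific knobs (not any real cloud's parameters).
    vcpus: int
    memory_gb: int
    instance_family: str   # e.g. compute-, memory-, or burst-optimized
    spot: bool             # interruptible capacity at a steep discount
    max_spot_price: float  # price ceiling for that spot capacity

def to_generic(req: ProviderCompute) -> GenericCompute:
    # Passing through the agnostic layer drops every provider-specific
    # optimization: the spot discount and the instance-family fit are gone.
    return GenericCompute(vcpus=req.vcpus, memory_gb=req.memory_gb)

batch_job = ProviderCompute(vcpus=8, memory_gb=32,
                            instance_family="compute-optimized",
                            spot=True, max_spot_price=0.12)
flattened = to_generic(batch_job)
```

After flattening, `flattened` can still be scheduled anywhere, but the scheduler no longer knows the job tolerates interruption or prefers compute-optimized hardware, so it has to buy on-demand commodity capacity - which is exactly the premium-price outcome described above.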
Alternatively, you can push back and go all-in on one cloud - but the writing is on the wall that this is dangerous for your business as you scale, won’t let you take advantage of innovation, and may end up being untenable as data laws become more complicated. And, as a side note, it’s especially painful whenever you want to negotiate the next deal with your cloud provider - they have the ingress and egress logs to know that you’re not multi-cloud and the price lists and sales strategies to take advantage of that fact.
Don’t get me wrong; you can abstract away some things, of course. If you have a stateless web app that fits nicely in a container, go for it, and scale the daylights out of it. But that’s a very small part of an enterprise architecture; provisioning, cost control, security, databases, monitoring, and event management are all crucial - and they’re all different on these clouds. DevOps, networking, and security are areas where you just can’t have a “one-size-fits-all” model. And monitoring and incident management require deep environment-specific expertise, or you’re practically guaranteed to be operationally mediocre.
The CEO of one large investment bank told us, “Multi-cloud is an opportunity for us to unlock the full value of each location, not water things down with abstractions and accept the lowest common denominator.” If you agree with this line of thinking, your cloud strategy should follow these rules:
First, let the workload dictate which cloud (or clouds) it lives on. There are literally dozens of considerations for choosing where your computing takes place. Geography. Reliability. Compliance. Performance. Let the strategic intent of each workload define the requirements of your cloud computing needs.
Second, take advantage of the best services possible wherever you can. Yes, there is a cognitive load to using different services on different clouds when you have to. If you can use the same service on all clouds (and it’s the best service available to you on all of them), that’s even better. (Hint: This is my only MongoDB plug in this article; our Atlas DBaaS and Application Data Platform run the same on all three major clouds).
Third, focus your teams on being the best they can be in every environment they deploy to. Listen really carefully when you hear “agnostic”, “framework”, “generic”, etc. You’ll hear people talk about how the value they’re delivering next quarter is all about these things, rather than about business value. Obviously good architecture, tools, etc., are essential; but don’t let them become the focus of your groups, or you’ll lose competitive advantage.
As I’ve said before, in the digital economy, our businesses are defined by the applications we offer to our customers, our business partners, and our employees. And the success of our businesses is defined by the quality of those applications, the speed with which we deliver them, and the quality with which we operate and support them. My advice is to approach your cloud strategy with your eyes wide open, and not let yourself be swayed by a fad but instead stay relentlessly focused on delivering great engineering and operations to your customers. This article came out of me talking to many CTOs and helping guide them away from this false cloud-agnostic fairy tale - I got worried enough that I wanted to share my worries more widely. I hope it’s been useful. I’d love to hear what you think about my characterization of the cloud landscape, and my recommended approach to cloud strategy. Reach out and let me know what you think, either on LinkedIn or at @MarkLovesTech.
