People ask me this often enough that I decided to put it on my home page:
“Aren’t you one of the SQL folks? Why would one of you join MongoDB?”
For me, it was always about delighting customers; SQL and relational were just a means to that end. Databases are amazing because of the promises they make to customers about their data: consistency, ease of use, and durability. I sat down with one of my old relational friends in October 2019, before MongoDB was even a gleam in my eye, and we agreed that 30 years into our careers, databases were still hard for operators and developers to use, especially developers (the cloud has made the operator's job a lot easier). They were also still unpredictable and didn’t defend themselves against misuse. Not only that, but scalability and distribution were bolted on as afterthoughts rather than built in as core elements of the product – making scaling difficult, impractical, or brittle.
In fact, my biggest frustration during the Aurora PostgreSQL project was how hemmed in we were. We had SQL on top, with ORMs on top of that – which left the product hard to use, with no real way to fix it. Who wants to embed SQL in their code? The PostgreSQL community, while amazing and inspiring, hemmed us in by not letting us change scale-out, transactions, or anything else of substance. Compatibility was sacrosanct. Not only that, but the PostgreSQL community typically takes 2-3 years to accept any architectural change into the code base, and every one is a negotiation. So at Amazon, for Aurora (both MySQL and PostgreSQL), all we could really do was innovate on the storage layer – mostly because that layer was just so very broken. The marketing claim that “Amazon Aurora is a new database” is just flat-out wrong – it’s an amazing distributed, replicated, fast storage system that’s glued onto the bottom of PostgreSQL and into the middle of MySQL.
As I got to know MongoDB, I realized that MongoDB has very few constraints. We own the language interfaces, so we can merge seamlessly with every language – people program MongoDB using their native data structures, not by embedding a different language inside C, Java, Node, Python, etc. We own the drivers, so we can seamlessly implement failover, scaled reads and writes, and client-side encryption. The servers are built from the ground up to offer scale and distribution – you can run a single MongoDB cluster across all three major cloud providers if you want. You can even run it on your laptop or in your own data center, which gives you a much better ramp from the data center to the cloud. The clincher was MongoDB’s vision for a full data platform via our Atlas cloud service. We have integrated search directly into the cloud offering – no additional infrastructure to stand up or manage. We let you federate queries across S3 and MongoDB transparently, and even age data into S3 automatically while still using the same queries.
After writing the above, the question I ask myself is why I waited so long to come here.