Redefining Platform Engineering - Startup Stories with Cory O'Daniel
Cory O'Daniel (Co-founder and CEO, Massdriver): I was a physics major originally. I did some software development on the side, mainly for fun. In my first year of college, I worked at a hospital in the data entry department and wrote a little script to do all my data entry. One day my boss walked by and asked what I was doing. I was very obviously reading the Dungeon Master's Guide at my desk. I explained to her that I had written a script to automate my work. I wasn't in the software department and hadn't signed a PIIA, so they offered to buy the software from me and give me a role on the IT help desk team. They also gave me $1000. I was very excited.
I worked in operations and IT in the healthcare industry for a while. I changed my major to focus on healthcare information systems because I thought it was what I wanted to do, and I became a HIPAA analyst. I did that for a while before moving to California in the early 2000s to join a startup, and that's when I started working in cloud operations. One of my first jobs was migrating a data center to AWS, which had just launched. I got pigeonholed very early into the cloud operations role.
Anton: Have you always wanted to launch a startup? Or was it an urge to improve the industry after years of working as an operations engineer?
Cory: I wouldn't say I've always wanted to, but Massdriver is my third startup. The inspiration came from a consulting role that I was doing.
I worked with Google Cloud, helping organizations migrate from data centers to GCP. Before GCP, I worked at a few small companies and experienced the pain of scaling up the operations team: not scaling the software, but scaling the workforce.
Fast forward a few years, and I'm helping migrate these multi-billion dollar organizations, and they are suffering from the same problems. They'd have 10 or 20 operations engineers but hundreds of software engineers. The software team would overwhelm the ops team with paper cuts. The ops teams were good at what they did but were inundated with technical debt that had been saddled on the business for years. They couldn't move fast enough.
This is the problem we want to solve: helping growing organizations scale their operations teams and practices effectively without compromising security or software delivery cadence.
Anton: Does the MVP look precisely how you planned, or have you had changes to the initial vision?
Cory: DX-wise, yes. Massdriver started out as just a ton of YAML. If you follow me on Twitter or have seen my tech talks, you'll know I love YAML. I even made these really gnarly "YAML AF" hologram stickers. I know many people love to hate it, but the idea that we have simplified so much of what we do in our industry down to configuration is genuinely remarkable.
The original version of Massdriver was very similar to an open-source project called Cloud Native Application Bundles (CNAB). The idea is that you can stitch together infrastructure and application infrastructure-as-code (IaC) to provision everything your application needs in one go. The original Massdriver was just this bundling idea that would let you roll out infrastructure and applications as one.
My co-founder Dave and I were working on a different startup idea, and we got into an argument about who had to do the ops work. We both wanted to work on the product and add business value, but someone needed to set up the Kubernetes cluster, database, monitoring, etc. I showed him an early prototype of the YAML-based version, and he said we needed to put a UI on it and solve this problem for everyone. So Massdriver was originally just something I was planning to open source, and then we saw the potential in it and started building the diagramming interface.
Anton: Massdriver offers to set up infrastructure in minutes. How is that possible? What is the technology behind the product?
Cory: We have a cloud-agnostic provisioning engine and a specification for handling I/O to the underlying IaC tools. These let us connect and stitch together various IaC runs and 'introspect' what they are provisioning. All of the existing infrastructure and service bundles are written by an actual human with deep experience in the cloud service.
We want to make secure operations accessible to all engineers without backing future operations personnel into a corner. We don't auto-generate IaC. Our bundles are built by an expert in that service, following compliance standards and reference architectures.
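One way to picture "stitching together IaC runs" is as a dependency-ordered walk in which each run's outputs become the inputs of the runs downstream of it. The following is only a minimal sketch under that assumption, not Massdriver's engine: the bundle names, the `provision_all` function, and the `fake_run` stand-in for an IaC tool invocation are all hypothetical.

```python
from graphlib import TopologicalSorter


def provision_all(bundles, run):
    """Run every bundle after the bundles it depends on.

    bundles: dict mapping a bundle name to the list of bundle names it depends on.
    run: callable(name, inputs) -> dict of outputs (stand-in for an IaC run).
    """
    outputs = {}
    # static_order() yields each node only after all of its dependencies.
    for name in TopologicalSorter(bundles).static_order():
        # Hand this run the outputs of exactly the bundles it declared.
        inputs = {dep: outputs[dep] for dep in bundles[name]}
        outputs[name] = run(name, inputs)
    return outputs


def fake_run(name, inputs):
    # Hypothetical stand-in: a real run would invoke Terraform, Helm, etc.
    return {"id": f"{name}-id", "received_from": sorted(inputs)}


# Hypothetical example graph: the app needs a network and a database.
BUNDLES = {
    "network": [],
    "postgres": ["network"],
    "app": ["network", "postgres"],
}
RESULT = provision_all(BUNDLES, fake_run)
```

Because each run only sees the outputs of the bundles it declared as dependencies, the same graph also records what each component is connected to, which is the kind of information an engine could later inspect.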
Anton: Who is your target audience? Is it developers with no DevOps experience, experienced DevOps engineers, or something in between?
Cory: DevOps experience or not, it's for developers who want to focus on building business value and want to avoid getting caught up in the details of the cloud. We are out to build a better PaaS-like experience without the lock-in and lack of extensibility. We don't prescribe a single solution for how to 'run your app.' It's easy to use, runs in your cloud, and allows you to construct golden paths that make sense for your business.
On the flip side, we are for operations teams. I don't know what percent of my life in operations I have acted as a glorified task executor for my engineering counterparts. We want Massdriver to enable engineers to self-serve without high cost or security risks to the business. Operations teams generally get caught up in the monotony; we want them to be able to focus on things that add value to the company, not just doing "DevOps" for the engineering team.
Anton: So your target audience is really technical people, and you still talk a lot about adding value to the business. How did you sell Massdriver to them?
Cory: We recently changed the way that we sell the platform. When we came out of Y Combinator, we were really gung-ho on enterprise sales. We racked up a lot of ARR pretty early by doing a ton of top-down cold calling. While it was very good financially for us early on, it is a much harder sales process.
So we've pivoted recently to focus on product-led growth (PLG), aiming at the engineers who are feeling these problems rather than the executives who might be hearing about them. It's much easier to sell it that way and find greenfield projects to help their teams experiment in the cloud.
We still have customers coming to us trying to get into platform engineering. They're caught up in the tediousness of DevOps, and they need help to get their heads above water. We love to help make these teams more efficient, but what we're really trying to focus on is mid-market companies that have yet to really invest in ops. They need a good path forward, and that's who we're trying to help today.
Anton: What about your first clients? How did you find the first companies that started using Massdriver?
Cory: I made a two-minute Loom video. It was terrible. The product looked like crap back then. I put the video on LinkedIn. People saw it and realized that they needed it. They reached out to us and bought within a few calls. From there, we went straight to top-down enterprise sales, a big mistake in retrospect. We should have stayed focused on PLG and content from the get-go.
One customer was a small, profitable startup that wanted to start scaling using some new cloud services. They didn't have the budget to hire an ops person yet. They really wanted to bring on another software engineer to work on features. They were able to do that with Massdriver. They hired a junior software engineer, and the platform handles all their operations.
Another of our first customers is one of the Big 3 consulting firms. They have an army of operations personnel, but interdepartmental billing is expensive and tedious. The ML teams use Massdriver to self-serve without going to their ops consultants. They don't have to become MLOps experts or learn how to deploy Kubernetes. They get to focus on doing what they know how to do.
Anton: You were talking about the pre-built infrastructure bundles in Massdriver. What are they, and who can be onboarded right away?
Cory: We have about 70 of them today, from databases to container orchestration to serverless. The goal is to have a bunch of bundles (and full architectures coming soon) with which any team can get started. We don't want to prescribe how to run your business; we want to meet you where you are and help you get to where you are going.
Users can also extend the platform by publishing their IaC modules (Terraform, Helm, CDK/Pulumi) to the Massdriver registry. We want to expand operational potential. We also support eight application runtimes, with three currently in beta, including Databricks. Adding runtimes to Massdriver is also open source. You can open a pull request to our application-templates repository and add new workload types in a few hours.
We've had customers go from a new account to running cloud-managed databases and container apps in less than an hour — in their cloud account, no DevOps unicorn required.
Anton: Do you have a community around the product already? Maybe in Slack or somewhere else?
Cory: Yeah, we primarily use Slack. We still need to do a better job of community building, and we're starting to lean into that. We have a public roadmap that customers contribute to. We are open-sourcing a lot of Terraform bundles and application templates and will soon be open-sourcing our provisioning engine.
Anton: You believe teams should switch to platform engineering. What is it, and how does Massdriver help with it?
Cory: I think DevOps is mostly bullshit. Our industry is growing at an astronomical rate: in the US, 5x faster than any other career. We are at the point where most of the industry is in their first few years doing professional development with little to no cloud experience. At the same time, our world is getting more and more connected. Every industry is in the software industry now.
Teams need to assess where they are. If they don't have deep ops expertise, then a PaaS or an internal developer platform is where they need to be, as an ethical obligation to their customers.
The idea of DevOps is great, in theory. It works great in a side project or an MVP. It doesn't work well when developers need to ship features. It's a hindrance to their core responsibilities to the business — delivering user value.
If an organization has an experienced ops team, it needs to figure out how to enable self-service by engineers, with guardrails, so the business can continue to ship code effectively and efficiently.
So, yes, they should be looking at running on a platform. YMMV on building it yourself.
What's the difference between Platform Engineering and DevOps? Perspective.
DevOps is something an engineer does alongside their feature development. Platform engineers do DevOps too. The difference is that the output of the platform engineer is the product. It's not a second job they do alongside their first.
It's the difference between automating your deployments and building a resilient piece of software that automates your deployments.
Anton: Security is one of the primary hurdles today. How does Massdriver manage to be secure by default?
Cory: We do several things in the security space. The first is that we incorporate each cloud's reference architectures into any bundle our team publishes to our package manager. This gives teams without ops personnel access to secure components to build their infrastructure. We take strong stances on security. No options are exposed to make unencrypted data stores. We don't allow databases to go on public IPs. We automate many of the basic things that are easy to mess up.
Secondly, we build in secrets management between infrastructure and applications. We know the dependency hierarchy of your system and can directly inject database credentials, tokens, etc., into your application's runtime. No HashiCorp Vault to set up, no AWS Secrets Manager, and no copy/paste. Just credentials where they belong.
And last but not least, we automate IAM roles and policies on all three of the largest clouds. We have a type system that describes what is in infrastructure components and a directed graph of dependencies. This allows us to automate IAM with the principle of least privilege. No more Swiss cheese IAM policies.
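To make the last point concrete, here is a minimal sketch of how least-privilege policies could be derived from typed components and a directed dependency graph. It is an illustration under stated assumptions, not Massdriver's type system: the `Component` and `Edge` classes, the `ACTIONS` table, the component names, and the ARNs are all hypothetical, and only the AWS action names themselves are real.

```python
from dataclasses import dataclass

# Hypothetical table: which actions a given access level needs
# on a given resource type. Anything not listed is never granted.
ACTIONS = {
    ("s3_bucket", "read"): ["s3:GetObject"],
    ("s3_bucket", "write"): ["s3:PutObject"],
    ("sqs_queue", "read"): ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
    ("sqs_queue", "write"): ["sqs:SendMessage"],
}


@dataclass(frozen=True)
class Component:
    name: str
    kind: str  # the component's type, e.g. "s3_bucket"
    arn: str


@dataclass(frozen=True)
class Edge:
    consumer: str  # the workload that needs access
    resource: str  # the component it connects to
    access: str    # "read" or "write"


def policy_for(consumer, components, edges):
    """Build a policy granting only what the consumer's edges require."""
    statements = []
    for edge in edges:
        if edge.consumer != consumer:
            continue
        target = components[edge.resource]
        statements.append({
            "Effect": "Allow",
            "Action": ACTIONS[(target.kind, edge.access)],
            "Resource": target.arn,
        })
    return {"Version": "2012-10-17", "Statement": statements}


# Illustrative graph: an API writes to a queue, a worker reads from it
# and writes results to a bucket.
COMPONENTS = {
    "jobs": Component("jobs", "sqs_queue", "arn:aws:sqs:us-east-1:123456789012:jobs"),
    "results": Component("results", "s3_bucket", "arn:aws:s3:::results/*"),
}
EDGES = [
    Edge("api", "jobs", "write"),
    Edge("worker", "jobs", "read"),
    Edge("worker", "results", "write"),
]
API_POLICY = policy_for("api", COMPONENTS, EDGES)
WORKER_POLICY = policy_for("worker", COMPONENTS, EDGES)
```

In this sketch the API's policy contains a single statement allowing `sqs:SendMessage` on the one queue it is connected to. Nothing is granted that the graph does not call for, which is the principle of least privilege in miniature.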
Anton: What are your current plans for Massdriver? Where are you moving as a product?
Cory: Massdriver is a great way to manage infrastructure and app services. It's easy to get engineers in there owning their cloud infrastructure, but Massdriver is very much a one-way street today. You log into it, configure some infrastructure, deploy some stuff, and you might not be back in Massdriver for a few weeks until you need another change.
We already do automated monitoring and alerting. So if you configure infrastructure in Massdriver, you get alarms and monitors by default. You can fine-tune them if you want, but we base them on your configuration, giving you alarms similar to PagerDuty. When an alarm goes off, you get an email. Click the link, and it takes you right to the infrastructure diagram, highlights the component that's experiencing the issue, and shows you what the alarm is.
We want Massdriver to not only be a way to configure your system and remediate problems, but we also want you to be able to see the health of your system. In January, we'll be rolling out automated metric visualization on the diagrams. You'll be able to log in to Massdriver, configure infrastructure, and immediately see the health of your infrastructure, RAM, CPU, traffic, etc., right there.
We're working on automated SLIs for application services using RED (rate, errors, duration) signals. We want to give teams really good SLIs by default and let them fine-tune those as they get more familiar with how their application performs.
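For readers unfamiliar with RED, the sketch below computes the three signals over a window of request records. It is a generic illustration, not Massdriver's implementation: the `Request` record, the `red_signals` function, the choice of 5xx responses as "errors", and the use of a nearest-rank 95th percentile for duration are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    duration_ms: float
    status: int  # HTTP status code


def red_signals(requests, window_seconds):
    """Compute rate, error ratio, and p95 duration for one time window."""
    total = len(requests)
    if total == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p95_ms": None}

    # Rate: requests handled per second over the window.
    rate = total / window_seconds

    # Errors: share of requests that failed server-side (assumed: 5xx).
    errors = sum(1 for r in requests if r.status >= 500)

    # Duration: nearest-rank 95th percentile, i.e. the value at
    # rank ceil(0.95 * total), computed with integer arithmetic.
    durations = sorted(r.duration_ms for r in requests)
    rank = (95 * total + 99) // 100
    p95 = durations[rank - 1]

    return {"rate_per_s": rate, "error_ratio": errors / total, "p95_ms": p95}
```

A default SLI could then be a threshold on each value, for example an error ratio below some target, which a team tightens or loosens once it knows how its application normally behaves.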
Anton: What do you think is the future of operations? Where will it all go, and how will it develop?
Cory: Platform engineering is the future. And again, whatever way you need to get to a platform, get there, whether that's using a PaaS or an IDP or building your own platform engineering team. So many new engineers are going to be joining the workforce by 2030. We already have so many engineers who either aren't interested in the operations side or don't have deep operations experience but still need to run applications at scale.
We need to make it easier to do that moving forward. We're in a post-Heroku world. They changed the way we thought about delivering software. It's funny when you think about software. Not much has changed in software in the past 70 years. We've got some new programming languages, but it's all syntactic sugar around variables, loops, and conditionals. But when you think about how we deploy software, everything has changed multiple times in the past 10 years. We went from data centers to VPSs and web hosting. Now we're in containers and serverless. We even have serverless containers! It's changing at a hectic pace.
I would love to see more companies doing what we are doing. We need to give engineers platforms, but we can't back their businesses into a corner. We need to provide them with high-level components, so it's easy for them to design their platforms, own their data, and adjust course when necessary. As an industry that has taken over the world, we can't simply sell the same handful of opinionated platforms to every business under the sun. Organizations and teams are unique, and so is how they run their apps.
Anton: Do you have something you want to add?
Cory: I'd love people to check out the platform. Billing is usage-based, and we don't charge for the first $500/mo of cloud spend. It's a generous free tier for hobbyists and startups. We're also an AWS Activate Provider, so if you are running on a PaaS today and want to migrate to AWS, our customers get $5000 of credit on AWS to get started, and we're always happy to help figure out the details of a migration. We've migrated some very large businesses to the cloud.
Anton: Great. Thanks a lot, Cory! It was a super interesting conversation about operations, DevOps, and platform engineering.