host

Kasper Borg Nissen

guest

Mauricio Salatino

Episode 38 · 37 mins · 2/19/2026

#38 - Beyond Kubernetes: Platform Engineering, Developer Experience and GenAI with Mauricio Salatino

About this Episode

Mauricio Salatino, open source and ecosystem engineer at Diagrid and author of Platform Engineering on Kubernetes, joins guest host Kasper Borg Nissen to break down why Kubernetes is a foundation, not just a standalone platform. They discuss ecosystem-driven platform design, reducing developer cognitive load, bringing feedback from production back into the ‘inner loop’ of development, and how generative AI is reshaping platform APIs and tooling. They explore whether Kubernetes is still waiting for its “Rails moment” - an opinionated, developer-first layer that makes building on it dramatically simpler.

Links mentioned in the episode:

Platform Engineering on Kubernetes by Mauricio Salatino (Manning):

https://www.manning.com/books/platform-engineering-on-kubernetes

The Evolution of Platforms: Gen.AI Edition (Blog Post by Mauricio Salatino): https://www.salaboy.com/2025/11/18/the-evolution-of-platforms-genai-edition/

Transcription

[00:00:00] Chapter 1: Introduction and Episode Overview

Kasper Borg Nissen: Hello and welcome to Code RED. I'm excited to be joining you as one of our new hosts. Now don't worry, Mirko will still be hosting as well. We're just bringing more voices into the mix to explore some new topics on Code RED. As I take the mic, I want to dive a bit deeper into the technical side and to highlight the community aspects of observability. Today, we are going to start by talking about how you can contribute to OpenTelemetry, both from a technical side, but also from a community perspective. Today's episode is all about platform engineering and developer experience on Kubernetes. Today's guest is Mauricio Salatino. Mauricio is an open source software and ecosystem engineer at Diagrid, a long-time contributor to projects like Dapr and Knative, an ex-member of the steering committee, co-chair of the TAG Developer Experience, and the author of the book Platform Engineering on Kubernetes. Today we will talk about what developer experience on Kubernetes really means, how platform engineering principles apply in the real world, how abstractions like Dapr help reduce cognitive load, where observability fits for developers, and how GenAI is starting to shape the next evolution of platforms. So Mauricio, thank you so much for joining us.

[00:01:14] Chapter 2: Biggest “Code RED” Moment and Keeping Up with Change

Mauricio Salatino: Hi Kasper, I'm really happy to be here. This is a pleasure, and I'm pretty sure that we will have a fun conversation around those topics. I'm really passionate about that. So I think that we have a lot of things to discuss.

Kasper Borg Nissen: Yeah. Me too. So let's kick things off with our traditional question. What has been your biggest Code RED moment, an incident, failure, or challenge in your work with Kubernetes or platform engineering that sort of changed how you think?

Mauricio Salatino: I think that for me, like as you mentioned before, I've been contributing to CNCF projects for quite a while, and I really like the idea of going to different projects and understanding how different communities tackle different challenges. And that has put me in a position where I do help a lot of customers, like other companies, to deal with these kinds of Code RED moments where they are having reliability issues, scalability issues. But most of the time I just engage with customer teams trying to solve those problems. So I try to help them to go through that. But personally, when you ask the question, I cannot stop thinking about writing a book about the Kubernetes ecosystem. When I wrote the book Platform Engineering on Kubernetes, it was really challenging to keep up with the Kubernetes community, all the things that were changing in the landscape and how projects integrated with each other. For me, that made me reflect quite a lot. And you can see that in the book itself: part of the big challenge of platform engineering is to keep up. Now, with AI, this is changing extremely, and it's changing fast. So how do you keep up while you build stuff that needs to support whatever is going to be built tomorrow? And I think that that's one of the things that have changed my perception on how I tackle problems and how I implement solutions for that.

Kasper Borg Nissen: That totally makes sense, especially now with, I think, the CNCF projects being over 200 now. Right? So there's so many things that you as a platform engineer, or even a developer to some extent, need to be up to date on. Right. It's crazy.

Mauricio Salatino: And you cannot know about everything, right? So you need to find ways of delegating and making the right decisions so you can adapt quickly to whatever comes next. Yeah, exactly. I think that that's the key nowadays in whatever you're doing.

[00:03:27] Chapter 3: Mauricio’s Role and Dapr’s Abstraction Model

Kasper Borg Nissen: Awesome. So before we dive deeper, could you introduce yourself a bit for our listeners and tell us about your current role and your focus areas?

Mauricio Salatino: Yeah, 100%. So I'm working as an ecosystem engineer for a company called Diagrid. Diagrid was created to build products around and on top of the Dapr open source project, which is a project that was born inside of Microsoft to provide abstraction from infrastructure. Right. The idea is you have APIs to talk to infrastructure, your cloud infrastructure, your on-prem infrastructure. So then you can build applications on top of that infrastructure without being tied to a given specific provider, right? The example that we use quite a lot is Kafka and RabbitMQ or Pulsar. And, you know, like the message brokers on cloud providers like Amazon SQS or Google Pub/Sub. The idea is you build applications once against a core set of APIs, and then you can move your application across environments quite easily. And because we expose APIs, at the same time you have cross-cutting concerns implemented behind those APIs, like observability, security, reliability, all core things that most platform engineers want to provide to teams nowadays. By working on the Dapr project, I learned a lot about those topics specifically, and that's kind of like one of the main reasons why I've been collaborating with Kasper on the observability side of things. Being a CNCF project and exposing OpenTelemetry metrics and data, we need to make sure that that, you know, works quite well with OpenTelemetry and observability providers. So, yeah, it's been quite fun.
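
Illustration (not part of the conversation): to make the "build once against a core set of APIs" idea concrete, here is a minimal sketch that publishes a message through a Dapr sidecar's HTTP pub/sub endpoint. The component name `orderpubsub`, the topic `orders`, and the helper function names are assumptions made up for this example, and the sidecar is assumed to listen on its default HTTP port 3500. Which broker sits behind the component (Kafka, RabbitMQ, a cloud service) is decided by component configuration rather than by this code.

```python
import json
import urllib.request


def publish_url(pubsub_name: str, topic: str, dapr_http_port: int = 3500) -> str:
    """Build the sidecar URL for publishing to a topic on a pub/sub component."""
    return f"http://localhost:{dapr_http_port}/v1.0/publish/{pubsub_name}/{topic}"


def publish(pubsub_name: str, topic: str, payload: dict) -> int:
    """POST a JSON payload to the sidecar and return the HTTP status code.

    Needs a running Dapr sidecar with a pub/sub component called
    `pubsub_name` (a hypothetical name in this sketch).
    """
    request = urllib.request.Request(
        publish_url(pubsub_name, topic),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a sidecar running, `publish("orderpubsub", "orders", {"orderId": 42})` would send the event; the application code stays the same if the component is later pointed at a different broker.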

[00:05:03] Chapter 4: Why Write “Platform Engineering on Kubernetes”

Kasper Borg Nissen: Yeah. As Mauricio mentioned, we have been working together a little bit on the observability side of Dapr. We've done a few talks and we'll be doing a few talks in the spring as well. So yeah, keep an eye out for that. So you're the author of this Platform Engineering on Kubernetes book, right? And you already mentioned a little bit about it in the intro. But can you, like, explain a little bit about what originally motivated you to write this book, and what problem did you sort of consistently see in organizations struggling with adopting Kubernetes?

Mauricio Salatino: It's not that I woke up one day and I said, I will write a book on platform engineering on Kubernetes, right? Like, I remember this was like 2021 when I presented the project to the publisher. Basically it wasn't named like that, because what I had in mind was basically a book about how the landscape, like the CNCF landscape, can be combined to create something that provides much more value than just plain Kubernetes. Right. And what I'd been doing until that point was to try different projects with different customers or different implementations, and then see how different tools were helping that customer or that company to solve a very specific problem. But most of them were focusing on the same projects and then trying to integrate them together. And after doing that five, seven times, I thought, okay, you know, now I have like a lot of examples on these 20 projects. All the companies are converging on the same point of saying, like, at least these five projects we are all using. So it kind of made sense to write a book just showing that combination of projects and how teams can actually do that, and why it's going to save them so much time to combine these things together instead of just making up your own solutions for that. So the book came out of that story, just telling that story and then showing the examples that I've been working on in a GitHub repo about, like, all these technologies and how they can be combined together. It took me two years to write, like I published it in 2023, and by the end it's a compendium of 20 different CNCF tools and other open source tools, and how they can be used together to create something that provides much better value for organizations. And it's super fun because when I published it, there were other initiatives starting just to do the same thing and to prove that the combination of projects made a lot of sense. So yeah, it was well received, I guess.

[00:07:36] Chapter 5: Kubernetes as Foundation, Not the Platform

Kasper Borg Nissen: Yeah. And I think it was very timely. Right. I remember back in, I think it was at KubeCon in San Diego, where Bryan Liles had a keynote mentioning Kubernetes being a platform for building other products, other platforms. I think this resonated, at least with me as well. Back then, I was a platform engineer or a DevOps engineer, SRE engineer, whatever we called it back in those days. So I think this makes a lot of sense, because at that point in time CNCF was ramping up on projects, and we as platform engineers needed to, as you mentioned in the beginning, learn and learn and learn new projects, figuring out how to stitch them together. And I think one of the key things that's also stuck with me in the book is that you sort of position Kubernetes as the foundation rather than the platform itself. Can you, based on experience, say why teams mostly go wrong in this, saying, hey, we have a Kubernetes platform and then we're all okay?

Mauricio Salatino: Okay. So there are like tons of different things there. I think that assuming that the journey will be easy is the most common one. Like saying, oh no, no, we'll just use Kubernetes and that's it, right? By just adopting a position like saying we will just adopt Kubernetes and that's it. You are missing out on all the other tools that you will need to be efficient, right? And by not understanding how difficult it is, you will spend a lot of time thinking that you can deal with it like building a platform or just managing Kubernetes at scale easily with a small team. But actually what you will realize is that you need a lot of experts on that space to actually do it correctly. So I think that, like underestimating how difficult it is, is kind of one of the key challenges nowadays. Of course, 2026 is much easier than what it was in 2021. There are tons of things that now actually work, like out of the box, that before you needed to spend quite a bit of time. And then I guess the normal thing of saying, well, we'll replace what we have today for like for Kubernetes and not thinking about like that developer experience or how like, you know, operations teams are going to manage this and observe kind of like the behavior of this very complex orchestrator. So that's kind of like the gist of it, like complexity, I guess. And then understanding for me, like the one of the drivers of writing the book, is for people to focus on the ecosystem more than on Kubernetes, more than the project itself. Right. Because tons of big companies have said, like, let's do Kubernetes and Kubernetes only, and then you spend like three, four years building all the tools that are already there in the ecosystem. So keeping an eye on the ecosystem is extremely helpful.

[00:10:19] Chapter 6: Bridging the DX Gap and the Search for a “Rails Moment”

Kasper Borg Nissen: Yeah. And if you're only working with Kubernetes, right, you also need to have a lot of experts on Kubernetes hired. And maybe that's not the best time spent for developers to actually learn Kubernetes in and out rather than building abstractions on top. Right.

Mauricio Salatino: Yeah, exactly. And I've seen tons of companies that were moving from like a developer-focused platform like Cloud Foundry or Heroku to Kubernetes, and there's a huge gap there, right? One of the things that I'm still surprised by is that we didn't come up with something like that yet, but it feels like it's coming. And now with AI again, that's changing a little bit. And I think that's good.

Kasper Borg Nissen: It's actually funny that you mentioned this, because one of the other things that Bryan mentioned in this talk back in 2019 was that Kubernetes needed its Rails moment. For me at least, I think platform engineering is sort of like driving towards creating a Rails moment for Kubernetes. At least that's the idea: to simplify and create abstractions for developers, to make it as easy as possible and have that Rails moment. And on that topic, I think you're also working right now on a new book around developer experience on Kubernetes. Can you talk a little bit about what that means? What is the developer experience in this context?

Mauricio Salatino: Yeah, from my perspective, I think that developer experience is all about how many tools you as a developer need to know in order to build applications that will run on top of Kubernetes, right? And how much of the ecosystem, how much of the process do you need to understand in order to get things done? The book that we are writing with Thomas, Thomas Vitale, is all about how do we improve both the, you know, the software development inner loop and the outer loop to make sure that the burden is not on developers, right. Because again, that's a common challenge that I see in companies going the Kubernetes route, saying, you know, okay, you are a developer. Now you need to understand how Kubernetes works and you need to create deployments and, you know, like all the manifests for deploying your applications there. And then understand how Kubernetes will run your applications. So that influences a lot how you develop your applications, like the architecture, but also the tools that you need to use in your local setup and in your inner loop in order to actually match what's expected on the other side. Right. So the book is all about that. Which tools will make you more efficient? What practices will make you more efficient? And now with AI coming really, really hard, it's like, if code generation is not something that humans will do, what do we need to provide agents with in order to have exactly the same things that we have as humans to be more efficient? Right. And that's tooling, that's, you know, again, like, how do you write tests? What kind of tests do you write? And which tools do you use for that? Again, it's just more about efficiency and making sure that, if it's agents, we don't blow the agent context. Right. Like we don't use all the tokens just to specify which tools to use. But if it's a person, a developer, how do we avoid pushing them to learn 10,000 tools to just do one thing?

[00:13:27] Chapter 7: Developer Pain Points on Kubernetes

Kasper Borg Nissen: Yeah. So when talking about developer experience, it's always about like fixing some pain points or making it easier. Right. What would be sort of the main pain points that you see developers facing today when they are interacting with Kubernetes-based platforms?

Mauricio Salatino: I think that this is more like an industry opinion, right? Like there are tons of providers that will give you Kubernetes for development, right? Like developer environments that are basically Kubernetes-based, and that's really good. But again, then the developer is forced to learn about what Kubernetes is and how it works, and then connect to maybe a remote environment to do that. The other way of doing it is like, no, well, you push developers to think about, okay, what's your application? And that's going to run in a container. So you give them the tools to simulate how the application will connect to other applications in a Kubernetes cluster, and tools like Teleport and, you know, like mirrord will help you to troubleshoot or run something inside the cluster and then just do debugging sessions and stuff like that. But still, it feels like the amount of tools is way too much. The experience is, again, not that Rails experience, it's not like the Heroku experience where I can go and then you will run it and you will scale it up and it will work. So I think that if you think about what the book is about and what the term developer experience means in this case, it's helping business developers, like developers that are creating applications, to have that experience. How do you focus on building the business logic while other teams and the platform itself will focus on running that for you?

[00:15:02] Chapter 8: Designing Platform APIs and Balancing Abstractions

Kasper Borg Nissen: But how do platform teams then decide what to hide and what to expose and what to standardize for developers? How are they creating this basically API towards the developers?

Mauricio Salatino: Yeah. So I think that's a very good question. And I think that that's company specific, like organization specific. Right. Like it really depends on the maturity of the organization. There are organizations that have pushed development teams to learn Kubernetes. And at that point you can fork, you can create two options, right. Like one that is more developer friendly, where you say, give me a repository, I will create a container and run it for you, perfectly fine. But you also provide all the escape hatches to go and touch, like, the behavior on the clusters, because it's very difficult to create a one-size-fits-all tool or API that developers will use. If your API is not providing a feature that they need, they will find a way to bypass it. And then things go really wrong.
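
Illustration (not part of the conversation): one way to picture a developer-friendly platform API with an escape hatch is a function that renders a Kubernetes Deployment from just a name and an image, while still accepting raw overrides for teams that need more control. The function names, the `overrides` parameter, and the example image are assumptions for this sketch, not something described in the episode or taken from a specific tool.

```python
import copy
from typing import Optional


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` merged in recursively.

    Nested dicts are merged key by key; any other value (including lists)
    replaces the base value outright.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def render_deployment(name: str, image: str, replicas: int = 1,
                      overrides: Optional[dict] = None) -> dict:
    """Render an apps/v1 Deployment from the few inputs a developer cares about."""
    labels = {"app": name}
    manifest = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }
    # Escape hatch: anything the simple inputs do not cover can be overridden.
    return deep_merge(manifest, overrides or {})
```

Most callers would only pass `name` and `image`; a team with special needs could pass something like `overrides={"spec": {"replicas": 3}}` instead of bypassing the platform entirely.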

Kasper Borg Nissen: With your experience from the Dapr project and the abstractions that projects like Dapr provide, how do they improve the developer experience on top of Kubernetes, and why should teams be careful not to, you know, create over-abstractions, so we abstract too much away from our developers?

Mauricio Salatino: I tend to like that idea. I don't know where I've heard that, but I tend to like the idea that if you are creating APIs, you need to create APIs for like 80% of the use cases, right? And then just provide that escape hatch so that you can go and bypass the APIs for very specific reasons, right? I think that, again, from the Dapr experience and from other projects' experiences like Spring, like the Spring community in the Java community, they heavily rely on a programming model. Right? So it's all about explaining what developers are supposed to do with these APIs and what these APIs were designed for, right? Because sometimes, if you provide an API but you don't provide that context of why this was designed in this way, developers will just, you know, developers will be developers, right? Like they will just go and use the API in a lot of different ways. So not providing that context is a mistake that I've seen in a lot of different communities. And that can be avoided by just making sure that the programming model on top of those APIs is clear and consistent.

[00:17:21] Chapter 9: Observability for Developers vs. Operations

Kasper Borg Nissen: Then, moving a little bit into observability: observability is often discussed as an infrastructure or platform-level concern. What does observability for developers mean, in your view, and how is it different from platform and infrastructure observability?

Mauricio Salatino: Yeah, that's a good point. I think that going back again to the book, right. Like that segmentation: the inner loop is what developers do, and what they do a lot. Right. Like this is a loop where they are just coding, making changes, testing things. Compared to the outer loop, which is more CI, how do we deploy things in an environment, it's very different, right? Like the kind of tools that they use and the kind of work that they do there. Right. One is complete runtime, like the application is running, and the other one is like, we're stopping things and starting things. We as developers tend to think more about ephemeral executions that run once. But with all the agentic stuff coming up really, really strong, I strongly believe that '26 and '27, like next year, will be the era of observability on the developer side of things, like observability on the inner loop. And I'm really interested to see how, for example, Dash0 and other observability providers do not pivot into that space, but start extending the experience for development teams. When you have agents doing a bunch of different things and interacting with repositories, with services, and generating code, you want to see how much different team members are using from that, and how much they are producing, not to validate what they produce, but more to understand the new ways of interacting and the new ways of building teams. Right? Yes. Yeah. I am really interested to see now, like if you think about the inner loop with Claude Code or any of the agents, right, you can have like ten terminals open doing work at the same time, and you need a way to see what's going on in these terminals. And nowadays it's not that easy. It's like you finish your day and you have ten terminals. You don't know what happened. You don't have any summary. So I do believe that observability is coming really strong in that space too.

[00:19:27] Chapter 10: Closing the Loop: Production Signals in Development

Kasper Borg Nissen: And extending a little bit on that, how do you see it sort of evolving like information from the outer loop going into the inner loop. So one example could be like having profiling running in your production environments and, and having that profile come back into the inner loop. To, to, I don't know, fix some changes that needs to be optimized because you are, I don't know, memory leaking or whatever is the issue. Right?

Mauricio Salatino: Yeah. And I think that's something that it feels to me. Right. Like of course this existed before platform engineering and all these new kinds of trends. But it feels to me that, at least in the Kubernetes community, that question is much stronger today. Right? Like now we talk about, okay, how do we feed back from the things that we are running into the development space, right. And I think that that question was the main reason why I joined the TAG Developer Experience at CNCF, because I want to make sure that projects in CNCF understand how to bring back, you know, information to development teams. Right now, the tools are designed not for developers. They are designed for operations, as you mentioned before, and it's going to be extremely useful to get that back to developers, but also to the agents that developers are using to craft code, right, or to change code, because it's completely different to just, you know, fix a bug by just generating code versus fixing a bug based on some numbers that you already have from your production environment.

Kasper Borg Nissen: Yeah. And I think a project like OpenTelemetry in this space now starts to make a lot of sense. Right? Because now you can standardize on formats that every project is standardizing on: these are the possible values that this can take. And that is very easy for an LLM to, like, interpret and work on. So I think that is a very interesting sort of development: the more we can push every project to be uniform in their adoption of telemetry, the better for the developer, the better for the community in general. But I totally agree with you. There's a lot of work still to be done in this space. Especially since many projects are designing for debugging, for maintenance, not for developers. So that's an interesting perspective there. Definitely.
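
Illustration (not part of the conversation): the "possible values this can take" point can be sketched with the `http.request.method` attribute from the OpenTelemetry HTTP semantic conventions. The set of well-known methods below is copied from memory of those conventions and should be checked against the current specification; the helper function name is an assumption for this sketch. Anything outside the known set is reported as `_OTHER`, so a consumer (human or LLM) only ever sees a small, closed vocabulary.

```python
# Well-known values for `http.request.method`, as understood from the
# OpenTelemetry HTTP semantic conventions (verify against the current spec).
KNOWN_HTTP_METHODS = frozenset(
    {"CONNECT", "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT", "TRACE"}
)


def http_method_attribute(raw_method: str) -> dict:
    """Return the span attribute for an HTTP method using a closed vocabulary.

    Matching is exact (case-sensitive); unknown methods collapse to `_OTHER`.
    """
    value = raw_method if raw_method in KNOWN_HTTP_METHODS else "_OTHER"
    return {"http.request.method": value}
```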

[00:21:51] Chapter 11: Profiling, Full-Stack Insight, and Agent-Oriented Telemetry

Mauricio Salatino: Yeah. And I don't know if the audience is super familiar with the profiling aspect of OpenTelemetry. Right. For me it was mind-blowing that you can run profiling on, for example, your production environments, where you can actually see, you know, from the operating system level to the application level, all the things that are going on. Right. So when I was writing the book that wasn't available, I hadn't seen any of those solutions. And it was something that was very difficult to explain to people, like, how do you evaluate if one of the open source tools that you're using on top of Kubernetes is causing you problems, right? And I didn't have any answers for that. Like it's more like, okay, you install this and you trust that this is going to scale, but in reality you need to actually understand the entire stack and be able to profile or understand how all the things that are running impact your applications. So for me, profiling is a huge, huge advantage. And yeah, on the standardization, I think that something that is keeping me awake at night is thinking that OpenTelemetry was created for observability. Right. And providers rely on that to get data and then just display that data to the user. Right. The same way that tools for developers are created for developers, and whatever output they provide is for developers. None of those things were designed for agents. So it feels to me that we will go through a phase where we start tweaking. So we have OpenTelemetry, but for agents, so we can export the data in a way that an agent can consume, and it's not extremely verbose. Or profiles, so you can actually get very specific things that agents can use. I haven't seen that change happening in developer tooling, but I know it's coming, right? If I'm running this tool to build my application, I want an output that is more accessible for agents than for humans.
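
Illustration (not part of the conversation): a rough sketch of what "not extremely verbose" telemetry for an agent could look like. It condenses a list of span records into a short summary of failing and slow operations. The record shape (`name`, `duration_ms`, `status`) and the function name are assumptions for this example; it is not an OpenTelemetry API or an existing exporter.

```python
def summarize_for_agent(spans: list, top_n: int = 3) -> dict:
    """Condense span records into a small summary an agent can read cheaply.

    Each span is assumed to be a dict with `name`, `duration_ms`, and
    `status` ("OK" or "ERROR") keys.
    """
    # Unique names of operations that failed, in a stable order.
    errors = sorted({span["name"] for span in spans if span["status"] == "ERROR"})
    # The slowest operations, longest first.
    slowest = sorted(spans, key=lambda span: span["duration_ms"], reverse=True)[:top_n]
    return {
        "span_count": len(spans),
        "error_operations": errors,
        "slowest_operations": [
            {"name": span["name"], "duration_ms": span["duration_ms"]}
            for span in slowest
        ],
    }
```

The design choice here is to spend the agent's context on a handful of facts (what failed, what was slow) instead of on every raw span.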

[00:23:42] Chapter 12: Treating the Platform as a Product and Measuring Outcomes

Kasper Borg Nissen: Definitely. Yeah, that's some exciting stuff coming in this space. I'm pretty sure of that as well. Yeah. Now, I think one other topic that is like upcoming these days, or very popular these days is treating your platform as a product. And I think this ties very much into the discussion around developer experience and platform engineering as well. But also observability, because how do you actually measure, like the success of like a platform, like an abstraction that you created for your developers? What signals or metrics would you recommend in measuring, like developer experience or platform effectiveness without falling into vanity metrics?

Mauricio Salatino: It's funny because, again, when you think about it, let's create this platform team inside this organization to create a platform, right? For me, that would be kind of like not the best way of doing it. It would be more like, let's create a platform to solve this very particular challenge. Right. Or this set of use cases. So the moment that you introduce a platform, you can validate if you're actually improving those use cases or not. And automation is like automation on those use cases. For me, platform engineering at the end of the day is all about automating tasks so people do not need to, you know, do stuff that is not needed and can be automated. But of course, because this is coming from the Kubernetes side of things, we focused a lot on infrastructure, right? It's about how do you provision clusters, how do you provision projects or applications, how do you deploy GitOps on top of clusters. And for those things we have tools that are automated, right. So I think that I would focus more on that, like the internal business teams' outcomes that you can get. Right. Like, so how much time does it take to onboard an application? How much time does it take to create a release, for example? Yeah. Yeah. Because on the developer experience side, you cannot focus on how many PRs a developer creates or how many lines of code they change. Because that's not useful at all, right? I don't think that it contributes to the discussion.
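
Illustration (not part of the conversation): an outcome metric such as "how much time does it take to onboard an application" can be computed from two timestamps per application. The pairing of a request time with a first-deployment time and the function name are assumptions for this sketch; real data would come from whatever the platform already records.

```python
from datetime import datetime
from statistics import median


def median_hours(intervals) -> float:
    """Median elapsed hours over (start, end) datetime pairs.

    For onboarding, `start` could be when a team requested a new service
    and `end` when its first deployment succeeded (both hypothetical fields).
    """
    hours = [(end - start).total_seconds() / 3600 for start, end in intervals]
    return median(hours)
```

Tracking the median before and after a platform change gives a signal tied to a use case, rather than a count of PRs or lines of code.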

Kasper Borg Nissen: No, no, I totally agree. It's about measuring the time you're saving or like the value you're producing. For your developers. So they can actually focus on whatever task they have. Right.

Mauricio Salatino: So I would say that like if you are trying to measure developer experience, I would just create a simple list of okay, today, these are the tools that developers need to learn in order to do their day to day tasks. Right. And if we can do whatever in the platform to reduce that number, or to provide a programming model that abstracts away some of those tools, and the platform takes the lead on saying, okay, no, you don't need to know how to create a container. We will create a container for you, for example. Right. And where do you draw the lines? I think that's how I would do it because yeah, like metrics for just the sake of having metrics. Yeah. I'm not a very big fan of that approach.

[00:26:27] Chapter 13: Collaboration and Continuous Developer Feedback

Kasper Borg Nissen: No. And I guess, I think a couple of years ago there were a few talks at KubeCon as well around developer happiness scores or something like that. Start simple, get some feedback. At least if you are working as a platform team, you should talk to your developers. Get some feedback on how it's performing, what value it brings. Right.

Mauricio Salatino: And yeah. And I think that's like another common mistake, right. Like when you have an isolated platform team that does not talk to developers. Right. Yes. Because developers will find a way to escape whatever the platform team provides; they will just do whatever they want.

Kasper Borg Nissen: Exactly.

Mauricio Salatino: Getting them involved is again, it's not the technical thing, but it's the thing that you must do.

Kasper Borg Nissen: Exactly. So in November last year, you published a blog post called The Evolution of Platforms: Gen.AI Edition. And that was very much focused on, like, GenAI in the platform engineering space. What patterns are you seeing emerge? And yeah, basically, can you talk a little bit more about GenAI from a platform engineering perspective? What will we see in the coming months or maybe even years?

[00:27:30] Chapter 14: GenAI Patterns in Platform Engineering

Mauricio Salatino: Yeah. I think that's an evolving space, right? For me personally. Right. Like, again, because as you can see, my background is developing. Right. So I do empathize a lot more with developers than with operations. But when you think about GenAI in the context of Kubernetes, right? I did a presentation with my friend Alexa from Bloomberg where we discussed how these companies start providing AI or LLM models as services for internal teams. Right. I would say that that's the main thing that a platform team will do: how do you enable some AI-enabled features or LLMs for internal teams to do work? And that takes you into different spaces. One is, again, what tools do you give developers to interact with the LLMs? Right. Like which SDKs and gateways and rate limiting and token, you know, filters and all that stuff. While if you look into the other side, it's how do you use GPUs to run these models efficiently, right? So for me, like, if you go to any KubeCon or CNCF event, you will see that, like, okay, this is all about running workloads that require GPUs and all the tools that you need to do that, like vLLM and DRA and all that stuff in Kubernetes, which I find extremely interesting. But again, it's very infra focused, like that's for infra people.

Mauricio Salatino: On my side, as a developer, I'm interested in how do I consume these things from my machine, or how do I conceptualize what's going to be available in the platform where I'm going to run my applications? And I've seen a lot of things going on in that space. There's the Docker approach of letting you run models locally; that's one way. The other approach is to just pay for your cloud service, and then we will put a gateway in the middle to provide access to developers. But something that I described in that article is that, as a developer, I now need to learn about new stuff like RAG, right? Retrieval-augmented generation. I need to understand these patterns in order to plan my applications correctly and make sure that agents, or any LLM, have the right information to use, information that is specific to the organization where I'm working. For me that was a big revelation for APIs, for example. For platform APIs, something that I described in the article is how do we soften those contracts? How do we create the right tools for agents to use our platforms as well, so we can provide further automation?
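
For readers new to the retrieval-augmented pattern mentioned above, the following Python sketch shows its basic shape: pick the organization's documents most relevant to a question, then place them in the prompt sent to the model. Everything here is a simplification for illustration. The function names are hypothetical, and keyword overlap stands in for the embedding-based search that real systems typically use.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Return up to k documents that share at least one word with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return [d for d in ranked[:k] if q & tokenize(d)]


def build_prompt(question: str, documents: List[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```

The resulting string is what would be sent to the model, whether that model runs locally or behind a gateway.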

[00:30:09] Chapter 15: Observability, Reliability, and Policy for AI Workloads

Kasper Borg Nissen: Yeah, and it's very different from where we are coming from, especially from an observability perspective. How do you provide insights into what the LLMs are doing? Because it's not just a single request and response anymore. It's multiple LLM calls. It's calls between different agents. It's tool calls, maybe even skills. There are all kinds of things involved in this space. So how do we expose the right things to our developers, and how do we make sense of all of these things going forward?

Mauricio Salatino: And personally, as you may know, I'm more focused on how do you make that reliable and reproducible. Part of what we are doing in the Dapr project and Dapr Agents is all about that. You have a long-running interaction of services and tools and agents and so on. How do you make sure that if something fails, you can keep doing the work without rerunning everything from the beginning? Because at the end of the day, tokens are limited and you need to be efficient with them. I think that 2025 was a year where we didn't care. We just used as many tokens as we wanted. But that's no longer the case. And I think that tools are getting better, but we still need to have the right policies and filters in place to do it in the right way.
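
The idea of resuming after a failure "without rerunning everything from the beginning" can be sketched in a few lines of Python. This is not how Dapr or Dapr Agents implement it; `run_workflow` and the JSON checkpoint file are assumptions chosen to show the principle: persist each step's result as soon as it completes, and skip completed steps on the next run.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict[str, Any]], Any]]


def run_workflow(steps: List[Step], checkpoint: Path) -> Dict[str, Any]:
    """Run named steps in order, skipping any already saved in the checkpoint."""
    done: Dict[str, Any] = (
        json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    )
    for name, fn in steps:
        if name in done:
            continue  # result already saved: do not spend tokens again
        done[name] = fn(done)  # each step can read earlier results
        checkpoint.write_text(json.dumps(done))  # persist before moving on
    return done
```

If a step raises, the results of the steps before it are already on disk, so calling `run_workflow` again with the same checkpoint path picks up at the failed step. Step results must be JSON-serializable in this sketch.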

Kasper Borg Nissen: Yeah. And I think that's also one of the things that you mentioned in the blog post: that Kyverno and kagent are doing some things so that you can at least introduce some policies in your cluster, to make sure that you're not giving the agent so much access that it could potentially delete everything. Right?

Mauricio Salatino: Yeah. And that's a hot topic nowadays, right? When I wrote the book, for example, I knew that identity would be one thing. Nowadays, if you think about creating a platform that spans cloud providers, the main challenge that you will face is how a service in one cloud provider goes and accesses S3, or, you know, buckets in a different cloud provider. How do you do that in a way that is controlled but also has the right policies to check before going and executing whatever, and how does the identity from one workload propagate to a different provider? I think that there's a lot of work being done there with SPIFFE and SPIRE, and I really want to see that materialize. And I'm really curious about how that relates to the observability story. How can we check identities from the OpenTelemetry point of view and then see, okay, this identity needs all these things? That would be interesting.

Kasper Borg Nissen: Yeah.

[00:32:47] Chapter 16: Fast-Changing AI Tooling and Kubernetes Gaps

Kasper Borg Nissen: But yeah, also from a platform engineering perspective, I think the role of platform engineers in this AI space is a little bit different. Or maybe it's not different; it's probably the same thing. But you are now dealing with new tools as well, a completely new landscape of different categories of tooling that you now need to provide. You need to stitch them together. You need to create abstractions and expose them to your developers in a way you haven't tried before.

Mauricio Salatino: And the big problem, or maybe not the problem but the challenge, is that tools are being created every day. They keep changing. And they were not designed for Kubernetes, while most LLM providers run their models on Kubernetes. So there is a big gap: we have tools that are not designed to work with Kubernetes, but at the same time they are using Kubernetes to run models. That gap needs to be fixed in some way, and I think that we will just keep pushing there. But as with everything, even with Linux and with Kubernetes, everything takes time. We need to go through a couple of iterations of companies building these large-scale platforms with AI, and then we will come up with the right tools to standardize that for the rest of the world.

[00:34:04] Chapter 17: Practical Next Steps to Improve Developer Experience

Kasper Borg Nissen: Definitely. So for platform teams listening today who want to improve developer experience in Kubernetes, what is one practical step they can take this week to make measurable progress? To sort of wrap this up.

Mauricio Salatino: From a platform team perspective, I think that having that practice of including developers in the platform discussion will lead you to figure out which libraries you can create across development teams to standardize work. So how do you do logging? How do you do, again, telemetry? How do you instrument different applications across teams? I think that's extremely valuable. The Dapr project helps you in that regard, but I think that having those discussions with your development teams is a practical step in the right direction. If you are in the Kubernetes space, you think about a platform API as an extension to the Kubernetes APIs. I encourage platform teams to go further and think about library distribution, standardization across languages, and actually abstracting away infrastructure. Think about what would be the first thing that you can abstract away from your infrastructure that will enable developers to have the same experience locally as remotely in your production environments. That would be something that I encourage people to think about.

Kasper Borg Nissen: Awesome. Mauricio, thank you so much for joining us on Code RED. This was a fantastic conversation, and I think a lot of listeners will walk away with a much clearer understanding of what platform engineering on Kubernetes really means, and also why developer experience has to be at the center of all of this.

Mauricio Salatino: We are heading there. I do see companies focusing on that. So please do share my Twitter handle, my Bluesky handle, and my LinkedIn if people want to reach out. I'm always happy to mentor people on open source projects and help them contribute. So if you are listening and you want to join an open source project, reach out. Let's have a chat and let's start your open source journey.

[00:36:02] Chapter 18: Wrap-Up and Resources

Kasper Borg Nissen: For those listening, I hope today's episode helped clarify how platforms, observability, and even GenAI fit together when the goal is enabling developers rather than overwhelming them. We'll include links to Mauricio's book, Platform Engineering on Kubernetes, and his blog post, The Evolution of Platforms: Gen.AI Edition, in the episode description. Thanks again for tuning in. I'm excited to continue these deep dives into platform engineering, developer experience, and the people shaping the cloud native ecosystem. We'll see you in the next episode.
