Episode 20 · 35 mins · 3/6/2025

Behind the Screens: Inside Dash0

Host: Mirko Novakovic
Guest: Ben Blackmore
#20 - Behind the Screens: Inside Dash0 with CTO Ben Blackmore

About this Episode

Dash0’s own CTO, Ben Blackmore, joins Mirko Novakovic to discuss the evolution of observability at Dash0. They dive into why they’re focused on building RUM capability, how they invented their AI-powered logging tool, and where they see more opportunities to save customers time with better data. The two founders also reflect on their 13-year journey together, including what it’s like to transition from engineer to manager in a fast-moving startup.

Transcription

[00:00:00] Chapter 1: Introduction to Code RED

Mirko Novakovic: Hello everybody. My name is Mirko Novakovic. I am co-founder and CEO of Dash0, and welcome to Code RED: code, because we are talking about code, and RED stands for requests, errors, and duration, the core metrics of observability. On this podcast, you will hear from leaders around our industry about what they are building, what's next in observability, and what you can do today to avoid your next outage. Today my guest is Ben Blackmore, our CTO and co-founder at Dash0. Ben and I go back a long time. We worked together at codecentric, then at Instana, and now at Dash0. Hi, Ben.

[00:00:44] Chapter 2: Incident at Instana

Ben Blackmore: Hey, Mirko. Thanks for having me.

Mirko Novakovic: Yeah. And now my first question for you as well: what was your best Code RED moment, or worst?

Ben Blackmore: Yeah, I mean, we did have quite a few at Instana in the early days. Most of our servers accidentally terminating was a very fun one. But a really good one for here happened around year three or four at Instana, with website monitoring. So we had this script tag, right, that customers could embed on their websites. They were loading a piece of JavaScript, and they were also sending us data through that. As it is in a startup, you start with something simple. And what happened is we eventually had quite a few customers using that, and the ingress point that we had, which was basically just two small NGINX instances, got a bit overloaded. The super interesting thing with that one was that, from a technical perspective, it was not a very complicated problem. These NGINX instances got overloaded; we needed to tweak the configuration. But what was really interesting is the impact, because customers were loading the script from our servers, so this actually impacted their websites, and we had quite a few e-commerce sites, right. Their websites got slower, which is of course absolutely terrible. That's the last thing you want to do. You don't want to impact your customers. So that was a bit scary: seeing that happen, seeing relatively clearly what we had to do, but also not having a really good way to test it, right? Because, I mean, you don't have two or three other NGINX instances lying around with that amount of traffic.

Ben Blackmore: So yeah, we had to go in, manually change the config, reload it, and hope for the best. Also interesting: NGINX is a really, really good piece of tech, but when you are reloading the configuration under a really high amount of load, that's a bit scary, right? When this piece of software that normally responds instantly and only needs a few megabytes of RAM suddenly gets stuck on startup for a moment, you're like, oh shit. And that one was just really interesting, because on one side you have the technical complexity of the incident, and on the other side you have the impact it's producing. This one was relatively low complexity but super high impact. And we really had to solve it at that point, because end user traffic is highly seasonal. We hadn't hit the evening spike yet, and we were really afraid of what was going to happen at the peak. Is everything just falling over? And when we acted, we didn't have really good metrics on it, because we had the free version of NGINX, which has like five metrics that don't really tell you anything. The paid version comes with a lot more, but, I mean, we were a startup, right? We didn't want to pay 6,000 a year just for a few additional metrics. So that was a pretty interesting one.

[00:04:02] Chapter 3: Web Developers vs. Other Developers

Mirko Novakovic: Yeah, I can imagine. Talking about this, right, RUM and these things: I mean, I was recently talking to a lot of customers, enterprise-level and SMBs. And then I talked to you, because you are a very experienced web developer. And for me, it was kind of weird. What I'm hearing from customers is that they have an observability stack, and you think all the developers are using it, but then you figure out that the web developers actually have a totally different stack. They use tools like Sentry and other things for error tracking, and some sort of RUM session recording tools. So you explained it to me, but explain it to me again: why are web developers not normal developers, why are they using different tools, and why are they thinking differently about code than the rest?

Ben Blackmore: There's probably a variety of reasons for it, and I'm probably over-generalizing now, but I think in large parts web developers care about errors, right? That's just the thing they really care about. And when I say web developers, I mean people writing JavaScript for most of the day, the stuff that is running in the browser. Many are just consuming APIs, so they are not necessarily the ones responsible for them; of course, that depends on the organization. And performance is not that often a problem, at least not one you need RUM for, whether you call it RUM or EUM, right, it's all the same. I think the most important signal to them is that there must not be a bug in it, right? It should not be exploding in my customer's face. So I think they just have less of a need for tracing, or they think they have less of a need for it. We see that a lot internally too: once you show people the value of tracing, for those using it for the first time, it is really incredible, if they have the need for it and if they need it to debug and troubleshoot what they are responsible for.

Mirko Novakovic: Yeah. I was also thinking about it a bit and I came up with a theory. Maybe I'm totally wrong; that's why I will tell you my theory and then you tell me if it makes any sense. In the server landscape you have a lot of concurrency problems, right? It's about load, concurrency, distribution. Whereas in the browser, the code is essentially executed on the client of each individual user. So you actually have a lot of power, and it's not a concurrency issue in that sense. And so if you debug or analyze performance problems, you don't have to do it under load. You can essentially do it in your browser, and the browsers now have pretty good support for performance analysis. You kind of have tracing in the browser, right, which shows you how things are executed. I'm assuming that because of that, it's much easier for a web developer to troubleshoot, because they don't actually need production data for looking at performance. They can just take the browser developer tools and analyze it in the browser. So "works on my PC," basically, right? Optimize it, roll it out. And then the other problem you described is errors. And errors can be very different, right? They can depend on the browser type, on the version, on the operating system, so you can't reproduce them on your own PC. But does that make sense?
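For context: the browser tooling Mirko mentions sits on top of standard web platform APIs, so the same timing data is also scriptable. A minimal sketch using the standard PerformanceObserver API, nothing Dash0-specific:

```typescript
// Observe resource timings and long tasks directly in the browser,
// the same data the devtools performance panel visualizes.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.entryType === "resource") {
      // Each resource entry breaks a request into phases: DNS, TCP,
      // time to first byte, download, and so on.
      const res = entry as PerformanceResourceTiming;
      console.log(res.name, {
        ttfb: res.responseStart - res.requestStart,
        download: res.responseEnd - res.responseStart,
        total: res.duration,
      });
    } else if (entry.entryType === "longtask") {
      // Main-thread blocks of more than 50 ms, a common source of jank.
      console.log("long task:", entry.duration, "ms");
    }
  }
});
observer.observe({ entryTypes: ["resource", "longtask"] });
```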

Ben Blackmore: No, it absolutely makes sense, because in the end, what a web developer often wants to know is: what is the path taken to the problem? What are the steps? What did the user click on? Did this happen immediately, or maybe five or ten minutes afterwards? And all of that is often a lot easier to reproduce in the web browser, right? It's also a somewhat luxurious position when you compare it to a backend or platform developer, because end users can just refresh the page; they can restart the application, essentially. No backend developer can do that. But I totally agree: the problem dimensions and how you troubleshoot them are very different. And if you look at some of the solutions out there in the error monitoring space, it's a lot about the path to the problem. So it's the problem itself, but also the path taken to it, so that you have an idea of what happened and in what context it happened.

[00:08:12] Chapter 4: The Role of Real User Monitoring (RUM)

Mirko Novakovic: Yeah. The stack trace. Right. Getting the context of the error and stuff like that. Right.

Ben Blackmore: Yes, exactly. Exactly.

Mirko Novakovic: Yeah. It's interesting. I still believe it would make sense to have everything in one context, right? Because at the end of the day, it's one system. Yes, I get the point that it's running in the browser, I can reload it and all these things, but at the end I have an end-to-end system, and the browser is probably calling my APIs, multiple of them. And sometimes there is an issue and I don't know if it's network latency or actually a problem with that API responding, and having end-to-end traces makes sense. I do get that it's a different set of problems, but I still think it makes sense to build it into one tool.

Ben Blackmore: It absolutely makes sense, especially when developers take more ownership of what they're responsible for. If your responsibility fully ends within the web browser, then of course what you're looking at is different. But if you are really responsible for that subsystem, for that service, whatever you might call it, then you need a lot more context, and you are also interested in solving things more holistically, let's maybe call it that. And then you need that data, and you will want to know exactly what has happened. I always find that interesting, right? You show the tracing data to people who don't really know what the API or the service is doing behind the scenes, and you give them some understanding of it. Then they see: oh yeah, okay, so it's making some database calls, they are taking that long, maybe they see the SQL statement. It gives them a much better understanding of the system as a whole and what is going on, and an appreciation for the other side as well.

Mirko Novakovic: Yeah, absolutely. And let's understand a little bit how OpenTelemetry works here. I mean, we know that there are the three signals: spans, logs, metrics. But there's also now standardization work around end user monitoring. And I know we call them beacons when we send them, right? I'm not sure if there's a new word here or if it's reusing existing OTel things. So how does it work? Why do I need a different concept? What's the state of it? Give us a little bit of background on that.

Ben Blackmore: So yes, the signals: metrics, logs, traces. In the OpenTelemetry world you have essentially the JavaScript distribution, OpenTelemetry JS, which has, broadly speaking, two modes it can operate in: one within a Node.js context, the other within a browser. So there are essentially two slightly different compilations of that thing. And you could run the OpenTelemetry JS variant for the web browser to essentially collect logs and spans. In theory, you could also gather metrics, but that has a little bit of funkiness associated with it, so not many people are really doing that. The majority of people, if anything, would be collecting spans and logs. And you can do that, right? There are guidelines on the OpenTelemetry website to set it up. You import it as a library, configure it, and then you can use it to generate that telemetry; that works somewhat normally. There is, however, still a lot of discussion around exactly what the data format for RUM data should be. Right now the majority of it is actually spans, in a variety of formats, whatever the trace boundaries are. There's a lot of discussion around it; there are some OpenTelemetry enhancement proposals out there on exactly the format, and on whether some of that should be converted to log events.

Ben Blackmore: A lot of internals that hopefully shouldn't matter to the end user, but there's still a lot of discussion, because end user data and website monitoring data is often rather long-lived, like one session. If you think about a session or a page load: where does the trace start and end? There's a lot of bikeshedding around this, right? You can have a whole lot of perspectives on it. And there's also a whole lot of activity which is rather instantaneous, and this is where a lot of the discussion comes from: is that a span, or is it a log, or whatever. We actually had this at Instana originally as traces as well. All the website monitoring we did, everything, was essentially spans. We came up with some trace boundaries that we thought were reasonable. But I think within a year we came to the realization that, while we can use traces for it, it doesn't really make too much sense, because the structure is often completely shallow. You don't have any depth within that trace in the majority of cases. It's like: oh, a page loaded; oh, there's one request, and then there's nothing nested underneath it.

Ben Blackmore: So we realized, hey, we don't need this, and it was also creating some problems within the trace processing pipeline. So we essentially ditched that and said, okay, we can simplify this a lot, we can make the protocol more efficient, and so on. And there have been similar discussions on the OpenTelemetry side as well. There's a super huge issue on GitHub along the lines of: hey, should we rethink this? Should this be a different signal? I think in the end, it doesn't really matter that much what signal it is, as long as it is consistent. And that is still a journey that OpenTelemetry is on; they're still figuring it out. It's still experimental, I think it's also still marked as experimental on the website, because it is somewhat diverging from the spec and not everything is figured out, including some fundamentals. But they are working towards that, and there are some rather good wrappers around it as well. We are also thinking about exactly what to do there, and about what that means for OpenTelemetry native, which is something we care about a lot. What does it mean to be OpenTelemetry native with RUM at this current point in time?
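For readers who want to try what Ben describes: a browser-side OpenTelemetry JS setup looks roughly like the sketch below. The package names come from the OpenTelemetry JS ecosystem, but the web parts are experimental and the APIs have shifted between versions, so treat this as a sketch under those assumptions rather than copy-paste instructions. The endpoint URL is a placeholder.

```typescript
import { WebTracerProvider } from "@opentelemetry/sdk-trace-web";
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { DocumentLoadInstrumentation } from "@opentelemetry/instrumentation-document-load";
import { FetchInstrumentation } from "@opentelemetry/instrumentation-fetch";

// Batch spans and ship them to an OTLP/HTTP endpoint.
const provider = new WebTracerProvider({
  spanProcessors: [
    new BatchSpanProcessor(
      new OTLPTraceExporter({ url: "https://collector.example.com/v1/traces" })
    ),
  ],
});
provider.register();

// Auto-instrument page loads and fetch calls. As Ben points out, the
// resulting traces are mostly shallow: a page load or a request, with
// little nested underneath.
registerInstrumentations({
  instrumentations: [
    new DocumentLoadInstrumentation(),
    new FetchInstrumentation(),
  ],
});
```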

Mirko Novakovic: I think it's good that OpenTelemetry is going in that direction, right? RUM is essential for observability, from my point of view. Actually, I like RUM a lot, because it's actually looking at the end user, right?

Ben Blackmore: Yeah. Impact. Right.

[00:14:23] Chapter 5: Integrating Observability with Product Analytics

Mirko Novakovic: Exactly. So I think it's very important, and I agree with you: at the end of the day, the end user will not care whether it's a log or a span or an event or some new type of signal called a beacon. It doesn't really matter; it is more or less a technical issue, right? Which will be solved. So we are working on that whole web observability experience at the moment, and I'm kind of excited about when we release it. Also about having this alternative, and an experience for a web developer that should be seamless, making sure it solves the use cases we described that are different for a web developer. But there are also a lot of reasons why it should be end to end, right? So I'm really looking forward to it, including this whole session replay thing and understanding sessions and funnels. It's such an interesting and exciting topic.

Ben Blackmore: Yeah, and it also enables, I think, the other side. We talk now about the front-end developers, the web developers. But if you're a backend developer and you can see what the true origin is, like which user, or which organization, customer, account, whatever you call it, is responsible for that, and you can say, oh, 30% of the load is coming from this one account that we have, that is also super helpful for everybody working on the other side. So it's a win-win all around.

Mirko Novakovic: No, absolutely, absolutely. And as you know, I'm a big believer in dogfooding. We know that we also need it, because we are also pretty strong on the client side, right? We're using JavaScript in the browser, we are using Vercel as a platform, and we also released Dash0 on the Vercel marketplace recently to make it really easy. And I can see why this also makes total sense in that environment: having a full RUM experience for Vercel and this whole edge-computing style of programming.

Ben Blackmore: Exactly. We are using OpenTelemetry JS internally ourselves, and our web developers are also missing things, right? They're like, hey, I would like to have this in a better way: how stack traces are presented, how we are post-processing them, session recording. They have needs too, and ideally they just use Dash0 for that as well. And we're on the road to that.

Mirko Novakovic: Yeah. And just touching on it shortly, because that could be a whole podcast: what we are also hearing from customers, and we also discuss internally, is that RUM is one topic, but then you have product analytics, which is a totally different category. But if you think about it, product analytics is also a lot about session recording, analyzing sessions, understanding what users are doing, what they are clicking, which functions they are using. It's probably a bit more analytical at some point, right, where you want something more statistical and aggregated about what users are doing and what they are not doing. But I think there's also an opportunity for consolidation here, because it's essentially the same data.

[00:17:23] Chapter 6: AI and Observability

Ben Blackmore: It absolutely is the same data. And funny enough, whenever we had a need for any kind of analytics integration, the first thing we always built was a wrapper around these APIs, because the assumption was there's a very high likelihood that this tool will change. So you write a wrapper around it, because the API surface area of these tools is always the same, right? There is an event name, there is some metadata you send out, and there's some initialization function that's slightly different, but in the end it's always the same. If you think about how that relates to a log event or a span, it's very, very similar.
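The wrapper pattern Ben describes is small enough to sketch. All names below are hypothetical; the point is that every product-analytics SDK reduces to roughly the same surface, which is also roughly the shape of a log event or span: a name plus attributes.

```typescript
// The surface every analytics SDK seems to converge on:
// an init function plus track(eventName, metadata).
interface AnalyticsClient {
  init(config: { apiKey: string }): void;
  track(event: string, metadata?: Record<string, unknown>): void;
}

// One adapter per vendor; swapping tools means writing a new adapter,
// not touching every call site. This one just logs to the console.
class ConsoleAnalytics implements AnalyticsClient {
  init(_config: { apiKey: string }): void {}
  track(event: string, metadata?: Record<string, unknown>): void {
    console.log("analytics event:", event, metadata);
  }
}

// The rest of the codebase only ever imports this function.
const client: AnalyticsClient = new ConsoleAnalytics();
export function track(event: string, metadata?: Record<string, unknown>): void {
  client.track(event, metadata);
}
```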

Mirko Novakovic: Yeah, absolutely. Let's move to another topic that I'm excited about. I also can't digest it totally; I think nobody at the moment really can, to be honest.

Ben Blackmore: I know what's coming. I mean, it must be AI.

Mirko Novakovic: Exactly. So let's talk about AI. First of all, I can explain a little bit how we think about it. At the moment, and I think we agree on that, we are not so much about chatbots or conversational AI at all, using natural language for querying. We just don't believe that actually makes you more productive. We think of AI more as something very integrated into the product, and about building a product for that, or based on that. I like the comparison with Apple Photos. In Apple Photos, you can search for "dog" and it will show you dogs. If there's text in a photo, you can just magically mark and copy it. That's AI, but you don't even see that it's there, right? And our first feature is pretty similar: with Log AI, you send in an unstructured log, and essentially we learn the structure. We extract the structure, we extract the fields, like the severity: is it an error or debug? And so we make unstructured logs structured, which makes it much easier to search, to digest, to filter, to understand and figure out problems. Because if you don't know that something is an error, you wouldn't alert on it. So that's the first thing we have built. Tell us a little bit about how that works, how we use LLMs there, and especially how we use it in production. Because I think one of the things I've learned now is there's this whole idea of AI, of using Python and creating something, an algorithm. But then if you have billions of logs in production, you have to apply it somehow, right?
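To make the feature concrete: "extracting the structure" means an unstructured line goes in and named fields come out. A toy sketch of what applying a learned pattern could look like; this is illustrative, not Dash0's actual pipeline:

```typescript
// Illustrative only: a pattern like this might be derived once, for
// example from an LLM hint, and then applied cheaply to every record.
const pattern =
  /^(?<timestamp>\S+) (?<severity>ERROR|WARN|INFO|DEBUG) (?<logger>\S+) - (?<message>.*)$/;

function structureLog(raw: string): Record<string, string> | null {
  const match = pattern.exec(raw);
  return match?.groups ?? null;
}

// "2025-03-06T10:00:00Z ERROR payment.Service - card declined" becomes
// { timestamp: ..., severity: "ERROR", logger: ..., message: ... },
// which you can filter and alert on instead of grepping raw text.
```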

Ben Blackmore: That is the real challenge, at least in our experience. We have been there in the past, right? We had several AI things back at Instana that never made it to production, because we couldn't figure out how to get them there. And that is something we tried to do very differently now. With the things we came up with, it was immediately, always the question of: okay, there's this model, whatever it is, whether it's something in a Python library or it's an LLM or something, and how can we make this work at scale? That was always the immediate question, and we tried to figure that out.

[00:20:32] Chapter 7: Bringing AI to Production

Mirko Novakovic: And especially from a cost perspective. Right. It was always like, if it costs $5 per log, who would pay for it?

Ben Blackmore: Exactly. And I'm actually still a bit afraid of some of this right now: we have this stuff running, and you need to make sure that the cost is not exploding as well, right? The immediate thing we did at the very beginning with Log AI is the super low-hanging fruit: you copy a log into ChatGPT and ask it, hey, what is this thing? And of course an LLM is pretty good at recognizing standardized log formats; there are so many examples of those on the web that of course it can do that. But the problem is, you can't invoke ChatGPT for 300,000 log records a second, or a million log records a second. That just doesn't work. So you need to come up with a solution, and the solution is not just a smaller model; it's more than that. It's: what can we do to get a similar experience, but at a teeny tiny fraction of the cost? That was actually quite a struggle. Also, hiring for that role, which is what you just mentioned: finding a person that exactly fits this skill set, that was a different dimension to it. Because what we have learned is, to oversimplify a bit, you can hire data scientists who are really good at coming up with a solution, at trying different algorithms.

Ben Blackmore: They know the math inside and out, but they don't have the necessary software engineering background to get this to production. The Pragmatic Engineer actually has a really good guest article, I think on exactly the different machine learning and AI roles, and basically what a lot of people want is essentially a team, right? There are like five or six different roles, and what you're asking for is a whole team. And of course, for us as a small company that is not feasible. So we were trying to figure out what we can really do, how we can hire to make this work, how we can find this middle ground between having the whole theoretical background and also having software engineering skills. And what depth do we need in which area? That actually took quite some time to figure out: how does it match our strategy, what we want to do, the rough idea we have in mind, and how can we test for this? That was pretty hard.

Mirko Novakovic: There's a word in German for that, but I don't know it in English: eierlegende Wollmilchsau.

Ben Blackmore: Oh, yeah.

Mirko Novakovic: One of those complicated German words. It's basically something or someone who knows everything, right? From math and computer science to bringing something into production. A very wide spectrum. That's the kind of person you're searching for, and we searched for, but we basically figured out that's very hard to find, right?

Ben Blackmore: Incredibly hard. Yes.

Mirko Novakovic: We were lucky. With Larry, we got amazingly lucky and found someone who has experience in both, and we are adding some engineering skills around it to support bringing this into production. But I think it's very interesting to hear, right, that there is this whole computer science part, but then also bringing it into production. So how do we do that?

Ben Blackmore: I mean, in the end, we are using an LLM for this, right? But essentially we're just using it to give us hints on how to interpret these logs, and then everything around it is basically a custom model to figure out: okay, this is the very first hint we have gotten; how can we apply this to future log records to interpret them? And once we have those hints, our resource-centricity comes into play, because we can apply what we have learned to the same resource or to similar resources, so that we avoid re-identifying, broadly speaking, the log format. We want to remember this across restarts. But at the same time, we also want to know whether the log records change. There are some nasty things that can happen, although that is really uncommon. The more common thing is that log records change at startup time: there are often a few records emitted before the logger is fully initialized, so you have a few log records that don't fit your expectations. That is a pretty interesting case, and you don't want to lose those, right? Because the very first four or five can sometimes be really interesting. The process is crashing, right? You want to know those as well. So that has been a pretty interesting journey, and it actually combines a lot of the pieces we have within Dash0 in order to make this possible, and to make it work efficiently at the end of the day.
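A hedged sketch of the caching idea Ben outlines; the real system is certainly more involved. Expensive, LLM-derived hints are remembered per resource, reused across restarts, and only refreshed when a record stops matching, for example those early not-yet-initialized log lines. The askModelForHint helper is a hypothetical stand-in for the expensive step:

```typescript
type Hint = { pattern: RegExp };

// Learned log formats, keyed by the emitting resource (service,
// container, and so on), so a hint can be reused across restarts.
const hintsByResource = new Map<string, Hint>();

// Hypothetical stand-in: in reality this would invoke an LLM once to
// derive a parsing pattern from a sample record.
async function askModelForHint(sample: string): Promise<Hint> {
  return { pattern: /^(?<severity>\w+)\s+(?<message>.*)$/ };
}

async function parse(resourceKey: string, raw: string) {
  const cached = hintsByResource.get(resourceKey);
  const match = cached?.pattern.exec(raw);
  if (match?.groups) return match.groups; // cheap path, no model call

  // Miss: either a new resource or the format changed. Ask the
  // expensive model once, then cache the derived pattern.
  const fresh = await askModelForHint(raw);
  hintsByResource.set(resourceKey, fresh);
  return fresh.pattern.exec(raw)?.groups ?? { message: raw };
}
```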

Mirko Novakovic: And the problem is always in the details, right? The things you figure out. And then one question: I mean, we know that there are hallucinations. One of the things we discussed, for example: let's take the severity of a log, and say it's an error. It would be pretty bad, in my point of view, if you categorized something as an error which turns out not to be an error. So how do we make sure that this false positive rate is essentially zero? Is that possible at all? And if yes, how do we do that?

Ben Blackmore: I think there's always the chance that this can happen; we can only reduce the likelihood of it. If you just naively throw it into an LLM and take whatever it gives you, there's a chance that it is actually pretty wrong. So a lot of the work went into double-checking the output, sanity-checking it against a lot of rules around potential formats and structures. If it is able to extract a severity: what can this be? Does it make sense? What is the overall sentiment of the log record? Just to make sure: okay, yeah, this actually is very likely a problem, this is not just an informational thing. So this is what we have been working towards: using an LLM, which is, I mean, purely statistical, together with observability knowledge, and being like, hey, we know this should look roughly like that. Do these two things overlap? Does it match what we would expect? That's what we have been working towards with this approach.

Mirko Novakovic: Sentiment is interesting. So you basically say you categorize something as an error, and then you look at the message and ask: what is the sentiment? And if the message says something like "blah, blah, blah is broken," the sentiment is probably more towards error. If it says "oh, this looks fine," the sentiment is maybe too positive for an error. And you would use that as a cross-check on the categorization.
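A toy version of that cross-check, with naive keyword lists standing in for whatever an actual implementation would use:

```typescript
// Sanity-check an extracted severity against the message's wording.
// The keyword lists are purely illustrative.
const negative = ["failed", "broken", "exception", "timeout", "refused"];
const positive = ["success", "completed", "ready", "ok"];

function severityLooksPlausible(severity: string, message: string): boolean {
  const text = message.toLowerCase();
  const soundsBad = negative.some((w) => text.includes(w));
  const soundsFine = positive.some((w) => text.includes(w));
  // Only ERROR classifications are double-checked here; a message that
  // sounds fine and not bad is a suspicious candidate for "error".
  if (severity === "ERROR") return soundsBad || !soundsFine;
  return true;
}

// severityLooksPlausible("ERROR", "request completed successfully")
// returns false: flag it for review instead of alerting on it.
```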

Ben Blackmore: To be fair, for this dimension we're still at the beginning. But this is one form of validation, right? You throw it into an LLM and you need to figure out whether you can trust the output. And, I mean, you could ask another LLM to evaluate the output of the first, but that eventually gets pretty expensive too.

Mirko Novakovic: Yeah. It's very interesting. I love the first feature that we released, and we are working on other things, right? Like grouping, like pattern detection, like reducing the size of traces to better digest them. So what are you most excited about?

[00:27:45] Chapter 8: Expanding AI Features in Observability

Ben Blackmore: So I'm super excited about the log capability, not the severity identification, but the next step, which would be extracting a lot more attributes and named fields out of these logs. Because at that point you will really be properly able to filter. What's the alternative currently existing in the market? Either you parse everything up front and you need to get it right, which is absolute horror, or you parse it at query time by reading all the logs and piping them through some kind of parsing step in memory. Great, right? And then you're doing this while you have an incident. That is not the nicest experience: you don't want to deal with parsing and regular expressions while shit is hitting the fan, literally. So that is something I'm really excited about, and also showing what we're actually doing. One of the learnings now was: this whole Apple mantra is really nice, but the problem is, if it is so behind the scenes, nobody sees it. That's also quite the struggle then, right? How do you show what you have done if it just means there's better data?

Mirko Novakovic: No, absolutely. Yes, and we are working on that. It's especially interesting from, I mean, I'm more on the go-to-market side, from a sales perspective. Because it's very hard to say, look, we have this cool log feature, and then somebody says, okay, show me, and you're like, yeah, it's in here, right?

Ben Blackmore: The data looks better.

Mirko Novakovic: The data is better, right? Exactly. So that's something that's hard to sell, even though it makes a ton of sense. It's hard to show, especially in a demo. Of course, customers see it if you turn it on, because then they have much better data. But if you don't have this moment of "this is the old data, this is the new data," then it's very hard to show.

Ben Blackmore: Yeah. And I mean, we did have that at the very beginning. It was pretty easy, right, with, were they still beta customers at that time? I think so. We were like, hey, we enabled this experimental feature for you; here's the before, here's the after, what do you think? And the first sentiment was, of course: this is awesome, I didn't have to do anything, and I now get more information out of my data. Cool. But for those that are just starting now, that is a different story.

[00:30:00] Chapter 9: Transitioning from Engineer to CTO

Mirko Novakovic: Yeah. And now I have a more personal question for you. We didn't discuss it upfront, but let's figure it out, because we've discussed a lot of interesting technical challenges, right? If you build an observability tool, there are so many: from the user perspective it's scale, it's performance, it's the load of data you have to process and store, as cheaply as possible. This is your first CTO role. And I know that on the technical side, you are really good. But as we discussed today, you told me in another call: hey, during the day I'm now doing this manager work, and then in the evenings and on the weekends, if I have time, I relax by programming things, right?

Ben Blackmore: So yes.

Mirko Novakovic: Tell me about that. What does that mean? Programming is now essentially your free time, and you love it and it's relaxing. How is that transition working for somebody who used to be a really good engineer and is now a CTO with a team of probably more than 20 people, managing people? Do you enjoy it?

Ben Blackmore: I do enjoy it, but it's honestly hard, because the questions and the decisions you have to make are not that black and white. Let's say you write something server-side, right? You write some database statements, whatever, you write a test for it, and you get this nice gratification: the test is green, it's working, it's fast, whatever. It is somehow very objectively correct. As a manager, this is often missing. We have this idea, let's try this, let's try that, but you don't know whether it's the right thing. Is this correct? I have a gut feeling, but I also don't have past experience I can pull from. I can talk to people and get their opinions, but I don't know upfront: is this correct? Is this the right thing to do at this very moment, or should I wait a week? Are we at the right stage? Did I overlook something? There are so many things. And yes, programming in the evening and on the weekend just feels easier to me, personally. But I know that at the stage we're in, I need to spend more time on the organization, on how we are working, on the processes, on the product, figuring that out, enabling others, training them. Which is also pretty hard when you're constantly pulled in so many directions. It's fun, but it's honestly also a challenge.

Mirko Novakovic: No, absolutely. It's also much more emotional if you work with people. As you said, there's no unit test for a lot of these things. It's a lot of emotion.

Ben Blackmore: And also what's working for one person is not working for the next. Right.

Mirko Novakovic: Exactly. And it also triggers emotions in yourself. Sometimes there are hard decisions; sometimes you say something and somebody receives it totally differently from how you sent the message. So it's a very interesting thing, and I think it's worth sharing. I mean, I was an engineer myself when I turned into the CEO of a company, and I'm still struggling with it. I've been doing it for 20 years, and sometimes you come to me and say, hey, do you have good advice? And I'm like...

Ben Blackmore: It really isn't easy.

Mirko Novakovic: It's not easy, right. There is no easy "here's the solution: do that unit test or use this API, and the problem is solved." And productivity, at the end, also. How do you make sure? Yes, maybe you can program it overnight yourself and fix the problem, but at some scale you can't do that anymore, right? Then you have to figure out how to make the team productive, even much more productive than, I don't know, five Bens. That's a really big challenge. It's hard to do, but I think it's fun, and it's a great challenge. And we are growing fast, which is even more challenging, because you have to hire in parallel, and hiring the right people is also a complicated task. I like that you're enjoying it. You do a great job; I've really enjoyed watching you do it over the past months. It's amazing how well it works, but I also see that it's not easy.

Ben Blackmore: It absolutely isn't. It's constant learning, honestly speaking. And it also gives me an appreciation of my previous managers: everything they had to do that I wasn't aware of. That's the reality as an individual contributor: there are so many things that you have no idea are happening behind the scenes, so many discussions, so many individual problems that you're not aware of. I now appreciate a lot of people a lot more, I can tell you.

[00:34:28] Chapter 10: Closing Remarks

Mirko Novakovic: It's essentially a lot of little Code RED moments, right? Like your first technical one. So, Ben, it was awesome talking to you. Thanks for being on the podcast.

Ben Blackmore: Same here. Mirko. Thanks.

Mirko Novakovic: Thanks for listening. I'm always sharing new insights and knowledge about observability on LinkedIn; you can follow me there for more. The podcast is produced by Dash0. We make observability easy for every developer.
