WEBVTT

00:00.000 --> 00:12.400
So, hi everyone, I'm Marcel, I'm a software engineer at Clyso, I work there on things

00:12.400 --> 00:19.080
Ceph, and today I'm going to talk about understanding Ceph, a journey from metrics to tracing,

00:19.080 --> 00:24.520
which is a bit of a conclusion from a running journey for myself on this topic.

00:24.520 --> 00:31.280
So, to scope and set a bit of goals for this: this is a sequel to a Cephalocon talk

00:31.280 --> 00:37.280
from last year, but it kind of stands on its own as well. But if you're interested

00:37.280 --> 00:43.520
more in the metrics part, or as they are mostly referred to in the Ceph context, performance

00:43.520 --> 00:51.000
counters, there's that talk. And I'd like to put the topic a bit in, like, the triangle between

00:51.000 --> 00:55.760
tool crafting, understanding Ceph and understanding the tools. My goal is to lean a bit

00:55.760 --> 01:01.160
towards tool crafting, also convey a bit of information about Ceph, and not too much

01:01.160 --> 01:03.000
about the tools themselves.

01:03.000 --> 01:08.040
At the end, I hope you get, like me, a good intuition of where things are useful and what

01:08.040 --> 01:09.920
to use them for.

01:09.920 --> 01:15.800
And the talk is roughly split into three parts. The first is an intro: a bit of metrics,

01:15.800 --> 01:20.520
a bit of Ceph, tracing, and what kind of tools I picked for this task, and then I walk

01:20.520 --> 01:22.600
through two examples.

01:22.600 --> 01:28.680
In the first one I start with counter metrics and see where we can add more knowledge with

01:28.680 --> 01:34.800
event tracing, and in the second we measure latency and then use tracing to add even

01:34.800 --> 01:37.440
more metrics to understand latency better.

01:37.440 --> 01:45.240
So, some definitions to kind of distinguish stuff here.

01:45.240 --> 01:52.480
So, a metric is a measurement of a service captured at runtime; that is the definition from OpenTelemetry.

01:52.480 --> 01:59.160
Some examples: a client sends a write, we increment the write counter; the OSD performs

01:59.160 --> 02:05.760
some operation, we record the latency and update an average latency.

02:05.760 --> 02:10.400
Tracing on the other hand refers to the process of capturing and recording information about

02:10.400 --> 02:14.120
the execution of a software program.

02:14.120 --> 02:17.800
So, this can be like a mixed bag, right?

02:17.800 --> 02:24.040
We can record all RADOS operations, we can ask a question like, what function is

02:24.040 --> 02:28.720
issuing all those watch operations we see in our metrics?

02:28.720 --> 02:34.800
Or we can do system-wide tracing and capture all the open or exec calls.

02:34.800 --> 02:41.440
Both things allow us to answer questions about the runtime behavior of our system.

02:41.440 --> 02:46.320
But a bit differently: metrics are usually defined at development time,

02:46.320 --> 02:52.560
but with tracing, we can have more options and freedom to ask things later.

02:52.560 --> 02:59.920
All right, a bit about what kind of data types we are looking at. This is mostly about metrics,

02:59.920 --> 03:07.440
and in the Ceph context we usually distinguish four types of metric data. We have gauges.

03:07.480 --> 03:14.400
This is like free space left on an OSD, or a fill level: how many requests are

03:14.400 --> 03:18.320
waiting in a request queue. We have counters.

03:18.320 --> 03:23.080
These are just values that go up. They might go back to zero if we restart the service,

03:23.080 --> 03:30.000
but these are the things you take the derivative of to get a rate.
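
NOTE
Editor's sketch, not from the talk: one way to turn counter samples into a rate, in Python.
The function name and the sample format are hypothetical. The reset handling assumes that a
restarted service starts counting from zero again, as described in the cue above.
```python
def counter_rate(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # A value lower than the previous one means the counter was reset,
        # so everything counted since the restart is the increase.
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append(delta / (t1 - t0))
    return rates
print(counter_rate([(0, 100), (10, 150), (20, 30)]))
```
The third sample is lower than the second, so it is treated as a restart rather than a negative rate.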

03:30.000 --> 03:36.400
There are also histograms. They are kind of like counters, but you've got these buckets.

03:36.400 --> 03:41.920
In this example, on the slide here, our buckets are latency ranges.

03:41.920 --> 03:46.840
For example, if we observe a latency of 0.

03:46.840 --> 03:52.640
milliseconds, we would increment a counter for the 1 to 3 millisecond bucket.

03:52.640 --> 03:57.800
This is great to see how, for example, our latency is distributed.
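
NOTE
Editor's sketch, not from the talk: a latency histogram as a list of per-bucket counters, in Python.
The bucket bounds in BOUNDS_MS and the sample latencies are made up for illustration; only the
1 to 3 millisecond bucket is mentioned on the slide.
```python
import bisect
# Hypothetical upper bucket bounds in milliseconds; the last bucket is open-ended.
BOUNDS_MS = [1, 3, 10, 30]
def observe(buckets, latency_ms):
    """Increment the counter of the bucket that latency_ms falls into."""
    buckets[bisect.bisect_left(BOUNDS_MS, latency_ms)] += 1
buckets = [0] * (len(BOUNDS_MS) + 1)
for latency in [0.4, 2.0, 2.5, 12.0, 50.0]:
    observe(buckets, latency)
print(buckets)
```
Each observation only increments one counter, which is why a histogram behaves like a set of counters.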

03:57.800 --> 04:03.680
They're not too common in the Ceph context; way more common are long-running averages.

04:03.680 --> 04:10.720
In this chart here, this is the green line, and if we have these observations here,

04:10.720 --> 04:17.800
T1 to T10, these bars, every time we have an observation, we update our running average.
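
NOTE
Editor's sketch, not from the talk: a long-running average kept as a sum and a count, in Python.
The class and method names are hypothetical; the idea is only that each observation updates
both values and the average is derived from them.
```python
class RunningAverage:
    """Keeps a sum and a count; the average is derived on demand."""
    def __init__(self):
        self.total = 0.0
        self.count = 0
    def observe(self, value):
        self.total += value
        self.count += 1
    def average(self):
        return self.total / self.count if self.count else 0.0
avg = RunningAverage()
for latency_ms in [2.0, 4.0, 6.0]:
    avg.observe(latency_ms)
print(avg.average())
```
Because only two numbers are stored, the shape of the distribution is lost, which is the limitation the talk returns to later.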

04:17.800 --> 04:24.560
So in tracing, it's not only events that we capture, but we can also use tracing to generate

04:24.560 --> 04:27.120
most of this data as well.

04:27.120 --> 04:33.240
We can use tracing to generate histograms or counters as well; there is an overlap here.

04:33.240 --> 04:40.360
All right, when it comes to tracing, what are our options in this context here?

04:40.360 --> 04:43.560
And there is quite a range.

04:43.560 --> 04:49.800
We have, as the simplest, uprobes, which is basically: let's hook a function and run our tracing code

04:49.800 --> 04:53.520
whenever this function is called in our code.

04:53.520 --> 04:57.280
The simplest trace points we can probably have in a user-space app.

04:57.280 --> 05:02.320
On the complete other end, we have OpenTelemetry tracing, or in the Ceph context,

05:02.360 --> 05:04.880
you often find it called Jaeger tracing.

05:04.880 --> 05:12.440
This is, like, distributed tracing, with spans across multiple nodes, the more complicated stuff.

05:12.440 --> 05:20.120
There are older tracing implementations you find in Ceph called Blkin and Zipkin.

05:20.120 --> 05:24.960
These are, according to the docs, deprecated, so I grayed them out a bit here.

05:24.960 --> 05:35.440
And there is LTTng; these are trace points that have been in the Ceph code base for a while now.

05:35.440 --> 05:39.520
They allow you to add some extra trace points here in the source code.

05:39.520 --> 05:43.760
And the great thing about it, that I will show, like in two slides a bit more,

05:43.760 --> 05:47.200
is that they are compatible with USDT trace points.

05:47.200 --> 05:54.360
These are user statically defined tracing points, which allows us to use tracers other than LTTng on them.

05:54.360 --> 06:02.040
So they are not green by accident; these are the kind of trace points we will look at to understand more of Ceph.

06:02.040 --> 06:13.840
And I also already mentioned eBPF, and that is because we are using eBPF tracing tools to do the tracing.

06:13.840 --> 06:23.360
And again, from easy to hard to use, there are a couple of tools to choose from.

06:23.360 --> 06:31.200
Well, I always prefer to start with bpftrace; it has a syntax that is like awk,

06:31.200 --> 06:36.080
and allows you to write some scripts ad hoc and get answers easily.

06:36.080 --> 06:43.520
There is also BCC, the BPF Compiler Collection; this is like Python with inline C, more complex.

06:43.520 --> 06:50.240
It is very common that you start with bpftrace scripts and then go to BCC for something more complex

06:50.240 --> 06:56.080
later, if you need, like, argument parsing and nicer output. There is also libbpf, the fully

06:56.080 --> 07:03.360
featured thing; it has some additional features that are great if you want to ship stuff here and there.

07:03.360 --> 07:08.080
But my favorite is bpftrace, and this is what we will mostly look at.

07:08.080 --> 07:13.600
So I said that Ceph has these LTTng probes.

07:13.600 --> 07:19.040
So in the source code, if you look around, you will find lines like this one above here: tracepoint,

07:19.120 --> 07:24.720
then we have, like, a grouping name, in this case osdc, and then we have a trace point name and some arguments.

07:24.720 --> 07:27.840
And how can we use BPF on that?

07:27.840 --> 07:34.800
So essentially, the tracepoint line here is a C macro that translates into an SDT probe,

07:34.800 --> 07:43.440
which is our USDT probe, plus something for LTTng, and just as a note on that,

07:43.440 --> 07:50.000
these probes are actually kind of cool, if you compile your software with those probes.

07:50.000 --> 07:55.200
What they do is, at the point of the trace point, they add a nop operation as a placeholder,

07:55.200 --> 07:59.440
and add some metadata to the binary.

07:59.440 --> 08:04.640
So that if you come around with a tracer, it uses the metadata to find the place of the nop

08:04.640 --> 08:09.120
and tells the kernel: please replace that with the code for our tracer.

08:09.120 --> 08:14.800
And the other cool thing is that I checked, I think, the last two recent

08:14.800 --> 08:18.480
Ceph container images, and those trace points are actually baked in.

08:18.480 --> 08:22.720
So you can do all the tracing stuff on your production environment.

08:22.720 --> 08:28.480
And yeah, to summarize, we can bpftrace those trace points. Cool, right?

08:28.480 --> 08:37.600
And yeah, so what does it mean to trace the thing?

08:37.600 --> 08:46.240
And what we can do when we run a thing like bpftrace on our code is, it runs our tracing code.

08:46.240 --> 08:52.800
So we can generate output, event tracing, we can count events, we can also condense these measurements

08:52.880 --> 08:57.200
into histograms and stats, and all these things.

09:01.600 --> 09:10.640
And this all runs in the VM in the kernel, so it's pretty efficient at the end of the day.

09:15.360 --> 09:18.880
But it always has some challenges, right?

09:18.880 --> 09:21.840
So I ran into a few while writing a couple of scripts.

09:21.840 --> 09:29.360
And so especially if you don't have nice USDT probes available and have to stick to uprobes

09:29.360 --> 09:35.360
and you have function calls, especially in the Ceph context, you end up having to traverse all

09:35.360 --> 09:41.680
these deeply nested C++ structs, and it can be hard to get the information you want out of that.

09:42.480 --> 09:48.080
So having debug info for this helps, especially if you have a newer bpftrace version

09:48.080 --> 09:53.840
that has support for that, but debug info on your servers might not be practical, it's

09:53.840 --> 09:59.520
a thing that's like a gig or two in size, so it might be a problem in prod, right?

10:01.200 --> 10:05.760
As an alternative method, you can start mocking those structs and having

10:05.760 --> 10:12.960
simplified versions in your tracing code. This is okay, but it also has its drawbacks, especially

10:12.960 --> 10:20.000
if things change and stuff. I also ran a couple of times into eBPF stack limitations;

10:20.000 --> 10:25.040
it's, like, not even a K you have available there. Especially if you are tracing with large

10:25.040 --> 10:30.640
strings, you kind of run out of space and the thing doesn't compile anymore. And there are

10:30.640 --> 10:34.400
workarounds for that, but just look out for it; it's something you can run into.

10:35.440 --> 10:40.720
The nicest thing to have is USDT trace points that have extracted arguments, that directly

10:40.800 --> 10:44.640
have arguments that point you to the interesting data, that's the best thing to have.

10:44.640 --> 10:53.520
Also, a small word of caution: when I first tried to trace RGW, I was confused by not

10:53.520 --> 11:00.640
seeing these, like, librados has like hundreds of trace points, or dozens, for like

11:00.640 --> 11:05.680
writes and reads, and they weren't hit, and I was like, why is that? It turns out that some

11:05.680 --> 11:12.320
clients like RADOS Gateway circumvent those code paths and have their own to package up

11:12.320 --> 11:17.760
their own operations, and we actually see later what that means and how we can trace it anyway.

11:19.360 --> 11:29.440
Okay, time for the first example. Okay, going from counters to event tracing, yeah, and for that,

11:30.320 --> 11:36.640
I picked, like, a really simple workload. So just a single four megabyte S3 PUT to RADOS

11:36.640 --> 11:42.080
Gateway. RADOS Gateway greets us with this log line and says, like, okay, I saw it,

11:42.720 --> 11:50.560
it took, I don't know, 63 milliseconds, and what can we do to learn what RADOS does underneath that?

11:51.360 --> 11:57.040
And there is a whole set of metrics. You find them under the name objecter,

11:57.040 --> 12:03.440
and osdop or omap. There's a whole lot of them and it's not very readable, but if we

12:03.440 --> 12:10.480
take all the zero values out, we end up with this. So our S3 PUT is essentially, on the RADOS side,

12:10.480 --> 12:15.680
like 18 operations. There are some stats, there's a create, there's a write. That makes sense,

12:15.680 --> 12:21.920
I guess, like intuitively, but then there is setxattr. Those are key-value pairs attached to

12:21.920 --> 12:28.320
Ceph objects. And I don't know, this might be all the metadata that S3 has in its protocol,

12:28.320 --> 12:35.280
but we don't know yet. There's also calls, these sound interesting, these are calls to

12:35.920 --> 12:42.160
object classes. This is code that lives on the OSD and performs complex operations for

12:42.160 --> 12:48.880
more complex clients, like RADOS Gateway is. So what else can we learn from metrics here?

12:49.040 --> 12:58.720
So we had 18 operations if we sum up the ones I just showed. There are also metrics that count ops,

12:58.720 --> 13:04.480
and they say three. This is odd. And if we look at the messenger, the Ceph RPC layer, we see that

13:04.480 --> 13:11.760
three messages went out. So why is it three and 18? And there are different ways to answer this.

13:12.320 --> 13:17.440
We can look at the code and find out what it means, but we can also run a tracer. And

13:18.400 --> 13:25.040
this is the output of a tracer I ran to find out what the RADOS layer does. So there is this

13:25.040 --> 13:34.080
airplane starting. I just wrote it last week and it's not in Ceph yet, but there's an alternative

13:34.080 --> 13:42.960
to that. This is my work: adding another trace point to RADOS to get this nice output,

13:42.960 --> 13:48.480
but there's an alternative. If you look on GitHub for cephtrace, there is a project that

13:48.480 --> 13:56.320
does it with libbpf plus debug info. There are alternatives with uprobes. So what can we learn

13:56.320 --> 14:05.120
about our S3 PUT here? So we first see, this in bold is the object we access, and there are two

14:05.120 --> 14:11.600
dir things. And these two together form a transaction on the bucket index.

14:11.600 --> 14:20.160
This is the data structure that stores all the objects in the bucket with some of their metadata.

14:20.800 --> 14:27.200
And we can also see right in the middle, this is the operation that really creates our object.

14:27.200 --> 14:33.120
And we see all these setxattrs here. And yeah, it's really metadata heavy. We have the content type

14:33.120 --> 14:39.280
here, the ETag. We also have my command-line arguments for some reason. And we have our write, too. Cool.

14:39.920 --> 14:46.960
But what does a script for that look like? It sounds long and complicated. But this is actually all

14:46.960 --> 14:52.960
you need, if you have nice trace points, to trace a thing like that. So as I said,

14:52.960 --> 15:00.000
it's awk style. So you have these trace point definitions here. In this case usdt;

15:00.000 --> 15:06.240
you have to point it to a binary because this is baked into the binary. And then we have this

15:06.320 --> 15:12.160
group name, osdc, and then the trace point name. And this calls our tracing code. In this case

15:12.160 --> 15:17.760
it's just a printf. We're also recording the start time here. You see something with an

15:17.760 --> 15:25.840
at sign in front. And we're storing it with arg0 as the key; arg0 is just our transaction ID.

15:25.840 --> 15:32.320
In this case it goes in a map, to retrieve it again later on the finish side.

15:32.320 --> 15:41.600
So we can also calculate latency with things like that. All right. This is all, yeah. This is all you need to

15:41.600 --> 15:46.400
trace things. And you can like extend this. For example if you're just interested in parts of

15:46.400 --> 15:53.360
the operations or want to have a stack trace in this place, you can also ask bpftrace to give you a

15:53.360 --> 15:59.760
stack trace to see where those operations originate. Let's quickly summarize what we've seen:

16:00.240 --> 16:05.120
with metrics it's great to find out what the operation mix is that we see for clients.

16:05.760 --> 16:12.560
I also learned that tiny operations are put together into bigger operations. And it took three

16:12.560 --> 16:19.840
messages and three big operations on the RADOS side to do our PUT. And two of those form the bucket

16:19.840 --> 16:25.600
index transaction and one is object plus metadata. All right. Second example.

16:30.720 --> 16:37.600
All right. Here we look at latency, and for latency we change our workload a bit.

16:40.480 --> 16:46.320
Here I just ran a short S3 benchmark. It has like a mixed

16:47.680 --> 16:53.600
mix of operations, with some deletes, gets, puts, and stats, basically HEAD requests. And we get another

16:53.600 --> 17:00.960
pie chart. This time also a lot of calls but also some reads and writes. And if we look at

17:00.960 --> 17:09.120
like the simplest latency metric on an OSD we can find. This is the OSD op latency. On my simple

17:09.120 --> 17:18.240
test setup I saw 40 milliseconds. And is that actually meaningful? We have all these operations and

17:18.240 --> 17:22.800
they must be different. They must have different characteristics. But they're all mashed together

17:22.800 --> 17:27.600
in this 40 milliseconds. And so the question is how we can add more detail to understand

17:27.600 --> 17:33.200
better what is going on in terms of latency. More detail means there's two options.

17:33.920 --> 17:39.440
We can look at more averages. There are averages for reads, averages for writes.

17:39.440 --> 17:43.120
But they still mash together a lot of operations into averages.

17:44.240 --> 17:50.560
The OSD also exports histograms and they have, like, latency plus size,

17:51.040 --> 17:57.200
so we can understand things better in terms of size there. These are parts of the previous talk.

17:57.200 --> 18:02.480
Now we're looking at tracing to ask more dynamic questions on this topic. So first

18:03.440 --> 18:09.680
what kind of trace points do we have? And if we look at the OSD, we find I think it was around

18:09.680 --> 18:16.240
70 trace points. And for this there is a nice family of trace points that we can use.

18:16.400 --> 18:22.080
They are called do_osd_op_pre. There's post, and there is a special pre for every RADOS

18:22.080 --> 18:27.760
operation there is. So the first and the last, they are just the generic ones. And then there is

18:27.760 --> 18:32.640
a specialized one that has parameters. For example, for writes they tell you the offset

18:32.640 --> 18:38.800
of the write; for an omap update, they tell you the keys and values. And what kind of questions can

18:38.800 --> 18:45.440
we ask, just with these? Oh, and by the way, they surround the operation processing

18:46.400 --> 18:51.680
in the OSD. So we can capture that. So what can we ask? We can ask, for

18:51.680 --> 18:58.720
example what is the most accessed object. And in this case turns out the bucket index is

18:58.720 --> 19:05.360
pretty popular. So this is the dir again. We also see that with my three OSDs, one OSD

19:05.360 --> 19:12.800
got lucky and got some more shards. So it's a sharded data structure. What else can we ask?

19:13.600 --> 19:21.200
So I'm a big fan of histograms. You might notice and we can use these trace points to generate

19:21.200 --> 19:28.240
our own histograms that are per operation. So I picked three operations here:

19:28.240 --> 19:35.280
writefull, read and call. And with that we already see, okay, even these operations compared to one another

19:35.280 --> 19:40.640
have very different characteristics in terms of latencies. Where we had before the number

19:40.640 --> 19:48.400
40 milliseconds, 40 milliseconds is somewhere here. Yeah. And the average isn't even reached, really.

19:48.400 --> 19:53.840
And we also see that reads and writes seem to be bimodal in their distribution. And we see

19:53.840 --> 20:00.880
that calls are very, very fast usually. This is just, like, under 500 microseconds here. But

20:00.880 --> 20:09.520
also some take, like, in this range of 30 milliseconds. This is odd, right? But we can only

20:09.520 --> 20:16.880
find that out by looking at things like that. And from that we could even drill down further.

20:16.880 --> 20:22.720
We could build histograms that even have the size in them. And we don't have to pre-decide

20:22.720 --> 20:29.520
what kind of things we want to measure. We can do that on the running system. And again how

20:29.520 --> 20:37.040
does the code for that look like? On the left side there's the object access counter. We just hook

20:37.120 --> 20:44.480
the post-op trace point. And in a map we record the name of the object, which happens to be in

20:44.480 --> 20:51.680
arg0, and call count. And this count does all the multi-threaded things and it counts

20:52.400 --> 20:59.040
just the operations. This is it. And the nice thing with bpftrace is, if you exit the program,

21:00.000 --> 21:07.280
it prints out all the data structures you put stuff into. On the right side we have the thing that
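
NOTE
Editor's sketch, not from the talk: the object access counter idea modelled in plain Python
instead of bpftrace. The function name and the object names are hypothetical; each list entry
stands for one hit of the post-op trace point carrying the object name.
```python
from collections import Counter
def most_accessed(events, top=3):
    """events: object names, one per completed operation."""
    return Counter(events).most_common(top)
events = [".dir.bucket1.0", "obj_a", ".dir.bucket1.0", "obj_b", ".dir.bucket1.0"]
print(most_accessed(events, top=1))
```
In the real tracer the counting happens in a kernel-side map; this sketch only shows the keyed-count logic.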

21:07.280 --> 21:15.040
generated the histograms. And there again, in pre, we record under tid, this is the thread

21:15.040 --> 21:23.360
ID, the start time, and in post we calculate the duration and then call hist with this observation.
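
NOTE
Editor's sketch, not from the talk: the pre/post latency measurement modelled in plain Python.
The event tuples and the function name are hypothetical. The start time is stored per thread ID
on "pre", looked up again on "post", and the duration is put into a power-of-two bucket, which
is an assumption about the bucketing and not taken from the slides.
```python
from collections import Counter
def latency_histogram(events):
    """events: (kind, thread_id, timestamp_us) tuples with kind 'pre' or 'post'."""
    start = {}        # thread id mapped to the start timestamp
    hist = Counter()  # lower bound of a power-of-two bucket mapped to a count
    for kind, tid, ts in events:
        if kind == "pre":
            start[tid] = ts
        elif tid in start:
            duration = ts - start.pop(tid)
            # Lower bound of the power-of-two bucket; zero stays in bucket 0.
            hist[1 << (duration.bit_length() - 1) if duration > 0 else 0] += 1
    return dict(hist)
events = [("pre", 1, 100), ("pre", 2, 110), ("post", 1, 400), ("post", 2, 150), ("post", 3, 500)]
print(latency_histogram(events))
```
A "post" without a matching "pre" (thread 3 here) is ignored, which is what happens when tracing starts in the middle of an operation.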

21:24.320 --> 21:34.560
Okay, cool. This is all we need to do. I haven't talked much so far about

21:35.120 --> 21:43.440
uprobes, and uprobes are, as I said at the beginning, just hooking functions.

21:43.440 --> 21:49.040
And so I just want to say yes we can also use them to find something useful out. And this

21:49.200 --> 21:56.880
looks a bit weird. I mean, if you look at these uprobes, what is that? For C++ developers

21:56.880 --> 22:02.960
it's probably obvious: this is like a mangled function name. You can use wildcards there and all kinds

22:02.960 --> 22:10.560
of stuff. But what I'm hooking here is, in RADOS Gateway, the op processing, from init, this one,

22:10.560 --> 22:18.400
to complete. And so I can get histograms with that over the operation processing, like for every get,

22:18.400 --> 22:24.560
for every put, for every copy object in RADOS Gateway. And to do this, it needs

22:24.560 --> 22:29.600
debug info. I have to cast pointers to something and then traverse them to some interesting

22:29.600 --> 22:35.280
points. It's totally doable but a bit like unfriendly. The best thing you can always have with

22:35.280 --> 22:46.480
tracing is nice USDT trace points. All right, with that, time for a recap. So we started with how to trace,

22:46.800 --> 22:52.400
what are languages, what are some good tools to start doing that. Then the first example from

22:52.400 --> 22:58.320
counters to event tracing, on how to learn things about Ceph. And at the end we learned that

22:58.320 --> 23:05.920
we can not only trace events but also aggregate them into metrics. And where to go from here?

23:05.920 --> 23:12.480
I always like to have things like that. What I would find interesting to do next

23:12.480 --> 23:20.880
is, there is a tool called ebpf_exporter that allows you to convert the trace points directly into

23:20.880 --> 23:27.200
things that Prometheus can consume. And this seems to be interesting to, like, prototype possible

23:27.200 --> 23:31.680
metrics that would be interesting to have in Ceph, just before doing the development work.

23:31.680 --> 23:37.520
I would also love to have some error tracing tools that, if errors occur, allow me to further

23:37.600 --> 23:42.400
find out what they are about, or, like, a what's-up tracer where you just say: here's the process

23:42.400 --> 23:48.400
in Ceph, tell me if anything sensible is going on there. Or integrate it with a logger

23:48.400 --> 23:53.120
to have conditional log messages where you are just interested: I want these log messages

23:53.120 --> 24:00.720
through the tracer, and not have them fill up my log thing somewhere. And, like, a tiny

24:00.720 --> 24:05.360
call to action: I already started collecting all the scripts I showed here in a repository.

24:05.360 --> 24:12.240
I would love to get, like, a BPF tracing toolkit repository going, like you will find so many

24:12.240 --> 24:20.400
of these tools for, like, system-wide tracing already. And with that, the two links: first, the

24:20.400 --> 24:26.000
one is more like a notebook file for all the things you saw in the talk, and this one is for the toolkit.

24:26.000 --> 24:34.800
And then thank you.

24:34.800 --> 24:42.720
I think I have five minutes' time for questions.

24:42.720 --> 24:47.360
Yeah.

24:47.360 --> 24:52.160
Sorry.

24:52.160 --> 24:58.800
The storage.

24:58.800 --> 25:04.720
It depends. The thing is if you're collecting those histograms they're all collected in memory

25:04.720 --> 25:12.240
in the kernel memory. Oh yeah. The question was how did I manage the storage of all the traces?

25:14.080 --> 25:19.520
There are two answers to that. For event tracing, I don't. I just stored them on my hard disk.

25:19.840 --> 25:26.720
But I didn't do any high-rate stuff. But for stuff that you are accumulating or condensing into

25:26.720 --> 25:32.160
histograms for example, this is all done in kernel memory and you only get the results back.

25:32.160 --> 25:38.240
And this is not so much data. So you don't have to, like, go from kernel to user space for every event.

25:39.200 --> 25:42.400
Does that answer your question? Great.

25:42.400 --> 25:48.080
Yeah. So if I want to tune this put object with things like 10 settings.

25:49.760 --> 25:55.520
How would I find the actual source code for this? And the other question is how much of

25:55.520 --> 25:59.680
access is going on in parallel as you can see?

25:59.680 --> 26:05.280
Okay. The first one is how do I find the source code for these traces?

26:05.280 --> 26:10.480
So you have shown, in the put object, yeah, there's like 8 or 9 setxattrs, for example.

26:12.480 --> 26:17.280
If I want to make this asynchronous, in parallel, how do I find the source code in whatever

26:17.280 --> 26:22.160
RADOS? Is there a link somehow? Is it possible with backtraces?

26:23.200 --> 26:27.200
Yeah, you can always, in your trace points, request backtraces. There is a function

26:28.160 --> 26:34.880
called either kstack or ustack. If you print that, you get the backtrace

26:34.880 --> 26:38.960
when you hit that trace point, yeah. You can do that to find the source code.

26:38.960 --> 26:43.120
I think if you have debug info, it can tell you more details about that.

26:44.480 --> 26:50.720
What symbols and what is that? Frame pointers are usually enough to get good backtraces.

26:51.280 --> 26:53.680
And the second part was asynchronous.

26:57.600 --> 27:04.480
I mean, how much of the tracing is asynchronous, in parallel? I think all of it, basically.

27:04.480 --> 27:09.680
So if you have the trace point, it hits like on every thread and you even get the thread ID

27:09.680 --> 27:14.960
and the nice thread name and things like that. This is why sometimes things like the count

27:14.960 --> 27:22.320
function are important, because that handles the counter increment between threads.

27:22.320 --> 27:25.920
So it has that all built in, in bpftrace.

27:42.560 --> 27:49.120
I did RADOS Gateway. Can this be useful for block devices? It depends.

27:49.840 --> 27:56.720
I did most of the tracing for user space, and RADOS Block Device is, I think,

27:56.720 --> 28:00.880
either user space or kernel space. For kernel space, you need kernel trace points.

28:02.480 --> 28:08.400
I'm not aware that there are, like, nice kernel-space trace points, but you can do function tracing

28:08.400 --> 28:14.640
in the kernel as well. For the user space, I don't know if there are any USDT probes, but you can

28:14.880 --> 28:16.880
resort to function tracing all the time.

28:21.760 --> 28:22.400
Oh, yeah.

28:45.280 --> 28:52.800
The question was about the performance implications of tracing and stuff.

28:55.600 --> 29:00.960
I know a couple of results that I didn't replicate so far.

29:00.960 --> 29:10.560
So the ebpf_exporter project has a benchmark section. They claim that USDT probes cost

29:10.560 --> 29:19.120
around two microseconds. The conventional wisdom is, if it doesn't happen too often, it's

29:19.120 --> 29:27.680
okay to trace it. If you're starting to trace malloc, you will see that it hurts in terms of

29:27.680 --> 29:34.000
performance. But this is something I really want to look into as well. What does it mean? Can I

29:34.080 --> 29:35.440
really run it in production, right?

29:51.600 --> 29:56.800
So, okay, the question was: is there a circuit breaker against taking down my application with too many

29:56.800 --> 30:00.400
traces? I don't think there is. You can build your own, I guess.

30:00.400 --> 30:10.800
Yeah. Yeah. Okay, time is up. We can talk outside and anybody else wants to be stopped.