WEBVTT

00:00.000 --> 00:10.240
Hello and welcome to my talk. Compared with the previous talks, where people talked

00:10.240 --> 00:16.720
about... I cannot speak louder, sorry, that's my voice.

00:16.720 --> 00:23.200
The other presenters were talking about their company efforts or the community efforts. This

00:23.200 --> 00:30.640
time, I will present what happens in a smaller community that is at the bottom of the supply

00:30.640 --> 00:36.640
chain, more precisely in the Apache NuttX real-time operating system.

00:36.640 --> 00:44.320
My name is Alin Jerpelea, I'm an open source advocate, member of several communities. Currently

00:44.320 --> 00:52.120
the Apache NuttX real-time operating system chair, and I work as an open source architect

00:52.120 --> 00:58.840
in the Sony OSPO, Europe. You can contact me on LinkedIn. Due to the time limit, I will

00:58.840 --> 01:05.720
run through the slides, so the disclaimer is there, I will not read it. Most of you probably

01:05.720 --> 01:11.320
didn't know about the NuttX real-time operating system, so I will have a small introduction

01:12.440 --> 01:19.960
so that we set the context. Apache NuttX is a small-footprint real-time operating system,

01:20.680 --> 01:28.360
it uses standards and it's focused on compliance, it runs on systems, microcontrollers,

01:28.360 --> 01:37.640
that are between 8 and 64 bits. It is available on more than 400 boards, it provides documentation

01:37.640 --> 01:45.000
and a welcoming community. Also, it has been used and it is used in commercial products.

01:45.000 --> 01:52.440
You can see a few examples of development boards. The one on the right is from Sony, it's called Sony

01:52.440 --> 02:00.120
Spresense. History, again, running quite fast: the project was released under the permissive

02:00.120 --> 02:08.280
BSD license by Gregory Nutt in 2007. In 2019, it was donated by Gregory Nutt to the Apache

02:08.280 --> 02:15.240
Software Foundation. Thank you, Gregory, for this donation. We graduated in 2022 as a top-level

02:15.240 --> 02:24.200
project. In 2024 and currently, the governance of the project is an open governance.

02:24.200 --> 02:33.240
We have 24 members in the committee and we vote on all the big decisions. We have many products using

02:33.240 --> 02:40.040
NuttX. You see a few pictures. I hope that they are right, since we discover what companies and

02:40.040 --> 02:47.080
products use NuttX when companies come forward to us, but otherwise we have no idea what people are doing

02:47.080 --> 02:54.280
with NuttX, and you will soon see an even more interesting one. Digital recorders,

02:54.440 --> 03:02.120
headphones, drones, robots, protection equipment. You can find all those examples in the

03:02.120 --> 03:08.840
recordings for the NuttX International Workshop conference, where people come forward and talk about

03:08.840 --> 03:22.760
their use cases. Also, Xiaomi is using NuttX in its IoT platform, the platform is called

03:22.760 --> 03:33.160
Vela, and they are contributing a lot to the project. Sure. Also, NuttX is used,

03:33.160 --> 03:40.280
was used, on a small robot that was landed on the moon by the JAXA space agency, using the Sony

03:40.280 --> 03:47.080
Spresense development board, exactly the board that you saw previously. So you have the full announcement

03:47.160 --> 03:56.440
and you can Google for it. Now that we have the use case for this operating system,

03:56.440 --> 04:03.880
you probably realize why it is important for us to be compliant and why we want to provide

04:03.880 --> 04:14.120
as a community the SBOM. So during our transition from a BSD project to an Apache project, which

04:14.200 --> 04:18.840
changes the licenses, you can see the progression and the versions at the bottom.

04:20.040 --> 04:26.360
Really, it is less important. It is roughly two years of work. We had to scan every bit of code

04:26.360 --> 04:32.200
and make sure that the licenses are compliant and that we do a proper license transition. And then,

04:32.200 --> 04:40.280
because we identified licenses which were not Apache, we ended up having especially in the application

04:41.160 --> 04:49.400
side. We ended up having a menu item; as you see, it's a fairly common view for a kernel, where people

04:49.400 --> 04:56.840
can select their licenses. It is impossible to build NuttX without the BSD and the MIT components

04:56.840 --> 05:04.680
because they are used in the core but you can exclude everything else if you don't want it in your

05:04.680 --> 05:13.160
applications. And then we wanted to see: is that really true? So we wanted to have an SBOM to

05:13.160 --> 05:21.960
get a list of what's actually built. We will go back to that later. What is an SBOM? Because everybody

05:21.960 --> 05:30.440
is talking about SBOMs and we as a community were a bit confused. An SBOM is a software

05:30.440 --> 05:37.320
bill of materials, a list of all components present in the code base, including license and version

05:37.320 --> 05:44.760
metadata, which allows security teams to quickly identify security risks. The definition is really nice,

05:44.760 --> 05:55.960
but which SBOM do we need? According to the CISA listing there are six. Do we need all of them?

05:56.760 --> 06:05.240
We were groups. Design? The software is there. We don't design anything. People contribute to it.

06:05.240 --> 06:14.520
We have no control. Source: we can see the whole sources, but would that help the companies

06:14.520 --> 06:20.040
that use our product to really know what they put in their product because they will get a list

06:20.040 --> 06:26.520
of all the licenses, files and correlations for the whole NuttX. They will build a subset.

06:28.520 --> 06:37.480
So we were thinking that a build SBOM would be what would make some users, some of our users,

06:37.480 --> 06:45.000
our companies, happy. So we decided to go with source, to provide a full index of everything that we

06:45.000 --> 06:51.720
have for every release and a build SBOM which means that when you build you get exactly what you have.

06:53.560 --> 07:00.360
And then we were looking even further because there was little information when we started this

07:00.360 --> 07:10.120
discussion two, three years ago that we need SBOMs. Most information that you find is for products

07:10.120 --> 07:16.440
that have an SBOM for packages and this is how you would have it. You download the package,

07:16.440 --> 07:22.440
you get the package data, you get the SBOM data, you assemble it and you put a product on the market

07:22.440 --> 07:29.960
with the SBOM data or at least this is our own understanding. But in our case you get code,

07:29.960 --> 07:39.720
it has nothing. The company has to generate an SBOM from whatever and then you put the SBOM

07:39.720 --> 07:47.240
and the software release on the market. So thinking about this picture, we were thinking, okay,

07:47.240 --> 07:54.680
that means that we have to integrate the SBOM generation in our build tools so that when you build

07:54.680 --> 08:05.320
your binary you get also the SBOM for exactly the product that you're building. And to put a bit of

08:05.320 --> 08:13.160
salt on the wound, we had no idea how to do that. At that point we had no idea about SPDX,

08:13.160 --> 08:22.360
there were a few mentions of it, but we had licenses in clear text. So we started looking at

08:22.360 --> 08:29.000
SPDX. I joined the SPDX community, and it's a really nice community, a welcoming community,

08:29.160 --> 08:39.400
sharing information. And we decided that we will use this. Also a bit confusing in our research:

08:40.120 --> 08:46.280
which version should we go for? The latest and the greatest? Should we go for what the others

08:46.280 --> 08:55.640
seem to go for? So we decided to start with version 2.3, to be more exact. And then when we see

08:55.720 --> 09:04.840
others using the newer format, we will just update the tools. We decided to go as early as possible

09:04.840 --> 09:14.120
and then we will see. The SPDX website provides many tools. I started looking at all of them

09:14.120 --> 09:20.840
and from my findings, I couldn't use them for generating an SBOM for

09:20.840 --> 09:31.800
C code for a real-time operating system. So okay, then we started looking at the whole landscape of

09:32.680 --> 09:42.120
operating systems that have almost the same goals, from communities, from companies, active or

09:42.440 --> 09:50.440
archived. And we saw that we have FreeRTOS, which implemented a build SBOM quite recently.

09:51.320 --> 09:57.960
This slide was updated last week. When we did the initial investigation three years ago, the

09:57.960 --> 10:07.400
landscape was different. Then we have Zephyr OS, which provides both source and build SBOMs,

10:07.960 --> 10:14.920
and we have NuttX, which hopefully will have the first release with a source SBOM and a build

10:14.920 --> 10:23.800
SBOM at the build time in March in the next release. So now we have an idea of what is available,

10:24.360 --> 10:34.520
an idea of what other communities are doing, an idea of how we can do it. So we started working.

10:34.520 --> 10:42.840
And this was the painful part. Okay, we are an Apache project under the Apache Foundation.

10:42.840 --> 10:48.840
We added the SPDX license identifier and we were done, or so we thought.

10:50.680 --> 11:00.600
Yeah, old code. BSD-0, identified with FossID. I know it's a commercial tool. I used what I had.

11:00.840 --> 11:11.720
A long list of copyright owners, a long list of contributors, information quite scarce there.

11:13.480 --> 11:22.680
Or we ended up with code that has multiple licenses. Really nice. Also identified by the scanning tools.

11:23.000 --> 11:31.880
Unknown license. I saw this license for the first time. I don't know how many people are

11:31.880 --> 11:35.880
aware of it, but I couldn't identify it from the text. I needed help.

11:38.600 --> 11:45.480
And also the tools can be misleading as you can see here. Quite nice help from the Xiaomi guys

11:45.480 --> 11:53.640
because my tool gave a false positive. I was unsure. They identified the code as BSD-0.

11:54.920 --> 12:04.440
My tool identified it as GPL. So we documented everything. If somebody has anything to say later

12:04.440 --> 12:11.160
that the license was not properly checked, we have the answer from two conflicting tools. And

12:11.960 --> 12:20.760
yeah, I mean help. The community is based on help. We help each other to make things work.

12:23.960 --> 12:32.760
Also there are SPDX identifiers that we cannot find. And initially I was looking for something called

12:32.760 --> 12:38.200
public domain. Public domain is handled differently in different jurisdictions. So there is no

12:38.200 --> 12:46.440
identifier. We defined our own, NuttX-PublicDomain, and I plan to submit it to the SPDX list.
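
NOTE
Editor's sketch, not shown in the talk: in SPDX 2.3 a license that is not on the SPDX list is normally declared with a LicenseRef- identifier in the document's hasExtractedLicensingInfos section. The identifier LicenseRef-NuttX-PublicDomain, the entry name, the placeholder text and the helper custom_license_entry are assumptions for illustration only; the project's real identifier and notice text may differ.
```python
import json
import re
# Hypothetical identifier; SPDX 2.3 expects the "LicenseRef-" prefix for non-listed licenses.
PUBLIC_DOMAIN_ID = "LicenseRef-NuttX-PublicDomain"
def custom_license_entry(license_id, name, text):
    """Build one SPDX 2.3 'hasExtractedLicensingInfos' entry for a non-listed license."""
    # The part after the prefix may only contain letters, digits, '.' and '-'.
    if not re.fullmatch(r"LicenseRef-[A-Za-z0-9.\-]+", license_id):
        raise ValueError("not a valid LicenseRef identifier: " + license_id)
    return {"licenseId": license_id, "name": name, "extractedText": text}
# Placeholder text; a real entry would carry the verbatim notice found in the source file.
entry = custom_license_entry(PUBLIC_DOMAIN_ID, "NuttX public domain notice", "<verbatim notice text from the source file>")
print(json.dumps({"hasExtractedLicensingInfos": [entry]}, indent=2))
```
Declaring the entry once in the document lets every file that carries the custom identifier resolve to the same text, so downstream tools do not report it as unknown.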

12:47.080 --> 12:54.600
So that every time an SBOM is handled by somebody, they can see that it is defined

12:54.600 --> 13:04.280
and properly recognized and propagated in the SBOM. And then we had all the information that we

13:04.280 --> 13:12.840
needed. We had SPDX headers, identifiers. And we started defining what we want to do. We wanted

13:12.840 --> 13:20.040
to make it fully automatic. We wanted to reuse our build system, which is make-based,

13:20.840 --> 13:29.800
and collect information. What information? Our build system looks mostly like this and it

13:29.800 --> 13:37.800
would make sense. So basically we have the NuttX image. We have the files. Then we get the sources

13:37.800 --> 13:44.760
and the headers, and the headers are reused and used by many. I tried to simplify the whole thing

13:44.760 --> 13:52.360
because it's quite massive. So the build system will give us dep files for each library. As you

13:52.360 --> 13:58.040
can see, those are the ones that I've been using for the demo. And then for each library you have the

13:58.120 --> 14:04.520
C files that will go in the library, and the C files get the headers for each one of them,

14:04.520 --> 14:11.720
including the system ones. So all the dependencies are aggregated in a big file and then clearing

14:11.720 --> 14:20.920
the path so that it can be reused. We populate the file with the proper files and paths.
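
NOTE
Editor's sketch, not the project's actual tooling: the aggregation step described above, assuming the dep files use the usual Make rule syntax (target, colon, prerequisites, backslash continuations). The helper names, the sample rule and the /work/nuttx root are hypothetical.
```python
import posixpath
def parse_dep_text(text):
    """Parse Make-style dependency rules ('target: prereq ...') into {target: [prerequisites]}."""
    deps = {}
    # Join continuation lines that end with a backslash before splitting into rules.
    for rule in text.replace("\\\n", " ").splitlines():
        if ":" not in rule:
            continue
        target, prereqs = rule.split(":", 1)
        deps.setdefault(target.strip(), []).extend(prereqs.split())
    return deps
def aggregate(dep_texts, root):
    """Merge several dep files into sorted, de-duplicated lists of sources and headers."""
    sources, headers = set(), set()
    for text in dep_texts:
        for prereqs in parse_dep_text(text).values():
            for path in prereqs:
                norm = posixpath.normpath(path)
                # Assumption: paths under the project root become relative, system headers stay absolute.
                if norm.startswith(root + "/"):
                    norm = posixpath.relpath(norm, root)
                (headers if norm.endswith(".h") else sources).add(norm)
    return sorted(sources), sorted(headers)
sample = "main.o: /work/nuttx/sched/main.c /work/nuttx/include/nuttx/config.h \\\n  /usr/include/stdint.h\n"
print(aggregate([sample], "/work/nuttx"))
```
Using sets means a header shared by many C files appears only once in the aggregated list, which matches the point that headers are reused by many sources.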

14:21.880 --> 14:34.120
Yeah, this one I already jumped over. SPDX to the rescue. You get really nice documentation

14:34.120 --> 14:41.320
on what fields you actually need and what fields you have to have, what fields are nice to have,

14:41.320 --> 14:50.040
what is the format for them. So if you have any SPDX doubts, just go and read the manual. It's

14:50.120 --> 14:59.400
wonderfully done. Also, they provide the examples. So you can just take a look at their examples

15:00.040 --> 15:07.400
and then you figure out what would work for you and what not. So having also this extra information

15:07.400 --> 15:15.000
we were able to finally have our objectives and define what we need. We decided to go with SPDX

15:15.080 --> 15:21.640
2.3. We decided to go with the JSON format because it's easy to parse and easy to transfer

15:21.640 --> 15:27.640
to any other format. There are multiple converters out there. You can also convert it to

15:27.640 --> 15:34.520
CycloneDX, if that is what you need. We decided to collect file hashes, licenses and relationships

15:34.520 --> 15:44.520
between files for both sources and build artifacts. So that is how our header looks. I'm not

15:44.600 --> 15:52.360
100% sure that this is all that we need, but this is what we have at this moment. So it's a simple

15:52.360 --> 16:04.280
snippet from what we did. I generated the header on the 24th. Then for source files we also collect

16:04.600 --> 16:14.120
file name, file type, the ID, SHA-1 sum, licenses, also the license from the file and the license

16:14.120 --> 16:21.160
concluded. If the file has no license, the concluded license is NOASSERTION. If the file has multiple

16:21.160 --> 16:28.600
licenses, the concluded license will be either Apache or, if you have a GPL, it will be GPL.
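
NOTE
Editor's sketch, not the project's generator: a minimal SPDX 2.3 JSON document with one file entry, using the concluded-license rule as described in the talk. The document name, namespace URL, creator string, file path, timestamp and both helper functions are hypothetical; only the SPDX field names come from the SPDX 2.3 JSON format.
```python
import hashlib
import json
def concluded_license(found):
    """Apply the rule described in the talk to the licenses detected in one file."""
    if not found:
        return "NOASSERTION"
    if len(found) == 1:
        return found[0]
    # Assumption: any identifier starting with "GPL" counts as "a GPL".
    gpl = [lic for lic in found if lic.startswith("GPL")]
    return gpl[0] if gpl else "Apache-2.0"
def file_entry(name, content, found):
    """Build one entry for the SPDX 'files' array from a path, its bytes and its detected licenses."""
    sha1 = hashlib.sha1(content).hexdigest()
    return {
        "fileName": "./" + name,
        "SPDXID": "SPDXRef-File-" + sha1[:12],
        "fileTypes": ["SOURCE"],
        "checksums": [{"algorithm": "SHA1", "checksumValue": sha1}],
        "licenseInfoInFiles": found or ["NOASSERTION"],
        "licenseConcluded": concluded_license(found),
        "copyrightText": "NOASSERTION",
    }
document = {
    "spdxVersion": "SPDX-2.3",
    "dataLicense": "CC0-1.0",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "nuttx-build-sbom-example",
    "documentNamespace": "https://example.invalid/spdx/nuttx-build-sbom-example",
    "creationInfo": {"created": "2025-01-01T00:00:00Z", "creators": ["Tool: example-sbom-sketch"]},
    "files": [file_entry("sched/init/nx_start.c", b"/* example */\n", ["Apache-2.0", "BSD-3-Clause"])],
}
print(json.dumps(document, indent=2))
```
Keeping licenseInfoInFiles (what the file says) separate from licenseConcluded (what the project decided) preserves the evidence behind each decision.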

16:29.560 --> 16:39.800
Okay. Same thing for headers. We use our own headers and we also use system headers. Everything

16:39.800 --> 16:49.080
is documented with the SHA sums from each of the files. Some of them have no license information

16:49.080 --> 17:01.880
due to obvious reasons. And we thought that things were done, but they are not. As you would

17:01.880 --> 17:08.520
have dependencies on your package manager, we have dependencies on source code. So we are getting

17:08.520 --> 17:17.640
at build time, different git clones, basically, from other projects. And in our case, we are using,

17:18.520 --> 17:25.800
there are many more, roughly 50 projects. So in our folder, we only have the rules to build,

17:25.800 --> 17:31.640
but we do not have the sources. And the problem is that those projects either do not have

17:31.640 --> 17:41.640
SPDX identifiers or we cannot trust them. So the logical thing to do for us as a community

17:41.640 --> 17:50.520
is to help them do a license check, help them get SPDX identifiers on all our projects which

17:50.520 --> 17:56.920
are dependencies and contribute upstream because they may need the helping hand. They may be

17:56.920 --> 18:05.480
a one-man project. The alternate fix, which I do not suggest anyone do, is to have patches

18:06.360 --> 18:13.160
near the building rules for each file and add the licenses locally. It is ugly and it will not

18:13.160 --> 18:23.640
help the community. So yeah. And my final slide before some Q&A, I think that we still have a few minutes.

18:23.880 --> 18:36.520
This journey was almost three years. For me, it started with a lot of information. It was

18:36.520 --> 18:45.160
overwhelming information. People were mentioning SPDX, but without any reference to an embedded

18:45.160 --> 18:52.600
operating system. SPDX are great, but they need some kind of definition for somebody that is new.

18:53.000 --> 18:59.800
You need an SBOM for packages, an SBOM for cloud, an SBOM for your project.

19:02.040 --> 19:10.200
OpenChain and SPDX are wonderful communities. If you have any doubt on anything, just go

19:10.200 --> 19:19.720
there, read, ask. People will help. Looking at our dependencies, SPDX identifiers may be missing.

19:20.520 --> 19:27.720
I bet that is the case for all of you here and all of you online. Maybe it's good to just

19:28.520 --> 19:35.640
talk with the maintainers of those projects to make them aware that you need SPDX headers.

19:37.880 --> 19:43.720
It may be good to try to help them. Maybe they do not have the resources for doing this work.

19:44.680 --> 19:53.480
Also coming from both a company and an open source background, I can say join the open source

19:53.480 --> 20:01.640
community. If you are interested in a specific area, in this case, OpenChain, SPDX, join those

20:01.640 --> 20:08.440
communities. They will help you get better and you can help them understand what the problem is.

20:08.840 --> 20:12.840
Thank you.

20:18.840 --> 20:22.840
Questions for Alin, who was kind enough to leave time?

20:22.840 --> 20:31.000
Yeah, great talk. You spent a lot of time on the source SBOM. Really good what you do.

20:31.000 --> 20:34.920
But actually, did you get a request, a real request, for the source SBOM?

20:36.040 --> 20:44.120
We as a community. Yeah, so the question was if there was a request for a source SBOM.

20:44.760 --> 20:50.840
No, we as a community started discussing this because we are at the bottom of the supply chain.

20:51.480 --> 20:57.800
So if we don't provide a build SBOM, the company building the software cannot do it.

20:58.760 --> 21:01.400
It has to be implemented in the build system.

21:03.000 --> 21:08.840
They will get a full source SBOM, they can do that from the scanning tools, but that is not what

21:08.840 --> 21:18.760
their product has. It's, yeah, you have a subset of all the sources. Thank you for the question.

21:18.840 --> 21:19.400
Yes.

21:32.680 --> 21:35.560
The question was, how do we deal with external?

21:36.520 --> 21:39.480
Have a test version of some library in there that you want to see?

21:45.480 --> 21:52.040
The question was regarding projects that are not your direct dependencies.

21:52.040 --> 21:58.600
They may be tools or test projects that will end up in your SBOM, but they are not used.

21:58.600 --> 22:02.600
The answer is, for this project, we don't have them.

22:02.840 --> 22:06.280
We only have what we actually use and what we actually build.

22:08.680 --> 22:10.680
We don't get them in our source.

22:12.280 --> 22:18.120
But this again is our case. In other cases, probably there are other answers.

22:33.560 --> 22:39.320
Sorry, I'm sorry. I don't know how to handle it for Java.

22:41.320 --> 22:45.080
There is one last question, I think, or maybe two.

22:54.600 --> 23:01.000
Yes, the source, the question was if there is a difference between source and build SBOM.

23:01.080 --> 23:06.600
The source SBOM will be delivered at the packaging time for each release,

23:06.600 --> 23:10.920
and it contains all the information for all sources in the project.

23:11.480 --> 23:15.320
While the build SBOM, let's say that you have a Raspberry Pi Pico,

23:16.200 --> 23:21.560
will contain only the files that end up in the final image on your Raspberry Pi Pico.

23:23.560 --> 23:27.080
With the applications and everything that you selected in the menu.

23:27.960 --> 23:31.960
We have time for, I think, one last question. Yes.

23:38.520 --> 23:40.200
In the next generation?

23:40.200 --> 23:46.360
I have, yeah, the question was to how many projects we contribute SBOM information.

23:46.360 --> 23:50.520
At this moment, zero, we are just fixing our own SBOM.

23:51.240 --> 23:56.920
But the aim is to help our dependencies and then see how we can help.

23:58.040 --> 23:59.800
So, thank you very much.

23:59.800 --> 24:01.800
Thank you.

