WEBVTT

00:00.000 --> 00:09.000
I'll just dive straight in with a quick introduction. I work for the OSI as well.

00:09.000 --> 00:14.000
I'm Jordan Maris, I'm their EU policy analyst, so I spend actually all of my time here in Brussels.

00:14.000 --> 00:20.000
Working on EU-related issues, so a lot of my time is spent trying to educate lawmakers,

00:20.000 --> 00:28.000
and trying to work with lawmakers as well when we're developing laws, so that open source is kept in mind.

00:28.000 --> 00:32.000
I'm a former EU parliamentary staffer, and I've been a user of open source since I was a child,

00:32.000 --> 00:40.000
when somebody gave me a Linux live CD. It took about six hours to get the Wi-Fi working, but I did get there and stuck with it afterwards.

00:40.000 --> 00:45.000
So basically, in order to explain to you why we've done the things we've done, I'm going to tell you a little story.

00:45.000 --> 00:51.000
A long time ago, before ChatGPT was a thing, in a parliament approximately 20 minutes from here,

00:51.000 --> 00:57.000
there was a young policy advisor, and he was sat in the negotiations on the AI Act,

00:57.000 --> 01:03.000
along with a couple of other people who are just outside this room, and as the negotiations were coming to a close,

01:03.000 --> 01:06.000
he made the terrible mistake of asking, what about open source?

01:06.000 --> 01:11.000
And this ended up creating a lot of problems for basically everyone.

01:11.000 --> 01:18.000
You see, we exempted open source from the AI Act, and we didn't define it precisely enough.

01:18.000 --> 01:25.000
The result was that a handful of companies, in particular one or two, decided that they could exploit that exemption,

01:25.000 --> 01:27.000
to get away with not complying with the law.

01:27.000 --> 01:35.000
Now, around this time, the OSI also started work on its open source AI definition, which was really great.

01:35.000 --> 01:41.000
And just after I joined the OSI, we were having a discussion in Brussels with a former colleague who was working on the law,

01:41.000 --> 01:45.000
and I introduced Simon as a director of the Open Source Initiative, explained what we do,

01:45.000 --> 01:51.000
and their question was, okay, well, we passed this law, we said open source AI is exempt, but what does it actually mean?

01:51.000 --> 01:57.000
And that's the problem that I've been faced with over the past five months.

01:57.000 --> 02:01.000
And the problem is basically simple: there are two visions of open source AI in Brussels.

02:01.000 --> 02:09.000
The vision that Meta have, which is not particularly right in my opinion, and then there's the OSAID.

02:09.000 --> 02:13.000
So, Meta's vision is based on probably three key arguments.

02:13.000 --> 02:17.000
The first argument is that AI is different to software, it's so different, in fact,

02:17.000 --> 02:21.000
that everything that we consider to make sense doesn't make sense anymore.

02:21.000 --> 02:27.000
So, the argument is essentially, we get to redefine what open source AI means, because it's so different.

02:27.000 --> 02:33.000
The second argument, which they use, is that AI poses a threat, a fundamental threat,

02:33.000 --> 02:37.000
and giving it to everybody without restrictions or control is a risk.

02:37.000 --> 02:45.000
The third argument is that their AI is the most used, and it's true, Llama has been deployed many times, but that doesn't make it free.

02:45.960 --> 02:49.000
Unfortunately, they've been very, very successful in Brussels.

02:49.000 --> 02:57.000
So, what we have essentially is an army of corporate lobbyists who want to redefine what open source means, and they are winning.

02:57.000 --> 03:05.000
They're winning as well because, although we are an enormous community, we are not very well represented in Brussels.

03:05.000 --> 03:07.000
We are more and more represented in Brussels.

03:07.000 --> 03:13.000
So, we've gone from having a handful of people doing open source here to about seven, which is great news for us.

03:13.000 --> 03:19.000
We have to fight back, and the OSAID is the way to fight back.

03:19.000 --> 03:25.000
The reasons for that are that what lawmakers require from us, what they want to hear from us,

03:25.000 --> 03:31.000
is a definition that they can use in all situations from a legal perspective.

03:31.000 --> 03:37.000
It means that they have to be able to apply it, and it means that it has to address the challenges that come with releasing the data.

03:37.000 --> 03:41.000
I can give you two examples, and perhaps a couple more. Data protection is an obvious one.

03:41.000 --> 03:47.000
Meta's Llama was never going to be open source in the European Union, because it's probably trained on users' data, and that creates a problem.

03:47.000 --> 03:51.000
But the question is, how do we deal with that in other cases such as medical AI?

03:51.000 --> 03:57.000
If we want to build, for example, an open source AI to analyze CT scans, how can we do that in a way

03:57.000 --> 04:03.000
that respects the privacy of the people who donate their CT scans to the data set,

04:03.000 --> 04:09.000
but also gives us the possibility to do what open source allows us to do?

04:09.000 --> 04:12.000
The second thing is the copyright rules.

04:12.000 --> 04:16.000
It's kind of difficult when you look at, for example, the Bible.

04:16.000 --> 04:20.000
It's not exactly the text that we might use primarily to train an AI.

04:20.000 --> 04:23.000
But in different jurisdictions, the Bible has a different copyright status.

04:23.000 --> 04:30.000
The King James Bible, in most of the European Union, is basically public domain.

04:30.000 --> 04:33.000
You can use it however you like. In the United Kingdom, it's Crown copyright.

04:33.000 --> 04:39.000
For image generation, you could take a lot of Italian artwork, for example, which, in most of the European Union, has no restrictions,

04:39.000 --> 04:42.000
but in Italy is under state copyright.

04:42.000 --> 04:46.000
This is quite a big problem, because the data sets that we're using to build AI are enormous.

04:46.000 --> 04:53.000
And it just takes one thing like this, one of these edge cases which we don't know how to manage, to create enormous problems.

04:54.000 --> 05:06.000
The reason that the OSAID is a good choice to fight back against open-washing is that it can adapt to these use cases.

05:06.000 --> 05:10.000
In my view, yeah, also important note, sorry.

05:10.000 --> 05:13.000
It was written with lawmakers in mind, but not for lawmakers.

05:13.000 --> 05:17.000
So the OSAID was not written for the people who are making the law.

05:17.000 --> 05:20.000
It just happens to be a very good solution.

05:21.000 --> 05:25.000
So how do we actually comply with the open-source definition?

05:25.000 --> 05:32.000
How do we build the OSAID in a way that works in all these cases, but doesn't compromise the four freedoms?

05:32.000 --> 05:40.000
Well, first of all, it mandates the sharing of all training data which can be legally and technically shared.

05:40.000 --> 05:46.000
Another example of a case where the data might not be shareable is where it's directly, essentially directly fed into the model,

05:46.000 --> 05:49.000
and the quantities of data are too large to store.

05:49.000 --> 05:54.000
So take maybe CERN or something like that as an example, you know, radio telescopes, that sort of thing.

05:54.000 --> 05:58.000
It mandates sharing where that data can be acquired.

05:58.000 --> 06:05.000
So let's say, for example, we have an image generation AI, which has been trained on films.

06:05.000 --> 06:11.000
It mandates that you say exactly which films it was trained on and where they can be acquired for the purposes of AI training.

06:11.000 --> 06:18.000
Third thing, it mandates quite highly detailed granular descriptions of the data for the purposes of replicating something similar.

06:18.000 --> 06:28.000
So essentially, what the OSAID says is, where it is not legally or technically possible to share the data, you have to give people the instructions to build something that is equivalent.

06:28.000 --> 06:36.000
And I think that's how we can respect the four freedoms whilst still having a tool that I can use here in Brussels to fight open-washing,

06:36.000 --> 06:38.000
which is what we've been doing for the last five months.

06:38.000 --> 06:44.000
So here's what we're doing now. We are going to lawmakers, going to staffers in the European Parliament and in the Commission,

06:44.000 --> 06:51.000
talking to them about this. There's also a letter that's being prepared by MEPs with our support to be sent to the European Commission.

06:51.000 --> 06:56.000
We are participating in the European Commission's working group on codes of practice for open-source AI,

06:56.000 --> 07:02.000
where we're fighting back against acceptable use policies and other such restrictions which are incompatible with open-source.

07:02.000 --> 07:08.000
We are highlighting the open-washing Meta are doing, even when they have an army of corporate lobbyists trying to convince everybody of the contrary,

07:08.000 --> 07:14.000
and we are going around presenting the OSAID, both to technical and to political audiences across the Union.

07:14.000 --> 07:18.000
So thank you very much, really nice to hear from you.

