WEBVTT

00:00.000 --> 00:11.680
So, our next speakers are Jérôme Gorin and Amandine Jambert.

00:11.680 --> 00:15.320
And it is about auditing web trackers.

00:15.320 --> 00:18.320
So I hand over the microphone.

00:18.320 --> 00:19.320
Welcome.

00:19.320 --> 00:21.320
Thank you very much.

00:21.320 --> 00:22.320
Thank you.

00:22.320 --> 00:27.600
Thank you very much for having us.

00:27.600 --> 00:28.600
So I'm Amandine Jambert.

00:28.600 --> 00:33.600
I'm working at the EDPB secretariat as a technology expert.

00:33.600 --> 00:39.920
And we are giving this talk with Jérôme, who is a researcher in privacy at a French university

00:39.920 --> 00:42.800
that's called UniLaSalle.

00:42.800 --> 00:50.080
And so, just for those who don't know, the EDPB is the European Data Protection Board.

00:50.080 --> 00:56.960
It is a European board that includes all the national data protection

00:56.960 --> 01:07.600
authorities in Europe, plus the three from the EFTA countries, which are Liechtenstein, Norway, and Iceland.

01:07.600 --> 01:08.760
And there is also the EDPS.

01:08.760 --> 01:11.080
So we are often confused with the EDPS.

01:11.080 --> 01:20.040
So the EDPS is the supervisory authority for the EU institutions, while the EDPB is the board

01:20.080 --> 01:28.920
that helps all the data protection authorities to agree among themselves, to discuss, to move forward.

01:28.920 --> 01:34.200
In practice, I am working in the secretariat, and, just to add to the confusion,

01:34.200 --> 01:37.440
the secretariat is technically provided by the EDPS.

01:37.440 --> 01:41.040
But it just means that on my payslip it's written EDPS.

01:41.040 --> 01:47.400
But in practice, my boss is the chair of the EDPB, who is for now the head of the

01:47.400 --> 01:51.320
Finnish SA, Anu Talus.

01:51.320 --> 01:58.160
And so in practice at the EDPB, we have a lot of different tools to help all the data protection

01:58.160 --> 02:02.680
authorities thrive, and one of them is the Support Pool of Experts.

02:02.680 --> 02:07.080
The idea of this program is that we have a call for expressions of interest for experts in

02:07.080 --> 02:14.320
legal and technical fields, and they can say, okay, we are interested in having a contract

02:14.320 --> 02:15.320
with you.

02:16.240 --> 02:18.160
Actually, we have a lot of people in these fields.

02:18.160 --> 02:25.080
Yes, we have 30% technical and 75% legal, but it just means that we also have people who are

02:25.080 --> 02:27.320
both technical and legal.

02:27.320 --> 02:32.640
And we have all those people whom we can pick for our projects.

02:32.640 --> 02:39.640
Since July 2022, we have had 22 projects, out of which 18 have been completed, and we try to

02:39.680 --> 02:48.800
make them public, so we are publishing them once they are ready, as the slide says.

02:48.800 --> 02:51.960
And most of these projects are reports.

02:51.960 --> 02:57.680
For example, we have the one-stop-shop ones: the idea is we will have one thematic, and we

02:57.680 --> 03:03.120
will pick a lawyer or legal professor, and we will look at all the decisions that have

03:03.120 --> 03:07.640
been taken on this subject by all authorities, and we will do a report.

03:07.640 --> 03:15.360
We also have technical reports, like the standardized messenger audit, where the professor

03:15.360 --> 03:21.640
looked at what kind of requirements we could have to do an audit on messenger apps.

03:21.640 --> 03:24.760
So we have all those reports, and we have the exception.

03:24.760 --> 03:30.480
And the exception is the EDPB Website Auditing Tool, which is software that has been

03:30.480 --> 03:32.640
developed by Jérôme.

03:32.640 --> 03:35.880
And so what was the idea of the WAT?

03:35.880 --> 03:38.440
So the internet is full of cookies everywhere.

03:38.440 --> 03:44.600
We have cookies, for example, when you are authenticating your user: you want to be sure

03:44.600 --> 03:48.400
that it's the same person on one page and the next one, so you need a cookie, but you also

03:48.400 --> 03:53.080
have tracker cookies, so you have a lot of different kinds of cookies.

03:53.080 --> 04:02.360
And the thing is, whenever you are using a cookie to talk about one individual, then you

04:02.360 --> 04:10.240
have personal data, and so the GDPR applies.

04:10.240 --> 04:14.920
You also have a second piece of legislation that applies, that is the ePrivacy Directive,

04:14.920 --> 04:19.520
and in fact, not all data protection authorities, but some of them, are also responsible

04:19.520 --> 04:21.720
for the application of ePrivacy.

04:21.720 --> 04:27.440
In any case, at least for the first one, authorities will need, when they audit a website,

04:27.440 --> 04:35.280
to check whether the cookies are, you know, legitimate or not, well done or not.

04:35.280 --> 04:38.480
And so what do they have to look at?

04:38.480 --> 04:44.280
The first thing is, they have to understand what this cookie is, what it is used for, okay?

04:44.280 --> 04:46.840
So that's what we call the purpose.

04:46.840 --> 04:52.000
And it's very important because it has a legal impact.

04:52.000 --> 04:57.080
If you have a technical cookie, and it's just here so your website functions, then you

04:57.080 --> 04:59.840
don't have to ask consent for it.

04:59.840 --> 05:05.640
But if you are, for example, using a cookie for advertising purposes, then you need to ask,

05:05.640 --> 05:09.280
you know, your user to consent to get this cookie.

05:09.280 --> 05:11.160
And then we are talking about consent.

05:11.160 --> 05:16.640
So if we are talking about consent, the first thing is, I have cookies there, okay?

05:16.640 --> 05:21.120
I cannot ask you to eat them before asking you if you want them, agreed?

05:21.120 --> 05:27.080
So the first thing is you have to check whether the cookie arrives on the website before or

05:27.080 --> 05:30.360
after the consent, the first thing you have to check.

05:30.360 --> 05:36.920
Second, the user should be able to choose which kind of cookies they agree to eat.

05:36.920 --> 05:41.560
If I have a nice milk cookie like this one, maybe you will agree; maybe if I have a mint

05:41.560 --> 05:45.800
chocolate one, you will, you know, disagree, I don't know.

05:45.960 --> 05:54.160
The last one is, you know, any good consent should be something that you can change, okay?

05:54.160 --> 05:58.560
I can tell you, yes, I want a cookie now, and I taste it, and I don't like it.

05:58.560 --> 06:01.920
And I should be allowed to put it in the bin, okay?

06:01.920 --> 06:04.800
It's the same with the cookies in your browser.

06:04.800 --> 06:11.000
So you should be able to withdraw your consent and it should not be hard, okay?

06:11.000 --> 06:16.000
And so all of that is included in any audit that you are doing as, you know,

06:16.000 --> 06:20.760
a data protection authority whenever you are looking at a website.

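NOTE
A minimal sketch, not taken from the talk or from the tool's code, of the first check described above: finding cookies that need consent but were stored before consent was given.
The types, the purpose labels and the rule that only technical cookies are exempt are simplifying assumptions made for illustration.
```typescript
// Hypothetical types for illustration; they are not the tool's actual data model.
type Purpose = "technical" | "analytics" | "advertising";
interface ObservedCookie {
  name: string;
  domain: string;
  purpose: Purpose;
  setAt: number; // milliseconds since the start of the visit
}
// Assumption: only strictly necessary (technical) cookies are exempt from consent.
function requiresConsent(purpose: Purpose): boolean {
  return purpose !== "technical";
}
// Returns the cookies that need consent but were stored before consent was given.
// consentAt is null when the visitor never consented during the session.
function cookiesSetBeforeConsent(cookies: ObservedCookie[], consentAt: number | null): ObservedCookie[] {
  return cookies.filter(
    (c) => requiresConsent(c.purpose) && (consentAt === null || c.setAt < consentAt)
  );
}
const observed: ObservedCookie[] = [
  { name: "session", domain: "example.org", purpose: "technical", setAt: 10 },
  { name: "ad_id", domain: "ads.example.net", purpose: "advertising", setAt: 20 },
  { name: "stats", domain: "example.org", purpose: "analytics", setAt: 900 },
];
// Consent was given 500 ms into the visit, so only the advertising cookie is flagged.
const flagged = cookiesSetBeforeConsent(observed, 500).map((c) => c.name);
console.log(flagged);
```
The other two checks (choice per purpose, easy withdrawal) depend on the banner itself and are not covered by this sketch.
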
06:20.760 --> 06:30.960
For some years, we wanted to do that in a tool, and our dream tool would be easy to use,

06:30.960 --> 06:35.760
because in our authorities most officers are legal officers, okay?

06:35.760 --> 06:40.800
The technical officers are a very rare resource.

06:40.800 --> 06:48.080
So we want the legal officers to be able to do most or even all of the assessment.

06:48.080 --> 06:53.000
So we want something that is easy to use, where they can interact with the websites,

06:53.000 --> 06:56.840
because you have things where, depending on the circumstances,

06:56.840 --> 07:01.720
you may need to click on the website, on buttons or this kind of thing.

07:01.720 --> 07:08.720
And then we want to simplify our lives, because we need to do the best with our resources:

07:08.720 --> 07:15.880
to be in the same software to do the audit,

07:15.880 --> 07:18.760
do the evaluation, generate the report, send the report and so on.

07:18.760 --> 07:26.160
And we want to reuse knowledge, both for efficiency and for legal certainty.

07:26.160 --> 07:31.560
I mean, if we take a technical decision at one time, we want to be sure that our colleagues,

07:31.560 --> 07:36.520
or we in the future, will take the same decision, and for that we need to be able to reuse

07:36.520 --> 07:40.960
knowledge and also to share it with colleagues.

07:40.960 --> 07:46.560
And the last one, when we did the brainstorming about that, is that we need to create

07:46.560 --> 07:50.560
open software.

07:50.560 --> 07:57.760
So at the EDPB level, we wanted to do open software for transparency reasons, but inside some

07:57.760 --> 08:04.560
of the data protection authorities, it was also required, for mainly two reasons. The first

08:04.560 --> 08:10.720
one was that they want free and open source software, to be precise, because they

08:10.720 --> 08:16.720
want, whenever they do an audit, the auditee to be able to redo the audit and check

08:16.720 --> 08:19.560
what they have found and say, okay, you know.

08:19.560 --> 08:24.200
And so, as we said, we don't want just a browser, but we want to be able to do the evaluation

08:24.200 --> 08:30.400
and the report and so on, so we want to be able to prove that the entire software doesn't

08:30.400 --> 08:31.920
mess with the browser.

08:31.920 --> 08:39.200
So that is what we said we wanted to have.

08:39.200 --> 08:45.000
And so we looked at what was existing and we had a few things that, let's say, could qualify,

08:45.000 --> 08:52.080
and other things that were, you know, not answering our needs, maybe

08:52.080 --> 08:55.640
because they were using machine learning and so had false positives, which is something

08:55.640 --> 09:00.480
that you cannot have in audits that have legal consequences behind them.

09:00.480 --> 09:05.320
So you want to be sure that what you are saying is true.

09:05.320 --> 09:09.000
So we decided to develop the Website Auditing Tool.

09:09.000 --> 09:17.800
So with the SPE program, using Jérôme, who nicely subscribed to our call, as free and

09:17.800 --> 09:22.920
open source software under the EUPL and based on different other FOSS projects.

09:22.920 --> 09:28.040
And we will just take one minute to say that we, in particular, used the EDPS WEC.

09:28.040 --> 09:34.200
So you see, EDPS this time. It's a complementary tool; the idea of the WEC is really

09:34.200 --> 09:35.880
to do bulk audits.

09:35.880 --> 09:45.000
So it's also a website cookie auditing tool, but the thing is, theirs is a script,

09:45.000 --> 09:50.800
and you can bulk audit, but you cannot walk through a website interactively.

09:50.800 --> 09:59.600
And so in a way, the WAT is more or less, you know, we could say that the WAT is far, far

09:59.600 --> 10:04.000
away from the WEC now, but version zero was very, very close to it.

10:04.000 --> 10:09.960
So we are very thankful to the colleagues who are working on this one.

10:09.960 --> 10:15.080
Now I will give the floor to Jérôme, who will talk about the Website Auditing Tool and how

10:15.080 --> 10:16.080
it works.

10:16.080 --> 10:18.520
Thank you very much.

10:18.520 --> 10:24.200
So let's talk about the technical side of this project.

10:24.200 --> 10:31.360
So there are many requirements in it, so this slide will speak to the developers in this room.

10:31.360 --> 10:39.800
So one of the requirements was to have a browser that acts the same way as the majority

10:39.840 --> 10:41.240
of browsers.

10:41.240 --> 10:48.840
So we decided to use Electron, because it has Chromium inside, and Chromium is now used

10:48.840 --> 10:53.040
in the majority of browsers on the market.

10:53.040 --> 10:59.280
And the interesting thing with Electron is that you have Node.js, which is able to analyze

10:59.280 --> 11:08.440
everything that is happening inside the browser and also going out of and into the browser.

11:08.440 --> 11:16.440
So we do our analysis and then we display this analysis in a nice Angular interface.

11:16.440 --> 11:23.240
So if you want to have a look at what the tool looks like, you can go to this URL and

11:23.240 --> 11:30.440
you will have just a subset of the interface, a subset that doesn't include the browser itself.

11:30.440 --> 11:37.280
So to summarize, the tool is open source, and it's only based on widely used frameworks.

11:37.280 --> 11:44.960
It's cross-platform, you can use it on Linux, Windows, macOS, and it's all written in TypeScript.

11:44.960 --> 11:51.840
So here is what the tool looks like. I designed the tool, I developed it, and when I

11:51.840 --> 11:58.240
started the project, what I wanted to do was to take the user by the hand, because this

11:58.240 --> 12:06.880
tool targets legal and technical auditors, but it can also be used by data controllers and

12:06.880 --> 12:14.320
data processors to inspect their own websites; I mean, it can be used by anybody, by everybody.

12:14.320 --> 12:16.040
So that's it.

12:16.040 --> 12:21.040
So when you start, you have extensive documentation about how the tool works, just

12:21.040 --> 12:27.760
so you are not lost inside the tool, and on the left, you have all the functionalities, which

12:27.760 --> 12:30.600
are categorized into three sections.

12:30.600 --> 12:37.080
So what the tool is mainly doing is producing analyses, so it will store analyses inside

12:37.080 --> 12:38.080
the tool.

12:38.080 --> 12:44.320
The first section is to manage your analyses, and we will go in depth into it.

12:44.320 --> 12:50.600
When you are doing an analysis, then you will see a session inside the running

12:50.600 --> 12:56.080
sessions category, and then you have editors that will help you to create

12:56.080 --> 13:01.600
your own knowledge base and to edit some templates to generate pretty-printed reports, and the

13:01.600 --> 13:05.640
other section is for settings and documentation.

13:05.640 --> 13:10.640
So if you click on browse, then you will get a normal browser, I mean you will have all

13:10.640 --> 13:20.080
the tools, all the buttons from Chrome, but the specificity of this tool is that

13:20.080 --> 13:25.040
it's provided with a set of many analyses, like this.

13:25.040 --> 13:31.520
So the idea is to have something very modular, so you can easily add

13:31.520 --> 13:38.400
some analyses. So for instance, I click on the first one, which is

13:38.400 --> 13:44.800
about cookies. Maybe you will be disappointed because I didn't do it with, like, bad guys

13:44.800 --> 13:52.280
and only go on the European Data Protection Board website. I click on Consent, and

13:52.280 --> 13:58.640
on the right, you will see that there are three cookies which are used by this website. When

13:58.640 --> 14:05.960
you click on Consent, you have one technical cookie and two which are used for analytics

14:05.960 --> 14:11.560
purposes, and on each of these lines you will get more in-depth information when you click

14:11.560 --> 14:12.560
on it.

14:12.560 --> 14:19.040
So the goal of this panel is to dive a little bit into each piece of

14:19.040 --> 14:20.040
this information.

14:20.040 --> 14:24.920
So there are several tabs, you see, and on the first one you get some details

14:24.920 --> 14:30.320
about, for instance, the cookie: the name, the value, and so on, so this one is not

14:30.320 --> 14:38.160
used for tracking. When you click on log, then you will get more in-depth

14:38.160 --> 14:43.440
information to do your investigation, because something which is hard when you are trying

14:43.440 --> 14:52.040
to classify the purpose of a cookie is to really know who is setting it and for what.

14:52.040 --> 14:56.960
So if you have, for instance, a JavaScript cookie, then you will get the full call stack of

14:56.960 --> 15:03.240
every script, so everyone who got involved in the writing process inside the browser,

15:03.240 --> 15:08.120
and if it's an HTTP request cookie, sorry, I mean one set by a Set-Cookie header,

15:08.120 --> 15:14.000
then you will see the first request which stored this cookie.

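NOTE
A minimal sketch of how an HTTP cookie write could be logged together with the request that set it, as described above. It is not the tool's implementation.
The CookieWrite record and parseSetCookie function are hypothetical names; the parsing is simplified and treats a pair without "=" as a nameless cookie, which is an assumption.
```typescript
// Hypothetical record type for illustration; not the tool's real log format.
interface CookieWrite {
  name: string;
  value: string;
  attributes: Record<string, string>;
  origin: "http" | "javascript";
  source: string; // request URL for HTTP cookies, script URL for JavaScript cookies
}
// Parses one Set-Cookie header value, e.g. "id=abc; Domain=example.org; Secure".
function parseSetCookie(header: string, requestUrl: string): CookieWrite {
  const [pair, ...rest] = header.split(";").map((part) => part.trim());
  const eq = pair.indexOf("=");
  const attributes: Record<string, string> = {};
  for (const attr of rest) {
    if (attr === "") continue;
    const i = attr.indexOf("=");
    // Attribute names are case-insensitive, so they are lowercased here.
    const key = (i === -1 ? attr : attr.slice(0, i)).toLowerCase();
    attributes[key] = i === -1 ? "" : attr.slice(i + 1);
  }
  return {
    name: eq === -1 ? "" : pair.slice(0, eq),
    value: eq === -1 ? pair : pair.slice(eq + 1),
    attributes,
    origin: "http",
    source: requestUrl,
  };
}
const write = parseSetCookie("id=abc; Domain=example.org; Path=/; Secure", "https://example.org/login");
console.log(write.name, write.attributes["domain"], write.origin, write.source);
```
For a JavaScript cookie, the same record could carry the script URL from the call stack in source, with origin set to "javascript".
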
15:14.000 --> 15:20.280
And the last tab is maybe the most important one, it's the dynamic matching functionality

15:20.280 --> 15:21.600
for cookies.

15:21.600 --> 15:29.040
So the idea is that the tool, when you get a cookie, is always trying to find matching

15:29.040 --> 15:36.520
characteristics in the database to see for which purpose this cookie is usually stored.

15:36.520 --> 15:40.480
You see, so you have several ways to do the matching.

15:40.480 --> 15:45.360
If you have an exact match, it means that the domain which usually stores this cookie with

15:45.360 --> 15:51.440
this name has this specific purpose, I mean according to the knowledge base.

15:51.440 --> 15:58.000
If you have a domain match, it means that usually this domain is storing cookies for this purpose,

15:58.040 --> 16:06.240
so this purpose could be the objective of this cookie, and then the final one that you

16:06.240 --> 16:12.440
can see is the name match, because usually you have a lot of first-party cookies,

16:12.440 --> 16:19.760
for instance analytics cookies, and so it will try to identify the name of the cookie

16:19.760 --> 16:22.720
with regard to some SDK, you know.

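NOTE
A minimal sketch of the three matching levels described above (exact, domain, name) against a small knowledge base. It is not the tool's actual algorithm.
The KnownCookie schema, the matchCookie function and the sample entries are hypothetical, and plain string equality is assumed where a real tool may use patterns.
```typescript
// Hypothetical knowledge-base entry; the tool's real schema may differ.
interface KnownCookie {
  domain: string;
  name: string;
  purpose: string;
}
type MatchKind = "exact" | "domain" | "name";
interface Match {
  kind: MatchKind;
  purpose: string;
}
// Collects candidate purposes, strongest evidence first: exact, then domain, then name.
function matchCookie(domain: string, name: string, knowledgeBase: KnownCookie[]): Match[] {
  const matches: Match[] = [];
  const kinds: MatchKind[] = ["exact", "domain", "name"];
  for (const kind of kinds) {
    for (const entry of knowledgeBase) {
      const sameDomain = entry.domain === domain;
      const sameName = entry.name === name;
      const hit =
        (kind === "exact" && sameDomain && sameName) ||
        (kind === "domain" && sameDomain && !sameName) ||
        (kind === "name" && sameName && !sameDomain);
      if (hit) matches.push({ kind, purpose: entry.purpose });
    }
  }
  return matches;
}
const kb: KnownCookie[] = [
  { domain: "tracker.example", name: "uid", purpose: "advertising" },
  { domain: "tracker.example", name: "lang", purpose: "technical" },
  { domain: "analytics.example", name: "_stats", purpose: "analytics" },
];
// A first-party cookie whose name is known from an analytics SDK yields a name match.
const result = matchCookie("shop.example", "_stats", kb);
console.log(result);
```
A domain or name match is only a hint about the purpose, so the auditor still has to confirm it.
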
16:22.720 --> 16:29.920
So that's it. When you click on browse, it's just to have a preview

16:29.920 --> 16:35.400
of a website. Then if you want to store your analysis in the tool, you have to create

16:35.400 --> 16:41.520
a new analysis, that's why you have this button, and all these analyses will be attached

16:41.520 --> 16:47.920
to scenarios. You know that you have to interact with the website itself to see

16:47.920 --> 16:53.760
its behavior when you consent, when you reject cookies, or when you're just visiting it

16:53.760 --> 17:02.600
without interacting with the cookie banner, so all of these are scenarios attached to an analysis.

17:02.600 --> 17:09.520
So when you store an analysis, then you will be able to mark every piece of information to assess

17:09.520 --> 17:15.120
whether it is compliant or not with the current regulation, so that's why you have

17:15.200 --> 17:22.320
all these ticks and this button, and then you can explain why, I mean, you can explain

17:22.320 --> 17:23.920
your evaluation.

17:23.920 --> 17:28.920
So that's it. Once you have finished an analysis, then you can share your analysis with

17:28.920 --> 17:36.720
others, and you can export it as a pretty-printed report using some templates, and you can

17:36.720 --> 17:47.760
export it in many formats like PDF, DOCX, etc. So you remember that one of the requirements

17:47.760 --> 17:55.040
of the mission was to foster the reuse of knowledge regarding cookies, so there is

17:55.040 --> 18:04.400
a full editing tool to edit your own database. So when you install this tool, this is maybe

18:04.480 --> 18:09.840
the most complex thing when you are not in a data protection authority: you don't have

18:09.840 --> 18:16.800
any knowledge base, mostly because most of the knowledge bases which are used by data protection

18:16.800 --> 18:23.920
authorities are covered by the secrecy of investigations, you know, but the tool allows you to create

18:23.920 --> 18:31.440
your own database and to reuse another database, and I have made some examples of how to

18:31.520 --> 18:37.520
translate some existing databases to the format of the tool itself. So this is a project

18:37.520 --> 18:44.480
that I've put in one of my own repositories, you will find the URL there, and that makes the

18:44.480 --> 18:52.960
link to my research. I mean, there are a lot of research perspectives opened by this tool.

18:53.520 --> 19:00.720
I mean, one thing that could be done, I mean, by any researcher, if you

19:00.800 --> 19:09.040
have worked on the cookie topic, is this: there are many databases which exist,

19:10.080 --> 19:15.840
but there is no common methodology to build these databases, to find the purposes in these databases.

19:15.840 --> 19:25.680
Some are using machine learning, and others are just reading the cookie notice, you know.

19:25.760 --> 19:30.640
So the idea is, if you have worked on it, then I will be happy to discuss with you, to see

19:30.640 --> 19:36.320
if we can translate it, and if we can set a confidence level on it.

19:38.560 --> 19:46.320
Another research perspective: browsers are now, by default, enabling tracking protection,

19:46.960 --> 19:54.160
so, I mean, there are a lot of research papers that show that there are some alternatives

19:54.240 --> 20:02.400
to cookies that are used by websites to try to track the user without using a cookie itself,

20:02.400 --> 20:08.800
or to try to hide their own cookies by putting them under a first-party domain, so I'm thinking about

20:08.800 --> 20:14.480
CNAME cloaking for instance, just to hide, you know, a URL, or to take some fingerprint of your

20:14.480 --> 20:22.320
terminal. So there are many research papers, there are many scripts, and I would be really interested

20:22.320 --> 20:29.760
to speak with anyone who has worked on this, and to see if we can integrate their algorithms into the tool.

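NOTE
A minimal sketch of one way CNAME cloaking, mentioned above, could be flagged. It is not part of the tool, and the speaker does not describe an algorithm.
It assumes the caller already has the CNAME chain of a first-party subdomain (for example from Node's dns.promises.resolveCname) and a hypothetical list of tracker domains.
```typescript
// Returns true when host equals domain or is a subdomain of it.
function isSameOrSubdomain(host: string, domain: string): boolean {
  const h = host.toLowerCase().replace(/\.$/, "");
  const d = domain.toLowerCase().replace(/\.$/, "");
  return h === d || h.endsWith("." + d);
}
// siteDomain: the registrable domain of the audited site (supplied by the caller).
// cnameChain: the CNAME targets observed when resolving a first-party subdomain.
// trackerDomains: a hypothetical list of domains known to belong to trackers.
function findCloakedTracker(siteDomain: string, cnameChain: string[], trackerDomains: string[]): string | null {
  for (const target of cnameChain) {
    if (isSameOrSubdomain(target, siteDomain)) continue; // still first-party
    const tracker = trackerDomains.find((t) => isSameOrSubdomain(target, t));
    if (tracker !== undefined) return tracker;
  }
  return null;
}
// A subdomain of shop.example that is an alias for a host under tracker.example.
const cloaked = findCloakedTracker("shop.example", ["shop.example.collect.tracker.example."], ["tracker.example"]);
console.log(cloaked);
```
The quality of such a check depends entirely on the tracker list, which is why a shared knowledge base matters here too.
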
20:30.560 --> 20:39.200
And the last one, also regarding browsers, is that they are now trying to provide alternatives to cookies,

20:39.200 --> 20:45.440
like the privacy initiatives: Privacy Sandbox for Google, Privacy-Preserving

20:45.440 --> 20:55.680
Attribution for Firefox. So all these initiatives, I mean, most of the DPAs say that

20:55.680 --> 21:05.760
they still require consent, so I will be very curious to see how popular these initiatives are,

21:05.760 --> 21:13.360
and if they are really compliant with all the expectations from the regulation.

21:14.240 --> 21:22.160
and of course, if you have any other research ideas, if you have any initiative on this project,

21:22.160 --> 21:28.560
I will be very happy to discuss with you after this presentation, and you can also find me later

21:28.560 --> 21:34.080
in the audience. I will be there today and also tomorrow, because I'm also speaking

21:34.080 --> 21:41.520
on another track, so contact me. And now for the SPE part, I'll give the floor again to Amandine.

21:42.160 --> 21:50.480
So, for the project, what's next in particular? The first thing is that the EDPB is committing to continue

21:50.480 --> 21:57.760
to maintain it in the future, and even to continue to develop it. For information, we have started a new

21:57.840 --> 22:05.600
SPE project this year to make a server version. We picked him again, hoping he will do a

22:05.600 --> 22:13.440
good job again, we will see. The idea is to make it collaborative, and we hope we will find a way

22:13.440 --> 22:18.400
to store knowledge and audits server-side. And of course, if you have

22:18.400 --> 22:23.280
ideas you want to discuss, we are open to it. Thank you very much.

22:23.280 --> 22:27.280
Thank you.

22:29.280 --> 22:34.720
Are there any questions? One question? I think this one was first.

22:38.160 --> 22:41.840
Hi, I'm Martin. I would like to ask about work on cookie banner

22:41.840 --> 22:46.880
compliance, like automated analysis of all those banners. So is there no previous work,

22:46.880 --> 22:52.560
don't you know of any previous work to analyze those cookie banners? That would be pretty good.

22:53.280 --> 22:58.080
And the next question is, is there any work on the vendors themselves, the so-called vendors,

22:58.080 --> 23:03.360
the companies behind, because if you click on those links and look, it's always bullshit.

23:03.680 --> 23:06.560
So, like, they're pointing at each other and it's a disaster.

23:09.360 --> 23:12.000
Excuse me, I'm sorry, I didn't get the question.

23:14.640 --> 23:20.560
Is it working? Okay, so the so-called vendors, the companies behind this tracking: is there any,

23:20.560 --> 23:26.240
like, database analyzing the purposes? They have their own policies, which are usually, you know,

23:27.200 --> 23:29.440
generic and stuff. Thank you.

23:29.440 --> 23:35.840
It's sort of about confidence in the information, so could we trust this kind of vendor

23:35.840 --> 23:42.320
information, because you know that you have some technical cookies, and also, I mean,

23:42.320 --> 23:51.120
one cookie can have multiple purposes, and all the information about what the

23:51.120 --> 23:57.200
purposes of the cookies are is not always available, you know, so, yeah, that's it,

23:57.200 --> 24:01.760
to answer your question. And at the same time, if you look at the two databases that

24:02.640 --> 24:08.560
Jérôme had on this slide, the first one was the EDPB one, so that's, you know, information that is

24:08.560 --> 24:14.160
shared, on a need-to-know basis, between SAs, but the second one is built upon a

24:14.160 --> 24:19.120
free software project that is completely not vetted by the EDPB. I'm not saying that, you know,

24:19.920 --> 24:25.760
we would do the same things on the legal part, but they are a bit based on that, I mean,

24:25.760 --> 24:29.200
they are reading the notices, and it's how people are contributing to it.

24:31.760 --> 24:34.480
I'm sorry, I think that's all we have time for today, but thank you so much,

24:34.480 --> 24:40.400
thank you for your time. I'll be outside in front. And I do have cookies for those who want some.

