WEBVTT

00:00.000 --> 00:10.960
Okay, hi everyone, I am Pratikshaya and thank you for joining this session especially

00:10.960 --> 00:16.080
my talk and today in my presentation I will be highlighting the open source components

00:16.080 --> 00:20.560
that we have to offer in the Copernicus Data Space ecosystem.

00:20.560 --> 00:25.320
Within the presentation, we will start from the definition of remote sensing and

00:25.400 --> 00:31.080
Earth observation which in a geospatial session might not be that much relevant but still

00:31.080 --> 00:36.360
I thought it would be nice to relate to the Earth observation because prior to this presentation

00:36.360 --> 00:41.000
I have seen most of you talking about different geospatial applications but not exactly

00:41.000 --> 00:45.880
with the focus on Earth observation itself so I thought it would be nice to start with

00:45.880 --> 00:53.160
that and then I will introduce you to the Copernicus Data Space ecosystem with the different components

00:53.240 --> 00:58.920
that we have to offer including data, APIs and applications but with the focus on the open source

00:58.920 --> 01:03.720
tools and then I will conclude by highlighting how we support the community and how

01:03.720 --> 01:11.400
community can support us. Starting with the definition of remote sensing just to connect

01:11.400 --> 01:19.000
among ourselves naturally when we want to observe anything we do that with our eye so remote sensing

01:19.080 --> 01:24.280
is something similar there are sensors placed at a remote location that try to

01:24.280 --> 01:31.320
observe something at a distance that is exactly what remote sensing is however with sensors they

01:31.320 --> 01:36.280
have an added value that they can see beyond what the human eye can so there is a range of

01:36.280 --> 01:42.360
spectrum which they can identify and we cannot so in short this can be defined as the remote sensing

01:42.360 --> 01:48.680
process where a sensor placed at a certain location tries to detect something to understand

01:48.760 --> 01:54.680
its characteristics and behavior and connecting that with Earth observation Earth observation is

01:54.680 --> 02:01.000
simply a remote sensing process where different satellites or aircraft placed at a higher

02:01.000 --> 02:07.480
location are trying to observe our Earth whether it be above below or on the surface of the Earth

02:07.480 --> 02:12.200
and accordingly try to tell us about the physical characteristics or different related

02:12.200 --> 02:18.200
phenomena that are going on at the surface and this kind of information can be very helpful in

02:19.160 --> 02:25.880
spatio-temporal global maps or as someone said earlier maybe some pretty maps or some useful

02:25.880 --> 02:31.880
maps that can be used in making decisions or various other research and planning purpose these can

02:31.880 --> 02:38.600
also be very helpful to monitor different human activities at all scales and if I talk about the

02:38.600 --> 02:44.280
different data sets as well as what can be done with these data sets it can be if you want

02:44.360 --> 02:51.080
to simply monitor the ground or monitor the Earth surface like in this case the global land cover

02:51.080 --> 02:55.640
map that is getting popular in order to address the food security and other reasons so that could

02:55.640 --> 03:01.800
be very helpful with the Earth observation data set or even to monitor the water bodies not just

03:01.800 --> 03:07.560
in Alaska but maybe small rivers or lakes that is also possible with the Earth observation

03:07.880 --> 03:14.440
data set you can also analyze or observe the temporal variation

03:14.440 --> 03:19.640
that is going on throughout the year though we talk about global warming and everything but there

03:19.640 --> 03:23.720
should be a pattern that we want to observe through the satellite data that's possible with this

03:23.720 --> 03:29.320
Earth observation technique there are also different kinds of data that would be radar

03:29.320 --> 03:34.280
or optical data sets based on which you can not only

03:35.240 --> 03:40.440
analyze what is going on at the Earth's surface but also what's going on below for example in case of

03:41.160 --> 03:47.240
maybe an earthquake or landslide they use radar-based images in order to analyze what is happening

03:47.240 --> 03:55.160
and the different related phenomena so those are a few applications of remote sensing

03:55.160 --> 04:02.520
data sets however in the past and not only in the past even now every day terabytes of data are

04:02.520 --> 04:08.600
being recorded and it has always been kind of a complication on how do we store this data or

04:08.600 --> 04:15.320
how to manage this data one task that as a geospatial engineer always came to my table used to

04:15.320 --> 04:21.160
be like okay the data set is available out there but how do I access it and suppose I want to get

04:21.160 --> 04:26.040
the optical data as well as the radar data but they are located at two different

04:26.040 --> 04:31.640
locations so do I have to take everything do I have to take the steps in a different manner

04:31.640 --> 04:37.640
so that has always been a kind of discussion so then came the Copernicus data space

04:37.640 --> 04:43.880
ecosystem where we are trying to centralize the data at one place with a set of different tools

04:43.880 --> 04:49.160
that could be used to directly access the data there process it there and also analyze it

04:49.160 --> 04:55.560
using the different visualization tools that are available that in summary is the Copernicus

04:55.560 --> 05:01.400
data space ecosystem so again if I have to highlight it is a user centric platform which

05:01.400 --> 05:07.320
allows you to access terabytes of earth observation data set it gives you tools to

05:07.880 --> 05:15.720
access and process them and also a platform to visualize them and within CDSE there are different

05:15.720 --> 05:22.760
components starting with the Sentinel data set I assume most of you have heard the terminology

05:23.640 --> 05:29.160
Copernicus so Copernicus provides you with this Sentinel data set and the Copernicus data space

05:29.240 --> 05:35.240
ecosystem will be the authoritative source where you will find the first hand data set and there

05:35.240 --> 05:39.720
are the Copernicus Contributing Missions data sets as well which you will find within this platform

05:40.360 --> 05:45.160
and there will be additional EO data sets that would be available maybe for European scale

05:45.160 --> 05:52.280
or for global scale it will depend for example a recently developed one is the WorldCover map you can

05:52.280 --> 05:59.000
find this data set as well in the same single platform and with this you can access

05:59.400 --> 06:04.840
maybe directly via the website or using different streamlined data access and API tools or even

06:04.840 --> 06:11.480
there are different catalog APIs that are supported within the ecosystem one example could be the STAC

06:11.480 --> 06:18.360
API itself there is the Copernicus Browser which allows you to even visualize like how does this data

06:18.360 --> 06:24.760
look like or how does this data look over time so you can directly go and visualize and this I

06:24.840 --> 06:30.600
think for those who are not yet sure on how to get started with handling this data set

06:30.600 --> 06:36.760
Copernicus browser could be a good point to get yourself started then there are several on

06:36.760 --> 06:42.920
mode code repository that we provide along with a cloud computing capacity there is certain

06:42.920 --> 06:47.880
amount of cloud computing capacity available for everyone who want to use this data set

06:49.000 --> 06:54.600
and there are online code labs and interface and there is also a concept called federation

06:54.600 --> 07:01.000
and user identity service suppose you have a data set which you want to share with a global

07:01.000 --> 07:06.520
audience you can reach out to the Copernicus data space team and accordingly share it with a wide

07:06.520 --> 07:12.920
audience that's where the federation concept comes in like for Belgium there is

07:12.920 --> 07:19.080
an entity called Terrascope which provides you with different data sets more focused on

07:19.080 --> 07:24.200
Belgium so that is one of the federated parts of the Copernicus data space ecosystem

07:24.600 --> 07:28.760
similarly if you have your own data set that you prepare and you want to provide then that could

07:28.760 --> 07:35.800
also be part of it and as a whole it makes an open ecosystem within the ecosystem itself

07:37.000 --> 07:41.800
so those were the different components that we have to offer in the Copernicus

07:41.800 --> 07:47.720
data space ecosystem most of them being open source however there is one very interesting

07:47.800 --> 07:55.400
component which is openEO to just simplify the definition of what exactly is openEO

07:55.400 --> 08:03.080
openEO is an open source API in the form of source code which you can use not only

08:03.080 --> 08:10.200
to access the data but also you can use it directly to perform analyses or do some processing

08:10.200 --> 08:20.200
with this satellite data on the cloud itself and there are a few pillars that are always used

08:20.200 --> 08:25.480
in defining what exactly openEO does so as I said earlier there used to be different

08:25.480 --> 08:30.520
platforms that used to provide you with different satellite data and it used to be always

08:30.520 --> 08:35.960
complicated like how how many accounts do I create or how many platforms should I visit in order

08:35.960 --> 08:41.800
to fetch the data so you know with openEO it becomes easier you can just use your one account

08:41.800 --> 08:47.160
and no matter on which backend the data is saved you can access it additionally I can also say

08:47.160 --> 08:53.960
that if your data is in STAC format and it is hosted somewhere then using openEO you can also

08:53.960 --> 09:00.280
access that data if it is not already the satellite data is not already there in the ecosystem

09:00.280 --> 09:05.720
but there is some raster data that you want to use within your workflow then you can access that

09:05.720 --> 09:12.280
using openEO that would be a simple data access and processing workflow and no matter what kind of

09:12.280 --> 09:16.600
workflow you develop using openEO it will be scalable and efficient for processing

09:17.960 --> 09:24.920
since openEO is developed as open source code and supports the open

09:24.920 --> 09:30.360
community we can also already say that it supports FAIR and open science principles

09:31.000 --> 09:37.000
however one more thing I would like to highlight with the open science and FAIR principles that openEO

09:37.000 --> 09:42.760
supports is that the code you have developed is independent of the underlying technology

09:42.760 --> 09:48.440
whether you develop using R or Python or even JavaScript openEO supports

09:48.440 --> 09:52.600
whichever library you want to use and it is independent of the underlying technologies

09:53.000 --> 10:01.080
and the work that you prepare you can reuse and share it with

10:01.080 --> 10:05.800
a global audience I will later show you how you can do that if you develop it using an openEO

10:05.800 --> 10:10.920
workflow then you will just be provided with a namespace which you can

10:10.920 --> 10:18.520
access with any of the different APIs how an openEO workflow looks is shown here I am trying to connect to

10:18.520 --> 10:25.800
the Copernicus Data Space backend so I just connect using authenticate OIDC and then I

10:25.800 --> 10:31.480
just provide what is the data that I want to access what is the temporal extent what is the spatial

10:31.480 --> 10:37.240
extent and in this case I was trying to get Sentinel-1 which is radar data so I am just trying

10:37.240 --> 10:45.000
to get two polarization bands out of it and in the next step I just showed a very simple

10:45.560 --> 10:51.960
process used here which is already built into openEO min time the minimum over time because we have

10:51.960 --> 10:56.920
asked for data over a temporal range so we just want to reduce it to one time frame
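
NOTE
A minimal sketch of the workflow described here, written out as the JSON-style process graph that openEO clients send to a backend. It is hand-built with plain Python rather than with the client library, and the collection id, band names, extents and output format are assumed values for illustration, not taken from the slide.
```python
import json
# Hand-written openEO process graph: load Sentinel-1, keep the minimum over time, save.
# Collection id, bands, extents and output format are assumptions for illustration.
process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL1_GRD",
            "spatial_extent": {"west": 5.0, "south": 51.2, "east": 5.1, "north": 51.3},
            "temporal_extent": ["2023-05-01", "2023-05-31"],
            "bands": ["VV", "VH"],
        },
    },
    "mintime": {
        "process_id": "reduce_dimension",
        "arguments": {
            "data": {"from_node": "load"},
            "dimension": "t",
            "reducer": {
                "process_graph": {
                    "min": {
                        "process_id": "min",
                        "arguments": {"data": {"from_parameter": "data"}},
                        "result": True,
                    }
                }
            },
        },
    },
    "save": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "mintime"}, "format": "GTiff"},
        "result": True,
    },
}
# Serialise the graph the way it would travel to a backend.
payload = json.dumps({"process": {"process_graph": process_graph}}, indent=2)
print(payload)
```
The three nodes mirror the steps in the talk: load the radar data, reduce the time dimension with a minimum, and save the result.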

10:56.920 --> 11:02.840
that is why I just use one simple process in here however there could be a scenario that okay

11:02.840 --> 11:08.280
there is a very complex workflow and the processes are not already there in openEO

11:08.840 --> 11:14.040
then what you can do is use this concept called user defined function which are basically python scripts

11:15.320 --> 11:20.760
sorry which are basically Python scripts that you can bring to openEO and

11:20.760 --> 11:27.160
make use of within your workflow it does not have to be that all the processes are in there and at

11:27.160 --> 11:34.040
the end you just get the result when you get the result I am sorry

11:34.040 --> 11:39.080
I just missed this part openEO uses the concept called data cube because we have to deal

11:39.080 --> 11:44.200
with a large amount of data over different time frames so it could be multi spectral data as well as

11:44.520 --> 11:49.880
multi temporal data so we use the concept of data cube and the data cube is executed at the

11:49.880 --> 11:56.760
end of the whole workflow you can also do it in the intermediate step however I am showing here in the

11:56.760 --> 12:03.400
example the last step so that is how it would look like and I think I wanted to show it in a
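
NOTE
A toy illustration of the data cube idea mentioned here, assuming a cube with one band, three dates and a 2x2 pixel grid of made-up values. It only shows what reducing the time dimension with a minimum means; a real backend does this on large multi-band rasters.
```python
# Toy data cube: each date maps to a 2x2 grid of made-up pixel values.
cube = {
    "2023-05-01": [[0.30, 0.25], [0.40, 0.10]],
    "2023-05-13": [[0.20, 0.35], [0.45, 0.05]],
    "2023-05-25": [[0.50, 0.15], [0.30, 0.20]],
}
def min_over_time(cube):
    """Collapse the time dimension by keeping the per-pixel minimum."""
    grids = list(cube.values())
    rows, cols = len(grids[0]), len(grids[0][0])
    return [[min(g[r][c] for g in grids) for c in range(cols)] for r in range(rows)]
result = min_over_time(cube)
print(result)  # [[0.2, 0.15], [0.3, 0.05]]
```
After the reduction the time dimension is gone and one grid remains, which is the single time frame the talk refers to.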

12:03.400 --> 12:10.200
demo but before that I mentioned earlier that you can share your code so I just wanted to

12:10.200 --> 12:16.360
highlight it already here we have something called openEO Algorithm Plaza where if you have

12:16.360 --> 12:22.360
your workflow that is developed and it should include an openEO component in it you can already

12:22.360 --> 12:29.000
publicize it here so that everyone or a wide audience can use it one example is

12:30.920 --> 12:38.040
this PyEOGPR which was recently published by one of the users which is

12:38.040 --> 12:43.320
basically a Python machine learning library that tries to predict biophysical

12:43.320 --> 12:49.080
traits using Gaussian process regression so similarly if you have it does not have to be very

12:49.080 --> 12:55.400
complex it can be very simple like you can already see a few examples of band math calculations

12:55.400 --> 13:00.920
that are done here it can be anything from as simple as that to a very complex workflow you can already share it

13:01.080 --> 13:09.560
within here and yeah it will be used by a wide audience that is the idea of shareability

13:09.560 --> 13:14.760
that we support in openEO and I just quickly wanted to show you a demo in the JupyterLab

13:14.760 --> 13:20.200
environment that is offered to everyone in Copernicus data space ecosystem so you can access it

13:20.200 --> 13:26.840
you have three different flavors to choose from when using the JupyterLab environment

13:26.920 --> 13:32.040
and you have R and Python kernels installed so I'm not sure how many here would be interested in

13:32.040 --> 13:37.720
using the R kernel hopefully yes with the Python kernel so I will show an example with Python itself

13:38.920 --> 13:45.960
let me quickly I just wanted to show you how it looks in the workflow

13:46.840 --> 13:50.840
I hope I can

13:54.840 --> 13:56.840
it has not come

14:05.160 --> 14:07.160
already

14:07.160 --> 14:11.320
you open display like in your settings see if I can see the extra

14:12.440 --> 14:14.760
or what I think it's fine

14:16.600 --> 14:20.440
you might have a button you can press like F8 to switch display

14:20.440 --> 14:45.800
okay okay so this is the Jupyter environment if you register yourself in the Copernicus

14:45.800 --> 14:52.360
ecosystem you will get access to this for free and then

14:52.360 --> 14:57.800
you can choose any of the kernels that you want to use there is a separate openEO kernel provided

14:57.800 --> 15:03.960
for you as well so the idea of different kernels with different names is that we have already installed

15:03.960 --> 15:10.040
few libraries that you might need that's the idea and in addition to that there are some sample

15:10.120 --> 15:16.680
examples provided in here so just go to openEO there are a few to get started for example if I go

15:16.680 --> 15:26.280
inside one of these how it looks is basically as I said earlier you just have to authenticate

15:26.280 --> 15:31.960
yourself provide the different parameters that you want to and in this case yeah using the process

15:31.960 --> 15:39.400
that was already built into openEO but there can be cases where I want to do more and openEO

15:39.400 --> 15:45.000
does not have this process so in this case I have shown it in the form of a string which is not an

15:45.000 --> 15:50.120
ideal form I can understand that so you can just write it in your Python file

15:50.840 --> 15:56.440
and what you have to do is just import it from there using the from UDF command of openEO
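
NOTE
A minimal sketch of how a user defined function can travel inside a workflow, assuming the UDF source is kept as a string as in the demo. The UDF body, the rescaling factor and the node id "load" are hypothetical; the string is only syntax-checked here and would run on the backend, not locally.
```python
# Source of a hypothetical UDF kept as a string; it is not executed in this sketch.
udf_code = "\n".join([
    "from openeo.udf import XarrayDataCube",
    "def apply_datacube(cube, context):",
    "    array = cube.get_array()",
    "    return XarrayDataCube(array * 0.0001)  # assumed rescaling factor",
])
# Syntax check only: compile() does not import openeo or run the function.
compile(udf_code, "<udf>", "exec")
# Process graph node that asks the backend to run the UDF via the run_udf process.
apply_udf_node = {
    "process_id": "apply",
    "arguments": {
        "data": {"from_node": "load"},  # "load" is an assumed earlier node id
        "process": {
            "process_graph": {
                "udf": {
                    "process_id": "run_udf",
                    "arguments": {
                        "data": {"from_parameter": "x"},
                        "udf": udf_code,
                        "runtime": "Python",
                    },
                    "result": True,
                }
            }
        },
    },
}
print(apply_udf_node["arguments"]["process"]["process_graph"]["udf"]["process_id"])
```
Keeping the same source in a separate Python file, as suggested in the talk, changes only where the string comes from; the node that reaches the backend has the same shape.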

15:58.600 --> 16:03.000
and you just get the result how it looks at the end is this a very cloudy image

16:03.720 --> 16:08.920
so this is just a sample example of what you can do with Sentinel-2 data just downloading

16:08.920 --> 16:14.680
and visualizing the RGB image but similarly you can do this with the different raster data sets you have

16:14.680 --> 16:20.040
available as I said earlier if you have your very high resolution data available in STAC

16:20.840 --> 16:25.400
available in a STAC compliant format then you can directly fetch it using openEO

16:25.400 --> 16:31.400
and perform the analysis on it it can be as simple as just visualizing it to as complex as maybe

16:32.360 --> 16:37.080
running a machine learning model or using a trained model to get the inference

16:38.200 --> 16:53.000
and in order to save I think I have an example I wanted to show on UDP which is the

16:53.000 --> 16:58.760
user defined process the ones that I showed you earlier were user defined functions which are

16:58.760 --> 17:05.080
functions that you define user defined processes are workflows which you want to save as a process for others

17:05.080 --> 17:11.480
to use or for yourself to use it later so all you have to do is define the input parameters

17:11.480 --> 17:16.280
in which format you want like in this case for temporal I gave this schema okay it should

17:16.280 --> 17:22.920
be a temporal interval an array and for spatial extent a bounding box or maybe simply a spatial

17:23.000 --> 17:29.400
extent which supports both a bounding box as well as a feature collection and then you have to

17:29.400 --> 17:35.640
define the workflow as it was but this time pass in the parameters not the actual values and

17:35.640 --> 17:40.280
at the end you just save it with the name that you want to give it

17:41.320 --> 17:48.680
the ID can be anything you want however we request it to be as unique as possible

17:48.760 --> 17:54.840
however since it will use the user ID that is associated with your account the namespace itself will

17:54.840 --> 18:00.680
be unique so that will help in identifying the process that you have created and you just share

18:00.680 --> 18:07.320
you will get a link that is what you share in the openEO Algorithm Plaza so that was

18:08.600 --> 18:09.800
quick demo as well
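
NOTE
A minimal sketch of a user defined process as described in the demo: the workflow takes parameters instead of actual values. The process id, collection id, bands and parameter names are assumed, and fill_parameters is a hypothetical helper that only imitates what a backend does when the process is called.
```python
# Sketch of a user defined process (UDP); id, collection and parameter names are assumed.
udp = {
    "id": "s2_rgb_example",
    "parameters": [
        {"name": "temporal_extent", "description": "Start and end date.",
         "schema": {"type": "array", "subtype": "temporal-interval"}},
        {"name": "spatial_extent", "description": "Area of interest.",
         "schema": {"type": "object", "subtype": "bounding-box"}},
    ],
    "process_graph": {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": "SENTINEL2_L2A",
                "temporal_extent": {"from_parameter": "temporal_extent"},
                "spatial_extent": {"from_parameter": "spatial_extent"},
                "bands": ["B04", "B03", "B02"],
            },
            "result": True,
        }
    },
}
def fill_parameters(node, values):
    """Return a copy of node with every from_parameter placeholder replaced by its value."""
    if isinstance(node, dict):
        if set(node) == {"from_parameter"}:
            return values[node["from_parameter"]]
        return {key: fill_parameters(val, values) for key, val in node.items()}
    if isinstance(node, list):
        return [fill_parameters(item, values) for item in node]
    return node
filled = fill_parameters(udp["process_graph"], {
    "temporal_extent": ["2023-06-01", "2023-06-30"],
    "spatial_extent": {"west": 5.0, "south": 51.2, "east": 5.1, "north": 51.3},
})
print(filled["load"]["arguments"]["temporal_extent"])
```
The stored definition keeps its placeholders, so the same process can be reused with other dates and areas by whoever receives the shared link.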

18:10.680 --> 18:25.960
I am back to the action but before that let me do that I wanted to also show you how you can contribute

18:28.520 --> 18:34.600
we have everything provided on GitHub so it is open source code so if you want to maybe

18:34.600 --> 18:39.080
if you already have an idea okay this feature might be a good one to have in openEO as well you can

18:39.160 --> 18:44.360
already come and maybe create issues or pull requests anything like that however there is another

18:44.360 --> 18:51.880
repository called community examples where we try to create examples from maybe basic to very complex ones

18:51.880 --> 18:57.080
for users to have an idea on what they can do with the remote sensing data set and also with

18:57.080 --> 19:03.880
openEO you can take a reference out of this and it could be useful in some of your applications like

19:03.880 --> 19:10.120
in here there is also an example of how to load your STAC data the data that is in STAC you can

19:10.120 --> 19:15.880
directly load in openEO there is an example of how to do that there is how to do large

19:15.880 --> 19:23.080
scale processing to do multi-backend processing and also you can find all the range of different

19:23.080 --> 19:30.040
application examples in here and you can also if you already have your use case you can directly

19:30.120 --> 19:35.400
come and create a pull request if it is valid I think that it will surely be

19:35.400 --> 19:42.520
accepted and merged so I think that was more or less what I had to say

19:49.080 --> 19:52.360
maybe coming back to the presentation

19:53.160 --> 19:57.800
I think my time is also almost over

20:08.440 --> 20:15.000
so yeah that was about community support and when using the Copernicus Data Space if you

20:15.000 --> 20:21.400
run into any issue then please feel free to post the issue in the forum or create a ticket

20:21.400 --> 20:27.560
however it is encouraged mostly to post it in the forum because the issue that you ran into someone

20:27.560 --> 20:35.000
else might also have the same problem so it is always helpful in that scenario and to summarize

20:35.000 --> 20:41.160
again about the Copernicus data space ecosystem it is a free public platform where you can share

20:41.160 --> 20:46.120
your data you can use the data set that's already there use the tools that's provided to

20:46.760 --> 20:52.600
you or you can also share the tools that you have developed that was more or less all thank you for

20:52.600 --> 21:05.480
your attention so yeah I also have my team members from the Copernicus

21:05.480 --> 21:09.640
data space ecosystem here as well as the openEO core team so if you have any in-depth

21:09.640 --> 21:26.840
question as well please feel free to raise it yes please yes everything happens in the cloud

21:32.200 --> 21:33.560
yes yes that's as well

21:40.200 --> 21:46.120
I don't think we ourselves it is not publicly shown for sure the user defined function even the

21:46.120 --> 21:51.640
UDP you're just posting the URL you don't show the whole code and everything is handled as a

21:51.640 --> 21:59.320
process wrapped in a JSON format so yeah your UDP or UDF won't be publicly announced I don't know

21:59.320 --> 22:04.680
from the backend side yeah I am to that you have to tell to be that in existence will be able to

22:04.680 --> 22:10.280
use the spot system that's ready to run in and watch so if you are running the tasks on the

22:10.280 --> 22:15.160
set of that data you're fighting for the global learning of service if you need to be in a

22:15.160 --> 22:26.280
speed up especially during the course and then what's the speed up maybe this was can I mine

22:34.280 --> 22:39.960
yes please yeah for three years ago this data would be offered free to download for almost

22:39.960 --> 22:46.440
everybody on ESA's SciHub platform and also in national archives such as France's PEPS program

22:47.320 --> 22:53.400
both of which were decommissioned rather recently and moving towards this cloud-based platform

22:54.600 --> 22:59.320
on which it is much more difficult to export the data out to process on a local machine you have

22:59.320 --> 23:04.840
you kind of have to use the given platform and given machines given virtual

23:04.920 --> 23:13.640
machines by by the by the service is there any why why is such why can I know the

23:13.640 --> 23:19.800
export of data regarding the question of like yeah there was SciHub and everything that allowed

23:19.800 --> 23:25.080
you to download the data but now why cannot you download it locally I don't yet but not

23:25.080 --> 23:30.840
no longer in high volumes I think it should be possible in high volume as well with regards to

23:30.840 --> 23:36.280
openEO it is more about processing so it is not only about downloading the data but with regards

23:36.280 --> 23:41.800
to downloading high volume that also should be possible unless you go beyond yeah beyond too much

23:42.360 --> 23:46.920
that should not yeah there is certainly a limit because for everyone around the world we have given certain

23:46.920 --> 23:53.080
quotas so that's the thing and SciHub is now the Copernicus Browser that's there so I think the

23:53.080 --> 23:59.160
feature that was there in SciHub is still preserved now that's there and I think it should be

23:59.240 --> 24:07.000
possible but yeah how about the tooling that was also deprecated around the same time the Python interfaces

24:07.000 --> 24:14.120
to SNAP with regards to SNAP I don't know how it would be related with this one yes I'm sorry

24:17.160 --> 24:24.760
another thing are the data cubes based on xarray data cubes they are to some extent

24:25.400 --> 24:31.240
but yeah the data cube that is there is handled in xarray this one is slightly different in the part

24:32.280 --> 24:39.880
there is not only multi-spectral but also temporal and yeah in cluster yeah I think it is divided accordingly

24:39.880 --> 24:41.880
unless I don't want to add something

24:55.320 --> 25:02.600
okay so thank you

25:06.840 --> 25:10.840
thank you

