WEBVTT

00:00.000 --> 00:07.000
Hi, hello everyone.

00:07.000 --> 00:11.000
The next talk is actually from me, about how MistServer

00:11.000 --> 00:15.000
handles SRT connections in independent child processes.

00:15.000 --> 00:19.000
This is actually a bit of a hack that we did in the past few months

00:19.000 --> 00:21.000
to make something that we wanted to work work

00:21.000 --> 00:23.000
and I figured it would be interesting to share

00:23.000 --> 00:29.000
because SRT is a common protocol to share media over.

00:30.000 --> 00:33.000
For reference, I am the lead of the MistServer team.

00:33.000 --> 00:37.000
For context, we first need to go over what MistServer is exactly,

00:37.000 --> 00:39.000
otherwise the rest of this talk won't make sense.

00:39.000 --> 00:41.000
I'll try to keep this short.

00:41.000 --> 00:44.000
MistServer is a media server software package that's been

00:44.000 --> 00:46.000
in development for about 16 years or so.

00:46.000 --> 00:48.000
It's free and open source.

00:48.000 --> 00:50.000
Unlicense. We have only one edition, the free one,

00:50.000 --> 00:53.000
and that's it, and everything is completely open.

00:53.000 --> 00:56.000
We focus on transmuxing and easy integration,

00:56.000 --> 00:59.000
which means it's very developer-centric.

00:59.000 --> 01:03.000
So that means that it is something that takes media feeds.

01:03.000 --> 01:06.000
We focus mostly on transmuxing meaning that we don't do

01:06.000 --> 01:09.000
encoding though we can but we prefer not to.

01:09.000 --> 01:14.000
And we take things in any protocol and send it to any protocol.

01:14.000 --> 01:18.000
With a focus on if you want to put this in some other system,

01:18.000 --> 01:21.000
well, then this is the tool you would use.

01:21.000 --> 01:24.000
A more interesting thing that we need to go a little bit

01:24.000 --> 01:28.000
in depth in is that this uses a separate process for every connection.

01:28.000 --> 01:30.000
Incoming, outgoing, it doesn't matter.

01:30.000 --> 01:33.000
Every connection, a separate process.

01:33.000 --> 01:36.000
This is less efficient, and web servers stopped doing this,

01:36.000 --> 01:39.000
like I think 20, 30, maybe more years ago.

01:39.000 --> 01:42.000
But we're still doing this, because with media,

01:42.000 --> 01:45.000
you don't really get all that many connections on a single machine.

01:45.000 --> 01:48.000
Like most servers have maybe a gigabit of internet.

01:48.000 --> 01:51.000
That means that if you do a decent level of quality,

01:51.000 --> 01:53.000
five hundred users and your connection is full.
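
NOTE
Editor's sketch, not part of the talk: the arithmetic behind the "five hundred users" remark. It assumes the gigabit link is shared evenly and ignores protocol overhead; the constant names are illustrative only.
```python
LINK_BITS_PER_SECOND = 1_000_000_000  # "a gigabit of internet", as quoted in the talk
VIEWERS = 500                         # viewer count quoted in the talk
# Bitrate each viewer can receive before the link is saturated.
per_viewer_bps = LINK_BITS_PER_SECOND // VIEWERS
print(per_viewer_bps / 1_000_000, "Mbit/s per viewer")
```
So the quoted figure corresponds to roughly 2 Mbit/s per viewer, which is why the network fills up long before the CPU does.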

01:53.000 --> 01:56.000
So the CPU is usually kind of bored on the media server.

01:56.000 --> 01:59.000
It's usually the network card that is overworking itself.

01:59.000 --> 02:02.000
The CPU is kind of bored, the RAM is kind of bored.

02:02.000 --> 02:05.000
So we don't actually need to be super efficient with the CPU.

02:05.000 --> 02:08.000
And why would we opt for this?

02:08.000 --> 02:10.000
Because it gets us better isolation between clients.

02:10.000 --> 02:13.000
So each of these inputs, each of these outputs,

02:13.000 --> 02:15.000
they have a separate process running.

02:15.000 --> 02:20.000
So this is not a thread, this is not a sub-task or anything.

02:20.000 --> 02:24.000
This is like an actual separate program running handling that connection.

02:24.000 --> 02:29.000
That gives us very good isolation in the sense that if someone's doing something weird,

02:29.000 --> 02:33.000
say it's an RTP input or something, and they're trying to crash the server,

02:33.000 --> 02:36.000
they can crash their own connection, but they can't crash anyone else's.

02:36.000 --> 02:39.000
So this is a property that we really really like.

02:39.000 --> 02:42.000
But there is a problem with this property.

02:42.000 --> 02:47.000
Because a couple of years ago, back in 2012,

02:47.000 --> 02:51.000
Haivision created a protocol called SRT, Secure Reliable Transport.

02:51.000 --> 02:54.000
Not to be confused with SRT, the subtitle format.

02:54.000 --> 02:57.000
I don't know why they named it that either.

02:57.000 --> 03:00.000
And this basically takes MPEG-TS streams.

03:00.000 --> 03:05.000
And it sends them over a fancy pipe that takes care of problems on the internet.

03:05.000 --> 03:08.000
So retransmissions, packet loss, stuff like that.

03:08.000 --> 03:11.000
It makes sure that you have a stable feed.

03:11.000 --> 03:13.000
It was open sourced in 2017.

03:14.000 --> 03:18.000
And since then, it's really had a bit of an uptick in people using it.

03:18.000 --> 03:25.000
Because the big studios and such, they generally send MPEG-TS over all kinds of connections.

03:25.000 --> 03:29.000
And with SRT, they could start using the internet rather than cables.

03:29.000 --> 03:33.000
And especially the fancy cables that the broadcast industry uses,

03:33.000 --> 03:35.000
they tend to be quite expensive.

03:35.000 --> 03:38.000
The hardware that works with them is also expensive,

03:38.000 --> 03:40.000
hard to get, hard to maintain.

03:40.000 --> 03:44.000
And using the internet is just kind of easier.

03:44.000 --> 03:47.000
So this all sounds great.

03:47.000 --> 03:49.000
And of course, as soon as this was open sourced,

03:49.000 --> 03:51.000
we put it in there and it works.

03:51.000 --> 03:55.000
However, there is a small problem with it.

03:55.000 --> 03:57.000
You would say, OK, libsrt.

03:57.000 --> 04:00.000
The reference implementation, made by Haivision themselves.

04:00.000 --> 04:01.000
It's open source.

04:01.000 --> 04:02.000
It's in C++.

04:02.000 --> 04:05.000
Mist is open source, libsrt is open source.

04:05.000 --> 04:07.000
You can just use that, right?

04:07.000 --> 04:09.000
Well, technically we can.

04:09.000 --> 04:12.000
There are two other options as well.

04:12.000 --> 04:15.000
Upipe has their own implementation, called Upipe SRT,

04:15.000 --> 04:17.000
which is very closely tied to Upipe.

04:17.000 --> 04:20.000
And it's not a full implementation, so we can't use that one.

04:20.000 --> 04:22.000
And there's also GoSRT.

04:22.000 --> 04:24.000
But it seems to have been written in a bit of a rush.

04:24.000 --> 04:26.000
I didn't actually personally test it,

04:26.000 --> 04:29.000
but it's also written in Go and also a partial implementation.

04:29.000 --> 04:31.000
So we couldn't really use that with C++ without

04:31.000 --> 04:32.000
going through a whole bunch of hoops.

04:32.000 --> 04:37.000
That left libsrt as the only viable option for using SRT.

04:37.000 --> 04:41.000
But there's a minor inconvenience with libsrt,

04:41.000 --> 04:46.000
because libsrt takes ownership of the underlying UDP sockets.

04:46.000 --> 04:50.000
And this means that we tell libsrt, OK,

04:46.000 --> 04:50.000
go ahead, listen on this port, start taking connections,

04:50.000 --> 04:53.000
and then anything that comes into that port,

04:53.000 --> 04:56.000
it goes into that particular library instance that's running,

04:58.000 --> 05:01.000
and we can no longer do our trick where we have a separate process.

05:01.000 --> 05:06.000
Like we're stuck using one process for everything going in or out or anywhere.

05:06.000 --> 05:10.000
And that's kind of annoying, that means we lose our fancy advantage.

05:10.000 --> 05:14.000
So what can we do to fix this?

05:14.000 --> 05:17.000
Well, there are a few solutions that we've considered.

05:17.000 --> 05:22.000
One of them would be to add support for forking different binaries to libsrt itself.

05:22.000 --> 05:27.000
We would either have to upstream that or fork the entire library.

05:28.000 --> 05:31.000
We took one look at the library, and we kind of noped out,

05:31.000 --> 05:35.000
because especially when it was just open sourced,

05:35.000 --> 05:39.000
the best word to describe what it looked like at the time was spaghetti.

05:39.000 --> 05:42.000
Since then, it has improved a little.

05:42.000 --> 05:45.000
It's definitely made big strides.

05:45.000 --> 05:48.000
They also did all kinds of fun things like releasing the spec,

05:48.000 --> 05:51.000
which is great, because it was spec by implementation at first,

05:51.000 --> 05:54.000
and now it's spec by spec, and the implementation breaks that spec.

05:54.000 --> 05:57.000
But at least it's there.

05:57.000 --> 06:01.000
And so that was not really an option.

06:01.000 --> 06:04.000
And we also think that upstreaming this kind of thing,

06:04.000 --> 06:08.000
it would be kind of niche, and they probably wouldn't want it in the main library either way.

06:08.000 --> 06:11.000
We did talk to some of the people at Haivision,

06:11.000 --> 06:13.000
and they seemed like they might be interested,

06:13.000 --> 06:16.000
but I think that if we actually pushed through, their interest would drop.

06:16.000 --> 06:19.000
So the other option would be to write our own implementation.

06:19.000 --> 06:23.000
And well, I have talked a little bit with the guys that did their own implementations.

06:23.000 --> 06:26.000
And it seems like the takeaway from them is, please don't,

06:26.000 --> 06:28.000
because it's kind of horrible.

06:28.000 --> 06:32.000
So we decided not to do that either.

06:32.000 --> 06:35.000
So then that means there's a problem, right?

06:35.000 --> 06:37.000
We couldn't do the modification or the forking.

06:37.000 --> 06:40.000
We couldn't do our own implementation.

06:40.000 --> 06:43.000
Back when it was initially open sourced,

06:43.000 --> 06:45.000
we thought, hey, SRT, it might be a fad, right?

06:45.000 --> 06:48.000
Maybe it will go away, maybe the industry won't care,

06:48.000 --> 06:50.000
and we can just silently ignore it.

06:50.000 --> 06:54.000
So we did an initial solution where we did the kind of ugly thing,

06:54.000 --> 06:57.000
and we just used the library, and we just used the one binary,

06:57.000 --> 06:59.000
and all the connections go to the one binary.

06:59.000 --> 07:02.000
And if it crashes for some reason, well, all of them just go bye-bye.

07:02.000 --> 07:04.000
And we didn't like that.

07:04.000 --> 07:08.000
This was the situation where we were like, yeah, it's probably fine, right?

07:08.000 --> 07:12.000
But the unfortunate part is that SRT actually ended up sticking around.

07:12.000 --> 07:17.000
In fact, it got quite popular, and it is now more or less the default protocol

07:17.000 --> 07:24.000
for sending media over the internet, for both guests and industry things.

07:24.000 --> 07:27.000
So yeah, that's kind of a problem then, right?

07:27.000 --> 07:31.000
We have a crappy implementation that we don't like.

07:31.000 --> 07:33.000
Our customers don't like.

07:33.000 --> 07:35.000
But we kind of want to improve it.

07:35.000 --> 07:37.000
So what can we do?

07:37.000 --> 07:39.000
So we started looking around a little bit,

07:39.000 --> 07:41.000
and as I mentioned in a couple of slides back,

07:41.000 --> 07:43.000
there's a spec that's available now.

07:43.000 --> 07:45.000
Here we have some nice screenshots from the spec.

07:45.000 --> 07:47.000
There we have the header.

07:47.000 --> 07:50.000
And here we have, specifically in a handshake packet,

07:50.000 --> 07:53.000
what is in the rest of the packet.

07:53.000 --> 07:57.000
And I highlighted a couple of things here.

07:57.000 --> 08:00.000
Initially in the header, there's the first four bytes.

08:00.000 --> 08:03.000
Those mention what type of packet it is.

08:03.000 --> 08:06.000
And if the control type is zero, and the subtype is zero,

08:06.000 --> 08:08.000
that means it's a handshake packet.

08:08.000 --> 08:12.000
And that's convenient, because that means that it's just a one followed by all zeros.

08:13.000 --> 08:14.000
That's pretty easy to recognize.

08:14.000 --> 08:16.000
We don't really care about any of the rest,

08:16.000 --> 08:18.000
because we're not trying to implement a protocol.

08:18.000 --> 08:21.000
We're trying to see if there's a new connection coming in, right?

08:21.000 --> 08:24.000
Then if you have the handshake packet, you can see in the type,

08:24.000 --> 08:28.000
okay, what kind of handshake is it?

08:28.000 --> 08:30.000
Is it a rendezvous handshake,

08:30.000 --> 08:33.000
where we have two end points connecting to each other?

08:33.000 --> 08:35.000
Is it a regular one, where someone's connecting in?

08:35.000 --> 08:38.000
And that brings us to an implementation that looks like this.

08:38.000 --> 08:41.000
Check if those four bytes are that, and it's a handshake.

08:42.000 --> 08:45.000
Check if bytes 36 to 39 are all zero,

08:45.000 --> 08:47.000
then it's a rendezvous packet.

08:47.000 --> 08:49.000
That's nice and easy, isn't it?

08:49.000 --> 08:51.000
We just check four bytes, check four more bytes.
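
NOTE
Editor's sketch, not MistServer's actual code: the two four-byte checks described above, written in Python. The byte positions (0 to 3 and 36 to 39) come from the talk; the names classify and make_packet, and the filler field values in make_packet, are hypothetical and exist only so the example can be run.
```python
import struct
from typing import Optional
#
# Control bit set, control type 0, subtype 0: a one followed by 31 zero bits.
HANDSHAKE_HEADER = bytes([0x80, 0x00, 0x00, 0x00])
TYPE_START, TYPE_END = 36, 40  # handshake type field: bytes 36 to 39
#
def classify(packet: bytes) -> Optional[str]:
    """Return 'rendezvous', 'handshake', or None for any other packet."""
    if len(packet) >= TYPE_END and packet[:4] == HANDSHAKE_HEADER:
        if packet[TYPE_START:TYPE_END] == bytes(4):
            return "rendezvous"
        return "handshake"
    return None
#
def make_packet(handshake_type: int) -> bytes:
    """Build a dummy handshake-shaped packet; every field except the type is filler."""
    header = struct.pack(">IIII", 0x80000000, 0, 0, 0)                     # bytes 0 to 15
    body = struct.pack(">IIIII", 5, 0, 0, 1500, 8192)                      # bytes 16 to 35
    return header + body + struct.pack(">I", handshake_type) + bytes(24)  # type at 36 to 39
#
if __name__ == "__main__":
    print(classify(make_packet(0)), classify(make_packet(1)), classify(bytes(64)))
```
Nothing else in the packet is parsed; the point is only to notice that a new connection is arriving, not to implement the protocol.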

08:51.000 --> 08:54.000
So that means that we can do a nice little trick,

08:54.000 --> 08:56.000
and we came up with this trick.

08:56.000 --> 08:59.000
Before I go into what the exact trick is,

08:59.000 --> 09:01.000
I need to go over UDP sockets real quick,

09:01.000 --> 09:04.000
because not everyone might know the full details of those.

09:04.000 --> 09:08.000
So I assume most people here do know that UDP is a connectionless protocol.

09:08.000 --> 09:12.000
So if you have a UDP socket, it's not connected really.

09:12.000 --> 09:15.000
It just sends packets, and it might receive packets,

09:15.000 --> 09:17.000
depending on if it was bound to a port,

09:17.000 --> 09:18.000
but that's pretty much it.

09:18.000 --> 09:21.000
However, you can connect UDP sockets.

09:21.000 --> 09:23.000
So let's say that we have a socket here,

09:23.000 --> 09:25.000
listening on a generic port, you know,

09:25.000 --> 09:28.000
the catch-all interface on a particular port,

09:28.000 --> 09:30.000
and people are sending data to it.

09:30.000 --> 09:33.000
Those will come into that same socket on the system.

09:33.000 --> 09:36.000
But if you now open a second socket,

09:36.000 --> 09:38.000
connect that to one of those endpoints by saying,

09:38.000 --> 09:41.000
okay, we're listening on the particular interface

09:41.000 --> 09:42.000
that's receiving those packets,

09:42.000 --> 09:44.000
and we're going to be sending to the interface,

09:44.000 --> 09:47.000
and the port that is sending them to us,

09:47.000 --> 09:49.000
then this happens.

09:49.000 --> 09:52.000
The data from that particular client will go to that socket,

09:52.000 --> 09:55.000
while everything else will go to the generic one.
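
NOTE
Editor's sketch, not part of the talk: a loopback demonstration of the connected-socket behaviour just described. It assumes Linux semantics, where two UDP sockets may share a local address and port if both set SO_REUSEADDR and the kernel prefers the socket connected to the sender; other systems may need different options. The function name demo is hypothetical.
```python
import socket
#
def demo():
    """Route two loopback clients and return (connected_got, generic_got)."""
    generic = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    generic.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    generic.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    generic.settimeout(2.0)
    port = generic.getsockname()[1]
    client_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The first packet from A arrives on the generic socket and reveals A's address.
    client_a.sendto(b"hello from A", ("127.0.0.1", port))
    _, addr_a = generic.recvfrom(2048)
    # Second socket on the same local port, connected to A only.
    connected = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    connected.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    connected.bind(("127.0.0.1", port))
    connected.connect(addr_a)
    connected.settimeout(2.0)
    # From now on A's packets go to the connected socket, B's to the generic one.
    client_a.sendto(b"A again", ("127.0.0.1", port))
    client_b.sendto(b"hello from B", ("127.0.0.1", port))
    connected_got = connected.recv(2048)
    generic_got, _ = generic.recvfrom(2048)
    for sock in (generic, connected, client_a, client_b):
        sock.close()
    return connected_got, generic_got
#
if __name__ == "__main__":
    print(demo())
```
In the setup the talk describes, the connected socket would be handed to a freshly spawned process, so that process only ever sees that one client.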

09:55.000 --> 09:57.000
So we can do a neat little trick,

09:57.000 --> 09:59.000
where we listen on a generic socket.

09:59.000 --> 10:01.000
If we see a handshake, we spawn a new process,

10:01.000 --> 10:03.000
we create a connected socket,

10:04.000 --> 10:06.000
and I think that's actually on the next slide.

10:06.000 --> 10:07.000
Yes.

10:07.000 --> 10:09.000
So we listen on the main port ourselves.

10:09.000 --> 10:12.000
We intercept the SRT handshake.

10:12.000 --> 10:15.000
We spawn a new process if we see a handshake,

10:15.000 --> 10:17.000
we initialize the libsrt library at that point,

10:17.000 --> 10:19.000
after we've already received a handshake,

10:19.000 --> 10:22.000
and we give it the connected UDP socket

10:22.000 --> 10:24.000
and say, here listen on this one,

10:24.000 --> 10:26.000
and it does. It's going to start up,

10:26.000 --> 10:28.000
and guess what, it's going to see a handshake packet,

10:28.000 --> 10:30.000
because those handshake packets get repeated a whole bunch of times.

10:30.000 --> 10:32.000
So, you know, we purposely drop the first one,

10:32.000 --> 10:34.000
but it's going to see a handshake packet.

10:34.000 --> 10:36.000
And it's going to be like, oh hey, a handshake packet,

10:36.000 --> 10:38.000
and it will accept that one connection,

10:38.000 --> 10:40.000
and it will never see another connection in its life

10:40.000 --> 10:41.000
because that's a connected socket.

10:41.000 --> 10:44.000
So it will only see that one user.

10:44.000 --> 10:47.000
And the libsrt documentation actually explicitly states

10:47.000 --> 10:49.000
that you should not and cannot do this.

10:49.000 --> 10:51.000
But if you do it, totally works.

10:51.000 --> 10:54.000
And I checked with the original author of SRT,

10:54.000 --> 10:56.000
and he said it should be fine.

10:56.000 --> 10:59.000
So that kind of works.

10:59.000 --> 11:01.000
And then we get our nice little trick,

11:01.000 --> 11:03.000
where we can use a standard,

11:03.000 --> 11:05.000
unmodified version of libsrt,

11:05.000 --> 11:07.000
and we get multi-process isolation

11:07.000 --> 11:09.000
and kind of the best of both worlds.

11:09.000 --> 11:12.000
We can listen on a generic port,

11:12.000 --> 11:14.000
spawn multiple processes all we want,

11:14.000 --> 11:17.000
and none of them crashing will hurt the others,

11:17.000 --> 11:19.000
and we get normal SRT connections

11:19.000 --> 11:22.000
without having to rewrite everything from scratch.

11:22.000 --> 11:26.000
And that's the magical little trick that I wanted to show here.

11:27.000 --> 11:29.000
Which brings me to the end of the presentation.

11:29.000 --> 11:31.000
I think I went through it quite fast.

11:31.000 --> 11:33.000
Yeah, not too bad.

11:33.000 --> 11:36.000
So now we have time for questions.

11:36.000 --> 11:38.000
Are there any?

11:38.000 --> 11:41.000
Thank you.

11:44.000 --> 11:49.000
So are you just dropping the first handshake packet?

11:49.000 --> 11:54.000
So you're receiving that and the client sends another one?

11:54.000 --> 11:56.000
I'll repeat the question real quick,

11:56.000 --> 11:58.000
so that the people watching remotely know what you asked.

11:58.000 --> 12:01.000
The question was, do we drop the first handshake packet

12:01.000 --> 12:03.000
and expect the other side to send another one?

12:03.000 --> 12:05.000
Yes, absolutely.

12:05.000 --> 12:07.000
We actually looked, and with SRT,

12:07.000 --> 12:12.000
it sends approximately 50 or 60 handshake packets in the first second.

12:12.000 --> 12:13.000
So dropping the first one,

12:13.000 --> 12:15.000
it doesn't really hurt the connection.

12:15.000 --> 12:18.000
We could do a fancy thing where we tell the library

12:18.000 --> 12:20.000
that we saw the packet or we fake one,

12:20.000 --> 12:22.000
but we found it was not necessary and just made it

12:22.000 --> 12:24.000
unnecessarily complex.

12:24.000 --> 12:28.000
So we kind of stuck with that implementation.

12:28.000 --> 12:29.000
Anyone else?

12:29.000 --> 12:30.000
Yeah, go ahead.

12:38.000 --> 12:41.000
So the question is, does this work with multiplexing,

12:41.000 --> 12:43.000
where you have multiple clients sending on the same port

12:43.000 --> 12:46.000
and all going to a different stream ID?

12:46.000 --> 12:48.000
Yes, that is the whole reason we did this.

12:48.000 --> 12:50.000
So that we could have one port and we could send

12:51.000 --> 12:54.000
dozens of different streams in or out over the same port

12:54.000 --> 12:56.000
and they would not bother each other.

12:56.000 --> 12:58.000
That's the whole idea behind it.

12:58.000 --> 13:00.000
Yep, absolutely.

13:02.000 --> 13:04.000
Anyone else?

13:04.000 --> 13:05.000
Yeah, go ahead.

13:05.000 --> 13:09.000
So does this mean that

13:09.000 --> 13:13.000
the server is always

13:14.000 --> 13:18.000
listening for

13:23.000 --> 13:25.000
uh, for

13:25.000 --> 13:28.000
the server always tries to look for

13:28.000 --> 13:31.000
SRT

13:31.000 --> 13:34.000
connections,

13:34.000 --> 13:38.000
even when you're not

13:39.000 --> 13:41.000
using SRT?

13:41.000 --> 13:43.000
So good question.

13:43.000 --> 13:44.000
The question is, is the server always

13:44.000 --> 13:45.000
listening for

13:45.000 --> 13:47.000
SRT connections even if you're not using

13:47.000 --> 13:48.000
SRT?

13:48.000 --> 13:49.000
It depends.

13:49.000 --> 13:50.000
You can configure

13:50.000 --> 13:52.000
Mist to listen on one or more

13:52.000 --> 13:53.000
SRT ports, and you can

13:53.000 --> 13:54.000
discard them.

13:54.000 --> 13:55.000
And if you do that,

13:55.000 --> 13:56.000
it will listen.

13:56.000 --> 13:57.000
And if you don't

13:57.000 --> 13:58.000
configure that, it will not.

13:58.000 --> 14:00.000
Simple as that.

14:00.000 --> 14:02.000
Anyone else?

14:02.000 --> 14:03.000
Yeah?

14:03.000 --> 14:04.000
Okay.

14:04.000 --> 14:05.000
Well, then I guess that was it.

14:06.000 --> 14:08.000
If you want to find out more about

14:08.000 --> 14:09.000
MistServer,

14:09.000 --> 14:10.000
visit us at

14:10.000 --> 14:11.000
mistserver.org or

14:11.000 --> 14:12.000
skim.gov.

14:12.000 --> 14:13.000
You can also send me

14:13.000 --> 14:14.000
personally an email or

14:14.000 --> 14:15.000
info at MistServer

14:15.000 --> 14:16.000
if you want the team.

14:16.000 --> 14:17.000
We hope to hear from you

14:17.000 --> 14:18.000
if you're interested.

14:18.000 --> 14:19.000
All right.

14:19.000 --> 14:21.000
Thanks for your attention.

