For anyone curious, there is a great article [1]. WebRTC enables peer to peer co...

k__ · on May 1, 2020

You need a signaling channel to your peer. This doesn't have to be a server, but it makes things simpler.

jhardy54 · on May 1, 2020

My preference is IPoAC.

RcrdBrt · on May 1, 2020

I think that adds some latency.

sktrdie · on May 1, 2020

But you can use publicly available servers to do that, no? STUN servers and such. So you don't need to roll your own.

SahAssar · on May 1, 2020

No, signaling is different from STUN. Signaling is basically pairing together the people who want to communicate so that they can get the info required to connect to each other. STUN is how they find out their public IP:port pairs, and TURN is how they can talk over a proxy if direct communication fails.

So you always need some form of signaling but that can be over email or even a handwritten note if you prefer, although it is usually done over HTTP/websockets.

STUN is required if you are behind some sort of NAT.

TURN is required if your NAT does not play well with hole-punching.
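To make the split concrete, here is a minimal sketch of the STUN/TURN part of a client's configuration. The `IceServer` interface is a local stand-in for the browser's `RTCIceServer` dictionary, and the hosts and credentials are hypothetical placeholders, not real services. Note that nothing in it mentions signaling, which has to be handled separately.

```typescript
// Local stand-in for the browser's RTCIceServer dictionary, so the sketch
// type-checks outside a browser.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

// Hypothetical hosts; replace them with servers you actually have access to.
const STUN_URL = "stun:stun.example.org:3478";
const TURN_URL = "turn:turn.example.org:3478";

// STUN is always listed (public address discovery). TURN is only added when
// relay credentials are available, as the fallback for NATs that defeat
// hole-punching.
function buildIceServers(turn?: { username: string; credential: string }): IceServer[] {
  const servers: IceServer[] = [{ urls: [STUN_URL] }];
  if (turn) {
    servers.push({
      urls: [TURN_URL],
      username: turn.username,
      credential: turn.credential,
    });
  }
  return servers;
}
```

In a browser the result would be passed as `new RTCPeerConnection({ iceServers: buildIceServers() })`.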

sktrdie · on May 1, 2020

Ah indeed. But can't the handshaking be done without rolling your own signaling server?

Here's a great example: https://jameshfisher.github.io/serverless-webrtc/index.html

Check out the process in the console! For instance, you can do the handshake the same way you'd send someone the URL of the actual thing.

SahAssar · on May 1, 2020

As you can see if you run that, it is a request-response-like flow. So sure, you can send the initial offer in the URL, but then you somehow need to get the answer back to the initiator.

So while you can have an initiator URL sent to the responder and then have the responder send a URL back to the initiator, that is still not "click link and you are connected".

Handling signaling is pretty much the easiest bit of WebRTC, as it is basically just an HTTP/websocket echo server with some ID or similar for the meeting.

If you have a websocket server (or REST & SSE as I usually do) you can just have meet.example/{meetingId} and echo everything on meetingId to all others on the same meetingId. That is as simple as the web chat examples that thousands of beginner programmers create in their first year.
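A rough sketch of that echo logic, with the transport left out. `SignalingRelay`, `Send`, `join` and `relay` are names made up for this illustration; in a real server each `Send` callback would wrap a websocket or SSE connection.

```typescript
// A client is represented only by the callback used to push text to it.
type Send = (message: string) => void;

class SignalingRelay {
  private rooms = new Map<string, Set<Send>>();

  // Register a client under a meeting ID; returns a function that removes it.
  join(meetingId: string, send: Send): () => void {
    let room = this.rooms.get(meetingId);
    if (!room) {
      room = new Set<Send>();
      this.rooms.set(meetingId, room);
    }
    room.add(send);
    const members = room;
    return () => {
      members.delete(send);
      if (members.size === 0) this.rooms.delete(meetingId);
    };
  }

  // Echo a message to every other client on the same meeting ID.
  relay(meetingId: string, from: Send, message: string): void {
    const room = this.rooms.get(meetingId);
    if (!room) return;
    room.forEach((send) => {
      if (send !== from) send(message);
    });
  }
}
```

The relay never inspects the payload, so offers, answers and ICE candidates can all go through it as opaque strings.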

You should also consider that the signaling info (called the SDP) does not have a set lifetime: in some cases it can be valid indefinitely, and in others for less than a minute. So if you encode the SDP in the URL you:

1. Can't set up a meeting URL beforehand, which is how most people want them to work.

2. Need a back-and-forth over some other medium like email/chat/pigeon.
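For reference, putting an SDP into a URL fragment and reading it back could look roughly like this. `SessionDescription` mirrors the shape of the browser's session description object, while `toInviteUrl` and `fromInviteUrl` are hypothetical helper names. The sketch only covers the encoding; it does nothing about the lifetime or back-and-forth problems listed above.

```typescript
// Same shape as the offer/answer objects WebRTC produces in the browser.
interface SessionDescription {
  type: "offer" | "answer";
  sdp: string;
}

// Append the description to a base URL as a percent-encoded JSON fragment.
function toInviteUrl(base: string, description: SessionDescription): string {
  return `${base}#${encodeURIComponent(JSON.stringify(description))}`;
}

// Recover the description from such a URL, or null if the fragment is
// missing or does not decode to a valid offer/answer.
function fromInviteUrl(url: string): SessionDescription | null {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return null;
  try {
    const parsed = JSON.parse(decodeURIComponent(url.slice(hashIndex + 1)));
    const typeOk = parsed.type === "offer" || parsed.type === "answer";
    if (typeOk && typeof parsed.sdp === "string") {
      return { type: parsed.type, sdp: parsed.sdp };
    }
  } catch {
    // Malformed fragment: fall through and report it as unusable.
  }
  return null;
}
```

Real SDP blobs are long, so the resulting links get unwieldy quickly.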

sktrdie · on May 1, 2020

But it's a very similar flow as you'd normally do with a URL no?

- You'd send someone a link such as http://chat.com/#id

- Other person opens http://chat.com/#id, which redirects to another URL, and this other URL (http://chat.com/#someOtherId) is sent back to the creator

- Creator clicks on the link again

So it's just adding an extra step where the owner also needs to click on a new URL. But I agree: since the signalling server is rather dumb, this can probably also be outsourced to a "public signalling server"?

SahAssar · on May 1, 2020

Yeah, that is possible, but my point is that it sorta breaks how people are used to joining these kinds of meetings. The back and forth required is not the expectation most people have.

If you have more people then you'd need to do this once per person (after that they can gossip the SDP over data channels to find the other participants).

Usual flow:

1. I and other people go to meet.example/DiscussImportantStuff

Your proposed flow:

1. I go to meet.example/DiscussImportantStuff (and it generates my sdp in the background and appends it to URL)

2. I send meet.example/DiscussImportantStuff#MySDPHere to my friend

3. He goes to meet.example/DiscussImportantStuff#MySDPHere (and it generates an answer SDP and puts it in the URL in place of mine)

4. He sends me back meet.example/DiscussImportantStuff#FriendsSDPHere

5. I go to that link and we are connected.

Repeat steps 2-5 for each participant.

Considering how little technical complexity is saved, and that you still need to have some sort of communication channel set up, I don't think the proposed flow is worth it.

sktrdie · on May 1, 2020

Indeed that makes sense!

Curious though if the "signalling server" can be abstracted away the same way STUN servers are: put a few URLs of signalling servers in the client app and it would choose whichever. They would all need the same echoing capabilities.

The point is to not have to maintain or spin up any servers when developing WebRTC apps, and to make them fully autonomous.

Something like this could also be pushed forward to develop some sort of DHT around WebRTC, so that the process of finding "signalling servers" stays self-sufficient even if the hardcoded URLs in the client code are all offline.
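Roughly what I have in mind for the "choose whichever" part, as a sketch only. The candidate URLs are made up, and `probe` is an injected, hypothetical reachability check (it could open a websocket and wait for an echo) rather than any existing API.

```typescript
// Hypothetical list of interchangeable echo-style signalling servers.
const SIGNALING_CANDIDATES: readonly string[] = [
  "wss://signal-a.example/echo",
  "wss://signal-b.example/echo",
  "wss://signal-c.example/echo",
];

// Injected reachability check: resolves to true if the server is usable.
type Probe = (url: string) => Promise<boolean>;

// Try candidates in order and return the first that answers, or null if
// none do. A probe that throws is treated the same as an unreachable server.
async function pickSignalingServer(
  candidates: readonly string[],
  probe: Probe,
): Promise<string | null> {
  for (const url of candidates) {
    try {
      if (await probe(url)) return url;
    } catch {
      // Unreachable or misbehaving server: move on to the next candidate.
    }
  }
  return null;
}
```

Both peers would have to end up on the same server for this to work, so the choice would need to be deterministic or shared somehow.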

Just a thought.

SahAssar · on May 1, 2020

EDIT: tried to be more clear by condensing the comment into two questions:

1. How can you trust trust if it is established over an untrusted channel and you have no previous store of trust?

2. How can you verify identity when you have no trust and it is communicated over an untrusted proxy (the signaling server)?

STUN/TURN play no part in establishing the trust between the parties; they just facilitate the connection by acting as a lookup service or a forwarding service. The signaling has to be trusted for the communication to be trusted.

--- Original comment: ---

One problem is that signaling is pretty specific to the app using it. For example, how a meeting is determined and how it is used is very different between Zoom, Google Meet, Slack and so on. There is also a question of trust: in a P2P WebRTC flow you can have end-to-end encryption, but it still requires you to trust the signaling (since you have no way to communicate trust before signaling).

For purposes where you have a previous channel to communicate trust, you probably don't need signaling via a third party, and for purposes where you don't have such a channel, you probably couldn't trust the signaling party if it was just an open relay on the net.
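To illustrate the first case: the SDP carries the peer's DTLS certificate fingerprint in an `a=fingerprint` line, so if you already have a trusted channel you can compare that value against one your peer gave you there. A minimal sketch, assuming the expected fingerprint arrives as an "algorithm digest" string; `extractFingerprint` and `fingerprintMatches` are made-up helper names, and only the first fingerprint line is checked for simplicity.

```typescript
// Put a fingerprint into one canonical form: lowercase algorithm,
// uppercase hex digest.
function normalizeFingerprint(algorithm: string, digest: string): string {
  return `${algorithm.toLowerCase()} ${digest.toUpperCase()}`;
}

// Return the first certificate fingerprint announced in an SDP blob,
// or null if there is none.
function extractFingerprint(sdp: string): string | null {
  const match = /^a=fingerprint:(\S+)[ \t]+(\S+)/m.exec(sdp);
  if (!match) return null;
  const algorithm = match[1];
  const digest = match[2];
  if (!algorithm || !digest) return null;
  return normalizeFingerprint(algorithm, digest);
}

// Compare the SDP's fingerprint with one obtained over a trusted channel.
function fingerprintMatches(sdp: string, expected: string): boolean {
  const announced = extractFingerprint(sdp);
  const parts = expected.trim().split(/\s+/);
  const algorithm = parts[0];
  const digest = parts[1];
  if (announced === null || parts.length !== 2 || !algorithm || !digest) return false;
  return announced === normalizeFingerprint(algorithm, digest);
}
```

Without such a channel there is nothing to compare against, which is exactly the problem with an open relay.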

For the "free signaling server" to be a good solution it would first have to handle the problem of proven identity, which is something that even Facebook, with billions of users, has a problem with.

A DHT solves a very different problem than identity. The problem is not being able to address or speak to a user; the problem is being able to speak to the right user. WebRTC provides a channel to do that if you point it at the right user. Our problem is finding that right user in a secure, smooth way.