Commit 9e18881

Add Introduction to Elixir WebRTC tutorial (#136)
1 parent ef355c0 commit 9e18881

9 files changed: +714 -1 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ end
## Getting started

To get started with Elixir WebRTC, check out:
* the [Introduction to Elixir WebRTC](https://hexdocs.pm/ex_webrtc/intro.html) tutorial
* the [examples directory](https://github.com/elixir-webrtc/ex_webrtc/tree/master/examples) that contains a bunch of very simple usage examples of the library
* the [`apps` repo](https://github.com/elixir-webrtc/apps) with example applications built on top of `ex_webrtc`
* the [documentation](https://hexdocs.pm/ex_webrtc/readme.html), especially the [`PeerConnection` module page](https://hexdocs.pm/ex_webrtc/ExWebRTC.PeerConnection.html)
File renamed without changes.

guides/introduction/consuming.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Consuming media data

Besides just forwarding, we would probably like to be able to use the media right in the Elixir app,
e.g. to feed it to a machine learning model or to create a recording of a meeting.

In this tutorial, we are going to build on top of the simple app from the previous tutorial. Instead of just sending the packets back, we will depayload and decode
the media, use a machine learning model to somehow augment the video, encode and payload it back into RTP packets, and only then send it to the web browser.

## Depayloading RTP

We refer to the process of taking the media payload out of RTP packets as _depayloading_.
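To make the idea more concrete, here is a minimal sketch that takes the payload out of a raw RTP packet, following the header layout described in RFC 3550. The `RTPSketch` module and its function names are made up for this illustration and are not part of Elixir WebRTC, which already hands you parsed packets. Keep in mind that this only removes the RTP layer; codec-specific payload formats (e.g. the VP8 payload descriptor) still have to be handled afterwards.

```elixir
defmodule RTPSketch do
  # Hypothetical helper, for illustration only.
  # Fixed RTP header (RFC 3550): version (2 bits), padding flag, extension flag,
  # CSRC count (4 bits), marker, payload type (7 bits), sequence number (16 bits),
  # timestamp (32 bits), SSRC (32 bits).
  def depayload(
        <<2::2, padding::1, extension::1, csrc_count::4, marker::1, payload_type::7,
          sequence_number::16, timestamp::32, ssrc::32, rest::binary>>
      ) do
    with {:ok, rest} <- skip(rest, csrc_count * 4),
         {:ok, rest} <- skip_extension(extension, rest),
         {:ok, payload} <- strip_padding(padding, rest) do
      {:ok,
       %{
         payload_type: payload_type,
         marker: marker == 1,
         sequence_number: sequence_number,
         timestamp: timestamp,
         ssrc: ssrc,
         payload: payload
       }}
    end
  end

  def depayload(_other), do: {:error, :invalid_packet}

  # Drops the first `n` bytes, e.g. the list of 32-bit CSRC identifiers.
  defp skip(bin, n) when byte_size(bin) >= n,
    do: {:ok, binary_part(bin, n, byte_size(bin) - n)}

  defp skip(_bin, _n), do: {:error, :truncated}

  # The header extension starts with a 16-bit profile id and a 16-bit length
  # expressed in 32-bit words.
  defp skip_extension(0, rest), do: {:ok, rest}
  defp skip_extension(1, <<_profile::16, words::16, rest::binary>>), do: skip(rest, words * 4)
  defp skip_extension(1, _rest), do: {:error, :truncated}

  # When the padding flag is set, the last byte says how many trailing bytes
  # (including itself) are padding.
  defp strip_padding(0, rest), do: {:ok, rest}

  defp strip_padding(1, rest) when byte_size(rest) > 0 do
    pad = :binary.last(rest)

    if pad > 0 and pad <= byte_size(rest) do
      {:ok, binary_part(rest, 0, byte_size(rest) - pad)}
    else
      {:error, :invalid_padding}
    end
  end

  defp strip_padding(1, _rest), do: {:error, :invalid_padding}
end
```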

> #### Codecs {: .info}
> A media codec is a program used to encode/decode digital video and audio streams. Codecs also compress the media data;
> otherwise, it would be too big to send over the network (the bitrate of raw 24-bit color depth, Full HD, 60 fps video is about 3 Gbit/s!).
>
> In WebRTC, you will most likely encounter the VP8, H264, or AV1 video codecs and the Opus audio codec. The codecs that will be used during the session are negotiated in
> the SDP offer/answer exchange. You can tell what codec is carried in an RTP packet by inspecting its payload type (`packet.payload_type`,
> a non-negative integer field) and matching it with one of the codecs listed in the `codecs` field of this track's transceiver (you have to find
> the `transceiver` by iterating over `PeerConnection.get_transceivers` as shown previously in this tutorial series).

_TBD_
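Until this section is filled in, the two claims from the Codecs note can be sketched in a few lines: the raw bitrate arithmetic and the payload type lookup. The `CodecSketch` module is hypothetical, the codec entries are plain maps standing in for whatever the transceiver's `codecs` field really contains, and the payload type numbers are only illustrative, since they are assigned dynamically during negotiation.

```elixir
defmodule CodecSketch do
  # Hypothetical helper, for illustration only.

  # Bitrate of uncompressed video in bits per second.
  def raw_bitrate(width, height, bits_per_pixel, fps),
    do: width * height * bits_per_pixel * fps

  # Returns the codec whose payload type matches the packet's, or nil.
  # `codecs` mimics the list kept in a transceiver's `codecs` field.
  def codec_for(packet, codecs) do
    Enum.find(codecs, fn codec -> codec.payload_type == packet.payload_type end)
  end
end

# Full HD, 24-bit color depth, 60 fps: 2_985_984_000 bit/s, i.e. about 3 Gbit/s
CodecSketch.raw_bitrate(1920, 1080, 24, 60)

# Example values only; real payload types come from the SDP offer/answer
codecs = [
  %{payload_type: 111, mime_type: "audio/opus"},
  %{payload_type: 96, mime_type: "video/VP8"}
]

CodecSketch.codec_for(%{payload_type: 96}, codecs)
```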

## Decoding the media to raw format

_TBD_

guides/introduction/forwarding.md

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
# Forwarding media data

Elixir WebRTC, in contrast to the JavaScript API, provides you with the actual media data transmitted via WebRTC.
That means you can be much more flexible with what you do with the data, but you also need to know a bit more
about how WebRTC actually works under the hood.

All of the media data received by the `PeerConnection` is sent to the user in the form of messages like this:

```elixir
receive do
  {:ex_webrtc, ^pc, {:rtp, track_id, _rid, packet}} ->
    # do something with the packet
    # also, for now, you can assume that _rid is always nil and ignore it
end
```

The `track_id` corresponds to one of the tracks that we received in `{:ex_webrtc, _from, {:track, %MediaStreamTrack{id: track_id}}}` messages.
The `packet` is an RTP packet. It contains the media data alongside some other useful information.

> #### The RTP protocol {: .info}
> RTP is a network protocol created for carrying real-time data (like media) and is used by WebRTC.
> It provides some useful features like:
>
> * sequence numbers: UDP (which is usually used by WebRTC) does not provide ordering, thus we need this to catch missing or out-of-order packets
> * timestamp: these can be used to correctly play the media back to the user (e.g. using the right framerate for the video)
> * payload type: thanks to this combined with information in the SDP offer/answer, we can tell which codec is carried by this packet
>
> and many more. Check out [RFC 3550](https://datatracker.ietf.org/doc/html/rfc3550) to learn more about RTP.
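As a rough illustration of the first two features, the sketch below classifies a packet by comparing its sequence number with the previous one and converts a timestamp difference into milliseconds. It assumes the RFC 3550 field sizes (16-bit sequence numbers and 32-bit timestamps, both of which wrap around); the `SeqSketch` module and its function names are hypothetical and not part of Elixir WebRTC.

```elixir
defmodule SeqSketch do
  # Hypothetical helper, for illustration only.
  @seq_space 65_536
  @ts_space 4_294_967_296

  # Compares the current sequence number with the previous one,
  # taking the 16-bit wrap-around into account.
  def classify(prev_seq, seq) do
    case Integer.mod(seq - prev_seq, @seq_space) do
      0 -> :duplicate
      1 -> :in_order
      # a small forward jump means that some packets are missing (so far)
      diff when diff < 32_768 -> {:gap, diff - 1}
      # anything else is treated as an old packet that arrived late
      _ -> :late
    end
  end

  # Time between two RTP timestamps in milliseconds.
  # `clock_rate` depends on the codec, e.g. 90_000 for video.
  def elapsed_ms(prev_ts, ts, clock_rate) do
    Integer.mod(ts - prev_ts, @ts_space) * 1000 / clock_rate
  end
end
```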

Next, we will learn what you can do with the RTP packets.
For now, we won't actually look into the packets themselves. Our goal for this part of the tutorial will be to forward the received data back to the same web browser.

```mermaid
flowchart LR
  subgraph Elixir
    PC[PeerConnection] --> Forwarder --> PC
  end

  WB((Web Browser)) <-.-> PC
```

The only thing we have to implement is the `Forwarder` GenServer. Let's combine the ideas from the previous section to write it.

```elixir
defmodule Forwarder do
  use GenServer

  alias ExWebRTC.{PeerConnection, ICEAgent, MediaStreamTrack, SessionDescription}

  @ice_servers [%{urls: "stun:stun.l.google.com:19302"}]

  @impl true
  def init(_) do
    {:ok, pc} = PeerConnection.start_link(ice_servers: @ice_servers)

    # we expect to receive two tracks from the web browser - one for audio, one for video
    # so we also need to add two tracks here, we will use these to forward media
    # from each of the web browser tracks
    stream_id = MediaStreamTrack.generate_stream_id()
    audio_track = MediaStreamTrack.new(:audio, [stream_id])
    video_track = MediaStreamTrack.new(:video, [stream_id])

    {:ok, _sender} = PeerConnection.add_track(pc, audio_track)
    {:ok, _sender} = PeerConnection.add_track(pc, video_track)

    # in_tracks (tracks we will receive media from) = %{id => kind}
    # out_tracks (tracks we will send media to) = %{kind => id}
    out_tracks = %{audio: audio_track.id, video: video_track.id}
    {:ok, %{pc: pc, out_tracks: out_tracks, in_tracks: %{}}}
  end

  # ...
end
```

We started by creating the PeerConnection and adding two tracks (one for audio and one for video).
Remember that these tracks will be used to *send* data to the web browser peer. Remote tracks (the ones we will set up on the JavaScript side, like in the previous tutorial)
will arrive as messages after the negotiation is completed.

> #### Where are the tracks? {: .tip}
> In the context of Elixir WebRTC, a track is simply a _track id_, _ids_ of streams this track belongs to, and a _kind_ (audio/video).
> We can either add tracks to the PeerConnection (these tracks will be used to *send* data when calling `PeerConnection.send_rtp/4` and
> for each one of the tracks, the remote peer should fire the `track` event)
> or handle remote tracks (which you are notified about with messages from the PeerConnection process: `{:ex_webrtc, _from, {:track, track}}`).
> These are used when handling messages with RTP packets: `{:ex_webrtc, _from, {:rtp, track_id, _rid, packet}}`.
> Keep in mind that you cannot use the same track to send AND receive.
>
> Alternatively, all of the tracks can be obtained by iterating over the transceivers:
>
> ```elixir
> tracks =
>   peer_connection
>   |> PeerConnection.get_transceivers()
>   |> Enum.map(&(&1.receiver.track))
> ```
>
> If you want to know more about transceivers, read the [Mastering Transceivers](https://hexdocs.pm/ex_webrtc/mastering_transceivers.html) guide.

Next, we need to take care of the offer/answer and ICE candidate exchange. As in the previous tutorial, we assume that there's some kind
of WebSocket relay service available that will forward our offer/answer/candidate messages to the web browser and back to us.

```elixir
@impl true
def handle_info({:web_socket, {:offer, offer}}, state) do
  :ok = PeerConnection.set_remote_description(state.pc, offer)
  {:ok, answer} = PeerConnection.create_answer(state.pc)
  :ok = PeerConnection.set_local_description(state.pc, answer)

  web_socket_send(answer)
  {:noreply, state}
end

@impl true
def handle_info({:web_socket, {:ice_candidate, cand}}, state) do
  :ok = PeerConnection.add_ice_candidate(state.pc, cand)
  {:noreply, state}
end

@impl true
def handle_info({:ex_webrtc, _from, {:ice_candidate, cand}}, state) do
  web_socket_send(cand)
  {:noreply, state}
end
```

Now we can expect to receive messages with notifications about new remote tracks.
Let's handle these and match them with the tracks that we are going to send to.
We need to be careful not to send packets from the audio track on a video track by mistake!

```elixir
@impl true
def handle_info({:ex_webrtc, _from, {:track, track}}, state) do
  state = put_in(state.in_tracks[track.id], track.kind)
  {:noreply, state}
end
```

We are ready to handle the incoming RTP packets!

```elixir
@impl true
def handle_info({:ex_webrtc, _from, {:rtp, track_id, nil, packet}}, state) do
  kind = Map.fetch!(state.in_tracks, track_id)
  id = Map.fetch!(state.out_tracks, kind)
  :ok = PeerConnection.send_rtp(state.pc, id, packet)

  {:noreply, state}
end
```
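The two lookups in the handler above (incoming track id to kind, then kind to outgoing track id) can also be written as a small pure function, which makes the routing easy to try out without a running PeerConnection. This is only a sketch: the `RoutingSketch` module is hypothetical, and it returns an error tuple for unknown tracks, whereas the handler above uses `Map.fetch!/2` and would simply crash.

```elixir
defmodule RoutingSketch do
  # Hypothetical helper, for illustration only.
  # state.in_tracks:  %{incoming_track_id => kind}
  # state.out_tracks: %{kind => outgoing_track_id}
  def out_track_for(state, in_track_id) do
    with {:ok, kind} <- Map.fetch(state.in_tracks, in_track_id),
         {:ok, out_id} <- Map.fetch(state.out_tracks, kind) do
      {:ok, out_id}
    else
      :error -> {:error, :unknown_track}
    end
  end
end

# Made-up ids, just to show the shape of the state
state = %{
  in_tracks: %{"browser-audio" => :audio, "browser-video" => :video},
  out_tracks: %{audio: "elixir-audio", video: "elixir-video"}
}

RoutingSketch.out_track_for(state, "browser-video")
```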

> #### RTP packet rewriting {: .info}
> In the example above we just receive the RTP packet and immediately send it back. In reality, many fields in the packet header must be rewritten.
> That includes the SSRC (a number that identifies which stream the packet belongs to), the payload type (it indicates the codec; even though the codec does not
> change between two tracks, the payload types are dynamically assigned and may differ between RTP sessions), and some RTP header extensions. All of that is
> done by Elixir WebRTC behind the scenes, but be aware - it is not as simple as forwarding the same piece of data!
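To give a feel for what such rewriting means at the byte level, here is a toy sketch that replaces the payload type and the SSRC in a raw RTP packet, using the fixed header layout from RFC 3550. It is not how Elixir WebRTC does it internally, and the `RewriteSketch` module is hypothetical; header extensions are left untouched here.

```elixir
defmodule RewriteSketch do
  # Hypothetical helper, for illustration only.
  # Byte 0: version, padding, extension, CSRC count (kept as is)
  # Byte 1: marker bit + 7-bit payload type
  # Bytes 2-7: sequence number and timestamp (kept as is)
  # Bytes 8-11: SSRC
  def rewrite(
        <<first::8, marker::1, _old_pt::7, seq_and_ts::binary-size(6), _old_ssrc::32,
          rest::binary>>,
        new_pt,
        new_ssrc
      )
      when new_pt in 0..127 and new_ssrc in 0..0xFFFFFFFF do
    <<first::8, marker::1, new_pt::7, seq_and_ts::binary, new_ssrc::32, rest::binary>>
  end
end
```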

Lastly, let's take care of the client-side code. It's nearly identical to what we have written in the previous tutorial.

```js
const localStream = await navigator.mediaDevices.getUserMedia({audio: true, video: true});
const pc = new RTCPeerConnection({iceServers: [{urls: "stun:stun.l.google.com:19302"}]});
localStream.getTracks().forEach(track => pc.addTrack(track, localStream));

// these will be the tracks that we added using `PeerConnection.add_track`
pc.ontrack = event => videoPlayer.srcObject = event.streams[0];

// sending/receiving the offer/answer/candidates to the other peer is your responsibility
pc.onicecandidate = event => send_to_other_peer(event.candidate);
on_cand_received(cand => pc.addIceCandidate(cand));

// remember that we set up the Elixir app to just handle the incoming offer
// so we need to generate and send it (and thus, start the negotiation) here
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
send_offer_to_other_peer(offer);

const answer = await receive_answer_from_other_peer();
await pc.setRemoteDescription(answer);
```

And that's it! The other peer should be able to see and hear the echoed video and audio.

> #### PeerConnection state {: .info}
> Before we can send anything on a PeerConnection, its state must change to `connected`, which is signaled
> by the `{:ex_webrtc, _from, {:connection_state_change, :connected}}` message. In this particular example, we want
> to send packets on the very same PeerConnection that we received the packets from, so it must already be connected
> by the time the first RTP packet is received.
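In a setup where packets are sent on a different PeerConnection than the one they arrived on, that guarantee no longer holds. One possible way to deal with it, shown here purely as an assumption-laden sketch, is to remember the connection state and drop packets until it becomes `:connected`. The `SendGateSketch` module and its return values are hypothetical; a real GenServer would call `PeerConnection.send_rtp` where this sketch returns `{:send, ...}`.

```elixir
defmodule SendGateSketch do
  # Hypothetical helper, for illustration only.
  def new, do: %{connected?: false, dropped: 0}

  # Track the connection state; any state other than :connected closes the gate.
  def handle({:ex_webrtc, _from, {:connection_state_change, conn_state}}, state) do
    {:ignore, %{state | connected?: conn_state == :connected}}
  end

  # Connected: the packet may be sent.
  def handle({:ex_webrtc, _from, {:rtp, track_id, _rid, packet}}, %{connected?: true} = state) do
    {{:send, track_id, packet}, state}
  end

  # Not connected yet: drop the packet and count it.
  def handle({:ex_webrtc, _from, {:rtp, _track_id, _rid, _packet}}, state) do
    {:drop, %{state | dropped: state.dropped + 1}}
  end
end
```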

What you've seen here is a simplified version of the [echo](https://github.com/elixir-webrtc/ex_webrtc/tree/master/examples/echo) example available
in the Elixir WebRTC GitHub repo. Check it out and play with it!

Now, you might be thinking that forwarding the media back to the same web browser does not seem very useful, and you're probably right!
But thankfully, you can use the gained knowledge to build more complex apps.

In the next part of the tutorial, we will learn how to actually do something with media data in the Elixir app.

guides/introduction/intro.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Introduction to WebRTC

In this series of tutorials, we are going to learn what WebRTC is and go through some simple use cases of Elixir WebRTC.
Its purpose is to teach you where you'd want to use WebRTC and to show you what the WebRTC API looks like and how it should
be used, focusing on some common caveats.

> #### Before You Start {: .info}
> This guide assumes little prior knowledge of the WebRTC API, but it would be highly beneficial
> to go through the [MDN WebRTC tutorial](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API)
> as the Elixir API tries to closely mimic the browser JavaScript API.

## What is WebRTC

WebRTC is an open, real-time communication standard that allows you to send video, audio, and generic data between peers over the network.
It places a lot of emphasis on low latency (targeting values in the low hundreds of milliseconds end-to-end) and was designed to be used peer-to-peer.

WebRTC is implemented by all of the major web browsers and is available as a JavaScript API. There are also native WebRTC clients for Android and iOS
and implementations in other programming languages ([Pion](https://github.com/pion/webrtc), [webrtc.rs](https://github.com/webrtc-rs/webrtc),
and now [Elixir WebRTC](https://github.com/elixir-webrtc/ex_webrtc)).

## Where would you use WebRTC

WebRTC is the obvious choice in applications where low latency is important. It's also probably the easiest way to obtain voice and video from a user of
your web application. Here are some example use cases:

* videoconferencing apps (one-on-one meetings or fully fledged meeting rooms, like Microsoft Teams or Google Meet)
* ingress for broadcasting services (as a presenter, you can use WebRTC to get media to a server, which will then broadcast it to viewers using WebRTC or different protocols)
* obtaining voice and video from web app users to use it for machine learning model inference on the back end

In general, all of the use cases come down to getting media from one peer to another. In the case of Elixir WebRTC, one of the peers is usually a server,
like your Phoenix app (although it doesn't have to be - there's no concept of server/client in WebRTC, so you might as well connect two browsers or two Elixir peers).

This is what the next section of this tutorial series will focus on - we will try to get media from a web browser to a simple Elixir app.
