23

I got the Pi B+ and the Pi camera and am now trying to find the most efficient (low CPU) and lowest-latency configuration to stream H.264-encoded video from the camera to my home server.

I've read the following:

  1. http://pi.gbaman.info/?p=150

  2. http://blog.tkjelectronics.dk/2013/06/how-to-stream-video-and-audio-from-a-raspberry-pi-with-no-latency/comment-page-1/#comments

  3. http://www.raspberrypi.org/forums/viewtopic.php?p=464522

(All of these use gstreamer-1.0 from the APT source deb http://vontaene.de/raspbian-updates/ . main.)

A lot has happened in this area over the past few years.

Originally, we had to pipe the output of raspivid into gst-launch-1.0 (see link 1).

Then (link 2) the official V4L2 driver was created, which is now standard and allows the data to be obtained directly with gstreamer alone, no pipe needed (see especially the post by towolf » Sat Dec 07, 2013 3:34 pm in link 2):

Sender (Pi): gst-launch-1.0 -e v4l2src do-timestamp=true ! video/x-h264,width=640,height=480,framerate=30/1 ! h264parse ! rtph264pay config-interval=1 ! gdppay ! udpsink host=192.168.178.20 port=5000

Receiver: gst-launch-1.0 -v udpsrc port=5000 ! gdpdepay ! rtph264depay ! avdec_h264 ! fpsdisplaysink sync=false text-overlay=false

If I understand correctly, both ways use the GPU to do the H264 encoding, but the latter is a bit more efficient since there's no pipe between processes, so the data doesn't have to cross the kernel boundary an extra time.


Now I have some questions about this.

  1. Is the latter still the most recent way to efficiently get H264 from the camera? I've read about gst-omx, which allows gstreamer pipelines like ... video/x-raw ! omxh264enc ! .... Does this do anything differently from just using video/x-h264, or might it even be more efficient? What's the difference? (A sketch of such a pipeline follows after this list.)

  2. How do I find out which gstreamer encoding plugin is actually used when I use the video/x-h264 ... pipeline? This seems to just specify the format I want, whereas the other pipeline parts explicitly name the component (like h264parse or fpsdisplaysink).

  3. In this reply to link 1 Mikael Lepistö mentions "I removed one unnecessary filter pass from streaming side", meaning that he cut out the gdppay and gdpdepay. What do those do? Why are they needed? Can I really strip them off?

  4. He also mentions that by specifying caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" parameters for the udpsrc at the receiving side, he's able to start/resume the streaming in the middle of the stream. What do these caps achieve, why these specific choices, where can I read more about them?

  5. When I do what's suggested in questions 3 and 4 (adding the caps, dropping gdppay and gdpdepay; the exact commands are sketched after this list), my video latency becomes much worse, and it accumulates: the latency increases over time, and after a few minutes the video stops! Why could that be? I would like to get the latency I obtained with the original command, but also have the feature of being able to join the stream at any time.

  6. I've read that RTSP+RTP usually use a combination of TCP and UDP: TCP for control messages and other things that mustn't get lost, and UDP for the actual video data transmission. In the setups above, am I actually using that, or am I just using UDP only? It's a bit opaque to me whether gstreamer takes care of this or not.
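
For concreteness, here is roughly what these variants look like. The stripped version from questions 3 and 4 (gdppay/gdpdepay removed, explicit caps on udpsrc), which is what I tried for question 5:

Sender (Pi): gst-launch-1.0 -e v4l2src do-timestamp=true ! video/x-h264,width=640,height=480,framerate=30/1 ! h264parse ! rtph264pay config-interval=1 ! udpsink host=192.168.178.20 port=5000

Receiver: gst-launch-1.0 -v udpsrc port=5000 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! avdec_h264 ! fpsdisplaysink sync=false text-overlay=false

And the gst-omx variant from question 1 would presumably look something like this (an untested sketch, encoder parameters omitted):

gst-launch-1.0 -e v4l2src do-timestamp=true ! video/x-raw,width=640,height=480,framerate=30/1 ! omxh264enc ! h264parse ! rtph264pay config-interval=1 ! udpsink host=192.168.178.20 port=5000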

I would appreciate any answer to even a single one of these questions!

nh2
  • The idea that using a pipe | creates any issue in this context is an incredible piece of B.S. Have you tried any raspivid | cvlc methods? I haven't had the camera for very long or much time to play with it, but using that to produce an http stream (viewable on linux at the other end w/ vlc) seems to work okay. – goldilocks Jan 10 '15 at 11:30
  • @goldilocks I'm not saying that the pipe is an "issue", just that it is not necessary and has some overhead, just like cat file | grep ... instead of grep ... file. The pipe adds another layer of copying to and from the kernel, which is easily measurable, especially on devices with low memory bandwidth. If gstreamer can read from the device file directly, why not use that? Regarding your raspivid | cvlc suggestion: I was using this before I switched to the gstreamer based solution, it has up to 3 seconds more latency than gstreamer (I don't know why). – nh2 Jan 10 '15 at 12:19
  • Yeah it definitely has some latency. WRT the pipe, my point about "context" is that this cannot possibly be a bottleneck here -- network I/O is going to be orders of magnitude slower, etc. You are right, though, it may add a bit to the CPU time. Just I'd wager not much; running that at full resolution, cvlc uses ~45%, but just running through a pipe at that data rate (keeping in mind again, the pipe is not slowing it down) would barely move the needle, I think. Like <5%. It's not totally insignificant if you want to do this as efficiently as possible of course... – goldilocks Jan 10 '15 at 13:19
  • ...I just don't want anyone else reading this to get the impression that using a pipe here might be responsible for latency issues or other problems. That's a red herring. Or I could be wrong ;) – goldilocks Jan 10 '15 at 13:20
  • If it is efficiency you are after, you might want to include observed total CPU usage for various methods at specific resolution/frame rates. The only one I've tried is the raspivid | cvlc one and that's 40-50%. People may respond better to a question that challenges them to improve on a specific figure. Right now you're asking a lot of why, without explaining why each why is significant. – goldilocks Jan 10 '15 at 13:23
  • @goldilocks Ah no, I wasn't suggesting that the pipe has anything to do with the video latency. All my questions are based on gstreamer's v4l2src, which works well so far - it's the other bits in the gstreamer pipeline that don't work / that I don't understand. – nh2 Jan 10 '15 at 13:55
  • Regarding why each question is significant: 1: I want to get the most efficient encoding, so I need to know whether ... video/x-raw ! omxh264enc ! ... and video/x-h264 are actually identical. 2: Would help me answer 1 myself. 3: I guess this question needs no explanation. 4: This lets me (re)connect the video at any time, which I need. 5: It solves reconnection, but has much higher latency. I need to know why. 6: To understand what methods I'm actually using right now. – nh2 Jan 10 '15 at 13:59
  • I think you could get answers to some of these questions here, although I also think people are less inclined to give partial answers (since we are not, as you know, a forum), and practically no one is going to answer all of this. Meaning, you might have better luck breaking it down and seeing if some parts would be more appropriate to U&L. You might also consider a gstreamer mail list (-devel or -embedded, I guess, try the former first). If that works out, please come back and answer yourself, of course. – goldilocks Jan 10 '15 at 14:09
  • possible duplicate http://raspberrypi.stackexchange.com/questions/13382/what-streaming-solution-for-the-picam-has-the-smallest-lag – user1133275 May 09 '15 at 22:44
  • A related issue that talks about gdppay: https://github.com/thaytan/gst-rpicamsrc/issues/20 – nh2 May 20 '15 at 14:35
  • As a follow-up for what I'm using now: gst-rpicamsrc is a gstreamer element that can get H264 encoded video from the Raspicam directly into gstreamer with low overhead - I'm using that now, and it works well with low latency. This in combination with the gst-rtsp-server for serving it to RTSP clients - it uses TCP to control the connection and UDP to send the data. See this: http://cgit.freedesktop.org/gstreamer/gst-rtsp-server/tree/examples/test-launch.c?id=1.4.5 – nh2 May 20 '15 at 14:37

4 Answers

10

The options:

  1. raspivid -t 0 -o - | nc -k -l 1234

  2. raspivid -t 0 -o - | cvlc stream:///dev/stdin --sout "#rtp{sdp=rtsp://:1234/}" :demux=h264

  3. cvlc v4l2:///dev/video0 --v4l2-chroma h264 --sout '#rtp{sdp=rtsp://:1234/}'

  4. raspivid -t 0 -o - | gst-launch-1.0 fdsrc ! h264parse ! rtph264pay config-interval=1 pt=96 ! gdppay ! tcpserversink host=SERVER_IP port=1234

  5. gst-launch-1.0 -e v4l2src do-timestamp=true ! video/x-h264,width=640,height=480,framerate=30/1 ! h264parse ! rtph264pay config-interval=1 ! gdppay ! udpsink host=SERVER_IP port=1234

  6. uv4l --driver raspicam

  7. picam --alsadev hw:1,0

Things to consider

  • latency [ms] (with and without the client requesting more fps than the server provides)
  • CPU idle [%] (measured by top -d 10)
  • CPU 1 client [%]
  • RAM [MB] (RES)
  • same encoding settings
  • same features
    • audio
    • reconnect
    • OS-independent client (vlc, webrtc, etc.)

Comparison:

                  1     2     3     4     5     6     7
latency [ms]      2000  5000  ?     ?     ?     ?     1300
CPU idle [%]      ?     1.4   ?     ?     ?     ?     ?
CPU 1 client [%]  ?     1.8   ?     ?     ?     ?     ?
RAM [MB] (RES)    ?     14    ?     ?     ?     ?     ?
encoding          ?     ?     ?     ?     ?     ?     ?
audio             n     ?     ?     ?     ?     y     ?
reconnect         y     y     ?     ?     ?     y     ?
any OS            n     y     ?     ?     ?     y     ?
latency fps       ?     ?     ?     ?     ?     ?     ?
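
To fill in the table, CPU and RAM can be sampled with top in batch mode while a given option is running; for example for option 2 (the grep pattern is only an example, adjust it to the processes actually involved):

top -b -d 10 -n 3 | grep -E 'raspivid|cvlc'

Latency can be estimated by pointing the camera at a running stopwatch and comparing the time displayed on the client with the actual time.
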
Martin
user1133275
8

I'm amazed there isn't more action on this thread; I've been chasing down the answer to this question for months.

I stream from a Pi Camera (CSI) to a Janus server, and I found the best pipeline is

gst-launch-1.0 v4l2src ! video/x-h264, width=$width, height=$height, framerate=$framerate/1 ! h264parse ! rtph264pay config-interval=1 pt=96 ! udpsink sync=false host=$host port=$port

v4l2src uses the memory-efficient bcm2835-v4l2 kernel module and pulls hardware-encoded h264 video directly. On a Pi Zero, gst-launch consumes between 4% and 10% CPU, streaming 1280x720 at 30fps. I am also able to resume the stream at any time, without using gdppay. Make sure you run rpi-update to get the mmal v4l2 driver. My Pi is also under-clocked and over-volted for stability, and streams uninterrupted for days, see here

[screenshot of top showing gst-launch's CPU usage]

I stumbled over a lot of the same problems that the OP had. The most frustrating was problem 5: latency was accumulating over time, and eventually crashing the Pi. The solution is the sync=false property on the udpsink element. The gstreamer docs don't have much information about it, just that it disables clock synchronisation, but after a lot of tears, I discovered that I can now stream for hours without accumulating latency.

I also fought problem 4; I couldn't resume a stream or start watching after the stream began. The solution to this is the config-interval property of rtph264pay, which re-sends the SPS and PPS. With config-interval=1 they are inserted into the stream every second, which allows me to pick up the stream at any time.
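
To check that you can join mid-stream, a receiver in the spirit of the OP's, with the explicit udpsrc caps from question 4, should work (an untested sketch; $host and $port as above):

gst-launch-1.0 udpsrc port=$port caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! avdec_h264 ! fpsdisplaysink sync=false text-overlay=false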

I got pretty close to the same stream using the ffmpeg pipeline:

ffmpeg -f h264 -framerate $framerate -i /dev/video0 -vcodec copy -g 60 -r $framerate -f rtp rtp://$hostname:$port

but I can't resume the stream: if I refresh the page while streaming, I get no stream. I assume this is because of the SPS and PPS frames. If anyone knows how to pack them with ffmpeg, I'd love to know.

By the way, I also use v4l2-ctl to set the camera parameters. ffmpeg seems to recognize settings like width and height automatically, but for gstreamer they have to match what the hardware is producing:

v4l2-ctl --set-fmt-video=width=$width,height=$height,pixelformat=4   # pixelformat 4 = H264 on this driver
v4l2-ctl --set-ctrl=rotate=$rotation
v4l2-ctl --overlay=1
v4l2-ctl -p $framerate                                               # set the frame rate
v4l2-ctl --set-ctrl=video_bitrate=4000000                            # or whatever bitrate you want
Ben Olayinka
  • This does not really answer the question. If you have a different question, you can ask it by clicking Ask Question. You can also add a bounty to draw more attention to this question once you have enough reputation. - From Review – Dougie Jan 22 '20 at 19:49
  • I think it does! OP asked for the most modern, efficient way to stream from a pi. The gstreamer pipeline I posted is exactly that. I also explained what the OP was missing in his pipeline, and what the critical pipeline elements are. I'm editing my response to address the cpu load directly, maybe that helps. – Ben Olayinka Jan 24 '20 at 08:51
  • I (OP) think that the answer is perfectly on point, especially given that it addresses the question on accumulating latency. Thank you! – nh2 Jan 24 '20 at 20:47
6

The only modern way to stream H264 to a browser is with UV4L: minimal latency, no configuration, with optional audio, optional two-way audio/video. No magic GStreamer sauce, yet it's possible to extend its usage.
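
For example, getting an H264 stream up looks roughly like this (a sketch based on the UV4L docs; the exact options and the default stream URL may differ between versions):

uv4l --auto-video_nr --driver raspicam --encoding h264 --width 1280 --height 720

With the uv4l-server module installed, the stream is then served over HTTP, e.g. at http://raspberrypi:8080/stream/video.h264.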

techraf
prinxis
  • Since I want to stream to my server and potentially smartphones, streaming to a browser is not a requirement. Also, the browser may put extra restrictions on it (e.g. no RTSP, potentially no TCP unless you use WebRTC, but that's fiddly). But UV4L still looks promising. Could you link to a place where I can read about how to use it / get the data out of it for streaming over the network? – nh2 Jan 15 '16 at 02:00
  • Holy cow, I think I found the example page ... this thing seems to be able to do everything! RTMP, RTSP, HTTPS streaming, WebRTC, "Real-time Object Detection and Object Tracking + Face detection" -- what the hell?? Each with some simple command line flags to uv4l? My gstreamer pipeline looks pretty outdated now! Can't wait to test how the latency is! – nh2 Jan 15 '16 at 02:15
  • Oh no, it is closed source :( That disqualifies it for the home surveillance use I had in mind :( – nh2 Jan 15 '16 at 02:30
  • It does support WebRTC, two-way WebRTC. Latency is ~200ms for audio/video, probably less for audio alone. – prinxis Jan 18 '16 at 23:54
  • @nh2, the link seems to be broken, do you have any updated location for that example page? – Punit Soni Dec 30 '16 at 23:45
  • @PunitSoni OK, updated link. – nh2 Dec 31 '16 at 12:41
  • 200ms latency is way too high for something that is being presented as 'the best option'. I get 80ms easily over wifi with ffmpeg, with HTML5 rendering through javascript decoding. – Viezevingertjes Aug 25 '17 at 15:57
2

1.) h264es streaming across the network (sample only)

on server:

raspivid -v -a 524 -a 4 -a "rpi-0 %Y-%m-%d %X" -fps 15 -n -md 2 -ih -t 0 -l -o tcp://0.0.0.0:5001

on client:

mplayer -nostop-xscreensaver -nolirc -fps 15 -vo xv -vf rotate=2,screenshot -xy 1200 -demuxer h264es ffmpeg://tcp://<rpi-ip-address>:5001

2.) mjpeg streaming across the network (sample only)

on server:

/usr/local/bin/mjpg_streamer -o output_http.so -w ./www -i input_raspicam.so -x 1920 -y 1440 -fps 3

on client:

mplayer -nostop-xscreensaver -nolirc -fps 15 -vo xv -vf rotate=2,screenshot -xy 1200 -demuxer lavf http://<rpi-ip-address>:8080/?action=stream

All of this even works on an RPi Zero W (configured as the server).

sparkie