[Libwebsockets] Help wanted: V4L2 / mp4 / h.264

Andy Green andy at warmcat.com
Tue Jun 29 20:45:41 CEST 2021

On 5/14/21 3:04 PM, Rémi COHEN-SCALI ... wrote:
> On 14/05/2021 at 15:45, andy at warmcat.com wrote:
>> On May 14, 2021 1:31:35 PM UTC, "Rémi COHEN-SCALI ..." <remi at cohenscali.net> wrote:
>>> Hi
>>> I wasn't subscribed to the mailing list when this message was written,
>>> so after reading the archives I'm taking this opportunity to answer some
>>> of the questions I found.
>>> Answers inlined.
>>>> On 4/30/21 3:39 PM, Silas Parker wrote:
>>>>> Hi Andy,
>>>>>
>>>>> I can't offer anything specific, but in the past I've used Shaka
>>>>> Packager (https://github.com/google/shaka-packager) for generating
>>>>> MPEG DASH streams which is a standard way of delivering streaming
>>>>> video to browsers and other devices.  You might be able to glean what
>>>>> Chrome is after by looking at the MP4 files that it generates.
>>>> Thanks for the hint Silas.
>>>> I started looking at this earlier today, rather than compare output,
>>>> it's interesting to see what it says when I give it what's produced
>>>> when spooled to a /tmp file as its input.
>>>> It complained I had disabled some mp4 boxes that were not needed on
>>>> ffox and errored out on each, so that's a significant clue.
>>>> After it was OK at mp4 level, it complains now about finding a type
>>>> 13 NAL at the NALU layer, I instrumented shaka to print the NAL type
>>> As you certainly understand, BMFF is made of boxes nested one inside
>>> another. The outermost one is the container itself, and the inner ones
>>> go down to groups of macroblocks (or many other things). Together they
>>> define a kind of database you can access to get data, with the boxes
>>> specifying the model and the kind of object you are dealing with.
>>> Now the use cases include streaming, progressive download and random
>>> access to this database. For the streaming use case (and to some extent
>>> the progressive download use case), the bitstream has to be packetized.
>>> The NAL (Network Abstraction Layer) and NALUs (Network Abstraction
>>> Layer Units) are the result of this packetization.
>>> Most box types are defined in BMFF. BMFF is the generic container (at
>>> the top of the class tree). It contains definitions for the standard,
>>> most generic groups of data. Specializations then exist for the derived
>>> use cases: the mp4 file format supports random access, progressive
>>> download (or perhaps the 3gp spec covers this use case... I don't
>>> remember) and streaming.
>>> Of course all these use cases are about video/audio compression, so
>>> every possible way to save space is good. SPS and PPS stand for
>>> 'Sequence Parameter Set' and 'Picture Parameter Set'. They are NAL
>>> units (carried inside the avcC box in an mp4 file) transporting sets of
>>> parameters applicable to some sequences or groups of sequences (GOP
>>> structure, etc.) and to some pictures or groups of pictures (more or
>>> less general parameters for algorithms such as decoding, deblocking,
>>> rendering). To save space, some parameters are not repeated: a
>>> parameter keeps applying to everything that follows for as long as it
>>> is not redefined with a new value.
>>> At the start of the stream you have one SPS and one PPS, and if the
>>> parameters do not need to be modified, you will not see them again.
>>> A decoder maintains a database with some context so that it can
>>> correctly work out which parameters must be applied to which part of
>>> the stream. The same problem occurs when rebuilding pictures, because
>>> the decoding order of the frames is not the same as their playing order
>>> (because of inter and intra coding: with inter coding, a frame can
>>> reference another frame that comes before it in decoding order but
>>> after it in playing order).
>>> However this is a little bit out of scope here :)
>>> Hope I answered more questions than I added :)
>>> Feel free to ask something more specific if you need.
>> Yes I have the iso spec -12.  And, I followed when making my own boxes, what was issued by libav* mux.  So broadly, the boxes are reasonable I think.
>> It can be Chrome insists to see some extra things at box layer.  Or chrome cares about the slight mangling at h.264 layer.  Or I make a mistake ffox can forgive.
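
The box nesting discussed above can be sketched in a few lines. This is a minimal illustration, not code from libwebsockets or Shaka: it assumes the ISO BMFF header layout (32-bit big-endian size, 4-byte type, optional 64-bit size when the 32-bit size is 1), and the names `iter_boxes`, `make_box` and `sample` are hypothetical helpers invented for the example.

```python
import struct


def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_offset, payload_size) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # The box runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or malformed box: stop walking
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size


def make_box(box_type, payload=b""):
    """Build a box with a plain 32-bit size (hypothetical test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A tiny synthetic buffer: ftyp followed by a moov that contains one mvhd.
sample = make_box(b"ftyp", b"isom\x00\x00\x00\x00") + make_box(
    b"moov", make_box(b"mvhd", b"\x00" * 4)
)

top = [(t, size) for t, _, size in iter_boxes(sample)]
moov = next(b for b in iter_boxes(sample) if b[0] == "moov")
inner = [(t, size) for t, _, size in iter_boxes(sample, moov[1], moov[1] + moov[2])]
print(top, inner)
```

Walking a container box is the same call again on its payload range, which is how the top-level order of boxes such as ftyp, moov and moof in a spooled /tmp file could be listed for comparison with a known-good file.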
>>>> $ ./out/Release/packager
>>>> 'in=/tmp/str.mp4,stream=video,output=/tmp/x.mp4'
>>>> --generate_static_live_mpd --mpd_output h264.mpd
>>>> [0503/110045:INFO:demuxer.cc(89)] Demuxer::Run() on file
>>>> '/tmp/str.mp4'.
>>>> [0503/110045:INFO:demuxer.cc(155)] Initialize Demuxer for file
>>>> '/tmp/str.mp4'.
>>>> [0503/110045:ERROR:avc_decoder_configuration_record.cc(53)] 13
>>>> [0503/110045:ERROR:avc_decoder_configuration_record.cc(54)] Failure
>>>> while processing: nalu.type() == Nalu::H264_SPS
>>>> [0503/110045:ERROR:mp4_media_parser.cc(605)] Failed to parse avcc.
>>> [RCS] avcC is the box that describes the kind of compression used, here
>>> AVC (Advanced Video Coding, i.e. h.264)
>>>> [0503/110045:ERROR:packager_main.cc(550)] Packaging Error: 8
>>>> (PARSER_FAILURE): Cannot parse media file /tmp/str.mp4
>>>> but I can't see any type 13 NAL ("SPS extension") in the hexdump from
>>>> a quick look.  There's also some problem with slightly truncated h.264
>>>> stream according to mplayer
>>> [RCS] There is always an SPS at stream start.
>> Yes, I solved this last weekend, it expected a different layout than I was giving it, inside the avcC box payload; from its perspective I gave it junk like 0x*D at that position.  I compared it to other file avcC and fixed it, with that the error in Shaka is gone.
>> The output it produced just had an ftyp and no moov, then moof, it could be played as a file in ffox and chrome.
>> The current _v4l2 branch has the fix for that.
> [RCS] Great, I'm digging your code and docs and I'll come back to you ...
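
The avcC payload check that Shaka tripped over can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual libwebsockets or Shaka code: it assumes the AVCDecoderConfigurationRecord layout from ISO/IEC 14496-15 (version, profile, compatibility, level, NAL length size, then length-prefixed SPS and PPS entries), and `parse_avcc`, `nal_unit_type` and the `record` bytes are hypothetical names and data made up for the example.

```python
import struct

NAL_SPS = 7  # H.264 nal_unit_type for a Sequence Parameter Set
NAL_PPS = 8  # H.264 nal_unit_type for a Picture Parameter Set


def nal_unit_type(first_byte):
    """The NAL unit type is the low 5 bits of the first NAL header byte."""
    return first_byte & 0x1F


def _read_sets(buf, pos, count, expected_type):
    """Read `count` 16-bit-length-prefixed NAL units of one expected type."""
    sets = []
    for _ in range(count):
        if pos + 2 > len(buf):
            raise ValueError("truncated parameter set length")
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        nal = buf[pos:pos + length]
        if length == 0 or len(nal) != length:
            raise ValueError("truncated parameter set")
        found = nal_unit_type(nal[0])
        if found != expected_type:
            raise ValueError(f"expected NAL type {expected_type}, got {found}")
        sets.append(nal)
        pos += length
    return sets, pos


def parse_avcc(payload):
    """Parse an avcC box payload and check the SPS/PPS NAL types."""
    if len(payload) < 7 or payload[0] != 1:
        raise ValueError("not a version 1 AVCDecoderConfigurationRecord")
    sps, pos = _read_sets(payload, 6, payload[5] & 0x1F, NAL_SPS)
    if pos >= len(payload):
        raise ValueError("missing PPS count")
    pps, pos = _read_sets(payload, pos + 1, payload[pos], NAL_PPS)
    return {
        "profile": payload[1],
        "level": payload[3],
        "nal_length_size": (payload[4] & 0x03) + 1,
        "sps": sps,
        "pps": pps,
    }


# Synthetic record: one 4-byte SPS (header byte 0x67) and one 2-byte PPS (0x68).
record = (
    bytes([1, 0x4D, 0x00, 0x1F, 0xFF, 0xE1])
    + b"\x00\x04" + bytes([0x67, 0x4D, 0x00, 0x1F])
    + bytes([1])
    + b"\x00\x02" + bytes([0x68, 0xCE])
)
info = parse_avcc(record)
print(info["nal_length_size"], len(info["sps"]), len(info["pps"]))
```

Because only the low 5 bits are inspected, any byte whose low bits are 01101 (0x0D, 0x4D, 0x6D, ...) at the position where the SPS header is expected is read as NAL type 13, so a payload laid out differently from what the parser expects can produce that error without any SPS extension NAL being present in the stream.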

Hi Rémi -

I guess there was no good outcome to report, but if there were any 
observations or suggestions on how to proceed, that can still be useful.

I get the impression that the js apis for this are not really in favour 
or wide use any more, perhaps because they are difficult to get working 
across browsers like this.  WebRTC and rtp / rtsp seems to be more 
widely used, it might make sense to try to incrementally change horse to 

