[Libwebsockets] Segfault

"Andy Green (林安廸)" andy at warmcat.com
Mon Jan 28 15:44:45 CET 2013


On 28/01/13 22:37, the mail apparently from Jack Mitchell included:
> On 28/01/13 14:17, Jack Mitchell wrote:
>> On 28/01/13 10:08, Jack Mitchell wrote:
>>> On 25/01/13 22:24, "Andy Green (林安廸)" wrote:
>>>> On 25/01/13 23:04, the mail apparently from Jack Mitchell included:
>>>>> On 25/01/13 14:29, Jack Mitchell wrote:
>>>>>> On 25/01/13 13:11, "Andy Green (林安廸)" wrote:
>>>>>>> <snip>
>>>>>>>
>>>>>>> I studied it this evening and changed this code around somewhat.
>>>>>>> Basically you should use libwebsockets_broadcast_foreign() from
>>>>>>> another thread now and it will take care of things properly without
>>>>>>> needing to use libwebsockets_fork_service_loop().  I added some
>>>>>>> documentation about it to README.coding also in this patch:
>>>>>>>
>>>>>>> http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=52f28ce67acd96f468d9ebbfe6f61fea5be4502b
>>>>>>>
>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>
>>>>>> Hi Andy,
>>>>>>
>>>>>> Ok, so this seems to have successfully stopped the segfaulting.
>>>>>
>>>>> Apologies, I take that back. This has done something odd and it seems
>>>>> as though only some of the data gets through, almost like it's only
>>>>> writing the last piece of data? I'm not 100% sure, as my websocket
>>>>> data isn't deterministic...
>>>>
>>>> How much data are we trying to write at once?
>>>>
>>>> It must pass through a socket to synchronize with the service loop,
>>>> you will have to check the return code of
>>>> libwebsockets_broadcast_foreign() to see if the send succeeded. If
>>>> there's a big spam it will have to wait until the service loop gets
>>>> to it.
>>>
>>> Hi Andy,
>>>
>>> I think I've gotten to the bottom of this, and it is that it only
>>> sends the broadcast to the client.
>>>
>>> All broadcasts are in thread 1 and all services are in thread 0.
>>>
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Broadcasted:
>>> {"method":["updateRegisters"],"parameters":{"10":{"val":539},"12":{"val":15504}}}
>>> // Sends
>>> Serviced card socket 0
>>> Broadcasted:
>>> {"method":["updateData"],"parameters":{"43":{"val":539},"45":{"val":15504}}}
>>> //Sends
>>> Device (0) : Reading and Processing FPGA data took 15ms
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Broadcasted:
>>> {"method":["updateRegisters"],"parameters":{"10":{"val":471},"12":{"val":15571},"28":{"val":87}}}
>>> // Doesn't send
>>> Broadcasted:
>>> {"method":["updateData"],"parameters":{"43":{"val":471},"45":{"val":15571}}}
>>> // Doesn't send
>>> Broadcasted: {"method":["updateData"],"parameters":{"47":{"val":87}}}
>>> // Sends
>>> Device (0) : Reading and Processing FPGA data took 18ms
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Broadcasted:
>>> {"method":["updateRegisters"],"parameters":{"10":{"val":488},"12":{"val":15555},"28":{"val":86}}}
>>> // Doesn't send
>>> Broadcasted:
>>> {"method":["updateData"],"parameters":{"43":{"val":488},"45":{"val":15555}}}
>>> // Doesn't send
>>> Broadcasted: {"method":["updateData"],"parameters":{"47":{"val":86}}}
>>> // Sends
>>> Device (0) : Reading and Processing FPGA data took 18ms
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>> Serviced card socket 0
>>>
>>> I'm sure you get the gist. So, the question, is it valid to call
>>> broadcast_foreign multiple times without a service?
>>>
>>> If not, is it valid to call service() immediately after a broadcast()
>>> to ensure all data is serviced correctly there and then?
>>>
>>> Best Regards,
>>> Jack.
>>
>> Ok, so I think I have gotten to the bottom of this. I was storing the
>> string to be sent in a variable and I was overwriting it each time.
>>
>> With broadcast it would pretty much immediately write the string; with
>> broadcast_foreign, it would queue the broadcast callback, which led to
>> sending the same string X times.
>>
>> I have now implemented service() in each thread immediately after
>> broadcast(), with a mutex to stop the main thread and child thread
>> both executing at the same time, which I think caused segfaults.
>>
>> However, now I am facing a different challenge: I am getting a
>> segfault again. When you perform a write in broadcast, what should you
>> do if it fails? At the moment I have this chain of events which leads
>> to a segfault:
>>
>> service()
>>     receive()
>>         write()
>>         broadcast()
>>             write() <-- fails (websocket dead?)
>>                 checkLegitWSI() <-- returns OK
>>                 return
>>                 segfault
>
> A backtrace:
>
> ERROR: webSock_genericSendRecieve: Socket Generic: Failed to write to
> socket! (broadcast)
> ReturnCode lws_confirm_legit_wsi: 0
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000068 in ?? ()
> (gdb) bt
> #0  0x00000068 in ?? ()
> #1  0xb6e83474 in libwebsocket_rx_sm (wsi=wsi@entry=0xbd148,

I think it just means you're calling service from multiple threads at
the moment, which is very much the wrong direction, as I explained in the
previous mail that crossed this one.

Please consider throwing out all that and the mutexes and doing 
everything from a single thread.
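
If the data really does have to be produced on another thread, one
possible shape is sketched below. It is only an illustration under
assumptions, not code from libwebsockets: every name in it (struct msg,
msg_queue_push() and so on) is hypothetical. Producers hand a private
heap copy of each payload to the one thread that owns the service loop,
so a reused string buffer cannot be overwritten before it is sent. The
single mutex guards only the hand-off list and never wraps a library
call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hand-off queue; none of these names are libwebsockets API. */
struct msg {
    struct msg *next;
    size_t len;
    char *data;              /* private, NUL-terminated copy of the payload */
};

struct msg_queue {
    pthread_mutex_t lock;    /* protects head/tail only */
    struct msg *head;
    struct msg *tail;
};

static void msg_queue_init(struct msg_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->tail = NULL;
}

/*
 * Safe to call from any producer thread. The payload is copied, so the
 * caller may overwrite or free buf as soon as this returns.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
static int msg_queue_push(struct msg_queue *q, const char *buf, size_t len)
{
    struct msg *m = malloc(sizeof(*m));

    if (m == NULL)
        return -1;
    m->data = malloc(len + 1);
    if (m->data == NULL) {
        free(m);
        return -1;
    }
    memcpy(m->data, buf, len);
    m->data[len] = '\0';
    m->len = len;
    m->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/*
 * Intended for the service thread only. Detaches and returns the oldest
 * message, or NULL when the queue is empty. Release it with msg_free().
 */
static struct msg *msg_queue_pop(struct msg_queue *q)
{
    struct msg *m;

    pthread_mutex_lock(&q->lock);
    m = q->head;
    if (m != NULL) {
        q->head = m->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return m;
}

static void msg_free(struct msg *m)
{
    free(m->data);
    free(m);
}
```

The service thread would then alternate between libwebsocket_service()
with a short timeout and draining the queue with msg_queue_pop(),
sending each message from that same thread with
libwebsockets_broadcast(), so that no library call is ever made from
the producer threads.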

-Andy

> c=<optimized out>) at parsers.c:950
> #2  0xb6e8372c in libwebsocket_interpret_incoming_packet
> (wsi=wsi@entry=0xbd148,
>      buf=buf@entry=0xbeffebd0
> "\301\240\357!\255\035Ew\375\327\242\fd\325\240p\037O\305o\200l\241\r\347L=p\375W\351\360\373\033\345\224\255\035\301\256\035\202\274w\267\304\364l5k\224b9\224\231u\030\320\067\375\210.\026\366\267\360\316\345\017\021\017\252Wq.</qO\367\227\310\356\374V\026\326\034\034\202\301\210\327W7\016\225\205QFUa7\016\301\204\230\217\310\217\272\334\323\217",
> len=len@entry=114) at parsers.c:1037
> #3  0xb6e83974 in libwebsocket_read (context=context@entry=0x47000,
> wsi=wsi@entry=0xbd148,
>      buf=0xbeffebd0
> "\301\240\357!\255\035Ew\375\327\242\fd\325\240p\037O\305o\200l\241\r\347L=p\375W\351\360\373\033\345\224\255\035\301\256\035\202\274w\267\304\364l5k\224b9\224\231u\030\320\067\375\210.\026\366\267\360\316\345\017\021\017\252Wq.</qO\367\227\310\356\374V\026\326\034\034\202\301\210\327W7\016\225\205QFUa7\016\301\204\230\217\310\217\272\334\323\217",
> len=114) at handshake.c:233
> #4  0xb6e81b30 in libwebsocket_service_fd
> (context=context@entry=0x47000, pollfd=<optimized out>)
>      at libwebsockets.c:884
> #5  0xb6e81c14 in libwebsocket_service (context=0x47000,
> timeout_ms=timeout_ms@entry=0) at libwebsockets.c:1047
> #6  0x0000a984 in main () at R0005.c:109
> (gdb)
>
>
>>
>> Does that make sense? I know it's not very clear but I'm trying to
>> break down a fairly complex program with an involved chain of events...
>>
>> Thanks,
>> Jack.
>>
>>>
>>>>
>>>>>> However there seems to be a small issue with the socket function as
>>>>>> Valgrind kindly points out:
>>>>>>
>>>>>> ==3475== Thread 5:
>>>>>> ==3475== Syscall param socketcall.send(msg) points to uninitialised
>>>>>> byte(s)
>>>>>> ==3475==    at 0x4FEEF374: send (in /lib/libpthread-2.16.so)
>>>>>> ==3475==    by 0x498AA0B: libwebsockets_broadcast_foreign
>>>>>> (libwebsockets.c:2186)
>>>>>> ==3475==    by 0xEF4B: webSock_broadcastJsonObject
>>>>>> (webInterface_webSockets.c:223)
>>>>>> ==3475==    by 0xAFDB: XX86_processFPGAData (XX86.c:141)
>>>>>> ==3475==    by 0xDA13: XX86_tickCheck (XX86_init.c:118)
>>>>>> ==3475==    by 0x4FEE6F5B: start_thread (pthread_create.c:313)
>>>>>> ==3475==    by 0x4FE2E0D7: ??? (in /lib/libc-2.16.so)
>>>>>> ==3475==  Address 0x6f75d42 is on thread 5's stack
>>>>>> ==3475==
>>>>>> ==3475== Thread 1:
>>>>>> ==3475== Syscall param socketcall.send(msg) points to uninitialised
>>>>>> byte(s)
>>>>>> ==3475==    at 0x4FEEF374: send (in /lib/libpthread-2.16.so)
>>>>>> ==3475==    by 0x498AA0B: libwebsockets_broadcast_foreign
>>>>>> (libwebsockets.c:2186)
>>>>>> ==3475==    by 0xEF4B: webSock_broadcastJsonObject
>>>>>> (webInterface_webSockets.c:223)
>>>>>> ==3475==    by 0xD88B: XX86socket_handleReceive (XX86_socket.c:110)
>>>>>> ==3475==    by 0xEE3B: webSock_genericSendRecieve
>>>>>> (webInterface_webSockets.c:147)
>>>>>> ==3475==    by 0x498B3E7: user_callback_handle_rxflow
>>>>>> (libwebsockets.c:1347)
>>>>>> ==3475==    by 0x498D61B: libwebsocket_rx_sm (parsers.c:968)
>>>>>> ==3475==    by 0x498D72B: libwebsocket_interpret_incoming_packet
>>>>>> (parsers.c:1037)
>>>>>> ==3475==    by 0x498D973: libwebsocket_read (handshake.c:233)
>>>>>> ==3475==    by 0x498BB2F: libwebsocket_service_fd
>>>>>> (libwebsockets.c:884)
>>>>>> ==3475==    by 0x498BC13: libwebsocket_service (libwebsockets.c:1047)
>>>>>> ==3475==    by 0xA947: main (R0005.c:108)
>>>>>> ==3475==  Address 0xbd8999ba is on thread 1's stack
>>>>
>>>> Hm this is wrong, you should not call
>>>> libwebsockets_broadcast_foreign() from your service loop.
>>>> libwebsockets_broadcast() is the one for that.  I guess this is the
>>>> reason for your data loss.
>>>>
>>>>>> I also get the following (which isn't a new problem, just one I've
>>>>>> ignored for a bit):
>>>>>>
>>>>>> ==3475== Conditional jump or move depends on uninitialised value(s)
>>>>>> ==3475==    at 0x498B7F8: libwebsocket_service_fd
>>>>>> (libwebsockets.c:716)
>>>>>> ==3475==    by 0x498BC13: libwebsocket_service (libwebsockets.c:1047)
>>>>>> ==3475==    by 0xA947: main (R0005.c:108)
>>>>>>
>>>>>> I think this small snippet is trying to service a now closed context?
>>>>>> As for the big snippet I'm not so sure about that...
>>>>
>>>> 716 is currently this
>>>>
>>>>         if (context->started_with_parent &&
>>>>             kill(context->started_with_parent, 0) < 0)
>>>>
>>>> context gets memset after allocation IIRC.
>>>>
>>>> -Andy
>>>
>>>
>>
>>
>
>



