[Libwebsockets] Segfault

Jack Mitchell ml at communistcode.co.uk
Mon Jan 28 15:37:09 CET 2013


On 28/01/13 14:17, Jack Mitchell wrote:
> On 28/01/13 10:08, Jack Mitchell wrote:
>> On 25/01/13 22:24, "Andy Green (林安廸)" wrote:
>>> On 25/01/13 23:04, the mail apparently from Jack Mitchell included:
>>>> On 25/01/13 14:29, Jack Mitchell wrote:
>>>>> On 25/01/13 13:11, "Andy Green (林安廸)" wrote:
>>>>>> <snip>
>>>>>>
>>>>>> I studied it this evening and changed this code around somewhat.
>>>>>> Basically you should use libwebsockets_broadcast_foreign() from
>>>>>> another thread now and it will take care of things properly without
>>>>>> needing to use libwebsockets_fork_service_loop().  I added some
>>>>>> documentation about it to README.coding also in this patch:
>>>>>>
>>>>>> http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=52f28ce67acd96f468d9ebbfe6f61fea5be4502b 
>>>>>>
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>
>>>>> Hi Andy,
>>>>>
>>>>> Ok, so this seems to have successfully stopped the segfaulting.
>>>>
>>>> Apologies, I take that back. This has done something odd: it seems as
>>>> though only some of the data gets through, almost as if it's only
>>>> writing the last piece of data? I'm not 100% sure, as my websocket data
>>>> isn't deterministic...
>>>
>>> How much data are we trying to write at once?
>>>
>>> It must pass through a socket to synchronize with the service loop, so
>>> you will have to check the return code of
>>> libwebsockets_broadcast_foreign() to see if the send succeeded. If
>>> there's a big spam, it will have to wait until the service loop gets
>>> to it.
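
To picture the hand-off Andy describes, here is a rough sketch in plain C.
It does not use libwebsockets at all and is only a model of "pass the data
through a socket and check the return code"; foreign_send(), service_drain()
and demo_handoff() are names made up for this example, and the real library
may well work differently inside. It assumes a POSIX system with AF_UNIX
datagram socketpairs.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a cross-thread broadcast: hand the payload to
 * the service side through a socket and report whether it was accepted. */
static int foreign_send(int fd, const void *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, 0);

    return (n == (ssize_t)len) ? 0 : -1;
}

/* Hypothetical service step: drain everything queued so far without
 * blocking. Returns the number of messages seen; 'last' holds the final one. */
static int service_drain(int fd, char *last, size_t last_size)
{
    int seen = 0;
    ssize_t n;

    while ((n = recv(fd, last, last_size - 1, MSG_DONTWAIT)) > 0) {
        last[n] = '\0';
        seen++;
    }
    return seen;
}

/* Three sends with no service step in between, then one drain.
 * Returns 0 if all three messages arrived in order, -1 otherwise. */
static int demo_handoff(void)
{
    int sv[2];
    char last[128];
    int seen = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    if (foreign_send(sv[0], "one", 3) == 0 &&
        foreign_send(sv[0], "two", 3) == 0 &&
        foreign_send(sv[0], "three", 5) == 0)
        seen = service_drain(sv[1], last, sizeof last);

    close(sv[0]);
    close(sv[1]);
    return (seen == 3 && strcmp(last, "three") == 0) ? 0 : -1;
}
```

In this model each send is copied into the socket at the time of the call,
so several sends can sit there until the service side gets round to them,
and a failed send shows up in the return code rather than silently.
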
>>
>> Hi Andy,
>>
>> I think I've gotten to the bottom of this: when several broadcasts
>> happen between services, only the last one is sent to the client.
>>
>> All broadcasts are in thread 1 and all services are in thread 0.
>>
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Broadcasted: 
>> {"method":["updateRegisters"],"parameters":{"10":{"val":539},"12":{"val":15504}}} 
>> // Sends
>> Serviced card socket 0
>> Broadcasted: 
>> {"method":["updateData"],"parameters":{"43":{"val":539},"45":{"val":15504}}} 
>> //Sends
>> Device (0) : Reading and Processing FPGA data took 15ms
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Broadcasted: 
>> {"method":["updateRegisters"],"parameters":{"10":{"val":471},"12":{"val":15571},"28":{"val":87}}} 
>> // Doesn't send
>> Broadcasted: 
>> {"method":["updateData"],"parameters":{"43":{"val":471},"45":{"val":15571}}} 
>> // Doesn't send
>> Broadcasted: {"method":["updateData"],"parameters":{"47":{"val":87}}} 
>> // Sends
>> Device (0) : Reading and Processing FPGA data took 18ms
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Broadcasted: 
>> {"method":["updateRegisters"],"parameters":{"10":{"val":488},"12":{"val":15555},"28":{"val":86}}} 
>> // Doesn't send
>> Broadcasted: 
>> {"method":["updateData"],"parameters":{"43":{"val":488},"45":{"val":15555}}} 
>> // Doesn't send
>> Broadcasted: {"method":["updateData"],"parameters":{"47":{"val":86}}} 
>> // Sends
>> Device (0) : Reading and Processing FPGA data took 18ms
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>> Serviced card socket 0
>>
>> I'm sure you get the gist. So, the question: is it valid to call
>> broadcast_foreign multiple times without a service?
>>
>> If not, is it valid to call service() immediately after a broadcast()
>> to ensure all data is serviced correctly there and then?
>>
>> Best Regards,
>> Jack.
>
> Ok, so I think I have gotten to the bottom of this. I was storing the 
> string to be sent in a variable and I was overwriting it each time.
>
> With broadcast it would pretty much immediately write the string; with
> broadcast_foreign, it would queue the broadcast callback, which led to
> sending the same string X amount of times.
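
To make the overwrite problem concrete, here is a minimal sketch in plain C.
It does not touch libwebsockets; struct pending_queue, queue_message() and
demo_overwrite() are invented for the illustration. It only models a
deferred send: an entry that keeps a pointer to a shared buffer ends up
showing whatever was written last, while an entry that copies the payload
at queue time keeps its own message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PENDING 8

/* Hypothetical pending-message queue; not part of libwebsockets. */
struct pending_queue {
    const char *by_pointer[MAX_PENDING]; /* aliases the caller's buffer */
    char *by_copy[MAX_PENDING];          /* private copy of each payload */
    int count;
};

/* Queue a message both ways so the two behaviours can be compared. */
static int queue_message(struct pending_queue *q, const char *buf)
{
    size_t len = strlen(buf) + 1;
    char *copy;

    if (q->count >= MAX_PENDING)
        return -1;
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, buf, len);

    q->by_pointer[q->count] = buf;
    q->by_copy[q->count] = copy;
    q->count++;
    return 0;
}

static void free_queue(struct pending_queue *q)
{
    for (int i = 0; i < q->count; i++)
        free(q->by_copy[i]);
    q->count = 0;
}

/* Returns 1 if the pointer-based entry was clobbered by the second write
 * while the copied entries both survived, 0 otherwise. */
static int demo_overwrite(void)
{
    static char shared[64];
    struct pending_queue q = { .count = 0 };
    int ok = 0;

    snprintf(shared, sizeof shared, "%s", "updateRegisters");
    if (queue_message(&q, shared) == 0) {
        /* Reusing the same variable before anything has been delivered. */
        snprintf(shared, sizeof shared, "%s", "updateData");
        if (queue_message(&q, shared) == 0)
            ok = strcmp(q.by_pointer[0], "updateData") == 0 &&
                 strcmp(q.by_copy[0], "updateRegisters") == 0 &&
                 strcmp(q.by_copy[1], "updateData") == 0;
    }
    free_queue(&q);
    return ok;
}
```

The copying variant costs an allocation per message, but it means the
string variable can be reused as soon as the queueing call returns.
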
>
> I have now implemented service() in each thread immediately after 
> broadcast(), with a mutex to stop the main thread and child thread 
> both executing at the same time, which I think caused segfaults.
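
Roughly, the locking looks like the sketch below. This is not my real code
and it does not call libwebsockets; guarded_broadcast_and_service() is a
made-up stand-in for "broadcast() then service()", and the counters exist
only to show that the two threads never overlap inside the guarded section.
It assumes POSIX threads.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lws_lock = PTHREAD_MUTEX_INITIALIZER;
static int in_section;      /* threads currently inside the guarded section */
static int max_in_section;  /* highest value in_section ever reached */
static long calls;          /* total guarded calls made by both threads */

/* Hypothetical stand-in for broadcast() followed by service(). */
static void guarded_broadcast_and_service(void)
{
    pthread_mutex_lock(&lws_lock);
    in_section++;
    if (in_section > max_in_section)
        max_in_section = in_section;
    calls++;                /* the library calls would go here */
    in_section--;
    pthread_mutex_unlock(&lws_lock);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        guarded_broadcast_and_service();
    return NULL;
}

/* Runs the same work in a child thread and in the calling thread.
 * Returns the maximum overlap observed, or -1 if the thread failed to start. */
static int run_two_threads(void)
{
    pthread_t child;

    if (pthread_create(&child, NULL, worker, NULL) != 0)
        return -1;
    worker(NULL);
    pthread_join(child, NULL);
    return max_in_section;
}
```

One thing to watch with this approach: a default (non-recursive) mutex must
not be locked again by a callback that is already running inside the
guarded section on the same thread.
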
>
> However, now I am facing a different challenge: I am getting a
> segfault again. When you perform a write in broadcast, what should you
> do if it fails? At the moment I have this chain of events, which leads
> to a segfault:
>
> service()
>     receive()
>         write()
>         broadcast()
>             write() <-- fails (websocket dead?)
>                 checkLegitWSI() <-- returns OK
>                 return
>                 segfault

A backtrace:

ERROR: webSock_genericSendRecieve: Socket Generic: Failed to write to 
socket! (broadcast)
ReturnCode lws_confirm_legit_wsi: 0

Program received signal SIGSEGV, Segmentation fault.
0x00000068 in ?? ()
(gdb) bt
#0  0x00000068 in ?? ()
#1  0xb6e83474 in libwebsocket_rx_sm (wsi=wsi@entry=0xbd148,
c=<optimized out>) at parsers.c:950
#2  0xb6e8372c in libwebsocket_interpret_incoming_packet
(wsi=wsi@entry=0xbd148,
     buf=buf@entry=0xbeffebd0
"\301\240\357!\255\035Ew\375\327\242\fd\325\240p\037O\305o\200l\241\r\347L=p\375W\351\360\373\033\345\224\255\035\301\256\035\202\274w\267\304\364l5k\224b9\224\231u\030\320\067\375\210.\026\366\267\360\316\345\017\021\017\252Wq.</qO\367\227\310\356\374V\026\326\034\034\202\301\210\327W7\016\225\205QFUa7\016\301\204\230\217\310\217\272\334\323\217", 
len=len@entry=114) at parsers.c:1037
#3  0xb6e83974 in libwebsocket_read (context=context@entry=0x47000,
wsi=wsi@entry=0xbd148,
     buf=0xbeffebd0 
"\301\240\357!\255\035Ew\375\327\242\fd\325\240p\037O\305o\200l\241\r\347L=p\375W\351\360\373\033\345\224\255\035\301\256\035\202\274w\267\304\364l5k\224b9\224\231u\030\320\067\375\210.\026\366\267\360\316\345\017\021\017\252Wq.</qO\367\227\310\356\374V\026\326\034\034\202\301\210\327W7\016\225\205QFUa7\016\301\204\230\217\310\217\272\334\323\217", 
len=114) at handshake.c:233
#4  0xb6e81b30 in libwebsocket_service_fd
(context=context@entry=0x47000, pollfd=<optimized out>)
     at libwebsockets.c:884
#5  0xb6e81c14 in libwebsocket_service (context=0x47000,
timeout_ms=timeout_ms@entry=0) at libwebsockets.c:1047
#6  0x0000a984 in main () at R0005.c:109
(gdb)


>
> Does that make sense? I know it's not very clear but I'm trying to 
> break down a fairly complex program with an involved chain of events...
>
> Thanks,
> Jack.
>
>>
>>>
>>>>> However there seems to be a small issue with the socket function as
>>>>> Valgrind kindly points out:
>>>>>
>>>>> ==3475== Thread 5:
>>>>> ==3475== Syscall param socketcall.send(msg) points to uninitialised
>>>>> byte(s)
>>>>> ==3475==    at 0x4FEEF374: send (in /lib/libpthread-2.16.so)
>>>>> ==3475==    by 0x498AA0B: libwebsockets_broadcast_foreign
>>>>> (libwebsockets.c:2186)
>>>>> ==3475==    by 0xEF4B: webSock_broadcastJsonObject
>>>>> (webInterface_webSockets.c:223)
>>>>> ==3475==    by 0xAFDB: XX86_processFPGAData (XX86.c:141)
>>>>> ==3475==    by 0xDA13: XX86_tickCheck (XX86_init.c:118)
>>>>> ==3475==    by 0x4FEE6F5B: start_thread (pthread_create.c:313)
>>>>> ==3475==    by 0x4FE2E0D7: ??? (in /lib/libc-2.16.so)
>>>>> ==3475==  Address 0x6f75d42 is on thread 5's stack
>>>>> ==3475==
>>>>> ==3475== Thread 1:
>>>>> ==3475== Syscall param socketcall.send(msg) points to uninitialised
>>>>> byte(s)
>>>>> ==3475==    at 0x4FEEF374: send (in /lib/libpthread-2.16.so)
>>>>> ==3475==    by 0x498AA0B: libwebsockets_broadcast_foreign
>>>>> (libwebsockets.c:2186)
>>>>> ==3475==    by 0xEF4B: webSock_broadcastJsonObject
>>>>> (webInterface_webSockets.c:223)
>>>>> ==3475==    by 0xD88B: XX86socket_handleReceive (XX86_socket.c:110)
>>>>> ==3475==    by 0xEE3B: webSock_genericSendRecieve
>>>>> (webInterface_webSockets.c:147)
>>>>> ==3475==    by 0x498B3E7: user_callback_handle_rxflow
>>>>> (libwebsockets.c:1347)
>>>>> ==3475==    by 0x498D61B: libwebsocket_rx_sm (parsers.c:968)
>>>>> ==3475==    by 0x498D72B: libwebsocket_interpret_incoming_packet
>>>>> (parsers.c:1037)
>>>>> ==3475==    by 0x498D973: libwebsocket_read (handshake.c:233)
>>>>> ==3475==    by 0x498BB2F: libwebsocket_service_fd 
>>>>> (libwebsockets.c:884)
>>>>> ==3475==    by 0x498BC13: libwebsocket_service (libwebsockets.c:1047)
>>>>> ==3475==    by 0xA947: main (R0005.c:108)
>>>>> ==3475==  Address 0xbd8999ba is on thread 1's stack
>>>
>>> Hm this is wrong, you should not call 
>>> libwebsockets_broadcast_foreign() from your service loop. 
>>> libwebsockets_broadcast() is the one for that.  I guess this is the 
>>> reason for your data loss.
>>>
>>>>> I also get the following (which isn't a new problem, just one I've
>>>>> ignored for a bit):
>>>>>
>>>>> ==3475== Conditional jump or move depends on uninitialised value(s)
>>>>> ==3475==    at 0x498B7F8: libwebsocket_service_fd 
>>>>> (libwebsockets.c:716)
>>>>> ==3475==    by 0x498BC13: libwebsocket_service (libwebsockets.c:1047)
>>>>> ==3475==    by 0xA947: main (R0005.c:108)
>>>>>
>>>>> I think this small snippet is trying to service a now closed context?
>>>>> As for the big snippet I'm not so sure about that...
>>>
>>> 716 is currently this
>>>
>>>         if (context->started_with_parent && 
>>> kill(context->started_with_parent, 0) < 0)
>>>
>>> context gets memset after allocation IIRC.
>>>
>>> -Andy
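
For reference, a small sketch of what that test does when the structure has
been zeroed. struct demo_context is a cut-down stand-in I made up (the real
context has many more fields), and demo_context_create() and
parent_has_gone() are hypothetical names; only the shape of the condition is
taken from the line quoted above.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, cut-down stand-in for the real context structure. */
struct demo_context {
    pid_t started_with_parent;
};

/* Allocate and zero the structure, so every field starts out defined. */
static struct demo_context *demo_context_create(void)
{
    struct demo_context *c = malloc(sizeof *c);

    if (c)
        memset(c, 0, sizeof *c);
    return c;
}

/* Same shape as the quoted condition: true only if a parent pid was
 * recorded and kill() with signal 0 (an existence check, no signal is
 * actually delivered) fails for it. */
static int parent_has_gone(const struct demo_context *c)
{
    return c->started_with_parent && kill(c->started_with_parent, 0) < 0;
}

/* Returns 1 if the check is false both for a freshly zeroed context and
 * for a context whose recorded "parent" is this (still running) process;
 * returns -1 on allocation failure. */
static int demo_parent_check(void)
{
    struct demo_context *c = demo_context_create();
    int zeroed, self;

    if (!c)
        return -1;
    zeroed = parent_has_gone(c);   /* pid is 0: kill() is never called */
    c->started_with_parent = getpid();
    self = parent_has_gone(c);     /* we exist, so kill(pid, 0) succeeds */
    free(c);
    return zeroed == 0 && self == 0;
}
```

If the context really is zeroed after allocation, the condition only reads
defined memory, so the Valgrind report may point at something else on that
path.
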
>>
>>
>
>


-- 

   Jack Mitchell (jack at embed.me.uk)
   Embedded Systems Engineer
   http://www.embed.me.uk
