[Libwebsockets] LWS assertion probably due to start/stop of a pppd daemon

Thomas Spitz thomas.spitz at hestia-france.com
Tue Sep 22 18:10:56 CEST 2020


Hello everyone,

I've been using LWS in my main program for years without stability issues.

My main program is launching a LWS server through a thread in the same
manner as minimal-ws-server.c (creating lws context, creating vhost with my
ws-protocol and looping on lws_service) :

> pthread_attr_init(&attr);

pthread_attr_setdetachstate(&attr,
> PTHREAD_CREATE_DETACHED);pthread_create(&ThreadServeurWEBSOCKETS, &attr,
> serveurwebsockets, NULL)

This thread is managing a wsi list, receiving and sending WS messages

Now, my program also launches a pppd daemon through a thread using fork +
exec. This pppd daemon is used to set a mobile backup internet connection
but this internet connection is never used by WS clients at the present.

Testing with only one WS client, switching ON/OFF the mobile backup
internet connections many times, this WS client is generaly able to stay
connected or connects / disconnects without issue. However, sometimes (1
time among 40 in average), the WS server creates an assertion.

Here a log example with LWS: 4.1.0-22d043a, loglevel 31.:

> 2020-09-22 08:33:56,3,6,pppd[1183]:, Terminating on signal 15
> 2020-09-22 08:33:56,3,6,pppd[1183]:, Connect time 1.9 minutes.
> 2020-09-22 08:33:56,3,6,pppd[1183]:, Sent 11531 bytes, received 19079
> bytes.
> 2020-09-22 08:33:56,1,7,V4:, lws_ws_rx_sm: ws fragment length exhausted
> 2020-09-22 08:33:56,1,7,V4:, lws_ws_rx_sm: post ext ret 1, ebuf in 0 / out
> 0
> 2020-09-22 08:33:56,1,7,V4:, lws_parse_ws: exit with 0 unused
> 2020-09-22 08:33:56,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 5 -> 1
> 2020-09-22 08:33:56,1,7,V4:, lws_handle_POLLOUT_event: 0x1db61a0: non mux:
> wsistate 0x20000119, ops ws
> 2020-09-22 08:33:56,1,6,V4:, lws_issue_raw: ssl_capable_write (2060) says
> 2060
> 2020-09-22 08:33:56,1,7,V4:, _realloc: size 16: ftp
> 2020-09-22 08:33:56,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 1 -> 5
> 2020-09-22 08:33:56,3,7,pppd[1183]:, Script /etc/ppp/ip-down started (pid
> 1195)
> 2020-09-22 08:33:56,3,7,pppd[1183]:, sent [LCP TermReq id=0x3 "User
> request"]
> 2020-09-22 08:33:56,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 5 -> 1
> 2020-09-22 08:33:56,1,7,V4:, lws_handle_POLLOUT_event: 0x1db61a0: non mux:
> wsistate 0x20000119, ops ws
> 2020-09-22 08:33:56,1,6,V4:, lws_issue_raw: ssl_capable_write (126) says
> 126
> 2020-09-22 08:33:56,3,7,pppd[1183]:, rcvd [LCP TermAck id=0x3]
> 2020-09-22 08:33:56,3,5,pppd[1183]:, Connection terminated.
> 2020-09-22 08:33:56,1,7,V4:, _realloc: size 16: ftp
> 2020-09-22 08:33:56,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 1 -> 5
> 2020-09-22 08:33:56,3,7,pppd[1183]:, Script /etc/ppp/ip-down finished (pid
> 1195), status = 0x0
> 2020-09-22 08:33:56,3,6,pppd[1183]:, Exit.
> 2020-09-22 08:34:06,1,7,V4:,
> ../trunk/main/processCycliques.c(22158)L108FL32: kick watchdog
> 2020-09-22 08:34:06,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 5 -> 1
> 2020-09-22 08:34:06,1,7,V4:, lws_handle_POLLOUT_event: 0x1db61a0: non mux:
> wsistate 0x20000119, ops ws
> 2020-09-22 08:34:06,1,6,V4:, lws_issue_raw: ssl_capable_write (126) says
> 126
> 2020-09-22 08:34:06,1,7,V4:, _lws_change_pollfd: wsi 0x1db61a0: fd 3
> events 1 -> 5
> 2020-09-22 08:34:21,1,7,V4:,
> ../trunk/main/processCycliques.c(22158)L108FL32: kick watchdog
> 2020-09-22 08:34:21,1,7,V4:,
> ../trunk/main/processCycliques.c(22158)processCycliques()L123: timeout
> RxFrom CWS0
> 2020-09-22 08:34:36,1,7,V4:,
> ../trunk/main/processCycliques.c(22158)L108FL32: kick watchdog
> 2020-09-22 08:34:51,1,7,V4:,
> ../trunk/main/processCycliques.c(22158)L108FL32: kick watchdog
> 2020-09-22 08:34:51,1,7,V4:, accepted new conn port 38090 on fd=3
> 2020-09-22 08:34:51,1,7,V4:, lws_get_idlest_tsi: 4 1023
> 2020-09-22 08:34:51,1,7,V4:, _realloc: size 552: new server wsi
> 2020-09-22 08:34:51,1,7,V4:, new wsi 0x1db4280 joining vhost default, tsi 0
> 2020-09-22 08:34:51,1,7,V4:,
> ../trunk/serveurModem/outils/cmdAT.c(20956)read_from_modem()L88:
> RECV[0]:*PSNTRG: 5,1,"208","10","6594","3F66","SFR",1,,0
> 2020-09-22 08:34:51,1,7,V4:,
> ../trunk/serveurModem/outils/cmdAT.c(20956)read_from_modem()L88: RECV[0]:OK
> 2020-09-22 08:34:51,1,7,V4:, lws_vhost_bind_wsi: vh default: wsi
> none/none, count_bound_wsi 3
> 2020-09-22 08:34:51,1,7,V4:, lwsi_set_state(0x1db4280, 0x20000200)
> 2020-09-22 08:34:51,1,7,V4:, lws_role_transition: 0x1db4280: wsistate
> 0x20000208, ops h1
> 2020-09-22 08:34:51,1,7,V4:, __lws_set_timeout: 0x1db4280: 5 secs, reason 3
> 2020-09-22 08:34:51,1,7,V4:, lwsi_set_state(0x1db4280, 0x20000209)
> 2020-09-22 08:34:51,1,7,V4:, __insert_wsi_socket_into_fds: 0x1db4280:
> tsi=0, sock=3, pos-in-fds=4
> 2020-09-22 08:34:51,0,0,oot/Varuna4,
> 2020-09-22 08:34:51,0,0,,
> 2020-09-22
> 08:34:51,0,0,ome/thomas/workspace/Varuna4/0_librairies_sources/libwebsockets/libwebsockets-4.1-stable-20200922/lib/plat/unix/unix-fds.c,
> 2020-09-22 08:34:51,0,0,,
> 2020-09-22 08:34:51,4,1,,
> 2020-09-22 08:34:51,0,0,,
> 2020-09-22 08:34:51,0,0,sert_wsi,
> 2020-09-22 08:34:51,0,0,Assertion, `
> 2020-09-22 08:34:51,0,0,ntext->lws_lookup[wsi->desc.sockfd, - (0)] == 0
> 2020-09-22 08:34:51,0,0,failed.,
> 2020-09-22 08:34:51,0,2,kernel:,watchdog: watchdog0: watchdog did not stop!


I think the cause of the above assertion is not due to lws itself but I
don't know how to proceed in order to find the issue.

> assert(context->lws_lookup[wsi->desc.sockfd -
>                           lws_plat_socket_offset()] == 0);


Do you think this could be due to a fd that would be copied during the pppd
fork+exec?

Thanks in advance for your help
Best regards,
Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://libwebsockets.org/pipermail/libwebsockets/attachments/20200922/67d93d4f/attachment.htm>


More information about the Libwebsockets mailing list