[Libwebsockets] client protocol selection criteria discussion

Olivier Langlois olivier at olivierlanglois.net
Sun Feb 23 06:33:40 CET 2020


On Sat, 2020-02-22 at 14:48 -0500, Olivier Langlois wrote:
> On Sun, 2020-02-16 at 06:02 +0000, Andy Green wrote:
> > On 2/16/20 3:32 AM, Olivier Langlois wrote:
> > > On Mon, 2020-02-10 at 00:44 +0000, Andy Green wrote:
> > > > On February 9, 2020 11:05:23 PM GMT, Olivier Langlois
> > > > <olivier at olivierlanglois.net> wrote:
> > > > > I feel like I am currently in uncharted waters in my lws
> > > > > usage, and I am discovering the limits of what is possible
> > > > > as I figure out how to accomplish my design goal.
> > > > 
> > > > Lws is predicated around a single-threaded event loop.  Its
> > > > only way to interoperate with other threads is
> > > > lws_cancel_service().
> > > > 
> > > > That's a very simple proposition that goes a long way.  Lws
> > > > doesn't claim anything to mislead you into thinking it's
> > > > threadsafe and you can just call its apis from different
> > > > threads - you can't.
> > > 
> > > Fair enough. You can have one thread per loop, and that is how I
> > > intended to use two threads with lws.
> > 
> > I really suggest you don't... it's FOSS, you can do what you like,
> > but from my perspective lws is singlethreaded.
> > 
> > This kind of hack doesn't scale:
> > 
> >   - the internal abstraction is a pt / per-thread struct which
> > contains the event loop, fd maps for the whole process-worth of
> > fds and other things... instead of "per load-balancing server on
> > same machine" that becomes "per client connection".
> > 
> >   - each connection is in its own event loop and thread and can't
> > interoperate with the other ones without locking
> > 
> >   - if you use a muxed protocol like h2, in lws it does not
> > support streams coming from different threads, async thread
> > management depending on shared connection or stream state,
> > locking, races etc
> > 
> > If you find missing locking that's generally useful then patches
> > are welcome... otherwise this mode isn't supported.
> > 
> 
> Hi,
> 
> I just wanted to report that lws is behaving well with one thread
> per client connection, in a two-client-connection setup, despite
> this usage not being supported.
> 
> I haven't encountered any synchronization issues in this mode so far.
> 
> The local protocol and the thread-connection binding specified when
> calling lws_client_connect_via_info() are respected by lws.
> 
> Using lws this way has only required a small code change in lws,
> which I will share shortly with the list as a patch.
> 
> The only odd behavior that I find suspicious, and will investigate
> further, is that beside the LWS_CALLBACK_GET_THREAD_ID callbacks
> issued from lws_client_connect_via_info(),
> LWS_CALLBACK_GET_THREAD_ID is called 8 more times, and only for my
> thread #2:
> 
> $ grep LWS_CALLBACK_GET_THREAD_ID pub.out
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 0
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> 
> In my code tid == tsi, for the sake of simplicity.
> A quick search in the lws codebase leads me to believe that those
> callbacks originate from core-net/pollfd.c,
> 
> but the situation is raising my eyebrows... I'll look more into it
> and report back if I find some explanation about that...
> 
Ok, I have understood what is going on.

This is caused by code inside lws_change_pollfd. More precisely:

        /*
         * if we changed something in this pollfd...
         *   ... and we're running in a different thread context
         *     than the service thread...
         *       ... and the service thread is waiting ...
         *         then cancel it to force a restart with our changed
         *         events
         */
        pa_events = pa->prev_events != pa->events;

        if (pa_events) {
                if (lws_plat_change_pollfd(context, wsi, pfd)) {
                        lwsl_info("%s failed\n", __func__);
                        ret = -1;
                        goto bail;
                }
                sampled_tid = pt->service_tid;
                if (sampled_tid && wsi->vhost) {
                        tid = wsi->vhost->protocols[0].callback(wsi,
                                     LWS_CALLBACK_GET_THREAD_ID,
                                     NULL, NULL, 0);
                        if (tid == -1) {
                                ret = -1;
                                goto bail;
                        }
                        if (tid != sampled_tid)
                                lws_cancel_service_pt(wsi);
                }
        }

So the callback isn't called for tid 0 but it is for tid 1.

This raises the following questions:
1. Is tid 0 a valid value?
2. AFAIK, I never change a pollfd from a different thread. Is that
possible to do at all? How? tid 0 doesn't seem to suffer from the
callback never being called...
3. There seem to be some spurious, possibly unnecessary pollfd
changes going on:
 a) It is changed twice between connecting and sending the initial
HTTP request.
 b) It is changed twice before receiving the HTTP reply.
 c) Sending out 1 ws message and starting to receive incoming ws
messages triggers 4 pollfd changes...



