[Libwebsockets] client protocol selection criteria discussion
Olivier Langlois
olivier at olivierlanglois.net
Sun Feb 23 06:33:40 CET 2020
On Sat, 2020-02-22 at 14:48 -0500, Olivier Langlois wrote:
> On Sun, 2020-02-16 at 06:02 +0000, Andy Green wrote:
> > On 2/16/20 3:32 AM, Olivier Langlois wrote:
> > > On Mon, 2020-02-10 at 00:44 +0000, Andy Green wrote:
> > > > On February 9, 2020 11:05:23 PM GMT, Olivier Langlois
> > > > <olivier at olivierlanglois.net> wrote:
> > > > > I feel like I am currently in uncharted waters in my lws usage,
> > > > > and I am discovering the limits of what is possible as I
> > > > > figure out how to accomplish my design goal.
> > > >
> > > > Lws is predicated around a single-threaded event loop. Its only
> > > > way to interoperate with other threads is lws_cancel_service().
> > > >
> > > > That's a very simple proposition that goes a long way. Lws
> > > > doesn't claim anything to mislead you into thinking it's
> > > > threadsafe and you can just call its apis from different
> > > > threads - you can't.
> > >
> > > Fair enough. You can have 1 thread per loop, and this is how I
> > > intended to use 2 threads with lws.
> >
> > I really suggest you don't... it's FOSS, you can do what you like,
> > but from my perspective lws is singlethreaded.
> >
> > This kind of hack doesn't scale:
> >
> > - the internal abstraction is a pt / per-thread struct which
> > contains the event loop, fd maps for the whole process-worth of
> > fds, and other things... instead of "per load-balancing server on
> > the same machine", that becomes "per client connection".
> >
> > - each connection is in its own event loop and thread and can't
> > interoperate with the other ones without locking
> >
> > - if you use a muxed protocol like h2, lws does not support
> > streams coming from different threads, async thread management
> > depending on shared connection or stream state, locking, races etc.
> >
> > If you find missing locking that's generally useful then patches
> > are welcome... otherwise this mode isn't supported.
> >
>
> Hi,
>
> I just wanted to report that lws is behaving well in a setup with 2
> client connections and 1 thread per client connection, despite this
> usage not being supported.
>
> I haven't encountered any synchronization issues in this mode so far.
>
> The local protocol and the thread-connection binding specified when
> calling lws_client_connect_via_info() are respected by lws.
>
> Using lws this way only required a small code change in lws that I
> will share shortly with the list as a patch.
>
> The only odd behavior that I find suspicious, and that I will
> investigate further, is that besides the LWS_CALLBACK_GET_THREAD_ID
> callbacks from lws_client_connect_via_info(),
> LWS_CALLBACK_GET_THREAD_ID is called 8 more times, and only for my
> thread #2:
>
> $ grep LWS_CALLBACK_GET_THREAD_ID pub.out
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 0
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
> [2020-02-22 13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>
> In my code, tid == tsi for the sake of simplicity.
> A quick search in the lws codebase leads me to believe that those
> callbacks originate from core-net/pollfd.c,
>
> but the situation is raising my eyebrows... I'll look more into it
> and report back if I find some explanation about that...
>
Ok, I have understood what is going on.
This is caused by code inside lws_change_pollfd. More precisely:
	/*
	 * if we changed something in this pollfd...
	 *   ... and we're running in a different thread context
	 *     than the service thread...
	 *       ... and the service thread is waiting ...
	 *         then cancel it to force a restart with our changed events
	 */
	pa_events = pa->prev_events != pa->events;
	if (pa_events) {
		if (lws_plat_change_pollfd(context, wsi, pfd)) {
			lwsl_info("%s failed\n", __func__);
			ret = -1;
			goto bail;
		}
		sampled_tid = pt->service_tid;
		if (sampled_tid && wsi->vhost) {
			tid = wsi->vhost->protocols[0].callback(wsi,
				     LWS_CALLBACK_GET_THREAD_ID,
				     NULL, NULL, 0);
			if (tid == -1) {
				ret = -1;
				goto bail;
			}
			if (tid != sampled_tid)
				lws_cancel_service_pt(wsi);
		}
	}
So the callback isn't called for tid 0, but it is for tid 1.
This brings the following questions:
1. Is tid 0 a valid value?
2. AFAIK, I'm never changing pollfd from a different thread. Is this
something possible to do at all? How? tid 0 doesn't suffer from not
calling the callback at all...
3. There seem to be some spurious, possibly unnecessary pollfd changes
going on:
a) it is changed twice between connecting and sending the initial HTTP
request.
b) it is changed twice before receiving the HTTP reply.
c) sending out 1 ws message and starting to receive incoming ws
messages triggers 4 pollfd changes...