[Libwebsockets] client protocol selection criteria discussion

Andy Green andy at warmcat.com
Sun Feb 23 08:55:02 CET 2020



On February 22, 2020 7:48:11 PM GMT, Olivier Langlois <olivier at olivierlanglois.net> wrote:
>On Sun, 2020-02-16 at 06:02 +0000, Andy Green wrote:
>> 
>> On 2/16/20 3:32 AM, Olivier Langlois wrote:
>> > On Mon, 2020-02-10 at 00:44 +0000, Andy Green wrote:
>> > > On February 9, 2020 11:05:23 PM GMT, Olivier Langlois
>> > > <olivier at olivierlanglois.net> wrote:
>> > > > I feel like I am currently in uncharted waters in my lws usage
>> > > > and I am discovering the limits of what is possible as I figure
>> > > > out how to accomplish my design goal.
>> > > 
>> > > Lws is predicated around a single-threaded event loop.  Its
>> > > only way to interoperate with other threads is
>> > > lws_cancel_service().
>> > > 
>> > > That's a very simple proposition that goes a long way.  Lws
>> > > doesn't claim anything to mislead you into thinking it's
>> > > threadsafe and that you can just call its apis from different
>> > > threads - you can't.
>> > 
>> > Fair enough. You can have one thread per loop, and this is how I
>> > intended to use two threads with lws.
>> 
>> I really suggest you don't... it's FOSS, you can do what you like,
>> but from my perspective lws is single-threaded.
>> 
>> This kind of hack doesn't scale:
>> 
>>   - the internal abstraction is a pt / per-thread struct, which
>> contains the event loop, fd maps for the whole process' worth of
>> fds, and other things... instead of "per load-balancing server on
>> the same machine", that becomes "per client connection".
>> 
>>   - each connection is in its own event loop and thread and can't 
>> interoperate with the other ones without locking
>> 
>>   - if you use a muxed protocol like h2, lws does not support
>> streams coming from different threads: that means async thread
>> management depending on shared connection or stream state, locking,
>> races, etc.
>> 
>> If you find missing locking that's generally useful, then patches
>> are welcome... otherwise this mode isn't supported.
>> 
>Hi,
>
>I just wanted to report that lws is behaving well with one thread per
>client connection, in a two-client-connection setup, despite this
>usage being unsupported.

Users trying to use lws from multiple thread contexts have a lot of history... where it has ended up is that I think I can promise that a single lws thread, with lws_cancel_service() as the mechanism for other threads to signal that they want something, will work reliably in all cases and especially with all event loops.  It also works for modern usage like h2, where your client connection may be joining, and being muxed on, an already-existing connection, and all participants must be on the same thread.
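
Roughly, the supported shape looks like this (a minimal sketch; apart
from lws_cancel_service() and LWS_CALLBACK_EVENT_WAIT_CANCELLED, the
names here are placeholders, not lws apis):

  #include <libwebsockets.h>

  /* any non-lws thread: the one lws api that is safe to call from a
   * foreign thread context */
  static void
  signal_lws_thread(struct lws_context *cx)
  {
      lws_cancel_service(cx);
  }

  /* lws event-loop thread: every protocol's callback is then given
   * LWS_CALLBACK_EVENT_WAIT_CANCELLED, and can inspect whatever
   * shared state the other thread prepared */
  static int
  my_callback(struct lws *wsi, enum lws_callback_reasons reason,
              void *user, void *in, size_t len)
  {
      if (reason == LWS_CALLBACK_EVENT_WAIT_CANCELLED) {
          /* ... act on the shared state, from the lws thread ... */
      }

      return 0;
  }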

This is what I mean when I say I can't support any of the other models that have been suggested thus far; they all bifurcate or break something when coupled with other lws feature sets.

>I haven't encountered any synchronization issues in this mode so far.

For some subset of stuff, each of the other ways may work.  For example, a couple of weeks ago there were some guys on github for whom a setup similar to yours worked; they wanted to add c++ atomics everywhere in lws and treat it as threadsafe, because it hadn't blown up so far.  They don't use h2 (yet) so they're ok if it breaks that.  They just want to scratch their itch and feel I'm blocking them.

>The local protocol and the thread-connection binding specified when
>calling lws_client_connect_via_info() are respected by lws.
>
>Using lws this way has only required a small code change in lws,
>which I will share shortly with the list as a patch.
>
>The only odd behavior that I find suspicious, and that I will
>investigate further, is that besides the LWS_CALLBACK_GET_THREAD_ID
>callbacks from lws_client_connect_via_info(),
>LWS_CALLBACK_GET_THREAD_ID is called 8 more times, and only for my
>thread #2:
>
>$ grep LWS_CALLBACK_GET_THREAD_ID pub.out
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 0
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:40] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>[2020-02-22  13:45:42] INFO WSBASE/callback LWS_CALLBACK_GET_THREAD_ID: 1
>
>In my code tid == tsi, for the sake of simplicity. A quick search in
>the lws codebase leads me to believe that those callbacks originate
>from core-net/pollfd.c.
>
>But the situation is raising my eyebrows... I'll look more into it
>and report back if I find some explanation for it...

The code doing that is a good example of why I stopped trying to make people happy by supporting these threaded use cases ad hoc, without a solid plan (ie, the solid plan being that lws_cancel_service() is definitely threadsafe under all conditions and for all event libs, by virtue of pipe()).

This code comes from an earlier attempt to make lws_callback_on_writable() the api threads could use to interact with the lws event loop.  The idea was that if you have multiple threads, you should handle GET_THREAD_ID in protocols[0] and return your platform thread id (tid != tsi... tid is like the pthreads tid or whatever).  Then lws_callback_on_writable() can refer to this to find out if it's being called from a different thread than the lws event loop.
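
For illustration, the handler in protocols[0] was shaped something
like this (a sketch of that legacy mechanism; the callback name and
the pthread cast are placeholders, only the reason code is lws):

  #include <libwebsockets.h>
  #include <pthread.h>
  #include <stdint.h>

  static int
  callback_proto0(struct lws *wsi, enum lws_callback_reasons reason,
                  void *user, void *in, size_t len)
  {
      switch (reason) {
      case LWS_CALLBACK_GET_THREAD_ID:
          /* report the platform thread id, so lws could compare it
           * to the thread calling in; assumes pthread_t is an
           * integer type on this platform */
          return (int)(intptr_t)pthread_self();
      default:
          break;
      }

      return 0;
  }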

It breaks down because lws_callback_on_writable() is not itself safe against concurrent entry.  The action is to change the poll wait, and depending on the event lib that may not be threadsafe; depending on the platform (bsd), changes by other threads to pollfd .events while in the poll wait are lost (which spawned more code with ghetto volatile spinlocks that stashes the changes and applies them after exit from the poll wait); and with muxed protocols, asking for a writable callback is not necessarily a pollfd wait modification at all, but manipulation of logical mux child pending-write state that isn't threadsafe.

All of this stuff was a dead end and will eventually go away... it's not used in the example apps or recommended in the docs.  None of it is needed when lws_cancel_service() is the only point of contact between the lws event context and other threads... all lws_callback_on_writable() calls then occur from the lws event context.  H2 and other muxed protocols are perfectly happy with that.
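
Concretely, the recommended pattern is shaped something like this (a
sketch only; the message slot, mutex and names are placeholders,
while lws_cancel_service(), lws_strncpy(), lws_callback_on_writable()
and the callback reasons are real lws apis):

  #include <libwebsockets.h>
  #include <pthread.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static char pending[128];
  static int have_pending;
  static struct lws *active_wsi;      /* set by the lws thread, eg,
                                       * at ESTABLISHED */
  static struct lws_context *context; /* created at init time */

  /* any thread: stash the work, then poke the lws event loop */
  static void
  queue_message(const char *msg)
  {
      pthread_mutex_lock(&lock);
      lws_strncpy(pending, msg, sizeof(pending));
      have_pending = 1;
      pthread_mutex_unlock(&lock);

      lws_cancel_service(context);
  }

  /* lws thread: the protocol callback */
  static int
  callback_proto0(struct lws *wsi, enum lws_callback_reasons reason,
                  void *user, void *in, size_t len)
  {
      switch (reason) {
      case LWS_CALLBACK_EVENT_WAIT_CANCELLED:
          /* we are on the lws event loop thread now, so asking for
           * a writable callback here is safe, including on muxed
           * protocols like h2 */
          pthread_mutex_lock(&lock);
          if (have_pending && active_wsi)
              lws_callback_on_writable(active_wsi);
          pthread_mutex_unlock(&lock);
          break;

      case LWS_CALLBACK_CLIENT_WRITEABLE:
          /* drain `pending` under the lock and lws_write() it,
           * from the lws thread */
          break;

      default:
          break;
      }

      return 0;
  }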

C++ people strongly feel they ought to wrap lws in class semantics that don't fit it; threads people feel lws ought to follow typical threadsafe / locking semantics that don't fit it... it is what it is, and the way to leverage it is to follow the hard-won advice about the recommended way.

-Andy
