[Libwebsockets] Interaction of external threads with libwebsockets server

"Andy Green (林安廸)" andy at warmcat.com
Sat Jan 11 05:48:18 CET 2014


On 03/01/14 22:05, the mail apparently from Thomas Spitz included:
> Hello Andy,
>
> It doesn't seem to work.

I tried it and you're right, it didn't work.

I had misunderstood what the signal mask stuff wanted.  I fixed it and 
pushed the fix here:

http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=ed451d5cbf1c46d889e800ff86b9e2e3c9de0d6e

I tested it by setting the test server poll wait to 30s, leaving it idle 
and firing a SIGUSR2 at it: it interrupts the wait and goes back to wait 
again with current "events".
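
For anyone wanting to reproduce that behaviour outside lws, here is a
minimal sketch.  It is not the code from the commit above: wait_for_wake()
and wake_handler() are made-up names, and it assumes Linux/glibc since
ppoll() is a GNU extension.  SIGUSR2 is kept blocked except while inside
ppoll(), so a signal that arrives just before the wait stays pending and
still interrupts it.

```c
#define _GNU_SOURCE	/* ppoll() is a GNU extension */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Do-nothing handler: it exists only so the signal interrupts ppoll(). */
static void wake_handler(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper, not part of lws: wait up to timeout_s seconds,
 * letting SIGUSR2 through only while inside ppoll().
 * If signal_first is nonzero, SIGUSR2 is raised before the wait starts.
 * Returns 1 if SIGUSR2 cut the wait short, 0 on timeout, -1 on error.
 */
static int wait_for_wake(int timeout_s, int signal_first)
{
	struct sigaction sa;
	sigset_t blocked, in_wait;
	struct timespec ts;
	int n, saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = wake_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR2, &sa, NULL) < 0)
		return -1;

	/* Outside the wait SIGUSR2 stays blocked, so it can only go pending. */
	sigemptyset(&blocked);
	sigaddset(&blocked, SIGUSR2);
	if (sigprocmask(SIG_BLOCK, &blocked, &in_wait) < 0)
		return -1;
	/* Mask used during the wait: the previous mask minus SIGUSR2. */
	sigdelset(&in_wait, SIGUSR2);

	if (signal_first)
		raise(SIGUSR2);	/* arrives "too early": stays pending */

	ts.tv_sec = timeout_s;
	ts.tv_nsec = 0;
	/* No fds at all: only the timeout and the signal matter here. */
	n = ppoll(NULL, 0, &ts, &in_wait);
	saved_errno = errno;

	if (n == 0)
		return 0;
	if (n < 0 && saved_errno == EINTR)
		return 1;
	return -1;
}
```

Calling wait_for_wake(30, 1) should come back at once with 1 instead of
sleeping for 30s, while wait_for_wake(0, 0) should report a plain timeout.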

-Andy

> In order to debug, I added the following printf() calls in libwebsockets.c:
> #ifdef LWS_HAS_PPOLL
> /*
> * if we changed something in this pollfd...
> *   ... and we're running in a different thread context
> *     than the service thread...
> *       ... and the service thread is waiting in ppoll()...
> *          then fire a SIGUSR2 at the service thread to force it to
> *             restart the ppoll() with our changed events
> */
> if (events != context->fds[wsi->position_in_fds_table].events) {
>     sampled_ppoll_tid = lws_idling_ppoll_tid;
>     printf("sampled_ppoll_tid: %d\n", sampled_ppoll_tid); /* added for debugging */
>     if (sampled_ppoll_tid) {
>         tid = context->protocols[0].callback(context, NULL,
>              LWS_CALLBACK_GET_THREAD_ID, NULL, NULL, 0);
>         if (tid != sampled_ppoll_tid) {
>             printf("kill(sampled_ppoll_tid, SIGUSR2)\n"); /* added for debugging */
>             kill(sampled_ppoll_tid, SIGUSR2);
>         }
>     }
> }
> #endif
>
> Below is the log I get when opening test.html while test-server.c
> is running (enclosed is my modified test-server.c with an asynchronous
> sending thread). The counter is increased to 1, and then 1 minute elapses
> before it is increased to 2.
>
> webserver PID : 24290
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> START asynchronous sending
> Thread PID : 24290
> asynchronousSending pthread_self():-1678948608
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
>      GET URI = /
>      Host = 192.168.1.6:7681
>      Connection = keep-alive
>      Accept: = text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
>      Accept-Encoding: = gzip,deflate,sdch
>      Accept-Language: = fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
>      Cache-Control: = max-age=0
>      Cookie: = test=LWS_1388753472_194286_COOKIE
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> sampled_ppoll_tid: -1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1678948608
> kill(sampled_ppoll_tid, SIGUSR2)
>      GET URI = /libwebsockets.org-logo.png
>      Host = 192.168.1.6:7681
>      Connection = keep-alive
>      Accept: = image/webp,*/*;q=0.8
>      Accept-Encoding: = gzip,deflate,sdch
>      Accept-Language: = fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
>      Cache-Control: = max-age=0
>      Cookie: = test=LWS_1388753472_194286_COOKIE
>      Referer: = https://192.168.1.6:7681/
>      GET URI = /xxx
>      Host = 192.168.1.6:7681
>      Connection = Upgrade
>      Protocol = dumb-increment-protocol
>      Upgrade = websocket
>      Origin = https://192.168.1.6:7681
>      Key = flv4nyR7f+VEHFSWJXQxBA==
>      Version = 13
>      Extensions = x-webkit-deflate-frame
>      Pragma: = no-cache
>      Cache-Control: = no-cache
>      Cookie: = test=LWS_1388753472_194286_COOKIE
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
>      GET URI = /xxx
>      Host = 192.168.1.6:7681
>      Connection = Upgrade
>      Protocol = lws-mirror-protocol
>      Upgrade = websocket
>      Origin = https://192.168.1.6:7681
>      Key = OXYhUJt7pkvdd72hJzPp5w==
>      Version = 13
> sampled_ppoll_tid: 0
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
>      Extensions = x-webkit-deflate-frame
>      Pragma: = no-cache
>      Cache-Control: = no-cache
>      Cookie: = test=LWS_1388753472_194286_COOKIE
> sampled_ppoll_tid: -1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1678948608
> kill(sampled_ppoll_tid, SIGUSR2)
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> sampled_ppoll_tid: 0
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1657694464
>      GET URI = /favicon.ico
>      Host = 192.168.1.6:7681
>      Connection = keep-alive
>      Accept: = */*
>      Accept-Encoding: = gzip,deflate,sdch
>      Accept-Language: = fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
>      Cookie: = test=LWS_1388753472_194286_COOKIE
> sampled_ppoll_tid: -1657694464
> LWS_CALLBACK_GET_THREAD_ID pthread_self():-1678948608
> kill(sampled_ppoll_tid, SIGUSR2)
>
> I am continuing my investigation.
>
> Thanks for the patch anyway.
>
> BR,
> Thomas
>
>
> 2014/1/3 "Andy Green (林安廸)" <andy at warmcat.com>
>
>     On 01/01/14 22:38, the mail apparently from Andy Green included:
>
>
>
>         Thomas Spitz <thomas.spitz at hestia-france.com> wrote:
>
>             Hello Andy,
>
>             Have you had time to try replacing poll() with ppoll() so
>             that the poll can be interrupted by a signal from an
>             external thread?
>
>
>         Not yet... in Taiwan the big holiday is Chinese New Year in a
>         few weeks.  I'm still interested in doing it, the weekend is the
>         most likely time.
>
>
>     Please have a look at this:
>
>     http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=3b3fa9e2086da6157289141e0b6fe1e5035bad25
>
>     I didn't test it because I don't have any threaded user code, but it
>     should be pretty close, if not workable already.
>
>     Note the comment in the commit log, you have to actively enable this
>     code (I wasn't able to find a way for the compiler to understand if
>     it had ppoll() or not).
>
>     ppoll() is a GNU extension so if this is useful, we'll need to add
>     it as a CMake-time option.
>
>     -Andy
>
>
>
>         -Andy
>
>             At present your lib works very well with intensive data
>             submission from an external thread.
>
>             Happy new year to everyone.
>
>             BR,
>
>             Thomas
>
>
>             On 25 Dec 2013 15:24, Andy Green (林安廸) <andy at warmcat.com> wrote:
>
>                 On 25/12/13 20:14, the mail apparently from Thomas Spitz
>                 included:
>
>                           The choices seem to boil down to this kind of
>                           "add a fake descriptor" thing (although everything,
>                           including the "interrupt the poll" descriptor and
>                           the use of it should be defined inside the
>                           library), or maybe change to use ppoll() and
>                           fire signals at it.
>
>                     ppoll() could be an interesting solution but it only
>                     interrupts the poll.
>
>
>
>                 I think that's all we need to do.
>
>                 Latency is only coming this way on an idle system where
>                 we are sleeping in the poll(), but another thread asked to
>                 change what a pollfd was waiting on.
>
>
>                 As you pointed out originally, under those circumstances
>                 the changed
>                 pollfd rules won't be seen and handled -- if every fd is
>                 idle for the
>                 events it started out with -- until the poll() timeout
>                 expires.
>
>                 Although in other use-cases this isn't that realistic as
>                 a problem, since some deal with tens of thousands of
>                 simultaneous connections and usually someone is breaking
>                 the poll after a short time for service, in other use
>                 cases it is realistic.  You can attack it by reducing
>                 the poll sleep period but then you're looking at maybe
>                 hundreds of wakes a second on what should be an idle
>                 system, needlessly bad for power.
>
>                 If we provided a way for those use-cases to have very
>                 long poll() timeouts and minimal latency it's good I
>                 think, so long as it doesn't burden or make problems
>                 when it's not wanted or needed.
>
>                     I was thinking of interrupting poll() using a named
>                     pipe in which I would have told lws which wsi it
>                     needs to write to. The complete process would have
>                     been the following:
>
>
>                 No it's not a good way... lws already has a good
>                 semantic in poll() for understanding who needed service.
>                 This would be a lot of new stuff doing the same job that
>                 only works in the multithreaded case.
>
>                     1) Before libwebsocket_create_context(), I create the
>                     named pipe.
>                     2) For every client connection, I reserve a shm in
>                     LWS_CALLBACK_ESTABLISHED through which I will share
>                     incoming data with my main thread.
>                     3) My main thread processes the incoming data and
>                     stores the answer in the shm. It then tells lws that
>                     an answer is ready for a given wsi by writing the ID
>                     of the shm into the named pipe.
>                     4) lws poll() is interrupted and it knows immediately
>                     which wsi it needs to write to thanks to the ID of the
>                     shm. If the wsi is closed in the meantime, lws
>                     indicates it to the shm in LWS_CALLBACK_CLOSED.
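
The wake-up part of steps 1) to 4) is essentially the classic "self-pipe"
trick.  A rough sketch of just that part follows, under loud assumptions:
it uses an anonymous pipe() rather than a named pipe, passes a plain
integer id rather than a shm ID, and none of the wake_*() names exist in
lws.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical wake channel: [0] is polled by the service loop,
 * [1] is written by whichever thread has an answer ready.
 */
static int wake_fds[2] = { -1, -1 };

/* Create the pipe; returns 0 on success, -1 on error. */
static int wake_init(void)
{
	return pipe(wake_fds);
}

/* Producer side: tell the service loop which client needs a write. */
static int wake_post(int client_id)
{
	/* a write this small is atomic on a pipe (well under PIPE_BUF) */
	ssize_t n = write(wake_fds[1], &client_id, sizeof(client_id));

	return n == (ssize_t)sizeof(client_id) ? 0 : -1;
}

/*
 * Service side: sleep in poll() for up to timeout_ms.
 * Returns the posted client id, or -1 on timeout or error.
 */
static int wake_wait(int timeout_ms)
{
	struct pollfd pfd;
	int client_id;
	ssize_t n;

	pfd.fd = wake_fds[0];
	pfd.events = POLLIN;
	pfd.revents = 0;

	if (poll(&pfd, 1, timeout_ms) <= 0 || !(pfd.revents & POLLIN))
		return -1;

	n = read(wake_fds[0], &client_id, sizeof(client_id));
	if (n != (ssize_t)sizeof(client_id))
		return -1;

	return client_id;
}
```

In a real server the read end would sit in the same pollfd array as the
sockets, so a long poll() timeout no longer delays the wake-up.  The cost
is a second notification channel alongside the pollfd events.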
>
>                     If I use ppoll(), I could keep almost the same
>                     principle but I would then need to add a SIGUSR1 and
>                     a handler OR loop through my client shm array each
>                     time ppoll() got interrupted with the EINTR flag
>                     set... In the end, I am still wondering whether my
>                     solution is not simpler?
>
>
>                 That solution is basically a threaded rewrite of lws not
>                 using poll(). If you're interested to do that I don't
>                 want to discourage you, but it's something different
>                 from lws then.  Of course lws is liberally licensed so
>                 you're welcome to build on it if you have a compatible
>                 license.
>
>                 However, if you think about larger scale servers, which
>                 do exist using lws, "knowing the exact (single) wsi" that
>                 woke it is not useful when there may be hundreds of fds
>                 needing service each poll().
>
>                           Either way lws_change_pollfd() is central to the
>                           solution.
>
>
>                     With my solution or even the ppoll one, I don't see
>                     when I need to call lws_change_pollfd(), especially as
>                     lws_change_pollfd() needs a pointer to a wsi, which I
>                     cannot give as my interrupt concerns the complete
>                     context and not a particular wsi...? I must be
>                     missing a point.
>
>
>                 lws_change_pollfd() is the point that any code which
>                 wants to change the events on a pollfd ends up at now.
>                 And changing the event on a pollfd is the definition of
>                 the cause of latency (when poll() is idle and with
>                 relatively long timeout).
>
>                 So whether it is doing rx flow control or wait on being
>                 able to send, that function is the place to signal to
>                 break the poll() one way or the other.
>
>
>                           If you pick a signal like SIGUSR1 and install a
>                           do-nothing handler for it, firing SIGUSR1 at the
>                           process from itself in lws_change_pollfd() and
>                           using ppoll() could be a really small and robust
>                           solution.
>                           Since the signal is handled it doesn't do anything
>                           except interrupt the ppoll causing a pollfd reload.
>                           You only need to fire the signal the first time
>                           anything wants to interrupt the wait *from another
>                           thread* (because if the lws thread is in poll(),
>                           it isn't doing anything else).  If a pollfd raced
>                           it and changed first, there's no problem with an
>                           additional signal interrupting the next ppoll loop.
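
A rough sketch of the do-nothing-handler idea quoted just above, with a
second thread standing in for the code that changes a pollfd.  This is an
illustration only, not lws code: service_wait(), other_thread() and
noop_handler() are invented names, it assumes Linux/glibc for ppoll(), and
it aims the signal at the waiting thread with pthread_kill() rather than
at the whole process.

```c
#define _GNU_SOURCE	/* ppoll() is a GNU extension */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Handled but does nothing: the only effect is EINTR from ppoll(). */
static void noop_handler(int sig)
{
	(void)sig;
}

/* Hypothetical stand-in for "another thread changed a pollfd". */
static void *other_thread(void *arg)
{
	pthread_t service = *(pthread_t *)arg;
	struct timespec delay = { 0, 100 * 1000 * 1000 };	/* 100 ms */

	nanosleep(&delay, NULL);
	/* ...the pollfd events would be changed here... */
	pthread_kill(service, SIGUSR1);	/* wake the service thread only */

	return NULL;
}

/*
 * Plays the service thread: waits up to timeout_s seconds in ppoll().
 * Returns 1 if SIGUSR1 ended the wait early, 0 on timeout, -1 on error.
 */
static int service_wait(int timeout_s)
{
	struct sigaction sa;
	sigset_t blocked, in_wait;
	struct timespec ts = { timeout_s, 0 };
	pthread_t self = pthread_self(), th;
	int n, saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = noop_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) < 0)
		return -1;

	/* Block SIGUSR1 outside the wait so an early signal stays pending. */
	sigemptyset(&blocked);
	sigaddset(&blocked, SIGUSR1);
	if (pthread_sigmask(SIG_BLOCK, &blocked, &in_wait) != 0)
		return -1;
	sigdelset(&in_wait, SIGUSR1);	/* unblocked only inside ppoll() */

	if (pthread_create(&th, NULL, other_thread, &self) != 0)
		return -1;

	n = ppoll(NULL, 0, &ts, &in_wait);
	saved_errno = errno;
	pthread_join(th, NULL);

	if (n == 0)
		return 0;
	return (n < 0 && saved_errno == EINTR) ? 1 : -1;
}
```

Built with -pthread, service_wait(30) should return 1 after roughly 100 ms
rather than after the 30s timeout.  A real service loop would then rebuild
its pollfd events and call ppoll() again.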
>
>                     If it is not too much to ask, a simple code example
>                     would be ideal.
>
>
>                 I may have some time tomorrow to give this a try.
>
>                 -Andy
>
>
>



