[Libwebsockets] We met an issue when using libwebsocket and fixed it, how can I submit the patch
lifeng.zhang at melot.cn
Wed Jun 10 04:45:46 CEST 2020
Thank you for your reply.
I will try it.
------------------ Original ------------------
From: "Andy Green"<andy at warmcat.com>;
Date: Thu, Jun 4, 2020 02:38 PM
To: "张立峰"<lifeng.zhang at melot.cn>; "libwebsockets"<libwebsockets at ml.libwebsockets.org>;
Subject: Re: [Libwebsockets] We met an issue when using libwebsocket and fixed it, how can I submit the patch
On 6/4/20 7:16 AM, 张立峰 wrote:
> Hi, Andy:
> Recently, we met a crash on windows.
> Some thing like this:
> The program is a windows client using websocket protocol.
> 1. We create a thread as the service thread, calling lws_service. We
> intended to use this thread to service may ws client connections. --
> Mark as thread A
> 2. And we call lws_client_connect_via_info from another thread to create
> the wsi struct. -- Mark as thread B.
You can't do that.
> 3. After some normal case testing, every thing is fine. Then we disabled
> the local network (from windows`s device manager) to test network failed
> 4. It will crash
Yes, lws is not threadsafe, if you ignore that you will crash in many
exciting and varied ways.
> 5. After some code debug effort, we know that the wsi created in
> lws_client_connect_via_info, freed twice. One is before
> lws_client_connect_via_info return and in the thread B, anther is in
> the lws_service and in the thread A (founding a closed socket, I think,
> in our case is in windows-service.c(_lws_plat_service_tsi)?)
> I tried to fix it like this:
> 1. In the /libwebsockets/lib/roles/http/client/client-handshake.c
> 2. move following code a little down, before a successfull wsi return:
> if (__insert_wsi_socket_into_fds(wsi->context, wsi))
> goto try_next_result_closesock;
> I am not very confident for this code change. Please give some help, if
> this is right fix?
No... the right fix is to understand lws deliberately does not even try
to be threadsafe. Its model is that everything that uses its apis -
with one exception - is on the same thread inside the event loop. Your
current architecture goes against that and you won't get a "fix" trying
to use it that way.
You may only call lws apis from the "lws event loop thread", unless you
are calling the one threadsafe api, lws_cancel_service(), any use of lws
apis from another thread is strictly forbidden.
For use with other threads, you need to take the approach to have the
other threads ask the lws thread to do work that is related to lws. For
example, define a mutex and struct in a shared memory area that all the
threads can access, lock it and add your request in there. Then unlock
the shared area and call lws_cancel_service()
to signal to the lws thread there is something to do. Your protocol
will immediately receive a LWS_CALLBACK_EVENT_WAIT_CANCELLED callback
**on the lws thread context**
in there it can lock the shared struct and find out what it is being
asked to do, and do it from the lws thread, unlock and go on.
You can, eg, put your prepared _connection_info struct in the shared
area, prepared by the other thread. So although this is a deep change,
actually it's probably not going to cause a big code rewrite. But this
change isn't optional.
> Best wishes
> Terry Zhang
> ------------------ Original ------------------
> *From: * "Andy Green"<andy at warmcat.com>;
> *Date: * Wed, Mar 4, 2020 09:03 PM
> *To: * "张立峰"<lifeng.zhang at melot.cn>;
> "libwebsockets"<libwebsockets at ml.libwebsockets.org>;
> *Subject: * Re: [Libwebsockets] We met an issue when using libwebsocket
> and fixed it, how can I submit the patch
> On 3/4/20 10:28 AM, 张立峰 wrote:
> > Hi, everyone:
> > When using libwebsockets as a client to communicate a ws server, we
> > write out many messages in a queue when receive
> > a LWS_CALLBACK_CLIENT_WRITEABLE. This may cause the ws server to close
> > our connection.
> > We found that when lws_write return value is smaller than requested, the
> > connection will be close when we use lws_write next time.
> > After check the source code of libwebsockets, we found the issue cause:
> > The "wsi->ws->inside_frame" is not set to "0" after wsi->buflist_out has
> > been sent. This will cause the wrong coded ws frame been sent.
> Thanks, I modified it a little bit for the cases ws is not enabled for
> build or it's http and not ws connection, and pushed it on master and
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Libwebsockets