[Libwebsockets] sporadic connection close on Rx
Andy Green
andy at warmcat.com
Tue Nov 9 06:02:39 CET 2021
On 11/8/21 16:19, Roman Nikiforov wrote:
> On 11/7/21 11:13 AM, Andy Green wrote:
>> Why does it want to limit that to 2048? Is the limit coming from your
>> lws_protocols struct? If so, upping it to 4096 might help.
>
> No, I did a global text search and found 2048 only in lws sources (for
> example in lws-adopt.h "readbuf is limited to the size of the ah rx buf,
> currently 2048 bytes.")
>
> I changed the data the server sends to a repeating sequence of bytes
> 30 to 3F and found bytes the server didn't send:
>
> [2021/11/08 15:42:33:5339] N: 01F0: 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33 34 35 36 37   89:;<=>?01234567
> [2021/11/08 15:42:33:5339] N: 0200: 38 39 3A 3B 3C 3D 3E 3F 81 7E 28 00 30 31 32 33   89:;<=>?.~(.0123
> [2021/11/08 15:42:33:5339] N: 0210: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33   456789:;<=>?0123
> [2021/11/08 15:42:33:5339] N: 0220: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33   456789:;<=>?0123
> I have no idea who put "81 7E 28 00" between 3F and 30. I made several
> tests and in case Rx fails with "illegal opcode" it was always "81 7E 28
> 00".
This is the ws frame header: FIN set, TEXT opcode, payload length 10240
bytes (0x2800). It's normal for it to sit in front of your payload (not
inside it...). It looks like you lost some data somewhere.
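
For reference, the four bytes can be decoded by hand following the base framing in RFC 6455 section 5.2. The sketch below is only an illustration: `struct ws_hdr` and `ws_parse_hdr()` are hypothetical names, not part of lws, and it does not validate reserved bits or opcodes.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a websocket frame header (RFC 6455, section 5.2). */
struct ws_hdr {
	int fin;         /* 1 if this is the final fragment of a message */
	int opcode;      /* 0x0 CONTINUATION, 0x1 TEXT, 0x2 BINARY, ... */
	int masked;      /* 1 if a 4-byte masking key follows the length */
	uint64_t len;    /* payload length in bytes */
	size_t hdr_len;  /* bytes used by the header, incl. any mask key */
};

/*
 * Hypothetical helper, not an lws API: parse one frame header from buf.
 * Returns 0 on success, -1 if buf is too short to hold the whole header.
 */
static int
ws_parse_hdr(const uint8_t *buf, size_t avail, struct ws_hdr *h)
{
	size_t n = 2, ext = 0, i;

	if (avail < 2)
		return -1;

	h->fin = !!(buf[0] & 0x80);
	h->opcode = buf[0] & 0x0f;
	h->masked = !!(buf[1] & 0x80);
	h->len = buf[1] & 0x7f;

	/* 126 and 127 mean the real length follows, big-endian */
	if (h->len == 126)
		ext = 2;
	else if (h->len == 127)
		ext = 8;

	if (avail < n + ext)
		return -1;

	if (ext) {
		h->len = 0;
		for (i = 0; i < ext; i++)
			h->len = (h->len << 8) | buf[n + i];
		n += ext;
	}

	if (h->masked) {
		if (avail < n + 4)
			return -1;
		n += 4;
	}

	h->hdr_len = n;
	return 0;
}
```

Fed the bytes `81 7E 28 00`, this gives fin 1, opcode 1 (TEXT), unmasked, length 10240 and a 4-byte header: 0x7E is the "16-bit length follows" marker and 0x2800 is 10240.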
> In case of "disordered continuation" I have no output of Rx data for the
> last callback. And once I had this output:
>
> [2021/11/08 16:13:52:0841] D: lws_is_final_fragment: final 1, rx pk length 6144, draining 0
> [2021/11/08 16:13:52:0841] D: [wsicli|0|WS/h1/default/medrepo.de|default]: lws_ws_client_rx_sm: bulk ws rx: inp used 4096, output 4096
> [2021/11/08 16:13:52:0841] D: [wsicli|0|WS/h1/default/medrepo.de|default]: SSL_read says 0
This is typical when the other side has hung up on us.
> [2021/11/08 16:13:52:0841] D: lws_ssl_get_error: 0x7f75f4008090 0 -> 6 (errno 0)
> [2021/11/08 16:13:52:0842] D: [wsicli|0|WS/h1/default/medrepo.de|default]: ssl err 6 errno 0
> [2021/11/08 16:13:52:0842] I: rops_handle_POLLIN_ws: LWS_SSL_CAPABLE_ERROR
> [2021/11/08 16:13:52:0842] D: [wsicli|0|WS/h1/default/medrepo.de|default]: lws_service_fd_tsi: Close and handled
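
Assuming the client is built against OpenSSL, and that the "0 -> 6" in the lws_ssl_get_error line is the SSL_read() return followed by the SSL_get_error() result, 6 is SSL_ERROR_ZERO_RETURN, meaning the peer closed the TLS connection with a close_notify alert. A minimal lookup for reading such logs could look like the sketch below; `ssl_error_name()` is a hypothetical helper, and the table is copied by hand from the SSL_ERROR_* values in OpenSSL's ssl.h rather than including that header.

```c
#include <stddef.h>

/*
 * Hypothetical helper for reading logs: map an SSL_get_error() result to
 * its symbolic name. Assumption: indexes match the SSL_ERROR_* values in
 * OpenSSL's <openssl/ssl.h>; other TLS backends may number them differently.
 */
static const char *
ssl_error_name(int err)
{
	static const char * const names[] = {
		"SSL_ERROR_NONE",             /* 0 */
		"SSL_ERROR_SSL",              /* 1 */
		"SSL_ERROR_WANT_READ",        /* 2 */
		"SSL_ERROR_WANT_WRITE",       /* 3 */
		"SSL_ERROR_WANT_X509_LOOKUP", /* 4 */
		"SSL_ERROR_SYSCALL",          /* 5 */
		"SSL_ERROR_ZERO_RETURN",      /* 6: peer sent close_notify */
	};

	if (err < 0 || (size_t)err >= sizeof(names) / sizeof(names[0]))
		return "unknown";

	return names[err];
}
```

So on that reading, the log shows an orderly close initiated by the server side, not a transport error.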
> Tests are running inside Docker.
Hmmm... could it be that your ws server is broken? If this were coming
from lws breakage, I think there would be many more complaints about it,
and I have not done any work on the ws data flow for quite a while now.
There's nothing indicating that lws lost the data, just that it was
(already) lost and lws is looking at the wreckage. So to prove or rule
that out, I think you need to hexdump what the server side puts into tls
and compare that with what the client receives when it breaks.
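
If both sides can capture the plaintext stream into a buffer (what the server hands to tls, and what the client gets back out of it), finding where they diverge is mechanical. A minimal sketch, with `first_mismatch()` being a hypothetical helper and the capturing itself left to your own code:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the offset of the first byte that differs
 * between what the server queued (tx) and what the client received (rx).
 * If one buffer is a prefix of the other, returns the shorter length.
 * Returns (size_t)-1 if the two buffers are identical.
 */
static size_t
first_mismatch(const uint8_t *tx, size_t tx_len,
	       const uint8_t *rx, size_t rx_len)
{
	size_t n = tx_len < rx_len ? tx_len : rx_len, i;

	for (i = 0; i < n; i++)
		if (tx[i] != rx[i])
			return i;

	return tx_len == rx_len ? (size_t)-1 : n;
}
```

With a repeating 30..3F test pattern, the offset it reports on a bad run tells you how far into the stream things were still intact, which you can then line up against the server's own logs.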
Data loss inside the tls tunnel is not something that can really happen,
since the crypto around it is going to notice. So a reasonable guess is
that the server is dropping some data on the floor.
-Andy