[Libwebsockets] sporadic connection close on Rx

Andy Green andy at warmcat.com
Tue Nov 9 06:02:39 CET 2021



On 11/8/21 16:19, Roman Nikiforov wrote:
> On 11/7/21 11:13 AM, Andy Green wrote:
>> Why does it want to limit that to 2048?  Is the limit coming from your 
>> lws_protocols struct?  If so, upping it to 4096 might help. 
> 
> No, I did a global text search and found 2048 only in lws sources (for 
> example in lws-adopt.h "readbuf is limited to the size of the ah rx buf, 
> currently 2048 bytes.")
> 
> I changed the data that the server sends to a repeated sequence of 
> bytes 30 to 3F and found bytes the server didn't send:
> 
> [2021/11/08 15:42:33:5339] N: 01F0: 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33 34 35 36 37    89:;<=>?01234567
> [2021/11/08 15:42:33:5339] N: 0200: 38 39 3A 3B 3C 3D 3E 3F 81 7E 28 00 30 31 32 33    89:;<=>?.~(.0123
> [2021/11/08 15:42:33:5339] N: 0210: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33    456789:;<=>?0123
> [2021/11/08 15:42:33:5339] N: 0220: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 30 31 32 33    456789:;<=>?0123
>
> I have no idea who put "81 7E 28 00" between 3F and 30. I ran several 
> tests, and whenever Rx failed with "illegal opcode" it was always 
> "81 7E 28 00".

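This is the ws frame header for FIN, TEXT, with a payload length of 
10240 bytes: 0x81 is FIN plus the TEXT opcode, 0x7E says a 16-bit 
length follows, and 0x2800 is 10240.  It's normal to see it in front 
of your payload (not inside it...).  It looks like you lost some data 
somewhere.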
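
As a rough illustration of how those four bytes decode, here is a 
minimal Python sketch of the RFC 6455 base header layout.  The function 
name parse_ws_header is made up for this example and is not part of 
lws; it only handles the fixed header fields, not the payload.

```python
import struct

def parse_ws_header(buf: bytes) -> dict:
    """Decode the base RFC 6455 frame header found at the start of buf."""
    if len(buf) < 2:
        raise ValueError("need at least 2 bytes for a ws header")
    fin = bool(buf[0] & 0x80)        # top bit of byte 0: final fragment
    opcode = buf[0] & 0x0F           # low nibble: 1 = TEXT, 2 = BINARY, ...
    masked = bool(buf[1] & 0x80)     # top bit of byte 1: payload is masked
    length = buf[1] & 0x7F           # 7-bit length, or 126 / 127 as escapes
    offset = 2
    if length == 126:                # 16-bit big-endian length follows
        if len(buf) < offset + 2:
            raise ValueError("truncated 16-bit length")
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
    elif length == 127:              # 64-bit big-endian length follows
        if len(buf) < offset + 8:
            raise ValueError("truncated 64-bit length")
        (length,) = struct.unpack_from("!Q", buf, offset)
        offset += 8
    if masked:                       # client-to-server frames carry a 4-byte mask key
        offset += 4
    return {"fin": fin, "opcode": opcode, "masked": masked,
            "length": length, "header_len": offset}

# The four bytes from the hexdump above
hdr = parse_ws_header(bytes.fromhex("817E2800"))
print(hdr)
```

For "81 7E 28 00" this reports FIN set, opcode 1 (TEXT), unmasked, and 
a payload length of 10240, which is what a server-to-client text 
message of that size should start with.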

> In case of "disordered continuation" I have no output of Rx data for the 
> last callback. And once I had this output:
> 
> [2021/11/08 16:13:52:0841] D: lws_is_final_fragment: final 1, rx pk length 6144, draining 0
> [2021/11/08 16:13:52:0841] D: [wsicli|0|WS/h1/default/medrepo.de|default]: lws_ws_client_rx_sm: bulk ws rx: inp used 4096, output 4096
> [2021/11/08 16:13:52:0841] D: [wsicli|0|WS/h1/default/medrepo.de|default]: SSL_read says 0

This is typical when the other side has hung up on us.

> [2021/11/08 16:13:52:0841] D: lws_ssl_get_error: 0x7f75f4008090 0 -> 6 (errno 0)
> [2021/11/08 16:13:52:0842] D: [wsicli|0|WS/h1/default/medrepo.de|default]: ssl err 6 errno 0
> [2021/11/08 16:13:52:0842] I: rops_handle_POLLIN_ws: LWS_SSL_CAPABLE_ERROR
> [2021/11/08 16:13:52:0842] D: [wsicli|0|WS/h1/default/medrepo.de|default]: lws_service_fd_tsi: Close and handled
>
> Tests are running inside Docker.

Hmmm... could it be that your ws server is broken?  I think if this were 
coming from lws breakage, there would be many more complaints about it, 
plus I have not done any work on the ws data flow for quite a while now.

There's nothing indicating that lws lost the data, just that it was 
(already) lost and lws is looking at the wreckage.  So to prove or rule 
that out, I think you need to hexdump what the server side puts into 
tls and compare it with what the client side receives when it is broken.
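
One way to do that comparison, sketched in Python under the assumption 
that you can capture both byte streams (what the server wrote into tls 
and what the client got out of it) as plain bytes.  The helper name 
first_divergence and the simulated 5-byte loss are made up for the 
example.

```python
def first_divergence(sent: bytes, received: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return i
    if len(sent) != len(received):
        # One stream is a strict prefix of the other
        return min(len(sent), len(received))
    return None

# Test pattern like the one in the report: bytes 0x30..0x3F repeated
pattern = bytes(range(0x30, 0x40))
sent = pattern * 4

# Simulate 5 bytes going missing at offset 20 (purely illustrative)
received = sent[:20] + sent[25:]

print(first_divergence(sent, received))
```

Note that with a pattern repeating every 16 bytes, a loss that is an 
exact multiple of 16 bytes looks identical to the original until the 
stream runs short, so putting a running counter into the payload may 
localise the loss better.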

Data loss inside the tls tunnel is not something that can really 
happen, since the crypto around it is going to notice.  So a reasonable 
guess is that the server is dropping some data on the floor.
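
To illustrate why the crypto notices: tls records are integrity 
protected and the record sequence number is part of that protection, so 
a dropped record makes the next one fail verification.  The Python 
sketch below is a toy model of that idea using HMAC with an explicit 
sequence number; it is not the actual tls record format, and KEY, 
protect and verify are made-up names.

```python
import hashlib
import hmac

KEY = b"demo key, not a real tls secret"   # assumption for illustration only
TAG_LEN = hashlib.sha256().digest_size     # 32 bytes

def protect(seq: int, payload: bytes) -> bytes:
    """Append a tag computed over the sequence number and the payload."""
    tag = hmac.new(KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()
    return payload + tag

def verify(seq: int, record: bytes) -> bytes:
    """Check the tag against the sequence number the receiver expects."""
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed at seq %d" % seq)
    return payload

# Sender protects three records with sequence numbers 0, 1, 2
records = [protect(i, b"chunk %d" % i) for i in range(3)]

# Receiver gets record 0, then record 2 because record 1 was dropped in transit
print(verify(0, records[0]))
try:
    verify(1, records[2])          # receiver expects seq 1 here
except ValueError as e:
    print("detected:", e)
```

So if bytes vanished between the two tls endpoints, you would expect a 
tls-level failure rather than a clean decrypt with a hole in the ws 
stream.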

-Andy

