[Libwebsockets] Problem of multiple persistent WebSocket connections

Zhang, Luis (Thin Client R&D) xiaolu.zhang at hp.com
Fri Sep 22 10:40:21 CEST 2017


We just created the connection, no data exchange.

I developed a client simulator based on this lib, one process simulates 1000 webscoket connections. And then start one client simulator every 5 seconds.

Best Regards,
Luis


-----Original Message-----
From: Andy Green [mailto:andy at warmcat.com] 
Sent: Friday, September 22, 2017 16:32
To: Zhang, Luis (Thin Client R&D) <xiaolu.zhang at hp.com>; libwebsockets at ml.libwebsockets.org
Subject: Re: [Libwebsockets] Problem of multiple persistent WebSocket connections



On 09/22/2017 04:23 PM, Zhang, Luis (Thin Client R&D) wrote:
> Attached is the logs with Master and enabled LWS_WITH_STATS.
> 
> Below are some configurations on server side in this test.
> 	info.timeout_secs = 30
> 	info.max_http_header_pool = 1024
> 	info.ws_ping_pong_interval = 180

Well, he is happy until ~35,000 connections

[2017/09/22 16:14:14:9316] NOTICE:
Consuming message: request timed out
[2017/09/22 16:14:25:0909] NOTICE:
[2017/09/22 16:14:25:0909] NOTICE: LWS internal statistics dump ----->
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_CONNECTIONS: 
          35000
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_API_CLOSE: 
              0
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_API_READ: 
          70000
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_API_LWS_WRITE: 
          35000
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_API_WRITE: 
          35000
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_WRITE_PARTIALS: 
              0
[2017/09/22 16:14:25:0909] NOTICE: LWSSTATS_C_WRITEABLE_CB_REQ: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_WRITEABLE_CB_EFF_REQ: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_WRITEABLE_CB: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_FAILED: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_ACCEPTED: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_SSL_CONNS_HAD_RX: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_PEER_LIMIT_AH_DENIED: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_PEER_LIMIT_WSI_DENIED: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_TIMEOUTS: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_C_SERVICE_ENTRY: 
              0
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_B_READ: 
       25807460
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_B_WRITE: 
        6195000
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_B_PARTIALS_ACCEPTED_PARTS: 
              0
[2017/09/22 16:14:25:0910] NOTICE: 
LWSSTATS_MS_SSL_CONNECTIONS_ACCEPTED_DELAY:        0ms
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_MS_SSL_RX_DELAY: 
              0ms
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_MS_WRITABLE_DELAY: 
              0us
[2017/09/22 16:14:25:0910] NOTICE: LWSSTATS_MS_WORST_WRITABLE_DELAY: 
              0us
[2017/09/22 16:14:25:0910] NOTICE: Simultaneous SSL restriction: 
              0/0/0
[2017/09/22 16:14:25:0910] NOTICE: Live wsi: 
          35001
[2017/09/22 16:14:25:0910] NOTICE:   Listen port 7681 actual POLLIN:   1
[2017/09/22 16:14:25:0910] NOTICE: PT 1
[2017/09/22 16:14:25:0910] NOTICE:   AH in use / max:                  0 
/ 1024
[2017/09/22 16:14:25:0910] NOTICE:   AH wait list count / actual:      0 / 0
[2017/09/22 16:14:25:0910] NOTICE:
[2017/09/22 16:14:26:5866] NOTICE: Report count = 0
[2017/09/22 16:14:36:0051] NOTICE:
[2017/09/22 16:14:36:0052] NOTICE: LWS internal statistics dump ----->
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_CONNECTIONS: 
          35258
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_API_CLOSE: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_API_READ: 
          72265
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_API_LWS_WRITE: 
          37129
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_API_WRITE: 
          37129
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_WRITE_PARTIALS: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_WRITEABLE_CB_REQ: 
           5000
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_WRITEABLE_CB_EFF_REQ: 
           3966
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_WRITEABLE_CB: 
           1966
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_FAILED: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_ACCEPTED: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_SSL_CONNS_HAD_RX: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_PEER_LIMIT_AH_DENIED: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_PEER_LIMIT_WSI_DENIED: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_TIMEOUTS: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_C_SERVICE_ENTRY: 
              0
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_B_READ: 
       25907970
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_B_WRITE: 
        6217833
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_B_PARTIALS_ACCEPTED_PARTS: 
              0
[2017/09/22 16:14:36:0052] NOTICE: 
LWSSTATS_MS_SSL_CONNECTIONS_ACCEPTED_DELAY:        0ms
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_MS_SSL_RX_DELAY: 
              0ms
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_MS_WRITABLE_DELAY: 
       8163448955us
[2017/09/22 16:14:36:0052] NOTICE: LWSSTATS_MS_WORST_WRITABLE_DELAY: 
        7244789us
[2017/09/22 16:14:36:0052] NOTICE:   Avg writable delay: 
        4152313us
[2017/09/22 16:14:36:0052] NOTICE: Simultaneous SSL restriction: 
              0/0/0
[2017/09/22 16:14:36:0052] NOTICE: Live wsi: 
          35259
[2017/09/22 16:14:36:0052] NOTICE:   Listen port 7681 actual POLLIN:   1
[2017/09/22 16:14:36:0052] NOTICE: PT 1
[2017/09/22 16:14:36:0052] NOTICE:   AH in use / max: 
129 / 1024
[2017/09/22 16:14:36:0052] NOTICE:   AH wait list count / actual:      0 / 0


He starts to get connections around [2017/09/22 16:11:29:0440], he has 2K by then, and he starts to choke before 16:14:36, so it's around the 3m mark... ping-pong interval is very suspiously 180s, ie, 3m.  So I guess the number of connections is irrelevant, something happens when he starts to do the ping / pong.

Can you describe what the connections are doing, if I wanted to reproduce this test?  They are upgrading to ws and just sitting there idle, or something else?

-Andy

> Best Regards,
> Luis
> 
> -----Original Message-----
> From: Andy Green [mailto:andy at warmcat.com]
> Sent: Tuesday, September 19, 2017 16:38
> To: Zhang, Luis (Thin Client R&D) <xiaolu.zhang at hp.com>; 
> libwebsockets at ml.libwebsockets.org
> Subject: Re: [Libwebsockets] Problem of multiple persistent WebSocket 
> connections
> 
> 
> 
> On 09/19/2017 04:23 PM, Zhang, Luis (Thin Client R&D) wrote:
>> Thank you.
>>
>> In my project, the value of info.max_http_header_pool is 16. I will set it to 256 and try again.
>>
>> Besides, I enabled the ping/pong and set ws_ping_pong_interval to 180s in my project. I just disabled ping/pong and tested again, there is no problem for 100K connections.
>>
>> So does the ping/pong impact the performance and connections of the server side?
> 
> Well, it will generate a lot of traffic if you have 100K synthetic ws connections created all at once.  Because they will want to send pings and get pongs all at once.
> 
> It may be related, but stuff that times out on ping / pong has a 
> different timeout code, either PENDING_TIMEOUT_WS_PONG_CHECK_SEND_PING
> (16) or PENDING_TIMEOUT_WS_PONG_CHECK_GET_PONG (17).
> 
> As I said the timeout code you reported getting (3) is because an accepted connection did not process http headers.  That really looks like ah starvation ("pfd events 0").
> 
> I suggest you test things on lws master until we know what's 
> happening... also enabling LWS_WITH_STATS at cmake on master will get 
> you helpful logging dumps every 10s like this
> 
> [2017/09/19 15:08:01:2053] NOTICE: LWS internal statistics dump ----->
> [2017/09/19 15:08:01:2053] NOTICE: LWSSTATS_C_CONNECTIONS:
>                1
> [2017/09/19 15:08:01:2053] NOTICE: LWSSTATS_C_API_CLOSE:
>                0
> [2017/09/19 15:08:01:2053] NOTICE: LWSSTATS_C_API_READ:
>               27
> [2017/09/19 15:08:01:2053] NOTICE: LWSSTATS_C_API_LWS_WRITE:
>               13
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_API_WRITE:
>               13
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_WRITE_PARTIALS:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_WRITEABLE_CB_REQ:
>               17
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_WRITEABLE_CB_EFF_REQ:
>               13
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_WRITEABLE_CB:
>               13
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_FAILED:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_SSL_CONNECTIONS_ACCEPTED:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_SSL_CONNS_HAD_RX:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_PEER_LIMIT_AH_DENIED:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_PEER_LIMIT_WSI_DENIED:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_TIMEOUTS:
>                0
> [2017/09/19 15:08:01:2054] NOTICE: LWSSTATS_C_SERVICE_ENTRY:
>             1243
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_B_READ:
>             4485
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_B_WRITE:
>             1435
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_B_PARTIALS_ACCEPTED_PARTS:
>                0
> [2017/09/19 15:08:01:2055] NOTICE:
> LWSSTATS_MS_SSL_CONNECTIONS_ACCEPTED_DELAY:        0ms
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_MS_SSL_RX_DELAY:
>                0ms
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_MS_WRITABLE_DELAY:
>             1314us
> [2017/09/19 15:08:01:2055] NOTICE: LWSSTATS_MS_WORST_WRITABLE_DELAY:
>             1063us
> [2017/09/19 15:08:01:2055] NOTICE:   Avg writable delay:
>              101us
> [2017/09/19 15:08:01:2055] NOTICE: Simultaneous SSL restriction:
>                0/0/0
> [2017/09/19 15:08:01:2055] NOTICE: Live wsi:
>                5
> [2017/09/19 15:08:01:2055] NOTICE:   Listen port 2200 actual POLLIN:   1
> [2017/09/19 15:08:01:2056] NOTICE: PT 1
> [2017/09/19 15:08:01:2056] NOTICE:   AH in use / max:                  0 / 4
> [2017/09/19 15:08:01:2056] NOTICE:   AH wait list count / actual:      0 / 0
> [2017/09/19 15:08:01:2056] NOTICE:  Peers: total active 1
> [2017/09/19 15:08:01:2056] NOTICE:   peer 127.0.0.1: count wsi: 1, count
> ah: 0
> [2017/09/19 15:08:01:2056] NOTICE:
> [2017/09/19 15:08:12:2181] NOTICE:
> 
> -Andy
> 
>> Best Regards,
>> Luis
>>
>>
>> -----Original Message-----
>> From: Andy Green [mailto:andy at warmcat.com]
>> Sent: Tuesday, September 19, 2017 15:49
>> To: Zhang, Luis (Thin Client R&D) <xiaolu.zhang at hp.com>; 
>> libwebsockets at ml.libwebsockets.org
>> Subject: Re: [Libwebsockets] Problem of multiple persistent WebSocket 
>> connections
>>
>>
>>
>> On 09/19/2017 03:32 PM, Zhang, Luis (Thin Client R&D) wrote:
>>> Attached is the log file.
>>>
>>> There are a lot of TIMEDOUT messages likes below. I modified the timeout from 5 seconds to 15 seconds, the issue occurred too.
>>>
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd624162660: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd624142bb0: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd624142900: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd624142650: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd6241303a0: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd6241300f0: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>> [2017/09/19 15:16:22:0048] NOTICE: wsi 0x7fd62412fe40: TIMEDOUT 
>>> WAITING on 3 (did hdr 0, ah (nil), wl 611, pfd events 0)
>>
>> Timeout reason 3 is "PENDING_TIMEOUT_ESTABLISH_WITH_SERVER"... this starts when we accept the incoming connection and is cleared when we deal with his headers.
>>
>> Since an ah is required to deal with the headers, I guess these are getting tied up.
>>
>> When you create the context, what do you set info.max_http_header_pool to?  The default is just 4... if you have 100,000 connections you probably want to scale this up to 256 or more.  This sets how many connections can simultaneously process http headers.
>>
>> -Andy
>>
>>> Best Regards,
>>> Luis
>>>
>>>
>>> -----Original Message-----
>>> From: Libwebsockets
>>> [mailto:libwebsockets-bounces at ml.libwebsockets.org] On Behalf Of 
>>> Zhang, Luis (Thin Client R&D)
>>> Sent: Monday, September 18, 2017 9:47
>>> To: Andy Green <andy at warmcat.com>; 
>>> libwebsockets at ml.libwebsockets.org
>>> Subject: Re: [Libwebsockets] Problem of multiple persistent 
>>> WebSocket connections
>>>
>>> I will reproduce it and get more detailed logs later.
>>>
>>> Best Regards,
>>> Luis
>>>
>>>
>>> -----Original Message-----
>>> From: Andy Green [mailto:andy at warmcat.com]
>>> Sent: Monday, September 18, 2017 9:33
>>> To: Zhang, Luis (Thin Client R&D) <xiaolu.zhang at hp.com>; 
>>> libwebsockets at ml.libwebsockets.org
>>> Subject: RE: [Libwebsockets] Problem of multiple persistent 
>>> WebSocket connections
>>>
>>>
>>>
>>> On September 18, 2017 9:12:35 AM GMT+08:00, "Zhang, Luis (Thin Client R&D)" <xiaolu.zhang at hp.com> wrote:
>>>> Thanks.
>>>>
>>>> The platform is Debian 8.5 and I used libuv.
>>>
>>> OK... so what happens when it seems to go wrong?
>>>
>>> Can you reproduce this with the libuv test server?  Can you let me know how to reproduce?
>>>
>>> LWS_WITH_STATS is on master, it logs a bunch of interesting stats every 10s.
>>>
>>> -Andy
>>>
>>>> Best Regards,
>>>> Luis
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andy Green [mailto:andy at warmcat.com]
>>>> Sent: Monday, September 18, 2017 9:11
>>>> To: Zhang, Luis (Thin Client R&D) <xiaolu.zhang at hp.com>; 
>>>> libwebsockets at ml.libwebsockets.org
>>>> Subject: Re: [Libwebsockets] Problem of multiple persistent 
>>>> WebSocket connections
>>>>
>>>>
>>>>
>>>> On 09/18/2017 09:07 AM, Andy Green wrote:
>>>>>
>>>>>
>>>>> On 09/18/2017 08:51 AM, Zhang, Luis (Thin Client R&D) wrote:
>>>>>> I'm working on developing a WebSocket Server that supports for 
>>>>>> multiple persistent WebSocket connections. Our goal is supporting
>>>> for
>>>>>> 100K connections concurrently. As mentioned in 
>>>>>> https://libwebsockets.org/ that per thread can support 250K 
>>>>>> connections. So I think that there is no problem for supporting 
>>>>>> 100K
>>>>
>>>>>> persistent WebSocket connections. And I verified it with v2.0.0, 
>>>>>> it works well.
>>>>>>
>>>>>> But, when I upgraded the libwebsockets lib to v2.1, the server
>>>> cannot
>>>>>> accepted so many connections. It only can accepted about 40K 
>>>>>> connections. When about 40K connections were established, I 
>>>>>> cannot establish websocket connection any more. I also test it 
>>>>>> with v2.2
>>>> and
>>>>>> v2.3, I got the same result.
>>>>>>
>>>>>> The test environment of all test case are same, just the version 
>>>>>> of libwesocket lib are different.
>>>>>
>>>>>      - what is the platform?
>>>>>
>>>>>      - what *exactly* happens when you "cannot establish websocket 
>>>>> connection any more"?  Logs?
>>>>
>>>> ... and...
>>>>
>>>>     - what event loop?  Default, using libuv / ev / event?
>>>>
>>>>>
>>>>> There's no intention to reduce the max clients the server can 
>>>>> serve, on the contrary...
>>>>>
>>>>> -Andy
>>>>>
>>>>>> Any thoughts? Thank you.
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> Luis
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Libwebsockets mailing list
>>>>>> Libwebsockets at ml.libwebsockets.org
>>>>>> https://libwebsockets.org/mailman/listinfo/libwebsockets
>>>>>>
>>>>> _______________________________________________
>>>>> Libwebsockets mailing list
>>>>> Libwebsockets at ml.libwebsockets.org
>>>>> https://libwebsockets.org/mailman/listinfo/libwebsockets
>>> _______________________________________________
>>> Libwebsockets mailing list
>>> Libwebsockets at ml.libwebsockets.org
>>> https://libwebsockets.org/mailman/listinfo/libwebsockets
>>>


More information about the Libwebsockets mailing list