[Libwebsockets] trouble with external POLL array handling
"Andy Green (林安廸)"
andy at warmcat.com
Tue Jan 15 16:05:08 CET 2013
On 15/01/13 21:55, the mail apparently from Edwin van den Oetelaar included:
> Thanks Andy, you are very active/responsive on this !!
> As another test I tried changing the extpoll stuff into a mark and
> sweep solution.
> This means I mark fds as invalid (-1) and at the end of the mainloop
> clean them out before entering the poll() again.
> This also works very well, but I see exactly the same thing happening,
> one of the first connections takes the most wall-clock-time to
> I also tried running the loop backwards (high->low index which is easy
> when using mark/sweep, since the array does not change size in the
> loop) and it still has this strange effect.
> Very odd indeed.
> I am still searching too.
> This is fun,
Yeah it's fun, painful fun ^^
When I look at the max number of poll() members with something waiting,
it never seems to exceed 20 here.
Sometimes it's 19 and things are normal.
But when poll() returns 20 as the number waiting and we then iterate
over their revents, we only ever find 19 with nonzero revents. The poll()
return code is defined to be the number of array members with nonzero revents.
I guess we might learn something about poll() if we keep digging.
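One way to pin down the mismatch described above is to count the nonzero revents entries right after poll() returns and compare that with the return value. The following is only a minimal sketch: count_nonzero_revents() and poll_and_check() are hypothetical helper names, not part of libwebsockets or the test server.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Count the array members that poll() flagged (hypothetical helper). */
int count_nonzero_revents(const struct pollfd *fds, nfds_t nfds)
{
    int n = 0;

    assert(fds != NULL || nfds == 0);
    for (nfds_t i = 0; i < nfds; i++)
        if (fds[i].revents != 0)
            n++;
    return n;
}

/*
 * Call poll() and store our own count of flagged members in *counted,
 * so the caller can log any case where the two numbers disagree.
 * Returns whatever poll() returned.
 */
int poll_and_check(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                   int *counted)
{
    int ret = poll(fds, nfds, timeout_ms);

    *counted = ret > 0 ? count_nonzero_revents(fds, nfds) : 0;
    return ret;
}
```

If the two numbers differ, it is worth checking whether anything touches the array between the poll() call and the scan, since the count is only meaningful on the untouched array.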
> Edwin van den Oetelaar
> On Tue, Jan 15, 2013 at 2:38 PM, "Andy Green (林安廸)" <andy at warmcat.com> wrote:
>> On 15/01/13 18:03, the mail apparently from "Andy Green (林安廸)" included:
>>> On 15/01/13 17:54, the mail apparently from Edwin van den Oetelaar
>>>> Under high concurrency and high load I notice connection refused.
>>>> #define SOMAXCONN 128
>>>> in libwebsockets/lib/libwebsockets.c
>>>> line 2965: listen(sockfd, SOMAXCONN );
>>>> This seems to be logical, and I never looked at it before but....
>>>> since I can do stuff like :
>>>> echo "2048 64512" > /proc/sys/net/ipv4/ip_local_port_range
>>>> echo "1" > /proc/sys/net/ipv4/tcp_tw_recycle
>>>> echo "1" > /proc/sys/net/ipv4/tcp_tw_reuse
>>>> echo "10" > /proc/sys/net/ipv4/tcp_fin_timeout
>>>> echo "65536" > /proc/sys/net/core/somaxconn
>>>> echo "65536" > /proc/sys/net/ipv4/tcp_max_syn_backlog
>>>> echo "262144" > /proc/sys/net/netfilter/nf_conntrack_max
>>>> Maybe the listen() backlog should be set to a large value (which will
>>>> be truncated automatically to what the system can handle) or
>>> Yes configurable is the way I think, I'll add a patch for it later
>>> Actually these big instantaneous backlogs are always going to be
>>> synthetic in origin I would think. Of course we then need to support it
>>> to see it work with the synthetic tests, so it doesn't get us out of the
>>> problem. But then having it configurable will be enough.
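A configurable backlog could look roughly like the sketch below. This is not the patch mentioned above: open_listener() and its parameters are hypothetical names used only for illustration. On Linux, a backlog larger than /proc/sys/net/core/somaxconn is silently truncated to that value, so passing a large number is harmless.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a TCP listening socket on the loopback interface with a
 * caller-supplied backlog instead of a hardcoded SOMAXCONN.
 * Hypothetical helper; returns the listening fd, or -1 on error.
 */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd;

    assert(backlog >= 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts while old connections sit in TIME_WAIT. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller running a synthetic load test might pass something like 4096 here, while an ordinary deployment could keep the smaller default.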
>>>> Another way would be to handle the listen() fd differently and accept
>>>> more than 1 connection per loop over the fd list?
>>>> Just some thoughts,
>>> I think it's OK to leave them as backlog, since they are newbies to the
>>> server it's better they wait a little under load than everybody else
>>> gets latency because we favour new joiners.
>>> At the moment I'm adding hashtables to the extpoll server... I thought
>>> what might be happening is we keep closing and adding sockets, every
>>> close is expensive with lots of sockets. I noticed that we do not get
>>> this distribution skew with small numbers of concurrent connections,
>>> it's basically flat distribution then. Having hashtables to reduce the
>>> number of iterations to find the position of a given fd in the pollfd
>>> array should mitigate it if that's actually anything to do with the
>> Well, this is done and pushed, and it's a lot faster than before with large
>> numbers of sockets.
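The idea of using a hashtable to find an fd's position in the pollfd array can be sketched as follows. This is an illustration only and not the code that was pushed: struct fd_map, the fd_map_* functions, MAX_FDS and NBUCKETS are all hypothetical. Each bucket is a chain of indices into the pollfd array, and removal moves the last entry into the freed slot so the array stays dense.

```c
#include <assert.h>
#include <poll.h>

#define MAX_FDS  1024
#define NBUCKETS 256

struct fd_map {
    struct pollfd fds[MAX_FDS]; /* dense array handed to poll() */
    int next[MAX_FDS];          /* next index in the same bucket, -1 ends */
    int head[NBUCKETS];         /* first index in each bucket, -1 if empty */
    int count;
};

static unsigned bucket_of(int fd) { return (unsigned)fd % NBUCKETS; }

void fd_map_init(struct fd_map *m)
{
    m->count = 0;
    for (int i = 0; i < NBUCKETS; i++)
        m->head[i] = -1;
}

/* Return the pollfd index of fd, or -1 if it is not tracked. */
int fd_map_find(const struct fd_map *m, int fd)
{
    for (int i = m->head[bucket_of(fd)]; i != -1; i = m->next[i])
        if (m->fds[i].fd == fd)
            return i;
    return -1;
}

/* Link index i at the front of the chain for the fd it holds. */
static void link_index(struct fd_map *m, int i)
{
    unsigned b = bucket_of(m->fds[i].fd);

    m->next[i] = m->head[b];
    m->head[b] = i;
}

/* Unlink index i from its chain; i must currently be linked. */
static void unlink_index(struct fd_map *m, int i)
{
    int *p = &m->head[bucket_of(m->fds[i].fd)];

    while (*p != i)
        p = &m->next[*p];
    *p = m->next[i];
}

/* Append fd to the array; returns its index, or -1 if full. */
int fd_map_add(struct fd_map *m, int fd, short events)
{
    int i;

    if (m->count >= MAX_FDS)
        return -1;
    i = m->count++;
    m->fds[i].fd = fd;
    m->fds[i].events = events;
    m->fds[i].revents = 0;
    link_index(m, i);
    return i;
}

/* Remove fd, moving the last entry into the hole; 0 on success. */
int fd_map_remove(struct fd_map *m, int fd)
{
    int i = fd_map_find(m, fd);
    int last = m->count - 1;

    if (i < 0)
        return -1;
    unlink_index(m, i);
    if (i != last) {
        unlink_index(m, last);
        m->fds[i] = m->fds[last];
        link_index(m, i);
    }
    m->count--;
    return 0;
}
```

With this layout, finding or removing an fd only walks one short chain instead of scanning the whole array, which is where the saving with large numbers of sockets would come from.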
>> However it doesn't solve the stats bump with ab.
>> $ ab -t 100 -n 5000 -c 300 'http://127.0.0.1:7681/'
>> This is ApacheBench, Version 2.3 <$Revision: 1373084 $>
>> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
>> Licensed to The Apache Software Foundation, http://www.apache.org/
>> Benchmarking 127.0.0.1 (be patient)
>> Completed 500 requests
>> Completed 1000 requests
>> Completed 1500 requests
>> Completed 2000 requests
>> Completed 2500 requests
>> Completed 3000 requests
>> Completed 3500 requests
>> Completed 4000 requests
>> Completed 4500 requests <---- suspicious delay here
>> Completed 5000 requests
>> Finished 5000 requests
>> Server Software: libwebsockets
>> Server Hostname: 127.0.0.1
>> Server Port: 7681
>> Document Path: /
>> Document Length: 8447 bytes
>> Concurrency Level: 300
>> Time taken for tests: 10.614 seconds
>> Complete requests: 5000
>> Failed requests: 0
>> Write errors: 0
>> Total transferred: 42680000 bytes
>> HTML transferred: 42235000 bytes
>> Requests per second: 471.08 [#/sec] (mean)
>> Time per request: 636.831 [ms] (mean)
>> Time per request: 2.123 [ms] (mean, across all concurrent requests)
>> Transfer rate: 3926.91 [Kbytes/sec] received
>> Connection Times (ms)
>>               min  mean[+/-sd] median   max
>> Connect:        0  130   539.6      0   7020
>> Processing:   147  300   471.3    238   8968
>> Waiting:       14  246   473.5    209   8957
>> Total:        166  430   807.2    239   9970
>> Percentage of the requests served within a certain time (ms)
>> 50% 239
>> 66% 244
>> 75% 249
>> 80% 263
>> 90% 566
>> 95% 1245
>> 98% 3239
>> 99% 3884
>> 100% 9970 (longest request)
>> Because the hashtables quite radically reduce the lookup time, but the ratio
>> between the longest delay and the normal delay is pretty unchanged, it seems
>> the extpoll stuff may not be implicated.