[Libwebsockets] lws_write speed
roger at atchoo.org
Wed May 18 11:35:02 CEST 2016
On Wed, May 18, 2016 at 5:10 AM, Andy Green <andy at warmcat.com> wrote:
> No it works fine, because after copying to the internal buffer lws_write()
> lies and returns the whole amount as "sent". It has to do that because the
> buffer it was given is usually on the stack and will immediately be lost.
Ok, I don't believe that this will be done on the stack most of the
time but I understand the reasoning.
> Seeing what has happened, lws then disables any further WRITABLE callbacks
> to the user code and requests and services them automatically from the temp
> buffer. When the malloc'd temp buffer is drained, it is kept around (on the
> basis if you wrote that much once on this wsi, your code is probably
> planning to do so again) and only realloc'd if the next one is bigger.
> WRITABLE callbacks are reenabled when the temp buffer is drained. The temp
> buffer is freed when the wsi closes.
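To make the behaviour described above concrete, here is a minimal sketch of that bookkeeping. It is not the actual libwebsockets code: struct conn and the conn_* functions are hypothetical names, and the "accepted" argument stands in for whatever the socket send call returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-connection state, NOT the real lws structures. */
struct conn {
    unsigned char *pending;    /* malloc'd temp buffer, kept between uses */
    size_t pending_cap;        /* allocated size of the temp buffer */
    size_t pending_len;        /* bytes stashed by the last partial send */
    size_t pending_off;        /* bytes of the stash already drained */
    int user_writable_enabled; /* 0 while draining the temp buffer */
};

/* Copy the unsent tail into the temp buffer, growing it only if needed. */
int conn_stash(struct conn *c, const unsigned char *tail, size_t n)
{
    if (n > c->pending_cap) {
        unsigned char *p = realloc(c->pending, n);
        if (!p)
            return -1;
        c->pending = p;
        c->pending_cap = n;
    }
    memcpy(c->pending, tail, n);
    c->pending_len = n;
    c->pending_off = 0;
    c->user_writable_enabled = 0; /* drain internally before calling user code */
    return 0;
}

/* "accepted" is how many bytes the socket took (assumed, simulated here).
 * Returns len even on a partial send, because the rest is now buffered. */
long conn_write(struct conn *c, const unsigned char *buf, size_t len,
                size_t accepted)
{
    assert(accepted <= len);
    if (accepted < len && conn_stash(c, buf + accepted, len - accepted) < 0)
        return -1;
    return (long)len;
}

/* Called when the socket accepted more of the stashed data. */
void conn_drain(struct conn *c, size_t accepted)
{
    assert(accepted <= c->pending_len - c->pending_off);
    c->pending_off += accepted;
    if (c->pending_off == c->pending_len) {
        /* Drained: keep the allocation, re-enable user WRITABLE callbacks. */
        c->pending_len = 0;
        c->pending_off = 0;
        c->user_writable_enabled = 1;
    }
}

/* Free the temp buffer when the connection closes. */
void conn_close(struct conn *c)
{
    free(c->pending);
    memset(c, 0, sizeof(*c));
}
```

The point of the sketch is only the ordering: a partial send is reported as complete, user callbacks pause until the stash is empty, and the allocation is reused rather than freed after each drain.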
I see, that makes sense in the context of the previous
> PS: also I learned to my surprise, the pattern of giving write() a big
> length and letting it nibble what it wants is a really bad performance idea.
> The problem is the kernel processes all of the pages every time before
> passing the request to the network stack, if len is counted in MB that is a
> huge amount of time and CPU lost each call, that will only accept a fraction
> of the processed pages.
Ah, that's very interesting and not something I'd thought about,
thanks for the tip.
Of course I wanted to test this, so I tried sending a ~60MB file (not
using websockets) with either write(full_remaining_length) or
write(4096), and used callgrind to look at both cases (yes, this only
looks at user space). The application was operating as a client with
only 2 socket connections open. Passing the full length meant write()
was called 1665 times; limited to 4096 bytes it was called 15725
times.
Doing further investigation to look at what was actually being
returned from write() gave me a smallest write of 1428 (this only
happened twice), a mean of 40559, median of 19992 and maximum of
It's clear that those numbers are much smaller than the 60MB total
size and so trying to pass that every single time would result in a
loss in performance from what you said. On the other hand, limiting to
4096 bytes at once seems like it would reduce performance as well from
what I've seen.
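A middle ground between the two extremes is to cap each write() at some chunk size and tune that cap. The following is a minimal sketch, assuming a blocking file descriptor; write_capped is a hypothetical helper name, and the right cap is something to measure rather than a value taken from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: send len bytes, never offering write() more than
 * cap bytes at once. Assumes a blocking fd (a non-blocking one would need
 * poll() and EAGAIN handling). Returns the number of write() calls made,
 * counting retried ones, or -1 on error. */
long write_capped(int fd, const unsigned char *buf, size_t len, size_t cap)
{
    long calls = 0;

    assert(cap > 0);
    while (len > 0) {
        size_t chunk = len < cap ? len : cap;
        ssize_t n = write(fd, buf, chunk);

        calls++;
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any data was written: retry */
            return -1;
        }
        /* write() may accept fewer bytes than offered; advance by n only. */
        buf += n;
        len -= (size_t)n;
    }
    return calls;
}
```

Setting cap to the full remaining length reproduces the first test case and setting it to 4096 reproduces the second, so the returned call count can be compared across a few values in between.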
FWIW, this is Ubuntu 14.04 running on an Intel Atom N2800 with 2GB RAM,
connecting to a remote host in a different country.