[Libwebsockets] [Libwebsocket] truncated send

Andy Green andy at warmcat.com
Mon Mar 7 10:55:32 CET 2016


Please use "libwebsockets at ml.libwebsockets.org" for the list.

On March 7, 2016 4:42:12 PM GMT+07:00, Mattia Micomonaco <mattia.micomonaco at fluidmesh.com> wrote:
>Hello. In my libwebsocket client there is a thread that periodically
>creates messages that have to be sent to a server. These messages are
>compressed and fragmented if necessary. Then the thread pushes all
>fragments into a queue (called "send_queue" and protected by a mutex).
>Each fragment in the queue is a pointer to an instance of the following
>struct:
>
>/*! Single fragment info */
>struct fragment {
>    int mode;
>    int len;
>    unsigned char content[WRITE_BUF_SIZE];
>};
>
>The first time the client is connected, the function
>lws_callback_on_writable is called (in the LWS_CALLBACK_CLIENT_ESTABLISHED
>callback). When the socket is "writable", the functions onWritable()
>and lws_callback_on_writable are called. The function onWritable reads
>the first fragment in the queue and tries to send it to the server
>through lws_write. If the latter is successful, this fragment is removed
>from the queue. This procedure is iterated until the queue is empty. But
>if one lws_write fails, the loop is broken (although the queue is not
>empty) and I call lws_callback_on_writable. When the socket is
>"writable" again, the client will try to send the remaining fragments.
>This behaviour is implemented with the following code:
>
>case LWS_CALLBACK_CLIENT_WRITEABLE: {
>    onWritable();
>    lws_callback_on_writable(wsi);
>    break;
>}
>
>void onWritable() {
>    int ret, n;
>    struct fragment *frg;
>
>    pthread_mutex_lock(&send_queue_mutex);
>
>    while (!send_queue.empty()) {
>        frg = send_queue.front();
>
>        n = lws_write(wsi, frg->content + LWS_PRE, frg->len,
>                      (lws_write_protocol)frg->mode);
>        ret = checkWsWrite(n, frg->len);
>        if (ret < 0)
>            break;
>
>        // pop fragment and free memory only if lws_write was successful
>        send_queue.pop();
>        delete(frg);
>    }
>    pthread_mutex_unlock(&send_queue_mutex);
>}
>
>int checkWsWrite(int n, int len)
>{
>    if (n < 0) {
>        cerr << "Error writing to socket." << endl;
>        return -1;
>    }
>    if (n < len) {
>        cerr << "Partial write: " << n << " < " << len << endl;
>        return -1;
>    }
>    return n;
>}
>
>I'm testing my client by sending a very big message in order to have
>many fragments. I try to send 100000 fragments. Each fragment has a
>content of 7000 bytes.
>The behavior is always the same: the client fails to send the 21st
>fragment (lws_write returns 0). In particular the following error occurs:
>ERR: ****** 197e160 Sending new, pending truncated ...

Put simply, you called lws_write() outside of the event loop serialization.

So one send was not completely accepted by the kernel, and lws automatically buffered the part that was not accepted.  No more WRITABLE callbacks will come to user code until the buffered data is drained.

My guess is that the problem is other threads were already waiting at your pthread_mutex_lock()... otherwise, why put a mutex there?  Somehow your threading logic dispatched them earlier.  So they can get the lock now and do the write directly, violating the ordering of data on the wire, since lws didn't have the chance to send anything by then.

Lws is not generally threadsafe.  Just do everything related to lws in one thread (you can then remove the locking), and do not return from the callback until you have done whatever you were doing.  The only thing you may do from another thread is ask for a callback on writeable.
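
A minimal sketch of that arrangement, with loud assumptions: FakeConnection, FragmentSender and service_once are hypothetical names standing in for the wsi, the user code and the lws service loop, and none of them is libwebsockets API.  In real code request_writable() would correspond to lws_callback_on_writable() and write() to an lws_write() made from the LWS_CALLBACK_CLIENT_WRITEABLE handler.  Sending exactly one fragment per writable callback is a choice made for this sketch.  The mutex guards only the queue handoff between the producer thread and the service thread; it is never held around the write.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one connection (the wsi); not libwebsockets API.
struct FakeConnection {
    std::atomic<bool> writable_requested{false};
    std::vector<std::string> wire;  // fragments in the order they were written

    // Models lws_callback_on_writable(): the one call allowed from any thread.
    void request_writable() { writable_requested.store(true); }

    // Models lws_write(): must only be reached from the service thread.
    int write(const std::string &frag) {
        assert(!frag.empty());  // this sketch never queues empty fragments
        wire.push_back(frag);
        return static_cast<int>(frag.size());
    }
};

// Hypothetical user-side sender: producers enqueue, the service thread writes.
class FragmentSender {
public:
    explicit FragmentSender(FakeConnection &conn) : conn_(conn) {}

    // Producer thread: queue the fragment, then only ASK for a callback.
    void enqueue(std::string frag) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(frag));
        }
        conn_.request_writable();
    }

    // Service thread only: body of the writable callback.
    // Sends at most one fragment; returns false if the write came up short.
    bool on_writable() {
        std::string frag;
        bool more = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (queue_.empty())
                return true;
            frag = std::move(queue_.front());
            queue_.pop_front();
            more = !queue_.empty();
        }
        // The lock is released before writing, so no other thread can be
        // parked on it waiting to write out of turn.
        int n = conn_.write(frag);
        if (n < static_cast<int>(frag.size()))
            return false;
        if (more)
            conn_.request_writable();  // come back for the next fragment
        return true;
    }

private:
    FakeConnection &conn_;
    std::mutex mutex_;
    std::deque<std::string> queue_;
};

// Models one pass of the event loop: deliver a writable callback if one was
// requested. Returns true if a callback was delivered.
bool service_once(FakeConnection &conn, FragmentSender &sender) {
    if (!conn.writable_requested.exchange(false))
        return false;
    sender.on_writable();
    return true;
}
```

With this shape, a producer thread calls only enqueue(), and the thread running the service loop is the only one that ever reaches write(), so fragments reach the wire in queue order.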

-Andy

>After breaking the loop, lws_callback_on_writable is called again but
>the client disconnects from the server (I don't know why).
>Could someone help me to explain this strange behavior (truncated send
>and unexpected disconnection)? Thanks in advance



