[Libwebsockets] SSL http client stalls on the last chunk.

Andy Green andy at warmcat.com
Wed Jan 22 14:00:30 CET 2020

On 1/22/20 11:58 AM, Zevv wrote:
> Hi,
> I have run into a case (hard to reproduce, so no sample code here,
> sorry) where a client HTTPS transfer stalls while waiting for the
> last remaining bit of data.
> The symptoms are that libws no longer triggers LWS_CALLBACK_RECEIVE_CLIENT_HTTP
> events, until the remote server decides it has waited long enough and
> closes the socket. This triggers a new LWS_POLLIN, causing the state
> machine to run again and the final bits of the transfer to succeed. It
> seems that the chain of events gets lost somewhere if there is more than
> one buffer available in the OpenSSL context; this just gets stuck until
> the event loop kicks in again.
> The comments "if it was our buffer that... this data will sit there
> forever" in openssl-ssl.c seem to apply here somehow.
> I can work around this by calling lws_http_client_read() more than once
> from the LWS_CALLBACK_RECEIVE_CLIENT_HTTP callback, which makes sure to
> drain the OpenSSL decrypted read buffer, but I assume this is not
> supposed to work like this.
> What am I doing wrong here?

lws should notice when the SSL layer has stuff buffered, and put the
wsi on a linked list at the pt so it will fake a POLLIN next time around.
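
A toy model of that mechanism, not lws code: every name here (toy_conn,
toy_read, toy_service) is hypothetical. It assumes a TLS layer that hands
back at most one decrypted record per read, and a service pass that treats
a connection flagged as pending as if POLLIN had fired on it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a wsi whose TLS layer holds decrypted records. */
struct toy_conn {
	const char *records[4];	/* decrypted records held by the "TLS layer" */
	int head, count;	/* next record to hand out, total queued */
	int on_pending_list;	/* 1 if the service loop should fake POLLIN */
};

/* Plays the role of SSL_pending(): nonzero while data is still buffered. */
static int toy_pending(const struct toy_conn *c)
{
	return c->head < c->count;
}

/* Read at most one record, then record whether more is still buffered. */
static int toy_read(struct toy_conn *c, char *buf, int len)
{
	int n;

	assert(c && buf && len >= 0);
	if (!toy_pending(c))
		return 0;
	n = (int)strlen(c->records[c->head]);
	if (n > len)
		n = len;	/* simplification: the excess is dropped */
	memcpy(buf, c->records[c->head], (size_t)n);
	c->head++;
	/* Checked on every read, not only when the caller's buffer was filled */
	c->on_pending_list = toy_pending(c);
	return n;
}

/* One service pass with a silent socket: only a faked POLLIN can read. */
static int toy_service(struct toy_conn *c, char *buf, int len)
{
	if (!c->on_pending_list)
		return 0;	/* nothing to do, would block in poll() */
	return toy_read(c, buf, len);
}
```

With two queued records, the first toy_read() returns one record and leaves
the connection flagged; the next toy_service() pass drains the second
without any network event.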

Although you explained your meaning well, not being able to reproduce it 
makes it kind of difficult to know where it's coming from.

What happens if we don't rely on having read up to the max result buffer 
to trigger checking SSL_pending()?

diff --git a/lib/tls/openssl/openssl-ssl.c b/lib/tls/openssl/openssl-ssl.c
index 6e59caf1e..c214dee22 100644
--- a/lib/tls/openssl/openssl-ssl.c
+++ b/lib/tls/openssl/openssl-ssl.c
@@ -290,19 +290,26 @@ lws_ssl_capable_read(struct lws *wsi, unsigned char *buf, int len)
  	 * Because these won't signal at the network layer with POLLIN
  	 * and if we don't realize, this data will sit there forever
-	if (n != len)
-		goto bail;
-	if (!wsi->tls.ssl)
-		goto bail;

-	if (SSL_pending(wsi->tls.ssl) &&
-	    lws_dll2_is_detached(&wsi->tls.dll_pending_tls))
-		lws_dll2_add_head(&wsi->tls.dll_pending_tls,
-				  &pt->tls.dll_pending_tls_owner);
+	m = wsi->tls.ssl && SSL_pending(wsi->tls.ssl);
+	lws_pt_lock(pt, __func__);
+	if (lws_dll2_is_detached(&wsi->tls.dll_pending_tls) == m) {
+		/*
+		 * We are either detached when we now want to be on the list,
+		 * or on the list when we now want to be detached
+		 */
+		if (m)
+			/* we want to be on the list */
+			lws_dll2_add_head(&wsi->tls.dll_pending_tls,
+					  &pt->tls.dll_pending_tls_owner);
+		else
+			/* we want to be detached */
+			__lws_ssl_remove_wsi_from_buffered_list(wsi);
+	}

-	return n;
-	lws_ssl_remove_wsi_from_buffered_list(wsi);
+	lws_pt_unlock(pt, __func__);

  	return n;
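
The membership test in the patch can be read as a small truth table. This
is only a sketch with hypothetical names (toy_action, toy_sync_pending),
and it assumes both inputs are already normalized to 0 or 1: the `&&` in
the patch does that for `m`, and lws_dll2_is_detached() is assumed here to
return 0 or 1 as well.

```c
#include <assert.h>

/* Hypothetical: what to do with the wsi's pending-list membership. */
enum toy_action { TOY_NONE, TOY_ADD, TOY_REMOVE };

/*
 * detached: 1 if the wsi is not on the pending list, 0 if it is.
 * pending:  1 if the TLS layer still holds decrypted data, 0 if not.
 */
static enum toy_action toy_sync_pending(int detached, int pending)
{
	assert((detached == 0 || detached == 1) &&
	       (pending == 0 || pending == 1));

	if (detached != pending)
		return TOY_NONE;	/* membership already matches the need */

	/* detached + pending: join the list; listed + drained: leave it */
	return pending ? TOY_ADD : TOY_REMOVE;
}
```

The comparison only fires in the two cases where the current membership
disagrees with what SSL_pending() says, so repeated reads neither add the
wsi twice nor leave a drained wsi on the list.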

If that doesn't seem to affect it, on master there are now also buflists
on both the input and output sides to catch any leftovers. As with SSL,
the pt keeps a list of the wsis that have buflist content and need to be
serviced before it does a nonzero poll wait. Both pt lists get a chance in
lws_service_flag_pending() in lib/core-net/service.c to force POLLIN for
the related wsis that are on the lists.
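
Roughly, the idea is the following. This is a sketch and not the code in
service.c: toy_wsi, toy_flag_pending, toy_poll_timeout and TOY_POLLIN are
hypothetical names, and the two flags stand in for membership of the two
pt lists.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POLLIN 0x1	/* hypothetical stand-in for the POLLIN bit */

struct toy_wsi {
	int tls_pending;	/* on the (toy) TLS pending list */
	int buflist_pending;	/* on the (toy) buflist pending list */
	int revents;		/* events the service loop will act on */
};

/* Force POLLIN on every wsi with leftovers; return how many were forced. */
static int toy_flag_pending(struct toy_wsi *w, size_t n)
{
	int forced = 0;
	size_t i;

	assert(w || !n);
	for (i = 0; i < n; i++)
		if (w[i].tls_pending || w[i].buflist_pending) {
			w[i].revents |= TOY_POLLIN;
			forced++;
		}

	return forced;
}

/* Only allow a nonzero poll wait when nothing had to be forced. */
static int toy_poll_timeout(int forced, int requested_ms)
{
	return forced ? 0 : requested_ms;
}
```

The point of the zero timeout is that buffered data never generates a
network event by itself, so sleeping in poll() while either list is
non-empty is what produces the stall described above.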

