[Libwebsockets] something broke since my last git pull (some hours ago)

"Andy Green (林安廸)" andy at warmcat.com
Mon Feb 18 09:39:27 CET 2013


On 18/02/13 16:10, the mail apparently from Edwin van den Oetelaar included:
> Finally I found it.
> It was not my application but the library after all.
>
> static struct libwebsocket_protocols my_protocols[] = {
>      /* first protocol must always be HTTP handler */
>      [PROTOCOL_HTTP] =
>      {
>          .name = "http-only", /* name */
>          .callback = callback_http, /* callback */
>          .per_session_data_size = 0, /* per_session_data_size */
>      },
>
>
> This code does not work anymore....
> Why? The .per_session_data_size == 0 and that breaks something...
> libwebsocket_ensure_user_space(struct libwebsocket *wsi) returns 0
> (which is OK since there is no allocated data)
> But on line 122 of handshake.c
> it is taken as a problem and a jump to 'bail' happens.
>
> lwsl_info("HTTP request for '%s'\n",
> 				lws_hdr_simple_ptr(wsi, WSI_TOKEN_GET_URI));
>
> 			if (libwebsocket_ensure_user_space(wsi) == NULL) {
> 				/* drop the header info */
> 				if (wsi->u.hdr.ah)
> 					free(wsi->u.hdr.ah);
> 				goto bail;
> 			}
>
>
> The quick hack is setting .per_session_data_size=1
> but this is not a very clean solution.
>
> Oh man this took me most of my Sunday to find....
> I finally found it just now.

Good job finding it!

Actually that API was contributed; it's another way to try to get at the 
user_space from arbitrary places, which is definitely bad news.

I added this patch

http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=2af4d5b2e272232433ed7a6598f0de23d31b76c0

to both remove that API from the public header and change all the 
private uses to accept 0 for OK and 1 for error, correcting the 
overloaded NULL return code, which is the basic problem.

Thanks a lot for your efforts tracking it down.

However, if that's the problem I'm not sure why it acted in a 
nondeterministic fashion... I'm pretty sure I got your point and fixed 
it but maybe something else is lurking around.

-Andy


> Greetings,
> Edwin van den Oetelaar
>
>
> On Mon, Feb 18, 2013 at 8:41 AM, Edwin van den Oetelaar
> <oetelaar.automatisering at gmail.com> wrote:
>> Ok, thanks, I will tell you what happens.
>>
>> On Mon, Feb 18, 2013 at 8:30 AM, "Andy Green (林安廸)" <andy at warmcat.com> wrote:
>>> On 18/02/13 15:12, the mail apparently from "Andy Green (林安廸)" included:
>>>
>>>> On 18/02/13 14:58, the mail apparently from Edwin van den Oetelaar
>>>> included:
>>>>>
>>>>> Hello Andy,
>>>>> I guess it is something on my system that has gone wrong.
>>>>> I compiled your test-server on a different machine and it behaves
>>>>> differently.
>>>>> Good thing we have git, (and mercurial and svn) to track all the changes.
>>>>> I will let you know how I messed up my system.
>>>>> My first guess is mixing autoconf with cmake build process.
>>>>> On my system the cmake build will not work as documented, it will not
>>>>> build inside a 'build' directory just inside the 'root'.
>>>>
>>>>
>>>> I have not tried the CMake stuff yet.  I have tried to keep it up to
>>>> date with other changes though, but I couldn't see where it lists things
>>>> for /usr/share, how it can make dist / distclean etc.  So although I am
>>>> glad it is there for Windows people I think Unix guys might be using
>>>> autotools for a while.  Now we have the autogen.sh thing that doesn't
>>>> seem to cause problems any more.  (However trying to offer an untested
>>>> Windows build before the CMake stuff was unworkable, so it is definitely
>>>> a big advantage.)
>>>>
>>>> Having said that though, I don't know how CMake activities could tread
>>>> on autotools; if you restart from autogen.sh it should force everything
>>>> in good shape AFAIK.
>>>>
>>>> You might want to do a sanity check of building static on the good
>>>> machine, confirm it's still good then copy over the test server to the
>>>> bad machine for a test.  That'll tell you if it's something in the
>>>> environment of that particular machine (assuming it's the same ARCH).
>>>>
>>>> Another idea is to tar up the good build dir and try to just link on the
>>>> bad machine, then after that make clean etc. and see if there's a point it
>>>> breaks down... if it only works with static libs from the other machine
>>>> that's a big clue, if the good static binary fails too it suggests other
>>>> things etc.
>>>
>>>
>>> ... something else to try if your compiler is different on the good and bad
>>> boxes: we used to always build with -O2, but since this commit a week ago
>>>
>>> http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/commit/?id=a3957ef80454f8772c683a05930ce20c0d1bf2f8
>>>
>>> we build -O0 if you have debugging on (default) and -O4 if you have
>>> --disable-debug.  That also knocks out the lwsl_ logs less critical than
>>> "notice" from the compile.
>>>
>>> You might want to try undoing that patch's action
>>>
>>> +: ${CFLAGS=""}
>>>
>>> which should force things to -O2 again and see if that's anything to do with
>>> it.
>>>
>>> -Andy



