[Libwebsockets] Re: Re: Re: A question about libwebsockets dns resolve

Andy Green andy at warmcat.com
Tue May 26 10:17:13 CEST 2020



On 5/26/20 4:16 AM, andy at warmcat.com wrote:
> 
> 
> On May 26, 2020 3:05:51 AM UTC, huangkaicheng <huangkaicheng at huawei.com> wrote:
>> Can you change the second and later IPs of "onevalid.bogus.warmcat.com"
>> from 127.0.0.x to real IPs (like 10.173.16.193; unreachable is also OK) for
> 
> I also have blackhole.bogus.warmcat.com, which already resolves to 5.5.5.5 plus the correct address, and I test with that.
> 
>> testing. I cannot change your DNS settings. The latest libwebsockets
>> code still does not work. The second and later IPs still do not wait.
>> Only the first IP waits until the timeout. The old version (last week)
>> did wait. You can just change the IPs and test, and you can reproduce
>> it. The old version waits until the timeout but the latest master code
>> does not.
> 
> We're going around in circles because I can't get Windows to do anything other than use the correct address first.  On Linux, each time I run the test it rotates the DNS results to start at a different place, so by running it a few times I can see the different behaviours.  On Windows it connects to the server every time.
> 
> I'll try adding more junk in a new bogus.warmcat.com record and see what happens.

I trimmed the entries in each bogus record to 3 so with the default 5s 
wait / 20s overall it can always complete, and added two new ones, 'junk' 
and 'revjunk':

onevalid.bogus          IN      A       127.0.0.1
                         IN      A       127.0.0.2
                         IN      A       46.105.127.147
blackhole.bogus         IN      A       5.5.5.5
                         IN      A       5.5.5.5
                         IN      A       46.105.127.147
junk.bogus              IN      A       10.99.99.99
                         IN      A       5.5.5.5
                         IN      A       46.105.127.147
revjunk.bogus           IN      A       46.105.127.147
                         IN      A       5.5.5.5
                         IN      A       10.99.99.98
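
The arithmetic behind trimming each record to three entries can be checked with a small Python sketch. The function name is hypothetical, and the strict comparison against the overall budget is an assumption for illustration, not something taken from the lws sources; the 5s and 20s values are the defaults mentioned above.

```python
def worst_case_wait(n_addresses, per_attempt_s=5, overall_s=20):
    """Return (worst_case_seconds, fits) for a record with n_addresses entries.

    Assumes every address is blackholed, so each attempt burns the full
    per-attempt wait. "fits" uses a strict comparison (an assumption): the
    last attempt must finish before the overall connect timeout fires.
    """
    worst = n_addresses * per_attempt_s
    return worst, worst < overall_s


# Three entries: 15s worst case, inside the 20s overall budget.
for n in (3, 4, 5):
    print(n, worst_case_wait(n))
```

With three entries, even the worst ordering (both bad addresses first) reaches the valid address after 10s, well inside the overall budget.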

On Linux you get a different ordering of the results each time. Using 
what's in git from yesterday, on Linux...

[2020/05/26 04:40:17:6377] I: lws_client_connect_2_dnsreq: 0x2278440: 
lookup junk.bogus.warmcat.com:443
[2020/05/26 04:40:17:6395] I: lws_getaddrinfo46: getaddrinfo 
'junk.bogus.warmcat.com' says 0
[2020/05/26 04:40:17:6395] I: lws_client_connect_3_connect: 
junk.bogus.warmcat.com ipv4 10.99.99.99
[2020/05/26 04:40:17:6396] I: lws_state_transition_steps: 
CONTEXT_CREATED -> OPERATIONAL

[2020/05/26 04:40:22:6397] I: lws_client_conn_wait_timeout: connect wait 
timeout has fired
[2020/05/26 04:40:22:6398] I: lws_client_connect_3_connect: abandoning 
connect due to timeout
[2020/05/26 04:40:22:6398] I: lws_client_connect_3_connect: (null) ipv4 
5.5.5.5

[2020/05/26 04:40:27:6401] I: lws_client_conn_wait_timeout: connect wait 
timeout has fired
[2020/05/26 04:40:27:6402] I: lws_client_connect_3_connect: abandoning 
connect due to timeout

[2020/05/26 04:40:27:6402] I: lws_client_connect_3_connect: (null) ipv4 
46.105.127.147
[2020/05/26 04:40:27:6584] I: lws_client_connect_3_connect: getsockopt 
check: conn OK
[2020/05/26 04:40:27:6585] I: lws_client_connect_3_connect: Connection 
started 0x2299660
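
The sequence in that log can be modelled with a small Python simulation. This is only a sketch of the behaviour described in this thread, not the lws implementation: the function name and the "ok" / "drop" / "refused" behaviour labels are hypothetical, a successful connect is treated as instantaneous, and the 5s / 20s values are the defaults mentioned earlier.

```python
def walk_dns_results(results, per_attempt_s=5, overall_s=20):
    """Simulate trying each resolved address in order.

    results: list of (address, behaviour), behaviour being one of
      "ok"      - the connect succeeds (treated as instantaneous here)
      "drop"    - packets vanish, so the attempt waits for its timeout
      "refused" - the peer resets the connection, so it fails at once
    Returns (connected_address_or_None, timeline), where timeline is a
    list of (start_time_s, address, outcome).
    """
    now = 0
    timeline = []
    for address, behaviour in results:
        if now >= overall_s:
            break  # overall connect budget is used up
        if behaviour == "ok":
            timeline.append((now, address, "connected"))
            return address, timeline
        if behaviour == "drop":
            timeline.append((now, address, "timeout"))
            now += min(per_attempt_s, overall_s - now)
        elif behaviour == "refused":
            timeline.append((now, address, "refused"))
        else:
            raise ValueError("unknown behaviour: %r" % (behaviour,))
    return None, timeline


# The junk.bogus record, tried in the order it is listed in the zone.
junk = [("10.99.99.99", "drop"), ("5.5.5.5", "drop"),
        ("46.105.127.147", "ok")]
print(walk_dns_results(junk))
```

For the junk.bogus ordering, the simulated attempts start at 0s, 5s and 10s, which matches the 5s spacing of the timestamps in the log. An address that refuses the connection costs no wait at all in this model.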

ie, it is doing what it is supposed to do.  When you say, without 
qualification

 >>  Latest code libwebsockets
 >> does not still work. Second Ip (and later ip)does still not try wait.

this is adding to the confusion... it works fine on Linux.  And it 
worked fine there before yesterday's patch, which, at least as far as 
separating out the timeout control goes, also moves things forward in 
the right direction.

On Windows, I thought I had figured out that its libc api always reports 
the DNS results in reverse order to the record, ie, the last response in 
the record is the first reported one.  That is why, with my existing DNS 
test records, Windows always magically "found" the correct server first, 
so I could not test the bad responses.  But it seems whatever it's doing 
is more complicated.  With 4 x test subdomains I could usually get one of 
them to try the bad ones first and test it, though.
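
To see which ordering a given platform's resolver reports, something like the following Python sketch can be run a few times on each OS. It uses the standard `socket.getaddrinfo`; the helper name is hypothetical, and whether the order changes between calls depends on the resolver and any caching in between.

```python
import socket


def resolve_ipv4_in_order(host, port=443):
    """Return the IPv4 addresses for host in the order the resolver reports them."""
    infos = socket.getaddrinfo(host, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return [sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in infos]
```

For example, calling `resolve_ipv4_in_order("junk.bogus.warmcat.com")` several times on Linux and on Windows would show whether the first reported address is one of the bad ones.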

I think I solved the problem and again updated the original patch, 
please give master a try.

-Andy

> -Andy
> 
>> 	
>> -----Original Message-----
>> From: andy at warmcat.com [mailto:andy at warmcat.com]
>> Sent: May 26, 2020 3:17
>> To: huangkaicheng <huangkaicheng at huawei.com>; libwebsockets
>> <libwebsockets at ml.libwebsockets.org>
>> Cc: Chenyake <chenyake at huawei.com>
>> Subject: Re: Re: [Libwebsockets] Re: A question about libwebsockets dns
>> resolve
>>
>>
>>
>> On May 25, 2020 11:04:00 AM UTC, huangkaicheng
>> <huangkaicheng at huawei.com> wrote:
>>> Hi,
>>> I mean that for the first resolved IP, the connect waits until it
>>> times out, but for the second, third and later IPs it does not. It is
>>> not just 127.0.x. To make sure, you can change the DNS for
>>> onevalid.bogus.warmcat.com to resolve to something like
>>> (46.105.127.147 (the only reachable one), 2.3.5.4, 10.173.16.193,
>>> 5.45.86.4). For the other, unreachable IPs it should wait until the
>>> connect times out. The old version (a week ago) waited for some time
>>> rather than finishing quickly. The latest code behaves quite
>>> differently from that version.
>>
>> I looked at this earlier and pushed a patch adding a separate
>> configurable timeout for the whole connect (default 20s) and reduced
>> the timeout for individual dns connect attempts to default to 5s.
>>
>> https://libwebsockets.org/git/libwebsockets/commit?id=9f4c19fd9d9dede1ec856ce4774d46cb4b79b26c
>>
>> It seems to work as before on Linux.  On Windows I cannot control the
>> DNS ordering; both my test DNS records 'just work' each time for
>> whatever reason, and 127.xxx is never tried.
>>
>> -Andy
>>
>>>
>>> -----Original Message-----
>>> From: Andy Green [mailto:andy at warmcat.com]
>>> Sent: May 25, 2020 17:03
>>> To: huangkaicheng <huangkaicheng at huawei.com>; libwebsockets
>>> <libwebsockets at ml.libwebsockets.org>
>>> Cc: Chenyake <chenyake at huawei.com>
>>> Subject: Re: Re: [Libwebsockets] Re: A question about libwebsockets dns
>>> resolve
>>>
>>>
>>>
>>> On 5/25/20 9:31 AM, huangkaicheng wrote:
>>>> Hi ,
>>>>
>>>>        I previously used the test-client project in libwebsockets. It is
>>>> not my code; it is the code in your project.
>>>
>>> OK, fair enough.
>>>
>>> But what I mean is, please make it easy for me to reproduce your
>>> problem, ie, a minimal example; if there is a diff, give me the diff,
>>> and give me the commandline.  Then I know that if I can spare a few
>>> minutes, I can stop what I am doing and look at it (and I know
>>> immediately that it is about a minimal example, which makes me much
>>> more willing to stop and look).
>>>
>>> As Jaco says, I am not sure if the "timedout" log just confuses the
>>> issue.  On some platforms, a closed port on 127.0.0.x acts like it is
>>> closed and sends you a RST; on other platforms (IIRC OSX) it acts like
>>> it was DROPped and waits.  If the Windows platform resets the
>>> connection, like Linux does, it will not wait around and will just fail
>>> immediately.  The log may then not reflect the real reason it gave up,
>>> which is not ideal but not really a crisis.
>>>
>>> -Andy
>>>
>>>>
>>>> [inline image: image002.png]
>>>>
>>>> mkdir build
>>>>
>>>> cd build
>>>>
>>>> cmake .. -DLWS_WITH_SSL=0
>>>>
>>>> cmake --build . --config DEBUG
>>>>
>>>>        And I use
>>>>
>>>> [inline image: image001.png]
>>>>
>>>>         If I want to use minimal-ws-client, how can I build it
>>>> successfully? My build failed.
>>>>
>>>> -----Original Message-----
>>>> From: Andy Green [mailto:andy at warmcat.com]
>>>> Sent: May 25, 2020 15:50
>>>> To: huangkaicheng <huangkaicheng at huawei.com>; libwebsockets
>>>> <libwebsockets at ml.libwebsockets.org>
>>>> Cc: Chenyake <chenyake at huawei.com>
>>>> Subject: Re: Re: [Libwebsockets] Re: A question about libwebsockets dns
>>>> resolve
>>>>
>>>> On 5/25/20 8:35 AM, huangkaicheng wrote:
>>>>
>>>>> Hi,
>>>>
>>>>>
>>>>
>>>>>   There is still something wrong with the latest code. Why do attempts
>>>>
>>>>> to connect to 127.0.0.1, 127.0.0.3, 127.0.0.2 time out so quickly?
>>>>
>>>> Can you please show me this using the lws minimal examples rather than
>>>> your code?  That way I can try the same thing quickly and have some
>>>> reason to think we are looking at the same problem, and if I change
>>>> something, that it solves the problem.
>>>>
>>>> -Andy
>>>>
> _______________________________________________
> Libwebsockets mailing list
> Libwebsockets at ml.libwebsockets.org
> https://libwebsockets.org/mailman/listinfo/libwebsockets
> 

