From: "Mantas Mikulėnas" <grawity@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Subject: Re: traceroute failure in kernel 6.1 and 6.2
Date: Tue, 24 Jan 2023 07:34:20 +0200
Message-ID: <b2ecff1c-91ad-4217-7fd5-d7bbd5704abe@gmail.com>
In-Reply-To: <CANn89iK7nn6tdQg9QZO_Gudx1BvLxhoLaNYmnOLb6ccYQnLGwg@mail.gmail.com>
On 24/01/2023 00.26, Eric Dumazet wrote:
> On Mon, Jan 23, 2023 at 10:45 PM Mantas Mikulėnas <grawity@gmail.com> wrote:
>>
>> On 23/01/2023 22.56, Eric Dumazet wrote:
>>> On Mon, Jan 23, 2023 at 8:25 PM Mantas Mikulėnas <grawity@gmail.com> wrote:
>>>>
>>>> On 2023-01-23 17:21, Eric Dumazet wrote:
>>>>> On Sat, Jan 21, 2023 at 7:09 PM Mantas Mikulėnas <grawity@gmail.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Not sure whether this has been reported, but:
>>>>>>
>>>>>> After upgrading from kernel 6.0.7 to 6.1.6 on Arch Linux, unprivileged
>>>>>> ICMP traceroute using the `traceroute -I` tool stopped working – it very
>>>>>> reliably fails with a "No route to host" at some point:
>>>>>>
>>>>>> myth> traceroute -I 83.171.33.188
>>>>>> traceroute to 83.171.33.188 (83.171.33.188), 30 hops max, 60
>>>>>> byte packets
>>>>>> 1 _gateway (192.168.1.1) 0.819 ms
>>>>>> send: No route to host
>>>>>> [exited with 1]
>>>>>>
>>>>>> while it still works for root:
>>>>>>
>>>>>> myth> sudo traceroute -I 83.171.33.188
>>>>>> traceroute to 83.171.33.188 (83.171.33.188), 30 hops max, 60
>>>>>> byte packets
>>>>>> 1 _gateway (192.168.1.1) 0.771 ms
>>>>>> 2 * * *
>>>>>> 3 10.69.21.145 (10.69.21.145) 47.194 ms
>>>>>> 4 82-135-179-168.static.zebra.lt (82.135.179.168) 49.124 ms
>>>>>> 5 213-190-41-3.static.telecom.lt (213.190.41.3) 44.211 ms
>>>>>> 6 193.219.153.25 (193.219.153.25) 77.171 ms
>>>>>> 7 83.171.33.188 (83.171.33.188) 78.198 ms
>>>>>>
>>>>>> According to `git bisect`, this started with:
>>>>>>
>>>>>> commit 0d24148bd276ead5708ef56a4725580555bb48a3
>>>>>> Author: Eric Dumazet <edumazet@google.com>
>>>>>> Date: Tue Oct 11 14:27:29 2022 -0700
>>>>>>
>>>>>> inet: ping: fix recent breakage
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> It still happens with a fresh 6.2rc build, unless I revert that commit.
>>>>>>
>>>>>> The /bin/traceroute is the one that calls itself "Modern traceroute for
>>>>>> Linux, version 2.1.1", on Arch Linux. It seems to use socket(AF_INET,
>>>>>> SOCK_DGRAM, IPPROTO_ICMP), has neither setuid nor file capabilities.
>>>>>> (The problem does not occur if I run it as root.)
>>>>>>
>>>>>> This version of `traceroute` sends multiple probes at once (with TTLs
>>>>>> 1..16); according to strace, the first approx. 8-12 probes are sent
>>>>>> successfully, but eventually sendto() fails with EHOSTUNREACH. (Though
>>>>>> if I run it on local tty as opposed to SSH, it fails earlier.) If I use
>>>>>> -N1 to have it only send one probe at a time, the problem doesn't seem
>>>>>> to occur.
>>>>>
>>>>>
>>>>>
>>>>> I was not able to reproduce the issue (downloading
>>>>> https://sourceforge.net/projects/traceroute/files/latest/download)
>>>>>
>>>>> I suspect some kind of bug in this traceroute, when/if some ICMP error
>>>>> comes back.
>>>>>
>>>>> Double check by
>>>>>
>>>>> tcpdump -i ethXXXX icmp
>>>>>
>>>>> While you run traceroute -I ....
>>>>
>>>> Hmm, no, the only ICMP errors I see in tcpdump are "Time exceeded in
>>>> transit", which is expected for traceroute. Nothing else shows up.
>>>>
>>>> (But when I test against an address that causes *real* ICMP "Host
>>>> unreachable" errors, it seems to handle those correctly and prints "!H"
>>>> as usual -- that is, if it reaches that point without dying.)
>>>>
>>>> I was able to reproduce this on a fresh Linode 1G instance (starting
>>>> with their Arch image), where it also happens immediately:
>>>>
>>>> # pacman -Sy archlinux-keyring
>>>> # pacman -Syu
>>>> # pacman -Sy traceroute strace
>>>> # reboot
>>>> # uname -r
>>>> 6.1.7-arch1-1
>>>> # useradd foo
>>>> # su -c "traceroute -I 8.8.8.8" foo
>>>> traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
>>>> 1 10.210.1.195 (10.210.1.195) 0.209 ms
>>>> send: No route to host
>>>>
>>>> So now I'm fairly sure it is not something caused by my own network, either.
>>>>
>>>> On one system, it seems to work properly about half the time, if I keep
>>>> re-running the same command.
>>>>
>>>
>>> Here, running the latest upstream tree and latest traceroute, I have no issue.
>>>
>>> Send us :
>>>
>>> 1) strace output
>>> 2) icmp packet capture.
>>>
>>> Thanks.
>>
>> Attached both.
>
> Thanks.
>
> I think it is a bug in this traceroute version, pushing too many
> sendmsg() at once and hitting socket SNDBUF limit
>
> If the sendmsg() is blocked in sock_alloc_send_pskb, it might abort
> because an incoming ICMP message sets sk->sk_err
>
> It might have worked in the past, by pure luck.
>
> Try to increase /proc/sys/net/core/wmem_default
>
> If this solves the issue, I would advise sending a patch to traceroute to :
>
> 1) attempt to increase SO_SNDBUF accordingly
> 2) use non blocking sendmsg() api to sense how many packets can be
> queued in qdisc/NIC queues
> 3) reduce number of parallel messages (current traceroute behavior
> looks like a flood to me)
It doesn't solve the issue; I tried bumping it from the default of
212992 to 4096-times-that, with exactly the same results.
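
For reference, one way to confirm that a raised wmem_default actually reaches newly created sockets is to compare the sysctl with what SO_SNDBUF reports on a fresh datagram socket. The following is only a minimal sketch, assuming Linux; the helper names are made up for illustration, and it opens a plain UDP socket rather than the ICMP socket that traceroute uses:

```python
import socket

WMEM_DEFAULT_PATH = "/proc/sys/net/core/wmem_default"


def read_wmem_default(path=WMEM_DEFAULT_PATH):
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def fresh_socket_sndbuf():
    """Return SO_SNDBUF as reported for a newly created UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    print("wmem_default:", read_wmem_default())
    print("SO_SNDBUF on a new socket:", fresh_socket_sndbuf())
```

If the two numbers differ after changing the sysctl, the new default is probably not being applied to the sockets under test. For scale, 4096 times the default of 212992 is 872415232 bytes.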

The number of packets it's able to send varies. For example, right now,
on my regular VM (which is smaller than the PC that yesterday's trace
was done on), the program very consistently fails on the *second*
sendto() call -- I don't think two packets is an unreasonable amount.

The program has -q and -N options to reduce the number of simultaneous
probes, but they only have an effect if I reduce it all the way down to
just one probe at a time.
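
The behaviour described above can be approximated without traceroute itself. The sketch below opens the same kind of unprivileged socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP) and sends one echo request per TTL back to back, recording the errno of each sendto(). It is a rough illustration, not a copy of what traceroute does: the function names are hypothetical, it assumes a single socket shared by all probes (not checked against the traceroute source), and it requires that net.ipv4.ping_group_range permits the calling user.

```python
import errno
import os
import socket
import struct
import sys


def icmp_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(seq: int, ident: int = 0, payload: bytes = b"\x00" * 32) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def send_probes(dest: str, max_ttl: int = 16):
    """Send one probe per TTL without waiting for replies.

    Returns a list of (ttl, errno_or_None), one entry per sendto() call.
    """
    results = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP) as s:
        for ttl in range(1, max_ttl + 1):
            s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
            try:
                s.sendto(build_echo_request(seq=ttl), (dest, 0))
                results.append((ttl, None))
            except OSError as e:
                results.append((ttl, e.errno))
    return results


if __name__ == "__main__":
    dest = sys.argv[1] if len(sys.argv) > 1 else "8.8.8.8"
    try:
        for ttl, err in send_probes(dest):
            status = "ok" if err is None else f"{errno.errorcode.get(err, err)}: {os.strerror(err)}"
            print(f"ttl={ttl:2d} {status}")
    except PermissionError:
        print("ICMP datagram sockets are not permitted for this user "
              "(check net.ipv4.ping_group_range)")
```

Run as a non-root user, a sendto() that returns EHOSTUNREACH while tcpdump shows only "Time exceeded in transit" replies would match what strace reports for traceroute.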