netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Mitchell Augustin <mitchell.augustin@canonical.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Jiri Pirko <jiri@resnulli.us>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jacob Martin <jacob.martin@canonical.com>,
	dann frazier <dann.frazier@canonical.com>
Subject: Re: Namespaced network devices not cleaned up properly after execution of pmtu.sh kernel selftest
Date: Thu, 12 Sep 2024 19:13:06 -0700	[thread overview]
Message-ID: <20240912191306.0cf81ce3@kernel.org> (raw)
In-Reply-To: <CAHTA-uZDaJ-71o+bo8a96TV4ck-8niimztQFaa=QoeNdUm-9wg@mail.gmail.com>

On Wed, 11 Sep 2024 17:20:29 -0500 Mitchell Augustin wrote:
> We recently identified a bug still impacting upstream, triggered
> occasionally by one of the kernel selftests (net/pmtu.sh) that
> sometimes causes the following behavior:
> * One of this tests's namespaced network devices does not get properly
> cleaned up when the namespace is destroyed, evidenced by
> `unregister_netdevice: waiting for veth_A-R1 to become free. Usage
> count = 5` appearing in the dmesg output repeatedly
> * Once we start to see the above `unregister_netdevice` message, an
> un-cancelable hang will occur on subsequent attempts to run `modprobe
> ip6_vti` or `rmmod ip6_vti`

Thanks for the report! We have seen it in our CI as well, it happens
maybe once a day. But as you say on x86 is quite hard to reproduce,
and nothing obvious stood out as a culprit.

> However, I can easily reproduce the issue on an Nvidia Grace/Hopper
> machine (and other platforms with modern CPUs) with the performance
> governor set by doing the following:
> * Install/boot any affected kernel
> * Clone the kernel tree just to get an older version of the test cases
> without subtle timing changes that mask the issue (such as
> https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/tree/?h=Ubuntu-6.8.0-39.39)
> * cd tools/testing/selftests/net
> * while true; do sudo ./pmtu.sh pmtu_ipv6_ipv6_exception; done

That's exciting! Would you be able to try to cut down the test itself
(is quite long and has a ton of sub-cases). Figure out which sub-cases
trigger this? And maybe with an even quicker repro we'll bisect or
someone will correctly guess the fix?

Somewhat tangentially but if you'd be willing I wouldn't mind if you
were to send patches to break this test up upstream, too. It takes
1h23m to run with various debug kernel options enabled. If we split 
it into multiple smaller tests each running 10min or 20min we can 
then spawn multiple VMs and get the results faster.

  reply	other threads:[~2024-09-13  2:13 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-11 22:20 Namespaced network devices not cleaned up properly after execution of pmtu.sh kernel selftest Mitchell Augustin
2024-09-13  2:13 ` Jakub Kicinski [this message]
2024-09-13 13:45   ` Mitchell Augustin
2024-09-13 13:50     ` Eric Dumazet
2024-09-13 18:51     ` Jakub Kicinski
2024-09-13 18:59       ` Mitchell Augustin
2024-09-16 19:25         ` Mitchell Augustin
2024-09-23 20:01           ` Mitchell Augustin
2024-09-23 20:14             ` Eric Dumazet
2024-09-24 16:21               ` Mitchell Augustin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240912191306.0cf81ce3@kernel.org \
    --to=kuba@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=daniel@iogearbox.net \
    --cc=dann.frazier@canonical.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jacob.martin@canonical.com \
    --cc=jiri@resnulli.us \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=mitchell.augustin@canonical.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).