From: Stephen Hemminger <stephen@networkplumber.org>
To: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
Network Development <netdev@vger.kernel.org>
Subject: Re: rtnl_trylock() versus SCHED_FIFO lockup
Date: Fri, 7 Aug 2020 08:03:32 -0700 [thread overview]
Message-ID: <20200807080332.3d31231d@hermes.lan> (raw)
In-Reply-To: <29a82363-411c-6f2b-9f55-97482504e453@prevas.dk>
On Fri, 7 Aug 2020 10:03:59 +0200
Rasmus Villemoes <rasmus.villemoes@prevas.dk> wrote:
> On 07/08/2020 05.39, Stephen Hemminger wrote:
> > On Thu, 6 Aug 2020 12:46:43 +0300
> > Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >
> >> On 06/08/2020 12:17, Rasmus Villemoes wrote:
> >>> On 06/08/2020 01.34, Stephen Hemminger wrote:
> >>>> On Wed, 5 Aug 2020 16:25:23 +0200
>
> >>
> >> Hi Rasmus,
> >> I haven't tested anything but git history (and some grepping) points to deadlocks when
> >> sysfs entries are being changed under rtnl.
> >> For example check: af38f2989572704a846a5577b5ab3b1e2885cbfb and 336ca57c3b4e2b58ea3273e6d978ab3dfa387b4c
> >> This is a common usage pattern throughout net/, the bridge is not the only case and there are more
> >> commits which talk about deadlocks.
> >> Again I haven't verified anything but it seems on device delete (w/ rtnl held) -> sysfs delete
> >> would wait for current readers, but current readers might be stuck waiting on rtnl and we can deadlock.
> >>
> >
> > I was referring to AB BA lock inversion problems.
>
> Ah, so lock inversion, not priority inversion.
>
> >
> > Yes the trylock goes back to:
> >
> > commit af38f2989572704a846a5577b5ab3b1e2885cbfb
> > Author: Eric W. Biederman <ebiederm@xmission.com>
> > Date: Wed May 13 17:00:41 2009 +0000
> >
> > net: Fix bridgeing sysfs handling of rtnl_lock
> >
> > Holding rtnl_lock when we are unregistering the sysfs files can
> > deadlock if we unconditionally take rtnl_lock in a sysfs file. So fix
> > it with the now familiar patter of: rtnl_trylock and syscall_restart()
> >
> > Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> >
> >
> > The problem is that the unregister of netdevice happens under rtnl and
> > this unregister path has to remove sysfs and other objects.
> > So those objects have to have conditional locking.
> I see. And the reason the "trylock, unwind all the way back to syscall
> entry and start over" works is that we then go through
>
> kernfs_fop_write()
> mutex_lock(&of->mutex);
> if (!kernfs_get_active(of->kn)) {
> mutex_unlock(&of->mutex);
> len = -ENODEV;
> goto out_free;
> }
>
> which makes the write fail with ENODEV if the sysfs node has already
> been marked for removal.
>
> If I'm reading the code correctly, doing "ip link set dev foobar type
> bridge fdb_flush" is equivalent to writing to that sysfs file, except
> the former ends up doing an unconditional rtnl_lock() and thus won't
> have the livelocking issue.
>
> Thanks,
> Rasmus
ip commands use netlink, and netlink doesn't have the problem because
it doesn't go through a filesystem API.
next prev parent reply other threads:[~2020-08-07 15:03 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-05 14:25 rtnl_trylock() versus SCHED_FIFO lockup Rasmus Villemoes
2020-08-05 23:34 ` Stephen Hemminger
2020-08-06 9:17 ` Rasmus Villemoes
2020-08-06 9:46 ` Nikolay Aleksandrov
2020-08-07 3:39 ` Stephen Hemminger
2020-08-07 8:03 ` Rasmus Villemoes
2020-08-07 15:03 ` Stephen Hemminger [this message]
[not found] ` <20200809134924.12056-1-hdanton@sina.com>
2020-08-09 14:12 ` Nikolay Aleksandrov
2020-08-09 14:18 ` Nikolay Aleksandrov
2020-08-09 15:32 ` Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200807080332.3d31231d@hermes.lan \
--to=stephen@networkplumber.org \
--cc=netdev@vger.kernel.org \
--cc=nikolay@cumulusnetworks.com \
--cc=rasmus.villemoes@prevas.dk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.