From: Ido Schimmel <idosch@idosch.org>
To: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Cc: roopa@cumulusnetworks.com, nikolay@cumulusnetworks.com,
	davem@davemloft.net,
	"bridge@lists.linux-foundation.org" 
	<bridge@lists.linux-foundation.org>,
	netdev@vger.kernel.org
Subject: Re: BUG: soft lockup while deleting tap interface from vlan aware bridge
Date: Thu, 30 Apr 2020 13:55:51 +0300
Message-ID: <20200430105551.GA4068275@splinter>
In-Reply-To: <85b1e301-8189-540b-b4bf-d0902e74becc@profihost.ag>

On Wed, Apr 29, 2020 at 10:52:35PM +0200, Stefan Priebe - Profihost AG wrote:
> Hello,
> 
> while running a stable vanilla 4.19.115 kernel I can reproducibly
> trigger this one:
> 
> watchdog: BUG: soft lockup - CPU#38 stuck for 22s! [bridge:3570653]
> 
> ...
> 
> Call Trace:
>  nbp_vlan_delete+0x59/0xa0
>  br_vlan_info+0x66/0xd0
>  br_afspec+0x18c/0x1d0
>  br_dellink+0x74/0xd0
>  rtnl_bridge_dellink+0x110/0x220
>  rtnetlink_rcv_msg+0x283/0x360

Nik, Stefan,

My theory is that 4K VLANs are deleted in a batch and preemption is
disabled (please confirm). For each VLAN the kernel needs to go over the
entire FDB and delete affected entries. If the FDB is very large or the
FDB lock is contended this can cause the kernel to loop for more than 20
seconds without calling schedule().
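
As a rough illustration of the theory above, the following userspace
sketch models one full table scan per deleted VLAN. It is not kernel
code: struct toy_fdb_entry and toy_delete_vlan_range() are hypothetical
names made up for this example, and the real FDB is not a flat array.
The point is only that the work grows as (number of VLANs) x (number of
FDB entries).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one FDB entry; not the kernel's struct. */
struct toy_fdb_entry {
	uint16_t vid;   /* VLAN the entry belongs to */
	int present;    /* 1 while the entry is still in the table */
};

/*
 * Delete VLANs first_vid..last_vid one at a time. Each deletion scans the
 * whole table, mirroring the "go over the entire FDB" step in the theory.
 * Returns the total number of entries visited.
 */
static unsigned long toy_delete_vlan_range(struct toy_fdb_entry *fdb, size_t n,
					   uint16_t first_vid, uint16_t last_vid)
{
	unsigned long visited = 0;

	assert(first_vid <= last_vid);
	/* unsigned int loop counter so last_vid == 65535 cannot wrap */
	for (unsigned int vid = first_vid; vid <= last_vid; vid++) {
		for (size_t i = 0; i < n; i++) {
			visited++;
			if (fdb[i].present && fdb[i].vid == vid)
				fdb[i].present = 0;
		}
	}
	return visited;
}
```

Calling toy_delete_vlan_range(fdb, n, 1, 4094) visits 4094 * n entries
in total, so a large table multiplies the cost of a single batched
delete.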

To reproduce I added mdelay(100) in br_fdb_delete_by_port() and ran
this:

ip link add name br10 up type bridge vlan_filtering 1
ip link add name dummy10 up type dummy
ip link set dev dummy10 master br10
bridge vlan add vid 1-4094 dev dummy10 master
bridge vlan del vid 1-4094 dev dummy10 master

I got a trace similar to Stefan's. The patch below seems to fix it:

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a774e19c41bb..240e260e3461 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -615,6 +615,7 @@ int br_process_vlan_info(struct net_bridge *br,
                                               v - 1, rtm_cmd);
                                v_change_start = 0;
                        }
+                       cond_resched();
                }
                /* v_change_start is set only if the last/whole range changed */
                if (v_change_start)

WDYT?
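
A userspace analogy of what the one-line change buys, offered as a
sketch under stated assumptions rather than as a model of the kernel:
sched_yield() stands in for cond_resched() (the two are not
equivalent), and toy_longest_run() is a hypothetical helper invented
for this example. It reports the longest run of simulated FDB entries
visited without reaching a yield point, with and without a yield after
each VLAN.

```c
#include <sched.h>
#include <stddef.h>

/*
 * Userspace analogy only: sched_yield() stands in for the kernel's
 * cond_resched(); the two are not equivalent.
 * Walks `entries` simulated FDB entries for each of `vlans` VLANs and
 * returns the longest run of entries visited without a yield point.
 */
static unsigned long toy_longest_run(unsigned long vlans, unsigned long entries,
				     int yield_per_vlan)
{
	unsigned long run = 0, longest = 0;

	for (unsigned long v = 0; v < vlans; v++) {
		for (unsigned long e = 0; e < entries; e++)
			run++;               /* placeholder for per-entry work */
		if (run > longest)
			longest = run;
		if (yield_per_vlan) {
			sched_yield();       /* yield point after each VLAN */
			run = 0;
		}
	}
	return longest;
}
```

Without the per-VLAN yield the longest run is vlans * entries; with it
the run is bounded by a single table scan, which is the property the
cond_resched() call in the loop is meant to provide.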
