public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Oltean <vladimir.oltean@nxp.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "David S . Miller" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"UNGLinuxDriver@microchip.com" <UNGLinuxDriver@microchip.com>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	Andrew Lunn <andrew@lunn.ch>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Horatiu Vultur <horatiu.vultur@microchip.com>,
	"Allan W . Nielsen" <allan.nielsen@microchip.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	Steen Hegelund <steen.hegelund@microchip.com>
Subject: Re: [PATCH v2 net] net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
Date: Fri, 4 Dec 2020 18:12:51 +0000	[thread overview]
Message-ID: <20201204181250.t5d4hc7wis7pzqa2@skbuf> (raw)
In-Reply-To: <20201204100021.0b026725@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>

On Fri, Dec 04, 2020 at 10:00:21AM -0800, Jakub Kicinski wrote:
> On Fri,  4 Dec 2020 19:51:25 +0200 Vladimir Oltean wrote:
> > Currently ocelot_set_rx_mode calls ocelot_mact_learn directly, which has
> > a very nice ocelot_mact_wait_for_completion at the end. Introduced in
> > commit 639c1b2625af ("net: mscc: ocelot: Register poll timeout should be
> > wall time not attempts"), this function uses readx_poll_timeout which
> > triggers a lot of lockdep warnings and is also dangerous to use from
> > atomic context, leading to lockups and panics.
> >
> > Steen Hegelund added a poll timeout of 100 ms for checking the MAC
> > table, a duration which is clearly absurd to poll in atomic context.
> > So we need to defer the MAC table access to process context, which we do
> > via a dynamically allocated workqueue which contains all there is to
> > know about the MAC table operation it has to do.
> >
> > Fixes: 639c1b2625af ("net: mscc: ocelot: Register poll timeout should be wall time not attempts")
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> > ---
> > Changes in v2:
> > - Added Fixes tag (it won't backport that far, but anyway)
> > - Using get_device and put_device to avoid racing with unbind
>
> Does get_device really protect you from unbind? I thought it only
> protects you from .release being called, IOW freeing struct device
> memory..

Possibly.
I ran a bind && unbind loop for a while, and I couldn't trigger any
concurrency.

> More usual way of handling this would be allocating your own workqueue
> and flushing that wq at the right point.

Yeah, well I more or less deliberately lose track of the workqueue as
soon as ocelot_enqueue_mact_action is over, and that is by design. There
is potentially more than one address to offload to the hardware in progress
at the same time, and any sort of serialization in .ndo_set_rx_mode (so
I could add the workqueue to a list of items to cancel on unbind)
would mean
(a) more complicated code
(b) more busy waiting

> >  drivers/net/ethernet/mscc/ocelot_net.c | 83 +++++++++++++++++++++++++-
> >  1 file changed, 80 insertions(+), 3 deletions(-)
>
> This is a little large for a rc7 fix :S

Fine, net-next it is then.

> What's the expected poll time? maybe it's not that bad to busy wait?
> Clearly nobody noticed the issue in 2 years (you mention lockdep so
> not a "real" deadlock) which makes me think the wait can't be that long?

Not too much, but the sleep is there.
Also, all 3 of ocelot/felix/seville are memory-mapped devices. But I
heard from Alex a while ago that he intends to add support for a switch
managed over a slow bus like SPI, and to use the same regmap infrastructure.
That would mean that this problem would need to be resolved anyway.

> Also for a reference - there are drivers out there with busy poll
> timeout of seconds :/

Yeah, not sure if that tells me anything. I prefer avoiding that from
atomic context, because our cyclictest numbers are not that great anyway,
the last thing I want is to make them worse.

  reply	other threads:[~2020-12-04 18:13 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04 17:51 [PATCH v2 net] net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context Vladimir Oltean
2020-12-04 18:00 ` Jakub Kicinski
2020-12-04 18:12   ` Vladimir Oltean [this message]
2020-12-04 18:15     ` Alexandre Belloni
2020-12-04 18:55     ` Jakub Kicinski
2020-12-04 19:23       ` Vladimir Oltean
2020-12-05  0:41         ` Jakub Kicinski
2020-12-04 19:24       ` Jakub Kicinski
2020-12-04 19:26       ` Alexandre Belloni
2020-12-04 18:25   ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201204181250.t5d4hc7wis7pzqa2@skbuf \
    --to=vladimir.oltean@nxp.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=alexandre.belloni@bootlin.com \
    --cc=allan.nielsen@microchip.com \
    --cc=andrew@lunn.ch \
    --cc=claudiu.manoil@nxp.com \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=horatiu.vultur@microchip.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=steen.hegelund@microchip.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox