From: Jakub Kicinski <kubakici@wp.pl>
To: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
linux-wireless <linux-wireless@vger.kernel.org>
Subject: Re: [PATCH] mt7601u: do not schedule rx_tasklet when the device has been disconnected
Date: Thu, 6 Jun 2019 14:26:22 -0700 [thread overview]
Message-ID: <20190606142622.7e5f11de@cakuba.netronome.com> (raw)
In-Reply-To: <CAJ0CqmUySwtgzPCFR0QsZ-XA7Lm7xU8eMsELV8LEd2MEkhfnLA@mail.gmail.com>
On Thu, 6 Jun 2019 23:02:08 +0200, Lorenzo Bianconi wrote:
> > On Thu, 6 Jun 2019 14:26:12 +0200, Lorenzo Bianconi wrote:
> Hi Jakub,
>
> thx for the fast review :)
I guess I'm used to the 24h "review timeout" on netdev :)
> > Is this a new thing? I def tested unplugging the dongle under
> > traffic, but that must had been in 3.19 days :S
>
> I do not know if the issue has been introduced in recent kernel, I am
> able to trigger it in a vm running
> wireless-drivers-next and it has been reported here:
> https://bugzilla.kernel.org/show_bug.cgi?id=203249
I see. I'm just worried that we make a mistake here and it will get
backported all the way back to very old kernels because of the fixes
tag. If old kernels worked perhaps it's not worth disturbing them.
> > > Fixes: c869f77d6abb ("add mt7601u driver")
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > ---
> > > I will post a patch to fix tx side as well
> > > ---
> > > drivers/net/wireless/mediatek/mt7601u/dma.c | 33 ++++++++++-----------
> > > 1 file changed, 16 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c b/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > index f7edeffb2b19..e7703990b291 100644
> > > --- a/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > +++ b/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > @@ -193,10 +193,20 @@ static void mt7601u_complete_rx(struct urb *urb)
> > > struct mt7601u_rx_queue *q = &dev->rx_q;
> > > unsigned long flags;
> > >
> > > - spin_lock_irqsave(&dev->rx_lock, flags);
> > > + switch (urb->status) {
> > > + case -ECONNRESET:
> > > + case -ESHUTDOWN:
> > > + case -ENOENT:
> > > + return;
> >
> > So we assume this is non-recoverable? Everything will fail after?
> > Because pending is incremented linearly :S That's why there is a
> > warning here.
>
> AFAIK -ECONNRESET/-ENOENT are related to urb unlink/kill while
> -ESHUTDOWN is related to device disconnection.
> I guess we can assume the device has been removed if we get these errors
Makes sense. A bit of an implicit assumption, USB subsystem may break
this for us. Let's at least put a comment here so we can go back and
know that at the time of committing we did double check?
> > > + default:
> > > + dev_err_ratelimited(dev->dev, "rx urb failed: %d\n",
> > > + urb->status);
> > > + /* fall through */
> > > + case 0:
> > > + break;
> > > + }
> > >
> > > - if (mt7601u_urb_has_error(urb))
> > > - dev_err(dev->dev, "Error: RX urb failed:%d\n", urb->status);
> > > + spin_lock_irqsave(&dev->rx_lock, flags);
> > > if (WARN_ONCE(q->e[q->end].urb != urb, "RX urb mismatch"))
> > > goto out;
> > >
> > > @@ -363,19 +373,10 @@ int mt7601u_dma_enqueue_tx(struct mt7601u_dev *dev, struct sk_buff *skb,
> > > static void mt7601u_kill_rx(struct mt7601u_dev *dev)
> > > {
> > > int i;
> > > - unsigned long flags;
> > > -
> > > - spin_lock_irqsave(&dev->rx_lock, flags);
> > >
> > > - for (i = 0; i < dev->rx_q.entries; i++) {
> > > - int next = dev->rx_q.end;
> > > -
> > > - spin_unlock_irqrestore(&dev->rx_lock, flags);
> > > - usb_poison_urb(dev->rx_q.e[next].urb);
> > > - spin_lock_irqsave(&dev->rx_lock, flags);
> > > - }
> >
> > Why is there no need to take the lock? Admittedly it's not clear what
> > this lock is protecting here :P Perhaps a separate patch to remove the
> > unnecessary locking with an explanation?
>
> usb_poison_urb() can run concurrently with urb completion so I guess
> we do not need locks here.
> I guess we need to maintain this chunk in the same patch since now
> when the device is disconnected
> we do not increment dev->rx_q.end. What do you think?
Ah, I see! The completion used to run in between the unlock/lock!
Now we just poison out of order, because completion doesn't care about
ordering for errored URBs. Perhaps worth putting in the commit message?
next prev parent reply other threads:[~2019-06-06 21:26 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-06 12:26 [PATCH] mt7601u: do not schedule rx_tasklet when the device has been disconnected Lorenzo Bianconi
2019-06-06 18:42 ` Jakub Kicinski
2019-06-06 21:02 ` Lorenzo Bianconi
2019-06-06 21:26 ` Jakub Kicinski [this message]
2019-06-06 21:59 ` Lorenzo Bianconi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190606142622.7e5f11de@cakuba.netronome.com \
--to=kubakici@wp.pl \
--cc=linux-wireless@vger.kernel.org \
--cc=lorenzo.bianconi@redhat.com \
--cc=lorenzo@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox