Linux wireless drivers development
 help / color / mirror / Atom feed
From: Jakub Kicinski <kubakici@wp.pl>
To: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
	linux-wireless <linux-wireless@vger.kernel.org>
Subject: Re: [PATCH] mt7601u: do not schedule rx_tasklet when the device has been disconnected
Date: Thu, 6 Jun 2019 14:26:22 -0700	[thread overview]
Message-ID: <20190606142622.7e5f11de@cakuba.netronome.com> (raw)
In-Reply-To: <CAJ0CqmUySwtgzPCFR0QsZ-XA7Lm7xU8eMsELV8LEd2MEkhfnLA@mail.gmail.com>

On Thu, 6 Jun 2019 23:02:08 +0200, Lorenzo Bianconi wrote:
> > On Thu,  6 Jun 2019 14:26:12 +0200, Lorenzo Bianconi wrote:  
> Hi Jakub,
> 
> thx for the fast review :)

I guess I'm used to the 24h "review timeout" on netdev :)

> > Is this a new thing?  I def tested unplugging the dongle under
> > traffic, but that must had been in 3.19 days :S
> 
> I do not know if the issue has been introduced in recent kernel, I am
> able to trigger it in a vm running
> wireless-drivers-next and it has been reported here:
> https://bugzilla.kernel.org/show_bug.cgi?id=203249

I see.  I'm just worried that we make a mistake here and it will get
backported all the way back to very old kernels because of the fixes
tag.  If old kernels worked perhaps it's not worth disturbing them.

> > > Fixes: c869f77d6abb ("add mt7601u driver")
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > ---
> > > I will post a patch to fix tx side as well
> > > ---
> > >  drivers/net/wireless/mediatek/mt7601u/dma.c | 33 ++++++++++-----------
> > >  1 file changed, 16 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c b/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > index f7edeffb2b19..e7703990b291 100644
> > > --- a/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > +++ b/drivers/net/wireless/mediatek/mt7601u/dma.c
> > > @@ -193,10 +193,20 @@ static void mt7601u_complete_rx(struct urb *urb)
> > >       struct mt7601u_rx_queue *q = &dev->rx_q;
> > >       unsigned long flags;
> > >
> > > -     spin_lock_irqsave(&dev->rx_lock, flags);
> > > +     switch (urb->status) {
> > > +     case -ECONNRESET:
> > > +     case -ESHUTDOWN:
> > > +     case -ENOENT:
> > > +             return;  
> >
> > So we assume this is non-recoverable?  Everything will fail after?
> > Because pending is incremented linearly :S  That's why there is a
> > warning here.  
> 
> AFAIK -ECONNRESET/-ENOENT are related to urb unlink/kill while
> -ESHUTDOWN is related to device disconnection.
> I guess we can assume the device has been removed if we get these errors

Makes sense.  A bit of an implicit assumption, USB subsystem may break
this for us.  Let's at least put a comment here so we can go back and
know that at the time of committing we did double check?

> > > +     default:
> > > +             dev_err_ratelimited(dev->dev, "rx urb failed: %d\n",
> > > +                                 urb->status);
> > > +             /* fall through */
> > > +     case 0:
> > > +             break;
> > > +     }
> > >
> > > -     if (mt7601u_urb_has_error(urb))
> > > -             dev_err(dev->dev, "Error: RX urb failed:%d\n", urb->status);
> > > +     spin_lock_irqsave(&dev->rx_lock, flags);
> > >       if (WARN_ONCE(q->e[q->end].urb != urb, "RX urb mismatch"))
> > >               goto out;
> > >
> > > @@ -363,19 +373,10 @@ int mt7601u_dma_enqueue_tx(struct mt7601u_dev *dev, struct sk_buff *skb,
> > >  static void mt7601u_kill_rx(struct mt7601u_dev *dev)
> > >  {
> > >       int i;
> > > -     unsigned long flags;
> > > -
> > > -     spin_lock_irqsave(&dev->rx_lock, flags);
> > >
> > > -     for (i = 0; i < dev->rx_q.entries; i++) {
> > > -             int next = dev->rx_q.end;
> > > -
> > > -             spin_unlock_irqrestore(&dev->rx_lock, flags);
> > > -             usb_poison_urb(dev->rx_q.e[next].urb);
> > > -             spin_lock_irqsave(&dev->rx_lock, flags);
> > > -     }  
> >
> > Why is there no need to take the lock?  Admittedly it's not clear what
> > this lock is protecting here :P  Perhaps a separate patch to remove the
> > unnecessary locking with an explanation?  
> 
> usb_poison_urb() can run concurrently with urb completion so I guess
> we do not need locks here.
> I guess we need to maintain this chunk in the same patch since now
> when the device is disconnected
> we do not increment dev->rx_q.end. What do you think?

Ah, I see!  The completion used to run in between the unlock/lock!
Now we just poison out of order, because completion doesn't care about
ordering for errored URBs.  Perhaps worth putting in the commit message?

  reply	other threads:[~2019-06-06 21:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06 12:26 [PATCH] mt7601u: do not schedule rx_tasklet when the device has been disconnected Lorenzo Bianconi
2019-06-06 18:42 ` Jakub Kicinski
2019-06-06 21:02   ` Lorenzo Bianconi
2019-06-06 21:26     ` Jakub Kicinski [this message]
2019-06-06 21:59       ` Lorenzo Bianconi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190606142622.7e5f11de@cakuba.netronome.com \
    --to=kubakici@wp.pl \
    --cc=linux-wireless@vger.kernel.org \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=lorenzo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox