From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Mason <slash.tmp@free.fr>
Cc: Lukas Wunner <lukas@wunner.de>,
	Mathias Nyman <mathias.nyman@linux.intel.com>,
	Felipe Balbi <felipe.balbi@linux.intel.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	linux-usb <linux-usb@vger.kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Alan Stern <stern@rowland.harvard.edu>
Subject: Re: Possible regression between 4.9 and 4.13
Date: Wed, 30 Aug 2017 11:06:33 +0200
Message-ID: <20170830090633.GA1208@kroah.com>
In-Reply-To: <678490ce-9381-e63e-7a12-33d3eff7f894@free.fr>

On Wed, Aug 30, 2017 at 10:55:37AM +0200, Mason wrote:
> On 30/08/2017 08:02, Greg Kroah-Hartman wrote:
> 
> > To get back to the original issue here, the hardware seems to have died,
> > the driver stops talking to it, and all is good.  The "regression" here
> > is that we now properly can determine that the hardware is crap.
> 
> Before 4.12, when I unplugged my USB3 Flash drive, Linux would
> detect a few "Uncorrected Non-Fatal errors" via AER, but it was
> still possible to plug the drive back in.
> 
> Since 4.12, once I unplug the drive, the whole USB3 card is marked
> as dead (all 4 ports), and I can no longer plug anything in (not even
> the USB2 drive that didn't have any issues, IIRC).
> 
> It seems a bit premature to "mark as dead" something that remains
> functional, doesn't it?

I agree, but if the device sends all ones, it's a good indication it is
really dead, right?  Or something is wrong with it.
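
For context on the all-ones test: a read from a PCI device that no longer
responds typically completes with every bit set (0xffffffff for a 32-bit
register), so drivers commonly treat that value as "device gone".  The
sketch below illustrates the idea in plain C.  The names hc_state,
reg_read_is_all_ones() and hc_check_status() are hypothetical, made up for
this illustration; this is not the xhci driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-controller state, for illustration only. */
struct hc_state {
    bool dead;  /* set once the controller is considered gone */
};

/* True if a 32-bit register read came back with every bit set. */
bool reg_read_is_all_ones(uint32_t value)
{
    return value == UINT32_MAX;
}

/*
 * Sketch of a status check: an all-ones status value is taken to mean
 * the device no longer answers, so the controller is marked dead and
 * the caller should stop touching the hardware.
 * Returns true if the status value is usable, false otherwise.
 */
bool hc_check_status(struct hc_state *hc, uint32_t status)
{
    assert(hc != NULL);

    if (hc->dead)
        return false;            /* already given up on this device */

    if (reg_read_is_all_ones(status)) {
        hc->dead = true;         /* sticky: never cleared in this sketch */
        return false;
    }

    return true;
}
```

Note that the dead flag in this sketch is sticky: once set, every later
check fails until the state is reinitialised.  That mirrors the behaviour
reported above, where the whole card stays unusable after a single
all-ones read.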

> Disclaimer, there are many variables in this setup, and I've only
> tested a small fraction of the problem space: only one system,
> only one USB3 board, only one USB3 Flash drive.

Did you ever happen to narrow this down to a single git commit using
'git bisect'?  I can't remember what happened in the beginning of this
thread...

> > So, how do you think we should proceed, delay a bit longer before saying
> > the device is gone?  How long is "long enough"?  How many bus errors are
> > we allowed to tolerate (hint, the PCI spec says none...)
> > 
> > Maybe someone wants to get to the root problem here, why is the hardware
> > suddenly reporting all 1s?
> 
> I'm afraid I won't be able to make any progress on this front,
> unless I can get my hands on a PCIe packet analyzer.

Odds of that happening are pretty low, right?  I've never even seen one
of those...

thanks,

greg k-h
