* [v3.16][v3.17][v3.18][ Regression] scsi: handle flush errors properly
@ 2014-12-10 22:08 Joseph Salisbury
2014-12-10 23:06 ` Steven Haber
2014-12-11 4:45 ` James Bottomley
0 siblings, 2 replies; 3+ messages in thread
From: Joseph Salisbury @ 2014-12-10 22:08 UTC (permalink / raw)
To: JBottomley
Cc: steven, Martin K. Petersen, stable@vger.kernel.org, LKML,
linux-scsi
Hello James,
A kernel bug report was opened against Ubuntu [0]. After a kernel
bisect, it was found that reverting the following commit resolved this bug:
commit 89fb4cd1f717a871ef79fa7debbe840e3225cd54
Author: James Bottomley <JBottomley@Parallels.com>
Date: Thu Jul 3 19:17:34 2014 +0200
scsi: handle flush errors properly
The regression was introduced as of v3.16 and still exists in the 3.18
kernel. It has also made its way into the stable kernels.
I was hoping to get your feedback, since you are the patch author. Do
you think gathering any additional data will help diagnose this issue,
or would it be best to submit a revert request?
Thanks,
Joe
[0] http://pad.lv/1366538
* Re: [v3.16][v3.17][v3.18][ Regression] scsi: handle flush errors properly
2014-12-10 22:08 [v3.16][v3.17][v3.18][ Regression] scsi: handle flush errors properly Joseph Salisbury
@ 2014-12-10 23:06 ` Steven Haber
2014-12-11 4:45 ` James Bottomley
1 sibling, 0 replies; 3+ messages in thread
From: Steven Haber @ 2014-12-10 23:06 UTC (permalink / raw)
To: Joseph Salisbury
Cc: JBottomley, Martin K. Petersen, stable@vger.kernel.org, LKML,
linux-scsi
Hey Joe,
Here's some context:
The SCSI flush command was being treated as a zero-byte write, which
means that if an error was returned, you wouldn't catch it until a
subsequent write (or flush). The way writes work is that all possible
bytes are written, and if something bad happens, an error bubbles out on
the next write attempt. This holds true even for a zero-byte write. This
means that before this fix, to guarantee durability you had to flush
twice (and verify both were error-free).
I'm working on a storage appliance that relies on a single flush command
guaranteeing that a write has been made durable on a SCSI device. I'm
sure many other storage products rely on this behavior, too.
The patch James shipped fixes this bug by special-casing the flush error
path. Before, flush wouldn't return errors; now it does.
I'm not sure why certain USB drives are failing in the flush path on
unmount. Since the flush bug existed for such a long time, I suspect
certain drivers coded around this behavior, and now that it is correct
we are seeing new bugs exposed.
Based on the simplicity and obviousness of our patch for the flush bug,
it would really be ideal to diagnose this further rather than reverting.
Steven Haber
Qumulo, Inc.
On Wed, Dec 10, 2014 at 2:08 PM, Joseph Salisbury
<joseph.salisbury@canonical.com> wrote:
> Hello James,
>
> A kernel bug report was opened against Ubuntu [0]. After a kernel
> bisect, it was found that reverting the following commit resolved this bug:
>
> commit 89fb4cd1f717a871ef79fa7debbe840e3225cd54
> Author: James Bottomley <JBottomley@Parallels.com>
> Date: Thu Jul 3 19:17:34 2014 +0200
>
> scsi: handle flush errors properly
>
> The regression was introduced as of v3.16 and still exists in the 3.18
> kernel. It has also made its way into the stable kernels.
>
> I was hoping to get your feedback, since you are the patch author. Do
> you think gathering any additional data will help diagnose this issue,
> or would it be best to submit a revert request?
>
>
> Thanks,
>
> Joe
>
> [0] http://pad.lv/1366538
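The contract Steven describes, where one successful flush is enough to call a write durable, can be sketched from user space roughly as follows. This is only an illustration under stated assumptions: `durable_write` is a hypothetical helper, not anything from the patch, and whether `os.fsync` ends up issuing a SCSI cache flush depends on the filesystem and device underneath.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Hypothetical helper: write data, then flush exactly once.

    Assumes the kernel reports flush failures on the flush itself,
    so a single successful os.fsync() is treated as proof of durability.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Raises OSError if the flush fails; no second flush is attempted.
        os.fsync(fd)
    finally:
        os.close(fd)
```

If flush errors only surfaced on the next write or flush, as in the behavior described above, a caller like this would need a second `os.fsync` call and would have to check both results before trusting the data.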
* Re: [v3.16][v3.17][v3.18][ Regression] scsi: handle flush errors properly
2014-12-10 22:08 [v3.16][v3.17][v3.18][ Regression] scsi: handle flush errors properly Joseph Salisbury
2014-12-10 23:06 ` Steven Haber
@ 2014-12-11 4:45 ` James Bottomley
1 sibling, 0 replies; 3+ messages in thread
From: James Bottomley @ 2014-12-11 4:45 UTC (permalink / raw)
To: Joseph Salisbury
Cc: steven, Martin K. Petersen, stable@vger.kernel.org, LKML,
linux-scsi
On Wed, 2014-12-10 at 17:08 -0500, Joseph Salisbury wrote:
> Hello James,
>
> A kernel bug report was opened against Ubuntu [0]. After a kernel
> bisect, it was found that reverting the following commit resolved this bug:
If I read this bug report correctly, it's saying a USB attached device
produces an error when doing a "shred" but the same device PCI attached
doesn't. Presumably shred sends some type of zero length command USB
storage doesn't like. What is shred actually doing (what commands is it
sending)?
James