From: Mike Snitzer <snitzer@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>,
Hannes Reinecke <hare@suse.de>,
tytso@mit.edu, linux-scsi@vger.kernel.org, jaxboe@fusionio.com,
jack@suse.cz, linux-kernel@vger.kernel.org, swhiteho@redhat.com,
linux-raid@vger.kernel.org, linux-ide@vger.kernel.org,
James.Bottomley@suse.de, konishi.ryusuke@lab.ntt.co.jp,
linux-fsdevel@vger.kernel.org, vst@vlnb.net, rwheeler@redhat.com,
Christoph Hellwig <hch@lst.de>,
chris.mason@oracle.com, dm-devel@redhat.com
Subject: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush
Date: Tue, 24 Aug 2010 13:52:16 -0400 [thread overview]
Message-ID: <20100824175215.GA29409@redhat.com> (raw)
In-Reply-To: <4C73FA8F.5080800@kernel.org>
On Tue, Aug 24 2010 at 12:59pm -0400,
Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On 08/24/2010 12:24 PM, Kiyoshi Ueda wrote:
> > Yes, checking whether it's a transport error in lower layer is
> > the right solution.
> > (Since I know it's not available yet, I just hoped if upper layers
> > had some other options.)
> >
> > Anyway, only reporting errors for REQ_FLUSH to upper layer without
> > such a solution would make dm-multipath almost unusable in real world,
> > although it's better than implicit data loss.
>
> I see.
>
> >>> Maybe just turn off barrier support in mpath for now?
> >
> > If it's possible, it could be a workaround for a short term.
> > But how can you do that?
> >
> > I think it's not enough to just drop REQ_FLUSH flag from q->flush_flags.
> > Underlying devices of a mpath device may have write-back cache and
> > it may be enabled.
> > So if a mpath device doesn't set REQ_FLUSH flag in q->flush_flags, it
> > becomes a device which has write-back cache but doesn't support flush.
> > Then, upper layer can do nothing to ensure cache flush?
>
> Yeah, I was basically suggesting to forget about cache flush w/ mpath
> until it can be fixed. You're saying that if mpath just passes
> REQ_FLUSH upwards without retrying, it will be almost unuseable,
> right? I'm not sure how to proceed here.
Seems clear that we must fix mpath to receive the SCSI errors, in some
form, so it can decide if a retry is required/valid or not.
Such error processing was a big selling point for the transition from
bio-based to request-based multipath; so it's unfortunate that this
piece has been left until now.
> How much work would discerning between transport and IO errors take?
Hannes already proposed some patches:
https://patchwork.kernel.org/patch/61282/
https://patchwork.kernel.org/patch/61283/
https://patchwork.kernel.org/patch/61596/
This work was discussed at LSF, see "Error Handling - Hannes Reinecke"
here: http://lwn.net/Articles/400589/
I thought James, Alasdair and others offered some guidance on what he'd
like to see...
Unfortunately, even though I was at this LSF session, I can't recall any
specific consensus on how Hannes' work should be refactored (to avoid
adding SCSI sense processing code directly in dm-mpath). Maybe James,
Hannes or others remember?
Was it enough to just have the SCSI sense processing code split out in a
new sub-section of the SCSI midlayer -- and then DM calls that code?
> If it can't be done quickly enough the retry logic can be kept around
> to keep the old behavior but that already was a broken behavior, so...
> :-(
I'll have to review this thread again to understand why mpath's existing
retry logic is broken behavior. mpath is used with more capable SCSI
devices so I'm missing why a failed FLUSH implies data loss.
Mike
next prev parent reply other threads:[~2010-08-24 17:52 UTC|newest]
Thread overview: 109+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-12 12:41 [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush Tejun Heo
2010-08-12 12:41 ` [PATCH 01/11] block/loop: queue ordered mode should be DRAIN_FLUSH Tejun Heo
2010-08-12 12:41 ` [PATCH 02/11] block: kill QUEUE_ORDERED_BY_TAG Tejun Heo
2010-08-13 12:56 ` Vladislav Bolkhovitin
2010-08-13 13:06 ` Christoph Hellwig
2010-08-12 12:41 ` [PATCH 03/11] block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() Tejun Heo
2010-08-14 1:07 ` Jeremy Fitzhardinge
2010-08-14 9:42 ` hch
2010-08-16 20:38 ` Jeremy Fitzhardinge
2010-08-12 12:41 ` [PATCH 04/11] block: remove spurious uses of REQ_HARDBARRIER Tejun Heo
2010-08-12 12:41 ` [PATCH 05/11] block: misc cleanups in barrier code Tejun Heo
2010-08-12 12:41 ` [PATCH 06/11] block: drop barrier ordering by queue draining Tejun Heo
2010-08-12 12:41 ` [PATCH 07/11] block: rename blk-barrier.c to blk-flush.c Tejun Heo
2010-08-12 12:41 ` [PATCH 08/11] block: rename barrier/ordered to flush Tejun Heo
2010-08-17 13:26 ` Christoph Hellwig
2010-08-17 16:23 ` Tejun Heo
2010-08-17 17:08 ` Christoph Hellwig
2010-08-18 6:23 ` Tejun Heo
2010-08-12 12:41 ` [PATCH 09/11] block: implement REQ_FLUSH/FUA based interface for FLUSH/FUA requests Tejun Heo
2010-08-12 12:41 ` [PATCH 10/11] fs, block: propagate REQ_FLUSH/FUA interface to upper layers Tejun Heo
2010-08-12 21:24 ` Jan Kara
2010-08-13 7:19 ` Tejun Heo
2010-08-13 7:47 ` Christoph Hellwig
2010-08-16 16:33 ` [PATCH UPDATED " Tejun Heo
2010-08-12 12:41 ` [PATCH 11/11] block: use REQ_FLUSH in blkdev_issue_flush() Tejun Heo
2010-08-13 11:48 ` [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush Christoph Hellwig
2010-08-13 13:48 ` Tejun Heo
2010-08-13 14:38 ` Christoph Hellwig
2010-08-13 14:51 ` Tejun Heo
2010-08-14 10:36 ` Christoph Hellwig
2010-08-17 9:59 ` Tejun Heo
2010-08-17 13:19 ` Christoph Hellwig
2010-08-17 16:41 ` Tejun Heo
2010-08-17 16:59 ` Christoph Hellwig
2010-08-18 6:35 ` Tejun Heo
2010-08-18 8:11 ` Tejun Heo
2010-08-20 8:26 ` Kiyoshi Ueda
2010-08-23 12:14 ` Tejun Heo
2010-08-23 14:17 ` Mike Snitzer
2010-08-24 10:24 ` Kiyoshi Ueda
2010-08-24 16:59 ` Tejun Heo
2010-08-24 17:52 ` Mike Snitzer [this message]
2010-08-24 18:14 ` Tejun Heo
2010-08-25 8:00 ` Kiyoshi Ueda
2010-08-25 15:28 ` Mike Snitzer
2010-08-27 9:47 ` Kiyoshi Ueda
2010-08-27 13:49 ` Mike Snitzer
2010-08-30 6:13 ` Kiyoshi Ueda
2010-09-01 0:55 ` safety of retrying SYNCHRONIZE CACHE [was: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush] Mike Snitzer
2010-09-01 7:32 ` Hannes Reinecke
2010-09-01 7:38 ` Hannes Reinecke
2010-08-25 15:59 ` [RFC] training mpath to discern between SCSI errors (was: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush) Mike Snitzer
2010-08-25 19:15 ` [RFC] training mpath to discern between SCSI errors Mike Christie
2010-08-30 11:38 ` Hannes Reinecke
2010-08-30 12:07 ` Sergei Shtylyov
2010-08-30 12:39 ` Hannes Reinecke
2010-08-30 14:52 ` [dm-devel] " Hannes Reinecke
2010-10-18 8:09 ` Jun'ichi Nomura
2010-10-18 11:55 ` Hannes Reinecke
2010-10-19 4:03 ` Jun'ichi Nomura
2010-11-19 3:11 ` [dm-devel] " Malahal Naineni
2010-11-30 22:59 ` Mike Snitzer
2010-12-07 23:16 ` [RFC PATCH 0/3] differentiate between I/O errors Mike Snitzer
2010-12-07 23:16 ` [RFC PATCH v2 1/3] scsi: Detailed " Mike Snitzer
2010-12-07 23:16 ` [RFC PATCH v2 2/3] dm mpath: propagate target errors immediately Mike Snitzer
2010-12-07 23:16 ` [RFC PATCH 3/3] block: improve detail in I/O error messages Mike Snitzer
2010-12-08 11:28 ` Sergei Shtylyov
2010-12-08 15:05 ` [PATCH v2 " Mike Snitzer
2010-12-10 23:40 ` [RFC PATCH 0/3] differentiate between I/O errors Malahal Naineni
2011-01-14 1:15 ` Mike Snitzer
2010-12-17 9:47 ` training mpath to discern between SCSI errors Hannes Reinecke
2010-12-17 14:06 ` Mike Snitzer
2011-01-14 1:09 ` Mike Snitzer
2011-01-14 7:45 ` Hannes Reinecke
2011-01-14 13:59 ` Mike Snitzer
2010-08-24 17:11 ` [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush Vladislav Bolkhovitin
2010-08-24 23:14 ` Alan Cox
2010-08-13 12:55 ` Vladislav Bolkhovitin
2010-08-13 13:17 ` Christoph Hellwig
2010-08-18 19:29 ` Vladislav Bolkhovitin
2010-08-13 13:21 ` Tejun Heo
2010-08-18 19:30 ` Vladislav Bolkhovitin
2010-08-19 9:51 ` Tejun Heo
2010-08-30 9:54 ` Hannes Reinecke
2010-08-30 20:34 ` Vladislav Bolkhovitin
2010-08-18 9:46 ` Christoph Hellwig
2010-08-19 9:57 ` Tejun Heo
2010-08-19 10:20 ` Christoph Hellwig
2010-08-19 10:22 ` Tejun Heo
2010-08-20 13:22 ` Christoph Hellwig
2010-08-20 15:18 ` Ric Wheeler
2010-08-20 16:00 ` Chris Mason
2010-08-20 16:02 ` Ric Wheeler
2010-08-23 12:30 ` Tejun Heo
2010-08-23 12:48 ` Christoph Hellwig
2010-08-23 13:58 ` Ric Wheeler
2010-08-23 14:01 ` Jens Axboe
2010-08-23 14:08 ` Christoph Hellwig
2010-08-23 14:13 ` Tejun Heo
2010-08-23 14:19 ` Christoph Hellwig
2010-08-25 11:31 ` Jens Axboe
2010-08-30 10:04 ` Hannes Reinecke
2010-08-23 15:19 ` Ric Wheeler
2010-08-23 16:45 ` Sergey Vlasov
2010-08-23 16:49 ` [dm-devel] " Ric Wheeler
2010-08-23 12:36 ` Tejun Heo
2010-08-23 14:05 ` Christoph Hellwig
2010-08-23 14:15 ` [PATCH] block: simplify queue_next_fseq Christoph Hellwig
2010-08-23 16:28 ` OT grammar nit " John Robinson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100824175215.GA29409@redhat.com \
--to=snitzer@redhat.com \
--cc=James.Bottomley@suse.de \
--cc=chris.mason@oracle.com \
--cc=dm-devel@redhat.com \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=jaxboe@fusionio.com \
--cc=k-ueda@ct.jp.nec.com \
--cc=konishi.ryusuke@lab.ntt.co.jp \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-raid@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=rwheeler@redhat.com \
--cc=swhiteho@redhat.com \
--cc=tj@kernel.org \
--cc=tytso@mit.edu \
--cc=vst@vlnb.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).