From: Bill Davidsen <davidsen@tmr.com>
To: Neil Brown <neilb@suse.de>
Cc: David Chinner <dgc@sgi.com>,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
dm-devel@redhat.com, Jens Axboe <jens.axboe@oracle.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.
Date: Thu, 31 May 2007 08:28:02 -0400
Message-ID: <465EBF52.5010204@tmr.com>
In-Reply-To: <18014.6347.753050.606896@notabene.brown>
Neil Brown wrote:
> On Monday May 28, davidsen@tmr.com wrote:
>
>> There are two things I'm not sure you covered.
>>
>> First, disks which don't support flush but do have a "cache dirty"
>> status bit you can poll at times like shutdown. If there are no drivers
>> which support these, it can be ignored.
>>
>
> There are really devices like that? So to implement a flush, you have
> to stop sending writes and wait and poll - maybe poll every
> millisecond?
>
Yes, there really are (or were). But I don't think that there are
drivers, so it's not an issue.
> That wouldn't be very good for performance.... maybe you just
> wouldn't bother with barriers on that sort of device?
>
That is why there are no drivers...
> Which reminds me: What is the best way to turn off barriers?
> Several filesystems have "-o nobarriers" or "-o barriers=0",
> or the inverse.
>
If they can function usefully without, the admin gets to make that choice.
> md/raid currently uses barriers to write metadata, and there is no
> way to turn that off. I'm beginning to wonder if that is best.
>
I don't see how you can have reliable operation without it, particularly
WRT bitmap.
> Maybe barrier support should be a function of the device. i.e. the
> filesystem or whatever always sends barrier requests where it thinks
> it is appropriate, and the block device tries to honour them to the
> best of its ability, but if you run
> blockdev --enforce-barriers=no /dev/sda
> then you lose some reliability guarantees, but gain some throughput (a
> bit like the 'async' export option for nfsd).
>
>
Since this is device dependent, it really should be in the device
driver, and requests should have status of success, failure, or feature
unavailability.
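
A minimal sketch of what such a three-way status could look like, purely as an illustration. Every name here (barrier_status, fake_dev, submit_barrier) is hypothetical and is not an existing kernel identifier; the point is only that "the device cannot do this" is reported separately from "the device tried and failed".

```c
/* Hypothetical three-way result for a barrier request.
 * These names are illustrative only, not kernel identifiers. */
enum barrier_status {
    BARRIER_OK,          /* the device honoured the barrier */
    BARRIER_FAILED,      /* the device attempted it and reported an error */
    BARRIER_UNSUPPORTED  /* the device has no barrier/flush capability */
};

/* Hypothetical stand-in for whatever a driver knows about its device. */
struct fake_dev {
    int supports_barrier;  /* non-zero if the hardware can order/flush writes */
    int media_error;       /* non-zero to simulate an I/O failure */
};

/* Report capability problems before I/O problems, so the caller can
 * tell "never retry with a barrier" apart from "this request failed". */
enum barrier_status submit_barrier(const struct fake_dev *dev)
{
    if (!dev->supports_barrier)
        return BARRIER_UNSUPPORTED;
    if (dev->media_error)
        return BARRIER_FAILED;
    return BARRIER_OK;
}
```

With a status like this, a filesystem or md could fall back to some other ordering strategy on BARRIER_UNSUPPORTED while still treating BARRIER_FAILED as a real I/O error.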
>> Second, NAS (including nbd?). Is there enough information to handle this "really right?"
>>
>
> NAS means lots of things, including NFS and CIFS where this doesn't
> apply.
>
Well, we're really talking about network attached devices rather than
network filesystems. I guess people do lump them together.
> For 'nbd', it is entirely up to the protocol. If the protocol allows
> a barrier flag to be sent to the server, then barriers should just
> work. If it doesn't, then either the server disables write-back
> caching, or flushes every request, or you lose all barrier
> guarantees.
>
That pretty much agrees with what I said above: it's at a level closer to
the device, and status should come back from the physical I/O request.
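
As a rough sketch of the nbd cases quoted above, assuming the client can somehow learn what the protocol and server do. The struct, enum, and function names are all hypothetical and are not taken from the nbd driver or protocol.

```c
/* Hypothetical description of a network block device's abilities.
 * None of these names come from the nbd code or wire protocol. */
struct remote_caps {
    int protocol_has_barrier_flag;     /* barrier can be sent to the server */
    int server_writeback_disabled;     /* server never caches writes */
    int server_flushes_every_request;  /* server flushes after each write */
};

enum barrier_guarantee {
    GUARANTEE_NATIVE,     /* the barrier flag travels to the server */
    GUARANTEE_BY_SERVER,  /* ordering holds because nothing stays cached */
    GUARANTEE_NONE        /* all barrier guarantees are lost */
};

/* Classify a remote device following the three cases in the quoted text:
 * a barrier-capable protocol, a server that compensates, or neither. */
enum barrier_guarantee classify_remote(const struct remote_caps *caps)
{
    if (caps->protocol_has_barrier_flag)
        return GUARANTEE_NATIVE;
    if (caps->server_writeback_disabled || caps->server_flushes_every_request)
        return GUARANTEE_BY_SERVER;
    return GUARANTEE_NONE;
}
```

The GUARANTEE_NONE case is the one where a status coming back from the request would matter most, since the layers above would otherwise have no way to know.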
> For 'iscsi', I guess it works just the same as SCSI...
>
Hopefully.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979