public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* XFS and write barrier
@ 2006-07-15 10:48 Martin Steigerwald
  2006-07-15 19:28 ` Chris Wedgwood
  2006-07-16 17:32 ` Federico Sevilla III
  0 siblings, 2 replies; 15+ messages in thread
From: Martin Steigerwald @ 2006-07-15 10:48 UTC (permalink / raw)
  To: linux-xfs


Hello,

I am currently gathering information to write an article about 
journalling filesystems, with emphasis on write barrier functionality: 
how it works, why journalling filesystems need write barriers, and the 
current implementation of write barrier support in different filesystems.

I already have quite good information on XFS, but some questions remain:

1) Is it safe to use write barriers with 2.6.16 or should one use 2.6.17 
instead? This relates to:

-----------------------------------------------------------------------
commit b04ed21a1fdbfe48ee0738519a4d1af09589dfea
Author: Nathan Scott <nathans@sgi.com>
Date:   Wed Jan 11 15:32:17 2006 +1100

    [XFS] Disable write barriers for now till intermittent IO errors are
    understood.

    SGI-PV: 912426
    SGI-Modid: xfs-linux-melb:xfs-kern:202962a

    Signed-off-by: Nathan Scott <nathans@sgi.com>
-----------------------------------------------------------------------

What are those intermittent IO errors? I googled but did not find a 
discussion of this change.

2) I experienced three XFS corruptions in one week on 2.6.16 with write 
caches enabled but write barriers (by default) disabled, whereas on 
2.6.15 - with write caches enabled as well - it only very rarely got 
corrupted. Does anyone have a hint as to why this may have been the 
case? The thing is that the system went down; the kernel crashed while 
it was in use.

I suspect that the kernel went down due to other instabilities and that 
XFS then got corrupted by out-of-order writes. But it may also have been 
related to a different IO path and/or changes in the write barrier 
implementation in the block layer. Any ideas?

No worries if not - then I will simply write that. ;) It would just be 
nice to know why. I know it is difficult to reconstruct in retrospect, 
especially as I no longer have the syslogs from those occasions.

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-15 10:48 XFS and write barrier Martin Steigerwald
@ 2006-07-15 19:28 ` Chris Wedgwood
  2006-07-16  9:53   ` Martin Steigerwald
  2006-07-16 17:32 ` Federico Sevilla III
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wedgwood @ 2006-07-15 19:28 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-xfs

On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:

> What are those intermittent IO errors? I googled but did not find a
> discussion of this change.

write barriers are enabled by default now; they have been for some
months (since the end of March)

> Does anyone have any hint as to why this may have been the case?
> Thing is that the system went down, the kernel crashed while it was
> in use.

usually in the case of a kernel crash the disks are still able to flush
(unlike with power loss) so i wouldn't expect barriers or the lack of
them to affect things here

> I am suspecting that the kernel went down due to other instabilities
> and then XFS got corrupted by out of order writes.

yes, if writes are misordered and some are lost, then it's possible to
have a corrupt filesystem when you come back up

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-15 19:28 ` Chris Wedgwood
@ 2006-07-16  9:53   ` Martin Steigerwald
  2006-07-17  0:43     ` Chris Wedgwood
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2006-07-16  9:53 UTC (permalink / raw)
  To: linux-xfs

Am Samstag 15 Juli 2006 21:28 schrieb Chris Wedgwood:
> On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> > What are those intermittent IO errors? I googled but did not find a
> > discussion of this change.
>
> write barriers are enabled by default now, they have been for some
> months (since the end of March)

Hallo Chris,

yes, but for 2.6.17, which was still in development. The stable release 
of it appeared on kernel.org on 18-Jun-2006 02:10, according to the date 
in the file listing there!

commit 3bbcc8e3976f8bba2fd607c8850d7dfe7e332fda
Author: Nathan Scott <nathans@sgi.com>
Date:   Fri Mar 31 13:04:56 2006 +1000

    [XFS] Reenable write barriers by default.
[...]


They had been enabled, but were then disabled again 5 minutes later for 
2.6.16, which should still be widely in use:

commit 4ef19dddbaf2f24e492c18112fd8a04ce116daca
Author: Christoph Hellwig <hch@sgi.com>
Date:   Wed Jan 11 15:27:18 2006 +1100

    [XFS] enable write barriers by default
[...]

commit b04ed21a1fdbfe48ee0738519a4d1af09589dfea
Author: Nathan Scott <nathans@sgi.com>
Date:   Wed Jan 11 15:32:17 2006 +1100

    [XFS] Disable write barriers for now till intermittent IO errors are
    understood.
[...]


Thus, for end users, this default write barrier support in XFS is rather 
new!

What I would like to know is whether it is safe to use write barriers 
with 2.6.16 or even 2.6.15 (if that is possible at all) as well - I 
guess there are not many distributions that ship with 2.6.17 already - 
or whether one might face those "intermittent IO errors" - whatever they 
are - when using them.

If I do not find anything more on this, I will recommend 2.6.17 for XFS 
with write barrier usage, as from my own experience I am pretty much 
convinced that it works stably. (See my other post.)

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-15 10:48 XFS and write barrier Martin Steigerwald
  2006-07-15 19:28 ` Chris Wedgwood
@ 2006-07-16 17:32 ` Federico Sevilla III
  2006-07-18  7:31   ` Nathan Scott
  1 sibling, 1 reply; 15+ messages in thread
From: Federico Sevilla III @ 2006-07-16 17:32 UTC (permalink / raw)
  To: linux-xfs

On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> I am currently gathering information to write an article about journal
> filesystems with emphasis on write barrier functionality, how it
> works, why journalling filesystems need write barrier and the current
> implementation of write barrier support for different filesystems.

Cool! Would you by any chance have information on the interaction
between journal filesystems with write barrier functionality, and
software RAID (md)? Based on my experience with 2.6.17, XFS detects that
the underlying software RAID 1 device does not support barriers and
therefore disables that functionality.

Cheers!

 --> Jijo

-- 
Federico Vicente C. Sevilla III
Information Technology Consultant
Q Software Research Corporation
Website: http://jijo.free.net.ph

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-16  9:53   ` Martin Steigerwald
@ 2006-07-17  0:43     ` Chris Wedgwood
  2006-07-17  1:24       ` Chris Wedgwood
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wedgwood @ 2006-07-17  0:43 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-xfs

On Sun, Jul 16, 2006 at 11:53:39AM +0200, Martin Steigerwald wrote:

> yes, but for 2.6.17 which was still in development. The stable
> release of it appeared kernel.org on 18-Jun-2006 02:10 according to
> the date in the file listing there!

well, i guess it depends how you look at it

> What I would like to know whether its safe to use write barriers
> with 2.6.16 or even 2.6.15 (if it is possible at all) as well (I
> guess there are not many distributions that shipd with 2.6.17
> already) or whether one might face those "intermittent IO errors" -
> whatever they are - when using them.

some people (myself included) saw problems when write barriers were
enabled, but that was quite some time ago and it wasn't clear if this
was really an xfs, a write-barrier or some other coincidental problem
at the time

the problems never caused any significant on-disk damage (perhaps none
at all, i don't recall the details other than it would crash often
during large rsync jobs until i disabled it)

write barrier support appeared in november last year, so a lot has
changed since then

fwiw, w/o write barriers, if your disks have write caching enabled
(most will; they are horribly slow without it and, worse, some die
without it) then you can trivially create corrupted volumes (do
something like cp -Rl src dst and drop power or yank a hot-plug drive)

> If I do not find anything more on this I will recommend 2.6.17 for
> XFS with write barrier usages as I am pretty much convinced that it
> works stable from my own experience. (See my other post.)

you probably want 2.6.17.5+ as that has a fix for a very hard to hit
bug (that seemingly a few people may have hit)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-17  0:43     ` Chris Wedgwood
@ 2006-07-17  1:24       ` Chris Wedgwood
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wedgwood @ 2006-07-17  1:24 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-xfs

On Sun, Jul 16, 2006 at 05:43:54PM -0700, Chris Wedgwood wrote:

> you probably want 2.6.17.5+ as that has a fix for a very hard to hit
> bug (that seemingly a few people may have hit)

actually, it turns out i'm a retard and can't drive git so that fix
might not be in there either

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-16 17:32 ` Federico Sevilla III
@ 2006-07-18  7:31   ` Nathan Scott
  2006-07-18  8:58     ` Neil Brown
  0 siblings, 1 reply; 15+ messages in thread
From: Nathan Scott @ 2006-07-18  7:31 UTC (permalink / raw)
  To: xfs; +Cc: Neil Brown, linux-raid

On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
> On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> > I am currently gathering information to write an article about journal
> > filesystems with emphasis on write barrier functionality, how it
> > works, why journalling filesystems need write barrier and the current
> > implementation of write barrier support for different filesystems.
> 
> Cool! Would you by any chance have information on the interaction
> between journal filesystems with write barrier functionality, and
> software RAID (md)? Based on my experience with 2.6.17, XFS detects that
> the underlying software RAID 1 device does not support barriers and
> therefore disables that functionality.

No one here seems to know; maybe Neil &| the other folks on linux-raid
can help us out with details on the status of MD and write barriers?

cheers.

-- 
Nathan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18  7:31   ` Nathan Scott
@ 2006-07-18  8:58     ` Neil Brown
  2006-07-18 17:04       ` David Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: Neil Brown @ 2006-07-18  8:58 UTC (permalink / raw)
  To: Nathan Scott; +Cc: xfs, linux-raid

On Tuesday July 18, nathans@sgi.com wrote:
> On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
> > On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> > > I am currently gathering information to write an article about journal
> > > filesystems with emphasis on write barrier functionality, how it
> > > works, why journalling filesystems need write barrier and the current
> > > implementation of write barrier support for different filesystems.

"Journalling filesystems need write barrier" isn't really accurate.
They can make good use of write barrier if it is supported, and where
it isn't supported, they should use blkdev_issue_flush in combination
with regular submit/wait.

> > 
> > Cool! Would you by any chance have information on the interaction
> > between journal filesystems with write barrier functionality, and
> > software RAID (md)? Based on my experience with 2.6.17, XFS detects that
> > the underlying software RAID 1 device does not support barriers and
> > therefore disables that functionality.
> 
> Noone here seems to know, maybe Neil &| the other folks on linux-raid
> can help us out with details on status of MD and write barriers?

In 2.6.17, md/raid1 will detect if the underlying devices support
barriers and if they all do, it will accept barrier requests from the
filesystem and pass those requests down to all devices.

Other raid levels will reject all barrier requests.  Filesystems
should notice this and submit regular writes and wait for them to
complete (which they do), and call blkdev_issue_flush after the commit
block has been written (which I think few do).
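The fallback described here - regular submit/wait plus a flush after the commit block - can be sketched in a toy user-space model (purely illustrative; the class and function names are invented and this is not the kernel API):

```python
# Toy user-space model of a disk with a volatile write cache.
# write() only fills the cache; flush() makes cached blocks durable.
# All names are invented for illustration -- this is not the kernel API.

class CachedDisk:
    def __init__(self):
        self.cache = []   # blocks accepted, but lost on power failure
        self.media = []   # blocks that survive a power failure

    def write(self, block):
        self.cache.append(block)

    def flush(self):
        self.media.extend(self.cache)
        self.cache.clear()

def commit_transaction(disk, journal_blocks, commit_block):
    """Journal commit without barriers: regular submit/wait, plus an
    explicit cache flush (blkdev_issue_flush in the kernel) so the
    commit block never reaches the medium before the journal data."""
    for b in journal_blocks:
        disk.write(b)
    disk.flush()              # journal data must be on the medium first
    disk.write(commit_block)
    disk.flush()              # make the commit block itself durable

d = CachedDisk()
commit_transaction(d, ["J1", "J2"], "COMMIT")
```

Dropping either flush in the model lets a power loss leave the commit block durable while the journal data it describes is not - exactly the misordering that corrupts a journalling filesystem.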

At least, that is my understanding.

NeilBrown

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18  8:58     ` Neil Brown
@ 2006-07-18 17:04       ` David Chinner
  2006-07-18 18:27         ` Martin Steigerwald
  2006-07-18 23:41         ` Neil Brown
  0 siblings, 2 replies; 15+ messages in thread
From: David Chinner @ 2006-07-18 17:04 UTC (permalink / raw)
  To: Neil Brown; +Cc: Nathan Scott, xfs, linux-raid

On Tue, Jul 18, 2006 at 06:58:56PM +1000, Neil Brown wrote:
> On Tuesday July 18, nathans@sgi.com wrote:
> > On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
> > > On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> > > > I am currently gathering information to write an article about journal
> > > > filesystems with emphasis on write barrier functionality, how it
> > > > works, why journalling filesystems need write barrier and the current
> > > > implementation of write barrier support for different filesystems.
> 
> "Journalling filesystems need write barrier" isn't really accurate.
> They can make good use of write barrier if it is supported, and where
> it isn't supported, they should use blkdev_issue_flush in combination
> with regular submit/wait.

blkdev_issue_flush() causes a write cache flush - just like a
barrier typically causes a write cache flush up to the I/O with the
barrier in it.  Both of these mechanisms provide the same thing - an
I/O barrier that enforces ordering of I/Os to disk.

Given that filesystems already indicate to the block layer when they
want a barrier, wouldn't it be better to get the block layer to issue
this cache flush if the underlying device doesn't support barriers
and it receives a barrier request?

FWIW, only XFS and Reiser3 use this function, and then only when
issuing an fsync while barriers are disabled, to make sure a common
test (fsync then power cycle) doesn't result in data loss...

> > Noone here seems to know, maybe Neil &| the other folks on linux-raid
> > can help us out with details on status of MD and write barriers?
> 
> In 2.6.17, md/raid1 will detect if the underlying devices support
> barriers and if they all do, it will accept barrier requests from the
> filesystem and pass those requests down to all devices.
> 
> Other raid levels will reject all barrier requests.

Any particular reason for not supporting barriers on the other types
of RAID?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18 17:04       ` David Chinner
@ 2006-07-18 18:27         ` Martin Steigerwald
  2006-07-18 19:21           ` David Chinner
  2006-07-18 23:41         ` Neil Brown
  1 sibling, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2006-07-18 18:27 UTC (permalink / raw)
  To: linux-xfs

Am Dienstag 18 Juli 2006 19:04 schrieb David Chinner:

> > "Journalling filesystems need write barrier" isn't really accurate.
> > They can make good use of write barrier if it is supported, and where
> > it isn't supported, they should use blkdev_issue_flush in combination
> > with regular submit/wait.
>
> blkdev_issue_flush() causes a write cache flush - just like a
> barrier typically causes a write cache flush up to the I/O with the
> barrier in it.  Both of these mechanisms provide the same thing - an
> I/O barrier that enforces ordering of I/Os to disk.

Hello David,

well now it gets interesting. If both provide the same thing, what's 
the difference?

> Given that filesystems already indicate to the block layer when they
> want a barrier, wouldn't it be better to get the block layer to issue
> this cache flush if the underlying device doesn't support barriers
> and it receives a barrier request?

Does a device need to support more than this cache flush in order to 
support barriers? Up to now I thought that when a device supports cache 
flushes, the kernel can provide barrier functionality for it.

I see in the boot output that my notebook hard disk supports cache 
flushes, but not in dmesg or syslog. I don't know yet how to actually 
determine whether barrier functionality is really usable on a given 
system.

> FWIW, Only XFS and Reiser3 use this function, and only then when
> issuing a fsync when barriers are disabled to make sure a common
> test (fsync then power cycle) doesn't result in data loss...

So will XFS be safe even without write barriers? What will it do when 
it cannot do write barriers but write barriers are requested by the user 
or by the filesystem's built-in default setting? Will it work unsafely, 
or will it mount read-only or disable write caches in that case?

I think I need to read / learn even more to get a complete picture of 
this.

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18 18:27         ` Martin Steigerwald
@ 2006-07-18 19:21           ` David Chinner
  2006-07-20 10:34             ` Martin Steigerwald
  2006-07-22  9:31             ` Martin Steigerwald
  0 siblings, 2 replies; 15+ messages in thread
From: David Chinner @ 2006-07-18 19:21 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-xfs

On Tue, Jul 18, 2006 at 08:27:48PM +0200, Martin Steigerwald wrote:
> Am Dienstag 18 Juli 2006 19:04 schrieb David Chinner:
> 
> > > "Journalling filesystems need write barrier" isn't really accurate.
> > > They can make good use of write barrier if it is supported, and where
> > > it isn't supported, they should use blkdev_issue_flush in combination
> > > with regular submit/wait.
> >
> > blkdev_issue_flush() causes a write cache flush - just like a
> > barrier typically causes a write cache flush up to the I/O with the
> > barrier in it.  Both of these mechanisms provide the same thing - an
> > I/O barrier that enforces ordering of I/Os to disk.
> 
> Hello David,
> 
> well now it gets interesting. If both provide the same thing, whats the 
> difference?

A WRITE_BARRIER I/O can be optimised by smart drivers, protocols and hardware
to minimise the adverse effects of the barrier, whereas a cache flush
is a brute force cache cleaning mechanism that cannot be optimised....

> > Given that filesystems already indicate to the block layer when they
> > want a barrier, wouldn't it be better to get the block layer to issue
> > this cache flush if the underlying device doesn't support barriers
> > and it receives a barrier request?
> 
> Does a device need to support more than this cache flush in order to 
> support barriers? Up to know I thought that when a device supports cache 
> flushes the kernel can provide barrier functinality for it.

Not necessarily as different device/protocol commands are used.

> I see in boot output that my notebook harddisk supports cache flushes. But 
> not in dmesg nor in syslog. I don't know yet how to actually determine 
> whether barrier functionality is really usable on a certain system. 

My test is to mount an XFS filesystem using barriers on the device and
look at the syslog message. ;)

> > FWIW, Only XFS and Reiser3 use this function, and only then when
> > issuing a fsync when barriers are disabled to make sure a common
> > test (fsync then power cycle) doesn't result in data loss...
> 
> So will XFS be safe even without write barriers?

XFS is only safe when you have:

	a) no write caching on the drive (barrier or nobarrier)
	b) non-volatile write caching on the drive (barrier or nobarrier)
	c) volatile write caching and barriers supported and enabled

The same conditions hold true for any filesystem that requires I/O ordering
guarantees to maintain filesystem consistency...

> What will it do when it 
> cannot do write barriers but write barriers are requested by the user or 
> the inbuilt default setting of the filesystem?  Will it work unsafely or 
> will mount readonly or disable write caches in that case?

XFS will log a warning to the syslog and dmesg saying write barriers are
disabled and continue onwards without barriers.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18 17:04       ` David Chinner
  2006-07-18 18:27         ` Martin Steigerwald
@ 2006-07-18 23:41         ` Neil Brown
  1 sibling, 0 replies; 15+ messages in thread
From: Neil Brown @ 2006-07-18 23:41 UTC (permalink / raw)
  To: David Chinner; +Cc: Nathan Scott, xfs, linux-raid

On Wednesday July 19, dgc@sgi.com wrote:
> On Tue, Jul 18, 2006 at 06:58:56PM +1000, Neil Brown wrote:
> > On Tuesday July 18, nathans@sgi.com wrote:
> > > On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
> > > > On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
> > > > > I am currently gathering information to write an article about journal
> > > > > filesystems with emphasis on write barrier functionality, how it
> > > > > works, why journalling filesystems need write barrier and the current
> > > > > implementation of write barrier support for different filesystems.
> > 
> > "Journalling filesystems need write barrier" isn't really accurate.
> > They can make good use of write barrier if it is supported, and where
> > it isn't supported, they should use blkdev_issue_flush in combination
> > with regular submit/wait.
> 
> blkdev_issue_flush() causes a write cache flush - just like a
> barrier typically causes a write cache flush up to the I/O with the
> barrier in it.  Both of these mechanisms provide the same thing - an
> I/O barrier that enforces ordering of I/Os to disk.
> 
> Given that filesystems already indicate to the block layer when they
> want a barrier, wouldn't it be better to get the block layer to issue
> this cache flush if the underlying device doesn't support barriers
> and it receives a barrier request?

A barrier means a lot more than just a flush.
It means
  wait for all preceding requests to commit
  flush
  write this request
  flush

Any block device that uses the io scheduler could probably manage
this.  Other block devices might not find it so easy.
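The wait / flush / write / flush sequence can be emulated in a toy model for a device that only offers a cache flush (illustrative user-space code, not the block layer; all names are invented):

```python
# Toy model of emulating a barrier write on a device that only offers a
# cache flush. All names are invented for illustration.

class FlushOnlyDisk:
    def __init__(self):
        self.in_flight = []   # submitted, not yet completed
        self.cache = []       # completed, but still in the volatile cache
        self.media = []       # durable

    def submit(self, block):
        self.in_flight.append(block)

    def wait_all(self):
        self.cache.extend(self.in_flight)   # completions land in the cache
        self.in_flight.clear()

    def flush(self):
        self.media.extend(self.cache)
        self.cache.clear()

def barrier_write(disk, block):
    disk.wait_all()      # 1. wait for all preceding requests to complete
    disk.flush()         # 2. flush: preceding writes reach the medium
    disk.submit(block)   # 3. write the barrier request itself
    disk.wait_all()
    disk.flush()         # 4. flush: the barrier write reaches the medium

d = FlushOnlyDisk()
d.submit("data1")
d.submit("data2")
barrier_write(d, "commit")   # "commit" lands on the medium strictly last
```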

> 
> Any particular reason for not supporting barriers on the other types
> of RAID?
> 

Imagine trying to implement barriers for raid0 (or any level with
striping).  You would need to
  block new requests
  wait for all requests to all devices to complete
  issue a flush to all devices
  issue the barrier request to the target device
  issue a flush to the target device
  permit new requests.

This means raid0 would need to keep track of all pending requests, 
which it doesn't do.  As the filesystem already does that, it is just as 
efficient to let the filesystem do the work.

I guess raid0 could
  - block new requests
  - submit a no-op barrier to all devices
  - wait for the no-op to complete
  - submit the write/barrier request
  - permit new requests.

This would avoid needing to keep track of all requests.  However, I 
don't think the Linux block layer supports a no-op barrier, and I don't 
think this would actually be better than not supporting barriers.

The real value of barriers (as far as I can see) is that the target 
device can understand them, so you don't need to stall the queue of 
requests flying over the bus to the device.  If you need to stall the 
flow of requests and wait at the OS level, then the value of barriers 
disappears and you may as well wait in the filesystem code.

At least, that is my understanding.  I am happy to be educated.

NeilBrown

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18 19:21           ` David Chinner
@ 2006-07-20 10:34             ` Martin Steigerwald
  2006-07-22  9:31             ` Martin Steigerwald
  1 sibling, 0 replies; 15+ messages in thread
From: Martin Steigerwald @ 2006-07-20 10:34 UTC (permalink / raw)
  To: linux-xfs

Am Dienstag 18 Juli 2006 21:21 schrieb David Chinner:

> > Hello David,
> >
> > well now it gets interesting. If both provide the same thing, whats
> > the difference?
>
> A WRITE_BARRIER I/O can be optimised by smart drivers, protocols and
> hardware to minimise the adverse effects of the barrier, whereas a
> cache flush is a brute force cache cleaning mechanism that cannot be
> optimised....

Hello David,

Thanks for your answer. I think now I understand it:

The driver and the drive perform a plain cache flush immediately, but 
they can delay a write-barrier-related cache flush as long as they make 
sure that all of the I/O requests before the write barrier have 
completed and none of the I/O requests after it have.

A smart driver for a smart drive that supports ordered requests may 
additionally offload request ordering from the operating system to the 
drive.

> > Does a device need to support more than this cache flush in order to
> > support barriers? Up to know I thought that when a device supports
> > cache flushes the kernel can provide barrier functinality for it.
>
> Not necessarily as different device/protocol commands are used.

Do you have any specific information link about this one?

AFAIK it should be possible to implement barrier functionality in the 
driver as soon as the drive supports cache flushes. According to what I 
understand from Documentation/block/barriers.txt, that is the case. That 
would be

"iii. Devices which have queue depth of 1.  This is a degenerate case
of ii.  Just keeping issue order suffices.  Ancient SCSI
controllers/drives and IDE drives are in this category."

in combination with this one:

"iii. Write-back cache and flush operation but no FUA (forced unit
access).  We need two cache flushes - before and after the barrier
request."

The number of cache flushes can be reduced to one if FUA is used:

"iv. Write-back cache, flush operation and FUA.  We still need one
flush to make sure requests preceding a barrier are written to medium,
but post-barrier flush can be avoided by using FUA write on the
barrier itself."

And then there are drives with a queue depth greater than 1, which 
either do or do not support ordered requests.

Ah, I think I finally got all of this and have a complete picture. 
Let's see whether I can put that into a diagram ;-).
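The flush-count difference between cases iii and iv from barriers.txt can be made concrete with a toy model (illustrative only; the Disk class and its methods are invented, not a kernel interface):

```python
# Toy model of the flush cost of a barrier write on a write-back-cache
# device, with and without FUA. All names are invented for illustration.

class Disk:
    def __init__(self, has_fua):
        self.has_fua = has_fua
        self.cache = []    # volatile write cache
        self.media = []    # durable storage
        self.flushes = 0   # how many cache flushes were issued

    def write(self, block, fua=False):
        if fua:
            self.media.append(block)   # FUA bypasses the volatile cache
        else:
            self.cache.append(block)

    def flush(self):
        self.media.extend(self.cache)
        self.cache.clear()
        self.flushes += 1

def barrier_write(disk, block):
    disk.flush()                     # pre-flush: earlier writes hit the medium
    if disk.has_fua:
        disk.write(block, fua=True)  # case iv: FUA write, no post-flush needed
    else:
        disk.write(block)
        disk.flush()                 # case iii: second flush after the barrier

d = Disk(has_fua=True)
d.write("data")
barrier_write(d, "commit")   # costs one flush instead of two
```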

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-18 19:21           ` David Chinner
  2006-07-20 10:34             ` Martin Steigerwald
@ 2006-07-22  9:31             ` Martin Steigerwald
  2006-07-22 10:36               ` Stefan Smietanowski
  1 sibling, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2006-07-22  9:31 UTC (permalink / raw)
  To: linux-xfs

Am Dienstag 18 Juli 2006 21:21 schrieb David Chinner:

> > > blkdev_issue_flush() causes a write cache flush - just like a
> > > barrier typically causes a write cache flush up to the I/O with the
> > > barrier in it.  Both of these mechanisms provide the same thing -
> > > an I/O barrier that enforces ordering of I/Os to disk.
> >
> > Hello David,
> >
> > well now it gets interesting. If both provide the same thing, whats
> > the difference?
>
> A WRITE_BARRIER I/O can be optimised by smart drivers, protocols and
> hardware to minimise the adverse effects of the barrier, whereas a
> cache flush is a brute force cache cleaning mechanism that cannot be
> optimised....

Hello David,

I would like to understand this difference a bit better.

As far as I understand there are three important differences between 
blkdev_issue_flush() and using the new barrier functionality:

1) blkdev_issue_flush() issues a cache flush synchronously and the 
filesystem has to wait for it to return. OTOH a write barrier is like an 
asynchronous cache flush: the filesystem sends a barrier request to the 
block layer and then forgets about it. It can handle other work in the 
meantime, while the block layer takes care of the correct ordering of 
the write requests.

2) Since the filesystem offloads the ordering of the requests to the 
block layer, the block layer can let smart drivers, protocols and 
hardware optimize the request ordering (TCQ devices, for example).

3) A direct cache flush means that the cache flush has to happen 
immediately while with a barrier it can happen some time in the future 
given that it happens before the barrier request is issued.  

So the advantage of the barrier functionality is that it provides 
request ordering at a lower cost for the filesystem.
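Differences 1 and 2 can be sketched with a toy elevator model (illustrative only; invented names, not the real I/O scheduler): ordinary requests may be reordered for efficiency, but never across a barrier, and the filesystem just queues and moves on instead of waiting on a synchronous flush:

```python
# Toy elevator model: ordinary requests may be reordered for efficiency,
# but never across a barrier. Names are invented for illustration; this
# is not the real Linux I/O scheduler.

BARRIER = object()

def schedule(requests):
    """Sort each run of ordinary requests (standing in for elevator
    reordering by sector) while keeping every barrier as a fixed fence."""
    scheduled, run = [], []
    for r in requests:
        if r is BARRIER:
            scheduled.extend(sorted(run))
            run = []
            scheduled.append(r)
        else:
            run.append(r)
    scheduled.extend(sorted(run))
    return scheduled

# The filesystem queues journal writes, a barrier, then more writes,
# without waiting for a synchronous flush in between.
order = schedule([7, 3, BARRIER, 9, 1])
```

The scheduler is free to reorder within each side of the barrier, but nothing queued after the barrier can be scheduled before it - which is exactly the ordering guarantee the filesystem needs.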

Anything to add or correct?

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: XFS and write barrier
  2006-07-22  9:31             ` Martin Steigerwald
@ 2006-07-22 10:36               ` Stefan Smietanowski
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Smietanowski @ 2006-07-22 10:36 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-xfs

[-- Attachment #1: Type: text/plain, Size: 162 bytes --]

> immediately while with a barrier it can happen some time in the future 
> given that it happens before the barrier request is issued.  

Timetravel?

// Stefan

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 251 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2006-07-22 12:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-15 10:48 XFS and write barrier Martin Steigerwald
2006-07-15 19:28 ` Chris Wedgwood
2006-07-16  9:53   ` Martin Steigerwald
2006-07-17  0:43     ` Chris Wedgwood
2006-07-17  1:24       ` Chris Wedgwood
2006-07-16 17:32 ` Federico Sevilla III
2006-07-18  7:31   ` Nathan Scott
2006-07-18  8:58     ` Neil Brown
2006-07-18 17:04       ` David Chinner
2006-07-18 18:27         ` Martin Steigerwald
2006-07-18 19:21           ` David Chinner
2006-07-20 10:34             ` Martin Steigerwald
2006-07-22  9:31             ` Martin Steigerwald
2006-07-22 10:36               ` Stefan Smietanowski
2006-07-18 23:41         ` Neil Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox