linux-fsdevel.vger.kernel.org archive mirror
* Mount ext3 with barrier=1 doesn't send real barrier bio?
@ 2008-08-15 21:31 Milan Broz
  2008-08-20 23:38 ` Eric Sandeen
  0 siblings, 1 reply; 8+ messages in thread
From: Milan Broz @ 2008-08-15 21:31 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Jens Axboe, linux-fsdevel

Hi,

I ran some barrier tests over device-mapper (which currently doesn't
support barrier bios at all), and even when I set barrier=1 in the ext3 mount,
there is never any bio with the barrier flag set... (in 2.6.27-rc)

How is the barrier=1 flag supposed to work in ext3 (JBD) now?

See:
If you specify barrier=1, the JFS_BARRIER flag is set in ext3_init_journal_params:
	journal->j_flags |= JFS_BARRIER;

Now, journal_write_commit_record is called and this happens:

	if (journal->j_flags & JFS_BARRIER) {
		set_buffer_ordered(bh);
		barrier_done = 1;
	}
	ret = sync_dirty_buffer(bh);

	if (barrier_done)
		clear_buffer_ordered(bh);

	if (ret == -EOPNOTSUPP && barrier_done) {
	...

From this code I expect that EOPNOTSUPP is returned if barriers are not
supported (yes, that is exactly what device-mapper does now without the barrier patches).
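
The expectation stated here can be modelled in userspace. This is a minimal
sketch, not JBD code: the branch body is elided above and is left that way,
so the retry below is only an assumed shape for such a fallback, and every
name in it (fake_journal, fake_device, fake_write, fake_commit) is
hypothetical.

```c
#include <errno.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
struct fake_journal {
	int barrier_enabled;	/* plays the role of JFS_BARRIER in j_flags */
};

struct fake_device {
	int supports_barrier;	/* 0 models a device without barrier support */
	int writes_done;	/* counts writes that reached the device */
};

/* Reject an ordered write on a device that cannot honour it. */
static int fake_write(struct fake_device *dev, int ordered)
{
	if (ordered && !dev->supports_barrier)
		return -EOPNOTSUPP;
	dev->writes_done++;
	return 0;
}

/*
 * Assumed fallback: try the commit record as an ordered write; on
 * -EOPNOTSUPP switch barriers off for this journal and retry as a
 * plain write.
 */
static int fake_commit(struct fake_journal *j, struct fake_device *dev)
{
	int barrier_done = j->barrier_enabled;
	int ret = fake_write(dev, barrier_done);

	if (ret == -EOPNOTSUPP && barrier_done) {
		j->barrier_enabled = 0;
		ret = fake_write(dev, 0);
	}
	return ret;
}
```

In this model the fallback can only trigger if the ordered request actually
reaches the device as an ordered request.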

But it *never* happens because:

sync_dirty_buffer always calls 
	submit_bh(WRITE_SYNC, bh)

and submit_bh contains this test:

	if (buffer_ordered(bh) && (rw == WRITE))
		rw = WRITE_BARRIER;

but here rw == WRITE_SYNC, not WRITE!

So the barrier flag is never set on the bio, and a normal sync write
is performed.
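
The mismatch described above can be reproduced in userspace. The following is
a minimal sketch, not kernel code: the SK_-prefixed constants and both helper
functions are hypothetical, and the bit positions are assumptions chosen for
illustration. It compares the exact test quoted from submit_bh with the masked
test used in the attached patch.

```c
/*
 * Illustrative values only: the bit positions are assumptions made for
 * this sketch, not copied from the kernel headers.
 */
enum { SK_BIO_RW_BARRIER = 2, SK_BIO_RW_SYNC = 4 };

#define SK_RW_MASK		1
#define SK_WRITE		1
#define SK_WRITE_SYNC		(SK_WRITE | (1 << SK_BIO_RW_SYNC))
#define SK_WRITE_BARRIER	(SK_WRITE | (1 << SK_BIO_RW_BARRIER))

/* The test as quoted from submit_bh: exact comparison against WRITE. */
static int upgrade_exact(int rw, int ordered)
{
	if (ordered && rw == SK_WRITE)
		rw = SK_WRITE_BARRIER;
	return rw;
}

/* The test from the attached patch: look at the direction bit only. */
static int upgrade_masked(int rw, int ordered)
{
	if (ordered && (rw & SK_RW_MASK) == SK_WRITE)
		rw |= (1 << SK_BIO_RW_BARRIER);
	return rw;
}
```

With these definitions, upgrade_exact(SK_WRITE_SYNC, 1) returns its argument
unchanged, while upgrade_masked(SK_WRITE_SYNC, 1) returns a value with both
the sync bit and the barrier bit set.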

Why isn't it done as in the attached patch? Is this intentional, or is it a bug?

I think it was caused by the change in this commit:

commit 18ce3751ccd488c78d3827e9f6bf54e6322676fb
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Tue Jul 1 09:07:34 2008 +0200

    Properly notify block layer of sync writes

Milan
--

Set BIO_RW_BARRIER flag even for submit_bh sync write request.

Signed-off-by: Milan Broz <mbroz@redhat.com>
---
 fs/buffer.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2926,16 +2926,16 @@ int submit_bh(int rw, struct buffer_head * bh)
 	BUG_ON(!buffer_mapped(bh));
 	BUG_ON(!bh->b_end_io);
 
-	if (buffer_ordered(bh) && (rw == WRITE))
-		rw = WRITE_BARRIER;
-
 	/*
 	 * Only clear out a write error when rewriting, should this
 	 * include WRITE_SYNC as well?
 	 */
-	if (test_set_buffer_req(bh) && (rw == WRITE || rw == WRITE_BARRIER))
+	if (test_set_buffer_req(bh) && rw == WRITE)
 		clear_buffer_write_io_error(bh);
 
+	if (buffer_ordered(bh) && ((rw & RW_MASK) == WRITE))
+		rw |= (1 << BIO_RW_BARRIER);
+
 	/*
 	 * from here on down, it's all bio -- do the initial mapping,
 	 * submit_bio -> generic_make_request may further map this bio around






* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-15 21:31 Mount ext3 with barrier=1 doesn't send real barrier bio? Milan Broz
@ 2008-08-20 23:38 ` Eric Sandeen
  2008-08-21  5:26   ` Jens Axboe
  2008-08-21 22:23   ` OGAWA Hirofumi
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sandeen @ 2008-08-20 23:38 UTC (permalink / raw)
  To: Milan Broz
  Cc: Linux Kernel Mailing List, Jens Axboe, linux-fsdevel, Ric Wheeler,
	MASON,CHRISTOPHER

Milan Broz wrote:
> Hi,
> 
> I ran some barrier tests over device-mapper (which currently doesn't
> support barrier bios at all), and even when I set barrier=1 in the ext3 mount,
> there is never any bio with the barrier flag set... (in 2.6.27-rc)
> 
> How is the barrier=1 flag supposed to work in ext3 (JBD) now?

Milan, you're right.  Ric saw this same strange behavior when doing some
benchmarking with and without barriers; Chris noticed the change in
submit_bh; I was about to write up a similar patch to what you've sent
already.  Jens, does Milan's fix look good to you?

Incidentally, I ran Ric's test on ext3 on a SATA drive:

# fs_mark -d /mnt/test -n 1600 -t 1 -s 20480

cfq:
                        files/s
           2.6.25   2.6.26.2  2.6.26.2+patch
barrier=0    169      127       126
barrier=1     33      126        33

noop:
                        files/s
           2.6.25   2.6.26.2  2.6.26.2+patch
barrier=0    191      184       185
barrier=1     33      180        33

deadline:
                        files/s
           2.6.25   2.6.26.2  2.6.26.2+patch
barrier=0    181      182       185
barrier=1     33      185        33

anticipatory:
                        files/s
           2.6.25   2.6.26.2  2.6.26.2+patch
barrier=0    187      133       132
barrier=1     34      134        33

-Eric




* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-20 23:38 ` Eric Sandeen
@ 2008-08-21  5:26   ` Jens Axboe
  2008-08-21 10:43     ` Ric Wheeler
  2008-08-21 22:23   ` OGAWA Hirofumi
  1 sibling, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2008-08-21  5:26 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Milan Broz, Linux Kernel Mailing List, linux-fsdevel, Ric Wheeler,
	MASON,CHRISTOPHER

On Wed, Aug 20 2008, Eric Sandeen wrote:
> Milan Broz wrote:
> > Hi,
> > 
> > I ran some barrier tests over device-mapper (which currently doesn't
> > support barrier bios at all), and even when I set barrier=1 in the ext3 mount,
> > there is never any bio with the barrier flag set... (in 2.6.27-rc)
> > 
> > How is the barrier=1 flag supposed to work in ext3 (JBD) now?
> 
> Milan, you're right.  Ric saw this same strange behavior when doing some
> benchmarking with and without barriers; Chris noticed the change in
> submit_bh; I was about to write up a similar patch to what you've sent
> already.  Jens, does Milan's fix look good to you?

Yep, looks good, thanks a lot Milan! I'll send in the patch; unless I'm
badly mistaken, we need it for 2.6.26-stable as well.

-- 
Jens Axboe


* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-21  5:26   ` Jens Axboe
@ 2008-08-21 10:43     ` Ric Wheeler
  0 siblings, 0 replies; 8+ messages in thread
From: Ric Wheeler @ 2008-08-21 10:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Eric Sandeen, Milan Broz, Linux Kernel Mailing List,
	linux-fsdevel, MASON,CHRISTOPHER

Jens Axboe wrote:
> On Wed, Aug 20 2008, Eric Sandeen wrote:
>   
>> Milan Broz wrote:
>>     
>>> Hi,
>>>
>>> I ran some barrier tests over device-mapper (which currently doesn't
>>> support barrier bios at all), and even when I set barrier=1 in the ext3 mount,
>>> there is never any bio with the barrier flag set... (in 2.6.27-rc)
>>>
>>> How is the barrier=1 flag supposed to work in ext3 (JBD) now?
>>>       
>> Milan, you're right.  Ric saw this same strange behavior when doing some
>> benchmarking with and without barriers; Chris noticed the change in
>> submit_bh; I was about to write up a similar patch to what you've sent
>> already.  Jens, does Milan's fix look good to you?
>>     
>
> Yep, looks good, thanks a lot Milan! I'll send in the patch; unless I'm
> badly mistaken, we need it for 2.6.26-stable as well.
>
>   

I think we definitely need it there as well, thanks!

Ric



* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-20 23:38 ` Eric Sandeen
  2008-08-21  5:26   ` Jens Axboe
@ 2008-08-21 22:23   ` OGAWA Hirofumi
  2008-08-22  6:38     ` Jens Axboe
  1 sibling, 1 reply; 8+ messages in thread
From: OGAWA Hirofumi @ 2008-08-21 22:23 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Milan Broz, Linux Kernel Mailing List, Jens Axboe, linux-fsdevel,
	Ric Wheeler, MASON,CHRISTOPHER

Eric Sandeen <sandeen@sandeen.net> writes:

>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -2926,16 +2926,16 @@ int submit_bh(int rw, struct buffer_head * bh)
>>  	BUG_ON(!buffer_mapped(bh));
>>  	BUG_ON(!bh->b_end_io);
>>  
>> -	if (buffer_ordered(bh) && (rw == WRITE))
>> -		rw = WRITE_BARRIER;
>> -
>>  	/*
>>  	 * Only clear out a write error when rewriting, should this
>>  	 * include WRITE_SYNC as well?
>>  	 */
>> -	if (test_set_buffer_req(bh) && (rw == WRITE || rw == WRITE_BARRIER))
>> +	if (test_set_buffer_req(bh) && rw == WRITE)
>>  		clear_buffer_write_io_error(bh);

Shouldn't this be ((rw & RW_MASK) == WRITE) too?  Anyway, this seems to
change the behavior of submit_bh(WRITE_BARRIER) (maybe reiserfs only), which
presumably isn't your intent...
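
To illustrate the concern about submit_bh(WRITE_BARRIER), here is a minimal
userspace sketch, not kernel code. It models only the rw half of each
condition and ignores test_set_buffer_req(bh); the SK_-prefixed constants and
the three helpers are hypothetical, and the bit positions are assumptions.

```c
/* Assumed bit positions, for illustration only. */
enum { SK_BIO_RW_BARRIER = 2, SK_BIO_RW_SYNC = 4 };

#define SK_RW_MASK		1
#define SK_READ			0
#define SK_WRITE		1
#define SK_WRITE_SYNC		(SK_WRITE | (1 << SK_BIO_RW_SYNC))
#define SK_WRITE_BARRIER	(SK_WRITE | (1 << SK_BIO_RW_BARRIER))

/* rw condition before the patch. */
static int clears_error_before(int rw)
{
	return rw == SK_WRITE || rw == SK_WRITE_BARRIER;
}

/* rw condition in the patch as posted. */
static int clears_error_patch(int rw)
{
	return rw == SK_WRITE;
}

/* rw condition with the mask suggested in this reply. */
static int clears_error_masked(int rw)
{
	return (rw & SK_RW_MASK) == SK_WRITE;
}
```

For a caller that passes SK_WRITE_BARRIER directly, the first and third
conditions hold and the second does not, which is the behavior change in
question.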

>> +	if (buffer_ordered(bh) && ((rw & RW_MASK) == WRITE))
>> +		rw |= (1 << BIO_RW_BARRIER);
>> +
>>  	/*
>>  	 * from here on down, it's all bio -- do the initial mapping,
>>  	 * submit_bio -> generic_make_request may further map this bio around
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>


* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-21 22:23   ` OGAWA Hirofumi
@ 2008-08-22  6:38     ` Jens Axboe
  2008-08-22  7:45       ` OGAWA Hirofumi
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2008-08-22  6:38 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Eric Sandeen, Milan Broz, Linux Kernel Mailing List,
	linux-fsdevel, Ric Wheeler, MASON,CHRISTOPHER

On Fri, Aug 22 2008, OGAWA Hirofumi wrote:
> Eric Sandeen <sandeen@sandeen.net> writes:
> 
> >> --- a/fs/buffer.c
> >> +++ b/fs/buffer.c
> >> @@ -2926,16 +2926,16 @@ int submit_bh(int rw, struct buffer_head * bh)
> >>  	BUG_ON(!buffer_mapped(bh));
> >>  	BUG_ON(!bh->b_end_io);
> >>  
> >> -	if (buffer_ordered(bh) && (rw == WRITE))
> >> -		rw = WRITE_BARRIER;
> >> -
> >>  	/*
> >>  	 * Only clear out a write error when rewriting, should this
> >>  	 * include WRITE_SYNC as well?
> >>  	 */
> >> -	if (test_set_buffer_req(bh) && (rw == WRITE || rw == WRITE_BARRIER))
> >> +	if (test_set_buffer_req(bh) && rw == WRITE)
> >>  		clear_buffer_write_io_error(bh);
> 
> Shouldn't this be ((rw & RW_MASK) == WRITE) too?  Anyway, this seems to
> change the behavior of submit_bh(WRITE_BARRIER) (maybe reiserfs only), which
> presumably isn't your intent...

Yes, I believe the simpler and more correct fix is:

diff --git a/fs/buffer.c b/fs/buffer.c
index 38653e3..16b2263 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2926,14 +2926,13 @@ int submit_bh(int rw, struct buffer_head * bh)
 	BUG_ON(!buffer_mapped(bh));
 	BUG_ON(!bh->b_end_io);
 
-	if (buffer_ordered(bh) && (rw == WRITE))
+	if (buffer_ordered(bh) && (rw & WRITE))
 		rw = WRITE_BARRIER;
 
 	/*
-	 * Only clear out a write error when rewriting, should this
-	 * include WRITE_SYNC as well?
+	 * Only clear out a write error when rewriting
 	 */
-	if (test_set_buffer_req(bh) && (rw == WRITE || rw == WRITE_BARRIER))
+	if (test_set_buffer_req(bh) && (rw & WRITE))
 		clear_buffer_write_io_error(bh);
 
 	/*

-- 
Jens Axboe



* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-22  6:38     ` Jens Axboe
@ 2008-08-22  7:45       ` OGAWA Hirofumi
  2008-08-22  7:58         ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: OGAWA Hirofumi @ 2008-08-22  7:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Eric Sandeen, Milan Broz, Linux Kernel Mailing List,
	linux-fsdevel, Ric Wheeler, MASON,CHRISTOPHER

Jens Axboe <jens.axboe@oracle.com> writes:

>> Shouldn't this be ((rw & RW_MASK) == WRITE) too?  Anyway, this seems to
>> change the behavior of submit_bh(WRITE_BARRIER) (maybe reiserfs only), which
>> presumably isn't your intent...
>
> Yes, I believe the simpler and more correct fix is:
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 38653e3..16b2263 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2926,14 +2926,13 @@ int submit_bh(int rw, struct buffer_head * bh)
>  	BUG_ON(!buffer_mapped(bh));
>  	BUG_ON(!bh->b_end_io);
>  
> -	if (buffer_ordered(bh) && (rw == WRITE))
> +	if (buffer_ordered(bh) && (rw & WRITE))
>  		rw = WRITE_BARRIER;

I see. But, umm... does this mean WRITE_SYNC with a barrier is deprecated?
Or is it a typo?
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>


* Re: Mount ext3 with barrier=1 doesn't send real barrier bio?
  2008-08-22  7:45       ` OGAWA Hirofumi
@ 2008-08-22  7:58         ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2008-08-22  7:58 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Eric Sandeen, Milan Broz, Linux Kernel Mailing List,
	linux-fsdevel, Ric Wheeler, MASON,CHRISTOPHER

On Fri, Aug 22 2008, OGAWA Hirofumi wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
> 
> >> Shouldn't this be ((rw & RW_MASK) == WRITE) too?  Anyway, this seems to
> >> change the behavior of submit_bh(WRITE_BARRIER) (maybe reiserfs only), which
> >> presumably isn't your intent...
> >
> > Yes, I believe the simpler and more correct fix is:
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 38653e3..16b2263 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -2926,14 +2926,13 @@ int submit_bh(int rw, struct buffer_head * bh)
> >  	BUG_ON(!buffer_mapped(bh));
> >  	BUG_ON(!bh->b_end_io);
> >  
> > -	if (buffer_ordered(bh) && (rw == WRITE))
> > +	if (buffer_ordered(bh) && (rw & WRITE))
> >  		rw = WRITE_BARRIER;
> 
> I see. But, umm... does this mean WRITE_SYNC with a barrier is deprecated?
> Or is it a typo?

It was supposed to read rw |= WRITE_BARRIER :-)
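
The difference between the two spellings can be shown with a minimal
userspace sketch, not kernel code. The SK_-prefixed constants and both
helpers are hypothetical, and the bit positions are assumptions chosen for
illustration.

```c
/* Assumed bit positions, for illustration only. */
enum { SK_BIO_RW_BARRIER = 2, SK_BIO_RW_SYNC = 4 };

#define SK_WRITE		1
#define SK_WRITE_SYNC		(SK_WRITE | (1 << SK_BIO_RW_SYNC))
#define SK_WRITE_BARRIER	(SK_WRITE | (1 << SK_BIO_RW_BARRIER))

/* Plain assignment: any other bits the caller set in rw are discarded. */
static int upgrade_assign(int rw, int ordered)
{
	if (ordered && (rw & SK_WRITE))
		rw = SK_WRITE_BARRIER;
	return rw;
}

/* OR-assignment: the barrier bit is added and existing bits are kept. */
static int upgrade_or(int rw, int ordered)
{
	if (ordered && (rw & SK_WRITE))
		rw |= SK_WRITE_BARRIER;
	return rw;
}
```

For a plain SK_WRITE both helpers return the same value. For SK_WRITE_SYNC,
upgrade_assign drops the sync bit while upgrade_or keeps it alongside the
barrier bit.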

-- 
Jens Axboe

