public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] Preserve the dirty bit in init_page_buffers
@ 2007-05-22  2:31 Eric W. Biederman
  2007-05-22  2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman
  2007-05-28  4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin
  0 siblings, 2 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-22  2:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin


The problem:  When we are trying to free buffers try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page.  Resulting in ramdisk pages with data that
get removed from the page cache.  Ouch!

Buffer heads appear on ramdisk pages when a filesystem calls getblk,
which through a series of function calls eventually calls
init_page_buffers. 

So to fix the mismatch between buffer head state and page state this
patch modifies init_page_buffers to transfer the dirty bit from the
page to the buffer heads like we currently do for the uptodate bit.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/buffer.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index aa68206..c6b58e8 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev,
 	struct buffer_head *head = page_buffers(page);
 	struct buffer_head *bh = head;
 	int uptodate = PageUptodate(page);
+	int dirty = PageDirty(page);
 
 	do {
 		if (!buffer_mapped(bh)) {
@@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev,
 			bh->b_blocknr = block;
 			if (uptodate)
 				set_buffer_uptodate(bh);
+			if (dirty)
+				set_buffer_dirty(bh);
 			set_buffer_mapped(bh);
 		}
 		block++;
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty
  2007-05-22  2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
@ 2007-05-22  2:36 ` Eric W. Biederman
  2007-05-22  2:40   ` [PATCH 3/3] rd: Simplify by using the same helper functions in libfs Eric W. Biederman
  2007-05-28  4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-22  2:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin


The problem:  When we are trying to free buffers try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page.  Resulting in ramdisk pages with data that
get removed from the page cache.  Ouch!

When we mark a ramdisk page dirty we call set_page_dirty which then
calls ramdisk_set_page_dirty.  Currently we don't mark the buffer
heads dirty leaving us susceptible to the problem above.

So to fix the mismatch between buffer head state and page state this patch
modifies ramdisk_set_page_dirty to set the dirty bit on all of the buffers
a page may posses.

I set the uptodate bit on the buffer head so that later we can use
simple_commit_write, and because it is trivially safe.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 drivers/block/rd.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index a1512da..41de0f4 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -184,6 +184,21 @@ static int ramdisk_writepages(struct address_space *mapping,
  */
 static int ramdisk_set_page_dirty(struct page *page)
 {
+	struct address_space * const mapping = page_mapping(page);
+
+	spin_lock(&mapping->private_lock);
+	if (page_has_buffers(page)) {
+		struct buffer_head *head = page_buffers(page);
+		struct buffer_head *bh = head;
+
+		do {
+			set_buffer_uptodate(bh);
+			set_buffer_dirty(bh);
+			bh = bh->b_this_page;
+		} while (bh != head);
+	}
+	spin_unlock(&mapping->private_lock);
+
 	if (!TestSetPageDirty(page))
 		return 1;
 	return 0;
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] rd: Simplify by using the same helper functions in libfs
  2007-05-22  2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman
@ 2007-05-22  2:40   ` Eric W. Biederman
  0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-22  2:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin


While the ramdisk code in the page cache started with the ramfs
code it has diverged, and is a result is more complicated then
it currently needs to be.  This patch simplifies the ramfs
code by syncing it with ramfs and similar pieces of code.

The big difference is that the ramdisk must cope with people placing
buffer heads on it's pages so there is extra code required to mark
those buffer heads dirty and uptodate.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 drivers/block/rd.c |   76 +++------------------------------------------------
 1 files changed, 5 insertions(+), 71 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index 41de0f4..56b2b54 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -92,36 +92,16 @@ static int rd_blocksize = CONFIG_BLK_DEV_RAM_BLOCKSIZE;
  * aops copied from ramfs.
  */
 
-/*
- * If a ramdisk page has buffers, some may be uptodate and some may be not.
- * To bring the page uptodate we zero out the non-uptodate buffers.  The
- * page must be locked.
- */
 static void make_page_uptodate(struct page *page)
 {
+	clear_highpage(page);
 	if (page_has_buffers(page)) {
 		struct buffer_head *bh = page_buffers(page);
 		struct buffer_head *head = bh;
 
 		do {
-			if (!buffer_uptodate(bh)) {
-				memset(bh->b_data, 0, bh->b_size);
-				/*
-				 * akpm: I'm totally undecided about this.  The
-				 * buffer has just been magically brought "up to
-				 * date", but nobody should want to be reading
-				 * it anyway, because it hasn't been used for
-				 * anything yet.  It is still in a "not read
-				 * from disk yet" state.
-				 *
-				 * But non-uptodate buffers against an uptodate
-				 * page are against the rules.  So do it anyway.
-				 */
-				 set_buffer_uptodate(bh);
-			}
+			set_buffer_uptodate(bh);
 		} while ((bh = bh->b_this_page) != head);
-	} else {
-		memset(page_address(page), 0, PAGE_CACHE_SIZE);
 	}
 	flush_dcache_page(page);
 	SetPageUptodate(page);
@@ -129,55 +109,11 @@ static void make_page_uptodate(struct page *page)
 
 static int ramdisk_readpage(struct file *file, struct page *page)
 {
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
+	make_page_uptodate(page);
 	unlock_page(page);
 	return 0;
 }
 
-static int ramdisk_prepare_write(struct file *file, struct page *page,
-				unsigned offset, unsigned to)
-{
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
-	return 0;
-}
-
-static int ramdisk_commit_write(struct file *file, struct page *page,
-				unsigned offset, unsigned to)
-{
-	set_page_dirty(page);
-	return 0;
-}
-
-/*
- * ->writepage to the blockdev's mapping has to redirty the page so that the
- * VM doesn't go and steal it.  We return AOP_WRITEPAGE_ACTIVATE so that the VM
- * won't try to (pointlessly) write the page again for a while.
- *
- * Really, these pages should not be on the LRU at all.
- */
-static int ramdisk_writepage(struct page *page, struct writeback_control *wbc)
-{
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
-	SetPageDirty(page);
-	if (wbc->for_reclaim)
-		return AOP_WRITEPAGE_ACTIVATE;
-	unlock_page(page);
-	return 0;
-}
-
-/*
- * This is a little speedup thing: short-circuit attempts to write back the
- * ramdisk blockdev inode to its non-existent backing store.
- */
-static int ramdisk_writepages(struct address_space *mapping,
-				struct writeback_control *wbc)
-{
-	return 0;
-}
-
 /*
  * ramdisk blockdev pages have their own ->set_page_dirty() because we don't
  * want them to contribute to dirty memory accounting.
@@ -206,11 +142,9 @@ static int ramdisk_set_page_dirty(struct page *page)
 
 static const struct address_space_operations ramdisk_aops = {
 	.readpage	= ramdisk_readpage,
-	.prepare_write	= ramdisk_prepare_write,
-	.commit_write	= ramdisk_commit_write,
-	.writepage	= ramdisk_writepage,
+	.prepare_write	= simple_prepare_write,
+	.commit_write	= simple_commit_write,
 	.set_page_dirty	= ramdisk_set_page_dirty,
-	.writepages	= ramdisk_writepages,
 };
 
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-22  2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
  2007-05-22  2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman
@ 2007-05-28  4:09 ` Nick Piggin
  2007-05-28  4:30   ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Nick Piggin @ 2007-05-28  4:09 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Eric W. Biederman wrote:
> The problem:  When we are trying to free buffers try_to_free_buffers
> will look at ramdisk pages with clean buffer heads and remove the
> dirty bit from the page.  Resulting in ramdisk pages with data that
> get removed from the page cache.  Ouch!
> 
> Buffer heads appear on ramdisk pages when a filesystem calls getblk,
> which through a series of function calls eventually calls
> init_page_buffers. 
> 
> So to fix the mismatch between buffer head state and page state this
> patch modifies init_page_buffers to transfer the dirty bit from the
> page to the buffer heads like we currently do for the uptodate bit.

Ouch indeed!

But can we ever have a dirty page at init_page_buffers-time?

I would have thought we can fix this simply by removing the
broken ramdisk_set_page_dirty (as far as the comment goes, we
set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty
should handle everything properly, no?).


> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  fs/buffer.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index aa68206..c6b58e8 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev,
>  	struct buffer_head *head = page_buffers(page);
>  	struct buffer_head *bh = head;
>  	int uptodate = PageUptodate(page);
> +	int dirty = PageDirty(page);
>  
>  	do {
>  		if (!buffer_mapped(bh)) {
> @@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev,
>  			bh->b_blocknr = block;
>  			if (uptodate)
>  				set_buffer_uptodate(bh);
> +			if (dirty)
> +				set_buffer_dirty(bh);
>  			set_buffer_mapped(bh);
>  		}
>  		block++;


-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin
@ 2007-05-28  4:30   ` Eric W. Biederman
  2007-05-28  4:54     ` Nick Piggin
  2007-05-28  6:37     ` Nick Piggin
  0 siblings, 2 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-28  4:30 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

> Eric W. Biederman wrote:
>> The problem:  When we are trying to free buffers try_to_free_buffers
>> will look at ramdisk pages with clean buffer heads and remove the
>> dirty bit from the page.  Resulting in ramdisk pages with data that
>> get removed from the page cache.  Ouch!
>>
>> Buffer heads appear on ramdisk pages when a filesystem calls getblk,
>> which through a series of function calls eventually calls
>> init_page_buffers.
>>
>> So to fix the mismatch between buffer head state and page state this
>> patch modifies init_page_buffers to transfer the dirty bit from the
>> page to the buffer heads like we currently do for the uptodate bit.
>
> Ouch indeed!
>
> But can we ever have a dirty page at init_page_buffers-time?

Definitely, and it was a royal pain to trace the bug that this
caused.  An initial ramdisk having pieces disappear after mkfs
is called can look like the entire machine is dying.

When we initialize the ramdisk by writing to /dev/ram0 usually in
init/do_mounts_rd.c we don't allocate buffer heads but we do set
the dirty bit, and the page is in the page cache.  So when we
later call getblk it reuses the same page and then calls
init_page_buffers.

> I would have thought we can fix this simply by removing the
> broken ramdisk_set_page_dirty (as far as the comment goes, we
> set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty
> should handle everything properly, no?).

No.  I don't know where accounting comes into play.  I didn't
trace that path.  But if we have a non-dirty ramdisk page with
buffers (basically a hole in the middle or at the end of the ramdisk).
We need to set the buffer dirty bits when we write to it.

So I don't see how it would make sense to reuse the generic
set_page_dirty, and handling all of the logic in set_page_dirty
to dirty the buffer heads seemed to have made the most sense.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:30   ` Eric W. Biederman
@ 2007-05-28  4:54     ` Nick Piggin
  2007-05-28  4:59       ` Nick Piggin
  2007-05-28 13:52       ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
  2007-05-28  6:37     ` Nick Piggin
  1 sibling, 2 replies; 17+ messages in thread
From: Nick Piggin @ 2007-05-28  4:54 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> writes:
> 
> 
>>Eric W. Biederman wrote:
>>
>>>The problem:  When we are trying to free buffers try_to_free_buffers
>>>will look at ramdisk pages with clean buffer heads and remove the
>>>dirty bit from the page.  Resulting in ramdisk pages with data that
>>>get removed from the page cache.  Ouch!
>>>
>>>Buffer heads appear on ramdisk pages when a filesystem calls getblk,
>>>which through a series of function calls eventually calls
>>>init_page_buffers.
>>>
>>>So to fix the mismatch between buffer head state and page state this
>>>patch modifies init_page_buffers to transfer the dirty bit from the
>>>page to the buffer heads like we currently do for the uptodate bit.
>>
>>Ouch indeed!
>>
>>But can we ever have a dirty page at init_page_buffers-time?
> 
> 
> Definitely, and it was a royal pain to trace the bug that this
> caused.  An initial ramdisk having pieces disappear after mkfs
> is called can look like the entire machine is dying.
> 
> When we initialize the ramdisk by writing to /dev/ram0 usually in
> init/do_mounts_rd.c we don't allocate buffer heads but we do set
> the dirty bit, and the page is in the page cache.  So when we
> later call getblk it reuses the same page and then calls
> init_page_buffers.

Hmm, so this would be a problem for block_dev.c as well, then?
Because it would be possible to have a dirty block dev page
have its buffers reclaimed and then reinitialised via
init_page_buffers, AFAIKS.


>>I would have thought we can fix this simply by removing the
>>broken ramdisk_set_page_dirty (as far as the comment goes, we
>>set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty
>>should handle everything properly, no?).
> 
> 
> No.  I don't know where accounting comes into play.  I didn't
> trace that path.  But if we have a non-dirty ramdisk page with
> buffers (basically a hole in the middle or at the end of the ramdisk).
> We need to set the buffer dirty bits when we write to it.

Accounting is done in set_page_dirty.

> 
> So I don't see how it would make sense to reuse the generic
> set_page_dirty, and handling all of the logic in set_page_dirty
> to dirty the buffer heads seemed to have made the most sense.

That's what the generic set_page_dirty does. What I want to know
is why *doesn't* it make sense to reuse the generic set_page_dirty?
Unless there is a good reason, then reusing is better than writing
your own.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:54     ` Nick Piggin
@ 2007-05-28  4:59       ` Nick Piggin
  2007-05-28 14:05         ` Eric W. Biederman
  2007-05-28 13:52       ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Nick Piggin @ 2007-05-28  4:59 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin wrote:
> Eric W. Biederman wrote:

>> When we initialize the ramdisk by writing to /dev/ram0 usually in
>> init/do_mounts_rd.c we don't allocate buffer heads but we do set
>> the dirty bit, and the page is in the page cache.  So when we
>> later call getblk it reuses the same page and then calls
>> init_page_buffers.
> 
> 
> Hmm, so this would be a problem for block_dev.c as well, then?
> Because it would be possible to have a dirty block dev page
> have its buffers reclaimed and then reinitialised via
> init_page_buffers, AFAIKS.

Oh, no, try_to_free_buffers won't drop dirty buffers. However we
could still set_page_dirty of a block device page without buffers
via an mmap.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:30   ` Eric W. Biederman
  2007-05-28  4:54     ` Nick Piggin
@ 2007-05-28  6:37     ` Nick Piggin
  2007-05-28 13:58       ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Nick Piggin @ 2007-05-28  6:37 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> writes:
> 
> 
>>Eric W. Biederman wrote:
>>
>>>The problem:  When we are trying to free buffers try_to_free_buffers
>>>will look at ramdisk pages with clean buffer heads and remove the
>>>dirty bit from the page.  Resulting in ramdisk pages with data that
>>>get removed from the page cache.  Ouch!
>>>
>>>Buffer heads appear on ramdisk pages when a filesystem calls getblk,
>>>which through a series of function calls eventually calls
>>>init_page_buffers.
>>>
>>>So to fix the mismatch between buffer head state and page state this
>>>patch modifies init_page_buffers to transfer the dirty bit from the
>>>page to the buffer heads like we currently do for the uptodate bit.
>>
>>Ouch indeed!
>>
>>But can we ever have a dirty page at init_page_buffers-time?
> 
> 
> Definitely, and it was a royal pain to trace the bug that this
> caused.  An initial ramdisk having pieces disappear after mkfs
> is called can look like the entire machine is dying.
> 
> When we initialize the ramdisk by writing to /dev/ram0 usually in
> init/do_mounts_rd.c we don't allocate buffer heads but we do set
> the dirty bit, and the page is in the page cache.  So when we
> later call getblk it reuses the same page and then calls
> init_page_buffers.

Hmm, the comment above grow_dev_buffers indicates this should
not happen. But contrary to the comment, it doesn't go BUG
unless you're attaching dirty buffers to a page with dirty
buffers.

I suspect this happens more frequently with rd.c, because unlike
block_dev.c, it does not create dirty buffers in prepare_write.
However it could still happen in block_dev.c via mmaped memory,
as I said earlier.

I'm not saying this patch 1/3 is wrong, but we would at least
need to revise some comments. The grow_dev_buffers comment looks
like one of Andrew's. Maybe he can shed some more light on this?

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:54     ` Nick Piggin
  2007-05-28  4:59       ` Nick Piggin
@ 2007-05-28 13:52       ` Eric W. Biederman
  1 sibling, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-28 13:52 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

> Eric W. Biederman wrote:
>> Nick Piggin <nickpiggin@yahoo.com.au> writes:
>
>>>I would have thought we can fix this simply by removing the
>>>broken ramdisk_set_page_dirty (as far as the comment goes, we
>>>set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty
>>>should handle everything properly, no?).
>>
>>
>> No.  I don't know where accounting comes into play.  I didn't
>> trace that path.  But if we have a non-dirty ramdisk page with
>> buffers (basically a hole in the middle or at the end of the ramdisk).
>> We need to set the buffer dirty bits when we write to it.
>
> Accounting is done in set_page_dirty.

Yes.  What I meant was I had not looked at the implications of
accounting, so I had not looked to see if I could use a generic
set_dirty_page.

I only got as far as recognizing that __set_page_dirty_no_writeback as
not the appropriate function to use because we need to handle buffer
heads.

>> So I don't see how it would make sense to reuse the generic
>> set_page_dirty, and handling all of the logic in set_page_dirty
>> to dirty the buffer heads seemed to have made the most sense.
>
> That's what the generic set_page_dirty does. What I want to know
> is why *doesn't* it make sense to reuse the generic set_page_dirty?
> Unless there is a good reason, then reusing is better than writing
> your own.

I did not look at that part in detail.  I only realized
ramdisk_set_dirty_page needed to be modified upon a final review of my
code, as it was not a case I actually hit.

Just skimming through it again quickly I don't see a reason at this
point to preserve a separate set_dirty_page for the ramdisk code.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  6:37     ` Nick Piggin
@ 2007-05-28 13:58       ` Eric W. Biederman
  0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-28 13:58 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

>> Definitely, and it was a royal pain to trace the bug that this
>> caused.  An initial ramdisk having pieces disappear after mkfs
>> is called can look like the entire machine is dying.
>>
>> When we initialize the ramdisk by writing to /dev/ram0 usually in
>> init/do_mounts_rd.c we don't allocate buffer heads but we do set
>> the dirty bit, and the page is in the page cache.  So when we
>> later call getblk it reuses the same page and then calls
>> init_page_buffers.
>
> Hmm, the comment above grow_dev_buffers indicates this should
> not happen. But contrary to the comment, it doesn't go BUG
> unless you're attaching dirty buffers to a page with dirty
> buffers.
>
> I suspect this happens more frequently with rd.c, because unlike
> block_dev.c, it does not create dirty buffers in prepare_write.
> However it could still happen in block_dev.c via mmaped memory,
> as I said earlier.
>
> I'm not saying this patch 1/3 is wrong, but we would at least
> need to revise some comments. The grow_dev_buffers comment looks
> like one of Andrew's. Maybe he can shed some more light on this?

Good question.  It sounds like you are correct.

Either we need to fix those comments or find the root
cause for the need to call cancel_dirty_page in try_to_free_buffers
and fix that.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28  4:59       ` Nick Piggin
@ 2007-05-28 14:05         ` Eric W. Biederman
  2007-05-29  5:14           ` Nick Piggin
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-28 14:05 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

> Nick Piggin wrote:
>> Eric W. Biederman wrote:
>
>>> When we initialize the ramdisk by writing to /dev/ram0 usually in
>>> init/do_mounts_rd.c we don't allocate buffer heads but we do set
>>> the dirty bit, and the page is in the page cache.  So when we
>>> later call getblk it reuses the same page and then calls
>>> init_page_buffers.
>>
>>
>> Hmm, so this would be a problem for block_dev.c as well, then?
>> Because it would be possible to have a dirty block dev page
>> have its buffers reclaimed and then reinitialised via
>> init_page_buffers, AFAIKS.
>
> Oh, no, try_to_free_buffers won't drop dirty buffers. However we
> could still set_page_dirty of a block device page without buffers
> via an mmap.

After the page is made dirty via mmap we have:
sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers.

I suspect that is a pretty rare case but it does indeed seem to exist
as a problem.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-28 14:05         ` Eric W. Biederman
@ 2007-05-29  5:14           ` Nick Piggin
  2007-05-29  5:28             ` Eric W. Biederman
  2007-05-31 11:56             ` Eric W. Biederman
  0 siblings, 2 replies; 17+ messages in thread
From: Nick Piggin @ 2007-05-29  5:14 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> writes:

>> However we
>>could still set_page_dirty of a block device page without buffers
>>via an mmap.
> 
> 
> After the page is made dirty via mmap we have:
> sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers.

Yep, that's what I mean.


> I suspect that is a pretty rare case but it does indeed seem to exist
> as a problem.

I think so too. But either we have some misunderstanding of the
codepaths involved, or the author of the comments there didn't
consider this case, so...

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-29  5:14           ` Nick Piggin
@ 2007-05-29  5:28             ` Eric W. Biederman
  2007-05-31 11:56             ` Eric W. Biederman
  1 sibling, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-29  5:28 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

> Eric W. Biederman wrote:
>> Nick Piggin <nickpiggin@yahoo.com.au> writes:
>
>>> However we
>>>could still set_page_dirty of a block device page without buffers
>>>via an mmap.
>>
>>
>> After the page is made dirty via mmap we have:
>> sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers.
>
> Yep, that's what I mean.
>
>
>> I suspect that is a pretty rare case but it does indeed seem to exist
>> as a problem.
>
> I think so too. But either we have some misunderstanding of the
> codepaths involved, or the author of the comments there didn't
> consider this case, so...

Which is likely.

Which is why I brought up the try_to_free_buffers case.

There has been some significant dancing around trying to
sort things out and make them race free in this code.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-29  5:14           ` Nick Piggin
  2007-05-29  5:28             ` Eric W. Biederman
@ 2007-05-31 11:56             ` Eric W. Biederman
  2007-05-31 15:55               ` Andrew Morton
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-31 11:56 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> writes:

> Eric W. Biederman wrote:
>> Nick Piggin <nickpiggin@yahoo.com.au> writes:
>
>>> However we
>>>could still set_page_dirty of a block device page without buffers
>>>via an mmap.
>>
>>
>> After the page is made dirty via mmap we have:
>> sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers.
>
> Yep, that's what I mean.

Actually I just took a second look.  That path is fine because
create_empty_buffers already performs the page dirty check.

>> I suspect that is a pretty rare case but it does indeed seem to exist
>> as a problem.
>
> I think so too. But either we have some misunderstanding of the
> codepaths involved, or the author of the comments there didn't
> consider this case, so...

Do someone needs to stand up and write the additional patches.
Do we have a maintainer for fs/buffer.c?

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
  2007-05-31 11:56             ` Eric W. Biederman
@ 2007-05-31 15:55               ` Andrew Morton
  2007-05-31 16:40                 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman
  2007-05-31 16:43                 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman
  0 siblings, 2 replies; 17+ messages in thread
From: Andrew Morton @ 2007-05-31 15:55 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Nick Piggin, Linus Torvalds, linux-kernel

On Thu, 31 May 2007 05:56:33 -0600 ebiederm@xmission.com (Eric W. Biederman) wrote:

> >> I suspect that is a pretty rare case but it does indeed seem to exist
> >> as a problem.
> >
> > I think so too. But either we have some misunderstanding of the
> > codepaths involved, or the author of the comments there didn't
> > consider this case, so...
> 
> Do someone needs to stand up and write the additional patches.
> Do we have a maintainer for fs/buffer.c?

me, if any one, I guess.  But I haven't really been following
this discussion - too long and slow ;)  'twould be good if you
could reprise it all in one email, presumably in diff -u form..

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] rd: Remove ramdisk_set_page_dirty
  2007-05-31 15:55               ` Andrew Morton
@ 2007-05-31 16:40                 ` Eric W. Biederman
  2007-05-31 16:43                 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman
  1 sibling, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-31 16:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, Linus Torvalds, linux-kernel


The generic function __set_page_dirty_buffers called by default by
set_page_dirty appears to be a correct superset of ramdisk_set_page_dirty.
So remove the specialized ramdisk version.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 drivers/block/rd.c |   27 ---------------------------
 1 files changed, 0 insertions(+), 27 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index 56b2b54..26b4c6f 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -114,37 +114,10 @@ static int ramdisk_readpage(struct file *file, struct page *page)
 	return 0;
 }
 
-/*
- * ramdisk blockdev pages have their own ->set_page_dirty() because we don't
- * want them to contribute to dirty memory accounting.
- */
-static int ramdisk_set_page_dirty(struct page *page)
-{
-	struct address_space * const mapping = page_mapping(page);
-
-	spin_lock(&mapping->private_lock);
-	if (page_has_buffers(page)) {
-		struct buffer_head *head = page_buffers(page);
-		struct buffer_head *bh = head;
-
-		do {
-			set_buffer_uptodate(bh);
-			set_buffer_dirty(bh);
-			bh = bh->b_this_page;
-		} while (bh != head);
-	}
-	spin_unlock(&mapping->private_lock);
-
-	if (!TestSetPageDirty(page))
-		return 1;
-	return 0;
-}
-
 static const struct address_space_operations ramdisk_aops = {
 	.readpage	= ramdisk_readpage,
 	.prepare_write	= simple_prepare_write,
 	.commit_write	= simple_commit_write,
-	.set_page_dirty	= ramdisk_set_page_dirty,
 };
 
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH] buffer:  Kill old incorrect? comment.
  2007-05-31 15:55               ` Andrew Morton
  2007-05-31 16:40                 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman
@ 2007-05-31 16:43                 ` Eric W. Biederman
  1 sibling, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-31 16:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, Linus Torvalds, linux-kernel


Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/buffer.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index c6b58e8..ad9270b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1029,11 +1029,6 @@ failed:
 /*
  * Create buffers for the specified block device block's page.  If
  * that page was dirty, the buffers are set dirty also.
- *
- * Except that's a bug.  Attaching dirty buffers to a dirty
- * blockdev's page can result in filesystem corruption, because
- * some of those buffers may be aliases of filesystem data.
- * grow_dev_page() will go BUG() if this happens.
  */
 static int
 grow_buffers(struct block_device *bdev, sector_t block, int size)
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-05-31 16:44 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-22  2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
2007-05-22  2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman
2007-05-22  2:40   ` [PATCH 3/3] rd: Simplify by using the same helper functions in libfs Eric W. Biederman
2007-05-28  4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin
2007-05-28  4:30   ` Eric W. Biederman
2007-05-28  4:54     ` Nick Piggin
2007-05-28  4:59       ` Nick Piggin
2007-05-28 14:05         ` Eric W. Biederman
2007-05-29  5:14           ` Nick Piggin
2007-05-29  5:28             ` Eric W. Biederman
2007-05-31 11:56             ` Eric W. Biederman
2007-05-31 15:55               ` Andrew Morton
2007-05-31 16:40                 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman
2007-05-31 16:43                 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman
2007-05-28 13:52       ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman
2007-05-28  6:37     ` Nick Piggin
2007-05-28 13:58       ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox