* [PATCH 1/3] Preserve the dirty bit in init_page_buffers @ 2007-05-22 2:31 Eric W. Biederman 2007-05-22 2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman 2007-05-28 4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin 0 siblings, 2 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-22 2:31 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin The problem: When we are trying to free buffers try_to_free_buffers will look at ramdisk pages with clean buffer heads and remove the dirty bit from the page. Resulting in ramdisk pages with data that get removed from the page cache. Ouch! Buffer heads appear on ramdisk pages when a filesystem calls getblk, which through a series of function calls eventually calls init_page_buffers. So to fix the mismatch between buffer head state and page state this patch modifies init_page_buffers to transfer the dirty bit from the page to the buffer heads like we currently do for the uptodate bit. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- fs/buffer.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index aa68206..c6b58e8 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev, struct buffer_head *head = page_buffers(page); struct buffer_head *bh = head; int uptodate = PageUptodate(page); + int dirty = PageDirty(page); do { if (!buffer_mapped(bh)) { @@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev, bh->b_blocknr = block; if (uptodate) set_buffer_uptodate(bh); + if (dirty) + set_buffer_dirty(bh); set_buffer_mapped(bh); } block++; -- 1.5.1.1.181.g2de0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty 2007-05-22 2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman @ 2007-05-22 2:36 ` Eric W. Biederman 2007-05-22 2:40 ` [PATCH 3/3] rd: Simplify by using the same helper functions in libfs Eric W. Biederman 2007-05-28 4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin 1 sibling, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2007-05-22 2:36 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin The problem: When we are trying to free buffers try_to_free_buffers will look at ramdisk pages with clean buffer heads and remove the dirty bit from the page. Resulting in ramdisk pages with data that get removed from the page cache. Ouch! When we mark a ramdisk page dirty we call set_page_dirty which then calls ramdisk_set_page_dirty. Currently we don't mark the buffer heads dirty leaving us susceptible to the problem above. So to fix the mismatch between buffer head state and page state this patch modifies ramdisk_set_page_dirty to set the dirty bit on all of the buffers a page may posses. I set the uptodate bit on the buffer head so that later we can use simple_commit_write, and because it is trivially safe. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- drivers/block/rd.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/drivers/block/rd.c b/drivers/block/rd.c index a1512da..41de0f4 100644 --- a/drivers/block/rd.c +++ b/drivers/block/rd.c @@ -184,6 +184,21 @@ static int ramdisk_writepages(struct address_space *mapping, */ static int ramdisk_set_page_dirty(struct page *page) { + struct address_space * const mapping = page_mapping(page); + + spin_lock(&mapping->private_lock); + if (page_has_buffers(page)) { + struct buffer_head *head = page_buffers(page); + struct buffer_head *bh = head; + + do { + set_buffer_uptodate(bh); + set_buffer_dirty(bh); + bh = bh->b_this_page; + } while (bh != head); + } + spin_unlock(&mapping->private_lock); + if (!TestSetPageDirty(page)) return 1; return 0; -- 1.5.1.1.181.g2de0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/3] rd: Simplify by using the same helper functions in libfs 2007-05-22 2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman @ 2007-05-22 2:40 ` Eric W. Biederman 0 siblings, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-22 2:40 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, Nick Piggin While the ramdisk code in the page cache started with the ramfs code it has diverged, and is a result is more complicated then it currently needs to be. This patch simplifies the ramfs code by syncing it with ramfs and similar pieces of code. The big difference is that the ramdisk must cope with people placing buffer heads on it's pages so there is extra code required to mark those buffer heads dirty and uptodate. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- drivers/block/rd.c | 76 +++------------------------------------------------ 1 files changed, 5 insertions(+), 71 deletions(-) diff --git a/drivers/block/rd.c b/drivers/block/rd.c index 41de0f4..56b2b54 100644 --- a/drivers/block/rd.c +++ b/drivers/block/rd.c @@ -92,36 +92,16 @@ static int rd_blocksize = CONFIG_BLK_DEV_RAM_BLOCKSIZE; * aops copied from ramfs. */ -/* - * If a ramdisk page has buffers, some may be uptodate and some may be not. - * To bring the page uptodate we zero out the non-uptodate buffers. The - * page must be locked. - */ static void make_page_uptodate(struct page *page) { + clear_highpage(page); if (page_has_buffers(page)) { struct buffer_head *bh = page_buffers(page); struct buffer_head *head = bh; do { - if (!buffer_uptodate(bh)) { - memset(bh->b_data, 0, bh->b_size); - /* - * akpm: I'm totally undecided about this. The - * buffer has just been magically brought "up to - * date", but nobody should want to be reading - * it anyway, because it hasn't been used for - * anything yet. It is still in a "not read - * from disk yet" state. - * - * But non-uptodate buffers against an uptodate - * page are against the rules. So do it anyway. - */ - set_buffer_uptodate(bh); - } + set_buffer_uptodate(bh); } while ((bh = bh->b_this_page) != head); - } else { - memset(page_address(page), 0, PAGE_CACHE_SIZE); } flush_dcache_page(page); SetPageUptodate(page); @@ -129,55 +109,11 @@ static void make_page_uptodate(struct page *page) static int ramdisk_readpage(struct file *file, struct page *page) { - if (!PageUptodate(page)) - make_page_uptodate(page); + make_page_uptodate(page); unlock_page(page); return 0; } -static int ramdisk_prepare_write(struct file *file, struct page *page, - unsigned offset, unsigned to) -{ - if (!PageUptodate(page)) - make_page_uptodate(page); - return 0; -} - -static int ramdisk_commit_write(struct file *file, struct page *page, - unsigned offset, unsigned to) -{ - set_page_dirty(page); - return 0; -} - -/* - * ->writepage to the blockdev's mapping has to redirty the page so that the - * VM doesn't go and steal it. We return AOP_WRITEPAGE_ACTIVATE so that the VM - * won't try to (pointlessly) write the page again for a while. - * - * Really, these pages should not be on the LRU at all. - */ -static int ramdisk_writepage(struct page *page, struct writeback_control *wbc) -{ - if (!PageUptodate(page)) - make_page_uptodate(page); - SetPageDirty(page); - if (wbc->for_reclaim) - return AOP_WRITEPAGE_ACTIVATE; - unlock_page(page); - return 0; -} - -/* - * This is a little speedup thing: short-circuit attempts to write back the - * ramdisk blockdev inode to its non-existent backing store. - */ -static int ramdisk_writepages(struct address_space *mapping, - struct writeback_control *wbc) -{ - return 0; -} - /* * ramdisk blockdev pages have their own ->set_page_dirty() because we don't * want them to contribute to dirty memory accounting. @@ -206,11 +142,9 @@ static int ramdisk_set_page_dirty(struct page *page) static const struct address_space_operations ramdisk_aops = { .readpage = ramdisk_readpage, - .prepare_write = ramdisk_prepare_write, - .commit_write = ramdisk_commit_write, - .writepage = ramdisk_writepage, + .prepare_write = simple_prepare_write, + .commit_write = simple_commit_write, .set_page_dirty = ramdisk_set_page_dirty, - .writepages = ramdisk_writepages, }; static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector, -- 1.5.1.1.181.g2de0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-22 2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman 2007-05-22 2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman @ 2007-05-28 4:09 ` Nick Piggin 2007-05-28 4:30 ` Eric W. Biederman 1 sibling, 1 reply; 17+ messages in thread From: Nick Piggin @ 2007-05-28 4:09 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Eric W. Biederman wrote: > The problem: When we are trying to free buffers try_to_free_buffers > will look at ramdisk pages with clean buffer heads and remove the > dirty bit from the page. Resulting in ramdisk pages with data that > get removed from the page cache. Ouch! > > Buffer heads appear on ramdisk pages when a filesystem calls getblk, > which through a series of function calls eventually calls > init_page_buffers. > > So to fix the mismatch between buffer head state and page state this > patch modifies init_page_buffers to transfer the dirty bit from the > page to the buffer heads like we currently do for the uptodate bit. Ouch indeed! But can we ever have a dirty page at init_page_buffers-time? I would have thought we can fix this simply by removing the broken ramdisk_set_page_dirty (as far as the comment goes, we set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty should handle everything properly, no?). > Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> > --- > fs/buffer.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/fs/buffer.c b/fs/buffer.c > index aa68206..c6b58e8 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev, > struct buffer_head *head = page_buffers(page); > struct buffer_head *bh = head; > int uptodate = PageUptodate(page); > + int dirty = PageDirty(page); > > do { > if (!buffer_mapped(bh)) { > @@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev, > bh->b_blocknr = block; > if (uptodate) > set_buffer_uptodate(bh); > + if (dirty) > + set_buffer_dirty(bh); > set_buffer_mapped(bh); > } > block++; -- SUSE Labs, Novell Inc. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin @ 2007-05-28 4:30 ` Eric W. Biederman 2007-05-28 4:54 ` Nick Piggin 2007-05-28 6:37 ` Nick Piggin 0 siblings, 2 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-28 4:30 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: > Eric W. Biederman wrote: >> The problem: When we are trying to free buffers try_to_free_buffers >> will look at ramdisk pages with clean buffer heads and remove the >> dirty bit from the page. Resulting in ramdisk pages with data that >> get removed from the page cache. Ouch! >> >> Buffer heads appear on ramdisk pages when a filesystem calls getblk, >> which through a series of function calls eventually calls >> init_page_buffers. >> >> So to fix the mismatch between buffer head state and page state this >> patch modifies init_page_buffers to transfer the dirty bit from the >> page to the buffer heads like we currently do for the uptodate bit. > > Ouch indeed! > > But can we ever have a dirty page at init_page_buffers-time? Definitely, and it was a royal pain to trace the bug that this caused. An initial ramdisk having pieces disappear after mkfs is called can look like the entire machine is dying. When we initialize the ramdisk by writing to /dev/ram0 usually in init/do_mounts_rd.c we don't allocate buffer heads but we do set the dirty bit, and the page is in the page cache. So when we later call getblk it reuses the same page and then calls init_page_buffers. > I would have thought we can fix this simply by removing the > broken ramdisk_set_page_dirty (as far as the comment goes, we > set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty > should handle everything properly, no?). No. I don't know where accounting comes into play. I didn't trace that path. But if we have a non-dirty ramdisk page with buffers (basically a hole in the middle or at the end of the ramdisk). We need to set the buffer dirty bits when we write to it. So I don't see how it would make sense to reuse the generic set_page_dirty, and handling all of the logic in set_page_dirty to dirty the buffer heads seemed to have made the most sense. Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:30 ` Eric W. Biederman @ 2007-05-28 4:54 ` Nick Piggin 2007-05-28 4:59 ` Nick Piggin 2007-05-28 13:52 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman 2007-05-28 6:37 ` Nick Piggin 1 sibling, 2 replies; 17+ messages in thread From: Nick Piggin @ 2007-05-28 4:54 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Eric W. Biederman wrote: > Nick Piggin <nickpiggin@yahoo.com.au> writes: > > >>Eric W. Biederman wrote: >> >>>The problem: When we are trying to free buffers try_to_free_buffers >>>will look at ramdisk pages with clean buffer heads and remove the >>>dirty bit from the page. Resulting in ramdisk pages with data that >>>get removed from the page cache. Ouch! >>> >>>Buffer heads appear on ramdisk pages when a filesystem calls getblk, >>>which through a series of function calls eventually calls >>>init_page_buffers. >>> >>>So to fix the mismatch between buffer head state and page state this >>>patch modifies init_page_buffers to transfer the dirty bit from the >>>page to the buffer heads like we currently do for the uptodate bit. >> >>Ouch indeed! >> >>But can we ever have a dirty page at init_page_buffers-time? > > > Definitely, and it was a royal pain to trace the bug that this > caused. An initial ramdisk having pieces disappear after mkfs > is called can look like the entire machine is dying. > > When we initialize the ramdisk by writing to /dev/ram0 usually in > init/do_mounts_rd.c we don't allocate buffer heads but we do set > the dirty bit, and the page is in the page cache. So when we > later call getblk it reuses the same page and then calls > init_page_buffers. Hmm, so this would be a problem for block_dev.c as well, then? Because it would be possible to have a dirty block dev page have its buffers reclaimed and then reinitialised via init_page_buffers, AFAIKS. >>I would have thought we can fix this simply by removing the >>broken ramdisk_set_page_dirty (as far as the comment goes, we >>set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty >>should handle everything properly, no?). > > > No. I don't know where accounting comes into play. I didn't > trace that path. But if we have a non-dirty ramdisk page with > buffers (basically a hole in the middle or at the end of the ramdisk). > We need to set the buffer dirty bits when we write to it. Accounting is done in set_page_dirty. > > So I don't see how it would make sense to reuse the generic > set_page_dirty, and handling all of the logic in set_page_dirty > to dirty the buffer heads seemed to have made the most sense. That's what the generic set_page_dirty does. What I want to know is why *doesn't* it make sense to reuse the generic set_page_dirty? Unless there is a good reason, then reusing is better than writing your own. -- SUSE Labs, Novell Inc. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:54 ` Nick Piggin @ 2007-05-28 4:59 ` Nick Piggin 2007-05-28 14:05 ` Eric W. Biederman 2007-05-28 13:52 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman 1 sibling, 1 reply; 17+ messages in thread From: Nick Piggin @ 2007-05-28 4:59 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin wrote: > Eric W. Biederman wrote: >> When we initialize the ramdisk by writing to /dev/ram0 usually in >> init/do_mounts_rd.c we don't allocate buffer heads but we do set >> the dirty bit, and the page is in the page cache. So when we >> later call getblk it reuses the same page and then calls >> init_page_buffers. > > > Hmm, so this would be a problem for block_dev.c as well, then? > Because it would be possible to have a dirty block dev page > have its buffers reclaimed and then reinitialised via > init_page_buffers, AFAIKS. Oh, no, try_to_free_buffers won't drop dirty buffers. However we could still set_page_dirty of a block device page without buffers via an mmap. -- SUSE Labs, Novell Inc. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:59 ` Nick Piggin @ 2007-05-28 14:05 ` Eric W. Biederman 2007-05-29 5:14 ` Nick Piggin 0 siblings, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2007-05-28 14:05 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: > Nick Piggin wrote: >> Eric W. Biederman wrote: > >>> When we initialize the ramdisk by writing to /dev/ram0 usually in >>> init/do_mounts_rd.c we don't allocate buffer heads but we do set >>> the dirty bit, and the page is in the page cache. So when we >>> later call getblk it reuses the same page and then calls >>> init_page_buffers. >> >> >> Hmm, so this would be a problem for block_dev.c as well, then? >> Because it would be possible to have a dirty block dev page >> have its buffers reclaimed and then reinitialised via >> init_page_buffers, AFAIKS. > > Oh, no, try_to_free_buffers won't drop dirty buffers. However we > could still set_page_dirty of a block device page without buffers > via an mmap. After the page is made dirty via mmap we have: sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers. I suspect that is a pretty rare case but it does indeed seem to exist as a problem. Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 14:05 ` Eric W. Biederman @ 2007-05-29 5:14 ` Nick Piggin 2007-05-29 5:28 ` Eric W. Biederman 2007-05-31 11:56 ` Eric W. Biederman 0 siblings, 2 replies; 17+ messages in thread From: Nick Piggin @ 2007-05-29 5:14 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Eric W. Biederman wrote: > Nick Piggin <nickpiggin@yahoo.com.au> writes: >> However we >>could still set_page_dirty of a block device page without buffers >>via an mmap. > > > After the page is made dirty via mmap we have: > sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers. Yep, that's what I mean. > I suspect that is a pretty rare case but it does indeed seem to exist > as a problem. I think so too. But either we have some misunderstanding of the codepaths involved, or the author of the comments there didn't consider this case, so... -- SUSE Labs, Novell Inc. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-29 5:14 ` Nick Piggin @ 2007-05-29 5:28 ` Eric W. Biederman 2007-05-31 11:56 ` Eric W. Biederman 1 sibling, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-29 5:28 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: > Eric W. Biederman wrote: >> Nick Piggin <nickpiggin@yahoo.com.au> writes: > >>> However we >>>could still set_page_dirty of a block device page without buffers >>>via an mmap. >> >> >> After the page is made dirty via mmap we have: >> sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers. > > Yep, that's what I mean. > > >> I suspect that is a pretty rare case but it does indeed seem to exist >> as a problem. > > I think so too. But either we have some misunderstanding of the > codepaths involved, or the author of the comments there didn't > consider this case, so... Which is likely. Which is why I brought up the try_to_free_buffers case. There has been some significant dancing around trying to sort things out and make them race free in this code. Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-29 5:14 ` Nick Piggin 2007-05-29 5:28 ` Eric W. Biederman @ 2007-05-31 11:56 ` Eric W. Biederman 2007-05-31 15:55 ` Andrew Morton 1 sibling, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2007-05-31 11:56 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: > Eric W. Biederman wrote: >> Nick Piggin <nickpiggin@yahoo.com.au> writes: > >>> However we >>>could still set_page_dirty of a block device page without buffers >>>via an mmap. >> >> >> After the page is made dirty via mmap we have: >> sys_write -> ... -> block_prepare_write -> ... -> create_empty_buffers. > > Yep, that's what I mean. Actually I just took a second look. That path is fine because create_empty_buffers already performs the page dirty check. >> I suspect that is a pretty rare case but it does indeed seem to exist >> as a problem. > > I think so too. But either we have some misunderstanding of the > codepaths involved, or the author of the comments there didn't > consider this case, so... Do someone needs to stand up and write the additional patches. Do we have a maintainer for fs/buffer.c? Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-31 11:56 ` Eric W. Biederman @ 2007-05-31 15:55 ` Andrew Morton 2007-05-31 16:40 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman 2007-05-31 16:43 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman 0 siblings, 2 replies; 17+ messages in thread From: Andrew Morton @ 2007-05-31 15:55 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Nick Piggin, Linus Torvalds, linux-kernel On Thu, 31 May 2007 05:56:33 -0600 ebiederm@xmission.com (Eric W. Biederman) wrote: > >> I suspect that is a pretty rare case but it does indeed seem to exist > >> as a problem. > > > > I think so too. But either we have some misunderstanding of the > > codepaths involved, or the author of the comments there didn't > > consider this case, so... > > Do someone needs to stand up and write the additional patches. > Do we have a maintainer for fs/buffer.c? me, if any one, I guess. But I haven't really been following this discussion - too long and slow ;) 'twould be good if you could reprise it all in one email, presumably in diff -u form.. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] rd: Remove ramdisk_set_page_dirty 2007-05-31 15:55 ` Andrew Morton @ 2007-05-31 16:40 ` Eric W. Biederman 2007-05-31 16:43 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman 1 sibling, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-31 16:40 UTC (permalink / raw) To: Andrew Morton; +Cc: Nick Piggin, Linus Torvalds, linux-kernel The generic function __set_page_dirty_buffers called by default by set_page_dirty appears to be a correct superset of ramdisk_set_page_dirty. So remove the specialized ramdisk version. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- drivers/block/rd.c | 27 --------------------------- 1 files changed, 0 insertions(+), 27 deletions(-) diff --git a/drivers/block/rd.c b/drivers/block/rd.c index 56b2b54..26b4c6f 100644 --- a/drivers/block/rd.c +++ b/drivers/block/rd.c @@ -114,37 +114,10 @@ static int ramdisk_readpage(struct file *file, struct page *page) return 0; } -/* - * ramdisk blockdev pages have their own ->set_page_dirty() because we don't - * want them to contribute to dirty memory accounting. - */ -static int ramdisk_set_page_dirty(struct page *page) -{ - struct address_space * const mapping = page_mapping(page); - - spin_lock(&mapping->private_lock); - if (page_has_buffers(page)) { - struct buffer_head *head = page_buffers(page); - struct buffer_head *bh = head; - - do { - set_buffer_uptodate(bh); - set_buffer_dirty(bh); - bh = bh->b_this_page; - } while (bh != head); - } - spin_unlock(&mapping->private_lock); - - if (!TestSetPageDirty(page)) - return 1; - return 0; -} - static const struct address_space_operations ramdisk_aops = { .readpage = ramdisk_readpage, .prepare_write = simple_prepare_write, .commit_write = simple_commit_write, - .set_page_dirty = ramdisk_set_page_dirty, }; static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector, -- 1.5.1.1.181.g2de0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH] buffer: Kill old incorrect? comment. 2007-05-31 15:55 ` Andrew Morton 2007-05-31 16:40 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman @ 2007-05-31 16:43 ` Eric W. Biederman 1 sibling, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-31 16:43 UTC (permalink / raw) To: Andrew Morton; +Cc: Nick Piggin, Linus Torvalds, linux-kernel Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- fs/buffer.c | 5 ----- 1 files changed, 0 insertions(+), 5 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index c6b58e8..ad9270b 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1029,11 +1029,6 @@ failed: /* * Create buffers for the specified block device block's page. If * that page was dirty, the buffers are set dirty also. - * - * Except that's a bug. Attaching dirty buffers to a dirty - * blockdev's page can result in filesystem corruption, because - * some of those buffers may be aliases of filesystem data. - * grow_dev_page() will go BUG() if this happens. */ static int grow_buffers(struct block_device *bdev, sector_t block, int size) -- 1.5.1.1.181.g2de0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:54 ` Nick Piggin 2007-05-28 4:59 ` Nick Piggin @ 2007-05-28 13:52 ` Eric W. Biederman 1 sibling, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-28 13:52 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: > Eric W. Biederman wrote: >> Nick Piggin <nickpiggin@yahoo.com.au> writes: > >>>I would have thought we can fix this simply by removing the >>>broken ramdisk_set_page_dirty (as far as the comment goes, we >>>set CAP_NO_ACCT_DIRTY anyway, so the normal set_page_dirty >>>should handle everything properly, no?). >> >> >> No. I don't know where accounting comes into play. I didn't >> trace that path. But if we have a non-dirty ramdisk page with >> buffers (basically a hole in the middle or at the end of the ramdisk). >> We need to set the buffer dirty bits when we write to it. > > Accounting is done in set_page_dirty. Yes. What I meant was I had not looked at the implications of accounting, so I had not looked to see if I could use a generic set_dirty_page. I only got as far as recognizing that __set_page_dirty_no_writeback as not the appropriate function to use because we need to handle buffer heads. >> So I don't see how it would make sense to reuse the generic >> set_page_dirty, and handling all of the logic in set_page_dirty >> to dirty the buffer heads seemed to have made the most sense. > > That's what the generic set_page_dirty does. What I want to know > is why *doesn't* it make sense to reuse the generic set_page_dirty? > Unless there is a good reason, then reusing is better than writing > your own. I did not look at that part in detail. I only realized ramdisk_set_dirty_page needed to be modified upon a final review of my code, as it was not a case I actually hit. Just skimming through it again quickly I don't see a reason at this point to preserve a separate set_dirty_page for the ramdisk code. Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 4:30 ` Eric W. Biederman 2007-05-28 4:54 ` Nick Piggin @ 2007-05-28 6:37 ` Nick Piggin 2007-05-28 13:58 ` Eric W. Biederman 1 sibling, 1 reply; 17+ messages in thread From: Nick Piggin @ 2007-05-28 6:37 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Eric W. Biederman wrote: > Nick Piggin <nickpiggin@yahoo.com.au> writes: > > >>Eric W. Biederman wrote: >> >>>The problem: When we are trying to free buffers try_to_free_buffers >>>will look at ramdisk pages with clean buffer heads and remove the >>>dirty bit from the page. Resulting in ramdisk pages with data that >>>get removed from the page cache. Ouch! >>> >>>Buffer heads appear on ramdisk pages when a filesystem calls getblk, >>>which through a series of function calls eventually calls >>>init_page_buffers. >>> >>>So to fix the mismatch between buffer head state and page state this >>>patch modifies init_page_buffers to transfer the dirty bit from the >>>page to the buffer heads like we currently do for the uptodate bit. >> >>Ouch indeed! >> >>But can we ever have a dirty page at init_page_buffers-time? > > > Definitely, and it was a royal pain to trace the bug that this > caused. An initial ramdisk having pieces disappear after mkfs > is called can look like the entire machine is dying. > > When we initialize the ramdisk by writing to /dev/ram0 usually in > init/do_mounts_rd.c we don't allocate buffer heads but we do set > the dirty bit, and the page is in the page cache. So when we > later call getblk it reuses the same page and then calls > init_page_buffers. Hmm, the comment above grow_dev_buffers indicates this should not happen. But contrary to the comment, it doesn't go BUG unless you're attaching dirty buffers to a page with dirty buffers. I suspect this happens more frequently with rd.c, because unlike block_dev.c, it does not create dirty buffers in prepare_write. However it could still happen in block_dev.c via mmaped memory, as I said earlier. I'm not saying this patch 1/3 is wrong, but we would at least need to revise some comments. The grow_dev_buffers comment looks like one of Andrew's. Maybe he can shed some more light on this? -- SUSE Labs, Novell Inc. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers 2007-05-28 6:37 ` Nick Piggin @ 2007-05-28 13:58 ` Eric W. Biederman 0 siblings, 0 replies; 17+ messages in thread From: Eric W. Biederman @ 2007-05-28 13:58 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, Linus Torvalds, linux-kernel Nick Piggin <nickpiggin@yahoo.com.au> writes: >> Definitely, and it was a royal pain to trace the bug that this >> caused. An initial ramdisk having pieces disappear after mkfs >> is called can look like the entire machine is dying. >> >> When we initialize the ramdisk by writing to /dev/ram0 usually in >> init/do_mounts_rd.c we don't allocate buffer heads but we do set >> the dirty bit, and the page is in the page cache. So when we >> later call getblk it reuses the same page and then calls >> init_page_buffers. > > Hmm, the comment above grow_dev_buffers indicates this should > not happen. But contrary to the comment, it doesn't go BUG > unless you're attaching dirty buffers to a page with dirty > buffers. > > I suspect this happens more frequently with rd.c, because unlike > block_dev.c, it does not create dirty buffers in prepare_write. > However it could still happen in block_dev.c via mmaped memory, > as I said earlier. > > I'm not saying this patch 1/3 is wrong, but we would at least > need to revise some comments. The grow_dev_buffers comment looks > like one of Andrew's. Maybe he can shed some more light on this? Good question. It sounds like you are correct. Either we need to fix those comments or find the root cause for the need to call cancel_dirty_page in try_to_free_buffers and fix that. Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2007-05-31 16:44 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-05-22 2:31 [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman 2007-05-22 2:36 ` [PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty Eric W. Biederman 2007-05-22 2:40 ` [PATCH 3/3] rd: Simplify by using the same helper functions in libfs Eric W. Biederman 2007-05-28 4:09 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Nick Piggin 2007-05-28 4:30 ` Eric W. Biederman 2007-05-28 4:54 ` Nick Piggin 2007-05-28 4:59 ` Nick Piggin 2007-05-28 14:05 ` Eric W. Biederman 2007-05-29 5:14 ` Nick Piggin 2007-05-29 5:28 ` Eric W. Biederman 2007-05-31 11:56 ` Eric W. Biederman 2007-05-31 15:55 ` Andrew Morton 2007-05-31 16:40 ` [PATCH] rd: Remove ramdisk_set_page_dirty Eric W. Biederman 2007-05-31 16:43 ` [PATCH] buffer: Kill old incorrect? comment Eric W. Biederman 2007-05-28 13:52 ` [PATCH 1/3] Preserve the dirty bit in init_page_buffers Eric W. Biederman 2007-05-28 6:37 ` Nick Piggin 2007-05-28 13:58 ` Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox