* WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! @ 2006-12-22 14:28 Thomas Meyer 2006-12-22 21:30 ` Jean Delvare 0 siblings, 1 reply; 7+ messages in thread From: Thomas Meyer @ 2006-12-22 14:28 UTC (permalink / raw) To: linux-kernel Again current git head: I guess this should be fixed by someone! WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! make[1]: *** [__modpost] Fehler 1 make: *** [modules] Fehler 2 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-22 14:28 WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! Thomas Meyer @ 2006-12-22 21:30 ` Jean Delvare 2006-12-23 0:25 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Jean Delvare @ 2006-12-22 21:30 UTC (permalink / raw) To: Thomas Meyer, Linus Torvalds, Steve French; +Cc: linux-kernel On Fri, 22 Dec 2006 15:28:45 +0100, Thomas Meyer wrote: > Again current git head: > > I guess this should be fixed by someone! > > WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! > make[1]: *** [__modpost] Fehler 1 > make: *** [modules] Fehler 2 This is caused by this commit: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 >From the log message: "Some filesystems need to be fixed up for this: CIFS, FUSE, JFS, ReiserFS, XFS all use the old confusing functions, and will be fixed separately in subsequent commits (with some of them just removing the offending logic, and others using clear_page_dirty_for_io())." The approach seems quite broken to me, the users should have been fixed _before_ removing the function, so as to avoid compilation failures. These are a pain for testers, and break git bisect too. Grmbl. Now that it's done... Steve, can you please take a look and provide a patch so that cifs builds again? Thanks, -- Jean Delvare ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-22 21:30 ` Jean Delvare @ 2006-12-23 0:25 ` Linus Torvalds 2006-12-23 18:30 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Linus Torvalds @ 2006-12-23 0:25 UTC (permalink / raw) To: Jean Delvare; +Cc: Thomas Meyer, Steve French, linux-kernel On Fri, 22 Dec 2006, Jean Delvare wrote: > > The approach seems quite broken to me, the users should have been fixed > _before_ removing the function, so as to avoid compilation failures. > These are a pain for testers, and break git bisect too. Grmbl. This needed to be fixed, and quite frankly, things don't get fixed nearly as quickly if you don't just break them first. And there really were just two filesystems that got broken, cifs being one of them. I just can't test it. > Now that it's done... Steve, can you please take a look and provide a > patch so that cifs builds again? CIFS _should_ be using "clear_page_dirty_for_io()" in that place, and that will fix the build. However, the reason I didn't just do that myself is that I can't test the end result, and for the life of me, I can't see where CIFS does the "end_page_writeback()" that it needs to do at IO completion time. And the thing that confuses me about that, is that if CIFS doesn't do "end_page_writeback()", then it was already broken before - because when the VM calls "->writepage()" the clear_page_dirty_for_io() will have been done by the VM, and it needs that "end_page_writeback()" so that the system can know when the IO is done. I _suspect_ that those "unlock_page()" calls should be accompanied by a "end_page_writeback()" call, and that the proper patch MAY look something like the appended, but I worry about having missed something really subtle. Maybe there's a end_page_writeback() somewhere else. And if there isn't, I wonder if shared mappings have _ever_ worked on CIFS? And if so, how? That writeback bit thing isn't new per se. So this may or may not fix it. If you can test it (_including_ with some dirty shared mmap-on-mmap action, please - just call me kinky), I'll commit it. But I need somebody who actually uses this to test it. Linus --- diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 0f05cab..4f0472d 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1245,7 +1245,7 @@ retry: wait_on_page_writeback(page); if (PageWriteback(page) || - !test_clear_page_dirty(page)) { + !clear_page_dirty_for_io(page)) { unlock_page(page); break; } @@ -1253,6 +1253,7 @@ retry: if (page_offset(page) >= mapping->host->i_size) { done = 1; unlock_page(page); + end_page_writeback(page); break; } @@ -1316,6 +1317,7 @@ retry: SetPageError(page); kunmap(page); unlock_page(page); + end_page_writeback(page); page_cache_release(page); } if ((wbc->nr_to_write -= n_iov) <= 0) @@ -1356,7 +1358,8 @@ static int cifs_writepage(struct page* page, struct writeback_control *wbc) rc = cifs_partialpagewrite(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); /* BB add check for error and Clearuptodate? */ unlock_page(page); - page_cache_release(page); + end_page_writeback(page); + page_cache_release(page); FreeXid(xid); return rc; } ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-23 0:25 ` Linus Torvalds @ 2006-12-23 18:30 ` Linus Torvalds 2006-12-23 19:44 ` Randy Dunlap 0 siblings, 1 reply; 7+ messages in thread From: Linus Torvalds @ 2006-12-23 18:30 UTC (permalink / raw) To: Jean Delvare, Ian McDonald Cc: Thomas Meyer, linux-cifs-client, Steve French, Linux Kernel Mailing List, Andrew Morton [ Andrew - I'm cc'ing you, because you caused the requirement that people use "set_page_writeback()" in their writepage() routine that CIFS seems to have been ignoring all these years. That was introduced more than two years ago, back in April 11, 2004: [PATCH] fdatasync integrity fix fdatasync can fail to wait on some pages due to a race. ... and as far as I can see, ever since then, any filesystem that didn't do a "set_page_writeback()" to sync up the TAG_DIRTY bit would have this CPU usage problem. Please double-check whether I'm right or barking up the wrong tree. Afaik, the lack of doing the page writeback bit handling properly would seem to not cause any actual visible _semantic_ problems, it would just cause fdatasync to not necessarily be entirely reliable (which I guess is semantic, but very hard to see) and just wasted CPU cycles when we look up pages that are marked dirty in the radix tree, but aren't actually really dirty. Correct? Who else is implicated in all of this? ] On Fri, 22 Dec 2006, Linus Torvalds wrote: > > CIFS _should_ be using "clear_page_dirty_for_io()" in that place, and that > will fix the build. However, the reason I didn't just do that myself is > that I can't test the end result, and for the life of me, I can't see > where CIFS does the "end_page_writeback()" that it needs to do at IO > completion time. Ok, I spent some more time looking at this. The reason cifs didn't do an "end_page_writeback()" was that it didn't even do the "set_page_writeback()" to mark the page under writeback in the first place. Now, you might think that since it didn't do a set_page_writeback(), it doesn't need to do the matching end_page_writeback() at all, and instead just continue to use the old (_really_ old) way of just unlocking the page when it is done. However, you'd be wrong. The thing is, a "writepage()" function will be called with the dirty bit cleared in the "struct page *", but the mapping radix trees will still have the dirty bit set, exactly because the VM _requires_ the filesystem to tell it what the h*ll it is doing with the page. So a low-level filesystem must always do one of two things in it's "writepage()" function. Either: - "set_page_writeback()" (and then an "end_page_writeback()" when finished, of course) OR - "redirty_page_for_writepage()" to tell the VM to move the page to the back of the LRU queues because it can't be cleaned (eg, some temporary problem with write ordering or similar, or something fundamental like "I'm ramfs, and I don't _have_ any backing store"). and if the low-level filesystem doesn't do either of those, then the status bits in the radix tree that contains the mapping information will never be updated, so the page that got cleaned will continue to be marked "dirty" in the radix tree (which admittedly will generally be invisible, except for "sync()" and friends spending inordinate amounts of time looking at pages that aren't even dirty any more - since they look things up by the radix tree tags). So I think the old code happened to work, but it was definitely incorrect, and would leave the dirty tags in the radix tree very confused indeed (it so happened that "cifs_writepages()" - with an "s" at the end - because it used "test_clear_page_dirty()" - would also clear the dirty tag, but any page that went through the generic VM routines and the single-page "cifs_writepage()" - without an "s" at the end - would then be forever marked dirty in the radix tree even though it was clean. Somebody should check me, though. This fairly mindless patch adds the proper "set_page_writeback()" calls (and the "clear_page_writeback()" ones I had already added before I looked more closely at this). I added a comment in "cifs_writepage()" (the single-page case) for why this all is the case, Linus PS. To clarify: the old "test_clear_page_dirty()" would actually clear the dirty bit in the radix tree too, so in that sense it was the RIGHT thing to do for CIFS, since CIFS was mostly unaware of the need to clear the radix tree dirty bit (even if cifs_writepages() actually used that bit to look up pages). HOWEVER, since CIFS is called from the generic routines (which _are_ radix-tree-aware and need the bit to be cleared explicitly), even the old code was actually totally broken. It would clear - largely by mistake - the radix tree dirty bit only for one case, not for _all_ the cases. A filesystem really does need to know about these things now, although a lot of filesystems can ignore them, since if they use all the generic routines, they generic routines will handle it all for them. --- diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 0f05cab..8a49b2e 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1245,14 +1245,21 @@ retry: wait_on_page_writeback(page); if (PageWriteback(page) || - !test_clear_page_dirty(page)) { + !clear_page_dirty_for_io(page)) { unlock_page(page); break; } + /* + * This actually clears the dirty bit in the radix tree. + * See cifs_writepage() for more commentary. + */ + set_page_writeback(page); + if (page_offset(page) >= mapping->host->i_size) { done = 1; unlock_page(page); + end_page_writeback(page); break; } @@ -1316,6 +1323,7 @@ retry: SetPageError(page); kunmap(page); unlock_page(page); + end_page_writeback(page); page_cache_release(page); } if ((wbc->nr_to_write -= n_iov) <= 0) @@ -1352,11 +1360,23 @@ static int cifs_writepage(struct page* page, struct writeback_control *wbc) if (!PageUptodate(page)) { cFYI(1, ("ppw - page not up to date")); } - + + /* + * Set the "writeback" flag, and clear "dirty" in the radix tree. + * + * A writepage() implementation always needs to do either this, + * or re-dirty the page with "redirty_page_for_writepage()" in + * the case of a failure. + * + * Just unlocking the page will cause the radix tree tag-bits + * to fail to update with the state of the page correctly. + */ + set_page_writeback(page); rc = cifs_partialpagewrite(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); /* BB add check for error and Clearuptodate? */ unlock_page(page); - page_cache_release(page); + end_page_writeback(page); + page_cache_release(page); FreeXid(xid); return rc; } ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-23 18:30 ` Linus Torvalds @ 2006-12-23 19:44 ` Randy Dunlap 2006-12-23 20:06 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Randy Dunlap @ 2006-12-23 19:44 UTC (permalink / raw) To: Linus Torvalds Cc: Jean Delvare, Ian McDonald, Thomas Meyer, linux-cifs-client, Steve French, Linux Kernel Mailing List, Andrew Morton On Sat, 23 Dec 2006 10:30:40 -0800 (PST) Linus Torvalds wrote: > > [ Andrew - I'm cc'ing you, because you caused the requirement that people > use "set_page_writeback()" in their writepage() routine that CIFS seems > to have been ignoring all these years. That was introduced more than > two years ago, back in April 11, 2004: > > [PATCH] fdatasync integrity fix > > fdatasync can fail to wait on some pages due to a race. > ... > > and as far as I can see, ever since then, any filesystem that didn't do > a "set_page_writeback()" to sync up the TAG_DIRTY bit would have this > CPU usage problem. Please double-check whether I'm right or barking up > the wrong tree. > > Afaik, the lack of doing the page writeback bit handling properly would > seem to not cause any actual visible _semantic_ problems, it would just > cause fdatasync to not necessarily be entirely reliable (which I guess > is semantic, but very hard to see) and just wasted CPU cycles when we > look up pages that are marked dirty in the radix tree, but aren't > actually really dirty. > > Correct? Who else is implicated in all of this? ] > > On Fri, 22 Dec 2006, Linus Torvalds wrote: > > > > CIFS _should_ be using "clear_page_dirty_for_io()" in that place, and that > > will fix the build. However, the reason I didn't just do that myself is > > that I can't test the end result, and for the life of me, I can't see > > where CIFS does the "end_page_writeback()" that it needs to do at IO > > completion time. > > Ok, I spent some more time looking at this. > > The reason cifs didn't do an "end_page_writeback()" was that it didn't > even do the "set_page_writeback()" to mark the page under writeback in the > first place. > > Now, you might think that since it didn't do a set_page_writeback(), it > doesn't need to do the matching end_page_writeback() at all, and instead > just continue to use the old (_really_ old) way of just unlocking the page > when it is done. > > However, you'd be wrong. The thing is, a "writepage()" function will be > called with the dirty bit cleared in the "struct page *", but the mapping > radix trees will still have the dirty bit set, exactly because the VM > _requires_ the filesystem to tell it what the h*ll it is doing with the > page. So a low-level filesystem must always do one of two things in it's > "writepage()" function. Either: > > - "set_page_writeback()" (and then an "end_page_writeback()" when > finished, of course) > > OR > > - "redirty_page_for_writepage()" to tell the VM to move the page to the > back of the LRU queues because it can't be cleaned (eg, some temporary > problem with write ordering or similar, or something fundamental like > "I'm ramfs, and I don't _have_ any backing store"). > > and if the low-level filesystem doesn't do either of those, then the > status bits in the radix tree that contains the mapping information will > never be updated, so the page that got cleaned will continue to be marked > "dirty" in the radix tree (which admittedly will generally be invisible, > except for "sync()" and friends spending inordinate amounts of time > looking at pages that aren't even dirty any more - since they look things > up by the radix tree tags). > > So I think the old code happened to work, but it was definitely incorrect, > and would leave the dirty tags in the radix tree very confused indeed (it > so happened that "cifs_writepages()" - with an "s" at the end - because it > used "test_clear_page_dirty()" - would also clear the dirty tag, but any > page that went through the generic VM routines and the single-page > "cifs_writepage()" - without an "s" at the end - would then be forever > marked dirty in the radix tree even though it was clean. > > Somebody should check me, though. > > This fairly mindless patch adds the proper "set_page_writeback()" calls > (and the "clear_page_writeback()" ones I had already added before I looked > more closely at this). > > I added a comment in "cifs_writepage()" (the single-page case) for why > this all is the case, BTW, reiserfs has similar build problems: it uses clear_page_dirty() so it won't build. fs/built-in.o: In function `reiserfs_cut_from_item': (.text.reiserfs_cut_from_item+0x868): undefined reference to `clear_page_dirty' --- ~Randy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-23 19:44 ` Randy Dunlap @ 2006-12-23 20:06 ` Linus Torvalds 2006-12-24 5:35 ` Randy Dunlap 0 siblings, 1 reply; 7+ messages in thread From: Linus Torvalds @ 2006-12-23 20:06 UTC (permalink / raw) To: Randy Dunlap Cc: Jean Delvare, Ian McDonald, Thomas Meyer, linux-cifs-client, Steve French, Linux Kernel Mailing List, Andrew Morton On Sat, 23 Dec 2006, Randy Dunlap wrote: > > BTW, reiserfs has similar build problems: it uses clear_page_dirty() > so it won't build. Not any more. I fixed that one (very different issue, btw: it's not actually doign writeout, it actually wanted to cancel IO on truncated buffers. However, it's certainly possible that my fix hasn't mirrored out yet, I pushed it just a couple of hours ago. So if you want to test it, here are the two commits in question.. (The "cancel_dirty_page()" cleanup is needed not just to do reiserfs as a module, it's also to make it more robust against reiserfs possibly feeding that function with strange pages, and to match the other related functions in the accounting functions). Len Brown tested the reiserfs changes, and claims that it was all good, but if somebody wants to run fsx-linux or some other filesystem stress testing tool that actually tests shared mmap (and truncate), that would be really appreciated. Linus -- commit 8368e328dfe1c534957051333a87b3210a12743b Author: Linus Torvalds <torvalds@woody.osdl.org> Date: Sat Dec 23 09:25:04 2006 -0800 Clean up and export cancel_dirty_page() to modules Make cancel_dirty_page() act more like all the other dirty and writeback accounting functions: test for "mapping" being NULL, and do the NR_FILE_DIRY accounting purely based on mapping_cap_account_dirty()). Also, add it to the exports, so that modular filesystems can use it. Acked-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> --- mm/truncate.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-) diff --git a/mm/truncate.c b/mm/truncate.c index 4a38dd1..ecdfdcc 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -60,12 +60,16 @@ void cancel_dirty_page(struct page *page, unsigned int account_size) WARN_ON(++warncount < 5); } - if (TestClearPageDirty(page) && account_size && - mapping_cap_account_dirty(page->mapping)) { - dec_zone_page_state(page, NR_FILE_DIRTY); - task_io_account_cancelled_write(account_size); + if (TestClearPageDirty(page)) { + struct address_space *mapping = page->mapping; + if (mapping && mapping_cap_account_dirty(mapping)) { + dec_zone_page_state(page, NR_FILE_DIRTY); + if (account_size) + task_io_account_cancelled_write(account_size); + } } } +EXPORT_SYMBOL(cancel_dirty_page); /* * If truncate cannot remove the fs-private metadata from the page, the page commit ffaa82008f1aad52a6d3979f49d2a76c2928b60f Author: Linus Torvalds <torvalds@woody.osdl.org> Date: Sat Dec 23 09:32:45 2006 -0800 Fix reiserfs after "test_clear_page_dirty()" removal Thanks to Len Brown for testing this fix, since while they have in the past, none of my machines run reiserfs at the moment. Cc: Vladimir V. Saveliev <vs@namesys.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> --- fs/reiserfs/stree.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c index 47e7027..afb21ea 100644 --- a/fs/reiserfs/stree.c +++ b/fs/reiserfs/stree.c @@ -1459,7 +1459,7 @@ static void unmap_buffers(struct page *page, loff_t pos) bh = next; } while (bh != head); if (PAGE_SIZE == bh->b_size) { - clear_page_dirty(page); + cancel_dirty_page(page, PAGE_CACHE_SIZE); } } } ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! 2006-12-23 20:06 ` Linus Torvalds @ 2006-12-24 5:35 ` Randy Dunlap 0 siblings, 0 replies; 7+ messages in thread From: Randy Dunlap @ 2006-12-24 5:35 UTC (permalink / raw) To: Linus Torvalds Cc: Jean Delvare, Ian McDonald, Thomas Meyer, linux-cifs-client, Steve French, Linux Kernel Mailing List, Andrew Morton On Sat, 23 Dec 2006 12:06:43 -0800 (PST) Linus Torvalds wrote: > > > On Sat, 23 Dec 2006, Randy Dunlap wrote: > > > > BTW, reiserfs has similar build problems: it uses clear_page_dirty() > > so it won't build. > > Not any more. I fixed that one (very different issue, btw: it's not > actually doign writeout, it actually wanted to cancel IO on truncated > buffers. > > However, it's certainly possible that my fix hasn't mirrored out yet, I > pushed it just a couple of hours ago. So if you want to test it, here are > the two commits in question.. > > (The "cancel_dirty_page()" cleanup is needed not just to do reiserfs as a > module, it's also to make it more robust against reiserfs possibly feeding > that function with strange pages, and to match the other related functions > in the accounting functions). > > Len Brown tested the reiserfs changes, and claims that it was all good, > but if somebody wants to run fsx-linux or some other filesystem stress > testing tool that actually tests shared mmap (and truncate), that would be > really appreciated. I ran fsx-linux on it for one hour... with no problems reported. --- ~Randy ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2006-12-24 5:35 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-12-22 14:28 WARNING: "test_clear_page_dirty" [fs/cifs/cifs.ko] undefined! Thomas Meyer 2006-12-22 21:30 ` Jean Delvare 2006-12-23 0:25 ` Linus Torvalds 2006-12-23 18:30 ` Linus Torvalds 2006-12-23 19:44 ` Randy Dunlap 2006-12-23 20:06 ` Linus Torvalds 2006-12-24 5:35 ` Randy Dunlap
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox