* btrfs failing fsx-linux
@ 2009-08-18 17:26 Nick Piggin
From: Nick Piggin @ 2009-08-18 17:26 UTC
To: Chris Mason, linux-btrfs
Hi Chris,

Don't know if this is a known issue, but I have btrfs (after my previous
inode_tree fixup patch) failing fsx-linux in Linus's current git tree.

This makes it a bit hard for me to test my btrfs truncate conversion patch
unfortunately, though it does seem pretty stable so I will probably just
send it out anyway.

Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
on 1GB rd, run 4 instances of fsx-linux on different files in the root
directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
done` at the same time.

Thanks,
Nick
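
The reproduction described in the message above could be scripted roughly as
follows. This is a sketch, not the script that was actually run: the brd
module parameters, the /dev/ram0 device, the mount point, and the test file
names are all assumptions, and fsx-linux is started with no options because
the mail does not list any. By default the script only prints the commands it
would run.

```shell
#!/bin/sh
# Sketch of the reproduction: btrfs on a 1GB ramdisk, four fsx-linux
# instances on different files in the root directory, and a concurrent
# sync + drop_caches loop.
#
# Assumptions (not from the original mail): brd is a loadable module,
# the ramdisk is /dev/ram0, the mount point is /mnt/btrfs-test, the
# files are named fsx0..fsx3, and fsx-linux is on PATH.
# Set DRY_RUN=0 and run as root to actually execute the commands.

DEV=${DEV:-/dev/ram0}
MNT=${MNT:-/mnt/btrfs-test}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    # rd_size is given in KiB, so 1048576 KiB = 1GB (assumed setup step).
    run modprobe brd rd_nr=1 rd_size=1048576
    run mkfs.btrfs "$DEV"
    run mkdir -p "$MNT"
    run mount -t btrfs "$DEV" "$MNT"

    # Four fsx-linux instances, each on its own file, in the background.
    for i in 0 1 2 3; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "+ fsx-linux $MNT/fsx$i &"
        else
            fsx-linux "$MNT/fsx$i" &
        fi
    done

    # The loop from the mail: force writeback and drop the page cache
    # repeatedly while fsx-linux is running. It never exits on its own.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ while true; do sync; echo 3 > /proc/sys/vm/drop_caches; done"
    else
        while true; do sync; echo 3 > /proc/sys/vm/drop_caches; done
    fi
}

repro
```

With DRY_RUN=0 the final loop runs until interrupted, so a failure would show
up as one of the background fsx-linux processes exiting with an error.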
* Re: btrfs failing fsx-linux
@ 2009-08-18 17:59 Josef Bacik
From: Josef Bacik @ 2009-08-18 17:59 UTC
To: Nick Piggin; +Cc: Chris Mason, linux-btrfs

On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> Hi Chris,
>
> Don't know if this is a known issue, but I have btrfs (after my previous
> inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
>
> This makes it a bit hard for me to test my btrfs truncate conversion patch
> unfortunately, though it does seem pretty stable so I will probably just
> send it out anyway.
>
> Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> on 1GB rd, run 4 instances of fsx-linux on different files in the root
> directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> done` at the same time.
>

Nick,

I'll take a look at this later today and see what I can come up with. Thanks
for reporting it.

Josef
* Re: btrfs failing fsx-linux
@ 2009-08-19 8:50 Nick Piggin
From: Nick Piggin @ 2009-08-19 8:50 UTC
To: Josef Bacik; +Cc: Chris Mason, linux-btrfs

On Tue, Aug 18, 2009 at 01:59:50PM -0400, Josef Bacik wrote:
> On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > Hi Chris,
> >
> > Don't know if this is a known issue, but I have btrfs (after my previous
> > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> >
> > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > unfortunately, though it does seem pretty stable so I will probably just
> > send it out anyway.
> >
> > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > done` at the same time.
> >
>
> Nick,
>
> I'll take a look at this later today and see what I can come up with. Thanks
> for reporting it.
>
> Josef

Thanks. After spending a good chunk of my life tracking down fsx-linux
problems in fsblock and mm code, I can't say I envy you :)

It's a great test to run though (especially syncing and dropping caches
because that hits writeback and bypasses caches far, far more than fsx
ever does on its own).

Let me know if you need me to test any patches.
* Re: btrfs failing fsx-linux
@ 2009-09-02 17:14 Nick Piggin
From: Nick Piggin @ 2009-09-02 17:14 UTC
To: Josef Bacik; +Cc: Chris Mason, linux-btrfs

On Tue, Aug 18, 2009 at 01:59:50PM -0400, Josef Bacik wrote:
> On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > Hi Chris,
> >
> > Don't know if this is a known issue, but I have btrfs (after my previous
> > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> >
> > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > unfortunately, though it does seem pretty stable so I will probably just
> > send it out anyway.
> >
> > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > done` at the same time.
> >
>
> Nick,
>
> I'll take a look at this later today and see what I can come up with. Thanks
> for reporting it.

Any progress with this? Were you able to reproduce?
* Re: btrfs failing fsx-linux
@ 2009-09-02 17:29 Josef Bacik
From: Josef Bacik @ 2009-09-02 17:29 UTC
To: Nick Piggin; +Cc: Josef Bacik, Chris Mason, linux-btrfs

On Wed, Sep 02, 2009 at 07:14:50PM +0200, Nick Piggin wrote:
> On Tue, Aug 18, 2009 at 01:59:50PM -0400, Josef Bacik wrote:
> > On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > > Hi Chris,
> > >
> > > Don't know if this is a known issue, but I have btrfs (after my previous
> > > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> > >
> > > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > > unfortunately, though it does seem pretty stable so I will probably just
> > > send it out anyway.
> > >
> > > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > > done` at the same time.
> > >
> >
> > Nick,
> >
> > I'll take a look at this later today and see what I can come up with. Thanks
> > for reporting it.
>
> Any progress with this? Were you able to reproduce?
>

Err sorry no, I got distracted with my -ENOSPC stuff, it's taking longer than
I expected it to. I hope to wrap it up today and take a look at this next. I
was able to reproduce the problem on the experimental branch of Chris's tree,
so it's not something that has been fixed yet. Thanks,

Josef
* Re: btrfs failing fsx-linux
@ 2009-09-02 19:07 Chris Mason
From: Chris Mason @ 2009-09-02 19:07 UTC
To: Nick Piggin; +Cc: Josef Bacik, linux-btrfs

On Wed, Sep 02, 2009 at 07:14:50PM +0200, Nick Piggin wrote:
> On Tue, Aug 18, 2009 at 01:59:50PM -0400, Josef Bacik wrote:
> > On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > > Hi Chris,
> > >
> > > Don't know if this is a known issue, but I have btrfs (after my previous
> > > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> > >
> > > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > > unfortunately, though it does seem pretty stable so I will probably just
> > > send it out anyway.
> > >
> > > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > > done` at the same time.
> > >
> >
> > Nick,
> >
> > I'll take a look at this later today and see what I can come up with. Thanks
> > for reporting it.
>
> Any progress with this? Were you able to reproduce?
>

I thought this was where the rb tree patch came from, sorry. I'll try
to reproduce.

-chris
* Re: btrfs failing fsx-linux
@ 2009-09-03 7:10 Nick Piggin
From: Nick Piggin @ 2009-09-03 7:10 UTC
To: Chris Mason, Josef Bacik, linux-btrfs

On Wed, Sep 02, 2009 at 03:07:21PM -0400, Chris Mason wrote:
> On Wed, Sep 02, 2009 at 07:14:50PM +0200, Nick Piggin wrote:
> > On Tue, Aug 18, 2009 at 01:59:50PM -0400, Josef Bacik wrote:
> > > I'll take a look at this later today and see what I can come up with. Thanks
> > > for reporting it.
> >
> > Any progress with this? Were you able to reproduce?
> >
>
> I thought this was where the rb tree patch came from, sorry. I'll try
> to reproduce.

Oh, I must not have been clear: I first hit the rbtree bug, then when
that was fixed, the same workload was eventually showing up fsx-linux
failures (but no more kernel crashes).
* Re: btrfs failing fsx-linux
@ 2009-09-09 13:13 Chris Mason
From: Chris Mason @ 2009-09-09 13:13 UTC
To: Nick Piggin; +Cc: linux-btrfs

On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> Hi Chris,
>
> Don't know if this is a known issue, but I have btrfs (after my previous
> inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
>
> This makes it a bit hard for me to test my btrfs truncate conversion patch
> unfortunately, though it does seem pretty stable so I will probably just
> send it out anyway.
>
> Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> on 1GB rd, run 4 instances of fsx-linux on different files in the root
> directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> done` at the same time.

Just an update, I think I've tracked this down. Our fixup code for when
set_page_dirty was called without page_mkwrite wasn't catching every
case. I'm hitting other problems because I'm testing on my new
performance code, but hopefully this will all be nailed down soon.

-chris
* Re: btrfs failing fsx-linux
@ 2009-09-09 13:34 Nick Piggin
From: Nick Piggin @ 2009-09-09 13:34 UTC
To: Chris Mason, linux-btrfs

On Wed, Sep 09, 2009 at 09:13:49AM -0400, Chris Mason wrote:
> On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > Hi Chris,
> >
> > Don't know if this is a known issue, but I have btrfs (after my previous
> > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> >
> > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > unfortunately, though it does seem pretty stable so I will probably just
> > send it out anyway.
> >
> > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > done` at the same time.
>
> Just an update, I think I've tracked this down. Our fixup code for when
> set_page_dirty was called without page_mkwrite wasn't catching every
> case. I'm hitting other problems because I'm testing on my new
> performance code, but hopefully this will all be nailed down soon.

Which cases do set_page_dirty happen without page_mkwrite? For fsx-linux
I hope there shouldn't be any... but one issue with btrfs's page_mkwrite
is that it still unlocks the page before page_mkwrite returns so if it
is anything like other filesystems, the page might get written out
between the page_mkwrite and the page fault's subsequent set_page_dirty.

So if that race is hitting, it might appear like set_page_dirty is
being called without a page_mkwrite.
* Re: btrfs failing fsx-linux
@ 2009-09-09 13:37 Chris Mason
From: Chris Mason @ 2009-09-09 13:37 UTC
To: Nick Piggin; +Cc: linux-btrfs

On Wed, Sep 09, 2009 at 03:34:35PM +0200, Nick Piggin wrote:
> On Wed, Sep 09, 2009 at 09:13:49AM -0400, Chris Mason wrote:
> > On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> > > Hi Chris,
> > >
> > > Don't know if this is a known issue, but I have btrfs (after my previous
> > > inode_tree fixup patch) failing fsx-linux in Linus's current git tree.
> > >
> > > This makes it a bit hard for me to test my btrfs truncate conversion patch
> > > unfortunately, though it does seem pretty stable so I will probably just
> > > send it out anyway.
> > >
> > > Anyway, just a heads-up. Oh, the way I reproduce is to create btrfs
> > > on 1GB rd, run 4 instances of fsx-linux on different files in the root
> > > directory, and run `while true ; do sync ; echo 3 > /proc/sys/vm/drop_caches;
> > > done` at the same time.
> >
> > Just an update, I think I've tracked this down. Our fixup code for when
> > set_page_dirty was called without page_mkwrite wasn't catching every
> > case. I'm hitting other problems because I'm testing on my new
> > performance code, but hopefully this will all be nailed down soon.
>
> Which cases do set_page_dirty happen without page_mkwrite? For fsx-linux
> I hope there shouldn't be any...

I've been assuming it's the zap_pte_range part, but now I have enough
code around it to make stack traces.

> but one issue with btrfs's page_mkwrite
> is that it still unlocks the page before page_mkwrite returns so if it
> is anything like other filesystems, the page might get written out
> between the page_mkwrite and the page fault's subsequent set_page_dirty.
>
> So if that race is hitting, it might appear like set_page_dirty is
> being called without a page_mkwrite.

Good point, I'll make sure.

-chris
* Re: btrfs failing fsx-linux
@ 2009-09-09 13:43 Nick Piggin
From: Nick Piggin @ 2009-09-09 13:43 UTC
To: Chris Mason, linux-btrfs

On Wed, Sep 09, 2009 at 09:37:29AM -0400, Chris Mason wrote:
> On Wed, Sep 09, 2009 at 03:34:35PM +0200, Nick Piggin wrote:
> > Which cases do set_page_dirty happen without page_mkwrite? For fsx-linux
> > I hope there shouldn't be any...
>
> I've been assuming it's the zap_pte_range part, but now I have enough
> code around it to make stack traces.

OK, because AFAIK we're _supposed_ to be able to close all that
up since the last round of page_mkwrite fixes here, so pte state
will not be out of sync with page state.

Yes set_page_dirty still gets called there, for fses without
page_mkwrite or which don't try hard to keep in sync. But if you
return with the page locked from page_mkwrite, this set_page_dirty
call should always find the page dirty (I hope).

> > but one issue with btrfs's page_mkwrite
> > is that it still unlocks the page before page_mkwrite returns so if it
> > is anything like other filesystems, the page might get written out
> > between the page_mkwrite and the page fault's subsequent set_page_dirty.
> >
> > So if that race is hitting, it might appear like set_page_dirty is
> > being called without a page_mkwrite.
>
> Good point, I'll make sure.

Though if it has exposed a bug, that's a good thing too; you still
want to handle that case (for now) due to the get_user_pages problem.
* Re: btrfs failing fsx-linux
@ 2009-09-09 13:47 Chris Mason
From: Chris Mason @ 2009-09-09 13:47 UTC
To: Nick Piggin; +Cc: linux-btrfs

On Wed, Sep 09, 2009 at 03:43:51PM +0200, Nick Piggin wrote:
> On Wed, Sep 09, 2009 at 09:37:29AM -0400, Chris Mason wrote:
> > On Wed, Sep 09, 2009 at 03:34:35PM +0200, Nick Piggin wrote:
> > > Which cases do set_page_dirty happen without page_mkwrite? For fsx-linux
> > > I hope there shouldn't be any...
> >
> > I've been assuming it's the zap_pte_range part, but now I have enough
> > code around it to make stack traces.
>
> OK, because AFAIK we're _supposed_ to be able to close all that
> up since the last round of page_mkwrite fixes here, so pte state
> will not be out of sync with page state.
>
> Yes set_page_dirty still gets called there, for fses without
> page_mkwrite or which don't try hard to keep in sync. But if you
> return with the page locked from page_mkwrite, this set_page_dirty
> call should always find the page dirty (I hope).
>
> > > but one issue with btrfs's page_mkwrite
> > > is that it still unlocks the page before page_mkwrite returns so if it
> > > is anything like other filesystems, the page might get written out
> > > between the page_mkwrite and the page fault's subsequent set_page_dirty.
> > >
> > > So if that race is hitting, it might appear like set_page_dirty is
> > > being called without a page_mkwrite.
> >
> > Good point, I'll make sure.
>
> Though if it has exposed a bug, that's a good thing too; you still
> want to handle that case (for now) due to the get_user_pages problem.

Yeah, I'll hammer on it until the handler works and then I'll fix up
page_mkwrite to leave things locked.

-chris
* Re: btrfs failing fsx-linux
@ 2009-09-11 19:36 Chris Mason
From: Chris Mason @ 2009-09-11 19:36 UTC
To: Nick Piggin; +Cc: linux-btrfs

On Tue, Aug 18, 2009 at 07:26:41PM +0200, Nick Piggin wrote:
> Hi Chris,
>
> Don't know if this is a known issue, but I have btrfs (after my previous
> inode_tree fixup patch) failing fsx-linux in Linus's current git tree.

Thanks for your help on this Nick, I've got my current fixes for fsx in
the master branch of the btrfs-unstable repo:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git

If you're curious, the 3 most relevant commits are here:

http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=93c82d575055f1bd0277acae6f966bebafd80dd5
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=50a9b214bc6c052caa05a210ebfc1bdf0d7085b2
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=a1ed835e1ab5795f91b198d08c43e2f56848dcf3

But these fixes are built on top of my performance work (which was sure
to break fsx on its own).

-chris