* [GIT PULL] reiserfs fixes
@ 2010-01-02 1:27 Frederic Weisbecker
2010-01-02 13:41 ` Andi Kleen
2010-01-02 19:19 ` Linus Torvalds
0 siblings, 2 replies; 26+ messages in thread
From: Frederic Weisbecker @ 2010-01-02 1:27 UTC (permalink / raw)
To: Linus Torvalds
Cc: LKML, Frederic Weisbecker, Christian Kujau, Alexander Beregalov,
Chris Mason, Ingo Molnar
Linus,
Please pull the reiserfs/kill-bkl branch that can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
reiserfs/kill-bkl
These changes fix a lot of lock inversions, some of them were
triggering soft lockups very easily in xattrs operations.
As the reiserfs lock is a giant lock (in reiserfs scope),
these dependency inversions couldn't get smart fixes without a deep
locking rewrite.
That's why you'll mostly find dependency inversion fixes based on
such pattern:
reiserfs_write_unlock()
mutex_lock(random_lock)
reiserfs_write_lock()
This is not beautiful but at least that's better than the bkl.
Oh and I expect other lock inversions will get reported in
the future due to rare and then yet untested paths.
Thanks,
Frederic
---
Frederic Weisbecker (13):
reiserfs: Fix possible recursive lock
reiserfs: Fix reiserfs lock and journal lock inversion dependency
reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion
reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion
reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
reiserfs: Warn on lock relax if taken recursively
reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
reiserfs: Relax reiserfs lock while freeing the journal
reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
reiserfs: Fix journal mutex <-> inode mutex lock inversion
reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
reiserfs: Safely acquire i_mutex from xattr_rmdir
fs/reiserfs/bitmap.c | 3 +++
fs/reiserfs/inode.c | 5 +++--
fs/reiserfs/journal.c | 18 ++++++++++++++----
fs/reiserfs/lock.c | 9 +++++++++
fs/reiserfs/namei.c | 7 ++++---
fs/reiserfs/xattr.c | 26 ++++++++++++++++++++------
include/linux/reiserfs_fs.h | 26 ++++++++++++++++++++++++++
7 files changed, 79 insertions(+), 15 deletions(-)
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: [GIT PULL] reiserfs fixes 2010-01-02 1:27 [GIT PULL] reiserfs fixes Frederic Weisbecker @ 2010-01-02 13:41 ` Andi Kleen 2010-01-02 16:36 ` Frederic Weisbecker 2010-01-02 19:19 ` Linus Torvalds 1 sibling, 1 reply; 26+ messages in thread From: Andi Kleen @ 2010-01-02 13:41 UTC (permalink / raw) To: Frederic Weisbecker Cc: Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar Frederic Weisbecker <fweisbec@gmail.com> writes: > > That's why you'll mostly find dependency inversion fixes based on > such pattern: > > reiserfs_write_unlock() > mutex_lock(random_lock) > reiserfs_write_lock() These `workarounds' look rather ugly and are likely much slower than the BKL that was there before. Perhaps it's better to simply go back to the BKL until this can be all fixed properly (or a more faithful emulation for the BKL can be devised)? > > This is not beautiful but at least that's better than the bkl. > > Oh and I expect other lock inversions will get reported in > the future due to rare and then yet untested paths. ... and given that was the conversion really a good idea? -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [GIT PULL] reiserfs fixes 2010-01-02 13:41 ` Andi Kleen @ 2010-01-02 16:36 ` Frederic Weisbecker 2010-01-02 17:43 ` reiserfs broken in 2.6.32 was " Andi Kleen 0 siblings, 1 reply; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 16:36 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 02:41:51PM +0100, Andi Kleen wrote: > Frederic Weisbecker <fweisbec@gmail.com> writes: > > > > That's why you'll mostly find dependency inversion fixes based on > > such pattern: > > > > reiserfs_write_unlock() > > mutex_lock(random_lock) > > reiserfs_write_lock() > > These `workarounds' look rather ugly and are likely much slower > than the BKL that was there before. This is ugly, I can't argue against that. But while this is apparently uglier than the bkl, it is actually better than it. The bkl does its dirty workarounds internally. We don't see it as it's done on schedule time and so. So indeed, placing a simple lock_kernel() in each outer most caller in reiserfs looks much proper and pretty than all these workarounds that try to catch up with its locking-based scheme. Now try to look at the new reiserfs lock not from a visual point of view but from its impact. The bkl relax the lock, it is acquired recursively etc... This usually implies a lot of hard investigation. The new lock has identified all of these dirty implicit places, knows when it is required to relax, knows when it is acquired recursively, knows where are the lock inversions once converted into a normal lock. All of this hard work of investigation has been done already. If someone wants to, one day, refine the locking to make something smarter (which I doubt), starting from the current codebase is _way_ much easier than starting from the 2.6.32 bkl based scheme. Actually someone who is going to try that from the bkl point will undoubtly need an intermediate state like the current one. The ugliness is here. But finding it more ugly than the bkl is a _pure_ illusion. Concerning the slowness. The xattr operation that have been patched here don't appear to me beeing in a fast path. Moreover some lookup areas (apart from relax on random lock) have been relaxed from the reiserfs lock in this new set. > Perhaps it's better to simply > go back to the BKL until this can be all fixed properly > (or a more faithful emulation for the BKL can be devised)? There are 99% of chances that nobody will ever fix it. Especially if one needs to start from the bkl base. Few people are familiar with the reiserfs code. Among these people, I doubt someone is motivated to do it. And for those who aren't familiar with it, reiserfs code is so messy that I doubt many people will try something for more than few minutes. This is especially true now that it is an old and legacy filesystem. Nobody cares anymore. I only have reiserfs partitions in my laptop and my testbox, nothing else. And that because I'm now maintaining it de facto. Otherwise I wouldn't encumber with that and would immediately set up btrfs everywhere. Concerning a more faithful emulation of the bkl. That would require to divide the bkl in several sub-bkl. This is pointless and even worst than the bkl. - that would require a notifier in schedule(), one notifier per sub-bkl. That's horrible for performances. And for the scheduler. I will be the first to NAK. - that doesn't solve the problem as this sub-bkl won't ever be removed. We just isolate a giant lock in reiserfs. That doesn't change anything, nor make the things simpler especially since we already know that reiserfs use of the bkl is not related to other users of bkl. - the reiserfs lock is per superblock. It scales better. > > > > This is not beautiful but at least that's better than the bkl. > > > > Oh and I expect other lock inversions will get reported in > > the future due to rare and then yet untested paths. > > ... and given that was the conversion really a good idea? Despite the _apparent_ ugliness compared to the bkl, I'm still sure this was, and is still, a good idea. The fact we are going to experience other locking inversions is a necessary pain. It's impossible to bypass these states, you can't remove that easily such giant lock from a complex code base. Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 16:36 ` Frederic Weisbecker @ 2010-01-02 17:43 ` Andi Kleen 2010-01-02 19:02 ` Frederic Weisbecker ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Andi Kleen @ 2010-01-02 17:43 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar > I only have reiserfs partitions in my laptop and my testbox, > nothing else. And that because I'm now maintaining it de facto. AFAIK it's widely used in SUSE installations. It was the default for a long time. And right now as in 2.6.32 it's in a state of "may randomly explode/deadlock". And no clear path out of it. Not good. I am very concerned about destabilizing a widely used file system like this. This has the potential to really hurt users. > - that would require a notifier in schedule(), one notifier > per sub-bkl. That's horrible for performances. And for > the scheduler. I will be the first to NAK. I thought the original idea was to find everything that can sleep in reiserfs and simply wrap it with lock dropping? That should be roughly equivalent to the old BKL semantics. Where did it go wrong? -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 17:43 ` reiserfs broken in 2.6.32 was " Andi Kleen @ 2010-01-02 19:02 ` Frederic Weisbecker 2010-01-02 19:23 ` Andi Kleen 2010-01-02 20:11 ` Ingo Molnar 2010-01-02 22:18 ` Ingo Molnar 2 siblings, 1 reply; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 19:02 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 06:43:12PM +0100, Andi Kleen wrote: > > I only have reiserfs partitions in my laptop and my testbox, > > nothing else. And that because I'm now maintaining it de facto. > > AFAIK it's widely used in SUSE installations. It was the default > for a long time. > > And right now as in 2.6.32 it's in a state of > "may randomly explode/deadlock". And no clear path out of it. Not good. > > I am very concerned about destabilizing a widely used file system > like this. This has the potential to really hurt users. I understand your worries. And I've been very cautious with that, waiting for three cycles before requesting an upstream merge. I did it because the isolated tree model did not scale anymore. Now that it's upstream, I get more testing and I expect that, in the end of this cycle, I get most of these issues reported and fixed. Serious users who run serious datas won't ship 2.6.33, they will ship a further stable version 2.6.33.x (if they haven't converted their filesystems already). And at this time, things should be 99% fixed. > > - that would require a notifier in schedule(), one notifier > > per sub-bkl. That's horrible for performances. And for > > the scheduler. I will be the first to NAK. > > I thought the original idea was to find everything that > can sleep in reiserfs and simply wrap it with lock dropping? > > That should be roughly equivalent to the old BKL semantics. > > Where did it go wrong? That's the theory. Fitting into this strict scheme brings performance regressions. The bkl is a spinlock, it disables preemption, it is relaxed on sleep, and doesn't have locking dependencies. Moreover it's not a lock but a simulation of a NO_PREEMPT UP flow, with all the fixup guardians that come with (fixup if we schedule, as scheduling brings races). >From the conversion is borned a mutex. Even though we have adaptive spinning, we don't catch up spinlock performances as it's not a pure optimized looping fast path, and it may actually just sleep. The bkl is relaxed only when we sleep. Now simulating that with a mutex that gets explicitly relaxed is not the same thing as we need to relax the lock each time we _might_ sleep. It means we relax more and that brings performance regressions. That said it's sometimes a drawback for the bkl to be relaxed every time we schedule, because we need to fixup after that, sometimes we need to re-walk into the entire tree, etc... So sometimes we can do better. There are some places where we don't relax like did the bkl, so that we don't need to fixup, and we get a win of performances. You see? The bkl semantics must not be always strictly imitated on such conversion. It depends on what does the code. In reiserfs, sometimes it was desired that the bkl get relaxed, sometimes it wasn't. And all the reiserfs code deals with that. With a mutex we have the choice. So the conversion has been a balance between performance regressions brought by the mutex conversion, and the performance win because we have actually more control with a traditional lock. That said there are places where we really need to sleep, like when we grab another lock, so that we don't create inverted dependencies. That said, if the general opinion is in favour of unmerging the bkl removal changes in reiserfs. Then please do. Just to express my point of view, as my primary goal is not to fix reiserfs but the kernel: If you are afraid of such changes, your kernel will just become mildewed by the time. You need to drop such bad ill-legacies if you want it to evolve. Until every users of the big kernel lock will remain in the kernel, vanilla upstream will keep it as ball and chain, won't ever be able to perform any serious real time service, etc... So yes this is risky. But I think this is necessary. And as I explained above, things will be fine as serious datas are not manipulated with a random -rc2 (except my own datas...). ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 19:02 ` Frederic Weisbecker @ 2010-01-02 19:23 ` Andi Kleen 2010-01-02 20:11 ` Frederic Weisbecker 0 siblings, 1 reply; 26+ messages in thread From: Andi Kleen @ 2010-01-02 19:23 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 08:02:15PM +0100, Frederic Weisbecker wrote: > On Sat, Jan 02, 2010 at 06:43:12PM +0100, Andi Kleen wrote: > > > I only have reiserfs partitions in my laptop and my testbox, > > > nothing else. And that because I'm now maintaining it de facto. > > > > AFAIK it's widely used in SUSE installations. It was the default > > for a long time. > > > > And right now as in 2.6.32 it's in a state of > > "may randomly explode/deadlock". And no clear path out of it. Not good. > > > > I am very concerned about destabilizing a widely used file system > > like this. This has the potential to really hurt users. > > > I understand your worries. And I've been very cautious with that, > waiting for three cycles before requesting an upstream merge. I did > it because the isolated tree model did not scale anymore. > > Now that it's upstream, I get more testing and I expect that, in > the end of this cycle, I get most of these issues reported and > fixed. Will you? How many users systems could it break by then? > > Serious users who run serious datas won't ship 2.6.33, they will ship > a further stable version 2.6.33.x (if they haven't converted their > filesystems already). > And at this time, things should be 99% fixed. That seems very risky. For some rarely used obscure subsystems that might work but a widely used file system that keeps people's $HOME? I don't think seriously destabilizing that for a potentially longer time is a good idea. There's the potential to break a lot of porcelain. Probably you could do a ext3/ext4 like thing by starting with a "reiserfs3.5" copy and do the work there and then merge back once things work and have been reasonably verified by code review. > That's the theory. Fitting into this strict scheme brings performance > regressions. The bkl is a spinlock, it disables preemption, it is > relaxed on sleep, and doesn't have locking dependencies. Moreover > it's not a lock but a simulation of a NO_PREEMPT UP flow, with all > the fixup guardians that come with (fixup if we schedule, as > scheduling brings races). > > From the conversion is borned a mutex. Even though we have > adaptive spinning, we don't catch up spinlock performances > as it's not a pure optimized looping fast path, and it may > actually just sleep. Fix the adaptive spinlock then? > > The bkl is relaxed only when we sleep. Now simulating that with > a mutex that gets explicitly relaxed is not the same thing as > we need to relax the lock each time we _might_ sleep. It means > we relax more and that brings performance regressions. At least in the cases where the decision is in reiserfs code directly you could predict it by using need_resched(), couldn't you? That might not be 100% accurate, but good enough. > That said, if the general opinion is in favour of unmerging > the bkl removal changes in reiserfs. Then please do. For me it seems too aggressive at this point. If it was just a case of fixing a few known bugs, but if you're not even sure how many problems are left ... Perhaps do the reiserfs35 variant? > Just to express my point of view, as my primary goal is not > to fix reiserfs but the kernel: If you are afraid of such > changes, your kernel will just become mildewed by the time. Better some mildew than a seriously-broken-for-enough people's release (although I have my doubts that's the right metapher for the BKL anyways) Having stable releases is an important part for getting enough testers (we already have too little). And if we start breaking their $HOMEs they might become even less. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 19:23 ` Andi Kleen @ 2010-01-02 20:11 ` Frederic Weisbecker 2010-01-02 20:33 ` Andi Kleen 2010-01-02 21:01 ` tytso 0 siblings, 2 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 20:11 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 08:23:37PM +0100, Andi Kleen wrote: > On Sat, Jan 02, 2010 at 08:02:15PM +0100, Frederic Weisbecker wrote: > > On Sat, Jan 02, 2010 at 06:43:12PM +0100, Andi Kleen wrote: > > > > I only have reiserfs partitions in my laptop and my testbox, > > > > nothing else. And that because I'm now maintaining it de facto. > > > > > > AFAIK it's widely used in SUSE installations. It was the default > > > for a long time. > > > > > > And right now as in 2.6.32 it's in a state of > > > "may randomly explode/deadlock". And no clear path out of it. Not good. > > > > > > I am very concerned about destabilizing a widely used file system > > > like this. This has the potential to really hurt users. > > > > > > I understand your worries. And I've been very cautious with that, > > waiting for three cycles before requesting an upstream merge. I did > > it because the isolated tree model did not scale anymore. > > > > Now that it's upstream, I get more testing and I expect that, in > > the end of this cycle, I get most of these issues reported and > > fixed. > > Will you? > > How many users systems could it break by then? I've never lost any datas since I began this work. And I run it every day. If I had experienced lock inversions, and sometimes soft lockups, I did not experienced serious damages. It's a journalized filesystem that can fixup the things pretty well. Also we are talking about potential lock inversions, in potential rare path, that could potentially raise soft lockups. That makes a lot of potentials, for things that are going to be fixed and for which I've never seen serious damages. > > > > Serious users who run serious datas won't ship 2.6.33, they will ship > > a further stable version 2.6.33.x (if they haven't converted their > > filesystems already). > > And at this time, things should be 99% fixed. > > That seems very risky. For some rarely used obscure subsystems > that might work but a widely used file system that keeps people's $HOME? > I don't think seriously destabilizing that for a potentially longer > time is a good idea. There's the potential to break > a lot of porcelain. > > Probably you could do a ext3/ext4 like thing by starting > with a "reiserfs3.5" copy and do the work there and then > merge back once things work and have been reasonably verified > by code review. I fear nobody else than me will review it that deeply, which limits the scalability of this plan. We could make a new reiserfs version by duplicating the code base. But nobody will test it. That would require to patch mkreiserfs, waiting for distros to ship it, waiting for users to ship the distros. Assuming at this time there will be remaining users to set up new reiserfs partitions. We could also have a reiserfs-no-bkl config option that would pick the duplicated code base. Again I fear few people will test it. > > > That's the theory. Fitting into this strict scheme brings performance > > regressions. The bkl is a spinlock, it disables preemption, it is > > relaxed on sleep, and doesn't have locking dependencies. Moreover > > it's not a lock but a simulation of a NO_PREEMPT UP flow, with all > > the fixup guardians that come with (fixup if we schedule, as > > scheduling brings races). > > > > From the conversion is borned a mutex. Even though we have > > adaptive spinning, we don't catch up spinlock performances > > as it's not a pure optimized looping fast path, and it may > > actually just sleep. > > Fix the adaptive spinlock then? Believe me, I've reviewed the mutex code several dozens of time. I just fail to find weaknesses inside, especially in the adaptive spinning code. We just can not make it as fast as a spinlock fast path, as it needs to do regular checks to ensure it can continue to spin. > > > > The bkl is relaxed only when we sleep. Now simulating that with > > a mutex that gets explicitly relaxed is not the same thing as > > we need to relax the lock each time we _might_ sleep. It means > > we relax more and that brings performance regressions. > > At least in the cases where the decision is in reiserfs code > directly you could predict it by using need_resched(), couldn't you? > > > That might not be 100% accurate, but good enough. Sometimes I do. Sometimes it's just wasteful. We don't want to relax the lock just because of a kmalloc(__GFP_NOFS). Sometimes relaxing the lock even when we are going to schedule is not something we want for performances. > > > That said, if the general opinion is in favour of unmerging > > the bkl removal changes in reiserfs. Then please do. > > For me it seems too aggressive at this point. > > If it was just a case of fixing a few known bugs, but > if you're not even sure how many problems are left ... > > Perhaps do the reiserfs35 variant? As explained above, I think this just reschedule the problem for later. This model won't have any testers and won't evolve. > > > Just to express my point of view, as my primary goal is not > > to fix reiserfs but the kernel: If you are afraid of such > > changes, your kernel will just become mildewed by the time. > > Better some mildew than a seriously-broken-for-enough people's > release (although I have my doubts that's the right metapher > for the BKL anyways) > > Having stable releases is an important part for > getting enough testers (we already have too little). And > if we start breaking their $HOMEs they might become > even less. This is very unlikely to break their $HOME. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 20:11 ` Frederic Weisbecker @ 2010-01-02 20:33 ` Andi Kleen 2010-01-02 20:54 ` Frederic Weisbecker ` (2 more replies) 2010-01-02 21:01 ` tytso 1 sibling, 3 replies; 26+ messages in thread From: Andi Kleen @ 2010-01-02 20:33 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 09:11:39PM +0100, Frederic Weisbecker wrote: > I've never lost any datas since I began this work. And > I run it every day. If I had experienced lock inversions, > and sometimes soft lockups, I did not experienced serious > damages. It's a journalized filesystem that can fixup the things > pretty well. So are you confident that 2.6.33 will not have regular soft-lockups for reiserfs users? > > Also we are talking about potential lock inversions, in potential > rare path, that could potentially raise soft lockups. That makes > a lot of potentials, for things that are going to be fixed and > for which I've never seen serious damages. A soft lockup is a problem. Perhaps not totally serious, but a user who experiences them regularly would rightly consider such a release very broken. > We could make a new reiserfs version by duplicating the code > base. But nobody will test it. That would require to patch > mkreiserfs, waiting for distros to ship it, waiting for > users to ship the distros. Assuming at this time there > will be remaining users to set up new reiserfs partitions. I suspect your estimates on how widely reiserfs is used are quite off. However as usual a large part of the user base simply only uses what their distribution ships. > We could also have a reiserfs-no-bkl config option that > would pick the duplicated code base. Again I fear few people > will test it. That sounds reasonable, at least have a workaround if there are too many problems. > Sometimes I do. Sometimes it's just wasteful. We don't want to relax > the lock just because of a kmalloc(__GFP_NOFS). If that's the problem you can always split the allocation: first try it with __GFP_NOWAIT without lock dropped, then if that fails do again with full __GFP_NOFS and lock drop. However it's hard to believe that a few instructions more or less would make much difference; I would normally expect any larger changes coming from changes IO patterns or cache line bouncing. > > Better some mildew than a seriously-broken-for-enough people's > > release (although I have my doubts that's the right metapher > > for the BKL anyways) > > > > Having stable releases is an important part for > > getting enough testers (we already have too little). And > > if we start breaking their $HOMEs they might become > > even less. > > > This is very unlikely to break their $HOME. Well break access to their $HOME -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 20:33 ` Andi Kleen @ 2010-01-02 20:54 ` Frederic Weisbecker 2010-01-02 21:10 ` Ingo Molnar 2010-01-02 21:42 ` Ingo Molnar 2 siblings, 0 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 20:54 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 09:33:05PM +0100, Andi Kleen wrote: > On Sat, Jan 02, 2010 at 09:11:39PM +0100, Frederic Weisbecker wrote: > > I've never lost any datas since I began this work. And > > I run it every day. If I had experienced lock inversions, > > and sometimes soft lockups, I did not experienced serious > > damages. It's a journalized filesystem that can fixup the things > > pretty well. > > So are you confident that 2.6.33 will not have regular soft-lockups > for reiserfs users? Yep. > > > > Also we are talking about potential lock inversions, in potential > > rare path, that could potentially raise soft lockups. That makes > > a lot of potentials, for things that are going to be fixed and > > for which I've never seen serious damages. > > A soft lockup is a problem. Perhaps not totally serious, > but a user who experiences them regularly would rightly consider > such a release very broken. If there will be softlockups, these will be rare. The same risk apply to every new features in Linux (which includes improvements in existing features). > > We could make a new reiserfs version by duplicating the code > > base. But nobody will test it. That would require to patch > > mkreiserfs, waiting for distros to ship it, waiting for > > users to ship the distros. Assuming at this time there > > will be remaining users to set up new reiserfs partitions. > > I suspect your estimates on how widely reiserfs is used > are quite off. However as usual a large part of the user > base simply only uses what their distribution ships. Sure, but few (none?) distributions continue to propose reiserfs as a default. I'm not telling there are no users anymore, I'm just telling there won't be new users anymore. > > We could also have a reiserfs-no-bkl config option that > > would pick the duplicated code base. Again I fear few people > > will test it. > > That sounds reasonable, at least have a workaround if there > are too many problems. No, I'll be the only guy that is going to test it. This will basically paralyze the progresses and once we'll merge it back as the reiserfs 3 mainline, we'll have to face the same issues. The only thing that makes this work progressing is the report done by testers. Reviews on such a huge and complicated code base is useful but is also strong in proving its own limits. > > Sometimes I do. Sometimes it's just wasteful. We don't want to relax > > the lock just because of a kmalloc(__GFP_NOFS). > > If that's the problem you can always split the allocation: > first try it with __GFP_NOWAIT without lock dropped, then > if that fails do again with full __GFP_NOFS and lock drop. > > However it's hard to believe that a few instructions > more or less would make much difference; I would normally > expect any larger changes coming from changes IO > patterns or cache line bouncing. The problem is not there. Each sites where we are going to sleep have to be treated as an isolated case. Should or shouldn't we relax here? Depending on the case, it's sometimes wasteful to relax even if we are going to block, sometimes it's useful to relax even if it doesn't appear to be needed. Only reviews proven by benchmarks can help. Which is what I did. But blindly apply the "sleep if we are going to block on io or something" is not always right. > > > Better some mildew than a seriously-broken-for-enough people's > > > release (although I have my doubts that's the right metapher > > > for the BKL anyways) > > > > > > Having stable releases is an important part for > > > getting enough testers (we already have too little). And > > > if we start breaking their $HOMEs they might become > > > even less. > > > > > > This is very unlikely to break their $HOME. > > Well break access to their $HOME Never experienced that. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 20:33 ` Andi Kleen 2010-01-02 20:54 ` Frederic Weisbecker @ 2010-01-02 21:10 ` Ingo Molnar 2010-01-02 21:42 ` Ingo Molnar 2 siblings, 0 replies; 26+ messages in thread From: Ingo Molnar @ 2010-01-02 21:10 UTC (permalink / raw) To: Andi Kleen Cc: Frederic Weisbecker, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason * Andi Kleen <andi@firstfloor.org> wrote: > > We could also have a reiserfs-no-bkl config option that would pick the > > duplicated code base. Again I fear few people will test it. > > That sounds reasonable, at least have a workaround if there are too many > problems. Uhm, i think Frederic meant that a bit mockingly. Judging on past experience related to the BKL, introducing such a config option is one of the worst options possible. IMHO you are giving pretty bad unsolicited advice here. Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 20:33 ` Andi Kleen 2010-01-02 20:54 ` Frederic Weisbecker 2010-01-02 21:10 ` Ingo Molnar @ 2010-01-02 21:42 ` Ingo Molnar 2 siblings, 0 replies; 26+ messages in thread From: Ingo Molnar @ 2010-01-02 21:42 UTC (permalink / raw) To: Andi Kleen Cc: Frederic Weisbecker, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason * Andi Kleen <andi@firstfloor.org> wrote: > On Sat, Jan 02, 2010 at 09:11:39PM +0100, Frederic Weisbecker wrote: > > I've never lost any datas since I began this work. And > > I run it every day. If I had experienced lock inversions, > > and sometimes soft lockups, I did not experienced serious > > damages. It's a journalized filesystem that can fixup the things > > pretty well. > > So are you confident that 2.6.33 will not have regular soft-lockups for > reiserfs users? Well, are you confident that your hardware-poison changes in v2.6.33 will not have any user triggerable bugs? It's a pretty silly question IMO because you can never be 100% confident. What matters is for the changes to be sane and clean, for there to be no known regression and for any reported bugs to be fixed speedily. Which Frederic is doing in an utmost excellent way ... Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 20:11 ` Frederic Weisbecker 2010-01-02 20:33 ` Andi Kleen @ 2010-01-02 21:01 ` tytso 2010-01-02 21:06 ` Frederic Weisbecker 1 sibling, 1 reply; 26+ messages in thread From: tytso @ 2010-01-02 21:01 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 09:11:39PM +0100, Frederic Weisbecker wrote: > > I've never lost any datas since I began this work. And > I run it every day. If I had experienced lock inversions, > and sometimes soft lockups, I did not experienced serious > damages. It's a journalized filesystem that can fixup the things > pretty well. Have you tried using the xfsqa regression test suite? Despite the name, it will work on non-xfs filesystems (although there are some XFS-specific tests in the test suite.) Both the btrfs and ext4 developers use it to debug their file systems, and it's a good way of stressing the file system in all sorts of different ways that might not be seen during normal desktop usage. I suspect it would be a good way of flushing out potential problems for reiserfs as well. Regards, - Ted ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 21:01 ` tytso @ 2010-01-02 21:06 ` Frederic Weisbecker 2010-01-02 23:36 ` tytso 0 siblings, 1 reply; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 21:06 UTC (permalink / raw) To: tytso Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 04:01:01PM -0500, tytso@mit.edu wrote: > On Sat, Jan 02, 2010 at 09:11:39PM +0100, Frederic Weisbecker wrote: > > > > I've never lost any datas since I began this work. And > > I run it every day. If I had experienced lock inversions, > > and sometimes soft lockups, I did not experienced serious > > damages. It's a journalized filesystem that can fixup the things > > pretty well. > > Have you tried using the xfsqa regression test suite? Despite the > name, it will work on non-xfs filesystems (although there are some > XFS-specific tests in the test suite.) Both the btrfs and ext4 > developers use it to debug their file systems, and it's a good way of > stressing the file system in all sorts of different ways that might > not be seen during normal desktop usage. I suspect it would be a good > way of flushing out potential problems for reiserfs as well. > > Regards, Thanks! I'm going to test it now. I've been running a stress test from Chris Mason which basically checks races on parallel writes/read. If this testsuite includes more checks, like xattr and some other things, then that's exactly what I was searching. I guess this is the right place to get it? git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 21:06 ` Frederic Weisbecker @ 2010-01-02 23:36 ` tytso 2010-01-02 23:43 ` Christian Kujau 2010-01-03 1:52 ` Frederic Weisbecker 0 siblings, 2 replies; 26+ messages in thread From: tytso @ 2010-01-02 23:36 UTC (permalink / raw) To: Frederic Weisbecker Cc: Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 10:06:55PM +0100, Frederic Weisbecker wrote: > > Thanks! I'm going to test it now. I've been running a stress test > from Chris Mason which basically checks races on parallel writes/read. Hmm, which test is this? > If this testsuite includes more checks, like xattr and some other > things, then that's exactly what I was searching. Yes, quite a bit more than that. One such test (which is used by xfsqa test) is the fsstress proram, which is quite flexible. You can program different combinations of fallocate, direct I/O read/writes, setxattr, buffered read/writes, symlinks, truncates, renames, etc.. The xfsqa suite will run fsstress in a number of different modes, but that's not the only test program that it uses. It also uses the fsx program which exercises concurrent read/write/mmap operations, as well as other programs to test acl support, noatime support, etc. I make a point of running the regression test suite before pushing a patch series to Linus; it makes me far more comfortable than I haven't accidentally introduced some problem. > I guess this is the right place to get it? > > git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git Yep. You'll need to install a number of packages to compile it, including libaio-dev, libattr1-dev, libacl1-dev, xfsprogs, xfslibs-dev, etc. - Ted ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 23:36 ` tytso @ 2010-01-02 23:43 ` Christian Kujau 2010-01-03 1:16 ` Frederic Weisbecker 2010-01-03 1:52 ` Frederic Weisbecker 1 sibling, 1 reply; 26+ messages in thread From: Christian Kujau @ 2010-01-02 23:43 UTC (permalink / raw) To: tytso, Frederic Weisbecker, Andi Kleen, Linus Torvalds, LKML, Alexander Beregalov, Chris Mason, Ingo Molnar tytso@mit.edu wrote on 2010-01-02 15:36 : >> Thanks! I'm going to test it now. I've been running a stress test >> from Chris Mason which basically checks races on parallel writes/read. > > Hmm, which test is this? http://oss.oracle.com/~mason/stress.sh, IIRC. Christian. -- make bzImage, not war ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 23:43 ` Christian Kujau @ 2010-01-03 1:16 ` Frederic Weisbecker 0 siblings, 0 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-03 1:16 UTC (permalink / raw) To: Christian Kujau Cc: tytso, Andi Kleen, Linus Torvalds, LKML, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 03:43:52PM -0800, Christian Kujau wrote: > tytso@mit.edu wrote on 2010-01-02 15:36 : >>> Thanks! I'm going to test it now. I've been running a stress test >>> from Chris Mason which basically checks races on parallel writes/read. >> >> Hmm, which test is this? > > http://oss.oracle.com/~mason/stress.sh, IIRC. > > Christian. That's it! It's simple script that runs concurrent copies of a directory and that checks the coherency of the result. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 23:36 ` tytso 2010-01-02 23:43 ` Christian Kujau @ 2010-01-03 1:52 ` Frederic Weisbecker 2010-01-03 2:05 ` Christian Kujau 1 sibling, 1 reply; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-03 1:52 UTC (permalink / raw) To: tytso, Andi Kleen, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 06:36:03PM -0500, tytso@mit.edu wrote: > Yes, quite a bit more than that. One such test (which is used by > xfsqa test) is the fsstress proram, which is quite flexible. You can > program different combinations of fallocate, direct I/O read/writes, > setxattr, buffered read/writes, symlinks, truncates, renames, etc.. > > The xfsqa suite will run fsstress in a number of different modes, but > that's not the only test program that it uses. It also uses the fsx > program which exercises concurrent read/write/mmap operations, as well > as other programs to test acl support, noatime support, etc. > > I make a point of running the regression test suite before pushing a > patch series to Linus; it makes me far more comfortable than I haven't > accidentally introduced some problem. Hmm, I'm trying to play with it, and some things are not obvious to me. I can launch fsstress directly, works well, but as I would like to run all possible tests, I'm trying the check script, as explained in the README file. So I've set the partition and ran the script: export TEST_DEV=/dev/sda3 ./check -r But: [: 53: ==: unexpected operator common.rc: Error: $TEST_DEV (/dev/sda3) is not a MOUNTED xfs filesystem Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/sda3 reiserfs 134797288 2982296 131814992 3% /data I'm not sure how I can run these tests on a non-xfs partitions. I must be missing something. > > I guess this is the right place to get it? > > > > git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git > > Yep. You'll need to install a number of packages to compile it, > including libaio-dev, libattr1-dev, libacl1-dev, xfsprogs, > xfslibs-dev, etc. Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-03 1:52 ` Frederic Weisbecker @ 2010-01-03 2:05 ` Christian Kujau 2010-01-03 3:27 ` tytso 0 siblings, 1 reply; 26+ messages in thread From: Christian Kujau @ 2010-01-03 2:05 UTC (permalink / raw) To: Frederic Weisbecker Cc: tytso, Andi Kleen, Linus Torvalds, LKML, Alexander Beregalov, Chris Mason, Ingo Molnar On Sun, 3 Jan 2010 at 02:52, Frederic Weisbecker wrote: > [: 53: ==: unexpected operator > common.rc: Error: $TEST_DEV (/dev/sda3) is not a MOUNTED xfs filesystem Yeah, I'm playing around with xfstests as well, but apparently they're assuming !/bin/sh will be run under /bin/bash, which is not always the case. A short fix is to link /bin/sh to /bin/bash, but perhaps some of the tests can be tweaked to run under /bin/sh as well. > I'm not sure how I can run these tests on a non-xfs partitions. > I must be missing something. I haven't found an easy way to do this yet without rewriting a few routines (mkfs, mount, etc...). As Ted is already using xfstests for btrfs, ext4, maybe he wants to share his magic? :-) Christian. -- BOFH excuse #47: Complete Transient Lockout ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-03 2:05 ` Christian Kujau @ 2010-01-03 3:27 ` tytso 2010-01-04 20:20 ` Frederic Weisbecker 0 siblings, 1 reply; 26+ messages in thread From: tytso @ 2010-01-03 3:27 UTC (permalink / raw) To: Christian Kujau Cc: Frederic Weisbecker, Andi Kleen, Linus Torvalds, LKML, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 06:05:13PM -0800, Christian Kujau wrote: > On Sun, 3 Jan 2010 at 02:52, Frederic Weisbecker wrote: > > [: 53: ==: unexpected operator > > common.rc: Error: $TEST_DEV (/dev/sda3) is not a MOUNTED xfs filesystem > > Yeah, I'm playing around with xfstests as well, but apparently they're > assuming !/bin/sh will be run under /bin/bash, which is not always the > case. A short fix is to link /bin/sh to /bin/bash, but perhaps some of the > tests can be tweaked to run under /bin/sh as well. > > > I'm not sure how I can run these tests on a non-xfs partitions. > > I must be missing something. > > I haven't found an easy way to do this yet without rewriting a few > routines (mkfs, mount, etc...). As Ted is already using xfstests for > btrfs, ext4, maybe he wants to share his magic? :-) I'm using it for ext4. It looks like someone has already tried using the xfstests with reiserfs; take a look at common.rc; you'll see case statements for xfs, udf, nfs, ext2/3/4, reiserfs, and gfs2. Someone who wants to use xfstests for some other file system may need to further edit common.rc. I thought the btrfs folks were using it as well, but at least the kernel.org git tree for xfstests doesn't seem to show any btrfs references in common.rc, so perhaps I'm wrong about btrfs developers using xfstests (or they haven't sent their patches back upstream). I'm not sure how well tested the reiserfs support is, so you may need to edit common.rc as necessary. In any case, the README file has pretty much what you need. I personally run my test kernels using KVM, and I have a run-test script which invokes check as follows: #!/bin/sh export TEST_DEV=/dev/sdb export TEST_DIR=/test export SCRATCH_DEV=/dev/sdc1 export SCRATCH_MNT=/scratch export EXT_MOUNT_OPTIONS="-o block_validity" exec ./check -ext4 $* /dev/sdb is an ext4-formated filesystem, which you're supposed to not reformat from run to run, so that some testing can be done with an "aged" file system. The scratch filesystem will be reformatted for various tests, so you shouldn't keep anything valueable on it. I also have a 1k block file system on /dev/sdd, so I invoke check as follows to check to make sure things work when blocksize != pagesize: #!/bin/sh export TEST_DEV=/dev/sdd export TEST_DIR=/test-1k export SCRATCH_DEV=/dev/sdc1 export SCRATCH_MNT=/scratch export MKFS_OPTIONS="-b 1024" exec ./check -ext4 $* As far as the inconsistency between TEST_DIR versus SCRATCH_MNT --- all I can say is, the XFS engineers who threw together the xfstests scripts may have been very good file system engineers, but they obviously weren't very well used as UI/User Experience designers. :-) (The documentation isn't all that great either, but patches sent to xfs@oss.sgi.com do tend to be accepted and merged into the kernel.org tree if they are clean.) Hope this helps, - Ted ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-03 3:27 ` tytso @ 2010-01-04 20:20 ` Frederic Weisbecker 0 siblings, 0 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-04 20:20 UTC (permalink / raw) To: tytso Cc: Christian Kujau, Andi Kleen, Linus Torvalds, LKML, Alexander Beregalov, Chris Mason, Ingo Molnar On Sat, Jan 02, 2010 at 10:27:58PM -0500, tytso@mit.edu wrote: > On Sat, Jan 02, 2010 at 06:05:13PM -0800, Christian Kujau wrote: > > On Sun, 3 Jan 2010 at 02:52, Frederic Weisbecker wrote: > > > [: 53: ==: unexpected operator > > > common.rc: Error: $TEST_DEV (/dev/sda3) is not a MOUNTED xfs filesystem > > > > Yeah, I'm playing around with xfstests as well, but apparently they're > > assuming !/bin/sh will be run under /bin/bash, which is not always the > > case. A short fix is to link /bin/sh to /bin/bash, but perhaps some of the > > tests can be tweaked to run under /bin/sh as well. > > > > > I'm not sure how I can run these tests on a non-xfs partitions. > > > I must be missing something. > > > > I haven't found an easy way to do this yet without rewriting a few > > routines (mkfs, mount, etc...). As Ted is already using xfstests for > > btrfs, ext4, maybe he wants to share his magic? :-) > > I'm using it for ext4. It looks like someone has already tried using > the xfstests with reiserfs; take a look at common.rc; you'll see case > statements for xfs, udf, nfs, ext2/3/4, reiserfs, and gfs2. Someone > who wants to use xfstests for some other file system may need to > further edit common.rc. I thought the btrfs folks were using it as > well, but at least the kernel.org git tree for xfstests doesn't seem > to show any btrfs references in common.rc, so perhaps I'm wrong about > btrfs developers using xfstests (or they haven't sent their patches > back upstream). I'm not sure how well tested the reiserfs support is, > so you may need to edit common.rc as necessary. > > In any case, the README file has pretty much what you need. I > personally run my test kernels using KVM, and I have a run-test script > which invokes check as follows: > > #!/bin/sh > export TEST_DEV=/dev/sdb > export TEST_DIR=/test > export SCRATCH_DEV=/dev/sdc1 > export SCRATCH_MNT=/scratch > export EXT_MOUNT_OPTIONS="-o block_validity" > exec ./check -ext4 $* > > /dev/sdb is an ext4-formated filesystem, which you're supposed to not > reformat from run to run, so that some testing can be done with an > "aged" file system. The scratch filesystem will be reformatted for > various tests, so you shouldn't keep anything valueable on it. > > I also have a 1k block file system on /dev/sdd, so I invoke check as > follows to check to make sure things work when blocksize != pagesize: > > #!/bin/sh > export TEST_DEV=/dev/sdd > export TEST_DIR=/test-1k > export SCRATCH_DEV=/dev/sdc1 > export SCRATCH_MNT=/scratch > export MKFS_OPTIONS="-b 1024" > exec ./check -ext4 $* Ok. Thanks a lot for all these details. It seems to be working now that I have edited some shebang and replaced /bin/sh by /bin/bash, like Christopher suggested. > As far as the inconsistency between TEST_DIR versus SCRATCH_MNT --- > all I can say is, the XFS engineers who threw together the xfstests > scripts may have been very good file system engineers, but they > obviously weren't very well used as UI/User Experience designers. :-) Agreed :) > (The documentation isn't all that great either, but patches sent to > xfs@oss.sgi.com do tend to be accepted and merged into the kernel.org > tree if they are clean.) > > Hope this helps, Yeah, thanks! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 17:43 ` reiserfs broken in 2.6.32 was " Andi Kleen 2010-01-02 19:02 ` Frederic Weisbecker @ 2010-01-02 20:11 ` Ingo Molnar 2010-01-02 22:18 ` Ingo Molnar 2 siblings, 0 replies; 26+ messages in thread From: Ingo Molnar @ 2010-01-02 20:11 UTC (permalink / raw) To: Andi Kleen Cc: Frederic Weisbecker, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason * Andi Kleen <andi@firstfloor.org> wrote: > > I only have reiserfs partitions in my laptop and my testbox, nothing else. > > And that because I'm now maintaining it de facto. > > AFAIK it's widely used in SUSE installations. It was the default for a long > time. [ Btw., if so then SuSE/Novell should sponsor Frederic's reiserfs maintanence work. ] > And right now as in 2.6.32 it's in a state of "may randomly > explode/deadlock". And no clear path out of it. Not good. You are quite wrong about that accusation: Frederic's changes are not in v2.6.32 - they were merged by Linus in the v2.6.33 cycle so that code cannot physicaly have caused any problems in v2.6.32 ... If there's stability problems with reiserfs in v2.6.32 then it's doubly good that Linus merged Frederic's tree in v2.6.33 [beyond the obvious advantage that it gets rid of the BKL, which was a serious and oft reported limit to reiserfs scalability]: finally there's again reiserfs development activity, which might lead to further fixes, and which might solve the v2.6.32 stability problems you mention. In my view this work could have been merged sooner, in v2.6.32 already, maybe that way the v2.6.32 reiserfs stability problems you mentioned could have been avoided. Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes 2010-01-02 17:43 ` reiserfs broken in 2.6.32 was " Andi Kleen 2010-01-02 19:02 ` Frederic Weisbecker 2010-01-02 20:11 ` Ingo Molnar @ 2010-01-02 22:18 ` Ingo Molnar 2 siblings, 0 replies; 26+ messages in thread From: Ingo Molnar @ 2010-01-02 22:18 UTC (permalink / raw) To: Andi Kleen Cc: Frederic Weisbecker, Linus Torvalds, LKML, Christian Kujau, Alexander Beregalov, Chris Mason * Andi Kleen <andi@firstfloor.org> wrote: > > I only have reiserfs partitions in my laptop and my testbox, > > nothing else. And that because I'm now maintaining it de facto. > > AFAIK it's widely used in SUSE installations. It was the default for a long > time. > > And right now as in 2.6.32 it's in a state of "may randomly > explode/deadlock". And no clear path out of it. Not good. Btw. that's quite weird as nothing of substance happened in fs/reiserfs in v2.6.32: $ git shortlog v2.6.31..v2.6.32 fs/reiserfs/ Alexey Dobriyan (2): const: make struct super_block::dq_op const const: make struct super_block::s_qcop const ( There might be VFS core interactions perhaps - but nothing i've seen seems to correlate with your claims and that's quite weird. ) There's always reports about filesystem corruptions (file data cache is statistically the most likely thing to get corrupted by borderline hw failures such as RAM failures or DMA corruptions), but i've seen nothing out of the ordinary for v2.6.32 with reiserfs. (and the v2.6.33-rc1 bits are not yet merged in distro kernels) So please substantiate your claims: URLs, distro bugzilla entries, etc. Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [GIT PULL] reiserfs fixes 2010-01-02 1:27 [GIT PULL] reiserfs fixes Frederic Weisbecker 2010-01-02 13:41 ` Andi Kleen @ 2010-01-02 19:19 ` Linus Torvalds 2010-01-02 19:21 ` Linus Torvalds 2010-01-02 19:22 ` Frederic Weisbecker 1 sibling, 2 replies; 26+ messages in thread From: Linus Torvalds @ 2010-01-02 19:19 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar, Greg KH On Sat, 2 Jan 2010, Frederic Weisbecker wrote: > > These changes fix a lot of lock inversions, some of them were > triggering soft lockups very easily in xattrs operations. Should 2.6.32-stable merge this branch too? Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [GIT PULL] reiserfs fixes 2010-01-02 19:19 ` Linus Torvalds @ 2010-01-02 19:21 ` Linus Torvalds 2010-01-02 19:24 ` Frederic Weisbecker 2010-01-02 19:22 ` Frederic Weisbecker 1 sibling, 1 reply; 26+ messages in thread From: Linus Torvalds @ 2010-01-02 19:21 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar, Greg KH On Sat, 2 Jan 2010, Linus Torvalds wrote: > > Should 2.6.32-stable merge this branch too? Never mind, none of the reiserfs bkl-removal went into 2.6.32. We had some other BKL work that got in there, but all the reiserfs stuff is post-2.6.32 only, I guess. Linus ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [GIT PULL] reiserfs fixes 2010-01-02 19:21 ` Linus Torvalds @ 2010-01-02 19:24 ` Frederic Weisbecker 0 siblings, 0 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 19:24 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar, Greg KH On Sat, Jan 02, 2010 at 11:21:51AM -0800, Linus Torvalds wrote: > > > On Sat, 2 Jan 2010, Linus Torvalds wrote: > > > > Should 2.6.32-stable merge this branch too? > > Never mind, none of the reiserfs bkl-removal went into 2.6.32. We had some > other BKL work that got in there, but all the reiserfs stuff is > post-2.6.32 only, I guess. > > Linus Exactly. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [GIT PULL] reiserfs fixes 2010-01-02 19:19 ` Linus Torvalds 2010-01-02 19:21 ` Linus Torvalds @ 2010-01-02 19:22 ` Frederic Weisbecker 1 sibling, 0 replies; 26+ messages in thread From: Frederic Weisbecker @ 2010-01-02 19:22 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Christian Kujau, Alexander Beregalov, Chris Mason, Ingo Molnar, Greg KH On Sat, Jan 02, 2010 at 11:19:20AM -0800, Linus Torvalds wrote: > > > On Sat, 2 Jan 2010, Frederic Weisbecker wrote: > > > > These changes fix a lot of lock inversions, some of them were > > triggering soft lockups very easily in xattrs operations. > > Should 2.6.32-stable merge this branch too? > > Linus No this only relies on the bkl removal patches merged in this cycle. Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2010-01-04 20:20 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-02 1:27 [GIT PULL] reiserfs fixes Frederic Weisbecker 2010-01-02 13:41 ` Andi Kleen 2010-01-02 16:36 ` Frederic Weisbecker 2010-01-02 17:43 ` reiserfs broken in 2.6.32 was " Andi Kleen 2010-01-02 19:02 ` Frederic Weisbecker 2010-01-02 19:23 ` Andi Kleen 2010-01-02 20:11 ` Frederic Weisbecker 2010-01-02 20:33 ` Andi Kleen 2010-01-02 20:54 ` Frederic Weisbecker 2010-01-02 21:10 ` Ingo Molnar 2010-01-02 21:42 ` Ingo Molnar 2010-01-02 21:01 ` tytso 2010-01-02 21:06 ` Frederic Weisbecker 2010-01-02 23:36 ` tytso 2010-01-02 23:43 ` Christian Kujau 2010-01-03 1:16 ` Frederic Weisbecker 2010-01-03 1:52 ` Frederic Weisbecker 2010-01-03 2:05 ` Christian Kujau 2010-01-03 3:27 ` tytso 2010-01-04 20:20 ` Frederic Weisbecker 2010-01-02 20:11 ` Ingo Molnar 2010-01-02 22:18 ` Ingo Molnar 2010-01-02 19:19 ` Linus Torvalds 2010-01-02 19:21 ` Linus Torvalds 2010-01-02 19:24 ` Frederic Weisbecker 2010-01-02 19:22 ` Frederic Weisbecker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox