* Re: 2.6.12-rc4-mm1
From: David Woodhouse @ 2005-05-13 18:13 UTC (permalink / raw)
To: steve; +Cc: Andrew Morton, linux-fsdevel, dedekind
On Fri, 2005-05-13 at 09:53 -0500, steve would have written, if his mail
client hadn't been broken:
> a bug that appeared after running for about 2 hours:
>
> May 13 09:32:34 localhost kernel: BUG: atomic counter underflow at:
> May 13 09:32:34 localhost kernel: [reiserfs_clear_inode+129/176] reiserfs_clear_inode+0x81/0xb0
> May 13 09:32:34 localhost kernel: [clear_inode+228/304] clear_inode+0xe4/0x130
> May 13 09:32:34 localhost kernel: [dispose_list+112/304] dispose_list+0x70/0x130
> May 13 09:32:34 localhost kernel: [prune_icache+191/432] prune_icache+0xbf/0x1b0
> May 13 09:32:34 localhost kernel: [shrink_icache_memory+20/64] shrink_icache_memory+0x14/0x40
> May 13 09:32:34 localhost kernel: [shrink_slab+345/416] shrink_slab+0x159/0x1a0
> May 13 09:32:34 localhost kernel: [balance_pgdat+695/944] balance_pgdat+0x2b7/0x3b0
> May 13 09:32:34 localhost kernel: [kswapd+210/240] kswapd+0xd2/0xf0
> May 13 09:32:34 localhost kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50
> May 13 09:32:34 localhost kernel: [ret_from_fork+6/20] ret_from_fork+0x6/0x14
> May 13 09:32:34 localhost kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50
> May 13 09:32:34 localhost kernel: [kswapd+0/240] kswapd+0x0/0xf0
> May 13 09:32:34 localhost kernel: [kernel_thread_helper+5/24] kernel_thread_helper+0x5/0x18
Hmmm. We're hitting that bug when posix_acl_release() decrements the
refcount on one of the inode's ACLs and it goes negative.
First glance at this had me suspecting that we were somehow calling
clear_inode() twice... but since we clear the pointer to the ACL after
calling posix_acl_release(), that seems unlikely -- unless you managed
to get two CPUs in reiserfs_clear_inode() simultaneously for the same
inode. Is this SMP? Is preempt enabled?
Can you reproduce it? If so, does it go away if you revert one or both
of these:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/vfs-bugfix-two-read_inode-calles-without.patch
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/__wait_on_freeing_inode-fix.patch
--
dwmw2
* Re: 2.6.12-rc4-mm1
From: steve @ 2005-05-14 1:07 UTC (permalink / raw)
To: David Woodhouse; +Cc: Andrew Morton, linux-fsdevel, dedekind
David Woodhouse wrote:
>Can you reproduce it? If so, does it go away if you revert one or both
>of these:
>
>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/vfs-bugfix-two-read_inode-calles-without.patch
>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/__wait_on_freeing_inode-fix.patch
Okay, reproduced after a couple of hours of running (nothing intensive),
then doing a grep blah -r /*
Here's the new output:
May 13 19:54:16 localhost kernel: BUG: atomic counter underflow at:
May 13 19:54:16 localhost kernel: [reiserfs_clear_inode+129/176] reiserfs_clear_inode+0x81/0xb0
May 13 19:54:16 localhost kernel: [clear_inode+228/304] clear_inode+0xe4/0x130
May 13 19:54:16 localhost kernel: [dispose_list+112/304] dispose_list+0x70/0x130
May 13 19:54:16 localhost kernel: [prune_icache+191/432] prune_icache+0xbf/0x1b0
May 13 19:54:16 localhost kernel: [shrink_icache_memory+20/64] shrink_icache_memory+0x14/0x40
May 13 19:54:16 localhost kernel: [shrink_slab+345/416] shrink_slab+0x159/0x1a0
May 13 19:54:16 localhost kernel: [try_to_free_pages+226/416] try_to_free_pages+0xe2/0x1a0
May 13 19:54:16 localhost kernel: [__alloc_pages+383/960] __alloc_pages+0x17f/0x3c0
May 13 19:54:16 localhost kernel: [__do_page_cache_readahead+285/352] __do_page_cache_readahead+0x11d/0x160
May 13 19:54:16 localhost kernel: [blockable_page_cache_readahead+81/208] blockable_page_cache_readahead+0x51/0xd0
May 13 19:54:16 localhost kernel: [make_ahead_window+112/176] make_ahead_window+0x70/0xb0
May 13 19:54:16 localhost kernel: [page_cache_readahead+169/384] page_cache_readahead+0xa9/0x180
May 13 19:54:16 localhost kernel: [file_read_actor+198/224] file_read_actor+0xc6/0xe0
May 13 19:54:16 localhost kernel: [do_generic_mapping_read+1446/1472] do_generic_mapping_read+0x5a6/0x5c0
May 13 19:54:16 localhost kernel: [pg0+542324240/1068651520] ieee80211_recv_mgmt+0xed0/0x1d90 [wlan]
May 13 19:54:16 localhost kernel: [file_read_actor+0/224] file_read_actor+0x0/0xe0
May 13 19:54:16 localhost kernel: [__generic_file_aio_read+484/544] __generic_file_aio_read+0x1e4/0x220
May 13 19:54:16 localhost kernel: [file_read_actor+0/224] file_read_actor+0x0/0xe0
May 13 19:54:16 localhost kernel: [generic_file_read+149/176] generic_file_read+0x95/0xb0
May 13 19:54:16 localhost kernel: [try_to_wake_up+166/192] try_to_wake_up+0xa6/0xc0
May 13 19:54:16 localhost kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50
May 13 19:54:16 localhost kernel: [schedule+791/1616] schedule+0x317/0x650
May 13 19:54:16 localhost kernel: [vfs_read+156/336] vfs_read+0x9c/0x150
May 13 19:54:16 localhost kernel: [sys_read+71/128] sys_read+0x47/0x80
May 13 19:54:16 localhost kernel: [syscall_call+7/11] syscall_call+0x7/0xb
I'll recompile without those two patches and try it again.
Steve
* Re: 2.6.12-rc4-mm1
From: Artem B. Bityuckiy @ 2005-05-14 10:46 UTC (permalink / raw)
To: steve; +Cc: David Woodhouse, Andrew Morton, linux-fsdevel
> I'll try to reproduce it. This is on an IBM X31 with a Pentium M, no
> SMP. Using reiserfs (not v4), but I did compile it with reiser4.
So was preemption enabled or disabled?
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: 2.6.12-rc4-mm1
From: Artem B. Bityuckiy @ 2005-05-18 10:06 UTC (permalink / raw)
To: steve; +Cc: David Woodhouse, Andrew Morton, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 1040 bytes --]
Steve,
> okay, reproduced after a couple hours of running (nothing intensive),
> then doing a grep blah -r /*
I can't reproduce your problem using 2.6.12-rc4 + the 2 patches.
I created a 50GB Reiserfs partition, copied a lot of data there (built
the Linux sources many times over), and issued 'grep -r blah *'. I also
tried it with parallel writes to the partition; that didn't help.
Could you please try 2.6.12-rc4 + the 2 patches and see if the
warning is still there?
The patches referred to above are (also attached):
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/vfs-bugfix-two-read_inode-calles-without.patch
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc4/2.6.12-rc4-mm1/broken-out/__wait_on_freeing_inode-fix.patch
P.S. I wanted to try 2.6.12-rc4-mm2, but it oopses during kernel load
on my system, so I switched to 2.6.12-rc4 in order not to fight
-mm2's problems.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
[-- Attachment #2: vfs-bugfix-two-read_inode-calles-without.patch --]
[-- Type: text/x-patch, Size: 1184 bytes --]
diff -puN fs/inode.c~vfs-bugfix-two-read_inode-calles-without fs/inode.c
--- 25/fs/inode.c~vfs-bugfix-two-read_inode-calles-without Fri May 6 15:12:47 2005
+++ 25-akpm/fs/inode.c Fri May 6 15:12:47 2005
@@ -282,6 +282,13 @@ static void dispose_list(struct list_hea
if (inode->i_data.nrpages)
truncate_inode_pages(&inode->i_data, 0);
clear_inode(inode);
+
+ spin_lock(&inode_lock);
+ hlist_del_init(&inode->i_hash);
+ list_del_init(&inode->i_sb_list);
+ spin_unlock(&inode_lock);
+
+ wake_up_inode(inode);
destroy_inode(inode);
nr_disposed++;
}
@@ -317,8 +324,6 @@ static int invalidate_list(struct list_h
inode = list_entry(tmp, struct inode, i_sb_list);
invalidate_inode_buffers(inode);
if (!atomic_read(&inode->i_count)) {
- hlist_del_init(&inode->i_hash);
- list_del(&inode->i_sb_list);
list_move(&inode->i_list, dispose);
inode->i_state |= I_FREEING;
count++;
@@ -439,8 +444,6 @@ static void prune_icache(int nr_to_scan)
if (!can_unuse(inode))
continue;
}
- hlist_del_init(&inode->i_hash);
- list_del_init(&inode->i_sb_list);
list_move(&inode->i_list, &freeable);
inode->i_state |= I_FREEING;
nr_pruned++;
[-- Attachment #3: __wait_on_freeing_inode-fix.patch --]
[-- Type: text/x-patch, Size: 1655 bytes --]
diff -puN fs/inode.c~__wait_on_freeing_inode-fix fs/inode.c
--- 25/fs/inode.c~__wait_on_freeing_inode-fix 2005-05-09 20:09:33.000000000 -0700
+++ 25-akpm/fs/inode.c 2005-05-09 20:09:33.000000000 -0700
@@ -1241,29 +1241,21 @@ int inode_wait(void *word)
}
/*
- * If we try to find an inode in the inode hash while it is being deleted, we
- * have to wait until the filesystem completes its deletion before reporting
- * that it isn't found. This is because iget will immediately call
- * ->read_inode, and we want to be sure that evidence of the deletion is found
- * by ->read_inode.
+ * If we try to find an inode in the inode hash while it is being
+ * deleted, we have to wait until the filesystem completes its
+ * deletion before reporting that it isn't found. This function waits
+ * until the deletion _might_ have completed. Callers are responsible
+ * to recheck inode state.
+ *
+ * It doesn't matter if I_LOCK is not set initially, a call to
+ * wake_up_inode() after removing from the hash list will DTRT.
+ *
* This is called with inode_lock held.
*/
static void __wait_on_freeing_inode(struct inode *inode)
{
wait_queue_head_t *wq;
DEFINE_WAIT_BIT(wait, &inode->i_state, __I_LOCK);
-
- /*
- * I_FREEING and I_CLEAR are cleared in process context under
- * inode_lock, so we have to give the tasks who would clear them
- * a chance to run and acquire inode_lock.
- */
- if (!(inode->i_state & I_LOCK)) {
- spin_unlock(&inode_lock);
- yield();
- spin_lock(&inode_lock);
- return;
- }
wq = bit_waitqueue(&inode->i_state, __I_LOCK);
prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
spin_unlock(&inode_lock);
* Re: 2.6.12-rc4-mm1
From: Steve Roemen @ 2005-05-18 16:16 UTC (permalink / raw)
To: dedekind; +Cc: David Woodhouse, Andrew Morton, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 4331 bytes --]
on 05/18/05 05:06 Artem B. Bityuckiy wrote the following:
>Steve,
>
>>okay, reproduced after a couple of hours of running (nothing intensive),
>>then doing a grep blah -r /*
>
>I can't reproduce your problem using 2.6.12-rc4 + the 2 patches.
>I created a 50GB Reiserfs partition, copied a lot of data there (built
>the Linux sources many times over), and issued 'grep -r blah *'. I also
>tried it with parallel writes to the partition; that didn't help.
>
>Could you please try 2.6.12-rc4 + the 2 patches and see if the
>warning is still there?
Okay, recompiled with those two patches back in, and after an hour of
running, while doing a tar -xjvf linux-2.6.11.tar.bz2, it kicks out
that error (attached).
Steve
[-- Attachment #2: dmesg_output --]
[-- Type: text/plain, Size: 553 bytes --]
BUG: atomic counter underflow at:
[<c01aadf1>] reiserfs_clear_inode+0x81/0xb0
[<c0173484>] clear_inode+0xe4/0x130
[<c0173540>] dispose_list+0x70/0x130
[<c017385f>] prune_icache+0xbf/0x1b0
[<c0173964>] shrink_icache_memory+0x14/0x40
[<c01471b9>] shrink_slab+0x159/0x1a0
[<c01486e7>] balance_pgdat+0x2b7/0x3b0
[<c01488b2>] kswapd+0xd2/0xf0
[<c0130430>] autoremove_wake_function+0x0/0x50
[<c0102fd2>] ret_from_fork+0x6/0x14
[<c0130430>] autoremove_wake_function+0x0/0x50
[<c01487e0>] kswapd+0x0/0xf0
[<c010136d>] kernel_thread_helper+0x5/0x18
* Re: 2.6.12-rc4-mm1
From: David Woodhouse @ 2005-05-19 16:45 UTC (permalink / raw)
To: steve; +Cc: dedekind, Andrew Morton, linux-fsdevel, mason
On Wed, 2005-05-18 at 11:16 -0500, Steve Roemen wrote:
> okay, recompiled with those two patches back in, and after an hour of
> running, while doing a tar -xjvf linux-2.6.11.tar.bz2, it kicks out
> that error (attached).
Thanks. Are you using ACLs? If not, I think there's a more fundamental
problem than a race with clear_inode() -- it's not that we're
decrementing the use count on an ACL twice; it's that you think you have
an ACL when there wasn't one. This could be a symptom of memory
corruption... which has already been reported in reiserfs in 2.6.12-rc4.
Do you have CONFIG_REISERFS_CHECK enabled? Do you have preempt enabled?
Could we trouble you to try again on 2.6.12-rc3 with those two patches,
please?
--
dwmw2
* Re: 2.6.12-rc4-mm1
From: Steve Roemen @ 2005-05-19 17:55 UTC (permalink / raw)
To: David Woodhouse; +Cc: dedekind, Andrew Morton, linux-fsdevel, mason
on 05/19/05 11:45 David Woodhouse wrote the following:
>Thanks. Are you using ACLs? If not, I think there's a more fundamental
>problem than a race with clear_inode() -- it's not that we're
>decrementing the use count on an ACL twice; it's that you think you have
>an ACL when there wasn't one. This could be a symptom of memory
>corruption... which has already been reported in reiserfs in 2.6.12-rc4.
>
>Do you have CONFIG_REISERFS_CHECK enabled? Do you have preempt enabled?
>
>Could we trouble you to try again on 2.6.12-rc3 with those two patches,
>please?
Compiling 2.6.12-rc3 + the 2 patches right now. I'll let you know in a
couple of hours if it still does it.
Artem forwarded you my .config file.
Steve
* Re: 2.6.12-rc4-mm1
From: David Woodhouse @ 2005-05-19 18:04 UTC (permalink / raw)
To: steve; +Cc: dedekind, Andrew Morton, linux-fsdevel, mason
On Thu, 2005-05-19 at 12:55 -0500, Steve Roemen wrote:
> Compiling 2.6.12-rc3 + the 2 patches right now. I'll let you know in a
> couple of hours if it still does it.
> Artem forwarded you my .config file.
He did. That confirms you have ACLs enabled -- but are you actually
_using_ them? If not, the ACL fields whose refcount is causing
this problem should never have been set in the first place.
Artem is putting together a patch which will put a magic value into the
struct posix_acl and hence double-check whether we're really freeing one
of them twice, or whether it's just that we're seeing memory corruption
and what's in REISERFS_I(inode)->i_acl_{access,default} is pure noise.
It might also be useful to attempt to reproduce the problem with slab
debugging turned on, but let's not change that variable just yet.
--
dwmw2
* Re: 2.6.12-rc4-mm1
From: Steve Roemen @ 2005-05-19 20:12 UTC (permalink / raw)
To: David Woodhouse; +Cc: dedekind, Andrew Morton, linux-fsdevel, mason
on 05/19/05 13:04 David Woodhouse wrote the following:
>He did. That confirms you have ACLs enabled -- but are you actually
>_using_ them though? If not, the ACL fields whose refcount is causing
>this problem should never have been set in the first place.
>
>Artem is putting together a patch which will put a magic value into the
>struct posix_acl and hence double-check whether we're really freeing one
>of them twice, or whether it's just that we're seeing memory corruption
>and what's in REISERFS_I(inode)->i_acl_{access,default} is pure noise.
>
>It might also be useful to attempt to reproduce the problem with slab
>debugging turned on, but let's not change that variable just yet.
No, I am not using ACLs. I am running 2.6.12-rc3 with those two
patches, and I can't get it to error out.
Steve
* Re: 2.6.12-rc4-mm1
From: David Woodhouse @ 2005-05-19 20:21 UTC (permalink / raw)
To: steve; +Cc: dedekind, Andrew Morton, linux-fsdevel, mason
On Thu, 2005-05-19 at 15:12 -0500, Steve Roemen wrote:
> No, I am not using ACLs. I am running the 2.6.12-rc3 with those two
> patches, and I can't get it to error out.
OK, then it sounds like what you've seen is a manifestation of the
already-known reiserfs breakage in 2.6.12-rc4. Artem's patch didn't
cause it; it just made it show itself.
Thanks.
--
dwmw2