public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Chris Caputo <ccaputo@alt.net>
Cc: linux-kernel@vger.kernel.org,
	David Woodhouse <dwmw2@infradead.org>,
	riel@redhat.com, Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?
Date: Fri, 25 Jun 2004 09:17:43 -0300	[thread overview]
Message-ID: <20040625121743.GA24896@logos.cnet> (raw)
In-Reply-To: <Pine.LNX.4.44.0406231802100.13351-100000@nacho.alt.net>

On Wed, Jun 23, 2004 at 06:50:48PM -0700, Chris Caputo wrote:
> On Sat, 19 Jun 2004, Marcelo Tosatti wrote:
> > On Fri, Jun 18, 2004 at 05:47:05PM -0700, Chris Caputo wrote:
> > > In 2.4.26 on two different dual-proc x86 machines (one dual-P4 Xeon based,
> > > the other dual-PIII) I am seeing crashes which are the result of the
> > > inode_unused doubly linked list in fs/inode.c becoming corrupted.
> > 
> > What steps are required to reproduce the problem?
> 
> Unfortunately I don't yet know how to (or if I can) repro this with a
> stock kernel.  I am using 2.4.26 + Ingo's tux3-2.4.23-A3 patch in
> conjunction with a filesystem I wrote.  (the tux module itself is not
> being used, just the patches to the existing kernel)
> 
> > > A particular instance of the corruption I have isolated is in a call from
> > > iput() to __refile_inode().  To try to diagnose this further I placed list
> > > verification code before and after the list_del() and list_add() calls in
> > > __refile_inode() and observed a healthy list become corrupted after the
> > > del/add was completed.
> > 
> > Can you show us this data in more detail?
> 
> In __refile_inode() before and after the list_add()/del() calls I call a
> function which checks up to the first 10 items on the inode_unused list to
> see if next and prev pointers are valid.
> (inode->next->prev == inode && inode->prev->next == inode)
> 
> So what I observed was a case here where iput() inline __refile_inode():
> 
>  1) checked inode_unused and saw that it was good
>  2) put an item on the inode_unused list
>  3) checked inode_unused and saw that it was now bad and that the item 
>     added was the culprit.
> 
> This all happened within __refile_inode() with the inode_lock spinlock
> grabbed by iput() and so I tend to think some other code is accessing the
> inode_unused list _without_ grabbing the spinlock.  I've checked the
> inode.c code over and over, plus my filesystem code, and haven't yet found
> a culprit.  I also checked the tux diffs to see if it was messing with
> inode objects in an inappropriate way.
> 
> Is it safe to assume that the x86 version of atomic_dec_and_lock(), which
> iput() uses, is well trusted?  I figure it's got to be, but doesn't hurt
> to ask.

Pretty sure it is, used all over. You can try to use non-optimize version 
at lib/dec_and_lock.c for a test.

> > > It would seem to me that list corruption on otherwise healthy machines
> > > would only be the result of the inode_lock spinlock not being properly
> > > locked prior to the call to __refile_inode(), but as far as I can tell,
> > > the call to atomic_dec_and_lock() in iput() is doing that properly.
> > > 
> > > So I am at a loss.  Has anyone else seen this or does anyone have any idea
> > > what routes I should be exploring to fix this problem?
> > 
> > The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and 
> > the unused_pagecache list) must have something to do with this. 
> 
> __refile_inode() was introduced in 2.4.25.  I'll try 2.4.24 to see if I
> can reproduce there.
> 
> Marcelo, you asked for some oops' relating to the problem.  Here are some.

Chris, thanks. Would love to see you reproducing the oops with a stock kernel with ext2/3.

> ---
> 
> Oops: 0000
> CPU:    0
> EIP:    0010:[<c015b465>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010a93
> eax: 8d12e0e2   ebx: 7b402366   ecx: c40fdf18   edx: d9c92888
> esi: 7b40235e   edi: 7b402366   ebp: 000007cf   esp: c40fdf10
> ds: 0018   es: 0018   ss: 0018
> Process kswapd (pid: 7, stackpage=c40fd000)
> Stack: e3b85380 000004e5 e3b85388 c9812688 00000002 c25caf40 c0317fd8 00023dd2
>        c015b664 00000cb4 c0138f35 00000006 000001d0 ffffffff 000001d0 00000002
>        00000006 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
> Call Trace:    [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
>                [<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
> [<c01396b0>]
> Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08
> 
> 
> >>EIP; c015b465 <prune_icache+45/220>   <=====
> 
> >>ecx; c40fdf18 <_end+3d3ffec/38676134>
> >>edx; d9c92888 <_end+198d495c/38676134>
> >>esp; c40fdf10 <_end+3d3ffe4/38676134>
> 
> Trace; c015b664 <shrink_icache_memory+24/40>
> Trace; c0138f35 <shrink_cache+185/410>
> Trace; c013936a <shrink_caches+4a/60>
> Trace; c01393e2 <try_to_free_pages_zone+62/f0>
> Trace; c0139596 <kswapd_balance_pgdat+66/b0>
> Trace; c0139608 <kswapd_balance+28/40>
> Trace; c0139748 <kswapd+98/c0>
> Trace; c01396b0 <kswapd+0/c0>
> Trace; c0105000 <_stext+0/0>
> Trace; c010587e <arch_kernel_thread+2e/40>
> Trace; c01396b0 <kswapd+0/c0>
> 
> Code;  c015b465 <prune_icache+45/220>
> 00000000 <_EIP>:
> Code;  c015b465 <prune_icache+45/220>   <=====
>    0:   8b 5b 04                  mov    0x4(%ebx),%ebx   <=====
> Code;  c015b468 <prune_icache+48/220>
>    3:   8b 86 1c 01 00 00         mov    0x11c(%esi),%eax
> Code;  c015b46e <prune_icache+4e/220>
>    9:   a8 38                     test   $0x38,%al
> Code;  c015b470 <prune_icache+50/220>
>    b:   0f 84 5d 01 00 00         je     16e <_EIP+0x16e>
> Code;  c015b476 <prune_icache+56/220>
>   11:   81 fb 08 00 00 00         cmp    $0x8,%ebx
> 
> ---
> 
> Oops: 0000
> CPU:    0
> EIP:    0010:[<c015b465>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010a97
> eax: 830418da   ebx: 5954e741   ecx: c40fdf18   edx: c0318f08
> esi: 5954e739   edi: 5954e741   ebp: 000001b4   esp: c40fdf10
> ds: 0018   es: 0018   ss: 0018
> Process kswapd (pid: 7, stackpage=c40fd000)
> Stack: f718b780 000013a1 f718b788 d04d3988 00000001 c3cfe850 c0317fd8 00023d9e
>        c015b664 00001555 c0138f35 00000006 000001d0 ffffffff 000001d0 00000001
>        00000004 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
> Call Trace:    [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
>                [<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
>  [<c01396b0>]
> Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08
> 
> 
> >>EIP; c015b465 <prune_icache+45/220>   <=====
> 
> >>edx; c0318f08 <inode_unused+0/8>
> 
> Trace; c015b664 <shrink_icache_memory+24/40>
> Trace; c0138f35 <shrink_cache+185/410>
> Trace; c013936a <shrink_caches+4a/60>
> Trace; c01393e2 <try_to_free_pages_zone+62/f0>
> Trace; c0139596 <kswapd_balance_pgdat+66/b0>
> Trace; c0139608 <kswapd_balance+28/40>
> Trace; c0139748 <kswapd+98/c0>
> Trace; c01396b0 <kswapd+0/c0>
> Trace; c0105000 <_stext+0/0>
> Trace; c010587e <arch_kernel_thread+2e/40>
> Trace; c01396b0 <kswapd+0/c0>
> 
> Code;  c015b465 <prune_icache+45/220>
> 00000000 <_EIP>:
> Code;  c015b465 <prune_icache+45/220>   <=====
>    0:   8b 5b 04                  mov    0x4(%ebx),%ebx   <=====
> Code;  c015b468 <prune_icache+48/220>
>    3:   8b 86 1c 01 00 00         mov    0x11c(%esi),%eax
> Code;  c015b46e <prune_icache+4e/220>
>    9:   a8 38                     test   $0x38,%al
> Code;  c015b470 <prune_icache+50/220>
>    b:   0f 84 5d 01 00 00         je     16e <_EIP+0x16e>
> Code;  c015b476 <prune_icache+56/220>
>   11:   81 fb 08 00 00 00         cmp    $0x8,%ebx
> 
> ---
> 
> I think this one was an infinite loop which I used alt-sysrq-p to get 
> deduce:
> 
> Pid: 7, comm:               kswapd
> EIP: 0010:[<c014dd1c>] CPU: 2 EFLAGS: 00000246    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: c4e4f824 EBX: c4e4f804 ECX: 00000006 EDX: 00000000
> ESI: c4e4f80c EDI: c4e4f804 EBP: 000009ee DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register 
> line ignored
> CR0: 8005003b CR2: bfffdfb4 CR3: 00101000 CR4: 000006d0
> Call Trace:    [<c0167f8e>] [<c01681d4>] [<c013d9d0>] [<c0168254>] [<c013fe69>]
>   [<c01404ae>] [<c0140533>] [<c01405b2>] [<c011a6cb>] [<c0140766>] [<c01407d8>]
>   [<c0140918>] [<c0140880>] [<c010595e>] [<c0140880>]
> Warning (Oops_read): Code line not seen, dumping what data is available
> 
> 
> >>EIP; c014dd1c <inode_has_buffers+6c/90>   <=====
> 
> >>EAX; c4e4f824 <_end+4a6b2f8/3864fb34>
> >>EBX; c4e4f804 <_end+4a6b2d8/3864fb34>
> >>ESI; c4e4f80c <_end+4a6b2e0/3864fb34>
> >>EDI; c4e4f804 <_end+4a6b2d8/3864fb34>
> 
> Trace; c0167f8e <prune_icache+7e/320>
> Trace; c01681d4 <prune_icache+2c4/320>
> Trace; c013d9d0 <kmem_cache_shrink+70/c0>
> Trace; c0168254 <shrink_icache_memory+24/40>
> Trace; c013fe69 <shrink_cache+1d9/6d0>
> Trace; c01404ae <refill_inactive+14e/160>
> Trace; c0140533 <shrink_caches+73/90>
> Trace; c01405b2 <try_to_free_pages_zone+62/f0>
> Trace; c011a6cb <schedule+34b/5f0>
> Trace; c0140766 <kswapd_balance_pgdat+66/b0>
> Trace; c01407d8 <kswapd_balance+28/40>
> Trace; c0140918 <kswapd+98/c0>
> Trace; c0140880 <kswapd+0/c0>
> Trace; c010595e <arch_kernel_thread+2e/40>
> Trace; c0140880 <kswapd+0/c0>
> 
> ---
> 
> Another infinite loop (alt-sysrq-p):
> 
> Pid: 7, comm:               kswapd
> EIP: 0010:[<c0167bc8>] CPU: 3 EFLAGS: 00000206    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: 00000001 EBX: ebe7f40c ECX: 00000002 EDX: 00000000
> ESI: ebe7f60c EDI: ebe7f604 EBP: c2851ee4 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register 
> line ignored
> CR0: 8005003b CR2: 40014000 CR3: 00101000 CR4: 000006d0
> Call Trace:    [<c013d7e2>] [<c0167eb7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
>   [<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>] [<c010596e>]
>   [<c0140760>]
> 
> >>EIP; c0167bc8 <prune_icache+78/340>   <=====
> 
> Trace; c013d7e2 <kmem_cache_shrink+72/c0>
> Trace; c0167eb7 <shrink_icache_memory+27/40>
> Trace; c013fd05 <shrink_cache+1d5/6d0>
> Trace; c0140360 <refill_inactive+160/170>
> Trace; c01403e2 <shrink_caches+72/90>
> Trace; c0140462 <try_to_free_pages_zone+62/f0>
> Trace; c0140638 <kswapd_balance_pgdat+78/c0>
> Trace; c01406a8 <kswapd_balance+28/40>
> Trace; c01407fa <kswapd+9a/c0>
> Trace; c0140760 <kswapd+0/c0>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0140760 <kswapd+0/c0>
> 
> ---
> 
> Infinite loop (alt-sysrq-p):
> 
> Pid: 9, comm:             kupdated
> EIP: 0010:[<c0168df3>] CPU: 2 EFLAGS: 00000282    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: f7bea000 EBX: f7bea000 ECX: f7bebfd0 EDX: 00000000
> ESI: f7bebfc8 EDI: f7bebfd8 EBP: f7bebf10 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register 
> line ignored
> CR0: 8005003b CR2: 40013090 CR3: 00101000 CR4: 000006d0
> Call Trace:    [<c011a1b2>] [<c0151333>] [<c015185e>] [<c0107a42>]
> [<c010596e>] [<c0151690>]
> 
> >>EIP; c0168df3 <.text.lock.inode+69/256>   <=====
> 
> >>EAX; f7bea000 <_end+377ffb74/38649bd4>
> >>EBX; f7bea000 <_end+377ffb74/38649bd4>
> >>ECX; f7bebfd0 <_end+37801b44/38649bd4>
> >>ESI; f7bebfc8 <_end+37801b3c/38649bd4>
> >>EDI; f7bebfd8 <_end+37801b4c/38649bd4>
> >>EBP; f7bebf10 <_end+37801a84/38649bd4>
> 
> Trace; c011a1b2 <schedule_timeout+62/b0>
> Trace; c0151333 <sync_old_buffers+53/160>
> Trace; c015185e <kupdate+1ce/230>
> Trace; c0107a42 <ret_from_fork+6/20>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0151690 <kupdate+0/230>
> 
> ---
> 
> Infinite loop (alt-sysrq-p):
> 
> Pid: 7, comm:               kswapd
> EIP: 0010:[<c0167be0>] CPU: 3 EFLAGS: 00000202    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: 00000020 EBX: dcbc120c ECX: 00000086 EDX: daae3044
> ESI: dcbc120c EDI: dcbc1204 EBP: c2851ee4 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register 
> line ign
> ored
> CR0: 8005003b CR2: 0807f260 CR3: 00101000 CR4: 000006d0
> Call Trace:    [<c013d7e2>] [<c0167ed7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
>   [<c011a55b>] [<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>]
>   [<c010596e>] [<c0140760>]
> 
> >>EIP; c0167be0 <prune_icache+90/360>   <=====
> 
> >>EBX; dcbc120c <_end+1c7d6d80/38649bd4>
> >>EDX; daae3044 <_end+1a6f8bb8/38649bd4>
> >>ESI; dcbc120c <_end+1c7d6d80/38649bd4>
> >>EDI; dcbc1204 <_end+1c7d6d78/38649bd4>
> >>EBP; c2851ee4 <_end+2467a58/38649bd4>
> 
> Trace; c013d7e2 <kmem_cache_shrink+72/c0>
> Trace; c0167ed7 <shrink_icache_memory+27/40>
> Trace; c013fd05 <shrink_cache+1d5/6d0>
> Trace; c0140360 <refill_inactive+160/170>
> Trace; c01403e2 <shrink_caches+72/90>
> Trace; c011a55b <schedule+34b/5f0>
> Trace; c0140462 <try_to_free_pages_zone+62/f0>
> Trace; c0140638 <kswapd_balance_pgdat+78/c0>
> Trace; c01406a8 <kswapd_balance+28/40>
> Trace; c01407fa <kswapd+9a/c0>
> Trace; c0140760 <kswapd+0/c0>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0140760 <kswapd+0/c0>

  parent reply	other threads:[~2004-06-25 12:53 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-06-19  0:47 inode_unused list corruption in 2.4.26 - spin_lock problem? Chris Caputo
2004-06-20  0:15 ` Marcelo Tosatti
2004-06-20  3:33   ` Trond Myklebust
2004-06-21  0:45     ` Marcelo Tosatti
2004-06-21 17:10       ` Trond Myklebust
2004-06-21 18:23         ` Marcelo Tosatti
2004-06-24  1:51         ` Chris Caputo
2004-06-25  7:47           ` Chris Caputo
2004-06-24  1:50   ` Chris Caputo
2004-06-25  8:04     ` Chris Caputo
2004-06-25 10:18       ` Chris Caputo
2004-06-25 12:17     ` Marcelo Tosatti [this message]
2004-07-02 20:00       ` Chris Caputo
2004-07-03  5:15         ` Arjan van de Ven
2004-07-26 17:41           ` Chris Caputo
2004-07-27 14:19             ` Marcelo Tosatti
2004-07-27 16:08               ` Chris Caputo
2004-07-29  0:25                 ` Marcelo Tosatti
2004-07-29  6:27                   ` Chris Caputo
2004-07-29  7:54                     ` Arjan van de Ven
2004-07-29 10:57                       ` Marcelo Tosatti
2004-07-29 12:21                         ` Arjan van de Ven
2004-07-29 16:22                           ` Chris Caputo
2004-07-29 16:23                             ` Arjan van de Ven
2004-08-04 15:38                               ` Chris Caputo
2004-08-04 18:47                                 ` Chris Caputo
     [not found] <20040805225549.GA18420@logos.cnet>
2004-08-07 16:00 ` Chris Caputo
2004-08-07 18:31   ` Marcelo Tosatti
2004-08-07 19:36   ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040625121743.GA24896@logos.cnet \
    --to=marcelo.tosatti@cyclades.com \
    --cc=ccaputo@alt.net \
    --cc=dwmw2@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@redhat.com \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox