Re: 2.6.20 (XFS? related) crash after uptime of > 180 days during apt-get dist-upgrade on Debian Testing

From: David Chinner <dgc@sgi.com>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: 2.6.20 (XFS? related) crash after uptime of > 180 days during apt-get dist-upgrade on Debian Testing
Date: Tue, 18 Sep 2007 11:45:37 +1000
Message-ID: <20070918014537.GK23367404@sgi.com>
In-Reply-To: <Pine.LNX.4.64.0709171315210.22156@p34.internal.lan>

On Mon, Sep 17, 2007 at 01:20:17PM -0400, Justin Piszcz wrote:
> Including the XFS mailing list in here too because it may be an XFS bug 
> looking at the call trace.
> 
> System: Debian Testing
> Kernel: 2.6.20
> Config: Attached
> 
> I was running apt-get dist-upgrade as I always do to get the latest 
> packages upgraded and the kernel OOPS'd when it was upgrading 'tzdata' and 
> the process went into D-state and I had to reboot.
> 
> The config file is from 2.6.20 but it had been moved to a 2.6.22 directory 
> for an upgrade, but all of the options have been left unchanged.
> 
> Here is the *OOPS I captured via dmesg before I rebooted:
> 
> [16201055.214559] nfsd: last server has exited
> [16201055.214566] nfsd: unexporting all filesystems
> [17341583.697472] BUG: unable to handle kernel paging request at virtual 
> address 99e00750
> [17341583.697480]  printing eip:
> [17341583.697482] c01531b0
> [17341583.697484] *pde = 00000000
> [17341583.697488] Oops: 0000 [#1]
> [17341583.697491] CPU:    0
> [17341583.697493] EIP:    0060:[<c01531b0>]    Not tainted VLI
> [17341583.697494] EFLAGS: 00210286   (2.6.20 #3)
> [17341583.697502] EIP is at __d_lookup+0x5d/0xd6
> [17341583.697505] eax: c8d7c17e   ebx: 99e00750   ecx: 00000011   edx: 
> c17f9200
> [17341583.697508] esi: 99e00750   edi: d2a10016   ebp: c7fe2304   esp: 
> dba35d98
> [17341583.697511] ds: 007b   es: 007b   ss: 0068
> [17341583.697514] Process kdm_greet (pid: 22119, ti=dba34000 task=f52d4a70 
> task.ti=dba34000)
> [17341583.697516] Stack: c8d7c17e 00000000 dba35e10 f705d478 dba35db8 0000002c d2a10016 d2a10042
> [17341583.697522]        dba35e10 dba35f30 dba35e10 c014ab6d dba35e1c c18c5240 dba35f04 c021877e
> [17341583.697528]        d2a10042 dba35e10 c8d7c17e dba35f30 c014c38f d2a10016 00000101 dba35e48
> [17341583.697534] Call Trace:
> [17341583.697537]  [<c014ab6d>] do_lookup+0x1c/0x168
> [17341583.697540]  [<c021877e>] xfs_vn_lookup+0x53/0x77
> [17341583.697547]  [<c014c38f>] __link_path_walk+0x6e8/0xb1b
> [17341583.697551]  [<c0153698>] dput+0x18/0x121
> [17341583.697554]  [<c014c805>] link_path_walk+0x43/0xb8
> [17341583.697558]  [<c014ca0a>] do_path_lookup+0x75/0x181
> [17341583.697561]  [<c0145fda>] get_empty_filp+0x2f/0xe5
> [17341583.697566]  [<c014d468>] __path_lookup_intent_open+0x45/0x80
> [17341583.697570]  [<c014d517>] path_lookup_open+0x20/0x25
> [17341583.697573]  [<c014d5db>] open_namei+0x66/0x58a
> [17341583.697576]  [<c0143c35>] do_filp_open+0x25/0x40
> [17341583.697580]  [<c0143c8e>] do_sys_open+0x3e/0xc7
> [17341583.697584]  [<c0143d52>] sys_open+0x1c/0x20
> [17341583.697587]  [<c0102920>] syscall_call+0x7/0xb
> [17341583.697591]  =======================
> [17341583.697593] Code: 81 f2 01 00 37 9e 8b 0d 18 3f 44 c0 d3 ea 31 d0 23 
> 05 14 3f 44 c0 8b 15 1c 3f 44 c0 8b 34 82 85 f6 75 08 eb 4d 89 de 85 db 74 
> 47 <8b> 1e 0f 18 03 90 8d 6e f4 8b 04 24 3b 45 18 75 e9 8b 44 24 0c 
> [17341583.697621] EIP: [<c01531b0>] __d_lookup+0x5d/0xd6 SS:ESP 
> 0068:dba35d98
> [17341583.697626]  <1>BUG: unable to handle kernel paging request at 
> virtual address 99e00750
> [17341648.066740]  printing eip:
> [17341648.066786] c01531b0
> [17341648.066868] *pde = 00000000
> [17341648.066916] Oops: 0000 [#2]
> [17341648.066965] CPU:    0
> [17341648.066966] EIP:    0060:[<c01531b0>]    Not tainted VLI
> [17341648.066967] EFLAGS: 00010286   (2.6.20 #3)
> [17341648.067115] EIP is at __d_lookup+0x5d/0xd6
> [17341648.067165] eax: 1efcce0e   ebx: 99e00750   ecx: 00000011   edx: 
> c17f9200
> [17341648.067219] esi: 99e00750   edi: cc87901a   ebp: c7fe2304   esp: 
> f7755f04
> [17341648.067271] ds: 007b   es: 007b   ss: 0068
> [17341648.067320] Process dpkg (pid: 24684, ti=f7754000 task=d9846a70 
> task.ti=f7754000)
> [17341648.067371] Stack: 1efcce0e 46dd3a20 f7755f5c e489fe28 00000000 00000010 cc87901a 00000000
> [17341648.067715]        e489fe28 00000001 f7755f54 c014b7cb f7755f5c ef0d4098 ffffffd9 cc879000
> [17341648.068056]        00000001 f7755f54 c014cf84 f7755f54 e489fe28 c18c5240 1efcce0e 00000010
> [17341648.068397] Call Trace:
> [17341648.068482]  [<c014b7cb>] __lookup_hash+0x4a/0xef
> [17341648.068563]  [<c014cf84>] do_rmdir+0x69/0xbb
> [17341648.068642]  [<c0102920>] syscall_call+0x7/0xb
> [17341648.068724]  =======================
> [17341648.068770] Code: 81 f2 01 00 37 9e 8b 0d 18 3f 44 c0 d3 ea 31 d0 23 
> 05 14 3f 44 c0 8b 15 1c 3f 44 c0 8b 34 82 85 f6 75 08 eb 4d 89 de 85 db 74 
> 47 <8b> 1e 0f 18 03 90 8d 6e f4 8b 04 24 3b 45 18 75 e9 8b 44 24 0c 
> [17341648.070874] EIP: [<c01531b0>] __d_lookup+0x5d/0xd6 SS:ESP 
> 0068:f7755f04
> [17341648.070988]
> 
> I doubt I can reproduce it as it has happened after 180 days or so, and I 
> am upgrading to 2.6.22.6 but I was wondering what exactly happened here?

No idea - it looks like dpkg was trying to remove a directory on the
same path that the lookup was walking, and both have gone splat in
__d_lookup on the same dentry. Something happened in those 180 days
that left a landmine that was tripped over here, I think. I can't see
any way of tracking it down from this, but thanks for reporting it
anyway, Justin.
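
For readers following along, here is a minimal sketch of how the two
register dumps quoted above can be compared to see what they have in
common. The register values are copied from the oops output; the helper
names (parse_registers, shared_registers) are hypothetical and exist
only for this illustration. It is a reading aid, not a diagnosis.

```python
import re

# Register lines copied from the two oops reports quoted above
# (mail line wraps joined back together).
OOPS_1 = """
EIP is at __d_lookup+0x5d/0xd6
eax: c8d7c17e   ebx: 99e00750   ecx: 00000011   edx: c17f9200
esi: 99e00750   edi: d2a10016   ebp: c7fe2304   esp: dba35d98
"""
OOPS_2 = """
EIP is at __d_lookup+0x5d/0xd6
eax: 1efcce0e   ebx: 99e00750   ecx: 00000011   edx: c17f9200
esi: 99e00750   edi: cc87901a   ebp: c7fe2304   esp: f7755f04
"""

# Faulting virtual address reported by both "unable to handle kernel
# paging request" messages.
FAULT_ADDRESS = 0x99E00750

REG_RE = re.compile(r"\b(e(?:ax|bx|cx|dx|si|di|bp|sp)):\s+([0-9a-f]{8})")


def parse_registers(text):
    """Return {register name: value} for one oops register dump."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def shared_registers(a, b):
    """Return the registers that hold the same value in both dumps."""
    return {name: a[name] for name in a if b.get(name) == a[name]}


regs1 = parse_registers(OOPS_1)
regs2 = parse_registers(OOPS_2)
shared = shared_registers(regs1, regs2)

# Registers that hold the faulting address in both crashes.
faulting = sorted(name for name, value in shared.items()
                  if value == FAULT_ADDRESS)

for name in sorted(shared):
    marker = "  <- faulting address" if name in faulting else ""
    print(f"{name}: {shared[name]:08x}{marker}")
```

Both dumps stop at the same instruction (__d_lookup+0x5d) with ebx and
esi holding the faulting address 99e00750, and ebp is identical as
well. The marked bytes in the Code: line, <8b> 1e, decode as
mov (%esi),%ebx, a load through that bad pointer. That fits two
processes hitting the same bad pointer, but it says nothing about how
the pointer came to be corrupted.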

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Thread overview: 8+ messages
2007-09-17 17:20 2.6.20 (XFS? related) crash after uptime of > 180 days during apt-get dist-upgrade on Debian Testing Justin Piszcz
2007-09-18  1:45 ` David Chinner [this message]
2007-09-18  9:20   ` Christoph Hellwig
2007-09-18 10:39     ` David Chinner
2007-09-18 11:09       ` Christoph Hellwig
2007-09-19  8:47 ` Justin Piszcz
2007-09-21  0:15   ` David Chinner
2007-09-21  8:48     ` Justin Piszcz