From: Pekka Enberg <penberg@cs.helsinki.fi>
To: Ingo Molnar <mingo@elte.hu>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	linux-kernel@vger.kernel.org,
	Vegard Nossum <vegard.nossum@gmail.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: tty_ldisc_try_get(): BUG kmalloc-8: Poison overwritten
Date: Sun, 14 Jun 2009 11:20:50 +0300
Message-ID: <4A34B2E2.7080702@cs.helsinki.fi>
In-Reply-To: <20090614081052.GA9276@elte.hu>

Hi Ingo,

Ingo Molnar wrote:
> Ok, this is one for those who like to look at weird crashes/bugs.
> 
> Here's a new regression that popped up in this merge window, there's 
> some sort of slab corruption going on in tty data structures:
> 
> [   74.900215] =============================================================================
> [   74.908193] BUG kmalloc-8: Poison overwritten
> [   74.908193] -----------------------------------------------------------------------------
> [   74.908193] 
> [   74.908193] INFO: 0x5d883a14-0x5d883a14. First byte 0x6a instead of 0x6b
> [   74.908193] INFO: Allocated in tty_ldisc_try_get+0x1a/0xb0 age=8015 cpu=0 pid=1
> [   74.908193] INFO: Freed in tty_ldisc_put+0x48/0x50 age=4 cpu=3 pid=4236
> [   74.908193] INFO: Slab 0x42c6eeb4 objects=73 used=61 fp=0x5d883a10 flags=0x1d0000c3
> [   74.908193] INFO: Object 0x5d883a10 @offset=2576 fp=0x5d883d90
> [   74.908193] 
> [   74.908193] Bytes b4 0x5d883a00:  01 00 00 00 de 04 ff ff 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
> [   74.908193]   Object 0x5d883a10:  6b 6b 6b 6b 6a 6b 6b a5                         kkkkjkk.

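This is struct tty_ldisc, and the corrupted byte (offset 4 into the 
object) is the first byte of ->refcount. It went from 0x6b to 0x6a, which 
is what a single decrement of the poisoned value would do. This probably 
just means that there's a race condition and someone is doing 
tty_ldisc_deref() after tty_ldisc_put().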

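You could add something like

   WARN_ON(ld->refcount == 0xa56b6b6b)

to tty_ldisc_deref() to see if that triggers. (That's the poisoned value 
as an int on this little-endian box: three 0x6b bytes plus the 0xa5 end 
marker, per the object dump above. A plain 0x6b would never match.)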
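
For anyone who wants to see the byte pattern outside the kernel, here is a 
minimal userspace sketch in C. It assumes a 32-bit layout for the object 
(a 4-byte ops slot followed by a 4-byte refcount) and a little-endian 
machine. fake_ldisc, poison_object(), first_bad_byte() and 
simulate_late_deref() are names made up for this illustration, not kernel 
code. The 0x6b and 0xa5 values are the poison bytes visible in the object 
dump above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b  /* fill byte seen in the freed object */
#define POISON_END  0xa5  /* marker seen in the object's last byte */

/* Hypothetical stand-in for the 8-byte object on a 32-bit box. */
struct fake_ldisc {
	uint32_t ops;       /* stands in for the 4-byte ops pointer */
	int32_t  refcount;  /* sits at offset 4 */
};

/* Fill an object the way the dump shows a freed one. */
void poison_object(void *obj, size_t size)
{
	assert(size > 0);
	memset(obj, POISON_FREE, size - 1);
	((unsigned char *)obj)[size - 1] = POISON_END;
}

/* Offset of the first byte that no longer matches the poison, or -1. */
int first_bad_byte(const void *obj, size_t size)
{
	const unsigned char *p = obj;
	size_t i;

	for (i = 0; i < size; i++) {
		unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;

		if (p[i] != want)
			return (int)i;
	}
	return -1;
}

/* Simulate "put" (free + poison) followed by a late "deref" (decrement). */
int simulate_late_deref(void)
{
	struct fake_ldisc ld;

	poison_object(&ld, sizeof(ld));  /* object has been freed */
	ld.refcount--;                   /* someone still drops a reference */
	return first_bad_byte(&ld, sizeof(ld));
}
```

On a little-endian machine simulate_late_deref() should return 4, which 
lines up with the report: the bad byte is 0x5d883a14 in an object that 
starts at 0x5d883a10. On big-endian the decrement would land in the last 
byte instead.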

> [   74.908193]  Redzone 0x5d883a18:  bb bb bb bb                                     ....
> [   74.908193]  Padding 0x5d883a40:  5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ        
> [   74.908193] Pid: 4230, comm: mingetty Not tainted 2.6.30-tip #744
> [   74.908193] Call Trace:
> [   74.908193]  [<410ae628>] print_trailer+0xc8/0xd0
> [   74.908193]  [<410ae6a3>] check_bytes_and_report+0x73/0x90
> [   74.908193]  [<410ae941>] check_object+0xa1/0x130
> [   74.908193]  [<410aef1e>] alloc_debug_processing+0x5e/0xd0
> [   74.908193]  [<410af99e>] __slab_alloc+0x11e/0x150
> [   74.908193]  [<413d9c7a>] ? tty_ldisc_try_get+0x1a/0xb0
> [   74.908193]  [<410afcdb>] kmem_cache_alloc+0x7b/0x120
> [   74.908193]  [<413d9c7a>] ? tty_ldisc_try_get+0x1a/0xb0
> [   74.908193]  [<413d9c7a>] ? tty_ldisc_try_get+0x1a/0xb0
> [   74.908193]  [<413d9c7a>] tty_ldisc_try_get+0x1a/0xb0
> [   74.908193]  [<410b06a3>] ? __kmalloc+0x163/0x170
> [   74.908193]  [<413d9d77>] tty_ldisc_get+0x17/0x40
> [   74.908193]  [<413da63d>] tty_ldisc_init+0xd/0x30
> [   74.908193]  [<413d4098>] initialize_tty_struct+0x38/0x210
> [   74.908193]  [<413d5d6f>] tty_init_dev+0x4f/0xb0
> [   74.908193]  [<413d5f25>] __tty_open+0x155/0x2d0
> [   74.908193]  [<413d60b7>] tty_open+0x17/0x30
> [   74.908193]  [<410bb599>] chrdev_open+0xe9/0x100
> [   74.908193]  [<410b721e>] __dentry_open+0xbe/0x190
> [   74.908193]  [<410b813c>] nameidata_to_filp+0x2c/0x50
> [   74.908193]  [<410bb4b0>] ? chrdev_open+0x0/0x100
> [   74.908193]  [<410c2eba>] do_filp_open+0x2aa/0x580
> [   74.908193]  [<4100a1bb>] ? sched_clock+0xb/0x20
> [   74.908193]  [<410596c7>] ? put_lock_stats+0x17/0x30
> [   74.908193]  [<41059734>] ? lock_release_holdtime+0x54/0x60
> [   74.908193]  [<4105d4d9>] ? lock_release_nested+0x99/0xd0
> [   74.908193]  [<41377421>] ? debug_spin_unlock+0x21/0x80
> [   74.908193]  [<41377495>] ? _raw_spin_unlock+0x15/0x20
> [   74.908193]  [<410cad50>] ? alloc_fd+0xc0/0xd0
> [   74.908193]  [<410b7020>] do_sys_open+0x40/0x80
> [   74.908193]  [<410b70ae>] sys_open+0x1e/0x30
> [   74.908193]  [<4100388f>] sysenter_do_call+0x12/0x3c
> [   74.908193] FIX kmalloc-8: Restoring 0x5d883a14-0x5d883a14=0x6b
> [   74.908193] 
> [   74.908193] FIX kmalloc-8: Marking all objects used
> 
> It's a single bit corruption - but the hardware in question has a 
> good track record with thousands of bootups, so it might be a 
> reference count related corruption as well.
> 
> It started triggering in this merge window, so one of these might be 
> a starting point:
> 
>  3e3b5c0: tty: use prepare/finish_wait
>  5fc5b42: tty: remove sleep_on
>  26a2e20: tty: Untangle termios and mm mutex dependencies
>  0b4068a: tty: simplify buffer allocator cleanups
>  c481c70: tty: remove buffer special casing
>  852e99d: tty: bring ldisc into CodingStyle
>  f2c4c65: tty: Move ldisc_flush
>  c65c9bc: tty: rewrite the ldisc locking
>  e8b70e7: tty: Extract various bits of ldisc code
>  5f0878a: tty: Fix oops when scanning the polling list for kgdb
>  38db897: tty: throttling race fix
>  1ec739b: tty: Implement a drain delay in the tty port
>  fcc8ac1: tty: Add carrier processing on close to the tty_port core
> 
> (But ... if it's a low-probability bug then it might be an older bug 
> as well.)
> 
> I tried two other reboots and the bug did not trigger in a way 
> visible in the log - so it's sporadic. I've started a reboot loop 
> with this kernel on that box, to see whether it's repeatable within 
> a reasonable amount of time.
> 
> This is the -tip testbox that generally triggers SMP races very well 
> (and as the first one amongst boxes) - so my first guess would be on 
> some narrow (or not so narrow but config/timing dependent) SMP race 
> window.
> 
> Since it's not reproducible in any easy fashion, there's no 
> bisection possible either, on this box. I've Cc:-ed all the 
> tty/kmalloc/race experts, maybe the bug can be seen ...
> 
> I've attached the config and the full bootlog.
> 
> 	Ingo
> 
