From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758923AbZBLLKl (ORCPT ); Thu, 12 Feb 2009 06:10:41 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1757581AbZBLLKF (ORCPT ); Thu, 12 Feb 2009 06:10:05 -0500
Received: from mx2.redhat.com ([66.187.237.31]:33906 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1757245AbZBLLKA (ORCPT ); Thu, 12 Feb 2009 06:10:00 -0500
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
	Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United
	Kingdom. Registered in England and Wales under Company Registration
	No. 3798903
From: David Howells
In-Reply-To: <20090210142443.629E.KOSAKI.MOTOHIRO@jp.fujitsu.com>
References: <20090210142443.629E.KOSAKI.MOTOHIRO@jp.fujitsu.com>
To: KOSAKI Motohiro
Cc: dhowells@redhat.com, Serge Hallyn, LKML, Lee Schermerhorn
Subject: Re: [CRED bug?] 2.6.29-rc3 don't survive on stress workload
Date: Thu, 12 Feb 2009 11:09:52 +0000
Message-ID: <27421.1234436992@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Aha!  I reproduced it myself (with my patch to check atomic_dec_and_test() in
there, but not Serge's patch).

Ironically, 13 hours of running Vegard's setreuid() program didn't show
anything, but halting the box whilst someone was trying to SSH-crack it did.

Shutting down ntpd: ------------[ cut here ]------------
kernel BUG at mm/slab.c:591!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
CPU 1
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.29-rc4-cachefs #35
RIP: 0010:[] [] kfree+0x65/0xd1
RSP: 0018:ffff88003dc9fe50  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffffff80625a00 RCX: 0000000000000059
RDX: ffffe20000015818 RSI: 0000000000000059 RDI: ffffffff80625a00
RBP: ffffffff8025d238 R08: 0000000000000000 R09: ffff88003cffc9c8
R10: ffff88003cd4e000 R11: 09f911029d74e35b R12: ffffffff80625a00
R13: 0000000000000286 R14: 0000000000000009 R15: 0000000000000008
FS:  0000000000000000(0000) GS:ffff88003dc64268(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f2bbb54f7f8 CR3: 000000003d2fe000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88003dc98000, task ffff88003dc95290)
Stack:
 09f911029d74e35b ffffffff80625a00 ffffffff8025d238 ffff88003cc82338
 0000000000000202 ffffffff803820bd 0000000000000286 ffff88003d2fcec0
 0000000000000286 ffffffff8023a488 ffff88003cc823b8 ffff88003cffc9c8
Call Trace:
 <0> [] ? free_user_ns+0x0/0x19
 [] ? kref_put+0x51/0x5c
 [] ? free_uid+0x4c/0x99
 [] ? put_cred_rcu+0x70/0x83
 [] ? __rcu_process_callbacks+0x157/0x1d2
 [] ? rcu_process_callbacks+0x26/0x4b
 [] ? __do_softirq+0x7a/0x13d
 [] ? call_softirq+0x1c/0x28
 [] ? do_softirq+0x2c/0x6c
 [] ? smp_apic_timer_interrupt+0x93/0xac
 [] ? apic_timer_interrupt+0x13/0x20
 <0> [] ? datagram_poll+0x0/0xc2
 [] ? mwait_idle+0x41/0x44
 [] ? cpu_idle+0x40/0x5e
Code: 48 8d 14 10 48 8b 02 25 00 00 01 00 48 85 c0 74 15 48 8b 52 10 48 8b 02 25 00 00 01 00 48 85 c0 74 04 48 8b 52 10 80 3a 00 78 04 <0f> 0b eb fe 48 8b 5a 28 65 8b 04 25 24 00 00 00 89 c0 48 8b 2c
RIP  [] kfree+0x65/0xd1
 RSP
---[ end trace 36e0423a3db60c4b ]---
Kernel panic - not syncing: Fatal exception in interrupt

This is due to the BUG_ON() in the following:

	static inline struct kmem_cache *page_get_cache(struct page *page)
	{
		page = compound_head(page);
		BUG_ON(!PageSlab(page));
		return (struct kmem_cache *)page->lru.next;
	}

It fires because the user_namespace being released is init_user_ns.  RDI and
R12 both hold the parameter to kfree() at this point, and gdb says:

	(gdb) i sym 0xffffffff80625a00
	init_user_ns in section .data

David
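
The failure mode in the trace above (a refcount on a statically allocated object reaching zero, so that its release function hands a .data address to the allocator) can be modelled in user space. The following is only a rough sketch under stated assumptions: struct ns, ns_alloc(), ns_get(), ns_put(), ns_release() and the on_heap flag are all hypothetical stand-ins, not kernel APIs, and the extra put is just one way the count could reach zero; the mail above does not identify the actual cause. The on_heap check plays the role that BUG_ON(!PageSlab(page)) plays in kfree().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as user_namespace. */
struct ns {
	int refcount;
	bool on_heap;	/* models PageSlab(): true only for malloc'd objects */
};

/* Static instance, analogous to init_user_ns living in .data.
 * It holds one permanent reference that should never be dropped. */
static struct ns init_ns = { .refcount = 1, .on_heap = false };

/* Counts attempts to free an object that was never heap-allocated. */
static int bad_frees;

static struct ns *ns_alloc(void)
{
	struct ns *ns = malloc(sizeof(*ns));

	if (!ns)
		abort();
	ns->refcount = 1;
	ns->on_heap = true;
	return ns;
}

static void ns_release(struct ns *ns)
{
	if (!ns->on_heap) {
		/* The kernel would hit BUG_ON(!PageSlab(page)) here. */
		bad_frees++;
		return;
	}
	free(ns);
}

static void ns_get(struct ns *ns)
{
	assert(ns->refcount > 0);
	ns->refcount++;
}

/* Drop one reference; returns true if this put released the object. */
static bool ns_put(struct ns *ns)
{
	if (--ns->refcount == 0) {
		ns_release(ns);
		return true;
	}
	return false;
}
```

With balanced get/put pairs, init_ns never drops below its permanent reference. One put too many takes it to zero and ns_release() is asked to free a static object, which is the point at which the real kfree() trips the BUG_ON().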