From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============4967348614448351735==" MIME-Version: 1.0 From: Paul E. McKenney To: lkp@lists.01.org Subject: Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71 Date: Wed, 07 Sep 2016 00:31:53 -0700 Message-ID: <20160907073153.GR3663@linux.vnet.ibm.com> In-Reply-To: List-Id: --===============4967348614448351735== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote: > On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney > wrote: > > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote: > >> > >> FYI, we noticed the following commit: > >> > >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git = dev.2016.08.19a > >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add= () debug checking into separate function") > >> > >> in testcase: boot > >> > >> on test machine: qemu-system-i386 -enable-kvm -m 360M > >> > >> caused below changes: > > > > Hello, Kees, > > > > Looks like your change might have actually found a bug. ;-) > > > > How would you like to handle this? > = > We should fix the bug! :) But of course... The question is whether I am considered to be fully responsible for chasing down any bugs that arise. It is looking like I am, in which case I must defer this patch to the 4.10 merge window. Things are a bit crazy now due to prior overcommitments... Yes, yes, they will be crazy then as well, but this would be higher on the priority list. Thanx, Paul > >> +------------------------------------------------------------+--------= ----+------------+ > >> | | 0a33d91= 2a3 | 8052ce2d97 | > >> +------------------------------------------------------------+--------= ----+------------+ > >> | boot_successes | 0 = | 0 | > >> | boot_failures | 211 = | 18 | > >> | WARNING:at_lib/list_debug.c:#__list_del_entry | 211 = | 18 | > >> | calltrace:init | 209 = | 18 | > >> | WARNING:at_lib/list_debug.c:#__list_add | 209 = | | > >> | calltrace:i8042_init | 209 = | 17 | > >> | calltrace:SyS_reboot | 192 = | 4 | > >> | EIP_is_at__mutex_unlock_slowpath | 29 = | | > >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138 = | | > >> | BUG:workqueue_lockup-pool_cpus=3D#flags=3D#nice=3D#stuck_for#s | 5= 6 | | > >> | EIP_is_at_lock_release | 7 = | | > >> | EIP_is_at_lock_acquire | 28 = | | > >> | EIP_is_at_raw_spin_unlock_irq | 30 = | | > >> | EIP_is_at_mutex_lock_nested | 21 = | | > >> | EIP_is_at_lock_is_held | 10 = | | > >> | EIP_is_at_device_shutdown | 3 = | | > >> | EIP_is_at___might_sleep | 5 = | | > >> | EIP_is_at__list_del_entry | 4 = | 1 | > >> | EIP_is_at_get_device | 1 = | | > >> | BUG:unable_to_handle_kernel | 7 = | 5 | > >> | Oops | 7 = | 5 | > >> | calltrace:SyS_ipc | 4 = | 4 | > >> | Kernel_panic-not_syncing:Fatal_exception | 7 = | 5 | > >> | EIP_is_at_pm_runtime_barrier | 1 = | | > >> | EIP_is_at_lock_acquired | 1 = | | > >> | EIP_is_at__might_sleep | 1 = | | > >> | BUG:workqueue_lockup-pool_cpus=3D#-#flags=3D#nice=3D#stuck_for#s | 1= | | > >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 26 = | | > >> | calltrace:mark_rodata_ro | 26 = | | > >> | WARNING:at_lib/list_debug.c:#__list_add_valid | 0 = | 17 | > >> | calltrace:serio_handle_event | 0 = | 17 | > >> | calltrace:userio_misc_init | 0 = | 17 | > >> | calltrace:mousedev_init | 0 = | 17 | > >> | calltrace:evdev_init | 0 = | 17 | > >> | calltrace:cmos_init | 0 = | 17 | > >> | calltrace:test_init | 0 = | 17 | > >> | calltrace:pca_isa_driver_init | 0 = | 17 | > >> | calltrace:vim2m_init | 0 = | 17 | > >> | calltrace:fm_drv_init | 0 = | 17 | > >> | calltrace:test_power_init | 0 = | 17 | > >> | calltrace:raid#_init | 0 = | 17 | > >> | calltrace:dm_init | 0 = | 17 | > >> | calltrace:vhci_init | 0 = | 17 | > >> | calltrace:dcdrbu_init | 0 = | 17 | > >> | calltrace:dcdbas_init | 0 = | 17 | > >> | calltrace:uhid_init | 0 = | 17 | > >> | calltrace:vhost_net_init | 0 = | 17 | > >> | calltrace:dummy_stm_init | 0 = | 17 | > >> | calltrace:stm_heartbeat_init | 0 = | 17 | > >> | calltrace:binder_init | 0 = | 17 | > >> | calltrace:teql_init | 0 = | 17 | > >> | calltrace:ipip_init | 0 = | 17 | > >> | calltrace:ipgre_init | 0 = | 17 | > >> | calltrace:sit_init | 0 = | 17 | > >> | calltrace:ip6_tunnel_init | 0 = | 17 | > >> | calltrace:ip6gre_init | 0 = | 17 | > >> | calltrace:irlan_init | 0 = | 17 | > >> | calltrace:pm_qos_power_init | 0 = | 17 | > >> | calltrace:ubi_init | 0 = | 17 | > >> | calltrace:SyS_setsockopt | 0 = | 2 | > >> | calltrace:SyS_socketcall | 0 = | 2 | > >> | WARNING:at_include/linux/kref.h:#kobject_get | 0 = | 4 | > >> | WARNING:at_kernel/locking/lockdep.c:#__lock_acquire | 0 = | 4 | > >> | BUG:kernel_hang_in_test_stage | 0 = | 4 | > >> | calltrace:fw_cfg_sysfs_init | 0 = | 7 | > >> +------------------------------------------------------------+--------= ----+------------+ > >> > >> > >> > >> [ 152.680291] g_acm_ms usbip-vudc.0: failed to start g_acm_ms: -22 > >> [ 152.689209] usbip-vudc: probe of usbip-vudc.0 failed with error -22 > >> [ 152.698509] ------------[ cut here ]------------ > >> [ 152.709786] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_de= l_entry+0x4d/0x71 > >> [ 152.722785] list_del corruption. next->prev should be cc0230d4, but= was 6b6b6b6b > = > That's the poison value, so this is a remove-after-remove bug? > = > >> [ 152.732828] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1-000= 33-g8052ce2 #1 > >> [ 152.742576] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), = BIOS Debian-1.8.2-1 04/01/2014 > >> [ 152.754081] 00000000 00200246 c0033e78 c13c456b c0033ea4 c13e72ac = c0033e90 c103fd02 > >> [ 152.765720] 0000003c 6b6b6b6b cc0230d4 cc02300c c0033eac c103fd41 = 00000009 00000000 > >> [ 152.776864] c0033ea4 c1f7bc14 c0033ec0 c0033ecc c13e72ac c1f7bace = 0000003c c1f7bc14 > >> [ 152.788472] Call Trace: > >> [ 152.791964] [] dump_stack+0x75/0xa9 > >> [ 152.797835] [] ? __list_del_entry+0x4d/0x71 > >> [ 152.804221] [] __warn+0xad/0xc4 > >> [ 152.809694] [] warn_slowpath_fmt+0x28/0x2d > >> [ 152.816501] [] __list_del_entry+0x4d/0x71 > >> [ 152.822946] [] device_pm_remove+0x2b/0x5a > >> [ 152.829560] [] device_del+0x134/0x19a > >> [ 152.835558] [] ? dev_set_name+0x14/0x16 > >> [ 152.842135] [] platform_device_del+0x16/0x5d > >> [ 152.848909] [] init+0xe5/0x112 > >> [ 152.854149] [] ? usbip_host_init+0xe5/0xe5 > = > What device is this? Looks like it's being removed right at init time?? > = > >> [ 152.860796] [] do_one_initcall+0x7e/0x101 > >> [ 152.867316] [] ? parse_args+0x190/0x265 > >> [ 152.873497] [] ? kernel_init_freeable+0xd0/0x174 > >> [ 152.880923] [] kernel_init_freeable+0xec/0x174 > >> [ 152.887499] [] kernel_init+0x8/0xd0 > >> [ 152.893658] [] ret_from_kernel_thread+0xe/0x30 > >> [ 152.900759] [] ? rest_init+0x111/0x111 > >> [ 152.907334] ---[ end trace 2003739b08d25430 ]--- > >> [ 152.920040] ------------[ cut here ]------------ > = > -Kees > = > = > -- = > Kees Cook > Nexus Security > = --===============4967348614448351735==-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755681AbcIGHb6 (ORCPT ); Wed, 7 Sep 2016 03:31:58 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:53285 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751088AbcIGHbz (ORCPT ); Wed, 7 Sep 2016 03:31:55 -0400 X-IBM-Helo: d03dlp02.boulder.ibm.com X-IBM-MailFrom: paulmck@linux.vnet.ibm.com Date: Wed, 7 Sep 2016 00:31:53 -0700 From: "Paul E. McKenney" To: Kees Cook Cc: kernel test robot , LKP , LKML , Rik van Riel , Steven Rostedt Subject: Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71 Reply-To: paulmck@linux.vnet.ibm.com References: <57cb6c5c.ht1t4TgNR/5nHu0Z%xiaolong.ye@intel.com> <20160904094851.GM3663@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16090707-0020-0000-0000-000009B9029F X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00005721; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000185; SDB=6.00754820; UDB=6.00357265; IPR=6.00527679; BA=6.00004699; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00012606; XFM=3.00000011; UTC=2016-09-07 07:31:53 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16090707-0021-0000-0000-000055402C89 Message-Id: <20160907073153.GR3663@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-09-07_05:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000 definitions=main-1609070113 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote: > On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney > wrote: > > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote: > >> > >> FYI, we noticed the following commit: > >> > >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2016.08.19a > >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() debug checking into separate function") > >> > >> in testcase: boot > >> > >> on test machine: qemu-system-i386 -enable-kvm -m 360M > >> > >> caused below changes: > > > > Hello, Kees, > > > > Looks like your change might have actually found a bug. ;-) > > > > How would you like to handle this? > > We should fix the bug! :) But of course... The question is whether I am considered to be fully responsible for chasing down any bugs that arise. It is looking like I am, in which case I must defer this patch to the 4.10 merge window. Things are a bit crazy now due to prior overcommitments... Yes, yes, they will be crazy then as well, but this would be higher on the priority list. Thanx, Paul > >> +------------------------------------------------------------+------------+------------+ > >> | | 0a33d912a3 | 8052ce2d97 | > >> +------------------------------------------------------------+------------+------------+ > >> | boot_successes | 0 | 0 | > >> | boot_failures | 211 | 18 | > >> | WARNING:at_lib/list_debug.c:#__list_del_entry | 211 | 18 | > >> | calltrace:init | 209 | 18 | > >> | WARNING:at_lib/list_debug.c:#__list_add | 209 | | > >> | calltrace:i8042_init | 209 | 17 | > >> | calltrace:SyS_reboot | 192 | 4 | > >> | EIP_is_at__mutex_unlock_slowpath | 29 | | > >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138 | | > >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s | 56 | | > >> | EIP_is_at_lock_release | 7 | | > >> | EIP_is_at_lock_acquire | 28 | | > >> | EIP_is_at_raw_spin_unlock_irq | 30 | | > >> | EIP_is_at_mutex_lock_nested | 21 | | > >> | EIP_is_at_lock_is_held | 10 | | > >> | EIP_is_at_device_shutdown | 3 | | > >> | EIP_is_at___might_sleep | 5 | | > >> | EIP_is_at__list_del_entry | 4 | 1 | > >> | EIP_is_at_get_device | 1 | | > >> | BUG:unable_to_handle_kernel | 7 | 5 | > >> | Oops | 7 | 5 | > >> | calltrace:SyS_ipc | 4 | 4 | > >> | Kernel_panic-not_syncing:Fatal_exception | 7 | 5 | > >> | EIP_is_at_pm_runtime_barrier | 1 | | > >> | EIP_is_at_lock_acquired | 1 | | > >> | EIP_is_at__might_sleep | 1 | | > >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1 | | > >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 26 | | > >> | calltrace:mark_rodata_ro | 26 | | > >> | WARNING:at_lib/list_debug.c:#__list_add_valid | 0 | 17 | > >> | calltrace:serio_handle_event | 0 | 17 | > >> | calltrace:userio_misc_init | 0 | 17 | > >> | calltrace:mousedev_init | 0 | 17 | > >> | calltrace:evdev_init | 0 | 17 | > >> | calltrace:cmos_init | 0 | 17 | > >> | calltrace:test_init | 0 | 17 | > >> | calltrace:pca_isa_driver_init | 0 | 17 | > >> | calltrace:vim2m_init | 0 | 17 | > >> | calltrace:fm_drv_init | 0 | 17 | > >> | calltrace:test_power_init | 0 | 17 | > >> | calltrace:raid#_init | 0 | 17 | > >> | calltrace:dm_init | 0 | 17 | > >> | calltrace:vhci_init | 0 | 17 | > >> | calltrace:dcdrbu_init | 0 | 17 | > >> | calltrace:dcdbas_init | 0 | 17 | > >> | calltrace:uhid_init | 0 | 17 | > >> | calltrace:vhost_net_init | 0 | 17 | > >> | calltrace:dummy_stm_init | 0 | 17 | > >> | calltrace:stm_heartbeat_init | 0 | 17 | > >> | calltrace:binder_init | 0 | 17 | > >> | calltrace:teql_init | 0 | 17 | > >> | calltrace:ipip_init | 0 | 17 | > >> | calltrace:ipgre_init | 0 | 17 | > >> | calltrace:sit_init | 0 | 17 | > >> | calltrace:ip6_tunnel_init | 0 | 17 | > >> | calltrace:ip6gre_init | 0 | 17 | > >> | calltrace:irlan_init | 0 | 17 | > >> | calltrace:pm_qos_power_init | 0 | 17 | > >> | calltrace:ubi_init | 0 | 17 | > >> | calltrace:SyS_setsockopt | 0 | 2 | > >> | calltrace:SyS_socketcall | 0 | 2 | > >> | WARNING:at_include/linux/kref.h:#kobject_get | 0 | 4 | > >> | WARNING:at_kernel/locking/lockdep.c:#__lock_acquire | 0 | 4 | > >> | BUG:kernel_hang_in_test_stage | 0 | 4 | > >> | calltrace:fw_cfg_sysfs_init | 0 | 7 | > >> +------------------------------------------------------------+------------+------------+ > >> > >> > >> > >> [ 152.680291] g_acm_ms usbip-vudc.0: failed to start g_acm_ms: -22 > >> [ 152.689209] usbip-vudc: probe of usbip-vudc.0 failed with error -22 > >> [ 152.698509] ------------[ cut here ]------------ > >> [ 152.709786] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71 > >> [ 152.722785] list_del corruption. next->prev should be cc0230d4, but was 6b6b6b6b > > That's the poison value, so this is a remove-after-remove bug? > > >> [ 152.732828] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1-00033-g8052ce2 #1 > >> [ 152.742576] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 > >> [ 152.754081] 00000000 00200246 c0033e78 c13c456b c0033ea4 c13e72ac c0033e90 c103fd02 > >> [ 152.765720] 0000003c 6b6b6b6b cc0230d4 cc02300c c0033eac c103fd41 00000009 00000000 > >> [ 152.776864] c0033ea4 c1f7bc14 c0033ec0 c0033ecc c13e72ac c1f7bace 0000003c c1f7bc14 > >> [ 152.788472] Call Trace: > >> [ 152.791964] [] dump_stack+0x75/0xa9 > >> [ 152.797835] [] ? __list_del_entry+0x4d/0x71 > >> [ 152.804221] [] __warn+0xad/0xc4 > >> [ 152.809694] [] warn_slowpath_fmt+0x28/0x2d > >> [ 152.816501] [] __list_del_entry+0x4d/0x71 > >> [ 152.822946] [] device_pm_remove+0x2b/0x5a > >> [ 152.829560] [] device_del+0x134/0x19a > >> [ 152.835558] [] ? dev_set_name+0x14/0x16 > >> [ 152.842135] [] platform_device_del+0x16/0x5d > >> [ 152.848909] [] init+0xe5/0x112 > >> [ 152.854149] [] ? usbip_host_init+0xe5/0xe5 > > What device is this? Looks like it's being removed right at init time?? > > >> [ 152.860796] [] do_one_initcall+0x7e/0x101 > >> [ 152.867316] [] ? parse_args+0x190/0x265 > >> [ 152.873497] [] ? kernel_init_freeable+0xd0/0x174 > >> [ 152.880923] [] kernel_init_freeable+0xec/0x174 > >> [ 152.887499] [] kernel_init+0x8/0xd0 > >> [ 152.893658] [] ret_from_kernel_thread+0xe/0x30 > >> [ 152.900759] [] ? rest_init+0x111/0x111 > >> [ 152.907334] ---[ end trace 2003739b08d25430 ]--- > >> [ 152.920040] ------------[ cut here ]------------ > > -Kees > > > -- > Kees Cook > Nexus Security >