From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754795AbcKUUCS (ORCPT );
        Mon, 21 Nov 2016 15:02:18 -0500
Received: from mga11.intel.com ([192.55.52.93]:53629 "EHLO mga11.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754227AbcKUUCQ (ORCPT );
        Mon, 21 Nov 2016 15:02:16 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.31,528,1473145200";
        d="scan'208";a="904071208"
From: "Pandruvada, Srinivas"
To: "tglx@linutronix.de", "linux-kernel@vger.kernel.org"
CC: "Zhang, Rui", "edubezval@gmail.com", "peterz@infradead.org",
        "bp@alien8.de", "linux-pm@vger.kernel.org", "x86@kernel.org",
        "rt@linutronix.de"
Subject: Re: [patch 00/12] thermal/x86_pkg_temp: Sanitize yet another
        hotplug and locking trainwreck
Thread-Topic: [patch 00/12] thermal/x86_pkg_temp: Sanitize yet another
        hotplug and locking trainwreck
Thread-Index: AQHSRDImnrG09OBjwkOd+VyhmKijTg==
Date: Mon, 21 Nov 2016 20:02:13 +0000
Message-ID: <1479758532.6544.169.camel@intel.com>
References: <20161117231435.891545908@linutronix.de>
In-Reply-To: <20161117231435.891545908@linutronix.de>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.54.75.13]
Content-Type: text/plain; charset="utf-8"
Content-ID: <4EC721A9640E4F4EAB8B30D162918C47@intel.com>
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id uALK2MNk020709

On Fri, 2016-11-18 at 00:03 +0000, Thomas Gleixner wrote:
> We solely intended to convert that driver to the hotplug state machine and
> stumbled over a large pile of insanities, which are all interwoven with the
> package management:

Thanks, Thomas, for the fixes.

But a very simple test on 4.9.0-rc5 on a client machine, just rmmod the
driver and then read the zone, causes a crash.
I have not tested on a multi-socket system yet. That is also required,
because only there can the real case of the last CPU in a package going
offline be tested.

$ uname -a
Linux xxx 4.9.0-rc5+ #47 SMP Mon Nov 21 11:30:56 PST 2016 x86_64 x86_64 x86_64 GNU/Linux
$ pwd
/sys/devices/virtual/thermal/thermal_zone8
$ cat type
x86_pkg_temp
$ cat temp
43000
$ lsmod | grep x86
x86_pkg_temp_thermal    16384  0
aes_x86_64             20480  1 aesni_intel
$ sudo rmmod x86_pkg_temp_thermal
$ cd ../thermal_zone8

The zone is not removed on rmmod, so any read on the zone will cause a
crash:

$ ls
available_policies  k_d   k_pu    power      sustainable_power  trip_point_0_type  type
emul_temp           k_i   offset  slope      temp               trip_point_1_temp  uevent
integral_cutoff     k_po  policy  subsystem  trip_point_0_temp  trip_point_1_type
spandruvada@spandruv-mobl2:/sys/devices/virtual/thermal/thermal_zone8$ cat temp
Killed
spandruvada@spandruv-mobl2:/sys/devices/virtual/thermal/thermal_zone8$

[  116.726989] BUG: unable to handle kernel paging request at ffffffffa07b8010
[  116.727103] IP: [] thermal_zone_get_temp+0x3d/0x100
[  116.727192] PGD 2e0d067
[  116.727224] PUD 2e0e063
[  116.727259] PMD 14266c067
[  116.727278] PTE 0
[  116.727328] Oops: 0000 [#1] SMP
[  116.727368] Modules linked in: rfcomm vmw_vsock_vmci_transport vsock
vmw_vmci asix usbnet mii binfmt_misc bnep hid_sensor_rotation
hid_sensor_als hid_sensor_incl_3d hid_sensor_accel_3d hid_sensor_magn_3d
hid_sensor_gyro_3d hid_sensor_trigger hid_sensor_custom joydev
hid_sensor_iio_common pn544_mei hid_rmi mei_phy hid_multitouch pn544 hci
hid_sensor_hub nfc intel_rapl nls_iso8859_1 uvcvideo coretemp
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel videobuf2_vmalloc
videobuf2_memops aesni_intel videobuf2_v4l2 option aes_x86_64 usb_wwan
videobuf2_core lrw usbserial gf128mul glue_helper ablk_helper videodev
btusb cryptd btrtl btbcm iwlwifi btintel serio_raw bluetooth
intel_pch_thermal cfg80211 snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_intel battery snd_hda_codec snd_hda_core
[  116.728433]  winbond_cir int3403_thermal rc_core snd_hwdep snd_seq_midi
snd_pcm snd_seq_midi_event mac_hid snd_rawmidi dw_dmac int3402_thermal
snd_seq 8250_dw i2c_designware_platform snd_seq_device
spi_pxa2xx_platform i2c_designware_core snd_timer int3400_thermal
acpi_thermal_rel snd mei_me processor_thermal_device mei
int340x_thermal_zone lpc_ich intel_soc_dts_iosf soundcore nfsd
auth_rpcgss kvm_intel nfs_acl lockd kvm grace irqbypass sunrpc cuse
parport_pc ppdev lp parport autofs4 hid_generic usbhid i915 i2c_algo_bit
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci
libahci video i2c_hid hid sdhci_acpi sdhci
[last unloaded: x86_pkg_temp_thermal]
[  116.729342] CPU: 1 PID: 2431 Comm: cat Not tainted 4.9.0-rc5+ #47
[  116.729413] Hardware name: Intel Corporation Shark Bay Client platform/Harris Beach SDS, BIOS HSWLPTU1.86C.0133.R00.1309172123 09/17/2013
[  116.729547] task: ffff880124288000 task.stack: ffffc90000cf4000
[  116.729614] RIP: 0010:[]  [] thermal_zone_get_temp+0x3d/0x100
[  116.729721] RSP: 0018:ffffc90000cf7cd8  EFLAGS: 00010287
[  116.729783] RAX: 00000000ffffffea RBX: ffff880114de2000 RCX: ffff88010f89d4c0
[  116.729862] RDX: ffffffffa07b8000 RSI: ffffc90000cf7d1c RDI: ffff880114c5a000
[  116.729942] RBP: ffffc90000cf7d08 R08: 0000000000000000 R09: 0000000000000000
[  116.730021] R10: 000000001cffdcff R11: 0000000000000001 R12: ffff880114c5a028
[  116.730101] R13: ffffffff81a7aab0 R14: ffff880113a53800 R15: ffff88012fdcc200
[  116.730182] FS:  00007f40bf57e700(0000) GS:ffff88014f280000(0000) knlGS:0000000000000000
[  116.730272] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  116.730337] CR2: ffffffffa07b8010 CR3: 0000000010130000 CR4: 00000000001406e0
[  116.730418] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  116.730497] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  116.730576] Stack:
[  116.730603]  0000000000000282 ffffffff7fffffff ffff880114de2000 ffff880114c5a028
[  116.736697]  ffffffff81a7aab0 ffff880113a53800 ffffc90000cf7d28 ffffffff8167826e
[  116.742680]  0000000000000246 ffffffff81f19e20 ffffc90000cf7d48 ffffffff8154c910
[  116.748607] Call Trace:
[  116.754349]  [] temp_show+0x1e/0x40
[  116.759705]  [] dev_attr_show+0x20/0x50
[  116.764841]  [] ? sysfs_file_ops+0x41/0x60
[  116.769056]  [] sysfs_kf_seq_show+0xc1/0x120
[  116.773282]  [] kernfs_seq_show+0x26/0x30
[  116.777457]  [] seq_read+0xcf/0x380
[  116.781649]  [] kernfs_fop_read+0x11b/0x1a0
[  116.785727]  [] __vfs_read+0x28/0x110
[  116.789796]  [] ? security_file_permission+0x9b/0xc0
[  116.793605]  [] ? rw_verify_area+0x4e/0xb0
[  116.795640]  [] vfs_read+0x93/0x130
[  116.797212]  [] SyS_read+0x49/0xa0
[  116.798715]  [] do_syscall_64+0x6e/0x180
[  116.800224]  [] entry_SYSCALL64_slow_path+0x25/0x25
[  116.801709] Code: 41 54 53 48 83 ec 10 48 85 ff c7 45 d8 ff ff ff 7f 0f 84 a3 00 00 00 48 81 ff 00 f0 ff ff 0f 87 96 00 00 00 48 8b 97 a0 04 00 00 <48> 83 7a 10 00 0f 84 84 00 00 00 4c 8d b7 30 05 00 00 48 89 fb
[  116.803334] RIP  [] thermal_zone_get_temp+0x3d/0x100
[  116.804877]  RSP
[  116.806378] CR2: ffffffffa07b8010
[  116.815516] ---[ end trace 593a281d67d382bb ]---

Thanks,
Srinivas