From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Alok Kataria <akataria@vmware.com>
Cc: "rui.zhang@intel.com" <rui.zhang@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	jacob.jun.pan@linux.intel.com, "Ernst,
	Eric" <eric.ernst@intel.com>, "Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: Regression in intel_powerclamp, due to cpu whitelist removal
Date: Wed, 19 Oct 2016 20:45:30 -0700
Message-ID: <20161019204530.3d2ec1d5@jacob-builder>
In-Reply-To: <2FF1D5AB-46C6-4BEC-A5A7-EC9C16A99919@vmware.com>

On Tue, 18 Oct 2016 14:20:49 +0000
Alok Kataria <akataria@vmware.com> wrote:

> Hi Jacob, Zhang, 
> 
> One of your recent commits, "thermal/powerclamp: remove cpu
> whitelist" [1], has caused a regression in the kernel.
> 
> That commit changed powerclamp_probe from requiring all of the
> following features:
> 
> X86_FEATURE_NONSTOP_TSC
> X86_FEATURE_CONSTANT_TSC
> X86_FEATURE_MWAIT
> X86_FEATURE_ARAT           
> 
> to *any* of them.  The problem is clamp_thread still wants to use
> mwait_idle_with_hints even if the CPU doesn't support it. 
>
Hi Alok,

You are right, it should be AND not OR.
 
+Eric who has a patch to address this.

https://patchwork.kernel.org/patch/9365005/

Rui/Rafael,

Could you consider this as an urgent fix?

Jacob
> This was reported by our users when running Ubuntu 16.10
> (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> don’t think it is specific to our platform. We have seen kernel
> panics due to an invalid opcode because of this. Below is the stack
> trace for your reference.
> 
> [    5.736416] invalid opcode: 0000 [#1] SMP
> [    5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> sysimgblt fb_sys_fops drm pata_acpi fjes
> [    5.744370] CPU: 1 PID: 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> [    5.744373] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [    5.744375] task: ffff9658f7a663c0 task.stack: ffff9658fa908000
> [    5.744378] RIP: 0010:[<ffffffffc05728b8>]  [<ffffffffc05728b8>] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> [    5.744380] RSP: 0018:ffff9658fa90be00  EFLAGS: 00010246
> [    5.744383] RAX: ffff9658fa908008 RBX: 00000000fffee0a6 RCX: 0000000000000000
> [    5.744386] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
> [    5.744388] RBP: ffff9658fa90bec0 R08: ffff9658fa908000 R09: 0000000000000000
> [    5.744391] R10: 000000000001cbf7 R11: 0000000000000000 R12: ffffffff8db581a0
> [    5.744393] R13: ffff9658fa908000 R14: 0000000000000000 R15: ffff9658fa908000
> [    5.744396] FS:  0000000000000000(0000) GS:ffff9658fc640000(0000) knlGS:0000000000000000
> [    5.744398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    5.744401] CR2: 00007ffa6cc262e8 CR3: 000000003ab3b000 CR4: 00000000001406e0
> [    5.744403] Stack:
> [    5.744406]  0000000000000001 ffff9658f7a66dc0 ffff9658fc659200 00000000e878d638
> [    5.744409]  0000000000000001 00000002fc659200 0000000000000001 ffff9658fa908008
> [    5.744411]  0000000000000000 ffff9658fc64fea8 00000000fffee0a6 ffffffffc05720a0
> [    5.744414] Call Trace:
> [    5.744416]  [<ffffffffc05720a0>] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> [    5.744419]  [<ffffffffc0572600>] ? powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [    5.744421]  [<ffffffffc0572600>] ? powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> [    5.744424]  [<ffffffff8cca3c18>] kthread+0xd8/0xf0
> [    5.744427]  [<ffffffff8d49f29f>] ret_from_fork+0x1f/0x40
> [    5.744429]  [<ffffffff8cca3b40>] ? kthread_create_on_node+0x1e0/0x1e0
> [    5.744432] Code: cc e9 ba 00 00 00 eb 19 0f 1f 00 0f ae f0 65 48 8b
> 04 25 04 69 01 00 0f ae b8 08 c0 ff ff 0f ae f0 31 d2 48 8b 44 24 38
> 48 89 d1 <0f> 01 c8 49 8b 45 08 a8 08 75 0b b9 01 00 00 00 4c 89 f0
> 0f 01
> [    5.744434] RIP  [<ffffffffc05728b8>] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> [    5.744437]  RSP <ffff9658fa90be00>
> [    5.744440] invalid opcode: 0000 [#2] SMP
> [    5.744452] ---[ end trace cf659c4076bf2804 ]---
> 
> Looking at the instruction at the RIP <ffffffffc05728b8> shows that
> the kernel attempted to execute the “monitor” instruction.
> 
>  8b8:   0f 01 c8                monitor %rax,%rcx,%rdx
>  8bb:   49 8b 45 08             mov    0x8(%r13),%rax
> 
> To fix this, I think you should restore the explicit feature check
> “if block” that was removed in the above-mentioned commit. Can you
> please look at this?
> 
> Thanks,
> Alok
> 
> 
> [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c77e41e6fd91ad9
> 

[Jacob Pan]
