From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: kernel test robot <oliver.sang@intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	Mike Rapoport <rppt@linux.ibm.com>,
	Christoph Lameter <cl@linux.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Juergen Gross <jgross@suse.com>,
	"Srivatsa S. Bhat" <srivatsa@csail.mit.edu>,
	Alexey Makhalov <amakhalov@vmware.com>,
	VMware PV-Drivers Reviewers <pv-drivers@vmware.com>,
	kvm@vger.kernel.org, Sean Christopherson <seanjc@google.com>
Subject: Re: supervisor write access in kernel mode in __pv_queued_spin_unlock_slowpath
Date: Mon, 2 Jan 2023 20:17:46 +0900
Message-ID: <Y7K9Wh1mgWR2TiDX@hyeyoo>
In-Reply-To: <451187de09e9a80f73a0588da65d55d4a8da6552.camel@redhat.com>

On Sun, Jan 01, 2023 at 01:08:07PM +0200, Maxim Levitsky wrote:
> On Sun, 2023-01-01 at 16:37 +0900, Hyeonggon Yoo wrote:
> > On Sun, Jan 01, 2023 at 03:50:28PM +0900, Hyeonggon Yoo wrote:
> > > On Sat, Dec 31, 2022 at 11:26:25PM +0800, kernel test robot wrote:
> > > > Greeting,
> > > > 
> > > > FYI, we noticed kernel_BUG_at_include/linux/mm.h due to commit (built with gcc-11):
> > > > 
> > > > commit: 0af8489b0216fa1dd83e264bef8063f2632633d7 ("mm, slub: remove percpu slabs with CONFIG_SLUB_TINY")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > [test failed on linux-next/master c76083fac3bae1a87ae3d005b5cb1cbc761e31d5]
> > > > 
> > > > in testcase: rcutorture
> > > > version: 
> > > > with following parameters:
> > > > 
> > > > 	runtime: 300s
> > > > 	test: default
> > > > 	torture_type: tasks-tracing
> > > > 
> > > > test-description: rcutorture is rcutorture kernel module load/unload test.
> > > > test-url: https://www.kernel.org/doc/Documentation/RCU/torture.txt
> > > > 
> > > > 
> > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > > > 
> > > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > > Link: https://lore.kernel.org/oe-lkp/202212312021.bc1efe86-oliver.sang@intel.com
> > > 
> > > <snip>
> > > 
> > > > 
> > > > To reproduce:
> > > > 
> > > >         # build kernel
> > > > 	cd linux
> > > > 	cp config-6.1.0-rc2-00014-g0af8489b0216 .config
> > > > 	make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
> > > > 	make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> > > > 	cd <mod-install-dir>
> > > > 	find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
> > > > 
> > > > 
> > > >         git clone https://github.com/intel/lkp-tests.git
> > > >         cd lkp-tests
> > > >         bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
> > > > 
> > > >         # if come across any failure that blocks the test,
> > > >         # please remove ~/.lkp and /lkp dir to run from a clean state.
> > > 
> > > I was unable to reproduce in the same way as described above
> > > because some files referenced in job-script couldn't be downloaded from
> > > download.01.org/0day :(
> > > 
> > > So I just built rcutorture module as builtin
> > > and I got weird spinlock bug on commit: 0af8489b0216
> > > ("mm, slub: remove percpu slabs with CONFIG_SLUB_TINY")
> > 
> > (+Cc KVM/Paravirt experts)
> > 
> > > full dmesg added as attachment
> > > 
> > > [ 1387.564837][   T57] BUG: unable to handle page fault for address: c108f5f4
> > > [ 1387.566649][   T57] #PF: supervisor write access in kernel mode
> > > [ 1387.567965][   T57] #PF: error_code(0x0003) - permissions violation
> > > [ 1387.569439][   T57] *pde = 010001e1 
> > > [ 1387.570276][   T57] Oops: 0003 [#1] SMP
> > > [ 1387.571149][   T57] CPU: 2 PID: 57 Comm: rcu_torture_rea Tainted: G S                 6.1.0-rc2-00010-g0af8489b0216 #2130 63d19ac2b985fca570c354d8750f489755de37ed
> > > [ 1387.574673][   T57] EIP: kvm_kick_cpu+0x54/0x90
> > > [ 1387.575802][   T57] Code: 2f c5 01 8b 04 9d e0 d4 4e c4 83 15 14 7b 2f c5 00 83 05 08 6d 2f c5 01 0f b7 0c 30 b8 05 00 00 00 83 15 0c 6d 2f c5 00 31 db <0f> 01 c1 83 05 10 6d 2f c5 01 8b 5d f8 8b 75 fc 83 15 14 6d 2f c5
> 
> Yes, this is the infamous hypercall patching bug: note the <0f> 01 c1 bytes at the faulting EIP in the Code: line above.
> 
> > > 
> 
> So what is happening is that Intel and AMD have *slightly* different instructions reserved for hypercalls
> (paravirt calls from the guest to the host hypervisor).
> 
> KVM developers made the mistake of being 'nice' to guests: if the guest uses the wrong hypercall instruction,
> KVM attempts to rewrite it with the right instruction.
> 
> That can fail because, to avoid security issues, KVM performs the write with the exact same security context as the instruction itself
> (it is as if the instruction were defined such that it overwrote itself).
> This means that if the guest memory is marked read-only in the guest paging, then the write will fail and a #PF
> will happen on the wrong hypercall instruction.
> 
> Here we have Intel's instruction (VMCALL, 0F 01 C1), and the host machine is likely AMD, which uses the VMMCALL instruction
> (0F 01 D9).

Oh, right. My host machine is an AMD Ryzen, and it seems I built a kernel that does not
properly support it.
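
To make the opcode comparison concrete, here is a minimal Python sketch (not from the thread) that pulls the bytes at the faulting EIP out of an oops "Code:" line and matches them against the two hypercall encodings quoted above. The function and table names are made up for illustration, and the sample is only the tail of the Code: line from this oops.

```python
import re

# Three-byte encodings of the two hypercall instructions named in the thread.
HYPERCALL_OPCODES = {
    (0x0F, 0x01, 0xC1): "VMCALL (Intel)",
    (0x0F, 0x01, 0xD9): "VMMCALL (AMD)",
}

def faulting_bytes(code_line, count=3):
    """Return `count` bytes starting at the byte the oops marks with <..>."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if re.fullmatch(r"<[0-9a-fA-F]{2}>", tok):
            window = [tok.strip("<>")] + tokens[i + 1:i + count]
            return tuple(int(b, 16) for b in window)
    raise ValueError("no <..> marker found in Code: line")

def classify_hypercall(code_line):
    """Name the hypercall instruction at the faulting EIP, if it is one."""
    return HYPERCALL_OPCODES.get(faulting_bytes(code_line),
                                 "not a hypercall instruction")

# Tail of the Code: line from the oops in this thread.
sample = ("b8 05 00 00 00 83 15 0c 6d 2f c5 00 31 db "
          "<0f> 01 c1 83 05 10 6d 2f c5 01")
print(classify_hypercall(sample))  # VMCALL (Intel)
```

Seeing VMCALL at the faulting address on an AMD host is consistent with the failed hypercall patching described above.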

> Now any recent Linux guest is supposed to use the right instruction via the alternatives mechanism, but it can still pick the wrong one if
> the hypervisor passes a 'non-native' vendor ID, like GenuineIntel on an AMD machine.

[    0.000000][    T0] KERNEL supported cpus:
[    0.000000][    T0]   Intel GenuineIntel
[    0.000000][    T0]   Vortex Vortex86 SoC
[    0.000000][    T0] CPU: vendor_id 'AuthenticAMD' unknown, using generic init.
[    0.000000][    T0] CPU: Your system may be unstable.
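
A quick way to spot this situation in a boot log is to compare the "KERNEL supported cpus" list with the vendor the kernel reports as unknown. The following Python sketch is an illustration only: the helper name and the regular expressions are assumptions based on the log format shown in the lines above, not on any kernel tooling.

```python
import re

# Strips the "[    0.000000][    T0]" prefix used in this particular log.
PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\[\s*T\d+\]\s?")
UNKNOWN = re.compile(r"CPU: vendor_id '([^']+)' unknown")

def parse_cpu_support(dmesg_text):
    """Return (supported vendor idents, unknown vendor or None)."""
    supported, unknown, in_list = [], None, False
    for raw in dmesg_text.splitlines():
        line = PREFIX.sub("", raw)
        if line.startswith("KERNEL supported cpus:"):
            in_list = True
            continue
        match = UNKNOWN.search(line)
        if match:
            unknown = match.group(1)
            in_list = False
            continue
        if in_list and line.startswith(" "):
            # Each entry is "<vendor name> <ident string>"; keep the ident.
            parts = line.split(None, 1)
            if len(parts) == 2:
                supported.append(parts[1])
        else:
            in_list = False
    return supported, unknown

log = """\
[    0.000000][    T0] KERNEL supported cpus:
[    0.000000][    T0]   Intel GenuineIntel
[    0.000000][    T0]   Vortex Vortex86 SoC
[    0.000000][    T0] CPU: vendor_id 'AuthenticAMD' unknown, using generic init.
[    0.000000][    T0] CPU: Your system may be unstable.
"""
supported, unknown = parse_cpu_support(log)
print(supported, unknown)
```

If the unknown vendor is not in the supported list, as here, the kernel was built without support for the CPU vendor it is running on.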
 
> In my testing, using a named CPU model like your '-cpu SandyBridge' still passes through the host vendor ID (that is, the guest
> will see an Intel CPU but with vendor='AuthenticAMD'), but nobody has confirmed to me whether this is a bug or a feature, and I am not
> sure if older qemu versions also did this.
>
> Assuming that your host machine is AMD,
> your best bet to check if my theory is right is to boot the guest without triggering the bug, 
> and check in /proc/cpuinfo if the vendor string is 'GenuineIntel'

Same here. The vendor string is AuthenticAMD regardless of whether I pass
-cpu SandyBridge or -cpu host.
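
The /proc/cpuinfo check can be scripted as below. This is a small Python sketch under the assumption of an x86 guest, where the field is named vendor_id; the helper name is made up, and the sample text is a hypothetical excerpt rather than output captured from this guest.

```python
def vendor_ids(cpuinfo_text):
    """Collect the distinct vendor_id values from /proc/cpuinfo-style text."""
    vendors = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            vendors.add(value.strip())
    return vendors

# Hypothetical two-CPU excerpt; inside a real guest, pass
# open("/proc/cpuinfo").read() instead.
sample = """processor\t: 0
vendor_id\t: AuthenticAMD
processor\t: 1
vendor_id\t: AuthenticAMD
"""
print(vendor_ids(sample))  # {'AuthenticAMD'}
```

A result of AuthenticAMD under '-cpu SandyBridge' matches what is reported above: the named CPU model does not change the vendor string the guest sees.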

I never imagined this could happen just from using the bot's configuration
as-is and running it on a CPU from a different vendor :)

Thank you for such a kind explanation!

Phew, so this bug was totally unrelated to the issue the bot reported,
and I have no clue why the original bug happened.

-- 
Regards,
Hyeonggon
