All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: tcs.kernel@gmail.com
Cc: pbonzini@redhat.com, vkuznets@redhat.com, wanpengli@tencent.com,
	jmattson@google.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, jarkko@kernel.org,
	Haimin Zhang <tcs_kernel@tencent.com>
Subject: Re: [PATCH] KVM: x86: Add a return code and check kvm_page_track_init
Date: Tue, 31 Aug 2021 14:50:07 +0000	[thread overview]
Message-ID: <YS5Bn6I6wVEL8wKS@google.com> (raw)
In-Reply-To: <1630376040-20567-1-git-send-email-tcs_kernel@tencent.com>

For the shortlog, describe what is being fixed instead of the literal code change,
otherwise the shortlog doesn't help explain _why_ a change is being made.

On Tue, Aug 31, 2021, tcs.kernel@gmail.com wrote:
> From: Haimin Zhang <tcs_kernel@tencent.com>
> 
> We found a null pointer deref by our modified syzkaller.
>  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>  CPU: 1 PID: 13993 Comm: syz-executor.0 Kdump: loaded Tainted: 
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
>  BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>  RIP: 0010:rcu_segcblist_enqueue+0xf5/0x1d0 
>  RSP: 0018:ffffc90001e1fc10 EFLAGS: 00010046
>  RAX: dffffc0000000000 RBX: ffff888135c00080 RCX: ffffffff815ba8a1
>  RDX: 0000000000000000 RSI: ffffc90001e1fd00 RDI: ffff888135c00080
>  RBP: ffff888135c000a0 R08: 0000000000000004 R09: fffff520003c3f75
>  R10: 0000000000000003 R11: fffff520003c3f75 R12: 0000000000000000
>  R13: ffff888135c00080 R14: ffff888135c00040 R15: 0000000000000000
>  FS:  00007fecc99f1700(0000) GS:ffff888135c00000(0000) knlGS:0000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000001b2f225000 CR3: 0000000093d08000 CR4: 0000000000750ee0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  PKRU: 55555554
>  Call Trace:
>  srcu_gp_start_if_needed+0x158/0xc60 build/../kernel/rcu/srcutree.c:823
>  __synchronize_srcu+0x1dc/0x250 build/../kernel/rcu/srcutree.c:929
>  kvm_mmu_uninit_vm+0x18/0x30 build/../arch/x86/kvm/mmu/mmu.c:5585
>  kvm_arch_destroy_vm+0x43f/0x5c0 build/../arch/x86/kvm/x86.c:11277
>  kvm_create_vm build/../arch/x86/kvm/../../../virt/kvm/kvm_main.c:1060 
>  kvm_dev_ioctl_create_vm build/../arch/x86/kvm/../../../virt/kvm/kvm_main
>  kvm_dev_ioctl+0xdfb/0x1860 build/../arch/x86/kvm/../../../virt/kvm/kvm_main
>  vfs_ioctl build/../fs/ioctl.c:51 [inline]
>  __do_sys_ioctl build/../fs/ioctl.c:1069 [inline]
>  __se_sys_ioctl build/../fs/ioctl.c:1055 [inline]
>  __x64_sys_ioctl+0x183/0x210 build/../fs/ioctl.c:1055
>  do_syscall_x64 build/../arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x34/0xb0 build/../arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae

Newline here to make it easier to differentiate between the splat and the
explanation.  Though I would say hoist the explanation of the "why" to the top, e.g.

  KVM: x86: Handle SRCU initialization failure during page track init

  Check the return of init_srcu_struct(), which can fail due to OOM, when
  initializing the page track mechanism.  Lack of checking leads to a NULL
  pointer deref found by a modified syzkaller.

  <splat goes here>

> This is because when init_srcu_struct() calls alloc_percpu(struct
> srcu_data) failed, kvm_page_track_init() didn't check init_srcu_struct
> return code. 
> 
> Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
> Reported-by: TCS Robot <tcs_robot@tencent.com>
> ---
>  arch/x86/include/asm/kvm_page_track.h | 2 +-
>  arch/x86/kvm/mmu/page_track.c         | 8 ++++++--
>  arch/x86/kvm/x86.c                    | 7 +++++--
>  3 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
> index 87bd6025d91d..6a5f3acf2b33 100644
> --- a/arch/x86/include/asm/kvm_page_track.h
> +++ b/arch/x86/include/asm/kvm_page_track.h
> @@ -46,7 +46,7 @@ struct kvm_page_track_notifier_node {
>  			    struct kvm_page_track_notifier_node *node);
>  };
>  
> -void kvm_page_track_init(struct kvm *kvm);
> +int kvm_page_track_init(struct kvm *kvm);
>  void kvm_page_track_cleanup(struct kvm *kvm);
>  
>  void kvm_page_track_free_memslot(struct kvm_memory_slot *slot);
> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> index 91a9f7e0fd91..44a67a50f6d2 100644
> --- a/arch/x86/kvm/mmu/page_track.c
> +++ b/arch/x86/kvm/mmu/page_track.c
> @@ -163,13 +163,17 @@ void kvm_page_track_cleanup(struct kvm *kvm)
>  	cleanup_srcu_struct(&head->track_srcu);
>  }
>  
> -void kvm_page_track_init(struct kvm *kvm)
> +int kvm_page_track_init(struct kvm *kvm)
>  {
> +	int r = -ENOMEM;

Unnecessary initialization.

>  	struct kvm_page_track_notifier_head *head;
>  
>  	head = &kvm->arch.track_notifier_head;
> -	init_srcu_struct(&head->track_srcu);
> +	r = init_srcu_struct(&head->track_srcu);
> +	if (r)
> +		return r;
>  	INIT_HLIST_HEAD(&head->track_notifier_list);
> +	return r;

Just do "return 0", which is guaranteed by the above.  Or even better, I would
vote for returning init_srcu_struct() directly, the ordering doesn't matter and
obviously failure is a very rare occurence.

	@@ -175,8 +175,8 @@ void kvm_page_track_init(struct kvm *kvm)
        struct kvm_page_track_notifier_head *head;
 
        head = &kvm->arch.track_notifier_head;
-       init_srcu_struct(&head->track_srcu);
        INIT_HLIST_HEAD(&head->track_notifier_list);
+       return init_srcu_struct(&head->track_srcu);
 }
>  }
>  
>  /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e5d5c5ed7dd4..5da76f989207 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11086,8 +11086,9 @@ void kvm_arch_free_vm(struct kvm *kvm)
>  
>  int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  {
> +	int r = -EINVAL;

Unnecessary initialization.

>  	if (type)
> -		return -EINVAL;
> +		return r;

Unrelated and unnecessary change.

>  
>  	INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
>  	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
> @@ -11121,7 +11122,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  	kvm_apicv_init(kvm);
>  	kvm_hv_init_vm(kvm);
> -	kvm_page_track_init(kvm);
> +	r = kvm_page_track_init(kvm);
> +	if (r)
> +		return r;

Hmm, so I don't see anything above this that needs to be unwound, but I'm still
worried this will be hard to audit/maintain.

As an alternative "fix", about dropping kvm->arch.track_notifier_head.track_srcu
and using kvm->srcu?  kvm_page_track_write() pretty much _has_ to hold that since
the caller is writing guest memory, and conversely kvm_page_track_flush_slot()
_can't_ hold it because the caller is modifying memslots and thus would deadlock
if it held kvm->srcu for read.  In other words, kvm_page_track_write() can rely
(assert?) on vcpu->srcu_idx, and kvm_page_track_flush_slot() can take and release
kvm->srcu.

Practially speaking, (Un)Registering is going to happen only at VM creation so
waiting all kvm->srcu readers instead of just page track readers should not be a
problem.

>  	kvm_mmu_init_vm(kvm);
>  
>  	return static_call(kvm_x86_vm_init)(kvm);
> -- 
> 2.27.0
> 

  reply	other threads:[~2021-08-31 14:50 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-31  2:14 [PATCH] KVM: x86: Add a return code and check kvm_page_track_init tcs.kernel
2021-08-31 14:50 ` Sean Christopherson [this message]
2021-08-31 15:53 ` Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YS5Bn6I6wVEL8wKS@google.com \
    --to=seanjc@google.com \
    --cc=jarkko@kernel.org \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=tcs.kernel@gmail.com \
    --cc=tcs_kernel@tencent.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.