From: David Gibson <david@gibson.dropbear.id.au>
To: Paul Mackerras <paulus@ozlabs.org>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
Subject: Re: [PATCH] KVM: PPC: Book3S HV: Fix host crash on changing HPT size
Date: Fri, 21 Jul 2017 16:53:32 +1000
Message-ID: <20170721065332.GC3717@umbus.fritz.box>
In-Reply-To: <20170721054413.GA19957@fergus.ozlabs.ibm.com>
On Fri, Jul 21, 2017 at 03:44:13PM +1000, Paul Mackerras wrote:
> Commit f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB
> ioctl() to change HPT size", 2016-12-20) changed the behaviour of
> the KVM_PPC_ALLOCATE_HTAB ioctl so that it now allocates a new HPT
> and new revmap array if there was a previously-allocated HPT of a
> different size from the size being requested. In this case, we need
> to reset the rmap arrays of the memslots, because the rmap arrays
> will contain references to HPTEs which are no longer valid. Worse,
> these references are also references to slots in the new revmap
> array (which parallels the HPT), and the new revmap array contains
> random contents, since it doesn't get zeroed on allocation.
>
> The effect of having these stale references to slots in the revmap
> array that contain random contents is that subsequent calls to
> functions such as kvmppc_add_revmap_chain will crash because they
> will interpret the non-zero contents of the revmap array as HPTE
> indexes and thus index outside of the revmap array. This leads to
> host crashes such as the following.
>
> [ 7072.862122] Unable to handle kernel paging request for data at address 0xd000000c250c00f8
> [ 7072.862218] Faulting instruction address: 0xc0000000000e1c78
> [ 7072.862233] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 7072.862286] SMP NR_CPUS=1024
> [ 7072.862286] NUMA
> [ 7072.862325] PowerNV
> [ 7072.862378] Modules linked in: kvm_hv vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 mlx5_ib ib_core ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel i2c_opal nfsd auth_rpcgss oid_registry
> [ 7072.863085] nfs_acl lockd grace sunrpc kvm_pr kvm xfs libcrc32c scsi_dh_alua dm_service_time radeon lpfc nvme_fc nvme_fabrics nvme_core scsi_transport_fc i2c_algo_bit tg3 drm_kms_helper ptp pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm dm_multipath i2c_core cxgb3 mlx5_core mdio [last unloaded: kvm_hv]
> [ 7072.863381] CPU: 72 PID: 56929 Comm: qemu-system-ppc Not tainted 4.12.0-kvm+ #59
> [ 7072.863457] task: c000000fe29e7600 task.stack: c000001e3ffec000
> [ 7072.863520] NIP: c0000000000e1c78 LR: c0000000000e2e3c CTR: c0000000000e25f0
> [ 7072.863596] REGS: c000001e3ffef560 TRAP: 0300 Not tainted (4.12.0-kvm+)
> [ 7072.863658] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]>
> [ 7072.863667] CR: 44082882 XER: 20000000
> [ 7072.863767] CFAR: c0000000000e2e38 DAR: d000000c250c00f8 DSISR: 42000000 SOFTE: 1
> GPR00: c0000000000e2e3c c000001e3ffef7e0 c000000001407d00 d000000c250c00f0
> GPR04: d00000006509fb70 d00000000b3d2048 0000000003ffdfb7 0000000000000000
> GPR08: 00000001007fdfb7 00000000c000000f d0000000250c0000 000000000070f7bf
> GPR12: 0000000000000008 c00000000fdad000 0000000010879478 00000000105a0d78
> GPR16: 00007ffaf4080000 0000000000001190 0000000000000000 0000000000010000
> GPR20: 4001ffffff000415 d00000006509fb70 0000000004091190 0000000ee1881190
> GPR24: 0000000003ffdfb7 0000000003ffdfb7 00000000007fdfb7 c000000f5c958000
> GPR28: d00000002d09fb70 0000000003ffdfb7 d00000006509fb70 d00000000b3d2048
> [ 7072.864439] NIP [c0000000000e1c78] kvmppc_add_revmap_chain+0x88/0x130
> [ 7072.864503] LR [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
> [ 7072.864566] Call Trace:
> [ 7072.864594] [c000001e3ffef7e0] [c000001e3ffef830] 0xc000001e3ffef830 (unreliable)
> [ 7072.864671] [c000001e3ffef830] [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
> [ 7072.864751] [c000001e3ffef920] [d00000000b38d878] kvmppc_map_vrma+0x168/0x200 [kvm_hv]
> [ 7072.864831] [c000001e3ffef9e0] [d00000000b38a684] kvmppc_vcpu_run_hv+0x1284/0x1300 [kvm_hv]
> [ 7072.864914] [c000001e3ffefb30] [d00000000f465664] kvmppc_vcpu_run+0x44/0x60 [kvm]
> [ 7072.865008] [c000001e3ffefb60] [d00000000f461864] kvm_arch_vcpu_ioctl_run+0x114/0x290 [kvm]
> [ 7072.865152] [c000001e3ffefbe0] [d00000000f453c98] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
> [ 7072.865292] [c000001e3ffefd40] [c000000000389328] do_vfs_ioctl+0xd8/0x8c0
> [ 7072.865410] [c000001e3ffefde0] [c000000000389be4] SyS_ioctl+0xd4/0x130
> [ 7072.865526] [c000001e3ffefe30] [c00000000000b760] system_call+0x58/0x6c
> [ 7072.865644] Instruction dump:
> [ 7072.865715] e95b2110 793a0020 7b4926e4 7f8a4a14 409e0098 807c000c 786326e4 7c6a1a14
> [ 7072.865857] 935e0008 7bbd0020 813c000c 913e000c <93a30008> 93bc000c 48000038 60000000
> [ 7072.866001] ---[ end trace 627b6e4bf8080edc ]---
>
> Note that to trigger this, it is necessary to use a recent upstream
> QEMU (or other userspace that resizes the HPT at CAS time), specify
> a maximum memory size substantially larger than the current memory
> size, and boot a guest kernel that does not support HPT resizing.
>
> This fixes the problem by resetting the rmap arrays when the old HPT
> is freed.
>
> Fixes: f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
> Cc: stable@vger.kernel.org # v4.11+
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 710e491..1c10e26 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -164,8 +164,10 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
>  		goto out;
>  	}
>
> -	if (kvm->arch.hpt.virt)
> +	if (kvm->arch.hpt.virt) {
>  		kvmppc_free_hpt(&kvm->arch.hpt);
> +		kvmppc_rmap_reset(kvm);
> +	}
>
>  	err = kvmppc_allocate_hpt(&info, order);
>  	if (err < 0)
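
For anyone following along, here is a rough userspace sketch of the
failure mode described in the commit message. It is a toy model only,
not the kernel code: every toy_* name, the 8-page rmap, and the 0xff
fill that stands in for "random contents" are assumptions made for
illustration. The only parts taken from the patch are the ideas: the
per-page rmap heads hold HPTE indexes, the revmap array parallels the
HPT, and replacing the HPT without resetting the rmap leaves heads
pointing at garbage chain links.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model; all names and sizes here are hypothetical, not kernel API. */
#define TOY_NPAGES       8            /* guest pages tracked by the rmap */
#define TOY_RMAP_PRESENT 0x80000000u  /* rmap head points at an HPTE */
#define TOY_RMAP_INDEX   0x7fffffffu  /* HPTE index held in the rmap head */

struct toy_rev { unsigned int forw, back; };  /* chain links, per HPTE */

struct toy_vm {
    unsigned int order;               /* log2(number of HPTE slots) */
    struct toy_rev *rev;              /* revmap array, parallels the HPT */
    unsigned int rmap[TOY_NPAGES];    /* per-page chain heads */
};

/* Allocate a revmap and fill it with junk, mimicking a non-zeroed array. */
void toy_alloc_hpt(struct toy_vm *vm, unsigned int order)
{
    size_t n = (size_t)1 << order;

    vm->rev = malloc(n * sizeof(*vm->rev));
    assert(vm->rev != NULL);
    memset(vm->rev, 0xff, n * sizeof(*vm->rev));
    vm->order = order;
}

/* Forget every page -> HPTE reference. */
void toy_rmap_reset(struct toy_vm *vm)
{
    memset(vm->rmap, 0, sizeof(vm->rmap));
}

void toy_init(struct toy_vm *vm, unsigned int order)
{
    toy_rmap_reset(vm);
    toy_alloc_hpt(vm, order);
}

/* Record that guest page `page` is mapped by HPTE `idx` (chain of one). */
void toy_map_page(struct toy_vm *vm, unsigned int page, unsigned int idx)
{
    assert(page < TOY_NPAGES && idx < (1u << vm->order));
    vm->rev[idx].forw = vm->rev[idx].back = idx;
    vm->rmap[page] = TOY_RMAP_PRESENT | idx;
}

/* Replace the HPT; reset_rmap selects the fixed (1) or buggy (0) path. */
void toy_resize(struct toy_vm *vm, unsigned int order, int reset_rmap)
{
    free(vm->rev);
    if (reset_rmap)
        toy_rmap_reset(vm);
    toy_alloc_hpt(vm, order);
}

/* Count rmap heads whose index or chain links fall outside the revmap. */
unsigned int toy_count_stale(const struct toy_vm *vm)
{
    unsigned int n = 1u << vm->order, stale = 0;

    for (unsigned int p = 0; p < TOY_NPAGES; p++) {
        unsigned int idx = vm->rmap[p] & TOY_RMAP_INDEX;

        if (!(vm->rmap[p] & TOY_RMAP_PRESENT))
            continue;
        if (idx >= n || vm->rev[idx].forw >= n || vm->rev[idx].back >= n)
            stale++;
    }
    return stale;
}
```

In this model, mapping two pages and then calling toy_resize() with
reset_rmap == 0 leaves both heads pointing at junk links, which is the
situation where following the chain would index outside the array.
With reset_rmap == 1 no stale heads remain.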
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson