From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A216C369A2 for ; Mon, 14 Apr 2025 13:53:52 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.950301.1346664 (Exim 4.92) (envelope-from ) id 1u4KFq-0001xR-Du; Mon, 14 Apr 2025 13:53:42 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 950301.1346664; Mon, 14 Apr 2025 13:53:42 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1u4KFq-0001xK-BD; Mon, 14 Apr 2025 13:53:42 +0000 Received: by outflank-mailman (input) for mailman id 950301; Mon, 14 Apr 2025 13:53:41 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1u4KFp-0001wy-1g for xen-devel@lists.xenproject.org; Mon, 14 Apr 2025 13:53:41 +0000 Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [2607:f8b0:4864:20::531]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id de1ce9ba-1937-11f0-9eae-5ba50f476ded; Mon, 14 Apr 2025 15:53:40 +0200 (CEST) Received: by mail-pg1-x531.google.com with SMTP id 41be03b00d2f7-af91fc1fa90so4003474a12.0 for ; Mon, 14 Apr 2025 06:53:40 -0700 (PDT) Received: from localhost ([84.78.159.3]) by smtp.gmail.com with UTF8SMTPSA id 98e67ed59e1d1-306df3ffc38sm11085485a91.40.2025.04.14.06.53.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Apr 2025 06:53:37 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: de1ce9ba-1937-11f0-9eae-5ba50f476ded DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1744638818; x=1745243618; darn=lists.xenproject.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=yF3TQI9kFpPlllrK3uiAM8glsQvPwTQUXhruOt8PHMI=; b=U7EOVqidnRfQF6nQitZtE4V2oBeafxwI/Xbg/WT5FB9TceVHi0l7+4xx6/x1hI8MEa 7OYQiBteQuXzMZ/U/MWFieUrGiP6RWHoGndWzeavRKNoy15pHit2rSuYUFlw25w5sDh9 g0/2+lZF75WRoShM971q8uwekuz0wWbstwDi8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744638818; x=1745243618; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yF3TQI9kFpPlllrK3uiAM8glsQvPwTQUXhruOt8PHMI=; b=bCcyvy0HeID4Vqpb5EsadDAFFuy488hqWyZFEL8XISdItWhbWAe1x+x93mZWz3HO2c Vat/0rkrSHWtN4CisdrWujyZH3IRh8QB2aIYvGtD7WfQDj2xdUFBC0RKPgshIYcOcLHW 4jHQcX99rxwpe3ts3JWJBNNe3JabvqGep5Pt0R5jWquuVfrMfzFXbeRdDYAsJ99c5lVI brfKXyJrGMhzHnjNTIHAmrvEGgfESMzQdW/dpVmpS9lAv0Jdnf/b7WWgg4bVLd88UNEY UhzbJJ+mpe7fL4E16sLvnT3Txoy/p2kloyK3QR5pDcxcmbJEryDJMoWz5nnIWM+CUuW7 tvLw== X-Forwarded-Encrypted: i=1; AJvYcCW5g0CxutcRnTIY0FQO18nVIr43N2gldntoFIwdxISjh8CYES6x6Ok6aUTxJUCq/RVkbIEr3rfSdaE=@lists.xenproject.org X-Gm-Message-State: AOJu0YzK8akj4W0Cr59sgfIZ35F8hHHPwpAKIZMXcuN9G7g8SA7Qqdm0 1XIaneZFpAld7gbeZvyu3FxdyYvqKtDSAP0WvRuZL/o1tyPYmLRYzFLPgQ0pF1M= X-Gm-Gg: ASbGncvgZ0hayO4y6YV+X7V4emxAeF5EDm0DEqs4k8jPo06kbNFv+mjeMipDTzXccCf 74Z8TGUIRry+nNpoL4sE+ZiAdhrSy5YUa+EKVdLzrBOHduRLX1pDLe0Zjho1uMb14i97FsrlZTe vgFiBrba4pPpGeWO86OTyad+46llLnE7SmvDmAZuxp0IsyCoTjlbZ60O8n6OZY6jK4+sZBKOJLX Xe+M/4TxNEp97zMypgK0v3oyEdWvygX32kl8XVLDghkM7Wl+99PrKrLxakTv80lqf2priWWXVjw 8zu52DFonNbAiVmXYk/Msa/q3bkyb6AofWTq9qzVieqJLg== X-Google-Smtp-Source: AGHT+IGBCo8s/124k9FxEnVk7VVQnYaWP6obtft0y9kFZrBTaJF+alE1n1I5vlX95WXIbClyluDWuA== X-Received: by 2002:a17:90b:4d0c:b0:2fc:c262:ef4b with SMTP id 98e67ed59e1d1-30823645d5dmr19991048a91.18.1744638818081; Mon, 14 Apr 2025 06:53:38 -0700 (PDT) Date: Mon, 14 Apr 2025 15:53:32 +0200 From: Roger Pau =?utf-8?B?TW9ubsOp?= To: Jan Beulich Cc: Andrew Cooper , xen-devel@lists.xenproject.org Subject: Re: [PATCH 3/5] x86/hvm: fix handling of accesses to partial r/o MMIO pages Message-ID: References: <20250411105411.22334-1-roger.pau@citrix.com> <20250411105411.22334-4-roger.pau@citrix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: On Mon, Apr 14, 2025 at 08:33:44AM +0200, Jan Beulich wrote: > On 11.04.2025 12:54, Roger Pau Monne wrote: > > The current logic to handle accesses to MMIO pages partially read-only is > > based on the (now removed) logic used to handle accesses to the r/o MMCFG > > region(s) for PVH v1 dom0. However that has issues when running on AMD > > hardware, as in that case the guest linear address that triggered the fault > > is not provided as part of the VM exit. This caused > > mmio_ro_emulated_write() to always fail before calling > > subpage_mmio_write_emulate() when running on AMD and called from an HVM > > context. > > > > Take a different approach and convert the handling of partial read-only > > MMIO page accesses into an HVM MMIO ops handler, as that's the more natural > > way to handle this kind of emulation for HVM domains. > > > > This allows getting rid of hvm_emulate_one_mmio() and it's single cal site > > in hvm_hap_nested_page_fault(). > > > > Note a small adjustment is needed to the `pf-fixup` dom0 PVH logic: avoid > > attempting to fixup faults resulting from accesses to read-only MMIO > > regions, as handling of those accesses is now done by handle_mmio(). > > > > Fixes: 33c19df9a5a0 ('x86/PCI: intercept accesses to RO MMIO from dom0s in HVM containers') > > Signed-off-by: Roger Pau Monné > > --- > > The fixes tag is maybe a bit wonky, it's either this or: > > > > 8847d6e23f97 ('x86/mm: add API for marking only part of a MMIO page read only') > > > > However the addition of subpage r/o access handling to the existing > > mmio_ro_emulated_write() function was done based on the assumption that the > > current code was working - which turned out to not be the case for AMD, > > hence my preference for blaming the commit that actually introduced the > > broken logic. > > I agree. > > > --- > > xen/arch/x86/hvm/emulate.c | 47 +------------- > > xen/arch/x86/hvm/hvm.c | 89 +++++++++++++++++++++++--- > > It's quite a bit of rather special code you add to this catch-all file. How > about introducing a small new one, e.g. mmio.c (to later maybe become home > to some further stuff)? Yes, I didn't find any suitable place, I was also considering hvm/io.c or hvm/intercept.c, but none looked very good TBH. The main benefit of placing it in hvm.c is that the functions can all be static and confined to hvm.c as it's only user. > > --- a/xen/arch/x86/hvm/emulate.c > > +++ b/xen/arch/x86/hvm/emulate.c > > @@ -370,7 +370,8 @@ static int hvmemul_do_io( > > /* If there is no suitable backing DM, just ignore accesses */ > > if ( !s ) > > { > > - if ( is_mmio && is_hardware_domain(currd) ) > > + if ( is_mmio && is_hardware_domain(currd) && > > + !rangeset_contains_singleton(mmio_ro_ranges, PFN_DOWN(addr)) ) > > I think this needs a brief comment - it otherwise looks misplaced here. > > > @@ -585,9 +585,81 @@ static int cf_check hvm_print_line( > > return X86EMUL_OKAY; > > } > > > > +static int cf_check subpage_mmio_accept(struct vcpu *v, unsigned long addr) > > +{ > > + p2m_type_t t; > > + mfn_t mfn = get_gfn_query_unlocked(v->domain, addr, &t); > > + > > + return !mfn_eq(mfn, INVALID_MFN) && t == p2m_mmio_direct && > > + !!subpage_mmio_find_page(mfn); > > The !! isn't needed here, is it? IIRC clang complained about conversion from pointer to integer without a cast, but maybe that was before adding the further conditions here. > > +} > > + > > +static int cf_check subpage_mmio_read( > > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long *data) > > +{ > > + struct domain *d = v->domain; > > + p2m_type_t t; > > + mfn_t mfn = get_gfn_query(d, addr, &t); > > + struct subpage_ro_range *entry; > > + volatile void __iomem *mem; > > + > > + *data = ~0UL; > > + > > + if ( mfn_eq(mfn, INVALID_MFN) || t != p2m_mmio_direct ) > > + { > > + put_gfn(d, addr); > > + return X86EMUL_RETRY; > > + } > > + > > + entry = subpage_mmio_find_page(mfn); > > + if ( !entry ) > > + { > > + put_gfn(d, addr); > > + return X86EMUL_RETRY; > > + } > > + > > + mem = subpage_mmio_map_page(entry); > > + if ( !mem ) > > + { > > + put_gfn(d, addr); > > + gprintk(XENLOG_ERR, "Failed to map page for MMIO read at %#lx\n", > > + mfn_to_maddr(mfn) + PAGE_OFFSET(addr)); > > Makes me wonder whether the function parameter wouldn't better be named > "curr" and the local variable "currd", to reflect that this log message > will report appropriate context. Sure, can adjust. > Would also logging the guest physical address perhaps be worthwhile here? Possibly, my first apporahc was to print gfn -> mfn, but ended up copying the same message from subpage_mmio_write_emulate() for symmetry reasons. > > + return X86EMUL_OKAY; > > + } > > + > > + *data = read_mmio(mem + PAGE_OFFSET(addr), len); > > + > > + put_gfn(d, addr); > > + return X86EMUL_OKAY; > > +} > > + > > +static int cf_check subpage_mmio_write( > > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long data) > > +{ > > + struct domain *d = v->domain; > > + p2m_type_t t; > > + mfn_t mfn = get_gfn_query(d, addr, &t); > > + > > + if ( mfn_eq(mfn, INVALID_MFN) || t != p2m_mmio_direct ) > > + { > > + put_gfn(d, addr); > > + return X86EMUL_RETRY; > > + } > > + > > + subpage_mmio_write_emulate(mfn, PAGE_OFFSET(addr), data, len); > > + > > + put_gfn(d, addr); > > + return X86EMUL_OKAY; > > +} > > Unlike the read path this doesn't return RETRY when subpage_mmio_find_page() > fails (indicating something must have changed after "accept" said yes). Yeah, I've noticed this, but didn't feel like modifying subpage_mmio_write_emulate() for this. The list of partial r/o MMIO pages cannot change after system boot, so accept returning yes and not finding a page here would likely warrant an ASSERT_UNRECHABLE(). Would you like me to modify subpage_mmio_write_emulate to make it return an error code? > > @@ -1981,7 +2056,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, > > */ > > if ( (p2mt == p2m_mmio_dm) || > > (npfec.write_access && > > - (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) ) > > + (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server) || > > + /* MMIO entries can be r/o if the target mfn is in mmio_ro_ranges. */ > > + (p2mt == p2m_mmio_direct))) ) > > { > > if ( !handle_mmio_with_translation(gla, gfn, npfec) ) > > hvm_inject_hw_exception(X86_EXC_GP, 0); > > Aren't we handing too many things to handle_mmio_with_translation() this > way? At the very least you're losing ... > > > @@ -2033,14 +2110,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, > > goto out_put_gfn; > > } > > > > - if ( (p2mt == p2m_mmio_direct) && npfec.write_access && npfec.present && > > ... the .present check. Isn't the p2mt == p2m_mmio_direct check already ensuring the entry is present? Otherwise it's type would be p2m_invalid or p2m_mmio_dm? It did seem to me the other checks in this function already assume that by having a valid type the entry is present. > I'm also concerned of e.g. VT-x'es APIC access MFN, which is > p2m_mmio_direct. But that won't go into hvm_hap_nested_page_fault() when using cpu_has_vmx_virtualize_apic_accesses (and thus having an APIC page mapped as p2m_mmio_direct)? It would instead be an EXIT_REASON_APIC_ACCESS vmexit which is handled differently? Thanks, Roger.