Re: [PATCH v7 4/5] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags


From: Jason Gunthorpe <jgg@nvidia.com>
To: Ankit Agrawal <ankita@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>,
	"joey.gouly@arm.com" <joey.gouly@arm.com>,
	"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
	"yuzenghui@huawei.com" <yuzenghui@huawei.com>,
	"will@kernel.org" <will@kernel.org>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"shahuang@redhat.com" <shahuang@redhat.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"david@redhat.com" <david@redhat.com>,
	"ddutile@redhat.com" <ddutile@redhat.com>,
	"seanjc@google.com" <seanjc@google.com>,
	Aniket Agashe <aniketa@nvidia.com>, Neo Jia <cjia@nvidia.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Krishnakant Jaju <kjaju@nvidia.com>,
	"Tarun Gupta (SW-GPU)" <targupta@nvidia.com>,
	Vikram Sethi <vsethi@nvidia.com>,
	Andy Currid <acurrid@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <danw@nvidia.com>, Zhi Wang <zhiw@nvidia.com>,
	Matt Ochs <mochs@nvidia.com>, Uday Dhoke <udhoke@nvidia.com>,
	Dheeraj Nigam <dnigam@nvidia.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"sebastianene@google.com" <sebastianene@google.com>,
	"coltonlewis@google.com" <coltonlewis@google.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	"yi.l.liu@intel.com" <yi.l.liu@intel.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"gshan@redhat.com" <gshan@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"tabba@google.com" <tabba@google.com>,
	"qperret@google.com" <qperret@google.com>,
	"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"maobibo@loongson.cn" <maobibo@loongson.cn>
Subject: Re: [PATCH v7 4/5] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
Date: Thu, 19 Jun 2025 11:16:31 -0300
Message-ID: <20250619141631.GD1643312@nvidia.com>
In-Reply-To: <SA1PR12MB7199835E63E1EF48C7C7638DB07DA@SA1PR12MB7199.namprd12.prod.outlook.com>

On Thu, Jun 19, 2025 at 12:14:38PM +0000, Ankit Agrawal wrote:
> >> > -           disable_cmo = true;
> >> > +           if (!is_vma_cacheable)
> >> > +                   disable_cmo = true;
> >>
> >> I'm tempted to stick to the 'device' variable name. Or something like
> >> s2_noncacheable. As I commented, it's not just about disabling CMOs.
> >
> > I think it would be clearer to have two concepts/variable then because
> > the cases where it is really about preventing cachable access to
> > prevent aborts are not linked to the logic that checks pfn valid. We
> > have to detect those cases separately (through the VMA flags was it?).
> >
> > Having these two things together is IMHO confusing..
> >
> > Jason
> 
> Thanks Catalin and Jason for the comments.
> 
> Considering the feedback, I think we may do the following here:
> 1. Rename the device variable to S2_noncacheable to represent if the S2
>     is going to be marked non cacheable. Otherwise S2 will be mapped
>     NORMAL.

How about "s2_force_noncachable", for extra clarity about what is going on?

> 2. Detect what PFN has to be marked S2_noncacheable. If a PFN is not in the
>     kernel map, mark as S2 except for PFNMAP + VMA cacheable.
> 3. Prohibit cacheable PFNMAP if hardware doesn't support FWB and CACHE DIC.
> 4. Prohibit S2 non cached mapping for cacheable VMA for all cases, whether
>     pre-FWB hardware or not.

The logic sounds right.
 
> This would be how the patch would look.
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 339194441a25..979668d475bd 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1516,8 +1516,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  {
>         int ret = 0;
>         bool write_fault, writable, force_pte = false;
> -       bool exec_fault, mte_allowed, is_vma_cacheable;
> -       bool device = false, vfio_allow_any_uc = false;
> +       bool exec_fault, mte_allowed, is_vma_cacheable, cacheable_pfnmap = false;
> +       bool s2_noncacheable = false, vfio_allow_any_uc = false;
>         unsigned long mmu_seq;
>         phys_addr_t ipa = fault_ipa;
>         struct kvm *kvm = vcpu->kvm;
> @@ -1660,6 +1660,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> 
>         is_vma_cacheable = kvm_vma_is_cacheable(vma);
> 
> +       if (vma->vm_flags & VM_PFNMAP) {
> +               /* Reject COW VM_PFNMAP */
> +               if (is_cow_mapping(vma->vm_flags))
> +                       return -EINVAL;

The comment should explain why we have to reject COW PFNMAP; that the
code rejects it is already obvious from the code itself.
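
For reference, a minimal userspace sketch of what the check tests. The
flag values below are stand-ins chosen for illustration, and
is_cow_pfnmap() is a hypothetical helper that is not in the patch; the
body of is_cow_mapping() follows the in-kernel helper, which treats a
mapping as COW when it may be written but is not shared.

```c
#include <stdbool.h>

/* Stand-in flag values for illustration; the kernel headers define the real ones. */
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL
#define VM_PFNMAP   0x00000400UL

/* A mapping is copy-on-write when it is private (not shared) and may be written. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical helper: the combination the patch rejects with -EINVAL. */
static bool is_cow_pfnmap(unsigned long flags)
{
	return (flags & VM_PFNMAP) && is_cow_mapping(flags);
}
```

A shared, writable PFNMAP passes this check; only the private, writable
case is refused.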

> +
> +               if (is_vma_cacheable)
> +                       cacheable_pfnmap = true;
> +       }
> +
>         /* Don't use the VMA after the unlock -- it may have vanished */
>         vma = NULL;
> 
> @@ -1684,8 +1693,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>                 return -EFAULT;
> 
>         if (kvm_is_device_pfn(pfn)) {

Are we changing this to !pfn_is_map_memory()?

We should really only call pfn_is_map_memory() if VM_PFNMAP or
VM_MIXEDMAP, otherwise the VMA has only struct pages in it.

Can it look more like this?

if ((vm_flags & (VM_PFNMAP | VM_MIXEDMAP)) && !pfn_is_map_memory(pfn)) {
	/*
	 * The memory is non-struct page memory, it cannot be cache flushed
	 * and may be unsafe to be accessed as cachable.
	 */
	if (cacheable_pfnmap) {
		/*
		 * The VMA owner has said the physical address is safe for
		 * cachable access. When FWB .....
		 */
		if (!kvm_arch_supports_cacheable_pfnmap())
			return -EFAULT;
		/* Cannot degrade cachable to non cachable */
		if (s2_force_noncachable)
			return -EINVAL;
	} else {
		/* Assume the address is unsafe for cachable access */
		s2_force_noncachable = true;
	}
}
/* nothing beyond here writes to s2_force_noncachable? */
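
To make the intended outcomes easy to check, here is a small userspace
model of the flow above. It is only a sketch: struct s2_fault_model and
s2_decide() are hypothetical names, the booleans stand in for
pfn_is_map_memory(pfn) and kvm_arch_supports_cacheable_pfnmap(), the
flag values are stand-ins, and vm_flags is assumed to have been saved
before the VMA is dropped.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in flag values for illustration; the kernel headers define the real ones. */
#define VM_PFNMAP   0x00000400UL
#define VM_MIXEDMAP 0x10000000UL

/* Hypothetical bundle of the inputs the check needs. */
struct s2_fault_model {
	unsigned long vm_flags;   /* VMA flags saved before the VMA is dropped */
	bool pfn_is_map_memory;   /* stand-in for pfn_is_map_memory(pfn) */
	bool cacheable_pfnmap;    /* VM_PFNMAP and the VMA is cacheable */
	bool hw_supports;         /* stand-in for kvm_arch_supports_cacheable_pfnmap() */
};

/* Returns 0 or a negative errno; may set *s2_force_noncachable. */
static int s2_decide(const struct s2_fault_model *f, bool *s2_force_noncachable)
{
	/* Only PFNMAP/MIXEDMAP VMAs can hold memory without a struct page. */
	if ((f->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)) && !f->pfn_is_map_memory) {
		if (f->cacheable_pfnmap) {
			/* Cacheable mapping requested but the hardware cannot support it. */
			if (!f->hw_supports)
				return -EFAULT;
			/* Cannot degrade cachable to non cachable. */
			if (*s2_force_noncachable)
				return -EINVAL;
		} else {
			/* Assume the address is unsafe for cachable access. */
			*s2_force_noncachable = true;
		}
	}
	return 0;
}
```

With this shape, a VMA that has neither VM_PFNMAP nor VM_MIXEDMAP never
reaches the pfn_is_map_memory() test at all.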

Jason


Thread overview: 18+ messages
2025-06-18  6:55 [PATCH v7 0/5] KVM: arm64: Map GPU device memory as cacheable ankita
2025-06-18  6:55 ` [PATCH v7 1/5] KVM: arm64: Rename symbols to reflect whether CMO may be used ankita
2025-06-18 14:28   ` Catalin Marinas
2025-06-18 14:35   ` Catalin Marinas
2025-06-19  2:22     ` Ankit Agrawal
2025-06-18  6:55 ` [PATCH v7 2/5] KVM: arm64: Block cacheable PFNMAP mapping ankita
2025-06-18 15:46   ` Catalin Marinas
2025-06-19  2:21     ` Ankit Agrawal
2025-06-18  6:55 ` [PATCH v7 3/5] KVM: arm64: New function to determine hardware cache management support ankita
2025-06-18 16:12   ` Catalin Marinas
2025-06-18  6:55 ` [PATCH v7 4/5] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags ankita
2025-06-18 16:34   ` Catalin Marinas
2025-06-18 16:38     ` Jason Gunthorpe
2025-06-19 12:14       ` Ankit Agrawal
2025-06-19 14:16         ` Jason Gunthorpe [this message]
2025-06-19 16:03         ` Donald Dutile
2025-06-19 16:46           ` Ankit Agrawal
2025-06-18  6:55 ` [PATCH v7 5/5] KVM: arm64: Expose new KVM cap for cacheable PFNMAP ankita
