From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com
Cc: maz@kernel.org, oliver.upton@linux.dev, catalin.marinas@arm.com,
will@kernel.org, aniketa@nvidia.com, cjia@nvidia.com,
kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com,
acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
danw@nvidia.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] KVM: arm64: determine memory type from VMA
Date: Thu, 7 Sep 2023 16:12:49 -0300 [thread overview]
Message-ID: <ZPogsU6YihY2+qR6@nvidia.com> (raw)
In-Reply-To: <20230907181459.18145-2-ankita@nvidia.com>
On Thu, Sep 07, 2023 at 11:14:58AM -0700, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> Currently KVM determines if a VMA is pointing at IO memory by checking
> pfn_is_map_memory(). However, the MM already gives us a way to tell what
> kind of memory it is by inspecting the VMA.
>
> Replace pfn_is_map_memory() with a check on the VMA pgprot to determine if
> the memory is IO and thus needs stage-2 device mapping.
>
> The VMA's pgprot is tested to determine the memory type with the
> following mapping:
>
> pgprot_noncached MT_DEVICE_nGnRnE device
> pgprot_writecombine MT_NORMAL_NC device
> pgprot_device MT_DEVICE_nGnRE device
> pgprot_tagged MT_NORMAL_TAGGED RAM
>
> This patch solves a problem where it is possible for the kernel to
> have VMAs pointing at cachable memory without causing
> pfn_is_map_memory() to be true, eg DAX memremap cases and CXL/pre-CXL
> devices. This memory is now properly marked as cachable in KVM.
>
> Unfortunately when FWB is not enabled, the kernel expects to naively do
> cache management by flushing the memory using an address in the
> kernel's map. This does not work in several of the newly allowed
> cases such as dcache_clean_inval_poc(). Check whether the targeted pfn
> and its mapping KVA are valid, in case FWB is absent, before
> continuing.
Looking at this more closely, it relies on kvm_pte_follow(), which
ultimately calls __va(), and the arm64 version of page_to_virt() is:
#define page_to_virt(x) ({ \
__typeof__(x) __page = x; \
void *__addr = __va(page_to_phys(__page)); \
(void *)__tag_set((const void *)__addr, page_kasan_tag(__page));\
})
So we don't actually have an issue here: anything with a struct page
will be handled properly by KVM.
Thus this can just be:
	if (!stage2_has_fwb(pgt) && !pfn_valid(pfn))
Then the last paragraph of the commit message is:
  As cachable vs device memory is now determined by the VMA, adjust
  the other checks to match their purpose. In most cases the code needs
  to know if there is a struct page, or if the memory is in the kernel
  map, and pfn_valid() is an appropriate API for this.

  Note that when FWB is not enabled, the kernel expects to trivially do
  cache management by flushing the memory, linearly converting a
  kvm_pte to a phys_addr to a KVA; see kvm_flush_dcache_to_poc(). This
  is only possible for struct page backed memory. Do not allow
  non-struct page memory to be cachable without FWB.
> @@ -1490,6 +1499,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> gfn = fault_ipa >> PAGE_SHIFT;
> mte_allowed = kvm_vma_mte_allowed(vma);
>
> + /*
> + * Figure out the memory type based on the user va mapping properties
> + * Only MT_DEVICE_nGnRE and MT_DEVICE_nGnRnE will be set using
> + * pgprot_device() and pgprot_noncached() respectively.
> + */
> + if ((mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRE) ||
> + (mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRnE) ||
> + (mapping_type(vma->vm_page_prot) == MT_NORMAL_NC))
> + prot |= KVM_PGTABLE_PROT_DEVICE;
> + else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
> + prot |= KVM_PGTABLE_PROT_X;
> +
> /* Don't use the VMA after the unlock -- it may have vanished */
> vma = NULL;
>
> @@ -1576,10 +1597,21 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> if (exec_fault)
> prot |= KVM_PGTABLE_PROT_X;
You still didn't remove the kvm_is_device_pfn() check from this code;
I don't think it can really stay and make any sense..
Probably this:
if (exec_fault && (prot & KVM_PGTABLE_PROT_DEVICE))
return -ENOEXEC;
And these two should also use pfn_valid(), precomputing the result once:
if (vma_pagesize == PAGE_SIZE && !(force_pte || !pfn_valid(pte))) {
if (fault_status != ESR_ELx_FSC_PERM && pfn_valid(pte) && kvm_has_mte(kvm)) {
Makes sense? It is worth checking whether kvm_is_device_pfn() can be
removed entirely.
Jason