From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com
Cc: maz@kernel.org, oliver.upton@linux.dev, catalin.marinas@arm.com,
will@kernel.org, aniketa@nvidia.com, cjia@nvidia.com,
kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com,
acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
danw@nvidia.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] KVM: arm64: determine memory type from VMA
Date: Thu, 7 Sep 2023 16:12:49 -0300 [thread overview]
Message-ID: <ZPogsU6YihY2+qR6@nvidia.com> (raw)
In-Reply-To: <20230907181459.18145-2-ankita@nvidia.com>
On Thu, Sep 07, 2023 at 11:14:58AM -0700, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> Currently KVM determines if a VMA is pointing at IO memory by checking
> pfn_is_map_memory(). However, the MM already gives us a way to tell what
> kind of memory it is by inspecting the VMA.
>
> Replace pfn_is_map_memory() with a check on the VMA pgprot to determine if
> the memory is IO and thus needs stage-2 device mapping.
>
> The VMA's pgprot is tested to determine the memory type with the
> following mapping:
>
> pgprot_noncached MT_DEVICE_nGnRnE device
> pgprot_writecombine MT_NORMAL_NC device
> pgprot_device MT_DEVICE_nGnRE device
> pgprot_tagged MT_NORMAL_TAGGED RAM
>
> This patch solves a problems where it is possible for the kernel to
> have VMAs pointing at cachable memory without causing
> pfn_is_map_memory() to be true, eg DAX memremap cases and CXL/pre-CXL
> devices. This memory is now properly marked as cachable in KVM.
>
> Unfortunately when FWB is not enabled, the kernel expects to naively do
> cache management by flushing the memory using an address in the
> kernel's map. This does not work in several of the newly allowed
> cases such as dcache_clean_inval_poc(). Check whether the targeted pfn
> and its mapping KVA is valid in case the FWB is absent before
> continuing.
Looking at this more closely, it relies on kvm_pte_follow() which
ultimately calls __va(), and the ARM 64 version of page_to_virt() is:
#define page_to_virt(x) ({ \
__typeof__(x) __page = x; \
void *__addr = __va(page_to_phys(__page)); \
(void *)__tag_set((const void *)__addr, page_kasan_tag(__page));\
})
So we don't actually have an issue here, anything with a struct page
will be properly handled by KVM.
Thus this can just be:
if (!stage2_has_fwb(pgt) && !pfn_valid(pfn)))
Then the last paragraph of the commit message is:
As cachable vs device memory is now determined by the VMA adjust
the other checks to match their purpose. In most cases the code needs
to know if there is a struct page, or if the memory is in the kernel
map and pfn_valid() is an appropriate API for this.
Note when FWB is not enabled, the kernel expects to trivially do
cache management by flushing the memory by linearly converting a
kvm_pte to phys_addr to a KVA, see kvm_flush_dcache_to_poc(). This is
only possibile for struct page backed memory. Do not allow non-struct
page memory to be cachable without FWB.
> @@ -1490,6 +1499,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> gfn = fault_ipa >> PAGE_SHIFT;
> mte_allowed = kvm_vma_mte_allowed(vma);
>
> + /*
> + * Figure out the memory type based on the user va mapping properties
> + * Only MT_DEVICE_nGnRE and MT_DEVICE_nGnRnE will be set using
> + * pgprot_device() and pgprot_noncached() respectively.
> + */
> + if ((mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRE) ||
> + (mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRnE) ||
> + (mapping_type(vma->vm_page_prot) == MT_NORMAL_NC))
> + prot |= KVM_PGTABLE_PROT_DEVICE;
> + else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
> + prot |= KVM_PGTABLE_PROT_X;
> +
> /* Don't use the VMA after the unlock -- it may have vanished */
> vma = NULL;
>
> @@ -1576,10 +1597,21 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> if (exec_fault)
> prot |= KVM_PGTABLE_PROT_X;
You still didn't remove the kvm_is_device_pfn() check from this code,
I don't think it can really stay and make any sense..
Probably this:
if (exec_fault && (prot & KVM_PGTABLE_PROT_DEVICE))
return -ENOEXEC;
And these two should also be pfn_valid() [precompute pfn_valid]
if (vma_pagesize == PAGE_SIZE && !(force_pte || !pfn_valid(pte))) {
if (fault_status != ESR_ELx_FSC_PERM && pfn_valid(pte) && kvm_has_mte(kvm)) {
Makes sense?
Check if kvm_is_device_pfn() can be removed entirely.
Jason
WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com
Cc: maz@kernel.org, oliver.upton@linux.dev, catalin.marinas@arm.com,
will@kernel.org, aniketa@nvidia.com, cjia@nvidia.com,
kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com,
acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
danw@nvidia.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] KVM: arm64: determine memory type from VMA
Date: Thu, 7 Sep 2023 16:12:49 -0300 [thread overview]
Message-ID: <ZPogsU6YihY2+qR6@nvidia.com> (raw)
In-Reply-To: <20230907181459.18145-2-ankita@nvidia.com>
On Thu, Sep 07, 2023 at 11:14:58AM -0700, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> Currently KVM determines if a VMA is pointing at IO memory by checking
> pfn_is_map_memory(). However, the MM already gives us a way to tell what
> kind of memory it is by inspecting the VMA.
>
> Replace pfn_is_map_memory() with a check on the VMA pgprot to determine if
> the memory is IO and thus needs stage-2 device mapping.
>
> The VMA's pgprot is tested to determine the memory type with the
> following mapping:
>
> pgprot_noncached MT_DEVICE_nGnRnE device
> pgprot_writecombine MT_NORMAL_NC device
> pgprot_device MT_DEVICE_nGnRE device
> pgprot_tagged MT_NORMAL_TAGGED RAM
>
> This patch solves a problems where it is possible for the kernel to
> have VMAs pointing at cachable memory without causing
> pfn_is_map_memory() to be true, eg DAX memremap cases and CXL/pre-CXL
> devices. This memory is now properly marked as cachable in KVM.
>
> Unfortunately when FWB is not enabled, the kernel expects to naively do
> cache management by flushing the memory using an address in the
> kernel's map. This does not work in several of the newly allowed
> cases such as dcache_clean_inval_poc(). Check whether the targeted pfn
> and its mapping KVA is valid in case the FWB is absent before
> continuing.
Looking at this more closely, it relies on kvm_pte_follow() which
ultimately calls __va(), and the ARM 64 version of page_to_virt() is:
#define page_to_virt(x) ({ \
__typeof__(x) __page = x; \
void *__addr = __va(page_to_phys(__page)); \
(void *)__tag_set((const void *)__addr, page_kasan_tag(__page));\
})
So we don't actually have an issue here, anything with a struct page
will be properly handled by KVM.
Thus this can just be:
if (!stage2_has_fwb(pgt) && !pfn_valid(pfn)))
Then the last paragraph of the commit message is:
As cachable vs device memory is now determined by the VMA adjust
the other checks to match their purpose. In most cases the code needs
to know if there is a struct page, or if the memory is in the kernel
map and pfn_valid() is an appropriate API for this.
Note when FWB is not enabled, the kernel expects to trivially do
cache management by flushing the memory by linearly converting a
kvm_pte to phys_addr to a KVA, see kvm_flush_dcache_to_poc(). This is
only possibile for struct page backed memory. Do not allow non-struct
page memory to be cachable without FWB.
> @@ -1490,6 +1499,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> gfn = fault_ipa >> PAGE_SHIFT;
> mte_allowed = kvm_vma_mte_allowed(vma);
>
> + /*
> + * Figure out the memory type based on the user va mapping properties
> + * Only MT_DEVICE_nGnRE and MT_DEVICE_nGnRnE will be set using
> + * pgprot_device() and pgprot_noncached() respectively.
> + */
> + if ((mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRE) ||
> + (mapping_type(vma->vm_page_prot) == MT_DEVICE_nGnRnE) ||
> + (mapping_type(vma->vm_page_prot) == MT_NORMAL_NC))
> + prot |= KVM_PGTABLE_PROT_DEVICE;
> + else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
> + prot |= KVM_PGTABLE_PROT_X;
> +
> /* Don't use the VMA after the unlock -- it may have vanished */
> vma = NULL;
>
> @@ -1576,10 +1597,21 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> if (exec_fault)
> prot |= KVM_PGTABLE_PROT_X;
You still didn't remove the kvm_is_device_pfn() check from this code,
I don't think it can really stay and make any sense..
Probably this:
if (exec_fault && (prot & KVM_PGTABLE_PROT_DEVICE))
return -ENOEXEC;
And these two should also be pfn_valid() [precompute pfn_valid]
if (vma_pagesize == PAGE_SIZE && !(force_pte || !pfn_valid(pte))) {
if (fault_status != ESR_ELx_FSC_PERM && pfn_valid(pte) && kvm_has_mte(kvm)) {
Makes sense?
Check if kvm_is_device_pfn() can be removed entirely.
Jason
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-09-07 19:12 UTC|newest]
Thread overview: 110+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-07 18:14 [PATCH v1 0/2] KVM: arm64: support write combining and cachable IO memory in VMs ankita
2023-09-07 18:14 ` ankita
2023-09-07 18:14 ` [PATCH v1 1/2] KVM: arm64: determine memory type from VMA ankita
2023-09-07 18:14 ` ankita
2023-09-07 19:12 ` Jason Gunthorpe [this message]
2023-09-07 19:12 ` Jason Gunthorpe
2023-10-05 16:15 ` Catalin Marinas
2023-10-05 16:15 ` Catalin Marinas
2023-10-05 16:54 ` Jason Gunthorpe
2023-10-05 16:54 ` Jason Gunthorpe
2023-10-10 14:25 ` Catalin Marinas
2023-10-10 14:25 ` Catalin Marinas
2023-10-10 15:05 ` Jason Gunthorpe
2023-10-10 15:05 ` Jason Gunthorpe
2023-10-10 17:19 ` Catalin Marinas
2023-10-10 17:19 ` Catalin Marinas
2023-10-10 18:23 ` Jason Gunthorpe
2023-10-10 18:23 ` Jason Gunthorpe
2023-10-11 17:45 ` Catalin Marinas
2023-10-11 17:45 ` Catalin Marinas
2023-10-11 18:38 ` Jason Gunthorpe
2023-10-11 18:38 ` Jason Gunthorpe
2023-10-12 16:16 ` Catalin Marinas
2023-10-12 16:16 ` Catalin Marinas
2024-03-10 3:49 ` Ankit Agrawal
2024-03-10 3:49 ` Ankit Agrawal
2024-03-19 13:38 ` Jason Gunthorpe
2024-03-19 13:38 ` Jason Gunthorpe
2023-10-23 13:20 ` Shameerali Kolothum Thodi
2023-10-23 13:20 ` Shameerali Kolothum Thodi
2023-09-07 18:14 ` [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory ankita
2023-09-07 18:14 ` ankita
2023-09-08 16:40 ` Catalin Marinas
2023-09-08 16:40 ` Catalin Marinas
2023-09-11 14:57 ` Lorenzo Pieralisi
2023-09-11 14:57 ` Lorenzo Pieralisi
2023-09-11 17:20 ` Jason Gunthorpe
2023-09-11 17:20 ` Jason Gunthorpe
2023-09-13 15:26 ` Lorenzo Pieralisi
2023-09-13 15:26 ` Lorenzo Pieralisi
2023-09-13 18:54 ` Jason Gunthorpe
2023-09-13 18:54 ` Jason Gunthorpe
2023-09-26 8:31 ` Lorenzo Pieralisi
2023-09-26 8:31 ` Lorenzo Pieralisi
2023-09-26 12:25 ` Jason Gunthorpe
2023-09-26 12:25 ` Jason Gunthorpe
2023-09-26 13:52 ` Catalin Marinas
2023-09-26 13:52 ` Catalin Marinas
2023-09-26 16:12 ` Lorenzo Pieralisi
2023-09-26 16:12 ` Lorenzo Pieralisi
2023-10-05 9:56 ` Lorenzo Pieralisi
2023-10-05 9:56 ` Lorenzo Pieralisi
2023-10-05 11:56 ` Jason Gunthorpe
2023-10-05 11:56 ` Jason Gunthorpe
2023-10-05 14:08 ` Lorenzo Pieralisi
2023-10-05 14:08 ` Lorenzo Pieralisi
2023-10-12 12:35 ` Will Deacon
2023-10-12 12:35 ` Will Deacon
2023-10-12 13:20 ` Jason Gunthorpe
2023-10-12 13:20 ` Jason Gunthorpe
2023-10-12 14:29 ` Lorenzo Pieralisi
2023-10-12 14:29 ` Lorenzo Pieralisi
2023-10-12 13:53 ` Catalin Marinas
2023-10-12 13:53 ` Catalin Marinas
2023-10-12 14:48 ` Will Deacon
2023-10-12 14:48 ` Will Deacon
2023-10-12 15:44 ` Jason Gunthorpe
2023-10-12 15:44 ` Jason Gunthorpe
2023-10-12 16:39 ` Will Deacon
2023-10-12 16:39 ` Will Deacon
2023-10-12 18:36 ` Jason Gunthorpe
2023-10-12 18:36 ` Jason Gunthorpe
2023-10-13 9:29 ` Will Deacon
2023-10-13 9:29 ` Will Deacon
2023-10-12 17:26 ` Catalin Marinas
2023-10-12 17:26 ` Catalin Marinas
2023-10-13 9:29 ` Will Deacon
2023-10-13 9:29 ` Will Deacon
2023-10-13 13:08 ` Catalin Marinas
2023-10-13 13:08 ` Catalin Marinas
2023-10-13 13:45 ` Jason Gunthorpe
2023-10-13 13:45 ` Jason Gunthorpe
2023-10-19 11:07 ` Catalin Marinas
2023-10-19 11:07 ` Catalin Marinas
2023-10-19 11:51 ` Jason Gunthorpe
2023-10-19 11:51 ` Jason Gunthorpe
2023-10-20 11:21 ` Catalin Marinas
2023-10-20 11:21 ` Catalin Marinas
2023-10-20 11:47 ` Jason Gunthorpe
2023-10-20 11:47 ` Jason Gunthorpe
2023-10-20 14:03 ` Lorenzo Pieralisi
2023-10-20 14:03 ` Lorenzo Pieralisi
2023-10-20 14:28 ` Jason Gunthorpe
2023-10-20 14:28 ` Jason Gunthorpe
2023-10-19 13:35 ` Lorenzo Pieralisi
2023-10-19 13:35 ` Lorenzo Pieralisi
2023-10-13 15:28 ` Lorenzo Pieralisi
2023-10-13 15:28 ` Lorenzo Pieralisi
2023-10-19 11:12 ` Catalin Marinas
2023-10-19 11:12 ` Catalin Marinas
2023-11-09 15:34 ` Lorenzo Pieralisi
2023-11-09 15:34 ` Lorenzo Pieralisi
2023-11-10 14:26 ` Jason Gunthorpe
2023-11-10 14:26 ` Jason Gunthorpe
2023-11-13 0:42 ` Lorenzo Pieralisi
2023-11-13 0:42 ` Lorenzo Pieralisi
2023-11-13 17:41 ` Catalin Marinas
2023-11-13 17:41 ` Catalin Marinas
2023-10-12 12:27 ` Will Deacon
2023-10-12 12:27 ` Will Deacon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZPogsU6YihY2+qR6@nvidia.com \
--to=jgg@nvidia.com \
--cc=acurrid@nvidia.com \
--cc=aniketa@nvidia.com \
--cc=ankita@nvidia.com \
--cc=apopple@nvidia.com \
--cc=catalin.marinas@arm.com \
--cc=cjia@nvidia.com \
--cc=danw@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=kvmarm@lists.linux.dev \
--cc=kwankhede@nvidia.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=targupta@nvidia.com \
--cc=vsethi@nvidia.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.