* Re: [PATCH net-next] net: mana: Add MAC address to vPort logs and clarify error messages
From: Erni Sri Satya Vennela @ 2026-01-14 5:42 UTC (permalink / raw)
To: Hariprasad Kelam
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, dipayanroy, ssengar, shirazsaleem,
shradhagupta, gargaditya, linux-hyperv, netdev, linux-kernel
In-Reply-To: <aWXqMC3C4rcdKjD0@test-OptiPlex-Tower-Plus-7010>
On Tue, Jan 13, 2026 at 12:16:08PM +0530, Hariprasad Kelam wrote:
> On 2026-01-13 at 10:54:58, Erni Sri Satya Vennela (ernis@linux.microsoft.com) wrote:
> > Add MAC address to vPort configuration success message and update error
> > message to be more specific about HWC message errors in
> > mana_send_request.
> >
> > Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> > Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> > drivers/net/ethernet/microsoft/mana/hw_channel.c | 12 +++++++-----
> > drivers/net/ethernet/microsoft/mana/mana_en.c | 8 ++++----
> > 2 files changed, 11 insertions(+), 9 deletions(-)
> >
> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
>
Thanks for the reviews. Based on additional feedback, I will be
preparing a v2 of this patch with further changes.
Kindly hold off on merging this version.
- Vennela
^ permalink raw reply
* RE: [RFC v1 5/5] iommu/hyperv: Add para-virtualized IOMMU support for Hyper-V guest
From: Michael Kelley @ 2026-01-14 15:43 UTC (permalink / raw)
To: Jacob Pan
Cc: Yu Zhang, linux-kernel@vger.kernel.org,
linux-hyperv@vger.kernel.org, iommu@lists.linux.dev,
linux-pci@vger.kernel.org, kys@microsoft.com,
haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org,
robh@kernel.org, bhelgaas@google.com, arnd@arndb.de,
joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
easwar.hariharan@linux.microsoft.com,
nunodasneves@linux.microsoft.com, mrathor@linux.microsoft.com,
peterz@infradead.org, linux-arch@vger.kernel.org
In-Reply-To: <20260113092940.000050b8@linux.microsoft.com>
From: Jacob Pan <jacob.pan@linux.microsoft.com> Sent: Tuesday, January 13, 2026 9:30 AM
>
> Hi Michael,
>
> On Mon, 12 Jan 2026 17:48:30 +0000
> Michael Kelley <mhklinux@outlook.com> wrote:
> >
> > From: Yu Zhang <zhangyu1@linux.microsoft.com> Sent: Monday, January 12, 2026 8:56 AM
> > >
> > > On Thu, Jan 08, 2026 at 06:48:59PM +0000, Michael Kelley wrote:
> > > > From: Yu Zhang <zhangyu1@linux.microsoft.com> Sent: Monday, December 8, 2025 9:11 PM
> > >
> > > <snip>
> > > Thank you so much, Michael, for the thorough review!
> > >
> > > I've snipped some comments I fully agree with and will address in
> > > next version. Actually, I have to admit I agree with your remaining
> > > comments below as well. :)
> > >
> > > > > +struct hv_iommu_dev *hv_iommu_device;
> > > > > +static struct hv_iommu_domain hv_identity_domain;
> > > > > +static struct hv_iommu_domain hv_blocking_domain;
> > > >
> > > > Why is hv_iommu_device allocated dynamically while the two
> > > > domains are allocated statically? Seems like the approach could
> > > > be consistent, though maybe there's some reason I'm missing.
> > > >
> > >
> > > On second thought, `hv_identity_domain` and `hv_blocking_domain`
> > > should likely be allocated dynamically as well, consistent with
> > > `hv_iommu_device`.
> >
> > I don't know if there's a strong rationale either way (static
> > allocation vs. dynamic). If the long-term expectation is that there
> > is never more than one PV IOMMU in a guest, then static would be OK.
> > If future direction allows that there could be multiple PV IOMMUs in
> > a guest, then doing dynamic from the start is justifiable (though the
> > current PV IOMMU hypercalls seem to assume only one PV IOMMU). But
> > either way, being consistent is desirable.
> >
> I believe we only need a single global static identity domain here
> regardless of how many vIOMMUs there may be. From the guest’s perspective,
> the hvIOMMU only supports hardware‑passthrough identity domains, which
> do not maintain any per‑IOMMU state, i.e., there is no S1 IO page table
> based identity domain.
Ah yes, that makes sense. With that understanding, keeping the
identity domain as a static singleton would be fine. Leave a code
comment with a short explanation.
Michael
>
> The expectation of physical IOMMU settings for guest identity
> domain should be as follows:
> - Intel vtd PASID entry PGTT = 010b (Second-stage Translation only)
> - AMD DTE TV=1; GV=0
>
> > >
> > > <snip>
> > >
> > > > > +static void hv_iommu_shutdown(void)
> > > > > +{
> > > > > + iommu_device_sysfs_remove(&hv_iommu_device->iommu);
> > > > > +
> > > > > + kfree(hv_iommu_device);
> > > > > +}
> > > > > +
> > > > > +static struct syscore_ops hv_iommu_syscore_ops = {
> > > > > + .shutdown = hv_iommu_shutdown,
> > > > > +};
> > [...]
> > >
> > > For iommu_device_sysfs_remove(), I guess they are not necessary, and
> > > I will need to do some homework to better understand the sysfs. :)
> > > Originally, we wanted a shutdown routine to trigger some hypercall,
> > > so that Hyper-V will disable the DMA translation, e.g., during the
> > > VM reboot process.
> >
> > I would presume that if Hyper-V reboots the VM, Hyper-V automatically
> > resets the PV IOMMU and prevents any further DMA operations. But
> > consider kexec(), where a new kernel gets loaded without going through
> > the hypervisor "reboot-this-VM" path. There have been problems in the
> > past with kexec() where parts of Hyper-V state for the guest didn't
> > get reset, and the PV IOMMU is likely something in that category. So
> > there may indeed be a need to tell the hypervisor to reset everything
> > related to the PV IOMMU. There are already functions to do Hyper-V
> > cleanup: see vmbus_initiate_unload() and hyperv_cleanup(). These
> > existing functions may be a better place to do PV IOMMU cleanup/reset
> > if needed.
> That would be my vote also.
^ permalink raw reply
* Re: [PATCH net-next v14 01/12] vsock: add netns to vsock core
From: Stefano Garzarella @ 2026-01-14 15:54 UTC (permalink / raw)
To: Bobby Eshleman
Cc: kernel test robot, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Stefan Hajnoczi, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, Xuan Zhuo, K. Y. Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Bryan Tan, Vishnu Dasa,
Broadcom internal kernel review list, Shuah Khan, Long Li,
oe-kbuild-all, netdev, linux-kernel, virtualization, kvm,
linux-hyperv, linux-kselftest, berrange, Sargun Dhillon
In-Reply-To: <202601140749.5TXm5gpl-lkp@intel.com>
On Wed, 14 Jan 2026 at 00:13, kernel test robot <lkp@intel.com> wrote:
>
> Hi Bobby,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on net-next/main]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Bobby-Eshleman/virtio-set-skb-owner-of-virtio_transport_reset_no_sock-reply/20260113-125559
> base: net-next/main
> patch link: https://lore.kernel.org/r/20260112-vsock-vmtest-v14-1-a5c332db3e2b%40meta.com
> patch subject: [PATCH net-next v14 01/12] vsock: add netns to vsock core
> config: x86_64-buildonly-randconfig-004-20260113 (https://download.01.org/0day-ci/archive/20260114/202601140749.5TXm5gpl-lkp@intel.com/config)
> compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260114/202601140749.5TXm5gpl-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202601140749.5TXm5gpl-lkp@intel.com/
>
> All warnings (new ones prefixed by >>, old ones prefixed by <<):
>
> >> WARNING: modpost: net/vmw_vsock/vsock: section mismatch in reference: vsock_exit+0x25 (section: .exit.text) -> vsock_sysctl_ops (section: .init.data)
Bobby can you check this report?
Could it be related to the `__net_initdata` annotation on `vsock_sysctl_ops`?
Why do we need that?
Thanks,
Stefano
^ permalink raw reply
* [PATCH 1/1] mshv: Store the result of vfs_poll in a variable of type __poll_t
From: mhkelley58 @ 2026-01-14 17:01 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, linux-hyperv; +Cc: linux-kernel
From: Michael Kelley <mhklinux@outlook.com>
vfs_poll() returns a result of type __poll_t, but current code is using
an "unsigned int" local variable. The difference is that __poll_t carries
the "bitwise" attribute. This attribute is not interpreted by the C
compiler; it is only used by 'sparse' to flag incorrect usage of the
return value. The return value is used correctly here, so there's no
bug, but sparse complains about the type mismatch.
In the interest of general correctness and to avoid noise from sparse,
change the local variable to type __poll_t. No functional change.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
This change is not marked with a Fixes: tag as there's no value in
backporting to older stable releases.
drivers/hv/mshv_eventfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hv/mshv_eventfd.c b/drivers/hv/mshv_eventfd.c
index d93a18f09c76..0b75ff1edb73 100644
--- a/drivers/hv/mshv_eventfd.c
+++ b/drivers/hv/mshv_eventfd.c
@@ -388,7 +388,7 @@ static int mshv_irqfd_assign(struct mshv_partition *pt,
{
struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
struct mshv_irqfd *irqfd, *tmp;
- unsigned int events;
+ __poll_t events;
int ret;
int idx;
--
2.25.1
^ permalink raw reply related
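The "bitwise" point in the patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel's own headers: `demo_poll_t`, `DEMO_POLLIN`, `DEMO_POLLHUP`, `demo_poll()` and `demo_is_readable()` are hypothetical stand-ins for `__poll_t`, the EPOLL* masks and `vfs_poll()`. It assumes the usual kernel pattern in which the annotations expand to attributes only when sparse (which defines `__CHECKER__`) parses the file, and to nothing for a normal compiler.

```c
#include <assert.h>

/*
 * Assumption: same pattern the kernel uses. The attributes exist only
 * when sparse parses the file; the C compiler sees empty macros.
 */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#define __force   __attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Hypothetical stand-in for __poll_t. */
typedef unsigned int __bitwise demo_poll_t;

/* Hypothetical event masks, built with __force like the kernel's EPOLL* values. */
#define DEMO_POLLIN  ((__force demo_poll_t)0x0001u)
#define DEMO_POLLHUP ((__force demo_poll_t)0x0010u)

/* To the compiler the annotated type is a plain unsigned int. */
_Static_assert(sizeof(demo_poll_t) == sizeof(unsigned int),
	       "the annotation does not change the type's layout");

/* Hypothetical stand-in for vfs_poll(): returns a mask of the annotated type. */
static demo_poll_t demo_poll(int readable, int hung_up)
{
	demo_poll_t mask = 0;

	if (readable)
		mask |= DEMO_POLLIN;
	if (hung_up)
		mask |= DEMO_POLLHUP;
	return mask;
}

static int demo_is_readable(int readable, int hung_up)
{
	/*
	 * Declaring this as "unsigned int events" would generate identical
	 * machine code, but sparse would report a type mismatch on the
	 * assignment. Using the annotated type keeps sparse quiet.
	 */
	demo_poll_t events = demo_poll(readable, hung_up);

	return (events & DEMO_POLLIN) != 0;
}
```

Built with an ordinary compiler, the sketch behaves the same whichever type the local variable uses, which matches the patch's "No functional change" statement.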
* [PATCH 1/1] mshv: Add __user attribute to argument passed to access_ok()
From: mhkelley58 @ 2026-01-14 18:15 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, linux-hyperv; +Cc: linux-kernel
From: Michael Kelley <mhklinux@outlook.com>
access_ok() expects its first argument to have the __user attribute
since it is checking access to user space. Current code passes an
argument that lacks that attribute, resulting in 'sparse' flagging
the incorrect usage. However, the compiler doesn't generate code
based on the attribute, so there's no actual bug.
In the interest of general correctness and to avoid noise from sparse,
add the __user attribute. No functional change.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
drivers/hv/mshv_root_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index eff1b21461dc..5673af9fe101 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -1280,7 +1280,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
long ret;
if (mem.flags & BIT(MSHV_SET_MEM_BIT_UNMAP) ||
- !access_ok((const void *)mem.userspace_addr, mem.size))
+ !access_ok((const void __user *)mem.userspace_addr, mem.size))
return -EINVAL;
mmap_read_lock(current->mm);
--
2.25.1
^ permalink raw reply related
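The role of `__user` in the patch above can be sketched in userspace as well. This is an illustration under stated assumptions, not the kernel's implementation: `demo_access_ok()`, `demo_check()` and `DEMO_USER_LIMIT` are hypothetical, the range check is a simplified stand-in for what `access_ok()` does, and the `__user` definition only approximates the kernel's sparse-only annotation. A 64-bit build is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: roughly the kernel's pattern. Only sparse (which defines
 * __CHECKER__) sees an address-space attribute; the compiler sees nothing.
 */
#ifdef __CHECKER__
#define __user __attribute__((noderef, address_space(1)))
#else
#define __user
#endif

/* Hypothetical upper bound of the user address range (illustration only). */
#define DEMO_USER_LIMIT ((uint64_t)1 << 47)

/* Simplified stand-in for access_ok(): is [addr, addr + size) below the limit? */
static int demo_access_ok(const void __user *addr, uint64_t size)
{
	uint64_t start = (uint64_t)(uintptr_t)addr;

	/* Written so that start + size cannot overflow. */
	return size <= DEMO_USER_LIMIT && start <= DEMO_USER_LIMIT - size;
}

/* Mirrors the patched call site: the integer address is cast to a __user pointer. */
static int demo_check(uint64_t userspace_addr, uint64_t size)
{
	return demo_access_ok((const void __user *)(uintptr_t)userspace_addr, size);
}
```

Dropping `__user` from the cast in `demo_check()` would not change the compiled code; only sparse would notice the difference, which is the situation the commit message describes.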
* RE: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add support for coalesced RX packets on CQE
From: Haiyang Zhang @ 2026-01-14 18:27 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Haiyang Zhang, linux-hyperv@vger.kernel.org,
netdev@vger.kernel.org, KY Srinivasan, Wei Liu, Dexuan Cui,
Long Li, Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Konstantin Taranov, Simon Horman, Erni Sri Satya Vennela,
Shradha Gupta, Saurabh Sengar, Aditya Garg, Dipayaan Roy,
Shiraz Saleem, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, Paul Rosswurm
In-Reply-To: <20260113170948.1d6fbdaf@kernel.org>
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, January 13, 2026 8:10 PM
>
> On Tue, 13 Jan 2026 15:13:24 +0000 Haiyang Zhang wrote:
> > > > I get that. What is the logic for combining 4 packets into a single
> > > > completion? How does it work? Your commit message mentions
> > > > "regression on latency" - what is the bound on that regression?
> > >
> > > When we received CQE type CQE_RX_COALESCED_4, it's a coalesced CQE.
> > > And in the CQE OOB, there is an array with 4 PPI elements, with each
> > > pkt's length: oob->ppi[i].pkt_len.
> > >
> > > So we read the related WQE and the DMA buffers for the RX pkt
> > > payloads, up to 4.
> > > But, if the coalesced pkts <4, the pkt_len will be 0 after the last
> > > pkt, so we know when to stop reading the WQEs.
> >
> > And, the coalescing can add up to 2 microseconds into one-way latency.
>
> I am asking you how the _device_ (hypervisor?) decides when to coalesce
> and when to send a partial CQE (<4 packets in 4 pkt CQE). You are using
> the coalescing uAPI, so I'm trying to make sure this is the correct API.
> CQE configuration can also be done via ringparam.
When coalescing is enabled, the device waits for packets whose CQE can
be coalesced with that of the previous packet(s). The coalescing process
finishes (and a CQE is written to the appropriate CQ) when the CQE is
filled with 4 pkts, or a timeout expires, or other device-specific logic
is satisfied.
Thanks,
- Haiyang
^ permalink raw reply
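The receive-side parsing described in this thread (up to 4 per-packet lengths in the CQE OOB, with a zero length marking the end of a partial CQE) can be sketched as follows. This is a standalone illustration under stated assumptions, not the driver code: `struct demo_ppi`, `struct demo_rx_oob` and `demo_count_coalesced()` are hypothetical, and only the `ppi[i].pkt_len` layout is taken from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_COALESCED 4	/* a coalesced CQE carries at most 4 packets */

/* Hypothetical per-packet info; only pkt_len is taken from the thread. */
struct demo_ppi {
	uint32_t pkt_len;
};

/* Hypothetical, reduced view of the CQE out-of-band data. */
struct demo_rx_oob {
	struct demo_ppi ppi[DEMO_MAX_COALESCED];
};

/*
 * Count the packets described by one coalesced CQE and sum their lengths.
 * A pkt_len of 0 means "no more packets", so a partial CQE (fewer than 4
 * packets) stops the walk early.
 */
static size_t demo_count_coalesced(const struct demo_rx_oob *oob,
				   uint64_t *total_len)
{
	size_t n = 0;

	assert(oob && total_len);
	*total_len = 0;

	for (size_t i = 0; i < DEMO_MAX_COALESCED; i++) {
		if (oob->ppi[i].pkt_len == 0)
			break;	/* end of the coalesced batch */
		*total_len += oob->ppi[i].pkt_len;
		n++;		/* a real driver would consume one WQE/buffer here */
	}
	return n;
}
```

The sketch covers only how the host side knows when to stop reading. When the device decides to close a batch (4 packets, timeout, or other device-specific logic) is not visible in this loop.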
* Re: [PATCH 1/1] mshv: Add __user attribute to argument passed to access_ok()
From: Nuno Das Neves @ 2026-01-14 18:39 UTC (permalink / raw)
To: mhklinux, kys, haiyangz, wei.liu, decui, longli, linux-hyperv
Cc: linux-kernel
In-Reply-To: <20260114181508.143564-1-mhklinux@outlook.com>
On 1/14/2026 10:15 AM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
>
> access_ok() expects its first argument to have the __user attribute
> since it is checking access to user space. Current code passes an
> argument that lacks that attribute, resulting in 'sparse' flagging
> the incorrect usage. However, the compiler doesn't generate code
> based on the attribute, so there's no actual bug.
>
> In the interest of general correctness and to avoid noise from sparse,
> add the __user attribute. No functional change.
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> drivers/hv/mshv_root_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index eff1b21461dc..5673af9fe101 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -1280,7 +1280,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> long ret;
>
> if (mem.flags & BIT(MSHV_SET_MEM_BIT_UNMAP) ||
> - !access_ok((const void *)mem.userspace_addr, mem.size))
> + !access_ok((const void __user *)mem.userspace_addr, mem.size))
> return -EINVAL;
>
> mmap_read_lock(current->mm);
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
^ permalink raw reply
* Re: [PATCH 1/1] mshv: Store the result of vfs_poll in a variable of type __poll_t
From: Nuno Das Neves @ 2026-01-14 18:40 UTC (permalink / raw)
To: mhklinux, kys, haiyangz, wei.liu, decui, longli, linux-hyperv
Cc: linux-kernel
In-Reply-To: <20260114170112.102673-1-mhklinux@outlook.com>
On 1/14/2026 9:01 AM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
>
> vfs_poll() returns a result of type __poll_t, but current code is using
> an "unsigned int" local variable. The difference is that __poll_t carries
> the "bitwise" attribute. This attribute is not interpreted by the C
> compiler; it is only used by 'sparse' to flag incorrect usage of the
> return value. The return value is used correctly here, so there's no
> bug, but sparse complains about the type mismatch.
>
> In the interest of general correctness and to avoid noise from sparse,
> change the local variable to type __poll_t. No functional change.
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> This change is not marked with a Fixes: tag as there's no value in
> backporting to older stable releases.
>
> drivers/hv/mshv_eventfd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hv/mshv_eventfd.c b/drivers/hv/mshv_eventfd.c
> index d93a18f09c76..0b75ff1edb73 100644
> --- a/drivers/hv/mshv_eventfd.c
> +++ b/drivers/hv/mshv_eventfd.c
> @@ -388,7 +388,7 @@ static int mshv_irqfd_assign(struct mshv_partition *pt,
> {
> struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
> struct mshv_irqfd *irqfd, *tmp;
> - unsigned int events;
> + __poll_t events;
> int ret;
> int idx;
>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
^ permalink raw reply
* Re: [PATCH net-next v14 01/12] vsock: add netns to vsock core
From: Bobby Eshleman @ 2026-01-14 17:21 UTC (permalink / raw)
To: Stefano Garzarella
Cc: kernel test robot, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Stefan Hajnoczi, Michael S. Tsirkin,
Jason Wang, Eugenio Pérez, Xuan Zhuo, K. Y. Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Bryan Tan, Vishnu Dasa,
Broadcom internal kernel review list, Shuah Khan, Long Li,
oe-kbuild-all, netdev, linux-kernel, virtualization, kvm,
linux-hyperv, linux-kselftest, berrange, Sargun Dhillon
In-Reply-To: <CAGxU2F45q7CWy3O_QhYj0Y2Bt84vA=eaTeBTu+TvEmFm0_E7Jw@mail.gmail.com>
On Wed, Jan 14, 2026 at 04:54:15PM +0100, Stefano Garzarella wrote:
> On Wed, 14 Jan 2026 at 00:13, kernel test robot <lkp@intel.com> wrote:
> >
> > Hi Bobby,
> >
> > kernel test robot noticed the following build warnings:
> >
> > [auto build test WARNING on net-next/main]
> >
> > url: https://github.com/intel-lab-lkp/linux/commits/Bobby-Eshleman/virtio-set-skb-owner-of-virtio_transport_reset_no_sock-reply/20260113-125559
> > base: net-next/main
> > patch link: https://lore.kernel.org/r/20260112-vsock-vmtest-v14-1-a5c332db3e2b%40meta.com
> > patch subject: [PATCH net-next v14 01/12] vsock: add netns to vsock core
> > config: x86_64-buildonly-randconfig-004-20260113 (https://download.01.org/0day-ci/archive/20260114/202601140749.5TXm5gpl-lkp@intel.com/config)
> > compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260114/202601140749.5TXm5gpl-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202601140749.5TXm5gpl-lkp@intel.com/
> >
> > All warnings (new ones prefixed by >>, old ones prefixed by <<):
> >
> > >> WARNING: modpost: net/vmw_vsock/vsock: section mismatch in reference: vsock_exit+0x25 (section: .exit.text) -> vsock_sysctl_ops (section: .init.data)
>
> Bobby can you check this report?
>
> Could it be related to the `__net_initdata` annotation on `vsock_sysctl_ops`?
> Why do we need that?
>
> Thanks,
> Stefano
>
Yep, no problem.
Best,
Bobby
^ permalink raw reply
* Re: [PATCH] mshv: make certain field names descriptive in a header struct
From: Mukesh R @ 2026-01-14 19:27 UTC (permalink / raw)
To: Stanislav Kinsburskii; +Cc: linux-hyperv, wei.liu, nunodasneves
In-Reply-To: <aWbmJPkrJyICk4Rh@skinsburskii.localdomain>
On 1/13/26 16:41, Stanislav Kinsburskii wrote:
> On Fri, Jan 09, 2026 at 12:06:11PM -0800, Mukesh Rathor wrote:
>> There is no functional change. Just make a couple of field names in
>> struct mshv_mem_region, in a header that can be used in many
>> places, a little more descriptive to make the code easier to read by
>> allowing better support for grep, cscope, etc.
>>
>> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
>> ---
>> drivers/hv/mshv_regions.c | 44 ++++++++++++++++++-------------------
>> drivers/hv/mshv_root.h | 6 ++---
>> drivers/hv/mshv_root_main.c | 10 ++++-----
>> 3 files changed, 30 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/hv/mshv_regions.c b/drivers/hv/mshv_regions.c
>> index 202b9d551e39..af81405f859b 100644
>> --- a/drivers/hv/mshv_regions.c
>> +++ b/drivers/hv/mshv_regions.c
>> @@ -52,7 +52,7 @@ static long mshv_region_process_chunk(struct mshv_mem_region *region,
>> struct page *page;
>> int ret;
>>
>> - page = region->pages[page_offset];
>> + page = region->mreg_pages[page_offset];
>
> What does "m" mean here - "mreg_pages"? Is it "memory region"?
> If so, then it's misleading, because the same region struct is used for
> the MMIO regions as well. Maybe "region_pages" would be better?
m is for memory or mmio.
> Also, while we're at it, maybe rename "mshv_mem_region" to "mshv_region" to reflect that?
Well, that turns out to be a much bigger change, so probably not worth it
now. mmio is "memory" mapped io, so we can probably live with it.
Thanks,
-Mukesh
.. snip ..
^ permalink raw reply
* [PATCH v3 1/6] mshv: Ignore second stats page map result failure
From: Nuno Das Neves @ 2026-01-14 21:37 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Older versions of the hypervisor do not have a concept of separate SELF
and PARENT stats areas. In this case, mapping the HV_STATS_AREA_SELF page
is sufficient - it's the only page and it contains all available stats.
Mapping HV_STATS_AREA_PARENT returns HV_STATUS_INVALID_PARAMETER which
currently causes module init to fail on older hypervisor versions.
Detect this case and gracefully fall back to populating
stats_pages[HV_STATS_AREA_PARENT] with the already-mapped SELF page.
Add comments to clarify the behavior, including a clarification of why
this isn't needed for hv_call_map_stats_page2() which always supports
PARENT and SELF areas.
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/mshv_root_hv_call.c | 52 +++++++++++++++++++++++++++++++---
drivers/hv/mshv_root_main.c | 3 ++
2 files changed, 51 insertions(+), 4 deletions(-)
diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
index 598eaff4ff29..1f93b94d7580 100644
--- a/drivers/hv/mshv_root_hv_call.c
+++ b/drivers/hv/mshv_root_hv_call.c
@@ -813,6 +813,13 @@ hv_call_notify_port_ring_empty(u32 sint_index)
return hv_result_to_errno(status);
}
+/*
+ * Equivalent of hv_call_map_stats_page() for cases when the caller provides
+ * the map location.
+ *
+ * NOTE: This is a newer hypercall that always supports SELF and PARENT stats
+ * areas, unlike hv_call_map_stats_page().
+ */
static int hv_call_map_stats_page2(enum hv_stats_object_type type,
const union hv_stats_object_identity *identity,
u64 map_location)
@@ -855,6 +862,34 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
return ret;
}
+static int
+hv_stats_get_area_type(enum hv_stats_object_type type,
+ const union hv_stats_object_identity *identity)
+{
+ switch (type) {
+ case HV_STATS_OBJECT_HYPERVISOR:
+ return identity->hv.stats_area_type;
+ case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
+ return identity->lp.stats_area_type;
+ case HV_STATS_OBJECT_PARTITION:
+ return identity->partition.stats_area_type;
+ case HV_STATS_OBJECT_VP:
+ return identity->vp.stats_area_type;
+ }
+
+ return -EINVAL;
+}
+
+/*
+ * Map a stats page, where the page location is provided by the hypervisor.
+ *
+ * NOTE: The concept of separate SELF and PARENT stats areas does not exist on
+ * older hypervisor versions. All the available stats information can be found
+ * on the SELF page. When attempting to map the PARENT area on a hypervisor
+ * that doesn't support it, return "success" but with a NULL address. The
+ * caller should check for this case and instead fallback to the SELF area
+ * alone.
+ */
static int hv_call_map_stats_page(enum hv_stats_object_type type,
const union hv_stats_object_identity *identity,
void **addr)
@@ -863,7 +898,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
struct hv_input_map_stats_page *input;
struct hv_output_map_stats_page *output;
u64 status, pfn;
- int ret = 0;
+ int hv_status, ret = 0;
do {
local_irq_save(flags);
@@ -878,11 +913,20 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
pfn = output->map_location;
local_irq_restore(flags);
- if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
- ret = hv_result_to_errno(status);
+
+ hv_status = hv_result(status);
+ if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
if (hv_result_success(status))
break;
- return ret;
+
+ if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
+ hv_status == HV_STATUS_INVALID_PARAMETER) {
+ *addr = NULL;
+ return 0;
+ }
+
+ hv_status_debug(status, "\n");
+ return hv_result_to_errno(status);
}
ret = hv_call_deposit_pages(NUMA_NO_NODE,
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 1134a82c7881..1777778f84b8 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -992,6 +992,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
if (err)
goto unmap_self;
+ if (!stats_pages[HV_STATS_AREA_PARENT])
+ stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
+
return 0;
unmap_self:
--
2.34.1
^ permalink raw reply related
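The fallback introduced by the patch above (the mapper reports success with a NULL address for PARENT, and the caller aliases PARENT to the already-mapped SELF page) can be sketched in plain C. This is a simplified model under stated assumptions, not the driver code: `demo_map_area()`, `demo_stats_map()`, `struct demo_stats_page` and the `parent_supported` flag are hypothetical, and the hypercall is replaced by a lookup into a caller-provided array.

```c
#include <assert.h>
#include <stddef.h>

enum {
	DEMO_AREA_SELF = 0,
	DEMO_AREA_PARENT = 1,
	DEMO_NUM_AREAS = 2,
};

/* Hypothetical stand-in for a mapped stats page. */
struct demo_stats_page {
	unsigned long counters[4];
};

/*
 * Hypothetical mapper. On a "hypervisor" without a separate PARENT area
 * it returns success with a NULL address instead of an error, so that
 * the caller can decide how to fall back.
 */
static int demo_map_area(int area, int parent_supported,
			 struct demo_stats_page *backing,
			 struct demo_stats_page **addr)
{
	assert(area == DEMO_AREA_SELF || area == DEMO_AREA_PARENT);

	if (area == DEMO_AREA_PARENT && !parent_supported) {
		*addr = NULL;
		return 0;
	}
	*addr = &backing[area];
	return 0;
}

/* Caller side: map both areas, then alias PARENT to SELF if it is missing. */
static int demo_stats_map(int parent_supported,
			  struct demo_stats_page *backing,
			  struct demo_stats_page *pages[DEMO_NUM_AREAS])
{
	int err;

	err = demo_map_area(DEMO_AREA_SELF, parent_supported, backing,
			    &pages[DEMO_AREA_SELF]);
	if (err)
		return err;

	err = demo_map_area(DEMO_AREA_PARENT, parent_supported, backing,
			    &pages[DEMO_AREA_PARENT]);
	if (err)
		return err;

	/* Older hypervisor: SELF is the only page and holds all the stats. */
	if (!pages[DEMO_AREA_PARENT])
		pages[DEMO_AREA_PARENT] = pages[DEMO_AREA_SELF];

	return 0;
}
```

With this shape, code that reads `pages[DEMO_AREA_PARENT]` needs no special case for older hypervisors, which appears to be the intent of populating the PARENT slot with the SELF page.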
* [PATCH v3 0/6] mshv: Debugfs interface for mshv_root
From: Nuno Das Neves @ 2026-01-14 21:37 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
Expose hypervisor, logical processor, partition, and virtual processor
statistics via debugfs. These are provided by mapping 'stats' pages via
hypercall.
Patch #1: Update hv_call_map_stats_page() to return success when
HV_STATS_AREA_PARENT is unavailable, which is the case on some
hypervisor versions, where it can fall back to HV_STATS_AREA_SELF
Patch #2: Use struct hv_stats_page pointers instead of void *
Patch #3: Make mshv_vp_stats_map/unmap() more flexible to use with debugfs code
Patch #4: Always map vp stats page regardless of scheduler, to reuse in debugfs
Patch #5: Introduce the definitions needed for the various stats pages
Patch #6: Add mshv_debugfs.c, and integrate it with the mshv_root driver to
expose the partition and VP stats.
---
Changes in v3:
- Add 3 small refactor/cleanup patches (patches 2,3,4) from Stanislav. These
simplify some of the debugfs code, and fix issues with mapping VP stats on
L1VH.
- Fix cleanup of parent stats dentries on module removal (via squashing some
internal patches into patch #6) [Praveen]
- Remove unused goto label [Stanislav, kernel bot]
- Use struct hv_stats_page * instead of void * in mshv_debugfs.c [Stanislav]
- Remove some redundant variables [Stanislav]
- Rename debugfs dentry fields for brevity [Stanislav]
- Use ERR_CAST() for the dentry error pointer returned from
lp_debugfs_stats_create() [Stanislav]
- Fix leak of pages allocated for lp stats mappings by storing them in an array
[Michael]
- Add comments to clarify PARENT vs SELF usage and edge cases [Michael]
- Add VpLoadAvg for x86 and print the stat [Michael]
- Add NUM_STATS_AREAS for array sizing in mshv_debugfs.c [Michael]
Changes in v2:
- Remove unnecessary pr_debug_once() in patch 1 [Stanislav Kinsburskii]
- CONFIG_X86 -> CONFIG_X86_64 in patch 2 [Stanislav Kinsburskii]
---
Nuno Das Neves (2):
mshv: Add definitions for stats pages
mshv: Add debugfs to view hypervisor statistics
Purna Pavan Chandra Aekkaladevi (1):
mshv: Ignore second stats page map result failure
Stanislav Kinsburskii (3):
mshv: Use typed hv_stats_page pointers
mshv: Improve mshv_vp_stats_map/unmap(), add them to mshv_root.h
mshv: Always map child vp stats pages regardless of scheduler type
drivers/hv/Makefile | 1 +
drivers/hv/mshv_debugfs.c | 1103 ++++++++++++++++++++++++++++++++
drivers/hv/mshv_root.h | 49 +-
drivers/hv/mshv_root_hv_call.c | 64 +-
drivers/hv/mshv_root_main.c | 130 ++--
include/hyperv/hvhdk.h | 437 +++++++++++++
6 files changed, 1721 insertions(+), 63 deletions(-)
create mode 100644 drivers/hv/mshv_debugfs.c
--
2.34.1
^ permalink raw reply
* [PATCH v3 2/6] mshv: Use typed hv_stats_page pointers
From: Nuno Das Neves @ 2026-01-14 21:37 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Refactor all relevant functions to use struct hv_stats_page pointers
instead of void pointers for stats page mapping and unmapping, thus
improving type safety and code clarity across the Hyper-V stats mapping
APIs.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
drivers/hv/mshv_root.h | 5 +++--
drivers/hv/mshv_root_hv_call.c | 12 +++++++-----
drivers/hv/mshv_root_main.c | 8 ++++----
3 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
index 3c1d88b36741..05ba1f716f9e 100644
--- a/drivers/hv/mshv_root.h
+++ b/drivers/hv/mshv_root.h
@@ -307,8 +307,9 @@ int hv_call_disconnect_port(u64 connection_partition_id,
int hv_call_notify_port_ring_empty(u32 sint_index);
int hv_map_stats_page(enum hv_stats_object_type type,
const union hv_stats_object_identity *identity,
- void **addr);
-int hv_unmap_stats_page(enum hv_stats_object_type type, void *page_addr,
+ struct hv_stats_page **addr);
+int hv_unmap_stats_page(enum hv_stats_object_type type,
+ struct hv_stats_page *page_addr,
const union hv_stats_object_identity *identity);
int hv_call_modify_spa_host_access(u64 partition_id, struct page **pages,
u64 page_struct_count, u32 host_access,
diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
index 1f93b94d7580..daee036e48bc 100644
--- a/drivers/hv/mshv_root_hv_call.c
+++ b/drivers/hv/mshv_root_hv_call.c
@@ -890,9 +890,10 @@ hv_stats_get_area_type(enum hv_stats_object_type type,
* caller should check for this case and instead fallback to the SELF area
* alone.
*/
-static int hv_call_map_stats_page(enum hv_stats_object_type type,
- const union hv_stats_object_identity *identity,
- void **addr)
+static int
+hv_call_map_stats_page(enum hv_stats_object_type type,
+ const union hv_stats_object_identity *identity,
+ struct hv_stats_page **addr)
{
unsigned long flags;
struct hv_input_map_stats_page *input;
@@ -942,7 +943,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
int hv_map_stats_page(enum hv_stats_object_type type,
const union hv_stats_object_identity *identity,
- void **addr)
+ struct hv_stats_page **addr)
{
int ret;
struct page *allocated_page = NULL;
@@ -990,7 +991,8 @@ static int hv_call_unmap_stats_page(enum hv_stats_object_type type,
return hv_result_to_errno(status);
}
-int hv_unmap_stats_page(enum hv_stats_object_type type, void *page_addr,
+int hv_unmap_stats_page(enum hv_stats_object_type type,
+ struct hv_stats_page *page_addr,
const union hv_stats_object_identity *identity)
{
int ret;
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 1777778f84b8..be5ad0fbfbee 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -957,7 +957,7 @@ mshv_vp_release(struct inode *inode, struct file *filp)
}
static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index,
- void *stats_pages[])
+ struct hv_stats_page *stats_pages[])
{
union hv_stats_object_identity identity = {
.vp.partition_id = partition_id,
@@ -972,7 +972,7 @@ static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index,
}
static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
- void *stats_pages[])
+ struct hv_stats_page *stats_pages[])
{
union hv_stats_object_identity identity = {
.vp.partition_id = partition_id,
@@ -1010,7 +1010,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
struct mshv_create_vp args;
struct mshv_vp *vp;
struct page *intercept_msg_page, *register_page, *ghcb_page;
- void *stats_pages[2];
+ struct hv_stats_page *stats_pages[2];
long ret;
if (copy_from_user(&args, arg, sizeof(args)))
@@ -1729,7 +1729,7 @@ static void destroy_partition(struct mshv_partition *partition)
if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
- (void **)vp->vp_stats_pages);
+ vp->vp_stats_pages);
if (vp->vp_register_page) {
(void)hv_unmap_vp_state_page(partition->pt_id,
--
2.34.1
^ permalink raw reply related
* [PATCH v3 3/6] mshv: Improve mshv_vp_stats_map/unmap(), add them to mshv_root.h
From: Nuno Das Neves @ 2026-01-14 21:38 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
These functions are currently only used to map child partition VP stats
on the root partition. However, they will soon also be used on L1VH and
for mapping the host's own VP stats.
Introduce a helper is_l1vh_parent() to determine whether we are mapping
our own VP stats. In this case, do not attempt to map the PARENT area.
Note this is a different case than mapping PARENT on an older hypervisor
where it is not available at all, so must be handled separately.
On unmap, pass the stats pages since on L1VH the kernel allocates them
and they must be freed in hv_unmap_stats_page().
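
For illustration only (not part of the patch), the aliasing rule described
above can be sketched in userspace C. All names below are hypothetical
stand-ins for the kernel types and helpers, and the sketch folds the two
"no PARENT" cases (own stats on L1VH, older hypervisor) into one branch,
whereas the patch handles them separately:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions; not the real ones. */
enum { AREA_SELF = 0, AREA_PARENT = 1, NUM_AREAS = 2 };
struct stats_page { unsigned long long cntrs[4]; };

static struct stats_page self_page, parent_page;
static int unmap_calls;	/* counts simulated unmap hypercalls */

/*
 * Mock of the map step: SELF is always mapped. PARENT aliases SELF when
 * we are mapping our own stats on L1VH, or when the hypervisor does not
 * provide a PARENT area at all.
 */
static void sketch_stats_map(bool own_stats_on_l1vh, bool parent_available,
			     struct stats_page *pages[NUM_AREAS])
{
	pages[AREA_SELF] = &self_page;

	if (own_stats_on_l1vh || !parent_available)
		pages[AREA_PARENT] = pages[AREA_SELF];
	else
		pages[AREA_PARENT] = &parent_page;
}

/*
 * Mock of the unmap step: PARENT is only unmapped when it is a distinct
 * mapping, so an aliased page is never released twice.
 */
static void sketch_stats_unmap(struct stats_page *pages[NUM_AREAS])
{
	assert(pages[AREA_SELF] != NULL);

	unmap_calls++;				/* SELF */
	if (pages[AREA_PARENT] != pages[AREA_SELF])
		unmap_calls++;			/* PARENT */
}
```

In this sketch, the pointer comparison in the unmap step is what makes it
safe to pass the same array back regardless of which map path was taken.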
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
drivers/hv/mshv_root.h | 10 ++++++
drivers/hv/mshv_root_main.c | 61 ++++++++++++++++++++++++++-----------
2 files changed, 54 insertions(+), 17 deletions(-)
diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
index 05ba1f716f9e..e4912b0618fa 100644
--- a/drivers/hv/mshv_root.h
+++ b/drivers/hv/mshv_root.h
@@ -254,6 +254,16 @@ struct mshv_partition *mshv_partition_get(struct mshv_partition *partition);
void mshv_partition_put(struct mshv_partition *partition);
struct mshv_partition *mshv_partition_find(u64 partition_id) __must_hold(RCU);
+static inline bool is_l1vh_parent(u64 partition_id)
+{
+ return hv_l1vh_partition() && (partition_id == HV_PARTITION_ID_SELF);
+}
+
+int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
+ struct hv_stats_page **stats_pages);
+void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index,
+ struct hv_stats_page **stats_pages);
+
/* hypercalls */
int hv_call_withdraw_memory(u64 count, int node, u64 partition_id);
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index be5ad0fbfbee..faca3cc63e79 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -956,23 +956,36 @@ mshv_vp_release(struct inode *inode, struct file *filp)
return 0;
}
-static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index,
- struct hv_stats_page *stats_pages[])
+void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index,
+ struct hv_stats_page *stats_pages[])
{
union hv_stats_object_identity identity = {
.vp.partition_id = partition_id,
.vp.vp_index = vp_index,
};
+ int err;
identity.vp.stats_area_type = HV_STATS_AREA_SELF;
- hv_unmap_stats_page(HV_STATS_OBJECT_VP, NULL, &identity);
-
- identity.vp.stats_area_type = HV_STATS_AREA_PARENT;
- hv_unmap_stats_page(HV_STATS_OBJECT_VP, NULL, &identity);
+ err = hv_unmap_stats_page(HV_STATS_OBJECT_VP,
+ stats_pages[HV_STATS_AREA_SELF],
+ &identity);
+ if (err)
+ pr_err("%s: failed to unmap partition %llu vp %u self stats, err: %d\n",
+ __func__, partition_id, vp_index, err);
+
+ if (stats_pages[HV_STATS_AREA_PARENT] != stats_pages[HV_STATS_AREA_SELF]) {
+ identity.vp.stats_area_type = HV_STATS_AREA_PARENT;
+ err = hv_unmap_stats_page(HV_STATS_OBJECT_VP,
+ stats_pages[HV_STATS_AREA_PARENT],
+ &identity);
+ if (err)
+ pr_err("%s: failed to unmap partition %llu vp %u parent stats, err: %d\n",
+ __func__, partition_id, vp_index, err);
+ }
}
-static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
- struct hv_stats_page *stats_pages[])
+int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
+ struct hv_stats_page *stats_pages[])
{
union hv_stats_object_identity identity = {
.vp.partition_id = partition_id,
@@ -983,23 +996,37 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
identity.vp.stats_area_type = HV_STATS_AREA_SELF;
err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity,
&stats_pages[HV_STATS_AREA_SELF]);
- if (err)
+ if (err) {
+ pr_err("%s: failed to map partition %llu vp %u self stats, err: %d\n",
+ __func__, partition_id, vp_index, err);
return err;
+ }
- identity.vp.stats_area_type = HV_STATS_AREA_PARENT;
- err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity,
- &stats_pages[HV_STATS_AREA_PARENT]);
- if (err)
- goto unmap_self;
-
- if (!stats_pages[HV_STATS_AREA_PARENT])
+ /*
+ * L1VH partition cannot access its vp stats in parent area.
+ */
+ if (is_l1vh_parent(partition_id)) {
stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
+ } else {
+ identity.vp.stats_area_type = HV_STATS_AREA_PARENT;
+ err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity,
+ &stats_pages[HV_STATS_AREA_PARENT]);
+ if (err) {
+ pr_err("%s: failed to map partition %llu vp %u parent stats, err: %d\n",
+ __func__, partition_id, vp_index, err);
+ goto unmap_self;
+ }
+ if (!stats_pages[HV_STATS_AREA_PARENT])
+ stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
+ }
return 0;
unmap_self:
identity.vp.stats_area_type = HV_STATS_AREA_SELF;
- hv_unmap_stats_page(HV_STATS_OBJECT_VP, NULL, &identity);
+ hv_unmap_stats_page(HV_STATS_OBJECT_VP,
+ stats_pages[HV_STATS_AREA_SELF],
+ &identity);
return err;
}
--
2.34.1
^ permalink raw reply related
* [PATCH v3 4/6] mshv: Always map child vp stats pages regardless of scheduler type
From: Nuno Das Neves @ 2026-01-14 21:38 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Currently vp->vp_stats_pages is only used by the root scheduler for fast
interrupt injection.
Soon, vp_stats_pages will also be needed for exposing child VP stats to
userspace via debugfs. Mapping the pages a second time to a different
address causes an error on L1VH.
Remove the scheduler requirement and always map the vp stats pages.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
drivers/hv/mshv_root_main.c | 25 ++++++++-----------------
1 file changed, 8 insertions(+), 17 deletions(-)
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index faca3cc63e79..fbfc9e7d9fa4 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -1077,16 +1077,10 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
goto unmap_register_page;
}
- /*
- * This mapping of the stats page is for detecting if dispatch thread
- * is blocked - only relevant for root scheduler
- */
- if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT) {
- ret = mshv_vp_stats_map(partition->pt_id, args.vp_index,
- stats_pages);
- if (ret)
- goto unmap_ghcb_page;
- }
+ ret = mshv_vp_stats_map(partition->pt_id, args.vp_index,
+ stats_pages);
+ if (ret)
+ goto unmap_ghcb_page;
vp = kzalloc(sizeof(*vp), GFP_KERNEL);
if (!vp)
@@ -1110,8 +1104,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
if (mshv_partition_encrypted(partition) && is_ghcb_mapping_available())
vp->vp_ghcb_page = page_to_virt(ghcb_page);
- if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
- memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
+ memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
/*
* Keep anon_inode_getfd last: it installs fd in the file struct and
@@ -1133,8 +1126,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
free_vp:
kfree(vp);
unmap_stats_pages:
- if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
- mshv_vp_stats_unmap(partition->pt_id, args.vp_index, stats_pages);
+ mshv_vp_stats_unmap(partition->pt_id, args.vp_index, stats_pages);
unmap_ghcb_page:
if (mshv_partition_encrypted(partition) && is_ghcb_mapping_available())
hv_unmap_vp_state_page(partition->pt_id, args.vp_index,
@@ -1754,9 +1746,8 @@ static void destroy_partition(struct mshv_partition *partition)
if (!vp)
continue;
- if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
- mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
- vp->vp_stats_pages);
+ mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
+ vp->vp_stats_pages);
if (vp->vp_register_page) {
(void)hv_unmap_vp_state_page(partition->pt_id,
--
2.34.1
^ permalink raw reply related
* [PATCH v3 5/6] mshv: Add definitions for stats pages
From: Nuno Das Neves @ 2026-01-14 21:38 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
Add the definitions for hypervisor, logical processor, and partition
stats pages.
Move the definition for the VP stats page to its rightful place in
hvhdk.h, and add the missing members.
While at it, correct the ARM64 value of VpRootDispatchThreadBlocked
(which is not yet used, so there is no impact).
These enum members retain their CamelCase style, since they are imported
directly from the hypervisor code. They will be stringified when
printing the stats out, and retain more readability in this form.
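
As a rough userspace illustration of the stringification mentioned above
(not part of the patch), a two-level stringify macro combined with token
pasting lets one argument serve as both the printed label and the enum
index. The SKETCH_* names and the trimmed-down enum are hypothetical, and
the "%-29s" width is an assumption borrowed from the LP printing macro
later in this series:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Two-level stringify, in the style of the kernel's __stringify(). */
#define SKETCH_STR_1(x) #x
#define SKETCH_STR(x) SKETCH_STR_1(x)

/* Trimmed-down, hypothetical version of the VP counter enum. */
enum sketch_vp_counters {
	VpTotalRunTime = 1,
	VpHypervisorRunTime = 2,
	SketchVpMaxCounter
};

struct sketch_stats_page {
	unsigned long long vp_cntrs[SketchVpMaxCounter];
};

/* 'cnt' becomes the label via stringify and the index via Vp##cnt. */
#define SKETCH_VP_FMT(buf, len, page, cnt) \
	snprintf(buf, len, "%-29s: %llu\n", SKETCH_STR(cnt), \
		 (page)->vp_cntrs[Vp##cnt])

/* Format the TotalRunTime counter of 'page' into 'buf'. */
static int sketch_format_total_run_time(char *buf, size_t len,
					const struct sketch_stats_page *page)
{
	assert(buf != NULL && len > 0);
	return SKETCH_VP_FMT(buf, len, page, TotalRunTime);
}
```

Because the label comes straight from the identifier, keeping the
hypervisor's CamelCase names means the printed output matches them
without a separate name table.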
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
drivers/hv/mshv_root_main.c | 17 --
include/hyperv/hvhdk.h | 437 ++++++++++++++++++++++++++++++++++++
2 files changed, 437 insertions(+), 17 deletions(-)
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index fbfc9e7d9fa4..724bbaa0b08c 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -39,23 +39,6 @@ MODULE_AUTHOR("Microsoft");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
-/* TODO move this to another file when debugfs code is added */
-enum hv_stats_vp_counters { /* HV_THREAD_COUNTER */
-#if defined(CONFIG_X86)
- VpRootDispatchThreadBlocked = 202,
-#elif defined(CONFIG_ARM64)
- VpRootDispatchThreadBlocked = 94,
-#endif
- VpStatsMaxCounter
-};
-
-struct hv_stats_page {
- union {
- u64 vp_cntrs[VpStatsMaxCounter]; /* VP counters */
- u8 data[HV_HYP_PAGE_SIZE];
- };
-} __packed;
-
struct mshv_root mshv_root;
enum hv_scheduler_type hv_scheduler_type;
diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
index 469186df7826..8bddd11feeba 100644
--- a/include/hyperv/hvhdk.h
+++ b/include/hyperv/hvhdk.h
@@ -10,6 +10,443 @@
#include "hvhdk_mini.h"
#include "hvgdk.h"
+enum hv_stats_hypervisor_counters { /* HV_HYPERVISOR_COUNTER */
+ HvLogicalProcessors = 1,
+ HvPartitions = 2,
+ HvTotalPages = 3,
+ HvVirtualProcessors = 4,
+ HvMonitoredNotifications = 5,
+ HvModernStandbyEntries = 6,
+ HvPlatformIdleTransitions = 7,
+ HvHypervisorStartupCost = 8,
+ HvIOSpacePages = 10,
+ HvNonEssentialPagesForDump = 11,
+ HvSubsumedPages = 12,
+ HvStatsMaxCounter
+};
+
+enum hv_stats_partition_counters { /* HV_PROCESS_COUNTER */
+ PartitionVirtualProcessors = 1,
+ PartitionTlbSize = 3,
+ PartitionAddressSpaces = 4,
+ PartitionDepositedPages = 5,
+ PartitionGpaPages = 6,
+ PartitionGpaSpaceModifications = 7,
+ PartitionVirtualTlbFlushEntires = 8,
+ PartitionRecommendedTlbSize = 9,
+ PartitionGpaPages4K = 10,
+ PartitionGpaPages2M = 11,
+ PartitionGpaPages1G = 12,
+ PartitionGpaPages512G = 13,
+ PartitionDevicePages4K = 14,
+ PartitionDevicePages2M = 15,
+ PartitionDevicePages1G = 16,
+ PartitionDevicePages512G = 17,
+ PartitionAttachedDevices = 18,
+ PartitionDeviceInterruptMappings = 19,
+ PartitionIoTlbFlushes = 20,
+ PartitionIoTlbFlushCost = 21,
+ PartitionDeviceInterruptErrors = 22,
+ PartitionDeviceDmaErrors = 23,
+ PartitionDeviceInterruptThrottleEvents = 24,
+ PartitionSkippedTimerTicks = 25,
+ PartitionPartitionId = 26,
+#if IS_ENABLED(CONFIG_X86_64)
+ PartitionNestedTlbSize = 27,
+ PartitionRecommendedNestedTlbSize = 28,
+ PartitionNestedTlbFreeListSize = 29,
+ PartitionNestedTlbTrimmedPages = 30,
+ PartitionPagesShattered = 31,
+ PartitionPagesRecombined = 32,
+ PartitionHwpRequestValue = 33,
+#elif IS_ENABLED(CONFIG_ARM64)
+ PartitionHwpRequestValue = 27,
+#endif
+ PartitionStatsMaxCounter
+};
+
+enum hv_stats_vp_counters { /* HV_THREAD_COUNTER */
+ VpTotalRunTime = 1,
+ VpHypervisorRunTime = 2,
+ VpRemoteNodeRunTime = 3,
+ VpNormalizedRunTime = 4,
+ VpIdealCpu = 5,
+ VpHypercallsCount = 7,
+ VpHypercallsTime = 8,
+#if IS_ENABLED(CONFIG_X86_64)
+ VpPageInvalidationsCount = 9,
+ VpPageInvalidationsTime = 10,
+ VpControlRegisterAccessesCount = 11,
+ VpControlRegisterAccessesTime = 12,
+ VpIoInstructionsCount = 13,
+ VpIoInstructionsTime = 14,
+ VpHltInstructionsCount = 15,
+ VpHltInstructionsTime = 16,
+ VpMwaitInstructionsCount = 17,
+ VpMwaitInstructionsTime = 18,
+ VpCpuidInstructionsCount = 19,
+ VpCpuidInstructionsTime = 20,
+ VpMsrAccessesCount = 21,
+ VpMsrAccessesTime = 22,
+ VpOtherInterceptsCount = 23,
+ VpOtherInterceptsTime = 24,
+ VpExternalInterruptsCount = 25,
+ VpExternalInterruptsTime = 26,
+ VpPendingInterruptsCount = 27,
+ VpPendingInterruptsTime = 28,
+ VpEmulatedInstructionsCount = 29,
+ VpEmulatedInstructionsTime = 30,
+ VpDebugRegisterAccessesCount = 31,
+ VpDebugRegisterAccessesTime = 32,
+ VpPageFaultInterceptsCount = 33,
+ VpPageFaultInterceptsTime = 34,
+ VpGuestPageTableMaps = 35,
+ VpLargePageTlbFills = 36,
+ VpSmallPageTlbFills = 37,
+ VpReflectedGuestPageFaults = 38,
+ VpApicMmioAccesses = 39,
+ VpIoInterceptMessages = 40,
+ VpMemoryInterceptMessages = 41,
+ VpApicEoiAccesses = 42,
+ VpOtherMessages = 43,
+ VpPageTableAllocations = 44,
+ VpLogicalProcessorMigrations = 45,
+ VpAddressSpaceEvictions = 46,
+ VpAddressSpaceSwitches = 47,
+ VpAddressDomainFlushes = 48,
+ VpAddressSpaceFlushes = 49,
+ VpGlobalGvaRangeFlushes = 50,
+ VpLocalGvaRangeFlushes = 51,
+ VpPageTableEvictions = 52,
+ VpPageTableReclamations = 53,
+ VpPageTableResets = 54,
+ VpPageTableValidations = 55,
+ VpApicTprAccesses = 56,
+ VpPageTableWriteIntercepts = 57,
+ VpSyntheticInterrupts = 58,
+ VpVirtualInterrupts = 59,
+ VpApicIpisSent = 60,
+ VpApicSelfIpisSent = 61,
+ VpGpaSpaceHypercalls = 62,
+ VpLogicalProcessorHypercalls = 63,
+ VpLongSpinWaitHypercalls = 64,
+ VpOtherHypercalls = 65,
+ VpSyntheticInterruptHypercalls = 66,
+ VpVirtualInterruptHypercalls = 67,
+ VpVirtualMmuHypercalls = 68,
+ VpVirtualProcessorHypercalls = 69,
+ VpHardwareInterrupts = 70,
+ VpNestedPageFaultInterceptsCount = 71,
+ VpNestedPageFaultInterceptsTime = 72,
+ VpPageScans = 73,
+ VpLogicalProcessorDispatches = 74,
+ VpWaitingForCpuTime = 75,
+ VpExtendedHypercalls = 76,
+ VpExtendedHypercallInterceptMessages = 77,
+ VpMbecNestedPageTableSwitches = 78,
+ VpOtherReflectedGuestExceptions = 79,
+ VpGlobalIoTlbFlushes = 80,
+ VpGlobalIoTlbFlushCost = 81,
+ VpLocalIoTlbFlushes = 82,
+ VpLocalIoTlbFlushCost = 83,
+ VpHypercallsForwardedCount = 84,
+ VpHypercallsForwardingTime = 85,
+ VpPageInvalidationsForwardedCount = 86,
+ VpPageInvalidationsForwardingTime = 87,
+ VpControlRegisterAccessesForwardedCount = 88,
+ VpControlRegisterAccessesForwardingTime = 89,
+ VpIoInstructionsForwardedCount = 90,
+ VpIoInstructionsForwardingTime = 91,
+ VpHltInstructionsForwardedCount = 92,
+ VpHltInstructionsForwardingTime = 93,
+ VpMwaitInstructionsForwardedCount = 94,
+ VpMwaitInstructionsForwardingTime = 95,
+ VpCpuidInstructionsForwardedCount = 96,
+ VpCpuidInstructionsForwardingTime = 97,
+ VpMsrAccessesForwardedCount = 98,
+ VpMsrAccessesForwardingTime = 99,
+ VpOtherInterceptsForwardedCount = 100,
+ VpOtherInterceptsForwardingTime = 101,
+ VpExternalInterruptsForwardedCount = 102,
+ VpExternalInterruptsForwardingTime = 103,
+ VpPendingInterruptsForwardedCount = 104,
+ VpPendingInterruptsForwardingTime = 105,
+ VpEmulatedInstructionsForwardedCount = 106,
+ VpEmulatedInstructionsForwardingTime = 107,
+ VpDebugRegisterAccessesForwardedCount = 108,
+ VpDebugRegisterAccessesForwardingTime = 109,
+ VpPageFaultInterceptsForwardedCount = 110,
+ VpPageFaultInterceptsForwardingTime = 111,
+ VpVmclearEmulationCount = 112,
+ VpVmclearEmulationTime = 113,
+ VpVmptrldEmulationCount = 114,
+ VpVmptrldEmulationTime = 115,
+ VpVmptrstEmulationCount = 116,
+ VpVmptrstEmulationTime = 117,
+ VpVmreadEmulationCount = 118,
+ VpVmreadEmulationTime = 119,
+ VpVmwriteEmulationCount = 120,
+ VpVmwriteEmulationTime = 121,
+ VpVmxoffEmulationCount = 122,
+ VpVmxoffEmulationTime = 123,
+ VpVmxonEmulationCount = 124,
+ VpVmxonEmulationTime = 125,
+ VpNestedVMEntriesCount = 126,
+ VpNestedVMEntriesTime = 127,
+ VpNestedSLATSoftPageFaultsCount = 128,
+ VpNestedSLATSoftPageFaultsTime = 129,
+ VpNestedSLATHardPageFaultsCount = 130,
+ VpNestedSLATHardPageFaultsTime = 131,
+ VpInvEptAllContextEmulationCount = 132,
+ VpInvEptAllContextEmulationTime = 133,
+ VpInvEptSingleContextEmulationCount = 134,
+ VpInvEptSingleContextEmulationTime = 135,
+ VpInvVpidAllContextEmulationCount = 136,
+ VpInvVpidAllContextEmulationTime = 137,
+ VpInvVpidSingleContextEmulationCount = 138,
+ VpInvVpidSingleContextEmulationTime = 139,
+ VpInvVpidSingleAddressEmulationCount = 140,
+ VpInvVpidSingleAddressEmulationTime = 141,
+ VpNestedTlbPageTableReclamations = 142,
+ VpNestedTlbPageTableEvictions = 143,
+ VpFlushGuestPhysicalAddressSpaceHypercalls = 144,
+ VpFlushGuestPhysicalAddressListHypercalls = 145,
+ VpPostedInterruptNotifications = 146,
+ VpPostedInterruptScans = 147,
+ VpTotalCoreRunTime = 148,
+ VpMaximumRunTime = 149,
+ VpHwpRequestContextSwitches = 150,
+ VpWaitingForCpuTimeBucket0 = 151,
+ VpWaitingForCpuTimeBucket1 = 152,
+ VpWaitingForCpuTimeBucket2 = 153,
+ VpWaitingForCpuTimeBucket3 = 154,
+ VpWaitingForCpuTimeBucket4 = 155,
+ VpWaitingForCpuTimeBucket5 = 156,
+ VpWaitingForCpuTimeBucket6 = 157,
+ VpVmloadEmulationCount = 158,
+ VpVmloadEmulationTime = 159,
+ VpVmsaveEmulationCount = 160,
+ VpVmsaveEmulationTime = 161,
+ VpGifInstructionEmulationCount = 162,
+ VpGifInstructionEmulationTime = 163,
+ VpEmulatedErrataSvmInstructions = 164,
+ VpPlaceholder1 = 165,
+ VpPlaceholder2 = 166,
+ VpPlaceholder3 = 167,
+ VpPlaceholder4 = 168,
+ VpPlaceholder5 = 169,
+ VpPlaceholder6 = 170,
+ VpPlaceholder7 = 171,
+ VpPlaceholder8 = 172,
+ VpPlaceholder9 = 173,
+ VpPlaceholder10 = 174,
+ VpSchedulingPriority = 175,
+ VpRdpmcInstructionsCount = 176,
+ VpRdpmcInstructionsTime = 177,
+ VpPerfmonPmuMsrAccessesCount = 178,
+ VpPerfmonLbrMsrAccessesCount = 179,
+ VpPerfmonIptMsrAccessesCount = 180,
+ VpPerfmonInterruptCount = 181,
+ VpVtl1DispatchCount = 182,
+ VpVtl2DispatchCount = 183,
+ VpVtl2DispatchBucket0 = 184,
+ VpVtl2DispatchBucket1 = 185,
+ VpVtl2DispatchBucket2 = 186,
+ VpVtl2DispatchBucket3 = 187,
+ VpVtl2DispatchBucket4 = 188,
+ VpVtl2DispatchBucket5 = 189,
+ VpVtl2DispatchBucket6 = 190,
+ VpVtl1RunTime = 191,
+ VpVtl2RunTime = 192,
+ VpIommuHypercalls = 193,
+ VpCpuGroupHypercalls = 194,
+ VpVsmHypercalls = 195,
+ VpEventLogHypercalls = 196,
+ VpDeviceDomainHypercalls = 197,
+ VpDepositHypercalls = 198,
+ VpSvmHypercalls = 199,
+ VpBusLockAcquisitionCount = 200,
+ VpLoadAvg = 201,
+ VpRootDispatchThreadBlocked = 202,
+#elif IS_ENABLED(CONFIG_ARM64)
+ VpSysRegAccessesCount = 9,
+ VpSysRegAccessesTime = 10,
+ VpSmcInstructionsCount = 11,
+ VpSmcInstructionsTime = 12,
+ VpOtherInterceptsCount = 13,
+ VpOtherInterceptsTime = 14,
+ VpExternalInterruptsCount = 15,
+ VpExternalInterruptsTime = 16,
+ VpPendingInterruptsCount = 17,
+ VpPendingInterruptsTime = 18,
+ VpGuestPageTableMaps = 19,
+ VpLargePageTlbFills = 20,
+ VpSmallPageTlbFills = 21,
+ VpReflectedGuestPageFaults = 22,
+ VpMemoryInterceptMessages = 23,
+ VpOtherMessages = 24,
+ VpLogicalProcessorMigrations = 25,
+ VpAddressDomainFlushes = 26,
+ VpAddressSpaceFlushes = 27,
+ VpSyntheticInterrupts = 28,
+ VpVirtualInterrupts = 29,
+ VpApicSelfIpisSent = 30,
+ VpGpaSpaceHypercalls = 31,
+ VpLogicalProcessorHypercalls = 32,
+ VpLongSpinWaitHypercalls = 33,
+ VpOtherHypercalls = 34,
+ VpSyntheticInterruptHypercalls = 35,
+ VpVirtualInterruptHypercalls = 36,
+ VpVirtualMmuHypercalls = 37,
+ VpVirtualProcessorHypercalls = 38,
+ VpHardwareInterrupts = 39,
+ VpNestedPageFaultInterceptsCount = 40,
+ VpNestedPageFaultInterceptsTime = 41,
+ VpLogicalProcessorDispatches = 42,
+ VpWaitingForCpuTime = 43,
+ VpExtendedHypercalls = 44,
+ VpExtendedHypercallInterceptMessages = 45,
+ VpMbecNestedPageTableSwitches = 46,
+ VpOtherReflectedGuestExceptions = 47,
+ VpGlobalIoTlbFlushes = 48,
+ VpGlobalIoTlbFlushCost = 49,
+ VpLocalIoTlbFlushes = 50,
+ VpLocalIoTlbFlushCost = 51,
+ VpFlushGuestPhysicalAddressSpaceHypercalls = 52,
+ VpFlushGuestPhysicalAddressListHypercalls = 53,
+ VpPostedInterruptNotifications = 54,
+ VpPostedInterruptScans = 55,
+ VpTotalCoreRunTime = 56,
+ VpMaximumRunTime = 57,
+ VpWaitingForCpuTimeBucket0 = 58,
+ VpWaitingForCpuTimeBucket1 = 59,
+ VpWaitingForCpuTimeBucket2 = 60,
+ VpWaitingForCpuTimeBucket3 = 61,
+ VpWaitingForCpuTimeBucket4 = 62,
+ VpWaitingForCpuTimeBucket5 = 63,
+ VpWaitingForCpuTimeBucket6 = 64,
+ VpHwpRequestContextSwitches = 65,
+ VpPlaceholder2 = 66,
+ VpPlaceholder3 = 67,
+ VpPlaceholder4 = 68,
+ VpPlaceholder5 = 69,
+ VpPlaceholder6 = 70,
+ VpPlaceholder7 = 71,
+ VpPlaceholder8 = 72,
+ VpContentionTime = 73,
+ VpWakeUpTime = 74,
+ VpSchedulingPriority = 75,
+ VpVtl1DispatchCount = 76,
+ VpVtl2DispatchCount = 77,
+ VpVtl2DispatchBucket0 = 78,
+ VpVtl2DispatchBucket1 = 79,
+ VpVtl2DispatchBucket2 = 80,
+ VpVtl2DispatchBucket3 = 81,
+ VpVtl2DispatchBucket4 = 82,
+ VpVtl2DispatchBucket5 = 83,
+ VpVtl2DispatchBucket6 = 84,
+ VpVtl1RunTime = 85,
+ VpVtl2RunTime = 86,
+ VpIommuHypercalls = 87,
+ VpCpuGroupHypercalls = 88,
+ VpVsmHypercalls = 89,
+ VpEventLogHypercalls = 90,
+ VpDeviceDomainHypercalls = 91,
+ VpDepositHypercalls = 92,
+ VpSvmHypercalls = 93,
+ VpLoadAvg = 94,
+ VpRootDispatchThreadBlocked = 95,
+#endif
+ VpStatsMaxCounter
+};
+
+enum hv_stats_lp_counters { /* HV_CPU_COUNTER */
+ LpGlobalTime = 1,
+ LpTotalRunTime = 2,
+ LpHypervisorRunTime = 3,
+ LpHardwareInterrupts = 4,
+ LpContextSwitches = 5,
+ LpInterProcessorInterrupts = 6,
+ LpSchedulerInterrupts = 7,
+ LpTimerInterrupts = 8,
+ LpInterProcessorInterruptsSent = 9,
+ LpProcessorHalts = 10,
+ LpMonitorTransitionCost = 11,
+ LpContextSwitchTime = 12,
+ LpC1TransitionsCount = 13,
+ LpC1RunTime = 14,
+ LpC2TransitionsCount = 15,
+ LpC2RunTime = 16,
+ LpC3TransitionsCount = 17,
+ LpC3RunTime = 18,
+ LpRootVpIndex = 19,
+ LpIdleSequenceNumber = 20,
+ LpGlobalTscCount = 21,
+ LpActiveTscCount = 22,
+ LpIdleAccumulation = 23,
+ LpReferenceCycleCount0 = 24,
+ LpActualCycleCount0 = 25,
+ LpReferenceCycleCount1 = 26,
+ LpActualCycleCount1 = 27,
+ LpProximityDomainId = 28,
+ LpPostedInterruptNotifications = 29,
+ LpBranchPredictorFlushes = 30,
+#if IS_ENABLED(CONFIG_X86_64)
+ LpL1DataCacheFlushes = 31,
+ LpImmediateL1DataCacheFlushes = 32,
+ LpMbFlushes = 33,
+ LpCounterRefreshSequenceNumber = 34,
+ LpCounterRefreshReferenceTime = 35,
+ LpIdleAccumulationSnapshot = 36,
+ LpActiveTscCountSnapshot = 37,
+ LpHwpRequestContextSwitches = 38,
+ LpPlaceholder1 = 39,
+ LpPlaceholder2 = 40,
+ LpPlaceholder3 = 41,
+ LpPlaceholder4 = 42,
+ LpPlaceholder5 = 43,
+ LpPlaceholder6 = 44,
+ LpPlaceholder7 = 45,
+ LpPlaceholder8 = 46,
+ LpPlaceholder9 = 47,
+ LpPlaceholder10 = 48,
+ LpReserveGroupId = 49,
+ LpRunningPriority = 50,
+ LpPerfmonInterruptCount = 51,
+#elif IS_ENABLED(CONFIG_ARM64)
+ LpCounterRefreshSequenceNumber = 31,
+ LpCounterRefreshReferenceTime = 32,
+ LpIdleAccumulationSnapshot = 33,
+ LpActiveTscCountSnapshot = 34,
+ LpHwpRequestContextSwitches = 35,
+ LpPlaceholder2 = 36,
+ LpPlaceholder3 = 37,
+ LpPlaceholder4 = 38,
+ LpPlaceholder5 = 39,
+ LpPlaceholder6 = 40,
+ LpPlaceholder7 = 41,
+ LpPlaceholder8 = 42,
+ LpPlaceholder9 = 43,
+ LpSchLocalRunListSize = 44,
+ LpReserveGroupId = 45,
+ LpRunningPriority = 46,
+#endif
+ LpStatsMaxCounter
+};
+
+/*
+ * Hypervisor statistics page format
+ */
+struct hv_stats_page {
+ union {
+ u64 hv_cntrs[HvStatsMaxCounter]; /* Hypervisor counters */
+ u64 pt_cntrs[PartitionStatsMaxCounter]; /* Partition counters */
+ u64 vp_cntrs[VpStatsMaxCounter]; /* VP counters */
+ u64 lp_cntrs[LpStatsMaxCounter]; /* LP counters */
+ u8 data[HV_HYP_PAGE_SIZE];
+ };
+} __packed;
+
/* Bits for dirty mask of hv_vp_register_page */
#define HV_X64_REGISTER_CLASS_GENERAL 0
#define HV_X64_REGISTER_CLASS_IP 1
--
2.34.1
^ permalink raw reply related
* [PATCH v3 6/6] mshv: Add debugfs to view hypervisor statistics
From: Nuno Das Neves @ 2026-01-14 21:38 UTC (permalink / raw)
To: linux-hyperv, linux-kernel, mhklinux, skinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, prapal, mrathor,
paekkaladevi, Nuno Das Neves, Jinank Jain
In-Reply-To: <20260114213803.143486-1-nunodasneves@linux.microsoft.com>
Introduce a debugfs interface to expose root and child partition stats
when running with mshv_root.
Create a debugfs directory "mshv" containing 'stats' files organized by
type and id. A stats file contains a number of counters depending on
its type. For example, an excerpt from a VP stats file:
TotalRunTime : 1997602722
HypervisorRunTime : 649671371
RemoteNodeRunTime : 0
NormalizedRunTime : 1997602721
IdealCpu : 0
HypercallsCount : 1708169
HypercallsTime : 111914774
PageInvalidationsCount : 0
PageInvalidationsTime : 0
On a root partition with some active child partitions, the entire
directory structure may look like:
mshv/
stats # hypervisor stats
lp/ # logical processors
0/ # LP id
stats # LP 0 stats
1/
2/
3/
partition/ # partition stats
1/ # root partition id
stats # root partition stats
vp/ # root virtual processors
0/ # root VP id
stats # root VP 0 stats
1/
2/
3/
42/ # child partition id
stats # child partition stats
vp/ # child VPs
0/ # child VP id
stats # child VP 0 stats
1/
43/
55/
On L1VH, some stats are not present as it does not own the hardware
like the root partition does:
- The hypervisor and lp stats are not present
- L1VH's partition directory is named "self" because it can't get its
own id
- Some of L1VH's partition and VP stats fields are not populated, because
it can't map its own HV_STATS_AREA_PARENT page.
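
For anyone scripting against these files, a minimal userspace sketch of
parsing one "Name : value" line is shown below. It is not part of the
patch; the function name is hypothetical, and it assumes the line format
stays as in the excerpt above, which debugfs does not guarantee:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of a stats file into a counter name and its value.
 * Returns 1 if the line has the form "Name : value", 0 otherwise.
 * 'name' must have room for 64 bytes.
 */
static int sketch_parse_stat_line(const char *line, char name[64],
				  unsigned long long *value)
{
	assert(line != NULL && name != NULL && value != NULL);

	/* Name runs up to the first blank or colon; padding is skipped. */
	return sscanf(line, " %63[^: \t] : %llu", name, value) == 2;
}
```

A caller would typically read a file such as
/sys/kernel/debug/mshv/partition/42/vp/0/stats line by line with fgets()
and pass each line to this helper, skipping lines for which it returns 0.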
Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Co-developed-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Co-developed-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/Makefile | 1 +
drivers/hv/mshv_debugfs.c | 1103 +++++++++++++++++++++++++++++++++++
drivers/hv/mshv_root.h | 34 ++
drivers/hv/mshv_root_main.c | 26 +-
4 files changed, 1162 insertions(+), 2 deletions(-)
create mode 100644 drivers/hv/mshv_debugfs.c
diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index a49f93c2d245..2593711c3628 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING) += hv_debugfs.o
hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
mshv_root_hv_call.o mshv_portid_table.o mshv_regions.o
+mshv_root-$(CONFIG_DEBUG_FS) += mshv_debugfs.o
mshv_vtl-y := mshv_vtl_main.o
# Code that must be built-in
diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
new file mode 100644
index 000000000000..2922db16e83f
--- /dev/null
+++ b/drivers/hv/mshv_debugfs.c
@@ -0,0 +1,1103 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2025, Microsoft Corporation.
+ *
+ * The /sys/kernel/debug/mshv directory contents.
+ * Contains various statistics data, provided by the hypervisor.
+ *
+ * Authors: Microsoft Linux virtualization team
+ */
+
+#include <linux/debugfs.h>
+#include <linux/stringify.h>
+#include <asm/mshyperv.h>
+#include <linux/slab.h>
+
+#include "mshv.h"
+#include "mshv_root.h"
+
+#define U32_BUF_SZ 11
+#define U64_BUF_SZ 21
+#define NUM_STATS_AREAS (HV_STATS_AREA_PARENT + 1)
+
+static struct dentry *mshv_debugfs;
+static struct dentry *mshv_debugfs_partition;
+static struct dentry *mshv_debugfs_lp;
+static struct dentry **parent_vp_stats;
+static struct dentry *parent_partition_stats;
+
+static u64 mshv_lps_count;
+static struct hv_stats_page **mshv_lps_stats;
+
+static int lp_stats_show(struct seq_file *m, void *v)
+{
+ const struct hv_stats_page *stats = m->private;
+
+#define LP_SEQ_PRINTF(cnt) \
+ seq_printf(m, "%-29s: %llu\n", __stringify(cnt), stats->lp_cntrs[Lp##cnt])
+
+ LP_SEQ_PRINTF(GlobalTime);
+ LP_SEQ_PRINTF(TotalRunTime);
+ LP_SEQ_PRINTF(HypervisorRunTime);
+ LP_SEQ_PRINTF(HardwareInterrupts);
+ LP_SEQ_PRINTF(ContextSwitches);
+ LP_SEQ_PRINTF(InterProcessorInterrupts);
+ LP_SEQ_PRINTF(SchedulerInterrupts);
+ LP_SEQ_PRINTF(TimerInterrupts);
+ LP_SEQ_PRINTF(InterProcessorInterruptsSent);
+ LP_SEQ_PRINTF(ProcessorHalts);
+ LP_SEQ_PRINTF(MonitorTransitionCost);
+ LP_SEQ_PRINTF(ContextSwitchTime);
+ LP_SEQ_PRINTF(C1TransitionsCount);
+ LP_SEQ_PRINTF(C1RunTime);
+ LP_SEQ_PRINTF(C2TransitionsCount);
+ LP_SEQ_PRINTF(C2RunTime);
+ LP_SEQ_PRINTF(C3TransitionsCount);
+ LP_SEQ_PRINTF(C3RunTime);
+ LP_SEQ_PRINTF(RootVpIndex);
+ LP_SEQ_PRINTF(IdleSequenceNumber);
+ LP_SEQ_PRINTF(GlobalTscCount);
+ LP_SEQ_PRINTF(ActiveTscCount);
+ LP_SEQ_PRINTF(IdleAccumulation);
+ LP_SEQ_PRINTF(ReferenceCycleCount0);
+ LP_SEQ_PRINTF(ActualCycleCount0);
+ LP_SEQ_PRINTF(ReferenceCycleCount1);
+ LP_SEQ_PRINTF(ActualCycleCount1);
+ LP_SEQ_PRINTF(ProximityDomainId);
+ LP_SEQ_PRINTF(PostedInterruptNotifications);
+ LP_SEQ_PRINTF(BranchPredictorFlushes);
+#if IS_ENABLED(CONFIG_X86_64)
+ LP_SEQ_PRINTF(L1DataCacheFlushes);
+ LP_SEQ_PRINTF(ImmediateL1DataCacheFlushes);
+ LP_SEQ_PRINTF(MbFlushes);
+ LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
+ LP_SEQ_PRINTF(CounterRefreshReferenceTime);
+ LP_SEQ_PRINTF(IdleAccumulationSnapshot);
+ LP_SEQ_PRINTF(ActiveTscCountSnapshot);
+ LP_SEQ_PRINTF(HwpRequestContextSwitches);
+ LP_SEQ_PRINTF(Placeholder1);
+ LP_SEQ_PRINTF(Placeholder2);
+ LP_SEQ_PRINTF(Placeholder3);
+ LP_SEQ_PRINTF(Placeholder4);
+ LP_SEQ_PRINTF(Placeholder5);
+ LP_SEQ_PRINTF(Placeholder6);
+ LP_SEQ_PRINTF(Placeholder7);
+ LP_SEQ_PRINTF(Placeholder8);
+ LP_SEQ_PRINTF(Placeholder9);
+ LP_SEQ_PRINTF(Placeholder10);
+ LP_SEQ_PRINTF(ReserveGroupId);
+ LP_SEQ_PRINTF(RunningPriority);
+ LP_SEQ_PRINTF(PerfmonInterruptCount);
+#elif IS_ENABLED(CONFIG_ARM64)
+ LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
+ LP_SEQ_PRINTF(CounterRefreshReferenceTime);
+ LP_SEQ_PRINTF(IdleAccumulationSnapshot);
+ LP_SEQ_PRINTF(ActiveTscCountSnapshot);
+ LP_SEQ_PRINTF(HwpRequestContextSwitches);
+ LP_SEQ_PRINTF(Placeholder2);
+ LP_SEQ_PRINTF(Placeholder3);
+ LP_SEQ_PRINTF(Placeholder4);
+ LP_SEQ_PRINTF(Placeholder5);
+ LP_SEQ_PRINTF(Placeholder6);
+ LP_SEQ_PRINTF(Placeholder7);
+ LP_SEQ_PRINTF(Placeholder8);
+ LP_SEQ_PRINTF(Placeholder9);
+ LP_SEQ_PRINTF(SchLocalRunListSize);
+ LP_SEQ_PRINTF(ReserveGroupId);
+ LP_SEQ_PRINTF(RunningPriority);
+#endif
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(lp_stats);
+
+static void mshv_lp_stats_unmap(u32 lp_index)
+{
+ union hv_stats_object_identity identity = {
+ .lp.lp_index = lp_index,
+ .lp.stats_area_type = HV_STATS_AREA_SELF,
+ };
+ int err;
+
+ err = hv_unmap_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR,
+ mshv_lps_stats[lp_index], &identity);
+ if (err)
+ pr_err("%s: failed to unmap logical processor %u stats, err: %d\n",
+ __func__, lp_index, err);
+}
+
+static struct hv_stats_page * __init mshv_lp_stats_map(u32 lp_index)
+{
+ union hv_stats_object_identity identity = {
+ .lp.lp_index = lp_index,
+ .lp.stats_area_type = HV_STATS_AREA_SELF,
+ };
+ struct hv_stats_page *stats;
+ int err;
+
+ err = hv_map_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR, &identity,
+ &stats);
+ if (err) {
+ pr_err("%s: failed to map logical processor %u stats, err: %d\n",
+ __func__, lp_index, err);
+ return ERR_PTR(err);
+ }
+ mshv_lps_stats[lp_index] = stats;
+
+ return stats;
+}
+
+static struct hv_stats_page * __init lp_debugfs_stats_create(u32 lp_index,
+ struct dentry *parent)
+{
+ struct dentry *dentry;
+ struct hv_stats_page *stats;
+
+ stats = mshv_lp_stats_map(lp_index);
+ if (IS_ERR(stats))
+ return stats;
+
+ dentry = debugfs_create_file("stats", 0400, parent,
+ stats, &lp_stats_fops);
+ if (IS_ERR(dentry)) {
+ mshv_lp_stats_unmap(lp_index);
+ return ERR_CAST(dentry);
+ }
+ return stats;
+}
+
+static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
+{
+ struct dentry *idx;
+ char lp_idx_str[U32_BUF_SZ];
+ struct hv_stats_page *stats;
+ int err;
+
+ sprintf(lp_idx_str, "%u", lp_index);
+
+ idx = debugfs_create_dir(lp_idx_str, parent);
+ if (IS_ERR(idx))
+ return PTR_ERR(idx);
+
+ stats = lp_debugfs_stats_create(lp_index, idx);
+ if (IS_ERR(stats)) {
+ err = PTR_ERR(stats);
+ goto remove_debugfs_lp_idx;
+ }
+
+ return 0;
+
+remove_debugfs_lp_idx:
+ debugfs_remove_recursive(idx);
+ return err;
+}
+
+static void mshv_debugfs_lp_remove(void)
+{
+ int lp_index;
+
+ debugfs_remove_recursive(mshv_debugfs_lp);
+
+ for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
+ mshv_lp_stats_unmap(lp_index);
+
+ kfree(mshv_lps_stats);
+ mshv_lps_stats = NULL;
+}
+
+static int __init mshv_debugfs_lp_create(struct dentry *parent)
+{
+ struct dentry *lp_dir;
+ int err, lp_index;
+
+ mshv_lps_stats = kcalloc(mshv_lps_count,
+ sizeof(*mshv_lps_stats),
+ GFP_KERNEL_ACCOUNT);
+
+ if (!mshv_lps_stats)
+ return -ENOMEM;
+
+ lp_dir = debugfs_create_dir("lp", parent);
+ if (IS_ERR(lp_dir)) {
+ err = PTR_ERR(lp_dir);
+ goto free_lp_stats;
+ }
+
+ for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
+ err = lp_debugfs_create(lp_index, lp_dir);
+ if (err)
+ goto remove_debugfs_lps;
+ }
+
+ mshv_debugfs_lp = lp_dir;
+
+ return 0;
+
+remove_debugfs_lps:
+ for (lp_index -= 1; lp_index >= 0; lp_index--)
+ mshv_lp_stats_unmap(lp_index);
+ debugfs_remove_recursive(lp_dir);
+free_lp_stats:
+ kfree(mshv_lps_stats);
+
+ return err;
+}
+
+static int vp_stats_show(struct seq_file *m, void *v)
+{
+ const struct hv_stats_page **pstats = m->private;
+
+/*
+ * For VP and partition stats, there may be two stats areas mapped, SELF and
+ * PARENT. These refer to the privilege level of the data in each page. Some
+ * fields may be 0 in SELF and nonzero in PARENT, or vice versa.
+ *
+ * Hence, prioritize printing from the PARENT page (more privileged data), but
+ * use the value from the SELF page if the PARENT value is 0.
+ */
+
+#define VP_SEQ_PRINTF(cnt) \
+do { \
+ if (pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]) \
+ seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+ pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
+ else \
+ seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+ pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
+} while (0)
+
+ VP_SEQ_PRINTF(TotalRunTime);
+ VP_SEQ_PRINTF(HypervisorRunTime);
+ VP_SEQ_PRINTF(RemoteNodeRunTime);
+ VP_SEQ_PRINTF(NormalizedRunTime);
+ VP_SEQ_PRINTF(IdealCpu);
+ VP_SEQ_PRINTF(HypercallsCount);
+ VP_SEQ_PRINTF(HypercallsTime);
+#if IS_ENABLED(CONFIG_X86_64)
+ VP_SEQ_PRINTF(PageInvalidationsCount);
+ VP_SEQ_PRINTF(PageInvalidationsTime);
+ VP_SEQ_PRINTF(ControlRegisterAccessesCount);
+ VP_SEQ_PRINTF(ControlRegisterAccessesTime);
+ VP_SEQ_PRINTF(IoInstructionsCount);
+ VP_SEQ_PRINTF(IoInstructionsTime);
+ VP_SEQ_PRINTF(HltInstructionsCount);
+ VP_SEQ_PRINTF(HltInstructionsTime);
+ VP_SEQ_PRINTF(MwaitInstructionsCount);
+ VP_SEQ_PRINTF(MwaitInstructionsTime);
+ VP_SEQ_PRINTF(CpuidInstructionsCount);
+ VP_SEQ_PRINTF(CpuidInstructionsTime);
+ VP_SEQ_PRINTF(MsrAccessesCount);
+ VP_SEQ_PRINTF(MsrAccessesTime);
+ VP_SEQ_PRINTF(OtherInterceptsCount);
+ VP_SEQ_PRINTF(OtherInterceptsTime);
+ VP_SEQ_PRINTF(ExternalInterruptsCount);
+ VP_SEQ_PRINTF(ExternalInterruptsTime);
+ VP_SEQ_PRINTF(PendingInterruptsCount);
+ VP_SEQ_PRINTF(PendingInterruptsTime);
+ VP_SEQ_PRINTF(EmulatedInstructionsCount);
+ VP_SEQ_PRINTF(EmulatedInstructionsTime);
+ VP_SEQ_PRINTF(DebugRegisterAccessesCount);
+ VP_SEQ_PRINTF(DebugRegisterAccessesTime);
+ VP_SEQ_PRINTF(PageFaultInterceptsCount);
+ VP_SEQ_PRINTF(PageFaultInterceptsTime);
+ VP_SEQ_PRINTF(GuestPageTableMaps);
+ VP_SEQ_PRINTF(LargePageTlbFills);
+ VP_SEQ_PRINTF(SmallPageTlbFills);
+ VP_SEQ_PRINTF(ReflectedGuestPageFaults);
+ VP_SEQ_PRINTF(ApicMmioAccesses);
+ VP_SEQ_PRINTF(IoInterceptMessages);
+ VP_SEQ_PRINTF(MemoryInterceptMessages);
+ VP_SEQ_PRINTF(ApicEoiAccesses);
+ VP_SEQ_PRINTF(OtherMessages);
+ VP_SEQ_PRINTF(PageTableAllocations);
+ VP_SEQ_PRINTF(LogicalProcessorMigrations);
+ VP_SEQ_PRINTF(AddressSpaceEvictions);
+ VP_SEQ_PRINTF(AddressSpaceSwitches);
+ VP_SEQ_PRINTF(AddressDomainFlushes);
+ VP_SEQ_PRINTF(AddressSpaceFlushes);
+ VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
+ VP_SEQ_PRINTF(LocalGvaRangeFlushes);
+ VP_SEQ_PRINTF(PageTableEvictions);
+ VP_SEQ_PRINTF(PageTableReclamations);
+ VP_SEQ_PRINTF(PageTableResets);
+ VP_SEQ_PRINTF(PageTableValidations);
+ VP_SEQ_PRINTF(ApicTprAccesses);
+ VP_SEQ_PRINTF(PageTableWriteIntercepts);
+ VP_SEQ_PRINTF(SyntheticInterrupts);
+ VP_SEQ_PRINTF(VirtualInterrupts);
+ VP_SEQ_PRINTF(ApicIpisSent);
+ VP_SEQ_PRINTF(ApicSelfIpisSent);
+ VP_SEQ_PRINTF(GpaSpaceHypercalls);
+ VP_SEQ_PRINTF(LogicalProcessorHypercalls);
+ VP_SEQ_PRINTF(LongSpinWaitHypercalls);
+ VP_SEQ_PRINTF(OtherHypercalls);
+ VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
+ VP_SEQ_PRINTF(VirtualInterruptHypercalls);
+ VP_SEQ_PRINTF(VirtualMmuHypercalls);
+ VP_SEQ_PRINTF(VirtualProcessorHypercalls);
+ VP_SEQ_PRINTF(HardwareInterrupts);
+ VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
+ VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
+ VP_SEQ_PRINTF(PageScans);
+ VP_SEQ_PRINTF(LogicalProcessorDispatches);
+ VP_SEQ_PRINTF(WaitingForCpuTime);
+ VP_SEQ_PRINTF(ExtendedHypercalls);
+ VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
+ VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
+ VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
+ VP_SEQ_PRINTF(GlobalIoTlbFlushes);
+ VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
+ VP_SEQ_PRINTF(LocalIoTlbFlushes);
+ VP_SEQ_PRINTF(LocalIoTlbFlushCost);
+ VP_SEQ_PRINTF(HypercallsForwardedCount);
+ VP_SEQ_PRINTF(HypercallsForwardingTime);
+ VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
+ VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
+ VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
+ VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
+ VP_SEQ_PRINTF(IoInstructionsForwardedCount);
+ VP_SEQ_PRINTF(IoInstructionsForwardingTime);
+ VP_SEQ_PRINTF(HltInstructionsForwardedCount);
+ VP_SEQ_PRINTF(HltInstructionsForwardingTime);
+ VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
+ VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
+ VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
+ VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
+ VP_SEQ_PRINTF(MsrAccessesForwardedCount);
+ VP_SEQ_PRINTF(MsrAccessesForwardingTime);
+ VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
+ VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
+ VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
+ VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
+ VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
+ VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
+ VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
+ VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
+ VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
+ VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
+ VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
+ VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
+ VP_SEQ_PRINTF(VmclearEmulationCount);
+ VP_SEQ_PRINTF(VmclearEmulationTime);
+ VP_SEQ_PRINTF(VmptrldEmulationCount);
+ VP_SEQ_PRINTF(VmptrldEmulationTime);
+ VP_SEQ_PRINTF(VmptrstEmulationCount);
+ VP_SEQ_PRINTF(VmptrstEmulationTime);
+ VP_SEQ_PRINTF(VmreadEmulationCount);
+ VP_SEQ_PRINTF(VmreadEmulationTime);
+ VP_SEQ_PRINTF(VmwriteEmulationCount);
+ VP_SEQ_PRINTF(VmwriteEmulationTime);
+ VP_SEQ_PRINTF(VmxoffEmulationCount);
+ VP_SEQ_PRINTF(VmxoffEmulationTime);
+ VP_SEQ_PRINTF(VmxonEmulationCount);
+ VP_SEQ_PRINTF(VmxonEmulationTime);
+ VP_SEQ_PRINTF(NestedVMEntriesCount);
+ VP_SEQ_PRINTF(NestedVMEntriesTime);
+ VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
+ VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
+ VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
+ VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
+ VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
+ VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
+ VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
+ VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
+ VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
+ VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
+ VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
+ VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
+ VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
+ VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
+ VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
+ VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
+ VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
+ VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
+ VP_SEQ_PRINTF(PostedInterruptNotifications);
+ VP_SEQ_PRINTF(PostedInterruptScans);
+ VP_SEQ_PRINTF(TotalCoreRunTime);
+ VP_SEQ_PRINTF(MaximumRunTime);
+ VP_SEQ_PRINTF(HwpRequestContextSwitches);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
+ VP_SEQ_PRINTF(VmloadEmulationCount);
+ VP_SEQ_PRINTF(VmloadEmulationTime);
+ VP_SEQ_PRINTF(VmsaveEmulationCount);
+ VP_SEQ_PRINTF(VmsaveEmulationTime);
+ VP_SEQ_PRINTF(GifInstructionEmulationCount);
+ VP_SEQ_PRINTF(GifInstructionEmulationTime);
+ VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
+ VP_SEQ_PRINTF(Placeholder1);
+ VP_SEQ_PRINTF(Placeholder2);
+ VP_SEQ_PRINTF(Placeholder3);
+ VP_SEQ_PRINTF(Placeholder4);
+ VP_SEQ_PRINTF(Placeholder5);
+ VP_SEQ_PRINTF(Placeholder6);
+ VP_SEQ_PRINTF(Placeholder7);
+ VP_SEQ_PRINTF(Placeholder8);
+ VP_SEQ_PRINTF(Placeholder9);
+ VP_SEQ_PRINTF(Placeholder10);
+ VP_SEQ_PRINTF(SchedulingPriority);
+ VP_SEQ_PRINTF(RdpmcInstructionsCount);
+ VP_SEQ_PRINTF(RdpmcInstructionsTime);
+ VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
+ VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
+ VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
+ VP_SEQ_PRINTF(PerfmonInterruptCount);
+ VP_SEQ_PRINTF(Vtl1DispatchCount);
+ VP_SEQ_PRINTF(Vtl2DispatchCount);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket0);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket1);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket2);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket3);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket4);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket5);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket6);
+ VP_SEQ_PRINTF(Vtl1RunTime);
+ VP_SEQ_PRINTF(Vtl2RunTime);
+ VP_SEQ_PRINTF(IommuHypercalls);
+ VP_SEQ_PRINTF(CpuGroupHypercalls);
+ VP_SEQ_PRINTF(VsmHypercalls);
+ VP_SEQ_PRINTF(EventLogHypercalls);
+ VP_SEQ_PRINTF(DeviceDomainHypercalls);
+ VP_SEQ_PRINTF(DepositHypercalls);
+ VP_SEQ_PRINTF(SvmHypercalls);
+ VP_SEQ_PRINTF(BusLockAcquisitionCount);
+ VP_SEQ_PRINTF(LoadAvg);
+#elif IS_ENABLED(CONFIG_ARM64)
+ VP_SEQ_PRINTF(SysRegAccessesCount);
+ VP_SEQ_PRINTF(SysRegAccessesTime);
+ VP_SEQ_PRINTF(SmcInstructionsCount);
+ VP_SEQ_PRINTF(SmcInstructionsTime);
+ VP_SEQ_PRINTF(OtherInterceptsCount);
+ VP_SEQ_PRINTF(OtherInterceptsTime);
+ VP_SEQ_PRINTF(ExternalInterruptsCount);
+ VP_SEQ_PRINTF(ExternalInterruptsTime);
+ VP_SEQ_PRINTF(PendingInterruptsCount);
+ VP_SEQ_PRINTF(PendingInterruptsTime);
+ VP_SEQ_PRINTF(GuestPageTableMaps);
+ VP_SEQ_PRINTF(LargePageTlbFills);
+ VP_SEQ_PRINTF(SmallPageTlbFills);
+ VP_SEQ_PRINTF(ReflectedGuestPageFaults);
+ VP_SEQ_PRINTF(MemoryInterceptMessages);
+ VP_SEQ_PRINTF(OtherMessages);
+ VP_SEQ_PRINTF(LogicalProcessorMigrations);
+ VP_SEQ_PRINTF(AddressDomainFlushes);
+ VP_SEQ_PRINTF(AddressSpaceFlushes);
+ VP_SEQ_PRINTF(SyntheticInterrupts);
+ VP_SEQ_PRINTF(VirtualInterrupts);
+ VP_SEQ_PRINTF(ApicSelfIpisSent);
+ VP_SEQ_PRINTF(GpaSpaceHypercalls);
+ VP_SEQ_PRINTF(LogicalProcessorHypercalls);
+ VP_SEQ_PRINTF(LongSpinWaitHypercalls);
+ VP_SEQ_PRINTF(OtherHypercalls);
+ VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
+ VP_SEQ_PRINTF(VirtualInterruptHypercalls);
+ VP_SEQ_PRINTF(VirtualMmuHypercalls);
+ VP_SEQ_PRINTF(VirtualProcessorHypercalls);
+ VP_SEQ_PRINTF(HardwareInterrupts);
+ VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
+ VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
+ VP_SEQ_PRINTF(LogicalProcessorDispatches);
+ VP_SEQ_PRINTF(WaitingForCpuTime);
+ VP_SEQ_PRINTF(ExtendedHypercalls);
+ VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
+ VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
+ VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
+ VP_SEQ_PRINTF(GlobalIoTlbFlushes);
+ VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
+ VP_SEQ_PRINTF(LocalIoTlbFlushes);
+ VP_SEQ_PRINTF(LocalIoTlbFlushCost);
+ VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
+ VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
+ VP_SEQ_PRINTF(PostedInterruptNotifications);
+ VP_SEQ_PRINTF(PostedInterruptScans);
+ VP_SEQ_PRINTF(TotalCoreRunTime);
+ VP_SEQ_PRINTF(MaximumRunTime);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
+ VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
+ VP_SEQ_PRINTF(HwpRequestContextSwitches);
+ VP_SEQ_PRINTF(Placeholder2);
+ VP_SEQ_PRINTF(Placeholder3);
+ VP_SEQ_PRINTF(Placeholder4);
+ VP_SEQ_PRINTF(Placeholder5);
+ VP_SEQ_PRINTF(Placeholder6);
+ VP_SEQ_PRINTF(Placeholder7);
+ VP_SEQ_PRINTF(Placeholder8);
+ VP_SEQ_PRINTF(ContentionTime);
+ VP_SEQ_PRINTF(WakeUpTime);
+ VP_SEQ_PRINTF(SchedulingPriority);
+ VP_SEQ_PRINTF(Vtl1DispatchCount);
+ VP_SEQ_PRINTF(Vtl2DispatchCount);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket0);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket1);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket2);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket3);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket4);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket5);
+ VP_SEQ_PRINTF(Vtl2DispatchBucket6);
+ VP_SEQ_PRINTF(Vtl1RunTime);
+ VP_SEQ_PRINTF(Vtl2RunTime);
+ VP_SEQ_PRINTF(IommuHypercalls);
+ VP_SEQ_PRINTF(CpuGroupHypercalls);
+ VP_SEQ_PRINTF(VsmHypercalls);
+ VP_SEQ_PRINTF(EventLogHypercalls);
+ VP_SEQ_PRINTF(DeviceDomainHypercalls);
+ VP_SEQ_PRINTF(DepositHypercalls);
+ VP_SEQ_PRINTF(SvmHypercalls);
+ VP_SEQ_PRINTF(LoadAvg);
+#endif
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(vp_stats);
+
+static void vp_debugfs_remove(struct dentry *vp_stats)
+{
+ debugfs_remove_recursive(vp_stats->d_parent);
+}
+
+static int vp_debugfs_create(u64 partition_id, u32 vp_index,
+ struct hv_stats_page **pstats,
+ struct dentry **vp_stats_ptr,
+ struct dentry *parent)
+{
+ struct dentry *vp_idx_dir, *d;
+ char vp_idx_str[U32_BUF_SZ];
+ int err;
+
+ sprintf(vp_idx_str, "%u", vp_index);
+
+ vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
+ if (IS_ERR(vp_idx_dir))
+ return PTR_ERR(vp_idx_dir);
+
+ d = debugfs_create_file("stats", 0400, vp_idx_dir,
+ pstats, &vp_stats_fops);
+ if (IS_ERR(d)) {
+ err = PTR_ERR(d);
+ goto remove_debugfs_vp_idx;
+ }
+
+ *vp_stats_ptr = d;
+
+ return 0;
+
+remove_debugfs_vp_idx:
+ debugfs_remove_recursive(vp_idx_dir);
+ return err;
+}
+
+static int partition_stats_show(struct seq_file *m, void *v)
+{
+ const struct hv_stats_page **pstats = m->private;
+
+#define PARTITION_SEQ_PRINTF(cnt) \
+do { \
+ if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
+ seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+ pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
+ else \
+ seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+ pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
+} while (0)
+
+ PARTITION_SEQ_PRINTF(VirtualProcessors);
+ PARTITION_SEQ_PRINTF(TlbSize);
+ PARTITION_SEQ_PRINTF(AddressSpaces);
+ PARTITION_SEQ_PRINTF(DepositedPages);
+ PARTITION_SEQ_PRINTF(GpaPages);
+ PARTITION_SEQ_PRINTF(GpaSpaceModifications);
+ PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
+ PARTITION_SEQ_PRINTF(RecommendedTlbSize);
+ PARTITION_SEQ_PRINTF(GpaPages4K);
+ PARTITION_SEQ_PRINTF(GpaPages2M);
+ PARTITION_SEQ_PRINTF(GpaPages1G);
+ PARTITION_SEQ_PRINTF(GpaPages512G);
+ PARTITION_SEQ_PRINTF(DevicePages4K);
+ PARTITION_SEQ_PRINTF(DevicePages2M);
+ PARTITION_SEQ_PRINTF(DevicePages1G);
+ PARTITION_SEQ_PRINTF(DevicePages512G);
+ PARTITION_SEQ_PRINTF(AttachedDevices);
+ PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
+ PARTITION_SEQ_PRINTF(IoTlbFlushes);
+ PARTITION_SEQ_PRINTF(IoTlbFlushCost);
+ PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
+ PARTITION_SEQ_PRINTF(DeviceDmaErrors);
+ PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
+ PARTITION_SEQ_PRINTF(SkippedTimerTicks);
+ PARTITION_SEQ_PRINTF(PartitionId);
+#if IS_ENABLED(CONFIG_X86_64)
+ PARTITION_SEQ_PRINTF(NestedTlbSize);
+ PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
+ PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
+ PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
+ PARTITION_SEQ_PRINTF(PagesShattered);
+ PARTITION_SEQ_PRINTF(PagesRecombined);
+ PARTITION_SEQ_PRINTF(HwpRequestValue);
+#elif IS_ENABLED(CONFIG_ARM64)
+ PARTITION_SEQ_PRINTF(HwpRequestValue);
+#endif
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(partition_stats);
+
+static void mshv_partition_stats_unmap(u64 partition_id,
+ struct hv_stats_page *stats_page,
+ enum hv_stats_area_type stats_area_type)
+{
+ union hv_stats_object_identity identity = {
+ .partition.partition_id = partition_id,
+ .partition.stats_area_type = stats_area_type,
+ };
+ int err;
+
+ err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page,
+ &identity);
+ if (err)
+ pr_err("%s: failed to unmap partition %llu %s stats, err: %d\n",
+ __func__, partition_id,
+ (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+ err);
+}
+
+static struct hv_stats_page *mshv_partition_stats_map(u64 partition_id,
+ enum hv_stats_area_type stats_area_type)
+{
+ union hv_stats_object_identity identity = {
+ .partition.partition_id = partition_id,
+ .partition.stats_area_type = stats_area_type,
+ };
+ struct hv_stats_page *stats;
+ int err;
+
+ err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
+ if (err) {
+ pr_err("%s: failed to map partition %llu %s stats, err: %d\n",
+ __func__, partition_id,
+ (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+ err);
+ return ERR_PTR(err);
+ }
+ return stats;
+}
+
+static int mshv_debugfs_partition_stats_create(u64 partition_id,
+ struct dentry **partition_stats_ptr,
+ struct dentry *parent)
+{
+ struct dentry *dentry;
+ struct hv_stats_page **pstats;
+ int err;
+
+ pstats = kcalloc(NUM_STATS_AREAS, sizeof(struct hv_stats_page *),
+ GFP_KERNEL_ACCOUNT);
+ if (!pstats)
+ return -ENOMEM;
+
+ pstats[HV_STATS_AREA_SELF] = mshv_partition_stats_map(partition_id,
+ HV_STATS_AREA_SELF);
+ if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
+ err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
+ goto cleanup;
+ }
+
+ /*
+ * An L1VH partition cannot access its parent-area stats; alias SELF.
+ */
+ if (is_l1vh_parent(partition_id)) {
+ pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+ } else {
+ pstats[HV_STATS_AREA_PARENT] = mshv_partition_stats_map(partition_id,
+ HV_STATS_AREA_PARENT);
+ if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
+ err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
+ goto unmap_self;
+ }
+ if (!pstats[HV_STATS_AREA_PARENT])
+ pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+ }
+
+ dentry = debugfs_create_file("stats", 0400, parent,
+ pstats, &partition_stats_fops);
+ if (IS_ERR(dentry)) {
+ err = PTR_ERR(dentry);
+ goto unmap_partition_stats;
+ }
+
+ *partition_stats_ptr = dentry;
+ return 0;
+
+unmap_partition_stats:
+ if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
+ mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_PARENT],
+ HV_STATS_AREA_PARENT);
+unmap_self:
+ mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_SELF],
+ HV_STATS_AREA_SELF);
+cleanup:
+ kfree(pstats);
+ return err;
+}
+
+static void partition_debugfs_remove(u64 partition_id, struct dentry *dentry)
+{
+ struct hv_stats_page **pstats;
+
+ pstats = dentry->d_inode->i_private;
+
+ debugfs_remove_recursive(dentry->d_parent);
+
+ if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
+ mshv_partition_stats_unmap(partition_id,
+ pstats[HV_STATS_AREA_PARENT],
+ HV_STATS_AREA_PARENT);
+ }
+
+ mshv_partition_stats_unmap(partition_id,
+ pstats[HV_STATS_AREA_SELF],
+ HV_STATS_AREA_SELF);
+
+ kfree(pstats);
+}
+
+static int partition_debugfs_create(u64 partition_id,
+ struct dentry **vp_dir_ptr,
+ struct dentry **partition_stats_ptr,
+ struct dentry *parent)
+{
+ char part_id_str[U64_BUF_SZ];
+ struct dentry *part_id_dir, *vp_dir;
+ int err;
+
+ if (is_l1vh_parent(partition_id))
+ sprintf(part_id_str, "self");
+ else
+ sprintf(part_id_str, "%llu", partition_id);
+
+ part_id_dir = debugfs_create_dir(part_id_str, parent);
+ if (IS_ERR(part_id_dir))
+ return PTR_ERR(part_id_dir);
+
+ vp_dir = debugfs_create_dir("vp", part_id_dir);
+ if (IS_ERR(vp_dir)) {
+ err = PTR_ERR(vp_dir);
+ goto remove_debugfs_partition_id;
+ }
+
+ err = mshv_debugfs_partition_stats_create(partition_id,
+ partition_stats_ptr,
+ part_id_dir);
+ if (err)
+ goto remove_debugfs_partition_id;
+
+ *vp_dir_ptr = vp_dir;
+
+ return 0;
+
+remove_debugfs_partition_id:
+ debugfs_remove_recursive(part_id_dir);
+ return err;
+}
+
+static void parent_vp_debugfs_remove(u32 vp_index,
+ struct dentry *vp_stats_ptr)
+{
+ struct hv_stats_page **pstats;
+
+ pstats = vp_stats_ptr->d_inode->i_private;
+ vp_debugfs_remove(vp_stats_ptr);
+ mshv_vp_stats_unmap(hv_current_partition_id, vp_index, pstats);
+ kfree(pstats);
+}
+
+static void mshv_debugfs_parent_partition_remove(void)
+{
+ int idx;
+
+ for_each_online_cpu(idx)
+ parent_vp_debugfs_remove(hv_vp_index[idx],
+ parent_vp_stats[idx]);
+
+ partition_debugfs_remove(hv_current_partition_id,
+ parent_partition_stats);
+ kfree(parent_vp_stats);
+ parent_vp_stats = NULL;
+ parent_partition_stats = NULL;
+
+}
+
+static int __init parent_vp_debugfs_create(u32 vp_index,
+ struct dentry **vp_stats_ptr,
+ struct dentry *parent)
+{
+ struct hv_stats_page **pstats;
+ int err;
+
+ pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
+ if (!pstats)
+ return -ENOMEM;
+
+ err = mshv_vp_stats_map(hv_current_partition_id, vp_index, pstats);
+ if (err)
+ goto cleanup;
+
+ err = vp_debugfs_create(hv_current_partition_id, vp_index, pstats,
+ vp_stats_ptr, parent);
+ if (err)
+ goto unmap_vp_stats;
+
+ return 0;
+
+unmap_vp_stats:
+ mshv_vp_stats_unmap(hv_current_partition_id, vp_index, pstats);
+cleanup:
+ kfree(pstats);
+ return err;
+}
+
+static int __init mshv_debugfs_parent_partition_create(void)
+{
+ struct dentry *vp_dir;
+ int err, idx, i;
+
+ mshv_debugfs_partition = debugfs_create_dir("partition",
+ mshv_debugfs);
+ if (IS_ERR(mshv_debugfs_partition))
+ return PTR_ERR(mshv_debugfs_partition);
+
+ err = partition_debugfs_create(hv_current_partition_id,
+ &vp_dir,
+ &parent_partition_stats,
+ mshv_debugfs_partition);
+ if (err)
+ goto remove_debugfs_partition;
+
+ parent_vp_stats = kcalloc(num_possible_cpus(),
+ sizeof(*parent_vp_stats),
+ GFP_KERNEL);
+ if (!parent_vp_stats) {
+ err = -ENOMEM;
+ goto remove_debugfs_partition;
+ }
+
+ for_each_online_cpu(idx) {
+ err = parent_vp_debugfs_create(hv_vp_index[idx],
+ &parent_vp_stats[idx],
+ vp_dir);
+ if (err)
+ goto remove_debugfs_partition_vp;
+ }
+
+ return 0;
+
+remove_debugfs_partition_vp:
+ for_each_online_cpu(i) {
+ if (i >= idx)
+ break;
+ parent_vp_debugfs_remove(hv_vp_index[i], parent_vp_stats[i]);
+ }
+ partition_debugfs_remove(hv_current_partition_id,
+ parent_partition_stats);
+
+ kfree(parent_vp_stats);
+ parent_vp_stats = NULL;
+ parent_partition_stats = NULL;
+
+remove_debugfs_partition:
+ debugfs_remove_recursive(mshv_debugfs_partition);
+ mshv_debugfs_partition = NULL;
+ return err;
+}
+
+static int hv_stats_show(struct seq_file *m, void *v)
+{
+ const struct hv_stats_page *stats = m->private;
+
+#define HV_SEQ_PRINTF(cnt) \
+ seq_printf(m, "%-25s: %llu\n", __stringify(cnt), stats->hv_cntrs[Hv##cnt])
+
+ HV_SEQ_PRINTF(LogicalProcessors);
+ HV_SEQ_PRINTF(Partitions);
+ HV_SEQ_PRINTF(TotalPages);
+ HV_SEQ_PRINTF(VirtualProcessors);
+ HV_SEQ_PRINTF(MonitoredNotifications);
+ HV_SEQ_PRINTF(ModernStandbyEntries);
+ HV_SEQ_PRINTF(PlatformIdleTransitions);
+ HV_SEQ_PRINTF(HypervisorStartupCost);
+ HV_SEQ_PRINTF(IOSpacePages);
+ HV_SEQ_PRINTF(NonEssentialPagesForDump);
+ HV_SEQ_PRINTF(SubsumedPages);
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(hv_stats);
+
+static void mshv_hv_stats_unmap(void)
+{
+ union hv_stats_object_identity identity = {
+ .hv.stats_area_type = HV_STATS_AREA_SELF,
+ };
+ int err;
+
+ err = hv_unmap_stats_page(HV_STATS_OBJECT_HYPERVISOR, NULL, &identity);
+ if (err)
+ pr_err("%s: failed to unmap hypervisor stats: %d\n",
+ __func__, err);
+}
+
+static void * __init mshv_hv_stats_map(void)
+{
+ union hv_stats_object_identity identity = {
+ .hv.stats_area_type = HV_STATS_AREA_SELF,
+ };
+ struct hv_stats_page *stats;
+ int err;
+
+ err = hv_map_stats_page(HV_STATS_OBJECT_HYPERVISOR, &identity, &stats);
+ if (err) {
+ pr_err("%s: failed to map hypervisor stats: %d\n",
+ __func__, err);
+ return ERR_PTR(err);
+ }
+ return stats;
+}
+
+static int __init mshv_debugfs_hv_stats_create(struct dentry *parent)
+{
+ struct dentry *dentry;
+ u64 *stats;
+ int err;
+
+ stats = mshv_hv_stats_map();
+ if (IS_ERR(stats))
+ return PTR_ERR(stats);
+
+ dentry = debugfs_create_file("stats", 0400, parent,
+ stats, &hv_stats_fops);
+ if (IS_ERR(dentry)) {
+ err = PTR_ERR(dentry);
+ pr_err("%s: failed to create hypervisor stats dentry: %d\n",
+ __func__, err);
+ goto unmap_hv_stats;
+ }
+
+ mshv_lps_count = stats[HvLogicalProcessors];
+
+ return 0;
+
+unmap_hv_stats:
+ mshv_hv_stats_unmap();
+ return err;
+}
+
+int mshv_debugfs_vp_create(struct mshv_vp *vp)
+{
+ struct mshv_partition *p = vp->vp_partition;
+
+ if (!mshv_debugfs)
+ return 0;
+
+ return vp_debugfs_create(p->pt_id, vp->vp_index,
+ vp->vp_stats_pages,
+ &vp->vp_stats_dentry,
+ p->pt_vp_dentry);
+}
+
+void mshv_debugfs_vp_remove(struct mshv_vp *vp)
+{
+ if (!mshv_debugfs)
+ return;
+
+ vp_debugfs_remove(vp->vp_stats_dentry);
+}
+
+int mshv_debugfs_partition_create(struct mshv_partition *partition)
+{
+ int err;
+
+ if (!mshv_debugfs)
+ return 0;
+
+ err = partition_debugfs_create(partition->pt_id,
+ &partition->pt_vp_dentry,
+ &partition->pt_stats_dentry,
+ mshv_debugfs_partition);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+void mshv_debugfs_partition_remove(struct mshv_partition *partition)
+{
+ if (!mshv_debugfs)
+ return;
+
+ partition_debugfs_remove(partition->pt_id,
+ partition->pt_stats_dentry);
+}
+
+int __init mshv_debugfs_init(void)
+{
+ int err;
+
+ mshv_debugfs = debugfs_create_dir("mshv", NULL);
+ if (IS_ERR(mshv_debugfs)) {
+ pr_err("%s: failed to create debugfs directory\n", __func__);
+ return PTR_ERR(mshv_debugfs);
+ }
+
+ if (hv_root_partition()) {
+ err = mshv_debugfs_hv_stats_create(mshv_debugfs);
+ if (err)
+ goto remove_mshv_dir;
+
+ err = mshv_debugfs_lp_create(mshv_debugfs);
+ if (err)
+ goto unmap_hv_stats;
+ }
+
+ err = mshv_debugfs_parent_partition_create();
+ if (err)
+ goto unmap_lp_stats;
+
+ return 0;
+
+unmap_lp_stats:
+ if (hv_root_partition()) {
+ mshv_debugfs_lp_remove();
+ mshv_debugfs_lp = NULL;
+ }
+unmap_hv_stats:
+ if (hv_root_partition())
+ mshv_hv_stats_unmap();
+remove_mshv_dir:
+ debugfs_remove_recursive(mshv_debugfs);
+ mshv_debugfs = NULL;
+ return err;
+}
+
+void mshv_debugfs_exit(void)
+{
+ mshv_debugfs_parent_partition_remove();
+
+ if (hv_root_partition()) {
+ mshv_debugfs_lp_remove();
+ mshv_debugfs_lp = NULL;
+ mshv_hv_stats_unmap();
+ }
+
+ debugfs_remove_recursive(mshv_debugfs);
+ mshv_debugfs = NULL;
+ mshv_debugfs_partition = NULL;
+}
diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
index e4912b0618fa..7332d9af8373 100644
--- a/drivers/hv/mshv_root.h
+++ b/drivers/hv/mshv_root.h
@@ -52,6 +52,9 @@ struct mshv_vp {
unsigned int kicked_by_hv;
wait_queue_head_t vp_suspend_queue;
} run;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+ struct dentry *vp_stats_dentry;
+#endif
};
#define vp_fmt(fmt) "p%lluvp%u: " fmt
@@ -136,6 +139,10 @@ struct mshv_partition {
u64 isolation_type;
bool import_completed;
bool pt_initialized;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+ struct dentry *pt_stats_dentry;
+ struct dentry *pt_vp_dentry;
+#endif
};
#define pt_fmt(fmt) "p%llu: " fmt
@@ -327,6 +334,33 @@ int hv_call_modify_spa_host_access(u64 partition_id, struct page **pages,
int hv_call_get_partition_property_ex(u64 partition_id, u64 property_code, u64 arg,
void *property_value, size_t property_value_sz);
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+int __init mshv_debugfs_init(void);
+void mshv_debugfs_exit(void);
+
+int mshv_debugfs_partition_create(struct mshv_partition *partition);
+void mshv_debugfs_partition_remove(struct mshv_partition *partition);
+int mshv_debugfs_vp_create(struct mshv_vp *vp);
+void mshv_debugfs_vp_remove(struct mshv_vp *vp);
+#else
+static inline int __init mshv_debugfs_init(void)
+{
+ return 0;
+}
+static inline void mshv_debugfs_exit(void) { }
+
+static inline int mshv_debugfs_partition_create(struct mshv_partition *partition)
+{
+ return 0;
+}
+static inline void mshv_debugfs_partition_remove(struct mshv_partition *partition) { }
+static inline int mshv_debugfs_vp_create(struct mshv_vp *vp)
+{
+ return 0;
+}
+static inline void mshv_debugfs_vp_remove(struct mshv_vp *vp) { }
+#endif
+
extern struct mshv_root mshv_root;
extern enum hv_scheduler_type hv_scheduler_type;
extern u8 * __percpu *hv_synic_eventring_tail;
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 724bbaa0b08c..9d46ddb43d70 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -1089,6 +1089,10 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
+ ret = mshv_debugfs_vp_create(vp);
+ if (ret)
+ goto put_partition;
+
/*
* Keep anon_inode_getfd last: it installs fd in the file struct and
* thus makes the state accessible in user space.
@@ -1096,7 +1100,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
ret = anon_inode_getfd("mshv_vp", &mshv_vp_fops, vp,
O_RDWR | O_CLOEXEC);
if (ret < 0)
- goto put_partition;
+ goto remove_debugfs_vp;
/* already exclusive with the partition mutex for all ioctls */
partition->pt_vp_count++;
@@ -1104,6 +1108,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
return ret;
+remove_debugfs_vp:
+ mshv_debugfs_vp_remove(vp);
put_partition:
mshv_partition_put(partition);
free_vp:
@@ -1546,10 +1552,16 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
if (ret)
goto withdraw_mem;
+ ret = mshv_debugfs_partition_create(partition);
+ if (ret)
+ goto finalize_partition;
+
partition->pt_initialized = true;
return 0;
+finalize_partition:
+ hv_call_finalize_partition(partition->pt_id);
withdraw_mem:
hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
@@ -1729,6 +1741,7 @@ static void destroy_partition(struct mshv_partition *partition)
if (!vp)
continue;
+ mshv_debugfs_vp_remove(vp);
mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
vp->vp_stats_pages);
@@ -1762,6 +1775,8 @@ static void destroy_partition(struct mshv_partition *partition)
partition->pt_vp_array[i] = NULL;
}
+ mshv_debugfs_partition_remove(partition);
+
/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
hv_call_finalize_partition(partition->pt_id);
@@ -2307,10 +2322,14 @@ static int __init mshv_parent_partition_init(void)
mshv_init_vmm_caps(dev);
- ret = mshv_irqfd_wq_init();
+ ret = mshv_debugfs_init();
if (ret)
goto exit_partition;
+ ret = mshv_irqfd_wq_init();
+ if (ret)
+ goto exit_debugfs;
+
spin_lock_init(&mshv_root.pt_ht_lock);
hash_init(mshv_root.pt_htable);
@@ -2318,6 +2337,8 @@ static int __init mshv_parent_partition_init(void)
return 0;
+exit_debugfs:
+ mshv_debugfs_exit();
exit_partition:
if (hv_root_partition())
mshv_root_partition_exit();
@@ -2334,6 +2355,7 @@ static void __exit mshv_parent_partition_exit(void)
{
hv_setup_mshv_handler(NULL);
mshv_port_table_fini();
+ mshv_debugfs_exit();
misc_deregister(&mshv_dev);
mshv_irqfd_wq_cleanup();
if (hv_root_partition())
--
2.34.1
^ permalink raw reply related
* Re: [EXTERNAL] Re: [PATCH V2,net-next, 1/2] net: mana: Add support for coalesced RX packets on CQE
From: Jakub Kicinski @ 2026-01-15 2:54 UTC (permalink / raw)
To: Haiyang Zhang
Cc: Haiyang Zhang, linux-hyperv@vger.kernel.org,
netdev@vger.kernel.org, KY Srinivasan, Wei Liu, Dexuan Cui,
Long Li, Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Konstantin Taranov, Simon Horman, Erni Sri Satya Vennela,
Shradha Gupta, Saurabh Sengar, Aditya Garg, Dipayaan Roy,
Shiraz Saleem, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, Paul Rosswurm
In-Reply-To: <SA3PR21MB38676C98AA702F212CE391E2CA8FA@SA3PR21MB3867.namprd21.prod.outlook.com>
On Wed, 14 Jan 2026 18:27:50 +0000 Haiyang Zhang wrote:
> > > And, the coalescing can add up to 2 microseconds into one-way latency.
> >
> > I am asking you how the _device_ (hypervisor?) decides when to coalesce
> > and when to send a partial CQE (<4 packets in 4 pkt CQE). You are using
> > the coalescing uAPI, so I'm trying to make sure this is the correct API.
> > CQE configuration can also be done via ringparam.
>
> When coalescing is enabled, the device waits for packets which can
> have the CQE coalesced with previous packet(s). That coalescing process
> is finished (and a CQE written to the appropriate CQ) when the CQE is
> filled with 4 pkts, or time expired, or other device specific logic is
> satisfied.
See, what I'm afraid is happening here is that you are enabling
completion coalescing (how long the device keeps the CQE pending).
Which is _not_ what rx_max_coalesced_frames controls for most NICs.
For most NICs rx_max_coalesced_frames controls IRQ generation logic.
The NIC first buffers up CQEs for typically single digit usecs, and
then once CQE timer expired and writeback happened it starts an IRQ
coalescing timer. Once the IRQ coalescing timer expires IRQ is
triggered, which schedules NAPI. (broad strokes, obviously many
differences and optimizations exist)
Is my guess correct? Are you controlling CQE coalescing?
Can you control the timeout instead of the frame count?
^ permalink raw reply
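The two stages described in the reply above (CQE coalescing first, IRQ
coalescing afterwards) can be modelled roughly as below. This is only a
sketch in userspace C: struct coalesce_model, its fields, and both helper
functions are hypothetical names, not part of the mana driver or the
ethtool uAPI, and the timer values used with it are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model parameters; none of these names exist in the kernel. */
struct coalesce_model {
	uint32_t cqe_max_pkts;   /* packets per CQE before writeback (4 in the thread) */
	uint32_t cqe_timeout_us; /* how long the device may hold a partial CQE */
	uint32_t irq_timeout_us; /* IRQ moderation timer, started after writeback */
};

/* Stage 1: does the device write the pending CQE back to the CQ now? */
static bool cqe_should_writeback(const struct coalesce_model *m,
				 uint32_t pending_pkts, uint32_t waited_us)
{
	assert(m->cqe_max_pkts > 0);
	if (pending_pkts == 0)
		return false; /* nothing to complete */
	return pending_pkts >= m->cqe_max_pkts ||
	       waited_us >= m->cqe_timeout_us;
}

/* Stage 2: the IRQ fires only after the IRQ timer that starts at writeback. */
static uint32_t irq_time_us(const struct coalesce_model *m,
			    uint32_t writeback_us)
{
	return writeback_us + m->irq_timeout_us;
}
```

With cqe_max_pkts = 4 and cqe_timeout_us = 2, a single packet is held for
up to 2 usecs before writeback, and the interrupt follows irq_timeout_us
later. In this model the frame count only affects stage 1, which is the
distinction the reply is asking about.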
* Re: [PATCH] Drivers: hv: vmbus: fix typo in function name reference
From: Wei Liu @ 2026-01-15 7:00 UTC (permalink / raw)
To: vdso
Cc: Julia Lawall, yunbolyu, kexinsun, ratnadiraw, xutong.ma,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
linux-kernel, K. Y. Srinivasan
In-Reply-To: <1647289009.63012.1767104334567@app.mailbox.org>
On Tue, Dec 30, 2025 at 06:18:54AM -0800, vdso@mailbox.org wrote:
>
> > On 12/30/2025 6:14 AM Julia Lawall <julia.lawall@inria.fr> wrote:
> >
> >
> > Replace cmxchg by cmpxchg.
> >
> > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> >
>
> Reviewed-by: Roman Kisel <vdso@mailbox.org>
>
Applied. Thank you.
^ permalink raw reply
* Re: [PATCH v2 1/1] Drivers: hv: Always do Hyper-V panic notification in hv_kmsg_dump()
From: Wei Liu @ 2026-01-15 7:01 UTC (permalink / raw)
To: mhklinux
Cc: haiyangz, wei.liu, decui, kys, linux-kernel, linux-hyperv,
dan.carpenter
In-Reply-To: <20251231201447.1399-1-mhklinux@outlook.com>
On Wed, Dec 31, 2025 at 12:14:47PM -0800, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
>
> hv_kmsg_dump() currently skips the panic notification entirely if it
> doesn't get any message bytes to pass to Hyper-V due to an error from
> kmsg_dump_get_buffer(). Skipping the notification is undesirable because
> it leaves the Hyper-V host uncertain about the state of a panic'ed guest.
>
> Fix this by always doing the panic notification, even if bytes_written
> is zero. Also ensure that bytes_written is initialized, which fixes a
> kernel test robot warning. The warning is actually bogus because
> kmsg_dump_get_buffer() happens to set bytes_written even if it fails, and
> in the kernel test robot's CONFIG_PRINTK not set case, hv_kmsg_dump() is
> never called. But do the initialization for robustness and to quiet the
> static checker.
>
> Fixes: 9c318a1d9b50 ("Drivers: hv: move panic report code from vmbus to hv early init code")
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> Closes: https://lore.kernel.org/all/202512172103.OcUspn1Z-lkp@intel.com/
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Applied. Thanks.
^ permalink raw reply
* Re: [PATCH v2] mshv: Align huge page stride with guest mapping
From: Wei Liu @ 2026-01-15 7:10 UTC (permalink / raw)
To: Stanislav Kinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <176781093198.21595.6373086133020540990.stgit@skinsburskii-cloud-desktop.internal.cloudapp.net>
On Wed, Jan 07, 2026 at 06:45:43PM +0000, Stanislav Kinsburskii wrote:
> Ensure that a stride larger than 1 (huge page) is only used when page
> points to a head of a huge page and both the guest frame number (gfn) and
> the operation size (page_count) are aligned to the huge page size
> (PTRS_PER_PMD). This matches the hypervisor requirement that map/unmap
> operations for huge pages must be guest-aligned and cover a full huge page.
>
> Add mshv_chunk_stride() to encapsulate this alignment and page-order
> validation, and plumb a huge_page flag into the region chunk handlers.
> This prevents issuing large-page map/unmap/share operations that the
> hypervisor would reject due to misaligned guest mappings.
>
> Fixes: abceb4297bf8 ("mshv: Fix huge page handling in memory region traversal")
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Applied.
^ permalink raw reply
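The alignment rule in the commit message above can be sketched as a small
predicate. This is not the actual mshv_chunk_stride() from the patch; the
function name, its parameters, and the value 512 for PTRS_PER_PMD (4 KiB
base pages, 2 MiB huge pages) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 512 base pages per huge page (4 KiB pages, 2 MiB PMD). */
#define SKETCH_PTRS_PER_PMD 512u

/*
 * Return the stride to use for one map/unmap chunk: a huge-page stride
 * only when the backing page is the head of a huge page and both the
 * guest frame number and the page count are huge-page aligned.
 */
static uint64_t sketch_chunk_stride(bool page_is_huge_head,
				    uint64_t gfn, uint64_t page_count)
{
	assert(page_count > 0); /* a chunk covers at least one page */

	if (page_is_huge_head &&
	    gfn % SKETCH_PTRS_PER_PMD == 0 &&
	    page_count % SKETCH_PTRS_PER_PMD == 0)
		return SKETCH_PTRS_PER_PMD;

	return 1; /* fall back to base-page granularity */
}
```

For example, a huge-page head at gfn 512 with 1024 pages gets a stride of
512, while the same page at gfn 513 falls back to a stride of 1.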
* Re: [PATCH 1/1] mshv: Store the result of vfs_poll in a variable of type __poll_t
From: Wei Liu @ 2026-01-15 7:11 UTC (permalink / raw)
To: Nuno Das Neves
Cc: mhklinux, kys, haiyangz, wei.liu, decui, longli, linux-hyperv,
linux-kernel
In-Reply-To: <49fd5523-f558-4ac0-b1a5-d0ead75bd9f3@linux.microsoft.com>
On Wed, Jan 14, 2026 at 10:40:04AM -0800, Nuno Das Neves wrote:
> On 1/14/2026 9:01 AM, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > vfs_poll() returns a result of type __poll_t, but current code is using
> > an "unsigned int" local variable. The difference is that __poll_t carries
> > the "bitwise" attribute. This attribute is not interpreted by the C
> > compiler; it is only used by 'sparse' to flag incorrect usage of the
> > return value. The return value is used correctly here, so there's no
> > bug, but sparse complains about the type mismatch.
> >
> > In the interest of general correctness and to avoid noise from sparse,
> > change the local variable to type __poll_t. No functional change.
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > ---
> > This change is not marked with a Fixes: tag as there's no value in
> > backporting to older stable releases.
> >
> > drivers/hv/mshv_eventfd.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/hv/mshv_eventfd.c b/drivers/hv/mshv_eventfd.c
> > index d93a18f09c76..0b75ff1edb73 100644
> > --- a/drivers/hv/mshv_eventfd.c
> > +++ b/drivers/hv/mshv_eventfd.c
> > @@ -388,7 +388,7 @@ static int mshv_irqfd_assign(struct mshv_partition *pt,
> > {
> > struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
> > struct mshv_irqfd *irqfd, *tmp;
> > - unsigned int events;
> > + __poll_t events;
> > int ret;
> > int idx;
> >
>
> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Applied.
^ permalink raw reply
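The point about the "bitwise" attribute in the commit message above can be
illustrated with a userspace sketch. The kernel defines __bitwise and
__force so that they expand to sparse attributes only when __CHECKER__ is
set; the SKETCH_* macros and sketch_* names below are hypothetical
stand-ins that follow the same pattern.

```c
#include <assert.h>

/* Only sparse (which defines __CHECKER__) sees the attributes. */
#ifdef __CHECKER__
#define SKETCH_BITWISE __attribute__((bitwise))
#define SKETCH_FORCE   __attribute__((force))
#else
#define SKETCH_BITWISE
#define SKETCH_FORCE
#endif

typedef unsigned int SKETCH_BITWISE sketch_poll_t;

#define SKETCH_POLLIN  ((SKETCH_FORCE sketch_poll_t)0x0001)
#define SKETCH_POLLHUP ((SKETCH_FORCE sketch_poll_t)0x0010)

/* Stand-in for vfs_poll(): returns a mask of the annotated type. */
static sketch_poll_t sketch_vfs_poll(void)
{
	return SKETCH_POLLIN | SKETCH_POLLHUP;
}

static int sketch_has_input(void)
{
	/* Same type as the return value, so sparse has nothing to flag. */
	sketch_poll_t events = sketch_vfs_poll();

	/* For the compiler the annotation changes nothing. */
	assert(sizeof(sketch_poll_t) == sizeof(unsigned int));

	return (events & SKETCH_POLLIN) != 0;
}
```

Declaring events as plain unsigned int would compile to the same code,
which is why the patch is described as having no functional change.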
* Re: [PATCH 1/1] mshv: Add __user attribute to argument passed to access_ok()
From: Wei Liu @ 2026-01-15 7:12 UTC (permalink / raw)
To: Nuno Das Neves
Cc: mhklinux, kys, haiyangz, wei.liu, decui, longli, linux-hyperv,
linux-kernel
In-Reply-To: <eb163338-cd03-49c6-8c44-f6fac39ba7f6@linux.microsoft.com>
On Wed, Jan 14, 2026 at 10:39:14AM -0800, Nuno Das Neves wrote:
> On 1/14/2026 10:15 AM, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > access_ok() expects its first argument to have the __user attribute
> > since it is checking access to user space. Current code passes an
> > argument that lacks that attribute, resulting in 'sparse' flagging
> > the incorrect usage. However, the compiler doesn't generate code
> > based on the attribute, so there's no actual bug.
> >
> > In the interest of general correctness and to avoid noise from sparse,
> > add the __user attribute. No functional change.
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes: https://lore.kernel.org/oe-kbuild-all/202512141339.791TCKnB-lkp@intel.com/
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > ---
> > drivers/hv/mshv_root_main.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> > index eff1b21461dc..5673af9fe101 100644
> > --- a/drivers/hv/mshv_root_main.c
> > +++ b/drivers/hv/mshv_root_main.c
> > @@ -1280,7 +1280,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> > long ret;
> >
> > if (mem.flags & BIT(MSHV_SET_MEM_BIT_UNMAP) ||
> > - !access_ok((const void *)mem.userspace_addr, mem.size))
> > + !access_ok((const void __user *)mem.userspace_addr, mem.size))
> > return -EINVAL;
> >
> > mmap_read_lock(current->mm);
>
> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Applied.
^ permalink raw reply
* Re: [PATCH v1] x86/hyperv: Reserve 3 interrupt vectors used exclusively by mshv
From: Wei Liu @ 2026-01-15 7:25 UTC (permalink / raw)
To: Mukesh Rathor
Cc: linux-hyperv, linux-kernel, kys, haiyangz, wei.liu, decui, longli,
tglx, mingo, bp, dave.hansen, x86, hpa
In-Reply-To: <20260102220208.862818-1-mrathor@linux.microsoft.com>
On Fri, Jan 02, 2026 at 02:02:08PM -0800, Mukesh Rathor wrote:
> MSVC compiler, used to compile the Microsoft Hyper-V hypervisor currently,
> has an assert intrinsic that uses interrupt vector 0x29 to create an
> exception. This will cause hypervisor to then crash and collect core. As
> such, if this interrupt number is assigned to a device by linux and the
> device generates it, hypervisor will crash. There are two other such
> vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.
> Fortunately, the three vectors are part of the kernel driver space and
> that makes it feasible to reserve them early so they are not assigned
> later.
>
> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> ---
>
> v1: Add ifndef CONFIG_X86_FRED (thanks hpa)
>
> arch/x86/kernel/cpu/mshyperv.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 579fb2c64cfd..8ef4ca6733ac 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -478,6 +478,27 @@ int hv_get_hypervisor_version(union hv_hypervisor_version_info *info)
> }
> EXPORT_SYMBOL_GPL(hv_get_hypervisor_version);
>
> +#ifndef CONFIG_X86_FRED
I briefly looked up FRED and checked the code. I understand that once it
is enabled, Linux kernel doesn't setup the IDT anymore (code in
arch/x86/kernel/traps.c).
My question is, do we need to do anything when FRED is enabled?
Wei
> +/*
> + * Reserve vectors hard coded in the hypervisor. If used outside, the hypervisor
> + * will crash or hang or break into debugger.
> + */
> +static void hv_reserve_irq_vectors(void)
> +{
> + #define HYPERV_DBG_FASTFAIL_VECTOR 0x29
> + #define HYPERV_DBG_ASSERT_VECTOR 0x2C
> + #define HYPERV_DBG_SERVICE_VECTOR 0x2D
> +
> + if (test_and_set_bit(HYPERV_DBG_ASSERT_VECTOR, system_vectors) ||
> + test_and_set_bit(HYPERV_DBG_SERVICE_VECTOR, system_vectors) ||
> + test_and_set_bit(HYPERV_DBG_FASTFAIL_VECTOR, system_vectors))
> + BUG();
> +
> + pr_info("Hyper-V:reserve vectors: %d %d %d\n", HYPERV_DBG_ASSERT_VECTOR,
> + HYPERV_DBG_SERVICE_VECTOR, HYPERV_DBG_FASTFAIL_VECTOR);
> +}
> +#endif /* CONFIG_X86_FRED */
> +
> static void __init ms_hyperv_init_platform(void)
> {
> int hv_max_functions_eax, eax;
> @@ -510,6 +531,11 @@ static void __init ms_hyperv_init_platform(void)
>
> hv_identify_partition_type();
>
> +#ifndef CONFIG_X86_FRED
> + if (hv_root_partition())
> + hv_reserve_irq_vectors();
> +#endif /* CONFIG_X86_FRED */
> +
> if (cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
> ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED;
>
> --
> 2.51.2.vfs.0.1
>
^ permalink raw reply
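The reservation step in the quoted patch relies on test_and_set_bit()
returning the previous value of the bit, so an already-claimed vector is
detected at the moment it is claimed. A rough userspace sketch of that
idea follows; it is non-atomic, unlike the kernel helper, and the
sketch_* names and the 256-entry bitmap are assumptions made for
illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_NR_VECTORS    256
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for the kernel's system_vectors bitmap. */
static unsigned long sketch_vectors[(SKETCH_NR_VECTORS +
				     SKETCH_BITS_PER_LONG - 1) /
				    SKETCH_BITS_PER_LONG];

/* Non-atomic stand-in for test_and_set_bit(): returns the old bit value. */
static bool sketch_test_and_set_bit(unsigned int nr, unsigned long *map)
{
	unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	unsigned long *word = &map[nr / SKETCH_BITS_PER_LONG];
	bool was_set = (*word & mask) != 0;

	assert(nr < SKETCH_NR_VECTORS);
	*word |= mask;
	return was_set;
}

/* True if the vector was free and is now reserved, false if already taken. */
static bool sketch_reserve_vector(unsigned int vector)
{
	return !sketch_test_and_set_bit(vector, sketch_vectors);
}
```

Reserving 0x29, 0x2C and 0x2D once each succeeds; a second attempt on any
of them reports a conflict, which is the case the patch turns into BUG().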
* Re: [PATCH v2 0/2] Fixes for movable pages
From: Wei Liu @ 2026-01-15 7:31 UTC (permalink / raw)
To: Anirudh Rayabharam
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <20260105122837.1083896-1-anirudh@anirudhrb.com>
On Mon, Jan 05, 2026 at 12:28:35PM +0000, Anirudh Rayabharam wrote:
> From: "Anirudh Rayabharam (Microsoft)" <anirudh@anirudhrb.com>
>
> Fix movable pages for arm64 guests by implementing a GPA intercept
> handler.
>
> v2:
> - Added "Fixes:" tag
> - Got rid of the utility function to get intercept GPA and instead
> integrated the rather small logic into the GPA intercept handling
> function.
> - Dropped patch 3 since it was applied to the fixes tree.
>
> Anirudh Rayabharam (Microsoft) (2):
> hyperv: add definitions for arm64 gpa intercepts
> mshv: handle gpa intercepts for arm64
>
The code looks fine. I have queued both patches.
I massaged the first subject line a bit.
Wei
> drivers/hv/mshv_root_main.c | 15 ++++++------
> include/hyperv/hvhdk.h | 47 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 55 insertions(+), 7 deletions(-)
>
> --
> 2.34.1
>
^ permalink raw reply