linux-hyperv.vger.kernel.org archive mirror
* [PATCH v2 0/3] mshv: Debugfs interface for mshv_root
@ 2025-12-05 18:58 Nuno Das Neves
From: Nuno Das Neves @ 2025-12-05 18:58 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel, skinsburskii
  Cc: kys, haiyangz, wei.liu, decui, longli, mhklinux, prapal, mrathor,
	paekkaladevi, Nuno Das Neves

Expose hypervisor, logical processor, partition, and virtual processor
statistics via debugfs. These are provided by mapping 'stats' pages via
hypercall.

Patch #1: Update hv_call_map_stats_page() to return success when
          HV_STATS_AREA_PARENT is unavailable, as is the case on some
          hypervisor versions; the caller can then fall back to
          HV_STATS_AREA_SELF.
Patch #2: Introduce the definitions needed for the various stats pages.
Patch #3: Add mshv_debugfs.c, and integrate it with the mshv_root driver to
          expose the partition and VP stats.

---
Changes in v2:
- Remove unnecessary pr_debug_once() in patch 1 [Stanislav Kinsburskii]
- CONFIG_X86 -> CONFIG_X86_64 in patch 2 [Stanislav Kinsburskii]

---
Nuno Das Neves (2):
  mshv: Add definitions for stats pages
  mshv: Add debugfs to view hypervisor statistics

Purna Pavan Chandra Aekkaladevi (1):
  mshv: Ignore second stats page map result failure

 drivers/hv/Makefile            |    1 +
 drivers/hv/mshv_debugfs.c      | 1122 ++++++++++++++++++++++++++++++++
 drivers/hv/mshv_root.h         |   34 +
 drivers/hv/mshv_root_hv_call.c |   41 +-
 drivers/hv/mshv_root_main.c    |   52 +-
 include/hyperv/hvhdk.h         |  437 +++++++++++++
 6 files changed, 1662 insertions(+), 25 deletions(-)
 create mode 100644 drivers/hv/mshv_debugfs.c

-- 
2.34.1



* [PATCH v2 1/3] mshv: Ignore second stats page map result failure
From: Nuno Das Neves @ 2025-12-05 18:58 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel, skinsburskii
  Cc: kys, haiyangz, wei.liu, decui, longli, mhklinux, prapal, mrathor,
	paekkaladevi, Nuno Das Neves

From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>

Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
and return HV_STATUS_INVALID_PARAMETER for the second stats page
mapping request.

This results in a failure during module init. Instead of failing, gracefully
fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
already-mapped stats_pages[HV_STATS_AREA_SELF].
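The fallback can be sketched in plain userspace C. This is an illustrative model only: the status values and the fake_map_stats_page() helper are stand-ins for the hypercall, not the kernel implementation in this patch.

```c
#include <assert.h>
#include <stddef.h>

#define HV_STATUS_SUCCESS           0
#define HV_STATUS_INVALID_PARAMETER 5	/* illustrative value */

enum { HV_STATS_AREA_SELF, HV_STATS_AREA_PARENT, HV_STATS_AREA_COUNT };

/* Stand-in for the map hypercall: PARENT fails on "older hypervisors". */
int fake_map_stats_page(int area, int hv_supports_parent, void **addr)
{
	static char self_page, parent_page;

	if (area == HV_STATS_AREA_SELF) {
		*addr = &self_page;
		return HV_STATUS_SUCCESS;
	}
	if (!hv_supports_parent)
		return HV_STATUS_INVALID_PARAMETER;
	*addr = &parent_page;
	return HV_STATUS_SUCCESS;
}

/* Map both areas; on failure to map PARENT, reuse the SELF mapping. */
int map_vp_stats(int hv_supports_parent, void *pages[HV_STATS_AREA_COUNT])
{
	if (fake_map_stats_page(HV_STATS_AREA_SELF, hv_supports_parent,
				&pages[HV_STATS_AREA_SELF]) != HV_STATUS_SUCCESS)
		return -1;

	if (fake_map_stats_page(HV_STATS_AREA_PARENT, hv_supports_parent,
				&pages[HV_STATS_AREA_PARENT]) != HV_STATUS_SUCCESS)
		pages[HV_STATS_AREA_PARENT] = NULL;	/* graceful fallback */

	if (!pages[HV_STATS_AREA_PARENT])
		pages[HV_STATS_AREA_PARENT] = pages[HV_STATS_AREA_SELF];

	return 0;
}
```

With the PARENT area unsupported, both slots end up pointing at the same SELF page, so callers can index either area unconditionally.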

Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
 drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
 drivers/hv/mshv_root_main.c    |  3 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
index 598eaff4ff29..b1770c7b500c 100644
--- a/drivers/hv/mshv_root_hv_call.c
+++ b/drivers/hv/mshv_root_hv_call.c
@@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
 	return ret;
 }
 
+static int
+hv_stats_get_area_type(enum hv_stats_object_type type,
+		       const union hv_stats_object_identity *identity)
+{
+	switch (type) {
+	case HV_STATS_OBJECT_HYPERVISOR:
+		return identity->hv.stats_area_type;
+	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
+		return identity->lp.stats_area_type;
+	case HV_STATS_OBJECT_PARTITION:
+		return identity->partition.stats_area_type;
+	case HV_STATS_OBJECT_VP:
+		return identity->vp.stats_area_type;
+	}
+
+	return -EINVAL;
+}
+
 static int hv_call_map_stats_page(enum hv_stats_object_type type,
 				  const union hv_stats_object_identity *identity,
 				  void **addr)
@@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
 	struct hv_input_map_stats_page *input;
 	struct hv_output_map_stats_page *output;
 	u64 status, pfn;
-	int ret = 0;
+	int hv_status, ret = 0;
 
 	do {
 		local_irq_save(flags);
@@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
 		pfn = output->map_location;
 
 		local_irq_restore(flags);
-		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
-			ret = hv_result_to_errno(status);
+
+		hv_status = hv_result(status);
+		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
 			if (hv_result_success(status))
 				break;
-			return ret;
+
+			/*
+			 * Older versions of the hypervisor do not support the
+			 * PARENT stats area. In this case return "success" but
+			 * set the page to NULL. The caller should check for
+			 * this case and instead just use the SELF area.
+			 */
+			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
+			    hv_status == HV_STATUS_INVALID_PARAMETER) {
+				*addr = NULL;
+				return 0;
+			}
+
+			hv_status_debug(status, "\n");
+			return hv_result_to_errno(status);
 		}
 
 		ret = hv_call_deposit_pages(NUMA_NO_NODE,
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index bc15d6f6922f..f59a4ab47685 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
 	if (err)
 		goto unmap_self;
 
+	if (!stats_pages[HV_STATS_AREA_PARENT])
+		stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
+
 	return 0;
 
 unmap_self:
-- 
2.34.1



* [PATCH v2 2/3] mshv: Add definitions for stats pages
From: Nuno Das Neves @ 2025-12-05 18:58 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel, skinsburskii
  Cc: kys, haiyangz, wei.liu, decui, longli, mhklinux, prapal, mrathor,
	paekkaladevi, Nuno Das Neves

Add the definitions for hypervisor, logical processor, and partition
stats pages.

Move the definition for the VP stats page to its rightful place in
hvhdk.h, and add the missing members.

These enum members retain their CamelCase style, since they are imported
directly from the hypervisor code. They will be stringified when printing
the stats out, and are more readable in this form.
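The stringify-and-paste pattern this enables looks roughly like the following userspace sketch. The __stringify macros are redefined here as stand-ins for the kernel's <linux/stringify.h>, and LP_PRINT is a hypothetical analogue of the driver's seq_printf-based macro, not the actual code:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Userspace stand-ins for the kernel's __stringify() */
#define __stringify_1(x) #x
#define __stringify(x)   __stringify_1(x)

/* Trimmed-down counter enum; real values come from hvhdk.h */
enum hv_stats_lp_counters { LpGlobalTime = 1, LpStatsMaxCounter };

/*
 * Paste the "Lp" prefix back on for the array index, but stringify the
 * bare CamelCase name for display, e.g. "GlobalTime : <value>".
 */
#define LP_PRINT(buf, stats, cnt) \
	snprintf(buf, sizeof(buf), "%-29s: %llu", __stringify(cnt), \
		 (unsigned long long)(stats)[Lp##cnt])
```

Passing the bare name once thus yields both the counter index and its printable label, which is why the CamelCase identifiers are kept as-is.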

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 drivers/hv/mshv_root_main.c |  17 --
 include/hyperv/hvhdk.h      | 437 ++++++++++++++++++++++++++++++++++++
 2 files changed, 437 insertions(+), 17 deletions(-)

diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index f59a4ab47685..19006b788e85 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -38,23 +38,6 @@ MODULE_AUTHOR("Microsoft");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
 
-/* TODO move this to another file when debugfs code is added */
-enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
-#if defined(CONFIG_X86)
-	VpRootDispatchThreadBlocked			= 202,
-#elif defined(CONFIG_ARM64)
-	VpRootDispatchThreadBlocked			= 94,
-#endif
-	VpStatsMaxCounter
-};
-
-struct hv_stats_page {
-	union {
-		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
-		u8 data[HV_HYP_PAGE_SIZE];
-	};
-} __packed;
-
 struct mshv_root mshv_root;
 
 enum hv_scheduler_type hv_scheduler_type;
diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
index 469186df7826..51abbcd0ec37 100644
--- a/include/hyperv/hvhdk.h
+++ b/include/hyperv/hvhdk.h
@@ -10,6 +10,443 @@
 #include "hvhdk_mini.h"
 #include "hvgdk.h"
 
+enum hv_stats_hypervisor_counters {		/* HV_HYPERVISOR_COUNTER */
+	HvLogicalProcessors			= 1,
+	HvPartitions				= 2,
+	HvTotalPages				= 3,
+	HvVirtualProcessors			= 4,
+	HvMonitoredNotifications		= 5,
+	HvModernStandbyEntries			= 6,
+	HvPlatformIdleTransitions		= 7,
+	HvHypervisorStartupCost			= 8,
+	HvIOSpacePages				= 10,
+	HvNonEssentialPagesForDump		= 11,
+	HvSubsumedPages				= 12,
+	HvStatsMaxCounter
+};
+
+enum hv_stats_partition_counters {		/* HV_PROCESS_COUNTER */
+	PartitionVirtualProcessors		= 1,
+	PartitionTlbSize			= 3,
+	PartitionAddressSpaces			= 4,
+	PartitionDepositedPages			= 5,
+	PartitionGpaPages			= 6,
+	PartitionGpaSpaceModifications		= 7,
+	PartitionVirtualTlbFlushEntires		= 8,
+	PartitionRecommendedTlbSize		= 9,
+	PartitionGpaPages4K			= 10,
+	PartitionGpaPages2M			= 11,
+	PartitionGpaPages1G			= 12,
+	PartitionGpaPages512G			= 13,
+	PartitionDevicePages4K			= 14,
+	PartitionDevicePages2M			= 15,
+	PartitionDevicePages1G			= 16,
+	PartitionDevicePages512G		= 17,
+	PartitionAttachedDevices		= 18,
+	PartitionDeviceInterruptMappings	= 19,
+	PartitionIoTlbFlushes			= 20,
+	PartitionIoTlbFlushCost			= 21,
+	PartitionDeviceInterruptErrors		= 22,
+	PartitionDeviceDmaErrors		= 23,
+	PartitionDeviceInterruptThrottleEvents	= 24,
+	PartitionSkippedTimerTicks		= 25,
+	PartitionPartitionId			= 26,
+#if IS_ENABLED(CONFIG_X86_64)
+	PartitionNestedTlbSize			= 27,
+	PartitionRecommendedNestedTlbSize	= 28,
+	PartitionNestedTlbFreeListSize		= 29,
+	PartitionNestedTlbTrimmedPages		= 30,
+	PartitionPagesShattered			= 31,
+	PartitionPagesRecombined		= 32,
+	PartitionHwpRequestValue		= 33,
+#elif IS_ENABLED(CONFIG_ARM64)
+	PartitionHwpRequestValue		= 27,
+#endif
+	PartitionStatsMaxCounter
+};
+
+enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
+	VpTotalRunTime					= 1,
+	VpHypervisorRunTime				= 2,
+	VpRemoteNodeRunTime				= 3,
+	VpNormalizedRunTime				= 4,
+	VpIdealCpu					= 5,
+	VpHypercallsCount				= 7,
+	VpHypercallsTime				= 8,
+#if IS_ENABLED(CONFIG_X86_64)
+	VpPageInvalidationsCount			= 9,
+	VpPageInvalidationsTime				= 10,
+	VpControlRegisterAccessesCount			= 11,
+	VpControlRegisterAccessesTime			= 12,
+	VpIoInstructionsCount				= 13,
+	VpIoInstructionsTime				= 14,
+	VpHltInstructionsCount				= 15,
+	VpHltInstructionsTime				= 16,
+	VpMwaitInstructionsCount			= 17,
+	VpMwaitInstructionsTime				= 18,
+	VpCpuidInstructionsCount			= 19,
+	VpCpuidInstructionsTime				= 20,
+	VpMsrAccessesCount				= 21,
+	VpMsrAccessesTime				= 22,
+	VpOtherInterceptsCount				= 23,
+	VpOtherInterceptsTime				= 24,
+	VpExternalInterruptsCount			= 25,
+	VpExternalInterruptsTime			= 26,
+	VpPendingInterruptsCount			= 27,
+	VpPendingInterruptsTime				= 28,
+	VpEmulatedInstructionsCount			= 29,
+	VpEmulatedInstructionsTime			= 30,
+	VpDebugRegisterAccessesCount			= 31,
+	VpDebugRegisterAccessesTime			= 32,
+	VpPageFaultInterceptsCount			= 33,
+	VpPageFaultInterceptsTime			= 34,
+	VpGuestPageTableMaps				= 35,
+	VpLargePageTlbFills				= 36,
+	VpSmallPageTlbFills				= 37,
+	VpReflectedGuestPageFaults			= 38,
+	VpApicMmioAccesses				= 39,
+	VpIoInterceptMessages				= 40,
+	VpMemoryInterceptMessages			= 41,
+	VpApicEoiAccesses				= 42,
+	VpOtherMessages					= 43,
+	VpPageTableAllocations				= 44,
+	VpLogicalProcessorMigrations			= 45,
+	VpAddressSpaceEvictions				= 46,
+	VpAddressSpaceSwitches				= 47,
+	VpAddressDomainFlushes				= 48,
+	VpAddressSpaceFlushes				= 49,
+	VpGlobalGvaRangeFlushes				= 50,
+	VpLocalGvaRangeFlushes				= 51,
+	VpPageTableEvictions				= 52,
+	VpPageTableReclamations				= 53,
+	VpPageTableResets				= 54,
+	VpPageTableValidations				= 55,
+	VpApicTprAccesses				= 56,
+	VpPageTableWriteIntercepts			= 57,
+	VpSyntheticInterrupts				= 58,
+	VpVirtualInterrupts				= 59,
+	VpApicIpisSent					= 60,
+	VpApicSelfIpisSent				= 61,
+	VpGpaSpaceHypercalls				= 62,
+	VpLogicalProcessorHypercalls			= 63,
+	VpLongSpinWaitHypercalls			= 64,
+	VpOtherHypercalls				= 65,
+	VpSyntheticInterruptHypercalls			= 66,
+	VpVirtualInterruptHypercalls			= 67,
+	VpVirtualMmuHypercalls				= 68,
+	VpVirtualProcessorHypercalls			= 69,
+	VpHardwareInterrupts				= 70,
+	VpNestedPageFaultInterceptsCount		= 71,
+	VpNestedPageFaultInterceptsTime			= 72,
+	VpPageScans					= 73,
+	VpLogicalProcessorDispatches			= 74,
+	VpWaitingForCpuTime				= 75,
+	VpExtendedHypercalls				= 76,
+	VpExtendedHypercallInterceptMessages		= 77,
+	VpMbecNestedPageTableSwitches			= 78,
+	VpOtherReflectedGuestExceptions			= 79,
+	VpGlobalIoTlbFlushes				= 80,
+	VpGlobalIoTlbFlushCost				= 81,
+	VpLocalIoTlbFlushes				= 82,
+	VpLocalIoTlbFlushCost				= 83,
+	VpHypercallsForwardedCount			= 84,
+	VpHypercallsForwardingTime			= 85,
+	VpPageInvalidationsForwardedCount		= 86,
+	VpPageInvalidationsForwardingTime		= 87,
+	VpControlRegisterAccessesForwardedCount		= 88,
+	VpControlRegisterAccessesForwardingTime		= 89,
+	VpIoInstructionsForwardedCount			= 90,
+	VpIoInstructionsForwardingTime			= 91,
+	VpHltInstructionsForwardedCount			= 92,
+	VpHltInstructionsForwardingTime			= 93,
+	VpMwaitInstructionsForwardedCount		= 94,
+	VpMwaitInstructionsForwardingTime		= 95,
+	VpCpuidInstructionsForwardedCount		= 96,
+	VpCpuidInstructionsForwardingTime		= 97,
+	VpMsrAccessesForwardedCount			= 98,
+	VpMsrAccessesForwardingTime			= 99,
+	VpOtherInterceptsForwardedCount			= 100,
+	VpOtherInterceptsForwardingTime			= 101,
+	VpExternalInterruptsForwardedCount		= 102,
+	VpExternalInterruptsForwardingTime		= 103,
+	VpPendingInterruptsForwardedCount		= 104,
+	VpPendingInterruptsForwardingTime		= 105,
+	VpEmulatedInstructionsForwardedCount		= 106,
+	VpEmulatedInstructionsForwardingTime		= 107,
+	VpDebugRegisterAccessesForwardedCount		= 108,
+	VpDebugRegisterAccessesForwardingTime		= 109,
+	VpPageFaultInterceptsForwardedCount		= 110,
+	VpPageFaultInterceptsForwardingTime		= 111,
+	VpVmclearEmulationCount				= 112,
+	VpVmclearEmulationTime				= 113,
+	VpVmptrldEmulationCount				= 114,
+	VpVmptrldEmulationTime				= 115,
+	VpVmptrstEmulationCount				= 116,
+	VpVmptrstEmulationTime				= 117,
+	VpVmreadEmulationCount				= 118,
+	VpVmreadEmulationTime				= 119,
+	VpVmwriteEmulationCount				= 120,
+	VpVmwriteEmulationTime				= 121,
+	VpVmxoffEmulationCount				= 122,
+	VpVmxoffEmulationTime				= 123,
+	VpVmxonEmulationCount				= 124,
+	VpVmxonEmulationTime				= 125,
+	VpNestedVMEntriesCount				= 126,
+	VpNestedVMEntriesTime				= 127,
+	VpNestedSLATSoftPageFaultsCount			= 128,
+	VpNestedSLATSoftPageFaultsTime			= 129,
+	VpNestedSLATHardPageFaultsCount			= 130,
+	VpNestedSLATHardPageFaultsTime			= 131,
+	VpInvEptAllContextEmulationCount		= 132,
+	VpInvEptAllContextEmulationTime			= 133,
+	VpInvEptSingleContextEmulationCount		= 134,
+	VpInvEptSingleContextEmulationTime		= 135,
+	VpInvVpidAllContextEmulationCount		= 136,
+	VpInvVpidAllContextEmulationTime		= 137,
+	VpInvVpidSingleContextEmulationCount		= 138,
+	VpInvVpidSingleContextEmulationTime		= 139,
+	VpInvVpidSingleAddressEmulationCount		= 140,
+	VpInvVpidSingleAddressEmulationTime		= 141,
+	VpNestedTlbPageTableReclamations		= 142,
+	VpNestedTlbPageTableEvictions			= 143,
+	VpFlushGuestPhysicalAddressSpaceHypercalls	= 144,
+	VpFlushGuestPhysicalAddressListHypercalls	= 145,
+	VpPostedInterruptNotifications			= 146,
+	VpPostedInterruptScans				= 147,
+	VpTotalCoreRunTime				= 148,
+	VpMaximumRunTime				= 149,
+	VpHwpRequestContextSwitches			= 150,
+	VpWaitingForCpuTimeBucket0			= 151,
+	VpWaitingForCpuTimeBucket1			= 152,
+	VpWaitingForCpuTimeBucket2			= 153,
+	VpWaitingForCpuTimeBucket3			= 154,
+	VpWaitingForCpuTimeBucket4			= 155,
+	VpWaitingForCpuTimeBucket5			= 156,
+	VpWaitingForCpuTimeBucket6			= 157,
+	VpVmloadEmulationCount				= 158,
+	VpVmloadEmulationTime				= 159,
+	VpVmsaveEmulationCount				= 160,
+	VpVmsaveEmulationTime				= 161,
+	VpGifInstructionEmulationCount			= 162,
+	VpGifInstructionEmulationTime			= 163,
+	VpEmulatedErrataSvmInstructions			= 164,
+	VpPlaceholder1					= 165,
+	VpPlaceholder2					= 166,
+	VpPlaceholder3					= 167,
+	VpPlaceholder4					= 168,
+	VpPlaceholder5					= 169,
+	VpPlaceholder6					= 170,
+	VpPlaceholder7					= 171,
+	VpPlaceholder8					= 172,
+	VpPlaceholder9					= 173,
+	VpPlaceholder10					= 174,
+	VpSchedulingPriority				= 175,
+	VpRdpmcInstructionsCount			= 176,
+	VpRdpmcInstructionsTime				= 177,
+	VpPerfmonPmuMsrAccessesCount			= 178,
+	VpPerfmonLbrMsrAccessesCount			= 179,
+	VpPerfmonIptMsrAccessesCount			= 180,
+	VpPerfmonInterruptCount				= 181,
+	VpVtl1DispatchCount				= 182,
+	VpVtl2DispatchCount				= 183,
+	VpVtl2DispatchBucket0				= 184,
+	VpVtl2DispatchBucket1				= 185,
+	VpVtl2DispatchBucket2				= 186,
+	VpVtl2DispatchBucket3				= 187,
+	VpVtl2DispatchBucket4				= 188,
+	VpVtl2DispatchBucket5				= 189,
+	VpVtl2DispatchBucket6				= 190,
+	VpVtl1RunTime					= 191,
+	VpVtl2RunTime					= 192,
+	VpIommuHypercalls				= 193,
+	VpCpuGroupHypercalls				= 194,
+	VpVsmHypercalls					= 195,
+	VpEventLogHypercalls				= 196,
+	VpDeviceDomainHypercalls			= 197,
+	VpDepositHypercalls				= 198,
+	VpSvmHypercalls					= 199,
+	VpBusLockAcquisitionCount			= 200,
+	VpUnused					= 201,
+	VpRootDispatchThreadBlocked			= 202,
+#elif IS_ENABLED(CONFIG_ARM64)
+	VpSysRegAccessesCount				= 9,
+	VpSysRegAccessesTime				= 10,
+	VpSmcInstructionsCount				= 11,
+	VpSmcInstructionsTime				= 12,
+	VpOtherInterceptsCount				= 13,
+	VpOtherInterceptsTime				= 14,
+	VpExternalInterruptsCount			= 15,
+	VpExternalInterruptsTime			= 16,
+	VpPendingInterruptsCount			= 17,
+	VpPendingInterruptsTime				= 18,
+	VpGuestPageTableMaps				= 19,
+	VpLargePageTlbFills				= 20,
+	VpSmallPageTlbFills				= 21,
+	VpReflectedGuestPageFaults			= 22,
+	VpMemoryInterceptMessages			= 23,
+	VpOtherMessages					= 24,
+	VpLogicalProcessorMigrations			= 25,
+	VpAddressDomainFlushes				= 26,
+	VpAddressSpaceFlushes				= 27,
+	VpSyntheticInterrupts				= 28,
+	VpVirtualInterrupts				= 29,
+	VpApicSelfIpisSent				= 30,
+	VpGpaSpaceHypercalls				= 31,
+	VpLogicalProcessorHypercalls			= 32,
+	VpLongSpinWaitHypercalls			= 33,
+	VpOtherHypercalls				= 34,
+	VpSyntheticInterruptHypercalls			= 35,
+	VpVirtualInterruptHypercalls			= 36,
+	VpVirtualMmuHypercalls				= 37,
+	VpVirtualProcessorHypercalls			= 38,
+	VpHardwareInterrupts				= 39,
+	VpNestedPageFaultInterceptsCount		= 40,
+	VpNestedPageFaultInterceptsTime			= 41,
+	VpLogicalProcessorDispatches			= 42,
+	VpWaitingForCpuTime				= 43,
+	VpExtendedHypercalls				= 44,
+	VpExtendedHypercallInterceptMessages		= 45,
+	VpMbecNestedPageTableSwitches			= 46,
+	VpOtherReflectedGuestExceptions			= 47,
+	VpGlobalIoTlbFlushes				= 48,
+	VpGlobalIoTlbFlushCost				= 49,
+	VpLocalIoTlbFlushes				= 50,
+	VpLocalIoTlbFlushCost				= 51,
+	VpFlushGuestPhysicalAddressSpaceHypercalls	= 52,
+	VpFlushGuestPhysicalAddressListHypercalls	= 53,
+	VpPostedInterruptNotifications			= 54,
+	VpPostedInterruptScans				= 55,
+	VpTotalCoreRunTime				= 56,
+	VpMaximumRunTime				= 57,
+	VpWaitingForCpuTimeBucket0			= 58,
+	VpWaitingForCpuTimeBucket1			= 59,
+	VpWaitingForCpuTimeBucket2			= 60,
+	VpWaitingForCpuTimeBucket3			= 61,
+	VpWaitingForCpuTimeBucket4			= 62,
+	VpWaitingForCpuTimeBucket5			= 63,
+	VpWaitingForCpuTimeBucket6			= 64,
+	VpHwpRequestContextSwitches			= 65,
+	VpPlaceholder2					= 66,
+	VpPlaceholder3					= 67,
+	VpPlaceholder4					= 68,
+	VpPlaceholder5					= 69,
+	VpPlaceholder6					= 70,
+	VpPlaceholder7					= 71,
+	VpPlaceholder8					= 72,
+	VpContentionTime				= 73,
+	VpWakeUpTime					= 74,
+	VpSchedulingPriority				= 75,
+	VpVtl1DispatchCount				= 76,
+	VpVtl2DispatchCount				= 77,
+	VpVtl2DispatchBucket0				= 78,
+	VpVtl2DispatchBucket1				= 79,
+	VpVtl2DispatchBucket2				= 80,
+	VpVtl2DispatchBucket3				= 81,
+	VpVtl2DispatchBucket4				= 82,
+	VpVtl2DispatchBucket5				= 83,
+	VpVtl2DispatchBucket6				= 84,
+	VpVtl1RunTime					= 85,
+	VpVtl2RunTime					= 86,
+	VpIommuHypercalls				= 87,
+	VpCpuGroupHypercalls				= 88,
+	VpVsmHypercalls					= 89,
+	VpEventLogHypercalls				= 90,
+	VpDeviceDomainHypercalls			= 91,
+	VpDepositHypercalls				= 92,
+	VpSvmHypercalls					= 93,
+	VpLoadAvg					= 94,
+	VpRootDispatchThreadBlocked			= 95,
+#endif
+	VpStatsMaxCounter
+};
+
+enum hv_stats_lp_counters {			/* HV_CPU_COUNTER */
+	LpGlobalTime				= 1,
+	LpTotalRunTime				= 2,
+	LpHypervisorRunTime			= 3,
+	LpHardwareInterrupts			= 4,
+	LpContextSwitches			= 5,
+	LpInterProcessorInterrupts		= 6,
+	LpSchedulerInterrupts			= 7,
+	LpTimerInterrupts			= 8,
+	LpInterProcessorInterruptsSent		= 9,
+	LpProcessorHalts			= 10,
+	LpMonitorTransitionCost			= 11,
+	LpContextSwitchTime			= 12,
+	LpC1TransitionsCount			= 13,
+	LpC1RunTime				= 14,
+	LpC2TransitionsCount			= 15,
+	LpC2RunTime				= 16,
+	LpC3TransitionsCount			= 17,
+	LpC3RunTime				= 18,
+	LpRootVpIndex				= 19,
+	LpIdleSequenceNumber			= 20,
+	LpGlobalTscCount			= 21,
+	LpActiveTscCount			= 22,
+	LpIdleAccumulation			= 23,
+	LpReferenceCycleCount0			= 24,
+	LpActualCycleCount0			= 25,
+	LpReferenceCycleCount1			= 26,
+	LpActualCycleCount1			= 27,
+	LpProximityDomainId			= 28,
+	LpPostedInterruptNotifications		= 29,
+	LpBranchPredictorFlushes		= 30,
+#if IS_ENABLED(CONFIG_X86_64)
+	LpL1DataCacheFlushes			= 31,
+	LpImmediateL1DataCacheFlushes		= 32,
+	LpMbFlushes				= 33,
+	LpCounterRefreshSequenceNumber		= 34,
+	LpCounterRefreshReferenceTime		= 35,
+	LpIdleAccumulationSnapshot		= 36,
+	LpActiveTscCountSnapshot		= 37,
+	LpHwpRequestContextSwitches		= 38,
+	LpPlaceholder1				= 39,
+	LpPlaceholder2				= 40,
+	LpPlaceholder3				= 41,
+	LpPlaceholder4				= 42,
+	LpPlaceholder5				= 43,
+	LpPlaceholder6				= 44,
+	LpPlaceholder7				= 45,
+	LpPlaceholder8				= 46,
+	LpPlaceholder9				= 47,
+	LpPlaceholder10				= 48,
+	LpReserveGroupId			= 49,
+	LpRunningPriority			= 50,
+	LpPerfmonInterruptCount			= 51,
+#elif IS_ENABLED(CONFIG_ARM64)
+	LpCounterRefreshSequenceNumber		= 31,
+	LpCounterRefreshReferenceTime		= 32,
+	LpIdleAccumulationSnapshot		= 33,
+	LpActiveTscCountSnapshot		= 34,
+	LpHwpRequestContextSwitches		= 35,
+	LpPlaceholder2				= 36,
+	LpPlaceholder3				= 37,
+	LpPlaceholder4				= 38,
+	LpPlaceholder5				= 39,
+	LpPlaceholder6				= 40,
+	LpPlaceholder7				= 41,
+	LpPlaceholder8				= 42,
+	LpPlaceholder9				= 43,
+	LpSchLocalRunListSize			= 44,
+	LpReserveGroupId			= 45,
+	LpRunningPriority			= 46,
+#endif
+	LpStatsMaxCounter
+};
+
+/*
+ * Hypervisor statistics page format
+ */
+struct hv_stats_page {
+	union {
+		u64 hv_cntrs[HvStatsMaxCounter];		/* Hypervisor counters */
+		u64 pt_cntrs[PartitionStatsMaxCounter];		/* Partition counters */
+		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
+		u64 lp_cntrs[LpStatsMaxCounter];		/* LP counters */
+		u8 data[HV_HYP_PAGE_SIZE];
+	};
+} __packed;
+
 /* Bits for dirty mask of hv_vp_register_page */
 #define HV_X64_REGISTER_CLASS_GENERAL	0
 #define HV_X64_REGISTER_CLASS_IP	1
-- 
2.34.1



* [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
From: Nuno Das Neves @ 2025-12-05 18:58 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel, skinsburskii
  Cc: kys, haiyangz, wei.liu, decui, longli, mhklinux, prapal, mrathor,
	paekkaladevi, Nuno Das Neves, Jinank Jain

Introduce a debugfs interface to expose root and child partition stats
when running with mshv_root.

Create a debugfs directory "mshv" containing 'stats' files organized by
type and id. A stats file contains a set of counters determined by its
type, e.g. this excerpt from a VP stats file:

TotalRunTime                  : 1997602722
HypervisorRunTime             : 649671371
RemoteNodeRunTime             : 0
NormalizedRunTime             : 1997602721
IdealCpu                      : 0
HypercallsCount               : 1708169
HypercallsTime                : 111914774
PageInvalidationsCount        : 0
PageInvalidationsTime         : 0
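Lines in this format are easy to consume from userspace. A minimal, hypothetical parser (not part of this series; parse_stat_line() is an illustrative helper) might look like:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Parse one "Name <padding>: value" line as emitted by a stats file. */
int parse_stat_line(const char *line, char *name, size_t name_sz,
		    unsigned long long *val)
{
	const char *colon = strchr(line, ':');
	size_t len;

	if (!colon)
		return -1;

	len = colon - line;
	while (len && line[len - 1] == ' ')	/* trim the %-30s padding */
		len--;
	if (!len || len >= name_sz)
		return -1;

	memcpy(name, line, len);
	name[len] = '\0';

	return sscanf(colon + 1, "%llu", val) == 1 ? 0 : -1;
}
```

A tool could call this per line on the contents of, say, a VP stats file to build a name-to-counter map.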

On a root partition with some active child partitions, the entire
directory structure may look like:

mshv/
  stats             # hypervisor stats
  lp/               # logical processors
    0/              # LP id
      stats         # LP 0 stats
    1/
    2/
    3/
  partition/        # partition stats
    1/              # root partition id
      stats         # root partition stats
      vp/           # root virtual processors
        0/          # root VP id
          stats     # root VP 0 stats
        1/
        2/
        3/
    42/             # child partition id
      stats         # child partition stats
      vp/           # child VPs
        0/          # child VP id
          stats     # child VP 0 stats
        1/
    43/
    55/

On L1VH, some stats are not present, since an L1VH partition does not
own the hardware as the root partition does:
- The hypervisor and lp stats are not present
- L1VH's partition directory is named "self" because it can't get its
  own id
- Some of L1VH's partition and VP stats fields are not populated, because
  it can't map its own HV_STATS_AREA_PARENT page.

Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Co-developed-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Co-developed-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
 drivers/hv/Makefile         |    1 +
 drivers/hv/mshv_debugfs.c   | 1122 +++++++++++++++++++++++++++++++++++
 drivers/hv/mshv_root.h      |   34 ++
 drivers/hv/mshv_root_main.c |   32 +-
 4 files changed, 1185 insertions(+), 4 deletions(-)
 create mode 100644 drivers/hv/mshv_debugfs.c

diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index 58b8d07639f3..36278c936914 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
 hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
 mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
 	       mshv_root_hv_call.o mshv_portid_table.o
+mshv_root-$(CONFIG_DEBUG_FS) += mshv_debugfs.o
 mshv_vtl-y := mshv_vtl_main.o
 
 # Code that must be built-in
diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
new file mode 100644
index 000000000000..581018690a27
--- /dev/null
+++ b/drivers/hv/mshv_debugfs.c
@@ -0,0 +1,1122 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2025, Microsoft Corporation.
+ *
+ * The /sys/kernel/debug/mshv directory contents.
+ * Contains various statistics data, provided by the hypervisor.
+ *
+ * Authors: Microsoft Linux virtualization team
+ */
+
+#include <linux/debugfs.h>
+#include <linux/stringify.h>
+#include <asm/mshyperv.h>
+#include <linux/slab.h>
+
+#include "mshv.h"
+#include "mshv_root.h"
+
+#define U32_BUF_SZ 11
+#define U64_BUF_SZ 21
+
+static struct dentry *mshv_debugfs;
+static struct dentry *mshv_debugfs_partition;
+static struct dentry *mshv_debugfs_lp;
+
+static u64 mshv_lps_count;
+
+static bool is_l1vh_parent(u64 partition_id)
+{
+	return hv_l1vh_partition() && (partition_id == HV_PARTITION_ID_SELF);
+}
+
+static int lp_stats_show(struct seq_file *m, void *v)
+{
+	const struct hv_stats_page *stats = m->private;
+
+#define LP_SEQ_PRINTF(cnt)		\
+	seq_printf(m, "%-29s: %llu\n", __stringify(cnt), stats->lp_cntrs[Lp##cnt])
+
+	LP_SEQ_PRINTF(GlobalTime);
+	LP_SEQ_PRINTF(TotalRunTime);
+	LP_SEQ_PRINTF(HypervisorRunTime);
+	LP_SEQ_PRINTF(HardwareInterrupts);
+	LP_SEQ_PRINTF(ContextSwitches);
+	LP_SEQ_PRINTF(InterProcessorInterrupts);
+	LP_SEQ_PRINTF(SchedulerInterrupts);
+	LP_SEQ_PRINTF(TimerInterrupts);
+	LP_SEQ_PRINTF(InterProcessorInterruptsSent);
+	LP_SEQ_PRINTF(ProcessorHalts);
+	LP_SEQ_PRINTF(MonitorTransitionCost);
+	LP_SEQ_PRINTF(ContextSwitchTime);
+	LP_SEQ_PRINTF(C1TransitionsCount);
+	LP_SEQ_PRINTF(C1RunTime);
+	LP_SEQ_PRINTF(C2TransitionsCount);
+	LP_SEQ_PRINTF(C2RunTime);
+	LP_SEQ_PRINTF(C3TransitionsCount);
+	LP_SEQ_PRINTF(C3RunTime);
+	LP_SEQ_PRINTF(RootVpIndex);
+	LP_SEQ_PRINTF(IdleSequenceNumber);
+	LP_SEQ_PRINTF(GlobalTscCount);
+	LP_SEQ_PRINTF(ActiveTscCount);
+	LP_SEQ_PRINTF(IdleAccumulation);
+	LP_SEQ_PRINTF(ReferenceCycleCount0);
+	LP_SEQ_PRINTF(ActualCycleCount0);
+	LP_SEQ_PRINTF(ReferenceCycleCount1);
+	LP_SEQ_PRINTF(ActualCycleCount1);
+	LP_SEQ_PRINTF(ProximityDomainId);
+	LP_SEQ_PRINTF(PostedInterruptNotifications);
+	LP_SEQ_PRINTF(BranchPredictorFlushes);
+#if IS_ENABLED(CONFIG_X86_64)
+	LP_SEQ_PRINTF(L1DataCacheFlushes);
+	LP_SEQ_PRINTF(ImmediateL1DataCacheFlushes);
+	LP_SEQ_PRINTF(MbFlushes);
+	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
+	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
+	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
+	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
+	LP_SEQ_PRINTF(HwpRequestContextSwitches);
+	LP_SEQ_PRINTF(Placeholder1);
+	LP_SEQ_PRINTF(Placeholder2);
+	LP_SEQ_PRINTF(Placeholder3);
+	LP_SEQ_PRINTF(Placeholder4);
+	LP_SEQ_PRINTF(Placeholder5);
+	LP_SEQ_PRINTF(Placeholder6);
+	LP_SEQ_PRINTF(Placeholder7);
+	LP_SEQ_PRINTF(Placeholder8);
+	LP_SEQ_PRINTF(Placeholder9);
+	LP_SEQ_PRINTF(Placeholder10);
+	LP_SEQ_PRINTF(ReserveGroupId);
+	LP_SEQ_PRINTF(RunningPriority);
+	LP_SEQ_PRINTF(PerfmonInterruptCount);
+#elif IS_ENABLED(CONFIG_ARM64)
+	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
+	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
+	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
+	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
+	LP_SEQ_PRINTF(HwpRequestContextSwitches);
+	LP_SEQ_PRINTF(Placeholder2);
+	LP_SEQ_PRINTF(Placeholder3);
+	LP_SEQ_PRINTF(Placeholder4);
+	LP_SEQ_PRINTF(Placeholder5);
+	LP_SEQ_PRINTF(Placeholder6);
+	LP_SEQ_PRINTF(Placeholder7);
+	LP_SEQ_PRINTF(Placeholder8);
+	LP_SEQ_PRINTF(Placeholder9);
+	LP_SEQ_PRINTF(SchLocalRunListSize);
+	LP_SEQ_PRINTF(ReserveGroupId);
+	LP_SEQ_PRINTF(RunningPriority);
+#endif
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(lp_stats);
+
+static void mshv_lp_stats_unmap(u32 lp_index, void *stats_page_addr)
+{
+	union hv_stats_object_identity identity = {
+		.lp.lp_index = lp_index,
+		.lp.stats_area_type = HV_STATS_AREA_SELF,
+	};
+	int err;
+
+	err = hv_unmap_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR,
+				  stats_page_addr, &identity);
+	if (err)
+		pr_err("%s: failed to unmap logical processor %u stats, err: %d\n",
+		       __func__, lp_index, err);
+}
+
+static void __init *mshv_lp_stats_map(u32 lp_index)
+{
+	union hv_stats_object_identity identity = {
+		.lp.lp_index = lp_index,
+		.lp.stats_area_type = HV_STATS_AREA_SELF,
+	};
+	void *stats;
+	int err;
+
+	err = hv_map_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR, &identity,
+				&stats);
+	if (err) {
+		pr_err("%s: failed to map logical processor %u stats, err: %d\n",
+		       __func__, lp_index, err);
+		return ERR_PTR(err);
+	}
+
+	return stats;
+}
+
+static void * __init lp_debugfs_stats_create(u32 lp_index, struct dentry *parent)
+{
+	struct dentry *dentry;
+	void *stats;
+
+	stats = mshv_lp_stats_map(lp_index);
+	if (IS_ERR(stats))
+		return stats;
+
+	dentry = debugfs_create_file("stats", 0400, parent,
+				     stats, &lp_stats_fops);
+	if (IS_ERR(dentry)) {
+		mshv_lp_stats_unmap(lp_index, stats);
+		return dentry;
+	}
+	return stats;
+}
+
+static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
+{
+	struct dentry *idx;
+	char lp_idx_str[U32_BUF_SZ];
+	void *stats;
+	int err;
+
+	sprintf(lp_idx_str, "%u", lp_index);
+
+	idx = debugfs_create_dir(lp_idx_str, parent);
+	if (IS_ERR(idx))
+		return PTR_ERR(idx);
+
+	stats = lp_debugfs_stats_create(lp_index, idx);
+	if (IS_ERR(stats)) {
+		err = PTR_ERR(stats);
+		goto remove_debugfs_lp_idx;
+	}
+
+	return 0;
+
+remove_debugfs_lp_idx:
+	debugfs_remove_recursive(idx);
+	return err;
+}
+
+static void mshv_debugfs_lp_remove(void)
+{
+	int lp_index;
+
+	debugfs_remove_recursive(mshv_debugfs_lp);
+
+	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
+		mshv_lp_stats_unmap(lp_index, NULL);
+}
+
+static int __init mshv_debugfs_lp_create(struct dentry *parent)
+{
+	struct dentry *lp_dir;
+	int err, lp_index;
+
+	lp_dir = debugfs_create_dir("lp", parent);
+	if (IS_ERR(lp_dir))
+		return PTR_ERR(lp_dir);
+
+	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
+		err = lp_debugfs_create(lp_index, lp_dir);
+		if (err)
+			goto remove_debugfs_lps;
+	}
+
+	mshv_debugfs_lp = lp_dir;
+
+	return 0;
+
+remove_debugfs_lps:
+	for (lp_index -= 1; lp_index >= 0; lp_index--)
+		mshv_lp_stats_unmap(lp_index, NULL);
+	debugfs_remove_recursive(lp_dir);
+	return err;
+}
+
+static int vp_stats_show(struct seq_file *m, void *v)
+{
+	const struct hv_stats_page **pstats = m->private;
+
+#define VP_SEQ_PRINTF(cnt)				 \
+do {								 \
+	if (pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]) \
+		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+			pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
+	else \
+		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+			pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
+} while (0)
+
+	VP_SEQ_PRINTF(TotalRunTime);
+	VP_SEQ_PRINTF(HypervisorRunTime);
+	VP_SEQ_PRINTF(RemoteNodeRunTime);
+	VP_SEQ_PRINTF(NormalizedRunTime);
+	VP_SEQ_PRINTF(IdealCpu);
+	VP_SEQ_PRINTF(HypercallsCount);
+	VP_SEQ_PRINTF(HypercallsTime);
+#if IS_ENABLED(CONFIG_X86_64)
+	VP_SEQ_PRINTF(PageInvalidationsCount);
+	VP_SEQ_PRINTF(PageInvalidationsTime);
+	VP_SEQ_PRINTF(ControlRegisterAccessesCount);
+	VP_SEQ_PRINTF(ControlRegisterAccessesTime);
+	VP_SEQ_PRINTF(IoInstructionsCount);
+	VP_SEQ_PRINTF(IoInstructionsTime);
+	VP_SEQ_PRINTF(HltInstructionsCount);
+	VP_SEQ_PRINTF(HltInstructionsTime);
+	VP_SEQ_PRINTF(MwaitInstructionsCount);
+	VP_SEQ_PRINTF(MwaitInstructionsTime);
+	VP_SEQ_PRINTF(CpuidInstructionsCount);
+	VP_SEQ_PRINTF(CpuidInstructionsTime);
+	VP_SEQ_PRINTF(MsrAccessesCount);
+	VP_SEQ_PRINTF(MsrAccessesTime);
+	VP_SEQ_PRINTF(OtherInterceptsCount);
+	VP_SEQ_PRINTF(OtherInterceptsTime);
+	VP_SEQ_PRINTF(ExternalInterruptsCount);
+	VP_SEQ_PRINTF(ExternalInterruptsTime);
+	VP_SEQ_PRINTF(PendingInterruptsCount);
+	VP_SEQ_PRINTF(PendingInterruptsTime);
+	VP_SEQ_PRINTF(EmulatedInstructionsCount);
+	VP_SEQ_PRINTF(EmulatedInstructionsTime);
+	VP_SEQ_PRINTF(DebugRegisterAccessesCount);
+	VP_SEQ_PRINTF(DebugRegisterAccessesTime);
+	VP_SEQ_PRINTF(PageFaultInterceptsCount);
+	VP_SEQ_PRINTF(PageFaultInterceptsTime);
+	VP_SEQ_PRINTF(GuestPageTableMaps);
+	VP_SEQ_PRINTF(LargePageTlbFills);
+	VP_SEQ_PRINTF(SmallPageTlbFills);
+	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
+	VP_SEQ_PRINTF(ApicMmioAccesses);
+	VP_SEQ_PRINTF(IoInterceptMessages);
+	VP_SEQ_PRINTF(MemoryInterceptMessages);
+	VP_SEQ_PRINTF(ApicEoiAccesses);
+	VP_SEQ_PRINTF(OtherMessages);
+	VP_SEQ_PRINTF(PageTableAllocations);
+	VP_SEQ_PRINTF(LogicalProcessorMigrations);
+	VP_SEQ_PRINTF(AddressSpaceEvictions);
+	VP_SEQ_PRINTF(AddressSpaceSwitches);
+	VP_SEQ_PRINTF(AddressDomainFlushes);
+	VP_SEQ_PRINTF(AddressSpaceFlushes);
+	VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
+	VP_SEQ_PRINTF(LocalGvaRangeFlushes);
+	VP_SEQ_PRINTF(PageTableEvictions);
+	VP_SEQ_PRINTF(PageTableReclamations);
+	VP_SEQ_PRINTF(PageTableResets);
+	VP_SEQ_PRINTF(PageTableValidations);
+	VP_SEQ_PRINTF(ApicTprAccesses);
+	VP_SEQ_PRINTF(PageTableWriteIntercepts);
+	VP_SEQ_PRINTF(SyntheticInterrupts);
+	VP_SEQ_PRINTF(VirtualInterrupts);
+	VP_SEQ_PRINTF(ApicIpisSent);
+	VP_SEQ_PRINTF(ApicSelfIpisSent);
+	VP_SEQ_PRINTF(GpaSpaceHypercalls);
+	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
+	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
+	VP_SEQ_PRINTF(OtherHypercalls);
+	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
+	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
+	VP_SEQ_PRINTF(VirtualMmuHypercalls);
+	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
+	VP_SEQ_PRINTF(HardwareInterrupts);
+	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
+	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
+	VP_SEQ_PRINTF(PageScans);
+	VP_SEQ_PRINTF(LogicalProcessorDispatches);
+	VP_SEQ_PRINTF(WaitingForCpuTime);
+	VP_SEQ_PRINTF(ExtendedHypercalls);
+	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
+	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
+	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
+	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
+	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
+	VP_SEQ_PRINTF(LocalIoTlbFlushes);
+	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
+	VP_SEQ_PRINTF(HypercallsForwardedCount);
+	VP_SEQ_PRINTF(HypercallsForwardingTime);
+	VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
+	VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
+	VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
+	VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
+	VP_SEQ_PRINTF(IoInstructionsForwardedCount);
+	VP_SEQ_PRINTF(IoInstructionsForwardingTime);
+	VP_SEQ_PRINTF(HltInstructionsForwardedCount);
+	VP_SEQ_PRINTF(HltInstructionsForwardingTime);
+	VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
+	VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
+	VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
+	VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
+	VP_SEQ_PRINTF(MsrAccessesForwardedCount);
+	VP_SEQ_PRINTF(MsrAccessesForwardingTime);
+	VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
+	VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
+	VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
+	VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
+	VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
+	VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
+	VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
+	VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
+	VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
+	VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
+	VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
+	VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
+	VP_SEQ_PRINTF(VmclearEmulationCount);
+	VP_SEQ_PRINTF(VmclearEmulationTime);
+	VP_SEQ_PRINTF(VmptrldEmulationCount);
+	VP_SEQ_PRINTF(VmptrldEmulationTime);
+	VP_SEQ_PRINTF(VmptrstEmulationCount);
+	VP_SEQ_PRINTF(VmptrstEmulationTime);
+	VP_SEQ_PRINTF(VmreadEmulationCount);
+	VP_SEQ_PRINTF(VmreadEmulationTime);
+	VP_SEQ_PRINTF(VmwriteEmulationCount);
+	VP_SEQ_PRINTF(VmwriteEmulationTime);
+	VP_SEQ_PRINTF(VmxoffEmulationCount);
+	VP_SEQ_PRINTF(VmxoffEmulationTime);
+	VP_SEQ_PRINTF(VmxonEmulationCount);
+	VP_SEQ_PRINTF(VmxonEmulationTime);
+	VP_SEQ_PRINTF(NestedVMEntriesCount);
+	VP_SEQ_PRINTF(NestedVMEntriesTime);
+	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
+	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
+	VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
+	VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
+	VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
+	VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
+	VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
+	VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
+	VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
+	VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
+	VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
+	VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
+	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
+	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
+	VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
+	VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
+	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
+	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
+	VP_SEQ_PRINTF(PostedInterruptNotifications);
+	VP_SEQ_PRINTF(PostedInterruptScans);
+	VP_SEQ_PRINTF(TotalCoreRunTime);
+	VP_SEQ_PRINTF(MaximumRunTime);
+	VP_SEQ_PRINTF(HwpRequestContextSwitches);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
+	VP_SEQ_PRINTF(VmloadEmulationCount);
+	VP_SEQ_PRINTF(VmloadEmulationTime);
+	VP_SEQ_PRINTF(VmsaveEmulationCount);
+	VP_SEQ_PRINTF(VmsaveEmulationTime);
+	VP_SEQ_PRINTF(GifInstructionEmulationCount);
+	VP_SEQ_PRINTF(GifInstructionEmulationTime);
+	VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
+	VP_SEQ_PRINTF(Placeholder1);
+	VP_SEQ_PRINTF(Placeholder2);
+	VP_SEQ_PRINTF(Placeholder3);
+	VP_SEQ_PRINTF(Placeholder4);
+	VP_SEQ_PRINTF(Placeholder5);
+	VP_SEQ_PRINTF(Placeholder6);
+	VP_SEQ_PRINTF(Placeholder7);
+	VP_SEQ_PRINTF(Placeholder8);
+	VP_SEQ_PRINTF(Placeholder9);
+	VP_SEQ_PRINTF(Placeholder10);
+	VP_SEQ_PRINTF(SchedulingPriority);
+	VP_SEQ_PRINTF(RdpmcInstructionsCount);
+	VP_SEQ_PRINTF(RdpmcInstructionsTime);
+	VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
+	VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
+	VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
+	VP_SEQ_PRINTF(PerfmonInterruptCount);
+	VP_SEQ_PRINTF(Vtl1DispatchCount);
+	VP_SEQ_PRINTF(Vtl2DispatchCount);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
+	VP_SEQ_PRINTF(Vtl1RunTime);
+	VP_SEQ_PRINTF(Vtl2RunTime);
+	VP_SEQ_PRINTF(IommuHypercalls);
+	VP_SEQ_PRINTF(CpuGroupHypercalls);
+	VP_SEQ_PRINTF(VsmHypercalls);
+	VP_SEQ_PRINTF(EventLogHypercalls);
+	VP_SEQ_PRINTF(DeviceDomainHypercalls);
+	VP_SEQ_PRINTF(DepositHypercalls);
+	VP_SEQ_PRINTF(SvmHypercalls);
+	VP_SEQ_PRINTF(BusLockAcquisitionCount);
+#elif IS_ENABLED(CONFIG_ARM64)
+	VP_SEQ_PRINTF(SysRegAccessesCount);
+	VP_SEQ_PRINTF(SysRegAccessesTime);
+	VP_SEQ_PRINTF(SmcInstructionsCount);
+	VP_SEQ_PRINTF(SmcInstructionsTime);
+	VP_SEQ_PRINTF(OtherInterceptsCount);
+	VP_SEQ_PRINTF(OtherInterceptsTime);
+	VP_SEQ_PRINTF(ExternalInterruptsCount);
+	VP_SEQ_PRINTF(ExternalInterruptsTime);
+	VP_SEQ_PRINTF(PendingInterruptsCount);
+	VP_SEQ_PRINTF(PendingInterruptsTime);
+	VP_SEQ_PRINTF(GuestPageTableMaps);
+	VP_SEQ_PRINTF(LargePageTlbFills);
+	VP_SEQ_PRINTF(SmallPageTlbFills);
+	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
+	VP_SEQ_PRINTF(MemoryInterceptMessages);
+	VP_SEQ_PRINTF(OtherMessages);
+	VP_SEQ_PRINTF(LogicalProcessorMigrations);
+	VP_SEQ_PRINTF(AddressDomainFlushes);
+	VP_SEQ_PRINTF(AddressSpaceFlushes);
+	VP_SEQ_PRINTF(SyntheticInterrupts);
+	VP_SEQ_PRINTF(VirtualInterrupts);
+	VP_SEQ_PRINTF(ApicSelfIpisSent);
+	VP_SEQ_PRINTF(GpaSpaceHypercalls);
+	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
+	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
+	VP_SEQ_PRINTF(OtherHypercalls);
+	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
+	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
+	VP_SEQ_PRINTF(VirtualMmuHypercalls);
+	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
+	VP_SEQ_PRINTF(HardwareInterrupts);
+	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
+	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
+	VP_SEQ_PRINTF(LogicalProcessorDispatches);
+	VP_SEQ_PRINTF(WaitingForCpuTime);
+	VP_SEQ_PRINTF(ExtendedHypercalls);
+	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
+	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
+	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
+	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
+	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
+	VP_SEQ_PRINTF(LocalIoTlbFlushes);
+	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
+	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
+	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
+	VP_SEQ_PRINTF(PostedInterruptNotifications);
+	VP_SEQ_PRINTF(PostedInterruptScans);
+	VP_SEQ_PRINTF(TotalCoreRunTime);
+	VP_SEQ_PRINTF(MaximumRunTime);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
+	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
+	VP_SEQ_PRINTF(HwpRequestContextSwitches);
+	VP_SEQ_PRINTF(Placeholder2);
+	VP_SEQ_PRINTF(Placeholder3);
+	VP_SEQ_PRINTF(Placeholder4);
+	VP_SEQ_PRINTF(Placeholder5);
+	VP_SEQ_PRINTF(Placeholder6);
+	VP_SEQ_PRINTF(Placeholder7);
+	VP_SEQ_PRINTF(Placeholder8);
+	VP_SEQ_PRINTF(ContentionTime);
+	VP_SEQ_PRINTF(WakeUpTime);
+	VP_SEQ_PRINTF(SchedulingPriority);
+	VP_SEQ_PRINTF(Vtl1DispatchCount);
+	VP_SEQ_PRINTF(Vtl2DispatchCount);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
+	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
+	VP_SEQ_PRINTF(Vtl1RunTime);
+	VP_SEQ_PRINTF(Vtl2RunTime);
+	VP_SEQ_PRINTF(IommuHypercalls);
+	VP_SEQ_PRINTF(CpuGroupHypercalls);
+	VP_SEQ_PRINTF(VsmHypercalls);
+	VP_SEQ_PRINTF(EventLogHypercalls);
+	VP_SEQ_PRINTF(DeviceDomainHypercalls);
+	VP_SEQ_PRINTF(DepositHypercalls);
+	VP_SEQ_PRINTF(SvmHypercalls);
+#endif
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(vp_stats);
+
+static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index, void *stats_page_addr,
+				enum hv_stats_area_type stats_area_type)
+{
+	union hv_stats_object_identity identity = {
+		.vp.partition_id = partition_id,
+		.vp.vp_index = vp_index,
+		.vp.stats_area_type = stats_area_type,
+	};
+	int err;
+
+	err = hv_unmap_stats_page(HV_STATS_OBJECT_VP, stats_page_addr, &identity);
+	if (err)
+		pr_err("%s: failed to unmap partition %llu vp %u %s stats, err: %d\n",
+		       __func__, partition_id, vp_index,
+		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+		       err);
+}
+
+static void *mshv_vp_stats_map(u64 partition_id, u32 vp_index,
+			       enum hv_stats_area_type stats_area_type)
+{
+	union hv_stats_object_identity identity = {
+		.vp.partition_id = partition_id,
+		.vp.vp_index = vp_index,
+		.vp.stats_area_type = stats_area_type,
+	};
+	void *stats;
+	int err;
+
+	err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity, &stats);
+	if (err) {
+		pr_err("%s: failed to map partition %llu vp %u %s stats, err: %d\n",
+		       __func__, partition_id, vp_index,
+		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+		       err);
+		return ERR_PTR(err);
+	}
+	return stats;
+}
+
+static int vp_debugfs_stats_create(u64 partition_id, u32 vp_index,
+				   struct dentry **vp_stats_ptr,
+				   struct dentry *parent)
+{
+	struct dentry *dentry;
+	struct hv_stats_page **pstats;
+	int err;
+
+	pstats = kcalloc(2, sizeof(*pstats), GFP_KERNEL_ACCOUNT);
+	if (!pstats)
+		return -ENOMEM;
+
+	pstats[HV_STATS_AREA_SELF] = mshv_vp_stats_map(partition_id, vp_index,
+						       HV_STATS_AREA_SELF);
+	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
+		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
+		goto cleanup;
+	}
+
+	/* An L1VH partition cannot access its VP stats in the parent area. */
+	if (is_l1vh_parent(partition_id)) {
+		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+	} else {
+		pstats[HV_STATS_AREA_PARENT] = mshv_vp_stats_map(
+			partition_id, vp_index, HV_STATS_AREA_PARENT);
+		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
+			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
+			goto unmap_self;
+		}
+		if (!pstats[HV_STATS_AREA_PARENT])
+			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+	}
+
+	dentry = debugfs_create_file("stats", 0400, parent,
+				     pstats, &vp_stats_fops);
+	if (IS_ERR(dentry)) {
+		err = PTR_ERR(dentry);
+		goto unmap_vp_stats;
+	}
+
+	*vp_stats_ptr = dentry;
+	return 0;
+
+unmap_vp_stats:
+	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
+		mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_PARENT],
+				    HV_STATS_AREA_PARENT);
+unmap_self:
+	mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_SELF],
+			    HV_STATS_AREA_SELF);
+cleanup:
+	kfree(pstats);
+	return err;
+}
+
+static void vp_debugfs_remove(u64 partition_id, u32 vp_index,
+			      struct dentry *vp_stats)
+{
+	struct hv_stats_page **pstats;
+	void *stats;
+
+	pstats = vp_stats->d_inode->i_private;
+	debugfs_remove_recursive(vp_stats->d_parent);
+	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
+		stats = pstats[HV_STATS_AREA_PARENT];
+		mshv_vp_stats_unmap(partition_id, vp_index, stats,
+				    HV_STATS_AREA_PARENT);
+	}
+
+	stats = pstats[HV_STATS_AREA_SELF];
+	mshv_vp_stats_unmap(partition_id, vp_index, stats, HV_STATS_AREA_SELF);
+
+	kfree(pstats);
+}
+
+static int vp_debugfs_create(u64 partition_id, u32 vp_index,
+			     struct dentry **vp_stats_ptr,
+			     struct dentry *parent)
+{
+	struct dentry *vp_idx_dir;
+	char vp_idx_str[U32_BUF_SZ];
+	int err;
+
+	sprintf(vp_idx_str, "%u", vp_index);
+
+	vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
+	if (IS_ERR(vp_idx_dir))
+		return PTR_ERR(vp_idx_dir);
+
+	err = vp_debugfs_stats_create(partition_id, vp_index, vp_stats_ptr,
+				      vp_idx_dir);
+	if (err)
+		goto remove_debugfs_vp_idx;
+
+	return 0;
+
+remove_debugfs_vp_idx:
+	debugfs_remove_recursive(vp_idx_dir);
+	return err;
+}
+
+static int partition_stats_show(struct seq_file *m, void *v)
+{
+	const struct hv_stats_page **pstats = m->private;
+
+#define PARTITION_SEQ_PRINTF(cnt)				 \
+do {								 \
+	if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
+		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+			pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
+	else \
+		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
+			pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
+} while (0)
+
+	PARTITION_SEQ_PRINTF(VirtualProcessors);
+	PARTITION_SEQ_PRINTF(TlbSize);
+	PARTITION_SEQ_PRINTF(AddressSpaces);
+	PARTITION_SEQ_PRINTF(DepositedPages);
+	PARTITION_SEQ_PRINTF(GpaPages);
+	PARTITION_SEQ_PRINTF(GpaSpaceModifications);
+	PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
+	PARTITION_SEQ_PRINTF(RecommendedTlbSize);
+	PARTITION_SEQ_PRINTF(GpaPages4K);
+	PARTITION_SEQ_PRINTF(GpaPages2M);
+	PARTITION_SEQ_PRINTF(GpaPages1G);
+	PARTITION_SEQ_PRINTF(GpaPages512G);
+	PARTITION_SEQ_PRINTF(DevicePages4K);
+	PARTITION_SEQ_PRINTF(DevicePages2M);
+	PARTITION_SEQ_PRINTF(DevicePages1G);
+	PARTITION_SEQ_PRINTF(DevicePages512G);
+	PARTITION_SEQ_PRINTF(AttachedDevices);
+	PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
+	PARTITION_SEQ_PRINTF(IoTlbFlushes);
+	PARTITION_SEQ_PRINTF(IoTlbFlushCost);
+	PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
+	PARTITION_SEQ_PRINTF(DeviceDmaErrors);
+	PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
+	PARTITION_SEQ_PRINTF(SkippedTimerTicks);
+	PARTITION_SEQ_PRINTF(PartitionId);
+#if IS_ENABLED(CONFIG_X86_64)
+	PARTITION_SEQ_PRINTF(NestedTlbSize);
+	PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
+	PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
+	PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
+	PARTITION_SEQ_PRINTF(PagesShattered);
+	PARTITION_SEQ_PRINTF(PagesRecombined);
+	PARTITION_SEQ_PRINTF(HwpRequestValue);
+#elif IS_ENABLED(CONFIG_ARM64)
+	PARTITION_SEQ_PRINTF(HwpRequestValue);
+#endif
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(partition_stats);
+
+static void mshv_partition_stats_unmap(u64 partition_id, void *stats_page_addr,
+				       enum hv_stats_area_type stats_area_type)
+{
+	union hv_stats_object_identity identity = {
+		.partition.partition_id = partition_id,
+		.partition.stats_area_type = stats_area_type,
+	};
+	int err;
+
+	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
+				  &identity);
+	if (err) {
+		pr_err("%s: failed to unmap partition %llu %s stats, err: %d\n",
+		       __func__, partition_id,
+		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+		       err);
+	}
+}
+
+static void *mshv_partition_stats_map(u64 partition_id,
+				      enum hv_stats_area_type stats_area_type)
+{
+	union hv_stats_object_identity identity = {
+		.partition.partition_id = partition_id,
+		.partition.stats_area_type = stats_area_type,
+	};
+	void *stats;
+	int err;
+
+	err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
+	if (err) {
+		pr_err("%s: failed to map partition %llu %s stats, err: %d\n",
+		       __func__, partition_id,
+		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
+		       err);
+		return ERR_PTR(err);
+	}
+	return stats;
+}
+
+static int mshv_debugfs_partition_stats_create(u64 partition_id,
+					       struct dentry **partition_stats_ptr,
+					       struct dentry *parent)
+{
+	struct dentry *dentry;
+	struct hv_stats_page **pstats;
+	int err;
+
+	pstats = kcalloc(2, sizeof(*pstats), GFP_KERNEL_ACCOUNT);
+	if (!pstats)
+		return -ENOMEM;
+
+	pstats[HV_STATS_AREA_SELF] = mshv_partition_stats_map(partition_id,
+							      HV_STATS_AREA_SELF);
+	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
+		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
+		goto cleanup;
+	}
+
+	/*
+	 * An L1VH partition cannot access its partition stats in the
+	 * parent area.
+	 */
+	if (is_l1vh_parent(partition_id)) {
+		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+	} else {
+		pstats[HV_STATS_AREA_PARENT] = mshv_partition_stats_map(partition_id,
+									HV_STATS_AREA_PARENT);
+		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
+			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
+			goto unmap_self;
+		}
+		if (!pstats[HV_STATS_AREA_PARENT])
+			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
+	}
+
+	dentry = debugfs_create_file("stats", 0400, parent,
+				     pstats, &partition_stats_fops);
+	if (IS_ERR(dentry)) {
+		err = PTR_ERR(dentry);
+		goto unmap_partition_stats;
+	}
+
+	*partition_stats_ptr = dentry;
+	return 0;
+
+unmap_partition_stats:
+	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
+		mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_PARENT],
+					   HV_STATS_AREA_PARENT);
+unmap_self:
+	mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_SELF],
+				   HV_STATS_AREA_SELF);
+cleanup:
+	kfree(pstats);
+	return err;
+}
+
+static void partition_debugfs_remove(u64 partition_id, struct dentry *dentry)
+{
+	struct hv_stats_page **pstats;
+	void *stats;
+
+	pstats = dentry->d_inode->i_private;
+
+	debugfs_remove_recursive(dentry->d_parent);
+
+	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
+		stats = pstats[HV_STATS_AREA_PARENT];
+		mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_PARENT);
+	}
+
+	stats = pstats[HV_STATS_AREA_SELF];
+	mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_SELF);
+
+	kfree(pstats);
+}
+
+static int partition_debugfs_create(u64 partition_id,
+				    struct dentry **vp_dir_ptr,
+				    struct dentry **partition_stats_ptr,
+				    struct dentry *parent)
+{
+	char part_id_str[U64_BUF_SZ];
+	struct dentry *part_id_dir, *vp_dir;
+	int err;
+
+	if (is_l1vh_parent(partition_id))
+		sprintf(part_id_str, "self");
+	else
+		sprintf(part_id_str, "%llu", partition_id);
+
+	part_id_dir = debugfs_create_dir(part_id_str, parent);
+	if (IS_ERR(part_id_dir))
+		return PTR_ERR(part_id_dir);
+
+	vp_dir = debugfs_create_dir("vp", part_id_dir);
+	if (IS_ERR(vp_dir)) {
+		err = PTR_ERR(vp_dir);
+		goto remove_debugfs_partition_id;
+	}
+
+	err = mshv_debugfs_partition_stats_create(partition_id,
+						  partition_stats_ptr,
+						  part_id_dir);
+	if (err)
+		goto remove_debugfs_partition_id;
+
+	*vp_dir_ptr = vp_dir;
+
+	return 0;
+
+remove_debugfs_partition_id:
+	debugfs_remove_recursive(part_id_dir);
+	return err;
+}
+
+static void mshv_debugfs_parent_partition_remove(void)
+{
+	int idx;
+
+	for_each_online_cpu(idx)
+		vp_debugfs_remove(hv_current_partition_id, hv_vp_index[idx], NULL);
+
+	partition_debugfs_remove(hv_current_partition_id, NULL);
+}
+
+static int __init mshv_debugfs_parent_partition_create(void)
+{
+	struct dentry *partition_stats, *vp_dir;
+	int err, idx, i;
+
+	mshv_debugfs_partition = debugfs_create_dir("partition",
+						     mshv_debugfs);
+	if (IS_ERR(mshv_debugfs_partition))
+		return PTR_ERR(mshv_debugfs_partition);
+
+	err = partition_debugfs_create(hv_current_partition_id,
+				       &vp_dir,
+				       &partition_stats,
+				       mshv_debugfs_partition);
+	if (err)
+		goto remove_debugfs_partition;
+
+	for_each_online_cpu(idx) {
+		struct dentry *vp_stats;
+
+		err = vp_debugfs_create(hv_current_partition_id,
+					hv_vp_index[idx],
+					&vp_stats,
+					vp_dir);
+		if (err)
+			goto remove_debugfs_partition_vp;
+	}
+
+	return 0;
+
+remove_debugfs_partition_vp:
+	for_each_online_cpu(i) {
+		if (i >= idx)
+			break;
+		vp_debugfs_remove(hv_current_partition_id, hv_vp_index[i], NULL);
+	}
+	partition_debugfs_remove(hv_current_partition_id, NULL);
+remove_debugfs_partition:
+	debugfs_remove_recursive(mshv_debugfs_partition);
+	return err;
+}
+
+static int hv_stats_show(struct seq_file *m, void *v)
+{
+	const struct hv_stats_page *stats = m->private;
+
+#define HV_SEQ_PRINTF(cnt)		\
+	seq_printf(m, "%-25s: %llu\n", __stringify(cnt), stats->hv_cntrs[Hv##cnt])
+
+	HV_SEQ_PRINTF(LogicalProcessors);
+	HV_SEQ_PRINTF(Partitions);
+	HV_SEQ_PRINTF(TotalPages);
+	HV_SEQ_PRINTF(VirtualProcessors);
+	HV_SEQ_PRINTF(MonitoredNotifications);
+	HV_SEQ_PRINTF(ModernStandbyEntries);
+	HV_SEQ_PRINTF(PlatformIdleTransitions);
+	HV_SEQ_PRINTF(HypervisorStartupCost);
+	HV_SEQ_PRINTF(IOSpacePages);
+	HV_SEQ_PRINTF(NonEssentialPagesForDump);
+	HV_SEQ_PRINTF(SubsumedPages);
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(hv_stats);
+
+static void mshv_hv_stats_unmap(void)
+{
+	union hv_stats_object_identity identity = {
+		.hv.stats_area_type = HV_STATS_AREA_SELF,
+	};
+	int err;
+
+	err = hv_unmap_stats_page(HV_STATS_OBJECT_HYPERVISOR, NULL, &identity);
+	if (err)
+		pr_err("%s: failed to unmap hypervisor stats: %d\n",
+		       __func__, err);
+}
+
+static void * __init mshv_hv_stats_map(void)
+{
+	union hv_stats_object_identity identity = {
+		.hv.stats_area_type = HV_STATS_AREA_SELF,
+	};
+	void *stats;
+	int err;
+
+	err = hv_map_stats_page(HV_STATS_OBJECT_HYPERVISOR, &identity, &stats);
+	if (err) {
+		pr_err("%s: failed to map hypervisor stats: %d\n",
+		       __func__, err);
+		return ERR_PTR(err);
+	}
+	return stats;
+}
+
+static int __init mshv_debugfs_hv_stats_create(struct dentry *parent)
+{
+	struct dentry *dentry;
+	u64 *stats;
+	int err;
+
+	stats = mshv_hv_stats_map();
+	if (IS_ERR(stats))
+		return PTR_ERR(stats);
+
+	dentry = debugfs_create_file("stats", 0400, parent,
+				     stats, &hv_stats_fops);
+	if (IS_ERR(dentry)) {
+		err = PTR_ERR(dentry);
+		pr_err("%s: failed to create hypervisor stats dentry: %d\n",
+		       __func__, err);
+		goto unmap_hv_stats;
+	}
+
+	mshv_lps_count = stats[HvLogicalProcessors];
+
+	return 0;
+
+unmap_hv_stats:
+	mshv_hv_stats_unmap();
+	return err;
+}
+
+int mshv_debugfs_vp_create(struct mshv_vp *vp)
+{
+	struct mshv_partition *p = vp->vp_partition;
+
+	if (!mshv_debugfs)
+		return 0;
+
+	return vp_debugfs_create(p->pt_id, vp->vp_index,
+				 &vp->vp_debugfs_stats_dentry,
+				 p->pt_debugfs_vp_dentry);
+}
+
+void mshv_debugfs_vp_remove(struct mshv_vp *vp)
+{
+	if (!mshv_debugfs)
+		return;
+
+	vp_debugfs_remove(vp->vp_partition->pt_id, vp->vp_index,
+			  vp->vp_debugfs_stats_dentry);
+}
+
+int mshv_debugfs_partition_create(struct mshv_partition *partition)
+{
+	if (!mshv_debugfs)
+		return 0;
+
+	return partition_debugfs_create(partition->pt_id,
+					&partition->pt_debugfs_vp_dentry,
+					&partition->pt_debugfs_stats_dentry,
+					mshv_debugfs_partition);
+}
+
+void mshv_debugfs_partition_remove(struct mshv_partition *partition)
+{
+	if (!mshv_debugfs)
+		return;
+
+	partition_debugfs_remove(partition->pt_id,
+				 partition->pt_debugfs_stats_dentry);
+}
+
+int __init mshv_debugfs_init(void)
+{
+	int err;
+
+	mshv_debugfs = debugfs_create_dir("mshv", NULL);
+	if (IS_ERR(mshv_debugfs)) {
+		pr_err("%s: failed to create debugfs directory\n", __func__);
+		return PTR_ERR(mshv_debugfs);
+	}
+
+	if (hv_root_partition()) {
+		err = mshv_debugfs_hv_stats_create(mshv_debugfs);
+		if (err)
+			goto remove_mshv_dir;
+
+		err = mshv_debugfs_lp_create(mshv_debugfs);
+		if (err)
+			goto unmap_hv_stats;
+	}
+
+	err = mshv_debugfs_parent_partition_create();
+	if (err)
+		goto unmap_lp_stats;
+
+	return 0;
+
+unmap_lp_stats:
+	if (hv_root_partition())
+		mshv_debugfs_lp_remove();
+unmap_hv_stats:
+	if (hv_root_partition())
+		mshv_hv_stats_unmap();
+remove_mshv_dir:
+	debugfs_remove_recursive(mshv_debugfs);
+	return err;
+}
+
+void mshv_debugfs_exit(void)
+{
+	mshv_debugfs_parent_partition_remove();
+
+	if (hv_root_partition()) {
+		mshv_debugfs_lp_remove();
+		mshv_hv_stats_unmap();
+	}
+
+	debugfs_remove_recursive(mshv_debugfs);
+}
diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
index 3eb815011b46..1f1b1984449b 100644
--- a/drivers/hv/mshv_root.h
+++ b/drivers/hv/mshv_root.h
@@ -51,6 +51,9 @@ struct mshv_vp {
 		unsigned int kicked_by_hv;
 		wait_queue_head_t vp_suspend_queue;
 	} run;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	struct dentry *vp_debugfs_stats_dentry;
+#endif
 };
 
 #define vp_fmt(fmt) "p%lluvp%u: " fmt
@@ -128,6 +131,10 @@ struct mshv_partition {
 	u64 isolation_type;
 	bool import_completed;
 	bool pt_initialized;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	struct dentry *pt_debugfs_stats_dentry;
+	struct dentry *pt_debugfs_vp_dentry;
+#endif
 };
 
 #define pt_fmt(fmt) "p%llu: " fmt
@@ -308,6 +315,33 @@ int hv_call_modify_spa_host_access(u64 partition_id, struct page **pages,
 int hv_call_get_partition_property_ex(u64 partition_id, u64 property_code, u64 arg,
 				      void *property_value, size_t property_value_sz);
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+int __init mshv_debugfs_init(void);
+void mshv_debugfs_exit(void);
+
+int mshv_debugfs_partition_create(struct mshv_partition *partition);
+void mshv_debugfs_partition_remove(struct mshv_partition *partition);
+int mshv_debugfs_vp_create(struct mshv_vp *vp);
+void mshv_debugfs_vp_remove(struct mshv_vp *vp);
+#else
+static inline int __init mshv_debugfs_init(void)
+{
+	return 0;
+}
+static inline void mshv_debugfs_exit(void) { }
+
+static inline int mshv_debugfs_partition_create(struct mshv_partition *partition)
+{
+	return 0;
+}
+static inline void mshv_debugfs_partition_remove(struct mshv_partition *partition) { }
+static inline int mshv_debugfs_vp_create(struct mshv_vp *vp)
+{
+	return 0;
+}
+static inline void mshv_debugfs_vp_remove(struct mshv_vp *vp) { }
+#endif
+
 extern struct mshv_root mshv_root;
 extern enum hv_scheduler_type hv_scheduler_type;
 extern u8 * __percpu *hv_synic_eventring_tail;
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 19006b788e85..152fcd9b45e6 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -982,6 +982,10 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 	if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
 		memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
 
+	ret = mshv_debugfs_vp_create(vp);
+	if (ret)
+		goto put_partition;
+
 	/*
 	 * Keep anon_inode_getfd last: it installs fd in the file struct and
 	 * thus makes the state accessible in user space.
@@ -989,7 +993,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 	ret = anon_inode_getfd("mshv_vp", &mshv_vp_fops, vp,
 			       O_RDWR | O_CLOEXEC);
 	if (ret < 0)
-		goto put_partition;
+		goto remove_debugfs_vp;
 
 	/* already exclusive with the partition mutex for all ioctls */
 	partition->pt_vp_count++;
@@ -997,6 +1001,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 
 	return ret;
 
+remove_debugfs_vp:
+	mshv_debugfs_vp_remove(vp);
 put_partition:
 	mshv_partition_put(partition);
 free_vp:
@@ -1556,13 +1562,18 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
 
 	ret = hv_call_initialize_partition(partition->pt_id);
 	if (ret)
-		goto withdraw_mem;
+		return ret;
+
+	ret = mshv_debugfs_partition_create(partition);
+	if (ret)
+		goto finalize_partition;
 
 	partition->pt_initialized = true;
 
 	return 0;
 
-withdraw_mem:
+finalize_partition:
+	hv_call_finalize_partition(partition->pt_id);
 	hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
 
 	return ret;
@@ -1741,6 +1752,8 @@ static void destroy_partition(struct mshv_partition *partition)
 			if (!vp)
 				continue;
 
+			mshv_debugfs_vp_remove(vp);
+
 			if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
 				mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
 						    (void **)vp->vp_stats_pages);
@@ -1775,6 +1788,8 @@ static void destroy_partition(struct mshv_partition *partition)
 			partition->pt_vp_array[i] = NULL;
 		}
 
+		mshv_debugfs_partition_remove(partition);
+
 		/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
 		hv_call_finalize_partition(partition->pt_id);
 
@@ -2351,10 +2366,14 @@ static int __init mshv_parent_partition_init(void)
 
 	mshv_init_vmm_caps(dev);
 
-	ret = mshv_irqfd_wq_init();
+	ret = mshv_debugfs_init();
 	if (ret)
 		goto exit_partition;
 
+	ret = mshv_irqfd_wq_init();
+	if (ret)
+		goto exit_debugfs;
+
 	spin_lock_init(&mshv_root.pt_ht_lock);
 	hash_init(mshv_root.pt_htable);
 
@@ -2362,6 +2381,10 @@ static int __init mshv_parent_partition_init(void)
 
 	return 0;
 
+destroy_irqds_wq:
+	mshv_irqfd_wq_cleanup();
+exit_debugfs:
+	mshv_debugfs_exit();
 exit_partition:
 	if (hv_root_partition())
 		mshv_root_partition_exit();
@@ -2378,6 +2401,7 @@ static void __exit mshv_parent_partition_exit(void)
 {
 	hv_setup_mshv_handler(NULL);
 	mshv_port_table_fini();
+	mshv_debugfs_exit();
 	misc_deregister(&mshv_dev);
 	mshv_irqfd_wq_cleanup();
 	if (hv_root_partition())
-- 
2.34.1



* Re: [PATCH v2 1/3] mshv: Ignore second stats page map result failure
  2025-12-05 18:58 ` [PATCH v2 1/3] mshv: Ignore second stats page map result failure Nuno Das Neves
@ 2025-12-05 22:50   ` Stanislav Kinsburskii
  2025-12-08 15:12   ` Michael Kelley
  1 sibling, 0 replies; 18+ messages in thread
From: Stanislav Kinsburskii @ 2025-12-05 22:50 UTC (permalink / raw)
  To: Nuno Das Neves
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, wei.liu, decui, longli,
	mhklinux, prapal, mrathor, paekkaladevi

On Fri, Dec 05, 2025 at 10:58:40AM -0800, Nuno Das Neves wrote:
> From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> 
> Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
> and return HV_STATUS_INVALID_PARAMETER for the second stats page
> mapping request.
> 
> This results in a failure in module init. Instead of failing, gracefully
> fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
> already-mapped stats_pages[HV_STATS_AREA_SELF].
> 

Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>

> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
>  drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
>  drivers/hv/mshv_root_main.c    |  3 +++
>  2 files changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> index 598eaff4ff29..b1770c7b500c 100644
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c
> @@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
>  	return ret;
>  }
>  
> +static int
> +hv_stats_get_area_type(enum hv_stats_object_type type,
> +		       const union hv_stats_object_identity *identity)
> +{
> +	switch (type) {
> +	case HV_STATS_OBJECT_HYPERVISOR:
> +		return identity->hv.stats_area_type;
> +	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
> +		return identity->lp.stats_area_type;
> +	case HV_STATS_OBJECT_PARTITION:
> +		return identity->partition.stats_area_type;
> +	case HV_STATS_OBJECT_VP:
> +		return identity->vp.stats_area_type;
> +	}
> +
> +	return -EINVAL;
> +}
> +
>  static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  				  const union hv_stats_object_identity *identity,
>  				  void **addr)
> @@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  	struct hv_input_map_stats_page *input;
>  	struct hv_output_map_stats_page *output;
>  	u64 status, pfn;
> -	int ret = 0;
> +	int hv_status, ret = 0;
>  
>  	do {
>  		local_irq_save(flags);
> @@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  		pfn = output->map_location;
>  
>  		local_irq_restore(flags);
> -		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
> -			ret = hv_result_to_errno(status);
> +
> +		hv_status = hv_result(status);
> +		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
>  			if (hv_result_success(status))
>  				break;
> -			return ret;
> +
> +			/*
> +			 * Older versions of the hypervisor do not support the
> +			 * PARENT stats area. In this case return "success" but
> +			 * set the page to NULL. The caller should check for
> +			 * this case and instead just use the SELF area.
> +			 */
> +			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
> +			    hv_status == HV_STATUS_INVALID_PARAMETER) {
> +				*addr = NULL;
> +				return 0;
> +			}
> +
> +			hv_status_debug(status, "\n");
> +			return hv_result_to_errno(status);
>  		}
>  
>  		ret = hv_call_deposit_pages(NUMA_NO_NODE,
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index bc15d6f6922f..f59a4ab47685 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
>  	if (err)
>  		goto unmap_self;
>  
> +	if (!stats_pages[HV_STATS_AREA_PARENT])
> +		stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
> +
>  	return 0;
>  
>  unmap_self:
> -- 
> 2.34.1
> 


* Re: [PATCH v2 2/3] mshv: Add definitions for stats pages
  2025-12-05 18:58 ` [PATCH v2 2/3] mshv: Add definitions for stats pages Nuno Das Neves
@ 2025-12-05 22:51   ` Stanislav Kinsburskii
  2025-12-08 15:13   ` Michael Kelley
  1 sibling, 0 replies; 18+ messages in thread
From: Stanislav Kinsburskii @ 2025-12-05 22:51 UTC (permalink / raw)
  To: Nuno Das Neves
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, wei.liu, decui, longli,
	mhklinux, prapal, mrathor, paekkaladevi

On Fri, Dec 05, 2025 at 10:58:41AM -0800, Nuno Das Neves wrote:
> Add the definitions for hypervisor, logical processor, and partition
> stats pages.
> 
> Move the definition for the VP stats page to its rightful place in
> hvhdk.h, and add the missing members.
> 
> These enum members retain their CamelCase style, since they are imported
> directly from the hypervisor code. They will be stringified when printing
> the stats out, and retain more readability in this form.
> 

Acked-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>

> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> ---
>  drivers/hv/mshv_root_main.c |  17 --
>  include/hyperv/hvhdk.h      | 437 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 437 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index f59a4ab47685..19006b788e85 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -38,23 +38,6 @@ MODULE_AUTHOR("Microsoft");
>  MODULE_LICENSE("GPL");
>  MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
>  
> -/* TODO move this to another file when debugfs code is added */
> -enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> -#if defined(CONFIG_X86)
> -	VpRootDispatchThreadBlocked			= 202,
> -#elif defined(CONFIG_ARM64)
> -	VpRootDispatchThreadBlocked			= 94,
> -#endif
> -	VpStatsMaxCounter
> -};
> -
> -struct hv_stats_page {
> -	union {
> -		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> -		u8 data[HV_HYP_PAGE_SIZE];
> -	};
> -} __packed;
> -
>  struct mshv_root mshv_root;
>  
>  enum hv_scheduler_type hv_scheduler_type;
> diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
> index 469186df7826..51abbcd0ec37 100644
> --- a/include/hyperv/hvhdk.h
> +++ b/include/hyperv/hvhdk.h
> @@ -10,6 +10,443 @@
>  #include "hvhdk_mini.h"
>  #include "hvgdk.h"
>  
> +enum hv_stats_hypervisor_counters {		/* HV_HYPERVISOR_COUNTER */
> +	HvLogicalProcessors			= 1,
> +	HvPartitions				= 2,
> +	HvTotalPages				= 3,
> +	HvVirtualProcessors			= 4,
> +	HvMonitoredNotifications		= 5,
> +	HvModernStandbyEntries			= 6,
> +	HvPlatformIdleTransitions		= 7,
> +	HvHypervisorStartupCost			= 8,
> +	HvIOSpacePages				= 10,
> +	HvNonEssentialPagesForDump		= 11,
> +	HvSubsumedPages				= 12,
> +	HvStatsMaxCounter
> +};
> +
> +enum hv_stats_partition_counters {		/* HV_PROCESS_COUNTER */
> +	PartitionVirtualProcessors		= 1,
> +	PartitionTlbSize			= 3,
> +	PartitionAddressSpaces			= 4,
> +	PartitionDepositedPages			= 5,
> +	PartitionGpaPages			= 6,
> +	PartitionGpaSpaceModifications		= 7,
> +	PartitionVirtualTlbFlushEntires		= 8,
> +	PartitionRecommendedTlbSize		= 9,
> +	PartitionGpaPages4K			= 10,
> +	PartitionGpaPages2M			= 11,
> +	PartitionGpaPages1G			= 12,
> +	PartitionGpaPages512G			= 13,
> +	PartitionDevicePages4K			= 14,
> +	PartitionDevicePages2M			= 15,
> +	PartitionDevicePages1G			= 16,
> +	PartitionDevicePages512G		= 17,
> +	PartitionAttachedDevices		= 18,
> +	PartitionDeviceInterruptMappings	= 19,
> +	PartitionIoTlbFlushes			= 20,
> +	PartitionIoTlbFlushCost			= 21,
> +	PartitionDeviceInterruptErrors		= 22,
> +	PartitionDeviceDmaErrors		= 23,
> +	PartitionDeviceInterruptThrottleEvents	= 24,
> +	PartitionSkippedTimerTicks		= 25,
> +	PartitionPartitionId			= 26,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	PartitionNestedTlbSize			= 27,
> +	PartitionRecommendedNestedTlbSize	= 28,
> +	PartitionNestedTlbFreeListSize		= 29,
> +	PartitionNestedTlbTrimmedPages		= 30,
> +	PartitionPagesShattered			= 31,
> +	PartitionPagesRecombined		= 32,
> +	PartitionHwpRequestValue		= 33,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	PartitionHwpRequestValue		= 27,
> +#endif
> +	PartitionStatsMaxCounter
> +};
> +
> +enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> +	VpTotalRunTime					= 1,
> +	VpHypervisorRunTime				= 2,
> +	VpRemoteNodeRunTime				= 3,
> +	VpNormalizedRunTime				= 4,
> +	VpIdealCpu					= 5,
> +	VpHypercallsCount				= 7,
> +	VpHypercallsTime				= 8,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	VpPageInvalidationsCount			= 9,
> +	VpPageInvalidationsTime				= 10,
> +	VpControlRegisterAccessesCount			= 11,
> +	VpControlRegisterAccessesTime			= 12,
> +	VpIoInstructionsCount				= 13,
> +	VpIoInstructionsTime				= 14,
> +	VpHltInstructionsCount				= 15,
> +	VpHltInstructionsTime				= 16,
> +	VpMwaitInstructionsCount			= 17,
> +	VpMwaitInstructionsTime				= 18,
> +	VpCpuidInstructionsCount			= 19,
> +	VpCpuidInstructionsTime				= 20,
> +	VpMsrAccessesCount				= 21,
> +	VpMsrAccessesTime				= 22,
> +	VpOtherInterceptsCount				= 23,
> +	VpOtherInterceptsTime				= 24,
> +	VpExternalInterruptsCount			= 25,
> +	VpExternalInterruptsTime			= 26,
> +	VpPendingInterruptsCount			= 27,
> +	VpPendingInterruptsTime				= 28,
> +	VpEmulatedInstructionsCount			= 29,
> +	VpEmulatedInstructionsTime			= 30,
> +	VpDebugRegisterAccessesCount			= 31,
> +	VpDebugRegisterAccessesTime			= 32,
> +	VpPageFaultInterceptsCount			= 33,
> +	VpPageFaultInterceptsTime			= 34,
> +	VpGuestPageTableMaps				= 35,
> +	VpLargePageTlbFills				= 36,
> +	VpSmallPageTlbFills				= 37,
> +	VpReflectedGuestPageFaults			= 38,
> +	VpApicMmioAccesses				= 39,
> +	VpIoInterceptMessages				= 40,
> +	VpMemoryInterceptMessages			= 41,
> +	VpApicEoiAccesses				= 42,
> +	VpOtherMessages					= 43,
> +	VpPageTableAllocations				= 44,
> +	VpLogicalProcessorMigrations			= 45,
> +	VpAddressSpaceEvictions				= 46,
> +	VpAddressSpaceSwitches				= 47,
> +	VpAddressDomainFlushes				= 48,
> +	VpAddressSpaceFlushes				= 49,
> +	VpGlobalGvaRangeFlushes				= 50,
> +	VpLocalGvaRangeFlushes				= 51,
> +	VpPageTableEvictions				= 52,
> +	VpPageTableReclamations				= 53,
> +	VpPageTableResets				= 54,
> +	VpPageTableValidations				= 55,
> +	VpApicTprAccesses				= 56,
> +	VpPageTableWriteIntercepts			= 57,
> +	VpSyntheticInterrupts				= 58,
> +	VpVirtualInterrupts				= 59,
> +	VpApicIpisSent					= 60,
> +	VpApicSelfIpisSent				= 61,
> +	VpGpaSpaceHypercalls				= 62,
> +	VpLogicalProcessorHypercalls			= 63,
> +	VpLongSpinWaitHypercalls			= 64,
> +	VpOtherHypercalls				= 65,
> +	VpSyntheticInterruptHypercalls			= 66,
> +	VpVirtualInterruptHypercalls			= 67,
> +	VpVirtualMmuHypercalls				= 68,
> +	VpVirtualProcessorHypercalls			= 69,
> +	VpHardwareInterrupts				= 70,
> +	VpNestedPageFaultInterceptsCount		= 71,
> +	VpNestedPageFaultInterceptsTime			= 72,
> +	VpPageScans					= 73,
> +	VpLogicalProcessorDispatches			= 74,
> +	VpWaitingForCpuTime				= 75,
> +	VpExtendedHypercalls				= 76,
> +	VpExtendedHypercallInterceptMessages		= 77,
> +	VpMbecNestedPageTableSwitches			= 78,
> +	VpOtherReflectedGuestExceptions			= 79,
> +	VpGlobalIoTlbFlushes				= 80,
> +	VpGlobalIoTlbFlushCost				= 81,
> +	VpLocalIoTlbFlushes				= 82,
> +	VpLocalIoTlbFlushCost				= 83,
> +	VpHypercallsForwardedCount			= 84,
> +	VpHypercallsForwardingTime			= 85,
> +	VpPageInvalidationsForwardedCount		= 86,
> +	VpPageInvalidationsForwardingTime		= 87,
> +	VpControlRegisterAccessesForwardedCount		= 88,
> +	VpControlRegisterAccessesForwardingTime		= 89,
> +	VpIoInstructionsForwardedCount			= 90,
> +	VpIoInstructionsForwardingTime			= 91,
> +	VpHltInstructionsForwardedCount			= 92,
> +	VpHltInstructionsForwardingTime			= 93,
> +	VpMwaitInstructionsForwardedCount		= 94,
> +	VpMwaitInstructionsForwardingTime		= 95,
> +	VpCpuidInstructionsForwardedCount		= 96,
> +	VpCpuidInstructionsForwardingTime		= 97,
> +	VpMsrAccessesForwardedCount			= 98,
> +	VpMsrAccessesForwardingTime			= 99,
> +	VpOtherInterceptsForwardedCount			= 100,
> +	VpOtherInterceptsForwardingTime			= 101,
> +	VpExternalInterruptsForwardedCount		= 102,
> +	VpExternalInterruptsForwardingTime		= 103,
> +	VpPendingInterruptsForwardedCount		= 104,
> +	VpPendingInterruptsForwardingTime		= 105,
> +	VpEmulatedInstructionsForwardedCount		= 106,
> +	VpEmulatedInstructionsForwardingTime		= 107,
> +	VpDebugRegisterAccessesForwardedCount		= 108,
> +	VpDebugRegisterAccessesForwardingTime		= 109,
> +	VpPageFaultInterceptsForwardedCount		= 110,
> +	VpPageFaultInterceptsForwardingTime		= 111,
> +	VpVmclearEmulationCount				= 112,
> +	VpVmclearEmulationTime				= 113,
> +	VpVmptrldEmulationCount				= 114,
> +	VpVmptrldEmulationTime				= 115,
> +	VpVmptrstEmulationCount				= 116,
> +	VpVmptrstEmulationTime				= 117,
> +	VpVmreadEmulationCount				= 118,
> +	VpVmreadEmulationTime				= 119,
> +	VpVmwriteEmulationCount				= 120,
> +	VpVmwriteEmulationTime				= 121,
> +	VpVmxoffEmulationCount				= 122,
> +	VpVmxoffEmulationTime				= 123,
> +	VpVmxonEmulationCount				= 124,
> +	VpVmxonEmulationTime				= 125,
> +	VpNestedVMEntriesCount				= 126,
> +	VpNestedVMEntriesTime				= 127,
> +	VpNestedSLATSoftPageFaultsCount			= 128,
> +	VpNestedSLATSoftPageFaultsTime			= 129,
> +	VpNestedSLATHardPageFaultsCount			= 130,
> +	VpNestedSLATHardPageFaultsTime			= 131,
> +	VpInvEptAllContextEmulationCount		= 132,
> +	VpInvEptAllContextEmulationTime			= 133,
> +	VpInvEptSingleContextEmulationCount		= 134,
> +	VpInvEptSingleContextEmulationTime		= 135,
> +	VpInvVpidAllContextEmulationCount		= 136,
> +	VpInvVpidAllContextEmulationTime		= 137,
> +	VpInvVpidSingleContextEmulationCount		= 138,
> +	VpInvVpidSingleContextEmulationTime		= 139,
> +	VpInvVpidSingleAddressEmulationCount		= 140,
> +	VpInvVpidSingleAddressEmulationTime		= 141,
> +	VpNestedTlbPageTableReclamations		= 142,
> +	VpNestedTlbPageTableEvictions			= 143,
> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 144,
> +	VpFlushGuestPhysicalAddressListHypercalls	= 145,
> +	VpPostedInterruptNotifications			= 146,
> +	VpPostedInterruptScans				= 147,
> +	VpTotalCoreRunTime				= 148,
> +	VpMaximumRunTime				= 149,
> +	VpHwpRequestContextSwitches			= 150,
> +	VpWaitingForCpuTimeBucket0			= 151,
> +	VpWaitingForCpuTimeBucket1			= 152,
> +	VpWaitingForCpuTimeBucket2			= 153,
> +	VpWaitingForCpuTimeBucket3			= 154,
> +	VpWaitingForCpuTimeBucket4			= 155,
> +	VpWaitingForCpuTimeBucket5			= 156,
> +	VpWaitingForCpuTimeBucket6			= 157,
> +	VpVmloadEmulationCount				= 158,
> +	VpVmloadEmulationTime				= 159,
> +	VpVmsaveEmulationCount				= 160,
> +	VpVmsaveEmulationTime				= 161,
> +	VpGifInstructionEmulationCount			= 162,
> +	VpGifInstructionEmulationTime			= 163,
> +	VpEmulatedErrataSvmInstructions			= 164,
> +	VpPlaceholder1					= 165,
> +	VpPlaceholder2					= 166,
> +	VpPlaceholder3					= 167,
> +	VpPlaceholder4					= 168,
> +	VpPlaceholder5					= 169,
> +	VpPlaceholder6					= 170,
> +	VpPlaceholder7					= 171,
> +	VpPlaceholder8					= 172,
> +	VpPlaceholder9					= 173,
> +	VpPlaceholder10					= 174,
> +	VpSchedulingPriority				= 175,
> +	VpRdpmcInstructionsCount			= 176,
> +	VpRdpmcInstructionsTime				= 177,
> +	VpPerfmonPmuMsrAccessesCount			= 178,
> +	VpPerfmonLbrMsrAccessesCount			= 179,
> +	VpPerfmonIptMsrAccessesCount			= 180,
> +	VpPerfmonInterruptCount				= 181,
> +	VpVtl1DispatchCount				= 182,
> +	VpVtl2DispatchCount				= 183,
> +	VpVtl2DispatchBucket0				= 184,
> +	VpVtl2DispatchBucket1				= 185,
> +	VpVtl2DispatchBucket2				= 186,
> +	VpVtl2DispatchBucket3				= 187,
> +	VpVtl2DispatchBucket4				= 188,
> +	VpVtl2DispatchBucket5				= 189,
> +	VpVtl2DispatchBucket6				= 190,
> +	VpVtl1RunTime					= 191,
> +	VpVtl2RunTime					= 192,
> +	VpIommuHypercalls				= 193,
> +	VpCpuGroupHypercalls				= 194,
> +	VpVsmHypercalls					= 195,
> +	VpEventLogHypercalls				= 196,
> +	VpDeviceDomainHypercalls			= 197,
> +	VpDepositHypercalls				= 198,
> +	VpSvmHypercalls					= 199,
> +	VpBusLockAcquisitionCount			= 200,
> +	VpUnused					= 201,
> +	VpRootDispatchThreadBlocked			= 202,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	VpSysRegAccessesCount				= 9,
> +	VpSysRegAccessesTime				= 10,
> +	VpSmcInstructionsCount				= 11,
> +	VpSmcInstructionsTime				= 12,
> +	VpOtherInterceptsCount				= 13,
> +	VpOtherInterceptsTime				= 14,
> +	VpExternalInterruptsCount			= 15,
> +	VpExternalInterruptsTime			= 16,
> +	VpPendingInterruptsCount			= 17,
> +	VpPendingInterruptsTime				= 18,
> +	VpGuestPageTableMaps				= 19,
> +	VpLargePageTlbFills				= 20,
> +	VpSmallPageTlbFills				= 21,
> +	VpReflectedGuestPageFaults			= 22,
> +	VpMemoryInterceptMessages			= 23,
> +	VpOtherMessages					= 24,
> +	VpLogicalProcessorMigrations			= 25,
> +	VpAddressDomainFlushes				= 26,
> +	VpAddressSpaceFlushes				= 27,
> +	VpSyntheticInterrupts				= 28,
> +	VpVirtualInterrupts				= 29,
> +	VpApicSelfIpisSent				= 30,
> +	VpGpaSpaceHypercalls				= 31,
> +	VpLogicalProcessorHypercalls			= 32,
> +	VpLongSpinWaitHypercalls			= 33,
> +	VpOtherHypercalls				= 34,
> +	VpSyntheticInterruptHypercalls			= 35,
> +	VpVirtualInterruptHypercalls			= 36,
> +	VpVirtualMmuHypercalls				= 37,
> +	VpVirtualProcessorHypercalls			= 38,
> +	VpHardwareInterrupts				= 39,
> +	VpNestedPageFaultInterceptsCount		= 40,
> +	VpNestedPageFaultInterceptsTime			= 41,
> +	VpLogicalProcessorDispatches			= 42,
> +	VpWaitingForCpuTime				= 43,
> +	VpExtendedHypercalls				= 44,
> +	VpExtendedHypercallInterceptMessages		= 45,
> +	VpMbecNestedPageTableSwitches			= 46,
> +	VpOtherReflectedGuestExceptions			= 47,
> +	VpGlobalIoTlbFlushes				= 48,
> +	VpGlobalIoTlbFlushCost				= 49,
> +	VpLocalIoTlbFlushes				= 50,
> +	VpLocalIoTlbFlushCost				= 51,
> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 52,
> +	VpFlushGuestPhysicalAddressListHypercalls	= 53,
> +	VpPostedInterruptNotifications			= 54,
> +	VpPostedInterruptScans				= 55,
> +	VpTotalCoreRunTime				= 56,
> +	VpMaximumRunTime				= 57,
> +	VpWaitingForCpuTimeBucket0			= 58,
> +	VpWaitingForCpuTimeBucket1			= 59,
> +	VpWaitingForCpuTimeBucket2			= 60,
> +	VpWaitingForCpuTimeBucket3			= 61,
> +	VpWaitingForCpuTimeBucket4			= 62,
> +	VpWaitingForCpuTimeBucket5			= 63,
> +	VpWaitingForCpuTimeBucket6			= 64,
> +	VpHwpRequestContextSwitches			= 65,
> +	VpPlaceholder2					= 66,
> +	VpPlaceholder3					= 67,
> +	VpPlaceholder4					= 68,
> +	VpPlaceholder5					= 69,
> +	VpPlaceholder6					= 70,
> +	VpPlaceholder7					= 71,
> +	VpPlaceholder8					= 72,
> +	VpContentionTime				= 73,
> +	VpWakeUpTime					= 74,
> +	VpSchedulingPriority				= 75,
> +	VpVtl1DispatchCount				= 76,
> +	VpVtl2DispatchCount				= 77,
> +	VpVtl2DispatchBucket0				= 78,
> +	VpVtl2DispatchBucket1				= 79,
> +	VpVtl2DispatchBucket2				= 80,
> +	VpVtl2DispatchBucket3				= 81,
> +	VpVtl2DispatchBucket4				= 82,
> +	VpVtl2DispatchBucket5				= 83,
> +	VpVtl2DispatchBucket6				= 84,
> +	VpVtl1RunTime					= 85,
> +	VpVtl2RunTime					= 86,
> +	VpIommuHypercalls				= 87,
> +	VpCpuGroupHypercalls				= 88,
> +	VpVsmHypercalls					= 89,
> +	VpEventLogHypercalls				= 90,
> +	VpDeviceDomainHypercalls			= 91,
> +	VpDepositHypercalls				= 92,
> +	VpSvmHypercalls					= 93,
> +	VpLoadAvg					= 94,
> +	VpRootDispatchThreadBlocked			= 95,
> +#endif
> +	VpStatsMaxCounter
> +};
> +
> +enum hv_stats_lp_counters {			/* HV_CPU_COUNTER */
> +	LpGlobalTime				= 1,
> +	LpTotalRunTime				= 2,
> +	LpHypervisorRunTime			= 3,
> +	LpHardwareInterrupts			= 4,
> +	LpContextSwitches			= 5,
> +	LpInterProcessorInterrupts		= 6,
> +	LpSchedulerInterrupts			= 7,
> +	LpTimerInterrupts			= 8,
> +	LpInterProcessorInterruptsSent		= 9,
> +	LpProcessorHalts			= 10,
> +	LpMonitorTransitionCost			= 11,
> +	LpContextSwitchTime			= 12,
> +	LpC1TransitionsCount			= 13,
> +	LpC1RunTime				= 14,
> +	LpC2TransitionsCount			= 15,
> +	LpC2RunTime				= 16,
> +	LpC3TransitionsCount			= 17,
> +	LpC3RunTime				= 18,
> +	LpRootVpIndex				= 19,
> +	LpIdleSequenceNumber			= 20,
> +	LpGlobalTscCount			= 21,
> +	LpActiveTscCount			= 22,
> +	LpIdleAccumulation			= 23,
> +	LpReferenceCycleCount0			= 24,
> +	LpActualCycleCount0			= 25,
> +	LpReferenceCycleCount1			= 26,
> +	LpActualCycleCount1			= 27,
> +	LpProximityDomainId			= 28,
> +	LpPostedInterruptNotifications		= 29,
> +	LpBranchPredictorFlushes		= 30,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	LpL1DataCacheFlushes			= 31,
> +	LpImmediateL1DataCacheFlushes		= 32,
> +	LpMbFlushes				= 33,
> +	LpCounterRefreshSequenceNumber		= 34,
> +	LpCounterRefreshReferenceTime		= 35,
> +	LpIdleAccumulationSnapshot		= 36,
> +	LpActiveTscCountSnapshot		= 37,
> +	LpHwpRequestContextSwitches		= 38,
> +	LpPlaceholder1				= 39,
> +	LpPlaceholder2				= 40,
> +	LpPlaceholder3				= 41,
> +	LpPlaceholder4				= 42,
> +	LpPlaceholder5				= 43,
> +	LpPlaceholder6				= 44,
> +	LpPlaceholder7				= 45,
> +	LpPlaceholder8				= 46,
> +	LpPlaceholder9				= 47,
> +	LpPlaceholder10				= 48,
> +	LpReserveGroupId			= 49,
> +	LpRunningPriority			= 50,
> +	LpPerfmonInterruptCount			= 51,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	LpCounterRefreshSequenceNumber		= 31,
> +	LpCounterRefreshReferenceTime		= 32,
> +	LpIdleAccumulationSnapshot		= 33,
> +	LpActiveTscCountSnapshot		= 34,
> +	LpHwpRequestContextSwitches		= 35,
> +	LpPlaceholder2				= 36,
> +	LpPlaceholder3				= 37,
> +	LpPlaceholder4				= 38,
> +	LpPlaceholder5				= 39,
> +	LpPlaceholder6				= 40,
> +	LpPlaceholder7				= 41,
> +	LpPlaceholder8				= 42,
> +	LpPlaceholder9				= 43,
> +	LpSchLocalRunListSize			= 44,
> +	LpReserveGroupId			= 45,
> +	LpRunningPriority			= 46,
> +#endif
> +	LpStatsMaxCounter
> +};
> +
> +/*
> + * Hypervisor statistics page format
> + */
> +struct hv_stats_page {
> +	union {
> +		u64 hv_cntrs[HvStatsMaxCounter];		/* Hypervisor counters */
> +		u64 pt_cntrs[PartitionStatsMaxCounter];		/* Partition counters */
> +		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> +		u64 lp_cntrs[LpStatsMaxCounter];		/* LP counters */
> +		u8 data[HV_HYP_PAGE_SIZE];
> +	};
> +} __packed;
> +
>  /* Bits for dirty mask of hv_vp_register_page */
>  #define HV_X64_REGISTER_CLASS_GENERAL	0
>  #define HV_X64_REGISTER_CLASS_IP	1
> -- 
> 2.34.1


* Re: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-05 18:58 ` [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics Nuno Das Neves
@ 2025-12-05 23:06   ` Stanislav Kinsburskii
  2025-12-08  3:04   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Stanislav Kinsburskii @ 2025-12-05 23:06 UTC (permalink / raw)
  To: Nuno Das Neves
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, wei.liu, decui, longli,
	mhklinux, prapal, mrathor, paekkaladevi, Jinank Jain

On Fri, Dec 05, 2025 at 10:58:42AM -0800, Nuno Das Neves wrote:
> Introduce a debugfs interface to expose root and child partition stats
> when running with mshv_root.
> 
> Create a debugfs directory "mshv" containing 'stats' files organized by
> type and id. A stats file contains a number of counters depending on
> its type. e.g. an excerpt from a VP stats file:
> 
> TotalRunTime                  : 1997602722
> HypervisorRunTime             : 649671371
> RemoteNodeRunTime             : 0
> NormalizedRunTime             : 1997602721
> IdealCpu                      : 0
> HypercallsCount               : 1708169
> HypercallsTime                : 111914774
> PageInvalidationsCount        : 0
> PageInvalidationsTime         : 0
> 
> On a root partition with some active child partitions, the entire
> directory structure may look like:
> 
> mshv/
>   stats             # hypervisor stats
>   lp/               # logical processors
>     0/              # LP id
>       stats         # LP 0 stats
>     1/
>     2/
>     3/
>   partition/        # partition stats
>     1/              # root partition id
>       stats         # root partition stats
>       vp/           # root virtual processors
>         0/          # root VP id
>           stats     # root VP 0 stats
>         1/
>         2/
>         3/
>     42/             # child partition id
>       stats         # child partition stats
>       vp/           # child VPs
>         0/          # child VP id
>           stats     # child VP 0 stats
>         1/
>     43/
>     55/
> 
> On L1VH, some stats are not present as it does not own the hardware
> like the root partition does:
> - The hypervisor and lp stats are not present
> - L1VH's partition directory is named "self" because it can't get its
>   own id
> - Some of L1VH's partition and VP stats fields are not populated, because
>   it can't map its own HV_STATS_AREA_PARENT page.
> 

<snip>

> +static void __init *lp_debugfs_stats_create(u32 lp_index, struct dentry *parent)

It would be better to return struct hv_stats_page from this (and other
functions).

> +{
> +	struct dentry *dentry;
> +	void *stats;
> +
> +	stats = mshv_lp_stats_map(lp_index);
> +	if (IS_ERR(stats))
> +		return stats;
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     stats, &lp_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		mshv_lp_stats_unmap(lp_index, stats);
> +		return dentry;

This is sloppy: it returns a struct dentry instead of a struct
hv_stats_page, and using void here simply sweeps the problem under the
carpet, where the compiler would otherwise catch it.
How about using ERR_CAST instead as it will make this behavior explicit?

> +	}
> +	return stats;
> +}
> +
> +static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
> +{
> +	struct dentry *idx;
> +	char lp_idx_str[U32_BUF_SZ];
> +	void *stats;
> +	int err;
> +
> +	sprintf(lp_idx_str, "%u", lp_index);
> +
> +	idx = debugfs_create_dir(lp_idx_str, parent);
> +	if (IS_ERR(idx))
> +		return PTR_ERR(idx);
> +
> +	stats = lp_debugfs_stats_create(lp_index, idx);
> +	if (IS_ERR(stats)) {
> +		err = PTR_ERR(stats);
> +		goto remove_debugfs_lp_idx;
> +	}
> +
> +	return 0;
> +
> +remove_debugfs_lp_idx:
> +	debugfs_remove_recursive(idx);
> +	return err;
> +}
> +
> +static void mshv_debugfs_lp_remove(void)
> +{
> +	int lp_index;
> +
> +	debugfs_remove_recursive(mshv_debugfs_lp);
> +
> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
> +		mshv_lp_stats_unmap(lp_index, NULL);
> +}
> +
> +static int __init mshv_debugfs_lp_create(struct dentry *parent)
> +{
> +	struct dentry *lp_dir;
> +	int err, lp_index;
> +
> +	lp_dir = debugfs_create_dir("lp", parent);
> +	if (IS_ERR(lp_dir))
> +		return PTR_ERR(lp_dir);
> +
> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
> +		err = lp_debugfs_create(lp_index, lp_dir);
> +		if (err)
> +			goto remove_debugfs_lps;
> +	}
> +
> +	mshv_debugfs_lp = lp_dir;
> +
> +	return 0;
> +
> +remove_debugfs_lps:
> +	for (lp_index -= 1; lp_index >= 0; lp_index--)
> +		mshv_lp_stats_unmap(lp_index, NULL);
> +	debugfs_remove_recursive(lp_dir);
> +	return err;
> +}
> +
> +static int vp_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page **pstats = m->private;
> +
> +#define VP_SEQ_PRINTF(cnt)				 \
> +do {								 \
> +	if (pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]) \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
> +	else \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
> +} while (0)
> +
> +	VP_SEQ_PRINTF(TotalRunTime);
> +	VP_SEQ_PRINTF(HypervisorRunTime);
> +	VP_SEQ_PRINTF(RemoteNodeRunTime);
> +	VP_SEQ_PRINTF(NormalizedRunTime);
> +	VP_SEQ_PRINTF(IdealCpu);
> +	VP_SEQ_PRINTF(HypercallsCount);
> +	VP_SEQ_PRINTF(HypercallsTime);
> +#if IS_ENABLED(CONFIG_X86_64)
> +	VP_SEQ_PRINTF(PageInvalidationsCount);
> +	VP_SEQ_PRINTF(PageInvalidationsTime);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesCount);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesTime);
> +	VP_SEQ_PRINTF(IoInstructionsCount);
> +	VP_SEQ_PRINTF(IoInstructionsTime);
> +	VP_SEQ_PRINTF(HltInstructionsCount);
> +	VP_SEQ_PRINTF(HltInstructionsTime);
> +	VP_SEQ_PRINTF(MwaitInstructionsCount);
> +	VP_SEQ_PRINTF(MwaitInstructionsTime);
> +	VP_SEQ_PRINTF(CpuidInstructionsCount);
> +	VP_SEQ_PRINTF(CpuidInstructionsTime);
> +	VP_SEQ_PRINTF(MsrAccessesCount);
> +	VP_SEQ_PRINTF(MsrAccessesTime);
> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> +	VP_SEQ_PRINTF(EmulatedInstructionsCount);
> +	VP_SEQ_PRINTF(EmulatedInstructionsTime);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesCount);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesTime);
> +	VP_SEQ_PRINTF(PageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(PageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> +	VP_SEQ_PRINTF(LargePageTlbFills);
> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> +	VP_SEQ_PRINTF(ApicMmioAccesses);
> +	VP_SEQ_PRINTF(IoInterceptMessages);
> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> +	VP_SEQ_PRINTF(ApicEoiAccesses);
> +	VP_SEQ_PRINTF(OtherMessages);
> +	VP_SEQ_PRINTF(PageTableAllocations);
> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> +	VP_SEQ_PRINTF(AddressSpaceEvictions);
> +	VP_SEQ_PRINTF(AddressSpaceSwitches);
> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> +	VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
> +	VP_SEQ_PRINTF(LocalGvaRangeFlushes);
> +	VP_SEQ_PRINTF(PageTableEvictions);
> +	VP_SEQ_PRINTF(PageTableReclamations);
> +	VP_SEQ_PRINTF(PageTableResets);
> +	VP_SEQ_PRINTF(PageTableValidations);
> +	VP_SEQ_PRINTF(ApicTprAccesses);
> +	VP_SEQ_PRINTF(PageTableWriteIntercepts);
> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> +	VP_SEQ_PRINTF(VirtualInterrupts);
> +	VP_SEQ_PRINTF(ApicIpisSent);
> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> +	VP_SEQ_PRINTF(OtherHypercalls);
> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> +	VP_SEQ_PRINTF(HardwareInterrupts);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(PageScans);
> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(HypercallsForwardedCount);
> +	VP_SEQ_PRINTF(HypercallsForwardingTime);
> +	VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
> +	VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
> +	VP_SEQ_PRINTF(IoInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(IoInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(HltInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(HltInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(MsrAccessesForwardedCount);
> +	VP_SEQ_PRINTF(MsrAccessesForwardingTime);
> +	VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
> +	VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
> +	VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
> +	VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
> +	VP_SEQ_PRINTF(VmclearEmulationCount);
> +	VP_SEQ_PRINTF(VmclearEmulationTime);
> +	VP_SEQ_PRINTF(VmptrldEmulationCount);
> +	VP_SEQ_PRINTF(VmptrldEmulationTime);
> +	VP_SEQ_PRINTF(VmptrstEmulationCount);
> +	VP_SEQ_PRINTF(VmptrstEmulationTime);
> +	VP_SEQ_PRINTF(VmreadEmulationCount);
> +	VP_SEQ_PRINTF(VmreadEmulationTime);
> +	VP_SEQ_PRINTF(VmwriteEmulationCount);
> +	VP_SEQ_PRINTF(VmwriteEmulationTime);
> +	VP_SEQ_PRINTF(VmxoffEmulationCount);
> +	VP_SEQ_PRINTF(VmxoffEmulationTime);
> +	VP_SEQ_PRINTF(VmxonEmulationCount);
> +	VP_SEQ_PRINTF(VmxonEmulationTime);
> +	VP_SEQ_PRINTF(NestedVMEntriesCount);
> +	VP_SEQ_PRINTF(NestedVMEntriesTime);
> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
> +	VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
> +	VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
> +	VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
> +	VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> +	VP_SEQ_PRINTF(PostedInterruptScans);
> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> +	VP_SEQ_PRINTF(MaximumRunTime);
> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> +	VP_SEQ_PRINTF(VmloadEmulationCount);
> +	VP_SEQ_PRINTF(VmloadEmulationTime);
> +	VP_SEQ_PRINTF(VmsaveEmulationCount);
> +	VP_SEQ_PRINTF(VmsaveEmulationTime);
> +	VP_SEQ_PRINTF(GifInstructionEmulationCount);
> +	VP_SEQ_PRINTF(GifInstructionEmulationTime);
> +	VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
> +	VP_SEQ_PRINTF(Placeholder1);
> +	VP_SEQ_PRINTF(Placeholder2);
> +	VP_SEQ_PRINTF(Placeholder3);
> +	VP_SEQ_PRINTF(Placeholder4);
> +	VP_SEQ_PRINTF(Placeholder5);
> +	VP_SEQ_PRINTF(Placeholder6);
> +	VP_SEQ_PRINTF(Placeholder7);
> +	VP_SEQ_PRINTF(Placeholder8);
> +	VP_SEQ_PRINTF(Placeholder9);
> +	VP_SEQ_PRINTF(Placeholder10);
> +	VP_SEQ_PRINTF(SchedulingPriority);
> +	VP_SEQ_PRINTF(RdpmcInstructionsCount);
> +	VP_SEQ_PRINTF(RdpmcInstructionsTime);
> +	VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonInterruptCount);
> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> +	VP_SEQ_PRINTF(Vtl1RunTime);
> +	VP_SEQ_PRINTF(Vtl2RunTime);
> +	VP_SEQ_PRINTF(IommuHypercalls);
> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> +	VP_SEQ_PRINTF(VsmHypercalls);
> +	VP_SEQ_PRINTF(EventLogHypercalls);
> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> +	VP_SEQ_PRINTF(DepositHypercalls);
> +	VP_SEQ_PRINTF(SvmHypercalls);
> +	VP_SEQ_PRINTF(BusLockAcquisitionCount);
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	VP_SEQ_PRINTF(SysRegAccessesCount);
> +	VP_SEQ_PRINTF(SysRegAccessesTime);
> +	VP_SEQ_PRINTF(SmcInstructionsCount);
> +	VP_SEQ_PRINTF(SmcInstructionsTime);
> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> +	VP_SEQ_PRINTF(LargePageTlbFills);
> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> +	VP_SEQ_PRINTF(OtherMessages);
> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> +	VP_SEQ_PRINTF(VirtualInterrupts);
> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> +	VP_SEQ_PRINTF(OtherHypercalls);
> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> +	VP_SEQ_PRINTF(HardwareInterrupts);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> +	VP_SEQ_PRINTF(PostedInterruptScans);
> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> +	VP_SEQ_PRINTF(MaximumRunTime);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	VP_SEQ_PRINTF(Placeholder2);
> +	VP_SEQ_PRINTF(Placeholder3);
> +	VP_SEQ_PRINTF(Placeholder4);
> +	VP_SEQ_PRINTF(Placeholder5);
> +	VP_SEQ_PRINTF(Placeholder6);
> +	VP_SEQ_PRINTF(Placeholder7);
> +	VP_SEQ_PRINTF(Placeholder8);
> +	VP_SEQ_PRINTF(ContentionTime);
> +	VP_SEQ_PRINTF(WakeUpTime);
> +	VP_SEQ_PRINTF(SchedulingPriority);
> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> +	VP_SEQ_PRINTF(Vtl1RunTime);
> +	VP_SEQ_PRINTF(Vtl2RunTime);
> +	VP_SEQ_PRINTF(IommuHypercalls);
> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> +	VP_SEQ_PRINTF(VsmHypercalls);
> +	VP_SEQ_PRINTF(EventLogHypercalls);
> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> +	VP_SEQ_PRINTF(DepositHypercalls);
> +	VP_SEQ_PRINTF(SvmHypercalls);
> +#endif
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(vp_stats);
> +
> +static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index, void *stats_page_addr,
> +				enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.vp.partition_id = partition_id,
> +		.vp.vp_index = vp_index,
> +		.vp.stats_area_type = stats_area_type,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_VP, stats_page_addr, &identity);
> +	if (err)
> +		pr_err("%s: failed to unmap partition %llu vp %u %s stats, err: %d\n",
> +		       __func__, partition_id, vp_index,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +}
> +
> +static void *mshv_vp_stats_map(u64 partition_id, u32 vp_index,
> +			       enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.vp.partition_id = partition_id,
> +		.vp.vp_index = vp_index,
> +		.vp.stats_area_type = stats_area_type,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map partition %llu vp %u %s stats, err: %d\n",
> +		       __func__, partition_id, vp_index,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}
> +
> +static int vp_debugfs_stats_create(u64 partition_id, u32 vp_index,
> +				   struct dentry **vp_stats_ptr,
> +				   struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	struct hv_stats_page **pstats;
> +	int err;
> +
> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> +	if (!pstats)
> +		return -ENOMEM;
> +
> +	pstats[HV_STATS_AREA_SELF] = mshv_vp_stats_map(partition_id, vp_index,
> +						       HV_STATS_AREA_SELF);
> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * L1VH partition cannot access its vp stats in parent area.
> +	 */
> +	if (is_l1vh_parent(partition_id)) {
> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	} else {
> +		pstats[HV_STATS_AREA_PARENT] = mshv_vp_stats_map(
> +			partition_id, vp_index, HV_STATS_AREA_PARENT);
> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
> +			goto unmap_self;
> +		}
> +		if (!pstats[HV_STATS_AREA_PARENT])
> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	}
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     pstats, &vp_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		goto unmap_vp_stats;
> +	}
> +
> +	*vp_stats_ptr = dentry;
> +	return 0;
> +
> +unmap_vp_stats:
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
> +		mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_PARENT],
> +				    HV_STATS_AREA_PARENT);
> +unmap_self:
> +	mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_SELF],
> +			    HV_STATS_AREA_SELF);
> +cleanup:
> +	kfree(pstats);
> +	return err;
> +}
> +
> +static void vp_debugfs_remove(u64 partition_id, u32 vp_index,
> +			      struct dentry *vp_stats)
> +{
> +	struct hv_stats_page **pstats = NULL;
> +	void *stats;
> +
> +	pstats = vp_stats->d_inode->i_private;
> +	debugfs_remove_recursive(vp_stats->d_parent);
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
> +		stats = pstats[HV_STATS_AREA_PARENT];
> +		mshv_vp_stats_unmap(partition_id, vp_index, stats,
> +				    HV_STATS_AREA_PARENT);
> +	}
> +
> +	stats = pstats[HV_STATS_AREA_SELF];
> +	mshv_vp_stats_unmap(partition_id, vp_index, stats, HV_STATS_AREA_SELF);
> +
> +	kfree(pstats);
> +}
> +
> +static int vp_debugfs_create(u64 partition_id, u32 vp_index,
> +			     struct dentry **vp_stats_ptr,
> +			     struct dentry *parent)
> +{
> +	struct dentry *vp_idx_dir;
> +	char vp_idx_str[U32_BUF_SZ];
> +	int err;
> +
> +	sprintf(vp_idx_str, "%u", vp_index);
> +
> +	vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
> +	if (IS_ERR(vp_idx_dir))
> +		return PTR_ERR(vp_idx_dir);
> +
> +	err = vp_debugfs_stats_create(partition_id, vp_index, vp_stats_ptr,
> +				      vp_idx_dir);
> +	if (err)
> +		goto remove_debugfs_vp_idx;
> +
> +	return 0;
> +
> +remove_debugfs_vp_idx:
> +	debugfs_remove_recursive(vp_idx_dir);
> +	return err;
> +}
> +
> +static int partition_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page **pstats = m->private;
> +
> +#define PARTITION_SEQ_PRINTF(cnt)				 \
> +do {								 \
> +	if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
> +	else \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
> +} while (0)
> +
> +	PARTITION_SEQ_PRINTF(VirtualProcessors);
> +	PARTITION_SEQ_PRINTF(TlbSize);
> +	PARTITION_SEQ_PRINTF(AddressSpaces);
> +	PARTITION_SEQ_PRINTF(DepositedPages);
> +	PARTITION_SEQ_PRINTF(GpaPages);
> +	PARTITION_SEQ_PRINTF(GpaSpaceModifications);
> +	PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
> +	PARTITION_SEQ_PRINTF(RecommendedTlbSize);
> +	PARTITION_SEQ_PRINTF(GpaPages4K);
> +	PARTITION_SEQ_PRINTF(GpaPages2M);
> +	PARTITION_SEQ_PRINTF(GpaPages1G);
> +	PARTITION_SEQ_PRINTF(GpaPages512G);
> +	PARTITION_SEQ_PRINTF(DevicePages4K);
> +	PARTITION_SEQ_PRINTF(DevicePages2M);
> +	PARTITION_SEQ_PRINTF(DevicePages1G);
> +	PARTITION_SEQ_PRINTF(DevicePages512G);
> +	PARTITION_SEQ_PRINTF(AttachedDevices);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
> +	PARTITION_SEQ_PRINTF(IoTlbFlushes);
> +	PARTITION_SEQ_PRINTF(IoTlbFlushCost);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
> +	PARTITION_SEQ_PRINTF(DeviceDmaErrors);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
> +	PARTITION_SEQ_PRINTF(SkippedTimerTicks);
> +	PARTITION_SEQ_PRINTF(PartitionId);
> +#if IS_ENABLED(CONFIG_X86_64)
> +	PARTITION_SEQ_PRINTF(NestedTlbSize);
> +	PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
> +	PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
> +	PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
> +	PARTITION_SEQ_PRINTF(PagesShattered);
> +	PARTITION_SEQ_PRINTF(PagesRecombined);
> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> +#endif
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(partition_stats);
> +
> +static void mshv_partition_stats_unmap(u64 partition_id, void *stats_page_addr,
> +				       enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.partition.partition_id = partition_id,
> +		.partition.stats_area_type = stats_area_type,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
> +				  &identity);
> +	if (err) {

nit: redundant curly braces around a single-statement if body

> +		pr_err("%s: failed to unmap partition %lld %s stats, err: %d\n",
> +		       __func__, partition_id,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +	}
> +}
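i.e., something like this (untested), with the braces dropped since the if body is a single statement:

```c
	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
				  &identity);
	if (err)
		pr_err("%s: failed to unmap partition %lld %s stats, err: %d\n",
		       __func__, partition_id,
		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
		       err);
```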
> +
> +static void *mshv_partition_stats_map(u64 partition_id,
> +				      enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.partition.partition_id = partition_id,
> +		.partition.stats_area_type = stats_area_type,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map partition %lld %s stats, err: %d\n",
> +		       __func__, partition_id,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}
> +
> +static int mshv_debugfs_partition_stats_create(u64 partition_id,
> +					    struct dentry **partition_stats_ptr,
> +					    struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	struct hv_stats_page **pstats;
> +	int err;
> +
> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> +	if (!pstats)
> +		return -ENOMEM;
> +
> +	pstats[HV_STATS_AREA_SELF] = mshv_partition_stats_map(partition_id,
> +							      HV_STATS_AREA_SELF);
> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * L1VH partition cannot access its partition stats in parent area.
> +	 */
> +	if (is_l1vh_parent(partition_id)) {
> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	} else {
> +		pstats[HV_STATS_AREA_PARENT] = mshv_partition_stats_map(partition_id,
> +									HV_STATS_AREA_PARENT);
> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
> +			goto unmap_self;
> +		}
> +		if (!pstats[HV_STATS_AREA_PARENT])
> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	}
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     pstats, &partition_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		goto unmap_partition_stats;
> +	}
> +
> +	*partition_stats_ptr = dentry;
> +	return 0;
> +
> +unmap_partition_stats:
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
> +		mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_PARENT],
> +					   HV_STATS_AREA_PARENT);
> +unmap_self:
> +	mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_SELF],
> +				   HV_STATS_AREA_SELF);
> +cleanup:
> +	kfree(pstats);
> +	return err;
> +}
> +
> +static void partition_debugfs_remove(u64 partition_id, struct dentry *dentry)
> +{
> +	struct hv_stats_page **pstats = NULL;
> +	void *stats;

nit: the stats local variable looks redundant; the pstats[] entries can be passed directly

> +
> +	pstats = dentry->d_inode->i_private;
> +
> +	debugfs_remove_recursive(dentry->d_parent);
> +
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
> +		stats = pstats[HV_STATS_AREA_PARENT];
> +		mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_PARENT);
> +	}
> +
> +	stats = pstats[HV_STATS_AREA_SELF];
> +	mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_SELF);
> +
> +	kfree(pstats);
> +}
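For example (untested), without the intermediate stats variable:

```c
static void partition_debugfs_remove(u64 partition_id, struct dentry *dentry)
{
	struct hv_stats_page **pstats = dentry->d_inode->i_private;

	debugfs_remove_recursive(dentry->d_parent);

	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
		mshv_partition_stats_unmap(partition_id,
					   pstats[HV_STATS_AREA_PARENT],
					   HV_STATS_AREA_PARENT);

	mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_SELF],
				   HV_STATS_AREA_SELF);

	kfree(pstats);
}
```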
> +
> +static int partition_debugfs_create(u64 partition_id,
> +				    struct dentry **vp_dir_ptr,
> +				    struct dentry **partition_stats_ptr,
> +				    struct dentry *parent)
> +{
> +	char part_id_str[U64_BUF_SZ];
> +	struct dentry *part_id_dir, *vp_dir;
> +	int err;
> +
> +	if (is_l1vh_parent(partition_id))
> +		sprintf(part_id_str, "self");
> +	else
> +		sprintf(part_id_str, "%llu", partition_id);
> +
> +	part_id_dir = debugfs_create_dir(part_id_str, parent);
> +	if (IS_ERR(part_id_dir))
> +		return PTR_ERR(part_id_dir);
> +
> +	vp_dir = debugfs_create_dir("vp", part_id_dir);
> +	if (IS_ERR(vp_dir)) {
> +		err = PTR_ERR(vp_dir);
> +		goto remove_debugfs_partition_id;
> +	}
> +
> +	err = mshv_debugfs_partition_stats_create(partition_id,
> +						  partition_stats_ptr,
> +						  part_id_dir);
> +	if (err)
> +		goto remove_debugfs_partition_id;
> +
> +	*vp_dir_ptr = vp_dir;
> +
> +	return 0;
> +
> +remove_debugfs_partition_id:
> +	debugfs_remove_recursive(part_id_dir);
> +	return err;
> +}
> +
> +static void mshv_debugfs_parent_partition_remove(void)
> +{
> +	int idx;
> +
> +	for_each_online_cpu(idx)
> +		vp_debugfs_remove(hv_current_partition_id, idx, NULL);
> +
> +	partition_debugfs_remove(hv_current_partition_id, NULL);
> +}
> +
> +static int __init mshv_debugfs_parent_partition_create(void)
> +{
> +	struct dentry *partition_stats, *vp_dir;
> +	int err, idx, i;
> +
> +	mshv_debugfs_partition = debugfs_create_dir("partition",
> +						     mshv_debugfs);
> +	if (IS_ERR(mshv_debugfs_partition))
> +		return PTR_ERR(mshv_debugfs_partition);
> +
> +	err = partition_debugfs_create(hv_current_partition_id,
> +				       &vp_dir,
> +				       &partition_stats,
> +				       mshv_debugfs_partition);
> +	if (err)
> +		goto remove_debugfs_partition;
> +
> +	for_each_online_cpu(idx) {
> +		struct dentry *vp_stats;
> +
> +		err = vp_debugfs_create(hv_current_partition_id,
> +					hv_vp_index[idx],
> +					&vp_stats,
> +					vp_dir);
> +		if (err)
> +			goto remove_debugfs_partition_vp;
> +	}
> +
> +	return 0;
> +
> +remove_debugfs_partition_vp:
> +	for_each_online_cpu(i) {
> +		if (i >= idx)
> +			break;
> +		vp_debugfs_remove(hv_current_partition_id, i, NULL);
> +	}
> +	partition_debugfs_remove(hv_current_partition_id, NULL);
> +remove_debugfs_partition:
> +	debugfs_remove_recursive(mshv_debugfs_partition);
> +	return err;
> +}
> +
> +static int hv_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page *stats = m->private;
> +
> +#define HV_SEQ_PRINTF(cnt)		\
> +	seq_printf(m, "%-25s: %llu\n", __stringify(cnt), stats->hv_cntrs[Hv##cnt])
> +
> +	HV_SEQ_PRINTF(LogicalProcessors);
> +	HV_SEQ_PRINTF(Partitions);
> +	HV_SEQ_PRINTF(TotalPages);
> +	HV_SEQ_PRINTF(VirtualProcessors);
> +	HV_SEQ_PRINTF(MonitoredNotifications);
> +	HV_SEQ_PRINTF(ModernStandbyEntries);
> +	HV_SEQ_PRINTF(PlatformIdleTransitions);
> +	HV_SEQ_PRINTF(HypervisorStartupCost);
> +	HV_SEQ_PRINTF(IOSpacePages);
> +	HV_SEQ_PRINTF(NonEssentialPagesForDump);
> +	HV_SEQ_PRINTF(SubsumedPages);
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(hv_stats);
> +
> +static void mshv_hv_stats_unmap(void)
> +{
> +	union hv_stats_object_identity identity = {
> +		.hv.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_HYPERVISOR, NULL, &identity);
> +	if (err)
> +		pr_err("%s: failed to unmap hypervisor stats: %d\n",
> +		       __func__, err);
> +}
> +
> +static void * __init mshv_hv_stats_map(void)
> +{
> +	union hv_stats_object_identity identity = {
> +		.hv.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_HYPERVISOR, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map hypervisor stats: %d\n",
> +		       __func__, err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}
> +
> +static int __init mshv_debugfs_hv_stats_create(struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	u64 *stats;
> +	int err;
> +
> +	stats = mshv_hv_stats_map();
> +	if (IS_ERR(stats))
> +		return PTR_ERR(stats);
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     stats, &hv_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		pr_err("%s: failed to create hypervisor stats dentry: %d\n",
> +		       __func__, err);
> +		goto unmap_hv_stats;
> +	}
> +
> +	mshv_lps_count = stats[HvLogicalProcessors];
> +
> +	return 0;
> +
> +unmap_hv_stats:
> +	mshv_hv_stats_unmap();
> +	return err;
> +}
> +
> +int mshv_debugfs_vp_create(struct mshv_vp *vp)
> +{
> +	struct mshv_partition *p = vp->vp_partition;
> +	int err;

nit: err is redundant; the return value of vp_debugfs_create() can be returned directly

> +
> +	if (!mshv_debugfs)
> +		return 0;
> +
> +	err = vp_debugfs_create(p->pt_id, vp->vp_index,
> +				&vp->vp_debugfs_stats_dentry,
> +				p->pt_debugfs_vp_dentry);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
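Something like this (untested) avoids the extra variable:

```c
int mshv_debugfs_vp_create(struct mshv_vp *vp)
{
	struct mshv_partition *p = vp->vp_partition;

	if (!mshv_debugfs)
		return 0;

	return vp_debugfs_create(p->pt_id, vp->vp_index,
				 &vp->vp_debugfs_stats_dentry,
				 p->pt_debugfs_vp_dentry);
}
```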
> +
> +void mshv_debugfs_vp_remove(struct mshv_vp *vp)
> +{
> +	if (!mshv_debugfs)
> +		return;
> +
> +	vp_debugfs_remove(vp->vp_partition->pt_id, vp->vp_index,
> +			  vp->vp_debugfs_stats_dentry);
> +}
> +
> +int mshv_debugfs_partition_create(struct mshv_partition *partition)
> +{
> +	int err;
> +
> +	if (!mshv_debugfs)
> +		return 0;
> +
> +	err = partition_debugfs_create(partition->pt_id,
> +				       &partition->pt_debugfs_vp_dentry,
> +				       &partition->pt_debugfs_stats_dentry,
> +				       mshv_debugfs_partition);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +void mshv_debugfs_partition_remove(struct mshv_partition *partition)
> +{
> +	if (!mshv_debugfs)
> +		return;
> +
> +	partition_debugfs_remove(partition->pt_id,
> +				 partition->pt_debugfs_stats_dentry);
> +}
> +
> +int __init mshv_debugfs_init(void)
> +{
> +	int err;
> +
> +	mshv_debugfs = debugfs_create_dir("mshv", NULL);
> +	if (IS_ERR(mshv_debugfs)) {
> +		pr_err("%s: failed to create debugfs directory\n", __func__);
> +		return PTR_ERR(mshv_debugfs);
> +	}
> +
> +	if (hv_root_partition()) {
> +		err = mshv_debugfs_hv_stats_create(mshv_debugfs);
> +		if (err)
> +			goto remove_mshv_dir;
> +
> +		err = mshv_debugfs_lp_create(mshv_debugfs);
> +		if (err)
> +			goto unmap_hv_stats;
> +	}
> +
> +	err = mshv_debugfs_parent_partition_create();
> +	if (err)
> +		goto unmap_lp_stats;
> +
> +	return 0;
> +
> +unmap_lp_stats:
> +	if (hv_root_partition())
> +		mshv_debugfs_lp_remove();
> +unmap_hv_stats:
> +	if (hv_root_partition())
> +		mshv_hv_stats_unmap();
> +remove_mshv_dir:
> +	debugfs_remove_recursive(mshv_debugfs);
> +	return err;
> +}
> +
> +void mshv_debugfs_exit(void)
> +{
> +	mshv_debugfs_parent_partition_remove();
> +
> +	if (hv_root_partition()) {
> +		mshv_debugfs_lp_remove();
> +		mshv_hv_stats_unmap();
> +	}
> +
> +	debugfs_remove_recursive(mshv_debugfs);
> +}
> diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
> index 3eb815011b46..1f1b1984449b 100644
> --- a/drivers/hv/mshv_root.h
> +++ b/drivers/hv/mshv_root.h
> @@ -51,6 +51,9 @@ struct mshv_vp {
>  		unsigned int kicked_by_hv;
>  		wait_queue_head_t vp_suspend_queue;
>  	} run;
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +	struct dentry *vp_debugfs_stats_dentry;

nit: the name could be shorter, e.g. vp_stats_dentry

> +#endif
>  };
>  
>  #define vp_fmt(fmt) "p%lluvp%u: " fmt
> @@ -128,6 +131,10 @@ struct mshv_partition {
>  	u64 isolation_type;
>  	bool import_completed;
>  	bool pt_initialized;
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +	struct dentry *pt_debugfs_stats_dentry;
> +	struct dentry *pt_debugfs_vp_dentry;

same here

> +#endif
>  };
>  
>  #define pt_fmt(fmt) "p%llu: " fmt
> @@ -308,6 +315,33 @@ int hv_call_modify_spa_host_access(u64 partition_id, struct page **pages,
>  int hv_call_get_partition_property_ex(u64 partition_id, u64 property_code, u64 arg,
>  				      void *property_value, size_t property_value_sz);
>  
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +int __init mshv_debugfs_init(void);
> +void mshv_debugfs_exit(void);
> +
> +int mshv_debugfs_partition_create(struct mshv_partition *partition);
> +void mshv_debugfs_partition_remove(struct mshv_partition *partition);
> +int mshv_debugfs_vp_create(struct mshv_vp *vp);
> +void mshv_debugfs_vp_remove(struct mshv_vp *vp);
> +#else
> +static inline int __init mshv_debugfs_init(void)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_exit(void) { }
> +
> +static inline int mshv_debugfs_partition_create(struct mshv_partition *partition)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_partition_remove(struct mshv_partition *partition) { }
> +static inline int mshv_debugfs_vp_create(struct mshv_vp *vp)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_vp_remove(struct mshv_vp *vp) { }
> +#endif
> +
>  extern struct mshv_root mshv_root;
>  extern enum hv_scheduler_type hv_scheduler_type;
>  extern u8 * __percpu *hv_synic_eventring_tail;
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 19006b788e85..152fcd9b45e6 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -982,6 +982,10 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
>  	if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
>  		memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
>  
> +	ret = mshv_debugfs_vp_create(vp);
> +	if (ret)
> +		goto put_partition;
> +
>  	/*
>  	 * Keep anon_inode_getfd last: it installs fd in the file struct and
>  	 * thus makes the state accessible in user space.
> @@ -989,7 +993,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
>  	ret = anon_inode_getfd("mshv_vp", &mshv_vp_fops, vp,
>  			       O_RDWR | O_CLOEXEC);
>  	if (ret < 0)
> -		goto put_partition;
> +		goto remove_debugfs_vp;
>  
>  	/* already exclusive with the partition mutex for all ioctls */
>  	partition->pt_vp_count++;
> @@ -997,6 +1001,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
>  
>  	return ret;
>  
> +remove_debugfs_vp:
> +	mshv_debugfs_vp_remove(vp);
>  put_partition:
>  	mshv_partition_put(partition);
>  free_vp:
> @@ -1556,13 +1562,18 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
>  
>  	ret = hv_call_initialize_partition(partition->pt_id);
>  	if (ret)
> -		goto withdraw_mem;

Looks like the behavior changed here: on hv_call_initialize_partition() failure, the memory is no longer withdrawn. Could you explain?


> +		return ret;
> +
> +	ret = mshv_debugfs_partition_create(partition);
> +	if (ret)
> +		goto finalize_partition;
>  
>  	partition->pt_initialized = true;
>  
>  	return 0;
>  
> -withdraw_mem:
> +finalize_partition:
> +	hv_call_finalize_partition(partition->pt_id);
>  	hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
>  
>  	return ret;
> @@ -1741,6 +1752,8 @@ static void destroy_partition(struct mshv_partition *partition)
>  			if (!vp)
>  				continue;
>  
> +			mshv_debugfs_vp_remove(vp);
> +
>  			if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
>  				mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
>  						    (void **)vp->vp_stats_pages);
> @@ -1775,6 +1788,8 @@ static void destroy_partition(struct mshv_partition *partition)
>  			partition->pt_vp_array[i] = NULL;
>  		}
>  
> +		mshv_debugfs_partition_remove(partition);
> +
>  		/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
>  		hv_call_finalize_partition(partition->pt_id);
>  
> @@ -2351,10 +2366,14 @@ static int __init mshv_parent_partition_init(void)
>  
>  	mshv_init_vmm_caps(dev);
>  
> -	ret = mshv_irqfd_wq_init();
> +	ret = mshv_debugfs_init();
>  	if (ret)
>  		goto exit_partition;
>  
> +	ret = mshv_irqfd_wq_init();
> +	if (ret)
> +		goto exit_debugfs;
> +
>  	spin_lock_init(&mshv_root.pt_ht_lock);
>  	hash_init(mshv_root.pt_htable);
>  
> @@ -2362,6 +2381,10 @@ static int __init mshv_parent_partition_init(void)
>  
>  	return 0;
>  
> +destroy_irqds_wq:

Where is this label used?

Thanks,
Stanislav

> +	mshv_irqfd_wq_cleanup();
> +exit_debugfs:
> +	mshv_debugfs_exit();
>  exit_partition:
>  	if (hv_root_partition())
>  		mshv_root_partition_exit();
> @@ -2378,6 +2401,7 @@ static void __exit mshv_parent_partition_exit(void)
>  {
>  	hv_setup_mshv_handler(NULL);
>  	mshv_port_table_fini();
> +	mshv_debugfs_exit();
>  	misc_deregister(&mshv_dev);
>  	mshv_irqfd_wq_cleanup();
>  	if (hv_root_partition())
> -- 
> 2.34.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-05 18:58 ` [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics Nuno Das Neves
  2025-12-05 23:06   ` Stanislav Kinsburskii
@ 2025-12-08  3:04   ` kernel test robot
  2025-12-08  6:02   ` kernel test robot
  2025-12-08 15:21   ` Michael Kelley
  3 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-12-08  3:04 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv, linux-kernel, skinsburskii
  Cc: oe-kbuild-all, kys, haiyangz, wei.liu, decui, longli, mhklinux,
	prapal, mrathor, paekkaladevi, Nuno Das Neves, Jinank Jain

Hi Nuno,

kernel test robot noticed the following build warnings:

[auto build test WARNING on next-20251205]
[cannot apply to linus/master v6.18 v6.18-rc7 v6.18-rc6 v6.18]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Nuno-Das-Neves/mshv-Ignore-second-stats-page-map-result-failure/20251206-033756
base:   next-20251205
patch link:    https://lore.kernel.org/r/1764961122-31679-4-git-send-email-nunodasneves%40linux.microsoft.com
patch subject: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
config: x86_64-randconfig-002-20251208 (https://download.01.org/0day-ci/archive/20251208/202512081050.0AW9Ecvg-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.4.0-5) 12.4.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251208/202512081050.0AW9Ecvg-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512081050.0AW9Ecvg-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/hv/mshv_root_main.c: In function 'mshv_parent_partition_init':
>> drivers/hv/mshv_root_main.c:2369:1: warning: label 'destroy_irqds_wq' defined but not used [-Wunused-label]
    2369 | destroy_irqds_wq:
         | ^~~~~~~~~~~~~~~~


vim +/destroy_irqds_wq +2369 drivers/hv/mshv_root_main.c

  2299	
  2300	static int __init mshv_parent_partition_init(void)
  2301	{
  2302		int ret;
  2303		struct device *dev;
  2304		union hv_hypervisor_version_info version_info;
  2305	
  2306		if (!hv_parent_partition() || is_kdump_kernel())
  2307			return -ENODEV;
  2308	
  2309		if (hv_get_hypervisor_version(&version_info))
  2310			return -ENODEV;
  2311	
  2312		ret = misc_register(&mshv_dev);
  2313		if (ret)
  2314			return ret;
  2315	
  2316		dev = mshv_dev.this_device;
  2317	
  2318		if (version_info.build_number < MSHV_HV_MIN_VERSION ||
  2319		    version_info.build_number > MSHV_HV_MAX_VERSION) {
  2320			dev_err(dev, "Running on unvalidated Hyper-V version\n");
  2321			dev_err(dev, "Versions: current: %u  min: %u  max: %u\n",
  2322				version_info.build_number, MSHV_HV_MIN_VERSION,
  2323				MSHV_HV_MAX_VERSION);
  2324		}
  2325	
  2326		mshv_root.synic_pages = alloc_percpu(struct hv_synic_pages);
  2327		if (!mshv_root.synic_pages) {
  2328			dev_err(dev, "Failed to allocate percpu synic page\n");
  2329			ret = -ENOMEM;
  2330			goto device_deregister;
  2331		}
  2332	
  2333		ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mshv_synic",
  2334					mshv_synic_init,
  2335					mshv_synic_cleanup);
  2336		if (ret < 0) {
  2337			dev_err(dev, "Failed to setup cpu hotplug state: %i\n", ret);
  2338			goto free_synic_pages;
  2339		}
  2340	
  2341		mshv_cpuhp_online = ret;
  2342	
  2343		ret = mshv_retrieve_scheduler_type(dev);
  2344		if (ret)
  2345			goto remove_cpu_state;
  2346	
  2347		if (hv_root_partition())
  2348			ret = mshv_root_partition_init(dev);
  2349		if (ret)
  2350			goto remove_cpu_state;
  2351	
  2352		mshv_init_vmm_caps(dev);
  2353	
  2354		ret = mshv_debugfs_init();
  2355		if (ret)
  2356			goto exit_partition;
  2357	
  2358		ret = mshv_irqfd_wq_init();
  2359		if (ret)
  2360			goto exit_debugfs;
  2361	
  2362		spin_lock_init(&mshv_root.pt_ht_lock);
  2363		hash_init(mshv_root.pt_htable);
  2364	
  2365		hv_setup_mshv_handler(mshv_isr);
  2366	
  2367		return 0;
  2368	
> 2369	destroy_irqds_wq:
  2370		mshv_irqfd_wq_cleanup();
  2371	exit_debugfs:
  2372		mshv_debugfs_exit();
  2373	exit_partition:
  2374		if (hv_root_partition())
  2375			mshv_root_partition_exit();
  2376	remove_cpu_state:
  2377		cpuhp_remove_state(mshv_cpuhp_online);
  2378	free_synic_pages:
  2379		free_percpu(mshv_root.synic_pages);
  2380	device_deregister:
  2381		misc_deregister(&mshv_dev);
  2382		return ret;
  2383	}
  2384	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-05 18:58 ` [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics Nuno Das Neves
  2025-12-05 23:06   ` Stanislav Kinsburskii
  2025-12-08  3:04   ` kernel test robot
@ 2025-12-08  6:02   ` kernel test robot
  2025-12-08 15:21   ` Michael Kelley
  3 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-12-08  6:02 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv, linux-kernel, skinsburskii
  Cc: llvm, oe-kbuild-all, kys, haiyangz, wei.liu, decui, longli,
	mhklinux, prapal, mrathor, paekkaladevi, Nuno Das Neves,
	Jinank Jain

Hi Nuno,

kernel test robot noticed the following build warnings:

[auto build test WARNING on next-20251205]
[cannot apply to linus/master v6.18 v6.18-rc7 v6.18-rc6 v6.18]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Nuno-Das-Neves/mshv-Ignore-second-stats-page-map-result-failure/20251206-033756
base:   next-20251205
patch link:    https://lore.kernel.org/r/1764961122-31679-4-git-send-email-nunodasneves%40linux.microsoft.com
patch subject: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
config: x86_64-randconfig-076-20251208 (https://download.01.org/0day-ci/archive/20251208/202512081314.ULBIWb1d-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251208/202512081314.ULBIWb1d-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512081314.ULBIWb1d-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/hv/mshv_root_main.c:2369:1: warning: unused label 'destroy_irqds_wq' [-Wunused-label]
    2369 | destroy_irqds_wq:
         | ^~~~~~~~~~~~~~~~~
   1 warning generated.


vim +/destroy_irqds_wq +2369 drivers/hv/mshv_root_main.c

  2299	
  2300	static int __init mshv_parent_partition_init(void)
  2301	{
  2302		int ret;
  2303		struct device *dev;
  2304		union hv_hypervisor_version_info version_info;
  2305	
  2306		if (!hv_parent_partition() || is_kdump_kernel())
  2307			return -ENODEV;
  2308	
  2309		if (hv_get_hypervisor_version(&version_info))
  2310			return -ENODEV;
  2311	
  2312		ret = misc_register(&mshv_dev);
  2313		if (ret)
  2314			return ret;
  2315	
  2316		dev = mshv_dev.this_device;
  2317	
  2318		if (version_info.build_number < MSHV_HV_MIN_VERSION ||
  2319		    version_info.build_number > MSHV_HV_MAX_VERSION) {
  2320			dev_err(dev, "Running on unvalidated Hyper-V version\n");
  2321			dev_err(dev, "Versions: current: %u  min: %u  max: %u\n",
  2322				version_info.build_number, MSHV_HV_MIN_VERSION,
  2323				MSHV_HV_MAX_VERSION);
  2324		}
  2325	
  2326		mshv_root.synic_pages = alloc_percpu(struct hv_synic_pages);
  2327		if (!mshv_root.synic_pages) {
  2328			dev_err(dev, "Failed to allocate percpu synic page\n");
  2329			ret = -ENOMEM;
  2330			goto device_deregister;
  2331		}
  2332	
  2333		ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mshv_synic",
  2334					mshv_synic_init,
  2335					mshv_synic_cleanup);
  2336		if (ret < 0) {
  2337			dev_err(dev, "Failed to setup cpu hotplug state: %i\n", ret);
  2338			goto free_synic_pages;
  2339		}
  2340	
  2341		mshv_cpuhp_online = ret;
  2342	
  2343		ret = mshv_retrieve_scheduler_type(dev);
  2344		if (ret)
  2345			goto remove_cpu_state;
  2346	
  2347		if (hv_root_partition())
  2348			ret = mshv_root_partition_init(dev);
  2349		if (ret)
  2350			goto remove_cpu_state;
  2351	
  2352		mshv_init_vmm_caps(dev);
  2353	
  2354		ret = mshv_debugfs_init();
  2355		if (ret)
  2356			goto exit_partition;
  2357	
  2358		ret = mshv_irqfd_wq_init();
  2359		if (ret)
  2360			goto exit_debugfs;
  2361	
  2362		spin_lock_init(&mshv_root.pt_ht_lock);
  2363		hash_init(mshv_root.pt_htable);
  2364	
  2365		hv_setup_mshv_handler(mshv_isr);
  2366	
  2367		return 0;
  2368	
> 2369	destroy_irqds_wq:
  2370		mshv_irqfd_wq_cleanup();
  2371	exit_debugfs:
  2372		mshv_debugfs_exit();
  2373	exit_partition:
  2374		if (hv_root_partition())
  2375			mshv_root_partition_exit();
  2376	remove_cpu_state:
  2377		cpuhp_remove_state(mshv_cpuhp_online);
  2378	free_synic_pages:
  2379		free_percpu(mshv_root.synic_pages);
  2380	device_deregister:
  2381		misc_deregister(&mshv_dev);
  2382		return ret;
  2383	}
  2384	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* RE: [PATCH v2 1/3] mshv: Ignore second stats page map result failure
  2025-12-05 18:58 ` [PATCH v2 1/3] mshv: Ignore second stats page map result failure Nuno Das Neves
  2025-12-05 22:50   ` Stanislav Kinsburskii
@ 2025-12-08 15:12   ` Michael Kelley
  2025-12-30  0:27     ` Nuno Das Neves
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Kelley @ 2025-12-08 15:12 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> 
> From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> 
> Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
> and return HV_STATUS_INVALID_PARAMETER for the second stats page
> mapping request.
> 
> This results in a failure in module init. Instead of failing, gracefully
> fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
> already-mapped stats_pages[HV_STATS_AREA_SELF].

This explains "what" this patch does. But could you add an explanation of "why"
substituting SELF for the unavailable PARENT is the right thing to do? As a somewhat
outside reviewer, I don't know enough about SELF vs. PARENT to immediately know
why this substitution makes sense.

Also, does this patch affect the logic in mshv_vp_dispatch_thread_blocked() where
a zero value for the SELF version of VpRootDispatchThreadBlocked is replaced by
the PARENT value? But that logic seems to be in the reverse direction -- replacing
a missing SELF value with the PARENT value -- whereas this patch is about replacing
missing PARENT values with SELF values. So are there two separate PARENT vs. SELF
issues overall? And after this patch is in place and PARENT values are replaced with
SELF on older hypervisor versions, the logic in mshv_vp_dispatch_thread_blocked()
then effectively becomes a no-op if the SELF value is zero, and the return value will
be zero. Is that a problem?

> 
> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
>  drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
>  drivers/hv/mshv_root_main.c    |  3 +++
>  2 files changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> index 598eaff4ff29..b1770c7b500c 100644
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c
> @@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum
> hv_stats_object_type type,
>  	return ret;
>  }
> 
> +static int
> +hv_stats_get_area_type(enum hv_stats_object_type type,
> +		       const union hv_stats_object_identity *identity)
> +{
> +	switch (type) {
> +	case HV_STATS_OBJECT_HYPERVISOR:
> +		return identity->hv.stats_area_type;
> +	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
> +		return identity->lp.stats_area_type;
> +	case HV_STATS_OBJECT_PARTITION:
> +		return identity->partition.stats_area_type;
> +	case HV_STATS_OBJECT_VP:
> +		return identity->vp.stats_area_type;
> +	}
> +
> +	return -EINVAL;
> +}
> +
>  static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  				  const union hv_stats_object_identity *identity,
>  				  void **addr)
> @@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  	struct hv_input_map_stats_page *input;
>  	struct hv_output_map_stats_page *output;
>  	u64 status, pfn;
> -	int ret = 0;
> +	int hv_status, ret = 0;
> 
>  	do {
>  		local_irq_save(flags);
> @@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>  		pfn = output->map_location;
> 
>  		local_irq_restore(flags);
> -		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
> -			ret = hv_result_to_errno(status);
> +
> +		hv_status = hv_result(status);
> +		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
>  			if (hv_result_success(status))
>  				break;
> -			return ret;
> +
> +			/*
> +			 * Older versions of the hypervisor do not support the
> +			 * PARENT stats area. In this case return "success" but
> +			 * set the page to NULL. The caller should check for
> +			 * this case and instead just use the SELF area.
> +			 */
> +			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
> +			    hv_status == HV_STATUS_INVALID_PARAMETER) {
> +				*addr = NULL;
> +				return 0;
> +			}
> +
> +			hv_status_debug(status, "\n");
> +			return hv_result_to_errno(status);

Does the hv_call_map_stats_page2() function need a similar fix? Or is there a linkage
in hypervisor functionality where any hypervisor version that supports an overlay GPFN
also supports the PARENT stats? If such a linkage is why hv_call_map_stats_page2()
doesn't need a similar fix, please add a code comment to that effect in
hv_call_map_stats_page2().

>  		}
> 
>  		ret = hv_call_deposit_pages(NUMA_NO_NODE,
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index bc15d6f6922f..f59a4ab47685 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
>  	if (err)
>  		goto unmap_self;
> 
> +	if (!stats_pages[HV_STATS_AREA_PARENT])
> +		stats_pages[HV_STATS_AREA_PARENT] =
> stats_pages[HV_STATS_AREA_SELF];
> +
>  	return 0;
> 
>  unmap_self:
> --
> 2.34.1



* RE: [PATCH v2 2/3] mshv: Add definitions for stats pages
  2025-12-05 18:58 ` [PATCH v2 2/3] mshv: Add definitions for stats pages Nuno Das Neves
  2025-12-05 22:51   ` Stanislav Kinsburskii
@ 2025-12-08 15:13   ` Michael Kelley
  2025-12-30 23:04     ` Nuno Das Neves
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Kelley @ 2025-12-08 15:13 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> 
> Add the definitions for hypervisor, logical processor, and partition
> stats pages.
> 
> Move the definition for the VP stats page to its rightful place in
> hvhdk.h, and add the missing members.
> 
> These enum members retain their CamelCase style, since they are imported
> directly from the hypervisor code They will be stringified when printing

Missing a '.' (period) after "hypervisor code".

> the stats out, and retain more readability in this form.
> 
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> ---
>  drivers/hv/mshv_root_main.c |  17 --
>  include/hyperv/hvhdk.h      | 437 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 437 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index f59a4ab47685..19006b788e85 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -38,23 +38,6 @@ MODULE_AUTHOR("Microsoft");
>  MODULE_LICENSE("GPL");
>  MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface
> /dev/mshv");
> 
> -/* TODO move this to another file when debugfs code is added */
> -enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> -#if defined(CONFIG_X86)
> -	VpRootDispatchThreadBlocked			= 202,
> -#elif defined(CONFIG_ARM64)
> -	VpRootDispatchThreadBlocked			= 94,
> -#endif
> -	VpStatsMaxCounter
> -};
> -
> -struct hv_stats_page {
> -	union {
> -		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> -		u8 data[HV_HYP_PAGE_SIZE];
> -	};
> -} __packed;
> -
>  struct mshv_root mshv_root;
> 
>  enum hv_scheduler_type hv_scheduler_type;
> diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
> index 469186df7826..51abbcd0ec37 100644
> --- a/include/hyperv/hvhdk.h
> +++ b/include/hyperv/hvhdk.h
> @@ -10,6 +10,443 @@
>  #include "hvhdk_mini.h"
>  #include "hvgdk.h"
> 
> +enum hv_stats_hypervisor_counters {		/* HV_HYPERVISOR_COUNTER */
> +	HvLogicalProcessors			= 1,
> +	HvPartitions				= 2,
> +	HvTotalPages				= 3,
> +	HvVirtualProcessors			= 4,
> +	HvMonitoredNotifications		= 5,
> +	HvModernStandbyEntries			= 6,
> +	HvPlatformIdleTransitions		= 7,
> +	HvHypervisorStartupCost			= 8,
> +	HvIOSpacePages				= 10,
> +	HvNonEssentialPagesForDump		= 11,
> +	HvSubsumedPages				= 12,
> +	HvStatsMaxCounter
> +};
> +
> +enum hv_stats_partition_counters {		/* HV_PROCESS_COUNTER */
> +	PartitionVirtualProcessors		= 1,
> +	PartitionTlbSize			= 3,
> +	PartitionAddressSpaces			= 4,
> +	PartitionDepositedPages			= 5,
> +	PartitionGpaPages			= 6,
> +	PartitionGpaSpaceModifications		= 7,
> +	PartitionVirtualTlbFlushEntires		= 8,
> +	PartitionRecommendedTlbSize		= 9,
> +	PartitionGpaPages4K			= 10,
> +	PartitionGpaPages2M			= 11,
> +	PartitionGpaPages1G			= 12,
> +	PartitionGpaPages512G			= 13,
> +	PartitionDevicePages4K			= 14,
> +	PartitionDevicePages2M			= 15,
> +	PartitionDevicePages1G			= 16,
> +	PartitionDevicePages512G		= 17,
> +	PartitionAttachedDevices		= 18,
> +	PartitionDeviceInterruptMappings	= 19,
> +	PartitionIoTlbFlushes			= 20,
> +	PartitionIoTlbFlushCost			= 21,
> +	PartitionDeviceInterruptErrors		= 22,
> +	PartitionDeviceDmaErrors		= 23,
> +	PartitionDeviceInterruptThrottleEvents	= 24,
> +	PartitionSkippedTimerTicks		= 25,
> +	PartitionPartitionId			= 26,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	PartitionNestedTlbSize			= 27,
> +	PartitionRecommendedNestedTlbSize	= 28,
> +	PartitionNestedTlbFreeListSize		= 29,
> +	PartitionNestedTlbTrimmedPages		= 30,
> +	PartitionPagesShattered			= 31,
> +	PartitionPagesRecombined		= 32,
> +	PartitionHwpRequestValue		= 33,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	PartitionHwpRequestValue		= 27,
> +#endif
> +	PartitionStatsMaxCounter
> +};
> +
> +enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> +	VpTotalRunTime					= 1,
> +	VpHypervisorRunTime				= 2,
> +	VpRemoteNodeRunTime				= 3,
> +	VpNormalizedRunTime				= 4,
> +	VpIdealCpu					= 5,
> +	VpHypercallsCount				= 7,
> +	VpHypercallsTime				= 8,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	VpPageInvalidationsCount			= 9,
> +	VpPageInvalidationsTime				= 10,
> +	VpControlRegisterAccessesCount			= 11,
> +	VpControlRegisterAccessesTime			= 12,
> +	VpIoInstructionsCount				= 13,
> +	VpIoInstructionsTime				= 14,
> +	VpHltInstructionsCount				= 15,
> +	VpHltInstructionsTime				= 16,
> +	VpMwaitInstructionsCount			= 17,
> +	VpMwaitInstructionsTime				= 18,
> +	VpCpuidInstructionsCount			= 19,
> +	VpCpuidInstructionsTime				= 20,
> +	VpMsrAccessesCount				= 21,
> +	VpMsrAccessesTime				= 22,
> +	VpOtherInterceptsCount				= 23,
> +	VpOtherInterceptsTime				= 24,
> +	VpExternalInterruptsCount			= 25,
> +	VpExternalInterruptsTime			= 26,
> +	VpPendingInterruptsCount			= 27,
> +	VpPendingInterruptsTime				= 28,
> +	VpEmulatedInstructionsCount			= 29,
> +	VpEmulatedInstructionsTime			= 30,
> +	VpDebugRegisterAccessesCount			= 31,
> +	VpDebugRegisterAccessesTime			= 32,
> +	VpPageFaultInterceptsCount			= 33,
> +	VpPageFaultInterceptsTime			= 34,
> +	VpGuestPageTableMaps				= 35,
> +	VpLargePageTlbFills				= 36,
> +	VpSmallPageTlbFills				= 37,
> +	VpReflectedGuestPageFaults			= 38,
> +	VpApicMmioAccesses				= 39,
> +	VpIoInterceptMessages				= 40,
> +	VpMemoryInterceptMessages			= 41,
> +	VpApicEoiAccesses				= 42,
> +	VpOtherMessages					= 43,
> +	VpPageTableAllocations				= 44,
> +	VpLogicalProcessorMigrations			= 45,
> +	VpAddressSpaceEvictions				= 46,
> +	VpAddressSpaceSwitches				= 47,
> +	VpAddressDomainFlushes				= 48,
> +	VpAddressSpaceFlushes				= 49,
> +	VpGlobalGvaRangeFlushes				= 50,
> +	VpLocalGvaRangeFlushes				= 51,
> +	VpPageTableEvictions				= 52,
> +	VpPageTableReclamations				= 53,
> +	VpPageTableResets				= 54,
> +	VpPageTableValidations				= 55,
> +	VpApicTprAccesses				= 56,
> +	VpPageTableWriteIntercepts			= 57,
> +	VpSyntheticInterrupts				= 58,
> +	VpVirtualInterrupts				= 59,
> +	VpApicIpisSent					= 60,
> +	VpApicSelfIpisSent				= 61,
> +	VpGpaSpaceHypercalls				= 62,
> +	VpLogicalProcessorHypercalls			= 63,
> +	VpLongSpinWaitHypercalls			= 64,
> +	VpOtherHypercalls				= 65,
> +	VpSyntheticInterruptHypercalls			= 66,
> +	VpVirtualInterruptHypercalls			= 67,
> +	VpVirtualMmuHypercalls				= 68,
> +	VpVirtualProcessorHypercalls			= 69,
> +	VpHardwareInterrupts				= 70,
> +	VpNestedPageFaultInterceptsCount		= 71,
> +	VpNestedPageFaultInterceptsTime			= 72,
> +	VpPageScans					= 73,
> +	VpLogicalProcessorDispatches			= 74,
> +	VpWaitingForCpuTime				= 75,
> +	VpExtendedHypercalls				= 76,
> +	VpExtendedHypercallInterceptMessages		= 77,
> +	VpMbecNestedPageTableSwitches			= 78,
> +	VpOtherReflectedGuestExceptions			= 79,
> +	VpGlobalIoTlbFlushes				= 80,
> +	VpGlobalIoTlbFlushCost				= 81,
> +	VpLocalIoTlbFlushes				= 82,
> +	VpLocalIoTlbFlushCost				= 83,
> +	VpHypercallsForwardedCount			= 84,
> +	VpHypercallsForwardingTime			= 85,
> +	VpPageInvalidationsForwardedCount		= 86,
> +	VpPageInvalidationsForwardingTime		= 87,
> +	VpControlRegisterAccessesForwardedCount		= 88,
> +	VpControlRegisterAccessesForwardingTime		= 89,
> +	VpIoInstructionsForwardedCount			= 90,
> +	VpIoInstructionsForwardingTime			= 91,
> +	VpHltInstructionsForwardedCount			= 92,
> +	VpHltInstructionsForwardingTime			= 93,
> +	VpMwaitInstructionsForwardedCount		= 94,
> +	VpMwaitInstructionsForwardingTime		= 95,
> +	VpCpuidInstructionsForwardedCount		= 96,
> +	VpCpuidInstructionsForwardingTime		= 97,
> +	VpMsrAccessesForwardedCount			= 98,
> +	VpMsrAccessesForwardingTime			= 99,
> +	VpOtherInterceptsForwardedCount			= 100,
> +	VpOtherInterceptsForwardingTime			= 101,
> +	VpExternalInterruptsForwardedCount		= 102,
> +	VpExternalInterruptsForwardingTime		= 103,
> +	VpPendingInterruptsForwardedCount		= 104,
> +	VpPendingInterruptsForwardingTime		= 105,
> +	VpEmulatedInstructionsForwardedCount		= 106,
> +	VpEmulatedInstructionsForwardingTime		= 107,
> +	VpDebugRegisterAccessesForwardedCount		= 108,
> +	VpDebugRegisterAccessesForwardingTime		= 109,
> +	VpPageFaultInterceptsForwardedCount		= 110,
> +	VpPageFaultInterceptsForwardingTime		= 111,
> +	VpVmclearEmulationCount				= 112,
> +	VpVmclearEmulationTime				= 113,
> +	VpVmptrldEmulationCount				= 114,
> +	VpVmptrldEmulationTime				= 115,
> +	VpVmptrstEmulationCount				= 116,
> +	VpVmptrstEmulationTime				= 117,
> +	VpVmreadEmulationCount				= 118,
> +	VpVmreadEmulationTime				= 119,
> +	VpVmwriteEmulationCount				= 120,
> +	VpVmwriteEmulationTime				= 121,
> +	VpVmxoffEmulationCount				= 122,
> +	VpVmxoffEmulationTime				= 123,
> +	VpVmxonEmulationCount				= 124,
> +	VpVmxonEmulationTime				= 125,
> +	VpNestedVMEntriesCount				= 126,
> +	VpNestedVMEntriesTime				= 127,
> +	VpNestedSLATSoftPageFaultsCount			= 128,
> +	VpNestedSLATSoftPageFaultsTime			= 129,
> +	VpNestedSLATHardPageFaultsCount			= 130,
> +	VpNestedSLATHardPageFaultsTime			= 131,
> +	VpInvEptAllContextEmulationCount		= 132,
> +	VpInvEptAllContextEmulationTime			= 133,
> +	VpInvEptSingleContextEmulationCount		= 134,
> +	VpInvEptSingleContextEmulationTime		= 135,
> +	VpInvVpidAllContextEmulationCount		= 136,
> +	VpInvVpidAllContextEmulationTime		= 137,
> +	VpInvVpidSingleContextEmulationCount		= 138,
> +	VpInvVpidSingleContextEmulationTime		= 139,
> +	VpInvVpidSingleAddressEmulationCount		= 140,
> +	VpInvVpidSingleAddressEmulationTime		= 141,
> +	VpNestedTlbPageTableReclamations		= 142,
> +	VpNestedTlbPageTableEvictions			= 143,
> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 144,
> +	VpFlushGuestPhysicalAddressListHypercalls	= 145,
> +	VpPostedInterruptNotifications			= 146,
> +	VpPostedInterruptScans				= 147,
> +	VpTotalCoreRunTime				= 148,
> +	VpMaximumRunTime				= 149,
> +	VpHwpRequestContextSwitches			= 150,
> +	VpWaitingForCpuTimeBucket0			= 151,
> +	VpWaitingForCpuTimeBucket1			= 152,
> +	VpWaitingForCpuTimeBucket2			= 153,
> +	VpWaitingForCpuTimeBucket3			= 154,
> +	VpWaitingForCpuTimeBucket4			= 155,
> +	VpWaitingForCpuTimeBucket5			= 156,
> +	VpWaitingForCpuTimeBucket6			= 157,
> +	VpVmloadEmulationCount				= 158,
> +	VpVmloadEmulationTime				= 159,
> +	VpVmsaveEmulationCount				= 160,
> +	VpVmsaveEmulationTime				= 161,
> +	VpGifInstructionEmulationCount			= 162,
> +	VpGifInstructionEmulationTime			= 163,
> +	VpEmulatedErrataSvmInstructions			= 164,
> +	VpPlaceholder1					= 165,
> +	VpPlaceholder2					= 166,
> +	VpPlaceholder3					= 167,
> +	VpPlaceholder4					= 168,
> +	VpPlaceholder5					= 169,
> +	VpPlaceholder6					= 170,
> +	VpPlaceholder7					= 171,
> +	VpPlaceholder8					= 172,
> +	VpPlaceholder9					= 173,
> +	VpPlaceholder10					= 174,
> +	VpSchedulingPriority				= 175,
> +	VpRdpmcInstructionsCount			= 176,
> +	VpRdpmcInstructionsTime				= 177,
> +	VpPerfmonPmuMsrAccessesCount			= 178,
> +	VpPerfmonLbrMsrAccessesCount			= 179,
> +	VpPerfmonIptMsrAccessesCount			= 180,
> +	VpPerfmonInterruptCount				= 181,
> +	VpVtl1DispatchCount				= 182,
> +	VpVtl2DispatchCount				= 183,
> +	VpVtl2DispatchBucket0				= 184,
> +	VpVtl2DispatchBucket1				= 185,
> +	VpVtl2DispatchBucket2				= 186,
> +	VpVtl2DispatchBucket3				= 187,
> +	VpVtl2DispatchBucket4				= 188,
> +	VpVtl2DispatchBucket5				= 189,
> +	VpVtl2DispatchBucket6				= 190,
> +	VpVtl1RunTime					= 191,
> +	VpVtl2RunTime					= 192,
> +	VpIommuHypercalls				= 193,
> +	VpCpuGroupHypercalls				= 194,
> +	VpVsmHypercalls					= 195,
> +	VpEventLogHypercalls				= 196,
> +	VpDeviceDomainHypercalls			= 197,
> +	VpDepositHypercalls				= 198,
> +	VpSvmHypercalls					= 199,
> +	VpBusLockAcquisitionCount			= 200,
> +	VpUnused					= 201,
> +	VpRootDispatchThreadBlocked			= 202,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	VpSysRegAccessesCount				= 9,
> +	VpSysRegAccessesTime				= 10,
> +	VpSmcInstructionsCount				= 11,
> +	VpSmcInstructionsTime				= 12,
> +	VpOtherInterceptsCount				= 13,
> +	VpOtherInterceptsTime				= 14,
> +	VpExternalInterruptsCount			= 15,
> +	VpExternalInterruptsTime			= 16,
> +	VpPendingInterruptsCount			= 17,
> +	VpPendingInterruptsTime				= 18,
> +	VpGuestPageTableMaps				= 19,
> +	VpLargePageTlbFills				= 20,
> +	VpSmallPageTlbFills				= 21,
> +	VpReflectedGuestPageFaults			= 22,
> +	VpMemoryInterceptMessages			= 23,
> +	VpOtherMessages					= 24,
> +	VpLogicalProcessorMigrations			= 25,
> +	VpAddressDomainFlushes				= 26,
> +	VpAddressSpaceFlushes				= 27,
> +	VpSyntheticInterrupts				= 28,
> +	VpVirtualInterrupts				= 29,
> +	VpApicSelfIpisSent				= 30,
> +	VpGpaSpaceHypercalls				= 31,
> +	VpLogicalProcessorHypercalls			= 32,
> +	VpLongSpinWaitHypercalls			= 33,
> +	VpOtherHypercalls				= 34,
> +	VpSyntheticInterruptHypercalls			= 35,
> +	VpVirtualInterruptHypercalls			= 36,
> +	VpVirtualMmuHypercalls				= 37,
> +	VpVirtualProcessorHypercalls			= 38,
> +	VpHardwareInterrupts				= 39,
> +	VpNestedPageFaultInterceptsCount		= 40,
> +	VpNestedPageFaultInterceptsTime			= 41,
> +	VpLogicalProcessorDispatches			= 42,
> +	VpWaitingForCpuTime				= 43,
> +	VpExtendedHypercalls				= 44,
> +	VpExtendedHypercallInterceptMessages		= 45,
> +	VpMbecNestedPageTableSwitches			= 46,
> +	VpOtherReflectedGuestExceptions			= 47,
> +	VpGlobalIoTlbFlushes				= 48,
> +	VpGlobalIoTlbFlushCost				= 49,
> +	VpLocalIoTlbFlushes				= 50,
> +	VpLocalIoTlbFlushCost				= 51,
> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 52,
> +	VpFlushGuestPhysicalAddressListHypercalls	= 53,
> +	VpPostedInterruptNotifications			= 54,
> +	VpPostedInterruptScans				= 55,
> +	VpTotalCoreRunTime				= 56,
> +	VpMaximumRunTime				= 57,
> +	VpWaitingForCpuTimeBucket0			= 58,
> +	VpWaitingForCpuTimeBucket1			= 59,
> +	VpWaitingForCpuTimeBucket2			= 60,
> +	VpWaitingForCpuTimeBucket3			= 61,
> +	VpWaitingForCpuTimeBucket4			= 62,
> +	VpWaitingForCpuTimeBucket5			= 63,
> +	VpWaitingForCpuTimeBucket6			= 64,
> +	VpHwpRequestContextSwitches			= 65,
> +	VpPlaceholder2					= 66,
> +	VpPlaceholder3					= 67,
> +	VpPlaceholder4					= 68,
> +	VpPlaceholder5					= 69,
> +	VpPlaceholder6					= 70,
> +	VpPlaceholder7					= 71,
> +	VpPlaceholder8					= 72,
> +	VpContentionTime				= 73,
> +	VpWakeUpTime					= 74,
> +	VpSchedulingPriority				= 75,
> +	VpVtl1DispatchCount				= 76,
> +	VpVtl2DispatchCount				= 77,
> +	VpVtl2DispatchBucket0				= 78,
> +	VpVtl2DispatchBucket1				= 79,
> +	VpVtl2DispatchBucket2				= 80,
> +	VpVtl2DispatchBucket3				= 81,
> +	VpVtl2DispatchBucket4				= 82,
> +	VpVtl2DispatchBucket5				= 83,
> +	VpVtl2DispatchBucket6				= 84,
> +	VpVtl1RunTime					= 85,
> +	VpVtl2RunTime					= 86,
> +	VpIommuHypercalls				= 87,
> +	VpCpuGroupHypercalls				= 88,
> +	VpVsmHypercalls					= 89,
> +	VpEventLogHypercalls				= 90,
> +	VpDeviceDomainHypercalls			= 91,
> +	VpDepositHypercalls				= 92,
> +	VpSvmHypercalls					= 93,
> +	VpLoadAvg					= 94,
> +	VpRootDispatchThreadBlocked			= 95,

In current code, VpRootDispatchThreadBlocked on ARM64 is 94. Is that an
error that is being corrected by this patch?

> +#endif
> +	VpStatsMaxCounter
> +};
> +
> +enum hv_stats_lp_counters {			/* HV_CPU_COUNTER */
> +	LpGlobalTime				= 1,
> +	LpTotalRunTime				= 2,
> +	LpHypervisorRunTime			= 3,
> +	LpHardwareInterrupts			= 4,
> +	LpContextSwitches			= 5,
> +	LpInterProcessorInterrupts		= 6,
> +	LpSchedulerInterrupts			= 7,
> +	LpTimerInterrupts			= 8,
> +	LpInterProcessorInterruptsSent		= 9,
> +	LpProcessorHalts			= 10,
> +	LpMonitorTransitionCost			= 11,
> +	LpContextSwitchTime			= 12,
> +	LpC1TransitionsCount			= 13,
> +	LpC1RunTime				= 14,
> +	LpC2TransitionsCount			= 15,
> +	LpC2RunTime				= 16,
> +	LpC3TransitionsCount			= 17,
> +	LpC3RunTime				= 18,
> +	LpRootVpIndex				= 19,
> +	LpIdleSequenceNumber			= 20,
> +	LpGlobalTscCount			= 21,
> +	LpActiveTscCount			= 22,
> +	LpIdleAccumulation			= 23,
> +	LpReferenceCycleCount0			= 24,
> +	LpActualCycleCount0			= 25,
> +	LpReferenceCycleCount1			= 26,
> +	LpActualCycleCount1			= 27,
> +	LpProximityDomainId			= 28,
> +	LpPostedInterruptNotifications		= 29,
> +	LpBranchPredictorFlushes		= 30,
> +#if IS_ENABLED(CONFIG_X86_64)
> +	LpL1DataCacheFlushes			= 31,
> +	LpImmediateL1DataCacheFlushes		= 32,
> +	LpMbFlushes				= 33,
> +	LpCounterRefreshSequenceNumber		= 34,
> +	LpCounterRefreshReferenceTime		= 35,
> +	LpIdleAccumulationSnapshot		= 36,
> +	LpActiveTscCountSnapshot		= 37,
> +	LpHwpRequestContextSwitches		= 38,
> +	LpPlaceholder1				= 39,
> +	LpPlaceholder2				= 40,
> +	LpPlaceholder3				= 41,
> +	LpPlaceholder4				= 42,
> +	LpPlaceholder5				= 43,
> +	LpPlaceholder6				= 44,
> +	LpPlaceholder7				= 45,
> +	LpPlaceholder8				= 46,
> +	LpPlaceholder9				= 47,
> +	LpPlaceholder10				= 48,
> +	LpReserveGroupId			= 49,
> +	LpRunningPriority			= 50,
> +	LpPerfmonInterruptCount			= 51,
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	LpCounterRefreshSequenceNumber		= 31,
> +	LpCounterRefreshReferenceTime		= 32,
> +	LpIdleAccumulationSnapshot		= 33,
> +	LpActiveTscCountSnapshot		= 34,
> +	LpHwpRequestContextSwitches		= 35,
> +	LpPlaceholder2				= 36,
> +	LpPlaceholder3				= 37,
> +	LpPlaceholder4				= 38,
> +	LpPlaceholder5				= 39,
> +	LpPlaceholder6				= 40,
> +	LpPlaceholder7				= 41,
> +	LpPlaceholder8				= 42,
> +	LpPlaceholder9				= 43,
> +	LpSchLocalRunListSize			= 44,
> +	LpReserveGroupId			= 45,
> +	LpRunningPriority			= 46,
> +#endif
> +	LpStatsMaxCounter
> +};
> +
> +/*
> + * Hypervisor statsitics page format

s/statsitics/statistics/

> + */
> +struct hv_stats_page {
> +	union {
> +		u64 hv_cntrs[HvStatsMaxCounter];		/* Hypervisor counters */
> +		u64 pt_cntrs[PartitionStatsMaxCounter];		/* Partition counters */
> +		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> +		u64 lp_cntrs[LpStatsMaxCounter];		/* LP counters */
> +		u8 data[HV_HYP_PAGE_SIZE];
> +	};
> +} __packed;
> +
>  /* Bits for dirty mask of hv_vp_register_page */
>  #define HV_X64_REGISTER_CLASS_GENERAL	0
>  #define HV_X64_REGISTER_CLASS_IP	1
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-05 18:58 ` [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics Nuno Das Neves
                     ` (2 preceding siblings ...)
  2025-12-08  6:02   ` kernel test robot
@ 2025-12-08 15:21   ` Michael Kelley
  2025-12-31  0:26     ` Nuno Das Neves
  3 siblings, 1 reply; 18+ messages in thread
From: Michael Kelley @ 2025-12-08 15:21 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com, Jinank Jain

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> 
> Introduce a debugfs interface to expose root and child partition stats
> when running with mshv_root.
> 
> Create a debugfs directory "mshv" containing 'stats' files organized by
> type and id. A stats file contains a number of counters depending on
> its type. e.g. an excerpt from a VP stats file:
> 
> TotalRunTime                  : 1997602722
> HypervisorRunTime             : 649671371
> RemoteNodeRunTime             : 0
> NormalizedRunTime             : 1997602721
> IdealCpu                      : 0
> HypercallsCount               : 1708169
> HypercallsTime                : 111914774
> PageInvalidationsCount        : 0
> PageInvalidationsTime         : 0
> 
> On a root partition with some active child partitions, the entire
> directory structure may look like:
> 
> mshv/
>   stats             # hypervisor stats
>   lp/               # logical processors
>     0/              # LP id
>       stats         # LP 0 stats
>     1/
>     2/
>     3/
>   partition/        # partition stats
>     1/              # root partition id
>       stats         # root partition stats
>       vp/           # root virtual processors
>         0/          # root VP id
>           stats     # root VP 0 stats
>         1/
>         2/
>         3/
>     42/             # child partition id
>       stats         # child partition stats
>       vp/           # child VPs
>         0/          # child VP id
>           stats     # child VP 0 stats
>         1/
>     43/
>     55/
> 

In the above directory tree, each of the "stats" files is in a directory
by itself, where the directory name is the number of whatever
entity the stats are for (lp, partition, or vp). Do you expect there to
be other files parallel to "stats" that will be added later? Otherwise
you could collapse one directory level. The "best" directory structure
is somewhat a matter of taste and judgment, so there's not a "right"
answer. I don't object if your preference is to keep the numbered
directories, even if they are likely to never contain more than the
"stats" file.

> On L1VH, some stats are not present as it does not own the hardware
> like the root partition does:
> - The hypervisor and lp stats are not present
> - L1VH's partition directory is named "self" because it can't get its
>   own id
> - Some of L1VH's partition and VP stats fields are not populated, because
>   it can't map its own HV_STATS_AREA_PARENT page.
> 
> Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com>
> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> Co-developed-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> Co-developed-by: Jinank Jain <jinankjain@microsoft.com>
> Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
>  drivers/hv/Makefile         |    1 +
>  drivers/hv/mshv_debugfs.c   | 1122 +++++++++++++++++++++++++++++++++++
>  drivers/hv/mshv_root.h      |   34 ++
>  drivers/hv/mshv_root_main.c |   32 +-
>  4 files changed, 1185 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/hv/mshv_debugfs.c
> 
> diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
> index 58b8d07639f3..36278c936914 100644
> --- a/drivers/hv/Makefile
> +++ b/drivers/hv/Makefile
> @@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
>  hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
>  mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
>  	       mshv_root_hv_call.o mshv_portid_table.o
> +mshv_root-$(CONFIG_DEBUG_FS) += mshv_debugfs.o
>  mshv_vtl-y := mshv_vtl_main.o
> 
>  # Code that must be built-in
> diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
> new file mode 100644
> index 000000000000..581018690a27
> --- /dev/null
> +++ b/drivers/hv/mshv_debugfs.c
> @@ -0,0 +1,1122 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2025, Microsoft Corporation.
> + *
> + * The /sys/kernel/debug/mshv directory contents.
> + * Contains various statistics data, provided by the hypervisor.
> + *
> + * Authors: Microsoft Linux virtualization team
> + */
> +
> +#include <linux/debugfs.h>
> +#include <linux/stringify.h>
> +#include <asm/mshyperv.h>
> +#include <linux/slab.h>
> +
> +#include "mshv.h"
> +#include "mshv_root.h"
> +
> +#define U32_BUF_SZ 11
> +#define U64_BUF_SZ 21
> +
> +static struct dentry *mshv_debugfs;
> +static struct dentry *mshv_debugfs_partition;
> +static struct dentry *mshv_debugfs_lp;
> +
> +static u64 mshv_lps_count;
> +
> +static bool is_l1vh_parent(u64 partition_id)
> +{
> +	return hv_l1vh_partition() && (partition_id == HV_PARTITION_ID_SELF);
> +}
> +
> +static int lp_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page *stats = m->private;
> +
> +#define LP_SEQ_PRINTF(cnt)		\
> +	seq_printf(m, "%-29s: %llu\n", __stringify(cnt), stats->lp_cntrs[Lp##cnt])
> +
> +	LP_SEQ_PRINTF(GlobalTime);
> +	LP_SEQ_PRINTF(TotalRunTime);
> +	LP_SEQ_PRINTF(HypervisorRunTime);
> +	LP_SEQ_PRINTF(HardwareInterrupts);
> +	LP_SEQ_PRINTF(ContextSwitches);
> +	LP_SEQ_PRINTF(InterProcessorInterrupts);
> +	LP_SEQ_PRINTF(SchedulerInterrupts);
> +	LP_SEQ_PRINTF(TimerInterrupts);
> +	LP_SEQ_PRINTF(InterProcessorInterruptsSent);
> +	LP_SEQ_PRINTF(ProcessorHalts);
> +	LP_SEQ_PRINTF(MonitorTransitionCost);
> +	LP_SEQ_PRINTF(ContextSwitchTime);
> +	LP_SEQ_PRINTF(C1TransitionsCount);
> +	LP_SEQ_PRINTF(C1RunTime);
> +	LP_SEQ_PRINTF(C2TransitionsCount);
> +	LP_SEQ_PRINTF(C2RunTime);
> +	LP_SEQ_PRINTF(C3TransitionsCount);
> +	LP_SEQ_PRINTF(C3RunTime);
> +	LP_SEQ_PRINTF(RootVpIndex);
> +	LP_SEQ_PRINTF(IdleSequenceNumber);
> +	LP_SEQ_PRINTF(GlobalTscCount);
> +	LP_SEQ_PRINTF(ActiveTscCount);
> +	LP_SEQ_PRINTF(IdleAccumulation);
> +	LP_SEQ_PRINTF(ReferenceCycleCount0);
> +	LP_SEQ_PRINTF(ActualCycleCount0);
> +	LP_SEQ_PRINTF(ReferenceCycleCount1);
> +	LP_SEQ_PRINTF(ActualCycleCount1);
> +	LP_SEQ_PRINTF(ProximityDomainId);
> +	LP_SEQ_PRINTF(PostedInterruptNotifications);
> +	LP_SEQ_PRINTF(BranchPredictorFlushes);
> +#if IS_ENABLED(CONFIG_X86_64)
> +	LP_SEQ_PRINTF(L1DataCacheFlushes);
> +	LP_SEQ_PRINTF(ImmediateL1DataCacheFlushes);
> +	LP_SEQ_PRINTF(MbFlushes);
> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	LP_SEQ_PRINTF(Placeholder1);
> +	LP_SEQ_PRINTF(Placeholder2);
> +	LP_SEQ_PRINTF(Placeholder3);
> +	LP_SEQ_PRINTF(Placeholder4);
> +	LP_SEQ_PRINTF(Placeholder5);
> +	LP_SEQ_PRINTF(Placeholder6);
> +	LP_SEQ_PRINTF(Placeholder7);
> +	LP_SEQ_PRINTF(Placeholder8);
> +	LP_SEQ_PRINTF(Placeholder9);
> +	LP_SEQ_PRINTF(Placeholder10);
> +	LP_SEQ_PRINTF(ReserveGroupId);
> +	LP_SEQ_PRINTF(RunningPriority);
> +	LP_SEQ_PRINTF(PerfmonInterruptCount);
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	LP_SEQ_PRINTF(Placeholder2);
> +	LP_SEQ_PRINTF(Placeholder3);
> +	LP_SEQ_PRINTF(Placeholder4);
> +	LP_SEQ_PRINTF(Placeholder5);
> +	LP_SEQ_PRINTF(Placeholder6);
> +	LP_SEQ_PRINTF(Placeholder7);
> +	LP_SEQ_PRINTF(Placeholder8);
> +	LP_SEQ_PRINTF(Placeholder9);
> +	LP_SEQ_PRINTF(SchLocalRunListSize);
> +	LP_SEQ_PRINTF(ReserveGroupId);
> +	LP_SEQ_PRINTF(RunningPriority);
> +#endif
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(lp_stats);
> +
> +static void mshv_lp_stats_unmap(u32 lp_index, void *stats_page_addr)
> +{
> +	union hv_stats_object_identity identity = {
> +		.lp.lp_index = lp_index,
> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR,
> +				  stats_page_addr, &identity);
> +	if (err)
> +		pr_err("%s: failed to unmap logical processor %u stats, err: %d\n",
> +		       __func__, lp_index, err);
> +}
> +
> +static void __init *mshv_lp_stats_map(u32 lp_index)
> +{
> +	union hv_stats_object_identity identity = {
> +		.lp.lp_index = lp_index,
> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR, &identity,
> +				&stats);
> +	if (err) {
> +		pr_err("%s: failed to map logical processor %u stats, err: %d\n",
> +		       __func__, lp_index, err);
> +		return ERR_PTR(err);
> +	}
> +
> +	return stats;
> +}
> +
> +static void __init *lp_debugfs_stats_create(u32 lp_index, struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	void *stats;
> +
> +	stats = mshv_lp_stats_map(lp_index);
> +	if (IS_ERR(stats))
> +		return stats;
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     stats, &lp_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		mshv_lp_stats_unmap(lp_index, stats);
> +		return dentry;
> +	}
> +	return stats;
> +}
> +
> +static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
> +{
> +	struct dentry *idx;
> +	char lp_idx_str[U32_BUF_SZ];
> +	void *stats;
> +	int err;
> +
> +	sprintf(lp_idx_str, "%u", lp_index);
> +
> +	idx = debugfs_create_dir(lp_idx_str, parent);
> +	if (IS_ERR(idx))
> +		return PTR_ERR(idx);
> +
> +	stats = lp_debugfs_stats_create(lp_index, idx);
> +	if (IS_ERR(stats)) {
> +		err = PTR_ERR(stats);
> +		goto remove_debugfs_lp_idx;
> +	}
> +
> +	return 0;
> +
> +remove_debugfs_lp_idx:
> +	debugfs_remove_recursive(idx);
> +	return err;
> +}
> +
> +static void mshv_debugfs_lp_remove(void)
> +{
> +	int lp_index;
> +
> +	debugfs_remove_recursive(mshv_debugfs_lp);
> +
> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
> +		mshv_lp_stats_unmap(lp_index, NULL);

Passing NULL as the second argument here leaks the stats page
memory if Linux allocated the page as an overlay GPFN. But is that
considered OK because the debugfs entries for LPs are removed
only when the root partition is shutting down? That works as
long as hot-add/remove of CPUs isn't supported in the root
partition.

> +}
> +
> +static int __init mshv_debugfs_lp_create(struct dentry *parent)
> +{
> +	struct dentry *lp_dir;
> +	int err, lp_index;
> +
> +	lp_dir = debugfs_create_dir("lp", parent);
> +	if (IS_ERR(lp_dir))
> +		return PTR_ERR(lp_dir);
> +
> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
> +		err = lp_debugfs_create(lp_index, lp_dir);
> +		if (err)
> +			goto remove_debugfs_lps;
> +	}
> +
> +	mshv_debugfs_lp = lp_dir;
> +
> +	return 0;
> +
> +remove_debugfs_lps:
> +	for (lp_index -= 1; lp_index >= 0; lp_index--)
> +		mshv_lp_stats_unmap(lp_index, NULL);
> +	debugfs_remove_recursive(lp_dir);
> +	return err;
> +}
> +
> +static int vp_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page **pstats = m->private;
> +
> +#define VP_SEQ_PRINTF(cnt)				 \
> +do {								 \
> +	if (pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]) \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
> +	else \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
> +} while (0)

I don't understand this logic. Like in mshv_vp_dispatch_thread_blocked(), if
the SELF value is zero, then the PARENT value is used. The implication is that
you never want to display a SELF value of zero, which is a bit unexpected
since I could imagine zero being valid for some counters. But the overall result
is that the displayed values may be a mix of SELF and PARENT values.
And of course after Patch 1 of this series, if running on an older hypervisor
that doesn't provide PARENT, then SELF will be used anyway, which further
muddies what's going on here, at least for me. :-)

If this is the correct behavior, please add some code comments as to
why it makes sense, including in the case where PARENT isn't available.

> +
> +	VP_SEQ_PRINTF(TotalRunTime);
> +	VP_SEQ_PRINTF(HypervisorRunTime);
> +	VP_SEQ_PRINTF(RemoteNodeRunTime);
> +	VP_SEQ_PRINTF(NormalizedRunTime);
> +	VP_SEQ_PRINTF(IdealCpu);
> +	VP_SEQ_PRINTF(HypercallsCount);
> +	VP_SEQ_PRINTF(HypercallsTime);
> +#if IS_ENABLED(CONFIG_X86_64)
> +	VP_SEQ_PRINTF(PageInvalidationsCount);
> +	VP_SEQ_PRINTF(PageInvalidationsTime);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesCount);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesTime);
> +	VP_SEQ_PRINTF(IoInstructionsCount);
> +	VP_SEQ_PRINTF(IoInstructionsTime);
> +	VP_SEQ_PRINTF(HltInstructionsCount);
> +	VP_SEQ_PRINTF(HltInstructionsTime);
> +	VP_SEQ_PRINTF(MwaitInstructionsCount);
> +	VP_SEQ_PRINTF(MwaitInstructionsTime);
> +	VP_SEQ_PRINTF(CpuidInstructionsCount);
> +	VP_SEQ_PRINTF(CpuidInstructionsTime);
> +	VP_SEQ_PRINTF(MsrAccessesCount);
> +	VP_SEQ_PRINTF(MsrAccessesTime);
> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> +	VP_SEQ_PRINTF(EmulatedInstructionsCount);
> +	VP_SEQ_PRINTF(EmulatedInstructionsTime);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesCount);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesTime);
> +	VP_SEQ_PRINTF(PageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(PageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> +	VP_SEQ_PRINTF(LargePageTlbFills);
> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> +	VP_SEQ_PRINTF(ApicMmioAccesses);
> +	VP_SEQ_PRINTF(IoInterceptMessages);
> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> +	VP_SEQ_PRINTF(ApicEoiAccesses);
> +	VP_SEQ_PRINTF(OtherMessages);
> +	VP_SEQ_PRINTF(PageTableAllocations);
> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> +	VP_SEQ_PRINTF(AddressSpaceEvictions);
> +	VP_SEQ_PRINTF(AddressSpaceSwitches);
> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> +	VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
> +	VP_SEQ_PRINTF(LocalGvaRangeFlushes);
> +	VP_SEQ_PRINTF(PageTableEvictions);
> +	VP_SEQ_PRINTF(PageTableReclamations);
> +	VP_SEQ_PRINTF(PageTableResets);
> +	VP_SEQ_PRINTF(PageTableValidations);
> +	VP_SEQ_PRINTF(ApicTprAccesses);
> +	VP_SEQ_PRINTF(PageTableWriteIntercepts);
> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> +	VP_SEQ_PRINTF(VirtualInterrupts);
> +	VP_SEQ_PRINTF(ApicIpisSent);
> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> +	VP_SEQ_PRINTF(OtherHypercalls);
> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> +	VP_SEQ_PRINTF(HardwareInterrupts);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(PageScans);
> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(HypercallsForwardedCount);
> +	VP_SEQ_PRINTF(HypercallsForwardingTime);
> +	VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
> +	VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
> +	VP_SEQ_PRINTF(IoInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(IoInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(HltInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(HltInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(MsrAccessesForwardedCount);
> +	VP_SEQ_PRINTF(MsrAccessesForwardingTime);
> +	VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
> +	VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
> +	VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
> +	VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
> +	VP_SEQ_PRINTF(VmclearEmulationCount);
> +	VP_SEQ_PRINTF(VmclearEmulationTime);
> +	VP_SEQ_PRINTF(VmptrldEmulationCount);
> +	VP_SEQ_PRINTF(VmptrldEmulationTime);
> +	VP_SEQ_PRINTF(VmptrstEmulationCount);
> +	VP_SEQ_PRINTF(VmptrstEmulationTime);
> +	VP_SEQ_PRINTF(VmreadEmulationCount);
> +	VP_SEQ_PRINTF(VmreadEmulationTime);
> +	VP_SEQ_PRINTF(VmwriteEmulationCount);
> +	VP_SEQ_PRINTF(VmwriteEmulationTime);
> +	VP_SEQ_PRINTF(VmxoffEmulationCount);
> +	VP_SEQ_PRINTF(VmxoffEmulationTime);
> +	VP_SEQ_PRINTF(VmxonEmulationCount);
> +	VP_SEQ_PRINTF(VmxonEmulationTime);
> +	VP_SEQ_PRINTF(NestedVMEntriesCount);
> +	VP_SEQ_PRINTF(NestedVMEntriesTime);
> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
> +	VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
> +	VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
> +	VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
> +	VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> +	VP_SEQ_PRINTF(PostedInterruptScans);
> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> +	VP_SEQ_PRINTF(MaximumRunTime);
> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> +	VP_SEQ_PRINTF(VmloadEmulationCount);
> +	VP_SEQ_PRINTF(VmloadEmulationTime);
> +	VP_SEQ_PRINTF(VmsaveEmulationCount);
> +	VP_SEQ_PRINTF(VmsaveEmulationTime);
> +	VP_SEQ_PRINTF(GifInstructionEmulationCount);
> +	VP_SEQ_PRINTF(GifInstructionEmulationTime);
> +	VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
> +	VP_SEQ_PRINTF(Placeholder1);
> +	VP_SEQ_PRINTF(Placeholder2);
> +	VP_SEQ_PRINTF(Placeholder3);
> +	VP_SEQ_PRINTF(Placeholder4);
> +	VP_SEQ_PRINTF(Placeholder5);
> +	VP_SEQ_PRINTF(Placeholder6);
> +	VP_SEQ_PRINTF(Placeholder7);
> +	VP_SEQ_PRINTF(Placeholder8);
> +	VP_SEQ_PRINTF(Placeholder9);
> +	VP_SEQ_PRINTF(Placeholder10);
> +	VP_SEQ_PRINTF(SchedulingPriority);
> +	VP_SEQ_PRINTF(RdpmcInstructionsCount);
> +	VP_SEQ_PRINTF(RdpmcInstructionsTime);
> +	VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
> +	VP_SEQ_PRINTF(PerfmonInterruptCount);
> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> +	VP_SEQ_PRINTF(Vtl1RunTime);
> +	VP_SEQ_PRINTF(Vtl2RunTime);
> +	VP_SEQ_PRINTF(IommuHypercalls);
> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> +	VP_SEQ_PRINTF(VsmHypercalls);
> +	VP_SEQ_PRINTF(EventLogHypercalls);
> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> +	VP_SEQ_PRINTF(DepositHypercalls);
> +	VP_SEQ_PRINTF(SvmHypercalls);
> +	VP_SEQ_PRINTF(BusLockAcquisitionCount);

The x86 VpUnused counter is not shown. Any reason for that? All the
Placeholder counters *are* shown, so I'm just wondering what's
different.

> +#elif IS_ENABLED(CONFIG_ARM64)
> +	VP_SEQ_PRINTF(SysRegAccessesCount);
> +	VP_SEQ_PRINTF(SysRegAccessesTime);
> +	VP_SEQ_PRINTF(SmcInstructionsCount);
> +	VP_SEQ_PRINTF(SmcInstructionsTime);
> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> +	VP_SEQ_PRINTF(LargePageTlbFills);
> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> +	VP_SEQ_PRINTF(OtherMessages);
> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> +	VP_SEQ_PRINTF(VirtualInterrupts);
> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> +	VP_SEQ_PRINTF(OtherHypercalls);
> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> +	VP_SEQ_PRINTF(HardwareInterrupts);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> +	VP_SEQ_PRINTF(PostedInterruptScans);
> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> +	VP_SEQ_PRINTF(MaximumRunTime);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> +	VP_SEQ_PRINTF(Placeholder2);
> +	VP_SEQ_PRINTF(Placeholder3);
> +	VP_SEQ_PRINTF(Placeholder4);
> +	VP_SEQ_PRINTF(Placeholder5);
> +	VP_SEQ_PRINTF(Placeholder6);
> +	VP_SEQ_PRINTF(Placeholder7);
> +	VP_SEQ_PRINTF(Placeholder8);
> +	VP_SEQ_PRINTF(ContentionTime);
> +	VP_SEQ_PRINTF(WakeUpTime);
> +	VP_SEQ_PRINTF(SchedulingPriority);
> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> +	VP_SEQ_PRINTF(Vtl1RunTime);
> +	VP_SEQ_PRINTF(Vtl2RunTime);
> +	VP_SEQ_PRINTF(IommuHypercalls);
> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> +	VP_SEQ_PRINTF(VsmHypercalls);
> +	VP_SEQ_PRINTF(EventLogHypercalls);
> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> +	VP_SEQ_PRINTF(DepositHypercalls);
> +	VP_SEQ_PRINTF(SvmHypercalls);

The ARM64 VpLoadAvg counter is not shown?  Any reason why?

> +#endif

The VpRootDispatchThreadBlocked counter is not shown for either
x86 or ARM64. Is that intentional, and if so, why? I know the counter
is used in mshv_vp_dispatch_thread_blocked(), but it's not clear why
that means it shouldn't be shown here.

> +
> +	return 0;
> +}

This function, vp_stats_show(), seems like a candidate for redoing based on a
static table that lists the counter names and index. Then the code just loops
through the table. On x86 each VP_SEQ_PRINTF() generates 42 bytes of code,
and there are 199 entries, so 8358 bytes. The table entries would probably
be 16 bytes each (a 64-bit pointer to the string constant, a 32-bit index value,
and 4 bytes of padding so each entry is 8-byte aligned). The actual space
saving isn't that large, but the code would be a lot more compact. The
other *_stats_show() functions could do the same.

It's distasteful to me to see 420 lines of enum entries in Patch 2 of this series,
then followed by another 420 lines of matching *_SEQ_PRINTF entries. But I
realize that the goal of the enum entries is to match the Windows code, so I
guess it is what it is. But there's an argument for ditching the enum entries
entirely, and using the putative static table to capture the information. It
doesn't seem like matching the Windows code is saving much sync effort
since any additions/subtractions to the enum entries need to be matched
with changes in the *_stats_show() functions, or in my putative static table.
But I guess if Windows changed only the value for an enum entry without
additions/subtractions, that would sync more easily.

I'm just throwing this out as a thought. You may prefer to keep everything
"as is", in which case ignore my comment and I won't raise it again.

> +DEFINE_SHOW_ATTRIBUTE(vp_stats);
> +
> +static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index, void *stats_page_addr,
> +				enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.vp.partition_id = partition_id,
> +		.vp.vp_index = vp_index,
> +		.vp.stats_area_type = stats_area_type,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_VP, stats_page_addr, &identity);
> +	if (err)
> +		pr_err("%s: failed to unmap partition %llu vp %u %s stats, err: %d\n",
> +		       __func__, partition_id, vp_index,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +}
> +
> +static void *mshv_vp_stats_map(u64 partition_id, u32 vp_index,
> +			       enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.vp.partition_id = partition_id,
> +		.vp.vp_index = vp_index,
> +		.vp.stats_area_type = stats_area_type,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map partition %llu vp %u %s stats, err: %d\n",
> +		       __func__, partition_id, vp_index,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}

Presumably you've noticed that the functions mshv_vp_stats_map() and
mshv_vp_stats_unmap() also exist in mshv_root_main.c.  They are static
functions in both places, so the compiler & linker do the right thing, but
it sure does make things a bit more complex for human readers. The versions
here follow a consistent pattern for (lp, vp, hv, partition), so maybe the ones
in mshv_root_main.c could be renamed to avoid confusion?

> +
> +static int vp_debugfs_stats_create(u64 partition_id, u32 vp_index,
> +				   struct dentry **vp_stats_ptr,
> +				   struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	struct hv_stats_page **pstats;
> +	int err;
> +
> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);

Open coding "2" as the first parameter makes assumptions about the values of
HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT.  Should use
HV_STATS_AREA_COUNT instead of "2" so that indexing into the array is certain
to work.

> +	if (!pstats)
> +		return -ENOMEM;
> +
> +	pstats[HV_STATS_AREA_SELF] = mshv_vp_stats_map(partition_id, vp_index,
> +						       HV_STATS_AREA_SELF);
> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * L1VH partition cannot access its vp stats in parent area.
> +	 */
> +	if (is_l1vh_parent(partition_id)) {
> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	} else {
> +		pstats[HV_STATS_AREA_PARENT] = mshv_vp_stats_map(
> +			partition_id, vp_index, HV_STATS_AREA_PARENT);
> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
> +			goto unmap_self;
> +		}
> +		if (!pstats[HV_STATS_AREA_PARENT])
> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	}
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     pstats, &vp_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		goto unmap_vp_stats;
> +	}
> +
> +	*vp_stats_ptr = dentry;
> +	return 0;
> +
> +unmap_vp_stats:
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
> +		mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_PARENT],
> +				    HV_STATS_AREA_PARENT);
> +unmap_self:
> +	mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_SELF],
> +			    HV_STATS_AREA_SELF);
> +cleanup:
> +	kfree(pstats);
> +	return err;
> +}
> +
> +static void vp_debugfs_remove(u64 partition_id, u32 vp_index,
> +			      struct dentry *vp_stats)
> +{
> +	struct hv_stats_page **pstats = NULL;
> +	void *stats;
> +
> +	pstats = vp_stats->d_inode->i_private;
> +	debugfs_remove_recursive(vp_stats->d_parent);
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
> +		stats = pstats[HV_STATS_AREA_PARENT];
> +		mshv_vp_stats_unmap(partition_id, vp_index, stats,
> +				    HV_STATS_AREA_PARENT);
> +	}
> +
> +	stats = pstats[HV_STATS_AREA_SELF];
> +	mshv_vp_stats_unmap(partition_id, vp_index, stats, HV_STATS_AREA_SELF);
> +
> +	kfree(pstats);
> +}
> +
> +static int vp_debugfs_create(u64 partition_id, u32 vp_index,
> +			     struct dentry **vp_stats_ptr,
> +			     struct dentry *parent)
> +{
> +	struct dentry *vp_idx_dir;
> +	char vp_idx_str[U32_BUF_SZ];
> +	int err;
> +
> +	sprintf(vp_idx_str, "%u", vp_index);
> +
> +	vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
> +	if (IS_ERR(vp_idx_dir))
> +		return PTR_ERR(vp_idx_dir);
> +
> +	err = vp_debugfs_stats_create(partition_id, vp_index, vp_stats_ptr,
> +				      vp_idx_dir);
> +	if (err)
> +		goto remove_debugfs_vp_idx;
> +
> +	return 0;
> +
> +remove_debugfs_vp_idx:
> +	debugfs_remove_recursive(vp_idx_dir);
> +	return err;
> +}
> +
> +static int partition_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page **pstats = m->private;
> +
> +#define PARTITION_SEQ_PRINTF(cnt)				 \
> +do {								 \
> +	if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
> +	else \
> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> +			pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
> +} while (0)

Same comment as for VP_SEQ_PRINTF.

> +
> +	PARTITION_SEQ_PRINTF(VirtualProcessors);
> +	PARTITION_SEQ_PRINTF(TlbSize);
> +	PARTITION_SEQ_PRINTF(AddressSpaces);
> +	PARTITION_SEQ_PRINTF(DepositedPages);
> +	PARTITION_SEQ_PRINTF(GpaPages);
> +	PARTITION_SEQ_PRINTF(GpaSpaceModifications);
> +	PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
> +	PARTITION_SEQ_PRINTF(RecommendedTlbSize);
> +	PARTITION_SEQ_PRINTF(GpaPages4K);
> +	PARTITION_SEQ_PRINTF(GpaPages2M);
> +	PARTITION_SEQ_PRINTF(GpaPages1G);
> +	PARTITION_SEQ_PRINTF(GpaPages512G);
> +	PARTITION_SEQ_PRINTF(DevicePages4K);
> +	PARTITION_SEQ_PRINTF(DevicePages2M);
> +	PARTITION_SEQ_PRINTF(DevicePages1G);
> +	PARTITION_SEQ_PRINTF(DevicePages512G);
> +	PARTITION_SEQ_PRINTF(AttachedDevices);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
> +	PARTITION_SEQ_PRINTF(IoTlbFlushes);
> +	PARTITION_SEQ_PRINTF(IoTlbFlushCost);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
> +	PARTITION_SEQ_PRINTF(DeviceDmaErrors);
> +	PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
> +	PARTITION_SEQ_PRINTF(SkippedTimerTicks);
> +	PARTITION_SEQ_PRINTF(PartitionId);
> +#if IS_ENABLED(CONFIG_X86_64)
> +	PARTITION_SEQ_PRINTF(NestedTlbSize);
> +	PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
> +	PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
> +	PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
> +	PARTITION_SEQ_PRINTF(PagesShattered);
> +	PARTITION_SEQ_PRINTF(PagesRecombined);
> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> +#elif IS_ENABLED(CONFIG_ARM64)
> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> +#endif
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(partition_stats);
> +
> +static void mshv_partition_stats_unmap(u64 partition_id, void *stats_page_addr,
> +				       enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.partition.partition_id = partition_id,
> +		.partition.stats_area_type = stats_area_type,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
> +				  &identity);
> +	if (err) {
> +		pr_err("%s: failed to unmap partition %lld %s stats, err: %d\n",
> +		       __func__, partition_id,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +	}
> +}
> +
> +static void *mshv_partition_stats_map(u64 partition_id,
> +				      enum hv_stats_area_type stats_area_type)
> +{
> +	union hv_stats_object_identity identity = {
> +		.partition.partition_id = partition_id,
> +		.partition.stats_area_type = stats_area_type,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map partition %lld %s stats, err: %d\n",
> +		       __func__, partition_id,
> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> +		       err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}
> +
> +static int mshv_debugfs_partition_stats_create(u64 partition_id,
> +					    struct dentry **partition_stats_ptr,
> +					    struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	struct hv_stats_page **pstats;
> +	int err;
> +
> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);

Same comment here about the use of "2" as the first parameter.

> +	if (!pstats)
> +		return -ENOMEM;
> +
> +	pstats[HV_STATS_AREA_SELF] = mshv_partition_stats_map(partition_id,
> +							      HV_STATS_AREA_SELF);
> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
> +		goto cleanup;
> +	}
> +
> +	/*
> +	 * L1VH partition cannot access its partition stats in parent area.
> +	 */
> +	if (is_l1vh_parent(partition_id)) {
> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	} else {
> +		pstats[HV_STATS_AREA_PARENT] = mshv_partition_stats_map(partition_id,
> +								HV_STATS_AREA_PARENT);
> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
> +			goto unmap_self;
> +		}
> +		if (!pstats[HV_STATS_AREA_PARENT])
> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> +	}
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     pstats, &partition_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		goto unmap_partition_stats;
> +	}
> +
> +	*partition_stats_ptr = dentry;
> +	return 0;
> +
> +unmap_partition_stats:
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
> +		mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_PARENT],
> +					   HV_STATS_AREA_PARENT);
> +unmap_self:
> +	mshv_partition_stats_unmap(partition_id, pstats[HV_STATS_AREA_SELF],
> +				   HV_STATS_AREA_SELF);
> +cleanup:
> +	kfree(pstats);
> +	return err;
> +}
> +
> +static void partition_debugfs_remove(u64 partition_id, struct dentry *dentry)
> +{
> +	struct hv_stats_page **pstats = NULL;
> +	void *stats;
> +
> +	pstats = dentry->d_inode->i_private;
> +
> +	debugfs_remove_recursive(dentry->d_parent);
> +
> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
> +		stats = pstats[HV_STATS_AREA_PARENT];
> +		mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_PARENT);
> +	}
> +
> +	stats = pstats[HV_STATS_AREA_SELF];
> +	mshv_partition_stats_unmap(partition_id, stats, HV_STATS_AREA_SELF);
> +
> +	kfree(pstats);
> +}
> +
> +static int partition_debugfs_create(u64 partition_id,
> +				    struct dentry **vp_dir_ptr,
> +				    struct dentry **partition_stats_ptr,
> +				    struct dentry *parent)
> +{
> +	char part_id_str[U64_BUF_SZ];
> +	struct dentry *part_id_dir, *vp_dir;
> +	int err;
> +
> +	if (is_l1vh_parent(partition_id))
> +		sprintf(part_id_str, "self");
> +	else
> +		sprintf(part_id_str, "%llu", partition_id);
> +
> +	part_id_dir = debugfs_create_dir(part_id_str, parent);
> +	if (IS_ERR(part_id_dir))
> +		return PTR_ERR(part_id_dir);
> +
> +	vp_dir = debugfs_create_dir("vp", part_id_dir);
> +	if (IS_ERR(vp_dir)) {
> +		err = PTR_ERR(vp_dir);
> +		goto remove_debugfs_partition_id;
> +	}
> +
> +	err = mshv_debugfs_partition_stats_create(partition_id,
> +						  partition_stats_ptr,
> +						  part_id_dir);
> +	if (err)
> +		goto remove_debugfs_partition_id;
> +
> +	*vp_dir_ptr = vp_dir;
> +
> +	return 0;
> +
> +remove_debugfs_partition_id:
> +	debugfs_remove_recursive(part_id_dir);
> +	return err;
> +}
> +
> +static void mshv_debugfs_parent_partition_remove(void)
> +{
> +	int idx;
> +
> +	for_each_online_cpu(idx)
> +		vp_debugfs_remove(hv_current_partition_id, idx, NULL);
> +
> +	partition_debugfs_remove(hv_current_partition_id, NULL);
> +}
> +
> +static int __init mshv_debugfs_parent_partition_create(void)
> +{
> +	struct dentry *partition_stats, *vp_dir;
> +	int err, idx, i;
> +
> +	mshv_debugfs_partition = debugfs_create_dir("partition",
> +						     mshv_debugfs);
> +	if (IS_ERR(mshv_debugfs_partition))
> +		return PTR_ERR(mshv_debugfs_partition);
> +
> +	err = partition_debugfs_create(hv_current_partition_id,
> +				       &vp_dir,
> +				       &partition_stats,
> +				       mshv_debugfs_partition);
> +	if (err)
> +		goto remove_debugfs_partition;
> +
> +	for_each_online_cpu(idx) {
> +		struct dentry *vp_stats;
> +
> +		err = vp_debugfs_create(hv_current_partition_id,
> +					hv_vp_index[idx],
> +					&vp_stats,
> +					vp_dir);
> +		if (err)
> +			goto remove_debugfs_partition_vp;
> +	}
> +
> +	return 0;
> +
> +remove_debugfs_partition_vp:
> +	for_each_online_cpu(i) {
> +		if (i >= idx)
> +			break;
> +		vp_debugfs_remove(hv_current_partition_id, i, NULL);
> +	}
> +	partition_debugfs_remove(hv_current_partition_id, NULL);
> +remove_debugfs_partition:
> +	debugfs_remove_recursive(mshv_debugfs_partition);
> +	return err;
> +}
> +
> +static int hv_stats_show(struct seq_file *m, void *v)
> +{
> +	const struct hv_stats_page *stats = m->private;
> +
> +#define HV_SEQ_PRINTF(cnt)		\
> +	seq_printf(m, "%-25s: %llu\n", __stringify(cnt), stats->hv_cntrs[Hv##cnt])
> +
> +	HV_SEQ_PRINTF(LogicalProcessors);
> +	HV_SEQ_PRINTF(Partitions);
> +	HV_SEQ_PRINTF(TotalPages);
> +	HV_SEQ_PRINTF(VirtualProcessors);
> +	HV_SEQ_PRINTF(MonitoredNotifications);
> +	HV_SEQ_PRINTF(ModernStandbyEntries);
> +	HV_SEQ_PRINTF(PlatformIdleTransitions);
> +	HV_SEQ_PRINTF(HypervisorStartupCost);
> +	HV_SEQ_PRINTF(IOSpacePages);
> +	HV_SEQ_PRINTF(NonEssentialPagesForDump);
> +	HV_SEQ_PRINTF(SubsumedPages);
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(hv_stats);
> +
> +static void mshv_hv_stats_unmap(void)
> +{
> +	union hv_stats_object_identity identity = {
> +		.hv.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	int err;
> +
> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_HYPERVISOR, NULL, &identity);
> +	if (err)
> +		pr_err("%s: failed to unmap hypervisor stats: %d\n",
> +		       __func__, err);
> +}
> +
> +static void * __init mshv_hv_stats_map(void)
> +{
> +	union hv_stats_object_identity identity = {
> +		.hv.stats_area_type = HV_STATS_AREA_SELF,
> +	};
> +	void *stats;
> +	int err;
> +
> +	err = hv_map_stats_page(HV_STATS_OBJECT_HYPERVISOR, &identity, &stats);
> +	if (err) {
> +		pr_err("%s: failed to map hypervisor stats: %d\n",
> +		       __func__, err);
> +		return ERR_PTR(err);
> +	}
> +	return stats;
> +}
> +
> +static int __init mshv_debugfs_hv_stats_create(struct dentry *parent)
> +{
> +	struct dentry *dentry;
> +	u64 *stats;
> +	int err;
> +
> +	stats = mshv_hv_stats_map();
> +	if (IS_ERR(stats))
> +		return PTR_ERR(stats);
> +
> +	dentry = debugfs_create_file("stats", 0400, parent,
> +				     stats, &hv_stats_fops);
> +	if (IS_ERR(dentry)) {
> +		err = PTR_ERR(dentry);
> +		pr_err("%s: failed to create hypervisor stats dentry: %d\n",
> +		       __func__, err);
> +		goto unmap_hv_stats;
> +	}
> +
> +	mshv_lps_count = stats[HvLogicalProcessors];
> +
> +	return 0;
> +
> +unmap_hv_stats:
> +	mshv_hv_stats_unmap();
> +	return err;
> +}
> +
> +int mshv_debugfs_vp_create(struct mshv_vp *vp)
> +{
> +	struct mshv_partition *p = vp->vp_partition;
> +	int err;
> +
> +	if (!mshv_debugfs)
> +		return 0;
> +
> +	err = vp_debugfs_create(p->pt_id, vp->vp_index,
> +				&vp->vp_debugfs_stats_dentry,
> +				p->pt_debugfs_vp_dentry);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +void mshv_debugfs_vp_remove(struct mshv_vp *vp)
> +{
> +	if (!mshv_debugfs)
> +		return;
> +
> +	vp_debugfs_remove(vp->vp_partition->pt_id, vp->vp_index,
> +			  vp->vp_debugfs_stats_dentry);
> +}
> +
> +int mshv_debugfs_partition_create(struct mshv_partition *partition)
> +{
> +	int err;
> +
> +	if (!mshv_debugfs)
> +		return 0;
> +
> +	err = partition_debugfs_create(partition->pt_id,
> +				       &partition->pt_debugfs_vp_dentry,
> +				       &partition->pt_debugfs_stats_dentry,
> +				       mshv_debugfs_partition);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +void mshv_debugfs_partition_remove(struct mshv_partition *partition)
> +{
> +	if (!mshv_debugfs)
> +		return;
> +
> +	partition_debugfs_remove(partition->pt_id,
> +				 partition->pt_debugfs_stats_dentry);
> +}
> +
> +int __init mshv_debugfs_init(void)
> +{
> +	int err;
> +
> +	mshv_debugfs = debugfs_create_dir("mshv", NULL);
> +	if (IS_ERR(mshv_debugfs)) {
> +		pr_err("%s: failed to create debugfs directory\n", __func__);
> +		return PTR_ERR(mshv_debugfs);
> +	}
> +
> +	if (hv_root_partition()) {
> +		err = mshv_debugfs_hv_stats_create(mshv_debugfs);
> +		if (err)
> +			goto remove_mshv_dir;
> +
> +		err = mshv_debugfs_lp_create(mshv_debugfs);
> +		if (err)
> +			goto unmap_hv_stats;
> +	}
> +
> +	err = mshv_debugfs_parent_partition_create();
> +	if (err)
> +		goto unmap_lp_stats;
> +
> +	return 0;
> +
> +unmap_lp_stats:
> +	if (hv_root_partition())
> +		mshv_debugfs_lp_remove();
> +unmap_hv_stats:
> +	if (hv_root_partition())
> +		mshv_hv_stats_unmap();
> +remove_mshv_dir:
> +	debugfs_remove_recursive(mshv_debugfs);
> +	return err;
> +}
> +
> +void mshv_debugfs_exit(void)
> +{
> +	mshv_debugfs_parent_partition_remove();
> +
> +	if (hv_root_partition()) {
> +		mshv_debugfs_lp_remove();
> +		mshv_hv_stats_unmap();
> +	}
> +
> +	debugfs_remove_recursive(mshv_debugfs);
> +}
> diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
> index 3eb815011b46..1f1b1984449b 100644
> --- a/drivers/hv/mshv_root.h
> +++ b/drivers/hv/mshv_root.h
> @@ -51,6 +51,9 @@ struct mshv_vp {
>  		unsigned int kicked_by_hv;
>  		wait_queue_head_t vp_suspend_queue;
>  	} run;
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +	struct dentry *vp_debugfs_stats_dentry;
> +#endif
>  };
> 
>  #define vp_fmt(fmt) "p%lluvp%u: " fmt
> @@ -128,6 +131,10 @@ struct mshv_partition {
>  	u64 isolation_type;
>  	bool import_completed;
>  	bool pt_initialized;
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +	struct dentry *pt_debugfs_stats_dentry;
> +	struct dentry *pt_debugfs_vp_dentry;
> +#endif
>  };
> 
>  #define pt_fmt(fmt) "p%llu: " fmt
> @@ -308,6 +315,33 @@ int hv_call_modify_spa_host_access(u64 partition_id, struct page **pages,
>  int hv_call_get_partition_property_ex(u64 partition_id, u64 property_code, u64 arg,
>  				      void *property_value, size_t property_value_sz);
> 
> +#if IS_ENABLED(CONFIG_DEBUG_FS)
> +int __init mshv_debugfs_init(void);
> +void mshv_debugfs_exit(void);
> +
> +int mshv_debugfs_partition_create(struct mshv_partition *partition);
> +void mshv_debugfs_partition_remove(struct mshv_partition *partition);
> +int mshv_debugfs_vp_create(struct mshv_vp *vp);
> +void mshv_debugfs_vp_remove(struct mshv_vp *vp);
> +#else
> +static inline int __init mshv_debugfs_init(void)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_exit(void) { }
> +
> +static inline int mshv_debugfs_partition_create(struct mshv_partition *partition)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_partition_remove(struct mshv_partition *partition) { }
> +static inline int mshv_debugfs_vp_create(struct mshv_vp *vp)
> +{
> +	return 0;
> +}
> +static inline void mshv_debugfs_vp_remove(struct mshv_vp *vp) { }
> +#endif
> +
>  extern struct mshv_root mshv_root;
>  extern enum hv_scheduler_type hv_scheduler_type;
>  extern u8 * __percpu *hv_synic_eventring_tail;
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 19006b788e85..152fcd9b45e6 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -982,6 +982,10 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
>  	if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
>  		memcpy(vp->vp_stats_pages, stats_pages, sizeof(stats_pages));
> 
> +	ret = mshv_debugfs_vp_create(vp);
> +	if (ret)
> +		goto put_partition;
> +
>  	/*
>  	 * Keep anon_inode_getfd last: it installs fd in the file struct and
>  	 * thus makes the state accessible in user space.
> @@ -989,7 +993,7 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
>  	ret = anon_inode_getfd("mshv_vp", &mshv_vp_fops, vp,
>  			       O_RDWR | O_CLOEXEC);
>  	if (ret < 0)
> -		goto put_partition;
> +		goto remove_debugfs_vp;
> 
>  	/* already exclusive with the partition mutex for all ioctls */
>  	partition->pt_vp_count++;
> @@ -997,6 +1001,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
> 
>  	return ret;
> 
> +remove_debugfs_vp:
> +	mshv_debugfs_vp_remove(vp);
>  put_partition:
>  	mshv_partition_put(partition);
>  free_vp:
> @@ -1556,13 +1562,18 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> 
>  	ret = hv_call_initialize_partition(partition->pt_id);
>  	if (ret)
> -		goto withdraw_mem;
> +		return ret;
> +
> +	ret = mshv_debugfs_partition_create(partition);
> +	if (ret)
> +		goto finalize_partition;
> 
>  	partition->pt_initialized = true;
> 
>  	return 0;
> 
> -withdraw_mem:
> +finalize_partition:
> +	hv_call_finalize_partition(partition->pt_id);
>  	hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
> 
>  	return ret;
> @@ -1741,6 +1752,8 @@ static void destroy_partition(struct mshv_partition *partition)
>  			if (!vp)
>  				continue;
> 
> +			mshv_debugfs_vp_remove(vp);
> +
>  			if (hv_scheduler_type == HV_SCHEDULER_TYPE_ROOT)
>  				mshv_vp_stats_unmap(partition->pt_id, vp->vp_index,
>  						    (void **)vp->vp_stats_pages);
> @@ -1775,6 +1788,8 @@ static void destroy_partition(struct mshv_partition *partition)
>  			partition->pt_vp_array[i] = NULL;
>  		}
> 
> +		mshv_debugfs_partition_remove(partition);
> +
>  		/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
>  		hv_call_finalize_partition(partition->pt_id);
> 
> @@ -2351,10 +2366,14 @@ static int __init mshv_parent_partition_init(void)
> 
>  	mshv_init_vmm_caps(dev);
> 
> -	ret = mshv_irqfd_wq_init();
> +	ret = mshv_debugfs_init();
>  	if (ret)
>  		goto exit_partition;
> 
> +	ret = mshv_irqfd_wq_init();
> +	if (ret)
> +		goto exit_debugfs;
> +
>  	spin_lock_init(&mshv_root.pt_ht_lock);
>  	hash_init(mshv_root.pt_htable);
> 
> @@ -2362,6 +2381,10 @@ static int __init mshv_parent_partition_init(void)
> 
>  	return 0;
> 
> +destroy_irqds_wq:
> +	mshv_irqfd_wq_cleanup();
> +exit_debugfs:
> +	mshv_debugfs_exit();
>  exit_partition:
>  	if (hv_root_partition())
>  		mshv_root_partition_exit();
> @@ -2378,6 +2401,7 @@ static void __exit mshv_parent_partition_exit(void)
>  {
>  	hv_setup_mshv_handler(NULL);
>  	mshv_port_table_fini();
> +	mshv_debugfs_exit();
>  	misc_deregister(&mshv_dev);
>  	mshv_irqfd_wq_cleanup();
>  	if (hv_root_partition())
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/3] mshv: Ignore second stats page map result failure
  2025-12-08 15:12   ` Michael Kelley
@ 2025-12-30  0:27     ` Nuno Das Neves
  2026-01-02 16:27       ` Michael Kelley
  0 siblings, 1 reply; 18+ messages in thread
From: Nuno Das Neves @ 2025-12-30  0:27 UTC (permalink / raw)
  To: Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

On 12/8/2025 7:12 AM, Michael Kelley wrote:
> From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
>>
>> From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
>>
>> Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
>> and return HV_STATUS_INVALID_PARAMETER for the second stats page
>> mapping request.
>>
>> This results in a failure in module init. Instead of failing, gracefully
>> fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
>> already-mapped stats_pages[HV_STATS_AREA_SELF].
> 
> This explains "what" this patch does. But could you add an explanation of "why"
> substituting SELF for the unavailable PARENT is the right thing to do? As a somewhat
> outside reviewer, I don't know enough about SELF vs. PARENT to immediately know
> why this substitution makes sense.
> 
I'll attempt to explain. I'm a little hindered by the fact that, like many of
the root interfaces, this is not well-documented, but this is my understanding:

The stats areas HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT indicate the privilege
level of the data in the mapped stats page.

Both SELF and PARENT contain the same fields, but some fields that are 0 in the
SELF page may be nonzero in the PARENT page, and vice versa. So, to read all the fields
we need to map both pages if possible, and prioritize reading non-zero data from
each field, by checking both the SELF and PARENT pages.

I don't know if it's possible for a given field to have a different (nonzero) value
in both SELF and PARENT pages. I imagine in that case we'd want to prioritize the
PARENT value, but it may simply not be possible.

The API is designed in this way to be backward-compatible with older hypervisors
that didn't have a concept of SELF and PARENT. Hence on older hypervisors (detectable
via the error code), all we can do is map SELF and use it for everything.

> Also, does this patch affect the logic in mshv_vp_dispatch_thread_blocked() where
> a zero value for the SELF version of VpRootDispatchThreadBlocked is replaced by
> the PARENT value? But that logic seems to be in the reverse direction -- replacing
> a missing SELF value with the PARENT value -- whereas this patch is about replacing
> missing PARENT values with SELF values. So are there two separate PARENT vs. SELF
> issues overall? And after this patch is in place and PARENT values are replaced with
> SELF on older hypervisor versions, the logic in mshv_vp_dispatch_thread_blocked()
> then effectively becomes a no-op if the SELF value is zero, and the return value will
> be zero. Is that a problem?
> 
This is the same issue, because we only care about any nonzero value in
mshv_vp_dispatch_thread_blocked(). It doesn't matter which page we check first in that
code, just that any nonzero value is returned as a boolean to indicate a blocked state.

The code in question could be rewritten:

return self_vp_cntrs[VpRootDispatchThreadBlocked] || parent_vp_cntrs[VpRootDispatchThreadBlocked];

>>
>> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
>> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
>> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>> ---
>>  drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
>>  drivers/hv/mshv_root_main.c    |  3 +++
>>  2 files changed, 40 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
>> index 598eaff4ff29..b1770c7b500c 100644
>> --- a/drivers/hv/mshv_root_hv_call.c
>> +++ b/drivers/hv/mshv_root_hv_call.c
>> @@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
>>  	return ret;
>>  }
>>
>> +static int
>> +hv_stats_get_area_type(enum hv_stats_object_type type,
>> +		       const union hv_stats_object_identity *identity)
>> +{
>> +	switch (type) {
>> +	case HV_STATS_OBJECT_HYPERVISOR:
>> +		return identity->hv.stats_area_type;
>> +	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
>> +		return identity->lp.stats_area_type;
>> +	case HV_STATS_OBJECT_PARTITION:
>> +		return identity->partition.stats_area_type;
>> +	case HV_STATS_OBJECT_VP:
>> +		return identity->vp.stats_area_type;
>> +	}
>> +
>> +	return -EINVAL;
>> +}
>> +
>>  static int hv_call_map_stats_page(enum hv_stats_object_type type,
>>  				  const union hv_stats_object_identity *identity,
>>  				  void **addr)
>> @@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>>  	struct hv_input_map_stats_page *input;
>>  	struct hv_output_map_stats_page *output;
>>  	u64 status, pfn;
>> -	int ret = 0;
>> +	int hv_status, ret = 0;
>>
>>  	do {
>>  		local_irq_save(flags);
>> @@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
>>  		pfn = output->map_location;
>>
>>  		local_irq_restore(flags);
>> -		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
>> -			ret = hv_result_to_errno(status);
>> +
>> +		hv_status = hv_result(status);
>> +		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
>>  			if (hv_result_success(status))
>>  				break;
>> -			return ret;
>> +
>> +			/*
>> +			 * Older versions of the hypervisor do not support the
>> +			 * PARENT stats area. In this case return "success" but
>> +			 * set the page to NULL. The caller should check for
>> +			 * this case and instead just use the SELF area.
>> +			 */
>> +			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
>> +			    hv_status == HV_STATUS_INVALID_PARAMETER) {
>> +				*addr = NULL;
>> +				return 0;
>> +			}
>> +
>> +			hv_status_debug(status, "\n");
>> +			return hv_result_to_errno(status);
> 
> Does the hv_call_map_stats_page2() function need a similar fix? Or is there a linkage
> in hypervisor functionality where any hypervisor version that supports an overlay GPFN
> also supports the PARENT stats? If such a linkage is why hv_call_map_stats_page2()
> doesn't need a similar fix, please add a code comment to that effect in
> hv_call_map_stats_page2().
> 
Exactly; hv_call_map_stats_page2() is only available on hypervisors where the PARENT
page is also available. I'll add a comment.

>>  		}
>>
>>  		ret = hv_call_deposit_pages(NUMA_NO_NODE,
>> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
>> index bc15d6f6922f..f59a4ab47685 100644
>> --- a/drivers/hv/mshv_root_main.c
>> +++ b/drivers/hv/mshv_root_main.c
>> @@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
>>  	if (err)
>>  		goto unmap_self;
>>
>> +	if (!stats_pages[HV_STATS_AREA_PARENT])
>> +		stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
>> +
>>  	return 0;
>>
>>  unmap_self:
>> --
>> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/3] mshv: Add definitions for stats pages
  2025-12-08 15:13   ` Michael Kelley
@ 2025-12-30 23:04     ` Nuno Das Neves
  2026-01-02 16:27       ` Michael Kelley
  0 siblings, 1 reply; 18+ messages in thread
From: Nuno Das Neves @ 2025-12-30 23:04 UTC (permalink / raw)
  To: Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

On 12/8/2025 7:13 AM, Michael Kelley wrote:
> From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
>>
>> Add the definitions for hypervisor, logical processor, and partition
>> stats pages.
>>
>> Move the definition for the VP stats page to its rightful place in
>> hvhdk.h, and add the missing members.
>>
>> These enum members retain their CamelCase style, since they are imported
>> directly from the hypervisor code They will be stringified when printing
> 
> Missing a '.' (period) after "hypervisor code".
> 
Ack

>> the stats out, and retain more readability in this form.
>>
>> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
>> ---
>>  drivers/hv/mshv_root_main.c |  17 --
>>  include/hyperv/hvhdk.h      | 437 ++++++++++++++++++++++++++++++++++++
>>  2 files changed, 437 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
>> index f59a4ab47685..19006b788e85 100644
>> --- a/drivers/hv/mshv_root_main.c
>> +++ b/drivers/hv/mshv_root_main.c
>> @@ -38,23 +38,6 @@ MODULE_AUTHOR("Microsoft");
>>  MODULE_LICENSE("GPL");
>>  MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
>>
>> -/* TODO move this to another file when debugfs code is added */
>> -enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
>> -#if defined(CONFIG_X86)
>> -	VpRootDispatchThreadBlocked			= 202,
>> -#elif defined(CONFIG_ARM64)
>> -	VpRootDispatchThreadBlocked			= 94,
>> -#endif
>> -	VpStatsMaxCounter
>> -};
>> -
>> -struct hv_stats_page {
>> -	union {
>> -		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
>> -		u8 data[HV_HYP_PAGE_SIZE];
>> -	};
>> -} __packed;
>> -
>>  struct mshv_root mshv_root;
>>
>>  enum hv_scheduler_type hv_scheduler_type;
>> diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
>> index 469186df7826..51abbcd0ec37 100644
>> --- a/include/hyperv/hvhdk.h
>> +++ b/include/hyperv/hvhdk.h
>> @@ -10,6 +10,443 @@
>>  #include "hvhdk_mini.h"
>>  #include "hvgdk.h"
>>
>> +enum hv_stats_hypervisor_counters {		/* HV_HYPERVISOR_COUNTER */
>> +	HvLogicalProcessors			= 1,
>> +	HvPartitions				= 2,
>> +	HvTotalPages				= 3,
>> +	HvVirtualProcessors			= 4,
>> +	HvMonitoredNotifications		= 5,
>> +	HvModernStandbyEntries			= 6,
>> +	HvPlatformIdleTransitions		= 7,
>> +	HvHypervisorStartupCost			= 8,
>> +	HvIOSpacePages				= 10,
>> +	HvNonEssentialPagesForDump		= 11,
>> +	HvSubsumedPages				= 12,
>> +	HvStatsMaxCounter
>> +};
>> +
>> +enum hv_stats_partition_counters {		/* HV_PROCESS_COUNTER */
>> +	PartitionVirtualProcessors		= 1,
>> +	PartitionTlbSize			= 3,
>> +	PartitionAddressSpaces			= 4,
>> +	PartitionDepositedPages			= 5,
>> +	PartitionGpaPages			= 6,
>> +	PartitionGpaSpaceModifications		= 7,
>> +	PartitionVirtualTlbFlushEntires		= 8,
>> +	PartitionRecommendedTlbSize		= 9,
>> +	PartitionGpaPages4K			= 10,
>> +	PartitionGpaPages2M			= 11,
>> +	PartitionGpaPages1G			= 12,
>> +	PartitionGpaPages512G			= 13,
>> +	PartitionDevicePages4K			= 14,
>> +	PartitionDevicePages2M			= 15,
>> +	PartitionDevicePages1G			= 16,
>> +	PartitionDevicePages512G		= 17,
>> +	PartitionAttachedDevices		= 18,
>> +	PartitionDeviceInterruptMappings	= 19,
>> +	PartitionIoTlbFlushes			= 20,
>> +	PartitionIoTlbFlushCost			= 21,
>> +	PartitionDeviceInterruptErrors		= 22,
>> +	PartitionDeviceDmaErrors		= 23,
>> +	PartitionDeviceInterruptThrottleEvents	= 24,
>> +	PartitionSkippedTimerTicks		= 25,
>> +	PartitionPartitionId			= 26,
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	PartitionNestedTlbSize			= 27,
>> +	PartitionRecommendedNestedTlbSize	= 28,
>> +	PartitionNestedTlbFreeListSize		= 29,
>> +	PartitionNestedTlbTrimmedPages		= 30,
>> +	PartitionPagesShattered			= 31,
>> +	PartitionPagesRecombined		= 32,
>> +	PartitionHwpRequestValue		= 33,
>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	PartitionHwpRequestValue		= 27,
>> +#endif
>> +	PartitionStatsMaxCounter
>> +};
>> +
>> +enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
>> +	VpTotalRunTime					= 1,
>> +	VpHypervisorRunTime				= 2,
>> +	VpRemoteNodeRunTime				= 3,
>> +	VpNormalizedRunTime				= 4,
>> +	VpIdealCpu					= 5,
>> +	VpHypercallsCount				= 7,
>> +	VpHypercallsTime				= 8,
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	VpPageInvalidationsCount			= 9,
>> +	VpPageInvalidationsTime				= 10,
>> +	VpControlRegisterAccessesCount			= 11,
>> +	VpControlRegisterAccessesTime			= 12,
>> +	VpIoInstructionsCount				= 13,
>> +	VpIoInstructionsTime				= 14,
>> +	VpHltInstructionsCount				= 15,
>> +	VpHltInstructionsTime				= 16,
>> +	VpMwaitInstructionsCount			= 17,
>> +	VpMwaitInstructionsTime				= 18,
>> +	VpCpuidInstructionsCount			= 19,
>> +	VpCpuidInstructionsTime				= 20,
>> +	VpMsrAccessesCount				= 21,
>> +	VpMsrAccessesTime				= 22,
>> +	VpOtherInterceptsCount				= 23,
>> +	VpOtherInterceptsTime				= 24,
>> +	VpExternalInterruptsCount			= 25,
>> +	VpExternalInterruptsTime			= 26,
>> +	VpPendingInterruptsCount			= 27,
>> +	VpPendingInterruptsTime				= 28,
>> +	VpEmulatedInstructionsCount			= 29,
>> +	VpEmulatedInstructionsTime			= 30,
>> +	VpDebugRegisterAccessesCount			= 31,
>> +	VpDebugRegisterAccessesTime			= 32,
>> +	VpPageFaultInterceptsCount			= 33,
>> +	VpPageFaultInterceptsTime			= 34,
>> +	VpGuestPageTableMaps				= 35,
>> +	VpLargePageTlbFills				= 36,
>> +	VpSmallPageTlbFills				= 37,
>> +	VpReflectedGuestPageFaults			= 38,
>> +	VpApicMmioAccesses				= 39,
>> +	VpIoInterceptMessages				= 40,
>> +	VpMemoryInterceptMessages			= 41,
>> +	VpApicEoiAccesses				= 42,
>> +	VpOtherMessages					= 43,
>> +	VpPageTableAllocations				= 44,
>> +	VpLogicalProcessorMigrations			= 45,
>> +	VpAddressSpaceEvictions				= 46,
>> +	VpAddressSpaceSwitches				= 47,
>> +	VpAddressDomainFlushes				= 48,
>> +	VpAddressSpaceFlushes				= 49,
>> +	VpGlobalGvaRangeFlushes				= 50,
>> +	VpLocalGvaRangeFlushes				= 51,
>> +	VpPageTableEvictions				= 52,
>> +	VpPageTableReclamations				= 53,
>> +	VpPageTableResets				= 54,
>> +	VpPageTableValidations				= 55,
>> +	VpApicTprAccesses				= 56,
>> +	VpPageTableWriteIntercepts			= 57,
>> +	VpSyntheticInterrupts				= 58,
>> +	VpVirtualInterrupts				= 59,
>> +	VpApicIpisSent					= 60,
>> +	VpApicSelfIpisSent				= 61,
>> +	VpGpaSpaceHypercalls				= 62,
>> +	VpLogicalProcessorHypercalls			= 63,
>> +	VpLongSpinWaitHypercalls			= 64,
>> +	VpOtherHypercalls				= 65,
>> +	VpSyntheticInterruptHypercalls			= 66,
>> +	VpVirtualInterruptHypercalls			= 67,
>> +	VpVirtualMmuHypercalls				= 68,
>> +	VpVirtualProcessorHypercalls			= 69,
>> +	VpHardwareInterrupts				= 70,
>> +	VpNestedPageFaultInterceptsCount		= 71,
>> +	VpNestedPageFaultInterceptsTime			= 72,
>> +	VpPageScans					= 73,
>> +	VpLogicalProcessorDispatches			= 74,
>> +	VpWaitingForCpuTime				= 75,
>> +	VpExtendedHypercalls				= 76,
>> +	VpExtendedHypercallInterceptMessages		= 77,
>> +	VpMbecNestedPageTableSwitches			= 78,
>> +	VpOtherReflectedGuestExceptions			= 79,
>> +	VpGlobalIoTlbFlushes				= 80,
>> +	VpGlobalIoTlbFlushCost				= 81,
>> +	VpLocalIoTlbFlushes				= 82,
>> +	VpLocalIoTlbFlushCost				= 83,
>> +	VpHypercallsForwardedCount			= 84,
>> +	VpHypercallsForwardingTime			= 85,
>> +	VpPageInvalidationsForwardedCount		= 86,
>> +	VpPageInvalidationsForwardingTime		= 87,
>> +	VpControlRegisterAccessesForwardedCount		= 88,
>> +	VpControlRegisterAccessesForwardingTime		= 89,
>> +	VpIoInstructionsForwardedCount			= 90,
>> +	VpIoInstructionsForwardingTime			= 91,
>> +	VpHltInstructionsForwardedCount			= 92,
>> +	VpHltInstructionsForwardingTime			= 93,
>> +	VpMwaitInstructionsForwardedCount		= 94,
>> +	VpMwaitInstructionsForwardingTime		= 95,
>> +	VpCpuidInstructionsForwardedCount		= 96,
>> +	VpCpuidInstructionsForwardingTime		= 97,
>> +	VpMsrAccessesForwardedCount			= 98,
>> +	VpMsrAccessesForwardingTime			= 99,
>> +	VpOtherInterceptsForwardedCount			= 100,
>> +	VpOtherInterceptsForwardingTime			= 101,
>> +	VpExternalInterruptsForwardedCount		= 102,
>> +	VpExternalInterruptsForwardingTime		= 103,
>> +	VpPendingInterruptsForwardedCount		= 104,
>> +	VpPendingInterruptsForwardingTime		= 105,
>> +	VpEmulatedInstructionsForwardedCount		= 106,
>> +	VpEmulatedInstructionsForwardingTime		= 107,
>> +	VpDebugRegisterAccessesForwardedCount		= 108,
>> +	VpDebugRegisterAccessesForwardingTime		= 109,
>> +	VpPageFaultInterceptsForwardedCount		= 110,
>> +	VpPageFaultInterceptsForwardingTime		= 111,
>> +	VpVmclearEmulationCount				= 112,
>> +	VpVmclearEmulationTime				= 113,
>> +	VpVmptrldEmulationCount				= 114,
>> +	VpVmptrldEmulationTime				= 115,
>> +	VpVmptrstEmulationCount				= 116,
>> +	VpVmptrstEmulationTime				= 117,
>> +	VpVmreadEmulationCount				= 118,
>> +	VpVmreadEmulationTime				= 119,
>> +	VpVmwriteEmulationCount				= 120,
>> +	VpVmwriteEmulationTime				= 121,
>> +	VpVmxoffEmulationCount				= 122,
>> +	VpVmxoffEmulationTime				= 123,
>> +	VpVmxonEmulationCount				= 124,
>> +	VpVmxonEmulationTime				= 125,
>> +	VpNestedVMEntriesCount				= 126,
>> +	VpNestedVMEntriesTime				= 127,
>> +	VpNestedSLATSoftPageFaultsCount			= 128,
>> +	VpNestedSLATSoftPageFaultsTime			= 129,
>> +	VpNestedSLATHardPageFaultsCount			= 130,
>> +	VpNestedSLATHardPageFaultsTime			= 131,
>> +	VpInvEptAllContextEmulationCount		= 132,
>> +	VpInvEptAllContextEmulationTime			= 133,
>> +	VpInvEptSingleContextEmulationCount		= 134,
>> +	VpInvEptSingleContextEmulationTime		= 135,
>> +	VpInvVpidAllContextEmulationCount		= 136,
>> +	VpInvVpidAllContextEmulationTime		= 137,
>> +	VpInvVpidSingleContextEmulationCount		= 138,
>> +	VpInvVpidSingleContextEmulationTime		= 139,
>> +	VpInvVpidSingleAddressEmulationCount		= 140,
>> +	VpInvVpidSingleAddressEmulationTime		= 141,
>> +	VpNestedTlbPageTableReclamations		= 142,
>> +	VpNestedTlbPageTableEvictions			= 143,
>> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 144,
>> +	VpFlushGuestPhysicalAddressListHypercalls	= 145,
>> +	VpPostedInterruptNotifications			= 146,
>> +	VpPostedInterruptScans				= 147,
>> +	VpTotalCoreRunTime				= 148,
>> +	VpMaximumRunTime				= 149,
>> +	VpHwpRequestContextSwitches			= 150,
>> +	VpWaitingForCpuTimeBucket0			= 151,
>> +	VpWaitingForCpuTimeBucket1			= 152,
>> +	VpWaitingForCpuTimeBucket2			= 153,
>> +	VpWaitingForCpuTimeBucket3			= 154,
>> +	VpWaitingForCpuTimeBucket4			= 155,
>> +	VpWaitingForCpuTimeBucket5			= 156,
>> +	VpWaitingForCpuTimeBucket6			= 157,
>> +	VpVmloadEmulationCount				= 158,
>> +	VpVmloadEmulationTime				= 159,
>> +	VpVmsaveEmulationCount				= 160,
>> +	VpVmsaveEmulationTime				= 161,
>> +	VpGifInstructionEmulationCount			= 162,
>> +	VpGifInstructionEmulationTime			= 163,
>> +	VpEmulatedErrataSvmInstructions			= 164,
>> +	VpPlaceholder1					= 165,
>> +	VpPlaceholder2					= 166,
>> +	VpPlaceholder3					= 167,
>> +	VpPlaceholder4					= 168,
>> +	VpPlaceholder5					= 169,
>> +	VpPlaceholder6					= 170,
>> +	VpPlaceholder7					= 171,
>> +	VpPlaceholder8					= 172,
>> +	VpPlaceholder9					= 173,
>> +	VpPlaceholder10					= 174,
>> +	VpSchedulingPriority				= 175,
>> +	VpRdpmcInstructionsCount			= 176,
>> +	VpRdpmcInstructionsTime				= 177,
>> +	VpPerfmonPmuMsrAccessesCount			= 178,
>> +	VpPerfmonLbrMsrAccessesCount			= 179,
>> +	VpPerfmonIptMsrAccessesCount			= 180,
>> +	VpPerfmonInterruptCount				= 181,
>> +	VpVtl1DispatchCount				= 182,
>> +	VpVtl2DispatchCount				= 183,
>> +	VpVtl2DispatchBucket0				= 184,
>> +	VpVtl2DispatchBucket1				= 185,
>> +	VpVtl2DispatchBucket2				= 186,
>> +	VpVtl2DispatchBucket3				= 187,
>> +	VpVtl2DispatchBucket4				= 188,
>> +	VpVtl2DispatchBucket5				= 189,
>> +	VpVtl2DispatchBucket6				= 190,
>> +	VpVtl1RunTime					= 191,
>> +	VpVtl2RunTime					= 192,
>> +	VpIommuHypercalls				= 193,
>> +	VpCpuGroupHypercalls				= 194,
>> +	VpVsmHypercalls					= 195,
>> +	VpEventLogHypercalls				= 196,
>> +	VpDeviceDomainHypercalls			= 197,
>> +	VpDepositHypercalls				= 198,
>> +	VpSvmHypercalls					= 199,
>> +	VpBusLockAcquisitionCount			= 200,
>> +	VpUnused					= 201,
>> +	VpRootDispatchThreadBlocked			= 202,
>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	VpSysRegAccessesCount				= 9,
>> +	VpSysRegAccessesTime				= 10,
>> +	VpSmcInstructionsCount				= 11,
>> +	VpSmcInstructionsTime				= 12,
>> +	VpOtherInterceptsCount				= 13,
>> +	VpOtherInterceptsTime				= 14,
>> +	VpExternalInterruptsCount			= 15,
>> +	VpExternalInterruptsTime			= 16,
>> +	VpPendingInterruptsCount			= 17,
>> +	VpPendingInterruptsTime				= 18,
>> +	VpGuestPageTableMaps				= 19,
>> +	VpLargePageTlbFills				= 20,
>> +	VpSmallPageTlbFills				= 21,
>> +	VpReflectedGuestPageFaults			= 22,
>> +	VpMemoryInterceptMessages			= 23,
>> +	VpOtherMessages					= 24,
>> +	VpLogicalProcessorMigrations			= 25,
>> +	VpAddressDomainFlushes				= 26,
>> +	VpAddressSpaceFlushes				= 27,
>> +	VpSyntheticInterrupts				= 28,
>> +	VpVirtualInterrupts				= 29,
>> +	VpApicSelfIpisSent				= 30,
>> +	VpGpaSpaceHypercalls				= 31,
>> +	VpLogicalProcessorHypercalls			= 32,
>> +	VpLongSpinWaitHypercalls			= 33,
>> +	VpOtherHypercalls				= 34,
>> +	VpSyntheticInterruptHypercalls			= 35,
>> +	VpVirtualInterruptHypercalls			= 36,
>> +	VpVirtualMmuHypercalls				= 37,
>> +	VpVirtualProcessorHypercalls			= 38,
>> +	VpHardwareInterrupts				= 39,
>> +	VpNestedPageFaultInterceptsCount		= 40,
>> +	VpNestedPageFaultInterceptsTime			= 41,
>> +	VpLogicalProcessorDispatches			= 42,
>> +	VpWaitingForCpuTime				= 43,
>> +	VpExtendedHypercalls				= 44,
>> +	VpExtendedHypercallInterceptMessages		= 45,
>> +	VpMbecNestedPageTableSwitches			= 46,
>> +	VpOtherReflectedGuestExceptions			= 47,
>> +	VpGlobalIoTlbFlushes				= 48,
>> +	VpGlobalIoTlbFlushCost				= 49,
>> +	VpLocalIoTlbFlushes				= 50,
>> +	VpLocalIoTlbFlushCost				= 51,
>> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 52,
>> +	VpFlushGuestPhysicalAddressListHypercalls	= 53,
>> +	VpPostedInterruptNotifications			= 54,
>> +	VpPostedInterruptScans				= 55,
>> +	VpTotalCoreRunTime				= 56,
>> +	VpMaximumRunTime				= 57,
>> +	VpWaitingForCpuTimeBucket0			= 58,
>> +	VpWaitingForCpuTimeBucket1			= 59,
>> +	VpWaitingForCpuTimeBucket2			= 60,
>> +	VpWaitingForCpuTimeBucket3			= 61,
>> +	VpWaitingForCpuTimeBucket4			= 62,
>> +	VpWaitingForCpuTimeBucket5			= 63,
>> +	VpWaitingForCpuTimeBucket6			= 64,
>> +	VpHwpRequestContextSwitches			= 65,
>> +	VpPlaceholder2					= 66,
>> +	VpPlaceholder3					= 67,
>> +	VpPlaceholder4					= 68,
>> +	VpPlaceholder5					= 69,
>> +	VpPlaceholder6					= 70,
>> +	VpPlaceholder7					= 71,
>> +	VpPlaceholder8					= 72,
>> +	VpContentionTime				= 73,
>> +	VpWakeUpTime					= 74,
>> +	VpSchedulingPriority				= 75,
>> +	VpVtl1DispatchCount				= 76,
>> +	VpVtl2DispatchCount				= 77,
>> +	VpVtl2DispatchBucket0				= 78,
>> +	VpVtl2DispatchBucket1				= 79,
>> +	VpVtl2DispatchBucket2				= 80,
>> +	VpVtl2DispatchBucket3				= 81,
>> +	VpVtl2DispatchBucket4				= 82,
>> +	VpVtl2DispatchBucket5				= 83,
>> +	VpVtl2DispatchBucket6				= 84,
>> +	VpVtl1RunTime					= 85,
>> +	VpVtl2RunTime					= 86,
>> +	VpIommuHypercalls				= 87,
>> +	VpCpuGroupHypercalls				= 88,
>> +	VpVsmHypercalls					= 89,
>> +	VpEventLogHypercalls				= 90,
>> +	VpDeviceDomainHypercalls			= 91,
>> +	VpDepositHypercalls				= 92,
>> +	VpSvmHypercalls					= 93,
>> +	VpLoadAvg					= 94,
>> +	VpRootDispatchThreadBlocked			= 95,
> 
> In current code, VpRootDispatchThreadBlocked on ARM64 is 94. Is that an
> error that is being corrected by this patch?
> 

Hmm, I didn't realize this changed - 95 is the correct value. However,
the mshv driver is not yet supported on ARM64, so this fix has no
impact right now. Do you suggest a separate patch to fix it?

>> +#endif
>> +	VpStatsMaxCounter
>> +};
>> +
>> +enum hv_stats_lp_counters {			/* HV_CPU_COUNTER */
>> +	LpGlobalTime				= 1,
>> +	LpTotalRunTime				= 2,
>> +	LpHypervisorRunTime			= 3,
>> +	LpHardwareInterrupts			= 4,
>> +	LpContextSwitches			= 5,
>> +	LpInterProcessorInterrupts		= 6,
>> +	LpSchedulerInterrupts			= 7,
>> +	LpTimerInterrupts			= 8,
>> +	LpInterProcessorInterruptsSent		= 9,
>> +	LpProcessorHalts			= 10,
>> +	LpMonitorTransitionCost			= 11,
>> +	LpContextSwitchTime			= 12,
>> +	LpC1TransitionsCount			= 13,
>> +	LpC1RunTime				= 14,
>> +	LpC2TransitionsCount			= 15,
>> +	LpC2RunTime				= 16,
>> +	LpC3TransitionsCount			= 17,
>> +	LpC3RunTime				= 18,
>> +	LpRootVpIndex				= 19,
>> +	LpIdleSequenceNumber			= 20,
>> +	LpGlobalTscCount			= 21,
>> +	LpActiveTscCount			= 22,
>> +	LpIdleAccumulation			= 23,
>> +	LpReferenceCycleCount0			= 24,
>> +	LpActualCycleCount0			= 25,
>> +	LpReferenceCycleCount1			= 26,
>> +	LpActualCycleCount1			= 27,
>> +	LpProximityDomainId			= 28,
>> +	LpPostedInterruptNotifications		= 29,
>> +	LpBranchPredictorFlushes		= 30,
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	LpL1DataCacheFlushes			= 31,
>> +	LpImmediateL1DataCacheFlushes		= 32,
>> +	LpMbFlushes				= 33,
>> +	LpCounterRefreshSequenceNumber		= 34,
>> +	LpCounterRefreshReferenceTime		= 35,
>> +	LpIdleAccumulationSnapshot		= 36,
>> +	LpActiveTscCountSnapshot		= 37,
>> +	LpHwpRequestContextSwitches		= 38,
>> +	LpPlaceholder1				= 39,
>> +	LpPlaceholder2				= 40,
>> +	LpPlaceholder3				= 41,
>> +	LpPlaceholder4				= 42,
>> +	LpPlaceholder5				= 43,
>> +	LpPlaceholder6				= 44,
>> +	LpPlaceholder7				= 45,
>> +	LpPlaceholder8				= 46,
>> +	LpPlaceholder9				= 47,
>> +	LpPlaceholder10				= 48,
>> +	LpReserveGroupId			= 49,
>> +	LpRunningPriority			= 50,
>> +	LpPerfmonInterruptCount			= 51,
>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	LpCounterRefreshSequenceNumber		= 31,
>> +	LpCounterRefreshReferenceTime		= 32,
>> +	LpIdleAccumulationSnapshot		= 33,
>> +	LpActiveTscCountSnapshot		= 34,
>> +	LpHwpRequestContextSwitches		= 35,
>> +	LpPlaceholder2				= 36,
>> +	LpPlaceholder3				= 37,
>> +	LpPlaceholder4				= 38,
>> +	LpPlaceholder5				= 39,
>> +	LpPlaceholder6				= 40,
>> +	LpPlaceholder7				= 41,
>> +	LpPlaceholder8				= 42,
>> +	LpPlaceholder9				= 43,
>> +	LpSchLocalRunListSize			= 44,
>> +	LpReserveGroupId			= 45,
>> +	LpRunningPriority			= 46,
>> +#endif
>> +	LpStatsMaxCounter
>> +};
>> +
>> +/*
>> + * Hypervisor statsitics page format
> 
> s/statsitics/statistics/
> 
Ack, thanks

>> + */
>> +struct hv_stats_page {
>> +	union {
>> +		u64 hv_cntrs[HvStatsMaxCounter];		/* Hypervisor counters
>> */
>> +		u64 pt_cntrs[PartitionStatsMaxCounter];		/* Partition
>> counters */
>> +		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
>> +		u64 lp_cntrs[LpStatsMaxCounter];		/* LP counters */
>> +		u8 data[HV_HYP_PAGE_SIZE];
>> +	};
>> +} __packed;
>> +
>>  /* Bits for dirty mask of hv_vp_register_page */
>>  #define HV_X64_REGISTER_CLASS_GENERAL	0
>>  #define HV_X64_REGISTER_CLASS_IP	1
>> --
>> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-08 15:21   ` Michael Kelley
@ 2025-12-31  0:26     ` Nuno Das Neves
  2026-01-02 16:27       ` Michael Kelley
  0 siblings, 1 reply; 18+ messages in thread
From: Nuno Das Neves @ 2025-12-31  0:26 UTC (permalink / raw)
  To: Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com, Jinank Jain

On 12/8/2025 7:21 AM, Michael Kelley wrote:
> From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
>>
>> Introduce a debugfs interface to expose root and child partition stats
>> when running with mshv_root.
>>
>> Create a debugfs directory "mshv" containing 'stats' files organized by
>> type and id. A stats file contains a number of counters depending on
>> its type. e.g. an excerpt from a VP stats file:
>>
>> TotalRunTime                  : 1997602722
>> HypervisorRunTime             : 649671371
>> RemoteNodeRunTime             : 0
>> NormalizedRunTime             : 1997602721
>> IdealCpu                      : 0
>> HypercallsCount               : 1708169
>> HypercallsTime                : 111914774
>> PageInvalidationsCount        : 0
>> PageInvalidationsTime         : 0
>>
>> On a root partition with some active child partitions, the entire
>> directory structure may look like:
>>
>> mshv/
>>   stats             # hypervisor stats
>>   lp/               # logical processors
>>     0/              # LP id
>>       stats         # LP 0 stats
>>     1/
>>     2/
>>     3/
>>   partition/        # partition stats
>>     1/              # root partition id
>>       stats         # root partition stats
>>       vp/           # root virtual processors
>>         0/          # root VP id
>>           stats     # root VP 0 stats
>>         1/
>>         2/
>>         3/
>>     42/             # child partition id
>>       stats         # child partition stats
>>       vp/           # child VPs
>>         0/          # child VP id
>>           stats     # child VP 0 stats
>>         1/
>>     43/
>>     55/
>>
> 
> In the above directory tree, each of the "stats" files is in a directory
> by itself, where the directory name is the number of whatever
> entity the stats are for (lp, partition, or vp). Do you expect there to
> be other files parallel to "stats" that will be added later? Otherwise
> you could collapse one directory level. The "best" directory structure
> is somewhat a matter of taste and judgment, so there's not a "right"
> answer. I don't object if your preference is to keep the numbered
> directories, even if they are likely to never contain more than the
> "stats" file.
> 
Good question. I'm not aware of any plan to add additional parallel files
in the future, but even so, I think this structure is fine as-is.

I see how the VPs and LPs directories could be collapsed, but partitions
need to be directories to contain the VPs, so that would be an
inconsistency (some "stats" files and some "$ID" files) which seems worse
to me. e.g.., are you suggesting something like this?

mshv/
   stats             # hypervisor stats
   lp/               # logical processors
     0               # LP 0 stats 
     1               # LP 1 stats
   partition/        # partition stats directory
     1/              # root partition id
       stats         # root partition stats
       vp/           # root virtual processors
         0           # root VP 0 stats
         1           # root VP 1 stats
     4/              # child partition id
       stats         # child partition stats
       vp/           # child virtual processors
         0           # child VP 0 stats
         1           # child VP 1 stats

Unless I'm misunderstanding what you mean, I think the original is better,
both because it's more consistent and does leave room for adding additional
files if we ever want to.

>> On L1VH, some stats are not present as it does not own the hardware
>> like the root partition does:
>> - The hypervisor and lp stats are not present
>> - L1VH's partition directory is named "self" because it can't get its
>>   own id
>> - Some of L1VH's partition and VP stats fields are not populated, because
>>   it can't map its own HV_STATS_AREA_PARENT page.
>>
>> Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>> Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com>
>> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
>> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
>> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
>> Co-developed-by: Purna Pavan Chandra Aekkaladevi
>> <paekkaladevi@linux.microsoft.com>
>> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
>> Co-developed-by: Jinank Jain <jinankjain@microsoft.com>
>> Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
>> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
>> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
>> ---
>>  drivers/hv/Makefile         |    1 +
>>  drivers/hv/mshv_debugfs.c   | 1122 +++++++++++++++++++++++++++++++++++
>>  drivers/hv/mshv_root.h      |   34 ++
>>  drivers/hv/mshv_root_main.c |   32 +-
>>  4 files changed, 1185 insertions(+), 4 deletions(-)
>>  create mode 100644 drivers/hv/mshv_debugfs.c
>>
>> diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
>> index 58b8d07639f3..36278c936914 100644
>> --- a/drivers/hv/Makefile
>> +++ b/drivers/hv/Makefile
>> @@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
>>  hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
>>  mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
>>  	       mshv_root_hv_call.o mshv_portid_table.o
>> +mshv_root-$(CONFIG_DEBUG_FS) += mshv_debugfs.o
>>  mshv_vtl-y := mshv_vtl_main.o
>>
>>  # Code that must be built-in
>> diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
>> new file mode 100644
>> index 000000000000..581018690a27
>> --- /dev/null
>> +++ b/drivers/hv/mshv_debugfs.c
>> @@ -0,0 +1,1122 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2025, Microsoft Corporation.
>> + *
>> + * The /sys/kernel/debug/mshv directory contents.
>> + * Contains various statistics data, provided by the hypervisor.
>> + *
>> + * Authors: Microsoft Linux virtualization team
>> + */
>> +
>> +#include <linux/debugfs.h>
>> +#include <linux/stringify.h>
>> +#include <asm/mshyperv.h>
>> +#include <linux/slab.h>
>> +
>> +#include "mshv.h"
>> +#include "mshv_root.h"
>> +
>> +#define U32_BUF_SZ 11
>> +#define U64_BUF_SZ 21
>> +
>> +static struct dentry *mshv_debugfs;
>> +static struct dentry *mshv_debugfs_partition;
>> +static struct dentry *mshv_debugfs_lp;
>> +
>> +static u64 mshv_lps_count;
>> +
>> +static bool is_l1vh_parent(u64 partition_id)
>> +{
>> +	return hv_l1vh_partition() && (partition_id == HV_PARTITION_ID_SELF);
>> +}
>> +
>> +static int lp_stats_show(struct seq_file *m, void *v)
>> +{
>> +	const struct hv_stats_page *stats = m->private;
>> +
>> +#define LP_SEQ_PRINTF(cnt)		\
>> +	seq_printf(m, "%-29s: %llu\n", __stringify(cnt), stats->lp_cntrs[Lp##cnt])
>> +
>> +	LP_SEQ_PRINTF(GlobalTime);
>> +	LP_SEQ_PRINTF(TotalRunTime);
>> +	LP_SEQ_PRINTF(HypervisorRunTime);
>> +	LP_SEQ_PRINTF(HardwareInterrupts);
>> +	LP_SEQ_PRINTF(ContextSwitches);
>> +	LP_SEQ_PRINTF(InterProcessorInterrupts);
>> +	LP_SEQ_PRINTF(SchedulerInterrupts);
>> +	LP_SEQ_PRINTF(TimerInterrupts);
>> +	LP_SEQ_PRINTF(InterProcessorInterruptsSent);
>> +	LP_SEQ_PRINTF(ProcessorHalts);
>> +	LP_SEQ_PRINTF(MonitorTransitionCost);
>> +	LP_SEQ_PRINTF(ContextSwitchTime);
>> +	LP_SEQ_PRINTF(C1TransitionsCount);
>> +	LP_SEQ_PRINTF(C1RunTime);
>> +	LP_SEQ_PRINTF(C2TransitionsCount);
>> +	LP_SEQ_PRINTF(C2RunTime);
>> +	LP_SEQ_PRINTF(C3TransitionsCount);
>> +	LP_SEQ_PRINTF(C3RunTime);
>> +	LP_SEQ_PRINTF(RootVpIndex);
>> +	LP_SEQ_PRINTF(IdleSequenceNumber);
>> +	LP_SEQ_PRINTF(GlobalTscCount);
>> +	LP_SEQ_PRINTF(ActiveTscCount);
>> +	LP_SEQ_PRINTF(IdleAccumulation);
>> +	LP_SEQ_PRINTF(ReferenceCycleCount0);
>> +	LP_SEQ_PRINTF(ActualCycleCount0);
>> +	LP_SEQ_PRINTF(ReferenceCycleCount1);
>> +	LP_SEQ_PRINTF(ActualCycleCount1);
>> +	LP_SEQ_PRINTF(ProximityDomainId);
>> +	LP_SEQ_PRINTF(PostedInterruptNotifications);
>> +	LP_SEQ_PRINTF(BranchPredictorFlushes);
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	LP_SEQ_PRINTF(L1DataCacheFlushes);
>> +	LP_SEQ_PRINTF(ImmediateL1DataCacheFlushes);
>> +	LP_SEQ_PRINTF(MbFlushes);
>> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
>> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
>> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
>> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
>> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
>> +	LP_SEQ_PRINTF(Placeholder1);
>> +	LP_SEQ_PRINTF(Placeholder2);
>> +	LP_SEQ_PRINTF(Placeholder3);
>> +	LP_SEQ_PRINTF(Placeholder4);
>> +	LP_SEQ_PRINTF(Placeholder5);
>> +	LP_SEQ_PRINTF(Placeholder6);
>> +	LP_SEQ_PRINTF(Placeholder7);
>> +	LP_SEQ_PRINTF(Placeholder8);
>> +	LP_SEQ_PRINTF(Placeholder9);
>> +	LP_SEQ_PRINTF(Placeholder10);
>> +	LP_SEQ_PRINTF(ReserveGroupId);
>> +	LP_SEQ_PRINTF(RunningPriority);
>> +	LP_SEQ_PRINTF(PerfmonInterruptCount);
>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
>> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
>> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
>> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
>> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
>> +	LP_SEQ_PRINTF(Placeholder2);
>> +	LP_SEQ_PRINTF(Placeholder3);
>> +	LP_SEQ_PRINTF(Placeholder4);
>> +	LP_SEQ_PRINTF(Placeholder5);
>> +	LP_SEQ_PRINTF(Placeholder6);
>> +	LP_SEQ_PRINTF(Placeholder7);
>> +	LP_SEQ_PRINTF(Placeholder8);
>> +	LP_SEQ_PRINTF(Placeholder9);
>> +	LP_SEQ_PRINTF(SchLocalRunListSize);
>> +	LP_SEQ_PRINTF(ReserveGroupId);
>> +	LP_SEQ_PRINTF(RunningPriority);
>> +#endif
>> +
>> +	return 0;
>> +}
>> +DEFINE_SHOW_ATTRIBUTE(lp_stats);
>> +
>> +static void mshv_lp_stats_unmap(u32 lp_index, void *stats_page_addr)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.lp.lp_index = lp_index,
>> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
>> +	};
>> +	int err;
>> +
>> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR,
>> +				  stats_page_addr, &identity);
>> +	if (err)
>> +		pr_err("%s: failed to unmap logical processor %u stats, err: %d\n",
>> +		       __func__, lp_index, err);
>> +}
>> +
>> +static void __init *mshv_lp_stats_map(u32 lp_index)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.lp.lp_index = lp_index,
>> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
>> +	};
>> +	void *stats;
>> +	int err;
>> +
>> +	err = hv_map_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR, &identity,
>> +				&stats);
>> +	if (err) {
>> +		pr_err("%s: failed to map logical processor %u stats, err: %d\n",
>> +		       __func__, lp_index, err);
>> +		return ERR_PTR(err);
>> +	}
>> +
>> +	return stats;
>> +}
>> +
>> +static void __init *lp_debugfs_stats_create(u32 lp_index, struct dentry *parent)
>> +{
>> +	struct dentry *dentry;
>> +	void *stats;
>> +
>> +	stats = mshv_lp_stats_map(lp_index);
>> +	if (IS_ERR(stats))
>> +		return stats;
>> +
>> +	dentry = debugfs_create_file("stats", 0400, parent,
>> +				     stats, &lp_stats_fops);
>> +	if (IS_ERR(dentry)) {
>> +		mshv_lp_stats_unmap(lp_index, stats);
>> +		return dentry;
>> +	}
>> +	return stats;
>> +}
>> +
>> +static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
>> +{
>> +	struct dentry *idx;
>> +	char lp_idx_str[U32_BUF_SZ];
>> +	void *stats;
>> +	int err;
>> +
>> +	sprintf(lp_idx_str, "%u", lp_index);
>> +
>> +	idx = debugfs_create_dir(lp_idx_str, parent);
>> +	if (IS_ERR(idx))
>> +		return PTR_ERR(idx);
>> +
>> +	stats = lp_debugfs_stats_create(lp_index, idx);
>> +	if (IS_ERR(stats)) {
>> +		err = PTR_ERR(stats);
>> +		goto remove_debugfs_lp_idx;
>> +	}
>> +
>> +	return 0;
>> +
>> +remove_debugfs_lp_idx:
>> +	debugfs_remove_recursive(idx);
>> +	return err;
>> +}
>> +
>> +static void mshv_debugfs_lp_remove(void)
>> +{
>> +	int lp_index;
>> +
>> +	debugfs_remove_recursive(mshv_debugfs_lp);
>> +
>> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
>> +		mshv_lp_stats_unmap(lp_index, NULL);
> 
> Passing NULL as the second argument here leaks the stats page
> memory if Linux allocated the page as an overlay GPFN. But is that
> considered OK because the debugfs entries for LPs are removed
> only when the root partition is shutting down? That works as
> long as hot-add/remove of CPUs isn't supported in the root
> partition.
> 
Hmm, at the very least this appears to be a memory leak if the mshv
driver is built as a module and removed and reinserted. The stats
pages can be mapped multiple times, so each reload will just allocate
a fresh page (on L1VH anyway) and remap it. I will check and fix it in
this patch.

>> +}
>> +
>> +static int __init mshv_debugfs_lp_create(struct dentry *parent)
>> +{
>> +	struct dentry *lp_dir;
>> +	int err, lp_index;
>> +
>> +	lp_dir = debugfs_create_dir("lp", parent);
>> +	if (IS_ERR(lp_dir))
>> +		return PTR_ERR(lp_dir);
>> +
>> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
>> +		err = lp_debugfs_create(lp_index, lp_dir);
>> +		if (err)
>> +			goto remove_debugfs_lps;
>> +	}
>> +
>> +	mshv_debugfs_lp = lp_dir;
>> +
>> +	return 0;
>> +
>> +remove_debugfs_lps:
>> +	for (lp_index -= 1; lp_index >= 0; lp_index--)
>> +		mshv_lp_stats_unmap(lp_index, NULL);
>> +	debugfs_remove_recursive(lp_dir);
>> +	return err;
>> +}
>> +
>> +static int vp_stats_show(struct seq_file *m, void *v)
>> +{
>> +	const struct hv_stats_page **pstats = m->private;
>> +
>> +#define VP_SEQ_PRINTF(cnt)				 \
>> +do {								 \
>> +	if (pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]) \
>> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
>> +			pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
>> +	else \
>> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
>> +			pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
>> +} while (0)
> 
> I don't understand this logic. Like in mshv_vp_dispatch_thread_blocked(), if
> the SELF value is zero, then the PARENT value is used. The implication is that
> you never want to display a SELF value of zero, which is a bit unexpected
> since I could imagine zero being valid for some counters. But the overall result
> is that the displayed values may be a mix of SELF and PARENT values.

Yes, the basic idea is: display a nonzero value if there is one on either
the SELF or PARENT page. (I *think* the values will always be the same
when they are nonzero.)

I admit it's not an ideal design from my perspective. As far as I know, it was
done this way to retain backward compatibility with hypervisors that don't support
the concept of a PARENT stats area at all.

> And of course after Patch 1 of this series, if running on an older hypervisor
> that doesn't provide PARENT, then SELF will be used anyway, which further
> muddies what's going on here, at least for me. :-)
> 

Yes, but in the end we need to check both pages, so there's no avoiding this
redundant check on old hypervisors without adding a separate code path just for
that case, which doesn't seem worth it.

> If this is the correct behavior, please add some code comments as to
> why it makes sense, including in the case where PARENT isn't available.
> 

Ok, will do.

>> +
>> +	VP_SEQ_PRINTF(TotalRunTime);
>> +	VP_SEQ_PRINTF(HypervisorRunTime);
>> +	VP_SEQ_PRINTF(RemoteNodeRunTime);
>> +	VP_SEQ_PRINTF(NormalizedRunTime);
>> +	VP_SEQ_PRINTF(IdealCpu);
>> +	VP_SEQ_PRINTF(HypercallsCount);
>> +	VP_SEQ_PRINTF(HypercallsTime);
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	VP_SEQ_PRINTF(PageInvalidationsCount);
>> +	VP_SEQ_PRINTF(PageInvalidationsTime);
>> +	VP_SEQ_PRINTF(ControlRegisterAccessesCount);
>> +	VP_SEQ_PRINTF(ControlRegisterAccessesTime);
>> +	VP_SEQ_PRINTF(IoInstructionsCount);
>> +	VP_SEQ_PRINTF(IoInstructionsTime);
>> +	VP_SEQ_PRINTF(HltInstructionsCount);
>> +	VP_SEQ_PRINTF(HltInstructionsTime);
>> +	VP_SEQ_PRINTF(MwaitInstructionsCount);
>> +	VP_SEQ_PRINTF(MwaitInstructionsTime);
>> +	VP_SEQ_PRINTF(CpuidInstructionsCount);
>> +	VP_SEQ_PRINTF(CpuidInstructionsTime);
>> +	VP_SEQ_PRINTF(MsrAccessesCount);
>> +	VP_SEQ_PRINTF(MsrAccessesTime);
>> +	VP_SEQ_PRINTF(OtherInterceptsCount);
>> +	VP_SEQ_PRINTF(OtherInterceptsTime);
>> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
>> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
>> +	VP_SEQ_PRINTF(PendingInterruptsCount);
>> +	VP_SEQ_PRINTF(PendingInterruptsTime);
>> +	VP_SEQ_PRINTF(EmulatedInstructionsCount);
>> +	VP_SEQ_PRINTF(EmulatedInstructionsTime);
>> +	VP_SEQ_PRINTF(DebugRegisterAccessesCount);
>> +	VP_SEQ_PRINTF(DebugRegisterAccessesTime);
>> +	VP_SEQ_PRINTF(PageFaultInterceptsCount);
>> +	VP_SEQ_PRINTF(PageFaultInterceptsTime);
>> +	VP_SEQ_PRINTF(GuestPageTableMaps);
>> +	VP_SEQ_PRINTF(LargePageTlbFills);
>> +	VP_SEQ_PRINTF(SmallPageTlbFills);
>> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
>> +	VP_SEQ_PRINTF(ApicMmioAccesses);
>> +	VP_SEQ_PRINTF(IoInterceptMessages);
>> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
>> +	VP_SEQ_PRINTF(ApicEoiAccesses);
>> +	VP_SEQ_PRINTF(OtherMessages);
>> +	VP_SEQ_PRINTF(PageTableAllocations);
>> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
>> +	VP_SEQ_PRINTF(AddressSpaceEvictions);
>> +	VP_SEQ_PRINTF(AddressSpaceSwitches);
>> +	VP_SEQ_PRINTF(AddressDomainFlushes);
>> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
>> +	VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
>> +	VP_SEQ_PRINTF(LocalGvaRangeFlushes);
>> +	VP_SEQ_PRINTF(PageTableEvictions);
>> +	VP_SEQ_PRINTF(PageTableReclamations);
>> +	VP_SEQ_PRINTF(PageTableResets);
>> +	VP_SEQ_PRINTF(PageTableValidations);
>> +	VP_SEQ_PRINTF(ApicTprAccesses);
>> +	VP_SEQ_PRINTF(PageTableWriteIntercepts);
>> +	VP_SEQ_PRINTF(SyntheticInterrupts);
>> +	VP_SEQ_PRINTF(VirtualInterrupts);
>> +	VP_SEQ_PRINTF(ApicIpisSent);
>> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
>> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
>> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
>> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
>> +	VP_SEQ_PRINTF(OtherHypercalls);
>> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
>> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
>> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
>> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
>> +	VP_SEQ_PRINTF(HardwareInterrupts);
>> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
>> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
>> +	VP_SEQ_PRINTF(PageScans);
>> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
>> +	VP_SEQ_PRINTF(WaitingForCpuTime);
>> +	VP_SEQ_PRINTF(ExtendedHypercalls);
>> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
>> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
>> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
>> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
>> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
>> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
>> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
>> +	VP_SEQ_PRINTF(HypercallsForwardedCount);
>> +	VP_SEQ_PRINTF(HypercallsForwardingTime);
>> +	VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
>> +	VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
>> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
>> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
>> +	VP_SEQ_PRINTF(IoInstructionsForwardedCount);
>> +	VP_SEQ_PRINTF(IoInstructionsForwardingTime);
>> +	VP_SEQ_PRINTF(HltInstructionsForwardedCount);
>> +	VP_SEQ_PRINTF(HltInstructionsForwardingTime);
>> +	VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
>> +	VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
>> +	VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
>> +	VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
>> +	VP_SEQ_PRINTF(MsrAccessesForwardedCount);
>> +	VP_SEQ_PRINTF(MsrAccessesForwardingTime);
>> +	VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
>> +	VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
>> +	VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
>> +	VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
>> +	VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
>> +	VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
>> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
>> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
>> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
>> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
>> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
>> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
>> +	VP_SEQ_PRINTF(VmclearEmulationCount);
>> +	VP_SEQ_PRINTF(VmclearEmulationTime);
>> +	VP_SEQ_PRINTF(VmptrldEmulationCount);
>> +	VP_SEQ_PRINTF(VmptrldEmulationTime);
>> +	VP_SEQ_PRINTF(VmptrstEmulationCount);
>> +	VP_SEQ_PRINTF(VmptrstEmulationTime);
>> +	VP_SEQ_PRINTF(VmreadEmulationCount);
>> +	VP_SEQ_PRINTF(VmreadEmulationTime);
>> +	VP_SEQ_PRINTF(VmwriteEmulationCount);
>> +	VP_SEQ_PRINTF(VmwriteEmulationTime);
>> +	VP_SEQ_PRINTF(VmxoffEmulationCount);
>> +	VP_SEQ_PRINTF(VmxoffEmulationTime);
>> +	VP_SEQ_PRINTF(VmxonEmulationCount);
>> +	VP_SEQ_PRINTF(VmxonEmulationTime);
>> +	VP_SEQ_PRINTF(NestedVMEntriesCount);
>> +	VP_SEQ_PRINTF(NestedVMEntriesTime);
>> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
>> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
>> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
>> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
>> +	VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
>> +	VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
>> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
>> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
>> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
>> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
>> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
>> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
>> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
>> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
>> +	VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
>> +	VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
>> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
>> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
>> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
>> +	VP_SEQ_PRINTF(PostedInterruptScans);
>> +	VP_SEQ_PRINTF(TotalCoreRunTime);
>> +	VP_SEQ_PRINTF(MaximumRunTime);
>> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
>> +	VP_SEQ_PRINTF(VmloadEmulationCount);
>> +	VP_SEQ_PRINTF(VmloadEmulationTime);
>> +	VP_SEQ_PRINTF(VmsaveEmulationCount);
>> +	VP_SEQ_PRINTF(VmsaveEmulationTime);
>> +	VP_SEQ_PRINTF(GifInstructionEmulationCount);
>> +	VP_SEQ_PRINTF(GifInstructionEmulationTime);
>> +	VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
>> +	VP_SEQ_PRINTF(Placeholder1);
>> +	VP_SEQ_PRINTF(Placeholder2);
>> +	VP_SEQ_PRINTF(Placeholder3);
>> +	VP_SEQ_PRINTF(Placeholder4);
>> +	VP_SEQ_PRINTF(Placeholder5);
>> +	VP_SEQ_PRINTF(Placeholder6);
>> +	VP_SEQ_PRINTF(Placeholder7);
>> +	VP_SEQ_PRINTF(Placeholder8);
>> +	VP_SEQ_PRINTF(Placeholder9);
>> +	VP_SEQ_PRINTF(Placeholder10);
>> +	VP_SEQ_PRINTF(SchedulingPriority);
>> +	VP_SEQ_PRINTF(RdpmcInstructionsCount);
>> +	VP_SEQ_PRINTF(RdpmcInstructionsTime);
>> +	VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
>> +	VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
>> +	VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
>> +	VP_SEQ_PRINTF(PerfmonInterruptCount);
>> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
>> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
>> +	VP_SEQ_PRINTF(Vtl1RunTime);
>> +	VP_SEQ_PRINTF(Vtl2RunTime);
>> +	VP_SEQ_PRINTF(IommuHypercalls);
>> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
>> +	VP_SEQ_PRINTF(VsmHypercalls);
>> +	VP_SEQ_PRINTF(EventLogHypercalls);
>> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
>> +	VP_SEQ_PRINTF(DepositHypercalls);
>> +	VP_SEQ_PRINTF(SvmHypercalls);
>> +	VP_SEQ_PRINTF(BusLockAcquisitionCount);
> 
> The x86 VpUnused counter is not shown. Any reason for that? All the
> Placeholder counters *are* shown, so I'm just wondering what's
> different.
> 

Good question. I believe when this code was written, VpUnused was
actually undefined in our headers, because the value 201 was
temporarily used for VpRootDispatchThreadBlocked before that was
changed to 202 (the hypervisor version using 201 was never released
publicly, so this wasn't considered a breaking change).

Checking the code, 201 now refers to VpLoadAvg on x86 so I will
update the definitions in patch #2 of this series to include that,
and add it here in the debugfs code.

>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	VP_SEQ_PRINTF(SysRegAccessesCount);
>> +	VP_SEQ_PRINTF(SysRegAccessesTime);
>> +	VP_SEQ_PRINTF(SmcInstructionsCount);
>> +	VP_SEQ_PRINTF(SmcInstructionsTime);
>> +	VP_SEQ_PRINTF(OtherInterceptsCount);
>> +	VP_SEQ_PRINTF(OtherInterceptsTime);
>> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
>> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
>> +	VP_SEQ_PRINTF(PendingInterruptsCount);
>> +	VP_SEQ_PRINTF(PendingInterruptsTime);
>> +	VP_SEQ_PRINTF(GuestPageTableMaps);
>> +	VP_SEQ_PRINTF(LargePageTlbFills);
>> +	VP_SEQ_PRINTF(SmallPageTlbFills);
>> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
>> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
>> +	VP_SEQ_PRINTF(OtherMessages);
>> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
>> +	VP_SEQ_PRINTF(AddressDomainFlushes);
>> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
>> +	VP_SEQ_PRINTF(SyntheticInterrupts);
>> +	VP_SEQ_PRINTF(VirtualInterrupts);
>> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
>> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
>> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
>> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
>> +	VP_SEQ_PRINTF(OtherHypercalls);
>> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
>> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
>> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
>> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
>> +	VP_SEQ_PRINTF(HardwareInterrupts);
>> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
>> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
>> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
>> +	VP_SEQ_PRINTF(WaitingForCpuTime);
>> +	VP_SEQ_PRINTF(ExtendedHypercalls);
>> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
>> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
>> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
>> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
>> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
>> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
>> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
>> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
>> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
>> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
>> +	VP_SEQ_PRINTF(PostedInterruptScans);
>> +	VP_SEQ_PRINTF(TotalCoreRunTime);
>> +	VP_SEQ_PRINTF(MaximumRunTime);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
>> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
>> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
>> +	VP_SEQ_PRINTF(Placeholder2);
>> +	VP_SEQ_PRINTF(Placeholder3);
>> +	VP_SEQ_PRINTF(Placeholder4);
>> +	VP_SEQ_PRINTF(Placeholder5);
>> +	VP_SEQ_PRINTF(Placeholder6);
>> +	VP_SEQ_PRINTF(Placeholder7);
>> +	VP_SEQ_PRINTF(Placeholder8);
>> +	VP_SEQ_PRINTF(ContentionTime);
>> +	VP_SEQ_PRINTF(WakeUpTime);
>> +	VP_SEQ_PRINTF(SchedulingPriority);
>> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
>> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
>> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
>> +	VP_SEQ_PRINTF(Vtl1RunTime);
>> +	VP_SEQ_PRINTF(Vtl2RunTime);
>> +	VP_SEQ_PRINTF(IommuHypercalls);
>> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
>> +	VP_SEQ_PRINTF(VsmHypercalls);
>> +	VP_SEQ_PRINTF(EventLogHypercalls);
>> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
>> +	VP_SEQ_PRINTF(DepositHypercalls);
>> +	VP_SEQ_PRINTF(SvmHypercalls);
> 
> The ARM64 VpLoadAvg counter is not shown?  Any reason why?
> 

I'm not sure, but it could be related to the reasoning in the above
comment - likely VpLoadAvg didn't exist before. I will add it.

>> +#endif
> 
> The VpRootDispatchThreadBlocked counter is not shown for either
> x86 or ARM64. Is that intentional, and if so, why? I know the counter
> is used in mshv_vp_dispatch_thread_blocked(), but it's not clear why
> that means it shouldn't be shown here.
> 

VpRootDispatchThreadBlocked is not really a 'stat' that you might want
to expose like the other values; it's really a boolean control value
that was tacked onto the vp stats page to facilitate the fast interrupt
injection used by the root scheduler. As such it isn't of much value to
userspace.

>> +
>> +	return 0;
>> +}
> 
> This function, vp_stats_show(), seems like a candidate for redoing based on a
> static table that lists the counter names and index. Then the code just loops
> through the table. On x86 each VP_SEQ_PRINTF() generates 42 bytes of code,
> and there are 199 entries, so 8358 bytes. The table entries would probably
> be 16 bytes each (a 64-bit pointer to the string constant, a 32-bit index value,
> and 4 bytes of padding so each entry is 8-byte aligned). The actual space
> saving isn't that large, but the code would be a lot more compact. The
> other *_stats_shows() functions could do the same.
> 
> It's distasteful to me to see 420 lines of enum entries in Patch 2 of this series,
> then followed by another 420 lines of matching *_SEQ_PRINTF entries. But I
> realize that the goal of the enum entries is to match the Windows code, so I
> guess it is what it is. But there's an argument for ditching the enum entries
> entirely, and using the putative static table to capture the information. It
> doesn't seem like matching the Windows code is saving much sync effort
> since any additions/ subtractions to the enum entries need to be matched
> with changes in the *_stats_show() functions, or in my putative static table.
> But I guess if Windows changed only the value for an enum entry without
> additions/subtractions, that would sync more easily.
> 

Keeping the definitions as close to the Windows code as possible is a high
priority, for consistency and to hopefully partially automate that process in
the future. So, I'm against throwing away the enum values. The downside of
having to update two code locations when adding a new enum member is fine
by me.

I'm not against replacing this sequence of macros with a loop over a table like
the one you propose (in addition to keeping the enum values). That would save
some space as you point out above, but the impact is fairly minimal.

In terms of aesthetics, the definition of a table would look very similar to
the list of VP_SEQ_PRINTF() calls that is currently here. So all in all, I don't
see a strong reason to switch to a table, unless the space issue is more
important than I realize.
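For what it's worth, a table-driven version along the lines you suggest might look roughly like this userspace sketch (the struct name, counter names, and indices here are illustrative only; the real code would use seq_printf() and the Vp* enum values):

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical table-driven replacement for the run of VP_SEQ_PRINTF()
 * invocations: one small entry per counter, plus a single print loop. */
struct vp_stat_entry {
	const char *name;   /* counter name shown in debugfs */
	uint32_t index;     /* index into the stats page counter array */
};

static const struct vp_stat_entry vp_stats_table[] = {
	{ "TotalRunTime",      0 },  /* illustrative indices, not the real enum */
	{ "HypervisorRunTime", 1 },
	{ "HypercallsCount",   5 },
};

static void vp_stats_print(const uint64_t *cntrs)
{
	size_t i;

	for (i = 0; i < sizeof(vp_stats_table) / sizeof(vp_stats_table[0]); i++)
		printf("%-30s: %" PRIu64 "\n", vp_stats_table[i].name,
		       cntrs[vp_stats_table[i].index]);
}
```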

> I'm just throwing this out as a thought. You may prefer to keep everything
> "as is", in which case ignore my comment and I won't raise it again.
> 

Thanks, feel free to follow up if you have further thoughts on this part, I'm
open to changing it if there's a reason. Right now it feels like mainly an
aesthetics/cleanliness argument and I'm not sure it's worth the effort.

>> +DEFINE_SHOW_ATTRIBUTE(vp_stats);
>> +
>> +static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index, void *stats_page_addr,
>> +				enum hv_stats_area_type stats_area_type)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.vp.partition_id = partition_id,
>> +		.vp.vp_index = vp_index,
>> +		.vp.stats_area_type = stats_area_type,
>> +	};
>> +	int err;
>> +
>> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_VP, stats_page_addr, &identity);
>> +	if (err)
>> +		pr_err("%s: failed to unmap partition %llu vp %u %s stats, err: %d\n",
>> +		       __func__, partition_id, vp_index,
>> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
>> +		       err);
>> +}
>> +
>> +static void *mshv_vp_stats_map(u64 partition_id, u32 vp_index,
>> +			       enum hv_stats_area_type stats_area_type)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.vp.partition_id = partition_id,
>> +		.vp.vp_index = vp_index,
>> +		.vp.stats_area_type = stats_area_type,
>> +	};
>> +	void *stats;
>> +	int err;
>> +
>> +	err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity, &stats);
>> +	if (err) {
>> +		pr_err("%s: failed to map partition %llu vp %u %s stats, err: %d\n",
>> +		       __func__, partition_id, vp_index,
>> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
>> +		       err);
>> +		return ERR_PTR(err);
>> +	}
>> +	return stats;
>> +}
> 
> Presumably you've noticed that the functions mshv_vp_stats_map() and
> mshv_vp_stats_unmap() also exist in mshv_root_main.c.  They are static
> functions in both places, so the compiler & linker do the right thing, but
> it sure does make things a bit more complex for human readers. The versions
> here follow a consistent pattern for (lp, vp, hv, partition), so maybe the ones
> in mshv_root_main.c could be renamed to avoid confusion?
> 

Good point - this is being addressed in our internal tree but hasn't made it into
this patch set. I will consider squashing that into a later version of this set,
but for now I'm treating it as a future cleanup patch to send later.

>> +
>> +static int vp_debugfs_stats_create(u64 partition_id, u32 vp_index,
>> +				   struct dentry **vp_stats_ptr,
>> +				   struct dentry *parent)
>> +{
>> +	struct dentry *dentry;
>> +	struct hv_stats_page **pstats;
>> +	int err;
>> +
>> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> 
> Open coding "2" as the first parameter makes assumptions about the values of
> HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT.  Should use
> HV_STATS_AREA_COUNT instead of "2" so that indexing into the array is certain
> to work.
> 

Thanks, I'll change it to use HV_STATS_AREA_COUNT.

>> +	if (!pstats)
>> +		return -ENOMEM;
>> +
>> +	pstats[HV_STATS_AREA_SELF] = mshv_vp_stats_map(partition_id, vp_index,
>> +						       HV_STATS_AREA_SELF);
>> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
>> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
>> +		goto cleanup;
>> +	}
>> +
>> +	/*
>> +	 * L1VH partition cannot access its vp stats in parent area.
>> +	 */
>> +	if (is_l1vh_parent(partition_id)) {
>> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
>> +	} else {
>> +		pstats[HV_STATS_AREA_PARENT] = mshv_vp_stats_map(
>> +			partition_id, vp_index, HV_STATS_AREA_PARENT);
>> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
>> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
>> +			goto unmap_self;
>> +		}
>> +		if (!pstats[HV_STATS_AREA_PARENT])
>> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
>> +	}
>> +
>> +	dentry = debugfs_create_file("stats", 0400, parent,
>> +				     pstats, &vp_stats_fops);
>> +	if (IS_ERR(dentry)) {
>> +		err = PTR_ERR(dentry);
>> +		goto unmap_vp_stats;
>> +	}
>> +
>> +	*vp_stats_ptr = dentry;
>> +	return 0;
>> +
>> +unmap_vp_stats:
>> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
>> +		mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_PARENT],
>> +				    HV_STATS_AREA_PARENT);
>> +unmap_self:
>> +	mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_SELF],
>> +			    HV_STATS_AREA_SELF);
>> +cleanup:
>> +	kfree(pstats);
>> +	return err;
>> +}
>> +
>> +static void vp_debugfs_remove(u64 partition_id, u32 vp_index,
>> +			      struct dentry *vp_stats)
>> +{
>> +	struct hv_stats_page **pstats = NULL;
>> +	void *stats;
>> +
>> +	pstats = vp_stats->d_inode->i_private;
>> +	debugfs_remove_recursive(vp_stats->d_parent);
>> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
>> +		stats = pstats[HV_STATS_AREA_PARENT];
>> +		mshv_vp_stats_unmap(partition_id, vp_index, stats,
>> +				    HV_STATS_AREA_PARENT);
>> +	}
>> +
>> +	stats = pstats[HV_STATS_AREA_SELF];
>> +	mshv_vp_stats_unmap(partition_id, vp_index, stats, HV_STATS_AREA_SELF);
>> +
>> +	kfree(pstats);
>> +}
>> +
>> +static int vp_debugfs_create(u64 partition_id, u32 vp_index,
>> +			     struct dentry **vp_stats_ptr,
>> +			     struct dentry *parent)
>> +{
>> +	struct dentry *vp_idx_dir;
>> +	char vp_idx_str[U32_BUF_SZ];
>> +	int err;
>> +
>> +	sprintf(vp_idx_str, "%u", vp_index);
>> +
>> +	vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
>> +	if (IS_ERR(vp_idx_dir))
>> +		return PTR_ERR(vp_idx_dir);
>> +
>> +	err = vp_debugfs_stats_create(partition_id, vp_index, vp_stats_ptr,
>> +				      vp_idx_dir);
>> +	if (err)
>> +		goto remove_debugfs_vp_idx;
>> +
>> +	return 0;
>> +
>> +remove_debugfs_vp_idx:
>> +	debugfs_remove_recursive(vp_idx_dir);
>> +	return err;
>> +}
>> +
>> +static int partition_stats_show(struct seq_file *m, void *v)
>> +{
>> +	const struct hv_stats_page **pstats = m->private;
>> +
>> +#define PARTITION_SEQ_PRINTF(cnt)				 \
>> +do {								 \
>> +	if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
>> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
>> +			pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
>> +	else \
>> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
>> +			pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
>> +} while (0)
> 
> Same comment as for VP_SEQ_PRINTF.
> 
Ack

>> +
>> +	PARTITION_SEQ_PRINTF(VirtualProcessors);
>> +	PARTITION_SEQ_PRINTF(TlbSize);
>> +	PARTITION_SEQ_PRINTF(AddressSpaces);
>> +	PARTITION_SEQ_PRINTF(DepositedPages);
>> +	PARTITION_SEQ_PRINTF(GpaPages);
>> +	PARTITION_SEQ_PRINTF(GpaSpaceModifications);
>> +	PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
>> +	PARTITION_SEQ_PRINTF(RecommendedTlbSize);
>> +	PARTITION_SEQ_PRINTF(GpaPages4K);
>> +	PARTITION_SEQ_PRINTF(GpaPages2M);
>> +	PARTITION_SEQ_PRINTF(GpaPages1G);
>> +	PARTITION_SEQ_PRINTF(GpaPages512G);
>> +	PARTITION_SEQ_PRINTF(DevicePages4K);
>> +	PARTITION_SEQ_PRINTF(DevicePages2M);
>> +	PARTITION_SEQ_PRINTF(DevicePages1G);
>> +	PARTITION_SEQ_PRINTF(DevicePages512G);
>> +	PARTITION_SEQ_PRINTF(AttachedDevices);
>> +	PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
>> +	PARTITION_SEQ_PRINTF(IoTlbFlushes);
>> +	PARTITION_SEQ_PRINTF(IoTlbFlushCost);
>> +	PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
>> +	PARTITION_SEQ_PRINTF(DeviceDmaErrors);
>> +	PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
>> +	PARTITION_SEQ_PRINTF(SkippedTimerTicks);
>> +	PARTITION_SEQ_PRINTF(PartitionId);
>> +#if IS_ENABLED(CONFIG_X86_64)
>> +	PARTITION_SEQ_PRINTF(NestedTlbSize);
>> +	PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
>> +	PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
>> +	PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
>> +	PARTITION_SEQ_PRINTF(PagesShattered);
>> +	PARTITION_SEQ_PRINTF(PagesRecombined);
>> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
>> +#elif IS_ENABLED(CONFIG_ARM64)
>> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
>> +#endif
>> +
>> +	return 0;
>> +}
>> +DEFINE_SHOW_ATTRIBUTE(partition_stats);
>> +
>> +static void mshv_partition_stats_unmap(u64 partition_id, void *stats_page_addr,
>> +				       enum hv_stats_area_type stats_area_type)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.partition.partition_id = partition_id,
>> +		.partition.stats_area_type = stats_area_type,
>> +	};
>> +	int err;
>> +
>> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
>> +				  &identity);
>> +	if (err) {
>> +		pr_err("%s: failed to unmap partition %lld %s stats, err: %d\n",
>> +		       __func__, partition_id,
>> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
>> +		       err);
>> +	}
>> +}
>> +
>> +static void *mshv_partition_stats_map(u64 partition_id,
>> +				      enum hv_stats_area_type stats_area_type)
>> +{
>> +	union hv_stats_object_identity identity = {
>> +		.partition.partition_id = partition_id,
>> +		.partition.stats_area_type = stats_area_type,
>> +	};
>> +	void *stats;
>> +	int err;
>> +
>> +	err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
>> +	if (err) {
>> +		pr_err("%s: failed to map partition %lld %s stats, err: %d\n",
>> +		       __func__, partition_id,
>> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
>> +		       err);
>> +		return ERR_PTR(err);
>> +	}
>> +	return stats;
>> +}
>> +
>> +static int mshv_debugfs_partition_stats_create(u64 partition_id,
>> +					    struct dentry **partition_stats_ptr,
>> +					    struct dentry *parent)
>> +{
>> +	struct dentry *dentry;
>> +	struct hv_stats_page **pstats;
>> +	int err;
>> +
>> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> 
> Same comment here about the use of "2" as the first parameter.
> 
Ack.

>> +	if (!pstats)
>> +		return -ENOMEM;

<snip>
Thanks for the comments, I appreciate the review!

Nuno

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 1/3] mshv: Ignore second stats page map result failure
  2025-12-30  0:27     ` Nuno Das Neves
@ 2026-01-02 16:27       ` Michael Kelley
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Kelley @ 2026-01-02 16:27 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Monday, December 29, 2025 4:28 PM
> 
> On 12/8/2025 7:12 AM, Michael Kelley wrote:
> > From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> >>
> >> From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> >>
> >> Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
> >> and return HV_STATUS_INVALID_PARAMETER for the second stats page
> >> mapping request.
> >>
> >> This results in a failure in module init. Instead of failing, gracefully
> >> fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
> >> already-mapped stats_pages[HV_STATS_AREA_SELF].
> >
> > This explains "what" this patch does. But could you add an explanation of "why"
> > substituting SELF for the unavailable PARENT is the right thing to do? As a somewhat
> > outside reviewer, I don't know enough about SELF vs. PARENT to immediately know
> > why this substitution makes sense.
> >
> I'll attempt to explain. I'm a little hindered by the fact that, like many of the
> root interfaces, this is not well-documented, but this is my understanding:
> 
> The stats areas HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT indicate the
> privilege level of the data in the mapped stats page.

OK. But evidently that difference in "privilege level" (whatever that means) doesn't
affect what the root partition can do to read and display the data in debugfs, right?

> 
> Both SELF and PARENT contain the same fields, but some fields that are 0 in the
> SELF page may be nonzero in PARENT page, and vice-versa. So, to read all the fields
> we need to map both pages if possible, and prioritize reading non-zero data from
> each field, by checking both the SELF and PARENT pages.

Overall, this mostly makes sense. Each VP and each partition has associated SELF and
PARENT stats pages. For the SELF page, the stats are presumably for the single
associated VP or partition. But "PARENT" terminology usually implies some kind of
hierarchy, as in a parent has one or more children. Parent-level stats would typically
be an aggregate of all its children's stats. But if that's the case here, choosing at runtime
on a per-field stat basis between SELF and PARENT would produce weird results. So
maybe that typical model of "parent" isn't correct here. If SELF and PARENT are only
indicating some kind of privilege level, maybe the PARENT page for each VP and each
partition is like the SELF page -- it contains stats only for the associated VP or partition.

> 
> I don't know if it's possible for a given field to have a different (nonzero) value
> in both SELF and PARENT pages. I imagine in that case we'd want to prioritize the
> PARENT value, but it may simply not be possible.

It would be nice to confirm that this can't happen. If it can happen, that messes
up trying to construct a sensible model of how this all works. :-)

And a somewhat related question: Assuming that a particular stat appears in
either the SELF or the PARENT page, under what circumstances might that stat
move from one to the other, if ever? I would guess that for a given version of the
hypervisor, the split is always the same, across all VPs and all partitions running
on hypervisors of that version. But a different hypervisor version might split the
stats differently between SELF and PARENT. Of course, this stuff is overall a bit
unusual, so my guess might not be right. 

I ask because making a runtime decision between SELF and PARENT for every
individual stat, every time it is read, is conceptually a lot of wasted motion if the
split is static and knowable ahead of time. But I say "conceptually" because I
can't immediately come up with a way to make things faster or more compact if
the split were static and knowable ahead of time. So it may be a moot point
from an implementation standpoint, but I'm still interested in the answer from
the standpoint of being able to document the overall model of how this works.

> 
> The API is designed in this way to be backward-compatible with older hypervisors
> that didn't have a concept of SELF and PARENT. Hence on older hypervisors (detectable
> via the error code), all we can do is map SELF and use it for everything.

In cases where PARENT can't be mapped by the root partition, does that mean
some of the stats just aren't available? Or does the hypervisor provide all the
stats in the SELF page?

> 
> > Also, does this patch affect the logic in mshv_vp_dispatch_thread_blocked() where
> > a zero value for the SELF version of VpRootDispatchThreadBlocked is replaced by
> > the PARENT value? But that logic seems to be in the reverse direction -- replacing
> > a missing SELF value with the PARENT value -- whereas this patch is about replacing
> > missing PARENT values with SELF values. So are there two separate PARENT vs. SELF
> > issues overall? And after this patch is in place and PARENT values are replaced with
> > SELF on older hypervisor versions, the logic in mshv_vp_dispatch_thread_blocked()
> > then effectively becomes a no-op if the SELF value is zero, and the return value will
> > be zero. Is that problem?
> >
> This is the same issue, because we only care about any nonzero value in
> mshv_vp_dispatch_thread_blocked(). It doesn't matter which page we check first in that
> code, just that any nonzero value is returned as a boolean to indicate a blocked state.
> 
> The code in question could be rewritten:
> 
> return self_vp_cntrs[VpRootDispatchThreadBlocked] ||
> parent_vp_cntrs[VpRootDispatchThreadBlocked];

OK. It would be more consistent to apply the same logic (check SELF then PARENT,
or vice versa) in both mshv_vp_dispatch_thread_blocked() and in this new debugfs
code. As you know, for me inconsistencies always beg the question of "why"? :-)
But that's a minor point.

> 
> >>
> >> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> >> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> >> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> >> ---
> >>  drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
> >>  drivers/hv/mshv_root_main.c    |  3 +++
> >>  2 files changed, 40 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> >> index 598eaff4ff29..b1770c7b500c 100644
> >> --- a/drivers/hv/mshv_root_hv_call.c
> >> +++ b/drivers/hv/mshv_root_hv_call.c
> >> @@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
> >>  	return ret;
> >>  }
> >>
> >> +static int
> >> +hv_stats_get_area_type(enum hv_stats_object_type type,
> >> +		       const union hv_stats_object_identity *identity)
> >> +{
> >> +	switch (type) {
> >> +	case HV_STATS_OBJECT_HYPERVISOR:
> >> +		return identity->hv.stats_area_type;
> >> +	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
> >> +		return identity->lp.stats_area_type;
> >> +	case HV_STATS_OBJECT_PARTITION:
> >> +		return identity->partition.stats_area_type;
> >> +	case HV_STATS_OBJECT_VP:
> >> +		return identity->vp.stats_area_type;
> >> +	}
> >> +
> >> +	return -EINVAL;
> >> +}
> >> +
> >>  static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  				  const union hv_stats_object_identity *identity,
> >>  				  void **addr)
> >> @@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  	struct hv_input_map_stats_page *input;
> >>  	struct hv_output_map_stats_page *output;
> >>  	u64 status, pfn;
> >> -	int ret = 0;
> >> +	int hv_status, ret = 0;
> >>
> >>  	do {
> >>  		local_irq_save(flags);
> >> @@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  		pfn = output->map_location;
> >>
> >>  		local_irq_restore(flags);
> >> -		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
> >> -			ret = hv_result_to_errno(status);
> >> +
> >> +		hv_status = hv_result(status);
> >> +		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
> >>  			if (hv_result_success(status))
> >>  				break;
> >> -			return ret;
> >> +
> >> +			/*
> >> +			 * Older versions of the hypervisor do not support the
> >> +			 * PARENT stats area. In this case return "success" but
> >> +			 * set the page to NULL. The caller should check for
> >> +			 * this case and instead just use the SELF area.
> >> +			 */
> >> +			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
> >> +			    hv_status == HV_STATUS_INVALID_PARAMETER) {
> >> +				*addr = NULL;
> >> +				return 0;
> >> +			}
> >> +
> >> +			hv_status_debug(status, "\n");
> >> +			return hv_result_to_errno(status);
> >
> > Does the hv_call_map_stats_page2() function need a similar fix? Or is there a linkage
> > in hypervisor functionality where any hypervisor version that supports an overlay GPFN
> > also supports the PARENT stats? If such a linkage is why hv_call_map_stats_page2()
> > doesn't need a similar fix, please add a code comment to that effect in
> > hv_call_map_stats_page2().
> >
> Exactly; hv_call_map_stats_page2() is only available on hypervisors where the PARENT
> page is also available. I'll add a comment.

Thanks.

> 
> >>  		}
> >>
> >>  		ret = hv_call_deposit_pages(NUMA_NO_NODE,
> >> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> >> index bc15d6f6922f..f59a4ab47685 100644
> >> --- a/drivers/hv/mshv_root_main.c
> >> +++ b/drivers/hv/mshv_root_main.c
> >> @@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
> >>  	if (err)
> >>  		goto unmap_self;
> >>
> >> +	if (!stats_pages[HV_STATS_AREA_PARENT])
> >> +		stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
> >> +
> >>  	return 0;
> >>
> >>  unmap_self:
> >> --
> >> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 2/3] mshv: Add definitions for stats pages
  2025-12-30 23:04     ` Nuno Das Neves
@ 2026-01-02 16:27       ` Michael Kelley
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Kelley @ 2026-01-02 16:27 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Tuesday, December 30, 2025 3:04 PM
> 
> On 12/8/2025 7:13 AM, Michael Kelley wrote:
> > From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> >>
> >> Add the definitions for hypervisor, logical processor, and partition
> >> stats pages.
> >>
> >> Move the definition for the VP stats page to its rightful place in
> >> hvhdk.h, and add the missing members.
> >>
> >> These enum members retain their CamelCase style, since they are imported
> >> directly from the hypervisor code They will be stringified when printing
> >
> > Missing a '.' (period) after "hypervisor code".
> >
> Ack
> 
> >> the stats out, and retain more readability in this form.
> >>
> >> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> >> ---
> >>  drivers/hv/mshv_root_main.c |  17 --
> >>  include/hyperv/hvhdk.h      | 437 ++++++++++++++++++++++++++++++++++++
> >>  2 files changed, 437 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> >> index f59a4ab47685..19006b788e85 100644
> >> --- a/drivers/hv/mshv_root_main.c
> >> +++ b/drivers/hv/mshv_root_main.c
> >> @@ -38,23 +38,6 @@ MODULE_AUTHOR("Microsoft");
> >>  MODULE_LICENSE("GPL");
> >>  MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
> >>
> >> -/* TODO move this to another file when debugfs code is added */
> >> -enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> >> -#if defined(CONFIG_X86)
> >> -	VpRootDispatchThreadBlocked			= 202,
> >> -#elif defined(CONFIG_ARM64)
> >> -	VpRootDispatchThreadBlocked			= 94,
> >> -#endif
> >> -	VpStatsMaxCounter
> >> -};
> >> -
> >> -struct hv_stats_page {
> >> -	union {
> >> -		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> >> -		u8 data[HV_HYP_PAGE_SIZE];
> >> -	};
> >> -} __packed;
> >> -
> >>  struct mshv_root mshv_root;
> >>
> >>  enum hv_scheduler_type hv_scheduler_type;
> >> diff --git a/include/hyperv/hvhdk.h b/include/hyperv/hvhdk.h
> >> index 469186df7826..51abbcd0ec37 100644
> >> --- a/include/hyperv/hvhdk.h
> >> +++ b/include/hyperv/hvhdk.h
> >> @@ -10,6 +10,443 @@
> >>  #include "hvhdk_mini.h"
> >>  #include "hvgdk.h"
> >>
> >> +enum hv_stats_hypervisor_counters {		/* HV_HYPERVISOR_COUNTER */
> >> +	HvLogicalProcessors			= 1,
> >> +	HvPartitions				= 2,
> >> +	HvTotalPages				= 3,
> >> +	HvVirtualProcessors			= 4,
> >> +	HvMonitoredNotifications		= 5,
> >> +	HvModernStandbyEntries			= 6,
> >> +	HvPlatformIdleTransitions		= 7,
> >> +	HvHypervisorStartupCost			= 8,
> >> +	HvIOSpacePages				= 10,
> >> +	HvNonEssentialPagesForDump		= 11,
> >> +	HvSubsumedPages				= 12,
> >> +	HvStatsMaxCounter
> >> +};
> >> +
> >> +enum hv_stats_partition_counters {		/* HV_PROCESS_COUNTER */
> >> +	PartitionVirtualProcessors		= 1,
> >> +	PartitionTlbSize			= 3,
> >> +	PartitionAddressSpaces			= 4,
> >> +	PartitionDepositedPages			= 5,
> >> +	PartitionGpaPages			= 6,
> >> +	PartitionGpaSpaceModifications		= 7,
> >> +	PartitionVirtualTlbFlushEntires		= 8,
> >> +	PartitionRecommendedTlbSize		= 9,
> >> +	PartitionGpaPages4K			= 10,
> >> +	PartitionGpaPages2M			= 11,
> >> +	PartitionGpaPages1G			= 12,
> >> +	PartitionGpaPages512G			= 13,
> >> +	PartitionDevicePages4K			= 14,
> >> +	PartitionDevicePages2M			= 15,
> >> +	PartitionDevicePages1G			= 16,
> >> +	PartitionDevicePages512G		= 17,
> >> +	PartitionAttachedDevices		= 18,
> >> +	PartitionDeviceInterruptMappings	= 19,
> >> +	PartitionIoTlbFlushes			= 20,
> >> +	PartitionIoTlbFlushCost			= 21,
> >> +	PartitionDeviceInterruptErrors		= 22,
> >> +	PartitionDeviceDmaErrors		= 23,
> >> +	PartitionDeviceInterruptThrottleEvents	= 24,
> >> +	PartitionSkippedTimerTicks		= 25,
> >> +	PartitionPartitionId			= 26,
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	PartitionNestedTlbSize			= 27,
> >> +	PartitionRecommendedNestedTlbSize	= 28,
> >> +	PartitionNestedTlbFreeListSize		= 29,
> >> +	PartitionNestedTlbTrimmedPages		= 30,
> >> +	PartitionPagesShattered			= 31,
> >> +	PartitionPagesRecombined		= 32,
> >> +	PartitionHwpRequestValue		= 33,
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	PartitionHwpRequestValue		= 27,
> >> +#endif
> >> +	PartitionStatsMaxCounter
> >> +};
> >> +
> >> +enum hv_stats_vp_counters {			/* HV_THREAD_COUNTER */
> >> +	VpTotalRunTime					= 1,
> >> +	VpHypervisorRunTime				= 2,
> >> +	VpRemoteNodeRunTime				= 3,
> >> +	VpNormalizedRunTime				= 4,
> >> +	VpIdealCpu					= 5,
> >> +	VpHypercallsCount				= 7,
> >> +	VpHypercallsTime				= 8,
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	VpPageInvalidationsCount			= 9,
> >> +	VpPageInvalidationsTime				= 10,
> >> +	VpControlRegisterAccessesCount			= 11,
> >> +	VpControlRegisterAccessesTime			= 12,
> >> +	VpIoInstructionsCount				= 13,
> >> +	VpIoInstructionsTime				= 14,
> >> +	VpHltInstructionsCount				= 15,
> >> +	VpHltInstructionsTime				= 16,
> >> +	VpMwaitInstructionsCount			= 17,
> >> +	VpMwaitInstructionsTime				= 18,
> >> +	VpCpuidInstructionsCount			= 19,
> >> +	VpCpuidInstructionsTime				= 20,
> >> +	VpMsrAccessesCount				= 21,
> >> +	VpMsrAccessesTime				= 22,
> >> +	VpOtherInterceptsCount				= 23,
> >> +	VpOtherInterceptsTime				= 24,
> >> +	VpExternalInterruptsCount			= 25,
> >> +	VpExternalInterruptsTime			= 26,
> >> +	VpPendingInterruptsCount			= 27,
> >> +	VpPendingInterruptsTime				= 28,
> >> +	VpEmulatedInstructionsCount			= 29,
> >> +	VpEmulatedInstructionsTime			= 30,
> >> +	VpDebugRegisterAccessesCount			= 31,
> >> +	VpDebugRegisterAccessesTime			= 32,
> >> +	VpPageFaultInterceptsCount			= 33,
> >> +	VpPageFaultInterceptsTime			= 34,
> >> +	VpGuestPageTableMaps				= 35,
> >> +	VpLargePageTlbFills				= 36,
> >> +	VpSmallPageTlbFills				= 37,
> >> +	VpReflectedGuestPageFaults			= 38,
> >> +	VpApicMmioAccesses				= 39,
> >> +	VpIoInterceptMessages				= 40,
> >> +	VpMemoryInterceptMessages			= 41,
> >> +	VpApicEoiAccesses				= 42,
> >> +	VpOtherMessages					= 43,
> >> +	VpPageTableAllocations				= 44,
> >> +	VpLogicalProcessorMigrations			= 45,
> >> +	VpAddressSpaceEvictions				= 46,
> >> +	VpAddressSpaceSwitches				= 47,
> >> +	VpAddressDomainFlushes				= 48,
> >> +	VpAddressSpaceFlushes				= 49,
> >> +	VpGlobalGvaRangeFlushes				= 50,
> >> +	VpLocalGvaRangeFlushes				= 51,
> >> +	VpPageTableEvictions				= 52,
> >> +	VpPageTableReclamations				= 53,
> >> +	VpPageTableResets				= 54,
> >> +	VpPageTableValidations				= 55,
> >> +	VpApicTprAccesses				= 56,
> >> +	VpPageTableWriteIntercepts			= 57,
> >> +	VpSyntheticInterrupts				= 58,
> >> +	VpVirtualInterrupts				= 59,
> >> +	VpApicIpisSent					= 60,
> >> +	VpApicSelfIpisSent				= 61,
> >> +	VpGpaSpaceHypercalls				= 62,
> >> +	VpLogicalProcessorHypercalls			= 63,
> >> +	VpLongSpinWaitHypercalls			= 64,
> >> +	VpOtherHypercalls				= 65,
> >> +	VpSyntheticInterruptHypercalls			= 66,
> >> +	VpVirtualInterruptHypercalls			= 67,
> >> +	VpVirtualMmuHypercalls				= 68,
> >> +	VpVirtualProcessorHypercalls			= 69,
> >> +	VpHardwareInterrupts				= 70,
> >> +	VpNestedPageFaultInterceptsCount		= 71,
> >> +	VpNestedPageFaultInterceptsTime			= 72,
> >> +	VpPageScans					= 73,
> >> +	VpLogicalProcessorDispatches			= 74,
> >> +	VpWaitingForCpuTime				= 75,
> >> +	VpExtendedHypercalls				= 76,
> >> +	VpExtendedHypercallInterceptMessages		= 77,
> >> +	VpMbecNestedPageTableSwitches			= 78,
> >> +	VpOtherReflectedGuestExceptions			= 79,
> >> +	VpGlobalIoTlbFlushes				= 80,
> >> +	VpGlobalIoTlbFlushCost				= 81,
> >> +	VpLocalIoTlbFlushes				= 82,
> >> +	VpLocalIoTlbFlushCost				= 83,
> >> +	VpHypercallsForwardedCount			= 84,
> >> +	VpHypercallsForwardingTime			= 85,
> >> +	VpPageInvalidationsForwardedCount		= 86,
> >> +	VpPageInvalidationsForwardingTime		= 87,
> >> +	VpControlRegisterAccessesForwardedCount		= 88,
> >> +	VpControlRegisterAccessesForwardingTime		= 89,
> >> +	VpIoInstructionsForwardedCount			= 90,
> >> +	VpIoInstructionsForwardingTime			= 91,
> >> +	VpHltInstructionsForwardedCount			= 92,
> >> +	VpHltInstructionsForwardingTime			= 93,
> >> +	VpMwaitInstructionsForwardedCount		= 94,
> >> +	VpMwaitInstructionsForwardingTime		= 95,
> >> +	VpCpuidInstructionsForwardedCount		= 96,
> >> +	VpCpuidInstructionsForwardingTime		= 97,
> >> +	VpMsrAccessesForwardedCount			= 98,
> >> +	VpMsrAccessesForwardingTime			= 99,
> >> +	VpOtherInterceptsForwardedCount			= 100,
> >> +	VpOtherInterceptsForwardingTime			= 101,
> >> +	VpExternalInterruptsForwardedCount		= 102,
> >> +	VpExternalInterruptsForwardingTime		= 103,
> >> +	VpPendingInterruptsForwardedCount		= 104,
> >> +	VpPendingInterruptsForwardingTime		= 105,
> >> +	VpEmulatedInstructionsForwardedCount		= 106,
> >> +	VpEmulatedInstructionsForwardingTime		= 107,
> >> +	VpDebugRegisterAccessesForwardedCount		= 108,
> >> +	VpDebugRegisterAccessesForwardingTime		= 109,
> >> +	VpPageFaultInterceptsForwardedCount		= 110,
> >> +	VpPageFaultInterceptsForwardingTime		= 111,
> >> +	VpVmclearEmulationCount				= 112,
> >> +	VpVmclearEmulationTime				= 113,
> >> +	VpVmptrldEmulationCount				= 114,
> >> +	VpVmptrldEmulationTime				= 115,
> >> +	VpVmptrstEmulationCount				= 116,
> >> +	VpVmptrstEmulationTime				= 117,
> >> +	VpVmreadEmulationCount				= 118,
> >> +	VpVmreadEmulationTime				= 119,
> >> +	VpVmwriteEmulationCount				= 120,
> >> +	VpVmwriteEmulationTime				= 121,
> >> +	VpVmxoffEmulationCount				= 122,
> >> +	VpVmxoffEmulationTime				= 123,
> >> +	VpVmxonEmulationCount				= 124,
> >> +	VpVmxonEmulationTime				= 125,
> >> +	VpNestedVMEntriesCount				= 126,
> >> +	VpNestedVMEntriesTime				= 127,
> >> +	VpNestedSLATSoftPageFaultsCount			= 128,
> >> +	VpNestedSLATSoftPageFaultsTime			= 129,
> >> +	VpNestedSLATHardPageFaultsCount			= 130,
> >> +	VpNestedSLATHardPageFaultsTime			= 131,
> >> +	VpInvEptAllContextEmulationCount		= 132,
> >> +	VpInvEptAllContextEmulationTime			= 133,
> >> +	VpInvEptSingleContextEmulationCount		= 134,
> >> +	VpInvEptSingleContextEmulationTime		= 135,
> >> +	VpInvVpidAllContextEmulationCount		= 136,
> >> +	VpInvVpidAllContextEmulationTime		= 137,
> >> +	VpInvVpidSingleContextEmulationCount		= 138,
> >> +	VpInvVpidSingleContextEmulationTime		= 139,
> >> +	VpInvVpidSingleAddressEmulationCount		= 140,
> >> +	VpInvVpidSingleAddressEmulationTime		= 141,
> >> +	VpNestedTlbPageTableReclamations		= 142,
> >> +	VpNestedTlbPageTableEvictions			= 143,
> >> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 144,
> >> +	VpFlushGuestPhysicalAddressListHypercalls	= 145,
> >> +	VpPostedInterruptNotifications			= 146,
> >> +	VpPostedInterruptScans				= 147,
> >> +	VpTotalCoreRunTime				= 148,
> >> +	VpMaximumRunTime				= 149,
> >> +	VpHwpRequestContextSwitches			= 150,
> >> +	VpWaitingForCpuTimeBucket0			= 151,
> >> +	VpWaitingForCpuTimeBucket1			= 152,
> >> +	VpWaitingForCpuTimeBucket2			= 153,
> >> +	VpWaitingForCpuTimeBucket3			= 154,
> >> +	VpWaitingForCpuTimeBucket4			= 155,
> >> +	VpWaitingForCpuTimeBucket5			= 156,
> >> +	VpWaitingForCpuTimeBucket6			= 157,
> >> +	VpVmloadEmulationCount				= 158,
> >> +	VpVmloadEmulationTime				= 159,
> >> +	VpVmsaveEmulationCount				= 160,
> >> +	VpVmsaveEmulationTime				= 161,
> >> +	VpGifInstructionEmulationCount			= 162,
> >> +	VpGifInstructionEmulationTime			= 163,
> >> +	VpEmulatedErrataSvmInstructions			= 164,
> >> +	VpPlaceholder1					= 165,
> >> +	VpPlaceholder2					= 166,
> >> +	VpPlaceholder3					= 167,
> >> +	VpPlaceholder4					= 168,
> >> +	VpPlaceholder5					= 169,
> >> +	VpPlaceholder6					= 170,
> >> +	VpPlaceholder7					= 171,
> >> +	VpPlaceholder8					= 172,
> >> +	VpPlaceholder9					= 173,
> >> +	VpPlaceholder10					= 174,
> >> +	VpSchedulingPriority				= 175,
> >> +	VpRdpmcInstructionsCount			= 176,
> >> +	VpRdpmcInstructionsTime				= 177,
> >> +	VpPerfmonPmuMsrAccessesCount			= 178,
> >> +	VpPerfmonLbrMsrAccessesCount			= 179,
> >> +	VpPerfmonIptMsrAccessesCount			= 180,
> >> +	VpPerfmonInterruptCount				= 181,
> >> +	VpVtl1DispatchCount				= 182,
> >> +	VpVtl2DispatchCount				= 183,
> >> +	VpVtl2DispatchBucket0				= 184,
> >> +	VpVtl2DispatchBucket1				= 185,
> >> +	VpVtl2DispatchBucket2				= 186,
> >> +	VpVtl2DispatchBucket3				= 187,
> >> +	VpVtl2DispatchBucket4				= 188,
> >> +	VpVtl2DispatchBucket5				= 189,
> >> +	VpVtl2DispatchBucket6				= 190,
> >> +	VpVtl1RunTime					= 191,
> >> +	VpVtl2RunTime					= 192,
> >> +	VpIommuHypercalls				= 193,
> >> +	VpCpuGroupHypercalls				= 194,
> >> +	VpVsmHypercalls					= 195,
> >> +	VpEventLogHypercalls				= 196,
> >> +	VpDeviceDomainHypercalls			= 197,
> >> +	VpDepositHypercalls				= 198,
> >> +	VpSvmHypercalls					= 199,
> >> +	VpBusLockAcquisitionCount			= 200,
> >> +	VpUnused					= 201,
> >> +	VpRootDispatchThreadBlocked			= 202,
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	VpSysRegAccessesCount				= 9,
> >> +	VpSysRegAccessesTime				= 10,
> >> +	VpSmcInstructionsCount				= 11,
> >> +	VpSmcInstructionsTime				= 12,
> >> +	VpOtherInterceptsCount				= 13,
> >> +	VpOtherInterceptsTime				= 14,
> >> +	VpExternalInterruptsCount			= 15,
> >> +	VpExternalInterruptsTime			= 16,
> >> +	VpPendingInterruptsCount			= 17,
> >> +	VpPendingInterruptsTime				= 18,
> >> +	VpGuestPageTableMaps				= 19,
> >> +	VpLargePageTlbFills				= 20,
> >> +	VpSmallPageTlbFills				= 21,
> >> +	VpReflectedGuestPageFaults			= 22,
> >> +	VpMemoryInterceptMessages			= 23,
> >> +	VpOtherMessages					= 24,
> >> +	VpLogicalProcessorMigrations			= 25,
> >> +	VpAddressDomainFlushes				= 26,
> >> +	VpAddressSpaceFlushes				= 27,
> >> +	VpSyntheticInterrupts				= 28,
> >> +	VpVirtualInterrupts				= 29,
> >> +	VpApicSelfIpisSent				= 30,
> >> +	VpGpaSpaceHypercalls				= 31,
> >> +	VpLogicalProcessorHypercalls			= 32,
> >> +	VpLongSpinWaitHypercalls			= 33,
> >> +	VpOtherHypercalls				= 34,
> >> +	VpSyntheticInterruptHypercalls			= 35,
> >> +	VpVirtualInterruptHypercalls			= 36,
> >> +	VpVirtualMmuHypercalls				= 37,
> >> +	VpVirtualProcessorHypercalls			= 38,
> >> +	VpHardwareInterrupts				= 39,
> >> +	VpNestedPageFaultInterceptsCount		= 40,
> >> +	VpNestedPageFaultInterceptsTime			= 41,
> >> +	VpLogicalProcessorDispatches			= 42,
> >> +	VpWaitingForCpuTime				= 43,
> >> +	VpExtendedHypercalls				= 44,
> >> +	VpExtendedHypercallInterceptMessages		= 45,
> >> +	VpMbecNestedPageTableSwitches			= 46,
> >> +	VpOtherReflectedGuestExceptions			= 47,
> >> +	VpGlobalIoTlbFlushes				= 48,
> >> +	VpGlobalIoTlbFlushCost				= 49,
> >> +	VpLocalIoTlbFlushes				= 50,
> >> +	VpLocalIoTlbFlushCost				= 51,
> >> +	VpFlushGuestPhysicalAddressSpaceHypercalls	= 52,
> >> +	VpFlushGuestPhysicalAddressListHypercalls	= 53,
> >> +	VpPostedInterruptNotifications			= 54,
> >> +	VpPostedInterruptScans				= 55,
> >> +	VpTotalCoreRunTime				= 56,
> >> +	VpMaximumRunTime				= 57,
> >> +	VpWaitingForCpuTimeBucket0			= 58,
> >> +	VpWaitingForCpuTimeBucket1			= 59,
> >> +	VpWaitingForCpuTimeBucket2			= 60,
> >> +	VpWaitingForCpuTimeBucket3			= 61,
> >> +	VpWaitingForCpuTimeBucket4			= 62,
> >> +	VpWaitingForCpuTimeBucket5			= 63,
> >> +	VpWaitingForCpuTimeBucket6			= 64,
> >> +	VpHwpRequestContextSwitches			= 65,
> >> +	VpPlaceholder2					= 66,
> >> +	VpPlaceholder3					= 67,
> >> +	VpPlaceholder4					= 68,
> >> +	VpPlaceholder5					= 69,
> >> +	VpPlaceholder6					= 70,
> >> +	VpPlaceholder7					= 71,
> >> +	VpPlaceholder8					= 72,
> >> +	VpContentionTime				= 73,
> >> +	VpWakeUpTime					= 74,
> >> +	VpSchedulingPriority				= 75,
> >> +	VpVtl1DispatchCount				= 76,
> >> +	VpVtl2DispatchCount				= 77,
> >> +	VpVtl2DispatchBucket0				= 78,
> >> +	VpVtl2DispatchBucket1				= 79,
> >> +	VpVtl2DispatchBucket2				= 80,
> >> +	VpVtl2DispatchBucket3				= 81,
> >> +	VpVtl2DispatchBucket4				= 82,
> >> +	VpVtl2DispatchBucket5				= 83,
> >> +	VpVtl2DispatchBucket6				= 84,
> >> +	VpVtl1RunTime					= 85,
> >> +	VpVtl2RunTime					= 86,
> >> +	VpIommuHypercalls				= 87,
> >> +	VpCpuGroupHypercalls				= 88,
> >> +	VpVsmHypercalls					= 89,
> >> +	VpEventLogHypercalls				= 90,
> >> +	VpDeviceDomainHypercalls			= 91,
> >> +	VpDepositHypercalls				= 92,
> >> +	VpSvmHypercalls					= 93,
> >> +	VpLoadAvg					= 94,
> >> +	VpRootDispatchThreadBlocked			= 95,
> >
> > In current code, VpRootDispatchThreadBlocked on ARM64 is 94. Is that an
> > error that is being corrected by this patch?
> >
> 
> Hmm, I didn't realize this changed - 95 is the correct value. However,
> the mshv driver is not yet supported on ARM64, so this fix doesn't
> have any impact right now. Do you suggest a separate patch to fix it?

I don’t see a need for a separate patch. Maybe just put a short note
about the change in the commit message for this patch, so that it is
recorded as intentional and not a random typo. Long lists like these
have been known to get fat-fingered from time to time. :-)

> 
> >> +#endif
> >> +	VpStatsMaxCounter
> >> +};
> >> +
> >> +enum hv_stats_lp_counters {			/* HV_CPU_COUNTER */
> >> +	LpGlobalTime				= 1,
> >> +	LpTotalRunTime				= 2,
> >> +	LpHypervisorRunTime			= 3,
> >> +	LpHardwareInterrupts			= 4,
> >> +	LpContextSwitches			= 5,
> >> +	LpInterProcessorInterrupts		= 6,
> >> +	LpSchedulerInterrupts			= 7,
> >> +	LpTimerInterrupts			= 8,
> >> +	LpInterProcessorInterruptsSent		= 9,
> >> +	LpProcessorHalts			= 10,
> >> +	LpMonitorTransitionCost			= 11,
> >> +	LpContextSwitchTime			= 12,
> >> +	LpC1TransitionsCount			= 13,
> >> +	LpC1RunTime				= 14,
> >> +	LpC2TransitionsCount			= 15,
> >> +	LpC2RunTime				= 16,
> >> +	LpC3TransitionsCount			= 17,
> >> +	LpC3RunTime				= 18,
> >> +	LpRootVpIndex				= 19,
> >> +	LpIdleSequenceNumber			= 20,
> >> +	LpGlobalTscCount			= 21,
> >> +	LpActiveTscCount			= 22,
> >> +	LpIdleAccumulation			= 23,
> >> +	LpReferenceCycleCount0			= 24,
> >> +	LpActualCycleCount0			= 25,
> >> +	LpReferenceCycleCount1			= 26,
> >> +	LpActualCycleCount1			= 27,
> >> +	LpProximityDomainId			= 28,
> >> +	LpPostedInterruptNotifications		= 29,
> >> +	LpBranchPredictorFlushes		= 30,
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	LpL1DataCacheFlushes			= 31,
> >> +	LpImmediateL1DataCacheFlushes		= 32,
> >> +	LpMbFlushes				= 33,
> >> +	LpCounterRefreshSequenceNumber		= 34,
> >> +	LpCounterRefreshReferenceTime		= 35,
> >> +	LpIdleAccumulationSnapshot		= 36,
> >> +	LpActiveTscCountSnapshot		= 37,
> >> +	LpHwpRequestContextSwitches		= 38,
> >> +	LpPlaceholder1				= 39,
> >> +	LpPlaceholder2				= 40,
> >> +	LpPlaceholder3				= 41,
> >> +	LpPlaceholder4				= 42,
> >> +	LpPlaceholder5				= 43,
> >> +	LpPlaceholder6				= 44,
> >> +	LpPlaceholder7				= 45,
> >> +	LpPlaceholder8				= 46,
> >> +	LpPlaceholder9				= 47,
> >> +	LpPlaceholder10				= 48,
> >> +	LpReserveGroupId			= 49,
> >> +	LpRunningPriority			= 50,
> >> +	LpPerfmonInterruptCount			= 51,
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	LpCounterRefreshSequenceNumber		= 31,
> >> +	LpCounterRefreshReferenceTime		= 32,
> >> +	LpIdleAccumulationSnapshot		= 33,
> >> +	LpActiveTscCountSnapshot		= 34,
> >> +	LpHwpRequestContextSwitches		= 35,
> >> +	LpPlaceholder2				= 36,
> >> +	LpPlaceholder3				= 37,
> >> +	LpPlaceholder4				= 38,
> >> +	LpPlaceholder5				= 39,
> >> +	LpPlaceholder6				= 40,
> >> +	LpPlaceholder7				= 41,
> >> +	LpPlaceholder8				= 42,
> >> +	LpPlaceholder9				= 43,
> >> +	LpSchLocalRunListSize			= 44,
> >> +	LpReserveGroupId			= 45,
> >> +	LpRunningPriority			= 46,
> >> +#endif
> >> +	LpStatsMaxCounter
> >> +};
> >> +
> >> +/*
> >> + * Hypervisor statsitics page format
> >
> > s/statsitics/statistics/
> >
> Ack, thanks
> 
> >> + */
> >> +struct hv_stats_page {
> >> +	union {
> >> +		u64 hv_cntrs[HvStatsMaxCounter];		/* Hypervisor counters */
> >> +		u64 pt_cntrs[PartitionStatsMaxCounter];		/* Partition counters */
> >> +		u64 vp_cntrs[VpStatsMaxCounter];		/* VP counters */
> >> +		u64 lp_cntrs[LpStatsMaxCounter];		/* LP counters */
> >> +		u8 data[HV_HYP_PAGE_SIZE];
> >> +	};
> >> +} __packed;
> >> +
> >>  /* Bits for dirty mask of hv_vp_register_page */
> >>  #define HV_X64_REGISTER_CLASS_GENERAL	0
> >>  #define HV_X64_REGISTER_CLASS_IP	1
> >> --
> >> 2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics
  2025-12-31  0:26     ` Nuno Das Neves
@ 2026-01-02 16:27       ` Michael Kelley
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Kelley @ 2026-01-02 16:27 UTC (permalink / raw)
  To: Nuno Das Neves, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, skinsburskii@linux.microsoft.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com,
	prapal@linux.microsoft.com, mrathor@linux.microsoft.com,
	paekkaladevi@linux.microsoft.com, Jinank Jain

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Tuesday, December 30, 2025 4:27 PM
> 
> On 12/8/2025 7:21 AM, Michael Kelley wrote:
> > From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> >>
> >> Introduce a debugfs interface to expose root and child partition stats
> >> when running with mshv_root.
> >>
> >> Create a debugfs directory "mshv" containing 'stats' files organized by
> >> type and id. A stats file contains a number of counters depending on
> >> its type. e.g. an excerpt from a VP stats file:
> >>
> >> TotalRunTime                  : 1997602722
> >> HypervisorRunTime             : 649671371
> >> RemoteNodeRunTime             : 0
> >> NormalizedRunTime             : 1997602721
> >> IdealCpu                      : 0
> >> HypercallsCount               : 1708169
> >> HypercallsTime                : 111914774
> >> PageInvalidationsCount        : 0
> >> PageInvalidationsTime         : 0
> >>
> >> On a root partition with some active child partitions, the entire
> >> directory structure may look like:
> >>
> >> mshv/
> >>   stats             # hypervisor stats
> >>   lp/               # logical processors
> >>     0/              # LP id
> >>       stats         # LP 0 stats
> >>     1/
> >>     2/
> >>     3/
> >>   partition/        # partition stats
> >>     1/              # root partition id
> >>       stats         # root partition stats
> >>       vp/           # root virtual processors
> >>         0/          # root VP id
> >>           stats     # root VP 0 stats
> >>         1/
> >>         2/
> >>         3/
> >>     42/             # child partition id
> >>       stats         # child partition stats
> >>       vp/           # child VPs
> >>         0/          # child VP id
> >>           stats     # child VP 0 stats
> >>         1/
> >>     43/
> >>     55/
> >>
> >
> > In the above directory tree, each of the "stats" files is in a directory
> > by itself, where the directory name is the number of whatever
> > entity the stats are for (lp, partition, or vp). Do you expect there to
> > be other files parallel to "stats" that will be added later? Otherwise
> > you could collapse one directory level. The "best" directory structure
> > is somewhat a matter of taste and judgment, so there's not a "right"
> > answer. I don't object if your preference is to keep the numbered
> > directories, even if they are likely to never contain more than the
> > "stats" file.
> >
> Good question. I'm not aware of a plan to add additional parallel files
> in the future, but even so, I think this structure is fine as-is.
> 
> I see how the VPs and LPs directories could be collapsed, but partitions
> need to be directories to contain the VPs, so that would be an
> inconsistency (some "stats" files and some "$ID" files) which seems worse
> to me. E.g., are you suggesting something like this?
> 
> mshv/
>    stats             # hypervisor stats
>    lp/               # logical processors
>      0               # LP 0 stats
>      1               # LP 1 stats
>    partition/        # partition stats directory
>      1/              # root partition id
>        stats         # root partition stats
>        vp/           # root virtual processors
>          0           # root VP 0 stats
>          1           # root VP 1 stats
>      4/              # child partition id
>        stats         # child partition stats
>        vp/           # child virtual processors
>          0           # child VP 0 stats
>          1           # child VP 1 stats
> 
> Unless I'm misunderstanding what you mean, I think the original is better,
> both because it's more consistent and does leave room for adding additional
> files if we ever want to.

Fair enough. Just curious -- is a user space program envisioned to read and
display all these stats in some organized fashion? I'm presuming the user
space VMM should not have an operational dependency on this data because
it is debugfs.

> 
> >> On L1VH, some stats are not present as it does not own the hardware
> >> like the root partition does:
> >> - The hypervisor and lp stats are not present
> >> - L1VH's partition directory is named "self" because it can't get its
> >>   own id
> >> - Some of L1VH's partition and VP stats fields are not populated, because
> >>   it can't map its own HV_STATS_AREA_PARENT page.
> >>
> >> Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> >> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> >> Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com>
> >> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
> >> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> >> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> >> Co-developed-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> >> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
> >> Co-developed-by: Jinank Jain <jinankjain@microsoft.com>
> >> Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
> >> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> >> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> >> ---
> >>  drivers/hv/Makefile         |    1 +
> >>  drivers/hv/mshv_debugfs.c   | 1122 +++++++++++++++++++++++++++++++++++
> >>  drivers/hv/mshv_root.h      |   34 ++
> >>  drivers/hv/mshv_root_main.c |   32 +-
> >>  4 files changed, 1185 insertions(+), 4 deletions(-)
> >>  create mode 100644 drivers/hv/mshv_debugfs.c
> >>
> >> diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
> >> index 58b8d07639f3..36278c936914 100644
> >> --- a/drivers/hv/Makefile
> >> +++ b/drivers/hv/Makefile
> >> @@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
> >>  hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
> >>  mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
> >>  	       mshv_root_hv_call.o mshv_portid_table.o
> >> +mshv_root-$(CONFIG_DEBUG_FS) += mshv_debugfs.o
> >>  mshv_vtl-y := mshv_vtl_main.o
> >>
> >>  # Code that must be built-in
> >> diff --git a/drivers/hv/mshv_debugfs.c b/drivers/hv/mshv_debugfs.c
> >> new file mode 100644
> >> index 000000000000..581018690a27
> >> --- /dev/null
> >> +++ b/drivers/hv/mshv_debugfs.c
> >> @@ -0,0 +1,1122 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +/*
> >> + * Copyright (c) 2025, Microsoft Corporation.
> >> + *
> >> + * The /sys/kernel/debug/mshv directory contents.
> >> + * Contains various statistics data, provided by the hypervisor.
> >> + *
> >> + * Authors: Microsoft Linux virtualization team
> >> + */
> >> +
> >> +#include <linux/debugfs.h>
> >> +#include <linux/stringify.h>
> >> +#include <asm/mshyperv.h>
> >> +#include <linux/slab.h>
> >> +
> >> +#include "mshv.h"
> >> +#include "mshv_root.h"
> >> +
> >> +#define U32_BUF_SZ 11
> >> +#define U64_BUF_SZ 21
> >> +
> >> +static struct dentry *mshv_debugfs;
> >> +static struct dentry *mshv_debugfs_partition;
> >> +static struct dentry *mshv_debugfs_lp;
> >> +
> >> +static u64 mshv_lps_count;
> >> +
> >> +static bool is_l1vh_parent(u64 partition_id)
> >> +{
> >> +	return hv_l1vh_partition() && (partition_id == HV_PARTITION_ID_SELF);
> >> +}
> >> +
> >> +static int lp_stats_show(struct seq_file *m, void *v)
> >> +{
> >> +	const struct hv_stats_page *stats = m->private;
> >> +
> >> +#define LP_SEQ_PRINTF(cnt)		\
> >> +	seq_printf(m, "%-29s: %llu\n", __stringify(cnt), stats->lp_cntrs[Lp##cnt])
> >> +
> >> +	LP_SEQ_PRINTF(GlobalTime);
> >> +	LP_SEQ_PRINTF(TotalRunTime);
> >> +	LP_SEQ_PRINTF(HypervisorRunTime);
> >> +	LP_SEQ_PRINTF(HardwareInterrupts);
> >> +	LP_SEQ_PRINTF(ContextSwitches);
> >> +	LP_SEQ_PRINTF(InterProcessorInterrupts);
> >> +	LP_SEQ_PRINTF(SchedulerInterrupts);
> >> +	LP_SEQ_PRINTF(TimerInterrupts);
> >> +	LP_SEQ_PRINTF(InterProcessorInterruptsSent);
> >> +	LP_SEQ_PRINTF(ProcessorHalts);
> >> +	LP_SEQ_PRINTF(MonitorTransitionCost);
> >> +	LP_SEQ_PRINTF(ContextSwitchTime);
> >> +	LP_SEQ_PRINTF(C1TransitionsCount);
> >> +	LP_SEQ_PRINTF(C1RunTime);
> >> +	LP_SEQ_PRINTF(C2TransitionsCount);
> >> +	LP_SEQ_PRINTF(C2RunTime);
> >> +	LP_SEQ_PRINTF(C3TransitionsCount);
> >> +	LP_SEQ_PRINTF(C3RunTime);
> >> +	LP_SEQ_PRINTF(RootVpIndex);
> >> +	LP_SEQ_PRINTF(IdleSequenceNumber);
> >> +	LP_SEQ_PRINTF(GlobalTscCount);
> >> +	LP_SEQ_PRINTF(ActiveTscCount);
> >> +	LP_SEQ_PRINTF(IdleAccumulation);
> >> +	LP_SEQ_PRINTF(ReferenceCycleCount0);
> >> +	LP_SEQ_PRINTF(ActualCycleCount0);
> >> +	LP_SEQ_PRINTF(ReferenceCycleCount1);
> >> +	LP_SEQ_PRINTF(ActualCycleCount1);
> >> +	LP_SEQ_PRINTF(ProximityDomainId);
> >> +	LP_SEQ_PRINTF(PostedInterruptNotifications);
> >> +	LP_SEQ_PRINTF(BranchPredictorFlushes);
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	LP_SEQ_PRINTF(L1DataCacheFlushes);
> >> +	LP_SEQ_PRINTF(ImmediateL1DataCacheFlushes);
> >> +	LP_SEQ_PRINTF(MbFlushes);
> >> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
> >> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
> >> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
> >> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
> >> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
> >> +	LP_SEQ_PRINTF(Placeholder1);
> >> +	LP_SEQ_PRINTF(Placeholder2);
> >> +	LP_SEQ_PRINTF(Placeholder3);
> >> +	LP_SEQ_PRINTF(Placeholder4);
> >> +	LP_SEQ_PRINTF(Placeholder5);
> >> +	LP_SEQ_PRINTF(Placeholder6);
> >> +	LP_SEQ_PRINTF(Placeholder7);
> >> +	LP_SEQ_PRINTF(Placeholder8);
> >> +	LP_SEQ_PRINTF(Placeholder9);
> >> +	LP_SEQ_PRINTF(Placeholder10);
> >> +	LP_SEQ_PRINTF(ReserveGroupId);
> >> +	LP_SEQ_PRINTF(RunningPriority);
> >> +	LP_SEQ_PRINTF(PerfmonInterruptCount);
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	LP_SEQ_PRINTF(CounterRefreshSequenceNumber);
> >> +	LP_SEQ_PRINTF(CounterRefreshReferenceTime);
> >> +	LP_SEQ_PRINTF(IdleAccumulationSnapshot);
> >> +	LP_SEQ_PRINTF(ActiveTscCountSnapshot);
> >> +	LP_SEQ_PRINTF(HwpRequestContextSwitches);
> >> +	LP_SEQ_PRINTF(Placeholder2);
> >> +	LP_SEQ_PRINTF(Placeholder3);
> >> +	LP_SEQ_PRINTF(Placeholder4);
> >> +	LP_SEQ_PRINTF(Placeholder5);
> >> +	LP_SEQ_PRINTF(Placeholder6);
> >> +	LP_SEQ_PRINTF(Placeholder7);
> >> +	LP_SEQ_PRINTF(Placeholder8);
> >> +	LP_SEQ_PRINTF(Placeholder9);
> >> +	LP_SEQ_PRINTF(SchLocalRunListSize);
> >> +	LP_SEQ_PRINTF(ReserveGroupId);
> >> +	LP_SEQ_PRINTF(RunningPriority);
> >> +#endif
> >> +
> >> +	return 0;
> >> +}
> >> +DEFINE_SHOW_ATTRIBUTE(lp_stats);
> >> +
> >> +static void mshv_lp_stats_unmap(u32 lp_index, void *stats_page_addr)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.lp.lp_index = lp_index,
> >> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
> >> +	};
> >> +	int err;
> >> +
> >> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR,
> >> +				  stats_page_addr, &identity);
> >> +	if (err)
> >> +		pr_err("%s: failed to unmap logical processor %u stats, err: %d\n",
> >> +		       __func__, lp_index, err);
> >> +}
> >> +
> >> +static void __init *mshv_lp_stats_map(u32 lp_index)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.lp.lp_index = lp_index,
> >> +		.lp.stats_area_type = HV_STATS_AREA_SELF,
> >> +	};
> >> +	void *stats;
> >> +	int err;
> >> +
> >> +	err = hv_map_stats_page(HV_STATS_OBJECT_LOGICAL_PROCESSOR, &identity,
> >> +				&stats);
> >> +	if (err) {
> >> +		pr_err("%s: failed to map logical processor %u stats, err: %d\n",
> >> +		       __func__, lp_index, err);
> >> +		return ERR_PTR(err);
> >> +	}
> >> +
> >> +	return stats;
> >> +}
> >> +
> >> +static void __init *lp_debugfs_stats_create(u32 lp_index, struct dentry *parent)
> >> +{
> >> +	struct dentry *dentry;
> >> +	void *stats;
> >> +
> >> +	stats = mshv_lp_stats_map(lp_index);
> >> +	if (IS_ERR(stats))
> >> +		return stats;
> >> +
> >> +	dentry = debugfs_create_file("stats", 0400, parent,
> >> +				     stats, &lp_stats_fops);
> >> +	if (IS_ERR(dentry)) {
> >> +		mshv_lp_stats_unmap(lp_index, stats);
> >> +		return dentry;
> >> +	}
> >> +	return stats;
> >> +}
> >> +
> >> +static int __init lp_debugfs_create(u32 lp_index, struct dentry *parent)
> >> +{
> >> +	struct dentry *idx;
> >> +	char lp_idx_str[U32_BUF_SZ];
> >> +	void *stats;
> >> +	int err;
> >> +
> >> +	sprintf(lp_idx_str, "%u", lp_index);
> >> +
> >> +	idx = debugfs_create_dir(lp_idx_str, parent);
> >> +	if (IS_ERR(idx))
> >> +		return PTR_ERR(idx);
> >> +
> >> +	stats = lp_debugfs_stats_create(lp_index, idx);
> >> +	if (IS_ERR(stats)) {
> >> +		err = PTR_ERR(stats);
> >> +		goto remove_debugfs_lp_idx;
> >> +	}
> >> +
> >> +	return 0;
> >> +
> >> +remove_debugfs_lp_idx:
> >> +	debugfs_remove_recursive(idx);
> >> +	return err;
> >> +}
> >> +
> >> +static void mshv_debugfs_lp_remove(void)
> >> +{
> >> +	int lp_index;
> >> +
> >> +	debugfs_remove_recursive(mshv_debugfs_lp);
> >> +
> >> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++)
> >> +		mshv_lp_stats_unmap(lp_index, NULL);
> >
> > Passing NULL as the second argument here leaks the stats page
> > memory if Linux allocated the page as an overlay GPFN. But is that
> > considered OK because the debugfs entries for LPs are removed
> > only when the root partition is shutting down? That works as
> > long as hot-add/remove of CPUs isn't supported in the root
> > partition.
> >
> Hmm, at the very least this appears to be a memory leak if the mshv
> driver is built as a module and removed + reinserted. The stats
> pages can be mapped multiple times so it will just allocate a page
> (on L1VH anyway) and remap it each time. I will check and fix it in
> this patch.

OK. I was thinking that removing and re-inserting the mshv driver
module isn't possible from any practical standpoint without doing
a shutdown, but maybe there is a way.
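
The leak discussed above comes from the teardown path passing NULL to the unmap call, so any page Linux allocated as the overlay GPFN is never handed back. As a rough user-space analogue of the proposed fix, the driver could remember each address returned at map time so teardown can pass the real pointer; the names here (lp_stats_pages, lp_stats_map, lp_stats_unmap, MAX_LPS) are hypothetical stand-ins, not the driver's actual helpers:

```c
#include <stdlib.h>

#define MAX_LPS 8

/* Per-LP bookkeeping: keep the address returned at map time so that
 * teardown can hand back the real page (here modeled by free()) instead
 * of passing NULL and leaking the allocation. */
static void *lp_stats_pages[MAX_LPS];

static void *lp_stats_map(unsigned int lp_index)
{
	/* Stands in for the overlay page the hypercall path allocates */
	lp_stats_pages[lp_index] = calloc(1, 4096);
	return lp_stats_pages[lp_index];
}

static void lp_stats_unmap(unsigned int lp_index)
{
	/* Release the recorded page and clear the slot for safe re-entry */
	free(lp_stats_pages[lp_index]);
	lp_stats_pages[lp_index] = NULL;
}
```

With bookkeeping like this, a module remove + reinsert cycle would unmap and free the same pages it mapped, rather than allocating fresh ones each time.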

> 
> >> +}
> >> +
> >> +static int __init mshv_debugfs_lp_create(struct dentry *parent)
> >> +{
> >> +	struct dentry *lp_dir;
> >> +	int err, lp_index;
> >> +
> >> +	lp_dir = debugfs_create_dir("lp", parent);
> >> +	if (IS_ERR(lp_dir))
> >> +		return PTR_ERR(lp_dir);
> >> +
> >> +	for (lp_index = 0; lp_index < mshv_lps_count; lp_index++) {
> >> +		err = lp_debugfs_create(lp_index, lp_dir);
> >> +		if (err)
> >> +			goto remove_debugfs_lps;
> >> +	}
> >> +
> >> +	mshv_debugfs_lp = lp_dir;
> >> +
> >> +	return 0;
> >> +
> >> +remove_debugfs_lps:
> >> +	for (lp_index -= 1; lp_index >= 0; lp_index--)
> >> +		mshv_lp_stats_unmap(lp_index, NULL);
> >> +	debugfs_remove_recursive(lp_dir);
> >> +	return err;
> >> +}
> >> +
> >> +static int vp_stats_show(struct seq_file *m, void *v)
> >> +{
> >> +	const struct hv_stats_page **pstats = m->private;
> >> +
> >> +#define VP_SEQ_PRINTF(cnt)				 \
> >> +do {								 \
> >> +	if (pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]) \
> >> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> >> +			pstats[HV_STATS_AREA_SELF]->vp_cntrs[Vp##cnt]); \
> >> +	else \
> >> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> >> +			pstats[HV_STATS_AREA_PARENT]->vp_cntrs[Vp##cnt]); \
> >> +} while (0)
> >
> > I don't understand this logic. Like in mshv_vp_dispatch_thread_blocked(), if
> > the SELF value is zero, then the PARENT value is used. The implication is that
> > you never want to display a SELF value of zero, which is a bit unexpected
> > since I could imagine zero being valid for some counters. But the overall result
> > is that the displayed values may be a mix of SELF and PARENT values.
> 
> Yes, the basic idea is: display a nonzero value if there is one on either the SELF
> or PARENT page. (I *think* the values will always be the same if they are nonzero.)
> 
> I admit it's not an ideal design from my perspective. As far as I know, it was
> done this way to retain backward compatibility with hypervisors that don't support
> the concept of a PARENT stats area at all.
> 
> > And of course after Patch 1 of this series, if running on an older hypervisor
> > that doesn't provide PARENT, then SELF will be used anyway, which further
> > muddies what's going on here, at least for me. :-)
> >
> 
> Yes, but in the end we need to check both pages, so there's no avoiding this
> redundant check on old hypervisors without adding a separate code path just for
> that case, which doesn't seem worth it.
> 
> > If this is the correct behavior, please add some code comments as to
> > why it makes sense, including in the case where PARENT isn't available.
> >
> 
> Ok, will do.
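
The SELF-then-PARENT selection that VP_SEQ_PRINTF() open-codes could also be captured in one small helper, which is a natural place to hang the promised comment. A minimal sketch, assuming a hypothetical vp_cntr_read() helper (the real code indexes struct hv_stats_page):

```c
enum { HV_STATS_AREA_SELF, HV_STATS_AREA_PARENT, HV_STATS_AREA_COUNT };

/*
 * Prefer the SELF page's value and fall back to PARENT only when SELF
 * reads zero. On older hypervisors with no PARENT stats area, both
 * slots point at the same (SELF) page, so the fallback degenerates to
 * a harmless re-read of the same counter.
 */
static unsigned long long
vp_cntr_read(const unsigned long long *pages[], unsigned int idx)
{
	unsigned long long val = pages[HV_STATS_AREA_SELF][idx];

	return val ? val : pages[HV_STATS_AREA_PARENT][idx];
}
```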
> 
> >> +
> >> +	VP_SEQ_PRINTF(TotalRunTime);
> >> +	VP_SEQ_PRINTF(HypervisorRunTime);
> >> +	VP_SEQ_PRINTF(RemoteNodeRunTime);
> >> +	VP_SEQ_PRINTF(NormalizedRunTime);
> >> +	VP_SEQ_PRINTF(IdealCpu);
> >> +	VP_SEQ_PRINTF(HypercallsCount);
> >> +	VP_SEQ_PRINTF(HypercallsTime);
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	VP_SEQ_PRINTF(PageInvalidationsCount);
> >> +	VP_SEQ_PRINTF(PageInvalidationsTime);
> >> +	VP_SEQ_PRINTF(ControlRegisterAccessesCount);
> >> +	VP_SEQ_PRINTF(ControlRegisterAccessesTime);
> >> +	VP_SEQ_PRINTF(IoInstructionsCount);
> >> +	VP_SEQ_PRINTF(IoInstructionsTime);
> >> +	VP_SEQ_PRINTF(HltInstructionsCount);
> >> +	VP_SEQ_PRINTF(HltInstructionsTime);
> >> +	VP_SEQ_PRINTF(MwaitInstructionsCount);
> >> +	VP_SEQ_PRINTF(MwaitInstructionsTime);
> >> +	VP_SEQ_PRINTF(CpuidInstructionsCount);
> >> +	VP_SEQ_PRINTF(CpuidInstructionsTime);
> >> +	VP_SEQ_PRINTF(MsrAccessesCount);
> >> +	VP_SEQ_PRINTF(MsrAccessesTime);
> >> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> >> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> >> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> >> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> >> +	VP_SEQ_PRINTF(EmulatedInstructionsCount);
> >> +	VP_SEQ_PRINTF(EmulatedInstructionsTime);
> >> +	VP_SEQ_PRINTF(DebugRegisterAccessesCount);
> >> +	VP_SEQ_PRINTF(DebugRegisterAccessesTime);
> >> +	VP_SEQ_PRINTF(PageFaultInterceptsCount);
> >> +	VP_SEQ_PRINTF(PageFaultInterceptsTime);
> >> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> >> +	VP_SEQ_PRINTF(LargePageTlbFills);
> >> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> >> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> >> +	VP_SEQ_PRINTF(ApicMmioAccesses);
> >> +	VP_SEQ_PRINTF(IoInterceptMessages);
> >> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> >> +	VP_SEQ_PRINTF(ApicEoiAccesses);
> >> +	VP_SEQ_PRINTF(OtherMessages);
> >> +	VP_SEQ_PRINTF(PageTableAllocations);
> >> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> >> +	VP_SEQ_PRINTF(AddressSpaceEvictions);
> >> +	VP_SEQ_PRINTF(AddressSpaceSwitches);
> >> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> >> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> >> +	VP_SEQ_PRINTF(GlobalGvaRangeFlushes);
> >> +	VP_SEQ_PRINTF(LocalGvaRangeFlushes);
> >> +	VP_SEQ_PRINTF(PageTableEvictions);
> >> +	VP_SEQ_PRINTF(PageTableReclamations);
> >> +	VP_SEQ_PRINTF(PageTableResets);
> >> +	VP_SEQ_PRINTF(PageTableValidations);
> >> +	VP_SEQ_PRINTF(ApicTprAccesses);
> >> +	VP_SEQ_PRINTF(PageTableWriteIntercepts);
> >> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> >> +	VP_SEQ_PRINTF(VirtualInterrupts);
> >> +	VP_SEQ_PRINTF(ApicIpisSent);
> >> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> >> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> >> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> >> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> >> +	VP_SEQ_PRINTF(OtherHypercalls);
> >> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> >> +	VP_SEQ_PRINTF(HardwareInterrupts);
> >> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> >> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> >> +	VP_SEQ_PRINTF(PageScans);
> >> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> >> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> >> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> >> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> >> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> >> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> >> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> >> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> >> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> >> +	VP_SEQ_PRINTF(HypercallsForwardedCount);
> >> +	VP_SEQ_PRINTF(HypercallsForwardingTime);
> >> +	VP_SEQ_PRINTF(PageInvalidationsForwardedCount);
> >> +	VP_SEQ_PRINTF(PageInvalidationsForwardingTime);
> >> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardedCount);
> >> +	VP_SEQ_PRINTF(ControlRegisterAccessesForwardingTime);
> >> +	VP_SEQ_PRINTF(IoInstructionsForwardedCount);
> >> +	VP_SEQ_PRINTF(IoInstructionsForwardingTime);
> >> +	VP_SEQ_PRINTF(HltInstructionsForwardedCount);
> >> +	VP_SEQ_PRINTF(HltInstructionsForwardingTime);
> >> +	VP_SEQ_PRINTF(MwaitInstructionsForwardedCount);
> >> +	VP_SEQ_PRINTF(MwaitInstructionsForwardingTime);
> >> +	VP_SEQ_PRINTF(CpuidInstructionsForwardedCount);
> >> +	VP_SEQ_PRINTF(CpuidInstructionsForwardingTime);
> >> +	VP_SEQ_PRINTF(MsrAccessesForwardedCount);
> >> +	VP_SEQ_PRINTF(MsrAccessesForwardingTime);
> >> +	VP_SEQ_PRINTF(OtherInterceptsForwardedCount);
> >> +	VP_SEQ_PRINTF(OtherInterceptsForwardingTime);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsForwardedCount);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsForwardingTime);
> >> +	VP_SEQ_PRINTF(PendingInterruptsForwardedCount);
> >> +	VP_SEQ_PRINTF(PendingInterruptsForwardingTime);
> >> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardedCount);
> >> +	VP_SEQ_PRINTF(EmulatedInstructionsForwardingTime);
> >> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardedCount);
> >> +	VP_SEQ_PRINTF(DebugRegisterAccessesForwardingTime);
> >> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardedCount);
> >> +	VP_SEQ_PRINTF(PageFaultInterceptsForwardingTime);
> >> +	VP_SEQ_PRINTF(VmclearEmulationCount);
> >> +	VP_SEQ_PRINTF(VmclearEmulationTime);
> >> +	VP_SEQ_PRINTF(VmptrldEmulationCount);
> >> +	VP_SEQ_PRINTF(VmptrldEmulationTime);
> >> +	VP_SEQ_PRINTF(VmptrstEmulationCount);
> >> +	VP_SEQ_PRINTF(VmptrstEmulationTime);
> >> +	VP_SEQ_PRINTF(VmreadEmulationCount);
> >> +	VP_SEQ_PRINTF(VmreadEmulationTime);
> >> +	VP_SEQ_PRINTF(VmwriteEmulationCount);
> >> +	VP_SEQ_PRINTF(VmwriteEmulationTime);
> >> +	VP_SEQ_PRINTF(VmxoffEmulationCount);
> >> +	VP_SEQ_PRINTF(VmxoffEmulationTime);
> >> +	VP_SEQ_PRINTF(VmxonEmulationCount);
> >> +	VP_SEQ_PRINTF(VmxonEmulationTime);
> >> +	VP_SEQ_PRINTF(NestedVMEntriesCount);
> >> +	VP_SEQ_PRINTF(NestedVMEntriesTime);
> >> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsCount);
> >> +	VP_SEQ_PRINTF(NestedSLATSoftPageFaultsTime);
> >> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsCount);
> >> +	VP_SEQ_PRINTF(NestedSLATHardPageFaultsTime);
> >> +	VP_SEQ_PRINTF(InvEptAllContextEmulationCount);
> >> +	VP_SEQ_PRINTF(InvEptAllContextEmulationTime);
> >> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationCount);
> >> +	VP_SEQ_PRINTF(InvEptSingleContextEmulationTime);
> >> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationCount);
> >> +	VP_SEQ_PRINTF(InvVpidAllContextEmulationTime);
> >> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationCount);
> >> +	VP_SEQ_PRINTF(InvVpidSingleContextEmulationTime);
> >> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationCount);
> >> +	VP_SEQ_PRINTF(InvVpidSingleAddressEmulationTime);
> >> +	VP_SEQ_PRINTF(NestedTlbPageTableReclamations);
> >> +	VP_SEQ_PRINTF(NestedTlbPageTableEvictions);
> >> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> >> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> >> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> >> +	VP_SEQ_PRINTF(PostedInterruptScans);
> >> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> >> +	VP_SEQ_PRINTF(MaximumRunTime);
> >> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> >> +	VP_SEQ_PRINTF(VmloadEmulationCount);
> >> +	VP_SEQ_PRINTF(VmloadEmulationTime);
> >> +	VP_SEQ_PRINTF(VmsaveEmulationCount);
> >> +	VP_SEQ_PRINTF(VmsaveEmulationTime);
> >> +	VP_SEQ_PRINTF(GifInstructionEmulationCount);
> >> +	VP_SEQ_PRINTF(GifInstructionEmulationTime);
> >> +	VP_SEQ_PRINTF(EmulatedErrataSvmInstructions);
> >> +	VP_SEQ_PRINTF(Placeholder1);
> >> +	VP_SEQ_PRINTF(Placeholder2);
> >> +	VP_SEQ_PRINTF(Placeholder3);
> >> +	VP_SEQ_PRINTF(Placeholder4);
> >> +	VP_SEQ_PRINTF(Placeholder5);
> >> +	VP_SEQ_PRINTF(Placeholder6);
> >> +	VP_SEQ_PRINTF(Placeholder7);
> >> +	VP_SEQ_PRINTF(Placeholder8);
> >> +	VP_SEQ_PRINTF(Placeholder9);
> >> +	VP_SEQ_PRINTF(Placeholder10);
> >> +	VP_SEQ_PRINTF(SchedulingPriority);
> >> +	VP_SEQ_PRINTF(RdpmcInstructionsCount);
> >> +	VP_SEQ_PRINTF(RdpmcInstructionsTime);
> >> +	VP_SEQ_PRINTF(PerfmonPmuMsrAccessesCount);
> >> +	VP_SEQ_PRINTF(PerfmonLbrMsrAccessesCount);
> >> +	VP_SEQ_PRINTF(PerfmonIptMsrAccessesCount);
> >> +	VP_SEQ_PRINTF(PerfmonInterruptCount);
> >> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> >> +	VP_SEQ_PRINTF(Vtl1RunTime);
> >> +	VP_SEQ_PRINTF(Vtl2RunTime);
> >> +	VP_SEQ_PRINTF(IommuHypercalls);
> >> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> >> +	VP_SEQ_PRINTF(VsmHypercalls);
> >> +	VP_SEQ_PRINTF(EventLogHypercalls);
> >> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> >> +	VP_SEQ_PRINTF(DepositHypercalls);
> >> +	VP_SEQ_PRINTF(SvmHypercalls);
> >> +	VP_SEQ_PRINTF(BusLockAcquisitionCount);
> >
> > The x86 VpUnused counter is not shown. Any reason for that? All the
> > Placeholder counters *are* shown, so I'm just wondering what's
> > different.
> >
> 
> Good question, I believe when this code was written VpUnused was
> actually undefined in our headers, because the value 201 was
> temporarily used for VpRootDispatchThreadBlocked before that was
> changed to 202 (the hypervisor version using 201 was never released
> publicly so not considered a breaking change).
> 
> Checking the code, 201 now refers to VpLoadAvg on x86 so I will
> update the definitions in patch #2 of this series to include that,
> and add it here in the debugfs code.
> 
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	VP_SEQ_PRINTF(SysRegAccessesCount);
> >> +	VP_SEQ_PRINTF(SysRegAccessesTime);
> >> +	VP_SEQ_PRINTF(SmcInstructionsCount);
> >> +	VP_SEQ_PRINTF(SmcInstructionsTime);
> >> +	VP_SEQ_PRINTF(OtherInterceptsCount);
> >> +	VP_SEQ_PRINTF(OtherInterceptsTime);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsCount);
> >> +	VP_SEQ_PRINTF(ExternalInterruptsTime);
> >> +	VP_SEQ_PRINTF(PendingInterruptsCount);
> >> +	VP_SEQ_PRINTF(PendingInterruptsTime);
> >> +	VP_SEQ_PRINTF(GuestPageTableMaps);
> >> +	VP_SEQ_PRINTF(LargePageTlbFills);
> >> +	VP_SEQ_PRINTF(SmallPageTlbFills);
> >> +	VP_SEQ_PRINTF(ReflectedGuestPageFaults);
> >> +	VP_SEQ_PRINTF(MemoryInterceptMessages);
> >> +	VP_SEQ_PRINTF(OtherMessages);
> >> +	VP_SEQ_PRINTF(LogicalProcessorMigrations);
> >> +	VP_SEQ_PRINTF(AddressDomainFlushes);
> >> +	VP_SEQ_PRINTF(AddressSpaceFlushes);
> >> +	VP_SEQ_PRINTF(SyntheticInterrupts);
> >> +	VP_SEQ_PRINTF(VirtualInterrupts);
> >> +	VP_SEQ_PRINTF(ApicSelfIpisSent);
> >> +	VP_SEQ_PRINTF(GpaSpaceHypercalls);
> >> +	VP_SEQ_PRINTF(LogicalProcessorHypercalls);
> >> +	VP_SEQ_PRINTF(LongSpinWaitHypercalls);
> >> +	VP_SEQ_PRINTF(OtherHypercalls);
> >> +	VP_SEQ_PRINTF(SyntheticInterruptHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualInterruptHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualMmuHypercalls);
> >> +	VP_SEQ_PRINTF(VirtualProcessorHypercalls);
> >> +	VP_SEQ_PRINTF(HardwareInterrupts);
> >> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsCount);
> >> +	VP_SEQ_PRINTF(NestedPageFaultInterceptsTime);
> >> +	VP_SEQ_PRINTF(LogicalProcessorDispatches);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTime);
> >> +	VP_SEQ_PRINTF(ExtendedHypercalls);
> >> +	VP_SEQ_PRINTF(ExtendedHypercallInterceptMessages);
> >> +	VP_SEQ_PRINTF(MbecNestedPageTableSwitches);
> >> +	VP_SEQ_PRINTF(OtherReflectedGuestExceptions);
> >> +	VP_SEQ_PRINTF(GlobalIoTlbFlushes);
> >> +	VP_SEQ_PRINTF(GlobalIoTlbFlushCost);
> >> +	VP_SEQ_PRINTF(LocalIoTlbFlushes);
> >> +	VP_SEQ_PRINTF(LocalIoTlbFlushCost);
> >> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressSpaceHypercalls);
> >> +	VP_SEQ_PRINTF(FlushGuestPhysicalAddressListHypercalls);
> >> +	VP_SEQ_PRINTF(PostedInterruptNotifications);
> >> +	VP_SEQ_PRINTF(PostedInterruptScans);
> >> +	VP_SEQ_PRINTF(TotalCoreRunTime);
> >> +	VP_SEQ_PRINTF(MaximumRunTime);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket0);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket1);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket2);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket3);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket4);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket5);
> >> +	VP_SEQ_PRINTF(WaitingForCpuTimeBucket6);
> >> +	VP_SEQ_PRINTF(HwpRequestContextSwitches);
> >> +	VP_SEQ_PRINTF(Placeholder2);
> >> +	VP_SEQ_PRINTF(Placeholder3);
> >> +	VP_SEQ_PRINTF(Placeholder4);
> >> +	VP_SEQ_PRINTF(Placeholder5);
> >> +	VP_SEQ_PRINTF(Placeholder6);
> >> +	VP_SEQ_PRINTF(Placeholder7);
> >> +	VP_SEQ_PRINTF(Placeholder8);
> >> +	VP_SEQ_PRINTF(ContentionTime);
> >> +	VP_SEQ_PRINTF(WakeUpTime);
> >> +	VP_SEQ_PRINTF(SchedulingPriority);
> >> +	VP_SEQ_PRINTF(Vtl1DispatchCount);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchCount);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket0);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket1);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket2);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket3);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket4);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket5);
> >> +	VP_SEQ_PRINTF(Vtl2DispatchBucket6);
> >> +	VP_SEQ_PRINTF(Vtl1RunTime);
> >> +	VP_SEQ_PRINTF(Vtl2RunTime);
> >> +	VP_SEQ_PRINTF(IommuHypercalls);
> >> +	VP_SEQ_PRINTF(CpuGroupHypercalls);
> >> +	VP_SEQ_PRINTF(VsmHypercalls);
> >> +	VP_SEQ_PRINTF(EventLogHypercalls);
> >> +	VP_SEQ_PRINTF(DeviceDomainHypercalls);
> >> +	VP_SEQ_PRINTF(DepositHypercalls);
> >> +	VP_SEQ_PRINTF(SvmHypercalls);
> >
> > The ARM64 VpLoadAvg counter is not shown?  Any reason why?
> >
> 
> I'm not sure, but it could be related to the reasoning in the above
> comment - likely VpLoadAvg didn't exist before. I will add it.
> 
> >> +#endif
> >
> > The VpRootDispatchThreadBlocked counter is not shown for either
> > x86 or ARM64. Is that intentional, and if so, why? I know the counter
> > is used in mshv_vp_dispatch_thread_blocked(), but it's not clear why
> > that means it shouldn't be shown here.
> >
> 
> VpRootDispatchThreadBlocked is not really a 'stat' that you might want
> to expose like the other values; it's really a boolean control value
> that was tacked onto the vp stats page to facilitate fast interrupt
> injection used by the root scheduler. As such it isn't of much value to
> userspace.

I'd probably show it just for completeness and consistency, but I
don't have strong views on the topic.

> 
> >> +
> >> +	return 0;
> >> +}
> >
> > This function, vp_stats_show(), seems like a candidate for redoing based on a
> > static table that lists the counter names and index. Then the code just loops
> > through the table. On x86 each VP_SEQ_PRINTF() generates 42 bytes of code,
> > and there are 199 entries, so 8358 bytes. The table entries would probably
> > be 16 bytes each (a 64-bit pointer to the string constant, a 32-bit index value,
> > and 4 bytes of padding so each entry is 8-byte aligned). The actual space
> > saving isn't that large, but the code would be a lot more compact. The
> > other *_stats_show() functions could do the same.
> >
> > It's distasteful to me to see 420 lines of enum entries in Patch 2 of this series,
> > then followed by another 420 lines of matching *_SEQ_PRINTF entries. But I
> > realize that the goal of the enum entries is to match the Windows code, so I
> > guess it is what it is. But there's an argument for ditching the enum entries
> > entirely, and using the putative static table to capture the information. It
> > doesn't seem like matching the Windows code is saving much sync effort
> > since any additions/ subtractions to the enum entries need to be matched
> > with changes in the *_stats_show() functions, or in my putative static table.
> > But I guess if Windows changed only the value for an enum entry without
> > additions/subtractions, that would sync more easily.
> >
> 
> Keeping the definitions as close to Windows code as possible is a high priority,
> for consistency and hopefully partially automating that process in future. So,
> I'm against throwing away the enum values. The downside of having to update
> two code locations when adding a new enum member is fine by me.
> 
> I'm not against replacing this sequence of macros with a loop over a table like
> the one you propose (in addition to keeping the enum values). That would save
> some space as you point out above, but the impact is fairly minimal.
> 
> In terms of aesthetics, the definition for a table will look very similar to
> the list of VP_SEQ_PRINTF() calls that are currently here. So all in all, I don't
> see a strong reason to switch to a table, unless the space issue is more important
> than I realize.
> 
> > I'm just throwing this out as a thought. You may prefer to keep everything
> > "as is", in which case ignore my comment and I won't raise it again.
> >
> 
> Thanks, feel free to follow up if you have further thoughts on this part, I'm
> open to changing it if there's a reason. Right now it feels like mainly an
> aesthetics/cleanliness argument and I'm not sure it's worth the effort.

No further thoughts. I wanted to broach the idea, but I'm fine with
your judgment.
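
For the record, the table-driven alternative floated above fits in a few lines. This user-space sketch uses hypothetical names (vp_stat_entry, vp_stats_format, and a three-counter subset of the Vp* indices); the kernel version would iterate the full table calling seq_printf() instead of snprintf():

```c
#include <stdio.h>

/* Hypothetical subset of the Vp* counter indices */
enum { VpTotalRunTime, VpHypervisorRunTime, VpHypercallsCount, VP_CNTRS_MAX };

/* One table entry: printable name plus index into the counters array */
struct vp_stat_entry {
	const char *name;
	unsigned int index;
};

static const struct vp_stat_entry vp_stat_table[] = {
	{ "TotalRunTime",      VpTotalRunTime },
	{ "HypervisorRunTime", VpHypervisorRunTime },
	{ "HypercallsCount",   VpHypercallsCount },
};

/*
 * Format every counter named in the table into buf. Replacing the long
 * run of VP_SEQ_PRINTF() invocations with this loop trades per-counter
 * code for 16-byte table entries, as suggested in the review.
 */
static int vp_stats_format(char *buf, unsigned long len,
			   const unsigned long long *cntrs)
{
	unsigned long i, off = 0;

	for (i = 0; i < sizeof(vp_stat_table) / sizeof(vp_stat_table[0]); i++)
		off += snprintf(buf + off, len - off, "%-30s: %llu\n",
				vp_stat_table[i].name,
				cntrs[vp_stat_table[i].index]);
	return (int)off;
}
```

The SELF/PARENT fallback could still live inside the loop body, so the table approach composes with the existing zero-check logic.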

> 
> >> +DEFINE_SHOW_ATTRIBUTE(vp_stats);
> >> +
> >> +static void mshv_vp_stats_unmap(u64 partition_id, u32 vp_index, void *stats_page_addr,
> >> +				enum hv_stats_area_type stats_area_type)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.vp.partition_id = partition_id,
> >> +		.vp.vp_index = vp_index,
> >> +		.vp.stats_area_type = stats_area_type,
> >> +	};
> >> +	int err;
> >> +
> >> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_VP, stats_page_addr, &identity);
> >> +	if (err)
> >> +		pr_err("%s: failed to unmap partition %llu vp %u %s stats, err: %d\n",
> >> +		       __func__, partition_id, vp_index,
> >> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> >> +		       err);
> >> +}
> >> +
> >> +static void *mshv_vp_stats_map(u64 partition_id, u32 vp_index,
> >> +			       enum hv_stats_area_type stats_area_type)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.vp.partition_id = partition_id,
> >> +		.vp.vp_index = vp_index,
> >> +		.vp.stats_area_type = stats_area_type,
> >> +	};
> >> +	void *stats;
> >> +	int err;
> >> +
> >> +	err = hv_map_stats_page(HV_STATS_OBJECT_VP, &identity, &stats);
> >> +	if (err) {
> >> +		pr_err("%s: failed to map partition %llu vp %u %s stats, err: %d\n",
> >> +		       __func__, partition_id, vp_index,
> >> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> >> +		       err);
> >> +		return ERR_PTR(err);
> >> +	}
> >> +	return stats;
> >> +}
> >
> > Presumably you've noticed that the functions mshv_vp_stats_map() and
> > mshv_vp_stats_unmap() also exist in mshv_root_main.c.  They are static
> > functions in both places, so the compiler & linker do the right thing, but
> > it sure does make things a bit more complex for human readers. The versions
> > here follow a consistent pattern for (lp, vp, hv, partition), so maybe the ones
> > in mshv_root_main.c could be renamed to avoid confusion?
> >
> 
> Good point - this is being addressed in our internal tree but hasn't made it into
> this patch set. I will consider squashing that into a later version of this set,
> but for now I'm treating it as a separate cleanup patch to send later.

OK

> 
> >> +
> >> +static int vp_debugfs_stats_create(u64 partition_id, u32 vp_index,
> >> +				   struct dentry **vp_stats_ptr,
> >> +				   struct dentry *parent)
> >> +{
> >> +	struct dentry *dentry;
> >> +	struct hv_stats_page **pstats;
> >> +	int err;
> >> +
> >> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> >
> > Open coding "2" as the first parameter makes assumptions about the values of
> > HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT.  Should use
> > HV_STATS_AREA_COUNT instead of "2" so that indexing into the array is certain
> > to work.
> >
> 
> Thanks, I'll change it to use HV_STATS_AREA_COUNT.
> 
> >> +	if (!pstats)
> >> +		return -ENOMEM;
> >> +
> >> +	pstats[HV_STATS_AREA_SELF] = mshv_vp_stats_map(partition_id, vp_index,
> >> +						       HV_STATS_AREA_SELF);
> >> +	if (IS_ERR(pstats[HV_STATS_AREA_SELF])) {
> >> +		err = PTR_ERR(pstats[HV_STATS_AREA_SELF]);
> >> +		goto cleanup;
> >> +	}
> >> +
> >> +	/*
> >> +	 * L1VH partition cannot access its vp stats in parent area.
> >> +	 */
> >> +	if (is_l1vh_parent(partition_id)) {
> >> +		pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> >> +	} else {
> >> +		pstats[HV_STATS_AREA_PARENT] = mshv_vp_stats_map(
> >> +			partition_id, vp_index, HV_STATS_AREA_PARENT);
> >> +		if (IS_ERR(pstats[HV_STATS_AREA_PARENT])) {
> >> +			err = PTR_ERR(pstats[HV_STATS_AREA_PARENT]);
> >> +			goto unmap_self;
> >> +		}
> >> +		if (!pstats[HV_STATS_AREA_PARENT])
> >> +			pstats[HV_STATS_AREA_PARENT] = pstats[HV_STATS_AREA_SELF];
> >> +	}
> >> +
> >> +	dentry = debugfs_create_file("stats", 0400, parent,
> >> +				     pstats, &vp_stats_fops);
> >> +	if (IS_ERR(dentry)) {
> >> +		err = PTR_ERR(dentry);
> >> +		goto unmap_vp_stats;
> >> +	}
> >> +
> >> +	*vp_stats_ptr = dentry;
> >> +	return 0;
> >> +
> >> +unmap_vp_stats:
> >> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF])
> >> +		mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_PARENT],
> >> +				    HV_STATS_AREA_PARENT);
> >> +unmap_self:
> >> +	mshv_vp_stats_unmap(partition_id, vp_index, pstats[HV_STATS_AREA_SELF],
> >> +			    HV_STATS_AREA_SELF);
> >> +cleanup:
> >> +	kfree(pstats);
> >> +	return err;
> >> +}
> >> +
> >> +static void vp_debugfs_remove(u64 partition_id, u32 vp_index,
> >> +			      struct dentry *vp_stats)
> >> +{
> >> +	struct hv_stats_page **pstats = NULL;
> >> +	void *stats;
> >> +
> >> +	pstats = vp_stats->d_inode->i_private;
> >> +	debugfs_remove_recursive(vp_stats->d_parent);
> >> +	if (pstats[HV_STATS_AREA_PARENT] != pstats[HV_STATS_AREA_SELF]) {
> >> +		stats = pstats[HV_STATS_AREA_PARENT];
> >> +		mshv_vp_stats_unmap(partition_id, vp_index, stats,
> >> +				    HV_STATS_AREA_PARENT);
> >> +	}
> >> +
> >> +	stats = pstats[HV_STATS_AREA_SELF];
> >> +	mshv_vp_stats_unmap(partition_id, vp_index, stats, HV_STATS_AREA_SELF);
> >> +
> >> +	kfree(pstats);
> >> +}
> >> +
> >> +static int vp_debugfs_create(u64 partition_id, u32 vp_index,
> >> +			     struct dentry **vp_stats_ptr,
> >> +			     struct dentry *parent)
> >> +{
> >> +	struct dentry *vp_idx_dir;
> >> +	char vp_idx_str[U32_BUF_SZ];
> >> +	int err;
> >> +
> >> +	sprintf(vp_idx_str, "%u", vp_index);
> >> +
> >> +	vp_idx_dir = debugfs_create_dir(vp_idx_str, parent);
> >> +	if (IS_ERR(vp_idx_dir))
> >> +		return PTR_ERR(vp_idx_dir);
> >> +
> >> +	err = vp_debugfs_stats_create(partition_id, vp_index, vp_stats_ptr,
> >> +				      vp_idx_dir);
> >> +	if (err)
> >> +		goto remove_debugfs_vp_idx;
> >> +
> >> +	return 0;
> >> +
> >> +remove_debugfs_vp_idx:
> >> +	debugfs_remove_recursive(vp_idx_dir);
> >> +	return err;
> >> +}
> >> +
> >> +static int partition_stats_show(struct seq_file *m, void *v)
> >> +{
> >> +	const struct hv_stats_page **pstats = m->private;
> >> +
> >> +#define PARTITION_SEQ_PRINTF(cnt)				 \
> >> +do {								 \
> >> +	if (pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]) \
> >> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> >> +			pstats[HV_STATS_AREA_SELF]->pt_cntrs[Partition##cnt]); \
> >> +	else \
> >> +		seq_printf(m, "%-30s: %llu\n", __stringify(cnt), \
> >> +			pstats[HV_STATS_AREA_PARENT]->pt_cntrs[Partition##cnt]); \
> >> +} while (0)
> >
> > Same comment as for VP_SEQ_PRINTF.
> >
> Ack
> 
> >> +
> >> +	PARTITION_SEQ_PRINTF(VirtualProcessors);
> >> +	PARTITION_SEQ_PRINTF(TlbSize);
> >> +	PARTITION_SEQ_PRINTF(AddressSpaces);
> >> +	PARTITION_SEQ_PRINTF(DepositedPages);
> >> +	PARTITION_SEQ_PRINTF(GpaPages);
> >> +	PARTITION_SEQ_PRINTF(GpaSpaceModifications);
> >> +	PARTITION_SEQ_PRINTF(VirtualTlbFlushEntires);
> >> +	PARTITION_SEQ_PRINTF(RecommendedTlbSize);
> >> +	PARTITION_SEQ_PRINTF(GpaPages4K);
> >> +	PARTITION_SEQ_PRINTF(GpaPages2M);
> >> +	PARTITION_SEQ_PRINTF(GpaPages1G);
> >> +	PARTITION_SEQ_PRINTF(GpaPages512G);
> >> +	PARTITION_SEQ_PRINTF(DevicePages4K);
> >> +	PARTITION_SEQ_PRINTF(DevicePages2M);
> >> +	PARTITION_SEQ_PRINTF(DevicePages1G);
> >> +	PARTITION_SEQ_PRINTF(DevicePages512G);
> >> +	PARTITION_SEQ_PRINTF(AttachedDevices);
> >> +	PARTITION_SEQ_PRINTF(DeviceInterruptMappings);
> >> +	PARTITION_SEQ_PRINTF(IoTlbFlushes);
> >> +	PARTITION_SEQ_PRINTF(IoTlbFlushCost);
> >> +	PARTITION_SEQ_PRINTF(DeviceInterruptErrors);
> >> +	PARTITION_SEQ_PRINTF(DeviceDmaErrors);
> >> +	PARTITION_SEQ_PRINTF(DeviceInterruptThrottleEvents);
> >> +	PARTITION_SEQ_PRINTF(SkippedTimerTicks);
> >> +	PARTITION_SEQ_PRINTF(PartitionId);
> >> +#if IS_ENABLED(CONFIG_X86_64)
> >> +	PARTITION_SEQ_PRINTF(NestedTlbSize);
> >> +	PARTITION_SEQ_PRINTF(RecommendedNestedTlbSize);
> >> +	PARTITION_SEQ_PRINTF(NestedTlbFreeListSize);
> >> +	PARTITION_SEQ_PRINTF(NestedTlbTrimmedPages);
> >> +	PARTITION_SEQ_PRINTF(PagesShattered);
> >> +	PARTITION_SEQ_PRINTF(PagesRecombined);
> >> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> >> +#elif IS_ENABLED(CONFIG_ARM64)
> >> +	PARTITION_SEQ_PRINTF(HwpRequestValue);
> >> +#endif
> >> +
> >> +	return 0;
> >> +}
> >> +DEFINE_SHOW_ATTRIBUTE(partition_stats);
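[Editorial note: the deduplication suggested by the VP_SEQ_PRINTF comment above could look roughly like the standalone sketch below. The `pick_counter()` helper name and the flat counter-array layout are assumptions for illustration, not the driver's actual code.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: select the counter value once, preferring the
 * SELF area and falling back to PARENT when SELF reads zero, so the
 * printing macro needs only a single seq_printf() call instead of two
 * duplicated branches.
 */
static uint64_t pick_counter(const uint64_t *self_cntrs,
			     const uint64_t *parent_cntrs, int idx)
{
	return self_cntrs[idx] ? self_cntrs[idx] : parent_cntrs[idx];
}
```

[With such a helper the macro body would reduce to one seq_printf(m, "%-30s: %llu\n", ...) line taking pick_counter()'s result.]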
> >> +
> >> +static void mshv_partition_stats_unmap(u64 partition_id, void *stats_page_addr,
> >> +				       enum hv_stats_area_type stats_area_type)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.partition.partition_id = partition_id,
> >> +		.partition.stats_area_type = stats_area_type,
> >> +	};
> >> +	int err;
> >> +
> >> +	err = hv_unmap_stats_page(HV_STATS_OBJECT_PARTITION, stats_page_addr,
> >> +				  &identity);
> >> +	if (err) {
> >> +		pr_err("%s: failed to unmap partition %lld %s stats, err: %d\n",
> >> +		       __func__, partition_id,
> >> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> >> +		       err);
> >> +	}
> >> +}
> >> +
> >> +static void *mshv_partition_stats_map(u64 partition_id,
> >> +				      enum hv_stats_area_type stats_area_type)
> >> +{
> >> +	union hv_stats_object_identity identity = {
> >> +		.partition.partition_id = partition_id,
> >> +		.partition.stats_area_type = stats_area_type,
> >> +	};
> >> +	void *stats;
> >> +	int err;
> >> +
> >> +	err = hv_map_stats_page(HV_STATS_OBJECT_PARTITION, &identity, &stats);
> >> +	if (err) {
> >> +		pr_err("%s: failed to map partition %lld %s stats, err: %d\n",
> >> +		       __func__, partition_id,
> >> +		       (stats_area_type == HV_STATS_AREA_SELF) ? "self" : "parent",
> >> +		       err);
> >> +		return ERR_PTR(err);
> >> +	}
> >> +	return stats;
> >> +}
> >> +
> >> +static int mshv_debugfs_partition_stats_create(u64 partition_id,
> >> +					    struct dentry **partition_stats_ptr,
> >> +					    struct dentry *parent)
> >> +{
> >> +	struct dentry *dentry;
> >> +	struct hv_stats_page **pstats;
> >> +	int err;
> >> +
> >> +	pstats = kcalloc(2, sizeof(struct hv_stats_page *), GFP_KERNEL_ACCOUNT);
> >
> > Same comment here about the use of "2" as the first parameter.
> >
> Ack.
> 
> >> +	if (!pstats)
> >> +		return -ENOMEM;
> 
> <snip>
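[Editorial note: the fix for the hard-coded "2" in kcalloc() presumably adds a terminating enumerator so the allocation tracks the number of stats areas automatically. The sketch below is a standalone userspace illustration: calloc() stands in for kcalloc(..., GFP_KERNEL_ACCOUNT), and the HV_STATS_AREA_COUNT name is an assumption, not taken from hvhdk.h.]

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Mirror of enum hv_stats_area_type with a terminating count added;
 * the _COUNT enumerator keeps the allocation size in sync if areas
 * are ever added.
 */
enum hv_stats_area_type {
	HV_STATS_AREA_SELF,
	HV_STATS_AREA_PARENT,
	HV_STATS_AREA_COUNT,	/* number of areas, not a real area */
};

/* calloc() stands in for kcalloc() in this userspace sketch. */
static void **alloc_stats_slots(void)
{
	return calloc(HV_STATS_AREA_COUNT, sizeof(void *));
}
```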
> Thanks for the comments, I appreciate the review!
> 
> Nuno


end of thread

Thread overview: 18+ messages
2025-12-05 18:58 [PATCH v2 0/3] mshv: Debugfs interface for mshv_root Nuno Das Neves
2025-12-05 18:58 ` [PATCH v2 1/3] mshv: Ignore second stats page map result failure Nuno Das Neves
2025-12-05 22:50   ` Stanislav Kinsburskii
2025-12-08 15:12   ` Michael Kelley
2025-12-30  0:27     ` Nuno Das Neves
2026-01-02 16:27       ` Michael Kelley
2025-12-05 18:58 ` [PATCH v2 2/3] mshv: Add definitions for stats pages Nuno Das Neves
2025-12-05 22:51   ` Stanislav Kinsburskii
2025-12-08 15:13   ` Michael Kelley
2025-12-30 23:04     ` Nuno Das Neves
2026-01-02 16:27       ` Michael Kelley
2025-12-05 18:58 ` [PATCH v2 3/3] mshv: Add debugfs to view hypervisor statistics Nuno Das Neves
2025-12-05 23:06   ` Stanislav Kinsburskii
2025-12-08  3:04   ` kernel test robot
2025-12-08  6:02   ` kernel test robot
2025-12-08 15:21   ` Michael Kelley
2025-12-31  0:26     ` Nuno Das Neves
2026-01-02 16:27       ` Michael Kelley
