Linux-HyperV List
* [PATCH v3 3/5] hv_balloon: set unspecified page reporting order
From: Yuvraj Sakshith @ 2026-03-03  9:33 UTC
  To: mst, david
  Cc: kys, haiyangz, wei.liu, decui, longli, jasowang, xuanzhuo,
	eperezma, akpm, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
	surenb, mhocko, jackmanb, hannes, ziy, linux-hyperv, linux-kernel,
	virtualization, linux-mm
In-Reply-To: <20260303093341.2927482-1-yuvraj.sakshith@oss.qualcomm.com>

Explicitly set the page reporting order to the
PAGE_REPORTING_ORDER_UNSPECIFIED fallback value, so that the default
order is used.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 drivers/hv/hv_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 2b4080e51..09da68101 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1663,7 +1663,7 @@ static void enable_page_reporting(void)
 	 * We let the page_reporting_order parameter decide the order
 	 * in the page_reporting code
 	 */
-	dm_device.pr_dev_info.order = 0;
+	dm_device.pr_dev_info.order = PAGE_REPORTING_ORDER_UNSPECIFIED;
 	ret = page_reporting_register(&dm_device.pr_dev_info);
 	if (ret < 0) {
 		dm_device.pr_dev_info.report = NULL;
-- 
2.34.1



* [PATCH v3 2/5] virtio_balloon: set unspecified page reporting order
From: Yuvraj Sakshith @ 2026-03-03  9:33 UTC
  To: mst, david
  Cc: kys, haiyangz, wei.liu, decui, longli, jasowang, xuanzhuo,
	eperezma, akpm, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
	surenb, mhocko, jackmanb, hannes, ziy, linux-hyperv, linux-kernel,
	virtualization, linux-mm
In-Reply-To: <20260303093341.2927482-1-yuvraj.sakshith@oss.qualcomm.com>

The virtio_balloon page reporting order implicitly falls back to the
default order (pageblock_order), because vb->pr_dev_info.order is never
initialised and is therefore zero.

Make the use of the default page order explicit by setting the order to
the PAGE_REPORTING_ORDER_UNSPECIFIED fallback value.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 drivers/virtio/virtio_balloon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 74fe59f5a..2dfe2bcd8 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -1044,6 +1044,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 			goto out_unregister_oom;
 		}
 
+		vb->pr_dev_info.order = PAGE_REPORTING_ORDER_UNSPECIFIED;
+
 		/*
 		 * The default page reporting order is @pageblock_order, which
 		 * corresponds to 512MB in size on ARM64 when 64KB base page
-- 
2.34.1



* [PATCH v3 1/5] mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
From: Yuvraj Sakshith @ 2026-03-03  9:33 UTC
  To: mst, david
  Cc: kys, haiyangz, wei.liu, decui, longli, jasowang, xuanzhuo,
	eperezma, akpm, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
	surenb, mhocko, jackmanb, hannes, ziy, linux-hyperv, linux-kernel,
	virtualization, linux-mm
In-Reply-To: <20260303093341.2927482-1-yuvraj.sakshith@oss.qualcomm.com>

Drivers can pass the order of pages to be reported when registering
themselves. Today, "no preference" is expressed with a magic number, 0.

Label this value PAGE_REPORTING_ORDER_UNSPECIFIED and
check for it when the driver is being registered.

This macro will be used in relevant drivers next.

Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 include/linux/page_reporting.h | 1 +
 mm/page_reporting.c            | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index fe648dfa3..d1886c657 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -7,6 +7,7 @@
 
 /* This value should always be a power of 2, see page_reporting_cycle() */
 #define PAGE_REPORTING_CAPACITY		32
+#define PAGE_REPORTING_ORDER_UNSPECIFIED	0
 
 struct page_reporting_dev_info {
 	/* function that alters pages to make them "reported" */
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index e4c428e61..40a756b60 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -369,8 +369,9 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
 	 * pageblock_order.
 	 */
 
-	if (page_reporting_order == -1) {
-		if (prdev->order > 0 && prdev->order <= MAX_PAGE_ORDER)
+	if (page_reporting_order == PAGE_REPORTING_ORDER_UNSPECIFIED) {
+		if (prdev->order != PAGE_REPORTING_ORDER_UNSPECIFIED &&
+			prdev->order <= MAX_PAGE_ORDER)
 			page_reporting_order = prdev->order;
 		else
 			page_reporting_order = pageblock_order;
-- 
2.34.1



* [PATCH v3 0/5] Allow order zero pages in page reporting
From: Yuvraj Sakshith @ 2026-03-03  9:33 UTC
  To: mst, david
  Cc: kys, haiyangz, wei.liu, decui, longli, jasowang, xuanzhuo,
	eperezma, akpm, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
	surenb, mhocko, jackmanb, hannes, ziy, linux-hyperv, linux-kernel,
	virtualization, linux-mm

Today, page reporting sets page_reporting_order in two ways:

(1) page_reporting.page_reporting_order cmdline parameter
(2) Driver can pass order while registering itself.

In both cases, order zero is ignored by free page reporting,
because zero is the value used to make page_reporting_order fall
back to a default value (pageblock_order in page_reporting_register()).

In some cases we might want page_reporting_order to be zero.

For instance, when virtio-balloon runs inside a guest with very
little memory (say, 16MB), it might not be able to find an order-1 page
(or, in the worst case, an order-MAX_PAGE_ORDER page) after some uptime.
Page reporting should be able to return order-zero pages so that as
much memory as possible can be relinquished.

This series changes the default fallback value from '0' to '-1' in
all current clients of free page reporting (hv_balloon and
virtio-balloon) and allows '0' as a valid order in
page_reporting_register().
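
The intended end state can be sketched in user-space C. This is only an
illustration under stated assumptions, not the kernel code: resolve_order(),
ORDER_UNSPECIFIED, MAX_ORDER and DEFAULT_ORDER are hypothetical stand-ins
for the logic in page_reporting_register(), PAGE_REPORTING_ORDER_UNSPECIFIED,
MAX_PAGE_ORDER and pageblock_order, and the values 10 and 9 are made up for
the example.

```c
#include <assert.h>

/* Hypothetical stand-ins; the numeric values are assumptions. */
#define ORDER_UNSPECIFIED	(-1)
#define MAX_ORDER		10u	/* plays the role of MAX_PAGE_ORDER */
#define DEFAULT_ORDER		9u	/* plays the role of pageblock_order */

/*
 * Pick the effective reporting order: an order given on the command line
 * wins, then a valid driver-supplied order (zero included), then the default.
 */
static unsigned int resolve_order(unsigned int cmdline_order,
				  unsigned int driver_order)
{
	if (cmdline_order != (unsigned int)ORDER_UNSPECIFIED)
		return cmdline_order;
	if (driver_order != (unsigned int)ORDER_UNSPECIFIED &&
	    driver_order <= MAX_ORDER)
		return driver_order;
	return DEFAULT_ORDER;
}

/* Usage: call from main(); every assertion is expected to hold. */
void demo_resolve_order(void)
{
	/* Driver asks for order 0, nothing on the command line. */
	assert(resolve_order(ORDER_UNSPECIFIED, 0) == 0);
	/* Nobody specified an order: fall back to the default. */
	assert(resolve_order(ORDER_UNSPECIFIED, ORDER_UNSPECIFIED) == DEFAULT_ORDER);
	/* A command-line order overrides the driver's request. */
	assert(resolve_order(3, 0) == 3);
}
```

With the sentinel moved out of the valid range, a driver request of zero is
honoured instead of being mistaken for "no preference".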

Changes in v1:
- Introduce PAGE_REPORTING_DEFAULT_ORDER macro (initially set to 0).
- Make use of new macro in drivers (hv_balloon and virtio-balloon)
        working with page reporting.
- Change PAGE_REPORTING_DEFAULT_ORDER to -1 as zero is a valid
        page order that can be requested.

Changes in v2:
- Better naming. Replace PAGE_REPORTING_DEFAULT_ORDER with
        PAGE_REPORTING_ORDER_UNSPECIFIED. This takes care of
        the situation where page reporting order is not specified
        in the commandline.
- Minor commit message changes.

Changes in v3:
- Setting page_reporting_order's initial value to
	PAGE_REPORTING_ORDER_UNSPECIFIED moved to
	PATCH #5.

Yuvraj Sakshith (5):
  mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
  virtio_balloon: set unspecified page reporting order
  hv_balloon: set unspecified page reporting order
  mm/page_reporting: change PAGE_REPORTING_ORDER_UNSPECIFIED to -1
  mm/page_reporting: change page_reporting_order to
    PAGE_REPORTING_ORDER_UNSPECIFIED

 drivers/hv/hv_balloon.c         | 2 +-
 drivers/virtio/virtio_balloon.c | 2 ++
 include/linux/page_reporting.h  | 1 +
 mm/page_reporting.c             | 7 ++++---
 4 files changed, 8 insertions(+), 4 deletions(-)

-- 
2.34.1



* Re: [PATCH v2 1/4] mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
From: Yuvraj Sakshith @ 2026-03-03  8:52 UTC
  To: David Hildenbrand (Arm)
  Cc: akpm, mst, jasowang, kys, haiyangz, wei.liu, decui, linux-mm,
	virtualization, linux-hyperv, linux-kernel, xuanzhuo, eperezma,
	lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
	jackmanb, hannes, ziy
In-Reply-To: <2f14572e-e02b-4bc5-abd2-7814c24f7905@kernel.org>

On Mon, Mar 02, 2026 at 03:57:50PM +0100, David Hildenbrand (Arm) wrote:
> >  /* Initialize to an unsupported value */
> > -unsigned int page_reporting_order = -1;
> > +unsigned int page_reporting_order = PAGE_REPORTING_ORDER_UNSPECIFIED;
> >  
> >  static int page_order_update_notify(const char *val, const struct kernel_param *kp)
> >  {
> > @@ -25,12 +25,7 @@ static int page_order_update_notify(const char *val, const struct kernel_param *
> >  
> >  static const struct kernel_param_ops page_reporting_param_ops = {
> >  	.set = &page_order_update_notify,
> > -	/*
> > -	 * For the get op, use param_get_int instead of param_get_uint.
> > -	 * This is to make sure that when unset the initialized value of
> > -	 * -1 is shown correctly
> > -	 */
> > -	.get = &param_get_int,
> > +	.get = &param_get_uint,
> >  };
> 
> I think the change to page_reporting_order (and param_get_int) should
> come after patch #4.
> 
> Otherwise, you temporarily change the semantics of
> page_reporting_param_ops() etc.
> 
> So you should perform the page_reporting_order changes either in patch
> #4 or in a new patch #5.
> 
> Apart from that LGTM.
> 
> -- 
> Cheers,
> 
> David

Sounds good. I'll add a #5.

Thanks,
Yuvraj


* Re: [PATCH net-next 0/6] net: mana: Per-vPort EQ and MSI-X interrupt management
From: Jakub Kicinski @ 2026-03-03  2:59 UTC
  To: Long Li
  Cc: Konstantin Taranov, David S . Miller, Paolo Abeni, Eric Dumazet,
	Andrew Lunn, Jason Gunthorpe, Leon Romanovsky, Haiyang Zhang,
	K . Y . Srinivasan, Wei Liu, Dexuan Cui, Simon Horman, netdev,
	linux-rdma, linux-hyperv, linux-kernel
In-Reply-To: <20260228021144.85054-1-longli@microsoft.com>

On Fri, 27 Feb 2026 18:11:38 -0800 Long Li wrote:
> This series adds per-vPort Event Queue (EQ) allocation and MSI-X interrupt
> management for the MANA driver. Previously, all vPorts shared a single set
> of EQs. This change enables dedicated EQs per vPort with support for both
> dedicated and shared MSI-X vector allocation modes.

Does not apply to net-next, please rebase.


* [PATCH net-next v5] net: mana: Add MAC address to vPort logs and clarify error messages
From: Erni Sri Satya Vennela @ 2026-03-02 17:41 UTC
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, dipayanroy, shirazsaleem, kees, ernis,
	shradhagupta, gargaditya, linux-hyperv, netdev, linux-kernel

Add MAC address to vPort configuration success message and update error
message to be more specific about HWC message errors in
mana_send_request.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v5:
* Remove __func__ and __LINE__ from error logs in hw_channel.c
Changes in v4:
* Remove logs that do not add value in hw_channel.c.
Changes in v3:
* Remove the changes from v2 and update commit message.
* Use "Enabled vPort ..." instead of "Configured vPort" in
  mana_cfg_vport.
* Update error logs in mana_hwc_send_request.
Changes in v2:
* Update commit message.
* Use "Enabled vPort ..." instead of "Configured vPort" in
  mana_cfg_vport.
* Add info log in mana_uncfg_vport, mana_gd_verify_vf_version,
  mana_gd_query_max_resources, mana_query_device_cfg and
  mana_query_vport_cfg.
---
 drivers/net/ethernet/microsoft/mana/hw_channel.c | 12 +++++++-----
 drivers/net/ethernet/microsoft/mana/mana_en.c    |  8 ++++----
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c b/drivers/net/ethernet/microsoft/mana/hw_channel.c
index ba3467f1e2ea..91975bdb5686 100644
--- a/drivers/net/ethernet/microsoft/mana/hw_channel.c
+++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c
@@ -853,6 +853,7 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
 	struct hwc_caller_ctx *ctx;
 	u32 dest_vrcq = 0;
 	u32 dest_vrq = 0;
+	u32 command;
 	u16 msg_id;
 	int err;
 
@@ -878,6 +879,7 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
 	req_msg->req.hwc_msg_id = msg_id;
 
 	tx_wr->msg_size = req_len;
+	command = req_msg->req.msg_type;
 
 	if (gc->is_pf) {
 		dest_vrq = hwc->pf_dest_vrq_id;
@@ -893,8 +895,8 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
 	if (!wait_for_completion_timeout(&ctx->comp_event,
 					 (msecs_to_jiffies(hwc->hwc_timeout)))) {
 		if (hwc->hwc_timeout != 0)
-			dev_err(hwc->dev, "HWC: Request timed out: %u ms\n",
-				hwc->hwc_timeout);
+			dev_err(hwc->dev, "Command 0x%x timed out: %u ms\n",
+				command, hwc->hwc_timeout);
 
 		/* Reduce further waiting if HWC no response */
 		if (hwc->hwc_timeout > 1)
@@ -914,9 +916,9 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
 			err = -EOPNOTSUPP;
 			goto out;
 		}
-		if (req_msg->req.msg_type != MANA_QUERY_PHY_STAT)
-			dev_err(hwc->dev, "HWC: Failed hw_channel req: 0x%x\n",
-				ctx->status_code);
+		if (command != MANA_QUERY_PHY_STAT)
+			dev_err(hwc->dev, "Command 0x%x failed with status: 0x%x\n",
+				command, ctx->status_code);
 		err = -EPROTO;
 		goto out;
 	}
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 933e9d681ded..e25d85b38845 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1021,8 +1021,8 @@ static int mana_send_request(struct mana_context *ac, void *in_buf,
 
 		if (req->req.msg_type != MANA_QUERY_PHY_STAT &&
 		    mana_need_log(gc, err))
-			dev_err(dev, "Failed to send mana message: %d, 0x%x\n",
-				err, resp->status);
+			dev_err(dev, "Command 0x%x failed with status: 0x%x, err: %d\n",
+				req->req.msg_type, resp->status, err);
 		return err ? err : -EPROTO;
 	}
 
@@ -1335,8 +1335,8 @@ int mana_cfg_vport(struct mana_port_context *apc, u32 protection_dom_id,
 	apc->tx_shortform_allowed = resp.short_form_allowed;
 	apc->tx_vp_offset = resp.tx_vport_offset;
 
-	netdev_info(apc->ndev, "Configured vPort %llu PD %u DB %u\n",
-		    apc->port_handle, protection_dom_id, doorbell_pg_id);
+	netdev_info(apc->ndev, "Enabled vPort %llu PD %u DB %u MAC %pM\n",
+		    apc->port_handle, protection_dom_id, doorbell_pg_id, apc->mac_addr);
 out:
 	if (err)
 		mana_uncfg_vport(apc);
-- 
2.34.1



* Re: [PATCH net-next v4] net: mana: Add MAC address to vPort logs and clarify error messages
From: Erni Sri Satya Vennela @ 2026-03-02 17:27 UTC
  To: Simon Horman
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, dipayanroy, shirazsaleem, ssengar,
	shradhagupta, gargaditya, linux-hyperv, netdev, linux-kernel
In-Reply-To: <aaRsN8FDf8aH54QB@horms.kernel.org>

> I have reservations about the usefulness of including __func__ and __LINE__
> in debug messages. In a nutshell, it requires the logs to be correlated
> (exactly?) with the source used to build the driver. And at that point
> I think other mechanisms - e.g. dynamic trace points - are going to be
> useful if the debug message (without function and line information)
> is insufficient to pinpoint the problem.
> 
> This is a general statement, rather than something specifically
> about this code. But nonetheless I'd advise against adding this
> information here.
> 
Thank you, Jakub and Simon, for the suggestions.
I'll remove both in the next version.


* [PATCH v3] x86/hyperv: Use __naked attribute to fix stackless C function
From: Ard Biesheuvel @ 2026-03-02 16:45 UTC
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Andrew Cooper, Mukesh Rathor, Uros Bizjak,
	Wei Liu, linux-hyperv

hv_crash_c_entry() is a C function that is entered without a stack,
and this is only allowed for functions that have the __naked attribute,
which informs the compiler that it must not emit the usual prologue and
epilogue or emit any other kind of instrumentation that relies on a
stack frame.

So split up the function, and set the __naked attribute on the initial
part that sets up the stack, GDT, IDT and other pieces that are needed
for ordinary C execution. Given that function calls are not permitted
either, use the existing long return coded in an asm() block to call the
second part of the function, which is an ordinary function that is
permitted to call other functions as usual.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> # asm parts, not hv parts
Reviewed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Acked-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: linux-hyperv@vger.kernel.org
Fixes: 94212d34618c ("x86/hyperv: Implement hypervisor RAM collection into vmcore")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
v3: make hv_wrmsr() 'asm volatile'
    combine SS segment register update with RSP assignment
    fix pre-existing bug cr4 -> cr2
    update comment gcc -> objtool

v2: apply some asm tweaks suggested by Uros and Andrew

 arch/x86/hyperv/hv_crash.c | 82 ++++++++++----------
 1 file changed, 43 insertions(+), 39 deletions(-)

diff --git a/arch/x86/hyperv/hv_crash.c b/arch/x86/hyperv/hv_crash.c
index 92da1b4f2e73..fdb277bf73d8 100644
--- a/arch/x86/hyperv/hv_crash.c
+++ b/arch/x86/hyperv/hv_crash.c
@@ -107,14 +107,12 @@ static void __noreturn hv_panic_timeout_reboot(void)
 		cpu_relax();
 }
 
-/* This cannot be inlined as it needs stack */
-static noinline __noclone void hv_crash_restore_tss(void)
+static void hv_crash_restore_tss(void)
 {
 	load_TR_desc();
 }
 
-/* This cannot be inlined as it needs stack */
-static noinline void hv_crash_clear_kernpt(void)
+static void hv_crash_clear_kernpt(void)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -125,6 +123,25 @@ static noinline void hv_crash_clear_kernpt(void)
 	native_p4d_clear(p4d);
 }
 
+
+static void __noreturn hv_crash_handle(void)
+{
+	hv_crash_restore_tss();
+	hv_crash_clear_kernpt();
+
+	/* we are now fully in devirtualized normal kernel mode */
+	__crash_kexec(NULL);
+
+	hv_panic_timeout_reboot();
+}
+
+/*
+ * __naked functions do not permit function calls, not even to __always_inline
+ * functions that only contain asm() blocks themselves. So use a macro instead.
+ */
+#define hv_wrmsr(msr, val) \
+	asm volatile("wrmsr" :: "c"(msr), "a"((u32)val), "d"((u32)(val >> 32)) : "memory")
+
 /*
  * This is the C entry point from the asm glue code after the disable hypercall.
  * We enter here in IA32-e long mode, ie, full 64bit mode running on kernel
@@ -133,51 +150,38 @@ static noinline void hv_crash_clear_kernpt(void)
  * available. We restore kernel GDT, and rest of the context, and continue
  * to kexec.
  */
-static asmlinkage void __noreturn hv_crash_c_entry(void)
+static void __naked hv_crash_c_entry(void)
 {
-	struct hv_crash_ctxt *ctxt = &hv_crash_ctxt;
-
 	/* first thing, restore kernel gdt */
-	native_load_gdt(&ctxt->gdtr);
+	asm volatile("lgdt %0" : : "m" (hv_crash_ctxt.gdtr));
 
-	asm volatile("movw %%ax, %%ss" : : "a"(ctxt->ss));
-	asm volatile("movq %0, %%rsp" : : "m"(ctxt->rsp));
+	asm volatile("movw %0, %%ss\n\t"
+		     "movq %1, %%rsp"
+		     :: "m"(hv_crash_ctxt.ss), "m"(hv_crash_ctxt.rsp));
 
-	asm volatile("movw %%ax, %%ds" : : "a"(ctxt->ds));
-	asm volatile("movw %%ax, %%es" : : "a"(ctxt->es));
-	asm volatile("movw %%ax, %%fs" : : "a"(ctxt->fs));
-	asm volatile("movw %%ax, %%gs" : : "a"(ctxt->gs));
+	asm volatile("movw %0, %%ds" : : "m"(hv_crash_ctxt.ds));
+	asm volatile("movw %0, %%es" : : "m"(hv_crash_ctxt.es));
+	asm volatile("movw %0, %%fs" : : "m"(hv_crash_ctxt.fs));
+	asm volatile("movw %0, %%gs" : : "m"(hv_crash_ctxt.gs));
 
-	native_wrmsrq(MSR_IA32_CR_PAT, ctxt->pat);
-	asm volatile("movq %0, %%cr0" : : "r"(ctxt->cr0));
+	hv_wrmsr(MSR_IA32_CR_PAT, hv_crash_ctxt.pat);
+	asm volatile("movq %0, %%cr0" : : "r"(hv_crash_ctxt.cr0));
 
-	asm volatile("movq %0, %%cr8" : : "r"(ctxt->cr8));
-	asm volatile("movq %0, %%cr4" : : "r"(ctxt->cr4));
-	asm volatile("movq %0, %%cr2" : : "r"(ctxt->cr4));
+	asm volatile("movq %0, %%cr8" : : "r"(hv_crash_ctxt.cr8));
+	asm volatile("movq %0, %%cr4" : : "r"(hv_crash_ctxt.cr4));
+	asm volatile("movq %0, %%cr2" : : "r"(hv_crash_ctxt.cr2));
 
-	native_load_idt(&ctxt->idtr);
-	native_wrmsrq(MSR_GS_BASE, ctxt->gsbase);
-	native_wrmsrq(MSR_EFER, ctxt->efer);
+	asm volatile("lidt %0" : : "m" (hv_crash_ctxt.idtr));
+	hv_wrmsr(MSR_GS_BASE, hv_crash_ctxt.gsbase);
+	hv_wrmsr(MSR_EFER, hv_crash_ctxt.efer);
 
 	/* restore the original kernel CS now via far return */
-	asm volatile("movzwq %0, %%rax\n\t"
-		     "pushq %%rax\n\t"
-		     "pushq $1f\n\t"
-		     "lretq\n\t"
-		     "1:nop\n\t" : : "m"(ctxt->cs) : "rax");
-
-	/* We are in asmlinkage without stack frame, hence make C function
-	 * calls which will buy stack frames.
-	 */
-	hv_crash_restore_tss();
-	hv_crash_clear_kernpt();
-
-	/* we are now fully in devirtualized normal kernel mode */
-	__crash_kexec(NULL);
-
-	hv_panic_timeout_reboot();
+	asm volatile("pushq %q0\n\t"
+		     "pushq %q1\n\t"
+		     "lretq"
+		     :: "r"(hv_crash_ctxt.cs), "r"(hv_crash_handle));
 }
-/* Tell gcc we are using lretq long jump in the above function intentionally */
+/* Tell objtool we are using lretq long jump in the above function intentionally */
 STACK_FRAME_NON_STANDARD(hv_crash_c_entry);
 
 static void hv_mark_tss_not_busy(void)
-- 
2.51.0



* RE: [PATCH net-next] net: mana: Force full-page RX buffers for 4K page size on specific systems.
From: Haiyang Zhang @ 2026-03-02 16:38 UTC
  To: Dipayaan Roy, KY Srinivasan, wei.liu@kernel.org, Dexuan Cui,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, leon@kernel.org, Long Li,
	Konstantin Taranov, horms@kernel.org,
	shradhagupta@linux.microsoft.com, ssengar@linux.microsoft.com,
	ernis@linux.microsoft.com, Shiraz Saleem,
	linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Dipayaan Roy
In-Reply-To: <aaFusIxdbVkUqIpd@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>



> -----Original Message-----
> From: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> Sent: Friday, February 27, 2026 5:15 AM
> To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; wei.liu@kernel.org; Dexuan Cui
> <DECUI@microsoft.com>; andrew+netdev@lunn.ch; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org;
> Long Li <longli@microsoft.com>; Konstantin Taranov
> <kotaranov@microsoft.com>; horms@kernel.org;
> shradhagupta@linux.microsoft.com; ssengar@linux.microsoft.com;
> ernis@linux.microsoft.com; Shiraz Saleem <shirazsaleem@microsoft.com>;
> linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-rdma@vger.kernel.org; Dipayaan Roy
> <dipayanroy@microsoft.com>
> Subject: [PATCH net-next] net: mana: Force full-page RX buffers for 4K
> page size on specific systems.
> 
> On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> fragments for RX buffers results in a significant throughput regression.
> Profiling reveals that this regression correlates with high overhead in the
> fragment allocation and reference counting paths on these specific
> platforms, rendering the multi-buffer-per-page strategy counterproductive.
> 
> To mitigate this, bypass the page_pool fragment path and force a single RX
> packet per page allocation when all the following conditions are met:
>   1. The system is configured with a 4K PAGE_SIZE.
>   2. A processor-specific quirk is detected via SMBIOS Type 4 data.
> 
> This approach restores expected line-rate performance by ensuring
> predictable RX refill behavior on affected hardware.
> 
> There is no behavioral change for systems using larger page sizes
> (16K/64K), or platforms where this processor-specific quirk does not
> apply.
> 
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Thanks.



* Re: [PATCH v2 3/4] hv_balloon: set unspecified page reporting order
From: David Hildenbrand (Arm) @ 2026-03-02 15:00 UTC
  To: Yuvraj Sakshith, akpm, mst, jasowang, kys, haiyangz, wei.liu,
	decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-4-yuvraj.sakshith@oss.qualcomm.com>

On 3/2/26 12:17, Yuvraj Sakshith wrote:
> Explicitly set the page reporting order to the
> PAGE_REPORTING_ORDER_UNSPECIFIED fallback value, so that the default
> order is used.
> 
> Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
> ---

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David


* Re: [PATCH v2 2/4] virtio_balloon: set unspecified page reporting order
From: David Hildenbrand (Arm) @ 2026-03-02 15:00 UTC
  To: Yuvraj Sakshith, akpm, mst, jasowang, kys, haiyangz, wei.liu,
	decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-3-yuvraj.sakshith@oss.qualcomm.com>

On 3/2/26 12:17, Yuvraj Sakshith wrote:
> The virtio_balloon page reporting order implicitly falls back to the
> default order (pageblock_order), because vb->pr_dev_info.order is never
> initialised and is therefore zero.
> 
> Make the use of the default page order explicit by setting the order to
> the PAGE_REPORTING_ORDER_UNSPECIFIED fallback value.
> 
> Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
> ---
Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David


* Re: [PATCH v2 1/4] mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
From: David Hildenbrand (Arm) @ 2026-03-02 14:57 UTC
  To: Yuvraj Sakshith, akpm, mst, jasowang, kys, haiyangz, wei.liu,
	decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-2-yuvraj.sakshith@oss.qualcomm.com>

On 3/2/26 12:17, Yuvraj Sakshith wrote:
> Drivers can pass the order of pages to be reported when registering
> themselves. Today, "no preference" is expressed with a magic number, 0.
> 
> Label this value PAGE_REPORTING_ORDER_UNSPECIFIED and
> check for it when the driver is being registered.
> 
> This macro will be used in relevant drivers next.
> 
> Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
> ---
>  include/linux/page_reporting.h |  1 +
>  mm/page_reporting.c            | 14 +++++---------
>  2 files changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
> index fe648dfa3..d1886c657 100644
> --- a/include/linux/page_reporting.h
> +++ b/include/linux/page_reporting.h
> @@ -7,6 +7,7 @@
>  
>  /* This value should always be a power of 2, see page_reporting_cycle() */
>  #define PAGE_REPORTING_CAPACITY		32
> +#define PAGE_REPORTING_ORDER_UNSPECIFIED	0
>  
>  struct page_reporting_dev_info {
>  	/* function that alters pages to make them "reported" */
> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> index e4c428e61..51cd88faf 100644
> --- a/mm/page_reporting.c
> +++ b/mm/page_reporting.c
> @@ -12,7 +12,7 @@
>  #include "internal.h"
>  
>  /* Initialize to an unsupported value */
> -unsigned int page_reporting_order = -1;
> +unsigned int page_reporting_order = PAGE_REPORTING_ORDER_UNSPECIFIED;
>  
>  static int page_order_update_notify(const char *val, const struct kernel_param *kp)
>  {
> @@ -25,12 +25,7 @@ static int page_order_update_notify(const char *val, const struct kernel_param *
>  
>  static const struct kernel_param_ops page_reporting_param_ops = {
>  	.set = &page_order_update_notify,
> -	/*
> -	 * For the get op, use param_get_int instead of param_get_uint.
> -	 * This is to make sure that when unset the initialized value of
> -	 * -1 is shown correctly
> -	 */
> -	.get = &param_get_int,
> +	.get = &param_get_uint,
>  };
>  
>  module_param_cb(page_reporting_order, &page_reporting_param_ops,
> @@ -369,8 +364,9 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
>  	 * pageblock_order.
>  	 */
>  
> -	if (page_reporting_order == -1) {
> -		if (prdev->order > 0 && prdev->order <= MAX_PAGE_ORDER)
> +	if (page_reporting_order == PAGE_REPORTING_ORDER_UNSPECIFIED) {
> +		if (prdev->order != PAGE_REPORTING_ORDER_UNSPECIFIED &&
> +			prdev->order <= MAX_PAGE_ORDER)
>  			page_reporting_order = prdev->order;
>  		else
>  			page_reporting_order = pageblock_order;

I think the change to page_reporting_order (and param_get_int) should
come after patch #4.

Otherwise, you temporarily change the semantics of
page_reporting_param_ops() etc.

So you should perform the page_reporting_order changes either in patch
#4 or in a new patch #5.

Apart from that LGTM.
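
The comment removed in the hunk above explains why the getter matters:
page_reporting_order is an unsigned int, so an initial value of -1 reads back
differently depending on whether it is formatted as signed or unsigned. A
minimal user-space sketch follows. show_as_int() and show_as_uint() are
hypothetical helpers that only mimic signed and unsigned formatting (the
kernel's param_get_int() and param_get_uint() are not reproduced here), and a
32-bit, two's-complement int is assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the stored value as a signed integer. */
static void show_as_int(char *buf, size_t len, unsigned int order)
{
	snprintf(buf, len, "%d", (int)order);
}

/* Format the stored value as an unsigned integer. */
static void show_as_uint(char *buf, size_t len, unsigned int order)
{
	snprintf(buf, len, "%u", order);
}

/* Usage: call from main(); assumes a 32-bit, two's-complement int. */
void demo_show_order(void)
{
	unsigned int order = -1;	/* "unspecified" stored in an unsigned */
	char buf[16];

	show_as_int(buf, sizeof(buf), order);
	assert(strcmp(buf, "-1") == 0);

	show_as_uint(buf, sizeof(buf), order);
	assert(strcmp(buf, "4294967295") == 0);
}
```

This is why the sentinel value and the getter have to change in the same
patch: changing only one of them alters what user space reads back.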

-- 
Cheers,

David


* Re: [PATCH net-next] net: mana: Force full-page RX buffers for 4K page size on specific systems.
From: Simon Horman @ 2026-03-02 14:02 UTC
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, dipayanroy
In-Reply-To: <aaFusIxdbVkUqIpd@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Fri, Feb 27, 2026 at 02:15:12AM -0800, Dipayaan Roy wrote:
> On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> fragments for RX buffers results in a significant throughput regression.
> Profiling reveals that this regression correlates with high overhead in the
> fragment allocation and reference counting paths on these specific
> platforms, rendering the multi-buffer-per-page strategy counterproductive.
> 
> To mitigate this, bypass the page_pool fragment path and force a single RX
> packet per page allocation when all the following conditions are met:
>   1. The system is configured with a 4K PAGE_SIZE.
>   2. A processor-specific quirk is detected via SMBIOS Type 4 data.
> 
> This approach restores expected line-rate performance by ensuring
> predictable RX refill behavior on affected hardware.
> 
> There is no behavioral change for systems using larger page sizes
> (16K/64K), or platforms where this processor-specific quirk does not
> apply.
> 
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net-next, v2] net: mana: Trigger VF reset/recovery on health check failure due to HWC timeout
From: Simon Horman @ 2026-03-02 11:27 UTC
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, dipayanroy
In-Reply-To: <aaFShvKnwR5FY8dH@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Fri, Feb 27, 2026 at 12:15:02AM -0800, Dipayaan Roy wrote:
> The GF stats periodic query is used as a mechanism to monitor HWC
> health. If this HWC command times out, it is a strong indication that
> the device/SoC is in a faulty state and requires recovery.
> 
> Today, when a timeout is detected, the driver marks
> hwc_timeout_occurred, clears cached stats, and stops rescheduling the
> periodic work. However, the device itself is left in the same failing
> state.
> 
> Extend the timeout handling path to trigger the existing MANA VF
> recovery service by queueing a GDMA_EQE_HWC_RESET_REQUEST work item.
> This is expected to initiate the appropriate recovery flow: a suspend and
> resume first and, if that fails, a bus rescan.
> 
> This change is intentionally limited to HWC command timeouts and does
> not trigger recovery for errors reported by the SoC as a normal command
> response.
> 
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> ---
> Changes in v2:
>   - Added common helper, proper clearing of gc flags.

Thanks for the update.

Reviewed-by: Simon Horman <horms@kernel.org>

...

^ permalink raw reply

* [PATCH v2 4/4] mm/page_reporting: change PAGE_REPORTING_ORDER_UNSPECIFIED to -1
From: Yuvraj Sakshith @ 2026-03-02 11:17 UTC (permalink / raw)
  To: akpm, mst, david, jasowang, kys, haiyangz, wei.liu, decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-1-yuvraj.sakshith@oss.qualcomm.com>

PAGE_REPORTING_ORDER_UNSPECIFIED is currently set to zero. This means that
pages of order zero cannot be reported to a client/driver, as zero
is used to signal a fallback to MAX_PAGE_ORDER.

Change PAGE_REPORTING_ORDER_UNSPECIFIED to (-1),
so that zero can be used as a valid order with which pages can
be reported.
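
A small userspace sketch, not kernel code, of why the signed getter
matters once the unset value is -1. The sketch_* helpers are hypothetical
and only assume that the getters format the value roughly like "%d" and
"%u". Converting an all-ones unsigned value to int is assumed to yield -1,
which holds on the usual two's complement targets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for PAGE_REPORTING_ORDER_UNSPECIFIED after this patch. */
#define SKETCH_ORDER_UNSPECIFIED (-1)

/* Show the stored value the way a signed getter would (assumed "%d"). */
static int sketch_show_int(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%d", (int)v);
}

/* Show the stored value the way an unsigned getter would (assumed "%u"). */
static int sketch_show_uint(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%u", v);
}

/*
 * The variable is unsigned, so the comparison against -1 relies on the
 * macro being converted to unsigned int (all bits set).
 */
static int sketch_is_unspecified(unsigned int order)
{
	return order == (unsigned int)SKETCH_ORDER_UNSPECIFIED;
}
```

With the unset value stored in an unsigned int, the signed formatter shows
-1 while the unsigned one shows a large positive number, which is the
reason given in the comment above the .get op.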

Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 include/linux/page_reporting.h | 2 +-
 mm/page_reporting.c            | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index d1886c657..9d4ca5c21 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -7,7 +7,7 @@
 
 /* This value should always be a power of 2, see page_reporting_cycle() */
 #define PAGE_REPORTING_CAPACITY		32
-#define PAGE_REPORTING_ORDER_UNSPECIFIED	0
+#define PAGE_REPORTING_ORDER_UNSPECIFIED	-1
 
 struct page_reporting_dev_info {
 	/* function that alters pages to make them "reported" */
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index 51cd88faf..21c11b75e 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -25,7 +25,12 @@ static int page_order_update_notify(const char *val, const struct kernel_param *
 
 static const struct kernel_param_ops page_reporting_param_ops = {
 	.set = &page_order_update_notify,
-	.get = &param_get_uint,
+	/*
+	 * For the get op, use param_get_int instead of param_get_uint.
+	 * This is to make sure that when unset the initialized value of
+	 * -1 is shown correctly
+	 */
+	.get = &param_get_int,
 };
 
 module_param_cb(page_reporting_order, &page_reporting_param_ops,
-- 
2.34.1


^ permalink raw reply related

* [PATCH v2 3/4] hv_balloon: set unspecified page reporting order
From: Yuvraj Sakshith @ 2026-03-02 11:17 UTC (permalink / raw)
  To: akpm, mst, david, jasowang, kys, haiyangz, wei.liu, decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-1-yuvraj.sakshith@oss.qualcomm.com>

Explicitly request the default page reporting order by setting
the order to the PAGE_REPORTING_ORDER_UNSPECIFIED fallback
value.

Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 drivers/hv/hv_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 2b4080e51..09da68101 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1663,7 +1663,7 @@ static void enable_page_reporting(void)
 	 * We let the page_reporting_order parameter decide the order
 	 * in the page_reporting code
 	 */
-	dm_device.pr_dev_info.order = 0;
+	dm_device.pr_dev_info.order = PAGE_REPORTING_ORDER_UNSPECIFIED;
 	ret = page_reporting_register(&dm_device.pr_dev_info);
 	if (ret < 0) {
 		dm_device.pr_dev_info.report = NULL;
-- 
2.34.1


^ permalink raw reply related

* [PATCH v2 2/4] virtio_balloon: set unspecified page reporting order
From: Yuvraj Sakshith @ 2026-03-02 11:17 UTC (permalink / raw)
  To: akpm, mst, david, jasowang, kys, haiyangz, wei.liu, decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-1-yuvraj.sakshith@oss.qualcomm.com>

virtio_balloon page reporting order is set to MAX_PAGE_ORDER implicitly
as vb->pr_dev_info.order is never initialised and is auto-set to zero.

Explicitly request the default page order by making use of the
PAGE_REPORTING_ORDER_UNSPECIFIED fallback value.

Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 drivers/virtio/virtio_balloon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 74fe59f5a..2dfe2bcd8 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -1044,6 +1044,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
 			goto out_unregister_oom;
 		}
 
+		vb->pr_dev_info.order = PAGE_REPORTING_ORDER_UNSPECIFIED;
+
 		/*
 		 * The default page reporting order is @pageblock_order, which
 		 * corresponds to 512MB in size on ARM64 when 64KB base page
-- 
2.34.1


^ permalink raw reply related

* [PATCH v2 1/4] mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
From: Yuvraj Sakshith @ 2026-03-02 11:17 UTC (permalink / raw)
  To: akpm, mst, david, jasowang, kys, haiyangz, wei.liu, decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy
In-Reply-To: <20260302111757.2191056-1-yuvraj.sakshith@oss.qualcomm.com>

A driver can pass the order of pages to be reported while
registering itself. Today, "use the default" is signalled by a
magic number, 0.

Label this with PAGE_REPORTING_ORDER_UNSPECIFIED and
check for it when the driver is being registered.

This macro will be used in relevant drivers next.

Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com>
---
 include/linux/page_reporting.h |  1 +
 mm/page_reporting.c            | 14 +++++---------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index fe648dfa3..d1886c657 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -7,6 +7,7 @@
 
 /* This value should always be a power of 2, see page_reporting_cycle() */
 #define PAGE_REPORTING_CAPACITY		32
+#define PAGE_REPORTING_ORDER_UNSPECIFIED	0
 
 struct page_reporting_dev_info {
 	/* function that alters pages to make them "reported" */
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index e4c428e61..51cd88faf 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -12,7 +12,7 @@
 #include "internal.h"
 
 /* Initialize to an unsupported value */
-unsigned int page_reporting_order = -1;
+unsigned int page_reporting_order = PAGE_REPORTING_ORDER_UNSPECIFIED;
 
 static int page_order_update_notify(const char *val, const struct kernel_param *kp)
 {
@@ -25,12 +25,7 @@ static int page_order_update_notify(const char *val, const struct kernel_param *
 
 static const struct kernel_param_ops page_reporting_param_ops = {
 	.set = &page_order_update_notify,
-	/*
-	 * For the get op, use param_get_int instead of param_get_uint.
-	 * This is to make sure that when unset the initialized value of
-	 * -1 is shown correctly
-	 */
-	.get = &param_get_int,
+	.get = &param_get_uint,
 };
 
 module_param_cb(page_reporting_order, &page_reporting_param_ops,
@@ -369,8 +364,9 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
 	 * pageblock_order.
 	 */
 
-	if (page_reporting_order == -1) {
-		if (prdev->order > 0 && prdev->order <= MAX_PAGE_ORDER)
+	if (page_reporting_order == PAGE_REPORTING_ORDER_UNSPECIFIED) {
+		if (prdev->order != PAGE_REPORTING_ORDER_UNSPECIFIED &&
+			prdev->order <= MAX_PAGE_ORDER)
 			page_reporting_order = prdev->order;
 		else
 			page_reporting_order = pageblock_order;
-- 
2.34.1


^ permalink raw reply related

* [PATCH v2 0/4] Allow order zero pages in page reporting
From: Yuvraj Sakshith @ 2026-03-02 11:17 UTC (permalink / raw)
  To: akpm, mst, david, jasowang, kys, haiyangz, wei.liu, decui
  Cc: linux-mm, virtualization, linux-hyperv, linux-kernel, xuanzhuo,
	eperezma, lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb,
	mhocko, jackmanb, hannes, ziy

Today, page reporting sets page_reporting_order in two ways:

(1) page_reporting.page_reporting_order cmdline parameter
(2) Driver can pass order while registering itself.

In both cases, order zero is ignored by free page reporting
because it is used to set page_reporting_order to a default
value, like MAX_PAGE_ORDER.

In some cases we might want page_reporting_order to be zero.

For instance, when virtio-balloon runs inside a guest with
tiny memory (say, 16MB), it might not be able to find an order 1 page
(or in the worst case an order MAX_PAGE_ORDER page) after some uptime.
Page reporting should be able to return order zero pages back for
optimal memory relinquishment.

This patch changes the default fallback value from '0' to '-1' in
all possible clients of free page reporting (hv_balloon and
virtio-balloon) together with allowing '0' as a valid order in
page_reporting_register().
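
As an illustration only (not part of the series), the intended selection
can be sketched in plain C. The sketch_* names and the values chosen for
MAX_PAGE_ORDER and pageblock_order below are assumptions made for the
example; the real values depend on the kernel configuration.

```c
#include <assert.h>

/* Assumed stand-ins for kernel values, picked only for this example. */
#define SKETCH_MAX_PAGE_ORDER		10u
#define SKETCH_PAGEBLOCK_ORDER		9u
#define SKETCH_ORDER_UNSPECIFIED	(-1)

/*
 * Hypothetical helper mirroring the selection described above:
 * a value given on the command line wins; otherwise a valid order
 * requested by the driver is used (zero included); otherwise the
 * default is chosen.
 */
static unsigned int sketch_resolve_order(unsigned int cmdline_order,
					 unsigned int driver_order)
{
	/* Command line value was specified: it takes precedence. */
	if (cmdline_order != (unsigned int)SKETCH_ORDER_UNSPECIFIED)
		return cmdline_order;

	/* Driver asked for a specific, in-range order (0 is valid now). */
	if (driver_order != (unsigned int)SKETCH_ORDER_UNSPECIFIED &&
	    driver_order <= SKETCH_MAX_PAGE_ORDER)
		return driver_order;

	/* Nothing usable was specified: fall back to the default. */
	return SKETCH_PAGEBLOCK_ORDER;
}
```

With these assumptions, a driver passing 0 with no command line value
gets order 0, while a driver passing the unspecified value (or an
out-of-range order) gets the default.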

Changes in v1:
- Introduce PAGE_REPORTING_DEFAULT_ORDER macro (initially set to 0).
- Make use of new macro in drivers (hv_balloon and virtio-balloon)
	working with page reporting.
- Change PAGE_REPORTING_DEFAULT_ORDER to -1 as zero is a valid
	page order that can be requested.

Changes in v2:
- Better naming. Replace PAGE_REPORTING_DEFAULT_ORDER with
	PAGE_REPORTING_ORDER_UNSPECIFIED. This takes care of
	the situation where page reporting order is not specified
	in the commandline.
- Minor commit message changes.

Yuvraj Sakshith (4):
  mm/page_reporting: add PAGE_REPORTING_ORDER_UNSPECIFIED
  virtio_balloon: set unspecified page reporting order
  hv_balloon: set unspecified page reporting order
  mm/page_reporting: change PAGE_REPORTING_ORDER_UNSPECIFIED to -1

 drivers/hv/hv_balloon.c         | 2 +-
 drivers/virtio/virtio_balloon.c | 2 ++
 include/linux/page_reporting.h  | 1 +
 mm/page_reporting.c             | 7 ++++---
 4 files changed, 8 insertions(+), 4 deletions(-)

-- 
2.34.1


^ permalink raw reply

* Re: [PATCH v1 4/4] page_reporting: change PAGE_REPORTING_DEFAULT_ORDER to -1
From: Yuvraj Sakshith @ 2026-03-02  9:50 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Michael Kelley, akpm@linux-foundation.org, mst@redhat.com,
	kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <571547b0-007a-4cf9-be1d-95a0ef871cf8@kernel.org>

On Mon, Mar 02, 2026 at 10:18:23AM +0100, David Hildenbrand (Arm) wrote:
> > Great. Much more clearer on page_reporting.c 's end. 
> > 
> > Don't you think on the driver's end:
> > 
> > prdev->order = PAGE_REPORTING_USE_DEFAULT; looks clearer? As compared to:
> > prdev->order = PAGE_REPORTING_ORDER_UNSET; ?
> > 
> > I'm thinking, why would a driver worry about page_reporting_order being set/unset?
> 
> Maybe PAGE_REPORTING_ORDER_UNSPECIFIED ?
> 
> In any case, we should use a single flag for this. Everything else will
> be confusing once drivers could use only one of them.
> 
> -- 
> Cheers,
> 
> David

Sounds good. Thanks for the suggestion.

Thanks,
Yuvraj

^ permalink raw reply

* Re: [PATCH v1 4/4] page_reporting: change PAGE_REPORTING_DEFAULT_ORDER to -1
From: David Hildenbrand (Arm) @ 2026-03-02  9:18 UTC (permalink / raw)
  To: Yuvraj Sakshith, Michael Kelley
  Cc: akpm@linux-foundation.org, mst@redhat.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	longli@microsoft.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <aaVQXbllLVBLZCwQ@hu-ysakshit-lv.qualcomm.com>

On 3/2/26 09:54, Yuvraj Sakshith wrote:
> On Mon, Mar 02, 2026 at 09:09:13AM +0100, David Hildenbrand (Arm) wrote:
>> On 3/2/26 09:00, Yuvraj Sakshith wrote:
>>> Option 1:
>>>
>>> if (page_reporting_order == PAGE_REPORTING_DEFAULT_ORDER) {
>>>         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
>>>                 && prdev->order <= MAX_PAGE_ORDER) {
>>>                 page_reporting_order = prdev->order;
>>>         } else {
>>>                 page_reporting_order = pageblock_order;
>>>         }
>>> }
>>>
>>> Option 2:
>>>
>>> if (page_reporting_order == PAGE_REPORTING_ORDER_NOT_SET) {
>>>         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
>>>                 && prdev->order <= MAX_PAGE_ORDER) {
>>>                 page_reporting_order = prdev->order;
>>>         } else {
>>>                 page_reporting_order = pageblock_order;
>>>         }
>>> }
>>>
>>>
>>>
>>> Agreed.
>>>
>>> If we were to read this code without context, wouldn't it be confusing as to
>>> why PAGE_REPORTING_DEFAULT_ORDER is being checked in the first place?
>>
>> I proposed in one of the last mail that
>> "PAGE_REPORTING_USE_DEFAULT_ORDER" could be clearer, stating that it's
>> not really an order just yet. Maybe just using
>> PAGE_REPORTING_ORDER_UNSET might be clearer.
>>
> Ok
>>>
>>> Option 1 checks if page_reporting_order is equal to PAGE_REPORTING_DEFAULT_ORDER
>>> and then immediately checks if its not equal to it. Which is a bit confusing..
>>
>>
>> Because it's wrong? :) We're not supposed to check page_reporting_order
>> a second time. Assume we
>> s/PAGE_REPORTING_ORDER/PAGE_REPORTING_ORDER_UNSET/ and actually check
>> prdev->order:
> Oops, typo :) I meant prdev->order.
>>
>> if (page_reporting_order == PAGE_REPORTING_ORDER_UNSET) {
>> 	if (prdev->order != PAGE_REPORTING_ORDER_UNSET &&
>> 	    prdev->order <= MAX_PAGE_ORDER) {
>> 		page_reporting_order = prdev->order;
>> 	} else {
>> 		page_reporting_order = pageblock_order;
>> 	}
>> }
>>
> Great. Much more clearer on page_reporting.c 's end. 
> 
> Don't you think on the driver's end:
> 
> prdev->order = PAGE_REPORTING_USE_DEFAULT; looks clearer? As compared to:
> prdev->order = PAGE_REPORTING_ORDER_UNSET; ?
> 
> I'm thinking, why would a driver worry about page_reporting_order being set/unset?

Maybe PAGE_REPORTING_ORDER_UNSPECIFIED ?

In any case, we should use a single flag for this. Everything else will
be confusing once drivers could use only one of them.

-- 
Cheers,

David

^ permalink raw reply

* Re: [PATCH v1 4/4] page_reporting: change PAGE_REPORTING_DEFAULT_ORDER to -1
From: Yuvraj Sakshith @ 2026-03-02  8:54 UTC (permalink / raw)
  To: David Hildenbrand (Arm), Michael Kelley
  Cc: Michael Kelley, akpm@linux-foundation.org, mst@redhat.com,
	kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <a0133403-8ce3-45a4-987f-96fb7421f920@kernel.org>

On Mon, Mar 02, 2026 at 09:09:13AM +0100, David Hildenbrand (Arm) wrote:
> On 3/2/26 09:00, Yuvraj Sakshith wrote:
> > On Mon, Mar 02, 2026 at 08:42:57AM +0100, David Hildenbrand (Arm) wrote:
> >> On 3/2/26 06:25, Michael Kelley wrote:
> >>> From: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com> Sent: Sunday, March 1, 2026 7:33 PM
> >>>
> >>> I don't think what you propose is correct. The purpose of testing
> >>> page_reporting_order for -1 is to see if a page reporting order has
> >>> been specified on the kernel boot line. If it has been specified, then
> >>> the page reporting order specified in the call to page_reporting_register()
> >>> [either a specific value or the default] is ignored and the kernel boot
> >>> line value prevails. But if page_reporting_order is -1 here, then
> >>> no kernel boot line value was specified, and the value passed to
> >>> page_reporting_register() should prevail.
> >>>
> >>> With this in mind, substituting PAGE_REPORTING_DEFAULT_ORDER
> >>> for the -1 in the test doesn’t exactly make sense to me. The -1 in the
> >>> test doesn't have quite the same meaning as the -1 for
> >>> PAGE_REPORTING_DEFAULT_ORDER. You could even use -2 for
> >>> the initial value of page_reporting_order, and here in the test, in
> >>> order to make that distinction obvious. Or use a separate symbolic
> >>> name like PAGE_REPORTING_ORDER_NOT_SET.
> >>
> > Option 1:
> > 
> > if (page_reporting_order == PAGE_REPORTING_DEFAULT_ORDER) {
> >         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
> >                 && prdev->order <= MAX_PAGE_ORDER) {
> >                 page_reporting_order = prdev->order;
> >         } else {
> >                 page_reporting_order = pageblock_order;
> >         }
> > }
> > 
> > Option 2:
> > 
> > if (page_reporting_order == PAGE_REPORTING_ORDER_NOT_SET) {
> >         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
> >                 && prdev->order <= MAX_PAGE_ORDER) {
> >                 page_reporting_order = prdev->order;
> >         } else {
> >                 page_reporting_order = pageblock_order;
> >         }
> > }
> > 
> > 
> >> I don't really see a difference between "PAGE_REPORTING_DEFAULT_ORDER"
> >> and "PAGE_REPORTING_ORDER_NOT_SET" that would warrant a split and adding
> >> confusion for the page-reporting drivers.
> >>
> >> In both cases, we want "no special requirement, just use the default".
> >> Maybe we can use a better name to express that.
> > 
> > Agreed.
> > 
> > If we were to read this code without context, wouldn't it be confusing as to
> > why PAGE_REPORTING_DEFAULT_ORDER is being checked in the first place?
> 
> I proposed in one of the last mail that
> "PAGE_REPORTING_USE_DEFAULT_ORDER" could be clearer, stating that it's
> not really an order just yet. Maybe just using
> PAGE_REPORTING_ORDER_UNSET might be clearer.
> 
Ok
> > 
> > Option 1 checks if page_reporting_order is equal to PAGE_REPORTING_DEFAULT_ORDER
> > and then immediately checks if its not equal to it. Which is a bit confusing..
> 
> 
> Because it's wrong? :) We're not supposed to check page_reporting_order
> a second time. Assume we
> s/PAGE_REPORTING_ORDER/PAGE_REPORTING_ORDER_UNSET/ and actually check
> prdev->order:
Oops, typo :) I meant prdev->order.
> 
> if (page_reporting_order == PAGE_REPORTING_ORDER_UNSET) {
> 	if (prdev->order != PAGE_REPORTING_ORDER_UNSET &&
> 	    prdev->order <= MAX_PAGE_ORDER) {
> 		page_reporting_order = prdev->order;
> 	} else {
> 		page_reporting_order = pageblock_order;
> 	}
> }
> 
Great. Much clearer on page_reporting.c's end.

Don't you think on the driver's end:

prdev->order = PAGE_REPORTING_USE_DEFAULT; looks clearer? As compared to:
prdev->order = PAGE_REPORTING_ORDER_UNSET; ?

I'm thinking, why would a driver worry about page_reporting_order being set/unset?

But yes, too many flags...  


Thanks,
Yuvraj

^ permalink raw reply

* Re: [PATCH v1 4/4] page_reporting: change PAGE_REPORTING_DEFAULT_ORDER to -1
From: David Hildenbrand (Arm) @ 2026-03-02  8:09 UTC (permalink / raw)
  To: Yuvraj Sakshith, Michael Kelley
  Cc: akpm@linux-foundation.org, mst@redhat.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	longli@microsoft.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <aaVDiwEPl5t2UPX4@hu-ysakshit-lv.qualcomm.com>

On 3/2/26 09:00, Yuvraj Sakshith wrote:
> On Mon, Mar 02, 2026 at 08:42:57AM +0100, David Hildenbrand (Arm) wrote:
>> On 3/2/26 06:25, Michael Kelley wrote:
>>> From: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com> Sent: Sunday, March 1, 2026 7:33 PM
>>>
>>> I don't think what you propose is correct. The purpose of testing
>>> page_reporting_order for -1 is to see if a page reporting order has
>>> been specified on the kernel boot line. If it has been specified, then
>>> the page reporting order specified in the call to page_reporting_register()
>>> [either a specific value or the default] is ignored and the kernel boot
>>> line value prevails. But if page_reporting_order is -1 here, then
>>> no kernel boot line value was specified, and the value passed to
>>> page_reporting_register() should prevail.
>>>
>>> With this in mind, substituting PAGE_REPORTING_DEFAULT_ORDER
>>> for the -1 in the test doesn’t exactly make sense to me. The -1 in the
>>> test doesn't have quite the same meaning as the -1 for
>>> PAGE_REPORTING_DEFAULT_ORDER. You could even use -2 for
>>> the initial value of page_reporting_order, and here in the test, in
>>> order to make that distinction obvious. Or use a separate symbolic
>>> name like PAGE_REPORTING_ORDER_NOT_SET.
>>
> Option 1:
> 
> if (page_reporting_order == PAGE_REPORTING_DEFAULT_ORDER) {
>         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
>                 && prdev->order <= MAX_PAGE_ORDER) {
>                 page_reporting_order = prdev->order;
>         } else {
>                 page_reporting_order = pageblock_order;
>         }
> }
> 
> Option 2:
> 
> if (page_reporting_order == PAGE_REPORTING_ORDER_NOT_SET) {
>         if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
>                 && prdev->order <= MAX_PAGE_ORDER) {
>                 page_reporting_order = prdev->order;
>         } else {
>                 page_reporting_order = pageblock_order;
>         }
> }
> 
> 
>> I don't really see a difference between "PAGE_REPORTING_DEFAULT_ORDER"
>> and "PAGE_REPORTING_ORDER_NOT_SET" that would warrant a split and adding
>> confusion for the page-reporting drivers.
>>
>> In both cases, we want "no special requirement, just use the default".
>> Maybe we can use a better name to express that.
> 
> Agreed.
> 
> If we were to read this code without context, wouldn't it be confusing as to
> why PAGE_REPORTING_DEFAULT_ORDER is being checked in the first place?

I proposed in one of the last mails that
"PAGE_REPORTING_USE_DEFAULT_ORDER" could be clearer, stating that it's
not really an order just yet. Maybe just using
PAGE_REPORTING_ORDER_UNSET might be clearer.

> 
> Option 1 checks if page_reporting_order is equal to PAGE_REPORTING_DEFAULT_ORDER
> and then immediately checks if its not equal to it. Which is a bit confusing..


Because it's wrong? :) We're not supposed to check page_reporting_order
a second time. Assume we
s/PAGE_REPORTING_ORDER/PAGE_REPORTING_ORDER_UNSET/ and actually check
prdev->order:

if (page_reporting_order == PAGE_REPORTING_ORDER_UNSET) {
	if (prdev->order != PAGE_REPORTING_ORDER_UNSET &&
	    prdev->order <= MAX_PAGE_ORDER) {
		page_reporting_order = prdev->order;
	} else {
		page_reporting_order = pageblock_order;
	}
}




-- 
Cheers,

David

^ permalink raw reply

* Re: [PATCH v1 4/4] page_reporting: change PAGE_REPORTING_DEFAULT_ORDER to -1
From: Yuvraj Sakshith @ 2026-03-02  8:00 UTC (permalink / raw)
  To: David Hildenbrand (Arm), Michael Kelley
  Cc: Michael Kelley, akpm@linux-foundation.org, mst@redhat.com,
	kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <b1390b24-eaef-40e0-a16b-77c27decb77e@kernel.org>

On Mon, Mar 02, 2026 at 08:42:57AM +0100, David Hildenbrand (Arm) wrote:
> On 3/2/26 06:25, Michael Kelley wrote:
> > From: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com> Sent: Sunday, March 1, 2026 7:33 PM
> >>
> >> On Fri, Feb 27, 2026 at 09:50:15PM +0100, David Hildenbrand (Arm) wrote:
> >>>
> >>> No need for the ().
> >>>
> >>> Wondering whether we now also want to do in this patch:
> >>>
> >>>
> >>> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> >>> index f0042d5743af..d432aadf9d07 100644
> >>> --- a/mm/page_reporting.c
> >>> +++ b/mm/page_reporting.c
> >>> @@ -11,8 +11,7 @@
> >>>  #include "page_reporting.h"
> >>>  #include "internal.h"
> >>>
> >>> -/* Initialize to an unsupported value */
> >>> -unsigned int page_reporting_order = -1;
> >>> +unsigned int page_reporting_order = PAGE_REPORTING_DEFAULT_ORDER;
> >>>
> >>>  static int page_order_update_notify(const char *val, const struct
> >>> kernel_param *kp)
> >>>  {
> >>> @@ -369,7 +368,7 @@ int page_reporting_register(struct
> >>> page_reporting_dev_info *prdev)
> >>>          * pageblock_order.
> >>>          */
> >>>
> >>> -       if (page_reporting_order == -1) {
> >>> +       if (page_reporting_order == PAGE_REPORTING_DEFAULT_ORDER) {
> >>>
> >>>
> >>
> >> Sure. Now that I think of it, don’t you think the first nested if() will
> >> always be false? and can be compressed down to just one if()?
> > 
> > I don't think what you propose is correct. The purpose of testing
> > page_reporting_order for -1 is to see if a page reporting order has
> > been specified on the kernel boot line. If it has been specified, then
> > the page reporting order specified in the call to page_reporting_register()
> > [either a specific value or the default] is ignored and the kernel boot
> > line value prevails. But if page_reporting_order is -1 here, then
> > no kernel boot line value was specified, and the value passed to
> > page_reporting_register() should prevail.
> > 
> > With this in mind, substituting PAGE_REPORTING_DEFAULT_ORDER
> > for the -1 in the test doesn’t exactly make sense to me. The -1 in the
> > test doesn't have quite the same meaning as the -1 for
> > PAGE_REPORTING_DEFAULT_ORDER. You could even use -2 for
> > the initial value of page_reporting_order, and here in the test, in
> > order to make that distinction obvious. Or use a separate symbolic
> > name like PAGE_REPORTING_ORDER_NOT_SET.
> 
Option 1:

if (page_reporting_order == PAGE_REPORTING_DEFAULT_ORDER) {
        if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
                && prdev->order <= MAX_PAGE_ORDER) {
                page_reporting_order = prdev->order;
        } else {
                page_reporting_order = pageblock_order;
        }
}

Option 2:

if (page_reporting_order == PAGE_REPORTING_ORDER_NOT_SET) {
        if (page_reporting_order != PAGE_REPORTING_DEFAULT_ORDER
                && prdev->order <= MAX_PAGE_ORDER) {
                page_reporting_order = prdev->order;
        } else {
                page_reporting_order = pageblock_order;
        }
}


> I don't really see a difference between "PAGE_REPORTING_DEFAULT_ORDER"
> and "PAGE_REPORTING_ORDER_NOT_SET" that would warrant a split and adding
> confusion for the page-reporting drivers.
> 
> In both cases, we want "no special requirement, just use the default".
> Maybe we can use a better name to express that.

Agreed.

If we were to read this code without context, wouldn't it be confusing as to
why PAGE_REPORTING_DEFAULT_ORDER is being checked in the first place?

Option 1 checks if page_reporting_order is equal to PAGE_REPORTING_DEFAULT_ORDER
and then immediately checks if it's not equal to it, which is a bit confusing.

Moreover, page_reporting_order can be set from two places: the command line and
the driver itself. So PAGE_REPORTING_ORDER_NOT_SET can indicate whether it was set on the cmdline,
and PAGE_REPORTING_DEFAULT_ORDER can be used by drivers exclusively to "tell" page-reporting
to select the default value for us.

I think what Michael is pointing out is the precedence of the cmdline option over the driver's
request.

This is not obvious to the reader if we choose to only have one flag IMO :)

Thanks,
Yuvraj
 
> -- 
> Cheers,
> 
> David

^ permalink raw reply

