public inbox for linux-hyperv@vger.kernel.org
 help / color / mirror / Atom feed
From: Naman Jain <namjain@linux.microsoft.com>
To: Michael Kelley <mhklinux@outlook.com>,
	"K . Y . Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Long Li <longli@microsoft.com>
Cc: Saurabh Sengar <ssengar@linux.microsoft.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Drivers: hv: mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
Date: Fri, 3 Apr 2026 12:55:16 +0530	[thread overview]
Message-ID: <9f0e27ac-099b-4b7d-b082-f43245331bbc@linux.microsoft.com> (raw)
In-Reply-To: <SN6PR02MB415797828F8DC5C9AC7E9131D45EA@SN6PR02MB4157.namprd02.prod.outlook.com>



On 4/3/2026 9:05 AM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Tuesday, March 31, 2026 10:40 PM
>>
>> When registering VTL0 memory via MSHV_ADD_VTL0_MEMORY, the kernel
>> computes pgmap->vmemmap_shift as the number of trailing zeros in the
>> OR of start_pfn and last_pfn, intending to use the largest compound
>> page order both endpoints are aligned to.
>>
>> However, this value is not clamped to MAX_FOLIO_ORDER, so a
>> sufficiently aligned range (e.g. physical range 0x800000000000-
>> 0x800080000000, corresponding to start_pfn=0x800000000 with 35
>> trailing zeros) can produce a shift larger than what
>> memremap_pages() accepts, triggering a WARN and returning -EINVAL:
>>
>>    WARNING: ... memremap_pages+0x512/0x650
>>    requested folio size unsupported
>>
>> The MAX_FOLIO_ORDER check was added by
>> commit 646b67d57589 ("mm/memremap: reject unreasonable folio/compound
>> page sizes in memremap_pages()").
>> When CONFIG_HAVE_GIGANTIC_FOLIOS=y, CONFIG_SPARSEMEM_VMEMMAP=y, and
>> CONFIG_HUGETLB_PAGE is not set, MAX_FOLIO_ORDER resolves to
>> (PUD_SHIFT - PAGE_SHIFT) = 18. Any range whose PFN alignment exceeds
>> order 18 hits this path.
> 
> I'm not clear on what point you are making with this specific
> configuration that results in MAX_FOLIO_ORDER being 18. Is it just
> an example? Is 18 the largest expected value for MAX_FOLIO_ORDER?
> And note that PUD_SHIFT and PAGE_SHIFT might have different values
> on arm64 with a page size other than 4K.
> 

Yes, this was just an example. It is not generalized enough, I will 
remove it.
MAX_FOLIO_ORDER could go beyond 18.

>>
>> Fix this by clamping vmemmap_shift to MAX_FOLIO_ORDER so we always
>> request the largest order the kernel supports, rather than an
>> out-of-range value.
>>
>> Also fix the error path to propagate the actual error code from
>> devm_memremap_pages() instead of hard-coding -EFAULT, which was
>> masking the real -EINVAL return.
>>
>> Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>>   drivers/hv/mshv_vtl_main.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hv/mshv_vtl_main.c b/drivers/hv/mshv_vtl_main.c
>> index 5856975f32e12..255fed3a740c1 100644
>> --- a/drivers/hv/mshv_vtl_main.c
>> +++ b/drivers/hv/mshv_vtl_main.c
>> @@ -405,8 +405,12 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
>>   	/*
>>   	 * Determine the highest page order that can be used for the given memory range.
>>   	 * This works best when the range is aligned; i.e. both the start and the length.
>> +	 * Clamp to MAX_FOLIO_ORDER to avoid a WARN in memremap_pages() when the range
>> +	 * alignment exceeds the maximum supported folio order for this kernel config.
>>   	 */
>> -	pgmap->vmemmap_shift = count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn);
>> +	pgmap->vmemmap_shift = min_t(unsigned long,
>> +				     count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn),
>> +				     MAX_FOLIO_ORDER);
> 
> Is it necessary to use min_t() here, or would min() work?  Neither count_trailing_zeros()
> nor MAX_FOLIO_ORDER is ever negative, so it seems like just min() would work with
> no potential for doing a bogus comparison or assignment.
>

min could work, yes. I just felt min_t is more safer for comparing these 
two different types of values -
count_trailing_zeroes being 'int'
MAX_FOLIO_ORDER being a macro, calculated by applying bit operations.

and destination being unsigned int.


include/linux/mmzone.h:#define MAX_FOLIO_ORDER          MAX_PAGE_ORDER
include/linux/mmzone.h:#define MAX_FOLIO_ORDER          PFN_SECTION_SHIFT
include/linux/mmzone.h:#define MAX_FOLIO_ORDER          (ilog2(SZ_16G) - 
PAGE_SHIFT)
include/linux/mmzone.h:#define MAX_FOLIO_ORDER          (ilog2(SZ_1G) - 
PAGE_SHIFT)
include/linux/mmzone.h:#define MAX_FOLIO_ORDER          (PUD_SHIFT - 
PAGE_SHIFT)

I am fine with anything you suggest here.

> The shift is calculated using the originally passed in start_pfn and last_pfn, while the
> "range" struct in pgmap has an "end" value that is one page less. So is the idea to
> go ahead and create the mapping with folios of a size that includes that last page,
> and then just waste the last page of the last folio?

No, waste does not occur. The vmemmap_shift determines the folio order, 
and memmap_init_zone_device() walks the range [start_pfn, last_pfn) in 
steps of (1 << vmemmap_shift) pages, creating one folio per step. The OR 
operation ensures both boundaries are aligned to multiples of (1 << 
vmemmap_shift), guaranteeing the range divides evenly into folios with 
no partial folio at the end.
The intention is to find the highest folio order possible here, and if 
it reaches the MAX_FOLIO_ORDER, restrict vmemmap_shift to it.

> 
>>   	dev_dbg(vtl->module_dev,
>>   		"Add VTL0 memory: start: 0x%llx, end_pfn: 0x%llx, page order: %lu\n",
>>   		vtl0_mem.start_pfn, vtl0_mem.last_pfn, pgmap->vmemmap_shift);
>> @@ -415,7 +419,7 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
>>   	if (IS_ERR(addr)) {
>>   		dev_err(vtl->module_dev, "devm_memremap_pages error: %ld\n", PTR_ERR(addr));
>>   		kfree(pgmap);
>> -		return -EFAULT;
>> +		return PTR_ERR(addr);
>>   	}
>>
>>   	/* Don't free pgmap, since it has to stick around until the memory
>>
>> base-commit: 36ece9697e89016181e5ae87510e40fb31d86f2b
>> --
>> 2.43.0
>>

Thanks for reviewing.

Regards,
Naman


  reply	other threads:[~2026-04-03  7:25 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-01  5:40 [PATCH] Drivers: hv: mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER Naman Jain
2026-04-03  3:35 ` Michael Kelley
2026-04-03  7:25   ` Naman Jain [this message]
2026-04-03 18:37     ` Michael Kelley
2026-04-06  4:56       ` Naman Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f0e27ac-099b-4b7d-b082-f43245331bbc@linux.microsoft.com \
    --to=namjain@linux.microsoft.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=mhklinux@outlook.com \
    --cc=ssengar@linux.microsoft.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox