From: Jianguo Wu <wujianguo@huawei.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, bugzilla-daemon@bugzilla.kernel.org,
iceman_dvd@yahoo.com
Subject: Re: [Bug 56881] New: MAP_HUGETLB mmap fails for certain sizes
Date: Wed, 24 Apr 2013 16:20:50 +0800
Message-ID: <517795E2.6070404@huawei.com>
In-Reply-To: <1366755995-no3omuhl-mutt-n-horiguchi@ah.jp.nec.com>
On 2013/4/24 6:26, Naoya Horiguchi wrote:
> On Tue, Apr 23, 2013 at 01:25:22PM -0700, Andrew Morton wrote:
>>
>> (switched to email. Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Sat, 20 Apr 2013 03:00:30 +0000 (UTC) bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=56881
>>>
>>> Summary: MAP_HUGETLB mmap fails for certain sizes
>>> Product: Memory Management
>>> Version: 2.5
>>> Kernel Version: 3.5.0-27
>>
>> Thanks.
>>
>> It's a post-3.4 regression, testcase included. Does someone want to
>> take a look, please?
>
> Let me try it.
>
> static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
> {
> struct inode *inode = file->f_path.dentry->d_inode;
> loff_t len, vma_len;
> int ret;
> struct hstate *h = hstate_file(file);
> ...
> if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT))
> return -EINVAL;
>
> This code checks only whether a given hugetlb vma covers (1 << order)
> pages, not whether it's exactly hugepage aligned.
> Before 2b37c35e6552 "fs/hugetlbfs/inode.c: fix pgoff alignment
> checking on 32-bit", it was
>
> if (vma->vm_pgoff & ~(huge_page_mask(h) >> PAGE_SHIFT))
>
> , but this made no sense because ~(huge_page_mask(h) >> PAGE_SHIFT) is
> 0xff for 2M hugepage.
> I think the reported problem is not a bug because the behavior before
> this change was wrong or not as expected.
>
> If we want to make sure that a given address range fits hugepage size,
> something like below can be useful.
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 78bde32..a98304b 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -113,11 +113,11 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
> vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND | VM_DONTDUMP;
> vma->vm_ops = &hugetlb_vm_ops;
>
> - if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT))
> - return -EINVAL;
> -
> vma_len = (loff_t)(vma->vm_end - vma->vm_start);
>
> + if (vma->len & ~huge_page_mask(h))
> + return -EINVAL;
> +
> mutex_lock(&inode->i_mutex);
> file_accessed(file);
>
>
> Thanks,
> Naoya Horiguchi
>
Hi Naoya,
I think the -EINVAL is returned from hugetlb_get_unmapped_area().
Take the two testcases:
1) $ ./mmappu $((5 * 2 * 1024 * 1024 - 4096)) //len1 = 0x9ff000
2) $ ./mmappu $((5 * 2 * 1024 * 1024 - 4095)) //len2 = 0x9ff001
In do_mmap_pgoff(), after "len = PAGE_ALIGN(len);", len1 is still 0x9ff000
but len2 has been rounded up to 0xa00000. So len2 passes the
"if (len & ~huge_page_mask(h))" check in hugetlb_get_unmapped_area(), while
len1 fails it and -EINVAL is returned. The call path is as follows:
do_mmap_pgoff()
{
	...
	/* Careful about overflows.. */
	len = PAGE_ALIGN(len);
	...
	get_unmapped_area()
		-->hugetlb_get_unmapped_area()
		{
			...
			if (len & ~huge_page_mask(h))
				return -EINVAL;
			...
		}
}
Do we need to align len to the hugepage size when it is a hugetlbfs mmap? Something like below:
---
 mm/mmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 0db0de1..bd42be24 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1188,7 +1188,10 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
 		addr = round_hint_to_min(addr);

 	/* Careful about overflows.. */
-	len = PAGE_ALIGN(len);
+	if (file && is_file_hugepages(file))
+		len = ALIGN(len, huge_page_size(hstate_file(file)));
+	else
+		len = PAGE_ALIGN(len);
 	if (!len)
 		return -ENOMEM;

--
Thanks,
Jianguo Wu
>>> Platform: All
>>> OS/Version: Linux
>>> Tree: Mainline
>>> Status: NEW
>>> Severity: high
>>> Priority: P1
>>> Component: Other
>>> AssignedTo: akpm@linux-foundation.org
>>> ReportedBy: iceman_dvd@yahoo.com
>>> Regression: No
>>>
>>>
>>> This is on an Ubuntu 12.10 desktop, but the same issue has been found on 12.04
>>> with 3.5.0 kernel.
>>> See the sample program. An allocation with MAP_HUGETLB consistently fails with
>>> certain sizes, while it succeeds with others.
>>> The allocation sizes are well below the number of free huge pages.
>>>
>>> $ uname -a Linux davide-lnx2 3.5.0-27-generic #46-Ubuntu SMP Mon Mar 25
>>> 19:58:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>>
>>> # echo 100 > /proc/sys/vm/nr_hugepages
>>>
>>> # cat /proc/meminfo
>>> ...
>>> AnonHugePages: 0 kB
>>> HugePages_Total: 100
>>> HugePages_Free: 100
>>> HugePages_Rsvd: 0
>>> HugePages_Surp: 0
>>> Hugepagesize: 2048 kB
>>>
>>>
>>> $ ./mmappu $((5 * 2 * 1024 * 1024 - 4096))
>>> size=10481664 0x9ff000
>>> hugepage mmap: Invalid argument
>>>
>>>
>>> $ ./mmappu $((5 * 2 * 1024 * 1024 - 4095))
>>> size=10481665 0x9ff001
>>> OK!
>>>
>>>
>>> It seems the trigger point is a normal page size.
>>> The same binary works flawlessly in previous kernels.
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>