From: Daniel Black <daniel@linux.ibm.com>
To: linux-mm@kvack.org, mike.kravetz@oracle.com, khlebnikov@openvz.org
Cc: Daniel Black <daniel@linux.ibm.com>
Subject: [PATCH] mm: madvise(MADV_DODUMP) allow hugetlbfs pages
Date: Sun, 30 Sep 2018 15:46:29 +1000 [thread overview]
Message-ID: <20180930054629.29150-1-daniel@linux.ibm.com> (raw)
Reproducer, assuming 2M of hugetlbfs available:
Hugetlbfs mounted, size=2M and option user=testuser
# mount | grep ^hugetlbfs
hugetlbfs on /dev/hugepages type hugetlbfs (rw,pagesize=2M,user=dan)
# sysctl vm.nr_hugepages=1
vm.nr_hugepages = 1
# grep Huge /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2048 kB
Code:
#include <sys/mman.h>
#include <stddef.h>
#define SIZE 2*1024*1024
int main()
{
void *ptr;
ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_HUGETLB | MAP_ANONYMOUS, -1, 0);
madvise(ptr, SIZE, MADV_DONTDUMP);
madvise(ptr, SIZE, MADV_DODUMP);
}
Compile and strace:
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7ff7c9200000
madvise(0x7ff7c9200000, 2097152, MADV_DONTDUMP) = 0
madvise(0x7ff7c9200000, 2097152, MADV_DODUMP) = -1 EINVAL (Invalid argument)
hugetlbfs pages have VM_DONTEXPAND in the VmFlags driver pages based on
author testing with analysis from Florian Weimer[1].
The inclusion of VM_DONTEXPAND into the VM_SPECIAL defination
was a consequence of the large useage of VM_DONTEXPAND in device
drivers.
A consequence of [2] is that VM_DONTEXPAND marked pages are unable to be
marked DODUMP.
A user could quite legitimately madvise(MADV_DONTDUMP) their hugetlbfs
memory for a while and later request that madvise(MADV_DODUMP) on the
same memory. We correct this omission by allowing madvice(MADV_DODUMP)
on hugetlbfs pages.
[1] https://stackoverflow.com/questions/52548260/madvisedodump-on-the-same-ptr-size-as-a-successful-madvisedontdump-fails-wit
[2] commit 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers")
Fixes: 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers")
Reported-by: Kenneth Penza <kpenza@gmail.com>
Signed-off-by: Daniel Black <daniel@linux.ibm.com>
Buglink: https://lists.launchpad.net/maria-discuss/msg05245.html
Cc: linux-mm@kvack.org
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 972a9eaa898b..71d21df2a3f3 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -96,7 +96,7 @@ static long madvise_behavior(struct vm_area_struct *vma,
new_flags |= VM_DONTDUMP;
break;
case MADV_DODUMP:
- if (new_flags & VM_SPECIAL) {
+ if (!is_vm_hugetlb_page(vma) && new_flags & VM_SPECIAL) {
error = -EINVAL;
goto out;
}
--
2.19.0
next reply other threads:[~2018-09-30 5:46 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-30 5:46 Daniel Black [this message]
2018-10-01 22:11 ` [PATCH] mm: madvise(MADV_DODUMP) allow hugetlbfs pages Mike Kravetz
2018-10-03 5:47 ` Daniel Black
2018-10-04 21:47 ` Mike Kravetz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180930054629.29150-1-daniel@linux.ibm.com \
--to=daniel@linux.ibm.com \
--cc=khlebnikov@openvz.org \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).