From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	peterz@infradead.org, mingo@kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 1/2] mm: mark remap_file_pages() syscall as deprecated
Date: Thu, 8 May 2014 15:41:27 +0300
Message-ID: <1399552888-11024-2-git-send-email-kirill.shutemov@linux.intel.com>
In-Reply-To: <1399552888-11024-1-git-send-email-kirill.shutemov@linux.intel.com>
The remap_file_pages() system call is used to create a nonlinear mapping,
that is, a mapping in which the pages of the file are mapped into a
nonsequential order in memory. The advantage of using remap_file_pages()
over using repeated calls to mmap(2) is that the former approach does not
require the kernel to create additional VMA (Virtual Memory Area) data
structures.

Nonlinear mappings require a significant amount of non-trivial code in the
kernel virtual memory subsystem, including in hot paths. To make them work,
the kernel also needs a way to distinguish normal page table entries from
entries that hold a file offset (pte_file), and it reserves a PTE flag for
this purpose. PTE flags are a scarce resource, especially on some CPU
architectures. It would be nice to free up the flag for other uses.

Fortunately, there are not many users of remap_file_pages() in the wild.
The only known user is one enterprise RDBMS implementation, which uses the
syscall on 32-bit systems to map files bigger than can linearly fit into a
32-bit virtual address space. This use case is no longer critical, since
64-bit systems are widely available.

The plan is to deprecate the syscall and replace it with an emulation.
The emulation will create new VMAs instead of nonlinear mappings. It will
be slower for the rare users of remap_file_pages(), but the ABI is
preserved.

One side effect of the emulation (apart from performance) is that a user can
hit the vm.max_map_count limit more easily due to the additional VMAs. See
the comment for DEFAULT_MAX_MAP_COUNT for more details on the limit.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
Documentation/vm/remap_file_pages.txt | 28 ++++++++++++++++++++++++++++
mm/fremap.c | 4 ++++
2 files changed, 32 insertions(+)
create mode 100644 Documentation/vm/remap_file_pages.txt
diff --git a/Documentation/vm/remap_file_pages.txt b/Documentation/vm/remap_file_pages.txt
new file mode 100644
index 000000000000..560e4363a55d
--- /dev/null
+++ b/Documentation/vm/remap_file_pages.txt
@@ -0,0 +1,28 @@
+The remap_file_pages() system call is used to create a nonlinear mapping,
+that is, a mapping in which the pages of the file are mapped into a
+nonsequential order in memory. The advantage of using remap_file_pages()
+over using repeated calls to mmap(2) is that the former approach does not
+require the kernel to create additional VMA (Virtual Memory Area) data
+structures.
+
+Nonlinear mappings require a significant amount of non-trivial code in the
+kernel virtual memory subsystem, including in hot paths. To make them work,
+the kernel also needs a way to distinguish normal page table entries from
+entries that hold a file offset (pte_file), and it reserves a PTE flag for
+this purpose. PTE flags are a scarce resource, especially on some CPU
+architectures. It would be nice to free up the flag for other uses.
+
+Fortunately, there are not many users of remap_file_pages() in the wild.
+The only known user is one enterprise RDBMS implementation, which uses the
+syscall on 32-bit systems to map files bigger than can linearly fit into a
+32-bit virtual address space. This use case is no longer critical, since
+64-bit systems are widely available.
+
+The plan is to deprecate the syscall and replace it with an emulation.
+The emulation will create new VMAs instead of nonlinear mappings. It will
+be slower for the rare users of remap_file_pages(), but the ABI is
+preserved.
+
+One side effect of the emulation (apart from performance) is that a user can
+hit the vm.max_map_count limit more easily due to the additional VMAs. See
+the comment for DEFAULT_MAX_MAP_COUNT for more details on the limit.
diff --git a/mm/fremap.c b/mm/fremap.c
index 34feba60a17e..12c3bb63b7f9 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -152,6 +152,10 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
int has_write_lock = 0;
vm_flags_t vm_flags = 0;
+ pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. "
+ "See Documentation/vm/remap_file_pages.txt.\n",
+ current->comm, current->pid);
+
if (prot)
return err;
/*
--
2.0.0.rc2