* [PATCH v2 0/3] vdso/datastore: Allow prefaulting by mlockall()
@ 2025-09-01 12:34 Thomas Weißschuh
2025-09-01 12:34 ` [PATCH v2 1/3] vdso/datastore: Explicitly prevent remote access to timens vvar page Thomas Weißschuh
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Thomas Weißschuh @ 2025-09-01 12:34 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner,
Andy Lutomirski, Vincenzo Frascino
Cc: Nam Cao, linux-kernel, Thomas Weißschuh
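Latency-sensitive applications expect not to take any page faults after
calling mlockall(). However, mlockall() skips mappings marked VM_PFNMAP or
VM_IO, and the generic vDSO datastore mapping sets both flags.

This series drops both flags from the datastore mapping so that mlockall()
can prefault it, and maps the zero page wherever data is unavailable.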
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
Changes in v2:
- Stop using nth_page() which is being removed
- Link to v1: https://lore.kernel.org/r/20250812-vdso-mlockall-v1-0-2f49ba7cf819@linutronix.de
---
Thomas Weißschuh (3):
vdso/datastore: Explicitly prevent remote access to timens vvar page
vdso/datastore: Allow prefaulting by mlockall()
vdso/datastore: Map zero page for unavailable data
kernel/time/namespace.c | 7 ++-----
lib/vdso/datastore.c | 38 ++++++++++++++++++++++----------------
2 files changed, 24 insertions(+), 21 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250721-vdso-mlockall-461bb33205b1
Best regards,
--
Thomas Weißschuh <thomas.weissschuh@linutronix.de>
* [PATCH v2 1/3] vdso/datastore: Explicitly prevent remote access to timens vvar page
From: Thomas Weißschuh @ 2025-09-01 12:34 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner,
Andy Lutomirski, Vincenzo Frascino
Cc: Nam Cao, linux-kernel, Thomas Weißschuh
The fault handler for the timens page does not have access to the target
task and therefore cannot be invoked remotely.

Currently the handler relies on the vvar mapping being marked VM_IO and
VM_PFNMAP, for which the mm core always prevents remote access. However,
these flags are going to be removed from the mapping.

Add an explicit check to prevent remote access to the mapping. Move the
call to find_timens_vvar_page() after the check to avoid hitting the WARN()
in that function.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
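Not part of the patch: a minimal userspace sketch of what "remote access"
means here, assuming a Linux system with /proc mounted. It locates the
[vvar] mapping in /proc/self/maps and tries to read it through
/proc/self/mem, one of the interfaces named in the comment this patch
rewrites. The helper names (parse_vvar_line(), find_vvar(),
probe_vvar_remote()) are made up for illustration. The sketch only reports
what happened; it does not assume a particular error code.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse one line of /proc/<pid>/maps. Returns 1 and fills in the address
 * range if the line describes the [vvar] mapping, 0 otherwise.
 */
static int parse_vvar_line(const char *line, unsigned long *start,
			   unsigned long *end)
{
	assert(line && start && end);
	if (!strstr(line, "[vvar]"))
		return 0;
	return sscanf(line, "%lx-%lx", start, end) == 2;
}

/* Find the [vvar] mapping of the calling process; 0 on success, -1 if absent. */
static int find_vvar(unsigned long *start, unsigned long *end)
{
	char line[512];
	FILE *maps = fopen("/proc/self/maps", "r");
	int ret = -1;

	if (!maps)
		return -1;
	while (fgets(line, sizeof(line), maps)) {
		if (parse_vvar_line(line, start, end)) {
			ret = 0;
			break;
		}
	}
	fclose(maps);
	return ret;
}

/*
 * Try to read the first word of [vvar] through /proc/self/mem.
 * Returns 0 if the read succeeded, otherwise an errno value.
 */
int probe_vvar_remote(void)
{
	unsigned long start, end, word;
	int fd, ret = 0;

	if (find_vvar(&start, &end))
		return ENOENT;
	fd = open("/proc/self/mem", O_RDONLY);
	if (fd < 0)
		return errno;
	errno = 0;
	if (pread(fd, &word, sizeof(word), (off_t)start) != (ssize_t)sizeof(word))
		ret = errno ? errno : EIO;	/* short read without errno */
	close(fd);
	return ret;
}
```

Calling probe_vvar_remote() from a small main() and printing strerror() of a
non-zero result is enough to compare kernels. The intent of this patch is
that such a read keeps failing once VM_IO and VM_PFNMAP are gone.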
kernel/time/namespace.c | 7 ++-----
lib/vdso/datastore.c | 7 ++++++-
2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c
index 667452768ed3b50e48e3cfb70f8ef68e4bed9e0b..e225547021b73230e3c820cd91635e0483821c49 100644
--- a/kernel/time/namespace.c
+++ b/kernel/time/namespace.c
@@ -198,11 +198,8 @@ struct page *find_timens_vvar_page(struct vm_area_struct *vma)
return current->nsproxy->time_ns->vvar_page;
/*
- * VM_PFNMAP | VM_IO protect .fault() handler from being called
- * through interfaces like /proc/$pid/mem or
- * process_vm_{readv,writev}() as long as there's no .access()
- * in special_mapping_vmops().
- * For more details check_vma_flags() and __access_remote_vm()
+ * vvar_fault() protects this from being called through remote interfaces like
+ * /proc/$pid/mem or process_vm_{readv,writev}().
*/
WARN(1, "vvar_page accessed remotely");
diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
index 3693c6caf2c4d41a526613d5fb746cb3a981ea2e..ed1aa3e27b13f8b48d18dad9488e0798f49cb338 100644
--- a/lib/vdso/datastore.c
+++ b/lib/vdso/datastore.c
@@ -40,10 +40,15 @@ struct vdso_arch_data *vdso_k_arch_data = &vdso_arch_data_store.data;
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
- struct page *timens_page = find_timens_vvar_page(vma);
+ struct page *timens_page;
unsigned long addr, pfn;
vm_fault_t err;
+ if (unlikely(vmf->flags & FAULT_FLAG_REMOTE))
+ return VM_FAULT_SIGBUS;
+
+ timens_page = find_timens_vvar_page(vma);
+
switch (vmf->pgoff) {
case VDSO_TIME_PAGE_OFFSET:
if (!IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO))
--
2.51.0
* [PATCH v2 2/3] vdso/datastore: Allow prefaulting by mlockall()
From: Thomas Weißschuh @ 2025-09-01 12:34 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner,
Andy Lutomirski, Vincenzo Frascino
Cc: Nam Cao, linux-kernel, Thomas Weißschuh
Latency-sensitive applications expect not to take any page faults after
calling mlockall(). However, mlockall() skips mappings marked VM_PFNMAP or
VM_IO, and the generic vDSO datastore mapping sets both flags.

While the fault handler itself is very fast, going through the full page
fault exception handling is much slower, on the order of 20us on a test
machine.

Since the memory behind the datastore mappings is always present and
accessible, VM_IO is not necessary for them. VM_PFNMAP can be removed by
mapping the pages through 'struct page' instead of PFNs. VM_MIXEDMAP is
necessary to call vmf_insert_page() in the timens optimization path.

The data page mapping is now also aligned with the architecture-specific
code pages. Some architecture-specific data pages, like the x86 VCLOCK
pages, continue to use VM_IO as they are not always mappable.

Regular mlock() would also work, but userspace does not know the boundaries
of the vDSO.
Reported-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Nam Cao <namcao@linutronix.de>
---
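Not part of the patch: a rough userspace sketch of how the effect could be
observed, assuming Linux and a libc that routes clock_gettime() through the
vDSO. It calls mlockall() and then counts, via getrusage(), how many page
faults the next clock_gettime() takes. The function names are made up for
illustration, and the count is only meaningful if the process has not
touched the vDSO data page before the call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <time.h>

/* Page faults (minor + major) taken by this process so far; -1 on error. */
long faults_so_far(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru))
		return -1;
	return ru.ru_minflt + ru.ru_majflt;
}

/*
 * Lock all current and future mappings, then return the number of page
 * faults taken by the first clock_gettime() call afterwards.
 * Returns -1 if mlockall() or getrusage() fails (for example because of
 * RLIMIT_MEMLOCK).
 */
long faults_on_first_clock_read(void)
{
	struct timespec ts;
	long before, after;

	if (mlockall(MCL_CURRENT | MCL_FUTURE))
		return -1;

	before = faults_so_far();
	clock_gettime(CLOCK_MONOTONIC, &ts);
	after = faults_so_far();
	if (before < 0 || after < 0)
		return -1;

	assert(after >= before);	/* fault counters never go backwards */
	return after - before;
}
```

A non-zero result from faults_on_first_clock_read() would point at the kind
of late fault described above. A result of zero is what this patch aims
for, although other activity in the process can also affect the counters.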
lib/vdso/datastore.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
index ed1aa3e27b13f8b48d18dad9488e0798f49cb338..00714c0cf0b24b813bf5b28ff8a19e5f246fce45 100644
--- a/lib/vdso/datastore.c
+++ b/lib/vdso/datastore.c
@@ -40,8 +40,8 @@ struct vdso_arch_data *vdso_k_arch_data = &vdso_arch_data_store.data;
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
- struct page *timens_page;
- unsigned long addr, pfn;
+ struct page *page, *timens_page;
+ unsigned long addr;
vm_fault_t err;
if (unlikely(vmf->flags & FAULT_FLAG_REMOTE))
@@ -53,17 +53,17 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
case VDSO_TIME_PAGE_OFFSET:
if (!IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO))
return VM_FAULT_SIGBUS;
- pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
+ page = virt_to_page(vdso_k_time_data);
if (timens_page) {
/*
* Fault in VVAR page too, since it will be accessed
* to get clock data anyway.
*/
addr = vmf->address + VDSO_TIMENS_PAGE_OFFSET * PAGE_SIZE;
- err = vmf_insert_pfn(vma, addr, pfn);
+ err = vmf_insert_page(vma, addr, page);
if (unlikely(err & VM_FAULT_ERROR))
return err;
- pfn = page_to_pfn(timens_page);
+ page = timens_page;
}
break;
case VDSO_TIMENS_PAGE_OFFSET:
@@ -76,24 +76,25 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
*/
if (!IS_ENABLED(CONFIG_TIME_NS) || !timens_page)
return VM_FAULT_SIGBUS;
- pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
+ page = virt_to_page(vdso_k_time_data);
break;
case VDSO_RNG_PAGE_OFFSET:
if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
return VM_FAULT_SIGBUS;
- pfn = __phys_to_pfn(__pa_symbol(vdso_k_rng_data));
+ page = virt_to_page(vdso_k_rng_data);
break;
case VDSO_ARCH_PAGES_START ... VDSO_ARCH_PAGES_END:
if (!IS_ENABLED(CONFIG_ARCH_HAS_VDSO_ARCH_DATA))
return VM_FAULT_SIGBUS;
- pfn = __phys_to_pfn(__pa_symbol(vdso_k_arch_data)) +
- vmf->pgoff - VDSO_ARCH_PAGES_START;
+ page = virt_to_page(vdso_k_arch_data) + vmf->pgoff - VDSO_ARCH_PAGES_START;
break;
default:
return VM_FAULT_SIGBUS;
}
- return vmf_insert_pfn(vma, vmf->address, pfn);
+ get_page(page);
+ vmf->page = page;
+ return 0;
}
const struct vm_special_mapping vdso_vvar_mapping = {
@@ -104,8 +105,8 @@ const struct vm_special_mapping vdso_vvar_mapping = {
struct vm_area_struct *vdso_install_vvar_mapping(struct mm_struct *mm, unsigned long addr)
{
return _install_special_mapping(mm, addr, VDSO_NR_PAGES * PAGE_SIZE,
- VM_READ | VM_MAYREAD | VM_IO | VM_DONTDUMP |
- VM_PFNMAP | VM_SEALED_SYSMAP,
+ VM_READ | VM_MAYREAD | VM_DONTDUMP |
+ VM_MIXEDMAP | VM_SEALED_SYSMAP,
&vdso_vvar_mapping);
}
--
2.51.0
* [PATCH v2 3/3] vdso/datastore: Map zero page for unavailable data
From: Thomas Weißschuh @ 2025-09-01 12:34 UTC
To: Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner,
Andy Lutomirski, Vincenzo Frascino
Cc: Nam Cao, linux-kernel, Thomas Weißschuh
mlockall() stops if a page in a VMA is unmappable. As the datastore VMA can
contain holes, mlockall() does not process all data pages correctly.

Replace the mapping error VM_FAULT_SIGBUS with a mapping of the zero page.
The vDSO never accesses these pages, and to other userspace their contents
are undefined anyway.

This allows mlockall() to process all pages within the VMA.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
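Not part of the patch: a simplified userspace model of the page selection
after this change. It is not kernel code. The page offsets, the features
struct and the static arrays are hypothetical stand-ins for the VDSO_*
offsets, the config options and the real data pages. The zero_fill
parameter switches between the old behaviour (holes are errors) and the new
one (holes resolve to a zero page), and populate() models a walker that
stops at the first page it cannot map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page offsets inside the modelled mapping. */
enum { PAGE_TIME, PAGE_TIMENS, PAGE_RNG, NR_PAGES };

/* Hypothetical stand-ins for the config options and the timens state. */
struct features {
	int generic_vdso;
	int time_ns;
	int getrandom;
};

static const char zero_page[4096];		/* stand-in for ZERO_PAGE(0) */
static const char time_data[4096] = { 1 };
static const char rng_data[4096] = { 2 };

/*
 * Pick the backing page for one offset. With zero_fill set, unavailable
 * data resolves to the zero page. Without it, unavailable data is an
 * error (NULL), like VM_FAULT_SIGBUS before the patch.
 */
const char *pick_page(unsigned int pgoff, const struct features *f,
		      int zero_fill)
{
	const char *page = zero_fill ? zero_page : NULL;

	assert(f != NULL);
	switch (pgoff) {
	case PAGE_TIME:
		if (f->generic_vdso)
			page = time_data;
		break;
	case PAGE_TIMENS:
		if (f->time_ns)
			page = time_data;
		break;
	case PAGE_RNG:
		if (f->getrandom)
			page = rng_data;
		break;
	default:
		return NULL;	/* outside the mapping: still an error */
	}
	return page;
}

/* Walk the mapping and stop at the first unmappable page. */
unsigned int populate(const struct features *f, int zero_fill)
{
	unsigned int pgoff;

	for (pgoff = 0; pgoff < NR_PAGES; pgoff++)
		if (!pick_page(pgoff, f, zero_fill))
			break;
	return pgoff;
}
```

With only generic_vdso enabled, populate() stops after the first page when
zero_fill is 0 and reaches all NR_PAGES pages when zero_fill is 1, which is
the difference this patch makes for mlockall().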
lib/vdso/datastore.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
index 00714c0cf0b24b813bf5b28ff8a19e5f246fce45..f9e37195c2af43c7b2c4b02d01be492d84223ecd 100644
--- a/lib/vdso/datastore.c
+++ b/lib/vdso/datastore.c
@@ -40,7 +40,7 @@ struct vdso_arch_data *vdso_k_arch_data = &vdso_arch_data_store.data;
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
- struct page *page, *timens_page;
+ struct page *page = ZERO_PAGE(0), *timens_page;
unsigned long addr;
vm_fault_t err;
@@ -52,7 +52,7 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
switch (vmf->pgoff) {
case VDSO_TIME_PAGE_OFFSET:
if (!IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO))
- return VM_FAULT_SIGBUS;
+ break;
page = virt_to_page(vdso_k_time_data);
if (timens_page) {
/*
@@ -75,17 +75,17 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
* See also the comment near timens_setup_vdso_data().
*/
if (!IS_ENABLED(CONFIG_TIME_NS) || !timens_page)
- return VM_FAULT_SIGBUS;
+ break;
page = virt_to_page(vdso_k_time_data);
break;
case VDSO_RNG_PAGE_OFFSET:
if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
- return VM_FAULT_SIGBUS;
+ break;
page = virt_to_page(vdso_k_rng_data);
break;
case VDSO_ARCH_PAGES_START ... VDSO_ARCH_PAGES_END:
if (!IS_ENABLED(CONFIG_ARCH_HAS_VDSO_ARCH_DATA))
- return VM_FAULT_SIGBUS;
+ break;
page = virt_to_page(vdso_k_arch_data) + vmf->pgoff - VDSO_ARCH_PAGES_START;
break;
default:
--
2.51.0