From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
tools@kernel.org, Sasha Levin <sashal@kernel.org>
Subject: [RFC v3 3/4] mm/mlock: add API specification for mlock
Date: Fri, 11 Jul 2025 07:42:47 -0400
Message-ID: <20250711114248.2288591-4-sashal@kernel.org>
In-Reply-To: <20250711114248.2288591-1-sashal@kernel.org>

Add a kernel API specification for the mlock() system call.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/mlock.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 85 insertions(+)
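
Illustration only, not part of the patch: a minimal userspace sketch of how
a caller might consume the rounding rules and error codes recorded in the
specification in the diff. The helper names (locked_span, lock_buffer,
mlock_error_summary) are hypothetical and exist only for this example; the
short summaries are copied from the error-code entries in the spec, and the
rounding follows the "rounded down" / "rounded up" parameter constraints.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: bytes covered once start is rounded down and the
 * end of the range is rounded up to a page boundary. page_size must be a
 * power of two. Overflow of len + offset is not handled in this sketch.
 */
size_t locked_span(uintptr_t start, size_t len, size_t page_size)
{
	size_t offset;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	offset = (size_t)(start & (uintptr_t)(page_size - 1));
	return (len + offset + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical wrapper: returns 0 on success or a negative errno value. */
int lock_buffer(const void *addr, size_t len)
{
	if (mlock(addr, len) == 0)
		return 0;
	return -errno;
}

/* Maps a lock_buffer() result to the short summary used in the spec. */
const char *mlock_error_summary(int err)
{
	switch (err) {
	case 0:
		return "locked";
	case -ENOMEM:
		return "Address range issue";
	case -EPERM:
		return "Insufficient privileges";
	case -EINVAL:
		return "Address overflow";
	case -EAGAIN:
		return "Some or all memory could not be locked";
	case -EINTR:
		return "Interrupted by signal";
	case -EFAULT:
		return "Bad address";
	default:
		return "error not listed in the spec";
	}
}
```

A caller could, for example, pass sysconf(_SC_PAGESIZE) as page_size, compare
locked_span() against its RLIMIT_MEMLOCK budget, and only then call
lock_buffer(), reporting failures through mlock_error_summary().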
diff --git a/mm/mlock.c b/mm/mlock.c
index 3cb72b579ffd3..06e260da5aba6 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -658,6 +658,91 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
return 0;
}
+/**
+ * sys_mlock - Lock pages in memory
+ * @start: Starting address of memory range to lock
+ * @len: Length of memory range to lock in bytes
+ *
+ * long-desc: Locks pages in the specified address range into RAM, preventing
+ * them from being paged out to swap. The caller needs CAP_IPC_LOCK, or the
+ * total locked memory must stay within the RLIMIT_MEMLOCK resource limit.
+ * context-flags: KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE
+ * param-type: start, KAPI_TYPE_UINT
+ * param-flags: start, KAPI_PARAM_IN
+ * param-constraint-type: start, KAPI_CONSTRAINT_NONE
+ * param-constraint: start, Rounded down to page boundary
+ * param-type: len, KAPI_TYPE_UINT
+ * param-flags: len, KAPI_PARAM_IN
+ * param-constraint-type: len, KAPI_CONSTRAINT_RANGE
+ * param-range: len, 0, LONG_MAX
+ * param-constraint: len, Rounded up to page boundary
+ * return-type: KAPI_TYPE_INT
+ * return-check-type: KAPI_RETURN_ERROR_CHECK
+ * return-success: 0
+ * error-code: -ENOMEM, ENOMEM, Address range issue,
+ * Some of the range is unmapped or has gaps, the lock would exceed the mapped-region
+ * limit, or a caller without CAP_IPC_LOCK would exceed a nonzero RLIMIT_MEMLOCK.
+ * error-code: -EPERM, EPERM, Insufficient privileges,
+ * The caller is not privileged (no CAP_IPC_LOCK) and RLIMIT_MEMLOCK is 0.
+ * error-code: -EINVAL, EINVAL, Address overflow,
+ * The result of the addition start+len was less than start (arithmetic overflow).
+ * error-code: -EAGAIN, EAGAIN, Some or all memory could not be locked,
+ * Some or all of the specified address range could not be locked.
+ * error-code: -EINTR, EINTR, Interrupted by signal,
+ * The operation was interrupted by a fatal signal before completion.
+ * error-code: -EFAULT, EFAULT, Bad address,
+ * The specified address range contains invalid addresses that cannot be accessed.
+ * since-version: 2.0
+ * lock: mmap_lock, KAPI_LOCK_RWLOCK
+ * lock-acquired: true
+ * lock-released: true
+ * lock-desc: Process memory map write lock
+ * signal: FATAL
+ * signal-direction: KAPI_SIGNAL_RECEIVE
+ * signal-action: KAPI_SIGNAL_ACTION_RETURN
+ * signal-condition: Fatal signal pending
+ * signal-desc: Fatal signals (SIGKILL) can interrupt the operation at two points:
+ * when acquiring mmap_write_lock_killable() and during page population
+ * in __mm_populate(). Returns -EINTR. Non-fatal signals do NOT interrupt
+ * mlock - the operation continues even if SIGINT/SIGTERM are received.
+ * signal-error: -EINTR
+ * signal-timing: KAPI_SIGNAL_TIME_DURING
+ * signal-priority: 0
+ * signal-interruptible: yes
+ * signal-state-req: KAPI_SIGNAL_STATE_RUNNING
+ * examples: mlock(addr, 4096); // Lock one page
+ * mlock(addr, len); // Lock range of pages
+ * notes: Memory locks do not stack - multiple calls on the same range can be
+ * undone by a single munlock. Locks are not inherited by child processes.
+ * Pages are locked on whole page boundaries. Commonly used by real-time
+ * applications to prevent page faults during time-critical operations.
+ * Also used for security to prevent sensitive data (e.g., cryptographic keys)
+ * from being written to swap. Note: locked pages may still be saved to
+ * swap during system suspend/hibernate.
+ *
+ * Tagged addresses are handled automatically via untagged_addr(). The operation
+ * runs in two phases: first the VMAs are marked with VM_LOCKED, then the pages
+ * are populated into memory. If a request appears to exceed RLIMIT_MEMLOCK, the
+ * kernel recounts pages already locked in the range to avoid double-counting.
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_ALLOC_MEMORY, process memory, Locks pages into physical memory, preventing swapping, reversible=yes
+ * side-effect: KAPI_EFFECT_MODIFY_STATE, mm->locked_vm, Increases process locked memory counter, reversible=yes
+ * side-effect: KAPI_EFFECT_ALLOC_MEMORY, physical pages, May allocate and populate page table entries, condition=Pages not already present, reversible=yes
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_ALLOC_MEMORY, page faults, Triggers page faults to bring pages into memory, condition=Pages not already resident
+ * side-effect: KAPI_EFFECT_MODIFY_STATE, VMA splitting, May split existing VMAs at lock boundaries, condition=Lock range partially overlaps existing VMA
+ * state-trans: memory pages, swappable, locked in RAM, Pages become non-swappable and pinned in physical memory
+ * state-trans: VMA flags, unlocked, VM_LOCKED set, Virtual memory area marked as locked
+ * capability: CAP_IPC_LOCK, KAPI_CAP_BYPASS_CHECK, CAP_IPC_LOCK capability
+ * capability-allows: Lock unlimited amount of memory (no RLIMIT_MEMLOCK enforcement)
+ * capability-without: Must respect RLIMIT_MEMLOCK resource limit
+ * capability-condition: Checked when RLIMIT_MEMLOCK is 0 or locking would exceed limit
+ * capability-priority: 0
+ * constraint: RLIMIT_MEMLOCK Resource Limit, The RLIMIT_MEMLOCK soft resource limit specifies the maximum bytes of memory that may be locked into RAM. Unprivileged processes are restricted to this limit. CAP_IPC_LOCK capability allows bypassing this limit entirely. The limit is enforced per-process, not per-user.
+ * constraint-expr: RLIMIT_MEMLOCK Resource Limit, locked_memory + request_size <= RLIMIT_MEMLOCK || CAP_IPC_LOCK
+ * constraint: Memory Pressure and OOM, Locking large amounts of memory can cause system-wide memory pressure and potentially trigger the OOM killer. The kernel does not prevent locking memory that would destabilize the system.
+ * constraint: Special Memory Areas, Some memory types cannot be locked or are silently skipped: VM_IO/VM_PFNMAP areas (device mappings) are skipped; Hugetlb pages are inherently pinned and skipped; DAX mappings are always present in memory and skipped; Secret memory (memfd_secret) mappings are skipped; VM_DROPPABLE memory cannot be locked and is skipped; Gate VMA (kernel entry point) is skipped; VM_LOCKED areas are already locked. These special areas are silently excluded without error.
+ *
+ * Context: Process context. May sleep. Takes mmap_lock for write.
+ */
SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
{
return do_mlock(start, len, VM_LOCKED);
--
2.39.5