From: Hajime Tazaki <thehajime@gmail.com>
To: linux-um@lists.infradead.org
Cc: thehajime@gmail.com, ricarkol@google.com, Liam.Howlett@oracle.com
Subject: [PATCH v5 12/13] um: nommu: add documentation of nommu UML
Date: Thu, 12 Dec 2024 19:12:19 +0900
Message-ID: <ced1b154352ce289b47208b71bd536457af778ea.1733998168.git.thehajime@gmail.com>
In-Reply-To: <cover.1733998168.git.thehajime@gmail.com>
This commit adds initial documentation for the !MMU mode of UML.
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
Documentation/virt/uml/nommu-uml.rst | 177 +++++++++++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 178 insertions(+)
create mode 100644 Documentation/virt/uml/nommu-uml.rst
diff --git a/Documentation/virt/uml/nommu-uml.rst b/Documentation/virt/uml/nommu-uml.rst
new file mode 100644
index 000000000000..a98bfd9d2f38
--- /dev/null
+++ b/Documentation/virt/uml/nommu-uml.rst
@@ -0,0 +1,177 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+UML has been built with CONFIG_MMU since day 0. This patchset
+introduces a nommu mode for UML, taking a different approach from
+the one the Linux Kernel Library (LKL) tried.
+
+.. contents:: :local:
+
+What is it for?
+================
+
+- Alleviate the overhead of the ptrace(2)-based syscall hook
+- Exercise nommu code over UML (and over KUnit)
+- Reduce dependency on host facilities
+
+
+How does it work?
+=================
+
+To illustrate how this feature works, the following steps show how
+syscalls are handled in the nommu UML environment.
+
+- the kernel boots and installs a seccomp filter that traps ``syscall``
+ instructions issued from userspace memory, judged by the address of
+ the instruction pointer
+- (userspace starts)
+- userspace calls the ``vfork``/``execve`` syscalls
+- ``SIGSYS`` is raised, and its handler calls the syscall entry point ``__kernel_vsyscall``
+- the handler function in ``sys_call_table[]`` is called, following the
+ usual UML syscall path
+- control returns to userspace
+
+
+What are the differences from MMU-full UML?
+============================================
+
+The current nommu implementation adds three features that
+MMU-full UML does not have:
+
+- the kernel address space is directly accessible from userspace
+ - so, ``uaccess()`` always returns 1
+ - the generic implementations of memcpy/strcpy/futex are also used
+- alternate syscall entrypoint without ptrace
+- alternate syscall hook
+ - syscalls are hooked by a seccomp filter
+
+These modifications allow us to use unmodified userspace binaries
+with nommu UML.
+
+
+History
+=======
+
+This feature was originally introduced by Ricardo Koller at Open
+Source Summit NA 2020, and was later integrated with the syscall
+translation functionality, along with a cleanup of the original code.
+
+Building and running
+====================
+
+::
+
+ make ARCH=um x86_64_nommu_defconfig
+ make ARCH=um
+
+will build UML with ``CONFIG_MMU=n`` applied.
+
+KUnit tests can be run with the following command::
+
+ ./tools/testing/kunit/kunit.py run --kconfig_add CONFIG_MMU=n
+
+To run a typical Linux distribution, we need a nommu-aware userspace.
+We can use a stock version of Alpine Linux with nommu builds of
+busybox and musl-libc.
+
+
+Preparing root filesystem
+=========================
+
+nommu UML requires a standard library that is aware of nommu
+kernels. We have tested custom-built musl-libc and busybox,
+both of which have built-in support for nommu kernels.
+
+No Linux distribution is available for nommu on the x86_64
+architecture, so we need to prepare our own image for the root
+filesystem. We use Alpine Linux as the base distribution and replace
+busybox and musl-libc on top of it. The following steps
+prepare the filesystem for a quick start::
+
+ container_id=$(docker create ghcr.io/thehajime/alpine:3.20.3-um-nommu)
+ docker start $container_id
+ docker wait $container_id
+ docker export $container_id > alpine.tar
+ docker rm $container_id
+
+ mnt=$(mktemp -d)
+ dd if=/dev/zero of=alpine.ext4 bs=1 count=0 seek=1G
+ sudo chmod og+wr "alpine.ext4"
+ yes 2>/dev/null | mkfs.ext4 "alpine.ext4" || true
+ sudo mount "alpine.ext4" $mnt
+ sudo tar -xf alpine.tar -C $mnt
+ sudo umount $mnt
+
+This creates a file image, ``alpine.ext4``, which contains the nommu
+builds of busybox and musl on an Alpine Linux root filesystem. Pass the
+file to the ``ubd0=`` argument on the UML command line::
+
+ ./vmlinux ubd0=./alpine.ext4 rw mem=1024m loglevel=8 init=/sbin/init
+
+We plan to upstream apk packages for busybox and musl so that we can
+follow the proper procedure to set up the root filesystem.
+
+
+Quick start with docker
+=======================
+
+A docker image is available that can be started in a single step::
+
+ docker run -it -v /dev/shm:/dev/shm --rm ghcr.io/thehajime/alpine:3.20.3-um-nommu
+
+This will launch a UML instance with a pre-configured root filesystem.
+
+Benchmark
+=========
+
+The tables below show example performance measurements taken with
+lmbench and a (self-crafted) getpid benchmark (on the v6.12-rc2 uml/next
+tree).
+
+.. csv-table:: lmbench (usec)
+ :header: ,native,um,um-nommu(s)
+
+ select-10 ,0.5544,29.7143,2.8920
+ select-100 ,2.3992,27.7262,3.7794
+ select-1000 ,20.4708,42.0885,12.6920
+ syscall ,0.1734,26.2471,2.6070
+ read ,0.3433,29.8828,2.6923
+ write ,0.2866,25.9753,2.6925
+ stat ,1.9195,40.1164,3.1813
+ open/close ,3.8657,63.4730,6.2049
+ fork+sh ,1161.1111,5216.5000,462.3077
+ fork+execve ,536.5263,2117.0000,131.0633
+
+.. csv-table:: do_getpid bench (nsec)
+ :header: ,native,um,um-nommu(s)
+
+ getpid, 172 , 26807 , 2614
+
+Limitations
+===========
+
+generic nommu limitations
+-------------------------
+Since this port is a nommu kernel, the implementation inherits the
+characteristics of other nommu kernels (riscv, arm, etc.), described
+below.
+
+- vfork(2) should be used instead of fork(2)
+- ELF loader only loads PIE (position independent executable) binaries
+- all processes share a single address space
+- mmap(2) offers only a subset of its functionality (e.g., MAP_FIXED is
+ unsupported)
+
+Thus, the options for userspace programs are limited. We have tested
+Alpine Linux with musl-libc, which supports nommu kernels.
+
+supported architecture
+----------------------
+The current implementation of nommu UML only works with the x86_64 SUBARCH.
+We have not tested it in a 32-bit environment.
+
+
+Further reading about NOMMU UML
+================================
+
+- NOMMU UML (original code by Ricardo Koller)
+ - https://static.sched.com/hosted_files/ossna2020/ec/kollerr_linux_um_nommu.pdf
diff --git a/MAINTAINERS b/MAINTAINERS
index a097afd76ded..aaffff989580 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24186,6 +24186,7 @@ USER-MODE LINUX (UML)
M: Richard Weinberger <richard@nod.at>
M: Anton Ivanov <anton.ivanov@cambridgegreys.com>
M: Johannes Berg <johannes@sipsolutions.net>
+M: Hajime Tazaki <thehajime@gmail.com>
L: linux-um@lists.infradead.org
S: Maintained
W: http://user-mode-linux.sourceforge.net
--
2.43.0