* [PATCH] Documentation/binfmt-misc.rst: Specify aux vector for "O" flag description
From: Charlie Jenkins @ 2026-04-18 21:08 UTC (permalink / raw)
To: Jonathan Corbet, Shuah Khan, Kees Cook
Cc: linux-doc, linux-mm, linux-kernel, Charlie Jenkins
Instead of replacing the file path in the argument vector, the file
descriptor is passed as AT_EXECFD in the auxiliary vector. This appears
to have been the case at least since the git port. Update the
documentation to reflect this.
Signed-off-by: Charlie Jenkins <thecharlesjenkins@gmail.com>
---
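For illustration only, here is a minimal sketch of how an interpreter
process could look up the descriptor described above. It assumes a Linux
system where the process can read its own /proc/self/auxv and that
AT_EXECFD has the value 2, as in <linux/auxvec.h>. The helper names
(parse_auxv, exec_fd, own_exec_fd) are hypothetical; an interpreter
written in C would more typically call getauxval(AT_EXECFD).

```python
import struct

AT_NULL = 0    # terminator entry of the auxiliary vector
AT_EXECFD = 2  # key carrying the descriptor of the binary to execute


def parse_auxv(raw, word_size=struct.calcsize("P")):
    """Parse a raw auxiliary vector image into {key: value}.

    The vector is a sequence of (key, value) pairs of native-endian
    unsigned longs, terminated by an AT_NULL entry.
    """
    fmt = "=" + {4: "II", 8: "QQ"}[word_size]
    entry_size = struct.calcsize(fmt)
    usable = len(raw) - len(raw) % entry_size  # ignore a trailing partial entry
    entries = {}
    for key, value in struct.iter_unpack(fmt, raw[:usable]):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def exec_fd(auxv):
    """Return the AT_EXECFD descriptor, or None if the key is absent."""
    return auxv.get(AT_EXECFD)


def own_exec_fd(path="/proc/self/auxv"):
    """Look up AT_EXECFD for the current process (Linux only)."""
    with open(path, "rb") as f:
        return exec_fd(parse_auxv(f.read()))
```

When the "O" flag is not in effect, the key is simply absent and
own_exec_fd() returns None, so a caller can fall back to opening the
path it received in the argument vector.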
Documentation/admin-guide/binfmt-misc.rst | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/binfmt-misc.rst b/Documentation/admin-guide/binfmt-misc.rst
index 59cd902e3549..c0a34fbf8022 100644
--- a/Documentation/admin-guide/binfmt-misc.rst
+++ b/Documentation/admin-guide/binfmt-misc.rst
@@ -68,10 +68,10 @@ Here is what the fields mean:
Legacy behavior of binfmt_misc is to pass the full path
of the binary to the interpreter as an argument. When this flag is
included, binfmt_misc will open the file for reading and pass its
- descriptor as an argument, instead of the full path, thus allowing
- the interpreter to execute non-readable binaries. This feature
- should be used with care - the interpreter has to be trusted not to
- emit the contents of the non-readable binary.
+ descriptor into the auxiliary vector with the key "AT_EXECFD", thus
+ allowing the interpreter to execute non-readable binaries. This
+ feature should be used with care - the interpreter has to be trusted
+ not to emit the contents of the non-readable binary.
``C`` - credentials
Currently, the behavior of binfmt_misc is to calculate
the credentials and security token of the new process according to
---
base-commit: 028ef9c96e96197026887c0f092424679298aae8
change-id: ${change-id}
- Charlie
^ permalink raw reply related
* [RFC PATCH 2/2] Documentation: maple_tree: Clarify behavior when using reserved values
From: Wei-Lin Chang @ 2026-04-18 20:47 UTC (permalink / raw)
To: maple-tree, linux-mm, linux-doc, linux-kernel
Cc: Liam R . Howlett, Alice Ryhl, Andrew Ballance, Jonathan Corbet,
Shuah Khan, Wei-Lin Chang
In-Reply-To: <20260418204754.120405-1-weilin.chang@arm.com>
It doesn't matter whether the normal or the advanced API is used if the
user uses xa_{mk, to}_value when storing and retrieving the values. Just
specify that the normal API blocks usages of reserved values while the
advanced API does not.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
Documentation/core-api/maple_tree.rst | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index 15eda6742af8..54ea99c7bca7 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -30,9 +30,8 @@ Tree reserves values with the bottom two bits set to '10' which are below 4096
(ie 2, 6, 10 .. 4094) for internal use. If the entries may use reserved
entries under the condition that their top bits are never 1, then the user can
convert the entries using xa_mk_value() and convert them back by calling
-xa_to_value(). If the user needs to use a reserved value, then the user can
-convert the value when using the :ref:`maple-tree-advanced-api`, but are blocked
-by the normal API.
+xa_to_value(). Usage of reserved values is blocked by the normal API, and will
+cause undefined behavior if used with the :ref:`maple-tree-advanced-api`.
The Maple Tree can also be configured to support searching for a gap of a given
size (or larger).
--
2.43.0
^ permalink raw reply related
* [RFC PATCH 1/2] Documentation: maple_tree: Point out constraint when using xa_{mk, to}_value
From: Wei-Lin Chang @ 2026-04-18 20:47 UTC (permalink / raw)
To: maple-tree, linux-mm, linux-doc, linux-kernel
Cc: Liam R . Howlett, Alice Ryhl, Andrew Ballance, Jonathan Corbet,
Shuah Khan, Wei-Lin Chang
In-Reply-To: <20260418204754.120405-1-weilin.chang@arm.com>
Using xa_{mk, to}_value when storing values loses the top bit because of
the left shift. Point that out in the doc.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
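As a rough illustration of the constraint, the following sketch models
the two helpers with plain integers. It assumes a 64-bit unsigned long
and that xa_mk_value() computes (v << 1) | 1 while xa_to_value() shifts
right by one, which is my reading of include/linux/xarray.h. The
is_reserved() helper is hypothetical and only mirrors the rule stated in
the documentation (bottom two bits '10', below 4096).

```python
BITS_PER_LONG = 64  # assumption: 64-bit kernel
ULONG_MAX = (1 << BITS_PER_LONG) - 1
MAPLE_RESERVED_RANGE = 4096


def xa_mk_value(v):
    """Model of xa_mk_value(): shift left by one and set the low bit."""
    return ((v << 1) | 1) & ULONG_MAX  # the mask emulates unsigned long overflow


def xa_to_value(entry):
    """Model of xa_to_value(): drop the tag bit again."""
    return entry >> 1


def is_reserved(entry):
    """Reserved per the documentation: low bits '10' and below 4096."""
    return entry < MAPLE_RESERVED_RANGE and (entry & 3) == 2


# A reserved value becomes storable once converted, and round-trips.
assert is_reserved(6)
assert not is_reserved(xa_mk_value(6))
assert xa_to_value(xa_mk_value(6)) == 6

# A value with the top bit set does not survive the round trip.
top_bit = 1 << (BITS_PER_LONG - 1)
assert xa_to_value(xa_mk_value(top_bit | 6)) == 6
```

The converted entry always has its low bit set, so it can never match
the reserved pattern. The cost is exactly the one bit shifted out at the
top.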
Documentation/core-api/maple_tree.rst | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index ccdd1615cf97..15eda6742af8 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -28,10 +28,11 @@ virtual memory areas.
The Maple Tree can store values between ``0`` and ``ULONG_MAX``. The Maple
Tree reserves values with the bottom two bits set to '10' which are below 4096
(ie 2, 6, 10 .. 4094) for internal use. If the entries may use reserved
-entries then the users can convert the entries using xa_mk_value() and convert
-them back by calling xa_to_value(). If the user needs to use a reserved
-value, then the user can convert the value when using the
-:ref:`maple-tree-advanced-api`, but are blocked by the normal API.
+entries under the condition that their top bits are never 1, then the user can
+convert the entries using xa_mk_value() and convert them back by calling
+xa_to_value(). If the user needs to use a reserved value, then the user can
+convert the value when using the :ref:`maple-tree-advanced-api`, but are blocked
+by the normal API.
The Maple Tree can also be configured to support searching for a gap of a given
size (or larger).
--
2.43.0
^ permalink raw reply related
* [RFC PATCH 0/2] Documentation: maple_tree: Improve statements on reserved values
From: Wei-Lin Chang @ 2026-04-18 20:47 UTC (permalink / raw)
To: maple-tree, linux-mm, linux-doc, linux-kernel
Cc: Liam R . Howlett, Alice Ryhl, Andrew Ballance, Jonathan Corbet,
Shuah Khan, Wei-Lin Chang
Hi,
While using the maple tree and reading its documentation, I found a few
bits confusing, mainly about the reserved values. So here are some
changes hoping to make things clearer.
I am not familiar with the implementation, so I might be getting things
wrong, hence this being RFC.
While looking at the code I also found that although the doc claims the
normal API blocks reserved value stores, the code checks this using
xa_is_advanced(), which only blocks values up to 1026, not up to the max
maple tree reserved value 4094. For this part I am not sure whether the
code needs to be changed or we can also improve the doc.
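To make the mismatch concrete, here is a small sketch that enumerates
the values in question. The constants are assumptions taken from my
reading of the headers: XA_RETRY_ENTRY is xa_mk_internal(256), i.e.
(256 << 2) | 2 = 1026, and the maple tree reserved range ends at 4096.
The function bodies are simplified integer models, not the kernel code.

```python
XA_RETRY_ENTRY = (256 << 2) | 2  # 1026, assumed from xa_mk_internal(256)
MAPLE_RESERVED_RANGE = 4096


def xa_is_internal(entry):
    """Internal entries have the bottom two bits set to '10'."""
    return (entry & 3) == 2


def xa_is_advanced(entry):
    """Model of the check used by the normal API."""
    return xa_is_internal(entry) and entry <= XA_RETRY_ENTRY


def mt_is_reserved(entry):
    """Reserved according to the maple tree documentation."""
    return entry < MAPLE_RESERVED_RANGE and xa_is_internal(entry)


# Values the documentation calls reserved but the check lets through.
unblocked = [
    e for e in range(MAPLE_RESERVED_RANGE)
    if mt_is_reserved(e) and not xa_is_advanced(e)
]
```

Under these assumptions, unblocked runs from 1030 to 4094 in steps of
four, which is the range where the code and the documentation disagree.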
Any feedback is appreciated, thanks!
Wei-Lin Chang (2):
Documentation: maple_tree: Point out constraint when using xa_{mk,
to}_value
Documentation: maple_tree: Clarify behavior when using reserved values
Documentation/core-api/maple_tree.rst | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--
2.43.0
^ permalink raw reply
* Re: [PATCH v3] docs/zh_CN: add module-signing Chinese translation
From: kernel test robot @ 2026-04-18 20:45 UTC (permalink / raw)
To: Yan Zhu, seakeel, alexs, si.yanteng, corbet
Cc: oe-kbuild-all, dzm91, skhan, linux-doc, linux-kernel, zhuyan2015
In-Reply-To: <tencent_99B2EE128E02C6CC1120DE135D4A2DA5B309@qq.com>
Hi Yan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on lwn/docs-next]
[also build test WARNING on linus/master v7.0 next-20260417]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yan-Zhu/docs-zh_CN-add-module-signing-Chinese-translation/20260418-151621
base: git://git.lwn.net/linux.git docs-next
patch link: https://lore.kernel.org/r/tencent_99B2EE128E02C6CC1120DE135D4A2DA5B309%40qq.com
patch subject: [PATCH v3] docs/zh_CN: add module-signing Chinese translation
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
docutils: docutils (Docutils 0.21.2, Python 3.13.5, on linux)
reproduce: (https://download.01.org/0day-ci/archive/20260418/202604182216.Qpd5KifK-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604182216.Qpd5KifK-lkp@intel.com/
All warnings (new ones prefixed by >>):
Checksumming on output with GSO
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [docutils]
>> Documentation/translations/zh_CN/admin-guide/module-signing.rst:157: WARNING: Inline literal start-string without end-string. [docutils]
Documentation/userspace-api/landlock:480: ./security/landlock/errata/abi-4.h:5: ERROR: Unexpected section title.
vim +157 Documentation/translations/zh_CN/admin-guide/module-signing.rst
152
153 openssl req -new -nodes -utf8 -sha256 -days 36500 -batch -x509 \
154 -config x509.genkey -outform PEM -out kernel_key.pem \
155 -keyout kernel_key.pem
156
> 157 然后可以将生成的 kernel_key.pem 文件的完整路径名指定在
158 ``CONFIG_MODULE_SIG_KEY``选项中,并且将使用其中的证书和密钥而不是自动生成的
159 密钥对。
160
161
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply
* [PATCH v2 2/2] lib/crypto: docs: Add rst documentation to Documentation/crypto/
From: Eric Biggers @ 2026-04-18 19:21 UTC (permalink / raw)
To: linux-crypto
Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
linux-doc, Jonathan Corbet, Mauro Carvalho Chehab, Randy Dunlap,
Eric Biggers
In-Reply-To: <20260418192138.15556-1-ebiggers@kernel.org>
Add a documentation file Documentation/crypto/libcrypto.rst which
provides a high-level overview of lib/crypto/.
Also add several sub-pages which include the kernel-doc for the
algorithms that have it. This makes the existing, quite extensive
kernel-doc start being included in the HTML and PDF documentation.
Note that the intent is very much *not* that everyone has to read these
Documentation/ files. The library is intended to be straightforward and
use familiar conventions; generally it should be possible to dive right
into the kernel-doc. You shouldn't need to read a lot of documentation
to just call `sha256()`, for example, or to run the unit tests if you're
already familiar with KUnit. (This differs from the traditional crypto
API which has a larger barrier to entry.)
Nevertheless, this seems worth adding. Hopefully it is useful and makes
LWN no longer consider the library to be "meticulously undocumented".
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
Documentation/crypto/index.rst | 2 +-
.../crypto/libcrypto-blockcipher.rst | 19 ++
Documentation/crypto/libcrypto-hash.rst | 86 +++++++++
Documentation/crypto/libcrypto-signature.rst | 11 ++
Documentation/crypto/libcrypto-utils.rst | 6 +
Documentation/crypto/libcrypto.rst | 165 ++++++++++++++++++
Documentation/crypto/sha3.rst | 2 +
7 files changed, 290 insertions(+), 1 deletion(-)
create mode 100644 Documentation/crypto/libcrypto-blockcipher.rst
create mode 100644 Documentation/crypto/libcrypto-hash.rst
create mode 100644 Documentation/crypto/libcrypto-signature.rst
create mode 100644 Documentation/crypto/libcrypto-utils.rst
create mode 100644 Documentation/crypto/libcrypto.rst
diff --git a/Documentation/crypto/index.rst b/Documentation/crypto/index.rst
index 4ee667c446f99..705f186d662ba 100644
--- a/Documentation/crypto/index.rst
+++ b/Documentation/crypto/index.rst
@@ -11,10 +11,11 @@ for cryptographic use cases, as well as programming examples.
.. toctree::
:caption: Table of contents
:maxdepth: 2
+ libcrypto
intro
api-intro
architecture
async-tx-api
@@ -25,6 +26,5 @@ for cryptographic use cases, as well as programming examples.
api
api-samples
descore-readme
device_drivers/index
krb5
- sha3
diff --git a/Documentation/crypto/libcrypto-blockcipher.rst b/Documentation/crypto/libcrypto-blockcipher.rst
new file mode 100644
index 0000000000000..dd5ce2f8b5151
--- /dev/null
+++ b/Documentation/crypto/libcrypto-blockcipher.rst
@@ -0,0 +1,19 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Block ciphers
+=============
+
+AES
+---
+
+Support for the AES block cipher.
+
+.. kernel-doc:: include/crypto/aes.h
+
+DES
+---
+
+Support for the DES block cipher. This algorithm is obsolete and is supported
+only for backwards compatibility.
+
+.. kernel-doc:: include/crypto/des.h
diff --git a/Documentation/crypto/libcrypto-hash.rst b/Documentation/crypto/libcrypto-hash.rst
new file mode 100644
index 0000000000000..4248e6fdc9527
--- /dev/null
+++ b/Documentation/crypto/libcrypto-hash.rst
@@ -0,0 +1,86 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Hash functions, MACs, and XOFs
+==============================
+
+AES-CMAC and AES-XCBC-MAC
+-------------------------
+
+Support for the AES-CMAC and AES-XCBC-MAC message authentication codes.
+
+.. kernel-doc:: include/crypto/aes-cbc-macs.h
+
+BLAKE2b
+-------
+
+Support for the BLAKE2b cryptographic hash function.
+
+.. kernel-doc:: include/crypto/blake2b.h
+
+BLAKE2s
+-------
+
+Support for the BLAKE2s cryptographic hash function.
+
+.. kernel-doc:: include/crypto/blake2s.h
+
+GHASH and POLYVAL
+-----------------
+
+Support for the GHASH and POLYVAL universal hash functions. These algorithms
+are used only as internal components of other algorithms.
+
+.. kernel-doc:: include/crypto/gf128hash.h
+
+MD5
+---
+
+Support for the MD5 cryptographic hash function and HMAC-MD5. This algorithm is
+obsolete and is supported only for backwards compatibility.
+
+.. kernel-doc:: include/crypto/md5.h
+
+NH
+--
+
+Support for the NH universal hash function. This algorithm is used only as an
+internal component of other algorithms.
+
+.. kernel-doc:: include/crypto/nh.h
+
+Poly1305
+--------
+
+Support for the Poly1305 universal hash function. This algorithm is used only
+as an internal component of other algorithms.
+
+.. kernel-doc:: include/crypto/poly1305.h
+
+SHA-1
+-----
+
+Support for the SHA-1 cryptographic hash function and HMAC-SHA1. This algorithm
+is obsolete and is supported only for backwards compatibility.
+
+.. kernel-doc:: include/crypto/sha1.h
+
+SHA-2
+-----
+
+Support for the SHA-2 family of cryptographic hash functions, including SHA-224,
+SHA-256, SHA-384, and SHA-512. This also includes their corresponding HMACs:
+HMAC-SHA224, HMAC-SHA256, HMAC-SHA384, and HMAC-SHA512.
+
+.. kernel-doc:: include/crypto/sha2.h
+
+SHA-3
+-----
+
+The SHA-3 functions are documented in :ref:`sha3`.
+
+SM3
+---
+
+Support for the SM3 cryptographic hash function.
+
+.. kernel-doc:: include/crypto/sm3.h
diff --git a/Documentation/crypto/libcrypto-signature.rst b/Documentation/crypto/libcrypto-signature.rst
new file mode 100644
index 0000000000000..e80d59fa51b6a
--- /dev/null
+++ b/Documentation/crypto/libcrypto-signature.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Digital signature algorithms
+============================
+
+ML-DSA
+------
+
+Support for the ML-DSA digital signature algorithm.
+
+.. kernel-doc:: include/crypto/mldsa.h
diff --git a/Documentation/crypto/libcrypto-utils.rst b/Documentation/crypto/libcrypto-utils.rst
new file mode 100644
index 0000000000000..9d833f47ed390
--- /dev/null
+++ b/Documentation/crypto/libcrypto-utils.rst
@@ -0,0 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Utility functions
+=================
+
+.. kernel-doc:: include/crypto/utils.h
diff --git a/Documentation/crypto/libcrypto.rst b/Documentation/crypto/libcrypto.rst
new file mode 100644
index 0000000000000..a1557d45b0e5a
--- /dev/null
+++ b/Documentation/crypto/libcrypto.rst
@@ -0,0 +1,165 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+==============
+Crypto library
+==============
+
+``lib/crypto/`` provides faster and easier access to cryptographic algorithms
+than the traditional crypto API.
+
+Each cryptographic algorithm is supported via a set of dedicated functions.
+"Crypto agility", where needed, is left to calling code.
+
+The crypto library functions are intended to be boring and straightforward, and
+to follow familiar conventions. Their primary documentation is their (fairly
+extensive) kernel-doc. This page just provides some extra high-level context.
+
+Note that the crypto library isn't entirely new. ``lib/`` has contained some
+crypto functions since 2005. Rather, it's just an approach that's been expanded
+over time as it's been found to work well. It also largely just matches how the
+kernel already does things elsewhere.
+
+Scope and intended audience
+===========================
+
+The crypto library documentation is primarily meant for kernel developers who
+need to use a particular cryptographic algorithm(s) in kernel code. For
+example, "I just need to compute a SHA-256 hash." A secondary audience is
+developers working on the crypto algorithm implementations themselves.
+
+If you're looking for more general information about cryptography, like the
+differences between the different crypto algorithms or how to select an
+appropriate algorithm, you should refer to external sources which cover that
+type of information much more comprehensively. If you need help selecting
+algorithms for a new kernel feature that doesn't already have its algorithms
+predefined, please reach out to ``linux-crypto@vger.kernel.org`` for advice.
+
+Code organization
+=================
+
+- ``lib/crypto/*.c``: the crypto algorithm implementations
+
+- ``lib/crypto/$(SRCARCH)/``: architecture-specific code for crypto algorithms.
+ It is here rather than somewhere in ``arch/`` partly because this allows
+ generic and architecture-optimized code to be easily built into a single
+ loadable module (when the algorithm is set to 'm' in the kconfig).
+
+- ``lib/crypto/tests/``: KUnit tests for the crypto algorithms
+
+- ``include/crypto/``: crypto headers, for both the crypto library and the
+ traditional crypto API
+
+Generally, there is one kernel module per algorithm. Sometimes related
+algorithms are grouped into one module. There is intentionally no common
+framework, though there are some utility functions that multiple algorithms use.
+
+Each algorithm module is controlled by a tristate kconfig symbol
+``CRYPTO_LIB_$(ALGORITHM)``. As is the norm for library functions in the
+kernel, these are hidden symbols which don't show up in the kconfig menu.
+Instead, they are just selected by all the kconfig symbols that need them.
+
+Many of the algorithms have multiple implementations: a generic implementation
+and architecture-optimized implementation(s). Each module initialization
+function, or initcall in the built-in case, automatically enables the best
+implementation based on the available CPU features.
+
+Note that the crypto library doesn't use the ``crypto/``,
+``arch/$(SRCARCH)/crypto/``, or ``drivers/crypto/`` directories. These
+directories are used by the traditional crypto API. When possible, algorithms
+in the traditional crypto API are implemented by calls into the library.
+
+Advantages
+==========
+
+Some of the advantages of the library over the traditional crypto API are:
+
+- The library functions tend to be much easier to use. For example, a hash
+ value can be computed using only a single function call. Most of the library
+ functions always succeed and return void, eliminating the need to write
+ error-handling code. Most also accept standard virtual addresses, rather than
+ scatterlists which are difficult and less efficient to work with.
+
+- The library functions are usually faster, especially for short inputs. They
+ call the crypto algorithms directly without inefficient indirect calls, memory
+ allocations, string parsing, lookups in an algorithm registry, and other
+ unnecessary API overhead. Architecture-optimized code is enabled by default.
+
+- The library functions use standard link-time dependencies instead of
+ error-prone dynamic loading by name. There's no need for workarounds such as
+ forcing algorithms to be built-in or adding module soft dependencies.
+
+- The library focuses on the approach that works the best on the vast majority
+ of systems: CPU-based implementations of the crypto algorithms, utilizing
+ on-CPU acceleration (such as AES instructions) when available.
+
+- The library uses standard KUnit tests, rather than custom ad-hoc tests.
+
+- The library tends to have higher assurance implementations of the crypto
+ algorithms. This is both due to its simpler design and because more of its
+ code is being regularly tested.
+
+- The library supports features that don't fit into the rigid framework of the
+ traditional crypto API, for example interleaved hashing and XOFs.
+
+When to use it
+==============
+
+In-kernel users should use the library (rather than the traditional crypto API)
+whenever possible. Many subsystems have already been converted. It usually
+simplifies their code significantly and improves performance.
+
+Some kernel features allow userspace to provide an arbitrary string that selects
+an arbitrary algorithm from the traditional crypto API by name. These features
+generally will have to keep using the traditional crypto API for backwards
+compatibility.
+
+Note: new kernel features shouldn't support every algorithm, but rather make a
+deliberate choice about what algorithm(s) to support. History has shown that
+making a deliberate, thoughtful choice greatly simplifies code maintenance,
+reduces the chance for mistakes (such as using an obsolete, insecure, or
+inappropriate algorithm), and makes your feature easier to use.
+
+Testing
+=======
+
+The crypto library uses standard KUnit tests. Like many of the kernel's other
+KUnit tests, they are included in the set of tests that is run by
+``tools/testing/kunit/kunit.py run --alltests``.
+
+A ``.kunitconfig`` file is also provided to run just the crypto library tests.
+For example, here's how to run them in user-mode Linux:
+
+.. code-block:: sh
+
+ tools/testing/kunit/kunit.py run --kunitconfig=lib/crypto/
+
+Many of the crypto algorithms have architecture-optimized implementations.
+Testing those requires building an appropriate kernel and running the tests
+either in QEMU or on appropriate hardware. Here's one example with QEMU:
+
+.. code-block:: sh
+
+ tools/testing/kunit/kunit.py run --kunitconfig=lib/crypto/ --arch=arm64 --make_options LLVM=1
+
+Depending on the code being tested, flags may need to be passed to QEMU to
+emulate the correct type of hardware for the code to be reached.
+
+Since correctness is essential in cryptographic code, new architecture-optimized
+code is accepted only if it can be tested in QEMU.
+
+Note: the crypto library also includes FIPS 140 self-tests. These are
+lightweight, are designed specifically to meet FIPS 140 requirements, and exist
+*only* to meet those requirements. Normal testing done by kernel developers and
+integrators should use the much more comprehensive KUnit tests instead.
+
+API documentation
+=================
+
+.. toctree::
+ :maxdepth: 2
+
+ libcrypto-blockcipher
+ libcrypto-hash
+ libcrypto-signature
+ libcrypto-utils
+ sha3
diff --git a/Documentation/crypto/sha3.rst b/Documentation/crypto/sha3.rst
index 37640f295118b..250669c98f6ba 100644
--- a/Documentation/crypto/sha3.rst
+++ b/Documentation/crypto/sha3.rst
@@ -1,7 +1,9 @@
.. SPDX-License-Identifier: GPL-2.0-or-later
+.. _sha3:
+
==========================
SHA-3 Algorithm Collection
==========================
.. contents::
--
2.53.0
^ permalink raw reply related
* [PATCH v2 1/2] docs: kdoc: Expand 'at_least' when creating parameter list
From: Eric Biggers @ 2026-04-18 19:21 UTC (permalink / raw)
To: linux-crypto
Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
linux-doc, Jonathan Corbet, Mauro Carvalho Chehab, Randy Dunlap,
Eric Biggers
In-Reply-To: <20260418192138.15556-1-ebiggers@kernel.org>
sphinx doesn't know that the kernel headers do:
#define at_least static
Do this replacement before declarations are passed to it.
This prevents errors like the following from appearing once the
lib/crypto/ kernel-doc is wired up to the sphinx build:
linux/Documentation/crypto/libcrypto:128: ./include/crypto/sha2.h:773: WARNING: Error in declarator or parameters
Error in declarator or parameters
Invalid C declaration: Expected ']' in end of array operator. [error at 59]
void sha512_final (struct sha512_ctx *ctx, u8 out[at_least SHA512_DIGEST_SIZE])
Acked-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
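For reviewers who want to see the effect in isolation, here is a
minimal standalone sketch of the same string replacement applied to the
parameter from the warning above. The function name expand_at_least is
hypothetical; in the patch the replacement is done inline on each
argument.

```python
def expand_at_least(arg):
    """Rewrite '[at_least ' as '[static ' in one parameter declaration."""
    return arg.replace('[at_least ', '[static ')


# The parameter that triggered the sphinx warning for sha512_final().
param = "u8 out[at_least SHA512_DIGEST_SIZE]"
expanded = expand_at_least(param)
```

Because the match includes the opening bracket and the trailing space,
an identifier that merely contains "at_least" is left untouched. A
declaration written with a space after the bracket would not be matched
by this plain-string approach.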
tools/lib/python/kdoc/kdoc_parser.py | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/lib/python/kdoc/kdoc_parser.py b/tools/lib/python/kdoc/kdoc_parser.py
index 74af7ae47aa47..c3f966da533e0 100644
--- a/tools/lib/python/kdoc/kdoc_parser.py
+++ b/tools/lib/python/kdoc/kdoc_parser.py
@@ -437,10 +437,15 @@ class KernelDoc:
for arg in args.split(splitter):
# Ignore argument attributes
arg = KernRe(r'\sPOS0?\s').sub(' ', arg)
+ # Replace '[at_least ' with '[static '. This allows sphinx to parse
+ # array parameter declarations like 'char A[at_least 4]', where
+ # 'at_least' is #defined to 'static' by the kernel headers.
+ arg = arg.replace('[at_least ', '[static ')
+
# Strip leading/trailing spaces
arg = arg.strip()
arg = KernRe(r'\s+').sub(' ', arg, count=1)
if arg.startswith('#'):
--
2.53.0
^ permalink raw reply related
* [PATCH v2 0/2] Improve the crypto library documentation
From: Eric Biggers @ 2026-04-18 19:21 UTC (permalink / raw)
To: linux-crypto
Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
linux-doc, Jonathan Corbet, Mauro Carvalho Chehab, Randy Dunlap,
Eric Biggers
While the crypto library already has a lot of kernel-doc, it's not being
included in the HTML or PDF documentation (except for the SHA-3
kernel-doc which is already included). Update Documentation/crypto/ to
include it, and also add a high-level overview of the library.
I'd like to take this series via the libcrypto tree for 7.1.
Changed in v2:
- Use simple string replacement instead of regex in kdoc_parser.py
- Minor editorial revisions
Eric Biggers (2):
docs: kdoc: Expand 'at_least' when creating parameter list
lib/crypto: docs: Add rst documentation to Documentation/crypto/
Documentation/crypto/index.rst | 2 +-
.../crypto/libcrypto-blockcipher.rst | 19 ++
Documentation/crypto/libcrypto-hash.rst | 86 +++++++++
Documentation/crypto/libcrypto-signature.rst | 11 ++
Documentation/crypto/libcrypto-utils.rst | 6 +
Documentation/crypto/libcrypto.rst | 165 ++++++++++++++++++
Documentation/crypto/sha3.rst | 2 +
tools/lib/python/kdoc/kdoc_parser.py | 5 +
8 files changed, 295 insertions(+), 1 deletion(-)
create mode 100644 Documentation/crypto/libcrypto-blockcipher.rst
create mode 100644 Documentation/crypto/libcrypto-hash.rst
create mode 100644 Documentation/crypto/libcrypto-signature.rst
create mode 100644 Documentation/crypto/libcrypto-utils.rst
create mode 100644 Documentation/crypto/libcrypto.rst
base-commit: 8541d8f725c673db3bd741947f27974358b2e163
--
2.53.0
^ permalink raw reply
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Matthew Wilcox @ 2026-04-18 16:15 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Nick Huang, Vlastimil Babka, Harry Yoo, Andrew Morton,
David Hildenbrand, Jonathan Corbet, Hao Li, Christoph Lameter,
David Rientjes, Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <aeNGbNyPxJssnkbO@lucifer>
On Sat, Apr 18, 2026 at 10:07:22AM +0100, Lorenzo Stoakes wrote:
> On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
> > - Add "Overview" section explaining the slab allocator's role and purpose
> > - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
>
> The fact you're insanely wrong about the current state of slab only makes this
> worse.
This is actually a new low. We've always had to contend with people
putting up outdated or just wrong information on web pages, and there's
little we can do about it. Witness all the outdated information about
THP that's based on code that's been deleted for over a decade.
But now we've got AI trained on all this wrong/out-of-date information,
and, er, "enthusiasts" who are trying to change the correct information
in the kernel to match what the deluded AI "thinks" should be true.
Let that sink in.
^ permalink raw reply
* [sailus-media-tree:metadata 50/114] htmldocs: Documentation/userspace-api/media/v4l/subdev-config-model.rst:6: WARNING: duplicate label media_subdev_config_model, other instance in Documentation/userspace-api/media/v4l/dev-subdev.rst
From: kernel test robot @ 2026-04-18 16:12 UTC (permalink / raw)
To: Sakari Ailus
Cc: oe-kbuild-all, linux-media, Tomi Valkeinen, Lad Prabhakar,
Mirela Rabulea, Jacopo Mondi, linux-doc
tree: git://linuxtv.org/sailus/media_tree.git metadata
head: 7bb838184fdc0d716c2a06795fc6ef1f1fb8350b
commit: 8e5bf1f62e3753a0172060ede618b246677f041e [50/114] media: Documentation: Add subdev configuration models, raw sensor model
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
docutils: docutils (Docutils 0.21.2, Python 3.13.5, on linux)
reproduce: (https://download.01.org/0day-ci/archive/20260418/202604181837.JkmT17KM-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604181837.JkmT17KM-lkp@intel.com/
All warnings (new ones prefixed by >>):
Documentation/userspace-api/landlock:526: ./include/uapi/linux/landlock.h:45: ERROR: Unknown target name: "network flags". [docutils]
Documentation/userspace-api/landlock:526: ./include/uapi/linux/landlock.h:50: ERROR: Unknown target name: "scope flags". [docutils]
Documentation/userspace-api/landlock:526: ./include/uapi/linux/landlock.h:24: ERROR: Unknown target name: "filesystem flags". [docutils]
Documentation/userspace-api/landlock:535: ./include/uapi/linux/landlock.h:166: ERROR: Unknown target name: "filesystem flags". [docutils]
Documentation/userspace-api/landlock:535: ./include/uapi/linux/landlock.h:189: ERROR: Unknown target name: "network flags". [docutils]
>> Documentation/userspace-api/media/v4l/subdev-config-model.rst:6: WARNING: duplicate label media_subdev_config_model, other instance in Documentation/userspace-api/media/v4l/dev-subdev.rst
>> Documentation/userspace-api/media/v4l/subdev-config-model.rst:35: WARNING: duplicate label media_subdev_config_model_common_raw_sensor, other instance in Documentation/userspace-api/media/v4l/dev-subdev.rst
>> Documentation/userspace-api/media/v4l/subdev-config-model.rst:: WARNING: duplicate label media_subdev_config_model_common_raw_sensor_subdev, other instance in Documentation/userspace-api/media/v4l/dev-subdev.rst
Documentation/networking/skbuff:36: ./include/linux/skbuff.h:181: WARNING: Failed to create a cross reference. A title or caption not found: 'crc' [ref.ref]
Documentation/userspace-api/media/drivers/camera-sensor.rst:147: WARNING: undefined label: 'media-metadata-layout-ccs' [ref.ref]
vim +6 Documentation/userspace-api/media/v4l/subdev-config-model.rst
4
5 Sub-device configuration models
> 6 ===============================
7
8 The V4L2 specification defines a subdev API that exposes three type of
9 configuration elements: formats, selection rectangles and controls. The
10 specification contains generic information about how those configuration
11 elements behave, but not precisely how they apply to particular hardware
12 features. We leave some leeway to drivers to decide how to map selection
13 rectangles to device features, as long as they comply with the V4L2
14 specification. This is needed as hardware features differ between devices, so
15 it's the driver's responsibility to handle this mapping.
16
17 Unfortunately, this lack of clearly defined mapping in the specification has led
18 to different drivers mapping the same hardware features to different API
19 elements, or implementing the API elements with slightly different
20 behaviours. Furthermore, many drivers have implemented selection rectangles in
21 ways that do not comply with the V4L2 specification. All of this makes userspace
22 development difficult.
23
24 Sub-device configuration models specify in detail what the user space can expect
25 from a sub-device in terms of V4L2 sub-device interface support, semantics
26 included.
27
28 A sub-device may implement more than one configuration model at the same
29 time. The implemented configuration models can be obtained from the sub-device's
30 ``V4L2_CID_CONFIG_MODEL`` control.
31
32 .. _media_subdev_config_model_common_raw_sensor:
33
34 Common raw camera sensor model
> 35 ------------------------------
36
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Harry Yoo (Oracle) @ 2026-04-18 15:30 UTC (permalink / raw)
To: Vlastimil Babka (SUSE)
Cc: Nick Huang, Lorenzo Stoakes, Matthew Wilcox, Andrew Morton,
David Hildenbrand, Jonathan Corbet, Hao Li, Christoph Lameter,
David Rientjes, Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <75b2b254-2e02-43ca-9109-9812cf3f85b2@kernel.org>
On Sat, Apr 18, 2026 at 01:20:42PM +0200, Vlastimil Babka (SUSE) wrote:
> On 4/18/26 1:00 PM, Nick Huang wrote:
> > Hi Lorenzo Stoakes
> > Lorenzo Stoakes <ljs@kernel.org> 於 2026年4月18日週六 下午5:11寫道:
> >>
> >> On Sat, Apr 18, 2026 at 02:12:22PM +0800, Nick Huang wrote:
> >>> Nick Huang <sef1548@gmail.com> 於 2026年4月18日週六 下午1:27寫道:
> >>>>
> >>>> Matthew Wilcox <willy@infradead.org> 於 2026年4月18日週六 下午1:04寫道:
> >>>>>
> >>>>> On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
> >>>>>> - Add "Overview" section explaining the slab allocator's role and purpose
> >>>>>> - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
> >>> Hi Matthew Wilcox
> >>> I will remove this sentence in the next version:
> >>> “Document the three main slab allocator implementations (SLAB, SLUB, SLOB).”
> >>> I’m not entirely sure I fully understand your point. If I’ve missed
> >>> anything, please let me know what needs to be changed. Thank you.
> >>
> >> No, please don't send any more revisions of this garbage, thanks.
> >
> > thank you for your guidance. I will correct my work and introduce the
> > more recent `barn`, `sheave`, and `kmalloc_obj`.
> > Do you think this is appropriate?
>
> No, this whole thing is inappropriate from the beginning. We are not
> going to waste more time on development-by-review for something that
> started as undisclosed LLM slop.
Just to be clear, Nick.
I'm no longer willing to answer all of your questions and requests for
guidance sent in private, as you keep ignoring questions & feedback
and keep looking for other "interesting" items to contribute.
Let's not waste your and others' time.
I don't think this is going to work out.
Expect NACK on your future slab contributions.
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply
* Re: [PATCH] net: ipv4: igmp: add sysctl option to ignore inbound llm_reports
From: Ido Schimmel @ 2026-04-18 15:29 UTC (permalink / raw)
To: Steffen Trumtrar
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Jonathan Corbet, Shuah Khan, David Ahern, netdev,
linux-doc, linux-kernel
In-Reply-To: <20260415-v7-0-topic-igmp-llm-drop-v1-1-1367bfbb898e@pengutronix.de>
On Wed, Apr 15, 2026 at 12:26:13PM +0200, Steffen Trumtrar wrote:
> Add a new sysctl option 'igmp_link_local_mcast_reports_drop' that allows
> dropping inbound IGMP reports for link-local multicast groups in the
> 224.0.0.X range. This can be used to prevent the local system from
> processing IGMP reports for link local multicast groups and therefore
> let the kernel still send the own outbound IGMP reports.
OK, but what is the motivation to keep sending IGMP reports for
link-local multicast groups when the host already received such reports
from other hosts on the network? Why are link-local groups special in
this case?
AFAICT, igmp_heard_report() implements report suppression according to
RFC 2236 and it doesn't mention special behavior for link-local groups:
"If the host receives another host's Report (version 1 or 2) while it
has a timer running, it stops its timer for the specified group and does
not send a Report, in order to suppress duplicate Reports."
Also, I'm not convinced we need a new sysctl (that we will need to keep
forever) for this. It should be possible to drop such packets using tc
(tc-u32 / tc-bpf) or netfilter.
[...]
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index 6921d8594b849..2da4cd6ac7202 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -2306,6 +2306,18 @@ igmp_link_local_mcast_reports - BOOLEAN
>
> Default TRUE
>
> +igmp_link_local_mcast_reports_drop - BOOLEAN
> + Drop inbound IGMP reports for link local multicast groups in
> + the 224.0.0.X range. When enabled, IGMP membership reports for
> + link local multicast addresses are silently dropped without
> + processing.
> + When the kernel gets inbound IGMP reports it stops sending own
> + IGMP reports. With allowing to drop and process the inbound reports,
> + the kernel will not stop sending the own reports, even when IGMP
> + reports from other hosts are seen on the network.
> +
> + Default FALSE
[...]
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index a674fb44ec25b..3a4932e4108bd 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -931,6 +931,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
> if (ipv4_is_local_multicast(group) &&
> !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
> return false;
> + if (READ_ONCE(net->ipv4.sysctl_igmp_llm_reports_drop))
> + return true;
>
> rcu_read_lock();
> for_each_pmc_rcu(in_dev, im) {
The documentation says that this sysctl is specifically about link-local
groups, but it drops reports from all groups...
^ permalink raw reply
* [PATCH v4] docs/zh_CN: add module-signing Chinese translation
From: Yan Zhu @ 2026-04-18 15:00 UTC (permalink / raw)
To: seakeel, alexs, si.yanteng, corbet, dzm91
Cc: skhan, linux-doc, linux-kernel, zhuyan2015
Translate .../admin-guide/module-signing.rst into Chinese.
Update the translation through commit 0ad9a71933e7
("modsign: Enable ML-DSA module signing")
Signed-off-by: Yan Zhu <zhuyan2015@qq.com>
---
v1->v2:
Fix lines exceeding 80 characters and alignment issues.
v2->v3:
Fix lines 87 and 94, which lacked tab indentation.
v3->v4:
Add patch change description.
---
.../zh_CN/admin-guide/module-signing.rst | 249 ++++++++++++++++++
1 file changed, 249 insertions(+)
create mode 100644 Documentation/translations/zh_CN/admin-guide/module-signing.rst
diff --git a/Documentation/translations/zh_CN/admin-guide/module-signing.rst b/Documentation/translations/zh_CN/admin-guide/module-signing.rst
new file mode 100644
index 000000000000..04b0f1cbafd5
--- /dev/null
+++ b/Documentation/translations/zh_CN/admin-guide/module-signing.rst
@@ -0,0 +1,249 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/admin-guide/module-signing.rst
+:翻译:
+ 朱岩 Yan Zhu <zhuyan2015@qq.com>
+
+
+==========================
+内核模块签名机制
+==========================
+
+.. 目录
+..
+.. - 概述
+.. - 配置模块签名
+.. - 生成签名密钥
+.. - 内核中的公钥
+.. - 模块手动签名
+.. - 已签名模块和剥离
+.. - 加载已签名模块
+.. - 无效签名和未签名模块
+.. - 管理/保护私钥
+
+
+概述
+====
+
+内核模块签名机制在安装过程中对模块进行加密签名,然后在加载模块时检查签名。这
+通过禁止加载未签名的模块或使用无效密钥签名的模块来提高内核安全性。模块签名通
+过使恶意模块更难加载到内核中来增加安全性。模块签名检查在内核中完成,因此不需
+要受信任的用户空间位。
+
+此机制使用 X.509 ITU-T 标准证书对涉及的公钥进行编码。签名本身不以任何工业标准
+类型编码。内置机制目前仅支持 RSA、NIST P-384 ECDSA 和 NIST FIPS-204 ML-DSA
+公钥签名标准(尽管它是可插拔的并允许使用其他标准)。对于 RSA 和 ECDSA,可以使
+用的可能的哈希算法是大小为 256、384 和 512 的 SHA-2 和 SHA-3(算法由签名中的
+数据选择);ML-DSA会自行进行哈希运算,但允许与SHA512哈希算法结合用于签名属性。
+
+配置模块签名
+============
+
+通过进入内核配置的 :menuselection:`Enable Loadable Module Support` 菜单并打
+开以下选项来启用模块签名机制::
+
+ CONFIG_MODULE_SIG "Module signature verification"
+
+这有多个可用选项:
+
+ (1) :menuselection:`Require modules to be validly signed`
+ (``CONFIG_MODULE_SIG_FORCE``)
+
+ 这指定了内核应如何处理其密钥未知或未签名的模块。
+
+ 如果关闭(即"宽松模式"),则允许使用不可用密钥和未签名的模块,但内核将被
+ 标记为受污染,并且相关模块将被标记为受污染,显示字符'E'。
+
+ 如果打开(即"限制模式"),只有具有有效签名且可由内核拥有的公钥验证的模块
+ 才会被加载。所有其他模块将生成错误。
+
+ 无论此处的设置如何,如果模块的签名块无法解析,它将被直接拒绝。
+
+
+ (2) :menuselection:`Automatically sign all modules`
+ (``CONFIG_MODULE_SIG_ALL``)
+
+ 如果打开此选项,则在构建的 modules_install 阶段期间将自动签名模块。
+ 如果关闭,则必须使用以下命令手动签名模块::
+
+ scripts/sign-file
+
+
+ (3) :menuselection:`Which hash algorithm should modules be signed with?`
+
+ 这提供了安装阶段将用于签名模块的哈希算法选择:
+
+ =============================== ==========================================
+ ``CONFIG_MODULE_SIG_SHA256`` :menuselection:`Sign modules with SHA-256`
+ ``CONFIG_MODULE_SIG_SHA384`` :menuselection:`Sign modules with SHA-384`
+ ``CONFIG_MODULE_SIG_SHA512`` :menuselection:`Sign modules with SHA-512`
+ ``CONFIG_MODULE_SIG_SHA3_256`` :menuselection:`Sign modules with SHA3-256`
+ ``CONFIG_MODULE_SIG_SHA3_384`` :menuselection:`Sign modules with SHA3-384`
+ ``CONFIG_MODULE_SIG_SHA3_512`` :menuselection:`Sign modules with SHA3-512`
+ =============================== ==========================================
+
+ 此处选择的算法也将被构建到内核中(而不是作为模块),以便使用该算法签名的
+ 模块可以在不导致循环依赖的情况下检查其签名。
+
+
+ (4) :menuselection:`File name or PKCS#11 URI of module signing key`
+ (``CONFIG_MODULE_SIG_KEY``)
+
+ 将此选项设置为除默认值 ``certs/signing_key.pem`` 之外的其他值将禁用签名
+ 密钥的自动生成,并允许使用您选择的密钥对内核模块进行签名。提供的字符串应
+ 标识包含私钥及其对应的 PEM 格式 X.509 证书的文件,或者在 OpenSSL
+ ENGINE_pkcs11 功能正常的系统上,使用 RFC7512 定义的 PKCS#11 URI。在后一
+ 种情况下,PKCS#11 URI 应引用证书和私钥。
+
+ 如果包含私钥的 PEM 文件已加密,或者 PKCS#11 令牌需要 PIN,可以通过
+ ``KBUILD_SIGN_PIN`` 变量在构建时提供。
+
+
+ (5) :menuselection:`Additional X.509 keys for default system keyring`
+ (``CONFIG_SYSTEM_TRUSTED_KEYS``)
+
+ 此选项可设置为包含附加证书的 PEM 编码文件的文件名,这些证书将默认包含在
+ 系统密钥环中。
+
+请注意,启用模块签名会为内核构建过程添加对执行签名工具的OpenSSL开发包的依赖。
+
+
+生成签名密钥
+============
+
+生成和检查签名需要加密密钥对。私钥用于生成签名,相应的公钥用于检查签名。私钥
+仅在构建期间需要,之后可以删除或安全存储。公钥被构建到内核中,以便在加载模块
+时可以使用它来检查签名。
+
+在正常情况下,当 ``CONFIG_MODULE_SIG_KEY`` 保持默认值时,如果文件中不存在密
+钥对,内核构建将使用 openssl 自动生成新的密钥对::
+
+ certs/signing_key.pem
+
+在构建 vmlinux 期间(公钥需要构建到 vmlinux 中)使用参数::
+
+ certs/x509.genkey
+
+文件(如果尚不存在也会生成)。
+
+可以在 RSA(``MODULE_SIG_KEY_TYPE_RSA``)、
+ECDSA(``MODULE_SIG_KEY_TYPE_ECDSA``)和
+ML-DSA(``MODULE_SIG_KEY_TYPE_MLDSA_*``)之间选择生成 RSA 4k、NIST P-384
+密钥对或 ML-DSA 44、65 或 87 密钥对。
+
+强烈建议您提供自己的 x509.genkey 文件。
+
+最值得注意的是,在 x509.genkey 文件中,req_distinguished_name 部分应从默认值
+更改::
+
+ [ req_distinguished_name ]
+ #O = Unspecified company
+ CN = Build time autogenerated kernel key
+ #emailAddress = unspecified.user@unspecified.company
+
+生成的 RSA 密钥大小也可以通过以下方式设置::
+
+ [ req ]
+ default_bits = 4096
+
+也可以使用位于 Linux 内核源代码树根节点中的 x509.genkey 密钥生成配置文件和
+openssl 命令手动生成公钥/私钥文件。以下是生成公钥/私钥文件的示例::
+
+ openssl req -new -nodes -utf8 -sha256 -days 36500 -batch -x509 \
+ -config x509.genkey -outform PEM -out kernel_key.pem \
+ -keyout kernel_key.pem
+
+然后可以将生成的 kernel_key.pem 文件的完整路径名指定在
+``CONFIG_MODULE_SIG_KEY`` 选项中,并且将使用其中的证书和密钥而不是自动生成的
+密钥对。
+
+
+内核中的公钥
+============
+
+内核包含一个可由 root 查看的公钥环。它们在名为 ".builtin_trusted_keys" 的密
+钥环中,可以通过以下方式查看::
+
+ [root@deneb ~]# cat /proc/keys
+ ...
+ 223c7853 I------ 1 perm 1f030000 0 0 keyring .builtin_trusted_keys: 1
+ 302d2d52 I------ 1 perm 1f010000 0 0 asymmetri Fedora kernel signing key: d69a84e6bce3d216b979e9505b3e3ef9a7118079: X509.RSA a7118079 []
+
+除了专门为模块签名生成的公钥外,还可以在 ``CONFIG_SYSTEM_TRUSTED_KEYS`` 配置
+选项引用的 PEM 编码文件中提供其他受信任的证书。
+
+此外,架构代码可以从硬件存储中获取公钥并将其添加(例如从 UEFI 密钥数据库)。
+
+最后,可以通过以下方式添加其他公钥::
+
+ keyctl padd asymmetric "" [.builtin_trusted_keys-ID] <[key-file]
+
+例如::
+
+ keyctl padd asymmetric "" 0x223c7853 <my_public_key.x509
+
+但是,请注意,内核只允许将由已驻留在 ``.builtin_trusted_keys`` 中的密钥有效
+签名的密钥添加到 ``.builtin_trusted_keys``。
+
+模块手动签名
+============
+
+要手动对模块进行签名,请使用 Linux 内核源代码树中可用的 scripts/sign-file 工
+具。该脚本需要 4 个参数:
+
+ 1. 哈希算法(例如,sha256)
+ 2. 私钥文件名或 PKCS#11 URI
+ 3. 公钥文件名
+ 4. 要签名的内核模块
+
+以下是签名内核模块的示例::
+
+ scripts/sign-file sha512 kernel-signkey.priv \
+ kernel-signkey.x509 module.ko
+
+使用的哈希算法不必与配置的算法匹配,但如果不同,应确保哈希算法要么内置在内核
+中,要么可以在不需要自身的情况下加载。
+
+如果私钥需要密码或 PIN,可以在 $KBUILD_SIGN_PIN 环境变量中提供。
+
+
+已签名模块和剥离
+================
+
+已签名模块在末尾简单地附加了数字签名。模块文件末尾的字符串
+``~Module signature appended~.`` 确认签名存在,但不能确认签名有效!
+
+已签名模块是脆弱的,因为签名在定义的ELF容器之外。因此,一旦计算并附加签名,就
+不得剥离它们。请注意,整个模块都是签名的有效载荷,包括签名时存在的任何和所有
+调试信息。
+
+
+加载已签名模块
+==============
+
+模块通过 insmod、modprobe、``init_module()`` 或 ``finit_module()`` 加载,
+与未签名模块完全一样,因为在用户空间中不进行任何处理。
+所有签名检查都在内核内完成。
+
+
+无效签名和未签名模块
+====================
+
+如果启用了 ``CONFIG_MODULE_SIG_FORCE`` 或在内核启动命令提供了
+module.sig_enforce=1,内核将仅加载具有有效签名且具有公钥的模块。否则,它还将
+加载未签名的模块。任何具有不匹配签名的模块将不被允许加载。
+
+任何具有不可解析签名的模块将被拒绝。
+
+
+管理/保护私钥
+==============
+
+由于私钥用于签名模块,病毒和恶意软件可以使用私钥签名模块并危害操作系统。私钥
+必须被销毁或移动到安全位置,而不是保存在内核源代码树的根节点中。
+
+如果使用相同的私钥为多个内核配置签名模块,必须确保模块版本信息足以防止将模块
+加载到不同的内核中。要么设置 ``CONFIG_MODVERSIONS=y``,要么通过更改
+``EXTRAVERSION`` 或 ``CONFIG_LOCALVERSION`` 确保每个配置具有不同的内核发布字
+符串。
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v3] docs/zh_CN: add module-signing Chinese translation
From: Yan Zhu @ 2026-04-18 14:15 UTC (permalink / raw)
To: Dongliang Mu
Cc: alexs, seakeel, si.yanteng, corbet, skhan, linux-doc,
linux-kernel
In-Reply-To: <984f3b99-23e6-4931-a79e-7ed7cd47c04f@hust.edu.cn>
Hi Dongliang,
On 4/18/2026 2:20 PM, Dongliang Mu wrote:
>
> On 4/18/26 12:45 PM, Yan Zhu wrote:
>> Translate .../admin-guide/module-signing.rst into Chinese.
>>
>> Update the translation through commit 0ad9a71933e7
>> ("modsign: Enable ML-DSA module signing")
>>
>> Signed-off-by: Yan Zhu <zhuyan2015@qq.com>
>> ---
>
> Hi Yan,
>
> Please remember to add your changelog under the "---". For example,
> v1->v2: XXX
>
I followed the guidance in how-to.rst from the Chinese translation when
sending the patch. It has no specific explanation of this, so I assumed
that Chinese translation patches did not need a description of version
changes. I will resend the patch with this description filled in.
> And I wonder where the v2 patch is as I don't find it in my mbox. Do I
> miss something?
There may be something wrong with the mailbox. I see that you are on the
Cc list for the v2 patch. The link is
https://lore.kernel.org/lkml/tencent_0101EEFDDBC5D532222BFE0EF2487DCC0805@qq.com
> Dongliang Mu
>
>> [...]
--
Yan Zhu
^ permalink raw reply
* Re: [PATCH] kbuild: document generation of offset header files
From: Piyush Patle @ 2026-04-18 13:38 UTC (permalink / raw)
To: Nathan Chancellor, Nicolas Schier, Jonathan Corbet, linux-kbuild,
linux-doc
Cc: Shuah Khan, Mark Rutland, Chen Pei, Randy Dunlap, Arnd Bergmann,
Masahiro Yamada, linux-kernel
In-Reply-To: <20260410221257.191517-1-piyushpatle228@gmail.com>
On Sat, Apr 11, 2026 at 3:43 AM Piyush Patle <piyushpatle228@gmail.com> wrote:
>
> Replace the placeholder reference with a description of how Kbuild
> generates offset header files such as include/generated/asm-offsets.h.
>
> Remove the corresponding TODO entry now that this is documented.
>
> Signed-off-by: Piyush Patle <piyushpatle228@gmail.com>
> ---
> Documentation/kbuild/makefiles.rst | 41 ++++++++++++++++++++++++------
> 1 file changed, 33 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
> index 24a4708d26e8..7521cae7d56f 100644
> --- a/Documentation/kbuild/makefiles.rst
> +++ b/Documentation/kbuild/makefiles.rst
> @@ -1285,8 +1285,39 @@ Example::
> In this example, the file target maketools will be processed
> before descending down in the subdirectories.
>
> -See also chapter XXX-TODO that describes how kbuild supports
> -generating offset header files.
> +Generating offset header files
> +------------------------------
> +
> +The ``include/generated/asm-offsets.h`` header exposes C structure
> +member offsets and other compile-time constants to assembly code. It
> +is generated from ``arch/$(SRCARCH)/kernel/asm-offsets.c``.
> +
> +The source file uses ``DEFINE()``, ``OFFSET()``, ``BLANK()`` and
> +``COMMENT()`` from ``<linux/kbuild.h>``. These emit marker strings
> +through inline asm that Kbuild extracts from the compiled assembly
> +output.
> +
> +Example::
> +
> + #include <linux/kbuild.h>
> + #include <linux/sched.h>
> +
> + int main(void)
> + {
> + OFFSET(TSK_ACTIVE_MM, task_struct, active_mm);
> + DEFINE(THREAD_SIZE, THREAD_SIZE);
> + BLANK();
> + return 0;
> + }
> +
> +The rules are defined in the top-level ``Kbuild`` and
> +``scripts/Makefile.lib``. The header is built during Kbuild's
> +``prepare`` phase, after ``archprepare`` and before descending into
> +subdirectories.
> +
> +The same mechanism generates ``include/generated/bounds.h`` from
> +``kernel/bounds.c`` and ``include/generated/rq-offsets.h`` from
> +``kernel/sched/rq-offsets.c``.
>
> List directories to visit when descending
> -----------------------------------------
> @@ -1690,9 +1721,3 @@ Credits
> - Updates by Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de>
> - Updates by Sam Ravnborg <sam@ravnborg.org>
> - Language QA by Jan Engelhardt <jengelh@gmx.de>
> -
> -TODO
> -====
> -
> -- Generating offset header files.
> -- Add more variables to chapters 7 or 9?
> --
> 2.43.0
>
Hi,
Gentle ping on this patch.
I’d appreciate any feedback whenever you get time, or let me know if I
should resend/rework anything.
Regards,
Piyush
^ permalink raw reply
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Vlastimil Babka (SUSE) @ 2026-04-18 11:20 UTC (permalink / raw)
To: Nick Huang, Lorenzo Stoakes
Cc: Matthew Wilcox, Harry Yoo, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Hao Li, Christoph Lameter, David Rientjes,
Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <CABZAGRGCwM1PiSxennV-Uy9y1gGKjpcHeq+eOAbqSZcy3Qb55g@mail.gmail.com>
On 4/18/26 1:00 PM, Nick Huang wrote:
> Hi Lorenzo Stoakes
> Lorenzo Stoakes <ljs@kernel.org> 於 2026年4月18日週六 下午5:11寫道:
>>
>> On Sat, Apr 18, 2026 at 02:12:22PM +0800, Nick Huang wrote:
>>> Nick Huang <sef1548@gmail.com> 於 2026年4月18日週六 下午1:27寫道:
>>>>
>>>> Matthew Wilcox <willy@infradead.org> 於 2026年4月18日週六 下午1:04寫道:
>>>>>
>>>>> On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
>>>>>> - Add "Overview" section explaining the slab allocator's role and purpose
>>>>>> - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
>>> Hi Matthew Wilcox
>>> I will remove this sentence in the next version:
>>> “Document the three main slab allocator implementations (SLAB, SLUB, SLOB).”
>>> I’m not entirely sure I fully understand your point. If I’ve missed
>>> anything, please let me know what needs to be changed. Thank you.
>>
>> No, please don't send any more revisions of this garbage, thanks.
>
> thank you for your guidance. I will correct my work and introduce the
> more recent `barn`, `sheave`, and `kmalloc_obj`.
> Do you think this is appropriate?
No, this whole thing is inappropriate from the beginning. We are not
going to waste more time on development-by-review for something that
started as undisclosed LLM slop.
^ permalink raw reply
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Nick Huang @ 2026-04-18 11:00 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Matthew Wilcox, Vlastimil Babka, Harry Yoo, Andrew Morton,
David Hildenbrand, Jonathan Corbet, Hao Li, Christoph Lameter,
David Rientjes, Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <aeNKDh8S6pXHqRFh@lucifer>
Hi Lorenzo Stoakes
Lorenzo Stoakes <ljs@kernel.org> 於 2026年4月18日週六 下午5:11寫道:
>
> On Sat, Apr 18, 2026 at 02:12:22PM +0800, Nick Huang wrote:
> > Nick Huang <sef1548@gmail.com> 於 2026年4月18日週六 下午1:27寫道:
> > >
> > > Matthew Wilcox <willy@infradead.org> 於 2026年4月18日週六 下午1:04寫道:
> > > >
> > > > On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
> > > > > - Add "Overview" section explaining the slab allocator's role and purpose
> > > > > - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
> > Hi Matthew Wilcox
> > I will remove this sentence in the next version:
> > “Document the three main slab allocator implementations (SLAB, SLUB, SLOB).”
> > I’m not entirely sure I fully understand your point. If I’ve missed
> > anything, please let me know what needs to be changed. Thank you.
>
> No, please don't send any more revisions of this garbage, thanks.
thank you for your guidance. I will correct my work and introduce the
more recent `barn`, `sheave`, and `kmalloc_obj`.
Do you think this is appropriate?
--
Regards,
Nick Huang
^ permalink raw reply
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Lorenzo Stoakes @ 2026-04-18 9:11 UTC (permalink / raw)
To: Nick Huang
Cc: Matthew Wilcox, Vlastimil Babka, Harry Yoo, Andrew Morton,
David Hildenbrand, Jonathan Corbet, Hao Li, Christoph Lameter,
David Rientjes, Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <CABZAGRGFpiiEr7Odd5an1+9Z+sX1C6QT2iadv-0hNhxGj8eEyg@mail.gmail.com>
On Sat, Apr 18, 2026 at 02:12:22PM +0800, Nick Huang wrote:
> Nick Huang <sef1548@gmail.com> 於 2026年4月18日週六 下午1:27寫道:
> >
> > Matthew Wilcox <willy@infradead.org> 於 2026年4月18日週六 下午1:04寫道:
> > >
> > > On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
> > > > - Add "Overview" section explaining the slab allocator's role and purpose
> > > > - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
> Hi Matthew Wilcox
> I will remove this sentence in the next version:
> “Document the three main slab allocator implementations (SLAB, SLUB, SLOB).”
> I’m not entirely sure I fully understand your point. If I’ve missed
> anything, please let me know what needs to be changed. Thank you.
No, please don't send any more revisions of this garbage, thanks.
^ permalink raw reply
* [PATCH] docs/zh_CN: polish how-to.rst
From: Dongliang Mu @ 2026-04-18 9:10 UTC (permalink / raw)
To: Alex Shi, Yanteng Si, Dongliang Mu, Jonathan Corbet, Shuah Khan
Cc: linux-doc, linux-kernel
Editorial pass on the Chinese translation contributor guide.
- Fix typos: 网络通常 → 通畅; remove trailing backticks on the
checktransupdate.py command; mis-placed 。 → , in the 紧急处理
section; 您/你 and 的/地 inconsistencies in 进阶.
- Correct "git email" to "git send-email", matching usage
elsewhere in the document.
- Replace an invalid <URL> inline form with a bare URL so Sphinx
renders the lore.kernel.org link.
- Tighten grammar and wording: 针对于 → 面向; drop redundant
最多 before 不超过 and tautological 即可; remove double 的 in
您的翻译的内容; resolve ambiguity around 继续 placement in the
把补丁提交到邮件列表 section; and similar small fixes.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn>
---
Documentation/translations/zh_CN/how-to.rst | 46 ++++++++++-----------
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/Documentation/translations/zh_CN/how-to.rst b/Documentation/translations/zh_CN/how-to.rst
index 7ae5d8765888..a46d7395b11c 100644
--- a/Documentation/translations/zh_CN/how-to.rst
+++ b/Documentation/translations/zh_CN/how-to.rst
@@ -13,20 +13,20 @@ Linux 内核中文文档翻译规范
过去几年,在广大社区爱好者的友好合作下,Linux 内核中文文档迎来了蓬勃的发
展。在翻译的早期,一切都是混乱的,社区对译稿只有一个准确翻译的要求,以鼓
励更多的开发者参与进来,这是从 0 到 1 的必然过程,所以早期的中文文档目录
-更加具有多样性,不过好在文档不多,维护上并没有过大的压力。
+呈现出较强的多样性,不过好在文档不多,维护上并没有过大的压力。
然而,世事变幻,不觉有年,现在内核中文文档在前进的道路上越走越远,很多潜
在的问题逐渐浮出水面,而且随着中文文档数量的增加,翻译更多的文档与提高中
文文档可维护性之间的矛盾愈发尖锐。由于文档翻译的特殊性,很多开发者并不会
一直更新文档,如果中文文档落后英文文档太多,文档更新的工作量会远大于重新
翻译。而且邮件列表中陆续有新的面孔出现,他们那股热情,就像燃烧的火焰,能
-瞬间点燃整个空间,可是他们的补丁往往具有个性,这会给审阅带来了很大的困难,
+瞬间点燃整个空间,可是他们的补丁往往具有个性,这给审阅带来了很大的困难,
reviewer 们只能耐心地指导他们如何与社区更好地合作,但是这项工作具有重复
性,长此以往,会渐渐浇灭 reviewer 审阅的热情。
-虽然内核文档中已经有了类似的贡献指南,但是缺乏专门针对于中文翻译的,尤其
+虽然内核文档中已经有了类似的贡献指南,但是缺乏专门面向中文翻译的,尤其
是对于新手来说,浏览大量的文档反而更加迷惑,该文档就是为了缓解这一问题而
-编写,目的是为提供给新手一个快速翻译指南。
+编写,旨在为新手提供一份快速翻译指南。
详细的贡献指南:Documentation/translations/zh_CN/process/index.rst。
@@ -145,8 +145,8 @@ Git 和邮箱配置
sudo dnf install git-email
vim ~/.gitconfig
-这里是我的一个配置文件示范,请根据您的邮箱域名服务商提供的手册替换到对
-应的字段。
+这里是我的一个配置文件示范,请根据您的邮箱域名服务商提供的手册替换对应
+的字段。
::
[user]
@@ -190,7 +190,7 @@ Git 和邮箱配置
译文格式要求
------------
- - 每行长度最多不超过 40 个字符
+ - 每行长度不超过 40 个字符
- 每行长度请保持一致
- 标题的下划线长度请按照一个英文一个字符、一个中文两个字符与标题对齐
- 其它的修饰符请与英文文档保持一致
@@ -211,7 +211,7 @@ Git 和邮箱配置
--------
中文文档有每行 40 字符限制,因为一个中文字符等于 2 个英文字符。但是社区并
-没有那么严格,一个诀窍是将您的翻译的内容与英文原文的每行长度对齐即可,这样,
+没有那么严格,一个诀窍是将您翻译的内容与英文原文的每行长度对齐,这样,
您也不必总是检查有没有超限。
如果您的英文阅读能力有限,可以考虑使用辅助翻译工具,例如 deepseek。但是您
@@ -309,8 +309,8 @@ warning 不需要解决::
重新导出再次检测,重复这个过程,直到处理完所有的补丁。
-最后,如果检测时没有 warning 和 error 需要被处理或者您只有一个补丁,请跳
-过下面这个步骤,否则请重新导出补丁制作封面::
+最后,如果检测时没有需要处理的 warning 和 error,或者您只有一个补丁,请
+跳过下面这个步骤,否则请重新导出补丁制作封面::
git format-patch -N --cover-letter --thread=shallow
# N 要替换为补丁数量,一般 N 大于 1
@@ -335,7 +335,7 @@ warning 不需要解决::
docs/zh_CN: add xxxxx
...
-如果您只有一个补丁,则可以不制作封面,即 0 号补丁,只需要执行::
+如果您只有一个补丁,则无需制作封面(即 0 号补丁),只需执行::
git format-patch -1
@@ -361,13 +361,13 @@ warning 不需要解决::
git send-email *.patch --to <maintainer email addr> --cc <others addr>
# 一个 to 对应一个地址,一个 cc 对应一个地址,有几个就写几个
-执行该命令时,请确保网络通常,邮件发送成功一般会返回 250。
+执行该命令时,请确保网络通畅,邮件发送成功一般会返回 250。
您可以先发送给自己,尝试发出的 patch 是否可以用 'git am' 工具正常打上。
如果检查正常, 您就可以放心的发送到社区评审了。
-如果该步骤被中断,您可以检查一下,继续用上条命令发送失败的补丁,一定不要再
-次发送已经发送成功的补丁。
+如果该步骤被中断,您可以检查一下,然后用上条命令继续发送失败的补丁,一定不
+要再次发送已经发送成功的补丁。
积极参与审阅过程并迭代补丁
==========================
@@ -380,7 +380,7 @@ reviewer 的评论,做到每条都有回复,每个回复都落实到位。
- 请先将您的邮箱客户端信件回复修改为 **纯文本** 格式,并去除所有签名,尤其是
企业邮箱。
- - 然后点击回复按钮,并将要回复的邮件带入,
+ - 然后点击回复按钮,并引用要回复的邮件,
- 在第一条评论行尾换行,输入您的回复
- 在第二条评论行尾换行,输入您的回复
- 直到处理完最后一条评论,换行空两行输入问候语和署名
@@ -425,10 +425,10 @@ reviewer 的评论,做到每条都有回复,每个回复都落实到位。
紧急处理
--------
-如果您发送到邮件列表之后。发现发错了补丁集,尤其是在多个版本迭代的过程中;
+如果您发送到邮件列表之后,发现发错了补丁集,尤其是在多个版本迭代的过程中;
自己发现了一些不妥的翻译;发送错了邮件列表……
-git email 默认会抄送给您一份,所以您可以切换为审阅者的角色审查自己的补丁,
+git send-email 默认会抄送给您一份,所以您可以切换为审阅者的角色审查自己的补丁,
并留下评论,描述有何不妥,将在下个版本怎么改,并付诸行动,重新提交,但是
注意频率,每天提交的次数不要超过两次。
@@ -437,7 +437,7 @@ git email 默认会抄送给您一份,所以您可以切换为审阅者的角
对于首次参与 Linux 内核中文文档翻译的新手,建议您在 linux 目录中运行以下命令:
::
- tools/docs/checktransupdate.py -l zh_CN``
+ tools/docs/checktransupdate.py -l zh_CN
该命令会列出需要翻译或更新的英文文档,结果同时保存在 checktransupdate.log 中。
@@ -446,9 +446,9 @@ git email 默认会抄送给您一份,所以您可以切换为审阅者的角
进阶
----
-希望您不只是单纯的翻译内核文档,在熟悉了一起与社区工作之后,您可以审阅其他
+希望您不只是单纯地翻译内核文档,在熟悉了与社区协作之后,您可以审阅其他
开发者的翻译,或者提出具有建设性的主张。与此同时,与文档对应的代码更加有趣,
-而且需要完善的地方还有很多,勇敢地去探索,然后提交你的想法吧。
+而且需要完善的地方还有很多,勇敢地去探索,然后提交您的想法吧。
常见的问题
==========
@@ -467,7 +467,7 @@ Maintainer 回复补丁不能正常 apply
------------------
大部分情况下,是由于您发送了非纯文本格式的信件,请尽量避免使用 webmail,推荐
-使用邮件客户端,比如 thunderbird,记得在设置中的回信配置那改为纯文本发送。
+使用邮件客户端,比如 thunderbird,记得在设置的回信配置中改为纯文本发送。
-如果超过了 24 小时,您依旧没有在<https://lore.kernel.org/linux-doc/>发现您的
-邮件,请联系您的网络管理员帮忙解决。
+如果超过了 24 小时,您依旧没有在 https://lore.kernel.org/linux-doc/ 上找到您
+的邮件,请联系您的网络管理员帮忙解决。
--
2.43.0
^ permalink raw reply related
* Re: [PATCH] docs: Add overview and SLUB allocator sections to slab documentation
From: Lorenzo Stoakes @ 2026-04-18 9:07 UTC (permalink / raw)
To: Nick Huang
Cc: Vlastimil Babka, Harry Yoo, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Hao Li, Christoph Lameter, David Rientjes,
Roman Gushchin, Liam R . Howlett, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, linux-mm, linux-doc,
linux-kernel
In-Reply-To: <20260418000635.17499-1-sef1548@gmail.com>
NAK to obvious, disrespectful, AI slop garbage.
Go read https://docs.kernel.org/process/generated-content.html - especially the
bit about dismissing crap like this.
On Sat, Apr 18, 2026 at 12:06:19AM +0000, Nick Huang wrote:
> - Add "Overview" section explaining the slab allocator's role and purpose
> - Document the three main slab allocator implementations (SLAB, SLUB, SLOB)
The fact you're insanely wrong about the current state of slab only makes this
worse.
> - Highlight SLUB as the default allocator on modern systems
Not default. Only...
> - Add "SLUB Allocator" subsection with detailed information:
There's nothing detailed...
> - Explain SLUB's design goals and advantages over legacy SLAB
Irrelevant, SLAB doesn't exist.
> - Document its focus on simplification and performance
Who cares? This isn't linked in?
> - Note support for both uniprocessor and SMP systems
Uniprocessor? Seriously?
>
> Signed-off-by: Nick Huang <sef1548@gmail.com>
> ---
> Documentation/mm/slab.rst | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/Documentation/mm/slab.rst b/Documentation/mm/slab.rst
> index 2bcc58ada302..2d1d093afb7b 100644
> --- a/Documentation/mm/slab.rst
> +++ b/Documentation/mm/slab.rst
> @@ -4,6 +4,32 @@
> Slab Allocation
> ===============
>
> +Overview
> +========
> +
> +The slab allocator is responsible for efficient allocation and reuse of
> +small kernel objects. It reduces internal fragmentation and improves
> +performance by caching frequently used objects.
This sentence doesn't even make any sense.
> +
> +The Linux kernel provides multiple slab allocator implementations,
> +including SLAB, SLUB, and SLOB. Among these, SLUB is the default
> +allocator on most modern systems.
WRONG. WRONG. WRONG.
> +
> +SLUB Allocator
> +==============
> +
> +Overview
> +--------
> +
> +SLUB is a slab allocator designed to replace the legacy SLAB allocator
> +(mm/slab.c). It addresses the complexity, scalability limitations, and
> +memory overhead of the SLAB implementation.
This is useless crap? 'X is designed to replace Y which doesn't exist but let's
mention it anyway'. How is this an overview?
> +
> +The primary goal of SLUB is to simplify slab allocation while improving
> +performance on both uniprocessor (UP) and symmetric multiprocessing (SMP)
> +systems.
This is meaningless noise too.
> +
> +
> Functions and structures
> ========================
>
> --
> 2.43.0
>
You've wasted my time, your time and other people's time. Have a think about
that.
Lorenzo
^ permalink raw reply
* [PATCH v7 1/4] KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask()
From: Akihiko Odaki @ 2026-04-18 8:14 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260418-hybrid-v7-0-2bf39ad009bf@rsg.ci.i.u-tokyo.ac.jp>
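Add kvm_pmu_enabled_counter_mask(), which returns the mask of currently
enabled counters, and reimplement kvm_pmu_counter_is_enabled() on top of
it. The new helper will be useful to enumerate enabled counters.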
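
As a rough userspace illustration of the mask computation in the diff
below (not the kernel code itself), the logic can be sketched as follows.
The struct and function names are hypothetical stand-ins for the vCPU
register state that the real helper reads through __vcpu_sys_reg() and
kvm_vcpu_read_pmcr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vCPU state the kernel helper reads. */
struct vcpu_pmu_state {
	bool hpme;           /* MDCR_EL2.HPME: EL2-reserved counters enabled */
	bool pmcr_e;         /* PMCR_EL0.E: all other counters enabled */
	uint64_t hyp_mask;   /* counters reserved for EL2 */
	uint64_t pmcntenset; /* PMCNTENSET_EL0: per-counter enable bits */
};

/* Mirror of the patch's logic: pick the globally enabled ranges first,
 * then keep only the counters that are individually enabled. */
static uint64_t enabled_counter_mask(const struct vcpu_pmu_state *s)
{
	uint64_t mask = 0;

	if (s->hpme)
		mask |= s->hyp_mask;
	if (s->pmcr_e)
		mask |= ~s->hyp_mask;

	return s->pmcntenset & mask;
}

/* Single-counter query expressed in terms of the mask. */
static bool counter_is_enabled(const struct vcpu_pmu_state *s,
			       unsigned int idx)
{
	assert(idx < 64); /* shifting by 64 or more would be undefined */
	return (enabled_counter_mask(s) >> idx) & 1;
}
```

Computing the mask once lets a caller walk all enabled counters in a
single pass instead of querying each counter separately.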
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
arch/arm64/kvm/pmu-emul.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index b03dbda7f1ab..59ec96e09321 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -619,18 +619,24 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
}
}
-static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
+static u64 kvm_pmu_enabled_counter_mask(struct kvm_vcpu *vcpu)
{
- struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
- unsigned int mdcr = __vcpu_sys_reg(vcpu, MDCR_EL2);
+ u64 mask = 0;
- if (!(__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & BIT(pmc->idx)))
- return false;
+ if (__vcpu_sys_reg(vcpu, MDCR_EL2) & MDCR_EL2_HPME)
+ mask |= kvm_pmu_hyp_counter_mask(vcpu);
- if (kvm_pmu_counter_is_hyp(vcpu, pmc->idx))
- return mdcr & MDCR_EL2_HPME;
+ if (kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E)
+ mask |= ~kvm_pmu_hyp_counter_mask(vcpu);
+
+ return __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
+}
+
+static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
+{
+ struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
- return kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E;
+ return kvm_pmu_enabled_counter_mask(vcpu) & BIT(pmc->idx);
}
static bool kvm_pmc_counts_at_el0(struct kvm_pmc *pmc)
--
2.53.0
^ permalink raw reply related
* [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
From: Akihiko Odaki @ 2026-04-18 8:14 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260418-hybrid-v7-0-2bf39ad009bf@rsg.ci.i.u-tokyo.ac.jp>
On a heterogeneous arm64 system, KVM's PMU emulation is based on the
features of a single host PMU instance. When a vCPU is migrated to a
pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
incrementing.
Although this behavior is permitted by the architecture, Windows does
not handle it gracefully and may crash with a division-by-zero error.
The current workaround requires VMMs to pin vCPUs to a set of pCPUs
that share a compatible PMU. This is difficult to implement correctly in
QEMU/libvirt, where pinning occurs after vCPU initialization, and it
also restricts the guest to a subset of available pCPUs.
Introduce the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY attribute to
create a "fixed-counters-only" PMU. When set, KVM exposes a PMU that is
compatible with all pCPUs but does not support programmable event
counters, which may have different feature sets on different PMUs.
This allows Windows guests to run reliably on heterogeneous systems
without crashing, even without vCPU pinning, and enables VMMs to
schedule vCPUs across all available pCPUs, making full use of the host
hardware.
Much like KVM_ARM_VCPU_PMU_V3_IRQ and other read-write attributes, this
attribute provides a getter that facilitates kernel and userspace
debugging/testing.
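
A minimal sketch of how a VMM might drive the attribute from userspace,
assuming the UAPI lands as proposed in this patch. The helper names are
hypothetical, and the fallback definition of the attribute number (5) is
taken from this patch and is not present in released headers. Per the
documentation added below, the attribute has to be set before PMUv3 is
initialized and before any vCPU has run.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Group id from the arm64 UAPI header; repeated here only so that the
 * sketch also builds on other architectures. */
#ifndef KVM_ARM_VCPU_PMU_V3_CTRL
#define KVM_ARM_VCPU_PMU_V3_CTRL 0
#endif
/* Assumption: value proposed by this patch, absent from older headers. */
#ifndef KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
#define KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY 5
#endif

/* Hypothetical helper: the attribute takes no parameter, so addr is 0. */
static struct kvm_device_attr fixed_counters_only_attr(void)
{
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr = KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY,
		.addr = 0,
	};

	return attr;
}

/* Returns 0 on success or a negative errno (for example -EBUSY). */
static int set_fixed_counters_only(int vcpu_fd)
{
	struct kvm_device_attr attr = fixed_counters_only_attr();

	if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr) < 0)
		return -errno;
	return 0;
}

/* Returns 1 if set, 0 if not set (ENXIO), or another negative errno. */
static int get_fixed_counters_only(int vcpu_fd)
{
	struct kvm_device_attr attr = fixed_counters_only_attr();

	if (ioctl(vcpu_fd, KVM_GET_DEVICE_ATTR, &attr) == 0)
		return 1;
	return errno == ENXIO ? 0 : -errno;
}
```

Here vcpu_fd would be a descriptor obtained from KVM_CREATE_VCPU; the
getter maps the ENXIO result described below onto "not set".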
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Documentation/virt/kvm/devices/vcpu.rst | 29 ++++++
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/pmu-emul.c | 155 +++++++++++++++++++++++---------
include/kvm/arm_pmu.h | 2 +
6 files changed, 147 insertions(+), 43 deletions(-)
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 60bf205cb373..e0aeb1897d77 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -161,6 +161,35 @@ explicitly selected, or the number of counters is out of range for the
selected PMU. Selecting a new PMU cancels the effect of setting this
attribute.
+1.6 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
+------------------------------------------------------
+
+:Parameters: no additional parameter in kvm_device_attr.addr
+
+:Returns:
+
+ ======= =====================================================
+ -EBUSY Attempted to set after initializing PMUv3 or running
+ VCPU, or attempted to set for the first time after
+ setting an event filter
+ -ENXIO Attempted to get before setting
+ -ENODEV Attempted to set while PMUv3 not supported
+ ======= =====================================================
+
+If set, PMUv3 will be emulated without programmable event counters. The VCPU
+will use any compatible hardware PMU. This attribute is particularly useful on
+heterogeneous systems where different hardware PMUs cover different physical
+CPUs. The compatibility of hardware PMUs can be checked with
+KVM_ARM_VCPU_PMU_V3_SET_PMU. All VCPUs in a VM share this attribute. It isn't
+possible to set it for the first time if a PMU event filter is already present.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+with compatible hardware PMUs. This is entirely left to userspace. However,
+attempting to run the VCPU on an unsupported CPU will fail and KVM_RUN will
+return with exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct
+by setting hardware_entry_failure_reason field to
+KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and the cpu field to the processor id.
+
2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 59f25b85be2b..b59e0182472c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -353,6 +353,8 @@ struct kvm_arch {
#define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
/* Unhandled SEAs are taken to userspace */
#define KVM_ARCH_FLAG_EXIT_SEA 11
+ /* PMUv3 is emulated without programmable event counters */
+#define KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY 12
unsigned long flags;
/* VM-wide vCPU feature set */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index a792a599b9d6..474c84fa757f 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -436,6 +436,7 @@ enum {
#define KVM_ARM_VCPU_PMU_V3_FILTER 2
#define KVM_ARM_VCPU_PMU_V3_SET_PMU 3
#define KVM_ARM_VCPU_PMU_V3_SET_NR_COUNTERS 4
+#define KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY 5
#define KVM_ARM_VCPU_TIMER_CTRL 1
#define KVM_ARM_VCPU_TIMER_IRQ_VTIMER 0
#define KVM_ARM_VCPU_TIMER_IRQ_PTIMER 1
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 620a465248d1..dca16ca26d32 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -634,6 +634,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (has_vhe())
kvm_vcpu_load_vhe(vcpu);
kvm_arch_vcpu_load_fp(vcpu);
+ kvm_vcpu_load_pmu(vcpu);
kvm_vcpu_pmu_restore_guest(vcpu);
if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index ef5140bbfe28..d1009c144581 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -326,7 +326,10 @@ u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu)
static void kvm_pmc_enable_perf_event(struct kvm_pmc *pmc)
{
- if (!pmc->perf_event) {
+ struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
+
+ if (!pmc->perf_event ||
+ !cpumask_test_cpu(vcpu->cpu, &to_arm_pmu(pmc->perf_event->pmu)->supported_cpus)) {
kvm_pmu_create_perf_event(pmc);
return;
}
@@ -667,10 +670,8 @@ static bool kvm_pmc_counts_at_el2(struct kvm_pmc *pmc)
return kvm_pmc_read_evtreg(pmc) & ARMV8_PMU_INCLUDE_EL2;
}
-static int kvm_map_pmu_event(struct kvm *kvm, unsigned int eventsel)
+static int kvm_map_pmu_event(struct arm_pmu *pmu, unsigned int eventsel)
{
- struct arm_pmu *pmu = kvm->arch.arm_pmu;
-
/*
* The CPU PMU likely isn't PMUv3; let the driver provide a mapping
* for the guest's PMUv3 event ID.
@@ -681,6 +682,23 @@ static int kvm_map_pmu_event(struct kvm *kvm, unsigned int eventsel)
return eventsel;
}
+static struct arm_pmu *kvm_pmu_probe_armpmu(int cpu)
+{
+ struct arm_pmu_entry *entry;
+ struct arm_pmu *pmu;
+
+ guard(rcu)();
+
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
+ pmu = entry->arm_pmu;
+
+ if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
+ return pmu;
+ }
+
+ return NULL;
+}
+
/**
* kvm_pmu_create_perf_event - create a perf event for a counter
* @pmc: Counter context
@@ -694,6 +712,12 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
int eventsel;
u64 evtreg;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags)) {
+ arm_pmu = kvm_pmu_probe_armpmu(vcpu->cpu);
+ if (!arm_pmu)
+ return;
+ }
+
evtreg = kvm_pmc_read_evtreg(pmc);
kvm_pmu_stop_counter(pmc);
@@ -722,7 +746,7 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
* Don't create an event if we're running on hardware that requires
* PMUv3 event translation and we couldn't find a valid mapping.
*/
- eventsel = kvm_map_pmu_event(vcpu->kvm, eventsel);
+ eventsel = kvm_map_pmu_event(arm_pmu, eventsel);
if (eventsel < 0)
return;
@@ -810,42 +834,6 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
list_add_tail_rcu(&entry->entry, &arm_pmus);
}
-static struct arm_pmu *kvm_pmu_probe_armpmu(void)
-{
- struct arm_pmu_entry *entry;
- struct arm_pmu *pmu;
- int cpu;
-
- guard(rcu)();
-
- /*
- * It is safe to use a stale cpu to iterate the list of PMUs so long as
- * the same value is used for the entirety of the loop. Given this, and
- * the fact that no percpu data is used for the lookup there is no need
- * to disable preemption.
- *
- * It is still necessary to get a valid cpu, though, to probe for the
- * default PMU instance as userspace is not required to specify a PMU
- * type. In order to uphold the preexisting behavior KVM selects the
- * PMU instance for the core during vcpu init. A dependent use
- * case would be a user with disdain of all things big.LITTLE that
- * affines the VMM to a particular cluster of cores.
- *
- * In any case, userspace should just do the sane thing and use the UAPI
- * to select a PMU type directly. But, be wary of the baggage being
- * carried here.
- */
- cpu = raw_smp_processor_id();
- list_for_each_entry_rcu(entry, &arm_pmus, entry) {
- pmu = entry->arm_pmu;
-
- if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
- return pmu;
- }
-
- return NULL;
-}
-
static u64 __compute_pmceid(struct arm_pmu *pmu, bool pmceid1)
{
u32 hi[2], lo[2];
@@ -888,6 +876,9 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
u64 val, mask = 0;
int base, i, nr_events;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
+ return 0;
+
if (!pmceid1) {
val = compute_pmceid0(cpu_pmu);
base = 0;
@@ -915,6 +906,26 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
return val & mask;
}
+void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu)
+{
+ unsigned long mask = kvm_pmu_enabled_counter_mask(vcpu);
+ struct kvm_pmc *pmc;
+ struct arm_pmu *cpu_pmu;
+ int i;
+
+ for_each_set_bit(i, &mask, 32) {
+ pmc = kvm_vcpu_idx_to_pmc(vcpu, i);
+ if (!pmc->perf_event)
+ continue;
+
+ cpu_pmu = to_arm_pmu(pmc->perf_event->pmu);
+ if (!cpumask_test_cpu(vcpu->cpu, &cpu_pmu->supported_cpus)) {
+ kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu);
+ break;
+ }
+ }
+}
+
void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu)
{
u64 mask = kvm_pmu_implemented_counter_mask(vcpu);
@@ -1016,6 +1027,9 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
{
struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags))
+ return 0;
+
/*
* PMUv3 requires that all event counters are capable of counting any
* event, though the same may not be true of non-PMUv3 hardware.
@@ -1070,7 +1084,24 @@ static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
*/
int kvm_arm_set_default_pmu(struct kvm *kvm)
{
- struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu();
+ /*
+ * It is safe to use a stale cpu to iterate the list of PMUs so long as
+ * the same value is used for the entirety of the loop. Given this, and
+ * the fact that no percpu data is used for the lookup there is no need
+ * to disable preemption.
+ *
+ * It is still necessary to get a valid cpu, though, to probe for the
+ * default PMU instance as userspace is not required to specify a PMU
+ * type. In order to uphold the preexisting behavior KVM selects the
+ * PMU instance for the core during vcpu init. A dependent use
+ * case would be a user with disdain of all things big.LITTLE that
+ * affines the VMM to a particular cluster of cores.
+ *
+ * In any case, userspace should just do the sane thing and use the UAPI
+ * to select a PMU type directly. But, be wary of the baggage being
+ * carried here.
+ */
+ struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu(raw_smp_processor_id());
if (!arm_pmu)
return -ENODEV;
@@ -1098,6 +1129,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
break;
}
+ clear_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags);
kvm_arm_set_pmu(kvm, arm_pmu);
cpumask_copy(kvm->arch.supported_cpus, &arm_pmu->supported_cpus);
ret = 0;
@@ -1108,11 +1140,42 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
return ret;
}
+static int kvm_arm_pmu_v3_set_pmu_fixed_counters_only(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct arm_pmu_entry *entry;
+ struct arm_pmu *arm_pmu;
+ struct cpumask *supported_cpus = kvm->arch.supported_cpus;
+
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ if (kvm_vm_has_ran_once(kvm) ||
+ (kvm->arch.pmu_filter &&
+ !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags)))
+ return -EBUSY;
+
+ set_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags);
+ kvm_arm_set_nr_counters(kvm, 0);
+ cpumask_clear(supported_cpus);
+
+ guard(rcu)();
+
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
+ arm_pmu = entry->arm_pmu;
+ cpumask_or(supported_cpus, supported_cpus, &arm_pmu->supported_cpus);
+ }
+
+ return 0;
+}
+
static int kvm_arm_pmu_v3_set_nr_counters(struct kvm_vcpu *vcpu, unsigned int n)
{
struct kvm *kvm = vcpu->kvm;
- if (!kvm->arch.arm_pmu)
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ if (!kvm->arch.arm_pmu &&
+ !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags))
return -EINVAL;
if (n > kvm_arm_pmu_get_max_counters(kvm))
@@ -1227,6 +1290,8 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
return kvm_arm_pmu_v3_set_nr_counters(vcpu, n);
}
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
+ return kvm_arm_pmu_v3_set_pmu_fixed_counters_only(vcpu);
case KVM_ARM_VCPU_PMU_V3_INIT:
return kvm_arm_pmu_v3_init(vcpu);
}
@@ -1253,6 +1318,9 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
irq = vcpu->arch.pmu.irq_num;
return put_user(irq, uaddr);
}
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
+ return 0;
}
return -ENXIO;
@@ -1266,6 +1334,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
case KVM_ARM_VCPU_PMU_V3_FILTER:
case KVM_ARM_VCPU_PMU_V3_SET_PMU:
case KVM_ARM_VCPU_PMU_V3_SET_NR_COUNTERS:
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
if (kvm_vcpu_has_pmu(vcpu))
return 0;
}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 96754b51b411..1375cbaf97b2 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -56,6 +56,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
u64 select_idx);
+void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu);
void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu);
int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr);
@@ -161,6 +162,7 @@ static inline u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
static inline void kvm_pmu_update_vcpu_events(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu) {}
static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
{
--
2.53.0
^ permalink raw reply related
* [PATCH v7 0/4] KVM: arm64: PMU: Use multiple host PMUs
From: Akihiko Odaki @ 2026-04-18 8:14 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
On a heterogeneous arm64 system, KVM's PMU emulation is based on the
features of a single host PMU instance. When a vCPU is migrated to a
pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
incrementing.
Although this behavior is permitted by the architecture, Windows does
not handle it gracefully and may crash with a division-by-zero error.
The current workaround requires VMMs to pin vCPUs to a set of pCPUs
that share a compatible PMU. This is difficult to implement correctly in
QEMU/libvirt, where pinning occurs after vCPU initialization, and it
also restricts the guest to a subset of available pCPUs.
This series introduces the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
attribute. If set, PMUv3 will be emulated without programmable event
counters. KVM will be able to run VCPUs on any physical CPUs with a
compatible hardware PMU.
This allows Windows guests to run reliably on heterogeneous systems
without crashing, even without vCPU pinning, and enables VMMs to
schedule vCPUs across all available pCPUs, making full use of the host
hardware.
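
To illustrate the idea in userspace terms (a simplified sketch, not the
kernel implementation), a fixed-counters-only VM can run on the union of
the CPUs covered by every host PMU, and the backing PMU is looked up per
physical CPU when a vCPU is loaded. The types and function names below
are hypothetical, and the CPU mask is limited to 64 CPUs for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PMU descriptor: bit n of supported_cpus is
 * set when physical CPU n is covered by this PMU (at most 64 CPUs). */
struct host_pmu {
	const char *name;
	uint64_t supported_cpus;
};

static bool pmu_covers_cpu(const struct host_pmu *pmu, unsigned int cpu)
{
	return cpu < 64 && (pmu->supported_cpus & (UINT64_C(1) << cpu));
}

/* Return the first PMU that covers @cpu, or NULL if none does. */
static const struct host_pmu *probe_pmu(const struct host_pmu *pmus,
					size_t n, unsigned int cpu)
{
	for (size_t i = 0; i < n; i++) {
		if (pmu_covers_cpu(&pmus[i], cpu))
			return &pmus[i];
	}
	return NULL;
}

/* Union of all PMUs' CPUs: where a fixed-counters-only vCPU may run. */
static uint64_t all_supported_cpus(const struct host_pmu *pmus, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++)
		mask |= pmus[i].supported_cpus;
	return mask;
}

/* True when an event bound to @current must be re-created because the
 * vCPU now runs on a CPU that this PMU does not cover. */
static bool needs_new_event(const struct host_pmu *current,
			    unsigned int cpu)
{
	return !pmu_covers_cpu(current, cpu);
}
```

On a two-cluster system, a vCPU that moves from one cluster to the other
would fail the coverage check and have its event re-created on the PMU
returned by the lookup for the new CPU.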
A QEMU patch that demonstrates the usage of the new attribute is
available at:
https://lore.kernel.org/qemu-devel/20260225-kvm-v2-1-b8d743db0f73@rsg.ci.i.u-tokyo.ac.jp/
("[PATCH RFC v2] target/arm/kvm: Choose PMU backend")
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Changes in v7:
- Fixed the vCPU run hang in test_fixed_counters_only().
- Link to v6: https://lore.kernel.org/r/20260413-hybrid-v6-0-e79d760f7f1b@rsg.ci.i.u-tokyo.ac.jp
Changes in v6:
- Removed WARN_ON_ONCE() in kvm_pmu_create_perf_event(). It can be
triggered in kvm_arch_vcpu_load() before it checks supported_cpus.
- Removed an extra lockdep assertion in kvm_arm_pmu_v3_get_attr().
- Fixed error messages in test_fixed_counters_only().
- Fixed the vCPU run in test_fixed_counters_only().
- Link to v5: https://lore.kernel.org/r/20260411-hybrid-v5-0-b043b4d9f49e@rsg.ci.i.u-tokyo.ac.jp
Changes in v5:
- Rebased.
- Fixed the order to clear KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY in
kvm_arm_pmu_v3_set_pmu().
- Fixed the setting of KVM_ARM_VCPU_PMU_V3_IRQ in
test_fixed_counters_only().
- Changed to WARN_ON_ONCE() when kvm_pmu_probe_armpmu() returns NULL in
kvm_pmu_create_perf_event(), which is no longer supposed to happen.
- Link to v4: https://lore.kernel.org/r/20260317-hybrid-v4-0-bd62bcd48644@rsg.ci.i.u-tokyo.ac.jp
Changes in v4:
- Extracted kvm_pmu_enabled_counter_mask() into a separate patch.
- Added patch "KVM: arm64: PMU: Protect the list of PMUs with RCU".
- Merged KVM_REQ_CREATE_PMU into KVM_REQ_RELOAD_PMU.
- Added a check to avoid unnecessary KVM_REQ_RELOAD_PMU requests.
- Dropped the change to avoid setting kvm_arm_set_default_pmu() when
KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY is not set.
- Link to v3: https://lore.kernel.org/r/20260225-hybrid-v3-0-46e8fe220880@rsg.ci.i.u-tokyo.ac.jp
Changes in v3:
- Renamed the attribute to KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY.
- Changed to request the creation of perf counters when loading vCPU.
- Link to v2: https://lore.kernel.org/r/20250806-hybrid-v2-0-0661aec3af8c@rsg.ci.i.u-tokyo.ac.jp
Changes in v2:
- Added the KVM_ARM_VCPU_PMU_V3_COMPOSITION attribute to opt in to the
feature.
- Added code to handle overflow.
- Link to v1: https://lore.kernel.org/r/20250319-hybrid-v1-1-4d1ada10e705@daynix.com
---
Akihiko Odaki (4):
KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask()
KVM: arm64: PMU: Protect the list of PMUs with RCU
KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY
Documentation/virt/kvm/devices/vcpu.rst | 29 ++++
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/pmu-emul.c | 187 ++++++++++++++-------
include/kvm/arm_pmu.h | 2 +
.../selftests/kvm/arm64/vpmu_counter_access.c | 153 ++++++++++++++---
7 files changed, 292 insertions(+), 83 deletions(-)
---
base-commit: 94b4ae79ebb42a8a6f2124b4d4b033b15a98e4f9
change-id: 20250224-hybrid-01d5ff47edd2
Best regards,
--
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
^ permalink raw reply
* [PATCH v7 2/4] KVM: arm64: PMU: Protect the list of PMUs with RCU
From: Akihiko Odaki @ 2026-04-18 8:14 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260418-hybrid-v7-0-2bf39ad009bf@rsg.ci.i.u-tokyo.ac.jp>
Convert the list of PMUs to an RCU-protected list that has primitives to
avoid read-side contention.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
arch/arm64/kvm/pmu-emul.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 59ec96e09321..ef5140bbfe28 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -7,9 +7,9 @@
#include <linux/cpu.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
-#include <linux/list.h>
#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>
+#include <linux/rculist.h>
#include <linux/uaccess.h>
#include <asm/kvm_emulate.h>
#include <kvm/arm_pmu.h>
@@ -26,7 +26,6 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc);
bool kvm_supports_guest_pmuv3(void)
{
- guard(mutex)(&arm_pmus_lock);
return !list_empty(&arm_pmus);
}
@@ -808,7 +807,7 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
return;
entry->arm_pmu = pmu;
- list_add_tail(&entry->entry, &arm_pmus);
+ list_add_tail_rcu(&entry->entry, &arm_pmus);
}
static struct arm_pmu *kvm_pmu_probe_armpmu(void)
@@ -817,7 +816,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
struct arm_pmu *pmu;
int cpu;
- guard(mutex)(&arm_pmus_lock);
+ guard(rcu)();
/*
* It is safe to use a stale cpu to iterate the list of PMUs so long as
@@ -837,7 +836,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
* carried here.
*/
cpu = raw_smp_processor_id();
- list_for_each_entry(entry, &arm_pmus, entry) {
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
pmu = entry->arm_pmu;
if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
@@ -1088,9 +1087,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
int ret = -ENXIO;
lockdep_assert_held(&kvm->arch.config_lock);
- mutex_lock(&arm_pmus_lock);
+ guard(rcu)();
- list_for_each_entry(entry, &arm_pmus, entry) {
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
arm_pmu = entry->arm_pmu;
if (arm_pmu->pmu.type == pmu_id) {
if (kvm_vm_has_ran_once(kvm) ||
@@ -1106,7 +1105,6 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
}
}
- mutex_unlock(&arm_pmus_lock);
return ret;
}
--
2.53.0
^ permalink raw reply related
* [PATCH v7 4/4] KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY
From: Akihiko Odaki @ 2026-04-18 8:14 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260418-hybrid-v7-0-2bf39ad009bf@rsg.ci.i.u-tokyo.ac.jp>
Assert the following:
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY is unset at initialization.
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY can be set.
- Setting KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY for the first time
after setting an event filter results in EBUSY.
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY can be set again even if an
event filter has already been set.
- Setting KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY after running a VCPU
results in EBUSY.
- The existing test cases pass with
KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY set.
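
The set/get rules asserted above can be summarized with a small
userspace model. This is only a sketch of the checks visible in patch
3/4 (VM already run, event filter present before the first set); the
struct and function names are hypothetical and the model does not cover
other error paths such as PMUv3 being unsupported.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the VM state that the attribute handler checks. */
struct vm_model {
	bool has_run;    /* a vCPU has already run */
	bool has_filter; /* a PMU event filter has been installed */
	bool fixed_only; /* FIXED_COUNTERS_ONLY is currently set */
};

/* Setting fails once the VM has run, or when a filter was installed
 * before the attribute was set for the first time. */
static int model_set_fixed_counters_only(struct vm_model *vm)
{
	if (vm->has_run || (vm->has_filter && !vm->fixed_only))
		return -EBUSY;

	vm->fixed_only = true;
	return 0;
}

/* Getting succeeds only when the attribute has been set. */
static int model_get_fixed_counters_only(const struct vm_model *vm)
{
	return vm->fixed_only ? 0 : -ENXIO;
}
```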
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
.../selftests/kvm/arm64/vpmu_counter_access.c | 153 +++++++++++++++++----
1 file changed, 127 insertions(+), 26 deletions(-)
diff --git a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
index ae36325c022f..0ed0a8513b03 100644
--- a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
@@ -403,12 +403,7 @@ static void create_vpmu_vm(void *guest_code)
{
struct kvm_vcpu_init init;
uint8_t pmuver, ec;
- uint64_t dfr0, irq = 23;
- struct kvm_device_attr irq_attr = {
- .group = KVM_ARM_VCPU_PMU_V3_CTRL,
- .attr = KVM_ARM_VCPU_PMU_V3_IRQ,
- .addr = (uint64_t)&irq,
- };
+ uint64_t dfr0;
/* The test creates the vpmu_vm multiple times. Ensure a clean state */
memset(&vpmu_vm, 0, sizeof(vpmu_vm));
@@ -434,8 +429,6 @@ static void create_vpmu_vm(void *guest_code)
TEST_ASSERT(pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF &&
pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP,
"Unexpected PMUVER (0x%x) on the vCPU with PMUv3", pmuver);
-
- vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &irq_attr);
}
static void destroy_vpmu_vm(void)
@@ -461,15 +454,30 @@ static void run_vcpu(struct kvm_vcpu *vcpu, uint64_t pmcr_n)
}
}
-static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters, bool expect_fail)
+static void guest_code_done(void)
+{
+ GUEST_DONE();
+}
+
+static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters,
+ bool fixed_counters_only,
+ bool expect_fail)
{
struct kvm_vcpu *vcpu;
unsigned int prev;
int ret;
+ uint64_t irq = 23;
create_vpmu_vm(guest_code);
vcpu = vpmu_vm.vcpu;
+ if (fixed_counters_only)
+ vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_IRQ, &irq);
+
prev = get_pmcr_n(vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0)));
ret = __vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
@@ -489,15 +497,15 @@ static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters, bool
* Create a guest with one vCPU, set the PMCR_EL0.N for the vCPU to @pmcr_n,
* and run the test.
*/
-static void run_access_test(uint64_t pmcr_n)
+static void run_access_test(uint64_t pmcr_n, bool fixed_counters_only)
{
uint64_t sp;
struct kvm_vcpu *vcpu;
struct kvm_vcpu_init init;
- pr_debug("Test with pmcr_n %lu\n", pmcr_n);
+ pr_debug("Test with pmcr_n %lu, fixed_counters_only %d\n", pmcr_n, fixed_counters_only);
- test_create_vpmu_vm_with_nr_counters(pmcr_n, false);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, false);
vcpu = vpmu_vm.vcpu;
/* Save the initial sp to restore them later to run the guest again */
@@ -531,14 +539,14 @@ static struct pmreg_sets validity_check_reg_sets[] = {
* Create a VM, and check if KVM handles the userspace accesses of
* the PMU register sets in @validity_check_reg_sets[] correctly.
*/
-static void run_pmregs_validity_test(uint64_t pmcr_n)
+static void run_pmregs_validity_test(uint64_t pmcr_n, bool fixed_counters_only)
{
int i;
struct kvm_vcpu *vcpu;
uint64_t set_reg_id, clr_reg_id, reg_val;
uint64_t valid_counters_mask, max_counters_mask;
- test_create_vpmu_vm_with_nr_counters(pmcr_n, false);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, false);
vcpu = vpmu_vm.vcpu;
valid_counters_mask = get_counters_mask(pmcr_n);
@@ -588,11 +596,11 @@ static void run_pmregs_validity_test(uint64_t pmcr_n)
* the vCPU to @pmcr_n, which is larger than the host value.
* The attempt should fail as @pmcr_n is too big to set for the vCPU.
*/
-static void run_error_test(uint64_t pmcr_n)
+static void run_error_test(uint64_t pmcr_n, bool fixed_counters_only)
{
pr_debug("Error test with pmcr_n %lu (larger than the host)\n", pmcr_n);
- test_create_vpmu_vm_with_nr_counters(pmcr_n, true);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, true);
destroy_vpmu_vm();
}
@@ -622,22 +630,115 @@ static bool kvm_supports_nr_counters_attr(void)
return supported;
}
-int main(void)
+static void test_config(uint64_t pmcr_n, bool fixed_counters_only)
{
- uint64_t i, pmcr_n;
-
- TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
- TEST_REQUIRE(kvm_supports_vgic_v3());
- TEST_REQUIRE(kvm_supports_nr_counters_attr());
+ uint64_t i;
- pmcr_n = get_pmcr_n_limit();
for (i = 0; i <= pmcr_n; i++) {
- run_access_test(i);
- run_pmregs_validity_test(i);
+ run_access_test(i, fixed_counters_only);
+ run_pmregs_validity_test(i, fixed_counters_only);
}
for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
- run_error_test(i);
+ run_error_test(i, fixed_counters_only);
+}
+
+static void test_fixed_counters_only(void)
+{
+ struct kvm_pmu_event_filter filter = { .nevents = 0 };
+ struct kvm_vm *vm;
+ struct kvm_vcpu *running_vcpu;
+ struct kvm_vcpu *stopped_vcpu;
+ struct kvm_vcpu_init init;
+ int ret;
+ uint64_t irq = 23;
+
+ create_vpmu_vm(guest_code);
+ ret = __vcpu_has_device_attr(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY);
+ if (ret) {
+ TEST_ASSERT(ret == -1 && errno == ENXIO,
+ KVM_IOCTL_ERROR(KVM_HAS_DEVICE_ATTR, ret));
+ destroy_vpmu_vm();
+ return;
+ }
+
+ /* Assert that FIXED_COUNTERS_ONLY is unset at initialization. */
+ ret = __vcpu_device_attr_get(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == ENXIO,
+ KVM_IOCTL_ERROR(KVM_GET_DEVICE_ATTR, ret));
+
+ /* Assert that setting FIXED_COUNTERS_ONLY succeeds. */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ /* Assert that getting FIXED_COUNTERS_ONLY succeeds. */
+ vcpu_device_attr_get(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY again succeeds even if an
+ * event filter has already been set.
+ */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FILTER, &filter);
+
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ destroy_vpmu_vm();
+
+ create_vpmu_vm(guest_code);
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY results in EBUSY if an event
+ * filter has already been set while FIXED_COUNTERS_ONLY has not.
+ */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FILTER, &filter);
+
+ ret = __vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == EBUSY,
+ KVM_IOCTL_ERROR(KVM_SET_DEVICE_ATTR, ret));
+
+ destroy_vpmu_vm();
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY after running a VCPU results
+ * in EBUSY.
+ */
+ vm = vm_create(2);
+ vm_ioctl(vm, KVM_ARM_PREFERRED_TARGET, &init);
+ init.features[0] |= (1 << KVM_ARM_VCPU_PMU_V3);
+ running_vcpu = aarch64_vcpu_add(vm, 0, &init, guest_code_done);
+ stopped_vcpu = aarch64_vcpu_add(vm, 1, &init, guest_code_done);
+ kvm_arch_vm_finalize_vcpus(vm);
+ vcpu_device_attr_set(running_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_IRQ, &irq);
+ vcpu_device_attr_set(running_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_INIT, NULL);
+ vcpu_run(running_vcpu);
+
+ ret = __vcpu_device_attr_set(stopped_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == EBUSY,
+ KVM_IOCTL_ERROR(KVM_SET_DEVICE_ATTR, ret));
+
+ kvm_vm_free(vm);
+
+ test_config(0, true);
+}
+
+int main(void)
+{
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
+ TEST_REQUIRE(kvm_supports_vgic_v3());
+ TEST_REQUIRE(kvm_supports_nr_counters_attr());
+
+ test_config(get_pmcr_n_limit(), false);
+ test_fixed_counters_only();
return 0;
}
--
2.53.0