* [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains
@ 2025-07-06 15:16 Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 01/12] landlock: Set the max rules limit in a domain to U16_MAX Tingmao Wang
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

Hi,

This implements the proposed data structure from the last discussion on
hashtable domains [1], in which we store all the rules and layers of a
domain in one flat array.  However, instead of doing a binary search, we
use a hashtable lookup, which has performance benefits.  The goal here is
both to improve performance and to reduce Landlock's memory footprint as
much as we reasonably can.


struct landlock_domain
----------------------

This patch set creates entirely new structs to use for a merged domain,
instead of using landlock_ruleset and landlock_rule (although we still use
landlock_layer).

Since we are no longer allocating an individual struct landlock_rule for
each rule, and also because this representation is more compact and does
not use any rbtree pointers, this should reduce the memory usage of a
merged domain.  In the current implementation, even if a rule only has one
layer, its struct landlock_rule effectively takes up 64 bytes due to
kmalloc placing the allocation in the next power-of-2 bucket.
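
As a rough illustration (field sizes here are assumptions based on a
64-bit build, not measured):

    struct landlock_rule {                  /* approximate layout */
            struct rb_node node;            /* 24 bytes */
            union landlock_key key;         /*  8 bytes */
            u32 num_layers;                 /*  4 bytes */
            struct landlock_layer layers[]; /* ~4 bytes per layer */
    };

    /* One layer: ~40 bytes requested, rounded up to the kmalloc-64 bucket. */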

While this is not completely done in this series yet, the hope is that we
can remove all code that deals with the old landlock_ruleset domain, and
simplify the landlock_ruleset and landlock_rule structs, since they would
then only need to represent an unmerged ruleset, which contains a single
layer.  This series already removes some existing ruleset merging code.


Hashtable implementation
------------------------

This patch set implements a "coalesced hashtable", which uses an array
instead of linked lists and stores collisions in the array itself, placing
them in "unused" slots (since the existence of collisions necessarily
means that some hashes are not used).  A more detailed explanation of this
algorithm is included in patches 3 and 5.
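
As a small worked example (invented values): with a 4-slot table and
hash(key) = key % 4, inserting keys 1, 5 and 2 could produce:

    slot 0: empty
    slot 1: key=5, next_collision=3   (head of the chain for hash 1)
    slot 2: key=2, next_collision=2   (points to itself: end of chain)
    slot 3: key=1, next_collision=3   (points to itself: end of chain)

Keys 1 and 5 both hash to slot 1, so one of them lives in a free slot (3)
and is linked from slot 1, instead of in a separately allocated list node.
A lookup starts at the slot the key hashes to and follows next_collision
until a match or the end of the chain.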

The hashtable we've implemented here is an immutable-after-construction
table (technically one could probably still append to it with some care,
but in principle it should be construct-once), which fits the use case for
landlock: merge a domain once, then serve fast read-only queries.

Some research did not turn up any existing implementation of such a
coalesced hashtable in the kernel, so this series introduces new code for
this algorithm.  Currently it's placed within security/landlock, but it is
written in a generic enough way that if somebody else wants to use it (for
example, current users of binary search on a fixed array?), they can
probably do so relatively easily (aside from needing to move the header
outside of landlock).

Testing
-------

All selftests pass under UML.

I plan to implement KUnit tests for the hashtable (and maybe also the
domain) in the next version.


Benchmark overview
------------------

I ran before/after benchmarks using two workloads, on the same machine
and setup:

    1. run_microbench with different depths and numbers of extra rules,
    using the code in [2]

    2. A more "typical" workload I put together quickly, with 18
    reasonably logical rules, calling git status and the like repeatedly
    [3].
    (I've consistently used this same workload for benchmarking previous
    performance improvement patches, to reduce the chances of accidental
    cherry-picking.)

Results for the "typical" workload:

Comparing:                    before    after
  landlock_overhead:    avg = 34        34
                     median = 35        34          (-1)
  landlock_hook:        avg = 878       856   (ns)  (-2.5%)
                     median = 854       831   (ns)  (-2.7%)
  open_syscall:         avg = 2517      2485  (ns)  (-1.3%)
                     median = 2457      2425  (ns)  (-1.3%)

Results for a 100-rule test with 10 directories to walk upwards:

with landlock: d = /1/2/3/4/5/6/7/8/9/ nb_extra_rules = 100:
  landlock_overhead:    avg = 15     15
                     median = 17     16          (-1)
  landlock_hook:        avg = 832    785   (ns)  (-5.6%)
                     median = 826    776   (ns)  (-6.1%)
  open_syscall:         avg = 5163   5001  (ns)  (-3.1%)
                     median = 4763   4847  (ns)  (+1.8%)

Note that the 100-rule benchmark has quite variable performance, and the
testing method probably means that most of the time is spent in VFS
lookup.  (Mickaël has given some suggestions for improvement which I've
yet to implement.)

The full results and .config used (basically Debian) can be found at
https://fileshare.maowtm.org/landlock/20250706/index.html


Outstanding TODOs
-----------------

- selftests for the coalesced hash table, and maybe also for the domain
- simplify struct landlock_ruleset and struct landlock_rule since they now
  only need to deal with one layer
- Using the name "layer" to refer to individual struct landlock_layers is
  very confusing, especially with names like num_layers - the next version
  should probably find a better name for it.


[1]: https://lore.kernel.org/all/20250526.quec3Dohsheu@digikod.net/
[2]: https://github.com/landlock-lsm/landlock-test-tools/pull/17
[3]: https://github.com/micromaomao/linux-dev/commit/f1865ce970af97ac3b6f4edf580529b8cdc66371

Patch based on mic/next (v6.16-rc2+)

Closes: https://github.com/landlock-lsm/linux/issues/1

Changes since v1:
- Entirely replaced the hlist-based hashtable with an array-based one.
- This time added support for net rules too.

v1: https://lore.kernel.org/all/cover.1747836146.git.m@maowtm.org/

Tingmao Wang (12):
  landlock: Set the max rules limit in a domain to U16_MAX.
  landlock/domain: Define structure and macros for flat-array domains
  landlock: Define coalesced hashtable and implement finding
  landlock/domain: Implement finding rules
  landlock/coalesced_hash: Implement insert
  landlock/domain: Implement merging of a parent domain and a ruleset
  landlock/domain: Define alloc and free
  landlock/domain: Add landlock_domain_merge_ruleset
  landlock: Update various code to use landlock_domain
  landlock: Remove unused code
  landlock/task: Fix incorrect BUILD_BUG_ON() in domain_is_scoped
  landlock: Use a hash function that does not involve division

 security/landlock/audit.c          |   8 +-
 security/landlock/coalesced_hash.h | 399 +++++++++++++++++
 security/landlock/cred.c           |  12 +-
 security/landlock/cred.h           |  14 +-
 security/landlock/domain.c         | 681 +++++++++++++++++++++++++++++
 security/landlock/domain.h         | 342 ++++++++++++++-
 security/landlock/fs.c             |  34 +-
 security/landlock/limits.h         |   2 +-
 security/landlock/net.c            |  12 +-
 security/landlock/ruleset.c        | 319 +-------------
 security/landlock/ruleset.h        |  71 +--
 security/landlock/syscalls.c       |   8 +-
 security/landlock/task.c           |  31 +-
 13 files changed, 1499 insertions(+), 434 deletions(-)
 create mode 100644 security/landlock/coalesced_hash.h


base-commit: 86fdfbade8bb09ce2be2ff334c743fe19815ceb2
-- 
2.49.0

* [RFC PATCH v2 01/12] landlock: Set the max rules limit in a domain to U16_MAX.
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 02/12] landlock/domain: Define structure and macros for flat-array domains Tingmao Wang
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

On Mon, 2 Jun 2025 at 21:50:05 +0200, Mickaël Salaün wrote [1]:
> Correct.  We can either use u64 or reduce the maximum number of rules.
> I think LANDLOCK_MAX_NUM_RULES set to U16_MAX would be much more than
> the worse practical case.  Even if one buggy policy tries to add one
> rule per network port, that will be OK.  We could even reasonably test
> this limit.  We'll need to backport this change but I'm OK with that.

Given the way this series is implemented, we don't actually _need_ to
decrease this limit: we could store a u64 instead of a u32 as the layer
index, and this would not change the size of landlock_domain_index on
64-bit systems.  But given the agreement with Mickaël, I will reduce it
anyway here.

Note that a limit of 2^24 would still leave us with more than enough room
even for u32 indices, but for future-proofing, this patch sets it to
U16_MAX.

Link: https://lore.kernel.org/all/20250602.uBai6ge5maiw@digikod.net/ [1]

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/limits.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 65b5ff051674..8e7a8816cce2 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -17,7 +17,7 @@
 /* clang-format off */
 
 #define LANDLOCK_MAX_NUM_LAYERS		16
-#define LANDLOCK_MAX_NUM_RULES		U32_MAX
+#define LANDLOCK_MAX_NUM_RULES		U16_MAX
 
 #define LANDLOCK_LAST_ACCESS_FS		LANDLOCK_ACCESS_FS_IOCTL_DEV
 #define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
-- 
2.49.0


* [RFC PATCH v2 02/12] landlock/domain: Define structure and macros for flat-array domains
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 01/12] landlock: Set the max rules limit in a domain to U16_MAX Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 03/12] landlock: Define coalesced hashtable and implement finding Tingmao Wang
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

This implements the structure proposed in [1], using a flat array to
store the rules and eventually using hashing to find rules.  The array is
stored in the domain struct itself to avoid extra pointer indirection and
make all the rule data as cache-local as possible.  The non-array part of
the domain struct is also kept reasonably small.  This works well for a
small (10 or 20 rules) ruleset, which is the common case for Landlock
users, and still has reasonable performance for large ones.

This will eventually make landlock_rule/landlock_ruleset needed only for
unmerged rulesets, and thus they won't have to store multiple layers etc.
create_rule and insert_rule would also hopefully become much simpler.

Unlike the original proposal, the number of layers for each rule is now
deduced from the layer index of the next index entry.  In order to
simplify the logic, a special "terminating index" is placed after each of
the two index arrays, which will contain a layer_index = num_layers.
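
For illustration (values invented), the fs index array and its terminating
index might end up looking like:

	fs_indices[0].layer_index = 0         /* rule A: layers 0-1 */
	fs_indices[1].layer_index = 2         /* rule B: layer  2   */
	fs_indices[2].layer_index = 3         /* rule C: layers 3-4 */
	terminating_fs_index.layer_index = 5  /* == num_fs_layers   */

so the number of layers for the rule in slot i is simply
indices[i + 1].layer_index - indices[i].layer_index, with no branch needed
for the last slot.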

On reflection, using the name "layer" to refer to individual struct
landlock_layers is very confusing, especially with names like num_layers -
the next version should probably find a better name for it.

Link: https://lore.kernel.org/all/20250526.quec3Dohsheu@digikod.net/ [1]

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c |   8 +++
 security/landlock/domain.h | 125 +++++++++++++++++++++++++++++++++++++
 2 files changed, 133 insertions(+)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index a647b68e8d06..83233bf3be6a 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -24,6 +24,14 @@
 #include "domain.h"
 #include "id.h"
 
+static void __maybe_unused build_check_domain(void)
+{
+	BUILD_BUG_ON(LANDLOCK_MAX_NUM_RULES >= U32_MAX - 1);
+	/* Non-inclusive end indices are involved, so stay below U32_MAX - 1. */
+	BUILD_BUG_ON(LANDLOCK_MAX_NUM_RULES * LANDLOCK_MAX_NUM_LAYERS >=
+		     U32_MAX - 1);
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 7fb70b25f85a..b0f5ba59ff4c 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -5,6 +5,7 @@
  * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
  * Copyright © 2018-2020 ANSSI
  * Copyright © 2024-2025 Microsoft Corporation
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
  */
 
 #ifndef _SECURITY_LANDLOCK_DOMAIN_H
@@ -20,6 +21,130 @@
 
 #include "access.h"
 #include "audit.h"
+#include "ruleset.h"
+
+struct landlock_domain_index {
+	/**
+	 * @key: The landlock object or port identifier.
+	 */
+	union landlock_key key;
+	/**
+	 * @next_collision: Points to another slot in the domain indices
+	 * array, forming a collision chain in a coalesced hashtable.  See
+	 * landlock_domain_find for how this is used.
+	 */
+	u32 next_collision;
+	/**
+	 * @layer_index: The index of the first landlock_layer corresponding
+	 * to this key in the relevant subarray.  A rule may have multiple
+	 * layers.  The end of the layers region for this rule is the index of
+	 * the next struct landlock_domain_index in the array.
+	 */
+	u32 layer_index;
+};
+
+struct landlock_domain {
+	/**
+	 * @num_layers: Number of layers in this domain.  This enables to
+	 * check that all the layers allow an access request.
+	 */
+	u32 num_layers;
+	/**
+	 * @num_fs_indices: Number of non-overlapping (i.e. not for the same
+	 * object) inode rules.  Does not include the terminating index.
+	 */
+	u32 num_fs_indices;
+	/**
+	 * @num_net_indices: Number of non-overlapping (i.e. not for the same
+	 * port) network rules.  Does not include the terminating index.
+	 */
+	u32 num_net_indices;
+	/**
+	 * @num_fs_layers: Number of landlock_layer in the fs_layers array.
+	 */
+	u32 num_fs_layers;
+	/**
+	 * @num_net_layers: Number of landlock_layer in the net_layers array.
+	 */
+	u32 num_net_layers;
+	/**
+	 * @len_rules: Total length (in units of uintptr_t) of the rules
+	 * array.  Used to check accesses are not out of bounds, but in theory
+	 * this is always derivable from the other length fields.
+	 */
+	u32 len_rules;
+	/**
+	 * @rules: The rest of this struct consists of 5 dynamically-sized
+	 * arrays as well as 2 terminating indices, placed one after another,
+	 * the contents of which are to be accessed with dom_ helper macros
+	 * defined in this header.  They are:
+	 *
+	 *     struct access_masks access_masks[num_layers];
+	 *     (possible alignment padding here)
+	 *     struct landlock_domain_index fs_indices[num_fs_indices];
+	 *     struct landlock_domain_index terminating_fs_index;
+	 *     struct landlock_domain_index net_indices[num_net_indices];
+	 *     struct landlock_domain_index terminating_net_index;
+	 *     struct landlock_layer fs_layers[num_fs_layers];
+	 *     struct landlock_layer net_layers[num_net_layers];
+	 *     (possible alignment padding here)
+	 *
+	 * The purpose of the terminating indices is to allow getting the
+	 * non-inclusive end index of the layers for a rule without branching.
+	 * They do not represent any rules themselves, and the only valid
+	 * field for those two indices is layer_index.
+	 */
+	uintptr_t rules[] __counted_by(len_rules);
+};
+
+#define dom_access_masks(dom) ((struct access_masks *)((dom)->rules))
+
+#define _dom_fs_indices_offset(dom)                                        \
+	(ALIGN(array_size((dom)->num_layers, sizeof(struct access_masks)), \
+	       sizeof(uintptr_t)))
+
+#define dom_fs_indices(dom)                                      \
+	((struct landlock_domain_index *)((char *)(dom)->rules + \
+					  _dom_fs_indices_offset(dom)))
+
+#define dom_fs_terminating_index(dom) \
+	(&dom_fs_indices(dom)[(dom)->num_fs_indices])
+
+#define _dom_net_indices_offset(dom)                       \
+	(_dom_fs_indices_offset(dom) +                     \
+	 array_size(((size_t)(dom)->num_fs_indices) + 1ul, \
+		    sizeof(struct landlock_domain_index)))
+
+#define dom_net_indices(dom)                                     \
+	((struct landlock_domain_index *)((char *)(dom)->rules + \
+					  _dom_net_indices_offset(dom)))
+
+#define dom_net_terminating_index(dom) \
+	(&dom_net_indices(dom)[(dom)->num_net_indices])
+
+#define _dom_fs_layers_offset(dom)                          \
+	(_dom_net_indices_offset(dom) +                     \
+	 array_size((size_t)((dom)->num_net_indices) + 1ul, \
+		    sizeof(struct landlock_domain_index)))
+
+#define dom_fs_layers(dom)                                \
+	((struct landlock_layer *)((char *)(dom)->rules + \
+				   _dom_fs_layers_offset(dom)))
+
+#define _dom_net_layers_offset(dom)   \
+	(_dom_fs_layers_offset(dom) + \
+	 array_size((dom)->num_fs_layers, sizeof(struct landlock_layer)))
+
+#define dom_net_layers(dom)                               \
+	((struct landlock_layer *)((char *)(dom)->rules + \
+				   _dom_net_layers_offset(dom)))
+
+#define dom_rules_len(dom)                                        \
+	(ALIGN(_dom_net_layers_offset(dom) +                      \
+		       array_size((dom)->num_net_layers,          \
+				  sizeof(struct landlock_layer)), \
+	       sizeof(uintptr_t)) /                               \
+	 sizeof(uintptr_t))
 
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
-- 
2.49.0


* [RFC PATCH v2 03/12] landlock: Define coalesced hashtable and implement finding
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 01/12] landlock: Set the max rules limit in a domain to U16_MAX Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 02/12] landlock/domain: Define structure and macros for flat-array domains Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 04/12] landlock/domain: Implement finding rules Tingmao Wang
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

This commit introduces utility functions for handling a (generic) compact
coalesced hash table, which we will use in the next commit.

I decided to make it generic for now but we can make it landlock-specific
if we want.
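
For illustration, a hypothetical user outside landlock (all names here are
invented) could instantiate it roughly like this:

	struct my_elem {
		union { uintptr_t data; } key; /* compare assumes key.data */
		u32 next;
	};

	/* Assuming key.data == 0 marks an unused slot. */
	DEFINE_COALESCED_HASH_TABLE(struct my_elem, my_table, key, next,
				    elem->key.data % table_size,
				    elem->key.data == 0)

	/* my_table_find(table, table_size, &needle) then returns a pointer
	 * to the matching element, or NULL. */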

This should include a randomized unit test - I will add this in the next
version.

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/coalesced_hash.h | 130 +++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 security/landlock/coalesced_hash.h

diff --git a/security/landlock/coalesced_hash.h b/security/landlock/coalesced_hash.h
new file mode 100644
index 000000000000..14950915f0b5
--- /dev/null
+++ b/security/landlock/coalesced_hash.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock - utility functions for handling a compact coalesced hash
+ * tables.
+ *
+ * A coalesced hash table is an array (typically completely filled), where
+ * each element has a next_collision field that contains the index to the
+ * next collision in the chain.  If there is no next collision, the
+ * next_collision field is set to the index of the element itself.  A
+ * search for a particular key starts at the index that it hashes to, then
+ * we follow the chain of collisions until the key is found or we reach
+ * the end of the chain.  Before starting the collision chain loop, if we
+ * find that the element at the index we hashed to does not in fact hash
+ * to its index, then we know that there are no elements with our hash, and
+ * so we can terminate early.
+ *
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
+ */
+
+#include <linux/types.h>
+#include <linux/mm.h>
+
+typedef u32 h_index_t;
+
+typedef h_index_t (*hash_element_t)(const void *elem, h_index_t table_size);
+typedef h_index_t (*get_next_collision_t)(const void *elem);
+typedef void (*set_next_collision_t)(void *elem, h_index_t next_collision);
+typedef bool (*compare_element_t)(const void *key_elem, const void *found_elem);
+typedef bool (*element_is_empty_t)(const void *elem);
+
+static inline void *h_find(const void *table, h_index_t table_size,
+			   size_t elem_size, const void *elem_to_find,
+			   hash_element_t hash_elem,
+			   get_next_collision_t get_next_collision,
+			   compare_element_t compare_elem,
+			   element_is_empty_t element_is_empty)
+{
+	h_index_t curr_index, next_collision, next_hash;
+	const void *curr_elem;
+
+	if (unlikely(table_size == 0))
+		return NULL;
+
+	curr_index = hash_elem(elem_to_find, table_size);
+	if (WARN_ON_ONCE(curr_index >= table_size))
+		return NULL;
+	curr_elem = table + curr_index * elem_size;
+	if (compare_elem(elem_to_find, curr_elem))
+		return (void *)curr_elem;
+
+	if (element_is_empty(curr_elem))
+		return NULL;
+	next_collision = get_next_collision(curr_elem);
+	if (next_collision == curr_index)
+		return NULL;
+	next_hash = hash_elem(curr_elem, table_size);
+	if (next_hash != curr_index)
+		return NULL;
+
+	while (next_collision != curr_index) {
+		curr_index = next_collision;
+		curr_elem = table + curr_index * elem_size;
+		if (compare_elem(elem_to_find, curr_elem))
+			return (void *)curr_elem;
+		next_collision = get_next_collision(curr_elem);
+	}
+
+	return NULL;
+}
+
+/**
+ * DEFINE_COALESCED_HASH_TABLE - Define a set of functions to manipulate a
+ * coalesced hash table holding elements of type @elem_type.
+ *
+ * @elem_type: The type of the elements.
+ * @table_func_prefix: The prefix to use for the functions.
+ * @key_member: The name of a member in @elem_type that contains the key
+ * (to compare for equality).
+ * @next_collision_member: The name of a member in @elem_type that is used
+ * to store the index of the next collision in a collision chain.
+ * @hash_expr: An expression that computes the hash of an element, given
+ * const @elem_type *elem and h_index_t table_size.  If this function is
+ * evaluated, table_size is always positive.
+ * @is_empty_expr: An expression that evaluates to true if the element is
+ * empty (i.e. not used).  Empty elements are not returned by find.  If
+ * the zero value of @elem_type is not "empty", the caller must set all
+ * the slots to empty before using the table.
+ */
+#define DEFINE_COALESCED_HASH_TABLE(elem_type, table_func_prefix, key_member, \
+				    next_collision_member, hash_expr,         \
+				    is_empty_expr)                            \
+	static inline h_index_t table_func_prefix##_hash_elem(                \
+		const void *_elem, h_index_t table_size)                      \
+	{                                                                     \
+		const elem_type *elem = _elem;                                \
+		return hash_expr;                                             \
+	}                                                                     \
+	static inline h_index_t table_func_prefix##_get_next_collision(       \
+		const void *elem)                                             \
+	{                                                                     \
+		return ((const elem_type *)elem)->next_collision_member;      \
+	}                                                                     \
+	static inline void table_func_prefix##_set_next_collision(            \
+		void *elem, h_index_t next_collision)                         \
+	{                                                                     \
+		((elem_type *)elem)->next_collision_member = next_collision;  \
+	}                                                                     \
+	static inline bool table_func_prefix##_compare_elem(                  \
+		const void *key_elem, const void *found_elem)                 \
+	{                                                                     \
+		const elem_type *key = key_elem;                              \
+		const elem_type *found = found_elem;                          \
+		return key->key_member.data == found->key_member.data;        \
+	}                                                                     \
+	static inline bool table_func_prefix##_element_is_empty(              \
+		const void *_elem)                                            \
+	{                                                                     \
+		const elem_type *elem = _elem;                                \
+		return is_empty_expr;                                         \
+	}                                                                     \
+	static inline const elem_type *table_func_prefix##_find(              \
+		const elem_type *table, h_index_t table_size,                 \
+		const elem_type *elem_to_find)                                \
+	{                                                                     \
+		return h_find(table, table_size, sizeof(elem_type),           \
+			      elem_to_find, table_func_prefix##_hash_elem,    \
+			      table_func_prefix##_get_next_collision,         \
+			      table_func_prefix##_compare_elem,               \
+			      table_func_prefix##_element_is_empty);          \
+	}
-- 
2.49.0


* [RFC PATCH v2 04/12] landlock/domain: Implement finding rules
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (2 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 03/12] landlock: Define coalesced hashtable and implement finding Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 05/12] landlock/coalesced_hash: Implement insert Tingmao Wang
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

This implements a function to search for matching rules using the newly
defined coalesced hashtable, defines convenience macros for fs and net
respectively, and adds a macro to iterate over the layers of a found rule.
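
For illustration, a lookup could then look roughly like this (hypothetical
caller, not part of this patch):

	const struct landlock_layer *layer;
	struct landlock_found_rule found;
	const union landlock_key key = { .object = object };

	found = dom_find_index_fs(dom, key);
	if (dom_find_success(found)) {
		dom_rule_for_each_layer(found, layer) {
			/* layer->level and layer->access describe one layer */
		}
	}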

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.h | 67 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index b0f5ba59ff4c..8acd88a1d77a 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -22,6 +22,7 @@
 #include "access.h"
 #include "audit.h"
 #include "ruleset.h"
+#include "coalesced_hash.h"
 
 struct landlock_domain_index {
 	/**
@@ -146,6 +147,72 @@ struct landlock_domain {
 	       sizeof(uintptr_t)) /                               \
 	 sizeof(uintptr_t))
 
+/*
+ * We have to use an invalid layer_index to signal empty value as the key
+ * can be 0 for net rules.
+ */
+#define dom_index_is_empty(elem) ((elem)->layer_index == U32_MAX)
+
+DEFINE_COALESCED_HASH_TABLE(struct landlock_domain_index, dom_hash, key,
+			    next_collision,
+			    hash_long(elem->key.data, 32) % table_size,
+			    dom_index_is_empty(elem))
+
+struct landlock_found_rule {
+	const struct landlock_layer *layers_start;
+	const struct landlock_layer *layers_end;
+};
+
+/**
+ * landlock_domain_find - search for a key in a domain.  Don't use this
+ * function directly, but use one of the dom_find_index_*() macros
+ * instead.
+ *
+ * @indices_arr: The indices array to search in.
+ * @num_indices: The number of elements in @indices_arr.
+ * @layers_arr: The layers array.
+ * @num_layers: The number of elements in @layers_arr.
+ * @key: The key to search for.
+ */
+static inline struct landlock_found_rule
+landlock_domain_find(const struct landlock_domain_index *const indices_arr,
+		     const u32 num_indices,
+		     const struct landlock_layer *const layers_arr,
+		     const u32 num_layers, const union landlock_key key)
+{
+	struct landlock_domain_index key_elem = {
+		.key = key,
+	};
+	struct landlock_found_rule out_found_rule = {};
+	const struct landlock_domain_index *found;
+
+	found = dom_hash_find(indices_arr, num_indices, &key_elem);
+
+	if (found) {
+		if (WARN_ON_ONCE(found->layer_index >= num_layers))
+			return out_found_rule;
+		out_found_rule.layers_start = &layers_arr[found->layer_index];
+		out_found_rule.layers_end =
+			&layers_arr[(found + 1)->layer_index];
+	}
+
+	return out_found_rule;
+}
+
+#define dom_find_index_fs(dom, key)                                      \
+	landlock_domain_find(dom_fs_indices(dom), (dom)->num_fs_indices, \
+			     dom_fs_layers(dom), (dom)->num_fs_layers, key)
+
+#define dom_find_index_net(dom, key)                                       \
+	landlock_domain_find(dom_net_indices(dom), (dom)->num_net_indices, \
+			     dom_net_layers(dom), (dom)->num_net_layers, key)
+
+#define dom_find_success(found_rule) ((found_rule).layers_start != NULL)
+
+#define dom_rule_for_each_layer(found_rule, layer) \
+	for (layer = (found_rule).layers_start;    \
+	     layer < (found_rule).layers_end; layer++)
+
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
 	LANDLOCK_LOG_RECORDED,
-- 
2.49.0


* [RFC PATCH v2 05/12] landlock/coalesced_hash: Implement insert
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (3 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 04/12] landlock/domain: Implement finding rules Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 06/12] landlock/domain: Implement merging of a parent domain and a ruleset Tingmao Wang
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

This algorithm is a slight alteration of the one on Wikipedia at the time
of writing [1].  The difference is that when we are trying to insert into
a slot that is already in use (whether by an element that doesn't actually
belong there and is just in a collision chain of a different hash, or by
the head of a chain, which does have the correct hash), we move the
existing element away and insert the new element in its place.  The result
is that if there is some element in the hash table with a certain hash,
the slot corresponding to that hash will always be the slot that starts
the collision chain for that hash.  In other words, chains won't "mix",
and if we detect that the hash of the element at the slot we're targeting
is not correct, we know that the hash table does not contain the hash we
want.
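
As a worked sketch (4 slots, hash(k) = k % 4): insert key 4, which lands
in slot 0; then insert key 0, which also hashes to slot 0.  With this
alteration, key 4 is moved to a free slot (say slot 3), and key 0 takes
slot 0 as the new head of the chain, pointing at slot 3.  A lookup for any
key hashing to 0 therefore always starts its chain at slot 0, and can bail
out early whenever the element in the slot it hashed to has a different
hash.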

[1]: https://en.wikipedia.org/w/index.php?title=Coalesced_hashing&oldid=1214362866

This patch seems to have hit a checkpatch false positive:

	ERROR: need consistent spacing around '*' (ctx:WxV)
	#281: FILE: security/landlock/coalesced_hash.h:349:
	+               elem_type *table, h_index_t table_size)                       \
	                          ^

	ERROR: need consistent spacing around '*' (ctx:WxV)
	#288: FILE: security/landlock/coalesced_hash.h:356:
	+               struct h_insert_scratch *scratch, const elem_type *elem)      \
	                                                                  ^

Since this is kind of a niche use case, I will make a report only after
this series gets out of RFC (and if the errors still show up).

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/coalesced_hash.h | 265 +++++++++++++++++++++++++++++
 1 file changed, 265 insertions(+)

diff --git a/security/landlock/coalesced_hash.h b/security/landlock/coalesced_hash.h
index 14950915f0b5..56d32cf70d67 100644
--- a/security/landlock/coalesced_hash.h
+++ b/security/landlock/coalesced_hash.h
@@ -68,6 +68,256 @@ static inline void *h_find(const void *table, h_index_t table_size,
 	return NULL;
 }
 
+static inline void h_initialize(void *table, h_index_t table_size,
+				size_t elem_size,
+				set_next_collision_t set_next_collision,
+				element_is_empty_t element_is_empty)
+{
+	h_index_t i;
+	void *elem;
+
+	WARN_ON_ONCE(array_size(table_size, elem_size) == SIZE_MAX);
+
+	for (i = 0; i < table_size; i++) {
+		elem = table + i * elem_size;
+		set_next_collision(elem, i);
+	}
+}
+
+struct h_insert_scratch {
+	/**
+	 * @prev_index: For each slot which belongs in a collision chain,
+	 * stores the index of the previous element in the chain.  (The next
+	 * index in the chain is stored in the element itself.)
+	 */
+	h_index_t *prev_index;
+	/**
+	 * @next_free_index: This index moves from end of the table towards
+	 * the beginning.
+	 */
+	h_index_t next_free_index;
+
+	/*
+	 * The following members just help us avoid passing those arguments
+	 * around all the time.
+	 */
+	h_index_t table_size;
+	void *table;
+	size_t elem_size;
+};
+
+static inline int h_init_insert_scratch(struct h_insert_scratch *scratch,
+					void *table, h_index_t table_size,
+					size_t elem_size)
+{
+	h_index_t i;
+
+	if (table_size == 0) {
+		memset(scratch, 0, sizeof(*scratch));
+		scratch->table = table;
+		scratch->elem_size = elem_size;
+		return 0;
+	}
+
+	if (array_size(table_size, elem_size) == SIZE_MAX)
+		return -ENOMEM;
+
+	scratch->prev_index =
+		kcalloc(table_size, sizeof(h_index_t), GFP_KERNEL_ACCOUNT);
+	if (!scratch->prev_index)
+		return -ENOMEM;
+
+	for (i = 0; i < table_size; i++)
+		scratch->prev_index[i] = i;
+
+	scratch->table_size = table_size;
+	scratch->next_free_index = table_size - 1;
+	scratch->table = table;
+	scratch->elem_size = elem_size;
+	return 0;
+}
+
+static inline void h_free_insert_scratch(struct h_insert_scratch *scratch)
+{
+	if (!scratch)
+		return;
+
+	kfree(scratch->prev_index);
+	memset(scratch, 0, sizeof(*scratch));
+}
+
+static inline h_index_t
+__h_insert_get_next_free(struct h_insert_scratch *scratch,
+			 element_is_empty_t element_is_empty)
+{
+	h_index_t index_to_ret = scratch->next_free_index;
+	void *free_slot;
+
+	if (WARN_ON_ONCE(index_to_ret >= scratch->table_size)) {
+		/*
+		 * We used up all the available slots already.  This isn't
+		 * supposed to happen with correct usage.
+		 */
+		return 0;
+	}
+
+	free_slot = scratch->table + index_to_ret * scratch->elem_size;
+	while (!element_is_empty(free_slot)) {
+		if (WARN_ON_ONCE(index_to_ret == 0)) {
+			/* We unexpectedly used up all slots. */
+			return 0;
+		}
+		index_to_ret--;
+		free_slot = scratch->table + index_to_ret * scratch->elem_size;
+	}
+
+	if (index_to_ret == 0)
+		scratch->next_free_index = scratch->table_size;
+	else
+		scratch->next_free_index = index_to_ret - 1;
+
+	return index_to_ret;
+}
+
+/**
+ * __h_relocate_collision_slot - Moves the element at idx_to_move to a
+ * free slot, and makes the original slot free.  We will update any
+ * chain links (scratch->prev_index and target->next_collision).
+ *
+ * Returns the new index of the moved element.
+ */
+static inline h_index_t
+__h_relocate_collision_slot(struct h_insert_scratch *scratch,
+			    h_index_t idx_to_move,
+			    get_next_collision_t get_next_collision,
+			    set_next_collision_t set_next_collision,
+			    element_is_empty_t element_is_empty)
+{
+	h_index_t move_to = __h_insert_get_next_free(scratch, element_is_empty);
+	void *move_to_elem = scratch->table + move_to * scratch->elem_size;
+	void *move_target_elem =
+		scratch->table + idx_to_move * scratch->elem_size;
+	h_index_t old_next = get_next_collision(move_target_elem);
+	h_index_t old_prev = scratch->prev_index[idx_to_move];
+	void *old_prev_elem = scratch->table + old_prev * scratch->elem_size;
+
+	memcpy(move_to_elem, move_target_elem, scratch->elem_size);
+
+	/*
+	 * The logic here is essentially hlist_replace_rcu, except in the
+	 * hlist case the tail has next == NULL, whereas in our case the tail
+	 * has next set to itself.
+	 */
+
+	/*
+	 * If the move target already points to something else, that link
+	 * would have been memcpy'd across.
+	 */
+	if (old_next == idx_to_move)
+		/*
+		 * Need to fix the tail pointer - it points to itself.  Its own
+		 * prev is correct already.
+		 */
+		set_next_collision(move_to_elem, move_to);
+	else
+		/*
+		 * the next_collision would have been memcpy'd over, but we need to
+		 * fix that next element's prev
+		 */
+		scratch->prev_index[old_next] = move_to;
+
+	if (old_prev == idx_to_move)
+		/* The moved element is a head.  Fix its prev. */
+		scratch->prev_index[move_to] = move_to;
+	else {
+		/*
+		 * Need to make the moved element's prev point to it, and copy over
+		 * the prev pointer.
+		 */
+		set_next_collision(old_prev_elem, move_to);
+		scratch->prev_index[move_to] = old_prev;
+	}
+
+	scratch->prev_index[idx_to_move] = idx_to_move;
+	memset(move_target_elem, 0, scratch->elem_size);
+	set_next_collision(move_target_elem, idx_to_move);
+
+	return move_to;
+}
+
+static inline void h_insert(struct h_insert_scratch *scratch, const void *elem,
+			    hash_element_t hash_elem,
+			    get_next_collision_t get_next_collision,
+			    set_next_collision_t set_next_collision,
+			    element_is_empty_t element_is_empty)
+{
+	h_index_t target_idx, target_hash, moved_to;
+	void *target_elem;
+
+	if (WARN_ON_ONCE(!scratch->table || !scratch->table_size))
+		return;
+	if (WARN_ON_ONCE(scratch->next_free_index >= scratch->table_size))
+		/*
+		 * We used up all the available slots already.  This isn't
+		 * supposed to happen with correct usage.
+		 */
+		return;
+	if (WARN_ON_ONCE(element_is_empty(elem)))
+		return;
+
+	/*
+	 * The general logic here is basically that we always insert the new
+	 * element at its rightful place, but we move any existing element in
+	 * that place around.  Consider these cases:
+	 *
+	 * 1. target slot is empty - we can just insert the new element.
+	 *
+	 * 2. target slot is occupied by an element that is in a collision
+	 *    chain (but not the head).
+	 *    In this case, we can just move that existing element to a free
+	 *    slot, and insert the new element in its rightful place.  This
+	 *    will start a new chain (the fact that the target slot is not a
+	 *    head means that there is no existing chain with this hash).
+	 *
+	 * 3. target slot is occupied by the head of a chain (i.e. that
+	 *    existing element is already in its "rightful place").  In this
+	 *    case, we can still move that existing element to a free slot,
+	 *    and steal its current place to use for the new element.  The
+	 *    new element will become the new head of the chain, and will
+	 *    point to the existing element.
+	 */
+
+	target_idx = hash_elem(elem, scratch->table_size);
+	if (WARN_ON_ONCE(target_idx >= scratch->table_size))
+		return;
+	target_elem = scratch->table + target_idx * scratch->elem_size;
+
+	if (element_is_empty(target_elem)) {
+		/*
+		 * Simple case - just insert it.  scratch->prev_index is already
+		 * correctly initialized.
+		 */
+		memcpy(target_elem, elem, scratch->elem_size);
+		set_next_collision(target_elem, target_idx);
+	} else {
+		target_hash = hash_elem(target_elem, scratch->table_size);
+		moved_to = __h_relocate_collision_slot(scratch, target_idx,
+						       get_next_collision,
+						       set_next_collision,
+						       element_is_empty);
+		memcpy(target_elem, elem, scratch->elem_size);
+		if (target_hash == target_idx) {
+			/* We should be in the collision chain of the original target */
+			set_next_collision(target_elem, moved_to);
+			WARN_ON_ONCE(scratch->prev_index[moved_to] != moved_to);
+			scratch->prev_index[moved_to] = target_idx;
+		} else {
+			/* We are starting a new chain. */
+			set_next_collision(target_elem, target_idx);
+		}
+	}
+}
+
 /**
  * DEFINE_COALESCED_HASH_TABLE - Define a set of functions to manipulate a
  * coalesced hash table holding elements of type @elem_type.
@@ -127,4 +377,19 @@ static inline void *h_find(const void *table, h_index_t table_size,
 			      table_func_prefix##_get_next_collision,         \
 			      table_func_prefix##_compare_elem,               \
 			      table_func_prefix##_element_is_empty);          \
+	}                                                                     \
+	static inline void table_func_prefix##_initialize(                    \
+		elem_type *table, h_index_t table_size)                       \
+	{                                                                     \
+		h_initialize(table, table_size, sizeof(elem_type),            \
+			     table_func_prefix##_set_next_collision,          \
+			     table_func_prefix##_element_is_empty);           \
+	}                                                                     \
+	static inline void table_func_prefix##_insert(                        \
+		struct h_insert_scratch *scratch, const elem_type *elem)      \
+	{                                                                     \
+		h_insert(scratch, elem, table_func_prefix##_hash_elem,        \
+			 table_func_prefix##_get_next_collision,              \
+			 table_func_prefix##_set_next_collision,              \
+			 table_func_prefix##_element_is_empty);               \
 	}
-- 
2.49.0


* [RFC PATCH v2 06/12] landlock/domain: Implement merging of a parent domain and a ruleset
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (4 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 05/12] landlock/coalesced_hash: Implement insert Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 07/12] landlock/domain: Define alloc and free Tingmao Wang
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

This implements a 3-stage merge, generic over the type of rules (so that
it can be repeated for fs and net).  It contains a small refactor to
re-use the rbtree search code in ruleset.c.

3 passes are needed because, aside from calculating the size of the
arrays to allocate, we also need to first populate the index table before
we can write out the layers sequentially, as the indices are not written
in order.  Doing it this way means that one rule's layers end where the
next rule's layers start.
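
Roughly, the three passes map onto the functions added here (the actual
call site comes later in the series, in landlock_domain_merge_ruleset):

	/* Pass 1: count the indices and layers needed, checking limits. */
	dom_calculate_merged_sizes(...);
	/* Pass 2: insert all keys (the parent's, then any new ones) into
	 * the coalesced hashtable. */
	dom_populate_indices(...);
	/* Pass 3: walk the index array in order, writing each rule's
	 * layers contiguously and filling in layer_index. */
	dom_populate_layers(...);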

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c  | 218 +++++++++++++++++++++++++++++++++++-
 security/landlock/ruleset.c |  16 +--
 security/landlock/ruleset.h |  20 ++++
 3 files changed, 238 insertions(+), 16 deletions(-)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index 83233bf3be6a..89d36736a59c 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -5,6 +5,7 @@
  * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
  * Copyright © 2018-2020 ANSSI
  * Copyright © 2024-2025 Microsoft Corporation
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
  */
 
 #include <kunit/test.h>
@@ -24,7 +25,7 @@
 #include "domain.h"
 #include "id.h"
 
-static void __maybe_unused build_check_domain(void)
+static void build_check_domain(void)
 {
 	BUILD_BUG_ON(LANDLOCK_MAX_NUM_RULES >= U32_MAX - 1);
 	/* Non-inclusive end indices are involved, so stay below U32_MAX - 1. */
@@ -32,6 +33,221 @@ static void __maybe_unused build_check_domain(void)
 		     U32_MAX - 1);
 }
 
+/**
+ * dom_calculate_merged_sizes - Calculate the eventual size of the part of
+ * a new domain for a given rule type (i.e. either fs or net).  Correct
+ * usage requires the caller to hold the ruleset lock throughout the
+ * merge.  Returns -E2BIG if limits would be exceeded.
+ *
+ * @dom_ind_array: The index table in the parent domain for the relevant
+ * rule type.  Can be NULL if there is no parent domain.
+ * @dom_num_indices: The length of @dom_ind_array, or 0 if no parent
+ * domain.
+ * @dom_num_layers: The total number of distinct layer objects in the
+ * parent domain for the relevant rule type.
+ * @child_rules: The root of the rules tree to be merged.
+ * @out_num_indices: Outputs the number of indices that would be needed in
+ * the new domain.
+ * @out_num_layers: Outputs the number of layer objects that would be
+ * needed in the new domain.
+ */
+static int __maybe_unused dom_calculate_merged_sizes(
+	const struct landlock_domain_index *const dom_ind_array,
+	const u32 dom_num_indices, const u32 dom_num_layers,
+	const struct rb_root *const child_rules, u32 *const out_num_indices,
+	u32 *const out_num_layers)
+{
+	u32 num_indices = dom_num_indices;
+	u32 num_layers = dom_num_layers;
+	const struct landlock_rule *walker_rule, *next_rule;
+	struct landlock_domain_index find_key;
+	const struct landlock_domain_index *found;
+
+	build_check_domain();
+
+	rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+					     child_rules, node) {
+		found = NULL;
+		if (dom_ind_array) {
+			find_key.key = walker_rule->key;
+			found = dom_hash_find(dom_ind_array, dom_num_indices,
+					      &find_key);
+		}
+		/* A new index is only needed if this is a non-overlapping new rule */
+		if (!found) {
+			if (num_indices >= LANDLOCK_MAX_NUM_RULES)
+				return -E2BIG;
+			num_indices++;
+		}
+
+		/* Regardless, we have a new layer. */
+		if (WARN_ON_ONCE(num_layers >= U32_MAX - 1))
+			/*
+			 * This situation should not be possible with the proper rule
+			 * number limit.
+			 */
+			return -E2BIG;
+		num_layers++;
+	}
+
+	*out_num_indices = num_indices;
+	*out_num_layers = num_layers;
+	return 0;
+}
+
+/**
+ * dom_populate_indices - Populate the index table of a new domain.
+ *
+ * @dom_ind_array: The index table in the parent domain for the relevant
+ * rule type.  Can be NULL if there is no parent domain.
+ * @dom_num_indices: The length of @dom_ind_array, or 0 if no parent
+ * domain.
+ * @child_rules: The root of the rules tree to be merged.
+ * @out_indices: The output indices array to be populated.
+ * @out_size: The size of @out_indices, in number of elements.
+ */
+static int __maybe_unused dom_populate_indices(
+	const struct landlock_domain_index *const dom_ind_array,
+	const u32 dom_num_indices, const struct rb_root *const child_rules,
+	struct landlock_domain_index *const out_indices, const u32 out_size)
+{
+	u32 indices_written = 0;
+	const struct landlock_domain_index *walker_index;
+	struct landlock_domain_index target = {};
+	const struct landlock_rule *walker_rule, *next_rule;
+	const struct landlock_domain_index *found;
+	struct h_insert_scratch scratch;
+	int ret;
+	size_t i;
+
+	dom_hash_initialize(out_indices, out_size);
+	for (i = 0; i < out_size; i++) {
+		out_indices[i].key.data = 0;
+		out_indices[i].layer_index = U32_MAX;
+	}
+
+	ret = h_init_insert_scratch(&scratch, out_indices, out_size,
+				    sizeof(*out_indices));
+	if (ret)
+		return ret;
+
+	/* Copy over all parent indices directly */
+	for (size_t i = 0; i < dom_num_indices; i++) {
+		walker_index = &dom_ind_array[i];
+		if (WARN_ON_ONCE(indices_written >= out_size)) {
+			ret = -E2BIG;
+			goto out_free;
+		}
+		/* Don't copy over layer_index */
+		target.key = walker_index->key;
+		dom_hash_insert(&scratch, &target);
+		indices_written++;
+	}
+
+	/* Copy over new child rules */
+	rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+					     child_rules, node) {
+		target.key = walker_rule->key;
+		found = NULL;
+		if (dom_ind_array)
+			found = dom_hash_find(dom_ind_array, dom_num_indices,
+					      &target);
+		if (!found) {
+			if (WARN_ON_ONCE(indices_written >= out_size)) {
+				ret = -E2BIG;
+				goto out_free;
+			}
+			dom_hash_insert(&scratch, &target);
+			indices_written++;
+		}
+	}
+
+out_free:
+	h_free_insert_scratch(&scratch);
+
+	if (ret)
+		return ret;
+
+	for (i = 0; i < out_size; i++) {
+		walker_index = &out_indices[i];
+		/* We are not supposed to leave empty slots behind. */
+		WARN_ON_ONCE(dom_index_is_empty(walker_index));
+	}
+
+	return 0;
+}
+
+/**
+ * dom_populate_layers - Populate the layer array of a new domain.
+ *
+ * @dom_ind_array: The index table in the parent domain for the relevant
+ * rule type.  Can be NULL if there is no parent domain.
+ * @dom_num_indices: The length of @dom_ind_array, or 0 if no parent
+ * domain.
+ * @dom_layer_array: The layer array in the parent domain for the relevant
+ * rule type.  Can be NULL if there is no parent domain.
+ * @dom_num_layers: The length of @dom_layer_array, or 0 if no parent
+ * domain.
+ * @child_rules: The root of the rules tree to be merged.
+ * @child_indices: The already populated index table in the child ruleset
+ * to be merged.  This function will set the layer_index field on each
+ * index.
+ * @child_indices_size: The size of @child_indices, in number of elements.
+ * @new_level: The level number of any new layers that will be added to
+ * @out_layers.
+ * @out_layers: The output layers array to be populated.
+ * @out_size: The size of @out_layers, in number of elements.
+ */
+static int
+dom_populate_layers(const struct landlock_domain_index *const dom_ind_array,
+		    const u32 dom_num_indices,
+		    const struct landlock_layer *const dom_layer_array,
+		    const u32 dom_num_layers,
+		    const struct rb_root *const child_rules,
+		    struct landlock_domain_index *const child_indices,
+		    const u32 child_indices_size, const u32 new_level,
+		    struct landlock_layer *const out_layers, const u32 out_size)
+{
+	u32 layers_written = 0;
+	struct landlock_domain_index *merged_index;
+	struct landlock_found_rule found_in_parent;
+	const struct landlock_rule *found_in_child;
+	const struct landlock_layer *layer;
+	struct landlock_layer child_layer;
+
+	for (size_t i = 0; i < child_indices_size; i++) {
+		merged_index = &child_indices[i];
+		merged_index->layer_index = layers_written;
+
+		found_in_parent.layers_start = NULL;
+		found_in_parent.layers_end = NULL;
+		if (dom_ind_array)
+			found_in_parent = landlock_domain_find(
+				dom_ind_array, dom_num_indices, dom_layer_array,
+				dom_num_layers, merged_index->key);
+		dom_rule_for_each_layer(found_in_parent, layer)
+		{
+			if (WARN_ON_ONCE(layers_written >= out_size))
+				return -E2BIG;
+			out_layers[layers_written++] = *layer;
+		}
+
+		found_in_child =
+			landlock_find_in_tree(child_rules, merged_index->key);
+		if (found_in_child) {
+			if (WARN_ON_ONCE(layers_written >= out_size))
+				return -E2BIG;
+			if (WARN_ON_ONCE(found_in_child->num_layers != 1))
+				return -EINVAL;
+			child_layer.access = found_in_child->layers[0].access;
+			child_layer.level = new_level;
+			out_layers[layers_written++] = child_layer;
+		}
+	}
+
+	return 0;
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index ce7940efea51..12fcea6a2a99 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -584,25 +584,11 @@ landlock_find_rule(const struct landlock_ruleset *const ruleset,
 		   const struct landlock_id id)
 {
 	const struct rb_root *root;
-	const struct rb_node *node;
 
 	root = get_root((struct landlock_ruleset *)ruleset, id.type);
 	if (IS_ERR(root))
 		return NULL;
-	node = root->rb_node;
-
-	while (node) {
-		struct landlock_rule *this =
-			rb_entry(node, struct landlock_rule, node);
-
-		if (this->key.data == id.key.data)
-			return this;
-		if (this->key.data < id.key.data)
-			node = node->rb_right;
-		else
-			node = node->rb_left;
-	}
-	return NULL;
+	return landlock_find_in_tree(root, id.key);
 }
 
 /*
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 5da9a64f5af7..e45abff968c5 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -208,6 +208,26 @@ struct landlock_ruleset *
 landlock_merge_ruleset(struct landlock_ruleset *const parent,
 		       struct landlock_ruleset *const ruleset);
 
+static inline struct landlock_rule *
+landlock_find_in_tree(const struct rb_root *const root,
+		      const union landlock_key key)
+{
+	struct rb_node *node = root->rb_node;
+
+	while (node) {
+		struct landlock_rule *this =
+			rb_entry(node, struct landlock_rule, node);
+
+		if (this->key.data == key.data)
+			return this;
+		if (this->key.data < key.data)
+			node = node->rb_right;
+		else
+			node = node->rb_left;
+	}
+	return NULL;
+}
+
 const struct landlock_rule *
 landlock_find_rule(const struct landlock_ruleset *const ruleset,
 		   const struct landlock_id id);
-- 
2.49.0


* [RFC PATCH v2 07/12] landlock/domain: Define alloc and free
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (5 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 06/12] landlock/domain: Implement merging of a parent domain and a ruleset Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 08/12] landlock/domain: Add landlock_domain_merge_ruleset Tingmao Wang
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

We will eventually need a deferred free just like we currently have for
the ruleset, so we define it here as well.  To minimize the size of the
domain struct before the rules array, we separately allocate the
work_struct (which is currently 72 bytes) and just keep a pointer in the
domain.
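
A minimal usage sketch (hypothetical call sites):

	/* In sleepable context: */
	landlock_put_domain(dom);

	/* In non-sleepable context (e.g. under RCU), defer the free: */
	landlock_put_domain_deferred(dom);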

This patch triggers another (false positive?) checkpatch warning:

	ERROR: trailing statements should be on next line
	#177: FILE: security/landlock/domain.h:197:
	 DEFINE_FREE(landlock_put_domain, struct landlock_domain *,
	+	    if (!IS_ERR_OR_NULL(_T)) landlock_put_domain(_T))

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c | 96 ++++++++++++++++++++++++++++++++++++++
 security/landlock/domain.h | 38 +++++++++++++++
 2 files changed, 134 insertions(+)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index 89d36736a59c..a7474b803c47 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -33,6 +33,102 @@ static void build_check_domain(void)
 		     U32_MAX - 1);
 }
 
+/**
+ * landlock_alloc_domain - allocate a new domain given known sizes.  The
+ * caller must then at least fill in the indices and layers arrays before
+ * trying to free this domain.
+ *
+ * @sizes: A "fake" struct landlock_domain which just contains various
+ * num_* numbers.  This function will call dom_rules_len() to compute the
+ * total rules array length, and copy over the number fields.
+ * sizes.usage, sizes.hierarchy and sizes.work_free are ignored.
+ */
+struct landlock_domain *
+landlock_alloc_domain(const struct landlock_domain *sizes)
+{
+	u32 len_rules = dom_rules_len(sizes);
+	struct landlock_domain *new_dom = kzalloc(
+		struct_size(new_dom, rules, len_rules), GFP_KERNEL_ACCOUNT);
+
+	if (!new_dom)
+		return NULL;
+
+	memcpy(new_dom, sizes, sizeof(*sizes));
+	new_dom->hierarchy = NULL;
+	new_dom->work_free =
+		kzalloc(sizeof(*new_dom->work_free), GFP_KERNEL_ACCOUNT);
+	if (!new_dom->work_free) {
+		kfree(new_dom);
+		return NULL;
+	}
+	new_dom->work_free->domain = new_dom;
+	refcount_set(&new_dom->usage, 1);
+	new_dom->len_rules = len_rules;
+
+	return new_dom;
+}
+
+static void free_domain(struct landlock_domain *const domain)
+{
+	struct landlock_domain_index *fs_indices, ind;
+	u32 i, num_fs_indices;
+
+	might_sleep();
+	if (WARN_ON_ONCE(!domain))
+		return;
+	fs_indices = dom_fs_indices(domain);
+	num_fs_indices = domain->num_fs_indices;
+	if (WARN_ON_ONCE((uintptr_t *)(fs_indices + num_fs_indices) -
+				 domain->rules >
+			 domain->len_rules))
+		return;
+
+	for (i = 0; i < num_fs_indices; i++) {
+		ind = fs_indices[i];
+		landlock_put_object(ind.key.object);
+	}
+
+	landlock_put_hierarchy(domain->hierarchy);
+	domain->hierarchy = NULL;
+	kfree(domain->work_free);
+	domain->work_free = NULL;
+	kfree(domain);
+}
+
+void landlock_put_domain(struct landlock_domain *const domain)
+{
+	might_sleep();
+
+	if (domain && refcount_dec_and_test(&domain->usage))
+		free_domain(domain);
+}
+
+static void free_domain_work(struct work_struct *const work)
+{
+	struct landlock_domain_work_free *const fw =
+		container_of(work, struct landlock_domain_work_free, work);
+	struct landlock_domain *domain = fw->domain;
+
+	free_domain(domain);
+	/* the work_free struct will be freed by free_domain */
+}
+
+/*
+ * Schedule work to free a landlock_domain, useful in a non-sleepable
+ * context.
+ */
+void landlock_put_domain_deferred(struct landlock_domain *const domain)
+{
+	if (domain && refcount_dec_and_test(&domain->usage)) {
+		INIT_WORK(&domain->work_free->work, free_domain_work);
+
+		if (WARN_ON_ONCE(domain->work_free->domain != domain))
+			return;
+
+		schedule_work(&domain->work_free->work);
+	}
+}
+
 /**
  * dom_calculate_merged_sizes - Calculate the eventual size of the part of
  * a new domain for a given rule size (i.e. either fs or net).  Correct
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 8acd88a1d77a..7af32810003c 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -44,7 +44,30 @@ struct landlock_domain_index {
 	u32 layer_index;
 };
 
+struct landlock_domain_work_free {
+	struct work_struct work;
+	struct landlock_domain *domain;
+};
+
 struct landlock_domain {
+	/**
+	 * @hierarchy: Enables hierarchy identification even when a parent
+	 * domain vanishes.  This is needed for the ptrace protection.
+	 */
+	struct landlock_hierarchy *hierarchy;
+	/**
+	 * @work_free: Enables to free a domain within a lockless section.
+	 * This is only used by landlock_put_domain_deferred() when @usage
+	 * reaches zero. This is a pointer to an allocated struct in order to
+	 * minimize the size of this struct. To prevent needing to allocate
+	 * when freeing, this is pre-allocated on domain creation.
+	 */
+	struct landlock_domain_work_free *work_free;
+	/**
+	 * @usage: Number of processes (i.e. domains) or file descriptors
+	 * referencing this ruleset.
+	 */
+	refcount_t usage;
 	/**
 	 * @num_layers: Number of layers in this domain.  This enables to
 	 * check that all the layers allow an access request.
@@ -158,6 +181,21 @@ DEFINE_COALESCED_HASH_TABLE(struct landlock_domain_index, dom_hash, key,
 			    hash_long(elem->key.data, 32) % table_size,
 			    dom_index_is_empty(elem))
 
+struct landlock_domain *
+landlock_alloc_domain(const struct landlock_domain *sizes);
+
+static inline void landlock_get_domain(struct landlock_domain *const domain)
+{
+	if (domain)
+		refcount_inc(&domain->usage);
+}
+
+void landlock_put_domain(struct landlock_domain *const domain);
+void landlock_put_domain_deferred(struct landlock_domain *const domain);
+
+DEFINE_FREE(landlock_put_domain, struct landlock_domain *,
+	    if (!IS_ERR_OR_NULL(_T)) landlock_put_domain(_T))
+
 struct landlock_found_rule {
 	const struct landlock_layer *layers_start;
 	const struct landlock_layer *layers_end;
-- 
2.49.0



* [RFC PATCH v2 08/12] landlock/domain: Add landlock_domain_merge_ruleset
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (6 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 07/12] landlock/domain: Define alloc and free Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 09/12] landlock: Update various code to use landlock_domain Tingmao Wang
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

Implement the equivalent of landlock_merge_ruleset, but using the new
domain structure.  The logic in inherit_domain and
landlock_domain_merge_ruleset mirrors that of inherit_ruleset and
merge_ruleset respectively.  Once the existing landlock_restrict_self code
is switched over to use these, those two old functions can be removed.

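The merge itself runs in two passes, condensed here for orientation
(error handling and reference counting elided; new_level, parent and
ruleset are as in landlock_domain_merge_ruleset() below):

	struct landlock_domain sizes = { .num_layers = new_level };

	mutex_lock(&ruleset->lock);
	/* Pass 1: only compute the sizes of the merged index/layer arrays. */
	err = merge_domain(parent, &sizes, ruleset, true);
	if (!err) {
		/* Allocate exactly the amount needed. */
		new_dom = landlock_alloc_domain(&sizes);
		/*
		 * Pass 2: populate the indices and layers.  The ruleset must
		 * not change between the two passes, hence the lock is held
		 * throughout.
		 */
		err = merge_domain(parent, new_dom, ruleset, false);
	}
	mutex_unlock(&ruleset->lock);
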
Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c | 243 ++++++++++++++++++++++++++++++++++++-
 security/landlock/domain.h |   4 +
 2 files changed, 245 insertions(+), 2 deletions(-)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index a7474b803c47..da8f1bf00db1 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -147,7 +147,7 @@ void landlock_put_domain_deferred(struct landlock_domain *const domain)
  * @out_num_layers: Outputs the number of layer objects that would be
  * needed in the new domain.
  */
-static int __maybe_unused dom_calculate_merged_sizes(
+static int dom_calculate_merged_sizes(
 	const struct landlock_domain_index *const dom_ind_array,
 	const u32 dom_num_indices, const u32 dom_num_layers,
 	const struct rb_root *const child_rules, u32 *const out_num_indices,
@@ -202,7 +202,7 @@ static int __maybe_unused dom_calculate_merged_sizes(
  * @out_indices: The output indices array to be populated.
  * @out_size: The size of @out_indices, in number of elements.
  */
-static int __maybe_unused dom_populate_indices(
+static int dom_populate_indices(
 	const struct landlock_domain_index *const dom_ind_array,
 	const u32 dom_num_indices, const struct rb_root *const child_rules,
 	struct landlock_domain_index *const out_indices, const u32 out_size)
@@ -344,6 +344,245 @@ dom_populate_layers(const struct landlock_domain_index *const dom_ind_array,
 	return 0;
 }
 
+/**
+ * merge_domain - Do the merge of both the fs and net rules.
+ *
+ * @parent: Parent domain, or NULL if there is no parent.
+ * @child: Child domain.  child->num_layers must be set to the new level.
+ * @ruleset: Ruleset to be merged.  The caller must hold the ruleset lock
+ * across calls to this function.
+ * @only_calc_sizes: Whether this is a size-calculation pass, or the final
+ * merge pass.  For the size-calculation pass, no data except various sizes
+ * in @child will be written.
+ *
+ * If @only_calc_sizes is true, child->num_{fs,net}_{indices,layers} will
+ * be updated.  Otherwise, the function writes to child->rules and checks
+ * that the number of indices and layers written matches with previously
+ * stored numbers in @child.
+ */
+static int merge_domain(const struct landlock_domain *parent,
+			struct landlock_domain *child,
+			struct landlock_ruleset *ruleset, bool only_calc_sizes)
+	__must_hold(&ruleset->lock)
+{
+	u32 new_level;
+	int err;
+
+	if (WARN_ON_ONCE(!ruleset || !child))
+		return -EINVAL;
+
+	new_level = child->num_layers;
+	/* We should have checked new_level <= LANDLOCK_MAX_NUM_LAYERS already */
+	if (WARN_ON_ONCE(new_level == 0 || new_level > LANDLOCK_MAX_NUM_LAYERS))
+		return -EINVAL;
+
+	if (only_calc_sizes) {
+		err = dom_calculate_merged_sizes(
+			parent ? dom_fs_indices(parent) : NULL,
+			parent ? parent->num_fs_indices : 0,
+			parent ? parent->num_fs_layers : 0,
+			&ruleset->root_inode, &child->num_fs_indices,
+			&child->num_fs_layers);
+		if (err)
+			return err;
+
+#ifdef CONFIG_INET
+		err = dom_calculate_merged_sizes(
+			parent ? dom_net_indices(parent) : NULL,
+			parent ? parent->num_net_indices : 0,
+			parent ? parent->num_net_layers : 0,
+			&ruleset->root_net_port, &child->num_net_indices,
+			&child->num_net_layers);
+		if (err)
+			return err;
+#else
+		child->num_net_indices = 0;
+		child->num_net_layers = 0;
+#endif /* CONFIG_INET */
+	} else {
+		err = dom_populate_indices(
+			parent ? dom_fs_indices(parent) : NULL,
+			parent ? parent->num_fs_indices : 0,
+			&ruleset->root_inode, dom_fs_indices(child),
+			child->num_fs_indices);
+		if (err)
+			return err;
+
+		err = dom_populate_layers(
+			parent ? dom_fs_indices(parent) : NULL,
+			parent ? parent->num_fs_indices : 0,
+			parent ? dom_fs_layers(parent) : NULL,
+			parent ? parent->num_fs_layers : 0,
+			&ruleset->root_inode, dom_fs_indices(child),
+			child->num_fs_indices, new_level, dom_fs_layers(child),
+			child->num_fs_layers);
+		if (err)
+			return err;
+
+		dom_fs_terminating_index(child)->layer_index =
+			child->num_fs_layers;
+
+#ifdef CONFIG_INET
+		err = dom_populate_indices(
+			parent ? dom_net_indices(parent) : NULL,
+			parent ? parent->num_net_indices : 0,
+			&ruleset->root_net_port, dom_net_indices(child),
+			child->num_net_indices);
+		if (err)
+			return err;
+
+		err = dom_populate_layers(
+			parent ? dom_net_indices(parent) : NULL,
+			parent ? parent->num_net_indices : 0,
+			parent ? dom_net_layers(parent) : NULL,
+			parent ? parent->num_net_layers : 0,
+			&ruleset->root_net_port, dom_net_indices(child),
+			child->num_net_indices, new_level,
+			dom_net_layers(child), child->num_net_layers);
+		if (err)
+			return err;
+
+		dom_net_terminating_index(child)->layer_index =
+			child->num_net_layers;
+#endif /* CONFIG_INET */
+	}
+
+	return 0;
+}
+
+/**
+ * inherit_domain - Populate handled access masks and hierarchy for the child
+ *
+ * @parent: Parent domain, or NULL if there is no parent.
+ * @child: Child domain to be populated.
+ * @ruleset: Ruleset to be merged.  The caller must hold the ruleset lock.
+ */
+static int inherit_domain(const struct landlock_domain *parent,
+			  struct landlock_domain *child,
+			  struct landlock_ruleset *ruleset)
+	__must_hold(&ruleset->lock)
+{
+	if (WARN_ON_ONCE(!child || !ruleset || !child->hierarchy ||
+			 child->num_layers < 1 || ruleset->num_layers != 1)) {
+		return -EINVAL;
+	}
+
+	if (parent) {
+		if (WARN_ON_ONCE(child->num_layers != parent->num_layers + 1))
+			return -EINVAL;
+
+		/* Copies the parent layer stack. */
+		memcpy(dom_access_masks(child), dom_access_masks(parent),
+		       array_size(parent->num_layers,
+				  sizeof(*dom_access_masks(child))));
+
+		if (WARN_ON_ONCE(!parent->hierarchy))
+			return -EINVAL;
+
+		landlock_get_hierarchy(parent->hierarchy);
+		child->hierarchy->parent = parent->hierarchy;
+	}
+
+	/* Stacks the new layer. */
+	dom_access_masks(child)[child->num_layers - 1] =
+		landlock_upgrade_handled_access_masks(ruleset->access_masks[0]);
+
+	return 0;
+}
+
+/**
+ * landlock_domain_merge_ruleset - Merge a ruleset and a parent domain
+ * into a new domain.
+ *
+ * @parent: Parent domain.
+ * @ruleset: Ruleset to be merged.  This function will take the mutex on
+ * this ruleset while merging.
+ *
+ * The current task is requesting to be restricted.  The subjective credentials
+ * must not be in an overridden state. cf. landlock_init_hierarchy_log().
+ *
+ * Returns the intersection of @parent and @ruleset, or returns @parent if
+ * @ruleset is empty, or returns a duplicate of @ruleset if @parent is empty.
+ */
+struct landlock_domain *
+landlock_domain_merge_ruleset(const struct landlock_domain *parent,
+			      struct landlock_ruleset *ruleset)
+{
+	struct landlock_domain *new_dom __free(landlock_put_domain) = NULL;
+	struct landlock_hierarchy *new_hierarchy __free(kfree) = NULL;
+	struct landlock_domain new_dom_sizes = {};
+	u32 new_level;
+	int err;
+
+	might_sleep();
+	if (WARN_ON_ONCE(!ruleset))
+		return ERR_PTR(-EINVAL);
+
+	if (parent) {
+		if (parent->num_layers >= LANDLOCK_MAX_NUM_LAYERS)
+			return ERR_PTR(-E2BIG);
+		new_level = parent->num_layers + 1;
+	} else {
+		new_level = 1;
+	}
+
+	new_dom_sizes.num_layers = new_level;
+
+	/* Allocate this now so we fail early */
+	new_hierarchy = kzalloc(sizeof(*new_hierarchy), GFP_KERNEL_ACCOUNT);
+	if (!new_hierarchy)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Figure out how many indices and layer structs we need.  From this
+	 * point until we actually merge in the ruleset, it must not change.
+	 */
+	mutex_lock(&ruleset->lock);
+	err = merge_domain(parent, &new_dom_sizes, ruleset, true);
+	if (err)
+		goto out_unlock;
+
+	/*
+	 * Ok, we know the required size now, allocate the domain and merge in
+	 * the indices and rules.
+	 */
+	new_dom = landlock_alloc_domain(&new_dom_sizes);
+	if (!new_dom) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+	new_dom->hierarchy = new_hierarchy;
+	new_hierarchy = NULL;
+	refcount_set(&new_dom->hierarchy->usage, 1);
+
+	err = merge_domain(parent, new_dom, ruleset, false);
+	if (err) {
+		/*
+		 * new_dom may contain invalid landlock_object references, so
+		 * we cannot release it via landlock_put_domain().  Free its
+		 * sub-allocations directly to avoid leaking them.
+		 */
+		kfree(new_dom->work_free);
+		kfree(new_dom->hierarchy);
+		kfree(new_dom);
+		new_dom = NULL;
+		goto out_unlock;
+	}
+
+	for (size_t i = 0; i < new_dom->num_fs_indices; i++)
+		landlock_get_object(dom_fs_indices(new_dom)[i].key.object);
+
+	err = inherit_domain(parent, new_dom, ruleset);
+	if (err)
+		goto out_unlock;
+
+	mutex_unlock(&ruleset->lock);
+
+	err = landlock_init_hierarchy_log(new_dom->hierarchy);
+	if (err)
+		return ERR_PTR(err);
+
+	return no_free_ptr(new_dom);
+
+out_unlock:
+	mutex_unlock(&ruleset->lock);
+	return ERR_PTR(err);
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 7af32810003c..1d9608439781 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -251,6 +251,10 @@ landlock_domain_find(const struct landlock_domain_index *const indices_arr,
 	for (layer = (found_rule).layers_start;    \
 	     layer < (found_rule).layers_end; layer++)
 
+struct landlock_domain *
+landlock_domain_merge_ruleset(const struct landlock_domain *parent,
+			      struct landlock_ruleset *ruleset);
+
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
 	LANDLOCK_LOG_RECORDED,
-- 
2.49.0



* [RFC PATCH v2 09/12] landlock: Update various code to use landlock_domain
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (7 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 08/12] landlock/domain: Add landlock_domain_merge_ruleset Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 10/12] landlock: Remove unused code Tingmao Wang
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

- Replace the domain in landlock_cred with a landlock_domain.
- Replace landlock_merge_ruleset with landlock_domain_merge_ruleset.
- Pull landlock_put_hierarchy out of domain.h and into domain.c.
  This allows domain.h to not depend on audit.h: audit.h includes cred.h,
  which will need to depend on domain.h instead of ruleset.h once it uses
  the new domain struct, so keeping audit.h in domain.h would create a
  circular include.
- Update code that used a landlock_ruleset as a domain to use
  landlock_domain instead.

checkpatch does not seem to like the `layer_mask_t (*const layer_masks)[]` argument:

	WARNING: function definition argument 'layer_mask_t' should also have an identifier name
	#171: FILE: security/landlock/domain.h:397:
	+bool landlock_unmask_layers(const struct landlock_found_rule rule,

	WARNING: function definition argument 'layer_mask_t' should also have an identifier name
	#176: FILE: security/landlock/domain.h:402:
	+access_mask_t

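For orientation, the resulting lookup shape at the call sites is roughly
the following (condensed from the fs.c and net.c hunks below; object and
domain stand in for the caller's values):

	const struct landlock_layer *layer;
	struct landlock_found_rule rule;
	union landlock_key key = { .object = object };

	/* Replaces landlock_find_rule(), which returned a rule pointer. */
	rule = dom_find_index_fs(domain, key);
	dom_rule_for_each_layer(rule, layer) {
		/* Each matched layer carries a ->level and an ->access mask. */
	}
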
Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/audit.c    |  8 ++--
 security/landlock/cred.c     | 12 +++---
 security/landlock/cred.h     | 14 +++----
 security/landlock/domain.c   | 74 ++++++++++++++++++++++++++++++++++++
 security/landlock/domain.h   | 64 +++++++++++++++++++++++++++----
 security/landlock/fs.c       | 34 ++++++++---------
 security/landlock/net.c      | 12 +++---
 security/landlock/ruleset.c  | 66 +++-----------------------------
 security/landlock/ruleset.h  | 43 ---------------------
 security/landlock/syscalls.c |  8 ++--
 security/landlock/task.c     | 24 ++++++------
 11 files changed, 190 insertions(+), 169 deletions(-)

diff --git a/security/landlock/audit.c b/security/landlock/audit.c
index c52d079cdb77..1ac4a97e5ae0 100644
--- a/security/landlock/audit.c
+++ b/security/landlock/audit.c
@@ -134,7 +134,7 @@ static void log_domain(struct landlock_hierarchy *const hierarchy)
 }
 
 static struct landlock_hierarchy *
-get_hierarchy(const struct landlock_ruleset *const domain, const size_t layer)
+get_hierarchy(const struct landlock_domain *const domain, const size_t layer)
 {
 	struct landlock_hierarchy *hierarchy = domain->hierarchy;
 	ssize_t i;
@@ -167,7 +167,7 @@ static void test_get_hierarchy(struct kunit *const test)
 		.parent = &dom1_hierarchy,
 		.id = 30,
 	};
-	struct landlock_ruleset dom2 = {
+	struct landlock_domain dom2 = {
 		.hierarchy = &dom2_hierarchy,
 		.num_layers = 3,
 	};
@@ -180,7 +180,7 @@ static void test_get_hierarchy(struct kunit *const test)
 
 #endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */
 
-static size_t get_denied_layer(const struct landlock_ruleset *const domain,
+static size_t get_denied_layer(const struct landlock_domain *const domain,
 			       access_mask_t *const access_request,
 			       const layer_mask_t (*const layer_masks)[],
 			       const size_t layer_masks_size)
@@ -218,7 +218,7 @@ static size_t get_denied_layer(const struct landlock_ruleset *const domain,
 
 static void test_get_denied_layer(struct kunit *const test)
 {
-	const struct landlock_ruleset dom = {
+	const struct landlock_domain dom = {
 		.num_layers = 5,
 	};
 	const layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
index 0cb3edde4d18..a90b5b895dc0 100644
--- a/security/landlock/cred.c
+++ b/security/landlock/cred.c
@@ -13,7 +13,7 @@
 
 #include "common.h"
 #include "cred.h"
-#include "ruleset.h"
+#include "domain.h"
 #include "setup.h"
 
 static void hook_cred_transfer(struct cred *const new,
@@ -23,7 +23,7 @@ static void hook_cred_transfer(struct cred *const new,
 		landlock_cred(old);
 
 	if (old_llcred->domain) {
-		landlock_get_ruleset(old_llcred->domain);
+		landlock_get_domain(old_llcred->domain);
 		*landlock_cred(new) = *old_llcred;
 	}
 }
@@ -37,10 +37,10 @@ static int hook_cred_prepare(struct cred *const new,
 
 static void hook_cred_free(struct cred *const cred)
 {
-	struct landlock_ruleset *const dom = landlock_cred(cred)->domain;
-
-	if (dom)
-		landlock_put_ruleset_deferred(dom);
+	/*
+	 * landlock_put_domain_deferred does nothing if domain is NULL
+	 */
+	landlock_put_domain_deferred(landlock_cred(cred)->domain);
 }
 
 #ifdef CONFIG_AUDIT
diff --git a/security/landlock/cred.h b/security/landlock/cred.h
index c82fe63ec598..2e8aceeed8fe 100644
--- a/security/landlock/cred.h
+++ b/security/landlock/cred.h
@@ -17,7 +17,7 @@
 
 #include "access.h"
 #include "limits.h"
-#include "ruleset.h"
+#include "domain.h"
 #include "setup.h"
 
 /**
@@ -29,9 +29,9 @@
  */
 struct landlock_cred_security {
 	/**
-	 * @domain: Immutable ruleset enforced on a task.
+	 * @domain: Immutable domain enforced on a task.
 	 */
-	struct landlock_ruleset *domain;
+	struct landlock_domain *domain;
 
 #ifdef CONFIG_AUDIT
 	/**
@@ -65,7 +65,7 @@ landlock_cred(const struct cred *cred)
 	return cred->security + landlock_blob_sizes.lbs_cred;
 }
 
-static inline struct landlock_ruleset *landlock_get_current_domain(void)
+static inline struct landlock_domain *landlock_get_current_domain(void)
 {
 	return landlock_cred(current_cred())->domain;
 }
@@ -73,7 +73,7 @@ static inline struct landlock_ruleset *landlock_get_current_domain(void)
 /*
  * The call needs to come from an RCU read-side critical section.
  */
-static inline const struct landlock_ruleset *
+static inline const struct landlock_domain *
 landlock_get_task_domain(const struct task_struct *const task)
 {
 	return landlock_cred(__task_cred(task))->domain;
@@ -114,7 +114,7 @@ landlock_get_applicable_subject(const struct cred *const cred,
 	const union access_masks_all masks_all = {
 		.masks = masks,
 	};
-	const struct landlock_ruleset *domain;
+	const struct landlock_domain *domain;
 	ssize_t layer_level;
 
 	if (!cred)
@@ -127,7 +127,7 @@ landlock_get_applicable_subject(const struct cred *const cred,
 	for (layer_level = domain->num_layers - 1; layer_level >= 0;
 	     layer_level--) {
 		union access_masks_all layer = {
-			.masks = domain->access_masks[layer_level],
+			.masks = dom_access_masks(domain)[layer_level],
 		};
 
 		if (layer.all & masks_all.all) {
diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index da8f1bf00db1..4091e80e45df 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -21,6 +21,7 @@
 #include <linux/uidgid.h>
 
 #include "access.h"
+#include "audit.h"
 #include "common.h"
 #include "domain.h"
 #include "id.h"
@@ -771,6 +772,79 @@ landlock_get_deny_masks(const access_mask_t all_existing_optional_access,
 	return deny_masks;
 }
 
+void landlock_put_hierarchy(struct landlock_hierarchy *hierarchy)
+{
+	while (hierarchy && refcount_dec_and_test(&hierarchy->usage)) {
+		const struct landlock_hierarchy *const freeme = hierarchy;
+
+		landlock_log_drop_domain(hierarchy);
+		landlock_free_hierarchy_details(hierarchy);
+		hierarchy = hierarchy->parent;
+		kfree(freeme);
+	}
+}
+
+/*
+ * @layer_masks is read and may be updated according to the access request and
+ * the matching rule.
+ * @masks_array_size must be equal to ARRAY_SIZE(*layer_masks).
+ *
+ * Returns true if the request is allowed (i.e. relevant layer masks for the
+ * request are empty).
+ */
+bool landlock_unmask_layers(const struct landlock_found_rule rule,
+			    const access_mask_t access_request,
+			    layer_mask_t (*const layer_masks)[],
+			    const size_t masks_array_size)
+{
+	const struct landlock_layer *layer;
+
+	if (!access_request || !layer_masks)
+		return true;
+
+	if (rule.layers_start == rule.layers_end)
+		return false;
+
+	if (WARN_ON_ONCE(rule.layers_start > rule.layers_end))
+		return false;
+
+	/* layers_start should not be NULL if layers_end is non-NULL */
+	if (WARN_ON_ONCE(rule.layers_start == NULL))
+		return false;
+
+	/*
+	 * An access is granted if, for each policy layer, at least one rule
+	 * encountered on the pathwalk grants the requested access,
+	 * regardless of its position in the layer stack.  We must then check
+	 * the remaining layers for each inode, from the first added layer to
+	 * the last one.  When there are multiple requested accesses, for each
+	 * policy layer, the full set of requested accesses may not be granted
+	 * by only one rule, but by the union (binary OR) of multiple rules.
+	 * E.g. /a/b <execute> + /a <read> => /a/b <execute + read>
+	 */
+	dom_rule_for_each_layer(rule, layer)
+	{
+		const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
+		const unsigned long access_req = access_request;
+		unsigned long access_bit;
+		bool is_empty;
+
+		/*
+		 * Records in @layer_masks which layer grants access to each
+		 * requested access.
+		 */
+		is_empty = true;
+		for_each_set_bit(access_bit, &access_req, masks_array_size) {
+			if (layer->access & BIT_ULL(access_bit))
+				(*layer_masks)[access_bit] &= ~layer_bit;
+			is_empty = is_empty && !(*layer_masks)[access_bit];
+		}
+		if (is_empty)
+			return true;
+	}
+	return false;
+}
+
 #ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
 
 static void test_landlock_get_deny_masks(struct kunit *const test)
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 1d9608439781..ac820903ccb6 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -20,7 +20,6 @@
 #include <linux/slab.h>
 
 #include "access.h"
-#include "audit.h"
 #include "ruleset.h"
 #include "coalesced_hash.h"
 
@@ -393,16 +392,65 @@ landlock_get_hierarchy(struct landlock_hierarchy *const hierarchy)
 		refcount_inc(&hierarchy->usage);
 }
 
-static inline void landlock_put_hierarchy(struct landlock_hierarchy *hierarchy)
+void landlock_put_hierarchy(struct landlock_hierarchy *hierarchy);
+
+bool landlock_unmask_layers(const struct landlock_found_rule rule,
+			    const access_mask_t access_request,
+			    layer_mask_t (*const layer_masks)[],
+			    const size_t masks_array_size);
+
+access_mask_t
+landlock_init_layer_masks(const struct landlock_domain *const domain,
+			  const access_mask_t access_request,
+			  layer_mask_t (*const layer_masks)[],
+			  const enum landlock_key_type key_type);
+
+static inline access_mask_t
+landlock_dom_get_fs_access_mask(const struct landlock_domain *const domain,
+				const u16 layer_level)
+{
+	/* Handles all initially denied by default access rights. */
+	return dom_access_masks(domain)[layer_level].fs |
+	       _LANDLOCK_ACCESS_FS_INITIALLY_DENIED;
+}
+
+static inline access_mask_t
+landlock_dom_get_net_access_mask(const struct landlock_domain *const domain,
+				 const u16 layer_level)
+{
+	return dom_access_masks(domain)[layer_level].net;
+}
+
+static inline access_mask_t
+landlock_dom_get_scope_mask(const struct landlock_domain *const domain,
+			    const u16 layer_level)
 {
-	while (hierarchy && refcount_dec_and_test(&hierarchy->usage)) {
-		const struct landlock_hierarchy *const freeme = hierarchy;
+	return dom_access_masks(domain)[layer_level].scope;
+}
 
-		landlock_log_drop_domain(hierarchy);
-		landlock_free_hierarchy_details(hierarchy);
-		hierarchy = hierarchy->parent;
-		kfree(freeme);
+/**
+ * landlock_dom_union_access_masks - Return all access rights handled in
+ * the domain
+ *
+ * @domain: Landlock domain
+ *
+ * Returns: an access_masks result of the OR of all the domain's access masks.
+ */
+static inline struct access_masks
+landlock_dom_union_access_masks(const struct landlock_domain *const domain)
+{
+	union access_masks_all matches = {};
+	size_t layer_level;
+
+	for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
+		union access_masks_all layer = {
+			.masks = dom_access_masks(domain)[layer_level],
+		};
+
+		matches.all |= layer.all;
 	}
+
+	return matches.masks;
 }
 
 #endif /* _SECURITY_LANDLOCK_DOMAIN_H */
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 51f03eb82069..4e18936eecbd 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -360,26 +360,24 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
  *
  * Returns NULL if no rule is found or if @dentry is negative.
  */
-static const struct landlock_rule *
-find_rule(const struct landlock_ruleset *const domain,
+static struct landlock_found_rule
+find_rule(const struct landlock_domain *const domain,
 	  const struct dentry *const dentry)
 {
-	const struct landlock_rule *rule;
 	const struct inode *inode;
-	struct landlock_id id = {
-		.type = LANDLOCK_KEY_INODE,
-	};
+	union landlock_key key;
+	struct landlock_found_rule found_rule = {};
 
 	/* Ignores nonexistent leafs. */
 	if (d_is_negative(dentry))
-		return NULL;
+		return found_rule;
 
 	inode = d_backing_inode(dentry);
 	rcu_read_lock();
-	id.key.object = rcu_dereference(landlock_inode(inode)->object);
-	rule = landlock_find_rule(domain, id);
+	key.object = rcu_dereference(landlock_inode(inode)->object);
+	found_rule = dom_find_index_fs(domain, key);
 	rcu_read_unlock();
-	return rule;
+	return found_rule;
 }
 
 /*
@@ -752,7 +750,7 @@ static void test_is_eacces_with_write(struct kunit *const test)
  * - false otherwise.
  */
 static bool is_access_to_paths_allowed(
-	const struct landlock_ruleset *const domain,
+	const struct landlock_domain *const domain,
 	const struct path *const path,
 	const access_mask_t access_request_parent1,
 	layer_mask_t (*const layer_masks_parent1)[LANDLOCK_NUM_ACCESS_FS],
@@ -800,7 +798,7 @@ static bool is_access_to_paths_allowed(
 		 * a superset of the meaningful requested accesses).
 		 */
 		access_masked_parent1 = access_masked_parent2 =
-			landlock_union_access_masks(domain).fs;
+			landlock_dom_union_access_masks(domain).fs;
 		is_dom_check = true;
 		memcpy(&_layer_masks_parent2_bkp, layer_masks_parent2,
 		       sizeof(_layer_masks_parent2_bkp));
@@ -844,7 +842,7 @@ static bool is_access_to_paths_allowed(
 	 */
 	while (true) {
 		struct dentry *parent_dentry;
-		const struct landlock_rule *rule;
+		struct landlock_found_rule rule;
 
 		/*
 		 * If at least all accesses allowed on the destination are
@@ -1102,7 +1100,7 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
  * - false if the walk reached @mnt_root.
  */
 static bool collect_domain_accesses(
-	const struct landlock_ruleset *const domain,
+	const struct landlock_domain *const domain,
 	const struct path *const mnt_dir, struct dentry *dir,
 	layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
 {
@@ -1881,7 +1879,7 @@ static bool control_current_fowner(struct fown_struct *const fown)
 
 static void hook_file_set_fowner(struct file *file)
 {
-	struct landlock_ruleset *prev_dom;
+	struct landlock_domain *prev_dom;
 	struct landlock_cred_security fown_subject = {};
 	size_t fown_layer = 0;
 
@@ -1893,7 +1891,7 @@ static void hook_file_set_fowner(struct file *file)
 			landlock_get_applicable_subject(
 				current_cred(), signal_scope, &fown_layer);
 		if (new_subject) {
-			landlock_get_ruleset(new_subject->domain);
+			landlock_get_domain(new_subject->domain);
 			fown_subject = *new_subject;
 		}
 	}
@@ -1905,12 +1903,12 @@ static void hook_file_set_fowner(struct file *file)
 #endif /* CONFIG_AUDIT*/
 
 	/* May be called in an RCU read-side critical section. */
-	landlock_put_ruleset_deferred(prev_dom);
+	landlock_put_domain_deferred(prev_dom);
 }
 
 static void hook_file_free_security(struct file *file)
 {
-	landlock_put_ruleset_deferred(landlock_file(file)->fown_subject.domain);
+	landlock_put_domain_deferred(landlock_file(file)->fown_subject.domain);
 }
 
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
diff --git a/security/landlock/net.c b/security/landlock/net.c
index 1f3915a90a80..411b810dd695 100644
--- a/security/landlock/net.c
+++ b/security/landlock/net.c
@@ -48,10 +48,8 @@ static int current_check_access_socket(struct socket *const sock,
 {
 	__be16 port;
 	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_NET] = {};
-	const struct landlock_rule *rule;
-	struct landlock_id id = {
-		.type = LANDLOCK_KEY_NET_PORT,
-	};
+	struct landlock_found_rule rule;
+	union landlock_key key;
 	const struct access_masks masks = {
 		.net = access_request,
 	};
@@ -171,10 +169,10 @@ static int current_check_access_socket(struct socket *const sock,
 			return -EINVAL;
 	}
 
-	id.key.data = (__force uintptr_t)port;
-	BUILD_BUG_ON(sizeof(port) > sizeof(id.key.data));
+	key.data = (__force uintptr_t)port;
+	BUILD_BUG_ON(sizeof(port) > sizeof(key.data));
 
-	rule = landlock_find_rule(subject->domain, id);
+	rule = dom_find_index_net(subject->domain, key);
 	access_request = landlock_init_layer_masks(subject->domain,
 						   access_request, &layer_masks,
 						   LANDLOCK_KEY_NET_PORT);
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 12fcea6a2a99..6bea1cc16b62 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -591,63 +591,9 @@ landlock_find_rule(const struct landlock_ruleset *const ruleset,
 	return landlock_find_in_tree(root, id.key);
 }
 
-/*
- * @layer_masks is read and may be updated according to the access request and
- * the matching rule.
- * @masks_array_size must be equal to ARRAY_SIZE(*layer_masks).
- *
- * Returns true if the request is allowed (i.e. relevant layer masks for the
- * request are empty).
- */
-bool landlock_unmask_layers(const struct landlock_rule *const rule,
-			    const access_mask_t access_request,
-			    layer_mask_t (*const layer_masks)[],
-			    const size_t masks_array_size)
-{
-	size_t layer_level;
-
-	if (!access_request || !layer_masks)
-		return true;
-	if (!rule)
-		return false;
-
-	/*
-	 * An access is granted if, for each policy layer, at least one rule
-	 * encountered on the pathwalk grants the requested access,
-	 * regardless of its position in the layer stack.  We must then check
-	 * the remaining layers for each inode, from the first added layer to
-	 * the last one.  When there is multiple requested accesses, for each
-	 * policy layer, the full set of requested accesses may not be granted
-	 * by only one rule, but by the union (binary OR) of multiple rules.
-	 * E.g. /a/b <execute> + /a <read> => /a/b <execute + read>
-	 */
-	for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
-		const struct landlock_layer *const layer =
-			&rule->layers[layer_level];
-		const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
-		const unsigned long access_req = access_request;
-		unsigned long access_bit;
-		bool is_empty;
-
-		/*
-		 * Records in @layer_masks which layer grants access to each
-		 * requested access.
-		 */
-		is_empty = true;
-		for_each_set_bit(access_bit, &access_req, masks_array_size) {
-			if (layer->access & BIT_ULL(access_bit))
-				(*layer_masks)[access_bit] &= ~layer_bit;
-			is_empty = is_empty && !(*layer_masks)[access_bit];
-		}
-		if (is_empty)
-			return true;
-	}
-	return false;
-}
-
 typedef access_mask_t
-get_access_mask_t(const struct landlock_ruleset *const ruleset,
-		  const u16 layer_level);
+get_dom_access_mask_t(const struct landlock_domain *const domain,
+		      const u16 layer_level);
 
 /**
  * landlock_init_layer_masks - Initialize layer masks from an access request
@@ -665,24 +611,24 @@ get_access_mask_t(const struct landlock_ruleset *const ruleset,
  * in any of the active layers in @domain.
  */
 access_mask_t
-landlock_init_layer_masks(const struct landlock_ruleset *const domain,
+landlock_init_layer_masks(const struct landlock_domain *const domain,
 			  const access_mask_t access_request,
 			  layer_mask_t (*const layer_masks)[],
 			  const enum landlock_key_type key_type)
 {
 	access_mask_t handled_accesses = 0;
 	size_t layer_level, num_access;
-	get_access_mask_t *get_access_mask;
+	get_dom_access_mask_t *get_access_mask;
 
 	switch (key_type) {
 	case LANDLOCK_KEY_INODE:
-		get_access_mask = landlock_get_fs_access_mask;
+		get_access_mask = landlock_dom_get_fs_access_mask;
 		num_access = LANDLOCK_NUM_ACCESS_FS;
 		break;
 
 #if IS_ENABLED(CONFIG_INET)
 	case LANDLOCK_KEY_NET_PORT:
-		get_access_mask = landlock_get_net_access_mask;
+		get_access_mask = landlock_dom_get_net_access_mask;
 		num_access = LANDLOCK_NUM_ACCESS_NET;
 		break;
 #endif /* IS_ENABLED(CONFIG_INET) */
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index e45abff968c5..418d18869916 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -238,31 +238,6 @@ static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
 		refcount_inc(&ruleset->usage);
 }
 
-/**
- * landlock_union_access_masks - Return all access rights handled in the
- *				 domain
- *
- * @domain: Landlock ruleset (used as a domain)
- *
- * Returns: an access_masks result of the OR of all the domain's access masks.
- */
-static inline struct access_masks
-landlock_union_access_masks(const struct landlock_ruleset *const domain)
-{
-	union access_masks_all matches = {};
-	size_t layer_level;
-
-	for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
-		union access_masks_all layer = {
-			.masks = domain->access_masks[layer_level],
-		};
-
-		matches.all |= layer.all;
-	}
-
-	return matches.masks;
-}
-
 static inline void
 landlock_add_fs_access_mask(struct landlock_ruleset *const ruleset,
 			    const access_mask_t fs_access_mask,
@@ -314,22 +289,4 @@ landlock_get_net_access_mask(const struct landlock_ruleset *const ruleset,
 	return ruleset->access_masks[layer_level].net;
 }
 
-static inline access_mask_t
-landlock_get_scope_mask(const struct landlock_ruleset *const ruleset,
-			const u16 layer_level)
-{
-	return ruleset->access_masks[layer_level].scope;
-}
-
-bool landlock_unmask_layers(const struct landlock_rule *const rule,
-			    const access_mask_t access_request,
-			    layer_mask_t (*const layer_masks)[],
-			    const size_t masks_array_size);
-
-access_mask_t
-landlock_init_layer_masks(const struct landlock_ruleset *const domain,
-			  const access_mask_t access_request,
-			  layer_mask_t (*const layer_masks)[],
-			  const enum landlock_key_type key_type);
-
 #endif /* _SECURITY_LANDLOCK_RULESET_H */
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 33eafb71e4f3..2b148cbff456 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -479,8 +479,8 @@ SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
 SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 		flags)
 {
-	struct landlock_ruleset *new_dom,
-		*ruleset __free(landlock_put_ruleset) = NULL;
+	struct landlock_domain *new_dom;
+	struct landlock_ruleset *ruleset __free(landlock_put_ruleset) = NULL;
 	struct cred *new_cred;
 	struct landlock_cred_security *new_llcred;
 	bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
@@ -546,7 +546,7 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 	 * There is no possible race condition while copying and manipulating
 	 * the current credentials because they are dedicated per thread.
 	 */
-	new_dom = landlock_merge_ruleset(new_llcred->domain, ruleset);
+	new_dom = landlock_domain_merge_ruleset(new_llcred->domain, ruleset);
 	if (IS_ERR(new_dom)) {
 		abort_creds(new_cred);
 		return PTR_ERR(new_dom);
@@ -560,7 +560,7 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 #endif /* CONFIG_AUDIT */
 
 	/* Replaces the old (prepared) domain. */
-	landlock_put_ruleset(new_llcred->domain);
+	landlock_put_domain(new_llcred->domain);
 	new_llcred->domain = new_dom;
 
 #ifdef CONFIG_AUDIT
diff --git a/security/landlock/task.c b/security/landlock/task.c
index 2385017418ca..d79919c1fec3 100644
--- a/security/landlock/task.c
+++ b/security/landlock/task.c
@@ -38,8 +38,8 @@
  * Checks if the @parent domain is less or equal to (i.e. an ancestor, which
  * means a subset of) the @child domain.
  */
-static bool domain_scope_le(const struct landlock_ruleset *const parent,
-			    const struct landlock_ruleset *const child)
+static bool domain_scope_le(const struct landlock_domain *const parent,
+			    const struct landlock_domain *const child)
 {
 	const struct landlock_hierarchy *walker;
 
@@ -60,8 +60,8 @@ static bool domain_scope_le(const struct landlock_ruleset *const parent,
 	return false;
 }
 
-static int domain_ptrace(const struct landlock_ruleset *const parent,
-			 const struct landlock_ruleset *const child)
+static int domain_ptrace(const struct landlock_domain *const parent,
+			 const struct landlock_domain *const child)
 {
 	if (domain_scope_le(parent, child))
 		return 0;
@@ -86,7 +86,7 @@ static int hook_ptrace_access_check(struct task_struct *const child,
 				    const unsigned int mode)
 {
 	const struct landlock_cred_security *parent_subject;
-	const struct landlock_ruleset *child_dom;
+	const struct landlock_domain *child_dom;
 	int err;
 
 	/* Quick return for non-landlocked tasks. */
@@ -135,7 +135,7 @@ static int hook_ptrace_access_check(struct task_struct *const child,
 static int hook_ptrace_traceme(struct task_struct *const parent)
 {
 	const struct landlock_cred_security *parent_subject;
-	const struct landlock_ruleset *child_dom;
+	const struct landlock_domain *child_dom;
 	int err;
 
 	child_dom = landlock_get_current_domain();
@@ -176,8 +176,8 @@ static int hook_ptrace_traceme(struct task_struct *const parent)
  * Returns: True if the @client domain is scoped to access the @server,
  * unless the @server is also scoped in the same domain as @client.
  */
-static bool domain_is_scoped(const struct landlock_ruleset *const client,
-			     const struct landlock_ruleset *const server,
+static bool domain_is_scoped(const struct landlock_domain *const client,
+			     const struct landlock_domain *const server,
 			     access_mask_t scope)
 {
 	int client_layer, server_layer;
@@ -204,7 +204,7 @@ static bool domain_is_scoped(const struct landlock_ruleset *const client,
 	 * parent domains are scoped.
 	 */
 	for (; client_layer > server_layer; client_layer--) {
-		if (landlock_get_scope_mask(client, client_layer) & scope)
+		if (landlock_dom_get_scope_mask(client, client_layer) & scope)
 			return true;
 
 		client_walker = client_walker->parent;
@@ -217,7 +217,7 @@ static bool domain_is_scoped(const struct landlock_ruleset *const client,
 		server_walker = server_walker->parent;
 
 	for (; client_layer >= 0; client_layer--) {
-		if (landlock_get_scope_mask(client, client_layer) & scope) {
+		if (landlock_dom_get_scope_mask(client, client_layer) & scope) {
 			/*
 			 * Client and server are at the same level in the
 			 * hierarchy. If the client is scoped, the request is
@@ -233,9 +233,9 @@ static bool domain_is_scoped(const struct landlock_ruleset *const client,
 }
 
 static bool sock_is_scoped(struct sock *const other,
-			   const struct landlock_ruleset *const domain)
+			   const struct landlock_domain *const domain)
 {
-	const struct landlock_ruleset *dom_other;
+	const struct landlock_domain *dom_other;
 
 	/* The credentials will not change. */
 	lockdep_assert_held(&unix_sk(other)->lock);
-- 
2.49.0



* [RFC PATCH v2 10/12] landlock: Remove unused code
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (8 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 09/12] landlock: Update various code to use landlock_domain Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 11/12] landlock/task: Fix incorrect BUILD_BUG_ON() in domain_is_scoped Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 12/12] landlock: Use a hash function that does not involve division Tingmao Wang
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/ruleset.c | 239 ------------------------------------
 security/landlock/ruleset.h |  10 +-
 2 files changed, 1 insertion(+), 248 deletions(-)

diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 6bea1cc16b62..8e95276486f8 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -309,169 +309,6 @@ int landlock_insert_rule(struct landlock_ruleset *const ruleset,
 	return insert_rule(ruleset, id, &layers, ARRAY_SIZE(layers));
 }
 
-static int merge_tree(struct landlock_ruleset *const dst,
-		      struct landlock_ruleset *const src,
-		      const enum landlock_key_type key_type)
-{
-	struct landlock_rule *walker_rule, *next_rule;
-	struct rb_root *src_root;
-	int err = 0;
-
-	might_sleep();
-	lockdep_assert_held(&dst->lock);
-	lockdep_assert_held(&src->lock);
-
-	src_root = get_root(src, key_type);
-	if (IS_ERR(src_root))
-		return PTR_ERR(src_root);
-
-	/* Merges the @src tree. */
-	rbtree_postorder_for_each_entry_safe(walker_rule, next_rule, src_root,
-					     node) {
-		struct landlock_layer layers[] = { {
-			.level = dst->num_layers,
-		} };
-		const struct landlock_id id = {
-			.key = walker_rule->key,
-			.type = key_type,
-		};
-
-		if (WARN_ON_ONCE(walker_rule->num_layers != 1))
-			return -EINVAL;
-
-		if (WARN_ON_ONCE(walker_rule->layers[0].level != 0))
-			return -EINVAL;
-
-		layers[0].access = walker_rule->layers[0].access;
-
-		err = insert_rule(dst, id, &layers, ARRAY_SIZE(layers));
-		if (err)
-			return err;
-	}
-	return err;
-}
-
-static int merge_ruleset(struct landlock_ruleset *const dst,
-			 struct landlock_ruleset *const src)
-{
-	int err = 0;
-
-	might_sleep();
-	/* Should already be checked by landlock_merge_ruleset() */
-	if (WARN_ON_ONCE(!src))
-		return 0;
-	/* Only merge into a domain. */
-	if (WARN_ON_ONCE(!dst || !dst->hierarchy))
-		return -EINVAL;
-
-	/* Locks @dst first because we are its only owner. */
-	mutex_lock(&dst->lock);
-	mutex_lock_nested(&src->lock, SINGLE_DEPTH_NESTING);
-
-	/* Stacks the new layer. */
-	if (WARN_ON_ONCE(src->num_layers != 1 || dst->num_layers < 1)) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
-	dst->access_masks[dst->num_layers - 1] =
-		landlock_upgrade_handled_access_masks(src->access_masks[0]);
-
-	/* Merges the @src inode tree. */
-	err = merge_tree(dst, src, LANDLOCK_KEY_INODE);
-	if (err)
-		goto out_unlock;
-
-#if IS_ENABLED(CONFIG_INET)
-	/* Merges the @src network port tree. */
-	err = merge_tree(dst, src, LANDLOCK_KEY_NET_PORT);
-	if (err)
-		goto out_unlock;
-#endif /* IS_ENABLED(CONFIG_INET) */
-
-out_unlock:
-	mutex_unlock(&src->lock);
-	mutex_unlock(&dst->lock);
-	return err;
-}
-
-static int inherit_tree(struct landlock_ruleset *const parent,
-			struct landlock_ruleset *const child,
-			const enum landlock_key_type key_type)
-{
-	struct landlock_rule *walker_rule, *next_rule;
-	struct rb_root *parent_root;
-	int err = 0;
-
-	might_sleep();
-	lockdep_assert_held(&parent->lock);
-	lockdep_assert_held(&child->lock);
-
-	parent_root = get_root(parent, key_type);
-	if (IS_ERR(parent_root))
-		return PTR_ERR(parent_root);
-
-	/* Copies the @parent inode or network tree. */
-	rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
-					     parent_root, node) {
-		const struct landlock_id id = {
-			.key = walker_rule->key,
-			.type = key_type,
-		};
-
-		err = insert_rule(child, id, &walker_rule->layers,
-				  walker_rule->num_layers);
-		if (err)
-			return err;
-	}
-	return err;
-}
-
-static int inherit_ruleset(struct landlock_ruleset *const parent,
-			   struct landlock_ruleset *const child)
-{
-	int err = 0;
-
-	might_sleep();
-	if (!parent)
-		return 0;
-
-	/* Locks @child first because we are its only owner. */
-	mutex_lock(&child->lock);
-	mutex_lock_nested(&parent->lock, SINGLE_DEPTH_NESTING);
-
-	/* Copies the @parent inode tree. */
-	err = inherit_tree(parent, child, LANDLOCK_KEY_INODE);
-	if (err)
-		goto out_unlock;
-
-#if IS_ENABLED(CONFIG_INET)
-	/* Copies the @parent network port tree. */
-	err = inherit_tree(parent, child, LANDLOCK_KEY_NET_PORT);
-	if (err)
-		goto out_unlock;
-#endif /* IS_ENABLED(CONFIG_INET) */
-
-	if (WARN_ON_ONCE(child->num_layers <= parent->num_layers)) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
-	/* Copies the parent layer stack and leaves a space for the new layer. */
-	memcpy(child->access_masks, parent->access_masks,
-	       flex_array_size(parent, access_masks, parent->num_layers));
-
-	if (WARN_ON_ONCE(!parent->hierarchy)) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
-	landlock_get_hierarchy(parent->hierarchy);
-	child->hierarchy->parent = parent->hierarchy;
-
-out_unlock:
-	mutex_unlock(&parent->lock);
-	mutex_unlock(&child->lock);
-	return err;
-}
-
 static void free_ruleset(struct landlock_ruleset *const ruleset)
 {
 	struct landlock_rule *freeme, *next;
@@ -515,82 +352,6 @@ void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset)
 	}
 }
 
-/**
- * landlock_merge_ruleset - Merge a ruleset with a domain
- *
- * @parent: Parent domain.
- * @ruleset: New ruleset to be merged.
- *
- * The current task is requesting to be restricted.  The subjective credentials
- * must not be in an overridden state. cf. landlock_init_hierarchy_log().
- *
- * Returns the intersection of @parent and @ruleset, or returns @parent if
- * @ruleset is empty, or returns a duplicate of @ruleset if @parent is empty.
- */
-struct landlock_ruleset *
-landlock_merge_ruleset(struct landlock_ruleset *const parent,
-		       struct landlock_ruleset *const ruleset)
-{
-	struct landlock_ruleset *new_dom __free(landlock_put_ruleset) = NULL;
-	u32 num_layers;
-	int err;
-
-	might_sleep();
-	if (WARN_ON_ONCE(!ruleset || parent == ruleset))
-		return ERR_PTR(-EINVAL);
-
-	if (parent) {
-		if (parent->num_layers >= LANDLOCK_MAX_NUM_LAYERS)
-			return ERR_PTR(-E2BIG);
-		num_layers = parent->num_layers + 1;
-	} else {
-		num_layers = 1;
-	}
-
-	/* Creates a new domain... */
-	new_dom = create_ruleset(num_layers);
-	if (IS_ERR(new_dom))
-		return new_dom;
-
-	new_dom->hierarchy =
-		kzalloc(sizeof(*new_dom->hierarchy), GFP_KERNEL_ACCOUNT);
-	if (!new_dom->hierarchy)
-		return ERR_PTR(-ENOMEM);
-
-	refcount_set(&new_dom->hierarchy->usage, 1);
-
-	/* ...as a child of @parent... */
-	err = inherit_ruleset(parent, new_dom);
-	if (err)
-		return ERR_PTR(err);
-
-	/* ...and including @ruleset. */
-	err = merge_ruleset(new_dom, ruleset);
-	if (err)
-		return ERR_PTR(err);
-
-	err = landlock_init_hierarchy_log(new_dom->hierarchy);
-	if (err)
-		return ERR_PTR(err);
-
-	return no_free_ptr(new_dom);
-}
-
-/*
- * The returned access has the same lifetime as @ruleset.
- */
-const struct landlock_rule *
-landlock_find_rule(const struct landlock_ruleset *const ruleset,
-		   const struct landlock_id id)
-{
-	const struct rb_root *root;
-
-	root = get_root((struct landlock_ruleset *)ruleset, id.type);
-	if (IS_ERR(root))
-		return NULL;
-	return landlock_find_in_tree(root, id.key);
-}
-
 typedef access_mask_t
 get_dom_access_mask_t(const struct landlock_domain *const domain,
 		      const u16 layer_level);
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 418d18869916..517600b66d54 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -27,7 +27,7 @@ struct landlock_hierarchy;
  */
 struct landlock_layer {
 	/**
-	 * @level: Position of this layer in the layer stack.
+	 * @level: Position of this layer in the layer stack.  Starts from 1.
 	 */
 	u16 level;
 	/**
@@ -204,10 +204,6 @@ int landlock_insert_rule(struct landlock_ruleset *const ruleset,
 			 const struct landlock_id id,
 			 const access_mask_t access);
 
-struct landlock_ruleset *
-landlock_merge_ruleset(struct landlock_ruleset *const parent,
-		       struct landlock_ruleset *const ruleset);
-
 static inline struct landlock_rule *
 landlock_find_in_tree(const struct rb_root *const root,
 		      const union landlock_key key)
@@ -228,10 +224,6 @@ landlock_find_in_tree(const struct rb_root *const root,
 	return NULL;
 }
 
-const struct landlock_rule *
-landlock_find_rule(const struct landlock_ruleset *const ruleset,
-		   const struct landlock_id id);
-
 static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
 {
 	if (ruleset)
-- 
2.49.0



* [RFC PATCH v2 11/12] landlock/task: Fix incorrect BUILD_BUG_ON() in domain_is_scoped
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (9 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 10/12] landlock: Remove unused code Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  2025-07-06 15:16 ` [RFC PATCH v2 12/12] landlock: Use a hash function that does not involve division Tingmao Wang
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn,
	Tahera Fahimi

Commit [1] introduces a BUILD_BUG_ON() check which doesn't seem fully
correct or necessary, and which breaks the build for a later commit in
this series.  This patch replaces it with just a signedness check.

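For reference, the replacement assertion works because casting -1 to a
signed type keeps it negative, while casting it to an unsigned type
yields the type's maximum value:

	/* Fails the build iff client_layer has an unsigned type. */
	BUILD_BUG_ON((typeof(client_layer))(-1) >= 0);
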
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Link: https://lore.kernel.org/all/5f7ad85243b78427242275b93481cfc7c127764b.1725494372.git.fahimitahera@gmail.com/ [1]
Signed-off-by: Tingmao Wang <m@maowtm.org>
---

Mickaël, if this looks good, can we merge this separately?

 security/landlock/task.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/security/landlock/task.c b/security/landlock/task.c
index d79919c1fec3..46e9bcc3beea 100644
--- a/security/landlock/task.c
+++ b/security/landlock/task.c
@@ -190,10 +190,11 @@ static bool domain_is_scoped(const struct landlock_domain *const client,
 	client_layer = client->num_layers - 1;
 	client_walker = client->hierarchy;
 	/*
-	 * client_layer must be a signed integer with greater capacity
-	 * than client->num_layers to ensure the following loop stops.
+	 * The following two loops involving client_layer and server_layer
+	 * are only safe if those integers are signed.
 	 */
-	BUILD_BUG_ON(sizeof(client_layer) > sizeof(client->num_layers));
+	BUILD_BUG_ON((typeof(client_layer))(-1) >= 0);
+	BUILD_BUG_ON((typeof(server_layer))(-1) >= 0);
 
 	server_layer = server ? (server->num_layers - 1) : -1;
 	server_walker = server ? server->hierarchy : NULL;
-- 
2.49.0


* [RFC PATCH v2 12/12] landlock: Use a hash function that does not involve division
  2025-07-06 15:16 [RFC PATCH v2 00/12] landlock: Use coalesced hashtable for merged domains Tingmao Wang
                   ` (10 preceding siblings ...)
  2025-07-06 15:16 ` [RFC PATCH v2 11/12] landlock/task: Fix incorrect BUILD_BUG_ON() in domain_is_scoped Tingmao Wang
@ 2025-07-06 15:16 ` Tingmao Wang
  11 siblings, 0 replies; 13+ messages in thread
From: Tingmao Wang @ 2025-07-06 15:16 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Tingmao Wang, linux-security-module, Mikhail Ivanov, Jann Horn

The current hash uses a division to compute the mod, which can be slow and
also unnecessarily loses entropy (since ideally we want to use the most
significant bits of the hash):

./include/linux/hash.h:
78              return val * GOLDEN_RATIO_64 >> (64 - bits);
   0x0000000000000956 <+118>:   49 0f af c3             imul   %r11,%rax

security/landlock/domain.h:
178     DEFINE_COALESCED_HASH_TABLE(struct landlock_domain_index, dom_hash, key,
   0x000000000000095a <+122>:   48 c1 e8 20             shr    $0x20,%rax
   0x000000000000095e <+126>:   f7 f6                   div    %esi
   0x0000000000000960 <+128>:   89 d0                   mov    %edx,%eax
   0x0000000000000962 <+130>:   49 89 c5                mov    %rax,%r13

This commit introduces a hash_bits parameter to the hash table, and uses a
folding hash instead of a modulo to constrain the value to table_size.

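The exact folding used is in the diff below; for illustration, one common
division-free way to reduce a hash to [0, table_size) is a multiply-shift
(Lemire's "fastrange"), which consumes the most significant bits of the
hash:

	static inline u32 reduce(u32 hash, u32 table_size)
	{
		/* Maps hash proportionally into [0, table_size), no div. */
		return (u32)(((u64)hash * table_size) >> 32);
	}
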
Benchmark comparison:

	> ./parse-bpftrace.py typical-workload-{orig,arraydomain-{hashtable-modhash,hashtable-hashbits}}.log
	  landlock_overhead:    avg = 34        34      34
	                     median = 35        34      34
	  landlock_hook:        avg = 878       875     856
	                     median = 854       850     831
	  open_syscall:         avg = 2517      2532    2485
	                     median = 2457      2471    2425

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/coalesced_hash.h | 136 +++++++++++++++--------------
 security/landlock/domain.c         |  58 ++++++++++--
 security/landlock/domain.h         |  64 +++++++++++---
 3 files changed, 177 insertions(+), 81 deletions(-)

diff --git a/security/landlock/coalesced_hash.h b/security/landlock/coalesced_hash.h
index 56d32cf70d67..199df03a8c14 100644
--- a/security/landlock/coalesced_hash.h
+++ b/security/landlock/coalesced_hash.h
@@ -22,15 +22,16 @@
 
 typedef u32 h_index_t;
 
-typedef h_index_t (*hash_element_t)(const void *elem, h_index_t table_size);
+typedef h_index_t (*hash_element_t)(const void *elem, h_index_t table_size,
+				    int hash_bits);
 typedef h_index_t (*get_next_collision_t)(const void *elem);
 typedef void (*set_next_collision_t)(void *elem, h_index_t next_collision);
 typedef bool (*compare_element_t)(const void *key_elem, const void *found_elem);
 typedef bool (*element_is_empty_t)(const void *elem);
 
 static inline void *h_find(const void *table, h_index_t table_size,
-			   size_t elem_size, const void *elem_to_find,
-			   hash_element_t hash_elem,
+			   int hash_bits, size_t elem_size,
+			   const void *elem_to_find, hash_element_t hash_elem,
 			   get_next_collision_t get_next_collision,
 			   compare_element_t compare_elem,
 			   element_is_empty_t element_is_empty)
@@ -41,7 +42,7 @@ static inline void *h_find(const void *table, h_index_t table_size,
 	if (unlikely(table_size == 0))
 		return NULL;
 
-	curr_index = hash_elem(elem_to_find, table_size);
+	curr_index = hash_elem(elem_to_find, table_size, hash_bits);
 	if (WARN_ON_ONCE(curr_index >= table_size))
 		return NULL;
 	curr_elem = table + curr_index * elem_size;
@@ -53,7 +54,7 @@ static inline void *h_find(const void *table, h_index_t table_size,
 	next_collision = get_next_collision(curr_elem);
 	if (next_collision == curr_index)
 		return NULL;
-	next_hash = hash_elem(curr_elem, table_size);
+	next_hash = hash_elem(curr_elem, table_size, hash_bits);
 	if (next_hash != curr_index)
 		return NULL;
 
@@ -102,13 +103,14 @@ struct h_insert_scratch {
 	 * around all the time.
 	 */
 	h_index_t table_size;
+	int hash_bits;
 	void *table;
 	size_t elem_size;
 };
 
 static inline int h_init_insert_scratch(struct h_insert_scratch *scratch,
 					void *table, h_index_t table_size,
-					size_t elem_size)
+					size_t elem_size, int hash_bits)
 {
 	h_index_t i;
 
@@ -131,6 +133,7 @@ static inline int h_init_insert_scratch(struct h_insert_scratch *scratch,
 		scratch->prev_index[i] = i;
 
 	scratch->table_size = table_size;
+	scratch->hash_bits = hash_bits;
 	scratch->next_free_index = table_size - 1;
 	scratch->table = table;
 	scratch->elem_size = elem_size;
@@ -287,7 +290,7 @@ static inline void h_insert(struct h_insert_scratch *scratch, const void *elem,
 	 *    point to the existing element.
 	 */
 
-	target_idx = hash_elem(elem, scratch->table_size);
+	target_idx = hash_elem(elem, scratch->table_size, scratch->hash_bits);
 	if (WARN_ON_ONCE(target_idx >= scratch->table_size))
 		return;
 	target_elem = scratch->table + target_idx * scratch->elem_size;
@@ -300,7 +303,8 @@ static inline void h_insert(struct h_insert_scratch *scratch, const void *elem,
 		memcpy(target_elem, elem, scratch->elem_size);
 		set_next_collision(target_elem, target_idx);
 	} else {
-		target_hash = hash_elem(target_elem, scratch->table_size);
+		target_hash = hash_elem(target_elem, scratch->table_size,
+					scratch->hash_bits);
 		moved_to = __h_relocate_collision_slot(scratch, target_idx,
 						       get_next_collision,
 						       set_next_collision,
@@ -329,67 +333,67 @@ static inline void h_insert(struct h_insert_scratch *scratch, const void *elem,
  * @next_collision_member: The name of a member in @elem_type that is used
  * to store the index of the next collision in a collision chain.
  * @hash_expr: An expression that computes the hash of an element, given
- * const @elem_type *elem and h_index_t table_size.  If this function is
- * evaluated, table_size is always positive.
+ * const @elem_type *elem, h_index_t table_size and int hash_bits.  If
+ * this expression is evaluated, table_size is always positive.
  * @is_empty_expr: An expression that evaluates to true if the element is
  * empty (i.e. not used).  Empty elements are not returned by find.  If
  * the zero value of @elem_type is not "empty", the caller must set all
  * the slots to empty before using the table.
  */
-#define DEFINE_COALESCED_HASH_TABLE(elem_type, table_func_prefix, key_member, \
-				    next_collision_member, hash_expr,         \
-				    is_empty_expr)                            \
-	static inline h_index_t table_func_prefix##_hash_elem(                \
-		const void *_elem, h_index_t table_size)                      \
-	{                                                                     \
-		const elem_type *elem = _elem;                                \
-		return hash_expr;                                             \
-	}                                                                     \
-	static inline h_index_t table_func_prefix##_get_next_collision(       \
-		const void *elem)                                             \
-	{                                                                     \
-		return ((const elem_type *)elem)->next_collision_member;      \
-	}                                                                     \
-	static inline void table_func_prefix##_set_next_collision(            \
-		void *elem, h_index_t next_collision)                         \
-	{                                                                     \
-		((elem_type *)elem)->next_collision_member = next_collision;  \
-	}                                                                     \
-	static inline bool table_func_prefix##_compare_elem(                  \
-		const void *key_elem, const void *found_elem)                 \
-	{                                                                     \
-		const elem_type *key = key_elem;                              \
-		const elem_type *found = found_elem;                          \
-		return key->key_member.data == found->key_member.data;        \
-	}                                                                     \
-	static inline bool table_func_prefix##_element_is_empty(              \
-		const void *_elem)                                            \
-	{                                                                     \
-		const elem_type *elem = _elem;                                \
-		return is_empty_expr;                                         \
-	}                                                                     \
-	static inline const elem_type *table_func_prefix##_find(              \
-		const elem_type *table, h_index_t table_size,                 \
-		const elem_type *elem_to_find)                                \
-	{                                                                     \
-		return h_find(table, table_size, sizeof(elem_type),           \
-			      elem_to_find, table_func_prefix##_hash_elem,    \
-			      table_func_prefix##_get_next_collision,         \
-			      table_func_prefix##_compare_elem,               \
-			      table_func_prefix##_element_is_empty);          \
-	}                                                                     \
-	static inline void table_func_prefix##_initialize(                    \
-		elem_type *table, h_index_t table_size)                       \
-	{                                                                     \
-		h_initialize(table, table_size, sizeof(elem_type),            \
-			     table_func_prefix##_set_next_collision,          \
-			     table_func_prefix##_element_is_empty);           \
-	}                                                                     \
-	static inline void table_func_prefix##_insert(                        \
-		struct h_insert_scratch *scratch, const elem_type *elem)      \
-	{                                                                     \
-		h_insert(scratch, elem, table_func_prefix##_hash_elem,        \
-			 table_func_prefix##_get_next_collision,              \
-			 table_func_prefix##_set_next_collision,              \
-			 table_func_prefix##_element_is_empty);               \
+#define DEFINE_COALESCED_HASH_TABLE(elem_type, table_func_prefix, key_member,  \
+				    next_collision_member, hash_expr,          \
+				    is_empty_expr)                             \
+	static inline h_index_t table_func_prefix##_hash_elem(                 \
+		const void *_elem, h_index_t table_size, int hash_bits)        \
+	{                                                                      \
+		const elem_type *elem = _elem;                                 \
+		return hash_expr;                                              \
+	}                                                                      \
+	static inline h_index_t table_func_prefix##_get_next_collision(        \
+		const void *elem)                                              \
+	{                                                                      \
+		return ((const elem_type *)elem)->next_collision_member;       \
+	}                                                                      \
+	static inline void table_func_prefix##_set_next_collision(             \
+		void *elem, h_index_t next_collision)                          \
+	{                                                                      \
+		((elem_type *)elem)->next_collision_member = next_collision;   \
+	}                                                                      \
+	static inline bool table_func_prefix##_compare_elem(                   \
+		const void *key_elem, const void *found_elem)                  \
+	{                                                                      \
+		const elem_type *key = key_elem;                               \
+		const elem_type *found = found_elem;                           \
+		return key->key_member.data == found->key_member.data;         \
+	}                                                                      \
+	static inline bool table_func_prefix##_element_is_empty(               \
+		const void *_elem)                                             \
+	{                                                                      \
+		const elem_type *elem = _elem;                                 \
+		return is_empty_expr;                                          \
+	}                                                                      \
+	static inline const elem_type *table_func_prefix##_find(               \
+		const elem_type *table, h_index_t table_size, int hash_bits,   \
+		const elem_type *elem_to_find)                                 \
+	{                                                                      \
+		return h_find(table, table_size, hash_bits, sizeof(elem_type), \
+			      elem_to_find, table_func_prefix##_hash_elem,     \
+			      table_func_prefix##_get_next_collision,          \
+			      table_func_prefix##_compare_elem,                \
+			      table_func_prefix##_element_is_empty);           \
+	}                                                                      \
+	static inline void table_func_prefix##_initialize(                     \
+		elem_type *table, h_index_t table_size)                        \
+	{                                                                      \
+		h_initialize(table, table_size, sizeof(elem_type),             \
+			     table_func_prefix##_set_next_collision,           \
+			     table_func_prefix##_element_is_empty);            \
+	}                                                                      \
+	static inline void table_func_prefix##_insert(                         \
+		struct h_insert_scratch *scratch, const elem_type *elem)       \
+	{                                                                      \
+		h_insert(scratch, elem, table_func_prefix##_hash_elem,         \
+			 table_func_prefix##_get_next_collision,               \
+			 table_func_prefix##_set_next_collision,               \
+			 table_func_prefix##_element_is_empty);                \
 	}
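
For illustration, a hypothetical instantiation that threads the new
hash_bits parameter through might look as follows (struct demo_entry, the
demo_hash prefix and the zero-key "empty" convention are made up for this
sketch, which assumes <linux/hash.h> plus this header; the real user is
the dom_hash table in domain.h below):

	struct demo_entry {
		struct { unsigned long data; } key; /* _compare_elem reads .data */
		h_index_t next_collision;
	};

	static inline h_index_t demo_hash_func(const struct demo_entry *elem,
					       h_index_t table_size,
					       int hash_bits)
	{
		h_index_t h;

		if (hash_bits <= 0)
			return 0;
		h = hash_long(elem->key.data, hash_bits);
		/* fold: (1 << hash_bits) < 2 * table_size */
		if (h >= table_size)
			h -= table_size;
		return h;
	}

	/* hash_expr sees elem, table_size and hash_bits in scope. */
	DEFINE_COALESCED_HASH_TABLE(struct demo_entry, demo_hash, key,
				    next_collision,
				    demo_hash_func(elem, table_size, hash_bits),
				    elem->key.data == 0)

	/* Lookups then pass hash_bits alongside table_size: */
	/* found = demo_hash_find(table, table_size, hash_bits, &key_elem); */
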
diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index 4091e80e45df..bcbad313a958 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -159,6 +159,7 @@ static int dom_calculate_merged_sizes(
 	const struct landlock_rule *walker_rule, *next_rule;
 	struct landlock_domain_index find_key;
 	const struct landlock_domain_index *found;
+	int dom_hash_bits = get_hash_bits(dom_num_indices);
 
 	build_check_domain();
 
@@ -168,7 +169,7 @@ static int dom_calculate_merged_sizes(
 		if (dom_ind_array) {
 			find_key.key = walker_rule->key;
 			found = dom_hash_find(dom_ind_array, dom_num_indices,
-					      &find_key);
+					      dom_hash_bits, &find_key);
 		}
 		/* A new index is only needed if this is a non-overlapping new rule */
 		if (!found) {
@@ -214,6 +215,7 @@ static int dom_populate_indices(
 	const struct landlock_rule *walker_rule, *next_rule;
 	const struct landlock_domain_index *found;
 	struct h_insert_scratch scratch;
+	int dom_hash_bits = get_hash_bits(dom_num_indices);
 	int ret;
 	size_t i;
 
@@ -224,7 +226,8 @@ static int dom_populate_indices(
 	}
 
 	ret = h_init_insert_scratch(&scratch, out_indices, out_size,
-				    sizeof(*out_indices));
+				    sizeof(*out_indices),
+				    get_hash_bits(out_size));
 	if (ret)
 		return ret;
 
@@ -248,7 +251,7 @@ static int dom_populate_indices(
 		found = NULL;
 		if (dom_ind_array)
 			found = dom_hash_find(dom_ind_array, dom_num_indices,
-					      &target);
+					      dom_hash_bits, &target);
 		if (!found) {
 			if (WARN_ON_ONCE(indices_written >= out_size)) {
 				ret = -E2BIG;
@@ -311,6 +314,7 @@ dom_populate_layers(const struct landlock_domain_index *const dom_ind_array,
 	const struct landlock_rule *found_in_child;
 	const struct landlock_layer *layer;
 	struct landlock_layer child_layer;
+	int dom_hash_bits = get_hash_bits(dom_num_indices);
 
 	for (size_t i = 0; i < child_indices_size; i++) {
 		merged_index = &child_indices[i];
@@ -320,8 +324,9 @@ dom_populate_layers(const struct landlock_domain_index *const dom_ind_array,
 		found_in_parent.layers_end = NULL;
 		if (dom_ind_array)
 			found_in_parent = landlock_domain_find(
-				dom_ind_array, dom_num_indices, dom_layer_array,
-				dom_num_layers, merged_index->key);
+				dom_ind_array, dom_num_indices, dom_hash_bits,
+				dom_layer_array, dom_num_layers,
+				merged_index->key);
 		dom_rule_for_each_layer(found_in_parent, layer)
 		{
 			if (WARN_ON_ONCE(layers_written >= out_size))
@@ -387,6 +392,9 @@ static int merge_domain(const struct landlock_domain *parent,
 		if (err)
 			return err;
 
+		child->fs_index_hash_bits =
+			get_hash_bits(child->num_fs_indices);
+
 #ifdef CONFIG_INET
 		err = dom_calculate_merged_sizes(
 			parent ? dom_net_indices(parent) : NULL,
@@ -396,9 +404,13 @@ static int merge_domain(const struct landlock_domain *parent,
 			&child->num_net_layers);
 		if (err)
 			return err;
+
+		child->net_index_hash_bits =
+			get_hash_bits(child->num_net_indices);
 #else
 		child->num_net_indices = 0;
 		child->num_net_layers = 0;
+		child->net_index_hash_bits = 0;
 #endif /* CONFIG_INET */
 	} else {
 		err = dom_populate_indices(
@@ -847,6 +859,41 @@ bool landlock_unmask_layers(const struct landlock_found_rule rule,
 
 #ifdef CONFIG_SECURITY_LANDLOCK_KUNIT_TEST
 
+static void test_domain_hash_func(struct kunit *const test)
+{
+	u32 table_size, got_hash_bits, got_hash;
+	uintptr_t hash_input;
+	int i;
+	struct landlock_domain_index elem;
+
+	KUNIT_ASSERT_EQ(test, get_hash_bits(0), 0);
+
+	for (table_size = 1; table_size <= 65; table_size++) {
+		got_hash_bits = get_hash_bits(table_size);
+		KUNIT_ASSERT_GE_MSG(
+			test, 1 << got_hash_bits, table_size,
+			"get_hash_bits(%u) returned %d which is too small for table size %u",
+			table_size, got_hash_bits, table_size);
+		KUNIT_ASSERT_LE_MSG(
+			test, 1 << got_hash_bits, table_size * 2,
+			"get_hash_bits(%u) returned %d which is too large for table size %u",
+			table_size, got_hash_bits, table_size);
+
+		for (i = 0; i < 1000; i++) {
+			hash_input = get_random_long();
+			elem.key.data = hash_input;
+			got_hash = dom_index_hash_func(&elem, table_size,
+						       got_hash_bits);
+			KUNIT_ASSERT_LT_MSG(
+				test, got_hash, table_size,
+				"dom_index_hash_func(key=%lx, table_size=%u, hash_bits=%d) "
+				"returned %u which exceeded table size %u",
+				hash_input, table_size, got_hash_bits, got_hash,
+				table_size);
+		}
+	}
+}
+
 static void test_landlock_get_deny_masks(struct kunit *const test)
 {
 	const layer_mask_t layers1[BITS_PER_TYPE(access_mask_t)] = {
@@ -881,6 +928,7 @@ static struct kunit_case test_cases[] = {
 	/* clang-format off */
 	KUNIT_CASE(test_get_layer_deny_mask),
 	KUNIT_CASE(test_landlock_get_deny_masks),
+	KUNIT_CASE(test_domain_hash_func),
 	{}
 	/* clang-format on */
 };
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index ac820903ccb6..998c4e747a35 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -71,7 +71,17 @@ struct landlock_domain {
 	 * @num_layers: Number of layers in this domain.  This enables to
 	 * check that all the layers allow an access request.
 	 */
-	u32 num_layers;
+	u16 num_layers;
+	/**
+	 * @fs_index_hash_bits: Precomputed hash bits for the fs table, to
+	 * avoid calling get_hash_bits() on every lookup.
+	 */
+	u8 fs_index_hash_bits;
+	/**
+	 * @net_index_hash_bits: Precomputed hash bits for the net table, to
+	 * avoid calling get_hash_bits() on every lookup.
+	 */
+	u8 net_index_hash_bits;
 	/**
 	 * @num_fs_indices: Number of non-overlapping (i.e. not for the same
 	 * object) inode rules.  Does not include the terminating index.
@@ -175,9 +185,40 @@ struct landlock_domain {
  */
 #define dom_index_is_empty(elem) ((elem)->layer_index == U32_MAX)
 
+/**
+ * dom_index_hash_func - Hash function for the domain index tables.
+ */
+static inline h_index_t
+dom_index_hash_func(const struct landlock_domain_index *elem,
+		    const h_index_t table_size, const int hash_bits)
+{
+	if (hash_bits <= 0)
+		/* hash_long requires hash_bits > 0 */
+		return 0;
+	h_index_t h = hash_long(elem->key.data, hash_bits);
+	/* (1 << hash_bits) < 2 * table_size, so one subtraction folds h */
+	if (h >= table_size)
+		h -= table_size;
+	return h;
+}
+
+static inline int get_hash_bits(const u32 table_size)
+{
+	if (table_size <= 1)
+		return 0;
+	/*
+	 * Example:
+	 * For table_size = 2, we need 1 bit.   ilog2(2-1)+1 = 0+1 = 1.
+	 * For table_size = 3, we need 2 bits.  ilog2(3-1)+1 = 1+1 = 2.
+	 * For table_size = 4, we need 2 bits.  ilog2(4-1)+1 = 1+1 = 2.
+	 * For table_size = 5, we need 3 bits.  ilog2(5-1)+1 = 2+1 = 3.
+	 */
+	return ilog2(table_size - 1) + 1;
+}
+
 DEFINE_COALESCED_HASH_TABLE(struct landlock_domain_index, dom_hash, key,
 			    next_collision,
-			    hash_long(elem->key.data, 32) % table_size,
+			    dom_index_hash_func(elem, table_size, hash_bits),
 			    dom_index_is_empty(elem))
 
 struct landlock_domain *
@@ -207,13 +248,14 @@ struct landlock_found_rule {
  *
  * @indices_arr: The indices array to search in.
  * @num_indices: The number of elements in @indices_arr.
+ * @hash_bits: The corresponding hash_bits for the indices array.
  * @layers_arr: The layers array.
  * @num_layers: The number of elements in @layers_arr.
  * @key: The key to search for.
  */
 static inline struct landlock_found_rule
 landlock_domain_find(const struct landlock_domain_index *const indices_arr,
-		     const u32 num_indices,
+		     const u32 num_indices, const int hash_bits,
 		     const struct landlock_layer *const layers_arr,
 		     const u32 num_layers, const union landlock_key key)
 {
@@ -223,7 +265,7 @@ landlock_domain_find(const struct landlock_domain_index *const indices_arr,
 	struct landlock_found_rule out_found_rule = {};
 	const struct landlock_domain_index *found;
 
-	found = dom_hash_find(indices_arr, num_indices, &key_elem);
+	found = dom_hash_find(indices_arr, num_indices, hash_bits, &key_elem);
 
 	if (found) {
 		if (WARN_ON_ONCE(found->layer_index >= num_layers))
@@ -236,13 +278,15 @@ landlock_domain_find(const struct landlock_domain_index *const indices_arr,
 	return out_found_rule;
 }
 
-#define dom_find_index_fs(dom, key)                                      \
-	landlock_domain_find(dom_fs_indices(dom), (dom)->num_fs_indices, \
-			     dom_fs_layers(dom), (dom)->num_fs_layers, key)
+#define dom_find_index_fs(dom, key)                                         \
+	landlock_domain_find(dom_fs_indices(dom), (dom)->num_fs_indices,    \
+			     (dom)->fs_index_hash_bits, dom_fs_layers(dom), \
+			     (dom)->num_fs_layers, key)
 
-#define dom_find_index_net(dom, key)                                       \
-	landlock_domain_find(dom_net_indices(dom), (dom)->num_net_indices, \
-			     dom_net_layers(dom), (dom)->num_net_layers, key)
+#define dom_find_index_net(dom, key)                                          \
+	landlock_domain_find(dom_net_indices(dom), (dom)->num_net_indices,    \
+			     (dom)->net_index_hash_bits, dom_net_layers(dom), \
+			     (dom)->num_net_layers, key)
 
 #define dom_find_success(found_rule) ((found_rule).layers_start != NULL)
 
-- 
2.49.0

