linux-security-module.vger.kernel.org archive mirror
* landlock: Use hashtable for merged domains
@ 2025-05-21 19:31 Tingmao Wang
  2025-05-21 19:31 ` [RFC PATCH 01/10] landlock: Add some debug output Tingmao Wang
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:31 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

Hi Mickaël,

This is a set of (incomplete) patches for the "domain hashtable" work.
While this is still a WIP, I'm sending it now as per our discussion.  You
can take a look and see if the approach needs correcting / whether we want
to go ahead with using a hashtable.

Currently only implemented for fs access.

This set of changes is also available on the landlock-hashmap branch of
https://github.com/micromaomao/linux-dev.git


struct landlock_domain
----------------------

One of the major things I'm not sure about in this patch is the fact that
I've decided to create a standalone `struct landlock_domain`, instead of
re-using landlock_ruleset.  My original hope was that it might help make
the code a bit clearer - struct landlock_ruleset would just be for
unmerged rulesets (and it can keep using an rbtree if we still want to),
and we can use this new type for domains (and add new fields and locks to
it that wouldn't be on a ruleset, for the mutable domains / supervisor
work).

I also wanted to split the logic of dealing with rules in a merged domain
(hence needing to copy multiple layers etc.) out of create_rule and
insert_rule, and simplify these functions (especially insert_rule) by
letting them just deal with unmerged rulesets (so only one layer) - the
logic for merged domains would be different anyway, since it's no longer
an rbtree walk (but something like landlock_hash_upsert in patch 8
instead).
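
As a rough sketch of that division of labor (simplified from patch 8's
merge_domain_table(); "ruleset_rule" is an illustrative name):

    /* Unmerged ruleset: every rule has exactly one layer, so
     * insert_rule() only has to union the access bits of layer 0. */

    /* Merging a ruleset into a domain: append one new layer per rule,
     * going through the hashtable instead of an rbtree walk. */
    new_layer.access = ruleset_rule->layers[0].access;
    err = landlock_hash_upsert(&new_domain->inode_table, ruleset_rule->key,
                               LANDLOCK_KEY_INODE, new_layer);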

However, looking back at it, I'm not sure if creating this landlock_domain
was actually a good idea.  Going down this route, I will eventually have
to move all the domain-related logic from using ruleset to using
landlock_domain.  If we decide to use hashtables for unmerged rulesets too
(not done in this patch, and that needs more thought on how it would
work), then separating this out has even less value.

Regardless, the relevant code in this patch should still be easily
portable to just extend landlock_ruleset with hashtables, so I'm happy to
do that instead in the next version.


Hashtable implementation
------------------------

Since I couldn't find a suitable existing wrapper for runtime-sized (but
fixed after creation) hashtables, this patch uses a newly hand-rolled
implementation.  It's not very complex, so this might be fine, but there
is also rhashtable, which would be especially suitable if we want to use
hash tables for unmerged rulesets too.  See the patch 2 message for
details.
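
For a quick picture of how the new helpers fit together, here is a
minimal usage sketch (@object and @rule stand in for the caller's state;
the real call sites are in patches 8 and 9):

    struct landlock_hashtable ht;
    union landlock_key key = { .object = object };

    if (landlock_hash_init(expected_num_entries, &ht))
            return -ENOMEM;
    landlock_hash_insert(&ht, rule);      /* takes ownership of @rule */
    rule = landlock_hash_find(&ht, key);  /* NULL if no match */
    landlock_hash_free(&ht, LANDLOCK_KEY_INODE);  /* also frees the rules */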


Testing
-------

selftests/landlock/fs_tests all pass under KASAN, lockdep etc.  Other
tests showed no obvious issues, but base and net fail due to missing
config (no MPTCP, no CONFIG_ASYMMETRIC_KEY_TYPE, i.e. keyctl).

I ran before/after benchmarks using two workloads, on the same machine
and setup:

    1. run_microbench with different depths and numbers of extra rules,
    using the code in https://github.com/landlock-lsm/landlock-test-tools/pull/17

    2. A more "typical" workload I put together quickly, calling git status
    and the like repeatedly:
    https://github.com/torvalds/linux/commit/f1865ce970af97ac3b6f4edf580529b8cdc66371

On the "typical" workload, which has 2 layers and ~15 rules, we have:

Comparing:                    orig    hashtable  (% change)
  landlock_overhead:
    (this is the % of time spent in landlock hook in the open syscall)
                        avg = 34      33         (-2.9%)
                     median = 34      33         (-2.9%)

  landlock_hook:        avg = 837     775        (-7.4%) (unit: ns)
                     median = 813     748        (-8.0%) (unit: ns)

  open_syscall:         avg = 2429    2324       (-4.3%) (unit: ns)
                     median = 2370    2265       (-4.4%) (unit: ns)

Using the microbench script, for an extreme case on a path beneath 28
directories and with 10000 rules:

Comparing:                    orig    hashtable  (% change)
  landlock_overhead:    avg = 27      24         (-11.1%)
                     median = 29      25         (-13.8%)
  landlock_hook:        avg = 1913    1577       (-17.6%)
                     median = 1884    1544       (-18.0%)
  open_syscall:         avg = 6775    6259       (-7.6%)
                     median = 6666    6115       (-8.3%)

The full results can be found at
https://fileshare.maowtm.org/landlock/20250521/index.html

Closes: https://github.com/landlock-lsm/linux/issues/1

Tingmao Wang (10):
  landlock: Add some debug output
  landlock/hash: define (dynamic, non-resizable) hash table helpers
  landlock/hash: Use linear search for small tables
  landlock/ruleset: Rename and extract create_rule
  Add hlist_node member to struct landlock_rule
  landlock/domain: Define landlock_domain
  landlock: Add the new domain to landlock_cred_security
  landlock: Construct the inode hashtable in the new landlock_domain
  landlock/fs: Use the new hashtable-based domain to find inode rules
  landlock: Debug print inode hashtable in landlock_merge_ruleset2

 security/landlock/cred.c     |   8 +-
 security/landlock/cred.h     |   1 +
 security/landlock/domain.c   | 145 +++++++++++++++++
 security/landlock/domain.h   |  45 ++++++
 security/landlock/fs.c       | 107 +++++++++----
 security/landlock/fs.h       |   1 +
 security/landlock/hash.h     | 294 +++++++++++++++++++++++++++++++++++
 security/landlock/ruleset.c  |  80 +---------
 security/landlock/ruleset.h  |  99 +++++++++++-
 security/landlock/syscalls.c |  35 +++++
 10 files changed, 702 insertions(+), 113 deletions(-)
 create mode 100644 security/landlock/hash.h


base-commit: 3039ed432745f8fdf5cbb43fdc60b2e1aad624c1
--
2.49.0


* [RFC PATCH 01/10] landlock: Add some debug output
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
@ 2025-05-21 19:31 ` Tingmao Wang
  2025-05-21 19:31 ` [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers Tingmao Wang
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:31 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

These are behind #ifdef DEBUG for now, as iterating over all the rules /
each access bit might make things slower (even if dynamic pr_debug makes
the print itself a nop).
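
With CONFIG_DYNAMIC_DEBUG (and DEBUG defined for these files), the new
prints can be toggled at runtime, e.g.:

    echo 'file security/landlock/fs.c +p' > /sys/kernel/debug/dynamic_debug/control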

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/fs.c       | 47 ++++++++++++++++++++++++++++++++++++
 security/landlock/fs.h       |  1 +
 security/landlock/syscalls.c | 26 ++++++++++++++++++++
 3 files changed, 74 insertions(+)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 6fee7c20f64d..b407c644ac65 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -772,6 +772,10 @@ static bool is_access_to_paths_allowed(
 	layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL,
 	(*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL;
 
+#ifdef DEBUG
+	layer_mask_t all_layers = (1 << domain->num_layers) - 1;
+#endif
+
 	if (!access_request_parent1 && !access_request_parent2)
 		return true;
 
@@ -800,6 +804,14 @@ static bool is_access_to_paths_allowed(
 		access_masked_parent1 = access_masked_parent2 =
 			landlock_union_access_masks(domain).fs;
 		is_dom_check = true;
+
+#ifdef DEBUG
+		pr_debug(
+			"check access to path %pd4 for access request p1 %x p2 %x:\n",
+			path->dentry, access_request_parent1,
+			access_request_parent2);
+#endif
+
 	} else {
 		if (WARN_ON_ONCE(dentry_child1 || dentry_child2))
 			return false;
@@ -807,7 +819,15 @@ static bool is_access_to_paths_allowed(
 		access_masked_parent1 = access_request_parent1;
 		access_masked_parent2 = access_request_parent2;
 		is_dom_check = false;
+
+#ifdef DEBUG
+		pr_debug("check access to path %pd4 for access request %x:\n",
+			 path->dentry, access_request_parent1);
+#endif
 	}
+#ifdef DEBUG
+	pr_debug("  (need layer mask %x)", all_layers);
+#endif
 
 	if (unlikely(dentry_child1)) {
 		landlock_unmask_layers(
@@ -892,6 +912,33 @@ static bool is_access_to_paths_allowed(
 					  layer_masks_parent2,
 					  ARRAY_SIZE(*layer_masks_parent2));
 
+#ifdef DEBUG
+		{
+			rcu_read_lock();
+			pr_debug("  %pd: ino %lu (%p), rule: %s, allow: %s\n",
+				 walker_path.dentry,
+				 walker_path.dentry->d_inode->i_ino,
+				 walker_path.dentry->d_inode,
+				 rule ? "exists" : "does not exist",
+				 allowed_parent1 ? "yes" : "no");
+			unsigned long access_masked = access_masked_parent1;
+			unsigned long access_bit;
+			if (rule) {
+				for_each_set_bit(
+					access_bit, &access_masked,
+					ARRAY_SIZE(*layer_masks_parent1)) {
+					pr_debug(
+						"    access %x allowed by layer mask %d\n",
+						(1 << access_bit),
+						(~(*layer_masks_parent1)
+							 [access_bit]) &
+							all_layers);
+				}
+			}
+			rcu_read_unlock();
+		}
+#endif
+
 		/* Stops when a rule from each layer grants access. */
 		if (allowed_parent1 && allowed_parent2)
 			break;
diff --git a/security/landlock/fs.h b/security/landlock/fs.h
index bf9948941f2f..bedf61c15cd4 100644
--- a/security/landlock/fs.h
+++ b/security/landlock/fs.h
@@ -15,6 +15,7 @@
 #include <linux/init.h>
 #include <linux/rcupdate.h>
 
+#include "common.h"
 #include "access.h"
 #include "cred.h"
 #include "ruleset.h"
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 33eafb71e4f3..38eb8287f73d 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -559,6 +559,32 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 		new_dom->hierarchy->log_status = LANDLOCK_LOG_DISABLED;
 #endif /* CONFIG_AUDIT */
 
+#ifdef DEBUG
+	pr_debug("%s[%d] restricting self with landlock\n", current->comm,
+		 current->pid);
+	struct rb_node *node;
+	pr_debug("inode tree:\n");
+	for (node = rb_first(&new_dom->root_inode); node;
+	     node = rb_next(node)) {
+		const struct landlock_rule *rule =
+			rb_entry(node, struct landlock_rule, node);
+		spinlock_t *lock = &rule->key.object->lock;
+		rcu_read_lock();
+		spin_lock(lock);
+		struct inode *inode = rule->key.object->underobj;
+		if (inode)
+			pr_debug("  rule: ino %lu (%p)\n", inode->i_ino, inode);
+		else
+			pr_debug("  rule: inode released\n");
+		for (size_t i = 0; i < rule->num_layers; i++) {
+			pr_debug("    layer %u: access %x\n",
+				 rule->layers[i].level, rule->layers[i].access);
+		}
+		spin_unlock(lock);
+		rcu_read_unlock();
+	}
+#endif /* DEBUG */
+
 	/* Replaces the old (prepared) domain. */
 	landlock_put_ruleset(new_llcred->domain);
 	new_llcred->domain = new_dom;
-- 
2.49.0



* [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
  2025-05-21 19:31 ` [RFC PATCH 01/10] landlock: Add some debug output Tingmao Wang
@ 2025-05-21 19:31 ` Tingmao Wang
  2025-05-27 11:00   ` Mickaël Salaün
  2025-05-21 19:31 ` [RFC PATCH 03/10] landlock/hash: Use linear search for small tables Tingmao Wang
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:31 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

While there is already include/linux/hashtable.h, it relies on the static
size of the array as the size of the hash table, and is thus inconvenient
to use for this case, where we dynamically compute how many slots we need.

There are also the relativistic hash tables in rhashtable.h, which support
dynamic resizing etc., but they are more complicated and might be slower
to access.

However, on second thought, I'm wondering if we should just use hash
tables for both domains and not-yet-merged rulesets anyway (which would
save us from having a union in landlock_rule).  If we do that, then we
should indeed just use rhashtable.
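
For comparison, a sketch of the two approaches (DEFINE_HASHTABLE is the
static-size API from include/linux/hashtable.h; the dynamic lines mirror
landlock_hash_init() in this patch):

    /* Static: bucket count is fixed at compile time. */
    DEFINE_HASHTABLE(table, 4);  /* always 2^4 = 16 buckets */

    /* Dynamic: bucket count is computed when the table is created. */
    table_sz = roundup_pow_of_two(expected_num_entries);
    hash_bits = fls_long(table_sz - 1);
    hlist = kcalloc(table_sz, sizeof(struct hlist_head), GFP_KERNEL_ACCOUNT);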

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/hash.h | 117 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)
 create mode 100644 security/landlock/hash.h

diff --git a/security/landlock/hash.h b/security/landlock/hash.h
new file mode 100644
index 000000000000..955c5756d4d9
--- /dev/null
+++ b/security/landlock/hash.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock - Domain hashtable manipulation
+ *
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
+ */
+
+#ifndef _SECURITY_LANDLOCK_HASH_H
+#define _SECURITY_LANDLOCK_HASH_H
+
+#include <linux/slab.h>
+#include <linux/hash.h>
+
+#include "ruleset.h"
+
+struct landlock_hashtable {
+	struct hlist_head *hlist;
+
+	/**
+	 * @hash_bits: Number of bits in this hash index (i.e.  hlist has
+	 * 2^this many elements).
+	 */
+	int hash_bits;
+};
+
+#define landlock_hash_for_each(rule, ht, i)                \
+	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
+		hlist_for_each_entry(rule, &(ht)->hlist[i], hlist)
+
+#define landlock_hash_for_each_safe(rule, tmp, ht, i)      \
+	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
+		hlist_for_each_entry_safe(rule, tmp, &(ht)->hlist[i], hlist)
+
+static inline int landlock_hash_init(const size_t expected_num_entries,
+				     struct landlock_hashtable *out_ht)
+{
+	size_t table_sz = 1;
+	int hash_bits = 0;
+
+	if (likely(expected_num_entries > 0)) {
+		table_sz = roundup_pow_of_two(expected_num_entries);
+		hash_bits = fls_long(table_sz - 1);
+	}
+
+	/*
+	 * We allocate a table even if expected_num_entries == 0 to avoid
+	 * unnecessary branching in lookup code
+	 */
+
+	out_ht->hash_bits = hash_bits;
+	out_ht->hlist = kcalloc(table_sz, sizeof(struct hlist_head),
+				GFP_KERNEL_ACCOUNT);
+	if (!out_ht->hlist) {
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static inline void landlock_hash_free(struct landlock_hashtable *ht,
+				      const enum landlock_key_type key_type)
+{
+	struct landlock_rule *rule;
+	struct hlist_node *tmp;
+	size_t i;
+
+	if (key_type == LANDLOCK_KEY_INODE)
+		might_sleep();
+
+	if (!ht->hlist)
+		return;
+
+	landlock_hash_for_each_safe(rule, tmp, ht, i)
+	{
+		free_rule(rule, key_type);
+	}
+	kfree(ht->hlist);
+	ht->hlist = NULL;
+}
+
+static inline u32 landlock_hash_key(const union landlock_key key,
+				    const int hash_bits)
+{
+	return hash_ptr((void *)key.data, hash_bits);
+}
+
+static inline struct landlock_rule *
+landlock_hash_find(const struct landlock_hashtable *const ht,
+		   const union landlock_key key)
+{
+	struct hlist_head *head;
+	struct landlock_rule *rule;
+
+	head = &ht->hlist[landlock_hash_key(key, ht->hash_bits)];
+
+	hlist_for_each_entry(rule, head, hlist) {
+		if (rule->key.data == key.data)
+			return rule;
+	}
+
+	return NULL;
+}
+
+/**
+ * @landlock_hash_count - Return number of entries in the hashtable.
+ */
+static inline size_t landlock_hash_count(const struct landlock_hashtable *ht)
+{
+	size_t num_entries = 0;
+	struct landlock_rule *rule;
+	size_t i;
+	landlock_hash_for_each(rule, ht, i)
+	{
+		num_entries += 1;
+	}
+	return num_entries;
+}
-- 
2.49.0



* [RFC PATCH 03/10] landlock/hash: Use linear search for small tables
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
  2025-05-21 19:31 ` [RFC PATCH 01/10] landlock: Add some debug output Tingmao Wang
  2025-05-21 19:31 ` [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers Tingmao Wang
@ 2025-05-21 19:31 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 04/10] landlock/ruleset: Rename and extract create_rule Tingmao Wang
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:31 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

The cost of hashing is too significant when there's only a small number of
entries, in which case we might as well just walk the hlist.  The optimal
threshold can be determined empirically, which I will do later.
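
As a worked example, with the threshold of 4 used below:

    expected_num_entries = 3:  table_sz = 1, hash_bits = 0, so
        landlock_hash_find() walks a single hlist (a pure linear
        search, with no hash_ptr() call at all)
    expected_num_entries = 9:  table_sz = roundup_pow_of_two(9) = 16,
        hash_bits = 4, so lookups hash into one of 16 buckets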

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/hash.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/security/landlock/hash.h b/security/landlock/hash.h
index 955c5756d4d9..8393593cfe7b 100644
--- a/security/landlock/hash.h
+++ b/security/landlock/hash.h
@@ -13,6 +13,8 @@
 
 #include "ruleset.h"
 
+#define LANDLOCK_HASH_LINEAR_THRESHOLD 4
+
 struct landlock_hashtable {
 	struct hlist_head *hlist;
 
@@ -37,7 +39,11 @@ static inline int landlock_hash_init(const size_t expected_num_entries,
 	size_t table_sz = 1;
 	int hash_bits = 0;
 
-	if (likely(expected_num_entries > 0)) {
+	/*
+	 * For small tables, we just have one slot, essentially making lookups
+	 * a linear search.  Doing a hash for small tables is not worth it.
+	 */
+	if (expected_num_entries > LANDLOCK_HASH_LINEAR_THRESHOLD) {
 		table_sz = roundup_pow_of_two(expected_num_entries);
 		hash_bits = fls_long(table_sz - 1);
 	}
@@ -81,6 +87,10 @@ static inline void landlock_hash_free(struct landlock_hashtable *ht,
 static inline u32 landlock_hash_key(const union landlock_key key,
 				    const int hash_bits)
 {
+	if (hash_bits == 0) {
+		return 0;
+	}
+
 	return hash_ptr((void *)key.data, hash_bits);
 }
 
-- 
2.49.0



* [RFC PATCH 04/10] landlock/ruleset: Rename and extract create_rule
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (2 preceding siblings ...)
  2025-05-21 19:31 ` [RFC PATCH 03/10] landlock/hash: Use linear search for small tables Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 05/10] Add hlist_node member to struct landlock_rule Tingmao Wang
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

To be used in domain.h in a later patch

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/ruleset.c | 80 +------------------------------------
 security/landlock/ruleset.h | 77 +++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+), 78 deletions(-)

diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index ce7940efea51..37f25f8f27f2 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -77,71 +77,6 @@ landlock_create_ruleset(const access_mask_t fs_access_mask,
 	return new_ruleset;
 }
 
-static void build_check_rule(void)
-{
-	const struct landlock_rule rule = {
-		.num_layers = ~0,
-	};
-
-	BUILD_BUG_ON(rule.num_layers < LANDLOCK_MAX_NUM_LAYERS);
-}
-
-static bool is_object_pointer(const enum landlock_key_type key_type)
-{
-	switch (key_type) {
-	case LANDLOCK_KEY_INODE:
-		return true;
-
-#if IS_ENABLED(CONFIG_INET)
-	case LANDLOCK_KEY_NET_PORT:
-		return false;
-#endif /* IS_ENABLED(CONFIG_INET) */
-
-	default:
-		WARN_ON_ONCE(1);
-		return false;
-	}
-}
-
-static struct landlock_rule *
-create_rule(const struct landlock_id id,
-	    const struct landlock_layer (*const layers)[], const u32 num_layers,
-	    const struct landlock_layer *const new_layer)
-{
-	struct landlock_rule *new_rule;
-	u32 new_num_layers;
-
-	build_check_rule();
-	if (new_layer) {
-		/* Should already be checked by landlock_merge_ruleset(). */
-		if (WARN_ON_ONCE(num_layers >= LANDLOCK_MAX_NUM_LAYERS))
-			return ERR_PTR(-E2BIG);
-		new_num_layers = num_layers + 1;
-	} else {
-		new_num_layers = num_layers;
-	}
-	new_rule = kzalloc(struct_size(new_rule, layers, new_num_layers),
-			   GFP_KERNEL_ACCOUNT);
-	if (!new_rule)
-		return ERR_PTR(-ENOMEM);
-	RB_CLEAR_NODE(&new_rule->node);
-	if (is_object_pointer(id.type)) {
-		/* This should have been caught by insert_rule(). */
-		WARN_ON_ONCE(!id.key.object);
-		landlock_get_object(id.key.object);
-	}
-
-	new_rule->key = id.key;
-	new_rule->num_layers = new_num_layers;
-	/* Copies the original layer stack. */
-	memcpy(new_rule->layers, layers,
-	       flex_array_size(new_rule, layers, num_layers));
-	if (new_layer)
-		/* Adds a copy of @new_layer on the layer stack. */
-		new_rule->layers[new_rule->num_layers - 1] = *new_layer;
-	return new_rule;
-}
-
 static struct rb_root *get_root(struct landlock_ruleset *const ruleset,
 				const enum landlock_key_type key_type)
 {
@@ -160,17 +95,6 @@ static struct rb_root *get_root(struct landlock_ruleset *const ruleset,
 	}
 }
 
-static void free_rule(struct landlock_rule *const rule,
-		      const enum landlock_key_type key_type)
-{
-	might_sleep();
-	if (!rule)
-		return;
-	if (is_object_pointer(key_type))
-		landlock_put_object(rule->key.object);
-	kfree(rule);
-}
-
 static void build_check_ruleset(void)
 {
 	const struct landlock_ruleset ruleset = {
@@ -261,7 +185,7 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
 		 * Intersects access rights when it is a merge between a
 		 * ruleset and a domain.
 		 */
-		new_rule = create_rule(id, &this->layers, this->num_layers,
+		new_rule = landlock_create_rule(id, &this->layers, this->num_layers,
 				       &(*layers)[0]);
 		if (IS_ERR(new_rule))
 			return PTR_ERR(new_rule);
@@ -274,7 +198,7 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
 	build_check_ruleset();
 	if (ruleset->num_rules >= LANDLOCK_MAX_NUM_RULES)
 		return -E2BIG;
-	new_rule = create_rule(id, layers, num_layers, NULL);
+	new_rule = landlock_create_rule(id, layers, num_layers, NULL);
 	if (IS_ERR(new_rule))
 		return PTR_ERR(new_rule);
 	rb_link_node(&new_rule->node, parent_node, walker_node);
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 5da9a64f5af7..215578ad82f7 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -200,6 +200,83 @@ void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
 DEFINE_FREE(landlock_put_ruleset, struct landlock_ruleset *,
 	    if (!IS_ERR_OR_NULL(_T)) landlock_put_ruleset(_T))
 
+static void build_check_rule(void)
+{
+	const struct landlock_rule rule = {
+		.num_layers = ~0,
+	};
+
+	BUILD_BUG_ON(rule.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+}
+
+static bool is_object_pointer(const enum landlock_key_type key_type)
+{
+	switch (key_type) {
+	case LANDLOCK_KEY_INODE:
+		return true;
+
+#if IS_ENABLED(CONFIG_INET)
+	case LANDLOCK_KEY_NET_PORT:
+		return false;
+#endif /* IS_ENABLED(CONFIG_INET) */
+
+	default:
+		WARN_ON_ONCE(1);
+		return false;
+	}
+}
+
+static inline struct landlock_rule *
+landlock_create_rule(const struct landlock_id id,
+		     const struct landlock_layer (*const layers)[],
+		     const u32 num_layers,
+		     const struct landlock_layer *const new_layer)
+{
+	struct landlock_rule *new_rule;
+	u32 new_num_layers;
+
+	build_check_rule();
+	if (new_layer) {
+		/* Should already be checked by landlock_merge_ruleset(). */
+		if (WARN_ON_ONCE(num_layers >= LANDLOCK_MAX_NUM_LAYERS))
+			return ERR_PTR(-E2BIG);
+		new_num_layers = num_layers + 1;
+	} else {
+		new_num_layers = num_layers;
+	}
+	new_rule = kzalloc(struct_size(new_rule, layers, new_num_layers),
+			   GFP_KERNEL_ACCOUNT);
+	if (!new_rule)
+		return ERR_PTR(-ENOMEM);
+	RB_CLEAR_NODE(&new_rule->node);
+	if (is_object_pointer(id.type)) {
+		/* This should have been caught by insert_rule(). */
+		WARN_ON_ONCE(!id.key.object);
+		landlock_get_object(id.key.object);
+	}
+
+	new_rule->key = id.key;
+	new_rule->num_layers = new_num_layers;
+	/* Copies the original layer stack. */
+	memcpy(new_rule->layers, layers,
+	       flex_array_size(new_rule, layers, num_layers));
+	if (new_layer)
+		/* Adds a copy of @new_layer on the layer stack. */
+		new_rule->layers[new_rule->num_layers - 1] = *new_layer;
+	return new_rule;
+}
+
+static inline void free_rule(struct landlock_rule *const rule,
+			     const enum landlock_key_type key_type)
+{
+	might_sleep();
+	if (!rule)
+		return;
+	if (is_object_pointer(key_type))
+		landlock_put_object(rule->key.object);
+	kfree(rule);
+}
+
 int landlock_insert_rule(struct landlock_ruleset *const ruleset,
 			 const struct landlock_id id,
 			 const access_mask_t access);
-- 
2.49.0



* [RFC PATCH 05/10] Add hlist_node member to struct landlock_rule
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (3 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 04/10] landlock/ruleset: Rename and extract create_rule Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 06/10] landlock/domain: Define landlock_domain Tingmao Wang
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

This is to prepare for the new domain.  Since a rule can only be in either
a ruleset (rbtree-based) or a landlock_domain (hashtable-based), we can
make the node/hlist members a union.  For now, let create_rule initialize
it as before.

(Alternatively, if we used a hashtable for both cases, we would save 8
bytes per rule, but then we would need some kind of resizable hashtable.)

(Maybe we should use the relativistic hash tables after all?)
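
(For reference, the arithmetic behind those 8 bytes on a 64-bit kernel:
struct rb_node is three pointer-sized words (24 bytes) while struct
hlist_node is two pointers (16 bytes), so the union stays at 24 bytes,
and dropping the rb_node entirely would shrink each rule by 8 bytes.)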

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/ruleset.h | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 215578ad82f7..07823771b402 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -87,10 +87,16 @@ struct landlock_id {
  * struct landlock_rule - Access rights tied to an object
  */
 struct landlock_rule {
-	/**
-	 * @node: Node in the ruleset's red-black tree.
-	 */
-	struct rb_node node;
+	union {
+		/**
+		 * @node: Node in the ruleset's red-black tree.
+		 */
+		struct rb_node node;
+		/**
+		 * @hlist: Node in the domain's hash table.
+		 */
+		struct hlist_node hlist;
+	};
 	/**
 	 * @key: A union to identify either a kernel object (e.g. an inode) or
 	 * a raw data value (e.g. a network socket port). This is used as a key
@@ -248,7 +254,13 @@ landlock_create_rule(const struct landlock_id id,
 			   GFP_KERNEL_ACCOUNT);
 	if (!new_rule)
 		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * We assume the rule will be in an rbtree for now - in the
+	 * landlock_domain case, the caller can init the hlist afterward.
+	 */
 	RB_CLEAR_NODE(&new_rule->node);
+
 	if (is_object_pointer(id.type)) {
 		/* This should have been caught by insert_rule(). */
 		WARN_ON_ONCE(!id.key.object);
-- 
2.49.0



* [RFC PATCH 06/10] landlock/domain: Define landlock_domain
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (4 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 05/10] Add hlist_node member to struct landlock_rule Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 07/10] landlock: Add the new domain to landlock_cred_security Tingmao Wang
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

This will eventually take the role of landlock_ruleset (and maybe
landlock_hierarchy?), but for now it is just the inode rules hashtable.

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c | 36 ++++++++++++++++++++++++++++++++++++
 security/landlock/domain.h | 27 +++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index a647b68e8d06..180ed75da9e2 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -5,6 +5,7 @@
  * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
  * Copyright © 2018-2020 ANSSI
  * Copyright © 2024-2025 Microsoft Corporation
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
  */
 
 #include <kunit/test.h>
@@ -24,6 +25,41 @@
 #include "domain.h"
 #include "id.h"
 
+struct landlock_domain *landlock_alloc_domain(size_t num_inode_entries,
+					      u16 num_layers)
+{
+	struct landlock_domain *new_domain =
+		kzalloc(sizeof(struct landlock_domain), GFP_KERNEL_ACCOUNT);
+
+	if (!new_domain)
+		return NULL;
+	refcount_set(&new_domain->usage, 1);
+	new_domain->num_layers = num_layers;
+	if (landlock_hash_init(num_inode_entries, &new_domain->inode_table)) {
+		kfree(new_domain);
+		return NULL;
+	}
+
+	return new_domain;
+}
+
+static void free_domain(struct landlock_domain *const domain)
+{
+	might_sleep();
+
+	landlock_hash_free(&domain->inode_table, LANDLOCK_KEY_INODE);
+	kfree(domain);
+}
+
+void landlock_put_domain(struct landlock_domain *const domain)
+{
+	might_sleep();
+
+	if (domain && refcount_dec_and_test(&domain->usage)) {
+		free_domain(domain);
+	}
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 7fb70b25f85a..ed685f8ad52e 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -5,6 +5,7 @@
  * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
  * Copyright © 2018-2020 ANSSI
  * Copyright © 2024-2025 Microsoft Corporation
+ * Copyright © 2025      Tingmao Wang <m@maowtm.org>
  */
 
 #ifndef _SECURITY_LANDLOCK_DOMAIN_H
@@ -20,6 +21,32 @@
 
 #include "access.h"
 #include "audit.h"
+#include "hash.h"
+
+struct landlock_domain {
+	struct landlock_hashtable inode_table;
+
+	/**
+	 * @usage: Reference count for this struct.
+	 */
+	refcount_t usage;
+
+	/**
+	 * @num_layers: Number of layers in this domain.
+	 */
+	u16 num_layers;
+};
+
+struct landlock_domain *landlock_alloc_domain(size_t num_inode_entries,
+					      u16 num_layers);
+
+static inline void landlock_get_domain(struct landlock_domain *const domain)
+{
+	if (domain)
+		refcount_inc(&domain->usage);
+}
+
+void landlock_put_domain(struct landlock_domain *const domain);
 
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
-- 
2.49.0



* [RFC PATCH 07/10] landlock: Add the new domain to landlock_cred_security
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (5 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 06/10] landlock/domain: Define landlock_domain Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain Tingmao Wang
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

We add a new field, domain2, which will eventually replace the old struct
landlock_ruleset domain.  We also add get/put calls in the appropriate
places where this struct is copied/freed.

In a future patch, this domain2 will be renamed to domain.

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/cred.c   |  8 ++++----
 security/landlock/cred.h   |  1 +
 security/landlock/domain.c | 17 +++++++++++++++++
 security/landlock/domain.h | 32 +++++++++++++++++++++++---------
 security/landlock/fs.c     |  5 +++++
 5 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/security/landlock/cred.c b/security/landlock/cred.c
index 0cb3edde4d18..e08c9e206350 100644
--- a/security/landlock/cred.c
+++ b/security/landlock/cred.c
@@ -13,6 +13,7 @@
 
 #include "common.h"
 #include "cred.h"
+#include "domain.h"
 #include "ruleset.h"
 #include "setup.h"
 
@@ -24,6 +25,7 @@ static void hook_cred_transfer(struct cred *const new,
 
 	if (old_llcred->domain) {
 		landlock_get_ruleset(old_llcred->domain);
+		landlock_get_domain(old_llcred->domain2);
 		*landlock_cred(new) = *old_llcred;
 	}
 }
@@ -37,10 +39,8 @@ static int hook_cred_prepare(struct cred *const new,
 
 static void hook_cred_free(struct cred *const cred)
 {
-	struct landlock_ruleset *const dom = landlock_cred(cred)->domain;
-
-	if (dom)
-		landlock_put_ruleset_deferred(dom);
+	landlock_put_ruleset_deferred(landlock_cred(cred)->domain);
+	landlock_put_domain_deferred(landlock_cred(cred)->domain2);
 }
 
 #ifdef CONFIG_AUDIT
diff --git a/security/landlock/cred.h b/security/landlock/cred.h
index c82fe63ec598..b13b4c2e21aa 100644
--- a/security/landlock/cred.h
+++ b/security/landlock/cred.h
@@ -32,6 +32,7 @@ struct landlock_cred_security {
 	 * @domain: Immutable ruleset enforced on a task.
 	 */
 	struct landlock_ruleset *domain;
+	struct landlock_domain *domain2;
 
 #ifdef CONFIG_AUDIT
 	/**
diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index 180ed75da9e2..321c52b275fc 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -60,6 +60,23 @@ void landlock_put_domain(struct landlock_domain *const domain)
 	}
 }
 
+static void free_domain_work(struct work_struct *const work)
+{
+	struct landlock_domain *domain;
+
+	domain = container_of(work, struct landlock_domain, work_free);
+	free_domain(domain);
+}
+
+/* Only called by hook_cred_free(). */
+void landlock_put_domain_deferred(struct landlock_domain *const domain)
+{
+	if (domain && refcount_dec_and_test(&domain->usage)) {
+		INIT_WORK(&domain->work_free, free_domain_work);
+		schedule_work(&domain->work_free);
+	}
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index ed685f8ad52e..944420231040 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -18,6 +18,7 @@
 #include <linux/refcount.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 
 #include "access.h"
 #include "audit.h"
@@ -26,15 +27,26 @@
 struct landlock_domain {
 	struct landlock_hashtable inode_table;
 
-	/**
-	 * @usage: Reference count for this struct.
-	 */
-	refcount_t usage;
-
-	/**
-	 * @num_layers: Number of layers in this domain.
-	 */
-	u16 num_layers;
+	union {
+		/**
+		 * @work_free: Enables freeing a domain within a lockless
+		 * section.  This is only used by
+		 * landlock_put_domain_deferred() when @usage reaches zero.
+		 */
+		struct work_struct work_free;
+
+		struct {
+			/**
+			 * @usage: Reference count for this struct.
+			 */
+			refcount_t usage;
+
+			/**
+			 * @num_layers: Number of layers in this domain.
+			 */
+			u16 num_layers;
+		};
+	};
 };
 
 struct landlock_domain *landlock_alloc_domain(size_t num_inode_entries,
@@ -48,6 +60,8 @@ static inline void landlock_get_domain(struct landlock_domain *const domain)
 
 void landlock_put_domain(struct landlock_domain *const domain);
 
+void landlock_put_domain_deferred(struct landlock_domain *const domain);
+
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
 	LANDLOCK_LOG_RECORDED,
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index b407c644ac65..c4f442093c6e 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1844,6 +1844,7 @@ static bool control_current_fowner(struct fown_struct *const fown)
 static void hook_file_set_fowner(struct file *file)
 {
 	struct landlock_ruleset *prev_dom;
+	struct landlock_domain *prev_dom2;
 	struct landlock_cred_security fown_subject = {};
 	size_t fown_layer = 0;
 
@@ -1856,11 +1857,13 @@ static void hook_file_set_fowner(struct file *file)
 				current_cred(), signal_scope, &fown_layer);
 		if (new_subject) {
 			landlock_get_ruleset(new_subject->domain);
+			landlock_get_domain(new_subject->domain2);
 			fown_subject = *new_subject;
 		}
 	}
 
 	prev_dom = landlock_file(file)->fown_subject.domain;
+	prev_dom2 = landlock_file(file)->fown_subject.domain2;
 	landlock_file(file)->fown_subject = fown_subject;
 #ifdef CONFIG_AUDIT
 	landlock_file(file)->fown_layer = fown_layer;
@@ -1868,11 +1871,13 @@ static void hook_file_set_fowner(struct file *file)
 
 	/* May be called in an RCU read-side critical section. */
 	landlock_put_ruleset_deferred(prev_dom);
+	landlock_put_domain_deferred(prev_dom2);
 }
 
 static void hook_file_free_security(struct file *file)
 {
 	landlock_put_ruleset_deferred(landlock_file(file)->fown_subject.domain);
+	landlock_put_domain_deferred(landlock_file(file)->fown_subject.domain2);
 }
 
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
-- 
2.49.0



* [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (6 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 07/10] landlock: Add the new domain to landlock_cred_security Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-27 11:00   ` Mickaël Salaün
  2025-05-21 19:32 ` [RFC PATCH 09/10] landlock/fs: Use the new hashtable-based domain to find inode rules Tingmao Wang
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

Since we can't get rid of the old landlock_merge_ruleset yet, we call our
new thing landlock_merge_ruleset2.

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c   |  87 +++++++++++++++++++++++++++++
 security/landlock/domain.h   |   4 ++
 security/landlock/hash.h     | 105 +++++++++++++++++++++++++++++++++++
 security/landlock/ruleset.h  |   2 +-
 security/landlock/syscalls.c |   9 +++
 5 files changed, 206 insertions(+), 1 deletion(-)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index 321c52b275fc..fae21b260591 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -77,6 +77,93 @@ void landlock_put_domain_deferred(struct landlock_domain *const domain)
 	}
 }
 
+/**
+ * @curr_table may be NULL.
+ */
+static int merge_domain_table(const enum landlock_key_type key_type,
+			      const struct landlock_hashtable *const curr_table,
+			      struct landlock_hashtable *const new_table,
+			      struct landlock_layer new_layer,
+			      const struct rb_root *ruleset_rb_root)
+{
+	int err;
+	struct landlock_rule *iter, *iter2;
+
+	if (curr_table) {
+		err = landlock_hash_clone(new_table, curr_table, key_type);
+		if (err) {
+			return err;
+		}
+	}
+
+	/* Merge in new rules */
+	rbtree_postorder_for_each_entry_safe(iter, iter2, ruleset_rb_root,
+					     node) {
+		WARN_ON_ONCE(iter->layers[0].level != 0);
+		WARN_ON_ONCE(iter->num_layers != 1);
+		new_layer.access = iter->layers[0].access;
+
+		err = landlock_hash_upsert(new_table, iter->key, key_type,
+					   new_layer);
+		if (err) {
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * @landlock_merge_ruleset2 - Merge a ruleset with a (possibly NULL)
+ * domain, and return a new merged domain.
+ */
+struct landlock_domain *
+landlock_merge_ruleset2(const struct landlock_domain *curr_domain,
+			const struct landlock_ruleset *next_ruleset)
+{
+	size_t num_inodes = 0;
+	int err;
+	struct landlock_domain *new_domain;
+	struct landlock_layer new_layer = {
+		.level = 1,
+	};
+	struct landlock_rule *iter, *iter2;
+
+	if (WARN_ON_ONCE(!next_ruleset)) {
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (curr_domain) {
+		new_layer.level = curr_domain->num_layers + 1;
+		num_inodes = landlock_hash_count(&curr_domain->inode_table);
+	}
+
+	/* Find new expected size of inode table */
+	rbtree_postorder_for_each_entry_safe(iter, iter2,
+					     &next_ruleset->root_inode, node) {
+		if (!curr_domain ||
+		    landlock_hash_find(&curr_domain->inode_table, iter->key) ==
+			    NULL) {
+			num_inodes += 1;
+		}
+	}
+
+	new_domain = landlock_alloc_domain(num_inodes, new_layer.level);
+	if (!new_domain)
+		return ERR_PTR(-ENOMEM);
+
+	err = merge_domain_table(LANDLOCK_KEY_INODE,
+				 curr_domain ? &curr_domain->inode_table : NULL,
+				 &new_domain->inode_table, new_layer,
+				 &next_ruleset->root_inode);
+	if (err) {
+		landlock_put_domain(new_domain);
+		return ERR_PTR(err);
+	}
+
+	return new_domain;
+}
+
 #ifdef CONFIG_AUDIT
 
 /**
diff --git a/security/landlock/domain.h b/security/landlock/domain.h
index 944420231040..e52b32d8dd2b 100644
--- a/security/landlock/domain.h
+++ b/security/landlock/domain.h
@@ -62,6 +62,10 @@ void landlock_put_domain(struct landlock_domain *const domain);
 
 void landlock_put_domain_deferred(struct landlock_domain *const domain);
 
+struct landlock_domain *
+landlock_merge_ruleset2(const struct landlock_domain *curr_domain,
+			const struct landlock_ruleset *new_ruleset);
+
 enum landlock_log_status {
 	LANDLOCK_LOG_PENDING = 0,
 	LANDLOCK_LOG_RECORDED,
diff --git a/security/landlock/hash.h b/security/landlock/hash.h
index 8393593cfe7b..8208944c309e 100644
--- a/security/landlock/hash.h
+++ b/security/landlock/hash.h
@@ -10,6 +10,7 @@
 
 #include <linux/slab.h>
 #include <linux/hash.h>
+#include <linux/rculist.h>
 
 #include "ruleset.h"
 
@@ -125,3 +126,107 @@ static inline size_t landlock_hash_count(const struct landlock_hashtable *ht)
 	}
 	return num_entries;
 }
+
+/**
+ * @landlock_hash_insert - Insert a rule in the hashtable, taking
+ * ownership of the passed-in struct landlock_rule.  This function assumes
+ * that the rule is not already in the hash table.
+ */
+static inline void landlock_hash_insert(const struct landlock_hashtable *ht,
+					struct landlock_rule *const new_rule)
+{
+	struct hlist_head *head =
+		&ht->hlist[landlock_hash_key(new_rule->key, ht->hash_bits)];
+
+	hlist_add_head(&new_rule->hlist, head);
+}
+
+static inline int
+landlock_hash_clone(struct landlock_hashtable *const dst,
+		    const struct landlock_hashtable *const src,
+		    const enum landlock_key_type key_type)
+{
+	struct landlock_rule *curr_rule, *new_rule;
+	struct landlock_id id = {
+		.type = key_type,
+	};
+	size_t i;
+
+	landlock_hash_for_each(curr_rule, src, i)
+	{
+		id.key = curr_rule->key;
+		new_rule = landlock_create_rule(id, &curr_rule->layers,
+						curr_rule->num_layers, NULL);
+
+		if (IS_ERR(new_rule)) {
+			return PTR_ERR(new_rule);
+		}
+
+		/*
+		 * new_rule->hlist is invalid, but should still be safe to pass to
+		 * hlist_add_head().
+		 */
+		landlock_hash_insert(dst, new_rule);
+	}
+
+	return 0;
+}
+
+/**
+ * @landlock_hash_upsert - Either insert a new rule with the new layer in
+ * the hashtable, or update an existing one, adding the new layer.
+ *
+ * Hash table must have at least one slot.  This function doesn't take any
+ * locks - it's only valid to call this on a newly created (not yet
+ * committed to creds) domain.
+ *
+ * May error with -ENOMEM.
+ */
+static inline int landlock_hash_upsert(struct landlock_hashtable *const ht,
+				       union landlock_key key,
+				       const enum landlock_key_type key_type,
+				       struct landlock_layer new_layer)
+{
+	size_t index = landlock_hash_key(key, ht->hash_bits);
+	struct hlist_head *head = &ht->hlist[index];
+	struct landlock_rule *curr_rule, *new_rule;
+	const struct landlock_id id = {
+		.type = key_type,
+		.key = key,
+	};
+
+	hlist_for_each_entry(curr_rule, head, hlist) {
+		if (curr_rule->key.data != key.data)
+			continue;
+
+		new_rule = landlock_create_rule(id, &curr_rule->layers,
+						curr_rule->num_layers,
+						&new_layer);
+		if (IS_ERR(new_rule))
+			return PTR_ERR(new_rule);
+
+		/*
+		 * Replace curr_rule with new_rule in place within the hlist.
+		 * We don't really care about RCU here, but there is no plain
+		 * "hlist_replace".  It should be safe to call hlist_replace_rcu()
+		 * without first initializing new_rule->hlist.
+		 */
+		hlist_replace_rcu(&curr_rule->hlist, &new_rule->hlist);
+		free_rule(curr_rule, key_type);
+		return 0;
+	}
+
+	/* No existing rules found, insert new one. */
+	new_rule = landlock_create_rule(id, NULL, 0, &new_layer);
+	if (IS_ERR(new_rule))
+		return PTR_ERR(new_rule);
+
+	/*
+	 * new_rule->hlist is invalid, but should still be safe to pass to
+	 * hlist_add_head().
+	 */
+	hlist_add_head(&new_rule->hlist, head);
+	return 0;
+}
+
+#endif /* _SECURITY_LANDLOCK_HASH_H */
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 07823771b402..ac91d4a865b9 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -27,7 +27,7 @@ struct landlock_hierarchy;
  */
 struct landlock_layer {
 	/**
-	 * @level: Position of this layer in the layer stack.
+	 * @level: Position of this layer in the layer stack. Starts from 1.
 	 */
 	u16 level;
 	/**
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 38eb8287f73d..57ac3ee02f22 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -483,6 +483,7 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 		*ruleset __free(landlock_put_ruleset) = NULL;
 	struct cred *new_cred;
 	struct landlock_cred_security *new_llcred;
+	struct landlock_domain *new_domain2;
 	bool __maybe_unused log_same_exec, log_new_exec, log_subdomains,
 		prev_log_subdomains;
 
@@ -551,6 +552,12 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 		abort_creds(new_cred);
 		return PTR_ERR(new_dom);
 	}
+	new_domain2 = landlock_merge_ruleset2(new_llcred->domain2, ruleset);
+	if (IS_ERR(new_domain2)) {
+		landlock_put_ruleset(new_dom);
+		abort_creds(new_cred);
+		return PTR_ERR(new_domain2);
+	}
 
 #ifdef CONFIG_AUDIT
 	new_dom->hierarchy->log_same_exec = log_same_exec;
@@ -588,6 +595,8 @@ SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
 	/* Replaces the old (prepared) domain. */
 	landlock_put_ruleset(new_llcred->domain);
 	new_llcred->domain = new_dom;
+	landlock_put_domain(new_llcred->domain2);
+	new_llcred->domain2 = new_domain2;
 
 #ifdef CONFIG_AUDIT
 	new_llcred->domain_exec |= BIT(new_dom->num_layers - 1);
-- 
2.49.0



* [RFC PATCH 09/10] landlock/fs: Use the new hashtable-based domain to find inode rules
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (7 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-21 19:32 ` [RFC PATCH 10/10] landlock: Debug print inode hashtable in landlock_merge_ruleset2 Tingmao Wang
  2025-05-27 10:59 ` landlock: Use hashtable for merged domains Mickaël Salaün
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/fs.c | 55 ++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c4f442093c6e..0846362caaf9 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -7,6 +7,7 @@
  * Copyright © 2021-2025 Microsoft Corporation
  * Copyright © 2022 Günther Noack <gnoack3000@gmail.com>
  * Copyright © 2023-2024 Google LLC
+ * Copyright © 2025 Tingmao Wang <m@maowtm.org>
  */
 
 #include <asm/ioctls.h>
@@ -361,14 +362,12 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
  * Returns NULL if no rule is found or if @dentry is negative.
  */
 static const struct landlock_rule *
-find_rule(const struct landlock_ruleset *const domain,
+find_rule(const struct landlock_domain *const domain,
 	  const struct dentry *const dentry)
 {
 	const struct landlock_rule *rule;
 	const struct inode *inode;
-	struct landlock_id id = {
-		.type = LANDLOCK_KEY_INODE,
-	};
+	union landlock_key key;
 
 	/* Ignores nonexistent leafs. */
 	if (d_is_negative(dentry))
@@ -376,8 +375,8 @@ find_rule(const struct landlock_ruleset *const domain,
 
 	inode = d_backing_inode(dentry);
 	rcu_read_lock();
-	id.key.object = rcu_dereference(landlock_inode(inode)->object);
-	rule = landlock_find_rule(domain, id);
+	key.object = rcu_dereference(landlock_inode(inode)->object);
+	rule = landlock_hash_find(&domain->inode_table, key);
 	rcu_read_unlock();
 	return rule;
 }
@@ -753,6 +752,7 @@ static void test_is_eacces_with_write(struct kunit *const test)
  */
 static bool is_access_to_paths_allowed(
 	const struct landlock_ruleset *const domain,
+	const struct landlock_domain *const domain2,
 	const struct path *const path,
 	const access_mask_t access_request_parent1,
 	layer_mask_t (*const layer_masks_parent1)[LANDLOCK_NUM_ACCESS_FS],
@@ -831,7 +831,7 @@ static bool is_access_to_paths_allowed(
 
 	if (unlikely(dentry_child1)) {
 		landlock_unmask_layers(
-			find_rule(domain, dentry_child1),
+			find_rule(domain2, dentry_child1),
 			landlock_init_layer_masks(
 				domain, LANDLOCK_MASK_ACCESS_FS,
 				&_layer_masks_child1, LANDLOCK_KEY_INODE),
@@ -841,7 +841,7 @@ static bool is_access_to_paths_allowed(
 	}
 	if (unlikely(dentry_child2)) {
 		landlock_unmask_layers(
-			find_rule(domain, dentry_child2),
+			find_rule(domain2, dentry_child2),
 			landlock_init_layer_masks(
 				domain, LANDLOCK_MASK_ACCESS_FS,
 				&_layer_masks_child2, LANDLOCK_KEY_INODE),
@@ -900,7 +900,7 @@ static bool is_access_to_paths_allowed(
 				break;
 		}
 
-		rule = find_rule(domain, walker_path.dentry);
+		rule = find_rule(domain2, walker_path.dentry);
 		allowed_parent1 = allowed_parent1 ||
 				  landlock_unmask_layers(
 					  rule, access_masked_parent1,
@@ -1012,9 +1012,9 @@ static int current_check_access_path(const struct path *const path,
 	access_request = landlock_init_layer_masks(subject->domain,
 						   access_request, &layer_masks,
 						   LANDLOCK_KEY_INODE);
-	if (is_access_to_paths_allowed(subject->domain, path, access_request,
-				       &layer_masks, &request, NULL, 0, NULL,
-				       NULL, NULL))
+	if (is_access_to_paths_allowed(subject->domain, subject->domain2, path,
+				       access_request, &layer_masks, &request,
+				       NULL, 0, NULL, NULL, NULL))
 		return 0;
 
 	landlock_log_denial(subject, &request);
@@ -1077,6 +1077,7 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
  */
 static bool collect_domain_accesses(
 	const struct landlock_ruleset *const domain,
+	const struct landlock_domain *const domain2,
 	const struct dentry *const mnt_root, struct dentry *dir,
 	layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
 {
@@ -1097,7 +1098,7 @@ static bool collect_domain_accesses(
 		struct dentry *parent_dentry;
 
 		/* Gets all layers allowing all domain accesses. */
-		if (landlock_unmask_layers(find_rule(domain, dir), access_dom,
+		if (landlock_unmask_layers(find_rule(domain2, dir), access_dom,
 					   layer_masks_dom,
 					   ARRAY_SIZE(*layer_masks_dom))) {
 			/*
@@ -1218,10 +1219,10 @@ static int current_check_refer_path(struct dentry *const old_dentry,
 			subject->domain,
 			access_request_parent1 | access_request_parent2,
 			&layer_masks_parent1, LANDLOCK_KEY_INODE);
-		if (is_access_to_paths_allowed(subject->domain, new_dir,
-					       access_request_parent1,
-					       &layer_masks_parent1, &request1,
-					       NULL, 0, NULL, NULL, NULL))
+		if (is_access_to_paths_allowed(
+			    subject->domain, subject->domain2, new_dir,
+			    access_request_parent1, &layer_masks_parent1,
+			    &request1, NULL, 0, NULL, NULL, NULL))
 			return 0;
 
 		landlock_log_denial(subject, &request1);
@@ -1245,11 +1246,13 @@ static int current_check_refer_path(struct dentry *const old_dentry,
 						      old_dentry->d_parent;
 
 	/* new_dir->dentry is equal to new_dentry->d_parent */
-	allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
-						old_parent,
+	allow_parent1 = collect_domain_accesses(subject->domain,
+						subject->domain2,
+						mnt_dir.dentry, old_parent,
 						&layer_masks_parent1);
-	allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
-						new_dir->dentry,
+	allow_parent2 = collect_domain_accesses(subject->domain,
+						subject->domain2,
+						mnt_dir.dentry, new_dir->dentry,
 						&layer_masks_parent2);
 
 	if (allow_parent1 && allow_parent2)
@@ -1262,10 +1265,10 @@ static int current_check_refer_path(struct dentry *const old_dentry,
 	 * destination parent access rights.
 	 */
 	if (is_access_to_paths_allowed(
-		    subject->domain, &mnt_dir, access_request_parent1,
-		    &layer_masks_parent1, &request1, old_dentry,
-		    access_request_parent2, &layer_masks_parent2, &request2,
-		    exchange ? new_dentry : NULL))
+		    subject->domain, subject->domain2, &mnt_dir,
+		    access_request_parent1, &layer_masks_parent1, &request1,
+		    old_dentry, access_request_parent2, &layer_masks_parent2,
+		    &request2, exchange ? new_dentry : NULL))
 		return 0;
 
 	if (request1.access) {
@@ -1689,7 +1692,7 @@ static int hook_file_open(struct file *const file)
 	full_access_request = open_access_request | optional_access;
 
 	if (is_access_to_paths_allowed(
-		    subject->domain, &file->f_path,
+		    subject->domain, subject->domain2, &file->f_path,
 		    landlock_init_layer_masks(subject->domain,
 					      full_access_request, &layer_masks,
 					      LANDLOCK_KEY_INODE),
-- 
2.49.0



* [RFC PATCH 10/10] landlock: Debug print inode hashtable in landlock_merge_ruleset2
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (8 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 09/10] landlock/fs: Use the new hashtable-based domain to find inode rules Tingmao Wang
@ 2025-05-21 19:32 ` Tingmao Wang
  2025-05-27 10:59 ` landlock: Use hashtable for merged domains Mickaël Salaün
  10 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-05-21 19:32 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Tingmao Wang, Günther Noack, linux-security-module

Signed-off-by: Tingmao Wang <m@maowtm.org>
---
 security/landlock/domain.c |  5 +++
 security/landlock/hash.h   | 62 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/security/landlock/domain.c b/security/landlock/domain.c
index fae21b260591..9c82f5c1bdb9 100644
--- a/security/landlock/domain.c
+++ b/security/landlock/domain.c
@@ -161,6 +161,11 @@ landlock_merge_ruleset2(const struct landlock_domain *curr_domain,
 		return ERR_PTR(err);
 	}
 
+#ifdef DEBUG
+	pr_debug("landlock_merge_ruleset2: inode hash table:\n");
+	landlock_hash_debug_print(&new_domain->inode_table, LANDLOCK_KEY_INODE);
+#endif /* DEBUG */
+
 	return new_domain;
 }
 
diff --git a/security/landlock/hash.h b/security/landlock/hash.h
index 8208944c309e..0c41cd8a102b 100644
--- a/security/landlock/hash.h
+++ b/security/landlock/hash.h
@@ -229,4 +229,66 @@ static inline int landlock_hash_upsert(struct landlock_hashtable *const ht,
 	return 0;
 }
 
+static inline void
+landlock_hash_debug_print(const struct landlock_hashtable *ht,
+			  const enum landlock_key_type key_type)
+{
+	size_t max_hlist_len = 0, slot_index = 0, num_rules = 0;
+
+	for (slot_index = 0; slot_index < (1ULL << ht->hash_bits);
+	     slot_index += 1) {
+		struct hlist_head *head = &ht->hlist[slot_index];
+		struct landlock_rule *rule;
+		size_t rule_index = 0;
+		spinlock_t *lock;
+
+		pr_debug("  [%zu]: first = %p\n", slot_index, head->first);
+
+		hlist_for_each_entry(rule, head, hlist) {
+			size_t j;
+
+			switch (key_type) {
+			case LANDLOCK_KEY_INODE:
+				lock = &rule->key.object->lock;
+				spin_lock(lock);
+				struct inode *inode =
+					((struct inode *)
+						 rule->key.object->underobj);
+				if (inode) {
+					pr_debug(
+						"    [%zu] rule: ino %lu (%p), %d layers\n",
+						rule_index, inode->i_ino, inode,
+						rule->num_layers);
+				} else {
+					pr_debug(
+						"    [%zu] rule: inode released, %d layers\n",
+						rule_index, rule->num_layers);
+				}
+				spin_unlock(lock);
+				break;
+			case LANDLOCK_KEY_NET_PORT:
+				pr_debug(
+					"    [%zu] rule: port %lu, %d layers\n",
+					rule_index, rule->key.data,
+					rule->num_layers);
+				break;
+			}
+			for (j = 0; j < rule->num_layers; j++) {
+				pr_debug("      layer %u: access %x\n",
+					 rule->layers[j].level,
+					 rule->layers[j].access);
+			}
+			rule_index += 1;
+			num_rules += 1;
+		}
+
+		if (rule_index > max_hlist_len)
+			max_hlist_len = rule_index;
+	}
+
+	pr_debug("  summary: %zu rules, %llu hash slots, "
+		 "%zu max hlist chain len\n",
+		 num_rules, (1ULL << ht->hash_bits), max_hlist_len);
+}
+
 #endif /* _SECURITY_LANDLOCK_HASH_H */
-- 
2.49.0



* Re: landlock: Use hashtable for merged domains
  2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
                   ` (9 preceding siblings ...)
  2025-05-21 19:32 ` [RFC PATCH 10/10] landlock: Debug print inode hashtable in landlock_merge_ruleset2 Tingmao Wang
@ 2025-05-27 10:59 ` Mickaël Salaün
  10 siblings, 0 replies; 18+ messages in thread
From: Mickaël Salaün @ 2025-05-27 10:59 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	Jann Horn

On Wed, May 21, 2025 at 08:31:56PM +0100, Tingmao Wang wrote:
> Hi Mickaël,
> 
> This is a set of (incomplete) patches for the "domain hashtable" work.
> While this is still a WIP, I'm sending it now as per our discussion.  You
> can take a look and see if the approach needs correcting / if we want to
> go ahead with using hashtable.
> 
> Currently only implemented for fs access.
> 
> This set of changes is also available on the landlock-hashmap branch of
> https://github.com/micromaomao/linux-dev.git

Thanks for these patches and the related benchmarks!

> 
> 
> struct landlock_domain
> ----------------------
> 
> One of the major thing I'm not sure about in this patch is the fact that
> I've decided to create a standalone `struct landlock_domain`, instead of
> re-using landlock_ruleset.  My original hope was that it might help make
> the code a bit clearer - struct landlock_ruleset would just be for
> unmerged rulesets (and it can use rbtree if we still wants to), and we can
> use this new type for domains (and add new fields and locks to it that
> wouldn't be on a ruleset, for the mutable domains / supervisor work).
> 
> I also wanted to split the logic of dealing with rules in a merged domain
> (hence needing to copy multiple layers etc) out of create_rule and
> insert_rule, and simply these functions (especially insert_rule) by
> letting them just deal with unmerged rulesets (so only one layer) - the
> logic for merged domains would be different anyway since it's no longer a
> rbtree walk (but something like landlock_hash_upsert in patch 8 instead).
> 
> However, looking back at it, I'm not sure if creating this landlock_domain
> was actually a good idea.  Going down this route eventually I will just
> have to move all the domain-related logic from using ruleset to using
> landlock_domain.  If we decide to use hashtable for unmerged rulesets too
> (not done in this patch, and need more thoughts on how that would work),
> then separating this out is even less value.
> 
> Regardless, the relevant code in this patch should still be easily
> portable to just extend landlock_ruleset with hashtables, so I'm happy to
> do that instead in the next version.

I think a dedicated struct landlock_domain has value: a clearer API, a
smaller domain size, and room for optimizations.  In the current
implementation of the hash table, the domain's size increases in most
cases because of the table's size.  We also still have scattered rules
and at least two pointer dereferences (even without hash collisions):
landlock_hashtable.hlist, and hlist_node.first.  This could be avoided
with a compact flat structure, using a hash table or just an array with
sorted keys.  See a new proposal in my reply to patch 2.

> 
> 
> Hashtable implementation
> ------------------------
> 
> Since I couldn't find a suitable existing wrapper for runtime-sized (but
> fixed after creation) hashtables, this patch uses a newly hand-rolled
> implementation.  It's not very complex so this might be fine, but there is
> also rhashtable which would be especially suitable if we want to use hash
> table for unmerged rulesets too.  See patch 2 message for details.
> 
> 
> Testing
> -------
> 
> selftests/landlock/fs_tests all passes under KASAN, lockdep etc.  Other
> tests showed no obvious issues, but base and net fails due to missing
> config (no MPTCP, no CONFIG_ASYMMETRIC_KEY_TYPE i.e. keyctl)
> 
> I ran benchmark with before/after using two workloads, on the same machine
> and setup:
> 
>     1. run_microbench with different depth and number of extra rules using
>     code in https://github.com/landlock-lsm/landlock-test-tools/pull/17
> 
>     2. A more "typical" workload I put together quickly, calling git status
>     and the like repeatedly:
>     https://github.com/torvalds/linux/commit/f1865ce970af97ac3b6f4edf580529b8cdc66371
> 
> On the "typical" workload, which has 2 layers and ~15 rules, we have:
> 
> Comparing:                    orig	  hashtable  (% change)
>   landlock_overhead:
>     (this is the % of time spent in landlock hook in the open syscall)
>                         avg = 34      33         (-2.9%)
>                      median = 34      33         (-2.9%)
> 
>   landlock_hook:        avg = 837     775        (-7.4%) (unit: ns)
>                      median = 813     748        (-8.0%) (unit: ns)
> 
>   open_syscall:         avg = 2429    2324       (-4.3%) (unit: ns)
>                      median = 2370    2265       (-4.4%) (unit: ns)
> 
> Using the microbench script, for an extreme case on a path beneath 28
> directories and with 10000 rules:
> 
> Comparing:                    orig	  hashtable  (% change)
>   landlock_overhead:    avg = 27      24         (-11.1%)
>                      median = 29      25         (-13.8%)
>   landlock_hook:        avg = 1913    1577       (-17.6%)
>                      median = 1884    1544       (-18.0%)
>   open_syscall:         avg = 6775    6259       (-7.6%)
>                      median = 6666    6115       (-8.3%)
> 
> The full results can be found at
> https://fileshare.maowtm.org/landlock/20250521/index.html
> 
> Closes: https://github.com/landlock-lsm/linux/issues/1
> 
> Tingmao Wang (10):
>   landlock: Add some debug output
>   landlock/hash: define (dynamic, non-resizable) hash table helpers
>   landlock/hash: Use linear search for small tables
>   landlock/ruleset: Rename and extract create_rule
>   Add hlist_node member to struct landlock_rule
>   landlock/domain: Define landlock_domain
>   landlock: Add the new domain to landlock_cred_security
>   landlock: Construct the inode hashtable in the new landlock_domain
>   landlock/fs: Use the new hashtable-based domain to find inode rules
>   landlock: Debug print inode hashtable in landlock_merge_ruleset2
> 
>  security/landlock/cred.c     |   8 +-
>  security/landlock/cred.h     |   1 +
>  security/landlock/domain.c   | 145 +++++++++++++++++
>  security/landlock/domain.h   |  45 ++++++
>  security/landlock/fs.c       | 107 +++++++++----
>  security/landlock/fs.h       |   1 +
>  security/landlock/hash.h     | 294 +++++++++++++++++++++++++++++++++++
>  security/landlock/ruleset.c  |  80 +---------
>  security/landlock/ruleset.h  |  99 +++++++++++-
>  security/landlock/syscalls.c |  35 +++++
>  10 files changed, 702 insertions(+), 113 deletions(-)
>  create mode 100644 security/landlock/hash.h
> 
> 
> base-commit: 3039ed432745f8fdf5cbb43fdc60b2e1aad624c1
> --
> 2.49.0


* Re: [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers
  2025-05-21 19:31 ` [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers Tingmao Wang
@ 2025-05-27 11:00   ` Mickaël Salaün
  2025-06-02 13:20     ` Tingmao Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Mickaël Salaün @ 2025-05-27 11:00 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	Jann Horn

On Wed, May 21, 2025 at 08:31:58PM +0100, Tingmao Wang wrote:
> While there is already include/linux/hash.h, it relies on the static size
> of the array as the size of the hash table, and thus is inconvenient to
> use for this case where we dynamically compute how many slots we need.
> 
> There is also the relativistic hash tables in rhashtable.h which supports
> dynamic resizes etc, but is more complicated and might be slower to access?
> 
> However, on second thought, I'm wondering if we should just use hash
> tables for both domain and a not-yet-merged ruleset anyway (which saves us
> from having a union in landlock_rule).  If we do that then we should
> indeed just use rhashtable.

Thinking more about it, the important properties are that we can have a
lot of domains/tables (i.e. sandboxed processes) with a few
entries/rules (e.g. ~30 rules seems reasonable for now).  We should then
try to minimize the total amount of memory while still making access
checks quick.  As you noted, the cost of hashing should also not be
ignored.

Instead of a hash table per domain, what about flat arrays with binary
search?  Here is a proposal:

struct landlock_domain_index {
    union landlock_key key;
    u32 shift; // value to add to this array's index to jump to the set
               // of mapped landlock_layer
    u32 num_layers;
}; // 128-bit (or 96-bit) aligned

// Theoretical landlock_domain
struct landlock_domain {
    struct landlock_hierarchy *hierarchy;
    union {
        // See landlock_ruleset's union
    };
    u32 num_fs; // number of FS indexes
    u32 num_net; // number of net indexes
    struct access_masks access_masks[];
    struct landlock_domain_index fs_indexes[num_fs];
    struct landlock_layer fs_layers[sum of FS rules' layers];
    struct landlock_domain_index net_indexes[num_net];
    struct landlock_layer net_layers[sum of net rules' layers];
};

// Real landlock_domain
struct landlock_domain {
    struct landlock_hierarchy *hierarchy;
    union {
        // See landlock_ruleset's union
    };
    u32 num_fs; // number of FS indexes
    u32 num_net; // number of net indexes
    u32 rules[]; // underlying typed arrays accessed via helpers
};

The landlock_domain is a contiguously allocated memory buffer containing
variable-size arrays improving locality.  Another advantage is that we
would not get any potential allocation errors once the buffer is
allocated which should simplify domain creation.  Also, we avoid the
union in landlock_rule (patch 5) and only use landlock_rule in
landlock_ruleset.

An alternative approach would be to use a hash table instead of the
array (extending landlock_domain_index with a pointer/shift to the next
collision) but still with this big buffer.  I'm not sure the perf impact
would be noticeable, but it might be worth a try.

> 
> Signed-off-by: Tingmao Wang <m@maowtm.org>
> ---
>  security/landlock/hash.h | 117 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 117 insertions(+)
>  create mode 100644 security/landlock/hash.h
> 
> diff --git a/security/landlock/hash.h b/security/landlock/hash.h
> new file mode 100644
> index 000000000000..955c5756d4d9
> --- /dev/null
> +++ b/security/landlock/hash.h
> @@ -0,0 +1,117 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock - Domain hashtable mainpulation

typo

> + *
> + * Copyright © 2025      Tingmao Wang <m@maowtm.org>
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_HASH_H
> +#define _SECURITY_LANDLOCK_HASH_H
> +
> +#include <linux/slab.h>
> +#include <linux/hash.h>
> +
> +#include "ruleset.h"
> +
> +struct landlock_hashtable {
> +	struct hlist_head *hlist;

A simple linked-list would be enough.

> +
> +	/**
> +	 * @hash_bits: Number of bits in this hash index (i.e.  hlist has
> +	 * 2^this many elements).
> +	 */
> +	int hash_bits;
> +};
> +
> +#define landlock_hash_for_each(rule, ht, i)                \
> +	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
> +		hlist_for_each_entry(rule, &(ht)->hlist[i], hlist)
> +
> +#define landlock_hash_for_each_safe(rule, tmp, ht, i)      \
> +	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
> +		hlist_for_each_entry_safe(rule, tmp, &(ht)->hlist[i], hlist)
> +
> +static inline int landlock_hash_init(const size_t expected_num_entries,
> +				     struct landlock_hashtable *out_ht)
> +{
> +	size_t table_sz = 1;

Please use full variable names when they are not too long:
table_size in this case.

> +	int hash_bits = 0;
> +
> +	if (likely(expected_num_entries > 0)) {
> +		table_sz = roundup_pow_of_two(expected_num_entries);

With roundup_pow_of_two() we'll get a few collisions but a big table.
What would be the impact of using roundup_pow_of_two() instead to have a
compact hash table (with more collisions)?
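
(Illustrative numbers, assuming the second call was meant to be
rounddown_pow_of_two(), as clarified later in the thread: for 30
expected entries, roundup_pow_of_two(30) == 32 slots, i.e. a load
factor below 1, while rounddown_pow_of_two(30) == 16 slots, i.e.
roughly two entries per slot on average.)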

> +		hash_bits = fls_long(table_sz - 1);
> +	}
> +
> +	/*
> +	 * We allocate a table even if expected_num_entries == 0 to avoid
> +	 * unnecessary branching in lookup code
> +	 */
> +
> +	out_ht->hash_bits = hash_bits;
> +	out_ht->hlist = kcalloc(table_sz, sizeof(struct hlist_head),
> +				GFP_KERNEL_ACCOUNT);
> +	if (!out_ht->hlist) {
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static inline void landlock_hash_free(struct landlock_hashtable *ht,
> +				      const enum landlock_key_type key_type)
> +{
> +	struct landlock_rule *rule;
> +	struct hlist_node *tmp;
> +	size_t i;
> +
> +	if (key_type == LANDLOCK_KEY_INODE)
> +		might_sleep();
> +
> +	if (!ht->hlist)
> +		return;
> +
> +	landlock_hash_for_each_safe(rule, tmp, ht, i)
> +	{
> +		free_rule(rule, key_type);
> +	}
> +	kfree(ht->hlist);
> +	ht->hlist = NULL;
> +}
> +
> +static inline u32 landlock_hash_key(const union landlock_key key,
> +				    const int hash_bits)
> +{
> +	return hash_ptr((void *)key.data, hash_bits);
> +}
> +
> +static inline struct landlock_rule *
> +landlock_hash_find(const struct landlock_hashtable *const ht,
> +		   const union landlock_key key)
> +{
> +	struct hlist_head *head;
> +	struct landlock_rule *rule;
> +
> +	head = &ht->hlist[landlock_hash_key(key, ht->hash_bits)];
> +
> +	hlist_for_each_entry(rule, head, hlist) {
> +		if (rule->key.data == key.data)
> +			return rule;
> +	}
> +
> +	return NULL;
> +}
> +
> +/**
> + * @landlock_hash_count - Return number of entries in the hashtable.
> + */
> +static inline size_t landlock_hash_count(const struct landlock_hashtable *ht)
> +{
> +	size_t num_entries = 0;
> +	struct landlock_rule *rule;
> +	size_t i;
> +	landlock_hash_for_each(rule, ht, i)
> +	{
> +		num_entries += 1;
> +	}
> +	return num_entries;
> +}
> -- 
> 2.49.0
> 
> 


* Re: [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain
  2025-05-21 19:32 ` [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain Tingmao Wang
@ 2025-05-27 11:00   ` Mickaël Salaün
  2025-06-02 13:20     ` Tingmao Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Mickaël Salaün @ 2025-05-27 11:00 UTC (permalink / raw)
  To: Tingmao Wang; +Cc: Günther Noack, linux-security-module

On Wed, May 21, 2025 at 08:32:04PM +0100, Tingmao Wang wrote:
> Since we can't get rid of the old landlock_merge_ruleset yet, we call our
> new thing landlock_merge_ruleset2.
> 
> Signed-off-by: Tingmao Wang <m@maowtm.org>
> ---
>  security/landlock/domain.c   |  87 +++++++++++++++++++++++++++++
>  security/landlock/domain.h   |   4 ++
>  security/landlock/hash.h     | 105 +++++++++++++++++++++++++++++++++++
>  security/landlock/ruleset.h  |   2 +-
>  security/landlock/syscalls.c |   9 +++
>  5 files changed, 206 insertions(+), 1 deletion(-)


> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index 07823771b402..ac91d4a865b9 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -27,7 +27,7 @@ struct landlock_hierarchy;
>   */
>  struct landlock_layer {
>  	/**
> -	 * @level: Position of this layer in the layer stack.
> +	 * @level: Position of this layer in the layer stack. Starts from 1.

Feel free to send a standalone patch with improved doc, I'll merge it
directly.

>  	 */
>  	u16 level;
>  	/**


* Re: [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers
  2025-05-27 11:00   ` Mickaël Salaün
@ 2025-06-02 13:20     ` Tingmao Wang
  2025-06-02 13:26       ` Tingmao Wang
  2025-06-02 19:50       ` Mickaël Salaün
  0 siblings, 2 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-06-02 13:20 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	Jann Horn

On 5/27/25 12:00, Mickaël Salaün wrote:
> On Wed, May 21, 2025 at 08:31:58PM +0100, Tingmao Wang wrote:
>> While there is already include/linux/hash.h, it relies on the static size
>> of the array as the size of the hash table, and thus is inconvenient to
>> use for this case where we dynamically compute how many slots we need.
>>
>> There is also the relativistic hash tables in rhashtable.h which supports
>> dynamic resizes etc, but is more complicated and might be slower to access?
>>
>> However, on second thought, I'm wondering if we should just use hash
>> tables for both domain and a not-yet-merged ruleset anyway (which saves us
>> from having a union in landlock_rule).  If we do that then we should
>> indeed just use rhashtable.
>
> Thinking more about it, the important properties are that we can have a
> lot of domains/tables (i.e. sandboxed processes) with a few
> entries/rules (e.g. ~30 rules seems reasonable for now).  We should then
> try to minimize the total amount of memory while still making access
> checks quick.  As you noted, the cost of hashing should also not be
> ignored.
>
> Instead of a hash table per domain, what about flat arrays with binary
> search?  Here is a proposal:

I think this makes sense in terms of reducing memory footprint.

My understanding of the slab allocator is that objects are allocated in
power-of-two-sized caches, so a single-layer, 44-byte landlock_rule
will still use 64 bytes.  Putting them in an array, especially with
your proposed structure, would reduce this.
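
(As a rough worked example with the 13 rules dumped below: 13 * 64 =
832 bytes in scattered slab objects, versus roughly 13 * 16 = 208 bytes
of index entries plus 13 single-layer entries, a few bytes each, in one
contiguous buffer, taking the 128-bit index estimate proposed above;
that is on the order of a third of the slab footprint.)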

I also checked how the allocations are actually distributed in the current
implementation.  Dumping the addresses of the landlock_rule objects
allocated and sorting them by the address, we see:

# env LL_FS_RO=/usr:/bin:/etc:/lib:/sys:/dev:/home/mao/.config:/home/mao/linux LL_FS_RW=/tmp:/proc:/dev/tty:/dev/null:/dev/zero \
> ./sandboxer bash
Executing the sandboxed command...
bash: /home/mao/.bashrc: Permission denied
mao@linux-devbox-vm:~$ cat /proc/dump_landlock_domain
Ruleset: ffff9268489451e0
  Inode tree:
    ffff92684e767100: inode 5374060 in fs ext4
    ffff92684e767380 (+640 (rule itself was 44 bytes)): inode 2883604 in fs ext4
    ffff92684e767440 (+192 (rule itself was 44 bytes)): inode 2883992 in fs ext4
    ffff92684e767580 (+320 (rule itself was 44 bytes)): inode 11 in fs devtmpfs
    ffff92684e767740 (+448 (rule itself was 44 bytes)): inode 5377989 in fs ext4
    ffff92684e767800 (+192 (rule itself was 44 bytes)): inode 1 in fs devtmpfs
    ffff92684e7678c0 (+192 (rule itself was 44 bytes)): inode 1 in fs tmpfs
    ffff92684e767980 (+192 (rule itself was 44 bytes)): inode 6 in fs devtmpfs
    ffff92684e767b40 (+448 (rule itself was 44 bytes)): inode 4 in fs devtmpfs
    ffff92684e767bc0 (+128 (rule itself was 44 bytes)): inode 2883585 in fs ext4
    ffff92684e767c40 (+128 (rule itself was 44 bytes)): inode 1 in fs sysfs
    ffff92684e767cc0 (+128 (rule itself was 44 bytes)): inode 6815745 in fs ext4
    ffff92684e767f80 (+704 (rule itself was 44 bytes)): inode 1 in fs proc

Note: source code for this at
https://github.com/micromaomao/linux-dev/commit/16c318fcbc64c23b87ca67836771f710d0d0ccf1

(this is a typical result out of several repeat runs)
(KASLR etc disabled, no_hash_pointers)

Because all landlock rules are copied at domain creation time, I
previously thought that they are likely to be allocated close together
(perhaps there will be some fragmentation due to a ruleset extending an
existing rule in the parent domain), but it looks like they are still
spreaded out a bit for some reason, and if the allocations are sparse
during domain creation time, this situation won't fix itself and could
slow down all access checks.  (Maybe on a single core system, with nothing
else touching the kmem cache when the domain is being merged, they would
be allocated together?)

I'm not sure about the performance of binary search vs rbtrees.  I can
see that the improved locality should help, but going by my benchmark
of the hashtable domains, I'm not sure how much of a difference it
makes, and whether it would be worth the added complexity.  I could try
implementing this flat array + binary search approach and run the
existing benchmarks on it, though (as of writing this I haven't).
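
For concreteness, here is a rough sketch of what such a lookup could
look like, assuming the proposed landlock_domain_index entries are kept
sorted by key.data at domain creation time (the helper name is made up
and none of this is from the series):

static const struct landlock_domain_index *
domain_find_index(const struct landlock_domain_index *const indexes,
		  const u32 num_indexes, const union landlock_key key)
{
	u32 low = 0, high = num_indexes;

	while (low < high) {
		const u32 mid = low + (high - low) / 2;

		if (indexes[mid].key.data == key.data)
			return &indexes[mid];
		if (indexes[mid].key.data < key.data)
			low = mid + 1;
		else
			high = mid;
	}
	/* No rule for this key in the domain. */
	return NULL;
}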

>
> struct landlock_domain_index {
>     union landlock_key key;
>     u32 shift; // value to add to this array's index to jump to the set
>                // of mapped landlock_layer

Can you clarify the comment?  Is it the index into the layers array, or
are you suggesting adding the index of the current landlock_domain_index
with this shift value, and using that to index into the layers array?

While it's probably an unlikely situation, what if there are more than
2^29 rules, each having 16 layers?  I think that would overflow this u32
even if it is an offset, and this is allowed as LANDLOCK_MAX_NUM_RULES is
currently defined as U32_MAX.
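
(Concretely: 2^29 rules * 16 layers = 2^33 layer entries, which already
exceeds the 2^32 - 1 range of a u32, and the current U32_MAX rule limit
could in theory need up to about 2^36.)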

(Unrelated to this patch, but I think once we have the supervise feature
in, I can see a (probably badly implemented) supervisor getting close to
this limit if the sandboxed application creates / deletes and recreates
lots of temporary files under /tmp)

>     u32 num_layers;
> }; // 128-bit (or 96-bit) aligned

I guess this has to be 128-bit aligned if we have a u32 num_layers.
Because we allow 16 layers, if we make it a u16, it would actually need
to hold the number of layers minus 1.  However, maybe that is
overcomplicating it and we should just make this 128-bit aligned
(otherwise we have to deal with unaligned reads of the
landlock_key->object too).
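
To spell out the sizes being discussed, on a 64-bit kernel (an
illustrative sketch, not a final layout):

	union landlock_key key;   /*  8 bytes */
	u32 shift;                /*  4 bytes */
	u32 num_layers;           /*  4 bytes */
	                          /* 16 bytes total; key stays naturally
	                           * 8-byte aligned when entries are
	                           * packed in an array */

Shrinking num_layers to a u16 only saves space if the entry is then
packed down to 14 bytes, which is exactly what would misalign key in an
array.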

>
> // Theoretical landlock_domain
> struct landlock_domain {
>     struct landlock_hierarchy *hierarchy;
>     union {
>         // See landlock_ruleset's union
>     };
>     u32 num_fs; // number of FS indexes
>     u32 num_net; // number of net indexes
>     struct access_masks access_masks[];
>     struct landlock_domain_index fs_indexes[num_fs];
>     struct landlock_layer fs_layers[sum of FS rules' layers];
>     struct landlock_domain_index net_indexes[num_net];
>     struct landlock_layer net_layers[sum of net rules' layers];
> };

(Unrelated to this patch specifically, but I would probably like to use
this domain struct for the mutable part of a domain as well later.  In
that case the hierarchy pointer would be unused.)

>
> // Real landlock_domain
> struct landlock_domain {
>     struct landlock_hierarchy *hierarchy;
>     union {
>         // See landlock_ruleset's union
>     };
>     u32 num_fs; // number of FS indexes
>     u32 num_net; // number of net indexes
>     u32 rules[]; // underlying typed arrays accessed via helpers
> };
>
> The landlock_domain is a contiguously allocated memory buffer containing
> variable-size arrays improving locality.  Another advantage is that we
> would not get any potential allocation errors once the buffer is
> allocated which should simplify domain creation.  Also, we avoid the
> union in landlock_rule (patch 5) and only use landlock_rule in
> landlock_ruleset.

I think doing this definitely has positives, especially in terms of
memory; however, I'm worried about the complexity here.  Since we will
have to do various index calculations and use them to access
variable-sized arrays, I'm somewhat afraid that any mistake could end
up either leaking kernel pointers or, at worst, corrupting memory.

We should probably have a len_rules, and check (with a WARN_ON_ONCE
that returns -EINVAL on failure) that any computed index lies within
range before accessing it.
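
Something like this hypothetical accessor (len_rules and the helper
name are made up):

static inline const u32 *
domain_rules_at(const struct landlock_domain *const dom,
		const size_t index)
{
	/*
	 * An out-of-range computed index trips a one-time warning
	 * instead of reading past the allocation.
	 */
	if (WARN_ON_ONCE(index >= dom->len_rules))
		return NULL;
	return &dom->rules[index];
}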

>
> An alternative approach would be to use a hash table instead of the
> array (extending landlock_domain_index with a pointer/shift to the next
> collision) but still with this big buffer.  I'm not sure the perf impact
> would be noticeable, but it might be worth a try.

I can give that a try too and measure benchmarks.

BTW, judging by the direction this discussion is going, I assume we're
on board with a separate landlock_domain, and don't plan to change how
unmerged landlock_rulesets are stored (at least for now)?  If so, I can
probably start work on quiet rules, removing access bits / rules by
intersect, etc. as discussed in
https://github.com/landlock-lsm/linux/issues/44#issuecomment-2876500918

>
>>
>> Signed-off-by: Tingmao Wang <m@maowtm.org>
>> ---
>>  security/landlock/hash.h | 117 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 117 insertions(+)
>>  create mode 100644 security/landlock/hash.h
>>
>> diff --git a/security/landlock/hash.h b/security/landlock/hash.h
>> new file mode 100644
>> index 000000000000..955c5756d4d9
>> --- /dev/null
>> +++ b/security/landlock/hash.h
>> @@ -0,0 +1,117 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Landlock - Domain hashtable mainpulation
>
> typo

ack

>
>> + *
>> + * Copyright © 2025      Tingmao Wang <m@maowtm.org>
>> + */
>> +
>> +#ifndef _SECURITY_LANDLOCK_HASH_H
>> +#define _SECURITY_LANDLOCK_HASH_H
>> +
>> +#include <linux/slab.h>
>> +#include <linux/hash.h>
>> +
>> +#include "ruleset.h"
>> +
>> +struct landlock_hashtable {
>> +	struct hlist_head *hlist;
>
> A simple linked-list would be enough.

I'm not sure I understand what you meant by a "simple linked-list" - do
you mean something like an array of `landlock_rule`s, each of which is
part of a linked list of all the rules that hashed to the same key?

Anyway, since we identified a flat array approach (whether we hash or
binary search), I'm going to try that instead.

>
>> +
>> +	/**
>> +	 * @hash_bits: Number of bits in this hash index (i.e.  hlist has
>> +	 * 2^this many elements).
>> +	 */
>> +	int hash_bits;
>> +};
>> +
>> +#define landlock_hash_for_each(rule, ht, i)                \
>> +	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
>> +		hlist_for_each_entry(rule, &(ht)->hlist[i], hlist)
>> +
>> +#define landlock_hash_for_each_safe(rule, tmp, ht, i)      \
>> +	for (i = 0; i < (1ULL << (ht)->hash_bits); i += 1) \
>> +		hlist_for_each_entry_safe(rule, tmp, &(ht)->hlist[i], hlist)
>> +
>> +static inline int landlock_hash_init(const size_t expected_num_entries,
>> +				     struct landlock_hashtable *out_ht)
>> +{
>> +	size_t table_sz = 1;
>
> Please use full variable names when they are not too long:
> table_size in this case.
>
>> +	int hash_bits = 0;
>> +
>> +	if (likely(expected_num_entries > 0)) {
>> +		table_sz = roundup_pow_of_two(expected_num_entries);
>
> With roundup_pow_of_two() we'll get a few collisions but a big table.
> What would be the impact of using roundup_pow_of_two() instead to have a
> compact hash table (with more collisions)?

I assume you meant using rounddown_pow_of_two - I can give it a quick test
too with the flat array approach.

> [...]


* Re: [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain
  2025-05-27 11:00   ` Mickaël Salaün
@ 2025-06-02 13:20     ` Tingmao Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-06-02 13:20 UTC (permalink / raw)
  To: Mickaël Salaün; +Cc: Günther Noack, linux-security-module

On 5/27/25 12:00, Mickaël Salaün wrote:
> On Wed, May 21, 2025 at 08:32:04PM +0100, Tingmao Wang wrote:
>> Since we can't get rid of the old landlock_merge_ruleset yet, we call our
>> new thing landlock_merge_ruleset2.
>>
>> Signed-off-by: Tingmao Wang <m@maowtm.org>
>> ---
>>  security/landlock/domain.c   |  87 +++++++++++++++++++++++++++++
>>  security/landlock/domain.h   |   4 ++
>>  security/landlock/hash.h     | 105 +++++++++++++++++++++++++++++++++++
>>  security/landlock/ruleset.h  |   2 +-
>>  security/landlock/syscalls.c |   9 +++
>>  5 files changed, 206 insertions(+), 1 deletion(-)
> 
> 
>> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
>> index 07823771b402..ac91d4a865b9 100644
>> --- a/security/landlock/ruleset.h
>> +++ b/security/landlock/ruleset.h
>> @@ -27,7 +27,7 @@ struct landlock_hierarchy;
>>   */
>>  struct landlock_layer {
>>  	/**
>> -	 * @level: Position of this layer in the layer stack.
>> +	 * @level: Position of this layer in the layer stack. Starts from 1.
> 
> Feel free to send a standalone patch with improved doc, I'll merge it
> directly.

(I've done this and will remove this change from this series.)

> 
>>  	 */
>>  	u16 level;
>>  	/**



* Re: [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers
  2025-06-02 13:20     ` Tingmao Wang
@ 2025-06-02 13:26       ` Tingmao Wang
  2025-06-02 19:50       ` Mickaël Salaün
  1 sibling, 0 replies; 18+ messages in thread
From: Tingmao Wang @ 2025-06-02 13:26 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	Jann Horn

On 6/2/25 14:20, Tingmao Wang wrote:
> On 5/27/25 12:00, Mickaël Salaün wrote:
> [...]
> 
> (this is a typical result out of several repeat runs)
> (KASLR etc disabled, no_hash_pointers)

Sorry, I meant KASAN; KASLR is intact here.




* Re: [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers
  2025-06-02 13:20     ` Tingmao Wang
  2025-06-02 13:26       ` Tingmao Wang
@ 2025-06-02 19:50       ` Mickaël Salaün
  1 sibling, 0 replies; 18+ messages in thread
From: Mickaël Salaün @ 2025-06-02 19:50 UTC (permalink / raw)
  To: Tingmao Wang
  Cc: Günther Noack, linux-security-module, Mikhail Ivanov,
	Jann Horn

On Mon, Jun 02, 2025 at 02:20:08PM +0100, Tingmao Wang wrote:
> On 5/27/25 12:00, Mickaël Salaün wrote:
> [...]
> 
> I'm not sure about the performance of binary search vs rbtrees.  I can
> see that the improved locality should help, but going by my benchmark
> of the hashtable domains, I'm not sure how much of a difference it
> makes, and whether it would be worth the added complexity.  I could try
> implementing this flat array + binary search approach and run the
> existing benchmarks on it, though (as of writing this I haven't).

Yes, I think binary search would be enough, taking cache effects into
account.

> 
> >
> > struct landlock_domain_index {
> >     union landlock_key key;
> >     u32 shift; // value to add to this array's index to jump to the set
> >                // of mapped landlock_layer
> 
> Can you clarify the comment?  Is it the index into the layers array, or
> are you suggesting adding the index of the current landlock_domain_index
> with this shift value, and use that to index into the layers array?

I was suggesting the latter, but thinking out loud, this doesn't help
much; we can just use a "u32 rule_index" (instead of the "shift") to
jump to the layers array.
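
That is, roughly (reusing the hypothetical fs_indexes/fs_layers arrays
from the earlier proposal):

	/* Layer slice for the i-th FS index entry: */
	const struct landlock_domain_index *idx = &fs_indexes[i];
	const struct landlock_layer *first = &fs_layers[idx->rule_index];
	/* first[0] .. first[idx->num_layers - 1] belong to this rule. */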

> 
> While it's probably an unlikely situation, what if there are more than
> 2^29 rules, each having 16 layers?  I think that would overflow this u32
> even if it is an offset, and this is allowed as LANDLOCK_MAX_NUM_RULES is
> currently defined as U32_MAX.

Correct.  We can either use u64 or reduce the maximum number of rules.
I think LANDLOCK_MAX_NUM_RULES set to U16_MAX would be much more than
the worst practical case.  Even if one buggy policy tries to add one
rule per network port, that will be OK.  We could even reasonably test
this limit.  We'll need to backport this change but I'm OK with that.
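
(At U16_MAX rules, even with the full 16 layers each, that is at most
65535 * 16, about 2^20 layer entries, comfortably within u32 range.)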

> 
> (Unrelated to this patch, but I think once we have the supervise feature
> in, I can see a (probably badly implemented) supervisor getting close to
> this limit if the sandboxed application creates / deletes and recreates
> lots of temporary files under /tmp)
> 
> >     u32 num_layers;
> > }; // 128-bit (or 96-bit) aligned
> 
> I guess this has to be 128 bits aligned if we have an u32 num_layers.
> Because we allow 16 layers, if we make it an u16, it would actually need
> to be the number of layers minus 1.  However maybe that is
> overcomplicating it and we should just make this 128 bits aligned
> (otherwise we have to deal with unaligned reads of the
> landlock_key->object too).

I was thinking about the size on a 32-bit architecture.

> 
> >
> > // Theoretical landlock_domain
> > struct landlock_domain {
> >     struct landlock_hierarchy *hierarchy;
> >     union {
> >         // See landlock_ruleset's union
> >     };
> >     u32 num_fs; // number of FS indexes
> >     u32 num_net; // number of net indexes
> >     struct access_masks access_masks[];
> >     struct landlock_domain_index fs_indexes[num_fs];
> >     struct landlock_layer fs_layers[sum of FS rules' layers];
> >     struct landlock_domain_index net_indexes[num_net];
> >     struct landlock_layer net_layers[sum of net rules' layers];
> > };
> 
> (Unrelated to this patch specifically, but I would probably like to use
> this domain struct for the mutable part of a domain as well later.  In
> that case the hierarchy pointer would be unused.)

OK, you can create two types.

> 
> >
> > // Real landlock_domain
> > struct landlock_domain {
> >     struct landlock_hierarchy *hierarchy;
> >     union {
> >         // See landlock_ruleset's union
> >     };
> >     u32 num_fs; // number of FS indexes
> >     u32 num_net; // number of net indexes
> >     u32 rules[]; // underlying typed arrays accessed via helpers
> > };
> >
> > The landlock_domain is a contiguously allocated memory buffer containing
> > variable-size arrays improving locality.  Another advantage is that we
> > would not get any potential allocation errors once the buffer is
> > allocated which should simplify domain creation.  Also, we avoid the
> > union in landlock_rule (patch 5) and only use landlock_rule in
> > landlock_ruleset.
> 
> I think doing this definitely has positives especially in terms of memory,
> however I'm worried about the complexity here.  Since we will have to do
> various index calculation and use it to access variable-sized arrays, I'm
> somewhat afraid that any mistake could end up either leaking kernel
> pointers, or at worst, memory corruption.

Yes, packing this is a bit complex.  We could allocate a buffer and use
pointers+length, but I'm not sure it would be safer.  Any other
suggestions?

> 
> We should probably have a len_rules, and check (WARN_ON_ONCE and if fails
> returns -EINVAL) that any computed indices lies within range before
> accessing it.

Definitely.

> 
> >
> > An alternative approach would be to use a hash table instead of the
> > array (extending landlock_domain_index with a pointer/shift to the next
> > collision) but still with this big buffer.  I'm not sure the perf impact
> > would be noticeable, but it might be worth a try.
> 
> I can give that a try too and measure benchmarks.

What worries me about hash tables is the size they can take.  With
Landlock, we can have a lot of domains, and we should try to keep them
as small as possible.  I don't think a small hash table with a lot of
potential collisions would perform better than a binary search.  I
suggest you ignore this hash table unless you want to compare
performance (with a hash table + extras of the same size as a flat array).

> 
> BTW, going by the direction this discussion is going, I assume we're on
> board with a separate landlock_domain, and don't plan to change how
> unmerged landlock_ruleset are stored (at least for now)?

Yes, let's focus on the domain data structure.  The ruleset's rbtree
works fine and is not a performance issue.

> If so I can
> probably start work on quiet rules, removing access bits / rules by
> intersect, etc as discussed in
> https://github.com/landlock-lsm/linux/issues/44#issuecomment-2876500918

Yes please.

> 
> [...]
> >> +struct landlock_hashtable {
> >> +	struct hlist_head *hlist;
> >
> > A simple linked-list would be enough.
> 
> I'm not sure I understand what you meant by a "simple linked-list" - do
> you mean like an array of `landlock_rule`s, each itself is part of a
> linked list of all the rules that hashed to the same key?

hlist_node has two pointers but we would only need one here.

> 
> Anyway, since we identified a flat array approach (whether we hash or
> binary search), I'm gonna try that instead.

Right

> 
> [...]
> >> +
> >> +	if (likely(expected_num_entries > 0)) {
> >> +		table_sz = roundup_pow_of_two(expected_num_entries);
> >
> > With roundup_pow_of_two() we'll get a few collisions but a big table.
> > What would be the impact of using roundup_pow_of_two() instead to have a
> > compact hash table (with more collisions)?
> 
> I assume you meant using rounddown_pow_of_two - I can give it a quick test
> too with the flat array approach.

Yes, I meant rounddown_pow_of_two, but I don't think it will outperform
the (small) flat array.


Thread overview: 18+ messages
2025-05-21 19:31 landlock: Use hashtable for merged domains Tingmao Wang
2025-05-21 19:31 ` [RFC PATCH 01/10] landlock: Add some debug output Tingmao Wang
2025-05-21 19:31 ` [RFC PATCH 02/10] landlock/hash: define (dynamic, non-resizable) hash table helpers Tingmao Wang
2025-05-27 11:00   ` Mickaël Salaün
2025-06-02 13:20     ` Tingmao Wang
2025-06-02 13:26       ` Tingmao Wang
2025-06-02 19:50       ` Mickaël Salaün
2025-05-21 19:31 ` [RFC PATCH 03/10] landlock/hash: Use linear search for small tables Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 04/10] landlock/ruleset: Rename and extract create_rule Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 05/10] Add hlist_node member to struct landlock_rule Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 06/10] landlock/domain: Define landlock_domain Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 07/10] landlock: Add the new domain to landlock_cred_security Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 08/10] landlock: Construct the inode hashtable in the new landlock_domain Tingmao Wang
2025-05-27 11:00   ` Mickaël Salaün
2025-06-02 13:20     ` Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 09/10] landlock/fs: Use the new hashtable-based domain to find inode rules Tingmao Wang
2025-05-21 19:32 ` [RFC PATCH 10/10] landlock: Debug print inode hashtable in landlock_merge_ruleset2 Tingmao Wang
2025-05-27 10:59 ` landlock: Use hashtable for merged domains Mickaël Salaün
