* [RFC PATCH 23/30] keys: Add domain tag to the keyring search criteria
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Add keyring_search_tag() version of keyring_search(), that allows to
specify the key domain tag.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
include/linux/key.h | 17 +++++++++++++----
security/keys/keyring.c | 15 +++++++++------
2 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/include/linux/key.h b/include/linux/key.h
index 0f2e24f13c2b..223ab9d76d15 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -416,10 +416,11 @@ extern int restrict_link_reject(struct key *keyring,
extern int keyring_clear(struct key *keyring);
-extern key_ref_t keyring_search(key_ref_t keyring,
- struct key_type *type,
- const char *description,
- bool recurse);
+extern key_ref_t keyring_search_tag(key_ref_t keyring,
+ struct key_type *type,
+ const char *description,
+ struct key_tag *domain_tag,
+ bool recurse);
extern int keyring_add_key(struct key *keyring,
struct key *key);
@@ -429,6 +430,14 @@ extern int keyring_restrict(key_ref_t keyring, const char *type,
extern struct key *key_lookup(key_serial_t id);
+static inline key_ref_t keyring_search(key_ref_t keyring,
+ struct key_type *type,
+ const char *description,
+ bool recurse)
+{
+ return keyring_search_tag(keyring, type, description, NULL, recurse);
+}
+
static inline key_serial_t key_serial(const struct key *key)
{
return key ? key->serial : 0;
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index 14abfe765b7e..12583241ff63 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -925,22 +925,25 @@ key_ref_t keyring_search_rcu(key_ref_t keyring_ref,
}
/**
- * keyring_search - Search the supplied keyring tree for a matching key
+ * keyring_search_tag - Search the supplied keyring tree for a matching key
* @keyring: The root of the keyring tree to be searched.
* @type: The type of keyring we want to find.
* @description: The name of the keyring we want to find.
+ * @domain_tag: The domain_tag of the key we want to find.
* @recurse: True to search the children of @keyring also
*
* As keyring_search_rcu() above, but using the current task's credentials and
* type's default matching function and preferred search method.
*/
-key_ref_t keyring_search(key_ref_t keyring,
- struct key_type *type,
- const char *description,
- bool recurse)
+key_ref_t keyring_search_tag(key_ref_t keyring,
+ struct key_type *type,
+ const char *description,
+ struct key_tag *domain_tag,
+ bool recurse)
{
struct keyring_search_context ctx = {
.index_key.type = type,
+ .index_key.domain_tag = domain_tag,
.index_key.description = description,
.index_key.desc_len = strlen(description),
.cred = current_cred(),
@@ -968,7 +971,7 @@ key_ref_t keyring_search(key_ref_t keyring,
type->match_free(&ctx.match_data);
return key;
}
-EXPORT_SYMBOL(keyring_search);
+EXPORT_SYMBOL(keyring_search_tag);
static struct key_restriction *keyring_restriction_alloc(
key_restrict_link_func_t check)
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 21/30] user namespace: Add function that checks if the UID map is defined
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Add function that checks if the UID map is defined. It will be used by
ima to check if ID remapping in subject-based rules is necessary.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
include/linux/user_namespace.h | 6 ++++++
kernel/user_namespace.c | 11 +++++++++++
2 files changed, 17 insertions(+)
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index d9759c54fead..bcb21c41c910 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -138,6 +138,7 @@ extern bool in_userns(const struct user_namespace *ancestor,
const struct user_namespace *child);
extern bool current_in_userns(const struct user_namespace *target_ns);
struct ns_common *ns_get_owner(struct ns_common *ns);
+extern bool userns_set_uidmap(const struct user_namespace *ns);
#else
static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
@@ -182,6 +183,11 @@ static inline struct ns_common *ns_get_owner(struct ns_common *ns)
{
return ERR_PTR(-EPERM);
}
+
+static inline bool userns_set_uidmap(const struct user_namespace *ns)
+{
+ return true;
+}
#endif
#endif /* _LINUX_USER_H */
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 87804e0371fe..e38f9f11a589 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -1232,6 +1232,17 @@ bool current_in_userns(const struct user_namespace *target_ns)
}
EXPORT_SYMBOL(current_in_userns);
+bool userns_set_uidmap(const struct user_namespace *ns)
+{
+ bool mapping_defined;
+
+ mutex_lock(&userns_state_mutex);
+ mapping_defined = ns->uid_map.nr_extents != 0;
+ mutex_unlock(&userns_state_mutex);
+
+ return mapping_defined;
+}
+
static inline struct user_namespace *to_user_ns(struct ns_common *ns)
{
return container_of(ns, struct user_namespace, ns);
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 17/30] ima: Add the violation counter to the namespace
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
The violations are now tracked per namespace.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
include/linux/ima.h | 1 +
security/integrity/ima/ima.h | 1 -
security/integrity/ima/ima_api.c | 2 +-
security/integrity/ima/ima_fs.c | 18 +++++++++++++++---
security/integrity/ima/ima_init.c | 1 +
security/integrity/ima/ima_ns.c | 1 +
security/integrity/ima/ima_queue.c | 1 -
7 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/include/linux/ima.h b/include/linux/ima.h
index d59ed38a4305..7eb7a008c5fe 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -204,6 +204,7 @@ struct ima_namespace {
struct integrity_iint_tree *iint_tree;
struct list_head *measurements;
atomic_long_t ml_len; /* number of stored measurements in the list */
+ atomic_long_t violations;
} __randomize_layout;
extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 7318fff3ccaa..ef95522cc710 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -193,7 +193,6 @@ extern spinlock_t ima_queue_lock;
struct ima_h_table {
atomic_long_t len; /* number of stored measurements in the list */
- atomic_long_t violations;
struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
};
extern struct ima_h_table ima_htable;
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index b01451b34a98..f91516885033 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -151,7 +151,7 @@ void ima_add_violation(struct file *file, const unsigned char *filename,
event_data.ns_id = get_ns_id(ima_ns);
/* can overflow, only indicator */
- atomic_long_inc(&ima_htable.violations);
+ atomic_long_inc(&ima_ns->violations);
result = ima_alloc_init_template(&event_data, &entry, NULL);
if (result < 0) {
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 6d370874d80f..b71c2552ac15 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -51,10 +51,23 @@ static ssize_t ima_show_htable_violations(struct file *filp,
char __user *buf,
size_t count, loff_t *ppos)
{
- return ima_show_htable_value(buf, count, ppos, &ima_htable.violations);
+ struct ima_namespace *ima_ns = get_current_ns();
+
+ return ima_show_htable_value(buf, count, ppos, &ima_ns->violations);
+}
+
+static int ima_open_htable_violations(struct inode *inode, struct file *file)
+{
+ struct ima_namespace *ima_ns = get_current_ns();
+
+ if (!ns_capable(ima_ns->user_ns, CAP_SYS_ADMIN))
+ return -EPERM;
+
+ return 0;
}
static const struct file_operations ima_htable_violations_ops = {
+ .open = ima_open_htable_violations,
.read = ima_show_htable_violations,
.llseek = generic_file_llseek,
};
@@ -76,7 +89,6 @@ static ssize_t ima_show_measurements_count(struct file *filp,
struct ima_namespace *ima_ns = get_current_ns();
return ima_show_htable_value(buf, count, ppos, &ima_ns->ml_len);
-
}
static const struct file_operations ima_measurements_count_ops = {
@@ -552,7 +564,7 @@ int __init ima_fs_init(void)
goto out;
violations =
- securityfs_create_file("violations", S_IRUSR | S_IRGRP,
+ securityfs_create_file("violations", S_IRUSR | S_IRGRP | S_IROTH,
ima_dir, NULL, &ima_htable_violations_ops);
if (IS_ERR(violations))
goto out;
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index ac9509d8c0f0..f7b25b3f95de 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -37,6 +37,7 @@ struct ima_namespace init_ima_ns = {
.iint_tree = &init_iint_tree,
.measurements = &ima_measurements,
.ml_len = ATOMIC_LONG_INIT(0),
+ .violations = ATOMIC_LONG_INIT(0),
};
EXPORT_SYMBOL(init_ima_ns);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index 81de492baa99..3012287b22d2 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -127,6 +127,7 @@ static struct ima_namespace *clone_ima_ns(struct user_namespace *user_ns,
ns->ucounts = ucounts;
ns->frozen = false;
atomic_long_set(&ns->ml_len, 0);
+ atomic_long_set(&ns->violations, 0);
rwlock_init(&ns->iint_tree->lock);
ns->iint_tree->root = RB_ROOT;
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index ec5b3ca3ef92..912e6097af5b 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -34,7 +34,6 @@ static unsigned long binary_runtime_size = ULONG_MAX;
/* key: inode (before secure-hashing a file) */
struct ima_h_table ima_htable = {
.len = ATOMIC_LONG_INIT(0),
- .violations = ATOMIC_LONG_INIT(0),
.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
};
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 20/30] ima: Parse per ima namespace policy file
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Parse per ima namespace policy file if the file path is set.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
security/integrity/ima/ima.h | 7 ++++---
security/integrity/ima/ima_fs.c | 17 +++++++++++------
security/integrity/ima/ima_ns.c | 18 ++++++++++++++++--
security/integrity/ima/ima_policy.c | 25 +++++++++++--------------
4 files changed, 42 insertions(+), 25 deletions(-)
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 48d09efaffbe..b55d25c2bf63 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -309,10 +309,10 @@ int ima_match_policy(struct inode *inode, const struct cred *cred, u32 secid,
void ima_init_policy(void);
void ima_init_ns_policy(struct ima_namespace *ima_ns,
const struct ima_policy_setup_data *policy_setup_data);
-void ima_update_policy(void);
+void ima_update_policy(struct ima_namespace *ima_ns);
void ima_update_policy_flag(struct ima_namespace *ima_ns);
-ssize_t ima_parse_add_rule(char *);
-void ima_delete_rules(void);
+ssize_t ima_parse_add_rule(char *rule, struct ima_namespace *ima_ns);
+void ima_delete_rules(struct ima_namespace *ima_ns);
int ima_check_policy(const struct ima_namespace *ima_ns);
void *ima_policy_start(struct seq_file *m, loff_t *pos);
void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos);
@@ -439,6 +439,7 @@ static inline struct ima_namespace *get_current_ns(void)
void ima_delete_ns_rules(struct ima_policy_data *policy_data,
bool is_root_ns);
+ssize_t ima_read_ns_policy(char *path, struct ima_namespace *ima_ns);
ssize_t ima_ns_write_policy_for_children(struct ima_namespace *ima_ns,
char *policy_path);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 6c033857f521..7b93c338a478 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -343,7 +343,7 @@ static const struct file_operations ima_ascii_measurements_ops = {
.release = seq_release,
};
-static ssize_t ima_read_policy(char *path)
+static ssize_t ima_read_policy(char *path, struct ima_namespace *ima_ns)
{
void *data;
char *datap;
@@ -365,7 +365,7 @@ static ssize_t ima_read_policy(char *path)
datap = data;
while (size > 0 && (p = strsep(&datap, "\n"))) {
pr_debug("rule: %s\n", p);
- rc = ima_parse_add_rule(p);
+ rc = ima_parse_add_rule(p, ima_ns);
if (rc < 0)
break;
size -= rc;
@@ -406,7 +406,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
goto out_free;
if (data[0] == '/') {
- result = ima_read_policy(data);
+ result = ima_read_policy(data, ima_ns);
} else if (ima_ns->policy_data->ima_appraise & IMA_APPRAISE_POLICY) {
pr_err("signed policy file (specified as an absolute pathname) required\n");
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
@@ -414,7 +414,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
1, 0);
result = -EACCES;
} else {
- result = ima_parse_add_rule(data);
+ result = ima_parse_add_rule(data, ima_ns);
}
mutex_unlock(&ima_write_mutex);
out_free:
@@ -502,13 +502,13 @@ static int ima_release_policy(struct inode *inode, struct file *file)
"policy_update", cause, !valid_policy, 0);
if (!valid_policy) {
- ima_delete_rules();
+ ima_delete_rules(ima_ns);
valid_policy = 1;
clear_bit(IMA_FS_BUSY, &ima_fs_flags);
return 0;
}
- ima_update_policy();
+ ima_update_policy(ima_ns);
#if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
securityfs_remove(ima_policy);
ima_policy = NULL;
@@ -629,6 +629,11 @@ static const struct file_operations ima_kcmd_for_children_ops = {
.open = ima_open_for_children,
.write = ima_write_kcmd_for_children,
};
+
+ssize_t ima_read_ns_policy(char *path, struct ima_namespace *ima_ns)
+{
+ return ima_read_policy(path, ima_ns);
+}
#endif /* CONFIG_IMA_NS */
int __init ima_fs_init(void)
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index dd0328ee715f..ec3abc803c82 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -91,10 +91,24 @@ static int ima_set_ns_policy(struct ima_namespace *ima_ns)
setup_data.ima_appraise = IMA_APPRAISE_ENFORCE;
#endif
ima_init_ns_policy(ima_ns, &setup_data);
- return result;
+ } else {
+ ima_init_ns_policy(ima_ns, ima_ns->policy_setup_for_children);
}
- ima_init_ns_policy(ima_ns, ima_ns->policy_setup_for_children);
+ if (ima_ns->policy_path_for_children) {
+ result = ima_read_ns_policy(ima_ns->policy_path_for_children,
+ ima_ns);
+ if ((result >= 0) && !ima_check_policy(ima_ns)) {
+ integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
+ "policy_update", "completed", 0, 0);
+ ima_update_policy(ima_ns);
+ } else {
+ ima_delete_rules(ima_ns);
+ ima_delete_ns_rules(ima_ns->policy_data, false);
+ integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
+ "policy_update", "failed", 1, 0);
+ }
+ }
return result;
}
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 3d712c062ed0..d4774eab6a98 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -735,7 +735,8 @@ static void add_rules(struct ima_namespace *ima_ns,
}
}
-static int ima_parse_rule(char *rule, struct ima_rule_entry *entry);
+static int ima_parse_rule(char *rule, struct ima_rule_entry *entry,
+ struct ima_namespace *ima_ns);
static int __init ima_init_arch_policy(void)
{
@@ -765,7 +766,8 @@ static int __init ima_init_arch_policy(void)
result = strlcpy(rule, *rules, sizeof(rule));
INIT_LIST_HEAD(&arch_policy_entry[i].list);
- result = ima_parse_rule(rule, &arch_policy_entry[i]);
+ result = ima_parse_rule(rule, &arch_policy_entry[i],
+ &init_ima_ns);
if (result) {
pr_warn("Skipping unknown architecture policy rule: %s\n",
rule);
@@ -895,10 +897,8 @@ int ima_check_policy(const struct ima_namespace *ima_ns)
* Policy rules are never deleted so ima_policy_flag gets zeroed only once when
* we switch from the default policy to user defined.
*/
-void ima_update_policy(void)
+void ima_update_policy(struct ima_namespace *ima_ns)
{
- /* Update only the current ima namespace */
- struct ima_namespace *ima_ns = get_current_ns();
struct list_head *policy = &ima_ns->policy_data->ima_policy_rules;
list_splice_tail_init_rcu(&ima_ns->policy_data->ima_temp_rules,
@@ -1151,7 +1151,8 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
return true;
}
-static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
+static int ima_parse_rule(char *rule, struct ima_rule_entry *entry,
+ struct ima_namespace *ima_ns)
{
struct audit_buffer *ab;
char *from;
@@ -1160,7 +1161,6 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
struct ima_template_desc *template_desc;
int result = 0;
size_t keyrings_len;
- struct ima_namespace *ima_ns = get_current_ns();
ab = integrity_audit_log_start(audit_context(), GFP_KERNEL,
AUDIT_INTEGRITY_POLICY_RULE);
@@ -1547,19 +1547,18 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
/**
* ima_parse_add_rule - add a rule to ima_policy_rules
* @rule - ima measurement policy rule
+ * @ima_ns - pointer to the ima namespace the rule will be added to
*
* Avoid locking by allowing just one writer at a time in ima_write_policy()
* Returns the length of the rule parsed, an error code on failure
*/
-ssize_t ima_parse_add_rule(char *rule)
+ssize_t ima_parse_add_rule(char *rule, struct ima_namespace *ima_ns)
{
static const char op[] = "update_policy";
char *p;
struct ima_rule_entry *entry;
ssize_t result, len;
int audit_info = 0;
- /* Add rules only to the current ima namespace */
- struct ima_namespace *ima_ns = get_current_ns();
p = strsep(&rule, "\n");
len = strlen(p) + 1;
@@ -1577,7 +1576,7 @@ ssize_t ima_parse_add_rule(char *rule)
INIT_LIST_HEAD(&entry->list);
- result = ima_parse_rule(p, entry);
+ result = ima_parse_rule(p, entry, ima_ns);
if (result) {
ima_free_rule(entry);
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL,
@@ -1597,10 +1596,8 @@ ssize_t ima_parse_add_rule(char *rule)
* different from the active one. There is also only one user of
* ima_delete_rules() at a time.
*/
-void ima_delete_rules(void)
+void ima_delete_rules(struct ima_namespace *ima_ns)
{
- /* Delete rules only from the current ima namespace */
- struct ima_namespace *ima_ns = get_current_ns();
struct ima_rule_entry *entry, *tmp;
ima_ns->policy_data->temp_ima_appraise = 0;
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 22/30] ima: Remap IDs of subject based rules if necessary
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
If subject based rule is added to the policy before the user namespace
uid mapping is defined, ID has to be recalculated.
It can happen if the new user namespace is created alongside the new
ima namespace. The default policy rules are loaded when the first
process is born into the new ima namespace. In that case, user has no
chance to define the mapping. It can also happen for the custom policy
rules loaded from within the new ima namespace, before the mapping is
created.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
security/integrity/ima/ima_policy.c | 83 ++++++++++++++++++++++++-----
1 file changed, 70 insertions(+), 13 deletions(-)
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index d4774eab6a98..bc1a4bb10bd0 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -18,6 +18,7 @@
#include <linux/genhd.h>
#include <linux/seq_file.h>
#include <linux/ima.h>
+#include <linux/user_namespace.h>
#include "ima.h"
@@ -78,6 +79,10 @@ struct ima_rule_entry {
char *fsname;
char *keyrings; /* Measure keys added to these keyrings */
struct ima_template_desc *template;
+ bool remap_uid; /* IDs of all subject oriented rules, added before the
+ * user namespace mapping is defined,
+ * have to be remapped.
+ */
};
/*
@@ -484,6 +489,8 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
const char *keyring)
{
int i;
+ kuid_t remapped_kuid;
+ struct ima_namespace *current_ima_ns = get_current_ns();
if (func == KEY_CHECK) {
return (rule->flags & IMA_FUNC) && (rule->func == func) &&
@@ -507,21 +514,45 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
if ((rule->flags & IMA_FSUUID) &&
!uuid_equal(&rule->fsuuid, &inode->i_sb->s_uuid))
return false;
- if ((rule->flags & IMA_UID) && !rule->uid_op(cred->uid, rule->uid))
- return false;
+ if (rule->flags & IMA_UID) {
+ if (rule->remap_uid) {
+ remapped_kuid = make_kuid(current_ima_ns->user_ns,
+ __kuid_val(rule->uid));
+ if (!uid_valid(remapped_kuid))
+ return false;
+ } else
+ remapped_kuid = rule->uid;
+ if (!rule->uid_op(cred->uid, remapped_kuid))
+ return false;
+ }
if (rule->flags & IMA_EUID) {
+ if (rule->remap_uid) {
+ remapped_kuid = make_kuid(current_ima_ns->user_ns,
+ __kuid_val(rule->uid));
+ if (!uid_valid(remapped_kuid))
+ return false;
+ } else
+ remapped_kuid = rule->uid;
if (has_capability_noaudit(current, CAP_SETUID)) {
- if (!rule->uid_op(cred->euid, rule->uid)
- && !rule->uid_op(cred->suid, rule->uid)
- && !rule->uid_op(cred->uid, rule->uid))
+ if (!rule->uid_op(cred->euid, remapped_kuid)
+ && !rule->uid_op(cred->suid, remapped_kuid)
+ && !rule->uid_op(cred->uid, remapped_kuid))
return false;
- } else if (!rule->uid_op(cred->euid, rule->uid))
+ } else if (!rule->uid_op(cred->euid, remapped_kuid))
return false;
}
- if ((rule->flags & IMA_FOWNER) &&
- !rule->fowner_op(inode->i_uid, rule->fowner))
- return false;
+ if (rule->flags & IMA_FOWNER) {
+ if (rule->remap_uid) {
+ remapped_kuid = make_kuid(current_ima_ns->user_ns,
+ __kuid_val(rule->fowner));
+ if (!uid_valid(remapped_kuid))
+ return false;
+ } else
+ remapped_kuid = rule->fowner;
+ if (!rule->fowner_op(inode->i_uid, remapped_kuid))
+ return false;
+ }
for (i = 0; i < MAX_LSM_RULES; i++) {
int rc = 0;
u32 osid;
@@ -701,6 +732,9 @@ static void add_rules(struct ima_namespace *ima_ns,
for (i = 0; i < count; i++) {
struct ima_rule_entry *entry;
+ bool set_uidmap;
+
+ set_uidmap = userns_set_uidmap(ima_ns->user_ns);
if (policy_rule & IMA_DEFAULT_POLICY) {
entry = &entries[i];
@@ -709,6 +743,9 @@ static void add_rules(struct ima_namespace *ima_ns,
GFP_KERNEL);
if (!entry)
continue;
+
+ if (!set_uidmap)
+ entry->remap_uid = true;
}
list_add_tail(&entry->list,
@@ -721,6 +758,9 @@ static void add_rules(struct ima_namespace *ima_ns,
if (!entry)
continue;
+ if (ima_ns != &init_ima_ns && !set_uidmap)
+ entry->remap_uid = true;
+
list_add_tail(&entry->list,
&ima_ns->policy_data->ima_policy_rules);
}
@@ -1165,6 +1205,10 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry,
ab = integrity_audit_log_start(audit_context(), GFP_KERNEL,
AUDIT_INTEGRITY_POLICY_RULE);
+ if ((ima_ns != &init_ima_ns) &&
+ (!userns_set_uidmap(ima_ns->user_ns)))
+ entry->remap_uid = true;
+
entry->uid = INVALID_UID;
entry->fowner = INVALID_UID;
entry->uid_op = &uid_eq;
@@ -1396,8 +1440,13 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry,
result = kstrtoul(args[0].from, 10, &lnum);
if (!result) {
- entry->uid = make_kuid(current_user_ns(),
- (uid_t) lnum);
+ if (!entry->remap_uid)
+ entry->uid =
+ make_kuid(current_user_ns(),
+ (uid_t) lnum);
+ else
+ entry->uid = KUIDT_INIT((uid_t) lnum);
+
if (!uid_valid(entry->uid) ||
(uid_t)lnum != lnum)
result = -EINVAL;
@@ -1424,8 +1473,16 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry,
result = kstrtoul(args[0].from, 10, &lnum);
if (!result) {
- entry->fowner = make_kuid(current_user_ns(), (uid_t)lnum);
- if (!uid_valid(entry->fowner) || (((uid_t)lnum) != lnum))
+ if (!entry->remap_uid)
+ entry->fowner =
+ make_kuid(current_user_ns(),
+ (uid_t) lnum);
+ else
+ entry->fowner =
+ KUIDT_INIT((uid_t) lnum);
+
+ if (!uid_valid(entry->fowner) ||
+ (((uid_t)lnum) != lnum))
result = -EINVAL;
else
entry->flags |= IMA_FOWNER;
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 24/30] keys: Include key domain tag in the iterative search
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Add domain tag to the key_match_data. If set, check domain tag in the
default match function and asymmetric keys match functions.
This will allow to use the key domain tag in the search criteria for
the iterative search, not only for the direct lookup that is based on
the index key.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
crypto/asymmetric_keys/asymmetric_type.c | 20 ++++++++++++++++----
include/linux/key-type.h | 1 +
security/keys/keyring.c | 10 +++++++++-
3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/crypto/asymmetric_keys/asymmetric_type.c b/crypto/asymmetric_keys/asymmetric_type.c
index 33e77d846caa..177429bc5c7b 100644
--- a/crypto/asymmetric_keys/asymmetric_type.c
+++ b/crypto/asymmetric_keys/asymmetric_type.c
@@ -249,9 +249,15 @@ static bool asymmetric_key_cmp(const struct key *key,
{
const struct asymmetric_key_ids *kids = asymmetric_key_ids(key);
const struct asymmetric_key_id *match_id = match_data->preparsed;
+ bool match;
- return asymmetric_match_key_ids(kids, match_id,
- asymmetric_key_id_same);
+ match = asymmetric_match_key_ids(kids, match_id,
+ asymmetric_key_id_same);
+
+ if (match_data->domain_tag)
+ match &= key->index_key.domain_tag == match_data->domain_tag;
+
+ return match;
}
/*
@@ -262,9 +268,15 @@ static bool asymmetric_key_cmp_partial(const struct key *key,
{
const struct asymmetric_key_ids *kids = asymmetric_key_ids(key);
const struct asymmetric_key_id *match_id = match_data->preparsed;
+ bool match;
+
+ match = asymmetric_match_key_ids(kids, match_id,
+ asymmetric_key_id_partial);
+
+ if (match_data->domain_tag)
+ match &= key->index_key.domain_tag == match_data->domain_tag;
- return asymmetric_match_key_ids(kids, match_id,
- asymmetric_key_id_partial);
+ return match;
}
/*
diff --git a/include/linux/key-type.h b/include/linux/key-type.h
index 2ab2d6d6aeab..c8ea26ab242c 100644
--- a/include/linux/key-type.h
+++ b/include/linux/key-type.h
@@ -55,6 +55,7 @@ struct key_match_data {
unsigned lookup_type; /* Type of lookup for this search. */
#define KEYRING_SEARCH_LOOKUP_DIRECT 0x0000 /* Direct lookup by description. */
#define KEYRING_SEARCH_LOOKUP_ITERATE 0x0001 /* Iterative search. */
+ struct key_tag *domain_tag; /* Key domain tag */
};
/*
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index 12583241ff63..7e45e534035f 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -565,7 +565,13 @@ int restrict_link_reject(struct key *keyring,
bool key_default_cmp(const struct key *key,
const struct key_match_data *match_data)
{
- return strcmp(key->description, match_data->raw_data) == 0;
+ bool match;
+
+ match = strcmp(key->description, match_data->raw_data) == 0;
+ if (match_data->domain_tag)
+ match &= key->index_key.domain_tag == match_data->domain_tag;
+
+ return match;
}
/*
@@ -957,6 +963,8 @@ key_ref_t keyring_search_tag(key_ref_t keyring,
if (recurse)
ctx.flags |= KEYRING_SEARCH_RECURSE;
+ if (domain_tag)
+ ctx.match_data.domain_tag = domain_tag;
if (type->match_preparse) {
ret = type->match_preparse(&ctx.match_data);
if (ret < 0)
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 15/30] ima: Add a reader counter to the integrity inode data
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
To detect ToMToU violations reader counter of the given inode is
checked. This is not enough, because the reader may exist in a
different ima namespace. Per inode reader counter tracks readers in all
ima namespaces, whereas the per namespace counter is necessary to avoid
false positives.
Add a new reader counter to the integrity inode cache entry.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
security/integrity/ima/ima_main.c | 71 ++++++++++++++++++++++---------
security/integrity/integrity.h | 18 ++++++++
2 files changed, 70 insertions(+), 19 deletions(-)
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index b933c7e6c8e1..6555cdf6b65e 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -28,6 +28,11 @@
#include "ima.h"
+struct ima_file_data {
+ struct ima_namespace *ima_ns;
+ bool is_readcount;
+};
+
int ima_hash_algo = HASH_ALGO_SHA1;
static int hash_setup_done;
@@ -116,8 +121,8 @@ static void ima_rdwr_violation_check(struct file *file,
iint = integrity_iint_rb_find(ima_ns->iint_tree,
inode);
/* IMA_MEASURE is set from reader side */
- if (iint && test_bit(IMA_MUST_MEASURE,
- &iint->atomic_flags))
+ if (iint && atomic_read(&iint->readcount) &&
+ test_bit(IMA_MUST_MEASURE, &iint->atomic_flags))
send_tomtou = true;
}
} else {
@@ -171,7 +176,7 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
{
fmode_t mode = file->f_mode;
bool update;
- struct ima_namespace *ima_ns = (struct ima_namespace *)file->f_ima;
+ struct ima_file_data *f_data = (struct ima_file_data *)file->f_ima;
if (!(mode & FMODE_WRITE))
return;
@@ -186,7 +191,7 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
iint->measured_pcrs = 0;
- ima_check_active_ns(ima_ns, inode);
+ ima_check_active_ns(f_data->ima_ns, inode);
if (update)
ima_update_xattr(iint, file);
@@ -207,8 +212,18 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
*/
int ima_file_alloc(struct file *file)
{
- file->f_ima = get_current_ns();
- get_ima_ns((struct ima_namespace *)file->f_ima);
+ struct ima_file_data *f_data;
+
+ f_data = kmalloc(sizeof(struct ima_file_data), GFP_KERNEL);
+ if (!f_data)
+ return -ENOMEM;
+
+ f_data->ima_ns = get_current_ns();
+ f_data->is_readcount = false;
+ get_ima_ns(f_data->ima_ns);
+
+ file->f_ima = f_data;
+
return 0;
}
@@ -222,27 +237,33 @@ void ima_file_free(struct file *file)
{
struct inode *inode = file_inode(file);
struct integrity_iint_cache *iint;
- struct ima_namespace *ima_ns = (struct ima_namespace *)file->f_ima;
+ struct ima_file_data *f_data = (struct ima_file_data *)file->f_ima;
if (unlikely(!(file->f_mode & FMODE_OPENED)))
goto out;
- if (!ima_ns->policy_data->ima_policy_flag || !S_ISREG(inode->i_mode))
+ if (!f_data->ima_ns->policy_data->ima_policy_flag ||
+ !S_ISREG(inode->i_mode))
goto out;
- iint = integrity_iint_rb_find(ima_ns->iint_tree, inode);
+ iint = integrity_iint_rb_find(f_data->ima_ns->iint_tree, inode);
if (!iint)
goto out;
ima_check_last_writer(iint, inode, file);
+
+ if (f_data->is_readcount)
+ iint_readcount_dec(iint);
out:
- put_ima_ns(ima_ns);
+ put_ima_ns(f_data->ima_ns);
+ kfree(f_data);
}
static int process_ns_measurement(struct file *file, const struct cred *cred,
u32 secid, char *buf, loff_t size, int mask,
enum ima_hooks func,
- struct ima_namespace *ima_ns)
+ struct ima_namespace *ima_ns,
+ bool readcount_open)
{
struct inode *inode = file_inode(file);
struct integrity_iint_cache *iint = NULL;
@@ -262,6 +283,12 @@ static int process_ns_measurement(struct file *file, const struct cred *cred,
if (!ima_ns->policy_data->ima_policy_flag)
return 0;
+ if (ima_ns != current_ima_ns) {
+ iint = integrity_iint_rb_find(ima_ns->iint_tree, inode);
+ if (!iint)
+ return 0;
+ }
+
/* Return an IMA_MEASURE, IMA_APPRAISE, IMA_AUDIT action
* bitmask based on the appraise/audit/measurement policy.
* Included is the appraise submask.
@@ -282,12 +309,18 @@ static int process_ns_measurement(struct file *file, const struct cred *cred,
inode_lock(inode);
- if (action) {
+ if (action && !iint) {
iint = integrity_inode_rb_get(ima_ns->iint_tree, inode);
if (!iint)
rc = -ENOMEM;
}
+ if ((ima_ns == current_ima_ns) && iint && readcount_open &&
+ ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)) {
+ iint_readcount_inc(iint);
+ ((struct ima_file_data *)file->f_ima)->is_readcount = true;
+ }
+
if (!rc && violation_check)
ima_rdwr_violation_check(file, iint, action & IMA_MEASURE,
&pathbuf, &pathname, filename, ima_ns);
@@ -429,7 +462,7 @@ static int process_ns_measurement(struct file *file, const struct cred *cred,
static int process_measurement(struct file *file, const struct cred *cred,
u32 secid, char *buf, loff_t size, int mask,
- enum ima_hooks func)
+ enum ima_hooks func, bool readcount_open)
{
int ret;
struct ima_namespace *ima_ns;
@@ -444,7 +477,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
continue;
ret = process_ns_measurement(file, cred, secid, buf, size, mask,
- func, ima_ns);
+ func, ima_ns, readcount_open);
if (ret != 0)
break;
}
@@ -471,7 +504,7 @@ int ima_file_mmap(struct file *file, unsigned long prot)
if (file && (prot & PROT_EXEC)) {
security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, NULL,
- 0, MAY_EXEC, MMAP_CHECK);
+ 0, MAY_EXEC, MMAP_CHECK, true);
}
return 0;
@@ -551,13 +584,13 @@ int ima_bprm_check(struct linux_binprm *bprm)
security_task_getsecid(current, &secid);
ret = process_measurement(bprm->file, current_cred(), secid, NULL, 0,
- MAY_EXEC, BPRM_CHECK);
+ MAY_EXEC, BPRM_CHECK, false);
if (ret)
return ret;
security_cred_getsecid(bprm->cred, &secid);
return process_measurement(bprm->file, bprm->cred, secid, NULL, 0,
- MAY_EXEC, CREDS_CHECK);
+ MAY_EXEC, CREDS_CHECK, false);
}
/**
@@ -577,7 +610,7 @@ int ima_file_check(struct file *file, int mask)
security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, NULL, 0,
mask & (MAY_READ | MAY_WRITE | MAY_EXEC |
- MAY_APPEND), FILE_CHECK);
+ MAY_APPEND), FILE_CHECK, true);
}
EXPORT_SYMBOL_GPL(ima_file_check);
@@ -779,7 +812,7 @@ int ima_post_read_file(struct file *file, void *buf, loff_t size,
func = read_idmap[read_id] ?: FILE_CHECK;
security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, buf, size,
- MAY_READ, func);
+ MAY_READ, func, false);
}
/**
diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
index 721d1850e4f9..c2981a98547e 100644
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -17,6 +17,7 @@
#include <crypto/sha.h>
#include <linux/key.h>
#include <linux/audit.h>
+#include <linux/atomic.h>
/* iint action cache flags */
#define IMA_MEASURE 0x00000001
@@ -138,6 +139,9 @@ struct integrity_iint_cache {
enum integrity_status ima_creds_status:4;
enum integrity_status evm_status:4;
struct ima_digest_data *ima_hash;
+ atomic_t readcount; /* Reader counter, incremented only in FILE_CHECK or
+ * MMAP_CHECK hooks, used for ima violation check.
+ */
};
struct integrity_iint_tree {
@@ -294,3 +298,17 @@ static inline void __init add_to_platform_keyring(const char *source,
{
}
#endif
+
+#ifdef CONFIG_IMA
+static inline void iint_readcount_inc(struct integrity_iint_cache *iint)
+{
+ atomic_inc(&iint->readcount);
+}
+static inline void iint_readcount_dec(struct integrity_iint_cache *iint)
+{
+ if (WARN_ON(!atomic_read(&iint->readcount)))
+ return;
+
+ atomic_dec(&iint->readcount);
+}
+#endif
--
2.20.1
^ permalink raw reply related
* [RFC PATCH 14/30] ima: Add per namespace view of the measurement list
From: krzysztof.struczynski @ 2020-08-18 15:42 UTC (permalink / raw)
To: linux-integrity, linux-kernel, containers, linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu,
Krzysztof Struczynski
In-Reply-To: <20200818154230.14016-1-krzysztof.struczynski@huawei.com>
From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Modify ima securityfs interface, so that only measurement list entries
that belong to the given ima namespace are visible/counted. The initial
ima namespace is an exception, its processes have access to all
measurement list entries.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
---
include/linux/ima.h | 1 +
security/integrity/ima/ima_fs.c | 57 ++++++++++++++++++++++++++----
security/integrity/ima/ima_init.c | 1 +
security/integrity/ima/ima_ns.c | 1 +
security/integrity/ima/ima_queue.c | 8 +++++
5 files changed, 61 insertions(+), 7 deletions(-)
diff --git a/include/linux/ima.h b/include/linux/ima.h
index df22143ffe30..d59ed38a4305 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -203,6 +203,7 @@ struct ima_namespace {
struct ima_policy_data *policy_data;
struct integrity_iint_tree *iint_tree;
struct list_head *measurements;
+ atomic_long_t ml_len; /* number of stored measurements in the list */
} __randomize_layout;
extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 4758e14c4a7b..e2893f0b0f31 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -63,7 +63,9 @@ static ssize_t ima_show_measurements_count(struct file *filp,
char __user *buf,
size_t count, loff_t *ppos)
{
- return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
+ struct ima_namespace *ima_ns = get_current_ns();
+
+ return ima_show_htable_value(buf, count, ppos, &ima_ns->ml_len);
}
@@ -77,10 +79,36 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
{
loff_t l = *pos;
struct ima_queue_entry *qe;
+ struct ima_namespace *ima_ns = get_current_ns();
+ unsigned int ns_id = get_ns_id(ima_ns);
+
+ if (ima_ns == &init_ima_ns) {
+ /* we need a lock since pos could point beyond last element */
+ rcu_read_lock();
+ list_for_each_entry_rcu(qe, &ima_measurements, later) {
+ if (!l--) {
+ rcu_read_unlock();
+ return qe;
+ }
+ }
+ rcu_read_unlock();
+ return NULL;
+ }
- /* we need a lock since pos could point beyond last element */
rcu_read_lock();
- list_for_each_entry_rcu(qe, &ima_measurements, later) {
+ qe = list_next_or_null_rcu(&ima_measurements,
+ ima_ns->measurements,
+ struct ima_queue_entry,
+ later);
+ if (!qe) {
+ rcu_read_unlock();
+ return NULL;
+ }
+
+ list_for_each_entry_from_rcu(qe, &ima_measurements, later) {
+ if (ns_id != qe->entry->ns_id)
+ continue;
+
if (!l--) {
rcu_read_unlock();
return qe;
@@ -93,12 +121,27 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
{
struct ima_queue_entry *qe = v;
+ struct ima_namespace *ima_ns = get_current_ns();
+ unsigned int ns_id = get_ns_id(ima_ns);
+
+ if (ima_ns == &init_ima_ns) {
+ /* lock protects when reading beyond last element
+ * against concurrent list-extension
+ */
+ rcu_read_lock();
+ qe = list_entry_rcu(qe->later.next, struct ima_queue_entry,
+ later);
+ rcu_read_unlock();
+ (*pos)++;
+
+ return (&qe->later == &ima_measurements) ? NULL : qe;
+ }
- /* lock protects when reading beyond last element
- * against concurrent list-extension
- */
rcu_read_lock();
- qe = list_entry_rcu(qe->later.next, struct ima_queue_entry, later);
+ list_for_each_entry_continue_rcu(qe, &ima_measurements, later) {
+ if (ns_id == qe->entry->ns_id)
+ break;
+ }
rcu_read_unlock();
(*pos)++;
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index 2100ee341dfc..ac9509d8c0f0 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -36,6 +36,7 @@ struct ima_namespace init_ima_ns = {
.policy_data = &init_policy_data,
.iint_tree = &init_iint_tree,
.measurements = &ima_measurements,
+ .ml_len = ATOMIC_LONG_INIT(0),
};
EXPORT_SYMBOL(init_ima_ns);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index f331187a4d3c..81de492baa99 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -126,6 +126,7 @@ static struct ima_namespace *clone_ima_ns(struct user_namespace *user_ns,
ns->user_ns = get_user_ns(user_ns);
ns->ucounts = ucounts;
ns->frozen = false;
+ atomic_long_set(&ns->ml_len, 0);
rwlock_init(&ns->iint_tree->lock);
ns->iint_tree->root = RB_ROOT;
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index bd890778c5be..ec5b3ca3ef92 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -227,6 +227,14 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
goto out;
}
+ /* Initial ima namespace has access to all measurement list entries,
+ * therefore always increment its measurement list length. Other
+ * namespaces can see only their own entries.
+ */
+ if (ima_ns != &init_ima_ns)
+ atomic_long_inc(&ima_ns->ml_len);
+ atomic_long_inc(&init_ima_ns.ml_len);
+
if (violation) /* invalidate pcr */
digests_arg = digests;
--
2.20.1
^ permalink raw reply related
* Re: [RFC PATCH 00/30] ima: Introduce IMA namespace
From: James Bottomley @ 2020-08-18 16:19 UTC (permalink / raw)
To: krzysztof.struczynski, linux-integrity, linux-kernel, containers,
linux-security-module
Cc: zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu
In-Reply-To: <20200818152037.11869-1-krzysztof.struczynski@huawei.com>
On Tue, 2020-08-18 at 17:20 +0200, krzysztof.struczynski@huawei.com
wrote:
> The measurement list remains global, with the assumption that there
> is only one TPM in the system. Each IMA namespace has a unique ID,
> that allows to track measurements per IMA namespace. Processes in one
> namespace, have access only to the measurements from that namespace.
> The exception is made for the initial IMA namespace, whose processes
> have access to all entries.
So I think this can work in the use case where the system owner is
responsible for doing the logging and attestation and the tenants just
trust the owner without requiring an attestation. However, in a multi-
tenant system you need a way for the attestation to be per-container
(because the combined list of who executed what would be a security
leak between tenants). Since we can't virtualise the PCRs without
introducing a vtpm this is going to require a vtpm infrastructure like
that used for virtual machines and then we can do IMA logging per
container.
I don't think the above has to be in your first patch set, we just have
to have an idea of how it could be done to show that nothing in this
patch set precludes a follow on from doing this.
James
^ permalink raw reply
* Re: [RFC PATCH 00/30] ima: Introduce IMA namespace
From: Christian Brauner @ 2020-08-18 16:49 UTC (permalink / raw)
To: krzysztof.struczynski
Cc: linux-integrity, linux-kernel, containers, linux-security-module,
zohar, stefanb, sunyuqiong1988, mkayaalp, dmitry.kasatkin, serge,
jmorris, christian, silviu.vlasceanu, roberto.sassu, ebiederm,
viro, torvalds, luto, jannh
In-Reply-To: <20200818152037.11869-1-krzysztof.struczynski@huawei.com>
On Tue, Aug 18, 2020 at 05:20:07PM +0200, krzysztof.struczynski@huawei.com wrote:
> From: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
>
> IMA has not been designed to work with containers. It handles every
> process in the same way, and it cannot distinguish if a process belongs to
> a container or not.
>
> Containers use namespaces to make it appear to the processes in the
> containers that they have their own isolated instance of the global
> resource. For IMA as well, it is desirable to let processes in the
IMA is brought up on a regular basis with "we want to have this" for
years and then non-one seems to really care enough.
I'm highly skeptical of the value of ~2500 lines of code even if it
includes a bunch of namespace boilerplate. It's yet another namespace,
and yet another security framework.
Why does IMA need to be a separate namespace? Keyrings are tied to user
namespaces why can't IMA be? I believe Eric has even pointed that out
before.
Eric, thoughts?
Christian
^ permalink raw reply
* Re: file metadata via fs API
From: Linus Torvalds @ 2020-08-18 18:51 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Steven Whitehouse, David Howells, linux-fsdevel, Al Viro,
Karel Zak, Jeff Layton, Miklos Szeredi, Nicolas Dichtel,
Christian Brauner, Lennart Poettering, Linux API, Ian Kent, LSM,
Linux Kernel Mailing List
In-Reply-To: <CAJfpegsr8URJHoFunnGShB-=jqypvtrmLV-BcWajkHux2H4x2w@mail.gmail.com>
On Tue, Aug 18, 2020 at 5:50 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> How do you propose handling variable size attributes, like the list of
> fs options?
I really REALLY think those things should just be ASCII data.
I think marshalling binary data is actively evil and wrong. It's great
for well-specified wire protocols. It's great for internal
communication in user space. It's *NOT* great for a kernel system call
interface.
One single simple binary structure? Sure. That's how system calls
work. I'm not claiming that things like "stat()" are wrong because
they take binary data.
But marshalling random binary structures into some buffer? Let's avoid
that. Particularly for these kinds of fairly free-form things like fs
options.
Those things *are* strings, most of them. Exactly because it needs a
level of flexibility that binary data just doesn't have.
So I'd suggest something that is very much like "statfsat()", which
gets a buffer and a length, and returns an extended "struct statfs"
*AND* just a string description at the end.
And if you don't pass a sufficiently big buffer, it will not do some
random "continuations". No state between system calls. It gets
truncated, and you need to pass a bigger buffer, kind of like
"snprintf()".
I think people who have problems parsing plain ASCII text are just
wrong. It's not that expensive. The thing that makes /proc/mounts
expensive is not the individual lines - it's that there are a lot of
them.
Linus
^ permalink raw reply
* Re: file metadata via fs API
From: Miklos Szeredi @ 2020-08-18 20:18 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Whitehouse, David Howells, linux-fsdevel, Al Viro,
Karel Zak, Jeff Layton, Miklos Szeredi, Nicolas Dichtel,
Christian Brauner, Lennart Poettering, Linux API, Ian Kent, LSM,
Linux Kernel Mailing List
In-Reply-To: <CAHk-=wh5YifP7hzKSbwJj94+DZ2czjrZsczy6GBimiogZws=rg@mail.gmail.com>
On Tue, Aug 18, 2020 at 8:51 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> I think people who have problems parsing plain ASCII text are just
> wrong. It's not that expensive. The thing that makes /proc/mounts
> expensive is not the individual lines - it's that there are a lot of
> them.
I agree completely with the above.
So why mix a binary structure into it? Would it not make more sense
to make it text only?
I.e. NAME=VALUE pairs separated by newlines and quoting non-printable chars.
Thanks,
Miklos
^ permalink raw reply
* Re: file metadata via fs API
From: Linus Torvalds @ 2020-08-18 20:53 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Steven Whitehouse, David Howells, linux-fsdevel, Al Viro,
Karel Zak, Jeff Layton, Miklos Szeredi, Nicolas Dichtel,
Christian Brauner, Lennart Poettering, Linux API, Ian Kent, LSM,
Linux Kernel Mailing List
In-Reply-To: <CAJfpegt9yEHX3C-sF9UyOXJcRa1cfDnf450OEJ47Xk=FmyEs8A@mail.gmail.com>
On Tue, Aug 18, 2020 at 1:18 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> So why mix a binary structure into it? Would it not make more sense
> to make it text only?
.. because for basic and standard stuff, the binary structure just
makes sense and is easier for everybody.
When I want to get the size of a file, I do "stat()" on it, and get
the size from st.st_size. That's convenient, and there's no reason
_not_ to do it. Returning the size as an ASCII string would be
completely pointless and annoying as hell.
So binary formats have their places. But those places are for standard
and well-understood fields that are commonly accessed and do not have
any free-form or wild components to them that needs to be marshalled
into some binary format.
Whenever you have free-form data, just use ASCII.
It's what "mount" already uses, for chrissake. We pass in mount
options as ASCII for a good reason.
Basically, I think a rough rule of thumb can and should be:
- stuff that the VFS knows about natively and fully is clearly pretty
mount-agnostic and generic, and can be represented in whatever
extended "struct statfs_x" directly.
- anything that is variable-format and per-fs should be expressed in
the ASCII buffer
Look at our fancy new fs_context - that's pretty much what it does
even inside the kernel. Sure, we have "binary" fields there for core
basic information ("struct dentry *root", but also things like flags
with MNT_NOSUID), but the configuration stuff is ASCII that the
filesystem can parse itself.
Exactly because some things are very much specific to some
filesystems, not generic things.
So we fundamentally already have a mix of "standard FS data" and
"filesystem-specific options", and it's already basically split that
way: binary flag fields for the generic stuff, and ASCII text for the
odd options.
Again: binary data isn't wrong when it's a fixed structure that didn't
need some odd massaging or marshalling or parsing. Just a simple fixed
structure. That is _the_ most common kernel interface, used for almost
everything. Sometimes we have arrays of them, but most of the time
it's a single struct pointer.
But binary data very much is wrong the moment you think you need to
have a parser to read it, or a marshaller to write it. Just use ASCII.
I really would prefer for the free-form data to have a lot of
commonalities with the /proc/mounts line. Not because that's a
wonderful format, but because there are very very few truly wonderful
formats out there, and in the absense of "wonderful", I'd much prefer
"familiar" and "able to use common helpers" (hopefully both on the
kernel side and the user side)..
Linus
^ permalink raw reply
* Re: file metadata via fs API
From: Al Viro @ 2020-08-19 2:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Miklos Szeredi, Steven Whitehouse, David Howells, linux-fsdevel,
Karel Zak, Jeff Layton, Miklos Szeredi, Nicolas Dichtel,
Christian Brauner, Lennart Poettering, Linux API, Ian Kent, LSM,
Linux Kernel Mailing List
In-Reply-To: <CAHk-=wh5YifP7hzKSbwJj94+DZ2czjrZsczy6GBimiogZws=rg@mail.gmail.com>
On Tue, Aug 18, 2020 at 11:51:25AM -0700, Linus Torvalds wrote:
> I think people who have problems parsing plain ASCII text are just
> wrong. It's not that expensive. The thing that makes /proc/mounts
> expensive is not the individual lines - it's that there are a lot of
> them.
It is expensive - if you use strdup() all over the place,
do asprintf() equivalents for concatenation, etc. IOW, you can write
BASIC (or javascript) in any language...
systemd used to be that bad - exactly in parsing /proc/mounts;
I hadn't checked that code lately, so it's possible that it had gotten
better, but about 4 years ago it had been awful. OTOH, at that time
I'd been looking at the atrocities kernel-side (in fs/pnode.c), where
on realistic setups we had O(N^2) allocations done, with all but O(N)
of them ending up freed before anyone could see them. So it's not as
if they had a monopoly on bloody awful code...
^ permalink raw reply
* Re: [PATCH bpf-next v8 3/7] bpf: Generalize bpf_sk_storage
From: KP Singh @ 2020-08-19 12:41 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: linux-kernel, bpf, linux-security-module, Alexei Starovoitov,
Daniel Borkmann, Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200818010545.iix72le4tkhuyqe5@kafai-mbp.dhcp.thefacebook.com>
On 8/18/20 3:05 AM, Martin KaFai Lau wrote:
> On Mon, Aug 03, 2020 at 06:46:51PM +0200, KP Singh wrote:
>> From: KP Singh <kpsingh@google.com>
>>
>> Refactor the functionality in bpf_sk_storage.c so that concept of
>> storage linked to kernel objects can be extended to other objects like
>> inode, task_struct etc.
>>
>> Each new local storage will still be a separate map and provide its own
>> set of helpers. This allows for future object specific extensions and
>> still share a lot of the underlying implementation.
>>
>> This includes the changes suggested by Martin in:
>>
>> https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/
>>
>> which adds map_local_storage_charge, map_local_storage_uncharge,
>> and map_owner_storage_ptr.
> A description will still be useful in the commit message to talk
> about the new map_ops, e.g.
> they allow kernel object to optionally have different mem-charge strategy.
>
>>
>> Co-developed-by: Martin KaFai Lau <kafai@fb.com>
>> Signed-off-by: KP Singh <kpsingh@google.com>
>> ---
>> include/linux/bpf.h | 9 ++
>> include/net/bpf_sk_storage.h | 51 +++++++
>> include/uapi/linux/bpf.h | 8 +-
>> net/core/bpf_sk_storage.c | 246 +++++++++++++++++++++------------
>> tools/include/uapi/linux/bpf.h | 8 +-
>> 5 files changed, 233 insertions(+), 89 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index cef4ef0d2b4e..8e1e23c60dc7 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -34,6 +34,9 @@ struct btf_type;
>> struct exception_table_entry;
>> struct seq_operations;
>> struct bpf_iter_aux_info;
>> +struct bpf_local_storage;
>> +struct bpf_local_storage_map;
>> +struct bpf_local_storage_elem;
> "struct bpf_local_storage_elem" is not needed.
True, I moved it to bpf_sk_storage.h because it's needed there.
>
>>
>> extern struct idr btf_idr;
>> extern spinlock_t btf_idr_lock;
>> @@ -104,6 +107,12 @@ struct bpf_map_ops {
>> __poll_t (*map_poll)(struct bpf_map *map, struct file *filp,
>> struct poll_table_struct *pts);
>>
>> + /* Functions called by bpf_local_storage maps */
>> + int (*map_local_storage_charge)(struct bpf_local_storage_map *smap,
>> + void *owner, u32 size);
>> + void (*map_local_storage_uncharge)(struct bpf_local_storage_map *smap,
>> + void *owner, u32 size);
>> + struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void *owner);
[...]
>> + struct bpf_local_storage_map *smap,
>> + struct bpf_local_storage_elem *first_selem);
>> +
>> +struct bpf_local_storage_data *
>> +bpf_local_storage_update(void *owner, struct bpf_map *map, void *value,
> Nit. It may be more consistent to take "struct bpf_local_storage_map *smap"
> instead of "struct bpf_map *map" here.
>
> bpf_local_storage_map_check_btf() will be the only one taking
> "struct bpf_map *map".
That's because it is used in map operations as map_check_btf which expects
a bpf_map *map pointer. We can wrap it in another function but is that
worth doing?
>
>> + u64 map_flags);
>> +
>> #ifdef CONFIG_BPF_SYSCALL
>> int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk);
>> struct bpf_sk_storage_diag *
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index b134e679e9db..35629752cec8 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -3647,9 +3647,13 @@ enum {
>> BPF_F_SYSCTL_BASE_NAME = (1ULL << 0),
>> };
>>
>> -/* BPF_FUNC_sk_storage_get flags */
>> +/* BPF_FUNC_<local>_storage_get flags */
> BPF_FUNC_<kernel_obj>_storage_get flags?
>
Done.
>> enum {
>> - BPF_SK_STORAGE_GET_F_CREATE = (1ULL << 0),
>> + BPF_LOCAL_STORAGE_GET_F_CREATE = (1ULL << 0),
>> + /* BPF_SK_STORAGE_GET_F_CREATE is only kept for backward compatibility
>> + * and BPF_LOCAL_STORAGE_GET_F_CREATE must be used instead.
>> + */
>> + BPF_SK_STORAGE_GET_F_CREATE = BPF_LOCAL_STORAGE_GET_F_CREATE,
>> };
>>
>> /* BPF_FUNC_read_branch_records flags. */
>> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
>> index 99dde7b74767..bb2375769ca1 100644
>> --- a/net/core/bpf_sk_storage.c
>> +++ b/net/core/bpf_sk_storage.c
>> @@ -84,7 +84,7 @@ struct bpf_local_storage_elem {
>> struct bpf_local_storage {
>> struct bpf_local_storage_data __rcu *cache[BPF_LOCAL_STORAGE_CACHE_SIZE];
>> struct hlist_head list; /* List of bpf_local_storage_elem */
>> - struct sock *owner; /* The object that owns the the above "list" of
[...]
>> }
>>
>> -/* sk_storage->lock must be held and selem->sk_storage == sk_storage.
>> +/* local_storage->lock must be held and selem->sk_storage == sk_storage.
> This name change belongs to patch 1.
>
> Also,
> selem->"local_"storage == "local_"storage
Done.
>
>> * The caller must ensure selem->smap is still valid to be
>> * dereferenced for its smap->elem_size and smap->cache_idx.
>> */
>
> [ ... ]
>
>> @@ -367,7 +401,7 @@ static int sk_storage_alloc(struct sock *sk,
>> /* Note that even first_selem was linked to smap's
>> * bucket->list, first_selem can be freed immediately
>> * (instead of kfree_rcu) because
>> - * bpf_sk_storage_map_free() does a
>> + * bpf_local_storage_map_free() does a
[...]
>> kfree(selem);
>> - atomic_sub(smap->elem_size, &sk->sk_omem_alloc);
>> + mem_uncharge(smap, owner, smap->elem_size);
>> return ERR_PTR(err);
>> }
>>
>> @@ -430,8 +464,8 @@ bpf_local_storage_update(struct sock *sk, struct bpf_map *map, void *value,
>> * such that it can avoid taking the local_storage->lock
>> * and changing the lists.
>> */
>> - old_sdata =
>> - bpf_local_storage_lookup(local_storage, smap, false);
>> + old_sdata = bpf_local_storage_lookup(local_storage, smap,
>> + false);
> Pure indentation change. The same line has been changed in patch 1. Please change
> the identation in patch 1 if the above way is preferred.
I removed this change.
>
>> err = check_flags(old_sdata, map_flags);
>> if (err)
>> return ERR_PTR(err);
>> @@ -475,7 +509,7 @@ bpf_local_storage_update(struct sock *sk, struct bpf_map *map, void *value,
>> * old_sdata will not be uncharged later during
>> * bpf_selem_unlink_storage().
>> */
>> - selem = bpf_selem_alloc(smap, sk, value, !old_sdata);
>> + selem = bpf_selem_alloc(smap, owner, value, !old_sdata);
>> if (!selem) {
>> err = -ENOMEM;
>> goto unlock_err;
>> @@ -567,7 +601,7 @@ void bpf_sk_storage_free(struct sock *sk)
>> * Thus, no elem can be added-to or deleted-from the
>> * sk_storage->list by the bpf_prog or by the bpf-map's syscall.
>> *
>> - * It is racing with bpf_sk_storage_map_free() alone
>> + * It is racing with bpf_local_storage_map_free() alone
> This name change belongs to patch 1 also.
Done.
>
>> * when unlinking elem from the sk_storage->list and
>> * the map's bucket->list.
>> */
>> @@ -587,17 +621,12 @@ void bpf_sk_storage_free(struct sock *sk)
>> kfree_rcu(sk_storage, rcu);
>> }
>>
>> -static void bpf_local_storage_map_free(struct bpf_map *map)
[..]
>>
>> /* bpf prog and the userspace can no longer access this map
>> * now. No new selem (of this map) can be added
>> - * to the sk->sk_bpf_storage or to the map bucket's list.
>> + * to the bpf_local_storage or to the map bucket's list.
> s/bpf_local_storage/owner->storage/
Done.
>
>> *
>> * The elem of this map can be cleaned up here
>> * or
>> - * by bpf_sk_storage_free() during __sk_destruct().
>> + * by bpf_local_storage_free() during the destruction of the
>> + * owner object. eg. __sk_destruct.
> This belongs to patch 1 also.
In patch, 1, changed it to:
* The elem of this map can be cleaned up here
* or when the storage is freed e.g.
* by bpf_sk_storage_free() during __sk_destruct().
>
>> */
>> for (i = 0; i < (1U << smap->bucket_log); i++) {
>> b = &smap->buckets[i];
>> @@ -627,22 +657,31 @@ static void bpf_local_storage_map_free(struct bpf_map *map)
>> rcu_read_unlock();
>> }
>>
>> - /* bpf_sk_storage_free() may still need to access the map.
>> - * e.g. bpf_sk_storage_free() has unlinked selem from the map
>> + /* bpf_local_storage_free() may still need to access the map.
> It is confusing. There is no bpf_local_storage_free().
/* While freeing the storage we may still need to access the map.
*
* e.g. when bpf_sk_storage_free() has unlinked selem from the map
* which then made the above while((selem = ...)) loop
* exited immediately.
>
>> + * e.g. bpf_local_storage_free() has unlinked selem from the map
>> * which then made the above while((selem = ...)) loop
>> * exited immediately.
>> *
>> - * However, the bpf_sk_storage_free() still needs to access
>> + * However, the bpf_local_storage_free() still needs to access
> Same here.
With the change above, this can stay bpf_sk_storage_free
>
>> * the smap->elem_size to do the uncharging in
>> * bpf_selem_unlink_storage().
>> *
>> * Hence, wait another rcu grace period for the
>> - * bpf_sk_storage_free() to finish.
>> + * bpf_local_storage_free() to finish.
> and here.
and this too can stay bpf_sk_storage_free
>
>> */
>> synchronize_rcu();
>>
>> kvfree(smap->buckets);
>> - kfree(map);
>> + kfree(smap);
>> +}
>> +
>> +static void sk_storage_map_free(struct bpf_map *map)
>> +{
[...]
_attr *attr)
>> raw_spin_lock_init(&smap->buckets[i].lock);
>> }
>>
>> - smap->elem_size = sizeof(struct bpf_local_storage_elem) + attr->value_size;
>> - smap->cache_idx = bpf_local_storage_cache_idx_get(&sk_cache);
>> + smap->elem_size =
>> + sizeof(struct bpf_local_storage_elem) + attr->value_size;
> Same line has changed in patch 1. Change the indentation in patch 1 also
> if the above way is desired.
Done.
> Others LGTM.
>
^ permalink raw reply
* Re: [PATCH bpf-next v8 6/7] bpf: Allow local storage to be used from LSM programs
From: KP Singh @ 2020-08-19 13:01 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: linux-kernel, bpf, linux-security-module, Alexei Starovoitov,
Daniel Borkmann, Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200818041638.2dv5cewlgwerd7hm@kafai-mbp.dhcp.thefacebook.com>
On 8/18/20 6:16 AM, Martin KaFai Lau wrote:
> On Mon, Aug 03, 2020 at 06:46:54PM +0200, KP Singh wrote:
>> From: KP Singh <kpsingh@google.com>
>>
>> Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used
>> in LSM programs. These helpers are not used for tracing programs
[...]
>> @@ -2823,6 +2823,10 @@ union bpf_attr {
>> * "type". The bpf-local-storage "type" (i.e. the *map*) is
>> * searched against all bpf-local-storages residing at *sk*.
>> *
>> + * For socket programs, *sk* should be a **struct bpf_sock** pointer
>> + * and an **ARG_PTR_TO_BTF_ID** of type **struct sock** for LSM
>> + * programs.
> I found it a little vague on what "socket programs" is. May be:
>
> *sk* is a kernel **struct sock** pointer for LSM program.
> *sk* is a **struct bpf_sock** pointer for other program types.
This is better, Thanks!
- KP
>
> Others LGTM
>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
>
^ permalink raw reply
* Re: [PATCH bpf-next v8 3/7] bpf: Generalize bpf_sk_storage
From: Martin KaFai Lau @ 2020-08-19 17:12 UTC (permalink / raw)
To: KP Singh
Cc: linux-kernel, bpf, linux-security-module, Alexei Starovoitov,
Daniel Borkmann, Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <6cb51fa0-61a5-2cf6-b44d-84d58d08c775@chromium.org>
On Wed, Aug 19, 2020 at 02:41:50PM +0200, KP Singh wrote:
> On 8/18/20 3:05 AM, Martin KaFai Lau wrote:
> > On Mon, Aug 03, 2020 at 06:46:51PM +0200, KP Singh wrote:
> >> From: KP Singh <kpsingh@google.com>
> >>
> >> Refactor the functionality in bpf_sk_storage.c so that concept of
> >> storage linked to kernel objects can be extended to other objects like
> >> inode, task_struct etc.
> >>
> >> Each new local storage will still be a separate map and provide its own
> >> set of helpers. This allows for future object specific extensions and
> >> still share a lot of the underlying implementation.
> >>
> >> This includes the changes suggested by Martin in:
> >>
> >> https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/
> >>
> >> which adds map_local_storage_charge, map_local_storage_uncharge,
> >> and map_owner_storage_ptr.
> > A description will still be useful in the commit message to talk
> > about the new map_ops, e.g.
> > they allow kernel object to optionally have different mem-charge strategy.
> >
> >>
> >> Co-developed-by: Martin KaFai Lau <kafai@fb.com>
> >> Signed-off-by: KP Singh <kpsingh@google.com>
> >> ---
> >> include/linux/bpf.h | 9 ++
> >> include/net/bpf_sk_storage.h | 51 +++++++
> >> include/uapi/linux/bpf.h | 8 +-
> >> net/core/bpf_sk_storage.c | 246 +++++++++++++++++++++------------
> >> tools/include/uapi/linux/bpf.h | 8 +-
> >> 5 files changed, 233 insertions(+), 89 deletions(-)
> >>
>
> >> + struct bpf_local_storage_map *smap,
> >> + struct bpf_local_storage_elem *first_selem);
> >> +
> >> +struct bpf_local_storage_data *
> >> +bpf_local_storage_update(void *owner, struct bpf_map *map, void *value,
> > Nit. It may be more consistent to take "struct bpf_local_storage_map *smap"
> > instead of "struct bpf_map *map" here.
> >
> > bpf_local_storage_map_check_btf() will be the only one taking
> > "struct bpf_map *map".
>
> That's because it is used in map operations as map_check_btf which expects
> a bpf_map *map pointer. We can wrap it in another function but is that
> worth doing?
Agree. bpf_local_storage_map_check_btf() should stay as is.
I meant to only change the "bpf_local_storage_update()" to take
"struct bpf_local_storage_map *smap".
> >
> >> *
> >> * The elem of this map can be cleaned up here
> >> * or
> >> - * by bpf_sk_storage_free() during __sk_destruct().
> >> + * by bpf_local_storage_free() during the destruction of the
> >> + * owner object. eg. __sk_destruct.
> > This belongs to patch 1 also.
>
>
> In patch, 1, changed it to:
>
> * The elem of this map can be cleaned up here
> * or when the storage is freed e.g.
> * by bpf_sk_storage_free() during __sk_destruct().
>
+1
^ permalink raw reply
* Re: [PATCH] mm/migrate: Avoid possible unnecessary ptrace_may_access() call in kernel_move_pages()
From: Kees Cook @ 2020-08-19 18:04 UTC (permalink / raw)
To: Miaohe Lin; +Cc: akpm, linux-mm, linux-kernel, linux-security-module
In-Reply-To: <20200817115933.44565-1-linmiaohe@huawei.com>
On Mon, Aug 17, 2020 at 07:59:33AM -0400, Miaohe Lin wrote:
> There is no need to check if this process has the right to modify the
> specified process when they are same.
>
> Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 34a842a8eb6a..342c1ce0b433 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1903,7 +1903,7 @@ static int kernel_move_pages(pid_t pid, unsigned long nr_pages,
> * Check if this process has the right to modify the specified
> * process. Use the regular "ptrace_may_access()" checks.
> */
> - if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
> + if (pid && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
> rcu_read_unlock();
> err = -EPERM;
> goto out;
NAK, please don't do this -- the ptrace and security hooks already do
these kinds of self-introspection checks, and I'd like to keep a central
place to perform these kinds of checks.
Is there a specific problem you've encountered that this fixes?
--
Kees Cook
^ permalink raw reply
* Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor
From: Mickaël Salaün @ 2020-08-19 18:53 UTC (permalink / raw)
To: Mark Rutland, Madhavan T. Venkataraman
Cc: kernel-hardening, linux-api, linux-arm-kernel, linux-fsdevel,
linux-integrity, linux-kernel, linux-security-module, oleg, x86
In-Reply-To: <20200812100650.GB28154@C02TD0UTHF1T.local>
On 12/08/2020 12:06, Mark Rutland wrote:
> On Thu, Aug 06, 2020 at 12:26:02PM -0500, Madhavan T. Venkataraman wrote:
>> Thanks for the lively discussion. I have tried to answer some of the
>> comments below.
>>
>> On 8/4/20 9:30 AM, Mark Rutland wrote:
>>>
>>>> So, the context is - if security settings in a system disallow a page to have
>>>> both write and execute permissions, how do you allow the execution of
>>>> genuine trampolines that are runtime generated and placed in a data
>>>> page or a stack page?
>>> There are options today, e.g.
>>>
>>> a) If the restriction is only per-alias, you can have distinct aliases
>>> where one is writable and another is executable, and you can make it
>>> hard to find the relationship between the two.
>>>
>>> b) If the restriction is only temporal, you can write instructions into
>>> an RW- buffer, transition the buffer to R--, verify the buffer
>>> contents, then transition it to --X.
>>>
>>> c) You can have two processes A and B where A generates instrucitons into
>>> a buffer that (only) B can execute (where B may be restricted from
>>> making syscalls like write, mprotect, etc).
>>
>> The general principle of the mitigation is W^X. I would argue that
>> the above options are violations of the W^X principle. If they are
>> allowed today, they must be fixed. And they will be. So, we cannot
>> rely on them.
>
> Hold on.
>
> Contemporary W^X means that a given virtual alias cannot be writeable
> and executeable simultaneously, permitting (a) and (b). If you read the
> references on the Wikipedia page for W^X you'll see the OpenBSD 3.3
> release notes and related presentation make this clear, and further they
> expect (b) to occur with JITS flipping W/X with mprotect().
W^X (with "permanent" mprotect restrictions [1]) goes back to 2000 with
PaX [2] (which predates partial OpenBSD implementation from 2003).
[1] https://pax.grsecurity.net/docs/mprotect.txt
[2] https://undeadly.org/cgi?action=article;sid=20030417082752
>
> Please don't conflate your assumed stronger semantics with the general
> principle. It not matching you expectations does not necessarily mean
> that it is wrong.
>
> If you want a stronger W^X semantics, please refer to this specifically
> with a distinct name.
>
>> a) This requires a remap operation. Two mappings point to the same
>> physical page. One mapping has W and the other one has X. This
>> is a violation of W^X.
>>
>> b) This is again a violation. The kernel should refuse to give execute
>> permission to a page that was writeable in the past and refuse to
>> give write permission to a page that was executable in the past.
>>
>> c) This is just a variation of (a).
>
> As above, this is not true.
>
> If you have a rationale for why this is desirable or necessary, please
> justify that before using this as justification for additional features.
>
>> In general, the problem with user-level methods to map and execute
>> dynamic code is that the kernel cannot tell if a genuine application is
>> using them or an attacker is using them or piggy-backing on them.
>
> Yes, and as I pointed out the same is true for trampfd unless you can
> somehow authenticate the calls are legitimate (in both callsite and the
> set of arguments), and I don't see any reasonable way of doing that.
>
> If you relax your threat model to an attacker not being able to make
> arbitrary syscalls, then your suggestion that userspace can perorm
> chceks between syscalls may be sufficient, but as I pointed out that's
> equally true for a sealed memfd or similar.
>
>> Off the top of my head, I have tried to identify some examples
>> where we can have more trust on dynamic code and have the kernel
>> permit its execution.
>>
>> 1. If the kernel can do the job, then that is one safe way. Here, the kernel
>> is the code. There is no code generation involved. This is what I
>> have presented in the patch series as the first cut.
>
> This is sleight-of-hand; it doesn't matter where the logic is performed
> if the power is identical. Practically speaking this is equivalent to
> some dynamic code generation.
>
> I think that it's misleading to say that because the kernel emulates
> something it is safe when the provenance of the syscall arguments cannot
> be verified.
>
> [...]
>
>> Anyway, these are just examples. The principle is - if we can identify
>> dynamic code that has a certain measure of trust, can the kernel
>> permit their execution?
>
> My point generally is that the kernel cannot identify this, and if
> usrspace code is trusted to dynamically generate trampfd arguments it
> can equally be trusted to dyncamilly generate code.
>
> [...]
>
>> As I have mentioned above, I intend to have the kernel generate code
>> only if the code generation is simple enough. For more complicated cases,
>> I plan to use a user-level code generator that is for exclusive kernel use.
>> I have yet to work out the details on how this would work. Need time.
>
> This reads to me like trampfd is only dealing with a few special cases
> and we know that we need a more general solution.
>
> I hope I am mistaken, but I get the strong impression that you're trying
> to justify your existing solution rather than trying to understand the
> problem space.
>
> To be clear, my strong opinion is that we should not be trying to do
> this sort of emulation or code generation within the kernel. I do think
> it's worthwhile to look at mechanisms to make it harder to subvert
> dynamic userspace code generation, but I think the code generation
> itself needs to live in userspace (e.g. for ABI reasons I previously
> mentioned).
>
> Mark.
>
^ permalink raw reply
* init_on_alloc/init_on_free boot options
From: Jirka Hladky @ 2020-08-19 22:18 UTC (permalink / raw)
To: Alexander Potapenko; +Cc: kernel-hardening, linux-mm, linux-security-module
Hi Alex,
Could you please help me to clarify the purpose of init_on_alloc=1
when init_on_free is enabled?
If I get it right, init_on_free=1 alone guarantees that the memory
returned by the page allocator and SL[AU]B is initialized with zeroes.
What is the purpose of init_on_alloc=1 in that case? We are zeroing
memory twice, or am I missing something?
Thanks a lot!
Jirka
^ permalink raw reply
* Re: [PATCH bpf-next v8 3/7] bpf: Generalize bpf_sk_storage
From: KP Singh @ 2020-08-19 22:19 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: linux-kernel, bpf, linux-security-module, Alexei Starovoitov,
Daniel Borkmann, Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200819171215.lcgoon3fbm4kvkpc@kafai-mbp.dhcp.thefacebook.com>
On 19.08.20 19:12, Martin KaFai Lau wrote:
> On Wed, Aug 19, 2020 at 02:41:50PM +0200, KP Singh wrote:
>> On 8/18/20 3:05 AM, Martin KaFai Lau wrote:
>>> On Mon, Aug 03, 2020 at 06:46:51PM +0200, KP Singh wrote:
>>>> From: KP Singh <kpsingh@google.com>
>>>>
>>>> Refactor the functionality in bpf_sk_storage.c so that concept of
[...]
>>>> + struct bpf_local_storage_map *smap,
>>>> + struct bpf_local_storage_elem *first_selem);
>>>> +
>>>> +struct bpf_local_storage_data *
>>>> +bpf_local_storage_update(void *owner, struct bpf_map *map, void *value,
>>> Nit. It may be more consistent to take "struct bpf_local_storage_map *smap"
>>> instead of "struct bpf_map *map" here.
>>>
>>> bpf_local_storage_map_check_btf() will be the only one taking
>>> "struct bpf_map *map".
>>
>> That's because it is used in map operations as map_check_btf which expects
>> a bpf_map *map pointer. We can wrap it in another function but is that
>> worth doing?
> Agree. bpf_local_storage_map_check_btf() should stay as is.
>
> I meant to only change the "bpf_local_storage_update()" to take
> "struct bpf_local_storage_map *smap".
>
Apologies, I misread that. Updated.
- KP
up here
>> * or when the storage is freed e.g.
>> * by bpf_sk_storage_free() during __sk_destruct().
>>
> +1
>
^ permalink raw reply
* Re: init_on_alloc/init_on_free boot options
From: Kees Cook @ 2020-08-19 23:36 UTC (permalink / raw)
To: Jirka Hladky
Cc: Alexander Potapenko, kernel-hardening, linux-mm,
linux-security-module
In-Reply-To: <CAE4VaGD8sKqUgOxr0im+OJgwrLxbbXDaKTSqpyAGRx=rr9isUg@mail.gmail.com>
On Thu, Aug 20, 2020 at 12:18:33AM +0200, Jirka Hladky wrote:
> Could you please help me to clarify the purpose of init_on_alloc=1
> when init_on_free is enabled?
It's to zero memory at allocation time. :) (They are independent
options.)
> If I get it right, init_on_free=1 alone guarantees that the memory
> returned by the page allocator and SL[AU]B is initialized with zeroes.
No, it's guarantees memory freed by the page/slab allocators are zeroed.
> What is the purpose of init_on_alloc=1 in that case? We are zeroing
> memory twice, or am I missing something?
If you have both enabled, yes, you will zero twice. (In theory, if you
have any kind of Use-After-Free/dangling pointers that get written
through after free and before alloc, those contents wouldn't strictly be
zero at alloc time without init_on_alloc. But that's pretty rare.
I wouldn't expect many people to run with both options enabled;
init_on_alloc is more performance-friendly (i.e. cache-friendly), and
init_on_free minimizes the lifetime of stale data in memory.
It appears that the shipping kernel defaults for several distros (Ubuntu,
Arch, Debian, others?) and devices (Android, Chrome OS, others?) are using
init_on_alloc=1. Will Fedora and/or RedHat be joining this trend? :)
--
Kees Cook
^ permalink raw reply
* Re: init_on_alloc/init_on_free boot options
From: Jirka Hladky @ 2020-08-20 0:35 UTC (permalink / raw)
To: Kees Cook
Cc: Alexander Potapenko, kernel-hardening, linux-mm,
linux-security-module
In-Reply-To: <202008191626.1420C63231@keescook>
Thanks a lot for the clarification! I was scratching my head if it
makes sense to enable both options simultaneously.
On Thu, Aug 20, 2020 at 1:36 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Aug 20, 2020 at 12:18:33AM +0200, Jirka Hladky wrote:
> > Could you please help me to clarify the purpose of init_on_alloc=1
> > when init_on_free is enabled?
>
> It's to zero memory at allocation time. :) (They are independent
> options.)
>
> > If I get it right, init_on_free=1 alone guarantees that the memory
> > returned by the page allocator and SL[AU]B is initialized with zeroes.
>
> No, it's guarantees memory freed by the page/slab allocators are zeroed.
>
> > What is the purpose of init_on_alloc=1 in that case? We are zeroing
> > memory twice, or am I missing something?
>
> If you have both enabled, yes, you will zero twice. (In theory, if you
> have any kind of Use-After-Free/dangling pointers that get written
> through after free and before alloc, those contents wouldn't strictly be
> zero at alloc time without init_on_alloc. But that's pretty rare.
>
> I wouldn't expect many people to run with both options enabled;
> init_on_alloc is more performance-friendly (i.e. cache-friendly), and
> init_on_free minimizes the lifetime of stale data in memory.
>
> It appears that the shipping kernel defaults for several distros (Ubuntu,
> Arch, Debian, others?) and devices (Android, Chrome OS, others?) are using
> init_on_alloc=1. Will Fedora and/or RedHat be joining this trend? :)
>
> --
> Kees Cook
>
--
-Jirka
^ permalink raw reply
* Re: [PATCH] mm/migrate: Avoid possible unnecessary ptrace_may_access() call in kernel_move_pages()
From: linmiaohe @ 2020-08-20 2:18 UTC (permalink / raw)
To: Kees Cook
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org
Kees Cook <keescook@chromium.org> wrote:
>On Mon, Aug 17, 2020 at 07:59:33AM -0400, Miaohe Lin wrote:
>> There is no need to check if this process has the right to modify the
>> specified process when they are same.
>>
>> Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>
>NAK, please don't do this -- the ptrace and security hooks already do these kinds of self-introspection checks, and I'd like to keep a central place to perform these kinds of checks.
>
Many thanks for your reply.
We also avoid get_task_struct/ put_task_struct pair of atomic ops, rcu_lock, task_lock and so on this way.
>Is there a specific problem you've encountered that this fixes?
>
I'am sorry but there's no specific problem. I do this mainly to skip the unnecessary ptrace and security hooks.
>--
>Kees Cook
Thanks again.
^ permalink raw reply
* Re: [PATCH 5.8 000/232] 5.8.3-rc1 review
From: Naresh Kamboju @ 2020-08-20 15:27 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: open list, Shuah Khan, patches, lkft-triage, Ben Hutchings,
linux- stable, Andrew Morton, Linus Torvalds, Guenter Roeck,
Herbert Xu, LTP List, linux-security-module, keyrings,
Eric Biggers
In-Reply-To: <20200820091612.692383444@linuxfoundation.org>
On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 5.8.3 release.
> There are 232 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
> Herbert Xu <herbert@gondor.apana.org.au>
> crypto: af_alg - Fix regression on empty requests
Results from Linaro’s test farm.
Regressions detected.
ltp-crypto-tests:
* af_alg02
ltp-cve-tests:
* cve-2017-17805
af_alg02.c:52: BROK: Timed out while reading from request socket.
We are running the LTP 20200515 tag released test suite.
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
Summary
------------------------------------------------------------------------
kernel: 5.8.3-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.8.y
git commit: 201fff807310ce10485bcff294d47be95f3769eb
git describe: v5.8.2-233-g201fff807310
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2-233-g201fff807310
Regressions (compared to build v5.8.2)
------------------------------------------------------------------------
x15:
ltp-crypto-tests:
* af_alg02
ltp-cve-tests:
* cve-2017-17805
--
Linaro LKFT
https://lkft.linaro.org
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox