linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/8] Improve the copy of task comm
@ 2024-08-17  2:56 Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 1/8] Get rid of __get_task_comm() Yafang Shao
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao

Using {memcpy,strncpy,strcpy,kstrdup} to copy the task comm relies on the
length of task comm. Changes in the task comm could result in a destination
string that is overflow. Therefore, we should explicitly ensure the
destination string is always NUL-terminated, regardless of the task comm.
This approach will facilitate future extensions to the task comm.

As suggested by Linus [0], we can identify all relevant code with the
following git grep command:

  git grep 'memcpy.*->comm\>'
  git grep 'kstrdup.*->comm\>'
  git grep 'strncpy.*->comm\>'
  git grep 'strcpy.*->comm\>'

PATCH #2~#4:   memcpy
PATCH #5~#6:   kstrdup
PATCH #7~#8:   strcpy

Please note that strncpy() is not included in this series as it is being
tracked by another effort. [1]

In this series, we have removed __get_task_comm() because the task_lock()
and BUILD_BUG_ON() within it are unnecessary.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/ [0]

Changes:
v6->v7:
- Improve the comment (Alejandro)
- Drop strncpy as it is being tracked by another effort (Justin)
  https://github.com/KSPP/linux/issues/90 [1]

v5->v6: https://lore.kernel.org/linux-mm/20240812022933.69850-1-laoar.shao@gmail.com/
- Get rid of __get_task_comm() (Linus)
- Use ARRAY_SIZE() in get_task_comm() (Alejandro)

v4->v5: https://lore.kernel.org/all/20240804075619.20804-1-laoar.shao@gmail.com/
- Drop changes in the mm/kmemleak.c as it was fixed by
  commit 0b84780134fb ("mm/kmemleak: replace strncpy() with strscpy()")
- Drop changes in kernel/tsacct.c as it was fixed by
  commmit 0fe2356434e ("tsacct: replace strncpy() with strscpy()")

v3->v4: https://lore.kernel.org/linux-mm/20240729023719.1933-1-laoar.shao@gmail.com/
- Rename __kstrndup() to __kmemdup_nul() and define it inside mm/util.c
  (Matthew)
- Remove unused local varaible (Simon)

v2->v3: https://lore.kernel.org/all/20240621022959.9124-1-laoar.shao@gmail.com/
- Deduplicate code around kstrdup (Andrew)
- Add commit log for dropping task_lock (Catalin)

v1->v2: https://lore.kernel.org/bpf/20240613023044.45873-1-laoar.shao@gmail.com/
- Add comment for dropping task_lock() in __get_task_comm() (Alexei)
- Drop changes in trace event (Steven)
- Fix comment on task comm (Matus)

v1: https://lore.kernel.org/all/20240602023754.25443-1-laoar.shao@gmail.com/

Yafang Shao (8):
  Get rid of __get_task_comm()
  auditsc: Replace memcpy() with strscpy()
  security: Replace memcpy() with get_task_comm()
  bpftool: Ensure task comm is always NUL-terminated
  mm/util: Fix possible race condition in kstrdup()
  mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  net: Replace strcpy() with strscpy()
  drm: Replace strcpy() with strscpy()

 drivers/gpu/drm/drm_framebuffer.c     |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 fs/exec.c                             | 10 -----
 fs/proc/array.c                       |  2 +-
 include/linux/sched.h                 | 32 +++++++++++---
 kernel/auditsc.c                      |  6 +--
 kernel/kthread.c                      |  2 +-
 mm/util.c                             | 61 ++++++++++++---------------
 net/ipv6/ndisc.c                      |  2 +-
 security/lsm_audit.c                  |  4 +-
 security/selinux/selinuxfs.c          |  2 +-
 tools/bpf/bpftool/pids.c              |  2 +
 12 files changed, 65 insertions(+), 62 deletions(-)

-- 
2.43.5


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v7 1/8] Get rid of __get_task_comm()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 2/8] auditsc: Replace memcpy() with strscpy() Yafang Shao
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Alexander Viro, Christian Brauner,
	Jan Kara, Kees Cook, Matus Jokay, Serge E. Hallyn

We want to eliminate the use of __get_task_comm() for the following
reasons:

- The task_lock() is unnecessary
  Quoted from Linus [0]:
  : Since user space can randomly change their names anyway, using locking
  : was always wrong for readers (for writers it probably does make sense
  : to have some lock - although practically speaking nobody cares there
  : either, but at least for a writer some kind of race could have
  : long-term mixed results

- The BUILD_BUG_ON() doesn't add any value
  The only requirement is to ensure that the destination buffer is a valid
  array.

- Zeroing is not necessary in current use cases
  To avoid confusion, we should remove it. Moreover, not zeroing could
  potentially make it easier to uncover bugs. If the caller needs a
  zero-padded task name, it should be explicitly handled at the call site.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com [0]
Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/
Suggested-by: Alejandro Colomar <alx@kernel.org>
Link: https://lore.kernel.org/all/2jxak5v6dfxlpbxhpm3ey7oup4g2lnr3ueurfbosf5wdo65dk4@srb3hsk72zwq
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Matus Jokay <matus.jokay@stuba.sk>
Cc: Alejandro Colomar <alx@kernel.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
---
 fs/exec.c             | 10 ----------
 fs/proc/array.c       |  2 +-
 include/linux/sched.h | 32 ++++++++++++++++++++++++++------
 kernel/kthread.c      |  2 +-
 4 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a47d0e4c54f6..2e468ddd203a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1264,16 +1264,6 @@ static int unshare_sighand(struct task_struct *me)
 	return 0;
 }
 
-char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
-{
-	task_lock(tsk);
-	/* Always NUL terminated and zero-padded */
-	strscpy_pad(buf, tsk->comm, buf_size);
-	task_unlock(tsk);
-	return buf;
-}
-EXPORT_SYMBOL_GPL(__get_task_comm);
-
 /*
  * These functions flushes out all traces of the currently running executable
  * so that a new one can be started
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 34a47fb0c57f..55ed3510d2bb 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 	else if (p->flags & PF_KTHREAD)
 		get_kthread_comm(tcomm, sizeof(tcomm), p);
 	else
-		__get_task_comm(tcomm, sizeof(tcomm), p);
+		get_task_comm(tcomm, p);
 
 	if (escape)
 		seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 33dd8d9d2b85..5f1c8a58bb76 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1096,9 +1096,12 @@ struct task_struct {
 	/*
 	 * executable name, excluding path.
 	 *
-	 * - normally initialized setup_new_exec()
-	 * - access it with [gs]et_task_comm()
-	 * - lock it with task_lock()
+	 * - normally initialized begin_new_exec()
+	 * - set it with set_task_comm()
+	 *   - strscpy_pad() to ensure it is always NUL-terminated and
+	 *     zero-padded
+	 *   - task_lock() to ensure the operation is atomic and the name is
+	 *     fully updated.
 	 */
 	char				comm[TASK_COMM_LEN];
 
@@ -1912,10 +1915,27 @@ static inline void set_task_comm(struct task_struct *tsk, const char *from)
 	__set_task_comm(tsk, from, false);
 }
 
-extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
+/*
+ * - Why not use task_lock()?
+ *   User space can randomly change their names anyway, so locking for readers
+ *   doesn't make sense. For writers, locking is probably necessary, as a race
+ *   condition could lead to long-term mixed results.
+ *   The strscpy_pad() in __set_task_comm() can ensure that the task comm is
+ *   always NUL-terminated and zero-padded. Therefore the race condition between
+ *   reader and writer is not an issue.
+ *
+ * - Why not use strscpy_pad()?
+ *   While strscpy_pad() prevents writing garbage past the NUL terminator, which
+ *   is useful when using the task name as a key in a hash map, most use cases
+ *   don't require this. Zero-padding might confuse users if it’s unnecessary,
+ *   and not zeroing might even make it easier to expose bugs. If you need a
+ *   zero-padded task name, please handle that explicitly at the call site.
+ *
+ * - ARRAY_SIZE() can help ensure that @buf is indeed an array.
+ */
 #define get_task_comm(buf, tsk) ({			\
-	BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
-	__get_task_comm(buf, sizeof(buf), tsk);		\
+	strscpy(buf, (tsk)->comm, ARRAY_SIZE(buf));	\
+	buf;						\
 })
 
 #ifdef CONFIG_SMP
diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..7d001d033cf9 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -101,7 +101,7 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
 	struct kthread *kthread = to_kthread(tsk);
 
 	if (!kthread || !kthread->full_name) {
-		__get_task_comm(buf, buf_size, tsk);
+		strscpy(buf, tsk->comm, buf_size);
 		return;
 	}
 
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 2/8] auditsc: Replace memcpy() with strscpy()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 1/8] Get rid of __get_task_comm() Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 3/8] security: Replace memcpy() with get_task_comm() Yafang Shao
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Paul Moore, Eric Paris

Using strscpy() to read the task comm ensures that the name is
always NUL-terminated, regardless of the source string. This approach also
facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@redhat.com>
---
 kernel/auditsc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 6f0d6fb6523f..e4ef5e57dde9 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2730,7 +2730,7 @@ void __audit_ptrace(struct task_struct *t)
 	context->target_uid = task_uid(t);
 	context->target_sessionid = audit_get_sessionid(t);
 	security_task_getsecid_obj(t, &context->target_sid);
-	memcpy(context->target_comm, t->comm, TASK_COMM_LEN);
+	strscpy(context->target_comm, t->comm);
 }
 
 /**
@@ -2757,7 +2757,7 @@ int audit_signal_info_syscall(struct task_struct *t)
 		ctx->target_uid = t_uid;
 		ctx->target_sessionid = audit_get_sessionid(t);
 		security_task_getsecid_obj(t, &ctx->target_sid);
-		memcpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
+		strscpy(ctx->target_comm, t->comm);
 		return 0;
 	}
 
@@ -2778,7 +2778,7 @@ int audit_signal_info_syscall(struct task_struct *t)
 	axp->target_uid[axp->pid_count] = t_uid;
 	axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
 	security_task_getsecid_obj(t, &axp->target_sid[axp->pid_count]);
-	memcpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
+	strscpy(axp->target_comm[axp->pid_count], t->comm);
 	axp->pid_count++;
 
 	return 0;
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 3/8] security: Replace memcpy() with get_task_comm()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 1/8] Get rid of __get_task_comm() Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 2/8] auditsc: Replace memcpy() with strscpy() Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Paul Moore, James Morris,
	Serge E. Hallyn, Stephen Smalley, Ondrej Mosnacek

Quoted from Linus [0]:

  selinux never wanted a lock, and never wanted any kind of *consistent*
  result, it just wanted a *stable* result.

Using get_task_comm() to read the task comm ensures that the name is
always NUL-terminated, regardless of the source string. This approach also
facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
LINK: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com/ [0]
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/lsm_audit.c         | 4 ++--
 security/selinux/selinuxfs.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 849e832719e2..9a8352972086 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -207,7 +207,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
 	BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);
 
 	audit_log_format(ab, " pid=%d comm=", task_tgid_nr(current));
-	audit_log_untrustedstring(ab, memcpy(comm, current->comm, sizeof(comm)));
+	audit_log_untrustedstring(ab, get_task_comm(comm, current));
 
 	switch (a->type) {
 	case LSM_AUDIT_DATA_NONE:
@@ -302,7 +302,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
 				char comm[sizeof(tsk->comm)];
 				audit_log_format(ab, " opid=%d ocomm=", pid);
 				audit_log_untrustedstring(ab,
-				    memcpy(comm, tsk->comm, sizeof(comm)));
+				    get_task_comm(comm, tsk));
 			}
 		}
 		break;
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index e172f182b65c..c9b05be27ddb 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -708,7 +708,7 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	if (new_value) {
 		char comm[sizeof(current->comm)];
 
-		memcpy(comm, current->comm, sizeof(comm));
+		strscpy(comm, current->comm);
 		pr_err("SELinux: %s (%d) set checkreqprot to 1. This is no longer supported.\n",
 		       comm, current->pid);
 	}
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (2 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 3/8] security: Replace memcpy() with get_task_comm() Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  8:38   ` Alejandro Colomar
  2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Quentin Monnet

Let's explicitly ensure the destination string is NUL-terminated. This way,
it won't be affected by changes to the source string.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
---
 tools/bpf/bpftool/pids.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
index 9b898571b49e..23f488cf1740 100644
--- a/tools/bpf/bpftool/pids.c
+++ b/tools/bpf/bpftool/pids.c
@@ -54,6 +54,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
 		ref = &refs->refs[refs->ref_cnt];
 		ref->pid = e->pid;
 		memcpy(ref->comm, e->comm, sizeof(ref->comm));
+		ref->comm[sizeof(ref->comm) - 1] = '\0';
 		refs->ref_cnt++;
 
 		return;
@@ -77,6 +78,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
 	ref = &refs->refs[0];
 	ref->pid = e->pid;
 	memcpy(ref->comm, e->comm, sizeof(ref->comm));
+	ref->comm[sizeof(ref->comm) - 1] = '\0';
 	refs->ref_cnt = 1;
 	refs->has_bpf_cookie = e->has_bpf_cookie;
 	refs->bpf_cookie = e->bpf_cookie;
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (3 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  8:48   ` Alejandro Colomar
                     ` (2 more replies)
  2024-08-17  2:56 ` [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao

In kstrdup(), it is critical to ensure that the dest string is always
NUL-terminated. However, potential race condidtion can occur between a
writer and a reader.

Consider the following scenario involving task->comm:

    reader                    writer

  len = strlen(s) + 1;
                             strlcpy(tsk->comm, buf, sizeof(tsk->comm));
  memcpy(buf, s, len);

In this case, there is a race condition between the reader and the
writer. The reader calculate the length of the string `s` based on the
old value of task->comm. However, during the memcpy(), the string `s`
might be updated by the writer to a new value of task->comm.

If the new task->comm is larger than the old one, the `buf` might not be
NUL-terminated. This can lead to undefined behavior and potential
security vulnerabilities.

Let's fix it by explicitly adding a NUL-terminator.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/util.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 983baf2bd675..4542d8a800d9 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
 
 	len = strlen(s) + 1;
 	buf = kmalloc_track_caller(len, gfp);
-	if (buf)
+	if (buf) {
 		memcpy(buf, s, len);
+		/* During memcpy(), the string might be updated to a new value,
+		 * which could be longer than the string when strlen() is
+		 * called. Therefore, we need to add a null termimator.
+		 */
+		buf[len - 1] = '\0';
+	}
 	return buf;
 }
 EXPORT_SYMBOL(kstrdup);
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (4 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  8:57   ` Alejandro Colomar
  2024-08-17  2:56 ` [PATCH v7 7/8] net: Replace strcpy() with strscpy() Yafang Shao
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Simon Horman, Matthew Wilcox

These three functions follow the same pattern. To deduplicate the code,
let's introduce a common helper __kmemdup_nul().

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
---
 mm/util.c | 67 +++++++++++++++++++++----------------------------------
 1 file changed, 26 insertions(+), 41 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 4542d8a800d9..310c7735c617 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -45,33 +45,40 @@ void kfree_const(const void *x)
 EXPORT_SYMBOL(kfree_const);
 
 /**
- * kstrdup - allocate space for and copy an existing string
- * @s: the string to duplicate
+ * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
+ * @s: The data to copy
+ * @len: The size of the data, including the null terminator
  * @gfp: the GFP mask used in the kmalloc() call when allocating memory
  *
- * Return: newly allocated copy of @s or %NULL in case of error
+ * Return: newly allocated copy of @s with NUL-termination or %NULL in
+ * case of error
  */
-noinline
-char *kstrdup(const char *s, gfp_t gfp)
+static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
 {
-	size_t len;
 	char *buf;
 
-	if (!s)
+	buf = kmalloc_track_caller(len, gfp);
+	if (!buf)
 		return NULL;
 
-	len = strlen(s) + 1;
-	buf = kmalloc_track_caller(len, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		/* During memcpy(), the string might be updated to a new value,
-		 * which could be longer than the string when strlen() is
-		 * called. Therefore, we need to add a null termimator.
-		 */
-		buf[len - 1] = '\0';
-	}
+	memcpy(buf, s, len);
+	/* Ensure the buf is always NUL-terminated, regardless of @s. */
+	buf[len - 1] = '\0';
 	return buf;
 }
+
+/**
+ * kstrdup - allocate space for and copy an existing string
+ * @s: the string to duplicate
+ * @gfp: the GFP mask used in the kmalloc() call when allocating memory
+ *
+ * Return: newly allocated copy of @s or %NULL in case of error
+ */
+noinline
+char *kstrdup(const char *s, gfp_t gfp)
+{
+	return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
+}
 EXPORT_SYMBOL(kstrdup);
 
 /**
@@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
  */
 char *kstrndup(const char *s, size_t max, gfp_t gfp)
 {
-	size_t len;
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	len = strnlen(s, max);
-	buf = kmalloc_track_caller(len+1, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		buf[len] = '\0';
-	}
-	return buf;
+	return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
 }
 EXPORT_SYMBOL(kstrndup);
 
@@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
  */
 char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
 {
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	buf = kmalloc_track_caller(len + 1, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		buf[len] = '\0';
-	}
-	return buf;
+	return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
 }
 EXPORT_SYMBOL(kmemdup_nul);
 
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 7/8] net: Replace strcpy() with strscpy()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (5 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-17  2:56 ` [PATCH v7 8/8] drm: " Yafang Shao
  2024-08-26  2:30 ` [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
  8 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

To prevent errors from occurring when the src string is longer than the dst
string in strcpy(), we should use strscpy() instead. This approach
also facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Ahern <dsahern@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv6/ndisc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 254b192c5705..17f2e787e6f8 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1942,7 +1942,7 @@ static void ndisc_warn_deprecated_sysctl(const struct ctl_table *ctl,
 	static char warncomm[TASK_COMM_LEN];
 	static int warned;
 	if (strcmp(warncomm, current->comm) && warned < 5) {
-		strcpy(warncomm, current->comm);
+		strscpy(warncomm, current->comm);
 		pr_warn("process `%s' is using deprecated sysctl (%s) net.ipv6.neigh.%s.%s - use net.ipv6.neigh.%s.%s_ms instead\n",
 			warncomm, func,
 			dev_name, ctl->procname,
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v7 8/8] drm: Replace strcpy() with strscpy()
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (6 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 7/8] net: Replace strcpy() with strscpy() Yafang Shao
@ 2024-08-17  2:56 ` Yafang Shao
  2024-08-26  2:30 ` [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
  8 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-17  2:56 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Yafang Shao, Daniel Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie

To prevent erros from occurring when the src string is longer than the
dst string in strcpy(), we should use strscpy() instead. This
approach also facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
---
 drivers/gpu/drm/drm_framebuffer.c     | 2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index 888aadb6a4ac..2d6993539474 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -868,7 +868,7 @@ int drm_framebuffer_init(struct drm_device *dev, struct drm_framebuffer *fb,
 	INIT_LIST_HEAD(&fb->filp_head);
 
 	fb->funcs = funcs;
-	strcpy(fb->comm, current->comm);
+	strscpy(fb->comm, current->comm);
 
 	ret = __drm_mode_object_add(dev, &fb->base, DRM_MODE_OBJECT_FB,
 				    false, drm_framebuffer_free);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 625b3c024540..374378ac7c85 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1411,7 +1411,7 @@ static bool record_context(struct i915_gem_context_coredump *e,
 	rcu_read_lock();
 	task = pid_task(ctx->pid, PIDTYPE_PID);
 	if (task) {
-		strcpy(e->comm, task->comm);
+		strscpy(e->comm, task->comm);
 		e->pid = task->pid;
 	}
 	rcu_read_unlock();
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated
  2024-08-17  2:56 ` [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
@ 2024-08-17  8:38   ` Alejandro Colomar
  2024-08-18  2:27     ` Yafang Shao
  0 siblings, 1 reply; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-17  8:38 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Quentin Monnet

[-- Attachment #1: Type: text/plain, Size: 1613 bytes --]

Hi Yafang,

On Sat, Aug 17, 2024 at 10:56:20AM GMT, Yafang Shao wrote:
> Let's explicitly ensure the destination string is NUL-terminated. This way,
> it won't be affected by changes to the source string.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Reviewed-by: Quentin Monnet <qmo@kernel.org>
> ---
>  tools/bpf/bpftool/pids.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
> index 9b898571b49e..23f488cf1740 100644
> --- a/tools/bpf/bpftool/pids.c
> +++ b/tools/bpf/bpftool/pids.c
> @@ -54,6 +54,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
>  		ref = &refs->refs[refs->ref_cnt];
>  		ref->pid = e->pid;
>  		memcpy(ref->comm, e->comm, sizeof(ref->comm));
> +		ref->comm[sizeof(ref->comm) - 1] = '\0';

Why doesn't this use strscpy()?  Isn't the source terminated?

Both the source and the destination measure 16 characters.  If it is
true that the source is not terminated, then this copy might truncate
the (non-)string by overwriting the last byte with a NUL.  Is that
truncation a good thing?

>  		refs->ref_cnt++;
>  
>  		return;
> @@ -77,6 +78,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
>  	ref = &refs->refs[0];
>  	ref->pid = e->pid;
>  	memcpy(ref->comm, e->comm, sizeof(ref->comm));
> +	ref->comm[sizeof(ref->comm) - 1] = '\0';

Same question here.

>  	refs->ref_cnt = 1;
>  	refs->has_bpf_cookie = e->has_bpf_cookie;
>  	refs->bpf_cookie = e->bpf_cookie;
> -- 
> 2.43.5
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
@ 2024-08-17  8:48   ` Alejandro Colomar
  2024-08-17 16:26     ` Linus Torvalds
  2024-09-28 21:17     ` Kees Cook
  2024-09-26 17:35   ` Andy Shevchenko
  2024-09-28 21:14   ` Kees Cook
  2 siblings, 2 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-17  8:48 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

[-- Attachment #1: Type: text/plain, Size: 2497 bytes --]

Hi Yafang,

On Sat, Aug 17, 2024 at 10:56:21AM GMT, Yafang Shao wrote:
> In kstrdup(), it is critical to ensure that the dest string is always
> NUL-terminated. However, potential race condidtion can occur between a
> writer and a reader.
> 
> Consider the following scenario involving task->comm:
> 
>     reader                    writer
> 
>   len = strlen(s) + 1;
>                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>   memcpy(buf, s, len);
> 
> In this case, there is a race condition between the reader and the
> writer. The reader calculate the length of the string `s` based on the
> old value of task->comm. However, during the memcpy(), the string `s`
> might be updated by the writer to a new value of task->comm.
> 
> If the new task->comm is larger than the old one, the `buf` might not be
> NUL-terminated. This can lead to undefined behavior and potential
> security vulnerabilities.
> 
> Let's fix it by explicitly adding a NUL-terminator.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  mm/util.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 983baf2bd675..4542d8a800d9 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
>  
>  	len = strlen(s) + 1;
>  	buf = kmalloc_track_caller(len, gfp);
> -	if (buf)
> +	if (buf) {
>  		memcpy(buf, s, len);
> +		/* During memcpy(), the string might be updated to a new value,
> +		 * which could be longer than the string when strlen() is
> +		 * called. Therefore, we need to add a null termimator.
> +		 */
> +		buf[len - 1] = '\0';
> +	}

I would compact the above to:

	len = strlen(s);
	buf = kmalloc_track_caller(len + 1, gfp);
	if (buf)
		strcpy(mempcpy(buf, s, len), "");

It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses
less screen.  It also has less moving parts.  (You'd need to write a
mempcpy() for the kernel, but that's as easy as the following:)

	#define mempcpy(d, s, n)  (memcpy(d, s, n) + n)

In shadow utils, I did a global replacement of all buf[...] = '\0'; by
strcpy(..., "");.  It ends up being optimized by the compiler to the
same code (at least in the experiments I did).


Have a lovely day!
Alex

>  	return buf;
>  }
>  EXPORT_SYMBOL(kstrdup);
> -- 
> 2.43.5
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-17  2:56 ` [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
@ 2024-08-17  8:57   ` Alejandro Colomar
  2024-08-17  9:05     ` Alejandro Colomar
  2024-08-26  9:20     ` Alejandro Colomar
  0 siblings, 2 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-17  8:57 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Simon Horman, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 4410 bytes --]

Hi Yafang,

On Sat, Aug 17, 2024 at 10:56:22AM GMT, Yafang Shao wrote:
> These three functions follow the same pattern. To deduplicate the code,
> let's introduce a common helper __kmemdup_nul().
> 
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Simon Horman <horms@kernel.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> ---
>  mm/util.c | 67 +++++++++++++++++++++----------------------------------
>  1 file changed, 26 insertions(+), 41 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 4542d8a800d9..310c7735c617 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -45,33 +45,40 @@ void kfree_const(const void *x)
>  EXPORT_SYMBOL(kfree_const);
>  
>  /**
> - * kstrdup - allocate space for and copy an existing string
> - * @s: the string to duplicate
> + * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
> + * @s: The data to copy
> + * @len: The size of the data, including the null terminator
>   * @gfp: the GFP mask used in the kmalloc() call when allocating memory
>   *
> - * Return: newly allocated copy of @s or %NULL in case of error
> + * Return: newly allocated copy of @s with NUL-termination or %NULL in
> + * case of error
>   */
> -noinline
> -char *kstrdup(const char *s, gfp_t gfp)
> +static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
>  {
> -	size_t len;
>  	char *buf;
>  
> -	if (!s)
> +	buf = kmalloc_track_caller(len, gfp);
> +	if (!buf)
>  		return NULL;
>  
> -	len = strlen(s) + 1;
> -	buf = kmalloc_track_caller(len, gfp);
> -	if (buf) {
> -		memcpy(buf, s, len);
> -		/* During memcpy(), the string might be updated to a new value,
> -		 * which could be longer than the string when strlen() is
> -		 * called. Therefore, we need to add a null termimator.
> -		 */
> -		buf[len - 1] = '\0';
> -	}
> +	memcpy(buf, s, len);
> +	/* Ensure the buf is always NUL-terminated, regardless of @s. */
> +	buf[len - 1] = '\0';
>  	return buf;
>  }
> +
> +/**
> + * kstrdup - allocate space for and copy an existing string
> + * @s: the string to duplicate
> + * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> + *
> + * Return: newly allocated copy of @s or %NULL in case of error
> + */
> +noinline
> +char *kstrdup(const char *s, gfp_t gfp)
> +{
> +	return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
> +}
>  EXPORT_SYMBOL(kstrdup);
>  
>  /**
> @@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
>   */
>  char *kstrndup(const char *s, size_t max, gfp_t gfp)
>  {
> -	size_t len;
> -	char *buf;
> -
> -	if (!s)
> -		return NULL;
> -
> -	len = strnlen(s, max);
> -	buf = kmalloc_track_caller(len+1, gfp);
> -	if (buf) {
> -		memcpy(buf, s, len);
> -		buf[len] = '\0';
> -	}
> -	return buf;
> +	return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
>  }
>  EXPORT_SYMBOL(kstrndup);
>  
> @@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
>   */
>  char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
>  {
> -	char *buf;
> -
> -	if (!s)
> -		return NULL;
> -
> -	buf = kmalloc_track_caller(len + 1, gfp);
> -	if (buf) {
> -		memcpy(buf, s, len);
> -		buf[len] = '\0';
> -	}
> -	return buf;
> +	return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
>  }
>  EXPORT_SYMBOL(kmemdup_nul);

I like the idea of the patch, but it's plagued with all those +1 and -1.
I think that's due to a bad choice of value being passed by.  If you
pass the actual length of the string (as suggested in my reply to the
previous patch) you should end up with a cleaner set of APIs.

The only remaining +1 is for kmalloc_track_caller(), which I ignore what
it does.

	char *
	__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
	{
		char *buf;

		buf = kmalloc_track_caller(len + 1, gfp);
		if (!buf)
			return NULL;

		strcpy(mempcpy(buf, s, len), "");
		return buf;
	}

	char *
	kstrdup(const char *s, gfp_t gfp)
	{
		return s ? __kmemdup_nul(s, strlen(s), gfp) : NULL;
	}

	char *
	kstrndup(const char *s, size_t n, gfp_t gfp)
	{
		return s ? __kmemdup_nul(s, strnlen(s, n), gfp) : NULL;
	}

	char *
	kmemdup_nul(const char *s, size_t len, gfp_t gfp)
	{
		return s ? __kmemdup_nul(s, len, gfp) : NULL;
	}

Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-17  8:57   ` Alejandro Colomar
@ 2024-08-17  9:05     ` Alejandro Colomar
  2024-08-26  9:20     ` Alejandro Colomar
  1 sibling, 0 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-17  9:05 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Simon Horman, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 5619 bytes --]

On Sat, Aug 17, 2024 at 10:58:02AM GMT, Alejandro Colomar wrote:
> Hi Yafang,
> 
> On Sat, Aug 17, 2024 at 10:56:22AM GMT, Yafang Shao wrote:
> > These three functions follow the same pattern. To deduplicate the code,
> > let's introduce a common helper __kmemdup_nul().
> > 
> > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Cc: Simon Horman <horms@kernel.org>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > ---
> >  mm/util.c | 67 +++++++++++++++++++++----------------------------------
> >  1 file changed, 26 insertions(+), 41 deletions(-)
> > 
> > diff --git a/mm/util.c b/mm/util.c
> > index 4542d8a800d9..310c7735c617 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -45,33 +45,40 @@ void kfree_const(const void *x)
> >  EXPORT_SYMBOL(kfree_const);
> >  
> >  /**
> > - * kstrdup - allocate space for and copy an existing string
> > - * @s: the string to duplicate
> > + * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
> > + * @s: The data to copy
> > + * @len: The size of the data, including the null terminator
> >   * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> >   *
> > - * Return: newly allocated copy of @s or %NULL in case of error
> > + * Return: newly allocated copy of @s with NUL-termination or %NULL in
> > + * case of error
> >   */
> > -noinline
> > -char *kstrdup(const char *s, gfp_t gfp)
> > +static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> >  {
> > -	size_t len;
> >  	char *buf;
> >  
> > -	if (!s)
> > +	buf = kmalloc_track_caller(len, gfp);
> > +	if (!buf)
> >  		return NULL;
> >  
> > -	len = strlen(s) + 1;
> > -	buf = kmalloc_track_caller(len, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		/* During memcpy(), the string might be updated to a new value,
> > -		 * which could be longer than the string when strlen() is
> > -		 * called. Therefore, we need to add a null termimator.
> > -		 */
> > -		buf[len - 1] = '\0';
> > -	}
> > +	memcpy(buf, s, len);
> > +	/* Ensure the buf is always NUL-terminated, regardless of @s. */
> > +	buf[len - 1] = '\0';
> >  	return buf;
> >  }
> > +
> > +/**
> > + * kstrdup - allocate space for and copy an existing string
> > + * @s: the string to duplicate
> > + * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> > + *
> > + * Return: newly allocated copy of @s or %NULL in case of error
> > + */
> > +noinline
> > +char *kstrdup(const char *s, gfp_t gfp)
> > +{
> > +	return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
> > +}
> >  EXPORT_SYMBOL(kstrdup);
> >  
> >  /**
> > @@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
> >   */
> >  char *kstrndup(const char *s, size_t max, gfp_t gfp)
> >  {
> > -	size_t len;
> > -	char *buf;
> > -
> > -	if (!s)
> > -		return NULL;
> > -
> > -	len = strnlen(s, max);
> > -	buf = kmalloc_track_caller(len+1, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		buf[len] = '\0';
> > -	}
> > -	return buf;
> > +	return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
> >  }
> >  EXPORT_SYMBOL(kstrndup);
> >  
> > @@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
> >   */
> >  char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> >  {
> > -	char *buf;
> > -
> > -	if (!s)
> > -		return NULL;
> > -
> > -	buf = kmalloc_track_caller(len + 1, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		buf[len] = '\0';
> > -	}
> > -	return buf;
> > +	return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
> >  }
> >  EXPORT_SYMBOL(kmemdup_nul);
> 
> I like the idea of the patch, but it's plagued with all those +1 and -1.
> I think that's due to a bad choice of value being passed by.  If you
> pass the actual length of the string (as suggested in my reply to the
> previous patch) you should end up with a cleaner set of APIs.
> 
> The only remaining +1 is for kmalloc_track_caller(), which I ignore what
> it does.

D'oh, of course that's the malloc.  Yes, it makes sense to have a +1
there.

> 
> 	char *
> 	__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> 	{
> 		char *buf;
> 
> 		buf = kmalloc_track_caller(len + 1, gfp);
> 		if (!buf)
> 			return NULL;
> 
> 		strcpy(mempcpy(buf, s, len), "");
> 		return buf;

Alternatively, you can also rewrite the above two lines into one as:

		return strncat(strcpy(buf, ""), s, len);

The good thing is that you have strncat() in the kernel, AFAICS.
I reminded myself when checking the definitions that I wrote in shadow:

	#define XSTRNDUP(s)                                           \
	(                                                             \
	    STRNCAT(strcpy(XMALLOC(strnlen(s, NITEMS(s)) + 1, char), ""), s) \
	)
	#define STRNDUPA(s)                                           \
	(                                                             \
	    STRNCAT(strcpy(alloca(strnlen(s, NITEMS(s)) + 1), ""), s) \
	)


Cheers,
Alex

> 	}
> 
> 	char *
> 	kstrdup(const char *s, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, strlen(s), gfp) : NULL;
> 	}
> 
> 	char *
> 	kstrndup(const char *s, size_t n, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, strnlen(s, n), gfp) : NULL;
> 	}
> 
> 	char *
> 	kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, len, gfp) : NULL;
> 	}
> 
> Have a lovely day!
> Alex
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  8:48   ` Alejandro Colomar
@ 2024-08-17 16:26     ` Linus Torvalds
  2024-08-17 17:03       ` Alejandro Colomar
  2024-09-28 21:17     ` Kees Cook
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2024-08-17 16:26 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Yafang Shao, akpm, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Sat, 17 Aug 2024 at 01:48, Alejandro Colomar <alx@kernel.org> wrote:
>
> I would compact the above to:
>
>         len = strlen(s);
>         buf = kmalloc_track_caller(len + 1, gfp);
>         if (buf)
>                 strcpy(mempcpy(buf, s, len), "");

No, we're not doing this kind of horror.

If _FORTIFY_SOURCE has problems with a simple "memcpy and add NUL",
then _FORTIFY_SOURCE needs to be fixed.

We don't replace a "buf[len] = 0" with strcpy(,""). Yes, compilers may
simplify it, but dammit, it's an unreadable incomprehensible mess to
humans, and humans still matter a LOT more.

                Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17 16:26     ` Linus Torvalds
@ 2024-08-17 17:03       ` Alejandro Colomar
  0 siblings, 0 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-17 17:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Yafang Shao, akpm, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

[-- Attachment #1: Type: text/plain, Size: 2724 bytes --]

Hi Linus,

On Sat, Aug 17, 2024 at 09:26:21AM GMT, Linus Torvalds wrote:
> On Sat, 17 Aug 2024 at 01:48, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > I would compact the above to:
> >
> >         len = strlen(s);
> >         buf = kmalloc_track_caller(len + 1, gfp);
> >         if (buf)
> >                 strcpy(mempcpy(buf, s, len), "");
> 
> No, we're not doing this kind of horror.

Ok.

> If _FORTIFY_SOURCE has problems with a simple "memcpy and add NUL",
> then _FORTIFY_SOURCE needs to be fixed.

_FORTIFY_SOURCE works (AFAIK) by replacing the usual string calls by
oneis that do some extra work to learn the real size of the buffers.
This means that for _FORTIFY_SOURCE to work, you need to actually call a
function.  Since the "add NUL" is not done in a function call, it's
unprotected (except that sanitizers may protect it via other means).
Here's the fortified version of strcpy(3) in the kernel:

	$ grepc -h -B15 strcpy ./include/linux/fortify-string.h
	/**
	 * strcpy - Copy a string into another string buffer
	 *
	 * @p: pointer to destination of copy
	 * @q: pointer to NUL-terminated source string to copy
	 *
	 * Do not use this function. While FORTIFY_SOURCE tries to avoid
	 * overflows, this is only possible when the sizes of @q and @p are
	 * known to the compiler. Prefer strscpy(), though note its different
	 * return values for detecting truncation.
	 *
	 * Returns @p.
	 *
	 */
	/* Defined after fortified strlen to reuse it. */
	__FORTIFY_INLINE __diagnose_as(__builtin_strcpy, 1, 2)
	char *strcpy(char * const POS p, const char * const POS q)
	{
		const size_t p_size = __member_size(p);
		const size_t q_size = __member_size(q);
		size_t size;

		/* If neither buffer size is known, immediately give up. */
		if (__builtin_constant_p(p_size) &&
		    __builtin_constant_p(q_size) &&
		    p_size == SIZE_MAX && q_size == SIZE_MAX)
			return __underlying_strcpy(p, q);
		size = strlen(q) + 1;
		/* Compile-time check for const size overflow. */
		if (__compiletime_lessthan(p_size, size))
			__write_overflow();
		/* Run-time check for dynamic size overflow. */
		if (p_size < size)
			fortify_panic(FORTIFY_FUNC_strcpy, FORTIFY_WRITE, p_size, size, p);
		__underlying_memcpy(p, q, size);
		return p;
	}

> We don't replace a "buf[len] = 0" with strcpy(,""). Yes, compilers may
> simplify it, but dammit, it's an unreadable incomprehensible mess to
> humans, and humans still matter a LOT more.

I understand.  While I don't consider it unreadable anymore (I guess I
got used to it), it felt strange at first.

> 
>                 Linus

Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated
  2024-08-17  8:38   ` Alejandro Colomar
@ 2024-08-18  2:27     ` Yafang Shao
  2024-08-18  8:25       ` Alejandro Colomar
  0 siblings, 1 reply; 27+ messages in thread
From: Yafang Shao @ 2024-08-18  2:27 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Quentin Monnet

On Sat, Aug 17, 2024 at 4:39 PM Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi Yafang,
>
> On Sat, Aug 17, 2024 at 10:56:20AM GMT, Yafang Shao wrote:
> > Let's explicitly ensure the destination string is NUL-terminated. This way,
> > it won't be affected by changes to the source string.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Reviewed-by: Quentin Monnet <qmo@kernel.org>
> > ---
> >  tools/bpf/bpftool/pids.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
> > index 9b898571b49e..23f488cf1740 100644
> > --- a/tools/bpf/bpftool/pids.c
> > +++ b/tools/bpf/bpftool/pids.c
> > @@ -54,6 +54,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
> >               ref = &refs->refs[refs->ref_cnt];
> >               ref->pid = e->pid;
> >               memcpy(ref->comm, e->comm, sizeof(ref->comm));
> > +             ref->comm[sizeof(ref->comm) - 1] = '\0';
>
> Why doesn't this use strscpy()?

bpftool is a userspace tool, so strscpy() is only applicable in kernel
code, correct?

> Isn't the source terminated?

It is currently terminated, but I believe we should avoid relying on
the source. Making it independent of the source would reduce potential
errors.

>
> Both the source and the destination measure 16 characters.  If it is
> true that the source is not terminated, then this copy might truncate
> the (non-)string by overwriting the last byte with a NUL.  Is that
> truncation a good thing?

It's not ideal, but we should still convert it to a string, even if it
ends up being truncated.

--
Regards
Yafang

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated
  2024-08-18  2:27     ` Yafang Shao
@ 2024-08-18  8:25       ` Alejandro Colomar
  0 siblings, 0 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-18  8:25 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Quentin Monnet

[-- Attachment #1: Type: text/plain, Size: 1608 bytes --]

Hi Yafang,

On Sun, Aug 18, 2024 at 10:27:01AM GMT, Yafang Shao wrote:
> On Sat, Aug 17, 2024 at 4:39 PM Alejandro Colomar <alx@kernel.org> wrote:
> >
> > Hi Yafang,
> >
> > On Sat, Aug 17, 2024 at 10:56:20AM GMT, Yafang Shao wrote:
> > > Let's explicitly ensure the destination string is NUL-terminated. This way,
> > > it won't be affected by changes to the source string.
> > >
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > Reviewed-by: Quentin Monnet <qmo@kernel.org>
> > > ---
> > >  tools/bpf/bpftool/pids.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
> > > index 9b898571b49e..23f488cf1740 100644
> > > --- a/tools/bpf/bpftool/pids.c
> > > +++ b/tools/bpf/bpftool/pids.c
> > > @@ -54,6 +54,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
> > >               ref = &refs->refs[refs->ref_cnt];
> > >               ref->pid = e->pid;
> > >               memcpy(ref->comm, e->comm, sizeof(ref->comm));
> > > +             ref->comm[sizeof(ref->comm) - 1] = '\0';
> >
> > Why doesn't this use strscpy()?
> 
> bpftool is a userspace tool, so strscpy() is only applicable in kernel
> code, correct?

Ahh, makes sense.  LGTM, then.  Maybe the closest user-space function to
strscpy(9) would be strlcpy(3), but I don't know how old of a glibc you
support.  strlcpy(3) is currently in POSIX, and supported by both glibc
and musl, but that's too recent.

Have a lovely day!
Alex

> --
> Regards
> Yafang

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/8] Improve the copy of task comm
  2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
                   ` (7 preceding siblings ...)
  2024-08-17  2:56 ` [PATCH v7 8/8] drm: " Yafang Shao
@ 2024-08-26  2:30 ` Yafang Shao
  2024-08-28  1:19   ` Andrew Morton
  8 siblings, 1 reply; 27+ messages in thread
From: Yafang Shao @ 2024-08-26  2:30 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Sat, Aug 17, 2024 at 10:56 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Using {memcpy,strncpy,strcpy,kstrdup} to copy the task comm relies on the
> length of task comm. Changes in the task comm could result in a destination
> string that is overflow. Therefore, we should explicitly ensure the
> destination string is always NUL-terminated, regardless of the task comm.
> This approach will facilitate future extensions to the task comm.
>
> As suggested by Linus [0], we can identify all relevant code with the
> following git grep command:
>
>   git grep 'memcpy.*->comm\>'
>   git grep 'kstrdup.*->comm\>'
>   git grep 'strncpy.*->comm\>'
>   git grep 'strcpy.*->comm\>'
>
> PATCH #2~#4:   memcpy
> PATCH #5~#6:   kstrdup
> PATCH #7~#8:   strcpy
>
> Please note that strncpy() is not included in this series as it is being
> tracked by another effort. [1]
>
> In this series, we have removed __get_task_comm() because the task_lock()
> and BUILD_BUG_ON() within it are unnecessary.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Link: https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/ [0]
>
> Changes:
> v6->v7:
> - Improve the comment (Alejandro)
> - Drop strncpy as it is being tracked by another effort (Justin)
>   https://github.com/KSPP/linux/issues/90 [1]
>
> v5->v6: https://lore.kernel.org/linux-mm/20240812022933.69850-1-laoar.shao@gmail.com/
> - Get rid of __get_task_comm() (Linus)
> - Use ARRAY_SIZE() in get_task_comm() (Alejandro)
>
> v4->v5: https://lore.kernel.org/all/20240804075619.20804-1-laoar.shao@gmail.com/
> - Drop changes in the mm/kmemleak.c as it was fixed by
>   commit 0b84780134fb ("mm/kmemleak: replace strncpy() with strscpy()")
> - Drop changes in kernel/tsacct.c as it was fixed by
>   commmit 0fe2356434e ("tsacct: replace strncpy() with strscpy()")
>
> v3->v4: https://lore.kernel.org/linux-mm/20240729023719.1933-1-laoar.shao@gmail.com/
> - Rename __kstrndup() to __kmemdup_nul() and define it inside mm/util.c
>   (Matthew)
> - Remove unused local varaible (Simon)
>
> v2->v3: https://lore.kernel.org/all/20240621022959.9124-1-laoar.shao@gmail.com/
> - Deduplicate code around kstrdup (Andrew)
> - Add commit log for dropping task_lock (Catalin)
>
> v1->v2: https://lore.kernel.org/bpf/20240613023044.45873-1-laoar.shao@gmail.com/
> - Add comment for dropping task_lock() in __get_task_comm() (Alexei)
> - Drop changes in trace event (Steven)
> - Fix comment on task comm (Matus)
>
> v1: https://lore.kernel.org/all/20240602023754.25443-1-laoar.shao@gmail.com/
>
> Yafang Shao (8):
>   Get rid of __get_task_comm()
>   auditsc: Replace memcpy() with strscpy()
>   security: Replace memcpy() with get_task_comm()
>   bpftool: Ensure task comm is always NUL-terminated
>   mm/util: Fix possible race condition in kstrdup()
>   mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
>   net: Replace strcpy() with strscpy()
>   drm: Replace strcpy() with strscpy()
>
>  drivers/gpu/drm/drm_framebuffer.c     |  2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
>  fs/exec.c                             | 10 -----
>  fs/proc/array.c                       |  2 +-
>  include/linux/sched.h                 | 32 +++++++++++---
>  kernel/auditsc.c                      |  6 +--
>  kernel/kthread.c                      |  2 +-
>  mm/util.c                             | 61 ++++++++++++---------------
>  net/ipv6/ndisc.c                      |  2 +-
>  security/lsm_audit.c                  |  4 +-
>  security/selinux/selinuxfs.c          |  2 +-
>  tools/bpf/bpftool/pids.c              |  2 +
>  12 files changed, 65 insertions(+), 62 deletions(-)
>
> --
> 2.43.5
>

Hello Andrew,

Could you please apply this series to the mm tree ?

-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-17  8:57   ` Alejandro Colomar
  2024-08-17  9:05     ` Alejandro Colomar
@ 2024-08-26  9:20     ` Alejandro Colomar
  2024-08-26 13:13       ` Yafang Shao
  1 sibling, 1 reply; 27+ messages in thread
From: Alejandro Colomar @ 2024-08-26  9:20 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Simon Horman, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 5092 bytes --]

Hi Yafang,

On Sat, Aug 17, 2024 at 10:58:02AM GMT, Alejandro Colomar wrote:
> Hi Yafang,
> 
> On Sat, Aug 17, 2024 at 10:56:22AM GMT, Yafang Shao wrote:
> > These three functions follow the same pattern. To deduplicate the code,
> > let's introduce a common helper __kmemdup_nul().
> > 
> > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Cc: Simon Horman <horms@kernel.org>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > ---
> >  mm/util.c | 67 +++++++++++++++++++++----------------------------------
> >  1 file changed, 26 insertions(+), 41 deletions(-)
> > 
> > diff --git a/mm/util.c b/mm/util.c
> > index 4542d8a800d9..310c7735c617 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -45,33 +45,40 @@ void kfree_const(const void *x)
> >  EXPORT_SYMBOL(kfree_const);
> >  
> >  /**
> > - * kstrdup - allocate space for and copy an existing string
> > - * @s: the string to duplicate
> > + * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
> > + * @s: The data to copy
> > + * @len: The size of the data, including the null terminator
> >   * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> >   *
> > - * Return: newly allocated copy of @s or %NULL in case of error
> > + * Return: newly allocated copy of @s with NUL-termination or %NULL in
> > + * case of error
> >   */
> > -noinline
> > -char *kstrdup(const char *s, gfp_t gfp)
> > +static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> >  {
> > -	size_t len;
> >  	char *buf;
> >  
> > -	if (!s)
> > +	buf = kmalloc_track_caller(len, gfp);
> > +	if (!buf)
> >  		return NULL;
> >  
> > -	len = strlen(s) + 1;
> > -	buf = kmalloc_track_caller(len, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		/* During memcpy(), the string might be updated to a new value,
> > -		 * which could be longer than the string when strlen() is
> > -		 * called. Therefore, we need to add a null termimator.
> > -		 */
> > -		buf[len - 1] = '\0';
> > -	}
> > +	memcpy(buf, s, len);
> > +	/* Ensure the buf is always NUL-terminated, regardless of @s. */
> > +	buf[len - 1] = '\0';
> >  	return buf;
> >  }
> > +
> > +/**
> > + * kstrdup - allocate space for and copy an existing string
> > + * @s: the string to duplicate
> > + * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> > + *
> > + * Return: newly allocated copy of @s or %NULL in case of error
> > + */
> > +noinline
> > +char *kstrdup(const char *s, gfp_t gfp)
> > +{
> > +	return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
> > +}
> >  EXPORT_SYMBOL(kstrdup);
> >  
> >  /**
> > @@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
> >   */
> >  char *kstrndup(const char *s, size_t max, gfp_t gfp)
> >  {
> > -	size_t len;
> > -	char *buf;
> > -
> > -	if (!s)
> > -		return NULL;
> > -
> > -	len = strnlen(s, max);
> > -	buf = kmalloc_track_caller(len+1, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		buf[len] = '\0';
> > -	}
> > -	return buf;
> > +	return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
> >  }
> >  EXPORT_SYMBOL(kstrndup);
> >  
> > @@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
> >   */
> >  char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> >  {
> > -	char *buf;
> > -
> > -	if (!s)
> > -		return NULL;
> > -
> > -	buf = kmalloc_track_caller(len + 1, gfp);
> > -	if (buf) {
> > -		memcpy(buf, s, len);
> > -		buf[len] = '\0';
> > -	}
> > -	return buf;
> > +	return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
> >  }
> >  EXPORT_SYMBOL(kmemdup_nul);
> 
> I like the idea of the patch, but it's plagued with all those +1 and -1.
> I think that's due to a bad choice of value being passed by.  If you
> pass the actual length of the string (as suggested in my reply to the
> previous patch) you should end up with a cleaner set of APIs.
> 
> The only remaining +1 is for kmalloc_track_caller(), which I ignore what
> it does.
> 
> 	char *
> 	__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> 	{
> 		char *buf;
> 
> 		buf = kmalloc_track_caller(len + 1, gfp);
> 		if (!buf)
> 			return NULL;
> 
> 		strcpy(mempcpy(buf, s, len), "");

Changing these strcpy(, "") to the usual; ='\0' or =0, but I'd still
recommend the rest of the changes, that is, changing the value passed in
len, to remove several +1 and -1s.

What do you think?

Have a lovely day!
Alex

> 		return buf;
> 	}
> 
> 	char *
> 	kstrdup(const char *s, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, strlen(s), gfp) : NULL;
> 	}
> 
> 	char *
> 	kstrndup(const char *s, size_t n, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, strnlen(s, n), gfp) : NULL;
> 	}
> 
> 	char *
> 	kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> 	{
> 		return s ? __kmemdup_nul(s, len, gfp) : NULL;
> 	}
> 
> Have a lovely day!
> Alex
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-26  9:20     ` Alejandro Colomar
@ 2024-08-26 13:13       ` Yafang Shao
  0 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-08-26 13:13 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: akpm, torvalds, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Simon Horman, Matthew Wilcox

On Mon, Aug 26, 2024 at 5:25 PM Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi Yafang,
>
> On Sat, Aug 17, 2024 at 10:58:02AM GMT, Alejandro Colomar wrote:
> > Hi Yafang,
> >
> > On Sat, Aug 17, 2024 at 10:56:22AM GMT, Yafang Shao wrote:
> > > These three functions follow the same pattern. To deduplicate the code,
> > > let's introduce a common helper __kmemdup_nul().
> > >
> > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > Cc: Simon Horman <horms@kernel.org>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > ---
> > >  mm/util.c | 67 +++++++++++++++++++++----------------------------------
> > >  1 file changed, 26 insertions(+), 41 deletions(-)
> > >
> > > diff --git a/mm/util.c b/mm/util.c
> > > index 4542d8a800d9..310c7735c617 100644
> > > --- a/mm/util.c
> > > +++ b/mm/util.c
> > > @@ -45,33 +45,40 @@ void kfree_const(const void *x)
> > >  EXPORT_SYMBOL(kfree_const);
> > >
> > >  /**
> > > - * kstrdup - allocate space for and copy an existing string
> > > - * @s: the string to duplicate
> > > + * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
> > > + * @s: The data to copy
> > > + * @len: The size of the data, including the null terminator
> > >   * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> > >   *
> > > - * Return: newly allocated copy of @s or %NULL in case of error
> > > + * Return: newly allocated copy of @s with NUL-termination or %NULL in
> > > + * case of error
> > >   */
> > > -noinline
> > > -char *kstrdup(const char *s, gfp_t gfp)
> > > +static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> > >  {
> > > -   size_t len;
> > >     char *buf;
> > >
> > > -   if (!s)
> > > +   buf = kmalloc_track_caller(len, gfp);
> > > +   if (!buf)
> > >             return NULL;
> > >
> > > -   len = strlen(s) + 1;
> > > -   buf = kmalloc_track_caller(len, gfp);
> > > -   if (buf) {
> > > -           memcpy(buf, s, len);
> > > -           /* During memcpy(), the string might be updated to a new value,
> > > -            * which could be longer than the string when strlen() is
> > > -            * called. Therefore, we need to add a null termimator.
> > > -            */
> > > -           buf[len - 1] = '\0';
> > > -   }
> > > +   memcpy(buf, s, len);
> > > +   /* Ensure the buf is always NUL-terminated, regardless of @s. */
> > > +   buf[len - 1] = '\0';
> > >     return buf;
> > >  }
> > > +
> > > +/**
> > > + * kstrdup - allocate space for and copy an existing string
> > > + * @s: the string to duplicate
> > > + * @gfp: the GFP mask used in the kmalloc() call when allocating memory
> > > + *
> > > + * Return: newly allocated copy of @s or %NULL in case of error
> > > + */
> > > +noinline
> > > +char *kstrdup(const char *s, gfp_t gfp)
> > > +{
> > > +   return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
> > > +}
> > >  EXPORT_SYMBOL(kstrdup);
> > >
> > >  /**
> > > @@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
> > >   */
> > >  char *kstrndup(const char *s, size_t max, gfp_t gfp)
> > >  {
> > > -   size_t len;
> > > -   char *buf;
> > > -
> > > -   if (!s)
> > > -           return NULL;
> > > -
> > > -   len = strnlen(s, max);
> > > -   buf = kmalloc_track_caller(len+1, gfp);
> > > -   if (buf) {
> > > -           memcpy(buf, s, len);
> > > -           buf[len] = '\0';
> > > -   }
> > > -   return buf;
> > > +   return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
> > >  }
> > >  EXPORT_SYMBOL(kstrndup);
> > >
> > > @@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
> > >   */
> > >  char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> > >  {
> > > -   char *buf;
> > > -
> > > -   if (!s)
> > > -           return NULL;
> > > -
> > > -   buf = kmalloc_track_caller(len + 1, gfp);
> > > -   if (buf) {
> > > -           memcpy(buf, s, len);
> > > -           buf[len] = '\0';
> > > -   }
> > > -   return buf;
> > > +   return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
> > >  }
> > >  EXPORT_SYMBOL(kmemdup_nul);
> >
> > I like the idea of the patch, but it's plagued with all those +1 and -1.
> > I think that's due to a bad choice of value being passed by.  If you
> > pass the actual length of the string (as suggested in my reply to the
> > previous patch) you should end up with a cleaner set of APIs.
> >
> > The only remaining +1 is for kmalloc_track_caller(), which I ignore what
> > it does.
> >
> >       char *
> >       __kmemdup_nul(const char *s, size_t len, gfp_t gfp)
> >       {
> >               char *buf;
> >
> >               buf = kmalloc_track_caller(len + 1, gfp);
> >               if (!buf)
> >                       return NULL;
> >
> >               strcpy(mempcpy(buf, s, len), "");
>
> Changing these strcpy(, "") to the usual; ='\0' or =0, but I'd still
> recommend the rest of the changes, that is, changing the value passed in
> len, to remove several +1 and -1s.
>
> What do you think?

I will update it. Thanks for your suggestion.

-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 0/8] Improve the copy of task comm
  2024-08-26  2:30 ` [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
@ 2024-08-28  1:19   ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2024-08-28  1:19 UTC (permalink / raw)
  To: Yafang Shao
  Cc: torvalds, alx, justinstitt, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Mon, 26 Aug 2024 10:30:54 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> Hello Andrew,
> 
> Could you please apply this series to the mm tree ?

Your comment in
https://lkml.kernel.org/r/CALOAHbA5VDjRYcoMOMKcLMVR0=ZwTz5FBTvQZExi6w8We9JPHg@mail.gmail.com
led me to believe that a v8 is to be sent?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
  2024-08-17  8:48   ` Alejandro Colomar
@ 2024-09-26 17:35   ` Andy Shevchenko
  2024-09-27  8:57     ` Yafang Shao
  2024-09-28 21:14   ` Kees Cook
  2 siblings, 1 reply; 27+ messages in thread
From: Andy Shevchenko @ 2024-09-26 17:35 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, alx, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Thu, Sep 26, 2024 at 7:44 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> In kstrdup(), it is critical to ensure that the dest string is always
> NUL-terminated. However, potential race condidtion can occur between a

condition

> writer and a reader.
>
> Consider the following scenario involving task->comm:
>
>     reader                    writer
>
>   len = strlen(s) + 1;
>                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>   memcpy(buf, s, len);
>
> In this case, there is a race condition between the reader and the
> writer. The reader calculate the length of the string `s` based on the

calculates

> old value of task->comm. However, during the memcpy(), the string `s`
> might be updated by the writer to a new value of task->comm.
>
> If the new task->comm is larger than the old one, the `buf` might not be
> NUL-terminated. This can lead to undefined behavior and potential
> security vulnerabilities.
>
> Let's fix it by explicitly adding a NUL-terminator.

memcpy() is not atomic AFAIK, meaning that the new string can be also
shorter and when memcpy() already copied past the new NUL. I would
amend the explanation to include this as well.

...

> +               /* During memcpy(), the string might be updated to a new value,
> +                * which could be longer than the string when strlen() is
> +                * called. Therefore, we need to add a null termimator.

/*
 * The wrong comment style. Besides that a typo
 * in the word 'terminator'. Please, run codespell on your changes.
 * Also use the same form: NUL-terminator when you are talking
 * about '\0' and not NULL.
 */

> +                */


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-09-26 17:35   ` Andy Shevchenko
@ 2024-09-27  8:57     ` Yafang Shao
  0 siblings, 0 replies; 27+ messages in thread
From: Yafang Shao @ 2024-09-27  8:57 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: akpm, torvalds, alx, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Fri, Sep 27, 2024 at 1:35 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Thu, Sep 26, 2024 at 7:44 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > In kstrdup(), it is critical to ensure that the dest string is always
> > NUL-terminated. However, potential race condidtion can occur between a
>
> condition
>
> > writer and a reader.
> >
> > Consider the following scenario involving task->comm:
> >
> >     reader                    writer
> >
> >   len = strlen(s) + 1;
> >                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> >   memcpy(buf, s, len);
> >
> > In this case, there is a race condition between the reader and the
> > writer. The reader calculate the length of the string `s` based on the
>
> calculates
>
> > old value of task->comm. However, during the memcpy(), the string `s`
> > might be updated by the writer to a new value of task->comm.
> >
> > If the new task->comm is larger than the old one, the `buf` might not be
> > NUL-terminated. This can lead to undefined behavior and potential
> > security vulnerabilities.
> >
> > Let's fix it by explicitly adding a NUL-terminator.
>
> memcpy() is not atomic AFAIK, meaning that the new string can be also
> shorter and when memcpy() already copied past the new NUL. I would
> amend the explanation to include this as well.
>
> ...
>
> > +               /* During memcpy(), the string might be updated to a new value,
> > +                * which could be longer than the string when strlen() is
> > +                * called. Therefore, we need to add a null termimator.
>
> /*
>  * The wrong comment style. Besides that a typo
>  * in the word 'terminator'. Please, run codespell on your changes.
>  * Also use the same form: NUL-terminator when you are talking
>  * about '\0' and not NULL.
>  */

Thank you for pointing out these errors and for recommending the use
of codespell.
Will fix them in the next version.

-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
  2024-08-17  8:48   ` Alejandro Colomar
  2024-09-26 17:35   ` Andy Shevchenko
@ 2024-09-28 21:14   ` Kees Cook
  2 siblings, 0 replies; 27+ messages in thread
From: Kees Cook @ 2024-09-28 21:14 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, alx, justinstitt, ebiederm, alexei.starovoitov,
	rostedt, catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel

On Sat, Aug 17, 2024 at 10:56:21AM +0800, Yafang Shao wrote:
> In kstrdup(), it is critical to ensure that the dest string is always
> NUL-terminated. However, potential race condidtion can occur between a
> writer and a reader.
> 
> Consider the following scenario involving task->comm:
> 
>     reader                    writer
> 
>   len = strlen(s) + 1;
>                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>   memcpy(buf, s, len);
> 
> In this case, there is a race condition between the reader and the
> writer. The reader calculate the length of the string `s` based on the
> old value of task->comm. However, during the memcpy(), the string `s`
> might be updated by the writer to a new value of task->comm.
> 
> If the new task->comm is larger than the old one, the `buf` might not be
> NUL-terminated. This can lead to undefined behavior and potential
> security vulnerabilities.
> 
> Let's fix it by explicitly adding a NUL-terminator.

So, I'm fine with adding this generally, but I'm not sure we can
construct these helpers to be universally safe against the strings
changing out from under them. This situation is distinct to the 'comm'
member, so I'd like to focus on helpers around 'comm' access behaving in
a reliable fashion.

-Kees

> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  mm/util.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 983baf2bd675..4542d8a800d9 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
>  
>  	len = strlen(s) + 1;
>  	buf = kmalloc_track_caller(len, gfp);
> -	if (buf)
> +	if (buf) {
>  		memcpy(buf, s, len);
> +		/* During memcpy(), the string might be updated to a new value,
> +		 * which could be longer than the string when strlen() is
> +		 * called. Therefore, we need to add a null termimator.
> +		 */
> +		buf[len - 1] = '\0';
> +	}
>  	return buf;
>  }
>  EXPORT_SYMBOL(kstrdup);
> -- 
> 2.43.5
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-08-17  8:48   ` Alejandro Colomar
  2024-08-17 16:26     ` Linus Torvalds
@ 2024-09-28 21:17     ` Kees Cook
  2024-09-29  7:58       ` Alejandro Colomar
  1 sibling, 1 reply; 27+ messages in thread
From: Kees Cook @ 2024-09-28 21:17 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Yafang Shao, akpm, torvalds, justinstitt, ebiederm,
	alexei.starovoitov, rostedt, catalin.marinas, penguin-kernel,
	linux-mm, linux-fsdevel, linux-trace-kernel, audit,
	linux-security-module, selinux, bpf, netdev, dri-devel

On Sat, Aug 17, 2024 at 10:48:15AM +0200, Alejandro Colomar wrote:
> Hi Yafang,
> 
> On Sat, Aug 17, 2024 at 10:56:21AM GMT, Yafang Shao wrote:
> > In kstrdup(), it is critical to ensure that the dest string is always
> > NUL-terminated. However, potential race condidtion can occur between a
> > writer and a reader.
> > 
> > Consider the following scenario involving task->comm:
> > 
> >     reader                    writer
> > 
> >   len = strlen(s) + 1;
> >                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> >   memcpy(buf, s, len);
> > 
> > In this case, there is a race condition between the reader and the
> > writer. The reader calculate the length of the string `s` based on the
> > old value of task->comm. However, during the memcpy(), the string `s`
> > might be updated by the writer to a new value of task->comm.
> > 
> > If the new task->comm is larger than the old one, the `buf` might not be
> > NUL-terminated. This can lead to undefined behavior and potential
> > security vulnerabilities.
> > 
> > Let's fix it by explicitly adding a NUL-terminator.
> > 
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > ---
> >  mm/util.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/util.c b/mm/util.c
> > index 983baf2bd675..4542d8a800d9 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
> >  
> >  	len = strlen(s) + 1;
> >  	buf = kmalloc_track_caller(len, gfp);
> > -	if (buf)
> > +	if (buf) {
> >  		memcpy(buf, s, len);
> > +		/* During memcpy(), the string might be updated to a new value,
> > +		 * which could be longer than the string when strlen() is
> > +		 * called. Therefore, we need to add a null termimator.
> > +		 */
> > +		buf[len - 1] = '\0';
> > +	}
> 
> I would compact the above to:
> 
> 	len = strlen(s);
> 	buf = kmalloc_track_caller(len + 1, gfp);
> 	if (buf)
> 		strcpy(mempcpy(buf, s, len), "");
> 
> It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses
> less screen.  It also has less moving parts.  (You'd need to write a
> mempcpy() for the kernel, but that's as easy as the following:)
> 
> 	#define mempcpy(d, s, n)  (memcpy(d, s, n) + n)
> 
> In shadow utils, I did a global replacement of all buf[...] = '\0'; by
> strcpy(..., "");.  It ends up being optimized by the compiler to the
> same code (at least in the experiments I did).

Just to repeat what's already been said: no, please, don't complicate
this with yet more wrappers. And I really don't want to add more str/mem
variants -- we're working really hard to _remove_ them. :P

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-09-28 21:17     ` Kees Cook
@ 2024-09-29  7:58       ` Alejandro Colomar
  2024-09-29  9:48         ` Alejandro Colomar
  0 siblings, 1 reply; 27+ messages in thread
From: Alejandro Colomar @ 2024-09-29  7:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Yafang Shao, akpm, torvalds, justinstitt, ebiederm,
	alexei.starovoitov, rostedt, catalin.marinas, penguin-kernel,
	linux-mm, linux-fsdevel, linux-trace-kernel, audit,
	linux-security-module, selinux, bpf, netdev, dri-devel,
	Andy Shevchenko, Gustavo A. R. Silva

[-- Attachment #1: Type: text/plain, Size: 3433 bytes --]

[CC += Andy, Gustavo]

On Sat, Sep 28, 2024 at 02:17:30PM GMT, Kees Cook wrote:
> > > diff --git a/mm/util.c b/mm/util.c
> > > index 983baf2bd675..4542d8a800d9 100644
> > > --- a/mm/util.c
> > > +++ b/mm/util.c
> > > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
> > >  
> > >  	len = strlen(s) + 1;
> > >  	buf = kmalloc_track_caller(len, gfp);
> > > -	if (buf)
> > > +	if (buf) {
> > >  		memcpy(buf, s, len);
> > > +		/* During memcpy(), the string might be updated to a new value,
> > > +		 * which could be longer than the string when strlen() is
> > > +		 * called. Therefore, we need to add a null termimator.
> > > +		 */
> > > +		buf[len - 1] = '\0';
> > > +	}
> > 
> > I would compact the above to:
> > 
> > 	len = strlen(s);
> > 	buf = kmalloc_track_caller(len + 1, gfp);
> > 	if (buf)
> > 		strcpy(mempcpy(buf, s, len), "");
> > 
> > It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses
> > less screen.  It also has less moving parts.  (You'd need to write a
> > mempcpy() for the kernel, but that's as easy as the following:)
> > 
> > 	#define mempcpy(d, s, n)  (memcpy(d, s, n) + n)
> > 
> > In shadow utils, I did a global replacement of all buf[...] = '\0'; by
> > strcpy(..., "");.  It ends up being optimized by the compiler to the
> > same code (at least in the experiments I did).
> 
> Just to repeat what's already been said: no, please, don't complicate
> this with yet more wrappers. And I really don't want to add more str/mem
> variants -- we're working really hard to _remove_ them. :P

Hi Kees,

I assume by "[no] more str/mem variants" you're referring to mempcpy(3).

mempcpy(3) is a libc function available in several systems (at least
glibc, musl, FreeBSD, and NetBSD).  It's not in POSIX nor in OpenBSD,
but it's relatively widely available.  Availability is probably
pointless to the kernel, but I mention it because it's not something
random I came up with, but rather something that several projects have
found useful.  I find it quite useful to copy the non-zero part of a
string.  See string_copying(7).
<https://www.man7.org/linux/man-pages/man7/string_copying.7.html>

Regarding "we're working really hard to remove them [mem/str wrappers]",
I think it's more like removing those that are prone to misuse, not just
blinly reducing the amount of wrappers.  Some of them are really useful.

I've done a randomized search of kernel code, and found several places
where mempcpy(3) would be useful for simplifying code:

./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(pwps_ie, pwps_ie_src, wps_ielen + 2);
./drivers/staging/rtl8723bs/core/rtw_ap.c-		pwps_ie += (wps_ielen+2);

equivalent to:

	pwps_ie = mempcpy(pwps_ie, pwps_ie_src, wps_ielen + 2);

./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(supportRate + supportRateNum, p + 2, ie_len);
./drivers/staging/rtl8723bs/core/rtw_ap.c-		supportRateNum += ie_len;

equivalent to:

	supportRateNum = mempcpy(supportRate + supportRateNum, p + 2, ie_len);

./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(dst_ie, &tim_bitmap_le, 2);
./drivers/staging/rtl8723bs/core/rtw_ap.c-		dst_ie += 2;

equivalent to:

	dst_ie = mempcpy(dst_ie, &tim_bitmap_le, 2);


And there are many cases like this.  Using mempcpy(3) would make this
pattern less repetitive.


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup()
  2024-09-29  7:58       ` Alejandro Colomar
@ 2024-09-29  9:48         ` Alejandro Colomar
  0 siblings, 0 replies; 27+ messages in thread
From: Alejandro Colomar @ 2024-09-29  9:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Yafang Shao, akpm, torvalds, justinstitt, ebiederm,
	alexei.starovoitov, rostedt, catalin.marinas, penguin-kernel,
	linux-mm, linux-fsdevel, linux-trace-kernel, audit,
	linux-security-module, selinux, bpf, netdev, dri-devel,
	Andy Shevchenko, Gustavo A. R. Silva

[-- Attachment #1: Type: text/plain, Size: 3807 bytes --]

On Sun, Sep 29, 2024 at 09:58:30AM GMT, Alejandro Colomar wrote:
> [CC += Andy, Gustavo]
> 
> On Sat, Sep 28, 2024 at 02:17:30PM GMT, Kees Cook wrote:
> > > > diff --git a/mm/util.c b/mm/util.c
> > > > index 983baf2bd675..4542d8a800d9 100644
> > > > --- a/mm/util.c
> > > > +++ b/mm/util.c
> > > > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
> > > >  
> > > >  	len = strlen(s) + 1;
> > > >  	buf = kmalloc_track_caller(len, gfp);
> > > > -	if (buf)
> > > > +	if (buf) {
> > > >  		memcpy(buf, s, len);
> > > > +		/* During memcpy(), the string might be updated to a new value,
> > > > +		 * which could be longer than the string when strlen() is
> > > > +		 * called. Therefore, we need to add a null termimator.
> > > > +		 */
> > > > +		buf[len - 1] = '\0';
> > > > +	}
> > > 
> > > I would compact the above to:
> > > 
> > > 	len = strlen(s);
> > > 	buf = kmalloc_track_caller(len + 1, gfp);
> > > 	if (buf)
> > > 		strcpy(mempcpy(buf, s, len), "");
> > > 
> > > It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses
> > > less screen.  It also has less moving parts.  (You'd need to write a
> > > mempcpy() for the kernel, but that's as easy as the following:)
> > > 
> > > 	#define mempcpy(d, s, n)  (memcpy(d, s, n) + n)
> > > 
> > > In shadow utils, I did a global replacement of all buf[...] = '\0'; by
> > > strcpy(..., "");.  It ends up being optimized by the compiler to the
> > > same code (at least in the experiments I did).
> > 
> > Just to repeat what's already been said: no, please, don't complicate
> > this with yet more wrappers. And I really don't want to add more str/mem
> > variants -- we're working really hard to _remove_ them. :P
> 
> Hi Kees,
> 
> I assume by "[no] more str/mem variants" you're referring to mempcpy(3).
> 
> mempcpy(3) is a libc function available in several systems (at least
> glibc, musl, FreeBSD, and NetBSD).  It's not in POSIX nor in OpenBSD,
> but it's relatively widely available.  Availability is probably
> pointless to the kernel, but I mention it because it's not something
> random I came up with, but rather something that several projects have
> found useful.  I find it quite useful to copy the non-zero part of a
> string.  See string_copying(7).
> <https://www.man7.org/linux/man-pages/man7/string_copying.7.html>
> 
> Regarding "we're working really hard to remove them [mem/str wrappers]",
> I think it's more like removing those that are prone to misuse, not just
> blinly reducing the amount of wrappers.  Some of them are really useful.
> 
> I've done a randomized search of kernel code, and found several places
> where mempcpy(3) would be useful for simplifying code:
> 
> ./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(pwps_ie, pwps_ie_src, wps_ielen + 2);
> ./drivers/staging/rtl8723bs/core/rtw_ap.c-		pwps_ie += (wps_ielen+2);
> 
> equivalent to:
> 
> 	pwps_ie = mempcpy(pwps_ie, pwps_ie_src, wps_ielen + 2);
> 
> ./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(supportRate + supportRateNum, p + 2, ie_len);
> ./drivers/staging/rtl8723bs/core/rtw_ap.c-		supportRateNum += ie_len;
> 
> equivalent to:
> 
> 	supportRateNum = mempcpy(supportRate + supportRateNum, p + 2, ie_len);

Oops, I misread the original in the above.  I didn't notice that the +=
is being done on the count, not the pointer.  The other equivalences are
good, though.

> 
> ./drivers/staging/rtl8723bs/core/rtw_ap.c:		memcpy(dst_ie, &tim_bitmap_le, 2);
> ./drivers/staging/rtl8723bs/core/rtw_ap.c-		dst_ie += 2;
> 
> equivalent to:
> 
> 	dst_ie = mempcpy(dst_ie, &tim_bitmap_le, 2);
> 
> 
> And there are many cases like this.  Using mempcpy(3) would make this
> pattern less repetitive.

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-09-29  9:48 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-17  2:56 [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
2024-08-17  2:56 ` [PATCH v7 1/8] Get rid of __get_task_comm() Yafang Shao
2024-08-17  2:56 ` [PATCH v7 2/8] auditsc: Replace memcpy() with strscpy() Yafang Shao
2024-08-17  2:56 ` [PATCH v7 3/8] security: Replace memcpy() with get_task_comm() Yafang Shao
2024-08-17  2:56 ` [PATCH v7 4/8] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
2024-08-17  8:38   ` Alejandro Colomar
2024-08-18  2:27     ` Yafang Shao
2024-08-18  8:25       ` Alejandro Colomar
2024-08-17  2:56 ` [PATCH v7 5/8] mm/util: Fix possible race condition in kstrdup() Yafang Shao
2024-08-17  8:48   ` Alejandro Colomar
2024-08-17 16:26     ` Linus Torvalds
2024-08-17 17:03       ` Alejandro Colomar
2024-09-28 21:17     ` Kees Cook
2024-09-29  7:58       ` Alejandro Colomar
2024-09-29  9:48         ` Alejandro Colomar
2024-09-26 17:35   ` Andy Shevchenko
2024-09-27  8:57     ` Yafang Shao
2024-09-28 21:14   ` Kees Cook
2024-08-17  2:56 ` [PATCH v7 6/8] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
2024-08-17  8:57   ` Alejandro Colomar
2024-08-17  9:05     ` Alejandro Colomar
2024-08-26  9:20     ` Alejandro Colomar
2024-08-26 13:13       ` Yafang Shao
2024-08-17  2:56 ` [PATCH v7 7/8] net: Replace strcpy() with strscpy() Yafang Shao
2024-08-17  2:56 ` [PATCH v7 8/8] drm: " Yafang Shao
2024-08-26  2:30 ` [PATCH v7 0/8] Improve the copy of task comm Yafang Shao
2024-08-28  1:19   ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).