Linux filesystem development
* [PATCH v4] coredump: Add /proc/<pid>/coredump_pre_exit for pre-exit before dumping
@ 2026-06-24 14:55 Xin Zhao
  2026-06-24 16:28 ` Al Viro
  0 siblings, 1 reply; 2+ messages in thread
From: Xin Zhao @ 2026-06-24 14:55 UTC
  To: brauner, mjguzik, pfalcato, ebiederm, viro, jack, jlayton,
	chuck.lever, alex.aring, arnd, keescook, mcgrof, j.granados,
	allen.lkml
  Cc: linux-fsdevel, linux-kernel, linux-arch, Xin Zhao

A coredump can take a long time to complete. If the crashing process holds
an exclusive (write) flock lock when the coredump is triggered, that lock
is not released until the dump finishes. As a result, other processes
attempting to acquire the same lock may experience significant delays.
Another typical scenario is shared memory, such as a dma-buf, which stays
in use and is not released for the whole duration of the core dump.

To address this, add a /proc/<pid>/coredump_pre_exit node that lets users
select which resources are released before the core is dumped. This patch
implements early release for two resource types: flock'ed files and
file-backed shared memory. By default, nothing is released early.
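
As a usage sketch (an illustration, not part of the patch): a process could
opt in by writing a bitmask to its own node. The bit positions follow the
proc.rst hunk of this patch (bit 0: flock files, bit 1: file-backed shared
memory). The macro names PRE_EXIT_FLOCK and PRE_EXIT_FILE_SHM and the helper
write_pre_exit_mask() are hypothetical, and the node only exists on a kernel
carrying this patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names; bit positions follow the proc.rst text in this patch. */
#define PRE_EXIT_FLOCK		(1u << 0)	/* release flock'ed files early */
#define PRE_EXIT_FILE_SHM	(1u << 1)	/* release file-backed shared memory early */

/*
 * Write a pre-exit mask to the given node, for example
 * "/proc/self/coredump_pre_exit" on a kernel carrying this patch.
 * Returns 0 on success, -1 on failure.
 */
int write_pre_exit_mask(const char *path, unsigned int mask)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	/* The proposed write handler parses with base 0, so "0x" is accepted. */
	ok = fprintf(f, "0x%x\n", mask) > 0;

	/* stdio buffers the write; a rejected value may only show up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would typically do this once at startup, for example
write_pre_exit_mask("/proc/self/coredump_pre_exit", PRE_EXIT_FLOCK), and
treat a failure as "feature not available" rather than as a fatal error.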

A temporary flag, O_TMPCLOS, is set in vma->vm_file->f_flags by the newly
introduced exit_mmap_mapped_shared() function. The subsequent
exit_files_pre_exit() call can then identify the files to close from that
flag alone, without looking up the vma for each file to check VM_SHARED,
which reduces the traversal cost.

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
---

Changes in v4:
- Christian pointed out that the coredump process traverses file
  descriptors (fds), so certain fds must not be closed by default.
  Rework the whole feature: add /proc/<pid>/coredump_pre_exit so users can
  select which resources to release early; by default nothing is released.
- Mateusz suggested that walking the fd table and releasing the file locks
  is reasonable. No longer release all fds. Depending on the user
  configuration, at most the flock'ed fds and the fds corresponding to
  file-backed shared memory are released.

Changes in v3:
- Add a comment and commit-log text explaining the MMF_DUMP_MAPPED_SHARED
  mm_flags_test() check. Note that memory-mapped files keep their own
  separate references to the files. The case to work around is that
  unlocking a flock on a file early allows other processes to lock and
  modify the mapped data protected by the flock,
  as suggested by Pedro Falcato.
- Link to v3: https://lore.kernel.org/all/20260619122419.3954581-1-jackzxcui1989@163.com/

Changes in v2:
- Drop the new fcntl API; the issue is not worth inflicting the cost on
  everyone,
  as suggested by Al Viro.
- Call exit_files() in coredump_wait(),
  as suggested by Eric W. Biederman.
  Add an MMF_DUMP_MAPPED_SHARED mm_flags_test() check to filter out cases
  that need to dump file-backed shared memory.
- Link to v2: https://lore.kernel.org/lkml/20260618150301.3226517-1-jackzxcui1989@163.com/

v1:
- Link to v1: https://lore.kernel.org/all/20260618030700.2511668-1-jackzxcui1989@163.com/
---
 .../admin-guide/kernel-parameters.txt         |  5 ++
 Documentation/filesystems/proc.rst            | 58 +++++++++-----
 fs/coredump.c                                 | 23 ++++++
 fs/file.c                                     | 46 +++++++++++
 fs/proc/base.c                                | 78 +++++++++++++++++++
 include/linux/mm.h                            |  1 +
 include/linux/mm_types.h                      |  9 +++
 include/linux/sched/task.h                    |  1 +
 include/uapi/asm-generic/fcntl.h              |  4 +
 kernel/fork.c                                 | 12 +++
 mm/mmap.c                                     | 21 +++++
 11 files changed, 238 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f575d4508..bc6d3859f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1024,6 +1024,11 @@ Kernel parameters
 			/proc/<pid>/coredump_filter.
 			See also Documentation/filesystems/proc.rst.
 
+	coredump_pre_exit=
+			[KNL] Change the default value for
+			/proc/<pid>/coredump_pre_exit.
+			See also Documentation/filesystems/proc.rst.
+
 	coresight_cpu_debug.enable
 			[ARM,ARM64]
 			Format: <bool>
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index db6167bef..6a637d31d 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -39,16 +39,17 @@ fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>    June 9 2009
   3.2	/proc/<pid>/oom_score - Display current oom-killer score
   3.3	/proc/<pid>/io - Display the IO accounting fields
   3.4	/proc/<pid>/coredump_filter - Core dump filtering settings
-  3.5	/proc/<pid>/mountinfo - Information about mounts
-  3.6	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
-  3.7   /proc/<pid>/task/<tid>/children - Information about task children
-  3.8   /proc/<pid>/fdinfo/<fd> - Information about opened file
-  3.9   /proc/<pid>/map_files - Information about memory mapped files
-  3.10  /proc/<pid>/timerslack_ns - Task timerslack value
-  3.11	/proc/<pid>/patch_state - Livepatch patch operation state
-  3.12	/proc/<pid>/arch_status - Task architecture specific information
-  3.13  /proc/<pid>/fd - List of symlinks to open files
-  3.14  /proc/<pid>/ksm_stat - Information about the process's ksm status.
+  3.5  /proc/<pid>/coredump_pre_exit - Core dump pre-exit settings
+  3.6	/proc/<pid>/mountinfo - Information about mounts
+  3.7	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
+  3.8   /proc/<pid>/task/<tid>/children - Information about task children
+  3.9   /proc/<pid>/fdinfo/<fd> - Information about opened file
+  3.10   /proc/<pid>/map_files - Information about memory mapped files
+  3.11  /proc/<pid>/timerslack_ns - Task timerslack value
+  3.12	/proc/<pid>/patch_state - Livepatch patch operation state
+  3.13	/proc/<pid>/arch_status - Task architecture specific information
+  3.14  /proc/<pid>/fd - List of symlinks to open files
+  3.15  /proc/<pid>/ksm_stat - Information about the process's ksm status.
 
   4	Configuring procfs
   4.1	Mount options
@@ -1961,7 +1962,24 @@ For example::
   $ echo 0x7 > /proc/self/coredump_filter
   $ ./some_program
 
-3.5	/proc/<pid>/mountinfo - Information about mounts
+3.5 /proc/<pid>/coredump_pre_exit - Core dump pre-exit settings
+---------------------------------------------------------------
+A coredump typically takes some time to complete. If we happen to hold a write
+lock with flock just before triggering the coredump, that write lock will not
+be released during the entire coredump process. As a result, other processes
+attempting to acquire the same write lock may experience significant delays.
+Another typical scenario is that shared memory, such as dma-buf, remains
+occupied and is not released for a long time due to core dumps.
+
+/proc/<pid>/coredump_pre_exit allows you to pre-exit some resources before
+dumping core.
+
+The following two types are supported:
+
+  - (bit 0) flock files
+  - (bit 1) file-backed shared memory
+
+3.6	/proc/<pid>/mountinfo - Information about mounts
 --------------------------------------------------------
 
 This file contains lines of the form::
@@ -2001,7 +2019,7 @@ For more information on mount propagation see:
   Documentation/filesystems/sharedsubtree.rst
 
 
-3.6	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
+3.7	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
 --------------------------------------------------------
 These files provide a method to access a task's comm value. It also allows for
 a task to set its own or one of its thread siblings comm value. The comm value
@@ -2010,7 +2028,7 @@ then the kernel's TASK_COMM_LEN (currently 16 chars, including the NUL
 terminator) will result in a truncated comm value.
 
 
-3.7	/proc/<pid>/task/<tid>/children - Information about task children
+3.8	/proc/<pid>/task/<tid>/children - Information about task children
 -------------------------------------------------------------------------
 This file provides a fast way to retrieve first level children pids
 of a task pointed by <pid>/<tid> pair. The format is a space separated
@@ -2027,7 +2045,7 @@ pids, so one needs to either stop or freeze processes being inspected
 if precise results are needed.
 
 
-3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
+3.9	/proc/<pid>/fdinfo/<fd> - Information about opened file
 ---------------------------------------------------------------
 This file provides information associated with an opened file. The regular
 files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
@@ -2198,7 +2216,7 @@ VFIO Device files
 where 'vfio-device-syspath' is the sysfs path corresponding to the VFIO device
 file.
 
-3.9	/proc/<pid>/map_files - Information about memory mapped files
+3.10	/proc/<pid>/map_files - Information about memory mapped files
 ---------------------------------------------------------------------
 This directory contains symbolic links which represent memory mapped files
 the process is maintaining.  Example output::
@@ -2220,7 +2238,7 @@ time one can open(2) mappings from the listings of two processes and
 comparing their inode numbers to figure out which anonymous memory areas
 are actually shared.
 
-3.10	/proc/<pid>/timerslack_ns - Task timerslack value
+3.11	/proc/<pid>/timerslack_ns - Task timerslack value
 ---------------------------------------------------------
 This file provides the value of the task's timerslack value in nanoseconds.
 This value specifies an amount of time that normal timers may be deferred
@@ -2236,7 +2254,7 @@ Valid values are from 0 - ULLONG_MAX
 An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
 permissions on the task specified to change its timerslack_ns value.
 
-3.11	/proc/<pid>/patch_state - Livepatch patch operation state
+3.12	/proc/<pid>/patch_state - Livepatch patch operation state
 -----------------------------------------------------------------
 When CONFIG_LIVEPATCH is enabled, this file displays the value of the
 patch state for the task.
@@ -2253,7 +2271,7 @@ patched.  If the patch is being enabled, then the task has already been
 patched.  If the patch is being disabled, then the task hasn't been
 unpatched yet.
 
-3.12 /proc/<pid>/arch_status - task architecture specific status
+3.13 /proc/<pid>/arch_status - task architecture specific status
 -------------------------------------------------------------------
 When CONFIG_PROC_PID_ARCH_STATUS is enabled, this file displays the
 architecture specific status of the task.
@@ -2298,7 +2316,7 @@ AVX512_elapsed_ms
   the task is unlikely an AVX512 user, but depends on the workload and the
   scheduling scenario, it also could be a false negative mentioned above.
 
-3.13 /proc/<pid>/fd - List of symlinks to open files
+3.14 /proc/<pid>/fd - List of symlinks to open files
 -------------------------------------------------------
 This directory contains symbolic links which represent open files
 the process is maintaining.  Example output::
@@ -2313,7 +2331,7 @@ The number of open files for the process is stored in 'size' member
 of stat() output for /proc/<pid>/fd for fast access.
 -------------------------------------------------------
 
-3.14 /proc/<pid>/ksm_stat - Information about the process's ksm status
+3.15 /proc/<pid>/ksm_stat - Information about the process's ksm status
 ----------------------------------------------------------------------
 When CONFIG_KSM is enabled, each process has this file which displays
 the information of ksm merging status.
diff --git a/fs/coredump.c b/fs/coredump.c
index bb6fdb1f4..e08a8a6c4 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -521,6 +521,27 @@ static int zap_threads(struct task_struct *tsk,
 	return nr;
 }
 
+static void coredump_pre_exit(void)
+{
+	struct task_struct *tsk = current;
+	unsigned long flags = __mm_flags_get_dumpable(tsk->mm);
+
+	if (!likely(flags & MMF_DUMP_PRE_EXIT_MASK))
+		return;
+
+	/*
+	 * Set O_TMPCLOS of file f_flags if file needs to be closed.
+	 */
+	if (test_bit(MMF_DUMP_PRE_EXIT_FILE_BACKED_SHARED, &flags) &&
+	    !test_bit(MMF_DUMP_MAPPED_SHARED, &flags))
+		exit_mmap_mapped_shared(tsk->mm);
+
+	/*
+	 * Check O_TMPCLOS of file f_flags to close file and clear it.
+	 */
+	exit_files_pre_exit(tsk, mm_flags_test(MMF_DUMP_PRE_EXIT_FLOCK, tsk->mm));
+}
+
 static int coredump_wait(int exit_code, struct core_state *core_state)
 {
 	struct task_struct *tsk = current;
@@ -1100,6 +1121,8 @@ static void do_coredump(struct core_name *cn, struct coredump_params *cprm,
 		return;
 	}
 
+	coredump_pre_exit();
+
 	switch (cn->core_type) {
 	case COREDUMP_FILE:
 		if (!coredump_file(cn, cprm, binfmt))
diff --git a/fs/file.c b/fs/file.c
index 2c81c0b16..a58ffffcc 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -23,6 +23,7 @@
 #include <linux/file_ref.h>
 #include <net/sock.h>
 #include <linux/init_task.h>
+#include <linux/filelock.h>
 
 #include "internal.h"
 
@@ -527,6 +528,51 @@ void exit_files(struct task_struct *tsk)
 	}
 }
 
+void exit_files_pre_exit(struct task_struct *tsk, bool checkflock)
+{
+	struct files_struct *files = tsk->files;
+	struct fdtable *fdt;
+	struct file *file;
+	unsigned int i, j = 0;
+
+	if (!files)
+		return;
+
+	fdt = rcu_dereference_raw(files->fdt);
+	for (;;) {
+		unsigned long set;
+
+		i = j * BITS_PER_LONG;
+		if (i >= fdt->max_fds)
+			break;
+		set = fdt->open_fds[j++];
+		while (set) {
+			if (!(set & 1))
+				goto next_fd;
+			file = fdt->fd[i];
+			if (!file)
+				goto next_fd;
+			if (file->f_flags & O_TMPCLOS) {
+				file->f_flags &= ~O_TMPCLOS;
+				goto close_fd;
+			}
+			if (!checkflock)
+				goto next_fd;
+			if (!vfs_inode_has_locks(file_inode(file)))
+				goto next_fd;
+
+close_fd:
+			fdt->fd[i] = NULL;
+			filp_close(file, files);
+			cond_resched();
+
+next_fd:
+			i++;
+			set >>= 1;
+		}
+	}
+}
+
 struct files_struct init_files = {
 	.count		= ATOMIC_INIT(1),
 	.fdt		= &init_files.fdtab,
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d9acfa89c..99b5f219f 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3026,6 +3026,83 @@ static const struct file_operations proc_coredump_filter_operations = {
 	.write		= proc_coredump_filter_write,
 	.llseek		= generic_file_llseek,
 };
+
+static ssize_t proc_coredump_pre_exit_read(struct file *file, char __user *buf,
+					   size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file_inode(file));
+	struct mm_struct *mm;
+	char buffer[PROC_NUMBUF];
+	size_t len;
+	int ret;
+
+	if (!task)
+		return -ESRCH;
+
+	ret = 0;
+	mm = get_task_mm(task);
+	if (mm) {
+		unsigned long flags = __mm_flags_get_dumpable(mm);
+
+		len = snprintf(buffer, sizeof(buffer), "%08lx\n",
+			       ((flags & MMF_DUMP_PRE_EXIT_MASK) >>
+				MMF_DUMP_PRE_EXIT_SHIFT));
+		mmput(mm);
+		ret = simple_read_from_buffer(buf, count, ppos, buffer, len);
+	}
+
+	put_task_struct(task);
+
+	return ret;
+}
+
+static ssize_t proc_coredump_pre_exit_write(struct file *file,
+					    const char __user *buf,
+					    size_t count,
+					    loff_t *ppos)
+{
+	struct task_struct *task;
+	struct mm_struct *mm;
+	unsigned int val;
+	int ret;
+	int i;
+	unsigned long mask;
+
+	ret = kstrtouint_from_user(buf, count, 0, &val);
+	if (ret < 0)
+		return ret;
+
+	ret = -ESRCH;
+	task = get_proc_task(file_inode(file));
+	if (!task)
+		goto out_no_task;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_no_mm;
+	ret = 0;
+
+	for (i = 0, mask = 1; i < MMF_DUMP_PRE_EXIT_BITS; i++, mask <<= 1) {
+		if (val & mask)
+			mm_flags_set(i + MMF_DUMP_PRE_EXIT_SHIFT, mm);
+		else
+			mm_flags_clear(i + MMF_DUMP_PRE_EXIT_SHIFT, mm);
+	}
+
+	mmput(mm);
+ out_no_mm:
+	put_task_struct(task);
+ out_no_task:
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static const struct file_operations proc_coredump_pre_exit_operations = {
+	.read		= proc_coredump_pre_exit_read,
+	.write		= proc_coredump_pre_exit_write,
+	.llseek		= generic_file_llseek,
+};
 #endif
 
 #ifdef CONFIG_TASK_IO_ACCOUNTING
@@ -3391,6 +3468,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 #endif
 #ifdef CONFIG_ELF_CORE
 	REG("coredump_filter", S_IRUGO|S_IWUSR, proc_coredump_filter_operations),
+	REG("coredump_pre_exit", S_IRUGO|S_IWUSR, proc_coredump_pre_exit_operations),
 #endif
 #ifdef CONFIG_TASK_IO_ACCOUNTING
 	ONE("io",	S_IRUSR, proc_tgid_io_accounting),
diff --git a/include/linux/mm.h b/include/linux/mm.h
index af23453e9..dfd4717c7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4066,6 +4066,7 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 extern int __vm_enough_memory(const struct mm_struct *mm, long pages, int cap_sys_admin);
 extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *);
 extern void exit_mmap(struct mm_struct *);
+extern void exit_mmap_mapped_shared(struct mm_struct *mm);
 bool mmap_read_lock_maybe_expand(struct mm_struct *mm, struct vm_area_struct *vma,
 				 unsigned long addr, bool write);
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c7db35be6..0555aaf50 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1963,6 +1963,15 @@ enum {
 	(BIT(MMF_DUMP_ANON_PRIVATE) | BIT(MMF_DUMP_ANON_SHARED) | \
 	 BIT(MMF_DUMP_HUGETLB_PRIVATE) | MMF_DUMP_MASK_DEFAULT_ELF)
 
+/* coredump pre-exit bits */
+#define MMF_DUMP_PRE_EXIT_FLOCK	11
+#define MMF_DUMP_PRE_EXIT_FILE_BACKED_SHARED 12
+
+#define MMF_DUMP_PRE_EXIT_SHIFT	(MMF_DUMPABLE_BITS + MMF_DUMP_FILTER_BITS)
+#define MMF_DUMP_PRE_EXIT_BITS	2
+#define MMF_DUMP_PRE_EXIT_MASK	\
+	(((1 << MMF_DUMP_PRE_EXIT_BITS) - 1) << MMF_DUMP_PRE_EXIT_SHIFT)
+
 #ifdef CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS
 # define MMF_DUMP_MASK_DEFAULT_ELF	BIT(MMF_DUMP_ELF_HEADERS)
 #else
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 41ed884cf..b4becbf6c 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -93,6 +93,7 @@ static inline void exit_thread(struct task_struct *tsk)
 extern __noreturn void do_group_exit(int);
 
 extern void exit_files(struct task_struct *);
+extern void exit_files_pre_exit(struct task_struct *, bool);
 extern void exit_itimers(struct task_struct *);
 
 extern pid_t kernel_clone(struct kernel_clone_args *kargs);
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 613475285..360604d65 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -95,6 +95,10 @@
 #define O_NDELAY	O_NONBLOCK
 #endif
 
+#ifndef O_TMPCLOS
+#define O_TMPCLOS	0x80000000	/* tag need close, temporarily used */
+#endif
+
 #define F_DUPFD		0	/* dup */
 #define F_GETFD		1	/* get close_on_exec */
 #define F_SETFD		2	/* set/clear close_on_exec */
diff --git a/kernel/fork.c b/kernel/fork.c
index a679b2448..84f1ee7f3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1030,6 +1030,18 @@ static int __init coredump_filter_setup(char *s)
 
 __setup("coredump_filter=", coredump_filter_setup);
 
+static unsigned long default_dump_pre_exit;
+
+static int __init coredump_pre_exit_setup(char *s)
+{
+	default_dump_pre_exit =
+		(simple_strtoul(s, NULL, 0) << MMF_DUMP_PRE_EXIT_SHIFT) &
+		MMF_DUMP_PRE_EXIT_MASK;
+	return 1;
+}
+
+__setup("coredump_pre_exit=", coredump_pre_exit_setup);
+
 #include <linux/init_task.h>
 
 static void mm_init_aio(struct mm_struct *mm)
diff --git a/mm/mmap.c b/mm/mmap.c
index 5754d1c36..b955c47c0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1326,6 +1326,27 @@ void exit_mmap(struct mm_struct *mm)
 	vm_unacct_memory(nr_accounted);
 }
 
+void exit_mmap_mapped_shared(struct mm_struct *mm)
+{
+	struct vm_area_struct *vma;
+	VMA_ITERATOR(vmi, mm, 0);
+
+	mmap_write_lock(mm);
+	lru_add_drain();
+
+	for_each_vma(vmi, vma) {
+		if (vma->vm_flags & VM_HUGETLB)
+			continue;
+		if (!(vma->vm_flags & VM_SHARED) || !file_inode(vma->vm_file)->i_nlink)
+			continue;
+		vma->vm_file->f_flags |= O_TMPCLOS;
+		do_munmap(mm, vma->vm_start, vma->vm_end - vma->vm_start, NULL);
+		cond_resched();
+	}
+
+	mmap_write_unlock(mm);
+}
+
 /*
  * Return true if the calling process may expand its vm space by the passed
  * number of pages
-- 
2.34.1



* Re: [PATCH v4] coredump: Add /proc/<pid>/coredump_pre_exit for pre-exit before dumping
  2026-06-24 14:55 [PATCH v4] coredump: Add /proc/<pid>/coredump_pre_exit for pre-exit before dumping Xin Zhao
@ 2026-06-24 16:28 ` Al Viro
  0 siblings, 0 replies; 2+ messages in thread
From: Al Viro @ 2026-06-24 16:28 UTC
  To: Xin Zhao
  Cc: brauner, mjguzik, pfalcato, ebiederm, jack, jlayton, chuck.lever,
	alex.aring, arnd, keescook, mcgrof, j.granados, allen.lkml,
	linux-fsdevel, linux-kernel, linux-arch

On Wed, Jun 24, 2026 at 10:55:52PM +0800, Xin Zhao wrote:
> +void exit_files_pre_exit(struct task_struct *tsk, bool checkflock)
> +{
> +	struct files_struct *files = tsk->files;
> +	struct fdtable *fdt;
> +	struct file *file;
> +	unsigned int i, j = 0;
> +
> +	if (!files)
> +		return;
> +
> +	fdt = rcu_dereference_raw(files->fdt);
> +	for (;;) {
> +		unsigned long set;
> +
> +		i = j * BITS_PER_LONG;
> +		if (i >= fdt->max_fds)
> +			break;
> +		set = fdt->open_fds[j++];
> +		while (set) {
> +			if (!(set & 1))
> +				goto next_fd;
> +			file = fdt->fd[i];
> +			if (!file)
> +				goto next_fd;
> +			if (file->f_flags & O_TMPCLOS) {
> +				file->f_flags &= ~O_TMPCLOS;
> +				goto close_fd;
> +			}

*blink*

	How could that possibly make sense?  Many descriptors
may refer to the same file; what's more, many descriptor tables
may contain such descriptors, so... just what is that code
trying to do?
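
To make the objection concrete, here is a small userspace sketch (an
illustration added for readers, not code from the thread). It relies only on
documented dup(2), fcntl(2) and flock(2) behaviour: descriptors created with
dup() share one open file description, so status flags set through one are
visible through the other, and a flock lock stays held until every
descriptor referring to that description has been closed. The function names
and temp-file templates are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* mkstemp() and flock() under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Returns 1 if a status flag set through one descriptor is visible through
 * its dup(2) twin, 0 if not, -1 on setup failure.
 */
int status_flags_shared_across_dup(void)
{
	char path[] = "/tmp/dupdemoXXXXXX";	/* hypothetical template */
	int fd1 = mkstemp(path);
	int fd2, fl, shared = -1;

	if (fd1 < 0)
		return -1;
	unlink(path);

	fd2 = dup(fd1);
	if (fd2 < 0) {
		close(fd1);
		return -1;
	}
	assert(fd1 != fd2);	/* two descriptors, one open file description */

	fl = fcntl(fd1, F_GETFL);
	if (fl >= 0 && fcntl(fd1, F_SETFL, fl | O_APPEND) == 0) {
		fl = fcntl(fd2, F_GETFL);
		if (fl >= 0)
			shared = (fl & O_APPEND) != 0;
	}

	close(fd1);
	close(fd2);
	return shared;
}

/*
 * Returns 1 if an exclusive flock() taken via one descriptor is still held
 * after that descriptor is closed while a dup(2) of it stays open,
 * 0 if the lock was released, -1 on setup failure.
 */
int flock_survives_closing_one_dup(void)
{
	char path[] = "/tmp/flockdemoXXXXXX";	/* hypothetical template */
	int fd1 = mkstemp(path);
	int fd2, other, held = -1;

	if (fd1 < 0)
		return -1;
	fd2 = dup(fd1);
	other = open(path, O_RDWR);	/* a separate open file description */
	unlink(path);

	if (fd2 >= 0 && other >= 0 && flock(fd1, LOCK_EX) == 0) {
		close(fd1);	/* drop only one of the two references */
		fd1 = -1;
		/* The lock belongs to the description fd2 still refers to. */
		if (flock(other, LOCK_EX | LOCK_NB) == 0)
			held = 0;
		else if (errno == EWOULDBLOCK)
			held = 1;
	}

	if (fd1 >= 0)
		close(fd1);
	if (fd2 >= 0)
		close(fd2);
	if (other >= 0)
		close(other);
	return held;
}
```

Both helpers are expected to return 1 on a local Linux filesystem. The same
sharing applies across fork(2), where the child's descriptor table refers to
the parent's open file descriptions, which is the "many descriptor tables"
case raised above.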
