From: Brian Foster <bfoster@redhat.com>
To: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: ebiederm@xmission.com, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, yukuai3@huawei.com,
yi.zhang@huawei.com
Subject: Re: [PATCH v4] proc: Fix a dentry lock race between release_task and lookup
Date: Wed, 13 Jul 2022 08:41:11 -0400 [thread overview]
Message-ID: <Ys69Z4DlGhHvMDwK@bfoster> (raw)
In-Reply-To: <f64637ac-9c9e-06c4-bbea-4af5c24878bf@huawei.com>
On Wed, Jul 13, 2022 at 03:24:50PM +0800, Zhihao Cheng wrote:
> 在 2022/7/12 22:16, Brian Foster 写道:
> > On Wed, Jun 01, 2022 at 02:23:32PM +0800, Zhihao Cheng wrote:
> > > Commit 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc")
> > > moved proc_flush_task() behind __exit_signal(). Then, process systemd
> > > can take long period high cpu usage during releasing task in following
> > > concurrent processes:
> > >
> > > systemd ps
> > > kernel_waitid stat(/proc/tgid)
> > > do_wait filename_lookup
> > > wait_consider_task lookup_fast
> > > release_task
> > > __exit_signal
> > > __unhash_process
> > > detach_pid
> > > __change_pid // remove task->pid_links
> > > d_revalidate -> pid_revalidate // 0
> > > d_invalidate(/proc/tgid)
> > > shrink_dcache_parent(/proc/tgid)
> > > d_walk(/proc/tgid)
> > > spin_lock_nested(/proc/tgid/fd)
> > > // iterating opened fd
> > > proc_flush_pid |
> > > d_invalidate (/proc/tgid/fd) |
> > > shrink_dcache_parent(/proc/tgid/fd) |
> > > shrink_dentry_list(subdirs) ↓
> > > shrink_lock_dentry(/proc/tgid/fd) --> race on dentry lock
> > >
> >
> > Curious... can this same sort of thing happen with /proc/<tgid>/task if
> > that dir similarly has a lot of dentries?
> >
>
> Yes. It could happend too. There will be many dentries under
> /proc/<tgid>/task when there are many tasks under same thread group.
>
> We must put /proc/<tgid>/task into pid->inodes, because we have to handle
> single thread exiting situation: Any one of threads should invalidate its
> /proc/<tgid>/task/<pid> dentry before begin released. You may refer to the
> function proc_flush_task_mnt() before commit 7bc3e6e55acf06 ("proc: Use a
> list of inodes to flush from proc").
>
Ah, I see. So historically when the (thread) task goes away, we look up
the tgid and then the associated /proc/<tgid>/task/<pid> dentry to zap
it. Thanks for the pointer..
Brian
> > ...
> > > Fixes: 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc")
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216054
> > > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> > > Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> > > ---
> > > v1->v2: Add new helper proc_pid_make_base_inode that performs the extra
> > > work of adding to the pid->list.
> > > v2->v3: Add performance regression in commit message.
> > > v3->v4: Make proc_pid_make_base_inode() static
> > > fs/proc/base.c | 34 ++++++++++++++++++++++++++--------
> > > 1 file changed, 26 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > > index c1031843cc6a..d884933950fd 100644
> > > --- a/fs/proc/base.c
> > > +++ b/fs/proc/base.c
> > ...
> > > @@ -1931,6 +1926,27 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
> > > return NULL;
> > > }
> > > +static struct inode *proc_pid_make_base_inode(struct super_block *sb,
> > > + struct task_struct *task, umode_t mode)
> > > +{
> > > + struct inode *inode;
> > > + struct proc_inode *ei;
> > > + struct pid *pid;
> > > +
> > > + inode = proc_pid_make_inode(sb, task, mode);
> > > + if (!inode)
> > > + return NULL;
> > > +
> > > + /* Let proc_flush_pid find this directory inode */
> > > + ei = PROC_I(inode);
> > > + pid = ei->pid;
> > > + spin_lock(&pid->lock);
> > > + hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
> > > + spin_unlock(&pid->lock);
> > > +
> > > + return inode;
> > > +}
> > > +
> >
> > Somewhat related to the question above.. it would be nice if this
> > wrapper had a line or two comment above it that explained when it should
> > or shouldn't be used over the underlying function (for example, why or
> > why not include /proc/<tgid>/task?). Otherwise the patch overall seems
> > reasonable to me..
> >
>
> Thanks for advice, I will add some notes in v5.
> > Brian
> >
> > > int pid_getattr(struct user_namespace *mnt_userns, const struct path *path,
> > > struct kstat *stat, u32 request_mask, unsigned int query_flags)
> > > {
> > > @@ -3350,7 +3366,8 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
> > > {
> > > struct inode *inode;
> > > - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> > > + inode = proc_pid_make_base_inode(dentry->d_sb, task,
> > > + S_IFDIR | S_IRUGO | S_IXUGO);
> > > if (!inode)
> > > return ERR_PTR(-ENOENT);
> > > @@ -3649,7 +3666,8 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry,
> > > struct task_struct *task, const void *ptr)
> > > {
> > > struct inode *inode;
> > > - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> > > + inode = proc_pid_make_base_inode(dentry->d_sb, task,
> > > + S_IFDIR | S_IRUGO | S_IXUGO);
> > > if (!inode)
> > > return ERR_PTR(-ENOENT);
> > > --
> > > 2.31.1
> > >
> >
> > .
> >
>
prev parent reply other threads:[~2022-07-13 12:41 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-01 6:23 [PATCH v4] proc: Fix a dentry lock race between release_task and lookup Zhihao Cheng
2022-06-10 8:09 ` Zhihao Cheng
2022-07-12 3:06 ` Zhihao Cheng
2022-07-12 14:16 ` Brian Foster
2022-07-13 7:24 ` Zhihao Cheng
2022-07-13 12:41 ` Brian Foster [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ys69Z4DlGhHvMDwK@bfoster \
--to=bfoster@redhat.com \
--cc=chengzhihao1@huawei.com \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=yi.zhang@huawei.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.