linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: NeilBrown <neil@brown.name>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-unionfs@vger.kernel.org,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	 linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 02/12] ovl: Call ovl_create_temp() and ovl_create_index() without lock held.
Date: Wed, 25 Jun 2025 18:02:57 +0200	[thread overview]
Message-ID: <CAOQ4uxhCSbsPvnG1h0Bi=80KROtbCBBaB9SgZtpxMDjAVmPoKw@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uxjzZGK6fw9=dFiC8kZCUtA7NVQVE_Sa2wdHLZ9ZD7upgA@mail.gmail.com>

On Wed, Jun 25, 2025 at 5:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Jun 25, 2025 at 1:07 AM NeilBrown <neil@brown.name> wrote:
> >
> > ovl currently locks a directory or two and then performs multiple actions
> > in one or both directories.  This is incompatible with proposed changes
> > which will lock just the dentry of objects being acted on.
> >
> > This patch moves calls to ovl_create_temp() and ovl_create_index() out
> > of the locked region and has them take and release the relevant lock
> > themselves.
> >
> > The lock that was taken before these functions are called is now taken
> > after.  This means that any code between where the lock was taken and
> > these calls is now unlocked.  This necessitates the creation of
> > _unlocked() versions of ovl_cleanup() and ovl_lookup_upper().  These
> > will be used more widely in future patches.
> >
> > ovl_cleanup_unlocked() takes a dentry for the directory rather than an
> > inode as this simplifies calling slightly.
> >
> > Note that when we move a lookup or create out of a locked region in
> > which the dentry is acted on, we need to ensure after taking the lock
> > that the dentry is still in the directory we expect it to be in.  It is
> > conceivable that it was moved.
> >
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> >  fs/overlayfs/copy_up.c   | 37 +++++++++++-------
> >  fs/overlayfs/dir.c       | 84 +++++++++++++++++++++++++---------------
> >  fs/overlayfs/overlayfs.h | 10 +++++
> >  fs/overlayfs/super.c     |  7 ++--
> >  4 files changed, 88 insertions(+), 50 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index 8a3c0d18ec2e..7a21ad1f2b6e 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -517,15 +517,12 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
> >
> >  /*
> >   * Create and install index entry.
> > - *
> > - * Caller must hold i_mutex on indexdir.
> >   */
> >  static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >                             struct dentry *upper)
> >  {
> >         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
> >         struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
> > -       struct inode *dir = d_inode(indexdir);
> >         struct dentry *index = NULL;
> >         struct dentry *temp = NULL;
> >         struct qstr name = { };
> > @@ -558,17 +555,21 @@ static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh,
> >         err = ovl_set_upper_fh(ofs, upper, temp);
> >         if (err)
> >                 goto out;
> > -
> > +       lock_rename(indexdir, indexdir);
>
> This is a really strange hack.
> I assume your next patch set is going to remove this call, but I do not wish
> to merge this hack as is for an unknown period of time.
>
> Please introduce helpers {un,}lock_parent()
>
> >         index = ovl_lookup_upper(ofs, name.name, indexdir, name.len);
> >         if (IS_ERR(index)) {
> >                 err = PTR_ERR(index);
> > +       } else if (temp->d_parent != indexdir) {
> > +               err = -EINVAL;
> > +               dput(index);
>
> This can be inside lock_parent(parent, child)
> where child is an optional arg.
>
> err = lock_parent(indexdir, temp);
> if (err)
>        goto out;
>
> Because we should be checking this right after lock and
> not after ovl_lookup_upper().
>
> >         } else {
> >                 err = ovl_do_rename(ofs, indexdir, temp, indexdir, index, 0);
> >                 dput(index);
> >         }
> > +       unlock_rename(indexdir, indexdir);
> >  out:
> >         if (err)
> > -               ovl_cleanup(ofs, dir, temp);
> > +               ovl_cleanup_unlocked(ofs, indexdir, temp);
> >         dput(temp);
> >  free_name:
> >         kfree(name.name);
> > @@ -779,9 +780,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 return err;
> >
> >         ovl_start_write(c->dentry);
> > -       inode_lock(wdir);
> >         temp = ovl_create_temp(ofs, c->workdir, &cattr);
> > -       inode_unlock(wdir);
> >         ovl_end_write(c->dentry);
> >         ovl_revert_cu_creds(&cc);
> >
> > @@ -794,6 +793,8 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >          */
> >         path.dentry = temp;
> >         err = ovl_copy_up_data(c, &path);
> > +       if (err)
> > +               goto cleanup_write_unlocked;
> >         /*
> >          * We cannot hold lock_rename() throughout this helper, because of
> >          * lock ordering with sb_writers, which shouldn't be held when calling
> > @@ -801,6 +802,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >          * temp wasn't moved before copy up completion or cleanup.
> >          */
> >         ovl_start_write(c->dentry);
> > +
> > +       if (S_ISDIR(c->stat.mode) && c->indexed) {
> > +               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> > +               if (err)
> > +                       goto cleanup_unlocked;
> > +       }
> > +
> >         trap = lock_rename(c->workdir, c->destdir);
> >         if (trap || temp->d_parent != c->workdir) {
> >                 /* temp or workdir moved underneath us? abort without cleanup */
> > @@ -809,20 +817,12 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >                 if (IS_ERR(trap))
> >                         goto out;
> >                 goto unlock;
> > -       } else if (err) {
> > -               goto cleanup;
> >         }
> >
> >         err = ovl_copy_up_metadata(c, temp);
> >         if (err)
> >                 goto cleanup;
> >
> > -       if (S_ISDIR(c->stat.mode) && c->indexed) {
> > -               err = ovl_create_index(c->dentry, c->origin_fh, temp);
> > -               if (err)
> > -                       goto cleanup;
> > -       }
> > -
> >         upper = ovl_lookup_upper(ofs, c->destname.name, c->destdir,
> >                                  c->destname.len);
> >         err = PTR_ERR(upper);
> > @@ -857,6 +857,13 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
> >         ovl_cleanup(ofs, wdir, temp);
> >         dput(temp);
> >         goto unlock;
> > +
> > +cleanup_write_unlocked:
> > +       ovl_start_write(c->dentry);
> > +cleanup_unlocked:
> > +       ovl_cleanup_unlocked(ofs, c->workdir, temp);
> > +       dput(temp);
> > +       goto out;
> >  }
>
> Wow these various cleanup flows are quite hard to follow.
> This is a massive patch set which is very hard for me to review
> and it will also be hard for me to maintain the code as it is.
> We need to figure out a way to simplify the code flow
> either more re-factoring or using some scoped cleanup hooks.
> I am open to suggestions.
>
> Thanks,
> Amir.
>
> >
> >  /* Copyup using O_TMPFILE which does not require cross dir locking */
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 4fc221ea6480..a51a3dc02bf5 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -43,6 +43,21 @@ int ovl_cleanup(struct ovl_fs *ofs, struct inode *wdir, struct dentry *wdentry)
> >         return err;
> >  }
> >
> > +int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
> > +                        struct dentry *wdentry)
> > +{
> > +       int err;
> > +
> > +       inode_lock_nested(workdir->d_inode, I_MUTEX_PARENT);
> > +       if (wdentry->d_parent == workdir)
> > +               ovl_cleanup(ofs, workdir->d_inode, wdentry);
> > +       else
> > +               err = -EINVAL;
> > +       inode_unlock(workdir->d_inode);
> > +
> > +       return err;
> > +}
> > +

Just to illustrate what I meant and how the flow of ovl_cleanup_unlocked()
and later on ovl_cleanup() would look simpler:

int lock_parent(struct dentry *parent, struct dentry *child)
{
       int err;

       inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
       if (!child || child->d_parent == parent)
              return 0;

       inode_unlock(parent->d_inode);
       return -EINVAL;
}

int ovl_cleanup_unlocked(struct ovl_fs *ofs, struct dentry *workdir,
                        struct dentry *wdentry)
{
       int err;

       err = parent_lock(workdir, wdentry);
       if (err)
               return err;

       ovl_cleanup(ofs, workdir->d_inode, wdentry);
       parent_unlock(workdir);
       return 0;
}

Thanks,
Amir.

  reply	other threads:[~2025-06-25 16:03 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-24 22:54 [PATCH 00/12] ovl: narrow regions protected by directory i_rw_sem NeilBrown
2025-06-24 22:54 ` [PATCH 01/12] ovl: use is_subdir() for testing if one thing is a subdir of another NeilBrown
2025-06-25 14:54   ` Amir Goldstein
2025-06-25 21:45     ` NeilBrown
2025-06-24 22:54 ` [PATCH 02/12] ovl: Call ovl_create_temp() and ovl_create_index() without lock held NeilBrown
2025-06-25 15:44   ` Amir Goldstein
2025-06-25 16:02     ` Amir Goldstein [this message]
2025-06-28  3:08   ` Dan Carpenter
2025-06-24 22:54 ` [PATCH 03/12] ovl: narrow the locked region in ovl_copy_up_workdir() NeilBrown
2025-06-25 19:07   ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 04/12] ovl: narrow locking in ovl_create_upper() NeilBrown
2025-06-25 17:55   ` Amir Goldstein
2025-06-25 18:17     ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 05/12] ovl: narrow locking in ovl_clear_empty() NeilBrown
2025-06-25 18:22   ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 06/12] ovl: narrow locking in ovl_create_over_whiteout() NeilBrown
2025-06-25 19:08   ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 07/12] ovl: narrow locking in ovl_rename() NeilBrown
2025-06-25 18:30   ` Amir Goldstein
2025-07-02  2:16     ` NeilBrown
2025-06-24 22:55 ` [PATCH 08/12] ovl: narrow locking in ovl_cleanup_whiteouts() NeilBrown
2025-06-25 18:35   ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 09/12] ovl: whiteout locking changes NeilBrown
2025-06-25 18:54   ` Amir Goldstein
2025-07-02  2:21     ` NeilBrown
2025-06-24 22:55 ` [PATCH 10/12] ovl: narrow locking in ovl_check_rename_whiteout() NeilBrown
2025-06-25 19:04   ` Amir Goldstein
2025-07-02  2:41     ` NeilBrown
2025-07-02 10:04       ` Amir Goldstein
2025-07-02 10:23       ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 11/12] ovl: change ovl_create_real() to receive dentry parent NeilBrown
2025-06-25 19:05   ` Amir Goldstein
2025-06-24 22:55 ` [PATCH 12/12] ovl: rename ovl_cleanup_unlocked() to ovl_cleanup() NeilBrown
2025-06-25 18:57   ` Amir Goldstein
2025-06-25 14:56 ` [PATCH 00/12] ovl: narrow regions protected by directory i_rw_sem Amir Goldstein
2025-06-25 21:35   ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxhCSbsPvnG1h0Bi=80KROtbCBBaB9SgZtpxMDjAVmPoKw@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=neil@brown.name \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).