From: Joel Becker <jlbec@evilplan.org>
To: Seamus Connor <sconnor@purestorage.com>
Cc: Christoph Hellwig <hch@lst.de>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] configfs: improve item creation performance
Date: Sat, 14 Oct 2023 18:09:28 -0700
Message-ID: <ZSs7yIvQCTVJ7cZ7@google.com>
In-Reply-To: <20231013211129.72592-1-sconnor@purestorage.com>

On Fri, Oct 13, 2023 at 02:11:29PM -0700, Seamus Connor wrote:
> As the size of a directory increases item creation slows down.
> Optimizing access to s_children removes this bottleneck.
>
> dirents are already pinned into the cache, there is no need to scan the
> s_children list looking for duplicate Items. The configfs_dirent_exists
> check is moved to a location where it is called only during subsystem
> initialization.
>
> d_lookup will only need to call configfs_lookup in the case where the
> item in question is not pinned to dcache. The only items not pinned to
> dcache are attributes. These are placed at the front of the s_children
> list, whilst pinned items are inserted at the back. configfs_lookup
> stops scanning when it encounters the first pinned entry in s_children.
>
> The assumption of the above optimizations is that there will be few
> attributes, but potentially many Items in a given directory.
>
> Signed-off-by: Seamus Connor <sconnor@purestorage.com>
> Cc: Joel Becker <jlbec@evilplan.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: linux-kernel@vger.kernel.org

Reviewed-by: Joel Becker <jlbec@evilplan.org>

> ---
>
> In this revision I've added some commentary describing the changes, and
> I have removed a helper function.
>
>  fs/configfs/configfs_internal.h |  3 +--
>  fs/configfs/dir.c               | 39 +++++++++++++++++++++++++--------
>  fs/configfs/inode.c             | 24 --------------------
>  3 files changed, 31 insertions(+), 35 deletions(-)
>
> diff --git a/fs/configfs/configfs_internal.h b/fs/configfs/configfs_internal.h
> index c0395363eab9..b91036fd71b1 100644
> --- a/fs/configfs/configfs_internal.h
> +++ b/fs/configfs/configfs_internal.h
> @@ -55,6 +55,7 @@ struct configfs_dirent {
>  #define CONFIGFS_USET_IN_MKDIR	0x0200
>  #define CONFIGFS_USET_CREATING	0x0400
>  #define CONFIGFS_NOT_PINNED	(CONFIGFS_ITEM_ATTR | CONFIGFS_ITEM_BIN_ATTR)
> +#define CONFIGFS_PINNED	(CONFIGFS_ROOT | CONFIGFS_DIR | CONFIGFS_ITEM_LINK)
>
>  extern struct mutex configfs_symlink_mutex;
>  extern spinlock_t configfs_dirent_lock;
> @@ -73,8 +74,6 @@ extern int configfs_make_dirent(struct configfs_dirent *, struct dentry *,
>  				void *, umode_t, int, struct configfs_fragment *);
>  extern int configfs_dirent_is_ready(struct configfs_dirent *);
>
> -extern void configfs_hash_and_remove(struct dentry * dir, const char * name);
> -
>  extern const unsigned char * configfs_get_name(struct configfs_dirent *sd);
>  extern void configfs_drop_dentry(struct configfs_dirent *sd, struct dentry *parent);
>  extern int configfs_setattr(struct user_namespace *mnt_userns,
> diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
> index d1f9d2632202..64a0eac83b90 100644
> --- a/fs/configfs/dir.c
> +++ b/fs/configfs/dir.c
> @@ -207,7 +207,17 @@ static struct configfs_dirent *configfs_new_dirent(struct configfs_dirent *paren
>  		return ERR_PTR(-ENOENT);
>  	}
>  	sd->s_frag = get_fragment(frag);
> -	list_add(&sd->s_sibling, &parent_sd->s_children);
> +
> +	/*
> +	 * configfs_lookup scans only for unpinned items. s_children is
> +	 * partitioned so that configfs_lookup can bail out early.
> +	 * CONFIGFS_PINNED and CONFIGFS_NOT_PINNED are not symmetrical. readdir
> +	 * cursors still need to be inserted at the front of the list.
> +	 */
> +	if (sd->s_type & CONFIGFS_PINNED)
> +		list_add_tail(&sd->s_sibling, &parent_sd->s_children);
> +	else
> +		list_add(&sd->s_sibling, &parent_sd->s_children);
>  	spin_unlock(&configfs_dirent_lock);
>
>  	return sd;
> @@ -220,10 +230,11 @@ static struct configfs_dirent *configfs_new_dirent(struct configfs_dirent *paren
>   *
>   * called with parent inode's i_mutex held
>   */
> -static int configfs_dirent_exists(struct configfs_dirent *parent_sd,
> -				  const unsigned char *new)
> +static int configfs_dirent_exists(struct dentry *dentry)
>  {
> -	struct configfs_dirent * sd;
> +	struct configfs_dirent *parent_sd = dentry->d_parent->d_fsdata;
> +	const unsigned char *new = dentry->d_name.name;
> +	struct configfs_dirent *sd;
>
>  	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
>  		if (sd->s_element) {
> @@ -289,10 +300,6 @@ static int configfs_create_dir(struct config_item *item, struct dentry *dentry,
>
>  	BUG_ON(!item);
>
> -	error = configfs_dirent_exists(p->d_fsdata, dentry->d_name.name);
> -	if (unlikely(error))
> -		return error;
> -
>  	error = configfs_make_dirent(p->d_fsdata, dentry, item, mode,
>  				     CONFIGFS_DIR | CONFIGFS_USET_CREATING,
>  				     frag);
> @@ -449,6 +456,18 @@ static struct dentry * configfs_lookup(struct inode *dir,
>
>  	spin_lock(&configfs_dirent_lock);
>  	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
> +
> +		/*
> +		 * s_children is partitioned, see configfs_new_dirent. The first
> +		 * pinned item indicates we can stop scanning.
> +		 */
> +		if (sd->s_type & CONFIGFS_PINNED)
> +			break;
> +
> +		/*
> +		 * Note: CONFIGFS_PINNED and CONFIGFS_NOT_PINNED are asymmetric.
> +		 * there may be a readdir cursor in this list
> +		 */
>  		if ((sd->s_type & CONFIGFS_NOT_PINNED) &&
>  		    !strcmp(configfs_get_name(sd), dentry->d_name.name)) {
>  			struct configfs_attribute *attr = sd->s_element;
> @@ -1878,7 +1897,9 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys)
>  	if (dentry) {
>  		d_add(dentry, NULL);
>
> -		err = configfs_attach_group(sd->s_element, &group->cg_item,
> +		err = configfs_dirent_exists(dentry);
> +		if (!err)
> +			err = configfs_attach_group(sd->s_element, &group->cg_item,
>  					    dentry, frag);
>  		if (err) {
>  			BUG_ON(d_inode(dentry));
> diff --git a/fs/configfs/inode.c b/fs/configfs/inode.c
> index b601610e9907..15964e62329d 100644
> --- a/fs/configfs/inode.c
> +++ b/fs/configfs/inode.c
> @@ -218,27 +218,3 @@ void configfs_drop_dentry(struct configfs_dirent * sd, struct dentry * parent)
>  	}
>  }
>
> -void configfs_hash_and_remove(struct dentry * dir, const char * name)
> -{
> -	struct configfs_dirent * sd;
> -	struct configfs_dirent * parent_sd = dir->d_fsdata;
> -
> -	if (d_really_is_negative(dir))
> -		/* no inode means this hasn't been made visible yet */
> -		return;
> -
> -	inode_lock(d_inode(dir));
> -	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
> -		if (!sd->s_element)
> -			continue;
> -		if (!strcmp(configfs_get_name(sd), name)) {
> -			spin_lock(&configfs_dirent_lock);
> -			list_del_init(&sd->s_sibling);
> -			spin_unlock(&configfs_dirent_lock);
> -			configfs_drop_dentry(sd, dir);
> -			configfs_put(sd);
> -			break;
> -		}
> -	}
> -	inode_unlock(d_inode(dir));
> -}
> --
> 2.37.0
>
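
For anyone following the thread, the partitioning idea in the commit
message can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the kernel code: the struct,
the ENT_* flags and their values, and the function names are all
hypothetical stand-ins for configfs_dirent, the CONFIGFS_* type bits,
configfs_new_dirent and configfs_lookup. Unpinned entries (and cursors,
which match neither mask) go to the front of the list, pinned entries go
to the back, and the lookup stops at the first pinned entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type bits; names and values are illustrative only. */
enum { ENT_DIR = 0x1, ENT_LINK = 0x2, ENT_ATTR = 0x4 };
#define ENT_PINNED	(ENT_DIR | ENT_LINK)
#define ENT_NOT_PINNED	(ENT_ATTR)
/* A cursor has type 0, so it matches neither mask (the asymmetry). */

struct ent {
	struct ent *prev, *next;
	unsigned int type;
	const char *name;	/* NULL for cursors */
};

/* Circular doubly linked list with a sentinel head. */
static void children_init(struct ent *head)
{
	head->prev = head->next = head;
	head->type = 0;
	head->name = NULL;
}

static void insert_between(struct ent *e, struct ent *prev, struct ent *next)
{
	e->prev = prev;
	e->next = next;
	prev->next = e;
	next->prev = e;
}

/* Pinned entries are appended; everything else is prepended. */
static void add_child(struct ent *head, struct ent *e)
{
	assert(head && e);
	if (e->type & ENT_PINNED)
		insert_between(e, head->prev, head);
	else
		insert_between(e, head, head->next);
}

/*
 * Look up an unpinned entry by name. The scan ends at the first pinned
 * entry, so its cost depends on the number of unpinned entries and
 * cursors, not on the number of pinned ones. *visited, if non-NULL,
 * receives the number of entries examined before stopping.
 */
static struct ent *lookup_unpinned(struct ent *head, const char *name,
				   int *visited)
{
	struct ent *e, *found = NULL;
	int n = 0;

	for (e = head->next; e != head; e = e->next) {
		if (e->type & ENT_PINNED)
			break;
		n++;
		/* The mask test also skips cursors, whose name is NULL. */
		if ((e->type & ENT_NOT_PINNED) && !strcmp(e->name, name)) {
			found = e;
			break;
		}
	}
	if (visited)
		*visited = n;
	return found;
}
```

With one attribute, one cursor, and any number of directories added in
any order, lookup_unpinned() examines at most two entries before it
either finds the attribute or hits the first directory and gives up.
That matches the assumption stated in the commit message: few
attributes, potentially many items.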
--
"Glory is fleeting, but obscurity is forever."
- Napoleon Bonaparte
http://www.jlbec.org/
jlbec@evilplan.org
Thread overview: 7+ messages
2023-10-11 21:39 [PATCH] [WIP] configfs: improve item creation performance Seamus Connor
2023-10-12 20:18 ` Joel Becker
2023-10-12 23:59 ` Seamus Connor
2023-10-13 21:11 ` [PATCH v2] " Seamus Connor
2023-10-15 1:09 ` Joel Becker [this message]
2023-10-16 6:07 ` Christoph Hellwig
2023-10-16 15:56 ` Seamus Connor