linux-fsdevel.vger.kernel.org archive mirror
From: Andreas Dilger <adilger-xsfywfwIY+M@public.gmane.org>
To: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
Cc: chucklever-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	"J. Bruce Fields"
	<bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>,
	David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Subject: Re: [RFC] Reinstate NFS exportability for JFFS2.
Date: Sun, 17 Aug 2008 11:22:06 -0700
Message-ID: <20080817182206.GA4199@webber.adilger.int>
In-Reply-To: <18582.21855.2092.903688-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>

On Aug 04, 2008  11:03 +1000, Neil Brown wrote:
> So, given that background, it is possible to see some more possible
> solutions (other than the ones already mentioned).
> 
>  - drop the internal lock across filldir.
>    It could be seen as impolite to hold any locks across a callback
>    that are not documented as being held.
>    This would put an extra burden on the filesystem, but it shouldn't
>    be a particularly big burden.
>    A filesystem needs to be able to find the 'next' entry from a
>    sequential 'seek' pointer so that is the most state that needs to
>    be preserved.  It might be convenient to be able to keep more state
>    (pointers into data structures etc).  All this can be achieved with
>    fairly standard practice:
>      a/ have a 'version' number per inode which is updated when any
>         internal restructure happens.
>      b/ when calling filldir, record the version number, drop the lock,
>         call filldir, reclaim the lock, check the version number
>      c/ if the version number matches (as it mostly will), just keep
>         going.  If it doesn't, jump back to the top of readdir where
>         we decode the current seek address.
> 
>    Some filesystems have problems implementing seekdir/telldir so they
>    might not be totally happy here.  I have little sympathy for such
>    filesystems and feel the burden should be on them to make it work.
> 
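
The a/ b/ c/ steps quoted above can be sketched in userspace C.  This is
only a rough model under stated assumptions, not kernel code: struct dir,
dir_seek(), dir_remove(), dir_readdir() and the collect() callback are
hypothetical names, a pthread mutex stands in for the filesystem's
internal lock, and the "seek pointer" is a per-entry cookie.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_ENTRIES 16
#define NAME_LEN    32

struct entry {
    long cookie;                /* stable seek position of this entry */
    char name[NAME_LEN];
};

/* Hypothetical in-memory directory; not a kernel structure. */
struct dir {
    pthread_mutex_t lock;       /* the filesystem's internal lock */
    unsigned long version;      /* bumped on every internal restructure */
    int nr;
    struct entry ents[MAX_ENTRIES];   /* kept sorted by cookie */
};

/* Stand-in for filldir; returns nonzero to stop the walk. */
typedef int (*filldir_fn)(void *ctx, const char *name, long cookie);

/* Decode a seek position: index of the first entry with cookie >= pos. */
int dir_seek(const struct dir *d, long pos)
{
    int i = 0;

    assert(d->nr <= MAX_ENTRIES);
    while (i < d->nr && d->ents[i].cookie < pos)
        i++;
    return i;
}

/* Remove an entry.  The array shifts, so this counts as a restructure. */
void dir_remove(struct dir *d, long cookie)
{
    pthread_mutex_lock(&d->lock);
    int i = dir_seek(d, cookie);
    if (i < d->nr && d->ents[i].cookie == cookie) {
        memmove(&d->ents[i], &d->ents[i + 1],
                (size_t)(d->nr - i - 1) * sizeof(d->ents[0]));
        d->nr--;
        d->version++;
    }
    pthread_mutex_unlock(&d->lock);
}

/* Walk the directory from *pos; the lock is never held across fill(). */
int dir_readdir(struct dir *d, long *pos, filldir_fn fill, void *ctx)
{
    pthread_mutex_lock(&d->lock);
    int idx = dir_seek(d, *pos);
    while (idx < d->nr) {
        struct entry e = d->ents[idx];     /* private copy of the entry */
        unsigned long seen = d->version;   /* b/ record the version */

        pthread_mutex_unlock(&d->lock);    /* b/ drop the lock */
        int stop = fill(ctx, e.name, e.cookie);
        pthread_mutex_lock(&d->lock);      /* b/ reclaim the lock */

        if (stop)
            break;                         /* entry not consumed */
        *pos = e.cookie + 1;
        if (d->version == seen)
            idx++;                         /* c/ cached index still valid */
        else
            idx = dir_seek(d, *pos);       /* c/ re-decode the seek address */
    }
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/* Example callback: collects names, and may delete one entry mid-walk. */
struct collector {
    struct dir *d;
    long remove_cookie;   /* cookie to delete during the first callback */
    int n;
    char out[MAX_ENTRIES * NAME_LEN];
};

int collect(void *ctx, const char *name, long cookie)
{
    struct collector *c = ctx;

    (void)cookie;
    if (c->n == 0)
        dir_remove(c->d, c->remove_cookie);  /* takes d->lock itself */
    strcat(c->out, name);
    c->n++;
    return 0;
}
```

If collect() deletes the first entry while it is being reported, the array
shifts under the walker; the version mismatch sends dir_readdir() back
through dir_seek() instead of trusting the stale index, so no entry is
skipped.
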
>  - use i_mutex to protect internal changes too, and drop i_mutex while
>    doing internal restructuring.   This would need some VFS changes so
>    that dropping i_mutex would be safe.  It would require some way to
>    lock an individual dentry.  Probably you would lock it under
>    i_mutex by setting a flag bit, wait for the flag on some inode-wide
>    waitqueue, and drop the lock by clearing the flag and waking the
>    waitqueue. And you are never allowed to look at ->d_inode if the
>    lock flag is set.
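
For reference, the flag-bit dentry lock described in the quoted text might
look roughly like this in userspace.  It is a sketch under assumptions:
struct model_inode, struct model_dentry, dentry_lock() and dentry_unlock()
are hypothetical names, and a pthread condition variable stands in for the
inode-wide waitqueue.  The rule about not looking at ->d_inode while the
flag is set is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the inode and dentry; not kernel types. */
struct model_inode {
    pthread_mutex_t i_mutex;
    pthread_cond_t waitq;       /* one waitqueue shared by all children */
};

struct model_dentry {
    struct model_inode *parent;
    bool locked;                /* the "flag bit", protected by i_mutex */
};

/* Take the per-dentry lock: set the flag under i_mutex, then drop i_mutex. */
void dentry_lock(struct model_dentry *de)
{
    struct model_inode *dir = de->parent;

    pthread_mutex_lock(&dir->i_mutex);
    while (de->locked)          /* someone else holds it: wait for a wakeup */
        pthread_cond_wait(&dir->waitq, &dir->i_mutex);
    de->locked = true;
    pthread_mutex_unlock(&dir->i_mutex);
}

/* Release it: clear the flag and wake everyone waiting on this inode. */
void dentry_unlock(struct model_dentry *de)
{
    struct model_inode *dir = de->parent;

    pthread_mutex_lock(&dir->i_mutex);
    assert(de->locked);
    de->locked = false;
    pthread_cond_broadcast(&dir->waitq);
    pthread_mutex_unlock(&dir->i_mutex);
}
```

The point of the shape is that i_mutex is held only long enough to flip
the flag, so the restructuring itself can run with i_mutex dropped.
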

When we were working on scaling the performance of concurrent operations
in a single directory, we added hashed dentry locks instead of using
i_mutex (well, i_sem in those days) to lock the whole directory.  To make
the change manageable, we replaced direct i_sem locking on the directory
inode with ->lock_dir() and ->unlock_dir() methods.  These default to just
down() and up() on i_sem, but a filesystem can replace them with a
per-entry lock on the child dentry hash.

This allowed Lustre servers to create/lookup/rename/remove many entries
in a single directory concurrently, and I think the same approach could
be useful in this case also.  It lets filesystems that need their own
brand of locking bypass i_mutex, while leaving the majority of
filesystems untouched.
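
To illustrate the shape of the idea (not the actual pdirops patch), here
is a small userspace sketch.  Everything in it is an assumption made for
the example: struct pdir, struct pdir_lock_ops, the method signatures, the
hash function and the number of hashed locks are hypothetical, and pthread
mutexes stand in for i_sem and the per-entry locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_ENTRY_LOCKS 8        /* arbitrary size for the sketch */

struct pdir;

/* Hypothetical method table; the real patch's signatures may differ. */
struct pdir_lock_ops {
    void (*lock_dir)(struct pdir *dir, const char *name);
    void (*unlock_dir)(struct pdir *dir, const char *name);
};

struct pdir {
    pthread_mutex_t i_mutex;                        /* whole-directory lock */
    pthread_mutex_t entry_locks[NR_ENTRY_LOCKS];    /* hashed entry locks */
    const struct pdir_lock_ops *ops;
};

/* Simple string hash, used only to pick one of the hashed locks. */
unsigned long name_hash(const char *name)
{
    unsigned long h = 5381;

    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h;
}

/* Default methods: lock the whole directory, as most filesystems would. */
void default_lock_dir(struct pdir *dir, const char *name)
{
    (void)name;
    pthread_mutex_lock(&dir->i_mutex);
}

void default_unlock_dir(struct pdir *dir, const char *name)
{
    (void)name;
    pthread_mutex_unlock(&dir->i_mutex);
}

/* Opt-in methods: lock only the bucket the entry name hashes to. */
void hashed_lock_dir(struct pdir *dir, const char *name)
{
    pthread_mutex_lock(&dir->entry_locks[name_hash(name) % NR_ENTRY_LOCKS]);
}

void hashed_unlock_dir(struct pdir *dir, const char *name)
{
    pthread_mutex_unlock(&dir->entry_locks[name_hash(name) % NR_ENTRY_LOCKS]);
}

const struct pdir_lock_ops default_dir_ops = { default_lock_dir, default_unlock_dir };
const struct pdir_lock_ops hashed_dir_ops  = { hashed_lock_dir, hashed_unlock_dir };

/* A filesystem passes its own ops, or NULL to keep the default behaviour. */
void pdir_init(struct pdir *dir, const struct pdir_lock_ops *ops)
{
    pthread_mutex_init(&dir->i_mutex, NULL);
    for (int i = 0; i < NR_ENTRY_LOCKS; i++)
        pthread_mutex_init(&dir->entry_locks[i], NULL);
    dir->ops = ops ? ops : &default_dir_ops;
}
```

Callers would go through dir->ops->lock_dir(dir, name) rather than taking
i_mutex directly.  The sketch ignores operations such as rename that need
two entries locked at once, where lock ordering would have to be defined.
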

It also has the benefit that filesystems that need improved multi-threaded
performance in a single directory (e.g. JFFS2, XFS, or HPC or MTA
workloads) can get it.  The filesystem definitely needs some internal work
to take advantage of this increased parallelism.  We did implement such
changes for ext3+htree directories, adding internal locking on each leaf
block so that concurrency scaled with the size of the directory.


Alas, we don't have any up-to-date kernel patches for this, though the VFS
patch was posted to LKML back in Feb 2005 as "RFC: pdirops: vfs patch":
http://www.mail-archive.com/linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg01617.html
We have better dynamic locking code today (the locking code was one of
Jan's objections to that patch), but the VFS part of the patch is no
longer maintained.  The ext3+htree patch was also posted, as
"[RFC] parallel directory operations":
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/att-0041/03-ext3-pdirops.patch


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
