linux-kernel.vger.kernel.org archive mirror
From: Dave Hansen <haveblue@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 24/25] r/o bind mounts: track number of mount writers
Date: Mon, 01 Oct 2007 11:06:37 -0700
Message-ID: <1191261997.6024.12.camel@localhost>
In-Reply-To: <20070924124233.9197d6b6.akpm@linux-foundation.org>

On Mon, 2007-09-24 at 12:42 -0700, Andrew Morton wrote: 
> On Mon, 24 Sep 2007 12:28:11 -0700
> Dave Hansen <haveblue@us.ibm.com> wrote:
> > On Mon, 2007-09-24 at 18:54 +0100, Christoph Hellwig wrote:
> > > As we already said in various messages, the percpu counters in here
> > > look rather fishy.  I'd recommend taking a look at the per-cpu
> > > superblock counters in XFS as they've been debugged quite well
> > > now and could probably be lifted into a generic library for this
> > > kind of thing.  The code is mostly in fs/xfs/xfs_mount.c and
> > > can be spotted by being under #ifdef HAVE_PERCPU_SB.
> > > 
> > > It also handles cases like hotplug cpu nicely that this code
> > > seems to work around by always iterating over all possible cpus
> > > which might not be nice on a dual core laptop with a distro kernel
> > > that also has to support big iron.
> > 
> > I'll take a look at xfs to see what I can get out of it.
> 
> And at include/linux/percpu_counter.h, please.

The basic incompatibility between what that provides and what I need is
that percpu_counters allow some fuzziness in the numbers.  One cpu can
be summing the numbers up while another is still adding to the local
percpu counters.  That's fine for statistics, but horrible for questions
like, "can anybody write to and corrupt my FS right now?"
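
To make the race concrete, here is a minimal userspace sketch.  It is
not code from the patch or from percpu_counter; NR_CPUS, percpu_count,
local_inc(), fuzzy_sum() and late_writer() are made-up names, and the
"racer" callback is just a single-threaded stand-in for another cpu
running in the middle of the summing loop.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical cpu count for this sketch */

/* One slot per cpu; hypothetical stand-in for a per-cpu counter. */
static long percpu_count[NR_CPUS];

/* Fast path: a cpu only touches its own slot and takes no shared lock. */
static void local_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	percpu_count[cpu]++;
}

/*
 * Lockless sum.  'racer' simulates another cpu that gets to run after
 * slot 0 has already been read by the summing cpu.
 */
static long fuzzy_sum(void (*racer)(void))
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		sum += percpu_count[cpu];
		if (cpu == 0 && racer != NULL)
			racer();	/* too late: slot 0 is already summed */
	}
	return sum;
}

/* A writer showing up on cpu 0 while the sum is in progress. */
static void late_writer(void)
{
	local_inc(0);
}
```

Starting from all-zero slots, fuzzy_sum(late_writer) returns 0 even
though a writer now exists; a second fuzzy_sum(NULL) returns 1.  For a
statistic that is harmless, but for a "no writers, safe to go r/o"
decision the first answer is wrong.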

It could probably be modified to do what I want, but it would still have
the "invented your own lock" problem, and would likely impact the
scalability of the existing "fuzzy" users.

We could introduce fuzzy and coherent variants of the function calls,
but that would probably introduce more code than what I have now for the
very specific mnt_writer_lock.
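
For illustration only, a "coherent" variant could look roughly like the
userspace sketch below.  This is an assumption about the general shape,
not the code in the patch: struct writer_slot, writers_init(),
want_write(), drop_write() and make_readonly() are hypothetical names,
and pthread mutexes stand in for whatever per-cpu lock the kernel code
would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical cpu count for this sketch */

/* Hypothetical per-cpu slot: a lock plus that cpu's writer count. */
struct writer_slot {
	pthread_mutex_t lock;
	long count;
};

static struct writer_slot slots[NR_CPUS];

/* Read under any one slot lock, written only with all of them held. */
static bool mnt_readonly;

static void writers_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		pthread_mutex_init(&slots[cpu].lock, NULL);
		slots[cpu].count = 0;
	}
	mnt_readonly = false;
}

/* Fast path: take only this cpu's lock.  Fails once the mount is r/o. */
static bool want_write(int cpu)
{
	bool ok;

	assert(cpu >= 0 && cpu < NR_CPUS);
	pthread_mutex_lock(&slots[cpu].lock);
	ok = !mnt_readonly;
	if (ok)
		slots[cpu].count++;
	pthread_mutex_unlock(&slots[cpu].lock);
	return ok;
}

/* May run on a different cpu than want_write(), so a slot can go negative. */
static void drop_write(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pthread_mutex_lock(&slots[cpu].lock);
	slots[cpu].count--;
	pthread_mutex_unlock(&slots[cpu].lock);
}

/* Slow path (r/w->r/o): hold every slot lock so the sum is exact. */
static bool make_readonly(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		pthread_mutex_lock(&slots[cpu].lock);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += slots[cpu].count;
	if (sum == 0)
		mnt_readonly = true;
	for (int cpu = NR_CPUS - 1; cpu >= 0; cpu--)
		pthread_mutex_unlock(&slots[cpu].lock);
	return sum == 0;
}
```

Only the rare r/w->r/o transition pays for taking every lock; writers
touch just their own slot.  Even this stripped-down version needs a
lock per slot and a walk over all of them, which is the kind of extra
code a generic coherent counter API would have to carry.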

> > There are basically two times when you have to do this
> > for_each_possible_cpu() stuff:
> > 1. when doing a r/w->r/o transition, which is rare, and
> >    certainly not a fast path
> > 2. when the per-cpu writer count underflows.  This requires
> >    a _minimum_ of 1<<16 file opens (configurable) each of which
> >    is closed on a different cpu than it was opened on.  Even
> >    if you were trying, I'm not sure you'd notice the overhead.
> > 
> 
> Sounds like what you're doing is more akin to the local_t-based module
> refcounting.  `grep local_ kernel/module.c'.
> 
> That code should be converted from NR_CPUS to for_each_possible_cpu().

I think that can be converted to use the percpu_counters pretty easily.
I've coded that up and sent it as an [RFC] to lkml.  Rusty will forward
it on into mainline.

-- Dave


Thread overview: 48+ messages
2007-09-20 19:52 [PATCH 00/25] Read-only bind mounts Dave Hansen
2007-09-20 19:52 ` [PATCH 01/25] filesystem helpers for custom 'struct file's Dave Hansen
2007-09-20 19:52 ` [PATCH 02/25] rearrange may_open() to be r/o friendly Dave Hansen
2007-09-20 19:52 ` [PATCH 03/25] give may_open() a local 'mnt' variable Dave Hansen
2007-09-20 19:57   ` Christoph Hellwig
2007-09-20 19:52 ` [PATCH 04/25] create cleanup helper svc_msnfs() Dave Hansen
2007-09-20 19:52 ` [PATCH 05/25] r/o bind mounts: stub functions Dave Hansen
2007-09-20 19:52 ` [PATCH 06/25] elevate write count open()'d files Dave Hansen
2007-11-28  8:41   ` Andrew Morton
2007-11-28 17:33     ` Dave Hansen
2007-09-20 19:52 ` [PATCH 07/25] r/o bind mounts: elevate write count for some ioctls Dave Hansen
2007-09-21  8:17   ` Andrew Morton
2007-09-21 21:15     ` Dave Hansen
2007-09-26  1:34     ` [RFC] detect missed mnt_want_write() calls Dave Hansen
2007-09-21 23:03   ` [PATCH 07/25] r/o bind mounts: elevate write count for some ioctls Andrew Morton
2007-09-21 23:39     ` Dave Hansen
2007-09-21 23:47       ` Andrew Morton
2007-09-20 19:52 ` [PATCH 08/25] elevate writer count for chown and friends Dave Hansen
2007-09-20 19:53 ` [PATCH 09/25] make access() use mnt check Dave Hansen
2007-09-20 19:53 ` [PATCH 10/25] elevate mnt writers for callers of vfs_mkdir() Dave Hansen
2007-09-20 19:53 ` [PATCH 11/25] elevate write count during entire ncp_ioctl() Dave Hansen
2007-09-20 19:53 ` [PATCH 12/25] elevate write count for link and symlink calls Dave Hansen
2007-09-20 19:53 ` [PATCH 13/25] elevate mount count for extended attributes Dave Hansen
2007-09-20 19:53 ` [PATCH 14/25] elevate write count for file_update_time() Dave Hansen
2007-09-20 19:53 ` [PATCH 15/25] unix_find_other() elevate write count for touch_atime() Dave Hansen
2007-09-20 19:53 ` [PATCH 16/25] elevate write count over calls to vfs_rename() Dave Hansen
2007-09-20 19:53 ` [PATCH 17/25] nfs: check mnt instead of superblock directly Dave Hansen
2007-09-20 19:53 ` [PATCH 18/25] elevate writer count for do_sys_truncate() Dave Hansen
2007-09-20 19:53 ` [PATCH 19/25] elevate write count for do_utimes() Dave Hansen
2007-09-20 19:53 ` [PATCH 20/25] elevate write count for do_sys_utime() and touch_atime() Dave Hansen
2007-09-20 19:53 ` [PATCH 21/25] sys_mknodat(): elevate write count for vfs_mknod/create() Dave Hansen
2007-09-20 19:53 ` [PATCH 22/25] elevate mnt writers for vfs_unlink() callers Dave Hansen
2007-09-20 19:53 ` [PATCH 23/25] do_rmdir(): elevate write count Dave Hansen
2007-09-20 19:53 ` [PATCH 24/25] r/o bind mounts: track number of mount writers Dave Hansen
2007-09-24  6:17   ` Andrew Morton
2007-09-24 14:34     ` Arjan van de Ven
2007-09-24 22:06     ` Dave Hansen
2007-09-24 22:25       ` Andrew Morton
2007-09-24 23:05         ` Dave Hansen
2007-09-24 23:15           ` Andrew Morton
2007-09-25 16:10             ` Dave Hansen
2007-09-24 17:54   ` Christoph Hellwig
2007-09-24 19:10     ` Andrew Morton
2007-09-24 19:24       ` Christoph Hellwig
2007-09-24 19:28     ` Dave Hansen
2007-09-24 19:42       ` Andrew Morton
2007-10-01 18:06         ` Dave Hansen [this message]
2007-09-20 19:53 ` [PATCH 25/25] honor r/w changes at do_remount() time Dave Hansen
