From: Nick Piggin <npiggin@kernel.dk>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] fs: inode per-cpu last_ino allocator
Date: Sat, 16 Oct 2010 17:36:04 +1100
Message-ID: <20101016063604.GA4383@amd>
In-Reply-To: <1285867685.2615.865.camel@edumazet-laptop>

On Thu, Sep 30, 2010 at 07:28:05PM +0200, Eric Dumazet wrote:
> On Thursday, 30 September 2010 at 09:45 -0700, Andrew Morton wrote:
>
> > Could eliminate `p' I guess, but that would involve using
> > __get_cpu_var() as an lval, which looks vile and might generate worse
> > code.
> >
>
> Hmm, I see, please check this new patch, using the most modern stuff ;)
>
> > Readers of this code won't know why last_ino_get() was marked noinline.
> > It looks wrong, really.
>
> Oops sorry, this was a temporary hack of mine to ease disassembly
> analysis. Good catch !
>
> Here is the new generated code on i686 (with the noinline) :
> pretty good ;)
>
> c02e5930 <last_ino_get>:
> c02e5930:  55                        push   %ebp
> c02e5931:  89 e5                     mov    %esp,%ebp
> c02e5933:  64 a1 44 29 7d c0         mov    %fs:0xc07d2944,%eax
> c02e5939:  a9 ff 03 00 00            test   $0x3ff,%eax
> c02e593e:  74 09                     je     c02e5949 <last_ino_get+0x19>
> c02e5940:  40                        inc    %eax
> c02e5941:  64 a3 44 29 7d c0         mov    %eax,%fs:0xc07d2944
> c02e5947:  c9                        leave
> c02e5948:  c3                        ret
> c02e5949:  b8 00 04 00 00            mov    $0x400,%eax
> c02e594e:  f0 0f c1 05 80 c8 92 c0   lock xadd %eax,0xc092c880
> c02e5956:  eb e8                     jmp    c02e5940 <last_ino_get+0x10>
>
>
> Thanks
>
> [PATCH] fs: inode per-cpu last_ino allocator

Thanks Eric, this looks good. You didn't seem to add the comment about
preempt safety that Andrew wanted, but I'll add it.

>
> new_inode() dirties a contended cache line to get increasing
> inode numbers.
>
> Solve this problem by giving each cpu a per_cpu variable that is
> fed from the shared last_ino only once every 1024 allocations.
> This reduces contention on the shared last_ino and gives the same
> spread of inode numbers as before (i.e. the same wraparound after
> 2^32 allocations).
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Nick Piggin <npiggin@suse.de>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/inode.c | 47 ++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 40 insertions(+), 7 deletions(-)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 8646433..5c233f0 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -624,6 +624,45 @@ void inode_add_to_lists(struct super_block *sb, struct inode *inode)
> }
> EXPORT_SYMBOL_GPL(inode_add_to_lists);
>
> +#define LAST_INO_BATCH 1024
> +
> +/*
> + * Each cpu owns a range of LAST_INO_BATCH numbers.
> + * 'shared_last_ino' is dirtied only once out of LAST_INO_BATCH allocations,
> + * to renew the exhausted range.
> + *
> + * This does not significantly increase overflow rate because every CPU can
> + * consume at most LAST_INO_BATCH-1 unused inode numbers. So there is
> + * NR_CPUS*(LAST_INO_BATCH-1) wastage. At 4096 and 1024, this is ~0.1% of the
> + * 2^32 range, and is a worst-case. Even a 50% wastage would only increase
> + * overflow rate by 2x, which does not seem too significant.
> + *
> + * On a 32bit, non LFS stat() call, glibc will generate an EOVERFLOW
> + * error if st_ino won't fit in target struct field. Use 32bit counter
> + * here to attempt to avoid that.
> + */
> +static DEFINE_PER_CPU(unsigned int, last_ino);
> +
> +static unsigned int last_ino_get(void)
> +{
> +	unsigned int res;
> +
> +	get_cpu();
> +	res = __this_cpu_read(last_ino);
> +#ifdef CONFIG_SMP
> +	if (unlikely((res & (LAST_INO_BATCH - 1)) == 0)) {
> +		static atomic_t shared_last_ino;
> +		int next = atomic_add_return(LAST_INO_BATCH, &shared_last_ino);
> +
> +		res = next - LAST_INO_BATCH;
> +	}
> +#endif
> +	res++;
> +	__this_cpu_write(last_ino, res);
> +	put_cpu();
> +	return res;
> +}
> +
> /**
> * new_inode - obtain an inode
> * @sb: superblock
> @@ -638,12 +677,6 @@ EXPORT_SYMBOL_GPL(inode_add_to_lists);
>   */
>  struct inode *new_inode(struct super_block *sb)
>  {
> -	/*
> -	 * On a 32bit, non LFS stat() call, glibc will generate an EOVERFLOW
> -	 * error if st_ino won't fit in target struct field. Use 32bit counter
> -	 * here to attempt to avoid that.
> -	 */
> -	static unsigned int last_ino;
>  	struct inode *inode;
>
>  	spin_lock_prefetch(&inode_lock);
> @@ -652,7 +685,7 @@ struct inode *new_inode(struct super_block *sb)
>  	if (inode) {
>  		spin_lock(&inode_lock);
>  		__inode_add_to_lists(sb, NULL, inode);
> -		inode->i_ino = ++last_ino;
> +		inode->i_ino = last_ino_get();
>  		inode->i_state = 0;
>  		spin_unlock(&inode_lock);
>  	}
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
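
For readers who want to try the batching scheme outside the kernel, here is
a rough user-space sketch of the idea in the quoted patch. It is not the
kernel code: the per-cpu variable is replaced by a plain array indexed by a
simulated cpu number, C11 atomic_fetch_add() stands in for
atomic_add_return(), and there is no preemption handling at all. The names
SKETCH_NR_CPUS, sketch_last_ino_get() and sketch_max_wasted() are made up
for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LAST_INO_BATCH 1024u  /* same batch size as the patch */
#define SKETCH_NR_CPUS 4      /* hypothetical: number of simulated cpus */

/* The refill test below masks the low bits, so the batch must be a power of two. */
static_assert((LAST_INO_BATCH & (LAST_INO_BATCH - 1)) == 0,
              "LAST_INO_BATCH must be a power of two");

/* Shared counter: written once per LAST_INO_BATCH allocations on each cpu. */
static atomic_uint shared_last_ino;

/* Assumption: a plain array stands in for DEFINE_PER_CPU(unsigned int, last_ino). */
static unsigned int last_ino[SKETCH_NR_CPUS];

/* Hand out the next number for the simulated cpu, refilling its range when exhausted. */
static unsigned int sketch_last_ino_get(unsigned int cpu)
{
    unsigned int res = last_ino[cpu];

    if ((res & (LAST_INO_BATCH - 1)) == 0) {
        /*
         * atomic_fetch_add() returns the value before the add, which is
         * the start of the freshly reserved range. The patch gets the same
         * value as atomic_add_return(...) - LAST_INO_BATCH.
         */
        res = atomic_fetch_add(&shared_last_ino, LAST_INO_BATCH);
    }
    res++;
    last_ino[cpu] = res;
    return res;
}

/* Worst-case count of numbers left unused, as in the patch comment. */
static uint64_t sketch_max_wasted(uint64_t nr_cpus)
{
    return nr_cpus * (LAST_INO_BATCH - 1);
}
```

With this sketch, the first call on simulated cpu 0 returns 1 and the first
call on cpu 1 returns 1025, since each cpu reserves its own range of 1024
numbers. For 4096 cpus, sketch_max_wasted(4096) gives 4190208, which is
roughly 0.1% of 2^32 and matches the estimate in the patch comment.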