From: Steven Whitehouse <swhiteho@redhat.com>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, cluster-devel@redhat.com,
Abhijith Das <adas@redhat.com>,
sasha.levin@oracle.com
Subject: Re: [PATCH 17/24] GFS2: Use RCU/hlist_bl based hash for quotas
Date: Wed, 22 Jan 2014 09:58:07 +0000
Message-ID: <1390384687.2742.25.camel@menhir>
In-Reply-To: <20140122053248.GX10038@linux.vnet.ibm.com>
Hi,
On Tue, 2014-01-21 at 21:32 -0800, Paul E. McKenney wrote:
> On Mon, Jan 20, 2014 at 12:23:40PM +0000, Steven Whitehouse wrote:
> > Prior to this patch, GFS2 kept all the quotas for each
> > super block in a single linked list. This is rather slow
> > when there are large numbers of quotas.
> >
> > This patch introduces a hlist_bl based hash table, similar
> > to the one used for glocks. The initial look up of the quota
> > is now lockless in the case where it is already cached,
> > although we still have to take the per quota spinlock in
> > order to bump the ref count. Either way though, this is a
> > big improvement on what was there before.
> >
> > The qd_lock and the per super block list is preserved, for
> > the time being. However it is intended that since this is no
> > longer used for its original role, it should be possible to
> > shrink the number of items on that list in due course and
> > remove the requirement to take qd_lock in qd_get.
> >
> > Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> > Cc: Abhijith Das <adas@redhat.com>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> Interesting! I thought that Sasha Levin had a hash table in the works,
> but I don't see it, so CCing him.
>
> A few questions and comments below.
>
> Thanx, Paul
>
Thanks for the review.
[snip]
> > +#define GFS2_QD_HASH_SHIFT 12
> Should this be a function of the number of CPUs? (Might not be an issue
> if the really big systems don't use GFS.)
I'm not sure... really it depends on how many quotas are in use, so on the
number of users, and even on relatively small systems there might be a
lot of them. So I'm guessing a bit, and we'll bump it up if it's a
problem. Changing hash table sizes on the fly would be another possible
solution, but there is a lot of extra complexity in that. Either way it is a
vast improvement on what was there before :-)
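
To put rough numbers on the fixed shift, here is a minimal userspace
sketch. It is not the code from the patch: QD_HASH_SIZE, mix_bytes(),
qd_bucket() and qd_avg_chain() are hypothetical names, and the byte-wise
FNV-1a mixer is just a stand-in for whatever hash the real code uses.
It only shows that a shift of 12 gives 4096 buckets and how a
(super block, quota id) pair could be folded into a bucket index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QD_HASH_SHIFT 12                      /* same value as in the patch */
#define QD_HASH_SIZE  (1u << QD_HASH_SHIFT)   /* 4096 buckets */
#define QD_HASH_MASK  (QD_HASH_SIZE - 1)

/* Hypothetical mixer: FNV-1a over the bytes of a key, chained via seed. */
static uint32_t mix_bytes(const void *data, size_t len, uint32_t seed)
{
	const unsigned char *p = data;
	uint32_t h = seed ^ 2166136261u;

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Fold the super block pointer value and the quota id into one bucket. */
static unsigned int qd_bucket(const void *sb, uint32_t id)
{
	uint32_t h = mix_bytes(&sb, sizeof(sb), 0);

	h = mix_bytes(&id, sizeof(id), h);
	return h & QD_HASH_MASK;
}

/* Average chain length for a given number of cached quotas. */
static double qd_avg_chain(unsigned long nr_quotas)
{
	return (double)nr_quotas / QD_HASH_SIZE;
}
```

With these assumptions, a million cached quotas would average roughly
244 entries per chain, which is the sort of point at which bumping the
shift would be worth considering.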
[snip]
> > + if (!qid_eq(qd->qd_id, qid))
> > + continue;
> > + if (qd->qd_sbd != sdp)
> > + continue;
> > + if (lockref_get_not_dead(&qd->qd_lockref)) {
> > + list_lru_del(&gfs2_qd_lru, &qd->qd_lru);
> list_lru_del() acquires a lock, but it is from an array whose size
> depends on the NODES_SHIFT Kconfig variable. The array size seems to
> vary from 8 to 16 to 64 to 1024, depending on other configuration info,
> so should be OK.
Yes, we've tried to make use of the generic lru code here, and I'd like
to do the same for the glocks. However, that's more complicated, so we've
not got that far just yet, even though we have made some steps in that
direction. Dave Chinner has pointed out to us that the lru code was
designed such that the lru lock should always be the innermost lock, so
that's the ordering that we've used here.
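
For anyone not familiar with the "get not dead" step in the hunk quoted
above, the idea can be sketched in userspace with C11 atomics. This is
an illustration only and not the kernel's lockref implementation:
struct obj_ref, ref_get_not_dead(), ref_mark_dead() and the REF_DEAD
sentinel are hypothetical names. A reference is only taken if the
object has not already been marked dead, so a lookup that races with
teardown fails cleanly and the caller skips the lru removal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define REF_DEAD (-128)   /* hypothetical "dead" marker; any negative value */

struct obj_ref {
	atomic_int count;     /* >= 0: live reference count, < 0: dead */
};

/* Take a reference unless the object has been marked dead. */
static bool ref_get_not_dead(struct obj_ref *r)
{
	int old = atomic_load(&r->count);

	while (old >= 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Mark the object dead so that later lookups refuse to take a reference. */
static void ref_mark_dead(struct obj_ref *r)
{
	assert(r != NULL);
	atomic_store(&r->count, REF_DEAD);
}
```

In the lookup path this would be used as "if the get succeeds, take the
entry off the lru and return it; otherwise treat it as a miss".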
[snip]
> > @@ -1335,11 +1394,16 @@ void gfs2_quota_cleanup(struct gfs2_sbd *sdp)
> > spin_unlock(&qd->qd_lockref.lock);
> >
> > list_del(&qd->qd_list);
> > +
> > /* Also remove if this qd exists in the reclaim list */
> > list_lru_del(&gfs2_qd_lru, &qd->qd_lru);
> > atomic_dec(&sdp->sd_quota_count);
> > spin_unlock(&qd_lock);
> >
> > + spin_lock_bucket(qd->qd_hash);
> > + hlist_bl_del_rcu(&qd->qd_hlist);
>
> Might just be my unfamiliarity with this code, but it took me a bit
> to see the difference between ->qd_hlist and ->qd_list. Of course, until
> I spotted the difference, I was wondering why you were removing the
> item twice. ;-)
>
Well I hope that eventually qd_list might be able to go away. I'm still
working on a plan to deal with improving the quota data writeback which
should help to make that happen,
Steve.