From: Dave Chinner <david@fromorbit.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] dio: scale unaligned IO tracking via multiple lists
Date: Wed, 10 Nov 2010 10:06:27 +1100
Message-ID: <20101109230627.GP2715@dastard>
In-Reply-To: <x4962w6439y.fsf@segfault.boston.devel.redhat.com>

On Tue, Nov 09, 2010 at 04:04:41PM -0500, Jeff Moyer wrote:
> Dave Chinner <david@fromorbit.com> writes:
>
> > On Mon, Nov 08, 2010 at 10:36:06AM -0500, Jeff Moyer wrote:
> >> Dave Chinner <david@fromorbit.com> writes:
> >>
> >> > From: Dave Chinner <dchinner@redhat.com>
> >> >
> >> > To avoid concerns that a single list and lock tracking the unaligned
> >> > IOs will not scale appropriately, create multiple lists and locks
> >> > and choose them by hashing the unaligned block being zeroed.
> >> >
> >> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> >> > ---
> >> > fs/direct-io.c | 49 ++++++++++++++++++++++++++++++++++++-------------
> >> > 1 files changed, 36 insertions(+), 13 deletions(-)
> >> >
> >> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> >> > index 1a69efd..353ac52 100644
> >> > --- a/fs/direct-io.c
> >> > +++ b/fs/direct-io.c
> >> > @@ -152,8 +152,28 @@ struct dio_zero_block {
> >> > atomic_t ref; /* reference count */
> >> > };
> >> >
> >> > -static DEFINE_SPINLOCK(dio_zero_block_lock);
> >> > -static LIST_HEAD(dio_zero_block_list);
> >> > +#define DIO_ZERO_BLOCK_NR 37LL
> >>
> >> I'm always curious to know how these numbers are derived. Why 37?
> >
> > It's a prime number large enough to give enough lists to minimise
> > contention whilst providing decent distribution for 8 byte aligned
> > addresses with low overhead. XFS uses the same sort of waitqueue
> > hashing for global IO completion wait queues used by truncation
> > and inode eviction (see xfs_ioend_wait()).
> >
> > Seemed reasonable (and simple!) just to copy that design pattern
> > for another global IO completion wait queue....
>
> OK. I just had our performance team record some statistics for me on an
> unmodified kernel during an OLTP-type workload. I've attached the
> systemtap script that I had them run. I wanted to see just how common
> the sub-page-block zeroing was, and I was frightened to find that, in a
> 10 minute period, over 1.2 million calls were recorded. If we're
> lucky, my script is buggy. Please give it a look-see.

Well, it's just counting how many blocks are candidates for zeroing
on entry to dio_zero_block(), i.e. the function gets called on every
newly allocated block at the start of an IO. Your result only implies
that there were 1.2 million IOs requiring allocation in ten minutes,
because it is the next check in dio_zero_block():

	dio_blocks_per_fs_block = 1 << dio->blkfactor;
	this_chunk_blocks = dio->block_in_file & (dio_blocks_per_fs_block - 1);

	if (!this_chunk_blocks)
		return;

that determines whether the IO is unaligned and zeroing is really
necessary. Your script needs to take this into account, not just
count the number of times the function is called with a new buffer.
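
To make the distinction concrete, here is a minimal userspace sketch
of that check. It is not the kernel code: struct dio_sketch,
dio_needs_zeroing(), struct zero_stats and zero_stats_record() are
hypothetical names made up for illustration, and only the two fields
the quoted check reads are modelled. It assumes blkfactor is small
enough that the shift does not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct dio fields the check reads. */
struct dio_sketch {
	unsigned int blkfactor;	/* log2(fs block size / dio block size) */
	uint64_t block_in_file;	/* start of the IO, in dio-block units */
};

/*
 * Same arithmetic as the quoted check: the IO needs sub-block zeroing
 * only if it does not start on a filesystem block boundary.
 */
static bool dio_needs_zeroing(const struct dio_sketch *dio)
{
	uint64_t dio_blocks_per_fs_block = UINT64_C(1) << dio->blkfactor;
	uint64_t this_chunk_blocks =
		dio->block_in_file & (dio_blocks_per_fs_block - 1);

	return this_chunk_blocks != 0;
}

/* Keep both numbers so calls and real zeroing can be compared. */
struct zero_stats {
	uint64_t calls;		/* every entry with a new block */
	uint64_t unaligned;	/* entries that pass the alignment check */
};

static void zero_stats_record(struct zero_stats *stats,
			      const struct dio_sketch *dio)
{
	stats->calls++;
	if (dio_needs_zeroing(dio))
		stats->unaligned++;
}
```

With blkfactor = 3 (eight dio blocks per filesystem block), an IO
starting at block 16 bumps only "calls", while one starting at block
19 bumps both counters. The second counter is the one a fixed script
would need to report.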
> I'm all ears for next steps. We can check to see how deep the hash
> chains get. We could also ask the folks at Intel to run this through
> their database testing rig to get a quantification of the overhead.
>
> What do you think?

Let's run a fixed script first - if databases are really doing so
much unaligned sub-block IO, then they need to be fixed as a matter
of major priority because they are doing far more IO than they need
to be....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com