cluster-devel.redhat.com archive mirror
From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Don't brelse rgrp buffer_heads every allocation
Date: Wed, 10 Jun 2015 11:30:08 +0100
Message-ID: <557811B0.2050406@redhat.com>
In-Reply-To: <1353222669.13489616.1433861153801.JavaMail.zimbra@redhat.com>

Hi,


On 09/06/15 15:45, Bob Peterson wrote:
> ----- Original Message -----
>> Hi,
>>
>>
>> On 05/06/15 15:49, Bob Peterson wrote:
>>> Hi,
>>>
>>> This patch allows the block allocation code to retain the buffers
>>> for the resource groups so they don't need to be re-read from buffer
>>> cache with every request. This is a performance improvement that's
>>> especially noticeable when resource groups are very large. For
>>> example, with 2GB resource groups and 4K blocks, there can be 33
>>> blocks for every resource group. This patch allows those 33 buffers
>>> to be kept around and not read in and thrown away with every
>>> operation. The buffers are released when the resource group is
>>> either synced or invalidated.
>> The blocks should be cached between operations, so this should only be
>> resulting in a skip of the look up of the cached block, and no changes
>> to the actual I/O. Does that mean that grab_cache_page() is slow I
>> wonder? Or is this an issue of going around the retry loop due to lack
>> of memory at some stage?
>>
>> How does this interact with the rgrplvb support? I'd guess that with
>> that turned on, this is no longer an issue, because we'd only read in
>> the blocks for the rgrps that we are actually going to use?
>>
>>
>>
>> Steve.
> Hi,
>
> If you compare the two vmstat outputs in the bugzilla #1154782, you'll
> see no significant difference in memory usage nor cpu usage. So I assume
> the page lookup is the "slow" part; not because it's such a slow thing
> but because it's done 33 times per read-reference-invalidate (33 pages
> to look up per rgrp).
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems

That's true. However, as I understand the problem here, the issue is not
reading in the blocks for the rgrp that is eventually selected for use,
but reading in the blocks for the rgrps that we reject for whatever
reason (full, congested, and so on). With rgrplvb enabled, we don't read
those rgrps in off disk at all in most cases, so I was wondering whether
that solves the problem without needing this change?

Ideally I'd like to make the rgrplvb setting the default, since it is 
much more efficient. The question is how we can do that and still remain 
backward compatible? Not an easy one to answer :(

Also, if the page lookup is the slow thing, then we should look at using
pagevec_lookup() to get the pages in chunks rather than looking them up
individually (and indeed, multiple times per page, in the case of a
block size smaller than the page size). We know that the blocks will
always be contiguous on disk, so we should be able to send down large
I/Os, rather than relying on the block stack to merge them as we do at
the moment, which should be a further improvement too.

Steve.


