From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Whitehouse
Date: Mon, 15 Jun 2015 15:26:54 +0100
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Don't brelse rgrp buffer_heads every allocation
In-Reply-To: <1486199193.16648722.1434376611350.JavaMail.zimbra@redhat.com>
References: <40631507.11797438.1433515756742.JavaMail.zimbra@redhat.com>
 <55758830.6080405@redhat.com>
 <1353222669.13489616.1433861153801.JavaMail.zimbra@redhat.com>
 <557811B0.2050406@redhat.com>
 <2055885404.16127476.1434138634146.JavaMail.zimbra@redhat.com>
 <557EB48E.4020104@redhat.com>
 <1486199193.16648722.1434376611350.JavaMail.zimbra@redhat.com>
Message-ID: <557EE0AE.1070807@redhat.com>
List-Id: 
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

On 15/06/15 14:56, Bob Peterson wrote:
> ----- Original Message -----
>>>>> If you compare the two vmstat outputs in the bugzilla #1154782, you'll
>>>>> see no significant difference in memory usage nor cpu usage. So I assume
>>>>> the page lookup is the "slow" part; not because it's such a slow thing
>>>>> but because it's done 33 times per read-reference-invalidate (33 pages
>>>>> to look up per rgrp).
>>>>>
>>>>> Regards,
>>>>>
>>>>> Bob Peterson
>>>>> Red Hat File Systems
>>>> That's true, however, as I understand the problem here, the issue is not
>>>> reading in the blocks for the rgrp that is eventually selected to use,
>>>> but the reading in of those blocks for the rgrps that we reject, for
>>>> whatever reason (full, or congested, or whatever). So with rgrplvb
>>>> enabled, we don't then read those rgrps in off disk at all in most cases
>>>> - so I was wondering whether that solves the problem without needing
>>>> this change?
> Actually, I believe the problem is reading in the blocks for the rgrps we
> use, not the ones we reject. In this case, I think the rejected rgrps are
> pretty minimal.
>
>>> The rgrplvb mount option only helps if the file system is using lock_dlm.
>>> For lock_nolock, it's still just as slow because lock_nolock has no
>>> knowledge of lvbs. Now, granted, that's an unusual case because GFS2 is
>>> normally used with lock_dlm.
>> That sounds like a bug... it should work in the same way, even with
>> lock_nolock.
> Perhaps it is a bug in the rgrplvb code. I'll investigate the possibility.
> Until I look into the matter, all I can tell you is that the lvb option
> doesn't come near the performance of this patch. Here are some example runs:
>
> Stock kernel with -r128:
>              kB  reclen    write
>         2097152      32   213428
>         2097152      64   199363
>         2097152     128   202046
>         2097152     256   212355
>         2097152     512   228691
>         2097152    1024   216815
>
> Stock kernel with -r2048:
>              kB  reclen    write
>         2097152      32   150471
>         2097152      64   166858
>         2097152     128   165517
>         2097152     256   168206
>         2097152     512   163427
>         2097152    1024   158296
>
> Stock kernel with -r2048 and -o rgrplvb:
>              kB  reclen    write
>         2097152      32   167268
>         2097152      64   165654
>         2097152     128   166783
>         2097152     256   164070
>         2097152     512   166561
>         2097152    1024   166933
>
> With my patch and -r2048:
>              kB  reclen    write
>         2097152      32   196209
>         2097152      64   224383
>         2097152     128   223108
>         2097152     256   228552
>         2097152     512   224295
>         2097152    1024   229110
>
> With my patch and -r2048 and -o rgrplvb:
>              kB  reclen    write
>         2097152      32   214281
>         2097152      64   227061
>         2097152     128   226949
>         2097152     256   229651
>         2097152     512   229196
>         2097152    1024   226651
>
> I'll see if I can track down why the rgrplvb option isn't performing as well.
> I suspect the matter goes back to my first comment above.
> Namely, that the slowdown goes back to the slowness of page cache lookup
> for the buffers of the rgrps we are using (not rejected ones).

I'm assuming that these figures are bandwidth rather than times, since that
appears to show that the patch makes quite a large difference. However, the
reclen is rather small. In the 32 byte case, that's 128 writes for each new
block that's being allocated, unless of course that is 32k?

Steve.

> Regards,
>
> Bob Peterson
> Red Hat File Systems
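
A rough illustration of where the "33 pages to look up per rgrp" figure
quoted above comes from, and of why the -r2048 resource groups are so much
heavier than the -r128 ones. This is a back-of-the-envelope sketch only: it
assumes a 4 KiB filesystem block size, 2 bitmap bits per block (4 blocks per
bitmap byte) and roughly a 24-byte metadata header per bitmap block, none of
which is stated in the thread itself.

# Rough estimate of how many bitmap blocks (pages) one GFS2 resource group
# spans, for the two rgrp sizes (-r128 and -r2048, in MB) used in the runs.
# Assumptions (not from this thread): 4 KiB filesystem blocks, 2 bitmap
# bits per block, ~24-byte header per bitmap block.

BLOCK_SIZE = 4096           # bytes per filesystem block (assumed)
BLOCKS_PER_BITMAP_BYTE = 4  # 2 bits of bitmap per block
BITMAP_HEADER = 24          # approximate metadata header per bitmap block

def bitmap_blocks_per_rgrp(rgrp_mb):
    data_blocks = rgrp_mb * 1024 * 1024 // BLOCK_SIZE
    blocks_per_bitmap_block = (BLOCK_SIZE - BITMAP_HEADER) * BLOCKS_PER_BITMAP_BYTE
    return -(-data_blocks // blocks_per_bitmap_block)  # ceiling division

for rgrp_mb in (128, 2048):
    print("-r%d: ~%d bitmap blocks per rgrp"
          % (rgrp_mb, bitmap_blocks_per_rgrp(rgrp_mb)))

# Output (approximately):
#   -r128: ~3 bitmap blocks per rgrp
#   -r2048: ~33 bitmap blocks per rgrp   <- the "33 pages" mentioned above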
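
Similarly, a minimal sketch of the record-size arithmetic in the reply
above. It assumes a 4 KiB block size; whether iozone's reclen column is in
bytes or in kB is exactly the open question being asked, so both readings
are shown.

# How many write() calls land in each newly allocated 4 KiB block, for the
# two possible readings of the reclen column (bytes vs. kB).

BLOCK_SIZE = 4096  # bytes per filesystem block (assumed)

for reclen, label in ((32, "reclen = 32 bytes"), (32 * 1024, "reclen = 32 kB")):
    if reclen <= BLOCK_SIZE:
        print("%s: %d writes per newly allocated block"
              % (label, BLOCK_SIZE // reclen))
    else:
        print("%s: each write spans %d blocks"
              % (label, reclen // BLOCK_SIZE))

# Output:
#   reclen = 32 bytes: 128 writes per newly allocated block
#   reclen = 32 kB: each write spans 8 blocks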