From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Whitehouse
Date: Mon, 15 Jun 2015 15:26:54 +0100
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Don't brelse rgrp buffer_heads every allocation
In-Reply-To: <1486199193.16648722.1434376611350.JavaMail.zimbra@redhat.com>
References: <40631507.11797438.1433515756742.JavaMail.zimbra@redhat.com>
 <55758830.6080405@redhat.com>
 <1353222669.13489616.1433861153801.JavaMail.zimbra@redhat.com>
 <557811B0.2050406@redhat.com>
 <2055885404.16127476.1434138634146.JavaMail.zimbra@redhat.com>
 <557EB48E.4020104@redhat.com>
 <1486199193.16648722.1434376611350.JavaMail.zimbra@redhat.com>
Message-ID: <557EE0AE.1070807@redhat.com>
List-Id: 
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

On 15/06/15 14:56, Bob Peterson wrote:
> ----- Original Message -----
>>>>> If you compare the two vmstat outputs in the bugzilla #1154782, you'll
>>>>> see no significant difference in memory usage nor cpu usage. So I assume
>>>>> the page lookup is the "slow" part; not because it's such a slow thing
>>>>> but because it's done 33 times per read-reference-invalidate (33 pages
>>>>> to look up per rgrp).
>>>>>
>>>>> Regards,
>>>>>
>>>>> Bob Peterson
>>>>> Red Hat File Systems
>>>> That's true, however, as I understand the problem here, the issue is not
>>>> reading in the blocks for the rgrp that is eventually selected to use,
>>>> but the reading in of those blocks for the rgrps that we reject, for
>>>> whatever reason (full, or congested, or whatever). So with rgrplvb
>>>> enabled, we don't then read those rgrps in off disk at all in most cases
>>>> - so I was wondering whether that solves the problem without needing
>>>> this change?
> Actually, I believe the problem is reading in the blocks for the rgrps we
> use, not the ones we reject. In this case, I think the rejected rgrps are
> pretty minimal.
>
>>> The rgrplvb mount option only helps if the file system is using lock_dlm.
>>> For lock_nolock, it's still just as slow because lock_nolock has no
>>> knowledge of lvbs. Now, granted, that's an unusual case because GFS2 is
>>> normally used with lock_dlm.
>> That sounds like a bug... it should work in the same way, even with
>> lock_nolock.
> Perhaps it is a bug in the rgrplvb code. I'll investigate the possibility.
> Until I look into the matter, all I can tell you is that the lvb option
> doesn't come near the performance of this patch. Here are some example runs:
>
> Stock kernel with -r128:
>              kB  reclen    write
>         2097152      32   213428
>         2097152      64   199363
>         2097152     128   202046
>         2097152     256   212355
>         2097152     512   228691
>         2097152    1024   216815
>
> Stock kernel with -r2048:
>              kB  reclen    write
>         2097152      32   150471
>         2097152      64   166858
>         2097152     128   165517
>         2097152     256   168206
>         2097152     512   163427
>         2097152    1024   158296
>
> Stock kernel with -r2048 and -o rgrplvb:
>              kB  reclen    write
>         2097152      32   167268
>         2097152      64   165654
>         2097152     128   166783
>         2097152     256   164070
>         2097152     512   166561
>         2097152    1024   166933
>
> With my patch and -r2048:
>              kB  reclen    write
>         2097152      32   196209
>         2097152      64   224383
>         2097152     128   223108
>         2097152     256   228552
>         2097152     512   224295
>         2097152    1024   229110
>
> With my patch and -r2048 and -o rgrplvb:
>              kB  reclen    write
>         2097152      32   214281
>         2097152      64   227061
>         2097152     128   226949
>         2097152     256   229651
>         2097152     512   229196
>         2097152    1024   226651
>
> I'll see if I can track down why the rgrplvb option isn't performing as well.
> I suspect the matter goes back to my first comment above.
> Namely, that the slowdown goes back to the slowness of page cache lookup
> for the buffers of the rgrps we are using (not rejected ones).

I'm assuming that these figures are bandwidth rather than times, since that
appears to show that the patch makes quite a large difference. However, the
reclen is rather small. In the 32 byte case, that's 128 writes for each new
block that's being allocated, unless of course that is 32k?

Steve.

> Regards,
>
> Bob Peterson
> Red Hat File Systems
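
A rough illustration of where the "33 pages to look up per rgrp" figure
quoted above comes from, and of why the -r2048 resource groups are so much
heavier than the -r128 ones. This is a back-of-the-envelope sketch only: it
assumes a 4 KiB filesystem block size, 2 bitmap bits per block (4 blocks per
bitmap byte) and roughly a 24-byte metadata header per bitmap block, none of
which is stated in the thread itself.

# Rough estimate of how many bitmap blocks (pages) one GFS2 resource group
# spans, for the two rgrp sizes (-r128 and -r2048, in MB) used in the runs.
# Assumptions (not from this thread): 4 KiB filesystem blocks, 2 bitmap
# bits per block, ~24-byte header per bitmap block.

BLOCK_SIZE = 4096           # bytes per filesystem block (assumed)
BLOCKS_PER_BITMAP_BYTE = 4  # 2 bits of bitmap per block
BITMAP_HEADER = 24          # approximate metadata header per bitmap block

def bitmap_blocks_per_rgrp(rgrp_mb):
    data_blocks = rgrp_mb * 1024 * 1024 // BLOCK_SIZE
    blocks_per_bitmap_block = (BLOCK_SIZE - BITMAP_HEADER) * BLOCKS_PER_BITMAP_BYTE
    return -(-data_blocks // blocks_per_bitmap_block)  # ceiling division

for rgrp_mb in (128, 2048):
    print("-r%d: ~%d bitmap blocks per rgrp"
          % (rgrp_mb, bitmap_blocks_per_rgrp(rgrp_mb)))

# Output (approximately):
#   -r128: ~3 bitmap blocks per rgrp
#   -r2048: ~33 bitmap blocks per rgrp   <- the "33 pages" mentioned above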
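
Similarly, a minimal sketch of the record-size arithmetic in the reply
above. It assumes a 4 KiB block size; whether iozone's reclen column is in
bytes or in kB is exactly the open question being asked, so both readings
are shown.

# How many write() calls land in each newly allocated 4 KiB block, for the
# two possible readings of the reclen column (bytes vs. kB).

BLOCK_SIZE = 4096  # bytes per filesystem block (assumed)

for reclen, label in ((32, "reclen = 32 bytes"), (32 * 1024, "reclen = 32 kB")):
    if reclen <= BLOCK_SIZE:
        print("%s: %d writes per newly allocated block"
              % (label, BLOCK_SIZE // reclen))
    else:
        print("%s: each write spans %d blocks"
              % (label, reclen // BLOCK_SIZE))

# Output:
#   reclen = 32 bytes: 128 writes per newly allocated block
#   reclen = 32 kB: each write spans 8 blocks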