From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Price
Date: Thu, 06 Jun 2013 14:04:02 +0100
Subject: [Cluster-devel] [PATCH 2/4] mkfs.gfs2: Align resource groups to RAID stripes
In-Reply-To: <83456593.48037538.1370523424334.JavaMail.root@redhat.com>
References: <1370520213-29676-1-git-send-email-anprice@redhat.com>
 <1370520213-29676-2-git-send-email-anprice@redhat.com>
 <83456593.48037538.1370523424334.JavaMail.root@redhat.com>
Message-ID: <51B088C2.4080701@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On 06/06/13 13:57, Bob Peterson wrote:
> Hi,
>
> | +	/* Squeeze the last 1 or 2 rgs into the remaining space */
> | +	if ((nextaddr < sdp->device.length) && (sdp->device.length - nextaddr >=
> | minrgsz)) {
> | +		rglen = sdp->device.length - nextaddr;
> | +	} else {
> | +		if (sdp->device.length - rgaddr <= maxrgsz)
> | +			rgt->length = sdp->device.length - rgaddr;
> | +		else
> | +			rgt->length = maxrgsz;
> | +		/* This is the last rg */
> | +		nextaddr = 0;
>
> In GFS1, we allowed mix-and-match resource group sizes, but we originally
> designed mkfs.gfs2 to ensure that all rgrps were the same uniform size. This
> usually means some space is wasted at the end of the last resource group.
>
> We did this primarily so that fsck.gfs2 could more easily detect and repair
> damaged resource groups and rindex values. At the time it was designed, I got
> the buy-in of a bunch of developers and we all agreed to it. Since that time,
> I've had to change fsck.gfs2 to take more drastic measures to repair damaged
> resource groups, due to the fact that gfs2_convert can convert a GFS1 file
> system to GFS2, and thus, we can still end up with non-uniform resource groups.
> Many customers were adding storage and doing multiple gfs_grow ops,
> which resulted in metadata sets where the rgrps and rindex were complete chaos.
>
> Still, my assumption has always been: If the file system was made by
> mkfs.gfs2, all resource groups (but the first one) are identical in size.
>
> I think gfs2_grow takes some steps to ensure that new rgrps are also created
> using the same size as the current resource groups. If we don't enforce
> that rule, the rindex could once again become chaos, which means our chances
> of rgrp and rindex repair get worse.
>
> Do we still want to enforce this rule?

Good question. I had assumed that we don't have a rule like that as the
rindex specifies the rg sizes. My next planned mkfs change is to allow
the journal creation code to ask for a resource group large enough to
contain all of a journal's data blocks so that they're always a single
extent. Returning to enforcing the rule would have implications for that
plan, too.

Andy

> With the improved rgrp repair algorithms in fsck.gfs2, it may not be
> necessary anymore. I'm not trying to be dogmatic; I'm looking for opinions here.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
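
The space cost of the uniform-size rule discussed above can be sketched
with a little arithmetic. This is only an illustration, not code from
mkfs.gfs2 or from the patch: struct uniform_layout, uniform_rg_layout()
and its parameters are hypothetical names, all values are in file system
blocks, and the sketch ignores that the first resource group may differ
in size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how many equal-sized rgs fit, and what is left. */
struct uniform_layout {
	uint64_t rg_count;      /* number of whole, equal-sized resource groups */
	uint64_t unused_blocks; /* blocks left over after the last whole rg */
};

/*
 * Assumed model: resource groups start at first_rg_addr and every rg is
 * exactly rg_size blocks long. Anything that does not fill a whole rg at
 * the end of the device is counted as unused.
 */
static struct uniform_layout uniform_rg_layout(uint64_t dev_len,
                                               uint64_t first_rg_addr,
                                               uint64_t rg_size)
{
	struct uniform_layout layout;
	uint64_t space;

	assert(rg_size > 0);
	assert(first_rg_addr <= dev_len);

	space = dev_len - first_rg_addr;
	layout.rg_count = space / rg_size;
	layout.unused_blocks = space % rg_size;
	return layout;
}
```

For example, under these assumptions a 1000-block device with the first
rg at block 100 and 256-block rgs holds 3 rgs and leaves 132 blocks
unused. That remainder is the kind of space a non-uniform final rg could
absorb, at the cost of the size rule that fsck.gfs2 has relied on.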