linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 00/42] mkfs: factor the crap out of the code
Date: Wed, 25 Oct 2017 11:59:32 +1100	[thread overview]
Message-ID: <20171025005932.GY3666@dastard> (raw)
In-Reply-To: <6cb1a1fa-15b2-6280-ba58-af39f496eb56@sandeen.net>

On Mon, Oct 23, 2017 at 10:00:26PM -0500, Eric Sandeen wrote:
> Ok more review, sorry this is so dispersed.
> 
> [PATCH 14/42] mkfs: factor printing of mkfs config
> 
> Oh good.  :)
> 
> [PATCH 15/42] mkfs: factor in memory superblock setup
> 
> Heh, I read that as factor in vs factor out ;)
> 
> This seems to change logic slightly, is it intentional?
> 
> from:
> 
> -	if (sb_feat.inode_align) {
> -		int	cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
> -		if (sb_feat.crcs_enabled)
> -			cluster_size *= isize / XFS_DINODE_MIN_SIZE;
> -		sbp->sb_inoalignmt = cluster_size >> blocklog;
> -		sb_feat.inode_align = sbp->sb_inoalignmt != 0;
> -	}
> 
> to
> 
> +	if (cfg->sb_feat.inode_align) {
> +		int	cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
> +		if (cfg->sb_feat.crcs_enabled)
> +			cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;
> +		sbp->sb_inoalignmt = cluster_size >> cfg->blocklog;
> +	}
> 
> it loses the test for inalignmt being 0 and turning off inode_align...

Ok, so there's a few things here. When you look at the kernel and
it decides on inode alignment, it does:

int
xfs_ialloc_cluster_alignment(
        struct xfs_mount        *mp)
{
        if (xfs_sb_version_hasalign(&mp->m_sb) &&
            mp->m_sb.sb_inoalignmt >= xfs_icluster_size_fsb(mp))
                return mp->m_sb.sb_inoalignmt;
        return 1;
}

xfs_set_inoalignment() has a similar check before using
sb_inoalignmt to create the alignment mask, so the kernel doesn't
actually care that the alignment is 0 because it doesn't align
inodes if it's less than the cluster buffer size....

The tricky one is sparse inodes, because it requires sb_inoalignmt
to be equal to the size of the chunk. The kernel does this at mount
time:

        /*
         * Full inode chunks must be aligned to inode chunk size when
         * sparse inodes are enabled to support the sparse chunk
         * allocation algorithm and prevent overlapping inode records.
         */
        if (xfs_sb_version_hassparseinodes(sbp)) {
                uint32_t        align;

                align = XFS_INODES_PER_CHUNK * sbp->sb_inodesize
                                >> sbp->sb_blocklog;
                if (sbp->sb_inoalignmt != align) {
                        xfs_warn(mp,
"Inode block alignment (%u) must match chunk size (%u) for sparse inodes.",
                                 sbp->sb_inoalignmt, align);
                        return -EINVAL;
                }
        }


If we take the case of 64k block size, we have blocklog = 16, and so:

	align	= (64 * 512) >> 16
		= 2^15 >> 16
		= 0

So sparse inodes requires sbp->sb_inoalignmt == 0 when a 64k block
size is in use. It doesn't care about the align feature bit, though.

What I just noticed is this in xfs_align_sparse_ino():

        agbno = XFS_AGINO_TO_AGBNO(mp, *startino);
        mod = agbno % mp->m_sb.sb_inoalignmt;
        if (!mod)
                return;

if sb_inoalignmt is zero - and it will be for 64k block sizes and
inode sizes <= 512 bytes - then the result here should be a divide
by zero error. The only reason we haven't seen this is that we
can't allocate a sparse inode cluster on 64k block size filesystems
when the inode size <= 1k.....

So AFAICT, the inode alignment feature bit just doesn't matter when
the alignment is set to zero because the block size is larger than
the inode cluster size.

> if you deemed that impossible, a commit log comment would be helpful.
> If not, maybe it's a bug.  :)  The test & set goes back to the mists
> of time, so I'm not sure of its purpose.

Oh, and of course, I forgot the simple reason: we always use aligned
inodes on v5 filesystems, and we've defaulted to setting it on v4 fs
for donkey's years, so AFAIC there's no reason it should ever not be
set because the kernel has to handle alignment < cluster size sanely
anyway....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

      reply	other threads:[~2017-10-25  0:59 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-29 23:50 [PATCH 00/42] mkfs: factor the crap out of the code Dave Chinner
2017-08-29 23:50 ` [PATCH 01/42] mkfs: can't specify sector size of internal log Dave Chinner
2017-08-29 23:50 ` [PATCH 02/42] mkfs: make subopt table const Dave Chinner
2017-08-29 23:50 ` [PATCH 03/42] mkfs: introduce a structure to hold CLI options Dave Chinner
2017-08-29 23:50 ` [PATCH 04/42] mkfs: add generic subopt parsing table Dave Chinner
2017-08-29 23:50 ` [PATCH 05/42] mkfs: factor block subopts parser Dave Chinner
2017-08-29 23:50 ` [PATCH 06/42] mkfs: factor data " Dave Chinner
2017-08-29 23:50 ` [PATCH 07/42] mkfs: factor inode " Dave Chinner
2017-08-29 23:50 ` [PATCH 08/42] mkfs: factor log " Dave Chinner
2017-08-29 23:50 ` [PATCH 09/42] mkfs: factor meta " Dave Chinner
2017-08-29 23:50 ` [PATCH 10/42] mkfs: factor naming " Dave Chinner
2017-08-29 23:50 ` [PATCH 11/42] mkfs: factor rt " Dave Chinner
2017-08-29 23:50 ` [PATCH 12/42] mkfs: factor sector " Dave Chinner
2017-08-29 23:50 ` [PATCH 13/42] mkfs: Introduce mkfs configuration structure Dave Chinner
2017-08-29 23:50 ` [PATCH 14/42] mkfs: factor printing of mkfs config Dave Chinner
2017-08-29 23:50 ` [PATCH 15/42] mkfs: factor in memory superblock setup Dave Chinner
2017-08-29 23:50 ` [PATCH 16/42] mkfs: factor out device preparation Dave Chinner
2017-08-29 23:50 ` [PATCH 17/42] mkfs: factor writing AG headers Dave Chinner
2017-08-29 23:50 ` [PATCH 18/42] mkfs: factor secondary superblock updates Dave Chinner
2017-08-29 23:50 ` [PATCH 19/42] mkfs: introduce default configuration structure Dave Chinner
2017-08-29 23:50 ` [PATCH 20/42] mkfs: rename top level CLI parameters Dave Chinner
2017-08-29 23:50 ` [PATCH 21/42] mkfs: factor sectorsize validation Dave Chinner
2017-08-29 23:50 ` [PATCH 22/42] mkfs: factor blocksize validation Dave Chinner
2017-08-29 23:50 ` [PATCH 23/42] mkfs: factor log sector size validation Dave Chinner
2017-08-29 23:50 ` [PATCH 24/42] mkfs: factor superblock feature validation Dave Chinner
2017-08-29 23:50 ` [PATCH 25/42] mkfs: factor directory blocksize validation Dave Chinner
2017-08-29 23:50 ` [PATCH 26/42] mkfs: factor inode size validation Dave Chinner
2017-08-29 23:50 ` [PATCH 27/42] mkfs: factor out device size calculations Dave Chinner
2017-08-29 23:50 ` [PATCH 28/42] mkfs: fix hidden parameter in DTOBT() Dave Chinner
2017-08-29 23:50 ` [PATCH 29/42] mkfs: factor rtdev extent size validation Dave Chinner
2017-08-29 23:50 ` [PATCH 30/42] mkfs: rework stripe calculations Dave Chinner
2017-08-29 23:50 ` [PATCH 31/42] mkfs: factor device opening Dave Chinner
2017-08-29 23:50 ` [PATCH 32/42] mkfs: factor data device validation Dave Chinner
2017-08-29 23:50 ` [PATCH 33/42] mkfs: factor log " Dave Chinner
2017-08-29 23:50 ` [PATCH 34/42] mkfs: factor rt " Dave Chinner
2017-08-29 23:50 ` [PATCH 35/42] mkfs: factor AG geometry calculations Dave Chinner
2017-08-29 23:50 ` [PATCH 36/42] mkfs: factor AG alignment Dave Chinner
2017-08-30 23:44   ` Dave Chinner
2017-08-29 23:50 ` [PATCH 37/42] mkfs: rework imaxpct calculation Dave Chinner
2017-08-29 23:50 ` [PATCH 38/42] mkfs: factor initial mount setup Dave Chinner
2017-08-29 23:50 ` [PATCH 39/42] mkfs: factor log size calculations Dave Chinner
2017-09-05  5:23   ` Dave Chinner
2017-08-29 23:50 ` [PATCH 40/42] mkfs: cleanup redundant temporary code Dave Chinner
2017-08-29 23:50 ` [PATCH 41/42] mkfs: move error functions Dave Chinner
2017-08-29 23:50 ` [PATCH 42/42] mkfs: tidy up definitions Dave Chinner
2017-08-30  1:23 ` [PATCH 00/42] mkfs: factor the crap out of the code Darrick J. Wong
2017-08-30  1:57   ` Dave Chinner
2017-08-30  4:16 ` Luis R. Rodriguez
2017-08-30  5:44   ` Dave Chinner
2017-08-30 22:10     ` Luis R. Rodriguez
2017-08-30 23:22       ` Dave Chinner
2017-08-31  0:05         ` Luis R. Rodriguez
2017-08-31 16:23     ` Jan Tulak
2017-08-30  7:44 ` Martin Steigerwald
2017-09-04 12:31 ` Chandan Rajendra
2017-09-04 15:34   ` Eric Sandeen
2017-09-04 22:40   ` Dave Chinner
2017-09-07 10:31 ` Chandan Rajendra
2017-09-07 23:38   ` Dave Chinner
2017-09-09 10:24 ` Chandan Rajendra
2017-09-15  9:42 ` Jan Tulak
2017-09-16 11:29   ` Dave Chinner
2017-10-24  3:00 ` Eric Sandeen
2017-10-25  0:59   ` Dave Chinner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171025005932.GY3666@dastard \
    --to=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).