Date: Mon, 19 Jun 2023 13:59:57 +0530
Message-Id: <87r0q7n9bu.fsf@doe.com>
From: Ritesh Harjani (IBM)
To: Dave Chinner, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: journal geometry is not properly bounds checked
In-Reply-To: <20230619070032.1912781-1-david@fromorbit.com>
X-Mailing-List: linux-xfs@vger.kernel.org

Dave Chinner writes:

> From: Dave Chinner
>
> If the journal geometry results in a sector or log stripe unit
> validation problem, it indicates that we cannot set the log up to
> safely write to the journal. In these cases, we must abort the
> mount because the corruption needs external intervention to resolve.
> Similarly, a journal that is too large cannot be written to safely,
> so we shouldn't allow those geometries to mount, either.
>
> If the log is too small, we risk having transaction reservations
> overrunning the available log space and the system hanging waiting
> for space it can never provide. This is purely a runtime hang issue,
> not a corruption issue as per the first cases listed above.
> We abort mounts if the log is too small for V5 filesystems, but we
> must allow v4 filesystems to mount because, historically, there was
> no log size validity checking and so some systems may still be out
> there with undersized logs.
>
> The problem is that on V4 filesystems, when we discover a log
> geometry problem, we skip all the remaining checks and then allow
> the log to continue mounting. This means that if one of the log size
> checks fails, we skip the log stripe unit check. i.e. we allow the
> mount because a "non-fatal" geometry is violated, and then fail to
> check the hard fail geometries that should fail the mount.
>
> Move all these fatal checks to the superblock verifier, and add a
> new check for the two log sector size geometry variables having the
> same values. This will prevent any attempt to mount a log that has
> invalid or inconsistent geometries long before we attempt to mount
> the log.
>
> However, for the minimum log size checks, we can only do that once
> we've set up the log and calculated all the iclog sizes and
> roundoffs. Hence this needs to remain in the log mount code after
> the log has been initialised. It is also the only case where we
> should allow a v4 filesystem to continue running, so leave that
> handling in place, too.
>
> Signed-off-by: Dave Chinner
> ---
>  fs/xfs/libxfs/xfs_sb.c | 59 +++++++++++++++++++++++++++++++++++++++++-
>  fs/xfs/xfs_log.c       | 45 ++++++++++----------------------
>  2 files changed, 72 insertions(+), 32 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index ba0f17bc1dc0..0daf6bb37741 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -412,7 +412,6 @@ xfs_validate_sb_common(
>  	    sbp->sb_inodelog < XFS_DINODE_MIN_LOG ||
>  	    sbp->sb_inodelog > XFS_DINODE_MAX_LOG ||
>  	    sbp->sb_inodesize != (1 << sbp->sb_inodelog) ||
> -	    sbp->sb_logsunit > XLOG_MAX_RECORD_BSIZE ||
>  	    sbp->sb_inopblock != howmany(sbp->sb_blocksize,sbp->sb_inodesize) ||
>  	    XFS_FSB_TO_B(mp, sbp->sb_agblocks) < XFS_MIN_AG_BYTES ||
>  	    XFS_FSB_TO_B(mp, sbp->sb_agblocks) > XFS_MAX_AG_BYTES ||
> @@ -430,6 +429,64 @@ xfs_validate_sb_common(
>  		return -EFSCORRUPTED;
>  	}
>
> +	/*
> +	 * Logs that are too large are not supported at all. Reject them
> +	 * outright. Logs that are too small are tolerated on v4 filesystems,
> +	 * but we can only check that when mounting the log. Hence we skip
> +	 * those checks here.
> +	 */
> +	if (sbp->sb_logblocks > XFS_MAX_LOG_BLOCKS) {
> +		xfs_notice(mp,
> +		"Log size 0x%x blocks too large, maximum size is 0x%llx blocks",
> +			 sbp->sb_logblocks, XFS_MAX_LOG_BLOCKS);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	if (XFS_FSB_TO_B(mp, sbp->sb_logblocks) > XFS_MAX_LOG_BYTES) {
> +		xfs_warn(mp,
> +		"log size 0x%llx bytes too large, maximum size is 0x%llx bytes",
> +			 XFS_FSB_TO_B(mp, sbp->sb_logblocks),
> +			 XFS_MAX_LOG_BYTES);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	/*
> +	 * Do not allow filesystems with corrupted log sector or stripe units to
> +	 * be mounted. We cannot safely size the iclogs or write to the log if
> +	 * the log stripe unit is not valid.
> +	 */
> +	if (sbp->sb_versionnum & XFS_SB_VERSION_SECTORBIT) {
> +		if (sbp->sb_logsectsize != (1U << sbp->sb_logsectlog)) {
> +			xfs_notice(mp,
> +			"log sector size in bytes/log2 (0x%x/0x%x) must match",
> +				sbp->sb_logsectsize, 1U << sbp->sb_logsectlog);
> +			return -EFSCORRUPTED;
> +		}
> +	} else if (sbp->sb_logsectsize || sbp->sb_logsectlog) {
> +		xfs_notice(mp,
> +		"log sector size in bytes/log2 (0x%x/0x%x) are not zero",
> +			sbp->sb_logsectsize, sbp->sb_logsectlog);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	if (sbp->sb_logsunit > 1) {
> +		if (sbp->sb_logsunit % sbp->sb_blocksize) {
> +			xfs_notice(mp,
> +		"log stripe unit %u bytes must be a multiple of block size",
> +				sbp->sb_logsunit);
> +			return -EFSCORRUPTED;
> +		}
> +		if (sbp->sb_logsunit > XLOG_MAX_RECORD_BSIZE) {
> +			xfs_notice(mp,
> +		"log stripe unit %u bytes must be a multiple of block size",
> +				sbp->sb_logsunit);

I guess this xfs_notice message needs to be corrected. It repeats the
block size message from the check above, but this check compares the
log stripe unit against XLOG_MAX_RECORD_BSIZE.

> +			return -EFSCORRUPTED;
> +		}
> +	}
> +
> +
> +
> +

Too many blank lines here ^^^

>  	/* Validate the realtime geometry; stolen from xfs_repair */
>  	if (sbp->sb_rextsize * sbp->sb_blocksize > XFS_MAX_RTEXTSIZE ||
>  	    sbp->sb_rextsize * sbp->sb_blocksize < XFS_MIN_RTEXTSIZE) {
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index fc61cc024023..397374d07f73 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -639,7 +639,6 @@ xfs_log_mount(
>  	int		num_bblks)
>  {
>  	struct xlog	*log;
> -	bool		fatal = xfs_has_crc(mp);
>  	int		error = 0;
>  	int		min_logfsbs;
>
> @@ -663,51 +662,35 @@ xfs_log_mount(
>  	mp->m_log = log;
>
>  	/*
> -	 * Validate the given log space and drop a critical message via syslog
> -	 * if the log size is too small that would lead to some unexpected
> -	 * situations in transaction log space reservation stage.
> +	 * Now that we have set up the log and its internal geometry
> +	 * parameters, we can validate the given log space and drop a critical
> +	 * message via syslog if the log size is too small.
> +	 * A log that is too small can lead to unexpected situations in
> +	 * transaction log space reservation stage. The superblock verifier
> +	 * has already validated all the other log geometry constraints, so
> +	 * we don't have to check those here.
>  	 *
> -	 * Note: we can't just reject the mount if the validation fails. This
> -	 * would mean that people would have to downgrade their kernel just to
> -	 * remedy the situation as there is no way to grow the log (short of
> -	 * black magic surgery with xfs_db).
> +	 * Note: For v4 filesystems, we can't just reject the mount if the
> +	 * validation fails. This would mean that people would have to
> +	 * downgrade their kernel just to remedy the situation as there is no
> +	 * way to grow the log (short of black magic surgery with xfs_db).
>  	 *
> -	 * We can, however, reject mounts for CRC format filesystems, as the
> +	 * We can, however, reject mounts for V5 format filesystems, as the
>  	 * mkfs binary being used to make the filesystem should never create a
>  	 * filesystem with a log that is too small.
>  	 */
>  	min_logfsbs = xfs_log_calc_minimum_size(mp);
> -
>  	if (mp->m_sb.sb_logblocks < min_logfsbs) {
>  		xfs_warn(mp,
>  		"Log size %d blocks too small, minimum size is %d blocks",
>  			 mp->m_sb.sb_logblocks, min_logfsbs);
>  		error = -EINVAL;

Are we still using this error value now?
> -	} else if (mp->m_sb.sb_logblocks > XFS_MAX_LOG_BLOCKS) {
> -		xfs_warn(mp,
> -		"Log size %d blocks too large, maximum size is %lld blocks",
> -			 mp->m_sb.sb_logblocks, XFS_MAX_LOG_BLOCKS);
> -		error = -EINVAL;
> -	} else if (XFS_FSB_TO_B(mp, mp->m_sb.sb_logblocks) > XFS_MAX_LOG_BYTES) {
> -		xfs_warn(mp,
> -		"log size %lld bytes too large, maximum size is %lld bytes",
> -			 XFS_FSB_TO_B(mp, mp->m_sb.sb_logblocks),
> -			 XFS_MAX_LOG_BYTES);
> -		error = -EINVAL;
> -	} else if (mp->m_sb.sb_logsunit > 1 &&
> -		   mp->m_sb.sb_logsunit % mp->m_sb.sb_blocksize) {
> -		xfs_warn(mp,
> -		"log stripe unit %u bytes must be a multiple of block size",
> -			 mp->m_sb.sb_logsunit);
> -		error = -EINVAL;
> -		fatal = true;
> -	}
> -	if (error) {
> +
>  		/*
>  		 * Log check errors are always fatal on v5; or whenever bad
>  		 * metadata leads to a crash.
>  		 */
> -		if (fatal) {
> +		if (xfs_has_crc(mp)) {
>  			xfs_crit(mp, "AAIEEE! Log failed size checks. Abort!");
>  			ASSERT(0);
>  			goto out_free_log;

Yes, this goto out_free_log is the only place where we return "error".
Then why not move error = -EINVAL into this if block?

-ritesh
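
The question above, about moving error = -EINVAL into the V5-only branch, can be illustrated with a small standalone sketch. This is not the patch and not kernel code: struct demo_mount and demo_check_min_log_size() are hypothetical stand-ins for struct xfs_mount and the minimum log size check in xfs_log_mount(), and the messages are printed to stderr rather than through xfs_warn()/xfs_crit(). It only shows the error code being produced on the single path that returns it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the mount state; not the kernel's struct xfs_mount. */
struct demo_mount {
	bool		has_crc;	/* true models a V5 (CRC-enabled) filesystem */
	unsigned int	logblocks;	/* log size in filesystem blocks */
};

/*
 * Model of the minimum log size check. Returns 0 if the mount may
 * proceed, or -EINVAL if it must be aborted. The error value exists
 * only on the path that actually fails the mount.
 */
static int demo_check_min_log_size(const struct demo_mount *mp,
				   unsigned int min_logfsbs)
{
	if (mp->logblocks >= min_logfsbs)
		return 0;

	fprintf(stderr,
		"Log size %u blocks too small, minimum size is %u blocks\n",
		mp->logblocks, min_logfsbs);

	/* V5: an undersized log always fails the mount. */
	if (mp->has_crc) {
		fprintf(stderr, "Log failed size checks. Abort!\n");
		return -EINVAL;
	}

	/* V4: warn loudly, but let the mount continue. */
	fprintf(stderr, "Log size out of supported range, continuing.\n");
	return 0;
}
```

Under these assumptions, a demo_mount with has_crc set and logblocks below the minimum yields -EINVAL, while the same geometry with has_crc clear yields 0 after the warnings.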