public inbox for linux-ext4@vger.kernel.org
From: Jan Kara <jack@suse.cz>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org, jlayton@kernel.org, tytso@mit.edu,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE
Date: Thu, 28 Jul 2022 18:53:32 +0200
Message-ID: <20220728165332.cu2kiduob2xyvoep@quack3>
In-Reply-To: <20220728133914.49890-2-lczerner@redhat.com>

On Thu 28-07-22 15:39:14, Lukas Czerner wrote:
> Currently the I_DIRTY_TIME will never get set if the inode already has
> I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME.  That's
> true, however ext4 will only update the on-disk inode in
> ->dirty_inode(), not on actual writeback. As a result if the inode
> already has I_DIRTY_INODE state by the time we get to
> __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled
> into on-disk inode and will not get updated until the next I_DIRTY_INODE
> update, which might never come if we crash or get a power failure.
> 
> The problem can be reproduced on ext4 by running xfstest generic/622
> with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if
> the inode already has I_DIRTY_INODE.

As a data point, I've checked and XFS has the very same problem as ext4.

> Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never
> get cleared.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/fs-writeback.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 05221366a16d..174f01e6b912 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags)
>  
>  		/* I_DIRTY_INODE supersedes I_DIRTY_TIME. */
>  		flags &= ~I_DIRTY_TIME;
> +		if (inode->i_state & I_DIRTY_TIME) {
> +			spin_lock(&inode->i_lock);
> +			inode->i_state &= ~I_DIRTY_TIME;
> +			spin_unlock(&inode->i_lock);
> +		}

Hum, so this is a bit dangerous because inode->i_state may become
inconsistent with the writeback list the inode is queued in
(wb->b_dirty_time), and these two are supposed to be in sync. So I rather
think we need to make sure we go through the full round of 'update flags
and writeback list' below in case we need to clear I_DIRTY_TIME from
inode->i_state.

>  	} else {
>  		/*
>  		 * Else it's either I_DIRTY_PAGES, I_DIRTY_TIME, or nothing.
> @@ -2399,13 +2404,20 @@ void __mark_inode_dirty(struct inode *inode, int flags)
>  	 */
>  	smp_mb();
>  
> -	if (((inode->i_state & flags) == flags) ||
> -	    (dirtytime && (inode->i_state & I_DIRTY_INODE)))
> +	if ((inode->i_state & flags) == flags)
>  		return;
>  
>  	spin_lock(&inode->i_lock);
> -	if (dirtytime && (inode->i_state & I_DIRTY_INODE))
> +	if (dirtytime && (inode->i_state & I_DIRTY_INODE)) {
> +		/*
> +		 * We've got a new lazytime update. Make sure it's recorded in
> +		 * i_state, because the time might have already got updated in
> +		 * ->dirty_inode() and will not get updated until next
> +		 *  I_DIRTY_INODE update.
> +		 */
> +		inode->i_state |= I_DIRTY_TIME;
>  		goto out_unlock_inode;
> +	}

So I'm afraid this combination (I_DIRTY_INODE together with I_DIRTY_TIME)
is not properly handled in writeback_single_inode(), where we have at the
end:

        if (!(inode->i_state & I_DIRTY_ALL))
                inode_cgwb_move_to_attached(inode, wb);
        else if (!(inode->i_state & I_SYNC_QUEUED) &&
                 (inode->i_state & I_DIRTY))
                redirty_tail_locked(inode, wb);

So an inode that had I_DIRTY_SYNC | I_DIRTY_TIME will not be properly
refiled to the wb->b_dirty_time list after writeback is done and
I_DIRTY_SYNC gets cleared: with only I_DIRTY_TIME left, neither branch
above is taken.

So we need to refine it to something like:

	if (!(inode->i_state & I_DIRTY_ALL))
		inode_cgwb_move_to_attached(inode, wb);
	else if (!(inode->i_state & I_SYNC_QUEUED)) {
		if (inode->i_state & I_DIRTY) {
			redirty_tail_locked(inode, wb);
		} else if (inode->i_state & I_DIRTY_TIME) {
			inode->dirtied_when = jiffies;
			inode_io_list_move_locked(inode, wb, &wb->b_dirty_time);
		}
	}
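
To make the difference between the two variants concrete, here is a minimal
userspace sketch of just the refiling decision. It is not kernel code: the
flag bit values are illustrative, the writeback lists are reduced to an
enum, and the names wb_list, clear_written(), refile_current(),
refile_proposed() and check_scenario() are hypothetical helpers made up for
this illustration. It also assumes that I_DIRTY_TIME has not expired, so it
survives writeback, and it leaves out the dirtied_when update.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel defines its own. */
#define I_DIRTY_SYNC		(1u << 0)
#define I_DIRTY_DATASYNC	(1u << 1)
#define I_DIRTY_PAGES		(1u << 2)
#define I_DIRTY_TIME		(1u << 3)
#define I_SYNC_QUEUED		(1u << 4)

#define I_DIRTY_INODE	(I_DIRTY_SYNC | I_DIRTY_DATASYNC)
#define I_DIRTY		(I_DIRTY_INODE | I_DIRTY_PAGES)
#define I_DIRTY_ALL	(I_DIRTY | I_DIRTY_TIME)

/* Hypothetical stand-in for the list an inode is queued on. */
enum wb_list { WB_ATTACHED, WB_DIRTY, WB_DIRTY_TIME, WB_IO };

/* Model of writeback clearing the I_DIRTY bits it has written out. */
unsigned int clear_written(unsigned int i_state)
{
	return i_state & ~I_DIRTY;
}

/* Decision at the end of writeback_single_inode() as quoted above. */
enum wb_list refile_current(unsigned int i_state, enum wb_list cur)
{
	if (!(i_state & I_DIRTY_ALL))
		return WB_ATTACHED;
	else if (!(i_state & I_SYNC_QUEUED) && (i_state & I_DIRTY))
		return WB_DIRTY;
	return cur;	/* no branch taken: inode stays where it was */
}

/* Decision with the refinement suggested above. */
enum wb_list refile_proposed(unsigned int i_state, enum wb_list cur)
{
	if (!(i_state & I_DIRTY_ALL))
		return WB_ATTACHED;
	if (!(i_state & I_SYNC_QUEUED)) {
		if (i_state & I_DIRTY)
			return WB_DIRTY;
		if (i_state & I_DIRTY_TIME)
			return WB_DIRTY_TIME;
	}
	return cur;	/* queued for sync: leave the inode alone */
}

/* The I_DIRTY_SYNC | I_DIRTY_TIME case discussed above. */
void check_scenario(void)
{
	unsigned int state = clear_written(I_DIRTY_SYNC | I_DIRTY_TIME);

	assert(state == I_DIRTY_TIME);
	/* Current logic: the inode is never moved to the dirty-time list. */
	assert(refile_current(state, WB_DIRTY) == WB_DIRTY);
	/* Refined logic: the inode lands on the dirty-time list. */
	assert(refile_proposed(state, WB_DIRTY) == WB_DIRTY_TIME);
}
```

Calling check_scenario() from a small test program exercises exactly the
case in question; the other flag combinations give the same answer in both
variants.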

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Thread overview: 9+ messages
2022-07-28 13:39 [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Lukas Czerner
2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner
2022-07-28 16:53   ` Jan Kara [this message]
2022-07-29  8:52     ` Lukas Czerner
2022-07-29 11:18       ` Jan Kara
2022-07-29  4:05   ` Eric Biggers
2022-07-29  8:54     ` Lukas Czerner
2022-07-28 15:52 ` [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Jan Kara
2022-08-02 11:58 ` Jeff Layton
