From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 3/9] xfs: punching delalloc extents on write failure is racy
Date: Tue, 15 Nov 2022 00:41:42 -0800
Message-ID: <Y3NQxoD20SvgInok@infradead.org>
In-Reply-To: <20221115013043.360610-4-david@fromorbit.com>

On Tue, Nov 15, 2022 at 12:30:37PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> xfs_buffered_write_iomap_end() has a comment about the safety of
> punching delalloc extents based on holding the IOLOCK_EXCL. This
> comment is wrong, and punching delalloc extents is not race free.
> 
> When we punch out a delalloc extent after a write failure in
> xfs_buffered_write_iomap_end(), we punch out the page cache with
> truncate_pagecache_range() before we punch out the delalloc extents.
> At this point, we only hold the IOLOCK_EXCL, so there is nothing
> stopping mmap() write faults racing with this cleanup operation,
> reinstantiating a folio over the range we are about to punch and
> hence requiring the delalloc extent to be kept.
> 
> If this race condition is hit, we can end up with a dirty page in
> the page cache that has no delalloc extent or space reservation
> backing it. This leads to bad things happening at writeback time.
> 
> To avoid this race condition, we need the page cache truncation to
> be atomic w.r.t. the extent manipulation. We can do this by holding
> the mapping->invalidate_lock exclusively across this operation -
> this will prevent new pages from being inserted into the page cache
> whilst we are removing the pages and the backing extent and space
> reservation.
> 
> Taking the mapping->invalidate_lock exclusively in the buffered
> write IO path is safe - it naturally nests inside the IOLOCK (see
> truncate and fallocate paths). iomap_zero_range() can be called from
> under the mapping->invalidate_lock (from the truncate path via
> either xfs_zero_eof() or xfs_truncate_page()), but iomap_zero_iter()
> will not instantiate new delalloc pages (because it skips holes) and
> hence will not ever need to punch out delalloc extents on failure.
> 
> Fix the locking issue, and clean up the code logic a little to avoid
> unnecessary work if we didn't allocate the delalloc extent or wrote
> the entire region we allocated.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
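
For anyone following along, the window described above can be sketched as
a small single-threaded userspace toy. This is not kernel code: every
toy_* name is made up for illustration, a plain bool stands in for
mapping->invalidate_lock, and the "fault" is simply attempted at the one
point where the race would occur.

```c
/* Userspace toy model of the race; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_mapping {
	bool invalidate_locked;	/* stands in for mapping->invalidate_lock */
	bool page_dirty;	/* a dirty folio exists over the range */
	bool delalloc_backed;	/* a delalloc extent backs the range */
};

/* mmap() write fault: instantiates a dirty page unless locked out. */
static bool toy_write_fault(struct toy_mapping *m)
{
	if (m->invalidate_locked)
		return false;	/* a real fault would block here */
	m->page_dirty = true;
	return true;
}

/*
 * Failed-write cleanup. A write fault is attempted in the window
 * between the page cache truncation and the extent punch.
 * Returns true if a dirty page is left with no backing extent.
 */
static bool toy_punch_after_failed_write(struct toy_mapping *m,
					 bool hold_invalidate_lock)
{
	/* Cleanup only makes sense if a delalloc extent was allocated. */
	assert(m->delalloc_backed);

	if (hold_invalidate_lock)
		m->invalidate_locked = true;

	m->page_dirty = false;		/* page cache truncation step */
	toy_write_fault(m);		/* the racy window */
	m->delalloc_backed = false;	/* delalloc extent punch step */

	if (hold_invalidate_lock)
		m->invalidate_locked = false;

	return m->page_dirty && !m->delalloc_backed;
}

/* Start from a dirty, delalloc-backed range and run the cleanup. */
static bool toy_race_leaves_orphan(bool hold_invalidate_lock)
{
	struct toy_mapping m = { false, true, true };

	return toy_punch_after_failed_write(&m, hold_invalidate_lock);
}
```

With hold_invalidate_lock false the toy ends up with a dirty page and no
backing extent, which is the writeback problem described in the commit
message. With it true the fault cannot get in between the two steps.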

> +	filemap_invalidate_lock(inode->i_mapping);
> +	truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
> +				 XFS_FSB_TO_B(mp, end_fsb) - 1);

No need to use VFS_I here; the inode is passed as a function argument.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
