public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Dave Chinner <dchinner@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J . Wong" <djwong@kernel.org>,
	hch@infradead.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH AUTOSEL 6.1 13/16] iomap: write iomap validity checks
Date: Wed, 21 Dec 2022 07:19:33 -0500	[thread overview]
Message-ID: <Y6L51cR5EZ//cw8J@sashalap> (raw)
In-Reply-To: <20221220040112.GG1971568@dread.disaster.area>

On Tue, Dec 20, 2022 at 03:01:12PM +1100, Dave Chinner wrote:
>On Mon, Dec 19, 2022 at 08:20:50PM -0500, Sasha Levin wrote:
>> From: Dave Chinner <dchinner@redhat.com>
>>
>> [ Upstream commit d7b64041164ca177170191d2ad775da074ab2926 ]
>>
>> A recent multithreaded write data corruption has been uncovered in
>> the iomap write code. The core of the problem is partial folio
>> writes can be flushed to disk while a new racing write can map it
>> and fill the rest of the page:
>>
>> writeback			new write
>>
>> allocate blocks
>>   blocks are unwritten
>> submit IO
>> .....
>> 				map blocks
>> 				iomap indicates UNWRITTEN range
>> 				loop {
>> 				  lock folio
>> 				  copyin data
>> .....
>> IO completes
>>   runs unwritten extent conv
>>     blocks are marked written
>> 				  <iomap now stale>
>> 				  get next folio
>> 				}
>>
>> Now add memory pressure such that memory reclaim evicts the
>> partially written folio that has already been written to disk.
>>
>> When the new write finally gets to the last partial page of the new
>> write, it does not find it in cache, so it instantiates a new page,
>> sees the iomap is unwritten, and zeros the part of the page that
>> it does not have data from. This overwrites the data on disk that
>> was originally written.
>>
>> The full description of the corruption mechanism can be found here:
>>
>> https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/
>>
>> To solve this problem, we need to check whether the iomap is still
>> valid after we lock each folio during the write. We have to do it
>> after we lock the page so that we don't end up with state changes
>> occurring while we wait for the folio to be locked.
>>
>> Hence we need a mechanism to be able to check that the cached iomap
>> is still valid (similar to what we already do in buffered
>> writeback), and we need a way for ->begin_write to back out and
>> tell the high level iomap iterator that we need to remap the
>> remaining write range.
>>
>> The iomap needs to grow some storage for the validity cookie that
>> the filesystem provides to travel with the iomap. XFS, in
>> particular, also needs to know some more information about what the
>> iomap maps (attribute extents rather than file data extents) to for
>> the validity cookie to cover all the types of iomaps we might need
>> to validate.
>>
>> Signed-off-by: Dave Chinner <dchinner@redhat.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
>This commit is not a standalone backport candidate. It is a pure
>infrastructure change that does nothing by itself except to add more
>code that won't get executed. There are another 7-8 patches that
>need to be backported along with this patch to fix the data
>corruption that is mentioned in this commit.
>
>I'd stronly suggest that you leave this whole series of commits to
>the XFS LTS maintainers to backport if they so choose to - randomly
>backporting commits from the middle of the series only makes their
>job more complex....

Ack, I'll drop it, thanks!

-- 
Thanks,
Sasha

  reply	other threads:[~2022-12-21 12:19 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-20  1:20 [PATCH AUTOSEL 6.1 01/16] crypto: hisilicon/hpre - fix resource leak in remove process Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 02/16] scsi: lpfc: Fix hard lockup when reading the rx_monitor from debugfs Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 03/16] scsi: ufs: Reduce the START STOP UNIT timeout Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 04/16] crypto: hisilicon/qm - increase the memory of local variables Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 05/16] Revert "PCI: Clear PCI_STATUS when setting up device" Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 06/16] scsi: elx: libefc: Fix second parameter type in state callbacks Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 07/16] hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 08/16] scsi: smartpqi: Add new controller PCI IDs Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 09/16] scsi: smartpqi: Correct device removal for multi-actuator devices Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 10/16] drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid() Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 11/16] drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid() Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 12/16] scsi: target: iscsi: Fix a race condition between login_work and the login thread Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 13/16] iomap: write iomap validity checks Sasha Levin
2022-12-20  4:01   ` Dave Chinner
2022-12-21 12:19     ` Sasha Levin [this message]
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 14/16] orangefs: Fix kmemleak in orangefs_prepare_debugfs_help_string() Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 15/16] orangefs: Fix kmemleak in orangefs_sysfs_init() Sasha Levin
2022-12-20  1:20 ` [PATCH AUTOSEL 6.1 16/16] orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init() Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y6L51cR5EZ//cw8J@sashalap \
    --to=sashal@kernel.org \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox