All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Long Li <leo.lilong@huawei.com>
Cc: brauner@kernel.org, djwong@kernel.org, cem@kernel.org,
	linux-xfs@vger.kernel.org, yi.zhang@huawei.com,
	houtao1@huawei.com, yangerkun@huawei.com
Subject: Re: [PATCH v2 1/2] iomap: fix zero padding data issue in concurrent append writes
Date: Fri, 15 Nov 2024 08:46:13 -0500	[thread overview]
Message-ID: <ZzdQpXyVNHaT7MGv@bfoster> (raw)
In-Reply-To: <Zzc2LcUqCPqMjjxr@localhost.localdomain>

On Fri, Nov 15, 2024 at 07:53:17PM +0800, Long Li wrote:
> On Thu, Nov 14, 2024 at 01:04:31PM -0500, Brian Foster wrote:
> > On Thu, Nov 14, 2024 at 10:34:26AM +0800, Long Li wrote:
> > > On Wed, Nov 13, 2024 at 11:13:49AM -0500, Brian Foster wrote:
> > > > FYI, you probably want to include linux-fsdevel on iomap patches.
> > > > 
> > > > On Wed, Nov 13, 2024 at 05:19:06PM +0800, Long Li wrote:
> > > > > During concurrent append writes to XFS filesystem, zero padding data
> > > > > may appear in the file after power failure. This happens due to imprecise
> > > > > disk size updates when handling write completion.
> > > > > 
> > > > > Consider this scenario with concurrent append writes same file:
> > > > > 
> > > > >   Thread 1:                  Thread 2:
> > > > >   ------------               -----------
> > > > >   write [A, A+B]
> > > > >   update inode size to A+B
> > > > >   submit I/O [A, A+BS]
> > > > >                              write [A+B, A+B+C]
> > > > >                              update inode size to A+B+C
> > > > >   <I/O completes, updates disk size to A+B+C>
> > > > >   <power failure>
> > > > > 
> > > > > After reboot, file has zero padding in range [A+B, A+B+C]:
> > > > > 
> > > > >   |<         Block Size (BS)      >|
> > > > >   |DDDDDDDDDDDDDDDD0000000000000000|
> > > > >   ^               ^        ^
> > > > >   A              A+B      A+B+C (EOF)
> > > > > 
> > > > 
> > > > Thanks for the diagram. FWIW, I found the description a little confusing
> > > > because A+B+C to me implies that we'd update i_size to the end of the
> > > > write from thread 2, but it seems that is only true up to the end of the
> > > > block.
> > > > 
> > > > I.e., with 4k FSB and if thread 1 writes [0, 2k], then thread 2 writes
> > > > from [2, 16k], the write completion from the thread 1 write will set
> > > > i_size to 4k, not 16k, right?
> > > > 
> > > 
> > > Not right, the problem I'm trying to describe is:
> > > 
> > >   1) thread 1 writes [0, 2k]
> > >   2) thread 2 writes [2k, 3k]
> > >   3) write completion from the thread 1 write set i_size to 3K
> > >   4) power failure
> > >   5) after reboot,  [2k, 3K] of the file filled with zero and the file size is 3k
> > >      
> > 
> > Yeah, I get the subblock case. What I am saying above is it seems like
> > "update inode size to A+B+C" is only true for certain, select values
> > that describe the subblock case. I.e., what is the resulting i_size if
> > we replace C=1k in the example above with something >= FSB size, like
> > C=4k?
> > 
> > Note this isn't all that important. I was just trying to say that the
> > overly general description made this a little more confusing to grok at
> > first than it needed to be, because to me it subtly implies there is
> > logic around somewhere that explicitly writes in-core i_size to disk,
> > when that is not actually what is happening.
> > 
> > > 
> 
> Sorry for my previous misunderstanding. You are correct - my commit
> message description didn't cover the case where A+B+C > block size.
> In such scenarios, the final file size might end up being 4K, which
> is not what we would expect. Initially, I incorrectly thought this
> wasn't a significant issue and thus overlooked this case. Let me
> update the diagram to address this.
> 

Ok no problem.. like I said, just a minor nit. ;)

>   Thread 1:                  Thread 2:
>   ------------               -----------
>   write [A, A+B]
>   update inode size to A+B
>   submit I/O [A, A+BS]
>                              write [A+B, A+B+C]
>                              update inode size to A+B+C
>   <I/O completes, updates disk size to A+B+C>
>   <power failure>
> 
> After reboot:
>   1) The file has zero padding in the range [A+B, A+BS]
>   2) The file size is unexpectedly set to A+BS
> 
>   |<         Block Size (BS)      >|<           Block Size (BS)    >|
>   |DDDDDDDDDDDDDDDD0000000000000000|00000000000000000000000000000000|
>   ^               ^                ^               ^
>   A              A+B              A+BS (EOF)     A+B+C
> 
> 
> It will be update in the next version.
> 

The text above still says "updates disk size to A+B+C." I'm not sure if
you intended to change that to A+BS as well, but regardless LGTM.
Thanks.

Brian

> 
> Thanks,
> Long Li
> 


  reply	other threads:[~2024-11-15 13:44 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-13  9:19 [PATCH v2 1/2] iomap: fix zero padding data issue in concurrent append writes Long Li
2024-11-13  9:19 ` [PATCH v2 2/2] xfs: clean up xfs_end_ioend() to reuse local variables Long Li
2024-11-18  6:57   ` Christoph Hellwig
2024-11-13  9:44 ` [PATCH v2 1/2] iomap: fix zero padding data issue in concurrent append writes Carlos Maiolino
2024-11-13 11:38   ` Long Li
2024-11-13 12:56     ` Carlos Maiolino
2024-11-13 16:13 ` Brian Foster
2024-11-14  2:34   ` Long Li
2024-11-14 18:04     ` Brian Foster
2024-11-14 20:01       ` Dave Chinner
2024-11-15 14:03         ` Brian Foster
2024-11-15 11:53       ` Long Li
2024-11-15 13:46         ` Brian Foster [this message]
2024-11-19  1:35           ` Long Li
2024-11-18  6:56   ` Christoph Hellwig
2024-11-18 14:26     ` Brian Foster
2024-11-20  9:05       ` Christoph Hellwig
2024-11-20 13:50         ` Brian Foster
2024-11-21  5:49           ` Christoph Hellwig
2024-11-19  8:35     ` Long Li
2024-11-19 12:13       ` Brian Foster
2024-11-19 13:46         ` Long Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZzdQpXyVNHaT7MGv@bfoster \
    --to=bfoster@redhat.com \
    --cc=brauner@kernel.org \
    --cc=cem@kernel.org \
    --cc=djwong@kernel.org \
    --cc=houtao1@huawei.com \
    --cc=leo.lilong@huawei.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.