public inbox for linux-fsdevel@vger.kernel.org
From: David Sterba <dsterba@suse.cz>
To: Marco Elver <elver@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>, Qian Cai <cai@lca.pw>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-xfs@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] fs: fix a data race in i_size_write/i_size_read
Date: Wed, 19 Feb 2020 13:18:04 +0100
Message-ID: <20200219121804.GV2902@twin.jikos.cz>
In-Reply-To: <CANpmjNM=+y-OwKjtsjsEkwPjpHXpt7ywaE48JyiND6dKt=Vf1Q@mail.gmail.com>

On Wed, Feb 19, 2020 at 10:21:46AM +0100, Marco Elver wrote:
> Right. In reality, for mainstream architectures, it appears quite unlikely.
> 
> There may be other valid reasons, such as documenting the fact the
> write can happen concurrently with loads.
> 
> Let's assume the WRITE_ONCE can be dropped.
> 
> The load is a different story. While load tearing may not be an issue,
> it's more likely that other optimizations can break the code. For
> example load fusing can break code that expects repeated loads in a
> loop. E.g. I found these uses of i_size_read in loops:
> 
> git grep -E '(for|while) \(.*i_size_read'
> fs/ocfs2/dir.c: while (ctx->pos < i_size_read(inode)) {
> fs/ocfs2/dir.c:                 for (i = 0; i < i_size_read(inode) && i < offset; ) {
> fs/ocfs2/dir.c: while (ctx->pos < i_size_read(inode)) {
> fs/ocfs2/dir.c:         while (ctx->pos < i_size_read(inode)
> fs/squashfs/dir.c:      while (length < i_size_read(inode)) {
> fs/squashfs/namei.c:    while (length < i_size_read(dir)) {
> 
> Can i_size writes happen concurrently, and if so will these break if
> the compiler decides to just do i_size_read's load once, and keep the
> result in a register?

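To make the question concrete, here is a minimal userspace sketch of the
loop pattern, not kernel code. struct fake_inode, walk_plain() and
walk_once() are hypothetical names, and READ_ONCE is approximated by a
volatile cast. With the plain load, the compiler is free to read i_size
once and keep it in a register if it can prove the loop body does not
write it; the volatile access forces a fresh load on every iteration.

```c
#include <stdint.h>

/* Userspace approximation of READ_ONCE: a volatile access to x. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for struct inode; only i_size matters here. */
struct fake_inode {
	int64_t i_size;
};

/* Plain load in the loop condition: the compiler may fuse the loads. */
static int64_t walk_plain(struct fake_inode *inode, int64_t step)
{
	int64_t pos = 0;

	while (pos < inode->i_size)
		pos += step;
	return pos;
}

/* Marked load: i_size is re-read from memory on every iteration. */
static int64_t walk_once(struct fake_inode *inode, int64_t step)
{
	int64_t pos = 0;

	while (pos < READ_ONCE(inode->i_size))
		pos += step;
	return pos;
}
```

Without a concurrent writer both variants return the same position; they
can only differ when i_size changes while the loop runs.
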
It depends on the semantics, and the behaviour when the value is not
cached in a register might be the wrong one. A concrete example with
assembly and analysis can be found in d98da49977f6 ("btrfs: save i_size
to avoid double evaluation of i_size_read in compress_file_range"),
which is the workaround mentioned in my other mail.

C:
    actual_end = min_t(u64, i_size_read(inode), end + 1);

Asm:

        mov    0x20(%rsp),%rax
        cmp    %rax,0x48(%r15)           # read
        movl   $0x0,0x18(%rsp)
        mov    %rax,%r12
        mov    %r14,%rax
        cmovbe 0x48(%r15),%r12           # eval
    
      Where r15 is inode and 0x48 is offset of i_size.
    
      The original fix was to revert 62b37622718c; the revert would do an
      intermediate assignment, which would also avoid the double
      evaluation, but it is not future-proof should the compiler merge the
      stores and call i_size_read anyway.
    
      There's a patch adding READ_ONCE to i_size_read, but it is not being
      applied at the moment and we need to fix the bug. Instead, emulate
      READ_ONCE with two barrier()s, which is effectively what it does.
      The assembly confirms single evaluation:
    
        mov    0x48(%rbp),%rax          # read once
        mov    0x20(%rsp),%rcx
        mov    $0x20,%edx
        cmp    %rax,%rcx
        cmovbe %rcx,%rax
        mov    %rax,(%rsp)
        mov    %rax,%rcx
        mov    %r14,%rax
    
      Where 0x48(%rbp) is inode->i_size stored to %rax.
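
For readers without the tree at hand, the workaround pattern can be
sketched in userspace C roughly as below. This is an illustration under
assumptions, not the code from the commit: struct fake_inode,
fake_i_size_read() and clamp_actual_end() are hypothetical names, and
barrier() is approximated by an empty asm statement with a "memory"
clobber (GCC/Clang syntax). The snapshot is taken between the two
barriers, so the compiler cannot treat a later reload of i_size as
equivalent to the saved value.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's barrier(): compiler-only barrier. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for struct inode; only i_size matters here. */
struct fake_inode {
	int64_t i_size;
};

/* Plain, unmarked load of i_size (no READ_ONCE). */
static inline int64_t fake_i_size_read(const struct fake_inode *inode)
{
	return inode->i_size;
}

/*
 * Read i_size once between two barriers, then clamp the saved value to
 * end + 1. Both the comparison and the result use the same snapshot.
 */
static uint64_t clamp_actual_end(const struct fake_inode *inode, uint64_t end)
{
	uint64_t i_size;

	barrier();
	i_size = (uint64_t)fake_i_size_read(inode);
	barrier();

	return i_size < end + 1 ? i_size : end + 1;
}
```

The point of the sketch is only that the compare and the select operate
on one saved value instead of two separate loads from the inode.
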

Thread overview: 8+ messages
2020-02-19  4:04 [PATCH] fs: fix a data race in i_size_write/i_size_read Qian Cai
2020-02-19  4:52 ` Al Viro
2020-02-19  5:08   ` Qian Cai
2020-02-19  5:23     ` Al Viro
2020-02-19  9:21       ` Marco Elver
2020-02-19 12:18         ` David Sterba [this message]
2020-02-21  4:19         ` Qian Cai
2020-02-19 12:08 ` David Sterba
