[PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()

linux-fsdevel.vger.kernel.org archive mirror

* [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
@ 2024-01-22  9:45 Baokun Li
  2024-01-22  9:45 ` [PATCH 1/2] " Baokun Li
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Baokun Li @ 2024-01-22  9:45 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: torvalds, viro, brauner, jack, willy, akpm, linux-kernel,
	yi.zhang, yangerkun, yukuai3, libaokun1

This patchset follows Linus's suggestion to make the i_size_read/write
helpers be smp_load_acquire/store_release(), after which the extra smp_rmb()
in filemap_read() is no longer needed, so it is removed.

Functional tests were performed and no new problems were found.

Here are the results of UnixBench tests based on 6.7.0-next-20240118 on
arm64. There is some degradation in the single-threaded run and some
improvement in the multi-threaded run, but overall the impact is not
significant.

### 72 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values        |   base  | patched |  cmp   |
--------------------------------------|---------|---------|--------|
Dhrystone 2 using register variables  | 3635.06 | 3596.3  | -1.07% |
Double-Precision Whetstone            | 808.58  | 808.58  | 0.00%  |
Execl Throughput                      | 623.52  | 618.1   | -0.87% |
File Copy 1024 bufsize 2000 maxblocks | 1715.82 | 1668.58 | -2.75% |
File Copy 256 bufsize 500 maxblocks   | 1320.98 | 1250.16 | -5.36% |
File Copy 4096 bufsize 8000 maxblocks | 2639.36 | 2488.48 | -5.72% |
Pipe Throughput                       | 869.06  | 872.3   | 0.37%  |
Pipe-based Context Switching          | 106.26  | 117.22  | 10.31% |
Process Creation                      | 247.72  | 246.74  | -0.40% |
Shell Scripts (1 concurrent)          | 1234.98 | 1226    | -0.73% |
Shell Scripts (8 concurrent)          | 6893.96 | 6210.46 | -9.91% |
System Call Overhead                  | 493.72  | 494.28  | 0.11%  |
--------------------------------------|---------|---------|--------|
Total                                 | 1003.92 | 989.58  | -1.43% |

### 72 CPUs in system; running 72 parallel copies of tests
System Benchmarks Index Values        |   base    |  patched  |  cmp   |
--------------------------------------|-----------|-----------|--------|
Dhrystone 2 using register variables  | 260471.88 | 258065.04 | -0.92% |
Double-Precision Whetstone            | 58212.32  | 58219.3   | 0.01%  |
Execl Throughput                      | 6954.7    | 7444.08   | 7.04%  |
File Copy 1024 bufsize 2000 maxblocks | 64244.74  | 64618.24  | 0.58%  |
File Copy 256 bufsize 500 maxblocks   | 89933.8   | 87026.38  | -3.23% |
File Copy 4096 bufsize 8000 maxblocks | 79808.14  | 81916.42  | 2.64%  |
Pipe Throughput                       | 62174.38  | 62389.74  | 0.35%  |
Pipe-based Context Switching          | 27239.28  | 27887.24  | 2.38%  |
Process Creation                      | 3551.28   | 3800.54   | 7.02%  |
Shell Scripts (1 concurrent)          | 19212.26  | 20749.34  | 8.00%  |
Shell Scripts (8 concurrent)          | 20842.02  | 21958.12  | 5.36%  |
System Call Overhead                  | 35328.24  | 35451.68  | 0.35%  |
--------------------------------------|-----------|-----------|--------|
Total                                 | 35592.42  | 36450.36  | 2.41%  |
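
For readers who want to recheck the cmp column: it appears to be the
relative change of the patched index against the base index, in percent.
The helper below is only a sketch of that arithmetic; the name
pct_change() is hypothetical and is not part of UnixBench or of this
patchset.

```c
#include <assert.h>

/*
 * Relative change of `patched` against `base`, in percent.
 * Hypothetical helper, assumed to match how the cmp column was derived.
 */
static double pct_change(double base, double patched)
{
	assert(base != 0.0);	/* a zero baseline has no relative change */
	return (patched - base) / base * 100.0;
}
```

For example, pct_change(1003.92, 989.58) is about -1.43 and
pct_change(35592.42, 36450.36) is about 2.41, which agrees with the two
Total rows after rounding to two decimals.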

Baokun Li (2):
  fs: make the i_size_read/write helpers be
    smp_load_acquire/store_release()
  Revert "mm/filemap: avoid buffered read/write race to read
    inconsistent data"

 include/linux/fs.h | 10 ++++++++--
 mm/filemap.c       |  9 ---------
 2 files changed, 8 insertions(+), 11 deletions(-)

-- 
2.31.1



* [PATCH 1/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-22  9:45 [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Baokun Li
@ 2024-01-22  9:45 ` Baokun Li
  2024-01-22  9:45 ` [PATCH 2/2] Revert "mm/filemap: avoid buffered read/write race to read inconsistent data" Baokun Li
  2024-01-22 11:14 ` [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Christian Brauner
  2 siblings, 0 replies; 8+ messages in thread
From: Baokun Li @ 2024-01-22  9:45 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: torvalds, viro, brauner, jack, willy, akpm, linux-kernel,
	yi.zhang, yangerkun, yukuai3, libaokun1

In [Link], Linus mentions that acquire/release makes it clear which
_particular_ memory accesses are the ordered ones, and that it's unlikely
to make any performance difference, so it's much better to pair up
the release->acquire ordering than to have a "wmb->rmb" ordering.

=========================================================
 update pagecache
 folio_mark_uptodate(folio)
   smp_wmb()
   set_bit PG_uptodate

 === ↑↑↑ STLR ↑↑↑ === smp_store_release(&inode->i_size, i_size)

 folio_test_uptodate(folio)
   test_bit PG_uptodate
   smp_rmb()

 === ↓↓↓ LDAR ↓↓↓ === smp_load_acquire(&inode->i_size)

 copy_page_to_iter()
=========================================================

Calling smp_store_release() in i_size_write() ensures that the data
in the page and the PG_uptodate bit are updated before i_size is
updated, and calling smp_load_acquire() in i_size_read() ensures that
a reader will not see a newer i_size than the data in the page.
Therefore, this avoids buffered read-write inconsistencies caused by
Load-Load reordering.
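
As an illustration only, the pairing can be modelled in userspace with
C11 atomics, where memory_order_release and memory_order_acquire play the
roles of smp_store_release() and smp_load_acquire(). Everything named
model_* below is hypothetical and is not kernel code; the sketch assumes
a single writer that only appends.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

/* Hypothetical stand-in for an inode with one page of cached data. */
struct model_inode {
	char page[MODEL_PAGE_SIZE];	/* plays the role of the pagecache folio */
	_Atomic long long i_size;	/* plays the role of inode->i_size */
};

/* Writer: fill the page first, then publish the new size with release. */
static void model_append(struct model_inode *inode, const char *buf, size_t len)
{
	long long old = atomic_load_explicit(&inode->i_size, memory_order_relaxed);

	assert(old + (long long)len <= MODEL_PAGE_SIZE);
	memcpy(inode->page + old, buf, len);
	/* Release: the memcpy() above cannot be reordered after this store. */
	atomic_store_explicit(&inode->i_size, old + (long long)len,
			      memory_order_release);
}

/* Reader: load the size with acquire, then copy at most that many bytes. */
static size_t model_read(struct model_inode *inode, char *out, size_t out_len)
{
	/* Acquire: the memcpy() below cannot be reordered before this load. */
	long long size = atomic_load_explicit(&inode->i_size, memory_order_acquire);
	size_t n = (size_t)size < out_len ? (size_t)size : out_len;

	memcpy(out, inode->page, n);
	return n;
}
```

In this model a reader that observes a given size also observes every
byte stored before that size was published, which is the property the
paragraph above describes for i_size and the page contents.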

Link: https://lore.kernel.org/r/CAHk-=wifOnmeJq+sn+2s-P46zw0SFEbw9BSCGgp2c5fYPtRPGw@mail.gmail.com/
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 include/linux/fs.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 06ecccbb5bfe..077849bfe89a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -907,7 +907,8 @@ static inline loff_t i_size_read(const struct inode *inode)
 	preempt_enable();
 	return i_size;
 #else
-	return inode->i_size;
+	/* Pairs with smp_store_release() in i_size_write() */
+	return smp_load_acquire(&inode->i_size);
 #endif
 }
 
@@ -929,7 +930,12 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
 	inode->i_size = i_size;
 	preempt_enable();
 #else
-	inode->i_size = i_size;
+	/*
+	 * Pairs with smp_load_acquire() in i_size_read() to ensure
+	 * changes related to inode size (such as page contents) are
+	 * visible before we see the changed inode size.
+	 */
+	smp_store_release(&inode->i_size, i_size);
 #endif
 }
 
-- 
2.31.1



* [PATCH 2/2] Revert "mm/filemap: avoid buffered read/write race to read inconsistent data"
  2024-01-22  9:45 [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Baokun Li
  2024-01-22  9:45 ` [PATCH 1/2] " Baokun Li
@ 2024-01-22  9:45 ` Baokun Li
  2024-01-22 11:14 ` [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Christian Brauner
  2 siblings, 0 replies; 8+ messages in thread
From: Baokun Li @ 2024-01-22  9:45 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: torvalds, viro, brauner, jack, willy, akpm, linux-kernel,
	yi.zhang, yangerkun, yukuai3, libaokun1

This reverts commit e2c27b803bb6 ("mm/filemap: avoid buffered read/write
race to read inconsistent data"). After making the i_size_read/write
helpers be smp_load_acquire/store_release(), it is already guaranteed that
changes to page contents are visible before we see increased inode size,
so the extra smp_rmb() in filemap_read() can be removed.
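
To make the difference between the two styles concrete, here is a
userspace sketch using C11 atomics. It is an approximation under stated
assumptions: atomic_thread_fence() with release/acquire ordering is only
a rough analogue of smp_wmb()/smp_rmb(), and struct model_file and the
four helpers are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model: `data` stands for page contents, `size` for i_size. */
struct model_file {
	int data;
	_Atomic long long size;
};

/* Barrier style: the ordering lives in standalone fences. */
static void write_with_fence(struct model_file *f, int value)
{
	f->data = value;
	atomic_thread_fence(memory_order_release);	/* roughly smp_wmb() */
	atomic_store_explicit(&f->size, 1, memory_order_relaxed);
}

static int read_with_fence(struct model_file *f, int *out)
{
	long long size = atomic_load_explicit(&f->size, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);	/* roughly smp_rmb() */
	if (size == 0)
		return 0;	/* nothing published yet */
	*out = f->data;
	return 1;
}

/* Acquire/release style: the ordering is attached to the size access. */
static void write_with_release(struct model_file *f, int value)
{
	f->data = value;
	atomic_store_explicit(&f->size, 1, memory_order_release);
}

static int read_with_acquire(struct model_file *f, int *out)
{
	if (atomic_load_explicit(&f->size, memory_order_acquire) == 0)
		return 0;	/* nothing published yet */
	*out = f->data;
	return 1;
}
```

Both pairs give the reader the same guarantee in this model. The second
pair needs no separate fence in the caller, which corresponds to why the
standalone barrier in filemap_read() becomes redundant once the helpers
carry the ordering themselves.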

Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 mm/filemap.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 142864338ca4..bed844b07e87 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2608,15 +2608,6 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 			goto put_folios;
 		end_offset = min_t(loff_t, isize, iocb->ki_pos + iter->count);
 
-		/*
-		 * Pairs with a barrier in
-		 * block_write_end()->mark_buffer_dirty() or other page
-		 * dirtying routines like iomap_write_end() to ensure
-		 * changes to page contents are visible before we see
-		 * increased inode size.
-		 */
-		smp_rmb();
-
 		/*
 		 * Once we start copying data, we don't want to be touching any
 		 * cachelines that might be contended:
-- 
2.31.1



* Re: [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-22  9:45 [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Baokun Li
  2024-01-22  9:45 ` [PATCH 1/2] " Baokun Li
  2024-01-22  9:45 ` [PATCH 2/2] Revert "mm/filemap: avoid buffered read/write race to read inconsistent data" Baokun Li
@ 2024-01-22 11:14 ` Christian Brauner
  2024-01-22 12:25   ` Baokun Li
  2024-01-23 18:56   ` Jan Kara
  2 siblings, 2 replies; 8+ messages in thread
From: Christian Brauner @ 2024-01-22 11:14 UTC (permalink / raw)
  To: Baokun Li
  Cc: Christian Brauner, torvalds, viro, jack, willy, akpm,
	linux-kernel, yi.zhang, yangerkun, yukuai3, linux-fsdevel

On Mon, 22 Jan 2024 17:45:34 +0800, Baokun Li wrote:
> This patchset follows the linus suggestion to make the i_size_read/write
> helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
> in filemap_read() is no longer needed, so it is removed.
> 
> Functional tests were performed and no new problems were found.
> 
> Here are the results of unixbench tests based on 6.7.0-next-20240118 on
> arm64, with some degradation in single-threading and some optimization in
> multi-threading, but overall the impact is not significant.
> 
> [...]

Hm, we can certainly try but I wouldn't rule it out that someone will
complain about the "non-significant" degradation in single-threading.
We'll see. Let that performance bot chew on it for a bit as well.

But I agree that the smp_load_acquire()/smp_store_release() is clearer
than the open-coded smp_rmb().

---

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
      https://git.kernel.org/vfs/vfs/c/7d7825fde8ba
[2/2] Revert "mm/filemap: avoid buffered read/write race to read inconsistent data"
      https://git.kernel.org/vfs/vfs/c/83dfed690b90


* Re: [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-22 11:14 ` [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Christian Brauner
@ 2024-01-22 12:25   ` Baokun Li
  2024-01-23 18:56   ` Jan Kara
  1 sibling, 0 replies; 8+ messages in thread
From: Baokun Li @ 2024-01-22 12:25 UTC (permalink / raw)
  To: Christian Brauner
  Cc: torvalds, viro, jack, willy, akpm, linux-kernel, yi.zhang,
	yangerkun, yukuai3, linux-fsdevel, Baokun Li

On 2024/1/22 19:14, Christian Brauner wrote:
> On Mon, 22 Jan 2024 17:45:34 +0800, Baokun Li wrote:
>> This patchset follows the linus suggestion to make the i_size_read/write
>> helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
>> in filemap_read() is no longer needed, so it is removed.
>>
>> Functional tests were performed and no new problems were found.
>>
>> Here are the results of unixbench tests based on 6.7.0-next-20240118 on
>> arm64, with some degradation in single-threading and some optimization in
>> multi-threading, but overall the impact is not significant.
>>
>> [...]
> Hm, we can certainly try but I wouldn't rule it out that someone will
> complain about the "non-significant" degradation in single-threading.
> We'll see. Let that performance bot chew on it for a bit as well.
>
> But I agree that the smp_load_acquire()/smp_store_release() is clearer
> than the open-coded smp_rmb().
Thank you very much for applying this patch!

Adding barriers where none existed does introduce some performance
degradation. The multi-threaded test results here look pretty good;
it's just that the single-threaded test results show a bit too much
degradation for Shell Scripts (8 concurrent). I've tracked down this
test item: it calls clone() and wait4() and then triggers i_size reads
and writes frequently, so the degradation here is expected. I'm just
not sure whether anyone cares about this scenario.
> ---
>
> Applied to the vfs.misc branch of the vfs/vfs.git tree.
> Patches in the vfs.misc branch should appear in linux-next soon.
>
> Please report any outstanding bugs that were missed during review in a
> new review to the original patch series allowing us to drop it.
>
> It's encouraged to provide Acked-bys and Reviewed-bys even though the
> patch has now been applied. If possible patch trailers will be updated.
>
> Note that commit hashes shown below are subject to change due to rebase,
> trailer updates or similar. If in doubt, please check the listed branch.
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> branch: vfs.misc
>
> [1/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
>        https://git.kernel.org/vfs/vfs/c/7d7825fde8ba
> [2/2] Revert "mm/filemap: avoid buffered read/write race to read inconsistent data"
>        https://git.kernel.org/vfs/vfs/c/83dfed690b90
Thanks!
-- 
With Best Regards,
Baokun Li


* Re: [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-22 11:14 ` [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() Christian Brauner
  2024-01-22 12:25   ` Baokun Li
@ 2024-01-23 18:56   ` Jan Kara
  2024-01-24  8:06     ` Baokun Li
  2024-01-24 11:20     ` Christian Brauner
  1 sibling, 2 replies; 8+ messages in thread
From: Jan Kara @ 2024-01-23 18:56 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Baokun Li, torvalds, viro, jack, willy, akpm, linux-kernel,
	yi.zhang, yangerkun, yukuai3, linux-fsdevel

On Mon 22-01-24 12:14:52, Christian Brauner wrote:
> On Mon, 22 Jan 2024 17:45:34 +0800, Baokun Li wrote:
> > This patchset follows the linus suggestion to make the i_size_read/write
> > helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
> > in filemap_read() is no longer needed, so it is removed.
> > 
> > Functional tests were performed and no new problems were found.
> > 
> > Here are the results of unixbench tests based on 6.7.0-next-20240118 on
> > arm64, with some degradation in single-threading and some optimization in
> > multi-threading, but overall the impact is not significant.
> > 
> > [...]
> 
> Hm, we can certainly try but I wouldn't rule it out that someone will
> > complain about the "non-significant" degradation in single-threading.
> We'll see. Let that performance bot chew on it for a bit as well.

Yeah, over 5% regression in buffered read/write cost is a bit hard to
swallow. I somewhat wonder why this is so much - maybe people call
i_size_read() without thinking too much and now it becomes an atomic op
on arm? Also LKP tests only on x86 (where these changes are going to be
a no-op) and I'm not sure anybody else runs performance tests on
linux-next, even less so on ARM... So I'm not sure anybody will complain
until this gets into some distro (such as Android).

> But I agree that the smp_load_acquire()/smp_store_release() is clearer
> than the open-coded smp_rmb().

Agreed, conceptually this is nice and it will also silence some KCSAN
warnings about i_size updates vs reads.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-23 18:56   ` Jan Kara
@ 2024-01-24  8:06     ` Baokun Li
  2024-01-24 11:20     ` Christian Brauner
  1 sibling, 0 replies; 8+ messages in thread
From: Baokun Li @ 2024-01-24  8:06 UTC (permalink / raw)
  To: Jan Kara, Christian Brauner
  Cc: torvalds, viro, willy, akpm, linux-kernel, yi.zhang, yangerkun,
	yukuai3, linux-fsdevel, Baokun Li

On 2024/1/24 2:56, Jan Kara wrote:
> On Mon 22-01-24 12:14:52, Christian Brauner wrote:
>> On Mon, 22 Jan 2024 17:45:34 +0800, Baokun Li wrote:
>>> This patchset follows the linus suggestion to make the i_size_read/write
>>> helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
>>> in filemap_read() is no longer needed, so it is removed.
>>>
>>> Functional tests were performed and no new problems were found.
>>>
>>> Here are the results of unixbench tests based on 6.7.0-next-20240118 on
>>> arm64, with some degradation in single-threading and some optimization in
>>> multi-threading, but overall the impact is not significant.
>>>
>>> [...]
>> Hm, we can certainly try but I wouldn't rule it out that someone will
>> complain aobut the "non-significant" degradation in single-threading.
>> We'll see. Let that performance bot chew on it for a bit as well.
> Yeah, over 5% regression in buffered read/write cost is a bit hard to
> swallow. I somewhat wonder why this is so much - maybe people call
> i_size_read() without thinking too much and now it becomes atomic op on
> arm? Also LKP tests only on x86 (where these changes are going to be
> for noop) and I'm not sure anybody else runs performance tests on
> linux-next, even less so on ARM... So not sure anybody will complain until
> this gets into some distro (such as Android).
>
>> But I agree that the smp_load_acquire()/smp_store_release() is clearer
>> than the open-coded smp_rmb().
> Agreed, conceptually this is nice and it will also silence some KCSAN
> warnings about i_size updates vs reads.
>
> 								Honza
Hello Honza!

Are there any other performance tests you'd like to perform?
I can test it on my machine if you have any.

Cheers!
-- 
With Best Regards,
Baokun Li


* Re: [PATCH 0/2] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
  2024-01-23 18:56   ` Jan Kara
  2024-01-24  8:06     ` Baokun Li
@ 2024-01-24 11:20     ` Christian Brauner
  1 sibling, 0 replies; 8+ messages in thread
From: Christian Brauner @ 2024-01-24 11:20 UTC (permalink / raw)
  To: Jan Kara
  Cc: Baokun Li, torvalds, viro, willy, akpm, linux-kernel, yi.zhang,
	yangerkun, yukuai3, linux-fsdevel

On Tue, Jan 23, 2024 at 07:56:22PM +0100, Jan Kara wrote:
> On Mon 22-01-24 12:14:52, Christian Brauner wrote:
> > On Mon, 22 Jan 2024 17:45:34 +0800, Baokun Li wrote:
> > > This patchset follows the linus suggestion to make the i_size_read/write
> > > helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
> > > in filemap_read() is no longer needed, so it is removed.
> > > 
> > > Functional tests were performed and no new problems were found.
> > > 
> > > Here are the results of unixbench tests based on 6.7.0-next-20240118 on
> > > arm64, with some degradation in single-threading and some optimization in
> > > multi-threading, but overall the impact is not significant.
> > > 
> > > [...]
> > 
> > Hm, we can certainly try but I wouldn't rule it out that someone will
> > > complain about the "non-significant" degradation in single-threading.
> > We'll see. Let that performance bot chew on it for a bit as well.
> 
> Yeah, over 5% regression in buffered read/write cost is a bit hard to
> swallow. I somewhat wonder why this is so much - maybe people call
> i_size_read() without thinking too much and now it becomes atomic op on
> arm? Also LKP tests only on x86 (where these changes are going to be
> for noop) and I'm not sure anybody else runs performance tests on
> linux-next, even less so on ARM... So not sure anybody will complain until
> this gets into some distro (such as Android).

The LKP thing does, iirc. We get reports from them quite often, but there's
no way to request a test on a specific branch and get a result back within
some timeframe (1 week would already be great). That's what I'd really like.

Similarly, for the build tests from the Intel build bot, it would be
nice if one could opt in to get notifications that no performance
regression did indeed happen.

> 
> > But I agree that the smp_load_acquire()/smp_store_release() is clearer
> > than the open-coded smp_rmb().
> 
> Agreed, conceptually this is nice and it will also silence some KCSAN
> warnings about i_size updates vs reads.
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

