public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: use percpu_counter_compare instead of naive comparing
@ 2015-04-17  2:22 xuw2015
  2015-04-17  4:06 ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: xuw2015 @ 2015-04-17  2:22 UTC (permalink / raw)
  To: xfs; +Cc: George Wang

From: George Wang <xuw2015@gmail.com>

Function percpu_counter_read just return the current counter, regardless of
every cpu's count. This counter can be negative value, which will cause the
checking of "allocated inode counts <= m_maxicount" false positive.
Commit 501ab3238753 "xfs: use generic percpu counters for inode counter
" introduced this problem.
Use the percpu_counter_compare, which will first do stuff in current cpu for
performance; if can not get the result, it will get the exactly counter
based on the count of every cpu.

Signed-off-by: George Wang <xuw2015@gmail.com>
---
 fs/xfs/libxfs/xfs_ialloc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 07349a1..af9e7d2 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -376,8 +376,8 @@ xfs_ialloc_ag_alloc(
 	 */
 	newlen = args.mp->m_ialloc_inos;
 	if (args.mp->m_maxicount &&
-	    percpu_counter_read(&args.mp->m_icount) + newlen >
-							args.mp->m_maxicount)
+	    percpu_counter_compare(&args.mp->m_icount,
+					args.mp->m_maxicount - newlen) > 0)
 		return -ENOSPC;
 	args.minlen = args.maxlen = args.mp->m_ialloc_blks;
 	/*
@@ -1341,8 +1341,8 @@ xfs_dialloc(
 	 * inode.
 	 */
 	if (mp->m_maxicount &&
-	    percpu_counter_read(&mp->m_icount) + mp->m_ialloc_inos >
-							mp->m_maxicount) {
+	    percpu_counter_compare(&mp->m_icount,
+				mp->m_maxicount - mp->m_ialloc_inos) > 0) {
 		noroom = 1;
 		okalloc = 0;
 	}
-- 
1.9.3

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] xfs: use percpu_counter_compare instead of naive comparing
  2015-04-17  2:22 [PATCH] xfs: use percpu_counter_compare instead of naive comparing xuw2015
@ 2015-04-17  4:06 ` Dave Chinner
  2015-04-17  7:15   ` 王旭
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2015-04-17  4:06 UTC (permalink / raw)
  To: xuw2015; +Cc: xfs

On Fri, Apr 17, 2015 at 10:22:24AM +0800, xuw2015@gmail.com wrote:
> From: George Wang <xuw2015@gmail.com>
> 
> Function percpu_counter_read just return the current counter, regardless of
> every cpu's count. This counter can be negative value, which will cause the
> checking of "allocated inode counts <= m_maxicount" false positive.

Have you actually seen this, or is it just theoretical?

> Commit 501ab3238753 "xfs: use generic percpu counters for inode counter
> " introduced this problem.
> Use the percpu_counter_compare, which will first do stuff in current cpu for
> performance; if can not get the result, it will get the exactly counter
> based on the count of every cpu.

That defeats the purpose of using percpu_counter_read() for this
check. We most definitely do not want to lock up the counter twice
for every allocation where we are close to the threshold. We don't
care if we aren't perfectly accurate at the threshold, but we do
care about the overhead of accurately summing the counter as it can
be read hundreds of thousands of times a second.

The correct fix is to use percpu_counter_read_positive(), because in
the majority of cases args.mp->m_maxicount is orders of magnitude
larger (20 million inodes per 100GB of fs space for small filesystems)
than the unaggregated per-cpu counts can cause the sum to go
negative. Hence if it is negative, it may as well be zero because it
makes no difference to the default threshold configurations.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] xfs: use percpu_counter_compare instead of naive comparing
  2015-04-17  4:06 ` Dave Chinner
@ 2015-04-17  7:15   ` 王旭
  2015-04-17  8:15     ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: 王旭 @ 2015-04-17  7:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Fri, Apr 17, 2015 at 12:06 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Apr 17, 2015 at 10:22:24AM +0800, xuw2015@gmail.com wrote:
>> From: George Wang <xuw2015@gmail.com>
>>
>> Function percpu_counter_read just return the current counter, regardless of
>> every cpu's count. This counter can be negative value, which will cause the
>> checking of "allocated inode counts <= m_maxicount" false positive.
>
> Have you actually seen this, or is it just theoretical?

I produced it by running unionmount-testsuites for overlay, based xfs as upper
and lowerdir. I think I can extract the procedure to reproduce it.

>> Commit 501ab3238753 "xfs: use generic percpu counters for inode counter
<snip>
> The correct fix is to use percpu_counter_read_positive(), because in
> the majority of cases args.mp->m_maxicount is orders of magnitude
> larger (20 million inodes per 100GB of fs space for small filesystems)
> than the unaggregated per-cpu counts can cause the sum to go
> negative. Hence if it is negative, it may as well be zero because it
> makes no difference to the default threshold configurations.

Thanks for the explanation, and I agree that totally.
Besides, shall we use the exact count in xfs_fs_counts? Because it's a ioctl
function, not so much sensitive performance.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] xfs: use percpu_counter_compare instead of naive comparing
  2015-04-17  7:15   ` 王旭
@ 2015-04-17  8:15     ` Dave Chinner
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2015-04-17  8:15 UTC (permalink / raw)
  To: 王旭; +Cc: xfs

On Fri, Apr 17, 2015 at 03:15:21PM +0800, 王旭 wrote:
> On Fri, Apr 17, 2015 at 12:06 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Fri, Apr 17, 2015 at 10:22:24AM +0800, xuw2015@gmail.com wrote:
> >> From: George Wang <xuw2015@gmail.com>
> >>
> >> Function percpu_counter_read just return the current counter, regardless of
> >> every cpu's count. This counter can be negative value, which will cause the
> >> checking of "allocated inode counts <= m_maxicount" false positive.
> >
> > Have you actually seen this, or is it just theoretical?
> 
> I produced it by running unionmount-testsuites for overlay, based xfs as upper
> and lowerdir. I think I can extract the procedure to reproduce it.

Inteesting. I've never been able to get the unionmount-testsuit to
actually be able to run, let alone had it find problems.

> >> Commit 501ab3238753 "xfs: use generic percpu counters for inode counter
> <snip>
> > The correct fix is to use percpu_counter_read_positive(), because in
> > the majority of cases args.mp->m_maxicount is orders of magnitude
> > larger (20 million inodes per 100GB of fs space for small filesystems)
> > than the unaggregated per-cpu counts can cause the sum to go
> > negative. Hence if it is negative, it may as well be zero because it
> > makes no difference to the default threshold configurations.
> 
> Thanks for the explanation, and I agree that totally.
> Besides, shall we use the exact count in xfs_fs_counts? Because it's a ioctl
> function, not so much sensitive performance.

Right, but that makes percpu_counter_read_positive is perfect for
ioctls like this: what is reported to userspace is never
perfectly accurate on a busy filesystem. Also, we don't what someone
repeatedly running that ioctl to adversely affect the performance of
the filesystem, which is what would happen if it was an accurate
sum.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-04-17  8:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-17  2:22 [PATCH] xfs: use percpu_counter_compare instead of naive comparing xuw2015
2015-04-17  4:06 ` Dave Chinner
2015-04-17  7:15   ` 王旭
2015-04-17  8:15     ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox