* [RFC PATCH 0/4] xfs: parallel quota check
@ 2013-11-12 9:29 Jeff Liu
2013-11-12 10:05 ` Jeff Liu
2013-11-12 21:03 ` Dave Chinner
0 siblings, 2 replies; 6+ messages in thread
From: Jeff Liu @ 2013-11-12 9:29 UTC (permalink / raw)
To: xfs@oss.sgi.com
Hi Folks,
A few months ago we received a user report about skipping quota check on the
first mount/boot; the original discussion thread can be found at:
http://oss.sgi.com/archives/xfs/2013-06/msg00170.html.
As per Dave's suggestion, it should be possible to perform quota check
in parallel; this patch series follows up on that idea.
Sorry for the long delay: I had to spend most of my time dealing with
personal matters over the last few months, and I was afraid I could not
keep up with the review process. Now the nightmare is over, and it's time
to revive this task.
Also, my previous test results on my laptop and a modest desktop could not
convince me that a parallel quota check really provides a benefit over the
current single-threaded one, as both machines ship with slow disks. I even
observed a small performance regression with millions of small files (e.g.
100 bytes), since quota check is IO bound and can additionally be affected
by differences in seek time. With a MacBook Air I bought recently, the
change shows a significant difference.
tests:
- create files via fs_mark (empty file/100 byte small file)
fs_mark -k -S 0 -n 100000 -D 100 -N 1000 -d /xfs -t [10|20|30|50] -s [0|100]
- mount -ouquota,pquota /dev/sdaX /storage
- run each test 5 times and take the average
test environment:
- laptop: i5-3320M CPU 4 cores, 8G ram, normal SATA disk
results of empty files via time:
- # of file(million) default patched
1 real 1m12.0661s real 1m8.328s
user 0m0.000s user 0m0.000s
sys 0m43.692s sys 0m0.048s
2 real 1m43.907s real 1m16.221s
user 0m0.004s user 0m0.000s
sys 1m32.968s sys 0m0.065s
3 real 2m36.632s real 1m48.011s
user 0m0.000s user 0m0.002s
sys 2m23.501s sys 0m0.094s
5 real 4m20.266s real 3m0.145s
user 0m0.000s user 0m0.002s
sys 3m56.264s sys 0m0.092s
results of 100 bytes files via time:
- # of file(million) default patched
1 real 1m34.492s real 1m51.268s
user 0m0.008s user 0m0.008s
sys 0m54.432s sys 0m0.236s
3 real 3m26.687s real 3m16.152s
user 0m0.000s user 0m0.000s
sys 2m23.144s sys 0m0.088s
So with empty files, the performance still looks good, but with small files
this change introduced a small regression on very slow storage. I guess
this is caused by disk seeks, as the data blocks are allocated spread over
the disk.
In order to get more representative results, I asked a friend to help run
this test on a server; the results are shown below.
test environment
- 16 cores, 25G RAM, normal SATA disk, but the XFS resides on a loop device.
result of 100 bytes files via time:
- # of file(million) default patched
1 real 0m19.015s real 0m16.238s
user 0m0.004s user 0m0.002s
sys 0m4.358s sys 0m0.030s
2 real 0m34.106s real 0m28.300s
user 0m0.012s user 0m0.002s
sys 0m8.820s sys 0m0.035s
3 real 0m53.716s real 0m46.390s
user 0m0.002s user 0m0.005s
sys 0m13.396s sys 0m0.023s
5 real 2m26.361s real 2m17.415s
user 0m0.004s user 0m0.004s
sys 0m22.188s sys 0m0.023s
In this case, there is no regression, although there is no noticeable
improvement either. :(
test environment
- Macbook Air i7-4650U with SSD, 8G ram
- # of file(million) default patched
1 real 0m6.367s real 0m1.972s
user 0m0.008s user 0m0.000s
sys 0m2.614s sys 0m0.008s
2 real 0m15.221s real 0m3.772s
user 0m0.000s user 0m0.000s
sys 0m0.007s sys 0m6.269s
5 real 0m36.036s real 0m8.902s
user 0m0.000s user 0m0.002s
sys 0m14.025s sys 0m0.006s
Btw, the current implementation has a defect: the duplicated code in
[patch 0/4] xfs: implement parallism quota check at mount time. Maybe it's
better to introduce a new function xfs_bulkstat_ag() which can be used to
bulkstat inodes per AG, so that it could be shared by the above patch while
adjusting dquot usage per AG, i.e. xfs_qm_dqusage_adjust_perag().
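The per-AG split described above can be sketched in userspace (Python stands in for the kernel code here; the data and all names in this sketch are made up purely for illustration, not taken from the patches): each worker "bulkstats" the inodes of one allocation group and accumulates per-user block usage, merging into the shared table under a lock.

```python
import threading
from collections import defaultdict

# Fake per-AG inode tables as (uid, blocks) pairs -- illustrative only.
AGS = [
    [(0, 10), (1000, 5), (1000, 7)],   # AG 0
    [(0, 2), (1001, 20)],              # AG 1
    [(1000, 1), (1001, 3), (0, 4)],    # AG 2
]

usage = defaultdict(int)        # uid -> total blocks (shared state)
usage_lock = threading.Lock()   # protects the shared table

def scan_ag(inodes):
    # Accumulate locally first, then merge once under the lock,
    # keeping the critical section short.
    local = defaultdict(int)
    for uid, blocks in inodes:
        local[uid] += blocks
    with usage_lock:
        for uid, blocks in local.items():
            usage[uid] += blocks

def parallel_quotacheck():
    # One worker per AG, all scanning concurrently.
    threads = [threading.Thread(target=scan_ag, args=(ag,)) for ag in AGS]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(usage)
```

The point of the local-then-merge step is that the shared usage table is touched only once per AG; without the lock, concurrent updates to the same user's counter could be lost.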
As usual, critism and comments are both welcome!
Thanks,
-Jeff
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [RFC PATCH 0/4] xfs: parallel quota check 2013-11-12 9:29 [RFC PATCH 0/4] xfs: parallel quota check Jeff Liu @ 2013-11-12 10:05 ` Jeff Liu 2013-11-12 21:03 ` Dave Chinner 1 sibling, 0 replies; 6+ messages in thread From: Jeff Liu @ 2013-11-12 10:05 UTC (permalink / raw) To: xfs@oss.sgi.com Sorry, here have two typos. On 11/12, 2013 17:29 PM, Jeff Liu wrote: > Hi Folks, > > We have a user report about skip quota check on first mount/boot several > monthes ago, the original discussion thread can be found at: > http://oss.sgi.com/archives/xfs/2013-06/msg00170.html. > > As per Dave's suggestion, it would be possible to perform quota check > in parallel, this patch series is just trying to follow up that idea. > > Sorry for the too long day as I have to spent most of time dealing with > personl things in the last few monthes, I was afraid I can not quickly > follow up the review procedure. Now the nightmare is over, it's time to > revive this task. > > Also, my previous test results on my laptop and a poor desktop can not > convience me that performs parallism quota check can really get benefits > compare to the current single thread as both machines are shipped with > slow disks, I even observed a little performance regression with millions > of small files(e.g, 100 bytes) as quota check is IO bound, additionaly, > it could affected by the seek time differences. Now with a Mackbook Air > I bought recently, it can show significant difference. 
> > tests: > - create files via fs_mark (empty file/100 byte small file) > fs_mark -k -S 0 -n 100000 -D 100 -N 1000 -d /xfs -t [10|20|30|50] -s [0|100] > - mount -ouquota,pquota /dev/sdaX /storage > - run each test for 5 times and figure out the average value > > test environment: > - laptop: i5-3320M CPU 4 cores, 8G ram, normal SATA disk > > results of empty files via time: > - # of file(million) default patched > 1 real 1m12.0661s real 1m8.328s > user 0m0.000s user 0m0.000s > sys 0m43.692s sys 0m0.048s > > 2 real 1m43.907s real 1m16.221s > user 0m0.004s user 0m0.000s > sys 1m32.968s sys 0m0.065s > > 3 real 2m36.632s real 1m48.011s > user 0m0.000s user 0m0.002s > sys 2m23.501s sys 0m0.094s > > 5 real 4m20.266s real 3m0.145s > user 0m0.000s user 0m0.002s > sys 3m56.264s sys 0m0.092s > > results of 100 bytes files via time: > - # of file(million) default patched > 1 real 1m34.492 real 1m51.268s > user 0m0.008s user 0m0.008.s > sys 0m54.432s sys 0m0.236s > > 3 real 3m26.687s real 3m16.152s > user 0m0.000s user 0m0.000s > sys 2m23.144s sys 0m0.088s > > So with emtpy files, the performance still looks good but with small files, > this change introduced a little regression on very slow storage. I guess > this is caused by disk seek as data blocks allocated and spreads over the > disk. > > In order to get some more reasonable results, I ask a friend helping > run this test on a server which were shown as following. > > test environment > - 16core, 25G ram, normal SATA disk, but the XFS is resides on a loop dev. 
> > result of 100 bytes files via time: > - # of file(million) default patched > 1 real 0m19.015s real 0m16.238s > user 0m0.004s user 0m0.002s > sys 0m4.358s sys 0m0.030s > > 2 real 0m34.106s real 0m28.300s > user 0m0.012s user 0m0.002s > sys 0m8.820s sys 0m0.035s > > 3 real 0m53.716s real 46.390s real 0m46.390s > user 0m0.002s user 0m0.005s > sys 0m13.396s sys 0m0.023s > > 5 real 2m26.361s real 2m17.415s > user 0m0.004s user 0m0.004s > sys 0m22.188s sys 0m0.023s > > In this case, there is no regression although there is no noticeable > improvements. :( > > test environment > - Macbook Air i7-4650U with SSD, 8G ram > > - # of file(million) default patched > 1 real 0m6.367s real 0m1.972s > user 0m0.008s user 0m0.000s > sys 0m2.614s sys 0m0.008s > > 2 real 0m3.772s real 0m15.221s real 0m15.221s real 0m3.772s > user 0m0.000s user 0m0.000s > sys 0m0.007s sys 0m6.269s > > 5 real 0m36.036s real 0m8.902s > user 0m0.000s user 0m0.002s > sys 0m14.025s sys 0m0.006s > > > Btw, The current implementation has a defeat considering the duplicated > code at [patch 0/4] xfs: implement parallism quota check at mount time. > Maybe it's better to introduce a new function xfs_bulkstat_ag() which can > be used to bulkstat inodes per ag, hence it could shared at above patch while > adjusting dquota usage per ag, i.e, xfs_qm_dqusage_adjust_perag(). > > As usual, critism and comments are both welcome! > > Thanks, > -Jeff > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC PATCH 0/4] xfs: parallel quota check 2013-11-12 9:29 [RFC PATCH 0/4] xfs: parallel quota check Jeff Liu 2013-11-12 10:05 ` Jeff Liu @ 2013-11-12 21:03 ` Dave Chinner 2013-11-13 2:27 ` Jeff Liu 1 sibling, 1 reply; 6+ messages in thread From: Dave Chinner @ 2013-11-12 21:03 UTC (permalink / raw) To: Jeff Liu; +Cc: xfs@oss.sgi.com On Tue, Nov 12, 2013 at 05:29:15PM +0800, Jeff Liu wrote: > Hi Folks, > > We have a user report about skip quota check on first mount/boot several > monthes ago, the original discussion thread can be found at: > http://oss.sgi.com/archives/xfs/2013-06/msg00170.html. > > As per Dave's suggestion, it would be possible to perform quota check > in parallel, this patch series is just trying to follow up that idea. > > Sorry for the too long day as I have to spent most of time dealing with > personl things in the last few monthes, I was afraid I can not quickly > follow up the review procedure. Now the nightmare is over, it's time to > revive this task. > > Also, my previous test results on my laptop and a poor desktop can not > convience me that performs parallism quota check can really get benefits > compare to the current single thread as both machines are shipped with > slow disks, I even observed a little performance regression with millions > of small files(e.g, 100 bytes) as quota check is IO bound, additionaly, > it could affected by the seek time differences. Now with a Mackbook Air > I bought recently, it can show significant difference. Results look good - they definitely point out that we can improve the situation here. > In order to get some more reasonable results, I ask a friend helping > run this test on a server which were shown as following. > > test environment > - 16core, 25G ram, normal SATA disk, but the XFS is resides on a loop dev. .... > > In this case, there is no regression although there is no noticeable > improvements. 
:( Which is no surprise - there isn't any extra IO parallelism that can be extracted from a single spindle.... > test environment > - Macbook Air i7-4650U with SSD, 8G ram > > - # of file(million) default patched > 1 real 0m6.367s real 0m1.972s > user 0m0.008s user 0m0.000s > sys 0m2.614s sys 0m0.008s > > 2 real 0m15.221s real 0m3.772s > user 0m0.000s user 0m0.000s > sys 0m6.269s sys 0m0.007s > > 5 real 0m36.036s real 0m8.902s > user 0m0.000s user 0m0.002s > sys 0m14.025s sys 0m0.006s But a SSD or large raid array does have unused IO parallelism we can exploit. ;) Note that there is also the possibility of applying too much parallelism for the underlying storage (think of a filesystem with hundreds of AGs on a limited number of spindles) and hence causing degradations due to seeking. Hence it might be worthwhile to limit the number of AGs being scanned concurrently... > Btw, The current implementation has a defeat considering the duplicated > code at [patch 0/4] xfs: implement parallism quota check at mount time. > Maybe it's better to introduce a new function xfs_bulkstat_ag() which can > be used to bulkstat inodes per ag, hence it could shared at above patch while > adjusting dquota usage per ag, i.e, xfs_qm_dqusage_adjust_perag(). Right, there are uses for AG-based parallelism of bulkstat for userspace, so exposing single AG scans via the bulkstat ioctl is something I've been intending to do for some time. Hence I'd much prefer to see xfs_bulkstat_ag() to be implemented and then the quotacheck code converted to use it rather than duplicating the algorithm and code specifically to parallelise quotacheck. I like the factoring of the bulkstat code (about time we did that), but I think the factored functions should remain in xfs-itable.c with the rest of the bulkstat code for now... 
Also, there's a race condition you haven't handled in the quotacheck code: xfs_qm_quotacheck_dqadjust() can now be called concurrently on a dquot from different threads to update the same dquot, and there's no locking of the dquot to prevent this. As to the workqueues for threading, it seems overly complex. You could create a permanent workqueue in xfs_init_workqueues() for this, and you can use flush_workqueue() to execute and wait for all the per-ag scans to complete once they have been queued. This gets rid of all the lists and completions from the code. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 6+ messages in thread
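Dave's workqueue suggestion above can be illustrated with a userspace analogue (Python here; the function names and return values are invented for the sketch): a long-lived pool stands in for a permanent kernel workqueue, the per-AG scans are submitted to it, and a single wait for all futures stands in for flush_workqueue() -- no per-job lists or completions are needed.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# "Permanent workqueue": created once, reused, never torn down per mount.
POOL = ThreadPoolExecutor(max_workers=4)

def scan_ag(agno):
    # Placeholder for the real per-AG inode scan; returns a dummy count.
    return agno * 100

def quotacheck(nr_ags):
    # Queue every per-AG scan, then wait for all of them at once,
    # analogous to queue_work() followed by flush_workqueue().
    futures = [POOL.submit(scan_ag, agno) for agno in range(nr_ags)]
    wait(futures)
    return sum(f.result() for f in futures)
```

Capping the pool size also gives a natural place to bound the parallelism, per Dave's point about not over-driving a limited number of spindles.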
* Re: [RFC PATCH 0/4] xfs: parallel quota check 2013-11-12 21:03 ` Dave Chinner @ 2013-11-13 2:27 ` Jeff Liu 2013-11-13 5:40 ` Dave Chinner 0 siblings, 1 reply; 6+ messages in thread From: Jeff Liu @ 2013-11-13 2:27 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs@oss.sgi.com Thanks for your quick response! On 11/13 2013 05:03 PM, Dave Chinner wrote: > On Tue, Nov 12, 2013 at 05:29:15PM +0800, Jeff Liu wrote: >> Hi Folks, >> >> We have a user report about skip quota check on first mount/boot several >> monthes ago, the original discussion thread can be found at: >> http://oss.sgi.com/archives/xfs/2013-06/msg00170.html. >> >> As per Dave's suggestion, it would be possible to perform quota check >> in parallel, this patch series is just trying to follow up that idea. >> >> Sorry for the too long day as I have to spent most of time dealing with >> personl things in the last few monthes, I was afraid I can not quickly >> follow up the review procedure. Now the nightmare is over, it's time to >> revive this task. >> >> Also, my previous test results on my laptop and a poor desktop can not >> convience me that performs parallism quota check can really get benefits >> compare to the current single thread as both machines are shipped with >> slow disks, I even observed a little performance regression with millions >> of small files(e.g, 100 bytes) as quota check is IO bound, additionaly, >> it could affected by the seek time differences. Now with a Mackbook Air >> I bought recently, it can show significant difference. > > Results look good - they definitely point out that we can improve > the situation here. > >> In order to get some more reasonable results, I ask a friend helping >> run this test on a server which were shown as following. >> >> test environment >> - 16core, 25G ram, normal SATA disk, but the XFS is resides on a loop dev. > .... >> >> In this case, there is no regression although there is no noticeable >> improvements. 
:( > > Which is no surprise - there isn't any extra IO parallelism that can > be extracted from a single spindle.... > >> test environment >> - Macbook Air i7-4650U with SSD, 8G ram >> >> - # of file(million) default patched >> 1 real 0m6.367s real 0m1.972s >> user 0m0.008s user 0m0.000s >> sys 0m2.614s sys 0m0.008s >> >> 2 real 0m15.221s real 0m3.772s >> user 0m0.000s user 0m0.000s >> sys 0m6.269s sys 0m0.007s >> >> 5 real 0m36.036s real 0m8.902s >> user 0m0.000s user 0m0.002s >> sys 0m14.025s sys 0m0.006s > > But a SSD or large raid array does have unused IO parallelism we can > exploit. ;) > > Note that there is also the possibility of applying too much > parallelism for the underlying storage (think of a filesystem with > hundreds of AGs on a limited number of spindles) and hence causing > degradations due to seeking. Hence it might be worthwhile to limit > the number of AGs being scanned concurrently... Ok, maybe it could be a new mount option to let user decide how to deal with it in this situation, let me think it over. > >> Btw, The current implementation has a defeat considering the duplicated >> code at [patch 0/4] xfs: implement parallism quota check at mount time. >> Maybe it's better to introduce a new function xfs_bulkstat_ag() which can >> be used to bulkstat inodes per ag, hence it could shared at above patch while >> adjusting dquota usage per ag, i.e, xfs_qm_dqusage_adjust_perag(). > > Right, there are uses for AG-based parallelism of bulkstat for > userspace, so exposing single AG scans via the bulkstat ioctl is > something I've been intending to do for some time. Hence I'd much > prefer to see xfs_bulkstat_ag() to be implemented and then the > quotacheck code converted to use it rather than duplicating the > algorithm and code specifically to parallelise quotacheck. Thanks for the confirmation, this change will be reflected in the next round of post. 
> > I like the factoring of the bulkstat code (about time we did that), > but I think the factored functions should remain in xfs-itable.c > with the rest of the bulkstat code for now... > > Also, there's a race condition you haven't handled in the quotacheck > code: xfs_qm_quotacheck_dqadjust() can now be called concurrently on > a dquot from different threads to update the same dquot, and there's > no locking of the dquot to prevent this. Ah, will fix it, why I have not found this problem in the previous test? :-P > > As to the workqueues for threading, it seems overly complex. You > could create a permanent workqueue in xfs_init_workqueues() for > this, and you can use flush_workqueue() to execute and wait for all > the per-ag scans to complete once they have been queued. This gets > rid of all the lists and completions from the code. At that time, I thought the workqueue should be destroyed once the quota check procedure is complete as it only run once at mount time, will take care of it. Thanks, -Jeff _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC PATCH 0/4] xfs: parallel quota check 2013-11-13 2:27 ` Jeff Liu @ 2013-11-13 5:40 ` Dave Chinner 2013-11-13 7:32 ` Jeff Liu 0 siblings, 1 reply; 6+ messages in thread From: Dave Chinner @ 2013-11-13 5:40 UTC (permalink / raw) To: Jeff Liu; +Cc: xfs@oss.sgi.com On Wed, Nov 13, 2013 at 10:27:48AM +0800, Jeff Liu wrote: > Thanks for your quick response! > > On 11/13 2013 05:03 PM, Dave Chinner wrote: > > On Tue, Nov 12, 2013 at 05:29:15PM +0800, Jeff Liu wrote: > >> Hi Folks, > >> > >> We have a user report about skip quota check on first mount/boot several > >> monthes ago, the original discussion thread can be found at: > >> http://oss.sgi.com/archives/xfs/2013-06/msg00170.html. > >> > >> As per Dave's suggestion, it would be possible to perform quota check > >> in parallel, this patch series is just trying to follow up that idea. > >> > >> Sorry for the too long day as I have to spent most of time dealing with > >> personl things in the last few monthes, I was afraid I can not quickly > >> follow up the review procedure. Now the nightmare is over, it's time to > >> revive this task. > >> > >> Also, my previous test results on my laptop and a poor desktop can not > >> convience me that performs parallism quota check can really get benefits > >> compare to the current single thread as both machines are shipped with > >> slow disks, I even observed a little performance regression with millions > >> of small files(e.g, 100 bytes) as quota check is IO bound, additionaly, > >> it could affected by the seek time differences. Now with a Mackbook Air > >> I bought recently, it can show significant difference. > > > > Results look good - they definitely point out that we can improve > > the situation here. > > > >> In order to get some more reasonable results, I ask a friend helping > >> run this test on a server which were shown as following. > >> > >> test environment > >> - 16core, 25G ram, normal SATA disk, but the XFS is resides on a loop dev. > > .... 
> >> > >> In this case, there is no regression although there is no noticeable > >> improvements. :( > > > > Which is no surprise - there isn't any extra IO parallelism that can > > be extracted from a single spindle.... > > > >> test environment > >> - Macbook Air i7-4650U with SSD, 8G ram > >> > >> - # of file(million) default patched > >> 1 real 0m6.367s real 0m1.972s > >> user 0m0.008s user 0m0.000s > >> sys 0m2.614s sys 0m0.008s > >> > >> 2 real 0m15.221s real 0m3.772s > >> user 0m0.000s user 0m0.000s > >> sys 0m6.269s sys 0m0.007s > >> > >> 5 real 0m36.036s real 0m8.902s > >> user 0m0.000s user 0m0.002s > >> sys 0m14.025s sys 0m0.006s > > > > But a SSD or large raid array does have unused IO parallelism we can > > exploit. ;) > > > > Note that there is also the possibility of applying too much > > parallelism for the underlying storage (think of a filesystem with > > hundreds of AGs on a limited number of spindles) and hence causing > > degradations due to seeking. Hence it might be worthwhile to limit > > the number of AGs being scanned concurrently... > Ok, maybe it could be a new mount option to let user decide how to deal > with it in this situation, let me think it over. I'd prefer that we just do it automatically. There's a diminishing return curve that adding more parallelism will result in, so as long as we don't go too far down the tail of the curve it should not be a problem. Also, keep in mind if you issue too much readahead to a block device and the queue becomes read congested, it will just drop new readahead attempts. This is another reason for limiting the parallelism and hence the amount of readahead we issue.... > >> Btw, The current implementation has a defeat considering the duplicated > >> code at [patch 0/4] xfs: implement parallism quota check at mount time. 
> >> Maybe it's better to introduce a new function xfs_bulkstat_ag() which can > >> be used to bulkstat inodes per ag, hence it could shared at above patch while > >> adjusting dquota usage per ag, i.e, xfs_qm_dqusage_adjust_perag(). > > > > Right, there are uses for AG-based parallelism of bulkstat for > > userspace, so exposing single AG scans via the bulkstat ioctl is > > something I've been intending to do for some time. Hence I'd much > > prefer to see xfs_bulkstat_ag() to be implemented and then the > > quotacheck code converted to use it rather than duplicating the > > algorithm and code specifically to parallelise quotacheck. > Thanks for the confirmation, this change will be reflected in the next > round of post. > > > > > I like the factoring of the bulkstat code (about time we did that), > > but I think the factored functions should remain in xfs-itable.c > > with the rest of the bulkstat code for now... > > > > Also, there's a race condition you haven't handled in the quotacheck > > code: xfs_qm_quotacheck_dqadjust() can now be called concurrently on > > a dquot from different threads to update the same dquot, and there's > > no locking of the dquot to prevent this. > Ah, will fix it, why I have not found this problem in the previous test? :-P Because it is simply assumed that the quotacheck gets the calculation correct? i.e. the calculated values are not actually validated anywhere except in xfstests that have limited scope for quotacheck parallelism... > > As to the workqueues for threading, it seems overly complex. You > > could create a permanent workqueue in xfs_init_workqueues() for > > this, and you can use flush_workqueue() to execute and wait for all > > the per-ag scans to complete once they have been queued. This gets > > rid of all the lists and completions from the code. > At that time, I thought the workqueue should be destroyed once the quota > check procedure is complete as it only run once at mount time, will take care > of it. 
I understand. Having a workqueue sit around idle does not take up any resources, so I don't think we need the complexity of making them dynamic... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC PATCH 0/4] xfs: parallel quota check 2013-11-13 5:40 ` Dave Chinner @ 2013-11-13 7:32 ` Jeff Liu 0 siblings, 0 replies; 6+ messages in thread From: Jeff Liu @ 2013-11-13 7:32 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs@oss.sgi.com On 11/13/2013 01:40 PM, Dave Chinner wrote: > On Wed, Nov 13, 2013 at 10:27:48AM +0800, Jeff Liu wrote: >> Thanks for your quick response! >> >> On 11/13 2013 05:03 PM, Dave Chinner wrote: >>> On Tue, Nov 12, 2013 at 05:29:15PM +0800, Jeff Liu wrote: >>>> Hi Folks, >>>> >>>> We have a user report about skip quota check on first mount/boot several >>>> monthes ago, the original discussion thread can be found at: >>>> http://oss.sgi.com/archives/xfs/2013-06/msg00170.html. >>>> >>>> As per Dave's suggestion, it would be possible to perform quota check >>>> in parallel, this patch series is just trying to follow up that idea. >>>> >>>> Sorry for the too long day as I have to spent most of time dealing with >>>> personl things in the last few monthes, I was afraid I can not quickly >>>> follow up the review procedure. Now the nightmare is over, it's time to >>>> revive this task. >>>> >>>> Also, my previous test results on my laptop and a poor desktop can not >>>> convience me that performs parallism quota check can really get benefits >>>> compare to the current single thread as both machines are shipped with >>>> slow disks, I even observed a little performance regression with millions >>>> of small files(e.g, 100 bytes) as quota check is IO bound, additionaly, >>>> it could affected by the seek time differences. Now with a Mackbook Air >>>> I bought recently, it can show significant difference. >>> >>> Results look good - they definitely point out that we can improve >>> the situation here. >>> >>>> In order to get some more reasonable results, I ask a friend helping >>>> run this test on a server which were shown as following. 
>>>> >>>> test environment >>>> - 16core, 25G ram, normal SATA disk, but the XFS is resides on a loop dev. >>> .... >>>> >>>> In this case, there is no regression although there is no noticeable >>>> improvements. :( >>> >>> Which is no surprise - there isn't any extra IO parallelism that can >>> be extracted from a single spindle.... >>> >>>> test environment >>>> - Macbook Air i7-4650U with SSD, 8G ram >>>> >>>> - # of file(million) default patched >>>> 1 real 0m6.367s real 0m1.972s >>>> user 0m0.008s user 0m0.000s >>>> sys 0m2.614s sys 0m0.008s >>>> >>>> 2 real 0m15.221s real 0m3.772s >>>> user 0m0.000s user 0m0.000s >>>> sys 0m6.269s sys 0m0.007s >>>> >>>> 5 real 0m36.036s real 0m8.902s >>>> user 0m0.000s user 0m0.002s >>>> sys 0m14.025s sys 0m0.006s >>> >>> But a SSD or large raid array does have unused IO parallelism we can >>> exploit. ;) >>> >>> Note that there is also the possibility of applying too much >>> parallelism for the underlying storage (think of a filesystem with >>> hundreds of AGs on a limited number of spindles) and hence causing >>> degradations due to seeking. Hence it might be worthwhile to limit >>> the number of AGs being scanned concurrently... >> Ok, maybe it could be a new mount option to let user decide how to deal >> with it in this situation, let me think it over. > > I'd prefer that we just do it automatically. There's a diminishing > return curve that adding more parallelism will result in, so as long > as we don't go too far down the tail of the curve it should not be a > problem. > > Also, keep in mind if you issue too much readahead to a block device > and the queue becomes read congested, it will just drop new > readahead attempts. This is another reason for limiting the > parallelism and hence the amount of readahead we issue.... Yup. Peviously, I also tried to remove the readahead mechanism to measure the results. 
But since those tests only run against 4 AG by default, so neither benefits nor read congested could be observed. I definitely would bear this in mind, thanks for the teaching! > >>>> Btw, The current implementation has a defeat considering the duplicated >>>> code at [patch 0/4] xfs: implement parallism quota check at mount time. >>>> Maybe it's better to introduce a new function xfs_bulkstat_ag() which can >>>> be used to bulkstat inodes per ag, hence it could shared at above patch while >>>> adjusting dquota usage per ag, i.e, xfs_qm_dqusage_adjust_perag(). >>> >>> Right, there are uses for AG-based parallelism of bulkstat for >>> userspace, so exposing single AG scans via the bulkstat ioctl is >>> something I've been intending to do for some time. Hence I'd much >>> prefer to see xfs_bulkstat_ag() to be implemented and then the >>> quotacheck code converted to use it rather than duplicating the >>> algorithm and code specifically to parallelise quotacheck. >> Thanks for the confirmation, this change will be reflected in the next >> round of post. >> >>> >>> I like the factoring of the bulkstat code (about time we did that), >>> but I think the factored functions should remain in xfs-itable.c >>> with the rest of the bulkstat code for now... >>> >>> Also, there's a race condition you haven't handled in the quotacheck >>> code: xfs_qm_quotacheck_dqadjust() can now be called concurrently on >>> a dquot from different threads to update the same dquot, and there's >>> no locking of the dquot to prevent this. >> Ah, will fix it, why I have not found this problem in the previous test? :-P > > Because it is simply assumed that the quotacheck gets the > calculation correct? i.e. the calculated values are not actually > validated anywhere except in xfstests that have limited scope for > quotacheck parallelism... Yup, I only verified that results via xfs_quota -xc 'report -[i|h]' before. > >>> As to the workqueues for threading, it seems overly complex. 
You >>> could create a permanent workqueue in xfs_init_workqueues() for >>> this, and you can use flush_workqueue() to execute and wait for all >>> the per-ag scans to complete once they have been queued. This gets >>> rid of all the lists and completions from the code. >> At that time, I thought the workqueue should be destroyed once the quota >> check procedure is complete as it only run once at mount time, will take care >> of it. > > I understand. Having a workqueue sit around idle does not take up > any resources, so I don't think we need the complexity of making > them dynamic... Agree, that's sounds like a trade-off to me. :) Thanks, -Jeff _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 6+ messages in thread