* better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
@ 2012-11-08 12:51 Linda Walsh
From: Linda Walsh @ 2012-11-08 12:51 UTC (permalink / raw)
To: xfs-oss
I was watching xosview output during a daily run of xfs_fsr. It took off
as normal, doing its usual thing of allocating the remaining buffer
memory during its run; as usual, memory usage would climb to the
machine's limit, then the kernel's memory reclaim would kick in for a few
hundred ms and take usage down from 48G to maybe 30+G.
During this time, I'd see xfs_fsr stop or pause for a bit, then resume;
used-buffer memory looked like slow ramps followed by sharp drops, with
xfs_fsr showing I/O gaps as the ramps peaked.
I wondered why all this memory reclaiming was lumped together and thought
to try using posix_fadvise() calls in xfs_fsr to tell the kernel which
data was no longer needed.
It seems to have increased overall perf, and there no longer seems to be
any contention or I/O drops... no more ramping either... memory usage
goes to the top and stays there, with memory reclaim keeping pace with
fsr's usage.
If it doesn't look like it would cause problems (seems to reduce memory
manager thrashing), maybe you'd like to add the patch to the source tree?
It's *WAY* trivial:
-------------------------
--- fsr/xfs_fsr.c 2011-02-11 02:42:15.000000000 -0800
+++ fsr/xfs_fsr.c 2012-11-08 04:10:18.608718948 -0800
@@ -1163,6 +1163,10 @@
}
unlink(tname);
+	posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
+	posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
+
+	posix_fadvise(tfd, 0, 0, POSIX_FADV_DONTNEED);
/* Setup extended attributes */
if (fsr_setup_attr_fork(fd, tfd, statp) != 0) {
fsrprintf(_("failed to set ATTR fork on tmp: %s:\n"), tname);
-------
That's it... set the input file for sequential access and no reuse after
the read, and set the output descriptor to tell the kernel I won't need
the data that will be written to it.
It hasn't been *extensively* tested... I just dropped it in and it seems
to behave well... but it's simple and purely advisory, so the risk should
be minimal (and short-term testing shows it to be beneficial)...
It doesn't increase or decrease memory usage; it just makes allocating
and freeing it in the kernel a bit smoother...
Hope you find it useful?
Maybe other utils might benefit as well, dunno... but fsr was the most
obvious place to see the changes.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
From: Linda Walsh @ 2012-11-08 20:30 UTC (permalink / raw)
To: xfs-oss
FWIW, the benefit probably comes from the read file, since the output
file is written with direct I/O and I can't see how it would make a
difference there.
Another thing I noticed: when xfs_fsr _exits_, ALL of the space it had
used for cached file reads gets freed, whereas before, it just stayed in
the buffer cache and wasn't released until the space was needed.
Linda Walsh wrote:
> I wondered why all this memory reclaiming was lumped together and
> thought to try using posix_fadvise() calls in xfs_fsr to tell the
> kernel which data was no longer needed.
* Re: better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
From: Dave Chinner @ 2012-11-08 21:29 UTC (permalink / raw)
To: Linda Walsh; +Cc: xfs-oss
On Thu, Nov 08, 2012 at 04:51:11AM -0800, Linda Walsh wrote:
> I was watching xosview output during a daily run of xfs_fsr. It took
> off as normal, doing its usual thing of allocating the remaining
> buffer memory during its run; as usual, memory usage would climb to
> the machine's limit, then the kernel's memory reclaim would kick in
> for a few hundred ms and take usage down from 48G to maybe 30+G.
>
> During this time, I'd see xfs_fsr stop or pause for a bit, then
> resume; used-buffer memory looked like slow ramps followed by sharp
> drops, with xfs_fsr showing I/O gaps as the ramps peaked.
It's not xfs_fsr that is causing this problem. It's other
applications, I think, generating memory pressure and xfs_fsr is the
unfortunate victim...
> I wondered why all this memory reclaiming was lumped together and
> thought to try using posix_fadvise() calls in xfs_fsr to tell the
> kernel which data was no longer needed.
posix_fadvise won't affect fsr at all, because:
int openopts = O_CREAT|O_EXCL|O_RDWR|O_DIRECT;
it uses direct IO, and hence bypasses the page cache. posix_fadvise
only affects buffered IO behaviour (i.e. via the page cache), and
hence will have no effect on IO issued by xfs_fsr.
> It doesn't increase or decrease memory usage; it just makes
> allocating and freeing it in the kernel a bit smoother...
I'd say that's a subjective observation rather than an empirical one
- you're expecting it to improve, so you think it did. ;)
> Maybe other utils might benefit as well, dunno... but fsr was the
> most obvious place to see the changes.
Most of the XFS tools that do significant IO use direct IO, so it's
unlikely to make much measurable difference to tool behaviour...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
From: Dave Chinner @ 2012-11-08 21:39 UTC (permalink / raw)
To: Linda Walsh; +Cc: xfs-oss
On Thu, Nov 08, 2012 at 12:30:11PM -0800, Linda Walsh wrote:
> FWIW, the benefit probably comes from the read file, since the output
> file is written with direct I/O and I can't see how it would make a
> difference there.
Hmmm, so it does. I think that's probably the bug that needs to be
fixed, not so much using posix_fadvise....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
From: Linda Walsh @ 2012-11-09 7:10 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs-oss
Dave Chinner wrote:
> On Thu, Nov 08, 2012 at 12:30:11PM -0800, Linda Walsh wrote:
>> FWIW, the benefit probably comes from the read file, since the
>> output file is written with direct I/O and I can't see how it would
>> make a difference there.
>
> Hmmm, so it does. I think that's probably the bug that needs to be
> fixed, not so much using posix_fadvise....
---
Well... using direct I/O might be another way of fixing it...
but I notice that neither the reads nor the writes seem to use an
optimal I/O size that takes RAID alignment into consideration. It
aligns for memory and for 2-4k device alignment, but doesn't seem to
consider minor things like a 64k stripe-unit x 12-wide data width
(768k). If you do direct I/O, you might want to be sure to RAID-align it...
Doing <64k at a time would cause heinous perf... while using the
SEQUENTIAL + READ-ONCE params seems to cause noticeable I/O smoothing
(no dips/valleys on the I/O charts), though I don't know how much (if
any) real performance increase (or decrease) there was, since setting
up exact fragmentation cases would be a pain...
If you do LARGE I/Os on the reads, say 256MB at a time, I don't think
exact alignment will matter that much, but I noticed speed improvements
up to a 1GB buffer size in reads + writes in 'dd' using direct I/O (I
couldn't test larger sizes, as the device driver doesn't seem to allow
anything > 2GB-8k, this on a 64-bit machine; at least I think it's the
device driver, it hasn't been important enough to chase down).
While such large buffers might be bad on a memory-tight machine, on
many 64-bit machines it's well worth the throughput and lower
disk-transfer-time usage. Meanwhile, that posix_fadvise call added on
the read side really does seem to help... Try it, you'll like it! ;-)
(not to say it is the 'best' fix, but it's pretty low cost!)...
> Cheers,
>
> Dave.
* Re: better perf and memory usage for xfs_fsr? Trivial patch against xfstools-3.16 included...
From: Dave Chinner @ 2012-11-09 8:16 UTC (permalink / raw)
To: Linda Walsh; +Cc: xfs-oss
On Thu, Nov 08, 2012 at 11:10:26PM -0800, Linda Walsh wrote:
>
>
> Dave Chinner wrote:
> >On Thu, Nov 08, 2012 at 12:30:11PM -0800, Linda Walsh wrote:
> >> FWIW, the benefit probably comes from the read file, since the
> >> output file is written with direct I/O and I can't see how it
> >> would make a difference there.
> >
> >Hmmm, so it does. I think that's probably the bug that needs to be
> >fixed, not so much using posix_fadvise....
> ---
> Well... using direct I/O might be another way of fixing it...
> but I notice that neither the reads nor the writes seem to use an
> optimal I/O size that takes RAID alignment into consideration. It
> aligns for memory and for 2-4k device alignment, but doesn't seem to
> consider minor things like a 64k stripe-unit x 12-wide data width
> (768k). If you do direct I/O, you might want to be sure to RAID-align it...
Sure, you can get that information from the fs geometry ioctl.
> Doing <64k at a time would cause heinous perf... while using
#define BUFFER_MAX (1<<24)
....
blksz_dio = min(dio.d_maxiosz, BUFFER_MAX - pagesize);
if (argv_blksz_dio != 0)
blksz_dio = min(argv_blksz_dio, blksz_dio);
blksz_dio = (min(statp->bs_size, blksz_dio) / dio_min) * dio_min
so, the buffer size starts at 16MB, and ends up the minimum of the
buffer size and the file size. As can be seen here:
/mnt/test/foo extents=6 can_save=1 tmp=/mnt/test/.fsr4188
DEBUG: fsize=17825792 blsz_dio=16773120 d_min=512 d_max=2147483136 pgsz=4096
So, really, if you want to change that to be stripe width aligned,
you could quite easily do that...
However, if you really wanted to increase fsr throughput, using AIO
and keeping multiple IOs in flight at once would be a much better
option, as it would avoid the serialised read-write-read-write-...
pattern that limits the throughput now...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com