Linux ATA/IDE development
From: Markus Trippelsdorf <markus@trippelsdorf.de>
To: Tejun Heo <tj@kernel.org>
Cc: Michael Tokarev <mjt@tls.msk.ru>,
	Robert Hancock <hancockrwd@gmail.com>,
	Jeff Garzik <jgarzik@pobox.com>,
	linux-ide@vger.kernel.org
Subject: Re: libata default FUA support
Date: Wed, 2 Mar 2011 18:29:40 +0100
Message-ID: <20110302172940.GA1644@gentoo.trippels.de>
In-Reply-To: <20110302085823.GI19669@htj.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 3946 bytes --]

On 2011.03.02 at 09:58 +0100, Tejun Heo wrote:
> On Wed, Mar 02, 2011 at 10:30:57AM +0300, Michael Tokarev wrote:
> > > I believe the way the block layer uses it, basically it only saves the
> > > overhead of one transaction to the drive. It might be significant on
> > > some workloads (especially on high IOPS drives like SSDs) but it's
> > > likely not a huge deal.
> > 
> > One transaction per what?  If it means an extra, especially "large",
> > transaction (like a flush with a wait) per each fsync-like call,
> > that can be a huge deal actually, especially on database-like
> > workloads (lots of small synchronous random writes).
> 
> The way flushes are used by filesystems is that FUA is usually only
> used right after another FLUSH.  i.e. Using FUA replaces the FLUSH +
> commit block write + FLUSH sequence with FLUSH + FUA commit block
> write.  Due to the preceding FLUSH, the cache is already empty, so the
> only difference between WRITE + FLUSH and FUA WRITE becomes the extra
> command issue overhead, which is usually almost unnoticeable compared
> to the actual IO.
> 
> Another thing is that with the recent updates to block FLUSH handling,
> using FUA might even be less efficient.  The new implementation
> aggressively merges those commit writes and flushes.  IOW, depending
> on timing, multiple consecutive commit writes can be merged as,
> 
>  FLUSH + commit writes + FLUSH
> 
> or
> 
>  FLUSH + some commit writes + FLUSH + other commit writes + FLUSH
> 
> and so on.
> 
> These merges will happen with fsync heavy workloads where FLUSH
> performance actually matters and, in these scenarios, FUA writes are
> less effective because they put extra ordering restrictions on each
> FUA write.  i.e. With surrounding FLUSHes, the drive is free to reorder
> commit writes to maximize performance; with FUA, the disk has to jump
> around all over the place to execute each command in the exact issue
> order.
> 
> I personally think FUA is a misfeature.  It's a microoptimization with
> shallow benefits even when used properly while putting a much heavier
> restriction on actual IO order, which usually is the slow part.
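
To make the sequences quoted above concrete, here is a toy Python model
that lists the commands a drive would see. It is only a sketch of the
behaviour described in the quoted text, not kernel code, and all
function names are hypothetical.

```python
# Toy model of the command sequences discussed above. Not kernel code;
# every function name here is made up for illustration.

def flush_bracketed(n_commits):
    """n commit writes merged between a single pair of FLUSHes."""
    return ["FLUSH"] + ["WRITE"] * n_commits + ["FLUSH"]


def no_fua_single_commit():
    """One commit without FUA: FLUSH + commit block write + FLUSH."""
    return flush_bracketed(1)


def fua_single_commit():
    """One commit with FUA: the write and trailing FLUSH become one FUA write."""
    return ["FLUSH", "FUA_WRITE"]


if __name__ == "__main__":
    print(no_fua_single_commit())
    print(fua_single_commit())
    print(flush_bracketed(4))
```

For a single commit the FUA variant saves one command (2 instead of 3).
For a merged batch the two FLUSHes are shared by all commit writes, and
the plain writes between them are the ones the drive may reorder.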

Thanks for the detailed information. Just to confirm your point, here are
some benchmark results.

Setup: Seagate ST1500DL003 1.5TB 5900rpm, xfs (delaylog), ffsb
( http://sourceforge.net/projects/ffsb/ ), pure random write benchmark:

1)
Total Results 30sec run, 1 thread, 104*35MB files

             Op Name   Transactions      Trans/sec      % Trans     % Op Weight    Throughput
             =======   ============      =========      =======     ===========    ==========
FUA:         write :   435456            1183.44        100.000%    100.000%       162MB/sec
no FUA:      write :   441600            1243.47        100.000%    100.000%       170MB/sec

System Call Latency statistics in millisecs

                Min             Avg             Max             Total Calls
                ========        ========        ========        ============
[  write]FUA    0.000000        0.070392        5444.638184           435456
[  write]no FUA 0.000000        0.069718        4715.519043           441600

2)
Total Results 240sec run, 2 threads, 104*35MB files
===============
             Op Name   Transactions      Trans/sec      % Trans     % Op Weight    Throughput
             =======   ============      =========      =======     ===========    ==========
FUA:         write :   594944            919.45         100.000%    100.000%       126MB/sec
no FUA:      write :   653824            1097.31        100.000%    100.000%       150MB/sec

System Call Latency statistics in millisecs

                Min             Avg             Max             Total Calls
                ========        ========        ========        ============
[  write]FUA    0.000000        0.812704        13467.903320          594944
[  write]no FUA 0.000000        0.727761        9695.806641           653824
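
The relative difference between the two settings can be worked out
directly from the Trans/sec columns above. A small Python sketch (the
variable and function names are only for illustration):

```python
# Trans/sec figures copied from the two ffsb result tables above.
runs = {
    "1 thread, 30sec": {"fua": 1183.44, "no_fua": 1243.47},
    "2 threads, 240sec": {"fua": 919.45, "no_fua": 1097.31},
}


def gain_percent(fua, no_fua):
    """Trans/sec gained by disabling FUA, in percent of the FUA figure."""
    return (no_fua - fua) / fua * 100.0


results = {name: gain_percent(**r) for name, r in runs.items()}

for name, pct in results.items():
    print(f"{name}: no FUA is {pct:.1f}% faster")
```

This works out to roughly 5% for the single-threaded run and roughly
19% for the two-threaded run.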

-- 
Markus

[-- Attachment #2: random_writes.ffsb --]
[-- Type: text/plain, Size: 968 bytes --]

# Large file random writes.
# 104 files, 35MB per file.

time=240
alignio=1

[filesystem0]
	location=/var/tmp/fs_bench
	num_files=104
	min_filesize=36700160  # 35 MB
	max_filesize=36700160
	reuse=1
[end0]

[threadgroup0]
	num_threads=2

	write_random=1
	write_weight=1

	write_size=1048576  # 1 MB
	write_blocksize=4096

	[stats]
		enable_stats=1
		enable_range=1

		msec_range    0.00      0.01
		msec_range    0.01      0.02
		msec_range    0.02      0.05
		msec_range    0.05      0.10
		msec_range    0.10      0.20
		msec_range    0.20      0.50
		msec_range    0.50      1.00
		msec_range    1.00      2.00
		msec_range    2.00      5.00
		msec_range    5.00     10.00
		msec_range   10.00     20.00
		msec_range   20.00     50.00
		msec_range   50.00    100.00
		msec_range  100.00    200.00
		msec_range  200.00    500.00
		msec_range  500.00   1000.00
		msec_range 1000.00   2000.00
		msec_range 2000.00   5000.00
		msec_range 5000.00  10000.00
	[end]
[end0]
