From: "Pádraig Brady" <P@draigBrady.com>
To: Linda Walsh <lkml@tlinx.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: write 'O_DIRECT' file w/odd amount of data: desirable result?
Date: Wed, 23 Feb 2011 10:34:35 +0000
Message-ID: <4D64E2BB.7010000@draigBrady.com>
In-Reply-To: <4D648D7D.7040500@tlinx.org>

On 23/02/11 04:30, Linda Walsh wrote:
> 
> 
> 
> I understand, somewhat, what is happening.
> I have two different utils, 'dd' and mbuffer both of
> which have a 'direct' option to write to disk.
> mbuffer was from my distro with a direct added, which
> is
> 
> I'm not sure if it's truncating the write to the
> lower bound of the sector size or the file-allocation-unit size
> but from a dump, piped into {cat, dd, mbuffer}, the
> output sizes are:
> 
> file            size (bytes)  delta vs. cat (bytes)
> --------------  ------------  ---------------------
> dumptest.cat    5776419696
> dumptest.dd     5776343040    76656
> dumptest.mbuff  5368709120    407710576
> 
> params:
> 
> dd of=dumptest.dd bs=512M oflag=direct
> mbuffer -b 5 -s 512m --direct -f -o dumptest.mbuff
> 
> original file size MOD 512M = 407710576 (answer from mbuff).
> 
> The disk it is being written to is a RAID with a span
> size of 640k (64k io*10 data disks) and formatted to
> indicate that with 'xfs' (stripe-unit=64k, stripe-width=10).
> 
> This gives a 'coincidental' (??) interpretation for
> the output from 'dd', where the original file size MOD
> 640K = 76656  (the amount 'dd' is short).
> 
> Was that a coincidence or a fluke?
> Why didn't 'mbuffer' have the same shortfall -- its shortfall was
> only related to its 512m buffer size.
> 
> In any event, shouldn't the kernel yield the correct answer
> in either case?  It would be consistent with the processor it
> was natively developed on, the x86, where a misaligned memory
> access doesn't cause a fault at the user level, but is handled
> correctly, with a slight penalty to speed for the unaligned
> data parts.
> 
> Shouldn't the linux kernel behave similarly?
> Note, that the mbuffer program indicated an error
> (which didn't help the 'dump' program that had already exited
> with what it thought was a 'success'), though a bit
> cryptic:
> buffer: error: outputThread: error writing to dumptest.mbuff at offset
> 0x140000000: Invalid argument
> 
> summary: 5509 MByte in  8.4 sec - average of  658 MB/s
> mbuffer: warning: error during output to dumptest.mbuff: Invalid argument
> 
> dd indicated no warning or error.
> 
> ----
> I'm not aware of what either did, but no doubt neither
> expected an error in the final write and didn't handle the results
> properly.
> 
> However, wouldn't it be a good thing for linux to do 'the right thing'
> and successfully complete the last partial write (whichever is the case!), even
> if it has to be internally buffered and slightly slowed?  Seems
> correctness of the function should be given preference over the
> adherence to some limitation where possible.
> Software should be as forgiving and tolerant and 'err' to the side of
> least harm -- which I'd argue is getting the data to the disk, NOT
> generating some 'abnormal end' (ABEND) condition that the software can't
> handle.
> I'd think of it like a page-fault of a record not in memory.  The
> remainder of the I/O record is a 'zero-filled' buffer that fills in the
> remainder of the sector while the size of the field is set to the size
> written. ??
> 
> Vanilla kernel 2.6.35-7 x86_64 (SMP PREEMPT)

Note dd will turn off O_DIRECT for the last write
if it's less than the block size.
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=5929322c
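
To illustrate the idea (this is not dd's actual code, just a minimal
Python sketch of the same technique on Linux): before writing a chunk
shorter than the block size, clear O_DIRECT on the descriptor with
fcntl() so the tail goes through the page cache. The names
drop_o_direct and write_chunk are hypothetical, and full blocks are
assumed to come from suitably aligned memory when the file really was
opened with O_DIRECT.

```python
import fcntl
import os


def drop_o_direct(fd: int) -> None:
    """Clear O_DIRECT on an open descriptor, keeping the other status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_DIRECT)


def write_chunk(fd: int, chunk: bytes, block_size: int) -> int:
    """Write one chunk, falling back to buffered I/O for a short final chunk.

    Hypothetical helper: assumes the caller passes aligned full blocks
    when the descriptor was opened with O_DIRECT.
    """
    if len(chunk) < block_size:
        # A partial block would be rejected with EINVAL under O_DIRECT,
        # so switch the descriptor to ordinary buffered writes first.
        drop_o_direct(fd)
    view = memoryview(chunk)
    written = 0
    while written < len(view):
        # os.write() may write fewer bytes than requested; loop until done.
        written += os.write(fd, view[written:])
    return written
```

The return value of every write is checked here, which is also what
lets a caller notice if the tail did not reach the file.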

Note also you mentioned that you piped from dump to dd.
For dd reading from a pipe I strongly suggest you specify iflag=fullblock,
so that short reads from the pipe are accumulated into full input blocks.
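
What a "full block" read amounts to can be sketched as a loop that
keeps calling read() until the requested size has been collected or
the pipe reaches EOF. This is only an illustration in Python, not
dd's implementation; read_full_block is a hypothetical name.

```python
import os


def read_full_block(fd: int, block_size: int) -> bytes:
    """Collect up to block_size bytes, retrying after short reads.

    Returns fewer than block_size bytes only at end of input.
    """
    parts = []
    remaining = block_size
    while remaining > 0:
        data = os.read(fd, remaining)  # a pipe may return less than asked
        if not data:                   # empty result means EOF
            break
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)
```

Without such a loop, each short read from the pipe becomes its own
(short) output block, which matters when the output side requires
block-sized writes.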

If there is still an issue, it seems from the above that the kernel is throwing
away data and not indicating this through the last non-O_DIRECT write().
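
For reference, the two shortfalls in your table can be reproduced from
the reported sizes. A small Python check, using only numbers from your
message (the variable names are mine):

```python
ORIGINAL_SIZE = 5776419696      # dumptest.cat
DD_SIZE = 5776343040            # dumptest.dd
MBUFFER_SIZE = 5368709120       # dumptest.mbuff

STRIPE_SPAN = 640 * 1024        # 64k stripe unit * 10 data disks
BLOCK_SIZE = 512 * 1024 * 1024  # bs=512M / -s 512m

dd_shortfall = ORIGINAL_SIZE - DD_SIZE
mbuffer_shortfall = ORIGINAL_SIZE - MBUFFER_SIZE

# dd is short by exactly the remainder modulo the 640k span.
dd_matches_span = dd_shortfall == ORIGINAL_SIZE % STRIPE_SPAN
# mbuffer is short by exactly the remainder modulo its 512M buffer.
mbuffer_matches_block = mbuffer_shortfall == ORIGINAL_SIZE % BLOCK_SIZE

print(dd_shortfall, dd_matches_span)
print(mbuffer_shortfall, mbuffer_matches_block)
```

This only confirms the arithmetic; it does not by itself say which
layer dropped the tail.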

cheers,
Pádraig.
