* write 'O_DIRECT' file w/odd amount of data: desirable result?
@ 2011-02-23 4:30 Linda Walsh
2011-02-23 10:34 ` Pádraig Brady
0 siblings, 1 reply; 5+ messages in thread
From: Linda Walsh @ 2011-02-23 4:30 UTC (permalink / raw)
To: LKML
I understand, somewhat, what is happening.
I have two different utils, 'dd' and 'mbuffer', both of
which have a 'direct' option to write to disk.
mbuffer was from my distro, with a 'direct' option added.
I'm not sure whether it's truncating the write to the
lower bound of the sector size or of the file-allocation-unit size,
but from a dump piped into each of {cat, dd, mbuffer}, the
output sizes are:
file            size        delta
--------------  ----------  ---------
dumptest.cat    5776419696
dumptest.dd     5776343040      76656
dumptest.mbuff  5368709120  407710576
params:
dd of=dumptest.dd bs=512M oflag=direct
mbuffer -b 5 -s 512m --direct -f -o dumptest.mbuff
original file size MOD 512M = 407710576 (answer from mbuff).
The disk it is being written to is a RAID with a span
size of 640k (64k io * 10 data disks), and the 'xfs' filesystem
was formatted to reflect that (stripe-unit=64k, stripe-width=10).
This gives a 'coincidental' (??) interpretation of
the output from 'dd': the original file size MOD
640K = 76656, exactly the amount 'dd' is short.
Was that a coincidence, or is it significant?
And why didn't 'mbuffer' have the same shortfall -- its
shortfall was related only to its 512m buffer size.
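The arithmetic can be checked directly -- both shortfalls fall out of simple modular arithmetic on the original file size (all sizes taken from the table above):

```python
# Verify the shortfall arithmetic from the sizes above.
ORIGINAL = 5776419696            # dumptest.cat (full dump size)
STRIPE   = 640 * 1024            # 640k RAID span (64k io * 10 data disks)
BUFFER   = 512 * 1024 * 1024     # 512M buffer used by dd and mbuffer

# dd's output stopped at the largest multiple of the 640k stripe:
assert ORIGINAL - (ORIGINAL % STRIPE) == 5776343040   # dumptest.dd
assert ORIGINAL % STRIPE == 76656                     # dd's shortfall

# mbuffer's output stopped at the largest multiple of its 512M buffer:
assert ORIGINAL - (ORIGINAL % BUFFER) == 5368709120   # dumptest.mbuff
assert ORIGINAL % BUFFER == 407710576                 # mbuffer's shortfall
print("both deltas check out")
```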
In any event, shouldn't the kernel yield the correct result
in either case? That would be consistent with the processor it
was natively developed on, the x86, where a misaligned memory
access doesn't cause a fault at the user level but is handled
correctly, with a slight speed penalty for the unaligned
data parts.
Shouldn't the Linux kernel behave similarly?
Note that the mbuffer program did indicate an error
(which didn't help the 'dump' program, which had already exited
with what it thought was a 'success'), though a bit
cryptic:
buffer: error: outputThread: error writing to dumptest.mbuff at offset
0x140000000: Invalid argument
summary: 5509 MByte in 8.4 sec - average of 658 MB/s
mbuffer: warning: error during output to dumptest.mbuff: Invalid argument
dd indicated no warning or error.
----
I don't know the internals of either, but no doubt neither
expected an error on the final write, and neither handled the result properly.
However, wouldn't it be a good thing for Linux to do 'the right thing'
and successfully complete the last partial write (whichever is the case!), even
if it has to be internally buffered and slightly slowed? It seems
correctness of the function should be given preference over
adherence to some limitation, where possible.
Software should be forgiving and tolerant and 'err' to the side of
least harm -- which I'd argue is getting the data to the disk, NOT
generating some 'abnormal end' (ABEND) condition that the software can't
handle.
I'd think of it like a page fault on a record not in memory: the
remainder of the I/O record is completed from a zero-filled buffer that
pads out the sector, while the file size is set to the amount actually
written. ??
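The zero-fill idea above can be sketched in user space (a minimal illustration only: it uses ordinary buffered writes so it stays portable, and `write_padded` is a hypothetical helper name, not part of any real API):

```python
import os
import tempfile

def write_padded(fd, data, align=512):
    """Pad the final partial block with zeros up to the alignment
    boundary, write it, then truncate back to the true length."""
    pad = (-len(data)) % align          # bytes needed to reach alignment
    os.write(fd, data + b"\0" * pad)    # write an aligned-length block
    size = os.lseek(fd, 0, os.SEEK_CUR) - pad
    os.ftruncate(fd, size)              # restore the logical file size
    return size

# usage: the 76656-byte tail from the dd example above
path = os.path.join(tempfile.mkdtemp(), "tail.bin")
fd = os.open(path, os.O_CREAT | os.O_WRONLY)
n = write_padded(fd, b"x" * 76656)
os.close(fd)
assert n == 76656
assert os.path.getsize(path) == 76656
```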
Vanilla kernel 2.6.35-7 x86_64 (SMP PREEMPT)
* Re: write 'O_DIRECT' file w/odd amount of data: desirable result?
2011-02-23 4:30 write 'O_DIRECT' file w/odd amount of data: desirable result? Linda Walsh
@ 2011-02-23 10:34 ` Pádraig Brady
[not found] ` <4D654C2E.2000703@tlinx.org>
0 siblings, 1 reply; 5+ messages in thread
From: Pádraig Brady @ 2011-02-23 10:34 UTC (permalink / raw)
To: Linda Walsh; +Cc: LKML
On 23/02/11 04:30, Linda Walsh wrote:
> I have two different utils, 'dd' and mbuffer both of
> which have a 'direct' option to write to disk.
[...]
> dd of=dumptest.dd bs=512M oflag=direct
> mbuffer -b 5 -s 512m --direct -f -o dumptest.mbuff
[...]
> buffer: error: outputThread: error writing to dumptest.mbuff at offset
> 0x140000000: Invalid argument
[...]
> dd indicated no warning or error.
Note dd will turn off O_DIRECT for the last write
if it's less than the block size.
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=5929322c
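The strategy in that commit reduces to simple arithmetic: split the stream into an aligned portion (written with O_DIRECT) and a sub-blocksize tail (written after dropping O_DIRECT). A sketch, with `split_direct` as a hypothetical name for illustration:

```python
def split_direct(total, blocksize):
    """Split a byte count into the portion writable with O_DIRECT
    (a multiple of blocksize) and the tail that must be written
    with O_DIRECT turned off."""
    tail = total % blocksize
    return total - tail, tail

# the sizes from this thread: a 512M block size and a 640k RAID span
aligned, tail = split_direct(5776419696, 512 * 1024 * 1024)
assert (aligned, tail) == (5368709120, 407710576)
aligned, tail = split_direct(5776419696, 640 * 1024)
assert (aligned, tail) == (5776343040, 76656)
```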
Note also you mentioned that you piped from dump to dd.
For dd reading from a pipe, I strongly suggest you specify iflag=fullblock.
If there is still an issue, it seems from the above that the kernel is throwing
away data and not indicating this through the last non-O_DIRECT write().
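The reason iflag=fullblock matters: a read() from a pipe may return fewer bytes than requested, and without fullblock dd treats each (possibly short) read as one block. A small demonstration of the short-read behavior:

```python
import os

r, w = os.pipe()
os.write(w, b"x" * 100)        # only 100 bytes are available to read

# A 4096-byte read request returns just the 100 available bytes --
# it does not block waiting for a full 4096.
data = os.read(r, 4096)
assert len(data) == 100
os.close(r)
os.close(w)
print("short read returned", len(data), "bytes")
```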
cheers,
Pádraig.
Thread overview:
2011-02-23 4:30 write 'O_DIRECT' file w/odd amount of data: desirable result? Linda Walsh
2011-02-23 10:34 ` Pádraig Brady
[not found] ` <4D654C2E.2000703@tlinx.org>
2011-02-24 1:18 ` Pádraig Brady
2011-02-24 9:26 ` Dave Chinner
2011-03-02 2:27 ` RFE kernel option to do the desirable thing, w/regards to 'O_DIRECT' and mis-aligned data Linda Walsh