Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option

From: Max Reitz <mreitz@redhat.com>
To: Eric Blake <eblake@redhat.com>, qemu-devel@nongnu.org
Cc: fullmanet@gmail.com, qemu-block@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option
Date: Thu, 16 Aug 2018 04:49:49 +0200
Message-ID: <d8c5df8d-7ae2-8ddc-aa71-38ec06b79988@redhat.com>
In-Reply-To: <4aa899d3-c622-6acf-128c-513f95c55ed5@redhat.com>

On 2018-08-16 04:39, Eric Blake wrote:
> On 08/15/2018 09:20 PM, Max Reitz wrote:
>> On 2018-08-15 04:56, Eric Blake wrote:
>>> For feature parity with dd, we want to be able to specify
>>> the offset within the output file, just as we can specify
>>> the offset for the input (in particular, this makes copying
>>> a subset range of guest-visible bytes from one file to
>>> another much easier).
>>
>> In my opinion, we do not want feature parity with dd.  What we do want
>> is feature parity with convert.
> 
> Well, convert is lacking a way to specify a subset of one file to move
> to a (possibly different) subset of the other.  I'm fine if we want to
> enhance convert to do the things that right now require a dd-alike
> interface (namely, limiting the copying to less than the full file, and
> choosing the offset at which to start [before this patch] or write to
> [with this patch]).

Yes, I would want that.

> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
> a proper deprecation period.

Technically it has those features already, with the raw block driver's
offset and size parameters.

>>> The code style for 'qemu-img dd' was pretty hard to read;
>>> unfortunately this patch focuses only on adding the new
>>> feature in the existing style rather than trying to improve
>>> the overall flow, other than switching octal constants to
>>> hex.  Oh well.
>>
>> No, the real issue is that dd is still not implemented just as a
>> frontend to convert.  Which it should be.  I'm not sure dd was a very
>> good idea from the start, and now it should ideally be a frontend to
>> convert.
>>
>> (My full opinion on the matter: dd has a horrible interface.  I don't
>> quite see why we replicated that inside qemu-img.  Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
> 
> Because of performance: qemu-nbd + Linux nbd device + real dd is one
> more layer of data copying (each write() from dd goes to kernel, then is
> sent to qemu-nbd in userspace as a socket message before being sent back
> to the kernel to actually write() to the final destination) compared to
> just doing it all in one process (write() lands in the final destination
> with no further user space bouncing).  And because the additional steps
> to set it up are awkward (see my other email where I rant about losing
> the better part of today to realizing that 'dd ...; qemu-nbd -d
> /dev/nbd1' loses data if you omit conv=fdatasync).

I can see the sync problems, but is the performance really that much worse?

>> ((That gave me a good idea.  Actually, it's probably not such a good
>> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
>> might be nice which represents an image as a raw image at some mount
>> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
>> need to type modprobe nbd.))
> 
> So the kernel->userspace translation would be happening via the FUSE
> interface instead of the NBD interface.  Data still bounces around just
> as much, but it might be a fun project.  Does fuse behave well when
> serving exactly one file at the mountpoint, rather than the more typical
> file system rooted in a directory?  NBD at least has the benefit of
> claiming to be a block device all along, rather than complicating the
> user interface with POSIX file system rules (which you'll be bending,
> because you are serving exactly one file instead of a system).

Well, but I can just pretend my FUSE file is a block device, no?

Also, I just discovered something really interesting: FUSE allows you to
specify a single file as a mountpoint.

And you know what?  You can open the original file before you replace it
with the FUSE "filesystem".

So my fun interface is going to look like this:

$ qemu-img fuse foo.qcow2

And then your foo.qcow2 is a raw image until the next "fusermount -u
foo.qcow2"!  Isn't that fun?

>>> Also, switch the test to use an offset of 0 instead of 1,
>>> to test skip= and seek= on their own; as it is, this is
>>> effectively quadrupling the test runtime, which starts
>>> to make this test borderline on whether it should still
>>> belong to './check -g quick'.  And I didn't bother to
>>> reindent the test shell code for the new nested loop.
>>
>> In my opinion, it should no longer belong to quick.  It takes 8 s on my
>> tmpfs.  My border is somewhere around 2 or 3 s; and I haven't yet decided
>> whether that's on tmpfs or SSD.
> 
> I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32
> iterations with this patch; my observed times went from 1s to 2s to 7s
> on SSD ext4. Yeah, for v2, I'll drop it from quick.

Thanks!

>>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>>           size = dd.count * in.bsz;
>>>       }
>>>
>>> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>>> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
>>
>> What about overflows in out.offset * out.bsz?
> 
> I've had enough of my eyes bleeding on all the code repeatedly scaling
> things. For v2, I'm strongly considering a cleanup patch that reads all
> input, then scales all values into bytes, and THEN performs any
> additional math in a single unit, just so the additions become easier to
> reason about.

Haha.  I won't object.
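
To make the overflow concern concrete, here is a minimal sketch of the
"scale everything into bytes first, then do the math" approach.  It is
not taken from qemu-img.c: the helper names blocks_to_bytes() and
output_end() are hypothetical, and using int64_t for both the block
count and the block size is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from qemu-img.c): convert a block count and a
 * block size into a byte count, failing instead of wrapping around.
 */
bool blocks_to_bytes(int64_t blocks, int64_t bsz, int64_t *bytes)
{
    if (blocks < 0 || bsz <= 0) {
        return false;
    }
    /* blocks * bsz fits in int64_t iff blocks <= INT64_MAX / bsz */
    if (blocks > INT64_MAX / bsz) {
        return false;
    }
    *bytes = blocks * bsz;
    return true;
}

/*
 * Hypothetical helper: compute the end of the written range in bytes,
 * i.e. (seek_blocks * bsz) + size, rejecting any intermediate overflow.
 */
bool output_end(int64_t seek_blocks, int64_t bsz, int64_t size,
                int64_t *end)
{
    int64_t seek_bytes;

    if (size < 0 || !blocks_to_bytes(seek_blocks, bsz, &seek_bytes)) {
        return false;
    }
    /* Same shape as the check in the patch, but on an already safe value */
    if (seek_bytes > INT64_MAX - size) {
        return false;
    }
    *end = seek_bytes + size;
    return true;
}
```

With something like this, the multiplication is checked before the
"> INT64_MAX - size" comparison ever sees it, so both steps fail cleanly
rather than relying on a product that may already have wrapped.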

>>> +        error_report("Seek too large for '%s'", out.filename);
>>> +        ret = -1;
>>> +        goto out;
>>
>> Real dd doesn't seem to error out (it just reports an error).  I don't
>> know whether that makes any difference, though.
> 
> But where does the data get written if you can't actually seek that far
> into the file?

Well, the stats printed say it doesn't write anything.  So that's why I
don't know whether it makes any difference.

>>
>> The test looks good to me.
> 
> Other than my creative indentation levels ;)

I like them.

I mean, usually I just don't indent anything when adding to a test case.
I do it like this:


Original code:

...
qemu-img (something) $TEST_IMG
...


Post-my-patch:

for opt in x y; do
...
qemu-img (something) $opt $TEST_IMG
...
done


And I know I'm not the only one.  So, yeah, I liked your more creative
solution.

Max

