From: "Daniel P. Berrange" <berrange@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] KVM call agenda for October 25
Date: Wed, 26 Oct 2011 12:39:32 +0100
Message-ID: <20111026113932.GJ29496@redhat.com>
In-Reply-To: <4EA7ED99.1020700@redhat.com>
On Wed, Oct 26, 2011 at 01:23:05PM +0200, Kevin Wolf wrote:
> On 26.10.2011 11:57, Daniel P. Berrange wrote:
> > On Wed, Oct 26, 2011 at 10:48:12AM +0200, Markus Armbruster wrote:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >>
> >>> On 25.10.2011 16:06, Anthony Liguori wrote:
> >>>> On 10/25/2011 08:56 AM, Kevin Wolf wrote:
> >>>>> On 25.10.2011 15:05, Anthony Liguori wrote:
> >>>>>> I'd be much more open to changing the default mode to cache=none FWIW since the
> >>>>>> risk of data loss there is much, much lower.
> >>>>>
> >>>>> I think people said that they'd rather not have cache=none as default
> >>>>> because O_DIRECT doesn't work everywhere.
> >>>>
> >>>> Where doesn't it work these days? I know it doesn't work on tmpfs. I know it
> >>>> works on ext[234], btrfs, nfs.
> >>>
> >>> Besides file systems (and probably OSes) that don't support O_DIRECT,
> >>> there's another case: Our defaults don't work on 4k sector disks today.
> >>> You need to explicitly specify the logical_block_size qdev property for
> >>> cache=none to work on them.
> >>>
> >>> And changing this default isn't trivial as the right value doesn't only
> >>> depend on the host disk, but it's also guest visible. The only way out
> >>> would be bounce buffers, but I'm not sure that doing that silently is a
> >>> good idea...
> >>
> >> Sector size is a device property.
> >>
> >> If the user asks for a 4K sector disk, and the backend can't support
> >> that, we need to reject the configuration. Just like we reject
> >> read-only backends for read/write disks.
> >
> > I don't see why we need to reject a guest disk with 4k sectors,
> > just because the host disk only has 512 byte sectors. A guest
> > sector size that's a larger multiple of host sector size should
> > work just fine. It just means any guest sector write will update
> > 8 host sectors at a time. We only have problems if guest sector
> > size is not a multiple of host sector size, in which case bounce
> > buffers are the only option (other than rejecting the config
> > which is not too nice).
> >
> > IIUC, current QEMU behaviour is
> >
> >              Guest 512    Guest 4k
> >   Host 512   * OK         OK
> >   Host 4k    * I/O Err    OK
> >
> > '*' marks defaults
> >
> > IMHO, QEMU needs to work without I/O errors in all of these
> > combinations, even if this means having to use bounce buffers
> > in some of them. That said, IMHO the default should be for
> > QEMU to avoid bounce buffers, which implies it should either
> > choose guest sector size to match host sector size, or it
> > should unconditionally use 4k guest. IMHO we need the former
> >
> >              Guest 512    Guest 4k
> >   Host 512   *OK          OK
> >   Host 4k    OK           *OK
>
> I'm not sure if a 4k host should imply a 4k guest by default. This means
> that some guests wouldn't be able to run on a 4k host. On the other
> hand, for those guests that can do 4k, it would be the much better option.
>
> So I think this decision is the hard thing about it.
I guess it somewhat depends on whether we want to strive for

 1. Give the user the fastest working config by default
 2. Give the user a working config by default
 3. Give the user the fastest (possibly broken) config by default

IMHO 3 is not a serious option, but I could see 2 as a reasonable
tradeoff to avoid complexity in choosing QEMU defaults. The user
would have a working config with 512 byte sectors, but sub-optimal
perf on 4k hosts due to bounce buffering. Ideally libvirt or another
higher level app would set the best block size that a guest
can support by default, so bounce buffers would rarely be needed.
So only people using QEMU directly without setting a block size
would ordinarily suffer the bounce buffer perf hit on a 4k host.

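To make the trade-off concrete, here is a minimal Python sketch of
the rule discussed above. It is an illustration only, not QEMU code:
the names io_path and default_guest_sector are hypothetical, and it
assumes that direct (O_DIRECT) I/O works whenever the guest sector
size is a multiple of the host sector size and that bounce buffers
are needed otherwise. Option 1 is modelled as "match the host", which
further assumes the guest OS can cope with 4k sectors.

```python
from enum import Enum


class IoPath(Enum):
    DIRECT = "direct"  # guest requests cover whole host sectors
    BOUNCE = "bounce"  # requests must be realigned via a bounce buffer


def io_path(guest_sector: int, host_sector: int) -> IoPath:
    """Classify a guest/host sector-size pair (assumed rule, see lead-in)."""
    if guest_sector % host_sector == 0:
        return IoPath.DIRECT
    return IoPath.BOUNCE


def default_guest_sector(host_sector: int, option: int) -> int:
    """Pick a default guest sector size for option 1 or 2 of the list."""
    if option == 1:
        # Fastest working config: match the host, so no bounce buffers.
        return host_sector
    if option == 2:
        # A working config: always 512, accepting bounce buffers on 4k hosts.
        return 512
    raise ValueError("only options 1 and 2 are modelled")


# A 4k guest sector on a 512 byte host spans this many host sectors.
HOST_SECTORS_PER_4K_GUEST_SECTOR = 4096 // 512

if __name__ == "__main__":
    for host in (512, 4096):
        for option in (1, 2):
            guest = default_guest_sector(host, option)
            print(host, option, guest, io_path(guest, host).value)
```

Under these assumptions, option 2 only pays the bounce buffer cost
for the 512 byte guest on a 4k host case, which is the same cell that
gives I/O errors today.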
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|