From: Gleb Natapov <gleb@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: "Richard W.M. Jones" <rjones@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Question about qemu firmware configuration (fw_cfg) device
Date: Mon, 19 Jul 2010 11:30:50 +0300
Message-ID: <20100719083050.GG4689@redhat.com>
In-Reply-To: <43B9EAA8-E3F5-4903-896C-DEBD90E06162@suse.de>

On Mon, Jul 19, 2010 at 10:24:46AM +0200, Alexander Graf wrote:
> 
> On 19.07.2010, at 10:19, Gleb Natapov wrote:
> 
> > On Mon, Jul 19, 2010 at 10:08:57AM +0200, Alexander Graf wrote:
> >> 
> >> On 19.07.2010, at 10:01, Gleb Natapov wrote:
> >> 
> >>> On Mon, Jul 19, 2010 at 09:57:02AM +0200, Alexander Graf wrote:
> >>>> 
> >>>> On 19.07.2010, at 09:51, Gleb Natapov wrote:
> >>>> 
> >>>>> On Mon, Jul 19, 2010 at 09:40:18AM +0200, Alexander Graf wrote:
> >>>>>> 
> >>>>>> On 19.07.2010, at 09:33, Gleb Natapov wrote:
> >>>>>> 
> >>>>>>> On Mon, Jul 19, 2010 at 08:28:02AM +0100, Richard W.M. Jones wrote:
> >>>>>>>> On Mon, Jul 19, 2010 at 09:23:56AM +0300, Gleb Natapov wrote:
> >>>>>>>>> That is what I am worrying about too. If we are adding a device we
> >>>>>>>>> have to be sure such a device can actually exist on real hw too,
> >>>>>>>>> otherwise we may have problems later.
> >>>>>>>> 
> >>>>>>>> I don't understand why the constraints of real h/w have anything to do
> >>>>>>>> with this.  Can you explain?
> >>>>>>>> 
> >>>>>>> Each time we do something non-architectural it causes us trouble later.
> >>>>>>> So the constraints of real h/w are our constraints too.
> >>>>>>> 
> >>>>>>>>> Also 1 second on 100M file does not look like huge gain to me.
> >>>>>>>> 
> >>>>>>>> Every second counts.  We're trying to get libguestfs boot times down
> >>>>>>>> from 8-12 seconds to 4-5 seconds.  For many cases it's an interactive
> >>>>>>>> program.
> >>>>>>>> 
> >>>>>>> So what about making the initrd smaller? I remember managing two
> >>>>>>> distributions in 64M of flash in an embedded project.
> >>>>>> 
> >>>>>> Having a huge initrd basically helps in reusing a lot of existing code. We do the same - in general the initrd is just a subset of the applications of the host OS. And if you start putting perl or the likes into it, it becomes big.
> >>>>>> 
> >>>>> Why not provide small disk/cdrom with all those utilities installed?
> >>>> 
> >>>> Because - if the loading is done fast - this way everything's in RAM instantly. And you still have all devices available for use inside the system - that makes enumeration a lot easier. There are several reasons why and I don't think we should force different ways on people just because one component of our system is ineffective.
> >>>> 
> >>> Loading a huge initrd on real HW takes noticeably longer than a small
> >>> one, so I would say that it is your design that is to blame here, not
> >>> KVM.
> >> 
> >> I disagree. Virtualization enables new use cases. The -initrd parameter is a very good example for that. It's something that you simply couldn't do on real hw.
> >> 
> > How is it different from starting kernel/initrd from usb flash drive?
> 
> The kernel and initrd are read directly from the host fs. It's more like a 9p grub boot.
> 
There is no "host" on real HW :) But conceptually it's almost the same.
A 9p grub boot would also be nice. Hmm, I think PXE is the closest thing
to the -kernel/-initrd options on real HW.

> > 
> >>> 
> >>>>> 
> >>>>>> I guess the best thing for now really is to try and see which code paths insb goes along. It should really be coalesced.
> >>>>>> 
> >>>>> It is coalesced to a certain extent (the guest is re-entered every
> >>>>> 1024 bytes, and data is read from userspace a page at a time). You
> >>>>> need to keep injecting interrupts into the guest during a long string
> >>>>> operation and to check for exception conditions on page boundaries.
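
[Editorial sketch, not part of the original message.] To put rough
numbers on the coalescing described above, the following back-of-envelope
calculation counts guest re-entries and userspace page reads for the
100M file mentioned earlier in the thread. The 1024-byte figure comes
from the text above; the 4096-byte page size and the helper name
`transfers_needed` are assumptions made only for illustration.

```python
# Rough count of transitions for a string-I/O (insb) transfer,
# using the granularities quoted in the thread.
GUEST_CHUNK = 1024   # bytes moved per guest re-entry (from the thread)
PAGE_SIZE = 4096     # assumed x86 page size for the userspace reads


def transfers_needed(total_bytes: int, chunk: int) -> int:
    """Number of chunk-sized transfers needed to move total_bytes (ceiling)."""
    return -(-total_bytes // chunk)


MIB = 1 << 20
initrd_bytes = 100 * MIB  # the "100M file" discussed in the thread

guest_reentries = transfers_needed(initrd_bytes, GUEST_CHUNK)
page_reads = transfers_needed(initrd_bytes, PAGE_SIZE)

print(f"guest re-entries: {guest_reentries}")  # prints 102400
print(f"userspace page reads: {page_reads}")   # prints 25600
```

This only counts transitions; it says nothing about how long each one
takes, which would have to be measured.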
> >>>> 
> >>>> That still sounds slow. So yeah, adding DMA is probably the right way to go. But then again - if we model it after real hw it would be asynchronous, giving us an interrupt, causing even more headache. Ugh.
> >>>> 
> >>>> Can't we just ignore real hw constraints here and have it available in guest ram once one particular PIO is done? No bus master, no interrupts, but full speed and simplicity/atomicity which also helps migration.
> >>>> 
> >>> We shouldn't add devices that do not work like real HW just to speed
> >>> up some pathological cases (which are slow on real HW too).
> >> 
> >> Just because you don't use them doesn't mean they're pathological, really. We simply chose a bad interface for transferring reasonably big chunks of data and we need to fix that. If you want to look at it from a different perspective, it's a regression. Older qemu versions did map the kernel and initrd directly into guest ram, so now we're slower than back then.
> >> 
> > I use them a hundred times each day (at least the -kernel part). If the
> > interface is slow for your use case I have no problem with introducing
> > a new one, but one that makes sense in the x86 architecture. I do not
> > agree that this is a regression BTW. You can't compare a buggy way of
> > doing things with a non-buggy way and say that the bug fix is a regression.
> > 
> > What about adding a new PCI card that holds the kernel and initrd in a ROM BAR?
> 
> Yes and no. It sounds nice at first, but doesn't quite fit. There are two issues:
> 
> 1) We need a new PCI ID
We have our range. We can allocate from there.

> 2) There can be a lot of initrd binaries with multiboot. We only have a limited amount of BARs
> 
Is it supported now with the fw_cfg interface? My main concern with this
approach is the huge BAR size, which may take a lot of space from the PCI
MMIO range if the guest OS decides to configure it.
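
[Editorial sketch, not part of the original message.] The BAR-size
concern can be made concrete: PCI BAR sizes are powers of two and the
region is naturally aligned, so an image has to be rounded up to the
next power of two before it can be exposed through a BAR. The helper
name `rom_bar_size` and the 4 KiB lower bound are assumptions chosen
for illustration, not values taken from QEMU.

```python
def rom_bar_size(payload_bytes: int, minimum: int = 4096) -> int:
    """Smallest power-of-two size that is >= payload_bytes and >= minimum.

    `minimum` must itself be a power of two; 4096 is an illustrative choice.
    """
    size = minimum
    while size < payload_bytes:
        size *= 2
    return size


MIB = 1 << 20
initrd_bytes = 100 * MIB  # the "100M file" discussed in the thread
bar_bytes = rom_bar_size(initrd_bytes)

# prints "100 MiB initrd -> 128 MiB BAR"
print(f"{initrd_bytes // MIB} MiB initrd -> {bar_bytes // MIB} MiB BAR")
```

Under these assumptions a 100M initrd would claim 128M of MMIO space,
and each additional multiboot module exposed this way would add its own
rounded-up region.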

--
			Gleb.
