public inbox for kvm@vger.kernel.org
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org, ddutile@redhat.com, chrisw@redhat.com
Subject: Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
Date: Mon, 11 Oct 2010 17:21:25 +0200
Message-ID: <20101011152125.GA3584@redhat.com>
In-Reply-To: <1286810152.2959.28.camel@x201>

On Mon, Oct 11, 2010 at 09:15:52AM -0600, Alex Williamson wrote:
> On Sat, 2010-10-09 at 23:44 +0200, Michael S. Tsirkin wrote:
> > On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote:
> > > On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote:
> > > > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> > > > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > > > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > > > > > > > --- a/hw/device-assignment.c
> > > > > > > > > +++ b/hw/device-assignment.c
> > > > > > > ...
> > > > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > > > > > > >   */
> > > > > > > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > > > > > > >  {
> > > > > > > > > -    int size, len, ret;
> > > > > > > > > -    void *buf;
> > > > > > > > > +    char name[32], rom_file[64];
> > > > > > > > >      FILE *fp;
> > > > > > > > > -    uint8_t i = 1;
> > > > > > > > > -    char rom_file[64];
> > > > > > > > > +    uint8_t val;
> > > > > > > > > +    struct stat st;
> > > > > > > > > +    void *ptr;
> > > > > > > > > +
> > > > > > > > > +    /* If loading ROM from file, pci handles it */
> > > > > > > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > > > > > > +        return;
> > > > > > > > >  
> > > > > > > > >      snprintf(rom_file, sizeof(rom_file),
> > > > > > > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > > > > > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > > > > > > >  
> > > > > > > > > -    if (access(rom_file, F_OK))
> > > > > > > > > +    if (stat(rom_file, &st)) {
> > > > > > > > >          return;
> > > > > > > > > +    }
> > > > > > > > >  
> > > > > > > > 
> > > > > > > > Just a note that stat on the ROM sysfs file returns window size,
> > > > > > > > not the ROM size. So this allocates more ram than really necessary for
> > > > > > > > ROM. Real size is returned by fread.
> > > > > > > > 
> > > > > > > > Do we care?
> > > > > > > 
> > > > > > > That was my intention with using stat.  I thought that by default the
> > > > > > > ROM BAR should match physical hardware, so even if the contents could be
> > > > > > > rounded down to a smaller size, we maintain the size of the physical
> > > > > > > device.  To use the minimum size, the contents could be extracted using
> > > > > > > pci-sysfs and passed with the romfile option, or the ROM could be
> > > > > > > disabled altogether with the rombar=0 option.  Sound reasonable?
> > > > > > > Thanks,
> > > > > > > 
> > > > > > > Alex
> > > > > > 
> > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is
> > > > > > never accessed: let's have buffer size match real ROM, avoid wasting
> > > > > > memory: this can come up to megabytes easily.
> > > > > > Makes sense?
> > > > > 
> > > > > I tend to doubt that hardware vendors are going to waste money putting
> > > > > seriously oversized eeproms on devices.  It does seem pretty typical to
> > > > > find graphics cards with 128K ROM BARs where the actual ROM squeezes
> > > > > just under 64K, but that's a long way from megabytes of wasted memory.
> > > > > The only device I have with a ROM BAR in the megabytes is an 82576, but
> > > > > it comes up as an invalid rom through pci-sysfs, so we skip it.  I
> > > > > assume that just means someone was lazy and didn't bother to fuse a
> > > > > > transistor that disables the ROM BAR, leaving it at its maximum
> > > > > aperture w/ no eeprom to back it.  Anyone know?  Examples to the
> > > > > contrary welcome.
> > > > > 
> > > > > So I think the question comes down to whether there's any value to
> > > > > trying to exactly mimic the resource layout of the device.  I'm doubtful
> > > > > that there is, but at the potential cost of 10-100s of KBs of memory, I
> > > > > thought it might be worthwhile.  If you feel strongly otherwise, I'll
> > > > > follow-up with a patch to size it by the actual readable contents.
> > > > > Thanks,
> > > > > 
> > > > > Alex
> > > > 
> > > > I actually agree sizing ROM BAR exactly the same as the device
> > > > is a good idea. I just thought we can save the extra memory
> > > > by not allocating the RAM in question, and writing code
> > > > to return 0xff on reads within the BAR but outside ROM.
> > > > And no, I don't feel strongly about this optimization.
> > > > 
> > > 
> > > Ok, so you're looking for something like below.  We can no longer map
> > > the ROM into the guest, but it's a ROM, so we don't care about speed.
> > 
> > Why can't we map ROM? Map full pages, leave 0xff unmapped.
> > The reason there will be such is because BAR is power of 2.
> 
> If I understand correctly, you're suggesting we round the ROM up to a
> power of two, allocate a full buffer to back that, and map that to the
> guest.

No, I suggested rounding up to full pages.

>  If the physical device has a larger ROM BAR, the remainder is
> pointed to a set of read functions that return 0xff and probably never
> get called.

Yes.

> > > Here's the big problem... it breaks migration.  The ramblock live
> > > migration code isn't going to deal well with migration from a VM with a
> > > BAR sized ramblock to a ROM sized ramblock (likewise the reverse).
> > 
> > You mean cross-version migration? Otherwise, why would not both
> > sides be ROM sized?
> 
> Yes, cross-version migration, though probably not an issue with the
> above since it doesn't change the size of existing emulated device ROMs.
> 
> > >  So
> > > we could do it for passthrough devices since they can't migrate anyway,
> > > but then we have to go back to separate code to handle assigned device
> > > ROMs vs emulated device ROMs.
> > 
> > I think this is based on the assumption we do not map ROM.
> > If we do map it, then most of the code is still same,
> > just add 0xff handling for pages after end of ROM.
> > These typically are unaccessed anyway.
> 
> Not on the no mapping assumption, but the assumption that you were
> looking to use the minimum buffer to back the ROM.  If we agree that
> it's ok to waste memory rounding the ROM up to a power of two, then
> things work out a little better, though I'm still dubious whether the
> memory savings is worth the code necessary to potentially handle the ROM
> as two discrete pieces.  Thanks,
> 
> Alex

True. Note this optimization is not there in existing code,
so it's definitely not urgent to implement this - just
something to keep in mind.

-- 
MST
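
For reference, the layout discussed in this thread (back the ROM
contents with RAM rounded up to full pages, and answer reads to the rest
of the BAR with 0xff) could be sketched roughly as below.  This is only
an illustration, not code from the patch or from qemu-kvm: RomSketch,
rom_sketch_init(), rom_sketch_read() and SKETCH_PAGE_SIZE are
hypothetical names, and a 4K page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4K pages */

/* Hypothetical bookkeeping for one option ROM; not a qemu structure. */
typedef struct RomSketch {
    uint8_t *buf;      /* RAM backing the ROM, mapped_len bytes */
    size_t rom_len;    /* bytes actually read from the ROM */
    size_t mapped_len; /* rom_len rounded up to full pages */
    size_t bar_len;    /* size of the physical ROM BAR (power of two) */
} RomSketch;

static size_t round_up_to_page(size_t len)
{
    return (len + SKETCH_PAGE_SIZE - 1) & ~((size_t)SKETCH_PAGE_SIZE - 1);
}

/* Allocate only full pages covering the ROM contents, padded with 0xff. */
static int rom_sketch_init(RomSketch *r, const uint8_t *contents,
                           size_t rom_len, size_t bar_len)
{
    size_t mapped = round_up_to_page(rom_len);

    if (rom_len == 0 || mapped > bar_len) {
        return -1;
    }
    r->buf = malloc(mapped);
    if (!r->buf) {
        return -1;
    }
    memset(r->buf, 0xff, mapped);       /* tail of the last page reads 0xff */
    memcpy(r->buf, contents, rom_len);
    r->rom_len = rom_len;
    r->mapped_len = mapped;
    r->bar_len = bar_len;
    return 0;
}

/* Reads inside the mapped pages hit RAM; the rest of the BAR has no
 * backing memory and simply returns 0xff. */
static uint8_t rom_sketch_read(const RomSketch *r, size_t offset)
{
    if (offset < r->mapped_len) {
        return r->buf[offset];
    }
    return 0xff;
}
```

With a 5000 byte ROM behind a 128K BAR, this backs 8K with RAM and
leaves the remaining 120K to the 0xff read path, which in practice is
rarely if ever reached.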
