* [PATCH 0/2] device-assignment: Re-work PCI option ROM support
@ 2010-10-04 21:26 Alex Williamson
2010-10-04 21:26 ` [PATCH 1/2] PCI: Export pci_map_option_rom() Alex Williamson
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Alex Williamson @ 2010-10-04 21:26 UTC
To: kvm; +Cc: ddutile, chrisw
This cleans up device assignment option ROM support and allows
us to use romfile and rombar default PCI options. Thanks,
Alex
---
Alex Williamson (2):
      device-assignment: Allow PCI to manage the option ROM
      PCI: Export pci_map_option_rom()

 hw/device-assignment.c |  155 +++++++++++++++++++++---------------------------
 hw/device-assignment.h |    4 +
 hw/pci.c               |    2 -
 hw/pci.h               |    3 +
 4 files changed, 75 insertions(+), 89 deletions(-)

* [PATCH 1/2] PCI: Export pci_map_option_rom()
From: Alex Williamson @ 2010-10-04 21:26 UTC
To: kvm; +Cc: ddutile, chrisw

Allow it to be referenced outside of hw/pci.c so we can register
option ROM BARs using the default mapping routine.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/pci.c |    2 +-
 hw/pci.h |    3 +++
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index e75f226..36ca571 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1968,7 +1968,7 @@ static uint8_t pci_find_capability_list(PCIDevice *pdev, uint8_t cap_id,
     return next;
 }

-static void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, pcibus_t size, int type)
+void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, pcibus_t size, int type)
 {
     cpu_register_physical_memory(addr, size, pdev->rom_offset);
 }
diff --git a/hw/pci.h b/hw/pci.h
index ed86c57..825ccbe 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -219,6 +219,9 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
                             pcibus_t size, int type,
                             PCIMapIORegionFunc *map_func);

+void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr,
+                        pcibus_t size, int type);
+
 int pci_enable_capability_support(PCIDevice *pci_dev,
                                   uint32_t config_start,
                                   PCICapConfigReadFunc *config_read,

* Re: [PATCH 1/2] PCI: Export pci_map_option_rom()
From: Chris Wright @ 2010-10-05 16:03 UTC
To: Alex Williamson; +Cc: kvm, ddutile, chrisw

* Alex Williamson (alex.williamson@redhat.com) wrote:
> Allow it to be referenced outside of hw/pci.c so we can register
> option ROM BARs using the default mapping routine.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Acked-by: Chris Wright <chrisw@redhat.com>

* [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
From: Alex Williamson @ 2010-10-04 21:26 UTC
To: kvm; +Cc: ddutile, chrisw

We don't need to duplicate PCI code for mapping and managing the
option ROM for an assigned device. We're already using an in-memory
copy of the ROM, so we can simply fill the contents from the physical
device and pass the rest off to PCI. As a benefit, we can now make
use of the rombar and romfile options, which allow us to either hide
the ROM BAR, or load it from an external file, such as we can do
with emulated devices. This is useful if you want to pass through
and boot from devices that are either missing a physical option ROM
or don't supply a valid option ROM.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/device-assignment.c |  155 +++++++++++++++++++++---------------------------
 hw/device-assignment.h |    4 +
 2 files changed, 71 insertions(+), 88 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 87f7418..26cb797 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -233,8 +233,6 @@ static CPUReadMemoryFunc * const slow_bar_read[] = {
     &slow_bar_readl
 };

-static CPUWriteMemoryFunc * const slow_bar_null_write[] = {NULL, NULL, NULL};
-
 static void assigned_dev_iomem_map_slow(PCIDevice *pci_dev, int region_num,
                                         pcibus_t e_phys, pcibus_t e_size,
                                         int type)
@@ -245,10 +243,7 @@ static void assigned_dev_iomem_map_slow(PCIDevice *pci_dev, int region_num,
     int m;

     DEBUG("%s", "slow map\n");
-    if (region_num == PCI_ROM_SLOT)
-        m = cpu_register_io_memory(slow_bar_read, slow_bar_null_write, region);
-    else
-        m = cpu_register_io_memory(slow_bar_read, slow_bar_write, region);
+    m = cpu_register_io_memory(slow_bar_read, slow_bar_write, region);
     cpu_register_physical_memory(e_phys, e_size, m);

     /* MSI-X MMIO page */
@@ -268,7 +263,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
     AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev);
     AssignedDevRegion *region = &r_dev->v_addrs[region_num];
     PCIRegion *real_region = &r_dev->real_device.regions[region_num];
-    int ret = 0, flags = 0;
+    int ret = 0;

     DEBUG("e_phys=%08" FMT_PCIBUS " r_virt=%p type=%d len=%08" FMT_PCIBUS " region_num=%d \n",
           e_phys, region->u.r_virtbase, type, e_size, region_num);
@@ -277,11 +272,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
     region->e_size = e_size;

     if (e_size > 0) {
-
-        if (region_num == PCI_ROM_SLOT)
-            flags |= IO_MEM_ROM;
-
-        cpu_register_physical_memory(e_phys, e_size, region->memory_index | flags);
+        cpu_register_physical_memory(e_phys, e_size, region->memory_index);

         /* deal with MSI-X MMIO page */
         if (real_region->base_addr <= r_dev->msix_table_addr &&
@@ -527,35 +518,22 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
                 : PCI_BASE_ADDRESS_SPACE_MEMORY;

             if (cur_region->size & 0xFFF) {
-                if (i != PCI_ROM_SLOT) {
-                    fprintf(stderr, "PCI region %d at address 0x%llx "
-                            "has size 0x%x, which is not a multiple of 4K. "
-                            "You might experience some performance hit "
-                            "due to that.\n",
-                            i, (unsigned long long)cur_region->base_addr,
-                            cur_region->size);
-                }
+                fprintf(stderr, "PCI region %d at address 0x%llx "
+                        "has size 0x%x, which is not a multiple of 4K. "
+                        "You might experience some performance hit "
+                        "due to that.\n",
+                        i, (unsigned long long)cur_region->base_addr,
+                        cur_region->size);
                 slow_map = 1;
             }

             /* map physical memory */
             pci_dev->v_addrs[i].e_physbase = cur_region->base_addr;
-            if (i == PCI_ROM_SLOT) {
-                /* KVM doesn't support read-only mappings, use slow map */
-                slow_map = 1;
-                pci_dev->v_addrs[i].u.r_virtbase =
-                    mmap(NULL,
-                         cur_region->size,
-                         PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE,
-                         0, (off_t) 0);
-
-            } else {
-                pci_dev->v_addrs[i].u.r_virtbase =
-                    mmap(NULL,
-                         cur_region->size,
-                         PROT_WRITE | PROT_READ, MAP_SHARED,
-                         cur_region->resource_fd, (off_t) 0);
-            }
+            pci_dev->v_addrs[i].u.r_virtbase = mmap(NULL, cur_region->size,
+                                                    PROT_WRITE | PROT_READ,
+                                                    MAP_SHARED,
+                                                    cur_region->resource_fd,
+                                                    (off_t)0);

             if (pci_dev->v_addrs[i].u.r_virtbase == MAP_FAILED) {
                 pci_dev->v_addrs[i].u.r_virtbase = NULL;
@@ -565,11 +543,6 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
                 return -1;
             }

-            if (i == PCI_ROM_SLOT) {
-                memset(pci_dev->v_addrs[i].u.r_virtbase, 0,
-                       (cur_region->size + 0xFFF) & 0xFFFFF000);
-            }
-
             pci_dev->v_addrs[i].r_size = cur_region->size;
             pci_dev->v_addrs[i].e_size = 0;

@@ -712,6 +685,12 @@ again:
         fprintf(stderr, "%s: read failed, errno = %d\n", __func__, errno);
     }

+    /* Clear host resource mapping info. If we choose not to register a
+     * BAR, such as might be the case with the option ROM, we can get
+     * confusing, unwritable, residual addresses from the host here. */
+    memset(&pci_dev->dev.config[PCI_BASE_ADDRESS_0], 0, 24);
+    memset(&pci_dev->dev.config[PCI_ROM_ADDRESS], 0, 4);
+
     snprintf(name, sizeof(name), "%sresource", dir);

     f = fopen(name, "r");
@@ -720,7 +699,7 @@ again:
         return 1;
     }

-    for (r = 0; r < PCI_NUM_REGIONS; r++) {
+    for (r = 0; r < PCI_ROM_SLOT; r++) {
         if (fscanf(f, "%lli %lli %lli\n", &start, &end, &flags) != 3)
             break;

@@ -736,13 +715,11 @@ again:
         } else {
             flags &= ~IORESOURCE_PREFETCH;
         }
-        if (r != PCI_ROM_SLOT) {
-            snprintf(name, sizeof(name), "%sresource%d", dir, r);
-            fd = open(name, O_RDWR);
-            if (fd == -1)
-                continue;
-            rp->resource_fd = fd;
-        }
+        snprintf(name, sizeof(name), "%sresource%d", dir, r);
+        fd = open(name, O_RDWR);
+        if (fd == -1)
+            continue;
+        rp->resource_fd = fd;

         rp->type = flags;
         rp->valid = 1;
@@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
  */
 static void assigned_dev_load_option_rom(AssignedDevice *dev)
 {
-    int size, len, ret;
-    void *buf;
+    char name[32], rom_file[64];
     FILE *fp;
-    uint8_t i = 1;
-    char rom_file[64];
+    uint8_t val;
+    struct stat st;
+    void *ptr;
+
+    /* If loading ROM from file, pci handles it */
+    if (dev->dev.romfile || !dev->dev.rom_bar)
+        return;

     snprintf(rom_file, sizeof(rom_file),
              "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
              dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);

-    if (access(rom_file, F_OK))
+    if (stat(rom_file, &st)) {
         return;
+    }

-    /* Write something to the ROM file to enable it */
-    fp = fopen(rom_file, "wb");
-    if (fp == NULL)
-        return;
-    len = fwrite(&i, 1, 1, fp);
-    fclose(fp);
-    if (len != 1)
+    if (access(rom_file, F_OK)) {
+        fprintf(stderr, "pci-assign: Insufficient privileges for %s\n",
+                rom_file);
         return;
+    }

-    /* The file has to be closed and reopened, otherwise it won't work */
-    fp = fopen(rom_file, "rb");
-    if (fp == NULL)
+    /* Write "1" to the ROM file to enable it */
+    fp = fopen(rom_file, "r+");
+    if (fp == NULL) {
         return;
-
-    fseek(fp, 0, SEEK_END);
-    size = ftell(fp);
+    }
+    val = 1;
+    if (fwrite(&val, 1, 1, fp) != 1) {
+        goto close_rom;
+    }
     fseek(fp, 0, SEEK_SET);

-    buf = malloc(size);
-    if (buf == NULL) {
-        fclose(fp);
-        return;
+    snprintf(name, sizeof(name), "%s.rom", dev->dev.qdev.info->name);
+    dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, st.st_size);
+    ptr = qemu_get_ram_ptr(dev->dev.rom_offset);
+    memset(ptr, 0xff, st.st_size);
+
+    if (!fread(ptr, 1, st.st_size, fp)) {
+        fprintf(stderr, "pci-assign: Cannot read from host %s\n"
+                "\tDevice option ROM contents are probably invalid "
+                "(check dmesg).\n\tSkip option ROM probe with rombar=0, "
+                "or load from file with romfile=\n", rom_file);
+        qemu_ram_free(dev->dev.rom_offset);
+        dev->dev.rom_offset = 0;
+        goto close_rom;
     }

-    if (!(ret = fread(buf, 1, size, fp))) {
-        free(buf);
-        fclose(fp);
-        return;
+    pci_register_bar(&dev->dev, PCI_ROM_SLOT,
+                     st.st_size, 0, pci_map_option_rom);
+close_rom:
+    /* Write "0" to disable ROM */
+    fseek(fp, 0, SEEK_SET);
+    val = 0;
+    if (!fwrite(&val, 1, 1, fp)) {
+        DEBUG("%s\n", "Failed to disable pci-sysfs rom file");
     }
     fclose(fp);
-
-    /* The number of bytes read is often much smaller than the BAR size */
-    size = ret;
-
-    /* Copy ROM contents into the space backing the ROM BAR */
-    if (dev->v_addrs[PCI_ROM_SLOT].r_size >= size &&
-        dev->v_addrs[PCI_ROM_SLOT].u.r_virtbase) {
-        memcpy(dev->v_addrs[PCI_ROM_SLOT].u.r_virtbase, buf, size);
-    }
-
-    free(buf);
 }
diff --git a/hw/device-assignment.h b/hw/device-assignment.h
index 9a3ea12..2f5fa17 100644
--- a/hw/device-assignment.h
+++ b/hw/device-assignment.h
@@ -57,7 +57,7 @@ typedef struct {
     uint16_t region_number; /* number of active regions */

     /* Port I/O or MMIO Regions */
-    PCIRegion regions[PCI_NUM_REGIONS];
+    PCIRegion regions[PCI_NUM_REGIONS - 1];
     int config_fd;
 } PCIDevRegions;

@@ -80,7 +80,7 @@ typedef struct AssignedDevice {
     uint32_t use_iommu;
     int intpin;
     uint8_t debug_flags;
-    AssignedDevRegion v_addrs[PCI_NUM_REGIONS];
+    AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1];
     PCIDevRegions real_device;
     int run;
     int girq;

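
The two memset() calls that the patch adds to clear host resource mapping info use the literal lengths 24 and 4. A minimal standalone sketch of where those numbers come from, assuming the standard type 0 configuration header layout (six 32-bit BARs starting at offset 0x10, expansion ROM base address at 0x30). The helper name clear_host_bar_info() and the PCI_CONFIG_SIZE constant are hypothetical and only used for illustration; the patch performs the same clearing inline on pci_dev->dev.config.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets in the standard type 0 PCI configuration header. */
#define PCI_BASE_ADDRESS_0  0x10    /* first of six 32-bit BARs */
#define PCI_ROM_ADDRESS     0x30    /* expansion ROM base address */
#define PCI_CONFIG_SIZE     256     /* conventional config space (assumed here) */

/* Hypothetical helper for illustration only; not a function from the patch. */
static void clear_host_bar_info(uint8_t *config)
{
    assert(config != NULL);

    /* BAR0..BAR5: 6 registers * 4 bytes = 24 bytes at offsets 0x10..0x27 */
    memset(&config[PCI_BASE_ADDRESS_0], 0, 6 * sizeof(uint32_t));

    /* Expansion ROM BAR: a single 4-byte register at offsets 0x30..0x33 */
    memset(&config[PCI_ROM_ADDRESS], 0, sizeof(uint32_t));
}
```

Bytes between 0x28 and 0x2f (CardBus CIS pointer and subsystem IDs) sit between the two ranges, which is why two separate calls are needed rather than one.
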
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM 2010-10-04 21:26 ` [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM Alex Williamson @ 2010-10-07 17:18 ` Michael S. Tsirkin 2010-10-07 17:34 ` Alex Williamson 0 siblings, 1 reply; 15+ messages in thread From: Michael S. Tsirkin @ 2010-10-07 17:18 UTC (permalink / raw) To: Alex Williamson; +Cc: kvm, ddutile, chrisw On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote: > We don't need to duplicate PCI code for mapping and managing the > option ROM for an assigned device. We're already using an in-memory > copy of the ROM, so we can simply fill the contents from the physical > device and pass the rest off to PCI. As a benefit, we can now make > use of the rombar and romfile options, which allow us to either hide > the ROM BAR, or load it from an external file, such as we can do > with emulated devices. This is useful if you want to pass through > and boot from devices that are either missing a physical option ROM > or don't supply a valid option ROM. 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > --- > > hw/device-assignment.c | 155 +++++++++++++++++++++--------------------------- > hw/device-assignment.h | 4 + > 2 files changed, 71 insertions(+), 88 deletions(-) > > diff --git a/hw/device-assignment.c b/hw/device-assignment.c > index 87f7418..26cb797 100644 > --- a/hw/device-assignment.c > +++ b/hw/device-assignment.c > @@ -233,8 +233,6 @@ static CPUReadMemoryFunc * const slow_bar_read[] = { > &slow_bar_readl > }; > > -static CPUWriteMemoryFunc * const slow_bar_null_write[] = {NULL, NULL, NULL}; > - > static void assigned_dev_iomem_map_slow(PCIDevice *pci_dev, int region_num, > pcibus_t e_phys, pcibus_t e_size, > int type) > @@ -245,10 +243,7 @@ static void assigned_dev_iomem_map_slow(PCIDevice *pci_dev, int region_num, > int m; > > DEBUG("%s", "slow map\n"); > - if (region_num == PCI_ROM_SLOT) > - m = cpu_register_io_memory(slow_bar_read, slow_bar_null_write, region); > - else > - m = cpu_register_io_memory(slow_bar_read, slow_bar_write, region); > + m = cpu_register_io_memory(slow_bar_read, slow_bar_write, region); > cpu_register_physical_memory(e_phys, e_size, m); > > /* MSI-X MMIO page */ > @@ -268,7 +263,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num, > AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev); > AssignedDevRegion *region = &r_dev->v_addrs[region_num]; > PCIRegion *real_region = &r_dev->real_device.regions[region_num]; > - int ret = 0, flags = 0; > + int ret = 0; > > DEBUG("e_phys=%08" FMT_PCIBUS " r_virt=%p type=%d len=%08" FMT_PCIBUS " region_num=%d \n", > e_phys, region->u.r_virtbase, type, e_size, region_num); > @@ -277,11 +272,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num, > region->e_size = e_size; > > if (e_size > 0) { > - > - if (region_num == PCI_ROM_SLOT) > - flags |= IO_MEM_ROM; > - > - cpu_register_physical_memory(e_phys, e_size, region->memory_index | flags); > + 
cpu_register_physical_memory(e_phys, e_size, region->memory_index); > > /* deal with MSI-X MMIO page */ > if (real_region->base_addr <= r_dev->msix_table_addr && > @@ -527,35 +518,22 @@ static int assigned_dev_register_regions(PCIRegion *io_regions, > : PCI_BASE_ADDRESS_SPACE_MEMORY; > > if (cur_region->size & 0xFFF) { > - if (i != PCI_ROM_SLOT) { > - fprintf(stderr, "PCI region %d at address 0x%llx " > - "has size 0x%x, which is not a multiple of 4K. " > - "You might experience some performance hit " > - "due to that.\n", > - i, (unsigned long long)cur_region->base_addr, > - cur_region->size); > - } > + fprintf(stderr, "PCI region %d at address 0x%llx " > + "has size 0x%x, which is not a multiple of 4K. " > + "You might experience some performance hit " > + "due to that.\n", > + i, (unsigned long long)cur_region->base_addr, > + cur_region->size); > slow_map = 1; > } > > /* map physical memory */ > pci_dev->v_addrs[i].e_physbase = cur_region->base_addr; > - if (i == PCI_ROM_SLOT) { > - /* KVM doesn't support read-only mappings, use slow map */ > - slow_map = 1; > - pci_dev->v_addrs[i].u.r_virtbase = > - mmap(NULL, > - cur_region->size, > - PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, > - 0, (off_t) 0); > - > - } else { > - pci_dev->v_addrs[i].u.r_virtbase = > - mmap(NULL, > - cur_region->size, > - PROT_WRITE | PROT_READ, MAP_SHARED, > - cur_region->resource_fd, (off_t) 0); > - } > + pci_dev->v_addrs[i].u.r_virtbase = mmap(NULL, cur_region->size, > + PROT_WRITE | PROT_READ, > + MAP_SHARED, > + cur_region->resource_fd, > + (off_t)0); > > if (pci_dev->v_addrs[i].u.r_virtbase == MAP_FAILED) { > pci_dev->v_addrs[i].u.r_virtbase = NULL; > @@ -565,11 +543,6 @@ static int assigned_dev_register_regions(PCIRegion *io_regions, > return -1; > } > > - if (i == PCI_ROM_SLOT) { > - memset(pci_dev->v_addrs[i].u.r_virtbase, 0, > - (cur_region->size + 0xFFF) & 0xFFFFF000); > - } > - > pci_dev->v_addrs[i].r_size = cur_region->size; > pci_dev->v_addrs[i].e_size = 0; > > @@ 
-712,6 +685,12 @@ again: > fprintf(stderr, "%s: read failed, errno = %d\n", __func__, errno); > } > > + /* Clear host resource mapping info. If we choose not to register a > + * BAR, such as might be the case with the option ROM, we can get > + * confusing, unwritable, residual addresses from the host here. */ > + memset(&pci_dev->dev.config[PCI_BASE_ADDRESS_0], 0, 24); > + memset(&pci_dev->dev.config[PCI_ROM_ADDRESS], 0, 4); > + > snprintf(name, sizeof(name), "%sresource", dir); > > f = fopen(name, "r"); > @@ -720,7 +699,7 @@ again: > return 1; > } > > - for (r = 0; r < PCI_NUM_REGIONS; r++) { > + for (r = 0; r < PCI_ROM_SLOT; r++) { > if (fscanf(f, "%lli %lli %lli\n", &start, &end, &flags) != 3) > break; > > @@ -736,13 +715,11 @@ again: > } else { > flags &= ~IORESOURCE_PREFETCH; > } > - if (r != PCI_ROM_SLOT) { > - snprintf(name, sizeof(name), "%sresource%d", dir, r); > - fd = open(name, O_RDWR); > - if (fd == -1) > - continue; > - rp->resource_fd = fd; > - } > + snprintf(name, sizeof(name), "%sresource%d", dir, r); > + fd = open(name, O_RDWR); > + if (fd == -1) > + continue; > + rp->resource_fd = fd; > > rp->type = flags; > rp->valid = 1; > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) > */ > static void assigned_dev_load_option_rom(AssignedDevice *dev) > { > - int size, len, ret; > - void *buf; > + char name[32], rom_file[64]; > FILE *fp; > - uint8_t i = 1; > - char rom_file[64]; > + uint8_t val; > + struct stat st; > + void *ptr; > + > + /* If loading ROM from file, pci handles it */ > + if (dev->dev.romfile || !dev->dev.rom_bar) > + return; > > snprintf(rom_file, sizeof(rom_file), > "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom", > dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func); > > - if (access(rom_file, F_OK)) > + if (stat(rom_file, &st)) { > return; > + } > Just a note that stat on the ROM sysfs file returns window size, not the ROM size. 
So this allocates more ram than really necessary for ROM. Real size is returned by fread. Do we care? > + ptr = qemu_get_ram_ptr(dev->dev.rom_offset); > + memset(ptr, 0xff, st.st_size); > + > + if (!fread(ptr, 1, st.st_size, fp)) { > + fprintf(stderr, "pci-assign: Cannot read from host %s\n" > + "\tDevice option ROM contents are probably invalid " > + "(check dmesg).\n\tSkip option ROM probe with rombar=0, " > + "or load from file with romfile=\n", rom_file); > + qemu_ram_free(dev->dev.rom_offset); > + dev->dev.rom_offset = 0; > + goto close_rom; > } > > - if (!(ret = fread(buf, 1, size, fp))) { > - free(buf); > - fclose(fp); > - return; > + pci_register_bar(&dev->dev, PCI_ROM_SLOT, > + st.st_size, 0, pci_map_option_rom); > +close_rom: > + /* Write "0" to disable ROM */ > + fseek(fp, 0, SEEK_SET); > + val = 0; > + if (!fwrite(&val, 1, 1, fp)) { > + DEBUG("%s\n", "Failed to disable pci-sysfs rom file"); > } > fclose(fp); > - > - /* The number of bytes read is often much smaller than the BAR size */ > - size = ret; > - > - /* Copy ROM contents into the space backing the ROM BAR */ > - if (dev->v_addrs[PCI_ROM_SLOT].r_size >= size && > - dev->v_addrs[PCI_ROM_SLOT].u.r_virtbase) { > - memcpy(dev->v_addrs[PCI_ROM_SLOT].u.r_virtbase, buf, size); > - } > - > - free(buf); > } > diff --git a/hw/device-assignment.h b/hw/device-assignment.h > index 9a3ea12..2f5fa17 100644 > --- a/hw/device-assignment.h > +++ b/hw/device-assignment.h > @@ -57,7 +57,7 @@ typedef struct { > uint16_t region_number; /* number of active regions */ > > /* Port I/O or MMIO Regions */ > - PCIRegion regions[PCI_NUM_REGIONS]; > + PCIRegion regions[PCI_NUM_REGIONS - 1]; > int config_fd; > } PCIDevRegions; > > @@ -80,7 +80,7 @@ typedef struct AssignedDevice { > uint32_t use_iommu; > int intpin; > uint8_t debug_flags; > - AssignedDevRegion v_addrs[PCI_NUM_REGIONS]; > + AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1]; > PCIDevRegions real_device; > int run; > int girq; > > -- > To unsubscribe from this list: 
send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > - /* Write something to the ROM file to enable it */ > - fp = fopen(rom_file, "wb"); > - if (fp == NULL) > - return; > - len = fwrite(&i, 1, 1, fp); > - fclose(fp); > - if (len != 1) > + if (access(rom_file, F_OK)) { > + fprintf(stderr, "pci-assign: Insufficient privileges for %s\n", > + rom_file); > return; > + } > > - /* The file has to be closed and reopened, otherwise it won't work */ > - fp = fopen(rom_file, "rb"); > - if (fp == NULL) > + /* Write "1" to the ROM file to enable it */ > + fp = fopen(rom_file, "r+"); > + if (fp == NULL) { > return; > - > - fseek(fp, 0, SEEK_END); > - size = ftell(fp); > + } > + val = 1; > + if (fwrite(&val, 1, 1, fp) != 1) { > + goto close_rom; > + } > fseek(fp, 0, SEEK_SET); > > - buf = malloc(size); > - if (buf == NULL) { > - fclose(fp); > - return; > + snprintf(name, sizeof(name), "%s.rom", dev->dev.qdev.info->name); > + dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, st.st_size); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
From: Alex Williamson @ 2010-10-07 17:34 UTC
To: Michael S. Tsirkin; +Cc: kvm, ddutile, chrisw

On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > --- a/hw/device-assignment.c
> > +++ b/hw/device-assignment.c
...
> > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> >   */
> >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> >  {
> > -    int size, len, ret;
> > -    void *buf;
> > +    char name[32], rom_file[64];
> >      FILE *fp;
> > -    uint8_t i = 1;
> > -    char rom_file[64];
> > +    uint8_t val;
> > +    struct stat st;
> > +    void *ptr;
> > +
> > +    /* If loading ROM from file, pci handles it */
> > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > +        return;
> >
> >      snprintf(rom_file, sizeof(rom_file),
> >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> >
> > -    if (access(rom_file, F_OK))
> > +    if (stat(rom_file, &st)) {
> >          return;
> > +    }
> >
>
> Just a note that stat on the ROM sysfs file returns window size,
> not the ROM size. So this allocates more ram than really necessary for
> ROM. Real size is returned by fread.
>
> Do we care?

That was my intention with using stat. I thought that by default the
ROM BAR should match physical hardware, so even if the contents could be
rounded down to a smaller size, we maintain the size of the physical
device. To use the minimum size, the contents could be extracted using
pci-sysfs and passed with the romfile option, or the ROM could be
disabled altogether with the rombar=0 option. Sound reasonable?
Thanks,

Alex

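
The trade-off described above, keeping the BAR at the physical window size versus rounding the contents down to a smaller size, can be put into numbers with a small sketch. It relies only on the general PCI rule that BAR sizes are powers of two. The function names rom_bar_size_for() and rom_padding() are hypothetical, and the 63K content length used in the usage note is a made-up figure, not a measurement from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power-of-two size that can hold len bytes of ROM contents.
 * Hypothetical helper; this is not how the patch sizes the BAR (the
 * patch uses the stat() size of the sysfs rom file directly). */
static size_t rom_bar_size_for(size_t len)
{
    size_t size = 1;

    assert(len > 0);
    /* Guard against shifting past the top bit of size_t. */
    assert(len <= ((size_t)1 << (sizeof(size_t) * 8 - 1)));

    while (size < len) {
        size <<= 1;
    }
    return size;
}

/* Bytes of a window-sized buffer that hold no ROM contents (0xff fill). */
static size_t rom_padding(size_t window, size_t contents)
{
    assert(contents <= window);
    return window - contents;
}
```

For example, a device exposing a 128K window with 63K of readable contents would get a 64K BAR from rom_bar_size_for(), while a window-sized buffer would carry 65K of fill.
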
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
From: Michael S. Tsirkin @ 2010-10-07 22:45 UTC
To: Alex Williamson; +Cc: kvm, ddutile, chrisw

On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > --- a/hw/device-assignment.c
> > > +++ b/hw/device-assignment.c
> ...
> > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > >   */
> > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > >  {
> > > -    int size, len, ret;
> > > -    void *buf;
> > > +    char name[32], rom_file[64];
> > >      FILE *fp;
> > > -    uint8_t i = 1;
> > > -    char rom_file[64];
> > > +    uint8_t val;
> > > +    struct stat st;
> > > +    void *ptr;
> > > +
> > > +    /* If loading ROM from file, pci handles it */
> > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > +        return;
> > >
> > >      snprintf(rom_file, sizeof(rom_file),
> > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > >
> > > -    if (access(rom_file, F_OK))
> > > +    if (stat(rom_file, &st)) {
> > >          return;
> > > +    }
> > >
> >
> > Just a note that stat on the ROM sysfs file returns window size,
> > not the ROM size. So this allocates more ram than really necessary for
> > ROM. Real size is returned by fread.
> >
> > Do we care?
>
> That was my intention with using stat. I thought that by default the
> ROM BAR should match physical hardware, so even if the contents could be
> rounded down to a smaller size, we maintain the size of the physical
> device. To use the minimum size, the contents could be extracted using
> pci-sysfs and passed with the romfile option, or the ROM could be
> disabled altogether with the rombar=0 option. Sound reasonable?
> Thanks,
>
> Alex

For BAR size yes, but we do not need the buffer full of 0xff as it is
never accessed: let's have buffer size match real ROM, avoid wasting
memory: this can come up to megabytes easily.
Makes sense?

* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
From: Alex Williamson @ 2010-10-08 4:02 UTC
To: Michael S. Tsirkin; +Cc: kvm, ddutile, chrisw

On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote:
> On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > > --- a/hw/device-assignment.c
> > > > +++ b/hw/device-assignment.c
> > ...
> > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > >   */
> > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > >  {
> > > > -    int size, len, ret;
> > > > -    void *buf;
> > > > +    char name[32], rom_file[64];
> > > >      FILE *fp;
> > > > -    uint8_t i = 1;
> > > > -    char rom_file[64];
> > > > +    uint8_t val;
> > > > +    struct stat st;
> > > > +    void *ptr;
> > > > +
> > > > +    /* If loading ROM from file, pci handles it */
> > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > +        return;
> > > >
> > > >      snprintf(rom_file, sizeof(rom_file),
> > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > >
> > > > -    if (access(rom_file, F_OK))
> > > > +    if (stat(rom_file, &st)) {
> > > >          return;
> > > > +    }
> > > >
> > >
> > > Just a note that stat on the ROM sysfs file returns window size,
> > > not the ROM size. So this allocates more ram than really necessary for
> > > ROM. Real size is returned by fread.
> > >
> > > Do we care?
> >
> > That was my intention with using stat. I thought that by default the
> > ROM BAR should match physical hardware, so even if the contents could be
> > rounded down to a smaller size, we maintain the size of the physical
> > device. To use the minimum size, the contents could be extracted using
> > pci-sysfs and passed with the romfile option, or the ROM could be
> > disabled altogether with the rombar=0 option. Sound reasonable?
> > Thanks,
> >
> > Alex
>
> For BAR size yes, but we do not need the buffer full of 0xff as it is
> never accessed: let's have buffer size match real ROM, avoid wasting
> memory: this can come up to megabytes easily.
> Makes sense?

I tend to doubt that hardware vendors are going to waste money putting
seriously oversized eeproms on devices. It does seem pretty typical to
find graphics cards with 128K ROM BARs where the actual ROM squeezes
just under 64K, but that's a long way from megabytes of wasted memory.
The only device I have with a ROM BAR in the megabytes is an 82576, but
it comes up as an invalid rom through pci-sysfs, so we skip it. I
assume that just means someone was lazy and didn't bother to fuse a
transistor that disables the ROM BAR, leaving it at its maximum
aperture w/ no eeprom to back it. Anyone know? Examples to the
contrary welcome.

So I think the question comes down to whether there's any value to
trying to exactly mimic the resource layout of the device. I'm doubtful
that there is, but at the potential cost of 10-100s of KBs of memory, I
thought it might be worthwhile. If you feel strongly otherwise, I'll
follow-up with a patch to size it by the actual readable contents.
Thanks,

Alex

* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
From: Michael S. Tsirkin @ 2010-10-08 8:40 UTC
To: Alex Williamson; +Cc: kvm, ddutile, chrisw

On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote:
> On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote:
> > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > > > --- a/hw/device-assignment.c
> > > > > +++ b/hw/device-assignment.c
> > > ...
> > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > > >   */
> > > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > > >  {
> > > > > -    int size, len, ret;
> > > > > -    void *buf;
> > > > > +    char name[32], rom_file[64];
> > > > >      FILE *fp;
> > > > > -    uint8_t i = 1;
> > > > > -    char rom_file[64];
> > > > > +    uint8_t val;
> > > > > +    struct stat st;
> > > > > +    void *ptr;
> > > > > +
> > > > > +    /* If loading ROM from file, pci handles it */
> > > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > > +        return;
> > > > >
> > > > >      snprintf(rom_file, sizeof(rom_file),
> > > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > > >
> > > > > -    if (access(rom_file, F_OK))
> > > > > +    if (stat(rom_file, &st)) {
> > > > >          return;
> > > > > +    }
> > > > >
> > > >
> > > > Just a note that stat on the ROM sysfs file returns window size,
> > > > not the ROM size. So this allocates more ram than really necessary for
> > > > ROM. Real size is returned by fread.
> > > >
> > > > Do we care?
> > >
> > > That was my intention with using stat. I thought that by default the
> > > ROM BAR should match physical hardware, so even if the contents could be
> > > rounded down to a smaller size, we maintain the size of the physical
> > > device. To use the minimum size, the contents could be extracted using
> > > pci-sysfs and passed with the romfile option, or the ROM could be
> > > disabled altogether with the rombar=0 option. Sound reasonable?
> > > Thanks,
> > >
> > > Alex
> >
> > For BAR size yes, but we do not need the buffer full of 0xff as it is
> > never accessed: let's have buffer size match real ROM, avoid wasting
> > memory: this can come up to megabytes easily.
> > Makes sense?
>
> I tend to doubt that hardware vendors are going to waste money putting
> seriously oversized eeproms on devices. It does seem pretty typical to
> find graphics cards with 128K ROM BARs where the actual ROM squeezes
> just under 64K, but that's a long way from megabytes of wasted memory.
> The only device I have with a ROM BAR in the megabytes is an 82576, but
> it comes up as an invalid rom through pci-sysfs, so we skip it. I
> assume that just means someone was lazy and didn't bother to fuse a
> transistor that disables the ROM BAR, leaving it at its maximum
> aperture w/ no eeprom to back it. Anyone know? Examples to the
> contrary welcome.
>
> So I think the question comes down to whether there's any value to
> trying to exactly mimic the resource layout of the device. I'm doubtful
> that there is, but at the potential cost of 10-100s of KBs of memory, I
> thought it might be worthwhile. If you feel strongly otherwise, I'll
> follow-up with a patch to size it by the actual readable contents.
> Thanks,
>
> Alex

I actually agree sizing ROM BAR exactly the same as the device is a good
idea. I just thought we can save the extra memory by not allocating the
RAM in question, and writing code to return 0xff on reads within the BAR
but outside ROM.

And no, I don't feel strongly about this optimization.

--
MST
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM 2010-10-08 8:40 ` Michael S. Tsirkin @ 2010-10-08 15:12 ` Alex Williamson 2010-10-09 21:44 ` Michael S. Tsirkin 0 siblings, 1 reply; 15+ messages in thread From: Alex Williamson @ 2010-10-08 15:12 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: kvm, ddutile, chrisw On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote: > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote: > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote: > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote: > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote: > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote: > > > > > > --- a/hw/device-assignment.c > > > > > > +++ b/hw/device-assignment.c > > > > ... > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) > > > > > > */ > > > > > > static void assigned_dev_load_option_rom(AssignedDevice *dev) > > > > > > { > > > > > > - int size, len, ret; > > > > > > - void *buf; > > > > > > + char name[32], rom_file[64]; > > > > > > FILE *fp; > > > > > > - uint8_t i = 1; > > > > > > - char rom_file[64]; > > > > > > + uint8_t val; > > > > > > + struct stat st; > > > > > > + void *ptr; > > > > > > + > > > > > > + /* If loading ROM from file, pci handles it */ > > > > > > + if (dev->dev.romfile || !dev->dev.rom_bar) > > > > > > + return; > > > > > > > > > > > > snprintf(rom_file, sizeof(rom_file), > > > > > > "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom", > > > > > > dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func); > > > > > > > > > > > > - if (access(rom_file, F_OK)) > > > > > > + if (stat(rom_file, &st)) { > > > > > > return; > > > > > > + } > > > > > > > > > > > > > > > > Just a note that stat on the ROM sysfs file returns window size, > > > > > not the ROM size. So this allocates more ram than really necessary for > > > > > ROM. 
Real size is returned by fread. > > > > > > > > > > Do we care? > > > > > > > > That was my intention with using stat. I thought that by default the > > > > ROM BAR should match physical hardware, so even if the contents could be > > > > rounded down to a smaller size, we maintain the size of the physical > > > > device. To use the minimum size, the contents could be extracted using > > > > pci-sysfs and passed with the romfile option, or the ROM could be > > > > disabled altogether with the rombar=0 option. Sound reasonable? > > > > Thanks, > > > > > > > > Alex > > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is > > > never accessed: let's have buffer size match real ROM, avoid wasting > > > memory: this can come up to megabytes easily. > > > Makes sense? > > > > I tend to doubt that hardware vendors are going to waste money putting > > seriously oversized eeproms on devices. It does seem pretty typical to > > find graphics cards with 128K ROM BARs where the actual ROM squeezes > > just under 64K, but that's a long way from megabytes of wasted memory. > > The only device I have with a ROM BAR in the megabytes is an 82576, but > > it comes up as an invalid rom through pci-sysfs, so we skip it. I > > assume that just means someone was lazy and didn't bother to fuse a > > transistor that disables the ROM BAR, leaving it at it's maximum > > aperture w/ no eeprom to back it. Anyone know? Examples to the > > contrary welcome. > > > > So I think the question comes down to whether there's any value to > > trying to exactly mimic the resource layout of the device. I'm doubtful > > that there is, but at the potential cost of 10-100s of KBs of memory, I > > thought it might be worthwhile. If you feel strongly otherwise, I'll > > follow-up with a patch to size it by the actual readable contents. > > Thanks, > > > > Alex > > I actually agree sizing ROM BAR exactly the same as the device > is a good idea. 
I just thought we can save the extra memory > by not allocating the RAM in question, and writing code > to return 0xff on reads within the BAR but outside ROM. > And no, I don't feel strongly about this optimization. > Ok, so you're looking for something like below. We can no longer map the ROM into the guest, but it's a ROM, so we don't care about speed. Here's the big problem... it breaks migration. The ramblock live migration code isn't going to deal well with migration from a VM with a BAR sized ramblock to a ROM sized ramblock (likewise the reverse). So we could do it for passthrough devices since they can't migrate anyway, but then we have to go back to separate code to handle assigned device ROMs vs emulated device ROMs. Good idea, but I don't think it's worth the effort. Thanks, Alex Not Signed-off, Not to be applied... diff --git a/hw/device-assignment.c b/hw/device-assignment.c index 26cb797..94561ef 100644 --- a/hw/device-assignment.c +++ b/hw/device-assignment.c @@ -1622,6 +1622,7 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) static void assigned_dev_load_option_rom(AssignedDevice *dev) { char name[32], rom_file[64]; + size_t size; FILE *fp; uint8_t val; struct stat st; @@ -1654,20 +1655,23 @@ static void assigned_dev_load_option_rom(AssignedDevice *dev) if (fwrite(&val, 1, 1, fp) != 1) { goto close_rom; } + + fseek(fp, 0, SEEK_END); + size = ftell(fp); fseek(fp, 0, SEEK_SET); snprintf(name, sizeof(name), "%s.rom", dev->dev.qdev.info->name); - dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, st.st_size); + dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, size); + dev->dev.rom_size = size; ptr = qemu_get_ram_ptr(dev->dev.rom_offset); - memset(ptr, 0xff, st.st_size); - if (!fread(ptr, 1, st.st_size, fp)) { + if (!fread(ptr, 1, size, fp)) { fprintf(stderr, "pci-assign: Cannot read from host %s\n" "\tDevice option ROM contents are probably invalid " "(check dmesg).\n\tSkip option ROM probe with rombar=0, 
" "or load from file with romfile=\n", rom_file); qemu_ram_free(dev->dev.rom_offset); - dev->dev.rom_offset = 0; + dev->dev.rom_offset = dev->dev.rom_size = 0; goto close_rom; } diff --git a/hw/pci.c b/hw/pci.c index 07e9661..bd15eb7 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1973,9 +1973,49 @@ static uint8_t pci_find_capability_list(PCIDevice *pdev, uint8_t cap_id, return next; } -void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, pcibus_t size, int type) +static uint32_t rom_readb(void *opaque, target_phys_addr_t addr) { - cpu_register_physical_memory(addr, size, pdev->rom_offset); + PCIDevice *pdev = opaque; + + if (addr > pdev->rom_size) + return 0xff; + + return *(uint8_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); +} + +static uint32_t rom_readw(void *opaque, target_phys_addr_t addr) +{ + PCIDevice *pdev = opaque; + + if (addr > pdev->rom_size) + return 0xffff; + + return *(uint16_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); +} + +static uint32_t rom_readl(void *opaque, target_phys_addr_t addr) +{ + PCIDevice *pdev = opaque; + + if (addr > pdev->rom_size) + return 0xffffffff; + + return *(uint32_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); +} + +static CPUReadMemoryFunc * const rom_reads[] = { + &rom_readb, &rom_readw, &rom_readl +}; + +static CPUWriteMemoryFunc * const rom_writes[] = { NULL, NULL, NULL }; + +void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, + pcibus_t size, int type) +{ + int m; + + m = cpu_register_io_memory(rom_reads, rom_writes, pdev); + cpu_register_physical_memory(addr, size, m); } /* Add an option rom for the device */ @@ -2016,9 +2056,7 @@ static int pci_add_option_rom(PCIDevice *pdev) __FUNCTION__, pdev->romfile); return -1; } - if (size & (size - 1)) { - size = 1 << qemu_fls(size); - } + pdev->rom_size = size; if (pdev->qdev.info->vmsd) snprintf(name, sizeof(name), "%s.rom", pdev->qdev.info->vmsd->name); @@ -2030,6 +2068,11 @@ static int pci_add_option_rom(PCIDevice *pdev) 
load_image(path, ptr); qemu_free(path); + /* Round up size for the BAR */ + if (size & (size - 1)) { + size = 1 << qemu_fls(size); + } + pci_register_bar(pdev, PCI_ROM_SLOT, size, 0, pci_map_option_rom); @@ -2042,7 +2085,7 @@ static void pci_del_option_rom(PCIDevice *pdev) return; qemu_ram_free(pdev->rom_offset); - pdev->rom_offset = 0; + pdev->rom_offset = pdev->rom_size = 0; } /* Reserve space and add capability to the linked list in pci config space */ diff --git a/hw/pci.h b/hw/pci.h index 9ee8db3..ed87b1a 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -187,6 +187,7 @@ struct PCIDevice { /* Location of option rom */ char *romfile; ram_addr_t rom_offset; + size_t rom_size; uint32_t rom_bar; /* How much space does an MSIX table need. */ ^ permalink raw reply related [flat|nested] 15+ messages in thread
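Editorial sketch (not part of the posted patch): the read handlers above test `addr > pdev->rom_size`, which appears to let an access at exactly `rom_size`, or a 2- or 4-byte access that straddles the end of the ROM, touch bytes past the backing buffer. The standalone helper below shows one way the bounds check could be written so that every byte at or beyond the ROM contents reads back as 0xff. It assumes `addr` is an offset from the start of the BAR and that values are assembled little-endian; `rom_window_read` and its parameters are hypothetical names, not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU code: read `width` bytes (1, 2 or 4) at
 * offset `addr` inside a BAR whose first `rom_size` bytes are backed by
 * `rom`. Any byte at or past `rom_size` reads as 0xff, so an access that
 * straddles the end of the ROM gets a mix of real and filler bytes.
 * The result is assembled in little-endian order.
 */
static uint32_t rom_window_read(const uint8_t *rom, size_t rom_size,
                                size_t addr, unsigned width)
{
    uint32_t val = 0;
    unsigned i;

    assert(width == 1 || width == 2 || width == 4);

    for (i = 0; i < width; i++) {
        size_t off = addr + i;
        /* Strictly less-than: offset == rom_size is already outside. */
        uint8_t byte = (off < rom_size) ? rom[off] : 0xff;

        val |= (uint32_t)byte << (8 * i);
    }
    return val;
}
```

Checking each byte separately costs a little speed, but as noted in the thread, speed is not a concern for option ROM reads.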
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM 2010-10-08 15:12 ` Alex Williamson @ 2010-10-09 21:44 ` Michael S. Tsirkin 2010-10-11 15:15 ` Alex Williamson 0 siblings, 1 reply; 15+ messages in thread From: Michael S. Tsirkin @ 2010-10-09 21:44 UTC (permalink / raw) To: Alex Williamson; +Cc: kvm, ddutile, chrisw On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote: > On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote: > > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote: > > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote: > > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote: > > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote: > > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote: > > > > > > > --- a/hw/device-assignment.c > > > > > > > +++ b/hw/device-assignment.c > > > > > ... > > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) > > > > > > > */ > > > > > > > static void assigned_dev_load_option_rom(AssignedDevice *dev) > > > > > > > { > > > > > > > - int size, len, ret; > > > > > > > - void *buf; > > > > > > > + char name[32], rom_file[64]; > > > > > > > FILE *fp; > > > > > > > - uint8_t i = 1; > > > > > > > - char rom_file[64]; > > > > > > > + uint8_t val; > > > > > > > + struct stat st; > > > > > > > + void *ptr; > > > > > > > + > > > > > > > + /* If loading ROM from file, pci handles it */ > > > > > > > + if (dev->dev.romfile || !dev->dev.rom_bar) > > > > > > > + return; > > > > > > > > > > > > > > snprintf(rom_file, sizeof(rom_file), > > > > > > > "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom", > > > > > > > dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func); > > > > > > > > > > > > > > - if (access(rom_file, F_OK)) > > > > > > > + if (stat(rom_file, &st)) { > > > > > > > return; > > > > > > > + } > > > > > > > > > > > > > > > > > > > Just a note that stat 
on the ROM sysfs file returns window size, > > > > > > not the ROM size. So this allocates more ram than really necessary for > > > > > > ROM. Real size is returned by fread. > > > > > > > > > > > > Do we care? > > > > > > > > > > That was my intention with using stat. I thought that by default the > > > > > ROM BAR should match physical hardware, so even if the contents could be > > > > > rounded down to a smaller size, we maintain the size of the physical > > > > > device. To use the minimum size, the contents could be extracted using > > > > > pci-sysfs and passed with the romfile option, or the ROM could be > > > > > disabled altogether with the rombar=0 option. Sound reasonable? > > > > > Thanks, > > > > > > > > > > Alex > > > > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is > > > > never accessed: let's have buffer size match real ROM, avoid wasting > > > > memory: this can come up to megabytes easily. > > > > Makes sense? > > > > > > I tend to doubt that hardware vendors are going to waste money putting > > > seriously oversized eeproms on devices. It does seem pretty typical to > > > find graphics cards with 128K ROM BARs where the actual ROM squeezes > > > just under 64K, but that's a long way from megabytes of wasted memory. > > > The only device I have with a ROM BAR in the megabytes is an 82576, but > > > it comes up as an invalid rom through pci-sysfs, so we skip it. I > > > assume that just means someone was lazy and didn't bother to fuse a > > > transistor that disables the ROM BAR, leaving it at it's maximum > > > aperture w/ no eeprom to back it. Anyone know? Examples to the > > > contrary welcome. > > > > > > So I think the question comes down to whether there's any value to > > > trying to exactly mimic the resource layout of the device. I'm doubtful > > > that there is, but at the potential cost of 10-100s of KBs of memory, I > > > thought it might be worthwhile. 
If you feel strongly otherwise, I'll > > > follow-up with a patch to size it by the actual readable contents. > > > Thanks, > > > > > > Alex > > > > I actually agree sizing ROM BAR exactly the same as the device > > is a good idea. I just thought we can save the extra memory > > by not allocating the RAM in question, and writing code > > to return 0xff on reads within the BAR but outside ROM. > > And no, I don't feel strongly about this optimization. > > > > Ok, so you're looking for something like below. We can no longer map > the ROM into the guest, > but it's a ROM, so we don't care about speed. Why can't we map ROM? Map full pages, leave 0xff unmapped. The reason there will be such is because BAR is power of 2. > Here's the big problem... it breaks migration. The ramblock live > migration code isn't going to deal well with migration from a VM with a > BAR sized ramblock to a ROM sized ramblock (likewise the reverse). You mean cross-version migration? Otherwise, why would not both sides be ROM sized? > So > we could do it for passthrough devices since they can't migrate anyway, > but then we have to go back to separate code to handle assigned device > ROMs vs emulated device ROMs. I think this is based on the assumption we do not map ROM. If we do map it, then most of the code is still same, just add 0xff handling for pages after end of ROM. These typically are unaccessed anyway. > Good idea, but I don't think it's worth > the effort. Thanks, > > Alex > > Not Signed-off, Not to be applied... 
> > diff --git a/hw/device-assignment.c b/hw/device-assignment.c > index 26cb797..94561ef 100644 > --- a/hw/device-assignment.c > +++ b/hw/device-assignment.c > @@ -1622,6 +1622,7 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) > static void assigned_dev_load_option_rom(AssignedDevice *dev) > { > char name[32], rom_file[64]; > + size_t size; > FILE *fp; > uint8_t val; > struct stat st; > @@ -1654,20 +1655,23 @@ static void assigned_dev_load_option_rom(AssignedDevice *dev) > if (fwrite(&val, 1, 1, fp) != 1) { > goto close_rom; > } > + > + fseek(fp, 0, SEEK_END); > + size = ftell(fp); I don't think this works: looking at kernel code: loff_t generic_file_llseek_unlocked(struct file *file, loff_t offset, int origin) { struct inode *inode = file->f_mapping->host; switch (origin) { case SEEK_END: offset += inode->i_size; So this seems to still be BAR size, you really need the size returned by fread. > fseek(fp, 0, SEEK_SET); > > snprintf(name, sizeof(name), "%s.rom", dev->dev.qdev.info->name); > - dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, st.st_size); > + dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, size); > + dev->dev.rom_size = size; > ptr = qemu_get_ram_ptr(dev->dev.rom_offset); > - memset(ptr, 0xff, st.st_size); > > - if (!fread(ptr, 1, st.st_size, fp)) { > + if (!fread(ptr, 1, size, fp)) { > fprintf(stderr, "pci-assign: Cannot read from host %s\n" > "\tDevice option ROM contents are probably invalid " > "(check dmesg).\n\tSkip option ROM probe with rombar=0, " > "or load from file with romfile=\n", rom_file); > qemu_ram_free(dev->dev.rom_offset); > - dev->dev.rom_offset = 0; > + dev->dev.rom_offset = dev->dev.rom_size = 0; > goto close_rom; > } > > diff --git a/hw/pci.c b/hw/pci.c > index 07e9661..bd15eb7 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -1973,9 +1973,49 @@ static uint8_t pci_find_capability_list(PCIDevice *pdev, uint8_t cap_id, > return next; > } > > -void pci_map_option_rom(PCIDevice *pdev, 
int region_num, pcibus_t addr, pcibus_t size, int type) > +static uint32_t rom_readb(void *opaque, target_phys_addr_t addr) > { > - cpu_register_physical_memory(addr, size, pdev->rom_offset); > + PCIDevice *pdev = opaque; > + > + if (addr > pdev->rom_size) > + return 0xff; > + > + return *(uint8_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); > +} > + > +static uint32_t rom_readw(void *opaque, target_phys_addr_t addr) > +{ > + PCIDevice *pdev = opaque; > + > + if (addr > pdev->rom_size) > + return 0xffff; > + > + return *(uint16_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); > +} > + > +static uint32_t rom_readl(void *opaque, target_phys_addr_t addr) > +{ > + PCIDevice *pdev = opaque; > + > + if (addr > pdev->rom_size) > + return 0xffffffff; > + > + return *(uint32_t *)qemu_get_ram_ptr(pdev->rom_offset + addr); > +} > + > +static CPUReadMemoryFunc * const rom_reads[] = { > + &rom_readb, &rom_readw, &rom_readl > +}; > + > +static CPUWriteMemoryFunc * const rom_writes[] = { NULL, NULL, NULL }; > + > +void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, > + pcibus_t size, int type) > +{ > + int m; > + > + m = cpu_register_io_memory(rom_reads, rom_writes, pdev); > + cpu_register_physical_memory(addr, size, m); > } > > /* Add an option rom for the device */ > @@ -2016,9 +2056,7 @@ static int pci_add_option_rom(PCIDevice *pdev) > __FUNCTION__, pdev->romfile); > return -1; > } > - if (size & (size - 1)) { > - size = 1 << qemu_fls(size); > - } > + pdev->rom_size = size; > > if (pdev->qdev.info->vmsd) > snprintf(name, sizeof(name), "%s.rom", pdev->qdev.info->vmsd->name); > @@ -2030,6 +2068,11 @@ static int pci_add_option_rom(PCIDevice *pdev) > load_image(path, ptr); > qemu_free(path); > > + /* Round up size for the BAR */ > + if (size & (size - 1)) { > + size = 1 << qemu_fls(size); > + } > + > pci_register_bar(pdev, PCI_ROM_SLOT, size, > 0, pci_map_option_rom); > > @@ -2042,7 +2085,7 @@ static void pci_del_option_rom(PCIDevice *pdev) > return; > > 
qemu_ram_free(pdev->rom_offset); > - pdev->rom_offset = 0; > + pdev->rom_offset = pdev->rom_size = 0; > } > > /* Reserve space and add capability to the linked list in pci config space */ > diff --git a/hw/pci.h b/hw/pci.h > index 9ee8db3..ed87b1a 100644 > --- a/hw/pci.h > +++ b/hw/pci.h > @@ -187,6 +187,7 @@ struct PCIDevice { > /* Location of option rom */ > char *romfile; > ram_addr_t rom_offset; > + size_t rom_size; > uint32_t rom_bar; > > /* How much space does an MSIX table need. */ > ^ permalink raw reply [flat|nested] 15+ messages in thread
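Editorial sketch (not from the thread's patches): the point above is that `fseek(fp, 0, SEEK_END)` reports the inode size, which for the sysfs rom file is the BAR window, so the usable length has to come from what `fread` actually returns. A minimal illustration of that approach, assuming the caller already has a window-sized buffer, might look like the following. `read_rom_contents` is a hypothetical name, and the behaviour of the real sysfs file is as described in the thread rather than verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pre-fill `buf` (`window` bytes, e.g. the size
 * reported by stat) with 0xff, then read until the stream stops
 * returning data. The return value is the number of bytes actually
 * read, which is the candidate "real ROM size"; the rest of the buffer
 * keeps its 0xff filler.
 */
static size_t read_rom_contents(FILE *fp, uint8_t *buf, size_t window)
{
    size_t total = 0;

    assert(fp != NULL && buf != NULL);
    memset(buf, 0xff, window);

    while (total < window) {
        size_t n = fread(buf + total, 1, window - total, fp);

        if (n == 0) {
            break;              /* EOF or error: stop with what we have */
        }
        total += n;
    }
    return total;
}
```

A caller could then store the return value as the ROM size and keep the stat-derived size only for the BAR registration.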
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM 2010-10-09 21:44 ` Michael S. Tsirkin @ 2010-10-11 15:15 ` Alex Williamson 2010-10-11 15:21 ` Michael S. Tsirkin 0 siblings, 1 reply; 15+ messages in thread From: Alex Williamson @ 2010-10-11 15:15 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: kvm, ddutile, chrisw On Sat, 2010-10-09 at 23:44 +0200, Michael S. Tsirkin wrote: > On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote: > > On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote: > > > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote: > > > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote: > > > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote: > > > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote: > > > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote: > > > > > > > > --- a/hw/device-assignment.c > > > > > > > > +++ b/hw/device-assignment.c > > > > > > ... 
> > > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices) > > > > > > > > */ > > > > > > > > static void assigned_dev_load_option_rom(AssignedDevice *dev) > > > > > > > > { > > > > > > > > - int size, len, ret; > > > > > > > > - void *buf; > > > > > > > > + char name[32], rom_file[64]; > > > > > > > > FILE *fp; > > > > > > > > - uint8_t i = 1; > > > > > > > > - char rom_file[64]; > > > > > > > > + uint8_t val; > > > > > > > > + struct stat st; > > > > > > > > + void *ptr; > > > > > > > > + > > > > > > > > + /* If loading ROM from file, pci handles it */ > > > > > > > > + if (dev->dev.romfile || !dev->dev.rom_bar) > > > > > > > > + return; > > > > > > > > > > > > > > > > snprintf(rom_file, sizeof(rom_file), > > > > > > > > "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom", > > > > > > > > dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func); > > > > > > > > > > > > > > > > - if (access(rom_file, F_OK)) > > > > > > > > + if (stat(rom_file, &st)) { > > > > > > > > return; > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > Just a note that stat on the ROM sysfs file returns window size, > > > > > > > not the ROM size. So this allocates more ram than really necessary for > > > > > > > ROM. Real size is returned by fread. > > > > > > > > > > > > > > Do we care? > > > > > > > > > > > > That was my intention with using stat. I thought that by default the > > > > > > ROM BAR should match physical hardware, so even if the contents could be > > > > > > rounded down to a smaller size, we maintain the size of the physical > > > > > > device. To use the minimum size, the contents could be extracted using > > > > > > pci-sysfs and passed with the romfile option, or the ROM could be > > > > > > disabled altogether with the rombar=0 option. Sound reasonable? 
> > > > > > Thanks, > > > > > > > > > > > > Alex > > > > > > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is > > > > > never accessed: let's have buffer size match real ROM, avoid wasting > > > > > memory: this can come up to megabytes easily. > > > > > Makes sense? > > > > > > > > I tend to doubt that hardware vendors are going to waste money putting > > > > seriously oversized eeproms on devices. It does seem pretty typical to > > > > find graphics cards with 128K ROM BARs where the actual ROM squeezes > > > > just under 64K, but that's a long way from megabytes of wasted memory. > > > > The only device I have with a ROM BAR in the megabytes is an 82576, but > > > > it comes up as an invalid rom through pci-sysfs, so we skip it. I > > > > assume that just means someone was lazy and didn't bother to fuse a > > > > transistor that disables the ROM BAR, leaving it at it's maximum > > > > aperture w/ no eeprom to back it. Anyone know? Examples to the > > > > contrary welcome. > > > > > > > > So I think the question comes down to whether there's any value to > > > > trying to exactly mimic the resource layout of the device. I'm doubtful > > > > that there is, but at the potential cost of 10-100s of KBs of memory, I > > > > thought it might be worthwhile. If you feel strongly otherwise, I'll > > > > follow-up with a patch to size it by the actual readable contents. > > > > Thanks, > > > > > > > > Alex > > > > > > I actually agree sizing ROM BAR exactly the same as the device > > > is a good idea. I just thought we can save the extra memory > > > by not allocating the RAM in question, and writing code > > > to return 0xff on reads within the BAR but outside ROM. > > > And no, I don't feel strongly about this optimization. > > > > > > > Ok, so you're looking for something like below. We can no longer map > > the ROM into the guest, > > but it's a ROM, so we don't care about speed. > > Why can't we map ROM? 
Map full pages, leave 0xff unmapped. > The reason there will be such is because BAR is power of 2. If I understand correctly, you're suggesting we round the ROM up to a power of two, allocate a full buffer to back that, and map that to the guest. If the physical device has a larger ROM BAR, the remainder is pointed to a set of read functions that return 0xff and probably never get called. > > Here's the big problem... it breaks migration. The ramblock live > > migration code isn't going to deal well with migration from a VM with a > > BAR sized ramblock to a ROM sized ramblock (likewise the reverse). > > You mean cross-version migration? Otherwise, why would not both > sides be ROM sized? Yes, cross-version migration, though probably not an issue with the above since it doesn't change the size of existing emulated device ROMs. > > So > > we could do it for passthrough devices since they can't migrate anyway, > > but then we have to go back to separate code to handle assigned device > > ROMs vs emulated device ROMs. > > I think this is based on the assumption we do not map ROM. > If we do map it, then most of the code is still same, > just add 0xff handling for pages after end of ROM. > These typically are unaccessed anyway. Not on the no mapping assumption, but the assumption that you were looking to use the minimum buffer to back the ROM. If we agree that it's ok to waste memory rounding the ROM up to a power of two, then things work out a little better, though I'm still dubious whether the memory savings is worth the code necessary to potentially handle the ROM as two discrete pieces. Thanks, Alex ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM 2010-10-11 15:15 ` Alex Williamson @ 2010-10-11 15:21 ` Michael S. Tsirkin 2010-10-11 15:43 ` Alex Williamson 0 siblings, 1 reply; 15+ messages in thread From: Michael S. Tsirkin @ 2010-10-11 15:21 UTC (permalink / raw) To: Alex Williamson; +Cc: kvm, ddutile, chrisw On Mon, Oct 11, 2010 at 09:15:52AM -0600, Alex Williamson wrote: > On Sat, 2010-10-09 at 23:44 +0200, Michael S. Tsirkin wrote: > > On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote: > > > On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote: > > > > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote: > > > > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote: > > > > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote: > > > > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote: > > > > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote: > > > > > > > > > --- a/hw/device-assignment.c > > > > > > > > > +++ b/hw/device-assignment.c > > > > > > > ... 
> > > > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > > > > > > >   */
> > > > > > > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > > > > > > >  {
> > > > > > > > > -    int size, len, ret;
> > > > > > > > > -    void *buf;
> > > > > > > > > +    char name[32], rom_file[64];
> > > > > > > > >      FILE *fp;
> > > > > > > > > -    uint8_t i = 1;
> > > > > > > > > -    char rom_file[64];
> > > > > > > > > +    uint8_t val;
> > > > > > > > > +    struct stat st;
> > > > > > > > > +    void *ptr;
> > > > > > > > > +
> > > > > > > > > +    /* If loading ROM from file, pci handles it */
> > > > > > > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > > > > > > +        return;
> > > > > > > > >
> > > > > > > > >      snprintf(rom_file, sizeof(rom_file),
> > > > > > > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > > > > > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > > > > > > >
> > > > > > > > > -    if (access(rom_file, F_OK))
> > > > > > > > > +    if (stat(rom_file, &st)) {
> > > > > > > > >          return;
> > > > > > > > > +    }
> > > > > > > > >
> > > > > > > >
> > > > > > > > Just a note that stat on the ROM sysfs file returns window size,
> > > > > > > > not the ROM size. So this allocates more ram than really necessary for
> > > > > > > > ROM. Real size is returned by fread.
> > > > > > > >
> > > > > > > > Do we care?
> > > > > > >
> > > > > > > That was my intention with using stat. I thought that by default the
> > > > > > > ROM BAR should match physical hardware, so even if the contents could be
> > > > > > > rounded down to a smaller size, we maintain the size of the physical
> > > > > > > device. To use the minimum size, the contents could be extracted using
> > > > > > > pci-sysfs and passed with the romfile option, or the ROM could be
> > > > > > > disabled altogether with the rombar=0 option. Sound reasonable?
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Alex
> > > > > >
> > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is
> > > > > > never accessed: let's have buffer size match real ROM, avoid wasting
> > > > > > memory: this can come up to megabytes easily.
> > > > > > Makes sense?
> > > > >
> > > > > I tend to doubt that hardware vendors are going to waste money putting
> > > > > seriously oversized eeproms on devices. It does seem pretty typical to
> > > > > find graphics cards with 128K ROM BARs where the actual ROM squeezes
> > > > > just under 64K, but that's a long way from megabytes of wasted memory.
> > > > > The only device I have with a ROM BAR in the megabytes is an 82576, but
> > > > > it comes up as an invalid rom through pci-sysfs, so we skip it. I
> > > > > assume that just means someone was lazy and didn't bother to fuse a
> > > > > transistor that disables the ROM BAR, leaving it at it's maximum
> > > > > aperture w/ no eeprom to back it. Anyone know? Examples to the
> > > > > contrary welcome.
> > > > >
> > > > > So I think the question comes down to whether there's any value to
> > > > > trying to exactly mimic the resource layout of the device. I'm doubtful
> > > > > that there is, but at the potential cost of 10-100s of KBs of memory, I
> > > > > thought it might be worthwhile. If you feel strongly otherwise, I'll
> > > > > follow-up with a patch to size it by the actual readable contents.
> > > > > Thanks,
> > > > >
> > > > > Alex
> > > >
> > > > I actually agree sizing ROM BAR exactly the same as the device
> > > > is a good idea. I just thought we can save the extra memory
> > > > by not allocating the RAM in question, and writing code
> > > > to return 0xff on reads within the BAR but outside ROM.
> > > > And no, I don't feel strongly about this optimization.
> > > >
> > >
> > > Ok, so you're looking for something like below. We can no longer map
> > > the ROM into the guest,
> > > but it's a ROM, so we don't care about speed.
> >
> > Why can't we map ROM? Map full pages, leave 0xff unmapped.
> > The reason there will be such is because BAR is power of 2.
>
> If I understand correctly, you're suggesting we round the ROM up to a
> power of two, allocate a full buffer to back that, and map that to the
> guest.

No, I suggested rounding up to full pages.

> If the physical device has a larger ROM BAR, the remainder is
> pointed to a set of read functions that return 0xff and probably never
> get called.

Yes.

> > > Here's the big problem... it breaks migration. The ramblock live
> > > migration code isn't going to deal well with migration from a VM with a
> > > BAR sized ramblock to a ROM sized ramblock (likewise the reverse).
> >
> > You mean cross-version migration? Otherwise, why would not both
> > sides be ROM sized?
>
> Yes, cross-version migration, though probably not an issue with the
> above since it doesn't change the size of existing emulated device ROMs.
>
> > > So
> > > we could do it for passthrough devices since they can't migrate anyway,
> > > but then we have to go back to separate code to handle assigned device
> > > ROMs vs emulated device ROMs.
> >
> > I think this is based on the assumption we do not map ROM.
> > If we do map it, then most of the code is still same,
> > just add 0xff handling for pages after end of ROM.
> > These typically are unaccessed anyway.
>
> Not on the no mapping assumption, but the assumption that you were
> looking to use the minimum buffer to back the ROM. If we agree that
> it's ok to waste memory rounding the ROM up to a power of two, then
> things work out a little better, though I'm still dubious whether the
> memory savings is worth the code necessary to potentially handle the ROM
> as two discrete pieces. Thanks,
>
> Alex

True. Note this optimization is not there in existing code,
so it's definitely not urgent to implement this - just
something to keep in mind.

--
MST

^ permalink raw reply [flat|nested] 15+ messages in thread
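The split proposed in this message (back the ROM contents with whole pages, and answer reads in the rest of the BAR with 0xff) can be sketched in C. This is only an illustration under stated assumptions, not code from the patch or from QEMU: the names `sketch_rom_layout`, `sketch_split_rom_bar` and `sketch_rom_bar_readb` are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical description of how a ROM BAR would be backed. */
struct sketch_rom_layout {
    size_t mapped_len;   /* bytes backed by RAM: ROM contents rounded up to pages */
    size_t unmapped_len; /* rest of the BAR, served by a handler returning 0xff */
};

/* Split a BAR of bar_size bytes holding rom_len bytes of real ROM contents. */
static struct sketch_rom_layout sketch_split_rom_bar(size_t rom_len,
                                                     size_t bar_size)
{
    struct sketch_rom_layout l;
    size_t mask = (size_t)SKETCH_PAGE_SIZE - 1;

    assert(rom_len <= bar_size);
    l.mapped_len = (rom_len + mask) & ~mask; /* round up to a full page */
    if (l.mapped_len > bar_size) {
        l.mapped_len = bar_size;             /* BAR smaller than one page */
    }
    l.unmapped_len = bar_size - l.mapped_len;
    return l;
}

/*
 * Byte read anywhere in the BAR: ROM contents where they exist, 0xff
 * everywhere else. Simplification: padding inside the last mapped page is
 * treated the same as the unmapped remainder.
 */
static uint8_t sketch_rom_bar_readb(const uint8_t *rom, size_t rom_len,
                                    size_t bar_size, size_t offset)
{
    assert(offset < bar_size);
    return offset < rom_len ? rom[offset] : 0xff;
}
```

With a 128K BAR and 63K of ROM contents (the "just under 64K" case mentioned earlier in the thread), this sketch backs 64K with RAM and leaves the other 64K to the 0xff path.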
* Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM
2010-10-11 15:21 ` Michael S. Tsirkin
@ 2010-10-11 15:43 ` Alex Williamson
0 siblings, 0 replies; 15+ messages in thread
From: Alex Williamson @ 2010-10-11 15:43 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: kvm, ddutile, chrisw

On Mon, 2010-10-11 at 17:21 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 11, 2010 at 09:15:52AM -0600, Alex Williamson wrote:
> > On Sat, 2010-10-09 at 23:44 +0200, Michael S. Tsirkin wrote:
> > > On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote:
> > > > On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote:
> > > > > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> > > > > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > > > > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > > > > > > > > --- a/hw/device-assignment.c
> > > > > > > > > > +++ b/hw/device-assignment.c
> > > > > > > > ...
> > > > > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > > > > > > > >   */
> > > > > > > > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > > > > > > > >  {
> > > > > > > > > > -    int size, len, ret;
> > > > > > > > > > -    void *buf;
> > > > > > > > > > +    char name[32], rom_file[64];
> > > > > > > > > >      FILE *fp;
> > > > > > > > > > -    uint8_t i = 1;
> > > > > > > > > > -    char rom_file[64];
> > > > > > > > > > +    uint8_t val;
> > > > > > > > > > +    struct stat st;
> > > > > > > > > > +    void *ptr;
> > > > > > > > > > +
> > > > > > > > > > +    /* If loading ROM from file, pci handles it */
> > > > > > > > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > > > > > > > +        return;
> > > > > > > > > >
> > > > > > > > > >      snprintf(rom_file, sizeof(rom_file),
> > > > > > > > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > > > > > > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > > > > > > > >
> > > > > > > > > > -    if (access(rom_file, F_OK))
> > > > > > > > > > +    if (stat(rom_file, &st)) {
> > > > > > > > > >          return;
> > > > > > > > > > +    }
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Just a note that stat on the ROM sysfs file returns window size,
> > > > > > > > > not the ROM size. So this allocates more ram than really necessary for
> > > > > > > > > ROM. Real size is returned by fread.
> > > > > > > > >
> > > > > > > > > Do we care?
> > > > > > > >
> > > > > > > > That was my intention with using stat. I thought that by default the
> > > > > > > > ROM BAR should match physical hardware, so even if the contents could be
> > > > > > > > rounded down to a smaller size, we maintain the size of the physical
> > > > > > > > device. To use the minimum size, the contents could be extracted using
> > > > > > > > pci-sysfs and passed with the romfile option, or the ROM could be
> > > > > > > > disabled altogether with the rombar=0 option. Sound reasonable?
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex
> > > > > > >
> > > > > > > For BAR size yes, but we do not need the buffer full of 0xff as it is
> > > > > > > never accessed: let's have buffer size match real ROM, avoid wasting
> > > > > > > memory: this can come up to megabytes easily.
> > > > > > > Makes sense?
> > > > > >
> > > > > > I tend to doubt that hardware vendors are going to waste money putting
> > > > > > seriously oversized eeproms on devices. It does seem pretty typical to
> > > > > > find graphics cards with 128K ROM BARs where the actual ROM squeezes
> > > > > > just under 64K, but that's a long way from megabytes of wasted memory.
> > > > > > The only device I have with a ROM BAR in the megabytes is an 82576, but
> > > > > > it comes up as an invalid rom through pci-sysfs, so we skip it. I
> > > > > > assume that just means someone was lazy and didn't bother to fuse a
> > > > > > transistor that disables the ROM BAR, leaving it at it's maximum
> > > > > > aperture w/ no eeprom to back it. Anyone know? Examples to the
> > > > > > contrary welcome.
> > > > > >
> > > > > > So I think the question comes down to whether there's any value to
> > > > > > trying to exactly mimic the resource layout of the device. I'm doubtful
> > > > > > that there is, but at the potential cost of 10-100s of KBs of memory, I
> > > > > > thought it might be worthwhile. If you feel strongly otherwise, I'll
> > > > > > follow-up with a patch to size it by the actual readable contents.
> > > > > > Thanks,
> > > > > >
> > > > > > Alex
> > > > >
> > > > > I actually agree sizing ROM BAR exactly the same as the device
> > > > > is a good idea. I just thought we can save the extra memory
> > > > > by not allocating the RAM in question, and writing code
> > > > > to return 0xff on reads within the BAR but outside ROM.
> > > > > And no, I don't feel strongly about this optimization.
> > > > >
> > > >
> > > > Ok, so you're looking for something like below. We can no longer map
> > > > the ROM into the guest,
> > > > but it's a ROM, so we don't care about speed.
> > >
> > > Why can't we map ROM? Map full pages, leave 0xff unmapped.
> > > The reason there will be such is because BAR is power of 2.
> >
> > If I understand correctly, you're suggesting we round the ROM up to a
> > power of two, allocate a full buffer to back that, and map that to the
> > guest.
>
> No, I suggested rounding up to full pages.

Unless it's a power of two, like pci_add_option_rom does now, it breaks
migration. The ROM is stored in a buffer allocated from qemu_ram_alloc
(a ramblock), which is migrated via ram_save_live. If we change the
size of that buffer for existing emulated devices with ROMs, we break
migration between versions. I don't think it's worth that breakage or
the ugly code necessary to allow an old to new migration for that small
amount of savings. Besides, if we memset the buffer, ksm will merge
them for us anyway.

Alex

> > If the physical device has a larger ROM BAR, the remainder is
> > pointed to a set of read functions that return 0xff and probably never
> > get called.
>
> Yes.
>
> > > > Here's the big problem... it breaks migration. The ramblock live
> > > > migration code isn't going to deal well with migration from a VM with a
> > > > BAR sized ramblock to a ROM sized ramblock (likewise the reverse).
> > >
> > > You mean cross-version migration? Otherwise, why would not both
> > > sides be ROM sized?
> >
> > Yes, cross-version migration, though probably not an issue with the
> > above since it doesn't change the size of existing emulated device ROMs.
> >
> > > > So
> > > > we could do it for passthrough devices since they can't migrate anyway,
> > > > but then we have to go back to separate code to handle assigned device
> > > > ROMs vs emulated device ROMs.
> > >
> > > I think this is based on the assumption we do not map ROM.
> > > If we do map it, then most of the code is still same,
> > > just add 0xff handling for pages after end of ROM.
> > > These typically are unaccessed anyway.
> >
> > Not on the no mapping assumption, but the assumption that you were
> > looking to use the minimum buffer to back the ROM. If we agree that
> > it's ok to waste memory rounding the ROM up to a power of two, then
> > things work out a little better, though I'm still dubious whether the
> > memory savings is worth the code necessary to potentially handle the ROM
> > as two discrete pieces. Thanks,
> >
> > Alex
>
> True. Note this optimization is not there in existing code,
> so it's definitely not urgent to implement this - just
> something to keep in mind.
>

^ permalink raw reply [flat|nested] 15+ messages in thread
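The two points in this reply (a power-of-two sized buffer, pre-filled with a constant so that identical pages can be merged) can be sketched as follows. This is a rough illustration only: `sketch_round_up_pow2` and `sketch_fill_rom_buffer` are hypothetical helpers written for this note, not functions from QEMU, and the sketch takes the ROM contents from memory rather than reading the sysfs rom file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round n up to the next power of two; returns n if it already is one. */
static uint64_t sketch_round_up_pow2(uint64_t n)
{
    uint64_t p = 1;

    assert(n <= (UINT64_C(1) << 63)); /* keep the shift below from overflowing */
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * Fill a buffer of buf_size bytes: 0xff everywhere first, then the real ROM
 * contents at the start. The tail stays a constant 0xff pattern.
 */
static void sketch_fill_rom_buffer(uint8_t *buf, size_t buf_size,
                                   const uint8_t *contents, size_t len)
{
    assert(len <= buf_size);
    memset(buf, 0xff, buf_size);
    memcpy(buf, contents, len);
}
```

For example, 40K of ROM contents would be rounded up to a 64K buffer here, with the last 24K left as 0xff fill.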
* Re: [PATCH 0/2] device-assignment: Re-work PCI option ROM support
2010-10-04 21:26 [PATCH 0/2] device-assignment: Re-work PCI option ROM support Alex Williamson
2010-10-04 21:26 ` [PATCH 1/2] PCI: Export pci_map_option_rom() Alex Williamson
2010-10-04 21:26 ` [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM Alex Williamson
@ 2010-10-06 20:43 ` Marcelo Tosatti
2 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2010-10-06 20:43 UTC (permalink / raw)
To: Alex Williamson; +Cc: kvm, ddutile, chrisw

On Mon, Oct 04, 2010 at 03:26:18PM -0600, Alex Williamson wrote:
> This cleans up device assignment option ROM support and allows
> us to use romfile and rombar default PCI options. Thanks,
>
> Alex
>
> ---
>
> Alex Williamson (2):
> device-assignment: Allow PCI to manage the option ROM
> PCI: Export pci_map_option_rom()
>
>
> hw/device-assignment.c | 155 +++++++++++++++++++++---------------------------
> hw/device-assignment.h | 4 +
> hw/pci.c | 2 -
> hw/pci.h | 3 +
> 4 files changed, 75 insertions(+), 89 deletions(-)

Applied, thanks.

^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2010-10-11 15:43 UTC | newest]

Thread overview: 15+ messages
2010-10-04 21:26 [PATCH 0/2] device-assignment: Re-work PCI option ROM support Alex Williamson
2010-10-04 21:26 ` [PATCH 1/2] PCI: Export pci_map_option_rom() Alex Williamson
2010-10-05 16:03   ` Chris Wright
2010-10-04 21:26 ` [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM Alex Williamson
2010-10-07 17:18   ` Michael S. Tsirkin
2010-10-07 17:34     ` Alex Williamson
2010-10-07 22:45       ` Michael S. Tsirkin
2010-10-08  4:02         ` Alex Williamson
2010-10-08  8:40           ` Michael S. Tsirkin
2010-10-08 15:12             ` Alex Williamson
2010-10-09 21:44               ` Michael S. Tsirkin
2010-10-11 15:15                 ` Alex Williamson
2010-10-11 15:21                   ` Michael S. Tsirkin
2010-10-11 15:43                     ` Alex Williamson
2010-10-06 20:43 ` [PATCH 0/2] device-assignment: Re-work PCI option ROM support Marcelo Tosatti