From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53774) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XA5dH-0004KY-99 for qemu-devel@nongnu.org; Wed, 23 Jul 2014 19:07:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XA5dB-0003BZ-VH for qemu-devel@nongnu.org; Wed, 23 Jul 2014 19:07:35 -0400 Received: from cantor2.suse.de ([195.135.220.15]:42083 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XA5dB-0003BR-KA for qemu-devel@nongnu.org; Wed, 23 Jul 2014 19:07:29 -0400 Message-ID: <53D0402F.2080407@suse.de> Date: Thu, 24 Jul 2014 01:07:27 +0200 From: Alexander Graf MIME-Version: 1.0 References: <1404716892-15600-1-git-send-email-eric.auger@linaro.org> <1404716892-15600-2-git-send-email-eric.auger@linaro.org> <53BBF59E.5090602@suse.de> <53CFCD89.9080300@linaro.org> In-Reply-To: <53CFCD89.9080300@linaro.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 1/7] hw/misc/platform_devices: helpers for dynamic instantiation of platform devices List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Auger , eric.auger@st.com, christoffer.dall@linaro.org, qemu-devel@nongnu.org, kim.phillips@freescale.com, a.rigo@virtualopensystems.com Cc: peter.maydell@linaro.org, patches@linaro.org, stuart.yoder@freescale.com, alex.williamson@redhat.com, a.motakis@virtualopensystems.com, kvmarm@lists.cs.columbia.edu On 23.07.14 16:58, Eric Auger wrote: > On 07/08/2014 03:43 PM, Alexander Graf wrote: >> On 07.07.14 09:08, Eric Auger wrote: >>> This new module implements routines which help in dynamic instantiation >>> of sysbus devices. Machine files can use those generic routines. >>> >>> --- >>> >>> Dynamic sysbus device allocation fully written by Alex Graf. >>> >>> [Eric Auger] >>> Those functions were initially in ppc e500 machine file. Now moved to a >>> separate module. >>> >>> PPCE500Params is replaced by a generic struct named PlatformParams >>> >>> Signed-off-by: Alexander Graf >>> Signed-off-by: Eric Auger >>> --- >>> hw/misc/Makefile.objs | 1 + >>> hw/misc/platform_devices.c | 217 >>> +++++++++++++++++++++++++++++++++++++ >>> include/hw/misc/platform_devices.h | 61 +++++++++++ >>> 3 files changed, 279 insertions(+) >>> create mode 100644 hw/misc/platform_devices.c >>> create mode 100644 include/hw/misc/platform_devices.h >>> >>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs >>> index e47fea8..d081606 100644 >>> --- a/hw/misc/Makefile.objs >>> +++ b/hw/misc/Makefile.objs >>> @@ -40,3 +40,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o >>> obj-$(CONFIG_ZYNQ) += zynq_slcr.o >>> obj-$(CONFIG_PVPANIC) += pvpanic.o >>> +obj-y += platform_devices.o >>> diff --git a/hw/misc/platform_devices.c b/hw/misc/platform_devices.c >>> new file mode 100644 >>> index 0000000..96ab272 >>> --- /dev/null >>> +++ b/hw/misc/platform_devices.c >>> @@ -0,0 +1,217 @@ >>> +#include "hw/misc/platform_devices.h" >>> +#include "hw/sysbus.h" >>> +#include "qemu/error-report.h" >>> + >>> +#define PAGE_SHIFT 12 >>> + >>> +int sysbus_device_create_devtree(Object *obj, void *opaque) >>> +{ >>> + PlatformDevtreeData *data = opaque; >>> + Object *dev; >>> + SysBusDevice *sbdev; >>> + bool matched = false; >>> + >>> + dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE); >>> + sbdev = (SysBusDevice *)dev; >>> + >>> + if (!sbdev) { >>> + /* Container, traverse it for children */ >>> + return object_child_foreach(obj, >>> sysbus_device_create_devtree, data); >>> + } >>> + >>> + if (!matched) { >>> + error_report("Device %s is not supported by this machine yet.", >>> + qdev_fw_name(DEVICE(dev))); >>> + exit(1); >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +void platform_bus_create_devtree(PlatformParams *params, void *fdt, >>> + const char *mpic) >>> +{ >>> + gchar *node = g_strdup_printf("/platform@%"PRIx64, >>> + params->platform_bus_base); >>> + const char platcomp[] = "qemu,platform\0simple-bus"; >>> + PlatformDevtreeData data; >>> + Object *container; >>> + uint64_t addr = params->platform_bus_base; >>> + uint64_t size = params->platform_bus_size; >>> + int irq_start = params->platform_bus_first_irq; >>> + >>> + /* Create a /platform node that we can put all devices into */ >>> + >>> + qemu_fdt_add_subnode(fdt, node); >>> + qemu_fdt_setprop(fdt, node, "compatible", platcomp, >>> sizeof(platcomp)); >>> + >>> + /* Our platform bus region is less than 32bit big, so 1 cell is >>> enough for >>> + address and size */ >>> + qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1); >>> + qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1); >>> + qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr, >>> size); >>> + >>> + qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic); >>> + >>> + /* Loop through all devices and create nodes for known ones */ >>> + data.fdt = fdt; >>> + data.mpic = mpic; >>> + data.irq_start = irq_start; >>> + data.node = node; >>> + >>> + container = container_get(qdev_get_machine(), "/peripheral"); >>> + sysbus_device_create_devtree(container, &data); >>> + container = container_get(qdev_get_machine(), "/peripheral-anon"); >>> + sysbus_device_create_devtree(container, &data); >>> + >>> + g_free(node); >>> +} >> Device trees are pretty platform (and even machine) specific. Just to >> give you an example - the interrupt specifier on most e500 systems >> really is 4 cells big: >> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt#n80 >> >> >> | Interrupt specifiers consists of 4 cells encoded as >> follows: >> >> <1st-cell> interrupt-number >> >> Identifies the interrupt source. The meaning >> depends on the type of interrupt. >> >> Note: If the interrupt-type cell is undefined >> (i.e. #interrupt-cells = 2), this cell >> should be interpreted the same as for >> interrupt-type 0-- i.e. an external or >> normal SoC device interrupt. >> >> <2nd-cell> level-sense information, encoded as follows: >> 0 = low-to-high edge triggered >> 1 = active low level-sensitive >> 2 = active high level-sensitive >> 3 = high-to-low edge triggered >> >> <3rd-cell> interrupt-type >> >> The following types are supported: >> >> 0 = external or normal SoC device interrupt >> >> The interrupt-number cell contains >> the SoC device interrupt number. The >> type-specific cell is undefined. The >> interrupt-number is derived from the >> MPIC a block of registers referred to as >> the "Interrupt Source Configuration Registers". >> Each source has 32-bytes of registers >> (vector/priority and destination) in this >> region. So interrupt 0 is at offset 0x0, >> interrupt 1 is at offset 0x20, and so on. >> >> 1 = error interrupt >> >> The interrupt-number cell contains >> the SoC device interrupt number for >> the error interrupt. The type-specific >> cell identifies the specific error >> interrupt number. >> >> 2 = MPIC inter-processor interrupt (IPI) >> >> The interrupt-number cell identifies >> the MPIC IPI number. The type-specific >> cell is undefined. >> >> 3 = MPIC timer interrupt >> >> The interrupt-number cell identifies >> the MPIC timer number. The type-specific >> cell is undefined. >> >> <4th-cell> type-specific information >> >> The type-specific cell is encoded as follows: >> >> - For interrupt-type 1 (error interrupt), >> the type-specific cell contains the >> bit number of the error interrupt in the >> Error Interrupt Summary Register. >> | >> >> >> >> >> while on ARM you have a GIC which works like this: >> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/arm/gic.txt#n20 >> >> >> |- #interrupt-cells : Specifies the number of cells needed to encode an >> interrupt source. The type shall be a and the value shall be 3. >> >> The 1st cell is the interrupt type; 0 for SPI interrupts, 1 for PPI >> interrupts. >> >> The 2nd cell contains the interrupt number for the interrupt type. >> SPI interrupts are in the range [0-987]. PPI interrupts are in the >> range [0-15]. >> >> The 3rd cell is the flags, encoded as follows: >> bits[3:0] trigger type and level flags. >> 1 = low-to-high edge triggered >> 2 = high-to-low edge triggered >> 4 = active high level-sensitive >> 8 = active low level-sensitive >> bits[15:8] PPI interrupt cpu mask. Each bit corresponds to each of >> the 8 possible cpus attached to the GIC. A bit set to '1' indicated >> the interrupt is wired to that CPU. Only valid for PPI interrupts. >> | >> >> >> >> Both have vastly different semantics. The number of cells is different, >> the value of the cells is different. Even the definition how to >> represent edge vs level triggered interrupts differs. >> >> I don't think this will stop with interrupts. Maybe someone wants to add >> a special machine check flag to addresses on a platform and then >> "ranges" and "regs" will have different semantics on different >> platforms. There is a lot that can go wrong when you try to unify this >> code. > Hi Alex, > > thank you for giving such an example. Indeed I was not aware there were > such huge discrepancies. I guess this comment mostly holds for the > actual device node generation (what I specialized in the parent QEMU > device) and not for the "qemu, platform simple-bus" node generation? Is the dt generation that much code? Just duplicate it for now - we can generalize it later if we see how things work out. > >>> + >>> +int platform_bus_map_irq(PlatformParams *params, SysBusDevice *sbdev, >>> + int n, unsigned long *used_irqs, >>> + qemu_irq *platform_irqs) >>> +{ >>> + int max_irqs = params->platform_bus_num_irqs; >>> + char *prop = g_strdup_printf("irq[%d]", n); >>> + int irqn = object_property_get_int(OBJECT(sbdev), prop, NULL); >>> + >>> + if (irqn == SYSBUS_DYNAMIC) { >>> + /* Find the first available IRQ */ >>> + irqn = find_first_zero_bit(used_irqs, max_irqs); >>> + } >>> + >>> + if ((irqn >= max_irqs) || test_and_set_bit(irqn, used_irqs)) { >>> + hw_error("IRQ %d is already allocated or no free IRQ left", >>> irqn); >>> + } >>> + >>> + sysbus_connect_irq(sbdev, n, platform_irqs[irqn]); >>> + object_property_set_int(OBJECT(sbdev), irqn, prop, NULL); >>> + >>> + g_free(prop); >>> + return 0; >>> +} >>> + >>> +int platform_bus_map_mmio(PlatformParams *params, SysBusDevice *sbdev, >>> + int n, unsigned long *used_mem, >>> + MemoryRegion *pmem) >>> +{ >>> + MemoryRegion *device_mem = sbdev->mmio[n].memory; >>> + uint64_t size = memory_region_size(device_mem); >>> + uint64_t page_size = (1 << PAGE_SHIFT); >>> + uint64_t page_mask = page_size - 1; >>> + uint64_t size_pages = (size + page_mask) >> PAGE_SHIFT; >>> + uint64_t max_size = params->platform_bus_size; >>> + uint64_t max_pages = max_size >> PAGE_SHIFT; >>> + char *prop = g_strdup_printf("mmio[%d]", n); >>> + hwaddr addr = object_property_get_int(OBJECT(sbdev), prop, NULL); >>> + int page; >>> + int i; >>> + >>> + page = addr >> PAGE_SHIFT; >>> + if (addr == SYSBUS_DYNAMIC) { >>> + uint64_t size_pages_align; >>> + >>> + /* Align the region to at least its own size granularity */ >>> + if (is_power_of_2(size_pages)) { >>> + size_pages_align = size_pages; >>> + } else { >>> + size_pages_align = pow2floor(size_pages) << 1; >>> + } >>> + >>> + /* Find the first available region that fits */ >>> + page = bitmap_find_next_zero_area(used_mem, max_pages, 0, >>> size_pages, >>> + size_pages_align); >>> + >>> + addr = (uint64_t)page << PAGE_SHIFT; >>> + } >>> + >>> + if (page >= max_pages || test_bit(page, used_mem) || >>> + (find_next_bit(used_mem, max_pages, page) < size_pages)) { >>> + hw_error("Memory [%"PRIx64":%"PRIx64" is already allocated or " >>> + "no slot left", addr, size); >>> + } >>> + >>> + for (i = page; i < (page + size_pages); i++) { >>> + set_bit(i, used_mem); >>> + } >>> + >>> + memory_region_add_subregion(pmem, addr, device_mem); >>> + sbdev->mmio[n].addr = addr; >>> + object_property_set_int(OBJECT(sbdev), addr, prop, NULL); >>> + >>> + g_free(prop); >>> + return 0; >>> +} >>> + >>> +int sysbus_device_check(Object *obj, void *opaque) >>> +{ >>> + PlatformBusInitData *init = opaque; >>> + Object *dev; >>> + SysBusDevice *sbdev; >>> + int i; >>> + >>> + dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE); >>> + sbdev = (SysBusDevice *)dev; >>> + >>> + if (!sbdev) { >>> + /* Container, traverse it for children */ >>> + return object_child_foreach(obj, sysbus_device_check, opaque); >>> + } >>> + >>> + /* Connect sysbus device to virtual platform bus */ >>> + for (i = 0; i < sbdev->num_irq; i++) { >>> + if (!sbdev->irqp[i]) { >>> + /* This IRQ is an incoming IRQ, we can't wire those here */ >>> + continue; >>> + } >>> + platform_bus_map_irq(init->params, sbdev, i, >>> + init->used_irqs, init->irqs); >>> + } >>> + >>> + for (i = 0; i < sbdev->num_mmio; i++) { >>> + platform_bus_map_mmio(init->params, sbdev, i, >>> + init->used_mem, init->mem); >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +void platform_bus_init(PlatformParams *params, >>> + MemoryRegion *address_space_mem, >>> + qemu_irq *mpic) >>> +{ >>> + uint64_t max_size = params->platform_bus_size; >>> + uint64_t max_pages = max_size >> PAGE_SHIFT; >>> + DECLARE_BITMAP(used_irqs, params->platform_bus_num_irqs); >>> + DECLARE_BITMAP(used_mem, max_pages); >>> + MemoryRegion *platform_region = g_new(MemoryRegion, 1); >>> + Object *container; >>> + PlatformBusInitData init = { >>> + .used_irqs = used_irqs, >>> + .used_mem = used_mem, >>> + .mem = platform_region, >>> + .irqs = &mpic[params->platform_bus_first_irq], >>> + .params = params, >>> + }; >>> + >>> + memory_region_init(platform_region, NULL, "platform devices", >>> + params->platform_bus_size); >>> + >>> + bitmap_clear(used_irqs, 0, params->platform_bus_num_irqs); >>> + bitmap_clear(used_mem, 0, max_pages); >>> + >>> + /* Loop through all sysbus devices that were spawened outside the >>> machine */ >>> + container = container_get(qdev_get_machine(), "/peripheral"); >>> + sysbus_device_check(container, &init); >>> + container = container_get(qdev_get_machine(), "/peripheral-anon"); >>> + sysbus_device_check(container, &init); >>> + >>> + memory_region_add_subregion(address_space_mem, >>> params->platform_bus_base, >>> + platform_region); >>> +} >> However, I do think it's a good idea to generalize the "platform bus" >> device if you want to reuse it on ARM. The mmio / irq allocator is >> pretty straight forward and should be generic enough for you to use. > I need clarification here: do you talk about your very first patch > "Platform Device Support" or the code above with a proper solution for > device tree generation? I'm talking about the actual implementation of the allocation logic. I think we're better off to keep all the device tree logic purely in the machine files for now. >> If you do this, please don't duplicate the code but rather move it from >> the e500 file into your new one :). > OK. do you mean modifying the e500.c code to call those routines? My > concern is about testing. Why? We have a u-boot binary that starts up based on the device tree with TCG if you just start the e500plat machine and if you like I can easily give you a nice guest kernel and rootfs as well ;). Alex