From: Alexander Graf <agraf@suse.de>
To: Eric Auger <eric.auger@linaro.org>,
eric.auger@st.com, christoffer.dall@linaro.org,
qemu-devel@nongnu.org, kim.phillips@freescale.com,
a.rigo@virtualopensystems.com
Cc: peter.maydell@linaro.org, patches@linaro.org,
stuart.yoder@freescale.com, alex.williamson@redhat.com,
a.motakis@virtualopensystems.com, kvmarm@lists.cs.columbia.edu
Subject: Re: [Qemu-devel] [PATCH 1/7] hw/misc/platform_devices: helpers for dynamic instantiation of platform devices
Date: Thu, 24 Jul 2014 01:07:27 +0200 [thread overview]
Message-ID: <53D0402F.2080407@suse.de> (raw)
In-Reply-To: <53CFCD89.9080300@linaro.org>
On 23.07.14 16:58, Eric Auger wrote:
> On 07/08/2014 03:43 PM, Alexander Graf wrote:
>> On 07.07.14 09:08, Eric Auger wrote:
>>> This new module implements routines which help in dynamic instantiation
>>> of sysbus devices. Machine files can use those generic routines.
>>>
>>> ---
>>>
>>> Dynamic sysbus device allocation fully written by Alex Graf.
>>>
>>> [Eric Auger]
>>> Those functions were initially in ppc e500 machine file. Now moved to a
>>> separate module.
>>>
>>> PPCE500Params is replaced by a generic struct named PlatformParams
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>> ---
>>> hw/misc/Makefile.objs | 1 +
>>> hw/misc/platform_devices.c | 217
>>> +++++++++++++++++++++++++++++++++++++
>>> include/hw/misc/platform_devices.h | 61 +++++++++++
>>> 3 files changed, 279 insertions(+)
>>> create mode 100644 hw/misc/platform_devices.c
>>> create mode 100644 include/hw/misc/platform_devices.h
>>>
>>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
>>> index e47fea8..d081606 100644
>>> --- a/hw/misc/Makefile.objs
>>> +++ b/hw/misc/Makefile.objs
>>> @@ -40,3 +40,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
>>> obj-$(CONFIG_ZYNQ) += zynq_slcr.o
>>> obj-$(CONFIG_PVPANIC) += pvpanic.o
>>> +obj-y += platform_devices.o
>>> diff --git a/hw/misc/platform_devices.c b/hw/misc/platform_devices.c
>>> new file mode 100644
>>> index 0000000..96ab272
>>> --- /dev/null
>>> +++ b/hw/misc/platform_devices.c
>>> @@ -0,0 +1,217 @@
>>> +#include "hw/misc/platform_devices.h"
>>> +#include "hw/sysbus.h"
>>> +#include "qemu/error-report.h"
>>> +
>>> +#define PAGE_SHIFT 12
>>> +
>>> +int sysbus_device_create_devtree(Object *obj, void *opaque)
>>> +{
>>> + PlatformDevtreeData *data = opaque;
>>> + Object *dev;
>>> + SysBusDevice *sbdev;
>>> + bool matched = false;
>>> +
>>> + dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE);
>>> + sbdev = (SysBusDevice *)dev;
>>> +
>>> + if (!sbdev) {
>>> + /* Container, traverse it for children */
>>> + return object_child_foreach(obj,
>>> sysbus_device_create_devtree, data);
>>> + }
>>> +
>>> + if (!matched) {
>>> + error_report("Device %s is not supported by this machine yet.",
>>> + qdev_fw_name(DEVICE(dev)));
>>> + exit(1);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +void platform_bus_create_devtree(PlatformParams *params, void *fdt,
>>> + const char *mpic)
>>> +{
>>> + gchar *node = g_strdup_printf("/platform@%"PRIx64,
>>> + params->platform_bus_base);
>>> + const char platcomp[] = "qemu,platform\0simple-bus";
>>> + PlatformDevtreeData data;
>>> + Object *container;
>>> + uint64_t addr = params->platform_bus_base;
>>> + uint64_t size = params->platform_bus_size;
>>> + int irq_start = params->platform_bus_first_irq;
>>> +
>>> + /* Create a /platform node that we can put all devices into */
>>> +
>>> + qemu_fdt_add_subnode(fdt, node);
>>> + qemu_fdt_setprop(fdt, node, "compatible", platcomp,
>>> sizeof(platcomp));
>>> +
>>> + /* Our platform bus region is less than 32bit big, so 1 cell is
>>> enough for
>>> + address and size */
>>> + qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1);
>>> + qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1);
>>> + qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr,
>>> size);
>>> +
>>> + qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic);
>>> +
>>> + /* Loop through all devices and create nodes for known ones */
>>> + data.fdt = fdt;
>>> + data.mpic = mpic;
>>> + data.irq_start = irq_start;
>>> + data.node = node;
>>> +
>>> + container = container_get(qdev_get_machine(), "/peripheral");
>>> + sysbus_device_create_devtree(container, &data);
>>> + container = container_get(qdev_get_machine(), "/peripheral-anon");
>>> + sysbus_device_create_devtree(container, &data);
>>> +
>>> + g_free(node);
>>> +}
>> Device trees are pretty platform (and even machine) specific. Just to
>> give you an example - the interrupt specifier on most e500 systems
>> really is 4 cells big:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt#n80
>>
>>
>> | Interrupt specifiers consists of 4 cells encoded as
>> follows:
>>
>> <1st-cell> interrupt-number
>>
>> Identifies the interrupt source. The meaning
>> depends on the type of interrupt.
>>
>> Note: If the interrupt-type cell is undefined
>> (i.e. #interrupt-cells = 2), this cell
>> should be interpreted the same as for
>> interrupt-type 0-- i.e. an external or
>> normal SoC device interrupt.
>>
>> <2nd-cell> level-sense information, encoded as follows:
>> 0 = low-to-high edge triggered
>> 1 = active low level-sensitive
>> 2 = active high level-sensitive
>> 3 = high-to-low edge triggered
>>
>> <3rd-cell> interrupt-type
>>
>> The following types are supported:
>>
>> 0 = external or normal SoC device interrupt
>>
>> The interrupt-number cell contains
>> the SoC device interrupt number. The
>> type-specific cell is undefined. The
>> interrupt-number is derived from the
>> MPIC a block of registers referred to as
>> the "Interrupt Source Configuration Registers".
>> Each source has 32-bytes of registers
>> (vector/priority and destination) in this
>> region. So interrupt 0 is at offset 0x0,
>> interrupt 1 is at offset 0x20, and so on.
>>
>> 1 = error interrupt
>>
>> The interrupt-number cell contains
>> the SoC device interrupt number for
>> the error interrupt. The type-specific
>> cell identifies the specific error
>> interrupt number.
>>
>> 2 = MPIC inter-processor interrupt (IPI)
>>
>> The interrupt-number cell identifies
>> the MPIC IPI number. The type-specific
>> cell is undefined.
>>
>> 3 = MPIC timer interrupt
>>
>> The interrupt-number cell identifies
>> the MPIC timer number. The type-specific
>> cell is undefined.
>>
>> <4th-cell> type-specific information
>>
>> The type-specific cell is encoded as follows:
>>
>> - For interrupt-type 1 (error interrupt),
>> the type-specific cell contains the
>> bit number of the error interrupt in the
>> Error Interrupt Summary Register.
>> |
>>
>>
>>
>>
>> while on ARM you have a GIC which works like this:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/arm/gic.txt#n20
>>
>>
>> |- #interrupt-cells : Specifies the number of cells needed to encode an
>> interrupt source. The type shall be a <u32> and the value shall be 3.
>>
>> The 1st cell is the interrupt type; 0 for SPI interrupts, 1 for PPI
>> interrupts.
>>
>> The 2nd cell contains the interrupt number for the interrupt type.
>> SPI interrupts are in the range [0-987]. PPI interrupts are in the
>> range [0-15].
>>
>> The 3rd cell is the flags, encoded as follows:
>> bits[3:0] trigger type and level flags.
>> 1 = low-to-high edge triggered
>> 2 = high-to-low edge triggered
>> 4 = active high level-sensitive
>> 8 = active low level-sensitive
>> bits[15:8] PPI interrupt cpu mask. Each bit corresponds to each of
>> the 8 possible cpus attached to the GIC. A bit set to '1' indicated
>> the interrupt is wired to that CPU. Only valid for PPI interrupts.
>> |
>>
>>
>>
>> Both have vastly different semantics. The number of cells is different,
>> the value of the cells is different. Even the definition how to
>> represent edge vs level triggered interrupts differs.
>>
>> I don't think this will stop with interrupts. Maybe someone wants to add
>> a special machine check flag to addresses on a platform and then
>> "ranges" and "regs" will have different semantics on different
>> platforms. There is a lot that can go wrong when you try to unify this
>> code.
> Hi Alex,
>
> thank you for giving such an example. Indeed I was not aware there were
> such huge discrepancies. I guess this comment mostly holds for the
> actual device node generation (what I specialized in the parent QEMU
> device) and not for the "qemu, platform simple-bus" node generation?
Is the dt generation that much code? Just duplicate it for now - we can
generalize it later if we see how things work out.
>
>>> +
>>> +int platform_bus_map_irq(PlatformParams *params, SysBusDevice *sbdev,
>>> + int n, unsigned long *used_irqs,
>>> + qemu_irq *platform_irqs)
>>> +{
>>> + int max_irqs = params->platform_bus_num_irqs;
>>> + char *prop = g_strdup_printf("irq[%d]", n);
>>> + int irqn = object_property_get_int(OBJECT(sbdev), prop, NULL);
>>> +
>>> + if (irqn == SYSBUS_DYNAMIC) {
>>> + /* Find the first available IRQ */
>>> + irqn = find_first_zero_bit(used_irqs, max_irqs);
>>> + }
>>> +
>>> + if ((irqn >= max_irqs) || test_and_set_bit(irqn, used_irqs)) {
>>> + hw_error("IRQ %d is already allocated or no free IRQ left",
>>> irqn);
>>> + }
>>> +
>>> + sysbus_connect_irq(sbdev, n, platform_irqs[irqn]);
>>> + object_property_set_int(OBJECT(sbdev), irqn, prop, NULL);
>>> +
>>> + g_free(prop);
>>> + return 0;
>>> +}
>>> +
>>> +int platform_bus_map_mmio(PlatformParams *params, SysBusDevice *sbdev,
>>> + int n, unsigned long *used_mem,
>>> + MemoryRegion *pmem)
>>> +{
>>> + MemoryRegion *device_mem = sbdev->mmio[n].memory;
>>> + uint64_t size = memory_region_size(device_mem);
>>> + uint64_t page_size = (1 << PAGE_SHIFT);
>>> + uint64_t page_mask = page_size - 1;
>>> + uint64_t size_pages = (size + page_mask) >> PAGE_SHIFT;
>>> + uint64_t max_size = params->platform_bus_size;
>>> + uint64_t max_pages = max_size >> PAGE_SHIFT;
>>> + char *prop = g_strdup_printf("mmio[%d]", n);
>>> + hwaddr addr = object_property_get_int(OBJECT(sbdev), prop, NULL);
>>> + int page;
>>> + int i;
>>> +
>>> + page = addr >> PAGE_SHIFT;
>>> + if (addr == SYSBUS_DYNAMIC) {
>>> + uint64_t size_pages_align;
>>> +
>>> + /* Align the region to at least its own size granularity */
>>> + if (is_power_of_2(size_pages)) {
>>> + size_pages_align = size_pages;
>>> + } else {
>>> + size_pages_align = pow2floor(size_pages) << 1;
>>> + }
>>> +
>>> + /* Find the first available region that fits */
>>> + page = bitmap_find_next_zero_area(used_mem, max_pages, 0,
>>> size_pages,
>>> + size_pages_align);
>>> +
>>> + addr = (uint64_t)page << PAGE_SHIFT;
>>> + }
>>> +
>>> + if (page >= max_pages || test_bit(page, used_mem) ||
>>> + (find_next_bit(used_mem, max_pages, page) < size_pages)) {
>>> + hw_error("Memory [%"PRIx64":%"PRIx64" is already allocated or "
>>> + "no slot left", addr, size);
>>> + }
>>> +
>>> + for (i = page; i < (page + size_pages); i++) {
>>> + set_bit(i, used_mem);
>>> + }
>>> +
>>> + memory_region_add_subregion(pmem, addr, device_mem);
>>> + sbdev->mmio[n].addr = addr;
>>> + object_property_set_int(OBJECT(sbdev), addr, prop, NULL);
>>> +
>>> + g_free(prop);
>>> + return 0;
>>> +}
>>> +
>>> +int sysbus_device_check(Object *obj, void *opaque)
>>> +{
>>> + PlatformBusInitData *init = opaque;
>>> + Object *dev;
>>> + SysBusDevice *sbdev;
>>> + int i;
>>> +
>>> + dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE);
>>> + sbdev = (SysBusDevice *)dev;
>>> +
>>> + if (!sbdev) {
>>> + /* Container, traverse it for children */
>>> + return object_child_foreach(obj, sysbus_device_check, opaque);
>>> + }
>>> +
>>> + /* Connect sysbus device to virtual platform bus */
>>> + for (i = 0; i < sbdev->num_irq; i++) {
>>> + if (!sbdev->irqp[i]) {
>>> + /* This IRQ is an incoming IRQ, we can't wire those here */
>>> + continue;
>>> + }
>>> + platform_bus_map_irq(init->params, sbdev, i,
>>> + init->used_irqs, init->irqs);
>>> + }
>>> +
>>> + for (i = 0; i < sbdev->num_mmio; i++) {
>>> + platform_bus_map_mmio(init->params, sbdev, i,
>>> + init->used_mem, init->mem);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +void platform_bus_init(PlatformParams *params,
>>> + MemoryRegion *address_space_mem,
>>> + qemu_irq *mpic)
>>> +{
>>> + uint64_t max_size = params->platform_bus_size;
>>> + uint64_t max_pages = max_size >> PAGE_SHIFT;
>>> + DECLARE_BITMAP(used_irqs, params->platform_bus_num_irqs);
>>> + DECLARE_BITMAP(used_mem, max_pages);
>>> + MemoryRegion *platform_region = g_new(MemoryRegion, 1);
>>> + Object *container;
>>> + PlatformBusInitData init = {
>>> + .used_irqs = used_irqs,
>>> + .used_mem = used_mem,
>>> + .mem = platform_region,
>>> + .irqs = &mpic[params->platform_bus_first_irq],
>>> + .params = params,
>>> + };
>>> +
>>> + memory_region_init(platform_region, NULL, "platform devices",
>>> + params->platform_bus_size);
>>> +
>>> + bitmap_clear(used_irqs, 0, params->platform_bus_num_irqs);
>>> + bitmap_clear(used_mem, 0, max_pages);
>>> +
>>> + /* Loop through all sysbus devices that were spawened outside the
>>> machine */
>>> + container = container_get(qdev_get_machine(), "/peripheral");
>>> + sysbus_device_check(container, &init);
>>> + container = container_get(qdev_get_machine(), "/peripheral-anon");
>>> + sysbus_device_check(container, &init);
>>> +
>>> + memory_region_add_subregion(address_space_mem,
>>> params->platform_bus_base,
>>> + platform_region);
>>> +}
>> However, I do think it's a good idea to generalize the "platform bus"
>> device if you want to reuse it on ARM. The mmio / irq allocator is
>> pretty straight forward and should be generic enough for you to use.
> I need clarification here: do you talk about your very first patch
> "Platform Device Support" or the code above with a proper solution for
> device tree generation?
I'm talking about the actual implementation of the allocation logic. I
think we're better off to keep all the device tree logic purely in the
machine files for now.
>> If you do this, please don't duplicate the code but rather move it from
>> the e500 file into your new one :).
> OK. do you mean modifying the e500.c code to call those routines? My
> concern is about testing.
Why? We have a u-boot binary that starts up based on the device tree
with TCG if you just start the e500plat machine and if you like I can
easily give you a nice guest kernel and rootfs as well ;).
Alex
next prev parent reply other threads:[~2014-07-23 23:07 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-07 7:08 [Qemu-devel] [PATCH 0/7] machvirt dynamic sysbus device instantiation Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 1/7] hw/misc/platform_devices: helpers for dynamic instantiation of platform devices Eric Auger
2014-07-08 13:43 ` Alexander Graf
2014-07-23 14:58 ` Eric Auger
2014-07-23 23:07 ` Alexander Graf [this message]
2014-07-24 8:01 ` Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 2/7] hw/arm/boot: load_dtb becomes non static Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 3/7] hw/arm/virt: add new add_fdt_xxx_node functions Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 4/7] hw/arm/virt: Support dynamically spawned sysbus devices Eric Auger
2014-07-08 13:51 ` Alexander Graf
2014-07-08 13:55 ` Peter Maydell
2014-07-23 15:01 ` Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 5/7] hw/core/sysbus: add fdt_add_node method Eric Auger
2014-07-08 13:52 ` Alexander Graf
2014-07-23 15:33 ` Eric Auger
2014-07-23 23:02 ` Alexander Graf
2014-07-24 7:36 ` Eric Auger
2014-07-24 11:25 ` Alexander Graf
2014-07-24 12:42 ` Rob Herring
2014-07-07 7:08 ` [Qemu-devel] [PATCH 6/7] hw/misc/platform_devices: add call to sysbus fdt_add_node Eric Auger
2014-07-07 7:08 ` [Qemu-devel] [PATCH 7/7] hw/misc/platform_devices: Add platform_bus_base to PlatformDevtreeData Eric Auger
2014-07-08 13:53 ` Alexander Graf
2014-07-23 15:39 ` Eric Auger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53D0402F.2080407@suse.de \
--to=agraf@suse.de \
--cc=a.motakis@virtualopensystems.com \
--cc=a.rigo@virtualopensystems.com \
--cc=alex.williamson@redhat.com \
--cc=christoffer.dall@linaro.org \
--cc=eric.auger@linaro.org \
--cc=eric.auger@st.com \
--cc=kim.phillips@freescale.com \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=patches@linaro.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=stuart.yoder@freescale.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).