All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleksii Kurochko <oleksii.kurochko@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Romain Caritey" <Romain.Caritey@microchip.com>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"Connor Davis" <connojdavis@gmail.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Anthony PERARD" <anthony.perard@vates.tech>,
	"Michal Orzel" <michal.orzel@amd.com>,
	"Julien Grall" <julien@xen.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 19/26] xen/riscv: implement IRQ routing for device passthrough
Date: Thu, 4 Jun 2026 17:35:27 +0200	[thread overview]
Message-ID: <60ad843e-3fc0-4f99-bbff-0a2f84679274@gmail.com> (raw)
In-Reply-To: <1941ee36-cbfd-4d7f-a15b-e74843371f3b@suse.com>



On 6/3/26 6:01 PM, Jan Beulich wrote:
> On 08.05.2026 16:43, Oleksii Kurochko wrote:
>> --- /dev/null
>> +++ b/xen/arch/riscv/device.c
>> @@ -0,0 +1,108 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +
>> +#include <xen/device_tree.h>
>> +#include <xen/errno.h>
>> +#include <xen/iocap.h>
>> +#include <xen/rangeset.h>
>> +#include <xen/sched.h>
>> +
>> +#include <asm/intc.h>
>> +
>> +int map_irq_to_domain(struct domain *d, unsigned int irq,
>> +                      bool need_mapping, const char *devname)
>> +{
>> +    int res;
>> +
>> +    res = irq_permit_access(d, irq);
>> +    if ( res )
>> +    {
>> +        printk(XENLOG_ERR "Unable to permit to %pd access to IRQ %u\n", d, irq);
> 
> Nit: Drop the first "to"?

Sure, I will drop that.

> 
>> +        return res;
>> +    }
>> +
>> +    if ( need_mapping )
>> +    {
>> +        /*
>> +         * Checking the return of vintc_reserve_virq is not
>> +         * necessary. It should not fail except when we try to map
>> +         * the IRQ twice. This can legitimately happen if the IRQ is shared.
>> +         */
>> +        vintc_reserve_virq(d, irq);
>> +
>> +        res = route_irq_to_guest(d, irq, irq, devname);
>> +        if ( res < 0 )
>> +        {
>> +            printk(XENLOG_ERR "Unable to map IRQ%u to %pd\n", irq, d);
>> +            return res;
>> +        }
>> +    }
>> +
>> +    dt_dprintk("  - IRQ: %u\n", irq);
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * map_device_irqs_to_domain retrieves the interrupts configuration from
>> + * a device tree node and maps those interrupts to the target domain.
>> + *
>> + * Returns:
>> + *   < 0 error
>> + *   0   success
>> + */
>> +int map_device_irqs_to_domain(struct domain *d,
>> +                              struct dt_device_node *dev,
>> +                              bool need_mapping,
>> +                              struct rangeset *irq_ranges)
>> +{
>> +    unsigned int i, nirq;
>> +    int res, irq;
>> +    struct dt_raw_irq rirq;
> 
> Move the latter three variables to the loop's scope and ...
> 
>> +    nirq = dt_number_of_irq(dev);
> 
> ... make this the variable's initializer?

It makes sense. I will do that.

> 
>> +    /* Give permission and map IRQs */
>> +    for ( i = 0; i < nirq; i++ )
>> +    {
>> +        res = dt_device_get_raw_irq(dev, i, &rirq);
>> +        if ( res )
>> +        {
>> +            printk(XENLOG_ERR "Unable to retrieve irq %u for %s\n",
>> +                   i, dt_node_full_name(dev));
>> +            return res;
>> +        }
>> +
>> +        /*
>> +         * Don't map IRQ that have no physical meaning
>> +         * ie: IRQ whose controller is not APLIC/IMSIC/PLIC.
>> +         */
>> +        if ( rirq.controller != dt_interrupt_controller )
>> +        {
>> +            dt_dprintk("irq %u not connected to primary controller."
>> +                       "Connected to %s\n", i,
>> +                       dt_node_full_name(rirq.controller));
>> +            continue;
>> +        }
>> +
>> +        irq = platform_get_irq(dev, i);
>> +        if ( irq < 0 )
>> +        {
>> +            printk("Unable to get irq %u for %s\n", i, dt_node_full_name(dev));
>> +            return irq;
>> +        }
>> +
>> +        res = map_irq_to_domain(d, irq, need_mapping, dt_node_name(dev));
>> +        if ( res )
>> +            return res;
>> +
>> +
>> +        /*
>> +         * At the moment there is only one user of map_device_irqs_to_domain()
>> +         * for RISC-V which calls it irq_ranges == NULL.
>> +         */
>> +        if ( irq_ranges )
>> +            return -EOPNOTSUPP;
> 
> Why is this checked last, and inside the loop (when it's loop invariant)?

Just to show the place where irq_ranges will be handled. But currently I 
agree it would be better just move outside the loop.

> 
>> --- a/xen/arch/riscv/include/asm/intc.h
>> +++ b/xen/arch/riscv/include/asm/intc.h
>> @@ -13,8 +13,11 @@ enum intc_version {
>>   };
>>   
>>   struct cpu_user_regs;
>> +struct domain;
> 
> I can spot why this is needed, but ...
> 
>> +struct dt_device_node;
>>   struct irq_desc;
>>   struct kernel_info;
>> +struct rangeset;
>>   struct vcpu;
> 
> ... I'm at a loss to explain the need for these two additions.

Rudements from previous version of this patch series. I will drop them.
The same below ...

> 
>> --- a/xen/arch/riscv/include/asm/setup.h
>> +++ b/xen/arch/riscv/include/asm/setup.h
>> @@ -5,6 +5,10 @@
>>   
>>   #include <xen/types.h>
>>   
>> +struct domain;
>> +struct dt_device_node;
>> +struct rangeset;
> 
> Same here - why would they be needed when you make no other changes
> to this header?

... here.

> 
>> --- a/xen/arch/riscv/intc.c
>> +++ b/xen/arch/riscv/intc.c
>> @@ -7,7 +7,9 @@
>>   #include <xen/init.h>
>>   #include <xen/irq.h>
>>   #include <xen/lib.h>
>> +#include <xen/sched.h>
>>   #include <xen/spinlock.h>
>> +#include <xen/xvmalloc.h>
>>   
>>   #include <asm/aia.h>
>>   #include <asm/intc.h>
>> @@ -86,6 +88,22 @@ unsigned int intc_irq_nums(void)
>>       return intc_hw_ops->irq_nums();
>>   }
>>   
>> +int intc_route_irq_to_guest(struct irq_desc *desc,
>> +                            unsigned int priority)
>> +{
>> +    ASSERT(spin_is_locked(&desc->lock));
>> +
>> +    ASSERT(intc_hw_ops->guest_irq_type);
>> +
>> +    desc->handler = intc_hw_ops->guest_irq_type;
>> +    set_bit(_IRQ_GUEST, &desc->status);
> 
> Is desc->status accessed anywhere without holding desc->lock? If not,
> __set_bit() or simply |= ?

In release_irq() it could be used without lock:
...
     /* Wait to make sure it's not being used on another CPU */
     do { smp_mb(); } while ( test_bit(_IRQ_INPROGRESS, &desc->status) );

> 
>> @@ -112,6 +130,14 @@ int domain_vintc_init(struct domain *d)
>>           break;
>>       }
>>   
>> +    if ( !ret )
>> +    {
>> +        d->arch.vintc->allocated_irqs =
>> +            xvzalloc_array(unsigned long, BITS_TO_LONGS(d->arch.vintc->irq_nums));
>> +        if ( !d->arch.vintc->allocated_irqs )
>> +            ret = -ENOMEM;
>> +    }
>> +
>>       return ret;
>>   }
>>   
>> @@ -129,4 +155,14 @@ void domain_vintc_deinit(struct domain *d)
>>           printk("vintc (ver:%d) isn't implemented\n", ver);
>>           break;
>>       }
>> +
>> +    xvfree(d->arch.vintc->allocated_irqs);
>> +}
> 
> XVFREE()
> 
>> +bool vintc_reserve_virq(const struct domain *d, unsigned int virq)
>> +{
>> +    if ( virq >= d->arch.vintc->irq_nums )
>> +        return false;
>> +
>> +    return !test_and_set_bit(virq, d->arch.vintc->allocated_irqs);
>>   }
> 
> As to function / field naming: You don't look to be allocating IRQs. So
> is there a reason the field name gives the impression of allocation?
> Simply s/allocated/used/ or some such?

I am okay to rename to 'used' instead of 'allocated'.

> 
>> @@ -221,3 +239,160 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq)
>>       spin_unlock(&desc->lock);
>>       irq_exit();
>>   }
>> +
>> +static inline struct irq_guest *irq_get_guest_info(struct irq_desc *desc)
>> +{
>> +    ASSERT(spin_is_locked(&desc->lock));
>> +    ASSERT(test_bit(_IRQ_GUEST, &desc->status));
>> +    ASSERT(desc->action != NULL);
>> +
>> +    return desc->action->dev_id;
>> +}
>> +
>> +static inline struct domain *irq_get_domain(struct irq_desc *desc)
>> +{
>> +    return irq_get_guest_info(desc)->d;
>> +}
>> +
>> +void release_irq(unsigned int irq, const void *dev_id)
>> +{
>> +    struct irq_desc *desc;
>> +    unsigned long flags;
>> +    struct irqaction *action, **action_ptr;
>> +
>> +    desc = irq_to_desc(irq);
>> +
>> +    spin_lock_irqsave(&desc->lock,flags);
>> +
>> +    action_ptr = &desc->action;
>> +#ifdef CONFIG_IRQ_HAS_MULTIPLE_ACTION
>> +    for ( ;; )
>> +    {
>> +        action = *action_ptr;
>> +        if ( !action )
>> +        {
>> +            printk(XENLOG_WARNING "Trying to free already-free IRQ %u\n", irq);
>> +            spin_unlock_irqrestore(&desc->lock, flags);
>> +            return;
>> +        }
>> +
>> +        if ( action->dev_id == dev_id )
>> +            break;
>> +
>> +        action_ptr = &action->next;
>> +    }
>> +
>> +    /* Found it - remove it from the action list */
>> +    *action_ptr = action->next;
>> +#else
>> +    action = *action_ptr;

It is needed to add *action_ptr = NULL here to deal with ...

>> +#endif
>> +
>> +    /* If this was the last action, shut down the IRQ */
>> +    if ( !desc->action )
>> +    {
>> +        desc->handler->shutdown(desc);
>> +        clear_bit(_IRQ_GUEST, &desc->status);
>> +    }
>> +
>> +    spin_unlock_irqrestore(&desc->lock,flags);
>> +
>> +    /* Wait to make sure it's not being used on another CPU */
>> +    do { smp_mb(); } while ( test_bit(_IRQ_INPROGRESS, &desc->status) );
>> +
>> +    if ( action->free_on_release )
>> +        xvfree(action);
> 
> When !IRQ_HAS_MULTIPLE_ACTION desc->action becomes a dangling pointer here.

... it could be xvfree here as action is local variable.

> 
>> +/* Route an IRQ to a specific guest */
>> +int route_irq_to_guest(struct domain *d, unsigned int virq,
>> +                       unsigned int irq, const char *devname)
>> +{
>> +    struct irqaction *action;
>> +    struct irq_guest *info;
>> +    struct irq_desc *desc;
>> +    unsigned long flags;
>> +    int retval = 0;
>> +
>> +    desc = irq_to_desc(irq);
>> +
>> +    action = xvmalloc(struct irqaction);
>> +    if ( !action )
>> +        return -ENOMEM;
> 
> This is freed by release_irq(), but ...
> 
>> +    info = xvmalloc(struct irq_guest);
>> +    if ( !info )
> 
> ... where is the (non-error-path) freeing of this?
> 

Agree it should be freed.

I am thing about just to update release_irq():

if ( action->free_on_release )
{
     xvfree(action->dev_id);
     xvfree(action);
}

But I think it is conceptually is incorrect as owner of ->dev_id in this 
case is guest so it would be better if guest will do that. So I think it 
would be better to intoduce now release_guest_irq():

int release_guest_irq(struct domain *d, unsigned int virq)
{
     struct irq_desc *desc;
     struct irq_guest *info;
     unsigned long flags;

     desc = irq_to_desc(virq);

     spin_lock_irqsave(&desc->lock, flags);

     if ( !test_bit(_IRQ_GUEST, &desc->status) )
         goto unlock_err;

     info = irq_get_guest_info(desc);
     if ( d != info->d )
         goto unlock_err;

     spin_unlock_irqrestore(&desc->lock, flags);

     release_irq(desc->irq, info);
     xvfree(info);

     return 0;

  unlock_err:
     spin_unlock_irqrestore(&desc->lock, flags);
     return -EINVAL;
}

and then call it in domain_vintc_deinit() for all virqs of a domain.

>> +    {
>> +        xvfree(action);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    info->d = d;
>> +    info->virq = virq;
>> +
>> +    action->dev_id = info;
>> +    action->name = devname;
>> +    action->free_on_release = 1;
> 
> true
> 
>> +    spin_lock_irqsave(&desc->lock, flags);
>> +
>> +    /*
>> +     * If the IRQ is already used by someone
>> +     *  - If it's the same domain -> Xen doesn't need to update the IRQ desc.
>> +     *  For safety check if we are not trying to assign the IRQ to a
>> +     *  different vIRQ.
>> +     *  - Otherwise -> For now, don't allow the IRQ to be shared between
>> +     *  Xen and domains.
>> +     */
>> +    if ( desc->action != NULL )
>> +    {
>> +        if ( test_bit(_IRQ_GUEST, &desc->status) )
>> +        {
>> +            struct domain *ad = irq_get_domain(desc);
>> +
>> +            if ( d != ad )
>> +            {
>> +                printk(XENLOG_G_ERR "IRQ %u is already used by domain %u\n",
>> +                       irq, ad->domain_id);
>> +                retval = -EBUSY;
>> +            }
>> +            else if ( irq_get_guest_info(desc)->virq != virq )
>> +            {
>> +                printk(XENLOG_G_ERR
>> +                       "d%u: IRQ %u is already assigned to vIRQ %u\n",
>> +                       d->domain_id, irq, irq_get_guest_info(desc)->virq);
> 
> Please can you get used to using %pd?

I'll do my best. Thanks for consistently pointing that out. I appreciate it.


> 
>> +                retval = -EBUSY;
>> +            }
>> +        }
>> +        else
>> +        {
>> +            printk(XENLOG_G_ERR "IRQ %u is already used by Xen\n", irq);
>> +            retval = -EBUSY;
>> +        }
>> +        goto out;
>> +    }
>> +
>> +    retval = _setup_irq(desc, 0, action);
>> +    if ( retval )
>> +        goto out;
>> +
>> +    retval = intc_route_irq_to_guest(desc, IRQ_NO_PRIORITY);
>> +
>> +    spin_unlock_irqrestore(&desc->lock, flags);
>> +
>> +    if ( retval )
>> +    {
>> +        release_irq(desc->irq, info);
> 
> Is de-referencing desc legitimate / race free with desc->lock not held?

desc itself cannot be freed, it's a pointer into the statically 
allocated array irq_desc[NR_IRQS] (riscv/irq.c:29). The descriptor 
object lives for the lifetime of the system.

desc->irq is write-once, it's set during init_irq_data() at boot 
(riscv/irq.c:172) and never modified again. It's effectively an 
immutable field after initialization, so reading it without the lock 
held is safe.

Thanks.

~ Oleksii



  reply	other threads:[~2026-06-04 15:35 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08 14:43 [PATCH v2 00/26] Introduce enablemenant of dom0less Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 01/26] xen: arm: update p2m_set_allocation() prototype Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 02/26] xen/riscv: Implement ARCH_PAGING_MEMPOOL Oleksii Kurochko
2026-05-18 15:13   ` Jan Beulich
2026-05-19  9:27     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 03/26] xen/riscv: Implement construct_domain() Oleksii Kurochko
2026-05-18 15:33   ` Jan Beulich
2026-05-19  9:28     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 04/26] xen/riscv: implement prerequisites for domain_create() Oleksii Kurochko
2026-05-18 15:43   ` Jan Beulich
2026-05-19 11:33     ` Oleksii Kurochko
2026-05-19 11:47       ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 05/26] xen/riscv: introduce guest riscv,isa string Oleksii Kurochko
2026-05-18 15:51   ` Jan Beulich
2026-05-19 11:59     ` Oleksii Kurochko
2026-05-19 12:12       ` Jan Beulich
2026-05-19 13:24         ` Oleksii Kurochko
2026-05-19 13:40           ` Jan Beulich
2026-05-19 14:49             ` Oleksii Kurochko
2026-05-19 14:53               ` Jan Beulich
2026-05-19 15:17                 ` Oleksii Kurochko
2026-05-19 15:56                   ` Jan Beulich
2026-05-19 16:21                     ` Oleksii Kurochko
2026-05-20  6:13                       ` Jan Beulich
2026-05-20  7:28                         ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 06/26] xen/riscv: implement make_cpus_node() Oleksii Kurochko
2026-05-18 16:00   ` Jan Beulich
2026-05-19 13:33     ` Oleksii Kurochko
2026-05-19 13:42       ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 07/26] xen/riscv: implement make_timer_node() Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 08/26] xen/riscv: implement make_arch_nodes() Oleksii Kurochko
2026-05-21 13:20   ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 09/26] xen/riscv: introduce init interrupt controller operations Oleksii Kurochko
2026-05-21 13:25   ` Jan Beulich
2026-05-22 14:38     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 10/26] xen/riscv: implement make_intc_domU_node() Oleksii Kurochko
2026-05-21 13:30   ` Jan Beulich
2026-05-22 14:45     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 11/26] xen/riscv: introduce aia_init() and aia_usable() Oleksii Kurochko
2026-05-21 14:57   ` Jan Beulich
2026-05-22 14:55     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 12/26] xen/riscv: add basic VGEIN management for AIA guests Oleksii Kurochko
2026-05-21 15:11   ` Jan Beulich
2026-05-22 15:43     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 13/26] xen/riscv: introduce per-vCPU IMSIC state Oleksii Kurochko
2026-05-21 15:24   ` Jan Beulich
2026-05-22 15:50     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 14/26] xen/riscv: add very early virtual APLIC (vAPLIC) initialization support Oleksii Kurochko
2026-06-03 14:54   ` Jan Beulich
2026-06-04 11:29     ` Oleksii Kurochko
2026-06-05  7:22       ` Jan Beulich
2026-06-05 11:59       ` Oleksii Kurochko
2026-06-05 12:05         ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 15/26] xen/riscv: introduce (de)initialization helpers for vINTC Oleksii Kurochko
2026-06-03 15:00   ` Jan Beulich
2026-06-04 11:33     ` Oleksii Kurochko
2026-06-05  7:26       ` Jan Beulich
2026-06-05  9:25         ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 16/26] xen/riscv: create APLIC DT node for guest domains Oleksii Kurochko
2026-06-03 15:10   ` Jan Beulich
2026-06-04 11:54     ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 17/26] xen/riscv: generate IMSIC " Oleksii Kurochko
2026-06-03 15:21   ` Jan Beulich
2026-06-04 14:21     ` Oleksii Kurochko
2026-06-05  7:31       ` Jan Beulich
2026-06-05  9:29         ` Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 18/26] xen: move declaration of map_device_irqs_to_domain() to common header Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 19/26] xen/riscv: implement IRQ routing for device passthrough Oleksii Kurochko
2026-06-03 16:01   ` Jan Beulich
2026-06-04 15:35     ` Oleksii Kurochko [this message]
2026-06-05  7:38       ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 20/26] xen/riscv: add missing APLIC register offsets, masks to asm/aplic.h Oleksii Kurochko
2026-06-03 15:36   ` Jan Beulich
2026-06-05  8:48     ` Oleksii Kurochko
2026-06-05  9:07       ` Jan Beulich
2026-05-08 14:43 ` [PATCH v2 21/26] xen/riscv: implement virtual APLIC MMIO emulation Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 22/26] xen/riscv: implement init_intc_phandle() Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 23/26] xen/riscv: initialize RCU, scheduler, and system domains in start_xen() Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 24/26] xen/riscv: provide init_vuart() Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 25/26] xen/riscv: add initial dom0less infrastructure support Oleksii Kurochko
2026-05-08 14:43 ` [PATCH v2 26/26] xen/riscv: manage IRQ_DISABLED flag in APLIC irq enable/disable callbacks Oleksii Kurochko
2026-05-18 15:38 ` [PATCH v2 00/26] Introduce enablemenant of dom0less Jan Beulich
2026-05-19  9:26   ` Oleksii Kurochko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60ad843e-3fc0-4f99-bbff-0a2f84679274@gmail.com \
    --to=oleksii.kurochko@gmail.com \
    --cc=Romain.Caritey@microchip.com \
    --cc=alistair.francis@wdc.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anthony.perard@vates.tech \
    --cc=connojdavis@gmail.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=michal.orzel@amd.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.