linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiang Liu <jiang.liu@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Grant Likely <grant.likely@linaro.org>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Yinghai Lu <yinghai@kernel.org>,
	x86@kernel.org, Len Brown <len.brown@intel.com>,
	Pavel Machek <pavel@ucw.cz>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tony Luck <tony.luck@intel.com>, Joerg Roedel <joro@8bytes.org>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org, Ingo Molnar <mingo@kernel.org>,
	linux-pm@vger.kernel.org
Subject: Re: [Patch V3 19/37] x86, irq: introduce mechanisms to support dynamically allocate IRQ for IOAPIC
Date: Wed, 28 May 2014 13:01:02 +0800	[thread overview]
Message-ID: <53856D8E.3010401@linux.intel.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1405271445050.21720@ionos.tec.linutronix.de>

Hi Thomas,
	Thanks for your comments. Please refer to inline
comments below.

On 2014/5/28 3:58, Thomas Gleixner wrote:
> Jiang,
> 
> On Tue, 27 May 2014, Jiang Liu wrote:
> 
>> +static int alloc_irq_from_domain(struct irq_domain *domain, u32 gsi, int pin)
>>  {
>> +	int irq = -1;
>> +
>> +	if (gsi >= arch_dynirq_lower_bound(0)) {
>> +		irq = irq_create_mapping(domain, pin);
>> +	} else if (gsi < NR_IRQS_LEGACY) {
>> +		if (!ioapic_identity_map)
>> +			irq = irq_create_mapping(domain, pin);
>> +		else if (irq_domain_associate(domain, gsi, pin) == 0)
>> +			irq = gsi;
>> +	} else if (irq_create_strict_mappings(domain, gsi, pin, 1) == 0) {
>> +		irq = gsi;
>> +	}
> 
> So you have these cases covered here:
> 
> 1) The ACPI case of secondary ioapics. You only have the strict 1:1
>    mapping for the first ioapic
> 
> 2) The gsi < NR_IRQS_LEGACY case where you have two options:
> 
>     a) Let the core create a random virq number if ioapic_identity_map
>        is 0
> 
>        ioapic_identity_map is only set by SFI and devicetree
> 
>        So in all other cases we fall into that code path for all
>        legacy interrupts. So how is that supposed to work lets say for
>        i8042 which has hardcoded irq 1 and 12?
> 
>        irq_create_mapping(1)
>        
> 	    hint = 1 % nr_irqs; --> 1
> 	    virq = irq_alloc_desc_from(hint, of_node_to_nid(domain->of_node));
> 
> 	    This returns something >= 16, because the irq descriptors
> 	    for 0-15 (LEGACY) are allocated already.
> 
>        The pin association works, but how is the i8042 driver supposed
>        to figure out that it should request the virq >=16 which was
>        created instead of the hardcoded 1 ?
This is used to work around special non-ISA interrupts with GSI below
NR_IRQS_LEGACY. The original code for the special case is:
/*
 * Provide an identity mapping of gsi == irq except on truly
 * weird platforms that have non isa irqs in the first 16 gsis.
 */
return gsi >= NR_IRQS_LEGACY ? gsi : gsi_top + gsi;

We have one path to handle ISA IRQs before calling
alloc_irq_from_domain() as below:
        if (idx >= 0 && test_bit(mp_irqs[idx].srcbus, mp_bus_not_pci))
                return mp_irqs[idx].srcbusirq;

> 	
>     b) Associate the gsi and the pin
> 
>        This only works because the virqs are already allocated at boot
>        time unconditionally due to arch_probe_nr_irqs() returning
>        NR_IRQS_LEGACY. So irq_domain_associate() works.
>        Undocumented works by chance behaviour.
Yes. It's a good suggestion to enhance legacy_pic to make this
code more clear.

> 
> 3) The case where gsi < arch_dynirq_lower_bound()
> 
>    You create a strict mapping here, fine.
> 
> This is confusing at best.
> 
> First of all, we should use legacy_pic->nr_legacy_irqs instead of
> NR_IRQS_LEGACY all over the place.
> 
> mshyperv, ce4100 and intel-mid use the null_legacy_pic which has
> nr_legacy_irqs = 0 and everything else uses the real pic which has
> nr_legacy_irqs = NR_IRQS_LEGACY. So why do we even bother to allocate
> and deal with NR_IRQS_LEGACY in the cases where we have no legacy?
I'm not sure whether it works with ce4100, so used NR_IRQS_LEGACY
instead of legacy_pic->nr_legacy_irqs for safety. Will try to refine
it in next version.

> 
> ce4100 is an oddball though. The ioapic is registered way before the
> interrupt subsystem is initialized and I have a hard time to
> understand that comment:
> 
>         /* We can't set this earlier, because we need to calibrate the timer */
>         legacy_pic = &null_legacy_pic;
I haven't figured out the story behind the comment yet:(

> 
> The timer calibration happens after the interrupts are set up. I
> assume it's check_timer() which wants that, but we know exactly how
> the ce4100 works, so we might be able to avoid that whole "testing"
> stuff. Sebastian, any input on this?
> 
> If it turns out that ce4100 needs the inital real legacy pic for some
> magic reason we still can be clever by extending the legacy pic data
> structure to tell us about that change, i.e. instead of using
> legacy_pic->nr_legacy_irqs having a field "nr_allocated_irqs", which
> is set to NR_IRQS_LEGACY for the real pic and to 0 for the null_pic
> and let ce4100 set that field to NR_IRQS_LEGACY before switching the
> legacy_pic over to the null implementation.
Good suggestion, will try this way.

> But what's really disgusting is the magic ioapic_identity_map and the
> extra ACPI specific ioapic_dynirq_base hackery.
> 
> Why do we need strict mappings in the non ACPI case for all ioapic
> pins? What's so different about ACPI? Or is this just to avoid
> breaking the existing SFI/devicetree stuff. If that's the reason I'm
> fine with it, but ...
It's to avoid breaking SFI/intel_mid stuff. intel_mid assumes IRQ
number equals to pin number and use pci_dev->irq to save both IRQ
number and pin number.

> 
> independent of this question we want to be more clever about the
> handling of strict, legacy and freely associated linux irq numbers.
> 
> So you have this weird ioapic_create_domain_fn callback in
> mp_register_ioapic, which is solely there so the different callers can
> hand in their domain ops and eventually set the magic
> ioapic_identity_map flag.
> 
> +void __init mp_register_ioapic(int id, u32 address, u32 gsi_base,
> +                              ioapic_create_domain_fn cb, void *arg)
> 
> What about having
> 
> struct ioapic_domain {
>        struct irqdomain             *domain;
>        const struct irq_domain_ops  *ops;
>        void                         *arg;
>        enum domain_type             type;
> };
> 
> and add this struct to the ioapic struct. type is:
> 
> enum domain_type {
>        IOAPIC_STRICT,
>        IOAPIC_LEGACY,
>        IOAPIC_DYNAMIC,
> };     
> 
> 
> Now the register function changes to:
> 
> void mp_register_ioapic(int id, u32 address, u32 gsi_base,
>      			const struct irq_domain_ops *ops,
> 			int type)
> {
> 	...
> 
>         ioapics[idx].irqdomain.ops = ops;
>         ioapics[idx].irqdomain.arg = arg;
> 	ioapics[idx].irqdomain.type = type;
>         ioapics[idx].irqdomain.domain = NULL;
> 	
> 	...
> }
> 
> and you can use mp_irqdomain_create() unconditionally for creating all
> domains. And there you do:
> 
> static int dynirq_lower_bound;
> 
> mp_irqdomain_create()
> {
>        ioapic->irqdomain.domain = irq_domain_add_linear(...);
>        
>        switch (ioapic->irqdomain.type) {
>        case IOAPIC_LEGACY:
>        	    /*
> 	     * Associate the legacy interrupts which have been
> 	     * already allocated.
> 	     */
> 	    for (i = 0; i < legacy_pic->nr_allocated_irqs; i++)
> 		irq_domain_associate(domain, i, i);
> 
>        case IOAPIC_STRICT:
>        	    dynirq_lower_bound += ioapic->nr_gsis;
> 
>        case IOAPIC_DYNAMIC:
>        	    break;
>        }
> }
> 
> So arch_dynirq_lower_bound() gets simplified to:
> {
>      return dynirq_lower_bound;
> }
> 
> And alloc_irq_from_domain() becomes:
> 
> int alloc_irq_from_domain(struct ioapic_domain *domain, int gsi, int pin)
> {
> 	switch (domain->type) {
> 	case IOAPIC_DYNAMIC:
> 	     return irq_create_mapping(domain->domain, pin);
> 	case IOAPIC_LEGACY:
> 	case IOAPIC_STRICT:     	    
> 	     return irq_create_strict_mappings(domain, gsi, pin, 1);
> 	}
> }
> 
> At the call site of alloc_irq_from_domain() you have already:
> 
>        irq = irq_find_mapping(domain, pin);
>        if (irq <=0 ....)
>        	       alloc_irq_from_domain(domain, gsi, pin);
> 
> So because we associated the legacy_pic->nr_allocated_irqs in
> mp_irqdomain_create() already, you'll never call into
> alloc_irq_from_domain() for those and the remaining ones for that
> first ioapic are handled by the IOAPIC_STRICT fall through.
> 
> For simplicity you can let SFI and devicetree register the ioapics
> with their specific domain ops plus IOAPIC_LEGACY for the first ioapic
> and IOAPIC_STRICT for all others. That also covers the case where the
> null_legacy_pic with legacy_pic->nr_allocated_irqs == 0 is used.
> 
> In the ACPI case you register with the acpi domain ops and
> IOAPIC_LEGACY for the first and IOAPIC_DYNAMIC for the extra ioapics.
> 
> Should be way cleaner and understandable, at least to me :)
Really good suggestions! Make thing much more clear.

> 
> Now there is that last oddity which bugs me in mp_map_pin_to_irq()
> 
>  	/*
> 	 * Don't use irqdomain to manage ISA IRQs because there may be
> 	 * multiple IOAPIC pins sharing the same ISA IRQ number and
> 	 * irqdomain only supports 1:1 mapping between IOAPIC pin and
> 	 * IRQ number.
>  	 */
> 	if (idx >= 0 && test_bit(mp_irqs[idx].srcbus, mp_bus_not_pci)) {
>  		irq = mp_irqs[idx].srcbusirq;
> 		if ((flags & IOAPIC_MAP_ALLOC) && info->count == 0 &&
> 		    mp_irqdomain_map(domain, irq, pin) != 0)
> 			irq = -1;
> 
> That really looks like a hack. I'm aware that the current irqdomain
> code cannot deal with that oddball case.
> 
> So what you are saying is that there are devices which have a separate
> physical wire to different ioapic pins, but the ioapic is supposed to
> bundle them to a shared interrupt.
> 
> I agree that this is odd enough to handle at the ioapic level, but
> it'd be nice to have a more elaborative comment on this.
Will try to improve the comment.

> 
> Aside of the above I'm pretty happy about the progress of this patch
> set. One thing, which needs to be looked at are the usage sites of
> irq_data->irq, whether they are safe. I did not spot any unsafe ones,
> but a few functions which are called with irq_data->irq and make no
> use of it.
Sure, will check usages of irq_data->irq.

Really appreciate you suggestions, it will improve the code a lot.
Thanks!
Gerry
> 
> Thanks,
> 
> 	tglx
> 

  reply	other threads:[~2014-05-28  5:01 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-27  8:07 [Patch V3 00/37] use irqdomain to dynamically allocate IRQ for IOAPIC Jiang Liu
2014-05-27  8:07 ` [Patch V3 01/37] x86, irq: update high address field when updating affinity for MSI IRQ Jiang Liu
2014-05-27  8:11   ` Thomas Gleixner
2014-05-27  8:50     ` Jiang Liu
2014-05-27  8:07 ` [Patch V3 02/37] genirq, trivial: improve documentation to match current implementation Jiang Liu
2014-05-27  8:07 ` [Patch V3 03/37] x86, mpparse: use pr_lvl() helper utilities to replace printk(KERN_LVL) Jiang Liu
2014-06-03  0:41   ` David Rientjes
2014-05-27  8:07 ` [Patch V3 04/37] x86, mpparse: simplify arch/x86/include/asm/mpspec.h Jiang Liu
2014-05-27  8:07 ` [Patch V3 05/37] x86, PCI, ACPI: use kmalloc_node() to optimize for performance Jiang Liu
2014-05-27 13:50   ` Bjorn Helgaas
2014-05-27 21:22   ` David Rientjes
2014-05-27  8:07 ` [Patch V3 06/37] x86, acpi, irq: kill static function irq_to_gsi() Jiang Liu
2014-05-27  8:07 ` [Patch V3 07/37] x86, ACPI, trivial: minor improvements to arch/x86/kernel/acpi/boot.c Jiang Liu
2014-05-27  8:07 ` [Patch V3 08/37] x86, ACPI, irq: enhance error handling in function acpi_register_gsi() Jiang Liu
2014-05-27  8:07 ` [Patch V3 09/37] x86, ACPI, irq: fix possible eror in GSI to IRQ mapping for legacy IRQ Jiang Liu
2014-05-27  8:07 ` [Patch V3 10/37] x86, irq, trivial: minor improvements of IRQ related code Jiang Liu
2014-05-27  8:07 ` [Patch V3 11/37] x86, ioapic: kill unused global variable timer_through_8259 Jiang Liu
2014-05-27  8:07 ` [Patch V3 12/37] x86, ioapic: kill static variable nr_irqs_gsi Jiang Liu
2014-05-27  8:07 ` [Patch V3 13/37] x86, ioapic: introduce helper utilities to walk ioapics and pins Jiang Liu
2014-05-27  8:07 ` [Patch V3 14/37] x86, ioapic: use irq_cfg() instead of irq_get_chip_data() for better readability Jiang Liu
2014-05-27  8:07 ` [Patch V3 15/37] x86, irq: reorganize IO_APIC_get_PCI_irq_vector() to prepare for irqdomain Jiang Liu
2014-05-27  8:07 ` [Patch V3 16/37] x86, irq: introduce some helper utilities to improve readability Jiang Liu
2014-05-27  8:07 ` [Patch V3 17/37] x86, ACPI, irq: consolidate algorithm of mapping (ioapic, pin) to IRQ number Jiang Liu
2014-05-27  8:07 ` [Patch V3 18/37] x86, irq, ACPI: change __acpi_register_gsi to return IRQ number instead of GSI Jiang Liu
2014-05-27  8:07 ` [Patch V3 19/37] x86, irq: introduce mechanisms to support dynamically allocate IRQ for IOAPIC Jiang Liu
2014-05-27 19:58   ` Thomas Gleixner
2014-05-28  5:01     ` Jiang Liu [this message]
2014-05-28 21:08       ` Thomas Gleixner
2014-05-28 21:22         ` Thomas Gleixner
2014-05-28  8:01     ` Sebastian Andrzej Siewior
2014-05-28 10:07       ` Thomas Gleixner
2014-05-28 10:39         ` Sebastian Andrzej Siewior
2014-06-05  7:04           ` [RFC Patch 3/3] x86, irq: count legacy IRQs by legacy_pic->nr_legacy_irqs instead of NR_IRQS_LEGACY Jiang Liu
2014-06-05  8:15             ` Thomas Gleixner
2014-05-27  8:07 ` [Patch V3 20/37] x86, irq: enhance mp_register_ioapic() to support irqdomain Jiang Liu
2014-05-27  8:07 ` [Patch V3 21/37] x86, ACPI, irq: provide basic irqdomain support Jiang Liu
2014-05-27  8:07 ` [Patch V3 22/37] x86, mpparse, " Jiang Liu
2014-05-27  8:07 ` [Patch V3 23/37] x86, SFI, " Jiang Liu
2014-05-27  8:07 ` [Patch V3 24/37] x86, devicetree, irq: use common mechanism to support irqdomain Jiang Liu
2014-05-27  8:08 ` [Patch V3 25/37] x86, irq: introduce two helper functions to support irqdomain map operation Jiang Liu
2014-05-27  8:08 ` [Patch V3 26/37] x86, irq, ACPI: use common irqdomain map interface to program IOAPIC pins Jiang Liu
2014-05-27  8:08 ` [Patch V3 27/37] x86, irq, mpparse: " Jiang Liu
2014-05-27  8:08 ` [Patch V3 28/37] x86, irq, SFI: " Jiang Liu
2014-05-27  8:08 ` [Patch V3 29/37] x86, irq, devicetree: " Jiang Liu
2014-05-27  8:08 ` [Patch V3 30/37] x86, irq: clean up unused IOAPIC interface Jiang Liu
2014-05-27  8:08 ` [Patch V3 31/37] x86, irq: simplify the way to handle ISA IRQ Jiang Liu
2014-05-27  8:08 ` [Patch V3 32/37] genirq: export irq_domain_disassociate() to architecture interrupt drivers Jiang Liu
2014-05-27  8:08 ` [Patch V3 33/37] x86, irq: introduce helper functions to release IOAPIC pin Jiang Liu
2014-05-27  8:08 ` [Patch V3 34/37] x86, irq, ACPI: release IOAPIC pin when PCI device is disabled Jiang Liu
2014-05-27  8:08 ` [Patch V3 35/37] x86, irq, mpparse: " Jiang Liu
2014-05-27  8:08 ` [Patch V3 36/37] x86, irq, SFI: " Jiang Liu
2014-05-27  8:08 ` [Patch V3 37/37] x86, irq, devicetree: " Jiang Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53856D8E.3010401@linux.intel.com \
    --to=jiang.liu@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bhelgaas@google.com \
    --cc=grant.likely@linaro.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=joro@8bytes.org \
    --cc=konrad.wilk@oracle.com \
    --cc=len.brown@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul.gortmaker@windriver.com \
    --cc=pavel@ucw.cz \
    --cc=rdunlap@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).