linux-pci.vger.kernel.org archive mirror
From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Joerg Roedel <joro@8bytes.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Subject: Re: [PATCH v3 11/27] x86, irq: Add realloc_irq_and_cfg_at()
Date: Sun, 9 Jun 2013 21:13:40 +0200
Message-ID: <20130609191339.GG2245@breakpoint.cc>
In-Reply-To: <1370644273-10495-12-git-send-email-yinghai@kernel.org>

On Fri, Jun 07, 2013 at 03:30:57PM -0700, Yinghai Lu wrote:
> For ioapic hot-add support, it would be easy if we put all irqs
> for that ioapic controller together.
> 
> We can reserve irq range at first, then reallocate those
> pre-reserved one when it is needed.
> 
> Add realloc_irq_and_cfg_at() to really allocate irq_desc and cfg,
> because pre-reserved only mark bits in allocate_irqs bit maps.
> 
> The reasons for not allocating them during reserving:
> 1. only several pins in ioapic are used, allocate for all pins, will
>    waste memory for not used pins.
> 2. relocate later could make sure irq_desc is allocated on local node ram.
>    as dev->node is set at that point.
> 
> -v2: update changelog by adding reasons, requested by Konrad.

I think it would be better to split the genirq changes out of the x86 / apic
specific code.

I don't know what to say. You are worried about the extra unused memory for
the irq_desc, right? The OF code has more or less the same problem. It uses
irq_of_parse_and_map() / irq_create_mapping() to create a mapping between
virq (the linux number) and hw-irq (the pin on the irq chip). This mapping is
created once the irq chip is detected. It does not care whether the virqs
(aka linux numbers) are in a row, and it allocates all of the irqdescs at once.
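
To illustrate only the mapping idea, here is a minimal userspace sketch. It
is not the kernel's irqdomain code: struct toy_domain, toy_find_mapping() and
toy_create_mapping() are names made up for this mail, and I assume virq 0
means "no mapping".

```c
#include <stddef.h>

#define TOY_NR_HWIRQS	24	/* pins on the hypothetical irq chip */
#define TOY_NR_VIRQS	64	/* size of the hypothetical linux irq space */

/* Hypothetical model of one irq chip's hwirq -> virq table. */
struct toy_domain {
	unsigned int virq_of_hwirq[TOY_NR_HWIRQS];	/* 0 means "not mapped" */
};

/* "virq in use" map shared by all toy domains; virq 0 is never handed out. */
static unsigned char toy_virq_used[TOY_NR_VIRQS];

/* Return the virq already mapped to hwirq, or 0 if there is none. */
static unsigned int toy_find_mapping(const struct toy_domain *d,
				     unsigned int hwirq)
{
	if (hwirq >= TOY_NR_HWIRQS)
		return 0;
	return d->virq_of_hwirq[hwirq];
}

/*
 * Map hwirq to any free virq and reuse an existing mapping if there is one.
 * The virqs of one chip do not have to be in a row. Returns 0 on failure.
 */
static unsigned int toy_create_mapping(struct toy_domain *d,
				       unsigned int hwirq)
{
	unsigned int virq;

	if (hwirq >= TOY_NR_HWIRQS)
		return 0;
	if (d->virq_of_hwirq[hwirq])
		return d->virq_of_hwirq[hwirq];

	for (virq = 1; virq < TOY_NR_VIRQS; virq++) {
		if (!toy_virq_used[virq]) {
			toy_virq_used[virq] = 1;
			d->virq_of_hwirq[hwirq] = virq;
			return virq;
		}
	}
	return 0;
}
```

With two chips registered, the virqs of the first chip end up interleaved
with those of the second, and nothing in the lookup depends on them being
contiguous.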

Once you get irqdomain used within ioapic, your "linux irq number" vs
"hw irq number" issue should disappear. Plus you can remove the whole
gsi_number thingy. Then you can start working on getting the irqdesc allocated
later, say at request_irq() time, and freed at free_irq().
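
Roughly what I have in mind for the delayed allocation, as a userspace toy
and not as a proposal for the actual genirq code. struct lazy_desc,
lazy_request_irq() and lazy_free_irq() are hypothetical names, shared
interrupts are ignored, and plain malloc() stands in for a node-local
allocation.

```c
#include <stdlib.h>

#define LAZY_NR_IRQS	32

/* Hypothetical stand-in for struct irq_desc; it only records the node. */
struct lazy_desc {
	int node;
};

/* No descriptor exists for an irq until somebody requests it. */
static struct lazy_desc *lazy_descs[LAZY_NR_IRQS];

/*
 * Allocate the descriptor only when the irq is actually requested, so the
 * node of the requesting device is known. Returns 0 on success, -1 on error.
 */
static int lazy_request_irq(unsigned int irq, int node)
{
	struct lazy_desc *desc;

	if (irq >= LAZY_NR_IRQS || lazy_descs[irq])
		return -1;	/* out of range or already requested */

	desc = malloc(sizeof(*desc));	/* a kernel would allocate on 'node' */
	if (!desc)
		return -1;

	desc->node = node;
	lazy_descs[irq] = desc;
	return 0;
}

/* Drop the descriptor again once the irq is released. */
static void lazy_free_irq(unsigned int irq)
{
	if (irq >= LAZY_NR_IRQS)
		return;
	free(lazy_descs[irq]);
	lazy_descs[irq] = NULL;
}
```

In this model unused pins never cost a descriptor, and there is nothing to
move between nodes because the node is known at allocation time.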

However, I am not sure if this is worth it. The advantage would be that you
would have one infrastructure for the linux-number vs hw-number mapping and
for the delayed irqdesc allocation + NUMA node reallocation.

> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  arch/x86/kernel/apic/io_apic.c | 32 +++++++++++++++++++++++++++++++-
>  include/linux/irq.h            |  5 +++++
>  kernel/irq/irqdesc.c           | 26 ++++++++++++++++++++++++++
>  3 files changed, 62 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 670c538..a157a56 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -301,6 +301,36 @@ static void free_irq_at(unsigned int at, struct irq_cfg *cfg)
>  	irq_free_desc(at);
>  }
>  
> +static struct irq_cfg *realloc_irq_and_cfg_at(unsigned int at, int node)
> +{
> +	struct irq_desc *desc = irq_to_desc(at);
> +	struct irq_cfg *cfg;
> +	int res;
> +
> +	if (desc) {
> +		if (irq_desc_get_irq_data(desc)->node == node)
> +			return alloc_irq_and_cfg_at(at, node);
> +
> +		cfg = irq_desc_get_chip_data(desc);
> +		if (cfg) {
> +			/* shared irq */
> +			if (!list_empty(&cfg->irq_2_pin))
> +				return cfg;
> +			free_irq_cfg(at, cfg);
> +		}
> +	}
> +
> +	res = irq_realloc_desc_at(at, node);
> +	if (res >= 0) {
> +		cfg = alloc_irq_cfg(at, node);
> +		if (cfg) {
> +			irq_set_chip_data(at, cfg);
> +			return cfg;
> +		}
> +	}

This looks somewhat hackish. If the irqdesc exists on another node, it
looks here like you overwrite it with a new one, but __irq_realloc_desc()
deallocates the old one. As I said, this looks very hackish.

> +
> +	return alloc_irq_and_cfg_at(at, node);
> +}
>  
>  struct io_apic {
>  	unsigned int index;
> @@ -3352,7 +3382,7 @@ int arch_setup_ht_irq(unsigned int irq, struct pci_dev *dev)
>  static int
>  io_apic_setup_irq_pin(unsigned int irq, int node, struct io_apic_irq_attr *attr)
>  {
> -	struct irq_cfg *cfg = alloc_irq_and_cfg_at(irq, node);
> +	struct irq_cfg *cfg = realloc_irq_and_cfg_at(irq, node);
>  	int ret;
>  
>  	if (!cfg)
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 4e0fcbb..9c6c047 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -602,6 +602,11 @@ void irq_free_descs(unsigned int irq, unsigned int cnt);
>  int irq_reserve_irqs(unsigned int from, unsigned int cnt);
>  int __irq_reserve_irqs(int irq, unsigned int from, unsigned int cnt);
>  
> +int __irq_realloc_desc(int at, int node, struct module *owner);
> +/* use macros to avoid needing export.h for THIS_MODULE */

If you put this in line with the other functions, you wouldn't need to
copy the comment.

> +#define irq_realloc_desc_at(at, node)	\
> +	__irq_realloc_desc(at, node, THIS_MODULE)
> +
>  static inline void irq_free_desc(unsigned int irq)
>  {
>  	irq_free_descs(irq, 1);
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 3b9fb92..b48f65b 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -99,6 +99,11 @@ EXPORT_SYMBOL_GPL(nr_irqs);
>  static DEFINE_MUTEX(sparse_irq_lock);
>  static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);
>  
> +static bool __irq_is_reserved(int irq)
> +{
> +	return !!test_bit(irq, allocated_irqs);

This is not reserved, this is allocated.

> +}
> +
>  #ifdef CONFIG_SPARSE_IRQ
>  
>  static RADIX_TREE(irq_desc_tree, GFP_KERNEL);
> @@ -410,6 +415,27 @@ __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  EXPORT_SYMBOL_GPL(__irq_alloc_descs);
>  
>  /**
> + * irq_realloc_desc - allocate irq descriptor for irq that is already reserved
> + * @irq:	Allocate for specific irq number if irq >= 0
> + * @node:	Preferred node on which the irq descriptor should be allocated
> + * @owner:	Owning module (can be NULL)
> + *
> + * Returns the irq number or error code
> + */
> +int __ref
> +__irq_realloc_desc(int irq, int node, struct module *owner)
> +{
> +	if (!__irq_is_reserved(irq))
> +		return -EINVAL;

I don't like the part where it is named reserved but means allocated

> +
> +	if (irq_to_desc(irq))
> +		free_desc(irq);

and free if it is already available because it should not be allocated.

> +
> +	return alloc_descs(irq, 1, node, owner);
> +}
> +EXPORT_SYMBOL_GPL(__irq_realloc_desc);
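
To make the naming complaint concrete, here is a small userspace model of
the distinction. It is one possible reading, not existing genirq code: the
rsv_* names are made up, and I assume "reserved" should mean "bit set in the
bitmap, but no descriptor yet".

```c
#include <stdbool.h>
#include <stdlib.h>

#define RSV_NR_IRQS	32

/* Hypothetical stand-in for struct irq_desc. */
struct rsv_desc {
	int node;
};

static bool rsv_bitmap[RSV_NR_IRQS];		/* models allocated_irqs */
static struct rsv_desc *rsv_descs[RSV_NR_IRQS];	/* models irq_to_desc() */

/* Bit set: the number is taken, with or without a descriptor. */
static bool rsv_is_allocated(unsigned int irq)
{
	return irq < RSV_NR_IRQS && rsv_bitmap[irq];
}

/* Bit set but no descriptor yet: the number is only reserved. */
static bool rsv_is_reserved(unsigned int irq)
{
	return rsv_is_allocated(irq) && !rsv_descs[irq];
}

/* Mark the number as taken without allocating anything. */
static int rsv_reserve(unsigned int irq)
{
	if (irq >= RSV_NR_IRQS || rsv_bitmap[irq])
		return -1;
	rsv_bitmap[irq] = true;
	return 0;
}

/*
 * Attach a descriptor to a reserved number. Because "reserved" implies
 * that no descriptor exists, there is nothing to free here first.
 */
static int rsv_alloc_desc(unsigned int irq, int node)
{
	struct rsv_desc *desc;

	if (!rsv_is_reserved(irq))
		return -1;

	desc = malloc(sizeof(*desc));
	if (!desc)
		return -1;

	desc->node = node;
	rsv_descs[irq] = desc;
	return 0;
}
```

With a check like rsv_is_reserved(), the name matches what is tested and the
free-before-alloc step is not needed.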

Sebastian
