public inbox for linux-kernel@vger.kernel.org
From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] irq: sparseirq enabling
Date: Mon, 24 Nov 2008 11:22:33 -0800
Message-ID: <492AFEF9.609@kernel.org>
In-Reply-To: <20081124144007.GA30725@elte.hu>

Ingo Molnar wrote:
> * Yinghai Lu <yinghai@kernel.org> wrote:
> 
>> +/*
>> + * Protect the sparse_irqs_free freelist:
>> + */
>> +static DEFINE_SPINLOCK(sparse_irq_lock);
>> +LIST_HEAD(sparse_irqs_head);
>> +
>> +/*
>> + * The sparse irqs are in a hash-table as well, for fast lookup:
>> + */
>> +#define SPARSEIRQHASH_BITS          (13 - 1)
>> +#define SPARSEIRQHASH_SIZE          (1UL << SPARSEIRQHASH_BITS)
>> +#define __sparseirqhashfn(key)      hash_long((unsigned long)key, SPARSEIRQHASH_BITS)
>> +#define sparseirqhashentry(key)     (sparseirqhash_table + __sparseirqhashfn((key)))
>> +
>> +static struct list_head sparseirqhash_table[SPARSEIRQHASH_SIZE];
>> +
> 
>> +struct irq_desc *irq_to_desc(unsigned int irq)
>> +{
>> +	struct irq_desc *desc;
>> +	struct list_head *hash_head;
>> +
>> +	hash_head = sparseirqhashentry(irq);
>> +
>> +	/*
>> +	 * We can walk the hash lockfree, because the hash only
>> +	 * grows, and we are careful when adding entries to the end:
>> +	 */
>> +	list_for_each_entry(desc, hash_head, hash_entry) {
>> +		if (desc->irq == irq)
>> +			return desc;
>> +	}
>> +
>> +	return NULL;
>> +}
> 
> I have talked to Thomas about the current design of sparseirq, and the 
> current code looks pretty good already, but we'd like to suggest one 
> more refinement to the hashing scheme:
> 
> Please simplify it by using a sparse pointers static array instead of 
> a full hash. I.e. do it as a:
> 
>        struct irq_desc *irq_desc_ptrs[NR_IRQS] __read_mostly;
> 
> This can scale up just fine to 4096 CPUs without measurable impact to 
> the kernel image size on x86 distro kernels.
> 
> Data structure rules:
> 
> 1) allocation: an entry would be allocated at IRQ number creation time 
>    - never at any other time. A plain lookup of a desc that has not 
>    been allocated yet returns NULL - it's atomic and lockless.
> 
> 2) freeing: we never free them. If a system allocates an irq_desc[], 
>    it signals that it uses it. This makes all access lockless.
> 
> 3) lookup: irq_to_desc() just does a simple irq_desc_ptrs[irq]
>    dereference (inlined) - no locking or hash lookup needed. Since 
>    irq_desc_ptrs[] is accessed __read_mostly, this will scale really 
>    well even on NUMA.
> 
> 4) iterators: we still keep NR_IRQS as a limit for the
>    irq_desc_ptrs[] array - but it would never be used directly by any 
>    iteration loop, in generic code.
> 
> 5) bootup, pre-kmalloc irq_desc access: on x86 we should preallocate a 
>    pool of ~32 irq_desc entries for allocation use. This can be a
>    simple static array of struct irq_desc that is fed into the
>    first 32 legacy irq_desc[] slots or so. No bootmem-alloc and
>    no after_bootmem flaggery needed. All SMP init and secondary CPU 
>    irq desc allocation happens after kmalloc is active already - 
>    cleaning up the allocation path.
> 
> 6) limits on x86: please scale NR_IRQS to this rule:
>                max(32*MAX_IO_APICS, 8*NR_CPUS)
> 
>    That gives a minimum of 4096 IRQs on 64-bit systems. On even larger 
>    systems, we scale linearly up to 32K IRQs on 4K CPUs. That should 
>    be more than enough (and it can be increased if really needed).
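
The arithmetic behind rule 6 can be checked with a small userspace sketch. It assumes MAX_IO_APICS is 128, which is the value that makes the quoted 4096 minimum come out; nr_irqs_limit() is a hypothetical helper name, not something from the patch.

```c
/* Assumption: 128 IO-APICs, so that 32 * MAX_IO_APICS gives the quoted 4096. */
#define MAX_IO_APICS 128

/*
 * Hypothetical helper: evaluates the proposed rule
 *     max(32 * MAX_IO_APICS, 8 * NR_CPUS)
 * for a given CPU count.
 */
static unsigned int nr_irqs_limit(unsigned int nr_cpus)
{
	unsigned int by_ioapic = 32 * MAX_IO_APICS;
	unsigned int by_cpu = 8 * nr_cpus;

	return by_ioapic > by_cpu ? by_ioapic : by_cpu;
}
```

Under that assumption the IO-APIC term dominates up to 512 CPUs (8 * 512 = 4096), and the CPU term takes over beyond that: 1024 CPUs give 8192 and 4096 CPUs give 32768, matching the "32K IRQs on 4K CPUs" figure.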
> 
> Please keep all the irq_desc[] abstraction APIs and cleanups you did 
> so far - they are great. In the far future we can still make irq_desc 
> a full hash if needed - but right now we'll be just fine with such a 
> simpler scheme as well, and scale fine up to 16K CPUs or so.
> 
> This should simplify quite a few things in the sparseirq code.
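
For reference, rules 1 to 5 quoted above could look roughly like the following userspace sketch. It is only an illustration under assumptions: struct irq_desc is cut down to one field, NR_IRQS is fixed at 4096, irq_to_desc_alloc() is a hypothetical name, calloc() stands in for a kernel allocator, and the lock that would serialize concurrent creators is left out.

```c
#include <stddef.h>
#include <stdlib.h>

#define NR_IRQS        4096	/* assumed limit for this sketch */
#define NR_LEGACY_IRQS 32	/* size of the static boot-time pool (rule 5) */

/* Stand-in for the kernel's struct irq_desc; only the field used here. */
struct irq_desc {
	unsigned int irq;
};

/* Rule 5: static pool backing the first legacy slots, no allocator needed. */
static struct irq_desc legacy_descs[NR_LEGACY_IRQS];

/* The sparse pointer array; slots that were never created stay NULL. */
static struct irq_desc *irq_desc_ptrs[NR_IRQS];

/* Rule 3: lookup is a bounds check plus one array load, no lock, no hash. */
static inline struct irq_desc *irq_to_desc(unsigned int irq)
{
	return irq < NR_IRQS ? irq_desc_ptrs[irq] : NULL;
}

/*
 * Rules 1 and 2: an entry is created once, at IRQ creation time, and is
 * never freed afterwards. Hypothetical name; calloc() is a stand-in for
 * a kernel allocation, and creator-side locking is omitted.
 */
static struct irq_desc *irq_to_desc_alloc(unsigned int irq)
{
	struct irq_desc *desc;

	if (irq >= NR_IRQS)
		return NULL;

	desc = irq_desc_ptrs[irq];
	if (desc)
		return desc;		/* already created, reuse it */

	if (irq < NR_LEGACY_IRQS)
		desc = &legacy_descs[irq];	/* boot-time pool */
	else
		desc = calloc(1, sizeof(*desc));
	if (!desc)
		return NULL;

	desc->irq = irq;
	irq_desc_ptrs[irq] = desc;	/* publish only after init */
	return desc;
}
```

Because entries are never freed, a reader that sees a non-NULL pointer can keep using it without taking a lock; a real kernel version would additionally need the appropriate memory ordering when publishing the pointer.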

OK, I will change it to that.

Two options for MSI irq numbering:

1. nr_irqs will be the number of legacy IRQs or GSIs, and MSI will start allocating irq numbers from nr_irqs.
2. Or should MSI still start from NR_IRQS - 1 and count downward?
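
The two numbering options can be sketched as follows. Everything here is illustrative: the irq_in_use[] map, the function names, the NR_IRQS value, and the choice to stop the downward scan at nr_irqs are assumptions, not part of the patch.

```c
#include <stdbool.h>

#define NR_IRQS 4096	/* assumed limit for this sketch */

/* Hypothetical occupancy map; a real version would check irq_desc state. */
static bool irq_in_use[NR_IRQS];

/* Option 1: MSI takes the first free number at or above nr_irqs. */
static int msi_irq_alloc_up(unsigned int nr_irqs)
{
	for (unsigned int irq = nr_irqs; irq < NR_IRQS; irq++) {
		if (!irq_in_use[irq]) {
			irq_in_use[irq] = true;
			return (int)irq;
		}
	}
	return -1;	/* no free irq number */
}

/*
 * Option 2: MSI starts at NR_IRQS - 1 and walks downward. Stopping at
 * nr_irqs keeps MSI out of the legacy/GSI range (an assumption).
 */
static int msi_irq_alloc_down(unsigned int nr_irqs)
{
	for (unsigned int irq = NR_IRQS; irq-- > nr_irqs; ) {
		if (!irq_in_use[irq]) {
			irq_in_use[irq] = true;
			return (int)irq;
		}
	}
	return -1;	/* no free irq number */
}
```

With option 1 the MSI numbers sit directly after the GSI range, so the populated part of the pointer array stays compact; with option 2 they stay at the top of the range regardless of how many GSIs exist.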

YH

Thread overview: 20+ messages
2008-11-24  2:59 [PATCH 1/2] irq: sparseirq enabling Yinghai Lu
2008-11-24 14:40 ` Ingo Molnar
2008-11-24 19:22   ` Yinghai Lu [this message]
2008-11-24 22:26     ` Thomas Gleixner
2008-11-25  3:57   ` [PATCH 1/2] irq: sparseirq enabling v2 Yinghai Lu
2008-11-25  3:58     ` [PATCH 2/2] irq: move irq_desc according to smp_affinity v2 Yinghai Lu
2008-11-26  7:48     ` [PATCH 1/2] irq: sparseirq enabling v2 Ingo Molnar
2008-11-26  8:02       ` Yinghai Lu
2008-11-26  8:17         ` Ingo Molnar
2008-11-26 18:33           ` Yinghai Lu
2008-11-27  2:26           ` [PATCH 1/2] irq: sparseirq enabling v3 Yinghai Lu
2008-11-27  2:26             ` [PATCH 2/2] irq: move irq_desc according to smp_affinity v3 Yinghai Lu
2008-11-28 16:34             ` [PATCH 1/2] irq: sparseirq enabling v3 Ingo Molnar
2008-11-29  7:13               ` [PATCH] irq: sparseirq enabling v4 Yinghai Lu
2008-11-29 10:02                 ` Ingo Molnar
2008-11-29 10:26                   ` Ingo Molnar
2008-12-01  4:44                     ` [PATCH] irq: sparse irq_desc[] support - fix Yinghai Lu
2008-11-29 10:57                   ` [PATCH] irq: sparseirq enabling v4 Sam Ravnborg
2008-11-29 14:33                     ` Ingo Molnar
2008-11-29 17:54                       ` Sam Ravnborg
