From: Ingo Molnar <mingo@elte.hu>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>,
Yinghai Lu <yhlu.kernel@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups
Date: Mon, 14 Feb 2011 12:45:15 +0100 [thread overview]
Message-ID: <20110214114515.GA9867@elte.hu> (raw)
In-Reply-To: <4D4B1835.10606@gmail.com>
* Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> In the case of x2apic cluster mode we can group IPI register writes based on the
> cluster group instead of individual per-cpu destiantion messages. This reduces the
> apic register writes and reduces the amount of IPI messages (in the best case we
> can reduce it by a factor of 16).
>
> With this change, microbenchmark measuring the cost of flush_tlb_others(), with
> the flush tlb IPI being sent from a cpu in the socket-1 to all the logical cpus in
> socket-2 (on a Westmere-EX system that has 20 logical cpus in a socket) is 3x
> times better now (compared to the former 'send one-by-one' algorithm).
Pretty nice!
I have a few structural and nitpicking comments:
> +++ tip-linux-2.6/arch/x86/kernel/apic/probe_64.c
> @@ -71,7 +71,9 @@ void __init default_setup_apic_routing(v
> #endif
>
> if (apic == &apic_flat && num_possible_cpus() > 8)
> - apic = &apic_physflat;
> + apic = &apic_physflat;
> + else if (apic == &apic_x2apic_cluster)
> + x2apic_init_cpu_notifier();
This should be encapsulated cleaner - this higher layer does not care what the
x2apic code tries to initialize.
So i think the cleanest method would be to clean up default_setup_apic_routing() and
get rid of the CONFIG_X86_X2APIC (and UV and vsmp) mess from there.
We want some sort of proper driver init sequence between the various APIC drivers,
which sequence the default_setup_apic_routing() code just iterates and as a result
it gets a pointer to a 'struct apic'. This would be like all other driver init
sequences work - no ugly 'glue' like probe_64.c would be needed.
> +static void __x2apic_send_IPI_mask(const struct cpumask *mask, int vector,
> + int exclude_self)
Please if possible don't break function prototypes in the middle of the argument
list. There are more natural places to break the line:
static void
__x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int exclude_self)
But it can be in a single line as well. (up to 90-100 cols is fine.)
> +{
> + unsigned long cpu;
> unsigned long flags;
> + struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
> + u32 dest, this_cpu;
Ok, this is a pet peeve of mine: look how inconsistent this variable definition
block has become. This would be cleaner:
struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
unsigned long cpu, flags;
u32 dest, this_cpu;
> + /*
> + * we are to modify mask, so we need an own copy
> + * and be sure it's manipulated with irq off
> + */
Most comments in this file capitalize the first word in a comment block.
(this holds for a number of similar cases as well in later portions of your patch)
> + for_each_cpu(cpu, ipi_mask_ptr) {
> + unsigned long i;
> + dest = 0;
Please separate variable definitions from the first statement in a more visible manner.
> + cpus_in_cluster_ptr = per_cpu(cpus_in_cluster, cpu);
> +
> + /* only cpus in cluster involved */
> + for_each_cpu_and(i, ipi_mask_ptr, cpus_in_cluster_ptr)
> + if (!exclude_self || i != this_cpu)
> + dest |= per_cpu(x86_cpu_to_logical_apicid, i);
Please use curly braces in blocks that are longer than a single line.
> +#define x2apic_propagate_cpu_cluster_status_online(cpu) \
> + x2apic_propagate_cpu_cluster_status(cpu, 1)
> +
> +#define x2apic_propagate_cpu_cluster_status_offline(cpu) \
> + x2apic_propagate_cpu_cluster_status(cpu, 0)
Is there any particular reason why we use preprocessor technology here, instead of
the fine C language that .c files are generally written in?
Enumerating 0/1 symbolically would be cleaner that introducing two smaller helper
functions in any case.
> +static int __cpuinit
> +cluster_setup(struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> + unsigned int cpu = (unsigned long)hcpu;
> + int err = 0;
> +
> + switch (action) {
> + case CPU_UP_PREPARE:
> + zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> + zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> + if (!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)) {
> + free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
> + free_cpumask_var(per_cpu(ipi_mask, cpu));
> + err = -ENOMEM;
> + }
zalloc_cpumask_var() has a return code that could be used for a slightly more
standard deinit/teardown sequence.
> +void x2apic_init_cpu_notifier(void)
> +{
> + int cpu = smp_processor_id();
>
> + zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> + zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> + BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));
Such a BUG_ON() is not particularly user friendly - and this could trigger during
CPU hotplug events, i.e. while the system is fully booted up, right?
Thanks,
Ingo
next prev parent reply other threads:[~2011-02-14 11:45 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-03 21:03 [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups Cyrill Gorcunov
2011-02-14 11:45 ` Ingo Molnar [this message]
2011-02-14 15:10 ` Cyrill Gorcunov
2011-02-15 3:22 ` Ingo Molnar
2011-02-15 8:39 ` Cyrill Gorcunov
2011-02-16 9:23 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110214114515.GA9867@elte.hu \
--to=mingo@elte.hu \
--cc=gorcunov@gmail.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=suresh.b.siddha@intel.com \
--cc=tglx@linutronix.de \
--cc=yhlu.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox