public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>,
	Yinghai Lu <yhlu.kernel@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups
Date: Mon, 14 Feb 2011 12:45:15 +0100	[thread overview]
Message-ID: <20110214114515.GA9867@elte.hu> (raw)
In-Reply-To: <4D4B1835.10606@gmail.com>


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> In the case of x2apic cluster mode we can group IPI register writes based on the 
> cluster group instead of individual per-cpu destiantion messages. This reduces the 
> apic register writes and reduces the amount of IPI messages (in the best case we 
> can reduce it by a factor of 16).
> 
> With this change, microbenchmark measuring the cost of flush_tlb_others(), with 
> the flush tlb IPI being sent from a cpu in the socket-1 to all the logical cpus in 
> socket-2 (on a Westmere-EX system that has 20 logical cpus in a socket) is 3x 
> times better now (compared to the former 'send one-by-one' algorithm).

Pretty nice!

I have a few structural and nitpicking comments:

> +++ tip-linux-2.6/arch/x86/kernel/apic/probe_64.c
> @@ -71,7 +71,9 @@ void __init default_setup_apic_routing(v
>  #endif
> 
>  	if (apic == &apic_flat && num_possible_cpus() > 8)
> -			apic = &apic_physflat;
> +		apic = &apic_physflat;
> +	else if (apic == &apic_x2apic_cluster)
> +		x2apic_init_cpu_notifier();

This should be encapsulated cleaner - this higher layer does not care what the 
x2apic code tries to initialize.

So i think the cleanest method would be to clean up default_setup_apic_routing() and 
get rid of the CONFIG_X86_X2APIC (and UV and vsmp) mess from there.

We want some sort of proper driver init sequence between the various APIC drivers, 
which sequence the default_setup_apic_routing() code just iterates and as a result 
it gets a pointer to a 'struct apic'. This would be like all other driver init 
sequences work - no ugly 'glue' like probe_64.c would be needed.


> +static void __x2apic_send_IPI_mask(const struct cpumask *mask, int vector,
> +				   int exclude_self)

Please if possible don't break function prototypes in the middle of the argument 
list. There are more natural places to break the line:

static void
__x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int exclude_self)

But it can be in a single line as well. (up to 90-100 cols is fine.)

> +{
> +	unsigned long cpu;
>  	unsigned long flags;
> +	struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
> +	u32 dest, this_cpu;

Ok, this is a pet peeve of mine: look how inconsistent this variable definition 
block has become. This would be cleaner:

	struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
	unsigned long cpu, flags;
	u32 dest, this_cpu;

> +	/*
> +	 * we are to modify mask, so we need an own copy
> +	 * and be sure it's manipulated with irq off
> +	 */

Most comments in this file capitalize the first word in a comment block.

(this holds for a number of similar cases as well in later portions of your patch)

> +	for_each_cpu(cpu, ipi_mask_ptr) {
> +		unsigned long i;
> +		dest = 0;

Please separate variable definitions from the first statement in a more visible manner.

> +		cpus_in_cluster_ptr = per_cpu(cpus_in_cluster, cpu);
> +
> +		/* only cpus in cluster involved */
> +		for_each_cpu_and(i, ipi_mask_ptr, cpus_in_cluster_ptr)
> +			if (!exclude_self || i != this_cpu)
> +				dest |= per_cpu(x86_cpu_to_logical_apicid, i);

Please use curly braces in blocks that are longer than a single line.

> +#define x2apic_propagate_cpu_cluster_status_online(cpu)		\
> +	x2apic_propagate_cpu_cluster_status(cpu, 1)
> +
> +#define x2apic_propagate_cpu_cluster_status_offline(cpu)	\
> +	x2apic_propagate_cpu_cluster_status(cpu, 0)

Is there any particular reason why we use preprocessor technology here, instead of 
the fine C language that .c files are generally written in?

Enumerating 0/1 symbolically would be cleaner that introducing two smaller helper 
functions in any case.

> +static int __cpuinit
> +cluster_setup(struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> +	unsigned int cpu = (unsigned long)hcpu;
> +	int err = 0;
> +
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +		zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> +		zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> +		if (!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)) {
> +			free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
> +			free_cpumask_var(per_cpu(ipi_mask, cpu));
> +			err = -ENOMEM;
> +		}

zalloc_cpumask_var() has a return code that could be used for a slightly more 
standard deinit/teardown sequence.

> +void x2apic_init_cpu_notifier(void)
> +{
> +	int cpu = smp_processor_id();
> 
> +	zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> +	zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> +	BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));

Such a BUG_ON() is not particularly user friendly - and this could trigger during 
CPU hotplug events, i.e. while the system is fully booted up, right?

Thanks,

	Ingo

  reply	other threads:[~2011-02-14 11:45 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-03 21:03 [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups Cyrill Gorcunov
2011-02-14 11:45 ` Ingo Molnar [this message]
2011-02-14 15:10   ` Cyrill Gorcunov
2011-02-15  3:22     ` Ingo Molnar
2011-02-15  8:39       ` Cyrill Gorcunov
2011-02-16  9:23         ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110214114515.GA9867@elte.hu \
    --to=mingo@elte.hu \
    --cc=gorcunov@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=yhlu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox