All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>,
	Yinghai Lu <yhlu.kernel@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups
Date: Mon, 14 Feb 2011 12:45:15 +0100	[thread overview]
Message-ID: <20110214114515.GA9867@elte.hu> (raw)
In-Reply-To: <4D4B1835.10606@gmail.com>


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> In the case of x2apic cluster mode we can group IPI register writes based on the 
> cluster group instead of individual per-cpu destiantion messages. This reduces the 
> apic register writes and reduces the amount of IPI messages (in the best case we 
> can reduce it by a factor of 16).
> 
> With this change, microbenchmark measuring the cost of flush_tlb_others(), with 
> the flush tlb IPI being sent from a cpu in the socket-1 to all the logical cpus in 
> socket-2 (on a Westmere-EX system that has 20 logical cpus in a socket) is 3x 
> times better now (compared to the former 'send one-by-one' algorithm).

Pretty nice!

I have a few structural and nitpicking comments:

> +++ tip-linux-2.6/arch/x86/kernel/apic/probe_64.c
> @@ -71,7 +71,9 @@ void __init default_setup_apic_routing(v
>  #endif
> 
>  	if (apic == &apic_flat && num_possible_cpus() > 8)
> -			apic = &apic_physflat;
> +		apic = &apic_physflat;
> +	else if (apic == &apic_x2apic_cluster)
> +		x2apic_init_cpu_notifier();

This should be encapsulated cleaner - this higher layer does not care what the 
x2apic code tries to initialize.

So i think the cleanest method would be to clean up default_setup_apic_routing() and 
get rid of the CONFIG_X86_X2APIC (and UV and vsmp) mess from there.

We want some sort of proper driver init sequence between the various APIC drivers, 
which sequence the default_setup_apic_routing() code just iterates and as a result 
it gets a pointer to a 'struct apic'. This would be like all other driver init 
sequences work - no ugly 'glue' like probe_64.c would be needed.


> +static void __x2apic_send_IPI_mask(const struct cpumask *mask, int vector,
> +				   int exclude_self)

Please if possible don't break function prototypes in the middle of the argument 
list. There are more natural places to break the line:

static void
__x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int exclude_self)

But it can be in a single line as well. (up to 90-100 cols is fine.)

> +{
> +	unsigned long cpu;
>  	unsigned long flags;
> +	struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
> +	u32 dest, this_cpu;

Ok, this is a pet peeve of mine: look how inconsistent this variable definition 
block has become. This would be cleaner:

	struct cpumask *cpus_in_cluster_ptr, *ipi_mask_ptr;
	unsigned long cpu, flags;
	u32 dest, this_cpu;

> +	/*
> +	 * we are to modify mask, so we need an own copy
> +	 * and be sure it's manipulated with irq off
> +	 */

Most comments in this file capitalize the first word in a comment block.

(this holds for a number of similar cases as well in later portions of your patch)

> +	for_each_cpu(cpu, ipi_mask_ptr) {
> +		unsigned long i;
> +		dest = 0;

Please separate variable definitions from the first statement in a more visible manner.

> +		cpus_in_cluster_ptr = per_cpu(cpus_in_cluster, cpu);
> +
> +		/* only cpus in cluster involved */
> +		for_each_cpu_and(i, ipi_mask_ptr, cpus_in_cluster_ptr)
> +			if (!exclude_self || i != this_cpu)
> +				dest |= per_cpu(x86_cpu_to_logical_apicid, i);

Please use curly braces in blocks that are longer than a single line.

> +#define x2apic_propagate_cpu_cluster_status_online(cpu)		\
> +	x2apic_propagate_cpu_cluster_status(cpu, 1)
> +
> +#define x2apic_propagate_cpu_cluster_status_offline(cpu)	\
> +	x2apic_propagate_cpu_cluster_status(cpu, 0)

Is there any particular reason why we use preprocessor technology here, instead of 
the fine C language that .c files are generally written in?

Enumerating 0/1 symbolically would be cleaner that introducing two smaller helper 
functions in any case.

> +static int __cpuinit
> +cluster_setup(struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> +	unsigned int cpu = (unsigned long)hcpu;
> +	int err = 0;
> +
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +		zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> +		zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> +		if (!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)) {
> +			free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
> +			free_cpumask_var(per_cpu(ipi_mask, cpu));
> +			err = -ENOMEM;
> +		}

zalloc_cpumask_var() has a return code that could be used for a slightly more 
standard deinit/teardown sequence.

> +void x2apic_init_cpu_notifier(void)
> +{
> +	int cpu = smp_processor_id();
> 
> +	zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL);
> +	zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL);
> +	BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu));

Such a BUG_ON() is not particularly user friendly - and this could trigger during 
CPU hotplug events, i.e. while the system is fully booted up, right?

Thanks,

	Ingo

  reply	other threads:[~2011-02-14 11:45 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-03 21:03 [RFC 1/2 -tip/master] x86, x2apic: minimize IPI register writes using cluster groups Cyrill Gorcunov
2011-02-14 11:45 ` Ingo Molnar [this message]
2011-02-14 15:10   ` Cyrill Gorcunov
2011-02-15  3:22     ` Ingo Molnar
2011-02-15  8:39       ` Cyrill Gorcunov
2011-02-16  9:23         ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110214114515.GA9867@elte.hu \
    --to=mingo@elte.hu \
    --cc=gorcunov@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=yhlu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.