Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Prarit Bhargava <prarit@redhat.com>
To: rui wang <ruiv.wang@gmail.com>
Cc: Tony Luck <tony.luck@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	X86-ML <x86@kernel.org>, Michel Lespinasse <walken@google.com>,
	Andi Kleen <ak@linux.intel.com>,
	Seiji Aguchi <seiji.aguchi@hds.com>,
	Yang Zhang <yang.z.zhang@intel.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	janet.morgan@intel.com, "Yu, Fenghua" <fenghua.yu@intel.com>,
	chen gong <gong.chen@linux.intel.com>
Subject: Re: [PATCH] x86: Add check for number of available vectors before CPU down [v2]
Date: Mon, 30 Dec 2013 12:22:37 -0500	[thread overview]
Message-ID: <52C1ABDD.6050302@redhat.com> (raw)
In-Reply-To: <CANVTcTbNqwzVBzqzGisbUJGtVCa9RtA6OuD4hVp0O39Nz_KDrQ@mail.gmail.com>



On 12/30/2013 07:56 AM, rui wang wrote:
> On 12/29/13, Prarit Bhargava <prarit@redhat.com> wrote:
>>
>>
>> On 12/20/2013 04:41 AM, rui wang wrote:
> <<snip>>
>>> The vector number for an irq is programmed in the LSB of the IOAPIC
>>> IRTE (or MSI data register in the case of MSI/MSIx). So there can be
>>> only one vector number (although multiple CPUs can be specified
>>> through DM). An MSI-capable device can dynamically change the lower
>>> few bits in the LSB to signal multiple interrupts with a contiguous
>>> range of vectors in powers of 2,but each of these vectors is treated
>>> as a separate IRQ. i.e. each of them has a separate irq desc, or a
>>> separate line in the /proc/interrupt file. This patch shows the MSI
>>> irq allocation in detail:
>>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9
>>>
>>> Thanks
>>> Rui
>>>
>>
>> Gong and Rui,
>>
>> After looking at this in detail I realized I made a mistake in my patch by
>> including the check for the smp_affinity.  Simply put, it shouldn't be
>> there
>> given Rui's explanation above.
>>
>> So I think the patch simply needs to do:
>>
>>         this_count = 0;
>>         for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++)
>> {
>>                 irq = __this_cpu_read(vector_irq[vector]);
>>                 if (irq >= 0) {
>>                         desc = irq_to_desc(irq);
>>                         data = irq_desc_get_irq_data(desc);
>>                         affinity = data->affinity;
>>                         if (irq_has_action(irq) && !irqd_is_per_cpu(data))
>>                                 this_count++;
>>                 }
>>         }
>>
>> Can the two of you confirm the above is correct?  It would be greatly
>> appreciated.
> 
> An irq can be mapped to only one vector number, but can have multiple
> destination CPUs. i.e. the same irq/vector can appear on multiple
> CPUs' vector_irq[]. So checking data->affinity is necessary I think.
> But notice that data->affinity is updated in chip->irq_set_affinity()
> inside fixup_irqs(), while cpu_online_mask is updated in
> remove_cpu_from_maps() inside cpu_disable_common(). They are updated
> in different places. So the algorithm to check them against each other
> should be different, depending on where you put the check_vectors().
> That's my understanding.


Okay, so the big issue is that we need to do the calculation without this cpu,
so I think this works (sorry for the cut-and-paste)

int check_irq_vectors_for_cpu_disable(void)
{
        int irq, cpu;
        unsigned int vector, this_count, count;
        struct irq_desc *desc;
        struct irq_data *data;
        struct cpumask online_new; /* cpu_online_mask - this_cpu */
        struct cpumask affinity_new; /* affinity - this_cpu */

        cpumask_copy(&online_new, cpu_online_mask);
        cpu_clear(smp_processor_id(), online_new);

        this_count = 0;
        for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
                irq = __this_cpu_read(vector_irq[vector]);
                if (irq >= 0) {
                        desc = irq_to_desc(irq);
                        data = irq_desc_get_irq_data(desc);
                        cpumask_copy(&affinity_new, data->affinity);
                        cpu_clear(smp_processor_id(), affinity_new);
                        if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
                            !cpumask_subset(&affinity_new, &online_new) &&
                            !cpumask_empty(&affinity_new))
                                this_count++;
                }
        }

...

If I go back to the various examples this appears to work.  For example, your
previous case was all cpus are online, CPU 1 goes down and we have an IRQ with
affinity for CPU (1,2).  We skip this IRQ which is correct.

And if we have another IRQ with affinity of only CPU 1 we will not skip this
IRQ, which is also correct.

I've tried other examples and they appear to work AFAICT.

     prev parent reply	other threads:[~2013-12-30 17:22 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-18 19:29 [PATCH] x86: Add check for number of available vectors before CPU down [v2] Prarit Bhargava
2013-12-18 19:50 ` Tony Luck
2013-12-19 18:05   ` Tony Luck
2013-12-19 18:11     ` Prarit Bhargava
2013-12-20  7:18       ` Chen, Gong
2013-12-20  9:41       ` rui wang
2013-12-20 10:49         ` Prarit Bhargava
2013-12-28 17:10         ` Prarit Bhargava
2013-12-30  7:44           ` Chen, Gong
2013-12-30 15:09             ` Prarit Bhargava
2013-12-30 12:56           ` rui wang
2013-12-30 15:08             ` Prarit Bhargava
2013-12-31  2:58               ` rui wang
2013-12-31 21:22                 ` Prarit Bhargava
2014-01-02  2:41                   ` Chen, Gong
2014-01-02 12:57                     ` Prarit Bhargava
2014-01-02 16:04                       ` Prarit Bhargava
2013-12-30 17:22             ` Prarit Bhargava [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52C1ABDD.6050302@redhat.com \
    --to=prarit@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=gong.chen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=janet.morgan@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul.gortmaker@windriver.com \
    --cc=ruiv.wang@gmail.com \
    --cc=seiji.aguchi@hds.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@gmail.com \
    --cc=walken@google.com \
    --cc=x86@kernel.org \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox