All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>, Peter Anvin <hpa@zytor.com>,
	Jiang Liu <jiang.liu@linux.intel.com>,
	Joe Lawrence <joe.lawrence@stratus.com>,
	Jeremiah Mahler <jmmahler@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	andy.shevchenko@gmail.com, Guenter Roeck <linux@roeck-us.net>
Subject: [patch 14/14] x86/irq: Plug vector cleanup race
Date: Thu, 31 Dec 2015 16:30:54 -0000	[thread overview]
Message-ID: <20151231160107.207265407@linutronix.de> (raw)
In-Reply-To: 20151231155849.772553760@linutronix.de

[-- Attachment #1: x86-irq--Plug-vector-race.patch --]
[-- Type: text/plain, Size: 5140 bytes --]

We still can end up with a stale vector due to the following:

CPU0                          CPU1                      CPU2
lock_vector()
data->move_in_progress=0
sendIPI()                       
unlock_vector()
                              set_affinity()
                              assign_irq_vector()
                              lock_vector()             handle_IPI
                              move_in_progress = 1      lock_vector()
                              unlock_vector()
                                                        move_in_progress == 1

So we need to serialize the vector assignment against a pending cleanup. The
solution is rather simple now. We not only check for the move_in_progress flag
in assign_irq_vector(), we also check whether there is still a cleanup pending
in the old_domain cpumask. If so, we return -EBUSY to the caller and let him
deal with it. Though we have to be careful in the cpu unplug case. If the
cleanout has not yet completed then the following setaffinity() call would
return -EBUSY. Add code which prevents this.

Full context is here: http://lkml.kernel.org/r/5653B688.4050809@stratus.com

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   63 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 53 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -118,7 +118,12 @@ static int __assign_irq_vector(int irq,
 	static int current_offset = VECTOR_OFFSET_START % 16;
 	int cpu, vector;
 
-	if (d->move_in_progress)
+	/*
+	 * If there is still a move in progress or the previous move has not
+	 * been cleaned up completely, tell the caller to come back later.
+	 */
+	if (d->move_in_progress ||
+	    cpumask_intersects(d->old_domain, cpu_online_mask))
 		return -EBUSY;
 
 	/* Only try and allocate irqs on cpus that are present */
@@ -257,7 +262,12 @@ static void clear_irq_vector(int irq, st
 	data->cfg.vector = 0;
 	cpumask_clear(data->domain);
 
-	if (likely(!data->move_in_progress))
+	/*
+	 * If move is in progress or the old_domain mask is not empty,
+	 * i.e. the cleanup IPI has not been processed yet, we need to remove
+	 * the old references to desc from all cpus vector tables.
+	 */
+	if (!data->move_in_progress && cpumask_empty(data->old_domain))
 		return;
 
 	desc = irq_to_desc(irq);
@@ -577,12 +587,25 @@ asmlinkage __visible void smp_irq_move_c
 			goto unlock;
 
 		/*
-		 * Check if the irq migration is in progress. If so, we
-		 * haven't received the cleanup request yet for this irq.
+		 * Nothing to cleanup if irq migration is in progress
+		 * or this cpu is not set in the cleanup mask.
 		 */
-		if (data->move_in_progress)
+		if (data->move_in_progress ||
+		    !cpumask_test_cpu(me, data->old_domain))
 			goto unlock;
 
+		/*
+		 * We have two cases to handle here:
+		 * 1) vector is unchanged but the target mask got reduced
+		 * 2) vector and the target mask has changed
+		 *
+		 * #1 is obvious, but in #2 we have two vectors with the same
+		 * irq descriptor: the old and the new vector. So we need to
+		 * make sure that we only cleanup the old vector. The new
+		 * vector has the current @vector number in the config and
+		 * this cpu is part of the target mask. We better leave that
+		 * one alone.
+		 */
 		if (vector == data->cfg.vector &&
 		    cpumask_test_cpu(me, data->domain))
 			goto unlock;
@@ -600,6 +623,7 @@ asmlinkage __visible void smp_irq_move_c
 			goto unlock;
 		}
 		__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
+		cpumask_clear_cpu(me, data->old_domain);
 unlock:
 		raw_spin_unlock(&desc->lock);
 	}
@@ -643,13 +667,32 @@ void irq_force_complete_move(struct irq_
 	__irq_complete_move(cfg, cfg->vector);
 
 	/*
-	 * Remove this cpu from the cleanup mask. The IPI might have been sent
-	 * just before the cpu was removed from the offline mask, but has not
-	 * been processed because the CPU has interrupts disabled and is on
-	 * the way out.
+	 * This is tricky. If the cleanup of @data->old_domain has not been
+	 * done yet, then the following setaffinity call will fail with
+	 * -EBUSY. This can leave the interrupt in a stale state.
+	 *
+	 * The cleanup cannot make progress because we hold @desc->lock. So in
+	 * case @data->old_domain is not yet cleaned up, we need to drop the
+	 * lock and acquire it again. @desc cannot go away, because the
+	 * hotplug code holds the sparse irq lock.
 	 */
 	raw_spin_lock(&vector_lock);
-	cpumask_clear_cpu(smp_processor_id(), data->old_domain);
+	/* Clean out all offline cpus (including ourself) first. */
+	cpumask_and(data->old_domain, data->old_domain, cpu_online_mask);
+	while (!cpumask_empty(data->old_domain)) {
+		raw_spin_unlock(&vector_lock);
+		raw_spin_unlock(&desc->lock);
+		cpu_relax();
+		raw_spin_lock(&desc->lock);
+		/*
+		 * Reevaluate apic_chip_data. It might have been cleared after
+		 * we dropped @desc->lock.
+		 */
+		data = apic_chip_data(irqdata);
+		if (!data)
+			return;
+		raw_spin_lock(&vector_lock);
+	}
 	raw_spin_unlock(&vector_lock);
 }
 #endif



  parent reply	other threads:[~2015-12-31 16:32 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-31 16:30 [patch 00/14] x86/irq: Plug various vector cleanup races Thomas Gleixner
2015-12-31 16:30 ` [patch 01/14] x86/irq: Fix a race in x86_vector_free_irqs() Thomas Gleixner
2015-12-31 16:30 ` [patch 02/14] x86/irq: Validate that irq descriptor is still active Thomas Gleixner
2016-01-16 21:16   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 03/14] x86/irq: Do not use apic_chip_data.old_domain as temporary buffer Thomas Gleixner
2015-12-31 16:30 ` [patch 04/14] x86/irq: Reorganize the return path in assign_irq_vector Thomas Gleixner
2016-01-16 21:17   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 05/14] x86/irq: Reorganize the search " Thomas Gleixner
2016-01-16 21:17   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 06/14] x86/irq: Check vector allocation early Thomas Gleixner
2016-01-16 21:17   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 08/14] x86/irq: Get rid of code duplication Thomas Gleixner
2016-01-16 21:18   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 07/14] x86/irq: Copy vectormask instead of an AND operation Thomas Gleixner
2016-01-16 21:18   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 09/14] x86/irq: Remove offline cpus from vector cleanup Thomas Gleixner
2016-01-16 21:18   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 10/14] x86/irq: Clear move_in_progress before sending cleanup IPI Thomas Gleixner
2016-01-16 21:19   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 11/14] x86/irq: Remove the cpumask allocation from send_cleanup_vector() Thomas Gleixner
2016-01-16 21:19   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 12/14] x86/irq: Remove outgoing CPU from vector cleanup mask Thomas Gleixner
2016-01-16 21:19   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` [patch 13/14] x86/irq: Call irq_force_move_complete with irq descriptor Thomas Gleixner
2016-01-16 21:20   ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2015-12-31 16:30 ` Thomas Gleixner [this message]
2016-01-16 21:20   ` [tip:x86/urgent] x86/irq: Plug vector cleanup race tip-bot for Thomas Gleixner
2016-01-04 15:35 ` [patch 00/14] x86/irq: Plug various vector cleanup races Joe Lawrence
2016-01-14  8:24   ` Thomas Gleixner
2016-01-14 10:33     ` Borislav Petkov
2016-01-16 21:37       ` Joe Lawrence
2016-01-18 15:00         ` Joe Lawrence
2016-01-18 15:43           ` Borislav Petkov
2016-01-18 16:38             ` Joe Lawrence
2016-01-20  3:57           ` Joe Lawrence
2016-01-20  8:26             ` Borislav Petkov
2016-01-22 15:28               ` Joe Lawrence
2016-01-16 21:15     ` [tip:x86/urgent] x86/irq: Call chip-> irq_set_affinity in proper context tip-bot for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151231160107.207265407@linutronix.de \
    --to=tglx@linutronix.de \
    --cc=andy.shevchenko@gmail.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jiang.liu@linux.intel.com \
    --cc=jmmahler@gmail.com \
    --cc=joe.lawrence@stratus.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.