From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753575AbbBARxT (ORCPT ); Sun, 1 Feb 2015 12:53:19 -0500
Received: from terminus.zytor.com ([198.137.202.10]:58533 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753405AbbBARxQ (ORCPT ); Sun, 1 Feb 2015 12:53:16 -0500
Date: Sun, 1 Feb 2015 09:52:44 -0800
From: tip-bot for Frederic Weisbecker 
Message-ID: 
Cc: fweisbec@gmail.com, hpa@zytor.com, linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@kernel.org, torvalds@linux-foundation.org, tglx@linutronix.de
Reply-To: torvalds@linux-foundation.org, tglx@linutronix.de, hpa@zytor.com, fweisbec@gmail.com, mingo@kernel.org, peterz@infradead.org, linux-kernel@vger.kernel.org
In-Reply-To: <1421946484-9298-1-git-send-email-fweisbec@gmail.com>
References: <1421946484-9298-1-git-send-email-fweisbec@gmail.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:sched/core] sched: Fix missing preemption opportunity
Git-Commit-ID: a18b5d01819235629289212ad428a5ee2b40f0d9
X-Mailer: tip-git-log-daemon
Robot-ID: 
Robot-Unsubscribe: Contact  to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  a18b5d01819235629289212ad428a5ee2b40f0d9
Gitweb:     http://git.kernel.org/tip/a18b5d01819235629289212ad428a5ee2b40f0d9
Author:     Frederic Weisbecker 
AuthorDate: Thu, 22 Jan 2015 18:08:04 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 30 Jan 2015 19:38:51 +0100

sched: Fix missing preemption opportunity

If an interrupt fires in cond_resched(), between the call to __schedule()
and the PREEMPT_ACTIVE count decrement, and that interrupt sets
TIF_NEED_RESCHED, the call to preempt_schedule_irq() will be ignored
due to the PREEMPT_ACTIVE count. This kind of scenario, where irq
preemption is delayed because it interrupts a preempt-disabled area, is
usually fixed up once preemption is re-enabled, via an explicit call to
preempt_schedule().

This is what preempt_enable() does, but a raw preempt count decrement,
as performed by __preempt_count_sub(PREEMPT_ACTIVE), doesn't perform
that delayed preemption check. Therefore, when such a race happens, the
rescheduling is delayed until the next scheduler or preemption entry
point. This can be a problem for workloads that are sensitive to
scheduler latency.

Let's fix that by consolidating cond_resched() with the
preempt_schedule() internals.

Reported-by: Linus Torvalds 
Reported-by: Ingo Molnar 
Original-patch-by: Ingo Molnar 
Signed-off-by: Frederic Weisbecker 
Signed-off-by: Peter Zijlstra (Intel) 
Link: http://lkml.kernel.org/r/1421946484-9298-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar 
---
 kernel/sched/core.c | 40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0b591fe..54dce01 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2884,6 +2884,21 @@ void __sched schedule_preempt_disabled(void)
 	preempt_disable();
 }
 
+static void preempt_schedule_common(void)
+{
+	do {
+		__preempt_count_add(PREEMPT_ACTIVE);
+		__schedule();
+		__preempt_count_sub(PREEMPT_ACTIVE);
+
+		/*
+		 * Check again in case we missed a preemption opportunity
+		 * between schedule and now.
+		 */
+		barrier();
+	} while (need_resched());
+}
+
 #ifdef CONFIG_PREEMPT
 /*
  * this is the entry point to schedule() from in-kernel preemption
@@ -2899,17 +2914,7 @@ asmlinkage __visible void __sched notrace preempt_schedule(void)
 	if (likely(!preemptible()))
 		return;
 
-	do {
-		__preempt_count_add(PREEMPT_ACTIVE);
-		__schedule();
-		__preempt_count_sub(PREEMPT_ACTIVE);
-
-		/*
-		 * Check again in case we missed a preemption opportunity
-		 * between schedule and now.
-		 */
-		barrier();
-	} while (need_resched());
+	preempt_schedule_common();
 }
 NOKPROBE_SYMBOL(preempt_schedule);
 EXPORT_SYMBOL(preempt_schedule);
@@ -4209,17 +4214,10 @@ SYSCALL_DEFINE0(sched_yield)
 	return 0;
 }
 
-static void __cond_resched(void)
-{
-	__preempt_count_add(PREEMPT_ACTIVE);
-	__schedule();
-	__preempt_count_sub(PREEMPT_ACTIVE);
-}
-
 int __sched _cond_resched(void)
 {
 	if (should_resched()) {
-		__cond_resched();
+		preempt_schedule_common();
 		return 1;
 	}
 	return 0;
@@ -4244,7 +4242,7 @@ int __cond_resched_lock(spinlock_t *lock)
 	if (spin_needbreak(lock) || resched) {
 		spin_unlock(lock);
 		if (resched)
-			__cond_resched();
+			preempt_schedule_common();
 		else
 			cpu_relax();
 		ret = 1;
@@ -4260,7 +4258,7 @@ int __sched __cond_resched_softirq(void)
 
 	if (should_resched()) {
 		local_bh_enable();
-		__cond_resched();
+		preempt_schedule_common();
 		local_bh_disable();
 		return 1;
 	}
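
For readers following along, here is an annotated sketch of the race window
in the now-removed __cond_resched(), reconstructed from the changelog above.
The function body is the removed code from the diff; the comments are
illustrative annotations and are not part of the kernel source:

static void __cond_resched(void)	/* old, removed implementation */
{
	__preempt_count_add(PREEMPT_ACTIVE);
	__schedule();
	/*
	 * If an interrupt fires here and sets TIF_NEED_RESCHED, the
	 * irq-return path skips rescheduling because PREEMPT_ACTIVE
	 * is still set in the preempt count.
	 */
	__preempt_count_sub(PREEMPT_ACTIVE);
	/*
	 * The raw decrement above, unlike preempt_enable(), does not
	 * recheck need_resched(), so the pending reschedule is only
	 * honoured at the next scheduler or preemption entry point.
	 * preempt_schedule_common() closes this window by looping on
	 * need_resched() after dropping PREEMPT_ACTIVE.
	 */
}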