All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ben Segall <bsegall@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	pjt@google.com, Ingo Molnar <mingo@kernel.org>,
	Chris J Arges <chris.j.arges@canonical.com>
Subject: [PATCH 3.4 24/27] sched: Fix race on toggling cfs_bandwidth_used
Date: Mon, 13 Jan 2014 16:26:38 -0800	[thread overview]
Message-ID: <20140114002624.065013790@linuxfoundation.org> (raw)
In-Reply-To: <20140114002623.356220317@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Segall <bsegall@google.com>

commit 1ee14e6c8cddeeb8a490d7b54cd9016e4bb900b4 upstream.

When we transition cfs_bandwidth_used to false, any currently
throttled groups will incorrectly return false from cfs_rq_throttled.
While tg_set_cfs_bandwidth will unthrottle them eventually, currently
running code (including at least dequeue_task_fair and
distribute_cfs_runtime) will cause errors.

Fix this by turning off cfs_bandwidth_used only after unthrottling all
cfs_rqs.

Tested: toggle bandwidth back and forth on a loaded cgroup. Caused
crashes in minutes without the patch, hasn't crashed with it.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181611.22647.80365.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/sched/core.c  |    9 ++++++++-
 kernel/sched/fair.c  |   16 +++++++++-------
 kernel/sched/sched.h |    3 ++-
 3 files changed, 19 insertions(+), 9 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7906,7 +7906,12 @@ static int tg_set_cfs_bandwidth(struct t
 
 	runtime_enabled = quota != RUNTIME_INF;
 	runtime_was_enabled = cfs_b->quota != RUNTIME_INF;
-	account_cfs_bandwidth_used(runtime_enabled, runtime_was_enabled);
+	/*
+	 * If we need to toggle cfs_bandwidth_used, off->on must occur
+	 * before making related changes, and on->off must occur afterwards
+	 */
+	if (runtime_enabled && !runtime_was_enabled)
+		cfs_bandwidth_usage_inc();
 	raw_spin_lock_irq(&cfs_b->lock);
 	cfs_b->period = ns_to_ktime(period);
 	cfs_b->quota = quota;
@@ -7932,6 +7937,8 @@ static int tg_set_cfs_bandwidth(struct t
 			unthrottle_cfs_rq(cfs_rq);
 		raw_spin_unlock_irq(&rq->lock);
 	}
+	if (runtime_was_enabled && !runtime_enabled)
+		cfs_bandwidth_usage_dec();
 out_unlock:
 	mutex_unlock(&cfs_constraints_mutex);
 
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1393,13 +1393,14 @@ static inline bool cfs_bandwidth_used(vo
 	return static_key_false(&__cfs_bandwidth_used);
 }
 
-void account_cfs_bandwidth_used(int enabled, int was_enabled)
+void cfs_bandwidth_usage_inc(void)
 {
-	/* only need to count groups transitioning between enabled/!enabled */
-	if (enabled && !was_enabled)
-		static_key_slow_inc(&__cfs_bandwidth_used);
-	else if (!enabled && was_enabled)
-		static_key_slow_dec(&__cfs_bandwidth_used);
+	static_key_slow_inc(&__cfs_bandwidth_used);
+}
+
+void cfs_bandwidth_usage_dec(void)
+{
+	static_key_slow_dec(&__cfs_bandwidth_used);
 }
 #else /* HAVE_JUMP_LABEL */
 static bool cfs_bandwidth_used(void)
@@ -1407,7 +1408,8 @@ static bool cfs_bandwidth_used(void)
 	return true;
 }
 
-void account_cfs_bandwidth_used(int enabled, int was_enabled) {}
+void cfs_bandwidth_usage_inc(void) {}
+void cfs_bandwidth_usage_dec(void) {}
 #endif /* HAVE_JUMP_LABEL */
 
 /*
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1140,7 +1140,8 @@ extern void init_cfs_rq(struct cfs_rq *c
 extern void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq);
 extern void unthrottle_offline_cfs_rqs(struct rq *rq);
 
-extern void account_cfs_bandwidth_used(int enabled, int was_enabled);
+extern void cfs_bandwidth_usage_inc(void);
+extern void cfs_bandwidth_usage_dec(void);
 
 #ifdef CONFIG_NO_HZ
 enum rq_nohz_flag_bits {



  parent reply	other threads:[~2014-01-14  0:26 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-14  0:26 [PATCH 3.4 00/27] 3.4.77-stable review Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 01/27] net: do not pretend FRAGLIST support Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 02/27] rds: prevent BUG_ON triggered on congestion update to loopback Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 03/27] macvtap: Do not double-count received packets Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 04/27] macvtap: update file current position Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 05/27] tun: " Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 06/27] macvtap: signal truncated packets Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 07/27] ipv6: dont count addrconf generated routes against gc limit Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 08/27] net: drop_monitor: fix the value of maxattr Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 09/27] net: unix: allow set_peek_off to fail Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 10/27] tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 11/27] netvsc: dont flush peers notifying work during setting mtu Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 12/27] net: unix: allow bind to fail on mutex lock Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 13/27] net: inet_diag: zero out uninitialized idiag_{src,dst} fields Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 14/27] drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 16/27] rds: prevent dereference of a NULL device Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 17/27] net: rose: restore old recvmsg behavior Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 18/27] vlan: Fix header ops passthru when doing TX VLAN offload Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 19/27] net: llc: fix use after free in llc_ui_recvmsg Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 20/27] bridge: use spin_lock_bh() in br_multicast_set_hash_max Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 21/27] ARM: fix "bad mode in ... handler" message for undefined instructions Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 22/27] ARM: shmobile: mackerel: Fix coherent DMA mask Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 23/27] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Greg Kroah-Hartman
2014-01-14  0:26 ` Greg Kroah-Hartman [this message]
2014-01-14  0:26 ` [PATCH 3.4 25/27] sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 26/27] sched: Fix hrtimer_cancel()/rq->lock deadlock Greg Kroah-Hartman
2014-01-14  0:26 ` [PATCH 3.4 27/27] sched: Guarantee new group-entities always have weight Greg Kroah-Hartman
2014-01-14  2:59 ` [PATCH 3.4 00/27] 3.4.77-stable review Guenter Roeck
2014-01-14  3:03   ` Greg Kroah-Hartman
2014-01-14 19:29 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140114002624.065013790@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bsegall@google.com \
    --cc=chris.j.arges@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.