From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ben Segall <bsegall@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	pjt@google.com, Ingo Molnar <mingo@kernel.org>,
	Chris J Arges <chris.j.arges@canonical.com>
Subject: [PATCH 3.10 59/62] sched: Fix race on toggling cfs_bandwidth_used
Date: Mon, 13 Jan 2014 16:27:24 -0800
Message-ID: <20140114002712.151977149@linuxfoundation.org>
In-Reply-To: <20140114002710.464561569@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Segall <bsegall@google.com>

commit 1ee14e6c8cddeeb8a490d7b54cd9016e4bb900b4 upstream.

When we transition cfs_bandwidth_used to false, any currently
throttled groups will incorrectly return false from cfs_rq_throttled.
While tg_set_cfs_bandwidth will unthrottle them eventually, currently
running code (including at least dequeue_task_fair and
distribute_cfs_runtime) will cause errors.

Fix this by turning off cfs_bandwidth_used only after unthrottling all
cfs_rqs.

Tested: toggle bandwidth back and forth on a loaded cgroup. Caused
crashes in minutes without the patch, hasn't crashed with it.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181611.22647.80365.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/sched/core.c  |    9 ++++++++-
 kernel/sched/fair.c  |   16 +++++++++-------
 kernel/sched/sched.h |    3 ++-
 3 files changed, 19 insertions(+), 9 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7812,7 +7812,12 @@ static int tg_set_cfs_bandwidth(struct t
 
 	runtime_enabled = quota != RUNTIME_INF;
 	runtime_was_enabled = cfs_b->quota != RUNTIME_INF;
-	account_cfs_bandwidth_used(runtime_enabled, runtime_was_enabled);
+	/*
+	 * If we need to toggle cfs_bandwidth_used, off->on must occur
+	 * before making related changes, and on->off must occur afterwards
+	 */
+	if (runtime_enabled && !runtime_was_enabled)
+		cfs_bandwidth_usage_inc();
 	raw_spin_lock_irq(&cfs_b->lock);
 	cfs_b->period = ns_to_ktime(period);
 	cfs_b->quota = quota;
@@ -7838,6 +7843,8 @@ static int tg_set_cfs_bandwidth(struct t
 			unthrottle_cfs_rq(cfs_rq);
 		raw_spin_unlock_irq(&rq->lock);
 	}
+	if (runtime_was_enabled && !runtime_enabled)
+		cfs_bandwidth_usage_dec();
 out_unlock:
 	mutex_unlock(&cfs_constraints_mutex);
 
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2029,13 +2029,14 @@ static inline bool cfs_bandwidth_used(vo
 	return static_key_false(&__cfs_bandwidth_used);
 }
 
-void account_cfs_bandwidth_used(int enabled, int was_enabled)
+void cfs_bandwidth_usage_inc(void)
 {
-	/* only need to count groups transitioning between enabled/!enabled */
-	if (enabled && !was_enabled)
-		static_key_slow_inc(&__cfs_bandwidth_used);
-	else if (!enabled && was_enabled)
-		static_key_slow_dec(&__cfs_bandwidth_used);
+	static_key_slow_inc(&__cfs_bandwidth_used);
+}
+
+void cfs_bandwidth_usage_dec(void)
+{
+	static_key_slow_dec(&__cfs_bandwidth_used);
 }
 #else /* HAVE_JUMP_LABEL */
 static bool cfs_bandwidth_used(void)
@@ -2043,7 +2044,8 @@ static bool cfs_bandwidth_used(void)
 	return true;
 }
 
-void account_cfs_bandwidth_used(int enabled, int was_enabled) {}
+void cfs_bandwidth_usage_inc(void) {}
+void cfs_bandwidth_usage_dec(void) {}
 #endif /* HAVE_JUMP_LABEL */
 
 /*
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1318,7 +1318,8 @@ extern void print_rt_stats(struct seq_fi
 extern void init_cfs_rq(struct cfs_rq *cfs_rq);
 extern void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq);
 
-extern void account_cfs_bandwidth_used(int enabled, int was_enabled);
+extern void cfs_bandwidth_usage_inc(void);
+extern void cfs_bandwidth_usage_dec(void);
 
 #ifdef CONFIG_NO_HZ_COMMON
 enum rq_nohz_flag_bits {
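
The ordering rule in the tg_set_cfs_bandwidth comment (off->on before
the related changes, on->off after them) can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: used_count, throttled[], rq_throttled() and
simulate_disable() are hypothetical stand-ins for the static key, the
per-cfs_rq throttled state, cfs_rq_throttled() and the on->off path. It
is single-threaded, so it shows only the window in which throttled
runqueues are misreported, not the concurrent crash itself.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_RQ 4

/* Hypothetical stand-ins for illustration; none of this is kernel code. */
static int used_count;          /* models the __cfs_bandwidth_used static key */
static bool throttled[NR_RQ];   /* models each cfs_rq's throttled state */

/* Like cfs_rq_throttled(): reports false whenever the key is off. */
static bool rq_throttled(int i)
{
	return used_count > 0 && throttled[i];
}

/*
 * Model an on->off transition with two of the four runqueues throttled.
 * Returns how many throttled runqueues were reported as not throttled
 * at the moment they were visited.
 */
int simulate_disable(bool dec_after_unthrottle)
{
	int misreported = 0;

	/* Start with bandwidth enabled and runqueues 0 and 2 throttled. */
	used_count = 1;
	for (int i = 0; i < NR_RQ; i++)
		throttled[i] = (i % 2 == 0);

	if (!dec_after_unthrottle)
		used_count--;           /* old order: key turned off first */

	for (int i = 0; i < NR_RQ; i++) {
		if (throttled[i] && !rq_throttled(i))
			misreported++;  /* throttled, but reported as not */
		throttled[i] = false;   /* unthrottle */
	}

	if (dec_after_unthrottle)
		used_count--;           /* patched order: key turned off last */

	assert(used_count == 0);
	return misreported;
}
```

In this model simulate_disable(false) returns 2 (both throttled
runqueues are misreported while the key is already off), and
simulate_disable(true) returns 0.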