All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "Waiman Long" <longman@redhat.com>,
	"Feng Tang" <feng.tang@intel.com>,
	"Michal Koutný" <mkoutny@suse.com>, "Tejun Heo" <tj@kernel.org>
Subject: [PATCH 4.14 22/25] cgroup/cpuset: Remove cpus_allowed/mems_allowed setup in cpuset_init_smp()
Date: Mon, 16 May 2022 21:36:36 +0200	[thread overview]
Message-ID: <20220516193615.355280598@linuxfoundation.org> (raw)
In-Reply-To: <20220516193614.678319286@linuxfoundation.org>

From: Waiman Long <longman@redhat.com>

commit 2685027fca387b602ae565bff17895188b803988 upstream.

There are 3 places where the cpu and node masks of the top cpuset can
be initialized in the order they are executed:
 1) start_kernel -> cpuset_init()
 2) start_kernel -> cgroup_init() -> cpuset_bind()
 3) kernel_init_freeable() -> do_basic_setup() -> cpuset_init_smp()

The first cpuset_init() call just sets all the bits in the masks.
The second cpuset_bind() call sets cpus_allowed and mems_allowed to the
default v2 values. The third cpuset_init_smp() call sets them back to
v1 values.

For systems with cgroup v2 setup, cpuset_bind() is called once.  As a
result, cpu and memory node hot add may fail to update the cpu and node
masks of the top cpuset to include the newly added cpu or node in a
cgroup v2 environment.

For systems with cgroup v1 setup, cpuset_bind() is called again by
rebind_subsystem() when the v1 cpuset filesystem is mounted as shown
in the dmesg log below with an instrumented kernel.

  [    2.609781] cpuset_bind() called - v2 = 1
  [    3.079473] cpuset_init_smp() called
  [    7.103710] cpuset_bind() called - v2 = 0

smp_init() is called after the first two init functions.  So we don't
have a complete list of active cpus and memory nodes until later in
cpuset_init_smp() which is the right time to set up effective_cpus
and effective_mems.

To fix this cgroup v2 mask setup problem, the potentially incorrect
cpus_allowed & mems_allowed setting in cpuset_init_smp() are removed.
For cgroup v2 systems, the initial cpuset_bind() call will set the masks
correctly.  For cgroup v1 systems, the second call to cpuset_bind()
will do the right setup.

cc: stable@vger.kernel.org
Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/cgroup/cpuset.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2403,8 +2403,11 @@ static struct notifier_block cpuset_trac
  */
 void __init cpuset_init_smp(void)
 {
-	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_MEMORY];
+	/*
+	 * cpus_allowd/mems_allowed set to v2 values in the initial
+	 * cpuset_bind() call will be reset to v1 values in another
+	 * cpuset_bind() call when v1 cpuset is mounted.
+	 */
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);



  parent reply	other threads:[~2022-05-16 19:42 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-16 19:36 [PATCH 4.14 00/25] 4.14.280-rc1 review Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 01/25] batman-adv: Dont skb_split skbuffs with frag_list Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 02/25] net: Fix features skip in for_each_netdev_feature() Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 03/25] ipv4: drop dst in multicast routing path Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 04/25] netlink: do not reset transport header in netlink_recvmsg() Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 05/25] mac80211_hwsim: call ieee80211_tx_prepare_skb under RCU protection Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 06/25] hwmon: (ltq-cputemp) restrict it to SOC_XWAY Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 07/25] s390/ctcm: fix variable dereferenced before check Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 08/25] s390/ctcm: fix potential memory leak Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 09/25] s390/lcs: fix variable dereferenced before check Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 10/25] net/smc: non blocking recvmsg() return -EAGAIN when no data and signal_pending Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 11/25] net: sfc: ef10: fix memory leak in efx_ef10_mtd_probe() Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 12/25] hwmon: (f71882fg) Fix negative temperature Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 13/25] ASoC: max98090: Reject invalid values in custom control put() Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 14/25] ASoC: max98090: Generate notifications on changes for custom control Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 15/25] ASoC: ops: Validate input values in snd_soc_put_volsw_range() Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 16/25] tcp: resalt the secret every 10 seconds Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 17/25] usb: cdc-wdm: fix reading stuck on device close Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 18/25] USB: serial: pl2303: add device id for HP LM930 Display Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 19/25] USB: serial: qcserial: add support for Sierra Wireless EM7590 Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 20/25] USB: serial: option: add Fibocom L610 modem Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 21/25] USB: serial: option: add Fibocom MA510 modem Greg Kroah-Hartman
2022-05-16 19:36 ` Greg Kroah-Hartman [this message]
2022-05-16 19:36 ` [PATCH 4.14 23/25] drm/vmwgfx: Initialize drm_mode_fb_cmd2 Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 24/25] ping: fix address binding wrt vrf Greg Kroah-Hartman
2022-05-16 19:36 ` [PATCH 4.14 25/25] tty/serial: digicolor: fix possible null-ptr-deref in digicolor_uart_probe() Greg Kroah-Hartman
2022-05-17  7:37 ` [PATCH 4.14 00/25] 4.14.280-rc1 review Jon Hunter
2022-05-17 19:29 ` Guenter Roeck
2022-05-18  8:33 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220516193615.355280598@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=feng.tang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.