public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Peter Newman <peternewman@google.com>,
	"Borislav Petkov (AMD)" <bp@alien8.de>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Babu Moger <babu.moger@amd.com>,
	stable@kernel.org
Subject: [PATCH 5.15 44/86] x86/resctrl: Fix task CLOSID/RMID update race
Date: Mon, 16 Jan 2023 16:51:18 +0100	[thread overview]
Message-ID: <20230116154748.894995908@linuxfoundation.org> (raw)
In-Reply-To: <20230116154747.036911298@linuxfoundation.org>

From: Peter Newman <peternewman@google.com>

commit fe1f0714385fbcf76b0cbceb02b7277d842014fc upstream.

When the user moves a running task to a new rdtgroup using the task's
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
task(s) CPUs.

x86 allows reordering loads with prior stores, so if the task starts
running between a task_curr() check that the CPU hoisted before the
stores in the CLOSID/RMID update then it can start running with the old
CLOSID/RMID until it is switched again because __rdtgroup_move_task()
failed to determine that it needs to be interrupted to obtain the new
CLOSID/RMID.

Refer to the diagram below:

CPU 0                                   CPU 1
-----                                   -----
__rdtgroup_move_task():
  curr <- t1->cpu->rq->curr
                                        __schedule():
                                          rq->curr <- t1
                                        resctrl_sched_in():
                                          t1->{closid,rmid} -> {1,1}
  t1->{closid,rmid} <- {2,2}
  if (curr == t1) // false
   IPI(t1->cpu)

A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.

In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr().  In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.

It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.

Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.

CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)

  # mkdir /sys/fs/resctrl/test
  # echo $$ > /sys/fs/resctrl/test/tasks
  # perf bench sched messaging -g 40 -l 100000

task-clock time ranges collected using:

  # perf stat rmdir /sys/fs/resctrl/test

Baseline:                     1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms

  [ bp: Massage commit message. ]

Fixes: ae28d1aae48a ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be9471 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman <peternewman@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -580,8 +580,10 @@ static int __rdtgroup_move_task(struct t
 	/*
 	 * Ensure the task's closid and rmid are written before determining if
 	 * the task is current that will decide if it will be interrupted.
+	 * This pairs with the full barrier between the rq->curr update and
+	 * resctrl_sched_in() during context switch.
 	 */
-	barrier();
+	smp_mb();
 
 	/*
 	 * By now, the task's closid and rmid are set. If the task is current
@@ -2364,6 +2366,14 @@ static void rdt_move_group_tasks(struct
 			WRITE_ONCE(t->rmid, to->mon.rmid);
 
 			/*
+			 * Order the closid/rmid stores above before the loads
+			 * in task_curr(). This pairs with the full barrier
+			 * between the rq->curr update and resctrl_sched_in()
+			 * during context switch.
+			 */
+			smp_mb();
+
+			/*
 			 * If the task is on a CPU, set the CPU in the mask.
 			 * The detection is inaccurate as tasks might move or
 			 * schedule before the smp function call takes place.



  parent reply	other threads:[~2023-01-16 16:05 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-16 15:50 [PATCH 5.15 00/86] 5.15.89-rc1 review Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 01/86] netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 02/86] ALSA: control-led: use strscpy in set_led_id() Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 03/86] ALSA: hda/realtek - Turn on power early Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 04/86] ALSA: hda/realtek: Enable mute/micmute LEDs on HP Spectre x360 13-aw0xxx Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 05/86] KVM: arm64: Fix S1PTW handling on RO memslots Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 06/86] KVM: arm64: nvhe: Fix build with profile optimization Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 07/86] selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 08/86] efi: tpm: Avoid READ_ONCE() for accessing the event log Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 09/86] docs: Fix the docs build with Sphinx 6.0 Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 10/86] net: stmmac: add aux timestamps fifo clearance wait Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 11/86] perf auxtrace: Fix address filter duplicate symbol selection Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 12/86] s390/kexec: fix ipl report address for kdump Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 13/86] ASoC: qcom: lpass-cpu: Fix fallback SD line index handling Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 14/86] s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 15/86] s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple() Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 16/86] drm/virtio: Fix GEM handle creation UAF Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 17/86] drm/i915/gt: Reset twice Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 18/86] net/mlx5e: Set action fwd flag when parsing tc action goto Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 19/86] cifs: Fix uninitialized memory read for smb311 posix symlink create Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 20/86] platform/x86: dell-privacy: Only register SW_CAMERA_LENS_COVER if present Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 21/86] platform/surface: aggregator: Ignore command messages not intended for us Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 22/86] platform/x86: dell-privacy: Fix SW_CAMERA_LENS_COVER reporting Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 23/86] dt-bindings: msm: dsi-controller-main: Fix operating-points-v2 constraint Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 24/86] drm/msm/adreno: Make adreno quirks not overwrite each other Greg Kroah-Hartman
2023-01-16 15:50 ` [PATCH 5.15 25/86] dt-bindings: msm: dsi-controller-main: Fix power-domain constraint Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 26/86] dt-bindings: msm: dsi-controller-main: Fix description of core clock Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 27/86] dt-bindings: msm: dsi-phy-28nm: Add missing qcom, dsi-phy-regulator-ldo-mode Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 28/86] platform/x86: ideapad-laptop: Add Legion 5 15ARH05 DMI id to set_fn_lock_led_list[] Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 29/86] drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 30/86] dt-bindings: msm/dsi: Dont require vdds-supply on 10nm PHY Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 31/86] dt-bindings: msm/dsi: Dont require vcca-supply on 14nm PHY Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 32/86] platform/x86: sony-laptop: Dont turn off 0x153 keyboard backlight during probe Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 33/86] ixgbe: fix pci device refcount leak Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 34/86] ipv6: raw: Deduct extension header length in rawv6_push_pending_frames Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 35/86] bus: mhi: host: Fix race between channel preparation and M0 event Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 36/86] usb: ulpi: defer ulpi_register on ulpi_read_id timeout Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 37/86] iommu/iova: Fix alloc iova overflows issue Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 38/86] iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 39/86] sched/core: Fix use-after-free bug in dup_user_cpus_ptr() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 40/86] netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 41/86] powerpc/imc-pmu: Fix use of mutex in IRQs disabled section Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 42/86] x86/boot: Avoid using Intel mnemonics in AT&T syntax asm Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 43/86] EDAC/device: Fix period calculation in edac_device_reset_delay_period() Greg Kroah-Hartman
2023-01-16 15:51 ` Greg Kroah-Hartman [this message]
2023-01-16 15:51 ` [PATCH 5.15 45/86] regulator: da9211: Use irq handler when ready Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 46/86] scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 47/86] scsi: ufs: Stop using the clock scaling lock in the error handler Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 48/86] scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 49/86] ASoC: wm8904: fix wrong outputs volume after power reactivation Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 50/86] ALSA: usb-audio: Make sure to stop endpoints before closing EPs Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 51/86] ALSA: usb-audio: Relax hw constraints for implicit fb sync Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 52/86] tipc: fix unexpected link reset due to discovery messages Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 53/86] octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 54/86] hvc/xen: lock console list traversal Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 55/86] nfc: pn533: Wait for out_urbs completion in pn533_usb_send_frame() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 56/86] af_unix: selftest: Fix the size of the parameter to connect() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 57/86] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 58/86] tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 59/86] tools/nolibc: use pselect6 on RISCV Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 60/86] tools/nolibc/std: move the standard type definitions to std.h Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 61/86] tools/nolibc/types: split syscall-specific definitions into their own files Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 62/86] tools/nolibc/arch: split arch-specific code into individual files Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 63/86] tools/nolibc/arch: mark the _start symbol as weak Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 64/86] tools/nolibc: Remove .global _start from the entry point code Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 65/86] tools/nolibc: restore mips branch ordering in the _start block Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 66/86] tools/nolibc: fix the O_* fcntl/open macro definitions for riscv Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 67/86] net/sched: act_mpls: Fix warning during failed attribute validation Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 68/86] net/mlx5: Fix ptp max frequency adjustment range Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 69/86] net/mlx5e: Dont support encap rules with gbp option Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 70/86] perf build: Properly guard libbpf includes Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 71/86] igc: Fix PPS delta between two synchronized end-points Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 72/86] platform/surface: aggregator: Add missing call to ssam_request_sync_free() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 73/86] mm: Always release pages to the buddy allocator in memblock_free_late() Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 74/86] Documentation: KVM: add API issues section Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 75/86] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 76/86] io_uring: lock overflowing for IOPOLL Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 77/86] arm64: atomics: format whitespace consistently Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 78/86] arm64: atomics: remove LL/SC trampolines Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 79/86] arm64: cmpxchg_double*: hazard against entire exchange variable Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 80/86] efi: fix NULL-deref in init error path Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 81/86] scsi: mpt3sas: Remove scsi_dma_map() error messages Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 82/86] io_uring/io-wq: free worker if task_work creation is canceled Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 83/86] io_uring/io-wq: only free worker if it was allocated for creation Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 84/86] block: handle bio_split_to_limits() NULL return Greg Kroah-Hartman
2023-01-16 15:51 ` [PATCH 5.15 85/86] Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout" Greg Kroah-Hartman
2023-01-16 15:52 ` [PATCH 5.15 86/86] pinctrl: amd: Add dynamic debugging for active GPIOs Greg Kroah-Hartman
2023-01-16 23:57 ` [PATCH 5.15 00/86] 5.15.89-rc1 review Shuah Khan
2023-01-17  3:59 ` Bagas Sanjaya
2023-01-17 11:27 ` Naresh Kamboju
2023-01-17 14:23   ` Nathan Chancellor
2023-01-17 14:31     ` Greg Kroah-Hartman
2023-01-17 12:41 ` Sudip Mukherjee
2023-01-17 13:19 ` Allen Pais
2023-01-18  1:39 ` Guenter Roeck
2023-01-18  2:55 ` Kelsey Steele
2023-01-18  8:27 ` Ron Economos
2023-01-18 18:10 ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230116154748.894995908@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=babu.moger@amd.com \
    --cc=bp@alien8.de \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=reinette.chatre@intel.com \
    --cc=stable@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox