From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Neeraj Upadhyay <neeraju@codeaurora.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: [PATCH 5.10 09/11] rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader
Date: Thu, 31 Aug 2023 13:10:01 +0200	[thread overview]
Message-ID: <20230831110830.823658141@linuxfoundation.org> (raw)
In-Reply-To: <20230831110830.455765526@linuxfoundation.org>

5.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Neeraj Upadhyay <neeraju@codeaurora.org>

commit 46aa886c483f57ef13cd5ea0a85e70b93eb1d381 upstream.

The trc_wait_for_one_reader() function is called at multiple stages
of the trace rcu-tasks GP function, rcu_tasks_wait_gp():

- First, it is called as part of the per-task function,
  rcu_tasks_trace_pertask(), for all non-idle tasks. As part of per-task
  processing, this function adds the task to the holdout list and, if
  the task is currently running on a CPU, sends an IPI to the task's CPU.
  The IPI handler takes action depending on whether the task is in a
  trace rcu-tasks read-side critical section or not:

  - a. If the task is in trace rcu-tasks read side critical section
       (t->trc_reader_nesting != 0), the IPI handler sets the task's
       ->trc_reader_special.b.need_qs, so that this task notifies exit
       from its outermost read side critical section (by decrementing
       trc_n_readers_need_end) to the GP handling function.
       trc_wait_for_one_reader() also increments trc_n_readers_need_end,
       so that the trace rcu-tasks GP handler function waits for this
       task's read side exit notification. The IPI handler also sets
       t->trc_reader_checked to true, so no further IPIs are sent for
       this task during this trace rcu-tasks grace period, and the
       task can be removed from the holdout list.

  - b. If the task is in the process of exiting its trace rcu-tasks
       read side critical section, (t->trc_reader_nesting < 0), defer
       this task's processing to future calls to trc_wait_for_one_reader().

  - c. If the task is not in a trace rcu-tasks read side critical
       section (t->trc_reader_nesting == 0), ->trc_reader_checked is
       set for this task, so that it is removed from the holdout list.

- Second, trc_wait_for_one_reader() is called as part of post scan, in
  function rcu_tasks_trace_postscan(), for all idle tasks.

- Third, in function check_all_holdout_tasks_trace(), this function is
  called for each task in the holdout list, but only if there isn't
  a pending IPI for the task (->trc_ipi_to_cpu == -1). The task is
  removed from the holdout list once the IPI handler has completed the
  required work, which ensures that the current trace rcu-tasks grace
  period either waits for this task or knows that the task is not in a
  trace rcu-tasks read side critical section.
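
The three IPI handler outcomes (a, b, c) listed under the first case can
be sketched as a small userspace model. This is only an illustration,
not the kernel code: struct reader_model, ipi_handler_model(), the enum
and the plain int counter are hypothetical names, and the increment of
trc_n_readers_need_end is folded into the handler for brevity.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the per-task fields named above. */
struct reader_model {
	int  trc_reader_nesting;  /* >0: in reader, <0: exiting, 0: not a reader */
	bool need_qs;             /* models ->trc_reader_special.b.need_qs */
	bool trc_reader_checked;  /* true once no further IPIs are needed */
};

/* Models trc_n_readers_need_end as a plain counter (assumption). */
static int trc_n_readers_need_end_model;

enum handler_outcome {
	READER_MARKED,            /* case a */
	READER_EXITING_DEFERRED,  /* case b */
	NOT_A_READER,             /* case c */
};

static enum handler_outcome ipi_handler_model(struct reader_model *t)
{
	if (t->trc_reader_nesting < 0)
		return READER_EXITING_DEFERRED;  /* b: retry on a later call */

	if (t->trc_reader_nesting == 0) {
		t->trc_reader_checked = true;    /* c: nothing to wait for */
		return NOT_A_READER;
	}

	/* a: the grace period must wait for this reader's exit. */
	trc_n_readers_need_end_model++;
	t->need_qs = true;
	t->trc_reader_checked = true;
	return READER_MARKED;
}
```

In this model only case b leaves ->trc_reader_checked clear, which is
why such a task stays on the holdout list and must be revisited.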

Now consider the scenario where smp_call_function_single() fails in
the first case, inside rcu_tasks_trace_pertask(). In this case,
->trc_ipi_to_cpu is left set to the target CPU for that task. As a
result, trc_wait_for_one_reader() is skipped in the third case,
inside check_all_holdout_tasks_trace(), for this task. Consequently,
->trc_reader_checked is never set for this task, and the task is
never removed from the holdout list. This can cause the current
trace rcu-tasks grace period to stall.

Fix the above problem by resetting ->trc_ipi_to_cpu to -1 on
smp_call_function_single() failure, so that future IPIs can
be sent for this task.
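
The stall and the fix can be illustrated with a minimal userspace
model. This is a sketch under stated assumptions, not the kernel code:
struct holdout_model, send_ipi_model(), wait_for_one_reader_model(),
holdout_scan_model() and the simulate_failure/apply_fix flags are all
hypothetical, and a successful IPI is modeled as the handler finding
that the task is not a reader.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the per-task fields involved. */
struct holdout_model {
	int  trc_ipi_to_cpu;      /* -1 means no IPI is pending */
	bool trc_reader_checked;
	bool on_holdout_list;
};

/* Stand-in for smp_call_function_single(); nonzero means failure. */
static int send_ipi_model(int cpu, bool simulate_failure)
{
	(void)cpu;
	return simulate_failure ? -1 : 0;
}

static void wait_for_one_reader_model(struct holdout_model *t, int cpu,
				      bool simulate_failure, bool apply_fix)
{
	t->trc_ipi_to_cpu = cpu;  /* mark an IPI as pending */
	if (send_ipi_model(cpu, simulate_failure)) {
		/* Old behavior keeps the stale CPU; the fix resets to -1. */
		t->trc_ipi_to_cpu = apply_fix ? -1 : cpu;
		return;
	}
	/* Modeled handler: task is not a reader, so mark it checked. */
	t->trc_reader_checked = true;
	t->trc_ipi_to_cpu = -1;
}

/* Models one holdout-list pass for a single task. */
static void holdout_scan_model(struct holdout_model *t, int cpu,
			       bool apply_fix)
{
	/* Retry only when no IPI is believed to be pending. */
	if (t->trc_ipi_to_cpu == -1 && !t->trc_reader_checked)
		wait_for_one_reader_model(t, cpu, false, apply_fix);
	if (t->trc_reader_checked)
		t->on_holdout_list = false;
}
```

With apply_fix false, a failed send leaves ->trc_ipi_to_cpu at the CPU
number, so every later scan skips the task and it stays on the holdout
list. With apply_fix true, the next scan retries and the task is
removed.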

Note that all three of the trc_wait_for_one_reader() function's
callers (rcu_tasks_trace_pertask(), rcu_tasks_trace_postscan(),
check_all_holdout_tasks_trace()) hold cpus_read_lock().  This means
that smp_call_function_single() cannot race with CPU hotplug, and thus
should never fail.  Therefore, also add a warning in order to report
any such failure in case smp_call_function_single() grows some other
reason for failure.

Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/rcu/tasks.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -958,9 +958,11 @@ static void trc_wait_for_one_reader(stru
 		if (smp_call_function_single(cpu, trc_read_check_handler, t, 0)) {
 			// Just in case there is some other reason for
 			// failure than the target CPU being offline.
+			WARN_ONCE(1, "%s():  smp_call_function_single() failed for CPU: %d\n",
+				  __func__, cpu);
 			rcu_tasks_trace.n_ipis_fails++;
 			per_cpu(trc_ipi_to_cpu, cpu) = false;
-			t->trc_ipi_to_cpu = cpu;
+			t->trc_ipi_to_cpu = -1;
 		}
 	}
 }



