From: Joel Fernandes <joel@joelfernandes.org>
To: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: "moderated list:ARM/STM32 ARCHITECTURE"
<linux-arm-kernel@lists.infradead.org>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
rcu <rcu@vger.kernel.org>,
"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: arm64 torture test hotplug failures (offlining causes -EBUSY)
Date: Tue, 17 Jan 2023 00:15:07 +0000
Message-ID: <Y8Xoi3UvMs+Oy78O@google.com>
In-Reply-To: <CAEXW_YQsUN_80FiXcts+5KgWo999KXWZDbkuPjmFCb8LiiLBzw@mail.gmail.com>

On Mon, Jan 16, 2023 at 05:38:00PM -0500, Joel Fernandes wrote:
> Hi Zhouyi,
>
> On Mon, Jan 16, 2023 at 1:33 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> [..]
> > On Tue, Jan 17, 2023 at 1:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >
> > > Hello,
> > > I am seeing -EBUSY returned a lot during torture_onoff() when running
> > > rcutorture on arm64. This causes hotplug failure 30% of the time. I am
> > > also seeing this in 6.1-rc kernels. I believe I see this only for CPU0.
> > >
> > > This causes warnings in torture tests:
> > > [ 217.582290] rcu-torture:torture_onoff task: offline 0 failed: errno -16
> > > [ 221.866362] rcu-torture:torture_onoff task: offline 0 failed: errno -16
> > >
> > > Full kernel log here:
> > > http://box.joelfernandes.org:9080/job/rcutorture_stable_arm/job/linux-5.15.y/7/artifact/tools/testing/selftests/rcutorture/res/2023.01.15-14.51.11/TREE04/console.log
> > >
> > > Any ideas on why this is happening and only for CPU 0 (presumably the
> > > boot CPU)? I'd personally need these warnings to go away for my tests
> > > as this causes rcutorture's tests to not cleanly pass for me. It
> > > appears remove_cpu() -> device_offline() is what returns the error.
> > >
> > I guess this is probably because CPU 0 is the tick_do_timer_cpu in
> > nohz_full mode, which prevents that CPU from
> > going offline [1]. We have discussed this topic, but there is no
> > agreement on how to solve it yet.
>
> But I am seeing the issue in TRACE02 config which is:
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
>
> So that is not NO_HZ_FULL:
> http://box.joelfernandes.org:9080/job/rcutorture_stable_arm/job/linux-5.15.y/7/artifact/tools/testing/selftests/rcutorture/res/2023.01.15-14.51.11/TRACE02/console.log.diags/
> However, I can't seem to find the full kernel logs for that.
>
> Also, other than the TRACE02 fail, I only see the issue with configs
> with CONFIG_NO_HZ_FULL=y
>
> Can you try TRACE02 specifically, and see if you can reproduce the
> same issue on your setup? Meanwhile, I'll try to trace what is
> returning the -EBUSY.

How about something simple like the following? (untested)
---8<-----------------------
diff --git a/kernel/torture.c b/kernel/torture.c
index bc8fb361efc0..cd64110694c0 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -220,6 +220,9 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
 			// PCI probe frequently disables hotplug during boot.
 			(*n_offl_attempts)--;
 			s = " (-EBUSY forgiven during boot)";
+		} else if (tick_nohz_full_running && ret == -EBUSY) {
+			(*n_offl_attempts)--;
+			s = " (-EBUSY forgiven if nohz_full is running)";
 		}
 		if (verbose)
 			pr_alert("%s" TORTURE_FLAG
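
To make the intended accounting easier to reason about, here is a small userspace model of what the hunk above does to the attempt counter. It is only a sketch: account_offline(), struct offl_stats, and the boot_ended / nohz_full_running parameters are hypothetical stand-ins for torture_offline()'s counters, rcu_inkernel_boot_has_ended(), and tick_nohz_full_running, and it assumes the attempt counter was already incremented before the offline call returned.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the n_offl_attempts / n_offl_successes counters. */
struct offl_stats {
	long attempts;
	long successes;
};

/*
 * Model of the accounting after an offline attempt returns 'ret'.
 * boot_ended and nohz_full_running are plain parameters here; in the
 * kernel they would come from rcu_inkernel_boot_has_ended() and
 * tick_nohz_full_running (assumption: same meaning as in the patch).
 * Returns the suffix that would be appended to the log message.
 */
const char *account_offline(int ret, bool boot_ended, bool nohz_full_running,
			    struct offl_stats *st)
{
	const char *s = "";

	st->attempts++;			/* every call counts as an attempt */
	if (ret) {
		if (!boot_ended && ret == -EBUSY) {
			st->attempts--;	/* forgiven: do not count it */
			s = " (-EBUSY forgiven during boot)";
		} else if (nohz_full_running && ret == -EBUSY) {
			st->attempts--;	/* forgiven: do not count it */
			s = " (-EBUSY forgiven if nohz_full is running)";
		}
		return s;		/* unforgiven failures stay counted */
	}
	st->successes++;
	return s;
}
```

With this model, a -EBUSY return while nohz_full is running leaves both counters unchanged, whereas the same return without nohz_full (and after boot) still shows up as a failed attempt.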