* Dilemma on changes - merge or not to merge (e.g. 6.4)
@ 2023-08-14 9:54 Richard Purdie
2023-08-15 13:08 ` Paul Gortmaker
0 siblings, 1 reply; 23+ messages in thread
From: Richard Purdie @ 2023-08-14 9:54 UTC (permalink / raw)
To: openembedded-core; +Cc: Bruce Ashfield, Paul Gortmaker
I'm becoming a little weary/wary of some of the changes that are coming
in. The challenge is that once they merge, issues become the problem of
a very small number of people.
My current dilemma is the 6.4 kernel. People would like it, we'd really
ideally use it for the next release but there are issues.
I've worked through a few, at least pinning down where the issues were
then resolving them with the help of others (thanks Bruce, Jon, Ross).
Remaining are:
* an error upon boot on preempt-rt on qemux86-64
(e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
We'll probably just have to ignore it in parselogs as it has been
around for a while and nobody seems interested in fixing it upstream.
* some random hangs:
https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/349/steps/12/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/354/steps/12/logs/stdio
The latter are rare and intermittent, mainly taking out CI test builds.
Most people aren't affected by them, find them hard to reproduce let
alone fix and will ignore them. That will leave me/Bruce/PaulG holding
the pieces.
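
As a rough illustration of what "ignore it in parselogs" amounts to for the first item above, here is a minimal Python sketch of filtering log lines against an ignore list. This is not the actual parselogs code in OE-core: the names `IGNORED_PATTERNS` and `unexpected_errors`, the simple "error" keyword trigger, and the `example-driver` sample line are all assumptions made up for illustration.

```python
import re

# Hypothetical ignore list: substrings of known, accepted log noise.
IGNORED_PATTERNS = [
    "NOHZ tick-stop error: local softirq work is pending",
]

# Hypothetical trigger: flag any line containing the word "error".
ERROR_RE = re.compile(r"error", re.IGNORECASE)


def unexpected_errors(log_lines, ignored=IGNORED_PATTERNS):
    """Return lines that look like errors and match no ignore entry."""
    hits = []
    for line in log_lines:
        if not ERROR_RE.search(line):
            continue  # not an error-looking line at all
        if any(pattern in line for pattern in ignored):
            continue  # known noise, deliberately skipped
        hits.append(line)
    return hits


sample = [
    "NOHZ tick-stop error: local softirq work is pending, handler #80!!!",
    "systemd-journald[113]: Received client request to flush runtime journal.",
    "example-driver: probe failed with error -22",  # made-up line
]
remaining = unexpected_errors(sample)
```

The cost of this approach is the one described above: an ignore entry hides the message everywhere, so a real regression that prints the same text would also go unnoticed.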
I know Bruce spends a ton of time debugging weird things just to get
the kernel to the point we can even consider merging and nobody ever
really sees or appreciates that work :(.
Systemd was a similar challenge recently, multiple patches causing
multiple issues with a significant impact on CI. In that case the
issues weren't intermittent so resolution wasn't so bad.
Rust and reproducibility was given a pass so the rest of the changes
could merge for it. That just meant there was less pressure and the
reproducibility issue is still there with people saying it's too hard.
That issue is now spreading down the chain to other recipes.
The toolchain test reports have thousands of failures nobody is really
looking at. Similarly the now consistent ltp controllers failures
(previously the reports weren't even consistent!).
I'm worried the access control patches changing the tar format are
going to destabilise things and once merged, people will move on to other
things leaving any remaining intermittent issues to me. Already we're
seeing things like sstate being blamed as it is easiest to do that. I
end up having to "prove" it isn't that.
There are intermittent ptests on the autobuilder too. I took mdadm
ptest patches on the basis there was help to fix them. We are still seeing
a lot of failures in CI from there. The glib-networking intermittent
failures continue, I know Trevor has tried to dig into those but he is
alone in doing it in code which isn't easy to navigate (and I don't
know how to help there).
As an idea of impact, every time one of these things fails in CI,
someone has to triage that failure. The bug triage team has to triage the
bugs too.
I don't know how we fix this but we really could do with more people
able to dive in and help with these intermittent issues. I'm really
really apprehensive about merging some patches as I can just tell
they're going to cause pain :(.
Cheers,
Richard
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-14 9:54 Dilemma on changes - merge or not to merge (e.g. 6.4) Richard Purdie
@ 2023-08-15 13:08 ` Paul Gortmaker
2023-08-15 13:38 ` Richard Purdie
2023-08-16 7:55 ` [OE-core] " Rasmus Villemoes
0 siblings, 2 replies; 23+ messages in thread
From: Paul Gortmaker @ 2023-08-15 13:08 UTC (permalink / raw)
To: Richard Purdie; +Cc: openembedded-core, Bruce Ashfield
[Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
> I'm becoming a little weary/wary of some of the changes that are coming
> in. The challenge is that once they merge, issues become the problem of
> a very small number of people.
>
> My current dilemma is the 6.4 kernel. People would like it, we'd really
> ideally use it for the next release but there are issues.
>
> I've worked through a few, at least pinning down where the issues were
> then resolving them with the help of others (thanks Bruce, Jon, Ross).
>
> Remaining are:
> * an error upon boot on preempt-rt on qemux86-64
> (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
> We'll probably just have to ignore it in parselogs as it has been
> around for a while and nobody seems interested in fixing it upstream.
Just back from vacation and I see an internal report of 10-ish at boot
NOHZ tick-stop error: local softirq work is pending, handler #80!!!
..on the 6.1.43-rt10-yocto-preempt-rt kernel, on real hardware. So it
seems we can't blame that one entirely on v6.4 kernel (or qemu).
We used to get (late 3.x and 4.x era) pretty common "NOHZ: local softirq
pending" messages even on common/popular distro kernels. But I haven't
seen those for a long time and they didn't scream "error" or have the
alarmist three exclamation marks either.
I'll see if I can dig into that further. This instance is new to me, so
any additional context or information I might not turn up myself would
be useful.
> * some random hangs:
> https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/349/steps/12/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/354/steps/12/logs/stdio
>
> The latter are rare and intermittent, mainly taking out CI test builds.
> Most people aren't affected by them, find them hard to reproduce let
> alone fix and will ignore them. That will leave me/Bruce/PaulG holding
> the pieces.
Ugh. The RCU one is ugly and the Silent Boot Death one is no better.
Nobody likes SBD cases. They suck.
>
> I know Bruce spends a ton of time debugging weird things just to get
> the kernel to the point we can even consider merging and nobody ever
> really sees or appreciates that work :(.
Well, not "nobody". There are at least two people who have a good idea
of what Bruce does. :-P
Paul.
--
>
> Systemd was a similar challenge recently, multiple patches causing
> multiple issues with a significant impact on CI. In that case the
> issues weren't intermittent so resolution wasn't so bad.
>
> Rust and reproducibility was given a pass so the rest of the changes
> could merge for it. That just meant there was less pressure and the
> reproducibility issue is still there with people saying it's too hard.
> That issue is now spreading down the chain to other recipes.
>
> The toolchain test reports have thousands of failures nobody is really
> looking at. Similarly the now consistent ltp controllers failures
> (previously the reports weren't even consistent!).
>
> I'm worried the access control patches changing the tar format are
> going to destabilise things and once merged, people will move on to other
> things leaving any remaining intermittent issues to me. Already we're
> seeing things like sstate being blamed as it is easiest to do that. I
> end up having to "prove" it isn't that.
>
> There are intermittent ptests on the autobuilder too. I took mdadm
> ptest patches on the basis there was help to fix them. We are still seeing
> a lot of failures in CI from there. The glib-networking intermittent
> failures continue, I know Trevor has tried to dig into those but he is
> alone in doing it in code which isn't easy to navigate (and I don't
> know how to help there).
>
> As an idea of impact, every time one of these things fails in CI,
> someone has to triage that failure. The bug triage team has to triage the
> bugs too.
>
> I don't know how we fix this but we really could do with more people
> able to dive in and help with these intermittent issues. I'm really
> really apprehensive about merging some patches as I can just tell
> they're going to cause pain :(.
>
> Cheers,
>
> Richard
>
* Re: Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-15 13:08 ` Paul Gortmaker
@ 2023-08-15 13:38 ` Richard Purdie
2023-08-16 7:55 ` [OE-core] " Rasmus Villemoes
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-15 13:38 UTC (permalink / raw)
To: Paul Gortmaker; +Cc: openembedded-core, Bruce Ashfield
On Tue, 2023-08-15 at 09:08 -0400, Paul Gortmaker wrote:
> [Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
>
> > I'm becoming a little weary/wary of some of the changes that are coming
> > in. The challenge is that once they merge, issues become the problem of
> > a very small number of people.
> >
> > My current dilemma is the 6.4 kernel. People would like it, we'd really
> > ideally use it for the next release but there are issues.
> >
> > I've worked through a few, at least pinning down where the issues were
> > then resolving them with the help of others (thanks Bruce, Jon, Ross).
> >
> > Remaining are:
> > * an error upon boot on preempt-rt on qemux86-64
> > (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
> > We'll probably just have to ignore it in parselogs as it has been
> > around for a while and nobody seems interested in fixing it upstream.
>
> Just back from vacation and I see an internal report of 10-ish at boot
>
> NOHZ tick-stop error: local softirq work is pending, handler #80!!!
>
> ..on the 6.1.43-rt10-yocto-preempt-rt kernel, on real hardware. So it
> seems we can't blame that one entirely on v6.4 kernel (or qemu).
That lets us rule out qemu and maybe look at "stable" series updates?
Any idea if it is there in early 6.1.x or just appeared?
> We used to get (late 3.x and 4.x era) pretty common "NOHZ: local softirq
> pending" messages even on common/popular distro kernels. But I haven't
> seen those for a long time and they didn't scream "error" or have the
> alarmist three exclamation marks either.
When I was looking around I did see a commit which "clarified" the
message adding the "error" keyword...
> I'll see if I can dig into that further. This instance is new to me, so
> any additional context or information I might not turn up myself would
> be useful.
Thanks. I don't really have any at this point, I've just been
collecting the failures. Bruce may have more. I have a few too many
issues going on at once atm.
> > * some random hangs:
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/349/steps/12/logs/stdio
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/354/steps/12/logs/stdio
> >
> > The latter are rare and intermittent, mainly taking out CI test builds.
> > Most people aren't affected by them, find them hard to reproduce let
> > alone fix and will ignore them. That will leave me/Bruce/PaulG holding
> > the pieces.
>
> Ugh. The RCU one is ugly and the Silent Boot Death one is no better.
> Nobody likes SBD cases. They suck.
They do indeed.
> > I know Bruce spends a ton of time debugging weird things just to get
> > the kernel to the point we can even consider merging and nobody ever
> > really sees or appreciates that work :(.
>
> Well, not "nobody". There are at least two people who have a good idea
> of what Bruce does. :-P
Too few would have been more accurate I guess :)
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-15 13:08 ` Paul Gortmaker
2023-08-15 13:38 ` Richard Purdie
@ 2023-08-16 7:55 ` Rasmus Villemoes
2023-08-18 3:22 ` Paul Gortmaker
1 sibling, 1 reply; 23+ messages in thread
From: Rasmus Villemoes @ 2023-08-16 7:55 UTC (permalink / raw)
To: paul.gortmaker, Richard Purdie; +Cc: openembedded-core, Bruce Ashfield
On 15/08/2023 15.08, Paul Gortmaker via lists.openembedded.org wrote:
> [Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
>
>> Remaining are:
>> * an error upon boot on preempt-rt on qemux86-64
>> (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
>> We'll probably just have to ignore it in parselogs as it has been
>> around for a while and nobody seems interested in fixing it upstream.
>
> Just back from vacation and I see an internal report of 10-ish at boot
it seems to be rate-limited to 10 per boot, so it should never appear
more than those 10ish times:
static bool report_idle_softirq(void)
{
...
if (ratelimit >= 10)
return false;
...
ratelimit++;
...
}
(it's all non-atomic/lockfree, so ofc it could just happen to get
emitted 11 or 12 times if the stars align just right...)
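
To make the counter logic in that excerpt concrete, here is a toy Python model of a "report at most N times per boot" limiter. It is only a sketch of the quoted check, not kernel code: the class and method names are made up, and formatting the handler mask as two hex digits is an assumption chosen to match the "handler #80" text seen in the logs.

```python
class IdleSoftirqReporter:
    """Toy model of a 'report at most N times per boot' counter."""

    def __init__(self, limit=10):
        self.limit = limit
        self.ratelimit = 0  # number of reports emitted so far
        self.messages = []

    def report(self, pending):
        # Mirrors the quoted check: once the counter reaches the
        # limit, nothing more is printed for the rest of the boot.
        if self.ratelimit >= self.limit:
            return False
        self.messages.append(
            "NOHZ tick-stop error: local softirq work is pending, "
            f"handler #{pending:02x}!!!"
        )
        self.ratelimit += 1
        return True


reporter = IdleSoftirqReporter()
results = [reporter.report(0x80) for _ in range(25)]
emitted = len(reporter.messages)
```

This single-threaded model always stops at exactly 10. It cannot show the unlocked check-then-increment race mentioned above, which is what would allow an occasional 11th or 12th message on a real system.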
>
> NOHZ tick-stop error: local softirq work is pending, handler #80!!!
>
> ..on the 6.1.43-rt10-yocto-preempt-rt kernel, on real hardware. So it
> seems we can't blame that one entirely on v6.4 kernel (or qemu).
>
> We used to get (late 3.x and 4.x era) pretty common "NOHZ: local softirq
> pending" messages even on common/popular distro kernels. But I haven't
> seen those for a long time and they didn't scream "error" or have the
> alarmist three exclamation marks either.
FWIW, we're also seeing exactly that "NOHZ tick-stop error" message on
6.4.6-rt8 running on a couple of different imx8mp based boards.
Rasmus
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-16 7:55 ` [OE-core] " Rasmus Villemoes
@ 2023-08-18 3:22 ` Paul Gortmaker
2023-08-22 9:31 ` Richard Purdie
[not found] ` <177DAAB2E4C3384A.4797@lists.openembedded.org>
0 siblings, 2 replies; 23+ messages in thread
From: Paul Gortmaker @ 2023-08-18 3:22 UTC (permalink / raw)
To: Rasmus Villemoes; +Cc: Richard Purdie, openembedded-core, Bruce Ashfield
[Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)] On 16/08/2023 (Wed 09:55) Rasmus Villemoes wrote:
> On 15/08/2023 15.08, Paul Gortmaker via lists.openembedded.org wrote:
> > [Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
> >
> >> Remaining are:
> >> * an error upon boot on preempt-rt on qemux86-64
> >> (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
> >> We'll probably just have to ignore it in parselogs as it has been
> >> around for a while and nobody seems interested in fixing it upstream.
> >
> > Just back from vacation and I see an internal report of 10-ish at boot
>
> it seems to be rate-limited to 10 per boot, so it should never appear
> more than those 10ish times:
>
> static bool report_idle_softirq(void)
> {
> ...
> if (ratelimit >= 10)
> return false;
> ...
> ratelimit++;
> ...
Amusingly enough - you were looking right at the problem. Just a few
stable kernels ago, it was inadvertently ratelimited to zero. :-P
https://lists.openembedded.org/g/openembedded-core/message/186343
Paul.
--
> }
>
> (it's all non-atomic/lockfree, so ofc it could just happen to get
> emitted 11 or 12 times if the stars align just right...)
>
> >
> > NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> >
> > ..on the 6.1.43-rt10-yocto-preempt-rt kernel, on real hardware. So it
> > seems we can't blame that one entirely on v6.4 kernel (or qemu).
> >
> > We used to get (late 3.x and 4.x era) pretty common "NOHZ: local softirq
> > pending" messages even on common/popular distro kernels. But I haven't
> > seen those for a long time and they didn't scream "error" or have the
> > alarmist three exclamation marks either.
>
> FWIW, we're also seeing exactly that "NOHZ tick-stop error" message on
> 6.4.6-rt8 running on a couple of different imx8mp based boards.
>
> Rasmus
>
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-18 3:22 ` Paul Gortmaker
@ 2023-08-22 9:31 ` Richard Purdie
[not found] ` <177DAAB2E4C3384A.4797@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 9:31 UTC (permalink / raw)
To: Paul Gortmaker, Rasmus Villemoes; +Cc: openembedded-core, Bruce Ashfield
On Thu, 2023-08-17 at 23:22 -0400, Paul Gortmaker wrote:
> [Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)] On 16/08/2023 (Wed 09:55) Rasmus Villemoes wrote:
>
> > On 15/08/2023 15.08, Paul Gortmaker via lists.openembedded.org wrote:
> > > [Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
> > >
> > > > Remaining are:
> > > > * an error upon boot on preempt-rt on qemux86-64
> > > > (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
> > > > We'll probably just have to ignore it in parselogs as it has been
> > > > around for a while and nobody seems interested in fixing it upstream.
> > >
> > > Just back from vacation and I see an internal report of 10-ish at boot
> >
> > it seems to be rate-limited to 10 per boot, so it should never appear
> > more than those 10ish times:
> >
> > static bool report_idle_softirq(void)
> > {
> > ...
> > if (ratelimit >= 10)
> > return false;
> > ...
> > ratelimit++;
> > ...
>
> Amusingly enough - you were looking right at the problem. Just a few
> stable kernels ago, it was inadvertently ratelimited to zero. :-P
>
> https://lists.openembedded.org/g/openembedded-core/message/186343
Thanks for tracking this down and the subsequent fix. Sadly it didn't
lead us to a magic fix everywhere but it is good to have it resolved
and we can drop the "ignore the error" patch.
I've been poking at this qemuppc issue which is now affecting both 6.1
*and* 6.4 on the autobuilder. It seems pretty consistent that "bitbake
core-image-sato-sdk -c testimage" for qemuppc generates rcu tracebacks
and/or hangs when trying to configure/compile cpio in the image. It
usually only gets as far as configure.
I'd swear this is a new thing that just started happening so it is
likely something that was backported to both 6.1 and 6.4 stable series.
How likely is it the same bug would get backported to both at once? ;-)
On the autobuilder, "qemuppc-altcfg" is building 6.1.46 and "qemuppc"
is building 6.4.11 and both hang.
I tried locally and was able to reproduce it first try. After 10 mins
of testimage, I could ssh into the image and see the rcu stall:
[ 16.638778] systemd-journald[113]: Received client request to flush runtime journal.
[ 540.762141] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 540.772535] rcu: 0-...!: (1 ticks this GP) idle=71e4/1/0x40000002 softirq=73154/73154 fqs=0
[ 540.773036] (t=6104 jiffies g=177277 q=1 ncpus=1)
[ 540.773334] rcu: rcu_preempt kthread timer wakeup didn't happen for 6103 jiffies! g177277 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 540.773390] rcu: Possible timer handling issue on cpu=0 timer-softirq=50723
[ 540.773480] rcu: rcu_preempt kthread starved for 6104 jiffies! g177277 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 540.773529] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 540.773580] rcu: RCU grace-period kthread stack dump:
[ 540.773687] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
[ 540.774731] Call Trace:
[ 540.774927] [f1051da0] [c0d84cc4] __schedule+0x378/0x890
[ 540.775876] [f1051df0] [c0d85244] schedule+0x68/0x118
[ 540.775964] [f1051e10] [c0d8c678] schedule_timeout+0xb0/0x17c
[ 540.776012] [f1051e50] [c00d4d84] rcu_gp_fqs_loop+0x4cc/0x6ac
[ 540.776057] [f1051eb0] [c00d859c] rcu_gp_kthread+0x238/0x284
[ 540.776100] [f1051f00] [c008a4d0] kthread+0xfc/0x114
[ 540.776144] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
[ 540.776248] rcu: Stack dump where RCU GP kthread last ran:
[ 540.776462] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1
[ 540.776797] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 540.776866] NIP: a7a9f67c LR: a7a9f67c CTR: 00000040
[ 540.776914] REGS: f3a25f40 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
[ 540.776987] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000284 XER: 20000000
[ 540.777230]
GPR00: a7aa7a9c af9a2660 00000000 00000000 00000002 00000070 00000060 00000000
GPR08: 00000008 00000700 a7ad9ff0 00000008 00000020 101df7bc a7ad9fe0 00000000
GPR16: a7adb5f8 a7adb008 a7ad9ff0 af9a282c a7adb944 00001000 00000010 00000000
GPR24: 10000034 a7adba28 00000000 00000000 00000000 00000041 a7adaff0 a7adb008
[ 540.777644] NIP [a7a9f67c] 0xa7a9f67c
[ 540.777970] LR [a7a9f67c] 0xa7a9f67c
[ 540.778015] Call Trace:
[ 540.778274] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1
[ 540.778316] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 540.778340] NIP: a7a9f67c LR: a7a9f67c CTR: 00000040
[ 540.778365] REGS: f3a25f40 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
[ 540.778392] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000284 XER: 20000000
[ 540.778502]
GPR00: a7aa7a9c af9a2660 00000000 00000000 00000002 00000070 00000060 00000000
GPR08: 00000008 00000700 a7ad9ff0 00000008 00000020 101df7bc a7ad9fe0 00000000
GPR16: a7adb5f8 a7adb008 a7ad9ff0 af9a282c a7adb944 00001000 00000010 00000000
GPR24: 10000034 a7adba28 00000000 00000000 00000000 00000041 a7adaff0 a7adb008
[ 540.778879] NIP [a7a9f67c] 0xa7a9f67c
[ 540.778913] LR [a7a9f67c] 0xa7a9f67c
[ 540.778943] Call Trace:
(and yes, dmesg did truncate there)
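
For anyone triaging several of these reports, the handful of fields that matter in the dump above (RCU flavor, how long the kthread was starved, which task was running, which kernel) can be pulled out mechanically. The following Python sketch is one way to do that. The function name and regular expressions are my own assumptions, written against the exact line shapes pasted above, and may need adjusting for other kernels or architectures.

```python
import re

STALL_RE = re.compile(r"rcu: INFO: (\w+) self-detected stall on CPU")
STARVED_RE = re.compile(r"rcu: (\w+) kthread starved for (\d+) jiffies!")
TASK_RE = re.compile(
    r"CPU:\s+(\d+)\s+PID:\s+(\d+)\s+Comm:\s+(\S+)\s+.*?\s(\d+\.\d+\.\d+\S*)\s+#"
)


def summarize_stall(dmesg_lines):
    """Collect a few triage fields from an RCU stall report."""
    summary = {}
    for line in dmesg_lines:
        if (m := STALL_RE.search(line)):
            summary["flavor"] = m.group(1)
        elif (m := STARVED_RE.search(line)):
            summary["starved_jiffies"] = int(m.group(2))
        elif (m := TASK_RE.search(line)) and "comm" not in summary:
            # Only keep the first task line; the dump repeats it.
            summary["cpu"] = int(m.group(1))
            summary["pid"] = int(m.group(2))
            summary["comm"] = m.group(3)
            summary["kernel"] = m.group(4)
    return summary


sample = [
    "[ 540.762141] rcu: INFO: rcu_preempt self-detected stall on CPU",
    "[ 540.773480] rcu: rcu_preempt kthread starved for 6104 jiffies! g177277 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0",
    "[ 540.776462] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1",
    "[ 540.778274] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1",
]
summary = summarize_stall(sample)
```

Comparing the `comm` and `kernel` fields across reports is a quick way to check whether different CI failures are likely the same stall.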
For fun, I applied this change alone:
diff --git a/meta/recipes-kernel/linux/linux-yocto_6.1.bb b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
index b4601f583e7..a26851a4620 100644
--- a/meta/recipes-kernel/linux/linux-yocto_6.1.bb
+++ b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
@@ -22,7 +22,8 @@ SRCREV_machine:qemuarm ?= "4e49d63e747e81aebad5ce6091ba6de09f09d46f"
SRCREV_machine:qemuarm64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemuloongarch64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemumips ?= "e527feb9cd8acbcbcd7115f51cf71166fdbce11a"
-SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
+#SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
+SRCREV_machine:qemuppc ?= "b110cf9bbc395fe757956839d8110e72368699f4"
SRCREV_machine:qemuriscv64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemuriscv32 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemux86 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
@@ -46,6 +47,7 @@ SRC_URI += "file://0001-perf-cpumap-Make-counter-as-unsigned-ints.patch"
LIC_FILES_CHKSUM = "file://COPYING;md5=6bc538ed5bd9a7fc9398086aedcd7e46"
LINUX_VERSION ?= "6.1.46"
+LINUX_VERSION:qemuppc ?= "6.1.38"
PV = "${LINUX_VERSION}+git"
rebuilt core-image-sato-sdk and it is still going in testimage and past
configure into compile so I have a feeling it will eventually complete
with no rcu stall. The thing is slow.
This suggests the problem we're looking for is between 6.1.38 and
6.1.46. I tried this with 6.1 as I figured there might be fewer commits
than on 6.4 and once we find the issue, it is probably the same for
6.4.
I'm hoping that if we can track down the bug on qemuppc, it might solve
the x86 issue too which seems much more rare and hard to reproduce.
With test runs taking 10 mins to break and at least 30 mins to complete,
this could take a while but I'll try and keep going and narrow it down.
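
To put a rough number on "a while": narrowing a known-good 6.1.38 and a known-bad 6.1.46 by point release is a plain bisection. The Python sketch below counts the test runs and adds up the run times quoted above. It is only an estimate under stated assumptions: `is_bad` is a hypothetical callback standing in for a full rebuild plus testimage run, the placement of the regression at 6.1.40 is purely made up for the example, and rebuild time is not included.

```python
def first_bad_release(good, bad, is_bad):
    """Bisect integer point releases; `good` passes, `bad` fails.

    Returns (first failing release, list of (release, failed) runs).
    """
    runs = []
    while bad - good > 1:
        mid = (good + bad) // 2
        failed = is_bad(mid)
        runs.append((mid, failed))
        if failed:
            bad = mid
        else:
            good = mid
    return bad, runs


def estimated_minutes(runs, fail_minutes=10, pass_minutes=30):
    # Failing runs stall after roughly 10 minutes; passing runs need
    # at least 30 minutes, so the total is a lower bound.
    return sum(fail_minutes if failed else pass_minutes for _, failed in runs)


# Hypothetical scenario: pretend the regression landed in 6.1.40.
culprit, runs = first_bad_release(38, 46, lambda release: release >= 40)
minutes = estimated_minutes(runs)
```

With eight point releases in the range, three test runs are enough to pick out one release. Since the stall is only "pretty consistent", a passing run may need repeating before that release is trusted as good, which would push the total up.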
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DAAB2E4C3384A.4797@lists.openembedded.org>
@ 2023-08-22 11:07 ` Richard Purdie
[not found] ` <177DAFEBFB5EB0D2.24073@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 11:07 UTC (permalink / raw)
To: Paul Gortmaker, Rasmus Villemoes; +Cc: openembedded-core, Bruce Ashfield
On Tue, 2023-08-22 at 10:31 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Thu, 2023-08-17 at 23:22 -0400, Paul Gortmaker wrote:
> > [Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)] On 16/08/2023 (Wed 09:55) Rasmus Villemoes wrote:
> >
> > > On 15/08/2023 15.08, Paul Gortmaker via lists.openembedded.org wrote:
> > > > [Dilemma on changes - merge or not to merge (e.g. 6.4)] On 14/08/2023 (Mon 10:54) Richard Purdie wrote:
> > > >
> > > > > Remaining are:
> > > > > * an error upon boot on preempt-rt on qemux86-64
> > > > > (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
> > > > > We'll probably just have to ignore it in parselogs as it has been
> > > > > around for a while and nobody seems interested in fixing it upstream.
> > > >
> > > > Just back from vacation and I see an internal report of 10-ish at boot
> > >
> > > it seems to be rate-limited to 10 per boot, so it should never appear
> > > more than those 10ish times:
> > >
> > > static bool report_idle_softirq(void)
> > > {
> > > ...
> > > if (ratelimit >= 10)
> > > return false;
> > > ...
> > > ratelimit++;
> > > ...
> >
> > Amusingly enough - you were looking right at the problem. Just a few
> > stable kernels ago, it was inadvertently ratelimited to zero. :-P
> >
> > https://lists.openembedded.org/g/openembedded-core/message/186343
>
> Thanks for tracking this down and the subsequent fix. Sadly it didn't
> lead us to a magic fix everywhere but it is good to have it resolved
> and we can drop the "ignore the error" patch.
>
> I've been poking at this qemuppc issue which is now affecting both 6.1
> *and* 6.4 on the autobuilder. It seems pretty consistent that "bitbake
> core-image-sato-sdk -c testimage" for qemuppc generates rcu tracebacks
> and/or hangs when trying to configure/compile cpio in the image. It
> usually only gets as far as configure.
>
> I'd swear this is a new thing that just started happening so it is
> likely something that was backported to both 6.1 and 6.4 stable series.
> How likely is it the same bug would get backported to both at once? ;-)
>
> On the autobuilder, "qemuppc-altcfg" is building 6.1.46 and "qemuppc"
> is building 6.4.11 and both hang.
>
> I tried locally and was able to reproduce it first try. After 10 mins
> of testimage, I could ssh into the image and see the rcu stall:
>
>
> [ 16.638778] systemd-journald[113]: Received client request to flush runtime journal.
> [ 540.762141] rcu: INFO: rcu_preempt self-detected stall on CPU
> [ 540.772535] rcu: 0-...!: (1 ticks this GP) idle=71e4/1/0x40000002 softirq=73154/73154 fqs=0
> [ 540.773036] (t=6104 jiffies g=177277 q=1 ncpus=1)
> [ 540.773334] rcu: rcu_preempt kthread timer wakeup didn't happen for 6103 jiffies! g177277 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> [ 540.773390] rcu: Possible timer handling issue on cpu=0 timer-softirq=50723
> [ 540.773480] rcu: rcu_preempt kthread starved for 6104 jiffies! g177277 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
> [ 540.773529] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> [ 540.773580] rcu: RCU grace-period kthread stack dump:
> [ 540.773687] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
> [ 540.774731] Call Trace:
> [ 540.774927] [f1051da0] [c0d84cc4] __schedule+0x378/0x890
> [ 540.775876] [f1051df0] [c0d85244] schedule+0x68/0x118
> [ 540.775964] [f1051e10] [c0d8c678] schedule_timeout+0xb0/0x17c
> [ 540.776012] [f1051e50] [c00d4d84] rcu_gp_fqs_loop+0x4cc/0x6ac
> [ 540.776057] [f1051eb0] [c00d859c] rcu_gp_kthread+0x238/0x284
> [ 540.776100] [f1051f00] [c008a4d0] kthread+0xfc/0x114
> [ 540.776144] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
> [ 540.776248] rcu: Stack dump where RCU GP kthread last ran:
> [ 540.776462] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1
> [ 540.776797] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
> [ 540.776866] NIP: a7a9f67c LR: a7a9f67c CTR: 00000040
> [ 540.776914] REGS: f3a25f40 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
> [ 540.776987] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000284 XER: 20000000
> [ 540.777230]
> GPR00: a7aa7a9c af9a2660 00000000 00000000 00000002 00000070 00000060 00000000
> GPR08: 00000008 00000700 a7ad9ff0 00000008 00000020 101df7bc a7ad9fe0 00000000
> GPR16: a7adb5f8 a7adb008 a7ad9ff0 af9a282c a7adb944 00001000 00000010 00000000
> GPR24: 10000034 a7adba28 00000000 00000000 00000000 00000041 a7adaff0 a7adb008
> [ 540.777644] NIP [a7a9f67c] 0xa7a9f67c
> [ 540.777970] LR [a7a9f67c] 0xa7a9f67c
> [ 540.778015] Call Trace:
> [ 540.778274] CPU: 0 PID: 3161 Comm: cc1 Not tainted 6.1.46-yocto-standard #1
> [ 540.778316] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
> [ 540.778340] NIP: a7a9f67c LR: a7a9f67c CTR: 00000040
> [ 540.778365] REGS: f3a25f40 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
> [ 540.778392] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000284 XER: 20000000
> [ 540.778502]
> GPR00: a7aa7a9c af9a2660 00000000 00000000 00000002 00000070 00000060 00000000
> GPR08: 00000008 00000700 a7ad9ff0 00000008 00000020 101df7bc a7ad9fe0 00000000
> GPR16: a7adb5f8 a7adb008 a7ad9ff0 af9a282c a7adb944 00001000 00000010 00000000
> GPR24: 10000034 a7adba28 00000000 00000000 00000000 00000041 a7adaff0 a7adb008
> [ 540.778879] NIP [a7a9f67c] 0xa7a9f67c
> [ 540.778913] LR [a7a9f67c] 0xa7a9f67c
> [ 540.778943] Call Trace:
>
> (and yes, dmesg did truncate there)
>
> For fun, I applied this change alone:
>
> diff --git a/meta/recipes-kernel/linux/linux-yocto_6.1.bb b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
> index b4601f583e7..a26851a4620 100644
> --- a/meta/recipes-kernel/linux/linux-yocto_6.1.bb
> +++ b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
> @@ -22,7 +22,8 @@ SRCREV_machine:qemuarm ?= "4e49d63e747e81aebad5ce6091ba6de09f09d46f"
> SRCREV_machine:qemuarm64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> SRCREV_machine:qemuloongarch64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> SRCREV_machine:qemumips ?= "e527feb9cd8acbcbcd7115f51cf71166fdbce11a"
> -SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> +#SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> +SRCREV_machine:qemuppc ?= "b110cf9bbc395fe757956839d8110e72368699f4"
> SRCREV_machine:qemuriscv64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> SRCREV_machine:qemuriscv32 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> SRCREV_machine:qemux86 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
> @@ -46,6 +47,7 @@ SRC_URI += "file://0001-perf-cpumap-Make-counter-as-unsigned-ints.patch"
>
> LIC_FILES_CHKSUM = "file://COPYING;md5=6bc538ed5bd9a7fc9398086aedcd7e46"
> LINUX_VERSION ?= "6.1.46"
> +LINUX_VERSION:qemuppc ?= "6.1.38"
>
> PV = "${LINUX_VERSION}+git"
>
> rebuilt core-image-sato-sdk and it is still going in testimage and past
> configure into compile so I have a feeling it will eventually complete
> with no rcu stall. The thing is slow.
>
> This suggests the problem we're looking for is between 6.1.38 and
> 6.1.46. I tried this with 6.1 as I figured there might be fewer commits
> than on 6.4 and once we find the issue, it is probably the same for
> 6.4.
>
> I'm hoping that if we can track down the bug on qemuppc, it might solve
> the x86 issue too which seems much more rare and hard to reproduce.
>
> With test runs taking 10 mins to break and at least 30 mins to complete,
> this could take a while but I'll try and keep going and narrow it down.
[ 0.000000] Linux version 6.1.41-yocto-standard (oe-user@oe-host) (powerpc-poky-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41.0.20230731) #1 PREEMPT Wed Jul 26 18:34:20 UTC 2023
<cut>
[ 1028.817827] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1028.837241] rcu: 0-...!: (1 ticks this GP) idle=879c/1/0x40000002 softirq=132839/132839 fqs=0
[ 1028.837511] (t=6588 jiffies g=329809 q=1 ncpus=1)
[ 1028.837813] rcu: rcu_preempt kthread timer wakeup didn't happen for 6587 jiffies! g329809 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 1028.837884] rcu: Possible timer handling issue on cpu=0 timer-softirq=92451
[ 1028.837974] rcu: rcu_preempt kthread starved for 6588 jiffies! g329809 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 1028.838022] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 1028.838057] rcu: RCU grace-period kthread stack dump:
[ 1028.838166] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
[ 1028.839004] Call Trace:
[ 1028.839078] [f1051cd0] [c0064274] irq_exit+0x20/0x144 (unreliable)
[ 1028.839814] [f1051da0] [c0d83b74] __schedule+0x378/0x890
[ 1028.839923] [f1051df0] [c0d840f4] schedule+0x68/0x118
[ 1028.839973] [f1051e10] [c0d8b538] schedule_timeout+0xb0/0x17c
[ 1028.840016] [f1051e50] [c00d4f08] rcu_gp_fqs_loop+0x4cc/0x6ac
[ 1028.840058] [f1051eb0] [c00d8720] rcu_gp_kthread+0x238/0x284
[ 1028.840099] [f1051f00] [c008a630] kthread+0xfc/0x114
[ 1028.840143] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
[ 1028.840253] rcu: Stack dump where RCU GP kthread last ran:
[ 1028.840518] CPU: 0 PID: 6055 Comm: gcc Not tainted 6.1.41-yocto-standard #1
[ 1028.840672] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 1028.840744] NIP: a7958e50 LR: a794c78c CTR: 00000000
[ 1028.840796] REGS: f40c9f40 TRAP: 0900 Not tainted (6.1.41-yocto-standard)
[ 1028.840871] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 24048289 XER: 20000000
[ 1028.841075]
GPR00: a794c768 af8bc6c0 00000000 a792a000 a795dd60 00000014 00000000 00000001
GPR08: a795dd60 00000000 a792a000 676c6962 00000005 1016c320 fffffffd 10000d44
GPR16: a7985c68 af8bcc80 00000000 a79831fc a7983ff0 00000000 a7985c18 00000000
GPR24: ffffffff 10001379 0000a9ff a7983ff0 a79859cc a792a000 a7984ff0 a795dd60
[ 1028.841520] NIP [a7958e50] 0xa7958e50
[ 1028.841880] LR [a794c78c] 0xa794c78c
[ 1028.841943] Call Trace:
[ 1028.842130] CPU: 0 PID: 6055 Comm: gcc Not tainted 6.1.41-yocto-standard #1
[ 1028.842172] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 1028.842243] NIP: a7958e50 LR: a794c78c CTR: 00000000
[ 1028.842324] REGS: f40c9f40 TRAP: 0900 Not tainted (6.1.41-yocto-standard)
[ 1028.842355] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 24048289 XER: 20000000
[ 1028.842445]
GPR00: a794c768 af8bc6c0 00000000 a792a000 a795dd60 00000014 00000000 00000001
GPR08: a795dd60 00000000 a792a000 676c6962 00000005 1016c320 fffffffd 10000d44
GPR16: a7985c68 af8bcc80 00000000 a79831fc a7983ff0 00000000 a7985c18 00000000
GPR24: ffffffff 10001379 0000a9ff a7983ff0 a79859cc a792a000 a7984ff0 a795dd60
[ 1028.842740] NIP [a7958e50] 0xa7958e50
[ 1028.842772] LR [a794c78c] 0xa794c78c
[ 1028.842799] Call Trace:
[ 1484.142243] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1484.162020] rcu: 0-...!: (1 ticks this GP) idle=9ac4/1/0x40000004 softirq=191787/191787 fqs=0
[ 1484.162517] (t=5835 jiffies g=481021 q=5 ncpus=1)
[ 1484.162814] rcu: rcu_preempt kthread timer wakeup didn't happen for 5834 jiffies! g481021 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 1484.162881] rcu: Possible timer handling issue on cpu=0 timer-softirq=134035
[ 1484.162971] rcu: rcu_preempt kthread starved for 5835 jiffies! g481021 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 1484.163019] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 1484.163052] rcu: RCU grace-period kthread stack dump:
[ 1484.163158] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
[ 1484.164006] Call Trace:
[ 1484.164080] [f1051cd0] [c0064274] irq_exit+0x20/0x144 (unreliable)
[ 1484.164883] [f1051da0] [c0d83b74] __schedule+0x378/0x890
[ 1484.164988] [f1051df0] [c0d840f4] schedule+0x68/0x118
[ 1484.165038] [f1051e10] [c0d8b538] schedule_timeout+0xb0/0x17c
[ 1484.165082] [f1051e50] [c00d4f08] rcu_gp_fqs_loop+0x4cc/0x6ac
[ 1484.165128] [f1051eb0] [c00d8720] rcu_gp_kthread+0x238/0x284
[ 1484.165170] [f1051f00] [c008a630] kthread+0xfc/0x114
[ 1484.165213] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
[ 1484.165341] rcu: Stack dump where RCU GP kthread last ran:
[ 1484.165562] CPU: 0 PID: 10034 Comm: sed Not tainted 6.1.41-yocto-standard #1
[ 1484.165748] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 1484.165815] NIP: c0d8c874 LR: c0d8c864 CTR: c003d7dc
[ 1484.165863] REGS: f1001ec0 TRAP: 0900 Not tainted (6.1.41-yocto-standard)
[ 1484.165935] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 44bb4f32 XER: 00000000
[ 1484.166135]
GPR00: c0d8c81c f1001f80 c2836c00 00000000 00000001 55570190 00000000 00000000
GPR08: 00000000 00009032 00000100 fff583a2 24bb5f32 1016c320 c157abf8 c16c0000
GPR16: 00046d88 c100a5d4 00000000 0000000a c1575840 c100a620 c0f95bf8 c1583570
GPR24: c16f70a0 afaa8ef8 00000000 c16adae0 00000002 00000008 00000000 f1001ff0
[ 1484.166591] NIP [c0d8c874] __do_softirq+0xfc/0x394
[ 1484.166686] LR [c0d8c864] __do_softirq+0xec/0x394
[ 1484.166731] Call Trace:
[ 1484.166756] [f1001f80] [c0d8c81c] __do_softirq+0xa4/0x394 (unreliable)
[ 1484.166810] [f1001ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
[ 1484.166861] [f4bb5ef0] [a7c06380] 0xa7c06380
[ 1484.167183] [f4bb5f10] [c0064338] irq_exit+0xe4/0x144
[ 1484.167229] [f4bb5f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
[ 1484.167347] --- interrupt: 500 at 0xa7bc44e4
[ 1484.167391] NIP: a7bc44e4 LR: a7bc4b3c CTR: 00000000
[ 1484.167420] REGS: f4bb5f40 TRAP: 0500 Not tainted (6.1.41-yocto-standard)
[ 1484.167448] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 84042280 XER: 00000000
[ 1484.167541]
GPR00: 00000000 afaa8dc0 00000000 00000003 fffffffc 70000027 6fffffff a7c063a0
GPR08: 0fdc0000 6ffffffc 0ffed394 40ad88ea 40ad85b7 1016c320 00000003 00000003
GPR16: 00000000 afaa8d08 00000006 a7c08ff0 afaa8f30 0023a550 a7c06360 10000fdd
GPR24: 00000003 afaa8ef8 00000000 afaa8cd0 00000002 a7c06380 a7c09ff0 afaa8dc0
[ 1484.167856] NIP [a7bc44e4] 0xa7bc44e4
[ 1484.167887] LR [a7bc4b3c] 0xa7bc4b3c
[ 1484.167924] --- interrupt: 500
[ 1484.167976] Instruction dump:
[ 1484.168126] 3b1870a0 3ad65bf8 3af73570 3b7bdae0 3a60000a 3a400000 7e238b78 4bff6f01
[ 1484.168249] 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
[ 1484.168576] CPU: 0 PID: 10034 Comm: sed Not tainted 6.1.41-yocto-standard #1
[ 1484.168619] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 1484.168644] NIP: c0d8c874 LR: c0d8c864 CTR: c003d7dc
[ 1484.168669] REGS: f1001ec0 TRAP: 0900 Not tainted (6.1.41-yocto-standard)
[ 1484.168698] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 44bb4f32 XER: 00000000
[ 1484.168815]
GPR00: c0d8c81c f1001f80 c2836c00 00000000 00000001 55570190 00000000 00000000
GPR08: 00000000 00009032 00000100 fff583a2 24bb5f32 1016c320 c157abf8 c16c0000
GPR16: 00046d88 c100a5d4 00000000 0000000a c1575840 c100a620 c0f95bf8 c1583570
GPR24: c16f70a0 afaa8ef8 00000000 c16adae0 00000002 00000008 00000000 f1001ff0
[ 1484.169112] NIP [c0d8c874] __do_softirq+0xfc/0x394
[ 1484.169148] LR [c0d8c864] __do_softirq+0xec/0x394
[ 1484.169178] Call Trace:
[ 1484.169199] [f1001f80] [c0d8c81c] __do_softirq+0xa4/0x394 (unreliable)
[ 1484.169246] [f1001ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
[ 1484.169346] [f4bb5ef0] [a7c06380] 0xa7c06380
[ 1484.169386] [f4bb5f10] [c0064338] irq_exit+0xe4/0x144
[ 1484.169425] [f4bb5f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
[ 1484.169464] --- interrupt: 500 at 0xa7bc44e4
[ 1484.169492] NIP: a7bc44e4 LR: a7bc4b3c CTR: 00000000
[ 1484.169516] REGS: f4bb5f40 TRAP: 0500 Not tainted (6.1.41-yocto-standard)
[ 1484.169541] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 84042280 XER: 00000000
[ 1484.169650]
GPR00: 00000000 afaa8dc0 00000000 00000003 fffffffc 70000027 6fffffff a7c063a0
GPR08: 0fdc0000 6ffffffc 0ffed394 40ad88ea 40ad85b7 1016c320 00000003 00000003
GPR16: 00000000 afaa8d08 00000006 a7c08ff0 afaa8f30 0023a550 a7c06360 10000fdd
GPR24: 00000003 afaa8ef8 00000000 afaa8cd0 00000002 a7c06380 a7c09ff0 afaa8dc0
[ 1484.169943] NIP [a7bc44e4] 0xa7bc44e4
[ 1484.169973] LR [a7bc4b3c] 0xa7bc4b3c
[ 1484.169998] --- interrupt: 500
[ 1484.170020] Instruction dump:
[ 1484.170047] 3b1870a0 3ad65bf8 3af73570 3b7bdae0 3a60000a 3a400000 7e238b78 4bff6f01
[ 1484.170130] 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
[ 2151.331458] systemd-journald[113]: Time jumped backwards, rotating.
[ 3642.773250] hellomod: loading out-of-tree module taints kernel.
[ 3642.930969] Hello world!
[ 3657.747385] Cleaning up hellomod.
diff --git a/meta/recipes-kernel/linux/linux-yocto_6.1.bb b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
index a26851a4620..c8e67560a85 100644
--- a/meta/recipes-kernel/linux/linux-yocto_6.1.bb
+++ b/meta/recipes-kernel/linux/linux-yocto_6.1.bb
@@ -22,8 +22,10 @@ SRCREV_machine:qemuarm ?= "4e49d63e747e81aebad5ce6091ba6de09f09d46f"
SRCREV_machine:qemuarm64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemuloongarch64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemumips ?= "e527feb9cd8acbcbcd7115f51cf71166fdbce11a"
-#SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
-SRCREV_machine:qemuppc ?= "b110cf9bbc395fe757956839d8110e72368699f4"
+#SRCREV_machine:qemuppc ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2" 6.1.46
+SRCREV_machine:qemuppc ?= "e6b254abfbb16492998e6bd355302b47d0080b76"
+# 6.1.41
+#SRCREV_machine:qemuppc ?= "b110cf9bbc395fe757956839d8110e72368699f4" 6.1.38
SRCREV_machine:qemuriscv64 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemuriscv32 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
SRCREV_machine:qemux86 ?= "44fd0c7a5a7955282a1ab24bf3dcdee068839ad2"
@@ -47,7 +49,7 @@ SRC_URI += "file://0001-perf-cpumap-Make-counter-as-unsigned-ints.patch"
LIC_FILES_CHKSUM = "file://COPYING;md5=6bc538ed5bd9a7fc9398086aedcd7e46"
LINUX_VERSION ?= "6.1.46"
-LINUX_VERSION:qemuppc ?= "6.1.38"
+LINUX_VERSION:qemuppc ?= "6.1.41"
PV = "${LINUX_VERSION}+git"
so between 6.1.38 and 6.1.41 ?
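The narrowing so far can be written down as a small sketch. The helper name and the "good"/"bad" labels are only illustrative, and it assumes the regression is monotonic (every release after the first bad one is also bad), which an intermittent stall does not guarantee:

```python
# Hypothetical helper, not part of any Yocto or git tooling.
def narrow(versions, results):
    """Return (last_good, first_bad) for an ordered list of versions.

    `results` maps a version string to "good" or "bad"; untested
    versions are simply absent. The scan stops at the first bad one.
    """
    last_good = None
    first_bad = None
    for version in versions:
        verdict = results.get(version)
        if verdict == "bad":
            first_bad = version
            break
        if verdict == "good":
            last_good = version
    return last_good, first_bad


# Stable releases 6.1.38 .. 6.1.46, with the verdicts reported so far.
versions = [f"6.1.{n}" for n in range(38, 47)]
results = {"6.1.38": "good", "6.1.46": "bad", "6.1.41": "bad"}

print(narrow(versions, results))
```

With those three data points the window is 6.1.38 (last good) to 6.1.41 (first bad), leaving 6.1.39 and 6.1.40 untested.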
Cheers,
Richard
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DAFEBFB5EB0D2.24073@lists.openembedded.org>
@ 2023-08-22 11:47 ` Richard Purdie
2023-08-22 12:20 ` Mikko Rapeli
0 siblings, 1 reply; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 11:47 UTC (permalink / raw)
To: Paul Gortmaker, Rasmus Villemoes; +Cc: openembedded-core, Bruce Ashfield
On Tue, 2023-08-22 at 12:07 +0100, Richard Purdie via
lists.openembedded.org wrote:
> so between 6.1.38 and 6.1.41 ?
[ 0.000000] Linux version 6.1.39-yocto-standard (oe-user@oe-host) (powerpc-poky-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41.0.20230731) #1 PREEMPT Wed Jul 19 14:22:18 UTC 2023
[ 14.971050] systemd[1]: Started Journal Service.
[ 16.158584] systemd-journald[113]: Received client request to flush runtime journal.
[ 532.519478] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 532.530290] rcu: 0-...!: (1 ticks this GP) idle=4f2c/1/0x40000002 softirq=70544/70544 fqs=0
[ 532.530929] (t=6170 jiffies g=170793 q=3 ncpus=1)
[ 532.531235] rcu: rcu_preempt kthread timer wakeup didn't happen for 6169 jiffies! g170793 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 532.531294] rcu: Possible timer handling issue on cpu=0 timer-softirq=49025
[ 532.531582] rcu: rcu_preempt kthread starved for 6170 jiffies! g170793 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 532.531634] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 532.531670] rcu: RCU grace-period kthread stack dump:
[ 532.531783] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
[ 532.533233] Call Trace:
[ 532.533499] [f1051cd0] [5ea7c162] 0x5ea7c162 (unreliable)
[ 532.534406] [f1051da0] [c0d83234] __schedule+0x378/0x890
[ 532.535228] [f1051df0] [c0d837b4] schedule+0x68/0x118
[ 532.535313] [f1051e10] [c0d8abf8] schedule_timeout+0xb0/0x17c
[ 532.535361] [f1051e50] [c00d4d08] rcu_gp_fqs_loop+0x4cc/0x6ac
[ 532.535405] [f1051eb0] [c00d8520] rcu_gp_kthread+0x238/0x284
[ 532.535446] [f1051f00] [c008a434] kthread+0xfc/0x114
[ 532.535490] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
[ 532.535650] rcu: Stack dump where RCU GP kthread last ran:
[ 532.535868] CPU: 0 PID: 3056 Comm: touch Not tainted 6.1.39-yocto-standard #1
[ 532.536025] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 532.536294] NIP: 100145d0 LR: 100011cc CTR: 00000000
[ 532.536347] REGS: f38bdf40 TRAP: 0900 Not tainted (6.1.39-yocto-standard)
[ 532.536422] MSR: 0200f932 <VEC,EE,PR,FP,ME,IR,DR,RI> CR: 22002460 XER: 00000000
[ 532.536737]
GPR00: 100011c4 af9ce570 a7d872a0 100150ec 0ff6e850 65000000 65000000 fefefeff
GPR08: 00000003 0ff6e840 00000003 65000000 42002282 10038058 00000000 10164888
GPR16: 10160000 00000000 101e5eb0 00000008 10160000 10001120 00000000 10016998
GPR24: 100151f4 100168d0 00000000 10030000 10016920 00000002 af9ce904 100150ec
[ 532.537150] NIP [100145d0] 0x100145d0
[ 532.537204] LR [100011cc] 0x100011cc
[ 532.537246] Call Trace:
[ 532.537443] CPU: 0 PID: 3056 Comm: touch Not tainted 6.1.39-yocto-standard #1
[ 532.537487] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
[ 532.537597] NIP: 100145d0 LR: 100011cc CTR: 00000000
[ 532.537625] REGS: f38bdf40 TRAP: 0900 Not tainted (6.1.39-yocto-standard)
[ 532.537653] MSR: 0200f932 <VEC,EE,PR,FP,ME,IR,DR,RI> CR: 22002460 XER: 00000000
[ 532.537764]
GPR00: 100011c4 af9ce570 a7d872a0 100150ec 0ff6e850 65000000 65000000 fefefeff
GPR08: 00000003 0ff6e840 00000003 65000000 42002282 10038058 00000000 10164888
GPR16: 10160000 00000000 101e5eb0 00000008 10160000 10001120 00000000 10016998
GPR24: 100151f4 100168d0 00000000 10030000 10016920 00000002 af9ce904 100150ec
[ 532.538092] NIP [100145d0] 0x100145d0
[ 532.538125] LR [100011cc] 0x100011cc
[ 532.538153] Call Trace:
SRCREV_machine:qemuppc ?= "a456e17438819ed77f63d16926f96101ca215f09"
LINUX_VERSION:qemuppc ?= "6.1.39"
so between 6.1.38 and 6.1.39?
Cheers,
Richard
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-22 11:47 ` Richard Purdie
@ 2023-08-22 12:20 ` Mikko Rapeli
2023-08-22 12:28 ` Richard Purdie
[not found] ` <177DB4530EBE3FA3.24073@lists.openembedded.org>
0 siblings, 2 replies; 23+ messages in thread
From: Mikko Rapeli @ 2023-08-22 12:20 UTC (permalink / raw)
To: Richard Purdie
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
Hi,
On Tue, Aug 22, 2023 at 12:47:04PM +0100, Richard Purdie wrote:
> so between 6.1.38 and 6.1.39?
Maybe:
commit b1cdc56bc177c2e182c204bb08ad4e87bfd67942
Author: Paul E. McKenney <paulmck@kernel.org>
AuthorDate: Wed Apr 26 11:11:29 2023 -0700
Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CommitDate: Wed Jul 19 16:21:01 2023 +0200
rcu-tasks: Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs
and
commit d58f0f0ce6332ffeb406540295cc49732c26fb51
Author: Paul E. McKenney <paulmck@kernel.org>
AuthorDate: Thu Apr 27 10:50:47 2023 -0700
Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CommitDate: Wed Jul 19 16:21:01 2023 +0200
rcu: Make rcu_cpu_starting() rely on interrupts being disabled
?
The master branch seems to have a larger set of changes to rcu.
Maybe locking debugging options could help to find this on every boot.
Then wasn't
commit 77cc52f1b8d76c995648cb4286e57142cac8ce0a
Author: Wen Yang <wenyang.linux@foxmail.com>
AuthorDate: Fri May 5 00:12:53 2023 +0800
Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CommitDate: Wed Jul 19 16:20:59 2023 +0200
tick/rcu: Fix bogus ratelimit condition
[ Upstream commit a7e282c77785c7eabf98836431b1f029481085ad ]
causing some issues too?
Cheers,
-Mikko
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-22 12:20 ` Mikko Rapeli
@ 2023-08-22 12:28 ` Richard Purdie
2023-08-22 12:31 ` Alexander Kanavin
[not found] ` <177DB4530EBE3FA3.24073@lists.openembedded.org>
1 sibling, 1 reply; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 12:28 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Tue, 2023-08-22 at 15:20 +0300, Mikko Rapeli wrote:
> Hi,
>
> On Tue, Aug 22, 2023 at 12:47:04PM +0100, Richard Purdie wrote:
> > so between 6.1.38 and 6.1.39?
>
> Maybe:
>
> commit b1cdc56bc177c2e182c204bb08ad4e87bfd67942
> Author: Paul E. McKenney <paulmck@kernel.org>
> AuthorDate: Wed Apr 26 11:11:29 2023 -0700
> Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> CommitDate: Wed Jul 19 16:21:01 2023 +0200
>
> rcu-tasks: Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs
>
> and
>
> commit d58f0f0ce6332ffeb406540295cc49732c26fb51
> Author: Paul E. McKenney <paulmck@kernel.org>
> AuthorDate: Thu Apr 27 10:50:47 2023 -0700
> Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> CommitDate: Wed Jul 19 16:21:01 2023 +0200
>
> rcu: Make rcu_cpu_starting() rely on interrupts being disabled
>
> ?
>
> The master branch seems to have a larger set of changes to rcu.
I wondered about that, but my test says
deda0761dc6161f03278da4679d96d4727992e91 is "good", which is after
those.
> Maybe locking debugging options could help to find this on every boot.
Perhaps. I think given where I'm at now I'll just try and bisect it...
> Then wasn't
>
> commit 77cc52f1b8d76c995648cb4286e57142cac8ce0a
> Author: Wen Yang <wenyang.linux@foxmail.com>
> AuthorDate: Fri May 5 00:12:53 2023 +0800
> Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> CommitDate: Wed Jul 19 16:20:59 2023 +0200
>
> tick/rcu: Fix bogus ratelimit condition
>
> [ Upstream commit a7e282c77785c7eabf98836431b1f029481085ad ]
>
> causing some issues too?
Yes, but I think this is something different...
Cheers,
Richard
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-22 12:28 ` Richard Purdie
@ 2023-08-22 12:31 ` Alexander Kanavin
0 siblings, 0 replies; 23+ messages in thread
From: Alexander Kanavin @ 2023-08-22 12:31 UTC (permalink / raw)
To: Richard Purdie
Cc: Mikko Rapeli, Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
Maybe this is stating the obvious, but please avoid the temptation to
shortcut a bisect once it's been started. There have been so many times
I guessed wrong about which commit in the remaining set was the
offending one, all the way to the end. Still having the issue after a
revert is one of the most frustrating things.
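One way to keep a bisect honest is to let `git bisect run` drive it with a script that turns each test run into an exit code (0 for good, 125 for "cannot test, skip", any other code from 1 to 127 for bad). The following is only a sketch: the function name is made up, it assumes the console log of each boot has already been captured as text, and the marker strings are the ones seen in the logs earlier in this thread:

```python
# Marker strings as they appear in the kernel logs quoted in this thread.
STALL_MARKER = "rcu_preempt self-detected stall"
BOOT_MARKER = "Linux version"


def bisect_exit_code(log_text):
    """Map a captured console log to an exit code for `git bisect run`."""
    if BOOT_MARKER not in log_text:
        # The kernel never booted, so this commit cannot be judged: skip it.
        return 125
    if STALL_MARKER in log_text:
        # The stall was observed: mark the commit bad.
        return 1
    # No stall seen. For an intermittent problem this is only as
    # trustworthy as the length and number of the test runs.
    return 0


sample = (
    "[ 0.000000] Linux version 6.1.41-yocto-standard\n"
    "[ 1028.817827] rcu: INFO: rcu_preempt self-detected stall on CPU\n"
)
print(bisect_exit_code(sample))
```

A wrapper script would build and boot the image, capture the log, and exit with the returned code. That part depends on the local setup and is not shown.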
Alex
On Tue, 22 Aug 2023 at 14:28, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Tue, 2023-08-22 at 15:20 +0300, Mikko Rapeli wrote:
> > Hi,
> >
> > On Tue, Aug 22, 2023 at 12:47:04PM +0100, Richard Purdie wrote:
> > > so between 6.1.38 and 6.1.39?
> >
> > Maybe:
> >
> > commit b1cdc56bc177c2e182c204bb08ad4e87bfd67942
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > AuthorDate: Wed Apr 26 11:11:29 2023 -0700
> > Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > CommitDate: Wed Jul 19 16:21:01 2023 +0200
> >
> > rcu-tasks: Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs
> >
> > and
> >
> > commit d58f0f0ce6332ffeb406540295cc49732c26fb51
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > AuthorDate: Thu Apr 27 10:50:47 2023 -0700
> > Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > CommitDate: Wed Jul 19 16:21:01 2023 +0200
> >
> > rcu: Make rcu_cpu_starting() rely on interrupts being disabled
> >
> > ?
> >
> > The master branch seems to have a larger set of changes to rcu.
>
> I wondered about that, but my test says
> deda0761dc6161f03278da4679d96d4727992e91 is "good", which is after
> those.
>
> > Maybe locking debugging options could help to find this on every boot.
>
> Perhaps. I think given where I'm at now I'll just try and bisect it...
>
> > Then wasn't
> >
> > commit 77cc52f1b8d76c995648cb4286e57142cac8ce0a
> > Author: Wen Yang <wenyang.linux@foxmail.com>
> > AuthorDate: Fri May 5 00:12:53 2023 +0800
> > Commit: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > CommitDate: Wed Jul 19 16:20:59 2023 +0200
> >
> > tick/rcu: Fix bogus ratelimit condition
> >
> > [ Upstream commit a7e282c77785c7eabf98836431b1f029481085ad ]
> >
> > causing some issues too?
>
> Yes, but I think this is something different...
>
> Cheers,
>
> Richard
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DB4530EBE3FA3.24073@lists.openembedded.org>
@ 2023-08-22 14:49 ` Richard Purdie
[not found] ` <177DBC07E94591CC.4797@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 14:49 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
03b2c470a136a83a9961a2a855cde59498361598 shows as broken
deda0761dc6161f03278da4679d96d4727992e91 doesn't seem to break
but this doesn't seem to make any sense as the changes are:
kernel-source$ git diff 03b2c470a136a83a9961a2a855cde59498361598 deda0761dc6161f03278da4679d96d4727992e91 | diffstat
arch/arm/boot/dts/iwg20d-q7-common.dtsi | 2 +-
arch/arm/boot/dts/meson8.dtsi | 4 ++--
arch/arm/boot/dts/qcom-apq8074-dragonboard.dts | 4 ----
arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi | 2 +-
arch/arm/mach-ep93xx/timer-ep93xx.c | 3 +--
arch/arm/mach-omap2/board-generic.c | 1 -
arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi | 4 ----
arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts | 4 ++--
arch/arm64/boot/dts/qcom/pm7250b.dtsi | 1 -
arch/arm64/boot/dts/renesas/ulcb-kf.dtsi | 3 ++-
arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts | 28 ++++++++++++++--------------
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 2 +-
drivers/infiniband/hw/hfi1/ipoib_tx.c | 4 ++--
drivers/infiniband/hw/hfi1/mmu_rb.c | 101 ++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------
drivers/infiniband/hw/hfi1/mmu_rb.h | 3 ---
drivers/infiniband/hw/hfi1/sdma.c | 23 +++++------------------
drivers/infiniband/hw/hfi1/sdma.h | 47 +++++++++++++++--------------------------------
drivers/infiniband/hw/hfi1/sdma_txreq.h | 2 --
drivers/infiniband/hw/hfi1/user_sdma.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------
drivers/infiniband/hw/hfi1/user_sdma.h | 1 +
drivers/infiniband/hw/hfi1/vnic_sdma.c | 4 ++--
drivers/infiniband/hw/hns/hns_roce_hem.c | 7 +++----
drivers/infiniband/hw/irdma/uk.c | 10 ++++------
drivers/input/misc/pm8941-pwrkey.c | 19 ++++---------------
drivers/memory/brcmstb_dpfe.c | 4 +---
drivers/soc/fsl/qe/Kconfig | 1 -
drivers/video/fbdev/omap/lcd_mipid.c | 6 +-----
sound/soc/codecs/es8316.c | 23 +++++++++--------------
28 files changed, 191 insertions(+), 259 deletions(-)
03b2c470a136a83a9961a2a855cde59498361598 Input: pm8941-powerkey - fix debounce on gen2+ PMICs
421ce97657a84b81ce2cb915e75037e1a356736a arm64: dts: ti: k3-j7200: Fix physical address of pin
3b4c21804076e461a6453ee4d09872172336aa1d fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
52b04ac85f5f4b485bf658101e464143225e68f9 drm/msm/dpu: set DSC flush bit correctly at MDP CTL flush register
6878bdd7571827babc1c4c1ff66ea1affe951020 arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1
5d14292dba9554881a137039c048a67ddf321395 ARM: dts: iwg20d-q7-common: Fix backlight pwm specifier
766e0b6f4c9649f126e59c06f100b8581d0773b8 RDMA/hns: Fix hns_roce_table_get return value
b99395ab605fb0570d1e62c9459425ac6fc58d46 IB/hfi1: Fix wrong mmu_node used for user SDMA packet after invalidate
ebec507398e11b1c25ce9fb05fb509878233051c RDMA/irdma: avoid fortify-string warning in irdma_clr_wqes
750f0a302a10dc2327a6656860a22b7da7251cea soc/fsl/qe: fix usb.c build errors
b2194d7dfc95a404990da73ebd394f6f6946a4a0 ARM: dts: meson8: correct uart_B and uart_C clock references
863054be8d4d2c9b38985371166a37c0e14111e1 ASoC: es8316: Do not set rate constraints for unsupported MCLKs
3b575d93020f20f6a711efd8c69bfed26837d694 ASoC: es8316: Increment max value for ALC Capture Target Volume control
c02f27c2950abfed58bd8aa6bf50e79d9cc1fc77 ARM: dts: qcom: apq8074-dragonboard: Set DMA as remotely controlled
9f79e638d45100dad43c56ad9b47eaff1b98fe9a memory: brcmstb_dpfe: fix testing array offset after use
09722ac9f1e557ea65098202d0b18336c3f04420 ARM: dts: stm32: Shorten the AV96 HDMI sound card name
666be7fef4d39d67c041460e29ddf2b875101133 arm64: dts: mediatek: mt8183: Add mediatek,broken-save-restore-fw to kukui
1bdb9751b4c64f5709cb3608fd93a709c4e9b2e1 arm64: dts: qcom: apq8096: fix fixed regulator name property
75c019119ebcc919e717fbce5552c2a0908405cf arm64: dts: qcom: pm7250b: add missing spmi-vadc include
c63997426da6f24f33ae6caf2423170d5bb80ebb ARM: omap2: fix missing tick_broadcast() prototype
e91ffbd6553348cdd3d04b263f8207d919681fac ARM: ep93xx: fix missing-prototype warnings
and I can't see how any of that is compiled into qemuppc. Am I missing something?
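As a rough first pass at that question, the changed paths can be filtered by architecture directory. This is a sketch only: the function name is made up, the path list is a subset of the diffstat above, and anything outside arch/ would still need checking against the kernel .config actually used for qemuppc:

```python
def maybe_built_for(paths, arch="powerpc"):
    """Drop paths under arch/<other-arch>/ and keep everything else.

    A kept path is only a candidate; whether it is really compiled
    depends on the kernel configuration, which this does not inspect.
    """
    kept = []
    for path in paths:
        parts = path.split("/")
        if parts[0] == "arch" and len(parts) > 1 and parts[1] != arch:
            continue
        kept.append(path)
    return kept


# A few of the files from the diffstat above.
changed = [
    "arch/arm/boot/dts/meson8.dtsi",
    "arch/arm64/boot/dts/qcom/pm7250b.dtsi",
    "drivers/infiniband/hw/hfi1/sdma.c",
    "drivers/soc/fsl/qe/Kconfig",
    "sound/soc/codecs/es8316.c",
]

for path in maybe_built_for(changed):
    print(path)
```

This removes the arch/arm and arch/arm64 entries outright. The drivers/ and sound/ entries stay on the list until the .config rules them out.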
Cheers,
Richard
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DBC07E94591CC.4797@lists.openembedded.org>
@ 2023-08-22 21:08 ` Richard Purdie
[not found] ` <177DD0B30D8FEDF8.27837@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 21:08 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Tue, 2023-08-22 at 15:49 +0100, Richard Purdie via
lists.openembedded.org wrote:
> 03b2c470a136a83a9961a2a855cde59498361598 shows as broken
>
> deda0761dc6161f03278da4679d96d4727992e91 doesn't seem to break
>
> but this doesn't seem to make any sense as the changes are:
>
> kernel-source$ git diff 03b2c470a136a83a9961a2a855cde59498361598 deda0761dc6161f03278da4679d96d4727992e91 | diffstat
> arch/arm/boot/dts/iwg20d-q7-common.dtsi | 2 +-
> arch/arm/boot/dts/meson8.dtsi | 4 ++--
> arch/arm/boot/dts/qcom-apq8074-dragonboard.dts | 4 ----
> arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi | 2 +-
> arch/arm/mach-ep93xx/timer-ep93xx.c | 3 +--
> arch/arm/mach-omap2/board-generic.c | 1 -
> arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi | 4 ----
> arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts | 4 ++--
> arch/arm64/boot/dts/qcom/pm7250b.dtsi | 1 -
> arch/arm64/boot/dts/renesas/ulcb-kf.dtsi | 3 ++-
> arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts | 28 ++++++++++++++--------------
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 2 +-
> drivers/infiniband/hw/hfi1/ipoib_tx.c | 4 ++--
> drivers/infiniband/hw/hfi1/mmu_rb.c | 101 ++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------
> drivers/infiniband/hw/hfi1/mmu_rb.h | 3 ---
> drivers/infiniband/hw/hfi1/sdma.c | 23 +++++------------------
> drivers/infiniband/hw/hfi1/sdma.h | 47 +++++++++++++++--------------------------------
> drivers/infiniband/hw/hfi1/sdma_txreq.h | 2 --
> drivers/infiniband/hw/hfi1/user_sdma.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------
> drivers/infiniband/hw/hfi1/user_sdma.h | 1 +
> drivers/infiniband/hw/hfi1/vnic_sdma.c | 4 ++--
> drivers/infiniband/hw/hns/hns_roce_hem.c | 7 +++----
> drivers/infiniband/hw/irdma/uk.c | 10 ++++------
> drivers/input/misc/pm8941-pwrkey.c | 19 ++++---------------
> drivers/memory/brcmstb_dpfe.c | 4 +---
> drivers/soc/fsl/qe/Kconfig | 1 -
> drivers/video/fbdev/omap/lcd_mipid.c | 6 +-----
> sound/soc/codecs/es8316.c | 23 +++++++++--------------
> 28 files changed, 191 insertions(+), 259 deletions(-)
>
> 03b2c470a136a83a9961a2a855cde59498361598 Input: pm8941-powerkey - fix debounce on gen2+ PMICs
> 421ce97657a84b81ce2cb915e75037e1a356736a arm64: dts: ti: k3-j7200: Fix physical address of pin
> 3b4c21804076e461a6453ee4d09872172336aa1d fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
> 52b04ac85f5f4b485bf658101e464143225e68f9 drm/msm/dpu: set DSC flush bit correctly at MDP CTL flush register
> 6878bdd7571827babc1c4c1ff66ea1affe951020 arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1
> 5d14292dba9554881a137039c048a67ddf321395 ARM: dts: iwg20d-q7-common: Fix backlight pwm specifier
> 766e0b6f4c9649f126e59c06f100b8581d0773b8 RDMA/hns: Fix hns_roce_table_get return value
> b99395ab605fb0570d1e62c9459425ac6fc58d46 IB/hfi1: Fix wrong mmu_node used for user SDMA packet after invalidate
> ebec507398e11b1c25ce9fb05fb509878233051c RDMA/irdma: avoid fortify-string warning in irdma_clr_wqes
> 750f0a302a10dc2327a6656860a22b7da7251cea soc/fsl/qe: fix usb.c build errors
> b2194d7dfc95a404990da73ebd394f6f6946a4a0 ARM: dts: meson8: correct uart_B and uart_C clock references
>
> 863054be8d4d2c9b38985371166a37c0e14111e1 ASoC: es8316: Do not set rate constraints for unsupported MCLKs
> 3b575d93020f20f6a711efd8c69bfed26837d694 ASoC: es8316: Increment max value for ALC Capture Target Volume control
> c02f27c2950abfed58bd8aa6bf50e79d9cc1fc77 ARM: dts: qcom: apq8074-dragonboard: Set DMA as remotely controlled
> 9f79e638d45100dad43c56ad9b47eaff1b98fe9a memory: brcmstb_dpfe: fix testing array offset after use
> 09722ac9f1e557ea65098202d0b18336c3f04420 ARM: dts: stm32: Shorten the AV96 HDMI sound card name
> 666be7fef4d39d67c041460e29ddf2b875101133 arm64: dts: mediatek: mt8183: Add mediatek,broken-save-restore-fw to kukui
> 1bdb9751b4c64f5709cb3608fd93a709c4e9b2e1 arm64: dts: qcom: apq8096: fix fixed regulator name property
> 75c019119ebcc919e717fbce5552c2a0908405cf arm64: dts: qcom: pm7250b: add missing spmi-vadc include
> c63997426da6f24f33ae6caf2423170d5bb80ebb ARM: omap2: fix missing tick_broadcast() prototype
> e91ffbd6553348cdd3d04b263f8207d919681fac ARM: ep93xx: fix missing-prototype warnings
>
> and I can't see how any of that is compiled into qemuppc. Am I missing something?
After banging my head against this for hours, I'm not really any
further forward. With commits prior to
deda0761dc6161f03278da4679d96d4727992e91 I can't seem to trigger rcu
stalls. I have seen them on 863054be8d4d2c9b38985371166a37c0e14111e1
which "isolates" it to the 10 commits above. They're arm or sound or
memory devices we don't build afaict.
I did cut kernel-devsrc, lttng-tools, perf and similar from the image
to reduce rebuild times a bit, and the rcu stalls still appear with
them missing. I also cut the systemd, dnf and dnf_runtime tests.
I have a suspicion that the rcu stalls are "always" there and the
emulation speed is marginal so some code patterns trigger it, some
don't. On a loaded autobuilder, it tips the balance to more stalls. The
more the rcu stalls trigger, the more likely an OOM situation is and
perhaps we just get unlucky on some loads?
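If the stalls really are probabilistic, a single clean run says little about a commit. As a back-of-the-envelope sketch (the per-run stall probabilities below are made-up illustrations, not measurements, and runs are assumed independent), the number of clean runs needed before trusting a "good" verdict is the smallest n with (1 - p)^n <= 1 - confidence:

```python
import math


def runs_needed(p_stall, confidence=0.95):
    """Clean runs needed before "no stall seen" is believed.

    p_stall is the assumed chance that a bad kernel shows a stall in
    any one run.
    """
    if not 0.0 < p_stall < 1.0:
        raise ValueError("p_stall must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_stall))


# Hypothetical stall probabilities per test run.
for p in (0.5, 0.2, 0.1):
    print(p, runs_needed(p))
```

Even at a 50% chance per run this asks for 5 clean runs, and at 10% it asks for 29, which at 30 minutes or more per run would explain why a bisect can point at commits that make no sense.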
Whilst I can see the rcu stalls locally, I've not had the patience/time
to see any hung QA test. I have let a few runs go through to completion
but not all; I've been assuming that if configure passes, we wouldn't
see anything interesting later.
Cheers,
Richard
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DD0B30D8FEDF8.27837@lists.openembedded.org>
@ 2023-08-22 22:01 ` Richard Purdie
[not found] ` <177DD39B5534099F.27837@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-22 22:01 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Tue, 2023-08-22 at 22:08 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Tue, 2023-08-22 at 15:49 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > 03b2c470a136a83a9961a2a855cde59498361598 shows as broken
> >
> > deda0761dc6161f03278da4679d96d4727992e91 doesn't seem to break
> >
> > but this doesn't seem to make any sense as the changes are:
> >
> > kernel-source$ git diff 03b2c470a136a83a9961a2a855cde59498361598 deda0761dc6161f03278da4679d96d4727992e91 | diffstat
> > arch/arm/boot/dts/iwg20d-q7-common.dtsi | 2 +-
> > arch/arm/boot/dts/meson8.dtsi | 4 ++--
> > arch/arm/boot/dts/qcom-apq8074-dragonboard.dts | 4 ----
> > arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi | 2 +-
> > arch/arm/mach-ep93xx/timer-ep93xx.c | 3 +--
> > arch/arm/mach-omap2/board-generic.c | 1 -
> > arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi | 4 ----
> > arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts | 4 ++--
> > arch/arm64/boot/dts/qcom/pm7250b.dtsi | 1 -
> > arch/arm64/boot/dts/renesas/ulcb-kf.dtsi | 3 ++-
> > arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts | 28 ++++++++++++++--------------
> > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 2 +-
> > drivers/infiniband/hw/hfi1/ipoib_tx.c | 4 ++--
> > drivers/infiniband/hw/hfi1/mmu_rb.c | 101 ++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------
> > drivers/infiniband/hw/hfi1/mmu_rb.h | 3 ---
> > drivers/infiniband/hw/hfi1/sdma.c | 23 +++++------------------
> > drivers/infiniband/hw/hfi1/sdma.h | 47 +++++++++++++++--------------------------------
> > drivers/infiniband/hw/hfi1/sdma_txreq.h | 2 --
> > drivers/infiniband/hw/hfi1/user_sdma.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------
> > drivers/infiniband/hw/hfi1/user_sdma.h | 1 +
> > drivers/infiniband/hw/hfi1/vnic_sdma.c | 4 ++--
> > drivers/infiniband/hw/hns/hns_roce_hem.c | 7 +++----
> > drivers/infiniband/hw/irdma/uk.c | 10 ++++------
> > drivers/input/misc/pm8941-pwrkey.c | 19 ++++---------------
> > drivers/memory/brcmstb_dpfe.c | 4 +---
> > drivers/soc/fsl/qe/Kconfig | 1 -
> > drivers/video/fbdev/omap/lcd_mipid.c | 6 +-----
> > sound/soc/codecs/es8316.c | 23 +++++++++--------------
> > 28 files changed, 191 insertions(+), 259 deletions(-)
> >
> > 03b2c470a136a83a9961a2a855cde59498361598 Input: pm8941-powerkey - fix debounce on gen2+ PMICs
> > 421ce97657a84b81ce2cb915e75037e1a356736a arm64: dts: ti: k3-j7200: Fix physical address of pin
> > 3b4c21804076e461a6453ee4d09872172336aa1d fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
> > 52b04ac85f5f4b485bf658101e464143225e68f9 drm/msm/dpu: set DSC flush bit correctly at MDP CTL flush register
> > 6878bdd7571827babc1c4c1ff66ea1affe951020 arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1
> > 5d14292dba9554881a137039c048a67ddf321395 ARM: dts: iwg20d-q7-common: Fix backlight pwm specifier
> > 766e0b6f4c9649f126e59c06f100b8581d0773b8 RDMA/hns: Fix hns_roce_table_get return value
> > b99395ab605fb0570d1e62c9459425ac6fc58d46 IB/hfi1: Fix wrong mmu_node used for user SDMA packet after invalidate
> > ebec507398e11b1c25ce9fb05fb509878233051c RDMA/irdma: avoid fortify-string warning in irdma_clr_wqes
> > 750f0a302a10dc2327a6656860a22b7da7251cea soc/fsl/qe: fix usb.c build errors
> > b2194d7dfc95a404990da73ebd394f6f6946a4a0 ARM: dts: meson8: correct uart_B and uart_C clock references
> >
> > 863054be8d4d2c9b38985371166a37c0e14111e1 ASoC: es8316: Do not set rate constraints for unsupported MCLKs
> > 3b575d93020f20f6a711efd8c69bfed26837d694 ASoC: es8316: Increment max value for ALC Capture Target Volume control
> > c02f27c2950abfed58bd8aa6bf50e79d9cc1fc77 ARM: dts: qcom: apq8074-dragonboard: Set DMA as remotely controlled
> > 9f79e638d45100dad43c56ad9b47eaff1b98fe9a memory: brcmstb_dpfe: fix testing array offset after use
> > 09722ac9f1e557ea65098202d0b18336c3f04420 ARM: dts: stm32: Shorten the AV96 HDMI sound card name
> > 666be7fef4d39d67c041460e29ddf2b875101133 arm64: dts: mediatek: mt8183: Add mediatek,broken-save-restore-fw to kukui
> > 1bdb9751b4c64f5709cb3608fd93a709c4e9b2e1 arm64: dts: qcom: apq8096: fix fixed regulator name property
> > 75c019119ebcc919e717fbce5552c2a0908405cf arm64: dts: qcom: pm7250b: add missing spmi-vadc include
> > c63997426da6f24f33ae6caf2423170d5bb80ebb ARM: omap2: fix missing tick_broadcast() prototype
> > e91ffbd6553348cdd3d04b263f8207d919681fac ARM: ep93xx: fix missing-prototype warnings
> >
> > and I can't see how any of that is compiled into qemuppc. Am I missing something?
>
> After banging my head against this for hours, I'm not really any
> further forward. With commits prior to
> deda0761dc6161f03278da4679d96d4727992e91 I can't seem to trigger rcu
> stalls. I have seen them on 863054be8d4d2c9b38985371166a37c0e14111e1
> which "isolates" it to the 10 commits above. They're arm or sound or
> memory devices we don't build afaict.
>
> I did cut kernel-devsrc, lttng-tools, perf and similar from the image
> to reduce rebuild times a bit and the rcu stalls appear with them
> missing. I also cut the systemd, dnf and dnf_runtime tests.
>
> I have a suspicion that the rcu stalls are "always" there and the
> emulation speed is marginal so some code patterns trigger it, some
> don't. On a loaded autobuilder, it tips the balance to more stalls. The
> more the rcu stalls trigger, the more likely an OOM situation is and
> perhaps we just get unlucky on some loads?
>
> Whilst I can see the rcu stalls locally, I've not had the patience/time
> to see any hung QA test. I have let a few run through to completion but
> not all, I've been assuming if configure passes, we wouldn't see
> anything interesting later.
I've gone back to the logs of recent failures and it is always a 255
exit code from ssh, not a timeout, e.g.:
core-image-sato/log.do_testimage.20329.20230822154435:DEBUG: [Command returned '255' after 235.73 seconds]
core-image-sato/log.do_testimage.20329.20230822154435-DEBUG: Command: dnf --repofrompath=oe-testimage-repo-qemuppc,http://192.168.7.1:39265/qemuppc --repofrompath=oe-testimage-repo-noarch,http://192.168.7.1:39265/noarch --repofrompath=oe-testimage-repo-ppc7400,http://192.168.7.1:39265/ppc7400 --nogpgcheck install --installroot=/home/root/chroot/test -v -y --rpmverbosity=debug busybox
core-image-sato/log.do_testimage.20329.20230822154435-Status: 255 Output: DNF version: 4.16.1
core-image-sato/log.do_testimage.20329.20230822154435-cachedir: /home/root/chroot/test/var/cache/dnf
core-image-sato/log.do_testimage.20329.20230822154435-Added oe-testimage-repo-qemuppc repo from http://192.168.7.1:39265/qemuppc
core-image-sato/log.do_testimage.20329.20230822154435-Added oe-testimage-repo-noarch repo from http://192.168.7.1:39265/noarch
core-image-sato-sdk/log.do_testimage.20325.20230822154435-checking for unistd.h... yes
core-image-sato-sdk/log.do_testimage.20325.20230822154435-checking minix/config.h usability... no
core-image-sato-sdk/log.do_testimage.20325.20230822154435-checking minix/config.h presence... no
core-image-sato-sdk/log.do_testimage.20325.20230822154435-checking for minix/config.h... no
core-image-sato-sdk/log.do_testimage.20325.20230822154435-checking whether it is safe to define __EXTENSIONS__...
core-image-sato-sdk/log.do_testimage.20325.20230822154435:DEBUG: [Command returned '255' after 312.18 seconds]
core-image-sato-sdk/log.do_testimage.20325.20230822154435-DEBUG: Command: cd ~/buildtest/cpio-2.13; gnu-configize; ./configure --disable-maintainer-mode
core-image-sato-sdk/log.do_testimage.20325.20230822154435-Status: 255 Output: aclocal.m4:17: warning: this file was generated for autoconf 2.69.
so the commands are stopping mid flow for unknown reasons or the ssh
connection fails. I can't tell if this coincides with an rcu stall or
not. Both logs do have rcu stalls in.
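For pulling these failures out of a pile of testimage logs, something like the following would do. It is a sketch that assumes the "[Command returned 'N' after S seconds]" line format seen above; `failed_commands` is a name made up for illustration and the sample lines are copied from the excerpts.

```python
import re

# Matches the testimage debug line shown above. The grep-style prefix
# (file name plus ':' or '-') is skipped by searching anywhere in the line.
RETURNED = re.compile(r"\[Command returned '(-?\d+)' after ([\d.]+) seconds\]")


def failed_commands(lines, status=255):
    """Yield (status, seconds) for every command that returned `status`."""
    for line in lines:
        m = RETURNED.search(line)
        if m and int(m.group(1)) == status:
            yield int(m.group(1)), float(m.group(2))


sample = [
    "core-image-sato/log.do_testimage.20329.20230822154435:DEBUG: "
    "[Command returned '255' after 235.73 seconds]",
    "core-image-sato-sdk/log.do_testimage.20325.20230822154435:DEBUG: "
    "[Command returned '255' after 312.18 seconds]",
    "core-image-sato-sdk/log.do_testimage.20325.20230822154435-"
    "checking for unistd.h... yes",
]

hits = list(failed_commands(sample))
for code, secs in hits:
    print(f"exit {code} after {secs:.2f}s")
```

The elapsed times are worth keeping alongside the status, since they show how long the command ran before the connection dropped.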
After these failures the system does continue to otherwise work
normally and subsequent tests pass.
I wonder if the slow emulation might be causing the networking to
glitch and break the ssh connection.
I'm at a bit of a loss on where to go from here.
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177DD39B5534099F.27837@lists.openembedded.org>
@ 2023-08-23 21:16 ` Richard Purdie
[not found] ` <177E1FB73F514F09.8058@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-23 21:16 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
lists.openembedded.org wrote:
> so the commands are stopping mid flow for unknown reasons or the ssh
> connection fails. I can't tell if this coincides with an rcu stall or
> not. Both logs do have rcu stalls in.
>
> After these failures the system does continue to otherwise work
> normally and subsequent tests pass.
>
> I wonder if the slow emulation might be causing the networking to
> glitch and break the ssh connection.
>
> I'm at a bit of a loss on where from here.
I thought I'd update the thread with new information.
I went back to the start with this and looked again at what is going
on. Interestingly, I found one of the autobuilder workers would
consistently fail the qemuppc-alt configuration for
core-image-sato-sdk. I paused the worker and experimented.
I saw two different failures (included below). One shows systemd-udevd
timing out on its watchdog after 3 minutes and resetting, including
taking out an ssh session running the cpio configure command. There was
no RCU stall reported.
The second failure shows systemd-logind as well as systemd-udevd with
the 3 minute time out, the kernel complaining about missed IRQs, an RCU
stall and lots of breakage following including cut ssh commands.
I could not get the cpio build test to complete.
Interestingly, I came back to the same image/worker later this evening
and now it all works fine. The difference is earlier there was a world
build running on the worker, which continued to wind down even after I
paused the worker. By the evening, that background load was no longer
present and the ppc image works in isolation. This suggests the issue
depends on host load and only shows up on loaded systems.
I suspect I need to replicate the load and retry locally, to see if I can
reliably reproduce the hang. The watchdog won't be present on sysvinit
systems which also show the issues but I'd guess there is still some
other starvation/timeout occurring.
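To check whether the watchdog timeouts line up with RCU stalls, the journal lines can be scanned for both and compared by timestamp. The following is a minimal sketch, not a tested tool: the 30 second window is an arbitrary assumption, the year is supplied by hand because syslog timestamps omit it, month names are assumed to be in English, and the function names are made up.

```python
import re
from datetime import datetime, timedelta

# Syslog-style prefix as in the logs below: "Aug 23 13:29:32 qemuppc ..."
LINE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ (.*)$")


def events(lines, year=2023):
    """Return (timestamp, kind) for watchdog timeouts and RCU stall reports."""
    out = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(2)
        if "Watchdog timeout" in msg:
            out.append((when, "watchdog"))
        elif "self-detected stall" in msg:
            out.append((when, "rcu"))
    return out


def watchdogs_without_stall(evts, window=timedelta(seconds=30)):
    """Watchdog timeouts with no RCU stall reported within `window`."""
    stalls = [t for t, kind in evts if kind == "rcu"]
    return [
        t for t, kind in evts
        if kind == "watchdog" and not any(abs(t - s) <= window for s in stalls)
    ]


sample = [
    "Aug 23 13:29:32 qemuppc systemd[1]: systemd-udevd.service: "
    "Watchdog timeout (limit 3min)!",
    "Aug 23 14:31:51 qemuppc kernel: rcu: INFO: rcu_preempt "
    "self-detected stall on CPU",
    "Aug 23 14:31:51 qemuppc systemd[1]: systemd-logind.service: "
    "Watchdog timeout (limit 3min)!",
]

lonely = watchdogs_without_stall(events(sample))
for t in lonely:
    print("watchdog timeout with no RCU stall nearby:", t.strftime("%H:%M:%S"))
```

On the sample lines this flags only the 13:29:32 timeout, which matches the first failure where no RCU stall was reported.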
Cheers,
Richard
Aug 23 13:23:01 qemuppc systemd[1]: sshd@4-192.168.7.4:22-192.168.7.3:59946.service: Deactivated successfully.
Aug 23 13:23:01 qemuppc systemd-logind[200]: Session c6 logged out. Waiting for processes to exit.
Aug 23 13:23:01 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:45940).
Aug 23 13:23:02 qemuppc systemd-logind[200]: Removed session c6.
Aug 23 13:23:03 qemuppc systemd-logind[200]: New session c7 of user root.
Aug 23 13:23:03 qemuppc systemd[1]: Started Session c7 of User root.
Aug 23 13:24:51 qemuppc systemd-journald[114]: Forwarding to syslog missed 20 messages.
Aug 23 13:29:32 qemuppc systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
Aug 23 13:29:32 qemuppc systemd-journald[114]: Forwarding to syslog missed 1 messages.
Aug 23 13:29:32 qemuppc systemd[1]: systemd-udevd.service: Killing process 149 (systemd-udevd) with signal SIGABRT.
Aug 23 13:29:32 qemuppc systemd[1]: systemd-udevd.service: Main process exited, code=dumped, status=6/ABRT
Aug 23 13:29:32 qemuppc systemd[1]: systemd-udevd.service: Failed with result 'watchdog'.
Aug 23 13:29:33 qemuppc systemd[1]: systemd-udevd.service: Scheduled restart job, restart counter is at 1.
Aug 23 13:29:34 qemuppc systemd[1]: Starting Rule-based Manager for Device Events and Files...
Aug 23 13:29:36 qemuppc systemd[1]: sshd@5-192.168.7.4:22-192.168.7.3:45940.service: Deactivated successfully.
Aug 23 13:29:36 qemuppc systemd-logind[200]: Session c7 logged out. Waiting for processes to exit.
Aug 23 13:29:37 qemuppc systemd-udevd[928]: Using default interface naming scheme 'v253'.
Aug 23 14:26:19 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:56494).
Aug 23 14:26:19 qemuppc systemd-journald[114]: Forwarding to syslog missed 23 messages.
Aug 23 14:26:26 qemuppc systemd-logind[200]: New session c8 of user root.
Aug 23 14:26:26 qemuppc systemd[1]: Started Session c8 of User root.
Aug 23 14:27:24 qemuppc systemd-journald[114]: Forwarding to syslog missed 2 messages.
Aug 23 14:31:51 qemuppc kernel: rcu: INFO: rcu_preempt self-detected stall on CPU
Aug 23 14:31:51 qemuppc kernel: rcu: 0-...!: (1 ticks this GP) idle=1a5c/1/0x40000004 softirq=34160/34160 fqs=0
Aug 23 14:31:51 qemuppc kernel: (t=32552 jiffies g=73185 q=1 ncpus=1)
Aug 23 14:31:52 qemuppc kernel: rcu: rcu_preempt kthread timer wakeup didn't happen for 32551 jiffies! g73185 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
Aug 23 14:31:52 qemuppc kernel: rcu: Possible timer handling issue on cpu=0 timer-softirq=26141
Aug 23 14:31:52 qemuppc kernel: rcu: rcu_preempt kthread starved for 32552 jiffies! g73185 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
Aug 23 14:31:52 qemuppc kernel: rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
Aug 23 14:31:52 qemuppc kernel: rcu: RCU grace-period kthread stack dump:
Aug 23 14:31:52 qemuppc kernel: task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
Aug 23 14:31:52 qemuppc kernel: Call Trace:
Aug 23 14:31:52 qemuppc kernel: [f1051cd0] [67bf4a72] 0x67bf4a72 (unreliable)
Aug 23 14:31:52 qemuppc kernel: [f1051da0] [c0d84cc4] __schedule+0x378/0x890
Aug 23 14:31:52 qemuppc kernel: [f1051df0] [c0d85244] schedule+0x68/0x118
Aug 23 14:31:52 qemuppc kernel: [f1051e10] [c0d8c678] schedule_timeout+0xb0/0x17c
Aug 23 14:31:52 qemuppc kernel: [f1051e50] [c00d4d84] rcu_gp_fqs_loop+0x4cc/0x6ac
Aug 23 14:31:52 qemuppc kernel: [f1051eb0] [c00d859c] rcu_gp_kthread+0x238/0x284
Aug 23 14:31:52 qemuppc kernel: [f1051f00] [c008a4d0] kthread+0xfc/0x114
Aug 23 14:31:52 qemuppc kernel: [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
Aug 23 14:31:52 qemuppc kernel: rcu: Stack dump where RCU GP kthread last ran:
Aug 23 14:31:52 qemuppc kernel: CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:52 qemuppc kernel: Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:52 qemuppc kernel: NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:52 qemuppc kernel: REGS: f1005ec0 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:52 qemuppc kernel: MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:52 qemuppc kernel:
GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:52 qemuppc kernel: NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:52 qemuppc kernel: LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:52 qemuppc kernel: Call Trace:
Aug 23 14:31:52 qemuppc kernel: [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:52 qemuppc kernel: [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:52 qemuppc kernel: [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:53 qemuppc kernel: [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:53 qemuppc kernel: [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:53 qemuppc kernel: --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel: NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:53 qemuppc kernel: REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:53 qemuppc kernel: MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:53 qemuppc kernel:
GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:53 qemuppc kernel: NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel: LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:53 qemuppc kernel: --- interrupt: 500
Aug 23 14:31:53 qemuppc kernel: Instruction dump:
Aug 23 14:31:53 qemuppc kernel: 3b1870a0 3ad6710c 3af73570 3b7bdba0 3a60000a 3a400000 7e238b78 4bff6f11
Aug 23 14:31:53 qemuppc kernel: 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
Aug 23 14:31:53 qemuppc kernel: CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:53 qemuppc kernel: Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:53 qemuppc kernel: NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:53 qemuppc kernel: REGS: f1005ec0 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:53 qemuppc kernel: MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:53 qemuppc kernel:
GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:53 qemuppc kernel: NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:53 qemuppc kernel: LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:54 qemuppc kernel: Call Trace:
Aug 23 14:31:54 qemuppc kernel: [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:54 qemuppc kernel: [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:54 qemuppc kernel: [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:54 qemuppc kernel: [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:54 qemuppc kernel: [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:54 qemuppc kernel: --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:54 qemuppc kernel: NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:54 qemuppc kernel: REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:54 qemuppc kernel: MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:54 qemuppc kernel:
GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:54 qemuppc kernel: NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:54 qemuppc kernel: LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:54 qemuppc kernel: --- interrupt: 500
Aug 23 14:31:54 qemuppc kernel: Instruction dump:
Aug 23 14:31:54 qemuppc kernel: 3b1870a0 3ad6710c 3af73570 3b7bdba0 3a60000a 3a400000 7e238b78 4bff6f11
Aug 23 14:31:54 qemuppc kernel: 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
Aug 23 14:31:54 qemuppc kernel: irq 37: nobody cared (try booting with the "irqpoll" option)
Aug 23 14:31:54 qemuppc kernel: CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:54 qemuppc kernel: Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:54 qemuppc kernel: Call Trace:
Aug 23 14:31:54 qemuppc kernel: [f1005d50] [c0d54a9c] dump_stack_lvl+0x34/0x50 (unreliable)
Aug 23 14:31:54 qemuppc kernel: [f1005d70] [c00c0e84] __report_bad_irq+0x50/0x138
Aug 23 14:31:54 qemuppc kernel: [f1005da0] [c00c0d9c] note_interrupt+0x344/0x398
Aug 23 14:31:54 qemuppc kernel: [f1005df0] [c00bcfc8] handle_irq_event+0xb4/0xfc
Aug 23 14:31:54 qemuppc kernel: [f1005e10] [c00c1e74] handle_fasteoi_irq+0xc0/0x29c
Aug 23 14:31:54 qemuppc kernel: [f1005e30] [c00bb554] generic_handle_irq+0x40/0x5c
Aug 23 14:31:54 qemuppc kernel: [f1005e40] [c000a660] __do_irq+0x48/0x140
Aug 23 14:31:54 qemuppc kernel: [f1005e60] [c000aaa8] __do_IRQ+0xe0/0x11c
Aug 23 14:31:54 qemuppc kernel: [f1005e90] [c000ab18] do_IRQ+0x34/0x10c
Aug 23 14:31:54 qemuppc kernel: [f1005eb0] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:54 qemuppc kernel: --- interrupt: 500 at __do_softirq+0xfc/0x394
Aug 23 14:31:54 qemuppc kernel: NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:54 qemuppc kernel: REGS: f1005ec0 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:54 qemuppc kernel: MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:54 qemuppc kernel:
GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:54 qemuppc kernel: NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:54 qemuppc kernel: LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:54 qemuppc kernel: --- interrupt: 500
Aug 23 14:31:54 qemuppc kernel: [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:54 qemuppc kernel: [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:54 qemuppc kernel: [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:54 qemuppc kernel: [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:55 qemuppc kernel: [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:55 qemuppc kernel: --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:55 qemuppc kernel: NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:55 qemuppc kernel: REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:55 qemuppc kernel: MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:55 qemuppc kernel:
GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:55 qemuppc kernel: NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:55 qemuppc kernel: LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:55 qemuppc kernel: --- interrupt: 500
Aug 23 14:31:55 qemuppc kernel: handlers:
Aug 23 14:31:55 qemuppc kernel: [<9df35000>] pmz_interrupt
Aug 23 14:31:55 qemuppc kernel: Disabling IRQ #37
Aug 23 14:31:51 qemuppc systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
Aug 23 14:31:55 qemuppc systemd-journald[114]: Forwarding to syslog missed 1 messages.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.773137] rcu: INFO: rcu_preempt self-detected stall on CPU
Aug 23 14:31:51 qemuppc systemd[1]: systemd-udevd.service: Killing process 148 (systemd-udevd) with signal SIGABRT.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.922768] rcu: 0-...!: (1 ticks this GP) idle=1a5c/1/0x40000004 softirq=34160/34160 fqs=0
Aug 23 14:31:51 qemuppc systemd[1]: systemd-logind.service: Watchdog timeout (limit 3min)!
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.941081] (t=32552 jiffies g=73185 q=1 ncpus=1)
Aug 23 14:31:51 qemuppc systemd[1]: systemd-logind.service: Killing process 200 (systemd-logind) with signal SIGABRT.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.956856] rcu: rcu_preempt kthread timer wakeup didn't happen for 32551 jiffies! g73185 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
Aug 23 14:31:51 qemuppc systemd[1]: systemd-networkd.service: Watchdog timeout (limit 3min)!
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.965800] rcu: Possible timer handling issue on cpu=0 timer-softirq=26141
Aug 23 14:31:51 qemuppc systemd[1]: systemd-networkd.service: Killing process 211 (systemd-network) with signal SIGABRT.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.971042] rcu: rcu_preempt kthread starved for 32552 jiffies! g73185 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
Aug 23 14:31:52 qemuppc systemd[1]: systemd-udevd.service: Main process exited, code=dumped, status=6/ABRT
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.971295] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
Aug 23 14:31:52 qemuppc systemd[1]: systemd-udevd.service: Failed with result 'watchdog'.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.972612] rcu: RCU grace-period kthread stack dump:
Aug 23 14:31:52 qemuppc systemd[1]: systemd-udevd.service: Scheduled restart job, restart counter is at 1.
Aug 23 14:31:52 qemuppc kernel[193]: [ 489.984822] task:rcu_preempt state:I stack:0 pid:16 ppid:2 flags:0x00000800
Aug 23 14:31:53 qemuppc systemd[1]: Starting Rule-based Manager for Device Events and Files...
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.027482] Call Trace:
Aug 23 14:31:53 qemuppc systemd[1]: systemd-logind.service: Main process exited, code=dumped, status=6/ABRT
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.052736] [f1051cd0] [67bf4a72] 0x67bf4a72 (unreliable)
Aug 23 14:31:53 qemuppc systemd[1]: systemd-logind.service: Failed with result 'watchdog'.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.070781] [f1051da0] [c0d84cc4] __schedule+0x378/0x890
Aug 23 14:31:54 qemuppc systemd[1]: systemd-networkd.service: Main process exited, code=dumped, status=6/ABRT
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.075515] [f1051df0] [c0d85244] schedule+0x68/0x118
Aug 23 14:31:54 qemuppc systemd[1]: systemd-networkd.service: Failed with result 'watchdog'.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.079545] [f1051e10] [c0d8c678] schedule_timeout+0xb0/0x17c
Aug 23 14:31:54 qemuppc systemd[1]: systemd-logind.service: Scheduled restart job, restart counter is at 1.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.083481] [f1051e50] [c00d4d84] rcu_gp_fqs_loop+0x4cc/0x6ac
Aug 23 14:31:54 qemuppc systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 1.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.083671] [f1051eb0] [c00d859c] rcu_gp_kthread+0x238/0x284
Aug 23 14:31:54 qemuppc systemd[1]: Starting Load Kernel Module drm...
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.083823] [f1051f00] [c008a4d0] kthread+0xfc/0x114
Aug 23 14:31:56 qemuppc systemd[1]: modprobe@drm.service: Deactivated successfully.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.083965] [f1051f30] [c001c338] ret_from_kernel_thread+0x5c/0x64
Aug 23 14:31:56 qemuppc systemd[1]: Finished Load Kernel Module drm.
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.084259] rcu: Stack dump where RCU GP kthread last ran:
Aug 23 14:31:56 qemuppc systemd[1]: Starting Load Kernel Module drm...
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.084685] CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.085165] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.085346] NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.087556] REGS: f1005ec0 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.090899] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.091346]
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.091346] GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.091346] GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.091346] GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.091346] GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.096034] NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.096277] LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.096565] Call Trace:
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.098486] [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.098917] [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.099091] [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.099221] [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.099361] [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.100981] --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.101119] NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.101237] REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.101562] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.103148]
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.103148] GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.103148] GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.103148] GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.103148] GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.106361] NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.106836] LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.106981] --- interrupt: 500
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.107253] Instruction dump:
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.108681] 3b1870a0 3ad6710c 3af73570 3b7bdba0 3a60000a 3a400000 7e238b78 4bff6f11
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.109573] 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.110082] CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.110249] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.110381] NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.112853] REGS: f1005ec0 TRAP: 0900 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113013] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113248]
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113248] GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113248] GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113248] GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.113248] GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.116244] NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.116396] LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.118125] Call Trace:
Aug 23 14:31:52 qemuppc kernel[193]: [ 490.118203] [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.118522] [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.119250] [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.119388] [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.120203] [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.120376] --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.121075] NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.121195] REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.121342] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.122702]
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.122702] GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.122702] GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.122702] GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.122702] GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.125554] NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.125958] LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.126062] --- interrupt: 500
Aug 23 14:31:57 qemuppc systemd[1]: modprobe@drm.service: Deactivated successfully.
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.126147] Instruction dump:
Aug 23 14:31:57 qemuppc systemd[1]: Finished Load Kernel Module drm.
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.126243] 3b1870a0 3ad6710c 3af73570 3b7bdba0 3a60000a 3a400000 7e238b78 4bff6f11
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.127761] 92540000 7d2000a6 61298000 7d200124 <7ffd00d0> 7fbff838 7fff0034 23ff0020
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.695839] irq 37: nobody cared (try booting with the "irqpoll" option)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.701589] CPU: 0 PID: 792 Comm: as Not tainted 6.1.46-yocto-standard #1
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.706880] Hardware name: PowerMac3,1 7400 0xc0209 PowerMac
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.711073] Call Trace:
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.711167] [f1005d50] [c0d54a9c] dump_stack_lvl+0x34/0x50 (unreliable)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.711386] [f1005d70] [c00c0e84] __report_bad_irq+0x50/0x138
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.712536] [f1005da0] [c00c0d9c] note_interrupt+0x344/0x398
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.712930] [f1005df0] [c00bcfc8] handle_irq_event+0xb4/0xfc
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.713088] [f1005e10] [c00c1e74] handle_fasteoi_irq+0xc0/0x29c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.713243] [f1005e30] [c00bb554] generic_handle_irq+0x40/0x5c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.713393] [f1005e40] [c000a660] __do_irq+0x48/0x140
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.715065] [f1005e60] [c000aaa8] __do_IRQ+0xe0/0x11c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.715221] [f1005e90] [c000ab18] do_IRQ+0x34/0x10c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.715362] [f1005eb0] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.716384] --- interrupt: 500 at __do_softirq+0xfc/0x394
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.716876] NIP: c0d8d9b4 LR: c0d8d9a4 CTR: c003d7ec
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.716995] REGS: f1005ec0 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.717369] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48f74f32 XER: 20000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.719338]
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.719338] GPR00: c0d8d95c f1005f80 c2e36c00 00000000 00000001 0000003f 000039ff 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.719338] GPR08: 00000000 00009032 00000100 c15ecab8 28f75f32 101df7bc c157abf8 c16c0000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.719338] GPR16: 000039ff c105bd2c 00000000 0000000a c1575840 c105bd78 c0fe710c c1583570
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.719338] GPR24: c16f70a0 a7e3fa28 00000000 c16adba0 a7e3fa60 00000008 00000000 f1005ff0
Aug 23 14:31:57 qemuppc systemd[1]: Starting User Login Management...
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.721535] NIP [c0d8d9b4] __do_softirq+0xfc/0x394
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.722801] LR [c0d8d9a4] __do_softirq+0xec/0x394
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.722934] --- interrupt: 500
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.723019] [f1005f80] [c0d8d95c] __do_softirq+0xa4/0x394 (unreliable)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.723197] [f1005ff0] [c000ac2c] do_softirq_own_stack+0x3c/0x54
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.723362] [f2f75ef0] [a7e2e000] 0xa7e2e000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.725049] [f2f75f10] [c00641b8] irq_exit+0xe4/0x144
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.725201] [f2f75f30] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.725364] --- interrupt: 500 at 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.726214] NIP: a7e0d668 LR: a7e0d580 CTR: 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.726335] REGS: f2f75f40 TRAP: 0500 Not tainted (6.1.46-yocto-standard)
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.726983] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48000288 XER: 20000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.727233]
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.727233] GPR00: a7e0d580 afe57610 00000000 fffffffc 70000027 a7e32000 6fffffff a7e2e020
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.727233] GPR08: fffffef5 6ffffef5 a7e33b44 40ef66fa 40ef63c7 101df7bc a7e3dfe0 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.727233] GPR16: a7e3f954 a7e3f008 a7e3dff0 afe5778c afe57ca4 00001000 00000010 00000000
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.727233] GPR24: 10000034 a7e3fa28 00000000 00000001 a7e3fa60 a7e2e000 a7e3eff0 afe57610
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.730076] NIP [a7e0d668] 0xa7e0d668
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.731510] LR [a7e0d580] 0xa7e0d580
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.731781] --- interrupt: 500
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.731930] handlers:
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.732058] [<9df35000>] pmz_interrupt
Aug 23 14:31:53 qemuppc kernel[193]: [ 490.734774] Disabling IRQ #37
Aug 23 14:31:58 qemuppc systemd-udevd[801]: Using default interface naming scheme 'v253'.
Aug 23 14:32:00 qemuppc systemd[1]: Started Rule-based Manager for Device Events and Files.
Aug 23 14:32:00 qemuppc systemd[1]: Starting Network Configuration...
Aug 23 14:32:02 qemuppc systemd-logind[835]: New seat seat0.
Aug 23 14:32:03 qemuppc systemd-logind[835]: Watching system buttons on /dev/input/event0 (ADB keyboard)
Aug 23 14:32:03 qemuppc systemd-logind[835]: New session c7 of user root.
Aug 23 14:32:03 qemuppc systemd-logind[835]: New session c1 of user root.
Aug 23 14:32:04 qemuppc systemd-logind[835]: New session c8 of user root.
Aug 23 14:32:04 qemuppc systemd[1]: Started User Login Management.
Aug 23 14:32:04 qemuppc systemd[1]: sshd@5-192.168.7.4:22-192.168.7.3:59072.service: Deactivated successfully.
Aug 23 14:32:04 qemuppc systemd-logind[835]: Session c7 logged out. Waiting for processes to exit.
Aug 23 14:32:05 qemuppc systemd-networkd[838]: lo: Link UP
Aug 23 14:32:05 qemuppc systemd-networkd[838]: lo: Gained carrier
Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: Link UP
Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: Gained carrier
Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: Gained IPv6LL
Aug 23 14:32:05 qemuppc systemd-networkd[838]: Enumeration completed
Aug 23 14:32:05 qemuppc systemd[1]: Started Network Configuration.
Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: found matching network '/run/systemd/network/90-eth0.network', based on potentially unpredictable interface name.
Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: Configuring with /run/systemd/network/90-eth0.network.
Aug 23 14:32:08 qemuppc systemd[1]: session-c7.scope: Deactivated successfully.
Aug 23 14:32:08 qemuppc systemd-logind[835]: Removed session c7.
Aug 23 14:32:18 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:52348).
Aug 23 14:32:19 qemuppc systemd-logind[835]: New session c2 of user root.
Aug 23 14:32:19 qemuppc systemd[1]: Started Session c2 of User root.
Aug 23 14:32:21 qemuppc systemd[1]: sshd@7-192.168.7.4:22-192.168.7.3:52348.service: Deactivated successfully.
Aug 23 14:32:21 qemuppc systemd[1]: session-c2.scope: Deactivated successfully.
Aug 23 14:32:21 qemuppc systemd-logind[835]: Session c2 logged out. Waiting for processes to exit.
Aug 23 14:32:21 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:53128).
Aug 23 14:32:21 qemuppc systemd-logind[835]: Removed session c2.
Aug 23 14:32:23 qemuppc systemd-logind[835]: New session c3 of user root.
Aug 23 14:32:23 qemuppc systemd[1]: Started Session c3 of User root.
Aug 23 14:32:24 qemuppc systemd[1]: session-c3.scope: Deactivated successfully.
Aug 23 14:32:24 qemuppc systemd[1]: sshd@8-192.168.7.4:22-192.168.7.3:53128.service: Deactivated successfully.
Aug 23 14:32:24 qemuppc systemd-logind[835]: Session c3 logged out. Waiting for processes to exit.
Aug 23 14:32:24 qemuppc systemd-logind[835]: Removed session c3.
Aug 23 14:32:24 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:53142).
Aug 23 14:32:26 qemuppc systemd-logind[835]: New session c4 of user root.
Aug 23 14:32:26 qemuppc systemd-journald[114]: Forwarding to syslog missed 191 messages.
Aug 23 14:32:26 qemuppc systemd[1]: Started Session c4 of User root.
Aug 23 14:32:27 qemuppc systemd[1]: sshd@9-192.168.7.4:22-192.168.7.3:53142.service: Deactivated successfully.
Aug 23 14:32:27 qemuppc systemd[1]: session-c4.scope: Deactivated successfully.
Aug 23 14:32:27 qemuppc systemd-logind[835]: Session c4 logged out. Waiting for processes to exit.
Aug 23 14:32:27 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:53152).
Aug 23 14:32:27 qemuppc systemd-logind[835]: Removed session c4.
Aug 23 14:32:29 qemuppc systemd-logind[835]: New session c5 of user root.
Aug 23 14:32:29 qemuppc systemd[1]: Started Session c5 of User root.
Aug 23 14:32:30 qemuppc systemd[1]: session-c5.scope: Deactivated successfully.
Aug 23 14:32:30 qemuppc systemd[1]: sshd@10-192.168.7.4:22-192.168.7.3:53152.service: Deactivated successfully.
Aug 23 14:32:30 qemuppc systemd-logind[835]: Session c5 logged out. Waiting for processes to exit.
Aug 23 14:32:30 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:56798).
Aug 23 14:32:30 qemuppc systemd-logind[835]: Removed session c5.
Aug 23 14:32:33 qemuppc systemd-logind[835]: New session c6 of user root.
Aug 23 14:32:33 qemuppc systemd[1]: Started Session c6 of User root.
Aug 23 14:32:40 qemuppc systemd[1]: Started OpenSSH Per-Connection Daemon (192.168.7.3:36882).
Aug 23 14:32:40 qemuppc systemd[1]: session-c6.scope: Deactivated successfully.
Aug 23 14:32:40 qemuppc systemd[1]: sshd@11-192.168.7.4:22-192.168.7.3:56798.service: Deactivated successfully.
Aug 23 14:32:40 qemuppc systemd-logind[835]: Session c6 logged out. Waiting for processes to exit.
Aug 23 14:32:41 qemuppc systemd-logind[835]: Removed session c6.
Aug 23 14:32:46 qemuppc systemd-logind[835]: New session c7 of user root.
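The failure signatures in a log like the one above can be picked out mechanically rather than by eye. What follows is a minimal sketch, assuming the journal has been saved as plain text lines; the `SIGNATURES` table and the `scan` helper are illustrative names and not part of any existing OE tooling. The patterns are copied from messages that appear in this log.

```python
import re

# Patterns taken from messages seen in this thread's log (illustrative set).
SIGNATURES = {
    "unhandled_irq": re.compile(r"irq (\d+): nobody cared"),
    "irq_disabled": re.compile(r"Disabling IRQ #(\d+)"),
    "journal_dropped": re.compile(r"Forwarding to syslog missed (\d+) messages"),
}


def scan(lines):
    """Return (signature name, captured number, line) for every matching line."""
    hits = []
    for line in lines:
        for name, pattern in SIGNATURES.items():
            match = pattern.search(line)
            if match:
                hits.append((name, int(match.group(1)), line.rstrip("\n")))
    return hits


sample = [
    'Aug 23 14:31:53 qemuppc kernel[193]: [ 490.695839] irq 37: nobody cared (try booting with the "irqpoll" option)',
    "Aug 23 14:31:53 qemuppc kernel[193]: [ 490.734774] Disabling IRQ #37",
    "Aug 23 14:32:26 qemuppc systemd-journald[114]: Forwarding to syslog missed 191 messages.",
    "Aug 23 14:32:05 qemuppc systemd-networkd[838]: eth0: Link UP",
]

for name, number, _ in scan(sample):
    print(name, number)
```

Running the same scan over the logs of passing and failing runs would show whether the IRQ 37 message only turns up alongside the hangs.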
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177E1FB73F514F09.8058@lists.openembedded.org>
@ 2023-08-24 14:04 ` Richard Purdie
[not found] ` <177E56C1DFAB4DFC.13053@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-24 14:04 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > so the commands are stopping mid flow for unknown reasons or the ssh
> > connection fails. I can't tell if this coincides with an rcu stall or
> > not. Both logs do have rcu stalls in.
> >
> > After these failures the system does continue to otherwise work
> > normally and subsequent tests pass.
> >
> > I wonder if the slow emulation might be causing the networking to
> > glitch and break the ssh connection.
> >
> > I'm at a bit of a loss on where to go from here.
>
> I thought I'd update the thread with new information.
>
> I went back to the start with this and looked again at what is going
> on. Interestingly, I found one of the autobuilder workers would
> consistently fail the qemuppc-alt configuration for core-image-sato-
> sdk. I paused the worker and experimented.
>
> I saw two different failures (included below). One shows systemd-udevd
> timing out on its watchdog after 3 minutes and resetting, including
> taking out an ssh session running the cpio configure command. There was
> no RCU stall reported.
>
> The second failure shows systemd-logind as well as systemd-udevd with
> the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> stall and lots of breakage following including cut ssh commands.
>
> I could not get the cpio build test to complete.
>
> Interestingly, I came back to the same image/worker later this evening
> and now it all works fine. The difference is earlier there was a world
> build running on the worker, which continued to wind down even after I
> paused the worker. By the evening, that background load was no longer
> present and the ppc image works in isolation. This tells us the issue
> is system load dependent and only occurs on loaded systems.
>
> I suspect I need to replicate the load and retry locally, see if I can
> reliably reproduce the hang. The watchdog won't be present on sysvinit
> systems which also show the issues but I'd guess there is still some
> other starvation/timeout occurring.
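One rough way to approximate the loaded-worker condition on a local machine is to run the test command while some busy-loop processes compete for CPU time. This is only a sketch under stated assumptions: the `run_under_load` helper, the worker count and the busy loop are all hypothetical, and a real world build also generates I/O and memory pressure that this does not model.

```python
import subprocess
import sys

# Child program: spin on the CPU until the deadline given as argv[1] (seconds).
BUSY_LOOP = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "while time.monotonic() < end:\n"
    "    pass\n"
)


def run_under_load(cmd, workers, max_seconds):
    """Run cmd while `workers` busy-loop processes compete for CPU time.

    Returns cmd's exit code; raises subprocess.TimeoutExpired if cmd
    does not finish within max_seconds.
    """
    burners = [
        subprocess.Popen([sys.executable, "-c", BUSY_LOOP, str(max_seconds)])
        for _ in range(workers)
    ]
    try:
        return subprocess.run(cmd, timeout=max_seconds).returncode
    finally:
        # Stop the background load whether or not cmd succeeded.
        for burner in burners:
            burner.terminate()
        for burner in burners:
            burner.wait()


if __name__ == "__main__":
    # Placeholder command; substitute whatever starts the image test.
    print(run_under_load([sys.executable, "-c", "print('test ran')"],
                         workers=2, max_seconds=30))
```

Setting `workers` at or above the host's core count is the obvious starting point, since the hang was only seen while a world build was running on the worker.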
I've now seen the failure on the autobuilder:
* with linux-yocto 6.1.38
* with linux-yocto 6.1.46
* with qemu 8.0.4
* with qemu 8.0.3
* with qemu 8.0.0
I was a little suspicious of:
"hw/ppc: Fix clock update drift"
https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
but we've tested with and without that.
qemu has just released 8.1.0 so perhaps we should try that next.
I'm still struggling to pin down exactly which change caused the
problems to start...
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177E56C1DFAB4DFC.13053@lists.openembedded.org>
@ 2023-08-24 20:18 ` Richard Purdie
2023-08-25 5:04 ` Frédéric Martinsons
2023-08-25 6:27 ` Mikko Rapeli
0 siblings, 2 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-24 20:18 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Thu, 2023-08-24 at 15:04 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> > lists.openembedded.org wrote:
> > > so the commands are stopping mid flow for unknown reasons or the ssh
> > > connection fails. I can't tell if this coincides with an rcu stall or
> > > not. Both logs do have rcu stalls in.
> > >
> > > After these failures the system does continue to otherwise work
> > > normally and subsequent tests pass.
> > >
> > > I wonder if the slow emulation might be causing the networking to
> > > glitch and break the ssh connection.
> > >
> > > I'm at a bit of a loss on where to go from here.
> >
> > I thought I'd update the thread with new information.
> >
> > I went back to the start with this and looked again at what is going
> > on. Interestingly, I found one of the autobuilder workers would
> > consistently fail the qemuppc-alt configuration for core-image-sato-
> > sdk. I paused the worker and experimented.
> >
> > I saw two different failures (included below). One shows systemd-udevd
> > timing out on its watchdog after 3 minutes and resetting, including
> > taking out an ssh session running the cpio configure command. There was
> > no RCU stall reported.
> >
> > The second failure shows systemd-logind as well as systemd-udevd with
> > the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> > stall and lots of breakage following including cut ssh commands.
> >
> > I could not get the cpio build test to complete.
> >
> > Interestingly, I came back to the same image/worker later this evening
> > and now it all works fine. The difference is earlier there was a world
> > build running on the worker, which continued to wind down even after I
> > paused the worker. By the evening, that background load was no longer
> > present and the ppc image works in isolation. This tells us the issue
> > is system load dependent and only occurs on loaded systems.
> >
> > I suspect I need to replicate the load and retry locally, see if I can
> > reliably reproduce the hang. The watchdog won't be present on sysvinit
> > systems which also show the issues but I'd guess there is still some
> > other starvation/timeout occurring.
>
> I've now seen the failure on the autobuilder:
>
> * with linux-yocto 6.1.38
> * with linux-yocto 6.1.46
> * with qemu 8.0.4
> * with qemu 8.0.3
> * with qemu 8.0.0
>
> I was a little suspicious of:
>
> "hw/ppc: Fix clock update drift"
> https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
>
> but we've tested with and without that.
>
> qemu has just released 8.1.0 so perhaps we should try that next.
qemu 8.1.0 brings with it a new set of problems but I've reproduced the
hang with 8.1.0 so it does not solve that.
I'm really struggling to understand which change brought in these
issues for qemuppc.
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-24 20:18 ` Richard Purdie
@ 2023-08-25 5:04 ` Frédéric Martinsons
2023-08-25 6:27 ` Mikko Rapeli
1 sibling, 0 replies; 23+ messages in thread
From: Frédéric Martinsons @ 2023-08-25 5:04 UTC (permalink / raw)
To: Richard Purdie
Cc: Mikko Rapeli, Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Thu, 24 Aug 2023 at 22:18, Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:
> On Thu, 2023-08-24 at 15:04 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
> > lists.openembedded.org wrote:
> > > On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> > > lists.openembedded.org wrote:
> > > > so the commands are stopping mid flow for unknown reasons or the ssh
> > > > connection fails. I can't tell if this coincides with an rcu stall or
> > > > not. Both logs do have rcu stalls in.
> > > >
> > > > After these failures the system does continue to otherwise work
> > > > normally and subsequent tests pass.
> > > >
> > > > I wonder if the slow emulation might be causing the networking to
> > > > glitch and break the ssh connection.
> > > >
> > > > I'm at a bit of a loss on where to go from here.
> > >
> > > I thought I'd update the thread with new information.
> > >
> > > I went back to the start with this and looked again at what is going
> > > on. Interestingly, I found one of the autobuilder workers would
> > > consistently fail the qemuppc-alt configuration for core-image-sato-
> > > sdk. I paused the worker and experimented.
> > >
> > > I saw two different failures (included below). One shows systemd-udevd
> > > timing out on its watchdog after 3 minutes and resetting, including
> > > taking out an ssh session running the cpio configure command. There was
> > > no RCU stall reported.
> > >
> > > The second failure shows systemd-logind as well as systemd-udevd with
> > > the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> > > stall and lots of breakage following including cut ssh commands.
> > >
> > > I could not get the cpio build test to complete.
> > >
> > > Interestingly, I came back to the same image/worker later this evening
> > > and now it all works fine. The difference is earlier there was a world
> > > build running on the worker, which continued to wind down even after I
> > > paused the worker. By the evening, that background load was no longer
> > > present and the ppc image works in isolation. This tells us the issue
> > > is system load dependent and only occurs on loaded systems.
> > >
> > > I suspect I need to replicate the load and retry locally, see if I can
> > > reliably reproduce the hang. The watchdog won't be present on sysvinit
> > > systems which also show the issues but I'd guess there is still some
> > > other starvation/timeout occurring.
> >
> > I've now seen the failure on the autobuilder:
> >
> > * with linux-yocto 6.1.38
> > * with linux-yocto 6.1.46
> > * with qemu 8.0.4
> > * with qemu 8.0.3
> > * with qemu 8.0.0
> >
> > I was a little suspicious of:
> >
> > "hw/ppc: Fix clock update drift"
> >
> https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
> >
> > but we've tested with and without that.
> >
> > qemu has just released 8.1.0 so perhaps we should try that next.
>
> qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> hang with 8.1.0 so it does not solve that.
>
> I'm really struggling to understand which change brought in these
> issues for qemuppc.
>
> Cheers,
>
> Richard
>
Hello Richard,
I didn't understand the issues, but I recently came across some keywords you
used here (rcu, NOHZ warnings, ratelimit...) in a Linux RT thread I just
read: https://www.spinics.net/lists/linux-rt-users/msg27085.html
I hope you find it helpful for your investigation; if you were
already aware of it, my bad.
Cheers.
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-24 20:18 ` Richard Purdie
2023-08-25 5:04 ` Frédéric Martinsons
@ 2023-08-25 6:27 ` Mikko Rapeli
2023-08-25 6:34 ` Richard Purdie
[not found] ` <177E8CC0D944344B.23833@lists.openembedded.org>
1 sibling, 2 replies; 23+ messages in thread
From: Mikko Rapeli @ 2023-08-25 6:27 UTC (permalink / raw)
To: Richard Purdie
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
Hi,
On Thu, Aug 24, 2023 at 09:18:03PM +0100, Richard Purdie wrote:
> On Thu, 2023-08-24 at 15:04 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
> > lists.openembedded.org wrote:
> > > On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> > > lists.openembedded.org wrote:
> > > > so the commands are stopping mid flow for unknown reasons or the ssh
> > > > connection fails. I can't tell if this coincides with an rcu stall or
> > > > not. Both logs do have rcu stalls in.
> > > >
> > > > After these failures the system does continue to otherwise work
> > > > normally and subsequent tests pass.
> > > >
> > > > I wonder if the slow emulation might be causing the networking to
> > > > glitch and break the ssh connection.
> > > >
> > > > I'm at a bit of a loss on where to go from here.
> > >
> > > I thought I'd update the thread with new information.
> > >
> > > I went back to the start with this and looked again at what is going
> > > on. Interestingly, I found one of the autobuilder workers would
> > > consistently fail the qemuppc-alt configuration for core-image-sato-
> > > sdk. I paused the worker and experimented.
> > >
> > > I saw two different failures (included below). One shows systemd-udevd
> > > timing out on its watchdog after 3 minutes and resetting, including
> > > taking out an ssh session running the cpio configure command. There was
> > > no RCU stall reported.
> > >
> > > The second failure shows systemd-logind as well as systemd-udevd with
> > > the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> > > stall and lots of breakage following including cut ssh commands.
> > >
> > > I could not get the cpio build test to complete.
> > >
> > > Interestingly, I came back to the same image/worker later this evening
> > > and now it all works fine. The difference is earlier there was a world
> > > build running on the worker, which continued to wind down even after I
> > > paused the worker. By the evening, that background load was no longer
> > > present and the ppc image works in isolation. This tells us the issue
> > > is system load dependent and only occurs on loaded systems.
> > >
> > > I suspect I need to replicate the load and retry locally, see if I can
> > > reliably reproduce the hang. The watchdog won't be present on sysvinit
> > > systems which also show the issues but I'd guess there is still some
> > > other starvation/timeout occurring.
> >
> > I've now seen the failure on the autobuilder:
> >
> > * with linux-yocto 6.1.38
> > * with linux-yocto 6.1.46
> > * with qemu 8.0.4
> > * with qemu 8.0.3
> > * with qemu 8.0.0
> >
> > I was a little suspicious of:
> >
> > "hw/ppc: Fix clock update drift"
> > https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
> >
> > but we've tested with and without that.
> >
> > qemu has just released 8.1.0 so perhaps we should try that next.
>
> qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> hang with 8.1.0 so it does not solve that.
>
> I'm really struggling to understand which change brought in these
> issues for qemuppc.
Are these issues visible on the mickledore branch? Maybe mickledore with the kernel 6.1 stable
update, or with qemu updated from 7.2 to 8.y.x, could be tested too. Then at least the kernel
or qemu could be blamed for the issues.
Cheers,
-Mikko
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-25 6:27 ` Mikko Rapeli
@ 2023-08-25 6:34 ` Richard Purdie
2023-08-25 7:26 ` Mikko Rapeli
[not found] ` <177E8CC0D944344B.23833@lists.openembedded.org>
1 sibling, 1 reply; 23+ messages in thread
From: Richard Purdie @ 2023-08-25 6:34 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Fri, 2023-08-25 at 09:27 +0300, Mikko Rapeli wrote:
> Hi,
>
> On Thu, Aug 24, 2023 at 09:18:03PM +0100, Richard Purdie wrote:
> > On Thu, 2023-08-24 at 15:04 +0100, Richard Purdie via
> > lists.openembedded.org wrote:
> > > On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
> > > lists.openembedded.org wrote:
> > > > On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> > > > lists.openembedded.org wrote:
> > > > > so the commands are stopping mid flow for unknown reasons or the ssh
> > > > > connection fails. I can't tell if this coincides with an rcu stall or
> > > > > not. Both logs do have rcu stalls in.
> > > > >
> > > > > After these failures the system does continue to otherwise work
> > > > > normally and subsequent tests pass.
> > > > >
> > > > > I wonder if the slow emulation might be causing the networking to
> > > > > glitch and break the ssh connection.
> > > > >
> > > > > I'm at a bit of a loss on where to go from here.
> > > >
> > > > I thought I'd update the thread with new information.
> > > >
> > > > I went back to the start with this and looked again at what is going
> > > > on. Interestingly, I found one of the autobuilder workers would
> > > > consistently fail the qemuppc-alt configuration for core-image-sato-
> > > > sdk. I paused the worker and experimented.
> > > >
> > > > I saw two different failures (included below). One shows systemd-udevd
> > > > timing out on its watchdog after 3 minutes and resetting, including
> > > > taking out an ssh session running the cpio configure command. There was
> > > > no RCU stall reported.
> > > >
> > > > The second failure shows systemd-logind as well as systemd-udevd with
> > > > the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> > > > stall and lots of breakage following including cut ssh commands.
> > > >
> > > > I could not get the cpio build test to complete.
> > > >
> > > > Interestingly, I came back to the same image/worker later this evening
> > > > and now it all works fine. The difference is earlier there was a world
> > > > build running on the worker, which continued to wind down even after I
> > > > paused the worker. By the evening, that background load was no longer
> > > > present and the ppc image works in isolation. This tells us the issue
> > > > is system load dependent and only occurs on loaded systems.
> > > >
> > > > I suspect I need to replicate the load and retry locally, see if I can
> > > > reliably reproduce the hang. The watchdog won't be present on sysvinit
> > > > systems which also show the issues but I'd guess there is still some
> > > > other starvation/timeout occurring.
> > >
> > > I've now seen the failure on the autobuilder:
> > >
> > > * with linux-yocto 6.1.38
> > > * with linux-yocto 6.1.46
> > > * with qemu 8.0.4
> > > * with qemu 8.0.3
> > > * with qemu 8.0.0
> > >
> > > I was a little suspicious of:
> > >
> > > "hw/ppc: Fix clock update drift"
> > > https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
> > >
> > > but we've tested with and without that.
> > >
> > > qemu has just released 8.1.0 so perhaps we should try that next.
> >
> > qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> > hang with 8.1.0 so it does not solve that.
> >
> > I'm really struggling to understand which change brought in these
> > issues for qemuppc.
>
> Are these issues visible on mickledore branch? Maybe mickledore with kernel 6.1 stable update or
> qemu 7.2 update to 8.y.x could be tested too. At least then kernel or qemu could be blamed
> for the issues.
Not that I know of.
I have now also reproduced the failure with glibc 2.37 instead of 2.38
(including the fortify sources change) and with the 6.1.34 kernel, so
there is something else causing this.
I've wondered if we need to try going back to qemu 7.2. It may also be
worth ruling out binutils.
It shouldn't be systemd as the sysvinit images show the issue too.
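The elimination so far can be written down as a small table, which makes it easier to see what has not been varied yet. This is only a sketch: the versions are the ones reported in this thread, while the `reproduced_with` layout, the `not_yet_varied` helper and the "current" placeholder for binutils are illustrative assumptions.

```python
# Variants under which the hang was still reproduced, as reported in this thread.
reproduced_with = {
    "linux-yocto": {"6.1.34", "6.1.38", "6.1.46"},
    "qemu": {"8.0.0", "8.0.3", "8.0.4", "8.1.0"},
    "glibc": {"2.37", "2.38"},
    "init system": {"systemd", "sysvinit"},
    "binutils": {"current"},  # placeholder: only one version tried so far
}


def not_yet_varied(results):
    """Components for which fewer than two variants have been tried."""
    return sorted(name for name, variants in results.items() if len(variants) < 2)


print(not_yet_varied(reproduced_with))
```

A component with two or more failing variants is only cleared within the range actually tested: qemu has been varied within 8.x, so the 7.2 to 8.0 jump is still open.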
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
2023-08-25 6:34 ` Richard Purdie
@ 2023-08-25 7:26 ` Mikko Rapeli
0 siblings, 0 replies; 23+ messages in thread
From: Mikko Rapeli @ 2023-08-25 7:26 UTC (permalink / raw)
To: Richard Purdie
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
Hi,
On Fri, Aug 25, 2023 at 07:34:25AM +0100, Richard Purdie wrote:
> On Fri, 2023-08-25 at 09:27 +0300, Mikko Rapeli wrote:
> > Hi,
> >
> > On Thu, Aug 24, 2023 at 09:18:03PM +0100, Richard Purdie wrote:
> > > On Thu, 2023-08-24 at 15:04 +0100, Richard Purdie via
> > > lists.openembedded.org wrote:
> > > > On Wed, 2023-08-23 at 22:16 +0100, Richard Purdie via
> > > > lists.openembedded.org wrote:
> > > > > On Tue, 2023-08-22 at 23:01 +0100, Richard Purdie via
> > > > > lists.openembedded.org wrote:
> > > > > > so the commands are stopping mid flow for unknown reasons or the ssh
> > > > > > connection fails. I can't tell if this coincides with an rcu stall or
> > > > > > not. Both logs do have rcu stalls in.
> > > > > >
> > > > > > After these failures the system does continue to otherwise work
> > > > > > normally and subsequent tests pass.
> > > > > >
> > > > > > I wonder if the slow emulation might be causing the networking to
> > > > > > glitch and break the ssh connection.
> > > > > >
> > > > > > I'm at a bit of a loss on where to go from here.
> > > > >
> > > > > I thought I'd update the thread with new information.
> > > > >
> > > > > I went back to the start with this and looked again at what is going
> > > > > on. Interestingly, I found one of the autobuilder workers would
> > > > > consistently fail the qemuppc-alt configuration for core-image-sato-
> > > > > sdk. I paused the worker and experimented.
> > > > >
> > > > > I saw two different failures (included below). One shows systemd-udevd
> > > > > > timing out on its watchdog after 3 minutes and resetting, including
> > > > > taking out an ssh session running the cpio configure command. There was
> > > > > no RCU stall reported.
> > > > >
> > > > > The second failure shows systemd-logind as well as systemd-udevd with
> > > > > the 3 minute time out, the kernel complaining about missed IRQs, an RCU
> > > > > stall and lots of breakage following including cut ssh commands.
> > > > >
> > > > > I could not get the cpio build test to complete.
> > > > >
> > > > > Interestingly, I came back to the same image/worker later this evening
> > > > > and now it all works fine. The difference is earlier there was a world
> > > > > build running on the worker, which continued to wind down even after I
> > > > > paused the worker. By the evening, that background load was no longer
> > > > > present and the ppc image works in isolation. This tells us the issue
> > > > > is system load dependent and only occurs on loaded systems.
> > > > >
> > > > > I suspect I need to replicate the load and retry locally, see if I can
> > > > > reliably reproduce the hang. The watchdog won't be present on sysvinit
> > > > > systems which also show the issues but I'd guess there is still some
> > > > > other starvation/timeout occurring.
> > > >
> > > > I've now seen the failure on the autobuilder:
> > > >
> > > > * with linux-yocto 6.1.38
> > > > * with linux-yocto 6.1.46
> > > > * with qemu 8.0.4
> > > > * with qemu 8.0.3
> > > > * with qemu 8.0.0
> > > >
> > > > I was a little suspicious of:
> > > >
> > > > "hw/ppc: Fix clock update drift"
> > > > https://gitlab.com/qemu-project/qemu/-/commit/73d6ac24c81f1aeae554d469616c9181511e6523
> > > >
> > > > but we've tested with and without that.
> > > >
> > > > qemu has just released 8.1.0 so perhaps we should try that next.
> > >
> > > qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> > > hang with 8.1.0 so it does not solve that.
> > >
> > > I'm really struggling to understand which change brought in these
> > > issues for qemuppc.
> >
> > Are these issues visible on mickledore branch? Maybe mickledore with kernel 6.1 stable update or
> > qemu 7.2 update to 8.y.x could be tested too. At least then kernel or qemu could be blamed
> > for the issues.
>
> Not that I know of.
>
> I have now also reproduced the failure with glibc 2.37 instead of 2.38
> including the fortify sources change and the 6.1.34 kernel so there is
> something else causing this.
>
> I've wondered if we need to try going back to qemu 7.2. It may also be
> worth ruling out binutils.
Yes, I'd have no objection to a qemu downgrade if that helps with stability
and the release deadline. I really trust you don't do this lightly and I would
much prefer you do this instead of burning out when hunting fixes for the various
issues. In product environments I've done this a lot: changes get reverted if
they cause too much instability and fixes don't come within some limited time
from the developers who are responsible for those changes.
> It shouldn't be systemd as the sysvinit images show the issue too.
FWIW, poky master branch with 6.4 kernel is working well on our arm64 boards
and CI results are stable, same with qemu-arm64 on 7.2 and 8.0.x versions.
Cheers,
-Mikko
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <177E8CC0D944344B.23833@lists.openembedded.org>
@ 2023-08-30 10:43 ` Richard Purdie
[not found] ` <178023427EE7BA0B.20206@lists.openembedded.org>
1 sibling, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-30 10:43 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield
On Fri, 2023-08-25 at 07:34 +0100, Richard Purdie via
lists.openembedded.org wrote:
> > >
> > > qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> > > hang with 8.1.0 so it does not solve that.
> > >
> > > I'm really struggling to understand which change brought in these
> > > issues for qemuppc.
> >
> > Are these issues visible on mickledore branch? Maybe mickledore with kernel 6.1 stable update or
> > qemu 7.2 update to 8.y.x could be tested too. At least then kernel or qemu could be blamed
> > for the issues.
>
> Not that I know of.
>
> I have now also reproduced the failure with glibc 2.37 instead of 2.38
> including the fortify sources change and the 6.1.34 kernel so there is
> something else causing this.
>
> I've wondered if we need to try going back to qemu 7.2. It may also be
> worth ruling out binutils.
>
> It shouldn't be systemd as the sysvinit images show the issue too.
I've ruled out the binutils upgrade, the glibc upgrade, systemd, the
kernel changes and the tar, libarchive and qemu upgrades.
I've continued to try and narrow things down and we see the issue from
this commit onwards:
https://git.yoctoproject.org/poky/commit/?id=12d9280c3de24c1c2b835e80fa1b8be72e9bc63a
I did get three clean runs with:
https://git.yoctoproject.org/poky/commit/?id=fb51e196a978d452e6a14a8343832659da97fdc7
but that still could be false negatives as it is intermittent.
I'm trying builds of the commits between those two to see if any
pattern emerges.
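To put a rough number on the false-negative worry: if a bad commit fails a single run with probability p, and runs are assumed to be independent, the chance of n clean runs in a row is (1 - p)^n. The sketch below is only an illustration; the failure rates are made-up values, not measurements from the autobuilder, and the function names are hypothetical.

```python
import math


def clean_streak_probability(p_fail: float, runs: int) -> float:
    """Chance that a bad commit still passes `runs` independent runs in a row."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, confidence: float) -> int:
    """Smallest n for which a bad commit fails at least once in n runs
    with probability >= confidence (runs assumed independent)."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


# Hypothetical per-run failure rates, chosen only for illustration.
for p in (0.1, 0.3, 0.5):
    streak = clean_streak_probability(p, 3)
    needed = runs_needed(p, 0.95)
    print(f"p_fail={p}: P(3 clean runs)={streak:.3f}, runs for 95% confidence={needed}")
```

Under these assumptions, a commit that fails 30% of the time would still give three clean runs about a third of the time (0.7^3 is roughly 0.34), so three passes say little on their own.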
The qemu 8.1.0 upgrade breaks x86: the kernel hangs under qemu relatively
consistently, seemingly when using an nfs root:
https://autobuilder.yoctoproject.org/typhoon/#/builders/145/builds/424/steps/13/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/430/steps/12/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/145/builds/427/steps/12/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/433/steps/12/logs/stdio
I think this is where the x86 kernel hangs I was worrying about previously
came from. Now that we know the 8.1.0 upgrade is the cause, we can
hopefully get to the bottom of it more quickly.
Cheers,
Richard
* Re: [OE-core] Dilemma on changes - merge or not to merge (e.g. 6.4)
[not found] ` <178023427EE7BA0B.20206@lists.openembedded.org>
@ 2023-08-30 13:03 ` Richard Purdie
0 siblings, 0 replies; 23+ messages in thread
From: Richard Purdie @ 2023-08-30 13:03 UTC (permalink / raw)
To: Mikko Rapeli
Cc: Paul Gortmaker, Rasmus Villemoes, openembedded-core,
Bruce Ashfield, Ross Burton
On Wed, 2023-08-30 at 11:43 +0100, Richard Purdie via
lists.openembedded.org wrote:
> On Fri, 2023-08-25 at 07:34 +0100, Richard Purdie via
> lists.openembedded.org wrote:
> > > >
> > > > qemu 8.1.0 brings with it a new set of problems but I've reproduced the
> > > > hang with 8.1.0 so it does not solve that.
> > > >
> > > > I'm really struggling to understand which change brought in these
> > > > issues for qemuppc.
> > >
> > > Are these issues visible on mickledore branch? Maybe mickledore with kernel 6.1 stable update or
> > > qemu 7.2 update to 8.y.x could be tested too. At least then kernel or qemu could be blamed
> > > for the issues.
> >
> > Not that I know of.
> >
> > I have now also reproduced the failure with glibc 2.37 instead of 2.38
> > including the fortify sources change and the 6.1.34 kernel so there is
> > something else causing this.
> >
> > I've wondered if we need to try going back to qemu 7.2. It may also be
> > worth ruling out binutils.
> >
> > It shouldn't be systemd as the sysvinit images show the issue too.
>
> I've ruled out the binutils upgrade, the glibc upgrade, systemd, the
> kernel changes and the tar, libarchive and qemu upgrades.
>
> I've continued to try and narrow things down and we see the issue from
> this commit onwards:
>
> https://git.yoctoproject.org/poky/commit/?id=12d9280c3de24c1c2b835e80fa1b8be72e9bc63a
>
> I did get three clean runs with:
>
> https://git.yoctoproject.org/poky/commit/?id=fb51e196a978d452e6a14a8343832659da97fdc7
>
> but that still could be false negatives as it is intermittent.
>
> I'm trying builds of the commits between those two to see if any
> pattern emerges.
>
> The qemu 8.1.0 upgrade breaks x86 with qemu kernel hangs seemingly with
> nfs root relatively consistently:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/145/builds/424/steps/13/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/430/steps/12/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/145/builds/427/steps/12/logs/stdio
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/433/steps/12/logs/stdio
This can be reproduced with a "bitbake core-image-minimal" then:
runqemu nographic /XXX/tmp/deploy/images/qemux86-64/core-image-minimal-qemux86-64.rootfs.tar.bz2 bootparams=" printk.time=1"
If you add "kvm" to the command, it works.
Interestingly dropping the nographic resulted in a different crash with
rcu stalls and fun timestamps.
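For anyone wanting to compare the variants side by side, a small helper along these lines could generate the three command lines (nographic, nographic plus kvm, and graphical). It is only a sketch: the helper name is hypothetical, where "kvm" sits in the argument list is an assumption, and the /XXX prefix is left elided exactly as above. It prints the commands and does not run them.

```python
import shlex


def runqemu_command(rootfs: str, *, kvm: bool = False, nographic: bool = True,
                    bootparams: str = " printk.time=1") -> list[str]:
    """Build the runqemu argument list for one variant of the reproducer."""
    args = ["runqemu"]
    if nographic:
        args.append("nographic")
    if kvm:
        # Assumption: the position of "kvm" among the arguments does not matter.
        args.append("kvm")
    args.append(rootfs)
    # A single argument, matching the shell form bootparams=" printk.time=1".
    args.append(f"bootparams={bootparams}")
    return args


# Path copied from the command above; the /XXX prefix is left elided as given.
ROOTFS = ("/XXX/tmp/deploy/images/qemux86-64/"
          "core-image-minimal-qemux86-64.rootfs.tar.bz2")

for variant in ({"kvm": False}, {"kvm": True}, {"nographic": False}):
    print(shlex.join(runqemu_command(ROOTFS, **variant)))
```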
> I think this was where I was previously worrying about x86 kernel hangs
> from. Now we know the 8.1.0 upgrade is the cause of this, we can
> hopefully get to the bottom of it more quickly.
Bisection brings us fairly conclusively to:
https://git.yoctoproject.org/poky/commit/?id=ffd73bef9b9bb5c94c050387941eee29719ca697
"yocto-uninative: Update to 4.2 for glibc 2.38"
which means somehow qemu for qemuppc is breaking when used with glibc
2.38.
Why? No idea.
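As an aside on bisecting an intermittent failure like this one: a commit can be marked bad after a single failing run, but only marked good after several clean runs. The sketch below illustrates that idea with a simulated flaky test; it is not the tooling used for the bisection above, and the commit names, failure pattern and function names are all hypothetical.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     run_passes: Callable[[str], bool],
                     repeats: int = 5) -> str:
    """Binary search for the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad. A commit is treated as bad as soon as one
    run fails, and as good only after `repeats` clean runs.
    """
    if len(commits) < 2:
        raise ValueError("need at least one known-good and one known-bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        # all() stops at the first failing run, so bad commits are cheap to detect.
        if all(run_passes(commits[mid]) for _ in range(repeats)):
            good = mid
        else:
            bad = mid
    return commits[bad]


def make_flaky_runner(first_bad: int, fail_every: int) -> Callable[[str], bool]:
    """Simulated test: commits at or after `first_bad` fail every
    `fail_every`-th run, earlier commits always pass."""
    counts: dict[str, int] = {}

    def run_passes(commit: str) -> bool:
        index = int(commit[1:])
        counts[commit] = counts.get(commit, 0) + 1
        if index < first_bad:
            return True
        return counts[commit] % fail_every != 0

    return run_passes


commits = [f"c{i}" for i in range(8)]
# Enough repeats to hit the simulated failure: finds the real culprit, c5.
print(first_bad_commit(commits, make_flaky_runner(5, 3), repeats=5))
# Too few repeats: bad commits look good and the search lands on c7 instead.
print(first_bad_commit(commits, make_flaky_runner(5, 3), repeats=2))
```

With too few repeats the search drifts towards newer commits, which is the same false-negative risk as the "three clean runs" case discussed earlier in the thread.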
Cheers,
Richard
Thread overview: 23+ messages
2023-08-14 9:54 Dilemma on changes - merge or not to merge (e.g. 6.4) Richard Purdie
2023-08-15 13:08 ` Paul Gortmaker
2023-08-15 13:38 ` Richard Purdie
2023-08-16 7:55 ` [OE-core] " Rasmus Villemoes
2023-08-18 3:22 ` Paul Gortmaker
2023-08-22 9:31 ` Richard Purdie
[not found] ` <177DAAB2E4C3384A.4797@lists.openembedded.org>
2023-08-22 11:07 ` Richard Purdie
[not found] ` <177DAFEBFB5EB0D2.24073@lists.openembedded.org>
2023-08-22 11:47 ` Richard Purdie
2023-08-22 12:20 ` Mikko Rapeli
2023-08-22 12:28 ` Richard Purdie
2023-08-22 12:31 ` Alexander Kanavin
[not found] ` <177DB4530EBE3FA3.24073@lists.openembedded.org>
2023-08-22 14:49 ` Richard Purdie
[not found] ` <177DBC07E94591CC.4797@lists.openembedded.org>
2023-08-22 21:08 ` Richard Purdie
[not found] ` <177DD0B30D8FEDF8.27837@lists.openembedded.org>
2023-08-22 22:01 ` Richard Purdie
[not found] ` <177DD39B5534099F.27837@lists.openembedded.org>
2023-08-23 21:16 ` Richard Purdie
[not found] ` <177E1FB73F514F09.8058@lists.openembedded.org>
2023-08-24 14:04 ` Richard Purdie
[not found] ` <177E56C1DFAB4DFC.13053@lists.openembedded.org>
2023-08-24 20:18 ` Richard Purdie
2023-08-25 5:04 ` Frédéric Martinsons
2023-08-25 6:27 ` Mikko Rapeli
2023-08-25 6:34 ` Richard Purdie
2023-08-25 7:26 ` Mikko Rapeli
[not found] ` <177E8CC0D944344B.23833@lists.openembedded.org>
2023-08-30 10:43 ` Richard Purdie
[not found] ` <178023427EE7BA0B.20206@lists.openembedded.org>
2023-08-30 13:03 ` Richard Purdie