Openembedded Core Discussions
* Dilemma on changes - merge or not to merge (e.g. 6.4)
From: Richard Purdie @ 2023-08-14  9:54 UTC
  To: openembedded-core; +Cc: Bruce Ashfield, Paul Gortmaker

I'm becoming a little weary/wary of some of the changes that are coming
in. The challenge is that once they merge, issues become the problem of
a very small number of people.

My current dilemma is the 6.4 kernel. People would like it, and ideally
we'd use it for the next release, but there are issues.

I've worked through a few, at least pinning down where the issues were
then resolving them with the help of others (thanks Bruce, Jon, Ross).

Remaining are:
  * an error upon boot on preempt-rt on qemux86-64
     (e.g. https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/7616/steps/36/logs/stdio)
     We'll probably just have to ignore it in parselogs as it has been 
     around for a while and nobody seems interested in fixing it upstream.
  * some random hangs:
     https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/349/steps/12/logs/stdio
     https://autobuilder.yoctoproject.org/typhoon/#/builders/148/builds/354/steps/12/logs/stdio

The latter are rare and intermittent, mainly taking out CI test builds.
Most people aren't affected by them, find them hard to reproduce let
alone fix and will ignore them. That will leave me/Bruce/PaulG holding
the pieces.

I know Bruce spends a ton of time debugging weird things just to get
the kernel to the point we can even consider merging and nobody ever
really sees or appreciates that work :(.

Systemd was a similar challenge recently, multiple patches causing
multiple issues with a significant impact on CI. In that case the
issues weren't intermittent so resolution wasn't so bad.

Rust and reproducibility was given a pass so the rest of the changes
could merge for it. That just meant there was less pressure and the
reproducibility issue is still there with people saying it's too hard.
That issue is now spreading down the chain to other recipes.

The toolchain test reports have thousands of failures nobody is really
looking at. The same goes for the now consistent ltp controllers failures
(previously the reports weren't even consistent!).

I'm worried the access control patches changing the tar format are
going to be destabilising and, once merged, people will move on to other
things leaving any remaining intermittent issues to me. Already we're
seeing things like sstate being blamed as it is easiest to do that. I
end up having to "prove" it isn't that.

There are intermittent ptests on the autobuilder too. I took mdadm
ptest patches on the basis there was help to fix them. We are still seeing
a lot of failures in CI from there. The glib-networking intermittent
failures continue, I know Trevor has tried to dig into those but he is
alone in doing it in code which isn't easy to navigate (and I don't
know how to help there).

As an idea of impact, every time one of these things fails in CI,
someone has to triage that failure. The bug triage team has to triage the
bugs too.

I don't know how we fix this but we really could do with more people
able to dive in and help with these intermittent issues. I'm really,
really apprehensive about merging some patches as I can just tell
they're going to cause pain :(.

Cheers,

Richard




