* Regression/boot failure on 5.16.3
From: Jason Self @ 2022-02-04 0:19 UTC
To: stable
The computer (amd64) fails to boot. The init was stuck at the
synchronization of the time through the network. This began between
5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5.
Git bisect revealed the following. In this case the nonfree firmware is
not present on the system. Blacklisting the iwlwifi module works as a
workaround for now.
6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri Dec 10 11:12:42 2021 +0200
iwlwifi: fix leaks/bad data after failed firmware load
[ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
If firmware load fails after having loaded some parts of the
firmware, e.g. the IML image, then this would leak. For the
host command list we'd end up running into a WARN on the next
attempt to load another firmware image.
Fix this by calling iwl_dealloc_ucode() on failures, and make
that also clear the data so we start fresh on the next round.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa632d94ae2069f85e1bca5e734dce0@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
1 file changed, 8 insertions(+)
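
For readers unfamiliar with bisection, the search that produced the
result above can be illustrated with a toy sketch. This is not how git
implements bisect and not the reporter's actual procedure; the commit
names c1..c5 and the first_bad() helper are hypothetical, and the sketch
assumes a linear history in which every commit after the first bad one
is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an oldest-to-newest sequence.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    badness is monotonic (once bad, always bad afterwards).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]


# Hypothetical history between the two tags; c1..c5 are placeholders.
history = ["v5.16.2", "c1", "c2", "c3", "c4", "c5", "v5.16.3"]
bad_from = history.index("c3")  # pretend c3 introduced the regression

result = first_bad(history, lambda c: history.index(c) >= bad_from)
print(result)
```

Each call to is_bad() stands in for building and booting one kernel, so
the number of test boots grows only logarithmically with the number of
commits between the good and the bad release.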

* Re: Regression/boot failure on 5.16.3
From: Greg KH @ 2022-02-04 7:00 UTC
To: Jason Self; +Cc: stable

On Thu, Feb 03, 2022 at 04:19:59PM -0800, Jason Self wrote:
> The computer (amd64) fails to boot. The init was stuck at the
> synchronization of the time through the network. This began between
> 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5.
> Git bisect revealed the following. In this case the nonfree firmware is
> not present on the system. Blacklisting the iwlwifi module works as a
> workaround for now.
>
> 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
> commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
> Author: Johannes Berg <johannes.berg@intel.com>
> Date: Fri Dec 10 11:12:42 2021 +0200
>
> iwlwifi: fix leaks/bad data after failed firmware load
>
> [ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
>
> If firmware load fails after having loaded some parts of the
> firmware, e.g. the IML image, then this would leak. For the
> host command list we'd end up running into a WARN on the next
> attempt to load another firmware image.
>
> Fix this by calling iwl_dealloc_ucode() on failures, and make
> that also clear the data so we start fresh on the next round.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa632d94ae2069f85e1bca5e734dce0@changeid
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
> drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

Please cc: the authors of this commit, and the upstream wireless
developers so they can help you out here as I think the same issue shows
up in 5.17-rc2, right?

thanks,

greg k-h

* Re: Regression/boot failure on 5.16.3
From: Thorsten Leemhuis @ 2022-02-04 8:48 UTC
To: Jason Self, stable; +Cc: regressions@lists.linux.dev

[TLDR: I'm adding this regression to regzbot, the Linux kernel
regression tracking bot; most text you find below is compiled from a few
template paragraphs some of you might have seen already.]

Hi, this is your Linux kernel regression tracker speaking.

Adding the regression mailing list to the list of recipients, as it
should be in the loop for all regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

On 04.02.22 01:19, Jason Self wrote:
> The computer (amd64) fails to boot. The init was stuck at the
> synchronization of the time through the network. This began between
> 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5.
> Git bisect revealed the following. In this case the nonfree firmware is
> not present on the system. Blacklisting the iwlwifi module works as a
> workaround for now.
>
> 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
> commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
> Author: Johannes Berg <johannes.berg@intel.com>
> Date: Fri Dec 10 11:12:42 2021 +0200

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
#regzbot title net: iwlwifi: system fails to boot since 5.16.3
#regzbot ignore-activity

Reminder: when fixing the issue, please add a 'Link:' tag with the URL
to the report (the parent of this mail) using the kernel.org redirector,
as explained in 'Documentation/process/submitting-patches.rst'. Regzbot
then will automatically mark the regression as resolved once the fix
lands in the appropriate tree. For more details about regzbot see footer.

Sending this to everyone that got the initial report, to make all aware
of the tracking. I also hope that messages like this motivate people to
directly get at least the regression mailing list and ideally even
regzbot involved when dealing with regressions, as messages like this
wouldn't be needed then.

Don't worry, I'll send further messages wrt this regression just to
the lists (with a tag in the subject so people can filter them away), as
long as they are intended just for regzbot. With a bit of luck no such
messages will be needed anyway.

Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat)

P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
on my table. I can only look briefly into most of them. Unfortunately
therefore I sometimes will get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply, that's in everyone's interest.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CC on
all further activities wrt this regression.

> iwlwifi: fix leaks/bad data after failed firmware load
>
> [ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
>
> If firmware load fails after having loaded some parts of the
> firmware, e.g. the IML image, then this would leak. For the
> host command list we'd end up running into a WARN on the next
> attempt to load another firmware image.
>
> Fix this by calling iwl_dealloc_ucode() on failures, and make
> that also clear the data so we start fresh on the next round.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa632d94ae2069f85e1bca5e734dce0@changeid
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
> drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

---
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and/or the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents will explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
tell #regzbot about it in the report, as that will ensure the regression
gets on the radar of regzbot and the regression tracker. That's in your
interest, as they will make sure the report won't fall through the
cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include a 'Link:' tag to the report in the commit message, as explained
in Documentation/process/submitting-patches.rst
That aspect was recently made more explicit in commit 1f57bd42b77c:
https://git.kernel.org/linus/1f57bd42b77c

* Re: Regression/boot failure on 5.16.3
From: Stefan Agner @ 2022-02-08 8:50 UTC
To: Jason Self, Greg KH, Johannes Berg; +Cc: stable, regressions

On 2022-02-04 01:19, Jason Self wrote:
> The computer (amd64) fails to boot. The init was stuck at the
> synchronization of the time through the network. This began between
> 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5.
> Git bisect revealed the following. In this case the nonfree firmware is
> not present on the system. Blacklisting the iwlwifi module works as a
> workaround for now.

I have several reports of Intel NUC 10th/11th gen not booting/crashing
during boot after updating to 5.10.96 (from 5.10.91). At least one stack
trace shows iwl_dealloc_ucode in the call path. The below commit is part
of 5.10.96. So this regression seems to affect not only the 5.16 series.

Link:
https://github.com/home-assistant/operating-system/issues/1739#issuecomment-1032013069

--
Stefan

>
> 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
> commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
> Author: Johannes Berg <johannes.berg@intel.com>
> Date: Fri Dec 10 11:12:42 2021 +0200
>
> iwlwifi: fix leaks/bad data after failed firmware load
>
> [ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
>
> If firmware load fails after having loaded some parts of the
> firmware, e.g. the IML image, then this would leak. For the
> host command list we'd end up running into a WARN on the next
> attempt to load another firmware image.
>
> Fix this by calling iwl_dealloc_ucode() on failures, and make
> that also clear the data so we start fresh on the next round.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa632d94ae2069f85e1bca5e734dce0@changeid
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
> drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

* Re: Regression/boot failure on 5.16.3
From: Jason Self @ 2022-02-08 18:05 UTC
To: Stefan Agner, Greg KH, Johannes Berg, stable, regressions

On Tue, 08 Feb 2022 09:50:59 +0100 Stefan Agner <stefan@agner.ch> wrote:
> On 2022-02-04 01:19, Jason Self wrote:
> [...]
>
> I have several reports of Intel NUC 10th/11th gen not booting/crashing
> during boot after updating to 5.10.96 (from 5.10.91). At least one
> stack trace shows iwl_dealloc_ucode in the call path. The below
> commit is part of 5.10.96. So this regression seems to affect not only
> the 5.16 series.
>
> Link:
> https://github.com/home-assistant/operating-system/issues/1739#issuecomment-1032013069

Yes, it does appear to affect multiple versions; at least 5.17-rc2,
5.16, 5.15, and as you say 5.10.

I can confirm that this patch addresses it on 5.16:
https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/T/#t

It appears desirable to apply the patch to all of the stable versions
that need it, after it's gone into Linus's tree to also address the
matter with the upcoming 5.17 series.

* Re: Regression/boot failure on 5.16.3
From: Thorsten Leemhuis @ 2022-02-08 18:22 UTC
To: Jason Self, Stefan Agner, Greg KH, Johannes Berg, stable, regressions

On 08.02.22 19:05, Jason Self wrote:
> On Tue, 08 Feb 2022 09:50:59 +0100
> Stefan Agner <stefan@agner.ch> wrote:
>
>> On 2022-02-04 01:19, Jason Self wrote:
>> [...]
>>
>> I have several reports of Intel NUC 10th/11th gen not booting/crashing
>> during boot after updating to 5.10.96 (from 5.10.91). At least one
>> stack trace shows iwl_dealloc_ucode in the call path. The below
>> commit is part of 5.10.96. So this regression seems to affect not only
>> the 5.16 series.
>>
>> Link:
>> https://github.com/home-assistant/operating-system/issues/1739#issuecomment-1032013069
>
> Yes, it does appear to affect multiple versions; at least 5.17-rc2,
> 5.16, 5.15, and as you say 5.10.
>
> I can confirm that this patch addresses it on 5.16:
> https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/T/#t

Thx for pointing to the thread!

#regzbot monitor: https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/

> It appears desirable to apply the patch to all of the stable versions
> that need it, after it's gone into Linus's tree to also address the
> matter with the upcoming 5.17 series.

FWIW, the patch is marked for backporting already, it just needs to get
merged to mainline first.

Ciao, Thorsten