* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
[not found] ` <e6f227ee701e1ee37e8f568b1310d240a2b8935a.camel@sipsolutions.net>
@ 2023-12-07 10:16 ` Przemek Kitszel
2023-12-07 17:40 ` Jakub Kicinski
0 siblings, 1 reply; 9+ messages in thread
From: Przemek Kitszel @ 2023-12-07 10:16 UTC (permalink / raw)
To: Johannes Berg, Jakub Kicinski
Cc: netdev, Marc MERLIN, intel-wired-lan@lists.osuosl.org,
Heiner Kallweit
On 12/6/23 17:46, Johannes Berg wrote:
> On Wed, 2023-12-06 at 08:44 -0800, Jakub Kicinski wrote:
>> On Wed, 6 Dec 2023 11:39:32 +0100 Johannes Berg wrote:
>>> As reported by Marc MERLIN, at least one driver (igc) wants or
>>> needs to acquire the RTNL inside suspend/resume ops, which can
>>> be called from here in ethtool if runtime PM is enabled.
>>>
>>> Allow this by doing runtime PM transitions without the RTNL
>>> held. For the ioctl to have the same operations order, this
>>> required reworking the code to separately check validity and
>>> do the operation. For the netlink code, this now has to do
>>> the runtime_pm_put a bit later.
>>
>> I was really, really hoping that this would serve as a motivation
>> for Intel to sort out the igb/igc implementation. The flow AFAICT
>> is ndo_open() starts the NIC, the calls pm_sus, which shuts the NIC
>> back down immediately (!?) then it schedules a link check from a work
>> queue, which opens it again (!?). It's a source of never ending bugs.
>>
>
> Well, I work there, but ... WiFi something else entirely. Marc just got
> lucky I spotted an issue in the logs ;-)
>
> I'll let you guys take it from here ...
>
> johannes
>
I have let know our igc TL, architect, and anybody that could be
interested via cc: IWL.
And I'm happy that this could be done at relaxed pace thanks to Johannes
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-07 10:16 ` [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL Przemek Kitszel
@ 2023-12-07 17:40 ` Jakub Kicinski
2023-12-11 4:52 ` Marc MERLIN
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2023-12-07 17:40 UTC (permalink / raw)
To: Przemek Kitszel
Cc: netdev, Johannes Berg, Marc MERLIN,
intel-wired-lan@lists.osuosl.org, Heiner Kallweit
On Thu, 7 Dec 2023 11:16:10 +0100 Przemek Kitszel wrote:
> I have let know our igc TL, architect, and anybody that could be
> interested via cc: IWL. And I'm happy that this could be done at
> relaxed pace thanks to Johannes
I think you may be expecting us to take Johannes's patch.
It's still on the table, but to make things clear -
upstream we prefer to wait for the "real fix", so if we agree
that fixing igb/igc is a better way (as Heiner pointed out on previous
version PM functions are called by the stack under rtnl elsewhere too,
just not while device is open) - we'll wait for that. Especially
that I'm 80% I complained about the PM in those drivers in
the past and nobody seemed to care. It's a constant source of rtnl
deadlocks.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-07 17:40 ` Jakub Kicinski
@ 2023-12-11 4:52 ` Marc MERLIN
2023-12-15 13:42 ` Heiner Kallweit
0 siblings, 1 reply; 9+ messages in thread
From: Marc MERLIN @ 2023-12-11 4:52 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Przemek Kitszel, Johannes Berg, Heiner Kallweit,
intel-wired-lan@lists.osuosl.org, netdev
On Thu, Dec 07, 2023 at 09:40:21AM -0800, Jakub Kicinski wrote:
> On Thu, 7 Dec 2023 11:16:10 +0100 Przemek Kitszel wrote:
> > I have let know our igc TL, architect, and anybody that could be
> > interested via cc: IWL. And I'm happy that this could be done at
> > relaxed pace thanks to Johannes
>
> I think you may be expecting us to take Johannes's patch.
> It's still on the table, but to make things clear -
> upstream we prefer to wait for the "real fix", so if we agree
> that fixing igb/igc is a better way (as Heiner pointed out on previous
> version PM functions are called by the stack under rtnl elsewhere too,
> just not while device is open) - we'll wait for that. Especially
> that I'm 80% I complained about the PM in those drivers in
> the past and nobody seemed to care. It's a constant source of rtnl
> deadlocks.
For whatever it's worth, I want to be clear that all stock kernels
are 100% unusable on lenovo P17gen2 because of this deadlock and that
without the temporary patch, my laptop would be usuable.
It was also a risk of data loss due to repeated deadlocks and unclean
shutdowns.
I cannot say what the correct fix is, but I am definitely hoping you
will accept some solution for the next stable kernel.
Thank you
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-11 4:52 ` Marc MERLIN
@ 2023-12-15 13:42 ` Heiner Kallweit
2023-12-15 17:46 ` Marc MERLIN
0 siblings, 1 reply; 9+ messages in thread
From: Heiner Kallweit @ 2023-12-15 13:42 UTC (permalink / raw)
To: Marc MERLIN, Jakub Kicinski
Cc: Przemek Kitszel, Johannes Berg, intel-wired-lan@lists.osuosl.org,
netdev
On 11.12.2023 05:52, Marc MERLIN wrote:
> On Thu, Dec 07, 2023 at 09:40:21AM -0800, Jakub Kicinski wrote:
>> On Thu, 7 Dec 2023 11:16:10 +0100 Przemek Kitszel wrote:
>>> I have let know our igc TL, architect, and anybody that could be
>>> interested via cc: IWL. And I'm happy that this could be done at
>>> relaxed pace thanks to Johannes
>>
>> I think you may be expecting us to take Johannes's patch.
>> It's still on the table, but to make things clear -
>> upstream we prefer to wait for the "real fix", so if we agree
>> that fixing igb/igc is a better way (as Heiner pointed out on previous
>> version PM functions are called by the stack under rtnl elsewhere too,
>> just not while device is open) - we'll wait for that. Especially
>> that I'm 80% I complained about the PM in those drivers in
>> the past and nobody seemed to care. It's a constant source of rtnl
>> deadlocks.
>
> For whatever it's worth, I want to be clear that all stock kernels
> are 100% unusable on lenovo P17gen2 because of this deadlock and that
> without the temporary patch, my laptop would be usuable.
> It was also a risk of data loss due to repeated deadlocks and unclean
> shutdowns.
>
Why don't you simply disable runtime pm for the affected device as a
workaround? This can be done via sysfs.
> I cannot say what the correct fix is, but I am definitely hoping you
> will accept some solution for the next stable kernel.
>
> Thank you
> Marc
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-15 13:42 ` Heiner Kallweit
@ 2023-12-15 17:46 ` Marc MERLIN
2023-12-24 16:30 ` Marc MERLIN
0 siblings, 1 reply; 9+ messages in thread
From: Marc MERLIN @ 2023-12-15 17:46 UTC (permalink / raw)
To: Heiner Kallweit
Cc: Jakub Kicinski, Johannes Berg, netdev,
intel-wired-lan@lists.osuosl.org, Przemek Kitszel
On Fri, Dec 15, 2023 at 02:42:01PM +0100, Heiner Kallweit wrote:
> Why don't you simply disable runtime pm for the affected device as a
> workaround? This can be done via sysfs.
1) because I didn't know what the exact bug was and how to work around it :)
2) without power management, the battery use is not good, but yes not
good is better than laptop crashing :)
That said, if it's only affecting wired ethernet, I can also unload the
igc module until I actually need wired ethernet, which is sometimes but
not often.
Thanks for your suggestion
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-15 17:46 ` Marc MERLIN
@ 2023-12-24 16:30 ` Marc MERLIN
2023-12-24 23:12 ` Heiner Kallweit
0 siblings, 1 reply; 9+ messages in thread
From: Marc MERLIN @ 2023-12-24 16:30 UTC (permalink / raw)
To: Heiner Kallweit
Cc: Jakub Kicinski, Johannes Berg, netdev,
intel-wired-lan@lists.osuosl.org, Przemek Kitszel
On Fri, Dec 15, 2023 at 09:46:34AM -0800, Marc MERLIN wrote:
> On Fri, Dec 15, 2023 at 02:42:01PM +0100, Heiner Kallweit wrote:
> > Why don't you simply disable runtime pm for the affected device as a
> > workaround? This can be done via sysfs.
>
> 1) because I didn't know what the exact bug was and how to work around it :)
Mmmh, so I need to test an ubuntu kernel (6.5.0-14) because of sound
issues in mainline TOT, and I can't boot the kernel to completion
without hititng this hang bug. I'm not exactly sure which part of the
boot triggers it.
I can't patch that kernel easily. How exactly do I disable runtime PM
from the kernel command line for "that device" which I'm not even sure
which one it is. If it's the eth device, I already removed the igc
module to prevent it from loading, and I also removed the ethtool
binary, but I'm still getting the hang.
On the plus side, with 6.6.8 and the old patch which I understand is not
the ideal solution, I can confirm that I've been running problem free
until now, so thanks again for that interim patch.
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-24 16:30 ` Marc MERLIN
@ 2023-12-24 23:12 ` Heiner Kallweit
2023-12-25 8:03 ` Sasha Neftin
0 siblings, 1 reply; 9+ messages in thread
From: Heiner Kallweit @ 2023-12-24 23:12 UTC (permalink / raw)
To: Marc MERLIN
Cc: Jakub Kicinski, Johannes Berg, netdev,
intel-wired-lan@lists.osuosl.org, Przemek Kitszel
On 24.12.2023 17:30, Marc MERLIN wrote:
> On Fri, Dec 15, 2023 at 09:46:34AM -0800, Marc MERLIN wrote:
>> On Fri, Dec 15, 2023 at 02:42:01PM +0100, Heiner Kallweit wrote:
>>> Why don't you simply disable runtime pm for the affected device as a
>>> workaround? This can be done via sysfs.
>>
>> 1) because I didn't know what the exact bug was and how to work around it :)
>
> Mmmh, so I need to test an ubuntu kernel (6.5.0-14) because of sound
> issues in mainline TOT, and I can't boot the kernel to completion
> without hititng this hang bug. I'm not exactly sure which part of the
> boot triggers it.
>
> I can't patch that kernel easily. How exactly do I disable runtime PM
> from the kernel command line for "that device" which I'm not even sure
Change <device>/power/control from "auto" to "on".
> which one it is. If it's the eth device, I already removed the igc
> module to prevent it from loading, and I also removed the ethtool
> binary, but I'm still getting the hang.
>
> On the plus side, with 6.6.8 and the old patch which I understand is not
> the ideal solution, I can confirm that I've been running problem free
> until now, so thanks again for that interim patch.
>
> Thanks,
> Marc
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-24 23:12 ` Heiner Kallweit
@ 2023-12-25 8:03 ` Sasha Neftin
2023-12-25 11:21 ` Marc MERLIN
0 siblings, 1 reply; 9+ messages in thread
From: Sasha Neftin @ 2023-12-25 8:03 UTC (permalink / raw)
To: Heiner Kallweit, Marc MERLIN
Cc: Jakub Kicinski, Johannes Berg, intel-wired-lan@lists.osuosl.org,
Przemek Kitszel, netdev
On 25/12/2023 1:12, Heiner Kallweit wrote:
> On 24.12.2023 17:30, Marc MERLIN wrote:
>> On Fri, Dec 15, 2023 at 09:46:34AM -0800, Marc MERLIN wrote:
>>> On Fri, Dec 15, 2023 at 02:42:01PM +0100, Heiner Kallweit wrote:
>>>> Why don't you simply disable runtime pm for the affected device as a
>>>> workaround? This can be done via sysfs.
>>>
>>> 1) because I didn't know what the exact bug was and how to work around it :)
>>
>> Mmmh, so I need to test an ubuntu kernel (6.5.0-14) because of sound
>> issues in mainline TOT, and I can't boot the kernel to completion
>> without hititng this hang bug. I'm not exactly sure which part of the
>> boot triggers it.
>>
>> I can't patch that kernel easily. How exactly do I disable runtime PM
>> from the kernel command line for "that device" which I'm not even sure
>
> Change <device>/power/control from "auto" to "on".
Need to figure out your controller location in a file system via
lspci/lspci -t and then change to "on"
For example: echo on >
/sys/devices/pci0000\:00/0000\:00\:1c.0/0000\:ae\:00.0/power/control
We are starting to look at this problem, but I can't reproduce the
problem on my machines yet.
>
>> which one it is. If it's the eth device, I already removed the igc
>> module to prevent it from loading, and I also removed the ethtool
>> binary, but I'm still getting the hang.
>>
>> On the plus side, with 6.6.8 and the old patch which I understand is not
>> the ideal solution, I can confirm that I've been running problem free
>> until now, so thanks again for that interim patch.
>>
>> Thanks,
>> Marc
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL
2023-12-25 8:03 ` Sasha Neftin
@ 2023-12-25 11:21 ` Marc MERLIN
0 siblings, 0 replies; 9+ messages in thread
From: Marc MERLIN @ 2023-12-25 11:21 UTC (permalink / raw)
To: Sasha Neftin
Cc: netdev, intel-wired-lan@lists.osuosl.org, Przemek Kitszel,
Jakub Kicinski, Johannes Berg, Heiner Kallweit
On Mon, Dec 25, 2023 at 10:03:23AM +0200, Sasha Neftin wrote:
> > > I can't patch that kernel easily. How exactly do I disable runtime PM
> > > from the kernel command line for "that device" which I'm not even sure
> >
> > Change <device>/power/control from "auto" to "on".
>
> Need to figure out your controller location in a file system via lspci/lspci
> -t and then change to "on"
> For example: echo on >
> /sys/devices/pci0000\:00/0000\:00\:1c.0/0000\:ae\:00.0/power/control
>
> We are starting to look at this problem, but I can't reproduce the problem
> on my machines yet.
Thanks. I realized it was going to be hard either way if the boot hangs
before I get to a command prompt, which was what was happening
yesterday.
I had to boot ubuntu to debug a sound issue, and it was very tricky
since most of the time it hung before I got to a command prompt, but I
was finally able to get it to work long enough to debug the sound issue
and go back to my self built kernel to port over the sound config I
needed.
I wish I could tell you exactly how to reproduct this in a more useful
way, sorry about that.
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-12-26 16:30 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20231206113934.8d7819857574.I2deb5804ef1739a2af307283d320ef7d82456494@changeid>
[not found] ` <20231206084448.53b48c49@kernel.org>
[not found] ` <e6f227ee701e1ee37e8f568b1310d240a2b8935a.camel@sipsolutions.net>
2023-12-07 10:16 ` [Intel-wired-lan] [PATCH net v3] net: ethtool: do runtime PM outside RTNL Przemek Kitszel
2023-12-07 17:40 ` Jakub Kicinski
2023-12-11 4:52 ` Marc MERLIN
2023-12-15 13:42 ` Heiner Kallweit
2023-12-15 17:46 ` Marc MERLIN
2023-12-24 16:30 ` Marc MERLIN
2023-12-24 23:12 ` Heiner Kallweit
2023-12-25 8:03 ` Sasha Neftin
2023-12-25 11:21 ` Marc MERLIN
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox