netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Strangeness in ehea network driver's shutdown
@ 2022-10-01 14:31 Uwe Kleine-König
  2022-10-03 16:36 ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Uwe Kleine-König @ 2022-10-01 14:31 UTC (permalink / raw)
  To: Douglas Miller; +Cc: netdev, Greg Kroah-Hartman, Rafael J. Wysocki, kernel

[-- Attachment #1: Type: text/plain, Size: 1347 bytes --]

Hello,

while doing some cleanup I stumbled over a problem in the ehea network
driver.

In the driver's probe function (ehea_probe_adapter() via
ehea_register_memory_hooks()) a reboot notifier is registered. When this
notifier is triggered (ehea_reboot_notifier()) it unregisters the
driver. I'm unsure what is the order of the actions triggered by that.
Maybe the driver is unregistered twice if there are two bound devices?
Or the reboot notifier is called under a lock and unregistering the
driver (and so the devices) tries to unregister the notifier that is
currently locked and so results in a deadlock? Maybe Greg or Rafael can
tell about the details here?

Whatever the effect is, it's strange. It makes me wonder why it's
necessary to free all the resources of the driver on reboot?! I don't
know anything about the specifics of the affected machines, but I guess
doing just the necessary stuff on reboot would be easier to understand,
quicker to execute and doesn't have such strange side effects.

With my lack of knowledge about the machine, the best I can do is report
my findings. So don't expect a patch or testing from my side.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Strangeness in ehea network driver's shutdown
  2022-10-01 14:31 Strangeness in ehea network driver's shutdown Uwe Kleine-König
@ 2022-10-03 16:36 ` Jakub Kicinski
  2022-12-06 16:49   ` Guilherme G. Piccoli
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2022-10-03 16:36 UTC (permalink / raw)
  To: Uwe Kleine-König, Douglas Miller
  Cc: netdev, Greg Kroah-Hartman, Rafael J. Wysocki, kernel,
	Guilherme G. Piccoli

On Sat, 1 Oct 2022 16:31:31 +0200 Uwe Kleine-König wrote:
> Hello,
> 
> while doing some cleanup I stumbled over a problem in the ehea network
> driver.
> 
> In the driver's probe function (ehea_probe_adapter() via
> ehea_register_memory_hooks()) a reboot notifier is registered. When this
> notifier is triggered (ehea_reboot_notifier()) it unregisters the
> driver. I'm unsure what is the order of the actions triggered by that.
> Maybe the driver is unregistered twice if there are two bound devices?
> Or the reboot notifier is called under a lock and unregistering the
> driver (and so the devices) tries to unregister the notifier that is
> currently locked and so results in a deadlock? Maybe Greg or Rafael can
> tell about the details here?
> 
> Whatever the effect is, it's strange. It makes me wonder why it's
> necessary to free all the resources of the driver on reboot?! I don't
> know anything about the specifics of the affected machines, but I guess
> doing just the necessary stuff on reboot would be easier to understand,
> quicker to execute and doesn't have such strange side effects.
> 
> With my lack of knowledge about the machine, the best I can do is report
> my findings. So don't expect a patch or testing from my side.

Last meaningful commit to this driver FWIW:

commit 29ab5a3b94c87382da06db88e96119911d557293
Author: Guilherme G. Piccoli <kernel@gpiccoli.net>
Date:   Thu Nov 3 08:16:20 2016 -0200

Also that's the last time we heard from Douglas AFAICT..

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Strangeness in ehea network driver's shutdown
  2022-10-03 16:36 ` Jakub Kicinski
@ 2022-12-06 16:49   ` Guilherme G. Piccoli
  2022-12-06 17:02     ` Thadeu Lima de Souza Cascardo
  0 siblings, 1 reply; 5+ messages in thread
From: Guilherme G. Piccoli @ 2022-12-06 16:49 UTC (permalink / raw)
  To: Jakub Kicinski, Uwe Kleine-König
  Cc: Douglas Miller, netdev, Greg Kroah-Hartman, Rafael J. Wysocki,
	kernel, gpiccoli, cascardo

On Mon, Oct 3, 2022 at 1:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 1 Oct 2022 16:31:31 +0200 Uwe Kleine-König wrote:
> > Hello,
> >
> > while doing some cleanup I stumbled over a problem in the ehea network
> > driver.
> >
> > In the driver's probe function (ehea_probe_adapter() via
> > ehea_register_memory_hooks()) a reboot notifier is registered. When this
> > notifier is triggered (ehea_reboot_notifier()) it unregisters the
> > driver. I'm unsure what is the order of the actions triggered by that.
> > Maybe the driver is unregistered twice if there are two bound devices?
> > Or the reboot notifier is called under a lock and unregistering the
> > driver (and so the devices) tries to unregister the notifier that is
> > currently locked and so results in a deadlock? Maybe Greg or Rafael can
> > tell about the details here?
> >
> > Whatever the effect is, it's strange. It makes me wonder why it's
> > necessary to free all the resources of the driver on reboot?! I don't
> > know anything about the specifics of the affected machines, but I guess
> > doing just the necessary stuff on reboot would be easier to understand,
> > quicker to execute and doesn't have such strange side effects.
> >
> > With my lack of knowledge about the machine, the best I can do is report
> > my findings. So don't expect a patch or testing from my side.
>
> Last meaningful commit to this driver FWIW:
>
> commit 29ab5a3b94c87382da06db88e96119911d557293
> Author: Guilherme G. Piccoli <kernel@gpiccoli.net>
> Date:   Thu Nov 3 08:16:20 2016 -0200
>
> Also that's the last time we heard from Douglas AFAICT..

Hey folks, thanks for CCing me.

I've worked a bit with ehea some time ago, will need to dig up a bit
to understand things again.
But I'm cc'ing Cascardo - which have(/had?) a more deep knowledge on
that - in the hopes he can discuss the issue without requiring that
much study heh

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Strangeness in ehea network driver's shutdown
  2022-12-06 16:49   ` Guilherme G. Piccoli
@ 2022-12-06 17:02     ` Thadeu Lima de Souza Cascardo
  2022-12-07  8:23       ` Uwe Kleine-König
  0 siblings, 1 reply; 5+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2022-12-06 17:02 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Jakub Kicinski, Uwe Kleine-König, Douglas Miller, netdev,
	Greg Kroah-Hartman, Rafael J. Wysocki, kernel, gpiccoli

On Tue, Dec 06, 2022 at 01:49:01PM -0300, Guilherme G. Piccoli wrote:
> On Mon, Oct 3, 2022 at 1:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Sat, 1 Oct 2022 16:31:31 +0200 Uwe Kleine-König wrote:
> > > Hello,
> > >
> > > while doing some cleanup I stumbled over a problem in the ehea network
> > > driver.
> > >
> > > In the driver's probe function (ehea_probe_adapter() via
> > > ehea_register_memory_hooks()) a reboot notifier is registered. When this
> > > notifier is triggered (ehea_reboot_notifier()) it unregisters the
> > > driver. I'm unsure what is the order of the actions triggered by that.
> > > Maybe the driver is unregistered twice if there are two bound devices?

I see how you would think it might be called for every bound device. That's
because ehea_register_memory_hooks is called by ehea_probe_adapter. However,
there is this test here that leads it the reboot_notifier to be registered only
once:

[...]
static int ehea_register_memory_hooks(void)
{
	int ret = 0;

	if (atomic_inc_return(&ehea_memory_hooks_registered) > 1)
	^^^^^^^^^^^^^^^^^^^^^^
		return 0;
[...]


> > > Or the reboot notifier is called under a lock and unregistering the
> > > driver (and so the devices) tries to unregister the notifier that is
> > > currently locked and so results in a deadlock? Maybe Greg or Rafael can
> > > tell about the details here?
> > >
> > > Whatever the effect is, it's strange. It makes me wonder why it's
> > > necessary to free all the resources of the driver on reboot?! I don't

As for why:

commit 2a6f4e4983918b18fe5d3fb364afe33db7139870
Author: Jan-Bernd Themann <ossthema@de.ibm.com>
Date:   Fri Oct 26 14:37:28 2007 +0200

    ehea: add kexec support
    
    eHEA resources that are allocated via H_CALLs have a unique identifier each.
    These identifiers are necessary to free the resources. A reboot notifier
    is used to free all eHEA resources before the indentifiers get lost, i.e
    before kexec starts a new kernel.
    
    Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
    Signed-off-by: Jeff Garzik <jeff@garzik.org>

> > > know anything about the specifics of the affected machines, but I guess
> > > doing just the necessary stuff on reboot would be easier to understand,
> > > quicker to execute and doesn't have such strange side effects.
> > >
> > > With my lack of knowledge about the machine, the best I can do is report
> > > my findings. So don't expect a patch or testing from my side.
> >
> > Last meaningful commit to this driver FWIW:
> >
> > commit 29ab5a3b94c87382da06db88e96119911d557293
> > Author: Guilherme G. Piccoli <kernel@gpiccoli.net>
> > Date:   Thu Nov 3 08:16:20 2016 -0200
> >
> > Also that's the last time we heard from Douglas AFAICT..
> 
> Hey folks, thanks for CCing me.
> 
> I've worked a bit with ehea some time ago, will need to dig up a bit
> to understand things again.
> But I'm cc'ing Cascardo - which have(/had?) a more deep knowledge on
> that - in the hopes he can discuss the issue without requiring that
> much study heh
> 
> Cheers,
> 
> 
> Guilherme

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Strangeness in ehea network driver's shutdown
  2022-12-06 17:02     ` Thadeu Lima de Souza Cascardo
@ 2022-12-07  8:23       ` Uwe Kleine-König
  0 siblings, 0 replies; 5+ messages in thread
From: Uwe Kleine-König @ 2022-12-07  8:23 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: Guilherme G. Piccoli, Rafael J. Wysocki, netdev, Douglas Miller,
	gpiccoli, kernel, Greg Kroah-Hartman, Jakub Kicinski

[-- Attachment #1: Type: text/plain, Size: 2679 bytes --]

On Tue, Dec 06, 2022 at 02:02:09PM -0300, Thadeu Lima de Souza Cascardo wrote:
> On Tue, Dec 06, 2022 at 01:49:01PM -0300, Guilherme G. Piccoli wrote:
> > On Mon, Oct 3, 2022 at 1:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Sat, 1 Oct 2022 16:31:31 +0200 Uwe Kleine-König wrote:
> > > > Hello,
> > > >
> > > > while doing some cleanup I stumbled over a problem in the ehea network
> > > > driver.
> > > >
> > > > In the driver's probe function (ehea_probe_adapter() via
> > > > ehea_register_memory_hooks()) a reboot notifier is registered. When this
> > > > notifier is triggered (ehea_reboot_notifier()) it unregisters the
> > > > driver. I'm unsure what is the order of the actions triggered by that.
> > > > Maybe the driver is unregistered twice if there are two bound devices?
> 
> I see how you would think it might be called for every bound device. That's
> because ehea_register_memory_hooks is called by ehea_probe_adapter. However,
> there is this test here that leads it the reboot_notifier to be registered only
> once:
> 
> [...]
> static int ehea_register_memory_hooks(void)
> {
> 	int ret = 0;
> 
> 	if (atomic_inc_return(&ehea_memory_hooks_registered) > 1)
> 	^^^^^^^^^^^^^^^^^^^^^^
> 		return 0;
> [...]

Ah, I see.

> > > > Or the reboot notifier is called under a lock and unregistering the
> > > > driver (and so the devices) tries to unregister the notifier that is
> > > > currently locked and so results in a deadlock? Maybe Greg or Rafael can
> > > > tell about the details here?
> > > >
> > > > Whatever the effect is, it's strange. It makes me wonder why it's
> > > > necessary to free all the resources of the driver on reboot?! I don't
> 
> As for why:
> 
> commit 2a6f4e4983918b18fe5d3fb364afe33db7139870
> Author: Jan-Bernd Themann <ossthema@de.ibm.com>
> Date:   Fri Oct 26 14:37:28 2007 +0200
> 
>     ehea: add kexec support
>     
>     eHEA resources that are allocated via H_CALLs have a unique identifier each.
>     These identifiers are necessary to free the resources. A reboot notifier
>     is used to free all eHEA resources before the indentifiers get lost, i.e
>     before kexec starts a new kernel.
>     
>     Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
>     Signed-off-by: Jeff Garzik <jeff@garzik.org>

I don't understand that, but that's fine for me.

As you're happy with the state as is, I consider the Case closed. Thanks
for looking into my bug report.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-12-07  8:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-01 14:31 Strangeness in ehea network driver's shutdown Uwe Kleine-König
2022-10-03 16:36 ` Jakub Kicinski
2022-12-06 16:49   ` Guilherme G. Piccoli
2022-12-06 17:02     ` Thadeu Lima de Souza Cascardo
2022-12-07  8:23       ` Uwe Kleine-König

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).