netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: tg3: issue for reboot/kexec
       [not found] <42BC9F09.1060002@us.ibm.com>
@ 2005-06-25  0:46 ` randy_dunlap
  2005-06-25  3:33   ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: randy_dunlap @ 2005-06-25  0:46 UTC (permalink / raw)
  To: Haren Myneni; +Cc: netdev, fastboot, linux-pm

[-- Attachment #1: Type: text/plain, Size: 1534 bytes --]

On Fri, 24 Jun 2005 17:02:17 -0700 Haren Myneni wrote:

| 
| Hello,
|     The kexec boot on power blade (JS20) is not successfull without 
| doing 'ifdown <ethernet device>'.  Based on my initial look in tg3 code, 
| the driver does not have shutdown nor reboot notifier code unlike in 
| other drivers. Hence, I added the the following patch (based on e1000) 
| and it is working. Is it OK to include this patch? If not, any help to 
| track down the issue would be appreciated.
| 
| My system is having the following cards:
| 0000:11:01.0 Ethernet controller: Broadcom Corporation NetXtreme 
| BCM5704S Gigabit Ethernet (rev 03)
| 0000:11:01.1 Ethernet controller: Broadcom Corporation NetXtreme 
| BCM5704S Gigabit Ethernet (rev 03)
| 
| lspci -n -s 0000:11:01
| 0000:11:01.0 Class 0200: 14e4:16a8 (rev 03)
| 0000:11:01.1 Class 0200: 14e4:16a8 (rev 03)

[adding fastboot & linux-pm mailing lists]

There was just a SCSI driver patch that tried to use a reboot
notifier on shutdown and the patch got this comment:

Subject: Re: [PATCH] 2.6 aacraid: shutdown notification
Message-ID: <42BC6D6F.5090402@us.ibm.com>
From:	Mark Haverkamp <markh@osdl.org>

and reply From:	Brian King <brking@us.ibm.com>:
| The pci_driver->shutdown method should be used instead of registering
| a reboot notifier.

so is there a good reason that network drivers should use the
reboot notifier instead of pci_driver->shutdown,
or should we be converting drivers to use pci_driver->shutdown
instead?
My quick look favors pci_driver->shutdown.

---
~Randy

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Re: tg3: issue for reboot/kexec
  2005-06-25  0:46 ` tg3: issue for reboot/kexec randy_dunlap
@ 2005-06-25  3:33   ` Greg KH
  2005-06-30 23:21     ` [Fastboot] " Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2005-06-25  3:33 UTC (permalink / raw)
  To: randy_dunlap; +Cc: netdev, fastboot, Haren Myneni, linux-pm

[-- Attachment #1: Type: text/plain, Size: 1751 bytes --]

On Fri, Jun 24, 2005 at 05:46:52PM -0700, randy_dunlap wrote:
> On Fri, 24 Jun 2005 17:02:17 -0700 Haren Myneni wrote:
> 
> | 
> | Hello,
> |     The kexec boot on power blade (JS20) is not successfull without 
> | doing 'ifdown <ethernet device>'.  Based on my initial look in tg3 code, 
> | the driver does not have shutdown nor reboot notifier code unlike in 
> | other drivers. Hence, I added the the following patch (based on e1000) 
> | and it is working. Is it OK to include this patch? If not, any help to 
> | track down the issue would be appreciated.
> | 
> | My system is having the following cards:
> | 0000:11:01.0 Ethernet controller: Broadcom Corporation NetXtreme 
> | BCM5704S Gigabit Ethernet (rev 03)
> | 0000:11:01.1 Ethernet controller: Broadcom Corporation NetXtreme 
> | BCM5704S Gigabit Ethernet (rev 03)
> | 
> | lspci -n -s 0000:11:01
> | 0000:11:01.0 Class 0200: 14e4:16a8 (rev 03)
> | 0000:11:01.1 Class 0200: 14e4:16a8 (rev 03)
> 
> [adding fastboot & linux-pm mailing lists]
> 
> There was just a SCSI driver patch that tried to use a reboot
> notifier on shutdown and the patch got this comment:
> 
> Subject: Re: [PATCH] 2.6 aacraid: shutdown notification
> Message-ID: <42BC6D6F.5090402@us.ibm.com>
> From:	Mark Haverkamp <markh@osdl.org>
> 
> and reply From:	Brian King <brking@us.ibm.com>:
> | The pci_driver->shutdown method should be used instead of registering
> | a reboot notifier.
> 
> so is there a good reason that network drivers should use the
> reboot notifier instead of pci_driver->shutdown,
> or should we be converting drivers to use pci_driver->shutdown
> instead?
> My quick look favors pci_driver->shutdown.

Yes, don't add a reboot notifier, use the shutdown function instead.

thanks,

greg k-h

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-06-25  3:33   ` Greg KH
@ 2005-06-30 23:21     ` Eric W. Biederman
  2005-06-30 23:33       ` Greg KH
  2005-07-01  1:18       ` Haren Myneni
  0 siblings, 2 replies; 11+ messages in thread
From: Eric W. Biederman @ 2005-06-30 23:21 UTC (permalink / raw)
  To: Greg KH; +Cc: netdev, fastboot, linux-pm

[-- Attachment #1: Type: text/plain, Size: 2527 bytes --]

Greg KH <greg@kroah.com> writes:

> On Fri, Jun 24, 2005 at 05:46:52PM -0700, randy_dunlap wrote:
>> On Fri, 24 Jun 2005 17:02:17 -0700 Haren Myneni wrote:
>> 
>> | 
>> | Hello,
>> |     The kexec boot on power blade (JS20) is not successfull without 
>> | doing 'ifdown <ethernet device>'.  Based on my initial look in tg3 code, 
>> | the driver does not have shutdown nor reboot notifier code unlike in 
>> | other drivers. Hence, I added the the following patch (based on e1000) 
>> | and it is working. Is it OK to include this patch? If not, any help to 
>> | track down the issue would be appreciated.
>> | 
>> | My system is having the following cards:
>> | 0000:11:01.0 Ethernet controller: Broadcom Corporation NetXtreme 
>> | BCM5704S Gigabit Ethernet (rev 03)
>> | 0000:11:01.1 Ethernet controller: Broadcom Corporation NetXtreme 
>> | BCM5704S Gigabit Ethernet (rev 03)
>> | 
>> | lspci -n -s 0000:11:01
>> | 0000:11:01.0 Class 0200: 14e4:16a8 (rev 03)
>> | 0000:11:01.1 Class 0200: 14e4:16a8 (rev 03)
>> 
>> [adding fastboot & linux-pm mailing lists]
>> 
>> There was just a SCSI driver patch that tried to use a reboot
>> notifier on shutdown and the patch got this comment:
>> 
>> Subject: Re: [PATCH] 2.6 aacraid: shutdown notification
>> Message-ID: <42BC6D6F.5090402@us.ibm.com>
>> From:	Mark Haverkamp <markh@osdl.org>
>> 
>> and reply From:	Brian King <brking@us.ibm.com>:
>> | The pci_driver->shutdown method should be used instead of registering
>> | a reboot notifier.
>> 
>> so is there a good reason that network drivers should use the
>> reboot notifier instead of pci_driver->shutdown,
>> or should we be converting drivers to use pci_driver->shutdown
>> instead?
>> My quick look favors pci_driver->shutdown.

A couple stray bits of feedback.  Downing the interface is generally
a user space requirement like unmounting filesystems.
/sbin/kexec and /sbin/reboot both perform this operation so I don't
see how this bug was hit.

I am wondering here if the problem was on on going DMA transaction
or something else.  If the problem was not an on going DMA transaction
then this problem is going to crop back up when people start
messing with the kexec on panic, because it indicates the
driver initialization is not robust.

> Yes, don't add a reboot notifier, use the shutdown function instead.

Sounds right to me.

However I have gotten feedback a couple of times that
driver writers tend to prefer using reboot notifiers.  In part
because shutdown functions don't exist for non-pci devices.

Eric

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-06-30 23:21     ` [Fastboot] " Eric W. Biederman
@ 2005-06-30 23:33       ` Greg KH
  2005-06-30 23:52         ` Eric W. Biederman
  2005-07-01  1:18       ` Haren Myneni
  1 sibling, 1 reply; 11+ messages in thread
From: Greg KH @ 2005-06-30 23:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, fastboot, linux-pm

[-- Attachment #1: Type: text/plain, Size: 646 bytes --]

On Thu, Jun 30, 2005 at 05:21:45PM -0600, Eric W. Biederman wrote:
> However I have gotten feedback a couple of times that
> driver writers tend to prefer using reboot notifiers.  In part
> because shutdown functions don't exist for non-pci devices.

That's a very lame excuse.  All busses should have shutdown functions.
And any device that is just bypassing all of the existing bus logic is
still tying into the driver core directly (which is a bad thing by
itself, but that's a different matter.)  And there's a shutdown method
there too.

So there is no excuse to not use it.  Please, if they complain, point
them to me :)

thanks,

greg k-h

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-06-30 23:33       ` Greg KH
@ 2005-06-30 23:52         ` Eric W. Biederman
  2005-08-18 18:30           ` Khalid Aziz
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2005-06-30 23:52 UTC (permalink / raw)
  To: Greg KH; +Cc: netdev, fastboot, linux-pm

[-- Attachment #1: Type: text/plain, Size: 2031 bytes --]

Greg KH <greg@kroah.com> writes:

> On Thu, Jun 30, 2005 at 05:21:45PM -0600, Eric W. Biederman wrote:
>> However I have gotten feedback a couple of times that
>> driver writers tend to prefer using reboot notifiers.  In part
>> because shutdown functions don't exist for non-pci devices.
>
> That's a very lame excuse.  All busses should have shutdown functions.
> And any device that is just bypassing all of the existing bus logic is
> still tying into the driver core directly (which is a bad thing by
> itself, but that's a different matter.)  And there's a shutdown method
> there too.
>
> So there is no excuse to not use it.  Please, if they complain, point
> them to me :)

Ok.

Then there is still my complaint and device_shutdown doesn't get called
on module removal which means it really doesn't get implemented.  Perhaps
with kexec now being in the mainline kernel this will get better.

Currently I have the following patch outstanding against the e1000
driver because on reboot on some boxes it card revisions
it places the card into a sleep state the driver initialization 
routing cannot get the card out of.

And yes the e1000 is bad and is using a reboot_notifier.

Eric

 e1000_main.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -uNr linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c
--- linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c   Tue Feb 15 14:17:09 2005
+++ linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c     Wed Feb 16 05:49:00 2005
@@ -2777,7 +2777,7 @@
        case SYS_POWER_OFF:
                while((pdev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, pdev))) {
                        if(pci_dev_driver(pdev) == &e1000_driver)
-                               e1000_suspend(pdev, 3);
+                               e1000_suspend(pdev, (event == SYS_DOWN)?0:3);
                }
        }
       return NOTIFY_DONE;

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-06-30 23:21     ` [Fastboot] " Eric W. Biederman
  2005-06-30 23:33       ` Greg KH
@ 2005-07-01  1:18       ` Haren Myneni
  2005-07-01  1:27         ` Eric W. Biederman
  1 sibling, 1 reply; 11+ messages in thread
From: Haren Myneni @ 2005-07-01  1:18 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-pm, fastboot, netdev

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]

Eric W. Biederman wrote:

>A couple stray bits of feedback.  Downing the interface is generally
>a user space requirement like unmounting filesystems.
>/sbin/kexec and /sbin/reboot both perform this operation so I don't
>see how this bug was hit.
>
>  
>
Thanks for your feedback. We are still working on kexec tools for PPC64. 
In the mean time, I was using a temporary tool which does not do ifdown. 
I was in the impression that the kexec boot will be successful since the 
normal kexec code path (kernel/sys.c) contains device_shutdown/reboot 
notifier. It turns out I am wrong.

Thanks
Haren


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-07-01  1:18       ` Haren Myneni
@ 2005-07-01  1:27         ` Eric W. Biederman
  0 siblings, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2005-07-01  1:27 UTC (permalink / raw)
  To: Haren Myneni; +Cc: linux-pm, fastboot, netdev

[-- Attachment #1: Type: text/plain, Size: 797 bytes --]

Haren Myneni <haren@us.ibm.com> writes:

> Eric W. Biederman wrote:
>
>>A couple stray bits of feedback.  Downing the interface is generally
>>a user space requirement like unmounting filesystems.
>>/sbin/kexec and /sbin/reboot both perform this operation so I don't
>>see how this bug was hit.
>>
>>
>>
> Thanks for your feedback. We are still working on kexec tools for PPC64. In the
> mean time, I was using a temporary tool which does not do ifdown. I was in the
> impression that the kexec boot will be successful since the normal kexec code
> path (kernel/sys.c) contains device_shutdown/reboot notifier. It turns out I am
> wrong.

Buggy drivers will do that to you as well.  But not doing a clean shutdown
(downing interfaces, and unmounting the filesystems) doesn't help things  :)

Eric

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-06-30 23:52         ` Eric W. Biederman
@ 2005-08-18 18:30           ` Khalid Aziz
  2005-08-18 18:35             ` Khalid Aziz
  2005-08-18 18:42             ` Eric W. Biederman
  0 siblings, 2 replies; 11+ messages in thread
From: Khalid Aziz @ 2005-08-18 18:30 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-pm, Fastboot mailing list, netdev

[-- Attachment #1: Type: text/plain, Size: 3455 bytes --]

On Thu, 2005-06-30 at 17:52 -0600, Eric W. Biederman wrote:
> Greg KH <greg@kroah.com> writes:
> 
> > On Thu, Jun 30, 2005 at 05:21:45PM -0600, Eric W. Biederman wrote:
> >> However I have gotten feedback a couple of times that
> >> driver writers tend to prefer using reboot notifiers.  In part
> >> because shutdown functions don't exist for non-pci devices.
> >
> > That's a very lame excuse.  All busses should have shutdown functions.
> > And any device that is just bypassing all of the existing bus logic is
> > still tying into the driver core directly (which is a bad thing by
> > itself, but that's a different matter.)  And there's a shutdown method
> > there too.
> >
> > So there is no excuse to not use it.  Please, if they complain, point
> > them to me :)
> 
> Ok.
> 
> Then there is still my complaint and device_shutdown doesn't get called
> on module removal which means it really doesn't get implemented.  Perhaps
> with kexec now being in the mainline kernel this will get better.
> 
> Currently I have the following patch outstanding against the e1000
> driver because on reboot on some boxes it card revisions
> it places the card into a sleep state the driver initialization 
> routing cannot get the card out of.
> 
> And yes the e1000 is bad and is using a reboot_notifier.
> 
> Eric
> 
>  e1000_main.c |    2 +-
>  1 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff -uNr linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c
> --- linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c   Tue Feb 15 14:17:09 2005
> +++ linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c     Wed Feb 16 05:49:00 2005
> @@ -2777,7 +2777,7 @@
>         case SYS_POWER_OFF:
>                 while((pdev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, pdev))) {
>                         if(pci_dev_driver(pdev) == &e1000_driver)
> -                               e1000_suspend(pdev, 3);
> +                               e1000_suspend(pdev, (event == SYS_DOWN)?0:3);
>                 }
>         }
>        return NOTIFY_DONE;

I have found that I can not walk reboot_notifier list in all cases
before kexec'ing a new kernel. For instance, when handling INIT on ia64,
we are running in interrupt context and atleast some of the reboot
notifier callbacks call schedule(). Calling schedule() is not gonna work
when we are running in interrupt context. I have the same concern for
when panic gets called in interrupt context. So I added a shutdown
function to e1000 driver instead. Patch is attached. This patch has
worked for me.

As soon as I have all the issues sorted out with kexec'ing on INIT on
ia64, I will post a fully updated kexec patch for ia64. I now have kexec
working solid on INIT with e1000 driver and it can handle multiple back
to back INITs and come up in kexec'd kernel every time. I am now trying
to sort some issues out with tg3 driver (another driver with no shutdown
routine :(

-- 
Khalid

====================================================================
Khalid Aziz                       Open Source and Linux Organization
(970)898-9214                                        Hewlett-Packard
khalid.aziz@hp.com                                  Fort Collins, CO

"The Linux kernel is subject to relentless development" 
                                - Alessandro Rubini

[-- Attachment #2: e1000_shutdown.patch --]
[-- Type: text/x-patch, Size: 3834 bytes --]

diff -urNp hpte-2.6/drivers/net/e1000/e1000_main.c hpte-2.6.init/drivers/net/e1000/e1000_main.c
--- hpte-2.6/drivers/net/e1000/e1000_main.c	2005-07-08 13:57:55.000000000 -0600
+++ hpte-2.6.init/drivers/net/e1000/e1000_main.c	2005-07-21 09:42:17.000000000 -0600
@@ -211,6 +211,7 @@ static void e1000_restore_vlan(struct e1
 
 static int e1000_notify_reboot(struct notifier_block *, unsigned long event, void *ptr);
 static int e1000_suspend(struct pci_dev *pdev, uint32_t state);
+static void e1000_shutdown(struct device *dev);
 #ifdef CONFIG_PM
 static int e1000_resume(struct pci_dev *pdev);
 #endif
@@ -257,6 +258,8 @@ MODULE_PARM_DESC(debug, "Debug level (0=
  * loaded. All it does is register with the PCI subsystem.
  **/
 
+#define REBOOT_NOTIFIER	0
+
 static int __init
 e1000_init_module(void)
 {
@@ -266,10 +269,15 @@ e1000_init_module(void)
 
 	printk(KERN_INFO "%s\n", e1000_copyright);
 
+#if (!REBOOT_NOTIFIER)
+	e1000_driver.driver.shutdown = e1000_shutdown;
+#endif
 	ret = pci_module_init(&e1000_driver);
+#if REBOOT_NOTIFIER
 	if(ret >= 0) {
 		register_reboot_notifier(&e1000_notifier_reboot);
 	}
+#endif
 	return ret;
 }
 
@@ -285,7 +293,9 @@ module_init(e1000_init_module);
 static void __exit
 e1000_exit_module(void)
 {
+#if REBOOT_NOTIFIER
 	unregister_reboot_notifier(&e1000_notifier_reboot);
+#endif
 	pci_unregister_driver(&e1000_driver);
 }
 
@@ -3197,6 +3207,71 @@ e1000_suspend(struct pci_dev *pdev, uint
 	return 0;
 }
 
+static void
+e1000_shutdown(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct e1000_adapter *adapter = netdev->priv;
+	struct e1000_hw *hw = &adapter->hw;
+	uint32_t ctrl;
+
+	netif_device_detach(netdev);
+
+	if(netif_running(netdev)) {
+		e1000_irq_disable(adapter);
+		del_timer(&adapter->tx_fifo_stall_timer);
+		del_timer(&adapter->watchdog_timer);
+		del_timer(&adapter->phy_info_timer);
+
+#ifdef CONFIG_E1000_NAPI
+		netif_poll_disable(netdev);
+#endif
+		adapter->link_speed = 0;
+		adapter->link_duplex = 0;
+		netif_carrier_off(netdev);
+		netif_stop_queue(netdev);
+	}
+
+	ctrl = E1000_READ_REG(hw, CTRL);
+
+	/* Must reset the PHY before resetting the MAC */
+	if((hw->mac_type == e1000_82541) || (hw->mac_type == e1000_82547)) {
+		E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_PHY_RST));
+		mdelay(5);
+	}
+
+	/* Issue a global reset to the MAC.  This will reset the chip's
+	 * transmit, receive, DMA, and link units.  It will not effect
+	 * the current PCI configuration.  The global reset bit is self-
+	 * clearing, and should clear within a microsecond.
+	 */
+	switch(hw->mac_type) {
+	    case e1000_82544:
+	    case e1000_82540:
+	    case e1000_82545:
+	    case e1000_82546:
+	    case e1000_82541:
+	    case e1000_82541_rev_2:
+	        /* These controllers can't ack the 64-bit write when issuing the
+	         * reset, so use IO-mapping as a workaround to issue the reset 
+		 */
+	        E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_RST));
+	        break;
+	    case e1000_82545_rev_3:
+	    case e1000_82546_rev_3:
+	        /* Reset is performed on a shadow of the control register */
+	        E1000_WRITE_REG(hw, CTRL_DUP, (ctrl | E1000_CTRL_RST));
+	        break;
+	    default:
+	        E1000_WRITE_REG(hw, CTRL, (ctrl | E1000_CTRL_RST));
+	        break;
+	}
+
+	pci_disable_device(pdev);
+	pci_set_power_state(pdev, 0);
+}
+
 #ifdef CONFIG_PM
 static int
 e1000_resume(struct pci_dev *pdev)
@@ -3205,9 +3280,10 @@ e1000_resume(struct pci_dev *pdev)
 	struct e1000_adapter *adapter = netdev->priv;
 	uint32_t manc, ret;
 
-	ret = pci_enable_device(pdev);
 	pci_set_power_state(pdev, 0);
 	pci_restore_state(pdev);
+	ret = pci_enable_device(pdev);
+	pci_set_master(pdev);
 
 	pci_enable_wake(pdev, 3, 0);
 	pci_enable_wake(pdev, 4, 0); /* 4 == D3 cold */

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-08-18 18:30           ` Khalid Aziz
@ 2005-08-18 18:35             ` Khalid Aziz
  2005-08-18 18:42             ` Eric W. Biederman
  1 sibling, 0 replies; 11+ messages in thread
From: Khalid Aziz @ 2005-08-18 18:35 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-pm, Fastboot mailing list, netdev

[-- Attachment #1: Type: text/plain, Size: 3728 bytes --]

On Thu, 2005-08-18 at 12:30 -0600, Khalid Aziz wrote:
> On Thu, 2005-06-30 at 17:52 -0600, Eric W. Biederman wrote:
> > Greg KH <greg@kroah.com> writes:
> > 
> > > On Thu, Jun 30, 2005 at 05:21:45PM -0600, Eric W. Biederman wrote:
> > >> However I have gotten feedback a couple of times that
> > >> driver writers tend to prefer using reboot notifiers.  In part
> > >> because shutdown functions don't exist for non-pci devices.
> > >
> > > That's a very lame excuse.  All busses should have shutdown functions.
> > > And any device that is just bypassing all of the existing bus logic is
> > > still tying into the driver core directly (which is a bad thing by
> > > itself, but that's a different matter.)  And there's a shutdown method
> > > there too.
> > >
> > > So there is no excuse to not use it.  Please, if they complain, point
> > > them to me :)
> > 
> > Ok.
> > 
> > Then there is still my complaint and device_shutdown doesn't get called
> > on module removal which means it really doesn't get implemented.  Perhaps
> > with kexec now being in the mainline kernel this will get better.
> > 
> > Currently I have the following patch outstanding against the e1000
> > driver because on reboot on some boxes it card revisions
> > it places the card into a sleep state the driver initialization 
> > routing cannot get the card out of.
> > 
> > And yes the e1000 is bad and is using a reboot_notifier.
> > 
> > Eric
> > 
> >  e1000_main.c |    2 +-
> >  1 files changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff -uNr linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c
> > --- linux-2.4.29-kexec-apic-virtwire-on-shutdownx86_64/drivers/net/e1000/e1000_main.c   Tue Feb 15 14:17:09 2005
> > +++ linux-2.4.29-e1000-no-poweroff-on-reboot/drivers/net/e1000/e1000_main.c     Wed Feb 16 05:49:00 2005
> > @@ -2777,7 +2777,7 @@
> >         case SYS_POWER_OFF:
> >                 while((pdev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, pdev))) {
> >                         if(pci_dev_driver(pdev) == &e1000_driver)
> > -                               e1000_suspend(pdev, 3);
> > +                               e1000_suspend(pdev, (event == SYS_DOWN)?0:3);
> >                 }
> >         }
> >        return NOTIFY_DONE;
> 
> I have found that I can not walk reboot_notifier list in all cases
> before kexec'ing a new kernel. For instance, when handling INIT on ia64,
> we are running in interrupt context and atleast some of the reboot
> notifier callbacks call schedule(). Calling schedule() is not gonna work
> when we are running in interrupt context. I have the same concern for
> when panic gets called in interrupt context. So I added a shutdown
> function to e1000 driver instead. Patch is attached. This patch has
> worked for me.
> 
> As soon as I have all the issues sorted out with kexec'ing on INIT on
> ia64, I will post a fully updated kexec patch for ia64. I now have kexec
> working solid on INIT with e1000 driver and it can handle multiple back
> to back INITs and come up in kexec'd kernel every time. I am now trying
> to sort some issues out with tg3 driver (another driver with no shutdown
> routine :(
> 

Sorry, forgot to remove the last hunk that is already in 2.6.12. Updated
patch attached.

--
Khalid 

====================================================================
Khalid Aziz                       Open Source and Linux Organization
(970)898-9214                                        Hewlett-Packard
khalid.aziz@hp.com                                  Fort Collins, CO

"The Linux kernel is subject to relentless development" 
                                - Alessandro Rubini

[-- Attachment #2: e1000_shutdown.patch --]
[-- Type: text/x-patch, Size: 3472 bytes --]

diff -urNp hpte-2.6/drivers/net/e1000/e1000_main.c hpte-2.6.init/drivers/net/e1000/e1000_main.c
--- hpte-2.6/drivers/net/e1000/e1000_main.c	2005-07-08 13:57:55.000000000 -0600
+++ hpte-2.6.init/drivers/net/e1000/e1000_main.c	2005-07-21 09:42:17.000000000 -0600
@@ -211,6 +211,7 @@ static void e1000_restore_vlan(struct e1
 
 static int e1000_notify_reboot(struct notifier_block *, unsigned long event, void *ptr);
 static int e1000_suspend(struct pci_dev *pdev, uint32_t state);
+static void e1000_shutdown(struct device *dev);
 #ifdef CONFIG_PM
 static int e1000_resume(struct pci_dev *pdev);
 #endif
@@ -257,6 +258,8 @@ MODULE_PARM_DESC(debug, "Debug level (0=
  * loaded. All it does is register with the PCI subsystem.
  **/
 
+#define REBOOT_NOTIFIER	0
+
 static int __init
 e1000_init_module(void)
 {
@@ -266,10 +269,15 @@ e1000_init_module(void)
 
 	printk(KERN_INFO "%s\n", e1000_copyright);
 
+#if (!REBOOT_NOTIFIER)
+	e1000_driver.driver.shutdown = e1000_shutdown;
+#endif
 	ret = pci_module_init(&e1000_driver);
+#if REBOOT_NOTIFIER
 	if(ret >= 0) {
 		register_reboot_notifier(&e1000_notifier_reboot);
 	}
+#endif
 	return ret;
 }
 
@@ -285,7 +293,9 @@ module_init(e1000_init_module);
 static void __exit
 e1000_exit_module(void)
 {
+#if REBOOT_NOTIFIER
 	unregister_reboot_notifier(&e1000_notifier_reboot);
+#endif
 	pci_unregister_driver(&e1000_driver);
 }
 
@@ -3197,6 +3207,71 @@ e1000_suspend(struct pci_dev *pdev, uint
 	return 0;
 }
 
+static void
+e1000_shutdown(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct e1000_adapter *adapter = netdev->priv;
+	struct e1000_hw *hw = &adapter->hw;
+	uint32_t ctrl;
+
+	netif_device_detach(netdev);
+
+	if(netif_running(netdev)) {
+		e1000_irq_disable(adapter);
+		del_timer(&adapter->tx_fifo_stall_timer);
+		del_timer(&adapter->watchdog_timer);
+		del_timer(&adapter->phy_info_timer);
+
+#ifdef CONFIG_E1000_NAPI
+		netif_poll_disable(netdev);
+#endif
+		adapter->link_speed = 0;
+		adapter->link_duplex = 0;
+		netif_carrier_off(netdev);
+		netif_stop_queue(netdev);
+	}
+
+	ctrl = E1000_READ_REG(hw, CTRL);
+
+	/* Must reset the PHY before resetting the MAC */
+	if((hw->mac_type == e1000_82541) || (hw->mac_type == e1000_82547)) {
+		E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_PHY_RST));
+		mdelay(5);
+	}
+
+	/* Issue a global reset to the MAC.  This will reset the chip's
+	 * transmit, receive, DMA, and link units.  It will not effect
+	 * the current PCI configuration.  The global reset bit is self-
+	 * clearing, and should clear within a microsecond.
+	 */
+	switch(hw->mac_type) {
+	    case e1000_82544:
+	    case e1000_82540:
+	    case e1000_82545:
+	    case e1000_82546:
+	    case e1000_82541:
+	    case e1000_82541_rev_2:
+	        /* These controllers can't ack the 64-bit write when issuing the
+	         * reset, so use IO-mapping as a workaround to issue the reset 
+		 */
+	        E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_RST));
+	        break;
+	    case e1000_82545_rev_3:
+	    case e1000_82546_rev_3:
+	        /* Reset is performed on a shadow of the control register */
+	        E1000_WRITE_REG(hw, CTRL_DUP, (ctrl | E1000_CTRL_RST));
+	        break;
+	    default:
+	        E1000_WRITE_REG(hw, CTRL, (ctrl | E1000_CTRL_RST));
+	        break;
+	}
+
+	pci_disable_device(pdev);
+	pci_set_power_state(pdev, 0);
+}
+
 #ifdef CONFIG_PM
 static int
 e1000_resume(struct pci_dev *pdev)

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-08-18 18:30           ` Khalid Aziz
  2005-08-18 18:35             ` Khalid Aziz
@ 2005-08-18 18:42             ` Eric W. Biederman
  2005-08-18 19:06               ` Khalid Aziz
  1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2005-08-18 18:42 UTC (permalink / raw)
  To: Khalid Aziz; +Cc: linux-pm, Fastboot mailing list, netdev

[-- Attachment #1: Type: text/plain, Size: 1485 bytes --]

Khalid Aziz <khalid_aziz@hp.com> writes:

> On Thu, 2005-06-30 at 17:52 -0600, Eric W. Biederman wrote:
>
> I have found that I can not walk reboot_notifier list in all cases
> before kexec'ing a new kernel. For instance, when handling INIT on ia64,
> we are running in interrupt context and atleast some of the reboot
> notifier callbacks call schedule(). Calling schedule() is not gonna work
> when we are running in interrupt context. I have the same concern for
> when panic gets called in interrupt context. So I added a shutdown
> function to e1000 driver instead. Patch is attached. This patch has
> worked for me.

Thanks.  I will look at your e1000 driver and see if it works on
the hardware I have here.  You changes are at least to places
that could have fixed the problem I am seeing.

For the panic case calling any of the shutdown/reboot notifier bits
is actually wrong.  That kernel that takes over needs to cope with
hardware left in any state.

> As soon as I have all the issues sorted out with kexec'ing on INIT on
> ia64, I will post a fully updated kexec patch for ia64. I now have kexec
> working solid on INIT with e1000 driver and it can handle multiple back
> to back INITs and come up in kexec'd kernel every time. I am now trying
> to sort some issues out with tg3 driver (another driver with no shutdown
> routine :(

In the normal case (not kexec on panic) user space should down the interfaces
before calling kexec.  Is your user space doing that?


Eric

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Fastboot] Re: Re: tg3: issue for reboot/kexec
  2005-08-18 18:42             ` Eric W. Biederman
@ 2005-08-18 19:06               ` Khalid Aziz
  0 siblings, 0 replies; 11+ messages in thread
From: Khalid Aziz @ 2005-08-18 19:06 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-pm, Fastboot mailing list, netdev

[-- Attachment #1: Type: text/plain, Size: 1151 bytes --]

On Thu, 2005-08-18 at 12:42 -0600, Eric W. Biederman wrote:
> Khalid Aziz <khalid_aziz@hp.com> writes:
> > As soon as I have all the issues sorted out with kexec'ing on INIT on
> > ia64, I will post a fully updated kexec patch for ia64. I now have kexec
> > working solid on INIT with e1000 driver and it can handle multiple back
> > to back INITs and come up in kexec'd kernel every time. I am now trying
> > to sort some issues out with tg3 driver (another driver with no shutdown
> > routine :(
> 
> In the normal case (not kexec on panic) user space should down the interfaces
> before calling kexec.  Is your user space doing that?

Yes, I am down'ing the interface from userspace in normal kexec case,
but that does not happen on INIT on ia64. 

-- 
Khalid

====================================================================
Khalid Aziz                       Open Source and Linux Organization
(970)898-9214                                        Hewlett-Packard
khalid.aziz@hp.com                                  Fort Collins, CO

"The Linux kernel is subject to relentless development" 
                                - Alessandro Rubini


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-08-18 19:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <42BC9F09.1060002@us.ibm.com>
2005-06-25  0:46 ` tg3: issue for reboot/kexec randy_dunlap
2005-06-25  3:33   ` Greg KH
2005-06-30 23:21     ` [Fastboot] " Eric W. Biederman
2005-06-30 23:33       ` Greg KH
2005-06-30 23:52         ` Eric W. Biederman
2005-08-18 18:30           ` Khalid Aziz
2005-08-18 18:35             ` Khalid Aziz
2005-08-18 18:42             ` Eric W. Biederman
2005-08-18 19:06               ` Khalid Aziz
2005-07-01  1:18       ` Haren Myneni
2005-07-01  1:27         ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).