netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* r8169: Crash after reloading driver if network hangs
@ 2007-01-28 18:02 Bernhard Walle
  2007-01-28 19:04 ` Francois Romieu
  0 siblings, 1 reply; 5+ messages in thread
From: Bernhard Walle @ 2007-01-28 18:02 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Hello,

also after applying the latest patch that was posted on that mailing
list, I have still the problem described in
http://bugzilla.kernel.org/show_bug.cgi?id=5137.

So after a network hang with several 

        NETDEV WATCHDOG: eth0: transmit timed out

in the kernel log, I removed the driver ('rmmod r8169') and wanted to
reload the driver ('modprobe r8169'). Now the system crashed:

    kernel BUG at kernel/timer.c:407!
    invalid opcode: 0000 [1] SMP 
    CPU 1 
    Modules linked in: r8169 i915 drm deflate zlib_deflate twofish twofish_common serpent blowfi
    sh des cbc ecb blkcipher aes xcbc sha256 w83627ehf hwmon i2c_isa sha1 ipv6 md5 eeprom crypto
    _null af_key snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device nfs lockd nfs_acl sunrpc af_pa
    cket cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave acpi_cpufreq 
    freq_table button battery ac loop rfcomm hidp l2cap hci_usb bluetooth dm_mod usb_storage fus
    e snd_hda_intel snd_hda_codec snd_pcm iTCO_wdt ehci_hcd uhci_hcd intel_agp i2c_i801 i2c_core
     snd_timer snd soundcore snd_page_alloc iTCO_vendor_support usbcore floppy lp parport_pc ppd
    ev parport ext3 mbcache jbd sg sr_mod cdrom edd fan ata_piix libata piix thermal processor s
    d_mod scsi_mod ide_disk ide_core
    Pid: 5724, comm: modprobe Not tainted 2.6.20-rc6-default #1
    RIP: 0010:[<ffffffff80241dfe>]  [<ffffffff80241dfe>] mod_timer+0x7/0x22
    RSP: 0018:ffff8100566c9c20  EFLAGS: 00010246
    RAX: 0000000000000006 RBX: ffff810061d8c000 RCX: 0000000000000001
    RDX: 00000000000003e8 RSI: 00000000ffffa58a RDI: ffff810061d8d1e0
    RBP: ffff810061d8c4c0 R08: ffffc20000022000 R09: ffff810061d8c4c0
    R10: 0000000000000046 R11: 0000000000000202 R12: 0000000000000000
    R13: 00000000ffffff01 R14: ffff8100566c03e8 R15: ffff810061d8c000
    FS:  00002b38153566f0(0000) GS:ffff81007db9abc0(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00005555557fc870 CR3: 00000000540af000 CR4: 00000000000006e0
    Process modprobe (pid: 5724, threadinfo ffff8100566c8000, task ffff810037fde850)
    Stack:  ffffffff881b6d8b 0000000000000064 ffff810061d8c4c0 ffffc20000022000
     ffffffff881b88ef ffffffff000000ff ffffffff000000ff ffff8100000000ff<6>r8169: eth1: link up

     ffff8100000000ff ffff810000000011 ffff81007d0079c8 0000000000000296
    Call Trace:
     [<ffffffff881b6d8b>] :r8169:rtl8169_set_speed+0x4b/0x53
     [<ffffffff881b88ef>] :r8169:rtl8169_init_one+0x954/0x9c4
     [<ffffffff8031b440>] pci_device_probe+0xe5/0x151
     [<ffffffff8036d620>] really_probe+0x87/0x106
     [<ffffffff8036d83f>] __driver_attach+0x6f/0xaf
     [<ffffffff8036d7d0>] __driver_attach+0x0/0xaf
     [<ffffffff8036d7d0>] __driver_attach+0x0/0xaf
     [<ffffffff8036cb86>] bus_for_each_dev+0x43/0x6e
     [<ffffffff8036ce7c>] bus_add_driver+0x6b/0x18d
     [<ffffffff8031b63b>] __pci_register_driver+0x75/0xaa
     [<ffffffff80298300>] sys_init_module+0x1793/0x1900
     [<ffffffff8025511e>] system_call+0x7e/0x83


    Code: 0f 0b eb fe 48 39 77 10 75 06 48 83 3f 00 75 05 e9 73 85 fd 
    RIP  [<ffffffff80241dfe>] mod_timer+0x7/0x22
     RSP <ffff8100566c9c20>

netif_running(dev) returns true although open() hasn't been called that sets
the function of the timer. dev->state is 6 (I checked that).

Simple fix is attached. Although that seems to fix the symptom and not
the cause, please apply it if you don't have a better solution.


---
 r8169.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/r8169.c	2007-01-28 18:19:56.000000000 +0100
+++ b/drivers/net/r8169.c	2007-01-28 18:22:50.000000000 +0100
@@ -810,7 +810,8 @@ static int rtl8169_set_speed(struct net_
 
 	ret = tp->set_speed(dev, autoneg, speed, duplex);
 
-	if (netif_running(dev) && (tp->phy_1000_ctrl_reg & ADVERTISE_1000FULL))
+	if (netif_running(dev) && tp->timer.function &&
+			(tp->phy_1000_ctrl_reg & ADVERTISE_1000FULL))
 		mod_timer(&tp->timer, jiffies + RTL8169_PHY_TIMEOUT);
 
 	return ret;



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: r8169: Crash after reloading driver if network hangs
  2007-01-28 18:02 r8169: Crash after reloading driver if network hangs Bernhard Walle
@ 2007-01-28 19:04 ` Francois Romieu
  2007-01-28 19:56   ` Bernhard Walle
  0 siblings, 1 reply; 5+ messages in thread
From: Francois Romieu @ 2007-01-28 19:04 UTC (permalink / raw)
  To: netdev

Bernhard Walle <bwalle@suse.de> :
[...]
> Simple fix is attached. Although that seems to fix the symptom and not
> the cause, please apply it if you don't have a better solution.

What about the patch below ?

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 577babd..4e22af7 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1373,7 +1373,7 @@ static inline void rtl8169_request_timer(struct net_device *dev)
 	timer->expires = jiffies + RTL8169_PHY_TIMEOUT;
 	timer->data = (unsigned long)(dev);
 	timer->function = rtl8169_phy_timer;
-	add_timer(timer);
+	mod_timer(timer);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -1448,7 +1448,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
 
 	rtl8169_phy_reset(dev, tp);
 
-	rtl8169_set_speed(dev, autoneg, speed, duplex);
+	tp->set_speed(dev, autoneg, speed, duplex);
 
 	if ((RTL_R8(PHYstatus) & TBI_Enable) && netif_msg_link(tp))
 		printk(KERN_INFO PFX "%s: TBI auto-negotiating\n", dev->name);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: r8169: Crash after reloading driver if network hangs
  2007-01-28 19:04 ` Francois Romieu
@ 2007-01-28 19:56   ` Bernhard Walle
  2007-01-29  7:56     ` Francois Romieu
  0 siblings, 1 reply; 5+ messages in thread
From: Bernhard Walle @ 2007-01-28 19:56 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

* Francois Romieu <romieu@fr.zoreil.com> [2007-01-28 20:04]:
> Bernhard Walle <bwalle@suse.de> :
> [...]
> > Simple fix is attached. Although that seems to fix the symptom and not
> > the cause, please apply it if you don't have a better solution.
> 
> What about the patch below ?
> 
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 577babd..4e22af7 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1373,7 +1373,7 @@ static inline void rtl8169_request_timer(struct net_device *dev)
>  	timer->expires = jiffies + RTL8169_PHY_TIMEOUT;
>  	timer->data = (unsigned long)(dev);
>  	timer->function = rtl8169_phy_timer;
> -	add_timer(timer);
> +	mod_timer(timer);
>  }

Doesn't compile, I think you mean this?

@@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer
                return;
 
        init_timer(timer);
-       timer->expires = jiffies + RTL8169_PHY_TIMEOUT;
        timer->data = (unsigned long)(dev);
        timer->function = rtl8169_phy_timer;
-       add_timer(timer);
+       mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT);
 }

But I think _this_ change is unnecessary, ...

>  #ifdef CONFIG_NET_POLL_CONTROLLER
> @@ -1448,7 +1448,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
>  
>  	rtl8169_phy_reset(dev, tp);
>  
> -	rtl8169_set_speed(dev, autoneg, speed, duplex);
> +	tp->set_speed(dev, autoneg, speed, duplex);
>  
>  	if ((RTL_R8(PHYstatus) & TBI_Enable) && netif_msg_link(tp))
>  		printk(KERN_INFO PFX "%s: TBI auto-negotiating\n", dev->name);

... but that looks good (better than my patch) and should resolve the issue,
too. I can't test because it's triggered only if the network hangs and you
know, the last one isn't reproducable.



Regards,
Bernhard

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: r8169: Crash after reloading driver if network hangs
  2007-01-28 19:56   ` Bernhard Walle
@ 2007-01-29  7:56     ` Francois Romieu
  2007-01-29 10:25       ` Bernhard Walle
  0 siblings, 1 reply; 5+ messages in thread
From: Francois Romieu @ 2007-01-29  7:56 UTC (permalink / raw)
  To: netdev; +Cc: Bernhard Walle

Bernhard Walle <bwalle@suse.de> :
[...]
> Doesn't compile, I think you mean this?

Yes.

> @@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer
>                 return;
>  
>         init_timer(timer);
> -       timer->expires = jiffies + RTL8169_PHY_TIMEOUT;
>         timer->data = (unsigned long)(dev);
>         timer->function = rtl8169_phy_timer;
> -       add_timer(timer);
> +       mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT);
>  }
> 
> But I think _this_ change is unnecessary, ...

add_timer() is not supposed to modify an existing timer whereas mod_timer()
encompasses both and the race exists in both direction as netif_running()
is true as soon as dev_open() starts but way before dev->open() completes.

[...]
> ... but that looks good (better than my patch) and should resolve the issue,
> too. I can't test because it's triggered only if the network hangs and you
> know, the last one isn't reproducable.

There will be something to test in the merge of realtek's stuff #2.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: r8169: Crash after reloading driver if network hangs
  2007-01-29  7:56     ` Francois Romieu
@ 2007-01-29 10:25       ` Bernhard Walle
  0 siblings, 0 replies; 5+ messages in thread
From: Bernhard Walle @ 2007-01-29 10:25 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Hello,

* Francois Romieu <romieu@fr.zoreil.com> [2007-01-29 08:56]:
> Bernhard Walle <bwalle@suse.de> :
> 
> > @@ -1371,10 +1371,9 @@ static inline void rtl8169_request_timer
> >                 return;
> >  
> >         init_timer(timer);
> > -       timer->expires = jiffies + RTL8169_PHY_TIMEOUT;
> >         timer->data = (unsigned long)(dev);
> >         timer->function = rtl8169_phy_timer;
> > -       add_timer(timer);
> > +       mod_timer(timer, jiffies + RTL8169_PHY_TIMEOUT);
> >  }
> > 
> > But I think _this_ change is unnecessary, ...
> 
> add_timer() is not supposed to modify an existing timer whereas mod_timer()
> encompasses both 

that's clear.

> and the race exists in both direction as netif_running()
> is true as soon as dev_open() starts but way before dev->open() completes.

But rtl8169_request_timer() is only called from change_mtu() and
dev->open(). And, if you call init_timer(), you always have a new
timer, and the reference to an existing tp->timer will be lost.

> [...]
> > ... but that looks good (better than my patch) and should resolve the issue,
> > too. I can't test because it's triggered only if the network hangs and you
> > know, the last one isn't reproducable.
> 
> There will be something to test in the merge of realtek's stuff #2.

Great.


Regards,
Bernhard
-- 
SUSE LINUX Products GmbH                E-Mail:  bwalle@suse.de
Maxfeldstr. 5                           Phone:   +49 (911) 74053-0
90409 Nürnberg, Germany
OpenPGP DDAF6454: F61F 34CC 09CA FB82 C9F6  BA4B 8865 3696 DDAF 6454

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-01-29 10:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-28 18:02 r8169: Crash after reloading driver if network hangs Bernhard Walle
2007-01-28 19:04 ` Francois Romieu
2007-01-28 19:56   ` Bernhard Walle
2007-01-29  7:56     ` Francois Romieu
2007-01-29 10:25       ` Bernhard Walle

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).