netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up
       [not found] <bug-16187-10286@https.bugzilla.kernel.org/>
@ 2010-06-15 21:24 ` Andrew Morton
  2010-06-16  5:46   ` Christian Casteyde
  2010-06-18  6:16   ` Grant Grundler
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2010-06-15 21:24 UTC (permalink / raw)
  To: netdev, Grant Grundler, Kyle McMartin
  Cc: bugzilla-daemon, bugme-daemon, casteyde.christian


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sat, 12 Jun 2010 15:15:31 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16187
> 
>            Summary: Carrier detection failed in dhcpcd when link is up
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.35-rc2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: casteyde.christian@free.fr
>         Regression: Yes
> 
> 
> Created an attachment (id=26742)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=26742)
> lspci output for 2.6.34 on my computer
> 
> Kernel at least 2.6.35-rc2, 2.6.34 works fine

Seems to be post-2.6.34 breakage in the tulip driver.

> Athlon X2 3000 in 64bits mode
> Slackware 13.1
> "Ethernet controller: ALi Corporation ULi 1689,1573 integrated ethernet. (rev
> 40)" (from lspci)
> 
> Since 2.6.35-rc2 (didn't checked -rc1), dhcpcd hangs fails to detect carrier
> appearance at boot.
> 
> The Slackware network script uses dhcpcd to bring DHCP interfaces up. At boot,
> it seems my network device doesn't have the link up immediatly. dhcpcd tries to
> bring it up, and wait for the carrier. The carrier indeed goes up, but dhcpcd
> gets absolutly no notification of that, and therefore times out.
> 
> logs says that:
> Jun 12 16:53:58 sirius logger: /etc/rc.d/rc.inet1:  /sbin/route add -net
> 127.0.0.0 netmask 255.0.0.0 lo
> Jun 12 16:53:58 sirius logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -t 10  eth0
> Jun 12 16:53:58 sirius dhcpcd: version 5.2.2 starting
> Jun 12 16:53:58 sirius kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
> Jun 12 16:53:58 sirius dhcpcd: eth0: waiting for carrier
> Jun 12 16:54:01 sirius kernel: uli526x: eth0 NIC Link is Up 100 Mbps Full
> duplex
> Jun 12 16:54:01 sirius kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes
> ready
> then dhcpcd times out a little later (10 seconds as -t 10 is specified).
> 
> If I use the -K option (no carrier detection), and does ifconfig eth0 up just
> before issueing the dhcpcd command, dhcpcd doesn't wait for the carrier and
> gets a lease correctly.
> 
> Reversely, with the 2.6.34 kernel, dhcpcd indeed gets the link up notification,
> and gets the lease immediatly. In this case, the log says:
> 
> Jun 12 17:04:21 sirius logger: /etc/rc.d/rc.inet1:  /sbin/route add -net
> 127.0.0.0 netmask 255.0.0.0 lo
> Jun 12 17:04:21 sirius logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -t 10  eth0
> Jun 12 17:04:22 sirius dhcpcd: version 5.2.2 starting
> Jun 12 17:04:22 sirius kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
> Jun 12 17:04:22 sirius dhcpcd: eth0: waiting for carrier
> Jun 12 17:04:25 sirius dhcpcd: eth0: carrier acquired
> Jun 12 17:04:25 sirius kernel: uli526x: eth0 NIC Link is Up 100 Mbps Full
> duplex
> Jun 12 17:04:25 sirius kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes
> ready
> Jun 12 17:04:25 sirius dhcpcd: eth0: broadcasting for a lease
> Jun 12 17:04:29 sirius dhcpcd: eth0: offered 192.168.1.3 from 192.168.1.1
> Jun 12 17:04:29 sirius dhcpcd: eth0: acknowledged 192.168.1.3 from 192.168.1.1
> Jun 12 17:04:29 sirius dhcpcd: eth0: checking for 192.168.1.3
> Jun 12 17:04:34 sirius dhcpcd: eth0: leased 192.168.1.3 for 864000 seconds
> Jun 12 17:04:34 sirius dhcpcd: forking to background
> 
> As you can see, dhcpcd asks for a lease as soon as the kernel tells it the link
> is up (message "carrier acquired").
> 
> Therefore, I think 2.6.35-rc* notification of carrier is broken, and at least
> it broke dhcpcd way of watching the link.
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up
  2010-06-15 21:24 ` [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up Andrew Morton
@ 2010-06-16  5:46   ` Christian Casteyde
  2010-06-18  6:16   ` Grant Grundler
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Casteyde @ 2010-06-16  5:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: netdev, Grant Grundler, Kyle McMartin, bugzilla-daemon,
	bugme-daemon

Exact, my kernel config is:
-*- Generic Media Independent Interface device support
[*]   "Tulip" family network device support
<*>   ULi M526x controller support

Le mardi 15 juin 2010 23:24:18, Andrew Morton a écrit :
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Sat, 12 Jun 2010 15:15:31 GMT
> 
> bugzilla-daemon@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=16187
> > 
> >            Summary: Carrier detection failed in dhcpcd when link is up
> >            Product: Networking
> >            Version: 2.5
> >     
> >     Kernel Version: 2.6.35-rc2
> >     
> >           Platform: All
> >         
> >         OS/Version: Linux
> >         
> >               Tree: Mainline
> >             
> >             Status: NEW
> >           
> >           Severity: normal
> >           Priority: P1
> >          
> >          Component: Other
> >         
> >         AssignedTo: acme@ghostprotocols.net
> >         ReportedBy: casteyde.christian@free.fr
> >         Regression: Yes
> > 
> > Created an attachment (id=26742)
> > 
> >  --> (https://bugzilla.kernel.org/attachment.cgi?id=26742)
> > 
> > lspci output for 2.6.34 on my computer
> > 
> > Kernel at least 2.6.35-rc2, 2.6.34 works fine
> 
> Seems to be post-2.6.34 breakage in the tulip driver.
> 
> > Athlon X2 3000 in 64bits mode
> > Slackware 13.1
> > "Ethernet controller: ALi Corporation ULi 1689,1573 integrated ethernet.
> > (rev 40)" (from lspci)
> > 
> > Since 2.6.35-rc2 (didn't checked -rc1), dhcpcd hangs fails to detect
> > carrier appearance at boot.
> > 
> > The Slackware network script uses dhcpcd to bring DHCP interfaces up. At
> > boot, it seems my network device doesn't have the link up immediatly.
> > dhcpcd tries to bring it up, and wait for the carrier. The carrier
> > indeed goes up, but dhcpcd gets absolutly no notification of that, and
> > therefore times out.
> > 
> > logs says that:
> > Jun 12 16:53:58 sirius logger: /etc/rc.d/rc.inet1:  /sbin/route add -net
> > 127.0.0.0 netmask 255.0.0.0 lo
> > Jun 12 16:53:58 sirius logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -t 10 
> > eth0 Jun 12 16:53:58 sirius dhcpcd: version 5.2.2 starting
> > Jun 12 16:53:58 sirius kernel: ADDRCONF(NETDEV_UP): eth0: link is not
> > ready Jun 12 16:53:58 sirius dhcpcd: eth0: waiting for carrier
> > Jun 12 16:54:01 sirius kernel: uli526x: eth0 NIC Link is Up 100 Mbps Full
> > duplex
> > Jun 12 16:54:01 sirius kernel: ADDRCONF(NETDEV_CHANGE): eth0: link
> > becomes ready
> > then dhcpcd times out a little later (10 seconds as -t 10 is specified).
> > 
> > If I use the -K option (no carrier detection), and does ifconfig eth0 up
> > just before issueing the dhcpcd command, dhcpcd doesn't wait for the
> > carrier and gets a lease correctly.
> > 
> > Reversely, with the 2.6.34 kernel, dhcpcd indeed gets the link up
> > notification, and gets the lease immediatly. In this case, the log says:
> > 
> > Jun 12 17:04:21 sirius logger: /etc/rc.d/rc.inet1:  /sbin/route add -net
> > 127.0.0.0 netmask 255.0.0.0 lo
> > Jun 12 17:04:21 sirius logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -t 10 
> > eth0 Jun 12 17:04:22 sirius dhcpcd: version 5.2.2 starting
> > Jun 12 17:04:22 sirius kernel: ADDRCONF(NETDEV_UP): eth0: link is not
> > ready Jun 12 17:04:22 sirius dhcpcd: eth0: waiting for carrier
> > Jun 12 17:04:25 sirius dhcpcd: eth0: carrier acquired
> > Jun 12 17:04:25 sirius kernel: uli526x: eth0 NIC Link is Up 100 Mbps Full
> > duplex
> > Jun 12 17:04:25 sirius kernel: ADDRCONF(NETDEV_CHANGE): eth0: link
> > becomes ready
> > Jun 12 17:04:25 sirius dhcpcd: eth0: broadcasting for a lease
> > Jun 12 17:04:29 sirius dhcpcd: eth0: offered 192.168.1.3 from 192.168.1.1
> > Jun 12 17:04:29 sirius dhcpcd: eth0: acknowledged 192.168.1.3 from
> > 192.168.1.1 Jun 12 17:04:29 sirius dhcpcd: eth0: checking for
> > 192.168.1.3
> > Jun 12 17:04:34 sirius dhcpcd: eth0: leased 192.168.1.3 for 864000
> > seconds Jun 12 17:04:34 sirius dhcpcd: forking to background
> > 
> > As you can see, dhcpcd asks for a lease as soon as the kernel tells it
> > the link is up (message "carrier acquired").
> > 
> > Therefore, I think 2.6.35-rc* notification of carrier is broken, and at
> > least it broke dhcpcd way of watching the link.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up
  2010-06-15 21:24 ` [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up Andrew Morton
  2010-06-16  5:46   ` Christian Casteyde
@ 2010-06-18  6:16   ` Grant Grundler
  2010-06-18 15:04     ` Allan, Bruce W
  1 sibling, 1 reply; 5+ messages in thread
From: Grant Grundler @ 2010-06-18  6:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: netdev, Grant Grundler, Kyle McMartin, bugzilla-daemon,
	bugme-daemon, casteyde.christian

On Tue, Jun 15, 2010 at 02:24:18PM -0700, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).

I've resync to linus' tree (2.6.35-rc3) and reviewed the output of:
    git diff v2.6.34 drivers/net/tulip/

I don't see anything that would affect how link state
changes get reported to user space.

I'm not inclined to believe this is a tulip "bug" unless
core netdev behavior changed and tulip is not longer
doing the right thing.

hth,
grant

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up
  2010-06-18  6:16   ` Grant Grundler
@ 2010-06-18 15:04     ` Allan, Bruce W
  2010-06-21 21:21       ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Allan, Bruce W @ 2010-06-18 15:04 UTC (permalink / raw)
  To: Grant Grundler, Andrew Morton
  Cc: netdev@vger.kernel.org, Kyle McMartin,
	bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, casteyde.christian@free.fr

On Thursday, June 17, 2010 11:17 PM, Grant Grundler wrote:
> On Tue, Jun 15, 2010 at 02:24:18PM -0700, Andrew Morton wrote:
>> 
>> (switched to email.  Please respond via emailed reply-to-all, not
>> via the bugzilla web interface).
> 
> I've resync to linus' tree (2.6.35-rc3) and reviewed the output of:
>     git diff v2.6.34 drivers/net/tulip/
> 
> I don't see anything that would affect how link state
> changes get reported to user space.
> 
> I'm not inclined to believe this is a tulip "bug" unless
> core netdev behavior changed and tulip is not longer
> doing the right thing.
> 
> hth,
> grant

I don't believe this is a tulip specific bug - the same thing has been reported against e1000e and bnx2 (IIRC); I have not had the time to investigate further.

Bruce.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up
  2010-06-18 15:04     ` Allan, Bruce W
@ 2010-06-21 21:21       ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2010-06-21 21:21 UTC (permalink / raw)
  To: Allan, Bruce W
  Cc: Grant Grundler, netdev@vger.kernel.org, Kyle McMartin,
	bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, casteyde.christian@free.fr,
	YOSHIFUJI Hideaki

On Fri, 18 Jun 2010 08:04:36 -0700
"Allan, Bruce W" <bruce.w.allan@intel.com> wrote:

> On Thursday, June 17, 2010 11:17 PM, Grant Grundler wrote:
> > On Tue, Jun 15, 2010 at 02:24:18PM -0700, Andrew Morton wrote:
> >> 
> >> (switched to email.  Please respond via emailed reply-to-all, not
> >> via the bugzilla web interface).
> > 
> > I've resync to linus' tree (2.6.35-rc3) and reviewed the output of:
> >     git diff v2.6.34 drivers/net/tulip/
> > 
> > I don't see anything that would affect how link state
> > changes get reported to user space.
> > 
> > I'm not inclined to believe this is a tulip "bug" unless
> > core netdev behavior changed and tulip is not longer
> > doing the right thing.
> > 
> > hth,
> > grant
> 
> I don't believe this is a tulip specific bug - the same thing has been reported against e1000e and bnx2 (IIRC); I have not had the time to investigate further.

So it's affecting three drivers.

One thing which changed in there recently is

: commit b2db756449f63f98049587f7ede4a8e85e0c79b1
: Author:     YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
: AuthorDate: Sat Mar 20 16:11:12 2010 -0700
: Commit:     David S. Miller <davem@davemloft.net>
: CommitDate: Sat Mar 20 16:11:12 2010 -0700
:
:    ipv6: Reduce timer events for addrconf_verify().

So perhaps someone could test the simple reversion patch, below?

I couldn't locate these e1000e and bnx2 bug reports so I couldn't cc
the reporters :(

I'm seeing several patches on netdev "use netif_carrier_off to prevent
tx timeout".  Is that related?


 net/ipv6/addrconf.c |   27 ++++-----------------------
 1 file changed, 4 insertions(+), 23 deletions(-)

diff -puN net/ipv6/addrconf.c~revert-2 net/ipv6/addrconf.c
--- a/net/ipv6/addrconf.c~revert-2
+++ a/net/ipv6/addrconf.c
@@ -100,10 +100,6 @@
 #define	INFINITY_LIFE_TIME	0xFFFFFFFF
 #define TIME_DELTA(a, b) ((unsigned long)((long)(a) - (long)(b)))
 
-#define ADDRCONF_TIMER_FUZZ_MINUS	(HZ > 50 ? HZ/50 : 1)
-#define ADDRCONF_TIMER_FUZZ		(HZ / 4)
-#define ADDRCONF_TIMER_FUZZ_MAX		(HZ)
-
 #ifdef CONFIG_SYSCTL
 static void addrconf_sysctl_register(struct inet6_dev *idev);
 static void addrconf_sysctl_unregister(struct inet6_dev *idev);
@@ -3159,15 +3155,15 @@ int ipv6_chk_home_addr(struct net *net, 
 
 static void addrconf_verify(unsigned long foo)
 {
-	unsigned long now, next, next_sec, next_sched;
 	struct inet6_ifaddr *ifp;
 	struct hlist_node *node;
+	unsigned long now, next;
 	int i;
 
 	rcu_read_lock_bh();
 	spin_lock(&addrconf_verify_lock);
 	now = jiffies;
-	next = round_jiffies_up(now + ADDR_CHECK_FREQUENCY);
+	next = now + ADDR_CHECK_FREQUENCY;
 
 	del_timer(&addr_chk_timer);
 
@@ -3181,8 +3177,7 @@ restart:
 				continue;
 
 			spin_lock(&ifp->lock);
-			/* We try to batch several events at once. */
-			age = (now - ifp->tstamp + ADDRCONF_TIMER_FUZZ_MINUS) / HZ;
+			age = (now - ifp->tstamp) / HZ;
 
 			if (ifp->valid_lft != INFINITY_LIFE_TIME &&
 			    age >= ifp->valid_lft) {
@@ -3252,21 +3247,7 @@ restart:
 		}
 	}
 
-	next_sec = round_jiffies_up(next);
-	next_sched = next;
-
-	/* If rounded timeout is accurate enough, accept it. */
-	if (time_before(next_sec, next + ADDRCONF_TIMER_FUZZ))
-		next_sched = next_sec;
-
-	/* And minimum interval is ADDRCONF_TIMER_FUZZ_MAX. */
-	if (time_before(next_sched, jiffies + ADDRCONF_TIMER_FUZZ_MAX))
-		next_sched = jiffies + ADDRCONF_TIMER_FUZZ_MAX;
-
-	ADBG((KERN_DEBUG "now = %lu, schedule = %lu, rounded schedule = %lu => %lu\n",
-	      now, next, next_sec, next_sched));
-
-	addr_chk_timer.expires = next_sched;
+	addr_chk_timer.expires = time_before(next, jiffies + HZ) ? jiffies + HZ : next;
 	add_timer(&addr_chk_timer);
 	spin_unlock(&addrconf_verify_lock);
 	rcu_read_unlock_bh();
_


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-06-21 21:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bug-16187-10286@https.bugzilla.kernel.org/>
2010-06-15 21:24 ` [Bugme-new] [Bug 16187] New: Carrier detection failed in dhcpcd when link is up Andrew Morton
2010-06-16  5:46   ` Christian Casteyde
2010-06-18  6:16   ` Grant Grundler
2010-06-18 15:04     ` Allan, Bruce W
2010-06-21 21:21       ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).