Re: tg3 issues

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: patric <pakar@imperialnet.org>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: tg3 issues
Date: Thu, 19 Jul 2007 15:24:17 +0200	[thread overview]
Message-ID: <469F6601.20800@imperialnet.org> (raw)
In-Reply-To: <20070719113755.GA1934@hmsreliant.homelinux.net>

Neil Horman wrote:

> On Thu, Jul 19, 2007 at 04:49:13PM +0530, pradeep singh wrote:
>   
>> CCing: netdev
>>
>> On 7/19/07, patric <pakar@imperialnet.org> wrote:
>>     
>>> Hi,
>>>
>>>
>>> To start with, i'm not sure if this should go to the dev or user list,
>>> but i'll start here..
>>>
>>>
>>> I'm currently running a nfsroot via a Broadcom NetXtreme 1000-SX card
>>> (BCM5701) and i have a big problem with the tg3 drivers autonegotiation.
>>>
>>> The issue seems to be that when the kernel comes so far as it's trying
>>> to mount the boot the autonegotiation has not yet completed and then
>>> causes a panic since it cannot mount the nfsroot.
>>>
>>>
>>> From some debugging i have done here the issues seems to be related to
>>> the flowcontrol configuration, and just to make it a bit more fun it
>>> does work some of the time.. (around once every 5-10 attempts.)
>>>
>>>
>>> On the console it looks something like this when failing. (written from
>>> memory since i don't have netconsole enabled)
>>>
>>> tg3: eth0: Link is up at 1000 Mbps, full duplex.
>>> tg3: eth0: Flow control is off for TX and off for RX.
>>> IP-Config: Complete:
>>>      device=eth0, addr=192.168.1.10, mask=255.255.255.0,
>>> gw=255.255.255.255,
>>>     host=amd, domain=, nis-domain=(none),
>>>     bootserver=255.255.255.255, rootserver=192.168.1.1, rootpath=
>>> Root-NFS: unknown option: nolocks
>>> Looking up port of RPC 100003/3 on 192.168.1.1
>>> rpcbind: server 192.168.1.1 not responding, timed out
>>>
>>> Root-NFS: Unable to get nfsd port number from server, using default
>>>
>>> Looking up port of RPC 100003/3 on 192.168.1.1
>>> rpcbind: server 192.168.1.1 not responding, timed out
>>>
>>> Root-NFS: Unable to get nfsd port number from server, using default
>>>
>>>
>>> and so on until it panics...
>>>
>>>       
>
> IIRC, there are two main problems in this typ of situation
>
> 1) Spanning tree convergence
> 2) Firmware initalization latency
>
> If you are running spanning tree on your network, it can take up to 2 minutes
> before your port will forward frames properly.  if you have the options
> available, disable spanning tree on the switch port connected to your host
> system, or at least enable portfast if it is an option.  That should fix any
> spanning tree issues you have
>
> If the tg3 card is just taking a long time to initalize, there is not too much
> you can do about it.  If your goal is to use nfs root, I would, instead of
> enabling nfs-root as a kernel config option, I would create an initramfs that:
> A) Brings up your NIC
> B) Mounts your nfs partition
> C) executes a switch_root or pivot_root operation
>
> That way you can calibrate a delay between steps (A) and (B) in your initramfs
> init script
>
> Regards
> Neil
>
>   
Hi Neil and thanks for your quick reply, and thanks Pradeep for 
forwarding the question to the correct mailinglist.

Well, not using any switches and just a crossed cable between the 
machines. Did notice that it seems to get a 'good link' more often when 
cold-booting the client.
Have been thinking about using a initrd to get around the issue, but the 
problem is that you never know how long the init will be so there will 
always have to be a quite big delay before the system can boot. But 
don't really think the issue is that the card takes a long time to 
initialize since it does sometime work without delay during a warm-boot 
and the cards do report that they are up but they then are reporting 
different states of flow-control. Maybe set the flowcontrol static in 
the driver for a test, if i now can figure out how this driver works. :)

Just a hypothetical question. If the 2 network cards starts the 
autonegotiation would it be possible that they get into a loop where 
they are chasing each others state?  Maybe a fix could be to add a sleep 
of a random length that would enable them to catch up? Maybe you know if 
any of the fiber-cards so support running without flowcontrol too since 
the cards don't seem to be able to get a link with flowcontrol turned 
off at least in this setup.


Regards,
Patric

next prev parent reply	other threads:[~2007-07-19 13:32 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <469E78A2.80904@imperialnet.org>
2007-07-19 11:19 ` tg3 issues pradeep singh
2007-07-19 11:37   ` Neil Horman
2007-07-19 13:24     ` patric [this message]
2007-07-19 17:34       ` Michael Chan
2007-07-20 16:59         ` patric
2007-07-20 19:34           ` Michael Chan
2007-07-20 19:57             ` patric
2007-07-22 11:43               ` patric
2007-07-23 21:34                 ` Michael Chan
2007-07-24  7:33                   ` patric

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=469F6601.20800@imperialnet.org \
    --to=pakar@imperialnet.org \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).