From: patric <pakar@imperialnet.org>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: tg3 issues
Date: Thu, 19 Jul 2007 15:24:17 +0200
Message-ID: <469F6601.20800@imperialnet.org>
In-Reply-To: <20070719113755.GA1934@hmsreliant.homelinux.net>

Neil Horman wrote:

> On Thu, Jul 19, 2007 at 04:49:13PM +0530, pradeep singh wrote:
>   
>> CCing: netdev
>>
>> On 7/19/07, patric <pakar@imperialnet.org> wrote:
>>     
>>> Hi,
>>>
>>>
>>> To start with, I'm not sure if this should go to the dev or user list,
>>> but I'll start here.
>>>
>>>
>>> I'm currently running an nfsroot via a Broadcom NetXtreme 1000-SX card
>>> (BCM5701) and I have a big problem with the tg3 driver's autonegotiation.
>>>
>>> The issue seems to be that when the kernel gets as far as trying
>>> to mount the root, the autonegotiation has not yet completed, which
>>> causes a panic since it cannot mount the nfsroot.
>>>
>>>
>>> From some debugging I have done here, the issue seems to be related to
>>> the flow control configuration, and just to make it a bit more fun, it
>>> does work some of the time (around once every 5-10 attempts).
>>>
>>>
>>> On the console it looks something like this when failing (written from
>>> memory since I don't have netconsole enabled):
>>>
>>> tg3: eth0: Link is up at 1000 Mbps, full duplex.
>>> tg3: eth0: Flow control is off for TX and off for RX.
>>> IP-Config: Complete:
>>>      device=eth0, addr=192.168.1.10, mask=255.255.255.0, gw=255.255.255.255,
>>>     host=amd, domain=, nis-domain=(none),
>>>     bootserver=255.255.255.255, rootserver=192.168.1.1, rootpath=
>>> Root-NFS: unknown option: nolocks
>>> Looking up port of RPC 100003/3 on 192.168.1.1
>>> rpcbind: server 192.168.1.1 not responding, timed out
>>>
>>> Root-NFS: Unable to get nfsd port number from server, using default
>>>
>>> Looking up port of RPC 100003/3 on 192.168.1.1
>>> rpcbind: server 192.168.1.1 not responding, timed out
>>>
>>> Root-NFS: Unable to get nfsd port number from server, using default
>>>
>>>
>>> and so on until it panics...
>>>
>>>       
>
> IIRC, there are two main problems in this type of situation:
>
> 1) Spanning tree convergence
> 2) Firmware initialization latency
>
> If you are running spanning tree on your network, it can take up to 2 minutes
> before your port will forward frames properly.  If you have the options
> available, disable spanning tree on the switch port connected to your host
> system, or at least enable portfast if it is an option.  That should fix any
> spanning tree issues you have.
>
> If the tg3 card is just taking a long time to initialize, there is not too much
> you can do about it.  If your goal is to use nfs root, then instead of
> enabling nfs-root as a kernel config option, I would create an initramfs that:
> A) Brings up your NIC
> B) Mounts your nfs partition
> C) Executes a switch_root or pivot_root operation
>
> That way you can calibrate a delay between steps (A) and (B) in your initramfs
> init script.
>
> Regards
> Neil
>
>   
Hi Neil, and thanks for your quick reply, and thanks Pradeep for
forwarding the question to the correct mailing list.

Well, I'm not using any switches, just a crossover cable between the
machines. I did notice that it seems to get a 'good link' more often when
cold-booting the client.
I have been thinking about using an initrd to get around the issue, but the
problem is that you never know how long the init will take, so there would
always have to be quite a big delay before the system can boot. But I
don't really think the issue is that the card takes a long time to
initialize, since it does sometimes work without delay during a warm boot,
and the cards do report that they are up but then report different
states of flow control. Maybe I'll set the flow control statically in
the driver as a test, if I can figure out how this driver works. :)
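
One way around a fixed worst-case delay in an initrd would be to poll for
readiness and continue as soon as the server answers. Below is a minimal
sketch of that idea, not a tested initrd component. It assumes the NFS
server's rpcbind accepts TCP connections on port 111, and the helper names
wait_until and rpcbind_reachable are made up for illustration. It checks the
server rather than the local link state, because in the failing case above
the link is already reported as up.

```python
import socket
import time


def wait_until(check, timeout=60.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() repeatedly until it returns True or timeout expires.

    Returns the number of seconds waited on success, or None on timeout.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def rpcbind_reachable(host, port=111, connect_timeout=1.0):
    """Build a check() that tries a TCP connection to the server's rpcbind.

    Assumption: rpcbind on the server listens on TCP port 111.
    """
    def check():
        try:
            with socket.create_connection((host, port),
                                          timeout=connect_timeout):
                return True
        except OSError:
            # Refused, unreachable, or timed out: treat as "not ready yet".
            return False
    return check
```

With the addresses from the log above, the init script would call
wait_until(rpcbind_reachable("192.168.1.1")) after bringing up eth0 and only
attempt the NFS mount once it returns a number instead of None. The boot then
waits only as long as the link actually needs.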

Just a hypothetical question: if the two network cards start
autonegotiation, would it be possible for them to get into a loop where
they are chasing each other's state? Maybe a fix could be to add a sleep
of random length that would let them catch up? Also, maybe you know whether
any of the fiber cards support running without flow control, since
the cards don't seem to be able to get a link with flow control turned
off, at least in this setup.
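
To illustrate why a random sleep could help in the hypothetical lockstep
case, here is a toy model. It is purely an assumption for the sake of
argument and does not describe how tg3 or 1000BASE-X autonegotiation actually
behaves. Two peers each wait some delay before restarting negotiation. If
both pick the same delay they restart at the same moment and are no better
off, and if the delays differ, one peer gets ahead and the pair settles.

```python
import random


def rounds_until_settled(delay_a, delay_b, max_rounds=1000):
    """Toy model of two peers restarting negotiation.

    delay_a and delay_b are functions returning each peer's restart delay.
    Equal delays mean the peers restart in lockstep and must try again.
    Returns the round in which the delays first differ, or None if the
    peers are still in lockstep after max_rounds.
    """
    for round_no in range(1, max_rounds + 1):
        if delay_a() != delay_b():
            return round_no
    return None


def fixed_delay():
    # Both peers use the same constant delay (hypothetical value).
    return 5


rng = random.Random(0)  # seeded only to make the sketch repeatable


def random_delay():
    # Each call draws an independent delay between 0 and 15 ticks.
    return rng.randint(0, 15)


print(rounds_until_settled(fixed_delay, fixed_delay))    # None: never settles
print(rounds_until_settled(random_delay, random_delay))  # settles after a few rounds
```

In this model, identical fixed delays never break the symmetry, while
independent random delays break it quickly. Whether the real hardware can
get into such a state at all is exactly the open question.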


Regards,
Patric



