From: Marcus Blomenkamp <Marcus.Blomenkamp@epost.de>
To: Francois Romieu <romieu@fr.zoreil.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: r8169 GigE driver problem, locks up 2.4.23 NFS subsystem
Date: Mon, 15 Dec 2003 11:24:15 +0100
Message-ID: <200312151124.15143.Marcus.Blomenkamp@epost.de>
In-Reply-To: <20031214144055.A4664@electric-eye.fr.zoreil.com>

On Sunday, 14 December 2003 14:40, you wrote:
>
> Ok, this one is merged into Jeff Garzik's patchset. If you try it and it
> does not work, please report if it does not work in a different manner
> (because it is still possible that I have broken something during the
> merge).

Hi.

First of all, I did not notice any difference in behaviour between the different
ACPI options, so I ran all these tests with acpi=off explicitly. The mainboard is
an Asus P2B with a recent ACPI beta BIOS.

OK, from this patchset I extracted and updated only the 8169.c diff. And bad
news: with this driver built into the kernel, the machine does not boot at all.
It locks up while configuring the card, 100% reproducibly.

I enabled the debug options in 8169 and got this output (transcribed by hand):

r8169 Gigabit Ethernet driver 1.2 loaded
mac_version == RTL_GIGA_MAC_VER_E (0002)
phy_version == RTL_GIGA_PHY_VER_E (0000)
eth0: RTL8169 at 0xd882a00, 00:08:54:d0:e4:70, IRQ 5
mac_version == RTL_GIGA_MAC_VER_E (0002)
phy_version == RTL_GIGA_PHY_VER_E (0000)
Do final_reg2.cfg

I added more dummy printks. Intermediate result: it does not return from the
function 'rtl8169_hw_phy_config()'.

This routine messes up the card itself, as a reset/reboot into 2.4 does not
revive it. I definitely have to power-cycle the machine.

>
> Is it possible to get an ethereal dump for a normal nfs operation as well
> as for a failed one on both sides of the link (client + server) ?
> The size of the dump should not be an issue on my side.

I tcpdump'ed both sides while transferring 1 megabyte to the 8139-based machine,
both for low-level transfers (UDP, TCP) and at filesystem level (NFS).

NFS: dd 1M to remote file
UDP/TCP: dd 1M through netcat
TCP: scp 1M file

Very interestingly, I could not reproduce the NFS lockup during this double
monitoring setup. So I ran it twice, once with each machine in promiscuous
mode. And guess what: if the 8169 NIC is in monitoring mode, NFS writes do not
lock up. I can even recover the machine from a stall by explicitly entering
promiscuous mode and SIGINT'ing the writing process.


>
> A /proc/interrupts of the working/failing client after the nfs operation
> as well as after a normal scp (if possible) could help too.

Everything normal, no interrupt storm visible. While NFS is stalled, the NIC's
interrupt counter does not increase; however, explicit pings work fine and do
increment the counter.
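
To make the counter comparison repeatable, two snapshots of /proc/interrupts
can be diffed per IRQ. The following is only a minimal sketch: the function
names and the sample snapshot text are hypothetical, not output from the
machine discussed here. The only detail taken from the log above is that the
card sits on IRQ 5.

```python
def irq_counts(text):
    """Map each IRQ label to its interrupt count, summed over all CPUs."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the CPU header line has no colon
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller/device name columns
            total += int(field)
        counts[label.strip()] = total
    return counts


def irq_delta(before, after, irq):
    """Number of interrupts seen on `irq` between two snapshots."""
    return irq_counts(after)[irq] - irq_counts(before)[irq]


# Hypothetical snapshots, for illustration only.
SAMPLE_BEFORE = """\
           CPU0
  0:     100000          XT-PIC  timer
  5:        200          XT-PIC  eth0
NMI:          0
"""
SAMPLE_AFTER = """\
           CPU0
  0:     100500          XT-PIC  timer
  5:        200          XT-PIC  eth0
NMI:          0
"""

# A delta of zero on the NIC's IRQ while the timer still advances
# corresponds to the stalled case described above.
stalled = irq_delta(SAMPLE_BEFORE, SAMPLE_AFTER, "5") == 0
```

On a real system the two strings would be read from /proc/interrupts before
and after the NFS write.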

>
> Some 'ifconfig' output will be welcome too.
>
> If the problem is more or less trivially related to the length of the
> frames, it should be possible to notice a change of behavior while
> increasing the size of simple ping (ping -n -c 1 -s _size 192.168.1.254
> where _size ~= 1460...1480). Do you notice something here ?

This one is interesting too. After having found that NFS locks up above
datagramsize==4k, I tried to find the exact boundary in this range. And there
is none. First impressions (I will do a deeper analysis later):

send 4k		OK
send 5k		-
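
For reference, the sizes above can be translated into the number of IP
fragments each ping produces. This is a rough sketch under assumptions that
are not stated in this mail: a standard Ethernet MTU of 1500, a 20-byte IPv4
header without options, an 8-byte ICMP header, and "4k"/"5k" read as 4096 and
5120 bytes. It only does the arithmetic and does not show that fragmentation
is the cause.

```python
import math


def ip_fragments(icmp_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Number of IPv4 fragments needed for one ICMP echo of this payload size."""
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil((icmp_payload + icmp_header) / per_fragment)


# Assumed interpretation of the sizes in the table above.
fragments_4k = ip_fragments(4096)
fragments_5k = ip_fragments(5120)
```

Under these assumptions a 1472-byte payload is the largest that fits into a
single frame, which is why the suggested test range of 1460...1480 straddles
the fragmentation boundary.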

IIRC there is a region around 4750 bytes in which it seems to work like a
Schmitt trigger. If I am coming from below (i.e. ping works), I can ping with
dsize=X, but if I am coming from above (i.e. no ping answer), I cannot ping
with the same dsize=X.
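
The hysteresis described above could be mapped by sweeping the ping size
upwards and then downwards and noting where the link fails and where it
recovers. The following is a minimal sketch, not the procedure actually used:
all names are hypothetical, and the SchmittLink thresholds (4700/4800) are
made-up values that merely imitate the observed behaviour. ping_probe wraps
the ping invocation suggested in the quoted text.

```python
import subprocess


def ping_probe(host):
    """Return a probe(size) that sends one ping of the given payload size."""
    def probe(size):
        cmd = ["ping", "-n", "-c", "1", "-s", str(size), host]
        return subprocess.run(cmd, stdout=subprocess.DEVNULL).returncode == 0
    return probe


def hysteresis_window(probe, lo, hi, step):
    """Sweep sizes upwards, then downwards.

    Returns (first size that fails on the way up,
             first size that works again on the way down).
    """
    sizes = list(range(lo, hi + 1, step))
    rising = [(s, probe(s)) for s in sizes]
    falling = [(s, probe(s)) for s in reversed(sizes)]
    fail_at = next((s for s, ok in rising if not ok), None)
    recover_at = next((s for s, ok in falling if ok), None)
    return fail_at, recover_at


class SchmittLink:
    """Simulated link with two thresholds (values are invented)."""

    def __init__(self, lower=4700, upper=4800):
        self.lower, self.upper, self.ok = lower, upper, True

    def probe(self, size):
        if size > self.upper:
            self.ok = False      # too large: link stops answering
        elif size <= self.lower:
            self.ok = True       # small enough: link answers again
        return self.ok           # in between: keeps its previous state


# Dry run against the simulation; no packets are sent.
window = hysteresis_window(SchmittLink().probe, 4000, 5200, 50)
```

With a real link, hysteresis_window(ping_probe("192.168.1.254"), 4000, 5200, 50)
would run the same sweep. If the two returned sizes differ, the result for
sizes between them depends on the direction of approach.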

I'm copying this mail to linux-kernel, so I'll send the logs to you in a
separate mail.

Best regards, Marcus

