public inbox for linux-kernel@vger.kernel.org
From: Vladimir Serov <vserov@infratel.com>
To: linux-kernel <linux-kernel@vger.kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: [BUG] nfs client stuck in D state in linux 2.4.17 - 2.4.21-pre5
Date: Thu, 20 Mar 2003 19:22:00 +0300
Message-ID: <3E79EAA8.4000907@infratel.com>
In-Reply-To: 20030318155731.1f60a55a.skraw@ithnet.com



Hello Trond, hello all,

I'm suffering from a long-standing bug in the NFS client.
This bug causes programs reading from an NFS volume to get stuck in D
state forever.
It shows up only when the client talks to an NFS server with a 3COM 3C905
NIC (well, I've triggered it with an Intel eepro card too, but you have to
wait), and never with cheap, slower cards like RTLxxxx or NE2000 clones. It
happens infrequently but inevitably. It happens more frequently on the
2.4.17 kernel than on 2.4.21-pre5, and more frequently when compiled with
gcc 3.2.1 than with gcc 2.95.3. Trond's NFS patches don't help on either
kernel. It's not due to packet loss (OK, packet loss happens sometimes, but
rarely), and it happens at both 10 and 100 Mbps. This happens only on my
StrongARM board (similar to Brutus) with SMC's LAN91C111 ethernet chip.
I've not been able to reproduce this on a PC, but I've heard about a very
similar case:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0206.0/0066.html
It is triggered simply by 'ls -lR /home>/dev/null&' and takes about half a
minute to happen.
If I insert a few printk's in the interrupt handler for the NIC, it's gone!!!
IMHO this is due to a race in the NFS client.

Look at some logs from my system:

sh-2.03# mount
rootfs on / type rootfs (rw)
/dev/mtdblock4 on / type jffs2 (rw)
none on /proc type proc (rw)
none on /tmp type tmpfs (rw)
none on /dev/pts type devpts (rw)
infracvs:/group on /group type nfs (rw,v2,rsize=4096,wsize=4096,soft,intr,udp,lock,addr=infracvs)
serov:/home on /home type nfs (rw,v3,rsize=4096,wsize=4096,soft,intr,udp,lock,addr=serov)

sh-2.03# ps
  PID  Uid     Stat Command
    1 root     S    init
    2 root     S    [keventd]
    3 root     S    [ksoftirqd_CPU0]
    4 root     S    [kswapd]
    5 root     S    [bdflush]
    6 root     S    [kupdated]
    7 root     S    [mtdblockd]
    8 root     S    [jffs2_gcd_mtd4]
  102 root     S    dhcpcd
  111 bin      S    portmap
  113 root     S    [rpciod]
  114 root     S    [lockd]
  124 root     S    klogd
  138 root     S    /usr/sbin/inetd
  143 root     S    /www/sbin/sshd -f /www/etc/sshd_config
  152 root     S    init
  153 root     S    sh -login -i
  158 root     D    ls -lR /home
  159 root     D    ls -lR /home
  179 root     D    ls -lR /home
  183 root     R    ps
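
For anyone trying to reproduce this, the check for stuck processes can be automated. The following is only a minimal sketch: it assumes the four-column `PID Uid Stat Command` layout of the ps listing above, and the names `find_d_state` and `SAMPLE` are made up for illustration (they are not part of any tool mentioned here).

```python
from typing import List, Tuple


def find_d_state(ps_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for every process whose Stat column starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines():
        # Assumed layout: PID, Uid, Stat, then the command (which may contain spaces).
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header row and blank lines
        pid, _uid, stat, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), command.strip()))
    return stuck


# A few rows copied from the listing above, used as sample input.
SAMPLE = """\
  PID  Uid     Stat Command
  113 root     S    [rpciod]
  158 root     D    ls -lR /home
  159 root     D    ls -lR /home
  179 root     D    ls -lR /home
  183 root     R    ps
"""

if __name__ == "__main__":
    for pid, command in find_d_state(SAMPLE):
        print(pid, command)
```

Run in a loop against live ps output, something like this could report when the first 'ls' hangs, which may help when comparing how long the bug takes to trigger on different kernels.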

Part of the output from Magic SysRq t, with decoded symbols:

ls            D C001EB8C  3216   165    153                     (NOTLB)
Function entered at [<c001e990>] from [<c0109b14>]
                      schedule          __rpc_execute

I've used /proc/sys/sunrpc/rpc_debug and /proc/sys/sunrpc/nfs_debug to
get some info. There was nothing interesting in it except that the RPC
request which was constantly reused after 'ls' got stuck appears in the
--rqstp- column of the following output.
sh-2.03# echo 1 > /proc/sys/sunrpc/rpc_debug
sh-2.03# dmesg -c -s 66666
-pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait   -action- --exit--
20429 0001 0000 000000 c0eda960 100003 c8f89218 00000000   <NULL>   c0105d5c        0
10052 0001 0000 000000 c0eda960 100003 c8f8918c 00000000   <NULL>   c0105d5c        0
06851 0001 0000 000000 c0eda960 100003 c8f89100 00000000   <NULL>   c0105d5c        0
00673 0004 0000 000000 c0eda960 100003 c8f89074 00000000   <NULL>   c0105d5c        0
00368 0000 0081 -00110 c0eda960 100003        0 00003000 nfs_flushd c006e290 c006e3c8
00002 0000 0081 -00110 c0e310a0 100003        0 00003000 nfs_flushd c006e290 c006e3c8

c006e290 t nfs_flushd
c006e3c8 t nfs_flushd_exit
c0105d5c t call_status
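
The address-to-symbol lookup for the task table can be done mechanically. The sketch below assumes each task record sits on a single line with the eleven whitespace-separated columns named in the header, and it uses only the three symbols listed above. `SYMBOLS`, `resolve` and `parse_task` are hypothetical names chosen for this example.

```python
# Symbol addresses exactly as listed in the message.
SYMBOLS = {
    0xC006E290: "nfs_flushd",
    0xC006E3C8: "nfs_flushd_exit",
    0xC0105D5C: "call_status",
}

COLUMNS = ("pid", "proc", "flgs", "status", "client", "prog",
           "rqstp", "timeout", "rpcwait", "action", "exit")


def resolve(addr_hex, symbols=SYMBOLS):
    """Map a hex address string to a symbol name, or return it unchanged."""
    return symbols.get(int(addr_hex, 16), addr_hex)


def parse_task(line):
    """Parse one task record; return None for the header or malformed lines."""
    fields = line.split()
    if len(fields) != len(COLUMNS) or not fields[0].isdigit():
        return None
    task = dict(zip(COLUMNS, fields))
    # Only the two callback columns are code addresses worth resolving.
    task["action"] = resolve(task["action"])
    task["exit"] = resolve(task["exit"])
    return task


if __name__ == "__main__":
    row = ("20429 0001 0000 000000 c0eda960 100003 c8f89218 00000000"
           "  <NULL>  c0105d5c        0")
    task = parse_task(row)
    print(task["pid"], task["rqstp"], task["action"])
```

With the table above, every task that holds a request slot resolves to `call_status` in the -action- column, while the two `nfs_flushd` tasks resolve to `nfs_flushd` and `nfs_flushd_exit`.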



