From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: "Stephen R. van den Berg" <srb-PCMv+cxZuL0@public.gmane.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-nfs@vger.kernel.org
Subject: Re: Fw: Deadlock regression in v2.6.31.6
Date: Wed, 25 Nov 2009 09:31:52 -0500
Message-ID: <1259159512.3314.12.camel@localhost>
In-Reply-To: <64b4daae0911250056g3364d24l98850a272dcfe483-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Wed, 2009-11-25 at 09:56 +0100, Stephen R. van den Berg wrote: 
> > The problem vanishes as soon as I run v2.6.31.5 (neither kernel contains
> > any significant modules).
> 
> I did a bisect, and it turns out that the problem is there in 2.6.31.5 as well.

This makes sense. There have been no RPC level changes between 2.6.31.5
and 2.6.31.6.

> The traces are still valid.  This is on an NFS mounted root partition
> (NFSv3 over TCP), no other filesystems mounted (except a tmpfs here or
> there).  I turned on some debugging in net/sunrpc/sched.c, and the
> following happens when I execute "apt-get --reinstall install man-db"
> (it happens every time, so it is very reproducible):
> 
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7827)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7828)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7830)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7831)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7833)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7835)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7836)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7838)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7839)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7841)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7842)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> RPC:  9697 __rpc_execute flags=0x1 cf849c44
> RPC:  9697 sleep_on(queue "xprt_pending" time 7844)
> RPC:  9697 added to queue cfa72d88 "xprt_pending"
> RPC:  9697 setting alarm for 60000 ms
> RPC:  9697 __rpc_wake_up_task (now 7845)
> RPC:  9697 disabling timer
> RPC:  9697 removed from queue cfa72d88 "xprt_pending"
> RPC:       __rpc_wake_up_task done
> 
> Ad infinitum.
> The cf849c44 is the task parameter which I printed as well.
> It looks like an endless loop in the state machine.
> The kernel hangs at this point, the only way to get out of there is
> using SysBreak.
> I tried debugging it further, but I got lost in the state machine (I think).

This just means that the RPC client is waiting for a reply from the NFS
server.
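
To put numbers on the quoted trace, here is a minimal sketch that pairs each
sleep_on line with the next __rpc_wake_up_task line for the same task and
reports the gap between them. The regular expressions are assumed from the
excerpt quoted above, not taken from the kernel source; the function name
sleep_durations is hypothetical, and the counter values are treated as opaque
ticks (they look like jiffies, but that is an assumption).

```python
import re

# Patterns follow the debug lines as quoted in this thread (assumed format).
SLEEP = re.compile(r'RPC:\s+(\d+) sleep_on\(queue "([^"]+)" time (\d+)\)')
WAKE = re.compile(r"RPC:\s+(\d+) __rpc_wake_up_task \(now (\d+)\)")


def sleep_durations(lines):
    """Map task id -> list of tick gaps between sleep_on and the next wake-up."""
    asleep_since = {}  # task id -> counter value at the most recent sleep_on
    durations = {}
    for line in lines:
        m = SLEEP.search(line)
        if m:
            asleep_since[m.group(1)] = int(m.group(3))
            continue
        m = WAKE.search(line)
        if m and m.group(1) in asleep_since:
            started = asleep_since.pop(m.group(1))
            durations.setdefault(m.group(1), []).append(int(m.group(2)) - started)
    return durations


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < dmesg_excerpt.txt
    for task, gaps in sleep_durations(sys.stdin).items():
        print(task, gaps)
```

Run over the excerpt quoted above, every gap for task 9697 comes out at one
or two ticks, well short of the 60000 ms alarm.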

Does 'netstat -t' show that there is an active TCP connection to the
server's nfs port?
Does wireshark show that the client should have received a reply?
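
If netstat is not at hand on the client, the same check can be roughed out
from /proc/net/tcp. This is only a sketch under stated assumptions: the
server listens on the standard NFS port 2049 (the thread does not say so),
the connection is IPv4, and the client is little-endian so that the hex
address bytes need reversing. The function name established_to_port is
hypothetical.

```python
import socket

NFS_PORT = 2049          # standard NFS port; an assumption, not from the thread
TCP_ESTABLISHED = 0x01   # state code used in /proc/net/tcp


def established_to_port(proc_net_tcp_text, port=NFS_PORT):
    """Return remote "ip:port" strings for ESTABLISHED IPv4 sockets to `port`."""
    peers = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        addr_hex, port_hex = fields[2].split(":")    # rem_address column
        state = int(fields[3], 16)                   # st column
        if state != TCP_ESTABLISHED or int(port_hex, 16) != port:
            continue
        # Address is printed as a host-order 32-bit hex word; reversing the
        # bytes is correct on little-endian machines only (assumed here).
        ip = socket.inet_ntoa(bytes.fromhex(addr_hex)[::-1])
        peers.append(f"{ip}:{port}")
    return peers


if __name__ == "__main__":
    with open("/proc/net/tcp") as f:
        print(established_to_port(f.read()))
```

An empty list would suggest there is no established connection to that port,
which is the first thing to rule out before looking at packet captures.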

Trond


