All of lore.kernel.org
 help / color / mirror / Atom feed
From: Menyhart Zoltan <Zoltan.Menyhart@bull.net>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Subject: Re: "xprt" reference count drops to 0
Date: Mon, 25 Oct 2010 13:56:31 +0200	[thread overview]
Message-ID: <4CC5706F.3020308@bull.net> (raw)
In-Reply-To: <20101022212007.GB22837@fieldses.org>

> Maybe your fix is right, but I'm not sure: It looks to me like if
> svc_xprt_enqueue() gets to "process:" in a situation where the caller
> holds the only reference, then that's already a bug.  Do you know who
> the caller of svc_xprt_enqueue() was when this happened?

Unfortunately, I do not.
We saw lots of warnings like this (before my patch):

WARNING: at lib/kref.c:43 kref_get+0x23/0x2d()
Hardware name: Stoutland Platform
Modules linked in: rdma_ucm ib_sdp rdma_cm iw_cm ib_addr ib_ipoib ib_cm ib_sa ib_uverbs ib_umad mlx4_ib mlx4_core ib_mthca ib_mad ib_core ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipv6 scsi_dh_emc dm_round_robin dm_multipath iTCO_wdt i2c_i801 i2c_core igb ioatdma iTCO_vendor_support dca raid0 lpfc usb_storage scsi_transport_fc scsi_tgt [last unloaded: mlx4_core]
Pid: 3571, comm: nfsd Tainted: G      D W  2.6.32.9-21.fc12.Bull.6.x86_64 #1
Call Trace:
  [<ffffffff81050af0>] warn_slowpath_common+0x7c/0x94
  [<ffffffff81050b1c>] warn_slowpath_null+0x14/0x16
  [<ffffffff81222cca>] kref_get+0x23/0x2d
  [<ffffffffa01d0a90>] svc_xprt_get+0x12/0x14 [sunrpc]
  [<ffffffffa01d1903>] svc_recv+0x2db/0x78a [sunrpc]
  [<ffffffff8104b2ef>] ? default_wake_function+0x0/0x14
  [<ffffffffa02a1898>] nfsd+0xac/0x13f [nfsd]
  [<ffffffffa02a17ec>] ? nfsd+0x0/0x13f [nfsd]
  [<ffffffff8106ee0e>] kthread+0x7f/0x87
  [<ffffffff8100cd6a>] child_rip+0xa/0x20
  [<ffffffff8106ed8f>] ? kthread+0x0/0x87
  [<ffffffff8100cd60>] ? child_rip+0x0/0x20

When you can see the messages like this, the guilty task is already over...

> Doh.  Wait, when you say "has not been corrected on 2.6.36-rc3", do you
> mean you've actually *seen* the problem occur on 2.6.36-rc3?

We do not really use more advanced kernel than the 2.6.32 in a great number.
Just some test configurations up to the 2.6.36-rc6...
I have not seen this problem on recent kernels.

I'm not sure that I can really follow your explication.

I have got two reasons why I think my patch is good.

1. There are several code sequences where just after calling "svc_xprt_enqueue()",
    we drop "kref", i.e. we do not delegate the access right. Therefore "kref"
    should be increased by "svc_xprt_enqueue()". See:

svc_revisit()
{
...
	svc_xprt_enqueue(xprt);
	svc_xprt_put(xprt);
}

svc_xprt_release()
{
...
	svc_reserve(rqstp, 0):
		...
		svc_xprt_enqueue(xprt);
	svc_xprt_put(xprt);
}

svc_check_conn_limits()
{
...
			svc_xprt_enqueue(xprt);
			svc_xprt_put(xprt);
}

svc_age_temp_xprts()
{
...
		svc_xprt_enqueue(xprt);
		svc_xprt_put(xprt);
}

2. Increasing "kref" by "svc_recv()" is too late: by the time you can increase
    "kref", the structure may have already been destroyed.

As "svc_xprt_enqueue()" has not been modified since, I deduce that my patch is
correct for the 2.6.36-rc kernels, too.

Thanks.

Zoltan Menyhart

  parent reply	other threads:[~2010-10-25 11:57 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-01 12:17 Relocate NFS root FS for maintenance Greg
2010-09-01 17:34 ` J. Bruce Fields
2010-09-01 21:52 ` Tom Haynes
2010-09-02  7:32   ` Greg
2010-09-02 16:06     ` J. Bruce Fields
2010-09-07  6:59       ` Greg
2010-09-02  6:56 ` statfs() gives ESTALE error Menyhart Zoltan
2010-09-07 18:32   ` Trond Myklebust
2010-09-08 13:33 ` Re :statfs() " Menyhart Zoltan
2010-09-08 20:25   ` Trond Myklebust
2010-09-09  8:12 ` Menyhart Zoltan
2010-09-20 12:49 ` Locking question around "...PagePrivate()" Menyhart Zoltan
2010-09-20 13:55   ` Trond Myklebust
2010-10-05  8:22 ` "xprt" reference count drops to 0 Menyhart Zoltan
2010-10-21 20:38   ` J. Bruce Fields
2010-10-22 15:00     ` Menyhart Zoltan
2010-10-22 21:20       ` J. Bruce Fields
2010-10-22 23:01         ` J. Bruce Fields
2010-10-22 23:21           ` J. Bruce Fields
2010-10-23  3:32             ` J. Bruce Fields
2010-10-25  1:09               ` J. Bruce Fields
2010-10-25  1:21                 ` [PATCH 1/4] svcrpc: never clear XPT_BUSY on dead xprt J. Bruce Fields
2010-10-25  1:43                   ` Neil Brown
2010-10-25 20:21                     ` J. Bruce Fields
2010-10-25 22:58                       ` Neil Brown
2010-10-25 23:03                         ` J. Bruce Fields
2010-10-25 23:54                           ` Neil Brown
2010-10-26  0:11                             ` J. Bruce Fields
2010-10-26  0:28                               ` J. Bruce Fields
2010-10-26  0:30                                 ` J. Bruce Fields
2010-10-26  1:28                                   ` Neil Brown
2010-10-26 12:59                                     ` J. Bruce Fields
2010-10-26 16:05                                       ` J. Bruce Fields
2010-11-12 19:00                                         ` J. Bruce Fields
2010-10-25  1:21                 ` [PATCH 2/4] svcrpc: assume svc_delete_xprt() called only once J. Bruce Fields
2010-10-25  1:21                 ` [PATCH 3/4] svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue J. Bruce Fields
2010-10-25  1:21                 ` [PATCH 4/4] svcrpc: svc_tcp_sendto XTP_DEAD check is redundant J. Bruce Fields
2010-10-25  2:10                   ` Neil Brown
2010-10-25 15:03                     ` J. Bruce Fields
2010-10-25 17:46                       ` J. Bruce Fields
2010-10-25 23:08                         ` Neil Brown
2010-10-26  1:33                           ` J. Bruce Fields
2010-10-25 23:23                       ` Neil Brown
2010-10-26  1:25                         ` J. Bruce Fields
2010-10-25 11:56         ` Menyhart Zoltan [this message]
2010-10-25 14:36           ` "xprt" reference count drops to 0 J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CC5706F.3020308@bull.net \
    --to=zoltan.menyhart@bull.net \
    --cc=bfields@fieldses.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.