public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Banks <gnb@melbourne.sgi.com>
To: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Nikita Danilov <Nikita@Namesys.COM>,
	linux kernel mailing list <linux-kernel@vger.kernel.org>,
	alexander viro <viro@parcelfarce.linux.theplanet.co.uk>,
	trond myklebust <trondmy@trondhjem.org>
Subject: Re: d_splice_alias() problem.
Date: Fri, 30 Apr 2004 17:50:57 +1000	[thread overview]
Message-ID: <40920561.2565C9C8@melbourne.sgi.com> (raw)
In-Reply-To: 16529.56343.764629.37296@cse.unsw.edu.au

Neil Brown wrote:
> 
> If I understand you correctly, the main problem is that a disconnected
> dentry can hold an inode active after the last link has been removed.
> The file will not then be truncated and removed until memory pressure
> flushes the disconnected dentry from the dcache.
> 
> This problem can be resolved by making sure that an inode never has
> both a connected and a disconnected dentry.
> 
> This is already the case for directories (as they must only have one
> dentry), but it is not the case for non-directories.
> 
> The following patch tries to address this.  It is a "technology
> preview" in that the only testing I have done is that it compiles OK.
> 
> Please consider reviewing it to see if it makes sense.
> 
> It:
>  - changes d_alloc_anon to make sure that a new disconnected dentry is
>    only allocated if there is currently no (hashed) dentry for the
>    inode. (Previously this would noramlly be true, but a race was
>    possible).
>  - changes d_splice_alias to re-use a disconnected dentry on
>    non-directories as well as directories.
>  - splits most of d_find_alias out into a separate function to make
>    the above easier.
> 
> I haven't fully thought through issues with unhashed, connected
> dentries.
> __d_find_alias won't return them so d_alloc_anon will never return
> one, so it is possible to have an unhashed dentry and a disconnected
> dentry at the same time, which probably isn't desirable.
> 
> Is it OK for d_alloc_anon to return an unaliased dentry, and then have
> it possibly spliced back into the dentry tree??? I'm not sure.
> 
> Comments welcome.


I've been wrestling with a problem in this area for the last couple of 
days.  The symptoms are different but the ultimate cause appears to (also)
be a race condition somewhere under fh_verify() resulting in confused
dentry structures of some kind.  Eventually this results in __dget_locked()
being called on a dentry which has a zero reference count but is hashed
and not on the unused list, which spuriously decrements dentry_stat.nr_unused.
When this happens enough times dentry_stat.nr_unused drops below zero
and kswapd starts spinning trying to prune a near-infinite number of
dentries.

I just tried your patch and the problem remains.

What I'm getting from my debug setup is:

<4>kernel BUG at fs/dcache.c:289!
<4>nfsd[4981]: bugcheck! 0 [1]


   285	static inline struct dentry * __dget_locked(struct dentry *dentry)
   286	{
   287		atomic_inc(&dentry->d_count);
   288		if (atomic_read(&dentry->d_count) == 1) {
   289		    	BUG_ON(list_empty(&dentry->d_lru));   <-------------
   290			dentry_stat.nr_unused--;
   291			list_del_init(&dentry->d_lru);
   292			BUG_ON(dentry_stat.nr_unused < 0);
   293		}
   294		return dentry;
   295	}

[0]kdb> bt
Stack traceback for pid 4981
0xe00000300b1f0000     4981        1  1    0   R  0xe00000300b1f04f0 *nfsd
0xa0000001001a2250 __d_find_alias+0x250   <----- note: with your patch applied
        args (0xa000000100936acc, 0xe000003009059ba8, 0xa0000001001a23f0,
0x206)        kernel 0xa0000001001a2000 0xa0000001001a2380
0xa0000001001a23f0 d_find_alias+0x70
        args (0xe00000b07aff67b0, 0xa0000001008ec900, 0xa0000001001a4120,
0x287)        kernel 0xa0000001001a2380 0xa0000001001a2420
0xa0000001001a4120 d_alloc_anon+0x20
        args (0xe00000b07aff67b0, 0xe00000300b1f7ad0, 0xe00000300b1f7ac0,
0xa0000001003c4ee0, 0x207)
        kernel 0xa0000001001a4100 0xa0000001001a4440
0xa0000001003c4ee0 linvfs_get_dentry+0x120
        args (0xe00000b07aff67b0, 0xe00000307b299f18, 0xa000000100292860,
0xd1d)        kernel 0xa0000001003c4dc0 0xa0000001003c4f20
0xa000000100292860 find_exported_dentry+0xa0
        args (0xe000003008a9fa00, 0xe00000307b299f18, 0xe00000300b1f7ce0,
0xa000000100861340, 0xe00000b07aa6f180)
        kernel 0xa0000001002927c0 0xa000000100293920
0xa000000100294090 export_decode_fh+0xb0
        args (0xe000003008a9fa00, 0xe00000307b299f24, 0x4, 0x2, 0xa000000100861340)
        kernel 0xa000000100293fe0 0xa000000100294100
0xa0000001002993f0 fh_verify+0x910
        args (0xe000003016c0d000, 0xe00000307b299f08, 0x11270000, 0x44,
0xe00000307b299f14)
        kernel 0xa000000100298ae0 0xa000000100299960
0xa00000010029cb00 nfsd_open+0x40
        args (0xe000003016c0d000, 0xe00000307b299f08, 0x8000, 0x4,
0xe00000300b1f7d00)
        kernel 0xa00000010029cac0 0xa00000010029cea0
0xa00000010029d440 nfsd_read+0x40
        args (0xe000003016c0d000, 0xe00000307b299f08, 0xe00000300b1f7d10,
0xe00000307b299c38, 0x2)
        kernel 0xa00000010029d400 0xa00000010029dda0
0xa0000001002b02f0 nfsd3_proc_read+0x190
        args (0xe00000307b299f90, 0xe00000307b299b00, 0xe00000307b299f00,
0xe00000307b29a030, 0xe00000307b299f08)
        kernel 0xa0000001002b0160 0xa0000001002b0400
0xa000000100295120 nfsd_dispatch+0x280
        args (0xe000003016c0d000, 0xe00000306b888014, 0xa000000100ce7520,
0xe000003016c0d490, 0xa00000010093e0d0)
        kernel 0xa000000100294ea0 0xa000000100295320
0xa000000100721810 svc_process+0xff0
        args (0xe00000b07aa6e928, 0xe000003016c0d000, 0xe000003016c0d240,
0xe000003016c0d068, 0xa000000100ce7520)
        kernel 0xa000000100720820 0xa000000100721b60
0xa000000100294a60 nfsd+0x480
        args (0xe000003016c0d000, 0xfffeba2f, 0xfffeba2f, 0xe00000b07aa6e900,
0xa000000100b008a0)
        kernel 0xa0000001002945e0 0xa000000100294ea0
0xa00000010001ae60 kernel_thread_helper+0xe0
[...]

arg0 of the new d_find_alias() is the inode

[0]kdb> inode 0xe00000b07aff67b0 
struct inode at  0xe00000b07aff67b0
 i_ino = 137 i_count = 3 i_size 8589934592
 i_mode = 0100777  i_nlink = 1  i_rdev = 0x0
 i_hash.nxt = 0x0000000000000000 i_hash.pprev = 0xe00000307bcb6408
 i_list.nxt = 0xe000003008a9fac8 i_list.prv = 0xe000003008a9fac8
 i_dentry.nxt = 0xe000003009059b80 i_dentry.prv = 0xe000003009059b80
 i_sb = 0xe000003008a9fa00 i_op = 0xa0000001009422b0 i_data = 0xe00000b07aff68a8
nrpages = 73612
 i_fop= 0xa000000100941fe8 i_flock = 0x0000000000000000 i_mapping =
0xe00000b07aff68a8
 i_flags 0x0 i_state 0x1 [I_DIRTY_SYNC]  fs specific info @ 0xe00000b07aff6a10

Walk the dentry chain...only one entry

[0]kdb> dentry 0xe000003009059b80
Dentry at 0xe000003009059b80
 d_name.len = 16 d_name.name = 0xe000003016ca7010 <read_load_test.0>   <----- not
anonymous
 d_count = 1 d_flags = 0x4 d_inode = 0xe00000b07aff67b0
           ^           ^^^ DCACHE_DISCONNECTED
       count=0 before line 287
 d_parent = 0xe00000b0065d5480
 d_hash.nxt = 0x0000000000000000 d_hash.prv = 0xe00000307bbccbf0    <---- hashed
 d_lru.nxt = 0xe000003009059ba0 d_lru.prv = 0xe000003009059ba0      <---- not on
unused list
 d_child.nxt = 0xe00000b0065d54c0 d_child.prv = 0xe00000b0065d54c0
 d_subdirs.nxt = 0xe000003009059bc0 d_subdirs.prv = 0xe000003009059bc0
 d_alias.nxt = 0xe00000b07aff67d0 d_alias.prv = 0xe00000b07aff67d0
 d_op = 0x0000000000000000 d_sb = 0xe000003008a9fa00


>  /*
>   *     Try to kill dentries associated with this inode.
>   * WARNING: you must own a reference to inode.
> @@ -835,28 +841,22 @@ struct dentry * d_alloc_anon(struct inod
>         tmp->d_parent = tmp; /* make sure dput doesn't croak */
> 
>         spin_lock(&dcache_lock);
> -       if (S_ISDIR(inode->i_mode) && !list_empty(&inode->i_dentry)) {
> -               /* A directory can only have one dentry.
> -                * This (now) has one, so use it.
> -                */
> -               res = list_entry(inode->i_dentry.next, struct dentry, d_alias);
> -               __dget_locked(res);
> -       } else {
> -               /* attach a disconnected dentry */
> +
> +       res = __d_find_alias(inode, 0);
> +       if (!res) {
>                 res = tmp;

Yes this is an obvious (well, now) race condition. But not apparently the
whole story.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

  reply	other threads:[~2004-04-30  7:54 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-23 13:02 d_splice_alias() problem Nikita Danilov
2004-04-23 15:40 ` Andreas Dilger
2004-04-23 16:20   ` Nikita Danilov
2004-04-23 23:49 ` Andrew Morton
2004-04-26 12:45   ` Nikita Danilov
2004-04-30  4:54 ` Neil Brown
2004-04-30  7:50   ` Greg Banks [this message]
2004-04-30 13:28   ` Nikita Danilov
2004-05-03 23:46     ` Neil Brown
2004-05-03 12:02   ` Greg Banks
2004-05-03 23:28     ` Neil Brown
2004-05-04  0:05       ` Greg Banks
2004-05-04  7:00       ` Greg Banks
2004-05-04  9:46         ` viro
2004-05-04 10:21           ` Greg Banks
2004-05-05  0:11           ` Neil Brown
2004-05-10  3:03         ` Neil Brown
2004-05-10  4:50           ` Greg Banks
2004-05-10  3:27       ` Neil Brown
2004-05-10 11:28         ` Greg Banks
2004-05-13  5:58           ` Neil Brown
2004-05-13  7:15             ` Greg Banks

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40920561.2565C9C8@melbourne.sgi.com \
    --to=gnb@melbourne.sgi.com \
    --cc=Nikita@Namesys.COM \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@cse.unsw.edu.au \
    --cc=trondmy@trondhjem.org \
    --cc=viro@parcelfarce.linux.theplanet.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox