From: Glauber Costa <glommer@gmail.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92
Date: Wed, 19 Jun 2013 12:52:54 +0400 [thread overview]
Message-ID: <20130619085252.GD1990@localhost.localdomain> (raw)
In-Reply-To: <20130619073526.GB1990@localhost.localdomain>
On Wed, Jun 19, 2013 at 11:35:27AM +0400, Glauber Costa wrote:
> On Wed, Jun 19, 2013 at 09:13:46AM +0200, Michal Hocko wrote:
> > On Tue 18-06-13 10:26:24, Glauber Costa wrote:
> > [...]
> > > Michal, would you mind testing the following patch?
> > >
> > > diff --git a/fs/inode.c b/fs/inode.c
> > > index 00b804e..48eafa6 100644
> > > --- a/fs/inode.c
> > > +++ b/fs/inode.c
> > > @@ -419,6 +419,8 @@ void inode_add_lru(struct inode *inode)
> > >
> > > static void inode_lru_list_del(struct inode *inode)
> > > {
> > > + if (inode->i_state & I_FREEING)
> > > + return;
> > >
> > > if (list_lru_del(&inode->i_sb->s_inode_lru, &inode->i_lru))
> > > this_cpu_dec(nr_unused);
> > > @@ -609,8 +611,8 @@ void evict_inodes(struct super_block *sb)
> > > continue;
> > > }
> > >
> > > - inode->i_state |= I_FREEING;
> > > inode_lru_list_del(inode);
> > > + inode->i_state |= I_FREEING;
> > > spin_unlock(&inode->i_lock);
> > > list_add(&inode->i_lru, &dispose);
> > > }
> > > @@ -653,8 +655,8 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
> > > continue;
> > > }
> > >
> > > - inode->i_state |= I_FREEING;
> > > inode_lru_list_del(inode);
> > > + inode->i_state |= I_FREEING;
> > > spin_unlock(&inode->i_lock);
> > > list_add(&inode->i_lru, &dispose);
> > > }
> > > @@ -1381,9 +1383,8 @@ static void iput_final(struct inode *inode)
> > > inode->i_state &= ~I_WILL_FREE;
> > > }
> > >
> > > + inode_lru_list_del(inode);
> > > inode->i_state |= I_FREEING;
> > > - if (!list_empty(&inode->i_lru))
> > > - inode_lru_list_del(inode);
> > > spin_unlock(&inode->i_lock);
> > >
> > > evict(inode);
> >
> > No luck. I have this on top of inode_lru_isolate one but still can see
> > hangs:
> > 911 [<ffffffff8118325e>] __wait_on_freeing_inode+0x9e/0xc0
> > [<ffffffff81183321>] find_inode_fast+0xa1/0xc0
> > [<ffffffff8118529f>] iget_locked+0x4f/0x180
> > [<ffffffff811efa23>] ext4_iget+0x33/0x9f0
> > [<ffffffff811f6a5c>] ext4_lookup+0xbc/0x160
> > [<ffffffff81174ad0>] lookup_real+0x20/0x60
> > [<ffffffff81175254>] __lookup_hash+0x34/0x40
> > [<ffffffff81179872>] path_lookupat+0x7a2/0x830
> > [<ffffffff81179933>] filename_lookup+0x33/0xd0
> > [<ffffffff8117ab0b>] user_path_at_empty+0x7b/0xb0
> > [<ffffffff8117ab4c>] user_path_at+0xc/0x10
> > [<ffffffff8116ff91>] vfs_fstatat+0x51/0xb0
> > [<ffffffff81170116>] vfs_stat+0x16/0x20
> > [<ffffffff8117013f>] sys_newstat+0x1f/0x50
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 21409 [<ffffffff8118325e>] __wait_on_freeing_inode+0x9e/0xc0
> > [<ffffffff81183321>] find_inode_fast+0xa1/0xc0
> > [<ffffffff8118529f>] iget_locked+0x4f/0x180
> > [<ffffffff811efa23>] ext4_iget+0x33/0x9f0
> > [<ffffffff811f6a5c>] ext4_lookup+0xbc/0x160
> > [<ffffffff81174ad0>] lookup_real+0x20/0x60
> > [<ffffffff81177e25>] lookup_open+0x175/0x1d0
> > [<ffffffff8117815e>] do_last+0x2de/0x780
> > [<ffffffff8117ae9a>] path_openat+0xda/0x400
> > [<ffffffff8117b303>] do_filp_open+0x43/0xa0
> > [<ffffffff81168ee0>] do_sys_open+0x160/0x1e0
> > [<ffffffff81168f9c>] sys_open+0x1c/0x20
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 21745 [<ffffffff81179862>] path_lookupat+0x792/0x830
> > [<ffffffff81179933>] filename_lookup+0x33/0xd0
> > [<ffffffff8117ab0b>] user_path_at_empty+0x7b/0xb0
> > [<ffffffff8117ab4c>] user_path_at+0xc/0x10
> > [<ffffffff8116ff91>] vfs_fstatat+0x51/0xb0
> > [<ffffffff81170116>] vfs_stat+0x16/0x20
> > [<ffffffff8117013f>] sys_newstat+0x1f/0x50
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 22032 [<ffffffff81179862>] path_lookupat+0x792/0x830
> > [<ffffffff81179933>] filename_lookup+0x33/0xd0
> > [<ffffffff8117ab0b>] user_path_at_empty+0x7b/0xb0
> > [<ffffffff8117ab4c>] user_path_at+0xc/0x10
> > [<ffffffff8116ff91>] vfs_fstatat+0x51/0xb0
> > [<ffffffff81170116>] vfs_stat+0x16/0x20
> > [<ffffffff8117013f>] sys_newstat+0x1f/0x50
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 22621 [<ffffffff8118325e>] __wait_on_freeing_inode+0x9e/0xc0
> > [<ffffffff81183321>] find_inode_fast+0xa1/0xc0
> > [<ffffffff8118529f>] iget_locked+0x4f/0x180
> > [<ffffffff811efa23>] ext4_iget+0x33/0x9f0
> > [<ffffffff811f6a5c>] ext4_lookup+0xbc/0x160
> > [<ffffffff81174ad0>] lookup_real+0x20/0x60
> > [<ffffffff81177e25>] lookup_open+0x175/0x1d0
> > [<ffffffff8117815e>] do_last+0x2de/0x780
> > [<ffffffff8117ae9a>] path_openat+0xda/0x400
> > [<ffffffff8117b303>] do_filp_open+0x43/0xa0
> > [<ffffffff81168ee0>] do_sys_open+0x160/0x1e0
> > [<ffffffff81168f9c>] sys_open+0x1c/0x20
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 22711 [<ffffffff81178144>] do_last+0x2c4/0x780
> > [<ffffffff8117ae9a>] path_openat+0xda/0x400
> > [<ffffffff8117b303>] do_filp_open+0x43/0xa0
> > [<ffffffff81168ee0>] do_sys_open+0x160/0x1e0
> > [<ffffffff81168f9c>] sys_open+0x1c/0x20
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 22946 [<ffffffff81178144>] do_last+0x2c4/0x780
> > [<ffffffff8117ae9a>] path_openat+0xda/0x400
> > [<ffffffff8117b303>] do_filp_open+0x43/0xa0
> > [<ffffffff81168ee0>] do_sys_open+0x160/0x1e0
> > [<ffffffff81168f9c>] sys_open+0x1c/0x20
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 23393 [<ffffffff81178144>] do_last+0x2c4/0x780
> > [<ffffffff8117ae9a>] path_openat+0xda/0x400
> > [<ffffffff8117b303>] do_filp_open+0x43/0xa0
> > [<ffffffff81168ee0>] do_sys_open+0x160/0x1e0
> > [<ffffffff81168f9c>] sys_open+0x1c/0x20
> > [<ffffffff81583129>] system_call_fastpath+0x16/0x1b
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > --
> Sorry if you said that before Michal.
>
> But given the backtrace, are you sure this is LRU-related? You mentioned you bisected
> it but found nothing conclusive. I will keep looking but maybe this could benefit from
> a broader fs look
>
> In any case, the patch we suggested is obviously correct and we should apply nevertheless.
> I will write it down and send it to Andrew.
My analysis of the LRU side code so far is:
* Assuming we are not hanging because of held locks, the fact that we
are hung at __wait_on_freeing_inode() means that someone who should
be waking us up is not. This would indicate that an inode is marked
as to-be-freed, but later on not freed.
* We will wait for free inodes if the state is I_FREEING or I_WILL_FREE.
* I_WILL_FREE is only set for a very short time during iput_final, and
the code path leads unconditionally to evict(), which wakes up any
waiters.
* clear_inode sets I_FREEING but it is only called from within evict,
which means we will wake up the waiters shortly.
* The LRU will not necessarily put the element into those states, but when
it does, it moves them to the dispose list. We will call evict() for all
ements in the dispose list, and that will unconditionally call wake_up_bit.
So it seems that if the LRU sets I_FREEING (we never set I_WILL_FREE)
* The same is true for evict_inodes and invalidate_inodes. They test
for the freeing bits and will skip the inodes marked as such. This seems
okay, since this means someone else marked them as freeing and it should
be their responsibility to wake up the callers.
So this shows that strangely enough, the code seems very safe and fine.
Still you are seeing hangs... Any chance we are hanging on the acquisition
of inode_hash_lock?
I need to be away for some hours, but I will be back to it soon. Meanwhile,
if Dave could take a look into it, that would be trully great.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-06-19 8:53 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-17 14:18 linux-next: slab shrinkers: BUG at mm/list_lru.c:92 Michal Hocko
2013-06-17 15:14 ` Glauber Costa
2013-06-17 15:33 ` Michal Hocko
2013-06-17 16:54 ` Glauber Costa
2013-06-18 7:42 ` Michal Hocko
2013-06-17 21:35 ` Andrew Morton
2013-06-17 22:30 ` Glauber Costa
2013-06-18 2:46 ` Dave Chinner
2013-06-18 6:31 ` Glauber Costa
2013-06-18 8:24 ` Michal Hocko
2013-06-18 10:44 ` Michal Hocko
2013-06-18 13:50 ` Michal Hocko
2013-06-25 2:27 ` Dave Chinner
2013-06-26 8:15 ` Michal Hocko
2013-06-26 23:24 ` Dave Chinner
2013-06-27 14:54 ` Michal Hocko
2013-06-28 8:39 ` Michal Hocko
2013-06-28 14:31 ` Glauber Costa
2013-06-28 15:12 ` Michal Hocko
2013-06-29 2:55 ` Dave Chinner
2013-06-30 18:33 ` Michal Hocko
2013-07-01 1:25 ` Dave Chinner
2013-07-01 7:50 ` Michal Hocko
2013-07-01 8:10 ` Dave Chinner
2013-07-02 9:22 ` Michal Hocko
2013-07-02 12:19 ` Dave Chinner
2013-07-02 12:44 ` Michal Hocko
2013-07-03 11:24 ` Dave Chinner
2013-07-03 14:08 ` Glauber Costa
2013-07-04 16:36 ` Michal Hocko
2013-07-08 12:53 ` Michal Hocko
2013-07-08 21:04 ` Andrew Morton
2013-07-09 17:34 ` Glauber Costa
2013-07-09 17:51 ` Andrew Morton
2013-07-09 17:32 ` Glauber Costa
2013-07-09 17:50 ` Andrew Morton
2013-07-09 17:57 ` Glauber Costa
2013-07-09 17:57 ` Michal Hocko
2013-07-09 21:39 ` Andrew Morton
2013-07-10 2:31 ` Dave Chinner
2013-07-10 7:34 ` Michal Hocko
2013-07-10 8:06 ` Michal Hocko
2013-07-11 2:26 ` Dave Chinner
2013-07-11 3:03 ` Andrew Morton
2013-07-11 13:23 ` Michal Hocko
2013-07-12 1:42 ` Hugh Dickins
2013-07-13 3:29 ` Dave Chinner
2013-07-15 9:14 ` Michal Hocko
2013-06-18 6:26 ` Glauber Costa
2013-06-18 8:25 ` Michal Hocko
2013-06-19 7:13 ` Michal Hocko
2013-06-19 7:35 ` Glauber Costa
2013-06-19 8:52 ` Glauber Costa [this message]
2013-06-19 13:57 ` Michal Hocko
2013-06-19 14:02 ` Glauber Costa
2013-06-19 14:28 ` Michal Hocko
2013-06-20 14:11 ` Glauber Costa
2013-06-20 15:12 ` Michal Hocko
2013-06-20 15:16 ` Michal Hocko
2013-06-21 9:00 ` Michal Hocko
2013-06-23 11:51 ` Glauber Costa
2013-06-23 11:55 ` Glauber Costa
2013-06-25 2:29 ` Dave Chinner
2013-06-26 8:22 ` Michal Hocko
2013-06-18 8:19 ` Michal Hocko
2013-06-18 8:21 ` Glauber Costa
2013-06-18 8:26 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130619085252.GD1990@localhost.localdomain \
--to=glommer@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=david@fromorbit.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).