linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Jia-Ju Bai <baijiaju1990@163.com>,
	torbjorn.lindh@gopta.se, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [BUG] fs/super: a possible sleep-in-atomic bug in put_super
Date: Sat, 7 Oct 2017 22:14:44 +0100
Message-ID: <20171007211444.GS21978@ZenIV.linux.org.uk>
In-Reply-To: <20171007170651.GR21978@ZenIV.linux.org.uk>

On Sat, Oct 07, 2017 at 06:06:51PM +0100, Al Viro wrote:
> On Sat, Oct 07, 2017 at 02:56:40PM +0300, Vladimir Davydov wrote:
> > Hello,
> > 
> > On Fri, Oct 06, 2017 at 11:06:04AM +0200, Michal Hocko wrote:
> > > On Fri 06-10-17 16:59:18, Jia-Ju Bai wrote:
> > > > According to fs/super.c, the kernel may sleep under a spinlock.
> > > > The function call path is:
> > > > put_super (acquire the spinlock)
> > > >   __put_super
> > > >     destroy_super
> > > >       list_lru_destroy
> > > >         list_lru_unregister
> > > >           mutex_lock --> may sleep
> > > >         memcg_get_cache_ids
> > > >           down_read --> may sleep
> > > > 
> > > > This bug is found by my static analysis tool and my code review.
> > 
> > This is a false positive: by the time we get to destroy_super(), the lru
> > lists have already been destroyed - see deactivate_locked_super() - so
> > list_lru_destroy() will return right away without attempting to take any
> > locks. That's why there are no lockdep warnings regarding this issue.
> > 
> > I think we can move list_lru_destroy() to destroy_super_work() to
> > suppress this warning. Not sure if it's really worth the trouble though.
> 
> It's a bit trickier than that (for callers of destroy_super() prior to the
> superblock becoming reachable via shared data structures, list_lru_destroy()
> is not a no-op, but they are not called under spinlocks).
> 
> Locking in mm/list_lru.c looks excessive, but then I'm not very familiar with
> that code.

It *is* excessive.

	1) coallocate struct list_lru and array of struct list_lru_node
hanging off it.  Turn all existing variables and struct members of that
type into pointers.  init would allocate and return a pointer, destroy
would free (and leave it for callers to clear their pointers, of course).

	2) store the value of memcg_nr_cache_ids as of the last creation
or resize in list_lru.  Pass that through to memcg_destroy_list_lru_node().
Result: no need for memcg_get_cache_ids() in list_lru_destroy().

	3) add static struct list_lru *currently_resized, protected by
list_lrus_mutex.  NULL when memcg_update_all_list_lrus() is not running, points
to the list_lru currently being resized when it is.

	4) have list_lru_destroy() check (under list_lrus_mutex) whether it's
being asked to kill the currently resized one.  If it is, do
	victim->list.prev->next = victim->list.next;
	victim->list.next->prev = victim->list.prev;
	victim->list.prev = NULL;
and bugger off, otherwise act as now.  Turn the loop in
memcg_update_all_list_lrus() into
	mutex_lock(&list_lrus_mutex);
	lru = list_lrus.next;
	while (lru != &list_lrus) {
		currently_resized = list_entry(lru, struct list_lru, list);
		mutex_unlock(&list_lrus_mutex);
		ret = memcg_update_list_lru(currently_resized, old_size, new_size);
		mutex_lock(&list_lrus_mutex);
		if (unlikely(!lru->prev)) {
			lru = lru->next;
			/* free currently_resized as list_lru_destroy() would have */
			continue;
		}
		if (ret)
			goto fail;
		lru = lru->next;
	}
	currently_resized = NULL;
	mutex_unlock(&list_lrus_mutex);
	
	5) replace list_lrus_mutex with a spinlock.

At that point list_lru_destroy() becomes non-blocking.  I think it should work,
but that's from several hours of looking through that code, so I might be
missing something subtle here...
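
For illustration, steps 3 and 4 can be modelled in userspace C.  This is a
minimal single-threaded sketch under stated assumptions, not kernel code:
struct lru, lru_create(), lru_destroy(), update_all(), resize_and_race() and
run_demo() are hypothetical names, the locking is elided (comments mark where
list_lrus_mutex would be taken and dropped), and the concurrent destroy is
simulated by calling lru_destroy() from inside the resize callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical, simplified model of struct list_lru. */
struct lru {
	struct list_head list;	/* link in all_lrus */
	int nr_ids;		/* size as of the last creation or resize (step 2) */
};

static struct list_head all_lrus = { &all_lrus, &all_lrus };
static struct lru *currently_resized;	/* step 3 */
static struct lru *doomed;		/* lru the demo destroys mid-resize */
static int nr_freed;			/* demo bookkeeping only */

static struct lru *lru_create(int nr_ids)
{
	struct lru *l = malloc(sizeof(*l));

	if (!l)
		return NULL;
	l->nr_ids = nr_ids;
	/* Append to the tail of all_lrus. */
	l->list.next = &all_lrus;
	l->list.prev = all_lrus.prev;
	all_lrus.prev->next = &l->list;
	all_lrus.prev = &l->list;
	return l;
}

static void lru_free(struct lru *l)
{
	free(l);
	nr_freed++;
}

/* Step 4; the caller is assumed to hold the (here imaginary) list lock. */
static void lru_destroy(struct lru *victim)
{
	assert(victim->list.prev);	/* must not already be unlinked */
	victim->list.prev->next = victim->list.next;
	victim->list.next->prev = victim->list.prev;
	if (victim == currently_resized) {
		victim->list.prev = NULL;	/* leave the free to the resizer */
		return;
	}
	lru_free(victim);
}

/* resize() models the memcg_update_list_lru() call made with the lock dropped. */
static int update_all(int new_size, int (*resize)(struct lru *, int))
{
	struct list_head *pos = all_lrus.next;	/* lock would be taken here */
	int ret = 0;

	while (pos != &all_lrus) {
		currently_resized = (struct lru *)((char *)pos -
						   offsetof(struct lru, list));
		/* lock would be dropped here */
		ret = resize(currently_resized, new_size);
		/* lock would be retaken here */
		if (!pos->prev) {		/* destroyed during the window */
			pos = pos->next;	/* ->next was left intact by destroy */
			lru_free(currently_resized);
			ret = 0;
			continue;
		}
		if (ret)
			break;
		pos = pos->next;
	}
	currently_resized = NULL;
	return ret;
}

/* Resizes l and, if it is the doomed one, simulates a racing destroy. */
static int resize_and_race(struct lru *l, int new_size)
{
	l->nr_ids = new_size;
	if (l == doomed)
		lru_destroy(l);
	return 0;
}

/* Returns the number of lrus still linked after the walk, or -1 on OOM. */
int run_demo(void)
{
	struct lru *a = lru_create(4), *b = lru_create(4), *c = lru_create(4);
	struct list_head *pos;
	int linked = 0;

	if (!a || !b || !c)
		return -1;
	doomed = b;
	update_all(8, resize_and_race);
	for (pos = all_lrus.next; pos != &all_lrus; pos = pos->next)
		linked++;
	assert(a->nr_ids == 8 && c->nr_ids == 8);
	return linked;
}
```

Calling run_demo() from a main() walks three lrus and destroys the middle one
while it is "being resized"; the walker notices the NULL ->prev, steps to the
next entry through the still-valid ->next, and does the free itself.  The
sketch does not model a neighbour of the victim being destroyed during the
same window.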
