From: Al Viro <viro@ZenIV.linux.org.uk>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Jia-Ju Bai <baijiaju1990@163.com>,
	torbjorn.lindh@gopta.se, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [BUG] fs/super: a possible sleep-in-atomic bug in put_super
Date: Sat, 7 Oct 2017 22:14:44 +0100
Message-ID: <20171007211444.GS21978@ZenIV.linux.org.uk>
In-Reply-To: <20171007170651.GR21978@ZenIV.linux.org.uk>

On Sat, Oct 07, 2017 at 06:06:51PM +0100, Al Viro wrote:
> On Sat, Oct 07, 2017 at 02:56:40PM +0300, Vladimir Davydov wrote:
> > Hello,
> > 
> > On Fri, Oct 06, 2017 at 11:06:04AM +0200, Michal Hocko wrote:
> > > On Fri 06-10-17 16:59:18, Jia-Ju Bai wrote:
> > > > According to fs/super.c, the kernel may sleep under a spinlock.
> > > > The function call path is:
> > > > put_super (acquire the spinlock)
> > > >   __put_super
> > > >     destroy_super
> > > >       list_lru_destroy
> > > >         list_lru_unregister
> > > >           mutex_lock --> may sleep
> > > >         memcg_get_cache_ids
> > > >           down_read --> may sleep
> > > > 
> > > > This bug is found by my static analysis tool and my code review.
> > 
> > This is false-positive: by the time we get to destroy_super(), the lru
> > lists have already been destroyed - see deactivate_locked_super() - so
> > list_lru_destroy() will return right away without attempting to take any
> > locks. That's why there are no lockdep warnings regarding this issue.
> > 
> > I think we can move list_lru_destroy() to destroy_super_work() to
> > suppress this warning. Not sure if it's really worth the trouble though.
> 
> It's a bit trickier than that (for callers of destroy_super() that run
> before the superblock becomes reachable via shared data structures,
> list_lru_destroy() is not a no-op, but they are not called under spinlocks).
> 
> Locking in mm/list_lru.c looks excessive, but then I'm not well familiar with
> that code.

It *is* excessive.

	1) coallocate struct list_lru and array of struct list_lru_node
hanging off it.  Turn all existing variables and struct members of that
type into pointers.  init would allocate and return a pointer, destroy
would free (and leave it for callers to clear their pointers, of course).

	2) store the value of memcg_nr_cache_ids as of the last creation
or resize in list_lru.  Pass that through to memcg_destroy_list_lru_node().
Result: no need for memcg_get_cache_ids() in list_lru_destroy().

	3) add static list_lru *currently_resized, protected by list_lrus_mutex.
NULL when memcg_update_all_list_lrus() is not running, points to the
list_lru currently being resized when it is.

	4) have list_lru_destroy() check (under list_lrus_mutex) whether it's
being asked to kill the currently resized one.  If it is, do
	victim->list.prev->next = victim->list.next;
	victim->list.next->prev = victim->list.prev;
	victim->list.prev = NULL;
and bugger off, otherwise act as now.  Turn the loop in
memcg_update_all_list_lrus() into
	mutex_lock(&list_lrus_mutex);
	lru = list_lrus.next;
	while (lru != &list_lrus) {
		currently_resized = list_entry(lru, struct list_lru, list);
		mutex_unlock(&list_lrus_mutex);
		ret = memcg_update_list_lru(currently_resized, old_size, new_size);
		mutex_lock(&list_lrus_mutex);
		if (unlikely(!lru->prev)) {
			/* destroyed while we were resizing it */
			lru = lru->next;
			/* free currently_resized as list_lru_destroy() would have */
			continue;
		}
		if (ret)
			goto fail;
		lru = lru->next;
	}
	currently_resized = NULL;
	mutex_unlock(&list_lrus_mutex);
	
	5) replace list_lrus_mutex with a spinlock.

At that point list_lru_destroy() becomes non-blocking.  I think it should work,
but that's from several hours of looking through that code, so I might be
missing something subtle here...
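
To make the handoff in steps 3 and 4 concrete, here is a rough userspace
sketch, not kernel code and not tested against mm/list_lru.c.  Everything
in it is a hypothetical stand-in: struct lru, lru_create(), lru_destroy(),
lru_resize_all() and the hook that plays the part of another thread.  A
pthread mutex stands in for the lock.  One deliberate deviation from step 4
as written above: the victim is left on the list and only marked dead, and
the resizer unlinks and frees it, so the ->next pointer it follows is always
read from a still-linked entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct list_lru; not the kernel definition. */
struct lru {
	struct list_head list;
	int nr_ids;	/* step 2: size as of the last creation or resize */
	int dead;	/* set by lru_destroy() when the resizer must free it */
};

#define lru_of(p) ((struct lru *)((char *)(p) - offsetof(struct lru, list)))

static struct list_head lrus = { &lrus, &lrus };
static pthread_mutex_t lrus_lock = PTHREAD_MUTEX_INITIALIZER;
static struct lru *currently_resized;	/* step 3; protected by lrus_lock */
static int freed_by_resizer;		/* demo bookkeeping only */

static void unlink_locked(struct lru *l)
{
	l->list.prev->next = l->list.next;
	l->list.next->prev = l->list.prev;
}

static struct lru *lru_create(int nr_ids)
{
	struct lru *l = calloc(1, sizeof(*l));

	if (!l)
		return NULL;
	l->nr_ids = nr_ids;
	pthread_mutex_lock(&lrus_lock);
	l->list.next = &lrus;		/* append at the tail */
	l->list.prev = lrus.prev;
	lrus.prev->next = &l->list;
	lrus.prev = &l->list;
	pthread_mutex_unlock(&lrus_lock);
	return l;
}

/* Never waits for a resize in progress. */
static void lru_destroy(struct lru *victim)
{
	pthread_mutex_lock(&lrus_lock);
	if (victim == currently_resized) {
		victim->dead = 1;	/* the resizer will unlink and free */
		pthread_mutex_unlock(&lrus_lock);
		return;
	}
	unlink_locked(victim);
	pthread_mutex_unlock(&lrus_lock);
	free(victim);
}

/* hook() runs with the lock dropped; it plays the role of another thread. */
static int lru_resize_all(int new_size, void (*hook)(void *), void *arg)
{
	struct list_head *pos;
	int resized = 0;

	pthread_mutex_lock(&lrus_lock);
	pos = lrus.next;
	while (pos != &lrus) {
		struct lru *cur = lru_of(pos);

		currently_resized = cur;
		pthread_mutex_unlock(&lrus_lock);
		cur->nr_ids = new_size;	/* stands in for the blocking resize */
		if (hook)
			hook(arg);
		pthread_mutex_lock(&lrus_lock);
		pos = pos->next;	/* cur is still linked, so this is valid */
		if (cur->dead) {
			unlink_locked(cur);
			free(cur);
			freed_by_resizer++;
		} else {
			resized++;
		}
	}
	currently_resized = NULL;
	pthread_mutex_unlock(&lrus_lock);
	return resized;
}

static void destroy_once(void *arg)
{
	struct lru **victim = arg;

	if (*victim) {
		lru_destroy(*victim);
		*victim = NULL;
	}
}

/* Destroys the first lru while it is being resized; returns survivors resized. */
static int demo_destroy_during_resize(void)
{
	struct lru *a = lru_create(4), *b = lru_create(4);
	struct lru *victim = a;
	int resized;

	if (!a || !b)
		return -1;
	resized = lru_resize_all(8, destroy_once, &victim);
	assert(b->nr_ids == 8);
	lru_destroy(b);			/* no resize running: freed directly */
	return resized;
}
```

In the demo the destroy of the first entry returns without blocking and the
resizer frees it after retaking the lock; the second entry is resized and
later destroyed on the ordinary path.  Whether the real resize can tolerate
the lock being dropped this way is exactly the part that needs checking
against the actual code.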
