From: Minchan Kim <minchan@kernel.org>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>,
	Seth Jennings <sjenning@linux.vnet.ibm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nitin Gupta <ngupta@vflare.org>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@us.ibm.com>, Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <jweiner@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Larry Woodman <lwoodman@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	devel@driverdev.osuosl.org
Subject: Re: [PATCHv2 1/9] staging: zsmalloc: add gfp flags to zs_create_pool
Date: Thu, 31 Jan 2013 14:21:21 +0900
Message-ID: <20130131052121.GC23548@blaptop>
In-Reply-To: <20130130161146.GB1722@konrad-lan.dumpdata.com>

On Wed, Jan 30, 2013 at 11:11:47AM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 28, 2013 at 11:59:17AM +0900, Minchan Kim wrote:
> > On Fri, Jan 25, 2013 at 07:56:29AM -0800, Dan Magenheimer wrote:
> > > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
> > > > Subject: Re: [PATCHv2 1/9] staging: zsmalloc: add gfp flags to zs_create_pool
> > > > 
> > > > On 01/24/2013 07:33 PM, Minchan Kim wrote:
> > > > > Hi Seth, frontswap guys
> > > > >
> > > > > On Tue, Jan 8, 2013 at 5:24 AM, Seth Jennings
> > > > > <sjenning@linux.vnet.ibm.com> wrote:
> > > > >> zs_create_pool() currently takes a gfp flags argument
> > > > >> that is used when growing the memory pool.  However
> > > > >> it is not used in allocating the metadata for the pool
> > > > >> itself.  That is currently hardcoded to GFP_KERNEL.
> > > > >>
> > > > >> zswap calls zs_create_pool() at swapon time which is done
> > > > >> in atomic context, resulting in a "might sleep" warning.
> > > > >
> > > > > > I didn't review this whole series, really sorry, but today I saw Nitin
> > > > > > added Acked-by, so I'm afraid Greg might pick it up under my radar. I'm not
> > > > > > strongly against it, but I would like to know why we should call
> > > > > > frontswap_init under swap_lock. Is there a special reason?
> > > > 
> > > > The call stack is:
> > > > 
> > > > SYSCALL_DEFINE2(swapon.. <-- swapon_mutex taken here
> > > > enable_swap_info() <-- swap_lock taken here
> > > > frontswap_init()
> > > > __frontswap_init()
> > > > zswap_frontswap_init()
> > > > zs_create_pool()
> > > > 
> > > > It isn't entirely clear to me why frontswap_init() is called under
> > > > lock.  Then again, I'm not entirely sure what the swap_lock protects.
> > > >  There are no comments near the swap_lock definition to tell me.
> > > > 
> > > > I would guess that the intent is to block any writes to the swap
> > > > device until frontswap_init() has completed.
> > > > 
> > > > Dan care to weigh in?
> > > 
> > > I think frontswap's first appearance needs to be atomic, i.e.
> > > the transition from (a) frontswap is not present and will fail
> > > all calls, to (b) frontswap is fully functional... that transition
> > > must be atomic.  And, once Konrad's module patches are in, the
> 
> To be fair it can be "delayed". Say the swap disk is in heavy usage and
> the backend is registered. The time between the backend going online and
> the frontswap_store functions calling in the backend can be delayed (so
> we can use a racy unsigned long to check when the backend is on).
> 
> Obviously the opposite is not acceptable (so unsigned long says
> backend is enabled, but in reality the backend has not yet been
> initialized).
> 
> > > opposite transition must be atomic also.  But there are most
> > > likely other ways to do those transitions atomically that
> > > don't need to hold swap_lock.
> 
> Right. The opposite transition would be when a backend is unloaded.
> Which is something we don't do yet. For that to work we would need
> to make the "gatekeeper" (this unsigned long I've been referring to)
> be atomic. Or at least in some fashion - either via spinlocks or perhaps
> using static_key to patch the branching of the code. Naturally, to
> unload a module, extra steps such as flushing all the pages the backend
> holds to disk are required.
> > 
> > It could race once swap_info is registered.
> > But what is the problem if we call frontswap_init outside the lock,
> > before calling _enable_swap_info?
> 
> So, we have two locks - the mutex and the spin_lock. I think we are
> fine without the spinlock (swap_lock). 
> 
> > The swap subsystem never does I/O before it registers the new swap_info_struct.
> > 
> > And IMHO, if frontswap is to be atomic, it would be better for it to have
> > its own scheme that does not depend on swap_lock, if possible.
> 
> I think that can be independent of that lock. We are still under
> the mutex (swapon_mutex) which protects us against two threads doing
> swapon/swapoff and messing things up.
> > > 
> > > Honestly, I never really focused on the initialization code
> > > so I am very open to improvements as long as they work for
> > > all the various frontswap backends.
> > 
> > How about this?
> > 
> > From 157a3edf49feb93be0595574beb153b322ddf7d2 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan@kernel.org>
> > Date: Mon, 28 Jan 2013 11:34:00 +0900
> > Subject: [PATCH] frontswap: Get rid of swap_lock dependency
> > 
> > The frontswap initialization routine depends on swap_lock so that
> > frontswap's first appearance is atomic: either frontswap is not present
> > and fails all calls, or frontswap is fully functional. But until the new
> > swap_info_struct is registered by enable_swap_info, the swap subsystem
> > doesn't start I/O, so there is no race between the init procedure and
> > page I/O going through frontswap.
> > 
> > So let's remove the unnecessary swap_lock dependency.
> 
> This looks good. I hadn't yet had a chance to test it out though.

I hope you'll pick it up if it passes your test.
Thanks, Konrad!

-- 
Kind regards,
Minchan Kim
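
For illustration only, the ordering argued for in this thread can be
sketched in userspace C. This is not the patch and not kernel code. All
names here (backend_pool, backend_init, swapon_sketch, store_sketch,
stored_count, active_pool) are hypothetical; an atomic_flag stands in for
swap_lock and malloc() stands in for an allocation that may sleep. The
sketch assumes, as the thread does, that registrations are serialized by
something like swapon_mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a frontswap backend's per-device state. */
struct backend_pool {
    atomic_size_t stored;   /* number of pages accepted so far */
};

/* Stand-in for swap_lock: a spinlock, so nothing that may sleep goes inside. */
static atomic_flag swap_lock = ATOMIC_FLAG_INIT;

/* The "gatekeeper": non-NULL only once the backend is fully initialized. */
static _Atomic(struct backend_pool *) active_pool;

static void spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;   /* busy-wait */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* May "sleep" (malloc models a GFP_KERNEL allocation): call with no spinlock held. */
static struct backend_pool *backend_init(void)
{
    struct backend_pool *p = malloc(sizeof(*p));
    if (!p)
        return NULL;
    atomic_init(&p->stored, 0);
    return p;
}

/* swapon path: initialize first, outside the lock, then publish under it. */
static bool swapon_sketch(void)
{
    /* Assumption: an outer mutex (swapon_mutex in the thread) serializes callers. */
    assert(atomic_load_explicit(&active_pool, memory_order_relaxed) == NULL);

    struct backend_pool *p = backend_init();
    if (!p)
        return false;

    spin_lock(&swap_lock);
    /* Release store: the pool's fields become visible before the pointer does. */
    atomic_store_explicit(&active_pool, p, memory_order_release);
    spin_unlock(&swap_lock);
    return true;
}

/* Store path: fails every call until the gatekeeper reports a ready backend. */
static int store_sketch(void)
{
    struct backend_pool *p =
        atomic_load_explicit(&active_pool, memory_order_acquire);
    if (!p)
        return -1;
    atomic_fetch_add_explicit(&p->stored, 1, memory_order_relaxed);
    return 0;
}

/* Helper for inspection: pages accepted, or 0 if no backend is published. */
static size_t stored_count(void)
{
    struct backend_pool *p =
        atomic_load_explicit(&active_pool, memory_order_acquire);
    return p ? atomic_load_explicit(&p->stored, memory_order_relaxed) : 0;
}
```

Calling store_sketch() before swapon_sketch() returns -1, and it returns 0
afterwards. A store that races with registration may see the old state for
a while (the "delayed" case Konrad describes), but it can never see the
gatekeeper set while the pool is still uninitialized.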
