From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>, Nitin Gupta <ngupta@vflare.org>,
Jerome Marchand <jmarchan@redhat.com>,
Ganesh Mahendran <opensource.ganesh@gmail.com>
Subject: Re: [PATCH v1 2/2] zram: remove init_lock in zram_make_request
Date: Mon, 2 Feb 2015 14:10:19 +0900 [thread overview]
Message-ID: <20150202051017.GH6402@blaptop> (raw)
In-Reply-To: <20150202042847.GG6402@blaptop>
On Mon, Feb 02, 2015 at 01:28:47PM +0900, Minchan Kim wrote:
> On Mon, Feb 02, 2015 at 01:01:24PM +0900, Sergey Senozhatsky wrote:
> > On (02/02/15 11:44), Minchan Kim wrote:
> > > > sure, I did think about this. and I actually didn't find any reason not
> > > > to use ->refcount there. if user wants to reset the device, he first
> > > > should umount it to make bdev->bd_holders check happy. and that's where
> > > > IOs will be failed. so it makes sense to switch to ->refcount there, IMHO.
> > >
> > > If we use zram as block device itself(not a fs or swap) and open the
> > > block device as !FMODE_EXCL, bd_holders will be void.
> > >
> >
> > hm.
> > I don't mind to use ->disksize there, but personally I'd maybe prefer
> > to use ->refcount, which just looks less hacky. zram's most common use
> > cases are coming from ram swap device or ram device with fs. so it looks
> > a bit like we care about some corner case here.
>
> Maybe, but I always test zram with dd so it's not a corner case for me. :)
>
> >
> > just my opinion, no objections against ->disksize != 0.
>
> Thanks. It's a draft for v2. Please review.
>
> BTW, you pointed out race between bdev_open/close and reset and
> it's cleary bug although it's rare in real practice.
> So, I want to fix it earlier than this patch and mark it as -stable
> if we can fix it easily like Ganesh's work.
> If it gets landing, we could make this patch rebased on it.
>
> From 699502b4e0c84b3d7b33f8754cf1c0109b16c012 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 2 Feb 2015 10:36:28 +0900
> Subject: [PATCH v2] zram: remove init_lock in zram_make_request
>
> Admin could reset zram during I/O operation going on so we have
> used zram->init_lock as read-side lock in I/O path to prevent
> sudden zram meta freeing.
>
> However, the init_lock is really troublesome.
> We can't do call zram_meta_alloc under init_lock due to lockdep splat
> because zram_rw_page is one of the function under reclaim path and
> hold it as read_lock while other places in process context hold it
> as write_lock. So, we have used allocation out of the lock to avoid
> lockdep warn but it's not good for readability and fainally, I met
> another lockdep splat between init_lock and cpu_hotplug from
> kmem_cache_destroy during working zsmalloc compaction. :(
>
> Yes, the ideal is to remove horrible init_lock of zram in rw path.
> This patch removes it in rw path and instead, add atomic refcount
> for meta lifetime management and completion to free meta in process
> context. It's important to free meta in process context because
> some of resource destruction needs mutex lock, which could be held
> if we releases the resource in reclaim context so it's deadlock,
> again.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> drivers/block/zram/zram_drv.c | 85 ++++++++++++++++++++++++++++++-------------
> drivers/block/zram/zram_drv.h | 20 +++++-----
> 2 files changed, 71 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aa5a4c5..c6d505c 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -55,7 +55,7 @@ static DEVICE_ATTR_RO(name);
>
> static inline int init_done(struct zram *zram)
> {
> - return zram->meta != NULL;
> + return zram->disksize != 0;
> }
>
> static inline struct zram *dev_to_zram(struct device *dev)
> @@ -358,6 +358,18 @@ out_error:
> return NULL;
> }
>
> +static inline bool zram_meta_get(struct zram *zram)
> +{
> + if (atomic_inc_not_zero(&zram->refcount))
> + return true;
> + return false;
> +}
> +
> +static inline void zram_meta_put(struct zram *zram)
> +{
> + atomic_dec(&zram->refcount);
> +}
> +
> static void update_position(u32 *index, int *offset, struct bio_vec *bvec)
> {
> if (*offset + bvec->bv_len >= PAGE_SIZE)
> @@ -719,6 +731,10 @@ static void zram_bio_discard(struct zram *zram, u32 index,
>
> static void zram_reset_device(struct zram *zram, bool reset_capacity)
> {
> + struct zram_meta *meta;
> + struct zcomp *comp;
> + u64 disksize;
> +
> down_write(&zram->init_lock);
>
> zram->limit_pages = 0;
> @@ -728,19 +744,32 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
> return;
> }
>
> - zcomp_destroy(zram->comp);
> - zram->max_comp_streams = 1;
> - zram_meta_free(zram->meta, zram->disksize);
> - zram->meta = NULL;
> + meta = zram->meta;
> + comp = zram->comp;
> + disksize = zram->disksize;
> + zram->disksize = 0;
> + /*
> + * ->refcount will go down to 0 eventually and rw handler cannot
> + * handle further I/O by init_done checking.
> + */
> + zram_meta_put(zram);
> + /*
> + * We want to free zram_meta in process context to avoid
> + * deadlock between reclaim path and any other locks
> + */
> + wait_event(zram->io_done, atomic_read(&zram->refcount) == 0);
> +
> /* Reset stats */
> memset(&zram->stats, 0, sizeof(zram->stats));
> + zram->max_comp_streams = 1;
>
> - zram->disksize = 0;
> if (reset_capacity)
> set_capacity(zram->disk, 0);
>
> up_write(&zram->init_lock);
> -
> + /* I/O operation under all of CPU are done so let's free */
> + zram_meta_free(meta, disksize);
> + zcomp_destroy(comp);
> /*
> * Revalidate disk out of the init_lock to avoid lockdep splat.
> * It's okay because disk's capacity is protected by init_lock
> @@ -783,6 +812,8 @@ static ssize_t disksize_store(struct device *dev,
> goto out_destroy_comp;
> }
>
> + init_waitqueue_head(&zram->io_done);
> + zram_meta_get(zram);
Argh, It should be
atomic_set(&zram->refcount, 1);
--
Kind regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>, Nitin Gupta <ngupta@vflare.org>,
Jerome Marchand <jmarchan@redhat.com>,
Ganesh Mahendran <opensource.ganesh@gmail.com>
Subject: Re: [PATCH v1 2/2] zram: remove init_lock in zram_make_request
Date: Mon, 2 Feb 2015 14:10:19 +0900 [thread overview]
Message-ID: <20150202051017.GH6402@blaptop> (raw)
In-Reply-To: <20150202042847.GG6402@blaptop>
On Mon, Feb 02, 2015 at 01:28:47PM +0900, Minchan Kim wrote:
> On Mon, Feb 02, 2015 at 01:01:24PM +0900, Sergey Senozhatsky wrote:
> > On (02/02/15 11:44), Minchan Kim wrote:
> > > > sure, I did think about this. and I actually didn't find any reason not
> > > > to use ->refcount there. if user wants to reset the device, he first
> > > > should umount it to make bdev->bd_holders check happy. and that's where
> > > > IOs will be failed. so it makes sense to switch to ->refcount there, IMHO.
> > >
> > > If we use zram as block device itself(not a fs or swap) and open the
> > > block device as !FMODE_EXCL, bd_holders will be void.
> > >
> >
> > hm.
> > I don't mind to use ->disksize there, but personally I'd maybe prefer
> > to use ->refcount, which just looks less hacky. zram's most common use
> > cases are coming from ram swap device or ram device with fs. so it looks
> > a bit like we care about some corner case here.
>
> Maybe, but I always test zram with dd so it's not a corner case for me. :)
>
> >
> > just my opinion, no objections against ->disksize != 0.
>
> Thanks. It's a draft for v2. Please review.
>
> BTW, you pointed out race between bdev_open/close and reset and
> it's cleary bug although it's rare in real practice.
> So, I want to fix it earlier than this patch and mark it as -stable
> if we can fix it easily like Ganesh's work.
> If it gets landing, we could make this patch rebased on it.
>
> From 699502b4e0c84b3d7b33f8754cf1c0109b16c012 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 2 Feb 2015 10:36:28 +0900
> Subject: [PATCH v2] zram: remove init_lock in zram_make_request
>
> Admin could reset zram during I/O operation going on so we have
> used zram->init_lock as read-side lock in I/O path to prevent
> sudden zram meta freeing.
>
> However, the init_lock is really troublesome.
> We can't do call zram_meta_alloc under init_lock due to lockdep splat
> because zram_rw_page is one of the function under reclaim path and
> hold it as read_lock while other places in process context hold it
> as write_lock. So, we have used allocation out of the lock to avoid
> lockdep warn but it's not good for readability and fainally, I met
> another lockdep splat between init_lock and cpu_hotplug from
> kmem_cache_destroy during working zsmalloc compaction. :(
>
> Yes, the ideal is to remove horrible init_lock of zram in rw path.
> This patch removes it in rw path and instead, add atomic refcount
> for meta lifetime management and completion to free meta in process
> context. It's important to free meta in process context because
> some of resource destruction needs mutex lock, which could be held
> if we releases the resource in reclaim context so it's deadlock,
> again.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> drivers/block/zram/zram_drv.c | 85 ++++++++++++++++++++++++++++++-------------
> drivers/block/zram/zram_drv.h | 20 +++++-----
> 2 files changed, 71 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aa5a4c5..c6d505c 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -55,7 +55,7 @@ static DEVICE_ATTR_RO(name);
>
> static inline int init_done(struct zram *zram)
> {
> - return zram->meta != NULL;
> + return zram->disksize != 0;
> }
>
> static inline struct zram *dev_to_zram(struct device *dev)
> @@ -358,6 +358,18 @@ out_error:
> return NULL;
> }
>
> +static inline bool zram_meta_get(struct zram *zram)
> +{
> + if (atomic_inc_not_zero(&zram->refcount))
> + return true;
> + return false;
> +}
> +
> +static inline void zram_meta_put(struct zram *zram)
> +{
> + atomic_dec(&zram->refcount);
> +}
> +
> static void update_position(u32 *index, int *offset, struct bio_vec *bvec)
> {
> if (*offset + bvec->bv_len >= PAGE_SIZE)
> @@ -719,6 +731,10 @@ static void zram_bio_discard(struct zram *zram, u32 index,
>
> static void zram_reset_device(struct zram *zram, bool reset_capacity)
> {
> + struct zram_meta *meta;
> + struct zcomp *comp;
> + u64 disksize;
> +
> down_write(&zram->init_lock);
>
> zram->limit_pages = 0;
> @@ -728,19 +744,32 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
> return;
> }
>
> - zcomp_destroy(zram->comp);
> - zram->max_comp_streams = 1;
> - zram_meta_free(zram->meta, zram->disksize);
> - zram->meta = NULL;
> + meta = zram->meta;
> + comp = zram->comp;
> + disksize = zram->disksize;
> + zram->disksize = 0;
> + /*
> + * ->refcount will go down to 0 eventually and rw handler cannot
> + * handle further I/O by init_done checking.
> + */
> + zram_meta_put(zram);
> + /*
> + * We want to free zram_meta in process context to avoid
> + * deadlock between reclaim path and any other locks
> + */
> + wait_event(zram->io_done, atomic_read(&zram->refcount) == 0);
> +
> /* Reset stats */
> memset(&zram->stats, 0, sizeof(zram->stats));
> + zram->max_comp_streams = 1;
>
> - zram->disksize = 0;
> if (reset_capacity)
> set_capacity(zram->disk, 0);
>
> up_write(&zram->init_lock);
> -
> + /* I/O operation under all of CPU are done so let's free */
> + zram_meta_free(meta, disksize);
> + zcomp_destroy(comp);
> /*
> * Revalidate disk out of the init_lock to avoid lockdep splat.
> * It's okay because disk's capacity is protected by init_lock
> @@ -783,6 +812,8 @@ static ssize_t disksize_store(struct device *dev,
> goto out_destroy_comp;
> }
>
> + init_waitqueue_head(&zram->io_done);
> + zram_meta_get(zram);
Argh, It should be
atomic_set(&zram->refcount, 1);
--
Kind regards,
Minchan Kim
next prev parent reply other threads:[~2015-02-02 5:10 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-28 8:15 [PATCH 1/2] zram: free meta table in zram_meta_free Minchan Kim
2015-01-28 8:15 ` [PATCH v1 2/2] zram: remove init_lock in zram_make_request Minchan Kim
2015-01-28 14:56 ` Sergey Senozhatsky
2015-01-28 14:56 ` Sergey Senozhatsky
2015-01-28 15:04 ` Sergey Senozhatsky
2015-01-28 15:04 ` Sergey Senozhatsky
2015-01-28 23:33 ` Minchan Kim
2015-01-28 23:33 ` Minchan Kim
2015-01-29 1:57 ` Sergey Senozhatsky
2015-01-29 2:01 ` Minchan Kim
2015-01-29 2:01 ` Minchan Kim
2015-01-29 2:22 ` Sergey Senozhatsky
2015-01-29 2:22 ` Sergey Senozhatsky
2015-01-29 5:28 ` Minchan Kim
2015-01-29 5:28 ` Minchan Kim
2015-01-29 6:06 ` Sergey Senozhatsky
2015-01-29 6:06 ` Sergey Senozhatsky
2015-01-29 6:35 ` Minchan Kim
2015-01-29 6:35 ` Minchan Kim
2015-01-29 7:08 ` Sergey Senozhatsky
2015-01-29 7:08 ` Sergey Senozhatsky
2015-01-30 14:41 ` Minchan Kim
2015-01-30 14:41 ` Minchan Kim
2015-01-31 11:31 ` Sergey Senozhatsky
2015-01-31 11:31 ` Sergey Senozhatsky
2015-02-01 14:50 ` Sergey Senozhatsky
2015-02-01 14:50 ` Sergey Senozhatsky
2015-02-01 15:04 ` Sergey Senozhatsky
2015-02-01 15:04 ` Sergey Senozhatsky
2015-02-02 1:43 ` Minchan Kim
2015-02-02 1:43 ` Minchan Kim
2015-02-02 1:59 ` Sergey Senozhatsky
2015-02-02 1:59 ` Sergey Senozhatsky
2015-02-02 2:45 ` Minchan Kim
2015-02-02 2:45 ` Minchan Kim
2015-02-02 3:47 ` Sergey Senozhatsky
2015-02-02 3:47 ` Sergey Senozhatsky
2015-02-02 1:30 ` Minchan Kim
2015-02-02 1:30 ` Minchan Kim
2015-02-02 1:48 ` Sergey Senozhatsky
2015-02-02 1:48 ` Sergey Senozhatsky
2015-02-02 2:44 ` Minchan Kim
2015-02-02 2:44 ` Minchan Kim
2015-02-02 4:01 ` Sergey Senozhatsky
2015-02-02 4:01 ` Sergey Senozhatsky
2015-02-02 4:28 ` Minchan Kim
2015-02-02 4:28 ` Minchan Kim
2015-02-02 5:09 ` Sergey Senozhatsky
2015-02-02 5:09 ` Sergey Senozhatsky
2015-02-02 5:18 ` Minchan Kim
2015-02-02 5:18 ` Minchan Kim
2015-02-02 5:28 ` Sergey Senozhatsky
2015-02-02 5:28 ` Sergey Senozhatsky
2015-02-02 5:10 ` Minchan Kim [this message]
2015-02-02 5:10 ` Minchan Kim
2015-01-30 0:20 ` Sergey Senozhatsky
2015-01-29 13:48 ` Ganesh Mahendran
2015-01-29 13:48 ` Ganesh Mahendran
2015-01-29 15:12 ` Sergey Senozhatsky
2015-01-29 15:12 ` Sergey Senozhatsky
2015-01-30 7:52 ` Ganesh Mahendran
2015-01-30 7:52 ` Ganesh Mahendran
2015-01-30 8:08 ` Sergey Senozhatsky
2015-01-30 8:08 ` Sergey Senozhatsky
2015-01-31 8:50 ` Ganesh Mahendran
2015-01-31 8:50 ` Ganesh Mahendran
2015-01-31 11:07 ` Sergey Senozhatsky
2015-01-31 11:07 ` Sergey Senozhatsky
2015-01-31 12:59 ` Ganesh Mahendran
2015-01-31 12:59 ` Ganesh Mahendran
2015-01-28 14:19 ` [PATCH 1/2] zram: free meta table in zram_meta_free Sergey Senozhatsky
2015-01-28 14:19 ` Sergey Senozhatsky
2015-01-28 23:17 ` Minchan Kim
2015-01-28 23:17 ` Minchan Kim
2015-01-29 1:49 ` Ganesh Mahendran
2015-01-29 1:49 ` Ganesh Mahendran
-- strict thread matches above, loose matches on Subject: below --
2015-02-02 3:41 [PATCH v1 2/2] zram: remove init_lock in zram_make_request Minchan Kim
2015-02-02 3:41 ` Minchan Kim
2015-02-02 5:59 ` Sergey Senozhatsky
2015-02-02 5:59 ` Sergey Senozhatsky
2015-02-02 6:18 ` Minchan Kim
2015-02-02 6:18 ` Minchan Kim
2015-02-02 7:06 ` Sergey Senozhatsky
2015-02-02 7:06 ` Sergey Senozhatsky
2015-02-03 1:54 ` Sergey Senozhatsky
2015-02-03 1:54 ` Sergey Senozhatsky
2015-02-03 3:02 ` Minchan Kim
2015-02-03 3:02 ` Minchan Kim
2015-02-03 3:56 ` Sergey Senozhatsky
2015-02-03 3:56 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150202051017.GH6402@blaptop \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=jmarchan@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ngupta@vflare.org \
--cc=opensource.ganesh@gmail.com \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.