From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Nitin Gupta <ngupta@vflare.org>,
linux-kernel@vger.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality
Date: Mon, 4 May 2015 11:28:17 +0900
Message-ID: <20150504022816.GB14452@blaptop>
In-Reply-To: <20150504022008.GA14452@blaptop>
On Mon, May 04, 2015 at 11:20:08AM +0900, Minchan Kim wrote:
> Hello Sergey,
>
> On Thu, Apr 30, 2015 at 03:51:12PM +0900, Sergey Senozhatsky wrote:
> > On (04/30/15 15:44), Minchan Kim wrote:
> > > > > I think the problem of deadlock is that you are trying to remove sysfs file
> > > > > in sysfs handler.
> > > > >
> > > > > #> echo 1 > /sys/xxx/zram_remove
> > > > >
> > > > > kernfs_fop_write - hold s_active
> > > > > -> zram_remove_store
> > > > > -> zram_remove
> > > > > -> sysfs_remove_group - hold s_active *again*
> > > > >
> > > > > Right?
> > > > >
> > > >
> > > > are those same s_active locks?
> > > >
> > > >
> > > > we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
> > >
> > > Thanks for sharing the message.
> > > You're right. It's another lock so it shouldn't be a reason.
> > > Okay, I will review it. Please give me time.
> > >
> >
> > sure, no problem and no rush. thanks!
>
> I had some time to think it over.
>
> I think your patch is rather tricky: someone who has already opened
> /dev/zram may not see the sysfs files at first, and then see them after
> a while. That's weird.
>
> I want to fix it in a more generic way. Otherwise, we might run into
> locking problems again later. We already experienced that with init_lock,
> although we finally fixed it.
>
> I think we can fix it with the patch below; I hope it's a more general and
> correct approach. It's based on your [zram: return zram device_id from zram_add()]
>
> What do you think?
>
> From e943df5407b880f9262ef959b270226fdc81bc9f Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 4 May 2015 08:36:07 +0900
> Subject: [PATCH 1/2] zram: close race by open overriding
>
> [1] introduced bdev->bd_mutex to protect against a race between mount
> and reset. At that time, we didn't have the dynamic zram-add/remove
> feature, so it was okay.
>
> However, as we introduce the dynamic device feature, bd_mutex becomes
> a problem.
>
> CPU 0
>
> echo 1 > /sys/block/zram<id>/reset
> -> kernfs->s_active(A)
> -> zram:reset_store->bd_mutex(B)
>
> CPU 1
>
> echo <id> > /sys/class/zram/zram-remove
> ->zram:zram_remove: bd_mutex(B)
> -> sysfs_remove_group
> -> kernfs->s_active(A)
>
> IOW, AB -> BA deadlock
>
> The reason we hold bd_mutex in zram_remove is to prevent any
> incoming open of /dev/zram[0-9]. Otherwise, we could remove a zram
> device that others have already opened. But it causes the deadlock
> shown above.
>
> To fix the problem, this patch overrides block_device.open so that
> it returns -EBUSY once zram has claimed the device for reset. Any
> incoming open then fails, so we don't need to hold bd_mutex in
> zram_remove any more.
>
> This patch is to prepare for zram-add/remove feature.
>
> [1] ba6b17: zram: fix umount-reset_store-mount race condition
> Signed-off-by: Minchan Kim <minchan@kernel.org>
If the above has no problems, we could apply your last patch on top of it.
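
The claim idea from the quoted changelog can be sketched in user space. This is only an illustration under stated assumptions: `struct fake_dev` and the `fake_dev_*` helpers are hypothetical names, and the pthread mutex merely stands in for bd_mutex. It is not the code of the patch. The point it shows is that the lock is held only for the check-and-set, so the slow teardown (sysfs removal, reset) can run without it while new opens are rejected with -EBUSY.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a zram device, not the kernel struct. */
struct fake_dev {
	pthread_mutex_t lock;	/* plays the role of bdev->bd_mutex */
	int openers;		/* plays the role of bdev->bd_openers */
	bool claim;		/* plays the role of zram->claim */
};

void fake_dev_init(struct fake_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->openers = 0;
	dev->claim = false;
}

/* open(): refuse new openers once the device has been claimed. */
int fake_dev_open(struct fake_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->claim)
		ret = -EBUSY;
	else
		dev->openers++;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* release(): drop one opener. */
void fake_dev_release(struct fake_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->openers--;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Claim the device for reset/remove. The lock covers only the
 * check-and-set; the caller does the actual teardown after this
 * returns 0, without holding the lock.
 */
int fake_dev_claim(struct fake_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->openers || dev->claim)
		ret = -EBUSY;
	else
		dev->claim = true;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

Because teardown no longer runs under the lock, the remove path never holds bd_mutex while waiting for s_active, which is the B -> A half of the inversion described above.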
From 5bfa8a2e312a9c8493f574b1cf513ef4693a465c Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date: Mon, 4 May 2015 09:02:23 +0900
Subject: [PATCH 2/2] zram: add dynamic device add/remove functionality
We currently don't support on-demand device creation. The one and only way
to have N zram devices is to specify the num_devices module parameter (default
value: 1). IOW if, for some reason, at some point, a user wants to have
N + 1 devices, he/she must umount all the existing devices, unload the
module, and load the module again with num_devices equal to N + 1. And do this
again, if needed.

This patch introduces the zram control sysfs class, which has two sysfs
attrs:
- zram_add    -- add a new zram device
- zram_remove -- remove a specific (device_id) zram device

The zram_add sysfs attr is read-only and has only automatic device id
assignment mode (as requested by Minchan Kim). A read operation performed
on this attr creates a new zram device and returns its device_id or
an error status.

Usage example:
    # add a new zram device
    cat /sys/class/zram-control/zram_add
    2

    # remove a specific zram device
    echo 4 > /sys/class/zram-control/zram_remove

Returning zram_add() error code back to user (-ENOMEM in this case):
    cat /sys/class/zram-control/zram_add
    cat: /sys/class/zram-control/zram_add: Cannot allocate memory

NOTE, there might be users who already depend on the fact that at least
the zram0 device always gets created by zram_init(). Preserve this behavior.
[minchan]: use zram->claim to avoid lockdep splat
Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
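
For callers that want to drive zram_add from C rather than from cat, a minimal user-space sketch might look like the following. The helper name `read_new_device_id` is hypothetical and not part of this patch. It assumes only what the changelog states: a read returns the new device id as a decimal number, or fails with an errno such as ENOMEM.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read a device id from a zram_add-style attribute file.
 * Returns the id (>= 0) on success or a negative errno value on failure.
 * Hypothetical helper for illustration only.
 */
int read_new_device_id(const char *path)
{
	FILE *f = fopen(path, "r");
	int id, saved;

	if (!f)
		return -errno;

	errno = 0;
	if (fscanf(f, "%d", &id) != 1) {
		/* Read failed (errno set) or the content was not a number. */
		saved = errno ? errno : EINVAL;
		fclose(f);
		return -saved;
	}

	fclose(f);
	return id;
}
```

A caller would pass "/sys/class/zram-control/zram_add" as the path. Note that every successful read creates a new device, so the helper should be called once per device wanted.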
Documentation/blockdev/zram.txt | 23 ++++++++--
drivers/block/zram/zram_drv.c | 97 +++++++++++++++++++++++++++++++++++++++--
2 files changed, 114 insertions(+), 6 deletions(-)
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 65e9430..fc686d4 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -99,7 +99,24 @@ size of the disk when not in use so a huge zram is wasteful.
mkfs.ext4 /dev/zram1
mount /dev/zram1 /tmp
-7) Stats:
+7) Add/remove zram devices
+
+zram provides a control interface, which enables dynamic (on-demand) device
+addition and removal.
+
+In order to add a new /dev/zramX device, perform read operation on zram_add
+attribute. This will return either new device's device id (meaning that you
+can use /dev/zram<id>) or error code.
+
+Example:
+ cat /sys/class/zram-control/zram_add
+ 1
+
+To remove the existing /dev/zramX device (where X is a device id)
+execute
+ echo X > /sys/class/zram-control/zram_remove
+
+8) Stats:
Per-device statistics are exported as various nodes under /sys/block/zram<id>/
A brief description of exported device attritbutes. For more details please
@@ -174,11 +191,11 @@ line of text and contains the following stats separated by whitespace:
zero_pages
num_migrated
-8) Deactivate:
+9) Deactivate:
swapoff /dev/zram0
umount /dev/zram1
-9) Reset:
+10) Reset:
Write any positive value to 'reset' sysfs node
echo 1 > /sys/block/zram0/reset
echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7fb72dc..97cd4f3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -29,10 +29,14 @@
#include <linux/vmalloc.h>
#include <linux/err.h>
#include <linux/idr.h>
+#include <linux/sysfs.h>
#include "zram_drv.h"
static DEFINE_IDR(zram_index_idr);
+/* idr index must be protected */
+static DEFINE_MUTEX(zram_index_mutex);
+
static int zram_major;
static const char *default_compressor = "lzo";
@@ -1278,24 +1282,101 @@ out_free_dev:
return ret;
}
-static void zram_remove(struct zram *zram)
+static int zram_remove(struct zram *zram)
{
- pr_info("Removed device: %s\n", zram->disk->disk_name);
+ struct block_device *bdev;
+
+ bdev = bdget_disk(zram->disk, 0);
+ if (!bdev)
+ return -ENOMEM;
+
+ mutex_lock(&bdev->bd_mutex);
+ if (bdev->bd_openers || zram->claim) {
+ mutex_unlock(&bdev->bd_mutex);
+ return -EBUSY;
+ }
+
+ zram->claim = true;
+ mutex_unlock(&bdev->bd_mutex);
+
/*
* Remove sysfs first, so no one will perform a disksize
- * store while we destroy the devices
+ * store while we destroy the devices. This also helps during
+ * zram_remove() -- device_reset() is the last holder of
+ * ->init_lock.
*/
sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
&zram_disk_attr_group);
+ /* Make sure all pending I/O is finished */
+ fsync_bdev(bdev);
zram_reset_device(zram);
+ bdput(bdev);
+
+ pr_info("Removed device: %s\n", zram->disk->disk_name);
+
idr_remove(&zram_index_idr, zram->disk->first_minor);
blk_cleanup_queue(zram->disk->queue);
del_gendisk(zram->disk);
put_disk(zram->disk);
kfree(zram);
+
+ return 0;
}
+/* zram module control sysfs attributes */
+static ssize_t zram_add_show(struct class *class,
+ struct class_attribute *attr,
+ char *buf)
+{
+ int ret;
+
+ mutex_lock(&zram_index_mutex);
+ ret = zram_add();
+ mutex_unlock(&zram_index_mutex);
+
+ if (ret < 0)
+ return ret;
+ return scnprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t zram_remove_store(struct class *class,
+ struct class_attribute *attr,
+ const char *buf,
+ size_t count)
+{
+ struct zram *zram;
+ int ret, dev_id;
+
+ /* dev_id is gendisk->first_minor, which is `int' */
+ ret = kstrtoint(buf, 10, &dev_id);
+ if (ret || dev_id < 0)
+ return -EINVAL;
+
+ mutex_lock(&zram_index_mutex);
+
+ zram = idr_find(&zram_index_idr, dev_id);
+ if (zram)
+ ret = zram_remove(zram);
+ else
+ ret = -ENODEV;
+
+ mutex_unlock(&zram_index_mutex);
+ return ret ? ret : count;
+}
+
+static struct class_attribute zram_control_class_attrs[] = {
+ __ATTR_RO(zram_add),
+ __ATTR_WO(zram_remove),
+ __ATTR_NULL,
+};
+
+static struct class zram_control_class = {
+ .name = "zram-control",
+ .owner = THIS_MODULE,
+ .class_attrs = zram_control_class_attrs,
+};
+
static int zram_remove_cb(int id, void *ptr, void *data)
{
zram_remove(ptr);
@@ -1304,6 +1385,7 @@ static int zram_remove_cb(int id, void *ptr, void *data)
static void destroy_devices(void)
{
+ class_unregister(&zram_control_class);
idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
idr_destroy(&zram_index_idr);
unregister_blkdev(zram_major, "zram");
@@ -1313,14 +1395,23 @@ static int __init zram_init(void)
{
int ret;
+ ret = class_register(&zram_control_class);
+ if (ret) {
+ pr_warn("Unable to register zram-control class\n");
+ return ret;
+ }
+
zram_major = register_blkdev(0, "zram");
if (zram_major <= 0) {
pr_warn("Unable to get major number\n");
+ class_unregister(&zram_control_class);
return -EBUSY;
}
while (num_devices != 0) {
+ mutex_lock(&zram_index_mutex);
ret = zram_add();
+ mutex_unlock(&zram_index_mutex);
if (ret < 0)
goto out_error;
num_devices--;
--
1.9.3
--
Kind regards,
Minchan Kim