From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>,
Jerome Marchand <jmarchan@redhat.com>,
driverdev-devel@linuxdriverproject.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race
Date: Mon, 16 Sep 2013 09:02:17 +0900 [thread overview]
Message-ID: <20130916000217.GA18616@bbox> (raw)
In-Reply-To: <20130910231250.GA2450@swordfish>
Hello Sergey,
Sorry for really slow response. I was really busy by internal works
and Thanks for pointing the BUG, Dan, Jerome and Sergey.
I read your threads roughly so I may miss something. If so, sorry
for that. Anyway I will put my opinion.
On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> Dan Carpenter noted that handle_pending_slot_free() is racy with
> zram_reset_device(). Take write init_lock in zram_slot_free(), thus
Right but "init_lock" is what I really want to remove.
Yes. It's just read-side lock so most of time it doesn't hurt us but it
makes code very complicated and deadlock prone so I'd like to replace
it with RCU. Yeah, It's off topic but just let me put my opinion in
future direction.
Abought the bug, how about moving flush_work below down_write(init_lock)?
zram_make_request is already closed by init_lock and we have a rule about
lock ordering as following so I don't see any problem.
init_lock
zram->lock
> preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> zram_reset_device(). This also allows to safely check zram->init_done
> in handle_pending_slot_free().
>
> Initial intention was to minimze number of handle_pending_slot_free()
> call from zram_bvec_rw(), which were slowing down READ requests due to
> slot_free_lock spin lock. Jerome Marchand suggested to remove
> handle_pending_slot_free() from zram_bvec_rw().
>
> Link: https://lkml.org/lkml/2013/9/9/172
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
>
> ---
>
> drivers/staging/zram/zram_drv.c | 13 +++++--------
> 1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index 91d94b5..7a2d4de 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> while (zram->slot_free_rq) {
> free_rq = zram->slot_free_rq;
> zram->slot_free_rq = free_rq->next;
> - zram_free_page(zram, free_rq->index);
> + if (zram->init_done)
> + zram_free_page(zram, free_rq->index);
> kfree(free_rq);
> }
> spin_unlock(&zram->slot_free_lock);
> @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>
> if (rw == READ) {
> down_read(&zram->lock);
> - handle_pending_slot_free(zram);
Read side is okay but actually I have a nitpick.
If someone poll a block in zram-swap device, he would see a block
has zero value suddenly although there was no I/O.(I don't want to argue
it's sane user or not, anyway) it never happens on real block device and
it never happens on zram-block device. Only it can happen zram-swap device.
And such behavior was there since we introduced swap_slot_free_notify.
(off-topic: I'd like to remove it because it makes tight coupling between
zram and swap and obviously, it was layering violation function)
so now, I don't have strong objection.
The idea is to remove swap_slot_free_notify is to use frontswap when
user want to use zram as swap so zram can be notified when the block
lose the owner but still we should solve the mutex problem in notify
handler.
> ret = zram_bvec_read(zram, bvec, index, offset, bio);
> up_read(&zram->lock);
> } else {
> down_write(&zram->lock);
> - handle_pending_slot_free(zram);
Why did you remove this in write-side?
We can't expect when the work will trigger. It means the work could remove
valid block under the us.
> ret = zram_bvec_write(zram, bvec, index, offset);
> up_write(&zram->lock);
> }
> -
> return ret;
> }
>
> @@ -750,12 +748,11 @@ error:
>
> static void zram_slot_free(struct work_struct *work)
> {
> - struct zram *zram;
> + struct zram *zram = container_of(work, struct zram, free_work);
>
> - zram = container_of(work, struct zram, free_work);
> - down_write(&zram->lock);
> + down_write(&zram->init_lock);
I don't like this.
Primary problem is we should handle it as atomic so that we should use
spinlock instead of mutex. Yeah, /me kicks his ass. From the beginning,
I should solve this problem as that way.
The simple solution popped from my mind is that
diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 91d94b5..b23bf0e 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -534,11 +534,14 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
if (rw == READ) {
down_read(&zram->lock);
- handle_pending_slot_free(zram);
ret = zram_bvec_read(zram, bvec, index, offset, bio);
up_read(&zram->lock);
} else {
down_write(&zram->lock);
+ /*
+ * We should free pending slot. Otherwise it would
+ * free valid blocks under the us.
+ */
handle_pending_slot_free(zram);
ret = zram_bvec_write(zram, bvec, index, offset);
up_write(&zram->lock);
@@ -552,7 +555,6 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
size_t index;
struct zram_meta *meta;
- flush_work(&zram->free_work);
down_write(&zram->init_lock);
if (!zram->init_done) {
@@ -560,6 +562,7 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
return;
}
+ flush_work(&zram->free_work);
meta = zram->meta;
zram->init_done = 0;
But more ideal way I am thinking now is
1) replace init_lock with RCU lock
2) introduce new meta atmoic lock instead of zram->mutex, which is very coarse-grained.
3) use atmoic lock in notify handler.
--
Kind regards,
Minchan Kim
next prev parent reply other threads:[~2013-09-16 0:02 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-06 15:12 [PATCH 1/2] staging: zram: minimize `slot_free_lock' usage (v2) Sergey Senozhatsky
2013-09-09 12:33 ` Dan Carpenter
2013-09-09 12:49 ` Sergey Senozhatsky
2013-09-09 13:21 ` Dan Carpenter
2013-09-09 13:46 ` Jerome Marchand
2013-09-09 16:10 ` Jerome Marchand
2013-09-10 14:34 ` Sergey Senozhatsky
2013-09-10 14:58 ` Dan Carpenter
2013-09-10 15:15 ` Greg Kroah-Hartman
2013-09-10 23:12 ` [PATCH 1/2] staging: zram: fix handle_pending_slot_free() and zram_reset_device() race Sergey Senozhatsky
2013-09-12 22:12 ` Greg KH
2013-09-13 9:17 ` Sergey Senozhatsky
2013-09-16 0:02 ` Minchan Kim [this message]
2013-09-17 17:24 ` Sergey Senozhatsky
2013-09-23 4:24 ` Minchan Kim
2013-09-23 8:42 ` Sergey Senozhatsky
2013-09-10 23:19 ` [PATCH 2/2] staging: zram: remove init_done from zram struct (v3) Sergey Senozhatsky
2013-09-10 23:27 ` [PATCH 1/2] staging: zram: minimize `slot_free_lock' usage (v2) Sergey Senozhatsky
2013-09-09 14:42 ` Sergey Senozhatsky
2013-09-09 14:52 ` Dan Carpenter
2013-09-09 15:09 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130916000217.GA18616@bbox \
--to=minchan@kernel.org \
--cc=dan.carpenter@oracle.com \
--cc=driverdev-devel@linuxdriverproject.org \
--cc=jmarchan@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sergey.senozhatsky@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).