From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Nitin Gupta <ngupta@vflare.org>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality
Date: Wed, 29 Apr 2015 15:48:58 +0900
Message-ID: <20150429064858.GA5125@blaptop>
In-Reply-To: <20150429001624.GA3917@swordfish>

Hello Sergey,

On Wed, Apr 29, 2015 at 09:16:24AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> Minchan, a quick question. just to avoid resends and to make sure that I'm
> not missing any better solution.
> 
> 
> lockdep is unhappy here:
> 
> > -static void zram_remove(struct zram *zram)
> > +static int zram_remove(struct zram *zram)
> >  {
> > -	pr_info("Removed device: %s\n", zram->disk->disk_name);
> > +	struct block_device *bdev;
> > +
> > +	bdev = bdget_disk(zram->disk, 0);
> > +	if (!bdev)
> > +		return -ENOMEM;
> > +
> > +	mutex_lock(&bdev->bd_mutex);
> > +	if (bdev->bd_openers) {
> > +		mutex_unlock(&bdev->bd_mutex);
> > +		return -EBUSY;
> > +	}
> > +
> >  	/*
> >  	 * Remove sysfs first, so no one will perform a disksize
> > -	 * store while we destroy the devices
> > +	 * store while we destroy the devices. This also helps during
> > +	 * zram_remove() -- device_reset() is the last holder of
> > +	 * ->init_lock.
> >  	 */
> >  	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> >  			&zram_disk_attr_group);
> >  
> > +	/* Make sure all pending I/O is finished */
> > +	fsync_bdev(bdev);
> >  	zram_reset_device(zram);
> > +	mutex_unlock(&bdev->bd_mutex);
> > +
> > +	pr_info("Removed device: %s\n", zram->disk->disk_name);
> > +
> >  	idr_remove(&zram_index_idr, zram->disk->first_minor);
> >  	blk_cleanup_queue(zram->disk->queue);
> >  	del_gendisk(zram->disk);
> >  	put_disk(zram->disk);
> >  	kfree(zram);
> > +
> > +	return 0;
> >  }
> 
> 
> lock ->bd_mutex
> 	if ->bd_openers {
> 		unlock ->bd_mutex
> 		return -EBUSY;
> 	}
> 
> 	sysfs_remove_group()
> 		^^^^^^^^^ lockdep splat
> 	zram_reset_device()
> 		lock ->init_lock
> 		reset device
> 		unlock ->init_lock
> unlock ->bd_mutex
> ...
> kfree zram
> 
> 
> why did I do this:
> 
> sysfs_remove_group() turns zram_reset_device() into the last possible ->init_lock
> owner: there are no sysfs nodes before final zram_reset_device(), so no
> concurrent nor later store()/show() sysfs handler can be called. it closes a number
> of race conditions, like:
> 
> 	CPU0			CPU1
> 	umount
> 	zram_remove()
> 	zram_reset_device()	disksize_store()
> 				mount
> 	kfree zram
> 
> or
> 
> 	CPU0				CPU1
> 	umount
> 	zram_remove()
> 	zram_reset_device()
> 					cat /sys/block/zram0/_any_sysfs_node_
> 	sysfs_remove_group()
> 	kfree zram			_any_sysfs_node_read()
> 
> 
> and so on. so removing sysfs group before zram_reset_device() makes sense.
> 
> at the same time we need to prevent `umount-zram_remove vs. mount' race and forbid
> zram_remove() on active device. so we check ->bd_openers and perform device reset
> under ->bd_mutex.
> 

Could you explain the umount-zram_remove vs. mount race in detail?
I'd guess that should be handled in an upper layer (e.g. VFS). Either way,
I want to understand it clearly.

> 
> a quick solution I can think of is to do something like this:
> 
> sysfs_remove_group();
> lock ->bd_mutex
> 	if ->bd_openers {
> 		unlock ->bd_mutex
> 		create_sysfs_group()
> 			^^^^^^^^   return attrs back
> 		return -EBUSY
> 	}
> 
> 	zram_reset_device()
> 		lock ->init_lock
> 		reset device
> 		unlock ->init_lock
> unlock ->bd_mutex
> ...
> kfree zram
> 
> 
> iow, move sysfs_remove_group() out of ->bd_mutex lock, but keep it
> before ->bd_openers check & zram_reset_device().
> 
> I don't think that we can handle it with only ->init_lock. we need to unlock
> it at some point and kfree zram that contains that lock. and we have no idea
> whether there are any lock owners or waiters when we kfree zram. removing sysfs
> group and acquiring ->init_lock for write in zram_reset_device() guarantee that
> there will be no further ->init_lock owners and we can safely kfree after
> `unlock ->init_lock`. hence, sysfs_remove_group() must happen before
> zram_reset_device().
> 
> 
> create_sysfs_group() potentially can fail. which is a bit ugly. but user
> will be able to umount device and remove it anyway.
> 
> 
> what do you think?
> 
> 	-ss

-- 
Kind regards,
Minchan Kim
