linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.com>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: linux-raid@vger.kernel.org
Subject: Re: Linux Plumbers MD BOF discussion notes
Date: Wed, 04 Oct 2017 11:49:00 +1100
Message-ID: <87lgkr3fgj.fsf@notabene.neil.brown.name>
In-Reply-To: <alpine.DEB.2.20.1710010730210.31961@uplift.swm.pp.se>

On Sun, Oct 01 2017, Mikael Abrahamsson wrote:

> On Mon, 18 Sep 2017, NeilBrown wrote:
>
>> Anyway, thanks for the example of a real problem related to this.  It 
>> does make it easier to think about.
>
> Btw, if someone does --zero-superblock or dd /dev/zero to a component 
> device that is active, what happens when mdadm --stop /dev/mdX is run? 
> Does it write out the complete superblock again?

--zero-superblock won't work on a device that is currently part of an
array.  A dd from /dev/zero will.
When the array is stopped, the metadata will be written if the array is
not read-only and is not clean.
So for 'linear' and 'raid0' it is never written.  For other levels it
probably is written, but may not be.
I'm not sure that forcing a write makes sense.  A dd could corrupt lots
of stuff, and just saving the metadata is not a big win.
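
The stop-time rule above can be written down as a small predicate.  This
is only a user-space sketch for thinking about the cases, not md code:
the struct, its field names and the function name are all hypothetical,
and treating 'linear' and 'raid0' as "always clean" is an assumption
about why their metadata is never written.

```c
#include <stdbool.h>

/* Hypothetical user-space model of an array; not the kernel's struct mddev. */
struct array_state {
	bool always_clean; /* assumed true for 'linear' and 'raid0' */
	bool read_only;    /* array is read-only */
	bool clean;        /* array is currently marked clean */
};

/*
 * Models the rule stated above: on stop, metadata is written only if the
 * array is not read-only and is not clean.
 */
static bool metadata_written_on_stop(const struct array_state *a)
{
	if (a->always_clean)
		return false; /* 'linear' / 'raid0': never written */
	return !a->read_only && !a->clean;
}
```

Under this model a zeroed superblock is only repaired by a stop if the
array happened to be writable and dirty at the time, which is why it
"probably is but may not be" written.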

I've been playing with some code, and this patch makes it impossible to
write to a device which is in-use by md.
Well... not exactly.  If a partition is in-use by md, the whole device
can still be written to.  But the partition itself cannot.
Also if metadata is managed by user-space, writes are still allowed.
To fix that, we would need to capture each write request and validate
the sector range.  Not impossible, but ugly.
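
To make the "validate the sector range" idea concrete, here is a minimal
user-space sketch.  It is not part of the patch.  It assumes the goal
would be to reject writes that touch the data area of a member device
while still letting user-space reach the metadata outside it; the struct,
the function names and the -1 return (standing in for -EPERM) are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel type of the same name */

/* Hypothetical description of the data area md uses on one member device. */
struct member_data_area {
	sector_t data_offset; /* first data sector */
	sector_t sectors;     /* number of data sectors */
};

/* True if the write [start, start + len) overlaps the data area. */
static bool write_touches_data(const struct member_data_area *d,
			       sector_t start, sector_t len)
{
	if (len == 0 || d->sectors == 0)
		return false;
	/* Compare distances rather than end points to avoid overflow. */
	if (start >= d->data_offset)
		return start - d->data_offset < d->sectors;
	return d->data_offset - start < len;
}

/* Returns 0 if the write may proceed, -1 (think -EPERM) otherwise. */
static int validate_write(const struct member_data_area *d,
			  sector_t start, sector_t len)
{
	return write_touches_data(d, start, len) ? -1 : 0;
}
```

The ugly part is not the comparison but the hook: every write request to
the device would have to pass through a check like this.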

Also, by itself, this patch breaks the use of raid6check on an active
array.  We could fix that by enabling writes whenever a region is
suspended.
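
A rough user-space model of that relaxation, combined with the check the
patch adds to blkdev_write_iter(), might look like the following.  It is
a sketch only.  The field names echo bd_holder and bd_holder_only_writes
from the patch and md's suspend_lo/suspend_hi, but collapsing them into
one struct is a simplification made here, and a non-empty
[suspend_lo, suspend_hi) range is assumed to mean "a region is
suspended".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of the state the write gate would consult. */
struct write_gate {
	const void *holder;      /* exclusive holder, NULL if none */
	bool holder_only_writes; /* only the holder may write */
	uint64_t suspend_lo;     /* start of suspended region */
	uint64_t suspend_hi;     /* end of suspended region (exclusive) */
};

/* True if 'writer' may write to the device under this model. */
static bool write_permitted(const struct write_gate *g, const void *writer)
{
	/* No holder, or the holder did not ask for exclusivity: allow. */
	if (g->holder == NULL || !g->holder_only_writes)
		return true;
	/* Proposed relaxation: allow writes while a region is suspended. */
	if (g->suspend_lo < g->suspend_hi)
		return true;
	/* Otherwise only the holder itself may write. */
	return writer == g->holder;
}
```

This allows any write while a region is suspended; restricting it to
writes inside the suspended region would need the range validation
discussed earlier.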

Still... maybe it is a starting point for thinking about the problem.

NeilBrown


diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0ff1bbf6c90e..7c469cd9febc 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2264,6 +2264,7 @@ static int lock_rdev(struct md_rdev *rdev, dev_t dev, int shared)
 		pr_warn("md: could not open %s.\n", __bdevname(dev, b));
 		return PTR_ERR(bdev);
 	}
+	bdev->bd_holder_only_writes = !shared;
 	rdev->bdev = bdev;
 	return err;
 }
@@ -2272,6 +2273,7 @@ static void unlock_rdev(struct md_rdev *rdev)
 {
 	struct block_device *bdev = rdev->bdev;
 	rdev->bdev = NULL;
+	bdev->bd_holder_only_writes = 0;
 	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
 }
 
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 93d088ffc05c..673b71bac731 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1816,10 +1816,14 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 		WARN_ON_ONCE(--bdev->bd_contains->bd_holders < 0);
 
 		/* bd_contains might point to self, check in a separate step */
-		if ((bdev_free = !bdev->bd_holders))
+		if ((bdev_free = !bdev->bd_holders)) {
+			bdev->bd_holder_only_writes = 0;
 			bdev->bd_holder = NULL;
-		if (!bdev->bd_contains->bd_holders)
+		}
+		if (!bdev->bd_contains->bd_holders) {
+			bdev->bd_contains->bd_holder_only_writes = 0;
 			bdev->bd_contains->bd_holder = NULL;
+		}
 
 		spin_unlock(&bdev_lock);
 
@@ -1884,8 +1888,13 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	loff_t size = i_size_read(bd_inode);
 	struct blk_plug plug;
 	ssize_t ret;
+	struct block_device *bdev = I_BDEV(bd_inode);
 
-	if (bdev_read_only(I_BDEV(bd_inode)))
+	if (bdev_read_only(bdev))
+		return -EPERM;
+	if (bdev->bd_holder != NULL &&
+	    bdev->bd_holder_only_writes &&
+	    bdev->bd_holder != file)
 		return -EPERM;
 
 	if (!iov_iter_count(from))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..79e3a2822867 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -424,6 +424,7 @@ struct block_device {
 	void *			bd_holder;
 	int			bd_holders;
 	bool			bd_write_holder;
+	bool			bd_holder_only_writes;
 #ifdef CONFIG_SYSFS
 	struct list_head	bd_holder_disks;
 #endif
