All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Niels de Vos <ndevos@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Bryn M. Reeves" <bmr@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH v2] fs: Invalidate the cache for a parent block-device if fsync() is called for a partition
Date: Thu, 26 Jan 2012 02:03:44 -0800	[thread overview]
Message-ID: <20120126020344.b3194916.akpm@linux-foundation.org> (raw)
In-Reply-To: <1327315109-7740-1-git-send-email-ndevos@redhat.com>

I suggest a viro cc on this one.

On Mon, 23 Jan 2012 10:38:29 +0000 Niels de Vos <ndevos@redhat.com> wrote:

> Executing an fsync() on a file-descriptor of a partition flushes the
> caches for that partition by calling blkdev_issue_flush(). However, it

The changelog is stale.

> seems that reading data through the parent device will still return the
> old cached data.
> 
> The cache for the block-device is not synced if the block-device is kept
> open (due to a mounted partition, for example). Only when all users for
> the disk have exited, the cache for the disk is made consistent again.
> 
> Calling invalidate_bdev() on the parent block-device in case
> blkdev_fsync() was called for a partition, fixes this.
> 
> The problem can be worked around by forcing the caches to be flushed
> with either
> 	# blockdev --flushbufs ${dev_disk}
> or
> 	# echo 3 > /proc/sys/vm/drop_caches

Please include your (useful) problem description in the changelog:

: The problem that was noticed is the following:
: 1) create two or more partitions on a device
:    - use fdisk to create /dev/sdb1 and /dev/sdb2
: 2) format and mount one of the partition
:    - mkfs -t ext3 /dev/sdb1
: 3) read through the main device to have something in the cache
:    - read /dev/sdb with dd or use something like "parted /dev/sdb print"
: 4) now write something to /dev/sdb2, format the partition for example
:    - mkfs -t ext3 /dev/sdb2
: 5) read the blocks where sdb2 starts, through /dev/sdb
:    - use dd or do again a "parted /dev/sdb print"
: 
: Without this patch, calling "blockdev --flushbufs" or dropping the
: caches, the result in 5) is the same as in 3). Reading the same area
: through /dev/sdb2 shows the inconsistancy between the two caches.
: 
: With this patch, or one of the workarounds, the data read through
: /dev/sdb and /dev/sdb2 is the same.

>
> ...
>
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -424,6 +424,10 @@ int blkdev_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
>  	if (error == -EOPNOTSUPP)
>  		error = 0;
>  
> +	/* invalidate parent block_device */
> +	if (!error && bdev != bdev->bd_contains)
> +		invalidate_bdev(bdev->bd_contains);
> +
>  	return error;
>  }

It doesn't seem terribly logical to do this in blkdev_fsync().  Why not
do it right there in blkdev_ioctl()'s "case BLKFLSBUF"?

Bear in mind that invalidate_bdev() isn't a very strong function -
it won't drop pages which are dirty or under writeback nor pages which
others have a reference on.  But I can see that this change would be a
best-effort user-convenience thing.


  parent reply	other threads:[~2012-01-26  9:57 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-20  1:58 [PATCH] block: Invalidate the cache for a parent block-device Mikulas Patocka
2012-01-20  9:35 ` Niels de Vos
2012-01-23 10:38   ` [PATCH v2] fs: Invalidate the cache for a parent block-device if fsync() is called for a partition Niels de Vos
2012-01-23 16:27     ` Jeff Moyer
2012-01-23 16:46       ` Niels de Vos
2012-01-23 19:23     ` Mikulas Patocka
2012-01-23 20:04     ` Jeff Moyer
2012-01-26 10:03     ` Andrew Morton [this message]
2012-01-26 11:50       ` Niels de Vos
2012-01-26 13:33         ` [PATCH v3] " Niels de Vos
2012-01-26 21:40           ` Andrew Morton
2012-01-26 21:45             ` Christoph Hellwig
2012-01-26 21:50               ` Mikulas Patocka
2012-01-27 12:19                 ` Ric Wheeler
2012-01-31 16:00               ` Niels de Vos
2012-01-31 18:58                 ` Andrew Morton
2012-01-31 19:04                   ` Christoph Hellwig
2012-01-31 19:32                     ` Andrew Morton
2012-01-31 19:37                       ` Christoph Hellwig
2012-01-31 19:48                         ` Andrew Morton
2012-01-26 21:49             ` Jeff Moyer
2012-01-26 22:13             ` Mikulas Patocka
2012-01-26 22:26               ` Kernel Oops report (Android gingerbread) Fan Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120126020344.b3194916.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=bmr@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=ndevos@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.