linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Thanos Makatos <thanos.makatos@citrix.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Ross Lagerwall <ross.lagerwall@citrix.com>,
	Felipe Franciosi <felipe.franciosi@citrix.com>
Subject: Re: invalidate the buffer heads of a block device
Date: Mon, 29 Sep 2014 20:11:15 +0200
Message-ID: <20140929181115.GG6328@quack.suse.cz>
In-Reply-To: <2368A3FCF9F7214298E53C823B0A48EC0420B08E@AMSPEX01CL02.citrite.net>

On Mon 29-09-14 16:24:35, Thanos Makatos wrote:
> I'm looking for ways to achieve _read_ caching of a block device (usually
> iSCSI) that is attached to multiple hosts. The problem is that sometimes there
> will be some writes on that block device in a particular, known host using
> O_DIRECT, and that requires all other hosts to invalidate their buffer caches
> for that block device. I'd prefer to use standard tools/procedures rather than
> hacking things, but if I do have to implement something to solve this problem
> I'd prefer it to be something that can be accepted upstream.
> 
> It looks like closing the block device guarantees that the buffer
> cache is invalidated, and I can guarantee that all _my_ processes close
> the block device to achieve this. However, if there is at least one other
> process I don't know of that is keeping an open file handle against that
> block device, the buffer cache won't be invalidated, and that will result
> in corruption. So this solution doesn't seem to work.
  Well, I wouldn't really advise depending on the buffer cache being flushed
when all openers close the device.

> I had a look at the BLKFLSBUF ioctl, which seems to be designed to do the
> job, except that it doesn't work if a process has memory mapped the block
> device, and AFAIK there's no way to disallow memory mapping of a block
> device. Again, that looks like a deal-breaker.
  Yeah, plus it doesn't work when a page is dirty / under writeback,
although that doesn't seem to be an issue for your use case.
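
For reference, a minimal userspace sketch of issuing BLKFLSBUF, with the
limitations above still applying (mmapped, dirty, or under-writeback pages
are not dropped). The helper name and the device path in the usage note are
hypothetical; the ioctl number is _IO(0x12, 97) from <linux/fs.h>, and the
call needs CAP_SYS_ADMIN on a real device.

```python
import fcntl
import os

# BLKFLSBUF is _IO(0x12, 97) in <linux/fs.h>. Python's fcntl module does not
# export it, so the numeric value is spelled out here.
BLKFLSBUF = 0x1261


def flush_block_device_buffers(path):
    """Issue BLKFLSBUF on the block device at `path` (hypothetical helper).

    Returns 0 on success, or the positive errno value on failure.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc.errno
    try:
        # The third argument is ignored by BLKFLSBUF; 0 is passed as a filler.
        fcntl.ioctl(fd, BLKFLSBUF, 0)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
```

Usage would look like flush_block_device_buffers("/dev/sdX"), where /dev/sdX
stands in for the shared device. A return of 0 only means the ioctl was
accepted, not that every cached page is gone.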

> If the above observations are correct, it seems that I have to either
> extend BLKFLSBUF to somehow invalidate such memory maps (I'm completely
> ignorant in that field, is it even possible?), or look for other
> solutions.
  Well, you could unmap the mapped pages of the block device in your new
ioctl, but I don't think it's easily possible to prevent userspace from
faulting the pages back in behind your back. And that could be a problem
for you.

> Could some configuration of dm-cache, bcache, or some other component
> solve my problem? posix_fadvise with POSIX_FADV_DONTNEED seems to be just
> a hint to the kernel, there are no guarantees that cached data will be
> discarded.  I've also thought of using a virtual block device driver that
> exclusively opens the actual network block device and lets user-space
> applications use the virtualised one so that I have more control and
> enforce things.
  Hum, I'm not aware of any approach which would do what you need.
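
To illustrate the posix_fadvise point from the quoted question: a minimal
sketch of what such a call looks like from userspace. The helper name is
hypothetical. As noted above, POSIX_FADV_DONTNEED is only a hint, so a
successful return does not prove the cached data was discarded.

```python
import os


def advise_dontneed(path):
    """Hint that cached pages for `path` are no longer needed (hypothetical helper).

    Raises OSError if the file cannot be opened or the advice is rejected.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 covers the range from the start to the end of the file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The call is advisory in both directions: the kernel may keep pages it cannot
drop, and nothing stops another process from reading them back in afterwards.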

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


Thread overview: 15+ messages
2014-09-29 16:24 invalidate the buffer heads of a block device Thanos Makatos
2014-09-29 18:11 ` Jan Kara [this message]
2014-09-30  8:58   ` Thanos Makatos
2014-09-30  9:19     ` Jan Kara
2014-09-30  9:39       ` Thanos Makatos
2014-09-30  9:55         ` Jan Kara
2014-09-30 10:11           ` Thanos Makatos
2014-09-30 10:48             ` Jan Kara
2014-09-30 20:53               ` Zach Brown
2014-09-30 21:13                 ` Trond Myklebust
2014-10-01  9:05                   ` Jan Kara
2014-10-01 11:50                     ` Trond Myklebust
2014-10-01 14:07                       ` Jan Kara
2014-10-01 14:47                         ` Trond Myklebust
2014-10-01 15:15                           ` Jan Kara
