From: Jan Kara
To: Thanos Makatos
Cc: linux-fsdevel@vger.kernel.org, Ross Lagerwall, Felipe Franciosi
Subject: Re: invalidate the buffer heads of a block device
Date: Mon, 29 Sep 2014 20:11:15 +0200
Message-ID: <20140929181115.GG6328@quack.suse.cz>
In-Reply-To: <2368A3FCF9F7214298E53C823B0A48EC0420B08E@AMSPEX01CL02.citrite.net>

On Mon 29-09-14 16:24:35, Thanos Makatos wrote:
> I'm looking for ways to achieve _read_ caching of a block device (usually
> iSCSI) that is attached to multiple hosts. The problem is that sometimes there
> will be some writes on that block device in a particular, known host using
> O_DIRECT, and that requires all other hosts to invalidate their buffer caches
> for that block device. I'd prefer to use standard tools/procedures rather than
> hacking things, but if I do have to implement something to solve this problem
> I'd prefer it to be something that can be accepted upstream.
>
> It looks like that closing the block device guarantees that the buffer
> cache is invalidated, and I can guarantee that all _my_ processes close
> the block device to achieve this. However, if there is at least one other
> process I don't know of that is keeping an open file handle against that
> block device, the buffer cache won't be invalidated, and that will result
> in corruption. So this solution doesn't seem to work.

Well, I wouldn't really advise depending on the buffer cache being flushed
when all openers close the device.

> I had a look at the BLKFLSBUF ioctl, which seems to be designed to do the
> job, except that it doesn't work if a process has memory mapped the block
> device, and AFAIK there's no way to disallow memory mapping of a block
> device. Again, that looks like a deal-breaker.

Yeah, plus it doesn't work when a page is dirty / under writeback, although
that doesn't seem to be an issue for your use case.

> If the above observations are correct, it seems that I have to either
> extend BLKFLSBUF to some invalidate such memory maps (I'm completely
> ignorant in that field, is it even possible?), or look for other
> solutions.

Well, you could unmap the mapped pages of the block device in your new
ioctl, but I don't think it's easily possible to prevent userspace from
faulting the pages back in behind your back. And that could be a problem
for you.

> Could some configuration of dm-cache, bcache, or some other component
> solve my problem? posix_fadvise with POSIX_FADV_DONTNEED seems to be just
> a hint to the kernel, there are no guarantees that cached data will be
> discarded. I've also though of using a virtual block device driver that
> exclusively opens the actual network block device and lets user-space
> applications use the virtualised one so that I have more control and
> enforce things.

Hum, I'm not aware of any approach which would do what you need.

Honza
-- 
Jan Kara
SUSE Labs, CR
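
For reference, the two userspace interfaces discussed in this thread can be exercised roughly as follows. This is only a minimal sketch, not something posted in the thread: the helper names (is_block_device, flush_buffer_cache, advise_dontneed) are made up for illustration, and the BLKFLSBUF value is hard-coded from <linux/fs.h> because Python's fcntl module does not export it. Neither call gives the guarantee asked for above: as noted in the thread, BLKFLSBUF does not cover memory-mapped or dirty/writeback pages, and POSIX_FADV_DONTNEED is only a hint.

```python
import fcntl
import os
import stat

# BLKFLSBUF is defined as _IO(0x12, 97) in <linux/fs.h>.
# Assumption: Linux ioctl encoding; the constant is hard-coded here.
BLKFLSBUF = 0x1261


def is_block_device(path):
    """Return True if path refers to a block device node."""
    return stat.S_ISBLK(os.stat(path).st_mode)


def flush_buffer_cache(path):
    """Issue BLKFLSBUF on a block device (hypothetical helper).

    The ioctl normally needs CAP_SYS_ADMIN. Per the thread, it does not
    help for pages that are memory mapped, dirty, or under writeback.
    """
    if not is_block_device(path):
        raise ValueError(f"{path} is not a block device")
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKFLSBUF, 0)
    finally:
        os.close(fd)


def advise_dontneed(path):
    """Hint that cached pages for the whole file are no longer needed.

    offset=0 and length=0 cover the entire file. This is advisory only;
    the kernel may keep any or all of the cached data.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

A caller on one of the read-caching hosts might run flush_buffer_cache("/dev/sdX") after being told that the writer host has finished, where /dev/sdX is a placeholder device path. That only narrows the window; it does not close it if another process keeps the device mapped.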