From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: [PATCH v2] Do a proper locking for mmap and block size change
Date: Thu, 29 Nov 2012 20:16:08 -0500
Message-ID: <20121130011608.GA11004@shiny.int.fusionio.com>
References: <20121129191503.GB3490@shiny> <20121129194840.GC3490@shiny> <20121129212931.GD3490@shiny>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: Chris Mason, Mikulas Patocka, Al Viro, Jens Axboe, Jeff Chua, Lai Jiangshan, Jan Kara, lkml, linux-fsdevel
To: Linus Torvalds
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu, Nov 29, 2012 at 03:36:38PM -0700, Linus Torvalds wrote:
> On Thu, Nov 29, 2012 at 2:16 PM, Linus Torvalds wrote:
> >
> > But you're right. The direct-IO code really *is* violating that, and
> > knows that get_block() ends up being defined in i_blkbits regardless
> > of b_size.
>
> It turns out fs/ioctl.c does the same - it fills in the buffer head
> with some random bh->b_size too. I think it's not even a power of two
> in that case.
>
> And I guess it's understandable - they don't actually *use* the
> buffer, they just want the offset. So the b_size field really is just
> random crap to the users of the get_block interfaces, since they've
> never cared before.
>
> Ugh, this was definitely a dark and disgusting underbelly of the VFS
> layer. We've not had to really touch it for a *looong* time..

I searched through filemap.c for the magic i_size check that would let
us get away with ignoring i_blkbits in get_blocks, but it's just not
there. The whole fallback-to-buffered scheme seems to rely on
get_blocks checking for i_size. I really hope I'm just missing
something.

If we're going to change this, I'd vote for something non-bh based. I
didn't check every single FS, but I don't think direct-IO really wants
or needs buffer heads at all.
One less wart in direct-io.c would really be nice, but I'm assuming
it'll take us at least one full release to hammer out a shiny new
get_blocks. Passing i_blkbits would be more mechanical, since all the
filesystems would just ignore it.

-chris