From: Linus Torvalds
Subject: Re: [PATCH 2/2] hpfs: optimize quad buffer loading
Date: Tue, 28 Jan 2014 15:44:29 -0800
To: Mikulas Patocka
Cc: linux-fsdevel

On Tue, Jan 28, 2014 at 3:11 PM, Mikulas Patocka wrote:
> HPFS needs to load 4 consecutive 512-byte sectors when accessing the
> directory nodes or bitmaps. We can't switch to 2048-byte block size
> because files are allocated in the units of 512-byte sectors.

Bah, this is untrue. Well, it's true that you cannot *switch* to
another size, but the buffer head layer should be perfectly happy with
mixed sizes within a device, even if nobody happens to do it. Just
allocate a whole page, and make *that* page use 2048-byte buffers.

So you should be perfectly able to just do

    struct buffer_head *bh = __bread(dev, nr, 2048);

which gets and reads a single 2048-byte buffer head.

Now, the problem is that because nobody actually does this, I bet we
have bugs in this area, and some path ends up using
bd_inode->i_blkbits instead of the passed-in size. A very quick look
finds __find_get_block() -> __find_get_block_slow() looking bad, for
example. But I also bet that should be easy to fix.

In fact, I think the only reason we use "i_blkbits" there is that it
avoids a division (and nobody had a *reason* to do otherwise), but
since this is the "we have to do IO" path, just passing in the size
and then using sector_div() is a no-brainer from a performance
standpoint, I think. So fixing that problem looks like a couple of
lines.

Now, another issue is that with multiple block sizes it's up to the
filesystem to guarantee that there is no aliasing between two physical
blocks (e.g. a "2048-byte block at offset 1" vs a "512-byte buffer
head at offset 4"). But if the aliasing is fairly naturally avoided at
the FS level (and if this is done only for particular parts of the
filesystem, that should be very natural), that shouldn't be a problem
either.

So I'd actually much rather see us take advantage of multiple buffer
sizes and use a *native* 2k buffer head when it makes sense, than this
odd "let's allocate them, and then maybe they are all properly aligned
anyway" kind of voodoo programming.

Would you be willing to try?

              Linus
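
To make the first suggestion concrete, here is a minimal sketch (not
code from this thread) of what an HPFS-side helper could look like if
it read one native 2048-byte buffer head instead of four 512-byte
ones. The name hpfs_map_4sectors_native and the exact call sites are
assumptions, and it relies on the __find_get_block_slow() path
described above being fixed so that the passed-in size is actually
honoured:

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    /*
     * Hypothetical helper, not from the thread: map an HPFS dnode or
     * bitmap as one native 2048-byte buffer head.  secno is a 512-byte
     * sector number and must be 2048-byte aligned, as dnodes and
     * bitmaps are on disk.
     */
    static void *hpfs_map_4sectors_native(struct super_block *s,
                                          unsigned secno,
                                          struct buffer_head **bhp)
    {
            struct buffer_head *bh;

            if (secno & 3)          /* not 2048-byte aligned */
                    return NULL;

            /* the block number is in units of the requested size */
            bh = __bread(s->s_bdev, secno >> 2, 2048);
            if (!bh)
                    return NULL;

            *bhp = bh;
            return bh->b_data;
    }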
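
For the lookup path itself, a rough sketch (again an assumption, not a
patch from the thread) of the index calculation such a fix would need:
instead of shifting by PAGE_CACHE_SHIFT - bd_inode->i_blkbits, derive
the page-cache index from the buffer size the caller passed in, using
sector_div() as suggested. block_to_page_index is a hypothetical
helper name:

    #include <linux/pagemap.h>
    #include <linux/genhd.h>
    #include <asm/div64.h>

    /*
     * Hypothetical helper, not from the thread: compute the page-cache
     * index for a block of the given size instead of trusting the
     * device-wide bd_inode->i_blkbits.  Buffer sizes never exceed a
     * page, so PAGE_CACHE_SIZE / size is at least 1.  sector_div()
     * divides the sector_t in place by a 32-bit value and returns the
     * remainder.
     */
    static pgoff_t block_to_page_index(sector_t block, unsigned size)
    {
            sector_t index = block;

            /* e.g. 4096 / 2048 = 2 blocks per page */
            sector_div(index, PAGE_CACHE_SIZE / size);
            return index;
    }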