From mboxrd@z Thu Jan 1 00:00:00 1970 From: Linus Torvalds Subject: Re: [PATCH 2/2] hpfs: optimize quad buffer loading Date: Thu, 30 Jan 2014 09:59:06 -0800 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: linux-fsdevel To: Mikulas Patocka Return-path: Received: from mail-vc0-f172.google.com ([209.85.220.172]:64071 "EHLO mail-vc0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751468AbaA3R7I (ORCPT ); Thu, 30 Jan 2014 12:59:08 -0500 Received: by mail-vc0-f172.google.com with SMTP id lf12so2303130vcb.17 for ; Thu, 30 Jan 2014 09:59:07 -0800 (PST) In-Reply-To: Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Wed, Jan 29, 2014 at 7:05 AM, Mikulas Patocka wrote: > > Suppose that 8 consecutive sectors on the disk contain this data: > dnode (4 sectors) > fnode (1 sector) > file content (3 sectors) > --- now, you can't access that fnode using 2kB buffer, if you did and if > you marked that buffer dirty, you damage file content. > > So you need different-sized buffers on one page. No. You're missing the whole point. "consecutive sectors" does not mean "same page". The page cache doesn't care. It never has. Non-consecutive sectors is common for normal file mappings. The *buffer* cache doesn't really care either, and in fact that non-consecutive case used to be the common one (very much even for raw disk accesses, exactly because things *used* to be coherent with a mounted filesystem - so if there were files that had populated part of the buffer cache with their non-consecutive sectors, the raw disk access would just use those non-consecutive sectors). And all that worked because we'd just look up the buffer head in the hashes. The page it was on didn't matter. The problem is that (not *that* long ago, relatively speaking) we have castrated the buffer cache so much (because almost nobody really uses it any more) that now it's really a slave of the page cache, and we got rid of the buffer head hashes entirely. So now we look up the buffer heads using the page cache, and *that* causes the problems (and forces us to put those buffer heads in the same page, because we index by page). We can actually still just create such non-consecutive buffers and do IO on them, we just can't look them up any more. Linus