From mboxrd@z Thu Jan  1 00:00:00 1970
From: Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 2/2] hpfs: optimize quad buffer loading
Date: Thu, 30 Jan 2014 09:59:06 -0800
Message-ID: <CA+55aFx5Wk5zbvsJbjztt4RV69xz-s43oyPhHdZS6wjMBDJ51g@mail.gmail.com>
References: <alpine.DEB.2.02.1401290008430.26781@artax.karlin.mff.cuni.cz>
	<alpine.DEB.2.02.1401290010560.26781@artax.karlin.mff.cuni.cz>
	<CA+55aFzPr516KWYeAA4yeOab29bQBvgee7fmUdBEwBhSa76Qjg@mail.gmail.com>
	<alpine.DEB.2.02.1401290131480.11973@artax.karlin.mff.cuni.cz>
	<CA+55aFwKyBq_m5E7PLm-FBrn3NSFyXrhmn5SCT=7x93emoAhCw@mail.gmail.com>
	<alpine.DEB.2.02.1401291551160.15720@artax.karlin.mff.cuni.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
To: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from mail-vc0-f172.google.com ([209.85.220.172]:64071 "EHLO
	mail-vc0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751468AbaA3R7I (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Thu, 30 Jan 2014 12:59:08 -0500
Received: by mail-vc0-f172.google.com with SMTP id lf12so2303130vcb.17
        for <linux-fsdevel@vger.kernel.org>; Thu, 30 Jan 2014 09:59:07 -0800 (PST)
In-Reply-To: <alpine.DEB.2.02.1401291551160.15720@artax.karlin.mff.cuni.cz>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Wed, Jan 29, 2014 at 7:05 AM, Mikulas Patocka
<mikulas@artax.karlin.mff.cuni.cz> wrote:
>
> Suppose that 8 consecutive sectors on the disk contain this data:
> dnode (4 sectors)
> fnode (1 sector)
> file content (3 sectors)
> --- now, you can't access that fnode using 2kB buffer, if you did and if
> you marked that buffer dirty, you damage file content.
>
> So you need different-sized buffers on one page.

No. You're missing the whole point.

"consecutive sectors" does not mean "same page".

The page cache doesn't care. It never has. Non-consecutive sectors is
common for normal file mappings.

The *buffer* cache doesn't really care either, and in fact that
non-consecutive case used to be the common one (very much even for raw
disk accesses, exactly because things *used* to be coherent with a
mounted filesystem - so if there were files that had populated part of
the buffer cache with their non-consecutive sectors, the raw disk
access would just use those non-consecutive sectors).

And all that worked because we'd just look up the buffer head in the
hashes. The page it was on didn't matter.

The problem is that (not *that* long ago, relatively speaking) we have
castrated the buffer cache so much (because almost nobody really uses
it any more) that now it's really a slave of the page cache, and we
got rid of the buffer head hashes entirely. So now we look up the
buffer heads using the page cache, and *that* causes the problems (and
forces us to put those buffer heads in the same page, because we index
by page).

We can actually still just create such non-consecutive buffers and do
IO on them, we just can't look them up any more.

                  Linus