From: Daniel Phillips
To: Richard Gooch, Linus Torvalds
Cc: Kernel Mailing List
Subject: Re: Getting FS access events
Date: Tue, 15 May 2001 02:42:16 +0200
Message-Id: <01051502421603.24410@starship>
In-Reply-To: <200105142319.f4ENJpf19203@vindaloo.ras.ucalgary.ca>

On Tuesday 15 May 2001 01:19, Richard Gooch wrote:
> Linus Torvalds writes:
> > On Sun, 13 May 2001, Richard Gooch wrote:
> > > So, why can't the page cache check if a block is in the buffer
> > > cache?
> >
> > Because it would make the damn thing slower.
> >
> > The whole point of the page cache is to be FAST FAST FAST. The
> > reason we _have_ a page cache is that the buffer cache is slow and
> > inefficient, and it will always remain so.
>
> Is there some fundamental reason why a buffer cache can't ever be
> fast?

Just looking at getblk, it takes one more lock than read_cache_page
(these are noops in UP) and otherwise has very nearly the same sequence
of operations. This can't be the slowness he's talking about.

I know of three ways the buffer cache earned its reputation for
slowness:

  1) There used to be a copy from the buffer cache to page cache on
     every write, to keep the two in sync

  2) Having the same data in both the buffer and page cache created
     extra memory pressure

  3) To get at file data through the buffer cache you have to traverse
     all the index blocks every time, whereas with the logically-indexed
     page cache you go straight to the page data, if it's there, and in
     theory[1], only up as many levels of index as you have to.

Once you have looked into the page cache and know the page isn't there
you know you are going to have to read it. At this point, the overhead
of hashing into, say, the buffer cache to see if the block is there is
trivial. Just one saved read by doing that will be worth hundreds of
hash lookups.

But why use the buffer cache? The page cache will work perfectly well
for this.

There's a big saving in using a block cache for readahead instead of
file-oriented readahead: if we guess wrong and don't actually need the
readahead blocks then we paid less to get them - we didn't call into
the filesystem to map each one.

Additionally, a block cache can do things that file readahead can't, as
you showed in your example:

>    - inode at block N
>    - indirect block at N+k+j
>    - data block at N+k

Another example is where you have blocks from two different files mixed
together, and you read both of those files.

Note that your scsi disk controller is keeping a cache for you over on
its side of the bus. This erodes the benefit of the block cache
somewhat, but the same argument applies to file readahead. For all
people who don't have scsi the block cache would be a big win.

[1] This remains theoretical until we get the indirect blocks into the
page cache.

--
Daniel
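
The lookup order argued for in the message above (page cache first, then
a cheap hash probe of a physically-indexed block cache, and only then a
real read) can be sketched as a toy model in C. This is not kernel code:
every name here (toy_caches, read_page, prefetch_block, CACHE_SLOTS) is
hypothetical, the caches are simple direct-mapped tables, and the caller
is assumed to supply the physical block number that the filesystem
mapping would normally produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names are illustrative, none are kernel APIs. */
#define CACHE_SLOTS 64

struct toy_page  { bool used; unsigned long inode, index; };
struct toy_block { bool used; unsigned long block; };

struct toy_caches {
    struct toy_page  pages[CACHE_SLOTS];   /* keyed by (inode, page index) */
    struct toy_block blocks[CACHE_SLOTS];  /* keyed by physical block number */
    unsigned long demand_reads;            /* reads that had to go to disk */
};

enum source { FROM_PAGE_CACHE, FROM_BLOCK_CACHE, FROM_DISK };

static size_t page_slot(unsigned long inode, unsigned long index)
{
    return (size_t)((inode * 31UL + index) % CACHE_SLOTS);
}

static bool page_cached(const struct toy_caches *c,
                        unsigned long inode, unsigned long index)
{
    const struct toy_page *p = &c->pages[page_slot(inode, index)];
    return p->used && p->inode == inode && p->index == index;
}

static bool block_cached(const struct toy_caches *c, unsigned long block)
{
    const struct toy_block *b = &c->blocks[block % CACHE_SLOTS];
    return b->used && b->block == block;
}

/* Block-level readahead: no per-file mapping is needed to fetch a block. */
static void prefetch_block(struct toy_caches *c, unsigned long block)
{
    struct toy_block *b = &c->blocks[block % CACHE_SLOTS];
    b->used = true;
    b->block = block;
}

/* Look up a page; on a miss, probe the block cache before reading. */
static enum source read_page(struct toy_caches *c, unsigned long inode,
                             unsigned long index, unsigned long block)
{
    enum source src;

    assert(c != NULL);
    if (page_cached(c, inode, index))
        return FROM_PAGE_CACHE;

    /* We already know a read is needed, so one extra hash probe is cheap. */
    if (block_cached(c, block)) {
        src = FROM_BLOCK_CACHE;
    } else {
        c->demand_reads++;
        src = FROM_DISK;
    }

    /* Either way the page ends up in the page cache. */
    struct toy_page *p = &c->pages[page_slot(inode, index)];
    p->used = true;
    p->inode = inode;
    p->index = index;
    return src;
}
```

With a zero-initialized struct toy_caches, calling prefetch_block() for a
block and then read_page() for the page that maps to it is served from
the block cache without incrementing demand_reads, which is the saved
read the argument relies on. A wrong readahead guess costs only the
prefetch itself, since no per-file mapping was done to fetch the block.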