public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Linus Torvalds <torvalds@transmeta.com>, linux-kernel@vger.kernel.org
Subject: Re: DVD blockdevice buffers
Date: 25 May 2001 09:09:37 -0600
Message-ID: <m1bsohh3da.fsf@frodo.biederman.org>
In-Reply-To: <20010523205748.L8080@redhat.com> <Pine.LNX.4.31.0105231258420.6642-100000@penguin.transmeta.com> <20010524123627.L27177@redhat.com>
In-Reply-To: "Stephen C. Tweedie"'s message of "Thu, 24 May 2001 12:36:27 +0100"

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
> 
> On Wed, May 23, 2001 at 01:01:56PM -0700, Linus Torvalds wrote:
>  
> > On Wed, 23 May 2001, Stephen C. Tweedie wrote:
> > > > that the filesystems already do. And you can do it a lot _better_ than the
> > > > current buffer-cache-based approach. Done right, you can actually do all
> > > > IO in page-sized chunks, BUT fall down on sector-sized things for the
> > > > cases where you want to.
> > >
> > > Right, but you still lose the caching in that case.  The write works,
> > > but the "cache" becomes nothing more than a buffer.
> > 
> > No. It is still cached. You find the buffer with "page->buffer", and when
> > all of them are up-to-date (whether from read-in or from having written
> > to them all), you just mark the whole page up-to-date.
> 
> It works, but *only* if the application writes a whole page worth of
> data.  From the previous emails I had the understanding that this
> application is writing small data items in random 512-byte blocks.  It
> is not writing the rest of the page.  The page never becomes uptodate.
> That in itself isn't a problem, but readpage() can't tell the
> underlying layers that only a part of the page is wanted, so there's
> no way to tell readpage that the page is in fact partially uptodate.
> 
> And just telling the application to write the rest of the page too
> isn't going to cut it, because the rest of the page may contain other
> objects which aren't in cache so we can't write them without first
> reading the page.  The only alternative is to change the on-disk
> layout, forcing a minimum PAGESIZE on the IO chunks.
> 
> > This _works_. Try it on ext2 or NFS today.
> 
> Not for this workload.  Now, maybe it's not an interesting workload.
> But shifting the uptodate granularity from buffer to page sized _does_
> impact the effectiveness of the cache for such an application. 
> 
> > So in short: the page cache supports _today_ all the optimizations.
> 
> For write, perhaps; but for subsequent read, generic_read_page
> doesn't see any of the data in the page unless the whole page has been
> written.

generic_read_page???

block_read_full_page seems to handle this correctly.  At least
with respect to keeping the data around, and not doing the I/O
on data we already have.  But it still reads in the unpopulated
parts of the page even if it is unnecessary.

The case we don't get quite right is a partial read that hits cached
data on a page that doesn't have PG_uptodate set.  We don't actually
need to do the I/O on the rest of the page to satisfy the read
request.  But we do, because generic_file_read doesn't even think about
that case.

For the small random read case we could use a
mapping->a_ops->readpartialpage
function that checks whether a request can be satisfied entirely
from cached data.  But this is just to allow generic_file_read
to handle this case.
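
To make the idea concrete, here is a rough userspace model, not kernel
code.  It assumes 4096-byte pages and 512-byte buffers, and every name
in it (model_page, read_full_page, read_partial_page_cached) is made up
for illustration.  The first function models the behaviour described
above for block_read_full_page: it skips buffers that are already
uptodate but still reads every other buffer in the page.  The second
models what a readpartialpage hook might check: whether every buffer
the request touches is already uptodate, so no I/O is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE   4096u  /* assumed page size */
#define MODEL_BLOCK_SIZE   512u  /* assumed buffer (sector) size */
#define MODEL_BLOCKS      (MODEL_PAGE_SIZE / MODEL_BLOCK_SIZE)

/* Hypothetical stand-in for a page cache page with attached buffers. */
struct model_page {
	bool page_uptodate;                 /* models the page-level uptodate flag */
	bool buffer_uptodate[MODEL_BLOCKS]; /* models per-buffer uptodate state */
};

/*
 * Model of a full-page read: buffers that are already uptodate are
 * skipped, every other buffer is "read".  Returns the number of
 * buffers that needed I/O and leaves the whole page uptodate.
 */
static unsigned read_full_page(struct model_page *pg)
{
	unsigned reads = 0;

	assert(pg != NULL);
	for (size_t i = 0; i < MODEL_BLOCKS; i++) {
		if (!pg->buffer_uptodate[i]) {
			pg->buffer_uptodate[i] = true; /* pretend the I/O completed */
			reads++;
		}
	}
	pg->page_uptodate = true;
	return reads;
}

/*
 * Model of the proposed partial-page check: returns true when the byte
 * range [offset, offset + len) lies within the page and every buffer it
 * touches is already uptodate, so the read needs no I/O at all.
 */
static bool read_partial_page_cached(const struct model_page *pg,
				     size_t offset, size_t len)
{
	assert(pg != NULL);
	if (len == 0 || offset >= MODEL_PAGE_SIZE ||
	    len > MODEL_PAGE_SIZE - offset)
		return false;
	if (pg->page_uptodate)
		return true;

	size_t first = offset / MODEL_BLOCK_SIZE;
	size_t last = (offset + len - 1) / MODEL_BLOCK_SIZE;

	for (size_t i = first; i <= last; i++) {
		if (!pg->buffer_uptodate[i])
			return false;
	}
	return true;
}
```

With only the third buffer uptodate, a 512-byte read at offset 1024
passes the partial check, while the full-page path would still read the
other seven buffers.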

Eric
