From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Linus Torvalds <torvalds@osdl.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, mason@suse.com,
	andrea@suse.de, hugh@veritas.com, axboe@suse.de
Subject: Re: [rfc][patch] remove racy sync_page?
Date: Wed, 31 May 2006 10:32:58 +1000
Message-ID: <447CE43A.6030700@yahoo.com.au>
In-Reply-To: <Pine.LNX.4.64.0605301041200.5623@g5.osdl.org>

Linus Torvalds wrote:
> 
> On Tue, 30 May 2006, Nick Piggin wrote:
> 
>>For workloads where plugging helps (ie. lots of smaller, contiguous
>>requests going into the IO layer), the request pattern should be
>>pretty good without plugging these days, due to multiple page
>>readahead and writeback.
> 
> 
> No.
> 
> That's fundamentally wrong.
> 
> The fact is, plugging is not about read-ahead and writeback. It's very 
> fundamentally about the _boundaries_ between multiple requests, and in 
> particular the time when the queue starts out empty so that we can build 
> up things for devices that want big requests, but even more so for devices 
> where _seeking_ is very expensive.
> 
> Those boundaries haven't gone anywhere. The fact that we do read-ahead and 
> write-back in chunks doesn't change anything: yes, we often have the "big 
> requests" thing handled, but (a) not always and (b) upper layers 
> fundamentally don't fix the seek issues.

The requests can only get merged if contiguous requests from the upper
layers come down, right?

So in a random IO workload, plugging is unlikely to help at all. In a
contiguous IO workload, mpage should take *some* of the burden off
plugging. But OK, it turns out not always, I accept that.
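
As a rough illustration of the merging point, here is a toy model of a plugged queue: requests accumulate while the queue is plugged, and on unplug they are sorted and adjacent ones are coalesced. The names `Request`, `PluggedQueue`, `submit` and `unplug` are invented for this sketch; this is not the kernel's block layer code, which merges at submission time and is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: int  # first sector of the request
    count: int  # number of sectors

    @property
    def end(self):
        return self.start + self.count


class PluggedQueue:
    """Toy model: hold requests while plugged, merge them on unplug."""

    def __init__(self):
        self.pending = []

    def submit(self, start, count):
        # While plugged, nothing is sent to the device yet.
        self.pending.append(Request(start, count))

    def unplug(self):
        # Sort by start sector, then coalesce requests that touch end-to-start.
        merged = []
        for req in sorted(self.pending, key=lambda r: r.start):
            if merged and merged[-1].end == req.start:
                merged[-1].count += req.count
            else:
                merged.append(Request(req.start, req.count))
        self.pending = []
        return [(r.start, r.count) for r in merged]
```

In this model three contiguous 8-sector requests leave the queue as a single 24-sector request, while three scattered requests leave as three separate requests, only reordered.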



> 
> I want to know that the block layer could - if we wanted to - do things 
> like read-ahead for many distinct files, and for metadata. We don't 
> currently do much of that yet, but the point is, plugging _allows_ us to. 
> Exactly because it doesn't depend on upper layers feeding everything in 
> one go.
> 
> Look at "sys_readahead()", and realize that it can be used to start IO for 
> read ahead _across_many_small_files_. Last I tried it, it was hugely 
> faster at populating the page cache than reading individual files (I used 
> to do it with BK to bring everything into cache so that the regular ops 
> would be faster - now git doesn't much need it).
> 
> And maybe it was just my imagination, but the disk seemed quieter too. It 
> should be able to do better seek patterns at the beginning due to plugging 
> (ie we won't start IO after the first file, but after the request queue 
> fills up or something else needs to wait and we do an unplug event).
> 
> THAT is what plugging is good for. Our read-ahead does well for large 
> requests, and that's important for some disk controllers in particular. 
> But plugging is about avoiding starting the IO too early.
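
A minimal sketch of the "read ahead across many small files" idea quoted above. The Python standard library has no direct wrapper for the readahead() system call, so this uses `os.posix_fadvise` with `POSIX_FADV_WILLNEED` as a stand-in, on the assumption that it is close enough to show the pattern. The helper name `prefetch_files` is made up for this example.

```python
import os


def prefetch_files(paths):
    """Hint the kernel to start reading each file into the page cache.

    Returns the number of files for which a hint was issued.
    """
    queued = 0
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            size = os.fstat(fd).st_size
            if size:
                # Stand-in for readahead(): advise that the whole file
                # will be needed soon, without reading it ourselves.
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
                queued += 1
        finally:
            os.close(fd)
    return queued
```

The point of the pattern is that the caller never waits on any single file's contents, so IO for many files can be queued before the first result is needed.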

Why would plugging help if the requests can't get merged, though?
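
One way to read the seek argument quoted above is that ordering alone can matter even when nothing merges. The sketch below compares head travel for dispatching requests in arrival order against dispatching the same batch sorted by position. It is a deliberately crude model: `head_travel` and the sector numbers are invented, and a real queue without plugging would still sort whatever is waiting behind the request in flight, so the two cases here are extremes.

```python
def head_travel(sectors, start=0):
    """Total distance the head moves when visiting sectors in order."""
    pos, total = start, 0
    for sector in sectors:
        total += abs(sector - pos)
        pos = sector
    return total


# Hypothetical, non-contiguous requests in the order they arrive.
arrival = [900, 10, 850, 20, 800]

fifo = head_travel(arrival)             # dispatch each request as it arrives
batched = head_travel(sorted(arrival))  # hold the batch, then dispatch sorted
```

For this arrival order the model gives 4240 units of travel for arrival-order dispatch and 900 for the sorted batch, with no two requests being mergeable.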

> 
> Think about the TCP plugging (which is actually newer, but perhaps easier 
> to explain): it's useful not for the big file case (just use large reads 
> and writes), but for the "different sources" case - for handling the gap 
> between a header and the actual file contents. Exactly because it plugs in 
> _between_ events. 

TCP plugging is a bit different because there is no page cache between
the application and the device; and it is stream based so everything can
be merged (within a single socket).
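
For comparison, a small sketch of the header-plus-body case, assuming "TCP plugging" here refers to the Linux `TCP_CORK` socket option (the message does not name it). The helper `send_corked` is hypothetical; `socket.TCP_CORK` is only available on Linux.

```python
import socket


def send_corked(sock, header, body):
    """Send header and body from separate buffers under one cork."""
    # Cork: ask the stack to hold back partial frames.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        sock.sendall(header)
        sock.sendall(body)
    finally:
        # Uncork: anything still held back is sent now.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)
```

Because the socket is a byte stream, the two writes can always be combined, which is the difference from the block case noted above.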

It is the same high-level concept, I agree, but I never said the concept
was wrong; I just hoped that, as a heuristic, block plugging was no longer
useful. I've been set straight about that though ;)

-- 
SUSE Labs, Novell Inc.


