* early flushing of data
From: Hans Reiser @ 2002-08-12 13:52 UTC
To: reiserfs-list
I just thought that discussion of this might interest the list.
It is interesting to consider whether one should flush dirty nodes to
disk with only a small delay after they are modified, or keep them in
cache for a longer time.
A small delay has an advantage for the typical small benchmark, and for
medium length tasks. If there is poor utilization of the write cache,
then the sooner one gets started on flushing something to disk, the
more megabytes the disk can write before the benchmark ends.
On the other hand, for a loaded server with a reasonably stable load
with real users not benchmarks, the longer data stays in cache the more
likely the write won't be needed at all. One can argue that real users
don't create stable loads.....
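The cache-absorption side of this trade-off can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name disk_writes, the fixed-delay flush policy, and the workload are all hypothetical and are not taken from ReiserFS. It counts how many writes reach the disk when the same block is modified repeatedly and each dirty block is flushed a fixed delay after it first became dirty.

```python
def disk_writes(events, delay):
    """Count writes that reach disk under a fixed flush delay (toy model).

    events: list of (time, block) modifications, sorted by time.
    A dirty block is flushed once `delay` time units have passed since it
    first became dirty; modifications arriving before that are absorbed
    by the cache and cost no extra disk write.
    """
    dirty_since = {}  # block -> time at which it first became dirty
    writes = 0
    for t, block in events:
        # Flush every block whose delay has expired by the time of this event.
        for b, since in list(dirty_since.items()):
            if t - since >= delay:
                writes += 1
                del dirty_since[b]
        # Mark the block dirty (keeps the original time if already dirty).
        dirty_since.setdefault(block, t)
    # Anything still dirty at the end has to be written eventually.
    return writes + len(dirty_since)


# Hypothetical workload: one block is rewritten once per time unit.
events = [(t, "block-A") for t in range(10)]
for delay in (1, 5, 100):
    print(delay, disk_writes(events, delay))
```

In this model a longer delay means fewer disk writes for a hot block. The model ignores the other side of the argument, namely that a short benchmark ends sooner if the disk starts writing early.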
Your thoughts are welcome.
--
Hans
* Re: early flushing of data
From: Hendrik Visage @ 2002-08-12 14:05 UTC
To: Hans Reiser; +Cc: reiserfs-list
On Mon, Aug 12, 2002 at 05:52:35PM +0400, Hans Reiser wrote:
> I just thought that discussion of this might interest the list.
>
> It is interesting to consider whether one should flush dirty nodes to
> disk with only a small delay after they are modified, or keep them in
> cache for a longer time.
[snip]
> Your thoughts are welcome.
Being in love with buttons and dials, I'd vote for a configurable flush
setting.
I have a case where one email server (basically a hop in the queue) would do
well with longer flush delays, as the data won't sit there long before it's
shipped out again (given no failure conditions), while I have others where
I'd like the data on disk ASAP, as it'll take a while before it gets
moved along again.
Thus a configurable setting (I would love/prefer a per-mount setting, i.e.
one for my log mount, another for the email mounts, etc.) would
be nice.
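A per-mount setting of this kind could look roughly like the sketch below. Everything in it is an assumption made for illustration: the mount points, the delay values, the default, and the names FLUSH_DELAY_BY_MOUNT and flush_delay_for are hypothetical and do not correspond to any existing ReiserFS mount option.

```python
DEFAULT_FLUSH_DELAY_S = 5.0  # hypothetical default, not a real ReiserFS value

# Hypothetical per-mount overrides, in seconds.
FLUSH_DELAY_BY_MOUNT = {
    "/var/log": 1.0,           # logs: get them to disk quickly
    "/var/spool/relay": 30.0,  # relay hop: mail leaves again soon
    "/var/spool/mail": 1.0,    # final delivery: mail stays for a while
}


def flush_delay_for(path):
    """Return the flush delay of the longest mount point containing `path`."""
    best = None
    for mount in FLUSH_DELAY_BY_MOUNT:
        inside = path == mount or path.startswith(mount.rstrip("/") + "/")
        if inside and (best is None or len(mount) > len(best)):
            best = mount
    if best is None:
        return DEFAULT_FLUSH_DELAY_S
    return FLUSH_DELAY_BY_MOUNT[best]


print(flush_delay_for("/var/spool/relay/queue/msg1"))
print(flush_delay_for("/home/user/file"))
```

The lookup matches whole path components, so a path such as /var/logs falls back to the default instead of picking up the /var/log setting.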
* Re: early flushing of data
From: Hans Reiser @ 2002-08-12 14:14 UTC
To: Hendrik Visage; +Cc: reiserfs-list
Hendrik Visage wrote:
>On Mon, Aug 12, 2002 at 05:52:35PM +0400, Hans Reiser wrote:
>
>>I just thought that discussion of this might interest the list.
>>
>>It is interesting to consider whether one should flush dirty nodes to
>>disk with only a small delay after they are modified, or keep them in
>>cache for a longer time.
>>
>[snip]
>
>>Your thoughts are welcome.
>
>Being in love with buttons and dials, I'd vote for a configurable flush
>setting.
>
>I have a case where one email server (basically a hop in the queue) would do
>well with longer flush delays, as the data won't sit there long before it's
>shipped out again (given no failure conditions), while I have others where
>I'd like the data on disk ASAP, as it'll take a while before it gets
>moved along again.
>
>Thus a configurable setting (I would love/prefer a per-mount setting, i.e.
>one for my log mount, another for the email mounts, etc.) would
>be nice.
>
Yes, but what should the default be (remember that 97% of users or more
won't change the default)?
--
Hans
* Re: Opportunistic early flushing of data
From: Xuan Baldauf @ 2002-08-12 14:21 UTC
To: Hans Reiser; +Cc: reiserfs-list
Hans Reiser wrote:
> I just thought that discussion of this might interest the list.
>
> It is interesting to consider whether one should flush dirty nodes to
> disk with only a small delay after they are modified, or keep them in
> cache for a longer time.
>
> A small delay has an advantage for the typical small benchmark, and for
> medium length tasks. If there is poor utilization of the write cache,
> then the sooner one gets started on flushing something to disk, the
> more megabytes the disk can write before the benchmark ends.
>
> On the other hand, for a loaded server with a reasonably stable load
> with real users not benchmarks, the longer data stays in cache the more
> likely the write won't be needed at all. One can argue that real users
> don't create stable loads.....
>
> Your thoughts are welcome.
>
> --
> Hans
What about opportunistic early flushing? If the disk is not being accessed, we
can flush without using someone else's resources. This cannot be implemented
for all cases (because during the write, someone else may want to flush
immediately), but there may be heuristics (like "we flush opportunistically
if the disk has not been used for the last 100ms") which approximate the
optimal case. The busier the disk is, the longer the flush should be delayed.
This way, we combine the advantage of early flush (early commit for database
servers, etc.) with the advantage of late flush (less IO traffic). If
there is low IO traffic, we do not need to lower it, so we can flush
early. If there is high IO traffic, an early flush would raise it even further,
while a late flush would lower it.
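As a rough sketch, the idle-disk heuristic could be expressed as below. The 100ms threshold is the figure from the message; the upper bound MAX_DIRTY_AGE_S and the function name should_flush are assumptions added for illustration, so that data does not stay dirty forever on a permanently busy disk.

```python
IDLE_THRESHOLD_S = 0.100  # the "100ms" figure from the message
MAX_DIRTY_AGE_S = 30.0    # hypothetical upper bound on how long data stays dirty


def should_flush(now, last_disk_io, oldest_dirty):
    """Decide whether to flush dirty nodes at time `now` (all times in seconds).

    last_disk_io: time of the most recent request to the disk, from any source.
    oldest_dirty: time the oldest dirty node was modified, or None if clean.
    """
    if oldest_dirty is None:
        return False  # nothing to flush
    if now - oldest_dirty >= MAX_DIRTY_AGE_S:
        return True   # late flush: the data has waited long enough
    # Early flush only when the disk has been idle for the threshold.
    return now - last_disk_io >= IDLE_THRESHOLD_S


print(should_flush(now=10.0, last_disk_io=9.5, oldest_dirty=9.9))
print(should_flush(now=10.0, last_disk_io=9.95, oldest_dirty=9.0))
```

With these assumptions a lightly loaded disk gets early flushes almost immediately, while a busy disk defers them until the age limit forces the write.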
It is important that the heuristics really depend on the usage patterns of the
disk, not only of the filesystem. Consider a system with a swap
partition and a filesystem partition on the same disk, under heavy memory load
and some disk load. The resulting heavy swapping would be slowed down if the
filesystem did early flushes, and the filesystem would think that it is okay to
flush early because of the lower IO throughput (which in turn is a result of
the slowed swap...).
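One way to picture this is to track activity per physical disk instead of per partition, so that swap traffic also counts against the filesystem's idle check. The class below is only a sketch: DiskActivity, its methods, and the partition names hda1 and hda2 are hypothetical and do not describe any existing kernel interface.

```python
class DiskActivity:
    """Tracks the last I/O time per physical disk, across all its partitions."""

    def __init__(self, partition_to_disk):
        self.partition_to_disk = dict(partition_to_disk)
        self.last_io = {}  # disk -> time of the most recent request

    def record_io(self, partition, now):
        """Note a request to `partition`; it counts for the whole disk."""
        disk = self.partition_to_disk[partition]
        self.last_io[disk] = max(now, self.last_io.get(disk, now))

    def idle_for(self, partition, now):
        """Seconds since any partition on the same disk was last accessed."""
        disk = self.partition_to_disk[partition]
        if disk not in self.last_io:
            return float("inf")  # never accessed: treat as idle
        return now - self.last_io[disk]


# Hypothetical layout: hda1 is swap and hda2 is the filesystem, on one disk.
activity = DiskActivity({"hda1": "hda", "hda2": "hda"})
activity.record_io("hda1", now=10.00)  # swap traffic
# The filesystem asks about its own partition but sees the swap activity.
print(activity.idle_for("hda2", now=10.05) >= 0.100)
```

Because the swap request is recorded against the shared disk, the filesystem's idle check fails here even though its own partition saw no traffic.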
Xuân.
* Re: early flushing of data
From: Hendrik Visage @ 2002-08-12 14:24 UTC
To: Hans Reiser; +Cc: Hendrik Visage, reiserfs-list
On Mon, Aug 12, 2002 at 06:14:51PM +0400, Hans Reiser wrote:
> Hendrik Visage wrote:
> >On Mon, Aug 12, 2002 at 05:52:35PM +0400, Hans Reiser wrote:
> >>
> >>It is interesting to consider whether one should flush dirty nodes to
> >>disk with only a small delay after they are modified, or keep them in
> >>cache for a longer time.
> >>
> >Thus a configurable setting (I would love/prefer a per-mount setting, i.e.
> >one for my log mount, another for the email mounts, etc.) would
> >be nice.
> >
> Yes, but what should the default be (remember that 97% of users or more
> won't change the default)?
To this I'd ask: what are the settings in XFS and JFS?
Seeing as they have "experience" behind them, that would be a good starting point.
* Re: Opportunistic early flushing of data
From: Hans Reiser @ 2002-08-12 14:32 UTC
To: Xuan Baldauf; +Cc: reiserfs-list, Daniel Phillips
Xuan Baldauf wrote:
>Hans Reiser wrote:
>
>>I just thought that discussion of this might interest the list.
>>
>>It is interesting to consider whether one should flush dirty nodes to
>>disk with only a small delay after they are modified, or keep them in
>>cache for a longer time.
>>
>>A small delay has an advantage for the typical small benchmark, and for
>>medium length tasks. If there is poor utilization of the write cache,
>>then the sooner one gets started on flushing something to disk, the
>>more megabytes the disk can write before the benchmark ends.
>>
>>On the other hand, for a loaded server with a reasonably stable load
>>with real users not benchmarks, the longer data stays in cache the more
>>likely the write won't be needed at all. One can argue that real users
>>don't create stable loads.....
>>
>>Your thoughts are welcome.
>>
>>--
>>Hans
>
>What about opportunistic early flushing? If the disk is not being accessed, we
>can flush without using someone else's resources. This cannot be implemented
>for all cases (because during the write, someone else may want to flush
>immediately), but there may be heuristics (like "we flush opportunistically
>if the disk has not been used for the last 100ms") which approximate the
>optimal case. The busier the disk is, the longer the flush should be delayed.
>
>This way, we combine the advantage of early flush (early commit for database
>servers, etc.) with the advantage of late flush (less IO traffic). If
>there is low IO traffic, we do not need to lower it, so we can flush
>early. If there is high IO traffic, an early flush would raise it even further,
>while a late flush would lower it.
>
>It is important that the heuristics really depend on the usage patterns of the
>disk, not only of the filesystem. Consider a system with a swap
>partition and a filesystem partition on the same disk, under heavy memory load
>and some disk load. The resulting heavy swapping would be slowed down if the
>filesystem did early flushes, and the filesystem would think that it is okay to
>flush early because of the lower IO throughput (which in turn is a result of
>the slowed swap...).
>
>Xuân.
>
Xuan, I think you have the right approach here. I think Daniel Phillips
may have outlined something similar some time ago on lkml; let's
ask him. :)
--
Hans