From: Xuan Baldauf <xuan--lkml@baldauf.org>
To: "'adilger@turbolabs.com'" <adilger@turbolabs.com>
Cc: Venkatesh Ramamurthy <Venkateshr@ami.com>,
"'xuan--lkml@baldauf.org'" <xuan--lkml@baldauf.org>,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>
Subject: Re: dynamic swap prioritizing
Date: Fri, 12 Oct 2001 02:45:44 +0200 [thread overview]
Message-ID: <3BC63D38.AF65AAF5@baldauf.org> (raw)
In-Reply-To: <1355693A51C0D211B55A00105ACCFE6402B9E013@ATL_MS1> <20011010095536.C10443@turbolinux.com>
"'adilger@turbolabs.com'" wrote:
> On Oct 10, 2001 11:23 -0400, Venkatesh Ramamurthy wrote:
> > > If this is to be generally useful, it would be good to find things
> > > like max sequential read speed, max sequential write speed, and max
> > > seek time (at least). Estimates for max sequential read speed and
> > > seek time could be found at boot time for each disk relatively
> > > easily, but write speed may have to be found only at runtime (or
> > > it could all be fed in to the kernel from user space from benchmarks
> > > run previously).
> >
> > Maybe we can find out the statistics for the first time (or when swap is
> > created) and store this information in the swap partition itself. This would
> > allow us to compute time consuming statistics only once. Also we need to
> > create new fields in the swap structure for this purpose.
>
> I'd rather just have the statistic data in a regular file for ALL disks,
> and then send it to the kernel via ioctl or write to a special file that
> the kernel will read from. I don't think it is critical to have this
> data right at boot time, since it would only be used for optimizing I/O
> access and would not be required for a disk to actually work.
>
> Cheers, Andreas
Hey people,
why do you want to separate the statistics data out? The statistics are not about disk
throughput, head seek times, etc. They are just about the time between "needing a
page" and "getting that page", which is quite abstract. Let's call it the
swapin-delay. It depends not only on disk throughput and head seek times, but
also on how busy the device currently is.
For every swap device, there is a "swap_business" data structure, which contains
- average_swapin_delay
- average_swapin_delay_last_write_timestamp /* timestamp at which
average_swapin_delay was last written */
There is a "swap_business_memory_timeout" kernel parameter (accessible via /proc)
which represents the length of a time interval reaching from now into the past.
Disk activity data gathered within this interval is used to inform the swap
decisions of the future.
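The per-device bookkeeping described above could be sketched like this; the field names, types, and time units are illustrative assumptions, not code from any kernel tree:

```c
/* Hypothetical per-swap-device statistics; all times in milliseconds. */
struct swap_business {
    unsigned long average_swapin_delay;            /* running average of swap-in latency */
    unsigned long average_swapin_delay_last_write; /* timestamp of the last update */
};

/* Tunable "memory" window of the average, settable via /proc (assumed unit: ms). */
static unsigned long swap_business_memory_timeout = 10UL * 1000UL;
```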
For every page fault which requires a page to be swapped in, a timestamp is
written to a data structure covering the swapin process. When the page is
available in memory again, a function is called which does the following:
- compute the current_swapin_delay for the current swapin
- update the running average:
  my_swap_device->average_swapin_delay =
      (current_swapin_delay * (now - average_swapin_delay_last_write_timestamp)
       + my_swap_device->average_swapin_delay
         * (average_swapin_delay_last_write_timestamp
            - (now - swap_business_memory_timeout)))
      / swap_business_memory_timeout;
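The update above is a time-weighted average: the new sample is weighted by how long ago the average was last written, and the old average by the remainder of the window. A minimal sketch, with illustrative names and the assumption that a fully expired average is simply replaced:

```c
/* Weighted-average update of the swapin delay. All arguments share one
 * time unit (e.g. milliseconds); names are illustrative only. */
unsigned long update_average_swapin_delay(unsigned long avg,
                                          unsigned long last_write,
                                          unsigned long now,
                                          unsigned long current_delay,
                                          unsigned long timeout)
{
    unsigned long age = now - last_write;

    if (age >= timeout)        /* old average lies entirely outside the window */
        return current_delay;

    /* new sample weighted by its share of the window, old average by the rest */
    return (current_delay * age + avg * (timeout - age)) / timeout;
}
```

For example, with a window of 100 and an update half a window after the last write, old and new values are weighted equally.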
There are some special cases like "no disk activity". In this case, swap_business
is not updated for that device. But maybe the reason for the lack of disk activity
is that the disk is a swap disk whose "swap_business" values were once so bad that
this device is no longer considered at all. That would be a "soft deadlock"...
On swapout, the "average_swapin_delay" field of each swap device's "swap_business"
data structure is compared against the same field of the other available swap
devices. Based on this comparison, a decision is made about where to direct the
next swapout.
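The selection step could be as simple as picking the device with the lowest measured delay; this sketch assumes a hypothetical array of per-device structures and ignores tie-breaking and free-slot checks:

```c
/* Illustrative swap-target selection: choose the device whose measured
 * swapin delay is currently lowest. Not from any real kernel source. */
struct swap_dev {
    unsigned long average_swapin_delay;
};

int pick_swap_target(const struct swap_dev *devs, int n)
{
    int best = 0;

    for (int i = 1; i < n; i++)
        if (devs[i].average_swapin_delay < devs[best].average_swapin_delay)
            best = i;
    return best;
}
```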
Because this framework can only bring advantages when there are at least two swap
devices, it can be skipped in the one-swap-device case (most setups do not have
more than one swap device, but maybe just because the 32MB or 64MB graphics card
(with plenty of mostly unused RAM) needs to be manually configured for swap...)
I hope this makes the concept clearer. I cannot see a reason to create such
statistics in advance and feed them to the kernel somehow. For dynamic systems,
you need dynamic statistics, I think. And "the statistics", in fact, consist of
only two variables per swap device. Not something the kernel should be unable to
manage in reasonable time.
Of course, such a feature should be tested for real advantages.
Xuân.