public inbox for linux-kernel@vger.kernel.org
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: linux-kernel@vger.kernel.org,
	Marcelo Tosatti <marcelo.tosatti@cyclades.com.br>
Subject: Re: 2.4 fix for write throttling on x86 >1G
Date: Fri, 11 Mar 2005 13:04:13 -0300
Message-ID: <20050311160413.GK4816@logos.cnet>
In-Reply-To: <20050311061035.GZ26348@opteron.random>

Hi Andrea!

On Fri, Mar 11, 2005 at 07:10:35AM +0100, Andrea Arcangeli wrote:
> Hello Marcelo,
> 
> I've got a fix for you on 2.4. I got reports of stalls with heavy writes
> on 2.4. 

Out of curiosity, was that on the SuSE kernel rather than mainline?

> There was a mistake in nr_free_buffer_pages. That function is
> definitely meant _not_ to take highmem into account (dirty cache cannot
> spread over highmem in 2.4 [even when on top of fs]). For unknown
> reasons it was actually taking highmem into account. The code was
> obviously meant not to take it into account (see the GFP_USER and zonelist),
> except it wasn't using the zonelist.

True, initialization of "zone" variable in nr_free_buffer_pages() is 
un-nice. 

> That is a severe problem because
> there will be no write throttling at all, and no bdflush wakeup either.
> 
> This should fix it, though my compiler fails to compile 2.4, so it's not
> immediate to verify it. If any problem shows up I'll post a followup.
> 
> This is a noop for all systems <800M (1G shouldn't be noticeable
> either). This is why most people can't notice.

Do we really want to limit dirty cache to low mem on HIGHIO capable 
machines? I'm afraid doing so might hurt performance on such systems.

I think it might be wise to have nr_free_buffer_pages() take highmem
into account if CONFIG_HIGHIO is set?
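
To make the difference concrete, here is a rough userspace sketch, not
kernel code. The struct layout, the example page counts, and the names
count_all_zones(), count_zonelist() and pick_zonelist() are all
hypothetical stand-ins; the real function also adds nr_cache_pages and
uses per-class watermarks, which are left out here. It only shows that
walking every zone of a node counts highmem, walking a lowmem-only
zonelist does not, and a HIGHIO switch could choose between the two
lists.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the 2.4 zone layout (assumption, not the kernel's types). */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone {
	long free_pages;      /* pages currently free in the zone */
	long high_watermark;  /* pages to keep in reserve */
};

/* Made-up numbers: DMA is below its watermark, highmem is large. */
static const struct zone example_zones[MAX_NR_ZONES] = {
	[ZONE_DMA]     = { .free_pages = 100,    .high_watermark = 200  },
	[ZONE_NORMAL]  = { .free_pages = 50000,  .high_watermark = 1000 },
	[ZONE_HIGHMEM] = { .free_pages = 300000, .high_watermark = 2000 },
};

/* NULL-terminated fallback lists, in the spirit of a GFP_USER vs. highmem zonelist. */
static const struct zone *const example_lowmem_zonelist[] = {
	&example_zones[ZONE_NORMAL], &example_zones[ZONE_DMA], NULL
};
static const struct zone *const example_highmem_zonelist[] = {
	&example_zones[ZONE_HIGHMEM], &example_zones[ZONE_NORMAL],
	&example_zones[ZONE_DMA], NULL
};

/* Walks every zone of the node, so highmem is always counted. */
long count_all_zones(const struct zone *zones)
{
	long sum = 0;
	for (int i = 0; i < MAX_NR_ZONES; i++) {
		long free = zones[i].free_pages - zones[i].high_watermark;
		if (free <= 0)
			continue;	/* zone is at or below its reserve */
		sum += free;
	}
	return sum;
}

/* Walks only the zones named in the zonelist. */
long count_zonelist(const struct zone *const *zonelist)
{
	long sum = 0;
	assert(zonelist != NULL);
	for (; *zonelist != NULL; zonelist++) {
		long free = (*zonelist)->free_pages - (*zonelist)->high_watermark;
		if (free <= 0)
			continue;
		sum += free;
	}
	return sum;
}

/* Hypothetical switch: include highmem only when the box can do highmem I/O. */
const struct zone *const *pick_zonelist(int highio)
{
	return highio ? example_highmem_zonelist : example_lowmem_zonelist;
}
```

With these numbers the lowmem-only walk reports far fewer free pages
than the all-zones walk, which is the gap that would let dirty lowmem
cache grow unthrottled on a large highmem box.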

> --- 2.4.23aa3/mm/page_alloc.c.~1~	2004-07-04 02:09:42.000000000 +0200
> +++ 2.4.23aa3/mm/page_alloc.c	2005-03-11 07:00:23.000000000 +0100
> @@ -656,7 +656,7 @@ unsigned int nr_free_buffer_pages (void)
>  		class_idx = zone_idx(zone);
>  
>  		sum += zone->nr_cache_pages;
> -		for (zone = pgdat->node_zones; zone < pgdat->node_zones + MAX_NR_ZONES; zone++) {
> +		for (; zone; zone = *zonep++) {
>  			int free = zone->free_pages - zone->watermarks[class_idx].high;
>  			if (free <= 0)
>  				continue;
