From: Vlastimil Babka <vbabka@suse.cz>
To: Mark Hills <mark@xwax.org>
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@suse.cz>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: Write throughput impaired by touching dirty_ratio
Date: Wed, 24 Jun 2015 10:27:36 +0200
Message-ID: <558A69F8.2080304@suse.cz>
In-Reply-To: <1506191513210.2879@stax.localdomain>

[add some CC's]

On 06/19/2015 05:16 PM, Mark Hills wrote:
> I noticed that any change to vm.dirty_ratio causes write throughput to
> plummet -- to around 5Mbyte/sec.
> 
>   <system bootup, kernel 4.0.5>
> 
>   # dd if=/dev/zero of=/path/to/file bs=1M
> 
>   # sysctl vm.dirty_ratio
>   vm.dirty_ratio = 20
>   <all ok; writes at ~150Mbyte/sec>
> 
>   # sysctl vm.dirty_ratio=20
>   <all continues to be ok>
> 
>   # sysctl vm.dirty_ratio=21
>   <writes drop to ~5Mbyte/sec>
> 
>   # sysctl vm.dirty_ratio=20
>   <writes continue to be slow at ~5Mbyte/sec>
> 
> The test shows that return to the previous value does not restore the old 
> behaviour. I return the system to usable state with a reboot.
> 
> Reads continue to be fast and are not affected.
> 
> A quick look at the code suggests differing behaviour from
> writeback_set_ratelimit on startup. And that some of the calculations (eg.
> global_dirty_limit) are badly behaved once the system has booted.

Hmm, so the only thing that dirty_ratio_handler() changes, apart from
vm_dirty_ratio itself, is ratelimit_pages, through writeback_set_ratelimit().
So I assume the problem is with ratelimit_pages. The calculation uses
num_online_cpus(), which I think would differ between the initial system state
(where we are called by page_writeback_init()) and later, when all CPUs are
online. But I don't see the CPU onlining code updating the limit (unlike memory
hotplug, which does that), so that's suspicious.
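
To make the num_online_cpus() concern concrete, here is a rough Python model of
the calculation as I remember it from mm/page-writeback.c of that era: the dirty
threshold divided by 32 times the number of online CPUs, with a floor of 16
pages. The divisor, the floor and the 404000-page threshold are assumptions for
illustration, not values read from the 4.0.5 source or from any real system.

```python
def ratelimit_pages(dirty_thresh_pages: int, online_cpus: int) -> int:
    """Simplified model of the per-CPU ratelimit derived from the dirty threshold.

    Assumed formula: dirty_thresh / (online_cpus * 32), never below 16 pages.
    """
    return max(16, dirty_thresh_pages // (online_cpus * 32))


# Hypothetical dirty threshold in pages; not taken from a real system.
DIRTY_THRESH = 404000

for cpus in (1, 8):
    print(f"{cpus} online CPU(s): ratelimit_pages = "
          f"{ratelimit_pages(DIRTY_THRESH, cpus)}")
```

Under this model, a value computed with one CPU online would be eight times
larger than the same calculation with eight CPUs online.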

Another suspicious thing is that global_dirty_limits() looks at the current
process's flags. It seems odd to me that the process calling the sysctl would
determine a value that is global to the system.
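
A sketch of why that matters, again in Python and again from memory rather than
from the 4.0.5 source: as far as I recall, global_dirty_limits() raises both
thresholds by a quarter when the calling task is marked as less throttled (or is
a realtime task). The parameter names, the 25% bump and the page counts below
are assumptions for illustration.

```python
def global_dirty_limits(dirtyable_pages: int,
                        dirty_ratio: int,
                        background_ratio: int,
                        less_throttled: bool = False) -> tuple[int, int]:
    """Simplified model returning (background_thresh, dirty_thresh) in pages."""
    dirty = dirtyable_pages * dirty_ratio // 100
    background = dirtyable_pages * background_ratio // 100
    # Keep the background threshold below the hard dirty threshold.
    if background >= dirty:
        background = dirty // 2
    # Assumed behaviour: a "less throttled" caller gets 25% more headroom.
    if less_throttled:
        background += background // 4
        dirty += dirty // 4
    return background, dirty


# Same hypothetical system, two different callers.
print(global_dirty_limits(1_000_000, 20, 10))
print(global_dirty_limits(1_000_000, 20, 10, less_throttled=True))
```

If the result is then stored in a system-wide variable, the stored value depends
on which task happened to trigger the calculation.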

If you are brave enough (and have a kernel that is configured properly and
built with debuginfo), you can check how the value of the ratelimit_pages
variable changes on the live system, using the crash tool. Just start it, and if
everything works, you can inspect the live system. It's a bit complicated since
there are two static variables called "ratelimit_pages" in the kernel, so we
can't print them by name (or I don't know how). First we have to get the
variable addresses:

crash> sym ratelimit_pages
ffffffff81e67200 (d) ratelimit_pages
ffffffff81ef4638 (d) ratelimit_pages

One will be absurdly high (probably less so on your 32-bit system), so it's not
the one we want:

crash> rd -d ffffffff81ef4638 1
ffffffff81ef4638:    4294967328768
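
A side note on that number: it splits cleanly into two 32-bit halves, which
would be consistent with rd reading 64 bits across a 32-bit variable and
whatever sits next to it in memory. That reading is my assumption and nothing in
the crash output confirms it, but the arithmetic is easy to check in Python:

```python
def split_u64(value: int) -> tuple[int, int]:
    """Split a 64-bit value into (low, high) 32-bit halves.

    On a little-endian machine the low half is what sits at the address itself.
    """
    low = value & 0xFFFFFFFF
    high = value >> 32
    return low, high


low, high = split_u64(4294967328768)
print(f"low 32 bits: {low}, high 32 bits: {high}")
```

The low half, 32768, is a much more plausible value for a 32-bit variable.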

The second will have a smaller value:
(my system after boot with dirty ratio = 20)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200:             1577

(after changing to 21)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200:             1570

(after changing back to 20)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200:             1496

So yes, it does differ, but not drastically. A difference between 1 and 8 online
CPUs would look different, I think. So my theory above is questionable. But
you might check what it looks like on your system...
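
To put numbers on "not drastically", this small Python snippet compares the
three readings above with the change one would expect if the divisor had gone
from 1 to 8 CPUs. It assumes ratelimit_pages is inversely proportional to the
number of online CPUs; the 1-versus-8 case is hypothetical.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Readings of ratelimit_pages taken with crash, in order.
readings = [("boot, dirty_ratio=20", 1577),
            ("dirty_ratio=21", 1570),
            ("back to dirty_ratio=20", 1496)]

baseline = readings[0][1]
for label, value in readings:
    print(f"{label}: {value} ({pct_change(baseline, value):+.2f}%)")

# Hypothetical: same threshold, divisor going from 1 CPU to 8 CPUs.
print(f"1 -> 8 CPUs would give {pct_change(8, 1):+.1f}%")
```

A change of a few percent is nowhere near what a change in the CPU count would
produce under that assumption.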

> 
> The system is an HP xw6600, running i686 kernel. This happens whether 
> internal SATA HDD, SSD or external USB drive is used. I first saw this on 
> kernel 4.0.4, and 4.0.5 is also affected.

So what was the last version where you changed the dirty ratio and it still
worked fine?

> 
> It would surprise me if I'm the only person who was setting dirty_ratio.
> 
> Have others seen this behaviour? Thanks
> 
