linux-mm.kvack.org archive mirror
* [PATCH 0/5] Cpuset aware writeback V2
@ 2007-01-23 18:52 Christoph Lameter
  2007-01-23 18:52 ` [PATCH 1/5] Add a map to to track dirty pages per node Christoph Lameter
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Christoph Lameter @ 2007-01-23 18:52 UTC (permalink / raw)
  To: akpm
  Cc: Peter Zijlstra, Paul Menage, Nick Piggin, linux-mm,
	Christoph Lameter, Paul Jackson, Dave Chinner, Andi Kleen

Currently cpusets are not able to do proper writeback since dirty ratio
calculations and writeback are all done for the system as a whole. This
may result in a large percentage of the nodes in a cpuset becoming dirty
without background writeout being triggered and without synchronous
writes occurring. Instead writeout occurs during reclaim when memory
is tight, which may lead to dicey VM situations.

In order to fix the problem we first of all introduce a method to establish
a map of dirty nodes for each struct address_space.
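
As a rough userspace illustration of a per-mapping dirty node map (not the
code from the patch), the sketch below keeps one bit per node. The names
sketch_mapping, sketch_mark_node_dirty and sketch_node_is_dirty, and the
64-node limit, are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64   /* assumption: at most 64 NUMA nodes */

/* Hypothetical stand-in for struct address_space; not the kernel definition. */
struct sketch_mapping {
    uint64_t dirty_nodes;     /* bit n set => pages were dirtied on node n */
};

/* Record that a page belonging to this mapping was dirtied on 'node'. */
static void sketch_mark_node_dirty(struct sketch_mapping *m, unsigned int node)
{
    assert(node < SKETCH_MAX_NODES);
    m->dirty_nodes |= UINT64_C(1) << node;
}

/* Report whether the mapping has been marked dirty on 'node'. */
static bool sketch_node_is_dirty(const struct sketch_mapping *m, unsigned int node)
{
    assert(node < SKETCH_MAX_NODES);
    return (m->dirty_nodes >> node) & 1;
}
```

A caller would mark the node of each page as it is dirtied and later test
the map to decide whether the mapping is relevant to a given set of nodes.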

Secondly, we modify the dirty limit calculation to be based on the current
state of memory on the nodes of the cpuset that the current task belongs to.
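
To make the idea of a per-cpuset dirty limit concrete, here is a minimal
sketch that sums the dirtyable pages of only the allowed nodes and applies
a percentage to that sum. The struct, the function name and the way the
ratio is applied are assumptions for illustration; the patch series itself
may compute the limit differently.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LIMIT_MAX_NODES 64   /* assumption: node mask fits in 64 bits */

/* Hypothetical per-node counter; not a kernel structure. */
struct sketch_node_stat {
    uint64_t dirtyable_pages;   /* pages on this node that may hold dirty data */
};

/*
 * Dirty threshold in pages for the nodes whose bit is set in 'allowed'.
 * Passing a mask with every node set gives a system-wide threshold.
 */
static uint64_t sketch_dirty_limit(const struct sketch_node_stat *nodes,
                                   unsigned int nr_nodes,
                                   uint64_t allowed,
                                   unsigned int ratio_percent)
{
    uint64_t total = 0;

    assert(nr_nodes <= SKETCH_LIMIT_MAX_NODES);
    assert(ratio_percent <= 100);

    for (unsigned int n = 0; n < nr_nodes; n++)
        if ((allowed >> n) & 1)
            total += nodes[n].dirtyable_pages;

    /* Could overflow for enormous totals; acceptable for a sketch. */
    return total * ratio_percent / 100;
}
```

With four nodes of 1000, 2000, 3000 and 4000 dirtyable pages and a 40%
ratio, a task confined to nodes 1 and 2 gets a threshold based on 5000
pages rather than on all 10000 pages in the system.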

If the current task is part of a cpuset that is not allowed to allocate
from all nodes in the system, then we select only inodes for writeback
that have pages on the nodes that we are allowed to allocate from.
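
The selection rule described above can be sketched as a mask intersection.
This is an illustration under assumed names (sketch_should_write_inode and
plain 64-bit masks), not the function used in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an inode is a writeback candidate for the current task.
 *   inode_dirty_nodes: nodes on which the inode has had dirty pages
 *   allowed_nodes:     nodes the task's cpuset may allocate from
 *   all_nodes:         every node present in the system
 */
static bool sketch_should_write_inode(uint64_t inode_dirty_nodes,
                                      uint64_t allowed_nodes,
                                      uint64_t all_nodes)
{
    assert((allowed_nodes & ~all_nodes) == 0);

    /* Unrestricted task: no per-node filtering is applied. */
    if (allowed_nodes == all_nodes)
        return true;

    /* Restricted task: only inodes with pages on an allowed node qualify. */
    return (inode_dirty_nodes & allowed_nodes) != 0;
}
```

For a four-node system (mask 0xF) and a task limited to nodes 0 and 1
(mask 0x3), an inode dirty only on node 2 is skipped, while an inode dirty
on nodes 1 and 2 is selected.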

Changelog: V1->V2
-----------------
- Remove stray diff chunk and general patch beautification

- Put do { } while (0) around cpuset_update_dirty_nodes macro since it
  contains an if()

- Update comments to clarify locking scheme for dirty node maps.

- Retest and verify compile on UP.
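
For readers unfamiliar with the do { } while (0) idiom mentioned in the
changelog: a macro that expands to a bare if() can capture a following
else at the call site. The sketch below uses a made-up macro,
SKETCH_UPDATE_IF, to show the wrapped form; the real
cpuset_update_dirty_nodes macro is not reproduced here.

```c
#include <assert.h>

static int sketch_updates;   /* counts calls, for illustration only */

static void sketch_update(void)
{
    sketch_updates++;
}

/*
 * Hypothetical macro containing an if(). Wrapping the body in
 * do { } while (0) makes the expansion a single statement, so an
 * 'else' written after the macro call cannot attach to the inner if.
 */
#define SKETCH_UPDATE_IF(cond)      \
    do {                            \
        if (cond)                   \
            sketch_update();        \
    } while (0)

/* Returns 1 when the else branch ran, 0 otherwise. */
static int sketch_demo(int outer, int inner)
{
    int took_else = 0;

    if (outer)
        SKETCH_UPDATE_IF(inner);
    else
        took_else = 1;

    /* The else belongs to 'if (outer)', as the indentation suggests. */
    assert(took_else == !outer);
    return took_else;
}
```

Had the macro been defined as a bare "if (cond) sketch_update()", the else
in sketch_demo would have paired with the macro's if() instead.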

Changelog: RFC->V1
------------------

- Rework dirty_map logic to allocate it dynamically on larger
  NUMA systems. Move to struct address_space and address various minor issues.

- Dynamically allocate dirty maps only if an inode is dirtied.

- Clear the dirty map only when an inode is cleared (simplifies
  locking and we need to keep the dirty state even after the dirty state of
  all pages has been cleared for NFS writeout to occur correctly).

- Drop nr_node_ids patches

- Drop the NR_UNRECLAIMABLE patch. There may be other ideas around on how
  to accomplish the same in a more elegant way.

- Drop mentioning the NFS issues since Peter is working on those.
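
The allocation policy from the changelog above (allocate the map only when
an inode is first dirtied, release it only when the inode is cleared) can
be sketched in userspace as follows. All names, the 1024-node size and the
use of calloc()/free() are assumptions for this example; kernel code would
use its own allocators and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_DYN_NR_NODES 1024   /* assumption: a "larger NUMA system" */
#define SKETCH_DYN_BITS (8 * sizeof(unsigned long))
#define SKETCH_DYN_LONGS \
    ((SKETCH_DYN_NR_NODES + SKETCH_DYN_BITS - 1) / SKETCH_DYN_BITS)

/* Hypothetical stand-in for the per-mapping state. */
struct sketch_dyn_mapping {
    unsigned long *dirty_nodes;   /* NULL until the first page is dirtied */
};

/* Mark 'node' dirty, allocating the map on first use. False on OOM. */
static bool sketch_dyn_mark(struct sketch_dyn_mapping *m, unsigned int node)
{
    assert(node < SKETCH_DYN_NR_NODES);
    if (!m->dirty_nodes) {
        m->dirty_nodes = calloc(SKETCH_DYN_LONGS, sizeof(unsigned long));
        if (!m->dirty_nodes)
            return false;
    }
    m->dirty_nodes[node / SKETCH_DYN_BITS] |= 1UL << (node % SKETCH_DYN_BITS);
    return true;
}

/* A mapping that was never dirtied has no map and reports clean. */
static bool sketch_dyn_is_dirty(const struct sketch_dyn_mapping *m,
                                unsigned int node)
{
    assert(node < SKETCH_DYN_NR_NODES);
    if (!m->dirty_nodes)
        return false;
    return (m->dirty_nodes[node / SKETCH_DYN_BITS] >> (node % SKETCH_DYN_BITS)) & 1;
}

/* The map is dropped only here, when the inode itself is cleared. */
static void sketch_dyn_clear_inode(struct sketch_dyn_mapping *m)
{
    free(m->dirty_nodes);
    m->dirty_nodes = NULL;
}
```

Note that nothing in this sketch clears individual bits when pages are
written back, which mirrors the changelog's point that the dirty state is
kept until the inode is cleared.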

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 0/5] Cpuset aware writeback V1
@ 2007-01-20  3:10 Christoph Lameter
  2007-01-20  3:10 ` [PATCH 2/5] Add a nodemask to pdflush functions Christoph Lameter
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2007-01-20  3:10 UTC (permalink / raw)
  To: akpm
  Cc: Peter Zijlstra, Paul Menage, Nick Piggin, linux-mm,
	Christoph Lameter, Paul Jackson, Dave Chinner, Andi Kleen

Currently cpusets are not able to do proper writeback since dirty ratio
calculations and writeback are all done for the system as a whole. This
may result in a large percentage of the nodes in a cpuset becoming dirty
without background writeout being triggered and without synchronous
writes occurring. Instead writeout occurs during reclaim when memory
is tight, which may lead to dicey VM situations.

In order to fix the problem we first of all introduce a method to establish
a map of dirty nodes for each struct address_space.

Secondly, we modify the dirty limit calculation to be based on the current
state of memory on the nodes of the cpuset that the current task belongs to.

If the current task is part of a cpuset that is not allowed to allocate
from all nodes in the system, then we select only inodes for writeback
that have pages on the nodes that we are allowed to allocate from.

Tested on:
IA64 NUMA 128p, 12p

Compiles on:
i386 SMP
x86_64 UP

Changelog: RFC->V1
------------------

- Rework dirty_map logic to allocate it dynamically on larger
  NUMA systems. Move to struct address_space and address various minor issues.

- Dynamically allocate dirty maps only if an inode is dirtied.

- Clear the dirty map only when an inode is cleared (simplifies
  locking and we need to keep the dirty state even after the dirty state of
  all pages has been cleared for NFS writeout to occur correctly).

- Drop nr_node_ids patches

- Drop the NR_UNRECLAIMABLE patch. There may be other ideas around on how
  to accomplish the same in a more elegant way.

- Drop mentioning the NFS issues since Peter is working on those.




Thread overview: 9+ messages
2007-01-23 18:52 [PATCH 0/5] Cpuset aware writeback V2 Christoph Lameter
2007-01-23 18:52 ` [PATCH 1/5] Add a map to to track dirty pages per node Christoph Lameter
2007-01-25  3:04   ` Ethan Solomita
2007-01-25  5:52     ` Christoph Lameter
2007-01-23 18:52 ` [PATCH 2/5] Add a nodemask to pdflush functions Christoph Lameter
2007-01-23 18:52 ` [PATCH 3/5] Per cpuset dirty ratio calculation Christoph Lameter
2007-01-23 18:53 ` [PATCH 4/5] Cpuset aware writeback during reclaim Christoph Lameter
2007-01-23 18:53 ` [PATCH 5/5] Throttle vm writeout per cpuset Christoph Lameter
  -- strict thread matches above, loose matches on Subject: below --
2007-01-20  3:10 [PATCH 0/5] Cpuset aware writeback V1 Christoph Lameter
2007-01-20  3:10 ` [PATCH 2/5] Add a nodemask to pdflush functions Christoph Lameter
