* Re: [PATCH 00/31] Move LRU page reclaim from zones to nodes v8
From: Minchan Kim @ 2016-07-04 8:04 UTC
To: Mel Gorman
Cc: Rik van Riel, intel-gfx, LKML, dri-devel, Linux-MM,
Johannes Weiner, daniel.vetter, Andrew Morton, Vlastimil Babka
On Mon, Jul 04, 2016 at 05:34:05AM +0100, Mel Gorman wrote:
> On Mon, Jul 04, 2016 at 10:37:03AM +0900, Minchan Kim wrote:
> > > The reason we have zone-based reclaim is that we used to have
> > > large highmem zones in common configurations and it was necessary
> > > to quickly find ZONE_NORMAL pages for reclaim. Today, this is much
> > > less of a concern as machines with lots of memory will (or should) use
> > > 64-bit kernels. Combinations of 32-bit hardware and 64-bit hardware are
> > > rare. Machines that do use highmem should have lower highmem:lowmem
> > > ratios than we worried about in the past.
> >
> > Hello Mel,
> >
> > I absolutely agree with the direction. However, I have a concern about
> > highmem systems, as you already mentioned.
> >
> > Embedded products still use 2:1 ~ 3:1 (highmem:lowmem) ratios.
> > On such systems, the LRU churn from frequently skipping pages of other
> > zones might hurt performance significantly.
> >
> > How large a highmem:lowmem ratio do you think becomes a problem?
> >
>
> That's a "how long is a piece of string" type question. The ratio does
> not matter as much as whether the workload is both under memory pressure
> and requires large amounts of lowmem pages. Even on systems with very high
> ratios, it may not be a problem if HIGHPTE is enabled.
As well as page tables, every kernel allocation that masks __GFP_HIGHMEM
off (pgd, kernel stack, zbud, slab, and so on) could be a problem on a
32-bit system.

It also depends on how many lowmem-only drivers the system has.

I don't know how many such drivers exist in the world. When I did a
simple grep, I found several cases that mask __GFP_HIGHMEM off; among
them, I guess DRM might be the most relevant for us. However, it might
be a really rare use case among the various i915 use cases.
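
Just to illustrate the pattern such a grep turns up, a minimal sketch
(the function name is hypothetical, not taken from any real driver):

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * Hypothetical lowmem-only allocation: masking __GFP_HIGHMEM off
 * restricts the page to ZONE_NORMAL/ZONE_DMA, so on a 3:1
 * highmem:lowmem machine every call like this competes for the
 * small lowmem pool.
 */
static struct page *example_alloc_lowmem_page(gfp_t gfp)
{
	return alloc_page(gfp & ~__GFP_HIGHMEM);
}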
>
> > >
> > > Conceptually, moving to node LRUs should be easier to understand. The
> > > page allocator plays fewer tricks to game reclaim and reclaim behaves
> > > similarly on all nodes.
> > >
> > > The series has been tested on a 16 core UMA machine and a 2-socket 48
> > > core NUMA machine. The UMA results are presented in most cases as the NUMA
> > > machine behaved similarly.
> >
> > I guess you have already tested this with various highmem systems (e.g.,
> > 2:1, 3:1, 4:1, and so on). If you have, would you mind sharing the results?
> >
>
> I don't have that data; the baseline distribution used doesn't even have
> 32-bit support. Even if it did, the results might not be that interesting.
> The workloads used were not necessarily going to trigger lowmem pressure
> as HIGHPTE was set on the 32-bit configs.
That means we didn't test this on 32-bit with highmem.

I'm not sure it's really too rare a case to spend time testing.
In fact, I really want to test the whole series on our production
system, which is 32-bit with highmem, but as we know well, most
embedded system kernels are rather old, so backporting takes a lot
of time and care. However, if we skip testing on those systems now,
we will be surprised in 1~2 years.

I don't know what kind of benchmark we could check this with, so I
cannot insist on it, but you might know one.

Okay, do you have any idea how to fix it if we see such a regression
report on a 32-bit system in the future?
* Re: [PATCH 00/31] Move LRU page reclaim from zones to nodes v8
From: Mel Gorman @ 2016-07-04 9:55 UTC
To: Minchan Kim
Cc: Andrew Morton, Linux-MM, Rik van Riel, Vlastimil Babka,
Johannes Weiner, LKML, daniel.vetter, intel-gfx, dri-devel,
David Airlie
On Mon, Jul 04, 2016 at 05:04:12PM +0900, Minchan Kim wrote:
> > > How large a highmem:lowmem ratio do you think becomes a problem?
> > >
> >
> > That's a "how long is a piece of string" type question. The ratio does
> > not matter as much as whether the workload is both under memory pressure
> > and requires large amounts of lowmem pages. Even on systems with very high
> > ratios, it may not be a problem if HIGHPTE is enabled.
>
> As well as page tables, every kernel allocation that masks __GFP_HIGHMEM
> off (pgd, kernel stack, zbud, slab, and so on) could be a problem on a
> 32-bit system.
>
The same point applies -- it depends on the rate of these allocations,
not the ratio of highmem:lowmem per se.
> It also depends on how many lowmem-only drivers the system has.
>
> I don't know how many such drivers exist in the world. When I did a
> simple grep, I found several cases that mask __GFP_HIGHMEM off; among
> them, I guess DRM might be the most relevant for us. However, it might
> be a really rare use case among the various i915 use cases.
>
It's also perfectly possible that such allocations are long-lived, in
which case they will not cause many skips. Hence, I cannot make a
general prediction.
> > > > Conceptually, moving to node LRUs should be easier to understand. The
> > > > page allocator plays fewer tricks to game reclaim and reclaim behaves
> > > > similarly on all nodes.
> > > >
> > > > The series has been tested on a 16 core UMA machine and a 2-socket 48
> > > > core NUMA machine. The UMA results are presented in most cases as the NUMA
> > > > machine behaved similarly.
> > >
> > > I guess you have already tested this with various highmem systems (e.g.,
> > > 2:1, 3:1, 4:1, and so on). If you have, would you mind sharing the results?
> > >
> >
> > I don't have that data; the baseline distribution used doesn't even have
> > 32-bit support. Even if it did, the results might not be that interesting.
> > The workloads used were not necessarily going to trigger lowmem pressure
> > as HIGHPTE was set on the 32-bit configs.
>
> That means we didn't test this on 32-bit with highmem.
>
No. I tested the skip logic and noticed that, when it was forced on
purpose, system CPU usage was higher, but it worked functionally.
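
For reference, the skip logic being discussed is roughly the following
fragment from the isolate_lru_pages() change in the series (paraphrased,
so variable names may not match the final code exactly):

	/*
	 * Inside the isolation loop: pages from zones above the
	 * reclaim target are set aside instead of isolated, and are
	 * put back on the LRU once the scan finishes. Lowmem reclaim
	 * repeatedly walking over highmem pages this way is where
	 * the extra system CPU time goes.
	 */
	if (page_zonenum(page) > sc->reclaim_idx) {
		list_move(&page->lru, &pages_skipped);
		continue;
	}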
> I'm not sure it's really too rare a case to spend time testing.
> In fact, I really want to test the whole series on our production
> system, which is 32-bit with highmem, but as we know well, most
> embedded system kernels are rather old, so backporting takes a lot
> of time and care. However, if we skip testing on those systems now,
> we will be surprised in 1~2 years.
>
It would be appreciated if it could be tested on such platforms, if at
all possible. Even if I did set up a 32-bit x86 system, it wouldn't have
the same allocation/reclaim profile as the platforms you are considering.
> I don't know what kind of benchmark we could check this with, so I
> cannot insist on it, but you might know one.
>
One method would be to use fsmark with very large numbers of small files
to force slab to require low memory. It's not representative of many real
workloads, unfortunately. Usually such a configuration is for checking
that the slab shrinker is working as expected.
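
If building fsmark for the target platform is awkward, a crude stand-in
is a loop that creates a very large number of tiny files so inode/dentry
slab objects pile up in lowmem. A minimal sketch (file count and naming
are arbitrary, tune them for the machine under test):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create many tiny files: the data pages are trivial, but the
 * inodes/dentries they pin live in slab, which is lowmem-only.
 */
int main(int argc, char **argv)
{
	long i, nfiles = argc > 1 ? atol(argv[1]) : 1000000;
	char path[64];

	for (i = 0; i < nfiles; i++) {
		int fd;

		snprintf(path, sizeof(path), "smallfile-%ld", i);
		fd = open(path, O_CREAT | O_WRONLY, 0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		if (write(fd, "x", 1) != 1)
			perror("write");
		close(fd);
	}
	return 0;
}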
> Okay, do you have any idea how to fix it if we see such a regression
> report on a 32-bit system in the future?
Two options, neither of which justifies its complexity without a "real"
workload to use as a reference.
1. Long-term isolation of highmem pages when reclaim is for lowmem

When pages are skipped, they are immediately added back onto the LRU
list. If lowmem reclaim persists for long periods of time, the same
highmem pages get scanned continually. The idea would be that lowmem
reclaim keeps those pages on a separate list until a reclaim for
highmem pages arrives and splices them back onto the LRU.
That would reduce the skip rate; the potential corner case is that
highmem pages have to be scanned and reclaimed to free lowmem slab
pages. A rough sketch of this idea follows after option 2.
2. Linearly scan lowmem pages if the initial LRU shrink fails

This will break LRU ordering but may be preferable to, and faster
than, skipping LRU pages during memory pressure.
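
To make option 1 concrete, a rough sketch of one possible shape. Every
name below is hypothetical -- none of these fields or helpers exist in
the series -- it only shows parking skipped highmem pages and splicing
them back later:

#include <linux/list.h>
#include <linux/mm.h>

/* Hypothetical per-node side list for skipped highmem pages. */
struct skipped_highmem {
	struct list_head pages;		/* parked, not rescanned */
};

/*
 * Lowmem reclaim parks a skipped highmem page here instead of
 * putting it straight back onto the LRU, so it is not rescanned
 * on every subsequent lowmem scan.
 */
static void park_skipped_page(struct skipped_highmem *sh,
			      struct page *page)
{
	list_move(&page->lru, &sh->pages);
}

/*
 * A highmem-capable reclaim has arrived: splice the parked pages
 * back onto the LRU so they are visible to the scanner again.
 */
static void splice_skipped_pages(struct skipped_highmem *sh,
				 struct list_head *lru)
{
	list_splice_init(&sh->pages, lru);
}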
--
Mel Gorman
SUSE Labs
* Re: [PATCH 00/31] Move LRU page reclaim from zones to nodes v8
From: Minchan Kim @ 2016-07-06 1:51 UTC
To: Mel Gorman
Cc: Andrew Morton, Linux-MM, Rik van Riel, Vlastimil Babka,
Johannes Weiner, LKML, daniel.vetter, intel-gfx, dri-devel,
David Airlie
On Mon, Jul 04, 2016 at 10:55:09AM +0100, Mel Gorman wrote:
> On Mon, Jul 04, 2016 at 05:04:12PM +0900, Minchan Kim wrote:
> > > > How large a highmem:lowmem ratio do you think becomes a problem?
> > > >
> > >
> > > That's a "how long is a piece of string" type question. The ratio does
> > > not matter as much as whether the workload is both under memory pressure
> > > and requires large amounts of lowmem pages. Even on systems with very high
> > > ratios, it may not be a problem if HIGHPTE is enabled.
> >
> > As well as page tables, every kernel allocation that masks __GFP_HIGHMEM
> > off (pgd, kernel stack, zbud, slab, and so on) could be a problem on a
> > 32-bit system.
> >
>
> The same point applies -- it depends on the rate of these allocations,
> not the ratio of highmem:lowmem per se.
>
> > It also depends on how many lowmem-only drivers the system has.
> >
> > I don't know how many such drivers exist in the world. When I did a
> > simple grep, I found several cases that mask __GFP_HIGHMEM off; among
> > them, I guess DRM might be the most relevant for us. However, it might
> > be a really rare use case among the various i915 use cases.
> >
>
> It's also perfectly possible that such allocations are long-lived, in
> which case they will not cause many skips. Hence, I cannot make a
> general prediction.
>
> > > > > Conceptually, moving to node LRUs should be easier to understand. The
> > > > > page allocator plays fewer tricks to game reclaim and reclaim behaves
> > > > > similarly on all nodes.
> > > > >
> > > > > The series has been tested on a 16 core UMA machine and a 2-socket 48
> > > > > core NUMA machine. The UMA results are presented in most cases as the NUMA
> > > > > machine behaved similarly.
> > > >
> > > > I guess you have already tested this with various highmem systems (e.g.,
> > > > 2:1, 3:1, 4:1, and so on). If you have, would you mind sharing the results?
> > > >
> > >
> > > I don't have that data; the baseline distribution used doesn't even have
> > > 32-bit support. Even if it did, the results might not be that interesting.
> > > The workloads used were not necessarily going to trigger lowmem pressure
> > > as HIGHPTE was set on the 32-bit configs.
> >
> > That means we didn't test this on 32-bit with highmem.
> >
>
> No. I tested the skip logic and noticed that, when it was forced on
> purpose, system CPU usage was higher, but it worked functionally.
Yep, it would work well functionally. I meant not functionality but
the performance point of view: system CPU usage, major fault rate,
and so on.
>
> > I'm not sure it's really too rare a case to spend time testing.
> > In fact, I really want to test the whole series on our production
> > system, which is 32-bit with highmem, but as we know well, most
> > embedded system kernels are rather old, so backporting takes a lot
> > of time and care. However, if we skip testing on those systems now,
> > we will be surprised in 1~2 years.
> >
>
> It would be appreciated if it could be tested on such platforms, if at
> all possible. Even if I did set up a 32-bit x86 system, it wouldn't have
> the same allocation/reclaim profile as the platforms you are considering.
Yep. I just finished reviewing all the patches and found no *big*
problem by inspection, so my remaining homework is testing, which
should catch what my eyes missed.

I will give backporting to our old 32-bit production kernel a shot
and report if something strange happens.

Thanks for the great work, Mel!
>
> > I don't know what kind of benchmark we could check this with, so I
> > cannot insist on it, but you might know one.
> >
>
> One method would be to use fsmark with very large numbers of small files
> to force slab to require low memory. It's not representative of many real
> workloads, unfortunately. Usually such a configuration is for checking
> that the slab shrinker is working as expected.
Thanks for the suggestion.
>
> > Okay, do you have any idea how to fix it if we see such a regression
> > report on a 32-bit system in the future?
>
> Two options, neither of which justifies its complexity without a "real"
> workload to use as a reference.
>
> 1. Long-term isolation of highmem pages when reclaim is for lowmem
>
> When pages are skipped, they are immediately added back onto the LRU
> list. If lowmem reclaim persists for long periods of time, the same
> highmem pages get scanned continually. The idea would be that lowmem
> reclaim keeps those pages on a separate list until a reclaim for
> highmem pages arrives and splices them back onto the LRU.
>
> That would reduce the skip rate; the potential corner case is that
> highmem pages have to be scanned and reclaimed to free lowmem slab pages.
>
> 2. Linearly scan lowmem pages if the initial LRU shrink fails
>
> This will break LRU ordering but may be preferable to, and faster
> than, skipping LRU pages during memory pressure.
Okay. I guess it would be better to include this in the description
of [4/31].
>
> --
> Mel Gorman
> SUSE Labs