From: Balbir Singh <balbir@in.ibm.com>
To: Magnus Damm <magnus.damm@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, vatsa@in.ibm.com,
	ckrm-tech@lists.sourceforge.net, xemul@sw.ru, linux-mm@kvack.org,
	menage@google.com, svaidy@linux.vnet.ibm.com, devel@openvz.org
Subject: Re: [RFC][PATCH][0/4] Memory controller (RSS Control)
Date: Mon, 19 Feb 2007 19:37:07 +0530
Message-ID: <45D9AF0B.9000803@in.ibm.com>
In-Reply-To: <aec7e5c30702190356v31e4997pf02e2887264299ce@mail.gmail.com>

Magnus Damm wrote:
> On 2/19/07, Balbir Singh <balbir@in.ibm.com> wrote:
>> Magnus Damm wrote:
>> > On 2/19/07, Andrew Morton <akpm@linux-foundation.org> wrote:
>> >> On Mon, 19 Feb 2007 12:20:19 +0530 Balbir Singh <balbir@in.ibm.com> wrote:
>> >>
>> >> > This patch applies on top of Paul Menage's container patches (V7)
>> >> > posted at
>> >> >
>> >> >       http://lkml.org/lkml/2007/2/12/88
>> >> >
>> >> > It implements a controller within the containers framework for
>> >> > limiting memory usage (RSS usage).
>> >>
>> >> The key part of this patchset is the reclaim algorithm:
>> >>
>> >> Alas, I fear this might have quite bad worst-case behaviour.  One
>> >> small container which is under constant memory pressure will churn
>> >> the system-wide LRUs like mad, and will consume rather a lot of
>> >> system time.  So it's a point at which container A can deleteriously
>> >> affect things which are running in other containers, which is exactly
>> >> what we're supposed to not do.
>> >
>> > Nice to see a simple memory controller. The downside seems to be that it
>> > doesn't scale very well when it comes to reclaim, but maybe that just
>> > comes with being simple. Step by step, and maybe this is a good first
>> > step?
>> >
>>
>> Thanks, I totally agree.
>>
>> > Ideally I'd like to see unmapped pages handled on a per-container LRU
>> > with a fallback to the system-wide LRUs. Shared/mapped pages could be
>> > handled using PTE ageing/unmapping instead of page ageing, but that
>> > may consume too many resources to be practical.
>> >
>> > / magnus
>>
>> Keeping unmapped pages per container sounds interesting. I am not quite
>> sure what PTE ageing is, will look it up.
> 
> You will most likely have no luck looking it up, so here is what I
> mean by PTE ageing:
> 
> The most common unit for memory resource control seems to be physical
> pages. Keeping track of pages is simple in the case of a single user
> per page, but for shared pages tracking the owner becomes more
> complex.
> 
> I consider unmapped pages to only have a single user at a time, so the
> unit for unmapped memory resource control is physical pages. Apart
> from implementation details such as fun with struct page and
> scalability, handling this case is not so complicated.
> 
> Mapped or shared pages should be handled in a different way IMO. PTEs
> should be used instead of physical pages as the unit for resource
> control and reclaim. For the user this looks pretty much the same as
> physical pages, apart from memory overcommit.
> 
> So instead of using a global page reclaim policy and reserving
> physical pages per container I propose that resource-controlled shared
> pages should be handled using a PTE replacement policy. This policy is
> used to keep the most active PTEs in the container backed by physical
> pages. Inactive PTEs get unmapped in favour of newer PTEs.
> 
> One way to implement this could be by populating the address space of
> resource-controlled processes with multiple smaller LRU2Qs. The
> compact data structure that I have in mind is basically an array of
> 256 bytes, one byte per PTE. Associated with this data structure are
> start indexes and lengths for two lists. The indexes are used in a
> FAT-type of chain to form singly linked lists. So we create an active
> and an inactive list here, and we move PTEs between the lists when we
> check the young bits from the page reclaim and when we apply memory
> pressure. Unmapping is done through the normal page reclaimer but
> using information from the PTE LRUs.
> 
> In my mind this should lead to fairer resource control of mapped
> pages, but whether it is possible to implement with low overhead is
> another question. =)
> 
> Thanks for listening.
> 
> / magnus
> 

Thanks for explaining PTE aging.
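
A minimal userspace sketch of the structure described above, to make the
idea concrete. It is only one possible reading, not code from the patchset
or from Magnus: the names (struct pte_lru, pte_lru_add, pte_lru_age,
pte_lru_pick_victim) are hypothetical, the young bits are simulated with a
plain bool array instead of being read from real PTEs, and the choice to
keep the newest entry at the head of the inactive list is an assumption.
Since all 256 byte values are valid slot indexes, there is no room for a
NIL marker, so the list lengths are what terminate each chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_SLOTS 256              /* one byte of chain state per PTE */

struct pte_list {
	uint8_t  head;             /* first slot; only meaningful if len > 0 */
	uint16_t len;              /* chain length, terminates the list */
};

struct pte_lru {
	uint8_t next[PTE_SLOTS];   /* FAT-style chain: next[i] follows slot i */
	struct pte_list active;
	struct pte_list inactive;
};

static void pte_lru_init(struct pte_lru *lru)
{
	lru->active.head = 0;
	lru->active.len = 0;
	lru->inactive.head = 0;
	lru->inactive.len = 0;
}

/* Track a newly mapped PTE: push it at the head of the inactive list.
 * The caller must not add a slot that is already tracked. */
static void pte_lru_add(struct pte_lru *lru, uint8_t slot)
{
	assert(lru->active.len + lru->inactive.len < PTE_SLOTS);
	lru->next[slot] = lru->inactive.head;   /* ignored when list was empty */
	lru->inactive.head = slot;
	lru->inactive.len++;
}

/* Append a slot at the tail of a list that is being rebuilt. */
static void list_append(struct pte_lru *lru, struct pte_list *l,
			uint8_t *tail, uint8_t slot)
{
	if (l->len == 0)
		l->head = slot;
	else
		lru->next[*tail] = slot;
	*tail = slot;
	l->len++;
}

/* One ageing pass over both lists: slots with the (simulated) young bit
 * set go to the active list and have the bit cleared, all others go to
 * the inactive list. Relative order within each source list is kept. */
static void pte_lru_age(struct pte_lru *lru, bool young[PTE_SLOTS])
{
	struct pte_list old[2] = { lru->active, lru->inactive };
	uint8_t tail_active = 0, tail_inactive = 0;

	lru->active.len = 0;
	lru->inactive.len = 0;

	for (int k = 0; k < 2; k++) {
		uint8_t i = old[k].head;

		for (uint16_t n = 0; n < old[k].len; n++) {
			uint8_t nxt = lru->next[i];  /* read before relinking */

			if (young[i]) {
				young[i] = false;
				list_append(lru, &lru->active, &tail_active, i);
			} else {
				list_append(lru, &lru->inactive, &tail_inactive, i);
			}
			i = nxt;
		}
	}
}

/* Under memory pressure: pick the slot at the tail of the inactive list
 * and stop tracking it. Returns -1 if the inactive list is empty. */
static int pte_lru_pick_victim(struct pte_lru *lru)
{
	struct pte_list *l = &lru->inactive;
	uint8_t i;

	if (l->len == 0)
		return -1;
	i = l->head;
	for (uint16_t n = 1; n < l->len; n++)
		i = lru->next[i];
	l->len--;                  /* the tail simply drops off the chain */
	return i;
}
```

The whole structure is 256 bytes of chain plus two small list headers, so
it stays compact. The cost is that every operation except adding a slot
walks a chain, which is bounded at 256 steps per structure. Whether that
is cheap enough when done for every resource-controlled address space is
the open question.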

-- 
	Warm Regards,
	Balbir Singh


