From: Zheng Liu <gnehzuil.liu@gmail.com>
To: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Fine granularity page reclaim
Date: Thu, 8 Mar 2012 10:54:52 +0800
Message-ID: <20120308025452.GA6196@gmail.com>
In-Reply-To: <4F57C610.8050101@openvz.org>

On Thu, Mar 08, 2012 at 12:33:20AM +0400, Konstantin Khlebnikov wrote:
> Zheng Liu wrote:
> >
> >
> >On Monday, February 20, 2012, Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:
> > > Zheng Liu wrote:
> > >>
> > >> Cc linux-kernel mailing list.
> > >>
> > >> On Sat, Feb 18, 2012 at 12:20:05AM +0400, Konstantin Khlebnikov wrote:
> > >>>
> > >>> Zheng Liu wrote:
> > >>>>
> > >>>> Hi all,
> > >>>>
> > >>>> Currently, we have encountered a problem with page reclaim. In our product system,
> > >>>> there are a lot of applications that manipulate a number of files. These
> > >>>> files can be divided into two categories: index files and
> > >>>> block files. There are about 15,000 index files and about
> > >>>> 23,000 block files on a 2TB disk. The application accesses index
> > >>>> files using mmap(2), and reads/writes block files using pread(2)/pwrite(2). We hope
> > >>>> to hold index files in memory as much as possible, and it works well in Redhat
> > >>>> 2.6.18-164. About 60-70% of the index files can be held in memory.
> > >>>> However, it doesn't work well in Redhat 2.6.32-133. I know that in 2.6.18
> > >>>> Linux uses an active list and an inactive list to handle page reclaim, and in
> > >>>> 2.6.32 they are divided into anonymous lists and file lists. So I am
> > >>>> curious about why most of the index files can be held in 2.6.18? The index files
> > >>>> should be replaced because mmap doesn't impact the lru list.
> > >>>
> > >>> I had patches that fix a similar problem with shared/executable mapped pages:
> > >>> "vmscan: promote shared file mapped pages" commit 34dbc67a644f and commit c909e99364c.
> > >>> Maybe they will help in your case.
> > >>
> > >> Hi Konstantin,
> > >>
> > >> Thank you for your reply.  I have tested it in the upstream kernel.  These
> > >> patches are useful for multi-process applications.  But, in our product
> > >> system, there are some applications that are multi-threaded.  So
> > >> 'referenced_ptes > 1' cannot help these applications to hold the data in
> > >> memory.
> > >
> > > Ok, what if you mmap your data as executable, just to test?
> > > Then these pages will be activated after the first touch.
> > > Attached is a patch with a per-mm flag that has the same effect.
> > >
> >
> >Hi Konstantin,
> >
> >Sorry for the delayed reply.  For the last two weeks I was trying these two solutions
> >and evaluating their impact on performance in our product system.
> >The good news is that both solutions work well. They can keep
> >mapped files in memory under multi-threaded workloads.  But I have a question about
> >the first solution (mapping the file with the PROT_EXEC flag).  I think this way is
> >too tricky.  As I said previously, the files that need to be mapped are only
> >normal index files, and they shouldn't be mapped with the PROT_EXEC flag
> >from the point of view of an application programmer.  So actually the key issue is
> >that we should provide a mechanism which lets different file sets be
> >reclaimed separately.  I am not sure whether this idea is useful or not.  So
> >any feedback is welcome. :-)  Thank you.
> >
> 
> Sounds good. Yes, PROT_EXEC isn't very usable or secure, and a per-mm flag is not
> very flexible either. I prefer setting some kind of memory pressure priorities
> for each vma and inode. Probably we can sort vmas and inodes into different
> cgroup-like sets and balance memory pressure between them.
> Maybe someone has already thought about it...
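
As a rough illustration of why the 'referenced_ptes > 1' test quoted
above helps multi-process but not multi-threaded workloads, here is a
simplified user-space model of the activation decision for a mapped file
page. It is not the kernel source: the names are borrowed from
mm/vmscan.c from memory, the real function handles more cases, and
check_file_page() and its parameters are hypothetical.

```c
#include <stdbool.h>

enum pageref { PAGEREF_RECLAIM, PAGEREF_KEEP, PAGEREF_ACTIVATE };

/*
 * Simplified model (assumption, not kernel code) of the decision for a
 * mapped file page found on the inactive list.
 *
 * referenced_ptes: number of page table entries that had the accessed
 *                  bit set. Threads of one process share one set of page
 *                  tables, so they contribute at most 1 here, while
 *                  several processes mapping the file can contribute more.
 * referenced_page: the page was already marked referenced on an earlier scan.
 * vm_exec:         at least one referencing mapping is executable.
 */
enum pageref check_file_page(int referenced_ptes, bool referenced_page,
                             bool vm_exec)
{
    if (referenced_ptes == 0)
        return PAGEREF_RECLAIM;      /* simplification: untouched page */

    if (referenced_page || referenced_ptes > 1)
        return PAGEREF_ACTIVATE;     /* shared or repeatedly used page */

    if (vm_exec)
        return PAGEREF_ACTIVATE;     /* executable mapping: first touch */

    return PAGEREF_KEEP;             /* one more trip round the inactive list */
}
```

In this model a multi-threaded reader with a plain PROT_READ mapping
yields check_file_page(1, false, false), which only keeps the page on the
inactive list, whereas the same access through a PROT_EXEC mapping
activates it at once.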

Thanks for your advice.  About setting pressure priorities for each vma
and inode, I will send a new mail to the mailing list to discuss this
problem.  Maybe someone has some good ideas for it. ;-)

Regards,
Zheng
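
For reference, the PROT_EXEC experiment discussed in this thread amounts
to something like the following at the application level. This is a
minimal sketch, assuming a read-only shared mapping of the whole file;
map_index_file() and its parameters are hypothetical names, not taken
from the thread or from any real application.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a whole file read-only. If with_exec is non-zero, PROT_EXEC is
 * added, which is the workaround tested in the thread. Returns the
 * mapping address and stores its length in *len_out, or returns
 * MAP_FAILED on any error (including an empty file).
 */
void *map_index_file(const char *path, int with_exec, size_t *len_out)
{
    struct stat st;
    void *addr;
    int prot = PROT_READ;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return MAP_FAILED;

    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }

    if (with_exec)
        prot |= PROT_EXEC;   /* may fail with EPERM on a noexec mount */

    addr = mmap(NULL, (size_t)st.st_size, prot, MAP_SHARED, fd, 0);
    close(fd);               /* the mapping stays valid after close() */

    if (addr != MAP_FAILED)
        *len_out = (size_t)st.st_size;
    return addr;
}
```

A caller would pass with_exec = 1 only for the index files and release
the mapping with munmap(addr, len) when done. As noted in the thread,
marking plain data executable is a test aid rather than a clean
interface.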
