Subject: Re: mm: dirty page problem
From: Peter Zijlstra
To: xue yong
Cc: linux-kernel@vger.kernel.org
References: <1245753219.19816.1586.camel@twins> <1245757970.19816.1675.camel@twins>
Date: Tue, 23 Jun 2009 15:38:22 +0200
Message-Id: <1245764302.19816.1800.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2009-06-23 at 21:32 +0800, xue yong wrote:
> On Tue, Jun 23, 2009 at 7:52 PM, Peter Zijlstra wrote:
> > On Tue, 2009-06-23 at 19:43 +0800, xue yong wrote:
> >> Thanks a lot, Peter.
> >> Your reply resolved my doubt.
> >>
> >> we have a service program (just say A) running with about 14G mmaped data.
> >> and there is another daemon (just say B) do msync( SYNC) periodically.
> >>
> >> so I want to know in this pattern, was the data flushed to disk?
> >
> > I don't think so.
> >
> > The problem is that msync() only scans the current process' page tables,
> > which would be clean since B doesn't write, only A does.
> >
> > So you'd have to modify your program, A, to do the msync() itself --
> > possibly from a thread (threads share the vm context and thus page
> > tables).
> >
> >
> :) I did have this thought, because there was little bo(block out),
> and pmap showed that
> the dirty pages belong to a process was always growing.
>
> I believe you are the authority. Your confirmation matters.

I'm one of the people who knows this code rather well, yes ;-)

> In "Understanding the Linux® Virtual Memory Manager" page 163, Mel
> Gorman said that
> Process-mapped pages are not easily swappable because there is no
> way to map struct pages to PTEs except to search every page table, which is far
> too expensive.
> So neither kswapd nor other kernel daemons do the scan job.
> Without explicit action these pages would stay hidden.

While a great book to learn some of the basics from, it is severely
outdated. I think in his 2.6 chapter he does mention something about
reverse map, or rmap as it's called.

These days we do keep a data structure whereby it is easier to find all
ptes for a particular mapping (mm/rmap.c). In particular try_to_unmap()
is the routine used to remove all ptes in order to swap a page.
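
For reference, the suggestion quoted earlier in this thread (have A do the
msync() itself, possibly from a thread) could look roughly like the sketch
below. It is only an illustration under assumptions, not A's real code: the
names struct shared_map, map_file(), flusher() and write_and_flush(), the
one-page mapping size and the 10 ms flush interval are all made up for the
example. The only point it shows is that the writer and the msync() caller
live in the same process and so share page tables.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* All names here are hypothetical, chosen for this sketch only. */
struct shared_map {
    char *addr;       /* start of the MAP_SHARED mapping */
    size_t len;       /* length of the mapping in bytes */
    atomic_int stop;  /* set to 1 to ask the flusher thread to exit */
    int flushes;      /* successful msync() calls; read only after join */
};

/* Create (or open) the file, size it, and map it shared and writable. */
static int map_file(struct shared_map *m, const char *path, size_t len)
{
    assert(m != NULL && len > 0);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the fd is closed */
    if (p == MAP_FAILED)
        return -1;
    m->addr = p;
    m->len = len;
    atomic_init(&m->stop, 0);
    m->flushes = 0;
    return 0;
}

/* Runs inside the writing process, so msync() sees the dirtied PTEs. */
static void *flusher(void *arg)
{
    struct shared_map *m = arg;
    struct timespec interval = { 0, 10 * 1000 * 1000 }; /* 10 ms, arbitrary */

    while (!atomic_load(&m->stop)) {
        if (msync(m->addr, m->len, MS_SYNC) == 0)
            m->flushes++;
        nanosleep(&interval, NULL);
    }
    /* One last flush after the writer has signalled that it is done. */
    if (msync(m->addr, m->len, MS_SYNC) == 0)
        m->flushes++;
    return NULL;
}

/* Write msg through the mapping while a sibling thread does the msync().
 * Returns the number of successful flushes, or -1 on error. */
int write_and_flush(const char *path, const char *msg)
{
    struct shared_map m;
    pthread_t tid;
    size_t len = 4096; /* assumed mapping size for the demo */

    if (strlen(msg) >= len || map_file(&m, path, len) < 0)
        return -1;
    if (pthread_create(&tid, NULL, flusher, &m) != 0) {
        munmap(m.addr, m.len);
        return -1;
    }
    memcpy(m.addr, msg, strlen(msg)); /* dirties the page in this process */
    atomic_store(&m.stop, 1);
    pthread_join(tid, NULL);

    int n = m.flushes;
    munmap(m.addr, m.len);
    return n;
}
```

Something like write_and_flush("/tmp/demo.bin", "hello") would exercise it
(build with cc -pthread). A real service would of course keep the flusher
running for the lifetime of the mapping rather than for a single write.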