public inbox for linux-kernel@vger.kernel.org
From: Ray Bryant <raybry@sgi.com>
To: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: Andy Whitcroft <apw@shadowen.org>,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	Andrew Morton <akpm@digeo.com>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	anton@samba.org, sds@epoc.ncsc.mil, ak@suse.org,
	lse-tech@lists.sourceforge.net, linux-ia64@vger.kernel.org
Subject: Re: [Lse-tech] RE: [PATCH] HUGETLB memory commitment
Date: Mon, 05 Apr 2004 20:05:56 -0500
Message-ID: <40720274.3000700@sgi.com>
In-Reply-To: <200404052318.i35NIHF29964@unix-os.sc.intel.com>

Hi Ken,

Chen, Kenneth W wrote:
>>>>>Ray Bryant wrote on Monday, April 05, 2004 11:22 AM
>>>
>>>Chen, Kenneth W wrote:
>>>I actually started coding yesterday.  It doesn't look too bad (I think).
>>>I will post it once I finished it up later today or tomorrow.
>>
>>Hmmm...so did I.  Oh well.  We can pull the good ideas from both. :-)
> 
> 
> I did have a revelation from your original demand-paging patch with per-inode
> tracking ;-)  I extended it into tracking by struct address_space (so we don't
> pollute inode structure) and added per-block tracking.  See patch at the end of
> this post. I admit I had very pessimistic thoughts until I saw your patch.
> 

Cool!

Either way works, I think.  I just used the u.generic_ip pointer because it 
was there and convenient.  It's intended to be a hook in the VFS layer for a 
particular file system to add info to the inode, as near as I can tell, but 
neither hugetlbfs nor shmfs uses it, so it was free for the taking.

> 
> 
>>>There are still some oddities in the lifetime of the huge page reservation,
>>>but that can be discussed once everyone sees the code.
> 
> 
> I was thinking the lifetime of the huge page reservation should be the life
> of a mapping, i.e., only persist across mmap/munmap.  That means adding a ref
> count to the per-block tracking.  This seriously complicates the design
> because now the ref count needs to be updated in munmap and the fault handler
> in addition to mmap and truncate.  Not to mention that Andy Whitcroft already
> pointed out we don't get notification from munmap.  Plus it seriously
> complicates the tracking logic and has a performance downside as well.
> 
> I guess everyone is OK with reservation lives until file truncate?

One can certainly argue that the only thing that is required to live until 
file truncate is the contents of the huge pages in the page cache, since 
applications expect that the data will be there in the file/segment across 
program executions until the file is truncated or the segment deleted.

But it certainly makes sense to me that if Program A creates an mmap()'d file 
of 10 huge pages and Program B comes along later and re-mmap()s that same 
file, then Program B should be guaranteed to be able to touch all 10 pages, 
even if Program A only touched 5.  So that is an argument for having the 
reservation last until file truncate/segment removal time.

Additionally, recall that we are trying to emulate the behavior of the 
hugetlb_prefault() implementation.  Under that implementation, if Program A 
mmap()s 10 huge pages, then Program B is guaranteed not to get a SIGBUS when 
it mmap()s and references those 10 pages, provided only that the underlying 
file/segment was not deleted between the execution of the two programs.

So, I think we >>have<< to have the reservation last until file 
truncate/segment deletion time.  Fortunately, that turns out to be easier to 
implement as well.  :-)
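
To make the intended semantics concrete, here is a small user-space model of 
a reservation that lasts until truncate.  It is only a sketch, not code from 
either patch: the names hpool, hfile and model_*() are made up for 
illustration, and a real implementation would hang this state off the inode 
or address_space and take the appropriate locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the system-wide count of unpromised huge pages. */
struct hpool {
	unsigned long free_pages;
};

/* Hypothetical per-file record: huge pages promised to this file. */
struct hfile {
	unsigned long reserved;
};

/* mmap: grow the file's reservation to npages, or fail up front. */
static bool model_mmap(struct hpool *p, struct hfile *f, unsigned long npages)
{
	unsigned long extra;

	assert(p && f);
	if (npages <= f->reserved)
		return true;	/* already covered by an earlier mapping */
	extra = npages - f->reserved;
	if (extra > p->free_pages)
		return false;	/* refuse at mmap time, not SIGBUS at fault time */
	p->free_pages -= extra;
	f->reserved = npages;
	return true;
}

/* munmap: deliberately leaves the reservation alone. */
static void model_munmap(struct hpool *p, struct hfile *f)
{
	(void)p;
	(void)f;
}

/* fault: touching any page inside the reservation must succeed. */
static bool model_fault(const struct hfile *f, unsigned long idx)
{
	assert(f);
	return idx < f->reserved;
}

/* truncate: the only place pages are handed back to the pool. */
static void model_truncate(struct hpool *p, struct hfile *f, unsigned long npages)
{
	assert(p && f);
	if (npages >= f->reserved)
		return;
	p->free_pages += f->reserved - npages;
	f->reserved = npages;
}

/* Program A / Program B scenario; returns 0 if the guarantees hold. */
static int model_demo(void)
{
	struct hpool pool = { .free_pages = 12 };
	struct hfile file = { 0 }, other = { 0 };
	unsigned long i;

	if (!model_mmap(&pool, &file, 10))	/* Program A maps 10 pages */
		return -1;
	for (i = 0; i < 5; i++)			/* ...but touches only 5 */
		if (!model_fault(&file, i))
			return -1;
	model_munmap(&pool, &file);		/* Program A exits */

	if (model_mmap(&pool, &other, 5))	/* only 2 pages are unpromised */
		return -1;

	if (!model_mmap(&pool, &file, 10))	/* Program B re-maps the file */
		return -1;
	for (i = 0; i < 10; i++)		/* ...and may touch all 10 */
		if (!model_fault(&file, i))
			return -1;

	model_truncate(&pool, &file, 0);	/* reservation ends here */
	return pool.free_pages == 12 ? 0 : -1;
}
```

Note that model_munmap() has nothing to do, which is the point: with the 
reservation tied to the file rather than to the mapping, there is no ref 
count to maintain in the munmap or fault paths.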

I'll check through your patch and make sure we've both covered the same bases 
there.  If so, we should be good to go with either version.

-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------

