From: Ray Bryant <raybry@sgi.com>
To: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "'Andy Whitcroft'" <apw@shadowen.org>,
"Martin J. Bligh" <mbligh@aracnet.com>,
Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, anton@samba.org,
sds@epoch.ncsc.mil, ak@suse.de, lse-tech@lists.sourceforge.net,
linux-ia64@vger.kernel.org
Subject: Re: [Lse-tech] RE: [PATCH] HUGETLB memory commitment
Date: Mon, 05 Apr 2004 10:26:32 -0500
Message-ID: <40717AA8.9050900@sgi.com>
In-Reply-To: <200404040331.i343VSF02496@unix-os.sc.intel.com>
Ken,
Chen, Kenneth W wrote:
>
> A simple counter won't work for different file offset mapping. It has to
> be some sort of per-inode, per-block reservation tracking. I think we are
> steering in the right direction though.
>
>
>
OK, pardon my question about test code; that is trivial enough, I guess.
Anyway, the only way I can see to make this work with a non-zero offset is to
hang a list of segment descriptors (offset and size), one for each reserved
segment, off of the inode. Then when a new mapping comes in, we search the
segment list to see if the new offset and size overlap any of the existing
reserved segments. If they don't, then we make a new reservation (and request
file system quota) for the current size, and add the current request to the
reserved segment list. If they do, and the request fits entirely in a
previously reserved segment, then no change to reservation/quota needs to be
made. If it only partially fits, then we need to make a new reservation/quota
request for the number of new huge pages required and update the overlapping
segment's length to reflect the new reservation.
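
To make the bookkeeping above concrete, here is a rough user-space sketch of
the search-and-merge step. It is not kernel code and not taken from any patch
in this thread: struct hseg and hseg_reserve() are hypothetical names, offsets
and lengths are assumed to be already converted to huge-page units, and locking
is ignored. The return value is the number of huge pages that would need a new
reservation and quota request.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor: one reserved range [start, end), in huge pages. */
struct hseg {
    unsigned long start, end;
    struct hseg *next;
};

/*
 * Record a mapping of [start, start + len) and return how many of its huge
 * pages were not already covered by an existing segment, i.e. how many would
 * need a new reservation and quota charge. The list is kept sorted and
 * non-overlapping. Returns -1 on allocation failure, list unchanged.
 */
long hseg_reserve(struct hseg **head, unsigned long start, unsigned long len)
{
    unsigned long end = start + len, covered = 0;
    struct hseg **pp = head, *s, *n;

    assert(len > 0);
    n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->start = start;
    n->end = end;

    /* Skip segments that end strictly before the new range begins. */
    while (*pp && (*pp)->end < start)
        pp = &(*pp)->next;

    /* Absorb every segment that overlaps or touches the new range. */
    while ((s = *pp) != NULL && s->start <= n->end) {
        unsigned long lo = s->start > start ? s->start : start;
        unsigned long hi = s->end < end ? s->end : end;

        if (hi > lo)
            covered += hi - lo;     /* pages already reserved */
        if (s->start < n->start)
            n->start = s->start;
        if (s->end > n->end)
            n->end = s->end;
        *pp = s->next;
        free(s);
    }

    n->next = *pp;
    *pp = n;
    return (long)(len - covered);
}
```

A return of 0 corresponds to the "fits entirely" case, a return equal to len to
the "no overlap" case, and anything in between to the partial-fit case. Merging
touching segments is a choice made for this sketch to keep the list short; the
description above only requires extending the overlapping segment.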
Then in truncate_hugepages() we can search the segment list again, discarding
or trimming segments that lie entirely or partially beyond "lstart", as
appropriate, and doing hugetlb_unreserve() and hugetlbfs_put_quota() for the
corresponding number of pages.
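
The truncate side could look roughly like the following. Again this is only a
user-space sketch under assumptions: struct hseg and hseg_truncate() are
hypothetical names, "lstart" is taken to be already rounded to a huge-page
index, and the actual calls to hugetlb_unreserve() and hugetlbfs_put_quota()
are left out. The function just returns the page count they would be given.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor: one reserved range [start, end), in huge pages. */
struct hseg {
    unsigned long start, end;
    struct hseg *next;
};

/*
 * Drop every reserved page at or beyond 'lstart' and return how many pages
 * were released. Segments wholly beyond the cut are freed; a segment that
 * straddles the cut is trimmed; segments wholly below it are left alone.
 */
unsigned long hseg_truncate(struct hseg **head, unsigned long lstart)
{
    unsigned long released = 0;
    struct hseg **pp = head, *s;

    assert(head != NULL);
    while ((s = *pp) != NULL) {
        if (s->end <= lstart) {
            pp = &s->next;                  /* entirely below the cut */
        } else if (s->start >= lstart) {
            released += s->end - s->start;  /* entirely beyond: discard */
            *pp = s->next;
            free(s);
        } else {
            released += s->end - lstart;    /* straddles: trim the tail */
            s->end = lstart;
            pp = &s->next;
        }
    }
    return released;
}
```

The loop does not assume the list is sorted, so it visits every segment; with a
sorted list it could stop skipping as soon as it reaches the first segment that
crosses "lstart".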
This will be quite a bit of code and complexity. Do we still think this is
all worth it to follow Andrew's suggestion of no API changes for "allocate on
fault" hugetlbpages? It would be a lot cleaner just to return SIGBUS if we
run out of hugepages and be done with it, in spite of the API change.
Is there a simpler way to do the correct reservation? (One could allocate the
pages at mmap() time, resurrecting hugetlb_prefault(), but zero the pages at
fault time. This would solve the original problem we ran into at SGI, but
would not solve Andi's requirement to postpone allocation so that the NUMA
APIs can control placement.)
--
Best Regards,
Ray
-----------------------------------------------
Ray Bryant
512-453-9679 (work) 512-507-7807 (cell)
raybry@sgi.com raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
so I installed Linux.
-----------------------------------------------