From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
To: "'Ray Bryant'" <raybry@sgi.com>
Cc: "'Andy Whitcroft'" <apw@shadowen.org>,
"Martin J. Bligh" <mbligh@aracnet.com>,
"Andrew Morton" <akpm@osdl.org>, <linux-kernel@vger.kernel.org>,
<anton@samba.org>, <sds@epoch.ncsc.mil>, <ak@suse.de>,
<lse-tech@lists.sourceforge.net>, <linux-ia64@vger.kernel.org>
Subject: RE: [PATCH] HUGETLB memory commitment
Date: Sat, 3 Apr 2004 19:31:30 -0800
Message-ID: <200404040331.i343VSF02496@unix-os.sc.intel.com>
In-Reply-To: <406E3613.6080609@sgi.com>
>>>>> Ray Bryant wrote on Fri, April 02, 2004 7:57 PM
> Chen, Kenneth W wrote:
> >
> > Can we just RIP this whole hugetlb page overcommit?
> >
>
> Ken et al,
>
> Perhaps the following patch might be more to your liking. I'm
> sorry I haven't been contributing to this discussion -- I've been
> off doing this code first for Altix under 2.4.21 (one's got to eat,
> after all). Now I've ported the changes forward to Linux 2.6.5-rc3
> and tested them. The patch below is relative to that version of Linux.
Somehow the patch came through with extra white space at the beginning of
each line, but s/^ / / fixes that up.
> The hugetlb memory commit code does this with a single global counter:
> htlbzone_reserved, and a per inode reserved page count. The latter is
> used to decrement the global reserved page count when the inode is
> deleted or the file is truncated.
A simple counter won't work for mappings at different file offsets. It has
to be some sort of per-inode, per-block reservation tracking. I think we are
steering in the right direction, though.
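
To illustrate what per-inode, per-block tracking could look like, here is a minimal userspace sketch. Everything in it is an assumption made for illustration: the names hpage_resv, hpage_resv_range and MAX_HPAGES are hypothetical, the 64-block bitmap is an arbitrary limit, and none of it comes from the patch or from the kernel. The point is only that recording which blocks of the file are already reserved lets each mmap() charge exactly the blocks it adds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from the patch under review. */
#define MAX_HPAGES 64	/* assumed per-file limit, for this sketch only */

struct hpage_resv {
	uint64_t map;	/* bit i set => huge page block i of the file is reserved */
};

/*
 * Reserve blocks [pgoff, pgoff + npages) of one file.
 * Returns the number of blocks that were not reserved before (the amount
 * a caller would charge against a global pool), or -1 if the range does
 * not fit in the bitmap.
 */
int hpage_resv_range(struct hpage_resv *r, unsigned pgoff, unsigned npages)
{
	unsigned i;
	int added = 0;

	assert(r);
	if (pgoff > MAX_HPAGES || npages > MAX_HPAGES - pgoff)
		return -1;

	for (i = pgoff; i < pgoff + npages; i++) {
		uint64_t bit = (uint64_t)1 << i;

		if (!(r->map & bit)) {
			r->map |= bit;	/* first mapping to cover this block */
			added++;
		}
	}
	return added;
}
```

With this shape, two mappings that overlap are charged once for the shared blocks, and mappings at disjoint offsets are each charged in full.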
> diff -Nru a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> --- a/fs/hugetlbfs/inode.c Fri Apr 2 19:31:56 2004
> +++ b/fs/hugetlbfs/inode.c Fri Apr 2 19:31:56 2004
> @@ -59,19 +58,34 @@
> if (vma->vm_end - vma->vm_start < HPAGE_SIZE)
> return -EINVAL;
>
> - vma_len = (loff_t)(vma->vm_end - vma->vm_start);
> + reserved_pages = (vma->vm_end - vma->vm_start) >> HPAGE_SHIFT;
>
> down(&inode->i_sem);
> file_accessed(file);
> +
> + /* a second mmap() (or a rmap()) can change the reservation */
> + prev_reserved_pages = inode->u.data;
> +
> + /*
> + * if current mmap() is smaller than previous reservation,
> + * we don't change reservation or quota
> + */
> + if (reserved_pages >= prev_reserved_pages) {
> + new_reservation = reserved_pages - prev_reserved_pages;
> + if ((hugetlb_get_quota(mapping, new_reservation) < 0) ||
> + (hugetlb_reserve(new_reservation) < 0)) {
> + up(&inode->i_sem);
> + return -ENOMEM;
> + }
> + inode->i_size = reserved_pages << HPAGE_SHIFT;
> + inode->u.data = reserved_pages;
> + }
> + up(&inode->i_sem);
> +
This assumes all mmaps start from the same file offset. IMO, it's not
generic enough. This code will reserve only 1 page in the following
case, although there are actually 4 mappings totaling 4 pages:
mmap 1 page at file offset 0
mmap 1 page at file offset HPAGE_SIZE
mmap 1 page at file offset HPAGE_SIZE*2
mmap 1 page at file offset HPAGE_SIZE*3
Oh, this code breaks file system quota accounting as well.
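
The four-mapping case can be checked with a small userspace model of the quoted hunk's arithmetic. This is a sketch under stated assumptions, not kernel code: inode_model, counter_mmap and counter_total are hypothetical names, and the model keeps only the comparison against the previous per-inode count, leaving out quota and the global pool.

```c
#include <assert.h>

/* Hypothetical model of the per-inode count; stands in for inode->u.data. */
struct inode_model {
	unsigned long reserved;
};

/* Returns how many pages this mmap() would add to the reservation. */
unsigned long counter_mmap(struct inode_model *ino, unsigned long pgoff,
			   unsigned long npages)
{
	unsigned long added = 0;

	assert(ino);
	(void)pgoff;	/* as far as the quoted hunk shows, the offset is not consulted */

	/* Same test as the hunk: only grow when this mapping is at least as large. */
	if (npages >= ino->reserved) {
		added = npages - ino->reserved;
		ino->reserved = npages;
	}
	return added;
}

/* Total reserved after mapping 1 page at each of n consecutive offsets. */
unsigned long counter_total(unsigned long n)
{
	struct inode_model ino = { 0 };
	unsigned long total = 0, off;

	for (off = 0; off < n; off++)
		total += counter_mmap(&ino, off, 1);
	return total;
}
```

In this model counter_total(4) comes out as 1: the first mmap adds one page, and each later one sees reserved_pages equal to prev_reserved_pages and adds nothing.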
- Ken