* [PATCH] Bug #3054 madvise(MADV_WILLNEED,...) fix for exceeding rlimit rss
@ 2005-06-19 16:49 Matt Keenan
2005-06-19 20:23 ` Chris Wright
0 siblings, 1 reply; 3+ messages in thread
From: Matt Keenan @ 2005-06-19 16:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Here is an updated patch for 2.6.12, can it be included in -mm for
testing? I have been banging on the code for an hour or so now; no probs.
Matt
--- linux-2.6.11.7/mm/madvise.c 2005-04-12 15:58:30.000000000 +0100
+++ linux/mm/madvise.c 2005-06-19 17:20:56.000000000 +0100
@@ -61,6 +61,7 @@ static long madvise_willneed(struct vm_a
unsigned long start, unsigned long end)
{
struct file *file = vma->vm_file;
+ struct task_struct *tsk = current;
if (!file)
return -EBADF;
@@ -70,6 +71,28 @@ static long madvise_willneed(struct vm_a
end = vma->vm_end;
end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+ /*
+ * This code below checks to see if mapping the requested
+ * readahead would make the task's rss exceed the task's
+ * rlimit rss.
+ *
+ * This doesn't account for pages that may already be mapped
+ * due to readahead, but since this is merely a hint to the
+ * kernel no harm should be done, it won't unmap anything
+ * already mapped if it fails. N.B. This won't affect the
+ * kernel's internal automatic readahead which doesn't check
+ * (or honour) rlimit rss.
+ */
+
+ spin_lock(&tsk->mm->page_table_lock);
+ if (((max_sane_readahead(end-start) << PAGE_SHIFT) +
+ tsk->mm->_rss) > tsk->signal->rlim[RLIMIT_RSS].rlim_cur)
+ {
+ spin_unlock(&tsk->mm->page_table_lock);
+ return -EIO;
+ }
+ spin_unlock(&tsk->mm->page_table_lock);
+
force_page_cache_readahead(file->f_mapping,
file, start, max_sane_readahead(end - start));
return 0;
@@ -170,6 +193,8 @@ static long madvise_vma(struct vm_area_s
* -ENOMEM - addresses in the specified range are not currently
* mapped, or are outside the AS of the process.
* -EIO - an I/O error occurred while paging in data.
+ * - MADV_WILLNEED would map in pages that would make the task's
+ * rss exceed rlimit rss.
* -EBADF - map exists, but area maps something that isn't a file.
* -EAGAIN - a kernel resource was temporarily unavailable.
*/
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] Bug #3054 madvise(MADV_WILLNEED,...) fix for exceeding rlimit rss 2005-06-19 16:49 [PATCH] Bug #3054 madvise(MADV_WILLNEED,...) fix for exceeding rlimit rss Matt Keenan @ 2005-06-19 20:23 ` Chris Wright 2005-06-19 21:09 ` Matt Keenan 0 siblings, 1 reply; 3+ messages in thread From: Chris Wright @ 2005-06-19 20:23 UTC (permalink / raw) To: Matt Keenan; +Cc: Andrew Morton, linux-kernel * Matt Keenan (matthew.keenan@ntlworld.com) wrote: > --- linux-2.6.11.7/mm/madvise.c 2005-04-12 15:58:30.000000000 +0100 > +++ linux/mm/madvise.c 2005-06-19 17:20:56.000000000 +0100 > @@ -61,6 +61,7 @@ static long madvise_willneed(struct vm_a > unsigned long start, unsigned long end) > { > struct file *file = vma->vm_file; > + struct task_struct *tsk = current; Looks like you've got some tab/whitespace damage going on here. > if (!file) > return -EBADF; > @@ -70,6 +71,28 @@ static long madvise_willneed(struct vm_a > end = vma->vm_end; > end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff; > > + /* > + * This code below checks to see if mapping the requested > + * readahead would make the task's rss exceed the task's > + * rlimit rss. > + * > + * This doesn't account for pages that may already be mapped > + * due to readahead, but since this is merely a hint to the > + * kernel no harm should be done, it won't unmap anything > + * already mapped if it fails. N.B. This won't affect the > + * kernel's internal automatic readahead which doesn't check > + * (or honour) rlimit rss. > + */ > + > + spin_lock(&tsk->mm->page_table_lock); > + if (((max_sane_readahead(end-start) << PAGE_SHIFT) + > + tsk->mm->_rss) > tsk->signal->rlim[RLIMIT_RSS].rlim_cur) I doubt this one would overflow, but we recenly made changes in similar tests to use page count rather than byte count. I belive this should use get_mm_counter(). Isn't _rss counting pages rather than bytes, so I think the math is off here. Something like: if ((max_sane_readahead(end - start) + get_mm_counter(tsk->mm, rss)) > tsk->signal->rlim[RLIMIT_RSS].rlim_cur >> PAGE_SHIFT) thanks, -chris ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Bug #3054 madvise(MADV_WILLNEED,...) fix for exceeding rlimit rss 2005-06-19 20:23 ` Chris Wright @ 2005-06-19 21:09 ` Matt Keenan 0 siblings, 0 replies; 3+ messages in thread From: Matt Keenan @ 2005-06-19 21:09 UTC (permalink / raw) To: Chris Wright; +Cc: Andrew Morton, linux-kernel [-- Attachment #1: Type: text/plain, Size: 2332 bytes --] Chris Wright wrote: >* Matt Keenan (matthew.keenan@ntlworld.com) wrote: > > >>--- linux-2.6.11.7/mm/madvise.c 2005-04-12 15:58:30.000000000 +0100 >>+++ linux/mm/madvise.c 2005-06-19 17:20:56.000000000 +0100 >>@@ -61,6 +61,7 @@ static long madvise_willneed(struct vm_a >> unsigned long start, unsigned long end) >>{ >> struct file *file = vma->vm_file; >>+ struct task_struct *tsk = current; >> >> > >Looks like you've got some tab/whitespace damage going on here. > > > Damn mailer! I might have to go back to mutt. >> if (!file) >> return -EBADF; >>@@ -70,6 +71,28 @@ static long madvise_willneed(struct vm_a >> end = vma->vm_end; >> end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff; >> >>+ /* >>+ * This code below checks to see if mapping the requested >>+ * readahead would make the task's rss exceed the task's >>+ * rlimit rss. >>+ * >>+ * This doesn't account for pages that may already be mapped >>+ * due to readahead, but since this is merely a hint to the >>+ * kernel no harm should be done, it won't unmap anything >>+ * already mapped if it fails. N.B. This won't affect the >>+ * kernel's internal automatic readahead which doesn't check >>+ * (or honour) rlimit rss. >>+ */ >>+ >>+ spin_lock(&tsk->mm->page_table_lock); >>+ if (((max_sane_readahead(end-start) << PAGE_SHIFT) + >>+ tsk->mm->_rss) > tsk->signal->rlim[RLIMIT_RSS].rlim_cur) >> >> > >I doubt this one would overflow, but we recenly made changes in similar >tests to use page count rather than byte count. I belive this should >use get_mm_counter(). Isn't _rss counting pages rather than bytes, >so I think the math is off here. Something like: > > if ((max_sane_readahead(end - start) + get_mm_counter(tsk->mm, rss)) > > tsk->signal->rlim[RLIMIT_RSS].rlim_cur >> PAGE_SHIFT) > > > Ok, here is the patch again with the suggested fixes. I have attached the patch (yes yes i know!), that will probably screw up people's mail -> patch generators, but the whitespace issue should be fixed. Hopefully this is ok, I'm going to bang on it for an hour or so here to make sure. Signed-off-by: Matthew Keenan <matthew.keenan@ntlworld.com> [-- Attachment #2: patch-2.6.12-madvise-willneed-fix.diff --] [-- Type: text/x-patch, Size: 1866 bytes --] --- linux-2.6.11.7/mm/madvise.c 2005-04-12 15:58:30.000000000 +0100 +++ linux/mm/madvise.c 2005-06-19 21:39:57.000000000 +0100 @@ -61,6 +61,7 @@ static long madvise_willneed(struct vm_a unsigned long start, unsigned long end) { struct file *file = vma->vm_file; + struct task_struct *tsk = current; if (!file) return -EBADF; @@ -70,6 +71,28 @@ static long madvise_willneed(struct vm_a end = vma->vm_end; end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff; + /* + * This code below checks to see if mapping the requested + * readahead would make the task's rss exceed the task's + * rlimit rss. + * + * This doesn't account for pages that may already be mapped + * due to readahead, but since this is merely a hint to the + * kernel no harm should be done, it won't unmap anything + * already mapped if it fails. N.B. This won't affect the + * kernel's internal automatic readahead which doesn't check + * (or honour) rlimit rss. + */ + + spin_lock(&tsk->mm->page_table_lock); + if ((max_sane_readahead(end - start) + get_mm_counter(tsk->mm, rss)) > + (tsk->signal->rlim[RLIMIT_RSS].rlim_cur >> PAGE_SHIFT)) + { + spin_unlock(&tsk->mm->page_table_lock); + return -EIO; + } + spin_unlock(&tsk->mm->page_table_lock); + force_page_cache_readahead(file->f_mapping, file, start, max_sane_readahead(end - start)); return 0; @@ -170,6 +193,8 @@ static long madvise_vma(struct vm_area_s * -ENOMEM - addresses in the specified range are not currently * mapped, or are outside the AS of the process. * -EIO - an I/O error occurred while paging in data. + * - MADV_WILLNEED would map in pages that would make the task's + * rss exceed rlimit rss. * -EBADF - map exists, but area maps something that isn't a file. * -EAGAIN - a kernel resource was temporarily unavailable. */ ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2005-06-19 21:09 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-06-19 16:49 [PATCH] Bug #3054 madvise(MADV_WILLNEED,...) fix for exceeding rlimit rss Matt Keenan 2005-06-19 20:23 ` Chris Wright 2005-06-19 21:09 ` Matt Keenan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox