public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>,
	Jeff Dike <jdike@addtoit.com>, linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC][PATCH] OVERCOMMIT_ALWAYS extension
Date: Mon, 24 Oct 2005 13:22:02 -0700	[thread overview]
Message-ID: <435D426A.6070007@us.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.61.0510242027001.6509@goblin.wat.veritas.com>

Hugh Dickins wrote:
> On Thu, 20 Oct 2005, Badari Pulavarty wrote:
> 
>>Changes from previous:
>>
>>1) madvise(DISCARD) - zaps the range and discards the pages. So, no
>>need to call madvise(DONTNEED) before.
>>
>>2) I added truncate_inode_pages2_range() to just discard only the
>>range of pages - not the whole file.
>>
>>Hugh, when you get a chance could you review this instead ?
> 
> 
> I haven't had time to go through it thoroughly, and will have no time
> the next couple of days, but here are some remarks.

Excellent points all, I'll take some time here in the next couple days and work 
up a response to each and an updated patch.  Regarding the spaces... I must have 
  used the wrong vi wrapper to edit the patch - apologies, will be fixed in next 
rev.

--Darren


> 
> --- linux-2.6.14-rc3/include/asm-alpha/mman.h	2005-09-30 14:17:35.000000000 -0700
> +++ linux-2.6.14-rc3.db2/include/asm-alpha/mman.h	2005-10-20 10:52:37.000000000 -0700
> @@ -42,6 +42,7 @@
>  #define MADV_WILLNEED	3		/* will need these pages */
>  #define	MADV_SPACEAVAIL	5		/* ensure resources are available */
>  #define MADV_DONTNEED	6		/* don't need these pages */
> +#define MADV_DISCARD    7               /* discard pages right now */
> 
> Throughout the patch there's lots of spaces where there should be tabs.
> But I'm glad you've put a space after the "#define" here, unlike in that
> MADV_SPACEAVAIL higher up!  Not so glad at your spaces to the right of it.
> 
> Are we free to define MADV_DISCARD, coming after the others, in each of
> the architectures?  In general, I think mman.h reflects definitions made
> by native Operating Systems of the architectures in question, and they
> might have added a few since.
> 
> --- linux-2.6.14-rc3/include/linux/mm.h	2005-09-30 14:17:35.000000000 -0700
> +++ linux-2.6.14-rc3.db2/include/linux/mm.h	2005-10-20 13:41:57.000000000 -0700
> @@ -865,6 +865,7 @@ extern unsigned long do_brk(unsigned lon
>  /* filemap.c */
>  extern unsigned long page_unuse(struct page *);
>  extern void truncate_inode_pages(struct address_space *, loff_t);
> +extern void truncate_inode_pages2_range(struct address_space *, loff_t, loff_t);
> 
> Personally, I have an aversion to sticking a "2" in there.  I know you're
> just following the convention established by invalidate_inode_pages2, but..
> 
> Hold on, -mm already contains reiser4-truncate_inode_pages_range.patch,
> you should be working with that.  Doesn't it do just what you need,
> even without a "2" :-?
>  
> --- linux-2.6.14-rc3/mm/madvise.c	2005-09-30 14:17:35.000000000 -0700
> +++ linux-2.6.14-rc3.db2/mm/madvise.c	2005-10-20 13:37:41.000000000 -0700
> @@ -137,6 +137,40 @@ static long madvise_dontneed(struct vm_a
>  	return 0;
>  }
>  
> +static long madvise_discard(struct vm_area_struct * vma,
> +			     struct vm_area_struct ** prev,
> +			     unsigned long start, unsigned long end)
> +{
> ....
> +	error = madvise_dontneed(vma, prev, start, end);
> +	if (error)
> +		return error;
> +
> +	/* looks good, try and rip it out of page cache */
> +	printk("%s: trying to rip shm vma (%p) inode from page cache\n", __FUNCTION__, vma);
> +	offset = (loff_t)(start - vma->vm_start);
> +	endoff = (loff_t)(end - vma->vm_start);
> +	printk("call truncate_inode_pages(%p, %x %x)\n", mapping, 
> +			(unsigned int)offset, (unsigned int)endoff);
> +	down(&mapping->host->i_sem);
> +	truncate_inode_pages2_range(mapping, offset, endoff);
> +	up(&mapping->host->i_sem);
> +	return 0;
> +}
> 
> Hmm.  I don't think it's consistent to zap the pages from a single mm,
> then remove them from the page cache, while leaving the pages mapped into
> other mms.  Just what would those pages then be?  they're not file pages,
> they're not anonymous pages, such pages have given trouble in the past.
> 
> I think you'll need to follow vmtruncate much more closely - and the
> unmap_mapping_range code already allows for a range, shouldn't need
> much change - going through all the vmas before truncating the range.
> 
> Which makes it feel more like sys_fpunch than an madvise.
> 
> You of course need write access to the underlying file, is that checked?
> 
> What should it be doing to anonymous COWed pages?  Not clear whether
> it should be following truncate in discarding those too, or not.
> 
> Hugh
> 


-- 
Darren Hart
IBM Linux Technology Center
Linux Kernel Team
Phone: 503 578 3185
   T/L: 775 3185

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2005-10-24 20:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-17 17:30 [RFC] OVERCOMMIT_ALWAYS extension Badari Pulavarty
2005-10-17 18:13 ` Hugh Dickins
2005-10-17 18:25   ` Hugh Dickins
2005-10-17 23:14     ` Badari Pulavarty
2005-10-18 16:05     ` [RFC][PATCH] " Badari Pulavarty
2005-10-19 17:56       ` Hugh Dickins
2005-10-19 18:32         ` Jeff Dike
2005-10-19 21:21           ` Badari Pulavarty
2005-10-19 22:38             ` Jeff Dike
2005-10-19 18:50         ` Badari Pulavarty
2005-10-19 19:12           ` Darren Hart
2005-10-19 20:10           ` Hugh Dickins
2005-10-19 20:47           ` Jeff Dike
2005-10-20 15:11             ` Badari Pulavarty
2005-10-20 17:27               ` Jeff Dike
2005-10-20 22:37                 ` Badari Pulavarty
2005-10-24 20:04                   ` Hugh Dickins
2005-10-24 20:22                     ` Darren Hart [this message]
2005-10-24 20:24                     ` Badari Pulavarty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=435D426A.6070007@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=hugh@veritas.com \
    --cc=jdike@addtoit.com \
    --cc=linux-mm@kvack.org \
    --cc=pbadari@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox