From: Badari Pulavarty <pbadari@us.ibm.com>
To: Blaisorblade <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>, Hugh Dickins <hugh@veritas.com>,
akpm@osdl.org, andrea@suse.de, dvhltc@us.ibm.com,
linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC] madvise(MADV_TRUNCATE)
Date: Fri, 28 Oct 2005 11:56:58 -0700
Message-ID: <4362747A.5060700@us.ibm.com>
In-Reply-To: <200510282040.29856.blaisorblade@yahoo.it>

Blaisorblade wrote:
> On Friday 28 October 2005 18:16, Badari Pulavarty wrote:
>
>>Blaisorblade wrote:
>>
>>>On Friday 28 October 2005 05:46, Jeff Dike wrote:
>>>
>>>>On Wed, Oct 26, 2005 at 03:49:55PM -0700, Badari Pulavarty wrote:
>
>
>>>On the plan, however, I have a concern: VM_NONLINEAR.
>
>
>>>However, looking at the patch, the implementation would boil down to
>>>something like
>>>
>>>for each page in range {
>>>    start = page_offset(page);
>>>    end = start + PAGE_SIZE;
>>>    truncate_inode_pages_range(mapping, start, end);
>>>    inode->i_op->truncate_range(inode, start, end);
>>>}
>>>
>>>unmap_mapping_range() should be done at once for the whole range.
>>
>>The patch does
>>
>>for all the pages in the given vma {
>>    unmap_mapping_range(mapping, offset, end);
>>    truncate_inode_pages_range(mapping, offset, end);
>>    inode->i_op->truncate_range(inode, offset, end);
>>}
>
>
>>It operates on a bunch of pages in the given VMA. Since UML has
>>one page per VMA, it operates on one page at a time - do you
>>see anything wrong here?
>
>
> My point was the support for VM_NONLINEAR. In the future, UML will have one big
> VMA, but different pages will be remapped with different offsets (already in
> mainline) and different protections (I have patches, I sent an earlier
> version, still revising).
>
> In that case, you could really truncate (in one single call) pages which are
> one at the start of the file and one at the end. That's why with VM_NONLINEAR
> it wouldn't work.
>
> However, Jeff pointed out to me that we'd probably call madvise() on the linear
> kernel mapping (the kernel maps pages from the RAM file all at once,
> linearly). So you can safely just refuse operating on VM_NONLINEAR vmas.
>
>
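
As a minimal user-space sketch of the "refuse VM_NONLINEAR vmas" idea above
(not code from the patch): the struct, the function name and the MODEL_
constant are hypothetical stand-ins, and the flag value is an assumption
modelled on 2.6-era include/linux/mm.h.

```c
#include <errno.h>

/* Assumption: stand-in for the kernel's VM_NONLINEAR flag bit. */
#define MODEL_VM_NONLINEAR 0x00800000UL

/* Hypothetical model holding only the VMA field this check needs. */
struct vma_flags_model {
    unsigned long vm_flags;
};

/*
 * Returns 0 when a range-based truncate may proceed on this VMA and
 * -EINVAL when the VMA is nonlinear, because a contiguous address range
 * no longer corresponds to a contiguous range of file offsets there.
 */
static int check_vma_is_linear(const struct vma_flags_model *vma)
{
    if (vma->vm_flags & MODEL_VM_NONLINEAR)
        return -EINVAL;
    return 0;
}
```

The check would sit before any offset computation, so that a nonlinear
VMA is rejected before anything is unmapped or truncated.
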
>>>While looking at these, here's what I'd call "strange" in the patch:
>
>
>>>Also, why is unmap_mapping_range done with the inode semaphore held? I
>>>don't remember the locking rules, but conceptually this has no point, IMHO.
>
>
>>I am not sure either, let me look at it. (I thought we should hold it
>>for truncate()).
>
>
> Ok, do_truncate() holds the semaphore around the whole operation, because
> it's implemented in a radically different way (through notify_change()).
>
> We don't need IMHO to do things that way; we don't even change i_size - not
> even when at the end of the file, as we don't want SIGBUS.
>
> And anyway FS's must already handle holes at the end of a file.
>
> Btw, when truncating, notify_change does:
>
>     if (ia_valid & ATTR_SIZE)
>         down_write(&dentry->d_inode->i_alloc_sem);
>
> (which I suppose is used to protect against concurrent file extensions - page
> allocations in previous holes - and such). You should probably take that too
> (nest it inside mapping->host->i_sem).
>
> Also, vmtruncate is called with the semaphore held because it must call
> truncate_inode_pages(), and because even the calls to i_size_write() must be
> atomic with the rest. But other than that, there's no reason. Especially,
> unmap_mapping_range() does purely pagetable operations.
>
>
>>>Btw, why don't I see vm_pgoff mentioned in these lines of the patch (nor
>>>anywhere else in the patch)?
>
>
>>vm_pgoff - I don't remember what that is supposed to represent...
>
>
> Call mmap() with non-0 pgoff (i.e. offset in the file), say the second file
> page. You're gonna store the pgoff parameter in vma->vm_pgoff (in PAGE_SHIFT
> units).
>
> If I then request you to truncate the first page in the VMA, how does your
> code realize that it should punch the second page rather than the first?
>
> However, Jeff said this _isn't_ the bug he's hitting - in his case the VMA has
> a 0 initial offset (for the same reason we don't need VM_NONLINEAR support).
>
>
>>>You call truncate_inode_pages_range(mapping, offset, endoff), so I think
>>>you're really burned here.
>
>
>>>+offset = (loff_t)(start - vma->vm_start);
>>>+endoff = (loff_t)(end - vma->vm_start);
>
>
> So they would become:
>
> offset = (loff_t)(start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
>
> or with page_offset(). Btw, shouldn't this be done by some macro in
> <linux/pagemap.h>, as page_offset() and linear_page_index()?
>
> Btw, also compare with mm/rmap.c:vma_address()/page_address_in_vma().
>
>
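
The vm_pgoff point can be illustrated with a small user-space model. This
is only a sketch under stated assumptions: struct vma_model, both function
names and MODEL_PAGE_SHIFT (4 KiB pages) are hypothetical, and the kernel's
own helpers in <linux/pagemap.h> are not used. Note that in C '+' binds
tighter than '<<', so the shift has to be parenthesised and the page offset
widened before shifting.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real PAGE_SHIFT is per-architecture. */
#define MODEL_PAGE_SHIFT 12

/* Hypothetical stand-in for the two VMA fields involved. */
struct vma_model {
    unsigned long vm_start; /* first virtual address of the mapping */
    unsigned long vm_pgoff; /* file offset of vm_start, in pages */
};

/* Translate a virtual address inside the VMA to a byte offset in the file. */
static int64_t vma_addr_to_file_offset(const struct vma_model *vma,
                                       unsigned long addr)
{
    assert(addr >= vma->vm_start);
    /* Widen vm_pgoff first, then shift only that term. */
    return (int64_t)(addr - vma->vm_start)
         + ((int64_t)vma->vm_pgoff << MODEL_PAGE_SHIFT);
}

/*
 * What "a + b << s" means to the compiler: the whole sum is shifted.
 * Kept only to show how the result diverges from the intended one.
 */
static int64_t vma_addr_to_file_offset_as_parsed(const struct vma_model *vma,
                                                 unsigned long addr)
{
    return ((int64_t)(addr - vma->vm_start) + (int64_t)vma->vm_pgoff)
           << MODEL_PAGE_SHIFT;
}
```

With vm_pgoff = 1 (a mapping that starts at the second file page), the
first page of the VMA maps to file offset 4096, i.e. the second page of
the file, which is the case described above. The two functions agree at
the very start of the VMA and diverge for any later address.
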
>>"end" here is not the end of the VMA - it's the end of the region we want
>>to discard (in the UML case it's start + PAGE_SIZE). Anything wrong?
>
>
> All ok for that, I was complaining about not using ->vm_pgoff.
>
> I had the doubt that vm_pgoff entered the picture later, but I'm sure
> truncate_inode_pages{_range} wants file offsets, so it wasn't something I was
> missing.

Thank you for your comments. I need some time to digest all these,
make changes and debug the current problem. For now, I am going
to restrict/ignore the VM_NONLINEAR case.

I will get back to you on Monday.

Thanks,
Badari