From: Badari Pulavarty <pbadari@us.ibm.com>
To: lkml <linux-kernel@vger.kernel.org>
Cc: Hugh Dickins <hugh@veritas.com>,
akpm@osdl.org, andrea@suse.de, dvhltc@us.ibm.com,
linux-mm <linux-mm@kvack.org>,
Blaisorblade <blaisorblade@yahoo.it>,
Jeff Dike <jdike@addtoit.com>
Subject: [PATCH] 2.6.14 patch for supporting madvise(MADV_FREE)
Date: Tue, 01 Nov 2005 17:15:01 -0800 [thread overview]
Message-ID: <1130894101.24503.64.camel@localhost.localdomain> (raw)
In-Reply-To: <20051101000509.GA11847@ccure.user-mode-linux.org>
[-- Attachment #1: Type: text/plain, Size: 1124 bytes --]
Hi All,
Here is the patch to support madvise(MADV_FREE) - which frees
up the given range of pages and truncates the underlying backing
store. This basically provides "punch hole into file" functionality.
Currently it supports ONLY shmfs/tmpfs - where we have short term
need. Other filesystems return -ENOSYS.
Yes. This is a *crazy* interface to do it. But this is what
we exactly need for now. Here is the discussion on linux-mm
(for all the fun discussion and the naming):
http://marc.theaimsgroup.com/?l=linux-mm&m=113078625426989&w=2
Andrew, could you include this in your next -mm release ?
BTW, for completeness - this patch includes reiser4-truncate-inode-
pages-range patch from your -mm series. If you want me to re-work
my patch without that, please let me know.
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.14-
rc5/2.6.14-rc5-mm1/broken-out/reiser4-truncate_inode_pages_range.patch
I tested with my test cases and Jeff Dike was kind enough to provide
a test case with UML - which found more bugs. I thank Andrea & Hugh
for helping me out heavily :)
Comments ?
Thanks,
Badari
[-- Attachment #2: madvise-free.patch --]
[-- Type: text/x-patch, Size: 22594 bytes --]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
diff -Naurp -X dontdiff linux-2.6.14/include/asm-alpha/mman.h linux-2.6.14.madv/include/asm-alpha/mman.h
--- linux-2.6.14/include/asm-alpha/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-alpha/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -42,6 +42,7 @@
#define MADV_WILLNEED 3 /* will need these pages */
#define MADV_SPACEAVAIL 5 /* ensure resources are available */
#define MADV_DONTNEED 6 /* don't need these pages */
+#define MADV_FREE 7 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-arm/mman.h linux-2.6.14.madv/include/asm-arm/mman.h
--- linux-2.6.14/include/asm-arm/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-arm/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-arm26/mman.h linux-2.6.14.madv/include/asm-arm26/mman.h
--- linux-2.6.14/include/asm-arm26/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-arm26/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-cris/mman.h linux-2.6.14.madv/include/asm-cris/mman.h
--- linux-2.6.14/include/asm-cris/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-cris/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -37,6 +37,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-frv/mman.h linux-2.6.14.madv/include/asm-frv/mman.h
--- linux-2.6.14/include/asm-frv/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-frv/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-h8300/mman.h linux-2.6.14.madv/include/asm-h8300/mman.h
--- linux-2.6.14/include/asm-h8300/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-h8300/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-i386/mman.h linux-2.6.14.madv/include/asm-i386/mman.h
--- linux-2.6.14/include/asm-i386/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-i386/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-ia64/mman.h linux-2.6.14.madv/include/asm-ia64/mman.h
--- linux-2.6.14/include/asm-ia64/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-ia64/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -43,6 +43,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-m32r/mman.h linux-2.6.14.madv/include/asm-m32r/mman.h
--- linux-2.6.14/include/asm-m32r/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-m32r/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -37,6 +37,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-m68k/mman.h linux-2.6.14.madv/include/asm-m68k/mman.h
--- linux-2.6.14/include/asm-m68k/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-m68k/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-mips/mman.h linux-2.6.14.madv/include/asm-mips/mman.h
--- linux-2.6.14/include/asm-mips/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-mips/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -65,6 +65,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-parisc/mman.h linux-2.6.14.madv/include/asm-parisc/mman.h
--- linux-2.6.14/include/asm-parisc/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-parisc/mman.h 2005-11-01 11:47:06.000000000 -0800
@@ -38,6 +38,7 @@
#define MADV_SPACEAVAIL 5 /* insure that resources are reserved */
#define MADV_VPS_PURGE 6 /* Purge pages from VM page cache */
#define MADV_VPS_INHERIT 7 /* Inherit parents page size */
+#define MADV_FREE 8 /* free up these pages */
/* The range 12-64 is reserved for page size specification. */
#define MADV_4K_PAGES 12 /* Use 4K pages */
diff -Naurp -X dontdiff linux-2.6.14/include/asm-powerpc/mman.h linux-2.6.14.madv/include/asm-powerpc/mman.h
--- linux-2.6.14/include/asm-powerpc/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-powerpc/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -44,6 +44,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-s390/mman.h linux-2.6.14.madv/include/asm-s390/mman.h
--- linux-2.6.14/include/asm-s390/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-s390/mman.h 2005-11-01 11:46:37.000000000 -0800
@@ -43,6 +43,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-sh/mman.h linux-2.6.14.madv/include/asm-sh/mman.h
--- linux-2.6.14/include/asm-sh/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-sh/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-sparc/mman.h linux-2.6.14.madv/include/asm-sparc/mman.h
--- linux-2.6.14/include/asm-sparc/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-sparc/mman.h 2005-11-01 11:45:42.000000000 -0800
@@ -53,7 +53,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
-#define MADV_FREE 0x5 /* (Solaris) contents can be freed */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-sparc64/mman.h linux-2.6.14.madv/include/asm-sparc64/mman.h
--- linux-2.6.14/include/asm-sparc64/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-sparc64/mman.h 2005-11-01 11:46:08.000000000 -0800
@@ -53,7 +53,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
-#define MADV_FREE 0x5 /* (Solaris) contents can be freed */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-v850/mman.h linux-2.6.14.madv/include/asm-v850/mman.h
--- linux-2.6.14/include/asm-v850/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-v850/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -32,6 +32,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-x86_64/mman.h linux-2.6.14.madv/include/asm-x86_64/mman.h
--- linux-2.6.14/include/asm-x86_64/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-x86_64/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -36,6 +36,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/asm-xtensa/mman.h linux-2.6.14.madv/include/asm-xtensa/mman.h
--- linux-2.6.14/include/asm-xtensa/mman.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/asm-xtensa/mman.h 2005-11-01 11:45:09.000000000 -0800
@@ -72,6 +72,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_FREE 0x5 /* free up these pages */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -Naurp -X dontdiff linux-2.6.14/include/linux/fs.h linux-2.6.14.madv/include/linux/fs.h
--- linux-2.6.14/include/linux/fs.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/linux/fs.h 2005-11-01 11:45:09.000000000 -0800
@@ -995,6 +995,7 @@ struct inode_operations {
ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*removexattr) (struct dentry *, const char *);
+ void (*truncate_range)(struct inode *, loff_t, loff_t);
};
struct seq_file;
diff -Naurp -X dontdiff linux-2.6.14/include/linux/mm.h linux-2.6.14.madv/include/linux/mm.h
--- linux-2.6.14/include/linux/mm.h 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/include/linux/mm.h 2005-11-01 11:45:09.000000000 -0800
@@ -704,6 +704,7 @@ static inline void unmap_shared_mapping_
}
extern int vmtruncate(struct inode * inode, loff_t offset);
+extern int vmtruncate_range(struct inode * inode, loff_t offset, loff_t end);
extern pud_t *FASTCALL(__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address));
extern pmd_t *FASTCALL(__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address));
extern pte_t *FASTCALL(pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
@@ -865,6 +866,7 @@ extern unsigned long do_brk(unsigned lon
/* filemap.c */
extern unsigned long page_unuse(struct page *);
extern void truncate_inode_pages(struct address_space *, loff_t);
+extern void truncate_inode_pages_range(struct address_space *, loff_t, loff_t);
/* generic vm_area_ops exported for stackable file systems */
extern struct page *filemap_nopage(struct vm_area_struct *, unsigned long, int *);
diff -Naurp -X dontdiff linux-2.6.14/mm/madvise.c linux-2.6.14.madv/mm/madvise.c
--- linux-2.6.14/mm/madvise.c 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/mm/madvise.c 2005-11-01 12:01:27.000000000 -0800
@@ -140,6 +140,39 @@ static long madvise_dontneed(struct vm_a
return 0;
}
+/*
+ * Application wants to free up the pages and associated backing store.
+ * This is effectively punching a hole into the middle of a file.
+ *
+ * NOTE: Currently, only shmfs/tmpfs is supported for this operation.
+ * Other filesystems return -ENOSYS.
+ */
+static long madvise_free(struct vm_area_struct * vma,
+ unsigned long start, unsigned long end)
+{
+ struct address_space *mapping;
+ loff_t offset, endoff;
+
+ if (vma->vm_flags & (VM_LOCKED|VM_NONLINEAR|VM_HUGETLB))
+ return -EINVAL;
+
+ if (!vma->vm_file || !vma->vm_file->f_mapping
+ || !vma->vm_file->f_mapping->host) {
+ return -EINVAL;
+ }
+
+ mapping = vma->vm_file->f_mapping;
+ if (mapping == &swapper_space) {
+ return -EINVAL;
+ }
+
+ offset = (loff_t)(start - vma->vm_start)
+ + (vma->vm_pgoff << PAGE_SHIFT);
+ endoff = (loff_t)(end - vma->vm_start - 1)
+ + (vma->vm_pgoff << PAGE_SHIFT);
+ return vmtruncate_range(mapping->host, offset, endoff);
+}
+
static long
madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
unsigned long start, unsigned long end, int behavior)
@@ -152,6 +185,9 @@ madvise_vma(struct vm_area_struct *vma,
case MADV_RANDOM:
error = madvise_behavior(vma, prev, start, end, behavior);
break;
+ case MADV_FREE:
+ error = madvise_free(vma, start, end);
+ break;
case MADV_WILLNEED:
error = madvise_willneed(vma, prev, start, end);
@@ -190,6 +226,8 @@ madvise_vma(struct vm_area_struct *vma,
* some pages ahead.
* MADV_DONTNEED - the application is finished with the given range,
* so the kernel can free resources associated with it.
+ * MADV_FREE - the application wants to free up the given range of
+ * pages and associated backing store.
*
* return values:
* zero - success
diff -Naurp -X dontdiff linux-2.6.14/mm/memory.c linux-2.6.14.madv/mm/memory.c
--- linux-2.6.14/mm/memory.c 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/mm/memory.c 2005-11-01 11:45:09.000000000 -0800
@@ -1597,6 +1597,32 @@ out_busy:
EXPORT_SYMBOL(vmtruncate);
+int vmtruncate_range(struct inode * inode, loff_t offset, loff_t end)
+{
+ struct address_space *mapping = inode->i_mapping;
+
+ /*
+ * If the underlying filesystem is not going to provide
+ * a way to truncate a range of blocks (punch a hole) -
+ * we should return failure right now.
+ */
+ if (!inode->i_op || !inode->i_op->truncate_range)
+ return -ENOSYS;
+
+ /* XXX - Do we need both i_sem and i_allocsem all the way ? */
+ down(&inode->i_sem);
+ down_write(&inode->i_alloc_sem);
+ unmap_mapping_range(mapping, offset, (end - offset), 1);
+ truncate_inode_pages_range(mapping, offset, end);
+ inode->i_op->truncate_range(inode, offset, end);
+ up_write(&inode->i_alloc_sem);
+ up(&inode->i_sem);
+
+ return 0;
+}
+
+EXPORT_SYMBOL(vmtruncate_range);
+
/*
* Primitive swap readahead code. We simply read an aligned block of
* (1 << page_cluster) entries in the swap area. This method is chosen
diff -Naurp -X dontdiff linux-2.6.14/mm/shmem.c linux-2.6.14.madv/mm/shmem.c
--- linux-2.6.14/mm/shmem.c 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/mm/shmem.c 2005-11-01 11:45:09.000000000 -0800
@@ -459,7 +459,7 @@ static void shmem_free_pages(struct list
} while (next);
}
-static void shmem_truncate(struct inode *inode)
+static void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end)
{
struct shmem_inode_info *info = SHMEM_I(inode);
unsigned long idx;
@@ -477,18 +477,27 @@ static void shmem_truncate(struct inode
long nr_swaps_freed = 0;
int offset;
int freed;
+ int punch_hole = 0;
inode->i_ctime = inode->i_mtime = CURRENT_TIME;
- idx = (inode->i_size + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+ idx = (start + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
if (idx >= info->next_index)
return;
spin_lock(&info->lock);
info->flags |= SHMEM_TRUNCATE;
- limit = info->next_index;
- info->next_index = idx;
+ if (likely(end == (loff_t) -1)) {
+ limit = info->next_index;
+ info->next_index = idx;
+ } else {
+ limit = (end + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+ if (limit > info->next_index)
+ limit = info->next_index;
+ punch_hole = 1;
+ }
+
topdir = info->i_indirect;
- if (topdir && idx <= SHMEM_NR_DIRECT) {
+ if (topdir && idx <= SHMEM_NR_DIRECT && !punch_hole) {
info->i_indirect = NULL;
nr_pages_to_free++;
list_add(&topdir->lru, &pages_to_free);
@@ -575,11 +584,12 @@ static void shmem_truncate(struct inode
subdir->nr_swapped -= freed;
if (offset)
spin_unlock(&info->lock);
- BUG_ON(subdir->nr_swapped > offset);
+ if (!punch_hole)
+ BUG_ON(subdir->nr_swapped > offset);
}
if (offset)
offset = 0;
- else if (subdir) {
+ else if (subdir && !subdir->nr_swapped) {
dir[diroff] = NULL;
nr_pages_to_free++;
list_add(&subdir->lru, &pages_to_free);
@@ -596,7 +606,7 @@ done2:
* Also, though shmem_getpage checks i_size before adding to
* cache, no recheck after: so fix the narrow window there too.
*/
- truncate_inode_pages(inode->i_mapping, inode->i_size);
+ truncate_inode_pages_range(inode->i_mapping, start, end);
}
spin_lock(&info->lock);
@@ -616,6 +626,11 @@ done2:
}
}
+static void shmem_truncate(struct inode *inode)
+{
+ shmem_truncate_range(inode, inode->i_size, (loff_t)-1);
+}
+
static int shmem_notify_change(struct dentry *dentry, struct iattr *attr)
{
struct inode *inode = dentry->d_inode;
@@ -2083,6 +2098,7 @@ static struct file_operations shmem_file
static struct inode_operations shmem_inode_operations = {
.truncate = shmem_truncate,
.setattr = shmem_notify_change,
+ .truncate_range = shmem_truncate_range,
};
static struct inode_operations shmem_dir_inode_operations = {
diff -Naurp -X dontdiff linux-2.6.14/mm/truncate.c linux-2.6.14.madv/mm/truncate.c
--- linux-2.6.14/mm/truncate.c 2005-10-27 17:02:08.000000000 -0700
+++ linux-2.6.14.madv/mm/truncate.c 2005-11-01 11:45:09.000000000 -0800
@@ -91,12 +91,15 @@ invalidate_complete_page(struct address_
}
/**
- * truncate_inode_pages - truncate *all* the pages from an offset
+ * truncate_inode_pages - truncate range of pages specified by start and
+ * end byte offsets
* @mapping: mapping to truncate
* @lstart: offset from which to truncate
+ * @lend: offset to which to truncate
*
- * Truncate the page cache at a set offset, removing the pages that are beyond
- * that offset (and zeroing out partial pages).
+ * Truncate the page cache, removing the pages that are between
+ * specified offsets (and zeroing out partial page
+ * (if lstart is not page aligned)).
*
* Truncate takes two passes - the first pass is nonblocking. It will not
* block on page locks and it will not block on writeback. The second pass
@@ -110,12 +113,12 @@ invalidate_complete_page(struct address_
* We pass down the cache-hot hint to the page freeing code. Even if the
* mapping is large, it is probably the case that the final pages are the most
* recently touched, and freeing happens in ascending file offset order.
- *
- * Called under (and serialised by) inode->i_sem.
*/
-void truncate_inode_pages(struct address_space *mapping, loff_t lstart)
+void truncate_inode_pages_range(struct address_space *mapping,
+ loff_t lstart, loff_t lend)
{
const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+ pgoff_t end;
const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
struct pagevec pvec;
pgoff_t next;
@@ -124,13 +127,22 @@ void truncate_inode_pages(struct address
if (mapping->nrpages == 0)
return;
+ BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
+ end = (lend >> PAGE_CACHE_SHIFT);
+
pagevec_init(&pvec, 0);
next = start;
- while (pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
+ while (next <= end &&
+ pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
for (i = 0; i < pagevec_count(&pvec); i++) {
struct page *page = pvec.pages[i];
pgoff_t page_index = page->index;
+ if (page_index > end) {
+ next = page_index;
+ break;
+ }
+
if (page_index > next)
next = page_index;
next++;
@@ -166,9 +178,15 @@ void truncate_inode_pages(struct address
next = start;
continue;
}
+ if (pvec.pages[0]->index > end) {
+ pagevec_release(&pvec);
+ break;
+ }
for (i = 0; i < pagevec_count(&pvec); i++) {
struct page *page = pvec.pages[i];
+ if (page->index > end)
+ break;
lock_page(page);
wait_on_page_writeback(page);
if (page->index > next)
@@ -180,7 +198,19 @@ void truncate_inode_pages(struct address
pagevec_release(&pvec);
}
}
+EXPORT_SYMBOL(truncate_inode_pages_range);
+/**
+ * truncate_inode_pages - truncate *all* the pages from an offset
+ * @mapping: mapping to truncate
+ * @lstart: offset from which to truncate
+ *
+ * Called under (and serialised by) inode->i_sem.
+ */
+void truncate_inode_pages(struct address_space *mapping, loff_t lstart)
+{
+ truncate_inode_pages_range(mapping, lstart, (loff_t)-1);
+}
EXPORT_SYMBOL(truncate_inode_pages);
/**
next prev parent reply other threads:[~2005-11-02 1:15 UTC|newest]
Thread overview: 114+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-10-26 22:49 [RFC] madvise(MADV_TRUNCATE) Badari Pulavarty
2005-10-27 8:38 ` Andi Kleen
2005-10-27 13:17 ` Andrea Arcangeli
2005-10-27 15:00 ` Badari Pulavarty
2005-10-27 15:11 ` Andrea Arcangeli
2005-10-27 18:20 ` Andrew Morton
2005-10-27 18:35 ` Badari Pulavarty
2005-10-27 18:50 ` Andrew Morton
2005-10-27 19:40 ` Gerrit Huizenga
2005-10-27 19:56 ` Andi Kleen
2005-10-27 23:21 ` Darren Hart
2005-10-27 20:05 ` Theodore Ts'o
2005-10-27 20:16 ` Andrea Arcangeli
2005-10-28 1:42 ` Badari Pulavarty
2005-10-28 16:33 ` Theodore Ts'o
2005-10-27 20:22 ` Jeff Dike
2005-10-27 20:04 ` Andrea Arcangeli
2005-10-27 20:50 ` Andrew Morton
2005-10-27 21:37 ` Andrea Arcangeli
2005-10-27 22:23 ` Andrew Morton
2005-10-27 23:05 ` Badari Pulavarty
2005-10-27 23:16 ` Andrew Morton
2005-10-27 23:33 ` Peter Chubb
2005-10-28 0:22 ` Andrea Arcangeli
2005-10-28 0:32 ` Andrew Morton
2005-10-28 1:10 ` Andrea Arcangeli
2005-10-28 1:27 ` Badari Pulavarty
2005-10-28 2:00 ` Andrew Morton
2005-10-27 22:32 ` Badari Pulavarty
2005-10-27 23:28 ` Peter Chubb
2005-10-27 23:49 ` Andrew Morton
2005-10-27 23:56 ` Nathan Scott
2005-10-28 0:15 ` Andrea Arcangeli
2005-10-27 23:59 ` Peter Chubb
2005-10-28 3:46 ` Jeff Dike
2005-10-28 11:03 ` Blaisorblade
2005-10-28 13:29 ` Andrea Arcangeli
2005-10-28 16:56 ` Blaisorblade
2005-10-28 16:16 ` Badari Pulavarty
2005-10-28 18:40 ` Blaisorblade
2005-10-28 18:56 ` Badari Pulavarty
2005-10-29 0:35 ` Badari Pulavarty
2005-10-28 16:19 ` Badari Pulavarty
2005-10-28 17:10 ` Blaisorblade
2005-10-28 18:28 ` Jeff Dike
2005-10-28 18:44 ` Blaisorblade
2005-10-28 18:42 ` Jeff Dike
2005-10-28 18:54 ` Badari Pulavarty
2005-10-29 0:03 ` Badari Pulavarty
2005-10-29 2:51 ` Jeff Dike
2005-10-31 16:34 ` Badari Pulavarty
2005-10-31 19:15 ` Badari Pulavarty
2005-10-31 19:49 ` [RFC][PATCH] madvise(MADV_TRUNCATE) Badari Pulavarty
2005-11-01 0:05 ` Jeff Dike
2005-11-02 1:15 ` Badari Pulavarty [this message]
2005-11-02 1:43 ` [PATCH] 2.6.14 patch for supporting madvise(MADV_FREE) Andrea Arcangeli
2005-11-02 1:43 ` Andrea Arcangeli
2005-11-02 15:49 ` Badari Pulavarty
2005-11-02 15:49 ` Badari Pulavarty
2005-11-02 16:12 ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Badari Pulavarty
2005-11-02 19:54 ` New bug in patch and existing Linux code - race with install_page() (was: Re: [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE)) Blaisorblade
2005-11-02 19:54 ` Blaisorblade
2005-11-02 20:12 ` Hugh Dickins
2005-11-02 20:12 ` Hugh Dickins
2005-11-02 20:45 ` Hugh Dickins
2005-11-02 20:45 ` Hugh Dickins
2005-11-02 21:36 ` Badari Pulavarty
2005-11-02 21:36 ` Badari Pulavarty
2005-11-02 21:55 ` Hugh Dickins
2005-11-02 21:55 ` Hugh Dickins
2005-11-02 22:02 ` Badari Pulavarty
2005-11-02 22:02 ` Badari Pulavarty
2005-11-12 0:25 ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Andrew Morton
2005-11-12 0:25 ` Andrew Morton
2005-11-12 0:34 ` Badari Pulavarty
2005-11-12 0:34 ` Badari Pulavarty
2005-11-12 1:43 ` Andrew Morton
2005-11-12 1:43 ` Andrew Morton
2005-11-12 4:41 ` Badari Pulavarty
2005-11-12 4:41 ` Badari Pulavarty
2006-01-16 13:06 ` differences between MADV_FREE and MADV_DONTNEED Andrea Arcangeli
2006-01-16 13:06 ` Andrea Arcangeli
2006-01-16 16:02 ` Suleiman Souhlal
2006-01-16 16:02 ` Suleiman Souhlal
2006-01-16 16:28 ` Andrea Arcangeli
2006-01-16 16:28 ` Andrea Arcangeli
2006-01-16 17:03 ` Suleiman Souhlal
2006-01-16 17:03 ` Suleiman Souhlal
2006-01-16 17:24 ` Andrea Arcangeli
2006-01-16 17:24 ` Andrea Arcangeli
2006-01-16 21:43 ` Eric W. Biederman
2006-01-16 21:43 ` Eric W. Biederman
2006-01-17 0:24 ` Suleiman Souhlal
2006-01-17 0:24 ` Suleiman Souhlal
2006-01-17 1:04 ` Nicholas Miell
2006-01-17 1:04 ` Nicholas Miell
2006-01-17 12:43 ` Christoph Hellwig
2006-01-17 12:43 ` Christoph Hellwig
2006-01-17 18:23 ` Eric W. Biederman
2006-01-17 18:23 ` Eric W. Biederman
2006-01-17 22:55 ` Nicholas Miell
2006-01-17 22:55 ` Nicholas Miell
2007-03-01 18:11 ` Samuel Thibault
2007-03-01 18:11 ` Samuel Thibault
2006-01-17 19:06 ` Badari Pulavarty
2006-01-17 19:06 ` Badari Pulavarty
2006-01-17 1:06 ` Blaisorblade
2006-01-17 1:06 ` Blaisorblade
2006-01-17 1:33 ` Andrea Arcangeli
2006-01-17 1:33 ` Andrea Arcangeli
2005-11-12 0:34 ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Andrew Morton
2005-11-12 0:34 ` Andrew Morton
2005-10-28 17:55 ` [RFC] madvise(MADV_TRUNCATE) Blaisorblade
2005-10-28 21:23 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1130894101.24503.64.camel@localhost.localdomain \
--to=pbadari@us.ibm.com \
--cc=akpm@osdl.org \
--cc=andrea@suse.de \
--cc=blaisorblade@yahoo.it \
--cc=dvhltc@us.ibm.com \
--cc=hugh@veritas.com \
--cc=jdike@addtoit.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.