* [Cluster-devel] [PATCH v4 00/12] Change readahead API
@ 2020-02-01 15:12 Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC
To: cluster-devel@redhat.com

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

I would particularly value feedback on this from the gfs2 and ocfs2
maintainers.  They have non-trivial changes, and a review on patch 5
would be greatly appreciated.

This series adds a readahead address_space operation to eventually
replace the readpages operation.  The key difference is that
pages are added to the page cache as they are allocated (and
then looked up by the filesystem) instead of passing them on a
list to the readpages operation and having the filesystem add
them to the page cache.  It's a net reduction in code for each
implementation, more efficient than walking a list, and solves
the direct-write vs buffered-read problem reported by yu kuai at
https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/

v4:
 - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))
 - Add comment to __do_page_cache_readahead() acknowledging we don't
   care _that_ much about setting PageReadahead.
 - Fix the return value check of add_to_page_cache_lru().
 - Add a missing call to put_page() in __do_page_cache_readahead() if
   we fail to insert the page.
 - Improve the documentation of ->readahead (including indentation
   problem identified by Randy).
 - Fix off by one error in read_pages() (Dave Chinner).
 - Fix nr_pages manipulation in btrfs (Dave Chinner).
 - Remove bogus refcount fix in erofs (Gao Xiang, Dave Chinner).
 - Update ext4 patch for Merkle tree readahead.
 - Update f2fs patch for Merkle tree readahead.
 - Reinstate next_page label in f2fs_readpages() now it's used by the
   compression code.
 - Reinstate call to fuse_wait_on_page_writeback (Miklos Szeredi).
 - Remove a double-unlock in the error path in fuse.
 - Remove an odd fly-speck in fuse_readpages().
 - Make nr_pages loop in fuse_readpages less convoluted (Dave Chinner).

Matthew Wilcox (Oracle) (12):
  mm: Fix the return type of __do_page_cache_readahead
  readahead: Ignore return value of ->readpages
  readahead: Put pages in cache earlier
  mm: Add readahead address space operation
  fs: Convert mpage_readpages to mpage_readahead
  btrfs: Convert from readpages to readahead
  erofs: Convert uncompressed files from readpages to readahead
  erofs: Convert compressed files from readpages to readahead
  ext4: Convert from readpages to readahead
  f2fs: Convert from readpages to readahead
  fuse: Convert from readpages to readahead
  iomap: Convert from readpages to readahead

 Documentation/filesystems/locking.rst |  7 ++-
 Documentation/filesystems/vfs.rst     | 14 +++++
 drivers/staging/exfat/exfat_super.c   |  9 +--
 fs/block_dev.c                        |  9 +--
 fs/btrfs/extent_io.c                  | 19 +++---
 fs/btrfs/extent_io.h                  |  2 +-
 fs/btrfs/inode.c                      | 18 +++---
 fs/erofs/data.c                       | 33 ++++------
 fs/erofs/zdata.c                      | 21 +++----
 fs/ext2/inode.c                       | 12 ++--
 fs/ext4/ext4.h                        |  5 +-
 fs/ext4/inode.c                       | 24 ++++----
 fs/ext4/readpage.c                    | 20 +++---
 fs/ext4/verity.c                      | 16 +++--
 fs/f2fs/data.c                        | 35 +++++------
 fs/f2fs/f2fs.h                        |  5 +-
 fs/f2fs/verity.c                      | 16 +++--
 fs/fat/inode.c                        |  8 +--
 fs/fuse/file.c                        | 37 +++++------
 fs/gfs2/aops.c                        | 20 +++---
 fs/hpfs/file.c                        |  8 +--
 fs/iomap/buffered-io.c                | 74 +++++-----------------
 fs/iomap/trace.h                      |  2 +-
 fs/isofs/inode.c                      |  9 +--
 fs/jfs/inode.c                        |  8 +--
 fs/mpage.c                            | 38 ++++--------
 fs/nilfs2/inode.c                     | 13 ++--
 fs/ocfs2/aops.c                       | 32 +++++-----
 fs/omfs/file.c                        |  8 +--
 fs/qnx6/inode.c                       |  8 +--
 fs/reiserfs/inode.c                   | 10 +--
 fs/udf/inode.c                        |  8 +--
 fs/xfs/xfs_aops.c                     | 10 +--
 include/linux/fs.h                    |  2 +
 include/linux/iomap.h                 |  2 +-
 include/linux/mpage.h                 |  2 +-
 include/linux/pagemap.h               | 12 ++++
 include/trace/events/erofs.h          |  6 +-
 include/trace/events/f2fs.h           |  6 +-
 mm/internal.h                         |  2 +-
 mm/migrate.c                          |  2 +-
 mm/readahead.c                        | 89 ++++++++++++++++++---------
 42 files changed, 332 insertions(+), 349 deletions(-)

--
2.24.1

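The cover letter's "key difference" can be illustrated with a small userspace model. This is only a sketch: `toy_page`, `toy_cache`, `toy_cache_add`, `toy_readpages` and `toy_readahead` are made-up names, the "page cache" is a fixed array rather than an XArray, and both functions simply return how many pages they marked up to date (the real ->readahead return value is defined in patch 04 and is different).

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 16

/* Toy stand-ins; none of these are kernel types. */
struct toy_page {
	unsigned long index;
	bool uptodate;
	struct toy_page *next;	/* only used by the list-based path */
};

struct toy_cache {
	struct toy_page *slot[CACHE_SLOTS];
};

/* Insert a page at its index; fails if the slot is taken or out of range. */
static bool toy_cache_add(struct toy_cache *c, struct toy_page *p)
{
	if (p->index >= CACHE_SLOTS || c->slot[p->index])
		return false;
	c->slot[p->index] = p;
	return true;
}

/* ->readpages style: the filesystem walks a list and inserts each page. */
static unsigned toy_readpages(struct toy_cache *c, struct toy_page *list)
{
	unsigned done = 0;

	for (struct toy_page *p = list; p; p = p->next) {
		if (!toy_cache_add(c, p))
			continue;	/* insertion failed; skip this page */
		p->uptodate = true;	/* pretend the I/O completed */
		done++;
	}
	return done;
}

/* ->readahead style: pages are already cached; look them up by index. */
static unsigned toy_readahead(struct toy_cache *c, unsigned long start,
		unsigned nr_pages)
{
	unsigned done = 0;

	for (unsigned i = 0; i < nr_pages; i++) {
		struct toy_page *p = NULL;

		if (start + i < CACHE_SLOTS)
			p = c->slot[start + i];
		if (!p)
			break;		/* nothing cached here in this model */
		p->uptodate = true;	/* pretend the I/O completed */
		done++;
	}
	return done;
}
```

In the second shape the filesystem has no insertion step and no insertion-failure path of its own, which is where the claimed per-implementation code reduction would come from.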
* [Cluster-devel] [PATCH v4 03/12] readahead: Put pages in cache earlier
  2020-02-01 15:12 [Cluster-devel] [PATCH v4 00/12] Change readahead API Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 04/12] mm: Add readahead address space operation Matthew Wilcox
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC
To: cluster-devel@redhat.com

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

At allocation time, put the pages in the cache unless we're using
->readpages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-xfs@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: ocfs2-devel@oss.oracle.com
---
 mm/readahead.c | 64 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index fc77d13af556..7daef0038b14 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -114,10 +114,10 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 EXPORT_SYMBOL(read_cache_pages);
 
 static void read_pages(struct address_space *mapping, struct file *filp,
-		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
+		struct list_head *pages, pgoff_t start,
+		unsigned int nr_pages)
 {
 	struct blk_plug plug;
-	unsigned page_idx;
 
 	blk_start_plug(&plug);
 
@@ -125,18 +125,17 @@ static void read_pages(struct address_space *mapping, struct file *filp,
 		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
-		goto out;
-	}
+	} else {
+		struct page *page;
+		unsigned long index;
 
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp))
+		xa_for_each_range(&mapping->i_pages, index, page, start,
+				start + nr_pages - 1) {
 			mapping->a_ops->readpage(filp, page);
-		put_page(page);
+			put_page(page);
+		}
 	}
 
-out:
 	blk_finish_plug(&plug);
 }
 
@@ -153,13 +152,14 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		unsigned long lookahead_size)
 {
 	struct inode *inode = mapping->host;
-	struct page *page;
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
+	pgoff_t page_offset;
 	unsigned long nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	bool use_list = mapping->a_ops->readpages;
 
 	if (isize == 0)
 		goto out;
@@ -170,21 +170,32 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * Preallocate as many pages as we will need.
 	 */
 	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
-		pgoff_t page_offset = offset + page_idx;
+		struct page *page;
 
+		page_offset = offset + page_idx;
 		if (page_offset > end_index)
 			break;
 
 		page = xa_load(&mapping->i_pages, page_offset);
 		if (page && !xa_is_value(page)) {
 			/*
-			 * Page already present?  Kick off the current batch of
-			 * contiguous pages before continuing with the next
-			 * batch.
+			 * Page already present?  Kick off the current batch
+			 * of contiguous pages before continuing with the
+			 * next batch.
 			 */
 			if (nr_pages)
-				read_pages(mapping, filp, &page_pool, nr_pages,
-						gfp_mask);
+				read_pages(mapping, filp, &page_pool,
+						page_offset - nr_pages,
+						nr_pages);
+			/*
+			 * It's possible this page is the page we should
+			 * be marking with PageReadahead.  However, we
+			 * don't have a stable ref to this page so it might
+			 * be reallocated to another user before we can set
+			 * the bit.  There's probably another page in the
+			 * cache marked with PageReadahead from the other
+			 * process which accessed this file.
+			 */
 			nr_pages = 0;
 			continue;
 		}
@@ -192,8 +203,20 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 		page = __page_cache_alloc(gfp_mask);
 		if (!page)
 			break;
-		page->index = page_offset;
-		list_add(&page->lru, &page_pool);
+		if (use_list) {
+			page->index = page_offset;
+			list_add(&page->lru, &page_pool);
+		} else if (add_to_page_cache_lru(page, mapping, page_offset,
+					gfp_mask) < 0) {
+			if (nr_pages)
+				read_pages(mapping, filp, &page_pool,
+						page_offset - nr_pages,
+						nr_pages);
+			put_page(page);
+			nr_pages = 0;
+			continue;
+		}
+
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		nr_pages++;
@@ -205,7 +228,8 @@ unsigned long __do_page_cache_readahead(struct address_space *mapping,
 	 * will then handle the error.
 	 */
 	if (nr_pages)
-		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
+		read_pages(mapping, filp, &page_pool, page_offset - nr_pages,
+				nr_pages);
 	BUG_ON(!list_empty(&page_pool));
 out:
 	return nr_pages;
--
2.24.1

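The batching that __do_page_cache_readahead() performs in this patch (submit the current run of contiguous new pages whenever an already-cached page is found, then start a new run) can be modelled in userspace roughly as follows. This is a sketch under assumptions: `struct batch`, `plan_batches` and the `present[]` array are hypothetical stand-ins for the xa_load() check and the read_pages() calls, and allocation or insertion failures are not modelled. Unlike the patch, the sketch records the start of each run explicitly instead of deriving it from `page_offset - nr_pages`, so it does not depend on the value `page_offset` holds after the loop exits.

```c
#include <stdbool.h>
#include <stddef.h>

/* One contiguous run of newly allocated pages. */
struct batch {
	unsigned long start;
	unsigned long nr_pages;
};

/*
 * present[i] says whether page index i is already cached (a stand-in
 * for the xa_load() check).  Returns the number of batches written to
 * out[], at most max_batches.
 */
static size_t plan_batches(const bool *present, unsigned long offset,
		unsigned long nr_to_read, unsigned long end_index,
		struct batch *out, size_t max_batches)
{
	size_t n = 0;
	unsigned long batch_start = offset;
	unsigned long nr_pages = 0;

	for (unsigned long i = 0; i < nr_to_read; i++) {
		unsigned long page_offset = offset + i;

		if (page_offset > end_index)
			break;
		if (present[page_offset]) {
			/* Flush the run that ended just before this page. */
			if (nr_pages && n < max_batches)
				out[n++] = (struct batch){ batch_start, nr_pages };
			nr_pages = 0;
			continue;
		}
		if (nr_pages == 0)
			batch_start = page_offset;
		nr_pages++;
	}
	/* Flush whatever is left once the loop ends. */
	if (nr_pages && n < max_batches)
		out[n++] = (struct batch){ batch_start, nr_pages };
	return n;
}
```

For a request covering pages 1 to 8 where pages 3 and 4 are already cached, this model yields two batches, one starting at page 1 with two pages and one starting at page 5 with four pages.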
* [Cluster-devel] [PATCH v4 04/12] mm: Add readahead address space operation
  2020-02-01 15:12 [Cluster-devel] [PATCH v4 00/12] Change readahead API Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 05/12] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
  2020-02-04 15:32 ` [Cluster-devel] [PATCH v4 00/12] Change readahead API David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC
To: cluster-devel@redhat.com

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This replaces ->readpages with a saner interface:
 - Return the number of pages not read instead of an ignored error code.
 - Pages are already in the page cache when ->readahead is called.
 - Implementation looks up the pages in the page cache instead of
   having them passed in a linked list.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-xfs@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: ocfs2-devel@oss.oracle.com
---
 Documentation/filesystems/locking.rst |  7 ++++++-
 Documentation/filesystems/vfs.rst     | 14 ++++++++++++++
 include/linux/fs.h                    |  2 ++
 include/linux/pagemap.h               | 12 ++++++++++++
 mm/readahead.c                        | 13 ++++++++++++-
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 5057e4d9dcd1..3d10729caf44 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -239,6 +239,8 @@ prototypes::
 	int (*readpage)(struct file *, struct page *);
 	int (*writepages)(struct address_space *, struct writeback_control *);
 	int (*set_page_dirty)(struct page *page);
+	unsigned (*readahead)(struct file *, struct address_space *,
+			pgoff_t start, unsigned nr_pages);
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
 	int (*write_begin)(struct file *, struct address_space *mapping,
@@ -271,7 +273,8 @@ writepage:		yes, unlocks (see below)
 readpage:		yes, unlocks
 writepages:
 set_page_dirty		no
-readpages:
+readahead:		yes, unlocks
+readpages:		no
 write_begin:		locks the page	 exclusive
 write_end:		yes, unlocks	 exclusive
 bmap:
@@ -295,6 +298,8 @@ the request handler (/dev/loop).
 
 ->readpage() unlocks the page, either synchronously or via I/O
 completion.
 
+->readahead() unlocks the pages like ->readpage().
+
 ->readpages() populates the pagecache with the passed pages and starts
 I/O against them.  They come unlocked upon I/O completion.
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7d4d09dd5e6d..c2bc345f2169 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -706,6 +706,8 @@ cache in your filesystem.  The following members are defined:
 		int (*readpage)(struct file *, struct page *);
 		int (*writepages)(struct address_space *, struct writeback_control *);
 		int (*set_page_dirty)(struct page *page);
+		unsigned (*readahead)(struct file *filp, struct address_space *mapping,
+				 pgoff_t start, unsigned nr_pages);
 		int (*readpages)(struct file *filp, struct address_space *mapping,
 				 struct list_head *pages, unsigned nr_pages);
 		int (*write_begin)(struct file *, struct address_space *mapping,
@@ -781,6 +783,18 @@ cache in your filesystem.  The following members are defined:
 	If defined, it should set the PageDirty flag, and the
 	PAGECACHE_TAG_DIRTY tag in the radix tree.
 
+``readahead``
+	Called by the VM to read pages associated with the address_space
+	object.  The pages are consecutive in the page cache and
+	are locked.  The implementation should decrement the page
+	refcount after attempting I/O on each page.  Usually the
+	page will be unlocked by the I/O completion handler.  If the
+	function does not attempt I/O on some pages, return the number
+	of pages which were not read so the caller can unlock the pages
+	for you.  Set PageUptodate if the I/O completes successfully.
+	Setting PageError on any page will be ignored; simply unlock
+	the page if an I/O error occurs.
+
 ``readpages``
 	called by the VM to read pages associated with the address_space
 	object.  This is essentially just a vector version of readpage.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 41584f50af0d..3bfc142e7d10 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -375,6 +375,8 @@ struct address_space_operations {
 	 */
 	int (*readpages)(struct file *filp, struct address_space *mapping,
 			struct list_head *pages, unsigned nr_pages);
+	unsigned (*readahead)(struct file *, struct address_space *,
+			pgoff_t start, unsigned nr_pages);
 
 	int (*write_begin)(struct file *, struct address_space *mapping,
 				loff_t pos, unsigned len, unsigned flags,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6a16b5..a2cf007826f2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -630,6 +630,18 @@ static inline int add_to_page_cache(struct page *page,
 	return error;
 }
 
+/*
+ * Only call this from a ->readahead implementation.
+ */
+static inline
+struct page *readahead_page(struct address_space *mapping, pgoff_t index)
+{
+	struct page *page = xa_load(&mapping->i_pages, index);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+
+	return page;
+}
+
 static inline unsigned long dir_pages(struct inode *inode)
 {
 	return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>
diff --git a/mm/readahead.c b/mm/readahead.c
index 7daef0038b14..b2ed0baf3a5d 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -121,7 +121,18 @@ static void read_pages(struct address_space *mapping, struct file *filp,
 
 	blk_start_plug(&plug);
 
-	if (mapping->a_ops->readpages) {
+	if (mapping->a_ops->readahead) {
+		unsigned left = mapping->a_ops->readahead(filp, mapping,
+				start, nr_pages);
+
+		while (left) {
+			struct page *page = readahead_page(mapping,
+					start + nr_pages - left);
+			unlock_page(page);
+			put_page(page);
+			left--;
+		}
+	} else if (mapping->a_ops->readpages) {
 		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
--
2.24.1

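The return-value contract described in this patch (->readahead returns the number of pages it did not read, and read_pages() then unlocks and releases exactly those trailing pages) can be sketched as a userspace model. Everything here is an assumption made for illustration: `toy_page`, `toy_readahead`, `toy_read_pages` and the `budget` parameter are invented names, and "I/O completion" is collapsed into the hook itself.

```c
#include <stdbool.h>

/* Toy page state; field names are illustrative, not the kernel's. */
struct toy_page {
	bool locked;
	bool uptodate;
	int refcount;
};

/*
 * Hypothetical filesystem hook: attempts I/O on at most `budget` pages
 * starting at `start`, and returns how many pages it did not attempt.
 * For attempted pages it drops the reference and (standing in for the
 * I/O completion handler) marks them uptodate and unlocks them.
 */
static unsigned toy_readahead(struct toy_page *pages, unsigned long start,
		unsigned nr_pages, unsigned budget)
{
	unsigned i;

	for (i = 0; i < nr_pages && i < budget; i++) {
		struct toy_page *page = &pages[start + i];

		page->uptodate = true;
		page->locked = false;
		page->refcount--;
	}
	return nr_pages - i;
}

/* Caller side, following the loop the patch adds to read_pages(). */
static void toy_read_pages(struct toy_page *pages, unsigned long start,
		unsigned nr_pages, unsigned budget)
{
	unsigned left = toy_readahead(pages, start, nr_pages, budget);

	while (left) {
		struct toy_page *page = &pages[start + nr_pages - left];

		page->locked = false;	/* unlock_page() in the patch */
		page->refcount--;	/* put_page() in the patch */
		left--;
	}
}
```

Because the caller indexes from `start + nr_pages - left`, this contract only works if the unread pages form the tail of the range, which is how the model (and the loop in the patch) treats them.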
* [Cluster-devel] [PATCH v4 05/12] fs: Convert mpage_readpages to mpage_readahead
  2020-02-01 15:12 [Cluster-devel] [PATCH v4 00/12] Change readahead API Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
  2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 04/12] mm: Add readahead address space operation Matthew Wilcox
@ 2020-02-01 15:12 ` Matthew Wilcox
  2020-02-04 15:32 ` [Cluster-devel] [PATCH v4 00/12] Change readahead API David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:12 UTC
To: cluster-devel@redhat.com

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Implement the new readahead aop and convert all callers (block_dev,
exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
reiserfs & udf).  The callers are all trivial except for GFS2 & OCFS2.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: cluster-devel@redhat.com
Cc: ocfs2-devel@oss.oracle.com
---
 drivers/staging/exfat/exfat_super.c |  9 ++++---
 fs/block_dev.c                      |  9 ++++---
 fs/ext2/inode.c                     | 12 ++++-----
 fs/fat/inode.c                      |  8 +++---
 fs/gfs2/aops.c                      | 20 +++++++--------
 fs/hpfs/file.c                      |  8 +++---
 fs/iomap/buffered-io.c              |  2 +-
 fs/isofs/inode.c                    |  9 ++++---
 fs/jfs/inode.c                      |  8 +++---
 fs/mpage.c                          | 38 ++++++++++-------------------
 fs/nilfs2/inode.c                   | 13 +++++-----
 fs/ocfs2/aops.c                     | 32 +++++++++++-------------
 fs/omfs/file.c                      |  8 +++---
 fs/qnx6/inode.c                     |  8 +++---
 fs/reiserfs/inode.c                 | 10 ++++----
 fs/udf/inode.c                      |  8 +++---
 include/linux/mpage.h               |  2 +-
 mm/migrate.c                        |  2 +-
 18 files changed, 96 insertions(+), 110 deletions(-)

diff --git a/drivers/staging/exfat/exfat_super.c b/drivers/staging/exfat/exfat_super.c
index b81d2a87b82e..4dbfd8c84a1b 100644
--- a/drivers/staging/exfat/exfat_super.c
+++ b/drivers/staging/exfat/exfat_super.c
@@ -3002,10 +3002,11 @@ static int exfat_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, exfat_get_block);
 }
 
-static int exfat_readpages(struct file *file, struct address_space *mapping,
-			   struct list_head *pages, unsigned int nr_pages)
+static
+unsigned exfat_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned int nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, exfat_get_block);
+	return mpage_readahead(mapping, start, nr_pages, exfat_get_block);
 }
 
 static int exfat_writepage(struct page *page, struct writeback_control *wbc)
@@ -3104,7 +3105,7 @@ static sector_t _exfat_bmap(struct address_space *mapping, sector_t block)
 
 static const struct address_space_operations exfat_aops = {
 	.readpage = exfat_readpage,
-	.readpages = exfat_readpages,
+	.readahead = exfat_readahead,
 	.writepage = exfat_writepage,
 	.writepages = exfat_writepages,
 	.write_begin = exfat_write_begin,
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 69bf2fb6f7cd..826a5104ff56 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -614,10 +614,11 @@ static int blkdev_readpage(struct file * file, struct page * page)
 	return block_read_full_page(page, blkdev_get_block);
 }
 
-static int blkdev_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static
+unsigned blkdev_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block);
+	return mpage_readahead(mapping, start, nr_pages, blkdev_get_block);
 }
 
 static int blkdev_write_begin(struct file *file, struct address_space *mapping,
@@ -2062,7 +2063,7 @@ static int blkdev_writepages(struct address_space *mapping,
 
 static const struct address_space_operations def_blk_aops = {
 	.readpage = blkdev_readpage,
-	.readpages = blkdev_readpages,
+	.readahead = blkdev_readahead,
 	.writepage = blkdev_writepage,
 	.write_begin = blkdev_write_begin,
 	.write_end = blkdev_write_end,
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 119667e65890..0440eb9f24de 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -877,11 +877,11 @@ static int ext2_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, ext2_get_block);
 }
 
-static int
-ext2_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static unsigned
+ext2_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, ext2_get_block);
+	return mpage_readahead(mapping, start, nr_pages, ext2_get_block);
 }
 
 static int
@@ -966,7 +966,7 @@ ext2_dax_writepages(struct address_space *mapping, struct writeback_control *wbc
 
 const struct address_space_operations ext2_aops = {
 	.readpage = ext2_readpage,
-	.readpages = ext2_readpages,
+	.readahead = ext2_readahead,
 	.writepage = ext2_writepage,
 	.write_begin = ext2_write_begin,
 	.write_end = ext2_write_end,
@@ -980,7 +980,7 @@ const struct address_space_operations ext2_aops = {
 
 const struct address_space_operations ext2_nobh_aops = {
 	.readpage = ext2_readpage,
-	.readpages = ext2_readpages,
+	.readahead = ext2_readahead,
 	.writepage = ext2_nobh_writepage,
 	.write_begin = ext2_nobh_write_begin,
 	.write_end = nobh_write_end,
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 594b05ae16c9..a671dc6a122a 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -210,10 +210,10 @@ static int fat_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, fat_get_block);
 }
 
-static int fat_readpages(struct file *file, struct address_space *mapping,
-			 struct list_head *pages, unsigned nr_pages)
+static unsigned fat_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, fat_get_block);
+	return mpage_readahead(mapping, start, nr_pages, fat_get_block);
 }
 
 static void fat_write_failed(struct address_space *mapping, loff_t to)
@@ -344,7 +344,7 @@ int fat_block_truncate_page(struct inode *inode, loff_t from)
 
 static const struct address_space_operations fat_aops = {
 	.readpage = fat_readpage,
-	.readpages = fat_readpages,
+	.readahead = fat_readahead,
 	.writepage = fat_writepage,
 	.writepages = fat_writepages,
 	.write_begin = fat_write_begin,
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index ba83b49ce18c..5c6d89603f91 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -577,7 +577,7 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
 }
 
 /**
- * gfs2_readpages - Read a bunch of pages at once
+ * gfs2_readahead - Read a bunch of pages at once
  * @file: The file to read from
  * @mapping: Address space info
  * @pages: List of pages to read
@@ -590,16 +590,15 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos,
  *    obviously not something we'd want to do on too regular a basis.
  *    Any I/O we ignore at this time will be done via readpage later.
  * 2. We don't handle stuffed files here we let readpage do the honours.
- * 3. mpage_readpages() does most of the heavy lifting in the common case.
+ * 3. mpage_readahead() does most of the heavy lifting in the common case.
  * 4. gfs2_block_map() is relied upon to set BH_Boundary in the right places.
  */
 
-static int gfs2_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *pages, unsigned nr_pages)
+static unsigned gfs2_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
 	struct inode *inode = mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder gh;
 	int ret;
 
@@ -608,13 +607,12 @@ static int gfs2_readpages(struct file *file, struct address_space *mapping,
 	if (unlikely(ret))
 		goto out_uninit;
 	if (!gfs2_is_stuffed(ip))
-		ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map);
+		nr_pages = mpage_readahead(mapping, start, nr_pages,
+				gfs2_block_map);
 	gfs2_glock_dq(&gh);
 out_uninit:
 	gfs2_holder_uninit(&gh);
-	if (unlikely(gfs2_withdrawn(sdp)))
-		ret = -EIO;
-	return ret;
+	return nr_pages;
 }
 
 /**
@@ -828,7 +826,7 @@ static const struct address_space_operations gfs2_aops = {
 	.writepage = gfs2_writepage,
 	.writepages = gfs2_writepages,
 	.readpage = gfs2_readpage,
-	.readpages = gfs2_readpages,
+	.readahead = gfs2_readahead,
 	.bmap = gfs2_bmap,
 	.invalidatepage = gfs2_invalidatepage,
 	.releasepage = gfs2_releasepage,
@@ -842,7 +840,7 @@ static const struct address_space_operations gfs2_jdata_aops = {
 	.writepage = gfs2_jdata_writepage,
 	.writepages = gfs2_jdata_writepages,
 	.readpage = gfs2_readpage,
-	.readpages = gfs2_readpages,
+	.readahead = gfs2_readahead,
 	.set_page_dirty = jdata_set_page_dirty,
 	.bmap = gfs2_bmap,
 	.invalidatepage = gfs2_invalidatepage,
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index b36abf9cb345..a0f7cc0262ae 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -125,10 +125,10 @@ static int hpfs_writepage(struct page *page, struct writeback_control *wbc)
 	return block_write_full_page(page, hpfs_get_block, wbc);
 }
 
-static int hpfs_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *pages, unsigned nr_pages)
+static unsigned hpfs_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, hpfs_get_block);
+	return mpage_readahead(mapping, start, nr_pages, hpfs_get_block);
 }
 
 static int hpfs_writepages(struct address_space *mapping,
@@ -198,7 +198,7 @@ static int hpfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 const struct address_space_operations hpfs_aops = {
 	.readpage = hpfs_readpage,
 	.writepage = hpfs_writepage,
-	.readpages = hpfs_readpages,
+	.readahead = hpfs_readahead,
 	.writepages = hpfs_writepages,
 	.write_begin = hpfs_write_begin,
 	.write_end = hpfs_write_end,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7c84c4c027c4..cb3511eb152a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -359,7 +359,7 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops)
 }
 
 /*
- * Just like mpage_readpages and block_read_full_page we always
+ * Just like mpage_readahead and block_read_full_page we always
  * return 0 and just mark the page as PageError on errors.  This
  * should be cleaned up all through the stack eventually.
  */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 62c0462dc89f..11154cc35b16 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1185,10 +1185,11 @@ static int isofs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, isofs_get_block);
 }
 
-static int isofs_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static
+unsigned isofs_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, isofs_get_block);
+	return mpage_readahead(mapping, start, nr_pages, isofs_get_block);
 }
 
 static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
@@ -1198,7 +1199,7 @@ static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
 
 static const struct address_space_operations isofs_aops = {
 	.readpage = isofs_readpage,
-	.readpages = isofs_readpages,
+	.readahead = isofs_readahead,
 	.bmap = _isofs_bmap
 };
 
diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
index 9486afcdac76..1ed926ac2bb9 100644
--- a/fs/jfs/inode.c
+++ b/fs/jfs/inode.c
@@ -296,10 +296,10 @@ static int jfs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, jfs_get_block);
 }
 
-static int jfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static unsigned jfs_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, jfs_get_block);
+	return mpage_readahead(mapping, start, nr_pages, jfs_get_block);
 }
 
 static void jfs_write_failed(struct address_space *mapping, loff_t to)
@@ -358,7 +358,7 @@ static ssize_t jfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations jfs_aops = {
 	.readpage = jfs_readpage,
-	.readpages = jfs_readpages,
+	.readahead = jfs_readahead,
 	.writepage = jfs_writepage,
 	.writepages = jfs_writepages,
 	.write_begin = jfs_write_begin,
diff --git a/fs/mpage.c b/fs/mpage.c
index ccba3c4c4479..91a148bcd582 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -91,7 +91,7 @@ mpage_alloc(struct block_device *bdev,
 }
 
 /*
- * support function for mpage_readpages.  The fs supplied get_block might
+ * support function for mpage_readahead.  The fs supplied get_block might
  * return an up to date buffer.  This is used to map that buffer into
  * the page, which allows readpage to avoid triggering a duplicate call
  * to get_block.
@@ -338,13 +338,10 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 }
 
 /**
- * mpage_readpages - populate an address space with some pages & start reads against them
+ * mpage_readahead - start reads against pages
  * @mapping: the address_space
- * @pages: The address of a list_head which contains the target pages.  These
- *   pages have their ->index populated and are otherwise uninitialised.
- *   The page at @pages->prev has the lowest file offset, and reads should be
- *   issued in @pages->prev to @pages->next order.
- * @nr_pages: The number of pages at *@pages
+ * @start: The number of the first page to read.
+ * @nr_pages: The number of consecutive pages to read.
  * @get_block: The filesystem's block mapper function.
  *
  * This function walks the pages and the blocks within each page, building and
@@ -381,36 +378,27 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
  *
  * This all causes the disk requests to be issued in the correct order.
  */
-int
-mpage_readpages(struct address_space *mapping, struct list_head *pages,
-				unsigned nr_pages, get_block_t get_block)
+unsigned mpage_readahead(struct address_space *mapping, pgoff_t start,
+		unsigned nr_pages, get_block_t get_block)
 {
 	struct mpage_readpage_args args = {
 		.get_block = get_block,
 		.is_readahead = true,
 	};
-	unsigned page_idx;
-
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
 
+	while (nr_pages--) {
+		struct page *page = readahead_page(mapping, start++);
 		prefetchw(&page->flags);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping,
-					page->index,
-					readahead_gfp_mask(mapping))) {
-			args.page = page;
-			args.nr_pages = nr_pages - page_idx;
-			args.bio = do_mpage_readpage(&args);
-		}
+		args.page = page;
+		args.nr_pages = nr_pages;
+		args.bio = do_mpage_readpage(&args);
 		put_page(page);
 	}
-	BUG_ON(!list_empty(pages));
 	if (args.bio)
 		mpage_bio_submit(REQ_OP_READ, REQ_RAHEAD, args.bio);
 	return 0;
 }
-EXPORT_SYMBOL(mpage_readpages);
+EXPORT_SYMBOL(mpage_readahead);
 
 /*
  * This isn't called much at all
@@ -563,7 +551,7 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
 		 * Page has buffers, but they are all unmapped. The page was
 		 * created by pagein or read over a hole which was handled by
 		 * block_read_full_page().  If this address_space is also
-		 * using mpage_readpages then this can rarely happen.
+		 * using mpage_readahead then this can rarely happen.
 		 */
 		goto confused;
 	}
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 671085512e0f..ecf543f35256 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -146,17 +146,18 @@ static int nilfs_readpage(struct file *file, struct page *page)
 }
 
 /**
- * nilfs_readpages() - implement readpages() method of nilfs_aops {}
+ * nilfs_readahead() - implement readahead() method of nilfs_aops {}
  * address_space_operations.
  * @file - file struct of the file to be read
  * @mapping - address_space struct used for reading multiple pages
- * @pages - the pages to be read
+ * @start - the first page to read
  * @nr_pages - number of pages to be read
  */
-static int nilfs_readpages(struct file *file, struct address_space *mapping,
-			   struct list_head *pages, unsigned int nr_pages)
+static
+unsigned nilfs_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned int nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, nilfs_get_block);
+	return mpage_readahead(mapping, start, nr_pages, nilfs_get_block);
 }
 
 static int nilfs_writepages(struct address_space *mapping,
@@ -308,7 +309,7 @@ const struct address_space_operations nilfs_aops = {
 	.readpage = nilfs_readpage,
 	.writepages = nilfs_writepages,
 	.set_page_dirty = nilfs_set_page_dirty,
-	.readpages = nilfs_readpages,
+	.readahead = nilfs_readahead,
 	.write_begin = nilfs_write_begin,
 	.write_end = nilfs_write_end,
 	/* .releasepage = nilfs_releasepage, */
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 3a67a6518ddf..a9784a6442b7 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -350,14 +350,13 @@ static int ocfs2_readpage(struct file *file, struct page *page)
  * grow out to a tree. If need be, detecting boundary extents could
  * trivially be added in a future version of ocfs2_get_block().
  */
-static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
-			   struct list_head *pages, unsigned nr_pages)
+static
+unsigned ocfs2_readahead(struct file *filp, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	int ret, err = -EIO;
+	int ret;
 	struct inode *inode = mapping->host;
 	struct ocfs2_inode_info *oi = OCFS2_I(inode);
-	loff_t start;
-	struct page *last;
 
 	/*
 	 * Use the nonblocking flag for the dlm code to avoid page
@@ -365,36 +364,33 @@ static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
 	 */
 	ret = ocfs2_inode_lock_full(inode, NULL, 0, OCFS2_LOCK_NONBLOCK);
 	if (ret)
-		return err;
+		return nr_pages;
 
-	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
-		ocfs2_inode_unlock(inode, 0);
-		return err;
-	}
+	if (down_read_trylock(&oi->ip_alloc_sem) == 0)
+		goto out_unlock;
 
 	/*
 	 * Don't bother with inline-data. There isn't anything
 	 * to read-ahead in that case anyway...
 	 */
 	if (oi->ip_dyn_features & OCFS2_INLINE_DATA_FL)
-		goto out_unlock;
+		goto out_up;
 
 	/*
 	 * Check whether a remote node truncated this file - we just
 	 * drop out in that case as it's not worth handling here.
 	 */
-	last = lru_to_page(pages);
-	start = (loff_t)last->index << PAGE_SHIFT;
 	if (start >= i_size_read(inode))
-		goto out_unlock;
+		goto out_up;
 
-	err = mpage_readpages(mapping, pages, nr_pages, ocfs2_get_block);
+	nr_pages = mpage_readahead(mapping, start, nr_pages, ocfs2_get_block);
 
-out_unlock:
+out_up:
 	up_read(&oi->ip_alloc_sem);
+out_unlock:
 	ocfs2_inode_unlock(inode, 0);
 
-	return err;
+	return nr_pages;
 }
 
 /* Note: Because we don't support holes, our allocation has
@@ -2474,7 +2470,7 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations ocfs2_aops = {
 	.readpage = ocfs2_readpage,
-	.readpages = ocfs2_readpages,
+	.readahead = ocfs2_readahead,
 	.writepage = ocfs2_writepage,
 	.write_begin = ocfs2_write_begin,
 	.write_end = ocfs2_write_end,
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index d640b9388238..e7392f49f619 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -289,10 +289,10 @@ static int omfs_readpage(struct file *file, struct page *page)
 	return block_read_full_page(page, omfs_get_block);
 }
 
-static int omfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static unsigned omfs_readahead(struct file *file, struct address_space *mapping,
+		pgoff_t start, unsigned nr_pages)
 {
-	return mpage_readpages(mapping, pages, nr_pages, omfs_get_block);
+	return mpage_readahead(mapping, start, nr_pages, omfs_get_block);
 }
 
 static int omfs_writepage(struct page *page, struct writeback_control *wbc)
@@ -373,7 +373,7 @@ const struct inode_operations omfs_file_inops = {
 
 const struct address_space_operations omfs_aops = {
 	.readpage = omfs_readpage,
-	.readpages = omfs_readpages,
+	.readahead = omfs_readahead,
 	.writepage = omfs_writepage,
 	.writepages = omfs_writepages,
 	.write_begin = omfs_write_begin,
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 345db56c98fd..949e823a1d30 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -99,10 +99,10 @@ static int
qnx6_readpage(struct file *file, struct page *page) return mpage_readpage(page, qnx6_get_block); } -static int qnx6_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static unsigned qnx6_readahead(struct file *file, struct address_space *mapping, + pgoff_t start, unsigned nr_pages) { - return mpage_readpages(mapping, pages, nr_pages, qnx6_get_block); + return mpage_readahead(mapping, start, nr_pages, qnx6_get_block); } /* @@ -499,7 +499,7 @@ static sector_t qnx6_bmap(struct address_space *mapping, sector_t block) } static const struct address_space_operations qnx6_aops = { .readpage = qnx6_readpage, - .readpages = qnx6_readpages, + .readahead = qnx6_readahead, .bmap = qnx6_bmap }; diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index 6419e6dacc39..0f2666ef23dd 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -1160,11 +1160,11 @@ int reiserfs_get_block(struct inode *inode, sector_t block, return retval; } -static int -reiserfs_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static unsigned +reiserfs_readahead(struct file *file, struct address_space *mapping, + pgoff_t start, unsigned nr_pages) { - return mpage_readpages(mapping, pages, nr_pages, reiserfs_get_block); + return mpage_readahead(mapping, start, nr_pages, reiserfs_get_block); } /* @@ -3434,7 +3434,7 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr) const struct address_space_operations reiserfs_address_space_operations = { .writepage = reiserfs_writepage, .readpage = reiserfs_readpage, - .readpages = reiserfs_readpages, + .readahead = reiserfs_readahead, .releasepage = reiserfs_releasepage, .invalidatepage = reiserfs_invalidatepage, .write_begin = reiserfs_write_begin, diff --git a/fs/udf/inode.c b/fs/udf/inode.c index e875bc5668ee..97c9bccf1be4 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -195,10 +195,10 @@ static int udf_readpage(struct file 
*file, struct page *page) return mpage_readpage(page, udf_get_block); } -static int udf_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static unsigned udf_readahead(struct file *file, struct address_space *mapping, + pgoff_t start, unsigned nr_pages) { - return mpage_readpages(mapping, pages, nr_pages, udf_get_block); + return mpage_readahead(mapping, start, nr_pages, udf_get_block); } static int udf_write_begin(struct file *file, struct address_space *mapping, @@ -234,7 +234,7 @@ static sector_t udf_bmap(struct address_space *mapping, sector_t block) const struct address_space_operations udf_aops = { .readpage = udf_readpage, - .readpages = udf_readpages, + .readahead = udf_readahead, .writepage = udf_writepage, .writepages = udf_writepages, .write_begin = udf_write_begin, diff --git a/include/linux/mpage.h b/include/linux/mpage.h index 001f1fcf9836..dabf7b5a6a28 100644 --- a/include/linux/mpage.h +++ b/include/linux/mpage.h @@ -14,7 +14,7 @@ struct writeback_control; -int mpage_readpages(struct address_space *mapping, struct list_head *pages, +unsigned mpage_readahead(struct address_space *mapping, pgoff_t start, unsigned nr_pages, get_block_t get_block); int mpage_readpage(struct page *page, get_block_t get_block); int mpage_writepages(struct address_space *mapping, diff --git a/mm/migrate.c b/mm/migrate.c index edf42ed90030..860925dd2725 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1020,7 +1020,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage, * to the LRU. Later, when the IO completes the pages are * marked uptodate and unlocked. However, the queueing * could be merging multiple pages for one bio (e.g. - * mpage_readpages). If an allocation happens for the + * mpage_readahead). If an allocation happens for the * second or third page, the process can end up locking * the same page twice and deadlocking. 
Rather than * trying to be clever about what pages can be locked, -- 2.24.1
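For readers comparing the two loop shapes in the mpage_readahead() hunk above, here is a small userspace sketch. It is not kernel code: hint_log, record(), list_style_loop() and index_style_loop() are hypothetical names invented for illustration. It only records the value each loop shape would store in args.nr_pages on every iteration, and says nothing about how do_mpage_readpage() uses that value.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 16	/* toy limit for the sketch, not a kernel constant */

/* Hypothetical recorder: hint[i] is the "remaining pages" value seen for page i. */
struct hint_log {
	unsigned hint[MAX_PAGES];
	size_t count;
};

static void record(struct hint_log *log, unsigned remaining)
{
	assert(log->count < MAX_PAGES);
	log->hint[log->count++] = remaining;
}

/* Shape of the removed loop: counted index, hint = nr_pages - page_idx. */
static void list_style_loop(struct hint_log *log, unsigned nr_pages)
{
	unsigned page_idx;

	for (page_idx = 0; page_idx < nr_pages; page_idx++)
		record(log, nr_pages - page_idx);
}

/* Shape of the added loop: post-decrement, hint = nr_pages after the decrement. */
static void index_style_loop(struct hint_log *log, unsigned nr_pages)
{
	while (nr_pages--)
		record(log, nr_pages);
}
```

Called with nr_pages = 4, list_style_loop() records 4, 3, 2, 1 while index_style_loop() records 3, 2, 1, 0, so the second shape counts the pages after the current one rather than including it.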
* [Cluster-devel] [PATCH v4 00/12] Change readahead API 2020-02-01 15:12 [Cluster-devel] [PATCH v4 00/12] Change readahead API Matthew Wilcox ` (2 preceding siblings ...) 2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 05/12] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox @ 2020-02-04 15:32 ` David Sterba 2020-02-04 17:16 ` Matthew Wilcox 3 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2020-02-04 15:32 UTC
To: cluster-devel.redhat.com

On Sat, Feb 01, 2020 at 07:12:28AM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> I would particularly value feedback on this from the gfs2 and ocfs2
> maintainers. They have non-trivial changes, and a review on patch 5
> would be greatly appreciated.
>
> This series adds a readahead address_space operation to eventually
> replace the readpages operation. The key difference is that
> pages are added to the page cache as they are allocated (and
> then looked up by the filesystem) instead of passing them on a
> list to the readpages operation and having the filesystem add
> them to the page cache. It's a net reduction in code for each
> implementation, more efficient than walking a list, and solves
> the direct-write vs buffered-read problem reported by yu kuai at
> https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3 at huawei.com/
>
> v4:
> - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))

I've tried to test the patchset but haven't got very far: it crashes at boot
right after VFS mounts the root. The patches are from the mailing list, applied
on current master, but I saw the same crash with the git branch in your
repo (probably v1).

(gdb) l *(ext4_mpage_readpages+0x1da/0xc20) 0xffffffff813753f0 is in ext4_mpage_readpages (fs/ext4/readpage.c:226).
221 return i_size_read(inode); 222 } 223 224 int ext4_mpage_readpages(struct address_space *mapping, pgoff_t start, 225 struct page *page, unsigned nr_pages, bool is_readahead) 226 { 227 struct bio *bio = NULL; 228 sector_t last_block_in_bio = 0; 229 230 struct inode *inode = mapping->host; [ 8.008531] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 8.011482] #PF: supervisor read access in kernel mode [ 8.014121] #PF: error_code(0x0000) - not-present page [ 8.016767] PGD 0 P4D 0 [ 8.018352] Oops: 0000 [#1] SMP [ 8.019716] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.5.0-default+ #955 [ 8.021746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014 [ 8.025244] RIP: 0010:ext4_mpage_readpages+0x1da/0xc20 [ 8.026817] Code: 7c 24 4e 00 0f 85 23 04 00 00 44 29 74 24 3c 83 6c 24 48 01 0f 84 4d 04 00 00 80 7c 24 4e 00 0f 85 fc 05 00 00 48 8b 4c 24 18 <48> 8b 01 f6 c4 20 75 89 4c 8b 69 20 b9 0c 00 00 00 2b 4c 24 38 83 [ 8.031957] RSP: 0000:ffffb34f40013988 EFLAGS: 00010292 [ 8.033691] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 8.035533] RDX: 0000000000000001 RSI: ffffffff960934c0 RDI: ffffffff9681a080 [ 8.036900] RBP: 0000000000000001 R08: ffffb34f40013a68 R09: 0000000000000000 [ 8.038461] R10: 0000000000000038 R11: 0000000000000000 R12: 0000000000000004 [ 8.040698] R13: ffff9668ba4e18e0 R14: 0000000000000001 R15: 0000000000000000 [ 8.042805] FS: 0000000000000000(0000) GS:ffff9668bda00000(0000) knlGS:0000000000000000 [ 8.045396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.047233] CR2: 0000000000000000 CR3: 000000002e011001 CR4: 0000000000160ee0 [ 8.049337] Call Trace: [ 8.050435] ? __lock_acquire+0xee0/0x1320 [ 8.051833] ? release_pages+0x310/0x380 [ 8.053265] ? mark_held_locks+0x50/0x80 [ 8.054468] ext4_readahead+0x3b/0x50 [ 8.055877] read_pages+0x65/0x1a0 [ 8.057167] ? 
put_pages_list+0x90/0x90 [ 8.058689] __do_page_cache_readahead+0x24b/0x2a0 [ 8.060394] generic_file_buffered_read+0x7cf/0x9f0 [ 8.062137] ? sched_clock+0x5/0x10 [ 8.063451] ? up_read+0x18/0x240 [ 8.064774] ? ext4_xattr_get+0x97/0x2c0 [ 8.066178] new_sync_read+0x111/0x1a0 [ 8.067423] vfs_read+0xc5/0x180 [ 8.068572] kernel_read+0x2c/0x40 [ 8.069788] prepare_binprm+0x171/0x1b0 [ 8.071311] load_script+0x1c1/0x250 [ 8.072643] search_binary_handler+0x5f/0x210 [ 8.074135] exec_binprm+0xd7/0x290 [ 8.075463] __do_execve_file.isra.0+0x570/0x800 [ 8.077400] ? rest_init+0x2f1/0x2f5 [ 8.078979] do_execve+0x21/0x30 [ 8.080420] kernel_init+0xa4/0x11b [ 8.081856] ? rest_init+0x2f5/0x2f5 [ 8.083173] ret_from_fork+0x24/0x30 [ 8.084695] Modules linked in: [ 8.086055] CR2: 0000000000000000 [ 8.087572] ---[ end trace 0890c371a706b34a ]--- [ 8.089417] RIP: 0010:ext4_mpage_readpages+0x1da/0xc20 [ 8.116836] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:38 [ 8.119626] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0 [ 8.122392] INFO: lockdep is turned off. [ 8.123694] irq event stamp: 18341344 [ 8.124735] hardirqs last enabled at (18341343): [<ffffffff95230c42>] free_unref_page_list+0x232/0x270 [ 8.127918] hardirqs last disabled at (18341344): [<ffffffff95002b4b>] trace_hardirqs_off_thunk+0x1a/0x1c [ 8.131145] softirqs last enabled at (18341250): [<ffffffff95a00358>] __do_softirq+0x358/0x52b [ 8.143060] softirqs last disabled at (18341243): [<ffffffff9508ae3d>] irq_exit+0x9d/0xb0 [ 8.145603] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G D 5.5.0-default+ #955 [ 8.148474] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014 [ 8.152440] Call Trace: [ 8.153747] dump_stack+0x71/0xa0 [ 8.155238] ___might_sleep.cold+0xa6/0xf9 [ 8.156903] exit_signals+0x31/0x310 [ 8.158431] ? __do_execve_file.isra.0+0x570/0x800 [ 8.160179] do_exit+0xa8/0xd60 [ 8.161632] ? 
rest_init+0x2f1/0x2f5 [ 8.163204] rewind_stack_do_exit+0x17/0x20 [ 8.164931] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 8.167575] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
* [Cluster-devel] [PATCH v4 00/12] Change readahead API 2020-02-04 15:32 ` [Cluster-devel] [PATCH v4 00/12] Change readahead API David Sterba @ 2020-02-04 17:16 ` Matthew Wilcox 0 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2020-02-04 17:16 UTC
To: cluster-devel.redhat.com

On Tue, Feb 04, 2020 at 04:32:27PM +0100, David Sterba wrote:
> On Sat, Feb 01, 2020 at 07:12:28AM -0800, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> >
> > I would particularly value feedback on this from the gfs2 and ocfs2
> > maintainers. They have non-trivial changes, and a review on patch 5
> > would be greatly appreciated.
> >
> > This series adds a readahead address_space operation to eventually
> > replace the readpages operation. The key difference is that
> > pages are added to the page cache as they are allocated (and
> > then looked up by the filesystem) instead of passing them on a
> > list to the readpages operation and having the filesystem add
> > them to the page cache. It's a net reduction in code for each
> > implementation, more efficient than walking a list, and solves
> > the direct-write vs buffered-read problem reported by yu kuai at
> > https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3 at huawei.com/
> >
> > v4:
> > - Rebase on current Linus (a62aa6f7f50a ("Merge tag 'gfs2-for-5.6'"))
>
> I've tried to test the patchset but haven't got very far: it crashes at boot
> right after VFS mounts the root. The patches are from the mailing list, applied
> on current master, but I saw the same crash with the git branch in your
> repo (probably v1).

Yeah, I wasn't able to test at the time due to what turned out to be
the hpet bug in Linus' tree. Now that's fixed, I've found & fixed a
couple more bugs. There'll be a v5 once I fix the remaining problem
(looks like a missing page unlock somewhere).
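The API change described in the quoted cover letter (pages are put in the page cache first, then looked up by the filesystem) can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: toy_page, toy_mapping, toy_add_to_cache() and toy_readahead() are hypothetical names, the fixed-size array stands in for the real page cache, and the "return the number of pages not handled" convention is inferred from the ocfs2 hunk in patch 5 rather than taken from documentation.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 32	/* toy limit, not a kernel constant */

struct toy_page {
	unsigned long index;	/* offset of the page within the file */
	int locked;		/* set on insertion, cleared when the "read" completes */
	int uptodate;		/* set once the page has been "read" */
};

struct toy_mapping {
	struct toy_page *slot[CACHE_SLOTS];	/* stands in for the page cache */
};

/* Caller side: publish a page in the cache before the filesystem sees it. */
static int toy_add_to_cache(struct toy_mapping *m, struct toy_page *p,
			    unsigned long index)
{
	if (index >= CACHE_SLOTS || m->slot[index])
		return -1;	/* out of range, or a page is already cached there */
	p->index = index;
	p->locked = 1;
	p->uptodate = 0;
	m->slot[index] = p;
	return 0;
}

/*
 * Filesystem side: no list is passed in. Pages are looked up by index,
 * "read", and unlocked. Returns the number of pages left unhandled
 * (assumed convention, see the lead-in).
 */
static unsigned toy_readahead(struct toy_mapping *m, unsigned long start,
			      unsigned nr_pages)
{
	while (nr_pages) {
		struct toy_page *p = start < CACHE_SLOTS ? m->slot[start] : NULL;

		if (!p)
			break;	/* nothing was published at this index */
		p->uptodate = 1;
		p->locked = 0;
		start++;
		nr_pages--;
	}
	return nr_pages;
}
```

In this model a lookup that finds no page simply stops the loop; whether the kernel-side lookup can come back empty, and what should happen then, is not something the sketch tries to answer.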
Thread overview: 6+ messages
2020-02-01 15:12 [Cluster-devel] [PATCH v4 00/12] Change readahead API Matthew Wilcox
2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 03/12] readahead: Put pages in cache earlier Matthew Wilcox
2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 04/12] mm: Add readahead address space operation Matthew Wilcox
2020-02-01 15:12 ` [Cluster-devel] [PATCH v4 05/12] fs: Convert mpage_readpages to mpage_readahead Matthew Wilcox
2020-02-04 15:32 ` [Cluster-devel] [PATCH v4 00/12] Change readahead API David Sterba
2020-02-04 17:16 ` Matthew Wilcox