From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: "Theodore Ts'o" <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Jan Kara <jack@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv1, RFC 25/33] ext4: make ext4_mpage_readpages() hugepage-aware
Date: Tue, 26 Jul 2016 03:35:27 +0300
Message-ID: <1469493335-3622-26-git-send-email-kirill.shutemov@linux.intel.com>
In-Reply-To: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com>

This patch makes ext4_mpage_readpages() handle huge pages.

A huge page is read in one go (2M at once), so the blocks array needs
(HPAGE_PMD_NR * blocks_per_page) sector_t entries, which is too much for
the stack. I'm not entirely happy with kmalloc() in this codepath, but I
don't see any other option.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
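For a sense of scale, the size of the kmalloc()ed array can be worked out
from the block size. The sketch below is standalone userspace C, not kernel
code. It assumes 4 KiB base pages, a 2M huge page (so HPAGE_PMD_NR is 512)
and an 8-byte sector_t. All SKETCH_* names and the helper functions are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions: 4 KiB base pages and 2M huge pages (512 subpages). */
#define SKETCH_PAGE_SHIFT   12u
#define SKETCH_HPAGE_PMD_NR 512u

/* Stand-in for the kernel's sector_t, assumed to be 64 bits wide. */
typedef uint64_t sketch_sector_t;

/* Filesystem blocks per base page, as in: PAGE_SIZE >> blkbits. */
static unsigned sketch_blocks_per_page(unsigned blkbits)
{
	return (1u << SKETCH_PAGE_SHIFT) >> blkbits;
}

/* Number of entries needed to map every block of one huge page. */
static unsigned sketch_huge_entries(unsigned blkbits)
{
	return SKETCH_HPAGE_PMD_NR * sketch_blocks_per_page(blkbits);
}

/* Bytes requested from the allocator for that many entries. */
static size_t sketch_huge_array_bytes(unsigned blkbits)
{
	return sizeof(sketch_sector_t) * sketch_huge_entries(blkbits);
}

/* Sanity checks for 1k (blkbits = 10) and 4k (blkbits = 12) blocks. */
static void sketch_self_check(void)
{
	assert(sketch_huge_entries(10) == 2048);
	assert(sketch_huge_array_bytes(10) == 16384);
	assert(sketch_huge_entries(12) == 512);
	assert(sketch_huge_array_bytes(12) == 4096);
}
```

Under those assumptions the array takes 16 KiB with 1k blocks and 4 KiB
with 4k blocks, against 64 bytes for the MAX_BUF_PER_PAGE (PAGE_SIZE / 512)
entries of blocks_on_stack.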
fs/ext4/readpage.c | 38 ++++++++++++++++++++++++++++++++------
1 file changed, 32 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index a81b829d56de..6d7cbddceeb2 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -104,12 +104,12 @@ int ext4_mpage_readpages(struct address_space *mapping,
struct inode *inode = mapping->host;
const unsigned blkbits = inode->i_blkbits;
- const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
const unsigned blocksize = 1 << blkbits;
sector_t block_in_file;
sector_t last_block;
sector_t last_block_in_file;
- sector_t blocks[MAX_BUF_PER_PAGE];
+ sector_t blocks_on_stack[MAX_BUF_PER_PAGE];
+ sector_t *blocks = blocks_on_stack;
unsigned page_block;
struct block_device *bdev = inode->i_sb->s_bdev;
int length;
@@ -122,8 +122,9 @@ int ext4_mpage_readpages(struct address_space *mapping,
map.m_flags = 0;
for (; nr_pages; nr_pages--) {
- int fully_mapped = 1;
- unsigned first_hole = blocks_per_page;
+ int fully_mapped = 1, nr = nr_pages;
+ unsigned blocks_per_page = PAGE_SIZE >> blkbits;
+ unsigned first_hole;
prefetchw(&page->flags);
if (pages) {
@@ -138,10 +139,31 @@ int ext4_mpage_readpages(struct address_space *mapping,
goto confused;
block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
- last_block = block_in_file + nr_pages * blocks_per_page;
+
+ if (PageTransHuge(page)) {
+ BUILD_BUG_ON(BIO_MAX_PAGES < HPAGE_PMD_NR);
+ nr = HPAGE_PMD_NR * blocks_per_page;
+ /* XXX: need a better solution ? */
+ blocks = kmalloc(sizeof(sector_t) * nr, GFP_NOFS);
+ if (!blocks) {
+ if (pages) {
+ delete_from_page_cache(page);
+ goto next_page;
+ }
+ return -ENOMEM;
+ }
+
+ blocks_per_page *= HPAGE_PMD_NR;
+ last_block = block_in_file + blocks_per_page;
+ } else {
+ blocks = blocks_on_stack;
+ last_block = block_in_file + nr * blocks_per_page;
+ }
+
last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
if (last_block > last_block_in_file)
last_block = last_block_in_file;
+ first_hole = blocks_per_page;
page_block = 0;
/*
@@ -213,6 +235,8 @@ int ext4_mpage_readpages(struct address_space *mapping,
}
}
if (first_hole != blocks_per_page) {
+ if (PageTransHuge(page))
+ goto confused;
zero_user_segment(page, first_hole << blkbits,
PAGE_SIZE);
if (first_hole == 0) {
@@ -248,7 +272,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
goto set_error_page;
}
bio = bio_alloc(GFP_KERNEL,
- min_t(int, nr_pages, BIO_MAX_PAGES));
+ min_t(int, nr, BIO_MAX_PAGES));
if (!bio) {
if (ctx)
fscrypt_release_ctx(ctx);
@@ -289,5 +313,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
BUG_ON(pages && !list_empty(pages));
if (bio)
submit_bio(bio);
+ if (blocks != blocks_on_stack)
+ kfree(blocks);
return 0;
}
--
2.8.1