From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: "Theodore Ts'o" <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Jan Kara <jack@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv1, RFC 21/33] fs: make block_read_full_page() be able to read huge page
Date: Tue, 26 Jul 2016 03:35:23 +0300 [thread overview]
Message-ID: <1469493335-3622-22-git-send-email-kirill.shutemov@linux.intel.com> (raw)
In-Reply-To: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com>
The approach is straight-forward: for compound pages we read out whole
huge page.
For huge page we cannot have array of buffer head pointers on stack --
it's 4096 pointers on x86-64 -- 'arr' is allocated with kmalloc() for
huge pages.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
fs/buffer.c | 22 +++++++++++++++++-----
include/linux/buffer_head.h | 9 +++++----
include/linux/page-flags.h | 2 +-
3 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 193ef03401ed..9ca197a924eb 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -870,7 +870,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
try_again:
head = NULL;
- offset = PAGE_SIZE;
+ offset = hpage_size(page);
while ((offset -= size) >= 0) {
bh = alloc_buffer_head(GFP_NOFS);
if (!bh)
@@ -1466,7 +1466,7 @@ void set_bh_page(struct buffer_head *bh,
struct page *page, unsigned long offset)
{
bh->b_page = page;
- BUG_ON(offset >= PAGE_SIZE);
+ BUG_ON(offset >= hpage_size(page));
if (PageHighMem(page))
/*
* This catches illegal uses and preserves the offset:
@@ -2239,11 +2239,13 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
{
struct inode *inode = page->mapping->host;
sector_t iblock, lblock;
- struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
+ struct buffer_head *arr_on_stack[MAX_BUF_PER_PAGE];
+ struct buffer_head *bh, *head, **arr = arr_on_stack;
unsigned int blocksize, bbits;
int nr, i;
int fully_mapped = 1;
+ VM_BUG_ON_PAGE(PageTail(page), page);
head = create_page_buffers(page, inode, 0);
blocksize = head->b_size;
bbits = block_size_bits(blocksize);
@@ -2254,6 +2256,11 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
nr = 0;
i = 0;
+ if (PageTransHuge(page)) {
+ arr = kmalloc(sizeof(struct buffer_head *) * HPAGE_PMD_NR *
+ MAX_BUF_PER_PAGE, GFP_NOFS);
+ }
+
do {
if (buffer_uptodate(bh))
continue;
@@ -2269,7 +2276,9 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
SetPageError(page);
}
if (!buffer_mapped(bh)) {
- zero_user(page, i * blocksize, blocksize);
+ zero_user(page + (i * blocksize / PAGE_SIZE),
+ i * blocksize % PAGE_SIZE,
+ blocksize);
if (!err)
set_buffer_uptodate(bh);
continue;
@@ -2295,7 +2304,7 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
if (!PageError(page))
SetPageUptodate(page);
unlock_page(page);
- return 0;
+ goto out;
}
/* Stage two: lock the buffers */
@@ -2317,6 +2326,9 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
else
submit_bh(REQ_OP_READ, 0, bh);
}
+out:
+ if (arr != arr_on_stack)
+ kfree(arr);
return 0;
}
EXPORT_SYMBOL(block_read_full_page);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index ebbacd14d450..2389e8fd6127 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -131,13 +131,14 @@ BUFFER_FNS(Meta, meta)
BUFFER_FNS(Prio, prio)
BUFFER_FNS(Defer_Completion, defer_completion)
-#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~hpage_mask(bh->b_page))
/* If we *know* page->private refers to buffer_heads */
-#define page_buffers(page) \
+#define page_buffers(__page) \
({ \
- BUG_ON(!PagePrivate(page)); \
- ((struct buffer_head *)page_private(page)); \
+ struct page *p = compound_head(__page); \
+ BUG_ON(!PagePrivate(p)); \
+ ((struct buffer_head *)page_private(p)); \
})
#define page_has_buffers(page) PagePrivate(page)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index de90abb7c84e..89f0e9f26694 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -730,7 +730,7 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
*/
static inline int page_has_private(struct page *page)
{
- return !!(page->flags & PAGE_FLAGS_PRIVATE);
+ return !!(compound_head(page)->flags & PAGE_FLAGS_PRIVATE);
}
#undef PF_ANY
--
2.8.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-07-26 0:35 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-26 0:35 [PATCHv1, RFC 00/33] ext4: support of huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 01/33] tools: Add WARN_ON_ONCE Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 02/33] radix tree test suite: Allow GFP_ATOMIC allocations to fail Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 03/33] radix-tree: Add radix_tree_join Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 04/33] radix-tree: Add radix_tree_split Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 05/33] radix-tree: Add radix_tree_split_preload() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 06/33] radix-tree: Handle multiorder entries being deleted by replace_clear_tags Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 07/33] mm, shmem: swich huge tmpfs to multi-order radix-tree entries Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 08/33] Revert "radix-tree: implement radix_tree_maybe_preload_order()" Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 09/33] page-flags: relax page flag poliry for PG_error and PG_writeback Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 10/33] mm, rmap: account file thp pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 11/33] thp: allow splitting non-shmem file-backed THPs Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 12/33] truncate: make sure invalidate_mapping_pages() can discard huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 13/33] filemap: allocate huge page in page_cache_read(), if allowed Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 14/33] filemap: handle huge pages in do_generic_file_read() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 15/33] filemap: allocate huge page in pagecache_get_page(), if allowed Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 16/33] filemap: handle huge pages in filemap_fdatawait_range() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 17/33] HACK: readahead: alloc huge pages, if allowed Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 18/33] HACK: block: bump BIO_MAX_PAGES Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 19/33] mm: make write_cache_pages() work on huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 20/33] thp: introduce hpage_size() and hpage_mask() Kirill A. Shutemov
2016-07-26 0:35 ` Kirill A. Shutemov [this message]
2016-07-26 0:35 ` [PATCHv1, RFC 22/33] fs: make block_write_{begin,end}() be able to handle huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 23/33] fs: make block_page_mkwrite() aware about " Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 24/33] truncate: make truncate_inode_pages_range() " Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 25/33] ext4: make ext4_mpage_readpages() hugepage-aware Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 26/33] ext4: make ext4_writepage() work on huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 27/33] ext4: handle huge pages in ext4_page_mkwrite() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 28/33] ext4: handle huge pages in __ext4_block_zero_page_range() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 29/33] ext4: handle huge pages in ext4_da_write_end() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 30/33] ext4: relax assert in ext4_da_page_release_reservation() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 31/33] WIP: ext4: handle writeback with huge pages Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 32/33] mm, fs, ext4: expand use of page_mapping() and page_to_pgoff() Kirill A. Shutemov
2016-07-26 0:35 ` [PATCHv1, RFC 33/33] ext4, vfs: add huge= mount option Kirill A. Shutemov
2016-07-26 17:29 ` [PATCHv1, RFC 00/33] ext4: support of huge pages Theodore Ts'o
2016-07-26 19:12 ` Kirill A. Shutemov
2016-07-27 9:17 ` Jan Kara
2016-07-27 10:33 ` Kirill A. Shutemov
2016-07-27 14:09 ` Andrea Arcangeli
2016-08-10 0:54 ` [PATCH] mm, hugetlb: switch hugetlbfs to multi-order radix-tree entries Naoya Horiguchi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1469493335-3622-22-git-send-email-kirill.shutemov@linux.intel.com \
--to=kirill.shutemov@linux.intel.com \
--cc=aarcange@redhat.com \
--cc=adilger.kernel@dilger.ca \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=hughd@google.com \
--cc=jack@suse.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ross.zwisler@linux.intel.com \
--cc=tytso@mit.edu \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).