From: Boaz Harrosh <boaz@plexistor.com>
To: Matthew Wilcox <willy@linux.intel.com>, Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
axboe@kernel.dk, linux-nvdimm@ml01.01.org,
Dave Hansen <dave.hansen@linux.intel.com>,
linux-raid@vger.kernel.org, mgorman@suse.de, hch@infradead.org,
linux-fsdevel@vger.kernel.org,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [RFC PATCH 0/7] evacuate struct page from the block layer
Date: Sun, 22 Mar 2015 18:24:50 +0200
Message-ID: <550EECD2.4000604@plexistor.com>
In-Reply-To: <20150320203136.GM4003@linux.intel.com>
On 03/20/2015 10:31 PM, Matthew Wilcox wrote:
<>
>
> There's a lot of code out there that relies on struct page being PAGE_SIZE
> bytes.
Not so much, really, and not at the lower end of the stack. You can actually
do:

	vp = kmalloc(SZ_64K, GFP_KERNEL);
	bv_page = virt_to_page(vp);
	bv_len = SZ_64K;

and feed that to a hard drive. It works.
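
To make the point above concrete, here is a minimal userspace sketch of a single segment that is longer than one base page. It is not kernel code: struct model_bvec, the MODEL_* constants, and the helper names are hypothetical stand-ins for illustration, and a 4 KiB base page is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for a bio_vec-like (page, offset, len) triple. */
struct model_bvec {
	unsigned long first_pfn;	/* frame number of the first base page */
	size_t offset;			/* byte offset from the start of that page */
	size_t len;			/* byte length; may exceed MODEL_PAGE_SIZE */
};

/* Number of base pages the segment touches (0 for an empty segment). */
static size_t model_bvec_nr_pages(const struct model_bvec *bv)
{
	if (bv->len == 0)
		return 0;
	return ((bv->offset + bv->len - 1) >> MODEL_PAGE_SHIFT)
		- (bv->offset >> MODEL_PAGE_SHIFT) + 1;
}

/* Physical address of the first byte, assuming the buffer is contiguous. */
static unsigned long long model_bvec_phys(const struct model_bvec *bv)
{
	return ((unsigned long long)bv->first_pfn << MODEL_PAGE_SHIFT)
		+ bv->offset;
}

/* Self-check: one descriptor covers a whole 64 KiB contiguous buffer. */
static void model_bvec_demo(void)
{
	struct model_bvec big = { .first_pfn = 0x100, .offset = 0,
				  .len = 16 * MODEL_PAGE_SIZE };
	struct model_bvec straddle = { .first_pfn = 0x100, .offset = 512,
				       .len = MODEL_PAGE_SIZE };

	assert(model_bvec_nr_pages(&big) == 16);
	assert(model_bvec_nr_pages(&straddle) == 2);
	assert(model_bvec_phys(&straddle) == 0x100200ULL);
}
```

The only property the sketch relies on is physical contiguity, which is what a kmalloc'ed buffer provides; nothing in the arithmetic cares whether len is smaller or larger than one page.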
The last stronghold of PAGE_SIZE is the page-cache and page-fault
granularity, where the smallest size is the best. But it should not be hard
to clean up the lower end of the stack, and even to introduce a:

	page_size(page)

You will find that every subsystem that can work with a sub-page bv_len, as
above, will also work well with a bv_len bigger than PAGE_SIZE. Only the
BUG_ONs need to be converted to use page_size(page) instead of PAGE_SIZE.

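
The BUG_ON conversion described above might look roughly like the following userspace sketch. Everything here is an assumption made for illustration: struct model_page with an explicit order field, model_page_size(), and the two model_fits* checks are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical page descriptor: order n covers 2^n base pages. */
struct model_page {
	unsigned int order;
};

/* Sketch of the proposed page_size(page): size depends on the page itself. */
static size_t model_page_size(const struct model_page *page)
{
	return MODEL_PAGE_SIZE << page->order;
}

/* Old-style bound check, hard-wired to the base page size. */
static bool model_fits_legacy(size_t offset, size_t len)
{
	return offset <= MODEL_PAGE_SIZE && len <= MODEL_PAGE_SIZE - offset;
}

/* Converted check: same shape, but bounded by the size of this page. */
static bool model_fits(const struct model_page *page, size_t offset, size_t len)
{
	size_t size = model_page_size(page);

	return offset <= size && len <= size - offset;
}

/* Self-check: sub-page segments pass both checks, 64 KiB only the new one. */
static void model_page_size_demo(void)
{
	struct model_page base = { .order = 0 };
	struct model_page big = { .order = 4 };	/* 16 base pages = 64 KiB */

	assert(model_fits_legacy(0, 512));
	assert(model_fits(&base, 0, 512));
	assert(!model_fits_legacy(0, 64 * 1024));
	assert(model_fits(&big, 0, 64 * 1024));
	assert(!model_fits(&big, 1, 64 * 1024));
}
```

For order-0 pages the two checks agree, which is why code that already handles sub-page lengths would not need more than the changed bound.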
> I'm cool with replacing 'struct page' with 'struct superpage'
> [1] in the biovec and auditing all of the code which touches it ... but
> that's going to be a lot of code! I'm not sure it's less code than
> going directly to 'just do I/O on PFNs'.
>
struct page already knows how to be a super-page, through the THP mechanics.
All page_size(page) needs is a call to its section; we do not need any
added storage in struct page. (And we can cache this as a flag; we actually
already have a flag.)
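
One possible reading of the "no added storage" argument, sketched in userspace C: a head flag marks a compound page, and the order lives in a field of the first tail page that is otherwise unused there. This is only loosely modelled on how compound pages record their order; the struct layout, flag values, and function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_PG_HEAD    (1UL << 0)	/* hypothetical flag bits */
#define MODEL_PG_TAIL    (1UL << 1)

/* Hypothetical per-page descriptor; no field is added for the size. */
struct model_page {
	unsigned long flags;
	unsigned long private;	/* reused in the first tail page for the order */
};

/* Mark 2^order consecutive descriptors as one compound page. */
static void model_prep_compound(struct model_page *head, unsigned int order)
{
	size_t i, nr = (size_t)1 << order;

	assert(order >= 1);	/* a compound page needs at least one tail */
	head->flags |= MODEL_PG_HEAD;
	for (i = 1; i < nr; i++)
		head[i].flags |= MODEL_PG_TAIL;
	head[1].private = order;
}

/* Order 0 unless this descriptor is the head of a compound page. */
static unsigned int model_compound_order(const struct model_page *page)
{
	if (!(page->flags & MODEL_PG_HEAD))
		return 0;
	return (unsigned int)page[1].private;
}

static size_t model_page_size(const struct model_page *page)
{
	return MODEL_PAGE_SIZE << model_compound_order(page);
}

/* Self-check: an order-9 compound page reports 2 MiB, a plain page 4 KiB. */
static void model_compound_demo(void)
{
	static struct model_page pages[512 + 1];	/* zero-initialised */

	model_prep_compound(&pages[0], 9);
	assert(model_page_size(&pages[0]) == ((size_t)2 << 20));
	assert(model_page_size(&pages[512]) == MODEL_PAGE_SIZE);
}
```

In this sketch the flag test is the cached fast path: a page without the head flag is answered without touching any other descriptor.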
It looks like you are very trigger-happy to change
"biovec and auditing all of the code which touches it".
I believe that in the very long term your #1b is the correct "full audit" path:
a page is the virtual-to-page-to-physical descriptor plus state, and
it is of variable size.
> [1] Please, somebody come up with a better name!
Sure: struct page *page.

The one to kill is PAGE_SIZE. In most current code it can just become
MIN_PAGE_SIZE, with CACHE_PAGE_SIZE == MIN_PAGE_SIZE. The only novelty is
enhancing split_huge_page for the "page-fault-granularity" case.

Thanks
Boaz