linux-fsdevel.vger.kernel.org archive mirror
From: Dave Hansen <dave.hansen@intel.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
	linux-ide <linux-ide@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM TOPIC] Fixing large block devices on 32 bit
Date: Fri, 31 Jan 2014 16:19:43 -0800
Message-ID: <52EC3D9F.8040702@intel.com>
In-Reply-To: <1391210864.2172.61.camel@dabdike.int.hansenpartnership.com>

On 01/31/2014 03:27 PM, James Bottomley wrote:
> On Fri, 2014-01-31 at 13:47 -0800, Dave Hansen wrote:
>> On 01/31/2014 11:02 AM, James Bottomley wrote:
>>>      3. Increase pgoff_t and the radix tree indexes to u64 for
>>>         CONFIG_LBDAF.  This will blow out the size of struct page on 32
>>>         bits by 4 bytes and may have other knock on effects, but at
>>>         least it will be transparent.
>>
>> I'm not sure how many acrobatics we want to go through for 32-bit, but...
> 
> That's partly the question: 32 bits was dying in the x86 space (at least
> until quark), but it's still predominant in embedded.
> 
>> Between page->mapping and page->index, we have 64 bits of space, which
>> *should* be plenty to uniquely identify a block.  We could easily add a
>> second-level lookup somewhere so that we store some cookie for the
>> address_space instead of a direct pointer.  How many devices would we need,
>> practically?  8 bits worth?
> 
> That might work.  8 bits would get us up to 4PB, which is looking a bit
> high for single disk spinning rust.  However, how would the cookie work
> efficiently?  Remember we'll be doing this lookup every time we pull a
> page out of the page cache.  And the problem is that most of our lookups
> will be on file inodes, which won't be > 16TB, so it's a lot of overhead
> in the generic machinery for a problem that only occurs on buffer
> related page cache lookups.
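
For reference, the 16TB and 4PB figures quoted above fall out of simple
arithmetic, assuming 4KiB pages and a 32-bit page index. A minimal sketch
(PAGE_SHIFT_SKETCH and max_bytes() are illustrative names, not kernel
symbols):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumption: 4 KiB pages */

/* Bytes addressable through the page cache with index_bits of page index. */
static uint64_t max_bytes(unsigned index_bits)
{
	/* Keep the shift below the width of uint64_t. */
	assert(index_bits + PAGE_SHIFT_SKETCH < 64);
	return (uint64_t)1 << (index_bits + PAGE_SHIFT_SKETCH);
}
```

max_bytes(32) is 2^44 bytes (16TiB), the current limit; 8 extra index
bits give max_bytes(40), which is 2^52 bytes (4PiB).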

I think all we have to do is set a low bit in page->mapping (or in
page->flags, but it's more constrained) to say: "this isn't a direct
pointer".  We only set the bit for the buffer cache pages, and thus only
go to the slow(er) lookup path for those.  Whatever we use for the
lookups (radix tree or whatever) uses the remaining bits for an index.
We'd probably also need a last-lookup cache like mm->mmap_cache, but
probably not much more than that.
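
A rough userspace sketch of that idea, with everything in it an
assumption for illustration: the struct and function names are made up,
the bit layout (1 tag bit, 8 cookie bits, the remaining 23 bits as the
high part of the page offset) is just one way to split a 32-bit
page->mapping, and a flat table stands in for "radix tree or whatever".
A real patch would also have to pick a tag bit that doesn't collide with
the low bit page->mapping already uses for anonymous pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of this is existing kernel API. */
#define COOKIE_TAG	((uintptr_t)1)	/* low bit set: not a direct pointer */
#define COOKIE_SHIFT	1
#define COOKIE_BITS	8		/* "8 bits worth" of devices */
#define NR_COOKIES	(1u << COOKIE_BITS)
#define HI_SHIFT	(COOKIE_SHIFT + COOKIE_BITS)
#define HI_BITS		(32 - HI_SHIFT)	/* bits left over in a 32-bit word */

struct mapping_sketch { int id; };	/* stand-in for struct address_space */

struct page_sketch {
	uintptr_t mapping;	/* direct pointer, or tagged cookie word */
	uint32_t index;		/* 32-bit pgoff_t */
};

static struct mapping_sketch *cookie_table[NR_COOKIES];
static unsigned last_cookie = NR_COOKIES;	/* out of range: cache empty */
static struct mapping_sketch *last_mapping;
static unsigned long slow_lookups;		/* counts cache misses */

static void register_cookie(unsigned cookie, struct mapping_sketch *m)
{
	assert(cookie < NR_COOKIES);
	assert(((uintptr_t)m & COOKIE_TAG) == 0);	/* pointer must be aligned */
	cookie_table[cookie] = m;
	last_cookie = NR_COOKIES;			/* invalidate the cache */
}

/* Only buffer cache pages get the tagged encoding. */
static void set_buffer_page(struct page_sketch *p, unsigned cookie,
			    uint64_t pgoff)
{
	uint64_t hi = pgoff >> 32;

	assert(cookie < NR_COOKIES);
	assert(hi < ((uint64_t)1 << HI_BITS));
	p->mapping = ((uintptr_t)hi << HI_SHIFT) |
		     ((uintptr_t)cookie << COOKIE_SHIFT) | COOKIE_TAG;
	p->index = (uint32_t)pgoff;
}

static struct mapping_sketch *page_mapping_sketch(const struct page_sketch *p)
{
	unsigned cookie;

	if (!(p->mapping & COOKIE_TAG))		/* fast path: file pages */
		return (struct mapping_sketch *)p->mapping;

	cookie = (unsigned)(p->mapping >> COOKIE_SHIFT) & (NR_COOKIES - 1);
	if (cookie != last_cookie) {		/* slow(er) path on a miss */
		slow_lookups++;
		last_mapping = cookie_table[cookie];
		last_cookie = cookie;
	}
	return last_mapping;
}

/* Full page offset: high bits come from the tagged word, if any. */
static uint64_t page_offset_sketch(const struct page_sketch *p)
{
	uint64_t hi = 0;

	if (p->mapping & COOKIE_TAG)
		hi = (uint32_t)p->mapping >> HI_SHIFT;
	return (hi << 32) | p->index;
}
```

File pages keep a plain pointer and never leave the fast path; only
tagged pages pay for the table lookup, and repeated lookups on the same
device hit the one-entry cache.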

We already have page_mapping() in place to redirect folks away from
using page->mapping directly, so there shouldn't be too much code impact.



Thread overview: 10+ messages
2014-01-31 19:02 [LSF/MM TOPIC] Fixing large block devices on 32 bit James Bottomley
2014-01-31 19:26 ` Dave Jones
2014-01-31 23:16   ` James Bottomley
2014-01-31 21:20 ` Chris Mason
2014-01-31 23:14   ` James Bottomley
2014-01-31 21:47 ` Dave Hansen
2014-01-31 23:27   ` James Bottomley
2014-02-01  0:19     ` Dave Hansen [this message]
2014-02-01  0:25       ` Kirill A. Shutemov
2014-02-01  0:32         ` Dave Hansen
