public inbox for linux-ext4@vger.kernel.org
From: TR Reardon <thomas_reardon@hotmail.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: RE: Reserved GDT inode: blocks vs extents
Date: Fri, 19 Sep 2014 13:26:38 -0400
Message-ID: <BAY179-W495E3AE101B78A7CAA43BAFDB40@phx.gbl>
In-Reply-To: <20140919163649.GQ26995@thunk.org>

> Date: Fri, 19 Sep 2014 12:36:49 -0400
> From: tytso@mit.edu
> To: thomas_reardon@hotmail.com
> CC: linux-ext4@vger.kernel.org
> Subject: Re: Reserved GDT inode: blocks vs extents
>
> On Fri, Sep 19, 2014 at 11:54:39AM -0400, TR Reardon wrote:
>> Hello all: there's probably a good reason for this, but I'm wondering why inode#7 (reserved GDT blocks) is always created with a block map rather than extent?
>>
>> [see ext2fs_create_resize_inode()]
>
> It's created using an indirect map because the on-line resizing code
> in the kernel relies on it. It's rather dependent on the structure of
> the indirect block map so that the kernel knows where to fetch the
> necessary blocks in each block group to extend the block group
> descriptor.
>
> So no, we can't change it.
>
> And we do have a solution, namely the meta_bg layout which mostly
> solves the problem, although at the cost of slowing down the mount
> time.
>
> But that may be moot, since one of the things that I've been
> considering is to stop pinning the block group descriptors in memory,
> and just start reading them in as they are needed. The rationale is
> that for a 4TB disk, we're burning 8 MB of memory. And if you have
> two dozen disks attached to your system, then you're burning 192
> megabytes of memory, which starts to be a fairly significant amount of
> memory, especially for bookcase NAS servers.

But I'd argue that in many use cases, bookcase NAS servers in particular,
ext4+VFS should optimize for avoiding spinups rather than for reducing RAM
usage. Would this change increase spinups when scanning for changes, say
via rsync? For mostly-cold storage, I wish I could make the dentry and
inode caches long-lived and have ext4 prefer to retain directory blocks
over file-data blocks in the cache, rather than rely on the current
non-deterministic behavior tuned via vfs_cache_pressure. Unfortunately, it
is precisely the kind of large files found on bookcase NAS servers, read
linearly and used only once, that blow out the cache of directory blocks
(and dentries etc., but it's really the directory blocks that cause the
spinups on cold storage).
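
For reference, a minimal sketch of how one might inspect the
vfs_cache_pressure knob mentioned above. It assumes a Linux system that
exposes the sysctl at /proc/sys/vm/vfs_cache_pressure; the helper names
read_cache_pressure and describe_cache_pressure are hypothetical, and the
descriptions paraphrase the kernel's sysctl documentation. Note that this
knob only biases reclaim of dentries and inodes; it does not pin directory
blocks in the cache, which is the limitation described above.

```python
from pathlib import Path

# Real procfs location of the vm.vfs_cache_pressure sysctl on Linux.
VFS_CACHE_PRESSURE = Path("/proc/sys/vm/vfs_cache_pressure")


def read_cache_pressure(path=VFS_CACHE_PRESSURE):
    """Return the current vfs_cache_pressure setting as an integer.

    The path is a parameter so the function can be exercised against
    an ordinary file on systems without procfs.
    """
    return int(Path(path).read_text().strip())


def describe_cache_pressure(value):
    """Summarize how a given setting biases dentry/inode reclaim."""
    if value < 0:
        raise ValueError("vfs_cache_pressure cannot be negative")
    if value == 0:
        return "never reclaim dentries and inodes (can lead to out-of-memory)"
    if value < 100:
        return "prefer to retain dentry and inode caches"
    if value == 100:
        return "default: reclaim dentries and inodes at a fair rate"
    return "prefer to reclaim dentries and inodes"


if __name__ == "__main__":
    try:
        current = read_cache_pressure()
    except (OSError, ValueError) as exc:
        print(f"could not read {VFS_CACHE_PRESSURE}: {exc}")
    else:
        print(f"vfs_cache_pressure={current}: {describe_cache_pressure(current)}")
```

Lowering the value (for example with sysctl vm.vfs_cache_pressure=50, run
as root) is the usual way to bias toward keeping dentries and inodes, but
it is a hint to reclaim rather than a guarantee.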

Of course, it's likelier that I don't actually understand how all these caches work ;)

+Reardon



Thread overview: 4+ messages
2014-09-19 15:54 Reserved GDT inode: blocks vs extents TR Reardon
2014-09-19 16:36 ` Theodore Ts'o
2014-09-19 17:26   ` TR Reardon [this message]
2014-09-19 19:48     ` Andreas Dilger
