linux-ext4.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Phillip Susi <psusi@ubuntu.com>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>,
	linux-ext4@vger.kernel.org, Zheng Liu <wenqing.lz@taobao.com>
Subject: Re: [PATCH 3/3] mke2fs: document bigalloc and cluster-size
Date: Tue, 15 Jan 2013 17:28:24 -0500
Message-ID: <20130115222824.GA5073@thunk.org>
In-Reply-To: <50F5BE57.1000305@ubuntu.com>

On Tue, Jan 15, 2013 at 03:38:47PM -0500, Phillip Susi wrote:
> 
> If it is only to get around the mm pagesize limit, then why not just
> have the fs automatically lie to the kernel about the block size and
> shift the references back and forth on the fly when it detects a
> larger blocksize?

Because of the pain in dealing with how to handle random writes into a
sparse file.  We need to either track which blocks in the large block
have been initialized, or we would need to erase the entire large
block before writing the first page into the large block (and then you
still need to track whether or not you are writing that first or
subsequent page into a large block).

What we're doing with bigalloc is effectively tracking which blocks in
the cluster have been initialized by using entries in the extent tree,
since entries in the allocation bitmaps are in units of clusters, but
entries in the extent tree are in units of blocks.
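
To make the two granularities concrete, here is a toy model in Python.
It is only a sketch: the class, its method names, the 4 KiB block size
and the 16-blocks-per-cluster ratio are assumptions for illustration,
and it ignores the logical-to-physical mapping that the real extent
tree carries.  It is not how ext4 implements this.

```python
BLOCK_SIZE = 4096          # assumed block size, in bytes
BLOCKS_PER_CLUSTER = 16    # assumed ratio, i.e. a 64 KiB cluster


class ToyBigallocFile:
    """Toy model: allocation is per cluster, mapping is per block."""

    def __init__(self):
        # Stands in for the allocation bitmap (units: clusters).
        self.allocated_clusters = set()
        # Stands in for the extent tree (units: blocks).
        self.mapped_blocks = set()

    def write_block(self, logical_block):
        """Record a write to one block; allocate its cluster if needed."""
        cluster = logical_block // BLOCKS_PER_CLUSTER
        self.allocated_clusters.add(cluster)
        self.mapped_blocks.add(logical_block)

    def block_state(self, logical_block):
        """Classify a block using both structures."""
        cluster = logical_block // BLOCKS_PER_CLUSTER
        if logical_block in self.mapped_blocks:
            return "mapped"
        if cluster in self.allocated_clusters:
            # Space is reserved, but no extent covers this block,
            # so it was never written and must not expose stale data.
            return "unmapped-in-allocated-cluster"
        return "hole"
```

After a single write to block 17, cluster 1 is allocated as a whole,
but only block 17 is mapped; block 16 sits in the same cluster and is
still known to be uninitialized because no extent covers it.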

Looking back at how complicated it has been to get delalloc right, it
might have been simpler to just use a brute-force sb_issue_zeroout
when the large block is freshly allocated, unless the arguments to the
request to ext4_writepages() exactly covered the large block.  Getting
the Direct I/O path right would have been messy, but perhaps it would
have been less work in the end.
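
The brute-force rule described above boils down to one decision per
write.  A minimal sketch in Python, where needs_zeroout() and its
parameters are hypothetical names for illustration and the write is
assumed to fall within a single large block:

```python
def needs_zeroout(write_start, write_len, large_block_size,
                  freshly_allocated):
    """Return True if the whole large block should be zeroed first.

    write_start and write_len are byte offsets/lengths; the write is
    assumed not to cross a large-block boundary.
    """
    if not freshly_allocated:
        # The block was zeroed (or fully written) when first allocated.
        return False
    block_start = (write_start // large_block_size) * large_block_size
    covers_whole_block = (write_start == block_start
                          and write_len == large_block_size)
    # A write that exactly covers the block overwrites every byte,
    # so zeroing beforehand would be wasted I/O.
    return not covers_whole_block
```

For example, with a 64 KiB large block, a 4 KiB write at offset 8192
into a fresh block would require zeroing, while a 64 KiB write at
offset 65536 would not.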

                                        - Ted


