public inbox for linux-ext4@vger.kernel.org
From: Theodore Ts'o <tytso@mit.edu>
To: "Pocas, Jamie" <Jamie.Pocas@emc.com>
Cc: Eric Sandeen <sandeen@redhat.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: resize2fs stuck in ext4_group_extend with 100% CPU Utilization With Small Volumes
Date: Tue, 22 Sep 2015 19:02:05 -0400
Message-ID: <20150922230204.GD3318@thunk.org>
In-Reply-To: <06724CF51D6BC94E9BEE7A8A8CB82A6740FE22BCCC@MX01A.corp.emc.com>

On Tue, Sep 22, 2015 at 04:28:39PM -0400, Pocas, Jamie wrote:
> # mount -o loop testfile mnt
> # truncate --size=1G testfile
> # losetup -c /dev/loop0 ## Cause loop device to reread size of backing file while still online
> # resize2fs /dev/loop0

It looks like the problem is with the loopback driver, and I can
reproduce the problem using 4.3-rc2.

If you skip *both* the truncate and the resize2fs commands in the
above sequence, and instead do a "touch mnt/foo ; sync", the sync
command will still hang.

The trigger is the losetup -c command, which calls the
LOOP_SET_CAPACITY ioctl.  This causes bd_set_size() to be called,
which has the side effect of forcing the block size of /dev/loop0 to
4096 --- which breaks things if the file system is using a 1k block
size, and the device block size had therefore been properly set to
1024.  The mismatch is what subsequently causes the buffer cache
operations to hang.

So this will cause a hang:

cp /dev/null /tmp/foo.img
mke2fs -t ext4 /tmp/foo.img 100M
mount -o loop /tmp/foo.img /mnt
losetup -c /dev/loop0
touch /mnt/foo
sync

This will not hang:

cp /dev/null /tmp/foo.img
mke2fs -t ext4 -b 4096 /tmp/foo.img 100M
mount -o loop /tmp/foo.img /mnt
losetup -c /dev/loop0
touch /mnt/foo
sync

And this also explains why you were seeing the problem only with small
file systems.  By default mke2fs uses a block size of 1k for file
systems smaller than 512 MB.  This is largely for historical reasons
since there was a time when we worried about optimizing the storage of
every single byte of your 80MB disk (which was all you had on your 40
MHz 80386 :-).

With larger file systems, the block size defaults to 4096, so we don't
run into problems when losetup -c attempts to set the block size ---
which is something that is *not* supposed to change if the block
device is currently mounted.  So for example, if you try to run the
command "blockdev --setbsz", it will fail with an EBUSY if the block
device is currently mounted.

So the workaround is to just create the file system with "-b 4096"
when you call mkfs.ext4.  This is a good idea if you intend to grow
the file system, since it is far more efficient to use a 4k block
size.
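
Before growing an existing image through the loop device, it may help
to check which block size the file system actually uses.  The sketch
below parses the "Block size:" line that dumpe2fs -h prints; the
helper names fs_block_size and safe_for_loop_resize are made up for
this illustration and are not part of e2fsprogs.

```shell
#!/bin/sh
# fs_block_size: read `dumpe2fs -h` output on stdin and print the
# value of its "Block size:" line (hypothetical helper).
fs_block_size() {
    awk -F: '/^Block size:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# safe_for_loop_resize: succeed only if the file system described on
# stdin uses 4k blocks, i.e. the case that does not hang above
# (hypothetical helper).
safe_for_loop_resize() {
    [ "$(fs_block_size)" = "4096" ]
}

# Demo on a sample line in dumpe2fs format (not real device output):
printf 'Block size:               1024\n' | fs_block_size
```

Against a real image this would be used roughly as
"dumpe2fs -h /tmp/foo.img 2>/dev/null | safe_for_loop_resize", with
losetup -c skipped when the check fails.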

The proper fix in the kernel is to have the loop device check to see
if the block device is currently mounted.  If it is, then it needs to
avoid changing the block size (which probably means it will need to
call a modified version of bd_set_size), and the capacity of the block
device needs to be rounded down to the current block size.
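
The rounding step is plain integer arithmetic.  As a minimal sketch
(round_down_capacity is a hypothetical helper for illustration, not a
kernel function), the largest capacity that stays addressable at a
fixed block size could be computed like this:

```shell
#!/bin/sh
# round_down_capacity SIZE_BYTES BLOCK_SIZE
# Print the largest multiple of BLOCK_SIZE that fits in SIZE_BYTES.
# Hypothetical helper; it only illustrates the arithmetic.
round_down_capacity() {
    size=$1
    bsize=$2
    echo $(( size / bsize * bsize ))
}

# A request for 1 MiB plus 2 KiB on a device pinned at 4 KiB blocks
# is trimmed to exactly 1 MiB.
round_down_capacity $(( 1024 * 1024 + 2048 )) 4096
```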

(Currently if you set the capacity of the block device to be say, 1MB
plus 2k, and the current block size is 4k, it will change the block
size of the device to be 2k, so that the entire block device is
addressable.  If the block device is mounted and the block size is fixed
to 4k, then it must not change the block size --- either up or down.
Instead, it must keep the block size at 4k, and only allow the
capacity to be set to 1MB.)
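
The current behavior described in the parenthetical can be modeled
roughly as follows.  This is a sketch, not the kernel code: it assumes
the search starts at a 512-byte sector size, doubles the block size
while the capacity stays a multiple of the next size, and stops at an
assumed 4096-byte page size.  The name pick_block_size is invented for
the example.

```shell
#!/bin/sh
# pick_block_size SIZE_BYTES
# Rough model of how the block size ends up being chosen from the
# device capacity alone.  Assumptions: 512-byte starting size and a
# 4096-byte page size; hypothetical helper, not the kernel routine.
pick_block_size() {
    size=$1
    bsize=512
    page=4096
    while [ "$bsize" -lt "$page" ]; do
        # If this bit is set in the capacity, doubling the block size
        # would leave the tail of the device unaddressable, so stop.
        if [ $(( size & bsize )) -ne 0 ]; then
            break
        fi
        bsize=$(( bsize * 2 ))
    done
    echo "$bsize"
}

pick_block_size $(( 1024 * 1024 + 2048 ))   # 1 MiB + 2 KiB
pick_block_size $(( 100 * 1024 * 1024 ))    # 100 MiB image from the repro
```

Under these assumptions the first call prints 2048 and the second
prints 4096, regardless of what block size the mounted file system
asked for, which is the mismatch behind the hang with a 1k file
system.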

Cheers,

					- Ted

Thread overview: 13+ messages
2015-09-22 19:12 resize2fs stuck in ext4_group_extend with 100% CPU Utilization With Small Volumes Pocas, Jamie
2015-09-22 19:33 ` Eric Sandeen
2015-09-22 20:28   ` Pocas, Jamie
2015-09-22 23:02     ` Theodore Ts'o [this message]
2015-09-23  4:20       ` Pocas, Jamie
2015-09-23 15:14         ` Theodore Ts'o
2015-09-23 16:04           ` Pocas, Jamie
2015-09-23 16:59             ` Theodore Ts'o
2015-09-23 18:20               ` Pocas, Jamie
2015-09-22 20:20 ` Theodore Ts'o
2015-09-22 21:26   ` Pocas, Jamie
2015-09-22 23:41     ` Eric Sandeen
2015-09-23  3:40       ` Pocas, Jamie
