* [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
@ 2011-11-21 14:20 Dave Chinner
2011-11-22 3:56 ` Wu Fengguang
0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2011-11-21 14:20 UTC (permalink / raw)
To: fengguang.wu; +Cc: linux-kernel, linux-fsdevel
Hi Fengguang,
I just found a way of hanging a system and taking it down. I haven't
tried to narrow down the test case - it's pretty simple - because it's
time for sleep here.
$ uname -a
Linux test-2 3.2.0-rc2-dgc+ #89 SMP Thu Nov 17 15:25:19 EST 2011 x86_64 GNU/Linux
Create a 20TB sparse loop device on an 11GB XFS filesystem:
$ sudo xfs_io -f -c "truncate 20T" /mnt/scratch/scratch.img
$ sudo losetup /dev/loop0 /mnt/scratch/scratch.img
Make an ext4 filesystem on the loop device:
$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
335544320 inodes, 5368709120 blocks
268435456 blocks (5.00%) reserved for the super user
.....
Mount the filesystem:
$ sudo mount /dev/loop0 /mnt/scratch/scratch
Try to preallocate a 15TB file:
$ sudo time xfs_io -f -F -c "truncate 15T " -c "falloc 0 15T" /mnt/scratch/scratch/foo
The command hangs very soon after this, with these hung processes:
[ 283.437409] SysRq : Show Blocked State
[ 283.438563] task PC stack pid father
[ 283.439837] rsyslogd D ffff88007691ea48 0 1633 1 0x00000000
[ 283.440038] ffff880076a71bf8 0000000000000082 0000000000009f1c ffffffffb135a68d
[ 283.440038] ffff88007691e6c0 ffff880076a71fd8 ffff880076a71fd8 ffff880076a71fd8
[ 283.440038] ffff880031402540 ffff88007691e6c0 ffff880076a71bf8 ffffffff810b59ed
[ 283.440038] Call Trace:
[ 283.440038] [<ffffffff810b59ed>] ? ktime_get_ts+0xad/0xe0
[ 283.440038] [<ffffffff81117650>] ? sleep_on_page+0x20/0x20
[ 283.440038] [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038] [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038] [<ffffffff8111765e>] sleep_on_page_killable+0xe/0x40
[ 283.440038] [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038] [<ffffffff8114c0ff>] ? read_swap_cache_async+0x4f/0x160
[ 283.440038] [<ffffffff81119d60>] wait_on_page_bit_killable+0x80/0x90
[ 283.440038] [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038] [<ffffffff81119e16>] __lock_page_or_retry+0xa6/0xd0
[ 283.440038] [<ffffffff8113bba7>] handle_pte_fault+0x887/0x8b0
[ 283.440038] [<ffffffff8169ee6d>] ? copy_user_generic_string+0x2d/0x40
[ 283.440038] [<ffffffff8113bee5>] handle_mm_fault+0x155/0x250
[ 283.440038] [<ffffffff81ab0a62>] do_page_fault+0x142/0x4f0
[ 283.440038] [<ffffffff81ab0565>] do_async_page_fault+0x35/0x80
[ 283.440038] [<ffffffff81aad935>] async_page_fault+0x25/0x30
[ 283.440038] loop0 D ffff8800775fe4c8 0 2658 2 0x00000000
[ 283.440038] ffff88007b6b7810 0000000000000046 ffff88007b6b77f0 ffff88007fd12580
[ 283.440038] ffff8800775fe140 ffff88007b6b7fd8 ffff88007b6b7fd8 ffff88007b6b7fd8
[ 283.440038] ffff88007b1ea740 ffff8800775fe140 ffff88007b6b77f0 ffffffff81aad47e
[ 283.440038] Call Trace:
[ 283.440038] [<ffffffff81aad47e>] ? _raw_spin_lock_irqsave+0x2e/0x40
[ 283.440038] [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038] [<ffffffff81aab894>] schedule_timeout+0x144/0x2d0
[ 283.440038] [<ffffffff810985d0>] ? usleep_range+0x50/0x50
[ 283.440038] [<ffffffff81aab6c2>] io_schedule_timeout+0xa2/0x100
[ 283.440038] [<ffffffff81121b21>] ? bdi_dirty_limit+0x31/0xc0
[ 283.440038] [<ffffffff811221f8>] balance_dirty_pages_ratelimited_nr+0x298/0x6d0
[ 283.440038] [<ffffffff81190d94>] ? block_write_end+0x44/0x80
[ 283.440038] [<ffffffff81117bc8>] generic_file_buffered_write+0x1a8/0x250
[ 283.440038] [<ffffffff8142262c>] xfs_file_buffered_aio_write+0xec/0x1b0
[ 283.440038] [<ffffffff8142285a>] xfs_file_aio_write+0x16a/0x2a0
[ 283.440038] [<ffffffff8115f682>] do_sync_write+0xd2/0x110
[ 283.440038] [<ffffffff81081856>] ? load_balance+0xb6/0x8e0
[ 283.440038] [<ffffffff8184a074>] __do_lo_send_write+0x54/0xa0
[ 283.440038] [<ffffffff8184a3b1>] do_lo_send_direct_write+0x81/0xa0
[ 283.440038] [<ffffffff8184af77>] do_bio_filebacked+0x227/0x2e0
[ 283.440038] [<ffffffff8184a330>] ? transfer_xor+0xf0/0xf0
[ 283.440038] [<ffffffff8119401d>] ? bio_free+0x4d/0x60
[ 283.440038] [<ffffffff81194045>] ? bio_fs_destructor+0x15/0x20
[ 283.440038] [<ffffffff8119346b>] ? bio_put+0x2b/0x30
[ 283.440038] [<ffffffff8184bf42>] loop_thread+0xc2/0x250
[ 283.440038] [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038] [<ffffffff8184be80>] ? loop_set_status_old+0x1e0/0x1e0
[ 283.440038] [<ffffffff810ab87c>] kthread+0x8c/0xa0
[ 283.440038] [<ffffffff81ab7174>] kernel_thread_helper+0x4/0x10
[ 283.440038] [<ffffffff810ab7f0>] ? flush_kthread_worker+0xa0/0xa0
[ 283.440038] [<ffffffff81ab7170>] ? gs_change+0x13/0x13
[ 283.440038] jbd2/loop0-8 D ffff88000ca08748 0 2813 2 0x00000000
[ 283.440038] ffff88000c9f7bc0 0000000000000046 0000000000000000 ffffffffb135a68d
[ 283.440038] ffff88000ca083c0 ffff88000c9f7fd8 ffff88000c9f7fd8 ffff88000c9f7fd8
[ 283.440038] ffff88007c8602c0 ffff88000ca083c0 ffff88000c9f7bc0 00000001810b59ed
[ 283.440038] Call Trace:
[ 283.440038] [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038] [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038] [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038] [<ffffffff8118eece>] sleep_on_buffer+0xe/0x20
[ 283.440038] [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038] [<ffffffff8167e177>] ? generic_make_request+0xc7/0x100
[ 283.440038] [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038] [<ffffffff81aabcdc>] out_of_line_wait_on_bit+0x7c/0x90
[ 283.440038] [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038] [<ffffffff8118eebe>] __wait_on_buffer+0x2e/0x30
[ 283.440038] [<ffffffff8129d2ef>] jbd2_journal_commit_transaction+0xb0f/0x15d0
[ 283.440038] [<ffffffff810998da>] ? try_to_del_timer_sync+0x8a/0x110
[ 283.440038] [<ffffffff812a15db>] kjournald2+0xbb/0x220
[ 283.440038] [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038] [<ffffffff812a1520>] ? commit_timeout+0x10/0x10
[ 283.440038] [<ffffffff810ab87c>] kthread+0x8c/0xa0
[ 283.440038] [<ffffffff81ab7174>] kernel_thread_helper+0x4/0x10
[ 283.440038] [<ffffffff810ab7f0>] ? flush_kthread_worker+0xa0/0xa0
[ 283.440038] [<ffffffff81ab7170>] ? gs_change+0x13/0x13
[ 283.440038] xfs_io D ffff880044fcca88 0 2817 2816 0x00000000
[ 283.440038] ffff880012b77ac8 0000000000000086 0000000000000000 ffffffffb135a68d
[ 283.440038] ffff880044fcc700 ffff880012b77fd8 ffff880012b77fd8 ffff880012b77fd8
[ 283.440038] ffff88007c8602c0 ffff880044fcc700 ffff880012b77ac8 00000001810b59ed
[ 283.440038] Call Trace:
[ 283.440038] [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038] [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038] [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038] [<ffffffff8118eece>] sleep_on_buffer+0xe/0x20
[ 283.440038] [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038] [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038] [<ffffffff81aabcdc>] out_of_line_wait_on_bit+0x7c/0x90
[ 283.440038] [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038] [<ffffffff8118eebe>] __wait_on_buffer+0x2e/0x30
[ 283.440038] [<ffffffff8129f531>] jbd2_log_do_checkpoint+0x4c1/0x4e0
[ 283.440038] [<ffffffff8129f5ed>] __jbd2_log_wait_for_space+0x9d/0x1b0
[ 283.440038] [<ffffffff8129a4e0>] start_this_handle.isra.9+0x390/0x490
[ 283.440038] [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038] [<ffffffff8129a6aa>] jbd2__journal_start+0xca/0x110
[ 283.440038] [<ffffffff8129a703>] jbd2_journal_start+0x13/0x20
[ 283.440038] [<ffffffff8127189f>] ext4_journal_start_sb+0x7f/0x1d0
[ 283.440038] [<ffffffff8127ddd4>] ? ext4_fallocate+0x1a4/0x530
[ 283.440038] [<ffffffff8127ddd4>] ext4_fallocate+0x1a4/0x530
[ 283.440038] [<ffffffff8115e992>] do_fallocate+0xf2/0x160
[ 283.440038] [<ffffffff8115ea4b>] sys_fallocate+0x4b/0x70
[ 283.440038] [<ffffffff81ab5082>] system_call_fastpath+0x16/0x1b
[ 283.440038] Sched Debug Version: v0.10, 3.2.0-rc2-dgc+ #89
Looks like the prealloc created lots of dirty pages:
$ cat /proc/meminfo
MemTotal: 2050356 kB
MemFree: 17676 kB
Buffers: 518888 kB
Cached: 1367132 kB
SwapCached: 1448 kB
Active: 423444 kB
Inactive: 1476300 kB
Active(anon): 8948 kB
Inactive(anon): 4796 kB
Active(file): 414496 kB
Inactive(file): 1471504 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 497976 kB
SwapFree: 491956 kB
Dirty: 392664 kB
Writeback: 0 kB
AnonPages: 12416 kB
Mapped: 6896 kB
Shmem: 24 kB
Slab: 105844 kB
SReclaimable: 92252 kB
SUnreclaim: 13592 kB
KernelStack: 848 kB
PageTables: 2508 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1523152 kB
Committed_AS: 61012 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 15260 kB
VmallocChunk: 34359723095 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 6132 kB
DirectMap2M: 2091008 kB
$
$
$ cat /proc/vmstat
nr_free_pages 4378
nr_inactive_anon 1199
nr_active_anon 2232
nr_inactive_file 367877
nr_active_file 103629
nr_unevictable 0
nr_mlock 0
nr_anon_pages 3125
nr_mapped 1724
nr_file_pages 471873
nr_dirty 98166
nr_writeback 0
nr_slab_reclaimable 23064
nr_slab_unreclaimable 3416
nr_page_table_pages 628
nr_kernel_stack 106
nr_unstable 0
nr_bounce 0
nr_vmscan_write 1597
nr_vmscan_immediate_reclaim 1752
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 6
nr_dirtied 548796
nr_written 406231
numa_hit 1086286
numa_miss 0
numa_foreign 0
numa_interleave 20982
numa_local 1086286
numa_other 0
nr_anon_transparent_hugepages 0
nr_dirty_threshold 95863
nr_dirty_background_threshold 47931
pgpgin 181756
pgpgout 2417611
pswpin 437
pswpout 1597
pgalloc_dma 7445
pgalloc_dma32 1083448
pgalloc_normal 0
pgalloc_movable 0
pgfree 1095515
pgactivate 104582
pgdeactivate 3069
pgfault 677067
pgmajfault 460
pgrefill_dma 771
pgrefill_dma32 2298
pgrefill_normal 0
pgrefill_movable 0
pgsteal_dma 5415
pgsteal_dma32 90506
pgsteal_normal 0
pgsteal_movable 0
pgscan_kswapd_dma 8719
pgscan_kswapd_dma32 89764
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 2969
pgscan_direct_normal 0
pgscan_direct_movable 0
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 1024
kswapd_steal 92952
kswapd_inodesteal 0
kswapd_low_wmark_hit_quickly 187
kswapd_high_wmark_hit_quickly 0
kswapd_skip_congestion_wait 0
pageoutrun 1083
allocstall 38
pgrotated 1623
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0
And the loop device writeback is stalled in
balance_dirty_pages_nr(). Any further writes to the system result in
those processes also hanging in balance_dirty_pages_nr().
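Pulling the two relevant counters out of that vmstat dump shows the
global limit really has been exceeded:
$ grep -E '^nr_dirty( |_threshold )' /proc/vmstat
nr_dirty 98166
nr_dirty_threshold 95863
That's ~383MB of dirty pages against a ~374MB global limit.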
I thought this might be a one-off, but I rebooted the machine and,
from a clean boot, ran the above commands; it entered the same
state, which is where I pulled the above traces from via sysrq-w.
I'm not sure what is going on yet.
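For reference, the blocked task dump above is just sysrq-w output; it
can be triggered from a shell with:
$ echo w | sudo tee /proc/sysrq-trigger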
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
2011-11-21 14:20 [3.2-rc2] loop device balance_dirty_pages_nr throttling hang Dave Chinner
@ 2011-11-22 3:56 ` Wu Fengguang
2011-11-22 10:29 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Wu Fengguang @ 2011-11-22 3:56 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Hi Dave,
On Mon, Nov 21, 2011 at 10:20:56PM +0800, Dave Chinner wrote:
> Hi Fengguang,
>
> I just found a way of hanging a system and taking it down. I haven't
> tried to narrow down the test case - it's pretty simple - because it's
> time for sleep here.
Yeah, once the global dirty limit is exceeded, the system will appear
to hang because many applications will block in balance_dirty_pages().
I created a script for this case, however I cannot reproduce it...
The test box has 32GB memory and a 110GB /dev/sda7, so I explicitly
lowered dirty_bytes to 400MB and used xfs "-d size=10g" in the script.
During the test run on 3.2.0-rc1, I find that the dirty pages rarely
exceed the background dirty threshold (200MB).
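The dirty numbers can also be watched live through the tracepoints the
script enables, e.g.:
$ grep global_dirty_state /debug/tracing/trace_pipe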
Would you try running this and see if it's a problem with the test script?
root@snb /home/wfg# cat ./test-loop-fallocate.sh
#!/bin/sh
# !!!change and uncomment this before run!!!
# DEV=/dev/sda7

# enable the writeback tracepoints so dirty throttling can be observed
echo 1 > /debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /debug/tracing/events/writeback/global_dirty_state/enable
# cap the global dirty limit at 400MB
echo $((400<<20)) > /proc/sys/vm/dirty_bytes

# small XFS filesystem to host the sparse image
mkfs.xfs -f -d size=10g $DEV
mount $DEV /mnt/scratch

# 20TB sparse image on a loop device, with ext4 on top
xfs_io -f -c "truncate 20T" /mnt/scratch/scratch.img
losetup /dev/loop0 /mnt/scratch/scratch.img
mkfs.ext4 /dev/loop0
mkdir /mnt/scratch/scratch
mount /dev/loop0 /mnt/scratch/scratch

# the step that hangs on Dave's box
time xfs_io -f -F -c "truncate 15T " -c "falloc 0 15T" /mnt/scratch/scratch/foo

umount /mnt/scratch/scratch
losetup -d /dev/loop0
umount /mnt/scratch
root@snb /home/wfg# ./test-loop-fallocate.sh
meta-data=/dev/sda7 isize=256 agcount=4, agsize=655360 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=2621440, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
335544320 inodes, 5368709120 blocks
268435456 blocks (5.00%) reserved for the super user
First data block=0
163840 block groups
32768 blocks per group, 32768 fragments per group
2048 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848, 512000000, 550731776, 644972544, 1934917632,
2560000000, 3855122432
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
real 0m38.323s
user 0m0.000s
sys 0m25.203s
Thanks,
Fengguang
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
2011-11-22 3:56 ` Wu Fengguang
@ 2011-11-22 10:29 ` Dave Chinner
2011-11-22 10:49 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2011-11-22 10:29 UTC (permalink / raw)
To: Wu Fengguang; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
On Tue, Nov 22, 2011 at 11:56:29AM +0800, Wu Fengguang wrote:
> Hi Dave,
>
> On Mon, Nov 21, 2011 at 10:20:56PM +0800, Dave Chinner wrote:
> > Hi Fengguang,
> >
> > I just found a way of hanging a system and taking it down. I haven't
> > tried to narrow down the test case - it's pretty simple - because it's
> > time for sleep here.
>
> Yeah, once the global dirty limit is exceeded, the system will appear
> to hang because many applications will block in balance_dirty_pages().
>
> I created a script for this case, however I cannot reproduce it...
>
> The test box has 32GB memory and a 110GB /dev/sda7, so I explicitly
> lowered dirty_bytes to 400MB and used xfs "-d size=10g" in the script.
The VM I was running was a 2p, 2GB RAM config running on a 7200rpm
SATA drive, so maybe all your extra RAM has some impact on it.
> During the test run on 3.2.0-rc1, I find that the dirty pages rarely
> exceed the background dirty threshold (200MB).
Which means your IO rates are high enough to keep the number of
dirty pages under control?
> Would you try running this and see if it's a problem with the test script?
>
> root@snb /home/wfg# cat ./test-loop-fallocate.sh
....
Ok, so using your script my system doesn't hang, either.
I suspect the difference is that I was reproducing this with a used
image file. It had somewhere in the order of 750MB of space used
prior to running the test. I'd been using the image to test large
filesystem support for xfstests. I'd done a bunch of testing with
XFS on the loop device, and was trying to get the ext4 support to
work when I was seeing these hangs. I manually ran the
losetup/mount/mkfs loopdev/mount/falloc step to get it to hang.
So I think the state of the underlying image file has something to
do with the hang. Most likely due to the IO rates, I think....
I'll try to reproduce it by running xfstests on XFS on it again
before trying ext4 again (your script blew away the old image file I
had). Alternatively, you can try writing lots of small random blocks
to the image file before running the ext4 portion of the test.
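Something along these lines (an untested sketch - the block count and
offset range are just illustrative, and it uses bash's $RANDOM) should
scatter enough small dirty writes through the image:
for i in $(seq 1 30000); do
    dd if=/dev/zero of=/mnt/scratch/scratch.img bs=4k count=1 \
       seek=$((RANDOM * 32768 + RANDOM)) conv=notrunc 2>/dev/null
done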
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
2011-11-22 10:29 ` Dave Chinner
@ 2011-11-22 10:49 ` Dave Chinner
2011-11-22 13:17 ` Theodore Tso
0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2011-11-22 10:49 UTC (permalink / raw)
To: Wu Fengguang; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
On Tue, Nov 22, 2011 at 09:29:31PM +1100, Dave Chinner wrote:
> I suspect the difference is that I was reproducing this with a used
> image file. It had somewhere in the order of 750MB of space used
> prior to running the test. I'd been using the image to test large
> filesystem support for xfstests. I'd done a bunch of testing with
> XFS on the loop device, and was trying to get the ext4 support to
> work when I was seeing these hangs. I manually ran the
> losetup/mount/mkfs loopdev/mount/falloc step to get it to hang.
>
> So I think the state of the underlying image file has something to
> do with the hang. Most likely due to the IO rates, I think....
>
> I'll try to reproduce it by running xfstests on XFS on it again
> before trying ext4 again (your script blew away the old image file I
> had). Alternatively, you can try writing lots of small random blocks
> to the image file before running the ext4 portion of the test.
At a rough approximation, the image file after xfstests has run on
it for a while has somewhere on the high side of 30000 allocated
extents in it. So that could have a significant impact on the layout
of the file and the IO patterns that result.
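If you want to compare on your side, a rough extent count can be had
from xfs_bmap (it prints one line per extent or hole, so this is only
approximate):
$ sudo xfs_bmap /mnt/scratch/scratch.img | wc -l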
But I just noticed that the discard that mkfs.ext4 issues will result
in all the extents being discarded from the underlying image file
(due to the fact that the loopback device now supports hole punching),
so the previous state of the image file is getting trashed by the
mkfs.ext4 execution, too. I think I already had an ext4 filesystem in
some state before I started running this manual prealloc test....
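A quick way to see how much the mkfs punches out of the image would be
to compare the blocks actually allocated (first column) against the
apparent size of the sparse file before and after it runs:
$ ls -ls /mnt/scratch/scratch.img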
So, let me go back to running test 223 and then try again after
running mkfs on the loop device.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
2011-11-22 10:49 ` Dave Chinner
@ 2011-11-22 13:17 ` Theodore Tso
2011-11-22 19:56 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Tso @ 2011-11-22 13:17 UTC (permalink / raw)
To: Dave Chinner
Cc: Theodore Tso, Wu Fengguang, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
On Nov 22, 2011, at 5:49 AM, Dave Chinner wrote:
> But I just noticed that the discard that mkfs.ext4 issues will result
> in all the extents being discarded from the underlying image file (due
> to the fact that the loopback device now supports hole punching), so
> the previous state of the image file is getting trashed by the
> mkfs.ext4 execution, too. I think I already had an ext4 filesystem in
> some state before I started running this manual prealloc test....
Which version of mkfs.ext4 are you using? We've disabled the discard
by default (unless configured in via a command-line option or a
/etc/mke2fs.conf setting) since no distribution apparently wants to be
responsible for supplying a command that causes a crap SSD to turn
into a brick; personally, I'd blame the manufacturer of the crap SSD,
but….
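If you want to control it explicitly, the behaviour can be forced
either way with the -E extended options (assuming a new enough
e2fsprogs):
    mkfs.ext4 -E discard /dev/loop0      # always run the discard pass
    mkfs.ext4 -E nodiscard /dev/loop0    # never run it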
-- Ted
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
2011-11-22 13:17 ` Theodore Tso
@ 2011-11-22 19:56 ` Dave Chinner
0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2011-11-22 19:56 UTC (permalink / raw)
To: Theodore Tso
Cc: Wu Fengguang, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
On Tue, Nov 22, 2011 at 08:17:07AM -0500, Theodore Tso wrote:
>
> On Nov 22, 2011, at 5:49 AM, Dave Chinner wrote:
>
> > But I just noticed that the discard that mkfs.ext4 issues will
> > result in all the extents being discarded from the underlying image
> > file (due to the fact that the loopback device now supports hole
> > punching), so the previous state of the image file is getting
> > trashed by the mkfs.ext4 execution, too. I think I already had an
> > ext4 filesystem in some state before I started running this manual
> > prealloc test....
>
> Which version of mkfs.ext4 are you using? We've disabled the
> discard by default (unless configured in via a command-line option
> or a /etc/mke2fs.conf setting) since no distribution apparently
> wants to be responsible for supplying a command that causes a crap
> SSD to turn into a brick; personally, I'd blame the manufacturer of
> the crap SSD, but….
Whatever I've got installed on my test machines. The machine I
reproduced the problem on originally with sparse files is running
mke2fs 1.42-WIP (16-Oct-2011), which is the latest in Debian
unstable; the other, which is using a real 17TB array, is running
mke2fs 1.42-WIP (02-Jul-2011). I saw the discard occurring.
BTW, on a 5PB sparse image file I get this error from the latest
mke2fs above:
$ ls -lh /mnt/scratch/scratch.img
-rw------- 1 root root 4.9P Nov 23 06:50 /mnt/scratch/scratch.img
$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
/dev/loop0: Cannot create filesystem with requested number of inodes while setting up superblock
$
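That looks like the default bytes-per-inode ratio pushing the
requested inode count past ext4's 2^32 inode limit; something like a
much larger -i ratio (untried, just a guess) might get mkfs past it:
$ sudo mkfs.ext4 -i 4194304 /dev/loop0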
This version does emit that it is discarding blocks:
$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
.....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com