* [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
@ 2011-11-21 14:20 Dave Chinner
  2011-11-22  3:56 ` Wu Fengguang
  0 siblings, 1 reply; 6+ messages in thread

From: Dave Chinner @ 2011-11-21 14:20 UTC
To: fengguang.wu; +Cc: linux-kernel, linux-fsdevel

Hi Fengguang,

I just found a way of hanging a system and taking it down. I haven't
tried to narrow down the test case - it's pretty simple - because it's
time for sleep here.

$ uname -a
Linux test-2 3.2.0-rc2-dgc+ #89 SMP Thu Nov 17 15:25:19 EST 2011 x86_64 GNU/Linux

Create a 20TB sparse loop device on an 11GB XFS filesystem:

$ sudo xfs_io -f -c "truncate 20T" /mnt/scratch/scratch.img
$ sudo losetup /dev/loop0 /mnt/scratch/scratch.img

Make an ext4 filesystem on the loop device:

$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
335544320 inodes, 5368709120 blocks
268435456 blocks (5.00%) reserved for the super user
.....

Mount the filesystem:

$ sudo mount /dev/loop0 /mnt/scratch/scratch

Try to preallocate a 15TB file:

$ sudo time xfs_io -f -F -c "truncate 15T " -c "falloc 0 15T" /mnt/scratch/scratch/foo

The command hangs very soon after this, with these hung processes:

[ 283.437409] SysRq : Show Blocked State
[ 283.438563]  task                 PC stack  pid father
[ 283.439837] rsyslogd      D ffff88007691ea48     0  1633      1 0x00000000
[ 283.440038]  ffff880076a71bf8 0000000000000082 0000000000009f1c ffffffffb135a68d
[ 283.440038]  ffff88007691e6c0 ffff880076a71fd8 ffff880076a71fd8 ffff880076a71fd8
[ 283.440038]  ffff880031402540 ffff88007691e6c0 ffff880076a71bf8 ffffffff810b59ed
[ 283.440038] Call Trace:
[ 283.440038]  [<ffffffff810b59ed>] ? ktime_get_ts+0xad/0xe0
[ 283.440038]  [<ffffffff81117650>] ? sleep_on_page+0x20/0x20
[ 283.440038]  [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038]  [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038]  [<ffffffff8111765e>] sleep_on_page_killable+0xe/0x40
[ 283.440038]  [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038]  [<ffffffff8114c0ff>] ? read_swap_cache_async+0x4f/0x160
[ 283.440038]  [<ffffffff81119d60>] wait_on_page_bit_killable+0x80/0x90
[ 283.440038]  [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038]  [<ffffffff81119e16>] __lock_page_or_retry+0xa6/0xd0
[ 283.440038]  [<ffffffff8113bba7>] handle_pte_fault+0x887/0x8b0
[ 283.440038]  [<ffffffff8169ee6d>] ? copy_user_generic_string+0x2d/0x40
[ 283.440038]  [<ffffffff8113bee5>] handle_mm_fault+0x155/0x250
[ 283.440038]  [<ffffffff81ab0a62>] do_page_fault+0x142/0x4f0
[ 283.440038]  [<ffffffff81ab0565>] do_async_page_fault+0x35/0x80
[ 283.440038]  [<ffffffff81aad935>] async_page_fault+0x25/0x30
[ 283.440038] loop0         D ffff8800775fe4c8     0  2658      2 0x00000000
[ 283.440038]  ffff88007b6b7810 0000000000000046 ffff88007b6b77f0 ffff88007fd12580
[ 283.440038]  ffff8800775fe140 ffff88007b6b7fd8 ffff88007b6b7fd8 ffff88007b6b7fd8
[ 283.440038]  ffff88007b1ea740 ffff8800775fe140 ffff88007b6b77f0 ffffffff81aad47e
[ 283.440038] Call Trace:
[ 283.440038]  [<ffffffff81aad47e>] ? _raw_spin_lock_irqsave+0x2e/0x40
[ 283.440038]  [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038]  [<ffffffff81aab894>] schedule_timeout+0x144/0x2d0
[ 283.440038]  [<ffffffff810985d0>] ? usleep_range+0x50/0x50
[ 283.440038]  [<ffffffff81aab6c2>] io_schedule_timeout+0xa2/0x100
[ 283.440038]  [<ffffffff81121b21>] ? bdi_dirty_limit+0x31/0xc0
[ 283.440038]  [<ffffffff811221f8>] balance_dirty_pages_ratelimited_nr+0x298/0x6d0
[ 283.440038]  [<ffffffff81190d94>] ? block_write_end+0x44/0x80
[ 283.440038]  [<ffffffff81117bc8>] generic_file_buffered_write+0x1a8/0x250
[ 283.440038]  [<ffffffff8142262c>] xfs_file_buffered_aio_write+0xec/0x1b0
[ 283.440038]  [<ffffffff8142285a>] xfs_file_aio_write+0x16a/0x2a0
[ 283.440038]  [<ffffffff8115f682>] do_sync_write+0xd2/0x110
[ 283.440038]  [<ffffffff81081856>] ? load_balance+0xb6/0x8e0
[ 283.440038]  [<ffffffff8184a074>] __do_lo_send_write+0x54/0xa0
[ 283.440038]  [<ffffffff8184a3b1>] do_lo_send_direct_write+0x81/0xa0
[ 283.440038]  [<ffffffff8184af77>] do_bio_filebacked+0x227/0x2e0
[ 283.440038]  [<ffffffff8184a330>] ? transfer_xor+0xf0/0xf0
[ 283.440038]  [<ffffffff8119401d>] ? bio_free+0x4d/0x60
[ 283.440038]  [<ffffffff81194045>] ? bio_fs_destructor+0x15/0x20
[ 283.440038]  [<ffffffff8119346b>] ? bio_put+0x2b/0x30
[ 283.440038]  [<ffffffff8184bf42>] loop_thread+0xc2/0x250
[ 283.440038]  [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038]  [<ffffffff8184be80>] ? loop_set_status_old+0x1e0/0x1e0
[ 283.440038]  [<ffffffff810ab87c>] kthread+0x8c/0xa0
[ 283.440038]  [<ffffffff81ab7174>] kernel_thread_helper+0x4/0x10
[ 283.440038]  [<ffffffff810ab7f0>] ? flush_kthread_worker+0xa0/0xa0
[ 283.440038]  [<ffffffff81ab7170>] ? gs_change+0x13/0x13
[ 283.440038] jbd2/loop0-8  D ffff88000ca08748     0  2813      2 0x00000000
[ 283.440038]  ffff88000c9f7bc0 0000000000000046 0000000000000000 ffffffffb135a68d
[ 283.440038]  ffff88000ca083c0 ffff88000c9f7fd8 ffff88000c9f7fd8 ffff88000c9f7fd8
[ 283.440038]  ffff88007c8602c0 ffff88000ca083c0 ffff88000c9f7bc0 00000001810b59ed
[ 283.440038] Call Trace:
[ 283.440038]  [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038]  [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038]  [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038]  [<ffffffff8118eece>] sleep_on_buffer+0xe/0x20
[ 283.440038]  [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038]  [<ffffffff8167e177>] ? generic_make_request+0xc7/0x100
[ 283.440038]  [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038]  [<ffffffff81aabcdc>] out_of_line_wait_on_bit+0x7c/0x90
[ 283.440038]  [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038]  [<ffffffff8118eebe>] __wait_on_buffer+0x2e/0x30
[ 283.440038]  [<ffffffff8129d2ef>] jbd2_journal_commit_transaction+0xb0f/0x15d0
[ 283.440038]  [<ffffffff810998da>] ? try_to_del_timer_sync+0x8a/0x110
[ 283.440038]  [<ffffffff812a15db>] kjournald2+0xbb/0x220
[ 283.440038]  [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038]  [<ffffffff812a1520>] ? commit_timeout+0x10/0x10
[ 283.440038]  [<ffffffff810ab87c>] kthread+0x8c/0xa0
[ 283.440038]  [<ffffffff81ab7174>] kernel_thread_helper+0x4/0x10
[ 283.440038]  [<ffffffff810ab7f0>] ? flush_kthread_worker+0xa0/0xa0
[ 283.440038]  [<ffffffff81ab7170>] ? gs_change+0x13/0x13
[ 283.440038] xfs_io        D ffff880044fcca88     0  2817   2816 0x00000000
[ 283.440038]  ffff880012b77ac8 0000000000000086 0000000000000000 ffffffffb135a68d
[ 283.440038]  ffff880044fcc700 ffff880012b77fd8 ffff880012b77fd8 ffff880012b77fd8
[ 283.440038]  ffff88007c8602c0 ffff880044fcc700 ffff880012b77ac8 00000001810b59ed
[ 283.440038] Call Trace:
[ 283.440038]  [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038]  [<ffffffff81aab3af>] schedule+0x3f/0x60
[ 283.440038]  [<ffffffff81aab45f>] io_schedule+0x8f/0xd0
[ 283.440038]  [<ffffffff8118eece>] sleep_on_buffer+0xe/0x20
[ 283.440038]  [<ffffffff81aabc2f>] __wait_on_bit+0x5f/0x90
[ 283.440038]  [<ffffffff8118eec0>] ? __wait_on_buffer+0x30/0x30
[ 283.440038]  [<ffffffff81aabcdc>] out_of_line_wait_on_bit+0x7c/0x90
[ 283.440038]  [<ffffffff810ac360>] ? autoremove_wake_function+0x40/0x40
[ 283.440038]  [<ffffffff8118eebe>] __wait_on_buffer+0x2e/0x30
[ 283.440038]  [<ffffffff8129f531>] jbd2_log_do_checkpoint+0x4c1/0x4e0
[ 283.440038]  [<ffffffff8129f5ed>] __jbd2_log_wait_for_space+0x9d/0x1b0
[ 283.440038]  [<ffffffff8129a4e0>] start_this_handle.isra.9+0x390/0x490
[ 283.440038]  [<ffffffff810ac320>] ? add_wait_queue+0x60/0x60
[ 283.440038]  [<ffffffff8129a6aa>] jbd2__journal_start+0xca/0x110
[ 283.440038]  [<ffffffff8129a703>] jbd2_journal_start+0x13/0x20
[ 283.440038]  [<ffffffff8127189f>] ext4_journal_start_sb+0x7f/0x1d0
[ 283.440038]  [<ffffffff8127ddd4>] ? ext4_fallocate+0x1a4/0x530
[ 283.440038]  [<ffffffff8127ddd4>] ext4_fallocate+0x1a4/0x530
[ 283.440038]  [<ffffffff8115e992>] do_fallocate+0xf2/0x160
[ 283.440038]  [<ffffffff8115ea4b>] sys_fallocate+0x4b/0x70
[ 283.440038]  [<ffffffff81ab5082>] system_call_fastpath+0x16/0x1b
[ 283.440038] Sched Debug Version: v0.10, 3.2.0-rc2-dgc+ #89

Looks like the prealloc created lots of dirty pages:

$ cat /proc/meminfo
MemTotal:        2050356 kB
MemFree:           17676 kB
Buffers:          518888 kB
Cached:          1367132 kB
SwapCached:         1448 kB
Active:           423444 kB
Inactive:        1476300 kB
Active(anon):       8948 kB
Inactive(anon):     4796 kB
Active(file):     414496 kB
Inactive(file):  1471504 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        497976 kB
SwapFree:         491956 kB
Dirty:            392664 kB
Writeback:             0 kB
AnonPages:         12416 kB
Mapped:             6896 kB
Shmem:                24 kB
Slab:             105844 kB
SReclaimable:      92252 kB
SUnreclaim:        13592 kB
KernelStack:         848 kB
PageTables:         2508 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1523152 kB
Committed_AS:      61012 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       15260 kB
VmallocChunk:   34359723095 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        6132 kB
DirectMap2M:     2091008 kB

$ cat /proc/vmstat
nr_free_pages 4378
nr_inactive_anon 1199
nr_active_anon 2232
nr_inactive_file 367877
nr_active_file 103629
nr_unevictable 0
nr_mlock 0
nr_anon_pages 3125
nr_mapped 1724
nr_file_pages 471873
nr_dirty 98166
nr_writeback 0
nr_slab_reclaimable 23064
nr_slab_unreclaimable 3416
nr_page_table_pages 628
nr_kernel_stack 106
nr_unstable 0
nr_bounce 0
nr_vmscan_write 1597
nr_vmscan_immediate_reclaim 1752
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 6
nr_dirtied 548796
nr_written 406231
numa_hit 1086286
numa_miss 0
numa_foreign 0
numa_interleave 20982
numa_local 1086286
numa_other 0
nr_anon_transparent_hugepages 0
nr_dirty_threshold 95863
nr_dirty_background_threshold 47931
pgpgin 181756
pgpgout 2417611
pswpin 437
pswpout 1597
pgalloc_dma 7445
pgalloc_dma32 1083448
pgalloc_normal 0
pgalloc_movable 0
pgfree 1095515
pgactivate 104582
pgdeactivate 3069
pgfault 677067
pgmajfault 460
pgrefill_dma 771
pgrefill_dma32 2298
pgrefill_normal 0
pgrefill_movable 0
pgsteal_dma 5415
pgsteal_dma32 90506
pgsteal_normal 0
pgsteal_movable 0
pgscan_kswapd_dma 8719
pgscan_kswapd_dma32 89764
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 2969
pgscan_direct_normal 0
pgscan_direct_movable 0
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 1024
kswapd_steal 92952
kswapd_inodesteal 0
kswapd_low_wmark_hit_quickly 187
kswapd_high_wmark_hit_quickly 0
kswapd_skip_congestion_wait 0
pageoutrun 1083
allocstall 38
pgrotated 1623
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

And the loop device writeback is stalled in balance_dirty_pages_nr(). Any
further writes to the system result in those processes also hanging in
balance_dirty_pages_nr().

I thought this might be a one-off, but I rebooted the machine and, from a
clean boot, ran the above commands and it entered the same state, which is
where I pulled out the above traces via sysrq-w.

I'm not sure what is going on yet.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
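For reference, one way to watch the throttling while reproducing this is to
enable the writeback tracepoints and poll the dirty counters; roughly as
follows (an untested sketch, assuming debugfs is mounted at /debug as in the
script further down the thread):

$ echo 1 > /debug/tracing/events/writeback/balance_dirty_pages/enable
$ echo 1 > /debug/tracing/events/writeback/global_dirty_state/enable
$ while sleep 5; do
>     # dirty/writeback pages vs. the current dirty thresholds
>     grep -E '^(Dirty|Writeback):' /proc/meminfo
>     grep -E '^nr_dirty(_background)?_threshold ' /proc/vmstat
> done

Once the hang sets in, Dirty sits pinned at or above the threshold while
Writeback stays at zero, as in the numbers above.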
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
  2011-11-21 14:20 [3.2-rc2] loop device balance_dirty_pages_nr throttling hang Dave Chinner
@ 2011-11-22  3:56 ` Wu Fengguang
  2011-11-22 10:29   ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread

From: Wu Fengguang @ 2011-11-22 3:56 UTC
To: Dave Chinner; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

Hi Dave,

On Mon, Nov 21, 2011 at 10:20:56PM +0800, Dave Chinner wrote:
> Hi Fengguang,
>
> I just found a way of hanging a system and taking it down. I haven't
> tried to narrow down the test case - it's pretty simple - because it's
> time for sleep here.

Yeah, once the global dirty limit is exceeded, the system would appear to
hang because many applications will block in balance_dirty_pages().

I created a script for this case, however I cannot reproduce it.

The test box has 32GB memory and a 110GB /dev/sda7, so I lowered
dirty_bytes to 400MB and sized the XFS filesystem with "-d size=10g"
explicitly in the script.

During the test run on 3.2.0-rc1, I found the dirty pages rarely exceed
the background dirty threshold (200MB).

Would you try running this and see if it's a problem with the test script?

root@snb /home/wfg# cat ./test-loop-fallocate.sh
#!/bin/sh

# !!!change and uncomment this before run!!!
# DEV=/dev/sda7

echo 1 > /debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /debug/tracing/events/writeback/global_dirty_state/enable
echo $((400<<20)) > /proc/sys/vm/dirty_bytes

mkfs.xfs -f -d size=10g $DEV
mount $DEV /mnt/scratch

xfs_io -f -c "truncate 20T" /mnt/scratch/scratch.img
losetup /dev/loop0 /mnt/scratch/scratch.img

mkfs.ext4 /dev/loop0
mkdir /mnt/scratch/scratch
mount /dev/loop0 /mnt/scratch/scratch

time xfs_io -f -F -c "truncate 15T " -c "falloc 0 15T" /mnt/scratch/scratch/foo

umount /mnt/scratch/scratch
losetup -d /dev/loop0
umount /mnt/scratch

root@snb /home/wfg# ./test-loop-fallocate.sh
meta-data=/dev/sda7              isize=256    agcount=4, agsize=655360 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
335544320 inodes, 5368709120 blocks
268435456 blocks (5.00%) reserved for the super user
First data block=0
163840 block groups
32768 blocks per group, 32768 fragments per group
2048 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
        2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616,
        78675968, 102400000, 214990848, 512000000, 550731776, 644972544,
        1934917632, 2560000000, 3855122432

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

real    0m38.323s
user    0m0.000s
sys     0m25.203s

Thanks,
Fengguang
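To see how close a given box actually gets to those limits, the page counts
in /proc/vmstat can be turned into MB with something like this (a quick
sketch, assuming 4k pages):

$ awk '/^nr_dirty |_threshold /{printf "%-30s %6d MB\n", $1, $2*4096/1048576}' /proc/vmstat

Run against the numbers Dave posted above, that gives nr_dirty at roughly
383MB against a ~374MB nr_dirty_threshold and a ~187MB background threshold,
i.e. the writers were already being throttled at the hard limit there.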
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
  2011-11-22  3:56 ` Wu Fengguang
@ 2011-11-22 10:29   ` Dave Chinner
  2011-11-22 10:49     ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread

From: Dave Chinner @ 2011-11-22 10:29 UTC
To: Wu Fengguang; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

On Tue, Nov 22, 2011 at 11:56:29AM +0800, Wu Fengguang wrote:
> Hi Dave,
>
> On Mon, Nov 21, 2011 at 10:20:56PM +0800, Dave Chinner wrote:
> > Hi Fengguang,
> >
> > I just found a way of hanging a system and taking it down. I haven't
> > tried to narrow down the test case - it's pretty simple - because it's
> > time for sleep here.
>
> Yeah, once the global dirty limit is exceeded, the system would appear to
> hang because many applications will block in balance_dirty_pages().
>
> I created a script for this case, however I cannot reproduce it.
>
> The test box has 32GB memory and a 110GB /dev/sda7, so I lowered
> dirty_bytes to 400MB and sized the XFS filesystem with "-d size=10g"
> explicitly in the script.

The VM I was running was a 2p, 2GB RAM config running on a 7200rpm SATA
drive, so maybe all your extra RAM has some impact on it.

> During the test run on 3.2.0-rc1, I found the dirty pages rarely exceed
> the background dirty threshold (200MB).

Which means your IO rates are high enough to keep the number of dirty
pages under control?

> Would you try running this and see if it's a problem with the test script?
>
> root@snb /home/wfg# cat ./test-loop-fallocate.sh
....

Ok, so using your script my system doesn't hang, either.

I suspect the difference is that I was reproducing this with a used image
file. It had somewhere in the order of 750MB of space used prior to
running the test. I'd been using the image to test large filesystem
support for xfstests. I'd done a bunch of testing with XFS on the loop
device, and was trying to get the ext4 support to work when I was seeing
these hangs. I manually ran the losetup/mount/mkfs loopdev/mount/falloc
steps to get it to hang.

So I think the state of the underlying image file has something to do
with the hang. Most likely due to the IO rates, I think....

I'll try to reproduce it by running xfstests on XFS on it again before
trying ext4 again (your script blew away the old image file I had).
Alternatively, you can try writing lots of small random blocks to the
image file before running the ext4 portion of the test.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
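If it helps, a crude way to age the image before the ext4 step is to scatter
a few thousand small writes across it, along these lines (untested sketch;
needs a shell with $RANDOM, and the loop count and offsets are arbitrary):

$ for i in $(seq 1 10000); do
>     # pick a pseudo-random 4k-aligned offset in the first ~4TB of the image
>     off=$(( (RANDOM * 32768 + RANDOM) * 4096 ))
>     xfs_io -f -c "pwrite $off 4096" /mnt/scratch/scratch.img > /dev/null
> done
$ sync

That only writes about 40MB of data, but it should leave the backing file
with lots of small, scattered extents.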
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
  2011-11-22 10:29   ` Dave Chinner
@ 2011-11-22 10:49     ` Dave Chinner
  2011-11-22 13:17       ` Theodore Tso
  0 siblings, 1 reply; 6+ messages in thread

From: Dave Chinner @ 2011-11-22 10:49 UTC
To: Wu Fengguang; +Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

On Tue, Nov 22, 2011 at 09:29:31PM +1100, Dave Chinner wrote:
> I suspect the difference is that I was reproducing this with a used image
> file. It had somewhere in the order of 750MB of space used prior to
> running the test. I'd been using the image to test large filesystem
> support for xfstests. I'd done a bunch of testing with XFS on the loop
> device, and was trying to get the ext4 support to work when I was seeing
> these hangs. I manually ran the losetup/mount/mkfs loopdev/mount/falloc
> steps to get it to hang.
>
> So I think the state of the underlying image file has something to do
> with the hang. Most likely due to the IO rates, I think....
>
> I'll try to reproduce it by running xfstests on XFS on it again before
> trying ext4 again (your script blew away the old image file I had).
> Alternatively, you can try writing lots of small random blocks to the
> image file before running the ext4 portion of the test.

At a rough approximation, the image file after xfstests has run on it for
a while has somewhere on the high side of 30000 allocated extents in it.
So that could have a significant impact on the layout of the file and the
IO patterns that result.

But I just noticed that the discard that mkfs.ext4 issues will result in
all the extents being discarded in the underlying image file (due to the
fact that the loopback device now supports hole punching), so the previous
state of the image file is getting trashed by the mkfs.ext4 execution,
too. I think I already had an ext4 filesystem in some state before I
started running this manual prealloc test....

So, let me go back to running test 223 and then trying again after having
run mkfs on the loop device.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
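To confirm that theory, the extent count of the backing file can be compared
before and after the mkfs.ext4 run, e.g. with either of:

$ sudo filefrag /mnt/scratch/scratch.img
$ sudo xfs_bmap /mnt/scratch/scratch.img | wc -l

filefrag reports the extent count directly; the xfs_bmap line count is only
a rough proxy, since its output also includes hole entries.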
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
  2011-11-22 10:49     ` Dave Chinner
@ 2011-11-22 13:17       ` Theodore Tso
  2011-11-22 19:56         ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread

From: Theodore Tso @ 2011-11-22 13:17 UTC
To: Dave Chinner
Cc: Theodore Tso, Wu Fengguang, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

On Nov 22, 2011, at 5:49 AM, Dave Chinner wrote:

> But I just noticed that the discard that mkfs.ext4 issues will result in
> all the extents being discarded in the underlying image file (due to the
> fact that the loopback device now supports hole punching), so the previous
> state of the image file is getting trashed by the mkfs.ext4 execution,
> too. I think I already had an ext4 filesystem in some state before I
> started running this manual prealloc test....

Which version of mkfs.ext4 are you using?  We've disabled the discard by
default (unless configured in via a command-line option or an
/etc/mke2fs.conf setting), since no distribution apparently wants to be
responsible for supplying a command that causes a crap SSD to turn into a
brick; personally, I'd blame the manufacturer of the crap SSD, but….

-- Ted
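For a version that does still discard by default, it can be suppressed
per-invocation with the extended option below, or (presumably) via the
config file; the exact mke2fs.conf relation name here is a guess, so treat
the snippet as a sketch rather than documented syntax:

$ sudo mkfs.ext4 -E nodiscard /dev/loop0

# in /etc/mke2fs.conf
[defaults]
        discard = 0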
* Re: [3.2-rc2] loop device balance_dirty_pages_nr throttling hang
  2011-11-22 13:17       ` Theodore Tso
@ 2011-11-22 19:56         ` Dave Chinner
  0 siblings, 0 replies; 6+ messages in thread

From: Dave Chinner @ 2011-11-22 19:56 UTC
To: Theodore Tso
Cc: Wu Fengguang, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

On Tue, Nov 22, 2011 at 08:17:07AM -0500, Theodore Tso wrote:
>
> On Nov 22, 2011, at 5:49 AM, Dave Chinner wrote:
>
> > But I just noticed that the discard that mkfs.ext4 issues will result
> > in all the extents being discarded in the underlying image file (due
> > to the fact that the loopback device now supports hole punching), so
> > the previous state of the image file is getting trashed by the
> > mkfs.ext4 execution, too. I think I already had an ext4 filesystem in
> > some state before I started running this manual prealloc test....
>
> Which version of mkfs.ext4 are you using?  We've disabled the discard by
> default (unless configured in via a command-line option or an
> /etc/mke2fs.conf setting), since no distribution apparently wants to be
> responsible for supplying a command that causes a crap SSD to turn into
> a brick; personally, I'd blame the manufacturer of the crap SSD, but….

Whatever I've got installed on my test machines. The machine I reproduced
the problem on originally, with sparse files, is running mke2fs 1.42-WIP
(16-Oct-2011), which is the latest in Debian unstable; the other, which is
using a real 17TB array, is running mke2fs 1.42-WIP (02-Jul-2011). I saw
the discard occurring.

BTW, on a 5PB sparse image file I get this error from the latest mke2fs
above:

$ ls -lh /mnt/scratch/scratch.img
-rw------- 1 root root 4.9P Nov 23 06:50 /mnt/scratch/scratch.img
$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
/dev/loop0: Cannot create filesystem with requested number of inodes
        while setting up superblock
$

This version does emit that it is discarding blocks:

$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
.....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
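That failure is presumably the requested inode count overflowing what ext4
can support at the default inode ratio; if the aim is just to get a
filesystem onto the 5PB image, raising the bytes-per-inode ratio (or capping
the inode count outright) might get mkfs past that particular check, e.g.:

$ sudo mkfs.ext4 -i 4194304 /dev/loop0     # one inode per 4MB instead of the default
$ sudo mkfs.ext4 -N 1000000 /dev/loop0     # or cap the inode count explicitly

Whether this version of e2fsprogs can build a working 5PB ext4 filesystem at
all is a separate question.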