public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
@ 2008-03-21 14:20 Stanislaw Gruszka
  2008-03-21 17:45 ` Josef 'Jeff' Sipek
  2008-03-24 23:39 ` David Chinner
  0 siblings, 2 replies; 6+ messages in thread
From: Stanislaw Gruszka @ 2008-03-21 14:20 UTC (permalink / raw)
  To: xfs

Hello

I have problems using XFS and LVM snapshots on linux-2.6.24. When I run
lvconvert to create snapshots while the system is under heavy load, lvconvert
and the I/O processes randomly hang. I use the script below to reproduce the
problem, but it is very hard to catch this bug.

<code>
#!/bin/bash
#set -x
DISK="physical_device" 

# clean old stuff
umount /mnt/tmp
for ((j = 0; j < 20; j++)) ; do
    echo -n "Remove $j "
    date
    umount /mnt/m$j
    lvremove -s -f /dev/VG/sn_$j
done
vgchange -a n
vgremove -f VG

# initialization
pvcreate  $DISK 2> /dev/null
vgcreate VG $DISK 2> /dev/null
vgchange -a y
lvcreate -L40G -n lv VG
mkdir -p /mnt/tmp
mkfs.xfs /dev/VG/lv
for ((j = 0; j < 20; j++)) ; do
    # lvcreate -n takes a bare LV name, not a device path
    lvcreate -L512M -n sn_${j} VG
    mkdir -p /mnt/m$j
done

# test
nloops=10
for ((loop = 0; loop < $nloops; loop++)) ;  do
    echo "loop $loop start ... "
    mount /dev/VG/lv /mnt/tmp

    dd if=/dev/urandom of=/mnt/tmp/file_tmp1 bs=1024  &
    load_pid1=$!
    dd if=/dev/urandom of=/mnt/tmp/file_tmp2 bs=1024  &
    load_pid2=$!

    for ((j = 0; j < 20; j++)) ; do
        echo -n "Convert $j "
        date
        lvconvert -s -c512 /dev/VG/lv /dev/VG/sn_$j
        sleep 10
        mount -t xfs -o nouuid,noatime /dev/VG/sn_$j /mnt/m$j
        sync
    done

    for ((j = 0; j < 20; j++)) ; do
        echo -n "Remove $j "
        date
        umount /mnt/m$j
        lvremove -s -f /dev/VG/sn_$j
    done

    kill $load_pid1
    wait $load_pid1
    kill $load_pid2
    wait $load_pid2

    umount /mnt/tmp
    echo "done"
done
</code>

Here is the sysrq show-blocked-tasks output from such a situation:

SysRq : Show Blocked State
  task                PC stack   pid father
xfsdatad/1    D 00000000     0   288      2
       f7d1aa90 00000046 00000000 00000000 00000000 00000000 f68be900 c4018a80 
       ffffffff ea4906f8 f7d1aa90 f744bf0c c05558e6 ea490700 c4015d60 00000000 
       ea4906f8 00000004 ea490680 ea490680 c0555a2d ea490700 ea490700 f7d1aa90 
Call Trace:
 [<c05558e6>] rwsem_down_failed_common+0x76/0x170
 [<c0555a2d>] rwsem_down_write_failed+0x1d/0x24
 [<c0555aa2>] call_rwsem_down_write_failed+0x6/0x8
 [<c05555d2>] down_write+0x12/0x20
 [<c029906a>] xfs_ilock+0x5a/0xa0
 [<c02c0093>] xfs_setfilesize+0x43/0x130
 [<c02c0180>] xfs_end_bio_delalloc+0x0/0x20
 [<c02c018d>] xfs_end_bio_delalloc+0xd/0x20
 [<c01334d2>] run_workqueue+0x52/0x100
 [<c01373a2>] prepare_to_wait+0x52/0x70
 [<c01335ff>] worker_thread+0x7f/0xc0
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c0133580>] worker_thread+0x0/0xc0
 [<c0136f09>] kthread+0x59/0xa0
 [<c0136eb0>] kthread+0x0/0xa0
 [<c0104d47>] kernel_thread_helper+0x7/0x10
 =======================
pdflush       D 00fc61cb     0  7337      2
       edc37a90 00000046 f28f9ed8 00fc61cb f28f9ed8 00000282 f7c5de40 c4010a80 
       f28f9ed8 00fc61cb f28f9f2c 00000000 c0554fd7 00000000 00000002 c07443e4 
       c07443e4 00fc61cb c012ca40 edc37a90 c0743d80 00000246 c0137370 c4010a80 
Call Trace:
 [<c0554fd7>] schedule_timeout+0x47/0x90
 [<c012ca40>] process_timeout+0x0/0x10
 [<c0137370>] prepare_to_wait+0x20/0x70
 [<c0554f5b>] io_schedule_timeout+0x1b/0x30
 [<c0162ede>] congestion_wait+0x7e/0xa0
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c0196a71>] sync_sb_inodes+0x141/0x1d0
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c0196b87>] writeback_inodes+0x87/0xb0
 [<c015d463>] wb_kupdate+0xa3/0x100
 [<c015e139>] __pdflush+0xb9/0x170
 [<c015e1f0>] pdflush+0x0/0x30
 [<c015e218>] pdflush+0x28/0x30
 [<c015d3c0>] wb_kupdate+0x0/0x100
 [<c0136f09>] kthread+0x59/0xa0
 [<c0136eb0>] kthread+0x0/0xa0
 [<c0104d47>] kernel_thread_helper+0x7/0x10
 =======================
dd            D c4018ab4     0 12113  29734
       ee178a90 00000082 ebe70ac0 c4018ab4 00000001 f75a4440 f7d0ee40 c4010a80 
       f3951bc0 f3951bc8 00000246 ee178a90 c0555b35 00000001 ee178a90 c011eaa0 
       f3951bcc f3951bcc ee25b160 c04793c7 f3baed80 f3951bc0 00008000 00000000 
Call Trace:
 [<c0555b35>] __down+0x75/0xe0
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c04793c7>] dm_unplug_all+0x17/0x30
 [<c0555a3b>] __down_failed+0x7/0xc
 [<c02e1ac0>] blk_backing_dev_unplug+0x0/0x10
 [<c02c2e4c>] xfs_buf_lock+0x3c/0x50
 [<c02c27d1>] _xfs_buf_find+0x151/0x1d0
 [<c02bfe67>] kmem_zone_alloc+0x47/0xc0
 [<c03cd208>] ata_check_status+0x8/0x10
 [<c02c28a5>] xfs_buf_get_flags+0x55/0x130
 [<c02c299c>] xfs_buf_read_flags+0x1c/0x90
 [<c02b48cf>] xfs_trans_read_buf+0x16f/0x350
 [<c02995fd>] xfs_itobp+0x7d/0x250
 [<c0156ba8>] find_get_pages_tag+0x38/0x90
 [<c015d70d>] write_cache_pages+0x11d/0x330
 [<c029cfc9>] xfs_iflush+0x99/0x470
 [<c02bd907>] xfs_inode_flush+0x127/0x1f0
 [<c02c9732>] xfs_fs_write_inode+0x22/0x80
 [<c01963fb>] write_inode+0x4b/0x50
 [<c01966d0>] __sync_single_inode+0xf0/0x190
 [<c01967b9>] __writeback_single_inode+0x49/0x1c0
 [<c012c5ce>] del_timer_sync+0xe/0x20
 [<c02ef3e3>] prop_fraction_single+0x33/0x60
 [<c015cc56>] task_dirty_limit+0x46/0xd0
 [<c0196a0e>] sync_sb_inodes+0xde/0x1d0
 [<c015ce8a>] get_dirty_limits+0x13a/0x160
 [<c0196ba0>] writeback_inodes+0xa0/0xb0
 [<c015d043>] balance_dirty_pages+0x193/0x2c0
 [<c0158ad2>] generic_perform_write+0x142/0x190
 [<c0158ba7>] generic_file_buffered_write+0x87/0x150
 [<c02c8d0b>] xfs_write+0x61b/0x8c0
 [<c0128555>] __do_softirq+0x75/0xf0
 [<c011177a>] smp_apic_timer_interrupt+0x2a/0x40
 [<c0104bf4>] apic_timer_interrupt+0x28/0x30
 [<c02c45f6>] xfs_file_aio_write+0x76/0x90
 [<c0178e9d>] do_sync_write+0xbd/0x110
 [<c013bb80>] notify_die+0x30/0x40
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c013b827>] atomic_notifier_call_chain+0x17/0x20
 [<c013bb80>] notify_die+0x30/0x40
 [<c0179050>] vfs_write+0x160/0x170
 [<c0179111>] sys_write+0x41/0x70
 [<c010418e>] syscall_call+0x7/0xb
 =======================
lvconvert     D c4010a80     0 12930  12501
       ec09e030 00000082 00000000 c4010a80 ec09e030 ebe70a90 f7c5d580 c4010a80 
       7fffffff cbd43e38 cbd43de8 00000002 c055501c ec1ea98c 00000292 ec09e030 
       c011eb9a 00000000 00000292 c0555b8e 00000001 ec09e030 c011eaa0 7fffffff 
Call Trace:
 [<c055501c>] schedule_timeout+0x8c/0x90
 [<c011eb9a>] __wake_up_locked+0x1a/0x20
 [<c0555b8e>] __down+0xce/0xe0
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c0554ca9>] wait_for_common+0xa9/0x140
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c0133719>] flush_cpu_workqueue+0x69/0xa0
 [<c0133640>] wq_barrier_func+0x0/0x10
 [<c013377c>] flush_workqueue+0x2c/0x40
 [<c02c3fc7>] xfs_flush_buftarg+0x17/0x120
 [<c02b6f26>] xfs_quiesce_fs+0x16/0x70
 [<c02b6fa0>] xfs_attr_quiesce+0x20/0x60
 [<c02b8db8>] xfs_freeze+0x8/0x10
 [<c019a617>] freeze_bdev+0x77/0x80
 [<c0479deb>] lock_fs+0x1b/0x70
 [<c019ff50>] bdev_set+0x0/0x10
 [<c0479f43>] dm_suspend+0xc3/0x350
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c011eaa0>] default_wake_function+0x0/0x10
 [<c047d71a>] do_suspend+0x7a/0x90
 [<c047d870>] dev_suspend+0x0/0x20
 [<c047e3db>] ctl_ioctl+0xcb/0x130
 [<c018544a>] do_ioctl+0x6a/0xa0
 [<c018560e>] vfs_ioctl+0x5e/0x1d0
 [<c01857f0>] sys_ioctl+0x70/0x80
 [<c010418e>] syscall_call+0x7/0xb
 =======================
dd            D 00fc61cb     0 12953  29684
       f7c92a90 00000082 e99a7c70 00fc61cb e99a7c70 00000286 f69b9580 c4018a80 
       e99a7c70 00fc61cb e99a7cc4 00000010 c0554fd7 00008000 c0748e44 f7c6a664 
       f7c6a664 00fc61cb c012ca40 f7c92a90 f7c6a000 00000246 c0137370 c4018a80 
Call Trace:
 [<c0554fd7>] schedule_timeout+0x47/0x90
 [<c012ca40>] process_timeout+0x0/0x10
 [<c0137370>] prepare_to_wait+0x20/0x70
 [<c0554f5b>] io_schedule_timeout+0x1b/0x30
 [<c0162ede>] congestion_wait+0x7e/0xa0
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c015ce8a>] get_dirty_limits+0x13a/0x160
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c015cf70>] balance_dirty_pages+0xc0/0x2c0
 [<c0158ad2>] generic_perform_write+0x142/0x190
 [<c0158ba7>] generic_file_buffered_write+0x87/0x150
 [<c02c8d0b>] xfs_write+0x61b/0x8c0
 [<c02df03d>] elv_next_request+0x7d/0x150
 [<c03abbde>] scsi_dispatch_cmd+0x15e/0x290
 [<c02c45f6>] xfs_file_aio_write+0x76/0x90
 [<c0178e9d>] do_sync_write+0xbd/0x110
 [<c01374a0>] autoremove_wake_function+0x0/0x50
 [<c012c740>] run_timer_softirq+0x30/0x180
 [<c014037f>] tick_do_periodic_broadcast+0x1f/0x30
 [<c013bb80>] notify_die+0x30/0x40
 [<c014039d>] tick_handle_periodic_broadcast+0xd/0x50
 [<c0179050>] vfs_write+0x160/0x170
 [<c0179111>] sys_write+0x41/0x70
 [<c010418e>] syscall_call+0x7/0xb
 =======================

I also have a full memory dump of the hung situation, so I can provide the
values of interesting variables (xfs_buf, xfs_inode) if you want. Please tell
me if you want any other info, such as the Linux .config. Currently I am trying
to reproduce the bug with the CONFIG_XFS_DEBUG and CONFIG_XFS_TRACE options and
with the following tracing settings:

#define XFS_ALLOC_TRACE 0
#define XFS_ATTR_TRACE 0
#define XFS_BLI_TRACE 0
#define XFS_BMAP_TRACE 0
#define XFS_BMBT_TRACE 0
#define XFS_DIR2_TRACE 0
#define XFS_DQUOT_TRACE 0
#define XFS_ILOCK_TRACE 1
#define XFS_LOG_TRACE 0
#define XFS_RW_TRACE 1
#define XFS_BUF_TRACE 1
#define XFS_VNODE_TRACE 0
#define XFS_FILESTREAMS_TRACE 0

I hope to provide more valuable information soon to help fix this problem.

I would also like to ask whether you have any suggestions for reproducing the
bug more quickly, because my script needs to run for hours or even days before
the processes hang.

Regards
Stanislaw Gruszka

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
  2008-03-21 14:20 BUG: xfs on linux lvm - lvconvert random hungs when doing i/o Stanislaw Gruszka
@ 2008-03-21 17:45 ` Josef 'Jeff' Sipek
  2008-03-22 10:20   ` Stanislaw Gruszka
  2008-03-24 23:39 ` David Chinner
  1 sibling, 1 reply; 6+ messages in thread
From: Josef 'Jeff' Sipek @ 2008-03-21 17:45 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: xfs

On Fri, Mar 21, 2008 at 03:20:16PM +0100, Stanislaw Gruszka wrote:

Interesting: I've noticed a similar hang (based on my inexpert inspection of
your backtraces), which went away as suddenly as it appeared. I wasn't making
snapshots or performing any other LVM operation at the time. It just happened;
the logs didn't contain anything.

It's an LVM setup with 2 disks => 1 LV with XFS.
It's a somewhat dated version of XFS (2.6.24-rc7 vanilla).

...
> Here is sysrq show-blocked-task output of such situation:

I did the same...

SysRq : Show Blocked State
  task                PC stack   pid father
postgres      D 0003469a     0  3435   3432
       df072d90 00200096 3f8d021a 0003469a f76c8e20 00000000 0002c7fb 00000000 
       3f8fca15 0003469a dfec20b4 dfec2f68 dfc308ec df072dd8 c032274f 00000001 
       dfec20b4 00000001 00200246 dfec20b4 dfec20b4 00200282 df072dc4 00000001 
Call Trace:
 [<c032274f>] schedule_timeout+0x69/0xa2
 [<f92c3a8b>] xlog_state_sync+0xf0/0x1ef [xfs]
 [<f92c3be2>] _xfs_log_force+0x58/0x5f [xfs]
 [<f92cef4f>] _xfs_trans_commit+0x2a0/0x371 [xfs]
 [<f92d5ea5>] xfs_fsync+0x13a/0x1d0 [xfs]
 [<f92de9fc>] xfs_file_fsync+0x6f/0x79 [xfs]
 [<c0187858>] do_fsync+0x5c/0x93
 [<c01878af>] __do_fsync+0x20/0x2f
 [<c01878dd>] sys_fsync+0xd/0xf
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
postgres      D 0003469a     0  3437   3432
       df0d4ce0 00200096 ee0ef4db 0003469a f7a9bd9c 00000000 00671954 00000000 
       ee760e2f 0003469a fffeffff f7a9bd98 f76ad0e0 df0d4d08 c0323b86 df0d4d10 
       f7a9bd9c f7a9bdb8 f76ad0e0 e94642a8 f7a9bd98 00000000 00000000 df0d4d24 
Call Trace:
 [<c0323b86>] rwsem_down_failed_common+0x66/0x15f
 [<c0323cc1>] rwsem_down_read_failed+0x1d/0x28
 [<c0323d13>] call_rwsem_down_read_failed+0x7/0xc
 [<f92b6903>] xfs_ilock+0x2f/0x87 [xfs]
 [<f92d38a3>] xfs_access+0x16/0x39 [xfs]
 [<f92e0766>] xfs_vn_permission+0x13/0x17 [xfs]
 [<c0170a68>] permission+0x7b/0xe3
 [<c0170bfc>] vfs_permission+0xf/0x11
 [<c0172217>] __link_path_walk+0x71/0xce8
 [<c0172ed2>] link_path_walk+0x44/0xbf
 [<c0172f65>] path_walk+0x18/0x1a
 [<c0173152>] do_path_lookup+0x78/0x1b0
 [<c0173c15>] __path_lookup_intent_open+0x44/0x81
 [<c0173cca>] path_lookup_open+0x21/0x27
 [<c0173da9>] open_namei+0x5b/0x646
 [<c0168a2b>] do_filp_open+0x26/0x43
 [<c0168a8b>] do_sys_open+0x43/0xcc
 [<c0168b4c>] sys_open+0x1c/0x1e
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
postgres      D dfed43d0     0  3438   3432
       df0debc0 00200092 00000000 dfed43d0 c032e760 dfed43d0 df0debb0 f92dd5ea 
       00200046 dfec2f78 df0dec20 dfc308ec dfec2f68 df0dec08 c032274f 00000001 
       dfec2f68 00000001 00200246 dfec2f68 dfec2f68 00200282 df0debf4 00000001 
Call Trace:
 [<c032274f>] schedule_timeout+0x69/0xa2
 [<f92c372e>] xlog_state_sync_all+0xcd/0x1aa [xfs]
 [<f92c3bca>] _xfs_log_force+0x40/0x5f [xfs]
 [<f92b71a9>] xfs_iget_core+0x404/0x5c4 [xfs]
 [<f92b7418>] xfs_iget+0xaf/0x117 [xfs]
 [<f92cffa8>] xfs_trans_iget+0xe1/0x149 [xfs]
 [<f92bb199>] xfs_ialloc+0xb2/0x5ad [xfs]
 [<f92d0a99>] xfs_dir_ialloc+0x6c/0x2a3 [xfs]
 [<f92d6712>] xfs_create+0x2ba/0x467 [xfs]
 [<f92e0fe1>] xfs_vn_mknod+0x163/0x29f [xfs]
 [<f92e1146>] xfs_vn_create+0x12/0x14 [xfs]
 [<c0171700>] vfs_create+0x96/0xe9
 [<c0174283>] open_namei+0x535/0x646
 [<c0168a2b>] do_filp_open+0x26/0x43
 [<c0168a8b>] do_sys_open+0x43/0xcc
 [<c0168b4c>] sys_open+0x1c/0x1e
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
bash          D 00000000     0  4251   4155
       e3deade8 00000082 00000000 00000000 e8d9d76c 00000000 00000000 00000000 
       7a4ec347 0003469a fffeffff e8d9d768 e3d7b220 e3deae10 c0323b86 e3deae18 
       e8d9d76c e8d9d788 e3deaf20 cc9fb000 e8d9d768 00000000 00000000 e3deae2c 
Call Trace:
 [<c0323b86>] rwsem_down_failed_common+0x66/0x15f
 [<c0323cc1>] rwsem_down_read_failed+0x1d/0x28
 [<c0323d13>] call_rwsem_down_read_failed+0x7/0xc
 [<f92b6903>] xfs_ilock+0x2f/0x87 [xfs]
 [<f92d38a3>] xfs_access+0x16/0x39 [xfs]
 [<f92e0766>] xfs_vn_permission+0x13/0x17 [xfs]
 [<c0170a68>] permission+0x7b/0xe3
 [<c0170bfc>] vfs_permission+0xf/0x11
 [<c0171de3>] may_open+0xa5/0x240
 [<c0173dbb>] open_namei+0x6d/0x646
 [<c0168a2b>] do_filp_open+0x26/0x43
 [<c0168a8b>] do_sys_open+0x43/0xcc
 [<c0168b4c>] sys_open+0x1c/0x1e
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
bash          D dfed49a0     0 15791   4155
       e792c9c4 00000096 00000001 dfed49a0 c032e760 00000046 dfed49a0 dfed49a0 
       00000282 e792c9c4 dfed4998 00000282 dfed49a0 e792c9f0 c0323ede dfbca6f0 
       00000001 dfbca6f0 c0116f2f dfed49bc dfed49bc dfed4998 dfef0b88 c8901aa8 
Call Trace:
 [<c0323ede>] __down+0x86/0xed
 [<c0323cd6>] __down_failed+0xa/0x10
 [<f92dc724>] xfs_buf_lock+0x46/0x49 [xfs]
 [<f92c9335>] xfs_getsb+0x16/0x33 [xfs]
 [<f92cf77a>] xfs_trans_getsb+0x36/0x74 [xfs]
 [<f92ce856>] xfs_trans_apply_sb_deltas+0x16/0x46f [xfs]
 [<f92ced3c>] _xfs_trans_commit+0x8d/0x371 [xfs]
 [<f92be719>] xfs_iomap_write_allocate+0x2ce/0x491 [xfs]
 [<f92bd870>] xfs_iomap+0x439/0x484 [xfs]
 [<f92e12ab>] xfs_bmap+0x2c/0x32 [xfs]
 [<f92da6b2>] xfs_map_blocks+0x38/0x78 [xfs]
 [<f92dbc5b>] xfs_page_state_convert+0x311/0x79b [xfs]
 [<f92dc1e4>] xfs_vm_writepage+0x54/0xe0 [xfs]
 [<c0150bbd>] __writepage+0xb/0x27
 [<c015113f>] write_cache_pages+0x202/0x2e5
 [<c0151245>] generic_writepages+0x23/0x2d
 [<f92da892>] xfs_vm_writepages+0x3d/0x45 [xfs]
 [<c0151275>] do_writepages+0x26/0x39
 [<c014bbbf>] __filemap_fdatawrite_range+0x66/0x72
 [<c014bdc7>] filemap_fdatawrite+0x26/0x28
 [<f92dec5e>] xfs_flush_pages+0x4d/0x74 [xfs]
 [<f92d8f2f>] xfs_release+0x12a/0x218 [xfs]
 [<f92dea14>] xfs_file_release+0xe/0x12 [xfs]
 [<c016ae99>] __fput+0xb6/0x18f
 [<c016b143>] fput+0x18/0x1a
 [<c01686a0>] filp_close+0x41/0x67
 [<c01698ac>] sys_close+0x65/0xa7
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
pdflush       D 0003469a     0 27649      2
       c2aa4ea4 00000092 3f8e2cc7 0003469a 00000282 c2aa4e94 000089ca 00000000 
       3f8eb691 0003469a dfed4998 dfed49cc c2aa4eac c2aa4ecc f92dd57f 00000000 
       ceb64ea0 c0116f2f dfed49e8 dfed49e8 dfed4998 00000020 00000000 c2aa4ed8 
Call Trace:
 [<f92dd57f>] xfs_buf_wait_unpin+0x7a/0xa8 [xfs]
 [<f92dd5f6>] xfs_buf_iorequest+0x49/0x70 [xfs]
 [<f92e12fe>] xfs_bdstrat_cb+0x4d/0x52 [xfs]
 [<f92d9b86>] xfs_bwrite+0x54/0xb7 [xfs]
 [<f92d2856>] xfs_syncsub+0x130/0x2df [xfs]
 [<f92d2a42>] xfs_sync+0x3d/0x4f [xfs]
 [<f92e2802>] xfs_fs_write_super+0x1c/0x23 [xfs]
 [<c016bf4e>] sync_supers+0x84/0xb2
 [<c0151a51>] wb_kupdate+0x27/0xd3
 [<c0151fcb>] pdflush+0xc0/0x175
 [<c012c242>] kthread+0x38/0x5a
 [<c0104be7>] kernel_thread_helper+0x7/0x10
 =======================
amarokapp     D 00000000     0 27973  27967
       df915bc0 00000092 00000000 00000000 f77854f8 00000000 00000000 00000000 
       f2938f82 0003469a df915c20 f7f6392c f7f5c258 df915c08 c032274f 00000001 
       f7f5c258 00000001 00000246 f7f5c258 f7f5c258 00000282 df915bf4 00000001 
Call Trace:
 [<c032274f>] schedule_timeout+0x69/0xa2
 [<f92c372e>] xlog_state_sync_all+0xcd/0x1aa [xfs]
 [<f92c3bca>] _xfs_log_force+0x40/0x5f [xfs]
 [<f92b71a9>] xfs_iget_core+0x404/0x5c4 [xfs]
 [<f92b7418>] xfs_iget+0xaf/0x117 [xfs]
 [<f92cffa8>] xfs_trans_iget+0xe1/0x149 [xfs]
 [<f92bb199>] xfs_ialloc+0xb2/0x5ad [xfs]
 [<f92d0a99>] xfs_dir_ialloc+0x6c/0x2a3 [xfs]
 [<f92d6712>] xfs_create+0x2ba/0x467 [xfs]
 [<f92e0fe1>] xfs_vn_mknod+0x163/0x29f [xfs]
 [<f92e1146>] xfs_vn_create+0x12/0x14 [xfs]
 [<c0171700>] vfs_create+0x96/0xe9
 [<c0174283>] open_namei+0x535/0x646
 [<c0168a2b>] do_filp_open+0x26/0x43
 [<c0168a8b>] do_sys_open+0x43/0xcc
 [<c0168b4c>] sys_open+0x1c/0x1e
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================
find          D 0003469a     0 28000  26670
       c899bb88 00000086 9ad0d912 0003469a 00000001 00000046 0009a6ff 00000000 
       9ada8011 0003469a dffe1b80 00000296 dffe1b88 c899bbb4 c0323ede f3d410a0 
       00000001 f3d410a0 c0116f2f dffe1ba4 dffe1ba4 dffe1ac0 dffe1ac0 f3a6a880 
Call Trace:
 [<c0323ede>] __down+0x86/0xed
 [<c0323cd6>] __down_failed+0xa/0x10
 [<f92dc6d4>] xfs_buf_iowait+0x4e/0x58 [xfs]
 [<f92dd805>] xfs_buf_iostart+0x7a/0x8b [xfs]
 [<f92de1a9>] xfs_buf_read_flags+0x51/0x75 [xfs]
 [<f92cf9b5>] xfs_trans_read_buf+0x3f/0x333 [xfs]
 [<f92b8f77>] xfs_itobp+0x70/0x1d1 [xfs]
 [<f92bc650>] xfs_iread+0x79/0x1d9 [xfs]
 [<f92b6ef1>] xfs_iget_core+0x14c/0x5c4 [xfs]
 [<f92b7418>] xfs_iget+0xaf/0x117 [xfs]
 [<f92d099e>] xfs_dir_lookup_int+0x81/0xec [xfs]
 [<f92d4adc>] xfs_lookup+0x56/0x7a [xfs]
 [<f92e0b75>] xfs_vn_lookup+0x2f/0x61 [xfs]
 [<c0170f93>] do_lookup+0x12a/0x16a
 [<c01728f3>] __link_path_walk+0x74d/0xce8
 [<c0172ed2>] link_path_walk+0x44/0xbf
 [<c0172f65>] path_walk+0x18/0x1a
 [<c0173152>] do_path_lookup+0x78/0x1b0
 [<c0173a73>] __user_walk_fd+0x32/0x4a
 [<c016d2d8>] vfs_lstat_fd+0x18/0x3e
 [<c016d369>] vfs_lstat+0x11/0x13
 [<c016d37f>] sys_lstat64+0x14/0x28
 [<c0103f72>] sysenter_past_esp+0x5f/0xa5
 =======================

Josef 'Jeff' Sipek.

-- 
It used to be said [...] that AIX looks like one space alien discovered
Unix, and described it to another different space alien who then implemented
AIX. But their universal translators were broken and they'd had to gesture a
lot.
		- Paul Tomblin 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
  2008-03-21 17:45 ` Josef 'Jeff' Sipek
@ 2008-03-22 10:20   ` Stanislaw Gruszka
  0 siblings, 0 replies; 6+ messages in thread
From: Stanislaw Gruszka @ 2008-03-22 10:20 UTC (permalink / raw)
  To: Josef 'Jeff' Sipek; +Cc: xfs

On Friday 21 March 2008, Josef 'Jeff' Sipek wrote:
> On Fri, Mar 21, 2008 at 03:20:16PM +0100, Stanislaw Gruszka wrote:
> 
> Interesting: I've noticed a similar hang (based on my inexpert inspection of
> your backtraces), which went away as suddenly as it appeared. I wasn't making
> snapshots or performing any other LVM operation at the time. It just happened;
> the logs didn't contain anything.

What I showed in the traces is a real deadlock: the processes will stay in
this state until the machine is powered off (or the end of the world);
nothing will ever up the semaphores.

Making a snapshot while there is pending I/O on the volume takes much more
time than when there is no load, probably because of lock contention, which
is a problem by itself. However, in most cases the processes do unblock
after some time. For me, making a snapshot takes a few seconds without I/O
and somewhere between 1 and 3 minutes with I/O. But, as I said before, in
very rare cases making a snapshot with I/O on an XFS volume takes forever.

Stanislaw Gruszka

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
  2008-03-21 14:20 BUG: xfs on linux lvm - lvconvert random hungs when doing i/o Stanislaw Gruszka
  2008-03-21 17:45 ` Josef 'Jeff' Sipek
@ 2008-03-24 23:39 ` David Chinner
  2008-03-25  2:02   ` David Chinner
  1 sibling, 1 reply; 6+ messages in thread
From: David Chinner @ 2008-03-24 23:39 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: xfs

On Fri, Mar 21, 2008 at 03:20:16PM +0100, Stanislaw Gruszka wrote:
> Hello
> 
> I have problems using XFS and LVM snapshots on linux-2.6.24. When I run
> lvconvert to create snapshots while the system is under heavy load, lvconvert
> and the I/O processes randomly hang. I use the script below to reproduce the
> problem, but it is very hard to catch this bug.

This looks like an I/O completion problem.

You're writing 2 files and doing snapshots while they are running.

> xfsdatad/1    D 00000000     0   288      2
> Call Trace:
>  [<c05558e6>] rwsem_down_failed_common+0x76/0x170
>  [<c0555a2d>] rwsem_down_write_failed+0x1d/0x24
>  [<c0555aa2>] call_rwsem_down_write_failed+0x6/0x8
>  [<c05555d2>] down_write+0x12/0x20
>  [<c029906a>] xfs_ilock+0x5a/0xa0
>  [<c02c0093>] xfs_setfilesize+0x43/0x130
>  [<c02c0180>] xfs_end_bio_delalloc+0x0/0x20
>  [<c02c018d>] xfs_end_bio_delalloc+0xd/0x20
>  [<c01334d2>] run_workqueue+0x52/0x100
>  [<c01373a2>] prepare_to_wait+0x52/0x70
>  [<c01335ff>] worker_thread+0x7f/0xc0

This is an xfs I/O completion workqueue, waiting to get the inode
ilock in exclusive mode to update the file size.

> pdflush       D 00fc61cb     0  7337      2
> Call Trace:
>  [<c0554fd7>] schedule_timeout+0x47/0x90
>  [<c012ca40>] process_timeout+0x0/0x10
>  [<c0137370>] prepare_to_wait+0x20/0x70
>  [<c0554f5b>] io_schedule_timeout+0x1b/0x30
>  [<c0162ede>] congestion_wait+0x7e/0xa0

Is waiting for I/O completion to remove the congestion status.

> lvconvert     D c4010a80     0 12930  12501
> Call Trace:
>  [<c0133719>] flush_cpu_workqueue+0x69/0xa0
>  [<c0133640>] wq_barrier_func+0x0/0x10
>  [<c013377c>] flush_workqueue+0x2c/0x40
>  [<c02c3fc7>] xfs_flush_buftarg+0x17/0x120
>  [<c02b6f26>] xfs_quiesce_fs+0x16/0x70
>  [<c02b6fa0>] xfs_attr_quiesce+0x20/0x60
>  [<c02b8db8>] xfs_freeze+0x8/0x10

That's waiting for the I/O completion workqueue to be flushed.
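
The dependency here can be modelled in a few lines: flushing a workqueue
means waiting for every queued work item to finish, so a single item stuck
on a lock stalls the flusher indefinitely. A rough userspace sketch, using
Python's threading and queue modules as stand-ins (the names below are
illustrative, not the kernel API):

```python
import queue
import threading

def run_demo():
    """Model: flushing a workqueue waits for ALL queued work, so one
    stuck work item (a completion blocked on a lock) stalls the flusher."""
    wq = queue.Queue()
    gate = threading.Event()          # the lock the completion is stuck on

    def worker():                     # the xfsdatad-like worker thread
        while True:
            item = wq.get()
            item()                    # run the work item (may block)
            wq.task_done()

    threading.Thread(target=worker, daemon=True).start()
    wq.put(lambda: gate.wait())       # a completion stuck behind a held lock

    flushed = threading.Event()
    def flusher():                    # the lvconvert/freeze side
        wq.join()                     # like flush_workqueue(): wait for everything
        flushed.set()
    threading.Thread(target=flusher, daemon=True).start()

    stalled = not flushed.wait(0.2)   # flusher makes no progress...
    gate.set()                        # ...until the stuck item's lock is released
    finished = flushed.wait(2)
    return stalled, finished
```

As long as the blocked work item never completes, the flush (and hence the
freeze, and hence lvconvert) waits forever.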

> dd            D 00fc61cb     0 12953  29684
> Call Trace:
>  [<c0554fd7>] schedule_timeout+0x47/0x90
>  [<c012ca40>] process_timeout+0x0/0x10
>  [<c0137370>] prepare_to_wait+0x20/0x70
>  [<c0554f5b>] io_schedule_timeout+0x1b/0x30
>  [<c0162ede>] congestion_wait+0x7e/0xa0

Stuck in congestion.

This dd (I've trimmed the stack trace to make it readable):

> dd            D c4018ab4     0 12113  29734
> Call Trace:
>  [<c0555b35>] __down+0x75/0xe0
>  [<c04793c7>] dm_unplug_all+0x17/0x30
>  [<c0555a3b>] __down_failed+0x7/0xc
>  [<c02e1ac0>] blk_backing_dev_unplug+0x0/0x10
>  [<c02c2e4c>] xfs_buf_lock+0x3c/0x50
>  [<c02c27d1>] _xfs_buf_find+0x151/0x1d0
>  [<c02c28a5>] xfs_buf_get_flags+0x55/0x130
>  [<c02c299c>] xfs_buf_read_flags+0x1c/0x90
>  [<c02b48cf>] xfs_trans_read_buf+0x16f/0x350
>  [<c02995fd>] xfs_itobp+0x7d/0x250
>  [<c029cfc9>] xfs_iflush+0x99/0x470
>  [<c02bd907>] xfs_inode_flush+0x127/0x1f0
>  [<c02c9732>] xfs_fs_write_inode+0x22/0x80
>  [<c01963fb>] write_inode+0x4b/0x50
>  [<c01966d0>] __sync_single_inode+0xf0/0x190
>  [<c01967b9>] __writeback_single_inode+0x49/0x1c0
>  [<c0196a0e>] sync_sb_inodes+0xde/0x1d0
>  [<c0196ba0>] writeback_inodes+0xa0/0xb0
>  [<c015d043>] balance_dirty_pages+0x193/0x2c0
>  [<c0158ad2>] generic_perform_write+0x142/0x190
>  [<c0158ba7>] generic_file_buffered_write+0x87/0x150
>  [<c02c8d0b>] xfs_write+0x61b/0x8c0
>  [<c02c45f6>] xfs_file_aio_write+0x76/0x90
>  [<c0178e9d>] do_sync_write+0xbd/0x110
>  [<c0179050>] vfs_write+0x160/0x170
>  [<c0179111>] sys_write+0x41/0x70
>  [<c010418e>] syscall_call+0x7/0xb

Is writing to one file, hitting foreground write throttling and
flushing either itself or the other file. It's stuck waiting on
I/O completion of the inode buffer.

I suspect that the I/O completion has been blocked by this process:
the xfsdatad is blocked on this inode - the inode flush takes the
ilock shared, which is holding off the I/O completion. As soon as
the inode buffer I/O is issued, the inode will be unlocked and
completion processing can continue.

i.e. it seems that either we can't safely take the ilock in I/O
completion without a trylock or we can't hold the ilock across
I/O submission without a trylock on the buffer lock. Ouch! That's
going to take some fixing....

I'd suggest trying these two patches (already queued for 2.6.26):

http://oss.sgi.com/archives/xfs/2008-01/msg00153.html
http://oss.sgi.com/archives/xfs/2008-01/msg00154.html

They make xfs_iflush do trylocks on the inode buffer in these
writeback cases, which should avoid the problem you are seeing
here. It won't avoid all possible problems, but it will no longer
hang waiting on buffer I/O completion in async inode flushes like
the above....
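
The trylock approach those patches take can be sketched abstractly: the
flusher refuses to sleep on the buffer lock while holding the ilock; on
contention it drops the ilock and retries, so the completion side (which
holds the buffer lock and wants the ilock) can always make progress. A
minimal model in Python threading (hypothetical names, not the XFS code):

```python
import threading
import time

# Stand-ins for the two kernel locks involved (an illustrative model).
ilock = threading.Lock()      # the inode ilock
buf_lock = threading.Lock()   # the inode buffer lock

def flush_inode(backoffs):
    """Flusher: needs buf_lock while already holding ilock.

    A blocking acquire here is the ABBA deadlock; instead we trylock
    and, on contention, drop the ilock so the other side can proceed."""
    while True:
        with ilock:
            if buf_lock.acquire(blocking=False):  # trylock: never sleep under ilock
                buf_lock.release()                # "issue the buffer I/O"
                return
        backoffs.append(1)    # contended: ilock dropped, retry shortly
        time.sleep(0.001)

def run_demo():
    backoffs = []
    buf_lock.acquire()        # main thread plays the I/O completion holding buf_lock
    t = threading.Thread(target=flush_inode, args=(backoffs,))
    t.start()
    while not backoffs:       # wait until the flusher has backed off at least once
        time.sleep(0.001)
    buf_lock.release()        # "completion" releases the buffer lock
    t.join(timeout=5)
    return (not t.is_alive(), bool(backoffs))
```

With a blocking acquire in flush_inode, the same interleaving would be the
ABBA deadlock seen in the traces; with the trylock, both sides finish.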

> I would also like to ask whether you have any suggestions for reproducing the
> bug more quickly, because my script needs to run for hours or even days before
> the processes hang.

It's pure chance. Hence I don't think there's much you can do to
improve the reproducibility of this problem....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
  2008-03-24 23:39 ` David Chinner
@ 2008-03-25  2:02   ` David Chinner
  2008-03-26 14:02     ` Stanislaw Gruszka
  0 siblings, 1 reply; 6+ messages in thread
From: David Chinner @ 2008-03-25  2:02 UTC (permalink / raw)
  To: David Chinner; +Cc: Stanislaw Gruszka, xfs

On Tue, Mar 25, 2008 at 10:39:52AM +1100, David Chinner wrote:
> On Fri, Mar 21, 2008 at 03:20:16PM +0100, Stanislaw Gruszka wrote:
> > Hello
> > 
> > I have problems using XFS and LVM snapshots on linux-2.6.24. When I run
> > lvconvert to create snapshots while the system is under heavy load, lvconvert
> > and the I/O processes randomly hang. I use the script below to reproduce the
> > problem, but it is very hard to catch this bug.
> 
> This looks like an I/O completion problem.
> 
> You're writing 2 files and doing snapshots while they are running.
> 
> > xfsdatad/1    D 00000000     0   288      2
> > Call Trace:
> >  [<c05558e6>] rwsem_down_failed_common+0x76/0x170
> >  [<c0555a2d>] rwsem_down_write_failed+0x1d/0x24
> >  [<c0555aa2>] call_rwsem_down_write_failed+0x6/0x8
> >  [<c05555d2>] down_write+0x12/0x20
> >  [<c029906a>] xfs_ilock+0x5a/0xa0
> >  [<c02c0093>] xfs_setfilesize+0x43/0x130
> >  [<c02c0180>] xfs_end_bio_delalloc+0x0/0x20
> >  [<c02c018d>] xfs_end_bio_delalloc+0xd/0x20
> >  [<c01334d2>] run_workqueue+0x52/0x100
> >  [<c01373a2>] prepare_to_wait+0x52/0x70
> >  [<c01335ff>] worker_thread+0x7f/0xc0
> 
> This is an xfs I/O completion workqueue, waiting to get the inode
> ilock in exclusive mode to update the file size.

......

> This dd (I've trimmed the stack trace to make it readable):
> 
> > dd            D c4018ab4     0 12113  29734
> > Call Trace:
> >  [<c0555b35>] __down+0x75/0xe0
> >  [<c04793c7>] dm_unplug_all+0x17/0x30
> >  [<c0555a3b>] __down_failed+0x7/0xc
> >  [<c02e1ac0>] blk_backing_dev_unplug+0x0/0x10
> >  [<c02c2e4c>] xfs_buf_lock+0x3c/0x50
> >  [<c02c27d1>] _xfs_buf_find+0x151/0x1d0
> >  [<c02c28a5>] xfs_buf_get_flags+0x55/0x130
> >  [<c02c299c>] xfs_buf_read_flags+0x1c/0x90
> >  [<c02b48cf>] xfs_trans_read_buf+0x16f/0x350
> >  [<c02995fd>] xfs_itobp+0x7d/0x250
> >  [<c029cfc9>] xfs_iflush+0x99/0x470
> >  [<c02bd907>] xfs_inode_flush+0x127/0x1f0
> >  [<c02c9732>] xfs_fs_write_inode+0x22/0x80
> >  [<c01963fb>] write_inode+0x4b/0x50
> >  [<c01966d0>] __sync_single_inode+0xf0/0x190
> >  [<c01967b9>] __writeback_single_inode+0x49/0x1c0
> >  [<c0196a0e>] sync_sb_inodes+0xde/0x1d0
> >  [<c0196ba0>] writeback_inodes+0xa0/0xb0
> >  [<c015d043>] balance_dirty_pages+0x193/0x2c0
> >  [<c0158ad2>] generic_perform_write+0x142/0x190
> >  [<c0158ba7>] generic_file_buffered_write+0x87/0x150
> >  [<c02c8d0b>] xfs_write+0x61b/0x8c0
> >  [<c02c45f6>] xfs_file_aio_write+0x76/0x90
> >  [<c0178e9d>] do_sync_write+0xbd/0x110
> >  [<c0179050>] vfs_write+0x160/0x170
> >  [<c0179111>] sys_write+0x41/0x70
> >  [<c010418e>] syscall_call+0x7/0xb
> 
> Is writing to one file, hitting foreground write throttling and
> flushing either itself or the other file. It's stuck waiting on
> I/O completion of the inode buffer.
> 
> I suspect that the I/O completion has been blocked by this process:
> the xfsdatad is blocked on this inode - the inode flush takes the
> ilock shared, which is holding off the I/O completion. As soon as
> the inode buffer I/O is issued, the inode will be unlocked and
> completion processing can continue.
> 
> i.e. it seems that either we can't safely take the ilock in I/O
> completion without a trylock or we can't hold the ilock across
> I/O submission without a trylock on the buffer lock. Ouch! That's
> going to take some fixing....

No, that's not true - the data I/O is queued to the xfsdatad, whilst
metadata gets queued to the xfslogd completion queue. Hence data I/O
completion can't hold up metadata I/O completion and we can't deadlock
here....
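
The separation Dave describes - data completions on one queue, metadata
completions on another, each with its own worker - can be sketched the same
way: a data completion stuck on the ilock cannot delay a metadata
completion, because they never share a worker thread. A rough model
(Python threading; the names are illustrative only):

```python
import queue
import threading
import time

def start_worker(wq):
    """One dedicated worker per completion queue, as with xfsdatad/xfslogd."""
    def worker():
        while True:
            wq.get()()                # run the next queued completion
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

def run_demo():
    datad = queue.Queue()             # stands in for the xfsdatad completion queue
    logd = queue.Queue()              # stands in for the xfslogd completion queue
    start_worker(datad)
    start_worker(logd)

    ilock_held = threading.Event()    # simulates the ilock being held elsewhere
    done = []
    datad.put(lambda: (ilock_held.wait(), done.append("data")))  # blocked data completion
    logd.put(lambda: done.append("log"))                         # metadata completion

    time.sleep(0.2)
    ran_while_blocked = list(done)    # metadata completed despite the blocked datad
    ilock_held.set()                  # release the "ilock"
    time.sleep(0.2)
    return ran_while_blocked, list(done)
```

Because the queues are independent, the metadata completion runs even while
the data completion is blocked, which is why the suspected cross-queue
deadlock cannot happen.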

That points to I/O not completing (not an XFS problem at all), or
the filesystem freeze is just taking a long time to run (as it has
to sync everything to disk). Given that this is a snapshot target,
writing new blocks will take quite some time. Is the system still
making writeback  progress when in this state, or is it really hung?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: xfs on linux lvm - lvconvert random hungs when doing i/o
  2008-03-25  2:02   ` David Chinner
@ 2008-03-26 14:02     ` Stanislaw Gruszka
  0 siblings, 0 replies; 6+ messages in thread
From: Stanislaw Gruszka @ 2008-03-26 14:02 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs

On Tuesday 25 March 2008, David Chinner wrote:
> That points to I/O not completing (not an XFS problem at all), or
> the filesystem freeze is just taking a long time to run (as it has
> to sync everything to disk). Given that this is a snapshot target,
> writing new blocks will take quite some time. Is the system still
> making writeback  progress when in this state, or is it really hung?

That is a real deadlock: operations which normally take a few minutes
hang for hours, until I restart the machine.

This bug is very strange, because I can't catch it with the more
verbose debug and tracing options enabled.

I did more tests without the debugging options, and it looks like the
version from the SGI CVS (2.6.25-rc3) works well. I also tested the two
patches you suggested, applied (with some trouble) to 2.6.24.2, but
things finally hung after 2 days - better than without the patches, as
you expected.

So it looks like things are fixed in the SGI CVS repository. I don't
know whether these are XFS fixes or fixes in another subsystem
(device-mapper?); anyway, I can't reproduce the bug on this version,
which is good news, I think.

>
> Cheers,
> 
> Dave.

Thanks
Stanislaw Gruszka

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-03-26 13:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-21 14:20 BUG: xfs on linux lvm - lvconvert random hungs when doing i/o Stanislaw Gruszka
2008-03-21 17:45 ` Josef 'Jeff' Sipek
2008-03-22 10:20   ` Stanislaw Gruszka
2008-03-24 23:39 ` David Chinner
2008-03-25  2:02   ` David Chinner
2008-03-26 14:02     ` Stanislaw Gruszka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox