From: Shaohua Li <shli@kernel.org>
To: yizhan <yizhan@redhat.com>
Cc: Guoqing Jiang <gqjiang@suse.com>,
linux-raid@vger.kernel.org, shli@fb.com,
NeilBrown <neilb@suse.com>,
songliubraving@fb.com
Subject: Re: INFO: task systemd-udevd:794 blocked for more than 600 seconds observed after mdadm -G operation
Date: Fri, 5 Aug 2016 21:45:55 -0700 [thread overview]
Message-ID: <20160806044555.GC9281@kernel.org> (raw)
In-Reply-To: <40d28573-38a6-25ad-7956-95bd0c74b996@redhat.com>
On Fri, Aug 05, 2016 at 03:05:40PM +0800, yizhan wrote:
>
>
> On 08/05/2016 02:39 PM, Guoqing Jiang wrote:
> >
> >
> >On 08/04/2016 12:01 PM, Yi Zhang wrote:
> >
> >>[ 1201.734199] INFO: task systemd-udevd:794 blocked for more than 600
> >>seconds.
> >>[ 1201.734204] Not tainted 4.7.0 #5
> >>[ 1201.734204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>disables this message.
> >>[ 1201.734206] systemd-udevd D ffff8800310a77b0 0 794 1
> >>0x00000000
> >>[ 1201.734209] ffff8800310a77b0 0000000000000001 ffff880031de0000
> >>ffff8807cd3c5688
> >>[ 1201.734211] ffff8800310a8000 0000000000000000 0000000000000002
> >>ffff8807c3e4c100
> >>[ 1201.734213] 0000000000000000 ffff8800310a77c8 ffffffff816e4935
> >>ffff8807cd3c5400
> >>[ 1201.734214] Call Trace:
> >>[ 1201.734220] [<ffffffff816e4935>] schedule+0x35/0x80
> >>[ 1201.734226] [<ffffffffa09a0ba9>] raid5_make_request+0x8b9/0xce0
> >>[raid456]
> >>[ 1201.734229] [<ffffffff810d0240>] ? prepare_to_wait_event+0xf0/0xf0
> >>[ 1201.734233] [<ffffffff8156758e>] md_make_request+0xee/0x230
> >>[ 1201.734236] [<ffffffff81320803>] generic_make_request+0x103/0x1d0
> >>[ 1201.734237] [<ffffffff81320947>] submit_bio+0x77/0x150
> >>[ 1201.734240] [<ffffffff813183bb>] ? bio_alloc_bioset+0x1ab/0x2d0
> >>[ 1201.734242] [<ffffffff81254ecd>] ? alloc_page_buffers+0x5d/0xb0
> >>[ 1201.734244] [<ffffffff81256a1f>] submit_bh_wbc+0x12f/0x160
> >>[ 1201.734246] [<ffffffff81256d58>] block_read_full_page+0x208/0x380
> >>[ 1201.734248] [<ffffffff81259000>] ? I_BDEV+0x20/0x20
> >>[ 1201.734250] [<ffffffff812597e8>] blkdev_readpage+0x18/0x20
> >>[ 1201.734253] [<ffffffff81197718>] do_read_cache_page+0x138/0x300
> >>[ 1201.734255] [<ffffffff812597d0>] ? blkdev_writepages+0x40/0x40
> >>[ 1201.734257] [<ffffffff811978f9>] read_cache_page+0x19/0x20
> >>[ 1201.734259] [<ffffffff81333b50>] read_dev_sector+0x80/0xb0
> >>[ 1201.734261] [<ffffffff81337f14>] read_lba+0x104/0x1c0
> >>[ 1201.734263] [<ffffffff8133852a>] find_valid_gpt+0xfa/0x710
> >>[ 1201.734266] [<ffffffff811d7d00>] ?
> >>vmap_page_range_noflush+0x190/0x380
> >>[ 1201.734267] [<ffffffff81338b40>] ? find_valid_gpt+0x710/0x710
> >>[ 1201.734269] [<ffffffff81338bc9>] efi_partition+0x89/0x440
> >>[ 1201.734271] [<ffffffff81359209>] ? snprintf+0x49/0x70
> >>[ 1201.734273] [<ffffffff81338b40>] ? find_valid_gpt+0x710/0x710
> >>[ 1201.734275] [<ffffffff81335b31>] check_partition+0x101/0x1f0
> >>[ 1201.734277] [<ffffffff81334462>] rescan_partitions+0xc2/0x2c0
> >>[ 1201.734279] [<ffffffff8132f294>] __blkdev_reread_part+0x64/0x70
> >>[ 1201.734281] [<ffffffff8132f2c3>] blkdev_reread_part+0x23/0x40
> >>[ 1201.734282] [<ffffffff8132fd3a>] blkdev_ioctl+0x46a/0x8f0
> >>[ 1201.734284] [<ffffffff81259aa1>] block_ioctl+0x41/0x50
> >>[ 1201.734286] [<ffffffff812323b6>] do_vfs_ioctl+0xa6/0x5c0
> >>[ 1201.734288] [<ffffffff81232949>] SyS_ioctl+0x79/0x90
> >>[ 1201.734290] [<ffffffff81274d67>] ? SyS_flock+0x117/0x1a0
> >>[ 1201.734292] [<ffffffff816e86b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> >>[ 1201.734305] INFO: task md0_reshape:4089 blocked for more than 600
> >>seconds.
> >>[ 1201.734306] Not tainted 4.7.0 #5
> >>[ 1201.734307] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>disables this message.
> >>[ 1201.734308] md0_reshape D ffff88080db77ac0 0 4089 2
> >>0x00000080
> >>[ 1201.734310] ffff88080db77ac0 0000000000000000 ffff8807d6724380
> >>ffff8807cd3c5670
> >>[ 1201.734311] ffff88080db78000 ffff88080db77b20 ffff8807cd3c5670
> >>0000000000000c20
> >>[ 1201.734313] ffff8807cd3c5418 ffff88080db77ad8 ffffffff816e4935
> >>ffff8807cd3c5400
> >>[ 1201.734314] Call Trace:
> >>[ 1201.734316] [<ffffffff816e4935>] schedule+0x35/0x80
> >>[ 1201.734319] [<ffffffffa099fe8b>] raid5_get_active_stripe+0x20b/0x670
> >>[raid456]
> >>[ 1201.734321] [<ffffffff810d0240>] ? prepare_to_wait_event+0xf0/0xf0
> >>[ 1201.734323] [<ffffffffa09a4733>] reshape_request+0x4d3/0x8a0
> >>[raid456]
> >>[ 1201.734326] [<ffffffffa09a4e21>] raid5_sync_request+0x321/0x3a0
> >>[raid456]
> >>[ 1201.734327] [<ffffffff810cfc54>] ? __wake_up+0x44/0x50
> >>[ 1201.734329] [<ffffffff8156ce97>] md_do_sync+0x9e7/0xf60
> >>[ 1201.734332] [<ffffffff810c200d>] ? update_curr+0xed/0x180
> >>[ 1201.734333] [<ffffffff810c025e>] ? account_entity_dequeue+0xae/0xd0
> >>[ 1201.734336] [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0
> >>[ 1201.734339] [<ffffffff81566dc6>] md_thread+0x136/0x150
> >>[ 1201.734341] [<ffffffff81566c90>] ? find_pers+0x70/0x70
> >>[ 1201.734343] [<ffffffff810ab678>] kthread+0xd8/0xf0
> >>[ 1201.734344] [<ffffffff816e88bf>] ret_from_fork+0x1f/0x40
> >>[ 1201.734346] [<ffffffff810ab5a0>] ? kthread_park+0x60/0x60
> >>
> >>
> >>Reproduce steps:
> >>
> >>#mdadm -CR /dev/md0 -l5 -n4 /dev/sd[b-e]1 --write-journal /dev/sdf1
> >>#mdadm --wait /dev/md0
> >>#mdadm /dev/md0 -f /dev/sdf1
> >>#mdadm /dev/md0 -r /dev/sdf1
> >>#mdadm /dev/md0 -a /dev/sdf1
> >>#mdadm -D /dev/md0
> >>/dev/md0:
> >> Version : 1.2
> >> Creation Time : Thu Aug 4 20:37:41 2016
> >> Raid Level : raid5
> >> Array Size : 31432704 (29.98 GiB 32.19 GB)
> >> Used Dev Size : 10477568 (9.99 GiB 10.73 GB)
> >> Raid Devices : 4
> >> Total Devices : 5
> >> Persistence : Superblock is persistent
> >>
> >> Update Time : Thu Aug 4 20:40:16 2016
> >> State : clean
> >> Active Devices : 4
> >>Working Devices : 5
> >> Failed Devices : 0
> >> Spare Devices : 1
> >>
> >> Layout : left-symmetric
> >> Chunk Size : 512K
> >>
> >> Name : dhcp-12-125.nay.redhat.com:0 (local to host
> >>dhcp-12-125.nay.redhat.com)
> >> UUID : 09607542:a8dccc81:f5d2f80c:c586923f
> >> Events : 22
> >>
> >> Number Major Minor RaidDevice State
> >> 0 8 17 0 active sync /dev/sdb1
> >> 1 8 33 1 active sync /dev/sdc1
> >> 2 8 49 2 active sync /dev/sdd1
> >> 5 8 65 3 active sync /dev/sde1
> >>
> >> 4 8 81 - spare /dev/sdf1
> >>
> >>#mdadm -G -n5 /dev/md0
> >
> >Also with 4.7+, I can't find the same issue, but md0_reshape is in 'D'
> >state from my side
> >(actually I can't find reshape is in-progress by 'cat /proc/mdstat').
> >
> Hi guoqing
> Thanks for you detailed log.
> The trace cannot be print soon, I think you need wait for more time, the
> time from below is more than 15m.
> Aug 4 20:39:37 dhcp-12-125 kernel: md/raid:md0: Disk failure on sdf1,
> disabling device.#012md/raid:md0: Operation continuing on 4 devices.
> Aug 4 20:39:56 dhcp-12-125 kernel: md: unbind<sdf1>
> Aug 4 20:39:56 dhcp-12-125 kernel: md: export_rdev(sdf1)
> Aug 4 20:40:16 dhcp-12-125 kernel: md: bind<sdf1>
> Aug 4 20:40:47 dhcp-12-125 kernel: md: reshape of RAID array md0
> Aug 4 20:40:47 dhcp-12-125 kernel: md: minimum _guaranteed_ speed: 1000
> KB/sec/disk.
> Aug 4 20:40:47 dhcp-12-125 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reshape.
> Aug 4 20:40:47 dhcp-12-125 kernel: md: using 128k window, over a total of
> 10477568k
> Aug 4 20:55:56 dhcp-12-125 kernel: INFO: task systemd-udevd:794 blocked for
> more than 600 seconds.
Adding Song who knows a lot in this side.
If an array is created with journal and journal disk fails, we will make the
array read-only. So we should disable reshape too. Likely we have a bug here.
Thanks,
Shaohua
prev parent reply other threads:[~2016-08-06 4:45 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <2047059297.11279583.1470314510395.JavaMail.zimbra@redhat.com>
2016-08-04 16:01 ` INFO: task systemd-udevd:794 blocked for more than 600 seconds observed after mdadm -G operation Yi Zhang
2016-08-05 6:39 ` Guoqing Jiang
2016-08-05 7:05 ` yizhan
2016-08-06 4:45 ` Shaohua Li [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160806044555.GC9281@kernel.org \
--to=shli@kernel.org \
--cc=gqjiang@suse.com \
--cc=linux-raid@vger.kernel.org \
--cc=neilb@suse.com \
--cc=shli@fb.com \
--cc=songliubraving@fb.com \
--cc=yizhan@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.