linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.com>
To: Xiao Ni <xni@redhat.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
Date: Thu, 14 Sep 2017 15:32:02 +1000
Message-ID: <871sn9alrh.fsf@notabene.neil.brown.name>
In-Reply-To: <446747392.10694917.1505364915884.JavaMail.zimbra@redhat.com>


On Thu, Sep 14 2017, Xiao Ni wrote:

> ----- Original Message -----
>> From: "NeilBrown" <neilb@suse.com>
>> To: "Xiao Ni" <xni@redhat.com>
>> Cc: linux-raid@vger.kernel.org
>> Sent: Thursday, September 14, 2017 7:05:20 AM
>> Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
>> 
>> On Wed, Sep 13 2017, Xiao Ni wrote:
>> >
>> > Hi Neil
>> >
>> > Sorry for the bad news. The test is still running and it's stuck again.
>> 
>> Any details?  Anything at all?  Just a little hint maybe?
>> 
>> Just saying "it's stuck again" is very nearly useless.
>> 
> Hi Neil
>
> There is no useful information in /var/log/messages.
>
> echo file raid5.c +p > /sys/kernel/debug/dynamic_debug/control
> That doesn't produce any messages either.
>
> It looks like a different problem.
>
> [root@dell-pr1700-02 ~]# ps auxf | grep D
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      8381  0.0  0.0      0     0 ?        D    Sep13   0:00  \_ [kworker/u8:1]
> root      8966  0.0  0.0      0     0 ?        D    Sep13   0:00  \_ [jbd2/md0-8]
> root       824  0.0  0.1 216856  8492 ?        Ss   Sep03   0:06 /usr/bin/abrt-watch-log -F BUG: WARNING: at WARNING: CPU: INFO: possible recursive locking detected ernel BUG at list_del corruption list_add corruption do_IRQ: stack overflow: ear stack overflow (cur: eneral protection fault nable to handle kernel ouble fault: RTNL: assertion failed eek! page_mapcount(page) went negative! adness at NETDEV WATCHDOG ysctl table check failed : nobody cared IRQ handler type mismatch Machine Check Exception: Machine check events logged divide error: bounds: coprocessor segment overrun: invalid TSS: segment not present: invalid opcode: alignment check: stack segment: fpu exception: simd exception: iret exception: /var/log/messages -- /usr/bin/abrt-dump-oops -xtD
> root       836  0.0  0.0 195052  3200 ?        Ssl  Sep03   0:00 /usr/sbin/gssproxy -D
> root      1225  0.0  0.0 106008  7436 ?        Ss   Sep03   0:00 /usr/sbin/sshd -D
> root     12411  0.0  0.0 112672  2264 pts/0    S+   00:50   0:00          \_ grep --color=auto D
> root      8987  0.0  0.0 109000  2728 pts/2    D+   Sep13   0:04          \_ dd if=/dev/urandom of=/mnt/md_test/testfile bs=1M count=1000
> root      8983  0.0  0.0   7116  2080 ?        Ds   Sep13   0:00 /usr/sbin/mdadm --grow --continue /dev/md0
>
> [root@dell-pr1700-02 ~]# cat /proc/mdstat 
> Personalities : [raid6] [raid5] [raid4] 
> md0 : active raid5 loop6[7] loop4[6] loop5[5](S) loop3[3] loop2[2] loop1[1] loop0[0]
>       2039808 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
>       [>....................]  reshape =  0.0% (1/509952) finish=1059.5min speed=7K/sec
>       
> unused devices: <none>
>
>
> It looks like the reshape never starts. This time I didn't add the code that
> checks mddev->suspended and active_stripes; I only applied the patches to the
> source. Do you have any other suggestions for what to check?
>
> Best Regards
> Xiao

What do
 cat /proc/8987/stack
 cat /proc/8983/stack
 cat /proc/8966/stack
 cat /proc/8381/stack

show??

NeilBrown
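
The four /proc/<pid>/stack reads requested above can be generalised so that
PIDs do not have to be copied from the ps output by hand. What follows is only
a sketch, not something posted in the thread: the function name
dump_d_state_stacks and its optional proc-root argument are hypothetical, and
it assumes a kernel that exposes /proc/<pid>/stack (CONFIG_STACKTRACE) and a
root shell. It selects tasks by the state field of /proc/<pid>/stat rather
than by "grep D", which in the listing above also matched the header line and
commands such as "sshd -D".

```shell
# Sketch: print the kernel stack of every task in uninterruptible sleep (D).
# The optional first argument is a hypothetical override of the proc root,
# useful for trying the function against a fake directory tree.
dump_d_state_stacks() {
    proc_root=${1:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/stat" ] || continue
        pid=${dir##*/}
        stat=$(cat "$dir/stat" 2>/dev/null) || continue
        # The state is the first field after the command name. Strip through
        # the last ") " first, because the command name may itself contain
        # spaces or parentheses.
        rest=${stat##*") "}
        state=${rest%% *}
        [ "$state" = "D" ] || continue
        echo "== PID $pid =="
        # The task may exit, or the file may be unreadable without root.
        cat "$dir/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Calling dump_d_state_stacks with no argument as root would walk the real
/proc; on the machine above it would be expected to cover the same four tasks
(8987, 8983, 8966, 8381) as long as they are still blocked.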


