From: Tao Ma <tm@tao.ma>
To: "Bryn M. Reeves" <bmr@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>,
linux-scsi@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: How to online remove an error scsi disk from the system?
Date: Fri, 01 Feb 2013 19:13:52 +0800
Message-ID: <510BA370.1040309@tao.ma>
In-Reply-To: <510B93EE.8000304@redhat.com>
On 02/01/2013 06:07 PM, Bryn M. Reeves wrote:
> On 02/01/2013 09:59 AM, Tao Ma wrote:
>> yes, but the result is the same. It will do some IO first which will
>> cause this command hang.
>
> You seem to have a problem with either the device/adapter or in the
> driver. The backtrace you posted shows that jbd2 (ext4) is still waiting
> on IO that's been submitted to an mpt2sas or mpt3sas adapter (I only
> know that because I recognise their log messages - you should try to
> include relevant details like this when seeking assistance).
This should be an mpt2sas adapter:
# lsmod | grep mpt
mptctl                 96789  0
mptbase                97052  1 mptctl
mpt2sas               164962  18
scsi_transport_sas     35232  3 isci,libsas,mpt2sas
raid_class              4746  1 mpt2sas
The system has 12 SATA disks. What else do you need? I am willing to
provide any details you want.
>
> The adapter/driver hasn't completed the IO and it looks like the SCSI
> layer is trying to abort it. Depending on the state of the driver and
> hardware your only option might be to reboot (or physically hot remove
> the device if your hardware allows it).
OK, so let me describe the situation here. This is one of our storage
systems, with twelve 2TB SATA disks in one box. Normally, when one disk
fails, we just want to remove it from the system by *software* and then
continue to use the remaining 11 disks. We have found that sometimes an
unsuccessful umount or some other action against the failed disk can
lead to a bad situation (say, very high load because many processes are
stuck in the 'D' state). So ideally, if we can remove the device
successfully, all the I/O to that disk will fail, there will be no 'D'
processes, and the loadavg will stay low.
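To make the goal concrete, here is a minimal sketch of the kind of software removal described above. It relies on the standard sysfs attributes `device/state` and `device/delete`; the function names and the `SYSFS_ROOT` override are hypothetical, added only so the logic can be dry-run against a scratch directory. Whether this completes on a box where the adapter is already stuck is exactly the open question in this thread, so treat it as an illustration rather than a fix.

```shell
#!/bin/sh
# Sketch only: take a failed SCSI disk offline, then ask the kernel to
# delete it. SYSFS_ROOT is a hypothetical override for dry runs; it
# defaults to the real /sys.

remove_scsi_disk() {
    dev="$1"                                  # e.g. sdc (no /dev/ prefix)
    devdir="${SYSFS_ROOT:-/sys}/block/$dev/device"
    if [ ! -d "$devdir" ]; then
        echo "no such SCSI device: $dev" >&2
        return 1
    fi
    # Mark the device offline so new I/O is failed instead of queued.
    echo offline > "$devdir/state" || return 1
    # Ask the SCSI midlayer to remove the device.
    echo 1 > "$devdir/delete"
}

# List processes currently in uninterruptible sleep ('D' state).
list_d_state() {
    ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /^D/'
}
```

For example, `remove_scsi_disk sdc` followed by `list_d_state` would show whether any tasks are still blocked after the removal attempt.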
>
> You don't mention the versions of the kernel and driver you're using -
> if the system is in production I would suggest contacting who ever
> normally provides support for the kernel and distribution that you are
> running.
We use CentOS 6.2 and the kernel version is 2.6.32-220.23.1.
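In case it helps others reporting similar problems, the details requested above can be gathered with something like the following. This is only a sketch: `report_versions` is a hypothetical helper name, `/etc/redhat-release` is assumed to exist only on CentOS/RHEL-style systems, and the mpt2sas line prints "unknown" if the module is not installed or carries no version field.

```shell
#!/bin/sh
# Sketch: print the details usually asked for in a storage bug report.
report_versions() {
    echo "kernel: $(uname -r)"
    # /etc/redhat-release exists on CentOS/RHEL; skipped elsewhere.
    if [ -r /etc/redhat-release ]; then
        echo "distro: $(cat /etc/redhat-release)"
    fi
    # modinfo -F version prints the module's version field, if present.
    if command -v modinfo >/dev/null 2>&1; then
        echo "mpt2sas: $(modinfo -F version mpt2sas 2>/dev/null || echo unknown)"
    fi
}
```

Calling `report_versions` prints one labelled line per item, which can be pasted straight into a reply.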
Thanks,
Tao