From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Nix <nix@esperi.org.uk>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-scsi@vger.kernel.org,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	nick.cheng@areca.com.tw
Subject: Re: [SCSI REGRESSION] 3.10.2 or 3.10.3: arcmsr failure at bootup / early userspace transition
Date: Mon, 29 Jul 2013 16:16:57 +0200
Message-ID: <51F67959.2060803@fastmail.fm>
In-Reply-To: <87mwp5frdl.fsf@spindle.srvr.nix>

On 07/29/2013 03:05 PM, Nix wrote:
> On 29 Jul 2013, Bernd Schubert said:
>
>> Hi Nick,
>>
>> On 07/29/2013 12:10 PM, Nick Alcock wrote:
>>> arcmsr0: abort device command of scsi id = 0 lun = 1
>>> arcmsr0: abort device command of scsi id = 0 lun = 0
>>> arcmsr: executing bus reset eh.....num_resets=0, num_[...]
>>>
>>> arcmsr0: wait 'abort all outstanding command' timeout
>>> arcmsr0: executing hw bus reset ....
>>> arcmsr0: waiting for hw bus reset return, retry=0
>>> arcmsr0: waiting for hw bus reset return, retry=1
>>> Areca RAID Controller0: F/W V1.46 2009-01-06 & Model ARC-1210
>>> arcmsr: scsi  bus reset eh returns with success
>>> [and back to the top of the error messages again, apparently forever,
>>>    not that the machine would be much use without its RAID array even
>>>    if this loop terminated at some point, so I only gave it a couple
>>>    of minutes]
>>>
>>> The failure happens precisely at the moment we transition to early
>>> userspace, so presumably userspace I/O is failing (or something related
>>> to raw device access, perhaps, since the first thing it does is a
>>> vgscan).
>>>
>>> I haven't bisected yet (sorry, I have work to do which means this
>>> machine must be running right now), but nothing has changed in the
>>> arcmsr controller, nor in SCSI-land excepting
>>>
>>> commit 98dcc2946adbe4349ef1ef9b99873b912831edd4
>>> Author: Martin K. Petersen <martin.petersen@oracle.com>
>>> Date:   Thu Jun 6 22:15:55 2013 -0400
> [...]
>>> Obviously, at this point, this machine has no modules loaded (it has
>>> almost none loaded even when fully operational)
>>
>> I tested this patch with ARC-1260 and F/W V1.49, no issues. Also, this
>> patch is only in 3.10.3, but not yet in 3.10.1.
>
> ... and I see this problem with 3.10.3 but not 3.10.1. (Haven't tried
> 3.10.2.)

Hmm, indeed that points to this commit. I just don't see what could fail 
there.

Could you try to run these commands with 3.10.1?

# # check whether reporting supported opcodes works
# sg_opcodes -v -n /dev/sdX

# # check the ATA Information VPD page
# sg_vpd --page=0x89 /dev/sdX

>
>>                                                  And I don't think this
>> commit can cause your issue at all, a failing heuristics would enable
>> WRITE SAME and would cause issues with linux-md, but there shouldn't
>> happen anything directly in the scsi-layer. Which was your last
>> working kernel version?
>
> 3.10.1. :)

Whoops, sorry, I missed that in your first sentence.

>
> No changes to arcmsr between those versions... I suspect I'll have to
> bisect, which will be a complete pig because every failure means a hard
> powerdown of this box. Always-on servers rarely appreciate hard
> powerdowns :(
>

Maybe just revert this commit? Some SCSI logging would also help to see
which command actually fails. I guess you don't have a serial console?
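
For the SCSI logging mentioned above, a minimal sketch of how a logging
level could be computed. It assumes the kernel is built with
CONFIG_SCSI_LOGGING and that the 3-bit field offsets below match
drivers/scsi/scsi_logging.h in the tree being booted; both are assumptions
to verify, and the helper name scsi_logging_level() is made up for this
illustration. The resulting number can be passed on the kernel command line
as scsi_mod.scsi_logging_level=N, which matters here because the failure
happens before userspace could write /proc/sys/dev/scsi/logging_level.

```python
# Bit offsets of the 3-bit per-facility logging fields.
# ASSUMPTION: these mirror drivers/scsi/scsi_logging.h; check your tree.
SHIFTS = {
    "error": 0,
    "timeout": 3,
    "scan": 6,
    "mlqueue": 9,
    "mlcomplete": 12,
    "llqueue": 15,
    "llcomplete": 18,
    "hlqueue": 21,
    "hlcomplete": 24,
    "ioctl": 27,
}


def scsi_logging_level(**levels):
    """Pack per-facility verbosity levels (0..7) into one integer."""
    mask = 0
    for name, level in levels.items():
        if name not in SHIFTS:
            raise ValueError(f"unknown logging facility: {name}")
        if not 0 <= level <= 7:
            raise ValueError(f"level for {name} must be in 0..7")
        mask |= level << SHIFTS[name]
    return mask


if __name__ == "__main__":
    # Error handling, timeouts, and mid-layer queue/complete at level 3,
    # which should show the commands being issued and how they finish.
    level = scsi_logging_level(error=3, timeout=3, mlqueue=3, mlcomplete=3)
    print(f"scsi_mod.scsi_logging_level={level}")
```

Without a serial console the output would have to be read off the screen or
captured some other way, so keeping the set of enabled facilities small is
probably wise.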


Thanks,
Bernd

