public inbox for linux-scsi@vger.kernel.org
From: Hannes Reinecke <hare@suse.de>
To: Shyam_Iyer@Dell.com
Cc: realrichardsharpe@gmail.com, linux-scsi@vger.kernel.org
Subject: Re: [LSF/MM Topic] SCSI Unit Attention Handling
Date: Wed, 09 Feb 2011 16:44:07 +0100
Message-ID: <4D52B647.9060601@suse.de>
In-Reply-To: <DBFB1B45AF80394ABD1C807E9F28D1570264CAA99E@BLRX7MCDC203.AMER.DELL.COM>

Hi all,

On 02/09/2011 06:02 AM, Shyam_Iyer@Dell.com wrote:
[ .. ]
> 
> I get you there. The ioctl implementation to inform the driver will
> not plug the storage from sending the UAs.
> 
> The LUN could be multipathed so then if you have UAs come through
> one sdX path and not through the other that is adding complication.
> Also, if you are using persistent reservations and one of the path
> goes down the UAs could be going to a path that has been excluded.
> We are introducing scenarios for bugs here.
> 
Hence using debugfs; with this we would get an entire
configfs space for free, which would allow us to set this kind of thing.
ioctls are evil. Avoid at all costs.

>>
>>> Even if registering for UAs per vendor was envisioned there are
>>> scenarios that can cause a flurry of UAs too..
>>> (I initially opined to have a vendor specific implementation of
>>> logging scsi_netlink events from the scsi_device handler,
>>> it was gloriously shot down ;-))
>>>
>>> Consider this scenario..
>>>
>>> Above water mark.. --> Unit Attention
>>> Discard to free up space
>>> Below water mark ... -> Unit Attention
>>>
>>> Consider a ripple scenario where this repeats..
>>> (Although this can not happen too often it is very much akin to a
>>> thrashing scenario)
>>>
>>> The UA should be hints for the filesystem to optimize online. Here is
>>> where the thin profile can reduce the UAs.
>>>
>>> Also, you delete a file - select a good age time to discard the
>>> associated blocks(debatable and worth any good algorithm writer's
>>> salt).
>>> Now I am not sure if the filesystem should run an inkernel thread to
>>> do this profile management..
>>>
>>>> It might be more useful to allow user-land utilities to perform the
>>>> re-scanning.
>>>>
>>>> I would imagine that you will get unit attentions saying that
>>>> REPORTED LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would
>>>> you get?
>>>> If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.
>>>>
>>>> Perhaps there is also a need to say things like, for these ASC/ASCQ
>>>> values, take the device off line, and all the rest are just advisory
>>>> but pass them all to user land as well.
>>>>
>>>
>>> This is a kind of policy that needs to go into the thin profile
>>> although Storage Arrays do take the device offline on reaching
>>> certain hard limits there is nothing like mounting a filesystem read-
>>> only ;-)
>>
>> Well, yes, but Ext3/4 and XFS tend to remount the fs RO when writes to
>> the journal fail as well because the SCSI stack takes the device
>> offline :-(
>>
>> If the device has lied in its response to a READ_CAPACITY or
>> READ_CAPACITY16 that is hard to prevent unless the file system has the
>> concept of a lying reserve ...
> The lying reserve is again a profile/policy setting aka like a SWAP concept.
> 
> If the device has lied in either READ_CAPACITY_16 or GET_LBA_STATUS..
> then we are anyways not consistent to the tee on the profile. Putting
> my open-source hat on that is a Carrot and stick bait.

Quite. Currently we know of about three events / event classes which
need to be handled:

REPORTED LUNS DATA HAS CHANGED
CAPACITY DATA HAS CHANGED
thin provisioning water mark warnings

Everything else is pretty much handled by the SCSI stack nowadays
anyway.
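
To make the three classes concrete, here is a minimal userspace sketch
that maps fixed-format sense data to an event class via a lookup table.
The enum, the table and ua_classify() are hypothetical names for
illustration, not existing kernel interfaces. The ASC/ASCQ pairs are the
SPC/SBC assignments; for the water mark case only THIN PROVISIONING SOFT
THRESHOLD REACHED (38h/07h) is listed, which is an assumption about which
warning is meant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event classes for the three cases named in the mail. */
enum ua_event {
    UA_EVT_NONE = 0,          /* not a unit attention handled here */
    UA_EVT_LUNS_CHANGED,      /* REPORTED LUNS DATA HAS CHANGED */
    UA_EVT_CAPACITY_CHANGED,  /* CAPACITY DATA HAS CHANGED */
    UA_EVT_THIN_SOFT_THRESH   /* THIN PROVISIONING SOFT THRESHOLD REACHED */
};

#define SENSE_KEY_UNIT_ATTENTION 0x06

struct ua_code {
    uint8_t asc;
    uint8_t ascq;
    enum ua_event evt;
};

/* Adding a further sense code means adding one row here. */
static const struct ua_code ua_codes[] = {
    { 0x3f, 0x0e, UA_EVT_LUNS_CHANGED },
    { 0x2a, 0x09, UA_EVT_CAPACITY_CHANGED },
    { 0x38, 0x07, UA_EVT_THIN_SOFT_THRESH },
};

/*
 * Classify fixed-format sense data (response code 70h or 71h).
 * Descriptor-format sense data is not handled in this sketch.
 */
enum ua_event ua_classify(const uint8_t *sense, size_t len)
{
    assert(sense != NULL);

    if (len < 14)
        return UA_EVT_NONE;   /* too short to carry ASC/ASCQ */

    uint8_t resp = sense[0] & 0x7f;
    if (resp != 0x70 && resp != 0x71)
        return UA_EVT_NONE;

    if ((sense[2] & 0x0f) != SENSE_KEY_UNIT_ATTENTION)
        return UA_EVT_NONE;

    for (size_t i = 0; i < sizeof(ua_codes) / sizeof(ua_codes[0]); i++) {
        if (ua_codes[i].asc == sense[12] && ua_codes[i].ascq == sense[13])
            return ua_codes[i].evt;
    }
    return UA_EVT_NONE;
}
```

Anything that does not match a table row falls through as UA_EVT_NONE,
i.e. it is left to whatever the SCSI stack already does with it.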

However, currently we don't handle them at all and hence don't have
any experience as to how often they would occur, which would be
pretty much vendor-specific anyway.
So we need to design something which is
a) capable of handling even a large number of events
b) selectable per device
c) modular enough to have further sense codes added

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

