public inbox for linux-scsi@vger.kernel.org
From: John Meneghini <jmeneghi@redhat.com>
To: "Kai Mäkisara" <Kai.Makisara@kolumbus.fi>, linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com,
	James.Bottomley@HansenPartnership.com, loberman@redhat.com
Subject: Re: [PATCH v2 0/4] scsi: st: scsi_error: More reset patches
Date: Wed, 11 Dec 2024 16:57:16 -0500
Message-ID: <0c6e699b-8f77-411f-b73d-e6762c6ad286@redhat.com>
In-Reply-To: <20241125140301.3912-1-Kai.Makisara@kolumbus.fi>

Sorry it has taken me so long to get back to this....

I've tested these patches with both my tape drive and with scsi_debug tape emulation.

See:

   https://github.com/johnmeneghini/tape_tests

All hardware tests are passing and everything is working as expected with the tape drive tests, but the power-on reset behavior
of the scsi_debug test is still showing some strangeness.

  https://github.com/johnmeneghini/tape_tests/blob/master/tape_reset_debug.sh

Specifically, every time you reload the scsi_debug driver the SCSI mid layer clears the POR UA. If I am not mistaken, your 
intention with adding the counters for ua_new_media_ctr and ua_por_ctr to the mid layer was to catch these events and report 
them to the upper layer driver.

Here's what the scsi_debug test does:

[tape_tests]# ./tape_reset_debug.sh 1 3 1 1

modprobe -r scsi_debug
modprobe scsi_debug tur_ms_to_ready=10000 ptype=1  max_luns=1 dev_size_mb=10000

[Wed Dec 11 15:35:48 2024] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1)
[Wed Dec 11 15:35:48 2024] scsi host8: scsi_debug: version 0191 [20210520]
                              dev_size_mb=10000, opts=0x0, submit_queues=1, statistics=0
[Wed Dec 11 15:35:48 2024] scsi 8:0:0:0: Sequential-Access Linux    scsi_debug       0191 PQ: 0 ANSI: 7
[Wed Dec 11 15:35:48 2024] scsi 8:0:0:0: Power-on or device reset occurred

                                            ^^^^^^^^^^^^^^^^^^^^^^
                        Here's where the scsi layer is clearing the POR UA.

[Wed Dec 11 15:35:48 2024] st 8:0:0:0: Attached scsi tape st1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: st1: try direct i/o: yes (alignment 4 B)
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: Attached scsi generic sg3 type 1
[0:0:0:0]    disk    ATA      Samsung SSD 840  4B0Q  /dev/sda   3500253855022021d  /dev/sg0
[7:0:0:0]    tape    QUANTUM  ULTRIUM 4        U53F  /dev/st0   -  /dev/sg1
[7:0:1:0]    enclosu LSI      virtualSES       02    -          -  /dev/sg2
[8:0:0:0]    tape    Linux    scsi_debug       0191  /dev/st1   -  /dev/sg3
[N:0:0:1]    disk    INTEL SSDPEDMW400G4__1                     /dev/nvme0n1  -

  Check the status

mt -f /dev/nst1 status
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 0
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (10000):
  IM_REP_EN
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Sense Key : Not Ready [current]
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Add. Sense: Logical unit is in process of becoming ready
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 2

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
st_chk_result was run here... but it looks like scsi_get_ua_por_ctr(STp->device) didn't report the first POR UA.

[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] check_tape: 1141: CHKRES_NOT_READY pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] st_flush: 1404: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 1

  Sleeping for 20 seconds

  Check the status

mt -f /dev/nst1 status
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Can't read block limits.
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (1010000):
  ONLINE IM_REP_EN

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A second mt status command is done after the tape is ready...
So it looks like the initial POR UA is never recorded in ua_por_ctr.

Following this, resetting the device works.

sg_reset --target /dev/nst1
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Can't read block limits.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Error: 402, cmd: 1a 0 0 0 c 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Add. Sense: Invalid field in cdb
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                   scsi mid layer has NOT set was_reset.

[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] No Mode Sense.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Block size: 0, buffer size: 4096 (1 blocks).
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] check_tape: 1282: CHKRES_READY pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_flush: 1404: pos_unknown 0 was_reset 1/1 ready 0
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                   We see the scsi mid layer sets was_reset here.

[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 1/1 ready 0

  Sleeping for 5 seconds
  Check the status

  mt -f /dev/nst1 status
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 1/1 ready 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: Power-on or device reset occurred
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Power on/reset recognized.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here's your code: scsi_get_ua_por_ctr(STp->device) found the reset here.

[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Unit Attention [current]
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Add. Sense: Scsi bus reset occurred
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 2
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            Position unknown is now set.

[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 1026
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Can't read block limits.
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 402, cmd: 1a 0 0 0 c 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
SCSI 2 tape drive:

All in all, this is no different from what we have without your patches, so I have no problem approving this change.

One thing that did get fixed by this patch series is I can now use the sg device to reset the scsi_debug driver.

This is a good improvement.  I can now run my new script.

https://github.com/johnmeneghini/tape_tests/blob/master/tape_reset_debug_sg.sh

sg_reset --target /dev/sg3

  Sleeping for 5 seconds
  Check the status

  mt -f /dev/nst1 status
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 1/1 ready 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: Power-on or device reset occurred
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Power on/reset recognized.

  This didn't work before your change.

[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Sense Key : Unit Attention [current]
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Add. Sense: Scsi bus reset occurred
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 2
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 1026
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Can't read block limits.
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (1010000):
  ONLINE IM_REP_EN

So all in all I think this is an improvement... I'd like to ask Martin to merge these changes in v6.13.

/John

P.S. There are still many issues with the scsi_debug tape emulation. See my test results for more information about how the
scsi_debug tape emulation tests are failing at:

    https://bugzilla.kernel.org/show_bug.cgi?id=219419#c21

On 11/25/24 09:02, Kai Mäkisara wrote:
> This set applies to 6.12 + the three patches accepted earlier (and in
> linux-next).
> 
> The first patch re-applies after device reset some settings changed
> by the user (partition, density, block size). This is the same as in v1.
> 
> The second and third patch address the case where more than one ULD
> access the same device. The Unit Attention (UA) sense data is sent only
> to one ULD and the others miss it. The st driver needs to find out
> if device reset or media change has happened.
> 
> The second patch adds counters for New Media and Power On/Reset (POR)
> Unit Attentions to the scsi_device struct. The third one changes st
> so that these are used: if the value in the scsi_device struct does
> not match the one stored locally, the corresponding UA has happened.
> Use of the was_reset flag has been removed.
> 
> The fourth patch adds a file to sysfs to tell the user if reads/writes
> to a tape have been blocked following a device reset.
> ---
> Changes since V1:
> - replace the patch removing was_reset handling with patches two and three
> - add sysfs file reset_blocked
> 
> Kai Mäkisara (4):
>    scsi: st: Restore some drive settings after reset
>    scsi: scsi_error: Add counters for New Media and Power On/Reset UNIT
>      ATTENTIONs
>    scsi: st: Modify st.c to use the new scsi_error counters
>    scsi: st: Add sysfs file reset_blocked
> 
>   Documentation/scsi/st.rst  |  5 +++
>   drivers/scsi/scsi_error.c  | 12 +++++++
>   drivers/scsi/st.c          | 73 +++++++++++++++++++++++++++++++++-----
>   drivers/scsi/st.h          |  6 ++++
>   include/scsi/scsi_device.h |  9 +++++
>   5 files changed, 97 insertions(+), 8 deletions(-)
> 


