From: John Meneghini <jmeneghi@redhat.com>
To: "Kai Mäkisara" <Kai.Makisara@kolumbus.fi>, linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com,
James.Bottomley@HansenPartnership.com, loberman@redhat.com
Subject: Re: [PATCH v2 0/4] scsi: st: scsi_error: More reset patches
Date: Wed, 11 Dec 2024 16:57:16 -0500
Message-ID: <0c6e699b-8f77-411f-b73d-e6762c6ad286@redhat.com>
In-Reply-To: <20241125140301.3912-1-Kai.Makisara@kolumbus.fi>
Sorry it has taken me so long to get back to this.
I've tested these patches with both my tape drive and with scsi_debug tape emulation.
See:
https://github.com/johnmeneghini/tape_tests
All hardware tests are passing and everything works as expected with the tape drive tests, but the power-on reset behavior
of the scsi_debug test still shows some strangeness.
https://github.com/johnmeneghini/tape_tests/blob/master/tape_reset_debug.sh
Specifically, every time you reload the scsi_debug driver the SCSI mid layer clears the POR UA. If I am not mistaken, your
intention with adding the counters for ua_new_media_ctr and ua_por_ctr to the mid layer was to catch these events and report
them to the upper layer driver.
Here's what the scsi_debug test does:
[tape_tests]# ./tape_reset_debug.sh 1 3 1 1
modprobe -r scsi_debug
modprobe scsi_debug tur_ms_to_ready=10000 ptype=1 max_luns=1 dev_size_mb=10000
[Wed Dec 11 15:35:48 2024] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1)
[Wed Dec 11 15:35:48 2024] scsi host8: scsi_debug: version 0191 [20210520]
dev_size_mb=10000, opts=0x0, submit_queues=1, statistics=0
[Wed Dec 11 15:35:48 2024] scsi 8:0:0:0: Sequential-Access Linux scsi_debug 0191 PQ: 0 ANSI: 7
[Wed Dec 11 15:35:48 2024] scsi 8:0:0:0: Power-on or device reset occurred
^^^^^^^^^^^^^^^^^^^^^^
Here's where the scsi layer is clearing the POR UA.
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: Attached scsi tape st1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: st1: try direct i/o: yes (alignment 4 B)
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: Attached scsi generic sg3 type 1
[0:0:0:0] disk ATA Samsung SSD 840 4B0Q /dev/sda 3500253855022021d /dev/sg0
[7:0:0:0] tape QUANTUM ULTRIUM 4 U53F /dev/st0 - /dev/sg1
[7:0:1:0] enclosu LSI virtualSES 02 - - /dev/sg2
[8:0:0:0] tape Linux scsi_debug 0191 /dev/st1 - /dev/sg3
[N:0:0:1] disk INTEL SSDPEDMW400G4__1 /dev/nvme0n1 -
Check the status
mt -f /dev/nst1 status
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 0
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (10000):
IM_REP_EN
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Sense Key : Not Ready [current]
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] Add. Sense: Logical unit is in process of becoming ready
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
st_chk_result was run here... but it looks like scsi_get_ua_por_ctr(STp->device) didn't report the first POR UA.
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] check_tape: 1141: CHKRES_NOT_READY pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] st_flush: 1404: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:35:48 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 1
Sleeping for 20 seconds
Check the status
mt -f /dev/nst1 status
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 1
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
[Wed Dec 11 15:36:08 2024] st 8:0:0:0: [st1] Can't read block limits.
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (1010000):
ONLINE IM_REP_EN
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A second mt status command is done after the tape is ready...
So it looks like the initial POR UA is never recorded in ua_por_ctr.
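
One possible explanation, which is only a guess and is not confirmed by the logs above: if the ULD takes its first snapshot of the counter when it attaches, a POR UA that the mid layer already handled during scanning (the "Power-on or device reset occurred" message appears before "Attached scsi tape st1") would never show up as a difference. The user-space sketch below illustrates that ordering effect; all names other than ua_por_ctr are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical device-side counter, modeled on ua_por_ctr from the series. */
struct fake_sdev {
	unsigned int ua_por_ctr;
};

/* Hypothetical ULD state holding the last counter value it has seen. */
struct fake_uld {
	const struct fake_sdev *sdev;
	unsigned int por_ctr_seen;
};

/* Assumption: the ULD copies the current counter when it attaches. */
static void uld_attach(struct fake_uld *uld, const struct fake_sdev *sdev)
{
	uld->sdev = sdev;
	uld->por_ctr_seen = sdev->ua_por_ctr;
}

/* True if the counter moved since the last snapshot; resyncs either way. */
static bool uld_por_pending(struct fake_uld *uld)
{
	bool moved = uld->por_ctr_seen != uld->sdev->ua_por_ctr;

	uld->por_ctr_seen = uld->sdev->ua_por_ctr;
	return moved;
}
```

In this model an increment that happens before uld_attach() is invisible to uld_por_pending(), while any later increment is reported once.
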
Following this, resetting the device works.
sg_reset --target /dev/nst1
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Can't read block limits.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Error: 402, cmd: 1a 0 0 0 c 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Add. Sense: Invalid field in cdb
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 0 was_reset 0/0 ready 0, result 1026
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
scsi mid layer has NOT set was_reset.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] No Mode Sense.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] Block size: 0, buffer size: 4096 (1 blocks).
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] check_tape: 1282: CHKRES_READY pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 0/0 ready 0
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] st_flush: 1404: pos_unknown 0 was_reset 1/1 ready 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We see the scsi mid layer sets was_reset here.
[Wed Dec 11 15:36:09 2024] st 8:0:0:0: [st1] flush_buffer: 852: pos_unknown 0 was_reset 1/1 ready 0
Sleeping for 5 seconds
Check the status
mt -f /dev/nst1 status
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 1/1 ready 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: Power-on or device reset occurred
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Power on/reset recognized.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here's your code: scsi_get_ua_por_ctr(STp->device) found the reset here.
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Unit Attention [current]
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Add. Sense: Scsi bus reset occurred
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Position unknown is now set.
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 1026
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Can't read block limits.
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Error: 402, cmd: 1a 0 0 0 c 0
[Wed Dec 11 15:36:14 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
SCSI 2 tape drive:
All in all, this is no different from what we have without your patches, so I have no problem approving this change.
One thing that did get fixed by this patch series is I can now use the sg device to reset the scsi_debug driver.
This is a good improvement. I can now run my new script.
https://github.com/johnmeneghini/tape_tests/blob/master/tape_reset_debug_sg.sh
sg_reset --target /dev/sg3
Sleeping for 5 seconds
Check the status
mt -f /dev/nst1 status
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] check_tape: 1082: pos_unknown 0 was_reset 1/1 ready 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: Power-on or device reset occurred
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Power on/reset recognized.
This didn't work before your change.
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Error: 2, cmd: 0 0 0 0 0 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Sense Key : Unit Attention [current]
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Add. Sense: Scsi bus reset occurred
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 2
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Error: 402, cmd: 5 0 0 0 0 0
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Sense Key : Illegal Request [current]
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Add. Sense: Invalid command operation code
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] st_chk_result: 432: pos_unknown 1 was_reset 1/1 ready 0, result 1026
[Wed Dec 11 16:03:39 2024] st 8:0:0:0: [st1] Can't read block limits.
SCSI 2 tape drive:
File number=-1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x0 (default).
Soft error count since last status=0
General status bits on (1010000):
ONLINE IM_REP_EN
So all in all I think this is an improvement... I'd like to ask Martin to merge these changes in v6.13.
/John
P.S. There are still many issues with the scsi_debug tape emulation. See my test results for more information about how the
scsi_debug tape emulation tests are failing at:
https://bugzilla.kernel.org/show_bug.cgi?id=219419#c21
On 11/25/24 09:02, Kai Mäkisara wrote:
> This set applies to 6.12 + the three patches accepted earlier (and in
> linux-next).
>
> The first patch re-applies after device reset some settings changed
> by the user (partition, density, block size). This is the same as in v1.
>
> The second and third patch address the case where more than one ULD
> access the same device. The Unit Attention (UA) sense data is sent only
> to one ULD and the others miss it. The st driver needs to find out
> if device reset or media change has happened.
>
> The second patch adds counters for New Media and Power On/Reset (POR)
> Unit Attentions to the scsi_device struct. The third one changes st
> so that these are used: if the value in the scsi_device struct does
> not match the one stored locally, the corresponding UA has happened.
> Use of the was_reset flag has been removed.
>
> The fourth patch adds a file to sysfs to tell the user if reads/writes
> to a tape have been blocked following a device reset.
> ---
> Changes since V1:
> - replace the patch removing was_reset handling with patches two and three
> - add sysfs file reset_blocked
>
> Kai Mäkisara (4):
> scsi: st: Restore some drive settings after reset
> scsi: scsi_error: Add counters for New Media and Power On/Reset UNIT
> ATTENTIONs
> scsi: st: Modify st.c to use the new scsi_error counters
> scsi: st: Add sysfs file reset_blocked
>
> Documentation/scsi/st.rst | 5 +++
> drivers/scsi/scsi_error.c | 12 +++++++
> drivers/scsi/st.c | 73 +++++++++++++++++++++++++++++++++-----
> drivers/scsi/st.h | 6 ++++
> include/scsi/scsi_device.h | 9 +++++
> 5 files changed, 97 insertions(+), 8 deletions(-)
>