* [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives
@ 2026-06-22 18:28 TJ Adams
2026-06-24 14:20 ` Niklas Cassel
0 siblings, 1 reply; 4+ messages in thread
From: TJ Adams @ 2026-06-22 18:28 UTC (permalink / raw)
To: Damien Le Moal, Niklas Cassel, Hannes Reinecke,
Martin K. Petersen
Cc: linux-ide, linux-kernel, Igor Pylypiv, Salomon Dushimirimana,
TJ Adams
Commit 91842ed844a0 ("ata: libata-core: Set capacity to zero for a
security locked drive") introduced setting the device capacity (n_sectors)
to zero in ata_dev_configure() if the drive is security locked.
However, during runtime revalidation, ata_dev_revalidate() compares the
new capacity (now 0) with the old capacity (>0) and detects a mismatch.
Since it does not consider the locked status, it returns -ENODEV.
This revalidation failure during Error Handling (EH) causes libata to
eventually disable the device, offlining the SCSI device and resulting
in subsequent accesses returning -ENXIO (No such device or address).
Fix this by allowing the capacity transition to
zero in ata_dev_revalidate() if the drive is reported as security locked by
ata_id_is_locked().
Fixes: 91842ed844a0 ("ata: libata-core: Set capacity to zero for a security locked drive")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Terrence Adams <tadamsjr@google.com>
---
drivers/ata/libata-core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 3b6243f0f91e..d090667979ca 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3974,7 +3974,8 @@ int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
/* verify n_sectors hasn't changed */
if (dev->class != ATA_DEV_ATA || !n_sectors ||
- dev->n_sectors == n_sectors)
+ dev->n_sectors == n_sectors ||
+ (dev->n_sectors == 0 && ata_id_is_locked(dev->id)))
return 0;
/* n_sectors has changed */
--
2.55.0.rc0.786.g65d90a0328-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives 2026-06-22 18:28 [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives TJ Adams @ 2026-06-24 14:20 ` Niklas Cassel 2026-07-01 23:16 ` TJ Adams 0 siblings, 1 reply; 4+ messages in thread From: Niklas Cassel @ 2026-06-24 14:20 UTC (permalink / raw) To: TJ Adams Cc: Damien Le Moal, Hannes Reinecke, Martin K. Petersen, linux-ide, linux-kernel, Igor Pylypiv, Salomon Dushimirimana On Mon, Jun 22, 2026 at 11:28:44AM -0700, TJ Adams wrote: > Commit 91842ed844a0 ("ata: libata-core: Set capacity to zero for a > security locked drive") introduced setting the device capacity (n_sectors) > to zero in ata_dev_configure() if the drive is security locked. > > However, during runtime revalidation, ata_dev_revalidate() compares the > new capacity (now 0) with the old capacity (>0) and detects a mismatch. > Since it does not consider the locked status, it returns -ENODEV. When booting with a security locked drive (that is not the boot device): ata_eh_revalidate_and_attach() -> ata_dev_read_id() -> ata_dev_configure() <- sets dev->n_sectors to 0 sets ATA_EHI_SETMODE ata_dev_set_mode() -> ata_dev_revalidate() ata_dev_revalidate() sets local variable n_sectors to dev->n_sectors -> ata_dev_reread_id() -> ata_dev_read_id() -> ata_dev_configure() will one again set dev->n_sectors to 0 /* verify n_sectors hasn't changed */ if (dev->class != ATA_DEV_ATA || !n_sectors || dev->n_sectors == n_sectors) return 0; Please explain how you reproduce this. As far as I can see the local n_sectors variable will be 0, and the function call to ata_dev_configure() will set dev->n_sectors to 0, so this should, AFAICT, never get a "n_sectors mismatch" print. I even tried to reproduce with doing a suspend + resume (using QEMU), such that ata_dev_revalidate() will be called again: $ sudo dmesg | grep -E "locked|suspend|disable device|ata_dev_revalidate" [ 0.988248] ata2.00: Security locked, setting capacity to zero [ 0.993442] ata1.00: debug print: ata_dev_revalidate [ 0.994729] ata2.00: debug print: ata_dev_revalidate [ 0.995918] ata2.00: Security locked, setting capacity to zero [ 40.239682] PM: suspend entry (deep) [ 40.300128] PM: suspend devices took 0.020 seconds [ 45.738412] PM: suspend exit [ 45.985092] ata2.00: debug print: ata_dev_revalidate [ 45.986698] ata2.00: Security locked, setting capacity to zero [ 45.988530] ata1.00: debug print: ata_dev_revalidate [ 45.993331] ata1.00: debug print: ata_dev_revalidate [ 45.994525] ata2.00: debug print: ata_dev_revalidate [ 45.995194] ata2.00: Security locked, setting capacity to zero Since you seem to state that the old capacity (local variable n_sectors) is > 0, it seems like the device wasn't locked during the initial boot / ata_dev_configure() call. Are you perhaps unlocking the device in firmware, so it is unlocked at boot, and then when you suspend, you lose power to the drive, so when you resume, the drive is locked? Kind regards, Niklas ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives 2026-06-24 14:20 ` Niklas Cassel @ 2026-07-01 23:16 ` TJ Adams 2026-07-02 15:43 ` Niklas Cassel 0 siblings, 1 reply; 4+ messages in thread From: TJ Adams @ 2026-07-01 23:16 UTC (permalink / raw) To: Niklas Cassel Cc: Damien Le Moal, Hannes Reinecke, Martin K. Petersen, linux-ide, linux-kernel, Igor Pylypiv, Salomon Dushimirimana On Wed, Jun 24, 2026 at 7:21 AM Niklas Cassel <cassel@kernel.org> wrote: > > On Mon, Jun 22, 2026 at 11:28:44AM -0700, TJ Adams wrote: > > Commit 91842ed844a0 ("ata: libata-core: Set capacity to zero for a > > security locked drive") introduced setting the device capacity (n_sectors) > > to zero in ata_dev_configure() if the drive is security locked. > > > > However, during runtime revalidation, ata_dev_revalidate() compares the > > new capacity (now 0) with the old capacity (>0) and detects a mismatch. > > Since it does not consider the locked status, it returns -ENODEV. Hey Niklas, Sorry for the delay in response. I had some difficulty recreating the issue outside of our specific test suite. > Please explain how you reproduce this. > As far as I can see the local n_sectors variable will be 0, > and the function call to ata_dev_configure() will set dev->n_sectors > to 0, so this should, AFAICT, never get a "n_sectors mismatch" print. Here are the steps to recreate it: ```bash TARGET_DEV="/dev/sdy" TARGET_NAME="sdy" # Lock Device hdparm --user-master u --security-set-pass TempPassword "$TARGET_DEV" # Disable SSP sg_raw "$TARGET_DEV" 85 06 00 00 90 00 06 00 00 00 00 00 00 00 ef 00 # In another terminal set active I/O dd if="$TARGET_DEV" of=/dev/null bs=1M status=progress # Do an active reset (then wait for scsi eh) echo 1 > /sys/class/sas_phy/phy-X:Y/hard_reset # You can end the dd command in the other terminal and then run this: # You should see it fail immediately dd if="$TARGET_DEV" of=/dev/null bs=512 count=1 ``` Some things to note: - Disabling SSP. I had some attempts to recreate the bug but didn't realize SSP was enabled. I would lock the drive but on reset the drive would actually come back up unlocked, masking the bug. - The active I/O. This is so there are I/Os in flight whenever the drive gets reset. They should timeout and then trigger the error handling path which will cause the ata device revalidation. Without it the revalidation might not occur and you might not see the bug. I also experienced this. > Since you seem to state that the old capacity (local variable n_sectors) > is > 0, it seems like the device wasn't locked during the initial boot / > ata_dev_configure() call. Yeah that's correct. To recreate the issue you should boot unlocked but lock at runtime. Also here is a dmesg snippet showing the capacity mismatch: ```dmesg [76834.784699] ata36.00: status: { DRDY } [76834.784708] ata36: hard resetting link [76834.940530] ata36.00: supports DRM functions and may not be fully accessible [76834.940537] ata36.00: Security locked, setting capacity to zero [76834.943566] ata36.00: n_sectors mismatch 15628053168 != 0 ``` Let me know if I didn't explain something well or if something is missing. Thank you! Best Regards, TJ Adams ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives 2026-07-01 23:16 ` TJ Adams @ 2026-07-02 15:43 ` Niklas Cassel 0 siblings, 0 replies; 4+ messages in thread From: Niklas Cassel @ 2026-07-02 15:43 UTC (permalink / raw) To: TJ Adams Cc: Damien Le Moal, Hannes Reinecke, Martin K. Petersen, linux-ide, linux-kernel, Igor Pylypiv, Salomon Dushimirimana Hello TJ, On Wed, Jul 01, 2026 at 04:16:49PM -0700, TJ Adams wrote: > On Wed, Jun 24, 2026 at 7:21 AM Niklas Cassel <cassel@kernel.org> wrote: > > > > On Mon, Jun 22, 2026 at 11:28:44AM -0700, TJ Adams wrote: > > > Commit 91842ed844a0 ("ata: libata-core: Set capacity to zero for a > > > security locked drive") introduced setting the device capacity (n_sectors) > > > to zero in ata_dev_configure() if the drive is security locked. > > > > > > However, during runtime revalidation, ata_dev_revalidate() compares the > > > new capacity (now 0) with the old capacity (>0) and detects a mismatch. > > > Since it does not consider the locked status, it returns -ENODEV. > > > Hey Niklas, > > Sorry for the delay in response. I had some difficulty recreating the > issue outside of our specific test suite. > > > Please explain how you reproduce this. > > As far as I can see the local n_sectors variable will be 0, > > and the function call to ata_dev_configure() will set dev->n_sectors > > to 0, so this should, AFAICT, never get a "n_sectors mismatch" print. > > Here are the steps to recreate it: > > ```bash > TARGET_DEV="/dev/sdy" > TARGET_NAME="sdy" > > # Lock Device > hdparm --user-master u --security-set-pass TempPassword "$TARGET_DEV" > > # Disable SSP > sg_raw "$TARGET_DEV" 85 06 00 00 90 00 06 00 00 00 00 00 00 00 ef 00 > > # In another terminal set active I/O > dd if="$TARGET_DEV" of=/dev/null bs=1M status=progress > > # Do an active reset (then wait for scsi eh) > echo 1 > /sys/class/sas_phy/phy-X:Y/hard_reset > > # You can end the dd command in the other terminal and then run this: > # You should see it fail immediately > dd if="$TARGET_DEV" of=/dev/null bs=512 count=1 > ``` > > Some things to note: > - Disabling SSP. I had some attempts to recreate the bug but didn't > realize SSP was enabled. I would lock the drive but on reset the > drive would actually come back up unlocked, masking the bug. > - The active I/O. This is so there are I/Os in flight whenever the > drive gets reset. They should timeout and then trigger the error > handling path which will cause the ata device revalidation. Without > it the revalidation might not occur and you might not see the bug. > I also experienced this. > > > Since you seem to state that the old capacity (local variable n_sectors) > > is > 0, it seems like the device wasn't locked during the initial boot / > > ata_dev_configure() call. > > Yeah that's correct. To recreate the issue you should boot unlocked but > lock at runtime. > > Also here is a dmesg snippet showing the capacity mismatch: > > ```dmesg > [76834.784699] ata36.00: status: { DRDY } > [76834.784708] ata36: hard resetting link > [76834.940530] ata36.00: supports DRM functions and may not be fully > accessible > [76834.940537] ata36.00: Security locked, setting capacity to zero > [76834.943566] ata36.00: n_sectors mismatch 15628053168 != 0 > ``` > > Let me know if I didn't explain something well or if something is > missing. Thank you! Ok, thank you for the information. Perhaps mention more clearly in the commit message that this happens when doing a reset of the PHY for a controller that has I/Os in flight. I think a simpler patch would be: @@ -3959,7 +3959,7 @@ int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class, /* verify n_sectors hasn't changed */ if (dev->class != ATA_DEV_ATA || !n_sectors || - dev->n_sectors == n_sectors) + dev->n_sectors == n_sectors || ata_id_is_locked(dev->id)) return 0; /* n_sectors has changed */ I don't see why we need the additional dev->n_sectors == 0. If the drive is locked, no need to contiunue, just return. Perhaps you could test that change instead? Also, I think we should have a patch 1/2 (or 2/2) that does: @@ -1338,7 +1338,7 @@ static int ata_hpa_resize(struct ata_device *dev) /* do we need to do it? */ if ((dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC) || !ata_id_has_lba(dev->id) || !ata_id_hpa_enabled(dev->id) || - (dev->quirks & ATA_QUIRK_BROKEN_HPA)) + (dev->quirks & ATA_QUIRK_BROKEN_HPA) || ata_id_is_locked(dev->id)) return 0; /* read native max address */ To fix the problem Sashiko complained about. Kind regards, Niklas ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-07-02 15:44 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-22 18:28 [PATCH] ata: libata-core: Allow capacity transition to zero for locked drives TJ Adams 2026-06-24 14:20 ` Niklas Cassel 2026-07-01 23:16 ` TJ Adams 2026-07-02 15:43 ` Niklas Cassel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox