* smartctl causing HSM violation on sata_nv, 2.6.18
From: Jim Paris @ 2006-09-27 18:33 UTC
To: Tejun Heo; +Cc: linux-ide

Hi Tejun,

My NVIDIA SATA controller is having some problems with smartctl on
2.6.18 (+ the previously mentioned sata_nv patch).  If I try to enable
Attribute Autosave (smartctl -S on) or Automatic Offline (smartctl -o
on), the controller craps out (but recovers).  Executing the same
command on an identical disk connected to a SiI3132 works fine.  Other
SMART stuff (reading attributes, running self-tests) seems to be
behaving just fine.

-jim

### sata_nv controller (CK804):

# smartctl -data -S on /dev/disk/by-path/pci-0000:00:07.0-scsi-0:0:0:0
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Enable Auto-save failed: Input/output error
Smartctl: SMART Enable Attribute Autosave Failed.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

### sata_sil24 controller (SiI3132):

# smartctl -data -S on /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:0:0
smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Attribute Autosave Enabled.

### Kernel log for NVIDIA case:

[36911.153208] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36911.153245] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36911.462381] ata1: soft resetting port
[36911.618322] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36911.620269] ata1.00: configured for UDMA/133
[36911.620277] ata1: EH complete
[36911.620410] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36911.620442] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36911.930163] ata1: soft resetting port
[36912.086097] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36912.087984] ata1.00: configured for UDMA/133
[36912.087996] ata1: EH complete
[36912.088126] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36912.088158] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36912.397930] ata1: soft resetting port
[36912.553871] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36912.555790] ata1.00: configured for UDMA/133
[36912.555801] ata1: EH complete
[36912.555931] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36912.555963] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36912.865705] ata1: soft resetting port
[36913.021646] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.023482] ata1.00: configured for UDMA/133
[36913.023488] ata1: EH complete
[36913.023621] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36913.023653] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36913.333482] ata1: soft resetting port
[36913.489422] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.491320] ata1.00: configured for UDMA/133
[36913.491327] ata1: EH complete
[36913.491461] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[36913.491493] ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
[36913.801255] ata1: soft resetting port
[36913.957198] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[36913.959100] ata1.00: configured for UDMA/133
[36913.959110] ata1: EH complete
[36913.959384] SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
[36913.959530] sda: Write Protect is off
[36913.959534] sda: Mode Sense: 00 3a 00 00
[36913.959801] SCSI device sda: drive cache: write back
[36913.960018] SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
[36913.960352] sda: Write Protect is off
[36913.960357] sda: Mode Sense: 00 3a 00 00
[36913.960544] SCSI device sda: drive cache: write back
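Editorial sketch: the "stat 0x50 err 0x0" field in the kernel log above can be unpacked with the standard ATA status register bits. The code below is only an illustration and is not taken from libata or smartmontools; the names decode_ata_status and idle_without_drq are hypothetical, and idle_without_drq is a simplified model of the "command complete while expecting DRQ" condition described in the reply that follows, not the kernel's actual check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standard ATA status register bits. */
enum {
    ST_ERR  = 0x01,  /* error */
    ST_DRQ  = 0x08,  /* data request: device wants to transfer data */
    ST_DSC  = 0x10,  /* device seek complete (legacy meaning) */
    ST_DF   = 0x20,  /* device fault */
    ST_DRDY = 0x40,  /* device ready */
    ST_BSY  = 0x80   /* busy */
};

/* Hypothetical helper: write the names of the set bits into out,
 * separated by spaces, and return out. */
const char *decode_ata_status(unsigned char stat, char *out, size_t outlen)
{
    static const struct { unsigned char bit; const char *name; } bits[] = {
        { ST_BSY, "BSY" }, { ST_DRDY, "DRDY" }, { ST_DF, "DF" },
        { ST_DSC, "DSC" }, { ST_DRQ, "DRQ" }, { ST_ERR, "ERR" },
    };
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    memset(out, 0, outlen);
    for (size_t i = 0; i < sizeof bits / sizeof bits[0]; i++) {
        if (!(stat & bits[i].bit))
            continue;
        int n = snprintf(out + used, outlen - used, "%s%s",
                         used ? " " : "", bits[i].name);
        if (n < 0 || (size_t)n >= outlen - used)
            break;  /* output truncated; stop appending */
        used += (size_t)n;
    }
    return out;
}

/* Hypothetical, simplified model: a host waiting for PIO data-in sees
 * the device idle (not busy), with no DRQ and no error reported. */
int idle_without_drq(unsigned char stat)
{
    return !(stat & ST_BSY) && !(stat & ST_DRQ) &&
           !(stat & (ST_ERR | ST_DF));
}
```

Calling decode_ata_status(0x50, buf, sizeof buf) yields "DRDY DSC": the drive is ready, reports no error, and never asserted DRQ, so under this model idle_without_drq(0x50) is true.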
* Re: smartctl causing HSM violation on sata_nv, 2.6.18
From: Tejun Heo @ 2006-09-28 6:54 UTC
To: Jim Paris; +Cc: linux-ide, smartmontools-support

Hello, Jim Paris, Bruce Allen.

On Wed, Sep 27, 2006 at 02:33:39PM -0400, Jim Paris wrote:
> Hi Tejun,
>
> My NVIDIA SATA controller is having some problems with smartctl on
> 2.6.18 (+ the previously mentioned sata_nv patch).  If I try to enable
> Attribute Autosave (smartctl -S on) or Automatic Offline (smartctl -o
> on), the controller craps out (but recovers).  Executing the same
> command on an identical disk connected to a SiI3132 works fine.  Other
> SMART stuff (reading attributes, running self-tests) seems to be
> behaving just fine.

This is because smartctl issues AUTOSAVE and AUTO_OFFLINE w/
HDIO_DRIVE_CMD.  Both SMART subcommands are non-data but still use
non-zero NSECT field.  HDIO_DRIVE_CMD assumes data-in protocol when
NSECT is non-zero.  libata HSM implementation is stricter than ide's
and declares HSM violation when device reports command complete when
it's expecting DRQ.

> ### sata_nv controller (CK804):
>
> # smartctl -data -S on /dev/disk/by-path/pci-0000:00:07.0-scsi-0:0:0:0
> smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF ENABLE/DISABLE COMMANDS SECTION ===
> Error SMART Enable Auto-save failed: Input/output error
> Smartctl: SMART Enable Attribute Autosave Failed.
>
> A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
>
>
> ### sata_sil24 controller (SiI3132):
>
> # smartctl -data -S on /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:0:0
> smartctl version 5.36 [x86_64-unknown-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF ENABLE/DISABLE COMMANDS SECTION ===
> SMART Attribute Autosave Enabled.

sata_sil24 works because the controller hardware snoops the command
and determines protocol by itself.  So, regardless of what the ioctl
says, it executes the command with non-data protocol.

The following patch against smartmontools-5.36 converts it to use
HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE which don't have
the above issue.

Thanks.

diff -uNr smartmontools-5.36/os_linux.c smartmontools-5.36-fixed/os_linux.c
--- smartmontools-5.36/os_linux.c 2006-04-13 02:02:19.000000000 +0900
+++ smartmontools-5.36-fixed/os_linux.c 2006-09-28 15:41:06.000000000 +0900
@@ -383,14 +383,10 @@
 // 1 if the command succeeded and disk SMART status is "FAILING"
-// huge value of buffer size needed because HDIO_DRIVE_CMD assumes
-// that buff[3] is the data size. Since the ATA_SMART_AUTOSAVE and
-// ATA_SMART_AUTO_OFFLINE use values of 0xf1 and 0xf8 we need the space.
-// Otherwise a 4+512 byte buffer would be enough.
-#define STRANGE_BUFFER_LENGTH (4+512*0xf8)
+#define BUFFER_LEN (4+512)
 int ata_command_interface(int device, smart_command_set command, int select, char *data){
-  unsigned char buff[STRANGE_BUFFER_LENGTH];
+  unsigned char buff[BUFFER_LEN];
   // positive: bytes to write to caller. negative: bytes to READ from
   // caller. zero: non-data command
   int copydata=0;
@@ -407,7 +403,7 @@
   // buff[2] contains the ATA SECTOR COUNT REGISTER
   // clear out buff. Large enough for HDIO_DRIVE_CMD (4+512 bytes)
-  memset(buff, 0, STRANGE_BUFFER_LENGTH);
+  memset(buff, 0, BUFFER_LEN);
   buff[0]=ATA_SMART_CMD;
   switch (command){
@@ -457,12 +453,14 @@
     buff[2]=ATA_SMART_STATUS;
     break;
   case AUTO_OFFLINE:
-    buff[2]=ATA_SMART_AUTO_OFFLINE;
-    buff[3]=select; // YET NOTE - THIS IS A NON-DATA COMMAND!!
+    // NSECT is 241 for enable but no data transfer. Use TASK ioctl.
+    buff[1]=ATA_SMART_AUTO_OFFLINE;
+    buff[2]=select;
     break;
   case AUTOSAVE:
-    buff[2]=ATA_SMART_AUTOSAVE;
-    buff[3]=select; // YET NOTE - THIS IS A NON-DATA COMMAND!!
+    // NSECT is 248 for enable but no data transfer. Use TASK ioctl.
+    buff[1]=ATA_SMART_AUTOSAVE;
+    buff[2]=select;
     break;
   case IMMEDIATE_OFFLINE:
     buff[2]=ATA_SMART_IMMEDIATE_OFFLINE;
@@ -517,7 +515,7 @@
   // There are two different types of ioctls(). The HDIO_DRIVE_TASK
   // one is this:
-  if (command==STATUS_CHECK){
+  if (command==AUTO_OFFLINE || command==AUTOSAVE || command==STATUS_CHECK){
     int retval;
     // NOT DOCUMENTED in /usr/src/linux/include/linux/hdreg.h. You
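Editorial sketch: the difference between the two argument layouts the patch switches between can be shown without touching hardware. The code below only fills buffers and issues no ioctl. The function and macro names are hypothetical. The 0xB0 command code matches "cmd 0xb0" in the kernel log and the 0xF1 enable value for autosave comes from the comment the patch removes; the 0xD2 feature code and the 0x4F/0xC2 SMART signature bytes are taken from the ATA SMART definitions rather than from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMART_CMD          0xB0  /* ATA SMART command ("cmd 0xb0" in the log) */
#define SMART_AUTOSAVE     0xD2  /* SMART feature: attribute autosave (ATA spec) */
#define SMART_AUTOSAVE_ON  0xF1  /* sector count value that enables autosave */

enum { DRIVE_CMD_HDR = 4, DRIVE_TASK_LEN = 7 };

/* HDIO_DRIVE_CMD header: command, sector number, feature, sector count.
 * A non-zero sector count makes the ioctl assume a data-in transfer. */
void fill_drive_cmd(unsigned char *buf, size_t len,
                    unsigned char feature, unsigned char nsect)
{
    assert(buf != NULL && len >= DRIVE_CMD_HDR);
    memset(buf, 0, len);
    buf[0] = SMART_CMD;
    buf[2] = feature;
    buf[3] = nsect;
}

/* HDIO_DRIVE_TASK registers: command, feature, sector count, LBA low,
 * LBA mid, LBA high, device. No data transfer is implied. */
void fill_drive_task(unsigned char *buf, size_t len,
                     unsigned char feature, unsigned char nsect)
{
    assert(buf != NULL && len >= DRIVE_TASK_LEN);
    memset(buf, 0, len);
    buf[0] = SMART_CMD;
    buf[1] = feature;
    buf[2] = nsect;
    buf[4] = 0x4F;  /* SMART signature, LBA mid (not shown in the patch hunks) */
    buf[5] = 0xC2;  /* SMART signature, LBA high (not shown in the patch hunks) */
}

/* Bytes of data-in that the HDIO_DRIVE_CMD path would wait for. */
size_t drive_cmd_expected_data(const unsigned char *buf)
{
    return (size_t)buf[3] * 512;
}
```

With SMART_AUTOSAVE_ON in the sector count, drive_cmd_expected_data reports 241 * 512 = 123392 bytes that the drive never sends, which is also why the old code needed STRANGE_BUFFER_LENGTH. The 7-byte task layout carries the same 0xF1 value purely as a register argument.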

* Re: smartctl causing HSM violation on sata_nv, 2.6.18
From: Jim Paris @ 2006-09-28 8:09 UTC
To: Tejun Heo; +Cc: linux-ide, smartmontools-support

Hi Tejun,

Tejun Heo wrote:
> The following patch against smartmontools-5.36 converts it to use
> HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE which don't have
> the above issue.

This patch works like a charm.  Now "smartctl -S on" and "smartctl -o on"
works as expected on all of my SATA and IDE controllers.  Thank you!

-jim

* Re: smartctl causing HSM violation on sata_nv, 2.6.18
From: Tejun Heo @ 2006-10-02 18:33 UTC
To: Jim Paris; +Cc: linux-ide, smartmontools-support

Jim Paris wrote:
> Hi Tejun,
>
> Tejun Heo wrote:
>> The following patch against smartmontools-5.36 converts it to use
>> HDIO_DRIVE_TASK ioctl for AUTOSAVE and AUTO_OFFLINE which don't have
>> the above issue.
>
> This patch works like a charm.  Now "smartctl -S on" and "smartctl -o on"
> works as expected on all of my SATA and IDE controllers.  Thank you!

I've got delivery failure notice for smartmontools-support mail address.
Does anyone know how to contact smartmontools author?

Thanks.

--
tejun

* Re: smartctl causing HSM violation on sata_nv, 2.6.18
From: Doug Maxey @ 2006-10-02 23:24 UTC
To: Tejun Heo; +Cc: Jim Paris, linux-ide, smartmontools-support

On Tue, 03 Oct 2006 03:33:42 +0900, Tejun Heo wrote:
>
> I've got delivery failure notice for smartmontools-support mail address.
> Does anyone know how to contact smartmontools author?
>

Bruce Allen <ballen@gravity.phys.uwm.edu> is the maintainer.

++doug