* hot plug on ICH9 with AHCI on
@ 2009-03-18 22:12 Владимир Дашевский
2009-03-20 2:04 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-18 22:12 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-ide
Jeff!
I have some questions concerning hotplug and RAID support for the ICH9 chip
used in my server.
I spent some time finding out what support the Linux kernel has for
enclosure LEDs in HDD bays. I found none, although the feature is described
in the ICH9 datasheet. I am using Debian Linux with kernel 2.6.25. The latest
kernel for Debian is 2.6.26, and it does not have any enclosure
management support either. I did find some support in the sources of kernel
2.6.28, but it seems that only the activity LED is supported. That's why I
wrote a little kernel module which takes control of that AHCI
functionality and tries to do enclosure management alongside the
regular ahci driver. It works fine, but I ran into one strange thing.
Namely, I can control the LEDs only of those drives that are physically
inserted into the backplane. If a slot has been empty since the last reboot,
its LEDs cannot be controlled. I looked into why this is so. The answer seems
to be that the ahci driver does not enable a SATA controller port at all if
it is not populated during boot. To check this I tried some hot-swap actions.
First, I pulled out one of my spare drives and pushed it back in. I got
the following log:
--
ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
ata3: irq_stat 0x00400040, connection status changed
ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
ata3: hard resetting link
ata3: SATA link down (SStatus 0 SControl 300)
ata3: failed to recover some devices, retrying in 5 secs
ata3: hard resetting link
ata3: SATA link down (SStatus 0 SControl 300)
ata3: failed to recover some devices, retrying in 5 secs
ata3: hard resetting link
ata3: SATA link down (SStatus 0 SControl 300)
ata3.00: disabled
ata3: EH complete
ata3.00: detaching (SCSI 2:0:0:0)
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
sd 2:0:0:0: [sdc] Stopping disk
sd 2:0:0:0: [sdc] START_STOP FAILED
sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
ahci em: 40: post command 00080002
ata3: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xa frozen
ata3: irq_stat 0x00400040, connection status changed
ata3: SError: { RecovComm PHYRdyChg CommWake DevExch }
ata3: hard resetting link
ata3: softreset failed (device not ready)
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7: ST3500320AS, SD04, max UDMA/133
ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata3: EH complete
sd 2:0:0:0: [sdc] Attached SCSI disk
sd 2:0:0:0: Attached scsi generic sg2 type 0
--
This log looks strange to me. It seems that the system missed the moment
when the drive was removed. First it tried to reinitialize the SATA link
three times. Then it tried to sync caches and stop the drive when
it had actually already lost its connection to the HBA. Then the disk was
returned to the slot and its softreset failed. Why? I suspect the drive had
not fully started when the host tried to establish a connection to it.
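For reference, the SErr and SStatus values in the log above can be decoded by hand. The following is a minimal sketch in Python, assuming the SError bit positions of the SATA specification and the labels libata prints in its "SError: { ... }" lines; `decode_serror` and `decode_sstatus` are illustrative helper names, not kernel functions.

```python
# Bit positions follow the SATA SError register layout; the labels mirror
# the strings libata prints. Both are assumptions of this sketch and are
# not quoted from the thread.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


def decode_sstatus(value):
    """Split SStatus into its DET, SPD and IPM fields (4 bits each)."""
    return {
        "det": value & 0xF,         # device detection state
        "spd": (value >> 4) & 0xF,  # negotiated interface speed
        "ipm": (value >> 8) & 0xF,  # interface power management state
    }


if __name__ == "__main__":
    # Values taken from the two "exception" lines and the "link up" line.
    for serr in (0x4090800, 0x4050002):
        print(hex(serr), decode_serror(serr))
    print(decode_sstatus(0x123))
```

Decoded this way, 0x4090800 and 0x4050002 give the same flag lists that the kernel printed. SStatus 123 (hexadecimal) splits into DET=3, SPD=2, IPM=1, which is consistent with "SATA link up 3.0 Gbps", while SStatus 0 means no device was detected.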
Another thing happened when I pulled the drive from one slot and
pushed it into the neighboring slot, which was empty during Linux boot.
The kernel decided this slot is a dummy:
---
ahci 0000:00:1f.2: version 3.0
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 17 (level, low) -> IRQ 17
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0xb impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601100 irq 1275
ata2: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601180 irq 1275
ata3: DUMMY
ata4: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601280 irq 1275
ata5: DUMMY
ata6: DUMMY
---
So, even if I put the drive in as the ata3 device, the kernel does nothing
to start it.
Now my questions:
1. Is it possible to force all ports to be treated as potentially populated
during startup? I would prefer that all ICH9 SATA ports have their own
fixed names, e.g. /dev/sata0, ..., /dev/sata5. For now I have 3 drives
and they always get the names /dev/sda /dev/sdb /dev/sdc even if there is
an empty port between them, as shown above. This is not convenient because
enclosure management is tied to physical ports, not only to populated ones.
2. How can I remove a SATA drive safely? I mean behavior similar to
removing USB drives. I'd like to notify the system that I wish to remove
the drive. Then it performs some actions such as closing all current
connections, refusing new connections, flushing caches etc. After all
that it updates the indicators on the backplane to show me that the drive is
ready to be removed. As far as I can see, some parts of this procedure can be
done using hdparm -f -F -Y, but not all.
With best regards, Vladimir Dashevsky
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: hot plug on ICH9 with AHCI on
2009-03-18 22:12 hot plug on ICH9 with AHCI on Владимир Дашевский
@ 2009-03-20 2:04 ` Tejun Heo
2009-03-20 9:55 ` Владимир Дашевский
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2009-03-20 2:04 UTC (permalink / raw)
To: Владимир Дашевский
Cc: Jeff Garzik, linux-ide

Hello,

Владимир Дашевский wrote:
> This log looks strange to me. It seems that the system missed the moment
> when the drive was removed. First it tried to reinitialize the SATA link
> three times.

That's the intended behavior. On a PHY event, libata EH tries to revive
the link for at least 15 secs so that a transient PHY glitch doesn't
kill your root fs.

> Then it tried to sync caches and stop the drive when it had actually
> already lost its connection to the HBA.

That's the SCSI sd driver shutting down. As hot unplugging is a
surprise removal, sd's shutdown sequence arrives after the device is
actually gone and fails immediately.

> Then the disk was returned to the slot and its softreset failed. Why? I
> suspect the drive had not fully started when the host tried to
> establish a connection to it.

Yeah, it sometimes depends on the spin-up time. Sometimes some
controllers just can't get things working on the first try and so
on. The timeout mechanism is there to achieve acceptable delay even
when devices slightly malfunction, so the timeouts are a bit
aggressive.

> Another thing happened when I pulled the drive from one slot and
> pushed it into the neighboring slot, which was empty during Linux boot.
> The kernel decided this slot is a dummy:
> ---
> ahci 0000:00:1f.2: version 3.0
> ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 17 (level, low) -> IRQ 17
> ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0xb impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part
> PCI: Setting latency timer of device 0000:00:1f.2 to 64
> scsi0 : ahci
> scsi1 : ahci
> scsi2 : ahci
> scsi3 : ahci
> scsi4 : ahci
> scsi5 : ahci
> ata1: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601100 irq 1275
> ata2: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601180 irq 1275
> ata3: DUMMY
> ata4: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601280 irq 1275
> ata5: DUMMY
> ata6: DUMMY

DUMMY ports are determined by the BIOS, and the dummy state is recorded
in an ahci register. Does your board have all six ports exposed?

> So, even if I put the drive in as the ata3 device, the kernel does
> nothing to start it.
>
> Now my questions:
> 1. Is it possible to force all ports to be treated as potentially
> populated during startup? I would prefer that all ICH9 SATA ports have
> their own fixed names, e.g. /dev/sata0, ..., /dev/sata5. For now I have
> 3 drives and they always get the names /dev/sda /dev/sdb /dev/sdc even
> if there is an empty port between them, as shown above. This is not
> convenient because enclosure management is tied to physical ports, not
> only to populated ones.

If you have exposed ports which are marked dummy by the ahci driver,
it's a BIOS bug. It needs to be either quirked or reported to the
motherboard vendor.

> 2. How can I remove a SATA drive safely? I mean behavior similar to
> removing USB drives. I'd like to notify the system that I wish to remove
> the drive. Then it performs some actions such as closing all current
> connections, refusing new connections, flushing caches etc. After all
> that it updates the indicators on the backplane to show me that the
> drive is ready to be removed. As far as I can see, some parts of this
> procedure can be done using hdparm -f -F -Y, but not all.

echo 1 > /sys/block/sdX/device/delete

Thanks.

--
tejun
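To illustrate the point about DUMMY ports: the "0xb impl" field in the quoted boot line is the implemented-ports bitmap that the ahci driver reads from the controller. The sketch below is a hedged Python illustration of how that bitmap maps onto the ata1..ata6 lines. The helper name `classify_ports` is hypothetical, and the mapping of ataN to port N-1 assumes this AHCI controller is the first ATA host in the system, as it is in the log above.

```python
def classify_ports(port_map, n_ports):
    """Map 1-based ataN numbers to True (implemented) or False (DUMMY).

    port_map is the implemented-ports bitmap ("0xb impl" in the log) and
    n_ports the port count from the same line ("6 ports").
    """
    return {i + 1: bool(port_map & (1 << i)) for i in range(n_ports)}


if __name__ == "__main__":
    # "6 ports ... 0xb impl" from the boot log quoted above
    for ata, implemented in classify_ports(0xB, 6).items():
        print(f"ata{ata}: {'implemented' if implemented else 'DUMMY'}")
```

With 0xb (binary 001011), bits 0, 1 and 3 are set, which matches ata1, ata2 and ata4 being real ports and ata3, ata5 and ata6 being reported as DUMMY.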
* Re: hot plug on ICH9 with AHCI on
2009-03-20 2:04 ` Tejun Heo
@ 2009-03-20 9:55 ` Владимир Дашевский
[not found] ` <49C39933.4020501@kernel.org>
0 siblings, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-20 9:55 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

Tejun!

First, thanks for your reply. I want to introduce my platform so you
can get some information about it:
http://www.supermicro.com/products/system/1u/5015/sys-5015b-mt.cfm

Below are some comments from me.

Tejun wrote:
>> This log looks strange to me. It seems that the system missed the moment
>> when the drive was removed. First it tried to reinitialize the SATA link
>> three times.
>
> That's the intended behavior. On a PHY event, libata EH tries to revive
> the link for at least 15 secs so that a transient PHY glitch doesn't
> kill your root fs.

Well, I partially agree. Surely, EMI problems should not break the link
forever, but I do not agree with the algorithm. When a drive is removed,
it is gone within milliseconds. I mean the time between loss of link and
detection that the port is not populated. So I can imagine that the
driver might retry the reset once, but it should abort as soon as it
learns that the drive has been removed entirely. Not in 15 seconds and
not even in 5 seconds, but in 0.01 second. So, I think the log should be:

ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
ata3: irq_stat 0x00400040, connection status changed
ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
ata3: hard resetting link
ata3: SATA link down (SStatus 0 SControl 300)
ata3: drive is out
ata3.00: disabled
ata3: EH complete

>> Then it tried to sync caches and stop the drive when it had actually
>> already lost its connection to the HBA.
>
> That's the SCSI sd driver shutting down. As hot unplugging is a
> surprise removal, sd's shutdown sequence arrives after the device is
> actually gone and fails immediately.

Ok. So, this is normal. We just need to inform the SCSI driver first,
don't we?

>> Then the disk was returned to the slot and its softreset failed. Why? I
>> suspect the drive had not fully started when the host tried to
>> establish a connection to it.
>
> Yeah, it sometimes depends on the spin-up time. Sometimes some
> controllers just can't get things working on the first try and so
> on. The timeout mechanism is there to achieve acceptable delay even
> when devices slightly malfunction, so the timeouts are a bit
> aggressive.

Well, some drives store their firmware on disk, so they cannot talk to
the host until fully spun up. I have heard that a drive starts to spin
up two or more seconds after being inserted. So, what is the intended
driver behavior? It simply performs soft resets until the drive answers,
doesn't it? If the drive gets ready faster, there will be fewer failed
soft resets in the log, right?

>> Another thing happened when I pulled the drive from one slot and
>> pushed it into the neighboring slot, which was empty during Linux boot.
>> The kernel decided this slot is a dummy:
>> ---
>> ahci 0000:00:1f.2: version 3.0
>> ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 17 (level, low) -> IRQ 17
>> ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0xb impl SATA mode
>> ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part
>> PCI: Setting latency timer of device 0000:00:1f.2 to 64
>> scsi0 : ahci
>> scsi1 : ahci
>> scsi2 : ahci
>> scsi3 : ahci
>> scsi4 : ahci
>> scsi5 : ahci
>> ata1: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601100 irq 1275
>> ata2: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601180 irq 1275
>> ata3: DUMMY
>> ata4: SATA max UDMA/133 abar m2048@0xd8601000 port 0xd8601280 irq 1275
>> ata5: DUMMY
>> ata6: DUMMY
>
> DUMMY ports are determined by the BIOS, and the dummy state is recorded
> in an ahci register. Does your board have all six ports exposed?

Yes. The board has an ICH9, which supports hot plug, and this is claimed
by SuperMicro (its vendor). I can say more. When I read the ICH9
datasheet, I found that it has a register named
"14.1.31 PCS---Port Control and Status Register (SATA--D31:F2)".
As stated there, it contains 6 port enable flags and 6 port present
flags. At first, like you, I thought that some ports had been disabled
by the BIOS. Then I printed the contents of this register from my
enclosure driver and saw that PCS is 8B3F. According to the datasheet,
that means that all 6 ports are enabled but only 3 have connected links.
If I move the drive to the neighboring slot, I see PCS change to 873F,
exactly matching the move. So I suppose there is an AHCI driver bug: it
should not assume that a port is a dummy if it is enabled but not
present.

>> So, even if I put the drive in as the ata3 device, the kernel does
>> nothing to start it.
>>
>> Now my questions:
>> 1. Is it possible to force all ports to be treated as potentially
>> populated during startup? I would prefer that all ICH9 SATA ports have
>> their own fixed names, e.g. /dev/sata0, ..., /dev/sata5. For now I have
>> 3 drives and they always get the names /dev/sda /dev/sdb /dev/sdc even
>> if there is an empty port between them, as shown above. This is not
>> convenient because enclosure management is tied to physical ports, not
>> only to populated ones.
>
> If you have exposed ports which are marked dummy by the ahci driver,
> it's a BIOS bug. It needs to be either quirked or reported to the
> motherboard vendor.

See my arguments above.

>> 2. How can I remove a SATA drive safely? I mean behavior similar to
>> removing USB drives. I'd like to notify the system that I wish to remove
>> the drive. Then it performs some actions such as closing all current
>> connections, refusing new connections, flushing caches etc. After all
>> that it updates the indicators on the backplane to show me that the
>> drive is ready to be removed. As far as I can see, some parts of this
>> procedure can be done using hdparm -f -F -Y, but not all.
>
> echo 1 > /sys/block/sdX/device/delete

Can I be sure this will stop the drive safely (without loss of cached
data)?

With best regards, Vladimir Dashevsky
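The two PCS values reported in the message above can be checked with a few lines of Python. This is only a sketch: it assumes the port enable flags sit in bits 0 to 5 and the port present flags in bits 8 to 13, which fits the message's description of "6 port enables and 6 port present flags" but should be verified against the ICH9 datasheet. The helper name `decode_pcs` is made up for this example, and bit 15, which is set in both values, is not interpreted here.

```python
def decode_pcs(pcs, n_ports=6):
    """Return (enabled, present) lists of 0-based port numbers.

    Assumed layout: bit p is the enable flag of port p, bit 8 + p is its
    present flag. Check the datasheet before relying on this.
    """
    enabled = [p for p in range(n_ports) if pcs & (1 << p)]
    present = [p for p in range(n_ports) if pcs & (1 << (8 + p))]
    return enabled, present


if __name__ == "__main__":
    for value in (0x8B3F, 0x873F):
        enabled, present = decode_pcs(value)
        print(f"PCS={value:#06x} enabled={enabled} present={present}")
```

Under that assumption, 0x8B3F shows all six ports enabled with devices present on ports 0, 1 and 3, and 0x873F shows devices on ports 0, 1 and 2, which agrees with one drive having been moved to the neighboring slot.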
[parent not found: <49C39933.4020501@kernel.org>]
* Re: hot plug on ICH9 with AHCI on
[not found] ` <49C39933.4020501@kernel.org>
@ 2009-03-20 15:31 ` Владимир Дашевский
2009-03-22 12:51 ` Владимир Дашевский
1 sibling, 0 replies; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-20 15:31 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

[-- Attachment #1: Type: text/plain, Size: 73 bytes --]

Tejun!

I will write some comment later. Now I send dmidecode output.

[-- Attachment #2: dmi_data --]
[-- Type: text/plain, Size: 10761 bytes --]

# dmidecode 2.9
SMBIOS 2.5 present.
38 structures occupying 1584 bytes.
Table at 0x7FEDF000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
    Vendor: Phoenix Technologies LTD
    Version: 6.00
    Release Date: 02/25/2008
    Address: 0xE2DA0
    Runtime Size: 119392 bytes
    ROM Size: 2048 kB
    Characteristics:
        ISA is supported
        PCI is supported
        PNP is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        ESCD support is available
        Boot from CD is supported
        Selectable boot is supported
        BIOS ROM is socketed
        EDD is supported
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 KB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        Print screen service is supported (int 5h)
        8042 keyboard services are supported (int 9h)
        Serial services are supported (int 14h)
        Printer services are supported (int 17h)
        CGA/mono video services are supported (int 10h)
        USB legacy is supported
        Smart battery is supported
        BIOS boot specification is supported
        Targeted content distribution is supported

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: Supermicro
    Product Name: X7SBi
    Version: 0123456789
    Serial Number: 0123456789
    UUID: 53D1A494-D663-A0E7-890B-00304862E142
    Wake-up Type: Power Switch
    SKU Number: Not Specified
    Family: Not Specified

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
    Manufacturer: Supermicro
    Product Name: X7SBi
    Version: PCB Version
    Serial Number: 0123456789

Handle 0x0003, DMI type 3, 17 bytes
Chassis Information
    Manufacturer: Supermicro
    Type: Other
    Lock: Not Present
    Version: 0123456789
    Serial Number: 0123456789
    Asset Tag:
    Boot-up State: Safe
    Power Supply State: Safe
    Thermal State: Safe
    Security Status: None
    OEM Information: 0x00001234

Handle 0x0004, DMI type 4, 35 bytes
Processor Information
    Socket Designation: CPU 1
    Type: Central Processor
    Family: Unknown
    Manufacturer: Intel
    ID: FD 06 00 00 FF FB EB BF
    Version: 00000000000000000000000000000000
    Voltage: 1.8 V
    External Clock: Unknown
    Max Speed: 3300 MHz
    Current Speed: 2200 MHz
    Status: Populated, Enabled
    Upgrade: Socket LGA775
    L1 Cache Handle: 0x0005
    L2 Cache Handle: 0x0006
    L3 Cache Handle: Not Provided
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified

Handle 0x0005, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L1 Cache
    Configuration: Enabled, Socketed, Level 1
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 16 KB
    Maximum Size: 16 KB
    Supported SRAM Types:
        Burst
        Pipeline Burst
        Asynchronous
    Installed SRAM Type: Asynchronous
    Speed: Unknown
    Error Correction Type: Unknown
    System Type: Unknown
    Associativity: Unknown

Handle 0x0006, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L2 Cache
    Configuration: Enabled, Socketed, Level 2
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 2048 KB
    Maximum Size: 512 KB
    Supported SRAM Types:
        Burst
        Pipeline Burst
        Asynchronous
    Installed SRAM Type: Burst
    Speed: Unknown
    Error Correction Type: Unknown
    System Type: Unknown
    Associativity: Unknown

Handle 0x0007, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L3 Cache
    Configuration: Enabled, Socketed, Level 3
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 2048 KB
    Maximum Size: 512 KB
    Supported SRAM Types:
        Burst
        Pipeline Burst
        Asynchronous
    Installed SRAM Type: Burst
    Speed: Unknown
    Error Correction Type: Unknown
    System Type: Unknown
    Associativity: Unknown

Handle 0x0008, DMI type 8, 9 bytes
Port Connector Information
    Internal Reference Designator: J2A1
    Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
    External Reference Designator: COM 1
    External Connector Type: DB-9 male
    Port Type: Serial Port 16550A Compatible

Handle 0x0009, DMI type 8, 9 bytes
Port Connector Information
    Internal Reference Designator: J3A1
    Internal Connector Type: 25 Pin Dual Inline (pin 26 cut)
    External Reference Designator: Parallel
    External Connector Type: DB-25 female
    Port Type: Parallel Port ECP/EPP

Handle 0x000A, DMI type 8, 9 bytes
Port Connector Information
    Internal Reference Designator: J1A1
    Internal Connector Type: None
    External Reference Designator: Keyboard
    External Connector Type: Circular DIN-8 male
    Port Type: Keyboard Port

Handle 0x000B, DMI type 8, 9 bytes
Port Connector Information
    Internal Reference Designator: J1A1
    Internal Connector Type: None
    External Reference Designator: PS/2 Mouse
    External Connector Type: Circular DIN-8 male
    Port Type: Keyboard Port

Handle 0x000C, DMI type 9, 13 bytes
System Slot Information
    Designation: PCI Slot #1 - J5B1
    Type: 32-bit PCI
    Current Usage: Available
    Length: Long
    ID: 1
    Characteristics:
        5.0 V is provided
        3.3 V is provided

Handle 0x000D, DMI type 9, 13 bytes
System Slot Information
    Designation: PCIe Slot #1 - J7C1
    Type: 32-bit PCI Express
    Current Usage: In Use
    Length: Long
    ID: 4
    Characteristics:
        5.0 V is provided
        3.3 V is provided

Handle 0x000E, DMI type 10, 6 bytes
On Board Device Information
    Type: Sound
    Status: Disabled
    Description: ADI1886

Handle 0x000F, DMI type 11, 5 bytes
OEM Strings
    String 1: This is the Intel 965
    String 2: Customer Reference Board

Handle 0x0010, DMI type 12, 5 bytes
System Configuration Options
    Option 1: Jumper settings can be described here.

Handle 0x0011, DMI type 15, 29 bytes
System Event Log
    Area Length: 176 bytes
    Header Start Offset: 0x0000
    Header Length: 16 bytes
    Data Start Offset: 0x0010
    Access Method: General-purpose non-volatile data functions
    Access Address: 0x0000
    Status: Valid, Not Full
    Change Token: 0x00000086
    Header Format: Type 1
    Supported Log Type Descriptors: 3
    Descriptor 1: POST error
    Data Format 1: POST results bitmap
    Descriptor 2: Single-bit ECC memory error
    Data Format 2: Multiple-event
    Descriptor 3: Multi-bit ECC memory error
    Data Format 3: Multiple-event

Handle 0x0012, DMI type 16, 15 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 8 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x0013, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: 0xFF01
    Total Width: 40968 bits
    Data Width: 41024 bits
    Size: 1024 MB
    Form Factor: DIMM
    Set: 1
    Locator: J6G1
    Bank Locator: DIMM 0
    Type: DDR2
    Type Detail: Synchronous
    Speed: 667 MHz (1.5 ns)
    Manufacturer: Kingston
    Serial Number: D132F5F5
    Asset Tag: 00000730
    Part Number: 393930353331362D3030352E4130344C4600

Handle 0x0014, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: 0xFF01
    Total Width: Unknown
    Data Width: Unknown
    Size: No Module Installed
    Form Factor: DIMM
    Set: 1
    Locator: J6G2
    Bank Locator: DIMM 1
    Type: DDR2
    Type Detail: Synchronous
    Speed: 667 MHz (1.5 ns)
    Manufacturer: 48spaces
    Serial Number: 01234567
    Asset Tag: 01234567
    Part Number: 012345678901234567890123456789012345

Handle 0x0015, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: 0xFF01
    Total Width: 41992 bits
    Data Width: 42048 bits
    Size: 1024 MB
    Form Factor: DIMM
    Set: 1
    Locator: J6H1
    Bank Locator: DIMM 2
    Type: DDR2
    Type Detail: Synchronous
    Speed: 667 MHz (1.5 ns)
    Manufacturer: Kingston
    Serial Number: D0321BF6
    Asset Tag: 00000730
    Part Number: 393930353331362D3030352E4130344C4600

Handle 0x0016, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: 0xFF01
    Total Width: Unknown
    Data Width: Unknown
    Size: No Module Installed
    Form Factor: DIMM
    Set: 1
    Locator: J6H2
    Bank Locator: DIMM 3
    Type: DDR2
    Type Detail: Synchronous
    Speed: 667 MHz (1.5 ns)
    Manufacturer: 48spaces
    Serial Number: 01234567
    Asset Tag: 01234567
    Part Number: 012345678901234567890123456789012345

Handle 0x0017, DMI type 19, 15 bytes
Memory Array Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x0007FFFFFFF
    Range Size: 2 GB
    Physical Array Handle: 0x0012
    Partition Width: 0

Handle 0x0018, DMI type 20, 19 bytes
Memory Device Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x0003FFFFFFF
    Range Size: 1 GB
    Physical Device Handle: 0x0013
    Memory Array Mapped Address Handle: 0x0017
    Partition Row Position: Unknown
    Interleave Position: Unknown
    Interleaved Data Depth: Unknown

Handle 0x0019, DMI type 20, 19 bytes
Memory Device Mapped Address
    Starting Address: 0x0003FFFFC00
    Ending Address: 0x0003FFFFFFF
    Range Size: 1 kB
    Physical Device Handle: 0x0014
    Memory Array Mapped Address Handle: 0x0017
    Partition Row Position: Unknown
    Interleave Position: Unknown
    Interleaved Data Depth: Unknown

Handle 0x001A, DMI type 23, 13 bytes
System Reset
    Status: Enabled
    Watchdog Timer: Present
    Boot Option: Do Not Reboot
    Boot Option On Limit: Do Not Reboot
    Reset Count: Unknown
    Reset Limit: Unknown
    Timer Interval: Unknown
    Timeout: Unknown

Handle 0x001B, DMI type 24, 5 bytes
Hardware Security
    Power-On Password Status: Disabled
    Keyboard Password Status: Unknown
    Administrator Password Status: Enabled
    Front Panel Reset Status: Unknown

Handle 0x001C, DMI type 25, 9 bytes
System Power Controls
    Next Scheduled Power-on: 12-31 23:59:59

Handle 0x001D, DMI type 26, 20 bytes
Voltage Probe
    Description: Voltage Probe
    Location: Processor
    Status: OK
    Maximum Value: Unknown
    Minimum Value: Unknown
    Resolution: Unknown
    Tolerance: Unknown
    Accuracy: Unknown
    OEM-specific Information: 0x00000000

Handle 0x001E, DMI type 27, 12 bytes
Cooling Device
    Temperature Probe Handle: 0x001F
    Type: Fan
    Status: OK
    OEM-specific Information: 0x00000000

Handle 0x001F, DMI type 28, 20 bytes
Temperature Probe
    Description: Temperature Probe
    Location: Processor
    Status: OK
    Maximum Value: Unknown
    Minimum Value Unknown
    Resolution: Unknown
    Tolerance: Unknown
    Accuracy: Unknown
    OEM-specific Information: 0x00000000

Handle 0x0020, DMI type 29, 20 bytes
Electrical Current Probe
    Description: Electrical Current Probe
    Location: Processor
    Status: OK
    Maximum Value: Unknown
    Minimum Value: Unknown
    Resolution: Unknown
    Tolerance: Unknown
    Accuracy: Unknown
    OEM-specific Information: 0x00000000

Handle 0x0021, DMI type 30, 6 bytes
Out-of-band Remote Access
    Manufacturer Name: Intel
    Inbound Connection: Enabled
    Outbound Connection: Disabled

Handle 0x0022, DMI type 32, 20 bytes
System Boot Information
    Status: <OUT OF SPEC>

Handle 0x0023, DMI type 126, 4 bytes
Inactive

Handle 0x0024, DMI type 127, 4 bytes
End Of Table

Handle 0x0025, DMI type 127, 4 bytes
End Of Table
* Re: hot plug on ICH9 with AHCI on
[not found] ` <49C39933.4020501@kernel.org>
2009-03-20 15:31 ` Владимир Дашевский
@ 2009-03-22 12:51 ` Владимир Дашевский
2009-03-22 15:08 ` Tejun Heo
1 sibling, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-22 12:51 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

Tejun!

Here are my comments on our previous conversation.

>> Well, I partially agree. Surely, EMI problems should not break the link
>> forever, but I do not agree with the algorithm. When a drive is removed,
>> it is gone within milliseconds. I mean the time between loss of link and
>> detection that the port is not populated. So I can imagine that the
>> driver might retry the reset once, but it should abort as soon as it
>> learns that the drive has been removed entirely. Not in 15 seconds and
>> not even in 5 seconds, but in 0.01 second. So,
>
> Heh... that will fail any EMI test. There simply isn't anything to be
> earned by shortening the period. The only thing that can go wrong
> with a longer delay is if the user unplugs the drive, plugs it into
> another host, modifies the content and replugs it before the detach
> happens. If the user does that, well, he/she deserves a corrupt
> filesystem.

I think you are not right here. If we are talking about EMI problems, I
can say that the strategy of many retries is worse than a single read of
the port present status. EMI noise is emitted while software tries to
re-establish a connection on an empty link, because the link is not
terminated. It's just like a car engine that has lost its muffler: it
produces lots of noise. It is better to turn the link off as soon as we
know there is no device on the port. That's why retries should last only
until that state is reported by the hardware. And I think the hardware
reports that state much faster than in 15 or even 5 seconds. The ICH9
reports it immediately.

>>> That's the SCSI sd driver shutting down. As hot unplugging is a
>>> surprise removal, sd's shutdown sequence arrives after the device is
>>> actually gone and fails immediately.
>>
>> Ok. So, this is normal. We just need to inform the SCSI driver first,
>> don't we?
>
> If you hot unplug, how do you notify SCSI driver first?

Well, there are some strange comments in FAQs saying that older ICH chips
(5 to 8) do not fully support hot plug, while the newer ones support it
better. They then explain what proper hotplug support is, and that
explanation talks about surprise-removal support. However, I do not mind
notifying the system before I actually remove the drive, simply because
it may take a long time to shut down drive activity and it is better to
indicate this explicitly to the user.

>> Well, some drives store their firmware on disk, so they cannot talk to
>> the host until fully spun up. I have heard that a drive starts to spin
>> up two or more seconds after being inserted. So, what is the intended
>> driver behavior? It simply performs soft resets until the drive answers,
>> doesn't it? If the drive gets ready faster, there will be fewer failed
>> soft resets in the log, right?
>
> The goal is being robust. With numerous hardware combinations, that's
> about the only way to keep things manageable. Things don't always
> work as described in the spec. Again, the only down side is one or
> more failed attempts and log messages about those. Why do you care?

It's just because one of my jobs is writing high-performance embedded
real-time software, where logs are almost the only way to know what
happens in the hardware. After several years of such practice, one gets
into the habit of suspecting side effects behind each strange log line.
That's why I do care :-)
I just want to be sure that this softreset failure is normal behavior in
this particular case.

> ahci doesn't use PCS to detect dummy ports. It uses PORT_IMPL ahci
> register. If PORT_IMPL changes according to which ports are occupied
> that's a BIOS bug. Can you post the output of "dmidecode"?

Yes, you were right, it was a BIOS bug. I downloaded the latest BIOS
from the vendor's web site, and now the PI register always reports all
six ports as implemented.

>>> echo 1 > /sys/block/sdX/device/delete
>>
>> Can I be sure this will stop the drive safely (without loss of cached
>> data)?
>
> Yes.

Ok. So, one more question. How can I know exactly when the device
deletion has completed after issuing this command? For example, suppose
there is some data in the cache that needs 3 seconds to be synced to
disk. How can I know that I must wait 3 seconds before I can actually
remove the drive? Should I check for the presence of some other file in
/sys/block/sdX/, or do something else?

Thank you!

With best regards, Vladimir Dashevsky
* Re: hot plug on ICH9 with AHCI on 2009-03-22 12:51 ` Владимир Дашевский @ 2009-03-22 15:08 ` Tejun Heo 2009-03-22 16:07 ` Владимир Дашевский 0 siblings, 1 reply; 13+ messages in thread From: Tejun Heo @ 2009-03-22 15:08 UTC (permalink / raw) To: Владимир Дашевский Cc: Jeff Garzik, linux-ide Hello, Владимир Дашевский wrote: > I think you are not right here. If we are talking about EMI problems I > can say that the strategy of many retries is worse than one read of port > present status. EMI noise occurs at the time while software tries to > re-establish connection with empty link because there is no link > terminatiion. It's just like a car engine that has lost its muffler. It > produces lots of noise. It is better to turn the link off as soon as we > know there is no device on the port. That's why retries should last only > until that state is reported by hardware. And I think hardware reports > that state much faster than in 15 or even 5 seconds. In ICH9 it reports > this just immediately. Not all EMIs are one-shot events. Some can span seconds. Links don't always come up right after failures. Sometimes they require more than one hardresets to get back to working order. Link status report is not reliable. Sometimes they report offline for a while after certain events. If you know how to work around the above problems under a second, I'm all ears but I doubt it unless it involves an additional mechanical switch. I don't know of a practical downside to lingering for limited amount of time. If you know one, please let me know. >> If you hot unplug, how do you notify SCSI driver first? >> > Well, there are some strange comments in FAQs that older ICH chips (5 to > 8) do not fully support hot plug, while the newer ones have better > support for this. Then there is an explanation of what is proper hotplug > support. That explanation says of surprise-removal support. However, I > do not mind of notifying system before a actually remove the drive. 
> Just because it may take a long time to shut down drive activity and it
> will be better to indicate this explicitly to the user.

If you sync and spin down the drive before hot unplugging, there's no
practical difference. If you do a surprise hot-unplug, whatever was on
the buffer will be lost and the drive will have to do an emergency
unload.

>> The goal is being robust. With numerous hardware combinations, that's
>> about the only way to keep things manageable. Things don't always
>> work as described in the spec. Again, the only down side is one or
>> more failed attempts and log about those. Why do you care?
>>
> It's just because one of my jobs is writing high performance embedded
> real-time software where logs are almost the only way to know what
> happens in hardware. After several years of such practice there is a
> habit of suspecting side effects of each strange log line. That's why I
> do care :-)
> I just want to understand that this softreset failure is a normal
> behavior for that particular case.

Yes, they're expected. If you really don't like those messages, feel
free to comment them out in your tree but please stop obsessing over
the messages. I'll be happy to improve EH behavior but you need to
come up with better reasons.

>> ahci doesn't use PCS to detect dummy ports. It uses PORT_IMPL ahci
>> register. If PORT_IMPL changes according to which ports are occupied
>> that's a BIOS bug. Can you post the output of "dmidecode"?
>>
> Yes, you were right, it was a BIOS bug. I have downloaded latest BIOS
> from the vendor's web site and now PI register always reports all six
> ports implemented.

Great. :-)

> Ok. So, one more question. How can I know exactly when device deletion
> has completed after sending this command? For example, consider that
> there some data in cache that needs 3 seconds to be sync-ed to disk. How
> can I know that I must wait for 3 seconds before I can actually remove
> the drive?
> Should I check the presence of some other filename in
> /sys/block/sdX/ or do something else?

The echo to the delete node is synchronous. It will return after the
device is completely removed but please note that "removing" in this
sense only covers the device itself. It will flush the request queue
and spin the drive down but won't do anything about filesystems. You
need to unmount first. hal and desktop stuff already do the right
thing for devices marked removable.

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 13+ messages in thread
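The ordering described in the reply above (unmount first, then write to
the delete node and let the synchronous write return) could be wrapped
roughly as follows. This is only a sketch: the function name
safe_remove_disk and the SYSFS_ROOT / MOUNTS_FILE overrides are
hypothetical and exist so the logic can be dry-run against a scratch
directory; only /sys/block/sdX/device/delete and /proc/mounts are real
paths.

```shell
#!/bin/sh
# Sketch: refuse to delete a disk that still has mounted filesystems.
# SYSFS_ROOT and MOUNTS_FILE are hypothetical overrides for dry runs.
safe_remove_disk() {
    dev=$1
    sysfs=${SYSFS_ROOT:-/sys}
    mounts=${MOUNTS_FILE:-/proc/mounts}
    node="$sysfs/block/$dev/device/delete"

    [ -e "$node" ] || { echo "no such device: $dev" >&2; return 2; }

    # A /dev/sdX or /dev/sdXN entry in the mount table means a
    # filesystem on this disk is still mounted.
    if grep -Eq "^/dev/$dev[0-9]* " "$mounts"; then
        echo "$dev still has mounted filesystems; unmount first" >&2
        return 1
    fi

    sync                # push out dirty page-cache data before removal
    echo 1 > "$node"    # synchronous: returns once the device is gone
}
```

Run as root, for example "safe_remove_disk sdc". The mount-table check
only catches plain partition mounts; users of md, dm or raw device
access are not detected by it.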
* Re: hot plug on ICH9 with AHCI on
2009-03-22 15:08 ` Tejun Heo
@ 2009-03-22 16:07 ` Владимир Дашевский
2009-03-22 16:41 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-22 16:07 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

Tejun!

Tejun Heo wrote:
>> I think you are not right here. If we are talking about EMI problems I
>> can say that the strategy of many retries is worse than one read of port
>> present status. EMI noise occurs at the time while software tries to
>> re-establish connection with empty link because there is no link
>> termination. It's just like a car engine that has lost its muffler. It
>> produces lots of noise. It is better to turn the link off as soon as we
>> know there is no device on the port. That's why retries should last only
>> until that state is reported by hardware. And I think hardware reports
>> that state much faster than in 15 or even 5 seconds. In ICH9 it reports
>> this just immediately.
>>
>
> Not all EMIs are one-shot events. Some can span seconds. Links don't
> always come up right after failures. Sometimes they require more than
> one hardreset to get back to working order. Link status report is
> not reliable. Sometimes they report offline for a while after certain
> events. If you know how to work around the above problems under a
> second, I'm all ears but I doubt it unless it involves an additional
> mechanical switch.
>
Well, for example, USB devices have a pull-up resistor on their D+ line.
DC bias can be used for detection of device presence without a mechanical
switch.

> I don't know of a practical downside to lingering for limited amount
> of time. If you know one, please let me know.
>
>
Ok.

>> Ok. So, one more question. How can I know exactly when device deletion
>> has completed after sending this command? For example, consider that
>> there some data in cache that needs 3 seconds to be sync-ed to disk.
>> How can I know that I must wait for 3 seconds before I can actually
>> remove the drive? Should I check the presence of some other filename in
>> /sys/block/sdX/ or do something else?
>>
>
> The echo to delete node is synchronous. It will return after the
> device is completely removed but please note that "removing" in this
> sense only covers the device itself. It will flush the request queue
> and spin the drive down but won't do anything about filesystems. You
> need to unmount first. hal and desktop stuff already do the right
> thing for devices marked removable.
>
>
Ok, but two more questions:

1. Is there any generic mechanism for notifying processes which had
previously opened the device being deleted of this event? What will
happen to such processes? Is it possible to check who is using the
drive at the moment?

2. If the drive was deleted, is it possible to start it back without
physical re-connection? Can I simulate a status change of that port to
force the driver to auto-detect the block device?

PS: as for this:

> I'll be happy to improve EH behavior but you need to come up with better reasons.
>
I can tell that for me enclosure management support is quite a good
reason. Unfortunately, there is no such support in the official kernel.
I have seen only limited support of the activity LED in kernel 2.6.28.
However, I am using Debian where the latest kernel is only 2.6.26. As a
result I had to write a simple ahci_em module which registers a simple
proc interface to send LED states to all ICH9 ports. However, the final
goal is to integrate this module with mdadm to have proper indication of
RAID state.

Best regards,
Vladimir Dashevsky

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: hot plug on ICH9 with AHCI on
2009-03-22 16:07 ` Владимир Дашевский
@ 2009-03-22 16:41 ` Tejun Heo
2009-03-22 18:26 ` Владимир Дашевский
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2009-03-22 16:41 UTC (permalink / raw)
To: Владимир Дашевский
Cc: Jeff Garzik, linux-ide

Hello,

Владимир Дашевский wrote:
>> Not all EMIs are one-shot events. Some can span seconds. Links don't
>> always come up right after failures. Sometimes they require more than
>> one hardreset to get back to working order. Link status report is
>> not reliable. Sometimes they report offline for a while after certain
>> events. If you know how to work around the above problems under a
>> second, I'm all ears but I doubt it unless it involves an additional
>> mechanical switch.
>>
> Well, for example, USB devices have a pull-up resistor on their D+ line.
> DC bias can be used for detection of device presence without mechanical
> switch.

SATA is not USB and onlineness detection isn't that simple. Also,
have you tried to run a system on a USB device over flaky connection?

>> The echo to delete node is synchronous. It will return after the
>> device is completely removed but please note that "removing" in this
>> sense only covers the device itself. It will flush the request queue
>> and spin the drive down but won't do anything about filesystems. You
>> need to unmount first. hal and desktop stuff already do the right
>> thing for devices marked removable.
>>
> Ok, but two more questions:
> 1. Is there any generic mechanism of notifying processes which had
> previously opened device being deleted of this event? What will happen
> to such processes? Is it possible to check who are those who uses the
> drive at the moment?

-EIO will happen, fuser, but if you want something intelligent, hal +
dbus.

> 2. If the drive was deleted is it possible to start it back without
> physical re-connection?
> Can I simulate status change of that port to
> force the driver to auto-detect block device?

I don't really follow what you're trying to achieve but if you want
some fancy snapshotting + remapping trick, the best place would be dm.

> PS: as for this:
>> I'll be happy to improve EH behavior but you need to come up with
>> better reasons.
>>
> I can tell that for me enclosure management support is quite a good
> reason.

How is that in any way exclusive against longer detach delay?

> Unfortunately, there is no such support in official kernel. I have
> seen only limited support of activity LED in kernel 2.6.28.
> However, I am using Debian where the latest kernel is only
> 2.6.26. As a result I had to write a simple ahci_em module which
> registers simple proc interface to send LED states to all ICH9
> ports. However, final goal is to integrate this module with mdadm to
> have proper indication of RAID state.

The biggest obstacle is that there aren't too many enclosure devices
floating around. What kind of device are you using?

--
tejun

^ permalink raw reply [flat|nested] 13+ messages in thread
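For the question of who is using the drive at the moment, fuser is the
tool named in the reply above. As a rough illustration of the idea
behind it, the sketch below walks /proc/PID/fd and prints the PIDs whose
open descriptors point at a given device path. It is an approximation
and not how fuser is implemented: it compares symlink targets by path
only, sees other users' processes only when run as root, and does not
see filesystems mounted from the device because those are held by the
kernel rather than by a process. The function name list_openers and the
PROC_ROOT override are hypothetical.

```shell
#!/bin/sh
# Sketch: list PIDs holding a device node open, by scanning /proc/PID/fd.
# PROC_ROOT is a hypothetical override so the scan can be dry-run.
list_openers() {
    dev=$1
    proc=${PROC_ROOT:-/proc}
    for fd in "$proc"/[0-9]*/fd/*; do
        [ -L "$fd" ] || continue                    # skip unmatched glob
        [ "$(readlink "$fd")" = "$dev" ] || continue
        pid=${fd#"$proc"/}                          # "123/fd/4"
        echo "${pid%%/*}"                           # "123"
    done | sort -un
}
```

For example, "list_openers /dev/sdc" prints one PID per line. Partitions
such as /dev/sdc1 have to be queried separately.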
* Re: hot plug on ICH9 with AHCI on
2009-03-22 16:41 ` Tejun Heo
@ 2009-03-22 18:26 ` Владимир Дашевский
2009-03-23 2:04 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-22 18:26 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

Tejun!

> Владимир Дашевский wrote:
>
>>> Not all EMIs are one-shot events. Some can span seconds. Links don't
>>> always come up right after failures. Sometimes they require more than
>>> one hardreset to get back to working order. Link status report is
>>> not reliable. Sometimes they report offline for a while after certain
>>> events. If you know how to work around the above problems under a
>>> second, I'm all ears but I doubt it unless it involves an additional
>>> mechanical switch.
>>>
>>>
>> Well, for example, USB devices have a pull-up resistor on their D+ line.
>> DC bias can be used for detection of device presence without mechanical
>> switch.
>>
>
> SATA is not USB and onlineness detection isn't that simple. Also,
> have you tried to run a system on a USB device over flaky connection?
>
Well, I cannot argue with you here. All that I wanted to say is that I
would prefer more optimistic software behavior if the hardware really
supports device connection status.

>>> The echo to delete node is synchronous. It will return after the
>>> device is completely removed but please note that "removing" in this
>>> sense only covers the device itself. It will flush the request queue
>>> and spin the drive down but won't do anything about filesystems. You
>>> need to unmount first. hal and desktop stuff already do the right
>>> thing for devices marked removable.
>>>
>>>
>> Ok, but two more questions:
>> 1. Is there any generic mechanism of notifying processes which had
>> previously opened device being deleted of this event? What will happen
>> to such processes? Is it possible to check who are those who uses the
>> drive at the moment?
>>
>
> -EIO will happen, fuser, but if you want something intelligent, hal +
> dbus.
>
Sorry, I missed the sense of this sentence. I tried this deletion with
fdisk and see that fdisk does not even complain about device failure. It
just starts to print an empty partition table and so on. So the question
is how to properly close any activity concerned with the device being
deleted if I do not know exactly what that activity is? Are the most
typical programs which are allowed to use raw block devices aware of
unexpected block device loss?

>> 2. If the drive was deleted is it possible to start it back without
>> physical re-connection? Can I simulate status change of that port to
>> force the driver to auto-detect block device?
>>
>
> I don't really follow what you're trying to achieve but if you want
> some fancy snapshotting + remapping trick, the best place would be dm.
>
Well, I didn't think of any tricks. I just deleted the drive as you
taught me and tried to get it back without moving myself in front of the
server. :-)
However, I think that some call to rescan scsi devices will be useful.

>
>> PS: as for this:
>>
>>> I'll be happy to improve EH behavior but you need to come up with
>>> better reasons.
>>>
>> I can tell that for me enclosure management support is quite a good
>> reason.
>>
>
> How is that in any way exclusive against longer detach delay?
>
I just answered with better reasons to make you happy, not with more
advice about the detach delay.

>
>> Unfortunately, there is no such support in official kernel. I have
>> seen only limited support of activity LED in kernel 2.6.28.
>> However, I am using Debian where the latest kernel is only
>> 2.6.26. As a result I had to write a simple ahci_em module which
>> registers simple proc interface to send LED states to all ICH9
>> ports. However, final goal is to integrate this module with mdadm to
>> have proper indication of RAID state.
>>
>
> The biggest obstacle is that there aren't too many enclosure devices
> floating around. What kind of device are you using?
>
I don't know exactly what device you are talking about. I was talking
about LED message types that are supported in ICH9.
As for my server, ICH9 provides an SGPIO interface that is routed to a
4-drive hot-swap backplane based on the AMI MG9071 chip. However, this
information isn't needed to program ICH9 since the LED message mechanism
is supported in it. Other message types are not supported. And it is
very strange that linux ahci still does not support this functionality
since it was first introduced in ICH8 (datasheet first released in June
of 2006).

PS: My code has about 11Kb of text and supports all useful RAID states:
NORMAL, LOCATE, REBUILD, FAILURE, HOTSPARE, PREDICTED FAILURE SOON. I
have tested it on my server and it works. I think it can be useful for
other implementations of soft RAID systems with hot swap support.

Best regards,
Vladimir Dashevsky

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: hot plug on ICH9 with AHCI on
2009-03-22 18:26 ` Владимир Дашевский
@ 2009-03-23 2:04 ` Tejun Heo
2009-03-23 11:38 ` Владимир Дашевский
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2009-03-23 2:04 UTC (permalink / raw)
To: Владимир Дашевский
Cc: Jeff Garzik, linux-ide

Hello,

Владимир Дашевский wrote:
>>> Well, for example, USB devices have a pull-up resistor on their D+ line.
>>> DC bias can be used for detection of device presence without mechanical
>>> switch.
>>
>> SATA is not USB and onlineness detection isn't that simple. Also,
>> have you tried to run a system on a USB device over flaky connection?
>>
> Well, I cannot argue with you here. All that I wanted to say is that I
> would prefer more optimistic software behavior if the hardware really
> supports device connection status.

I really don't follow your train of thought here. Are you saying
that the driver should be optimistic about the reliability of the
status reported by the hardware even when it is inherently imprecise
(please read the spec) and real world experiments prove that?

>>> Ok, but two more questions:
>>> 1. Is there any generic mechanism of notifying processes which had
>>> previously opened device being deleted of this event? What will happen
>>> to such processes? Is it possible to check who are those who uses the
>>> drive at the moment?
>>>
>>
>> -EIO will happen, fuser, but if you want something intelligent, hal +
>> dbus.
>>
> Sorry, I missed the sense of this sentence.

-EIO will happen to any processes trying to do IO on the removed
device. fuser will find out who's using the block device but if you
want something more intelligent, look at hal + dbus.

> I tried this deletion with fdisk and see that fdisk does not even
> complain about device failure. It just starts to print empty partition
> table and so on. So the question is how to properly close any
> activity concerned with device being deleted if I do not know
> exactly what is that activity?
> Are the most typical programs which
> are allowed to use raw block devices aware of unexpected block
> device loss?

Please take a look at how desktop guys are handling the issue. It's
not something which can be handled in kernel proper.

>> I don't really follow what you're trying to achieve but if you want
>> some fancy snapshotting + remapping trick, the best place would be dm.
>>
> Well, I didn't think of any tricks. I just deleted the drive as you
> taught me and tried to get it back without moving myself in front of the
> server. :-)
> However, I think that some call to rescan scsi devices will be useful.

Ah.. in that case, you can do

 # echo - - - > /sys/class/scsi_host/hostN/scan

>>>> I'll be happy to improve EH behavior but you need to come up with
>>>> better reasons.
>>> I can tell that for me enclosure management support is quite a good
>>> reason.
>>>
>>
>> How is that in any way exclusive against longer detach delay?
>
> I just answered with better reasons to make you happy, not with another
> advice of detach delay.

Well, I was asking for better reasons to change the detach delay.

>> The biggest obstacle is that there aren't too many enclosure devices
>> floating around. What kind of device are you using?
>>
> I don't know exactly what device are you talking about. I was talking
> about LED message types that are supported in ICH9.
> As for my server, ICH9 provides SGPIO interface that is routed to
> 4-drive hot-swap backplane based on AMI MG9071 chip. However, this
> information isn't needed to program ICH9 since the LED message mechanism
> is supported in it. Other message types are not supported. And it is
> very strange that linux ahci still does not support this functionality
> since it was first introduced in ICH8 (datasheet first release in June
> of 2006).

Yeah, I know it has been in the spec but without hardware to play with
it's difficult to add driver features and lack of general availability
also means lower demand.
> PS: My code has about 11Kb of text and supports all useful RAID states:
> NORMAL, LOCATE, REBUILD, FAILURE, HOTSPARE, PREDICTED FAILURE SOON. I
> have tested it on my server and it works. I think it can be useful for
> other implementations of soft RAID systems with hot swap support.

I think it should be independent from RAID but having general
enclosure support will be nice. Care to post the patches?

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 13+ messages in thread
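The rescan command given in this message targets a single hostN. When
the host number of the port is not known, one option is to write the
same "- - -" wildcard to every SCSI host, as sketched below. Looping
over all hosts is an assumption of this sketch and not something
suggested in the thread; the function name rescan_scsi_hosts and the
SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Sketch: ask every SCSI host to rescan all channels, targets and LUNs.
# SYSFS_ROOT is a hypothetical override so the loop can be dry-run.
rescan_scsi_hosts() {
    sysfs=${SYSFS_ROOT:-/sys}
    count=0
    for scan in "$sysfs"/class/scsi_host/host*/scan; do
        [ -e "$scan" ] || continue    # glob matched nothing
        echo "- - -" > "$scan"        # wildcard channel, target, LUN
        count=$((count + 1))
    done
    echo "$count"                     # number of hosts rescanned
}
```

Run as root. Each write blocks until that host has finished probing, so
the call can take several seconds per host.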
* Re: hot plug on ICH9 with AHCI on
2009-03-23 2:04 ` Tejun Heo
@ 2009-03-23 11:38 ` Владимир Дашевский
2009-03-23 12:01 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-23 11:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

Tejun!

> Hello,
>
> Владимир Дашевский wrote:
>
>>>> Well, for example, USB devices have a pull-up resistor on their D+ line.
>>>> DC bias can be used for detection of device presence without mechanical
>>>> switch.
>>>>
>>> SATA is not USB and onlineness detection isn't that simple. Also,
>>> have you tried to run a system on a USB device over flaky connection?
>>>
>>>
>> Well, I cannot argue with you here. All that I wanted to say is that I
>> would prefer more optimistic software behavior if the hardware really
>> supports device connection status.
>>
>
> I really don't follow your train of thought here. Are you saying
> that the driver should be optimistic about the reliability of the
> status reported by the hardware even when it is inherently imprecise
> (please read the spec) and real world experiments prove that?
>
No. I meant that the driver should perform better if the hardware
supports some features for that. Consider two different cases.

1. Hardware derives port population status by sensing the carrier in the
data link. In this case it is possible that some EMI noise can damage
link integrity so strongly that not only data bits but also the carrier
will be lost for a short time. This will lead to 'port is not present'
status although no one has actually removed the drive.

2. Hardware implements some feature like the pull-up resistor in USB, or
a special shorter 'present' contact as in PCI or CPCI connectors, or it
simply senses some DC current through the power lines etc. In this case
port status is robust against EMI noise and can be used to inform the
driver of the actual connection.
My thought was to improve driver behavior in case 2, either autodetected
by PCI IDs or manually overridden by some configure script.

>>> -EIO will happen, fuser, but if you want something intelligent, hal +
>>> dbus.
>>>
>>>
>> Sorry, I missed the sense of this sentence.
>>
>
> -EIO will happen to any processes trying to do IO on the removed
> device. fuser will find out who's using the block device but if you
> want something more intelligent, look at hal + dbus.
>
Hm, I tried to run fuser /dev/sda and got empty output. It seems that
the file system does not open sda. How does it work?

>> I tried this deletion with fdisk and see that fdisk does not even
>> complain about device failure. It just starts to print empty partition
>> table and so on. So the question is how to properly close any
>> activity concerned with device being deleted if I do not know
>> exactly what is that activity? Are the most typical programs which
>> are allowed to use raw block devices aware of unexpected block
>> device loss?
>>
> Please take a look at how desktop guys are handling the issue. It's
> not something which can be handled in kernel proper.
>
Ok.

>
>>> I don't really follow what you're trying to achieve but if you want
>>> some fancy snapshotting + remapping trick, the best place would be dm.
>>>
>>>
>> Well, I didn't think of any tricks. I just deleted the drive as you
>> taught me and tried to get it back without moving myself in front of the
>> server. :-)
>> However, I think that some call to rescan scsi devices will be useful.
>>
>
> Ah.. in that case, you can do
>
> # echo - - - > /sys/class/scsi_host/hostN/scan
>
Well, it works but it takes about 10 seconds to finish the scan for the
deleted drive. Is this ok?
Probably that's because the drive spins down after deletion and starts
to spin up during this scan.

>>> The biggest obstacle is that there aren't too many enclosure devices
>>> floating around. What kind of device are you using?
>>>
>>>
>> I don't know exactly what device are you talking about. I was talking
>> about LED message types that are supported in ICH9.
>> As for my server, ICH9 provides SGPIO interface that is routed to
>> 4-drive hot-swap backplane based on AMI MG9071 chip. However, this
>> information isn't needed to program ICH9 since the LED message mechanism
>> is supported in it. Other message types are not supported. And it is
>> very strange that linux ahci still does not support this functionality
>> since it was first introduced in ICH8 (datasheet first release in June
>> of 2006).
>>
>
> Yeah, I know it has been in the spec but without hardware to play with
> it's difficult to add driver features and lack of general availability
> also means lower demand.
>
Well, I just cannot imagine how software raid can work without clearly
visible state. One drive mixed up in RAID5 and the whole array can get
damaged. And it is not so difficult to mix them up because drive names
may differ from physical slot numbers.

>> PS: My code has about 11Kb of text and supports all useful RAID states:
>> NORMAL, LOCATE, REBUILD, FAILURE, HOTSPARE, PREDICTED FAILURE SOON. I
>> have tested it on my server and it works. I think it can be useful for
>> other implementations of soft RAID systems with hot swap support.
>>
>
> I think it should be independent from RAID but having general
> enclosure support will be nice. Care to post the patches?
>
>
Well, I can provide you with code which works on my ICH9 Supermicro
platform. I believe it will also work with both ICH8 and ICH10.
However, since I could not install this module as a traditional pci
driver (the kernel decided not to let it claim my ahci device since the
main driver is present in the system) I had to rewrite it as a general
linux kernel module. It just scans pci devices for AHCI capable ones and
remaps their ABAR to try enclosure management support. For now, only my
ICH9 PCI IDs are in my try list.
All AHCI EM-capable devices get their associated proc interface -
/proc/ahci_emX/leds*. This module actually works in parallel with the
kernel ahci driver but I think it will conflict with it once the kernel
driver starts to support EM by itself. I guess the best way would be to
document some API for controlling the EM, then to declare some kernel
ahci flag that will indicate full EM presence in the kernel. Then I can
improve my ahci_em module to skip its installation when similar
functions are built into the kernel.

My interface is quite simple. You just write a char to the
LED-controlling proc file to set the state of the LEDs, for example:
echo r > /proc/ahci_em0/leds0 means you asked for the REBUILD state to
be indicated in the bay of port 0.
I think that most users would prefer an additional module rather than a
kernel upgrade, at first. Also, I am not very close to the linux kernel
to provide a kernel patch.

Thanks.

Best regards,
Vladimir Dashevsky

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: hot plug on ICH9 with AHCI on
2009-03-23 11:38 ` Владимир Дашевский
@ 2009-03-23 12:01 ` Tejun Heo
2009-03-24 8:26 ` Владимир Дашевский
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2009-03-23 12:01 UTC (permalink / raw)
To: Владимир Дашевский
Cc: Jeff Garzik, linux-ide

Hello,

Владимир Дашевский wrote:
>> I really don't follow your train of thought here. Are you saying
>> that the driver should be optimistic about the reliability of the
>> status reported by the hardware even when it is inherently imprecise
>> (please read the spec) and real world experiments prove that?
>>
> No. I meant that driver should perform better if the hardware supports
> some features for that. Consider two different cases.
> 1. hardware derives port population status by sensing the carrier in the
> data link. In this case it is possible that some EMI noise can damage
> link integrity so strongly that not only data bits but also a carrier
> will be lost for a short time. This will lead to 'port is not present'
> status however no one has actually removed the drive.

SATA is even weaker than this. If you put a machine under an EMI
test, SATA links check out first, together with unshielded serial
connections.

> 2. Hardware implements some feature like pull-up resistor in USB, or
> special shorter 'present' contact as in PCI or CPCI connectors, or it
> simply senses some dc current through power lines etc. In this case port
> status is robust over EMI noise and can be used to inform driver of
> actual connection.
> My thought was to improve driver behavior in case 2, either autodetected
> by PCI IDs or manually overridden by some configure script.

There is no extra line in the SATA connector. There are only two pairs
of wires, one for each direction. Each pair is used to carry a voltage
differential. There is no common ground or closed circuit formed.
The only thing which distinguishes an online link from an offline one
is a live signal on the receiving side, hence the earlier mention of a
mechanical switch. ahci actually has support for it although I haven't
seen any which actually implemented it.

>> -EIO will happen to any processes trying to do IO on the removed
>> device. fuser will find out who's using the block device but if you
>> want something more intelligent, look at hal + dbus.
>>
> Hm, I tried to write fuser /dev/sda and got empty output. It seems that
> file system does not open sda. How it works?

No it won't show up and I can't really answer all your linux related
questions. Sorry. :-P

>> Ah.. in that case, you can do
>>
>> # echo - - - > /sys/class/scsi_host/hostN/scan
>>
> well, it works but it takes of about 10 seconds to finish scan for
> deleted drive. is this ok?
> Probably, that's because drive goes down after deletion and it starts to
> spin up during this scan.

Yeah, probably.

>> Yeah, I know it has been in the spec but without hardware to play with
>> it's difficult to add driver features and lack of general availability
>> also means lower demand.
>>
> Well, I just cannot imagine how software raid can work without clearly
> visible state. One drive mixed up in RAID5 and the whole array can get
> damaged. And it is not so difficult to mix them up because drive names
> may differ from physical slot numbers.

Yeah, it can be tricky but highend machines usually go with sas and
consumer grade machines don't really care, so there just aren't too
many ahci machines with enclosure support. Well, not here anyway.

>> I think it should be independent from RAID but having general
>> enclosure support will be nice. Care to post the patches?
>>
>>
> Well, I can provide you with a code which works on my ICH9 Supermicro
> platform. I believe it will also work with both ICH8 and ICH10.
> However, since I could not install this module as traditional pci driver
> (the kernel decided not to claim my ahci device since the main driver
> present in the system) I had to rewrite it as a general linux kernel
> module. It just scans pci devices for AHCI capable ones and remaps their
> ABAR to try enclosure management support. For now, only my ICH9 PCI IDs
> are in my try list. All AHCI EM-capable devices get their associated
> proc interface - /proc/ahci_emX/leds*. This module actually works in
> parallel with kernel ahci driver but I think it will be a conflict with
> it once the kernel driver starts to support em by itself. I guess, the
> best way would be to document some API for controlling the EM, then to
> declare some kernel ahci flag that will indicate full EM presence in the
> kernel. Then I can improve my ahci_em module to skip its installation
> when similar functions are built into the kernel.
>
> My interface is quite simple. You just write a char to leds-controlling
> proc file to set state of leds, for example:
> echo r > /proc/ahci_em0/leds0 means you asked for REBUILD state
> indicated in the bay of port 0.
> I think that most of users would prefer additional module rather than
> kernel upgrade, for the first time. Also, I am not very close to linux
> kernel to provide a kernel patch.

I don't think the design you described would fit into the upstream
kernel too well but if you have it working it's some place to start.
Just refresh it on top of the current devel kernel and let's see what
can be done.

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: hot plug on ICH9 with AHCI on
2009-03-23 12:01 ` Tejun Heo
@ 2009-03-24 8:26 ` Владимир Дашевский
0 siblings, 0 replies; 13+ messages in thread
From: Владимир Дашевский @ 2009-03-24 8:26 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

[-- Attachment #1: Type: text/plain, Size: 4265 bytes --]

Tejun!

> No it won't show up and I can't really answer all your linux related
> questions. Sorry. :-P
>
Nice system...

>
>>> Yeah, I know it has been in the spec but without hardware to play with
>>> it's difficult to add driver features and lack of general availability
>>> also means lower demand.
>>>
>>>
>> Well, I just cannot imagine how software raid can work without clearly
>> visible state. One drive mixed up in RAID5 and the whole array can get
>> damaged. And it is not so difficult to mix them up because drive names
>> may differ from physical slot numbers.
>>
>
> Yeah, it can be tricky but highend machines usually go with sas and
> consumer grade machines don't really care, so there just aren't too
> many ahci machines with enclosure support. Well, not here anyway.
>
>
As for me, there are a lot of tasks where the main issue is reliability
and not necessarily performance. SAS cannot be used for RAID anyway,
it's closer to RAED. I guess that software RAID can be the best solution
for every case where reliability and online scalability are needed.

>>> I think it should be independent from RAID but having general
>>> enclosure support will be nice. Care to post the patches?
>>>
>>>
>>>
>> Well, I can provide you with a code which works on my ICH9 Supermicro
>> platform. I believe it will also work with both ICH8 and ICH10.
>> However, since I could not install this module as traditional pci driver
>> (the kernel decided not to claim my ahci device since the main driver
>> present in the system) I had to rewrite it as a general linux kernel
>> module.
It justs scan pci devices for AHCI capable ones and remaps their >> ABAR to try enclosure management support. For now, only my ICH9 PCI IDs >> are in my try list. All AHCI EM-capable devices get their associated >> proc interface - /proc/ahci_emX/leds*. This module actually works in >> parallel with kernel ahci driver but I think it will be a conflict with >> it once the kernel driver starts to support em by itself. I guess, the >> best way would be to document some API for controlling the EM, then to >> declare some kernel ahci flag that will indicate full EM presence in the >> kernel. Then I can improve my ahci_em module to skip its installation >> when similar functions are built into the kernel. >> >> My interface is quite simple. You just write a char to leds-controlling >> proc file to set state of leds, for example: >> echo r > /proc/ahci_em0/leds0 means you asked for REBUILD state >> indicated in the bay of port 0. >> I think that most of users would prefer additional module rather that >> kernel udgrade, for the first time. Also, I am not very close to linux >> kernel to provide a kernel patch. >> > > I don't think the design you described would fit into upstream kernel > too well but if you have it working it's some place to start. Just > refresh in on top of the current devel kernel and let's see what can > be done. > Sorry, I am not so close to linux kernel to perform a jump from my current kernel (2.6.25) to latest one. I am sending you the sources of kernel module which performs all LEDs EM functions. The code is simple and relatively small. I have brushed it up to be more readable. If you decide it useful and will change user API for this (eg, move proc interface to sys interface) then let me know. I would like to design this software so that module could be used with older kernels and have the same behavior as newer kenels will have. this will help to spread this technology among machines where kernel cannot be upgraded for some reasons. 
Usage notes:
1. Use 'make' to build the driver module.
2. Use 'make reload' to (re)install the driver into the system.
3. Use 'echo X > /proc/ahci_em0/ledsN' to set the state of the LEDs for
   port N (N is 0 to 5 for ICH9).
   X is one of:
   'N' - normal state (leds off)
   'L' - locate state (4Hz blinking)
   'R' - rebuild state (1Hz blinking)
   'F' - failure state (solid + beep)
   'P' - predicted failure state (two 4Hz blinks in 1 second)
   'H' - hot spare (two 4Hz blinks in 4 seconds)
My enclosure has only one state LED (besides the activity LED), so I
developed blink patterns as described here:
http://en.wikipedia.org/wiki/IBPI
Best regards,
Vladimir Dashevsky
[-- Attachment #2: ahci_em.tar.gz --]
[-- Type: application/gzip, Size: 4973 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
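For scripting, the 'echo X > /proc/ahci_em0/ledsN' step from the usage
notes could be wrapped roughly as below. This is only a sketch, not part
of the attached ahci_em sources: the helper names (led_path,
set_led_state), the controller and proc_root parameters, and the
upper-casing of the state letter are assumptions made for illustration.
The state letters and the 0 to 5 port range are taken from the usage
notes.

```python
from pathlib import Path

# State letters exactly as listed in the usage notes.
LED_STATES = {
    "N": "normal (LEDs off)",
    "L": "locate (4 Hz blinking)",
    "R": "rebuild (1 Hz blinking)",
    "F": "failure (solid + beep)",
    "P": "predicted failure (two 4 Hz blinks in 1 second)",
    "H": "hot spare (two 4 Hz blinks in 4 seconds)",
}

# Ports 0..5, the range the usage notes give for ICH9.
ICH9_PORTS = range(6)


def led_path(port, controller=0, proc_root="/proc"):
    """Build the path of the LED control file for one port.

    proc_root is a hypothetical parameter so the helper can be
    exercised against a scratch directory instead of the real /proc.
    """
    if port not in ICH9_PORTS:
        raise ValueError(f"port must be 0..5, got {port!r}")
    return Path(proc_root) / f"ahci_em{controller}" / f"leds{port}"


def set_led_state(port, state, controller=0, proc_root="/proc"):
    """Write one state letter to the LED file, as 'echo X > file' would."""
    # Assumption: letters are normalized to the upper-case form
    # used in the usage notes before being written.
    letter = state.upper()
    if letter not in LED_STATES:
        raise ValueError(f"unknown LED state {state!r}")
    path = led_path(port, controller, proc_root)
    with open(path, "w") as led_file:
        led_file.write(letter + "\n")  # echo appends a newline too
    return path
```

With the module loaded, set_led_state(0, "R") would correspond to
'echo R > /proc/ahci_em0/leds0'. Rejecting unknown letters and ports
before the write keeps a typo from reaching the proc file at all.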
end of thread, other threads:[~2009-03-24 8:28 UTC | newest]
Thread overview: 13+ messages
2009-03-18 22:12 hot plug on ICH9 with AHCI on Владимир Дашевский
2009-03-20 2:04 ` Tejun Heo
2009-03-20 9:55 ` Владимир Дашевский
[not found] ` <49C39933.4020501@kernel.org>
2009-03-20 15:31 ` Владимир Дашевский
2009-03-22 12:51 ` Владимир Дашевский
2009-03-22 15:08 ` Tejun Heo
2009-03-22 16:07 ` Владимир Дашевский
2009-03-22 16:41 ` Tejun Heo
2009-03-22 18:26 ` Владимир Дашевский
2009-03-23 2:04 ` Tejun Heo
2009-03-23 11:38 ` Владимир Дашевский
2009-03-23 12:01 ` Tejun Heo
2009-03-24 8:26 ` Владимир Дашевский