* Linux libata, Sil3124, and SATA staggered spin-up support
@ 2007-03-31 16:29 Craig Metz
From: Craig Metz @ 2007-03-31 16:29 UTC
To: linux-ide
I have a Sil3124 controller and four drives, and I'm trying to build a
large JBOD file server (with three or four of those controller+drive sets).
To make that practical, I need reasonable staggered spin-up; that is, the
drives must not all try to start at once, which would put a lot of strain on
any power supply and require gross oversizing and/or separate supplies.
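To put rough numbers on the power-supply concern, here is a minimal sketch in Python. The per-drive currents are made-up placeholders, not measurements or datasheet values, and `peak_amps` is a hypothetical helper; substitute figures for the actual drives.

```python
# Illustrative assumptions only: neither value comes from a datasheet.
SPINUP_AMPS_12V = 2.0  # assumed 12 V current of one drive during spin-up
IDLE_AMPS_12V = 0.5    # assumed 12 V current of one drive once spinning


def peak_amps(num_drives, staggered):
    """Worst-case 12 V current during power-on for num_drives drives."""
    if num_drives == 0:
        return 0.0
    if not staggered:
        # Every drive is in its spin-up surge at the same moment.
        return num_drives * SPINUP_AMPS_12V
    # Fully staggered: the worst moment is the last drive spinning up
    # while all of the earlier drives are already idling.
    return SPINUP_AMPS_12V + (num_drives - 1) * IDLE_AMPS_12V


if __name__ == "__main__":
    for n in (4, 16):
        print(n, peak_amps(n, staggered=False), peak_amps(n, staggered=True))
```

With these assumed figures, 16 drives peak at 32 A on the 12 V rail when started together and at 9.5 A when fully staggered.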
Silicon Image claims to support staggered spin-up, but their BIOS appears
to do it in the pessimal manner - on boot, it simply spins up all drives at
once, and there are no settings I could find to change any of that behavior.
What I could find, though, is a way to simply erase the adapter's flash and
thus disable the BIOS entirely.
I was hoping that Linux would be able to do staggered spin-up itself in a
reasonable manner. And, if it doesn't today, I am hoping that is just a matter
of someone writing the code.
I have done some digging in with the 2.6.20.3 libata code, and one thing I
have noticed that is specific to the Sil3124 driver is that the controller
init function starts up all the ports, which spins up all the drives. Does
anyone know if the Sil3124 hardware really supports the necessary things to do
staggered spin-up in a rational manner? I have done some digging and some
playing with the code, and from what I see, the controller likes to either
cause all the ports to come up (drives to spin up) or not.
Is this something that folks would be willing to support in libata if
reasonable code existed? At some level, staggered spin-up should be done in
the BIOS. However, I expect that Sil3124 is far from the only controller whose
BIOS does it wrong but could be disabled. So I personally see value in having
Linux able to do it and do it right, but if you two don't and the patch would
never go in, then I would be wasting my time continuing to work on this.
What is the right sequence of events for staggered spin-up? I was thinking
that the process should as a first cut happen iteratively/serially:
  Init controller
  Add controller to libata, get resources
  for each port
    Init port
    Add port to libata
    Tell SCSI layer to identify
This is a different flow than what libata/Sil3124 currently does:
  Init controller
  for each port
    Init port
  Add controller to libata
  for each port
    More init port
    Add port to libata
  Get resources
  for each port
    Tell SCSI layer to identify
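The difference between the two orderings can be sketched as a toy model in Python. This is not libata code: the function names are hypothetical, the event strings simply mirror the steps listed above, and the loop nesting of the current flow is an assumption read from that list. The helper counts how many ports have been initialized (and so may have started their drives) before the first identify runs.

```python
def current_flow(ports):
    """Event order of the flow libata/Sil3124 uses today, as listed above."""
    events = ["init controller"]
    for p in ports:
        events.append(f"init port {p}")  # all ports started back to back
    events.append("add controller to libata")
    for p in ports:
        events.append(f"more init port {p}")
        events.append(f"add port {p} to libata")
    events.append("get resources")
    for p in ports:
        events.append(f"identify port {p}")
    return events


def proposed_flow(ports):
    """Event order of the proposed serial flow, as listed above."""
    events = ["init controller", "add controller to libata, get resources"]
    for p in ports:
        events.append(f"init port {p}")
        events.append(f"add port {p} to libata")
        events.append(f"identify port {p}")
    return events


def inits_before_first_identify(events):
    """Number of 'init port' events that precede the first identify."""
    first = next(i for i, e in enumerate(events) if e.startswith("identify"))
    return sum(e.startswith("init port") for e in events[:first])


if __name__ == "__main__":
    ports = range(4)
    print(inits_before_first_identify(current_flow(ports)))
    print(inits_before_first_identify(proposed_flow(ports)))
```

For four ports the current ordering initializes all four before any identify, while the proposed ordering initializes only one.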
I am not an expert on libata, so I don't really understand the design
decisions that led to the current design. I would greatly appreciate it if
you had any suggestions or could give me additional clues about what the
right design should be to implement a feature like this along with everything
else libata needs to do.
Thanks,
-Craig
* Re: Linux libata, Sil3124, and SATA staggered spin-up support
From: Tejun Heo @ 2007-04-11 8:22 UTC
To: Craig Metz; +Cc: linux-ide
Hello,
Craig Metz wrote:
> I was hoping that Linux would be able to do staggered spin-up itself in a
> reasonable manner. And, if it doesn't today, I am hoping that is just a matter
> of someone writing the code.
>
> I have done some digging in with the 2.6.20.3 libata code, and one thing I
> have noticed that is specific to the Sil3124 driver is that the controller
> init function starts up all the ports, which spins up all the drives. Does
> anyone know if the Sil3124 hardware really supports the necessary things to do
> staggered spin-up in a rational manner? I have done some digging and some
> playing with the code, and from what I see, the controller likes to either
> cause all the ports to come up (drives to spin up) or not.
Yeah, the silicon can do staggered spin up. SATA disks shouldn't spin
up till PHY gets reset. libata currently probes each port serially, so
it should sequentially spin up each drive. Or do they spin up during
controller initialization?
> Is this something that folks would be willing to support in libata if
> reasonable code existed? At some level, staggered spin-up should be done in
> the BIOS. However, I expect that Sil3124 is far from the only controller whose
> BIOS does it wrong but could be disabled. So I personally see value in having
> Linux able to do it and do it right, but if you two don't and the patch would
> never go in, then I would be wasting my time continuing to work on this.
Yeah, staggered spin up support would be good to have.
> What is the right sequence of events for staggered spin-up? I was thinking
> that the process should as a first cut happen iteratively/serially:
>
>   Init controller
>   Add controller to libata, get resources
>   for each port
>     Init port
>     Add port to libata
>     Tell SCSI layer to identify
>
> This is a different flow than what libata/Sil3124 currently does:
>
>   Init controller
>   for each port
>     Init port
>   Add controller to libata
>   for each port
>     More init port
>     Add port to libata
>   Get resources
>   for each port
>     Tell SCSI layer to identify
>
> I am not an expert on libata, so I don't really understand the design
> decisions that led to the current design. I would greatly appreciate it if
> you had any suggestions or could give me additional clues about what the
> right design should be to implement a feature like this along with everything
> else libata needs to do.
Hmmm... Most of ATA probing occurs in EH between ata_port_schedule_eh()
and ata_port_wait_eh() in ata_device_add(). For ATAPI devices, SCSI
probing plays a role but for ATA devices the probing just fetches some
info from libata and configures SCSI layer accordingly.
--
tejun
* Re: Linux libata, Sil3124, and SATA staggered spin-up support
From: Craig Metz @ 2007-04-11 13:53 UTC
To: Tejun Heo; +Cc: linux-ide
In message <461C9AB2.9070807@gmail.com>, you write:
>Yeah, the silicon can do staggered spin up. SATA disks shouldn't spin
>up till PHY gets reset. libata currently probes each port serially, so
>it should sequentially spin up each drive. Or do they spin up during
>controller initialization?
On a cold boot, if I disable the BIOS, all of the drives begin their spin-up
at once (or so close together that it is not staggered). It appears to happen
in sil24_init_controller(), where it loops through all ports and resets the PHYs.
So it appears that the control flow there would need to be changed so that
we go all the way to having libata identify port 1 before we go back and
initialize the PHY for port 2.
If I put all the drives into suspend mode (hdparm -z) and warm reboot, the
drives are spun up in libata's ata_device_add(). In the second loop, the
drives are all spun up, then in the third loop they're identified. In this
case a simple change to combine loops 2 ("probe begin") & 3 ("host probe
begin") so that they happen more serially seemed to produce reasonable
staggering behavior. On my drives, the identify commands seemed to
complete a bit before the drives were finished spinning up, so there was still
some overlap in power draw, but over a large population of drives this
shouldn't be a big deal.
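The residual overlap can be estimated with a small Python sketch. It assumes a simplified model that is not taken from libata: each drive starts spinning when the previous drive's identify completes, every identify takes the same time, and every spin-up lasts the same time. The function name and the timings in the example are hypothetical.

```python
def max_concurrent_spinups(num_drives, identify_secs, spinup_secs):
    """Largest number of drives that are spinning up at the same instant."""
    if num_drives == 0:
        return 0
    # Drive i starts when the identify of drive i - 1 has completed.
    starts = [i * identify_secs for i in range(num_drives)]
    # The count can only reach a new maximum at the instant a drive starts.
    return max(
        sum(1 for s in starts if s <= t < s + spinup_secs)
        for t in starts
    )


if __name__ == "__main__":
    # Assumed timings: identify completes after 3 s, spin-up lasts 8 s.
    print(max_concurrent_spinups(4, 3.0, 8.0))
```

Under these assumed timings at most three of the four drives overlap, and the count stays at three however many more drives are added. If identify took longer than spin-up, the model would give no overlap at all.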
I don't have any port multipliers handy to play with, but I expect that they
make this problem a bit more complex and require some thinking about.
-Craig