[RFC] AHCI Command Completion Coalescing(CCC) proposal

* [RFC] AHCI Command Completion Coalescing(CCC) proposal
@ 2006-06-08  7:30 zhao, forrest
  2006-06-08 15:01 ` Jeff Garzik
  0 siblings, 1 reply; 24+ messages in thread
From: zhao, forrest @ 2006-06-08  7:30 UTC (permalink / raw)
  To: jgarzik, htejun; +Cc: linux-ide

Hello, all

0 Why this RFC?
Although AHCI spec 1.1 explains in detail how to program the CCC-related
registers to enable CCC, several CCC policy parameters need to be
defined (or at least agreed on) before we start writing the code.

1 What is CCC used for?
As described in AHCI spec 1.1, "CCC is a feature designed to reduce the
interrupt and command completion overhead in a heavily loaded system.
The feature enables the number of interrupts taken per completion to be
reduced significantly, while ensuring a minimum quality of service for
command completions. When a software specified number of commands have
completed or a software specified timeout has expired, an interrupt is
generated by hardware to allow software to process completed commands."

2 When is CCC activated?
As stated above, CCC is useful only when the system is heavily loaded, so
CCC should be activated only then. The question is how to decide whether
the system is heavily loaded. In other words, how many interrupts per
second qualify as "heavily loaded"? Does it make sense to use "1000 IRQs
per second" as the threshold?
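
A rough sketch of how such a load test could look, for illustration only.
The helper names, the sampling window and the use of 1000 IRQs per second
as the cut-off are assumptions taken from the question above, not settled
policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IRQs per second; this is the value under discussion, not a decided one. */
#define CCC_IRQ_RATE_THRESHOLD 1000u

/* Scale an IRQ count observed over window_ms to IRQs per second. */
static uint64_t irqs_per_second(uint64_t irq_count, unsigned int window_ms)
{
    assert(window_ms > 0);
    return irq_count * 1000u / window_ms;
}

/* Hypothetical policy check: is the measured rate at or above the cut-off? */
static bool is_heavily_loaded(uint64_t irq_count, unsigned int window_ms)
{
    return irqs_per_second(irq_count, window_ms) >= CCC_IRQ_RATE_THRESHOLD;
}
```

For example, 250 interrupts seen in a 250ms window scale to 1000 IRQs per
second and would count as heavily loaded under this threshold.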

3 What should the software specified number of commands be?
From my understanding, "IRQs per second" should be measured per port
rather than across all ports of a SATA controller.
For NCQ, each port has 31 usable command slots (the 32nd command slot
is reserved for internal commands), so the software specified number of
commands should be 31*n, where n is the number of ports selected to
join CCC.
For non-NCQ, each port has 32 usable command slots, so the
software specified number of commands should be 32*n.
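
The arithmetic above can be sketched in C. This is only an illustration:
ccc_completion_count() and the constants are hypothetical names, and the
clamp to 255 assumes that the CC field of CCC_CTL is 8 bits wide, which
should be checked against the spec.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS_NCQ     31u  /* 32 slots minus one kept for internal commands */
#define SLOTS_NON_NCQ 32u
#define CCC_CC_MAX    255u /* assumption: CCC_CTL.CC is an 8-bit field */

/* Count the set bits in the mask of ports that join CCC. */
static unsigned int count_ports(uint32_t port_mask)
{
    unsigned int n = 0;

    while (port_mask) {
        port_mask &= port_mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Proposed completion count: slots per port times number of joined ports. */
static unsigned int ccc_completion_count(uint32_t port_mask,
                                         unsigned int slots_per_port)
{
    unsigned int cc;

    assert(slots_per_port >= 1 && slots_per_port <= 32);
    cc = count_ports(port_mask) * slots_per_port;
    return cc > CCC_CC_MAX ? CCC_CC_MAX : cc;
}
```

With two non-NCQ ports joined (mask 0x3) this gives 64, and with one NCQ
port it gives 31. Eight non-NCQ ports would give 256, which the sketch
clamps to 255.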

4 What should the software specified timeout be?
I don't have a strong argument for a specific timeout value. 500ms? Or
1000ms? We need to trade off completion latency against interrupt overhead.

5 When is CCC de-activated?
When a port becomes lightly loaded, we should de-activate CCC on that
port; otherwise we introduce unnecessary delay. However, to avoid
jitter, we should not de-activate CCC on a port as soon as its IRQs per
second drop below the threshold. My suggestion is that if 3 consecutive
timeouts occur, we de-activate CCC on the port with the fewest "IRQs
per second".
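
A minimal sketch of this hysteresis, assuming the driver can tell whether
a CCC interrupt was caused by the timeout or by the completion count. The
struct and function names are made up for illustration, and how the
was_timeout flag is obtained is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* From the suggestion above: three consecutive timeouts trigger a drop. */
#define CCC_DEACTIVATE_TIMEOUTS 3u

struct ccc_hysteresis {
    unsigned int consecutive_timeouts;
};

/*
 * Record why a CCC interrupt fired. Returns true when the caller should
 * remove one port (the one with the fewest IRQs per second) from CCC.
 */
static bool ccc_note_interrupt(struct ccc_hysteresis *h, bool was_timeout)
{
    assert(h != NULL);
    if (!was_timeout) {
        /* Fired on completion count: still busy, restart the streak. */
        h->consecutive_timeouts = 0;
        return false;
    }
    if (++h->consecutive_timeouts < CCC_DEACTIVATE_TIMEOUTS)
        return false;
    h->consecutive_timeouts = 0; /* count afresh for the next port */
    return true;
}
```

A single completion-count interrupt between timeouts resets the streak, so
a briefly quiet port is not dropped.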

Your comments are welcome.

Thanks,
Forrest

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-08  7:30 [RFC] AHCI Command Completion Coalescing(CCC) proposal zhao, forrest
@ 2006-06-08 15:01 ` Jeff Garzik
  2006-06-09  2:27   ` zhao, forrest
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2006-06-08 15:01 UTC (permalink / raw)
  To: zhao, forrest; +Cc: htejun, linux-ide

zhao, forrest wrote:
> Hello, all
> 
> 0 Why this RFC?
> Although AHCI spec 1.1 provides a detailed explanation about how to play
> with CCC-related registers to enable CCC, several CCC-policy-related
> parameters need to be defined(or the consensus need to be achieved)
> before we start to write the code.

To brag a bit, I pushed Intel heavily for this feature, in the 
pre-AHCI-1.0 development days.


>>From my understanding, the measurement of "IRQ numbers per second"
> should be based on per-port instead of all ports of a SATA controller.

No, it should be all ports of a SATA controller.

If an interrupt arrives while CCC is active, we should take the 
opportunity to check all ports for activity -- as the standard code does 
now.


> 4 What should the software specified timeout be?
> I don't have the strong reasoning of a specific timeout value. 500ms? or
> 1000ms? We should trade-off between the delay and overhead.

500ms is a lot of latency.

	Jeff



* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-08 15:01 ` Jeff Garzik
@ 2006-06-09  2:27   ` zhao, forrest
  2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
  2006-06-09  3:30     ` [RFC] AHCI Command Completion Coalescing(CCC) proposal Jeff Garzik
  0 siblings, 2 replies; 24+ messages in thread
From: zhao, forrest @ 2006-06-09  2:27 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: htejun, linux-ide

On Thu, 2006-06-08 at 11:01 -0400, Jeff Garzik wrote: 
> >>From my understanding, the measurement of "IRQ numbers per second"
> > should be based on per-port instead of all ports of a SATA controller.
> 
> No, it should be all ports of a SATA controller.

Maybe I didn't state my idea clearly. Let me explain with an example.

1 Assume there are 2 active ports (P0 and P1) on a system, both running
in non-NCQ mode.
2 During a certain period, P0 is heavily loaded and generates >1000
interrupts per second; P1 is idle and generates no interrupts.
3
3.1 If "IRQs per second" is measured across all active ports of a SATA
controller, CCC is activated by setting CCC_PORTS to 0x3 and CCC_CTL.CC
to 64 (32*2). The problem is that the CCC interrupt will then be raised
only when the timeout expires: because P1 is idle, hCccComplete can
never reach 64; its maximum is 32.
3.2 If "IRQs per second" is measured per port, we know that P0 is
heavily loaded, so CCC is activated by setting CCC_PORTS to 0x1 and
CCC_CTL.CC to 32. Then CCC takes effect as expected :)
NOTE: hCccComplete is the term used in section 11 of AHCI spec 1.1.

> > 4 What should the software specified timeout be?
> > I don't have the strong reasoning of a specific timeout value. 500ms? or
> > 1000ms? We should trade-off between the delay and overhead.
> 
> 500ms is a lot of latency.

I'll use 100ms as the timeout value in the code.
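
To make the numbers concrete, here is a sketch of how a CCC_CTL value
could be assembled for the per-port case (CCC_PORTS = 0x1, CC = 32,
timeout = 100ms). The field positions are how I read the register
description in AHCI 1.1 and should be checked against the spec; the macro
and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed CCC_CTL layout (verify against AHCI 1.1):
 *   bits 31:16  TV  timeout value in milliseconds
 *   bits 15:8   CC  command completions before an interrupt
 *   bit  0      EN  enable
 */
#define CCC_CTL_TV_SHIFT 16
#define CCC_CTL_CC_SHIFT 8
#define CCC_CTL_EN       (1u << 0)

/* Build a CCC_CTL value with the enable bit set. */
static uint32_t ccc_ctl_pack(uint16_t timeout_ms, uint8_t completions)
{
    assert(timeout_ms != 0); /* assumption: a zero timeout is not useful */
    return ((uint32_t)timeout_ms << CCC_CTL_TV_SHIFT) |
           ((uint32_t)completions << CCC_CTL_CC_SHIFT) |
           CCC_CTL_EN;
}
```

Under these assumptions ccc_ctl_pack(100, 32) evaluates to 0x00642001.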

Thanks,
Forrest

* Another project for you...  :)
  2006-06-09  2:27   ` zhao, forrest
@ 2006-06-09  3:11     ` Jeff Garzik
  2006-06-09  3:13       ` zhao, forrest
                         ` (2 more replies)
  2006-06-09  3:30     ` [RFC] AHCI Command Completion Coalescing(CCC) proposal Jeff Garzik
  1 sibling, 3 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  3:11 UTC (permalink / raw)
  To: zhao, forrest; +Cc: htejun, linux-ide, randy_dunlap, Alan Cox

Forrest,

BTW, if you are looking for useful libata projects, it would really be 
nice to resurrect Randy Dunlap's SATA ACPI patches, update those for the 
current libata-dev.git#upstream, and get those in.

libata needs to execute the SATA taskfiles passed to us from ACPI BIOS 
tables, in order to properly set up the hard drive in a way the user 
expects (hard drive password, acoustic settings, etc.).  There should be 
a module option that allows the user to skip this step, and preserve 
current behavior.

Also, a feature Alan requests on occasion:  Call the ATA "set max" 
command to fully address the hard drive, including HPA.  The Linux 
standard is to export the raw hardware directly, making 100% of the 
hardware capability available to the user (and, in this case, 
Linux-based BIOS and recovery tools).

	Jeff

* Re: Another project for you...  :)
  2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
@ 2006-06-09  3:13       ` zhao, forrest
  2006-06-09 22:29         ` Greg Freemyer
  2006-06-09  3:43       ` [RFC] ATA host-protected area (HPA) device mapper? Jeff Garzik
  2006-06-14  8:01       ` Another project for you... :) zhao, forrest
  2 siblings, 1 reply; 24+ messages in thread
From: zhao, forrest @ 2006-06-09  3:13 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: htejun, linux-ide, randy_dunlap, Alan Cox

On Thu, 2006-06-08 at 23:11 -0400, Jeff Garzik wrote:
> Forrest,
> 
> BTW, if you are looking for useful libata projects, it would really be 
> nice to resurrect Randy Dunlap's SATA ACPI patches, update those for the 
> current libata-dev.git#upstream, and get those in.
> 
> libata needs to execute the SATA taskfiles passed to us from ACPI BIOS 
> tables, in order to properly set up the hard drive in a way the user 
> expects (hard drive password, acoustic settings, etc.).  There should be 
> a module option that allows the user to skip this step, and preserve 
> current behavior.

OK. I'll work on Randy's SATA ACPI patches first, then CCC.

> Also, a feature Alan requests on occasion:  Call the ATA "set max" 
> command to fully address the hard drive, including HPA.  The Linux 
> standard is to export the raw hardware directly, making 100% of the 
> hardware capability available to the user (and, in this case, 
> Linux-based BIOS and recovery tools).

I'll first study what this means, then start to work on it.

Jeff and Tejun,

Thank you very much for helping me get involved in libata development.
I'm very happy to contribute to an open source project :)

Best wishes,
Forrest

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  2:27   ` zhao, forrest
  2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
@ 2006-06-09  3:30     ` Jeff Garzik
  2006-06-09  3:39       ` zhao, forrest
  2006-06-09  3:43       ` Tejun Heo
  1 sibling, 2 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  3:30 UTC (permalink / raw)
  To: zhao, forrest; +Cc: htejun, linux-ide

zhao, forrest wrote:
> On Thu, 2006-06-08 at 11:01 -0400, Jeff Garzik wrote: 
>>> >From my understanding, the measurement of "IRQ numbers per second"
>>> should be based on per-port instead of all ports of a SATA controller.
>> No, it should be all ports of a SATA controller.
> 
> Maybe I didn't state my ideas clearly. Let me explain it by an example.
> 
> 1 Assume there are 2 active ports(P0 and P1) on a system, they all run
> under non-NCQ mode
> 2 During a certain period, P0 is heavily loaded, which generates >1000
> interrupts per second; P1 is idle, which generates no interrupt
> 3
> 3.1 If the measurement of "IRQ numbers per second" is based on all
> active ports of a SATA controller, CCC is activated by CCC_PORTS being
> set to 0x3, CCC_CTL.CC being set to 64(32*2). Then the problem comes:
> the CCC interrupt will be raised only when the timeout expires, this is
> because P1 is in idle, thus hCccComplete can never be greater than or
> equal to 64, the maximum of hCccComplete is 32.
> 3.2 If the measurement of "IRQ numbers per second" is based on per-port,
> we can know that P0 is heavily-loaded, then CCC is activated by
> CCC_PORTS being set to 0x1, CCC_CTL.CC being set to 32. Then CCC can
> take effect as we have expected :)
> NOTE: hCccComplete is the term used in section 11 of AHCI spec1.1

I'm still not sure I follow you?

When AHCI runs out of commands to execute, it transitions from H:Idle to 
Ccc:SetIS.

IMPORTANT NOTE:  In order for CCC to be effective on AHCI, ahci.c and 
libata (and sata_sil24) must be updated to support queuing a list of 
non-NCQ commands onto the controller, and recovering from errors in the 
case where a command list full of non-NCQ commands is present.

Also, you should scale up CCC_CTL.CC really based on "commands in 
flight" across all ports.
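
One possible reading of this, sketched in C for illustration. It assumes
the driver keeps a per-port bitmask of issued command slots
(active_slots[] is a hypothetical name), sums the set bits over all ports,
and clamps the result to what an 8-bit CC field could hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CCC_CC_MIN 1u
#define CCC_CC_MAX 255u /* assumption: CCC_CTL.CC is an 8-bit field */

/* Count the set bits in one port's mask of issued command slots. */
static unsigned int popcount32(uint32_t v)
{
    unsigned int n = 0;

    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Total number of commands currently issued, summed over all ports. */
static unsigned int commands_in_flight(const uint32_t *active_slots,
                                       size_t n_ports)
{
    unsigned int total = 0;

    assert(active_slots != NULL || n_ports == 0);
    for (size_t i = 0; i < n_ports; i++)
        total += popcount32(active_slots[i]);
    return total;
}

/* Candidate CCC_CTL.CC value: commands in flight, clamped to [1, 255]. */
static unsigned int ccc_scaled_count(const uint32_t *active_slots,
                                     size_t n_ports)
{
    unsigned int total = commands_in_flight(active_slots, n_ports);

    if (total < CCC_CC_MIN)
        return CCC_CC_MIN;
    return total > CCC_CC_MAX ? CCC_CC_MAX : total;
}
```

With one port fully loaded and one idle, this gives 32 rather than 64, so
the count tracks the actual load instead of the number of joined ports.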

	Jeff




* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:30     ` [RFC] AHCI Command Completion Coalescing(CCC) proposal Jeff Garzik
@ 2006-06-09  3:39       ` zhao, forrest
  2006-06-09  3:43       ` Tejun Heo
  1 sibling, 0 replies; 24+ messages in thread
From: zhao, forrest @ 2006-06-09  3:39 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: htejun, linux-ide

On Thu, 2006-06-08 at 23:30 -0400, Jeff Garzik wrote:
> When AHCI runs out of commands to execute, it transitions from H:Idle to 
> Ccc:SetIS.
> 

Oh, that's my misunderstanding of the spec. Thanks for your patience.

Forrest

* [RFC] ATA host-protected area (HPA) device mapper?
  2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
  2006-06-09  3:13       ` zhao, forrest
@ 2006-06-09  3:43       ` Jeff Garzik
  2006-06-09  4:51         ` Matthew Frost
  2006-06-14  8:01       ` Another project for you... :) zhao, forrest
  2 siblings, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  3:43 UTC (permalink / raw)
  To: linux-ide; +Cc: zhao, forrest, htejun, randy_dunlap, Alan Cox, Linux Kernel

As I just mentioned on linux-ide in another email:
libata should -- like drivers/ide -- call the ATA "set max" command to 
fully address the hard drive, including the special "host-protected 
area" (HPA).  We should do this because the Linux standard is to export 
the raw hardware directly, making 100% of the hardware capability 
available to the user (and, in this case, Linux-based BIOS and recovery 
tools).

However, there are rare bug reports and general paranoia related to 
presenting 100% of the ATA hard drive "native" space, rather than the 
possibly-smaller space that the BIOS chose to present to the user.

My thinking is that [someone] should create an optional, ATA-specific 
device mapper module.  This module would layer on top of an ATA block 
device, and present two block devices:  the BIOS-presented space, and 
the HPA.

Such a module would make it trivial for users to ensure that partition 
tables and RAID metadata formats know what the BIOS (rather than 
underlying hard drive) considers to be end-of-disk.
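
As a rough illustration of the split such a module would make, here is a
sketch that derives the two ranges from two sector counts. All names are
hypothetical; it assumes the BIOS-visible size and the native size (for
example, the result of READ NATIVE MAX ADDRESS plus one) are already
known, and it is not a device mapper target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A contiguous range on the disk, in 512-byte sectors. */
struct extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Split a disk into the BIOS-presented space and the HPA behind it.
 * Returns 0 on success, -1 if the sizes are inconsistent.
 * hpa->len is 0 when the drive has no HPA.
 */
static int hpa_split(uint64_t bios_sectors, uint64_t native_sectors,
                     struct extent *visible, struct extent *hpa)
{
    assert(visible != NULL && hpa != NULL);
    if (bios_sectors > native_sectors)
        return -1;

    visible->start = 0;
    visible->len = bios_sectors;
    hpa->start = bios_sectors;
    hpa->len = native_sectors - bios_sectors;
    return 0;
}
```

Partitioning and RAID tools would then look only at the first extent,
while recovery tools could open the second.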

Comments?  Questions?  Am I completely insane?  ;-)

	Jeff

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:30     ` [RFC] AHCI Command Completion Coalescing(CCC) proposal Jeff Garzik
  2006-06-09  3:39       ` zhao, forrest
@ 2006-06-09  3:43       ` Tejun Heo
  2006-06-09  3:47         ` Tejun Heo
                           ` (2 more replies)
  1 sibling, 3 replies; 24+ messages in thread
From: Tejun Heo @ 2006-06-09  3:43 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: zhao, forrest, linux-ide

Jeff Garzik wrote:

> I'm still not sure I follow you?
> 
> When AHCI runs out of commands to execute, it transitions from H:Idle to 
> Ccc:SetIS.
> 
> IMPORTANT NOTE:  In order for CCC to be effective on AHCI, ahci.c and 
> libata (and sata_sil24) must be updated to support queuing a list of 
> non-NCQ commands onto the controller, and recovering from errors in the 
> case where a command list full of non-NCQ commands is present.

I thought about it but am not really sure whether it's worth the 
trouble.  We'll be saving on inter-command latency and interrupt 
handling which is great but not so sure how noticeable the improvement 
would be.  NCQ is already all around.  How about doing CCC only during 
NCQ command phase?

Hmmm... then again, SSDs that don't support NCQ (which they don't really 
need) might benefit from CCC during non-NCQ commands.

-- 
tejun

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:43       ` Tejun Heo
@ 2006-06-09  3:47         ` Tejun Heo
  2006-06-09  3:51           ` zhao, forrest
  2006-06-09  3:53           ` Jeff Garzik
  2006-06-09  3:52         ` Jeff Garzik
  2006-06-09 11:49         ` Jens Axboe
  2 siblings, 2 replies; 24+ messages in thread
From: Tejun Heo @ 2006-06-09  3:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: zhao, forrest, linux-ide

Tejun Heo wrote:
> Jeff Garzik wrote:
> 
>> I'm still not sure I follow you?
>>
>> When AHCI runs out of commands to execute, it transitions from H:Idle 
>> to Ccc:SetIS.
>>
>> IMPORTANT NOTE:  In order for CCC to be effective on AHCI, ahci.c and 
>> libata (and sata_sil24) must be updated to support queuing a list of 
>> non-NCQ commands onto the controller, and recovering from errors in 
>> the case where a command list full of non-NCQ commands is present.
> 
> I thought about it but am not really sure whether it's worth the 
> trouble.  We'll be saving on inter-command latency and interrupt 
> handling which is great but not so sure how noticeable the improvement 
> would be.  NCQ is already all around.  How about doing CCC only during 
> NCQ command phase?
> 
> Hmmm... maybe those SSDs would benefit from CCC during non-NCQ commands 
> though if they don't support NCQ, which they don't really need.
> 

If we're gonna do it, EH probably needs only a few changes, during 
autopsy and report.  Fixing up the command issue path and implementing 
command exclusion (NCQ vs. non-NCQ; sil24 does it in hardware, ahci 
doesn't) will be a bit complex though.

-- 
tejun

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:47         ` Tejun Heo
@ 2006-06-09  3:51           ` zhao, forrest
  2006-06-09  4:12             ` Jeff Garzik
  2006-06-09  5:24             ` Tejun Heo
  2006-06-09  3:53           ` Jeff Garzik
  1 sibling, 2 replies; 24+ messages in thread
From: zhao, forrest @ 2006-06-09  3:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, linux-ide

On Fri, 2006-06-09 at 12:47 +0900, Tejun Heo wrote:
> If we're gonna do it.  EH needs only a few changes probably during 
> autopsy and report.  Fixing up command issue path and implementing 
> command exclusion (NCQ vs. non-NCQ, sil24 does it in hardware, ahci 
> doesn't) will be a bit complex though.

Would you please elaborate on command exclusion? Why do NCQ commands need
to be kept separate from non-NCQ commands?

Thanks,
Forrest

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:43       ` Tejun Heo
  2006-06-09  3:47         ` Tejun Heo
@ 2006-06-09  3:52         ` Jeff Garzik
  2006-06-09 11:49         ` Jens Axboe
  2 siblings, 0 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  3:52 UTC (permalink / raw)
  To: Tejun Heo; +Cc: zhao, forrest, linux-ide

Tejun Heo wrote:
> Jeff Garzik wrote:
> 
>> I'm still not sure I follow you?
>>
>> When AHCI runs out of commands to execute, it transitions from H:Idle 
>> to Ccc:SetIS.
>>
>> IMPORTANT NOTE:  In order for CCC to be effective on AHCI, ahci.c and 
>> libata (and sata_sil24) must be updated to support queuing a list of 
>> non-NCQ commands onto the controller, and recovering from errors in 
>> the case where a command list full of non-NCQ commands is present.
> 
> I thought about it but am not really sure whether it's worth the 
> trouble.  We'll be saving on inter-command latency and interrupt 
> handling which is great but not so sure how noticeable the improvement 
> would be.  NCQ is already all around.  How about doing CCC only during 
> NCQ command phase?

Agreed with all these observations.

In general, the non-NCQ case shares characteristics with host-queue 
controllers like sx8.  The benefit is nowhere near true command queueing 
like NCQ, but the benefits (which you list) are real.

Think of this as a long term requirement.  libata _should_ eventually 
support this, simply because the gains are there to be had.  Several 
SATA controllers support queueing of non-NCQ commands.

In the short term, only turning on CCC for NCQ ports makes a lot of sense.


> Hmmm... maybe those SSDs would benefit from CCC during non-NCQ commands 
> though if they don't support NCQ, which they don't really need.

My Gigabyte i-Ram reliably corrupts data within seconds, regardless of 
SATA controller :/

	Jeff




* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:47         ` Tejun Heo
  2006-06-09  3:51           ` zhao, forrest
@ 2006-06-09  3:53           ` Jeff Garzik
  1 sibling, 0 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  3:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: zhao, forrest, linux-ide

Tejun Heo wrote:
> If we're gonna do it.  EH needs only a few changes probably during 
> autopsy and report.  Fixing up command issue path and implementing 
> command exclusion (NCQ vs. non-NCQ, sil24 does it in hardware, ahci 
> doesn't) will be a bit complex though.

FWIW:  Promise, Marvell, and ServerWorks can all queue non-NCQ commands.

Oh yeah, ADMA too, IIRC.

	Jeff


* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:51           ` zhao, forrest
@ 2006-06-09  4:12             ` Jeff Garzik
  2006-06-09  5:24             ` Tejun Heo
  1 sibling, 0 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-09  4:12 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Tejun Heo, linux-ide

zhao, forrest wrote:
> On Fri, 2006-06-09 at 12:47 +0900, Tejun Heo wrote:
>> If we're gonna do it.  EH needs only a few changes probably during 
>> autopsy and report.  Fixing up command issue path and implementing 
>> command exclusion (NCQ vs. non-NCQ, sil24 does it in hardware, ahci 
>> doesn't) will be a bit complex though.
> 
> Would you please elaborate on command exclusion? Why NCQ commands need
> to be excluded from non-NCQ commands?

For one thing, getting the command ordering right (particularly with 
regard to barriers) isn't trivial...  also, you need exclusion for 
major feature changes like set xfer mode.

	Jeff




* Re: [RFC] ATA host-protected area (HPA) device mapper?
  2006-06-09  3:43       ` [RFC] ATA host-protected area (HPA) device mapper? Jeff Garzik
@ 2006-06-09  4:51         ` Matthew Frost
  0 siblings, 0 replies; 24+ messages in thread
From: Matthew Frost @ 2006-06-09  4:51 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: linux-ide, zhao, forrest, htejun, randy_dunlap, Alan Cox,
	Linux Kernel

Jeff Garzik wrote:
> As I just mentioned on linux-ide in another email:
> libata should -- like drivers/ide -- call the ATA "set max" command to 
> fully address the hard drive, including the special "host-protected 
> area" (HPA).  We should do this because the Linux standard is to export 
> the raw hardware directly, making 100% of the hardware capability 
> available to the user (and, in this case, Linux-based BIOS and recovery 
> tools).
> 

Yay for exposing absolute potential functionality; yay for recognizing 
the havoc possible, and proposing strategies for channeling that 
possibility.

> However, there are rare bug reports and general paranoia related to 
> presenting 100% of the ATA hard drive "native" space, rather than the 
> possibly-smaller space that the BIOS chose to present to the user.
> 

I've grepped through several old discussions of HPA handling, and it 
doesn't seem like everyone has the same idea of exactly what this will 
do, possibly because of the delta in BIOS behavior over original design 
restrictions.

> My thinking is that [someone] should create an optional, ATA-specific 
> device mapper module.  This module would layer on top of an ATA block 
> device, and present two block devices:  the BIOS-presented space, and 
> the HPA.
> 
> Such a module would make it trivial for users to ensure that partition 
> tables and RAID metadata formats know what the BIOS (rather than 
> underlying hard drive) considers to be end-of-disk.
> 
> Comments?  Questions?  Am I completely insane?  ;-)
> 

Tools with which to lay waste to systems, or save them.

What I like about your proposal is that it doesn't go back to "Do we 
blow away the HPA or reserve it?"; you suggest conserving both options.
Make the kernel aware of the existence of the HPA, and thereby the 
whole capacity of the disk, and simultaneously of what it should see and 
expose for usage 'safely'.  Doesn't sound insane to me; it sounds like 
you're planning on [having someone] teach the kernel to respect the 
actual disk limitations.

Whether the implementation will be sane ... 'nother story.  :)  Thence 
the question of teaching userspace to sanely use what is exposed, though 
if the 'old' (non-HPA) space is presented, it shouldn't be a hard 
reorientation.  Would we be talking about a new sysfs entry parallel to 
the existing information?  If I understand it right -- and I might not 
-- the HPA doesn't get included in the partitioning schemes, because it 
is protected.  Even nuking the disk will/should bypass it.  So the 
system will tend to ignore it under normal conditions, until you decide 
to get fancy and trip over its shadow.  So making the kernel aware that 
this disk has this spot that must be respected should be a no-brainer. 
What better way to make the kernel aware of it, than by acknowledging it 
as a block device among other block devices?  It just needs a good 
molly-guard to cover the respect portion of the problem.

Of course, I don't hack ATA, so my opinions may have limited validity 
after a certain level of specificity.  I can always be enlightened as to 
why you really are insane.  ;)

>     Jeff
> 

Matt

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:51           ` zhao, forrest
  2006-06-09  4:12             ` Jeff Garzik
@ 2006-06-09  5:24             ` Tejun Heo
  2006-06-09 11:49               ` Jens Axboe
  1 sibling, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2006-06-09  5:24 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Jeff Garzik, linux-ide

zhao, forrest wrote:
> On Fri, 2006-06-09 at 12:47 +0900, Tejun Heo wrote:
>> If we're gonna do it.  EH needs only a few changes probably during 
>> autopsy and report.  Fixing up command issue path and implementing 
>> command exclusion (NCQ vs. non-NCQ, sil24 does it in hardware, ahci 
>> doesn't) will be a bit complex though.
> 
> Would you please elaborate on command exclusion? Why NCQ commands need
> to be excluded from non-NCQ commands?

AHCI spec rev 1.1, sect 1.7.  The last paragraph says:

"This multiple-use of the command list is achieved by the HBA only 
moving its command list pointer when the BSY, DRQ, and ERR bits are 
cleared by the device. System software is responsible to ensure that 
queued and non-queued commands are not mixed in the command list."
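
A sketch of what that software-side exclusion could look like, for
illustration only. The types and helpers are hypothetical, and it assumes
the driver tracks how many commands of each kind are in flight per port; a
real implementation would also have to defer and later retry the commands
it refuses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cmd_kind { CMD_NON_NCQ, CMD_NCQ };

/* Per-port bookkeeping of commands currently in the command list. */
struct port_queue {
    unsigned int ncq_in_flight;
    unsigned int non_ncq_in_flight;
};

/* A command may be issued only if no command of the other kind is active. */
static bool may_issue(const struct port_queue *q, enum cmd_kind kind)
{
    assert(q != NULL);
    if (kind == CMD_NCQ)
        return q->non_ncq_in_flight == 0;
    return q->ncq_in_flight == 0;
}

/* Account for a command that has been placed in the command list. */
static void note_issue(struct port_queue *q, enum cmd_kind kind)
{
    assert(may_issue(q, kind));
    if (kind == CMD_NCQ)
        q->ncq_in_flight++;
    else
        q->non_ncq_in_flight++;
}

/* Account for a command that has completed. */
static void note_complete(struct port_queue *q, enum cmd_kind kind)
{
    unsigned int *count;

    assert(q != NULL);
    count = (kind == CMD_NCQ) ? &q->ncq_in_flight : &q->non_ncq_in_flight;
    assert(*count > 0);
    (*count)--;
}
```

While NCQ commands are outstanding, may_issue() refuses a non-NCQ command
until the last NCQ command completes, and vice versa.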

-- 
tejun

* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  5:24             ` Tejun Heo
@ 2006-06-09 11:49               ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2006-06-09 11:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: zhao, forrest, Jeff Garzik, linux-ide

On Fri, Jun 09 2006, Tejun Heo wrote:
> zhao, forrest wrote:
> >On Fri, 2006-06-09 at 12:47 +0900, Tejun Heo wrote:
> >>If we're gonna do it.  EH needs only a few changes probably during 
> >>autopsy and report.  Fixing up command issue path and implementing 
> >>command exclusion (NCQ vs. non-NCQ, sil24 does it in hardware, ahci 
> >>doesn't) will be a bit complex though.
> >
> >Would you please elaborate on command exclusion? Why NCQ commands need
> >to be excluded from non-NCQ commands?
> 
> AHCI spec rev 1.1, sect 1.7.  The last paragraph says.
> 
> "This multiple-use of the command list is achieved by the HBA only 
> moving its command list pointer when the BSY, DRQ, and ERR bits are 
> cleared by the device. System software is responsible to ensure that 
> queued and non-queued commands are not mixed in the command list."

This, btw, was also the case with the legacy TCQ.

-- 
Jens Axboe


* Re: [RFC] AHCI Command Completion Coalescing(CCC) proposal
  2006-06-09  3:43       ` Tejun Heo
  2006-06-09  3:47         ` Tejun Heo
  2006-06-09  3:52         ` Jeff Garzik
@ 2006-06-09 11:49         ` Jens Axboe
  2 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2006-06-09 11:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, zhao, forrest, linux-ide

On Fri, Jun 09 2006, Tejun Heo wrote:
> Jeff Garzik wrote:
> 
> >I'm still not sure I follow you?
> >
> >When AHCI runs out of commands to execute, it transitions from H:Idle to 
> >Ccc:SetIS.
> >
> >IMPORTANT NOTE:  In order for CCC to be effective on AHCI, ahci.c and 
> >libata (and sata_sil24) must be updated to support queuing a list of 
> >non-NCQ commands onto the controller, and recovering from errors in the 
> >case where a command list full of non-NCQ commands is present.
> 
> I thought about it but am not really sure whether it's worth the 
> trouble.  We'll be saving on inter-command latency and interrupt 
> handling which is great but not so sure how noticeable the improvement 
> would be.  NCQ is already all around.  How about doing CCC only during 
> NCQ command phase?

Fully agree, doing CCC on non-NCQ sounds like a very silly thing to do.
Ugly complexity for very little (and questionable) gain.

I'm not a big fan of CCC in general; to me it seems to have bigger
potential to cause you latency than to save you interrupt processing time.
I'm not saying there aren't cases where CCC would be a win, but I see a
lot more cases where it definitely won't be. And it's a classic case of
having to implement policy, so it will surely get it wrong here and there.

-- 
Jens Axboe


* Re: Another project for you... :)
  2006-06-09  3:13       ` zhao, forrest
@ 2006-06-09 22:29         ` Greg Freemyer
  2006-06-09 23:44           ` Alan Cox
  0 siblings, 1 reply; 24+ messages in thread
From: Greg Freemyer @ 2006-06-09 22:29 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Jeff Garzik, htejun, linux-ide, randy_dunlap, Alan Cox

> > Also, a feature Alan requests on occasion:  Call the ATA "set max"
> > command to fully address the hard drive, including HPA.  The Linux
> > standard is to export the raw hardware directly, making 100% of the
> > hardware capability available to the user (and, in this case,
> > Linux-based BIOS and recovery tools).
>
> I'll first study what this means, then start to work on it.
>
> Best wishes,
> Forrest

The below relates to PATA and HPA (Host Protected Area), but I assume
it would be relevant to SATA as well.

====
This has been discussed before in the archives.  I think the desired
behavior was to have the Linux Kernel look for HPAs by default and to
open the drive up to the full size via "set max" on boot, but to have
a boot parameter to disable this behavior if desired.

FYI: There is a userland tool called setmax floating around.  (I can
find it if you're interested.)  The userland tool works fine for LBA28
drives, but does not have LBA48 support.  That may become an issue,
but at least from my perspective, I have not yet seen any 128+ GiB
drives with HPAs set up.  I do occasionally see smaller drives with
HPAs.
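
For reference, the 128 GiB figure follows from the address width: 28 bits
of LBA with 512-byte sectors gives 2^28 * 512 = 2^37 bytes. A small sketch
of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_BYTES 512ull

/* Nominal capacity reachable with the given number of LBA bits. */
static uint64_t lba_capacity_bytes(unsigned int address_bits)
{
    assert(address_bits <= 54); /* keep the result within 64 bits */
    return (1ull << address_bits) * SECTOR_BYTES;
}

/* True if a drive with this many sectors cannot be covered by LBA28. */
static bool needs_lba48(uint64_t sectors)
{
    return sectors > (1ull << 28);
}
```

lba_capacity_bytes(28) is exactly 128 GiB, so any drive larger than that
needs the LBA48 form of the command.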

FYI2: I do computer forensics and make dd images of drives routinely.
As part of the process I double check if a HPA is present or not.  I
typically use the userland setmax tool to open it up if the linux
kernel failed to do so on bootup for some reason.  (I've only seen
that a few times so far.)

Greg
-- 
Greg Freemyer
The Norcross Group
Forensics for the 21st Century

* Re: Another project for you... :)
  2006-06-09 22:29         ` Greg Freemyer
@ 2006-06-09 23:44           ` Alan Cox
  0 siblings, 0 replies; 24+ messages in thread
From: Alan Cox @ 2006-06-09 23:44 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: zhao, forrest, Jeff Garzik, htejun, linux-ide, randy_dunlap

Ar Gwe, 2006-06-09 am 18:29 -0400, ysgrifennodd Greg Freemyer:
> behavior was to have the Linux Kernel look for HPAs by default and to
> open the drive up to the full size via "set max" on boot, but to have
> a boot parameter to disable this behavior if desired.

It doesn't really work very well, even less so when you hit SATA and
have hot-plug disks. HPA on which disks, inserted when?

Thus it needs, as Jeff says, to be cleanly exposed dynamically at
runtime.

Alan



* Re: Another project for you...  :)
  2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
  2006-06-09  3:13       ` zhao, forrest
  2006-06-09  3:43       ` [RFC] ATA host-protected area (HPA) device mapper? Jeff Garzik
@ 2006-06-14  8:01       ` zhao, forrest
  2006-06-14 15:19         ` Randy.Dunlap
  2 siblings, 1 reply; 24+ messages in thread
From: zhao, forrest @ 2006-06-14  8:01 UTC (permalink / raw)
  To: randy_dunlap; +Cc: Jeff Garzik, htejun, linux-ide, Alan Cox

On Thu, 2006-06-08 at 23:11 -0400, Jeff Garzik wrote:
> Forrest,
> 
> BTW, if you are looking for useful libata projects, it would really be 
> nice to resurrect Randy Dunlap's SATA ACPI patches, update those for the 
> current libata-dev.git#upstream, and get those in.

Randy,

Could you confirm if your latest SATA-ACPI patch is at this URL?
http://www.xenotime.net/linux/SATA/2.6.16-rc4/libata-rollup-2616-rc4.patch

Thanks,
Forrest


* Re: Another project for you...  :)
  2006-06-14  8:01       ` Another project for you... :) zhao, forrest
@ 2006-06-14 15:19         ` Randy.Dunlap
  2006-06-15  7:59           ` zhao, forrest
  0 siblings, 1 reply; 24+ messages in thread
From: Randy.Dunlap @ 2006-06-14 15:19 UTC (permalink / raw)
  To: zhao, forrest; +Cc: jgarzik, htejun, linux-ide, alan

On Wed, 14 Jun 2006 16:01:54 +0800 zhao, forrest wrote:

> On Thu, 2006-06-08 at 23:11 -0400, Jeff Garzik wrote:
> > Forrest,
> > 
> > BTW, if you are looking for useful libata projects, it would really be 
> > nice to resurrect Randy Dunlap's SATA ACPI patches, update those for the 
> > current libata-dev.git#upstream, and get those in.
> 
> Randy,
> 
> Could you confirm if your latest SATA-ACPI patch is at this URL?
> http://www.xenotime.net/linux/SATA/2.6.16-rc4/libata-rollup-2616-rc4.patch

Yes, correct.

---
~Randy


* Re: Another project for you...  :)
  2006-06-14 15:19         ` Randy.Dunlap
@ 2006-06-15  7:59           ` zhao, forrest
  2006-06-15 11:47             ` Jeff Garzik
  0 siblings, 1 reply; 24+ messages in thread
From: zhao, forrest @ 2006-06-15  7:59 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: jgarzik, htejun, linux-ide, alan

On Wed, 2006-06-14 at 08:19 -0700, Randy.Dunlap wrote:
> On Wed, 14 Jun 2006 16:01:54 +0800 zhao, forrest wrote:
> 
> > On Thu, 2006-06-08 at 23:11 -0400, Jeff Garzik wrote:
> > > Forrest,
> > > 
> > > BTW, if you are looking for useful libata projects, it would really be 
> > > nice to resurrect Randy Dunlap's SATA ACPI patches, update those for the 
> > > current libata-dev.git#upstream, and get those in.
> > 
> > Randy,
> > 
> > Could you confirm if your latest SATA-ACPI patch is at this URL?
> > http://www.xenotime.net/linux/SATA/2.6.16-rc4/libata-rollup-2616-rc4.patch
> 
> Yes, correct.

According to ACPI spec 3.0, _GTM and _STM are IDE-only objects, and _SDD
is a SATA-only object. In your patch you used the field "legacy_mode" of
"struct ata_probe_ent" to distinguish between IDE and SATA.
But after reading the code of ata_pci_init_one(), I found that
"legacy_mode" is used to distinguish between legacy mode and native PCI
mode of an IDE controller. Am I right? Or did I miss anything?
If I'm right, I'll fix it while porting your patch.

Thanks,
Forrest


* Re: Another project for you...  :)
  2006-06-15  7:59           ` zhao, forrest
@ 2006-06-15 11:47             ` Jeff Garzik
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff Garzik @ 2006-06-15 11:47 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Randy.Dunlap, htejun, linux-ide, alan

zhao, forrest wrote:
> According to ACPI spec 3.0, _GTM and _STM are IDE-only objects, _SDD is
> SATA-only object. And in your patch you used field "legacy_mode" of
> "struct ata_probe_ent" in order to distinguish between IDE and SATA.
> But after reading the code of ata_pci_init_one(), I found that
> "legacy_mode" is used to distinguish between legacy mode and native PCI
> mode of IDE controller. Am I right? Or did I miss anything?
> If I'm right, I'll fix it during the porting of your patch.

You are correct, you should use cbl==SATA or ATA_FLAG_SATA to check for 
SATA.  legacy-vs-native is not an accurate check.
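
A minimal sketch of that check, for illustration only.  The enum, flag
value and struct below are stand-ins invented for this example; the real
definitions live in the libata headers and may differ in name, value and
layout.  Testing both conditions with OR is a choice made here; either
one alone is what is suggested above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel's definitions. */
enum demo_cbl { DEMO_CBL_NONE, DEMO_CBL_PATA40, DEMO_CBL_PATA80, DEMO_CBL_SATA };
#define DEMO_FLAG_SATA (1UL << 1)

struct demo_port {
	unsigned long flags;
	enum demo_cbl cbl;
	int legacy_mode;	/* legacy vs native PCI; says nothing about SATA */
};

/* SATA is decided by cable type or the SATA flag, never by legacy_mode. */
static int demo_port_is_sata(const struct demo_port *p)
{
	assert(p != NULL);
	return p->cbl == DEMO_CBL_SATA || (p->flags & DEMO_FLAG_SATA) != 0;
}

/* Which ACPI objects apply: _SDD for SATA, _GTM/_STM for IDE. */
static const char *demo_acpi_objects(const struct demo_port *p)
{
	return demo_port_is_sata(p) ? "_SDD" : "_GTM/_STM";
}
```

Note that a port with legacy_mode set can still be SATA under this
check, which is the case the legacy_mode test got wrong.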

	Jeff




Thread overview: 24+ messages
2006-06-08  7:30 [RFC] AHCI Command Completion Coalescing(CCC) proposal zhao, forrest
2006-06-08 15:01 ` Jeff Garzik
2006-06-09  2:27   ` zhao, forrest
2006-06-09  3:11     ` Another project for you... :) Jeff Garzik
2006-06-09  3:13       ` zhao, forrest
2006-06-09 22:29         ` Greg Freemyer
2006-06-09 23:44           ` Alan Cox
2006-06-09  3:43       ` [RFC] ATA host-protected area (HPA) device mapper? Jeff Garzik
2006-06-09  4:51         ` Matthew Frost
2006-06-14  8:01       ` Another project for you... :) zhao, forrest
2006-06-14 15:19         ` Randy.Dunlap
2006-06-15  7:59           ` zhao, forrest
2006-06-15 11:47             ` Jeff Garzik
2006-06-09  3:30     ` [RFC] AHCI Command Completion Coalescing(CCC) proposal Jeff Garzik
2006-06-09  3:39       ` zhao, forrest
2006-06-09  3:43       ` Tejun Heo
2006-06-09  3:47         ` Tejun Heo
2006-06-09  3:51           ` zhao, forrest
2006-06-09  4:12             ` Jeff Garzik
2006-06-09  5:24             ` Tejun Heo
2006-06-09 11:49               ` Jens Axboe
2006-06-09  3:53           ` Jeff Garzik
2006-06-09  3:52         ` Jeff Garzik
2006-06-09 11:49         ` Jens Axboe
