* Question regarding forcewake in i915
@ 2014-12-22 12:26 유재용
2015-01-06 15:19 ` Dave Gordon
0 siblings, 1 reply; 7+ messages in thread
From: 유재용 @ 2014-12-22 12:26 UTC
To: intel-gfx
Hello intel-gfx,
I'm reading the i915 GPU driver and finding the forcewake concept quite hard to understand.
I understand that it has something to do with energy efficiency, so I assume it is related to ACPI. It also looks like forcewake works as a pair of operations (get and put).
In the "get" part, the first thing it does is wait on the FORCEWAKE_ACK_HSW register (in the case of Haswell).
Then it writes something to the FORCEWAKE_MT register and reads from ECOBUS.
And then it waits on FORCEWAKE_ACK_HSW again!
It becomes more confusing when it comes to "put".
In the "put" part, all it does is write to the FORCEWAKE_MT register and read from ECOBUS.
I tried to find some good reading material about forcewake, but all I found was a series of patches on this mailing list (which are quite hard to follow from the beginning).
Could you explain the concept of FORCEWAKE, and possibly the magic tricks in get and put?
Thanks,
Jaeyong
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: Question regarding forcewake in i915
2014-12-22 12:26 Question regarding forcewake in i915 유재용
@ 2015-01-06 15:19 ` Dave Gordon
2015-01-07 8:13 ` Jaeyong Yoo
0 siblings, 1 reply; 7+ messages in thread
From: Dave Gordon @ 2015-01-06 15:19 UTC
To: jaeyong.yoo; +Cc: intel-gfx
On 22/12/14 12:26, 유재용 wrote:
> Hello intel-gfx,
>
> I'm reading the i915 GPU driver and finding the forcewake concept quite
> hard to understand.
>
> I understand that it has something to do with energy efficiency, so I
> assume it is related to ACPI. It also looks like forcewake works as a
> pair of operations (get and put).
> In the "get" part, the first thing it does is wait on the
> FORCEWAKE_ACK_HSW register (in the case of Haswell).
> Then it writes something to the FORCEWAKE_MT register and reads from ECOBUS.
> And then it waits on FORCEWAKE_ACK_HSW again!
> It becomes more confusing when it comes to "put".
> In the "put" part, all it does is write to the FORCEWAKE_MT register and
> read from ECOBUS.
>
> I tried to find some good reading material about forcewake, but all I
> found was a series of patches on this mailing list (which are quite hard
> to follow from the beginning).
> Could you explain the concept of FORCEWAKE, and possibly the magic
> tricks in get and put?
>
> Thanks,
> Jaeyong
Hi Jaeyong,
FORCEWAKE details vary a little from one chip to another, so this is
only a general description, but essentially setting one or more bits in
the FORCEWAKE register(s) prevents some or all of the power domains from
going into the deeper idle (sleep) states (and forces them out of the
sleep state if they're already asleep). Clearing the bit(s) allows the
affected parts to go to sleep again.
The FORCEWAKE_ACK register(s) contain one or more bits which reflect the
internal state, and so acknowledge that the most recent write to the
corresponding FORCEWAKE register has been accepted and acted upon. It
can take a while for a portion of the chip to wake up, so after setting
a FORCEWAKE bit we have to spin-wait until it's taken effect.
So, the general algorithm for accessing some part of the chip that may
be asleep is:
1) set the relevant bit of (a) FORCEWAKE register
2) poll (matching) FORCEWAKE_ACK until the write is acknowledged
3) access the chip (this can encompass several reads and writes)
4) clear the FORCEWAKE bit that we set earlier
5) poll FORCEWAKE_ACK again until this write is acknowledged
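As a rough illustration of steps 1 to 5, here is a minimal sketch in C that runs against a simulated power domain rather than real hardware. Everything in it (struct fake_gt, the FAKE_* constants, the three-poll latency, the helper names) is hypothetical and chosen only for the example; it is not the i915 code.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_WAKE_BIT 0x1u
#define FAKE_LATENCY  3      /* polls before the fake device acknowledges */

/* Hypothetical model of one power domain; not a real i915 structure. */
struct fake_gt {
    uint32_t forcewake;      /* last value written to FORCEWAKE */
    uint32_t forcewake_ack;  /* what FORCEWAKE_ACK currently reads back */
    int      ack_delay;      /* polls left until the ACK catches up */
    uint32_t scratch;        /* a register that only reads correctly when awake */
};

static void fake_write_forcewake(struct fake_gt *gt, uint32_t val)
{
    gt->forcewake = val;
    gt->ack_delay = FAKE_LATENCY;
}

static uint32_t fake_read_ack(struct fake_gt *gt)
{
    /* The ACK only follows the request after a few polls. */
    if (gt->ack_delay > 0 && --gt->ack_delay == 0)
        gt->forcewake_ack = gt->forcewake;
    return gt->forcewake_ack;
}

/* Spin until the ACK bit equals 'want'; returns polls used or -1 on timeout. */
static int fake_wait_ack(struct fake_gt *gt, uint32_t want, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 1; i <= max_polls; i++)
        if ((fake_read_ack(gt) & FAKE_WAKE_BIT) == want)
            return i;
    return -1;
}

/* Reads as zero while the domain is asleep. */
static uint32_t fake_read_scratch(const struct fake_gt *gt)
{
    return (gt->forcewake_ack & FAKE_WAKE_BIT) ? gt->scratch : 0;
}

static int read_scratch_awake(struct fake_gt *gt, uint32_t *out)
{
    fake_write_forcewake(gt, FAKE_WAKE_BIT);        /* step 1 */
    if (fake_wait_ack(gt, FAKE_WAKE_BIT, 50) < 0)   /* step 2 */
        return -1;
    *out = fake_read_scratch(gt);                   /* step 3 */
    fake_write_forcewake(gt, 0);                    /* step 4 */
    if (fake_wait_ack(gt, 0, 50) < 0)               /* step 5 */
        return -1;
    return 0;
}
```

Calling read_scratch_awake() on a zero-initialised struct fake_gt whose scratch field holds some value should return that value, whereas fake_read_scratch() alone would return zero because the fake domain is still asleep.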
Now for extra confusion, there are a few more details:
* because reads and writes can in some cases be reordered, we
need to force the write to FORCEWAKE to complete before the
busy-polling of FORCEWAKE_ACK. This is the sole purpose of the
read of the ECOBUS register, which is used just because it
happens to lie in the same cacheline as FORCEWAKE.
* we can choose not to poll for FORCEWAKE_ACK clear in step (5).
Instead, we can just leave the chip to go back to sleep while
we get on with other things. But in that case, we might come
back and try to wake the chip again before it's finished
responding to the write in step (4). So if we don't poll at
the end of the sequence, we have to poll at the beginning
instead; in other words, move step (5) to before step (1).
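To make the two details above concrete, here is a small sketch in C of a get/put pair with the posting read and with step (5) moved to the front. It runs against a toy model, not hardware: struct fake_gt, posting_read() and the rule that a write stays "posted" until posting_read() is called are all assumptions made for illustration. In particular, the model deliberately lets ACK reads bypass a pending write, to mimic the reordering described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAKE_BIT 0x1u

/* Hypothetical model: a write is "posted" until a posting read flushes it. */
struct fake_gt {
    uint32_t posted;        /* write still in flight */
    bool     posted_valid;
    uint32_t forcewake;     /* value the device has actually seen */
    uint32_t ack;           /* what FORCEWAKE_ACK reads back */
};

static void fw_write(struct fake_gt *gt, uint32_t val)
{
    gt->posted = val;
    gt->posted_valid = true;
}

/* Stand-in for the ECOBUS read: forces the pending write to land. */
static void posting_read(struct fake_gt *gt)
{
    if (gt->posted_valid) {
        gt->forcewake = gt->posted;
        gt->posted_valid = false;
    }
}

/* The fake device acknowledges one poll after it has seen the write. */
static uint32_t ack_read(struct fake_gt *gt)
{
    uint32_t cur = gt->ack;
    gt->ack = gt->forcewake;
    return cur;
}

static int wait_ack(struct fake_gt *gt, uint32_t want, int max_polls)
{
    assert(max_polls > 0);
    while (max_polls-- > 0)
        if ((ack_read(gt) & WAKE_BIT) == want)
            return 0;
    return -1;
}

static int fw_get(struct fake_gt *gt)
{
    if (wait_ack(gt, 0, 10) < 0)        /* step 5, moved to the front */
        return -1;
    fw_write(gt, WAKE_BIT);             /* step 1 */
    posting_read(gt);                   /* make sure the write has landed */
    return wait_ack(gt, WAKE_BIT, 10);  /* step 2 */
}

static void fw_put(struct fake_gt *gt)
{
    fw_write(gt, 0);                    /* step 4 */
    posting_read(gt);                   /* no polling here */
}
```

This has the same shape as the sequence described in the original question: get waits, writes, reads, waits again; put only writes and reads. In this toy model, omitting posting_read() after fw_write() means the following wait_ack() times out, which is the failure the posting read is there to prevent.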
IIRC, gen6 had a single FORCEWAKE register containing a single effective
bit, gen7 has a single register containing multiple bits (so that they
can be controlled by different agents) which are OR-ed together to
produce a combined wakeup signal (this also applies to HSW, although the
FORCEWAKE_ACK is in a different place from earlier chips); and VLV has
multiple registers for different power domains (e.g. MEDIA vs RENDER).
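A small sketch of the "multiple bits OR-ed together" idea, in C. The bit assignments and names are hypothetical. The masked-write convention used here (upper 16 bits select which of the lower 16 bits a write may change) is my understanding of how this style of register lets agents update their own bit without a read-modify-write, but treat it as an assumption rather than a statement about any specific chip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, one per agent. */
#define AGENT_KERNEL (1u << 0)
#define AGENT_OTHER  (1u << 1)

/* Assumed masked-write encoding: high half is the write-enable mask. */
#define MASKED_ENABLE(bit)  (((bit) << 16) | (bit))
#define MASKED_DISABLE(bit) ((bit) << 16)

/* What the register would hold after a masked write of 'val'. */
static uint32_t apply_masked_write(uint32_t reg, uint32_t val)
{
    uint32_t mask = val >> 16;

    return (reg & ~mask) | (val & mask);
}

/* The combined wake signal is the OR of every agent's bit. */
static bool domain_awake(uint32_t reg)
{
    return (reg & 0xffffu) != 0;
}
```

With this encoding each agent touches only its own bit, and the domain is allowed to sleep only once every agent has cleared its request.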
Hope this helps!
Dave
* Re: Question regarding forcewake in i915
2015-01-06 15:19 ` Dave Gordon
@ 2015-01-07 8:13 ` Jaeyong Yoo
2015-01-12 17:47 ` Dave Gordon
0 siblings, 1 reply; 7+ messages in thread
From: Jaeyong Yoo @ 2015-01-07 8:13 UTC
To: 'Dave Gordon'; +Cc: intel-gfx
Thanks a lot. It is very helpful. Couple of follow-up questions below.
> -----Original Message-----
> From: Dave Gordon [mailto:david.s.gordon@intel.com]
> Sent: Wednesday, January 07, 2015 12:19 AM
> To: jaeyong.yoo@samsung.com
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] Question regarding forcewake in i915
>
> Hi Jaeyong,
>
> FORCEWAKE details vary a little from one chip to another, so this is only
> a general description, but essentially setting one or more bits in the
> FORCEWAKE register(s) prevents some or all of the power domains from going
> into the deeper idle (sleep) states (and forces them out of the sleep
> state if they're already asleep). Clearing the bit(s) allows the affected
> parts to go to sleep again.
>
> The FORCEWAKE_ACK register(s) contain one or more bits which reflect the
> internal state, and so acknowledge that the most recent write to the
> corresponding FORCEWAKE register has been accepted and acted upon. It can
> take a while for a portion of the chip to wake up, so after setting a
> FORCEWAKE bit we have to spin-wait until it's taken effect.
>
> So, the general algorithm for accessing some part of the chip that may be
> asleep is:
> 1) set the relevant bit of (a) FORCEWAKE register
> 2) poll (matching) FORCEWAKE_ACK until the write is acknowledged
> 3) access the chip (this can encompass several reads and writes)
> 4) clear the FORCEWAKE bit that we set earlier
> 5) poll FORCEWAKE_ACK again until this write is acknowledged
>
> Now for extra confusion, there are a few more details:
> * because reads and writes can in some cases be reordered, we
> need to force the write to FORCEWAKE to complete before the
> busy-polling of FORCEWAKE_ACK. This is the sole purpose of the
> read of the ECOBUS register, which is used just because it
> happens to lie in the same cacheline as FORCEWAKE.
>
> * we can choose not to poll for FORCEWAKE_ACK clear in step (5).
> Instead, we can just leave the chip to go back to sleep while
> we get on with other things. But in that case, we might come
> back and try to wake the chip again before it's finished
> responding to the write in step (4). So if we don't poll at
> the end of the sequence, we have to poll at the beginning
> instead; in other words, move step (5) to before step (1).
I see that we can move step (5) before step (1), but I don't understand why
we have to do this. For instance, if we put step (5) right after step (4),
does the chip have to wake up in order to process the polling in step (5)?
Additionally, I saw a call to "__gen6_gt_wait_for_thread_c0" after step (2).
Does that mean that after FORCEWAKE_ACK is acknowledged, the hardware (possibly ACPI)
sets the thread to the C0 state?
And is that visible via GEN6_GT_THREAD_STATUS_REG (0x13805c)?
Thanks,
Jaeyong
* Re: Question regarding forcewake in i915
2015-01-07 8:13 ` Jaeyong Yoo
@ 2015-01-12 17:47 ` Dave Gordon
2015-01-15 7:38 ` Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...) Jaeyong Yoo
0 siblings, 1 reply; 7+ messages in thread
From: Dave Gordon @ 2015-01-12 17:47 UTC
To: Jaeyong Yoo; +Cc: intel-gfx@lists.freedesktop.org
On 07/01/15 08:13, Jaeyong Yoo wrote:
> Thanks a lot. It is very helpful. Couple of follow-up questions below.
>
>> -----Original Message-----
>> From: Dave Gordon [mailto:david.s.gordon@intel.com]
>> Sent: Wednesday, January 07, 2015 12:19 AM
>> To: jaeyong.yoo@samsung.com
>> Cc: intel-gfx@lists.freedesktop.org
>> Subject: Re: [Intel-gfx] Question regarding forcewake in i915
>>
>> Hi Jaeyong,
>>
>> FORCEWAKE details vary a little from one chip to another, so this is only
>> a general description, but essentially setting one or more bits in the
>> FORCEWAKE register(s) prevents some or all of the power domains from going
>> into the deeper idle (sleep) states (and forces them out of the sleep
>> state if they're already asleep). Clearing the bit(s) allows the affected
>> parts to go to sleep again.
>>
>> The FORCEWAKE_ACK register(s) contain one or more bits which reflect the
>> internal state, and so acknowledge that the most recent write to the
>> corresponding FORCEWAKE register has been accepted and acted upon. It can
>> take a while for a portion of the chip to wake up, so after setting a
>> FORCEWAKE bit we have to spin-wait until it's taken effect.
>>
>> So, the general algorithm for accessing some part of the chip that may be
>> asleep is:
>> 1) set the relevant bit of (a) FORCEWAKE register
>> 2) poll (matching) FORCEWAKE_ACK until the write is acknowledged
>> 3) access the chip (this can encompass several reads and writes)
>> 4) clear the FORCEWAKE bit that we set earlier
>> 5) poll FORCEWAKE_ACK again until this write is acknowledged
>>
>> Now for extra confusion, there are a few more details:
>> * because reads and writes can in some cases be reordered, we
>> need to force the write to FORCEWAKE to complete before the
>> busy-polling of FORCEWAKE_ACK. This is the sole purpose of the
>> read of the ECOBUS register, which is used just because it
>> happens to lie in the same cacheline as FORCEWAKE.
>>
>> * we can choose not to poll for FORCEWAKE_ACK clear in step (5).
>> Instead, we can just leave the chip to go back to sleep while
>> we get on with other things. But in that case, we might come
>> back and try to wake the chip again before it's finished
>> responding to the write in step (4). So if we don't poll at
>> the end of the sequence, we have to poll at the beginning
>> instead; in other words, move step (5) to before step (1).
>
> I see we can move step (5) before step (1). But, I don't understand why
> we have to do this. For instance, if we put step (5) right after step (4),
> does the chip have to wake up for processing the polling (5)?
No, polling these registers doesn't affect the wake state; they're in a
power domain that's not itself controlled by FORCEWAKE.
Either sequence (1-2-3-4-5 or 5-1-2-3-4) is valid. But if we use the
former sequence, the CPU will be busy polling in step 5 for however long
it takes the GPU to finish whatever else it's doing internally and then
acknowledge the write to FORCEWAKE, which could take a while.
By moving step 5 to the beginning of the sequence, the CPU can get on
with unrelated tasks during this time, so in general by the time it gets
round to needing to access FORCEWAKE again the previous write will have
completed and the CPU will see the ACK on the first read. So the
modified sequence allows greater parallelism between GPU and CPU.
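The trade-off can be sketched with a toy tick-based model in C. All names and numbers here are made up for illustration: one poll costs one tick, the device needs a fixed number of ticks to acknowledge the release, and the CPU has some amount of unrelated work to do.

```c
#include <assert.h>

/* Hypothetical simulation state; time is counted in abstract ticks. */
struct sim {
    int now;           /* current tick */
    int ack_clear_at;  /* tick at which the release is acknowledged */
    int polls;         /* FORCEWAKE_ACK reads the CPU has spent */
};

/* Step 4: the device will acknowledge 'latency' ticks from now. */
static void release(struct sim *s, int latency)
{
    assert(latency >= 0);
    s->ack_clear_at = s->now + latency;
}

/* Step 5: spin until the ACK is seen clear. */
static void poll_ack_clear(struct sim *s)
{
    for (;;) {
        s->polls++;
        if (s->now >= s->ack_clear_at)
            return;
        s->now++;
    }
}

static void other_work(struct sim *s, int ticks)
{
    s->now += ticks;
}

/* Sequence 1-2-3-4-5: poll immediately after the release. */
static int polls_when_waiting_at_end(int latency, int work)
{
    struct sim s = {0};

    release(&s, latency);
    poll_ack_clear(&s);
    other_work(&s, work);
    return s.polls;
}

/* Sequence 5-1-2-3-4: do other work first, poll just before the next wake. */
static int polls_when_waiting_at_start(int latency, int work)
{
    struct sim s = {0};

    release(&s, latency);
    other_work(&s, work);
    poll_ack_clear(&s);
    return s.polls;
}
```

In this model, when the unrelated work takes longer than the acknowledge latency, the deferred poll succeeds on its first read, while polling at the end always spends the full latency spinning.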
[Aside]: if the driver needs to make a whole sequence of accesses, then
it's better to turn on FORCEWAKE once and hold it across the whole
sequence and then release it at the end, rather than setting it around
each access individually. See execlists_elsp_write() in intel_lrc.c for
an example.
[/Aside]
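One common way to get this "hold it across the whole sequence" behaviour is a reference count, so that only the outermost get/put pair touches the hardware. The sketch below, in C, is an illustration under that assumption; struct fw_domain and the helper names are hypothetical, and it is not meant to describe how execlists_elsp_write() is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical refcounted wake request for one power domain. */
struct fw_domain {
    int refcount;   /* nested get() calls currently outstanding */
    int hw_writes;  /* how often FORCEWAKE would really be written */
};

static void fw_get(struct fw_domain *d)
{
    if (d->refcount++ == 0)
        d->hw_writes++;      /* first user: set the bit and wait for ACK */
}

static void fw_put(struct fw_domain *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->hw_writes++;      /* last user: clear the bit */
}

/* A single access wrapped in its own get/put. */
static uint32_t reg_read(struct fw_domain *d, const uint32_t *reg)
{
    uint32_t val;

    fw_get(d);
    val = *reg;
    fw_put(d);
    return val;
}

/* Many accesses under one outer get/put: the inner pairs become no-ops. */
static uint32_t sum_regs_held(struct fw_domain *d, const uint32_t *regs, size_t n)
{
    uint32_t sum = 0;

    fw_get(d);
    for (size_t i = 0; i < n; i++)
        sum += reg_read(d, &regs[i]);
    fw_put(d);
    return sum;
}
```

Reading three registers one at a time costs six FORCEWAKE writes in this model, while the held version costs two, and it avoids the intermediate ACK polling as well.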
> Additionally, I saw a call to "__gen6_gt_wait_for_thread_c0" after step (2).
> Does that mean that after FORCEWAKE_ACK is acknowledged, the hardware (possibly ACPI)
> sets the thread to the C0 state?
> And is that visible via GEN6_GT_THREAD_STATUS_REG (0x13805c)?
>
> Thanks,
> Jaeyong
I'm not the expert on that, but it /looks/ like the intent is for the
driver to wait when waking up the GPU not only for the write to the
FORCEWAKE register to be acknowledged, but also for the GT unit to be
fully active (which might take longer).
It polls the register you mention, and the comment suggests that if we
don't wait here, other registers read /via/ the GT unit might appear to
be zero when they aren't really.
I note that it's described as a workaround on SNB/IVB/HSW only, so it
may be that the original expectation was that seeing FORCEWAKE_ACK set
meant that the chip was ready for access, but it turned out that in some
circumstances the GT unit took longer than expected to become ready and
so it had to be polled separately.
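The extra wait amounts to a bounded poll on a status register. Here is a minimal sketch in C against a fake register reader. The register offset is the one quoted above; the mask value, the read32_fn accessor type, the poll budget and the fake hardware are all assumptions made so that the example can run on its own.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_STATUS_REG  0x13805cu  /* offset quoted in the question above */
#define THREAD_STATUS_MASK 0x7u       /* assumed: non-zero while not yet in C0 */

/* Hypothetical register accessor, so the sketch can run against a fake. */
typedef uint32_t (*read32_fn)(void *ctx, uint32_t offset);

/* Poll until the masked status bits read zero; -1 if the budget runs out. */
static int wait_for_thread_c0(read32_fn read32, void *ctx, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++)
        if ((read32(ctx, THREAD_STATUS_REG) & THREAD_STATUS_MASK) == 0)
            return 0;
    return -1;
}

/* Fake hardware that reports "busy" for a fixed number of reads. */
struct fake_hw {
    int polls_until_ready;
};

static uint32_t fake_read32(void *ctx, uint32_t offset)
{
    struct fake_hw *hw = ctx;

    if (offset != THREAD_STATUS_REG)
        return 0;
    if (hw->polls_until_ready > 0) {
        hw->polls_until_ready--;
        return 0x1;
    }
    return 0;
}
```

A real driver would bound the wait by elapsed time rather than by a poll count, and would need to decide what to do on timeout; the sketch only shows the shape of the loop.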
.Dave.
* Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...)
2015-01-12 17:47 ` Dave Gordon
@ 2015-01-15 7:38 ` Jaeyong Yoo
2015-01-19 12:52 ` Dave Gordon
2015-01-19 13:31 ` Dave Gordon
0 siblings, 2 replies; 7+ messages in thread
From: Jaeyong Yoo @ 2015-01-15 7:38 UTC
To: daniel.vetter; +Cc: intel-gfx
Hello Daniel and other maintainers,
While working on a DRM memory allocator by myself, I encountered a render ring hang.
I noticed that I can diagnose the command streamer's status with the following registers:
INSTPS: 0x2070
CSCMDOP: 0x220c
CSCMDVLD: 0x2210
INSTDONE_1: 0x206c
I can see a general description of these registers in the following PDF:
http://www.x.org/docs/intel/VOL_1_graphics_core.pdf
Sadly, this document does not provide field descriptions for implementation-specific registers.
I would appreciate it if you could point me to such information for the Haswell architecture.
Best regards,
JY
* Re: Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...)
2015-01-15 7:38 ` Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...) Jaeyong Yoo
@ 2015-01-19 12:52 ` Dave Gordon
2015-01-19 13:31 ` Dave Gordon
1 sibling, 0 replies; 7+ messages in thread
From: Dave Gordon @ 2015-01-19 12:52 UTC
To: Jaeyong Yoo, daniel.vetter; +Cc: intel-gfx
On 15/01/15 07:38, Jaeyong Yoo wrote:
> Hello Daniel and other maintainers,
>
> While working on a DRM memory allocator by myself, I encountered a render ring hang.
> I noticed that I can diagnose the command streamer's status with the following registers:
>
> INSTPS: 0x2070
> CSCMDOP: 0x220c
> CSCMDVLD: 0x2210
> INSTDONE_1: 0x206c
>
> I can see a general description of these registers in the following PDF:
> http://www.x.org/docs/intel/VOL_1_graphics_core.pdf
>
> Sadly, this document does not provide field descriptions for implementation-specific registers.
> I would appreciate it if you could point me to such information for the Haswell architecture.
>
> Best regards,
> JY
Hi,
have you looked at intel-gpu-tools, available from
http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ ?
It includes tools to decode specific registers (such as INSTDONE) and
describe what the various bits mean (have a look at lib/instdone.c).
There's also a tool to decode an error dump, which is a commonly used
means of tracking down a GPU hang.
Hope this helps,
.Dave.
* Re: Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...)
2015-01-15 7:38 ` Debugging registers (INSTPS, CSCMDOP, CSCMDVLD, ...) Jaeyong Yoo
2015-01-19 12:52 ` Dave Gordon
@ 2015-01-19 13:31 ` Dave Gordon
1 sibling, 0 replies; 7+ messages in thread
From: Dave Gordon @ 2015-01-19 13:31 UTC
To: Jaeyong Yoo; +Cc: daniel.vetter, intel-gfx
On 15/01/15 07:38, Jaeyong Yoo wrote:
> Hello Daniel and other maintainers,
>
> While working on a DRM memory allocator by myself, I encountered a render ring hang.
> I noticed that I can diagnose the command streamer's status with the following registers:
>
> INSTPS: 0x2070
> CSCMDOP: 0x220c
> CSCMDVLD: 0x2210
> INSTDONE_1: 0x206c
>
> I can see a general description of these registers in the following PDF:
> http://www.x.org/docs/intel/VOL_1_graphics_core.pdf
>
> Sadly, this document does not provide field descriptions for implementation-specific registers.
> I would appreciate it if you could point me to such information for the Haswell architecture.
>
> Best regards,
> JY
Hi,
that link you provided looks pretty ancient -- the document is dated Jan
2008! There's a lot of much more recent documentation at
* https://01.org/linuxgraphics/documentation
including some Haswell-specific files at:
* https://01.org/linuxgraphics/documentation/2013-intel-core-processor-family
If you are trying to diagnose a GPU hang, these may be useful:
* https://01.org/linuxgraphics/documentation/how-get-gpu-error-state
* https://01.org/linuxgraphics/documentation/intel-gpu-dump-tool-guide
Hope this helps,
.Dave.