Usage of _PAGE_PCD et al in i915 driver

* Usage of _PAGE_PCD et al in i915 driver
@ 2014-08-08 11:23 Juergen Gross
  2014-08-08 13:14 ` Daniel Vetter
  0 siblings, 1 reply; 8+ messages in thread
From: Juergen Gross @ 2014-08-08 11:23 UTC
  To: benjamin.widawsky; +Cc: daniel.vetter, linux-kernel

Hi,

I'm just about to create a patch for full PAT support in the Linux
kernel, including Xen. For this purpose I introduce a translation
between cache modes and pte bits.

Scanning the kernel sources for usage of the cache mode bits in the
pte, I discovered that drivers/gpu/drm/i915/i915_gem_gtt.h is using
_PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
to create ptes not for use by the main processor, but for the
graphics processor. Is this true? In that case I'd suggest defining
i915-specific macros instead of using the x86 ones.

Juergen

* Re: Usage of _PAGE_PCD et al in i915 driver
  2014-08-08 11:23 Usage of _PAGE_PCD et al in i915 driver Juergen Gross
@ 2014-08-08 13:14 ` Daniel Vetter
  2014-08-13 15:07   ` Jesse Barnes
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Vetter @ 2014-08-08 13:14 UTC
  To: Juergen Gross
  Cc: Ben Widawsky, Linux Kernel Mailing List, intel-gfx, Barnes, Jesse

Adding relevant mailing lists.

On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
> I'm just about to create a patch for full PAT support in the Linux
> kernel, including Xen. For this purpose I introduce a translation
> between cache modes and pte bits.
>
> Scanning the kernel sources for usage of the cache mode bits in the
> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
> to create ptes not for usage by the main processor, but for the
> graphics processor. Is this true? In this case I'd suggest to define
> i915-specific macros instead of using the x86 ones.

Yeah, those are gpu specific PAT tables, but the hw engineers
specifically designed this to match, and we've tried to follow the cpu
side to match it. Especially in the future that will be somewhat
important, since we want to fully share the entire address space
between cpu and gpu on the next platform. Jesse is working on that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: Usage of _PAGE_PCD et al in i915 driver
  2014-08-08 13:14 ` Daniel Vetter
@ 2014-08-13 15:07   ` Jesse Barnes
  2014-08-14  3:55     ` Juergen Gross
  0 siblings, 1 reply; 8+ messages in thread
From: Jesse Barnes @ 2014-08-13 15:07 UTC
  To: Daniel Vetter
  Cc: Juergen Gross, Ben Widawsky, Linux Kernel Mailing List, intel-gfx

On Fri, 8 Aug 2014 15:14:15 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Adding relevant mailing lists.
> 
> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
> > I'm just about to create a patch for full PAT support in the Linux
> > kernel, including Xen. For this purpose I introduce a translation
> > between cache modes and pte bits.
> >
> > Scanning the kernel sources for usage of the cache mode bits in the
> > pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
> > _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
> > to create ptes not for usage by the main processor, but for the
> > graphics processor. Is this true? In this case I'd suggest to define
> > i915-specific macros instead of using the x86 ones.
> 
> Yeah, those are gpu specific PAT tables, but the hw engineers
> specifically designed this to match, and we've tried to follow the cpu
> side to match it. Especially in the future that will be somewhat
> important, since we want to fully share the entire address space
> between cpu and gpu on the next platform. Jesse is working on that.

Right, we have an x86 compatible MMU in the GPU itself, so re-using the
defines makes sense.  I suppose with your work you'll move them and
make them a bit more opaque?  If so, we'll still want a way to get at
them directly, or access your mapping functions for generating PTE bits
for the GPU MMU.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center

* Re: Usage of _PAGE_PCD et al in i915 driver
  2014-08-13 15:07   ` Jesse Barnes
@ 2014-08-14  3:55     ` Juergen Gross
  2014-08-15 10:21       ` [Intel-gfx] " Ville Syrjälä
  0 siblings, 1 reply; 8+ messages in thread
From: Juergen Gross @ 2014-08-14  3:55 UTC
  To: Jesse Barnes, Daniel Vetter
  Cc: Ben Widawsky, Linux Kernel Mailing List, intel-gfx

On 08/13/2014 05:07 PM, Jesse Barnes wrote:
> On Fri, 8 Aug 2014 15:14:15 +0200
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
>> Adding relevant mailing lists.
>>
>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
>>> I'm just about to create a patch for full PAT support in the Linux
>>> kernel, including Xen. For this purpose I introduce a translation
>>> between cache modes and pte bits.
>>>
>>> Scanning the kernel sources for usage of the cache mode bits in the
>>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
>>> to create ptes not for usage by the main processor, but for the
>>> graphics processor. Is this true? In this case I'd suggest to define
>>> i915-specific macros instead of using the x86 ones.
>>
>> Yeah, those are gpu specific PAT tables, but the hw engineers
>> specifically designed this to match, and we've tried to follow the cpu
>> side to match it. Especially in the future that will be somewhat
>> important, since we want to fully share the entire address space
>> between cpu and gpu on the next platform. Jesse is working on that.
>
> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
> defines makes sense.  I suppose with your work you'll move them and
> make them a bit more opaque?  If so, we'll still want a way to get at
> them directly, or access your mapping functions for generating PTE bits
> for the GPU MMU.

Using the mapping functions I'm introducing should work, if the MMU has
an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
on the x86 processor (be aware that Xen uses a different MSR_IA32_CR_PAT
setting than the Linux kernel).
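
A minimal sketch of why the translation depends on the PAT contents,
not the actual mapping functions from the patch. The function name, the
enum and the second PAT layout (WC in slot 1) are assumptions made up
for illustration; the first layout is the x86 power-on default. The
lookup returns the pte bits of the first PAT slot holding the requested
memory type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 memory type encodings as stored in a PAT entry. */
enum mem_type {
    MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UC_MINUS = 7
};

/* Cache attribute bits of a 4 KiB x86 pte. */
#define PTE_PWT (1ULL << 3)
#define PTE_PCD (1ULL << 4)
#define PTE_PAT (1ULL << 7)

/* x86 power-on default: WB, WT, UC-, UC in slots 0-3, repeated in 4-7. */
#define PAT_RESET        0x0007040600070406ULL
/* Hypothetical alternative layout with WC in slots 1 and 5. */
#define PAT_WC_IN_SLOT1  0x0007010600070106ULL

/*
 * Hypothetical helper: find the first PAT slot holding 'type' and
 * return the matching pte bits in *bits. Returns 0 on success, -1 if
 * the given PAT value has no slot for that type.
 */
static int cache_mode_to_pte_bits(uint64_t pat, enum mem_type type,
                                  uint64_t *bits)
{
    assert(bits != NULL);
    for (unsigned i = 0; i < 8; i++) {
        if (((pat >> (8 * i)) & 0x7) == (uint64_t)type) {
            *bits = ((i & 1) ? PTE_PWT : 0) |
                    ((i & 2) ? PTE_PCD : 0) |
                    ((i & 4) ? PTE_PAT : 0);
            return 0;
        }
    }
    return -1;
}
```

With PAT_RESET, asking for MT_WT yields PTE_PWT and asking for MT_WC
fails; with the alternative layout the same PTE_PWT bit selects WC
instead. That is the reason identical pte bits cannot be assumed to
mean the same thing under two different PAT settings.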

Juergen

* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver
  2014-08-14  3:55     ` Juergen Gross
@ 2014-08-15 10:21       ` Ville Syrjälä
  2014-08-18  5:31         ` Juergen Gross
  0 siblings, 1 reply; 8+ messages in thread
From: Ville Syrjälä @ 2014-08-15 10:21 UTC
  To: Juergen Gross
  Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List,
	Ben Widawsky

On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote:
> On 08/13/2014 05:07 PM, Jesse Barnes wrote:
> > On Fri, 8 Aug 2014 15:14:15 +0200
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> >> Adding relevant mailing lists.
> >>
> >> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
> >>> I'm just about to create a patch for full PAT support in the Linux
> >>> kernel, including Xen. For this purpose I introduce a translation
> >>> between cache modes and pte bits.
> >>>
> >>> Scanning the kernel sources for usage of the cache mode bits in the
> >>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
> >>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
> >>> to create ptes not for usage by the main processor, but for the
> >>> graphics processor. Is this true? In this case I'd suggest to define
> >>> i915-specific macros instead of using the x86 ones.
> >>
> >> Yeah, those are gpu specific PAT tables, but the hw engineers
> >> specifically designed this to match, and we've tried to follow the cpu
> >> side to match it. Especially in the future that will be somewhat
> >> important, since we want to fully share the entire address space
> >> between cpu and gpu on the next platform. Jesse is working on that.
> >
> > Right, we have an x86 compatible MMU in the GPU itself, so re-using the
> > defines makes sense.  I suppose with your work you'll move them and
> > make them a bit more opaque?  If so, we'll still want a way to get at
> > them directly, or access your mapping functions for generating PTE bits
> > for the GPU MMU.
> 
> Using the mapping functions I'm introducing should work, if the MMU has
> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT
> setting as the Linux kernel).

We have a PAT that is structured the same way as the x86 PAT. But the
contents of the PAT entries are obviously specific to the GPU so it's
not identical. But the pcd/pwt/pat bits index the PAT in exactly the
same way as on x86.

See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we
set up the PAT.
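
As a rough illustration of the indexing described above (a sketch, not
code from the i915 driver): the three pte bits form a 3-bit index with
PAT as the most significant bit, and each PAT entry occupies one byte
of a 64-bit value. The helper names are made up for this example, and
the bit positions are those of a 4 KiB x86 pte.

```c
#include <assert.h>
#include <stdint.h>

/* Cache attribute bits of a 4 KiB x86 pte (large pages keep PAT in bit 12). */
#define PTE_PWT (1ULL << 3)
#define PTE_PCD (1ULL << 4)
#define PTE_PAT (1ULL << 7)

/* PAT index = PAT:PCD:PWT, with PAT as the most significant bit. */
static unsigned pat_index(uint64_t pte)
{
    return ((pte & PTE_PAT) ? 4u : 0u) |
           ((pte & PTE_PCD) ? 2u : 0u) |
           ((pte & PTE_PWT) ? 1u : 0u);
}

/* Select the one-byte PAT entry that a pte refers to. */
static uint8_t pat_entry(uint64_t pat, uint64_t pte)
{
    unsigned idx = pat_index(pte);

    assert(idx < 8);
    return (uint8_t)(pat >> (8 * idx));
}
```

The index computation is the part shared between CPU and GPU; what the
selected byte means depends on whose PAT value is passed in.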

-- 
Ville Syrjälä
Intel OTC

* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver
  2014-08-15 10:21       ` [Intel-gfx] " Ville Syrjälä
@ 2014-08-18  5:31         ` Juergen Gross
  2014-08-18 10:21           ` Ville Syrjälä
  0 siblings, 1 reply; 8+ messages in thread
From: Juergen Gross @ 2014-08-18  5:31 UTC
  To: Ville Syrjälä
  Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List,
	Ben Widawsky

On 08/15/2014 12:21 PM, Ville Syrjälä wrote:
> On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote:
>> On 08/13/2014 05:07 PM, Jesse Barnes wrote:
>>> On Fri, 8 Aug 2014 15:14:15 +0200
>>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>
>>>> Adding relevant mailing lists.
>>>>
>>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
>>>>> I'm just about to create a patch for full PAT support in the Linux
>>>>> kernel, including Xen. For this purpose I introduce a translation
>>>>> between cache modes and pte bits.
>>>>>
>>>>> Scanning the kernel sources for usage of the cache mode bits in the
>>>>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
>>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
>>>>> to create ptes not for usage by the main processor, but for the
>>>>> graphics processor. Is this true? In this case I'd suggest to define
>>>>> i915-specific macros instead of using the x86 ones.
>>>>
>>>> Yeah, those are gpu specific PAT tables, but the hw engineers
>>>> specifically designed this to match, and we've tried to follow the cpu
>>>> side to match it. Especially in the future that will be somewhat
>>>> important, since we want to fully share the entire address space
>>>> between cpu and gpu on the next platform. Jesse is working on that.
>>>
>>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
>>> defines makes sense.  I suppose with your work you'll move them and
>>> make them a bit more opaque?  If so, we'll still want a way to get at
>>> them directly, or access your mapping functions for generating PTE bits
>>> for the GPU MMU.
>>
>> Using the mapping functions I'm introducing should work, if the MMU has
>> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
>> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT
>> setting as the Linux kernel).
>
> We have a PAT that is structured the same way as the x86 PAT. But the
> contents of the PAT entries are obviously specific to the GPU so it's
> not identical. But the pcd/pwt/pat bits index the PAT in exactly the
> same way as on x86.
>
> See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we
> set up the PAT.
>

So you are using the PAT bit in the ptes, but the semantics for the GPU
will differ from those for the x86 processor, because the GPU PAT is
set up differently from the x86 one.

If you share ptes between the GPU and the x86 processor in the future,
this might lead to problems when the x86 processor uses ptes with
the PAT bit set.


Juergen

* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver
  2014-08-18  5:31         ` Juergen Gross
@ 2014-08-18 10:21           ` Ville Syrjälä
  2014-08-18 10:36             ` Juergen Gross
  0 siblings, 1 reply; 8+ messages in thread
From: Ville Syrjälä @ 2014-08-18 10:21 UTC
  To: Juergen Gross
  Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List,
	Ben Widawsky

On Mon, Aug 18, 2014 at 07:31:58AM +0200, Juergen Gross wrote:
> On 08/15/2014 12:21 PM, Ville Syrjälä wrote:
> > On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote:
> >> On 08/13/2014 05:07 PM, Jesse Barnes wrote:
> >>> On Fri, 8 Aug 2014 15:14:15 +0200
> >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>
> >>>> Adding relevant mailing lists.
> >>>>
> >>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
> >>>>> I'm just about to create a patch for full PAT support in the Linux
> >>>>> kernel, including Xen. For this purpose I introduce a translation
> >>>>> between cache modes and pte bits.
> >>>>>
> >>>>> Scanning the kernel sources for usage of the cache mode bits in the
> >>>>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
> >>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
> >>>>> to create ptes not for usage by the main processor, but for the
> >>>>> graphics processor. Is this true? In this case I'd suggest to define
> >>>>> i915-specific macros instead of using the x86 ones.
> >>>>
> >>>> Yeah, those are gpu specific PAT tables, but the hw engineers
> >>>> specifically designed this to match, and we've tried to follow the cpu
> >>>> side to match it. Especially in the future that will be somewhat
> >>>> important, since we want to fully share the entire address space
> >>>> between cpu and gpu on the next platform. Jesse is working on that.
> >>>
> >>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
> >>> defines makes sense.  I suppose with your work you'll move them and
> >>> make them a bit more opaque?  If so, we'll still want a way to get at
> >>> them directly, or access your mapping functions for generating PTE bits
> >>> for the GPU MMU.
> >>
> >> Using the mapping functions I'm introducing should work, if the MMU has
> >> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
> >> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT
> >> setting as the Linux kernel).
> >
> > We have a PAT that is structured the same way as the x86 PAT. But the
> > contents of the PAT entries are obviously specific to the GPU so it's
> > not identical. But the pcd/pwt/pat bits index the PAT in exactly the
> > same way as on x86.
> >
> > See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we
> > set up the PAT.
> >
> 
> So you are using the PAT bit in the ptes, but the semantic for the GPU
> will be different as for the x86 processor, because the GPU PAT is set
> up differently from the x86 one.
> 
> In case you are sharing ptes between GPU and x86 processor in future,
> this might lead to problems when the x86 processor will use ptes with
> the PAT bit set.

I'm not sure why you single out the PAT bit. It's just another index bit
like PCD and PWT.

Currently we play around with the GPU caching mode rather freely because
the hardware is already fully coherent wrt. CPU caches (well, apart from
display scanout which knows nothing about any caches). What we do
currently is leave all the CPU mappings as WB and just change the GPU
caching mode depending on the need.

However, once we share the page tables I'm not sure what the plan is
wrt. changing the caching mode for GPU buffers, since that would involve
changing the CPU caching mode as well, and we may still want finer
granularity control over the various GPU caches. Maybe we need to
reserve some PAT entries for GPU specific purposes, so that two PAT
entries would mean the same thing to the CPU but differ for the GPU.
But I'm not sure there are any extra PAT entries left which could be
reserved for such things.

We do have ways to override the GPU caching mode using inline information
in the GPU command buffers though, so in theory at least, it doesn't
matter all that much to the GPU how the page table caching bits are
configured. However not all commands may have such inline caching
information, and we still have the display scanout to worry about which
still relies on the page tables to avoid expensive manual clflushes.

-- 
Ville Syrjälä
Intel OTC

* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver
  2014-08-18 10:21           ` Ville Syrjälä
@ 2014-08-18 10:36             ` Juergen Gross
  0 siblings, 0 replies; 8+ messages in thread
From: Juergen Gross @ 2014-08-18 10:36 UTC
  To: Ville Syrjälä
  Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List,
	Ben Widawsky

On 08/18/2014 12:21 PM, Ville Syrjälä wrote:
> On Mon, Aug 18, 2014 at 07:31:58AM +0200, Juergen Gross wrote:
>> On 08/15/2014 12:21 PM, Ville Syrjälä wrote:
>>> On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote:
>>>> On 08/13/2014 05:07 PM, Jesse Barnes wrote:
>>>>> On Fri, 8 Aug 2014 15:14:15 +0200
>>>>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>>
>>>>>> Adding relevant mailing lists.
>>>>>>
>>>>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote:
>>>>>>> I'm just about to create a patch for full PAT support in the Linux
>>>>>>> kernel, including Xen. For this purpose I introduce a translation
>>>>>>> between cache modes and pte bits.
>>>>>>>
>>>>>>> Scanning the kernel sources for usage of the cache mode bits in the
>>>>>>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
>>>>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
>>>>>>> to create ptes not for usage by the main processor, but for the
>>>>>>> graphics processor. Is this true? In this case I'd suggest to define
>>>>>>> i915-specific macros instead of using the x86 ones.
>>>>>>
>>>>>> Yeah, those are gpu specific PAT tables, but the hw engineers
>>>>>> specifically designed this to match, and we've tried to follow the cpu
>>>>>> side to match it. Especially in the future that will be somewhat
>>>>>> important, since we want to fully share the entire address space
>>>>>> between cpu and gpu on the next platform. Jesse is working on that.
>>>>>
>>>>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
>>>>> defines makes sense.  I suppose with your work you'll move them and
>>>>> make them a bit more opaque?  If so, we'll still want a way to get at
>>>>> them directly, or access your mapping functions for generating PTE bits
>>>>> for the GPU MMU.
>>>>
>>>> Using the mapping functions I'm introducing should work, if the MMU has
>>>> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
>>>> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT
>>>> setting as the Linux kernel).
>>>
>>> We have a PAT that is structured the same way as the x86 PAT. But the
>>> contents of the PAT entries are obviously specific to the GPU so it's
>>> not identical. But the pcd/pwt/pat bits index the PAT in exactly the
>>> same way as on x86.
>>>
>>> See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we
>>> set up the PAT.
>>>
>>
>> So you are using the PAT bit in the ptes, but the semantic for the GPU
>> will be different as for the x86 processor, because the GPU PAT is set
>> up differently from the x86 one.
>>
>> In case you are sharing ptes between GPU and x86 processor in future,
>> this might lead to problems when the x86 processor will use ptes with
>> the PAT bit set.
>
> I'm not sure why you single out the PAT bit. It's just another index bit
> like PCD and PWT.

I single out the PAT bit because all the entries selected with PAT==1
differ between the CPU PAT register and the GPU PAT register. With
PAT==0 they are configured to have the same semantics.

> Currently we play around with the GPU caching mode rather freely because
> the hardware is already fully coherent wrt. CPU caches (well, apart from
> display scanout which knows nothing about any caches). What we do
> currently is leave all the CPU mappings as WB and just change the GPU
> caching mode depending on the need.

The Xen hypervisor is already using a different PAT configuration than
the Linux kernel.

So your approach could break Xen when sharing the page tables between
CPU and GPU.

> However once we share the page tables I'm not sure what's the plan wrt.
> changing the caching mode for GPU buffers since that would involve
> changing the CPU caching mode as well, and we may still want finer
> granularity control over the various GPU caches. Maybe we need to
> reserve some PAT entries for GPU specific purposes so that the CPU
> might have no difference between two PAT entries but the GPU would.
> But I'm not sure there are any extra PAT entries left which could be
> reserved for such things.

There should be 2 entries left in the PAT-register which could be used
by the GPU, I think: there are only 6 different cache modes defined for
x86 and we have 8 PAT register entries, so at least 2 entries must be
duplicates.
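
The counting argument can be sketched as follows, assuming the usual
layout of one memory type per byte of the 64-bit PAT value. The helper
names are made up for this example. Since only six memory type
encodings exist, any valid PAT value leaves at least two of its eight
entries as duplicates.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct memory type encodings among the eight PAT entries. */
static unsigned pat_distinct_types(uint64_t pat)
{
    unsigned seen = 0, count = 0;

    for (unsigned i = 0; i < 8; i++) {
        unsigned type = (unsigned)((pat >> (8 * i)) & 0x7);

        if (!(seen & (1u << type))) {
            seen |= 1u << type;
            count++;
        }
    }
    return count;
}

/* Entries that merely repeat a type already present in another slot. */
static unsigned pat_spare_entries(uint64_t pat)
{
    unsigned distinct = pat_distinct_types(pat);

    assert(distinct >= 1 && distinct <= 8);
    return 8 - distinct;
}
```

For the x86 power-on default 0x0007040600070406 this gives four spare
entries; for a value that uses all six types, such as
0x0000050107000406, it gives two.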

> We do have ways to override the GPU caching mode using inline information
> in the GPU command buffers though, so in theory at least, it doesn't
> matter all that much to the GPU how the page table caching bits are
> configured. However not all commands may have such inline caching
> information, and we still have the display scanout to worry about which
> still relies on the page tables to avoid expensive manual clflushes.



