* Usage of _PAGE_PCD et al in i915 driver @ 2014-08-08 11:23 Juergen Gross 2014-08-08 13:14 ` Daniel Vetter 0 siblings, 1 reply; 8+ messages in thread From: Juergen Gross @ 2014-08-08 11:23 UTC (permalink / raw) To: benjamin.widawsky; +Cc: daniel.vetter, linux-kernel Hi, I'm about to create a patch for full PAT support in the Linux kernel, including Xen. For this purpose I am introducing a translation between cache modes and pte bits. Scanning the kernel sources for uses of the cache mode bits in the pte, I discovered that drivers/gpu/drm/i915/i915_gem_gtt.h uses _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used to create ptes for the graphics processor rather than for the main processor. Is this true? If so, I'd suggest defining i915-specific macros instead of using the x86 ones. Juergen ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Usage of _PAGE_PCD et al in i915 driver 2014-08-08 11:23 Usage of _PAGE_PCD et al in i915 driver Juergen Gross @ 2014-08-08 13:14 ` Daniel Vetter 2014-08-13 15:07 ` Jesse Barnes 0 siblings, 1 reply; 8+ messages in thread From: Daniel Vetter @ 2014-08-08 13:14 UTC (permalink / raw) To: Juergen Gross Cc: Ben Widawsky, Linux Kernel Mailing List, intel-gfx, Barnes, Jesse Adding relevant mailing lists. On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: > I'm just about to create a patch for full PAT support in the Linux > kernel, including Xen. For this purpose I introduce a translation > between cache modes and pte bits. > > Scanning the kernel sources for usage of the cache mode bits in the > pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using > _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used > to create ptes not for usage by the main processor, but for the > graphics processor. Is this true? In this case I'd suggest to define > i915-specific macros instead of using the x86 ones. Yeah, those are gpu specific PAT tables, but the hw engineers specifically designed this to match, and we've tried to follow the cpu side to match it. Especially in the future that will be somewhat important, since we want to fully share the entire address space between cpu and gpu on the next platform. Jesse is working on that. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Usage of _PAGE_PCD et al in i915 driver 2014-08-08 13:14 ` Daniel Vetter @ 2014-08-13 15:07 ` Jesse Barnes 2014-08-14 3:55 ` Juergen Gross 0 siblings, 1 reply; 8+ messages in thread From: Jesse Barnes @ 2014-08-13 15:07 UTC (permalink / raw) To: Daniel Vetter Cc: Juergen Gross, Ben Widawsky, Linux Kernel Mailing List, intel-gfx On Fri, 8 Aug 2014 15:14:15 +0200 Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > Adding relevant mailing lists. > > On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: > > I'm just about to create a patch for full PAT support in the Linux > > kernel, including Xen. For this purpose I introduce a translation > > between cache modes and pte bits. > > > > Scanning the kernel sources for usage of the cache mode bits in the > > pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using > > _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used > > to create ptes not for usage by the main processor, but for the > > graphics processor. Is this true? In this case I'd suggest to define > > i915-specific macros instead of using the x86 ones. > > Yeah, those are gpu specific PAT tables, but the hw engineers > specifically designed this to match, and we've tried to follow the cpu > side to match it. Especially in the future that will be somewhat > important, since we want to fully share the entire address space > between cpu and gpu on the next platform. Jesse is working on that. Right, we have an x86 compatible MMU in the GPU itself, so re-using the defines makes sense. I suppose with your work you'll move them and make them a bit more opaque? If so, we'll still want a way to get at them directly, or access your mapping functions for generating PTE bits for the GPU MMU. Thanks, -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Usage of _PAGE_PCD et al in i915 driver 2014-08-13 15:07 ` Jesse Barnes @ 2014-08-14 3:55 ` Juergen Gross 2014-08-15 10:21 ` [Intel-gfx] " Ville Syrjälä 0 siblings, 1 reply; 8+ messages in thread From: Juergen Gross @ 2014-08-14 3:55 UTC (permalink / raw) To: Jesse Barnes, Daniel Vetter Cc: Ben Widawsky, Linux Kernel Mailing List, intel-gfx On 08/13/2014 05:07 PM, Jesse Barnes wrote: > On Fri, 8 Aug 2014 15:14:15 +0200 > Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > >> Adding relevant mailing lists. >> >> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: >>> I'm just about to create a patch for full PAT support in the Linux >>> kernel, including Xen. For this purpose I introduce a translation >>> between cache modes and pte bits. >>> >>> Scanning the kernel sources for usage of the cache mode bits in the >>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using >>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used >>> to create ptes not for usage by the main processor, but for the >>> graphics processor. Is this true? In this case I'd suggest to define >>> i915-specific macros instead of using the x86 ones. >> >> Yeah, those are gpu specific PAT tables, but the hw engineers >> specifically designed this to match, and we've tried to follow the cpu >> side to match it. Especially in the future that will be somewhat >> important, since we want to fully share the entire address space >> between cpu and gpu on the next platform. Jesse is working on that. > > Right, we have an x86 compatible MMU in the GPU itself, so re-using the > defines makes sense. I suppose with your work you'll move them and > make them a bit more opaque? If so, we'll still want a way to get at > them directly, or access your mapping functions for generating PTE bits > for the GPU MMU. 
Using the mapping functions I'm introducing should work, provided the MMU has an x86-compatible MSR_IA32_CR_PAT that is configured the same way as on the x86 processor (be aware that Xen uses a different MSR_IA32_CR_PAT setting than the Linux kernel). Juergen ^ permalink raw reply [flat|nested] 8+ messages in thread
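To make the dependency on the PAT configuration concrete, here is a minimal sketch of a cache-mode to pte-bit translation. It is not the kernel's code and not the mapping functions Juergen refers to: the function names are made up for illustration, and HYPOTHETICAL_PAT is an invented value. The bit positions (PWT = bit 3, PCD = bit 4, PAT = bit 7 in a 4 KiB pte), the memory-type encodings and the power-on default of IA32_PAT follow the x86 architecture.

```python
# Illustrative sketch only; names are hypothetical, not kernel APIs.

# Cache-control bits in a 4 KiB x86 pte.
PAGE_PWT = 1 << 3
PAGE_PCD = 1 << 4
PAGE_PAT = 1 << 7

# Memory-type encodings used in IA32_PAT entries.
UC, WC, WT, WP, WB, UC_MINUS = 0x00, 0x01, 0x04, 0x05, 0x06, 0x07

# Architectural power-on default of IA32_PAT, used here only as sample input.
PAT_POWER_ON = 0x0007040600070406
# Invented value for comparison: same as above but with WC in entry 1.
HYPOTHETICAL_PAT = 0x0007040600070106


def pat_entries(pat_msr):
    """Split a 64-bit PAT value into its eight memory-type fields."""
    return [(pat_msr >> (8 * i)) & 0x07 for i in range(8)]


def cache_mode_to_pte_bits(mode, pat_msr):
    """Return the pte bits that select the first PAT entry holding `mode`."""
    for index, entry in enumerate(pat_entries(pat_msr)):
        if entry == mode:
            bits = 0
            if index & 1:
                bits |= PAGE_PWT
            if index & 2:
                bits |= PAGE_PCD
            if index & 4:
                bits |= PAGE_PAT
            return bits
    raise ValueError("cache mode not present in this PAT configuration")
```

With these two sample values, a request for WT yields only PWT under the power-on default but PAT|PWT under the invented configuration, which is why the same translation cannot be reused blindly by an MMU whose PAT is programmed differently.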
* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver 2014-08-14 3:55 ` Juergen Gross @ 2014-08-15 10:21 ` Ville Syrjälä 2014-08-18 5:31 ` Juergen Gross 0 siblings, 1 reply; 8+ messages in thread From: Ville Syrjälä @ 2014-08-15 10:21 UTC (permalink / raw) To: Juergen Gross Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List, Ben Widawsky On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote: > On 08/13/2014 05:07 PM, Jesse Barnes wrote: > > On Fri, 8 Aug 2014 15:14:15 +0200 > > Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > > > >> Adding relevant mailing lists. > >> > >> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: > >>> I'm just about to create a patch for full PAT support in the Linux > >>> kernel, including Xen. For this purpose I introduce a translation > >>> between cache modes and pte bits. > >>> > >>> Scanning the kernel sources for usage of the cache mode bits in the > >>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using > >>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used > >>> to create ptes not for usage by the main processor, but for the > >>> graphics processor. Is this true? In this case I'd suggest to define > >>> i915-specific macros instead of using the x86 ones. > >> > >> Yeah, those are gpu specific PAT tables, but the hw engineers > >> specifically designed this to match, and we've tried to follow the cpu > >> side to match it. Especially in the future that will be somewhat > >> important, since we want to fully share the entire address space > >> between cpu and gpu on the next platform. Jesse is working on that. > > > > Right, we have an x86 compatible MMU in the GPU itself, so re-using the > > defines makes sense. I suppose with your work you'll move them and > > make them a bit more opaque? If so, we'll still want a way to get at > > them directly, or access your mapping functions for generating PTE bits > > for the GPU MMU. 
> > Using the mapping functions I'm introducing should work, if the MMU has > an x86 compatible MSR_IA32_CR_PAT which is configured the same way as > on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT > setting as the Linux kernel). We have a PAT that is structured the same way as the x86 PAT. But the contents of the PAT entries are obviously specific to the GPU so it's not identical. But the pcd/pwt/pat bits index the PAT in exactly the same way as on x86. See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we set up the PAT. -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 8+ messages in thread
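The indexing Ville describes can be sketched as follows. This is an illustration under stated assumptions, not i915 code: the three bits form the index PAT*4 + PCD*2 + PWT as on x86, CPU_TABLE reflects the architectural power-on PAT, and GPU_TABLE contains placeholder labels because the thread does not spell out the GPU entries (see bdw_setup_private_ppat() and chv_setup_private_ppat() for the real values).

```python
# Illustrative sketch only; table contents for the GPU are placeholders.

# Cache-control bits in a 4 KiB x86 pte (for large pages the PAT bit
# sits at a different position; that case is not covered here).
PAGE_PWT = 1 << 3
PAGE_PCD = 1 << 4
PAGE_PAT = 1 << 7


def pat_index(pte):
    """Compute the PAT entry selected by a pte: PAT*4 + PCD*2 + PWT."""
    return (
        (int(bool(pte & PAGE_PAT)) << 2)
        | (int(bool(pte & PAGE_PCD)) << 1)
        | int(bool(pte & PAGE_PWT))
    )


# Power-on default CPU PAT, entries 0..7.
CPU_TABLE = ["WB", "WT", "UC-", "UC", "WB", "WT", "UC-", "UC"]
# Hypothetical stand-in for a GPU private PAT with its own meanings.
GPU_TABLE = ["gpu-mode-%d" % i for i in range(8)]


def describe(pte):
    """Show how one pte is interpreted by each of the two tables."""
    index = pat_index(pte)
    return index, CPU_TABLE[index], GPU_TABLE[index]
```

The index computation is shared; only the table behind it differs, so a pte that is meaningful to one agent may select an unintended mode on the other.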
* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver 2014-08-15 10:21 ` [Intel-gfx] " Ville Syrjälä @ 2014-08-18 5:31 ` Juergen Gross 2014-08-18 10:21 ` Ville Syrjälä 0 siblings, 1 reply; 8+ messages in thread From: Juergen Gross @ 2014-08-18 5:31 UTC (permalink / raw) To: Ville Syrjälä Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List, Ben Widawsky On 08/15/2014 12:21 PM, Ville Syrjälä wrote: > On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote: >> On 08/13/2014 05:07 PM, Jesse Barnes wrote: >>> On Fri, 8 Aug 2014 15:14:15 +0200 >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote: >>> >>>> Adding relevant mailing lists. >>>> >>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: >>>>> I'm just about to create a patch for full PAT support in the Linux >>>>> kernel, including Xen. For this purpose I introduce a translation >>>>> between cache modes and pte bits. >>>>> >>>>> Scanning the kernel sources for usage of the cache mode bits in the >>>>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using >>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used >>>>> to create ptes not for usage by the main processor, but for the >>>>> graphics processor. Is this true? In this case I'd suggest to define >>>>> i915-specific macros instead of using the x86 ones. >>>> >>>> Yeah, those are gpu specific PAT tables, but the hw engineers >>>> specifically designed this to match, and we've tried to follow the cpu >>>> side to match it. Especially in the future that will be somewhat >>>> important, since we want to fully share the entire address space >>>> between cpu and gpu on the next platform. Jesse is working on that. >>> >>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the >>> defines makes sense. I suppose with your work you'll move them and >>> make them a bit more opaque? 
If so, we'll still want a way to get at >>> them directly, or access your mapping functions for generating PTE bits >>> for the GPU MMU. >> >> Using the mapping functions I'm introducing should work, if the MMU has >> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as >> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT >> setting as the Linux kernel). > > We have a PAT that is structured the same way as the x86 PAT. But the > contents of the PAT entries are obviously specific to the GPU so it's > not identical. But the pcd/pwt/pat bits index the PAT in exactly the > same way as on x86. > > See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we > set up the PAT. > So you are using the PAT bit in the ptes, but its semantics for the GPU differ from those for the x86 processor, because the GPU PAT is set up differently from the x86 one. If you share ptes between the GPU and the x86 processor in the future, this might lead to problems when the x86 processor uses ptes with the PAT bit set. Juergen ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver 2014-08-18 5:31 ` Juergen Gross @ 2014-08-18 10:21 ` Ville Syrjälä 2014-08-18 10:36 ` Juergen Gross 0 siblings, 1 reply; 8+ messages in thread From: Ville Syrjälä @ 2014-08-18 10:21 UTC (permalink / raw) To: Juergen Gross Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List, Ben Widawsky On Mon, Aug 18, 2014 at 07:31:58AM +0200, Juergen Gross wrote: > On 08/15/2014 12:21 PM, Ville Syrjälä wrote: > > On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote: > >> On 08/13/2014 05:07 PM, Jesse Barnes wrote: > >>> On Fri, 8 Aug 2014 15:14:15 +0200 > >>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > >>> > >>>> Adding relevant mailing lists. > >>>> > >>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: > >>>>> I'm just about to create a patch for full PAT support in the Linux > >>>>> kernel, including Xen. For this purpose I introduce a translation > >>>>> between cache modes and pte bits. > >>>>> > >>>>> Scanning the kernel sources for usage of the cache mode bits in the > >>>>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using > >>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used > >>>>> to create ptes not for usage by the main processor, but for the > >>>>> graphics processor. Is this true? In this case I'd suggest to define > >>>>> i915-specific macros instead of using the x86 ones. > >>>> > >>>> Yeah, those are gpu specific PAT tables, but the hw engineers > >>>> specifically designed this to match, and we've tried to follow the cpu > >>>> side to match it. Especially in the future that will be somewhat > >>>> important, since we want to fully share the entire address space > >>>> between cpu and gpu on the next platform. Jesse is working on that. > >>> > >>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the > >>> defines makes sense. 
I suppose with your work you'll move them and > >>> make them a bit more opaque? If so, we'll still want a way to get at > >>> them directly, or access your mapping functions for generating PTE bits > >>> for the GPU MMU. > >> > >> Using the mapping functions I'm introducing should work, if the MMU has > >> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as > >> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT > >> setting as the Linux kernel). > > > > We have a PAT that is structured the same way as the x86 PAT. But the > > contents of the PAT entries are obviously specific to the GPU so it's > > not identical. But the pcd/pwt/pat bits index the PAT in exactly the > > same way as on x86. > > > > See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we > > set up the PAT. > > > > So you are using the PAT bit in the ptes, but the semantic for the GPU > will be different as for the x86 processor, because the GPU PAT is set > up differently from the x86 one. > > In case you are sharing ptes between GPU and x86 processor in future, > this might lead to problems when the x86 processor will use ptes with > the PAT bit set. I'm not sure why you single out the PAT bit. It's just another index bit like PCD and PWT. Currently we play around with the GPU caching mode rather freely because the hardware is already fully coherent wrt. CPU caches (well, apart from display scanout which knows nothing about any caches). What we do currently is leave all the CPU mappings as WB and just change the GPU caching mode depending on the need. However once we share the page tables I'm not sure what's the plan wrt. changing the caching mode for GPU buffers since that would involve changing the CPU caching mode as well, and we may still want finer granularity control over the various GPU caches. 
Maybe we need to reserve some PAT entries for GPU-specific purposes, so that two PAT entries could mean the same thing to the CPU but select different modes on the GPU. But I'm not sure there are any extra PAT entries left which could be reserved for such things. We do have ways to override the GPU caching mode using inline information in the GPU command buffers though, so in theory at least, it doesn't matter all that much to the GPU how the page table caching bits are configured. However, not all commands may have such inline caching information, and we still have the display scanout to worry about, which relies on the page tables to avoid expensive manual clflushes. -- Ville Syrjälä Intel OTC ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-gfx] Usage of _PAGE_PCD et al in i915 driver 2014-08-18 10:21 ` Ville Syrjälä @ 2014-08-18 10:36 ` Juergen Gross 0 siblings, 0 replies; 8+ messages in thread From: Juergen Gross @ 2014-08-18 10:36 UTC (permalink / raw) To: Ville Syrjälä Cc: Jesse Barnes, Daniel Vetter, intel-gfx, Linux Kernel Mailing List, Ben Widawsky On 08/18/2014 12:21 PM, Ville Syrjälä wrote: > On Mon, Aug 18, 2014 at 07:31:58AM +0200, Juergen Gross wrote: >> On 08/15/2014 12:21 PM, Ville Syrjälä wrote: >>> On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote: >>>> On 08/13/2014 05:07 PM, Jesse Barnes wrote: >>>>> On Fri, 8 Aug 2014 15:14:15 +0200 >>>>> Daniel Vetter <daniel.vetter@ffwll.ch> wrote: >>>>> >>>>>> Adding relevant mailing lists. >>>>>> >>>>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross@suse.com> wrote: >>>>>>> I'm just about to create a patch for full PAT support in the Linux >>>>>>> kernel, including Xen. For this purpose I introduce a translation >>>>>>> between cache modes and pte bits. >>>>>>> >>>>>>> Scanning the kernel sources for usage of the cache mode bits in the >>>>>>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using >>>>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used >>>>>>> to create ptes not for usage by the main processor, but for the >>>>>>> graphics processor. Is this true? In this case I'd suggest to define >>>>>>> i915-specific macros instead of using the x86 ones. >>>>>> >>>>>> Yeah, those are gpu specific PAT tables, but the hw engineers >>>>>> specifically designed this to match, and we've tried to follow the cpu >>>>>> side to match it. Especially in the future that will be somewhat >>>>>> important, since we want to fully share the entire address space >>>>>> between cpu and gpu on the next platform. Jesse is working on that. >>>>> >>>>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the >>>>> defines makes sense. 
I suppose with your work you'll move them and >>>>> make them a bit more opaque? If so, we'll still want a way to get at >>>>> them directly, or access your mapping functions for generating PTE bits >>>>> for the GPU MMU. >>>> >>>> Using the mapping functions I'm introducing should work, if the MMU has >>>> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as >>>> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT >>>> setting as the Linux kernel). >>> >>> We have a PAT that is structured the same way as the x86 PAT. But the >>> contents of the PAT entries are obviously specific to the GPU so it's >>> not identical. But the pcd/pwt/pat bits index the PAT in exactly the >>> same way as on x86. >>> >>> See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we >>> set up the PAT. >>> >> >> So you are using the PAT bit in the ptes, but the semantic for the GPU >> will be different as for the x86 processor, because the GPU PAT is set >> up differently from the x86 one. >> >> In case you are sharing ptes between GPU and x86 processor in future, >> this might lead to problems when the x86 processor will use ptes with >> the PAT bit set. > > I'm not sure why you single out the PAT bit. It's just another index bit > like PCD and PWT. I single out the PAT bit because all entries of the CPU PAT register and the GPU PAT register that are selected with PAT==1 differ. With PAT==0 they are configured to have the same semantics. > Currently we play around with the GPU caching mode rather freely because > the hardware is already fully coherent wrt. CPU caches (well, apart from > display scanout which knows nothing about any caches). What we do > currently is leave all the CPU mappings as WB and just change the GPU > caching mode depending on the need. The Xen hypervisor already uses a different PAT configuration than the Linux kernel. So your approach could break Xen when the page tables are shared between CPU and GPU. 
> However once we share the page tables I'm not sure what's the plan wrt. > changing the caching mode for GPU buffers since that would involve > changing the CPU caching mode as well, and we may still want finer > granularity control over the various GPU caches. Maybe we need to > reserve some PAT entries for GPU specific purposes so that the CPU > might have no difference between two PAT entries but the GPU would. > But I'm not sure there are any extra PAT entries left which could be > reserved for such things. There should be 2 entries left in the PAT register which could be used by the GPU, I think: there are only 6 different cache modes defined for x86 and we have 8 PAT register entries, so at least 2 entries must be duplicates. > We do have ways to override the GPU caching mode using inline information > in the GPU command buffers though, so in theory at least, it doesn't > matter all that much to the GPU how the page table caching bits are > configured. However not all commands may have such inline caching > information, and we still have the display scanout to worry about which > still relies on the page tables to avoid expensive manual clflushes. ^ permalink raw reply [flat|nested] 8+ messages in thread
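The counting argument (6 cache modes, 8 PAT entries, hence at least 2 redundant entries) can be checked with a small sketch. The helper names are hypothetical, and the second sample value is constructed purely for illustration; the first is the architectural power-on default of IA32_PAT. Neither is claimed to be the value Linux or Xen programs.

```python
# Illustrative sketch only; helper names and ALL_SIX_TYPES are made up.

def pat_entries(pat_msr):
    """Split a 64-bit PAT value into its eight memory-type fields."""
    return [(pat_msr >> (8 * i)) & 0x07 for i in range(8)]


def duplicate_slots(pat_msr):
    """Return indices of entries whose type already appears at a lower index."""
    seen = set()
    duplicates = []
    for index, mem_type in enumerate(pat_entries(pat_msr)):
        if mem_type in seen:
            duplicates.append(index)
        else:
            seen.add(mem_type)
    return duplicates


# Power-on default: entries 4..7 repeat entries 0..3.
PAT_POWER_ON = 0x0007040600070406
# Constructed example using all six types (WB, WT, UC-, UC, WC, WP)
# in entries 0..5, then repeating WB and WT in entries 6 and 7.
ALL_SIX_TYPES = 0x0406050100070406
```

Even when every defined type is present, two slots remain redundant from the CPU's point of view; whether those could carry GPU-specific meaning is the open question in the thread.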