* [PATCH] drm/radeon: Disable writeback by default on ppc
From: Adam Jackson @ 2013-06-17 14:06 UTC
To: dri-devel

At least on an IBM Power 720, this check passes, but several piglit
tests will reliably trigger GPU resets due to the ring buffer pointers
not being updated. There's probably a better way to limit this to just
affected machines though.

Signed-off-by: Adam Jackson <ajax@redhat.com>
---
 drivers/gpu/drm/radeon/r600_cp.c       | 7 +++++++
 drivers/gpu/drm/radeon/radeon_cp.c     | 7 +++++++
 drivers/gpu/drm/radeon/radeon_device.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_drv.c    | 2 +-
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c
index 1c51c08..ef28532 100644
--- a/drivers/gpu/drm/radeon/r600_cp.c
+++ b/drivers/gpu/drm/radeon/r600_cp.c
@@ -552,6 +552,13 @@ static void r600_test_writeback(drm_radeon_private_t *dev_priv)
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback test failed\n");
 	}
+#if defined(__ppc__) || defined(__ppc64__)
+	/* the test might succeed on ppc, but it's usually not reliable */
+	if (radeon_no_wb == -1) {
+		radeon_no_wb = 1;
+		DRM_INFO("not trusting writeback test due to arch quirk\n");
+	}
+#endif
 	if (radeon_no_wb == 1) {
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback forced off\n");
diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c
index efc4f64..a967b33 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -892,6 +892,13 @@ static void radeon_test_writeback(drm_radeon_private_t * dev_priv)
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback test failed\n");
 	}
+#if defined(__ppc__) || defined(__ppc64__)
+	/* the test might succeed on ppc, but it's usually not reliable */
+	if (radeon_no_wb == -1) {
+		radeon_no_wb = 1;
+		DRM_INFO("not trusting writeback test due to arch quirk\n");
+	}
+#endif
 	if (radeon_no_wb == 1) {
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback forced off\n");
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 1899738..524046e 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -322,8 +322,8 @@ int radeon_wb_init(struct radeon_device *rdev)
 	/* disable event_write fences */
 	rdev->wb.use_event = false;
 	/* disabled via module param */
-	if (radeon_no_wb == 1) {
-		rdev->wb.enabled = false;
+	if (radeon_no_wb != -1) {
+		rdev->wb.enabled = !!radeon_no_wb;
 	} else {
 		if (rdev->flags & RADEON_IS_AGP) {
 			/* often unreliable on AGP */
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 094e7e5..04809d4 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -146,7 +146,7 @@ static inline void radeon_register_atpx_handler(void) {}
 static inline void radeon_unregister_atpx_handler(void) {}
 #endif
 
-int radeon_no_wb;
+int radeon_no_wb = -1;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-- 
1.8.2.1
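The radeon_device.c hunk in the patch turns radeon_no_wb into a tri-state value. A minimal sketch of that mapping follows, assuming -1 means "no preference, use the auto-detected result", 0 means "force writeback on" and 1 means "force writeback off". The helper name wb_enabled_from_param() and its auto_result argument are hypothetical and not part of the driver. Read literally, `!!radeon_no_wb` in the hunk evaluates to true for radeon_no_wb=1, which would enable writeback; the sketch uses `!no_wb` on the assumption that the parameter keeps its "no writeback" meaning.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not driver code: map a tri-state "no writeback"
 * parameter to an enabled flag.
 *   -1: no user preference, fall back to the auto-detected result
 *    0: user forces writeback on
 *    1: user forces writeback off
 */
static bool wb_enabled_from_param(int no_wb, bool auto_result)
{
	if (no_wb == -1)
		return auto_result;
	/* Note the single negation: no_wb=1 must yield "disabled". */
	return !no_wb;
}
```

With this mapping, wb_enabled_from_param(1, true) is false and wb_enabled_from_param(0, false) is true, so an explicit setting always overrides auto-detection.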
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Alex Deucher @ 2013-06-17 15:04 UTC
To: Adam Jackson; +Cc: dri-devel

On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
> At least on an IBM Power 720, this check passes, but several piglit
> tests will reliably trigger GPU resets due to the ring buffer pointers
> not being updated. There's probably a better way to limit this to just
> affected machines though.

What radeon chips are you seeing this on? wb is more or less required
on r6xx and newer and I'm not sure those generations will even work
properly without writeback enabled these days. We force it to always
be enabled on APUs and NI and newer asics. With KMS, wb encompasses
more than just rptr writeback; it covers pretty much everything
involving the GPU writing any status information to memory.
Alex
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Adam Jackson @ 2013-06-17 16:07 UTC
To: Alex Deucher; +Cc: dri-devel

On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote:
> On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
> > At least on an IBM Power 720, this check passes, but several piglit
> > tests will reliably trigger GPU resets due to the ring buffer pointers
> > not being updated. There's probably a better way to limit this to just
> > affected machines though.
>
> What radeon chips are you seeing this on? wb is more or less required
> on r6xx and newer and I'm not sure those generations will even work
> properly without writeback enabled these days. We force it to always
> be enabled on APUs and NI and newer asics. With KMS, wb encompasses
> more than just rptr writeback; it covers pretty much everything
> involving the GPU writing any status information to memory.

FirePro 2270, at least. Booting with no_wb=1, piglit runs to completion
with no GPU resets or IB submission failures. Booting without no_wb,
the following piglits go from pass to fail, all complaining that the
kernel rejected CS:

Test  no-wb  wb
All  7598/9587  7403/9553
glean  362/385  361/385
  pointAtten  pass  fail
  texture_srgb  pass  fail
shaders  477/533  441/533
  glsl-algebraic-add-add-1  pass  fail
  glsl-algebraic-add-add-2  pass  fail
  glsl-algebraic-add-add-3  pass  fail
  glsl-algebraic-sub-zero-3  pass  fail
  glsl-algebraic-sub-zero-4  pass  fail
  glsl-complex-subscript  pass  fail
  glsl-copy-propagation-if-1  pass  fail
  glsl-copy-propagation-if-2  pass  fail
  glsl-copy-propagation-if-3  pass  fail
  glsl-copy-propagation-vector-indexing  pass  fail
  glsl-fs-atan-2  pass  fail
  glsl-fs-dot-vec2-2  pass  fail
  glsl-fs-log2  pass  fail
  glsl-fs-main-return  pass  fail
  glsl-fs-max-3  pass  fail
  glsl-fs-min-2  pass  fail
  glsl-fs-min-3  pass  fail
  glsl-fs-statevar-call  pass  fail
  glsl-fs-struct-equal  pass  fail
  glsl-function-chain16  pass  fail
  glsl-implicit-conversion-02  pass  fail
  glsl-inout-struct-01  pass  fail
  glsl-inout-struct-02  pass  fail
  glsl-link-varying-TexCoord  pass  fail
  glsl-link-varyings-2  pass  fail
  glsl-uniform-initializer-4  pass  fail
  glsl-uniform-initializer-6  pass  fail
  glsl-uniform-initializer-7  pass  fail
  glsl-vs-abs-neg-with-intermediate  pass  fail
  glsl-vs-clamp-1  pass  fail
  glsl-vs-deadcode-1  pass  fail
  glsl-vs-deadcode-2  pass  fail
  glsl-vs-f2b  pass  fail
  glsl-vs-position-outval  pass  fail
  link-uniform-array-size  pass  fail
  loopfunc  pass  fail
spec  5921/7801  5763/7767
!OpenGL 1.1  118/229  101/219
  depthstencil-default_fb-clear  pass  fail
  getteximage-simple  pass  fail
texwrap formats  27/50  12/40
  GL_LUMINANCE12  pass  fail
  GL_LUMINANCE16  pass  fail
  GL_LUMINANCE4  pass  fail
  GL_LUMINANCE4_ALPHA4  pass  fail
  GL_LUMINANCE8  pass  fail
!OpenGL 1.4  11/13  9/13
  triangle-rasterization  pass  fail
  triangle-rasterization-fbo  pass  fail
ARB_depth_buffer_float  9/63  6/63
  fbo-depth-GL_DEPTH32F_STENCIL8-clear  pass  fail
  fbo-depth-GL_DEPTH_COMPONENT32F-clear  pass  fail
  fbo-stencil-GL_DEPTH32F_STENCIL8-clear  pass  fail
ARB_depth_texture  11/53  10/53
  fbo-depth-GL_DEPTH_COMPONENT32-clear  pass  fail
ARB_fragment_program  23/28  22/28
  kil-swizzle  pass  fail
ARB_framebuffer_object  14/26  13/26
  fbo-deriv  pass  fail
ARB_framebuffer_sRGB  80/81  65/81
  blit renderbuffer linear downsample disabled  pass  fail
  blit renderbuffer linear msaa enabled  pass  fail
  blit renderbuffer linear single_sampled enabled  pass  fail
  blit renderbuffer linear_to_srgb scaled disabled  pass  fail
  blit renderbuffer srgb msaa disabled  pass  fail
  blit renderbuffer srgb msaa enabled  pass  fail
  blit renderbuffer srgb scaled enabled  pass  fail
  blit renderbuffer srgb single_sampled enabled  pass  fail
  blit renderbuffer srgb upsample enabled  pass  fail
  blit renderbuffer srgb_to_linear msaa disabled  pass  fail
  blit renderbuffer srgb_to_linear msaa enabled  pass  fail
  blit texture linear scaled enabled  pass  fail
  blit texture linear_to_srgb upsample disabled  pass  fail
  blit texture srgb msaa disabled  pass  fail
  blit texture srgb msaa enabled  pass  fail
ARB_map_buffer_range  8/8  6/8
  MAP_INVALIDATE_RANGE_BIT decrement-offset  pass  fail
  MAP_INVALIDATE_RANGE_BIT offset=0  pass  fail
ARB_sampler_objects  1/4  0/4
  sampler-objects  pass  fail
ARB_shading_language_packing  10/10  7/10
execution  10/10  7/10
built-in-functions  10/10  7/10
  const-packSnorm4x8  pass  fail
  const-unpackSnorm2x16  pass  fail
  const-unpackUnorm2x16  pass  fail
ARB_texture_compression  25/40  22/40
texwrap formats bordercolor-swizzled  3/6  0/6
  GL_COMPRESSED_INTENSITY, swizzled, border color only  pass  fail
  GL_COMPRESSED_LUMINANCE, swizzled, border color only  pass  fail
  GL_COMPRESSED_LUMINANCE_ALPHA, swizzled, border color only  pass  fail
ARB_texture_cube_map  7/8  6/8
  cubemap  pass  fail
ARB_texture_rectangle  11/21  10/21
  texrect_simple_arb_texrect  pass  fail
ARB_texture_rg  42/113  19/97
texwrap formats-int  24/28  0/12
  GL_R16UI  pass  fail
  GL_R32UI  pass  fail
  GL_R8I  pass  fail
  GL_R8UI  pass  fail
  GL_RG16UI  pass  fail
  GL_RG32UI  pass  fail
  GL_RG8I  pass  fail
  GL_RG8UI  pass  fail
ARB_texture_storage  1/1  0/1
  texture-storage  pass  fail
ARB_timer_query  1/3  0/3
  query GL_TIMESTAMP  pass  fail
ATI_draw_buffers  3/3  2/3
  arbfp-no-option  pass  fail
EXT_framebuffer_multisample  267/287  259/287
  bitmap 4  pass  fail
  bitmap 6  pass  fail
  bitmap 8  pass  fail
  clear 2 depth  pass  fail
  clear 6 color  pass  fail
  clear 6 depth  pass  fail
  clear 6 stencil  pass  fail
  draw-buffers-alpha-to-one 4  pass  fail
EXT_framebuffer_object  175/289  173/289
  fbo-fragcoord2  pass  fail
  fbo-generatemipmap-filtering  pass  fail
fbo-generatemipmap-formats  2/76  1/76
  GL_LUMINANCE4_ALPHA4  pass  fail
fbo-readpixels-depth-formats  14/24  16/24
GL_DEPTH24_STENCIL8_EXT  3/4  2/4
  GL_UNSIGNED_INT  pass  fail
  fbo-stencil-GL_STENCIL_INDEX16-clear  pass  fail
EXT_packed_depth_stencil  13/58  11/58
  fbo-depthstencil-GL_DEPTH24_STENCIL8-clear  pass  fail
  fbo-stencil-GL_DEPTH24_STENCIL8-clear  pass  fail
EXT_texture_compression_rgtc  31/39  11/31
fbo-generatemipmap-formats  8/8  4/8
  GL_COMPRESSED_RED  pass  fail
  GL_COMPRESSED_RED_GREEN_RGTC2_EXT  pass  fail
  GL_COMPRESSED_RED_RGTC1_EXT  pass  fail
  GL_COMPRESSED_RG  pass  fail
texwrap formats  12/12  0/4
  GL_COMPRESSED_RED_RGTC1  pass  fail
  GL_COMPRESSED_RG_RGTC2  pass  fail
  GL_COMPRESSED_SIGNED_RED_RGTC1  pass  fail
  GL_COMPRESSED_SIGNED_RG_RGTC2  pass  fail
texwrap formats bordercolor  4/4  0/4
  GL_COMPRESSED_RED_RGTC1, border color only  pass  fail
  GL_COMPRESSED_RG_RGTC2, border color only  pass  fail
  GL_COMPRESSED_SIGNED_RED_RGTC1, border color only  pass  fail
  GL_COMPRESSED_SIGNED_RG_RGTC2, border color only  pass  fail
EXT_transform_feedback  45/279  44/279
  discard-bitmap  pass  fail
glsl-1.10  1540/1636  1529/1636
execution  1295/1391  1286/1391
  fs-inline-notequal  pass  fail
  fs-saturate-pow  pass  fail
maximums  12/12  8/12
  gl_MaxCombinedTextureImageUnits  pass  fail
  gl_MaxFragmentUniformComponents  pass  fail
  gl_MaxTextureImageUnits  pass  fail
  gl_MaxVertexAttribs  pass  fail
  vs-mat2-array-assignment  pass  fail
  vs-saturate-exp2  pass  fail
  vs-saturate-pow  pass  fail
linker  18/18  16/18
  access-builtin-global-from-fn-unknown-to-main  pass  fail
  override-builtin-uniform-01  pass  fail
glsl-1.20  2124/2183  2097/2183
execution  788/845  761/845
maximums  12/12  7/12
  gl_MaxClipPlanes  pass  fail
  gl_MaxCombinedTextureImageUnits  pass  fail
  gl_MaxTextureCoords  pass  fail
  gl_MaxVaryingFloats  pass  fail
  gl_MaxVertexTextureImageUnits  pass  fail
uniform-initializer  64/64  44/64
  fs-bool-set-by-other-stage  pass  fail
  fs-float-array  pass  fail
  fs-float-from-const  pass  fail
  fs-float-set-by-other-stage  pass  fail
  fs-int  pass  fail
  fs-int-from-const  pass  fail
  fs-int-set-by-other-stage  pass  fail
  fs-mat3-set-by-other-stage  pass  fail
  fs-mat4-from-const  pass  fail
  fs-mat4-set-by-API  pass  fail
  fs-mat4-set-by-other-stage  pass  fail
  fs-structure-array  pass  fail
  vs-bool-from-const  pass  fail
  vs-float  pass  fail
  vs-float-array  pass  fail
  vs-mat2  pass  fail
  vs-mat3  pass  fail
  vs-mat4  pass  fail
  vs-mat4-from-const  pass  fail
  vs-mat4-set-by-API  pass  fail
  vs-all-equal-bool-array  pass  fail
  vs-assign-varied-struct  pass  fail
glsl-1.30  468/604  455/604
execution  465/601  452/601
clipping  20/20  18/20
  fs-clip-distance-interpolated  pass  fail
  vs-clip-distance-explicitly-sized  pass  fail
  fs-discard-exit-1  pass  fail
  fs-discard-exit-2  pass  fail
  fs-increment-int  pass  fail
  fs-multiply-const-ivec4  pass  fail
maximums  13/13  9/13
  gl_MaxClipPlanes  pass  fail
  gl_MaxCombinedTextureImageUnits  pass  fail
  gl_MaxTextureImageUnits  pass  fail
  gl_MaxVaryingFloats  pass  fail
qualifiers  1/1  0/1
  vs-out-conversion-ivec4-to-vec4  pass  fail
uniform-initializer  8/8  6/8
  fs-uint-from-const  pass  fail
  vs-uint  pass  fail

- ajax
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Alex Deucher @ 2013-06-17 22:57 UTC
To: Adam Jackson; +Cc: dri-devel

On Mon, Jun 17, 2013 at 12:07 PM, Adam Jackson <ajax@redhat.com> wrote:
> On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote:
>> On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
>> > At least on an IBM Power 720, this check passes, but several piglit
>> > tests will reliably trigger GPU resets due to the ring buffer pointers
>> > not being updated. There's probably a better way to limit this to just
>> > affected machines though.
>>
>> What radeon chips are you seeing this on? wb is more or less required
>> on r6xx and newer and I'm not sure those generations will even work
>> properly without writeback enabled these days. We force it to always
>> be enabled on APUs and NI and newer asics. With KMS, wb encompasses
>> more than just rptr writeback; it covers pretty much everything
>> involving the GPU writing any status information to memory.
>
> FirePro 2270, at least. Booting with no_wb=1, piglit runs to completion
> with no GPU resets or IB submission failures. Booting without no_wb,
> the following piglits go from pass to fail, all complaining that the
> kernel rejected CS:

Weird. I wonder if there is an issue with cache snoops on PPC. We
currently use the gart in cached mode (GPU snoops CPU cache) with
cached pages. I wonder if we need to use uncached pages on PPC.

Alex
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Benjamin Herrenschmidt @ 2013-11-07 22:29 UTC
To: Alex Deucher
Cc: dri-devel, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King

On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:

> Weird. I wonder if there is an issue with cache snoops on PPC. We
> currently use the gart in cached mode (GPU snoops CPU cache) with
> cached pages. I wonder if we need to use uncached pages on PPC.

There is no such issue and no known bugs with DMA writes on those
PCIe host bridges (and they do get hammered pretty bad here).

This needs further investigation by the lab/hw guys to find out what's
actually happening on the bus and the host bridge.

Thadeu, Kleber: Jerome suggested writing a test case in userspace that
continuously writes to a spare scratch register (thus triggering the
corresponding writeback DMA) and checks the memory location to compare
the writeback value (using a debugfs file for example, or mmap).

Can you guys do something like that ? Then we need the analyzer on
and/or the lab guys to look at the fabric trace & PHB trace.

Ben.
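The loop Jerome suggested (write a scratch register, then compare the writeback location) might be structured roughly as below. This is only a sketch of the compare logic against a mock device: struct mock_gpu, mock_write_scratch() and wb_stress() are hypothetical names, the writeback DMA is modelled as a plain store, and drop_every exists only to inject a lost writeback. A real test would replace the mock with register access and a mapping of the writeback page, neither of which is shown here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware under test. */
struct mock_gpu {
	uint32_t scratch_reg;      /* models the spare scratch register */
	volatile uint32_t wb_slot; /* models the writeback memory location */
	unsigned long drop_every;  /* lose every Nth writeback, 0 = never */
	unsigned long writes;      /* number of scratch writes so far */
};

/* Write the scratch register; the mock "DMA" copies it to wb_slot. */
static void mock_write_scratch(struct mock_gpu *gpu, uint32_t value)
{
	gpu->scratch_reg = value;
	gpu->writes++;
	if (gpu->drop_every && gpu->writes % gpu->drop_every == 0)
		return; /* simulated lost writeback */
	gpu->wb_slot = value;
}

/*
 * Write iters distinct values and count how often the writeback slot
 * fails to show the new value within max_polls reads.
 */
static unsigned long wb_stress(struct mock_gpu *gpu, unsigned long iters,
			       unsigned int max_polls)
{
	unsigned long mismatches = 0;
	unsigned long i;

	for (i = 1; i <= iters; i++) {
		uint32_t value = (uint32_t)i;
		unsigned int polls = 0;

		mock_write_scratch(gpu, value);
		/* Bounded poll: real writeback DMA completes asynchronously. */
		while (gpu->wb_slot != value && polls < max_polls)
			polls++;
		if (gpu->wb_slot != value)
			mismatches++;
	}
	return mismatches;
}
```

The bounded poll is the part worth keeping in a real test: a single immediate read would report false failures, since the DMA write lands some time after the register write.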
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Kleber Sacilotto de Souza @ 2013-11-08 13:43 UTC
To: Benjamin Herrenschmidt
Cc: dri-devel, Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo

On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>
>> Weird. I wonder if there is an issue with cache snoops on PPC. We
>> currently use the gart in cached mode (GPU snoops CPU cache) with
>> cached pages. I wonder if we need to use uncached pages on PPC.
>
> There is no such issue and no known bugs with DMA writes on those
> PCIe host bridges (and they do get hammered pretty bad here).
>
> This needs further investigation by the lab/hw guys to find out what's
> actually happening on the bus and the host bridge.
>
> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
> continuously writes to a spare scratch register (thus triggering the
> corresponding writeback DMA) and checks the memory location to compare
> the writeback value (using a debugfs file for example, or mmap).
>

I can look into that.

Thanks,
Kleber

> Can you guys do something like that ? Then we need the analyzer on
> and/or the lab guys to look at the fabric trace & PHB trace.
>
> Ben.
>
>

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Benjamin Herrenschmidt @ 2013-11-24 23:15 UTC
To: Kleber Sacilotto de Souza
Cc: dri-devel, Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo

On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
> >
> >> Weird. I wonder if there is an issue with cache snoops on PPC. We
> >> currently use the gart in cached mode (GPU snoops CPU cache) with
> >> cached pages. I wonder if we need to use uncached pages on PPC.
> >
> > There is no such issue and no known bugs with DMA writes on those
> > PCIe host bridges (and they do get hammered pretty bad here).
> >
> > This needs further investigation by the lab/hw guys to find out what's
> > actually happening on the bus and the host bridge.
> >
> > Thadeu, Kleber: Jerome suggested writing a test case in userspace that
> > continuously writes to a spare scratch register (thus triggering the
> > corresponding writeback DMA) and checks the memory location to compare
> > the writeback value (using a debugfs file for example, or mmap).
> >
>
> I can look into that.

Any news ?

Ben.

>
> Thanks,
>
> Kleber
>
> > Can you guys do something like that ? Then we need the analyzer on
> > and/or the lab guys to look at the fabric trace & PHB trace.
> >
> > Ben.
> >
> >
>
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Kleber Sacilotto de Souza @ 2013-11-26 0:11 UTC
To: Benjamin Herrenschmidt
Cc: dri-devel, Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo

On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>
>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We
>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>> cached pages. I wonder if we need to use uncached pages on PPC.
>>> There is no such issue and no known bugs with DMA writes on those
>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>
>>> This needs further investigation by the lab/hw guys to find out what's
>>> actually happening on the bus and the host bridge.
>>>
>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>> continuously writes to a spare scratch register (thus triggering the
>>> corresponding writeback DMA) and checks the memory location to compare
>>> the writeback value (using a debugfs file for example, or mmap).
>>>
>> I can look into that.
> Any news ?

Only now I've got the bandwidth to actually take a look at this. I will
start to write some code and let you guys know of the results.

Thanks,
Kleber

>
> Ben.
>
>> Thanks,
>>
>> Kleber
>>
>>> Can you guys do something like that ? Then we need the analyzer on
>>> and/or the lab guys to look at the fabric trace & PHB trace.
>>>
>>> Ben.
>>>
>>>
>>
>

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Kleber Sacilotto de Souza @ 2013-12-04 22:16 UTC
To: Benjamin Herrenschmidt, Jerome Glisse, Alex Deucher
Cc: Brian King, Thadeu Lima de Souza Cascardo, dri-devel

On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>
>>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We
>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>> cached pages. I wonder if we need to use uncached pages on PPC.
>>>> There is no such issue and no known bugs with DMA writes on those
>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>
>>>> This needs further investigation by the lab/hw guys to find out what's
>>>> actually happening on the bus and the host bridge.
>>>>
>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>> continuously writes to a spare scratch register (thus triggering the
>>>> corresponding writeback DMA) and checks the memory location to compare
>>>> the writeback value (using a debugfs file for example, or mmap).

I was not able to reproduce the issue with this method, even after a
weekend run.

However, doing some more investigation it seems the problem is here,
where we read the ring rptr:

u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
                                 struct radeon_ring *ring)
{
    u32 rptr;

    if (rdev->wb.enabled)
        rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
    else
        rptr = RREG32(ring->rptr_reg);

    return rptr;
}

I realized that the DMA'ed rptr value has always the opposite byte order
from the MMIO value. Since RREG32 already returns the register value on
the CPU byte order, it seems we don't need to byte-swap the DMA'ed
value. If I remove the le32_to_cpu() call and use the DMA'ed value
directly, I don't get the IB scheduling failures and piglit results are
the same as with writeback disabled.

Is the adapter chipset swapping the bytes before doing the DMA to a
big-endian host?

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
From: Alex Deucher @ 2013-12-04 23:56 UTC
To: Kleber Sacilotto de Souza
Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King

On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza
<klebers@linux.vnet.ibm.com> wrote:
> On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
>>
>> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>>>
>>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>>>
>>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>>>
>>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>>
>>>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We
>>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>>> cached pages. I wonder if we need to use uncached pages on PPC.
>>>>>
>>>>> There is no such issue and no known bugs with DMA writes on those
>>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>>
>>>>> This needs further investigation by the lab/hw guys to find out what's
>>>>> actually happening on the bus and the host bridge.
>>>>>
>>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>>> continuously writes to a spare scratch register (thus triggering the
>>>>> corresponding writeback DMA) and checks the memory location to compare
>>>>> the writeback value (using a debugfs file for example, or mmap).
>
>
> I was not able to reproduce the issue with this method, even after a weekend
> run.
>
> However, doing some more investigation it seems the problem is here, where
> we read the ring rptr:
>
> u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
>                                  struct radeon_ring *ring)
> {
>     u32 rptr;
>
>     if (rdev->wb.enabled)
>         rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
>     else
>         rptr = RREG32(ring->rptr_reg);
>
>     return rptr;
> }
>
> I realized that the DMA'ed rptr value has always the opposite byte order
> from the MMIO value. Since RREG32 already returns the register value on the
> CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I
> remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get
> the IB scheduling failures and piglit results are the same as with writeback
> disabled.
>
> Is the adapter chipset swapping the bytes before doing the DMA to a
> big-endian host?

Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
swapping for just about everything accessed by the CP (rptr writeback,
indirect buffers, etc.). Looks like the DMA ring supports and enables
rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
I think we can drop the swapping of the rptr writeback.

Alex

>
>
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>
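The double swap Kleber observed and Alex explained can be modelled in a few lines. This is a sketch under stated assumptions, not driver code: the hw_swap flag stands in for CP_RB_CNTL.BUF_SWAP and is assumed to select a full 32-bit byte reversal, the big-endian host is modelled explicitly so the example behaves the same on any build machine, and all function names are hypothetical.

```c
#include <stdint.h>

/* Byte-reverse a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of a big-endian CPU loading a 32-bit word from memory. */
static uint32_t be_cpu_load32(const uint8_t mem[4])
{
	return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
	       ((uint32_t)mem[2] << 8) | (uint32_t)mem[3];
}

/* le32_to_cpu() as it behaves on a big-endian CPU: a byte reversal. */
static uint32_t be_cpu_le32_to_cpu(uint32_t v)
{
	return swap32(v);
}

/*
 * Model of the device writing rptr to host memory. Without hardware
 * swapping the bytes land in little-endian order; with hw_swap set
 * (assumed 32-bit swap mode) they land in big-endian order.
 */
static void device_write_rptr(uint8_t mem[4], uint32_t rptr, int hw_swap)
{
	if (hw_swap)
		rptr = swap32(rptr);
	mem[0] = (uint8_t)(rptr & 0xff);
	mem[1] = (uint8_t)((rptr >> 8) & 0xff);
	mem[2] = (uint8_t)((rptr >> 16) & 0xff);
	mem[3] = (uint8_t)((rptr >> 24) & 0xff);
}
```

In this model, when the device already swaps, a plain load on the big-endian host yields the correct rptr and applying le32_to_cpu() on top reverses it a second time, which matches the symptom described above. Only when the device does not swap is the le32_to_cpu() call needed.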
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-04 23:56 ` Alex Deucher @ 2013-12-05 0:05 ` Alex Deucher 2013-12-05 1:39 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-05 0:05 UTC (permalink / raw) To: Kleber Sacilotto de Souza Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King [-- Attachment #1: Type: text/plain, Size: 2875 bytes --] On Wed, Dec 4, 2013 at 6:56 PM, Alex Deucher <alexdeucher@gmail.com> wrote: > On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza > <klebers@linux.vnet.ibm.com> wrote: >> On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote: >>> >>> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote: >>>> >>>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote: >>>>> >>>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: >>>>>> >>>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: >>>>>> >>>>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We >>>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with >>>>>>> cached pages. I wonder if we need to use uncached pages on PPC. >>>>>> >>>>>> There is no such issue and no known bugs with DMA writes on those >>>>>> PCIe host bridges (and they do get hammered pretty bad here). >>>>>> >>>>>> This needs further investigation by the lab/hw guys to find out what's >>>>>> actually happening on the bus and the host bridge. >>>>>> >>>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that >>>>>> continuously writes to a spare scratch register (thus triggering the >>>>>> corresponding writeback DMA) and checks the memory location to compare >>>>>> the writeback value (using a debugfs file for example, or mmap). >> >> >> I was not able to reproduce the issue with this method, even after a weekend >> run. 
>> >> However, doing some more investigation it seems the problem is here, where >> we read the ring rptr: >> >> u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, >> struct radeon_ring *ring) >> { >> u32 rptr; >> >> if (rdev->wb.enabled) >> rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); >> else >> rptr = RREG32(ring->rptr_reg); >> >> return rptr; >> } >> >> I realized that the DMA'ed rptr value has always the opposite byte order >> from the MMIO value. Since RREG32 already returns the register value on the >> CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I >> remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get >> the IB scheduling failures and piglit results are the same as with writeback >> disabled. >> >> Is the adapter chipset swapping the bytes before doing the DMA to a >> big-endian host? > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte > swapping for just about everything accessed by the CP (rptr writeback, > indirect buffers, etc.). Looks like the DMA ring supports and enables > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so > I think we can drop the swapping of the rptr writeback. > Obvious patch attached. Alex > Alex > >> >> >> >> -- >> Kleber Sacilotto de Souza >> IBM Linux Technology Center >> [-- Attachment #2: 0001-drm-radeon-don-t-byteswap-readback-of-rptr-writeback.patch --] [-- Type: text/x-diff, Size: 995 bytes --] From 0f7705c378a14060552df19c0e724e47632da4d8 Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@amd.com> Date: Wed, 4 Dec 2013 19:02:08 -0500 Subject: [PATCH] drm/radeon: don't byteswap readback of rptr writeback We already enable byteswapping of the rptr writeback in the hw. Fixes incorrect rptr readback on BE systems. 
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_ring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 9214403..56ff0cd 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -338,7 +338,7 @@ u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, u32 rptr; if (rdev->wb.enabled) - rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); + rptr = rdev->wb.wb[ring->rptr_offs/4]; else rptr = RREG32(ring->rptr_reg); -- 1.8.3.1 [-- Attachment #3: Type: text/plain, Size: 159 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 23+ messages in thread
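For readers following the thread, the double swap described above can be reproduced in user space. This is only a rough sketch, not driver code: hw_writeback(), cpu_load32(), sim_le32_to_cpu() and read_rptr() are made-up names, the host endianness is simulated with a flag, and it assumes the hardware swap is turned on exactly when the host is big-endian.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Simulated device write (hypothetical model): the device stores
 * dwords little-endian, and with hw_swap set it byte-swaps first.
 */
static void hw_writeback(uint8_t slot[4], uint32_t val, bool hw_swap)
{
	if (hw_swap)
		val = swab32(val);
	slot[0] = (uint8_t)(val & 0xff);
	slot[1] = (uint8_t)((val >> 8) & 0xff);
	slot[2] = (uint8_t)((val >> 16) & 0xff);
	slot[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Simulated native 32-bit load on a host of the given endianness. */
static uint32_t cpu_load32(const uint8_t slot[4], bool host_is_be)
{
	if (host_is_be)
		return ((uint32_t)slot[0] << 24) | ((uint32_t)slot[1] << 16) |
		       ((uint32_t)slot[2] << 8) | (uint32_t)slot[3];
	return (uint32_t)slot[0] | ((uint32_t)slot[1] << 8) |
	       ((uint32_t)slot[2] << 16) | ((uint32_t)slot[3] << 24);
}

/* What le32_to_cpu() does on the simulated host. */
static uint32_t sim_le32_to_cpu(uint32_t v, bool host_is_be)
{
	return host_is_be ? swab32(v) : v;
}

/*
 * Read back the rptr the way the driver would, with or without the
 * software swap. Assumption: hw swap is enabled only on BE hosts.
 */
static uint32_t read_rptr(uint32_t hw_rptr, bool host_is_be, bool sw_swap)
{
	uint8_t slot[4];
	uint32_t raw;

	hw_writeback(slot, hw_rptr, host_is_be);
	raw = cpu_load32(slot, host_is_be);
	return sw_swap ? sim_le32_to_cpu(raw, host_is_be) : raw;
}
```

In this model, read_rptr(0x1234, true, true) comes back as 0x34120000 (swapped twice), while read_rptr(0x1234, true, false) returns 0x1234, which matches the observation that the DMA'ed value has the opposite byte order from the MMIO value until the le32_to_cpu() is dropped.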
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-05 0:05 ` Alex Deucher @ 2013-12-05 1:39 ` Benjamin Herrenschmidt 2013-12-05 2:29 ` Michel Dänzer 0 siblings, 1 reply; 23+ messages in thread From: Benjamin Herrenschmidt @ 2013-12-05 1:39 UTC (permalink / raw) To: Alex Deucher Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote: > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte > > swapping for just about everything accessed by the CP (rptr writeback, > > indirect buffers, etc.). Looks like the DMA ring supports and enables > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so > > I think we can drop the swapping of the rptr writeback. > > > > Obvious patch attached. This works all the way back to r300 ? Cheers, Ben. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-05 1:39 ` Benjamin Herrenschmidt @ 2013-12-05 2:29 ` Michel Dänzer 2013-12-05 4:06 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 23+ messages in thread From: Michel Dänzer @ 2013-12-05 2:29 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo, Maling list - DRI developers On Don, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote: > On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote: > > > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte > > > swapping for just about everything accessed by the CP (rptr writeback, > > > indirect buffers, etc.). Looks like the DMA ring supports and enables > > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so > > > I think we can drop the swapping of the rptr writeback. > > > > > > > Obvious patch attached. > > This works all the way back to r300 ? I don't think so, as I have writeback working without this patch on the RV350 in this PowerBook. So I think this function needs to be split, probably between R600 and older. Also, there's more code at least potentially affected by this, e.g. in: * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(), cik_compute_ring_set_wptr(), cik_get_ih_wptr() * si_get_ih_wptr() * evergreen_get_ih_wptr() * r600_get_ih_wptr() * radeon_fence_write(), radeon_fence_read() * radeon_ring_backup() -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-05 2:29 ` Michel Dänzer @ 2013-12-05 4:06 ` Benjamin Herrenschmidt 2013-12-05 14:42 ` Alex Deucher 0 siblings, 1 reply; 23+ messages in thread From: Benjamin Herrenschmidt @ 2013-12-05 4:06 UTC (permalink / raw) To: Michel Dänzer Cc: Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo, Maling list - DRI developers On Thu, 2013-12-05 at 11:29 +0900, Michel Dänzer wrote: > On Don, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote: > > On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote: > > > > > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte > > > > swapping for just about everything accessed by the CP (rptr writeback, > > > > indirect buffers, etc.). Looks like the DMA ring supports and enables > > > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so > > > > I think we can drop the swapping of the rptr writeback. > > > > > > > > > > Obvious patch attached. > > > > This works all the way back to r300 ? > > I don't think so, as I have writeback working without this patch on the > RV350 in this PowerBook. So I think this function needs to be split, > probably between R600 and older. Or can we tell it to not swap (setup code) and continue doing le32_to_cpu in the kernel ? Sounds better to me. > Also, there's more code at least potentially affected by this, e.g. in: > > * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(), > cik_compute_ring_set_wptr(), cik_get_ih_wptr() > * si_get_ih_wptr() > * evergreen_get_ih_wptr() > * r600_get_ih_wptr() > * radeon_fence_write(), radeon_fence_read() > * radeon_ring_backup() Yeah I'd say just don't swap, write LE to memory and let the kernel use the right leXX_to_cpu. HW swapping is evil :-) Cheers, Ben. 
^ permalink raw reply [flat|nested] 23+ messages in thread
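The alternative suggested in the message above (keep memory little-endian and let the CPU side convert) can be sketched in plain C. This is an illustration under assumptions, not what the driver does: load_le32() and store_le32() are hypothetical helpers that play the role of le32_to_cpu()/cpu_to_le32() by assembling bytes explicitly, so they give the same result on any host.

```c
#include <stdint.h>

/* Hypothetical helper: store v into p[0..3] in little-endian order. */
static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Hypothetical helper: decode a little-endian dword from p[0..3]. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

The point of this shape is that the in-memory format is fixed and only the CPU-side accessor knows about host byte order; whether the hardware can actually be configured that way for every consumer is a separate question for the thread.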
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-05 4:06 ` Benjamin Herrenschmidt @ 2013-12-05 14:42 ` Alex Deucher 2013-12-06 13:58 ` Kleber Sacilotto de Souza 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-05 14:42 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Brian King, Michel Dänzer, Jerome Glisse, Maling list - DRI developers, Thadeu Lima de Souza Cascardo On Wed, Dec 4, 2013 at 11:06 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > On Thu, 2013-12-05 at 11:29 +0900, Michel Dänzer wrote: >> On Don, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote: >> > On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote: >> > >> > > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte >> > > > swapping for just about everything accessed by the CP (rptr writeback, >> > > > indirect buffers, etc.). Looks like the DMA ring supports and enables >> > > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so >> > > > I think we can drop the swapping of the rptr writeback. >> > > > >> > > >> > > Obvious patch attached. >> > >> > This works all the way back to r300 ? >> >> I don't think so, as I have writeback working without this patch on the >> RV350 in this PowerBook. So I think this function needs to be split, >> probably between R600 and older. > > Or can we tell it to not swap (setup code) and continue doing > le32_to_cpu in the kernel ? Sounds better to me. > >> Also, there's more code at least potentially affected by this, e.g. in: >> >> * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(), >> cik_compute_ring_set_wptr(), cik_get_ih_wptr() >> * si_get_ih_wptr() >> * evergreen_get_ih_wptr() >> * r600_get_ih_wptr() >> * radeon_fence_write(), radeon_fence_read() >> * radeon_ring_backup() > > Yeah I'd say just don't swap, write LE to memory and let the kernel use > the right leXX_to_cpu. 
> > HW swapping is evil :-) Well, we'd need to start swapping indirect buffers and the ring as well then which would get tricky as the CP at least does not have separate swapping controls for different things. Probably easier to fix up as appropriate for different asic families. We have function pointers for the rptr and wptr fetchers now, so we can do this pretty cleanly. Alex > > Cheers, > Ben. > > ^ permalink raw reply [flat|nested] 23+ messages in thread
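The per-family approach mentioned above could look roughly like the following. This is a stand-alone sketch with invented types (struct fake_dev, struct ring_funcs) and an invented le32_to_host() helper, not the radeon structures; it only shows how a function pointer lets older parts keep the software swap while r6xx+ parts skip it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device state used by the fetchers. */
struct fake_dev {
	bool wb_enabled;	/* writeback in use? */
	uint32_t wb[1];		/* writeback slot for the rptr */
	uint32_t reg_rptr;	/* value an MMIO read would return */
};

/* Hypothetical per-family ops table. */
struct ring_funcs {
	uint32_t (*get_rptr)(const struct fake_dev *dev);
};

/* Interpret the bytes of v as little-endian, whatever the host is. */
static uint32_t le32_to_host(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Older family: writeback slot is little-endian, swap in software. */
static uint32_t legacy_get_rptr(const struct fake_dev *dev)
{
	if (dev->wb_enabled)
		return le32_to_host(dev->wb[0]);
	return dev->reg_rptr;
}

/* Newer family: hardware already swapped, use the value as is. */
static uint32_t r6xx_get_rptr(const struct fake_dev *dev)
{
	if (dev->wb_enabled)
		return dev->wb[0];
	return dev->reg_rptr;
}

static const struct ring_funcs legacy_funcs = { .get_rptr = legacy_get_rptr };
static const struct ring_funcs r6xx_funcs = { .get_rptr = r6xx_get_rptr };
```

Callers would go through funcs->get_rptr(dev) and never need to know which family they are on, so the byte-order decision stays in one place per family.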
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-05 14:42 ` Alex Deucher @ 2013-12-06 13:58 ` Kleber Sacilotto de Souza 2013-12-06 15:59 ` Alex Deucher 0 siblings, 1 reply; 23+ messages in thread From: Kleber Sacilotto de Souza @ 2013-12-06 13:58 UTC (permalink / raw) To: Alex Deucher Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King On 12/05/2013 12:42 PM, Alex Deucher wrote: > Well, we'd need to start swapping indirect buffers and the ring as > well then which would get tricky as the CP at least does not have > separate swapping controls for different things. Probably easier to > fix up as appropriate for different asic families. We have function > pointers for the rptr and wptr fetchers now, so we can do this pretty > cleanly. > > Alex Alex, Are you going to send a patch to fix this? If not, I don't have the knowledge of which asic families will need this fix, but if you inform me what needs to be done I can write the patch. Thanks, -- Kleber Sacilotto de Souza IBM Linux Technology Center ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-06 13:58 ` Kleber Sacilotto de Souza @ 2013-12-06 15:59 ` Alex Deucher 2013-12-10 0:48 ` Alex Deucher 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-06 15:59 UTC (permalink / raw) To: Kleber Sacilotto de Souza Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King On Fri, Dec 6, 2013 at 8:58 AM, Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> wrote: > On 12/05/2013 12:42 PM, Alex Deucher wrote: >> >> Well, we'd need to start swapping indirect buffers and the ring as >> well then which would get tricky as the CP at least does not have >> separate swapping controls for different things. Probably easier to >> fix up as appropriate for different asic families. We have function >> pointers for the rptr and wptr fetchers now, so we can do this pretty >> cleanly. >> >> Alex > > > Alex, > > Are you going to send a patch to fix this? > > If not, I don't have the knowledge of which asic families will need this > fix, but if you inform me what needs to be done I can write the patch. I'll try and send a patch out in the next few days. Alex > > > Thanks, > > > -- > Kleber Sacilotto de Souza > IBM Linux Technology Center > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-06 15:59 ` Alex Deucher @ 2013-12-10 0:48 ` Alex Deucher 2013-12-10 2:20 ` Michel Dänzer 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-10 0:48 UTC (permalink / raw) To: Kleber Sacilotto de Souza Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King [-- Attachment #1: Type: text/plain, Size: 1049 bytes --] On Fri, Dec 6, 2013 at 10:59 AM, Alex Deucher <alexdeucher@gmail.com> wrote: > On Fri, Dec 6, 2013 at 8:58 AM, Kleber Sacilotto de Souza > <klebers@linux.vnet.ibm.com> wrote: >> On 12/05/2013 12:42 PM, Alex Deucher wrote: >>> >>> Well, we'd need to start swapping indirect buffers and the ring as >>> well then which would get tricky as the CP at least does not have >>> separate swapping controls for different things. Probably easier to >>> fix up as appropriate for different asic families. We have function >>> pointers for the rtpr and wptr fetchers now, so we can do this pretty >>> cleanly. >>> >>> Alex >> >> >> Alex, >> >> Are you going to send a patch to fix this? >> >> If not, I don't have the knowledge of which asic families will need this >> fix, but if you inform me what needs to be done I can write the patch. > > I'll try and send a patch out in the next few days. Patch attached. Compile tested only at the moment. Alex > > Alex > >> >> >> Thanks, >> >> >> -- >> Kleber Sacilotto de Souza >> IBM Linux Technology Center >> [-- Attachment #2: 0001-drm-radeon-remove-generic-rptr-wptr-functions.patch --] [-- Type: text/x-diff, Size: 33711 bytes --] From 89b31dd54d2818e16e239d840aeda4007d5fedc6 Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@amd.com> Date: Mon, 9 Dec 2013 19:44:30 -0500 Subject: [PATCH] drm/radeon: remove generic rptr/wptr functions Fill in asic family specific versions rather than using the generic version. This lets us handle asic specific differences more easily. 
In this case, we disable sw swapping of the rtpr writeback value on r6xx+ since the hw does it for us. Fixes bogus rptr readback on BE systems. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/radeon/cik.c | 52 ++++++++++++++++++--------- drivers/gpu/drm/radeon/cik_sdma.c | 69 ++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/radeon/evergreen.c | 3 -- drivers/gpu/drm/radeon/ni.c | 69 +++++++++++++++++++++++++++++++----- drivers/gpu/drm/radeon/ni_dma.c | 69 ++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/radeon/r100.c | 31 +++++++++++++++- drivers/gpu/drm/radeon/r600.c | 32 +++++++++++++++-- drivers/gpu/drm/radeon/r600_dma.c | 13 +++++-- drivers/gpu/drm/radeon/radeon.h | 4 +-- drivers/gpu/drm/radeon/radeon_asic.c | 66 +++++++++++++++++----------------- drivers/gpu/drm/radeon/radeon_asic.h | 57 ++++++++++++++++++++++------- drivers/gpu/drm/radeon/radeon_ring.c | 40 ++------------------- drivers/gpu/drm/radeon/rv770.c | 3 -- drivers/gpu/drm/radeon/si.c | 8 ----- 14 files changed, 386 insertions(+), 130 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index b43a3a3..df01d8a 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -4015,15 +4015,43 @@ static int cik_cp_gfx_resume(struct radeon_device *rdev) return 0; } -u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring) +u32 cik_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) { u32 rptr; + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(CP_RB0_RPTR); + + return rptr; +} + +u32 cik_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(CP_RB0_WPTR); + return wptr; +} + +void cik_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(CP_RB0_WPTR, ring->wptr); + (void)RREG32(CP_RB0_WPTR); +} + +u32 cik_compute_get_rptr(struct radeon_device *rdev, + struct 
radeon_ring *ring) +{ + u32 rptr; if (rdev->wb.enabled) { - rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); + rptr = rdev->wb.wb[ring->rptr_offs/4]; } else { mutex_lock(&rdev->srbm_mutex); cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0); @@ -4035,13 +4063,13 @@ u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, return rptr; } -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) +u32 cik_compute_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) { u32 wptr; if (rdev->wb.enabled) { - wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]); + wptr = rdev->wb.wb[ring->wptr_offs/4]; } else { mutex_lock(&rdev->srbm_mutex); cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0); @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, return wptr; } -void cik_compute_ring_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) +void cik_compute_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) { rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr); WDOORBELL32(ring->doorbell_index, ring->wptr); @@ -7625,7 +7653,6 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7634,7 +7661,6 @@ static int cik_startup(struct radeon_device *rdev) /* type-2 packets are deprecated on MEC, use type-3 instead */ ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET, - CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7646,7 +7672,6 @@ static int cik_startup(struct radeon_device *rdev) /* type-2 packets are deprecated on MEC, use type-3 instead */ ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, 
RADEON_WB_CP2_RPTR_OFFSET, - CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7658,16 +7683,12 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET, - SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET, SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET, - SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET, SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0)); if (r) return r; @@ -7683,7 +7704,6 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c index 0300727..f0f9e10 100644 --- a/drivers/gpu/drm/radeon/cik_sdma.c +++ b/drivers/gpu/drm/radeon/cik_sdma.c @@ -52,6 +52,75 @@ u32 cik_gpu_check_soft_reset(struct radeon_device *rdev); */ /** + * cik_sdma_get_rptr - get the current read pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current rptr from the hardware (CIK+). 
+ */ +uint32_t cik_sdma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr, reg; + + if (rdev->wb.enabled) { + rptr = rdev->wb.wb[ring->rptr_offs/4]; + } else { + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET; + + rptr = RREG32(reg); + } + + return (rptr & 0x3fffc) >> 2; +} + +/** + * cik_sdma_get_wptr - get the current write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current wptr from the hardware (CIK+). + */ +uint32_t cik_sdma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET; + + return (RREG32(reg) & 0x3fffc) >> 2; +} + +/** + * cik_sdma_set_wptr - commit the write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Write the wptr back to the hardware (CIK+). 
+ */ +void cik_sdma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET; + + WREG32(reg, (ring->wptr << 2) & 0x3fffc); +} + +/** * cik_sdma_ring_ib_execute - Schedule an IB on the DMA engine * * @rdev: radeon_device pointer diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 9702e55..292749a 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -5199,14 +5199,12 @@ static int evergreen_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0)); if (r) return r; @@ -5224,7 +5222,6 @@ static int evergreen_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index 11aab2a..4861dd5 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -1386,6 +1386,55 @@ static void cayman_cp_enable(struct radeon_device *rdev, bool enable) } } +u32 cayman_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else { + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) + rptr = RREG32(CP_RB0_RPTR); + else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) + rptr = RREG32(CP_RB1_RPTR); + else + rptr = 
RREG32(CP_RB2_RPTR); + } + + return rptr; +} + +u32 cayman_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) + wptr = RREG32(CP_RB0_WPTR); + else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) + wptr = RREG32(CP_RB1_WPTR); + else + wptr = RREG32(CP_RB2_WPTR); + + return wptr; +} + +void cayman_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) { + WREG32(CP_RB0_WPTR, ring->wptr); + (void)RREG32(CP_RB0_WPTR); + } else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) { + WREG32(CP_RB1_WPTR, ring->wptr); + (void)RREG32(CP_RB1_WPTR); + } else { + WREG32(CP_RB2_WPTR, ring->wptr); + (void)RREG32(CP_RB2_WPTR); + } +} + static int cayman_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -1514,6 +1563,16 @@ static int cayman_cp_resume(struct radeon_device *rdev) CP_RB1_BASE, CP_RB2_BASE }; + static const unsigned cp_rb_rptr[] = { + CP_RB0_RPTR, + CP_RB1_RPTR, + CP_RB2_RPTR + }; + static const unsigned cp_rb_wptr[] = { + CP_RB0_WPTR, + CP_RB1_WPTR, + CP_RB2_WPTR + }; struct radeon_ring *ring; int i, r; @@ -1572,8 +1631,8 @@ static int cayman_cp_resume(struct radeon_device *rdev) WREG32_P(cp_rb_cntl[i], RB_RPTR_WR_ENA, ~RB_RPTR_WR_ENA); ring->rptr = ring->wptr = 0; - WREG32(ring->rptr_reg, ring->rptr); - WREG32(ring->wptr_reg, ring->wptr); + WREG32(cp_rb_rptr[i], ring->rptr); + WREG32(cp_rb_wptr[i], ring->wptr); mdelay(1); WREG32_P(cp_rb_cntl[i], 0, ~RB_RPTR_WR_ENA); @@ -1969,23 +2028,18 @@ static int cayman_startup(struct radeon_device *rdev) evergreen_irq_set(rdev); r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR + DMA0_REGISTER_OFFSET, - DMA_RB_WPTR + DMA0_REGISTER_OFFSET, 
DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - DMA_RB_RPTR + DMA1_REGISTER_OFFSET, - DMA_RB_WPTR + DMA1_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; @@ -2004,7 +2058,6 @@ static int cayman_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c index bdeb65e..51424ab 100644 --- a/drivers/gpu/drm/radeon/ni_dma.c +++ b/drivers/gpu/drm/radeon/ni_dma.c @@ -43,6 +43,75 @@ u32 cayman_gpu_check_soft_reset(struct radeon_device *rdev); */ /** + * cayman_dma_get_rptr - get the current read pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current rptr from the hardware (cayman+). + */ +uint32_t cayman_dma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr, reg; + + if (rdev->wb.enabled) { + rptr = rdev->wb.wb[ring->rptr_offs/4]; + } else { + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_RPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_RPTR + DMA1_REGISTER_OFFSET; + + rptr = RREG32(reg); + } + + return (rptr & 0x3fffc) >> 2; +} + +/** + * cayman_dma_get_wptr - get the current write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current wptr from the hardware (cayman+). 
+ */ +uint32_t cayman_dma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET; + + return (RREG32(reg) & 0x3fffc) >> 2; +} + +/** + * cayman_dma_set_wptr - commit the write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Write the wptr back to the hardware (cayman+). + */ +void cayman_dma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET; + + WREG32(reg, (ring->wptr << 2) & 0x3fffc); +} + +/** * cayman_dma_ring_ib_execute - Schedule an IB on the DMA engine * * @rdev: radeon_device pointer diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index 10abc4d..ef1791b 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -1050,6 +1050,36 @@ static int r100_cp_init_microcode(struct radeon_device *rdev) return err; } +u32 r100_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); + else + rptr = RREG32(RADEON_CP_RB_RPTR); + + return rptr; +} + +u32 r100_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(RADEON_CP_RB_WPTR); + + return wptr; +} + +void r100_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(RADEON_CP_RB_WPTR, ring->wptr); + (void)RREG32(RADEON_CP_RB_WPTR); +} + static void r100_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -1102,7 +1132,6 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size) ring_size = (1 << (rb_bufsz + 1)) * 4; r100_cp_load_microcode(rdev); r = radeon_ring_init(rdev, ring, ring_size, 
RADEON_WB_CP_RPTR_OFFSET, - RADEON_CP_RB_RPTR, RADEON_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) { return r; diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 9ad0673..a9034d8 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2382,6 +2382,36 @@ out: return err; } +u32 r600_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(R600_CP_RB_RPTR); + + return rptr; +} + +u32 r600_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(R600_CP_RB_WPTR); + + return wptr; +} + +void r600_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(R600_CP_RB_WPTR, ring->wptr); + (void)RREG32(R600_CP_RB_WPTR); +} + static int r600_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -2826,14 +2856,12 @@ static int r600_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c index 7844d15..3452c84 100644 --- a/drivers/gpu/drm/radeon/r600_dma.c +++ b/drivers/gpu/drm/radeon/r600_dma.c @@ -51,7 +51,14 @@ u32 r600_gpu_check_soft_reset(struct radeon_device *rdev); uint32_t r600_dma_get_rptr(struct radeon_device *rdev, struct radeon_ring *ring) { - return (radeon_ring_generic_get_rptr(rdev, ring) & 0x3fffc) >> 2; + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(DMA_RB_RPTR); + + return (rptr & 0x3fffc) >> 2; } 
/** @@ -65,7 +72,7 @@ uint32_t r600_dma_get_rptr(struct radeon_device *rdev, uint32_t r600_dma_get_wptr(struct radeon_device *rdev, struct radeon_ring *ring) { - return (RREG32(ring->wptr_reg) & 0x3fffc) >> 2; + return (RREG32(DMA_RB_WPTR) & 0x3fffc) >> 2; } /** @@ -79,7 +86,7 @@ uint32_t r600_dma_get_wptr(struct radeon_device *rdev, void r600_dma_set_wptr(struct radeon_device *rdev, struct radeon_ring *ring) { - WREG32(ring->wptr_reg, (ring->wptr << 2) & 0x3fffc); + WREG32(DMA_RB_WPTR, (ring->wptr << 2) & 0x3fffc); } /** diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index b1f990d..e7c02a7 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -779,13 +779,11 @@ struct radeon_ring { volatile uint32_t *ring; unsigned rptr; unsigned rptr_offs; - unsigned rptr_reg; unsigned rptr_save_reg; u64 next_rptr_gpu_addr; volatile u32 *next_rptr_cpu_addr; unsigned wptr; unsigned wptr_old; - unsigned wptr_reg; unsigned ring_size; unsigned ring_free_dw; int count_dw; @@ -949,7 +947,7 @@ unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring *ring int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring, unsigned size, uint32_t *data); int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ring_size, - unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop); + unsigned rptr_offs, u32 nop); void radeon_ring_fini(struct radeon_device *rdev, struct radeon_ring *cp); diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c index c0425bb..0447604 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.c +++ b/drivers/gpu/drm/radeon/radeon_asic.c @@ -182,9 +182,9 @@ static struct radeon_asic_ring r100_gfx_ring = { .ring_test = &r100_ring_test, .ib_test = &r100_ib_test, .is_lockup = &r100_gpu_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = 
&radeon_ring_generic_set_wptr, + .get_rptr = &r100_gfx_get_rptr, + .get_wptr = &r100_gfx_get_wptr, + .set_wptr = &r100_gfx_set_wptr, }; static struct radeon_asic r100_asic = { @@ -330,9 +330,9 @@ static struct radeon_asic_ring r300_gfx_ring = { .ring_test = &r100_ring_test, .ib_test = &r100_ib_test, .is_lockup = &r100_gpu_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r100_gfx_get_rptr, + .get_wptr = &r100_gfx_get_wptr, + .set_wptr = &r100_gfx_set_wptr, }; static struct radeon_asic r300_asic = { @@ -883,9 +883,9 @@ static struct radeon_asic_ring r600_gfx_ring = { .ring_test = &r600_ring_test, .ib_test = &r600_ib_test, .is_lockup = &r600_gfx_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r600_gfx_get_rptr, + .get_wptr = &r600_gfx_get_wptr, + .set_wptr = &r600_gfx_set_wptr, }; static struct radeon_asic_ring r600_dma_ring = { @@ -1267,9 +1267,9 @@ static struct radeon_asic_ring evergreen_gfx_ring = { .ring_test = &r600_ring_test, .ib_test = &r600_ib_test, .is_lockup = &evergreen_gfx_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r600_gfx_get_rptr, + .get_wptr = &r600_gfx_get_wptr, + .set_wptr = &r600_gfx_set_wptr, }; static struct radeon_asic_ring evergreen_dma_ring = { @@ -1570,9 +1570,9 @@ static struct radeon_asic_ring cayman_gfx_ring = { .ib_test = &r600_ib_test, .is_lockup = &cayman_gfx_is_lockup, .vm_flush = &cayman_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cayman_gfx_get_rptr, + .get_wptr = &cayman_gfx_get_wptr, + .set_wptr = &cayman_gfx_set_wptr, }; static struct radeon_asic_ring cayman_dma_ring = { @@ 
-1585,9 +1585,9 @@ static struct radeon_asic_ring cayman_dma_ring = { .ib_test = &r600_dma_ib_test, .is_lockup = &cayman_dma_is_lockup, .vm_flush = &cayman_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr + .get_rptr = &cayman_dma_get_rptr, + .get_wptr = &cayman_dma_get_wptr, + .set_wptr = &cayman_dma_set_wptr }; static struct radeon_asic_ring cayman_uvd_ring = { @@ -1813,9 +1813,9 @@ static struct radeon_asic_ring si_gfx_ring = { .ib_test = &r600_ib_test, .is_lockup = &si_gfx_is_lockup, .vm_flush = &si_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cayman_gfx_get_rptr, + .get_wptr = &cayman_gfx_get_wptr, + .set_wptr = &cayman_gfx_set_wptr, }; static struct radeon_asic_ring si_dma_ring = { @@ -1828,9 +1828,9 @@ static struct radeon_asic_ring si_dma_ring = { .ib_test = &r600_dma_ib_test, .is_lockup = &si_dma_is_lockup, .vm_flush = &si_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr, + .get_rptr = &cayman_dma_get_rptr, + .get_wptr = &cayman_dma_get_wptr, + .set_wptr = &cayman_dma_set_wptr, }; static struct radeon_asic si_asic = { @@ -1943,9 +1943,9 @@ static struct radeon_asic_ring ci_gfx_ring = { .ib_test = &cik_ib_test, .is_lockup = &cik_gfx_is_lockup, .vm_flush = &cik_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cik_gfx_get_rptr, + .get_wptr = &cik_gfx_get_wptr, + .set_wptr = &cik_gfx_set_wptr, }; static struct radeon_asic_ring ci_cp_ring = { @@ -1958,9 +1958,9 @@ static struct radeon_asic_ring ci_cp_ring = { .ib_test = &cik_ib_test, .is_lockup = &cik_gfx_is_lockup, .vm_flush = &cik_vm_flush, - .get_rptr = &cik_compute_ring_get_rptr, - .get_wptr = &cik_compute_ring_get_wptr, - .set_wptr = 
&cik_compute_ring_set_wptr, + .get_rptr = &cik_compute_get_rptr, + .get_wptr = &cik_compute_get_wptr, + .set_wptr = &cik_compute_set_wptr, }; static struct radeon_asic_ring ci_dma_ring = { @@ -1973,9 +1973,9 @@ static struct radeon_asic_ring ci_dma_ring = { .ib_test = &cik_sdma_ib_test, .is_lockup = &cik_sdma_is_lockup, .vm_flush = &cik_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr, + .get_rptr = &cik_sdma_get_rptr, + .get_wptr = &cik_sdma_get_wptr, + .set_wptr = &cik_sdma_set_wptr, }; static struct radeon_asic ci_asic = { diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h index c9fd97b..941e7eb 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.h +++ b/drivers/gpu/drm/radeon/radeon_asic.h @@ -47,13 +47,6 @@ u8 atombios_get_backlight_level(struct radeon_encoder *radeon_encoder); void radeon_legacy_set_backlight_level(struct radeon_encoder *radeon_encoder, u8 level); u8 radeon_legacy_get_backlight_level(struct radeon_encoder *radeon_encoder); -u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring); -u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); -void radeon_ring_generic_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); - /* * r100,rv100,rs100,rv200,rs200 */ @@ -148,6 +141,13 @@ extern void r100_post_page_flip(struct radeon_device *rdev, int crtc); extern void r100_wait_for_vblank(struct radeon_device *rdev, int crtc); extern int r100_mc_wait_for_idle(struct radeon_device *rdev); +u32 r100_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 r100_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void r100_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); + /* * r200,rv250,rs300,rv280 */ @@ -368,6 +368,12 @@ int r600_mc_wait_for_idle(struct radeon_device *rdev); int r600_pcie_gart_init(struct radeon_device 
*rdev); void r600_scratch_init(struct radeon_device *rdev); int r600_init_microcode(struct radeon_device *rdev); +u32 r600_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 r600_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void r600_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); /* r600 irq */ int r600_irq_process(struct radeon_device *rdev); int r600_irq_init(struct radeon_device *rdev); @@ -591,6 +597,19 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, void cayman_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); +u32 cayman_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cayman_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cayman_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +uint32_t cayman_dma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +uint32_t cayman_dma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cayman_dma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); + int ni_dpm_init(struct radeon_device *rdev); void ni_dpm_setup_asic(struct radeon_device *rdev); int ni_dpm_enable(struct radeon_device *rdev); @@ -739,12 +758,24 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, uint32_t incr, uint32_t flags); void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); -u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring); -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); -void cik_compute_ring_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); +u32 cik_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_gfx_set_wptr(struct 
radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_compute_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_compute_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_compute_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_sdma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_sdma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_sdma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); int ci_get_temp(struct radeon_device *rdev); int kv_get_temp(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 9214403..b1771a6 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -332,36 +332,6 @@ bool radeon_ring_supports_scratch_reg(struct radeon_device *rdev, } } -u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - u32 rptr; - - if (rdev->wb.enabled) - rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); - else - rptr = RREG32(ring->rptr_reg); - - return rptr; -} - -u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - u32 wptr; - - wptr = RREG32(ring->wptr_reg); - - return wptr; -} - -void radeon_ring_generic_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - WREG32(ring->wptr_reg, ring->wptr); - (void)RREG32(ring->wptr_reg); -} - /** * radeon_ring_free_size - update the free size * @@ -689,22 +659,18 @@ int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring, * @ring: radeon_ring structure holding ring information * @ring_size: size of the ring * @rptr_offs: offset of the rptr writeback location in the WB buffer - * @rptr_reg: MMIO offset of the rptr register - * @wptr_reg: MMIO offset of the wptr register * @nop: nop packet for this ring * * Initialize the driver information for the selected 
ring (all asics). * Returns 0 on success, error on failure. */ int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsigned ring_size, - unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop) + unsigned rptr_offs, u32 nop) { int r; ring->ring_size = ring_size; ring->rptr_offs = rptr_offs; - ring->rptr_reg = rptr_reg; - ring->wptr_reg = wptr_reg; ring->nop = nop; /* Allocate ring buffer */ if (ring->ring_obj == NULL) { @@ -796,9 +762,9 @@ static int radeon_debugfs_ring_info(struct seq_file *m, void *data) radeon_ring_free_size(rdev, ring); count = (ring->ring_size / 4) - ring->ring_free_dw; tmp = radeon_ring_get_wptr(rdev, ring); - seq_printf(m, "wptr(0x%04x): 0x%08x [%5d]\n", ring->wptr_reg, tmp, tmp); + seq_printf(m, "wptr: 0x%08x [%5d]\n", tmp, tmp); tmp = radeon_ring_get_rptr(rdev, ring); - seq_printf(m, "rptr(0x%04x): 0x%08x [%5d]\n", ring->rptr_reg, tmp, tmp); + seq_printf(m, "rptr: 0x%08x [%5d]\n", tmp, tmp); if (ring->rptr_save_reg) { seq_printf(m, "rptr next(0x%04x): 0x%08x\n", ring->rptr_save_reg, RREG32(ring->rptr_save_reg)); diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index 9f58467..6cce0de 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1728,14 +1728,12 @@ static int rv770_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; @@ -1754,7 +1752,6 @@ static int rv770_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); 
if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c index a36736d..0a71a2e 100644 --- a/drivers/gpu/drm/radeon/si.c +++ b/drivers/gpu/drm/radeon/si.c @@ -6419,37 +6419,30 @@ static int si_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET, - CP_RB1_RPTR, CP_RB1_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET, - CP_RB2_RPTR, CP_RB2_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR + DMA0_REGISTER_OFFSET, - DMA_RB_WPTR + DMA0_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - DMA_RB_RPTR + DMA1_REGISTER_OFFSET, - DMA_RB_WPTR + DMA1_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0)); if (r) return r; @@ -6469,7 +6462,6 @@ static int si_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); -- 1.8.3.1 [-- Attachment #3: Type: text/plain, Size: 159 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 23+ messages in thread
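The DMA accessors in the patch above shift and mask the ring pointer with 0x3fffc on every read and write. As a rough sketch of that round trip, assuming the CPU-side pointer is a dword index and the hardware value is a byte offset (the helper names dma_encode_wptr and dma_decode_ptr are hypothetical and do not exist in the driver):

```c
#include <stdint.h>

/* Mask copied from the r600/cayman/cik DMA accessors in the patch. */
#define DMA_PTR_MASK 0x3fffcu

/* Hypothetical helper: dword index -> value written to the WPTR register. */
static uint32_t dma_encode_wptr(uint32_t wptr_dw)
{
	return (wptr_dw << 2) & DMA_PTR_MASK;
}

/* Hypothetical helper: register or writeback value -> dword index. */
static uint32_t dma_decode_ptr(uint32_t raw)
{
	return (raw & DMA_PTR_MASK) >> 2;
}
```

The mask drops the two low bits and anything above bit 17, so an encoded value decodes back to the same dword index as long as the index fits in 16 bits.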
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-10 0:48 ` Alex Deucher @ 2013-12-10 2:20 ` Michel Dänzer 2013-12-10 15:04 ` Alex Deucher 0 siblings, 1 reply; 23+ messages in thread From: Michel Dänzer @ 2013-12-10 2:20 UTC (permalink / raw) To: Alex Deucher Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote: > > -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, > - struct radeon_ring *ring) > +u32 cik_compute_get_wptr(struct radeon_device *rdev, > + struct radeon_ring *ring) > { > u32 wptr; > > if (rdev->wb.enabled) { > - wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]); > + wptr = rdev->wb.wb[ring->wptr_offs/4]; > } else { > mutex_lock(&rdev->srbm_mutex); > cik_srbm_select(rdev, ring->me, ring->pipe, > ring->queue, 0); > @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct > radeon_device *rdev, > return wptr; > } > > -void cik_compute_ring_set_wptr(struct radeon_device *rdev, > - struct radeon_ring *ring) > +void cik_compute_set_wptr(struct radeon_device *rdev, > + struct radeon_ring *ring) > { > rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr); I think this cpu_to_le32() needs to be dropped as well to match cik_compute_ring_get_wptr(). > diff --git a/drivers/gpu/drm/radeon/radeon.h > b/drivers/gpu/drm/radeon/radeon.h > index b1f990d..e7c02a7 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -779,13 +779,11 @@ struct radeon_ring { > volatile uint32_t *ring; > unsigned rptr; > unsigned rptr_offs; > - unsigned rptr_reg; > unsigned rptr_save_reg; > u64 next_rptr_gpu_addr; > volatile u32 *next_rptr_cpu_addr; > unsigned wptr; > unsigned wptr_old; > - unsigned wptr_reg; What's the motivation for removing these? Seems like keeping them would allow keeping this patch and the resulting code smaller (fewer function variants) and cleaner (no if/else/... 
for the register offsets in the variants). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
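To illustrate the review comment about the leftover cpu_to_le32(): the getter and setter for the writeback slot have to agree on whether they swap. The sketch below models a big-endian host in plain C, where the conversion macros amount to a byte swap. All names here (swab32_model, be_cpu_to_le32, wb_slot, the roundtrip_* functions) are made up for the illustration and are not driver code.

```c
#include <stdint.h>

/* Unconditional 32-bit byte swap. */
static uint32_t swab32_model(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of cpu_to_le32()/le32_to_cpu() on a big-endian host: always swap. */
static uint32_t be_cpu_to_le32(uint32_t v) { return swab32_model(v); }
static uint32_t be_le32_to_cpu(uint32_t v) { return swab32_model(v); }

/* One-slot stand-in for rdev->wb.wb[ring->wptr_offs/4]. */
static uint32_t wb_slot;

/* Mismatched pair: the setter swaps, the getter does not. */
static uint32_t roundtrip_mismatched(uint32_t wptr)
{
	wb_slot = be_cpu_to_le32(wptr);
	return wb_slot;
}

/* Matched pair without swapping on either side. */
static uint32_t roundtrip_matched(uint32_t wptr)
{
	wb_slot = wptr;
	return wb_slot;
}

/* Matched pair with swapping on both sides. */
static uint32_t roundtrip_both_swap(uint32_t wptr)
{
	wb_slot = be_cpu_to_le32(wptr);
	return be_le32_to_cpu(wb_slot);
}
```

Either matched pair returns the value that was stored; only the mismatched pair hands back a byte-swapped pointer. Which matched variant is right for a given ring depends on what the hardware does with the slot, which this model does not attempt to capture.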
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-10 2:20 ` Michel Dänzer @ 2013-12-10 15:04 ` Alex Deucher 2013-12-10 15:12 ` Alex Deucher 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-10 15:04 UTC (permalink / raw) To: Michel Dänzer Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote: > On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote: >> >> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, >> - struct radeon_ring *ring) >> +u32 cik_compute_get_wptr(struct radeon_device *rdev, >> + struct radeon_ring *ring) >> { >> u32 wptr; >> >> if (rdev->wb.enabled) { >> - wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]); >> + wptr = rdev->wb.wb[ring->wptr_offs/4]; >> } else { >> mutex_lock(&rdev->srbm_mutex); >> cik_srbm_select(rdev, ring->me, ring->pipe, >> ring->queue, 0); >> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct >> radeon_device *rdev, >> return wptr; >> } >> >> -void cik_compute_ring_set_wptr(struct radeon_device *rdev, >> - struct radeon_ring *ring) >> +void cik_compute_set_wptr(struct radeon_device *rdev, >> + struct radeon_ring *ring) >> { >> rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr); > > I think this cpu_to_le32() needs to be dropped as well to match > cik_compute_ring_get_wptr(). whoops, yeah, missed that one. > > >> diff --git a/drivers/gpu/drm/radeon/radeon.h >> b/drivers/gpu/drm/radeon/radeon.h >> index b1f990d..e7c02a7 100644 >> --- a/drivers/gpu/drm/radeon/radeon.h >> +++ b/drivers/gpu/drm/radeon/radeon.h >> @@ -779,13 +779,11 @@ struct radeon_ring { >> volatile uint32_t *ring; >> unsigned rptr; >> unsigned rptr_offs; >> - unsigned rptr_reg; >> unsigned rptr_save_reg; >> u64 next_rptr_gpu_addr; >> volatile u32 *next_rptr_cpu_addr; >> unsigned wptr; >> unsigned wptr_old; >> - unsigned wptr_reg; > > What's the motivation for removing these? 
Seems like keeping them would > allow keeping this patch and the resulting code smaller (fewer function > variants) and cleaner (no if/else/... for the register offsets in the > variants). I was trying to remove the asic specific info out of the ring struct and into the asic specific functions. Alex
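The trade-off discussed in this exchange can be sketched in isolation. The code below is a simplified model, not the driver: fake_mmio, the FAKE_* offsets, and both struct types are invented for the example. Option A keeps the register offset in the ring struct and uses one generic setter; option B drops the field and lets a per-family setter pick the register from the ring index.

```c
#include <stdint.h>

/* Hypothetical register offsets into a fake MMIO array; not real values. */
enum {
	FAKE_RB0_WPTR = 0,
	FAKE_RB1_WPTR = 1,
	FAKE_RB2_WPTR = 2,
	FAKE_NUM_REGS = 3
};

static uint32_t fake_mmio[FAKE_NUM_REGS];

/* Option A: the ring carries its own register offset. */
struct ring_with_reg {
	unsigned idx;
	uint32_t wptr;
	unsigned wptr_reg;
};

/* Option B: the ring carries no asic-specific register information. */
struct ring_plain {
	unsigned idx;
	uint32_t wptr;
};

/* Option A: a single generic setter works for every ring. */
static void generic_set_wptr(const struct ring_with_reg *ring)
{
	fake_mmio[ring->wptr_reg] = ring->wptr;
}

/* Option B: a per-family setter selects the register by ring index. */
static void family_set_wptr(const struct ring_plain *ring)
{
	unsigned reg;

	if (ring->idx == 0)
		reg = FAKE_RB0_WPTR;
	else if (ring->idx == 1)
		reg = FAKE_RB1_WPTR;
	else
		reg = FAKE_RB2_WPTR;

	fake_mmio[reg] = ring->wptr;
}
```

Option A needs fewer function variants, while option B keeps the shared struct free of per-asic details at the cost of an if/else chain in each variant.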
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc 2013-12-10 15:04 ` Alex Deucher @ 2013-12-10 15:12 ` Alex Deucher 2014-01-02 20:54 ` Kleber Sacilotto de Souza 0 siblings, 1 reply; 23+ messages in thread From: Alex Deucher @ 2013-12-10 15:12 UTC (permalink / raw) To: Michel Dänzer Cc: Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King [-- Attachment #1: Type: text/plain, Size: 2585 bytes --] On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com> wrote: > On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote: >> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote: >>> >>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, >>> - struct radeon_ring *ring) >>> +u32 cik_compute_get_wptr(struct radeon_device *rdev, >>> + struct radeon_ring *ring) >>> { >>> u32 wptr; >>> >>> if (rdev->wb.enabled) { >>> - wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]); >>> + wptr = rdev->wb.wb[ring->wptr_offs/4]; >>> } else { >>> mutex_lock(&rdev->srbm_mutex); >>> cik_srbm_select(rdev, ring->me, ring->pipe, >>> ring->queue, 0); >>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct >>> radeon_device *rdev, >>> return wptr; >>> } >>> >>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev, >>> - struct radeon_ring *ring) >>> +void cik_compute_set_wptr(struct radeon_device *rdev, >>> + struct radeon_ring *ring) >>> { >>> rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr); >> >> I think this cpu_to_le32() needs to be dropped as well to match >> cik_compute_ring_get_wptr(). > > whoops, yeah, missed that one. Updated patch attached. 
Alex > >> >> >>> diff --git a/drivers/gpu/drm/radeon/radeon.h >>> b/drivers/gpu/drm/radeon/radeon.h >>> index b1f990d..e7c02a7 100644 >>> --- a/drivers/gpu/drm/radeon/radeon.h >>> +++ b/drivers/gpu/drm/radeon/radeon.h >>> @@ -779,13 +779,11 @@ struct radeon_ring { >>> volatile uint32_t *ring; >>> unsigned rptr; >>> unsigned rptr_offs; >>> - unsigned rptr_reg; >>> unsigned rptr_save_reg; >>> u64 next_rptr_gpu_addr; >>> volatile u32 *next_rptr_cpu_addr; >>> unsigned wptr; >>> unsigned wptr_old; >>> - unsigned wptr_reg; >> >> What's the motivation for removing these? Seems like keeping them would >> allow keeping this patch and the resulting code smaller (fewer function >> variants) and cleaner (no if/else/... for the register offsets in the >> variants). > > I was trying to remove the asic specific info out of the ring struct > and into the asic specific functions. > > Alex [-- Attachment #2: 0001-drm-radeon-remove-generic-rptr-wptr-functions-v2.patch --] [-- Type: text/x-diff, Size: 33918 bytes --] From 44d447981cd247aea135ca864dc205d304cf3e5f Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@amd.com> Date: Mon, 9 Dec 2013 19:44:30 -0500 Subject: [PATCH] drm/radeon: remove generic rptr/wptr functions (v2) Fill in asic family specific versions rather than using the generic version. This lets us handle asic specific differences more easily. In this case, we disable sw swapping of the rptr writeback value on r6xx+ since the hw does it for us. Fixes bogus rptr readback on BE systems. 
v2: remove missed cpu_to_le32(), add comments Signed-off-by: Alex Deucher <alexander.deucher@amd.com> --- drivers/gpu/drm/radeon/cik.c | 56 ++++++++++++++++++++--------- drivers/gpu/drm/radeon/cik_sdma.c | 69 ++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/radeon/evergreen.c | 3 -- drivers/gpu/drm/radeon/ni.c | 69 +++++++++++++++++++++++++++++++----- drivers/gpu/drm/radeon/ni_dma.c | 69 ++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/radeon/r100.c | 31 +++++++++++++++- drivers/gpu/drm/radeon/r600.c | 32 +++++++++++++++-- drivers/gpu/drm/radeon/r600_dma.c | 13 +++++-- drivers/gpu/drm/radeon/radeon.h | 4 +-- drivers/gpu/drm/radeon/radeon_asic.c | 66 +++++++++++++++++----------------- drivers/gpu/drm/radeon/radeon_asic.h | 57 ++++++++++++++++++++++------- drivers/gpu/drm/radeon/radeon_ring.c | 40 ++------------------- drivers/gpu/drm/radeon/rv770.c | 3 -- drivers/gpu/drm/radeon/si.c | 8 ----- 14 files changed, 389 insertions(+), 131 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index b43a3a3..9d7db10 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -4015,15 +4015,43 @@ static int cik_cp_gfx_resume(struct radeon_device *rdev) return 0; } -u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring) +u32 cik_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) { u32 rptr; + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(CP_RB0_RPTR); + + return rptr; +} + +u32 cik_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(CP_RB0_WPTR); + return wptr; +} + +void cik_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(CP_RB0_WPTR, ring->wptr); + (void)RREG32(CP_RB0_WPTR); +} + +u32 cik_compute_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; if (rdev->wb.enabled) { - rptr = 
le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); + rptr = rdev->wb.wb[ring->rptr_offs/4]; } else { mutex_lock(&rdev->srbm_mutex); cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0); @@ -4035,13 +4063,14 @@ u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, return rptr; } -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) +u32 cik_compute_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) { u32 wptr; if (rdev->wb.enabled) { - wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]); + /* XXX check if swapping is necessary on BE */ + wptr = rdev->wb.wb[ring->wptr_offs/4]; } else { mutex_lock(&rdev->srbm_mutex); cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0); @@ -4053,10 +4082,11 @@ u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, return wptr; } -void cik_compute_ring_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) +void cik_compute_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) { - rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr); + /* XXX check if swapping is necessary on BE */ + rdev->wb.wb[ring->wptr_offs/4] = ring->wptr; WDOORBELL32(ring->doorbell_index, ring->wptr); } @@ -7625,7 +7655,6 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7634,7 +7663,6 @@ static int cik_startup(struct radeon_device *rdev) /* type-2 packets are deprecated on MEC, use type-3 instead */ ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET, - CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7646,7 +7674,6 @@ static int cik_startup(struct radeon_device *rdev) /* type-2 packets are deprecated on MEC, use type-3 instead */ ring = 
&rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET, - CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR, PACKET3(PACKET3_NOP, 0x3FFF)); if (r) return r; @@ -7658,16 +7685,12 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET, - SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET, SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET, - SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET, SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0)); if (r) return r; @@ -7683,7 +7706,6 @@ static int cik_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c index 0300727..f0f9e10 100644 --- a/drivers/gpu/drm/radeon/cik_sdma.c +++ b/drivers/gpu/drm/radeon/cik_sdma.c @@ -52,6 +52,75 @@ u32 cik_gpu_check_soft_reset(struct radeon_device *rdev); */ /** + * cik_sdma_get_rptr - get the current read pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current rptr from the hardware (CIK+). 
+ */ +uint32_t cik_sdma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr, reg; + + if (rdev->wb.enabled) { + rptr = rdev->wb.wb[ring->rptr_offs/4]; + } else { + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET; + + rptr = RREG32(reg); + } + + return (rptr & 0x3fffc) >> 2; +} + +/** + * cik_sdma_get_wptr - get the current write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current wptr from the hardware (CIK+). + */ +uint32_t cik_sdma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET; + + return (RREG32(reg) & 0x3fffc) >> 2; +} + +/** + * cik_sdma_set_wptr - commit the write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Write the wptr back to the hardware (CIK+). 
+ */ +void cik_sdma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET; + else + reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET; + + WREG32(reg, (ring->wptr << 2) & 0x3fffc); +} + +/** * cik_sdma_ring_ib_execute - Schedule an IB on the DMA engine * * @rdev: radeon_device pointer diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 9702e55..292749a 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -5199,14 +5199,12 @@ static int evergreen_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0)); if (r) return r; @@ -5224,7 +5222,6 @@ static int evergreen_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index 11aab2a..4861dd5 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -1386,6 +1386,55 @@ static void cayman_cp_enable(struct radeon_device *rdev, bool enable) } } +u32 cayman_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else { + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) + rptr = RREG32(CP_RB0_RPTR); + else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) + rptr = RREG32(CP_RB1_RPTR); + else + rptr = 
RREG32(CP_RB2_RPTR); + } + + return rptr; +} + +u32 cayman_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) + wptr = RREG32(CP_RB0_WPTR); + else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) + wptr = RREG32(CP_RB1_WPTR); + else + wptr = RREG32(CP_RB2_WPTR); + + return wptr; +} + +void cayman_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) { + WREG32(CP_RB0_WPTR, ring->wptr); + (void)RREG32(CP_RB0_WPTR); + } else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) { + WREG32(CP_RB1_WPTR, ring->wptr); + (void)RREG32(CP_RB1_WPTR); + } else { + WREG32(CP_RB2_WPTR, ring->wptr); + (void)RREG32(CP_RB2_WPTR); + } +} + static int cayman_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -1514,6 +1563,16 @@ static int cayman_cp_resume(struct radeon_device *rdev) CP_RB1_BASE, CP_RB2_BASE }; + static const unsigned cp_rb_rptr[] = { + CP_RB0_RPTR, + CP_RB1_RPTR, + CP_RB2_RPTR + }; + static const unsigned cp_rb_wptr[] = { + CP_RB0_WPTR, + CP_RB1_WPTR, + CP_RB2_WPTR + }; struct radeon_ring *ring; int i, r; @@ -1572,8 +1631,8 @@ static int cayman_cp_resume(struct radeon_device *rdev) WREG32_P(cp_rb_cntl[i], RB_RPTR_WR_ENA, ~RB_RPTR_WR_ENA); ring->rptr = ring->wptr = 0; - WREG32(ring->rptr_reg, ring->rptr); - WREG32(ring->wptr_reg, ring->wptr); + WREG32(cp_rb_rptr[i], ring->rptr); + WREG32(cp_rb_wptr[i], ring->wptr); mdelay(1); WREG32_P(cp_rb_cntl[i], 0, ~RB_RPTR_WR_ENA); @@ -1969,23 +2028,18 @@ static int cayman_startup(struct radeon_device *rdev) evergreen_irq_set(rdev); r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR + DMA0_REGISTER_OFFSET, - DMA_RB_WPTR + DMA0_REGISTER_OFFSET, 
DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - DMA_RB_RPTR + DMA1_REGISTER_OFFSET, - DMA_RB_WPTR + DMA1_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; @@ -2004,7 +2058,6 @@ static int cayman_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c index bdeb65e..51424ab 100644 --- a/drivers/gpu/drm/radeon/ni_dma.c +++ b/drivers/gpu/drm/radeon/ni_dma.c @@ -43,6 +43,75 @@ u32 cayman_gpu_check_soft_reset(struct radeon_device *rdev); */ /** + * cayman_dma_get_rptr - get the current read pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current rptr from the hardware (cayman+). + */ +uint32_t cayman_dma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr, reg; + + if (rdev->wb.enabled) { + rptr = rdev->wb.wb[ring->rptr_offs/4]; + } else { + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_RPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_RPTR + DMA1_REGISTER_OFFSET; + + rptr = RREG32(reg); + } + + return (rptr & 0x3fffc) >> 2; +} + +/** + * cayman_dma_get_wptr - get the current write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Get the current wptr from the hardware (cayman+). 
+ */ +uint32_t cayman_dma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET; + + return (RREG32(reg) & 0x3fffc) >> 2; +} + +/** + * cayman_dma_set_wptr - commit the write pointer + * + * @rdev: radeon_device pointer + * @ring: radeon ring pointer + * + * Write the wptr back to the hardware (cayman+). + */ +void cayman_dma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 reg; + + if (ring->idx == R600_RING_TYPE_DMA_INDEX) + reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET; + else + reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET; + + WREG32(reg, (ring->wptr << 2) & 0x3fffc); +} + +/** * cayman_dma_ring_ib_execute - Schedule an IB on the DMA engine * * @rdev: radeon_device pointer diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index 10abc4d..ef1791b 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -1050,6 +1050,36 @@ static int r100_cp_init_microcode(struct radeon_device *rdev) return err; } +u32 r100_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); + else + rptr = RREG32(RADEON_CP_RB_RPTR); + + return rptr; +} + +u32 r100_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(RADEON_CP_RB_WPTR); + + return wptr; +} + +void r100_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(RADEON_CP_RB_WPTR, ring->wptr); + (void)RREG32(RADEON_CP_RB_WPTR); +} + static void r100_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -1102,7 +1132,6 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size) ring_size = (1 << (rb_bufsz + 1)) * 4; r100_cp_load_microcode(rdev); r = radeon_ring_init(rdev, ring, ring_size, 
RADEON_WB_CP_RPTR_OFFSET, - RADEON_CP_RB_RPTR, RADEON_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) { return r; diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 9ad0673..a9034d8 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2382,6 +2382,36 @@ out: return err; } +u32 r600_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(R600_CP_RB_RPTR); + + return rptr; +} + +u32 r600_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + u32 wptr; + + wptr = RREG32(R600_CP_RB_WPTR); + + return wptr; +} + +void r600_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring) +{ + WREG32(R600_CP_RB_WPTR, ring->wptr); + (void)RREG32(R600_CP_RB_WPTR); +} + static int r600_cp_load_microcode(struct radeon_device *rdev) { const __be32 *fw_data; @@ -2826,14 +2856,12 @@ static int r600_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c index 7844d15..3452c84 100644 --- a/drivers/gpu/drm/radeon/r600_dma.c +++ b/drivers/gpu/drm/radeon/r600_dma.c @@ -51,7 +51,14 @@ u32 r600_gpu_check_soft_reset(struct radeon_device *rdev); uint32_t r600_dma_get_rptr(struct radeon_device *rdev, struct radeon_ring *ring) { - return (radeon_ring_generic_get_rptr(rdev, ring) & 0x3fffc) >> 2; + u32 rptr; + + if (rdev->wb.enabled) + rptr = rdev->wb.wb[ring->rptr_offs/4]; + else + rptr = RREG32(DMA_RB_RPTR); + + return (rptr & 0x3fffc) >> 2; } 
/** @@ -65,7 +72,7 @@ uint32_t r600_dma_get_rptr(struct radeon_device *rdev, uint32_t r600_dma_get_wptr(struct radeon_device *rdev, struct radeon_ring *ring) { - return (RREG32(ring->wptr_reg) & 0x3fffc) >> 2; + return (RREG32(DMA_RB_WPTR) & 0x3fffc) >> 2; } /** @@ -79,7 +86,7 @@ uint32_t r600_dma_get_wptr(struct radeon_device *rdev, void r600_dma_set_wptr(struct radeon_device *rdev, struct radeon_ring *ring) { - WREG32(ring->wptr_reg, (ring->wptr << 2) & 0x3fffc); + WREG32(DMA_RB_WPTR, (ring->wptr << 2) & 0x3fffc); } /** diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index b1f990d..e7c02a7 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -779,13 +779,11 @@ struct radeon_ring { volatile uint32_t *ring; unsigned rptr; unsigned rptr_offs; - unsigned rptr_reg; unsigned rptr_save_reg; u64 next_rptr_gpu_addr; volatile u32 *next_rptr_cpu_addr; unsigned wptr; unsigned wptr_old; - unsigned wptr_reg; unsigned ring_size; unsigned ring_free_dw; int count_dw; @@ -949,7 +947,7 @@ unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring *ring int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring, unsigned size, uint32_t *data); int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ring_size, - unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop); + unsigned rptr_offs, u32 nop); void radeon_ring_fini(struct radeon_device *rdev, struct radeon_ring *cp); diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c index c0425bb..0447604 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.c +++ b/drivers/gpu/drm/radeon/radeon_asic.c @@ -182,9 +182,9 @@ static struct radeon_asic_ring r100_gfx_ring = { .ring_test = &r100_ring_test, .ib_test = &r100_ib_test, .is_lockup = &r100_gpu_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = 
&radeon_ring_generic_set_wptr, + .get_rptr = &r100_gfx_get_rptr, + .get_wptr = &r100_gfx_get_wptr, + .set_wptr = &r100_gfx_set_wptr, }; static struct radeon_asic r100_asic = { @@ -330,9 +330,9 @@ static struct radeon_asic_ring r300_gfx_ring = { .ring_test = &r100_ring_test, .ib_test = &r100_ib_test, .is_lockup = &r100_gpu_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r100_gfx_get_rptr, + .get_wptr = &r100_gfx_get_wptr, + .set_wptr = &r100_gfx_set_wptr, }; static struct radeon_asic r300_asic = { @@ -883,9 +883,9 @@ static struct radeon_asic_ring r600_gfx_ring = { .ring_test = &r600_ring_test, .ib_test = &r600_ib_test, .is_lockup = &r600_gfx_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r600_gfx_get_rptr, + .get_wptr = &r600_gfx_get_wptr, + .set_wptr = &r600_gfx_set_wptr, }; static struct radeon_asic_ring r600_dma_ring = { @@ -1267,9 +1267,9 @@ static struct radeon_asic_ring evergreen_gfx_ring = { .ring_test = &r600_ring_test, .ib_test = &r600_ib_test, .is_lockup = &evergreen_gfx_is_lockup, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &r600_gfx_get_rptr, + .get_wptr = &r600_gfx_get_wptr, + .set_wptr = &r600_gfx_set_wptr, }; static struct radeon_asic_ring evergreen_dma_ring = { @@ -1570,9 +1570,9 @@ static struct radeon_asic_ring cayman_gfx_ring = { .ib_test = &r600_ib_test, .is_lockup = &cayman_gfx_is_lockup, .vm_flush = &cayman_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cayman_gfx_get_rptr, + .get_wptr = &cayman_gfx_get_wptr, + .set_wptr = &cayman_gfx_set_wptr, }; static struct radeon_asic_ring cayman_dma_ring = { @@ 
-1585,9 +1585,9 @@ static struct radeon_asic_ring cayman_dma_ring = { .ib_test = &r600_dma_ib_test, .is_lockup = &cayman_dma_is_lockup, .vm_flush = &cayman_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr + .get_rptr = &cayman_dma_get_rptr, + .get_wptr = &cayman_dma_get_wptr, + .set_wptr = &cayman_dma_set_wptr }; static struct radeon_asic_ring cayman_uvd_ring = { @@ -1813,9 +1813,9 @@ static struct radeon_asic_ring si_gfx_ring = { .ib_test = &r600_ib_test, .is_lockup = &si_gfx_is_lockup, .vm_flush = &si_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cayman_gfx_get_rptr, + .get_wptr = &cayman_gfx_get_wptr, + .set_wptr = &cayman_gfx_set_wptr, }; static struct radeon_asic_ring si_dma_ring = { @@ -1828,9 +1828,9 @@ static struct radeon_asic_ring si_dma_ring = { .ib_test = &r600_dma_ib_test, .is_lockup = &si_dma_is_lockup, .vm_flush = &si_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr, + .get_rptr = &cayman_dma_get_rptr, + .get_wptr = &cayman_dma_get_wptr, + .set_wptr = &cayman_dma_set_wptr, }; static struct radeon_asic si_asic = { @@ -1943,9 +1943,9 @@ static struct radeon_asic_ring ci_gfx_ring = { .ib_test = &cik_ib_test, .is_lockup = &cik_gfx_is_lockup, .vm_flush = &cik_vm_flush, - .get_rptr = &radeon_ring_generic_get_rptr, - .get_wptr = &radeon_ring_generic_get_wptr, - .set_wptr = &radeon_ring_generic_set_wptr, + .get_rptr = &cik_gfx_get_rptr, + .get_wptr = &cik_gfx_get_wptr, + .set_wptr = &cik_gfx_set_wptr, }; static struct radeon_asic_ring ci_cp_ring = { @@ -1958,9 +1958,9 @@ static struct radeon_asic_ring ci_cp_ring = { .ib_test = &cik_ib_test, .is_lockup = &cik_gfx_is_lockup, .vm_flush = &cik_vm_flush, - .get_rptr = &cik_compute_ring_get_rptr, - .get_wptr = &cik_compute_ring_get_wptr, - .set_wptr = 
&cik_compute_ring_set_wptr, + .get_rptr = &cik_compute_get_rptr, + .get_wptr = &cik_compute_get_wptr, + .set_wptr = &cik_compute_set_wptr, }; static struct radeon_asic_ring ci_dma_ring = { @@ -1973,9 +1973,9 @@ static struct radeon_asic_ring ci_dma_ring = { .ib_test = &cik_sdma_ib_test, .is_lockup = &cik_sdma_is_lockup, .vm_flush = &cik_dma_vm_flush, - .get_rptr = &r600_dma_get_rptr, - .get_wptr = &r600_dma_get_wptr, - .set_wptr = &r600_dma_set_wptr, + .get_rptr = &cik_sdma_get_rptr, + .get_wptr = &cik_sdma_get_wptr, + .set_wptr = &cik_sdma_set_wptr, }; static struct radeon_asic ci_asic = { diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h index c9fd97b..941e7eb 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.h +++ b/drivers/gpu/drm/radeon/radeon_asic.h @@ -47,13 +47,6 @@ u8 atombios_get_backlight_level(struct radeon_encoder *radeon_encoder); void radeon_legacy_set_backlight_level(struct radeon_encoder *radeon_encoder, u8 level); u8 radeon_legacy_get_backlight_level(struct radeon_encoder *radeon_encoder); -u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring); -u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); -void radeon_ring_generic_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); - /* * r100,rv100,rs100,rv200,rs200 */ @@ -148,6 +141,13 @@ extern void r100_post_page_flip(struct radeon_device *rdev, int crtc); extern void r100_wait_for_vblank(struct radeon_device *rdev, int crtc); extern int r100_mc_wait_for_idle(struct radeon_device *rdev); +u32 r100_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 r100_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void r100_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); + /* * r200,rv250,rs300,rv280 */ @@ -368,6 +368,12 @@ int r600_mc_wait_for_idle(struct radeon_device *rdev); int r600_pcie_gart_init(struct radeon_device 
*rdev); void r600_scratch_init(struct radeon_device *rdev); int r600_init_microcode(struct radeon_device *rdev); +u32 r600_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 r600_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void r600_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); /* r600 irq */ int r600_irq_process(struct radeon_device *rdev); int r600_irq_init(struct radeon_device *rdev); @@ -591,6 +597,19 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, void cayman_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); +u32 cayman_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cayman_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cayman_gfx_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +uint32_t cayman_dma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +uint32_t cayman_dma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cayman_dma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); + int ni_dpm_init(struct radeon_device *rdev); void ni_dpm_setup_asic(struct radeon_device *rdev); int ni_dpm_enable(struct radeon_device *rdev); @@ -739,12 +758,24 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, uint32_t incr, uint32_t flags); void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); -u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring); -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); -void cik_compute_ring_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring); +u32 cik_gfx_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_gfx_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_gfx_set_wptr(struct 
radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_compute_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_compute_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_compute_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_sdma_get_rptr(struct radeon_device *rdev, + struct radeon_ring *ring); +u32 cik_sdma_get_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); +void cik_sdma_set_wptr(struct radeon_device *rdev, + struct radeon_ring *ring); int ci_get_temp(struct radeon_device *rdev); int kv_get_temp(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 9214403..b1771a6 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -332,36 +332,6 @@ bool radeon_ring_supports_scratch_reg(struct radeon_device *rdev, } } -u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - u32 rptr; - - if (rdev->wb.enabled) - rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); - else - rptr = RREG32(ring->rptr_reg); - - return rptr; -} - -u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - u32 wptr; - - wptr = RREG32(ring->wptr_reg); - - return wptr; -} - -void radeon_ring_generic_set_wptr(struct radeon_device *rdev, - struct radeon_ring *ring) -{ - WREG32(ring->wptr_reg, ring->wptr); - (void)RREG32(ring->wptr_reg); -} - /** * radeon_ring_free_size - update the free size * @@ -689,22 +659,18 @@ int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring, * @ring: radeon_ring structure holding ring information * @ring_size: size of the ring * @rptr_offs: offset of the rptr writeback location in the WB buffer - * @rptr_reg: MMIO offset of the rptr register - * @wptr_reg: MMIO offset of the wptr register * @nop: nop packet for this ring * * Initialize the driver information for the selected 
ring (all asics). * Returns 0 on success, error on failure. */ int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsigned ring_size, - unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop) + unsigned rptr_offs, u32 nop) { int r; ring->ring_size = ring_size; ring->rptr_offs = rptr_offs; - ring->rptr_reg = rptr_reg; - ring->wptr_reg = wptr_reg; ring->nop = nop; /* Allocate ring buffer */ if (ring->ring_obj == NULL) { @@ -796,9 +762,9 @@ static int radeon_debugfs_ring_info(struct seq_file *m, void *data) radeon_ring_free_size(rdev, ring); count = (ring->ring_size / 4) - ring->ring_free_dw; tmp = radeon_ring_get_wptr(rdev, ring); - seq_printf(m, "wptr(0x%04x): 0x%08x [%5d]\n", ring->wptr_reg, tmp, tmp); + seq_printf(m, "wptr: 0x%08x [%5d]\n", tmp, tmp); tmp = radeon_ring_get_rptr(rdev, ring); - seq_printf(m, "rptr(0x%04x): 0x%08x [%5d]\n", ring->rptr_reg, tmp, tmp); + seq_printf(m, "rptr: 0x%08x [%5d]\n", tmp, tmp); if (ring->rptr_save_reg) { seq_printf(m, "rptr next(0x%04x): 0x%08x\n", ring->rptr_save_reg, RREG32(ring->rptr_save_reg)); diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index 9f58467..6cce0de 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1728,14 +1728,12 @@ static int rv770_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - R600_CP_RB_RPTR, R600_CP_RB_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR, DMA_RB_WPTR, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0)); if (r) return r; @@ -1754,7 +1752,6 @@ static int rv770_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); 
if (!r) r = uvd_v1_0_init(rdev); diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c index a36736d..0a71a2e 100644 --- a/drivers/gpu/drm/radeon/si.c +++ b/drivers/gpu/drm/radeon/si.c @@ -6419,37 +6419,30 @@ static int si_startup(struct radeon_device *rdev) ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET, - CP_RB0_RPTR, CP_RB0_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET, - CP_RB1_RPTR, CP_RB1_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET, - CP_RB2_RPTR, CP_RB2_WPTR, RADEON_CP_PACKET2); if (r) return r; ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET, - DMA_RB_RPTR + DMA0_REGISTER_OFFSET, - DMA_RB_WPTR + DMA0_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0)); if (r) return r; ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX]; r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET, - DMA_RB_RPTR + DMA1_REGISTER_OFFSET, - DMA_RB_WPTR + DMA1_REGISTER_OFFSET, DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0)); if (r) return r; @@ -6469,7 +6462,6 @@ static int si_startup(struct radeon_device *rdev) ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX]; if (ring->ring_size) { r = radeon_ring_init(rdev, ring, ring->ring_size, 0, - UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR, RADEON_CP_PACKET2); if (!r) r = uvd_v1_0_init(rdev); -- 1.8.3.1 [-- Attachment #3: Type: text/plain, Size: 159 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 23+ messages in thread
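Editorial sketch, not part of the patch: the diff above drops the rptr_reg/wptr_reg fields from struct radeon_ring and points each asic's .get_rptr/.get_wptr/.set_wptr at a ring-specific function that knows its own registers. The stand-alone C model below illustrates that shape under loud assumptions. fake_mmio, the REG_* and RING_* enums, struct ring, struct ring_funcs, and ring_get_wptr()/ring_set_wptr() are all hypothetical stand-ins, not radeon symbols. Only the 0x3fffc mask and the shift by 2 are taken from the DMA accessors in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file standing in for RREG32()/WREG32(). */
enum { REG_GFX_WPTR, REG_DMA0_WPTR, REG_DMA1_WPTR, REG_COUNT };
static uint32_t fake_mmio[REG_COUNT];

/* Hypothetical ring indices. */
enum { RING_GFX, RING_DMA0, RING_DMA1 };

struct ring;

/* Per-ring callbacks; the ring no longer stores register offsets. */
struct ring_funcs {
	uint32_t (*get_wptr)(const struct ring *ring);
	void (*set_wptr)(const struct ring *ring);
};

struct ring {
	int idx;
	uint32_t wptr;	/* write pointer, in dwords */
	const struct ring_funcs *funcs;
};

/* GFX ring: one fixed register, value stored as-is. */
static uint32_t gfx_get_wptr(const struct ring *ring)
{
	(void)ring;
	return fake_mmio[REG_GFX_WPTR];
}

static void gfx_set_wptr(const struct ring *ring)
{
	fake_mmio[REG_GFX_WPTR] = ring->wptr;
}

/* DMA rings: the callback picks the register from the ring index. */
static int dma_wptr_reg(const struct ring *ring)
{
	return ring->idx == RING_DMA0 ? REG_DMA0_WPTR : REG_DMA1_WPTR;
}

/* The register holds a masked byte offset, so convert to dwords. */
static uint32_t dma_get_wptr(const struct ring *ring)
{
	return (fake_mmio[dma_wptr_reg(ring)] & 0x3fffc) >> 2;
}

static void dma_set_wptr(const struct ring *ring)
{
	fake_mmio[dma_wptr_reg(ring)] = (ring->wptr << 2) & 0x3fffc;
}

static const struct ring_funcs gfx_funcs = {
	.get_wptr = gfx_get_wptr,
	.set_wptr = gfx_set_wptr,
};

static const struct ring_funcs dma_funcs = {
	.get_wptr = dma_get_wptr,
	.set_wptr = dma_set_wptr,
};

/* Generic entry points: dispatch through the table, no register math. */
static uint32_t ring_get_wptr(const struct ring *ring)
{
	assert(ring->funcs != NULL && ring->funcs->get_wptr != NULL);
	return ring->funcs->get_wptr(ring);
}

static void ring_set_wptr(const struct ring *ring)
{
	assert(ring->funcs != NULL && ring->funcs->set_wptr != NULL);
	ring->funcs->set_wptr(ring);
}
```

A caller would fill in a struct ring with .funcs = &dma_funcs or &gfx_funcs and go through ring_set_wptr()/ring_get_wptr(). The point of the model is that adding a ring with an unusual register layout only needs a new callback pair, not new fields in the shared ring structure.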
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-10 15:12 ` Alex Deucher
@ 2014-01-02 20:54 ` Kleber Sacilotto de Souza
  2014-01-02 22:48 ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2014-01-02 20:54 UTC
To: Alex Deucher
Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King

On 12/10/2013 01:12 PM, Alex Deucher wrote:
> On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
>> On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>>>
>>>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>>>> -                              struct radeon_ring *ring)
>>>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>>>> +                         struct radeon_ring *ring)
>>>>  {
>>>>         u32 wptr;
>>>>
>>>>         if (rdev->wb.enabled) {
>>>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>>>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>>>         } else {
>>>>                 mutex_lock(&rdev->srbm_mutex);
>>>>                 cik_srbm_select(rdev, ring->me, ring->pipe,
>>>> ring->queue, 0);
>>>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>>>> radeon_device *rdev,
>>>>         return wptr;
>>>>  }
>>>>
>>>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>>>> -                               struct radeon_ring *ring)
>>>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>>>> +                          struct radeon_ring *ring)
>>>>  {
>>>>         rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>>>
>>> I think this cpu_to_le32() needs to be dropped as well to match
>>> cik_compute_ring_get_wptr().
>>
>> whoops, yeah, missed that one.
>
> Updated patch attached.
>
> Alex
>

Hi, Alex.

Has this patch been accepted upstream? I've tested it successfully on a ppc64 system with a FirePro 2270 adapter.

Thanks,

--
Kleber Sacilotto de Souza
IBM Linux Technology Center
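Editorial sketch, not from the thread: Michel's review comment is about symmetry. If the getter reads the writeback slot without le32_to_cpu() while the setter still stores through cpu_to_le32(), a big-endian CPU reads back a byte-swapped write pointer. The model below shows only that mismatch; it does not say which convention the hardware needs. Everything in it is a hypothetical stand-in: swab32(), the be_* helpers (which model what the kernel conversions amount to on a big-endian host), wb_slot, and the four accessors.

```c
#include <stdint.h>

/* Swap the byte order of a 32-bit value. */
static uint32_t swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Hypothetical model of the conversions on a big-endian host. */
static uint32_t be_cpu_to_le32(uint32_t v) { return swab32(v); }
static uint32_t be_le32_to_cpu(uint32_t v) { return swab32(v); }

/* Stand-in for one 32-bit slot of the writeback buffer. */
static uint32_t wb_slot;

/* Setter and getter variants with and without the conversion. */
static void set_wptr_converted(uint32_t wptr) { wb_slot = be_cpu_to_le32(wptr); }
static void set_wptr_raw(uint32_t wptr)       { wb_slot = wptr; }
static uint32_t get_wptr_converted(void)      { return be_le32_to_cpu(wb_slot); }
static uint32_t get_wptr_raw(void)            { return wb_slot; }
```

Pairing set_wptr_converted() with get_wptr_raw() is the mismatched combination the review caught. Either matched pair (both converted, or both raw) round-trips the value.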
* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2014-01-02 20:54 ` Kleber Sacilotto de Souza
@ 2014-01-02 22:48 ` Alex Deucher
  0 siblings, 0 replies; 23+ messages in thread
From: Alex Deucher @ 2014-01-02 22:48 UTC
To: Kleber Sacilotto de Souza
Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King

On Thu, Jan 2, 2014 at 3:54 PM, Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> wrote:
> On 12/10/2013 01:12 PM, Alex Deucher wrote:
>>
>> On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com>
>> wrote:
>>>
>>> On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>>>
>>>> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>>>>
>>>>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>>>>> -                              struct radeon_ring *ring)
>>>>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>>>>> +                         struct radeon_ring *ring)
>>>>>  {
>>>>>         u32 wptr;
>>>>>
>>>>>         if (rdev->wb.enabled) {
>>>>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>>>>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>>>>         } else {
>>>>>                 mutex_lock(&rdev->srbm_mutex);
>>>>>                 cik_srbm_select(rdev, ring->me, ring->pipe,
>>>>> ring->queue, 0);
>>>>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>>>>> radeon_device *rdev,
>>>>>         return wptr;
>>>>>  }
>>>>>
>>>>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>>>>> -                               struct radeon_ring *ring)
>>>>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>>>>> +                          struct radeon_ring *ring)
>>>>>  {
>>>>>         rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>>>>
>>>> I think this cpu_to_le32() needs to be dropped as well to match
>>>> cik_compute_ring_get_wptr().
>>>
>>> whoops, yeah, missed that one.
>>
>> Updated patch attached.
>>
>> Alex
>>
>
> Hi, Alex.
>
> Has this patch been accepted upstream? I've tested it successfully on a
> ppc64 system with a FirePro 2270 adapter.

Not yet, I was planning to upstream it for 3.14 unless there are objections.

http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.14-wip

Alex

>
> Thanks,
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>
end of thread, other threads:[~2014-01-02 22:48 UTC | newest]

Thread overview: 23+ messages
2013-06-17 14:06 [PATCH] drm/radeon: Disable writeback by default on ppc Adam Jackson
2013-06-17 15:04 ` Alex Deucher
2013-06-17 16:07 ` Adam Jackson
2013-06-17 22:57 ` Alex Deucher
2013-11-07 22:29 ` Benjamin Herrenschmidt
2013-11-08 13:43 ` Kleber Sacilotto de Souza
2013-11-24 23:15 ` Benjamin Herrenschmidt
2013-11-26 0:11 ` Kleber Sacilotto de Souza
2013-12-04 22:16 ` Kleber Sacilotto de Souza
2013-12-04 23:56 ` Alex Deucher
2013-12-05 0:05 ` Alex Deucher
2013-12-05 1:39 ` Benjamin Herrenschmidt
2013-12-05 2:29 ` Michel Dänzer
2013-12-05 4:06 ` Benjamin Herrenschmidt
2013-12-05 14:42 ` Alex Deucher
2013-12-06 13:58 ` Kleber Sacilotto de Souza
2013-12-06 15:59 ` Alex Deucher
2013-12-10 0:48 ` Alex Deucher
2013-12-10 2:20 ` Michel Dänzer
2013-12-10 15:04 ` Alex Deucher
2013-12-10 15:12 ` Alex Deucher
2014-01-02 20:54 ` Kleber Sacilotto de Souza
2014-01-02 22:48 ` Alex Deucher