* [PATCH] drm/radeon: Disable writeback by default on ppc
@ 2013-06-17 14:06 Adam Jackson
  2013-06-17 15:04 ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Adam Jackson @ 2013-06-17 14:06 UTC
  To: dri-devel

At least on an IBM Power 720, the writeback test passes, but several piglit
tests reliably trigger GPU resets because the ring buffer pointers are
not being updated.  There is probably a better way to limit this to just
the affected machines, though.

Signed-off-by: Adam Jackson <ajax@redhat.com>
---
 drivers/gpu/drm/radeon/r600_cp.c       | 7 +++++++
 drivers/gpu/drm/radeon/radeon_cp.c     | 7 +++++++
 drivers/gpu/drm/radeon/radeon_device.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_drv.c    | 2 +-
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c
index 1c51c08..ef28532 100644
--- a/drivers/gpu/drm/radeon/r600_cp.c
+++ b/drivers/gpu/drm/radeon/r600_cp.c
@@ -552,6 +552,13 @@ static void r600_test_writeback(drm_radeon_private_t *dev_priv)
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback test failed\n");
 	}
+#if defined(__ppc__) || defined(__ppc64__)
+	/* the test might succeed on ppc, but it's usually not reliable */
+	if (radeon_no_wb == -1) {
+		radeon_no_wb = 1;
+		DRM_INFO("not trusting writeback test due to arch quirk\n");
+	}
+#endif
 	if (radeon_no_wb == 1) {
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback forced off\n");
diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c
index efc4f64..a967b33 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -892,6 +892,13 @@ static void radeon_test_writeback(drm_radeon_private_t * dev_priv)
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback test failed\n");
 	}
+#if defined(__ppc__) || defined(__ppc64__)
+	/* the test might succeed on ppc, but it's usually not reliable */
+	if (radeon_no_wb == -1) {
+		radeon_no_wb = 1;
+		DRM_INFO("not trusting writeback test due to arch quirk\n");
+	}
+#endif
 	if (radeon_no_wb == 1) {
 		dev_priv->writeback_works = 0;
 		DRM_INFO("writeback forced off\n");
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 1899738..524046e 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -322,8 +322,8 @@ int radeon_wb_init(struct radeon_device *rdev)
 	/* disable event_write fences */
 	rdev->wb.use_event = false;
 	/* disabled via module param */
-	if (radeon_no_wb == 1) {
-		rdev->wb.enabled = false;
+	if (radeon_no_wb != -1) {
+		rdev->wb.enabled = !radeon_no_wb;
 	} else {
 		if (rdev->flags & RADEON_IS_AGP) {
 			/* often unreliable on AGP */
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 094e7e5..04809d4 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -146,7 +146,7 @@ static inline void radeon_register_atpx_handler(void) {}
 static inline void radeon_unregister_atpx_handler(void) {}
 #endif
 
-int radeon_no_wb;
+int radeon_no_wb = -1;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-- 
1.8.2.1
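
The tri-state handling of no_wb that this patch introduces (-1 = auto, 0 = trust the test, 1 = force off) can be sketched in isolation as below. This is a minimal illustration, not the driver code: resolve_writeback() and its is_ppc and test_passed parameters are hypothetical names, and it models only the legacy test_writeback path touched in r600_cp.c and radeon_cp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three values the module parameter can take. */
enum wb_param { WB_PARAM_AUTO = -1, WB_PARAM_ALLOW = 0, WB_PARAM_FORCE_OFF = 1 };

/*
 * Decide whether writeback should be used.
 *   no_wb       : value of the no_wb module parameter (-1, 0 or 1)
 *   is_ppc      : true when built for PowerPC (assumption for this sketch)
 *   test_passed : result of the runtime writeback test
 */
static bool resolve_writeback(int no_wb, bool is_ppc, bool test_passed)
{
	assert(no_wb >= -1 && no_wb <= 1);

	/* Arch quirk: with no explicit user choice, distrust the test on ppc. */
	if (is_ppc && no_wb == WB_PARAM_AUTO)
		no_wb = WB_PARAM_FORCE_OFF;

	if (no_wb == WB_PARAM_FORCE_OFF)
		return false;

	/* no_wb is 0, or still -1 on other architectures: use the test result. */
	return test_passed;
}
```

With this shape, a ppc user can still opt back in with no_wb=0, which is the reason the default moves from 0 to -1.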


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-06-17 14:06 [PATCH] drm/radeon: Disable writeback by default on ppc Adam Jackson
@ 2013-06-17 15:04 ` Alex Deucher
  2013-06-17 16:07   ` Adam Jackson
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-06-17 15:04 UTC
  To: Adam Jackson; +Cc: dri-devel

On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
> At least on an IBM Power 720, this check passes, but several piglit
> tests will reliably trigger GPU resets due to the ring buffer pointers
> not being updated.  There's probably a better way to limit this to just
> affected machines though.

What radeon chips are you seeing this on?  wb is more or less required
on r6xx and newer and I'm not sure those generations will even work
properly without writeback enabled these days.  We force it to always
be enabled on APUs and NI and newer asics.  With KMS, wb encompasses
more than just rptr writeback; it covers pretty much everything
involving the GPU writing any status information to memory.

Alex
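
Alex's point that writeback cannot be turned off on every chip can be sketched as follows. This is a rough illustration only: the gpu_gen ordering, struct gpu_info, and both functions are hypothetical names invented here, and the real driver keys off its own chip family enumeration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, coarse generation ordering used only for this sketch. */
enum gpu_gen { GEN_PRE_R600, GEN_R600, GEN_EVERGREEN, GEN_NI };

struct gpu_info {
	enum gpu_gen gen;
	bool is_apu;
};

/* True when a user request to disable writeback would be overridden. */
static bool writeback_is_mandatory(const struct gpu_info *gpu)
{
	assert(gpu != NULL);
	return gpu->is_apu || gpu->gen >= GEN_NI;
}

/* Final decision: forced on for APUs and NI or newer, else the user's choice. */
static bool writeback_enabled(const struct gpu_info *gpu, bool user_disabled)
{
	if (writeback_is_mandatory(gpu))
		return true;
	return !user_disabled;
}
```

Under this model, an arch-wide default of "off" would only take effect on the older discrete parts.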



* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-06-17 15:04 ` Alex Deucher
@ 2013-06-17 16:07   ` Adam Jackson
  2013-06-17 22:57     ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Adam Jackson @ 2013-06-17 16:07 UTC
  To: Alex Deucher; +Cc: dri-devel

On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote:
> On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
> > At least on an IBM Power 720, this check passes, but several piglit
> > tests will reliably trigger GPU resets due to the ring buffer pointers
> > not being updated.  There's probably a better way to limit this to just
> > affected machines though.
> 
> What radeon chips are you seeing this on?  wb is more or less required
> on r6xx and newer and I'm not sure those generations will even work
> properly without writeback enabled these days.  We force it to always
> be enabled on APUs and NI and newer asics.  With KMS, wb encompasses
> more than just rptr writeback; it covers pretty much everything
> involving the GPU writing any status information to memory.

FirePro 2270, at least.  Booting with no_wb=1, piglit runs to completion
with no GPU resets or IB submission failures.  Booting without no_wb,
the following piglits go from pass to fail, all complaining that the
kernel rejected CS:

Test                                                   no-wb     wb
All                                                    7598/9587 7403/9553 
glean                                                  362/385   361/385   
pointAtten                                             pass      fail      
texture_srgb                                           pass      fail      
shaders                                                477/533   441/533   
glsl-algebraic-add-add-1                               pass      fail      
glsl-algebraic-add-add-2                               pass      fail      
glsl-algebraic-add-add-3                               pass      fail      
glsl-algebraic-sub-zero-3                              pass      fail      
glsl-algebraic-sub-zero-4                              pass      fail      
glsl-complex-subscript                                 pass      fail      
glsl-copy-propagation-if-1                             pass      fail      
glsl-copy-propagation-if-2                             pass      fail      
glsl-copy-propagation-if-3                             pass      fail      
glsl-copy-propagation-vector-indexing                  pass      fail      
glsl-fs-atan-2                                         pass      fail      
glsl-fs-dot-vec2-2                                     pass      fail      
glsl-fs-log2                                           pass      fail      
glsl-fs-main-return                                    pass      fail      
glsl-fs-max-3                                          pass      fail      
glsl-fs-min-2                                          pass      fail      
glsl-fs-min-3                                          pass      fail      
glsl-fs-statevar-call                                  pass      fail      
glsl-fs-struct-equal                                   pass      fail      
glsl-function-chain16                                  pass      fail      
glsl-implicit-conversion-02                            pass      fail      
glsl-inout-struct-01                                   pass      fail      
glsl-inout-struct-02                                   pass      fail      
glsl-link-varying-TexCoord                             pass      fail      
glsl-link-varyings-2                                   pass      fail      
glsl-uniform-initializer-4                             pass      fail      
glsl-uniform-initializer-6                             pass      fail      
glsl-uniform-initializer-7                             pass      fail      
glsl-vs-abs-neg-with-intermediate                      pass      fail      
glsl-vs-clamp-1                                        pass      fail      
glsl-vs-deadcode-1                                     pass      fail      
glsl-vs-deadcode-2                                     pass      fail      
glsl-vs-f2b                                            pass      fail      
glsl-vs-position-outval                                pass      fail      
link-uniform-array-size                                pass      fail      
loopfunc                                               pass      fail      
spec                                                   5921/7801 5763/7767 
!OpenGL 1.1                                            118/229   101/219   
depthstencil-default_fb-clear                          pass      fail      
getteximage-simple                                     pass      fail      
texwrap formats                                        27/50     12/40     
GL_LUMINANCE12                                         pass      fail      
GL_LUMINANCE16                                         pass      fail      
GL_LUMINANCE4                                          pass      fail      
GL_LUMINANCE4_ALPHA4                                   pass      fail      
GL_LUMINANCE8                                          pass      fail      
!OpenGL 1.4                                            11/13     9/13      
triangle-rasterization                                 pass      fail      
triangle-rasterization-fbo                             pass      fail      
ARB_depth_buffer_float                                 9/63      6/63      
fbo-depth-GL_DEPTH32F_STENCIL8-clear                   pass      fail      
fbo-depth-GL_DEPTH_COMPONENT32F-clear                  pass      fail      
fbo-stencil-GL_DEPTH32F_STENCIL8-clear                 pass      fail      
ARB_depth_texture                                      11/53     10/53     
fbo-depth-GL_DEPTH_COMPONENT32-clear                   pass      fail      
ARB_fragment_program                                   23/28     22/28     
kil-swizzle                                            pass      fail      
ARB_framebuffer_object                                 14/26     13/26     
fbo-deriv                                              pass      fail      
ARB_framebuffer_sRGB                                   80/81     65/81     
blit renderbuffer linear downsample disabled           pass      fail      
blit renderbuffer linear msaa enabled                  pass      fail      
blit renderbuffer linear single_sampled enabled        pass      fail      
blit renderbuffer linear_to_srgb scaled disabled       pass      fail      
blit renderbuffer srgb msaa disabled                   pass      fail      
blit renderbuffer srgb msaa enabled                    pass      fail      
blit renderbuffer srgb scaled enabled                  pass      fail      
blit renderbuffer srgb single_sampled enabled          pass      fail      
blit renderbuffer srgb upsample enabled                pass      fail      
blit renderbuffer srgb_to_linear msaa disabled         pass      fail      
blit renderbuffer srgb_to_linear msaa enabled          pass      fail      
blit texture linear scaled enabled                     pass      fail      
blit texture linear_to_srgb upsample disabled          pass      fail      
blit texture srgb msaa disabled                        pass      fail      
blit texture srgb msaa enabled                         pass      fail      
ARB_map_buffer_range                                   8/8       6/8       
MAP_INVALIDATE_RANGE_BIT decrement-offset              pass      fail      
MAP_INVALIDATE_RANGE_BIT offset=0                      pass      fail      
ARB_sampler_objects                                    1/4       0/4       
sampler-objects                                        pass      fail      
ARB_shading_language_packing                           10/10     7/10      
execution                                              10/10     7/10      
built-in-functions                                     10/10     7/10      
const-packSnorm4x8                                     pass      fail      
const-unpackSnorm2x16                                  pass      fail      
const-unpackUnorm2x16                                  pass      fail      
ARB_texture_compression                                25/40     22/40     
texwrap formats bordercolor-swizzled                   3/6       0/6       
GL_COMPRESSED_INTENSITY, swizzled, border color only   pass      fail      
GL_COMPRESSED_LUMINANCE, swizzled, border color only   pass      fail      
GL_COMPRESSED_LUMINANCE_ALPHA, swizzled, border color only  pass      fail
ARB_texture_cube_map                                   7/8       6/8       
cubemap                                                pass      fail      
ARB_texture_rectangle                                  11/21     10/21     
texrect_simple_arb_texrect                             pass      fail      
ARB_texture_rg                                         42/113    19/97     
texwrap formats-int                                    24/28     0/12      
GL_R16UI                                               pass      fail      
GL_R32UI                                               pass      fail      
GL_R8I                                                 pass      fail      
GL_R8UI                                                pass      fail      
GL_RG16UI                                              pass      fail      
GL_RG32UI                                              pass      fail      
GL_RG8I                                                pass      fail      
GL_RG8UI                                               pass      fail      
ARB_texture_storage                                    1/1       0/1       
texture-storage                                        pass      fail      
ARB_timer_query                                        1/3       0/3       
query GL_TIMESTAMP                                     pass      fail      
ATI_draw_buffers                                       3/3       2/3       
arbfp-no-option                                        pass      fail      
EXT_framebuffer_multisample                            267/287   259/287   
bitmap 4                                               pass      fail      
bitmap 6                                               pass      fail      
bitmap 8                                               pass      fail      
clear 2 depth                                          pass      fail      
clear 6 color                                          pass      fail      
clear 6 depth                                          pass      fail      
clear 6 stencil                                        pass      fail      
draw-buffers-alpha-to-one 4                            pass      fail      
EXT_framebuffer_object                                 175/289   173/289   
fbo-fragcoord2                                         pass      fail      
fbo-generatemipmap-filtering                           pass      fail      
fbo-generatemipmap-formats                             2/76      1/76      
GL_LUMINANCE4_ALPHA4                                   pass      fail      
fbo-readpixels-depth-formats                           14/24     16/24     
GL_DEPTH24_STENCIL8_EXT                                3/4       2/4       
GL_UNSIGNED_INT                                        pass      fail      
fbo-stencil-GL_STENCIL_INDEX16-clear                   pass      fail      
EXT_packed_depth_stencil                               13/58     11/58     
fbo-depthstencil-GL_DEPTH24_STENCIL8-clear             pass      fail      
fbo-stencil-GL_DEPTH24_STENCIL8-clear                  pass      fail      
EXT_texture_compression_rgtc                           31/39     11/31     
fbo-generatemipmap-formats                             8/8       4/8       
GL_COMPRESSED_RED                                      pass      fail      
GL_COMPRESSED_RED_GREEN_RGTC2_EXT                      pass      fail      
GL_COMPRESSED_RED_RGTC1_EXT                            pass      fail      
GL_COMPRESSED_RG                                       pass      fail      
texwrap formats                                        12/12     0/4       
GL_COMPRESSED_RED_RGTC1                                pass      fail      
GL_COMPRESSED_RG_RGTC2                                 pass      fail      
GL_COMPRESSED_SIGNED_RED_RGTC1                         pass      fail      
GL_COMPRESSED_SIGNED_RG_RGTC2                          pass      fail      
texwrap formats bordercolor                            4/4       0/4       
GL_COMPRESSED_RED_RGTC1, border color only             pass      fail      
GL_COMPRESSED_RG_RGTC2, border color only              pass      fail      
GL_COMPRESSED_SIGNED_RED_RGTC1, border color only      pass      fail      
GL_COMPRESSED_SIGNED_RG_RGTC2, border color only       pass      fail      
EXT_transform_feedback                                 45/279    44/279    
discard-bitmap                                         pass      fail      
glsl-1.10                                              1540/1636 1529/1636 
execution                                              1295/1391 1286/1391 
fs-inline-notequal                                     pass      fail      
fs-saturate-pow                                        pass      fail      
maximums                                               12/12     8/12      
gl_MaxCombinedTextureImageUnits                        pass      fail      
gl_MaxFragmentUniformComponents                        pass      fail      
gl_MaxTextureImageUnits                                pass      fail      
gl_MaxVertexAttribs                                    pass      fail      
vs-mat2-array-assignment                               pass      fail      
vs-saturate-exp2                                       pass      fail      
vs-saturate-pow                                        pass      fail      
linker                                                 18/18     16/18     
access-builtin-global-from-fn-unknown-to-main          pass      fail      
override-builtin-uniform-01                            pass      fail      
glsl-1.20                                              2124/2183 2097/2183 
execution                                              788/845   761/845   
maximums                                               12/12     7/12      
gl_MaxClipPlanes                                       pass      fail      
gl_MaxCombinedTextureImageUnits                        pass      fail      
gl_MaxTextureCoords                                    pass      fail      
gl_MaxVaryingFloats                                    pass      fail      
gl_MaxVertexTextureImageUnits                          pass      fail      
uniform-initializer                                    64/64     44/64     
fs-bool-set-by-other-stage                             pass      fail      
fs-float-array                                         pass      fail      
fs-float-from-const                                    pass      fail      
fs-float-set-by-other-stage                            pass      fail      
fs-int                                                 pass      fail      
fs-int-from-const                                      pass      fail      
fs-int-set-by-other-stage                              pass      fail      
fs-mat3-set-by-other-stage                             pass      fail      
fs-mat4-from-const                                     pass      fail      
fs-mat4-set-by-API                                     pass      fail      
fs-mat4-set-by-other-stage                             pass      fail      
fs-structure-array                                     pass      fail      
vs-bool-from-const                                     pass      fail      
vs-float                                               pass      fail      
vs-float-array                                         pass      fail      
vs-mat2                                                pass      fail      
vs-mat3                                                pass      fail      
vs-mat4                                                pass      fail      
vs-mat4-from-const                                     pass      fail      
vs-mat4-set-by-API                                     pass      fail      
vs-all-equal-bool-array                                pass      fail      
vs-assign-varied-struct                                pass      fail      
glsl-1.30                                              468/604   455/604   
execution                                              465/601   452/601   
clipping                                               20/20     18/20     
fs-clip-distance-interpolated                          pass      fail      
vs-clip-distance-explicitly-sized                      pass      fail      
fs-discard-exit-1                                      pass      fail      
fs-discard-exit-2                                      pass      fail      
fs-increment-int                                       pass      fail      
fs-multiply-const-ivec4                                pass      fail      
maximums                                               13/13     9/13      
gl_MaxClipPlanes                                       pass      fail      
gl_MaxCombinedTextureImageUnits                        pass      fail      
gl_MaxTextureImageUnits                                pass      fail      
gl_MaxVaryingFloats                                    pass      fail      
qualifiers                                             1/1       0/1       
vs-out-conversion-ivec4-to-vec4                        pass      fail      
uniform-initializer                                    8/8       6/8       
fs-uint-from-const                                     pass      fail      
vs-uint                                                pass      fail      

- ajax


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-06-17 16:07   ` Adam Jackson
@ 2013-06-17 22:57     ` Alex Deucher
  2013-11-07 22:29       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-06-17 22:57 UTC
  To: Adam Jackson; +Cc: dri-devel

On Mon, Jun 17, 2013 at 12:07 PM, Adam Jackson <ajax@redhat.com> wrote:
> On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote:
>> On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote:
>> > At least on an IBM Power 720, this check passes, but several piglit
>> > tests will reliably trigger GPU resets due to the ring buffer pointers
>> > not being updated.  There's probably a better way to limit this to just
>> > affected machines though.
>>
>> What radeon chips are you seeing this on?  wb is more or less required
>> on r6xx and newer and I'm not sure those generations will even work
>> properly without writeback enabled these days.  We force it to always
>> be enabled on APUs and NI and newer asics.  With KMS, wb encompasses
>> more than just rptr writeback; it covers pretty much everything
>> involving the GPU writing any status information to memory.
>
> FirePro 2270, at least.  Booting with no_wb=1, piglit runs to completion
> with no GPU resets or IB submission failures.  Booting without no_wb,
> the following piglits go from pass to fail, all complaining that the
> kernel rejected CS:

Weird.  I wonder if there is an issue with cache snoops on PPC.  We
currently use the gart in cached mode (GPU snoops CPU cache) with
cached pages.  I wonder if we need to use uncached pages on PPC.

Alex


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-06-17 22:57     ` Alex Deucher
@ 2013-11-07 22:29       ` Benjamin Herrenschmidt
  2013-11-08 13:43         ` Kleber Sacilotto de Souza
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2013-11-07 22:29 UTC
  To: Alex Deucher
  Cc: dri-devel, Jerome Glisse, Thadeu Lima de Souza Cascardo,
	Brian King

On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:

> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
> currently use the gart in cached mode (GPU snoops CPU cache) with
> cached pages.  I wonder if we need to use uncached pages on PPC.

There is no such issue and no known bugs with DMA writes on those
PCIe host bridges (and they do get hammered pretty bad here).

This needs further investigation by the lab/hw guys to find out what's
actually happening on the bus and the host bridge.

Thadeu, Kleber: Jerome suggested writing a test case in userspace that
continuously writes to a spare scratch register (thus triggering the
corresponding writeback DMA) and checks the memory location to compare
the writeback value (using a debugfs file for example, or mmap).

Can you guys do something like that? Then we need to get the analyzer on
it and/or have the lab guys look at the fabric trace & PHB trace.
 
Ben.
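
The test case suggested above (write a scratch register repeatedly, then check that each value arrives in the writeback memory location) could be structured roughly like this. It is a sketch under stated assumptions, not an existing tool: struct wb_probe, wb_stress(), and the simulated struct fake_gpu are hypothetical names, and the two hooks stand in for whatever register and memory access a real test would use (a debugfs file or mmap, as mentioned).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access hooks: write a scratch register, read its writeback slot. */
struct wb_probe {
	void (*write_scratch)(void *ctx, uint32_t value);
	uint32_t (*read_writeback)(void *ctx);
	void *ctx;
};

/*
 * Write `iterations` distinct values and poll up to `max_polls` times for
 * each one to appear in memory. Returns how many values never arrived.
 */
static unsigned long wb_stress(const struct wb_probe *p,
			       unsigned long iterations,
			       unsigned int max_polls)
{
	unsigned long misses = 0;

	assert(p != NULL && p->write_scratch != NULL && p->read_writeback != NULL);

	for (unsigned long i = 0; i < iterations; i++) {
		uint32_t expected = 0xCAFE0000u + (uint32_t)i;
		bool seen = false;

		p->write_scratch(p->ctx, expected);
		for (unsigned int poll = 0; poll < max_polls; poll++) {
			if (p->read_writeback(p->ctx) == expected) {
				seen = true;
				break;
			}
		}
		if (!seen)
			misses++;
	}
	return misses;
}

/* Simulated device: drops every `drop_every`-th writeback (0 = never drops). */
struct fake_gpu {
	uint32_t wb_slot;
	unsigned long writes;
	unsigned long drop_every;
};

static void fake_write(void *ctx, uint32_t value)
{
	struct fake_gpu *g = ctx;

	g->writes++;
	if (g->drop_every == 0 || g->writes % g->drop_every != 0)
		g->wb_slot = value;	/* writeback reached memory */
}

static uint32_t fake_read(void *ctx)
{
	return ((struct fake_gpu *)ctx)->wb_slot;
}
```

Using distinct values per iteration means a stale slot can never be mistaken for a fresh write, so every lost writeback shows up as one miss.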


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-11-07 22:29       ` Benjamin Herrenschmidt
@ 2013-11-08 13:43         ` Kleber Sacilotto de Souza
  2013-11-24 23:15           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2013-11-08 13:43 UTC
  To: Benjamin Herrenschmidt
  Cc: dri-devel, Brian King, Jerome Glisse,
	Thadeu Lima de Souza Cascardo

On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>
>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>> currently use the gart in cached mode (GPU snoops CPU cache) with
>> cached pages.  I wonder if we need to use uncached pages on PPC.
>
> There is no such issue and no known bugs with DMA writes on those
> PCIe host bridges (and they do get hammered pretty bad here).
>
> This needs further investigation by the lab/hw guys to find out what's
> actually happening on the bus and the host bridge.
>
> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
> continuously writes to a spare scratch register (thus triggering the
> corresponding writeback DMA) and checks the memory location to compare
> the writeback value (using a debugfs file for example, or mmap).
>

I can look into that.


Thanks,

Kleber


-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-11-08 13:43         ` Kleber Sacilotto de Souza
@ 2013-11-24 23:15           ` Benjamin Herrenschmidt
  2013-11-26  0:11             ` Kleber Sacilotto de Souza
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2013-11-24 23:15 UTC
  To: Kleber Sacilotto de Souza
  Cc: dri-devel, Brian King, Jerome Glisse,
	Thadeu Lima de Souza Cascardo

On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
> >
> >> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
> >> currently use the gart in cached mode (GPU snoops CPU cache) with
> >> cached pages.  I wonder if we need to use uncached pages on PPC.
> >
> > There is no such issue and no known bugs with DMA writes on those
> > PCIe host bridges (and they do get hammered pretty bad here).
> >
> > This needs further investigation by the lab/hw guys to find out what's
> > actually happening on the bus and the host bridge.
> >
> > Thadeu, Kleber: Jerome suggested writing a test case in userspace that
> > continuously writes to a spare scratch register (thus triggering the
> > corresponding writeback DMA) and checks the memory location to compare
> > the writeback value (using a debugfs file for example, or mmap).
> >
> 
> I can look into that.

Any news ?

Ben.

> 
> Thanks,
> 
> Kleber
> 
> > Can you guys do something like that ? Then we need the analyzer on
> > and/or the lab guys to look at the fabric trace & PHB trace.
> >
> > Ben.
> >
> >
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-11-24 23:15           ` Benjamin Herrenschmidt
@ 2013-11-26  0:11             ` Kleber Sacilotto de Souza
  2013-12-04 22:16               ` Kleber Sacilotto de Souza
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2013-11-26  0:11 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: dri-devel, Brian King, Jerome Glisse,
	Thadeu Lima de Souza Cascardo

On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>
>>>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>> cached pages.  I wonder if we need to use uncached pages on PPC.
>>> There is no such issue and no known bugs with DMA writes on those
>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>
>>> This needs further investigation by the lab/hw guys to find out what's
>>> actually happening on the bus and the host bridge.
>>>
>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>> continuously writes to a spare scratch register (thus triggering the
>>> corresponding writeback DMA) and checks the memory location to compare
>>> the writeback value (using a debugfs file for example, or mmap).
>>>
>> I can look into that.
> Any news ?
Only now I've got the bandwidth to actually take a look at this. I will 
start to write some code and let you guys know of the results.


Thanks,

Kleber

>
> Ben.
>
>> Thanks,
>>
>> Kleber
>>
>>> Can you guys do something like that ? Then we need the analyzer on
>>> and/or the lab guys to look at the fabric trace & PHB trace.
>>>
>>> Ben.
>>>
>>>
>>
>


-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-11-26  0:11             ` Kleber Sacilotto de Souza
@ 2013-12-04 22:16               ` Kleber Sacilotto de Souza
  2013-12-04 23:56                 ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2013-12-04 22:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Jerome Glisse, Alex Deucher
  Cc: Brian King, Thadeu Lima de Souza Cascardo, dri-devel

On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>
>>>>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>> cached pages.  I wonder if we need to use uncached pages on PPC.
>>>> There is no such issue and no known bugs with DMA writes on those
>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>
>>>> This needs further investigation by the lab/hw guys to find out what's
>>>> actually happening on the bus and the host bridge.
>>>>
>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>> continuously writes to a spare scratch register (thus triggering the
>>>> corresponding writeback DMA) and checks the memory location to compare
>>>> the writeback value (using a debugfs file for example, or mmap).

I was not able to reproduce the issue with this method, even after a 
weekend run.

However, doing some more investigation it seems the problem is here, 
where we read the ring rptr:

u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
				 struct radeon_ring *ring)
{
	u32 rptr;

	if (rdev->wb.enabled)
		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
	else
		rptr = RREG32(ring->rptr_reg);

	return rptr;
}

I realized that the DMA'ed rptr value always has the opposite byte order 
from the MMIO value. Since RREG32 already returns the register value in 
CPU byte order, it seems we don't need to byte-swap the DMA'ed 
value. If I remove the le32_to_cpu() call and use the DMA'ed value 
directly, I don't get the IB scheduling failures and piglit results are 
the same as with writeback disabled.

Is the adapter chipset swapping the bytes before doing the DMA to a 
big-endian host?
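
The double swap described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions, not driver
code: swap32(), model_be_le32_to_cpu() and model_wb_load() are
hypothetical names, and the big-endian host is simulated so the example
behaves the same on any machine.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word (what a 32-bit swap does). */
static uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/*
 * Hypothetical stand-in for le32_to_cpu() on a big-endian host, where
 * the conversion has to reverse the bytes. On a little-endian host the
 * real helper is a no-op.
 */
static uint32_t model_be_le32_to_cpu(uint32_t v)
{
	return swap32(v);
}

/*
 * Hypothetical model of what a big-endian CPU load returns from the
 * writeback slot:
 *   hw_swaps != 0: the device already stored the value in host order,
 *                  so the load returns rptr unchanged;
 *   hw_swaps == 0: the device stored little-endian bytes, so the load
 *                  returns the byte-reversed value.
 */
static uint32_t model_wb_load(uint32_t rptr, int hw_swaps)
{
	return hw_swaps ? rptr : swap32(rptr);
}
```

With hardware swapping on, model_wb_load(0x1234, 1) is already 0x1234,
and passing it through model_be_le32_to_cpu() turns it into 0x34120000,
which would match the bogus rptr seen here. With hardware swapping off,
the software conversion is what recovers 0x1234.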


-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-04 22:16               ` Kleber Sacilotto de Souza
@ 2013-12-04 23:56                 ` Alex Deucher
  2013-12-05  0:05                   ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-04 23:56 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza
<klebers@linux.vnet.ibm.com> wrote:
> On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
>>
>> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>>>
>>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>>>
>>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>>>
>>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>>
>>>>>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>>> cached pages.  I wonder if we need to use uncached pages on PPC.
>>>>>
>>>>> There is no such issue and no known bugs with DMA writes on those
>>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>>
>>>>> This needs further investigation by the lab/hw guys to find out what's
>>>>> actually happening on the bus and the host bridge.
>>>>>
>>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>>> continuously writes to a spare scratch register (thus triggering the
>>>>> corresponding writeback DMA) and checks the memory location to compare
>>>>> the writeback value (using a debugfs file for example, or mmap).
>
>
> I was not able to reproduce the issue with this method, even after a weekend
> run.
>
> However, doing some more investigation it seems the problem is here, where
> we read the ring rptr:
>
> u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
>                                  struct radeon_ring *ring)
> {
>         u32 rptr;
>
>         if (rdev->wb.enabled)
>                 rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
>         else
>                 rptr = RREG32(ring->rptr_reg);
>
>         return rptr;
> }
>
> I realized that the DMA'ed rptr value always has the opposite byte order
> from the MMIO value. Since RREG32 already returns the register value in
> CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I
> remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get
> the IB scheduling failures and piglit results are the same as with writeback
> disabled.
>
> Is the adapter chipset swapping the bytes before doing the DMA to a
> big-endian host?

Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
swapping for just about everything accessed by the CP (rptr writeback,
indirect buffers, etc.).  Looks like the DMA ring supports and enables
rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
I think we can drop the swapping of the rptr writeback.

Alex

>
>
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-04 23:56                 ` Alex Deucher
@ 2013-12-05  0:05                   ` Alex Deucher
  2013-12-05  1:39                     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-05  0:05 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

[-- Attachment #1: Type: text/plain, Size: 2875 bytes --]

On Wed, Dec 4, 2013 at 6:56 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza
> <klebers@linux.vnet.ibm.com> wrote:
>> On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote:
>>>
>>> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote:
>>>>
>>>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote:
>>>>>
>>>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote:
>>>>>>
>>>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote:
>>>>>>
>>>>>>> Weird.  I wonder if there is an issue with cache snoops on PPC.  We
>>>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with
>>>>>>> cached pages.  I wonder if we need to use uncached pages on PPC.
>>>>>>
>>>>>> There is no such issue and no known bugs with DMA writes on those
>>>>>> PCIe host bridges (and they do get hammered pretty bad here).
>>>>>>
>>>>>> This needs further investigation by the lab/hw guys to find out what's
>>>>>> actually happening on the bus and the host bridge.
>>>>>>
>>>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that
>>>>>> continuously writes to a spare scratch register (thus triggering the
>>>>>> corresponding writeback DMA) and checks the memory location to compare
>>>>>> the writeback value (using a debugfs file for example, or mmap).
>>
>>
>> I was not able to reproduce the issue with this method, even after a weekend
>> run.
>>
>> However, doing some more investigation it seems the problem is here, where
>> we read the ring rptr:
>>
>> u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
>>                                  struct radeon_ring *ring)
>> {
>>         u32 rptr;
>>
>>         if (rdev->wb.enabled)
>>                 rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
>>         else
>>                 rptr = RREG32(ring->rptr_reg);
>>
>>         return rptr;
>> }
>>
>> I realized that the DMA'ed rptr value always has the opposite byte order
>> from the MMIO value. Since RREG32 already returns the register value in
>> CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I
>> remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get
>> the IB scheduling failures and piglit results are the same as with writeback
>> disabled.
>>
>> Is the adapter chipset swapping the bytes before doing the DMA to a
>> big-endian host?
>
> Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
> swapping for just about everything accessed by the CP (rptr writeback,
> indirect buffers, etc.).  Looks like the DMA ring supports and enables
> rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
> I think we can drop the swapping of the rptr writeback.
>

Obvious patch attached.

Alex

> Alex
>
>>
>>
>>
>> --
>> Kleber Sacilotto de Souza
>> IBM Linux Technology Center
>>

[-- Attachment #2: 0001-drm-radeon-don-t-byteswap-readback-of-rptr-writeback.patch --]
[-- Type: text/x-diff, Size: 995 bytes --]

From 0f7705c378a14060552df19c0e724e47632da4d8 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Wed, 4 Dec 2013 19:02:08 -0500
Subject: [PATCH] drm/radeon: don't byteswap readback of rptr writeback

We already enable byteswapping of the rptr writeback
in the hw.  Fixes incorrect rptr readback on BE systems.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 9214403..56ff0cd 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -338,7 +338,7 @@ u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
 	u32 rptr;
 
 	if (rdev->wb.enabled)
-		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
 	else
 		rptr = RREG32(ring->rptr_reg);
 
-- 
1.8.3.1


[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-05  0:05                   ` Alex Deucher
@ 2013-12-05  1:39                     ` Benjamin Herrenschmidt
  2013-12-05  2:29                       ` Michel Dänzer
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2013-12-05  1:39 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote:

> > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
> > swapping for just about everything accessed by the CP (rptr writeback,
> > indirect buffers, etc.).  Looks like the DMA ring supports and enables
> > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
> > I think we can drop the swapping of the rptr writeback.
> >
> 
> Obvious patch attached.

This works all the way back to r300 ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-05  1:39                     ` Benjamin Herrenschmidt
@ 2013-12-05  2:29                       ` Michel Dänzer
  2013-12-05  4:06                         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 23+ messages in thread
From: Michel Dänzer @ 2013-12-05  2:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo,
	Maling list - DRI developers

On Thu, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote:
> 
> > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
> > > swapping for just about everything accessed by the CP (rptr writeback,
> > > indirect buffers, etc.).  Looks like the DMA ring supports and enables
> > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
> > > I think we can drop the swapping of the rptr writeback.
> > >
> > 
> > Obvious patch attached.
> 
> This works all the way back to r300 ?

I don't think so, as I have writeback working without this patch on the
RV350 in this PowerBook. So I think this function needs to be split,
probably between R600 and older.

Also, there's more code at least potentially affected by this, e.g. in: 

      * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(),
        cik_compute_ring_set_wptr(), cik_get_ih_wptr()
      * si_get_ih_wptr()
      * evergreen_get_ih_wptr()
      * r600_get_ih_wptr()
      * radeon_fence_write(), radeon_fence_read()
      * radeon_ring_backup()


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-05  2:29                       ` Michel Dänzer
@ 2013-12-05  4:06                         ` Benjamin Herrenschmidt
  2013-12-05 14:42                           ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2013-12-05  4:06 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Brian King, Jerome Glisse, Thadeu Lima de Souza Cascardo,
	Maling list - DRI developers

On Thu, 2013-12-05 at 11:29 +0900, Michel Dänzer wrote:
> On Thu, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote:
> > On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote:
> > 
> > > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
> > > > swapping for just about everything accessed by the CP (rptr writeback,
> > > > indirect buffers, etc.).  Looks like the DMA ring supports and enables
> > > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
> > > > I think we can drop the swapping of the rptr writeback.
> > > >
> > > 
> > > Obvious patch attached.
> > 
> > This works all the way back to r300 ?
> 
> I don't think so, as I have writeback working without this patch on the
> RV350 in this PowerBook. So I think this function needs to be split,
> probably between R600 and older.

Or can we tell it to not swap (setup code) and continue doing
le32_to_cpu in the kernel ? Sounds better to me.

> Also, there's more code at least potentially affected by this, e.g. in: 
> 
>       * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(),
>         cik_compute_ring_set_wptr(), cik_get_ih_wptr()
>       * si_get_ih_wptr()
>       * evergreen_get_ih_wptr()
>       * r600_get_ih_wptr()
>       * radeon_fence_write(), radeon_fence_read()
>       * radeon_ring_backup()

Yeah I'd say just don't swap, write LE to memory and let the kernel use
the right leXX_to_cpu.
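
A minimal sketch of that convention, assuming the device can be told to
store the value as little-endian bytes with no swapping. The names
sketch_device_write_le32() and sketch_le32_to_cpu() are hypothetical,
and the byte-wise decode stands in for the kernel's le32_to_cpu() so the
example does not depend on the host's byte order:

```c
#include <stdint.h>

/*
 * Hypothetical model of the device side: store a 32-bit value into
 * memory as little-endian bytes, with no hardware swapping involved.
 */
static void sketch_device_write_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Driver side: decode a little-endian 32-bit value byte by byte. The
 * result is the same on little-endian and big-endian hosts.
 */
static uint32_t sketch_le32_to_cpu(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

The memory layout is then fixed by the device, and every reader uses the
same conversion regardless of which CPU it runs on.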

HW swapping is evil :-)

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-05  4:06                         ` Benjamin Herrenschmidt
@ 2013-12-05 14:42                           ` Alex Deucher
  2013-12-06 13:58                             ` Kleber Sacilotto de Souza
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-05 14:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Brian King, Michel Dänzer, Jerome Glisse,
	Maling list - DRI developers, Thadeu Lima de Souza Cascardo

On Wed, Dec 4, 2013 at 11:06 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Thu, 2013-12-05 at 11:29 +0900, Michel Dänzer wrote:
>> On Thu, 2013-12-05 at 12:39 +1100, Benjamin Herrenschmidt wrote:
>> > On Wed, 2013-12-04 at 19:05 -0500, Alex Deucher wrote:
>> >
>> > > > Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte
>> > > > swapping for just about everything accessed by the CP (rptr writeback,
>> > > > indirect buffers, etc.).  Looks like the DMA ring supports and enables
>> > > > rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so
>> > > > I think we can drop the swapping of the rptr writeback.
>> > > >
>> > >
>> > > Obvious patch attached.
>> >
>> > This works all the way back to r300 ?
>>
>> I don't think so, as I have writeback working without this patch on the
>> RV350 in this PowerBook. So I think this function needs to be split,
>> probably between R600 and older.
>
> Or can we tell it to not swap (setup code) and continue doing
> le32_to_cpu in the kernel ? Sounds better to me.
>
>> Also, there's more code at least potentially affected by this, e.g. in:
>>
>>       * cik_compute_ring_get_rptr(), cik_compute_ring_get_wptr(),
>>         cik_compute_ring_set_wptr(), cik_get_ih_wptr()
>>       * si_get_ih_wptr()
>>       * evergreen_get_ih_wptr()
>>       * r600_get_ih_wptr()
>>       * radeon_fence_write(), radeon_fence_read()
>>       * radeon_ring_backup()
>
> Yeah I'd say just don't swap, write LE to memory and let the kernel use
> the right leXX_to_cpu.
>
> HW swapping is evil :-)

Well, we'd need to start swapping indirect buffers and the ring as
well then which would get tricky as the CP at least does not have
separate swapping controls for different things.  Probably easier to
fix up as appropriate for different asic families.  We have function
pointers for the rptr and wptr fetchers now, so we can do this pretty
cleanly.
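
Roughly, the per-family split could look like the sketch below. It is an
illustration only: every sketch_* name is hypothetical rather than the
driver's real structures, the big-endian host is simulated, and treating
pre-R600 parts as writing little-endian values is an assumption taken
from the RV350 report earlier in the thread.

```c
#include <stdint.h>

struct sketch_ring;

/* Per-family hooks, selected once when the ASIC is identified. */
struct sketch_ring_funcs {
	uint32_t (*get_rptr)(const struct sketch_ring *ring);
};

struct sketch_ring {
	uint32_t wb_slot;	/* writeback word as loaded by the CPU */
	const struct sketch_ring_funcs *funcs;
};

/* Reverse the four bytes of a 32-bit word. */
static uint32_t sketch_swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Stand-in for le32_to_cpu() on a big-endian host. */
static uint32_t sketch_be_le32_to_cpu(uint32_t v)
{
	return sketch_swap32(v);
}

/* Older-family model: memory holds LE, so the driver converts. */
static uint32_t sketch_r100_get_rptr(const struct sketch_ring *ring)
{
	return sketch_be_le32_to_cpu(ring->wb_slot);
}

/* R600+ model: the hw already swapped, so use the value as-is. */
static uint32_t sketch_r600_get_rptr(const struct sketch_ring *ring)
{
	return ring->wb_slot;
}

static const struct sketch_ring_funcs sketch_r100_funcs = {
	.get_rptr = sketch_r100_get_rptr,
};

static const struct sketch_ring_funcs sketch_r600_funcs = {
	.get_rptr = sketch_r600_get_rptr,
};

/* Callers go through the hook and never care which family it is. */
static uint32_t sketch_ring_get_rptr(const struct sketch_ring *ring)
{
	return ring->funcs->get_rptr(ring);
}
```

Both families then report the same rptr to the rest of the driver, and
the byte-order decision lives in one place per family.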

Alex

>
> Cheers,
> Ben.
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-05 14:42                           ` Alex Deucher
@ 2013-12-06 13:58                             ` Kleber Sacilotto de Souza
  2013-12-06 15:59                               ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2013-12-06 13:58 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On 12/05/2013 12:42 PM, Alex Deucher wrote:
> Well, we'd need to start swapping indirect buffers and the ring as
> well then which would get tricky as the CP at least does not have
> separate swapping controls for different things.  Probably easier to
> fix up as appropriate for different asic families.  We have function
> pointers for the rptr and wptr fetchers now, so we can do this pretty
> cleanly.
>
> Alex

Alex,

Are you going to send a patch to fix this?

If not, I don't have the knowledge of which asic families will need this 
fix, but if you inform me what needs to be done I can write the patch.


Thanks,

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-06 13:58                             ` Kleber Sacilotto de Souza
@ 2013-12-06 15:59                               ` Alex Deucher
  2013-12-10  0:48                                 ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-06 15:59 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Fri, Dec 6, 2013 at 8:58 AM, Kleber Sacilotto de Souza
<klebers@linux.vnet.ibm.com> wrote:
> On 12/05/2013 12:42 PM, Alex Deucher wrote:
>>
>> Well, we'd need to start swapping indirect buffers and the ring as
>> well then which would get tricky as the CP at least does not have
>> separate swapping controls for different things.  Probably easier to
>> fix up as appropriate for different asic families.  We have function
>> pointers for the rptr and wptr fetchers now, so we can do this pretty
>> cleanly.
>>
>> Alex
>
>
> Alex,
>
> Are you going to send a patch to fix this?
>
> If not, I don't have the knowledge of which asic families will need this
> fix, but if you inform me what needs to be done I can write the patch.

I'll try and send a patch out in the next few days.

Alex

>
>
> Thanks,
>
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-06 15:59                               ` Alex Deucher
@ 2013-12-10  0:48                                 ` Alex Deucher
  2013-12-10  2:20                                   ` Michel Dänzer
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-10  0:48 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: Michel Dänzer, Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

On Fri, Dec 6, 2013 at 10:59 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 8:58 AM, Kleber Sacilotto de Souza
> <klebers@linux.vnet.ibm.com> wrote:
>> On 12/05/2013 12:42 PM, Alex Deucher wrote:
>>>
>>> Well, we'd need to start swapping indirect buffers and the ring as
>>> well then which would get tricky as the CP at least does not have
>>> separate swapping controls for different things.  Probably easier to
>>> fix up as appropriate for different asic families.  We have function
>>> pointers for the rptr and wptr fetchers now, so we can do this pretty
>>> cleanly.
>>>
>>> Alex
>>
>>
>> Alex,
>>
>> Are you going to send a patch to fix this?
>>
>> If not, I don't have the knowledge of which asic families will need this
>> fix, but if you inform me what needs to be done I can write the patch.
>
> I'll try and send a patch out in the next few days.

Patch attached.  Compile tested only at the moment.

Alex

>
> Alex
>
>>
>>
>> Thanks,
>>
>>
>> --
>> Kleber Sacilotto de Souza
>> IBM Linux Technology Center
>>

[-- Attachment #2: 0001-drm-radeon-remove-generic-rptr-wptr-functions.patch --]
[-- Type: text/x-diff, Size: 33711 bytes --]

From 89b31dd54d2818e16e239d840aeda4007d5fedc6 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 9 Dec 2013 19:44:30 -0500
Subject: [PATCH] drm/radeon: remove generic rptr/wptr functions

Fill in asic family specific versions rather than
using the generic version.  This lets us handle asic
specific differences more easily.  In this case, we
disable sw swapping of the rptr writeback value on
r6xx+ since the hw does it for us.  Fixes bogus
rptr readback on BE systems.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/radeon/cik.c         | 52 ++++++++++++++++++---------
 drivers/gpu/drm/radeon/cik_sdma.c    | 69 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/radeon/evergreen.c   |  3 --
 drivers/gpu/drm/radeon/ni.c          | 69 +++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/radeon/ni_dma.c      | 69 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/radeon/r100.c        | 31 +++++++++++++++-
 drivers/gpu/drm/radeon/r600.c        | 32 +++++++++++++++--
 drivers/gpu/drm/radeon/r600_dma.c    | 13 +++++--
 drivers/gpu/drm/radeon/radeon.h      |  4 +--
 drivers/gpu/drm/radeon/radeon_asic.c | 66 +++++++++++++++++-----------------
 drivers/gpu/drm/radeon/radeon_asic.h | 57 ++++++++++++++++++++++-------
 drivers/gpu/drm/radeon/radeon_ring.c | 40 ++-------------------
 drivers/gpu/drm/radeon/rv770.c       |  3 --
 drivers/gpu/drm/radeon/si.c          |  8 -----
 14 files changed, 386 insertions(+), 130 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b43a3a3..df01d8a 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4015,15 +4015,43 @@ static int cik_cp_gfx_resume(struct radeon_device *rdev)
 	return 0;
 }
 
-u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring)
+u32 cik_gfx_get_rptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring)
 {
 	u32 rptr;
 
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(CP_RB0_RPTR);
+
+	return rptr;
+}
+
+u32 cik_gfx_get_wptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(CP_RB0_WPTR);
 
+	return wptr;
+}
+
+void cik_gfx_set_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	WREG32(CP_RB0_WPTR, ring->wptr);
+	(void)RREG32(CP_RB0_WPTR);
+}
+
+u32 cik_compute_get_rptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	u32 rptr;
 
 	if (rdev->wb.enabled) {
-		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
 	} else {
 		mutex_lock(&rdev->srbm_mutex);
 		cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0);
@@ -4035,13 +4063,13 @@ u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
 	return rptr;
 }
 
-u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring)
+u32 cik_compute_get_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
 {
 	u32 wptr;
 
 	if (rdev->wb.enabled) {
-		wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
+		wptr = rdev->wb.wb[ring->wptr_offs/4];
 	} else {
 		mutex_lock(&rdev->srbm_mutex);
 		cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0);
@@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
 	return wptr;
 }
 
-void cik_compute_ring_set_wptr(struct radeon_device *rdev,
-			       struct radeon_ring *ring)
+void cik_compute_set_wptr(struct radeon_device *rdev,
+			  struct radeon_ring *ring)
 {
 	rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
 	WDOORBELL32(ring->doorbell_index, ring->wptr);
@@ -7625,7 +7653,6 @@ static int cik_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7634,7 +7661,6 @@ static int cik_startup(struct radeon_device *rdev)
 	/* type-2 packets are deprecated on MEC, use type-3 instead */
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET,
-			     CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7646,7 +7672,6 @@ static int cik_startup(struct radeon_device *rdev)
 	/* type-2 packets are deprecated on MEC, use type-3 instead */
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET,
-			     CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7658,16 +7683,12 @@ static int cik_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET,
-			     SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET,
 			     SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET,
-			     SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET,
 			     SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0));
 	if (r)
 		return r;
@@ -7683,7 +7704,6 @@ static int cik_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c
index 0300727..f0f9e10 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -52,6 +52,75 @@ u32 cik_gpu_check_soft_reset(struct radeon_device *rdev);
  */
 
 /**
+ * cik_sdma_get_rptr - get the current read pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current rptr from the hardware (CIK+).
+ */
+uint32_t cik_sdma_get_rptr(struct radeon_device *rdev,
+			   struct radeon_ring *ring)
+{
+	u32 rptr, reg;
+
+	if (rdev->wb.enabled) {
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	} else {
+		if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+			reg = SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET;
+		else
+			reg = SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET;
+
+		rptr = RREG32(reg);
+	}
+
+	return (rptr & 0x3fffc) >> 2;
+}
+
+/**
+ * cik_sdma_get_wptr - get the current write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current wptr from the hardware (CIK+).
+ */
+uint32_t cik_sdma_get_wptr(struct radeon_device *rdev,
+			   struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET;
+	else
+		reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET;
+
+	return (RREG32(reg) & 0x3fffc) >> 2;
+}
+
+/**
+ * cik_sdma_set_wptr - commit the write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Write the wptr back to the hardware (CIK+).
+ */
+void cik_sdma_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET;
+	else
+		reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET;
+
+	WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+}
+
+/**
  * cik_sdma_ring_ib_execute - Schedule an IB on the DMA engine
  *
  * @rdev: radeon_device pointer
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 9702e55..292749a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -5199,14 +5199,12 @@ static int evergreen_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0));
 	if (r)
 		return r;
@@ -5224,7 +5222,6 @@ static int evergreen_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 11aab2a..4861dd5 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1386,6 +1386,55 @@ static void cayman_cp_enable(struct radeon_device *rdev, bool enable)
 	}
 }
 
+u32 cayman_gfx_get_rptr(struct radeon_device *rdev,
+			struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else {
+		if (ring->idx == RADEON_RING_TYPE_GFX_INDEX)
+			rptr = RREG32(CP_RB0_RPTR);
+		else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX)
+			rptr = RREG32(CP_RB1_RPTR);
+		else
+			rptr = RREG32(CP_RB2_RPTR);
+	}
+
+	return rptr;
+}
+
+u32 cayman_gfx_get_wptr(struct radeon_device *rdev,
+			struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	if (ring->idx == RADEON_RING_TYPE_GFX_INDEX)
+		wptr = RREG32(CP_RB0_WPTR);
+	else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX)
+		wptr = RREG32(CP_RB1_WPTR);
+	else
+		wptr = RREG32(CP_RB2_WPTR);
+
+	return wptr;
+}
+
+void cayman_gfx_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) {
+		WREG32(CP_RB0_WPTR, ring->wptr);
+		(void)RREG32(CP_RB0_WPTR);
+	} else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) {
+		WREG32(CP_RB1_WPTR, ring->wptr);
+		(void)RREG32(CP_RB1_WPTR);
+	} else {
+		WREG32(CP_RB2_WPTR, ring->wptr);
+		(void)RREG32(CP_RB2_WPTR);
+	}
+}
+
 static int cayman_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -1514,6 +1563,16 @@ static int cayman_cp_resume(struct radeon_device *rdev)
 		CP_RB1_BASE,
 		CP_RB2_BASE
 	};
+	static const unsigned cp_rb_rptr[] = {
+		CP_RB0_RPTR,
+		CP_RB1_RPTR,
+		CP_RB2_RPTR
+	};
+	static const unsigned cp_rb_wptr[] = {
+		CP_RB0_WPTR,
+		CP_RB1_WPTR,
+		CP_RB2_WPTR
+	};
 	struct radeon_ring *ring;
 	int i, r;
 
@@ -1572,8 +1631,8 @@ static int cayman_cp_resume(struct radeon_device *rdev)
 		WREG32_P(cp_rb_cntl[i], RB_RPTR_WR_ENA, ~RB_RPTR_WR_ENA);
 
 		ring->rptr = ring->wptr = 0;
-		WREG32(ring->rptr_reg, ring->rptr);
-		WREG32(ring->wptr_reg, ring->wptr);
+		WREG32(cp_rb_rptr[i], ring->rptr);
+		WREG32(cp_rb_wptr[i], ring->wptr);
 
 		mdelay(1);
 		WREG32_P(cp_rb_cntl[i], 0, ~RB_RPTR_WR_ENA);
@@ -1969,23 +2028,18 @@ static int cayman_startup(struct radeon_device *rdev)
 	evergreen_irq_set(rdev);
 
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA0_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA0_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA1_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA1_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
@@ -2004,7 +2058,6 @@ static int cayman_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c
index bdeb65e..51424ab 100644
--- a/drivers/gpu/drm/radeon/ni_dma.c
+++ b/drivers/gpu/drm/radeon/ni_dma.c
@@ -43,6 +43,75 @@ u32 cayman_gpu_check_soft_reset(struct radeon_device *rdev);
  */
 
 /**
+ * cayman_dma_get_rptr - get the current read pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current rptr from the hardware (cayman+).
+ */
+uint32_t cayman_dma_get_rptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring)
+{
+	u32 rptr, reg;
+
+	if (rdev->wb.enabled) {
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	} else {
+		if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+			reg = DMA_RB_RPTR + DMA0_REGISTER_OFFSET;
+		else
+			reg = DMA_RB_RPTR + DMA1_REGISTER_OFFSET;
+
+		rptr = RREG32(reg);
+	}
+
+	return (rptr & 0x3fffc) >> 2;
+}
+
+/**
+ * cayman_dma_get_wptr - get the current write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current wptr from the hardware (cayman+).
+ */
+uint32_t cayman_dma_get_wptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET;
+	else
+		reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET;
+
+	return (RREG32(reg) & 0x3fffc) >> 2;
+}
+
+/**
+ * cayman_dma_set_wptr - commit the write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Write the wptr back to the hardware (cayman+).
+ */
+void cayman_dma_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET;
+	else
+		reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET;
+
+	WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+}
+
+/**
  * cayman_dma_ring_ib_execute - Schedule an IB on the DMA engine
  *
  * @rdev: radeon_device pointer
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 10abc4d..ef1791b 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -1050,6 +1050,36 @@ static int r100_cp_init_microcode(struct radeon_device *rdev)
 	return err;
 }
 
+u32 r100_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
+	else
+		rptr = RREG32(RADEON_CP_RB_RPTR);
+
+	return rptr;
+}
+
+u32 r100_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(RADEON_CP_RB_WPTR);
+
+	return wptr;
+}
+
+void r100_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	WREG32(RADEON_CP_RB_WPTR, ring->wptr);
+	(void)RREG32(RADEON_CP_RB_WPTR);
+}
+
 static void r100_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -1102,7 +1132,6 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size)
 	ring_size = (1 << (rb_bufsz + 1)) * 4;
 	r100_cp_load_microcode(rdev);
 	r = radeon_ring_init(rdev, ring, ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     RADEON_CP_RB_RPTR, RADEON_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 9ad0673..a9034d8 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2382,6 +2382,36 @@ out:
 	return err;
 }
 
+u32 r600_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(R600_CP_RB_RPTR);
+
+	return rptr;
+}
+
+u32 r600_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(R600_CP_RB_WPTR);
+
+	return wptr;
+}
+
+void r600_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	WREG32(R600_CP_RB_WPTR, ring->wptr);
+	(void)RREG32(R600_CP_RB_WPTR);
+}
+
 static int r600_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -2826,14 +2856,12 @@ static int r600_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c
index 7844d15..3452c84 100644
--- a/drivers/gpu/drm/radeon/r600_dma.c
+++ b/drivers/gpu/drm/radeon/r600_dma.c
@@ -51,7 +51,14 @@ u32 r600_gpu_check_soft_reset(struct radeon_device *rdev);
 uint32_t r600_dma_get_rptr(struct radeon_device *rdev,
 			   struct radeon_ring *ring)
 {
-	return (radeon_ring_generic_get_rptr(rdev, ring) & 0x3fffc) >> 2;
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(DMA_RB_RPTR);
+
+	return (rptr & 0x3fffc) >> 2;
 }
 
 /**
@@ -65,7 +72,7 @@ uint32_t r600_dma_get_rptr(struct radeon_device *rdev,
 uint32_t r600_dma_get_wptr(struct radeon_device *rdev,
 			   struct radeon_ring *ring)
 {
-	return (RREG32(ring->wptr_reg) & 0x3fffc) >> 2;
+	return (RREG32(DMA_RB_WPTR) & 0x3fffc) >> 2;
 }
 
 /**
@@ -79,7 +86,7 @@ uint32_t r600_dma_get_wptr(struct radeon_device *rdev,
 void r600_dma_set_wptr(struct radeon_device *rdev,
 		       struct radeon_ring *ring)
 {
-	WREG32(ring->wptr_reg, (ring->wptr << 2) & 0x3fffc);
+	WREG32(DMA_RB_WPTR, (ring->wptr << 2) & 0x3fffc);
 }
 
 /**
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b1f990d..e7c02a7 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -779,13 +779,11 @@ struct radeon_ring {
 	volatile uint32_t	*ring;
 	unsigned		rptr;
 	unsigned		rptr_offs;
-	unsigned		rptr_reg;
 	unsigned		rptr_save_reg;
 	u64			next_rptr_gpu_addr;
 	volatile u32		*next_rptr_cpu_addr;
 	unsigned		wptr;
 	unsigned		wptr_old;
-	unsigned		wptr_reg;
 	unsigned		ring_size;
 	unsigned		ring_free_dw;
 	int			count_dw;
@@ -949,7 +947,7 @@ unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring *ring
 int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
 			unsigned size, uint32_t *data);
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ring_size,
-		     unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop);
+		     unsigned rptr_offs, u32 nop);
 void radeon_ring_fini(struct radeon_device *rdev, struct radeon_ring *cp);
 
 
diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c
index c0425bb..0447604 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.c
+++ b/drivers/gpu/drm/radeon/radeon_asic.c
@@ -182,9 +182,9 @@ static struct radeon_asic_ring r100_gfx_ring = {
 	.ring_test = &r100_ring_test,
 	.ib_test = &r100_ib_test,
 	.is_lockup = &r100_gpu_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r100_gfx_get_rptr,
+	.get_wptr = &r100_gfx_get_wptr,
+	.set_wptr = &r100_gfx_set_wptr,
 };
 
 static struct radeon_asic r100_asic = {
@@ -330,9 +330,9 @@ static struct radeon_asic_ring r300_gfx_ring = {
 	.ring_test = &r100_ring_test,
 	.ib_test = &r100_ib_test,
 	.is_lockup = &r100_gpu_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r100_gfx_get_rptr,
+	.get_wptr = &r100_gfx_get_wptr,
+	.set_wptr = &r100_gfx_set_wptr,
 };
 
 static struct radeon_asic r300_asic = {
@@ -883,9 +883,9 @@ static struct radeon_asic_ring r600_gfx_ring = {
 	.ring_test = &r600_ring_test,
 	.ib_test = &r600_ib_test,
 	.is_lockup = &r600_gfx_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r600_gfx_get_rptr,
+	.get_wptr = &r600_gfx_get_wptr,
+	.set_wptr = &r600_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring r600_dma_ring = {
@@ -1267,9 +1267,9 @@ static struct radeon_asic_ring evergreen_gfx_ring = {
 	.ring_test = &r600_ring_test,
 	.ib_test = &r600_ib_test,
 	.is_lockup = &evergreen_gfx_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r600_gfx_get_rptr,
+	.get_wptr = &r600_gfx_get_wptr,
+	.set_wptr = &r600_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring evergreen_dma_ring = {
@@ -1570,9 +1570,9 @@ static struct radeon_asic_ring cayman_gfx_ring = {
 	.ib_test = &r600_ib_test,
 	.is_lockup = &cayman_gfx_is_lockup,
 	.vm_flush = &cayman_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cayman_gfx_get_rptr,
+	.get_wptr = &cayman_gfx_get_wptr,
+	.set_wptr = &cayman_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring cayman_dma_ring = {
@@ -1585,9 +1585,9 @@ static struct radeon_asic_ring cayman_dma_ring = {
 	.ib_test = &r600_dma_ib_test,
 	.is_lockup = &cayman_dma_is_lockup,
 	.vm_flush = &cayman_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr
+	.get_rptr = &cayman_dma_get_rptr,
+	.get_wptr = &cayman_dma_get_wptr,
+	.set_wptr = &cayman_dma_set_wptr
 };
 
 static struct radeon_asic_ring cayman_uvd_ring = {
@@ -1813,9 +1813,9 @@ static struct radeon_asic_ring si_gfx_ring = {
 	.ib_test = &r600_ib_test,
 	.is_lockup = &si_gfx_is_lockup,
 	.vm_flush = &si_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cayman_gfx_get_rptr,
+	.get_wptr = &cayman_gfx_get_wptr,
+	.set_wptr = &cayman_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring si_dma_ring = {
@@ -1828,9 +1828,9 @@ static struct radeon_asic_ring si_dma_ring = {
 	.ib_test = &r600_dma_ib_test,
 	.is_lockup = &si_dma_is_lockup,
 	.vm_flush = &si_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr,
+	.get_rptr = &cayman_dma_get_rptr,
+	.get_wptr = &cayman_dma_get_wptr,
+	.set_wptr = &cayman_dma_set_wptr,
 };
 
 static struct radeon_asic si_asic = {
@@ -1943,9 +1943,9 @@ static struct radeon_asic_ring ci_gfx_ring = {
 	.ib_test = &cik_ib_test,
 	.is_lockup = &cik_gfx_is_lockup,
 	.vm_flush = &cik_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cik_gfx_get_rptr,
+	.get_wptr = &cik_gfx_get_wptr,
+	.set_wptr = &cik_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring ci_cp_ring = {
@@ -1958,9 +1958,9 @@ static struct radeon_asic_ring ci_cp_ring = {
 	.ib_test = &cik_ib_test,
 	.is_lockup = &cik_gfx_is_lockup,
 	.vm_flush = &cik_vm_flush,
-	.get_rptr = &cik_compute_ring_get_rptr,
-	.get_wptr = &cik_compute_ring_get_wptr,
-	.set_wptr = &cik_compute_ring_set_wptr,
+	.get_rptr = &cik_compute_get_rptr,
+	.get_wptr = &cik_compute_get_wptr,
+	.set_wptr = &cik_compute_set_wptr,
 };
 
 static struct radeon_asic_ring ci_dma_ring = {
@@ -1973,9 +1973,9 @@ static struct radeon_asic_ring ci_dma_ring = {
 	.ib_test = &cik_sdma_ib_test,
 	.is_lockup = &cik_sdma_is_lockup,
 	.vm_flush = &cik_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr,
+	.get_rptr = &cik_sdma_get_rptr,
+	.get_wptr = &cik_sdma_get_wptr,
+	.set_wptr = &cik_sdma_set_wptr,
 };
 
 static struct radeon_asic ci_asic = {
diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h
index c9fd97b..941e7eb 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.h
+++ b/drivers/gpu/drm/radeon/radeon_asic.h
@@ -47,13 +47,6 @@ u8 atombios_get_backlight_level(struct radeon_encoder *radeon_encoder);
 void radeon_legacy_set_backlight_level(struct radeon_encoder *radeon_encoder, u8 level);
 u8 radeon_legacy_get_backlight_level(struct radeon_encoder *radeon_encoder);
 
-u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring);
-u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring);
-void radeon_ring_generic_set_wptr(struct radeon_device *rdev,
-				  struct radeon_ring *ring);
-
 /*
  * r100,rv100,rs100,rv200,rs200
  */
@@ -148,6 +141,13 @@ extern void r100_post_page_flip(struct radeon_device *rdev, int crtc);
 extern void r100_wait_for_vblank(struct radeon_device *rdev, int crtc);
 extern int r100_mc_wait_for_idle(struct radeon_device *rdev);
 
+u32 r100_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 r100_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void r100_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
+
 /*
  * r200,rv250,rs300,rv280
  */
@@ -368,6 +368,12 @@ int r600_mc_wait_for_idle(struct radeon_device *rdev);
 int r600_pcie_gart_init(struct radeon_device *rdev);
 void r600_scratch_init(struct radeon_device *rdev);
 int r600_init_microcode(struct radeon_device *rdev);
+u32 r600_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 r600_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void r600_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
 /* r600 irq */
 int r600_irq_process(struct radeon_device *rdev);
 int r600_irq_init(struct radeon_device *rdev);
@@ -591,6 +597,19 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev,
 
 void cayman_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm);
 
+u32 cayman_gfx_get_rptr(struct radeon_device *rdev,
+			struct radeon_ring *ring);
+u32 cayman_gfx_get_wptr(struct radeon_device *rdev,
+			struct radeon_ring *ring);
+void cayman_gfx_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+uint32_t cayman_dma_get_rptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring);
+uint32_t cayman_dma_get_wptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring);
+void cayman_dma_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+
 int ni_dpm_init(struct radeon_device *rdev);
 void ni_dpm_setup_asic(struct radeon_device *rdev);
 int ni_dpm_enable(struct radeon_device *rdev);
@@ -739,12 +758,24 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev,
 			  uint32_t incr, uint32_t flags);
 void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm);
 int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib);
-u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring);
-u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring);
-void cik_compute_ring_set_wptr(struct radeon_device *rdev,
-			       struct radeon_ring *ring);
+u32 cik_gfx_get_rptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring);
+u32 cik_gfx_get_wptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring);
+void cik_gfx_set_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 cik_compute_get_rptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+u32 cik_compute_get_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+void cik_compute_set_wptr(struct radeon_device *rdev,
+			  struct radeon_ring *ring);
+u32 cik_sdma_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 cik_sdma_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void cik_sdma_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
 int ci_get_temp(struct radeon_device *rdev);
 int kv_get_temp(struct radeon_device *rdev);
 
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 9214403..b1771a6 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -332,36 +332,6 @@ bool radeon_ring_supports_scratch_reg(struct radeon_device *rdev,
 	}
 }
 
-u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring)
-{
-	u32 rptr;
-
-	if (rdev->wb.enabled)
-		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
-	else
-		rptr = RREG32(ring->rptr_reg);
-
-	return rptr;
-}
-
-u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring)
-{
-	u32 wptr;
-
-	wptr = RREG32(ring->wptr_reg);
-
-	return wptr;
-}
-
-void radeon_ring_generic_set_wptr(struct radeon_device *rdev,
-				  struct radeon_ring *ring)
-{
-	WREG32(ring->wptr_reg, ring->wptr);
-	(void)RREG32(ring->wptr_reg);
-}
-
 /**
  * radeon_ring_free_size - update the free size
  *
@@ -689,22 +659,18 @@ int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
  * @ring: radeon_ring structure holding ring information
  * @ring_size: size of the ring
  * @rptr_offs: offset of the rptr writeback location in the WB buffer
- * @rptr_reg: MMIO offset of the rptr register
- * @wptr_reg: MMIO offset of the wptr register
  * @nop: nop packet for this ring
  *
  * Initialize the driver information for the selected ring (all asics).
  * Returns 0 on success, error on failure.
  */
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsigned ring_size,
-		     unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop)
+		     unsigned rptr_offs, u32 nop)
 {
 	int r;
 
 	ring->ring_size = ring_size;
 	ring->rptr_offs = rptr_offs;
-	ring->rptr_reg = rptr_reg;
-	ring->wptr_reg = wptr_reg;
 	ring->nop = nop;
 	/* Allocate ring buffer */
 	if (ring->ring_obj == NULL) {
@@ -796,9 +762,9 @@ static int radeon_debugfs_ring_info(struct seq_file *m, void *data)
 	radeon_ring_free_size(rdev, ring);
 	count = (ring->ring_size / 4) - ring->ring_free_dw;
 	tmp = radeon_ring_get_wptr(rdev, ring);
-	seq_printf(m, "wptr(0x%04x): 0x%08x [%5d]\n", ring->wptr_reg, tmp, tmp);
+	seq_printf(m, "wptr: 0x%08x [%5d]\n", tmp, tmp);
 	tmp = radeon_ring_get_rptr(rdev, ring);
-	seq_printf(m, "rptr(0x%04x): 0x%08x [%5d]\n", ring->rptr_reg, tmp, tmp);
+	seq_printf(m, "rptr: 0x%08x [%5d]\n", tmp, tmp);
 	if (ring->rptr_save_reg) {
 		seq_printf(m, "rptr next(0x%04x): 0x%08x\n", ring->rptr_save_reg,
 			   RREG32(ring->rptr_save_reg));
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index 9f58467..6cce0de 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -1728,14 +1728,12 @@ static int rv770_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
@@ -1754,7 +1752,6 @@ static int rv770_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index a36736d..0a71a2e 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -6419,37 +6419,30 @@ static int si_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET,
-			     CP_RB1_RPTR, CP_RB1_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET,
-			     CP_RB2_RPTR, CP_RB2_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA0_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA0_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA1_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA1_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0));
 	if (r)
 		return r;
@@ -6469,7 +6462,6 @@ static int si_startup(struct radeon_device *rdev)
 		ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 		if (ring->ring_size) {
 			r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-					     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 					     RADEON_CP_PACKET2);
 			if (!r)
 				r = uvd_v1_0_init(rdev);
-- 
1.8.3.1


[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-10  0:48                                 ` Alex Deucher
@ 2013-12-10  2:20                                   ` Michel Dänzer
  2013-12-10 15:04                                     ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Michel Dänzer @ 2013-12-10  2:20 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Mailing list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
> 
> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
> -                             struct radeon_ring *ring)
> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
> +                        struct radeon_ring *ring)
>  {
>         u32 wptr;
>  
>         if (rdev->wb.enabled) {
> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>         } else {
>                 mutex_lock(&rdev->srbm_mutex);
>                 cik_srbm_select(rdev, ring->me, ring->pipe,
> ring->queue, 0);
> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
> radeon_device *rdev,
>         return wptr;
>  }
>  
> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
> -                              struct radeon_ring *ring)
> +void cik_compute_set_wptr(struct radeon_device *rdev,
> +                         struct radeon_ring *ring)
>  {
>         rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);

I think this cpu_to_le32() needs to be dropped as well to match
cik_compute_ring_get_wptr().
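
To illustrate the mismatch, here is a minimal user-space sketch. It assumes a
big-endian CPU, where cpu_to_le32() and le32_to_cpu() both amount to a 32-bit
byte swap. All names in it (swab32_sketch, wb_slot, set_wptr_swapped,
set_wptr_raw, get_wptr_raw) are made up for the example and are not kernel or
radeon API; wb_slot merely stands in for rdev->wb.wb[ring->wptr_offs/4].

```c
#include <stdint.h>

/* Hypothetical stand-in for cpu_to_le32() on a big-endian CPU,
 * where the conversion is a plain 32-bit byte swap. */
static uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* One writeback slot shared by the writer and the reader. */
static uint32_t wb_slot;

/* Writer that still converts, like the set_wptr variant quoted above. */
static void set_wptr_swapped(uint32_t wptr)
{
	wb_slot = swab32_sketch(wptr);
}

/* Writer with the conversion dropped. */
static void set_wptr_raw(uint32_t wptr)
{
	wb_slot = wptr;
}

/* Reader without conversion, like the patched get_wptr variant. */
static uint32_t get_wptr_raw(void)
{
	return wb_slot;
}
```

With the converting writer, the non-converting reader gets back a byte-swapped
value; with both conversions dropped, the value round-trips unchanged. On a
little-endian CPU the conversion is a no-op, so the mismatch would not show up
there.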


> diff --git a/drivers/gpu/drm/radeon/radeon.h
> b/drivers/gpu/drm/radeon/radeon.h
> index b1f990d..e7c02a7 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -779,13 +779,11 @@ struct radeon_ring {
>         volatile uint32_t       *ring;
>         unsigned                rptr;
>         unsigned                rptr_offs;
> -       unsigned                rptr_reg;
>         unsigned                rptr_save_reg;
>         u64                     next_rptr_gpu_addr;
>         volatile u32            *next_rptr_cpu_addr;
>         unsigned                wptr;
>         unsigned                wptr_old;
> -       unsigned                wptr_reg;

What's the motivation for removing these? Seems like keeping them would
allow keeping this patch and the resulting code smaller (fewer function
variants) and cleaner (no if/else/... for the register offsets in the
variants).
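
A rough user-space sketch of the two shapes being compared, under stated
assumptions: the register file is faked with an array, and the FAKE_* offsets,
struct ring_sketch, set_wptr_generic and set_wptr_by_index are all invented
for illustration. Only the (wptr << 2) & 0x3fffc encoding is taken from the
DMA functions in the patch.

```c
#include <stdint.h>

/* Hypothetical register offsets into a fake MMIO array. */
#define FAKE_DMA0_WPTR 0x10u
#define FAKE_DMA1_WPTR 0x20u

static uint32_t fake_mmio[64];

/* Cut-down ring: idx selects the engine, wptr_reg caches its offset. */
struct ring_sketch {
	unsigned idx;
	unsigned wptr_reg;
	uint32_t wptr;
};

/* Variant A: the offset lives in the ring, so one function serves
 * every engine that shares the same pointer encoding. */
static void set_wptr_generic(struct ring_sketch *ring)
{
	fake_mmio[ring->wptr_reg] = (ring->wptr << 2) & 0x3fffc;
}

/* Variant B: no offset in the ring; the function derives the
 * register from the ring index with an if/else. */
static void set_wptr_by_index(struct ring_sketch *ring)
{
	unsigned reg;

	if (ring->idx == 0)
		reg = FAKE_DMA0_WPTR;
	else
		reg = FAKE_DMA1_WPTR;

	fake_mmio[reg] = (ring->wptr << 2) & 0x3fffc;
}
```

Both variants write the same value to the same register; the difference is
only where the knowledge of the offset is kept.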


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer



* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-10  2:20                                   ` Michel Dänzer
@ 2013-12-10 15:04                                     ` Alex Deucher
  2013-12-10 15:12                                       ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-10 15:04 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Mailing list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>
>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>> -                             struct radeon_ring *ring)
>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>> +                        struct radeon_ring *ring)
>>  {
>>         u32 wptr;
>>
>>         if (rdev->wb.enabled) {
>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>         } else {
>>                 mutex_lock(&rdev->srbm_mutex);
>>                 cik_srbm_select(rdev, ring->me, ring->pipe,
>> ring->queue, 0);
>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>> radeon_device *rdev,
>>         return wptr;
>>  }
>>
>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>> -                              struct radeon_ring *ring)
>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>> +                         struct radeon_ring *ring)
>>  {
>>         rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>
> I think this cpu_to_le32() needs to be dropped as well to match
> cik_compute_ring_get_wptr().

whoops, yeah, missed that one.
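
A minimal sketch of why the setter and getter have to agree on swapping,
assuming a big-endian CPU, where cpu_to_le32() and le32_to_cpu() both
reduce to a 32-bit byte swap.  Every model_* name below is hypothetical
and only stands in for the driver code; none of it is radeon API.

```c
#include <assert.h>
#include <stdint.h>

/* Models the byte swap that cpu_to_le32()/le32_to_cpu() perform on a
 * big-endian CPU.  Written out by hand so the sketch behaves the same
 * on any host. */
uint32_t model_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) <<  8) |
	       ((v & 0x00ff0000u) >>  8) |
	       ((v & 0xff000000u) >> 24);
}

/* Hypothetical stand-in for rdev->wb.wb[ring->wptr_offs/4]. */
static uint32_t model_wb_slot;

/* Setter that still swaps, like the hunk quoted above (BE model). */
void model_set_wptr_swapped(uint32_t wptr)
{
	model_wb_slot = model_swab32(wptr);
}

/* Setter with the swap dropped, as suggested in the review. */
void model_set_wptr_plain(uint32_t wptr)
{
	model_wb_slot = wptr;
}

/* Getter without a swap, like the new cik_compute_get_wptr(). */
uint32_t model_get_wptr_plain(void)
{
	return model_wb_slot;
}

/* Returns 1 if a value survives a set/get round trip unchanged. */
int model_roundtrip_ok(void (*set)(uint32_t), uint32_t wptr)
{
	set(wptr);
	return model_get_wptr_plain() == wptr;
}

/* Usage: the mismatched pair corrupts the value, the matched one
 * does not. */
void model_selfcheck(void)
{
	assert(!model_roundtrip_ok(model_set_wptr_swapped, 0x10));
	assert(model_roundtrip_ok(model_set_wptr_plain, 0x10));
}
```

In this model, a wptr of 0x10 written through the swapping setter reads
back as 0x10000000 through the non-swapping getter.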

>
>
>> diff --git a/drivers/gpu/drm/radeon/radeon.h
>> b/drivers/gpu/drm/radeon/radeon.h
>> index b1f990d..e7c02a7 100644
>> --- a/drivers/gpu/drm/radeon/radeon.h
>> +++ b/drivers/gpu/drm/radeon/radeon.h
>> @@ -779,13 +779,11 @@ struct radeon_ring {
>>         volatile uint32_t       *ring;
>>         unsigned                rptr;
>>         unsigned                rptr_offs;
>> -       unsigned                rptr_reg;
>>         unsigned                rptr_save_reg;
>>         u64                     next_rptr_gpu_addr;
>>         volatile u32            *next_rptr_cpu_addr;
>>         unsigned                wptr;
>>         unsigned                wptr_old;
>> -       unsigned                wptr_reg;
>
> What's the motivation for removing these? Seems like keeping them would
> allow keeping this patch and the resulting code smaller (fewer function
> variants) and cleaner (no if/else/... for the register offsets in the
> variants).

I was trying to move the asic-specific info out of the ring struct
and into the asic-specific functions.

Alex
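
The trade-off discussed above can be sketched roughly as follows.  This
is only an illustration under simplifying assumptions: sketch_regs is a
hypothetical array standing in for MMIO space, and the sketch_* names
are made up for the example rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices and a fake register file. */
enum { SKETCH_RB0_WPTR, SKETCH_RB1_WPTR, SKETCH_RB2_WPTR, SKETCH_REG_COUNT };
static uint32_t sketch_regs[SKETCH_REG_COUNT];

struct sketch_ring {
	unsigned idx;		/* which ring this is */
	unsigned wptr;		/* software copy of the write pointer */
	unsigned wptr_reg;	/* only needed by the generic variant */
};

/* Variant 1: the register offset is stored in the ring struct, so a
 * single generic function serves every ring. */
void sketch_generic_set_wptr(struct sketch_ring *ring)
{
	assert(ring->wptr_reg < SKETCH_REG_COUNT);
	sketch_regs[ring->wptr_reg] = ring->wptr;
}

/* Variant 2: the asic-specific function derives the register from
 * ring->idx, so the ring struct carries no register offsets. */
void sketch_asic_set_wptr(struct sketch_ring *ring)
{
	if (ring->idx == 0)
		sketch_regs[SKETCH_RB0_WPTR] = ring->wptr;
	else if (ring->idx == 1)
		sketch_regs[SKETCH_RB1_WPTR] = ring->wptr;
	else
		sketch_regs[SKETCH_RB2_WPTR] = ring->wptr;
}

/* Read back a fake register. */
uint32_t sketch_read_reg(unsigned reg)
{
	assert(reg < SKETCH_REG_COUNT);
	return sketch_regs[reg];
}
```

Variant 1 keeps the code smaller, which is the point raised in the
review.  Variant 2 costs an if/else chain per function but keeps
asic-specific details such as swapping rules next to the register
access.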


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-10 15:04                                     ` Alex Deucher
@ 2013-12-10 15:12                                       ` Alex Deucher
  2014-01-02 20:54                                         ` Kleber Sacilotto de Souza
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2013-12-10 15:12 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Maling list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

[-- Attachment #1: Type: text/plain, Size: 2585 bytes --]

On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
>> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>>
>>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>>> -                             struct radeon_ring *ring)
>>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>>> +                        struct radeon_ring *ring)
>>>  {
>>>         u32 wptr;
>>>
>>>         if (rdev->wb.enabled) {
>>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>>         } else {
>>>                 mutex_lock(&rdev->srbm_mutex);
>>>                 cik_srbm_select(rdev, ring->me, ring->pipe,
>>> ring->queue, 0);
>>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>>> radeon_device *rdev,
>>>         return wptr;
>>>  }
>>>
>>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>>> -                              struct radeon_ring *ring)
>>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>>> +                         struct radeon_ring *ring)
>>>  {
>>>         rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>>
>> I think this cpu_to_le32() needs to be dropped as well to match
>> cik_compute_ring_get_wptr().
>
> whoops, yeah, missed that one.

Updated patch attached.

Alex

>
>>
>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon.h
>>> b/drivers/gpu/drm/radeon/radeon.h
>>> index b1f990d..e7c02a7 100644
>>> --- a/drivers/gpu/drm/radeon/radeon.h
>>> +++ b/drivers/gpu/drm/radeon/radeon.h
>>> @@ -779,13 +779,11 @@ struct radeon_ring {
>>>         volatile uint32_t       *ring;
>>>         unsigned                rptr;
>>>         unsigned                rptr_offs;
>>> -       unsigned                rptr_reg;
>>>         unsigned                rptr_save_reg;
>>>         u64                     next_rptr_gpu_addr;
>>>         volatile u32            *next_rptr_cpu_addr;
>>>         unsigned                wptr;
>>>         unsigned                wptr_old;
>>> -       unsigned                wptr_reg;
>>
>> What's the motivation for removing these? Seems like keeping them would
>> allow keeping this patch and the resulting code smaller (fewer function
>> variants) and cleaner (no if/else/... for the register offsets in the
>> variants).
>
> I was trying to move the asic-specific info out of the ring struct
> and into the asic-specific functions.
>
> Alex

[-- Attachment #2: 0001-drm-radeon-remove-generic-rptr-wptr-functions-v2.patch --]
[-- Type: text/x-diff, Size: 33918 bytes --]

From 44d447981cd247aea135ca864dc205d304cf3e5f Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 9 Dec 2013 19:44:30 -0500
Subject: [PATCH] drm/radeon: remove generic rptr/wptr functions (v2)

Fill in asic family specific versions rather than
using the generic version.  This lets us handle asic
specific differences more easily.  In this case, we
disable sw swapping of the rptr writeback value on
r6xx+ since the hw does it for us.  Fixes bogus
rptr readback on BE systems.

v2: remove missed cpu_to_le32(), add comments

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
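Note: a rough model of the double swap described in the commit message,
assuming the GPU stores the writeback value little-endian and that the
r6xx+ hardware swapper, when enabled, hands the CPU the value already in
native order.  The dswap_* names and the boolean flags are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Plain 32-bit byte swap. */
uint32_t dswap_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) <<  8) |
	       ((v & 0x00ff0000u) >>  8) |
	       ((v & 0xff000000u) >> 24);
}

/* Value a CPU load sees for a little-endian store of rptr.  A BE CPU
 * sees it byte-swapped unless the hardware already corrected it. */
uint32_t dswap_wb_as_loaded(uint32_t rptr, bool cpu_is_be, bool hw_swaps)
{
	if (cpu_is_be && !hw_swaps)
		return dswap_swab32(rptr);
	return rptr;
}

/* le32_to_cpu() modelled explicitly: identity on LE, swap on BE. */
uint32_t dswap_le32_to_cpu(uint32_t v, bool cpu_is_be)
{
	return cpu_is_be ? dswap_swab32(v) : v;
}

/* Usage: on BE, exactly one of the two swaps must be active. */
void dswap_selfcheck(void)
{
	uint32_t rptr = 0x40;

	/* No hw swap, sw swap kept (the r100 path in this patch): fine. */
	assert(dswap_le32_to_cpu(dswap_wb_as_loaded(rptr, true, false),
				 true) == rptr);
	/* Hw swap plus sw swap: the value is swapped twice and is bogus. */
	assert(dswap_le32_to_cpu(dswap_wb_as_loaded(rptr, true, true),
				 true) != rptr);
	/* Hw swap, sw swap dropped (the r6xx+ path in this patch): fine. */
	assert(dswap_wb_as_loaded(rptr, true, true) == rptr);
}
```

On a little-endian CPU every path in this model returns the value
unchanged, which fits the problem showing up only on BE systems.
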
 drivers/gpu/drm/radeon/cik.c         | 56 ++++++++++++++++++++---------
 drivers/gpu/drm/radeon/cik_sdma.c    | 69 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/radeon/evergreen.c   |  3 --
 drivers/gpu/drm/radeon/ni.c          | 69 +++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/radeon/ni_dma.c      | 69 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/radeon/r100.c        | 31 +++++++++++++++-
 drivers/gpu/drm/radeon/r600.c        | 32 +++++++++++++++--
 drivers/gpu/drm/radeon/r600_dma.c    | 13 +++++--
 drivers/gpu/drm/radeon/radeon.h      |  4 +--
 drivers/gpu/drm/radeon/radeon_asic.c | 66 +++++++++++++++++-----------------
 drivers/gpu/drm/radeon/radeon_asic.h | 57 ++++++++++++++++++++++-------
 drivers/gpu/drm/radeon/radeon_ring.c | 40 ++-------------------
 drivers/gpu/drm/radeon/rv770.c       |  3 --
 drivers/gpu/drm/radeon/si.c          |  8 -----
 14 files changed, 389 insertions(+), 131 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b43a3a3..9d7db10 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4015,15 +4015,43 @@ static int cik_cp_gfx_resume(struct radeon_device *rdev)
 	return 0;
 }
 
-u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring)
+u32 cik_gfx_get_rptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring)
 {
 	u32 rptr;
 
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(CP_RB0_RPTR);
+
+	return rptr;
+}
+
+u32 cik_gfx_get_wptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(CP_RB0_WPTR);
 
+	return wptr;
+}
+
+void cik_gfx_set_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	WREG32(CP_RB0_WPTR, ring->wptr);
+	(void)RREG32(CP_RB0_WPTR);
+}
+
+u32 cik_compute_get_rptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	u32 rptr;
 
 	if (rdev->wb.enabled) {
-		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
 	} else {
 		mutex_lock(&rdev->srbm_mutex);
 		cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0);
@@ -4035,13 +4063,14 @@ u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
 	return rptr;
 }
 
-u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring)
+u32 cik_compute_get_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
 {
 	u32 wptr;
 
 	if (rdev->wb.enabled) {
-		wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
+		/* XXX check if swapping is necessary on BE */
+		wptr = rdev->wb.wb[ring->wptr_offs/4];
 	} else {
 		mutex_lock(&rdev->srbm_mutex);
 		cik_srbm_select(rdev, ring->me, ring->pipe, ring->queue, 0);
@@ -4053,10 +4082,11 @@ u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
 	return wptr;
 }
 
-void cik_compute_ring_set_wptr(struct radeon_device *rdev,
-			       struct radeon_ring *ring)
+void cik_compute_set_wptr(struct radeon_device *rdev,
+			  struct radeon_ring *ring)
 {
-	rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
+	/* XXX check if swapping is necessary on BE */
+	rdev->wb.wb[ring->wptr_offs/4] = ring->wptr;
 	WDOORBELL32(ring->doorbell_index, ring->wptr);
 }
 
@@ -7625,7 +7655,6 @@ static int cik_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7634,7 +7663,6 @@ static int cik_startup(struct radeon_device *rdev)
 	/* type-2 packets are deprecated on MEC, use type-3 instead */
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET,
-			     CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7646,7 +7674,6 @@ static int cik_startup(struct radeon_device *rdev)
 	/* type-2 packets are deprecated on MEC, use type-3 instead */
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET,
-			     CP_HQD_PQ_RPTR, CP_HQD_PQ_WPTR,
 			     PACKET3(PACKET3_NOP, 0x3FFF));
 	if (r)
 		return r;
@@ -7658,16 +7685,12 @@ static int cik_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET,
-			     SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET,
 			     SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET,
-			     SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET,
 			     SDMA_PACKET(SDMA_OPCODE_NOP, 0, 0));
 	if (r)
 		return r;
@@ -7683,7 +7706,6 @@ static int cik_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c
index 0300727..f0f9e10 100644
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -52,6 +52,75 @@ u32 cik_gpu_check_soft_reset(struct radeon_device *rdev);
  */
 
 /**
+ * cik_sdma_get_rptr - get the current read pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current rptr from the hardware (CIK+).
+ */
+uint32_t cik_sdma_get_rptr(struct radeon_device *rdev,
+			   struct radeon_ring *ring)
+{
+	u32 rptr, reg;
+
+	if (rdev->wb.enabled) {
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	} else {
+		if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+			reg = SDMA0_GFX_RB_RPTR + SDMA0_REGISTER_OFFSET;
+		else
+			reg = SDMA0_GFX_RB_RPTR + SDMA1_REGISTER_OFFSET;
+
+		rptr = RREG32(reg);
+	}
+
+	return (rptr & 0x3fffc) >> 2;
+}
+
+/**
+ * cik_sdma_get_wptr - get the current write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current wptr from the hardware (CIK+).
+ */
+uint32_t cik_sdma_get_wptr(struct radeon_device *rdev,
+			   struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET;
+	else
+		reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET;
+
+	return (RREG32(reg) & 0x3fffc) >> 2;
+}
+
+/**
+ * cik_sdma_set_wptr - commit the write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Write the wptr back to the hardware (CIK+).
+ */
+void cik_sdma_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = SDMA0_GFX_RB_WPTR + SDMA0_REGISTER_OFFSET;
+	else
+		reg = SDMA0_GFX_RB_WPTR + SDMA1_REGISTER_OFFSET;
+
+	WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+}
+
+/**
  * cik_sdma_ring_ib_execute - Schedule an IB on the DMA engine
  *
  * @rdev: radeon_device pointer
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 9702e55..292749a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -5199,14 +5199,12 @@ static int evergreen_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0));
 	if (r)
 		return r;
@@ -5224,7 +5222,6 @@ static int evergreen_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 11aab2a..4861dd5 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1386,6 +1386,55 @@ static void cayman_cp_enable(struct radeon_device *rdev, bool enable)
 	}
 }
 
+u32 cayman_gfx_get_rptr(struct radeon_device *rdev,
+			struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else {
+		if (ring->idx == RADEON_RING_TYPE_GFX_INDEX)
+			rptr = RREG32(CP_RB0_RPTR);
+		else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX)
+			rptr = RREG32(CP_RB1_RPTR);
+		else
+			rptr = RREG32(CP_RB2_RPTR);
+	}
+
+	return rptr;
+}
+
+u32 cayman_gfx_get_wptr(struct radeon_device *rdev,
+			struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	if (ring->idx == RADEON_RING_TYPE_GFX_INDEX)
+		wptr = RREG32(CP_RB0_WPTR);
+	else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX)
+		wptr = RREG32(CP_RB1_WPTR);
+	else
+		wptr = RREG32(CP_RB2_WPTR);
+
+	return wptr;
+}
+
+void cayman_gfx_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	if (ring->idx == RADEON_RING_TYPE_GFX_INDEX) {
+		WREG32(CP_RB0_WPTR, ring->wptr);
+		(void)RREG32(CP_RB0_WPTR);
+	} else if (ring->idx == CAYMAN_RING_TYPE_CP1_INDEX) {
+		WREG32(CP_RB1_WPTR, ring->wptr);
+		(void)RREG32(CP_RB1_WPTR);
+	} else {
+		WREG32(CP_RB2_WPTR, ring->wptr);
+		(void)RREG32(CP_RB2_WPTR);
+	}
+}
+
 static int cayman_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -1514,6 +1563,16 @@ static int cayman_cp_resume(struct radeon_device *rdev)
 		CP_RB1_BASE,
 		CP_RB2_BASE
 	};
+	static const unsigned cp_rb_rptr[] = {
+		CP_RB0_RPTR,
+		CP_RB1_RPTR,
+		CP_RB2_RPTR
+	};
+	static const unsigned cp_rb_wptr[] = {
+		CP_RB0_WPTR,
+		CP_RB1_WPTR,
+		CP_RB2_WPTR
+	};
 	struct radeon_ring *ring;
 	int i, r;
 
@@ -1572,8 +1631,8 @@ static int cayman_cp_resume(struct radeon_device *rdev)
 		WREG32_P(cp_rb_cntl[i], RB_RPTR_WR_ENA, ~RB_RPTR_WR_ENA);
 
 		ring->rptr = ring->wptr = 0;
-		WREG32(ring->rptr_reg, ring->rptr);
-		WREG32(ring->wptr_reg, ring->wptr);
+		WREG32(cp_rb_rptr[i], ring->rptr);
+		WREG32(cp_rb_wptr[i], ring->wptr);
 
 		mdelay(1);
 		WREG32_P(cp_rb_cntl[i], 0, ~RB_RPTR_WR_ENA);
@@ -1969,23 +2028,18 @@ static int cayman_startup(struct radeon_device *rdev)
 	evergreen_irq_set(rdev);
 
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA0_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA0_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA1_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA1_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
@@ -2004,7 +2058,6 @@ static int cayman_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c
index bdeb65e..51424ab 100644
--- a/drivers/gpu/drm/radeon/ni_dma.c
+++ b/drivers/gpu/drm/radeon/ni_dma.c
@@ -43,6 +43,75 @@ u32 cayman_gpu_check_soft_reset(struct radeon_device *rdev);
  */
 
 /**
+ * cayman_dma_get_rptr - get the current read pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current rptr from the hardware (cayman+).
+ */
+uint32_t cayman_dma_get_rptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring)
+{
+	u32 rptr, reg;
+
+	if (rdev->wb.enabled) {
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	} else {
+		if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+			reg = DMA_RB_RPTR + DMA0_REGISTER_OFFSET;
+		else
+			reg = DMA_RB_RPTR + DMA1_REGISTER_OFFSET;
+
+		rptr = RREG32(reg);
+	}
+
+	return (rptr & 0x3fffc) >> 2;
+}
+
+/**
+ * cayman_dma_get_wptr - get the current write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Get the current wptr from the hardware (cayman+).
+ */
+uint32_t cayman_dma_get_wptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET;
+	else
+		reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET;
+
+	return (RREG32(reg) & 0x3fffc) >> 2;
+}
+
+/**
+ * cayman_dma_set_wptr - commit the write pointer
+ *
+ * @rdev: radeon_device pointer
+ * @ring: radeon ring pointer
+ *
+ * Write the wptr back to the hardware (cayman+).
+ */
+void cayman_dma_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring)
+{
+	u32 reg;
+
+	if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+		reg = DMA_RB_WPTR + DMA0_REGISTER_OFFSET;
+	else
+		reg = DMA_RB_WPTR + DMA1_REGISTER_OFFSET;
+
+	WREG32(reg, (ring->wptr << 2) & 0x3fffc);
+}
+
+/**
  * cayman_dma_ring_ib_execute - Schedule an IB on the DMA engine
  *
  * @rdev: radeon_device pointer
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 10abc4d..ef1791b 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -1050,6 +1050,36 @@ static int r100_cp_init_microcode(struct radeon_device *rdev)
 	return err;
 }
 
+u32 r100_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
+	else
+		rptr = RREG32(RADEON_CP_RB_RPTR);
+
+	return rptr;
+}
+
+u32 r100_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(RADEON_CP_RB_WPTR);
+
+	return wptr;
+}
+
+void r100_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	WREG32(RADEON_CP_RB_WPTR, ring->wptr);
+	(void)RREG32(RADEON_CP_RB_WPTR);
+}
+
 static void r100_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -1102,7 +1132,6 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size)
 	ring_size = (1 << (rb_bufsz + 1)) * 4;
 	r100_cp_load_microcode(rdev);
 	r = radeon_ring_init(rdev, ring, ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     RADEON_CP_RB_RPTR, RADEON_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 9ad0673..a9034d8 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2382,6 +2382,36 @@ out:
 	return err;
 }
 
+u32 r600_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(R600_CP_RB_RPTR);
+
+	return rptr;
+}
+
+u32 r600_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring)
+{
+	u32 wptr;
+
+	wptr = RREG32(R600_CP_RB_WPTR);
+
+	return wptr;
+}
+
+void r600_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring)
+{
+	WREG32(R600_CP_RB_WPTR, ring->wptr);
+	(void)RREG32(R600_CP_RB_WPTR);
+}
+
 static int r600_cp_load_microcode(struct radeon_device *rdev)
 {
 	const __be32 *fw_data;
@@ -2826,14 +2856,12 @@ static int r600_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c
index 7844d15..3452c84 100644
--- a/drivers/gpu/drm/radeon/r600_dma.c
+++ b/drivers/gpu/drm/radeon/r600_dma.c
@@ -51,7 +51,14 @@ u32 r600_gpu_check_soft_reset(struct radeon_device *rdev);
 uint32_t r600_dma_get_rptr(struct radeon_device *rdev,
 			   struct radeon_ring *ring)
 {
-	return (radeon_ring_generic_get_rptr(rdev, ring) & 0x3fffc) >> 2;
+	u32 rptr;
+
+	if (rdev->wb.enabled)
+		rptr = rdev->wb.wb[ring->rptr_offs/4];
+	else
+		rptr = RREG32(DMA_RB_RPTR);
+
+	return (rptr & 0x3fffc) >> 2;
 }
 
 /**
@@ -65,7 +72,7 @@ uint32_t r600_dma_get_rptr(struct radeon_device *rdev,
 uint32_t r600_dma_get_wptr(struct radeon_device *rdev,
 			   struct radeon_ring *ring)
 {
-	return (RREG32(ring->wptr_reg) & 0x3fffc) >> 2;
+	return (RREG32(DMA_RB_WPTR) & 0x3fffc) >> 2;
 }
 
 /**
@@ -79,7 +86,7 @@ uint32_t r600_dma_get_wptr(struct radeon_device *rdev,
 void r600_dma_set_wptr(struct radeon_device *rdev,
 		       struct radeon_ring *ring)
 {
-	WREG32(ring->wptr_reg, (ring->wptr << 2) & 0x3fffc);
+	WREG32(DMA_RB_WPTR, (ring->wptr << 2) & 0x3fffc);
 }
 
 /**
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b1f990d..e7c02a7 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -779,13 +779,11 @@ struct radeon_ring {
 	volatile uint32_t	*ring;
 	unsigned		rptr;
 	unsigned		rptr_offs;
-	unsigned		rptr_reg;
 	unsigned		rptr_save_reg;
 	u64			next_rptr_gpu_addr;
 	volatile u32		*next_rptr_cpu_addr;
 	unsigned		wptr;
 	unsigned		wptr_old;
-	unsigned		wptr_reg;
 	unsigned		ring_size;
 	unsigned		ring_free_dw;
 	int			count_dw;
@@ -949,7 +947,7 @@ unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring *ring
 int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
 			unsigned size, uint32_t *data);
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ring_size,
-		     unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop);
+		     unsigned rptr_offs, u32 nop);
 void radeon_ring_fini(struct radeon_device *rdev, struct radeon_ring *cp);
 
 
diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c
index c0425bb..0447604 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.c
+++ b/drivers/gpu/drm/radeon/radeon_asic.c
@@ -182,9 +182,9 @@ static struct radeon_asic_ring r100_gfx_ring = {
 	.ring_test = &r100_ring_test,
 	.ib_test = &r100_ib_test,
 	.is_lockup = &r100_gpu_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r100_gfx_get_rptr,
+	.get_wptr = &r100_gfx_get_wptr,
+	.set_wptr = &r100_gfx_set_wptr,
 };
 
 static struct radeon_asic r100_asic = {
@@ -330,9 +330,9 @@ static struct radeon_asic_ring r300_gfx_ring = {
 	.ring_test = &r100_ring_test,
 	.ib_test = &r100_ib_test,
 	.is_lockup = &r100_gpu_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r100_gfx_get_rptr,
+	.get_wptr = &r100_gfx_get_wptr,
+	.set_wptr = &r100_gfx_set_wptr,
 };
 
 static struct radeon_asic r300_asic = {
@@ -883,9 +883,9 @@ static struct radeon_asic_ring r600_gfx_ring = {
 	.ring_test = &r600_ring_test,
 	.ib_test = &r600_ib_test,
 	.is_lockup = &r600_gfx_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r600_gfx_get_rptr,
+	.get_wptr = &r600_gfx_get_wptr,
+	.set_wptr = &r600_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring r600_dma_ring = {
@@ -1267,9 +1267,9 @@ static struct radeon_asic_ring evergreen_gfx_ring = {
 	.ring_test = &r600_ring_test,
 	.ib_test = &r600_ib_test,
 	.is_lockup = &evergreen_gfx_is_lockup,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &r600_gfx_get_rptr,
+	.get_wptr = &r600_gfx_get_wptr,
+	.set_wptr = &r600_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring evergreen_dma_ring = {
@@ -1570,9 +1570,9 @@ static struct radeon_asic_ring cayman_gfx_ring = {
 	.ib_test = &r600_ib_test,
 	.is_lockup = &cayman_gfx_is_lockup,
 	.vm_flush = &cayman_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cayman_gfx_get_rptr,
+	.get_wptr = &cayman_gfx_get_wptr,
+	.set_wptr = &cayman_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring cayman_dma_ring = {
@@ -1585,9 +1585,9 @@ static struct radeon_asic_ring cayman_dma_ring = {
 	.ib_test = &r600_dma_ib_test,
 	.is_lockup = &cayman_dma_is_lockup,
 	.vm_flush = &cayman_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr
+	.get_rptr = &cayman_dma_get_rptr,
+	.get_wptr = &cayman_dma_get_wptr,
+	.set_wptr = &cayman_dma_set_wptr
 };
 
 static struct radeon_asic_ring cayman_uvd_ring = {
@@ -1813,9 +1813,9 @@ static struct radeon_asic_ring si_gfx_ring = {
 	.ib_test = &r600_ib_test,
 	.is_lockup = &si_gfx_is_lockup,
 	.vm_flush = &si_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cayman_gfx_get_rptr,
+	.get_wptr = &cayman_gfx_get_wptr,
+	.set_wptr = &cayman_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring si_dma_ring = {
@@ -1828,9 +1828,9 @@ static struct radeon_asic_ring si_dma_ring = {
 	.ib_test = &r600_dma_ib_test,
 	.is_lockup = &si_dma_is_lockup,
 	.vm_flush = &si_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr,
+	.get_rptr = &cayman_dma_get_rptr,
+	.get_wptr = &cayman_dma_get_wptr,
+	.set_wptr = &cayman_dma_set_wptr,
 };
 
 static struct radeon_asic si_asic = {
@@ -1943,9 +1943,9 @@ static struct radeon_asic_ring ci_gfx_ring = {
 	.ib_test = &cik_ib_test,
 	.is_lockup = &cik_gfx_is_lockup,
 	.vm_flush = &cik_vm_flush,
-	.get_rptr = &radeon_ring_generic_get_rptr,
-	.get_wptr = &radeon_ring_generic_get_wptr,
-	.set_wptr = &radeon_ring_generic_set_wptr,
+	.get_rptr = &cik_gfx_get_rptr,
+	.get_wptr = &cik_gfx_get_wptr,
+	.set_wptr = &cik_gfx_set_wptr,
 };
 
 static struct radeon_asic_ring ci_cp_ring = {
@@ -1958,9 +1958,9 @@ static struct radeon_asic_ring ci_cp_ring = {
 	.ib_test = &cik_ib_test,
 	.is_lockup = &cik_gfx_is_lockup,
 	.vm_flush = &cik_vm_flush,
-	.get_rptr = &cik_compute_ring_get_rptr,
-	.get_wptr = &cik_compute_ring_get_wptr,
-	.set_wptr = &cik_compute_ring_set_wptr,
+	.get_rptr = &cik_compute_get_rptr,
+	.get_wptr = &cik_compute_get_wptr,
+	.set_wptr = &cik_compute_set_wptr,
 };
 
 static struct radeon_asic_ring ci_dma_ring = {
@@ -1973,9 +1973,9 @@ static struct radeon_asic_ring ci_dma_ring = {
 	.ib_test = &cik_sdma_ib_test,
 	.is_lockup = &cik_sdma_is_lockup,
 	.vm_flush = &cik_dma_vm_flush,
-	.get_rptr = &r600_dma_get_rptr,
-	.get_wptr = &r600_dma_get_wptr,
-	.set_wptr = &r600_dma_set_wptr,
+	.get_rptr = &cik_sdma_get_rptr,
+	.get_wptr = &cik_sdma_get_wptr,
+	.set_wptr = &cik_sdma_set_wptr,
 };
 
 static struct radeon_asic ci_asic = {
diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h
index c9fd97b..941e7eb 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.h
+++ b/drivers/gpu/drm/radeon/radeon_asic.h
@@ -47,13 +47,6 @@ u8 atombios_get_backlight_level(struct radeon_encoder *radeon_encoder);
 void radeon_legacy_set_backlight_level(struct radeon_encoder *radeon_encoder, u8 level);
 u8 radeon_legacy_get_backlight_level(struct radeon_encoder *radeon_encoder);
 
-u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring);
-u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring);
-void radeon_ring_generic_set_wptr(struct radeon_device *rdev,
-				  struct radeon_ring *ring);
-
 /*
  * r100,rv100,rs100,rv200,rs200
  */
@@ -148,6 +141,13 @@ extern void r100_post_page_flip(struct radeon_device *rdev, int crtc);
 extern void r100_wait_for_vblank(struct radeon_device *rdev, int crtc);
 extern int r100_mc_wait_for_idle(struct radeon_device *rdev);
 
+u32 r100_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 r100_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void r100_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
+
 /*
  * r200,rv250,rs300,rv280
  */
@@ -368,6 +368,12 @@ int r600_mc_wait_for_idle(struct radeon_device *rdev);
 int r600_pcie_gart_init(struct radeon_device *rdev);
 void r600_scratch_init(struct radeon_device *rdev);
 int r600_init_microcode(struct radeon_device *rdev);
+u32 r600_gfx_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 r600_gfx_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void r600_gfx_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
 /* r600 irq */
 int r600_irq_process(struct radeon_device *rdev);
 int r600_irq_init(struct radeon_device *rdev);
@@ -591,6 +597,19 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev,
 
 void cayman_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm);
 
+u32 cayman_gfx_get_rptr(struct radeon_device *rdev,
+			struct radeon_ring *ring);
+u32 cayman_gfx_get_wptr(struct radeon_device *rdev,
+			struct radeon_ring *ring);
+void cayman_gfx_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+uint32_t cayman_dma_get_rptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring);
+uint32_t cayman_dma_get_wptr(struct radeon_device *rdev,
+			     struct radeon_ring *ring);
+void cayman_dma_set_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+
 int ni_dpm_init(struct radeon_device *rdev);
 void ni_dpm_setup_asic(struct radeon_device *rdev);
 int ni_dpm_enable(struct radeon_device *rdev);
@@ -739,12 +758,24 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev,
 			  uint32_t incr, uint32_t flags);
 void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm);
 int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib);
-u32 cik_compute_ring_get_rptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring);
-u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
-			      struct radeon_ring *ring);
-void cik_compute_ring_set_wptr(struct radeon_device *rdev,
-			       struct radeon_ring *ring);
+u32 cik_gfx_get_rptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring);
+u32 cik_gfx_get_wptr(struct radeon_device *rdev,
+		     struct radeon_ring *ring);
+void cik_gfx_set_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 cik_compute_get_rptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+u32 cik_compute_get_wptr(struct radeon_device *rdev,
+			 struct radeon_ring *ring);
+void cik_compute_set_wptr(struct radeon_device *rdev,
+			  struct radeon_ring *ring);
+u32 cik_sdma_get_rptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+u32 cik_sdma_get_wptr(struct radeon_device *rdev,
+		      struct radeon_ring *ring);
+void cik_sdma_set_wptr(struct radeon_device *rdev,
+		       struct radeon_ring *ring);
 int ci_get_temp(struct radeon_device *rdev);
 int kv_get_temp(struct radeon_device *rdev);
 
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 9214403..b1771a6 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -332,36 +332,6 @@ bool radeon_ring_supports_scratch_reg(struct radeon_device *rdev,
 	}
 }
 
-u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring)
-{
-	u32 rptr;
-
-	if (rdev->wb.enabled)
-		rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]);
-	else
-		rptr = RREG32(ring->rptr_reg);
-
-	return rptr;
-}
-
-u32 radeon_ring_generic_get_wptr(struct radeon_device *rdev,
-				 struct radeon_ring *ring)
-{
-	u32 wptr;
-
-	wptr = RREG32(ring->wptr_reg);
-
-	return wptr;
-}
-
-void radeon_ring_generic_set_wptr(struct radeon_device *rdev,
-				  struct radeon_ring *ring)
-{
-	WREG32(ring->wptr_reg, ring->wptr);
-	(void)RREG32(ring->wptr_reg);
-}
-
 /**
  * radeon_ring_free_size - update the free size
  *
@@ -689,22 +659,18 @@ int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
  * @ring: radeon_ring structure holding ring information
  * @ring_size: size of the ring
  * @rptr_offs: offset of the rptr writeback location in the WB buffer
- * @rptr_reg: MMIO offset of the rptr register
- * @wptr_reg: MMIO offset of the wptr register
  * @nop: nop packet for this ring
  *
  * Initialize the driver information for the selected ring (all asics).
  * Returns 0 on success, error on failure.
  */
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsigned ring_size,
-		     unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg, u32 nop)
+		     unsigned rptr_offs, u32 nop)
 {
 	int r;
 
 	ring->ring_size = ring_size;
 	ring->rptr_offs = rptr_offs;
-	ring->rptr_reg = rptr_reg;
-	ring->wptr_reg = wptr_reg;
 	ring->nop = nop;
 	/* Allocate ring buffer */
 	if (ring->ring_obj == NULL) {
@@ -796,9 +762,9 @@ static int radeon_debugfs_ring_info(struct seq_file *m, void *data)
 	radeon_ring_free_size(rdev, ring);
 	count = (ring->ring_size / 4) - ring->ring_free_dw;
 	tmp = radeon_ring_get_wptr(rdev, ring);
-	seq_printf(m, "wptr(0x%04x): 0x%08x [%5d]\n", ring->wptr_reg, tmp, tmp);
+	seq_printf(m, "wptr: 0x%08x [%5d]\n", tmp, tmp);
 	tmp = radeon_ring_get_rptr(rdev, ring);
-	seq_printf(m, "rptr(0x%04x): 0x%08x [%5d]\n", ring->rptr_reg, tmp, tmp);
+	seq_printf(m, "rptr: 0x%08x [%5d]\n", tmp, tmp);
 	if (ring->rptr_save_reg) {
 		seq_printf(m, "rptr next(0x%04x): 0x%08x\n", ring->rptr_save_reg,
 			   RREG32(ring->rptr_save_reg));
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index 9f58467..6cce0de 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -1728,14 +1728,12 @@ static int rv770_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     R600_CP_RB_RPTR, R600_CP_RB_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR, DMA_RB_WPTR,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0));
 	if (r)
 		return r;
@@ -1754,7 +1752,6 @@ static int rv770_startup(struct radeon_device *rdev)
 	ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 	if (ring->ring_size) {
 		r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-				     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 				     RADEON_CP_PACKET2);
 		if (!r)
 			r = uvd_v1_0_init(rdev);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index a36736d..0a71a2e 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -6419,37 +6419,30 @@ static int si_startup(struct radeon_device *rdev)
 
 	ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP_RPTR_OFFSET,
-			     CP_RB0_RPTR, CP_RB0_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP1_RPTR_OFFSET,
-			     CP_RB1_RPTR, CP_RB1_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, RADEON_WB_CP2_RPTR_OFFSET,
-			     CP_RB2_RPTR, CP_RB2_WPTR,
 			     RADEON_CP_PACKET2);
 	if (r)
 		return r;
 
 	ring = &rdev->ring[R600_RING_TYPE_DMA_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, R600_WB_DMA_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA0_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA0_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0));
 	if (r)
 		return r;
 
 	ring = &rdev->ring[CAYMAN_RING_TYPE_DMA1_INDEX];
 	r = radeon_ring_init(rdev, ring, ring->ring_size, CAYMAN_WB_DMA1_RPTR_OFFSET,
-			     DMA_RB_RPTR + DMA1_REGISTER_OFFSET,
-			     DMA_RB_WPTR + DMA1_REGISTER_OFFSET,
 			     DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0, 0));
 	if (r)
 		return r;
@@ -6469,7 +6462,6 @@ static int si_startup(struct radeon_device *rdev)
 		ring = &rdev->ring[R600_RING_TYPE_UVD_INDEX];
 		if (ring->ring_size) {
 			r = radeon_ring_init(rdev, ring, ring->ring_size, 0,
-					     UVD_RBC_RB_RPTR, UVD_RBC_RB_WPTR,
 					     RADEON_CP_PACKET2);
 			if (!r)
 				r = uvd_v1_0_init(rdev);
-- 
1.8.3.1
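
The direction of this patch (dropping the generic helpers keyed on rptr_reg/wptr_reg in favour of per-ring get/set functions) can be modelled as a small table of callbacks attached to each ring. What follows is a minimal user-space sketch under stated assumptions, not the driver code: every identifier ending in _sketch is hypothetical, and plain struct fields stand in for the MMIO registers.

```c
#include <assert.h>
#include <stdint.h>

struct ring_sketch;

/* Hypothetical per-ring callback table; one instance per ring type. */
struct ring_ops_sketch {
	uint32_t (*get_rptr)(struct ring_sketch *ring);
	uint32_t (*get_wptr)(struct ring_sketch *ring);
	void (*set_wptr)(struct ring_sketch *ring);
};

struct ring_sketch {
	const struct ring_ops_sketch *ops;
	uint32_t wptr;		/* driver-side copy of the write pointer */
	uint32_t hw_rptr;	/* stand-in for the hardware rptr register */
	uint32_t hw_wptr;	/* stand-in for the hardware wptr register */
};

/* "gfx" flavour: reads and writes go straight to the fake registers. */
static uint32_t gfx_get_rptr_sketch(struct ring_sketch *ring)
{
	return ring->hw_rptr;
}

static uint32_t gfx_get_wptr_sketch(struct ring_sketch *ring)
{
	return ring->hw_wptr;
}

static void gfx_set_wptr_sketch(struct ring_sketch *ring)
{
	ring->hw_wptr = ring->wptr;
}

static const struct ring_ops_sketch gfx_ops_sketch = {
	.get_rptr = gfx_get_rptr_sketch,
	.get_wptr = gfx_get_wptr_sketch,
	.set_wptr = gfx_set_wptr_sketch,
};

/* Callers no longer need to know which register backs which ring. */
static uint32_t ring_get_rptr_sketch(struct ring_sketch *ring)
{
	return ring->ops->get_rptr(ring);
}

static uint32_t ring_get_wptr_sketch(struct ring_sketch *ring)
{
	return ring->ops->get_wptr(ring);
}

static void ring_set_wptr_sketch(struct ring_sketch *ring)
{
	ring->ops->set_wptr(ring);
}

/* Small self-check exercising the dispatch path. */
static void check_ring_ops_sketch(void)
{
	struct ring_sketch ring = { .ops = &gfx_ops_sketch, .hw_rptr = 16 };

	ring.wptr = 64;
	ring_set_wptr_sketch(&ring);
	assert(ring_get_wptr_sketch(&ring) == 64);
	assert(ring_get_rptr_sketch(&ring) == 16);
}
```

With this shape, a ring type that needs different handling (for example, reading a pointer from a writeback buffer rather than a register) only supplies its own three functions; the init path no longer has to carry register offsets, which matches the shortened radeon_ring_init() argument list in the hunks above.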



* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2013-12-10 15:12                                       ` Alex Deucher
@ 2014-01-02 20:54                                         ` Kleber Sacilotto de Souza
  2014-01-02 22:48                                           ` Alex Deucher
  0 siblings, 1 reply; 23+ messages in thread
From: Kleber Sacilotto de Souza @ 2014-01-02 20:54 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Michel Dänzer, Mailing list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On 12/10/2013 01:12 PM, Alex Deucher wrote:
> On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
>> On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>>>
>>>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>>>> -                             struct radeon_ring *ring)
>>>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>>>> +                        struct radeon_ring *ring)
>>>>   {
>>>>          u32 wptr;
>>>>
>>>>          if (rdev->wb.enabled) {
>>>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>>>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>>>          } else {
>>>>                  mutex_lock(&rdev->srbm_mutex);
>>>>                  cik_srbm_select(rdev, ring->me, ring->pipe,
>>>> ring->queue, 0);
>>>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>>>> radeon_device *rdev,
>>>>          return wptr;
>>>>   }
>>>>
>>>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>>>> -                              struct radeon_ring *ring)
>>>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>>>> +                         struct radeon_ring *ring)
>>>>   {
>>>>          rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>>>
>>> I think this cpu_to_le32() needs to be dropped as well to match
>>> cik_compute_ring_get_wptr().
>>
>> whoops, yeah, missed that one.
>
> Updated patch attached.
>
> Alex
>

Hi, Alex.

Has this patch been accepted upstream? I've tested it successfully on a 
ppc64 system with a FirePro 2270 adapter.


Thanks,


-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center
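
The review point quoted above (dropping cpu_to_le32() in the setter once le32_to_cpu() is gone from the getter) comes down to keeping the store and load sides symmetric. The following is a minimal user-space sketch of why a mismatch only bites on big-endian hosts. It assumes a big-endian CPU, where both conversions amount to a byte swap. All names ending in _sketch are hypothetical and nothing here is taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for cpu_to_le32()/le32_to_cpu() as seen by a big-endian CPU. */
static uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Stand-in for one 32-bit slot of the writeback buffer. */
static uint32_t wb_slot_sketch;

static void set_wptr_swapped_sketch(uint32_t wptr)
{
	wb_slot_sketch = swab32_sketch(wptr);
}

static void set_wptr_raw_sketch(uint32_t wptr)
{
	wb_slot_sketch = wptr;
}

static uint32_t get_wptr_swapped_sketch(void)
{
	return swab32_sketch(wb_slot_sketch);
}

static uint32_t get_wptr_raw_sketch(void)
{
	return wb_slot_sketch;
}

static void check_pairing_sketch(void)
{
	/* Matched pairs round-trip the value. */
	set_wptr_swapped_sketch(0x100u);
	assert(get_wptr_swapped_sketch() == 0x100u);
	set_wptr_raw_sketch(0x100u);
	assert(get_wptr_raw_sketch() == 0x100u);

	/* Swapping store with raw load: the reader sees a swapped value. */
	set_wptr_swapped_sketch(0x100u);
	assert(get_wptr_raw_sketch() == 0x00010000u);
}
```

On a little-endian host both conversions are no-ops, so the mismatched pair would go unnoticed there. Whether the raw or the swapped form is the right one for a given ring depends on how the hardware itself writes the buffer, which this sketch does not model.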


* Re: [PATCH] drm/radeon: Disable writeback by default on ppc
  2014-01-02 20:54                                         ` Kleber Sacilotto de Souza
@ 2014-01-02 22:48                                           ` Alex Deucher
  0 siblings, 0 replies; 23+ messages in thread
From: Alex Deucher @ 2014-01-02 22:48 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: Michel Dänzer, Mailing list - DRI developers, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Brian King

On Thu, Jan 2, 2014 at 3:54 PM, Kleber Sacilotto de Souza
<klebers@linux.vnet.ibm.com> wrote:
> On 12/10/2013 01:12 PM, Alex Deucher wrote:
>>
>> On Tue, Dec 10, 2013 at 10:04 AM, Alex Deucher <alexdeucher@gmail.com>
>> wrote:
>>>
>>> On Mon, Dec 9, 2013 at 9:20 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>>>
>>>> On Mon, 2013-12-09 at 19:48 -0500, Alex Deucher wrote:
>>>>>
>>>>>
>>>>> -u32 cik_compute_ring_get_wptr(struct radeon_device *rdev,
>>>>> -                             struct radeon_ring *ring)
>>>>> +u32 cik_compute_get_wptr(struct radeon_device *rdev,
>>>>> +                        struct radeon_ring *ring)
>>>>>   {
>>>>>          u32 wptr;
>>>>>
>>>>>          if (rdev->wb.enabled) {
>>>>> -               wptr = le32_to_cpu(rdev->wb.wb[ring->wptr_offs/4]);
>>>>> +               wptr = rdev->wb.wb[ring->wptr_offs/4];
>>>>>          } else {
>>>>>                  mutex_lock(&rdev->srbm_mutex);
>>>>>                  cik_srbm_select(rdev, ring->me, ring->pipe,
>>>>> ring->queue, 0);
>>>>> @@ -4053,8 +4081,8 @@ u32 cik_compute_ring_get_wptr(struct
>>>>> radeon_device *rdev,
>>>>>          return wptr;
>>>>>   }
>>>>>
>>>>> -void cik_compute_ring_set_wptr(struct radeon_device *rdev,
>>>>> -                              struct radeon_ring *ring)
>>>>> +void cik_compute_set_wptr(struct radeon_device *rdev,
>>>>> +                         struct radeon_ring *ring)
>>>>>   {
>>>>>          rdev->wb.wb[ring->wptr_offs/4] = cpu_to_le32(ring->wptr);
>>>>
>>>>
>>>> I think this cpu_to_le32() needs to be dropped as well to match
>>>> cik_compute_ring_get_wptr().
>>>
>>>
>>> whoops, yeah, missed that one.
>>
>>
>> Updated patch attached.
>>
>> Alex
>>
>
> Hi, Alex.
>
> Has this patch been accepted upstream? I've tested it successfully on a
> ppc64 system with a FirePro 2270 adapter.

Not yet, I was planning to upstream it for 3.14 unless there are objections.
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.14-wip

Alex

>
>
>
> Thanks,
>
>
> --
> Kleber Sacilotto de Souza
> IBM Linux Technology Center
>


end of thread, newest message 2014-01-02 22:48 UTC

Thread overview: 23+ messages
2013-06-17 14:06 [PATCH] drm/radeon: Disable writeback by default on ppc Adam Jackson
2013-06-17 15:04 ` Alex Deucher
2013-06-17 16:07   ` Adam Jackson
2013-06-17 22:57     ` Alex Deucher
2013-11-07 22:29       ` Benjamin Herrenschmidt
2013-11-08 13:43         ` Kleber Sacilotto de Souza
2013-11-24 23:15           ` Benjamin Herrenschmidt
2013-11-26  0:11             ` Kleber Sacilotto de Souza
2013-12-04 22:16               ` Kleber Sacilotto de Souza
2013-12-04 23:56                 ` Alex Deucher
2013-12-05  0:05                   ` Alex Deucher
2013-12-05  1:39                     ` Benjamin Herrenschmidt
2013-12-05  2:29                       ` Michel Dänzer
2013-12-05  4:06                         ` Benjamin Herrenschmidt
2013-12-05 14:42                           ` Alex Deucher
2013-12-06 13:58                             ` Kleber Sacilotto de Souza
2013-12-06 15:59                               ` Alex Deucher
2013-12-10  0:48                                 ` Alex Deucher
2013-12-10  2:20                                   ` Michel Dänzer
2013-12-10 15:04                                     ` Alex Deucher
2013-12-10 15:12                                       ` Alex Deucher
2014-01-02 20:54                                         ` Kleber Sacilotto de Souza
2014-01-02 22:48                                           ` Alex Deucher