public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH]: radeonfb: Fix copyarea for R300 and later.
@ 2008-08-06  7:35 David Miller
  2008-08-06 15:43 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2008-08-06  7:35 UTC (permalink / raw)
  To: benh; +Cc: linux-kernel, akpm


[ Andrew/Ben, this fixes the bootup hang we were talking about
  the other day. ]

radeonfb: Fix copyarea for R300 and later.

For current chips we need to make sure the DSTCACHE_CTLSTAT and
WAIT_UNTIL registers are properly setup.  If not, a sequence of
copyarea operations can result in the 2D raster processor
hanging.  Here is a trace on an RV370 (PCI device ID 0x5b64), it
records the RBBM_STATUS register, then the source x/y, destination
x/y, and width/height used for the copy:

----------------------------------------
radeonfb_prim_copyarea: STATUS[00000140] src[210:70] dst[210:60] wh[a0:10]
radeonfb_prim_copyarea: STATUS[00000140] src[2b8:70] dst[2b8:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[00000140] src[348:70] dst[348:60] wh[40:10]
radeonfb_prim_copyarea: STATUS[80020140] src[390:70] dst[390:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002613f] src[40:80] dst[40:70] wh[28:10]
radeonfb_prim_copyarea: STATUS[80026139] src[a8:80] dst[a8:70] wh[38:10]
radeonfb_prim_copyarea: STATUS[80026133] src[e8:80] dst[e8:70] wh[80:10]
radeonfb_prim_copyarea: STATUS[8002612d] src[170:80] dst[170:70] wh[30:10]
radeonfb_prim_copyarea: STATUS[80026127] src[1a8:80] dst[1a8:70] wh[8:10]
radeonfb_prim_copyarea: STATUS[80026121] src[1b8:80] dst[1b8:70] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002611b] src[248:80] dst[248:70] wh[68:10]
----------------------------------------

When things are going fine the copies complete before the next ROP is
even issued, but all of a sudden the 2D unit becomes active (bit 17 in
RBBM_STATUS) and the FIFO retry (bit 13) and FIFO pipeline busy (bit
14) are set as well.  The FIFO begins to backup until it becomes full.

What happens next is the radeon_fifo_wait() times out, and we access
the chip illegally leading to a bus error which usually wedges the
box.  None of this makes it to the console screen, of course :-)
radeon_fifo_wait() should be modified to reset the accelerator when
this timeout happens instead of programming the chip anyways.

----------------------------------------
radeonfb: FIFO Timeout !
ERROR(0): Cheetah error trap taken afsr[0010080005000000] afar[000007f900800e40] TL1(0)
ERROR(0): TPC[595114] TNPC[595118] O7[459788] TSTATE[11009601]
ERROR(0): TPC<radeonfb_copyarea+0xfc/0x248>
ERROR(0): M_SYND(0),  E_SYND(0), Privileged
ERROR(0): Highest priority error (0000080000000000) "Bus error response from system bus"
ERROR(0): D-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000]
ERROR(0): D-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[0000000000000000]
ERROR(0): I-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000] u[0000000000000000] l[00\

ERROR(0): I-cache INSN0[0000000000000000] INSN1[0000000000000000] INSN2[0000000000000000] INSN3[0000000000000000]
ERROR(0): I-cache INSN4[0000000000000000] INSN5[0000000000000000] INSN6[0000000000000000] INSN7[0000000000000000]
ERROR(0): E-cache idx[800e40] tag[000000000e049f4c]
ERROR(0): E-cache data0[fffff8127d300180] data1[00000000004b5384] data2[0000000000000000] data3[0000000000000000]
Kernel panic - not syncing: Irrecoverable deferred error trap.
----------------------------------------

Another quirk is that these copyarea calls will not happen until the
first drivers/char/vt.c:redraw_screen() occurs.  This will only happen
if you 1) VC switch or 2) run "consolechars" or 3) unblank the screen.

This seems to happen because until a redraw_screen() the screen scrolling
method used by fbcon is not finalized yet.  I've seen this with other fb
drivers too.

So if all you do is boot straight into X you will never see this bug on
the relevant chips.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/video/aty/radeon_accel.c b/drivers/video/aty/radeon_accel.c
index 3ca27cb..6a30792 100644
--- a/drivers/video/aty/radeon_accel.c
+++ b/drivers/video/aty/radeon_accel.c
@@ -116,6 +116,10 @@ static void radeonfb_prim_copyarea(struct radeonfb_info *rinfo,
 	OUTREG(DP_CNTL, (xdir>=0 ? DST_X_LEFT_TO_RIGHT : 0)
 			| (ydir>=0 ? DST_Y_TOP_TO_BOTTOM : 0));
 
+	radeon_fifo_wait(2);
+	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
+	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
+
 	radeon_fifo_wait(3);
 	OUTREG(SRC_Y_X, (sy << 16) | sx);
 	OUTREG(DST_Y_X, (dy << 16) | dx);
diff --git a/include/video/radeon.h b/include/video/radeon.h
index 83467e1..acde09c 100644
--- a/include/video/radeon.h
+++ b/include/video/radeon.h
@@ -741,6 +741,10 @@
 #define SOFT_RESET_RB           		   (1 <<  6)
 #define SOFT_RESET_HDP          		   (1 <<  7)
 
+/* WAIT_UNTIL bit constants */
+#define WAIT_DMA_GUI_IDLE			   (1 << 9)
+#define WAIT_2D_IDLECLEAN			   (1 << 16)
+
 /* SURFACE_CNTL bit consants */
 #define SURF_TRANSLATION_DIS			   (1 << 8)
 #define NONSURF_AP0_SWP_16BPP			   (1 << 20)

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-06  7:35 [PATCH]: radeonfb: Fix copyarea for R300 and later David Miller
@ 2008-08-06 15:43 ` Andrew Morton
  2008-08-06 21:19   ` David Miller
  2008-08-06 22:09   ` [PATCH]: radeonfb: Fix copyarea for R300 and later Benjamin Herrenschmidt
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2008-08-06 15:43 UTC (permalink / raw)
  To: David Miller; +Cc: benh, linux-kernel, stable

On Wed, 06 Aug 2008 00:35:36 -0700 (PDT) David Miller <davem@davemloft.net> wrote:

> [ Andrew/Ben, this fixes the bootup hang we were talking about
>   the other day. ]
> 
> radeonfb: Fix copyarea for R300 and later.
> 
> For current chips we need to make sure the DSTCACHE_CTLSTAT and
> WAIT_UNTIL registers are properly setup.  If not, a sequence of
> copyarea operations can result in the 2D raster processor
> hanging.

Neato.  This can be backported to -stable without also backporting your
recent radeon-misc-corrections.patch, I assume?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-06 15:43 ` Andrew Morton
@ 2008-08-06 21:19   ` David Miller
  2008-08-06 22:08     ` Benjamin Herrenschmidt
                       ` (2 more replies)
  2008-08-06 22:09   ` [PATCH]: radeonfb: Fix copyarea for R300 and later Benjamin Herrenschmidt
  1 sibling, 3 replies; 10+ messages in thread
From: David Miller @ 2008-08-06 21:19 UTC (permalink / raw)
  To: akpm; +Cc: benh, linux-kernel, stable

From: Andrew Morton <akpm@linux-foundation.org>
Date: Wed, 6 Aug 2008 08:43:29 -0700

> On Wed, 06 Aug 2008 00:35:36 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
> 
> > [ Andrew/Ben, this fixes the bootup hang we were talking about
> >   the other day. ]
> > 
> > radeonfb: Fix copyarea for R300 and later.
> > 
> > For current chips we need to make sure the DSTCACHE_CTLSTAT and
> > WAIT_UNTIL registers are properly setup.  If not, a sequence of
> > copyarea operations can result in the 2D raster processor
> > hanging.
> 
> Neato.  This can be backported to -stable without also backporting your
> recent radeon-misc-corrections.patch, I assume?

Yes, it can.

I discussed the patch with Ben earlier and he wants to add the
same register writes to the solidfill code as well.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-06 21:19   ` David Miller
@ 2008-08-06 22:08     ` Benjamin Herrenschmidt
  2008-08-07  2:06     ` Benjamin Herrenschmidt
  2008-08-07  8:52     ` [PATCH] radeonfb: Fix engine hangs and cache flushing (v2) Benjamin Herrenschmidt
  2 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-06 22:08 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-kernel, stable

On Wed, 2008-08-06 at 14:19 -0700, David Miller wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> Date: Wed, 6 Aug 2008 08:43:29 -0700
> 
> > On Wed, 06 Aug 2008 00:35:36 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
> > 
> > > [ Andrew/Ben, this fixes the bootup hang we were talking about
> > >   the other day. ]
> > > 
> > > radeonfb: Fix copyarea for R300 and later.
> > > 
> > > For current chips we need to make sure the DSTCACHE_CTLSTAT and
> > > WAIT_UNTIL registers are properly setup.  If not, a sequence of
> > > copyarea operations can result in the 2D raster processor
> > > hanging.
> > 
> > Neato.  This can be backported to -stable without also backporting your
> > recent radeon-misc-corrections.patch, I assume?
> 
> Yes, it can.
> 
> I discussed the patch with Ben earlier and he wants to add the
> same register writes to the solidfill code as well.

That and the flush code for when we're about to manually hit the fb.

I'll send a patch later today.

Ben.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-06 15:43 ` Andrew Morton
  2008-08-06 21:19   ` David Miller
@ 2008-08-06 22:09   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-06 22:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Miller, linux-kernel, stable

On Wed, 2008-08-06 at 08:43 -0700, Andrew Morton wrote:
> On Wed, 06 Aug 2008 00:35:36 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
> 
> > [ Andrew/Ben, this fixes the bootup hang we were talking about
> >   the other day. ]
> > 
> > radeonfb: Fix copyarea for R300 and later.
> > 
> > For current chips we need to make sure the DSTCACHE_CTLSTAT and
> > WAIT_UNTIL registers are properly setup.  If not, a sequence of
> > copyarea operations can result in the 2D raster processor
> > hanging.
> 
> Neato.  This can be backported to -stable without also backporting your
> recent radeon-misc-corrections.patch, I assume?

I'd rather do both. The first one affects the bits used by the second.

Ben.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-06 21:19   ` David Miller
  2008-08-06 22:08     ` Benjamin Herrenschmidt
@ 2008-08-07  2:06     ` Benjamin Herrenschmidt
  2008-08-07  5:52       ` Benjamin Herrenschmidt
  2008-08-07  8:52     ` [PATCH] radeonfb: Fix engine hangs and cache flushing (v2) Benjamin Herrenschmidt
  2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-07  2:06 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-kernel, stable

radeonfb: Fix engine hangs and cache flushing

(Thanks to David Miller for debugging that on his rv370 and
providing the initial version of that patch !)

This patches fixes a few things.

One is, among the 3 or so different variants of cache control registers,
radeon_engine_flush() is using one that shouldn't be used on r3xx and
later. This fixes it by making is use one that should work on everything
at least for the 2D cache.

We also didn't use the proper list of chip families on some functions,
this resyncs us with what X does.

In addition, I added a waitfor fifo in radeon_engine_flush() to make
sure the cache flush command did hit the register backbone before testing
for completion of the flush operation.

Finally, we enqueue a destination cache flush and a wait for engine
idle before solid fills and blits. This effectively prevents those
operations from being pipelined and shouldn't be necessary but it
appaears to cure some hangs on David's card, so I suspect something
fishy is going on with the engine caches. The performances of radeonfb
don't appear to suffer a great deal from that anyway.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Index: linux-work/drivers/video/aty/radeon_accel.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_accel.c	2008-08-07 10:30:12.000000000 +1000
+++ linux-work/drivers/video/aty/radeon_accel.c	2008-08-07 11:34:57.000000000 +1000
@@ -55,6 +55,14 @@ static void radeonfb_prim_fillrect(struc
 	OUTREG(DP_WRITE_MSK, 0xffffffff);
 	OUTREG(DP_CNTL, (DST_X_LEFT_TO_RIGHT | DST_Y_TOP_TO_BOTTOM));
 
+	/* Ensure the dst cache is flushed and the engine idle before
+	 * issuing the operation.
+	 *
+	 * This works around engine lockups on some cards
+	 */
+	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
+	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
+
 	radeon_fifo_wait(2);  
 	OUTREG(DST_Y_X, (region->dy << 16) | region->dx);
 	OUTREG(DST_WIDTH_HEIGHT, (region->width << 16) | region->height);
@@ -116,6 +124,15 @@ static void radeonfb_prim_copyarea(struc
 	OUTREG(DP_CNTL, (xdir>=0 ? DST_X_LEFT_TO_RIGHT : 0)
 			| (ydir>=0 ? DST_Y_TOP_TO_BOTTOM : 0));
 
+	/* Ensure the dst cache is flushed and the engine idle before
+	 * issuing the operation.
+	 *
+	 * This works around engine lockups on some cards
+	 */
+	radeon_fifo_wait(2);
+	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
+	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
+
 	radeon_fifo_wait(3);
 	OUTREG(SRC_Y_X, (sy << 16) | sx);
 	OUTREG(DST_Y_X, (dy << 16) | dx);
@@ -203,9 +220,7 @@ void radeonfb_engine_reset(struct radeon
 	host_path_cntl = INREG(HOST_PATH_CNTL);
 	rbbm_soft_reset = INREG(RBBM_SOFT_RESET);
 
-	if (rinfo->family == CHIP_FAMILY_R300 ||
-	    rinfo->family == CHIP_FAMILY_R350 ||
-	    rinfo->family == CHIP_FAMILY_RV350) {
+	if (IS_R300_VARIANT(rinfo)) {
 		u32 tmp;
 
 		OUTREG(RBBM_SOFT_RESET, (rbbm_soft_reset |
@@ -241,9 +256,7 @@ void radeonfb_engine_reset(struct radeon
 	INREG(HOST_PATH_CNTL);
 	OUTREG(HOST_PATH_CNTL, host_path_cntl);
 
-	if (rinfo->family != CHIP_FAMILY_R300 &&
-	    rinfo->family != CHIP_FAMILY_R350 &&
-	    rinfo->family != CHIP_FAMILY_RV350)
+	if (!IS_R300_VARIANT(rinfo))
 		OUTREG(RBBM_SOFT_RESET, rbbm_soft_reset);
 
 	OUTREG(CLOCK_CNTL_INDEX, clock_cntl_index);
@@ -260,10 +273,18 @@ void radeonfb_engine_init (struct radeon
 	radeonfb_engine_reset(rinfo);
 
 	radeon_fifo_wait (1);
-	if ((rinfo->family != CHIP_FAMILY_R300) &&
-	    (rinfo->family != CHIP_FAMILY_R350) &&
-	    (rinfo->family != CHIP_FAMILY_RV350))
+	if (IS_R300_VARIANT(rinfo)) {
+		OUTREG(RB2D_DSTCACHE_MODE, INREG(RB2D_DSTCACHE_MODE) |
+		       RB2D_DC_AUTOFLUSH_ENABLE |
+		       RB2D_DC_DC_DISABLE_IGNORE_PE);
+	} else {
+		/* This needs to be double checked with ATI. Latest X driver
+		 * completely "forgets" to set this register on < r3xx, and
+		 * we used to just write 0 there... I'll keep the 0 and update
+		 * that when we have sorted things out on X side.
+		 */
 		OUTREG(RB2D_DSTCACHE_MODE, 0);
+	}
 
 	radeon_fifo_wait (3);
 	/* We re-read MC_FB_LOCATION from card as it can have been
Index: linux-work/include/video/radeon.h
===================================================================
--- linux-work.orig/include/video/radeon.h	2008-08-07 10:30:12.000000000 +1000
+++ linux-work/include/video/radeon.h	2008-08-07 11:11:53.000000000 +1000
@@ -386,7 +386,7 @@
 #define SC_BOTTOM_RIGHT                        0x16F0  
 #define SRC_SC_BOTTOM_RIGHT                    0x16F4  
 #define RB2D_DSTCACHE_MODE		       0x3428
-#define RB2D_DSTCACHE_CTLSTAT		       0x342C
+#define RB2D_DSTCACHE_CTLSTAT_broken	       0x342C /* do not use */
 #define LVDS_GEN_CNTL			       0x02d0
 #define LVDS_PLL_CNTL			       0x02d4
 #define FP2_GEN_CNTL                           0x0288
@@ -532,6 +532,9 @@
 #define RB2D_DC_FLUSH_ALL			   (RB2D_DC_FLUSH_2D | RB2D_DC_FREE_2D)
 #define RB2D_DC_BUSY				   (1 << 31)
 
+/* DSTCACHE_MODE bits constants */
+#define RB2D_DC_AUTOFLUSH_ENABLE                   (1 << 8)
+#define RB2D_DC_DC_DISABLE_IGNORE_PE               (1 << 17)
 
 /* CRTC_GEN_CNTL bit constants */
 #define CRTC_DBL_SCAN_EN                           0x00000001
@@ -742,6 +745,10 @@
 #define SOFT_RESET_RB           		   (1 <<  6)
 #define SOFT_RESET_HDP          		   (1 <<  7)
 
+/* WAIT_UNTIL bit constants */
+#define WAIT_DMA_GUI_IDLE			   (1 << 9)
+#define WAIT_2D_IDLECLEAN			   (1 << 16)
+
 /* SURFACE_CNTL bit consants */
 #define SURF_TRANSLATION_DIS			   (1 << 8)
 #define NONSURF_AP0_SWP_16BPP			   (1 << 20)
Index: linux-work/drivers/video/aty/radeonfb.h
===================================================================
--- linux-work.orig/drivers/video/aty/radeonfb.h	2008-08-07 10:36:35.000000000 +1000
+++ linux-work/drivers/video/aty/radeonfb.h	2008-08-07 11:26:58.000000000 +1000
@@ -53,6 +53,7 @@ enum radeon_family {
 	CHIP_FAMILY_RV380,    /* RV370/RV380/M22/M24 */
 	CHIP_FAMILY_R420,     /* R420/R423/M18 */
 	CHIP_FAMILY_RC410,
+	CHIP_FAMILY_RS400,
 	CHIP_FAMILY_RS480,
 	CHIP_FAMILY_LAST,
 };
@@ -533,33 +534,39 @@ static inline u32 radeon_get_dstbpp(u16 
 /*
  * 2D Engine helper routines
  */
-static inline void radeon_engine_flush (struct radeonfb_info *rinfo)
+
+static inline void _radeon_fifo_wait(struct radeonfb_info *rinfo, int entries)
 {
 	int i;
 
-	/* initiate flush */
-	OUTREGP(RB2D_DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL,
-	        ~RB2D_DC_FLUSH_ALL);
-
-	for (i=0; i < 2000000; i++) {
-		if (!(INREG(RB2D_DSTCACHE_CTLSTAT) & RB2D_DC_BUSY))
+	for (i=0; i<2000000; i++) {
+		if ((INREG(RBBM_STATUS) & 0x7f) >= entries)
 			return;
 		udelay(1);
 	}
-	printk(KERN_ERR "radeonfb: Flush Timeout !\n");
+	printk(KERN_ERR "radeonfb: FIFO Timeout !\n");
 }
 
-
-static inline void _radeon_fifo_wait(struct radeonfb_info *rinfo, int entries)
+static inline void radeon_engine_flush (struct radeonfb_info *rinfo)
 {
 	int i;
 
-	for (i=0; i<2000000; i++) {
-		if ((INREG(RBBM_STATUS) & 0x7f) >= entries)
+	/* Initiate flush */
+	OUTREGP(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL,
+	        ~RB2D_DC_FLUSH_ALL);
+
+	/* Ensure FIFO is empty, ie, make sure the flush commands
+	 * has reached the cache
+	 */
+	_radeon_fifo_wait (rinfo, 64);
+
+	/* Wait for the flush to complete */
+	for (i=0; i < 2000000; i++) {
+		if (!(INREG(DSTCACHE_CTLSTAT) & RB2D_DC_BUSY))
 			return;
 		udelay(1);
 	}
-	printk(KERN_ERR "radeonfb: FIFO Timeout !\n");
+	printk(KERN_ERR "radeonfb: Flush Timeout !\n");
 }
 
 
Index: linux-work/drivers/video/aty/radeon_base.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_base.c	2008-08-07 11:25:34.000000000 +1000
+++ linux-work/drivers/video/aty/radeon_base.c	2008-08-07 11:31:11.000000000 +1000
@@ -1286,11 +1286,10 @@ static void radeon_write_pll_regs(struct
 	radeon_pll_errata_after_data(rinfo);
 
 	/* Set PPLL ref. div */
-	if (rinfo->family == CHIP_FAMILY_R300 ||
+	if (IS_R300_VARIANT(rinfo) ||
 	    rinfo->family == CHIP_FAMILY_RS300 ||
-	    rinfo->family == CHIP_FAMILY_R350 ||
-	    rinfo->family == CHIP_FAMILY_RV350 ||
-	    rinfo->family == CHIP_FAMILY_RV380 ) {
+	    rinfo->family == CHIP_FAMILY_RS400 ||
+	    rinfo->family == CHIP_FAMILY_RS480) {
 		if (mode->ppll_ref_div & R300_PPLL_REF_DIV_ACC_MASK) {
 			/* When restoring console mode, use saved PPLL_REF_DIV
 			 * setting.
@@ -1461,10 +1460,7 @@ static void radeon_calc_pll_regs(struct 
 		/* Not all chip revs have the same format for this register,
 		 * extract the source selection
 		 */
-		if (rinfo->family == CHIP_FAMILY_R200 ||
-		    rinfo->family == CHIP_FAMILY_R300 ||
-		    rinfo->family == CHIP_FAMILY_R350 ||
-		    rinfo->family == CHIP_FAMILY_RV350) {
+		if (rinfo->family == CHIP_FAMILY_R200 || IS_R300_VARIANT(rinfo)) {
 			source = (fp2_gen_cntl >> 10) & 0x3;
 			/* sourced from transform unit, check for transform unit
 			 * own source
@@ -2005,6 +2001,7 @@ static void radeon_identify_vram(struct 
             (rinfo->family == CHIP_FAMILY_RS200) ||
             (rinfo->family == CHIP_FAMILY_RS300) ||
             (rinfo->family == CHIP_FAMILY_RC410) ||
+            (rinfo->family == CHIP_FAMILY_RS400) ||
 	    (rinfo->family == CHIP_FAMILY_RS480) ) {
           u32 tom = INREG(NB_TOM);
           tmp = ((((tom >> 16) - (tom & 0xffff) + 1) << 6) * 1024);



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-07  2:06     ` Benjamin Herrenschmidt
@ 2008-08-07  5:52       ` Benjamin Herrenschmidt
  2008-08-07 11:22         ` Geert Uytterhoeven
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-07  5:52 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-kernel, stable


>  
> +	/* Ensure the dst cache is flushed and the engine idle before
> +	 * issuing the operation.
> +	 *
> +	 * This works around engine lockups on some cards
> +	 */
> +	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
> +	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
> +
>  	radeon_fifo_wait(2);  

Typo in the patch, missing radeon_fifo_wait before the cache flush.

Fixed version on it's way, sorry.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] radeonfb: Fix engine hangs and cache flushing (v2)
  2008-08-06 21:19   ` David Miller
  2008-08-06 22:08     ` Benjamin Herrenschmidt
  2008-08-07  2:06     ` Benjamin Herrenschmidt
@ 2008-08-07  8:52     ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-07  8:52 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-kernel, stable

(Thanks to David Miller for debugging that on his rv370 and
providing the initial version of that patch !)

This patches fixes a few things.

One is, among the 3 or so different variants of cache control registers,
radeon_engine_flush() is using one that shouldn't be used on r3xx and
later. This fixes it by making is use one that should work on everything
at least for the 2D cache.

We also didn't use the proper list of chip families on some functions,
this resyncs us with what X does.

In addition, I added a waitfor fifo in radeon_engine_flush() to make
sure the cache flush command did hit the register backbone before testing
for completion of the flush operation.

Finally, we enqueue a destination cache flush and a wait for engine
idle before solid fills and blits. This effectively prevents those
operations from being pipelined and shouldn't be necessary but it
appaears to cure some hangs on David's card, so I suspect something
fishy is going on with the engine caches. The performances of radeonfb
don't appear to suffer a great deal from that anyway.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2. adds a missing wait on fifo

Index: linux-work/drivers/video/aty/radeon_accel.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_accel.c	2008-08-07 18:14:01.000000000 +1000
+++ linux-work/drivers/video/aty/radeon_accel.c	2008-08-07 18:14:49.000000000 +1000
@@ -55,6 +55,15 @@ static void radeonfb_prim_fillrect(struc
 	OUTREG(DP_WRITE_MSK, 0xffffffff);
 	OUTREG(DP_CNTL, (DST_X_LEFT_TO_RIGHT | DST_Y_TOP_TO_BOTTOM));
 
+	/* Ensure the dst cache is flushed and the engine idle before
+	 * issuing the operation.
+	 *
+	 * This works around engine lockups on some cards
+	 */
+	radeon_fifo_wait(2);  
+	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
+	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
+
 	radeon_fifo_wait(2);  
 	OUTREG(DST_Y_X, (region->dy << 16) | region->dx);
 	OUTREG(DST_WIDTH_HEIGHT, (region->width << 16) | region->height);
@@ -116,6 +125,15 @@ static void radeonfb_prim_copyarea(struc
 	OUTREG(DP_CNTL, (xdir>=0 ? DST_X_LEFT_TO_RIGHT : 0)
 			| (ydir>=0 ? DST_Y_TOP_TO_BOTTOM : 0));
 
+	/* Ensure the dst cache is flushed and the engine idle before
+	 * issuing the operation.
+	 *
+	 * This works around engine lockups on some cards
+	 */
+	radeon_fifo_wait(2);
+	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
+	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
+
 	radeon_fifo_wait(3);
 	OUTREG(SRC_Y_X, (sy << 16) | sx);
 	OUTREG(DST_Y_X, (dy << 16) | dx);
@@ -203,9 +221,7 @@ void radeonfb_engine_reset(struct radeon
 	host_path_cntl = INREG(HOST_PATH_CNTL);
 	rbbm_soft_reset = INREG(RBBM_SOFT_RESET);
 
-	if (rinfo->family == CHIP_FAMILY_R300 ||
-	    rinfo->family == CHIP_FAMILY_R350 ||
-	    rinfo->family == CHIP_FAMILY_RV350) {
+	if (IS_R300_VARIANT(rinfo)) {
 		u32 tmp;
 
 		OUTREG(RBBM_SOFT_RESET, (rbbm_soft_reset |
@@ -241,9 +257,7 @@ void radeonfb_engine_reset(struct radeon
 	INREG(HOST_PATH_CNTL);
 	OUTREG(HOST_PATH_CNTL, host_path_cntl);
 
-	if (rinfo->family != CHIP_FAMILY_R300 &&
-	    rinfo->family != CHIP_FAMILY_R350 &&
-	    rinfo->family != CHIP_FAMILY_RV350)
+	if (!IS_R300_VARIANT(rinfo))
 		OUTREG(RBBM_SOFT_RESET, rbbm_soft_reset);
 
 	OUTREG(CLOCK_CNTL_INDEX, clock_cntl_index);
@@ -260,10 +274,18 @@ void radeonfb_engine_init (struct radeon
 	radeonfb_engine_reset(rinfo);
 
 	radeon_fifo_wait (1);
-	if ((rinfo->family != CHIP_FAMILY_R300) &&
-	    (rinfo->family != CHIP_FAMILY_R350) &&
-	    (rinfo->family != CHIP_FAMILY_RV350))
+	if (IS_R300_VARIANT(rinfo)) {
+		OUTREG(RB2D_DSTCACHE_MODE, INREG(RB2D_DSTCACHE_MODE) |
+		       RB2D_DC_AUTOFLUSH_ENABLE |
+		       RB2D_DC_DC_DISABLE_IGNORE_PE);
+	} else {
+		/* This needs to be double checked with ATI. Latest X driver
+		 * completely "forgets" to set this register on < r3xx, and
+		 * we used to just write 0 there... I'll keep the 0 and update
+		 * that when we have sorted things out on X side.
+		 */
 		OUTREG(RB2D_DSTCACHE_MODE, 0);
+	}
 
 	radeon_fifo_wait (3);
 	/* We re-read MC_FB_LOCATION from card as it can have been
Index: linux-work/include/video/radeon.h
===================================================================
--- linux-work.orig/include/video/radeon.h	2008-08-07 18:14:01.000000000 +1000
+++ linux-work/include/video/radeon.h	2008-08-07 18:14:20.000000000 +1000
@@ -386,7 +386,7 @@
 #define SC_BOTTOM_RIGHT                        0x16F0  
 #define SRC_SC_BOTTOM_RIGHT                    0x16F4  
 #define RB2D_DSTCACHE_MODE		       0x3428
-#define RB2D_DSTCACHE_CTLSTAT		       0x342C
+#define RB2D_DSTCACHE_CTLSTAT_broken	       0x342C /* do not use */
 #define LVDS_GEN_CNTL			       0x02d0
 #define LVDS_PLL_CNTL			       0x02d4
 #define FP2_GEN_CNTL                           0x0288
@@ -532,6 +532,9 @@
 #define RB2D_DC_FLUSH_ALL			   (RB2D_DC_FLUSH_2D | RB2D_DC_FREE_2D)
 #define RB2D_DC_BUSY				   (1 << 31)
 
+/* DSTCACHE_MODE bits constants */
+#define RB2D_DC_AUTOFLUSH_ENABLE                   (1 << 8)
+#define RB2D_DC_DC_DISABLE_IGNORE_PE               (1 << 17)
 
 /* CRTC_GEN_CNTL bit constants */
 #define CRTC_DBL_SCAN_EN                           0x00000001
@@ -742,6 +745,10 @@
 #define SOFT_RESET_RB           		   (1 <<  6)
 #define SOFT_RESET_HDP          		   (1 <<  7)
 
+/* WAIT_UNTIL bit constants */
+#define WAIT_DMA_GUI_IDLE			   (1 << 9)
+#define WAIT_2D_IDLECLEAN			   (1 << 16)
+
 /* SURFACE_CNTL bit consants */
 #define SURF_TRANSLATION_DIS			   (1 << 8)
 #define NONSURF_AP0_SWP_16BPP			   (1 << 20)
Index: linux-work/drivers/video/aty/radeonfb.h
===================================================================
--- linux-work.orig/drivers/video/aty/radeonfb.h	2008-08-07 18:14:01.000000000 +1000
+++ linux-work/drivers/video/aty/radeonfb.h	2008-08-07 18:14:20.000000000 +1000
@@ -53,6 +53,7 @@ enum radeon_family {
 	CHIP_FAMILY_RV380,    /* RV370/RV380/M22/M24 */
 	CHIP_FAMILY_R420,     /* R420/R423/M18 */
 	CHIP_FAMILY_RC410,
+	CHIP_FAMILY_RS400,
 	CHIP_FAMILY_RS480,
 	CHIP_FAMILY_LAST,
 };
@@ -533,33 +534,39 @@ static inline u32 radeon_get_dstbpp(u16 
 /*
  * 2D Engine helper routines
  */
-static inline void radeon_engine_flush (struct radeonfb_info *rinfo)
+
+static inline void _radeon_fifo_wait(struct radeonfb_info *rinfo, int entries)
 {
 	int i;
 
-	/* initiate flush */
-	OUTREGP(RB2D_DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL,
-	        ~RB2D_DC_FLUSH_ALL);
-
-	for (i=0; i < 2000000; i++) {
-		if (!(INREG(RB2D_DSTCACHE_CTLSTAT) & RB2D_DC_BUSY))
+	for (i=0; i<2000000; i++) {
+		if ((INREG(RBBM_STATUS) & 0x7f) >= entries)
 			return;
 		udelay(1);
 	}
-	printk(KERN_ERR "radeonfb: Flush Timeout !\n");
+	printk(KERN_ERR "radeonfb: FIFO Timeout !\n");
 }
 
-
-static inline void _radeon_fifo_wait(struct radeonfb_info *rinfo, int entries)
+static inline void radeon_engine_flush (struct radeonfb_info *rinfo)
 {
 	int i;
 
-	for (i=0; i<2000000; i++) {
-		if ((INREG(RBBM_STATUS) & 0x7f) >= entries)
+	/* Initiate flush */
+	OUTREGP(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL,
+	        ~RB2D_DC_FLUSH_ALL);
+
+	/* Ensure FIFO is empty, ie, make sure the flush commands
+	 * has reached the cache
+	 */
+	_radeon_fifo_wait (rinfo, 64);
+
+	/* Wait for the flush to complete */
+	for (i=0; i < 2000000; i++) {
+		if (!(INREG(DSTCACHE_CTLSTAT) & RB2D_DC_BUSY))
 			return;
 		udelay(1);
 	}
-	printk(KERN_ERR "radeonfb: FIFO Timeout !\n");
+	printk(KERN_ERR "radeonfb: Flush Timeout !\n");
 }
 
 
Index: linux-work/drivers/video/aty/radeon_base.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_base.c	2008-08-07 18:14:01.000000000 +1000
+++ linux-work/drivers/video/aty/radeon_base.c	2008-08-07 18:14:20.000000000 +1000
@@ -1286,11 +1286,10 @@ static void radeon_write_pll_regs(struct
 	radeon_pll_errata_after_data(rinfo);
 
 	/* Set PPLL ref. div */
-	if (rinfo->family == CHIP_FAMILY_R300 ||
+	if (IS_R300_VARIANT(rinfo) ||
 	    rinfo->family == CHIP_FAMILY_RS300 ||
-	    rinfo->family == CHIP_FAMILY_R350 ||
-	    rinfo->family == CHIP_FAMILY_RV350 ||
-	    rinfo->family == CHIP_FAMILY_RV380 ) {
+	    rinfo->family == CHIP_FAMILY_RS400 ||
+	    rinfo->family == CHIP_FAMILY_RS480) {
 		if (mode->ppll_ref_div & R300_PPLL_REF_DIV_ACC_MASK) {
 			/* When restoring console mode, use saved PPLL_REF_DIV
 			 * setting.
@@ -1461,10 +1460,7 @@ static void radeon_calc_pll_regs(struct 
 		/* Not all chip revs have the same format for this register,
 		 * extract the source selection
 		 */
-		if (rinfo->family == CHIP_FAMILY_R200 ||
-		    rinfo->family == CHIP_FAMILY_R300 ||
-		    rinfo->family == CHIP_FAMILY_R350 ||
-		    rinfo->family == CHIP_FAMILY_RV350) {
+		if (rinfo->family == CHIP_FAMILY_R200 || IS_R300_VARIANT(rinfo)) {
 			source = (fp2_gen_cntl >> 10) & 0x3;
 			/* sourced from transform unit, check for transform unit
 			 * own source
@@ -2005,6 +2001,7 @@ static void radeon_identify_vram(struct 
             (rinfo->family == CHIP_FAMILY_RS200) ||
             (rinfo->family == CHIP_FAMILY_RS300) ||
             (rinfo->family == CHIP_FAMILY_RC410) ||
+            (rinfo->family == CHIP_FAMILY_RS400) ||
 	    (rinfo->family == CHIP_FAMILY_RS480) ) {
           u32 tom = INREG(NB_TOM);
           tmp = ((((tom >> 16) - (tom & 0xffff) + 1) << 6) * 1024);



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-07  5:52       ` Benjamin Herrenschmidt
@ 2008-08-07 11:22         ` Geert Uytterhoeven
  2008-08-07 21:24           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2008-08-07 11:22 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: David Miller, akpm, linux-kernel, stable

	Hi Ben,

On Thu, 7 Aug 2008, Benjamin Herrenschmidt wrote:
> > +	/* Ensure the dst cache is flushed and the engine idle before
> > +	 * issuing the operation.
> > +	 *
> > +	 * This works around engine lockups on some cards
> > +	 */
> > +	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
> > +	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
> > +
> >  	radeon_fifo_wait(2);  
> 
> Typo in the patch, missing radeon_fifo_wait before the cache flush.
> 
> Fixed version on it's way, sorry.

Can we please get a CC: linux-fbdev-devel on the fixed version?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: radeonfb: Fix copyarea for R300 and later.
  2008-08-07 11:22         ` Geert Uytterhoeven
@ 2008-08-07 21:24           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-07 21:24 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: David Miller, akpm, linux-kernel, stable

On Thu, 2008-08-07 at 13:22 +0200, Geert Uytterhoeven wrote:
> 	Hi Ben,
> 
> On Thu, 7 Aug 2008, Benjamin Herrenschmidt wrote:
> > > +	/* Ensure the dst cache is flushed and the engine idle before
> > > +	 * issuing the operation.
> > > +	 *
> > > +	 * This works around engine lockups on some cards
> > > +	 */
> > > +	OUTREG(DSTCACHE_CTLSTAT, RB2D_DC_FLUSH_ALL);
> > > +	OUTREG(WAIT_UNTIL, (WAIT_2D_IDLECLEAN | WAIT_DMA_GUI_IDLE));
> > > +
> > >  	radeon_fifo_wait(2);  
> > 
> > Typo in the patch, missing radeon_fifo_wait before the cache flush.
> > 
> > Fixed version on it's way, sorry.
> 
> Can we please get a CC: linux-fbdev-devel on the fixed version?

I CCed the list on the final patch.

Ben.



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-08-07 21:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-06  7:35 [PATCH]: radeonfb: Fix copyarea for R300 and later David Miller
2008-08-06 15:43 ` Andrew Morton
2008-08-06 21:19   ` David Miller
2008-08-06 22:08     ` Benjamin Herrenschmidt
2008-08-07  2:06     ` Benjamin Herrenschmidt
2008-08-07  5:52       ` Benjamin Herrenschmidt
2008-08-07 11:22         ` Geert Uytterhoeven
2008-08-07 21:24           ` Benjamin Herrenschmidt
2008-08-07  8:52     ` [PATCH] radeonfb: Fix engine hangs and cache flushing (v2) Benjamin Herrenschmidt
2008-08-06 22:09   ` [PATCH]: radeonfb: Fix copyarea for R300 and later Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox