* Re: Some issues to resolve with XFree 4.0 yet
From: Kevin Hendricks @ 2000-03-23 18:16 UTC
To: Ani Joshi; +Cc: Kostas Gewrgiou, linuxppc-dev

Hi Ani and Kostas,

> ErrorF or xf86DrvMsg the var struct before and after a mode switch, then
> see what's wrong.

Okay, here are all the snippets I think you need to see what is going on.
AFAICT everything looks okay. It's almost as if we are missing a pixel cache
flush or engine flush or something along those lines.

I currently have 3 modes in my XF86Config. If I put only one mode on the mode
line, then I can successfully startx into that mode. If, however, I put all
three modes on one line (to allow SwitchMode to work), then only the first mode
(the highest resolution) works; the lower resolutions do not.

Here are the snippets from the log for switching modes from 1152x870 (the
working mode) to 832x624 (bad), then to 1024x768 (again bad), and once more
back to 1152x870.

Any ideas here? Do we need something like an engine reset or flush in
aty128fb_set_var?

Here is the log snippet:

fbdevHW: SwitchMode 0
xfree new mode:       57591 832 885 949 1152 624 625 628 667
fbdev before mode:    9999 1152 53 128 123 870 3 3 39 32 8:8:8
fbdev after mode:     17364 832 53 64 203 624 1 3 39 32 8:8:8
fbdevHW: AdjustFrame 0
fbdevHW: SwitchMode 0
xfree new mode:       78747 1024 1056 1152 1312 768 769 772 800
fbdev before mode:    17364 832 53 64 203 624 1 3 39 32 8:8:8
fbdev after mode:     12698 1024 37 96 155 768 1 3 28 32 8:8:8
fbdevHW: AdjustFrame 0
fbdevHW: SwitchMode 0
xfree new mode:       100001 1152 1205 1333 1456 870 873 876 915
fbdev before mode:    12698 1024 37 96 155 768 1 3 28 32 8:8:8
fbdev after mode:     9999 1152 53 128 123 870 3 3 39 32 8:8:8
fbdevHW: AdjustFrame 0

Here are the print routines so that you can see what is being printed above:

static void print_fbdev_mode(char *txt, struct fb_var_screeninfo *var)
{
    ErrorF( "fbdev %s mode:\t%d %d %d %d %d %d %d %d %d %d %d:%d:%d\n",
            txt,var->pixclock,
            var->xres, var->right_margin, var->hsync_len, var->left_margin,
            var->yres, var->lower_margin, var->vsync_len, var->upper_margin,
            var->bits_per_pixel,
            var->red.length, var->green.length, var->blue.length);
}

static void print_xfree_mode(char *txt, DisplayModePtr mode)
{
    ErrorF( "xfree %s mode:\t%d %d %d %d %d %d %d %d %d\n",
            txt,mode->Clock,
            mode->HDisplay, mode->HSyncStart, mode->HSyncEnd, mode->HTotal,
            mode->VDisplay, mode->VSyncStart, mode->VSyncEnd, mode->VTotal);
}

Here is the routine that actually loads the new mode info, so you can see how
the calculations from SyncStart to margins etc. are being done:

static void xfree2fbdev_timing(DisplayModePtr mode, struct fb_var_screeninfo *var)
{
    var->xres = mode->HDisplay;
    var->yres = mode->VDisplay;
    if (var->xres_virtual < var->xres)
        var->xres_virtual = var->xres;
    if (var->yres_virtual < var->yres)
        var->yres_virtual = var->yres;
    var->xoffset = var->yoffset = 0;
    var->pixclock = mode->Clock ? 1000000000/mode->Clock : 0;
    var->right_margin = mode->HSyncStart-mode->HDisplay;
    var->hsync_len = mode->HSyncEnd-mode->HSyncStart;
    var->left_margin = mode->HTotal-mode->HSyncEnd;
    var->lower_margin = mode->VSyncStart-mode->VDisplay;
    var->vsync_len = mode->VSyncEnd-mode->VSyncStart;
    var->upper_margin = mode->VTotal-mode->VSyncEnd;
    var->sync = 0;
    if (mode->Flags & V_PHSYNC)
        var->sync |= FB_SYNC_HOR_HIGH_ACT;
    if (mode->Flags & V_PVSYNC)
        var->sync |= FB_SYNC_VERT_HIGH_ACT;
    if (mode->Flags & V_PCSYNC)
        var->sync |= FB_SYNC_COMP_HIGH_ACT;
#if 0
    if (mode->Flags & V_BCAST)
        var->sync |= FB_SYNC_BROADCAST;
#endif
    if (mode->Flags & V_INTERLACE)
        var->vmode = FB_VMODE_INTERLACED;
    else if (mode->Flags & V_DBLSCAN)
        var->vmode = FB_VMODE_DOUBLE;
    else
        var->vmode = FB_VMODE_NONINTERLACED;
}

Everything seems to be all right to me. I think we are just missing some sort
of flush or reset in the aty128fb set_var routine.

It is interesting to note that the r128 code in its SwitchMode (not using
FBDev) literally reloads all of the registers and does a full R128EngineInit.

Ideas here?

Thanks,

Kevin

--
Kevin B. Hendricks
Associate Professor of Operations and Information Technology
Richard Ivey School of Business, University of Western Ontario
London, Ontario N6A-3K7 CANADA
khendricks@ivey.uwo.ca, (519) 661-3874, fax: 519-661-3959

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
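The arithmetic in xfree2fbdev_timing can be checked by hand against the log in this message. What follows is a minimal standalone sketch of that arithmetic only, not the XFree86 code: `struct mode_timing`, `struct fb_timing` and `mode_to_fb_timing` are hypothetical stand-ins for the DisplayModePtr and fb_var_screeninfo fields used in the routine, and it assumes the mode clock is in kHz and pixclock is a period in picoseconds.

```c
#include <assert.h>

/* Hypothetical stand-in for the XFree86 mode fields used by the routine. */
struct mode_timing {
    int clock_khz;                                 /* dot clock, assumed kHz */
    int hdisplay, hsync_start, hsync_end, htotal;
    int vdisplay, vsync_start, vsync_end, vtotal;
};

/* Hypothetical stand-in for the timing fields of struct fb_var_screeninfo. */
struct fb_timing {
    unsigned pixclock;                             /* pixel period, picoseconds */
    unsigned xres, right_margin, hsync_len, left_margin;
    unsigned yres, lower_margin, vsync_len, upper_margin;
};

/* Same arithmetic as the routine in the message, on the stand-in structs. */
struct fb_timing mode_to_fb_timing(const struct mode_timing *m)
{
    struct fb_timing t;

    assert(m->clock_khz >= 0);
    /* 1 kHz corresponds to a period of 1e9 ps; integer division truncates. */
    t.pixclock = m->clock_khz ? 1000000000u / (unsigned)m->clock_khz : 0;

    t.xres         = (unsigned)m->hdisplay;
    t.right_margin = (unsigned)(m->hsync_start - m->hdisplay);   /* front porch */
    t.hsync_len    = (unsigned)(m->hsync_end - m->hsync_start);
    t.left_margin  = (unsigned)(m->htotal - m->hsync_end);       /* back porch */

    t.yres         = (unsigned)m->vdisplay;
    t.lower_margin = (unsigned)(m->vsync_start - m->vdisplay);
    t.vsync_len    = (unsigned)(m->vsync_end - m->vsync_start);
    t.upper_margin = (unsigned)(m->vtotal - m->vsync_end);
    return t;
}
```

For the 832x624 mode this gives margins 53/64/203 and 1/3/39, matching the "after" line in the log. The truncating division gives a pixclock of 17363 where the log shows 17364, presumably because the "after" line prints the value reported back by the kernel driver rather than the value requested.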
* Found bug in mode switching but who is at fault...XFree86 or aty128fb.c?
From: Kevin Hendricks @ 2000-03-25 3:54 UTC
To: Kostas Gewrgiou, Ani Joshi; +Cc: linuxppc-dev

Hi,

I figured that since the fbdev driver showed the same problem as the r128
driver, the mode switching problem must be in either aty128fb.c or xfree86 but
not in the r128 driver.

Okay, so I found the bug. It seems that all through the r128 driver, crtc.pitch
values are set to the virtual x resolution (vxres) / 8. But in aty128fb.c, in
the var_to_crtc routine, crtc.pitch is set to be just xres / 8.

This is not a problem if xres == vxres, which is what happens when aty128fb.c
starts up. So for any one mode it defaults to being okay.

However, when doing mode switching via the Ctrl-Alt-Keypad+/- keys, xfree sets
vxres and vyres to the resolution of the largest mode on that line (so
1152x870 becomes my vxres and vyres when 1152x870, 832x624, 1024x768 are all
specified on the same line). This results in a call to aty128fb_set_var, which
calls decode_var, which calls var_to_crtc, which gets crtc.pitch wrong.

So my question is as follows: who is wrong? Should xfree shrink vxres and
vyres to match xres and yres before calling set_var, or should the aty128fb.c
var_to_crtc routine be fixed to use vxres >> 3 instead of just xres >> 3?

If aty128fb.c needs to be fixed, here is a patch:

--- aty128fb.c.last	Sat Mar 18 23:04:24 2000
+++ aty128fb.c	Fri Mar 24 22:39:26 2000
@@ -794,8 +794,11 @@
     crtc->v_sync_strt_wid = v_sync_strt | (v_sync_wid << 16) |
                 (v_sync_pol << 23);
 
+#if 0
     crtc->pitch = xres >> 3;
-
+#else
+    crtc->pitch = vxres >> 3;
+#endif
     crtc->offset = 0;
     crtc->offset_cntl = 0;
 

But I am not sure if this makes sense alone. What use is it to get a nice
832x624 hole into a display that is virtually 1152x870?!? I can't get to any of
the kde controls, panels, etc. since they are off the screen! And it would be a
pain to have to pan around looking for them (especially since the ioctl for
panning is on the "to do" list!).

So my feeling is that both are wrong. We should shrink the virtual resolution
to match the physical resolution in xfree when mode switching, and put the
patch in place in aty128fb.c.

Comments?

Thanks,

Kevin

--
Kevin B. Hendricks
Associate Professor of Operations and Information Technology
Richard Ivey School of Business, University of Western Ontario
London, Ontario N6A-3K7 CANADA
khendricks@ivey.uwo.ca, (519) 661-3874, fax: 519-661-3959
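The effect of the wrong pitch can be worked through with the numbers from this message. The sketch below is only an illustration under stated assumptions: it assumes, as the `>> 3` in the patch suggests, that the pitch counts groups of 8 pixels per scanline, and `pitch_units` and `scanline_start_pixel` are hypothetical helper names, not functions from aty128fb.c.

```c
#include <assert.h>

/* Pitch as in the patch: pixels per scanline divided by 8. */
unsigned pitch_units(unsigned pixels_per_line)
{
    return pixels_per_line >> 3;
}

/*
 * Pixel index in the framebuffer at which the CRTC starts fetching
 * scanline y, assuming it advances pitch * 8 pixels per line.
 */
unsigned long scanline_start_pixel(unsigned y, unsigned pitch)
{
    assert(pitch > 0);
    return (unsigned long)y * pitch * 8UL;
}
```

With xres = 832 and vxres = 1152, the unpatched code programs a pitch of 104 while X lays out the framebuffer with 1152 pixels per line (pitch 144). Scanline 1 is then fetched starting at pixel 832, which is still inside the first line as X drew it, so every displayed line is skewed relative to the one before. With the patched pitch of 144, scanline 1 starts at pixel 1152 as expected.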
* Re: Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? 2000-03-25 3:54 ` Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? Kevin Hendricks @ 2000-03-25 7:57 ` Michel Dänzer 2000-03-25 8:07 ` Michel Dänzer 2000-03-25 13:46 ` Geert Uytterhoeven 1 sibling, 1 reply; 24+ messages in thread From: Michel Dänzer @ 2000-03-25 7:57 UTC (permalink / raw) To: khendricks; +Cc: Kostas Gewrgiou, Ani Joshi, linuxppc-dev Kevin Hendricks wrote: > Okay so I found the bug. It seems all through the r128 driver, crtc.pitch > values are set to the virtual x resolution (vxres) / 8. But in aty128fb.c > in the var_to_crtc routine the crtc.pitch is set to be just the xres / 8. > > This is not a problem if xres == vxres. Which is what happens when the > aty128fb.c starts up. Not necessarily. The problem could have shown up if someone had put a mode with xres < vxres as first in the "Modes" line, but apparently only configuration tools tend to do that... > So for any one mode it defaults to being okay. Okay. > Who is wrong? Should xfree shrink the vxres and vyres to match xres and > yres before calling set_var or should aty128fb.c var_to_crtc routine be > fixed to use vxres >> 3 instead of just xres >> 3? I vote for the latter, because otherwise invisible parts of the screen may be damaged, or am I wrong? A better reason might be that it works perfectly as-is in glint ;) Michel ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? 2000-03-25 7:57 ` Michel Dänzer @ 2000-03-25 8:07 ` Michel Dänzer 0 siblings, 0 replies; 24+ messages in thread From: Michel Dänzer @ 2000-03-25 8:07 UTC (permalink / raw) To: khendricks; +Cc: Kostas Gewrgiou, Ani Joshi, linuxppc-dev Michel Dänzer wrote: > > Kevin Hendricks wrote: > > > This is not a problem if xres == vxres. Which is what happens when the > > aty128fb.c starts up. > > Not necessarily. The problem could have shown up if someone had put a mode > with xres < vxres as first in the "Modes" line, but apparently only > configuration tools tend to do that... Oops. I misread you were writing about the r128 driver. My apologies. Michel ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? 2000-03-25 3:54 ` Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? Kevin Hendricks 2000-03-25 7:57 ` Michel Dänzer @ 2000-03-25 13:46 ` Geert Uytterhoeven 1 sibling, 0 replies; 24+ messages in thread From: Geert Uytterhoeven @ 2000-03-25 13:46 UTC (permalink / raw) To: Kevin Hendricks; +Cc: Kostas Gewrgiou, Ani Joshi, linuxppc-dev On Fri, 24 Mar 2000, Kevin Hendricks wrote: > Okay so I found the bug. It seems all through the r128 driver, crtc.pitch > values are set to the virtual x resolution (vxres) / 8. But in aty128fb.c in > the var_to_crtc routine the crtc.pitch is set to be just the xres / 8. Which is wrong: aty128fb must do `vxres * bpp / 8'. > Who is wrong? Should xfree shrink the vxres and vyres to match xres and yres > before calling set_var or should aty128fb.c var_to_crtc routine be fixed to use > vxres >> 3 instead of just xres >> 3? XFree86 cannot change the visible resolution on the fly. > What use is it to get a nice 832x624 hole into a display that is virtually > 1152x870?!? I can't get to any of the kde controls, panels, etc since they are > off the screen! And it would be a pain to have to pan around looking for them > (especially since the ioctl for panning is on the "to do" list!). Hence panning needs to be fixed :-) In fact panning is very simple, just change the offset of the first pixel. That's a `one-register' update. > So my feeling is that both are wrong. We should shrink the virtual resolution > to match the physical resolution in xfree when mode switching and put the patch > in place in aty128fb.c XFree86 cannot change the visible resolution on the fly, so we cannot change it. Design bug in the whole X system :-) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. 
But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
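Geert's two points, that the line length must be computed from the virtual width and that panning amounts to changing the offset of the first displayed pixel, can be sketched as plain arithmetic. This is a hedged illustration, not the aty128fb implementation: the helper names are hypothetical, and the thread does not say in which units the Rage 128 offset register is programmed, so a byte offset is assumed here.

```c
#include <assert.h>

/* Bytes per scanline of the virtual screen: vxres * bpp / 8. */
unsigned long line_length_bytes(unsigned vxres, unsigned bpp)
{
    assert(bpp % 8 == 0);               /* whole bytes per pixel assumed */
    return (unsigned long)vxres * bpp / 8;
}

/* Byte offset of the first visible pixel for a pan to (xoffset, yoffset). */
unsigned long pan_offset_bytes(unsigned xoffset, unsigned yoffset,
                               unsigned vxres, unsigned bpp)
{
    return (unsigned long)yoffset * line_length_bytes(vxres, bpp)
         + (unsigned long)xoffset * (bpp / 8);
}

/* A pan is only valid if the visible window stays inside the virtual screen. */
int pan_is_valid(unsigned xoffset, unsigned yoffset,
                 unsigned xres, unsigned yres,
                 unsigned vxres, unsigned vyres)
{
    return xoffset + xres <= vxres && yoffset + yres <= vyres;
}
```

For the 832x624 mode inside the 1152x870 virtual screen at 32 bpp, each line is 4608 bytes, and panning to the bottom-right corner (xoffset 320, yoffset 246) gives a start offset of 1134848 bytes.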
* Re: Some issues to resolve with XFree 4.0 yet
From: Kevin Hendricks @ 2000-03-25 23:50 UTC
To: Kostas Gewrgiou, Ani Joshi; +Cc: linuxppc-dev

Hi Kostas,

Okay, with the patch I posted last night for setting crtc.pitch in aty128fb.c,
mode switching now works fine, but the "panning" ioctl is still on the "todo"
list. Thanks to Geert for pointing out that you can't change virtual
resolutions on the fly with XFree (I was about to try! ;-)

You then asked me to look at getting it to work without using the FBDev. Given
my earlier patch, which calculates XCLK using OF-supplied values in the pll
registers, all you need to do to use it without FBDev is to comment out the
calls to vgaHWSave and vgaHWRestore in r128_driver.c.

From that point on, everything works like a charm.

My question is as follows: under ppc, should we ever be doing anything with
vgaHWSave and vgaHWRestore? Can I simply ifdef them out for all __powerpc__
machines?

If not, is there any way to determine on which powerpc machines an r128 card
actually can use vgaHWSave and vgaHWRestore?

I think the only outstanding issue on r128 is the damn flashing white square
when cursor images are changed. I have looked and looked at this, but I can't
figure out why this is happening unless a big white square is someone's idea of
a transparent cursor! ;-)

I have to start spending time on some other projects for a while (i.e. a real
life research project that needs to get underway), so I wanted to wrap things
up with the r128 driver for a while.

If and when I get some time, I would be happy to take a shot at taking the r128
source and making it a mach64 source, just in case you think that would be of
use (i.e. someone else hasn't done that yet and the old ati driver has not been
converted to work yet).

Thanks,

Kevin

--
Kevin B. Hendricks
Associate Professor of Operations and Information Technology
Richard Ivey School of Business, University of Western Ontario
London, Ontario N6A-3K7 CANADA
khendricks@ivey.uwo.ca, (519) 661-3874, fax: 519-661-3959
* Re: Some issues to resolve with XFree 4.0 yet
From: Kostas Gewrgiou @ 2000-03-27 11:09 UTC
To: Kevin Hendricks; +Cc: Ani Joshi, linuxppc-dev

On Sat, 25 Mar 2000, Kevin Hendricks wrote:

> Hi Kostas,
>
> Okay, with the patch I posted last night for setting crtc.pitch in aty128fb.c,
> mode switching now works fine but the "panning" ioctl is on the "todo" list yet.
> Thanks to Geert for pointing out you can't change virtual resolutions on the
> fly with XFree (I was about to try! ;-)
>

  The aty128fb in the Linus 2.3.x tree has panning working, so you can try
that and see if it's working OK.

> You then asked me to look at getting it to work without using the FBDev. Given
> my earlier patch which calculates XCLK using OF supplied values in the pll
> registers, all you need to do to use it without FBDev is to simply comment out
> the calls to vgaHWSave and vgaHWRestore in r128_driver.c.
>

  You will also need to add code to switch the framebuffer to the right
endianness for the depth, and probably disable the int10 module.

> From that point on, everything works like a charm.
>
> My question is as follows, under ppc should we ever be doing anything with
> vgaHWSave and vgaHWRestore. Can I simply ifdef them out for all __powerpc__
> machines?
>

  vgahw will not work under powerpc right now (iobase and vga memory aren't
handled right). Once it's working it will probably be useful for prep/chrp,
but for now you have to disable it.

> If not, is there any way to determine under which powerpc machines that an r128
> card actually can use vgaHWSave and vgaHWRestore.
>

  That's a good question. Right now they don't work at all under ppc for
drivers that don't switch vgahw to MMIO.

> I think the only outstanding issue on r128 is the damn flashing white square
> when cursor images are changed. I have looked and looked at this but I can't
> figure out why this is happening unless a big white square is someone's
> idea of a transparent cursor! ;-)
>

  This is strange. From what I see in the driver, it hides the cursor before
loading the image, so I can't imagine why you get the artifacts.

> I have to start spending time on some other projects for awhile (i.e. real
> life research project that needs to get underway) so I wanted to wrap things up
> with the r128 driver for awhile.
>
> If and when I get some time, I would be happy to take a shot at taking the r128
> source and making it a mach64 source just in case you think that would be of
> use (i.e. someone else hasn't done that yet and the old ati driver has not been
> converted to work yet).
>

  There is an ati driver in 4.0 (not accelerated much though). It just needs
fbdev support and probably some endian changes before it's usable under ppc.
It shouldn't be much harder than what you did to add fbdev support in r128.

  Kostas
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 11:09 ` Kostas Gewrgiou @ 2000-03-27 17:41 ` Ryuichi Oikawa 2000-03-27 18:05 ` Ani Joshi 0 siblings, 1 reply; 24+ messages in thread From: Ryuichi Oikawa @ 2000-03-27 17:41 UTC (permalink / raw) To: gewrgiou; +Cc: khendricks, ajoshi, linuxppc-dev From: Kostas Gewrgiou <gewrgiou@imbc.gr> Subject: Re: Some issues to resolve with XFree 4.0 yet > > You then asked me to look at getting it to work without using the FBDev. Given > > my earlier patch which calculates XCLK using OF supplied values in the pll > > registers, all you need to do to use it without FBDev is to simply comment out > > the calls to vgaHWSave and vgaHWRestore in r128_driver.c. > > > > You will also need to add code to switch the framebuffer in the right > endian for the depth and probably disable the int10 module. Yes, you're right. The r128 driver is now working fine on my B&W G3 without fbdev support in 8/15/16/24 bit depth so far, except one problem -- offb console becomes blank screen on VT switch(I'm not using aty128fb). r128 driver doesn't seem to restore the original state perfectly. But this isn't harmless because I have running second and third head(with xinerama). > Thats a good question, right now they don't work at all under ppc for > drivers that don't switch vgahw to MMIO. So I disabled all vgahw access to prevent seg. fault. I think Rage128 VGA register access is not necessary at least for powermacs. > > I think the only outstanding issue on r128 is the damn flashing white square > > when cursor images are changed. I have looked and looked at this but I can't > > figure out why this is happening unless a big white square is someone's > > idea of a transparent cursor! ;-) > > > This is strange, from what i see in the driver it hides the cursor before > loading the image so i can't imagine why you get the artifacts Though I could be wrong, it may not be strange. 
R128LoadCursorImage() starts display cursor immediately after the cursor image is written to the frame buffer, but rage128 frame buffer write is always FIFO'ed while CRTC write is never FIFO'ed. So it'll be possible to start display cursor before the image write is complete. In my case I commented out cursor ON/OFF code in R128LoadCursorImage() since mid-level routine calls R128HideCursor/R128ShowCursor before and after cursor image is loaded. I haven't seen this cursor flashing yet. BTW, I noticed an interesting x11perf score. x11perf -scroll500 marked ~300/sec for ATI Rage128RE connected to 66MHz bus on B&W G3 rev.1, but ~600/sec for an old Matrox Millennium II to 33MHz bus, measured at 32bpp/24bit depth. Regards, Ryuichi Oikawa roikawa@rr.iij4u.or.jp ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 17:41 ` Ryuichi Oikawa @ 2000-03-27 18:05 ` Ani Joshi 2000-03-27 19:06 ` Kevin B. Hendricks 2000-03-28 16:51 ` Ryuichi Oikawa 0 siblings, 2 replies; 24+ messages in thread From: Ani Joshi @ 2000-03-27 18:05 UTC (permalink / raw) To: Ryuichi Oikawa; +Cc: linuxppc-dev On Tue, 28 Mar 2000, Ryuichi Oikawa wrote: > BTW, I noticed an interesting x11perf score. x11perf -scroll500 marked ~300/sec > for ATI Rage128RE connected to 66MHz bus on B&W G3 rev.1, but ~600/sec for an > old Matrox Millennium II to 33MHz bus, measured at 32bpp/24bit depth. are you using the patch I posted last week? If not, then I suggest you do. I fixed the improper load/stores in r128 and it shows a 200% increase in almost all x11perf tests. example: (jack howarth tested these on his g4/450 (rage 128pro?)): before: Scroll 500x500 pixels: 583.0/sec after: Scroll 500x500 pixels: 1060.0/sec > Ryuichi Oikawa > roikawa@rr.iij4u.or.jp ani ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet
From: Kevin B. Hendricks @ 2000-03-27 19:06 UTC
To: Ani Joshi, Ryuichi Oikawa; +Cc: linuxppc-dev

Hi Ani and Ryuichi,

>are you using the patch I posted last week? If not, then I suggest you
>do. I fixed the improper load/stores in r128 and it shows a 200% increase
>in almost all x11perf tests.

Actually, you might want to try Gabriel Paubert's patch, which simply removes
the "volatile" from the base_addr parameter. The incorrectly specified
volatile on the parameter (which really makes no sense if you think about it
;-)) is what was causing all the problems with inefficiency.

Interestingly, with this patch you can actually save one extra instruction
over Ani's patch, but either one is a big, big improvement.

Kevin

----snip-here-for Gabriel_Paubert's_e-mail_with_patch----

> Hi,
>
> From comparing the performance of the XFree 4.0 r128 drivers across x86 and
> ppc we noticed that the ppc version was much slower. The following patch
> made a huge change in x11perf results (improvement). This is on a ppc
> with glibc 2.1.3 and the latest gcc 2.95.2 from Franz Sirl.
>
> Did I write the output constraint version incorrectly? Is this what you
> expected the generated code to look like?

I have just made a test with suppressing the volatile in the parameter
to the regr/regw/regr16/regw16 macros and the code is even better
(one instruction less than with the memory clobber):

000003d4 <R128Blank>:
 3d4:   81 43 00 f8     lwz     r10,248(r3)
 3d8:   81 6a 00 24     lwz     r11,36(r10)
 3dc:   39 20 00 54     li      r9,84
 3e0:   7c 09 5c 2c     lwbrx   r0,r9,r11
 3e4:   7c 00 06 ac     eieio
 3e8:   60 00 04 00     ori     r0,r0,1024
 3ec:   7c 09 5d 2c     stwbrx  r0,r9,r11
 3f0:   7c 00 06 ac     eieio
 3f4:   4e 80 00 20     blr

the diff is:

--- r128_reg.h~	Sat Feb 26 06:38:43 2000
+++ r128_reg.h	Fri Mar 24 23:47:31 2000
@@ -48,19 +48,19 @@
 
 #if defined(__powerpc__)
 
-static inline void regw(volatile unsigned long base_addr, unsigned long regindex, unsigned long regdata)
+static inline void regw(unsigned long base_addr, unsigned long regindex, unsigned long regdata)
 {
   asm volatile ("stwbrx %1,%2,%3; eieio"
                 : "=m" (*(volatile unsigned *)(base_addr+regindex))
                 : "r" (regdata), "b" (regindex), "r" (base_addr));
 }
 
-static inline void regw16(volatile unsigned long base_addr, unsigned long regindex, unsigned short regdata)
+static inline void regw16(unsigned long base_addr, unsigned long regindex, unsigned short regdata)
 {
   asm volatile ("sthbrx %0,%1,%2; eieio": : "r"(regdata), "b"(regindex), "r"(base_addr));
 }
 
-static inline unsigned long regr(volatile unsigned long base_addr, unsigned long regindex)
+static inline unsigned long regr(unsigned long base_addr, unsigned long regindex)
 {
   register unsigned long val;
   asm volatile ("lwbrx %0,%1,%2; eieio"
@@ -70,7 +70,7 @@
   return(val);
 }
 
-static inline unsigned short regr16(volatile unsigned long base_addr, unsigned long regindex)
+static inline unsigned short regr16(unsigned long base_addr, unsigned long regindex)
 {
   register unsigned short val;
   asm volatile ("lhbrx %0,%1,%2; eieio": "=r"(val):"b"(regindex), "r"(base_addr));
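The R128Blank listing above is a read-modify-write of a little-endian register from a big-endian CPU: a byte-reversed load from offset 84, an OR with 1024, and a byte-reversed store. The sketch below models only that data flow in portable C on an ordinary byte array. It is an assumption-laden illustration, not a replacement for the inline assembly: `reg_read_le32`, `reg_write_le32` and `reg_set_bits` are hypothetical names, and real MMIO needs single 32-bit accesses plus an ordering barrier such as eieio, neither of which this byte-wise model provides.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit little-endian value, independent of host byte order. */
uint32_t reg_read_le32(const volatile uint8_t *base, unsigned long index)
{
    const volatile uint8_t *p = base + index;

    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value in little-endian byte order. */
void reg_write_le32(volatile uint8_t *base, unsigned long index, uint32_t v)
{
    volatile uint8_t *p = base + index;

    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read-modify-write: the load / ori / store sequence from the listing. */
void reg_set_bits(volatile uint8_t *base, unsigned long index, uint32_t bits)
{
    reg_write_le32(base, index, reg_read_le32(base, index) | bits);
}
```

Calling `reg_set_bits(regs, 84, 1024)` on a zeroed array sets the byte at index 85 to 0x04, which is the memory image a little-endian device expects for bit 10 regardless of the host's byte order.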
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 19:06 ` Kevin B. Hendricks @ 2000-03-27 19:13 ` David Edelsohn 2000-03-27 19:20 ` Kevin B. Hendricks ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: David Edelsohn @ 2000-03-27 19:13 UTC (permalink / raw) To: Kevin B. Hendricks; +Cc: Ani Joshi, Ryuichi Oikawa, linuxppc-dev Gabriel's patch is the correct way to address the problem and should be the one which goes into the public sources. I do not understand, however, why the patch only includes the "=m" constraint on regw() and not regw16(). All of the inlined functions should have constraints which reference the actual memory address read or written to ensure proper dependencies in optimized code. The problem was the unnecessary "volatile" not the memory constraint. David ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 19:13 ` David Edelsohn @ 2000-03-27 19:20 ` Kevin B. Hendricks 2000-03-27 19:25 ` Ani Joshi 2000-03-29 10:45 ` Gabriel Paubert 2 siblings, 0 replies; 24+ messages in thread From: Kevin B. Hendricks @ 2000-03-27 19:20 UTC (permalink / raw) To: David Edelsohn; +Cc: Ani Joshi, Ryuichi Oikawa, linuxppc-dev Hi, Okay I will add this. But right now the regr16 and regw16 macros are not used at all in the r128 code in xfree86 4.0. (I put them there only for completeness and to match the x86 versions). Thanks, Kevin At 14:13 -0500 3/27/00, David Edelsohn wrote: > Gabriel's patch is the correct way to address the problem and >should be the one which goes into the public sources. > > I do not understand, however, why the patch only includes the "=m" >constraint on regw() and not regw16(). All of the inlined functions >should have constraints which reference the actual memory address read or >written to ensure proper dependencies in optimized code. > > The problem was the unnecessary "volatile" not the memory >constraint. > >David ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet
From: Ani Joshi @ 2000-03-27 19:25 UTC
To: David Edelsohn; +Cc: Kevin B. Hendricks, Ryuichi Oikawa, linuxppc-dev

Can anybody explain how method a) is different from/better than method b)?
A *lot* of drivers are using method b), so is that to say all the developers
who are using it are wrong and should change?

a)
  asm volatile ("stwbrx %1,%2,%3; eieio"
                : "=m" (*(volatile unsigned *)(base_addr+regindex))
                : "r" (regdata), "b" (regindex), "r" (base_addr));

b)
  asm volatile ("stwbrx %0,%1,%2; eieio"
                : : "r"(regdata), "b" (regindex), "r"(base_addr)
                : "memory");

a)
  asm volatile ("lwbrx %0,%1,%2; eieio"
                : "=r"(val)
                : "b"(regindex), "r"(base_addr),
                  "m" (*(volatile unsigned *)(base_addr+regindex)));

b)
  asm volatile ("lwbrx %0,%1,%2; eieio"
                : "=r"(val)
                : "b"(regindex), "r"(base_addr));

ani
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 19:25 ` Ani Joshi @ 2000-03-27 19:45 ` David Edelsohn 2000-03-27 19:38 ` Ani Joshi 2000-03-27 19:48 ` Kevin B. Hendricks 1 sibling, 1 reply; 24+ messages in thread From: David Edelsohn @ 2000-03-27 19:45 UTC (permalink / raw) To: Ani Joshi; +Cc: Kevin B. Hendricks, Ryuichi Oikawa, linuxppc-dev Method "a" says that the memory at the specified address was read/written. Method "b" says that some unspecified piece of memory was affected and all future references to memory need to be reloaded to ensure that the correct value is used. David ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 19:45 ` David Edelsohn @ 2000-03-27 19:38 ` Ani Joshi 2000-03-27 20:01 ` David Edelsohn 0 siblings, 1 reply; 24+ messages in thread From: Ani Joshi @ 2000-03-27 19:38 UTC (permalink / raw) To: David Edelsohn; +Cc: Kevin B. Hendricks, Ryuichi Oikawa, linuxppc-dev So does this mean all drivers (atyfb, aty128fb, xpmac to name a few) which use method "b" should change to "a"? ani On Mon, 27 Mar 2000, David Edelsohn wrote: > Method "a" says that the memory at the specified address was > read/written. Method "b" says that some unspecified piece of memory was > affected and all future references to memory need to be reloaded to ensure > that the correct value is used. > > David > ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet 2000-03-27 19:38 ` Ani Joshi @ 2000-03-27 20:01 ` David Edelsohn 0 siblings, 0 replies; 24+ messages in thread From: David Edelsohn @ 2000-03-27 20:01 UTC (permalink / raw) To: Ani Joshi; +Cc: Kevin B. Hendricks, Ryuichi Oikawa, linuxppc-dev >>>>> Ani Joshi writes: Ani> So does this mean all drivers (atyfb, aty128fb, xpmac to name a few) which Ani> use method "b" should change to "a"? In general, I think that it is recommended that inline assembly use exact constraints and not "memory". "memory" clobber is intended for inlined assembly implementing something like memcpy(). Even the GCC inline asm tutorial from April 1988 uses "=m" constraints which are not referenced in the output template. For a more definitive statement, I think that you should ask on the gcc@gcc.gnu.org mailinglist. David ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Some issues to resolve with XFree 4.0 yet
From: Kevin B. Hendricks @ 2000-03-27 19:48 UTC
To: Ani Joshi, David Edelsohn; +Cc: Ryuichi Oikawa, linuxppc-dev

Hi Ani,

I asked the same things a few weeks back. David is the expert, I am not.
I think the key is what David just wrote:

>All of the inlined functions
>should have constraints which reference the actual memory address read or
>written to ensure proper dependencies in optimized code.

I think the memory constraint on the exact address prevents the compiler
from moving this inline code to someplace inappropriate, but David and
Gabriel could answer better.

Kevin
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-27 19:48 ` Kevin B. Hendricks
@ 2000-03-28 7:59 ` Geert Uytterhoeven
  0 siblings, 0 replies; 24+ messages in thread
From: Geert Uytterhoeven @ 2000-03-28 7:59 UTC
To: Kevin B. Hendricks
Cc: Ani Joshi, David Edelsohn, Ryuichi Oikawa, linuxppc-dev

On Mon, 27 Mar 2000, Kevin B. Hendricks wrote:
> I asked the same things a few weeks back. David is the expert, I am not.
> I think the key is what David just wrote:
>
> >All of the inlined functions
> >should have constraints which reference the actual memory address read or
> >written to ensure proper dependencies in optimized code.
>
> I think the memory constraint on the exact address prevents the compiler
> from moving this inline code to someplace inappropriate, but David and
> Gabriel could answer better.

It's not about moving inline code, it's about which in-memory variables the
compiler thinks may be clobbered by the inline code.

If the constraint says that `memory' is clobbered, the compiler cannot know
which variables were clobbered, so it will assume they were all clobbered and
it will reload them from memory into the registers instead of reusing the
register values.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
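[Editorial sketch, not part of the original message.] The difference between method "a" and method "b" described above can be illustrated with GCC extended asm. This is a minimal sketch under stated assumptions: the asm templates are left empty so that it compiles on any GCC target, the names touch_exact, touch_any, sum_exact and sum_any are hypothetical, and none of this is taken from the atyfb/aty128fb sources or from Gabriel's patch.

```c
#include <assert.h>
#include <stdint.h>

/* Method "a": the constraint names the exact object the asm reads and
 * writes. The template is empty, so no instruction is emitted; only the
 * dependency information given to the compiler matters in this sketch. */
static inline void touch_exact(uint32_t *addr)
{
    __asm__ __volatile__("" : "+m"(*addr));
}

/* Method "b": a blanket "memory" clobber. The compiler must assume that
 * any object in memory may have changed. */
static inline void touch_any(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Hypothetical caller: *cached is read before and after the asm.
 * Because only *reg is named (and restrict promises no aliasing), the
 * compiler is allowed to reuse the first load of *cached. */
static uint32_t sum_exact(uint32_t *restrict reg,
                          const uint32_t *restrict cached)
{
    uint32_t before = *cached;
    touch_exact(reg);
    return before + *cached;
}

/* Same computation with the blanket clobber: the compiler has to reload
 * *cached after the asm, since it cannot know what was modified. */
static uint32_t sum_any(const uint32_t *cached)
{
    uint32_t before = *cached;
    touch_any();
    return before + *cached;
}
```

Both functions return the same value; the difference only shows up in the generated code, where the second version is expected to contain an extra load. Comparing the output of `gcc -O2 -S` for the two is one way to check this on a given compiler.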
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-27 19:13 ` David Edelsohn
  2000-03-27 19:20 ` Kevin B. Hendricks
  2000-03-27 19:25 ` Ani Joshi
@ 2000-03-29 10:45 ` Gabriel Paubert
  2000-03-29 13:11 ` Franz Sirl
  2 siblings, 1 reply; 24+ messages in thread
From: Gabriel Paubert @ 2000-03-29 10:45 UTC
To: David Edelsohn
Cc: Kevin B. Hendricks, Ani Joshi, Ryuichi Oikawa, linuxppc-dev

On Mon, 27 Mar 2000, David Edelsohn wrote:

> Gabriel's patch is the correct way to address the problem and
> should be the one which goes into the public sources.

Thanks, and sorry for the delay. I was busy on other fronts (my wife had
surgery on Monday, nothing serious however).

> I do not understand, however, why the patch only includes the "=m"
> constraint on regw() and not regw16(). All of the inlined functions
> should have constraints which reference the actual memory address read or
> written to ensure proper dependencies in optimized code.

Well, I overlooked that. What I wanted to insist upon in my patch was that
the volatile was absolutely unnecessary. Actually, if it were my call I
would have declared the variable as volatile unsigned char *, which is the
right type for address arithmetic and eliminates any need for a cast on byte
accesses. I consider minimizing the number of required casts as the right
guideline to choose the variable type in this case.

BTW, did anybody think of adding a __builtin_byteswap to GCC ?

This would allow the compiler to directly generate *brx instructions on
PPC by combining them with memory loads and stores. I'm aware that it
would require an additional constraint letter for indexed addressing mode
only but this is required for Altivec anyway.

And this would open opportunities for quite a lot of optimizations, for
example when setting or clearing some bits in a device register. In this
latter case (and in the given example) operands are often constants and
the compiler could generate non byte swapped loads and stores and byte swap
the constants.

Gabriel.
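[Editorial sketch, not part of the original message.] The optimization described above, swapping the constant instead of the loaded value, can be checked in portable C. This is only an illustration: swap32, set_bits_swapped and set_bits_const are hypothetical names, and the "register" is an ordinary variable rather than a real device register. (GCC releases later than this thread provide `__builtin_bswap32`; the sketch uses a hand-written helper so that it does not depend on it.)

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit byte swap (hypothetical helper). */
static inline uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Straightforward form: byte-swapped load, modify, byte-swapped store.
 * On PPC the two swaps would map onto lwbrx/stwbrx. */
static inline void set_bits_swapped(volatile uint32_t *reg, uint32_t mask)
{
    uint32_t v = swap32(*reg);
    v |= mask;
    *reg = swap32(v);
}

/* Equivalent form: plain load and store, with the swap applied to the
 * mask. When the mask is a compile-time constant the swap costs nothing
 * at run time. This works because OR (like AND and XOR) acts on each bit
 * independently, so it commutes with any permutation of the bytes. */
static inline void set_bits_const(volatile uint32_t *reg, uint32_t mask)
{
    uint32_t v = *reg;
    *reg = v | swap32(mask);
}
```

The same identity does not hold for arithmetic such as addition, where carries cross byte boundaries, so a compiler could apply the transformation only to bitwise operations.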
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-29 10:45 ` Gabriel Paubert
@ 2000-03-29 13:11 ` Franz Sirl
  2000-03-29 14:58 ` Gabriel Paubert
  0 siblings, 1 reply; 24+ messages in thread
From: Franz Sirl @ 2000-03-29 13:11 UTC
To: Gabriel Paubert
Cc: David Edelsohn, Kevin B. Hendricks, Ani Joshi, Ryuichi Oikawa, linuxppc-dev

At 12:45 29.03.00, Gabriel Paubert wrote:
>BTW, did anybody think of adding a __builtin_byteswap to GCC ?
>
>This would allow the compiler to directly generate *brx instructions on
>PPC by combining them with memory loads and stores. I'm aware that it
>would require an additional constraint letter for indexed addressing mode
>only but this is required for Altivec anyway.
>
>And this would open opportunities for quite a lot of optimizations, for
>example when setting or clearing some bits in a device register. In this
>latter case (and in the given example) operands are often constants and
>the compiler could generate non byte swapped loads and stores and byte swap
>the constants.

Hmm, I was thinking about adding __attribute__((little_endian)) and
__attribute__((big_endian)) to further describe variables. This should give
us optimum optimization on various platforms. Even things like:

union {
  unsigned long little_var __attribute__((little_endian));
  unsigned long big_var __attribute__((big_endian));
}

should be possible then.

Franz.
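[Editorial sketch, not part of the original message.] The little_endian and big_endian attributes above are a proposal, not something the sketch below relies on. What such annotations would mean for the union example can be approximated with explicit accessors that read the same four bytes in either byte order. The names load_le32, load_be32 and store_le32 are hypothetical, and a fixed 32-bit width is assumed in place of unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Read four bytes as a value stored least-significant byte first. */
static inline uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read the same four bytes as a value stored most-significant byte first. */
static inline uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Store a value least-significant byte first, whatever the host order. */
static inline void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xffu);
    p[1] = (unsigned char)((v >> 8) & 0xffu);
    p[2] = (unsigned char)((v >> 16) & 0xffu);
    p[3] = (unsigned char)((v >> 24) & 0xffu);
}
```

With an attribute, the compiler would pick the cheapest instruction sequence for each access on the target at hand; with accessors like these the result is the same on every host, but the choice of instructions is left to the optimizer's pattern matching.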
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-29 13:11 ` Franz Sirl
@ 2000-03-29 14:58 ` Gabriel Paubert
  2000-03-29 19:39 ` Franz Sirl
  0 siblings, 1 reply; 24+ messages in thread
From: Gabriel Paubert @ 2000-03-29 14:58 UTC
To: Franz Sirl
Cc: David Edelsohn, Kevin B. Hendricks, Ani Joshi, Ryuichi Oikawa, linuxppc-dev

On Wed, 29 Mar 2000, Franz Sirl wrote:

> Hmm, I was thinking about adding __attribute__((little_endian)) and
> __attribute__((big_endian)) to further describe variables. This should give
> us optimum optimization on various platforms. Even things like:
>
> union {
>   unsigned long little_var __attribute__((little_endian));
>   unsigned long big_var __attribute__((big_endian));
> }
>
> should be possible then.

Indeed, but I was considering it as a later step. I have the feeling that
adding a builtin would be simpler and would make it possible to build the
necessary infrastructure for attribute support with minimal intermediate
breakage (perhaps by implementing it only on some architectures at first).

Furthermore, the subject of adding this attribute has appeared on a quite
regular basis on GCC mailing lists in the last few years and nothing has
ever been done about it AFAICT. Perhaps a different strategy through
builtin functions would get things started, that's why I'm suggesting it.

Gabriel.
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-29 14:58 ` Gabriel Paubert
@ 2000-03-29 19:39 ` Franz Sirl
  0 siblings, 0 replies; 24+ messages in thread
From: Franz Sirl @ 2000-03-29 19:39 UTC
To: Gabriel Paubert
Cc: David Edelsohn, Kevin B. Hendricks, Ani Joshi, Ryuichi Oikawa, linuxppc-dev

On Wed, 29 Mar 2000, Gabriel Paubert wrote:
>On Wed, 29 Mar 2000, Franz Sirl wrote:
>
>> Hmm, I was thinking about adding __attribute__((little_endian)) and
>> __attribute__((big_endian)) to further describe variables. This should give
>> us optimum optimization on various platforms. Even things like:
>>
>> union {
>>   unsigned long little_var __attribute__((little_endian));
>>   unsigned long big_var __attribute__((big_endian));
>> }
>>
>> should be possible then.
>
>Indeed, but I was considering it as a later step. I have the feeling that
>adding a builtin would be simpler and would allow to build the necessary
>infrastructure for attribute support with minimal intermediate breakage
>(perhaps by implementing it only on some architectures at first).

Adding the attributes is quite simple; the part not quite clear to me yet is
how to evaluate the attribute in the backend and whether we need middle-end
support. Evaluating the attribute directly in the backend mov* patterns seems
straightforward, but separate reversed_mov* patterns may be more
appropriate...

>Furthermore, the subject of adding this attribute has appeared on a quite
>regular basis on GCC mailing lists in the last few years and nothing has
>ever been done about it AFAICT. Perhaps a different strategy through
>builtin functions would get things started, that's why I'm suggesting it.

As always with GCC, if you want something done, do it yourself :-) (unless you
can pay somebody for coding).

Franz.
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-27 18:05 ` Ani Joshi
  2000-03-27 19:06 ` Kevin B. Hendricks
@ 2000-03-28 16:51 ` Ryuichi Oikawa
  2000-03-28 17:51 ` Geert Uytterhoeven
  1 sibling, 1 reply; 24+ messages in thread
From: Ryuichi Oikawa @ 2000-03-28 16:51 UTC
To: ajoshi; +Cc: roikawa, linuxppc-dev

From: Ani Joshi <ajoshi@shell.unixbox.com>
Subject: Re: Some issues to resolve with XFree 4.0 yet

> On Tue, 28 Mar 2000, Ryuichi Oikawa wrote:
>
> > BTW, I noticed an interesting x11perf score. x11perf -scroll500 marked ~300/sec
> > for ATI Rage128RE connected to 66MHz bus on B&W G3 rev.1, but ~600/sec for an
> > old Matrox Millennium II to 33MHz bus, measured at 32bpp/24bit depth.
>
> are you using the patch I posted last week? If not, then I suggest you

Yes, I am. But Rage128RE score was only ~300/sec at 1280x1024x32bpp/75Hz.
(The score is very sensitive to actual (not virtual) screen size and refresh rate).

> do. I fixed the improper load/stores in r128 and it shows a 200% increase
> in almost all x11perf tests. example:
>
> (jack howarth tested these on his g4/450 (rage 128pro?)):
>
> before:
> Scroll 500x500 pixels: 583.0/sec
>
> after:
> Scroll 500x500 pixels: 1060.0/sec

Regards,
Ryuichi Oikawa
roikawa@rr.iij4u.or.jp
* Re: Some issues to resolve with XFree 4.0 yet
  2000-03-28 16:51 ` Ryuichi Oikawa
@ 2000-03-28 17:51 ` Geert Uytterhoeven
  0 siblings, 0 replies; 24+ messages in thread
From: Geert Uytterhoeven @ 2000-03-28 17:51 UTC
To: Ryuichi Oikawa; +Cc: ajoshi, linuxppc-dev

On Wed, 29 Mar 2000, Ryuichi Oikawa wrote:
> From: Ani Joshi <ajoshi@shell.unixbox.com>
> Subject: Re: Some issues to resolve with XFree 4.0 yet
> > On Tue, 28 Mar 2000, Ryuichi Oikawa wrote:
> > > BTW, I noticed an interesting x11perf score. x11perf -scroll500 marked ~300/sec
> > > for ATI Rage128RE connected to 66MHz bus on B&W G3 rev.1, but ~600/sec for an
> > > old Matrox Millennium II to 33MHz bus, measured at 32bpp/24bit depth.
> >
> > are you using the patch I posted last week? If not, then I suggest you
> Yes, I am. But Rage128RE score was only ~300/sec at 1280x1024x32bpp/75Hz.
> (The score is very sensitive to actual(not virtual) screen size and refresh rate).

The old Millennium II had dual-ported WRAM, so the accel engine wasn't
sensitive to refresh rate. But single-ported commodity SDRAM/SGRAM is cheaper,
so WRAM was killed.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Thread overview: 24+ messages
[not found] <Pine.LNX.4.10.10003230911180.6826-100000@shell.unixbox.com>
2000-03-23 18:16 ` Some issues to resolve with XFree 4.0 yet Kevin Hendricks
2000-03-25 3:54 ` Found bug in mode switching but who is at fault...XFree86 or aty128fb.c? Kevin Hendricks
2000-03-25 7:57 ` Michel Dänzer
2000-03-25 8:07 ` Michel Dänzer
2000-03-25 13:46 ` Geert Uytterhoeven
2000-03-25 23:50 ` Some issues to resolve with XFree 4.0 yet Kevin Hendricks
2000-03-27 11:09 ` Kostas Gewrgiou
2000-03-27 17:41 ` Ryuichi Oikawa
2000-03-27 18:05 ` Ani Joshi
2000-03-27 19:06 ` Kevin B. Hendricks
2000-03-27 19:13 ` David Edelsohn
2000-03-27 19:20 ` Kevin B. Hendricks
2000-03-27 19:25 ` Ani Joshi
2000-03-27 19:45 ` David Edelsohn
2000-03-27 19:38 ` Ani Joshi
2000-03-27 20:01 ` David Edelsohn
2000-03-27 19:48 ` Kevin B. Hendricks
2000-03-28 7:59 ` Geert Uytterhoeven
2000-03-29 10:45 ` Gabriel Paubert
2000-03-29 13:11 ` Franz Sirl
2000-03-29 14:58 ` Gabriel Paubert
2000-03-29 19:39 ` Franz Sirl
2000-03-28 16:51 ` Ryuichi Oikawa
2000-03-28 17:51 ` Geert Uytterhoeven