From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Richter
Subject: Performance drop on XDrawRectangles()
Date: Wed, 08 Jan 2014 16:57:52 +0100
Message-ID: <52CD7580.8040106@rus.uni-stuttgart.de>
Reply-To: richter@rus.uni-stuttgart.de
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: intel-gfx-bounces@lists.freedesktop.org
Errors-To: intel-gfx-bounces@lists.freedesktop.org
To: Daniel Vetter, intel-gfx
List-Id: intel-gfx@lists.freedesktop.org

Hi folks,

during the changes from 3.12rc7 to 3.13rc4, the performance of
XDrawRectangles() dropped considerably. Interestingly, it is not the raw
rectangle drawing operations that are slow; rather, the per-call overhead
seems to have increased by an order of magnitude.

Specifically, if you use the unmodified "x11perf" program:

    x11perf -rect10

no substantial changes are visible.
However, if the rectangles are drawn one by one, by changing

/* snip: old version, lines 86ff of do_rects.c of the x11perf program */
void DoRectangles(XParms xp, Parms p, int reps)
{
    int i;

    for (i = 0; i != reps; i++) {
        XFillRectangles(xp->d, xp->w, pgc, rects, p->objects);
        if (pgc == xp->bggc)
            pgc = xp->fggc;
        else
            pgc = xp->bggc;
        CheckAbort ();
    }
}

/* to the following : */
void DoRectangles(XParms xp, Parms p, int reps)
{
    int i;
    int j;

    for (i = 0; i != reps; i++) {
        for (j = 0; j < p->objects; j++) {
            XFillRectangles(xp->d, xp->w, pgc, rects+j, 1);
        }
        if (pgc == xp->bggc)
            pgc = xp->fggc;
        else
            pgc = xp->bggc;
        CheckAbort ();
    }
}

the performance decreases to roughly one sixth of the original
performance:

 400000 trep @ 0.0687 msec ( 14600.0/sec): 10x10 rectangle (new)
2500000 trep @ 0.0107 msec ( 93900.0/sec): 10x10 rectangle (old)

Thus, apparently, it is not the actual hardware acceleration that
degraded; something in the call path has slowed down each call
considerably.

Any idea what changed?

Greetings,
Thomas