From mboxrd@z Thu Jan 1 00:00:00 1970
From: Antonino Daplas
Subject: Re: RFC: Optimizing putcs()
Date: 07 Aug 2002 13:25:46 +0800
Sender: linux-fbdev-devel-admin@lists.sourceforge.net
Message-ID: <1028697994.561.3.camel@daplas>
References: <1028584418.556.29.camel@daplas>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-z9XGX2oo3DeYZDzemGo1"
Received: from [203.167.79.9] (helo=willow.compass.com.ph)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 17cJGV-0007eG-00 for ; Tue, 06 Aug 2002 22:21:48 -0700
Received: from AP-203.167.30.67.sysads.com (cwd67.compass.com.ph [203.167.30.67])
	by willow.compass.com.ph (8.9.3/8.9.3) with ESMTP id NAA10517
	for ; Wed, 7 Aug 2002 13:21:28 +0800 (PHT)
	(envelope-from adaplas@pol.net)
In-Reply-To: <1028584418.556.29.camel@daplas>
Errors-To: linux-fbdev-devel-admin@lists.sourceforge.net
To: fbdev

--=-z9XGX2oo3DeYZDzemGo1
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

One of the reasons 2.4 console performance is good, especially at low
bit depths, is that it processes more than one pixel per iteration and
uses mask arrays.

I tried to generalize this in cfbimgblt.c by incorporating the idea
from fbcon-cfb*.c. It's significantly faster, but still not as fast as
the 2.4 API.

time cat /usr/src/linux/MAINTAINERS (40K text file)
1024x768-8bpp, y-panning disabled

2.5 old (with offscreen buffers)
real    0m10.708s
user    0m0.001s
sys     0m10.707s

2.5 new
real    0m4.378s
user    0m0.002s
sys     0m4.375s

2.4
real    0m2.098s
user    0m0.000s
sys     0m2.070s

I've only tested the implementation at 8, 16, 24, and 32 bpp.
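For readers unfamiliar with the mask-array idea, here is a minimal sketch of how it can work at 8bpp. It is only an illustration, not the code from the attached cfbimgblt.c or from fbcon-cfb*.c: the names mask_tab, build_mask_tab, expand_row_8bpp and expand_row_8bpp_slow are made up for this example, and it assumes an 8-pixel-wide glyph row with the leftmost pixel in the most significant bit. Each 4-bit group of font data indexes a 16-entry table of 32-bit masks, so four pixels are produced per store instead of one.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical table: one 32-bit mask per 4-bit group of font data. */
static uint32_t mask_tab[16];

/*
 * Fill the table. In entry n, byte i is 0xFF when bit (3 - i) of n is set,
 * so the leftmost font bit lands on the lowest-addressed pixel. Building
 * the word through memcpy keeps the table correct on either endianness.
 */
static void build_mask_tab(void)
{
    for (unsigned n = 0; n < 16; n++) {
        uint8_t px[4];

        for (unsigned i = 0; i < 4; i++)
            px[i] = (n & (8u >> i)) ? 0xFF : 0x00;
        memcpy(&mask_tab[n], px, sizeof(px));
    }
}

/* Expand one 8-pixel font row into 8 bytes of 8bpp pixels, 4 per store. */
static void expand_row_8bpp(uint8_t *dst, uint8_t bits, uint8_t fg, uint8_t bg)
{
    uint32_t fgx = fg * 0x01010101u;   /* fg replicated into all 4 bytes */
    uint32_t bgx = bg * 0x01010101u;   /* bg replicated into all 4 bytes */
    uint32_t eorx = fgx ^ bgx;
    uint32_t w;

    /* Where the mask is 0xFF the XOR turns bg into fg; elsewhere bg stays. */
    w = (mask_tab[bits >> 4] & eorx) ^ bgx;
    memcpy(dst, &w, sizeof(w));
    w = (mask_tab[bits & 0x0F] & eorx) ^ bgx;
    memcpy(dst + 4, &w, sizeof(w));
}

/* Reference version, one pixel per iteration, for comparison. */
static void expand_row_8bpp_slow(uint8_t *dst, uint8_t bits, uint8_t fg, uint8_t bg)
{
    for (unsigned i = 0; i < 8; i++)
        dst[i] = (bits & (0x80u >> i)) ? fg : bg;
}
```

build_mask_tab() has to be called once before expand_row_8bpp(). The slow version is there only to show what the table lookup replaces; both should fill dst with identical bytes for any bits, fg and bg.
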
24bpp is slightly slower than 32bpp :(

Tony

--=-z9XGX2oo3DeYZDzemGo1
Content-Disposition: attachment; filename=cfbimgblt.c
Content-Transfer-Encoding: quoted-printable
Content-Type: text/x-c; name=cfbimgblt.c; charset=ISO-8859-1

/*
 *  Generic BitBLT function for frame buffer with packed pixels of any depth.
 *
 *      Copyright (C) June 1999 James Simmons
 *
 *  This file is subject to the terms and conditions of the GNU General Public
 *  License.  See the file COPYING in the main directory of this archive for
 *  more details.
 *
 * NOTES:
 *
 *    This function copies an image from system memory to video memory. The
 *  image can be a bitmap where each 0 represents the background color and
 *  each 1 represents the foreground color. Great for font handling. It can
 *  also be a color image. This is determined by image_depth. The color image
 *  must be laid out exactly in the same format as the framebuffer. Yes, I know
 *  there are cards with hardware that converts images of various depths to the
 *  framebuffer depth. But not every card has this. All images must be rounded
 *  up to the nearest byte. For example, a bitmap 12 bits wide must be two
 *  bytes wide.
 *
 *  FIXME
 *  The code for 24 bit is horrible. It copies byte by byte instead of
 *  by longs like the other sizes. Needs to be optimized.
 *
 *  Tony:
 *  Incorporate mask tables similar to fbcon-cfb*.c in 2.4 API. This speeds
 *  up the code significantly.
 *
 *  Code for depths that are not multiples of BITS_PER_LONG is still kludgy:
 *  those depths are still processed a bit at a time.
 *
 *  Also need to add code to deal with cards whose endianness differs from
 *  the native CPU's. I also need to deal with MSB position in the word.
 *
 */
#include
#include
#include
#include