From: Vladimir 'φ-coder/phcoder' Serbinenko
Date: Fri, 25 Jun 2010 20:04:41 +0200
To: grub-devel@gnu.org
Subject: Re: [PATCH] Optimise memset on i386
Message-ID: <4C24EFB9.7000201@gmail.com>
In-Reply-To: <20100623213838.GI21862@riva.ucam.org>
References: <20100623213838.GI21862@riva.ucam.org>

On 06/23/2010 11:38 PM, Colin Watson wrote:
> With this approach, one of the most noticeable time sinks is that
> setting a graphical video mode (I'm using the VBE backend) takes ages:
> 1.6 seconds, which is a substantial percentage of this project's total
> boot time.  It turns out that most of this is spent initialising
> double-buffering: doublebuf_pageflipping_init calls
> grub_video_fb_create_render_target_from_pointer twice, and each call
> takes a little over 600 milliseconds.  Now,
> grub_video_fb_create_render_target_from_pointer is basically just a big
> grub_memset to clear framebuffer memory, so this equates to under two
> frames per second.  What's going on?
>
> It turns out that write caching is disabled on video memory when GRUB is
> running, so we take a cache stall on every single write, and it's
> apparently hard to enable caching without implementing MTRRs.  People
> who know more about this than I do tell me that this can get
> unpleasantly CPU-specific at times, although I still hold out some hope
> that it's possible in GRUB.
>

On non-device memory GRUB should take advantage of the cache. On MIPS,
enabling or disabling the cache is done by using a different address, so
all the infrastructure necessary for differentiating cacheable from
non-cacheable memory is already present. Enabling the cache on video
memory is more trouble, however. One of the reasons is that cache
mishandling produces bugs that are hard to diagnose.

> However, there's a way to substantially speed things up without that.
> The naïve implementation of grub_memset writes a byte at a time, and for
> that matter on i386 it compiles to a poorly-optimised loop rather than
> using REP STOS or similar.  grub_memset is an inner loop practically by
> definition, and it's worth optimising.  We can fix both of these
> weaknesses by importing the optimised memset from GNU libc: since it
> writes four bytes at a time except (sometimes) at the start and end, it
> should take about a quarter the number of cache stalls.  And, indeed,
> measurement bears this out: instead of taking over 600 milliseconds per
> call to grub_video_fb_create_render_target_from_pointer (I think it was
> actually 630 or so, though I neglected to write that down), GRUB now
> takes about 160 milliseconds per call.  Much better!
>
> The optimised memset is LGPLv2.1 or later, and I've preserved that
> notice, but as far as I know this should be fine for use in GRUB; it can
> be upgraded to LGPLv3, and that's just GPLv3 with some additional
> permissions.  It's already assigned to the FSF due to being in glibc.
>

It's ok to use this code but be sure to mention its origin.
It's also ok to keep its license unless a big divergence is expected.
Did you test it on x86_64?

> +void *
> +grub_memset (void *s, int c, grub_size_t n)
> +{
> +  unsigned char *p = (unsigned char *) s;
> +
> +  while (n--)
> +    *p++ = (unsigned char) c;
> +
> +  return s;
> +}
>

This can be optimised the same way as the i386 part: just replace stos
with a loop that iterates a pointer aligned on its own size.

> Thanks,
>

-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
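
To illustrate the last suggestion (a loop that stores through a pointer
aligned on its own size), a minimal sketch might look like the following.
The name sketch_memset, the choice of unsigned long as the store width,
and the GCC/Clang may_alias attribute are assumptions made for this
example only. It is neither the glibc code nor the patch under review.

```c
#include <stddef.h>
#include <stdint.h>

/* Word type used for the wide stores.  may_alias is a GCC/Clang
   extension (an assumption of this sketch) that permits storing
   through this type into memory of any other type.  */
typedef unsigned long __attribute__ ((__may_alias__)) sketch_word_t;

/* Hypothetical portable fallback: byte stores at the unaligned head and
   tail, aligned word-sized stores in between.  */
void *
sketch_memset (void *s, int c, size_t n)
{
  unsigned char *p = (unsigned char *) s;
  unsigned char byte = (unsigned char) c;
  unsigned long pattern = byte;
  size_t i;

  /* Replicate the fill byte into every byte of the word.  */
  for (i = 1; i < sizeof (sketch_word_t); i++)
    pattern = (pattern << 8) | byte;

  /* Head: single bytes until p is aligned on the word size.  */
  while (n > 0 && ((uintptr_t) p & (sizeof (sketch_word_t) - 1)) != 0)
    {
      *p++ = byte;
      n--;
    }

  /* Body: one aligned word-sized store per iteration.  */
  while (n >= sizeof (sketch_word_t))
    {
      *(sketch_word_t *) p = pattern;
      p += sizeof (sketch_word_t);
      n -= sizeof (sketch_word_t);
    }

  /* Tail: whatever is left, one byte at a time.  */
  while (n > 0)
    {
      *p++ = byte;
      n--;
    }

  return s;
}
```

With a word of W bytes, the body issues roughly n / W stores instead of
n, which is the same effect the quoted mail attributes to the four-byte
glibc version. The benefit for uncached video memory would still have to
be measured on real hardware.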