From: "Vladimir 'φ-coder/phcoder' Serbinenko" <phcoder@gmail.com>
To: grub-devel@gnu.org
Subject: Re: [PATCH] Optimise memset on i386
Date: Fri, 25 Jun 2010 20:04:41 +0200 [thread overview]
Message-ID: <4C24EFB9.7000201@gmail.com> (raw)
In-Reply-To: <20100623213838.GI21862@riva.ucam.org>
[-- Attachment #1: Type: text/plain, Size: 3201 bytes --]
On 06/23/2010 11:38 PM, Colin Watson wrote:
> With this approach, one of the most noticeable time sinks is that
> setting a graphical video mode (I'm using the VBE backend) takes ages:
> 1.6 seconds, which is a substantial percentage of this project's total
> boot time. It turns out that most of this is spent initialising
> double-buffering: doublebuf_pageflipping_init calls
> grub_video_fb_create_render_target_from_pointer twice, and each call
> takes a little over 600 milliseconds. Now,
> grub_video_fb_create_render_target_from_pointer is basically just a big
> grub_memset to clear framebuffer memory, so this equates to under two
> frames per second. What's going on?
>
> It turns out that write caching is disabled on video memory when GRUB is
> running, so we take a cache stall on every single write, and it's
> apparently hard to enable caching without implementing MTRRs. People
> who know more about this than I do tell me that this can get
> unpleasantly CPU-specific at times, although I still hold out some hope
> that it's possible in GRUB.
>
>
On non-device memory GRUB should take advantage of cache. On MIPS
enabling/disabling cache is done by using a different address. So we
have all infrastructure necessary for differentiating
cacheable/non-cacheable is present. Enabling cache on video memory is
however more of a trouble. One of the reasons is that cache nmishandling
produces difficult bugs.
> However, there's a way to substantially speed things up without that.
> The naïve implementation of grub_memset writes a byte at a time, and for
> that matter on i386 it compiles to a poorly-optimised loop rather than
> using REP STOS or similar. grub_memset is an inner loop practically by
> definition, and it's worth optimising. We can fix both of these
> weaknesses by importing the optimised memset from GNU libc: since it
> writes four bytes at a time except (sometimes) at the start and end, it
> should take about a quarter the number of cache stalls. And, indeed,
> measurement bears this out: instead of taking over 600 milliseconds per
> call to grub_video_fb_create_render_target_from_pointer (I think it was
> actually 630 or so, though I neglected to write that down), GRUB now
> takes about 160 milliseconds per call. Much better!
>
> The optimised memset is LGPLv2.1 or later, and I've preserved that
> notice, but as far as I know this should be fine for use in GRUB; it can
> be upgraded to LGPLv3, and that's just GPLv3 with some additional
> permissions. It's already assigned to the FSF due to being in glibc.
>
>
It's ok to use this code but be sure to mention its origin. It's also ok
to keep its license unless big divergeance is to be expected.
Did you test it on x86_64?
> +void *
> +grub_memset (void *s, int c, grub_size_t n)
> +{
> + unsigned char *p = (unsigned char *) s;
> +
> + while (n--)
> + *p++ = (unsigned char) c;
> +
> + return s;
> +}
>
This can be optimised the same way as i386 part, just replace stos with
a loop over iterator with a pointer aligned on its size.
> Thanks,
>
>
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 294 bytes --]
next prev parent reply other threads:[~2010-06-25 18:05 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-06-23 21:38 [PATCH] Optimise memset on i386 Colin Watson
2010-06-25 18:04 ` Vladimir 'φ-coder/phcoder' Serbinenko [this message]
2010-06-25 18:27 ` Vladimir 'φ-coder/phcoder' Serbinenko
2010-07-23 11:14 ` Colin Watson
2010-07-23 15:56 ` richardvoigt
2010-07-23 17:34 ` Christian Franke
2010-07-23 19:48 ` richardvoigt
2010-07-23 20:09 ` richardvoigt
2010-07-24 22:40 ` Colin Watson
2010-07-28 14:21 ` Vladimir 'φ-coder/phcoder' Serbinenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C24EFB9.7000201@gmail.com \
--to=phcoder@gmail.com \
--cc=grub-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.