From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1YQh8p-0004bg-4H for mharc-grub-devel@gnu.org; Wed, 25 Feb 2015 13:57:03 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36081) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YQh8k-0004Wu-8c for grub-devel@gnu.org; Wed, 25 Feb 2015 13:56:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YQh8j-0002S9-1t for grub-devel@gnu.org; Wed, 25 Feb 2015 13:56:58 -0500 Received: from mail-wg0-x22f.google.com ([2a00:1450:400c:c00::22f]:43479) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YQh8i-0002S4-OO for grub-devel@gnu.org; Wed, 25 Feb 2015 13:56:57 -0500 Received: by wggy19 with SMTP id y19so5565810wgg.10 for ; Wed, 25 Feb 2015 10:56:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type; bh=E6ZDiIuf5o/akTHVM8D1quKQBgXvPr9P1QSEtS6aBDI=; b=J5QTFEmdGE5XuEgxAzrDVo7BxvsgSFI+/o45voW37QOZ6dHW4/BGk5rixqwX6oT57q CMuFD5QIu84pb7SR3ApWVWkBsN3c1adtUEYQKix6Vdl76tLWjCsDwtWaPdze2fyvLQbc Mj6AqpTGQl7G7/b3jveGwMgzyje3VVfgAQX1Ov626v7dV80pR+k8H4pROFq6lPri6B6k EBwbHtV6qZ3espcolS5irEvbdAuUvrakcMMbYPjf8p/OhPMs1nAwheKlNh5gQ5KfOW0o l4bEjs2hJxflUdF0xaZdbAJd18Amsw04XAFzanjRA17LkKA+q2dAFTZdZ7y68ezz47ez 9mcw== X-Received: by 10.194.71.175 with SMTP id w15mr8985085wju.16.1424890616064; Wed, 25 Feb 2015 10:56:56 -0800 (PST) Received: from ?IPv6:2620:0:105f:fd00:863a:4bff:fe50:abc4? ([2620:0:105f:fd00:863a:4bff:fe50:abc4]) by mx.google.com with ESMTPSA id hs7sm26341209wib.4.2015.02.25.10.56.55 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 25 Feb 2015 10:56:55 -0800 (PST) Message-ID: <54EE1AF7.50305@gmail.com> Date: Wed, 25 Feb 2015 19:56:55 +0100 From: =?UTF-8?B?VmxhZGltaXIgJ8+GLWNvZGVyL3BoY29kZXInIFNlcmJpbmVua28=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0 MIME-Version: 1.0 To: grub-devel@gnu.org Subject: Re: Image scaling performance References: <20150225154540.GR4278@bivouac.eciton.net> <20150225162352.GT4278@bivouac.eciton.net> In-Reply-To: Content-Type: multipart/mixed; boundary="------------050001020900060009020403" X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2a00:1450:400c:c00::22f X-BeenThere: grub-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: The development of GNU GRUB List-Id: The development of GNU GRUB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 18:56:59 -0000 This is a multi-part message in MIME format. --------------050001020900060009020403 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit On 25.02.2015 19:46, Michael Zimmermann wrote: > oh ok so linux's div/mod/... assembler is as slow/fast as grub's code? > Linux uses armv5>= ifdefs. Maybe we could optimized things a little :) > About scale_nn, > amarullz(https://plus.google.com/u/0/+AhmadAmarullah/about) wrote a > optimized version without divs: > loops: http://pastebin.com/MaZqWSA9 > memcpy: http://pastebin.com/iNq0V5Tw > Please try my patch (reattached here after minor fixes). The patch by anonymous source, sent by third-party through pastebin isn't acceptable from legal perspective > this code works a little faster. I'm still questioning the efficiency > math operations because on slow devices there are other bottlenecks of > the same kind(like de/compression). > > On Wed, Feb 25, 2015 at 7:41 PM, Vladimir 'phcoder' Serbinenko > wrote: >> ARMv7 doesn't mandate div instructions. It's a separate flag in features. >> GRUB supports earlier CPUs as well and we use them for testing. My only test >> machine is armv6 >> >> Le 2015-02-25 19:38, "Michael Zimmermann" a écrit >> : >> >>> Why u think the native div code would crash on most devices? I support >>> ARMv7+ only anyway. >>> >>> On Wed, Feb 25, 2015 at 5:23 PM, Leif Lindholm >>> wrote: >>>> On Wed, Feb 25, 2015 at 03:45:40PM +0000, Leif Lindholm wrote: >>>>>>>> Some technical info: >>>>>>>> ARMv7 >>>>>>>> Linaro GCC 4.9 >>>>> >>>>> I don't see any calls to any of the __aeabi helpers generated for this >>>>> file with current head. Which specific Linaro toolchain are you using? >>>>> (mine is"Linaro GCC 4.9-2014.09"). >>>> >>>> Scratch that, I do see them. Just failing to drive the tools properly. >>>> >>>> / >>>> Leif >>>> >>>> _______________________________________________ >>>> Grub-devel mailing list >>>> Grub-devel@gnu.org >>>> https://lists.gnu.org/mailman/listinfo/grub-devel >>> >>> _______________________________________________ >>> Grub-devel mailing list >>> Grub-devel@gnu.org >>> https://lists.gnu.org/mailman/listinfo/grub-devel >> >> >> _______________________________________________ >> Grub-devel mailing list >> Grub-devel@gnu.org >> https://lists.gnu.org/mailman/listinfo/grub-devel >> > > _______________________________________________ > Grub-devel mailing list > Grub-devel@gnu.org > https://lists.gnu.org/mailman/listinfo/grub-devel > --------------050001020900060009020403 Content-Type: text/x-diff; name="scale.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="scale.diff" diff --git a/grub-core/video/bitmap_scale.c b/grub-core/video/bitmap_scale.c index 0b93d02..64bacbf 100644 --- a/grub-core/video/bitmap_scale.c +++ b/grub-core/video/bitmap_scale.c @@ -366,22 +366,31 @@ scale_nn (struct grub_video_bitmap *dst, struct grub_video_bitmap *src) /* bytes_per_pixel is the same for both src and dst. */ unsigned bytes_per_pixel = dst->mode_info.bytes_per_pixel; - unsigned dy; - for (dy = 0; dy < dh; dy++) + unsigned dy, sy, ystep, yfrac, yover; + unsigned dx, sx, xstep, xfrac, xover; + ystep = sw / dw; + yover = sw % dw; + xstep = sh / dh; + xover = sh % dh; + + for (dy = 0, sy = 0, yfrac = 0; dy < dh; dy++, sy += ystep, yfrac += yover) { - unsigned dx; - for (dx = 0; dx < dw; dx++) + if (yfrac > dw) + { + yfrac -= dw; + sy++; + } + for (dx = 0, sx = 0, xfrac = 0; dx < dw; dx++, sx += xstep, xfrac += xover) { grub_uint8_t *dptr; grub_uint8_t *sptr; - unsigned sx; - unsigned sy; unsigned comp; - /* Compute the source coordinate that the destination coordinate - maps to. Note: sx/sw = dx/dw => sx = sw*dx/dw. */ - sx = sw * dx / dw; - sy = sh * dy / dh; + if (xfrac > dh) + { + xfrac -= dh; + sx++; + } /* Get the address of the pixels in src and dst. */ dptr = ddata + dy * dstride + dx * bytes_per_pixel; --------------050001020900060009020403--