From: Thomas Zimmermann <tzimmermann@suse.de>
To: Sam Ravnborg <sam@ravnborg.org>
Cc: daniel@ffwll.ch, javierm@redhat.com, noralf@tronnes.org,
andriy.shevchenko@linux.intel.com, deller@gmx.de,
bernie@plugable.com, jayalk@intworks.biz,
linux-fbdev@vger.kernel.org, linux-staging@lists.linux.dev,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 2/2] fbdev: Don't sort deferred-I/O pages by default
Date: Fri, 11 Feb 2022 09:21:50 +0100 [thread overview]
Message-ID: <60c15ca3-80fa-8c72-deb8-6f355cde6c25@suse.de> (raw)
In-Reply-To: <YgWAvXF+WClk/fyk@ravnborg.org>
[-- Attachment #1.1: Type: text/plain, Size: 6445 bytes --]
Hi Sam
Am 10.02.22 um 22:16 schrieb Sam Ravnborg:
> Hi Thomas,
>
> On Thu, Feb 10, 2022 at 03:11:13PM +0100, Thomas Zimmermann wrote:
>> Fbdev's deferred I/O sorts all dirty pages by default, which incurs a
>> significant overhead. Make the sorting step optional and update the few
>> drivers that require it. Use a FIFO list by default.
>>
>> Sorting pages by memory offset for deferred I/O performs an implicit
>> bubble-sort step on the list of dirty pages. The algorithm goes through
>> the list of dirty pages and inserts each new page according to its
>> index field. Even worse, list traversal always starts at the first
>> entry. As video memory is most likely updated scanline by scanline, the
>> algorithm traverses through the complete list for each updated page.
>>
>> For example, with 1024x768x32bpp a page covers exactly one scanline.
>> Writing a single screen update from top to bottom requires updating
>> 768 pages. With an average list length of 384 entries, a screen update
>> creates (768 * 384 =) 294912 compare operation.
>>
>> Fix this by making the sorting step opt-in and update the few drivers
>> that require it. All other drivers work with unsorted page lists. Pages
>> are appended to the list. Therefore, in the common case of writing the
>> framebuffer top to bottom, pages are still sorted by offset, which may
>> have a positive effect on performance.
>>
>> Playing a video [1] in mplayer's benchmark mode shows the difference
>> (i7-4790, FullHD, simpledrm, kernel with debugging).
>>
>> mplayer -benchmark -nosound -vo fbdev ./big_buck_bunny_720p_stereo.ogg
>>
>> With sorted page lists:
>>
>> BENCHMARKs: VC: 32.960s VO: 73.068s A: 0.000s Sys: 2.413s = 108.441s
>> BENCHMARK%: VC: 30.3947% VO: 67.3802% A: 0.0000% Sys: 2.2251% = 100.0000%
>>
>> With unsorted page lists:
>>
>> BENCHMARKs: VC: 31.005s VO: 42.889s A: 0.000s Sys: 2.256s = 76.150s
>> BENCHMARK%: VC: 40.7156% VO: 56.3219% A: 0.0000% Sys: 2.9625% = 100.0000%
>>
>> VC shows the overhead of video decoding, VO shows the overhead of the
>> video output. Using unsorted page lists reduces the benchmark's run time
>> by ~32s/~25%.
> Nice!
>
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> Link: https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_stereo.ogg # [1]
>> ---
>> drivers/staging/fbtft/fbtft-core.c | 1 +
>> drivers/video/fbdev/broadsheetfb.c | 1 +
>> drivers/video/fbdev/core/fb_defio.c | 19 ++++++++++++-------
>> drivers/video/fbdev/metronomefb.c | 1 +
>> drivers/video/fbdev/udlfb.c | 1 +
>> include/linux/fb.h | 1 +
>> 6 files changed, 17 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c
>> index f2684d2d6851..4a35347b3020 100644
>> --- a/drivers/staging/fbtft/fbtft-core.c
>> +++ b/drivers/staging/fbtft/fbtft-core.c
>> @@ -654,6 +654,7 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display,
>> fbops->fb_blank = fbtft_fb_blank;
>>
>> fbdefio->delay = HZ / fps;
>> + fbdefio->sort_pagelist = true;
>> fbdefio->deferred_io = fbtft_deferred_io;
>> fb_deferred_io_init(info);
>>
>> diff --git a/drivers/video/fbdev/broadsheetfb.c b/drivers/video/fbdev/broadsheetfb.c
>> index fd66f4d4a621..b9054f658838 100644
>> --- a/drivers/video/fbdev/broadsheetfb.c
>> +++ b/drivers/video/fbdev/broadsheetfb.c
>> @@ -1059,6 +1059,7 @@ static const struct fb_ops broadsheetfb_ops = {
>>
>> static struct fb_deferred_io broadsheetfb_defio = {
>> .delay = HZ/4,
>> + .sort_pagelist = true,
>> .deferred_io = broadsheetfb_dpy_deferred_io,
>> };
>>
>> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
>> index 3727b1ca87b1..1f672cf253b2 100644
>> --- a/drivers/video/fbdev/core/fb_defio.c
>> +++ b/drivers/video/fbdev/core/fb_defio.c
>> @@ -132,15 +132,20 @@ static vm_fault_t fb_deferred_io_mkwrite(struct vm_fault *vmf)
>> if (!list_empty(&page->lru))
>> goto page_already_added;
>>
>> - /* we loop through the pagelist before adding in order
>> - to keep the pagelist sorted */
>> - list_for_each_entry(cur, &fbdefio->pagelist, lru) {
>> - if (cur->index > page->index)
>> - break;
>> + if (fbdefio->sort_pagelist) {
>> + /*
>> + * We loop through the pagelist before adding in order
>> + * to keep the pagelist sorted.
>> + */
>> + list_for_each_entry(cur, &fbdefio->pagelist, lru) {
>> + if (cur->index > page->index)
>> + break;
>> + }
>> + list_add_tail(&page->lru, &cur->lru);
>> + } else {
>> + list_add_tail(&page->lru, &fbdefio->pagelist);
>> }
> Bikeshedding - my personal style is to have the likely part first.
> This makes reading the code easier.
I'll change this a bit to leave out the else branch.
>
>
> The following drivers uses deferred io but are not listed as
> they need the page list sorted:
>
> - hecubafb
> - hyperv_fb
> - sh_mobile_lcdcfb
> - smscufx
> - ssd1307fb
> - xen-fbfront
>
> It would be nice with some info in the commit log that they do not need
> the pages sorted.
> To make the list complete include the drm stuff too.
>
> It did not jump to me why they did not need sorted pages,
> so some sort of reassurance that they have been checked would be nice.
Most drivers build a bounding rectangle around the dirty pages or simply
flush the whole screen. The only two affected DRM drivers, generic fbdev
and vmwgfx, both use the bounding rectangle. In those cases, the exact
order of the pages doesn't matter. The other drivers look at the page
index or handle pages one-by-one. I set the sort_pagelist flag for
those, even though some of them would probably work correctly without
sorting.
I'll add this information to the commit description.
Best regards
Thomas
>
> With the following addressed:
> Acked-by: Sam Ravnborg <sam@ravnborg.org>
>
> I hope someone else looks that can verify that the list of drivers
> without sort_pagelist is correct so someone knowledgeable have looked
> too.
>
> Sam
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
next prev parent reply other threads:[~2022-02-11 8:21 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-10 14:11 [PATCH 0/2] fbdev: Significantly improve performance of fbdefio mmap Thomas Zimmermann
2022-02-10 14:11 ` [PATCH 1/2] fbdev/defio: Early-out if page is already enlisted Thomas Zimmermann
2022-02-10 21:00 ` Sam Ravnborg
2022-02-11 9:12 ` Thomas Zimmermann
2022-02-10 14:11 ` [PATCH 2/2] fbdev: Don't sort deferred-I/O pages by default Thomas Zimmermann
2022-02-10 16:15 ` Andy Shevchenko
2022-02-10 21:16 ` Sam Ravnborg
2022-02-11 7:58 ` Dan Carpenter
2022-02-11 8:25 ` Thomas Zimmermann
2022-02-11 8:21 ` Thomas Zimmermann [this message]
2022-02-14 8:05 ` Geert Uytterhoeven
2022-02-14 8:28 ` Thomas Zimmermann
2022-02-14 9:05 ` Geert Uytterhoeven
2022-02-14 13:29 ` Thomas Zimmermann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=60c15ca3-80fa-8c72-deb8-6f355cde6c25@suse.de \
--to=tzimmermann@suse.de \
--cc=andriy.shevchenko@linux.intel.com \
--cc=bernie@plugable.com \
--cc=daniel@ffwll.ch \
--cc=deller@gmx.de \
--cc=dri-devel@lists.freedesktop.org \
--cc=javierm@redhat.com \
--cc=jayalk@intworks.biz \
--cc=linux-fbdev@vger.kernel.org \
--cc=linux-staging@lists.linux.dev \
--cc=noralf@tronnes.org \
--cc=sam@ravnborg.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).