From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mikulas Patocka
Date: Sun, 03 Jun 2018 14:41:12 +0000
Subject: [PATCH 19/21] udlfb: optimization - test the backing buffer
Message-Id: <20180603144225.839044928@twibright.com>
List-Id:
References: <20180603144053.875668929@twibright.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie, Bernie Thompson, Ladislav Michl
Cc: linux-fbdev@vger.kernel.org, dri-devel@lists.freedesktop.org

Currently, the udlfb driver only tests for identical bytes at the
beginning or at the end of a page and renders anything between the first
and last mismatching pixel. But pages are not the same as lines, so this
is quite suboptimal - if there is something modified at the beginning of a
page and at the end of a page, the whole page is rendered, even if most of
the page is not modified.

This patch makes it test for identical pixels at the beginning and end of
each rendering command. This patch improves identical byte detection by
41% when playing video in a window.

This patch also fixes a possible screen corruption if the user is writing
to the framebuffer while dlfb_render_hline is in progress - the pixel data
that is copied to the backbuffer with memcpy may be different from the
pixel data that is actually rendered to the hardware (because the content
of the framebuffer may change between memcpy and the rendering command).
We must make sure that we copy exactly the same pixel as the pixel that is
being rendered.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   45 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 11 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
@@ -431,7 +431,9 @@ static void dlfb_compress_hline(
 	const uint16_t *const pixel_end,
 	uint32_t *device_address_ptr,
 	uint8_t **command_buffer_ptr,
-	const uint8_t *const cmd_buffer_end)
+	const uint8_t *const cmd_buffer_end,
+	unsigned long back_buffer_offset,
+	int *ident_ptr)
 {
 	const uint16_t *pixel = *pixel_start_ptr;
 	uint32_t dev_addr  = *device_address_ptr;
@@ -444,6 +446,14 @@ static void dlfb_compress_hline(
 		const uint16_t *raw_pixel_start = NULL;
 		const uint16_t *cmd_pixel_start, *cmd_pixel_end = NULL;
 
+		if (back_buffer_offset &&
+		    *pixel == *(u16 *)((u8 *)pixel + back_buffer_offset)) {
+			pixel++;
+			dev_addr += BPP;
+			(*ident_ptr)++;
+			continue;
+		}
+
 		prefetchw((void *) cmd); /* pull in one cache line at least */
 
 		*cmd++ = 0xAF;
@@ -462,25 +472,37 @@ static void dlfb_compress_hline(
 					(unsigned long)(pixel_end - pixel),
 					(unsigned long)(cmd_buffer_end - 1 - cmd) / BPP);
 
+		if (back_buffer_offset) {
+			/* note: the framebuffer may change under us, so we must test for underflow */
+			while (cmd_pixel_end - 1 > pixel &&
+			       *(cmd_pixel_end - 1) == *(u16 *)((u8 *)(cmd_pixel_end - 1) + back_buffer_offset))
+				cmd_pixel_end--;
+		}
+
 		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
 
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
+			u16 pixel_value = *pixel;
 
-			put_unaligned_be16(*pixel, cmd);
+			put_unaligned_be16(pixel_value, cmd);
+			if (back_buffer_offset)
+				*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
 			cmd += 2;
 			pixel++;
 
 			if (unlikely((pixel < cmd_pixel_end) &&
-				     (*pixel == *repeating_pixel))) {
+				     (*pixel == pixel_value))) {
 				/* go back and fill in raw pixel count */
 				*raw_pixels_count_byte = ((repeating_pixel -
 						raw_pixel_start) + 1) & 0xFF;
 
-				while ((pixel < cmd_pixel_end)
-				       && (*pixel == *repeating_pixel)) {
-					pixel++;
-				}
+				do {
+					if (back_buffer_offset)
+						*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
+					pixel++;
+				} while ((pixel < cmd_pixel_end) &&
+					 (*pixel == pixel_value));
 
 				/* immediately after raw data is repeat byte */
 				*cmd++ = ((pixel - repeating_pixel) - 1) & 0xFF;
@@ -531,6 +553,7 @@ static int dlfb_render_hline(struct dlfb
 	struct urb *urb = *urb_ptr;
 	u8 *cmd = *urb_buf_ptr;
 	u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
+	unsigned long back_buffer_offset = 0;
 
 	line_start = (u8 *) (front + byte_offset);
 	next_pixel = line_start;
@@ -541,6 +564,8 @@ static int dlfb_render_hline(struct dlfb
 		const u8 *back_start = (u8 *) (dlfb->backing_buffer
 						+ byte_offset);
 
+		back_buffer_offset = (unsigned long)back_start - (unsigned long)line_start;
+
 		*ident_ptr += dlfb_trim_hline(back_start, &next_pixel,
 			&byte_width);
 
@@ -549,16 +574,14 @@ static int dlfb_render_hline(struct dlfb
 		dev_addr += offset;
 		back_start += offset;
 		line_start += offset;
-
-		memcpy((char *)back_start, (char *) line_start,
-		       byte_width);
 	}
 
 	while (next_pixel < line_end) {
 
 		dlfb_compress_hline((const uint16_t **) &next_pixel,
 			     (const uint16_t *) line_end, &dev_addr,
-			(u8 **) &cmd, (u8 *) cmd_end);
+			(u8 **) &cmd, (u8 *) cmd_end, back_buffer_offset,
+			ident_ptr);
 
 		if (cmd >= cmd_end) {
 			int len = cmd - (u8 *) urb->transfer_buffer;
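
For illustration only, the two ideas in this patch (skip pixels that already
match the backing buffer at both ends of a span, and read each framebuffer
pixel exactly once so that the value sent to the device and the value stored
in the backing buffer cannot differ) can be sketched in plain userspace C.
This is a simplified sketch and not the driver code: emit_changed_span(), its
parameters and the flat out[] array are hypothetical names made up for this
example, and the real driver emits RLE-compressed USB commands instead of
copying pixels into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of udlfb): copy the changed part of one
 * span of n pixels from the live framebuffer "front" into "out" and bring
 * the shadow copy "back" up to date.
 *
 * Returns the number of pixels written to out[]. *first receives the
 * index of the first emitted pixel, *ident the number of pixels skipped
 * because they already matched the shadow copy.
 */
static size_t emit_changed_span(const volatile uint16_t *front, uint16_t *back,
				size_t n, uint16_t *out,
				size_t *first, size_t *ident)
{
	size_t start = 0, end = n, count = 0, i;

	assert(front && back && out && first && ident);

	/* Skip leading pixels that already match the shadow copy. */
	while (start < end && front[start] == back[start])
		start++;

	/*
	 * Trim trailing matches. front[] may change under us, so never
	 * trim below start + 1: at least one pixel stays in the span.
	 */
	while (end > start + 1 && front[end - 1] == back[end - 1])
		end--;

	for (i = start; i < end; i++) {
		/* Read the live pixel once; use that value for both stores. */
		uint16_t pixel_value = front[i];

		out[count++] = pixel_value;
		back[i] = pixel_value;
	}

	*first = start;
	*ident = n - count;
	return count;
}
```

Because the shadow copy is updated from the same local value that is emitted,
a concurrent write to front[] can at worst leave a pixel stale until the next
pass. It cannot make the shadow copy disagree with what was actually sent.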