From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1ML3fA-0001mD-Aa
    for qemu-devel@nongnu.org; Sun, 28 Jun 2009 19:19:56 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1ML3f5-0001lm-F7
    for qemu-devel@nongnu.org; Sun, 28 Jun 2009 19:19:55 -0400
Received: from [199.232.76.173] (port=49581 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1ML3f5-0001lj-9g
    for qemu-devel@nongnu.org; Sun, 28 Jun 2009 19:19:51 -0400
Received: from mail-ew0-f211.google.com ([209.85.219.211]:39628)
    by monty-python.gnu.org with esmtp (Exim 4.60)
    (envelope-from )
    id 1ML3f4-0003PP-Q5
    for qemu-devel@nongnu.org; Sun, 28 Jun 2009 19:19:51 -0400
Received: by ewy7 with SMTP id 7so4389619ewy.34
    for ; Sun, 28 Jun 2009 16:19:49 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <761ea48b0906281424p5966022erbcb20143c06fd6b3@mail.gmail.com>
References: <5b31733c0906281119r7ea485b6k81f8e59fd3aa4926@mail.gmail.com>
    <761ea48b0906281424p5966022erbcb20143c06fd6b3@mail.gmail.com>
Date: Mon, 29 Jun 2009 01:19:49 +0200
Message-ID: <5b31733c0906281619k6a4bbf54s46de7d07b0395b2e@mail.gmail.com>
Subject: Re: OT: TCG SSA, speed, misc (was Re: [Qemu-devel] Re: [PATCH 08/11] QMP: Port balloon command)
From: Filip Navara
Content-Type: multipart/mixed; boundary=0016364c76ff5c5054046d70cfe4
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Laurent Desnogues
Cc: Blue Swirl, Anthony Liguori, qemu-devel@nongnu.org, Avi Kivity

--0016364c76ff5c5054046d70cfe4
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On Sun, Jun 28, 2009 at 11:24 PM, Laurent Desnogues wrote:
> On Sun, Jun 28, 2009 at 8:19 PM, Filip Navara wrote:
>> Doing a profiling run on several ARM demo programs showed that most of
>> the generated code was doing load/store operations to the machine
>> registers (in CPU_env).
>> Sample run of FreeRTOS looked like this (OP counts):
>>
>> movi_i32 1603
>> ld_i32   1305
>> st_i32   1174
>> add_i32   530
>> ...
>>
>> If there could be done something that would allow the guest registers
>> to be stored in host registers, even if for a temporary amount of time
>> it would certainly help the guests that I'm dealing with.
>
> TCG does a good job for register allocation.
>
> The problem you have here is that the ARM translator
> isn't using tcg_global_mem_new_i32 for ARM registers.

Interesting, thanks for the tip. I have been trying to achieve the same
effect using tcg_global_reg_new_i32; no wonder it felt so hard. :)

> Here's an example of number of ops I see when using
> tcg_global_mem_new_i32:
>
> exit_tb   4991
> add_i32   7945
> st_i32    8257
> movi_i32 26812
> mov_i32  38369
>
> And with the trunk:
>
> exit_tb   4957
> add_i32   8165
> st_i32   20281
> ld_i32   21926
> movi_i32 25083
>
>
> Laurent
>

Attached is a proof-of-concept ARM patch that uses
tcg_global_mem_new_i32. I haven't had much time to test it yet, but on a
synthetic benchmark it improved performance by 13 DMIPS, to a total of
216 DMIPS, which is roughly a 6% improvement. On an x86 host the register
allocation still looks very pathetic; I will post a follow-up soon.
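The drop in ld_i32/st_i32 counts quoted above can be illustrated with a
toy model. This is only a sketch under stated assumptions, not QEMU or
TCG code: ToyInsn, OpCounts and count_transfers are hypothetical names,
and the policy (load a guest register at most once per block, write
dirty registers back once at the end of the block) is a simplification
of what a register allocator with memory-backed globals might do.

```c
#include <stdbool.h>

enum { NUM_GUEST_REGS = 16 };

/* Hypothetical toy guest instruction: dst = src1 op src2. */
typedef struct {
    int dst, src1, src2;
} ToyInsn;

typedef struct {
    int loads;   /* transfers from the in-memory CPU state to a host register */
    int stores;  /* transfers from a host register back to the CPU state */
} OpCounts;

/*
 * Count memory transfers for one block of toy instructions.
 *
 * cache_regs == false: every source read is a load and every
 *                      destination write is a store.
 * cache_regs == true:  a guest register is loaded at most once per
 *                      block, writes only mark it dirty, and each dirty
 *                      register is stored once at the end of the block
 *                      (an assumed write-back policy).
 */
static OpCounts count_transfers(const ToyInsn *insns, int n, bool cache_regs)
{
    bool in_host[NUM_GUEST_REGS] = { false };
    bool dirty[NUM_GUEST_REGS] = { false };
    OpCounts c = { 0, 0 };

    for (int i = 0; i < n; i++) {
        const int srcs[2] = { insns[i].src1, insns[i].src2 };

        for (int k = 0; k < 2; k++) {
            int r = srcs[k];
            if (cache_regs && in_host[r]) {
                continue;           /* value already in a host register */
            }
            c.loads++;
            in_host[r] = cache_regs;
        }

        if (cache_regs) {
            in_host[insns[i].dst] = true;
            dirty[insns[i].dst] = true;   /* defer the store */
        } else {
            c.stores++;                   /* store immediately */
        }
    }

    /* Write back each modified register once at the end of the block. */
    for (int r = 0; r < NUM_GUEST_REGS; r++) {
        if (dirty[r]) {
            c.stores++;
        }
    }
    return c;
}
```

For a three-instruction block such as r0 = r0 + r1, r0 = r0 + r2,
r3 = r0 + r1, calling count_transfers with cache_regs set to false gives
6 loads and 3 stores, while cache_regs set to true gives 3 loads and
2 stores.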

Best regards,
Filip Navara

--0016364c76ff5c5054046d70cfe4
Content-Type: text/plain; charset=US-ASCII; name="0001-First-try-at-using-tcg_global_mem_new_i32.patch.txt"
Content-Disposition: attachment; filename="0001-First-try-at-using-tcg_global_mem_new_i32.patch.txt"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_fwidrjfq0

From 4feddee0e7e02e1daab764dbbf9d694277b1e00a Mon Sep 17 00:00:00 2001
From: Filip Navara <filip.navara@gmail.com>
Date: Mon, 29 Jun 2009 01:13:42 +0200
Subject: [PATCH] First try at using tcg_global_mem_new_i32.

---
 target-arm/translate.c |   40 +++++++++++++++++++++++-----------------
 1 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 62c9eff..9a39536 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -77,6 +77,7 @@ typedef struct DisasContext {
 static TCGv_ptr cpu_env;
 /* We reuse the same 64-bit temporaries for efficiency.  */
 static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
+static TCGv_i32 cpu_R[16];
 
 /* FIXME:  These should be removed.  */
 static TCGv cpu_T[2];
@@ -86,14 +87,26 @@ static TCGv_i64 cpu_F0d, cpu_F1d;
 #define ICOUNT_TEMP cpu_T[0]
 #include "gen-icount.h"
 
+static const char *regnames[] =
+    { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+      "r8", "r9", "r10", "r11", "r12", "r13", "r14", "pc" };
+
 /* initialize TCG globals.  */
 void arm_translate_init(void)
 {
+    int i;
+
     cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
 
     cpu_T[0] = tcg_global_reg_new_i32(TCG_AREG1, "T0");
     cpu_T[1] = tcg_global_reg_new_i32(TCG_AREG2, "T1");
 
+    for (i = 0; i < 16; i++) {
+        cpu_R[i] = tcg_global_mem_new_i32(TCG_AREG0,
+                                          offsetof(CPUState, regs[i]),
+                                          regnames[i]);
+    }
+
 #define GEN_HELPER 2
 #include "helpers.h"
 }
@@ -168,7 +181,7 @@ static void load_reg_var(DisasContext *s, TCGv var, int reg)
             addr = (long)s->pc + 4;
         tcg_gen_movi_i32(var, addr);
     } else {
-        tcg_gen_ld_i32(var, cpu_env, offsetof(CPUState, regs[reg]));
+        tcg_gen_mov_i32(var, cpu_R[reg]);
     }
 }
 
@@ -188,7 +201,7 @@ static void store_reg(DisasContext *s, int reg, TCGv var)
         tcg_gen_andi_i32(var, var, ~1);
         s->is_jmp = DISAS_JUMP;
     }
-    tcg_gen_st_i32(var, cpu_env, offsetof(CPUState, regs[reg]));
+    tcg_gen_mov_i32(cpu_R[reg], var);
     dead_tmp(var);
 }
 
@@ -790,27 +803,22 @@ static inline void gen_bx_im(DisasContext *s, uint32_t addr)
     TCGv tmp;
 
     s->is_jmp = DISAS_UPDATE;
-    tmp = new_tmp();
     if (s->thumb != (addr & 1)) {
+        tmp = new_tmp();
         tcg_gen_movi_i32(tmp, addr & 1);
         tcg_gen_st_i32(tmp, cpu_env, offsetof(CPUState, thumb));
+        dead_tmp(tmp);
     }
-    tcg_gen_movi_i32(tmp, addr & ~1);
-    tcg_gen_st_i32(tmp, cpu_env, offsetof(CPUState, regs[15]));
-    dead_tmp(tmp);
+    tcg_gen_mov_i32(cpu_R[15], addr & ~1);
 }
 
 /* Set PC and Thumb state from var.  var is marked as dead.  */
 static inline void gen_bx(DisasContext *s, TCGv var)
 {
-    TCGv tmp;
-
     s->is_jmp = DISAS_UPDATE;
-    tmp = new_tmp();
-    tcg_gen_andi_i32(tmp, var, 1);
-    store_cpu_field(tmp, thumb);
-    tcg_gen_andi_i32(var, var, ~1);
-    store_cpu_field(var, regs[15]);
+    tcg_gen_andi_i32(cpu_R[15], var, ~1);
+    tcg_gen_andi_i32(var, var, 1);
+    store_cpu_field(var, thumb);
 }
 
 /* Variant of store_reg which uses branch&exchange logic when storing
@@ -889,9 +897,7 @@ static inline void gen_movl_T2_reg(DisasContext *s, int reg)
 
 static inline void gen_set_pc_im(uint32_t val)
 {
-    TCGv tmp = new_tmp();
-    tcg_gen_movi_i32(tmp, val);
-    store_cpu_field(tmp, regs[15]);
+    tcg_gen_movi_i32(cpu_R[15], val);
 }
 
 static inline void gen_movl_reg_TN(DisasContext *s, int reg, int t)
@@ -903,7 +909,7 @@ static inline void gen_movl_reg_TN(DisasContext *s, int reg, int t)
     } else {
         tmp = cpu_T[t];
     }
-    tcg_gen_st_i32(tmp, cpu_env, offsetof(CPUState, regs[reg]));
+    tcg_gen_mov_i32(cpu_R[reg], tmp);
     if (reg == 15) {
         dead_tmp(tmp);
         s->is_jmp = DISAS_JUMP;
-- 
1.6.3.msysgit.0

--0016364c76ff5c5054046d70cfe4--