From mboxrd@z Thu Jan 1 00:00:00 1970 From: zhangleiqiang Subject: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 16:30:49 +0800 Message-ID: Mime-Version: 1.0 (1.0) Content-Type: multipart/mixed; boundary="===============1327411302560573387==" Return-path: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org --===============1327411302560573387== Content-Type: multipart/alternative; boundary=Apple-Mail-043A107B-8FBB-48C3-8335-0EE6E2C01279 Content-Transfer-Encoding: 7bit --Apple-Mail-043A107B-8FBB-48C3-8335-0EE6E2C01279 Content-Type: text/plain; charset=gb2312 Content-Transfer-Encoding: quoted-printable

Hi, all

    I am testing the performance of the Xen netfront/netback driver with multi-queue support. The throughput from a DomU to a remote Dom0 is 9.2 Gb/s, but the throughput from a DomU to a remote DomU is only 3.6 Gb/s, so I thought the bottleneck was the path from Dom0 to the local DomU. However, we have done some testing and found that the throughput from Dom0 to the local DomU is 5.8 Gb/s.

    And if we send packets from one DomU to 3 other DomUs on a different host simultaneously, the total throughput can reach 9 Gb/s. It seems like the bottleneck is the receiver?

    After some analysis, I found that even though the max_queue of netfront/netback is set to 4, there are some strange results:
    1. In DomU, only one rx queue deals with softirq.
    2. In Dom0, only two netback queue processes are scheduled; the other two are not scheduled.

    Are there any issues in my test? In theory, can we achieve 9~10 Gb/s between DomUs on different hosts using netfront/netback?

    The testing environment details are as follows:
    1. Hardware
        a. CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 2 CPUs, 6 cores, with Hyper-Threading enabled
        b. NIC: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
    2. Software:
        a. HostOS: SLES 12 (Kernel 3.16-7, git commit d0335e4feea0d3f7a8af3116c5dc166239da7521)
        b. NIC Driver: IXGBE 3.21.2
        c. OVS: 2.1.3
        d. MTU: 1600
        e. Dom0: 6U6G
        f. queue number: 4
        g. xen 4.4
        h. DomU: 4U4G
    3. Networking Environment:
        a. All network flows are transmitted/received through OVS
        b. Sender server and receiver server are connected directly between 10GE NICs
    4. Testing Tools:
        a. Sender: netperf
        b. Receiver: netserver

----------
zhangleiqiang (Trump)

Best Regards

--Apple-Mail-043A107B-8FBB-48C3-8335-0EE6E2C01279 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable
= --Apple-Mail-043A107B-8FBB-48C3-8335-0EE6E2C01279-- --===============1327411302560573387== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============1327411302560573387==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Vrabel Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 10:57:04 +0000 Message-ID: <547D9B00.2090506@citrix.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: zhangleiqiang , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 02/12/14 08:30, zhangleiqiang wrote: > Hi, all > I am testing the performance of xen netfront-netback driver that > with multi-queues support. The throughput from domU to remote dom0 is > 9.2Gb/s, but the throughput from domU to remote domU is only 3.6Gb/s, I > think the bottleneck is the throughput from dom0 to local domU. However, > we have done some testing and found the throughput from dom0 to local > domU is 5.8Gb/s. > And if we send packets from one DomU to other 3 DomUs on different > host simultaneously, the sum of throughout can reach 9Gbps. It seems > like the bottleneck is the receiver? > After some analysis, I found that even the max_queue of > netfront/back is set to 4, there are some strange results as follows: > 1. In domU, only one rx queue deal with softirq > 2. In dom0, only two netback queues process are scheduled, other two > process aren't scheduled. Multiqueue only has benefits if you have multiple flows since the source/destination addresses are hashed to a queue number. 
This probably explains why only some of the queues are being used in your test.

David

From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 11:01:33 +0000 Message-ID: <20141202110133.GA5768@zion.uk.xensource.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Return-path: Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: zhangleiqiang Cc: wei.liu2@citrix.com, xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org

On Tue, Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote:
> Hi, all
>     I am testing the performance of xen netfront-netback driver that with multi-queues support. The throughput from domU to remote dom0 is 9.2Gb/s, but the throughput from domU to remote domU is only 3.6Gb/s, I think the bottleneck is the throughput from dom0 to local domU. However, we have done some testing and found the throughput from dom0 to local domU is 5.8Gb/s.
>     And if we send packets from one DomU to other 3 DomUs on different host simultaneously, the sum of throughout can reach 9Gbps. It seems like the bottleneck is the receiver?
>     After some analysis, I found that even the max_queue of netfront/back is set to 4, there are some strange results as follows:
>     1. In domU, only one rx queue deal with softirq

Try to bind irq to different vcpus?

>     2. In dom0, only two netback queues process are scheduled, other two process aren't scheduled.

How many Dom0 vcpu do you have? If it only has two then there will only
be two processes running at a time.

>
>     Are there any issues in my test? In theory, can we achieve 9~10Gb/s between DomUs on different hosts using netfront/netback?
>
>      The testing environment details are as follows:
>    1. Hardware
>        a. CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 2 CPU 6 Cores with Hyper Thread enabled
>        b. NIC: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>    2. Sofware:
>        a. HostOS: SLES 12 (Kernel 3.16-7,git commit d0335e4feea0d3f7a8af3116c5dc166239da7521 )

And this is a SuSE kernel?

>        b. NIC Driver: IXGBE 3.21.2
>        c. OVS: 2.1.3
>        d. MTU: 1600
>        e. Dom0：6U6G
>        f. queue number: 4
>        g. xen 4.4
>        h. DomU: 4U4G
>    3. Networking Environment:
>        a. All network flows are transmit/receive through OVS
>        b. Sender server and receiver server are connected directly between 10GE NIC
>    4. Testing Tools:
>        a. Sender: netperf
>        b. Receiver: netserver
>
>
> ----------
> zhangleiqiang (Trump)
> Best Regards

From mboxrd@z Thu Jan 1 00:00:00
1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 11:53:35 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE23931FBB@SZXEMA512-MBX.china.huawei.com> References: <547D9B00.2090506@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <547D9B00.2090506@citrix.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: David Vrabel , zhangleiqiang , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of David Vrabel
> Sent: Tuesday, December 02, 2014 6:57 PM
> To: zhangleiqiang; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
>
> On 02/12/14 08:30, zhangleiqiang wrote:
> > Hi, all
> >     I am testing the performance of xen netfront-netback driver that
> > with multi-queues support. The throughput from domU to remote dom0 is
> > 9.2Gb/s, but the throughput from domU to remote domU is only 3.6Gb/s,
> > I think the bottleneck is the throughput from dom0 to local domU.
> > However, we have done some testing and found the throughput from dom0
> > to local domU is 5.8Gb/s.
> >     And if we send packets from one DomU to other 3 DomUs on different
> > host simultaneously, the sum of throughout can reach 9Gbps. It seems
> > like the bottleneck is the receiver?
> >     After some analysis, I found that even the max_queue of
> > netfront/back is set to 4, there are some strange results as follows:
> >     1. In domU, only one rx queue deal with softirq
> >     2. In dom0, only two netback queues process are scheduled, other
> > two process aren't scheduled.
>
> Multiqueue only has benefits if you have multiple flows since the
> source/destination addresses are hashed to a queue number.  This probably
> explains why only some of the queues are being used in your test.

Is the hash method you mentioned used to select the netback process or the netfront rx queue?

There are indeed 4 netback processes running in Dom0: only one DomU is running on this host, and the max_queue param of the netback kernel module is set to 4. The phenomenon is that only 2 of these four netback processes were running with about 70% CPU usage, and the other two use little CPU.

> David

From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 11:50:59 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Return-path: In-Reply-To: <20141202110133.GA5768@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , zhangleiqiang Cc: "Xiaoding (B)" , Zhuangyuxin , "Luohao (brian)" , "Yuzhou (C)" , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei Liu
> Sent: Tuesday, December 02, 2014 7:02 PM
> To: zhangleiqiang
> Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
>
> On Tue, Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote:
> > Hi, all
> >     I am testing the performance of xen netfront-netback driver that with multi-queues support. The throughput from domU to remote dom0 is 9.2Gb/s, but the throughput from domU to remote domU is only 3.6Gb/s, I think the bottleneck is the throughput from dom0 to local domU. However, we have done some testing and found the throughput from dom0 to local domU is 5.8Gb/s.
> >     And if we send packets from one DomU to other 3 DomUs on different host simultaneously, the sum of throughout can reach 9Gbps. It seems like the bottleneck is the receiver?
> >     After some analysis, I found that even the max_queue of netfront/back is set to 4, there are some strange results as follows:
> >     1. In domU, only one rx queue deal with softirq
>
> Try to bind irq to different vcpus?

Do you mean we should try to bind the irqs to different vcpus in DomU? I will try it now.

>
> >     2. In dom0, only two netback queues process are scheduled, other two process aren't scheduled.
>
> How many Dom0 vcpu do you have? If it only has two then there will only be
> two processes running at a time.

Dom0 has 6 vcpus and 6G memory. There is only one DomU running on this host, so four netback processes are running in Dom0 (because the max_queue param of the netback kernel module is set to 4).
The phenomenon is that only 2 of these four netback processes were running with about 70% CPU usage, and the other two use little CPU.
Is there a hash algorithm that determines which netback process handles an incoming packet?

> >
> >     Are there any issues in my test? In theory, can we achieve 9~10Gb/s between DomUs on different hosts using netfront/netback?
> >
> >      The testing environment details are as follows:
> >    1. Hardware
> >        a. CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 2 CPU 6 Cores with Hyper Thread enabled
> >        b. NIC: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
> >    2. Sofware:
> >        a. HostOS: SLES 12 (Kernel 3.16-7,git commit d0335e4feea0d3f7a8af3116c5dc166239da7521 )
>
> And this is a SuSE kernel?

No, I just compiled the Dom0 and DomU kernels using the 3.16-7 tag from kernel.org.

> >        b. NIC Driver: IXGBE 3.21.2
> >        c. OVS: 2.1.3
> >        d. MTU: 1600
> >        e. Dom0：6U6G
> >        f. queue number: 4
> >        g. xen 4.4
> >        h. DomU: 4U4G
> >    3. Networking Environment:
> >        a. All network flows are transmit/receive through OVS
> >        b. Sender server and receiver server are connected directly between 10GE NIC
> >    4. Testing Tools:
> >        a. Sender: netperf
> >        b. Receiver: netserver
> >
> >
> > ----------
> > zhangleiqiang (Trump)
> > Best Regards

From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 12:11:51 +0000 Message-ID: <20141202121151.GD5768@zion.uk.xensource.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org

On Tue, Dec 02, 2014 at 11:50:59AM +0000, Zhangleiqiang (Trump) wrote:
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org
> > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei Liu
> > Sent: Tuesday, December 02, 2014 7:02 PM
> > To: zhangleiqiang
> > Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] Poor network performance between DomU with
> > multiqueue support
> >
> > On Tue,
Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote: > > > Hi, all > > > I am testing the performance of xen netfront-netback driver that with > > multi-queues support. The throughput from domU to remote dom0 is 9.2Gb/s, > > but the throughput from domU to remote domU is only 3.6Gb/s, I think the > > bottleneck is the throughput from dom0 to local domU. However, we have > > done some testing and found the throughput from dom0 to local domU is > > 5.8Gb/s. > > > And if we send packets from one DomU to other 3 DomUs on different > > host simultaneously, the sum of throughout can reach 9Gbps. It seems like the > > bottleneck is the receiver? > > > After some analysis, I found that even the max_queue of netfront/back > > is set to 4, there are some strange results as follows: > > > 1. In domU, only one rx queue deal with softirq > > > > Try to bind irq to different vcpus? > > Do you mean we try to bind irq to different vcpus in DomU? I will try it now. > Yes. Given the fact that you have two backend threads running while only one DomU vcpu is busy, it smells like misconfiguration in DomU. If this phenomenon persists after correctly binding irqs, you might want to check traffic is steering correctly to different queues. > > > > > 2. In dom0, only two netback queues process are scheduled, other two > > process aren't scheduled. > > > > How many Dom0 vcpu do you have? If it only has two then there will only be > > two processes running at a time. > > Dom0 has 6 vcpus, and 6G memory. There are only one DomU running in Dom0 and so four netback processes are running in Dom0 (because the max_queue param of netback kernel module is set to 4). > The phenomenon is that only 2 of these four netback process were running with about 70% cpu usage, and another two use little CPU. > Is there a hash algorithm to determine which netback process to handle the input packet? > I think that's whatever default algorithm Linux kernel is using. We don't currently support other algorithms. 
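
As a rough illustration of the flow hashing discussed in this thread, the sketch below maps flows to queues with a toy hash. It is only a sketch under stated assumptions: `pick_queue` and `queues_used` are hypothetical helpers, the addresses and ports are made up, and CRC32 merely stands in for whatever hash the kernel actually uses. The point it shows is that every packet of one flow lands on the same queue, so a test with few flows can never keep more queues busy than it has flows.

```python
import zlib


def pick_queue(src_ip, dst_ip, src_port, dst_port, num_queues):
    """Toy flow hash (NOT the kernel's algorithm): flow tuple -> queue index."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def queues_used(flows, num_queues):
    """Return the sorted set of queue indexes hit by the given flows."""
    return sorted({pick_queue(*flow, num_queues) for flow in flows})


# Hypothetical flows: three senders talking to one receiver.
flows = [("10.0.0.%d" % i, "10.0.0.9", 40000, 12865) for i in (1, 2, 3)]

# One flow repeated many times still occupies exactly one queue.
print(queues_used([flows[0]] * 1000, 4))
# Three flows can occupy at most three of the four queues, possibly fewer
# if two of them collide on the same queue.
print(queues_used(flows, 4))
```

Under this model, seeing only two busy queues with three senders would be consistent with a hash collision rather than a broken configuration, although the sketch by itself cannot confirm that this is what happened here.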
Wei.

From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 14:46:36 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141202121151.GD5768@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu Cc: "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org

Thanks for your reply, Wei.

I ran the following test just now, with these results:

Three DomUs (4U4G) are running on Host A (6U6G) and one DomU (4U4G) is running on Host B (6U6G). I send packets from the three DomUs to the DomU on Host B simultaneously.

1. The "top" output of Host B is as follows:

top - 09:42:11 up  1:07,  2 users,  load average: 2.46, 1.90, 1.47
Tasks: 173 total,   4 running, 169 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.8 si,  1.9 st
%Cpu1  :  0.0 us, 27.0 sy,  0.0 ni, 63.1 id,  0.0 wa,  0.0 hi,  9.5 si,  0.4 st
%Cpu2  :  0.0 us, 90.0 sy,  0.0 ni,  8.3 id,  0.0 wa,  0.0 hi,  1.7 si,  0.0 st
%Cpu3  :  0.4 us,  1.4 sy,  0.0 ni, 95.4 id,  0.0 wa,  0.0 hi,  1.4 si,  1.4 st
%Cpu4  :  0.0 us, 60.2 sy,  0.0 ni, 39.5 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu5  :  0.0 us,  2.8 sy,  0.0 ni, 89.4 id,  0.0 wa,  0.0 hi,  6.9 si,  0.9 st
KiB Mem:   4517144 total,  3116480 used,  1400664 free,      876 buffers
KiB Swap:  2103292 total,        0 used,  2103292 free.  2374656 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S   %CPU  %MEM     TIME+ COMMAND
 7440 root      20   0       0      0      0 R  71.10 0.000   8:15.38 vif4.0-q3-guest
 7434 root      20   0       0      0      0 R  59.14 0.000   9:00.58 vif4.0-q0-guest
   18 root      20   0       0      0      0 R  33.89 0.000   2:35.06 ksoftirqd/2
   28 root      20   0       0      0      0 S  20.93 0.000   3:01.81 ksoftirqd/4

As shown above, only two netback-related processes (vif4.0-*) are running with high CPU usage, and the other 2 netback processes are idle. The "ps" result for the vif4.0-* processes is as follows:

root      7434 50.5  0.0      0     0 ?        R    09:23  11:29 [vif4.0-q0-guest]
root      7435  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q0-deall]
root      7436  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q1-guest]
root      7437  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q1-deall]
root      7438  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q2-guest]
root      7439  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q2-deall]
root      7440 48.1  0.0      0     0 ?        R    09:23  10:55 [vif4.0-q3-guest]
root      7441  0.0  0.0      0     0 ?        S    09:23   0:00 [vif4.0-q3-deall]
root      9724  0.0  0.0   9244  1520 pts/0    S+   09:46   0:00 grep --color=auto

2. The "rx"-related content of /proc/interrupts in the receiver DomU (on Host B):

 73:          2          0    2925405          0   xen-dyn-event     eth0-q0-rx
 75:         43         93          0        118   xen-dyn-event     eth0-q1-rx
 77:          2       3376         14       1983   xen-dyn-event     eth0-q2-rx
 79:    2414666          0          9          0   xen-dyn-event     eth0-q3-rx

As shown above, it seems that only q0 and q3 handle the interrupts triggered by packet receiving.

Any advice? Thanks.
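
The /proc/interrupts excerpt above can be summarised per queue with a short script. This is a minimal sketch, assuming the whitespace-separated layout shown in the excerpt (IRQ number, one counter per vCPU, interrupt type, then a name ending in "-rx"); `rx_queue_totals` is a hypothetical helper, and `SAMPLE` simply repeats the four rows posted in this message.

```python
def rx_queue_totals(interrupts_text):
    """Sum the per-vCPU interrupt counters of every '*-rx' line."""
    totals = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip blank lines, headers and anything that is not an rx queue.
        if len(fields) < 3 or not fields[-1].endswith("-rx"):
            continue
        # fields[0] is the IRQ number ("73:"); purely numeric fields that
        # follow are the per-vCPU counters.
        counts = [int(f) for f in fields[1:] if f.isdigit()]
        totals[fields[-1]] = sum(counts)
    return totals


# The rx rows quoted in this message (receiver DomU on Host B).
SAMPLE = """\
 73:  2        0     2925405  0     xen-dyn-event  eth0-q0-rx
 75:  43       93    0        118   xen-dyn-event  eth0-q1-rx
 77:  2        3376  14       1983  xen-dyn-event  eth0-q2-rx
 79:  2414666  0     9        0     xen-dyn-event  eth0-q3-rx
"""

totals = rx_queue_totals(SAMPLE)
grand_total = sum(totals.values())
for name, count in totals.items():
    print(f"{name}: {count} interrupts, {100.0 * count / grand_total:.2f}% of rx total")
```

On a live guest the same function could be fed the contents of /proc/interrupts directly; with the rows above, q0 and q3 together account for more than 99% of the rx interrupts.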
---------- zhangleiqiang (Trump) Best Regards > -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: Tuesday, December 02, 2014 8:12 PM > To: Zhangleiqiang (Trump) > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); Xiaoding > (B); Yuzhou (C); Zhuangyuxin > Subject: Re: [Xen-devel] Poor network performance between DomU with > multiqueue support > > On Tue, Dec 02, 2014 at 11:50:59AM +0000, Zhangleiqiang (Trump) wrote: > > > -----Original Message----- > > > From: xen-devel-bounces@lists.xen.org > > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei Liu > > > Sent: Tuesday, December 02, 2014 7:02 PM > > > To: zhangleiqiang > > > Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org > > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > > multiqueue support > > > > > > On Tue, Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote: > > > > Hi, all > > > > I am testing the performance of xen netfront-netback driver > > > > that with > > > multi-queues support. The throughput from domU to remote dom0 is > > > 9.2Gb/s, but the throughput from domU to remote domU is only > > > 3.6Gb/s, I think the bottleneck is the throughput from dom0 to local > > > domU. However, we have done some testing and found the throughput > > > from dom0 to local domU is 5.8Gb/s. > > > > And if we send packets from one DomU to other 3 DomUs on > > > > different > > > host simultaneously, the sum of throughout can reach 9Gbps. It seems > > > like the bottleneck is the receiver? > > > > After some analysis, I found that even the max_queue of > > > > netfront/back > > > is set to 4, there are some strange results as follows: > > > > 1. In domU, only one rx queue deal with softirq > > > > > > Try to bind irq to different vcpus? > > > > Do you mean we try to bind irq to different vcpus in DomU? I will try it now. > > > > Yes. 
Given the fact that you have two backend threads running while only one > DomU vcpu is busy, it smells like misconfiguration in DomU. > > If this phenomenon persists after correctly binding irqs, you might want to > check traffic is steering correctly to different queues. > > > > > > > > 2. In dom0, only two netback queues process are scheduled, > > > > other two > > > process aren't scheduled. > > > > > > How many Dom0 vcpu do you have? If it only has two then there will > > > only be two processes running at a time. > > > > Dom0 has 6 vcpus, and 6G memory. There are only one DomU running in > Dom0 and so four netback processes are running in Dom0 (because the > max_queue param of netback kernel module is set to 4). > > The phenomenon is that only 2 of these four netback process were running > with about 70% cpu usage, and another two use little CPU. > > Is there a hash algorithm to determine which netback process to handle the > input packet? > > > > I think that's whatever default algorithm Linux kernel is using. > > We don't currently support other algorithms. > > Wei. 
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 2 Dec 2014 15:58:32 +0000 Message-ID: <20141202155832.GH5768@zion.uk.xensource.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote: > Thanks for your reply, Wei. > > I do the following testing just now and found the results as follows: > > There are three DomUs (4U4G) are running on Host A (6U6G) and one DomU (4U4G) is running on Host B (6U6G), I send packets from three DomUs to the DomU on Host B simultaneously. > > 1. 
The "top" output of Host B as follows: > > top - 09:42:11 up 1:07, 2 users, load average: 2.46, 1.90, 1.47 > Tasks: 173 total, 4 running, 169 sleeping, 0 stopped, 0 zombie > %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.8 si, 1.9 st > %Cpu1 : 0.0 us, 27.0 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, 9.5 si, 0.4 st > %Cpu2 : 0.0 us, 90.0 sy, 0.0 ni, 8.3 id, 0.0 wa, 0.0 hi, 1.7 si, 0.0 st > %Cpu3 : 0.4 us, 1.4 sy, 0.0 ni, 95.4 id, 0.0 wa, 0.0 hi, 1.4 si, 1.4 st > %Cpu4 : 0.0 us, 60.2 sy, 0.0 ni, 39.5 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st > %Cpu5 : 0.0 us, 2.8 sy, 0.0 ni, 89.4 id, 0.0 wa, 0.0 hi, 6.9 si, 0.9 st > KiB Mem: 4517144 total, 3116480 used, 1400664 free, 876 buffers > KiB Swap: 2103292 total, 0 used, 2103292 free. 2374656 cached Mem > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 7440 root 20 0 0 0 0 R 71.10 0.000 8:15.38 vif4.0-q3-guest > 7434 root 20 0 0 0 0 R 59.14 0.000 9:00.58 vif4.0-q0-guest > 18 root 20 0 0 0 0 R 33.89 0.000 2:35.06 ksoftirqd/2 > 28 root 20 0 0 0 0 S 20.93 0.000 3:01.81 ksoftirqd/4 > > > As shown above, only two netback related processes (vif4.0-*) are running with high cpu usage, and the other 2 netback processes are idle. The "ps" result of vif4.0-* processes as follows: > > root 7434 50.5 0.0 0 0 ? R 09:23 11:29 [vif4.0-q0-guest] > root 7435 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q0-deall] > root 7436 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q1-guest] > root 7437 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q1-deall] > root 7438 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q2-guest] > root 7439 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q2-deall] > root 7440 48.1 0.0 0 0 ? R 09:23 10:55 [vif4.0-q3-guest] > root 7441 0.0 0.0 0 0 ? S 09:23 0:00 [vif4.0-q3-deall] > root 9724 0.0 0.0 9244 1520 pts/0 S+ 09:46 0:00 grep --color=auto > > > 2. 
The "rx" related content in /proc/interupts in receiver DomU (on Host B): > > 73: 2 0 2925405 0 xen-dyn-event eth0-q0-rx > 75: 43 93 0 118 xen-dyn-event eth0-q1-rx > 77: 2 3376 14 1983 xen-dyn-event eth0-q2-rx > 79: 2414666 0 9 0 xen-dyn-event eth0-q3-rx > > As shown above, it seems like that only q0 and q3 handles the interrupt triggered by packet receving. > > Any advise? Thanks. Netback selects queue based on the return value of skb_get_queue_mapping. The queue mapping is set by core driver or ndo_select_queue (if specified by individual driver). In this case netback doesn't have its implementation of ndo_select_queue, so it's up to core driver to decide which queue to dispatch the packet to. I think you need to inspect why Dom0 only steers traffic to these two queues but not all of them. Don't know which utility is handy for this job. Probably tc(8) is useful? Wei. > ---------- > zhangleiqiang (Trump) > > Best Regards > > > > -----Original Message----- > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > Sent: Tuesday, December 02, 2014 8:12 PM > > To: Zhangleiqiang (Trump) > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); Xiaoding > > (B); Yuzhou (C); Zhuangyuxin > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > multiqueue support > > > > On Tue, Dec 02, 2014 at 11:50:59AM +0000, Zhangleiqiang (Trump) wrote: > > > > -----Original Message----- > > > > From: xen-devel-bounces@lists.xen.org > > > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei Liu > > > > Sent: Tuesday, December 02, 2014 7:02 PM > > > > To: zhangleiqiang > > > > Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org > > > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > > > multiqueue support > > > > > > > > On Tue, Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote: > > > > > Hi, all > > > > > I am testing the performance of xen netfront-netback driver > > > > > that with > > > > multi-queues support. 
The throughput from domU to remote dom0 is > > > > 9.2Gb/s, but the throughput from domU to remote domU is only > > > > 3.6Gb/s, I think the bottleneck is the throughput from dom0 to local > > > > domU. However, we have done some testing and found the throughput > > > > from dom0 to local domU is 5.8Gb/s. > > > > > And if we send packets from one DomU to other 3 DomUs on > > > > > different > > > > host simultaneously, the sum of throughout can reach 9Gbps. It seems > > > > like the bottleneck is the receiver? > > > > > After some analysis, I found that even the max_queue of > > > > > netfront/back > > > > is set to 4, there are some strange results as follows: > > > > > 1. In domU, only one rx queue deal with softirq > > > > > > > > Try to bind irq to different vcpus? > > > > > > Do you mean we try to bind irq to different vcpus in DomU? I will try it now. > > > > > > > Yes. Given the fact that you have two backend threads running while only one > > DomU vcpu is busy, it smells like misconfiguration in DomU. > > > > If this phenomenon persists after correctly binding irqs, you might want to > > check traffic is steering correctly to different queues. > > > > > > > > > > > 2. In dom0, only two netback queues process are scheduled, > > > > > other two > > > > process aren't scheduled. > > > > > > > > How many Dom0 vcpu do you have? If it only has two then there will > > > > only be two processes running at a time. > > > > > > Dom0 has 6 vcpus, and 6G memory. There are only one DomU running in > > Dom0 and so four netback processes are running in Dom0 (because the > > max_queue param of netback kernel module is set to 4). > > > The phenomenon is that only 2 of these four netback process were running > > with about 70% cpu usage, and another two use little CPU. > > > Is there a hash algorithm to determine which netback process to handle the > > input packet? > > > > > > > I think that's whatever default algorithm Linux kernel is using. 
> > > > We don't currently support other algorithms. > > > > Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Zoltan Kiss Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 02 Dec 2014 17:25:22 +0000 Message-ID: <547DF602.2040704@linaro.org> References: <547D9B00.2090506@citrix.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FBB@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE23931FBB@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" , David Vrabel , zhangleiqiang , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org On 02/12/14 11:53, Zhangleiqiang (Trump) wrote: >> -----Original Message----- >> From: xen-devel-bounces@lists.xen.org >> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of David Vrabel >> Sent: Tuesday, December 02, 2014 6:57 PM >> To: zhangleiqiang; xen-devel@lists.xen.org >> Subject: Re: [Xen-devel] Poor network performance between DomU with >> multiqueue support >> >> On 02/12/14 08:30, zhangleiqiang wrote: >>> Hi, all >>> I am testing the performance of xen netfront-netback driver that >>> with multi-queues support. The throughput from domU to remote dom0 is >>> 9.2Gb/s, but the throughput from domU to remote domU is only 3.6Gb/s, >>> I think the bottleneck is the throughput from dom0 to local domU. >>> However, we have done some testing and found the throughput from dom0 >>> to local domU is 5.8Gb/s. >>> And if we send packets from one DomU to other 3 DomUs on different >>> host simultaneously, the sum of throughout can reach 9Gbps. It seems >>> like the bottleneck is the receiver? 
>>> After some analysis, I found that even the max_queue of >>> netfront/back is set to 4, there are some strange results as follows: >>> 1. In domU, only one rx queue deal with softirq >>> 2. In dom0, only two netback queues process are scheduled, other >>> two process aren't scheduled. >> >> Multiqueue only has benefits if you have multiple flows since the >> source/destination addresses are hashed to a queue number. This probably >> explains why only some of the queues are being used in your test. > > The hash method you mentioned is used for selection of netback process or netfront rx queue? It's used in both direction to select the queue. > Indeed, there are 4 netback processes running in Dom0, because there are only one DomU running in Dom0 and so four netback processes are running in Dom0 (the max_queue param of netback kernel module is set to 4). > The phenomenon is that only 2 of these four netback process were running with about 70% cpu usage, and another two use little CPU. > >> David >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Wed, 3 Dec 2014 14:43:37 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: 
<20141202155832.GH5768@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , zhangleiqiang , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org > -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: Tuesday, December 02, 2014 11:59 PM > To: Zhangleiqiang (Trump) > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); Xiaoding > (B); Yuzhou (C); Zhuangyuxin > Subject: Re: [Xen-devel] Poor network performance between DomU with > multiqueue support > > On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote: > > Thanks for your reply, Wei. > > > > I do the following testing just now and found the results as follows: > > > > There are three DomUs (4U4G) are running on Host A (6U6G) and one DomU > (4U4G) is running on Host B (6U6G), I send packets from three DomUs to the > DomU on Host B simultaneously. > > > > 1. The "top" output of Host B as follows: > > > > top - 09:42:11 up 1:07, 2 users, load average: 2.46, 1.90, 1.47 > > Tasks: 173 total, 4 running, 169 sleeping, 0 stopped, 0 zombie > > %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.8 > > si, 1.9 st > > %Cpu1 : 0.0 us, 27.0 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, 9.5 > > si, 0.4 st > > %Cpu2 : 0.0 us, 90.0 sy, 0.0 ni, 8.3 id, 0.0 wa, 0.0 hi, 1.7 > > si, 0.0 st > > %Cpu3 : 0.4 us, 1.4 sy, 0.0 ni, 95.4 id, 0.0 wa, 0.0 hi, 1.4 > > si, 1.4 st > > %Cpu4 : 0.0 us, 60.2 sy, 0.0 ni, 39.5 id, 0.0 wa, 0.0 hi, 0.3 > > si, 0.0 st > > %Cpu5 : 0.0 us, 2.8 sy, 0.0 ni, 89.4 id, 0.0 wa, 0.0 hi, 6.9 si, 0.9 > st > > KiB Mem: 4517144 total, 3116480 used, 1400664 free, 876 > buffers > > KiB Swap: 2103292 total, 0 used, 2103292 free. 
2374656 > cached Mem > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM > TIME+ COMMAND > > 7440 root 20 0 0 0 0 R 71.10 0.000 > 8:15.38 vif4.0-q3-guest > > 7434 root 20 0 0 0 0 R 59.14 0.000 > 9:00.58 vif4.0-q0-guest > > 18 root 20 0 0 0 0 R 33.89 0.000 > 2:35.06 ksoftirqd/2 > > 28 root 20 0 0 0 0 S 20.93 0.000 > 3:01.81 ksoftirqd/4 > > > > > > As shown above, only two netback related processes (vif4.0-*) are running > with high cpu usage, and the other 2 netback processes are idle. The "ps" > result of vif4.0-* processes as follows: > > > > root 7434 50.5 0.0 0 0 ? R 09:23 11:29 > [vif4.0-q0-guest] > > root 7435 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q0-deall] > > root 7436 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q1-guest] > > root 7437 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q1-deall] > > root 7438 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q2-guest] > > root 7439 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q2-deall] > > root 7440 48.1 0.0 0 0 ? R 09:23 10:55 > [vif4.0-q3-guest] > > root 7441 0.0 0.0 0 0 ? S 09:23 0:00 > [vif4.0-q3-deall] > > root 9724 0.0 0.0 9244 1520 pts/0 S+ 09:46 0:00 > grep --color=auto > > > > > > 2. The "rx" related content in /proc/interupts in receiver DomU (on Host B): > > > > 73: 2 0 2925405 0 xen-dyn-event > eth0-q0-rx > > 75: 43 93 0 118 xen-dyn-event > eth0-q1-rx > > 77: 2 3376 14 1983 xen-dyn-event > eth0-q2-rx > > 79: 2414666 0 9 0 xen-dyn-event > eth0-q3-rx > > > > As shown above, it seems like that only q0 and q3 handles the interrupt > triggered by packet receving. > > > > Any advise? Thanks. > > Netback selects queue based on the return value of skb_get_queue_mapping. > The queue mapping is set by core driver or ndo_select_queue (if specified by > individual driver). In this case netback doesn't have its implementation of > ndo_select_queue, so it's up to core driver to decide which queue to dispatch > the packet to. I think you need to inspect why Dom0 only steers traffic to > these two queues but not all of them. 
> > Don't know which utility is handy for this job. Probably tc(8) is useful?

Thanks Wei.

I think the reason that only two netback/netfront processes were busy in the results above is the queue selection method. I have tried sending packets from multiple hosts/VMs to a single VM, and all of the netback/netfront processes were running with high CPU usage a few times.

However, I have found another issue. Even when using 6 queues and making sure that all 6 netback processes are running with high CPU usage (indeed, each of them runs at 87% CPU usage), the total VM receive throughput is not much higher than the results with 4 queues. The results range from 4.5Gbps to 5.04Gbps using TCP with a 512-byte length, and from 4.3Gbps to 5.78Gbps using TCP with a 1460-byte length.

According to the test results on the wiki (http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing), VM receive throughput is also lower than VM transmit throughput.

I am wondering why VM receive throughput cannot reach 8-10Gbps, as VM transmit does with multi-queue. I also tried sending packets directly from the local Dom0 to the DomU; the DomU receive throughput can reach about 8-12Gbps, so I am also wondering why transmitting packets from Dom0 to a remote DomU only reaches about 4-5Gbps.

> Wei.
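To put a rough number on the queue selection effect discussed above: if each flow is hashed to one of q queues, and we assume the hash is uniform and independent per flow (an assumption for illustration, not something verified against the kernel's actual hash), the expected number of queues that receive at least one of n flows is q * (1 - (1 - 1/q)^n). A minimal sketch follows; the function name is made up for this example.

```python
def expected_busy_queues(num_flows: int, num_queues: int) -> float:
    """Expected number of queues hit by at least one flow.

    Assumes each flow is hashed uniformly and independently to a queue,
    which is a simplification of whatever hash the kernel really uses.
    """
    if num_queues <= 0:
        raise ValueError("num_queues must be positive")
    if num_flows < 0:
        raise ValueError("num_flows must not be negative")
    # Probability that one particular queue receives none of the flows.
    p_empty = (1.0 - 1.0 / num_queues) ** num_flows
    return num_queues * (1.0 - p_empty)


if __name__ == "__main__":
    # Three sending DomUs (one flow each, by assumption) and 4 queues,
    # as in the test described in this thread.
    for flows in (1, 3, 8, 16):
        busy = expected_busy_queues(flows, 4)
        print(f"{flows} flows over 4 queues: about {busy:.2f} busy queues")
```

Under these assumptions, three flows over four queues keep only a little more than two queues busy on average, so seeing two active netback threads with three senders would not be surprising. Many more concurrent flows are needed before all queues are likely to be used.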
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Thu, 4 Dec 2014 10:50:21 +0000 Message-ID: <20141204105021.GA16532@zion.uk.xensource.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Wed, Dec 03, 2014 at 02:43:37PM +0000, Zhangleiqiang (Trump) wrote: > > -----Original Message----- > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > Sent: Tuesday, December 02, 2014 11:59 PM > > To: Zhangleiqiang (Trump) > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); Xiaoding > > (B); Yuzhou (C); Zhuangyuxin > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > multiqueue support > > > > On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote: > > > Thanks for your reply, Wei. > > > > > > I do the following testing just now and found the results as follows: > > > > > > There are three DomUs (4U4G) are running on Host A (6U6G) and one DomU > > (4U4G) is running on Host B (6U6G), I send packets from three DomUs to the > > DomU on Host B simultaneously. > > > > > > 1. 
The "top" output of Host B as follows: > > > > > > top - 09:42:11 up 1:07, 2 users, load average: 2.46, 1.90, 1.47 > > > Tasks: 173 total, 4 running, 169 sleeping, 0 stopped, 0 zombie > > > %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.8 > > > si, 1.9 st > > > %Cpu1 : 0.0 us, 27.0 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, 9.5 > > > si, 0.4 st > > > %Cpu2 : 0.0 us, 90.0 sy, 0.0 ni, 8.3 id, 0.0 wa, 0.0 hi, 1.7 > > > si, 0.0 st > > > %Cpu3 : 0.4 us, 1.4 sy, 0.0 ni, 95.4 id, 0.0 wa, 0.0 hi, 1.4 > > > si, 1.4 st > > > %Cpu4 : 0.0 us, 60.2 sy, 0.0 ni, 39.5 id, 0.0 wa, 0.0 hi, 0.3 > > > si, 0.0 st > > > %Cpu5 : 0.0 us, 2.8 sy, 0.0 ni, 89.4 id, 0.0 wa, 0.0 hi, 6.9 si, 0.9 > > st > > > KiB Mem: 4517144 total, 3116480 used, 1400664 free, 876 > > buffers > > > KiB Swap: 2103292 total, 0 used, 2103292 free. 2374656 > > cached Mem > > > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM > > TIME+ COMMAND > > > 7440 root 20 0 0 0 0 R 71.10 0.000 > > 8:15.38 vif4.0-q3-guest > > > 7434 root 20 0 0 0 0 R 59.14 0.000 > > 9:00.58 vif4.0-q0-guest > > > 18 root 20 0 0 0 0 R 33.89 0.000 > > 2:35.06 ksoftirqd/2 > > > 28 root 20 0 0 0 0 S 20.93 0.000 > > 3:01.81 ksoftirqd/4 > > > > > > > > > As shown above, only two netback related processes (vif4.0-*) are running > > with high cpu usage, and the other 2 netback processes are idle. The "ps" > > result of vif4.0-* processes as follows: > > > > > > root 7434 50.5 0.0 0 0 ? R 09:23 11:29 > > [vif4.0-q0-guest] > > > root 7435 0.0 0.0 0 0 ? S 09:23 0:00 > > [vif4.0-q0-deall] > > > root 7436 0.0 0.0 0 0 ? S 09:23 0:00 > > [vif4.0-q1-guest] > > > root 7437 0.0 0.0 0 0 ? S 09:23 0:00 > > [vif4.0-q1-deall] > > > root 7438 0.0 0.0 0 0 ? S 09:23 0:00 > > [vif4.0-q2-guest] > > > root 7439 0.0 0.0 0 0 ? S 09:23 0:00 > > [vif4.0-q2-deall] > > > root 7440 48.1 0.0 0 0 ? R 09:23 10:55 > > [vif4.0-q3-guest] > > > root 7441 0.0 0.0 0 0 ? 
S 09:23 0:00 > > [vif4.0-q3-deall] > > > root 9724 0.0 0.0 9244 1520 pts/0 S+ 09:46 0:00 > > grep --color=auto > > > > > > > > > 2. The "rx" related content in /proc/interupts in receiver DomU (on Host B): > > > > > > 73: 2 0 2925405 0 xen-dyn-event > > eth0-q0-rx > > > 75: 43 93 0 118 xen-dyn-event > > eth0-q1-rx > > > 77: 2 3376 14 1983 xen-dyn-event > > eth0-q2-rx > > > 79: 2414666 0 9 0 xen-dyn-event > > eth0-q3-rx > > > > > > As shown above, it seems like that only q0 and q3 handles the interrupt > > triggered by packet receving. > > > > > > Any advise? Thanks. > > > > Netback selects queue based on the return value of skb_get_queue_mapping. > > The queue mapping is set by core driver or ndo_select_queue (if specified by > > individual driver). In this case netback doesn't have its implementation of > > ndo_select_queue, so it's up to core driver to decide which queue to dispatch > > the packet to. I think you need to inspect why Dom0 only steers traffic to > > these two queues but not all of them. > > > > Don't know which utility is handy for this job. Probably tc(8) is useful? > > Thanks Wei. > > I think the reason for the above results that only two > netback/netfront processes works hard is the queue select method. I > have tried to send packets from multiple host/vm to a vm, and all of > the netback/netfront processes are running with high cpu usage a few > times. > A few times? You might want to check some patches to rework RX stall detection by David Vrabel that went in after 3.16. > However, I find another issue. Even using 6 queues and making sure > that all of these 6 netback processes running with high cpu usage > (indeed, any of it running with 87% cpu usage), the whole VM receive > throughout is not very higher than results when using 4 queues. The > results are from 4.5Gbps to 5.04 Gbps using TCP with 512 bytes length > and 4.3Gbps to 5.78Gbps using TCP with 1460 bytes length. > I would like to ask if you're still using 4U4G (4 CPU 4 G?) 
configuration? If so, please make sure there are at least the same number of vcpus as queues. > According to the testing result from WIKI: > http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing, > The VM receive throughput is also more lower than VM transmit. > I think that's expected, because guest RX data path still uses grant_copy while guest TX uses grant_map to do zero-copy transmit. > I am wondering why the VM receive throughout cannot be up to 8-10Gbps > as VM transmit under multi-queue? I also tried to send packets > directly from Local Dom0 to DomU, the DomU receive throughput can > reach about 8-12Gbps, so I am also wondering why transmitting packets > from Dom0 to Remote DomU can only reach about 4-5Gbps throughout? If data is from Dom0 to DomU then SKB is probably not fragmented by network stack. You can use tcpdump to check that. Wei. > > > Wei. > > > > > ---------- > > > zhangleiqiang (Trump) > > > > > > Best Regards > > > > > > > > > > -----Original Message----- > > > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > > > Sent: Tuesday, December 02, 2014 8:12 PM > > > > To: Zhangleiqiang (Trump) > > > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); > > > > Xiaoding (B); Yuzhou (C); Zhuangyuxin > > > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > > > multiqueue support > > > > > > > > On Tue, Dec 02, 2014 at 11:50:59AM +0000, Zhangleiqiang (Trump) wrote: > > > > > > -----Original Message----- > > > > > > From: xen-devel-bounces@lists.xen.org > > > > > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei Liu > > > > > > Sent: Tuesday, December 02, 2014 7:02 PM > > > > > > To: zhangleiqiang > > > > > > Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org > > > > > > Subject: Re: [Xen-devel] Poor network performance between DomU > > > > > > with multiqueue support > > > > > > > > > > > > On Tue, Dec 02, 2014 at 04:30:49PM +0800, zhangleiqiang wrote: > > > > > > > Hi, all 
> > > > > > > I am testing the performance of xen netfront-netback > > > > > > > driver that with > > > > > > multi-queues support. The throughput from domU to remote dom0 is > > > > > > 9.2Gb/s, but the throughput from domU to remote domU is only > > > > > > 3.6Gb/s, I think the bottleneck is the throughput from dom0 to > > > > > > local domU. However, we have done some testing and found the > > > > > > throughput from dom0 to local domU is 5.8Gb/s. > > > > > > > And if we send packets from one DomU to other 3 DomUs on > > > > > > > different > > > > > > host simultaneously, the sum of throughout can reach 9Gbps. It > > > > > > seems like the bottleneck is the receiver? > > > > > > > After some analysis, I found that even the max_queue of > > > > > > > netfront/back > > > > > > is set to 4, there are some strange results as follows: > > > > > > > 1. In domU, only one rx queue deal with softirq > > > > > > > > > > > > Try to bind irq to different vcpus? > > > > > > > > > > Do you mean we try to bind irq to different vcpus in DomU? I will try it > > now. > > > > > > > > > > > > > Yes. Given the fact that you have two backend threads running while > > > > only one DomU vcpu is busy, it smells like misconfiguration in DomU. > > > > > > > > If this phenomenon persists after correctly binding irqs, you might > > > > want to check traffic is steering correctly to different queues. > > > > > > > > > > > > > > > > > 2. In dom0, only two netback queues process are scheduled, > > > > > > > other two > > > > > > process aren't scheduled. > > > > > > > > > > > > How many Dom0 vcpu do you have? If it only has two then there > > > > > > will only be two processes running at a time. > > > > > > > > > > Dom0 has 6 vcpus, and 6G memory. There are only one DomU running > > > > > in > > > > Dom0 and so four netback processes are running in Dom0 (because the > > > > max_queue param of netback kernel module is set to 4). 
> > > > > The phenomenon is that only 2 of these four netback process were > > > > > running > > > > with about 70% cpu usage, and another two use little CPU. > > > > > Is there a hash algorithm to determine which netback process to > > > > > handle the > > > > input packet? > > > > > > > > > > > > > I think that's whatever default algorithm Linux kernel is using. > > > > > > > > We don't currently support other algorithms. > > > > > > > > Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Thu, 4 Dec 2014 12:09:33 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141204105021.GA16532@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , zhangleiqiang , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org > -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: Thursday, December 04, 2014 6:50 PM > To: Zhangleiqiang (Trump) > Cc: Wei Liu; xen-devel@lists.xen.org; zhangleiqiang; Luohao (brian); Xiaoding > (B); Yuzhou (C); Zhuangyuxin > Subject: Re: [Xen-devel] Poor network performance between DomU with > multiqueue support > > On Wed, Dec 03, 
2014 at 02:43:37PM +0000, Zhangleiqiang (Trump) wrote: > > > -----Original Message----- > > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > > Sent: Tuesday, December 02, 2014 11:59 PM > > > To: Zhangleiqiang (Trump) > > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian); > > > Xiaoding (B); Yuzhou (C); Zhuangyuxin > > > Subject: Re: [Xen-devel] Poor network performance between DomU with > > > multiqueue support > > > > > > On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote: > > > > Thanks for your reply, Wei. > > > > > > > > I do the following testing just now and found the results as follows: > > > > > > > > There are three DomUs (4U4G) are running on Host A (6U6G) and one > > > > DomU > > > (4U4G) is running on Host B (6U6G), I send packets from three DomUs > > > to the DomU on Host B simultaneously. > > > > > > > > 1. The "top" output of Host B as follows: > > > > > > > > top - 09:42:11 up 1:07, 2 users, load average: 2.46, 1.90, 1.47 > > > > Tasks: 173 total, 4 running, 169 sleeping, 0 stopped, 0 zombie > > > > %Cpu0 : 0.0 us, 0.0 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, > > > > 0.8 si, 1.9 st > > > > %Cpu1 : 0.0 us, 27.0 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, > > > > 9.5 si, 0.4 st > > > > %Cpu2 : 0.0 us, 90.0 sy, 0.0 ni, 8.3 id, 0.0 wa, 0.0 hi, > > > > 1.7 si, 0.0 st > > > > %Cpu3 : 0.4 us, 1.4 sy, 0.0 ni, 95.4 id, 0.0 wa, 0.0 hi, > > > > 1.4 si, 1.4 st > > > > %Cpu4 : 0.0 us, 60.2 sy, 0.0 ni, 39.5 id, 0.0 wa, 0.0 hi, > > > > 0.3 si, 0.0 st > > > > %Cpu5 : 0.0 us, 2.8 sy, 0.0 ni, 89.4 id, 0.0 wa, 0.0 hi, > > > > 6.9 si, 0.9 > > > st > > > > KiB Mem: 4517144 total, 3116480 used, 1400664 free, 876 > > > buffers > > > > KiB Swap: 2103292 total, 0 used, 2103292 free. 
2374656 > > > cached Mem > > > > > > > > PID USER PR NI VIRT RES SHR > S %CPU %MEM > > > TIME+ COMMAND > > > > 7440 root 20 0 0 0 0 R 71.10 0.000 > > > 8:15.38 vif4.0-q3-guest > > > > 7434 root 20 0 0 0 0 R 59.14 0.000 > > > 9:00.58 vif4.0-q0-guest > > > > 18 root 20 0 0 0 0 R 33.89 0.000 > > > 2:35.06 ksoftirqd/2 > > > > 28 root 20 0 0 0 0 S 20.93 0.000 > > > 3:01.81 ksoftirqd/4 > > > > > > > > > > > > As shown above, only two netback related processes (vif4.0-*) are > > > > running > > > with high cpu usage, and the other 2 netback processes are idle. The "ps" > > > result of vif4.0-* processes as follows: > > > > > > > > root 7434 50.5 0.0 0 0 ? R 09:23 > 11:29 > > > [vif4.0-q0-guest] > > > > root 7435 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q0-deall] > > > > root 7436 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q1-guest] > > > > root 7437 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q1-deall] > > > > root 7438 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q2-guest] > > > > root 7439 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q2-deall] > > > > root 7440 48.1 0.0 0 0 ? R 09:23 > 10:55 > > > [vif4.0-q3-guest] > > > > root 7441 0.0 0.0 0 0 ? S 09:23 > 0:00 > > > [vif4.0-q3-deall] > > > > root 9724 0.0 0.0 9244 1520 pts/0 S+ 09:46 > 0:00 > > > grep --color=auto > > > > > > > > > > > > 2. The "rx" related content in /proc/interupts in receiver DomU (on Host > B): > > > > > > > > 73: 2 0 2925405 0 xen-dyn-event > > > eth0-q0-rx > > > > 75: 43 93 0 118 xen-dyn-event > > > eth0-q1-rx > > > > 77: 2 3376 14 1983 xen-dyn-event > > > eth0-q2-rx > > > > 79: 2414666 0 9 0 xen-dyn-event > > > eth0-q3-rx > > > > > > > > As shown above, it seems like that only q0 and q3 handles the > > > > interrupt > > > triggered by packet receving. > > > > > > > > Any advise? Thanks. > > > > > > Netback selects queue based on the return value of > skb_get_queue_mapping. > > > The queue mapping is set by core driver or ndo_select_queue (if > > > specified by individual driver). 
> > > In this case netback doesn't have its implementation of
> > > ndo_select_queue, so it's up to core driver to decide which queue to
> > > dispatch the packet to. I think you need to inspect why Dom0 only
> > > steers traffic to these two queues but not all of them.
> > >
> > > Don't know which utility is handy for this job. Probably tc(8) is useful?
> >
> > Thanks Wei.
> >
> > I think the reason for the above results that only two
> > netback/netfront processes works hard is the queue select method. I
> > have tried to send packets from multiple host/vm to a vm, and all of
> > the netback/netfront processes are running with high cpu usage a few
> > times.
> >
>
> A few times? You might want to check some patches to rework RX stall
> detection by David Vrabel that went in after 3.16.

Thanks for your suggestion. I have switched to the latest stable branch, 3.17.4, and found that the patches you mentioned are not merged in that branch either. I will merge them and try again.

> > However, I find another issue. Even using 6 queues and making sure
> > that all of these 6 netback processes running with high cpu usage
> > (indeed, any of it running with 87% cpu usage), the whole VM receive
> > throughout is not very higher than results when using 4 queues. The
> > results are from 4.5Gbps to 5.04 Gbps using TCP with 512 bytes length
> > and 4.3Gbps to 5.78Gbps using TCP with 1460 bytes length.
> >
>
> I would like to ask if you're still using 4U4G (4 CPU 4 G?) configuration? If so,
> please make sure there are at least the same number of vcpus as queues.

Sorry for the confusion: 4U4G means 4 CPUs and 4 GB of memory. :) Yesterday I also found that the max_queue of netback is determined by min(online_cpu, module_param), so when using 6 queues in the previous test, I used a VM with 6 CPUs and 6 GB of memory.
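As a small illustration of the min(online_cpu, module_param) rule mentioned above, the sketch below computes the resulting queue count for the configurations discussed in this thread. The function and parameter names are hypothetical and are not the driver's actual identifiers.

```python
def effective_queues(online_cpus: int, max_queues_param: int) -> int:
    """Queue count as described in this thread: limited by both the number
    of online CPUs and the max-queues module parameter (names assumed)."""
    if online_cpus < 1 or max_queues_param < 1:
        raise ValueError("both arguments must be at least 1")
    return min(online_cpus, max_queues_param)


if __name__ == "__main__":
    # 4 online CPUs with the parameter set to 6: still only 4 queues.
    print(effective_queues(online_cpus=4, max_queues_param=6))
    # 6 online CPUs with the parameter set to 6: all 6 queues.
    print(effective_queues(online_cpus=6, max_queues_param=6))
```

This is why asking for 6 queues on a 4-CPU configuration would not be expected to create more than 4, and why the 6-queue test needed a 6-CPU VM.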
> > According to the testing result from WIKI:
> > http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing,
> > The VM receive throughput is also more lower than VM transmit.
> >
>
> I think that's expected, because guest RX data path still uses grant_copy while
> guest TX uses grant_map to do zero-copy transmit.

As I understand it, the RX process is as follows:
1. The physical NIC receives a packet.
2. The Xen hypervisor triggers an interrupt to Dom0.
3. Dom0's NIC driver performs the "RX" operation, and the packet is stored in an SKB that is also owned by/shared with netback.
4. Netback notifies netfront through the event channel that a packet has been received.
5. Netfront grants a buffer for receiving and passes the GR to netback through the I/O ring (if the grant-reuse mechanism is used, netfront just passes the GR to netback).
6. Netback does the grant_copy to copy the packet from its SKB to the buffer referenced by the GR, and notifies netfront through the event channel.
7. Netfront copies the data from the buffer to the user-level app's SKB.

Am I right? Why not use zero-copy transmit in the guest RX data path too?

> > I am wondering why the VM receive throughout cannot be up to 8-10Gbps
> > as VM transmit under multi-queue? I also tried to send packets
> > directly from Local Dom0 to DomU, the DomU receive throughput can
> > reach about 8-12Gbps, so I am also wondering why transmitting packets
> > from Dom0 to Remote DomU can only reach about 4-5Gbps throughout?
>
> If data is from Dom0 to DomU then SKB is probably not fragmented by network
> stack. You can use tcpdump to check that.

In our testing, the MTU is set to 1600. However, even when testing with packets whose length is 1024 (smaller than 1600), the throughput from Dom0 to the local DomU is much higher than that from Dom0 to the remote DomU. So fragmentation may not be the reason.

> Wei.
> >
> > > Wei.
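Since the grant_copy in step 6 of the RX path described above is paid per packet, a back-of-the-envelope packet rate can help when comparing the measurements in this thread. The sketch below assumes one packet per message of the stated size and ignores protocol headers and any offloads (such as GSO/GRO) that would change the packet size netback actually sees; the throughput and size figures are the ones reported earlier in the thread, and the function name is made up for illustration.

```python
def packets_per_second(throughput_bps: float, payload_bytes: int) -> float:
    """Rough packet rate for a given throughput, assuming every packet
    carries exactly payload_bytes and nothing else (a simplification)."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return throughput_bps / (payload_bytes * 8)


if __name__ == "__main__":
    # (throughput in Gbps, message size in bytes) as reported in the thread.
    measurements = ((4.5, 512), (5.04, 512), (4.3, 1460), (5.78, 1460))
    for gbps, size in measurements:
        pps = packets_per_second(gbps * 1e9, size)
        print(f"{gbps} Gbps at {size} bytes: about {pps:,.0f} packets/s")
```

If the per-packet assumption roughly holds, the 512-byte runs push more than twice as many packets per second as the 1460-byte runs at a similar bit rate, which would be consistent with a per-packet cost on the receive side rather than a pure bandwidth limit. This is only an estimate, not a measurement.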
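
[Illustration, not part of the original message.] To put a number on how unevenly traffic is steered, the eth0-q*-rx lines from the receiver DomU's /proc/interrupts excerpt posted in this thread can be summed per queue. This is a minimal sketch; the function name `rx_queue_shares` is hypothetical, and the sample text is copied from that excerpt.

```python
def rx_queue_shares(interrupts_text):
    """Return {queue_name: share of all rx interrupts} from /proc/interrupts-style text."""
    totals = {}
    for line in interrupts_text.strip().splitlines():
        fields = line.split()
        # Expect "<irq>:" first and the queue name (e.g. eth0-q0-rx) last.
        if not fields or not fields[0].endswith(":"):
            continue
        name = fields[-1]
        if not name.endswith("-rx"):
            continue
        # The numeric fields in between are the per-CPU interrupt counters.
        totals[name] = sum(int(f) for f in fields[1:] if f.isdigit())
    grand_total = sum(totals.values())
    return {q: (n / grand_total if grand_total else 0.0) for q, n in totals.items()}


# Counters as posted in the thread (receiver DomU on Host B).
SAMPLE = """\
 73:        2       0 2925405       0 xen-dyn-event eth0-q0-rx
 75:       43      93       0     118 xen-dyn-event eth0-q1-rx
 77:        2    3376      14    1983 xen-dyn-event eth0-q2-rx
 79:  2414666       0       9       0 xen-dyn-event eth0-q3-rx
"""

if __name__ == "__main__":
    for queue, share in sorted(rx_queue_shares(SAMPLE).items()):
        print(f"{queue}: {share:.2%}")
```

On these counters q0 and q3 together account for more than 99% of the rx interrupts, which matches the observation that only two of the four queues do real work.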
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wei Liu
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Thu, 4 Dec 2014 13:05:31 +0000
Message-ID: <20141204130531.GD16532@zion.uk.xensource.com>
References: <20141202110133.GA5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com>
 <20141202121151.GD5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com>
 <20141202155832.GH5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com>
 <20141204105021.GA16532@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Zhangleiqiang (Trump)"
Cc: "Luohao (brian)", Wei Liu, Zhuangyuxin, zhangleiqiang, "Yuzhou (C)", "xen-devel@lists.xen.org", "Xiaoding (B)"
List-Id: xen-devel@lists.xenproject.org

On Thu, Dec 04, 2014 at 12:09:33PM +0000, Zhangleiqiang (Trump) wrote:
[...]
> > > However, I find another issue. Even using 6 queues and making sure
> > > that all of these 6 netback processes running with high cpu usage
> > > (indeed, any of it running with 87% cpu usage), the whole VM receive
> > > throughput is not very higher than results when using 4 queues. The
> > > results are from 4.5Gbps to 5.04 Gbps using TCP with 512 bytes length
> > > and 4.3Gbps to 5.78Gbps using TCP with 1460 bytes length.
> >
> > I would like to ask if you're still using 4U4G (4 CPU 4 G?) configuration? If so,
> > please make sure there are at least the same number of vcpus as queues.
>
> Sorry for the confusion: 4U4G means 4 CPUs and 4 GB of memory. :) I also
> found yesterday that the max_queue of netback is determined by
> min(online_cpu, module_param), so when using 6 queues in the previous
> test, I used a VM with 6 CPUs and 6 GB of memory.
>
> > > According to the testing result from WIKI:
> > > http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing,
> > > The VM receive throughput is also more lower than VM
> > > transmit.
> >
> > I think that's expected, because guest RX data path still uses grant_copy while
> > guest TX uses grant_map to do zero-copy transmit.
>
> As I understand it, the RX process is as follows:
> 1. The physical NIC receives the packet.
> 2. The Xen hypervisor triggers an interrupt to Dom0.
> 3. Dom0's NIC driver does the "RX" operation, and the packet is stored
>    in an SKB which is also owned by/shared with netback.
> 4. Netback notifies netfront through the event channel that a packet is
>    arriving.
> 5. Netfront grants a buffer for receiving and passes the GR to netback
>    through the IO ring (if the grant-reuse mechanism is used, netfront
>    just passes the GR to netback).
> 6. Netback does the grant_copy to copy the packet from its SKB to the
>    buffer referenced by the GR, and notifies netfront through the event
>    channel.
> 7. Netfront copies the data from the buffer to the user-level app's SKB.
>
> Am I right?

Step 4 is not correct; netback won't notify netfront at that point.

Step 5 is not correct; all grant refs are pre-allocated and granted before that.

Other steps look correct.

> Why not use zero-copy transmit in the guest RX data path too?

A rogue / buggy guest might hold the mapping for an arbitrarily long period of time.

> > > I am wondering why the VM receive throughput cannot be up to 8-10Gbps
> > > as VM transmit under multi-queue?
> > > I also tried to send packets
> > > directly from Local Dom0 to DomU, the DomU receive throughput can
> > > reach about 8-12Gbps, so I am also wondering why transmitting packets
> > > from Dom0 to Remote DomU can only reach about 4-5Gbps throughput?
> >
> > If data is from Dom0 to DomU then SKB is probably not fragmented by network
> > stack. You can use tcpdump to check that.
>
> In our testing, the MTU is set to 1600. However, even when testing with
> packets whose length is 1024 (smaller than 1600), the throughput from
> Dom0 to the local DomU is much higher than that from Dom0 to the
> remote DomU. So maybe fragmentation is not the reason.

Don't have much idea about this, sorry.

Wei.
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zoltan Kiss
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Thu, 04 Dec 2014 13:35:17 +0000
Message-ID: <54806315.6010007@linaro.org>
References: <20141202110133.GA5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com>
 <20141202121151.GD5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com>
 <20141202155832.GH5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com>
 <20141204105021.GA16532@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Zhangleiqiang (Trump)", Wei Liu, "xen-devel@lists.xen.org"
Cc: "Xiaoding (B)", Zhuangyuxin, zhangleiqiang, "Luohao (brian)", "Yuzhou (C)"
List-Id: xen-devel@lists.xenproject.org

On 04/12/14 12:09, Zhangleiqiang (Trump) wrote:
>> I
>> think that's expected, because guest RX data path still uses grant_copy while
>> guest TX uses grant_map to do zero-copy transmit.
> As I understand it, the RX process is as follows:
> 1. The physical NIC receives the packet.
> 2. The Xen hypervisor triggers an interrupt to Dom0.
> 3. Dom0's NIC driver does the "RX" operation, and the packet is stored
>    in an SKB which is also owned by/shared with netback.

Not that easy. There is something between the NIC driver and netback which
directs the packets, e.g. the old bridge driver, ovs, or the IP stack of the kernel.

> 4. Netback notifies netfront through the event channel that a packet is
>    arriving.
> 5. Netfront grants a buffer for receiving and passes the GR to netback
>    through the IO ring (if the grant-reuse mechanism is used, netfront
>    just passes the GR to netback).

It looks a bit confusing in the code, but netfront puts "requests" on the ring
buffer, which contain the grant ref of the guest page where the backend can
copy. When the packet comes, netback consumes these requests and sends
back a response telling the guest the grant copy of the packet finished, it can
start handling the data. (sending a response means it's placing a response in
the ring and triggering the event channel)
And ideally netback should always have requests in the ring, so it doesn't have
to wait for the guest to fill it up.

> 6. Netback does the grant_copy to copy the packet from its SKB to the
>    buffer referenced by the GR, and notifies netfront through the event
>    channel.
> 7. Netfront copies the data from the buffer to the user-level app's SKB.

Or wherever that SKB should go, yes. Like with any received packet on a real
network interface.

>
> Am I right? Why not use zero-copy transmit in the guest RX data path too?

Because that means you are mapping that memory to the guest, and you won't
have any guarantee when the guest will release them. And netback can't just
unmap them forcibly after a timeout, because finding a correct timeout value
would be quite impossible.
A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely,
but it becomes even worse if the memory came from another guest: you can't
shut down that guest, for example, until all its memory is returned to it.

Regards,

Zoli
From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhangleiqiang (Trump)"
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Thu, 4 Dec 2014 14:31:12 +0000
Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE23933926@SZXEMA512-MBX.china.huawei.com>
References: <20141202110133.GA5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com>
 <20141202121151.GD5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com>
 <20141202155832.GH5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com>
 <20141204105021.GA16532@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
 <54806315.6010007@linaro.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <54806315.6010007@linaro.org>
Content-Language: zh-CN
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Zoltan Kiss, "xen-devel@lists.xen.org"
Cc: "Xiaoding (B)", Zhuangyuxin, zhangleiqiang, "Luohao (brian)", "Yuzhou (C)"
List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
> Sent: Thursday, December 04, 2014 9:35 PM
> To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org
> Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C)
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
>
> On 04/12/14 12:09, Zhangleiqiang (Trump) wrote:
> >> I think that's expected, because guest RX data path still uses
> >>
grant_copy while
> >> guest TX uses grant_map to do zero-copy transmit.
> > As I understand it, the RX process is as follows:
> > 1. The physical NIC receives the packet.
> > 2. The Xen hypervisor triggers an interrupt to Dom0.
> > 3. Dom0's NIC driver does the "RX" operation, and the packet is stored
> >    in an SKB which is also owned by/shared with netback.
> Not that easy. There is something between the NIC driver and netback which
> directs the packets, e.g. the old bridge driver, ovs, or the IP stack of the kernel.
> > 4. Netback notifies netfront through the event channel that a packet is
> >    arriving.
> > 5. Netfront grants a buffer for receiving and passes the GR to netback
> >    through the IO ring (if the grant-reuse mechanism is used, netfront
> >    just passes the GR to netback).
> It looks a bit confusing in the code, but netfront puts "requests" on the ring
> buffer, which contain the grant ref of the guest page where the backend can
> copy. When the packet comes, netback consumes these requests and sends
> back a response telling the guest the grant copy of the packet finished, it can
> start handling the data. (sending a response means it's placing a response in
> the ring and triggering the event channel) And ideally netback should always have
> requests in the ring, so it doesn't have to wait for the guest to fill it up.
> > 6. Netback does the grant_copy to copy the packet from its SKB to the
> >    buffer referenced by the GR, and notifies netfront through the event
> >    channel.
> > 7. Netfront copies the data from the buffer to the user-level app's SKB.
> Or wherever that SKB should go, yes. Like with any received packet on a real
> network interface.
> >
> > Am I right? Why not use zero-copy transmit in the guest RX data path too?
> Because that means you are mapping that memory to the guest, and you won't
> have any guarantee when the guest will release them. And netback can't just
> unmap them forcibly after a timeout, because finding a correct timeout value
> would be quite impossible.
> A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely,
> but it becomes even worse if the memory came from another
> guest: you can't shut down that guest, for example, until all its memory is
> returned to it.

Thanks for your detailed explanation of the RX data path; I understand it now. :)

About the issue mentioned in my previous mail (poor performance from DomU to DomU, but high throughput from Dom0 to a remote Dom0/DomU), do you have any idea about it?

I am wondering whether netfront/netback can be optimized to reach 10Gbps throughput between DomUs running on different hosts connected by a 10GE network. Currently TX does not seem to be the bottleneck, because we can reach an aggregate throughput of 9Gbps when sending packets from one DomU to 3 other DomUs running on a different host. So I think the bottleneck may be RX. Do you agree?

I am wondering what the main reason is that prevents RX from reaching higher throughput. Compared to KVM+virtio+vhost, which can reach high throughput, the RX path has an extra grant copy operation, which may be one reason. Do you have any thoughts on that too?
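
[Illustration, not part of the original message.] The request/response flow described in the quoted explanation above can be modeled with a small toy ring. This is only a sketch under simplifying assumptions: the class `ToyRxRing` and its methods are hypothetical, a plain dictionary stands in for granted guest pages, a byte copy stands in for the grant copy, and none of this reflects the real Xen ring ABI.

```python
from collections import deque


class ToyRxRing:
    """Toy model of the guest RX path: pre-posted requests, copy, response, event."""

    def __init__(self, size):
        self.size = size
        self.requests = deque()    # grant refs posted in advance by the frontend
        self.responses = deque()   # (grant_ref, length) posted by the backend
        self.guest_pages = {}      # grant_ref -> data copied in by the backend
        self.events = 0            # count of event-channel notifications

    def frontend_post(self, grant_ref):
        """Frontend side: offer an empty buffer, identified by its grant ref."""
        if len(self.requests) + len(self.responses) >= self.size:
            return False           # ring is full
        self.requests.append(grant_ref)
        return True

    def backend_deliver(self, packet):
        """Backend side: consume one request, copy the packet, respond, notify."""
        if not self.requests:
            return False           # nothing posted: the backend would have to wait
        grant_ref = self.requests.popleft()
        self.guest_pages[grant_ref] = bytes(packet)   # stands in for the grant copy
        self.responses.append((grant_ref, len(packet)))
        self.events += 1           # stands in for triggering the event channel
        return True

    def frontend_poll(self):
        """Frontend side: pick up every packet the backend has finished copying."""
        received = []
        while self.responses:
            grant_ref, length = self.responses.popleft()
            received.append(self.guest_pages.pop(grant_ref)[:length])
        return received


if __name__ == "__main__":
    ring = ToyRxRing(size=4)
    for gref in range(4):
        ring.frontend_post(gref)
    ring.backend_deliver(b"hello")
    print(ring.frontend_poll())
```

The model also shows why the backend wants requests to be available at all times: `backend_deliver` fails immediately when the frontend has not posted a buffer.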
>
> Regards,
>
> Zoli
From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhangleiqiang (Trump)"
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Thu, 4 Dec 2014 14:37:54 +0000
Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2393394C@SZXEMA512-MBX.china.huawei.com>
References: <20141202110133.GA5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com>
 <20141202121151.GD5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com>
 <20141202155832.GH5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com>
 <20141204105021.GA16532@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com>
 <20141204130531.GD16532@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20141204130531.GD16532@zion.uk.xensource.com>
Content-Language: zh-CN
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu, "xen-devel@lists.xen.org"
Cc: "Xiaoding (B)", Zhuangyuxin, zhangleiqiang, "Luohao (brian)", "Yuzhou (C)"
List-Id: xen-devel@lists.xenproject.org

Thanks for your detailed explanation, Wei.

I am wondering whether netfront/netback can be optimized to reach 10Gbps throughput between DomUs running on different hosts connected by a 10GE network. Currently RX seems to be the bottleneck, which is also consistent with the test results on the Xen wiki: http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing

I am wondering what factors prevent RX from reaching higher throughput. You have mentioned that one reason is that the guest RX data path still uses grant_copy while guest TX uses grant_map to do zero-copy transmit.
Do you know of any other factors, or of any ongoing work to optimize the RX data path?

----------
zhangleiqiang (Trump)

Best Regards

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: Thursday, December 04, 2014 9:06 PM
> To: Zhangleiqiang (Trump)
> Cc: Wei Liu; xen-devel@lists.xen.org; zhangleiqiang; Luohao (brian); Xiaoding
> (B); Yuzhou (C); Zhuangyuxin
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
>
> On Thu, Dec 04, 2014 at 12:09:33PM +0000, Zhangleiqiang (Trump) wrote:
> [...]
> > > > However, I find another issue. Even using 6 queues and making sure
> > > > that all of these 6 netback processes running with high cpu usage
> > > > (indeed, any of it running with 87% cpu usage), the whole VM
> > > > receive throughput is not very higher than results when using 4
> > > > queues. The results are from 4.5Gbps to 5.04 Gbps using TCP with
> > > > 512 bytes length and 4.3Gbps to 5.78Gbps using TCP with 1460 bytes
> > > > length.
> > >
> > > I would like to ask if you're still using 4U4G (4 CPU 4 G?)
> > > configuration? If so, please make sure there are at least the same
> > > number of vcpus as queues.
> >
> > Sorry for the confusion: 4U4G means 4 CPUs and 4 GB of memory. :) I also
> > found yesterday that the max_queue of netback is determined by
> > min(online_cpu, module_param), so when using 6 queues in the previous
> > test, I used a VM with 6 CPUs and 6 GB of memory.
> >
> > > > According to the testing result from WIKI:
> > > > http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing,
> > > > The VM receive throughput is also more lower
> > > > than VM transmit.
> > >
> > > I think that's expected, because guest RX data path still uses
> > > grant_copy while guest TX uses grant_map to do zero-copy transmit.
> >
> > As I understand it, the RX process is as follows:
> > 1. The physical NIC receives the packet.
> > 2. The Xen hypervisor triggers an interrupt to Dom0.
> > 3. Dom0's NIC driver does the "RX" operation, and the packet is stored
> >    in an SKB which is also owned by/shared with netback.
> > 4. Netback notifies netfront through the event channel that a packet is
> >    arriving.
> > 5. Netfront grants a buffer for receiving and passes the GR to netback
> >    through the IO ring (if the grant-reuse mechanism is used, netfront
> >    just passes the GR to netback).
> > 6. Netback does the grant_copy to copy the packet from its SKB to the
> >    buffer referenced by the GR, and notifies netfront through the event
> >    channel.
> > 7. Netfront copies the data from the buffer to the user-level app's SKB.
> >
> > Am I right?
>
> Step 4 is not correct; netback won't notify netfront at that point.
>
> Step 5 is not correct; all grant refs are pre-allocated and granted before that.
>
> Other steps look correct.
>
> > Why not use zero-copy transmit in the guest RX data path too?
>
> A rogue / buggy guest might hold the mapping for an arbitrarily long period of time.
>
> > > > I am wondering why the VM receive throughput cannot be up to
> > > > 8-10Gbps as VM transmit under multi-queue? I also tried to send
> > > > packets directly from Local Dom0 to DomU, the DomU receive
> > > > throughput can reach about 8-12Gbps, so I am also wondering why
> > > > transmitting packets from Dom0 to Remote DomU can only reach about
> > > > 4-5Gbps throughput?
> > >
> > > If data is from Dom0 to DomU then SKB is probably not fragmented by
> > > network stack. You can use tcpdump to check that.
> >
> > In our testing, the MTU is set to 1600. However, even when testing with
> > packets whose length is 1024 (smaller than 1600), the throughput from
> > Dom0 to the local DomU is much higher than that from Dom0 to the
> > remote DomU. So maybe fragmentation is not the reason.
>
> Don't have much idea about this, sorry.
>
> Wei.
> > >
> > > Wei.
> > > > >
> > > > > Wei.
From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhangleiqiang (Trump)"
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Fri, 5 Dec 2014 01:17:16 +0000
Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com>
References: <20141202110133.GA5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com>
 <20141202121151.GD5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com>
 <20141202155832.GH5768@zion.uk.xensource.com>
 <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com>
 <20141204105021.GA16532@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20141204105021.GA16532@zion.uk.xensource.com>
Content-Language: zh-CN
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu, "xen-devel@lists.xen.org"
Cc: "Xiaoding (B)", Zhuangyuxin, zhangleiqiang, "Luohao (brian)", "Yuzhou (C)"
List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: Thursday, December 04, 2014 6:50 PM
> To: Zhangleiqiang (Trump)
> Cc: Wei Liu; xen-devel@lists.xen.org; zhangleiqiang; Luohao (brian); Xiaoding
> (B); Yuzhou (C); Zhuangyuxin
> Subject: Re: [Xen-devel] Poor network performance between DomU with
> multiqueue support
>
> On Wed, Dec 03, 2014 at 02:43:37PM +0000, Zhangleiqiang (Trump) wrote:
> > > -----Original Message-----
> > > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > Sent: Tuesday, December 02, 2014 11:59 PM
> > > To: Zhangleiqiang (Trump)
> > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao (brian);
> > > Xiaoding (B); Yuzhou (C); Zhuangyuxin
> > >
Subject: Re: [Xen-devel] Poor network performance between DomU with multiqueue support
> > >
> > > On Tue, Dec 02, 2014 at 02:46:36PM +0000, Zhangleiqiang (Trump) wrote:
> > > > Thanks for your reply, Wei.
> > > >
> > > > I do the following testing just now and found the results as follows:
> > > >
> > > > There are three DomUs (4U4G) are running on Host A (6U6G) and one
> > > > DomU (4U4G) is running on Host B (6U6G), I send packets from three DomUs
> > > > to the DomU on Host B simultaneously.
> > > >
> > > > 1. The "top" output of Host B as follows:
> > > >
> > > > top - 09:42:11 up 1:07, 2 users, load average: 2.46, 1.90, 1.47
> > > > Tasks: 173 total, 4 running, 169 sleeping, 0 stopped, 0 zombie
> > > > %Cpu0 : 0.0 us,  0.0 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.8 si, 1.9 st
> > > > %Cpu1 : 0.0 us, 27.0 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, 9.5 si, 0.4 st
> > > > %Cpu2 : 0.0 us, 90.0 sy, 0.0 ni,  8.3 id, 0.0 wa, 0.0 hi, 1.7 si, 0.0 st
> > > > %Cpu3 : 0.4 us,  1.4 sy, 0.0 ni, 95.4 id, 0.0 wa, 0.0 hi, 1.4 si, 1.4 st
> > > > %Cpu4 : 0.0 us, 60.2 sy, 0.0 ni, 39.5 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
> > > > %Cpu5 : 0.0 us,  2.8 sy, 0.0 ni, 89.4 id, 0.0 wa, 0.0 hi, 6.9 si, 0.9 st
> > > > KiB Mem: 4517144 total, 3116480 used, 1400664 free, 876 buffers
> > > > KiB Swap: 2103292 total, 0 used, 2103292 free. 2374656 cached Mem
> > > >
> > > >  PID USER PR NI VIRT RES SHR S  %CPU  %MEM   TIME+ COMMAND
> > > > 7440 root 20  0    0   0   0 R 71.10 0.000 8:15.38 vif4.0-q3-guest
> > > > 7434 root 20  0    0   0   0 R 59.14 0.000 9:00.58 vif4.0-q0-guest
> > > >   18 root 20  0    0   0   0 R 33.89 0.000 2:35.06 ksoftirqd/2
> > > >   28 root 20  0    0   0   0 S 20.93 0.000 3:01.81 ksoftirqd/4
> > > >
> > > >
> > > > As shown above, only two netback related processes (vif4.0-*) are
> > > > running with high cpu usage, and the other 2 netback processes are idle.
> > > > The "ps" result of vif4.0-* processes as follows:
> > > >
> > > > root 7434 50.5 0.0    0    0 ?     R  09:23 11:29 [vif4.0-q0-guest]
> > > > root 7435  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q0-deall]
> > > > root 7436  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q1-guest]
> > > > root 7437  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q1-deall]
> > > > root 7438  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q2-guest]
> > > > root 7439  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q2-deall]
> > > > root 7440 48.1 0.0    0    0 ?     R  09:23 10:55 [vif4.0-q3-guest]
> > > > root 7441  0.0 0.0    0    0 ?     S  09:23  0:00 [vif4.0-q3-deall]
> > > > root 9724  0.0 0.0 9244 1520 pts/0 S+ 09:46  0:00 grep --color=auto
> > > >
> > > >
> > > > 2. The "rx" related content in /proc/interrupts in receiver DomU (on Host B):
> > > >
> > > > 73:       2     0 2925405     0 xen-dyn-event eth0-q0-rx
> > > > 75:      43    93       0   118 xen-dyn-event eth0-q1-rx
> > > > 77:       2  3376      14  1983 xen-dyn-event eth0-q2-rx
> > > > 79: 2414666     0       9     0 xen-dyn-event eth0-q3-rx
> > > >
> > > > As shown above, it seems like that only q0 and q3 handles the
> > > > interrupt triggered by packet receiving.
> > > >
> > > > Any advice? Thanks.
> > >
> > > Netback selects queue based on the return value of skb_get_queue_mapping.
> > > The queue mapping is set by core driver or ndo_select_queue (if
> > > specified by individual driver). In this case netback doesn't have
> > > its implementation of ndo_select_queue, so it's up to core driver to
> > > decide which queue to dispatch the packet to. I think you need to
> > > inspect why Dom0 only steers traffic to these two queues but not all of them.
> > >
> > > Don't know which utility is handy for this job. Probably tc(8) is useful?
> >
> > Thanks Wei.
> >
> > I think the reason for the above results that only two
> > netback/netfront processes works hard is the queue select method.
> > I have tried to send packets from multiple host/vm to a vm, and all of
> > the netback/netfront processes are running with high cpu usage a few
> > times.
> >
>
> A few times? You might want to check some patches to rework RX stall
> detection by David Vrabel that went in after 3.16.
>
> > However, I find another issue. Even using 6 queues and making sure
> > that all of these 6 netback processes running with high cpu usage
> > (indeed, any of it running with 87% cpu usage), the whole VM receive
> > throughput is not very higher than results when using 4 queues. The
> > results are from 4.5Gbps to 5.04 Gbps using TCP with 512 bytes length
> > and 4.3Gbps to 5.78Gbps using TCP with 1460 bytes length.
> >
>
> I would like to ask if you're still using 4U4G (4 CPU 4 G?) configuration? If so,
> please make sure there are at least the same number of vcpus as queues.
>
> > According to the testing result from WIKI:
> > http://wiki.xen.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing,
> > The VM receive throughput is also more lower than VM
> > transmit.
> >
>
> I think that's expected, because guest RX data path still uses grant_copy while
> guest TX uses grant_map to do zero-copy transmit.

As far as I know, there are three main grant-related operations used in the split device model: grant mapping, grant transfer and grant copy.
Grant transfer is not used now, and grant mapping and grant transfer both involve "TLB" refresh work for the hypervisor, am I right? Or does only grant transfer have this overhead?

Does grant copy necessarily have more overhead than grant mapping?

From the code, I see that in TX, netback does gnttab_batch_copy as well as gnttab_map_refs:

//netback.c:xenvif_tx_action
	xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);

	if (nr_cops == 0)
		return 0;

	gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
	if (nr_mops != 0) {
		ret = gnttab_map_refs(queue->tx_map_ops,
				      NULL,
				      queue->pages_to_map,
				      nr_mops);
		BUG_ON(ret);
	}

> > I am wondering why the VM receive throughput cannot be up to 8-10Gbps
> > as VM transmit under multi-queue? I also tried to send packets
> > directly from Local Dom0 to DomU, the DomU receive throughput can
> > reach about 8-12Gbps, so I am also wondering why transmitting packets
> > from Dom0 to Remote DomU can only reach about 4-5Gbps throughput?
>
> If data is from Dom0 to DomU then SKB is probably not fragmented by network
> stack. You can use tcpdump to check that.
>
> Wei.
>
> >
> > > Wei.
> > >
> > > > ----------
> > > > zhangleiqiang (Trump)
> > > >
> > > > Best Regards
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > > > Sent: Tuesday, December 02, 2014 8:12 PM
> > > > > To: Zhangleiqiang (Trump)
> > > > > Cc: Wei Liu; zhangleiqiang; xen-devel@lists.xen.org; Luohao
> > > > > (brian); Xiaoding (B); Yuzhou (C); Zhuangyuxin
> > > > > Subject: Re: [Xen-devel] Poor network performance between DomU
> > > > > with multiqueue support
> > > > >
> > > > > On Tue, Dec 02, 2014 at 11:50:59AM +0000, Zhangleiqiang (Trump) wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: xen-devel-bounces@lists.xen.org
> > > > > > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Wei
> > > > > > > Liu
> > > > > > > Sent: Tuesday, December 02, 2014 7:02 PM
> > > > > > > To: zhangleiqiang
> > > > > > > Cc: wei.liu2@citrix.com; xen-devel@lists.xen.org
> > > > > > > Subject: Re: [Xen-devel] Poor network performance between
> > > > > > > DomU with multiqueue support
> > > > > > >
> > > > > > > On Tue, Dec 02, 2014 at 04:30:49PM +0800,
zhangleiqiang wrote: > > > > > > > > Hi, all > > > > > > > > I am testing the performance of xen netfront-netback > > > > > > > > driver that with > > > > > > > multi-queues support. The throughput from domU to remote > > > > > > > dom0 is 9.2Gb/s, but the throughput from domU to remote domU > > > > > > > is only 3.6Gb/s, I think the bottleneck is the throughput > > > > > > > from dom0 to local domU. However, we have done some testing > > > > > > > and found the throughput from dom0 to local domU is 5.8Gb/s. > > > > > > > > And if we send packets from one DomU to other 3 DomUs > > > > > > > > on different > > > > > > > host simultaneously, the sum of throughout can reach 9Gbps. > > > > > > > It seems like the bottleneck is the receiver? > > > > > > > > After some analysis, I found that even the max_queue > > > > > > > > of netfront/back > > > > > > > is set to 4, there are some strange results as follows: > > > > > > > > 1. In domU, only one rx queue deal with softirq > > > > > > > > > > > > > > Try to bind irq to different vcpus? > > > > > > > > > > > > Do you mean we try to bind irq to different vcpus in DomU? I > > > > > > will try it > > > now. > > > > > > > > > > > > > > > > Yes. Given the fact that you have two backend threads running > > > > > while only one DomU vcpu is busy, it smells like misconfiguration in > DomU. > > > > > > > > > > If this phenomenon persists after correctly binding irqs, you > > > > > might want to check traffic is steering correctly to different queues. > > > > > > > > > > > > > > > > > > > > 2. In dom0, only two netback queues process are > > > > > > > > scheduled, other two > > > > > > > process aren't scheduled. > > > > > > > > > > > > > > How many Dom0 vcpu do you have? If it only has two then > > > > > > > there will only be two processes running at a time. > > > > > > > > > > > > Dom0 has 6 vcpus, and 6G memory. 
There are only one DomU > > > > > > running in > > > > > Dom0 and so four netback processes are running in Dom0 (because > > > > > the max_queue param of netback kernel module is set to 4). > > > > > > The phenomenon is that only 2 of these four netback process > > > > > > were running > > > > > with about 70% cpu usage, and another two use little CPU. > > > > > > Is there a hash algorithm to determine which netback process > > > > > > to handle the > > > > > input packet? > > > > > > > > > > > > > > > > I think that's whatever default algorithm Linux kernel is using. > > > > > > > > > > We don't currently support other algorithms. > > > > > > > > > > Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Fri, 5 Dec 2014 12:42:33 +0000 Message-ID: <20141205124233.GD31446@zion.uk.xensource.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) 
wrote:
[...]
> > I think that's expected, because guest RX data path still uses grant_copy while
> > guest TX uses grant_map to do zero-copy transmit.
>
> As far as I know, there are three main grant-related operations used in split device model: grant mapping, grant transfer and grant copy.
> Grant transfer has not used now, and grant mapping and grant transfer both involve "TLB" refresh work for hypervisor, am I right? Or only grant transfer has this overhead?

Transfer is not used so I can't tell. Grant unmap causes a TLB flush.

I saw in an email the other day that the XenServer folks have a planned
improvement to avoid the TLB flush in Xen, to be upstreamed in the 4.6
window. I can't say for sure that it will get upstreamed as I don't work
on that.

> Does grant copy surely has more overhead than grant mapping?
>

At the very least the zero-copy TX path is faster than the previous copying
path.

But speaking of the micro operation I'm not sure.

There was once a persistent-map prototype of netback / netfront that
established a memory pool between FE and BE and then used memcpy to copy
data. Unfortunately that prototype was not done right so the result was
not good.

> From the code, I see that in TX, netback will do gnttab_batch_copy as well as gnttab_map_refs:
>
> //netback.c:xenvif_tx_action
> 	xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);
>
> 	if (nr_cops == 0)
> 		return 0;
>
> 	gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
> 	if (nr_mops != 0) {
> 		ret = gnttab_map_refs(queue->tx_map_ops,
> 				      NULL,
> 				      queue->pages_to_map,
> 				      nr_mops);
> 		BUG_ON(ret);
> 	}
>
>

The copy is for the packet header. Mapping is for packet data.

We need to copy the header from the guest so that it doesn't change under
netback's feet.

Wei.
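The point about the header not changing "under netback's feet" can be illustrated with a small user-space toy. This is only a sketch under stated assumptions: `toy_hdr`, `TOY_MAX_PAYLOAD` and all function names are hypothetical and are not netback code, and a plain byte array stands in for a guest page. It shows why a backend that validates a private copy keeps the validated value, while one that re-reads guest-writable memory may not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified header; NOT the real netif request layout. */
struct toy_hdr {
    uint16_t payload_len;
};

#define TOY_MAX_PAYLOAD 1500

/* Stand-in for a buggy or hostile guest rewriting its own page. */
static void guest_rewrites_header(unsigned char *shared_page, uint16_t new_len)
{
    struct toy_hdr h = { .payload_len = new_len };

    assert(shared_page != NULL);
    memcpy(shared_page, &h, sizeof(h));
}

/*
 * Backend works on a private copy of the header (the role the grant copy
 * plays in the thread). Returns the length it ends up using, 0 if rejected.
 */
static uint16_t backend_uses_private_copy(unsigned char *shared_page)
{
    struct toy_hdr local;

    memcpy(&local, shared_page, sizeof(local));   /* copy once */
    if (local.payload_len > TOY_MAX_PAYLOAD)
        return 0;                                  /* validation failed */
    guest_rewrites_header(shared_page, 65535);     /* guest interferes */
    return local.payload_len;                      /* still the checked value */
}

/*
 * Backend reads the guest-writable page in place: it checks the header,
 * then reads it again when using it (the situation a mapped header allows).
 */
static uint16_t backend_uses_shared_view(unsigned char *shared_page)
{
    struct toy_hdr seen;

    memcpy(&seen, shared_page, sizeof(seen));      /* check */
    if (seen.payload_len > TOY_MAX_PAYLOAD)
        return 0;
    guest_rewrites_header(shared_page, 65535);     /* guest interferes */
    memcpy(&seen, shared_page, sizeof(seen));      /* use: value has changed */
    return seen.payload_len;
}
```

With a page whose header says 512, `backend_uses_private_copy()` keeps using 512, while `backend_uses_shared_view()` ends up with the guest's later value even though the earlier check passed.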
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Zoltan Kiss Subject: Re: Poor network performance between DomU with multiqueue support Date: Fri, 05 Dec 2014 15:18:07 +0000 Message-ID: <5481CCAF.1040102@linaro.org> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141205124233.GD31446@zion.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On 05/12/14 12:42, Wei Liu wrote: > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote: > [...] >>> I think that's expected, because guest RX data path still uses grant_copy while >>> guest TX uses grant_map to do zero-copy transmit. >> >> As far as I know, there are three main grant-related operations used in split device model: grant mapping, grant transfer and grant copy. >> Grant transfer has not used now, and grant mapping and grant transfer both involve "TLB" refresh work for hypervisor, am I right? Or only grant transfer has this overhead? > > Transfer is not used so I can't tell. Grant unmap causes TLB flush. 
>
> I saw in an email the other day XenServer folks has some planned
> improvement to avoid TLB flush in Xen to upstream in 4.6 window. I can't
> speak for sure it will get upstreamed as I don't work on that.
>
>> Does grant copy surely has more overhead than grant mapping?
>>
>
> At the very least the zero-copy TX path is faster than previous copying
> path.
>
> But speaking of the micro operation I'm not sure.
>
> There was once persistent map prototype netback / netfront that
> establishes a memory pool between FE and BE then use memcpy to copy
> data. Unfortunately that prototype was not done right so the result was
> not good.
>
>> From the code, I see that in TX, netback will do gnttab_batch_copy as well as gnttab_map_refs:
>>
>> //netback.c:xenvif_tx_action
>> 	xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);
>>
>> 	if (nr_cops == 0)
>> 		return 0;
>>
>> 	gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
>> 	if (nr_mops != 0) {
>> 		ret = gnttab_map_refs(queue->tx_map_ops,
>> 				      NULL,
>> 				      queue->pages_to_map,
>> 				      nr_mops);
>> 		BUG_ON(ret);
>> 	}
>>
>>
>
> The copy is for the packet header. Mapping is for packet data.
>
> We need to copy header from guest so that it doesn't change under
> netback's feet.

It also matters because, if the above-mentioned "TLB flush avoidance" patch
goes into Xen, it will be important to grant copy the header rather than
grant map it and then memcpy. The latter is the old way; it touches the
page, so you can't avoid the TLB flush.

>
> Wei.
> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Zoltan Kiss Subject: Re: Poor network performance between DomU with multiqueue support Date: Fri, 05 Dec 2014 15:20:55 +0000 Message-ID: <5481CD57.607@linaro.org> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com> <54806315.6010007@linaro.org> <3A6795EA1206904E94BEC8EF9DF109AE23933926@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE23933926@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" , "xen-devel@lists.xen.org" Cc: jonathan.davies@citrix.com, "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On 04/12/14 14:31, Zhangleiqiang (Trump) wrote: >> -----Original Message----- >> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org] >> Sent: Thursday, December 04, 2014 9:35 PM >> To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org >> Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C) >> Subject: Re: [Xen-devel] Poor network performance between DomU with >> multiqueue support >> >> >> >> On 04/12/14 12:09, Zhangleiqiang (Trump) wrote: >>>> I think that's 
expected, because guest RX data path still uses >>>> grant_copy while >>>>> guest TX uses grant_map to do zero-copy transmit. >>> As I understand, the RX process is as follows: >>> 1. Phy NIC receive packet >>> 2. XEN Hypervisor trigger interrupt to Dom0 3. Dom0' s NIC driver do >>> the "RX" operation, and the packet is stored into SKB which is also >>> owned/shared with netback >> Not that easy. There is something between the NIC driver and netback which >> directs the packets, e.g. the old bridge driver, ovs, or the IP stack of the kernel. >>> 4. NetBack notify netfront through event channel that a packet is >>> receiving 5. Netfront grant a buffer for receiving and notify netback >>> the GR (if using grant-resue mechanism, netfront just notify the GR to >>> netback) through IO Ring >> It looks a bit confusing in the code, but netfront put "requests" on the ring >> buffer, which contains the grant ref of the guest page where the backend can >> copy. When the packet comes, netback consumes these requests and send >> back a response telling the guest the grant copy of the packet finished, it can >> start handling the data. (sending a response means it's placing a response in >> the ring and trigger the event channel) And ideally netback should always have >> requests in the ring, so it doesn't have to wait for the guest to fill it up. > >>> 6. NetBack do the grant_copy to copy packet from its SKB to the buffer >>> referenced by GR, and notify netfront through event channel 7. >>> Netfront copy the data from buffer to user-level app's SKB >> Or wherever that SKB should go, yes. Like with any received packet on a real >> network interface. >>> >>> Am I right? Why not using zero-copy transmit in guest RX data pash too ? >> Because that means you are mapping that memory to the guest, and you won't >> have any guarantee when the guest will release them. 
And netback can't just >> unmap them forcibly after a timeout, because finding a correct timeout value >> would be quite impossible. >> A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely, >> but it even becomes worse if the memory came from another >> guest: you can't shutdown that guest for example, until all its memory is >> returned to him. > > Thanks for your detailed explanation about RX data path, I have get it, :) > > About the issue that poor performance between DomU to DomU, but high throughout between Dom0 to remote Dom0/DomU mentioned in my previous mail, do you have any idea about it? > > I am wondering if netfront/netback can be optimized to reach the 10Gbps throughout between DomUs running on different hosts connected with 10GE network. Currently, it seems like the TX is not the bottleneck, because we can reach the aggregate throughout of 9Gbps when sending packets from one DomU to other 3 DomUs running on different host. So I think the bottleneck maybe the RX, are you agreed with me? > > I am wondering what is the main reason that prevent RX to reach the higher throughout? Compared to KVM+virtio+vhost, which can reach high throughout, the RX has extra grantcopy operation, and the grantcopy operation may be one reason for it. Do you have any idea about it too? It's quite sure that the grant copy is the bottleneck for a single queue RX traffic. I don't know what's the plan to help that, currently only a faster CPU can help you with that. 
> >> >> Regards, >> >> Zoli From mboxrd@z Thu Jan 1 00:00:00 1970 From: Konrad Rzeszutek Wilk Subject: Re: Poor network performance between DomU with multiqueue support Date: Fri, 5 Dec 2014 13:27:02 -0500 Message-ID: <20141205182702.GA4754@laptop.dumpdata.com> References: <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com> <54806315.6010007@linaro.org> <3A6795EA1206904E94BEC8EF9DF109AE23933926@SZXEMA512-MBX.china.huawei.com> <5481CD57.607@linaro.org> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <5481CD57.607@linaro.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Zoltan Kiss Cc: "Zhangleiqiang (Trump)" , jonathan.davies@citrix.com, "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Fri, Dec 05, 2014 at 03:20:55PM +0000, Zoltan Kiss wrote: > > > On 04/12/14 14:31, Zhangleiqiang (Trump) wrote: > >>-----Original Message----- > >>From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org] > >>Sent: Thursday, December 04, 2014 9:35 PM > >>To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org > >>Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao (brian); Yuzhou (C) > >>Subject: Re: [Xen-devel] Poor network performance between DomU with > >>multiqueue support > >> > >> > >> > >>On 04/12/14 12:09, Zhangleiqiang (Trump) wrote: > >>>>I think that's expected, because guest RX data path still uses > 
>>>>grant_copy while > >>>>>guest TX uses grant_map to do zero-copy transmit. > >>>As I understand, the RX process is as follows: > >>>1. Phy NIC receive packet > >>>2. XEN Hypervisor trigger interrupt to Dom0 3. Dom0' s NIC driver do > >>>the "RX" operation, and the packet is stored into SKB which is also > >>>owned/shared with netback > >>Not that easy. There is something between the NIC driver and netback which > >>directs the packets, e.g. the old bridge driver, ovs, or the IP stack of the kernel. > >>>4. NetBack notify netfront through event channel that a packet is > >>>receiving 5. Netfront grant a buffer for receiving and notify netback > >>>the GR (if using grant-resue mechanism, netfront just notify the GR to > >>>netback) through IO Ring > >>It looks a bit confusing in the code, but netfront put "requests" on the ring > >>buffer, which contains the grant ref of the guest page where the backend can > >>copy. When the packet comes, netback consumes these requests and send > >>back a response telling the guest the grant copy of the packet finished, it can > >>start handling the data. (sending a response means it's placing a response in > >>the ring and trigger the event channel) And ideally netback should always have > >>requests in the ring, so it doesn't have to wait for the guest to fill it up. > > > >>>6. NetBack do the grant_copy to copy packet from its SKB to the buffer > >>>referenced by GR, and notify netfront through event channel 7. > >>>Netfront copy the data from buffer to user-level app's SKB > >>Or wherever that SKB should go, yes. Like with any received packet on a real > >>network interface. > >>> > >>>Am I right? Why not using zero-copy transmit in guest RX data pash too ? > >>Because that means you are mapping that memory to the guest, and you won't > >>have any guarantee when the guest will release them. And netback can't just > >>unmap them forcibly after a timeout, because finding a correct timeout value > >>would be quite impossible. 
> >>A malicious/buggy/overloaded guest can hold on to Dom0 memory indefinitely, > >>but it even becomes worse if the memory came from another > >>guest: you can't shutdown that guest for example, until all its memory is > >>returned to him. > > > >Thanks for your detailed explanation about RX data path, I have get it, :) > > > >About the issue that poor performance between DomU to DomU, but high throughout between Dom0 to remote Dom0/DomU mentioned in my previous mail, do you have any idea about it? > > > >I am wondering if netfront/netback can be optimized to reach the 10Gbps throughout between DomUs running on different hosts connected with 10GE network. Currently, it seems like the TX is not the bottleneck, because we can reach the aggregate throughout of 9Gbps when sending packets from one DomU to other 3 DomUs running on different host. So I think the bottleneck maybe the RX, are you agreed with me? > > > >I am wondering what is the main reason that prevent RX to reach the higher throughout? Compared to KVM+virtio+vhost, which can reach high throughout, the RX has extra grantcopy operation, and the grantcopy operation may be one reason for it. Do you have any idea about it too? > It's quite sure that the grant copy is the bottleneck for a single queue RX > traffic. I don't know what's the plan to help that, currently only a faster > CPU can help you with that. Could the Intel QuickData help with that? 
> > > > >> > >>Regards, > >> > >>Zoli > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Mon, 8 Dec 2014 06:50:15 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2394B199@SZXEMA512-MBX.china.huawei.com> References: <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393371E@SZXEMA512-MBX.china.huawei.com> <54806315.6010007@linaro.org> <3A6795EA1206904E94BEC8EF9DF109AE23933926@SZXEMA512-MBX.china.huawei.com> <5481CD57.607@linaro.org> <20141205182702.GA4754@laptop.dumpdata.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141205182702.GA4754@laptop.dumpdata.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk , Zoltan Kiss , "xen-devel@lists.xen.org" Cc: "jonathan.davies@citrix.com" , "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org > On Fri, Dec 05, 2014 at 03:20:55PM +0000, Zoltan Kiss wrote: > > > > > > On 04/12/14 14:31, Zhangleiqiang (Trump) wrote: > > >>-----Original Message----- > > >>From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org] > > >>Sent: Thursday, December 04, 2014 9:35 PM > > >>To: Zhangleiqiang (Trump); Wei Liu; xen-devel@lists.xen.org > > >>Cc: Xiaoding (B); Zhuangyuxin; zhangleiqiang; Luohao 
(brian); Yuzhou > > >>(C) > > >>Subject: Re: [Xen-devel] Poor network performance between DomU with > > >>multiqueue support > > >> > > >> > > >> > > >>On 04/12/14 12:09, Zhangleiqiang (Trump) wrote: > > >>>>I think that's expected, because guest RX data path still uses > > >>>>grant_copy while > > >>>>>guest TX uses grant_map to do zero-copy transmit. > > >>>As I understand, the RX process is as follows: > > >>>1. Phy NIC receive packet > > >>>2. XEN Hypervisor trigger interrupt to Dom0 3. Dom0' s NIC driver > > >>>do the "RX" operation, and the packet is stored into SKB which is > > >>>also owned/shared with netback > > >>Not that easy. There is something between the NIC driver and netback > > >>which directs the packets, e.g. the old bridge driver, ovs, or the IP stack of > the kernel. > > >>>4. NetBack notify netfront through event channel that a packet is > > >>>receiving 5. Netfront grant a buffer for receiving and notify > > >>>netback the GR (if using grant-resue mechanism, netfront just > > >>>notify the GR to > > >>>netback) through IO Ring > > >>It looks a bit confusing in the code, but netfront put "requests" on > > >>the ring buffer, which contains the grant ref of the guest page > > >>where the backend can copy. When the packet comes, netback consumes > > >>these requests and send back a response telling the guest the grant > > >>copy of the packet finished, it can start handling the data. > > >>(sending a response means it's placing a response in the ring and > > >>trigger the event channel) And ideally netback should always have requests > in the ring, so it doesn't have to wait for the guest to fill it up. > > > > > >>>6. NetBack do the grant_copy to copy packet from its SKB to the > > >>>buffer referenced by GR, and notify netfront through event channel 7. > > >>>Netfront copy the data from buffer to user-level app's SKB > > >>Or wherever that SKB should go, yes. Like with any received packet > > >>on a real network interface. 
> > >>>
> > >>>Am I right? Why not using zero-copy transmit in guest RX data path too ?
> > >>Because that means you are mapping that memory to the guest, and you
> > >>won't have any guarantee when the guest will release them. And
> > >>netback can't just unmap them forcibly after a timeout, because
> > >>finding a correct timeout value would be quite impossible.
> > >>A malicious/buggy/overloaded guest can hold on to Dom0 memory
> > >>indefinitely, but it even becomes worse if the memory came from
> > >>another
> > >>guest: you can't shutdown that guest for example, until all its
> > >>memory is returned to him.
> > >
> > >Thanks for your detailed explanation about RX data path, I have get
> > >it, :)
> > >
> > >About the issue that poor performance between DomU to DomU, but high
> > >throughput between Dom0 to remote Dom0/DomU mentioned in my previous
> > >mail, do you have any idea about it?
> > >
> > >I am wondering if netfront/netback can be optimized to reach the 10Gbps
> > >throughput between DomUs running on different hosts connected with 10GE
> > >network. Currently, it seems like the TX is not the bottleneck, because we can
> > >reach the aggregate throughput of 9Gbps when sending packets from one
> > >DomU to other 3 DomUs running on different host. So I think the bottleneck
> > >maybe the RX, are you agreed with me?
> > >
> > >I am wondering what is the main reason that prevent RX to reach the higher
> > >throughput? Compared to KVM+virtio+vhost, which can reach high throughput,
> > >the RX has extra grantcopy operation, and the grantcopy operation may be one
> > >reason for it. Do you have any idea about it too?
> > It's quite sure that the grant copy is the bottleneck for a single
> > queue RX traffic. I don't know what's the plan to help that, currently
> > only a faster CPU can help you with that.
>
> Could the Intel QuickData help with that?

Thanks for your hint. I am looking for a method that does not depend on hardware.
Because I have seen that virtio can reach the 10Gbps throughout, and I think PV network protocol which is the mainline of XEN should also reach the throughout. However, the testing results show that it is not ideal, so I am wondering what the possible reason is and if PV network protocol can be optimized. > > > > > > > >> > > >>Regards, > > >> > > >>Zoli > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Mon, 8 Dec 2014 06:44:26 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141205124233.GD31446@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Zhangleiqiang (Trump)" , "Luohao (brian)" , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote: > [...] 
> > > I think that's expected, because guest RX data path still uses > > > grant_copy while guest TX uses grant_map to do zero-copy transmit. > > > > As far as I know, there are three main grant-related operations used in split > device model: grant mapping, grant transfer and grant copy. > > Grant transfer has not used now, and grant mapping and grant transfer both > involve "TLB" refresh work for hypervisor, am I right? Or only grant transfer > has this overhead? > > Transfer is not used so I can't tell. Grant unmap causes TLB flush. > > I saw in an email the other day XenServer folks has some planned improvement > to avoid TLB flush in Xen to upstream in 4.6 window. I can't speak for sure it will > get upstreamed as I don't work on that. > > > Does grant copy surely has more overhead than grant mapping? > > > > At the very least the zero-copy TX path is faster than previous copying path. > > But speaking of the micro operation I'm not sure. > > There was once persistent map prototype netback / netfront that establishes a > memory pool between FE and BE then use memcpy to copy data. Unfortunately > that prototype was not done right so the result was not good. The newest mail about persistent grant I can find is sent from 16 Nov 2012 (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html). Why is it not done right and not merged into upstream? And I also search for virtio support in XEN, and I find that the one who are familiar with it is you, too, (http://wiki.xen.org/wiki/Virtio_On_Xen), :-). I am wondering what is the current state for virtio on XEN? 
> > >From the code, I see that in TX, netback will do gnttab_batch_copy as well > as gnttab_map_refs: > > > > //netback.c:xenvif_tx_action > > xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops); > > > > if (nr_cops == 0) > > return 0; > > > > gnttab_batch_copy(queue->tx_copy_ops, nr_cops); > > if (nr_mops != 0) { > > ret = gnttab_map_refs(queue->tx_map_ops, > > NULL, > > queue->pages_to_map, > > nr_mops); > > BUG_ON(ret); > > } > > > > > > The copy is for the packet header. Mapping is for packet data. > > We need to copy header from guest so that it doesn't change under netback's > feet. > > Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Mon, 8 Dec 2014 10:13:04 +0000 Message-ID: <20141208101304.GB17128@zion.uk.xensource.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: 
xen-devel@lists.xenproject.org On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote: > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote: > > [...] > > > > I think that's expected, because guest RX data path still uses > > > > grant_copy while guest TX uses grant_map to do zero-copy transmit. > > > > > > As far as I know, there are three main grant-related operations used in split > > device model: grant mapping, grant transfer and grant copy. > > > Grant transfer has not used now, and grant mapping and grant transfer both > > involve "TLB" refresh work for hypervisor, am I right? Or only grant transfer > > has this overhead? > > > > Transfer is not used so I can't tell. Grant unmap causes TLB flush. > > > > I saw in an email the other day XenServer folks has some planned improvement > > to avoid TLB flush in Xen to upstream in 4.6 window. I can't speak for sure it will > > get upstreamed as I don't work on that. > > > > > Does grant copy surely has more overhead than grant mapping? > > > > > > > At the very least the zero-copy TX path is faster than previous copying path. > > > > But speaking of the micro operation I'm not sure. > > > > There was once persistent map prototype netback / netfront that establishes a > > memory pool between FE and BE then use memcpy to copy data. Unfortunately > > that prototype was not done right so the result was not good. > > The newest mail about persistent grant I can find is sent from 16 Nov > 2012 > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html). > Why is it not done right and not merged into upstream? AFAICT there's one more memcpy than necessary, i.e. frontend memcpy data into the pool then backend memcpy data out of the pool, when backend should be able to use the page in pool directly. > > And I also search for virtio support in XEN, and I find that the one > who are familiar with it is you, too, > (http://wiki.xen.org/wiki/Virtio_On_Xen), :-). 
I am wondering what is > the current state for virtio on XEN? Yes, it was me. I never have the time to revisit that. I don't think we support virtio network at the moment. Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Mon, 8 Dec 2014 13:08:18 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> References: <20141202110133.GA5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23931FA1@SZXEMA512-MBX.china.huawei.com> <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> <20141208101304.GB17128@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141208101304.GB17128@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , zhangleiqiang , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote: > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote: > > > [...] > > > > > I think that's expected, because guest RX data path still uses > > > > > grant_copy while guest TX uses grant_map to do zero-copy transmit. 
> > > > > > > > As far as I know, there are three main grant-related operations > > > > used in split > > > device model: grant mapping, grant transfer and grant copy. > > > > Grant transfer has not used now, and grant mapping and grant > > > > transfer both > > > involve "TLB" refresh work for hypervisor, am I right? Or only > > > grant transfer has this overhead? > > > > > > Transfer is not used so I can't tell. Grant unmap causes TLB flush. > > > > > > I saw in an email the other day XenServer folks has some planned > > > improvement to avoid TLB flush in Xen to upstream in 4.6 window. I > > > can't speak for sure it will get upstreamed as I don't work on that. > > > > > > > Does grant copy surely has more overhead than grant mapping? > > > > > > > > > > At the very least the zero-copy TX path is faster than previous copying path. > > > > > > But speaking of the micro operation I'm not sure. > > > > > > There was once persistent map prototype netback / netfront that > > > establishes a memory pool between FE and BE then use memcpy to copy > > > data. Unfortunately that prototype was not done right so the result was not > good. > > > > The newest mail about persistent grant I can find is sent from 16 Nov > > 2012 > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html). > > Why is it not done right and not merged into upstream? > > AFAICT there's one more memcpy than necessary, i.e. frontend memcpy data > into the pool then backend memcpy data out of the pool, when backend should > be able to use the page in pool directly. Memcpy should cheaper than grant_copy because the former needs not the "hypercall" which will cause "VM Exit" to "XEN Hypervisor", am I right? For RX path, using memcpy based on persistent grant table may have higher performance than using grant copy now. 
I have seen "move grant copy to guest" and "Fix grant copy alignment problem" listed as optimization methods used in "NetChannel2" (http://www-archive.xenproject.org/files/xensummit_fall07/16_JoseRenatoSantos.pdf). Unfortunately, NetChannel2 does not seem to be supported from 2.6.32 onwards. Do you know these methods, and would they be helpful for RX path optimization under the current upstream implementation?

By the way, after rethinking the test results for the multi-queue PV implementation (kernel 3.17.4 + Xen 4.4), I find that when using four queues for netback/netfront, about 3 netback processes run with high CPU usage on the receiving Dom0 (about 85% usage per process, each running on one CPU core), and the aggregate throughput is only about 5Gbps. I suspect that there may be some bug or pitfall in the current multi-queue implementation, because occupying almost all of 3 CPU cores for packet receiving at 5Gbps throughput seems abnormal.

> >
> > And I also search for virtio support in XEN, and I find that the one
> > who are familiar with it is you, too,
> > (http://wiki.xen.org/wiki/Virtio_On_Xen), :-). I am wondering what is
> > the current state for virtio on XEN?
>
> Yes, it was me. I never have the time to revisit that. I don't think we support
> virtio network at the moment.
>
> Wei.
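The extra copy discussed earlier in this thread (frontend memcpy into the pool, then backend memcpy out of the pool) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `pool_page`, `send_via_two_copies` and `send_via_direct_use` are hypothetical names, a plain array stands in for the persistently granted pool, and no Xen or kernel API is involved. It only counts bytes moved and says nothing about real timings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical stand-in for one page of a persistently granted pool
 * shared between frontend (FE) and backend (BE). Not a Xen structure. */
struct pool_page {
    unsigned char data[PAGE_BYTES];
};

/* Running total of bytes moved by memcpy, used to compare the two paths. */
static size_t bytes_copied;

static void counted_memcpy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    bytes_copied += len;
}

/* Path A: the behaviour attributed to the prototype. The FE copies the
 * payload into the pool, then the BE copies it out into its own buffer. */
static size_t send_via_two_copies(struct pool_page *pool,
                                  const unsigned char *payload, size_t len,
                                  unsigned char *be_buf)
{
    assert(len <= PAGE_BYTES);
    bytes_copied = 0;
    counted_memcpy(pool->data, payload, len);  /* FE -> pool */
    counted_memcpy(be_buf, pool->data, len);   /* pool -> BE buffer */
    return bytes_copied;
}

/* Path B: the BE consumes the pool page in place, so only the FE copies. */
static size_t send_via_direct_use(struct pool_page *pool,
                                  const unsigned char *payload, size_t len,
                                  const unsigned char **be_view)
{
    assert(len <= PAGE_BYTES);
    bytes_copied = 0;
    counted_memcpy(pool->data, payload, len);  /* FE -> pool */
    *be_view = pool->data;                     /* BE reads the page directly */
    return bytes_copied;
}
```

In this model a 1500-byte payload moves 3000 bytes on path A and 1500 bytes on path B. Whether that difference matters in practice next to the grant copy hypercall is exactly what the thread says has not been benchmarked properly.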
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Mon, 8 Dec 2014 13:55:34 +0000 Message-ID: <20141208135534.GA21374@zion.uk.xensource.com> References: <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> <20141208101304.GB17128@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote: > > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote: > > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote: > > > > [...] > > > > > > I think that's expected, because guest RX data path still uses > > > > > > grant_copy while guest TX uses grant_map to do zero-copy transmit. > > > > > > > > > > As far as I know, there are three main grant-related operations > > > > > used in split > > > > device model: grant mapping, grant transfer and grant copy. 
> > > > > Grant transfer has not used now, and grant mapping and grant > > > > > transfer both > > > > involve "TLB" refresh work for hypervisor, am I right? Or only > > > > grant transfer has this overhead? > > > > > > > > Transfer is not used so I can't tell. Grant unmap causes TLB flush. > > > > > > > > I saw in an email the other day XenServer folks has some planned > > > > improvement to avoid TLB flush in Xen to upstream in 4.6 window. I > > > > can't speak for sure it will get upstreamed as I don't work on that. > > > > > > > > > Does grant copy surely has more overhead than grant mapping? > > > > > > > > > > > > > At the very least the zero-copy TX path is faster than previous copying path. > > > > > > > > But speaking of the micro operation I'm not sure. > > > > > > > > There was once persistent map prototype netback / netfront that > > > > establishes a memory pool between FE and BE then use memcpy to copy > > > > data. Unfortunately that prototype was not done right so the result was not > > good. > > > > > > The newest mail about persistent grant I can find is sent from 16 Nov > > > 2012 > > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html). > > > Why is it not done right and not merged into upstream? > > > > AFAICT there's one more memcpy than necessary, i.e. frontend memcpy data > > into the pool then backend memcpy data out of the pool, when backend should > > be able to use the page in pool directly. > > Memcpy should cheaper than grant_copy because the former needs not the > "hypercall" which will cause "VM Exit" to "XEN Hypervisor", am I > right? For RX path, using memcpy based on persistent grant table may > have higher performance than using grant copy now. In theory yes. Unfortunately nobody has benchmarked that properly. If you're interested in doing work on optimising RX performance, you might want to sync up with XenServer folks? 
> > I have seen "move grant copy to guest" and "Fix grant copy alignment > problem" as optimization methods used in "NetChannel2" > (http://www-archive.xenproject.org/files/xensummit_fall07/16_JoseRenatoSantos.pdf). > Unfortunately, NetChannel2 seems not be supported from 2.6.32. Do you > know them and are them be helpful for RX path optimization under > current upstream implementation? Not sure, that's long before I ever started working on Xen. > > By the way, after rethinking the testing results for multi-queue pv > (kernel 3.17.4+XEN 4.4) implementation, I find that when using four > queues for netback/netfront, there will be about 3 netback process > running with high CPU usage on receive Dom0 (about 85% usage per > process running on one CPU core), and the aggregate throughout is only > about 5Gbps. I doubt that there may be some bug or pitfall in current > multi-queue implementation, because for 5Gbps throughout, occurring > about all of 3 CPU core for packet receiving is somehow abnormal. > 3.17.4 doesn't contain David Vrabel's fixes. Look for bc96f648df1bbc2729abbb84513cf4f64273a1f1 f48da8b14d04ca87ffcffe68829afd45f926ec6a ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e in David Miller's net tree. BTW there are some improvement planned for 4.6: "[Xen-devel] [PATCH v3 0/2] gnttab: Improve scaleability". This is orthogonal to the problem you're trying to solve but it should help improve performance in general. Wei. 
From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 9 Dec 2014 02:51:55 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2394DA29@SZXEMA512-MBX.china.huawei.com> References: <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> <20141208101304.GB17128@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> <20141208135534.GA21374@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141208135534.GA21374@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , zhangleiqiang , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org > On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote: > > > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote: > > > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) > wrote: > > > > > [...] > > > > > > > > The newest mail about persistent grant I can find is sent from 16 > > > > Nov > > > > 2012 > > > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html). > > > > Why is it not done right and not merged into upstream? > > > > > > AFAICT there's one more memcpy than necessary, i.e. 
frontend memcpy
> > > data into the pool then backend memcpy data out of the pool, when
> > > backend should be able to use the page in pool directly.
> >
> > Memcpy should cheaper than grant_copy because the former needs not the
> > "hypercall" which will cause "VM Exit" to "XEN Hypervisor", am I
> > right? For RX path, using memcpy based on persistent grant table may
> > have higher performance than using grant copy now.
>
> In theory yes. Unfortunately nobody has benchmarked that properly.
>
> If you're interested in doing work on optimising RX performance, you might
> want to sync up with XenServer folks?

What is the recommended way to have a discussion with the XenServer folks? Through the XenServer forum or a standalone mailing list? I find that most of the discussions in the forum are about the XenServer product.

> >
> > I have seen "move grant copy to guest" and "Fix grant copy alignment
> > problem" as optimization methods used in "NetChannel2"
> > (http://www-archive.xenproject.org/files/xensummit_fall07/16_JoseRenatoSantos.pdf).
> > Unfortunately, NetChannel2 seems not be supported from 2.6.32. Do you
> > know them and are them be helpful for RX path optimization under
> > current upstream implementation?
>
> Not sure, that's long before I ever started working on Xen.
>
> >
> > By the way, after rethinking the testing results for multi-queue pv
> > (kernel 3.17.4+XEN 4.4) implementation, I find that when using four
> > queues for netback/netfront, there will be about 3 netback process
> > running with high CPU usage on receive Dom0 (about 85% usage per
> > process running on one CPU core), and the aggregate throughout is only
> > about 5Gbps. I doubt that there may be some bug or pitfall in current
> > multi-queue implementation, because for 5Gbps throughout, occurring
> > about all of 3 CPU core for packet receiving is somehow abnormal.
> >
>
> 3.17.4 doesn't contain David Vrabel's fixes.
> > Look for > bc96f648df1bbc2729abbb84513cf4f64273a1f1 > f48da8b14d04ca87ffcffe68829afd45f926ec6a > ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e > in David Miller's net tree. > > BTW there are some improvement planned for 4.6: "[Xen-devel] [PATCH v3 0/2] > gnttab: Improve scaleability". This is orthogonal to the problem you're trying to > solve but it should help improve performance in general. Thanks for your pointer, it is helpful. > > Wei. From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhangleiqiang (Trump)" Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 9 Dec 2014 09:03:30 +0000 Message-ID: <3A6795EA1206904E94BEC8EF9DF109AE2394DB98@SZXEMA512-MBX.china.huawei.com> References: <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> <20141208101304.GB17128@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> <20141208135534.GA21374@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20141208135534.GA21374@zion.uk.xensource.com> Content-Language: zh-CN List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu , "xen-devel@lists.xen.org" Cc: "Xiaoding (B)" , Zhuangyuxin , zhangleiqiang , "Luohao (brian)" , "Yuzhou (C)" List-Id: xen-devel@lists.xenproject.org > On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote: > > > On Mon, Dec 08, 2014 at 06:44:26AM +0000, 
Zhangleiqiang (Trump) wrote:
> > > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump)
> wrote:
> > > > > [...]
> > By the way, after rethinking the testing results for multi-queue pv
> > (kernel 3.17.4+XEN 4.4) implementation, I find that when using four
> > queues for netback/netfront, there will be about 3 netback process
> > running with high CPU usage on receive Dom0 (about 85% usage per
> > process running on one CPU core), and the aggregate throughout is only
> > about 5Gbps. I doubt that there may be some bug or pitfall in current
> > multi-queue implementation, because for 5Gbps throughout, occurring
> > about all of 3 CPU core for packet receiving is somehow abnormal.
> >
>
> 3.17.4 doesn't contain David Vrabel's fixes.
>
> Look for
> bc96f648df1bbc2729abbb84513cf4f64273a1f1
> f48da8b14d04ca87ffcffe68829afd45f926ec6a
> ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e
> in David Miller's net tree.

I have tested with 3.18-rc5, which includes these patches; however, the problem mentioned above does not seem to be improved. There are still 3 netback receive processes, each of which uses about 85% of a CPU core.

> BTW there are some improvement planned for 4.6: "[Xen-devel] [PATCH v3 0/2]
> gnttab: Improve scaleability". This is orthogonal to the problem you're trying to
> solve but it should help improve performance in general.
>
>
> Wei.
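The CPU figures reported above can be turned into a rough cost per received byte. The following is a minimal sketch, not a measurement: `busy_cores` and `cycles_per_byte` are hypothetical helper names, and the clock frequency passed in is an assumption (a nominal value that ignores turbo and hyper-threading).

```c
#include <assert.h>

/* Core-equivalents consumed: number of busy processes times the
 * fraction of one core that each of them uses. */
static double busy_cores(int processes, double usage_per_process)
{
    assert(processes >= 0);
    assert(usage_per_process >= 0.0 && usage_per_process <= 1.0);
    return processes * usage_per_process;
}

/* Approximate CPU cycles spent per received byte.
 * clock_hz is an assumed nominal clock, not a measured value. */
static double cycles_per_byte(double cores, double clock_hz, double gbps)
{
    assert(gbps > 0.0);
    double bytes_per_second = gbps * 1e9 / 8.0;
    return cores * clock_hz / bytes_per_second;
}
```

With the numbers from this thread (3 processes at about 85% each, about 5Gbps aggregate) and an assumed 2.4GHz clock, as on the E5645 hosts described in the first mail, `busy_cores(3, 0.85)` gives about 2.55 core-equivalents and `cycles_per_byte(2.55, 2.4e9, 5.0)` gives roughly 10 cycles per received byte in Dom0.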
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ian Campbell Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 9 Dec 2014 10:05:02 +0000 Message-ID: <1418119502.1428.2.camel@citrix.com> References: <20141202121151.GD5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393216B@SZXEMA512-MBX.china.huawei.com> <20141202155832.GH5768@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2393301C@SZXEMA512-MBX.china.huawei.com> <20141204105021.GA16532@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE23933CDA@SZXEMA512-MBX.china.huawei.com> <20141205124233.GD31446@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394A187@SZXEMA512-MBX.china.huawei.com> <20141208101304.GB17128@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394C523@SZXEMA512-MBX.china.huawei.com> <20141208135534.GA21374@zion.uk.xensource.com> <3A6795EA1206904E94BEC8EF9DF109AE2394DA29@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE2394DA29@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Zhangleiqiang (Trump)" Cc: "Luohao (brian)" , Wei Liu , Zhuangyuxin , zhangleiqiang , "Yuzhou (C)" , "xen-devel@lists.xen.org" , "Xiaoding (B)" List-Id: xen-devel@lists.xenproject.org On Tue, 2014-12-09 at 02:51 +0000, Zhangleiqiang (Trump) wrote: > What is the recommended way to have a discussion with XenServer folks? > Through the forum of XenServer or the standalone mailing list? I find > the most of discussions in forum are the production of XenServer. AIUI development == list, users == forums. Ian. 
From mboxrd@z Thu Jan 1 00:00:00 1970 From: openlui Subject: Re: Poor network performance between DomU with multiqueue support Date: Fri, 27 Feb 2015 17:21:11 +0800 (CST) Message-ID: <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com> References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============7654450083308015496==" Return-path: In-Reply-To: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "xen-devel@lists.xen.org" Cc: "Zhangleiqiang (Trump)" List-Id: xen-devel@lists.xenproject.org --===============7654450083308015496== Content-Type: multipart/alternative; boundary="----=_Part_172110_1876971491.1425028871379" ------=_Part_172110_1876971491.1425028871379 Content-Type: text/plain; charset=GBK Content-Transfer-Encoding: base64 Pk9uIE1vbiwgRGVjIDA4LCAyMDE0IGF0IDAxOjA4OjE4UE0gKzAwMDAsIFpoYW5nbGVpcWlhbmcg KFRydW1wKSB3cm90ZToKPj4gPiBPbiBNb24sIERlYyAwOCwgMjAxNCBhdCAwNjo0NDoyNkFNICsw MDAwLCBaaGFuZ2xlaXFpYW5nIChUcnVtcCkgd3JvdGU6Cj4+ID4gPiA+IE9uIEZyaSwgRGVjIDA1 LCAyMDE0IGF0IDAxOjE3OjE2QU0gKzAwMDAsIFpoYW5nbGVpcWlhbmcgKFRydW1wKSB3cm90ZToK Pj4gPiA+ID4gWy4uLl0KPj4gPiA+ID4gPiA+IEkgdGhpbmsgdGhhdCdzIGV4cGVjdGVkLCBiZWNh dXNlIGd1ZXN0IFJYIGRhdGEgcGF0aCBzdGlsbCAKPj4gPiA+ID4gPiA+IHVzZXMgZ3JhbnRfY29w eSB3aGlsZSBndWVzdCBUWCB1c2VzIGdyYW50X21hcCB0byBkbyB6ZXJvLWNvcHkgdHJhbnNtaXQu Cj4+ID4gPiA+ID4KPj4gPiA+ID4gPiBBcyBmYXIgYXMgSSBrbm93LCB0aGVyZSBhcmUgdGhyZWUg bWFpbiBncmFudC1yZWxhdGVkIAo+PiA+ID4gPiA+IG9wZXJhdGlvbnMgdXNlZCBpbiBzcGxpdAo+ PiA+ID4gPiBkZXZpY2UgbW9kZWw6IGdyYW50IG1hcHBpbmcsIGdyYW50IHRyYW5zZmVyIGFuZCBn cmFudCBjb3B5Lgo+PiA+ID4gPiA+IEdyYW50IHRyYW5zZmVyIGhhcyBub3QgdXNlZCBub3csIGFu ZCBncmFudCBtYXBwaW5nIGFuZCBncmFudCAKPj4gPiA+ID4gPiB0cmFuc2ZlciBib3RoCj4+ID4g 
PiA+IGludm9sdmUgIlRMQiIgcmVmcmVzaCB3b3JrIGZvciBoeXBlcnZpc29yLCBhbSBJIHJpZ2h0 PyAgT3Igb25seSAKPj4gPiA+ID4gZ3JhbnQgdHJhbnNmZXIgaGFzIHRoaXMgb3ZlcmhlYWQ/Cj4+ ID4gPiA+Cj4+ID4gPiA+IFRyYW5zZmVyIGlzIG5vdCB1c2VkIHNvIEkgY2FuJ3QgdGVsbC4gR3Jh bnQgdW5tYXAgY2F1c2VzIFRMQiBmbHVzaC4KPj4gPiA+ID4KPj4gPiA+ID4gSSBzYXcgaW4gYW4g ZW1haWwgdGhlIG90aGVyIGRheSBYZW5TZXJ2ZXIgZm9sa3MgaGFzIHNvbWUgcGxhbm5lZCAKPj4g PiA+ID4gaW1wcm92ZW1lbnQgdG8gYXZvaWQgVExCIGZsdXNoIGluIFhlbiB0byB1cHN0cmVhbSBp biA0LjYgd2luZG93LiAKPj4gPiA+ID4gSSBjYW4ndCBzcGVhayBmb3Igc3VyZSBpdCB3aWxsIGdl dCB1cHN0cmVhbWVkIGFzIEkgZG9uJ3Qgd29yayBvbiB0aGF0Lgo+PiA+ID4gPgo+PiA+ID4gPiA+ IERvZXMgZ3JhbnQgY29weSBzdXJlbHkgaGFzIG1vcmUgb3ZlcmhlYWQgdGhhbiBncmFudCBtYXBw aW5nPwo+PiA+ID4gPiA+Cj4+ID4gPiA+Cj4+ID4gPiA+IEF0IHRoZSB2ZXJ5IGxlYXN0IHRoZSB6 ZXJvLWNvcHkgVFggcGF0aCBpcyBmYXN0ZXIgdGhhbiBwcmV2aW91cyBjb3B5aW5nIHBhdGguCj4+ ID4gPiA+Cj4+ID4gPiA+IEJ1dCBzcGVha2luZyBvZiB0aGUgbWljcm8gb3BlcmF0aW9uIEknbSBu b3Qgc3VyZS4KPj4gPiA+ID4KPj4gPiA+ID4gVGhlcmUgd2FzIG9uY2UgcGVyc2lzdGVudCBtYXAg cHJvdG90eXBlIG5ldGJhY2sgLyBuZXRmcm9udCB0aGF0IAo+PiA+ID4gPiBlc3RhYmxpc2hlcyBh IG1lbW9yeSBwb29sIGJldHdlZW4gRkUgYW5kIEJFIHRoZW4gdXNlIG1lbWNweSB0byAKPj4gPiA+ ID4gY29weSBkYXRhLiBVbmZvcnR1bmF0ZWx5IHRoYXQgcHJvdG90eXBlIHdhcyBub3QgZG9uZSBy aWdodCBzbyAKPj4gPiA+ID4gdGhlIHJlc3VsdCB3YXMgbm90Cj4+ID4gZ29vZC4KPj4gPiA+Cj4+ ID4gPiBUaGUgbmV3ZXN0IG1haWwgYWJvdXQgcGVyc2lzdGVudCBncmFudCBJIGNhbiBmaW5kIGlz IHNlbnQgZnJvbSAxNiAKPj4gPiA+IE5vdgo+PiA+ID4gMjAxMgo+PiA+ID4gKGh0dHA6Ly9saXN0 cy54ZW4ub3JnL2FyY2hpdmVzL2h0bWwveGVuLWRldmVsLzIwMTItMTEvbXNnMDA4MzIuaHRtbCku Cj4+ID4gPiBXaHkgaXMgaXQgbm90IGRvbmUgcmlnaHQgYW5kIG5vdCBtZXJnZWQgaW50byB1cHN0 cmVhbT8KPj4gPiAKPj4gPiBBRkFJQ1QgdGhlcmUncyBvbmUgbW9yZSBtZW1jcHkgdGhhbiBuZWNl c3NhcnksIGkuZS4gZnJvbnRlbmQgbWVtY3B5IAo+PiA+IGRhdGEgaW50byB0aGUgcG9vbCB0aGVu IGJhY2tlbmQgbWVtY3B5IGRhdGEgb3V0IG9mIHRoZSBwb29sLCB3aGVuIAo+PiA+IGJhY2tlbmQg c2hvdWxkIGJlIGFibGUgdG8gdXNlIHRoZSBwYWdlIGluIHBvb2wgZGlyZWN0bHkuCj4+IAo+PiBN 
ZW1jcHkgc2hvdWxkIGNoZWFwZXIgdGhhbiBncmFudF9jb3B5IGJlY2F1c2UgdGhlIGZvcm1lciBu ZWVkcyBub3QgdGhlIAo+PiAiaHlwZXJjYWxsIiB3aGljaCB3aWxsIGNhdXNlICJWTSBFeGl0IiB0 byAiWEVOIEh5cGVydmlzb3IiLCBhbSBJIAo+PiByaWdodD8gRm9yIFJYIHBhdGgsIHVzaW5nIG1l bWNweSBiYXNlZCBvbiBwZXJzaXN0ZW50IGdyYW50IHRhYmxlIG1heSAKPj4gaGF2ZSBoaWdoZXIg cGVyZm9ybWFuY2UgdGhhbiB1c2luZyBncmFudCBjb3B5IG5vdy4KPgo+SW4gdGhlb3J5IHllcy4g VW5mb3J0dW5hdGVseSBub2JvZHkgaGFzIGJlbmNobWFya2VkIHRoYXQgcHJvcGVybHkuCkkgaGF2 ZSBzb21lIHRlc3RpbmcgZm9yIFJYIHBlcmZvcm1hbmNlIHVzaW5nIHBlcnNpc3RlbnQgZ3JhbnQg bWV0aG9kIGFuZCB1cHN0cmVhbSBtZXRob2QgKDMuMTcuNCBicmFuY2gpLCB0aGUgcmVzdWx0cyBz aG93IHRoYXQgcGVyc2lzdGVudCBncmFudCBtZXRob2QgZG9lcyBoYXZlIGhpZ2hlciBwZXJmb3Jt YW5jZSB0aGFuIHVwc3RyZWFtIG1ldGhvZCAoZnJvbSAzLjVHYnBzIHRvIGFib3V0IDZHYnBzKS4g QW5kIEkgZmluZCB0aGF0IHBlcnNpc3RlbnQgZ3JhbnQgbWVjaGFuaXNtIGhhcyBhbHJlYWR5IHVz ZWQgaW4gYmxrZnJvbmcvYmxrYmFjaywgSSBhbSB3b25kZXJpbmcgd2h5IHRoZXJlIGFyZSBubyBl ZmZvcnRzIHRvIHJlcGxhY2UgdGhlIGdyYW50IGNvcHkgYnkgcGVyc2lzdGVudCBncmFudCBub3cs IGF0IGxlYXN0IGluIFJYIHBhdGguIEFyZSB0aGVyZSBvdGhlciBkaXNhZHZhbnRhZ2VzIGluIHBl cnNpc3RlbnQgZ3JhbnQgbWV0aG9kIHdoaWNoIHN0b3Agd2UgdXNlIGl0PyAKClBTLiBJIHVzZWQg cGt0LWdlbiB0byBzZW5kIHBhY2tldCBmcm9tIGRvbTAgdG8gYSBkb21VIHJ1bm5pbmcgb24gYW5v dGhlciBkb20wLCB0aGUgQ1BVcyBvZiBib3RoIGRvbTAgaXMgSW50ZWwgRTU2NDAgMi40R0h6LCBh bmQgdGhlIHR3byBkb20wcyBpcyBjb25uZWN0ZWQgd2l0aCBhIDEwR0UgTklDLgoKCgoKPklmIHlv dSdyZSBpbnRlcmVzdGVkIGluIGRvaW5nIHdvcmsgb24gb3B0aW1pc2luZyBSWCBwZXJmb3JtYW5j ZSwgeW91IG1pZ2h0IHdhbnQgdG8gc3luYyB1cCB3aXRoIFhlblNlcnZlciBmb2xrcz8KPgo+PiAK Pj4gSSBoYXZlIHNlZW4gIm1vdmUgZ3JhbnQgY29weSB0byBndWVzdCIgYW5kICJGaXggZ3JhbnQg Y29weSBhbGlnbm1lbnQgCj4+IHByb2JsZW0iIGFzIG9wdGltaXphdGlvbiBtZXRob2RzIHVzZWQg aW4gIk5ldENoYW5uZWwyIgo+PiAoaHR0cDovL3d3dy1hcmNoaXZlLnhlbnByb2plY3Qub3JnL2Zp bGVzL3hlbnN1bW1pdF9mYWxsMDcvMTZfSm9zZVJlbmF0b1NhbnRvcy5wZGYpLgo+PiBVbmZvcnR1 bmF0ZWx5LCBOZXRDaGFubmVsMiBzZWVtcyBub3QgYmUgc3VwcG9ydGVkIGZyb20gMi42LjMyLiBE 
byB5b3UgCj4+IGtub3cgdGhlbSBhbmQgYXJlIHRoZW0gYmUgaGVscGZ1bCBmb3IgUlggcGF0aCBv cHRpbWl6YXRpb24gdW5kZXIgCj4+IGN1cnJlbnQgdXBzdHJlYW0gaW1wbGVtZW50YXRpb24/Cj4K Pk5vdCBzdXJlLCB0aGF0J3MgbG9uZyBiZWZvcmUgSSBldmVyIHN0YXJ0ZWQgd29ya2luZyBvbiBY ZW4uCj4KPj4gCj4+IEJ5IHRoZSB3YXksIGFmdGVyIHJldGhpbmtpbmcgdGhlIHRlc3RpbmcgcmVz dWx0cyBmb3IgbXVsdGktcXVldWUgcHYgCj4+IChrZXJuZWwgMy4xNy40K1hFTiA0LjQpIGltcGxl bWVudGF0aW9uLCBJIGZpbmQgdGhhdCB3aGVuIHVzaW5nIGZvdXIgCj4+IHF1ZXVlcyBmb3IgbmV0 YmFjay9uZXRmcm9udCwgdGhlcmUgd2lsbCBiZSBhYm91dCAzIG5ldGJhY2sgcHJvY2VzcyAKPj4g cnVubmluZyB3aXRoIGhpZ2ggQ1BVIHVzYWdlIG9uIHJlY2VpdmUgRG9tMCAoYWJvdXQgODUlIHVz YWdlIHBlciAKPj4gcHJvY2VzcyBydW5uaW5nIG9uIG9uZSBDUFUgY29yZSksIGFuZCB0aGUgYWdn cmVnYXRlIHRocm91Z2hvdXQgaXMgb25seSAKPj4gYWJvdXQgNUdicHMuIEkgZG91YnQgdGhhdCB0 aGVyZSBtYXkgYmUgc29tZSBidWcgb3IgcGl0ZmFsbCBpbiBjdXJyZW50IAo+PiBtdWx0aS1xdWV1 ZSBpbXBsZW1lbnRhdGlvbiwgYmVjYXVzZSBmb3IgNUdicHMgdGhyb3VnaG91dCwgb2NjdXJyaW5n IAo+PiBhYm91dCBhbGwgb2YgMyBDUFUgY29yZSBmb3IgcGFja2V0IHJlY2VpdmluZyBpcyBzb21l aG93IGFibm9ybWFsLgo+PiAKPgo+My4xNy40IGRvZXNuJ3QgY29udGFpbiBEYXZpZCBWcmFiZWwn cyBmaXhlcy4KPgo+TG9vayBmb3IKPiAgYmM5NmY2NDhkZjFiYmMyNzI5YWJiYjg0NTEzY2Y0ZjY0 MjczYTFmMQo+ICBmNDhkYThiMTRkMDRjYTg3ZmZjZmZlNjg4MjlhZmQ0NWY5MjZlYzZhCj4gIGVj ZjA4ZDJkYmI5NmQ1YTRiNGJjYzUzYTM5ZThkMjljYzhmZWYwMmUKPmluIERhdmlkIE1pbGxlcidz IG5ldCB0cmVlLgo+Cj5CVFcgdGhlcmUgYXJlIHNvbWUgaW1wcm92ZW1lbnQgcGxhbm5lZCBmb3Ig NC42OiAiW1hlbi1kZXZlbF0gW1BBVENIIHYzIDAvMl0gZ250dGFiOiBJbXByb3ZlIHNjYWxlYWJp bGl0eSIuIFRoaXMgaXMgb3J0aG9nb25hbCB0byB0aGUgcHJvYmxlbSB5b3UncmUgdHJ5aW5nIHRv IHNvbHZlIGJ1dCBpdCBzaG91bGQgaGVscCBpbXByb3ZlIHBlcmZvcm1hbmNlIGluIGdlbmVyYWwu Cj4KPgo+V2VpLgo= ------=_Part_172110_1876971491.1425028871379 Content-Type: text/html; charset=GBK Content-Transfer-Encoding: base64 PGRpdiBzdHlsZT0ibGluZS1oZWlnaHQ6MS43O2NvbG9yOiMwMDAwMDA7Zm9udC1zaXplOjE0cHg7 Zm9udC1mYW1pbHk6QXJpYWwiPjxwcmU+Jmd0O09uIE1vbiwgRGVjIDA4LCAyMDE0IGF0IDAxOjA4 OjE4UE0gKzAwMDAsIFpoYW5nbGVpcWlhbmcgKFRydW1wKSB3cm90ZToKJmd0OyZndDsgJmd0OyBP 
biBNb24sIERlYyAwOCwgMjAxNCBhdCAwNjo0NDoyNkFNICswMDAwLCBaaGFuZ2xlaXFpYW5nIChU cnVtcCkgd3JvdGU6CiZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7IE9uIEZyaSwgRGVjIDA1LCAyMDE0 IGF0IDAxOjE3OjE2QU0gKzAwMDAsIFpoYW5nbGVpcWlhbmcgKFRydW1wKSB3cm90ZToKJmd0OyZn dDsgJmd0OyAmZ3Q7ICZndDsgWy4uLl0KJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7 IEkgdGhpbmsgdGhhdCdzIGV4cGVjdGVkLCBiZWNhdXNlIGd1ZXN0IFJYIGRhdGEgcGF0aCBzdGls bCAKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IHVzZXMgZ3JhbnRfY29weSB3aGls ZSBndWVzdCBUWCB1c2VzIGdyYW50X21hcCB0byBkbyB6ZXJvLWNvcHkgdHJhbnNtaXQuCiZndDsm Z3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgJmd0OyBBcyBm YXIgYXMgSSBrbm93LCB0aGVyZSBhcmUgdGhyZWUgbWFpbiBncmFudC1yZWxhdGVkIAomZ3Q7Jmd0 OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IG9wZXJhdGlvbnMgdXNlZCBpbiBzcGxpdAomZ3Q7Jmd0OyAm Z3Q7ICZndDsgJmd0OyBkZXZpY2UgbW9kZWw6IGdyYW50IG1hcHBpbmcsIGdyYW50IHRyYW5zZmVy IGFuZCBncmFudCBjb3B5LgomZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IEdyYW50IHRyYW5z ZmVyIGhhcyBub3QgdXNlZCBub3csIGFuZCBncmFudCBtYXBwaW5nIGFuZCBncmFudCAKJmd0OyZn dDsgJmd0OyAmZ3Q7ICZndDsgJmd0OyB0cmFuc2ZlciBib3RoCiZndDsmZ3Q7ICZndDsgJmd0OyAm Z3Q7IGludm9sdmUgIlRMQiIgcmVmcmVzaCB3b3JrIGZvciBoeXBlcnZpc29yLCBhbSBJIHJpZ2h0 PyAgT3Igb25seSAKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgZ3JhbnQgdHJhbnNmZXIgaGFzIHRo aXMgb3ZlcmhlYWQ/CiZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDsgJmd0OyAm Z3Q7IFRyYW5zZmVyIGlzIG5vdCB1c2VkIHNvIEkgY2FuJ3QgdGVsbC4gR3JhbnQgdW5tYXAgY2F1 c2VzIFRMQiBmbHVzaC4KJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0OyZndDsgJmd0OyAmZ3Q7 ICZndDsgSSBzYXcgaW4gYW4gZW1haWwgdGhlIG90aGVyIGRheSBYZW5TZXJ2ZXIgZm9sa3MgaGFz IHNvbWUgcGxhbm5lZCAKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgaW1wcm92ZW1lbnQgdG8gYXZv aWQgVExCIGZsdXNoIGluIFhlbiB0byB1cHN0cmVhbSBpbiA0LjYgd2luZG93LiAKJmd0OyZndDsg Jmd0OyAmZ3Q7ICZndDsgSSBjYW4ndCBzcGVhayBmb3Igc3VyZSBpdCB3aWxsIGdldCB1cHN0cmVh bWVkIGFzIEkgZG9uJ3Qgd29yayBvbiB0aGF0LgomZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OwomZ3Q7 Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IERvZXMgZ3JhbnQgY29weSBzdXJlbHkgaGFzIG1vcmUg 
b3ZlcmhlYWQgdGhhbiBncmFudCBtYXBwaW5nPwomZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7 CiZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7IEF0IHRoZSB2 ZXJ5IGxlYXN0IHRoZSB6ZXJvLWNvcHkgVFggcGF0aCBpcyBmYXN0ZXIgdGhhbiBwcmV2aW91cyBj b3B5aW5nIHBhdGguCiZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDsgJmd0OyAm Z3Q7IEJ1dCBzcGVha2luZyBvZiB0aGUgbWljcm8gb3BlcmF0aW9uIEknbSBub3Qgc3VyZS4KJmd0 OyZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgVGhlcmUgd2FzIG9u Y2UgcGVyc2lzdGVudCBtYXAgcHJvdG90eXBlIG5ldGJhY2sgLyBuZXRmcm9udCB0aGF0IAomZ3Q7 Jmd0OyAmZ3Q7ICZndDsgJmd0OyBlc3RhYmxpc2hlcyBhIG1lbW9yeSBwb29sIGJldHdlZW4gRkUg YW5kIEJFIHRoZW4gdXNlIG1lbWNweSB0byAKJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgY29weSBk YXRhLiBVbmZvcnR1bmF0ZWx5IHRoYXQgcHJvdG90eXBlIHdhcyBub3QgZG9uZSByaWdodCBzbyAK Jmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgdGhlIHJlc3VsdCB3YXMgbm90CiZndDsmZ3Q7ICZndDsg Z29vZC4KJmd0OyZndDsgJmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDsgJmd0OyBUaGUgbmV3ZXN0IG1h aWwgYWJvdXQgcGVyc2lzdGVudCBncmFudCBJIGNhbiBmaW5kIGlzIHNlbnQgZnJvbSAxNiAKJmd0 OyZndDsgJmd0OyAmZ3Q7IE5vdgomZ3Q7Jmd0OyAmZ3Q7ICZndDsgMjAxMgomZ3Q7Jmd0OyAmZ3Q7 ICZndDsgKGh0dHA6Ly9saXN0cy54ZW4ub3JnL2FyY2hpdmVzL2h0bWwveGVuLWRldmVsLzIwMTIt MTEvbXNnMDA4MzIuaHRtbCkuCiZndDsmZ3Q7ICZndDsgJmd0OyBXaHkgaXMgaXQgbm90IGRvbmUg cmlnaHQgYW5kIG5vdCBtZXJnZWQgaW50byB1cHN0cmVhbT8KJmd0OyZndDsgJmd0OyAKJmd0OyZn dDsgJmd0OyBBRkFJQ1QgdGhlcmUncyBvbmUgbW9yZSBtZW1jcHkgdGhhbiBuZWNlc3NhcnksIGku ZS4gZnJvbnRlbmQgbWVtY3B5IAomZ3Q7Jmd0OyAmZ3Q7IGRhdGEgaW50byB0aGUgcG9vbCB0aGVu IGJhY2tlbmQgbWVtY3B5IGRhdGEgb3V0IG9mIHRoZSBwb29sLCB3aGVuIAomZ3Q7Jmd0OyAmZ3Q7 IGJhY2tlbmQgc2hvdWxkIGJlIGFibGUgdG8gdXNlIHRoZSBwYWdlIGluIHBvb2wgZGlyZWN0bHku CiZndDsmZ3Q7IAomZ3Q7Jmd0OyBNZW1jcHkgc2hvdWxkIGNoZWFwZXIgdGhhbiBncmFudF9jb3B5 IGJlY2F1c2UgdGhlIGZvcm1lciBuZWVkcyBub3QgdGhlIAomZ3Q7Jmd0OyAiaHlwZXJjYWxsIiB3 aGljaCB3aWxsIGNhdXNlICJWTSBFeGl0IiB0byAiWEVOIEh5cGVydmlzb3IiLCBhbSBJIAomZ3Q7 Jmd0OyByaWdodD8gRm9yIFJYIHBhdGgsIHVzaW5nIG1lbWNweSBiYXNlZCBvbiBwZXJzaXN0ZW50 
IGdyYW50IHRhYmxlIG1heSAKJmd0OyZndDsgaGF2ZSBoaWdoZXIgcGVyZm9ybWFuY2UgdGhhbiB1 c2luZyBncmFudCBjb3B5IG5vdy4KJmd0OwomZ3Q7SW4gdGhlb3J5IHllcy4gVW5mb3J0dW5hdGVs eSBub2JvZHkgaGFzIGJlbmNobWFya2VkIHRoYXQgcHJvcGVybHkuCjxkaXY+CjwvZGl2PjxkaXY+ SSBoYXZlIHNvbWUgdGVzdGluZyBmb3IgUlggcGVyZm9ybWFuY2UgdXNpbmcgcGVyc2lzdGVudCBn cmFudCBtZXRob2QgYW5kIHVwc3RyZWFtIG1ldGhvZCAoMy4xNy40IGJyYW5jaCksIHRoZSByZXN1 bHRzIHNob3cgdGhhdCBwZXJzaXN0ZW50IGdyYW50IG1ldGhvZCBkb2VzIGhhdmUgaGlnaGVyIHBl cmZvcm1hbmNlIHRoYW4gdXBzdHJlYW0gbWV0aG9kIChmcm9tIDMuNUdicHMgdG8gYWJvdXQgNkdi cHMpLiA8L2Rpdj48ZGl2PkFuZCBJIGZpbmQgdGhhdCBwZXJzaXN0ZW50IGdyYW50IG1lY2hhbmlz bSBoYXMgYWxyZWFkeSB1c2VkIGluIGJsa2Zyb25nL2Jsa2JhY2ssIEkgYW0gd29uZGVyaW5nIHdo eSB0aGVyZSBhcmUgbm8gZWZmb3J0cyB0byByZXBsYWNlIHRoZSBncmFudCBjb3B5IGJ5IHBlcnNp c3RlbnQgZ3JhbnQgbm93LCBhdCBsZWFzdCBpbiBSWCBwYXRoLiBBcmUgdGhlcmUgb3RoZXIgZGlz YWR2YW50YWdlcyBpbiBwZXJzaXN0ZW50IGdyYW50IG1ldGhvZCB3aGljaCBzdG9wIHdlIHVzZSBp dD8gPC9kaXY+PGRpdj48YnI+PC9kaXY+PGRpdj5QUy4gSSB1c2VkIHBrdC1nZW4gdG8gc2VuZCBw YWNrZXQgZnJvbSBkb20wIHRvIGEgZG9tVSBydW5uaW5nIG9uIGFub3RoZXIgZG9tMCwgdGhlIENQ VXMgb2YgYm90aCBkb20wIGlzIEludGVsIEU1NjQwIDIuNEdIeiwgYW5kIHRoZSB0d28gZG9tMHMg aXMgY29ubmVjdGVkIHdpdGggYSAxMEdFIE5JQy48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2Pjxi cj48L2Rpdj4mZ3Q7SWYgeW91J3JlIGludGVyZXN0ZWQgaW4gZG9pbmcgd29yayBvbiBvcHRpbWlz aW5nIFJYIHBlcmZvcm1hbmNlLCB5b3UgbWlnaHQgd2FudCB0byBzeW5jIHVwIHdpdGggWGVuU2Vy dmVyIGZvbGtzPwomZ3Q7CiZndDsmZ3Q7IAomZ3Q7Jmd0OyBJIGhhdmUgc2VlbiAibW92ZSBncmFu dCBjb3B5IHRvIGd1ZXN0IiBhbmQgIkZpeCBncmFudCBjb3B5IGFsaWdubWVudCAKJmd0OyZndDsg cHJvYmxlbSIgYXMgb3B0aW1pemF0aW9uIG1ldGhvZHMgdXNlZCBpbiAiTmV0Q2hhbm5lbDIiCiZn dDsmZ3Q7IChodHRwOi8vd3d3LWFyY2hpdmUueGVucHJvamVjdC5vcmcvZmlsZXMveGVuc3VtbWl0 X2ZhbGwwNy8xNl9Kb3NlUmVuYXRvU2FudG9zLnBkZikuCiZndDsmZ3Q7IFVuZm9ydHVuYXRlbHks IE5ldENoYW5uZWwyIHNlZW1zIG5vdCBiZSBzdXBwb3J0ZWQgZnJvbSAyLjYuMzIuIERvIHlvdSAK Jmd0OyZndDsga25vdyB0aGVtIGFuZCBhcmUgdGhlbSBiZSBoZWxwZnVsIGZvciBSWCBwYXRoIG9w 
dGltaXphdGlvbiB1bmRlciAKJmd0OyZndDsgY3VycmVudCB1cHN0cmVhbSBpbXBsZW1lbnRhdGlv bj8KJmd0OwomZ3Q7Tm90IHN1cmUsIHRoYXQncyBsb25nIGJlZm9yZSBJIGV2ZXIgc3RhcnRlZCB3 b3JraW5nIG9uIFhlbi4KJmd0OwomZ3Q7Jmd0OyAKJmd0OyZndDsgQnkgdGhlIHdheSwgYWZ0ZXIg cmV0aGlua2luZyB0aGUgdGVzdGluZyByZXN1bHRzIGZvciBtdWx0aS1xdWV1ZSBwdiAKJmd0OyZn dDsgKGtlcm5lbCAzLjE3LjQrWEVOIDQuNCkgaW1wbGVtZW50YXRpb24sIEkgZmluZCB0aGF0IHdo ZW4gdXNpbmcgZm91ciAKJmd0OyZndDsgcXVldWVzIGZvciBuZXRiYWNrL25ldGZyb250LCB0aGVy ZSB3aWxsIGJlIGFib3V0IDMgbmV0YmFjayBwcm9jZXNzIAomZ3Q7Jmd0OyBydW5uaW5nIHdpdGgg aGlnaCBDUFUgdXNhZ2Ugb24gcmVjZWl2ZSBEb20wIChhYm91dCA4NSUgdXNhZ2UgcGVyIAomZ3Q7 Jmd0OyBwcm9jZXNzIHJ1bm5pbmcgb24gb25lIENQVSBjb3JlKSwgYW5kIHRoZSBhZ2dyZWdhdGUg dGhyb3VnaG91dCBpcyBvbmx5IAomZ3Q7Jmd0OyBhYm91dCA1R2Jwcy4gSSBkb3VidCB0aGF0IHRo ZXJlIG1heSBiZSBzb21lIGJ1ZyBvciBwaXRmYWxsIGluIGN1cnJlbnQgCiZndDsmZ3Q7IG11bHRp LXF1ZXVlIGltcGxlbWVudGF0aW9uLCBiZWNhdXNlIGZvciA1R2JwcyB0aHJvdWdob3V0LCBvY2N1 cnJpbmcgCiZndDsmZ3Q7IGFib3V0IGFsbCBvZiAzIENQVSBjb3JlIGZvciBwYWNrZXQgcmVjZWl2 aW5nIGlzIHNvbWVob3cgYWJub3JtYWwuCiZndDsmZ3Q7IAomZ3Q7CiZndDszLjE3LjQgZG9lc24n dCBjb250YWluIERhdmlkIFZyYWJlbCdzIGZpeGVzLgomZ3Q7CiZndDtMb29rIGZvcgomZ3Q7ICBi Yzk2ZjY0OGRmMWJiYzI3MjlhYmJiODQ1MTNjZjRmNjQyNzNhMWYxCiZndDsgIGY0OGRhOGIxNGQw NGNhODdmZmNmZmU2ODgyOWFmZDQ1ZjkyNmVjNmEKJmd0OyAgZWNmMDhkMmRiYjk2ZDVhNGI0YmNj NTNhMzllOGQyOWNjOGZlZjAyZQomZ3Q7aW4gRGF2aWQgTWlsbGVyJ3MgbmV0IHRyZWUuCiZndDsK Jmd0O0JUVyB0aGVyZSBhcmUgc29tZSBpbXByb3ZlbWVudCBwbGFubmVkIGZvciA0LjY6ICJbWGVu LWRldmVsXSBbUEFUQ0ggdjMgMC8yXSBnbnR0YWI6IEltcHJvdmUgc2NhbGVhYmlsaXR5Ii4gVGhp cyBpcyBvcnRob2dvbmFsIHRvIHRoZSBwcm9ibGVtIHlvdSdyZSB0cnlpbmcgdG8gc29sdmUgYnV0 IGl0IHNob3VsZCBoZWxwIGltcHJvdmUgcGVyZm9ybWFuY2UgaW4gZ2VuZXJhbC4KJmd0OwomZ3Q7 CiZndDtXZWkuCjwvcHJlPjwvZGl2Pjxicj48YnI+PHNwYW4gdGl0bGU9Im5ldGVhc2Vmb290ZXIi PjxzcGFuIGlkPSJuZXRlYXNlX21haWxfZm9vdGVyIj48L3NwYW4+PC9zcGFuPg== ------=_Part_172110_1876971491.1425028871379-- --===============7654450083308015496== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============7654450083308015496==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wei Liu
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Fri, 27 Feb 2015 10:59:52 +0000
Message-ID: <20150227105951.GB29195@zion.uk.xensource.com>
References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Content-Disposition: inline
In-Reply-To: <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: openlui
Cc: "Zhangleiqiang (Trump)" , wei.liu2@citrix.com, David Vrabel , "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

Cc'ing David (XenServer kernel maintainer)

On Fri, Feb 27, 2015 at 05:21:11PM +0800, openlui wrote:
> >On Mon, Dec 08, 2014 at 01:08:18PM +0000, Zhangleiqiang (Trump) wrote:
> >> > On Mon, Dec 08, 2014 at 06:44:26AM +0000, Zhangleiqiang (Trump) wrote:
> >> > > > On Fri, Dec 05, 2014 at 01:17:16AM +0000, Zhangleiqiang (Trump) wrote:
> >> > > > [...]
> >> > > > > > I think that's expected, because guest RX data path still
> >> > > > > > uses grant_copy while guest TX uses grant_map to do zero-copy transmit.
> >> > > > >
> >> > > > > As far as I know, there are three main grant-related
> >> > > > > operations used in split
> >> > > > device model: grant mapping, grant transfer and grant copy.
> >> > > > > Grant transfer has not used now, and grant mapping and grant
> >> > > > > transfer both
> >> > > > involve "TLB" refresh work for hypervisor, am I right? Or only
> >> > > > grant transfer has this overhead?
> >> > > >
> >> > > > Transfer is not used so I can't tell. Grant unmap causes TLB flush.
> >> > > >
> >> > > > I saw in an email the other day XenServer folks has some planned
> >> > > > improvement to avoid TLB flush in Xen to upstream in 4.6 window.
> >> > > > I can't speak for sure it will get upstreamed as I don't work on that.
> >> > > >
> >> > > > > Does grant copy surely has more overhead than grant mapping?
> >> > > > >
> >> > > >
> >> > > > At the very least the zero-copy TX path is faster than previous copying path.
> >> > > >
> >> > > > But speaking of the micro operation I'm not sure.
> >> > > >
> >> > > > There was once persistent map prototype netback / netfront that
> >> > > > establishes a memory pool between FE and BE then use memcpy to
> >> > > > copy data. Unfortunately that prototype was not done right so
> >> > > > the result was not
> >> > good.
> >> > >
> >> > > The newest mail about persistent grant I can find is sent from 16
> >> > > Nov
> >> > > 2012
> >> > > (http://lists.xen.org/archives/html/xen-devel/2012-11/msg00832.html).
> >> > > Why is it not done right and not merged into upstream?
> >> >
> >> > AFAICT there's one more memcpy than necessary, i.e. frontend memcpy
> >> > data into the pool then backend memcpy data out of the pool, when
> >> > backend should be able to use the page in pool directly.
> >>
> >> Memcpy should cheaper than grant_copy because the former needs not the
> >> "hypercall" which will cause "VM Exit" to "XEN Hypervisor", am I
> >> right? For RX path, using memcpy based on persistent grant table may
> >> have higher performance than using grant copy now.
> >
> >In theory yes. Unfortunately nobody has benchmarked that properly.
> I have some testing for RX performance using persistent grant method
> and upstream method (3.17.4 branch), the results show that persistent
> grant method does have higher performance than upstream method (from
> 3.5Gbps to about 6Gbps).
And I find that persistent grant mechanism
> has already used in blkfront/blkback, I am wondering why there are no
> efforts to replace the grant copy by persistent grant now, at least in
> RX path. Are there other disadvantages in persistent grant method
> which stop we use it?
>

I've seen numbers better than 6Gbps. See upstream changeset
1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b.

Persistent grant is not a silver bullet. There is an email thread on the
list discussing whether it should be removed from the block driver.

XenServer folks have been working on improving network performance. It's
my understanding that they choose different routes than persistent
grant. David might have more insight.

Wei.

> PS. I used pkt-gen to send packet from dom0 to a domU running on
> another dom0, the CPUs of both dom0 is Intel E5640 2.4GHz, and the two
> dom0s is connected with a 10GE NIC.
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Fri, 27 Feb 2015 11:30:20 +0000
Message-ID: <54F0554C.6080608@citrix.com>
References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com> <20150227105951.GB29195@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150227105951.GB29195@zion.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu , openlui
Cc: "Zhangleiqiang (Trump)" , David Vrabel , "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 27/02/15 10:59, Wei Liu wrote:
>
> Persistent grant is not a silver bullet. There is an email thread on the
> list discussing whether it should be removed from the block driver.

Persistent grants for to-guest network traffic is a flawed idea.
It either requires:

a) the backend to memcpy into the mapped grant /and/ the frontend to
memcpy out of the persistently mapped pool. This is clearly going to be
worse for memory bandwidth than a single grant copy.

or

b) the backend to accumulate more and more mappings of guest memory,
which is bad for security and it uses too many grant and map track
resources hence it does not scale to many VIFs.

David

From mboxrd@z Thu Jan 1 00:00:00 1970
From: openlui
Subject: Re: Poor network performance between DomU with multiqueue support
Date: Sat, 28 Feb 2015 10:45:02 +0800 (CST)
Message-ID: <2f994425.11685.14bce12a2dd.Coremail.openlui@126.com>
References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com> <20150227105951.GB29195@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4542424983644477469=="
Return-path:
In-Reply-To: <20150227105951.GB29195@zion.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: "Zhangleiqiang (Trump)" , David Vrabel , "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

--===============4542424983644477469==
Content-Type: multipart/alternative; boundary="----=_Part_274106_1384485211.1425091502813"

------=_Part_274106_1384485211.1425091502813
Content-Type: text/plain; charset=GBK
Content-Transfer-Encoding: base64

QXQgMjAxNS0wMi0yNyAxODo1OTo1MiwgIldlaSBMaXUiIDx3ZWkubGl1MkBjaXRyaXguY29tPiB3
cm90ZToKPkNjJ2luZyBEYXZpZCAoWGVuU2VydmVyIGtlcm5lbCBtYWludGFpbmVyKQo+Cj5PbiBG
cmksIEZlYiAyNywgMjAxNSBhdCAwNToyMToxMVBNICswODAwLCBvcGVubHVpIHdyb3RlOgo+PiA+
T24gTW9uLCBEZWMgMDgsIDIwMTQgYXQgMDE6MDg6MThQTSArMDAwMCwgWmhhbmdsZWlxaWFuZyAo
VHJ1bXApIHdyb3RlOgo+PiA+PiA+IE9uIE1vbiwgRGVjIDA4LCAyMDE0IGF0IDA2OjQ0OjI2QU0g
KzAwMDAsIFpoYW5nbGVpcWlhbmcgKFRydW1wKSB3cm90ZToKPj4gPj4gPiA+ID4gT24gRnJpLCBE
ZWMgMDUsIDIwMTQgYXQgMDE6MTc6MTZBTSArMDAwMCwgWmhhbmdsZWlxaWFuZyAoVHJ1bXApIHdy b3RlOgo+PiA+PiA+ID4gPiBbLi4uXQo+PiA+PiA+ID4gPiA+ID4gSSB0aGluayB0aGF0J3MgZXhw ZWN0ZWQsIGJlY2F1c2UgZ3Vlc3QgUlggZGF0YSBwYXRoIHN0aWxsIAo+PiA+PiA+ID4gPiA+ID4g dXNlcyBncmFudF9jb3B5IHdoaWxlIGd1ZXN0IFRYIHVzZXMgZ3JhbnRfbWFwIHRvIGRvIHplcm8t Y29weSB0cmFuc21pdC4KPj4gPj4gPiA+ID4gPgo+PiA+PiA+ID4gPiA+IEFzIGZhciBhcyBJIGtu b3csIHRoZXJlIGFyZSB0aHJlZSBtYWluIGdyYW50LXJlbGF0ZWQgCj4+ID4+ID4gPiA+ID4gb3Bl cmF0aW9ucyB1c2VkIGluIHNwbGl0Cj4+ID4+ID4gPiA+IGRldmljZSBtb2RlbDogZ3JhbnQgbWFw cGluZywgZ3JhbnQgdHJhbnNmZXIgYW5kIGdyYW50IGNvcHkuCj4+ID4+ID4gPiA+ID4gR3JhbnQg dHJhbnNmZXIgaGFzIG5vdCB1c2VkIG5vdywgYW5kIGdyYW50IG1hcHBpbmcgYW5kIGdyYW50IAo+ PiA+PiA+ID4gPiA+IHRyYW5zZmVyIGJvdGgKPj4gPj4gPiA+ID4gaW52b2x2ZSAiVExCIiByZWZy ZXNoIHdvcmsgZm9yIGh5cGVydmlzb3IsIGFtIEkgcmlnaHQ/ICBPciBvbmx5IAo+PiA+PiA+ID4g PiBncmFudCB0cmFuc2ZlciBoYXMgdGhpcyBvdmVyaGVhZD8KPj4gPj4gPiA+ID4KPj4gPj4gPiA+ ID4gVHJhbnNmZXIgaXMgbm90IHVzZWQgc28gSSBjYW4ndCB0ZWxsLiBHcmFudCB1bm1hcCBjYXVz ZXMgVExCIGZsdXNoLgo+PiA+PiA+ID4gPgo+PiA+PiA+ID4gPiBJIHNhdyBpbiBhbiBlbWFpbCB0 aGUgb3RoZXIgZGF5IFhlblNlcnZlciBmb2xrcyBoYXMgc29tZSBwbGFubmVkIAo+PiA+PiA+ID4g PiBpbXByb3ZlbWVudCB0byBhdm9pZCBUTEIgZmx1c2ggaW4gWGVuIHRvIHVwc3RyZWFtIGluIDQu NiB3aW5kb3cuIAo+PiA+PiA+ID4gPiBJIGNhbid0IHNwZWFrIGZvciBzdXJlIGl0IHdpbGwgZ2V0 IHVwc3RyZWFtZWQgYXMgSSBkb24ndCB3b3JrIG9uIHRoYXQuCj4+ID4+ID4gPiA+Cj4+ID4+ID4g PiA+ID4gRG9lcyBncmFudCBjb3B5IHN1cmVseSBoYXMgbW9yZSBvdmVyaGVhZCB0aGFuIGdyYW50 IG1hcHBpbmc/Cj4+ID4+ID4gPiA+ID4KPj4gPj4gPiA+ID4KPj4gPj4gPiA+ID4gQXQgdGhlIHZl cnkgbGVhc3QgdGhlIHplcm8tY29weSBUWCBwYXRoIGlzIGZhc3RlciB0aGFuIHByZXZpb3VzIGNv cHlpbmcgcGF0aC4KPj4gPj4gPiA+ID4KPj4gPj4gPiA+ID4gQnV0IHNwZWFraW5nIG9mIHRoZSBt aWNybyBvcGVyYXRpb24gSSdtIG5vdCBzdXJlLgo+PiA+PiA+ID4gPgo+PiA+PiA+ID4gPiBUaGVy ZSB3YXMgb25jZSBwZXJzaXN0ZW50IG1hcCBwcm90b3R5cGUgbmV0YmFjayAvIG5ldGZyb250IHRo YXQgCj4+ID4+ID4gPiA+IGVzdGFibGlzaGVzIGEgbWVtb3J5IHBvb2wgYmV0d2VlbiBGRSBhbmQg 
QkUgdGhlbiB1c2UgbWVtY3B5IHRvIAo+PiA+PiA+ID4gPiBjb3B5IGRhdGEuIFVuZm9ydHVuYXRl bHkgdGhhdCBwcm90b3R5cGUgd2FzIG5vdCBkb25lIHJpZ2h0IHNvIAo+PiA+PiA+ID4gPiB0aGUg cmVzdWx0IHdhcyBub3QKPj4gPj4gPiBnb29kLgo+PiA+PiA+ID4KPj4gPj4gPiA+IFRoZSBuZXdl c3QgbWFpbCBhYm91dCBwZXJzaXN0ZW50IGdyYW50IEkgY2FuIGZpbmQgaXMgc2VudCBmcm9tIDE2 IAo+PiA+PiA+ID4gTm92Cj4+ID4+ID4gPiAyMDEyCj4+ID4+ID4gPiAoaHR0cDovL2xpc3RzLnhl bi5vcmcvYXJjaGl2ZXMvaHRtbC94ZW4tZGV2ZWwvMjAxMi0xMS9tc2cwMDgzMi5odG1sKS4KPj4g Pj4gPiA+IFdoeSBpcyBpdCBub3QgZG9uZSByaWdodCBhbmQgbm90IG1lcmdlZCBpbnRvIHVwc3Ry ZWFtPwo+PiA+PiA+IAo+PiA+PiA+IEFGQUlDVCB0aGVyZSdzIG9uZSBtb3JlIG1lbWNweSB0aGFu IG5lY2Vzc2FyeSwgaS5lLiBmcm9udGVuZCBtZW1jcHkgCj4+ID4+ID4gZGF0YSBpbnRvIHRoZSBw b29sIHRoZW4gYmFja2VuZCBtZW1jcHkgZGF0YSBvdXQgb2YgdGhlIHBvb2wsIHdoZW4gCj4+ID4+ ID4gYmFja2VuZCBzaG91bGQgYmUgYWJsZSB0byB1c2UgdGhlIHBhZ2UgaW4gcG9vbCBkaXJlY3Rs eS4KPj4gPj4gCj4+ID4+IE1lbWNweSBzaG91bGQgY2hlYXBlciB0aGFuIGdyYW50X2NvcHkgYmVj YXVzZSB0aGUgZm9ybWVyIG5lZWRzIG5vdCB0aGUgCj4+ID4+ICJoeXBlcmNhbGwiIHdoaWNoIHdp bGwgY2F1c2UgIlZNIEV4aXQiIHRvICJYRU4gSHlwZXJ2aXNvciIsIGFtIEkgCj4+ID4+IHJpZ2h0 PyBGb3IgUlggcGF0aCwgdXNpbmcgbWVtY3B5IGJhc2VkIG9uIHBlcnNpc3RlbnQgZ3JhbnQgdGFi bGUgbWF5IAo+PiA+PiBoYXZlIGhpZ2hlciBwZXJmb3JtYW5jZSB0aGFuIHVzaW5nIGdyYW50IGNv cHkgbm93Lgo+PiA+Cj4+ID5JbiB0aGVvcnkgeWVzLiBVbmZvcnR1bmF0ZWx5IG5vYm9keSBoYXMg YmVuY2htYXJrZWQgdGhhdCBwcm9wZXJseS4KPgo+PiBJIGhhdmUgc29tZSB0ZXN0aW5nIGZvciBS WCBwZXJmb3JtYW5jZSB1c2luZyBwZXJzaXN0ZW50IGdyYW50IG1ldGhvZAo+PiBhbmQgdXBzdHJl YW0gbWV0aG9kICgzLjE3LjQgYnJhbmNoKSwgdGhlIHJlc3VsdHMgc2hvdyB0aGF0IHBlcnNpc3Rl bnQKPj4gZ3JhbnQgbWV0aG9kIGRvZXMgaGF2ZSBoaWdoZXIgcGVyZm9ybWFuY2UgdGhhbiB1cHN0 cmVhbSBtZXRob2QgKGZyb20KPj4gMy41R2JwcyB0byBhYm91dCA2R2JwcykuIEFuZCBJIGZpbmQg dGhhdCBwZXJzaXN0ZW50IGdyYW50IG1lY2hhbmlzbQo+PiBoYXMgYWxyZWFkeSB1c2VkIGluIGJs a2Zyb25nL2Jsa2JhY2ssIEkgYW0gd29uZGVyaW5nIHdoeSB0aGVyZSBhcmUgbm8KPj4gZWZmb3J0 cyB0byByZXBsYWNlIHRoZSBncmFudCBjb3B5IGJ5IHBlcnNpc3RlbnQgZ3JhbnQgbm93LCBhdCBs 
ZWFzdCBpbgo+PiBSWCBwYXRoLiBBcmUgdGhlcmUgb3RoZXIgZGlzYWR2YW50YWdlcyBpbiBwZXJz aXN0ZW50IGdyYW50IG1ldGhvZAo+PiB3aGljaCBzdG9wIHdlIHVzZSBpdD8gCj4+IAo+Cj5JJ3Zl IHNlZW4gbnVtYmVycyBiZXR0ZXIgdGhhbiA2R2Jwcy4gU2VlIHVwc3RyZWFtIGNoYW5nZXNldAo+ MTY1MGQ1NDU1YmQyZGM2YjVlZTEzNGJkNmZjMWEzMjM2YzI2NmI1Yi4KVGhhbmtzLCBXZWkuIFRo ZSB0aHJvdWdob3V0IEkgbWVudGlvbmVkICgzLjVHYnBzIGFuZCA2R2JwcykgaXMgZm9yIFVEUCAx NDAwIGJ5dGVzIHBhY2tldCwgSSB0aGluayB0aGUgcmVzdWx0IGJhc2VkIG9uIDE2NTBkNTQ1NWJk MmRjNmI1ZWUxMzRiZDZmYzFhMzIzNmMyNjZiNWIgaXMgZm9yIFRDUC4gCgo+UGVyc2lzdGVudCBn cmFudCBpcyBub3Qgc2lsdmVyIGJ1bGxldC4gVGhlcmUgaXMgZW1haWwgdGhyZWFkIG9uIHRoZQo+ bGlzdCBkaXNjdXNzaW5nIHdoZXRoZXIgaXQgc2hvdWxkIGJlIHJlbW92ZWQgaW4gYmxvY2sgZHJp dmVyLgoKSSBoYXZlIHRyaWVkIHRvIGxvb2sgZm9yIHRoZSB0aHJlYWQgYnV0IG5vIGRldGFpbGVk IGluZm8uIENvdWxkIHlvdSBnaXZlIG1lIHNvbWUga2V5d29yZCB0byBmaW5kIHRoZSB0aHJlYWQs IHRoYW5rcy4KCgo+WGVuU2VydmVyIGZvbGtzIGhhdmUgYmVlbiB3b3JraW5nIG9uIGltcHJvdmlu ZyBuZXR3b3JrIHBlcmZvcm1hbmNlLiBJdCdzCj5teSB1bmRlcnN0YW5kaW5nIHRoYXQgdGhleSBj aG9vc2UgZGlmZmVyZW50IHJvdXRlcyB0aGFuIHBlcnNpc3RlbnQKPmdyYW50LiBEYXZpZCBtaWdo dCBoYXZlIG1vcmUgaW5zaWdodC4KCgo+V2VpLgo+Cj4+IFBTLiBJIHVzZWQgcGt0LWdlbiB0byBz ZW5kIHBhY2tldCBmcm9tIGRvbTAgdG8gYSBkb21VIHJ1bm5pbmcgb24KPj4gYW5vdGhlciBkb20w LCB0aGUgQ1BVcyBvZiBib3RoIGRvbTAgaXMgSW50ZWwgRTU2NDAgMi40R0h6LCBhbmQgdGhlIHR3 bwo+PiBkb20wcyBpcyBjb25uZWN0ZWQgd2l0aCBhIDEwR0UgTklDLgo+PiAKPgo+X19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KPlhlbi1kZXZlbCBtYWlsaW5n IGxpc3QKPlhlbi1kZXZlbEBsaXN0cy54ZW4ub3JnCj5odHRwOi8vbGlzdHMueGVuLm9yZy94ZW4t ZGV2ZWwK ------=_Part_274106_1384485211.1425091502813 Content-Type: text/html; charset=GBK Content-Transfer-Encoding: base64 PGRpdiBzdHlsZT0ibGluZS1oZWlnaHQ6MS43O2NvbG9yOiMwMDAwMDA7Zm9udC1zaXplOjE0cHg7 Zm9udC1mYW1pbHk6QXJpYWwiPjxwcmU+QXQgMjAxNS0wMi0yNyAxODo1OTo1MiwgIldlaSBMaXUi ICZsdDt3ZWkubGl1MkBjaXRyaXguY29tJmd0OyB3cm90ZToKJmd0O0NjJ2luZyBEYXZpZCAoWGVu U2VydmVyIGtlcm5lbCBtYWludGFpbmVyKQomZ3Q7CiZndDtPbiBGcmksIEZlYiAyNywgMjAxNSBh 
dCAwNToyMToxMVBNICswODAwLCBvcGVubHVpIHdyb3RlOgomZ3Q7Jmd0OyAmZ3Q7T24gTW9uLCBE ZWMgMDgsIDIwMTQgYXQgMDE6MDg6MThQTSArMDAwMCwgWmhhbmdsZWlxaWFuZyAoVHJ1bXApIHdy b3RlOgomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7IE9uIE1vbiwgRGVjIDA4LCAyMDE0IGF0IDA2OjQ0 OjI2QU0gKzAwMDAsIFpoYW5nbGVpcWlhbmcgKFRydW1wKSB3cm90ZToKJmd0OyZndDsgJmd0OyZn dDsgJmd0OyAmZ3Q7ICZndDsgT24gRnJpLCBEZWMgMDUsIDIwMTQgYXQgMDE6MTc6MTZBTSArMDAw MCwgWmhhbmdsZWlxaWFuZyAoVHJ1bXApIHdyb3RlOgomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZn dDsgJmd0OyBbLi4uXQomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsg SSB0aGluayB0aGF0J3MgZXhwZWN0ZWQsIGJlY2F1c2UgZ3Vlc3QgUlggZGF0YSBwYXRoIHN0aWxs IAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsgdXNlcyBncmFudF9j b3B5IHdoaWxlIGd1ZXN0IFRYIHVzZXMgZ3JhbnRfbWFwIHRvIGRvIHplcm8tY29weSB0cmFuc21p dC4KJmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgJmd0OwomZ3Q7Jmd0OyAmZ3Q7Jmd0 OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IEFzIGZhciBhcyBJIGtub3csIHRoZXJlIGFyZSB0aHJlZSBt YWluIGdyYW50LXJlbGF0ZWQgCiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsg b3BlcmF0aW9ucyB1c2VkIGluIHNwbGl0CiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7 IGRldmljZSBtb2RlbDogZ3JhbnQgbWFwcGluZywgZ3JhbnQgdHJhbnNmZXIgYW5kIGdyYW50IGNv cHkuCiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsgR3JhbnQgdHJhbnNmZXIg aGFzIG5vdCB1c2VkIG5vdywgYW5kIGdyYW50IG1hcHBpbmcgYW5kIGdyYW50IAomZ3Q7Jmd0OyAm Z3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyAmZ3Q7IHRyYW5zZmVyIGJvdGgKJmd0OyZndDsgJmd0OyZn dDsgJmd0OyAmZ3Q7ICZndDsgaW52b2x2ZSAiVExCIiByZWZyZXNoIHdvcmsgZm9yIGh5cGVydmlz b3IsIGFtIEkgcmlnaHQ/ICBPciBvbmx5IAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0 OyBncmFudCB0cmFuc2ZlciBoYXMgdGhpcyBvdmVyaGVhZD8KJmd0OyZndDsgJmd0OyZndDsgJmd0 OyAmZ3Q7ICZndDsKJmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgVHJhbnNmZXIgaXMg bm90IHVzZWQgc28gSSBjYW4ndCB0ZWxsLiBHcmFudCB1bm1hcCBjYXVzZXMgVExCIGZsdXNoLgom Z3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OwomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZn dDsgJmd0OyBJIHNhdyBpbiBhbiBlbWFpbCB0aGUgb3RoZXIgZGF5IFhlblNlcnZlciBmb2xrcyBo 
YXMgc29tZSBwbGFubmVkIAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyBpbXByb3Zl bWVudCB0byBhdm9pZCBUTEIgZmx1c2ggaW4gWGVuIHRvIHVwc3RyZWFtIGluIDQuNiB3aW5kb3cu IAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyBJIGNhbid0IHNwZWFrIGZvciBzdXJl IGl0IHdpbGwgZ2V0IHVwc3RyZWFtZWQgYXMgSSBkb24ndCB3b3JrIG9uIHRoYXQuCiZndDsmZ3Q7 ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7 ICZndDsgRG9lcyBncmFudCBjb3B5IHN1cmVseSBoYXMgbW9yZSBvdmVyaGVhZCB0aGFuIGdyYW50 IG1hcHBpbmc/CiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0OyZndDsg Jmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsg QXQgdGhlIHZlcnkgbGVhc3QgdGhlIHplcm8tY29weSBUWCBwYXRoIGlzIGZhc3RlciB0aGFuIHBy ZXZpb3VzIGNvcHlpbmcgcGF0aC4KJmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsKJmd0 OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7ICZndDsgQnV0IHNwZWFraW5nIG9mIHRoZSBtaWNybyBv cGVyYXRpb24gSSdtIG5vdCBzdXJlLgomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0Owom Z3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyBUaGVyZSB3YXMgb25jZSBwZXJzaXN0ZW50 IG1hcCBwcm90b3R5cGUgbmV0YmFjayAvIG5ldGZyb250IHRoYXQgCiZndDsmZ3Q7ICZndDsmZ3Q7 ICZndDsgJmd0OyAmZ3Q7IGVzdGFibGlzaGVzIGEgbWVtb3J5IHBvb2wgYmV0d2VlbiBGRSBhbmQg QkUgdGhlbiB1c2UgbWVtY3B5IHRvIAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyBj b3B5IGRhdGEuIFVuZm9ydHVuYXRlbHkgdGhhdCBwcm90b3R5cGUgd2FzIG5vdCBkb25lIHJpZ2h0 IHNvIAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsgJmd0OyB0aGUgcmVzdWx0IHdhcyBub3QK Jmd0OyZndDsgJmd0OyZndDsgJmd0OyBnb29kLgomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7ICZndDsK Jmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7IFRoZSBuZXdlc3QgbWFpbCBhYm91dCBwZXJzaXN0 ZW50IGdyYW50IEkgY2FuIGZpbmQgaXMgc2VudCBmcm9tIDE2IAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAm Z3Q7ICZndDsgTm92CiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgJmd0OyAyMDEyCiZndDsmZ3Q7ICZn dDsmZ3Q7ICZndDsgJmd0OyAoaHR0cDovL2xpc3RzLnhlbi5vcmcvYXJjaGl2ZXMvaHRtbC94ZW4t ZGV2ZWwvMjAxMi0xMS9tc2cwMDgzMi5odG1sKS4KJmd0OyZndDsgJmd0OyZndDsgJmd0OyAmZ3Q7 IFdoeSBpcyBpdCBub3QgZG9uZSByaWdodCBhbmQgbm90IG1lcmdlZCBpbnRvIHVwc3RyZWFtPwom 
Z3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7IAomZ3Q7Jmd0OyAmZ3Q7Jmd0OyAmZ3Q7IEFGQUlDVCB0aGVy ZSdzIG9uZSBtb3JlIG1lbWNweSB0aGFuIG5lY2Vzc2FyeSwgaS5lLiBmcm9udGVuZCBtZW1jcHkg CiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsgZGF0YSBpbnRvIHRoZSBwb29sIHRoZW4gYmFja2VuZCBt ZW1jcHkgZGF0YSBvdXQgb2YgdGhlIHBvb2wsIHdoZW4gCiZndDsmZ3Q7ICZndDsmZ3Q7ICZndDsg YmFja2VuZCBzaG91bGQgYmUgYWJsZSB0byB1c2UgdGhlIHBhZ2UgaW4gcG9vbCBkaXJlY3RseS4K Jmd0OyZndDsgJmd0OyZndDsgCiZndDsmZ3Q7ICZndDsmZ3Q7IE1lbWNweSBzaG91bGQgY2hlYXBl ciB0aGFuIGdyYW50X2NvcHkgYmVjYXVzZSB0aGUgZm9ybWVyIG5lZWRzIG5vdCB0aGUgCiZndDsm Z3Q7ICZndDsmZ3Q7ICJoeXBlcmNhbGwiIHdoaWNoIHdpbGwgY2F1c2UgIlZNIEV4aXQiIHRvICJY RU4gSHlwZXJ2aXNvciIsIGFtIEkgCiZndDsmZ3Q7ICZndDsmZ3Q7IHJpZ2h0PyBGb3IgUlggcGF0 aCwgdXNpbmcgbWVtY3B5IGJhc2VkIG9uIHBlcnNpc3RlbnQgZ3JhbnQgdGFibGUgbWF5IAomZ3Q7 Jmd0OyAmZ3Q7Jmd0OyBoYXZlIGhpZ2hlciBwZXJmb3JtYW5jZSB0aGFuIHVzaW5nIGdyYW50IGNv cHkgbm93LgomZ3Q7Jmd0OyAmZ3Q7CiZndDsmZ3Q7ICZndDtJbiB0aGVvcnkgeWVzLiBVbmZvcnR1 bmF0ZWx5IG5vYm9keSBoYXMgYmVuY2htYXJrZWQgdGhhdCBwcm9wZXJseS4KJmd0OwomZ3Q7Jmd0 OyBJIGhhdmUgc29tZSB0ZXN0aW5nIGZvciBSWCBwZXJmb3JtYW5jZSB1c2luZyBwZXJzaXN0ZW50 IGdyYW50IG1ldGhvZAomZ3Q7Jmd0OyBhbmQgdXBzdHJlYW0gbWV0aG9kICgzLjE3LjQgYnJhbmNo KSwgdGhlIHJlc3VsdHMgc2hvdyB0aGF0IHBlcnNpc3RlbnQKJmd0OyZndDsgZ3JhbnQgbWV0aG9k IGRvZXMgaGF2ZSBoaWdoZXIgcGVyZm9ybWFuY2UgdGhhbiB1cHN0cmVhbSBtZXRob2QgKGZyb20K Jmd0OyZndDsgMy41R2JwcyB0byBhYm91dCA2R2JwcykuIEFuZCBJIGZpbmQgdGhhdCBwZXJzaXN0 ZW50IGdyYW50IG1lY2hhbmlzbQomZ3Q7Jmd0OyBoYXMgYWxyZWFkeSB1c2VkIGluIGJsa2Zyb25n L2Jsa2JhY2ssIEkgYW0gd29uZGVyaW5nIHdoeSB0aGVyZSBhcmUgbm8KJmd0OyZndDsgZWZmb3J0 cyB0byByZXBsYWNlIHRoZSBncmFudCBjb3B5IGJ5IHBlcnNpc3RlbnQgZ3JhbnQgbm93LCBhdCBs ZWFzdCBpbgomZ3Q7Jmd0OyBSWCBwYXRoLiBBcmUgdGhlcmUgb3RoZXIgZGlzYWR2YW50YWdlcyBp biBwZXJzaXN0ZW50IGdyYW50IG1ldGhvZAomZ3Q7Jmd0OyB3aGljaCBzdG9wIHdlIHVzZSBpdD8g CiZndDsmZ3Q7IAomZ3Q7CiZndDtJJ3ZlIHNlZW4gbnVtYmVycyBiZXR0ZXIgdGhhbiA2R2Jwcy4g U2VlIHVwc3RyZWFtIGNoYW5nZXNldAomZ3Q7MTY1MGQ1NDU1YmQyZGM2YjVlZTEzNGJkNmZjMWEz 
MjM2YzI2NmI1Yi4KPGRpdj4KPC9kaXY+PGRpdj5UaGFua3MsIFdlaS4mbmJzcDs8L2Rpdj48ZGl2 PjxzcGFuIHN0eWxlPSJsaW5lLWhlaWdodDogMS43OyI+VGhlIHRocm91Z2hvdXQgSSBtZW50aW9u ZWQgKDMuNUdicHMgYW5kIDZHYnBzKSBpcyBmb3IgVURQIDE0MDAgYnl0ZXMgcGFja2V0LCBJIHRo aW5rIHRoZSByZXN1bHQgYmFzZWQgb24gPC9zcGFuPjxzcGFuIHN0eWxlPSJsaW5lLWhlaWdodDog MS43OyI+MTY1MGQ1NDU1YmQyZGM2YjVlZTEzNGJkNmZjMWEzMjM2YzI2NmI1YiBpcyBmb3IgVENQ LiA8L3NwYW4+PC9kaXY+PGRpdj48YnI+PC9kaXY+Jmd0O1BlcnNpc3RlbnQgZ3JhbnQgaXMgbm90 IHNpbHZlciBidWxsZXQuIFRoZXJlIGlzIGVtYWlsIHRocmVhZCBvbiB0aGUKJmd0O2xpc3QgZGlz Y3Vzc2luZyB3aGV0aGVyIGl0IHNob3VsZCBiZSByZW1vdmVkIGluIGJsb2NrIGRyaXZlci4KPGRp dj4KPC9kaXY+PGRpdj5JIGhhdmUgdHJpZWQgdG8gbG9vayBmb3IgdGhlIHRocmVhZCBidXQgbm8g ZGV0YWlsZWQgaW5mby4gQ291bGQgeW91IGdpdmUgbWUgc29tZSBrZXl3b3JkIHRvIGZpbmQgdGhl IHRocmVhZCwgdGhhbmtzLjwvZGl2PjxkaXY+PGJyPjwvZGl2PiZndDtYZW5TZXJ2ZXIgZm9sa3Mg aGF2ZSBiZWVuIHdvcmtpbmcgb24gaW1wcm92aW5nIG5ldHdvcmsgcGVyZm9ybWFuY2UuIEl0J3MK Jmd0O215IHVuZGVyc3RhbmRpbmcgdGhhdCB0aGV5IGNob29zZSBkaWZmZXJlbnQgcm91dGVzIHRo YW4gcGVyc2lzdGVudAomZ3Q7Z3JhbnQuIERhdmlkIG1pZ2h0IGhhdmUgbW9yZSBpbnNpZ2h0Ljxk aXY+PGJyPjwvZGl2PiZndDtXZWkuCiZndDsKJmd0OyZndDsgUFMuIEkgdXNlZCBwa3QtZ2VuIHRv IHNlbmQgcGFja2V0IGZyb20gZG9tMCB0byBhIGRvbVUgcnVubmluZyBvbgomZ3Q7Jmd0OyBhbm90 aGVyIGRvbTAsIHRoZSBDUFVzIG9mIGJvdGggZG9tMCBpcyBJbnRlbCBFNTY0MCAyLjRHSHosIGFu ZCB0aGUgdHdvCiZndDsmZ3Q7IGRvbTBzIGlzIGNvbm5lY3RlZCB3aXRoIGEgMTBHRSBOSUMuCiZn dDsmZ3Q7IAomZ3Q7CiZndDtfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fXwomZ3Q7WGVuLWRldmVsIG1haWxpbmcgbGlzdAomZ3Q7WGVuLWRldmVsQGxpc3RzLnhl bi5vcmcKJmd0O2h0dHA6Ly9saXN0cy54ZW4ub3JnL3hlbi1kZXZlbAo8L3ByZT48L2Rpdj48YnI+ PGJyPjxzcGFuIHRpdGxlPSJuZXRlYXNlZm9vdGVyIj48c3BhbiBpZD0ibmV0ZWFzZV9tYWlsX2Zv b3RlciI+PC9zcGFuPjwvc3Bhbj4= ------=_Part_274106_1384485211.1425091502813-- --===============4542424983644477469== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing 
list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============4542424983644477469==--

From mboxrd@z Thu Jan 1 00:00:00 1970 From: openlui Subject: Re: Poor network performance between DomU with multiqueue support Date: Sat, 28 Feb 2015 11:21:43 +0800 (CST) Message-ID: <62aa7391.1247e.14bce3436fe.Coremail.openlui@126.com> References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com> <20150227105951.GB29195@zion.uk.xensource.com> <54F0554C.6080608@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8845998497425409686==" Return-path: In-Reply-To: <54F0554C.6080608@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: David Vrabel Cc: "Zhangleiqiang (Trump)" , Wei Liu , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org
--===============8845998497425409686== Content-Type: multipart/alternative; boundary="----=_Part_287764_1720644318.1425093703421"
------=_Part_287764_1720644318.1425093703421 Content-Type: text/plain; charset=GBK Content-Transfer-Encoding: 7bit

At 2015-02-27 19:30:20, "David Vrabel" <david.vrabel@citrix.com> wrote:
>On 27/02/15 10:59, Wei Liu wrote:
>>
>> Persistent grant is not silver bullet. There is email thread on the
>> list discussing whether it should be removed in block driver.
>
>Persistent grants for to-guest network traffic is a flawed idea.  It
>either requires:
>
>a) the backend to memcpy into the mapped grant /and/ the frontend to
>memcpy out of the persistently mapped pool.  This is clearly going to be
>worse for memory bandwidth than a single grant copy.

Yes, the persistent grant method does use more of the DomU's CPU than the grant copy method.

However, although the persistent way has one more memcpy operation than grant copy, it has two fewer "mmap" operations and no hypercall. I have examined the code for grant copy: it needs to "mmap" the memory from the src and dest domains into the hypervisor, then "memcpy" the data from src to dest. There will be more CPU used by the hypervisor instead of the DomU.

>or
>
>b) the backend to accumulate more and more mappings of guest memory,
>which is bad for security and it uses too many grant and map track
>resources hence it does not scale to many VIFs.

I find that the persistent grant patch has an upper limit on the amount of guest memory that can be mapped by each queue of a VIF. The limit seems to be the VIF's ring size, if I understand correctly, so the amount does not seem high.
Under my benchmark, at least for a single UDP flow, the persistent grant way has higher throughput than the grant copy way.

>David

------=_Part_287764_1720644318.1425093703421--
--===============8845998497425409686==--

From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Liu Subject: Re: Poor network performance between DomU with multiqueue support Date: Tue, 3 Mar 2015 10:40:17 +0000 Message-ID: <20150303104017.GW11855@zion.uk.xensource.com> References: <3A6795EA1206904E94BEC8EF9DF109AE239B35A9@SZXEMA512-MBX.china.huawei.com> <2abdeb39.adb2.14bca56f4d4.Coremail.openlui@126.com> <20150227105951.GB29195@zion.uk.xensource.com> <2f994425.11685.14bce12a2dd.Coremail.openlui@126.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <2f994425.11685.14bce12a2dd.Coremail.openlui@126.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: openlui Cc: "Zhangleiqiang (Trump)" , Wei Liu , David Vrabel , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org

On Sat, Feb 28, 2015 at 10:45:02AM +0800, openlui wrote:
>
>
> >Persistent grant is not silver bullet. There is email thread on the
> >list discussing whether it should be removed in block driver.
>
> I have tried to look for the thread but no detailed info.
Could you give me some keyword to find the thread, thanks.
>
>

Message id <1423988345-4005-5-git-send-email-bob.liu@oracle.com>