xen-devel.lists.xenproject.org archive mirror
* [Pv-ops][PATCH 0/4 v2] Netback multiple threads support
@ 2010-04-29 14:27 Xu, Dongxiao
  2010-05-03 16:02 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 3+ messages in thread
From: Xu, Dongxiao @ 2010-04-29 14:27 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com; +Cc: Jeremy Fitzhardinge, Steven Smith

The current netback uses a single pair of tasklets for Tx/Rx data transfer.
A netback tasklet can only run on one CPU at a time, yet it serves all the
netfronts, so it has become a performance bottleneck. This patchset replaces
the single tasklet pair in dom0 with multiple pairs.

Assuming that dom0 has CPUNR VCPUs, we define CPUNR pairs of tasklets
(CPUNR for Tx, and CPUNR for Rx). Each pair of tasklets serves a specific
group of netfronts. We also duplicate the global and static variables for
each group, in order to avoid spinlock contention between groups.
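The per-group state split can be sketched in plain C. This is a userspace illustration based on the description above, not the actual netback code; the least-loaded assignment policy and the field names are assumptions:

```c
#include <assert.h>

#define GROUP_NR 4  /* stand-in for CPUNR, the number of dom0 VCPUs */

/* Per-group state: each group owns its own Tx/Rx tasklets, pending
 * ring and counters, so groups never contend on a shared spinlock. */
struct xen_netbk {
    int netfront_count;  /* netfronts currently served by this group */
    /* ... per-group tasklets, pending ring, mmap pages, etc. ... */
};

static struct xen_netbk netbk[GROUP_NR];

/* Assign a new netfront to the least-loaded group (one plausible policy). */
static int netbk_pick_group(void)
{
    int i, best = 0;

    for (i = 1; i < GROUP_NR; i++)
        if (netbk[i].netfront_count < netbk[best].netfront_count)
            best = i;
    netbk[best].netfront_count++;
    return best;
}
```

Starting from an empty table this policy degenerates to round-robin: the first CPUNR netfronts land on groups 0..CPUNR-1 in order.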

PATCH 01: Generalize static/global variables into 'struct xen_netbk'.

PATCH 02: Introduce a new struct page_ext.

PATCH 03: Multiple tasklets support.

PATCH 04: Use kernel threads to replace the tasklets.
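The kernel-thread approach of PATCH 04 can be illustrated in userspace with pthreads; kthread_run()/wait_event() are emulated with a mutex and condition variable, and all names here are schematic rather than the patch's actual code:

```c
#include <pthread.h>
#include <stdbool.h>

/* One worker per group: it sleeps until kicked, then drains its work,
 * the way a netback kernel thread would replace a tasklet. */
struct netbk_worker {
    pthread_mutex_t lock;
    pthread_cond_t  kick;
    bool work_pending;
    bool stop;
    int  processed;  /* batches handled; stands in for Tx/Rx processing */
};

static void *netbk_thread(void *arg)
{
    struct netbk_worker *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->work_pending && !w->stop)
            pthread_cond_wait(&w->kick, &w->lock);  /* like wait_event() */
        if (w->work_pending) {
            w->work_pending = false;
            w->processed++;  /* do the actual Tx/Rx work here */
            continue;
        }
        break;  /* stop requested and no work left */
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Called where the tasklet version did tasklet_schedule(). */
static void netbk_kick(struct netbk_worker *w, bool stop)
{
    pthread_mutex_lock(&w->lock);
    if (stop)
        w->stop = true;
    else
        w->work_pending = true;
    pthread_cond_signal(&w->kick);
    pthread_mutex_unlock(&w->lock);
}
```

Unlike a tasklet, such a thread is scheduled by the normal scheduler, so its CPU placement and priority can be controlled per group.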

Recently I re-tested the patchset with an Intel 10G multi-queue NIC,
using 10 external 1G NICs to run netperf tests against that 10G NIC.

Case 1: Dom0 has more than 10 VCPUs, each pinned to a physical CPU.
With the patchset, throughput is 2x the original.

Case 2: Dom0 has 4 VCPUs pinned to 4 physical CPUs.
With the patchset, throughput is 3.7x the original.

While testing this patchset, we found that the domain_lock in the grant
table operation (gnttab_copy()) becomes a bottleneck. We temporarily
removed the global domain_lock to achieve good performance.
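The contention can be illustrated schematically: a single global lock serializes every gnttab_copy() regardless of which domain it targets, whereas a per-domain lock only serializes copies touching the same domain. This is a userspace sketch with pthread mutexes; the names and lock granularity are assumptions, not Xen's actual code:

```c
#include <pthread.h>

#define DOM_NR 8

/* Coarse: every grant copy takes this one lock, so all groups serialize. */
static pthread_mutex_t domain_lock = PTHREAD_MUTEX_INITIALIZER;

/* Finer: one lock per domain's grant table. */
static pthread_mutex_t gnttab_lock[DOM_NR];

static int copies_done[DOM_NR];

static void gnttab_locks_init(void)
{
    int i;

    for (i = 0; i < DOM_NR; i++)
        pthread_mutex_init(&gnttab_lock[i], NULL);
}

static void gnttab_copy_coarse(int dom)
{
    pthread_mutex_lock(&domain_lock);   /* global: the bottleneck */
    copies_done[dom]++;
    pthread_mutex_unlock(&domain_lock);
}

static void gnttab_copy_fine(int dom)
{
    pthread_mutex_lock(&gnttab_lock[dom]);  /* only same-domain copies contend */
    copies_done[dom]++;
    pthread_mutex_unlock(&gnttab_lock[dom]);
}
```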

Thanks,
Dongxiao

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Pv-ops][PATCH 0/4 v2] Netback multiple threads support
@ 2010-04-29 14:53 Xu, Dongxiao
  0 siblings, 0 replies; 3+ messages in thread
From: Xu, Dongxiao @ 2010-04-29 14:53 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com; +Cc: Jeremy Fitzhardinge, Steven Smith

Seems that this file is missing... Resend it.



* Re: [Pv-ops][PATCH 0/4 v2] Netback multiple threads support
  2010-04-29 14:27 [Pv-ops][PATCH 0/4 v2] Netback multiple threads support Xu, Dongxiao
@ 2010-05-03 16:02 ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 3+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-05-03 16:02 UTC (permalink / raw)
  To: Xu, Dongxiao
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Steven Smith

> While testing this patchset, we found that the domain_lock in the grant
> table operation (gnttab_copy()) becomes a bottleneck. We temporarily
> removed the global domain_lock to achieve good performance.

Without that change (global domain_lock in place), what are the
performance numbers?

Is there a forthcoming patch to make the gnttab_copy have a much finer
grained lock?


end of thread, other threads:[~2010-05-03 16:02 UTC | newest]

Thread overview: 3+ messages
2010-04-29 14:27 [Pv-ops][PATCH 0/4 v2] Netback multiple threads support Xu, Dongxiao
2010-05-03 16:02 ` Konrad Rzeszutek Wilk
  -- strict thread matches above, loose matches on Subject: below --
2010-04-29 14:53 Xu, Dongxiao
