From mboxrd@z Thu Jan  1 00:00:00 1970
From: Shirley Ma <mashirle@us.ibm.com>
Subject: Re: Network performance with small packets
Date: Thu, 27 Jan 2011 10:44:34 -0800
Message-ID: <1296153874.1640.27.camel@localhost.localdomain>
References: <OFD293DCD2.7F0260F0-ON86257823.0061DC39-86257823.00743BB3@us.ibm.com>
	 <20110126151700.GA14113@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Steve Dobbelstein <steved@us.ibm.com>, kvm@vger.kernel.org
To: "Michael S. Tsirkin" <mst@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from e37.co.us.ibm.com ([32.97.110.158]:53551 "EHLO
	e37.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752564Ab1A0Sol (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 27 Jan 2011 13:44:41 -0500
Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228])
	by e37.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p0RIg9K8006224
	for <kvm@vger.kernel.org>; Thu, 27 Jan 2011 11:42:09 -0700
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167])
	by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p0RIicb5107766
	for <kvm@vger.kernel.org>; Thu, 27 Jan 2011 11:44:38 -0700
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1])
	by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p0RIibtt003697
	for <kvm@vger.kernel.org>; Thu, 27 Jan 2011 11:44:38 -0700
In-Reply-To: <20110126151700.GA14113@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Wed, 2011-01-26 at 17:17 +0200, Michael S. Tsirkin wrote:
> I am seeing a similar problem, and am trying to fix that.
> My current theory is that this is a variant of a receive livelock:
> if the application isn't fast enough to process
> incoming data, the guest net stack switches
> from prequeue to backlog handling.
> 
> One thing I noticed is that locking the vhost thread
> and the vcpu to the same physical CPU almost doubles the
> bandwidth.  Can you confirm that in your setup?
> 
> My current guess is that when we lock both to
> a single CPU, netperf in guest gets scheduled
> slowing down the vhost thread in the host.
> 
> I also noticed that this specific workload
> performs better with vhost off: presumably
> we are loading the guest less.

I found a similar issue in the small message size TCP_STREAM test when
the guest is the TX side. When I slow down TX, the bandwidth doubles for
1K to 4K message sizes.

Shirley