From: Maxime Coquelin
Subject: Re: [PATCH 1/2] net/virtio: fix performance regression due to TSO enabling
Date: Wed, 11 Jan 2017 08:59:28 +0100
References: <1484108832-19907-1-git-send-email-yuanhan.liu@linux.intel.com> <1484108832-19907-2-git-send-email-yuanhan.liu@linux.intel.com>
In-Reply-To: <1484108832-19907-2-git-send-email-yuanhan.liu@linux.intel.com>
To: Yuanhan Liu, dev@dpdk.org
Cc: Tan Jianfeng, Wang Zhihong, Olivier Matz, "Michael S. Tsirkin", stable@dpdk.org
List-Id: DPDK patches and discussions

On 01/11/2017 05:27 AM, Yuanhan Liu wrote:
> TSO is now enabled, but it's not actually being used by default in a
> simple L2 forward mode. In such case, we have to zero the virtio net
> headers, to inform the vhost backend that no offload is being used:
>
>     hdr->csum_start = 0;
>     hdr->csum_offset = 0;
>     hdr->flags = 0;
>
>     hdr->gso_type = 0;
>     hdr->gso_size = 0;
>     hdr->hdr_len = 0;
>
> Such writes could be very costly; it introduces severe cache issues:
> The above operations introduce cache write for each packet, which
> stalls the read operation from the vhost backend.
>
> The fact that virtio net header is initiated to zero in PMD driver
> init stage means that these costly writes are unnecessary and could
> be avoided:
>
>     if (hdr->csum_start != 0)
>         hdr->csum_start = 0;
>
> And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
> performance drop introduced by TSO enabling is recovered: it could
> be up to 20% in micro benchmarking.

Very nice!

>
> Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload")
> Fixes: 696573046e9e ("net/virtio: support TSO")
>
> Cc: Olivier Matz
> Cc: Maxime Coquelin
> Cc: Michael S. Tsirkin
> Cc: stable@dpdk.org
> Signed-off-by: Yuanhan Liu
> ---
>  drivers/net/virtio/virtio_rxtx.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 1e5a6b9..8ec2f1a 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -258,6 +258,12 @@
>  		vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO6);
>  }
>
> +/* avoid write operation when necessary, to lessen cache issues */
> +#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
> +	if ((var) != (val))			\
> +		(var) = (val);			\
> +} while (0)

As it is intended to go in -stable, I think this is fine to have it
only in the driver, but for v17.02, maybe we should have another patch
on top that declares it somewhere so that other libs and drivers can
make use of it?

> +
>  static inline void
>  virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
>  		       uint16_t needed, int use_indirect, int can_push)
> @@ -337,9 +343,9 @@
>  			break;
>
>  		default:
> -			hdr->csum_start = 0;
> -			hdr->csum_offset = 0;
> -			hdr->flags = 0;
> +			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
>  			break;
>  		}
>
> @@ -355,9 +361,9 @@
>  				cookie->l3_len +
>  				cookie->l4_len;
>  		} else {
> -			hdr->gso_type = 0;
> -			hdr->gso_size = 0;
> -			hdr->hdr_len = 0;
> +			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
>  		}
>  	}
>
>

Reviewed-by: Maxime Coquelin

Thanks!
Maxime
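
PS: For reference, a minimal standalone sketch of the pattern discussed
in this thread, in case it helps when thinking about a shared location
for the macro. The macro body is the one from the patch. Everything
else is an assumption made for illustration: struct demo_net_hdr is a
hypothetical stand-in for the real virtio net header, and
demo_hdr_clear_offloads() is a hypothetical helper, not an existing
DPDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Macro as in the patch: only store when the value would change, so a
 * header that is already zero is read but never written. */
#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
	if ((var) != (val))			\
		(var) = (val);			\
} while (0)

/* Hypothetical stand-in for the virtio net header (illustration only). */
struct demo_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Hypothetical helper: mark "no offload in use" for one packet.
 * Fields that are already zero are left untouched. */
static inline void
demo_hdr_clear_offloads(struct demo_net_hdr *hdr)
{
	assert(hdr != NULL);

	ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
	ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
	ASSIGN_UNLESS_EQUAL(hdr->flags, 0);

	ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
	ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
	ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
}
```

The end state is the same as with the plain assignments. The only
difference is that, in the common case where the header is still zero
from init time, the Tx path performs loads and compares instead of
stores.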