From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vlad Zolotarov Subject: Re: [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598 Date: Tue, 25 Aug 2015 21:52:37 +0300 Message-ID: <55DCB975.2030000@cloudius-systems.com> References: <1439489195-31553-1-git-send-email-vladz@cloudius-systems.com> <55CD7EA5.6060100@cloudius-systems.com> <6A0DE07E22DDAD4C9103DF62FEBC0909D3E116@shsmsx102.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" To: "Zhang, Helin" Return-path: Received: from mail-wi0-f169.google.com (mail-wi0-f169.google.com [209.85.212.169]) by dpdk.org (Postfix) with ESMTP id 7DB91594B for ; Tue, 25 Aug 2015 20:52:39 +0200 (CEST) Received: by wicja10 with SMTP id ja10so23659442wic.1 for ; Tue, 25 Aug 2015 11:52:39 -0700 (PDT) In-Reply-To: List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 08/25/15 21:43, Zhang, Helin wrote: > > Hi Vlad > > I think this could possibly be the root cause of your TX hang issue.=20 > Please try to limit the number to 8 or less, and then see if the issue=20 > will still be there or not? > Helin, the issue has been seen on x540 devices. Pls., see a chapter=20 7.2.1.1 of x540 devices spec: A packet (or multiple packets in transmit segmentation) can span any numb= er of buffers (and their descriptors) up to a limit of 40 minus WTHRESH minus 2= (see Section 7.2.3.3 for Tx Ring details and section Section 7.2.3.5.1 for WTH= RESH details). For best performance it is recommended to minimize the number o= f buffers as possible. Could u, pls., clarify why do u think that the maximum number of data=20 buffers is limited by 8? thanks, vlad > It does not have any check for the number of descriptors to be used=20 > for a single packet, and it relies on the users to give correct mbuf=20 > chains. > > We may need a check of this somewhere. Of cause the point you=20 > indicated we also need to carefully investigate or fix. > > Regards, > > Helin > > *From:*Vladislav Zolotarov [mailto:vladz@cloudius-systems.com] > *Sent:* Tuesday, August 25, 2015 11:34 AM > *To:* Zhang, Helin > *Cc:* Lu, Wenzhuo; dev@dpdk.org > *Subject:* RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh=20 > above 1 for all NICs but 82598 > > > On Aug 25, 2015 21:14, "Zhang, Helin" > wrote: > > > > Hi Vlad > > > > > > > > In addition, I=E2=80=99d double check with you what=E2=80=99s the max= imum number of=20 > descriptors would be used for a single packet transmitting? > > > > Datasheet said that it supports up to 8. I am wondering if more than=20 > 8 were used in your case? > > If memory serves me well the maximum number of data descriptors per=20 > single xmit packet is 40 minus 2 minus WTHRESH. Since WTHRESH in DPDK=20 > is always zero it gives us 38 segments. We limit them by 33. > > > > > Thank you very much! > > > > > > > > Regards, > > > > Helin > > > > > > > > From: Zhang, Helin > > Sent: Wednesday, August 19, 2015 10:29 AM > > To: Vladislav Zolotarov > > Cc: Lu, Wenzhuo; dev@dpdk.org > > > > Subject: RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh=20 > above 1 for all NICs but 82598 > > > > > > > > Hi Vlad > > > > > > > > Thank you very much for the patches! Give me a few more time to=20 > double check with more guys, and possibly hardware experts. > > > > > > > > Regards, > > > > Helin > > > > > > > > From: Vladislav Zolotarov [mailto:vladz@cloudius-systems.com=20 > ] > > Sent: Tuesday, August 18, 2015 9:56 PM > > To: Lu, Wenzhuo > > Cc: dev@dpdk.org ; Zhang, Helin > > Subject: RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh=20 > above 1 for all NICs but 82598 > > > > > > > > > > On Aug 19, 2015 03:42, "Lu, Wenzhuo" > wrote: > > > > > > Hi Helin, > > > > > > > -----Original Message----- > > > > From: dev [mailto:dev-bounces@dpdk.org=20 > ] On Behalf Of Vlad Zolotarov > > > > Sent: Friday, August 14, 2015 1:38 PM > > > > To: Zhang, Helin; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid=20 > tx_rs_thresh above 1 for > > > > all NICs but 82598 > > > > > > > > > > > > > > > > On 08/13/15 23:28, Zhang, Helin wrote: > > > > > Hi Vlad > > > > > > > > > > I don't think the changes are needed. It says in datasheet=20 > that the RS > > > > > bit should be set on the last descriptor of every packet, ONLY=20 > WHEN > > > > TXDCTL.WTHRESH equals to ZERO. > > > > > > > > Of course it's needed! See below. > > > > Exactly the same spec a few lines above the place u've just=20 > quoted states: > > > > > > > > "Software should not set the RS bit when TXDCTL.WTHRESH is=20 > greater than > > > > zero." > > > > > > > > And since all three (3) ixgbe xmit callbacks are utilizing RS=20 > bit notation ixgbe PMD > > > > is actually not supporting any value of WTHRESH different from ze= ro. > > > I think Vlad is right. We need to fix this issue. Any suggestion?=20 > If not, I'd like to ack this patch. > > > > Pls., note that there is a v2 of this patch on the list. I forgot to=20 > patch ixgbevf_dev_info_get() in v1. > > > > > > > > > > > > > > > > > > > Regards, > > > > > Helin > > > > > > > > > >> -----Original Message----- > > > > >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com=20 > ] > > > > >> Sent: Thursday, August 13, 2015 11:07 AM > > > > >> To: dev@dpdk.org > > > > >> Cc: Zhang, Helin; Ananyev, Konstantin;=20 > avi@cloudius-systems.com ; Vlad > > > > >> Zolotarov > > > > >> Subject: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh=20 > above 1 for > > > > all > > > > >> NICs but 82598 > > > > >> > > > > >> According to 82599 and x540 HW specifications RS bit *must*=20 > be set in the > > > > last > > > > >> descriptor of *every* packet. > > > > > There is a condition that if TXDCTL.WTHRESH equal to zero. > > > > > > > > Right and ixgbe PMD requires this condition to be fulfilled in=20 > order to > > > > function. See above. > > > > > > > > > > > > > >> This patch fixes the Tx hang we were constantly hitting with=20 > a seastar-based > > > > >> application on x540 NIC. > > > > > Could you help to share with us how to reproduce the tx hang=20 > issue, with using > > > > > typical DPDK examples? > > > > > > > > Sorry. I'm not very familiar with the typical DPDK examples to=20 > help u > > > > here. However this is quite irrelevant since without this this pa= tch > > > > ixgbe PMD obviously abuses the HW spec as has been explained abov= e. > > > > > > > > We saw the issue when u stressed the xmit path with a lot of high= ly > > > > fragmented TCP frames (packets with up to 33 fragments with=20 > non-headers > > > > fragments as small as 4 bytes) with all offload features enabled. > > > > > > > > Thanks, > > > > vlad > > > > > > > > > >> Signed-off-by: Vlad Zolotarov > > > > > >> --- > > > > >> drivers/net/ixgbe/ixgbe_ethdev.c | 9 +++++++++ > > > > >> drivers/net/ixgbe/ixgbe_rxtx.c | 23 ++++++++++++++++++++++- > > > > >> 2 files changed, 31 insertions(+), 1 deletion(-) > > > > >> > > > > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> b/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> index b8ee1e9..6714fd9 100644 > > > > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > > > > >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev=20 > *dev, > > > > struct > > > > >> rte_eth_dev_info *dev_info) > > > > >> .txq_flags =3D ETH_TXQ_FLAGS_NOMULTSEGS | > > > > >> ETH_TXQ_FLAGS_NOOFFLOADS, > > > > >> }; > > > > >> + > > > > >> + /* > > > > >> + * According to 82599 and x540 specifications RS bit=20 > *must* be set on > > > > the > > > > >> + * last descriptor of *every* packet. Therefore we will=20 > not allow the > > > > >> + * tx_rs_thresh above 1 for all NICs newer than 82598. > > > > >> + */ > > > > >> + if (hw->mac.type > ixgbe_mac_82598EB) > > > > >> + dev_info->default_txconf.tx_rs_thresh =3D 1; > > > > >> + > > > > >> dev_info->hash_key_size =3D IXGBE_HKEY_MAX_INDEX *=20 > sizeof(uint32_t); > > > > >> dev_info->reta_size =3D ETH_RSS_RETA_SIZE_128; > > > > >> dev_info->flow_type_rss_offloads =3D IXGBE_RSS_OFFLOAD_ALL; di= ff -- > > > > git > > > > >> a/drivers/net/ixgbe/ixgbe_rxtx.c=20 > b/drivers/net/ixgbe/ixgbe_rxtx.c index > > > > >> 91023b9..8dbdffc 100644 > > > > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > > > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > > > >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct=20 > rte_eth_dev > > > > >> *dev, > > > > >> struct ixgbe_tx_queue *txq; > > > > >> struct ixgbe_hw *hw; > > > > >> uint16_t tx_rs_thresh, tx_free_thresh; > > > > >> + bool rs_deferring_allowed; > > > > >> > > > > >> PMD_INIT_FUNC_TRACE(); > > > > >> hw =3D IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > > > >> > > > > >> /* > > > > >> + * According to 82599 and x540 specifications RS bit=20 > *must* be set on > > > > the > > > > >> + * last descriptor of *every* packet. Therefore we will=20 > not allow the > > > > >> + * tx_rs_thresh above 1 for all NICs newer than 82598. > > > > >> + */ > > > > >> + rs_deferring_allowed =3D (hw->mac.type <=3D ixgbe_mac_82598= EB); > > > > >> + > > > > >> + /* > > > > >> * Validate number of transmit descriptors. > > > > >> * It must not exceed hardware maximum, and must be multipl= e > > > > >> * of IXGBE_ALIGN. > > > > >> @@ -2110,6 +2118,8 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_= dev > > > > *dev, > > > > >> * to transmit a packet is greater than the number of free = TX > > > > >> * descriptors. > > > > >> * The following constraints must be satisfied: > > > > >> + * tx_rs_thresh must be less than 2 for NICs for which RS=20 > deferring is > > > > >> + * forbidden (all but 82598). > > > > >> * tx_rs_thresh must be greater than 0. > > > > >> * tx_rs_thresh must be less than the size of the ring=20 > minus 2. > > > > >> * tx_rs_thresh must be less than or equal to tx_free_thre= sh. > > > > >> @@ -2121,9 +2131,20 @@ ixgbe_dev_tx_queue_setup(struct=20 > rte_eth_dev > > > > *dev, > > > > >> * When set to zero use default values. > > > > >> */ > > > > >> tx_rs_thresh =3D (uint16_t)((tx_conf->tx_rs_thresh) ? > > > > >> - tx_conf->tx_rs_thresh : DEFAULT_TX_RS_THRESH); > > > > >> + tx_conf->tx_rs_thresh : > > > > >> + (rs_deferring_allowed ? DEFAULT_TX_RS_THRESH : 1)); > > > > >> tx_free_thresh =3D (uint16_t)((tx_conf->tx_free_thresh) ? > > > > >> tx_conf->tx_free_thresh : DEFAULT_TX_FREE_THRESH); > > > > >> + > > > > >> + if (!rs_deferring_allowed && tx_rs_thresh > 1) { > > > > >> + PMD_INIT_LOG(ERR, "tx_rs_thresh must be less than=20 > 2 since RS > > > > " > > > > >> + "must be set for every packet=20 > for this HW. " > > > > >> + "(tx_rs_thresh=3D%u port=3D%d queue=3D%d)", > > > > >> + (unsigned int)tx_rs_thresh, > > > > >> + (int)dev->data->port_id, (int)queue_idx); > > > > >> + return -(EINVAL); > > > > >> + } > > > > >> + > > > > >> if (tx_rs_thresh >=3D (nb_desc - 2)) { > > > > >> PMD_INIT_LOG(ERR, "tx_rs_thresh must be less than t= he > > > > number " > > > > >> "of TX descriptors minus 2.=20 > (tx_rs_thresh=3D%u " > > > > >> -- > > > > >> 2.1.0 > > > >