From: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Wengang Wang
<wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH] IPoIB: serialize changing on tx_outstanding
Date: Thu, 8 Oct 2015 11:29:39 +0800 [thread overview]
Message-ID: <5615E323.9000106@oracle.com> (raw)
In-Reply-To: <1443418930-18677-1-git-send-email-wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Hi,
Any comment on this patch?
thanks,
wengang
在 2015年09月28日 13:42, Wengang Wang 写道:
> The changing on tx_outstanding should be protected by spinlock or to be
> atomic operations.
>
> Such log is found in dmesg:
>
> Sep 16 14:20:53 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034733, tx_tail 1034733, tx_outstanding 359 ipoib_sendq_size: 512
> Sep 16 14:21:33 naep11x06 kernel: ib0: transmit timeout: latency 9560 msecs
> Sep 16 14:21:33 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034854, tx_tail 1034854, tx_outstanding 511 ipoib_sendq_size: 512
> Sep 16 14:21:38 naep11x06 kernel: ib0: transmit timeout: latency 14568 msecs
> Sep 16 14:21:38 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034854, tx_tail 1034854, tx_outstanding 511 ipoib_sendq_size: 512
>
> And the send queue of ib0 kept full. When transmit timeout is reported,
> queue is reported as "stopped", but the IPoIB stuff tx_head and tx_tail
> points to same value. I am not able to see such numbers in ipoib_cm_tx
> (for CM) because I have no vmcore. Though I am not quite sure it's caused
> by parallel access of tx_outstanding(send path VS interrup path), we really
> need to serialize the changeing on tx_outstanding.
>
> This patch also make sure the increase of tx_outstanding prior to the
> calling of post_send to avoid the possible decreasing before increasing in
> case the running of increasing is scheduled later than the interrupt
> handler.
>
> Signed-off-by: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
> drivers/infiniband/ulp/ipoib/ipoib_cm.c | 40 +++++++++++++++++++++++----------
> drivers/infiniband/ulp/ipoib/ipoib_ib.c | 24 ++++++++++++++++++--
> 2 files changed, 50 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> index c78dc16..044da94 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> @@ -710,6 +710,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
> struct ipoib_dev_priv *priv = netdev_priv(dev);
> struct ipoib_tx_buf *tx_req;
> int rc;
> + unsigned long flags;
>
> if (unlikely(skb->len > tx->mtu)) {
> ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n",
> @@ -742,27 +743,36 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
> skb_orphan(skb);
> skb_dst_drop(skb);
>
> + spin_lock_irqsave(&priv->lock, flags);
> + if (++priv->tx_outstanding == ipoib_sendq_size) {
> + ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
> + tx->qp->qp_num);
> + netif_stop_queue(dev);
> + }
> + spin_unlock_irqrestore(&priv->lock, flags);
> + if (netif_queue_stopped(dev)) {
> + rc = ib_req_notify_cq(priv->send_cq,
> + IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
> + if (rc < 0)
> + ipoib_warn(priv, "request notify on send CQ failed\n");
> + else if (rc)
> + ipoib_send_comp_handler(priv->send_cq, dev);
> + }
> +
> rc = post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), tx_req);
> if (unlikely(rc)) {
> ipoib_warn(priv, "post_send failed, error %d\n", rc);
> ++dev->stats.tx_errors;
> + spin_lock_irqsave(&priv->lock, flags);
> + --priv->tx_outstanding;
> + if (netif_queue_stopped(dev))
> + netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
> ipoib_dma_unmap_tx(priv, tx_req);
> dev_kfree_skb_any(skb);
> } else {
> dev->trans_start = jiffies;
> ++tx->tx_head;
> -
> - if (++priv->tx_outstanding == ipoib_sendq_size) {
> - ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
> - tx->qp->qp_num);
> - netif_stop_queue(dev);
> - rc = ib_req_notify_cq(priv->send_cq,
> - IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
> - if (rc < 0)
> - ipoib_warn(priv, "request notify on send CQ failed\n");
> - else if (rc)
> - ipoib_send_comp_handler(priv->send_cq, dev);
> - }
> }
> }
>
> @@ -796,10 +806,13 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
> netif_tx_lock(dev);
>
> ++tx->tx_tail;
> +
> + spin_lock_irqsave(&priv->lock, flags);
> if (unlikely(--priv->tx_outstanding == ipoib_sendq_size >> 1) &&
> netif_queue_stopped(dev) &&
> test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
>
> if (wc->status != IB_WC_SUCCESS &&
> wc->status != IB_WC_WR_FLUSH_ERR) {
> @@ -1169,6 +1182,7 @@ static void ipoib_cm_tx_destroy(struct ipoib_cm_tx *p)
> struct ipoib_dev_priv *priv = netdev_priv(p->dev);
> struct ipoib_tx_buf *tx_req;
> unsigned long begin;
> + unsigned long flags;
>
> ipoib_dbg(priv, "Destroy active connection 0x%x head 0x%x tail 0x%x\n",
> p->qp ? p->qp->qp_num : 0, p->tx_head, p->tx_tail);
> @@ -1198,10 +1212,12 @@ timeout:
> dev_kfree_skb_any(tx_req->skb);
> ++p->tx_tail;
> netif_tx_lock_bh(p->dev);
> + spin_lock_irqsave(&priv->lock, flags);
> if (unlikely(--priv->tx_outstanding == ipoib_sendq_size >> 1) &&
> netif_queue_stopped(p->dev) &&
> test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> netif_wake_queue(p->dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
> netif_tx_unlock_bh(p->dev);
> }
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index d266667..7616e3c 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -377,6 +377,7 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
> struct ipoib_dev_priv *priv = netdev_priv(dev);
> unsigned int wr_id = wc->wr_id;
> struct ipoib_tx_buf *tx_req;
> + unsigned long flags;
>
> ipoib_dbg_data(priv, "send completion: id %d, status: %d\n",
> wr_id, wc->status);
> @@ -397,10 +398,13 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
> dev_kfree_skb_any(tx_req->skb);
>
> ++priv->tx_tail;
> +
> + spin_lock_irqsave(&priv->lock, flags);
> if (unlikely(--priv->tx_outstanding == ipoib_sendq_size >> 1) &&
> netif_queue_stopped(dev) &&
> test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
>
> if (wc->status != IB_WC_SUCCESS &&
> wc->status != IB_WC_WR_FLUSH_ERR) {
> @@ -540,6 +544,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
> struct ipoib_tx_buf *tx_req;
> int hlen, rc;
> void *phead;
> + unsigned long flags;
>
> if (skb_is_gso(skb)) {
> hlen = skb_transport_offset(skb) + tcp_hdrlen(skb);
> @@ -587,12 +592,22 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
> else
> priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM;
>
> + spin_lock_irqsave(&priv->lock, flags);
> if (++priv->tx_outstanding == ipoib_sendq_size) {
> ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n");
> if (ib_req_notify_cq(priv->send_cq, IB_CQ_NEXT_COMP))
> ipoib_warn(priv, "request notify on send CQ failed\n");
> netif_stop_queue(dev);
> }
> + spin_unlock_irqrestore(&priv->lock, flags);
> + if (netif_queue_stopped(dev)) {
> + rc = ib_req_notify_cq(priv->send_cq,
> + IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
> + if (rc < 0)
> + ipoib_warn(priv, "request notify on send CQ failed\n");
> + else if (rc)
> + ipoib_send_comp_handler(priv->send_cq, dev);
> + }
>
> skb_orphan(skb);
> skb_dst_drop(skb);
> @@ -602,11 +617,13 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
> if (unlikely(rc)) {
> ipoib_warn(priv, "post_send failed, error %d\n", rc);
> ++dev->stats.tx_errors;
> + spin_lock_irqsave(&priv->lock, flags);
> --priv->tx_outstanding;
> - ipoib_dma_unmap_tx(priv, tx_req);
> - dev_kfree_skb_any(skb);
> if (netif_queue_stopped(dev))
> netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
> + ipoib_dma_unmap_tx(priv, tx_req);
> + dev_kfree_skb_any(skb);
> } else {
> dev->trans_start = jiffies;
>
> @@ -825,6 +842,7 @@ int ipoib_ib_dev_stop(struct net_device *dev)
> unsigned long begin;
> struct ipoib_tx_buf *tx_req;
> int i;
> + unsigned long flags;
>
> if (test_and_clear_bit(IPOIB_FLAG_INITIALIZED, &priv->flags))
> napi_disable(&priv->napi);
> @@ -857,7 +875,9 @@ int ipoib_ib_dev_stop(struct net_device *dev)
> ipoib_dma_unmap_tx(priv, tx_req);
> dev_kfree_skb_any(tx_req->skb);
> ++priv->tx_tail;
> + spin_lock_irqsave(&priv->lock, flags);
> --priv->tx_outstanding;
> + spin_unlock_irqrestore(&priv->lock, flags);
> }
>
> for (i = 0; i < ipoib_recvq_size; ++i) {
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2015-10-08 3:29 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-28 5:42 [PATCH] IPoIB: serialize changing on tx_outstanding Wengang Wang
[not found] ` <1443418930-18677-1-git-send-email-wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2015-10-08 3:29 ` Wengang Wang [this message]
2015-10-08 4:33 ` Leon Romanovsky
[not found] ` <20151008043317.GB1940-2ukJVAZIZ/Y@public.gmane.org>
2015-10-08 4:55 ` Wengang Wang
[not found] ` <5615F743.5050404-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2015-10-08 6:39 ` Leon Romanovsky
-- strict thread matches above, loose matches on Subject: below --
2016-05-20 2:51 Wengang Wang
[not found] ` <1463712711-5556-1-git-send-email-wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-05-22 11:42 ` Erez Shitrit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5615E323.9000106@oracle.com \
--to=wen.gang.wang-qhclzuegtsvqt0dzr+alfa@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.