From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jack Wang <jinpu.wang@ionos.com>
Cc: linux-rdma@vger.kernel.org, leon@kernel.org
Subject: Re: [PATCH 0/2] bugfix for ipoib
Date: Mon, 20 Nov 2023 20:16:40 -0400
Message-ID: <20231121001640.GG10140@ziepe.ca>
In-Reply-To: <20231120203501.321587-1-jinpu.wang@ionos.com>

On Mon, Nov 20, 2023 at 09:34:59PM +0100, Jack Wang wrote:
> We often run into queue timeouts, with a call trace such as:
> NETDEV WATCHDOG: ib0.beef (): transmit queue 26 timed out
> Call Trace:
> call_timer_fn+0x27/0x100
> __run_timers.part.0+0x1be/0x230
> ? mlx5_cq_tasklet_cb+0x6d/0x140 [mlx5_core]
> run_timer_softirq+0x26/0x50
> __do_softirq+0xbc/0x26d
> asm_call_irq_on_stack+0xf/0x20
> ib0.beef: transmit timeout: latency 10 msecs
> ib0.beef: queue stopped 0, tx_head 0, tx_tail 0, global_tx_head 0, global_tx_tail 0
> 
> The last two messages repeated for days.

You shouldn't get tx timeouts and fully stuck queues like that; it
suggests something else is very wrong in that system.

> After cross-checking with Mellanox OFED, I noticed some bugfixes are missing
> upstream, hence I take the liberty to send them out.

Recovery is recovery; it is just RAS.

Jason

