From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jack Wang <jinpu.wang@ionos.com>
Cc: linux-rdma@vger.kernel.org, leon@kernel.org
Subject: Re: [PATCH 0/2] bugfix for ipoib
Date: Mon, 20 Nov 2023 20:16:40 -0400
Message-ID: <20231121001640.GG10140@ziepe.ca>
In-Reply-To: <20231120203501.321587-1-jinpu.wang@ionos.com>

On Mon, Nov 20, 2023 at 09:34:59PM +0100, Jack Wang wrote:
> We run into queue timeout often with call trace as such:
> NETDEV WATCHDOG: ib0.beef (): transmit queue 26 timed out
> Call Trace:
> call_timer_fn+0x27/0x100
> __run_timers.part.0+0x1be/0x230
> ? mlx5_cq_tasklet_cb+0x6d/0x140 [mlx5_core]
> run_timer_softirq+0x26/0x50
> __do_softirq+0xbc/0x26d
> asm_call_irq_on_stack+0xf/0x20
> ib0.beef: transmit timeout: latency 10 msecs
> ib0.beef: queue stopped 0, tx_head 0, tx_tail 0, global_tx_head 0, global_tx_tail 0
>
> The last two messages repeated for days.

You shouldn't get tx timeouts and fully stuck queues like that; it
suggests something else is very wrong in that system.

> After cross-checking with Mellanox OFED, I noticed some bugfixes are missing
> upstream, hence I take the liberty to send them out.

Recovery is recovery, it is just RAS

Jason
Thread overview: 7+ messages
2023-11-20 20:34 [PATCH 0/2] bugfix for ipoib Jack Wang
2023-11-20 20:35 ` [PATCH 1/2] ipoib: Fix error code return in ipoib_mcast_join Jack Wang
2023-11-20 20:35 ` [PATCH 2/2] ipoib: Add tx timeout work to recover queue stop situation Jack Wang
2023-11-21 4:20 ` kernel test robot
2023-11-21 8:21 ` kernel test robot
2023-11-21 0:16 ` Jason Gunthorpe [this message]
2023-11-21 13:02 ` [PATCH 0/2] bugfix for ipoib Jinpu Wang