From: Leon Romanovsky <leon@kernel.org>
To: Jinpu Wang <jinpu.wang@ionos.com>
Cc: netdev <netdev@vger.kernel.org>,
RDMA mailing list <linux-rdma@vger.kernel.org>,
Moshe Shemesh <moshe@nvidia.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Tariq Toukan <tariqt@nvidia.com>,
Maor Gottlieb <maorg@nvidia.com>, Shay Drory <shayd@nvidia.com>
Subject: Re: [BUG] mlx5_core general protection fault in mlx5_cmd_comp_handler
Date: Thu, 13 Oct 2022 13:27:38 +0300
Message-ID: <Y0foGrlwnYX8lJX2@unreal>
In-Reply-To: <CAMGffEmFCgKv-6XNXjAKzr5g6TtT_=wj6H62AdGCUXx4hruxBQ@mail.gmail.com>
On Thu, Oct 13, 2022 at 10:32:55AM +0200, Jinpu Wang wrote:
> On Thu, Oct 13, 2022 at 10:18 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Wed, Oct 12, 2022 at 01:55:55PM +0200, Jinpu Wang wrote:
> > > Hi Leon, hi Saeed,
> > >
> > > We have seen crashes during server shutdown on both kernel 5.10 and
> > > kernel 5.15, with a general protection fault in the mlx5_core function
> > > mlx5_cmd_comp_handler.
> > >
> > > All of the crashes point to
> > >
> > > 1606         memcpy(ent->out->first.data,
> > >                     ent->lay->out, sizeof(ent->lay->out));
> > >
> > > I guess it's some kind of use-after-free of the ent buffer. I tried to
> > > reproduce it by repeatedly rebooting the test servers, but no success so far.
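
To make that suspicion concrete, here is a minimal userspace sketch of the
failure mode being described. Everything below is hypothetical: the structure
and function names are simplified stand-ins, not the real mlx5 command-entry
layout; only the shape of the bug (a completion path doing the memcpy() out of
an entry that another path has already freed) is meant to match.

/*
 * Hypothetical, simplified sketch of the suspected use-after-free.
 * None of these names are the real driver structures; the point is only
 * that the "completion handler" copies out of 'ent' after 'ent' (and
 * everything it points to) has already been freed elsewhere.
 */
#include <stdlib.h>
#include <string.h>

struct fake_layout { char out[16]; };        /* stands in for ent->lay->out        */
struct fake_msg    { char first_data[16]; }; /* stands in for ent->out->first.data */

struct fake_cmd_ent {
        struct fake_layout *lay;
        struct fake_msg    *out;
};

/* stands in for the completion handler copying HW output back to the caller */
static void fake_comp_handler(struct fake_cmd_ent *ent)
{
        memcpy(ent->out->first_data, ent->lay->out, sizeof(ent->lay->out));
}

int main(void)
{
        struct fake_cmd_ent *ent = calloc(1, sizeof(*ent));

        ent->lay = calloc(1, sizeof(*ent->lay));
        ent->out = calloc(1, sizeof(*ent->out));

        /* some error/teardown path frees the entry ... */
        free(ent->lay);
        free(ent->out);
        free(ent);

        /* ... but a completion still arrives and dereferences it */
        fake_comp_handler(ent);   /* use-after-free, may fault like the GPF above */

        return 0;
}

Built with -fsanitize=address, the memcpy() above is flagged as a
heap-use-after-free, which is the bug class the crash at line 1606 suggests.
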
> >
> > My guess is that the command interface is not flushed, but Moshe and I
> > didn't see how it could happen.
> >
> > 1206         INIT_DELAYED_WORK(&ent->cb_timeout_work, cb_timeout_handler);
> > 1207         INIT_WORK(&ent->work, cmd_work_handler);
> > 1208         if (page_queue) {
> > 1209                 cmd_work_handler(&ent->work);
> > 1210         } else if (!queue_work(cmd->wq, &ent->work)) {
> >                           ^^^^^^^^^^ this is what is causing the splat
> > 1211                 mlx5_core_warn(dev, "failed to queue work\n");
> > 1212                 err = -EALREADY;
> > 1213                 goto out_free;
> > 1214         }
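
As a toy model of why that branch is worth staring at (plain userspace C,
hypothetical names, nothing mlx5-specific): queue_work() returns false only
when the work item is already pending, so the "failed to queue work" path runs
while an earlier instance of the work is still waiting to execute. If that
error path then frees state the handler will later touch, the result is the
same use-after-free pattern as above; whether the driver can actually get into
that state is exactly the open question.

#include <stdbool.h>
#include <stdlib.h>

struct fake_work {
        bool pending;   /* models the work item's PENDING state            */
        void *ent;      /* models the command entry the handler will touch */
};

/* models queue_work(): refuses to queue an item that is already pending */
static bool fake_queue_work(struct fake_work *w)
{
        if (w->pending)
                return false;
        w->pending = true;
        return true;
}

/* models the queued handler finally running and touching the entry */
static void fake_handler(struct fake_work *w)
{
        ((char *)w->ent)[0] = 0;  /* use-after-free if the error path freed it */
        w->pending = false;
}

int main(void)
{
        struct fake_work w = { .pending = false };

        w.ent = malloc(64);

        (void)fake_queue_work(&w);  /* first submission: queued, not yet run     */

        if (!fake_queue_work(&w)) { /* second submission: "failed to queue work" */
                free(w.ent);        /* models 'goto out_free' tearing state down */
        }

        fake_handler(&w);           /* the first submission finally runs: splat  */
        return 0;
}

The dangling pointer is left in place on purpose: nothing in the sketch
invalidates the entry between the error path and the handler, which mirrors
the suspicion about the real flow.
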
> >
> > <...>
> > >
> > > Is this problem known, maybe already fixed?
> >
> > I don't see any missing Fixes that exist in 6.0 and don't exist in 5.5.32.
> Sorry it is 5.15.32
> > Is it possible to reproduce this on the latest upstream code?
> I haven't been able to reproduce it; as mentioned above, I tried
> rebooting in a loop, but no luck yet.
> Do you have any suggestions to speed up the reproduction?
Maybe try to shut down while the command interface is being filled.
I think any query command will do the trick.
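
A concrete (and entirely hypothetical) way to do that from userspace is to
hammer the device with a cheap query in a tight loop and trigger the shutdown
while it is spinning. In the sketch below "eth0" is only a placeholder for the
mlx5 netdev, and whether this particular ethtool query actually reaches the
firmware command interface is an assumption; as said, any query command should
do, so a heavier one can be swapped in if this one turns out not to hit it.

/*
 * Hypothetical repro helper: keep issuing a device query in a loop so the
 * command interface is (hopefully) busy when shutdown tears it down.
 * "eth0" is a placeholder; pass the real mlx5 interface name as argv[1].
 */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "eth0";
        struct ethtool_drvinfo drvinfo;
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0) {
                perror("socket");
                return 1;
        }

        for (;;) {
                memset(&ifr, 0, sizeof(ifr));
                memset(&drvinfo, 0, sizeof(drvinfo));
                drvinfo.cmd = ETHTOOL_GDRVINFO;
                strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
                ifr.ifr_data = (char *)&drvinfo;

                /* each call is one query into the driver */
                if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
                        perror("SIOCETHTOOL");
        }
        return 0;
}

Run it against the mlx5 interface in the background, then issue the shutdown;
that keeps queries in flight during teardown, which a plain reboot loop
apparently does not.
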
> Once I can reproduce, I can also try with kernel 6.0.
That would be great.
Thanks