From: Jason Gunthorpe <jgg@nvidia.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: leon@kernel.org, zyjzyj2000@gmail.com, jhack@hpe.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-next 4/9] RDMA/rxe: Fix delayed send packet handling
Date: Fri, 4 Aug 2023 11:17:36 -0300
Message-ID: <ZM0IgLe3yv6bsEiB@nvidia.com>
In-Reply-To: <0cfb222c-ff48-daca-d512-3083878100fa@gmail.com>

On Mon, Jul 31, 2023 at 01:33:15PM -0500, Bob Pearson wrote:
> On 7/31/23 13:23, Jason Gunthorpe wrote:
> > On Mon, Jul 31, 2023 at 01:20:35PM -0500, Bob Pearson wrote:
> >> On 7/31/23 13:12, Jason Gunthorpe wrote:
> >>> On Fri, Jul 21, 2023 at 03:50:17PM -0500, Bob Pearson wrote:
> >>>> In cable pull testing some NICs can hold a send packet long enough
> >>>> to allow ulp protocol stacks to destroy the qp and the cleanup
> >>>> routines to timeout waiting for all qp references to be released.
> >>>> When the NIC driver finally frees the SKB the qp pointer is no longer
> >>>> valid and causes a seg fault in rxe_skb_tx_dtor().
> >>>>
> >>>> This patch passes the qp index instead of the qp to the skb destructor
> >>>> callback function. The call back is required to lookup the qp from the
> >>>> index and if it has been destroyed the lookup will return NULL and the
> >>>> qp will not be referenced avoiding the seg fault.
> >>>
> >>> And what if it is a different QP returned?
> >>>
> >>> Jason
> >>
> >> Since we are using xarray cyclic alloc you would have to create 16M QPs before the
> >> index was reused. This is as good as it gets I think.
> >
> > Sounds terrible, why can't you store the QP pointer instead and hold a
> > refcount on it?
>
> The goal here was to make packet send semantics to be 'fire and forget' i.e. once we
> send the packet not have any dependencies hanging around. But we still wanted to count
> the packets pending to avoid overrunning the send queue.

Well, you can't have it both ways really.

Maybe you need another bit of memory to track the packet counters that
can be refcounted independently of the qp.

And wait for those refcounts to zero out before allowing the driver to
unprobe.

Jason
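
The suggestion above (a separately refcounted bit of memory for the packet
counters, with unprobe waiting for the refcounts to drop) could look roughly
like the userspace sketch below. It is not code from the patch series or from
the rxe driver: struct dev_sketch, struct pkt_tracker and all tracker_*()
names are hypothetical, and C11 atomics stand in for the kernel's refcounting
primitives. The point it illustrates is that the skb destructor only ever
touches the tracker, so the QP can be destroyed while the NIC still holds a
packet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device: counts trackers still alive. */
struct dev_sketch {
	atomic_int live_trackers;
};

/* Hypothetical per-QP tracker, allocated separately from the QP itself. */
struct pkt_tracker {
	atomic_int refcnt;	/* 1 for the QP + 1 per in-flight packet */
	atomic_int pending;	/* packets handed to the NIC, not yet freed */
	struct dev_sketch *dev;
};

static struct pkt_tracker *tracker_alloc(struct dev_sketch *dev)
{
	struct pkt_tracker *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	atomic_init(&t->refcnt, 1);	/* reference owned by the QP */
	atomic_init(&t->pending, 0);
	t->dev = dev;
	atomic_fetch_add(&dev->live_trackers, 1);
	return t;
}

/* Drop one reference; the last one out frees the tracker. */
static void tracker_put(struct pkt_tracker *t)
{
	int old = atomic_fetch_sub(&t->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		atomic_fetch_sub(&t->dev->live_trackers, 1);
		free(t);
	}
}

/* Send path: each packet takes its own reference on the tracker. */
static void tracker_pkt_sent(struct pkt_tracker *t)
{
	atomic_fetch_add(&t->refcnt, 1);
	atomic_fetch_add(&t->pending, 1);
}

/* Stand-in for the skb destructor: uses the tracker, never the QP. */
static void tracker_pkt_freed(struct pkt_tracker *t)
{
	atomic_fetch_sub(&t->pending, 1);
	tracker_put(t);
}

/* Unprobe may proceed only once every tracker has been released. */
static int dev_can_unprobe(struct dev_sketch *dev)
{
	return atomic_load(&dev->live_trackers) == 0;
}
```

In this sketch, QP destroy would call tracker_put() and return without
waiting for the NIC; a late tracker_pkt_freed() then drops the last reference,
and only after that does dev_can_unprobe() report true. The send queue can
still be throttled by reading the tracker's pending count.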