From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Gunthorpe
Subject: Re: [PATCH mlx5-next 08/10] IB/mlx5: Call PAGE_FAULT_RESUME command asynchronously
Date: Thu, 8 Nov 2018 19:49:03 +0000
Message-ID: <20181108194857.GF5548@mellanox.com>
References: <20181108191017.21891-1-leon@kernel.org> <20181108191017.21891-9-leon@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: Doug Ledford , Leon Romanovsky , RDMA mailing list , Artemy Kovalyov , Majd Dibbiny , Moni Shoua , Saeed Mahameed , linux-netdev
To: Leon Romanovsky
Return-path:
Received: from mail-he1eur01on0063.outbound.protection.outlook.com ([104.47.0.63]:2240
	"EHLO EUR01-HE1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1725723AbeKIF0t (ORCPT );
	Fri, 9 Nov 2018 00:26:49 -0500
In-Reply-To: <20181108191017.21891-9-leon@kernel.org>
Content-Language: en-US
Content-ID: <333E3EC7DEEA6949A74981A65F964004@eurprd05.prod.outlook.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Thu, Nov 08, 2018 at 09:10:15PM +0200, Leon Romanovsky wrote:
> From: Moni Shoua
>
> Telling the HCA that page fault handling is done and QP can resume
> its flow is done in the context of the page fault handler. This blocks
> the handling of the next work in queue without a need.
> Call the PAGE_FAULT_RESUME command in an asynchronous manner and free
> the workqueue to pick the next work item for handling. All tasks that
> were executed after PAGE_FAULT_RESUME need to be done now
> in the callback of the asynchronous command mechanism.
>
> Signed-off-by: Moni Shoua
> Signed-off-by: Leon Romanovsky
>  drivers/infiniband/hw/mlx5/odp.c | 110 +++++++++++++++++++++++++------
>  include/linux/mlx5/driver.h      |   3 +
>  2 files changed, 94 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> index abce55b8b9ba..0c4f469cdd5b 100644
> +++ b/drivers/infiniband/hw/mlx5/odp.c
> @@ -298,20 +298,78 @@ void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev)
>  	return;
>  }
>  
> +struct pfault_resume_cb_ctx {
> +	struct mlx5_ib_dev *dev;
> +	struct mlx5_core_rsc_common *res;
> +	struct mlx5_pagefault *pfault;
> +};
> +
> +static void page_fault_resume_callback(int status, void *context)
> +{
> +	struct pfault_resume_cb_ctx *ctx = context;
> +	struct mlx5_pagefault *pfault = ctx->pfault;
> +
> +	if (status)
> +		mlx5_ib_err(ctx->dev, "Resolve the page fault failed with status %d\n",
> +			    status);
> +
> +	if (ctx->res)
> +		mlx5_core_res_put(ctx->res);
> +	kfree(pfault);
> +	kfree(ctx);
> +}
> +
>  static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
> +				      struct mlx5_core_rsc_common *res,
>  				      struct mlx5_pagefault *pfault,
> -				      int error)
> +				      int error,
> +				      bool async)
>  {
> +	int ret = 0;
> +	u32 *out = pfault->out_pf_resume;
> +	u32 *in = pfault->in_pf_resume;
> +	u32 token = pfault->token;
>  	int wq_num = pfault->event_subtype == MLX5_PFAULT_SUBTYPE_WQE ?
> -		pfault->wqe.wq_num : pfault->token;
> -	int ret = mlx5_core_page_fault_resume(dev->mdev,
> -					      pfault->token,
> -					      wq_num,
> -					      pfault->type,
> -					      error);
> -	if (ret)
> -		mlx5_ib_err(dev, "Failed to resolve the page fault on WQ 0x%x\n",
> -			    wq_num);
> +		pfault->wqe.wq_num : pfault->token;
> +	u8 type = pfault->type;
> +	struct pfault_resume_cb_ctx *ctx = NULL;
> +
> +	if (async)
> +		ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);

Why not allocate this ctx as part of the mlx5_pagefault and avoid this
allocation failure strategy?

Jason
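
To illustrate the question above, here is a minimal userspace sketch of
embedding the callback context in the page fault object, so that the
resume path has no second allocation that can fail. All names here
(struct pagefault, struct resume_ctx, pagefault_alloc, pagefault_resume,
resume_done, frees_seen) are hypothetical stand-ins and not the real
mlx5 definitions; malloc/free stand in for kmalloc/kfree, and the
"async" command is completed inline purely for demonstration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the callback context from the patch. */
struct resume_ctx {
	void *dev;	/* would be struct mlx5_ib_dev * */
	void *res;	/* would be struct mlx5_core_rsc_common * */
};

/* Hypothetical stand-in for struct mlx5_pagefault. */
struct pagefault {
	unsigned int token;
	struct resume_ctx resume;	/* embedded: allocated with the pagefault */
};

static int frees_seen;	/* illustration only: counts completed callbacks */

/* Completion callback: the context pointer is the pagefault itself. */
static void resume_done(int status, void *context)
{
	struct pagefault *pf = context;

	(void)status;	/* a driver would log a non-zero status here */
	free(pf);	/* a single free releases the fault and its context */
	frees_seen++;
}

/* The only allocation; a failure is handled once, when the fault is created. */
static struct pagefault *pagefault_alloc(unsigned int token)
{
	struct pagefault *pf = calloc(1, sizeof(*pf));

	if (pf)
		pf->token = token;
	return pf;
}

/* The resume path fills the embedded context; nothing here can fail on memory. */
static void pagefault_resume(struct pagefault *pf, void *dev, void *res)
{
	pf->resume.dev = dev;
	pf->resume.res = res;
	/* A driver would queue the asynchronous command here; the sketch
	 * invokes the completion directly. */
	resume_done(0, pf);
}
```

With this layout the callback no longer needs a back-pointer from the
context to the page fault, and the resume function needs no fallback
path for a failed context allocation.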