From: Patrisious Haddad <phaddad@nvidia.com>
To: Arnd Bergmann <arnd@kernel.org>,
Leon Romanovsky <leon@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Arnd Bergmann" <arnd@arndb.de>,
"Christian Göttsche" <cgzones@googlemail.com>,
"Serge Hallyn" <serge@hallyn.com>,
"Chiara Meiohas" <cmeiohas@nvidia.com>,
"Al Viro" <viro@zeniv.linux.org.uk>,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] RDMA/mlx5: reduce stack usage in mlx5_ib_ufile_hw_cleanup
Date: Tue, 10 Jun 2025 12:50:57 +0300
Message-ID: <fa916ae4-1ed3-4f90-8577-3666ff0fe84a@nvidia.com>
In-Reply-To: <20250610092846.2642535-1-arnd@kernel.org>
On 6/10/2025 12:28 PM, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> This function has an array of eight mlx5_async_cmd structures, which
> often fits on the stack, but depending on the configuration can
> end up blowing the stack frame warning limit:
>
> drivers/infiniband/hw/mlx5/devx.c:2670:6: error: stack frame size (1392) exceeds limit (1280) in 'mlx5_ib_ufile_hw_cleanup' [-Werror,-Wframe-larger-than]
>
> Change this to a dynamic allocation instead. While a kmalloc()
> can theoretically fail, a GFP_KERNEL allocation under a page will
> block until memory has been freed up, so in the worst case, this
> only adds extra time in an already constrained environment.
>
> Fixes: 7c891a4dbcc1 ("RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> drivers/infiniband/hw/mlx5/devx.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
> index 2479da8620ca..c3c0ea219ab7 100644
> --- a/drivers/infiniband/hw/mlx5/devx.c
> +++ b/drivers/infiniband/hw/mlx5/devx.c
> @@ -2669,7 +2669,7 @@ static void devx_wait_async_destroy(struct mlx5_async_cmd *cmd)
>
> void mlx5_ib_ufile_hw_cleanup(struct ib_uverbs_file *ufile)
> {
> - struct mlx5_async_cmd async_cmd[MAX_ASYNC_CMDS];
> + struct mlx5_async_cmd *async_cmd;
Please preserve the reverse Christmas tree declaration order.
> struct ib_ucontext *ucontext = ufile->ucontext;
> struct ib_device *device = ucontext->device;
> struct mlx5_ib_dev *dev = to_mdev(device);
> @@ -2678,6 +2678,10 @@ void mlx5_ib_ufile_hw_cleanup(struct ib_uverbs_file *ufile)
> int head = 0;
> int tail = 0;
>
> + async_cmd = kcalloc(MAX_ASYNC_CMDS, sizeof(*async_cmd), GFP_KERNEL);
> + if (WARN_ON(!async_cmd))
> + return;
But honestly, I'm not sure I like this. The whole point of the patch that
added this function was to optimize the teardown flow, and this function
is called in a loop, not just once.
So I'm really not sure how much kcalloc() would slow it down here, and
the allocation failing is a whole other issue.
I'm thinking out loud here, but in theory we know both the stack size and
this struct's size at compile time, so shouldn't we be able to add some
kind of ifdef-style check, "if (stack_frame_size < struct_size)", that
skips this function and maybe prints a warning?
(It is purely an optimization function, and logically the code will
continue correctly without it. But if it does need to be executed, then
let it stay as it is and require a big enough stack, which most of
today's systems have anyway.)
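
To make the idea concrete, here is a minimal userspace sketch of such a check. It is only an illustration under stated assumptions: FRAME_LIMIT stands in for the kernel's frame-size warning limit, struct fake_async_cmd stands in for struct mlx5_async_cmd (whose real size depends on the configuration), and cleanup_fast_path() is a hypothetical name. Since the preprocessor cannot evaluate sizeof, the comparison is an ordinary "if" on a constant expression that the compiler can fold away, rather than a real #ifdef.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <stdio.h>

#define MAX_ASYNC_CMDS 8

/* Assumption: stand-in for the configured frame-size warning limit, in bytes. */
#ifndef FRAME_LIMIT
#define FRAME_LIMIT 1280
#endif

/* Hypothetical stand-in for struct mlx5_async_cmd; the size is made up. */
struct fake_async_cmd {
	unsigned char payload[128];
};

static_assert(MAX_ASYNC_CMDS > 0, "batch must hold at least one command");

/* Both values are compile-time constants, so the branch below folds away. */
#define BATCH_BYTES (sizeof(struct fake_async_cmd) * MAX_ASYNC_CMDS)
#define BATCH_FITS  (BATCH_BYTES <= FRAME_LIMIT)

/* Returns 1 if the batched path would run, 0 if it is skipped. */
static int cleanup_fast_path(void)
{
	if (!BATCH_FITS) {
		fprintf(stderr, "skipping batched cleanup: %zu bytes > limit %d\n",
			(size_t)BATCH_BYTES, FRAME_LIMIT);
		return 0;
	}
	/* The batched asynchronous destroy loop would go here. */
	return 1;
}
```

With the sizes assumed above, the batch is 1024 bytes and the fast path runs; building with -DFRAME_LIMIT=1000 makes it skip instead. The sketch shows only the size comparison. It does not establish whether the frame-size warning would still fire when the array remains declared in the same function.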
> +
> list_for_each_entry(uobject, &ufile->uobjects, list) {
> WARN_ON(uverbs_try_lock_object(uobject, UVERBS_LOOKUP_WRITE));
>
> @@ -2713,6 +2717,8 @@ void mlx5_ib_ufile_hw_cleanup(struct ib_uverbs_file *ufile)
> devx_wait_async_destroy(&async_cmd[head % MAX_ASYNC_CMDS]);
> head++;
> }
> +
> + kfree(async_cmd);
> }
>
> static ssize_t devx_async_cmd_event_read(struct file *filp, char __user *buf,
> --
> 2.39.5
>