From: Leon Romanovsky <leon@kernel.org>
To: Petr Pavlu <petr.pavlu@suse.com>
Cc: tariqt@nvidia.com, yishaih@nvidia.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Part of devices not initialized with mlx4
Date: Tue, 3 Jan 2023 11:35:04 +0200
Message-ID: <Y7P2yECHeKvyqQqo@unreal>
In-Reply-To: <e939dbde-8905-fc98-5717-c555e05b708d@suse.com>
On Mon, Jan 02, 2023 at 11:33:15AM +0100, Petr Pavlu wrote:
> On 12/18/22 10:53, Leon Romanovsky wrote:
> > On Thu, Dec 15, 2022 at 10:51:15AM +0100, Petr Pavlu wrote:
> >> Hello,
> >>
> >> We have seen an issue when some of ConnectX-3 devices are not initialized
> >> when mlx4 drivers are a part of initrd.
> >
> > <...>
> >
> >> * Systemd stops running services and then sends SIGTERM to "unmanaged" tasks
> >> on the system to terminate them too. This includes the modprobe task.
> >> * Initialization of mlx4_en is interrupted in the middle of its init function.
> >
> > And why do you think that this systemd behaviour is correct one?
>
> My view is that this is an issue between the kernel and initrd/systemd.
> Switching the root is a delicate operation and both parts need to carefully
> cooperate for it to work correctly.
>
> I think it is generally sensible that systemd tries to terminate any remaining
> processes started from the initrd. They would have troubles when the root is
> switched under their hands anyway, unless they are specifically prepared for
> it. Systemd only skips terminating kthreads and allows to exclude root storage
> daemons. A modprobe helper could be excluded from being terminated too but the
> problem with the root switch remains.
>
> It looks to me that a good approach is to complete all running module loads
> before switching the root and continue with any further loads after the
> operation is done. Leaving module loads to udevd assures this, hence the idea
> to use an auxiliary bus.

I'm not sure about that. Everything described above is a user-space
problem that arises once systemd switches the root. Anyway, if you want
to implement an auxiliary bus for mlx4, go for it.

Feel free to send me patches off-list and I will add them to our
regression testing, but be aware that you are stepping into a minefield
here.

Thanks