From: Leon Romanovsky <leon@kernel.org>
To: Petr Pavlu <petr.pavlu@suse.com>
Cc: tariqt@nvidia.com, yishaih@nvidia.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Part of devices not initialized with mlx4
Date: Tue, 3 Jan 2023 11:35:04 +0200
Message-ID: <Y7P2yECHeKvyqQqo@unreal>
In-Reply-To: <e939dbde-8905-fc98-5717-c555e05b708d@suse.com>

On Mon, Jan 02, 2023 at 11:33:15AM +0100, Petr Pavlu wrote:
> On 12/18/22 10:53, Leon Romanovsky wrote:
> > On Thu, Dec 15, 2022 at 10:51:15AM +0100, Petr Pavlu wrote:
> >> Hello,
> >>
> >> We have seen an issue where some ConnectX-3 devices are not initialized
> >> when the mlx4 drivers are part of the initrd.
> >
> > <...>
> >
> >> * Systemd stops running services and then sends SIGTERM to "unmanaged" tasks
> >> on the system to terminate them too. This includes the modprobe task.
> >> * Initialization of mlx4_en is interrupted in the middle of its init function.
> >
> > And why do you think that this systemd behaviour is the correct one?
>
> My view is that this is an issue between the kernel and initrd/systemd.
> Switching the root is a delicate operation and both parts need to carefully
> cooperate for it to work correctly.
>
> I think it is generally sensible that systemd tries to terminate any remaining
> processes started from the initrd. They would have trouble when the root is
> switched under their hands anyway, unless they are specifically prepared for
> it. Systemd only skips terminating kthreads and allows excluding root storage
> daemons. A modprobe helper could be excluded from being terminated too, but the
> problem with the root switch remains.
>
> It looks to me that a good approach is to complete all running module loads
> before switching the root and continue with any further loads after the
> operation is done. Leaving module loads to udevd assures this, hence the idea
> to use an auxiliary bus.

I'm not sure about it. Everything above is a user-space problem that
arises once systemd switches root. Anyway, if you want to use an
aux bus for mlx4, go for it.

Feel free to send me patches off-list and I will add them to our
regression testing, but be aware that you are stepping into a minefield
here.

Thanks