From: Marco Pagani <marpagan@redhat.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Moritz Fischer <mdf@kernel.org>, Wu Hao <hao.wu@intel.com>,
Xu Yilun <yilun.xu@intel.com>, Tom Rix <trix@redhat.com>,
linux-kernel@vger.kernel.org, linux-fpga@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/2] fpga: set owner of fpga_manager_ops for existing low-level modules
Date: Fri, 22 Dec 2023 21:52:27 +0100
Message-ID: <37f7a0dc-6983-437e-a338-65e1abd751c2@redhat.com>
In-Reply-To: <2023122118-captive-suburb-6717@gregkh>

On 2023-12-21 09:22, Greg Kroah-Hartman wrote:
> On Wed, Dec 20, 2023 at 11:24:20PM +0100, Marco Pagani wrote:
>>
>>
>> On 19/12/23 19:11, Greg Kroah-Hartman wrote:
>>> On Tue, Dec 19, 2023 at 06:17:20PM +0100, Marco Pagani wrote:
>>>>
>>>> On 2023-12-19 16:10, Greg Kroah-Hartman wrote:
>>>>> On Tue, Dec 19, 2023 at 03:54:25PM +0100, Marco Pagani wrote:
>>>>>>
>>>>>>
>>>>>> On 2023-12-18 21:33, Greg Kroah-Hartman wrote:
>>>>>>> On Mon, Dec 18, 2023 at 09:28:09PM +0100, Marco Pagani wrote:
>>>>>>>> This patch tentatively sets the owner field of fpga_manager_ops to
>>>>>>>> THIS_MODULE for existing fpga manager low-level control modules.
>>>>>>>>
>>>>>>>> Signed-off-by: Marco Pagani <marpagan@redhat.com>
>>>>>>>> ---
>>>>>>>> drivers/fpga/altera-cvp.c | 1 +
>>>>>>>> drivers/fpga/altera-pr-ip-core.c | 1 +
>>>>>>>> drivers/fpga/altera-ps-spi.c | 1 +
>>>>>>>> drivers/fpga/dfl-fme-mgr.c | 1 +
>>>>>>>> drivers/fpga/ice40-spi.c | 1 +
>>>>>>>> drivers/fpga/lattice-sysconfig.c | 1 +
>>>>>>>> drivers/fpga/machxo2-spi.c | 1 +
>>>>>>>> drivers/fpga/microchip-spi.c | 1 +
>>>>>>>> drivers/fpga/socfpga-a10.c | 1 +
>>>>>>>> drivers/fpga/socfpga.c | 1 +
>>>>>>>> drivers/fpga/stratix10-soc.c | 1 +
>>>>>>>> drivers/fpga/tests/fpga-mgr-test.c | 1 +
>>>>>>>> drivers/fpga/tests/fpga-region-test.c | 1 +
>>>>>>>> drivers/fpga/ts73xx-fpga.c | 1 +
>>>>>>>> drivers/fpga/versal-fpga.c | 1 +
>>>>>>>> drivers/fpga/xilinx-spi.c | 1 +
>>>>>>>> drivers/fpga/zynq-fpga.c | 1 +
>>>>>>>> drivers/fpga/zynqmp-fpga.c | 1 +
>>>>>>>> 18 files changed, 18 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
>>>>>>>> index 4ffb9da537d8..aeb913547dd8 100644
>>>>>>>> --- a/drivers/fpga/altera-cvp.c
>>>>>>>> +++ b/drivers/fpga/altera-cvp.c
>>>>>>>> @@ -520,6 +520,7 @@ static const struct fpga_manager_ops altera_cvp_ops = {
>>>>>>>> .write_init = altera_cvp_write_init,
>>>>>>>> .write = altera_cvp_write,
>>>>>>>> .write_complete = altera_cvp_write_complete,
>>>>>>>> + .owner = THIS_MODULE,
>>>>>>>
>>>>>>> Note, this is not how to do this, force the compiler to set this for you
>>>>>>> automatically, otherwise everyone will always forget to do it. Look at
>>>>>>> how functions like usb_register_driver() works.
>>>>>>>
>>>>>>> Also, are you _sure_ that you need a module owner in this structure? I
>>>>>>> still don't know why...
>>>>>>>
>>>>>>
>>>>>> Do you mean moving the module owner field to the manager context and setting
>>>>>> it during registration with a helper macro?
>>>>>
>>>>> I mean set it during registration with a helper macro.
>>>>>
>>>>>> Something like:
>>>>>>
>>>>>> struct fpga_manager {
>>>>>> ...
>>>>>> struct module *owner;
>>>>>> };
>>>>>>
>>>>>> #define fpga_mgr_register(parent, ...) \
>>>>>> __fpga_mgr_register(parent,..., THIS_MODULE)
>>>>>>
>>>>>> struct fpga_manager *
>>>>>> __fpga_mgr_register(struct device *parent, ..., struct module *owner)
>>>>>> {
>>>>>> ...
>>>>>> mgr->owner = owner;
>>>>>> }
>>>>>
>>>>> Yes.
>>>>>
>>>>> But again, is a module owner even needed? I don't think you all have
>>>>> proven that yet...
>>>>
>>>> Programming an FPGA involves a potentially lengthy sequence of interactions
>>>> with the reconfiguration engine. The manager conceptually organizes these
>>>> interactions as a sequence of ops. Low-level modules implement these ops/steps
>>>> for a specific device. If we don't protect the low-level module, someone might
>>>> unload it right when we are in the middle of a low-level op programming the
>>>> FPGA. As far as I know, the kernel would crash in that case.
>>>
>>> The only way an unload of a module can happen is if a user explicitly
>>> asks for it to be unloaded. So they get what they ask for, right?
>>>
>>
>> Right, the user should get what he asked for, including hanging the
>> hardware. My only concern is that the kernel should not crash.
>>
>>> How do you "know" it is active? And why doesn't the normal
>>> "driver/device" bindings prevent unloading from being a problem? When
>>> you unload a module, you stop all ops on the driver, and then unregister
>>> it, which causes any future ones to fail.
>>>
>>> Or am I missing something here?
>>>
>>
>> I think the problem is that the ops are not directly tied to the driver
>> of the manager's parent device.
>
> Then that needs to be fixed right there, as that is obviously not using
> the driver model properly.
>
> Why aren't the "ops" a driver that is bound to this device? If it is
> the one responsible for controlling it, then it should be a driver and
> as such, the driver model logic will handle things if/when a module is
> unloaded to tear things down better.
>
>> It is not even required to have a driver
>> to register a manager. The only way to know if the fpga manager is
>> active (i.e., someone is running one op) is by poking manager->state.
>
> That too seems wrong, why is this?

I don't know. I was not around when the FPGA subsystem was designed.

>
>> One possibility that comes into my mind, excluding a major reworking,
>> is waiting in fpga_mgr_unregister() until the manager reaches a steady
>> state (no ops are running) before unregistering the device. However, it
>> feels questionable because if one of the ops hangs, the module removal
>> will also hang.
>
> You never know when a new operand will come in, so there's no way to
> know "all is quiet", sorry.
>
> Try fixing this properly, by using the driver model correctly, that
> should help resolve these issues automatically instead of hacked up
> module reference count attempts.
>
> Remember, this is the whole reason why the driver model was created all
> those 20+ years ago, to move away from these module reference count
> issues, let's not forget history please.
>

I do not entirely understand this part. The subsystem only provides an
in-kernel API that in-kernel consumers can use to program the FPGA.
The ops that the low-level module implements are used only internally by
the manager, in a predefined order.

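As a rough illustration of what "a predefined order" means here, below is
a minimal userspace model, not the actual fpga-mgr code. The op names
write_init, write, and write_complete match the fields visible in the
quoted altera-cvp diff; everything prefixed model_ or fake_ is made up for
this sketch, and the real ops structure has more fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; these are NOT the kernel's definitions. */
struct model_mgr_ops {
	int (*write_init)(void *priv, const char *buf, size_t count);
	int (*write)(void *priv, const char *buf, size_t count);
	int (*write_complete)(void *priv);
};

/*
 * The "manager": runs the low-level ops in a fixed order and stops at
 * the first op that reports an error.
 */
static int model_mgr_load(const struct model_mgr_ops *ops, void *priv,
			  const char *buf, size_t count)
{
	int ret;

	assert(ops && ops->write_init && ops->write && ops->write_complete);

	ret = ops->write_init(priv, buf, count);
	if (ret)
		return ret;

	ret = ops->write(priv, buf, count);
	if (ret)
		return ret;

	return ops->write_complete(priv);
}

/* A fake low-level module that records which ops were called. */
struct fake_dev {
	char trace[8];		/* one letter per op, NUL-terminated */
	size_t written;		/* total bytes passed to write */
};

static void fake_trace(struct fake_dev *dev, char c)
{
	size_t len = strlen(dev->trace);

	if (len + 1 < sizeof(dev->trace)) {
		dev->trace[len] = c;
		dev->trace[len + 1] = '\0';
	}
}

static int fake_write_init(void *priv, const char *buf, size_t count)
{
	(void)buf;
	(void)count;
	fake_trace(priv, 'I');
	return 0;
}

static int fake_write(void *priv, const char *buf, size_t count)
{
	struct fake_dev *dev = priv;

	(void)buf;
	fake_trace(dev, 'W');
	dev->written += count;
	return 0;
}

static int fake_write_complete(void *priv)
{
	fake_trace(priv, 'C');
	return 0;
}

static const struct model_mgr_ops fake_ops = {
	.write_init = fake_write_init,
	.write = fake_write,
	.write_complete = fake_write_complete,
};
```

With a zero-initialised struct fake_dev, a call to
model_mgr_load(&fake_ops, &dev, buf, count) leaves "IWC" in dev.trace:
init, write, complete. The consumer never calls the individual ops itself.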
There is no standard interface for programming the FPGA exposed to
userspace through file_operations or sysfs attributes. The manager only
exports read-only status attributes. On top of that, there is only
support for device tree overlays.

Would it be correct to assume that the responsibility for keeping
the low-level module loaded while the FPGA is being programmed lies with
the kernel component that consumes the subsystem's in-kernel API and
(possibly) exports a programming interface to userspace?

If we consider the case where programming is done through a
userspace interface exported by the same module that implements the ops,
then we should be fine even without the manager taking a reference to
the low-level module.

However, I guess the decision to take a reference to the low-level
module in the manager was meant to address the case where the module
implementing the ops and the consumer of the in-kernel API (which may
optionally export a userspace interface for programming) are two
separate entities.

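To make that last case concrete, here is a small userspace model of the
manager pinning the low-level module while an op runs. It is only a
sketch, loosely modeled on the kernel's try_module_get()/module_put()
pair. All model_* names are hypothetical, and the real module refcount
and unload paths are considerably more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a loadable module; NOT the kernel's struct module. */
struct model_module {
	int refcnt;	/* number of users currently pinning the module */
	bool going;	/* set once unloading has started */
};

/* Take a reference unless the module is already being unloaded. */
static bool model_try_get(struct model_module *mod)
{
	if (mod->going)
		return false;
	mod->refcnt++;
	return true;
}

/* Drop a reference taken with model_try_get(). */
static void model_put(struct model_module *mod)
{
	assert(mod->refcnt > 0);
	mod->refcnt--;
}

/* Unloading is refused while anybody still holds a reference. */
static bool model_try_unload(struct model_module *mod)
{
	if (mod->refcnt > 0)
		return false;
	mod->going = true;
	return true;
}

struct model_mgr {
	struct model_module *owner;	/* module implementing the op */
	int (*op)(struct model_mgr *mgr);
	bool unload_refused;		/* filled in by the example op */
};

/* The manager pins the owner for the whole duration of the op. */
static int model_mgr_program(struct model_mgr *mgr)
{
	int ret;

	if (!model_try_get(mgr->owner))
		return -ENODEV;	/* low-level module is going away */

	ret = mgr->op(mgr);

	model_put(mgr->owner);
	return ret;
}

/* Example op: tries to unload its own module and records the outcome. */
static int model_op_try_unload(struct model_mgr *mgr)
{
	mgr->unload_refused = !model_try_unload(mgr->owner);
	return 0;
}
```

In this model, an unload attempted while the op is running is refused.
Once the op has returned and the reference is dropped, the unload
succeeds, and any later model_mgr_program() call fails with -ENODEV.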

Thanks,
Marco