linux-fpga.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marco Pagani <marpagan@redhat.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Moritz Fischer <mdf@kernel.org>, Wu Hao <hao.wu@intel.com>,
	Xu Yilun <yilun.xu@intel.com>, Tom Rix <trix@redhat.com>,
	linux-kernel@vger.kernel.org, linux-fpga@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/2] fpga: set owner of fpga_manager_ops for existing low-level modules
Date: Fri, 22 Dec 2023 21:52:27 +0100	[thread overview]
Message-ID: <37f7a0dc-6983-437e-a338-65e1abd751c2@redhat.com> (raw)
In-Reply-To: <2023122118-captive-suburb-6717@gregkh>



On 2023-12-21 09:22, Greg Kroah-Hartman wrote:
> On Wed, Dec 20, 2023 at 11:24:20PM +0100, Marco Pagani wrote:
>>
>>
>> On 19/12/23 19:11, Greg Kroah-Hartman wrote:
>>> On Tue, Dec 19, 2023 at 06:17:20PM +0100, Marco Pagani wrote:
>>>>
>>>> On 2023-12-19 16:10, Greg Kroah-Hartman wrote:
>>>>> On Tue, Dec 19, 2023 at 03:54:25PM +0100, Marco Pagani wrote:
>>>>>>
>>>>>>
>>>>>> On 2023-12-18 21:33, Greg Kroah-Hartman wrote:
>>>>>>> On Mon, Dec 18, 2023 at 09:28:09PM +0100, Marco Pagani wrote:
>>>>>>>> This patch tentatively set the owner field of fpga_manager_ops to
>>>>>>>> THIS_MODULE for existing fpga manager low-level control modules.
>>>>>>>>
>>>>>>>> Signed-off-by: Marco Pagani <marpagan@redhat.com>
>>>>>>>> ---
>>>>>>>>  drivers/fpga/altera-cvp.c             | 1 +
>>>>>>>>  drivers/fpga/altera-pr-ip-core.c      | 1 +
>>>>>>>>  drivers/fpga/altera-ps-spi.c          | 1 +
>>>>>>>>  drivers/fpga/dfl-fme-mgr.c            | 1 +
>>>>>>>>  drivers/fpga/ice40-spi.c              | 1 +
>>>>>>>>  drivers/fpga/lattice-sysconfig.c      | 1 +
>>>>>>>>  drivers/fpga/machxo2-spi.c            | 1 +
>>>>>>>>  drivers/fpga/microchip-spi.c          | 1 +
>>>>>>>>  drivers/fpga/socfpga-a10.c            | 1 +
>>>>>>>>  drivers/fpga/socfpga.c                | 1 +
>>>>>>>>  drivers/fpga/stratix10-soc.c          | 1 +
>>>>>>>>  drivers/fpga/tests/fpga-mgr-test.c    | 1 +
>>>>>>>>  drivers/fpga/tests/fpga-region-test.c | 1 +
>>>>>>>>  drivers/fpga/ts73xx-fpga.c            | 1 +
>>>>>>>>  drivers/fpga/versal-fpga.c            | 1 +
>>>>>>>>  drivers/fpga/xilinx-spi.c             | 1 +
>>>>>>>>  drivers/fpga/zynq-fpga.c              | 1 +
>>>>>>>>  drivers/fpga/zynqmp-fpga.c            | 1 +
>>>>>>>>  18 files changed, 18 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
>>>>>>>> index 4ffb9da537d8..aeb913547dd8 100644
>>>>>>>> --- a/drivers/fpga/altera-cvp.c
>>>>>>>> +++ b/drivers/fpga/altera-cvp.c
>>>>>>>> @@ -520,6 +520,7 @@ static const struct fpga_manager_ops altera_cvp_ops = {
>>>>>>>>  	.write_init	= altera_cvp_write_init,
>>>>>>>>  	.write		= altera_cvp_write,
>>>>>>>>  	.write_complete	= altera_cvp_write_complete,
>>>>>>>> +	.owner		= THIS_MODULE,
>>>>>>>
>>>>>>> Note, this is not how to do this, force the compiler to set this for you
>>>>>>> automatically, otherwise everyone will always forget to do it.  Look at
>>>>>>> how functions like usb_register_driver() works.
>>>>>>>
>>>>>>> Also, are you _sure_ that you need a module owner in this structure?  I
>>>>>>> still don't know why...
>>>>>>>
>>>>>>
>>>>>> Do you mean moving the module owner field to the manager context and setting
>>>>>> it during registration with a helper macro?
>>>>>
>>>>> I mean set it during registration with a helper macro.
>>>>>
>>>>>> Something like:
>>>>>>
>>>>>> struct fpga_manager {
>>>>>> 	...
>>>>>> 	struct module *owner;
>>>>>> };
>>>>>>
>>>>>> #define fpga_mgr_register(parent, ...) \
>>>>>> 	__fpga_mgr_register(parent,..., THIS_MODULE)
>>>>>>
>>>>>> struct fpga_manager *
>>>>>> __fpga_mgr_register(struct device *parent, ..., struct module *owner)
>>>>>> {
>>>>>> 	...
>>>>>> 	mgr->owner = owner;
>>>>>> }
>>>>>
>>>>> Yes.
>>>>>
>>>>> But again, is a module owner even needed?  I don't think you all have
>>>>> proven that yet...
>>>>
>>>> Programming an FPGA involves a potentially lengthy sequence of interactions
>>>> with the reconfiguration engine. The manager conceptually organizes these
>>>> interactions as a sequence of ops. Low-level modules implement these ops/steps
>>>> for a specific device. If we don't protect the low-level module, someone might
>>>> unload it right when we are in the middle of a low-level op programming the
>>>> FPGA. As far as I know, the kernel would crash in that case.
>>>
>>> The only way an unload of a module can happen is if a user explicitly
>>> asks for it to be unloaded.  So they get what they ask for, right?
>>>
>>
>> Right, the user should get what he asked for, including hanging the
>> hardware. My only concern is that the kernel should not crash.
>>
>>> How do you "know" it is active?  And why doesn't the normal
>>> "driver/device" bindings prevent unloading from being a problem?  When
>>> you unload a module, you stop all ops on the driver, and then unregister
>>> it, which causes any future ones to fail.
>>>
>>> Or am I missing something here?
>>>
>>  
>> I think the problem is that the ops are not directly tied to the driver
>> of the manager's parent device.
> 
> Then that needs to be fixed right there, as that is obviously not using
> the driver model properly.
> 
> Why aren't the "ops" a driver that is bound to this device?  If it is
> the one responsible for controlling it, then it should be a driver and
> as such, the driver model logic will handle things if/when a module is
> unloaded to tear things down better.
> 
>> It is not even required to have a driver
>> to register a manager. The only way to know if the fpga manager is
>> active (i.e., someone is running one op) is by poking manager->state.
> 
> That too seems wrong, why is this?

I don't know. I was not around when the fpga subsystem was laid down.

> 
>> One possibility that comes into my mind, excluding a major reworking,
>> is waiting in fpga_mgr_unregister() until the manager reaches a steady
>> state (no ops are running) before unregistering the device. However, it
>> feels questionable because if one of the ops hangs, the module removal
>> will also hang.
> 
> You never know when a new operand will come in, so there's no way to
> know "all is quiet", sorry.
> 
> Try fixing this properly, buy using the driver model correctly, that
> should help resolve these issues automatically instead of hacked up
> module reference count attempts.
> 
> Remember, this is the whole reason why the driver model was created all
> those 20+ years ago, to move away from these module reference count
> issues, let's not forget history please.
> 

I do not entirely understand this part. The subsystem only provides an
in-kernel API for programming the fpga that in-kernel consumers can use.
The ops that the low-level module implements are used only internally by
the manager in a predefined order.

There is no standard interface for programming the fpga exposed to
userspace using file_operations or attributes exported via sysfs.
The manager only exports read-only attributes for status. On top
of that, there is only the support for device tree overlays.

Would it be correct to assume that the responsibility of keeping
the low-level module in while programming the fpga is on the kernel
component that consumes the subsystem's in-kernel API and (eventually)
exports a programming interface to userspace?

If we consider the case where the programming is done through a
userspace interface exported by the same module that implements the ops,
then we should be good even without taking the low-level module in the
manager.

However, I guess the decision to take the low-level module in the
manager was meant to address the case where the module implementing the
ops and the consumer of the in-kernel API (that may optionally export a
userspace interface for programming) are two separate entities.

Thanks,
Marco


  reply	other threads:[~2023-12-22 20:52 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-18 20:28 [RFC PATCH v3 0/2] fpga: improve protection against low-level control module unloading Marco Pagani
2023-12-18 20:28 ` [RFC PATCH v3 1/2] fpga: add an owner field and use it to take the low-level module's refcount Marco Pagani
2023-12-25  6:58   ` Xu Yilun
2024-01-03 15:02     ` Marco Pagani
2023-12-18 20:28 ` [RFC PATCH v3 2/2] fpga: set owner of fpga_manager_ops for existing low-level modules Marco Pagani
2023-12-18 20:33   ` Greg Kroah-Hartman
2023-12-19 14:54     ` Marco Pagani
2023-12-19 15:10       ` Greg Kroah-Hartman
2023-12-19 17:17         ` Marco Pagani
2023-12-19 18:11           ` Greg Kroah-Hartman
2023-12-20 22:24             ` Marco Pagani
2023-12-21  8:22               ` Greg Kroah-Hartman
2023-12-22 20:52                 ` Marco Pagani [this message]
2023-12-21  9:26             ` Xu Yilun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37f7a0dc-6983-437e-a338-65e1abd751c2@redhat.com \
    --to=marpagan@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hao.wu@intel.com \
    --cc=linux-fpga@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mdf@kernel.org \
    --cc=trix@redhat.com \
    --cc=yilun.xu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).