From: Marco Pagani <marpagan@redhat.com>
To: Xu Yilun <yilun.xu@linux.intel.com>
Cc: Moritz Fischer <mdf@kernel.org>, Wu Hao <hao.wu@intel.com>,
Xu Yilun <yilun.xu@intel.com>, Tom Rix <trix@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Alan Tull <atull@opensource.altera.com>,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-fpga@vger.kernel.org
Subject: Re: [RFC PATCH v5 1/1] fpga: add an owner and use it to take the low-level module's refcount
Date: Fri, 1 Mar 2024 17:29:24 +0100 [thread overview]
Message-ID: <d4f5243c-696a-4d1d-94f4-0ecf42b6d7f0@redhat.com> (raw)
In-Reply-To: <ZeHwatupHVmC2N2+@yilunxu-OptiPlex-7050>
On 2024-03-01 16:12, Xu Yilun wrote:
> On Thu, Feb 29, 2024 at 11:37:10AM +0100, Marco Pagani wrote:
>>
>> On 2024-02-28 08:10, Xu Yilun wrote:
>>> On Tue, Feb 27, 2024 at 12:49:06PM +0100, Marco Pagani wrote:
>>>>
>>>>
>>>> On 2024-02-21 15:37, Xu Yilun wrote:
>>>>> On Tue, Feb 20, 2024 at 12:11:26PM +0100, Marco Pagani wrote:
>>>>>>
>>>>>>
>>>>>> On 2024-02-18 11:05, Xu Yilun wrote:
>>>>>>> On Mon, Feb 05, 2024 at 06:47:34PM +0100, Marco Pagani wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024-02-04 06:15, Xu Yilun wrote:
>>>>>>>>> On Fri, Feb 02, 2024 at 06:44:01PM +0100, Marco Pagani wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2024-01-30 05:31, Xu Yilun wrote:
>>>>>>>>>>>> +#define fpga_mgr_register_full(parent, info) \
>>>>>>>>>>>> + __fpga_mgr_register_full(parent, info, THIS_MODULE)
>>>>>>>>>>>> struct fpga_manager *
>>>>>>>>>>>> -fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
>>>>>>>>>>>> +__fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
>>>>>>>>>>>> + struct module *owner);
>>>>>>>>>>>>
>>>>>>>>>>>> +#define fpga_mgr_register(parent, name, mops, priv) \
>>>>>>>>>>>> + __fpga_mgr_register(parent, name, mops, priv, THIS_MODULE)
>>>>>>>>>>>> struct fpga_manager *
>>>>>>>>>>>> -fpga_mgr_register(struct device *parent, const char *name,
>>>>>>>>>>>> - const struct fpga_manager_ops *mops, void *priv);
>>>>>>>>>>>> +__fpga_mgr_register(struct device *parent, const char *name,
>>>>>>>>>>>> + const struct fpga_manager_ops *mops, void *priv, struct module *owner);
>>>>>>>>>>>> +
>>>>>>>>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr);
>>>>>>>>>>>>
>>>>>>>>>>>> +#define devm_fpga_mgr_register_full(parent, info) \
>>>>>>>>>>>> + __devm_fpga_mgr_register_full(parent, info, THIS_MODULE)
>>>>>>>>>>>> struct fpga_manager *
>>>>>>>>>>>> -devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
>>>>>>>>>>>> +__devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
>>>>>>>>>>>> + struct module *owner);
>>>>>>>>>>>
>>>>>>>>>>> Add a line here. I can do it myself if you agree.
>>>>>>>>>>
>>>>>>>>>> Sure, that is fine by me. I also spotted a typo in the commit log body
>>>>>>>>>> (in taken -> is taken). Do you want me to send a v6, or do you prefer
>>>>>>>>>> to fix that in place?
>>>>>>>>>
>>>>>>>>> No need, I can fix it.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> There is still a RFC prefix for this patch. Are you ready to get it merged?
>>>>>>>>>>> If yes, Acked-by: Xu Yilun <yilun.xu@intel.com>
>>>>>>>>>>
>>>>>>>>>> I'm ready for the patch to be merged. However, I recently sent an RFC
>>>>>>>>>> to propose a safer implementation of try_module_get() that would
>>>>>>>>>> simplify the code and may also benefit other subsystems. What do you
>>>>>>>>>> think?
>>>>>>>>>>
>>>>>>>>>> https://lore.kernel.org/linux-modules/20240130193614.49772-1-marpagan@redhat.com/
>>>>>>>>>
>>>>>>>>> I suggest take your fix to linux-fpga/for-next now. If your try_module_get()
>>>>>>>>> proposal is applied before the end of this cycle, we could re-evaluate
>>>>>>>>> this patch.
>>>>>>>>
>>>>>>>> That's fine by me.
>>>>>>>
>>>>>>> Sorry, I still found issues about this solution.
>>>>>>>
>>>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr)
>>>>>>> {
>>>>>>> dev_info(&mgr->dev, "%s %s\n", __func__, mgr->name);
>>>>>>>
>>>>>>> /*
>>>>>>> * If the low level driver provides a method for putting fpga into
>>>>>>> * a desired state upon unregister, do it.
>>>>>>> */
>>>>>>> fpga_mgr_fpga_remove(mgr);
>>>>>>>
>>>>>>> mutex_lock(&mgr->mops_mutex);
>>>>>>>
>>>>>>> mgr->mops = NULL;
>>>>>>>
>>>>>>> mutex_unlock(&mgr->mops_mutex);
>>>>>>>
>>>>>>> device_unregister(&mgr->dev);
>>>>>>> }
>>>>>>>
>>>>>>> Note that fpga_mgr_unregister() doesn't have to be called in module_exit().
>>>>>>> So if we do fpga_mgr_get() then fpga_mgr_unregister(), We finally had a
>>>>>>> fpga_manager dev without mops, this is not what the user want and cause
>>>>>>> problem when using this fpga_manager dev for other FPGA APIs.
>>>>>>
>>>>>> How about moving mgr->mops = NULL from fpga_mgr_unregister() to
>>>>>> class->dev_release()? In that way, mops will be set to NULL only when the
>>>>>> manager dev refcount reaches 0.
>>>>>
>>>>> I'm afraid it doesn't help. The lifecycle of the module and the fpga
>>>>> mgr dev is different.
>>>>>
>>>>> We use mops = NULL to indicate module has been freed or will be freed in no
>>>>> time. On the other hand mops != NULL means module is still there, so
>>>>> that try_module_get() could be safely called. It is possible someone
>>>>> has got fpga mgr dev but not the module yet, at that time the module is
>>>>> unloaded, then try_module_get() triggers crash.
>>>>>
>>>>>>
>>>>>> If fpga_mgr_unregister() is called from module_exit(), we are sure that nobody
>>>>>> got the manager dev earlier using fpga_mgr_get(), or it would have bumped up
>>>>>
>>>>> No, someone may get the manager dev but not the module yet, and been
>>>>> scheduled out.
>>>>>
>>>>
>>>> You are right. Overall, it's a bad idea. How about then using an additional
>>>> bool flag instead of "overloading" the mops pointer? Something like:
>>>>
>>>> get:
>>>> if (!mgr->owner_valid || !try_module_get(mgr->mops_owner))
>>>>
>>>> remove:
>>>> mgr->owner_valid = false;
>>>
>>> I'm not quite sure which function is actually mentioned by "remove". I
>>> assume it should be fpga_mgr_unregister().
>>
>> Yes, I was referring to fpga_mgr_unregister().
>>
>>> IIUC this flag means no more reference to fpga mgr, but existing
>>> references are still valid.
>>
>> Yes.
>>
>>>
>>> It works for me. But the name of this flag could be reconsidered to
>>> avoid misunderstanding. The owner is still valid (we still need to put
>>> the owner) but allows no more reference. Maybe "owner_inactive"?
>>
>> Right, owner_valid might be misleading. How about removing any
>> reference to the owner module and name the flag unreg?
>
> the full name "unregistered" is better.
That's fine by me.
>
>>
>> __fpga_mgr_get:
>> if (mgr->unreg || !try_module_get(mgr->mops_owner))
>> mgr = ERR_PTR(-ENODEV);
>>
>> fpga_mgr_unregister:
>> mgr->unreg = true;
>>
>>> I still wanna this owner reference change been splitted, so that
>>> we could simply revert it when the try_module_get_safe() got accepted.
>>
>> I guess it may take some time to have try_module_get_safe() accepted.
>> What do you prefer to do with the bridge and the region in the
>> meantime?
>
> This issue could happen in little chance. I actually don't have much
> preference, either way is good to me.
>
Okay, I'll also send the patch for the region then.
Thanks,
Marco
next prev parent reply other threads:[~2024-03-01 16:29 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-11 16:02 [RFC PATCH v5 0/1] fpga: improve protection against low-level control module unloading Marco Pagani
2024-01-11 16:02 ` [RFC PATCH v5 1/1] fpga: add an owner and use it to take the low-level module's refcount Marco Pagani
2024-01-30 4:31 ` Xu Yilun
2024-02-02 17:44 ` Marco Pagani
2024-02-04 5:15 ` Xu Yilun
2024-02-05 17:47 ` Marco Pagani
2024-02-18 10:05 ` Xu Yilun
2024-02-20 11:11 ` Marco Pagani
2024-02-21 14:37 ` Xu Yilun
2024-02-27 11:49 ` Marco Pagani
2024-02-28 7:10 ` Xu Yilun
2024-02-29 10:37 ` Marco Pagani
2024-03-01 15:12 ` Xu Yilun
2024-03-01 16:29 ` Marco Pagani [this message]
2024-03-04 11:51 ` Marco Pagani
2024-03-04 13:49 ` Xu Yilun
2024-03-04 22:26 ` Marco Pagani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d4f5243c-696a-4d1d-94f4-0ecf42b6d7f0@redhat.com \
--to=marpagan@redhat.com \
--cc=atull@opensource.altera.com \
--cc=corbet@lwn.net \
--cc=gregkh@linuxfoundation.org \
--cc=hao.wu@intel.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fpga@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mdf@kernel.org \
--cc=trix@redhat.com \
--cc=yilun.xu@intel.com \
--cc=yilun.xu@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).