From: Shay Drori <shayd@nvidia.com>
To: Przemek Kitszel <przemyslaw.kitszel@intel.com>,
<netdev@vger.kernel.org>, <pabeni@redhat.com>,
<davem@davemloft.net>, <kuba@kernel.org>, <edumazet@google.com>,
<gregkh@linuxfoundation.org>, <david.m.ertman@intel.com>
Cc: <rafael@kernel.org>, <ira.weiny@intel.com>,
<linux-rdma@vger.kernel.org>, <leon@kernel.org>,
<tariqt@nvidia.com>, Parav Pandit <parav@nvidia.com>
Subject: Re: [PATCH net-next v3 1/2] driver core: auxiliary bus: show auxiliary device IRQs
Date: Sun, 12 May 2024 10:26:21 +0300 [thread overview]
Message-ID: <0ab29049-4635-4018-8e11-bffddbd437d5@nvidia.com> (raw)
In-Reply-To: <abfa794a-e129-414a-bb5e-815eeb13131e@intel.com>
On 10/05/2024 16:07, Przemek Kitszel wrote:
> External email: Use caution opening links or attachments
>
>
> please not that v4+ is already being discussed
>
> On 5/8/24 13:33, Shay Drori wrote:
>> On 08/05/2024 12:34, Przemek Kitszel wrote:
>
> // ...
>
>>>> +
>>>> +/* Auxiliary devices can share IRQs. Expose to user whether the
>>>> provided IRQ is
>>>> + * shared or exclusive.
>>>> + */
>>>> +static ssize_t auxiliary_irq_mode_show(struct device *dev,
>>>> + struct device_attribute *attr,
>>>> char *buf)
>>>> +{
>>>> + struct auxiliary_irq_info *info =
>>>> + container_of(attr, struct auxiliary_irq_info,
>>>> sysfs_attr);
>>>> +
>>>> + if (refcount_read(xa_load(&irqs, info->irq)) > 1)
>>>
>>> I didn't checked if it is possible with current implementation, but
>>> please imagine a scenario where user open()'s sysfs file, then triggers
>>> operation to remove irq (to call auxiliary_irq_destroy()), and only then
>>> read()'s sysfs contents, what results in nullptr dereference (xa_load()
>>> returning NULL). Splitting the code into two if statements would resolve
>>> this issue.
>>
>> the first function in auxiliary_irq_destroy() is removing the sysfs.
>> I don't see how after that user can read() the sysfs...
>
> Let me illustrate, but with my running kernel instead of your series:
> # strace cat /sys/class/net/enp0s31f6/duplex 2>&1 | grep -e open -e read
> yields (among others):
> openat(AT_FDCWD, "/sys/class/net/enp0s31f6/duplex", O_RDONLY) = 3
> read(3, "full\n", 131072) = 5
>
> And now imagine that other, concurrent user app or any HW event triggers
> this IRQ removal (resulting with xarray entry removed (!), likely sysfs
> attr refcount dropped to 0 [A], so new open()s will be declined, but
> that is irrelevant).
> My assumption is that, until close()d, user is free to call read() on
> fd received from openat(), but it's possible that xa_load() would return
> NULL (because of [A] above).
>
>>
>>>
>>>> + return sysfs_emit(buf, "%s\n", "shared");
>>>> + else
>>>> + return sysfs_emit(buf, "%s\n", "exclusive");
>>>> +}
>>>> +
>>>> +static void auxiliary_irq_destroy(int irq)
>>>> +{
>>>> + refcount_t *ref;
>>>> +
>>>> + xa_lock(&irqs);
>>>> + ref = xa_load(&irqs, irq);
>>>> + if (refcount_dec_and_test(ref)) {
>>>> + __xa_erase(&irqs, irq);
>>>> + kfree(ref);
>>>> + }
>>>> + xa_unlock(&irqs);
>>>> +}
>>>> +
>>>> +static int auxiliary_irq_create(int irq)
>>>> +{
>>>> + refcount_t *ref;
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&irqs_lock);
>>>
>>> [1] xa_lock() instead ...
>>>
>>>> + ref = xa_load(&irqs, irq);
>>>> + if (ref && refcount_inc_not_zero(ref))
>>>> + goto out;
>>>
>>> `&& refcount_inc_not_zero()` here means: leak memory and wreak havoc on
>>> saturation, instead the logic should be:
>>> if (ref) {
>>> refcount_inc(ref);
>>> goto out;
>>> }
>>>
>
>
> <digression>
>
>>> anyway allocating under a lock taken is not the best idea in general,
>>> although xarray API somehow encourages this -
>>
>>> alternative is to
>>> preallocate and free when not used, or some lock dance that will be easy
>>> to get wrong - and that's the raison d'etre of xa_reserve() :)
>>
>> I don't understand what you picture here?
>
> Here I was digressing, sorry for not marking it clearly as that.
> IMO xarray API need an extension to make this and similar use case
> easier to code right. I will CC you ofc.
> </digression>
>
>> xa_reserve() can drop the lock while allocating the xa_entry, so how it
>> will help?
>>
>>>
>>>> +
>>>> + ref = kzalloc(sizeof(*ref), GFP_KERNEL);
>>>> + if (!ref) {
>>>> + ret = -ENOMEM;
>>>> + goto out;
>>>> + }
>>>> +
>>>> + refcount_set(ref, 1);
>>>> + ret = xa_insert(&irqs, irq, ref, GFP_KERNEL);
>>>
>>> [2] ... then __xa_insert() here
>>
>> __xa_insert() can drop the lock as well...
>
> Thank you for pointing it to me.
>
> Let's move future discussion on this series to your newer submissions.
thanks for the quick reviews :)
lets continue in the v4 series.
>
> // ...
next prev parent reply other threads:[~2024-05-12 7:26 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-08 4:04 [PATCH net-next v3 0/2] Introduce auxiliary bus IRQs sysfs Shay Drory
2024-05-08 4:04 ` [PATCH net-next v3 1/2] driver core: auxiliary bus: show auxiliary device IRQs Shay Drory
2024-05-08 9:34 ` Przemek Kitszel
2024-05-08 11:33 ` Shay Drori
2024-05-10 13:07 ` Przemek Kitszel
2024-05-12 7:26 ` Shay Drori [this message]
2024-05-08 4:04 ` [PATCH net-next v3 2/2] net/mlx5: Expose SFs IRQs Shay Drory
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0ab29049-4635-4018-8e11-bffddbd437d5@nvidia.com \
--to=shayd@nvidia.com \
--cc=davem@davemloft.net \
--cc=david.m.ertman@intel.com \
--cc=edumazet@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=ira.weiny@intel.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=parav@nvidia.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=rafael@kernel.org \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).