* [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Jiang Liu @ 2015-05-04 3:15 UTC
To: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
Yinghai Lu, Borislav Petkov, Amir Vadai, Ido Shamay,
David S. Miller, Or Gerlitz, Eric Dumazet, Hadar Hen Zion,
Eran Ben Elisha, Joe Perches, Saeed Mahameed, Matan Barak
Cc: Jiang Liu, Konrad Rzeszutek Wilk, Tony Luck, x86, linux-kernel,
linux-pci, linux-acpi, netdev
The field 'affinity' in irq_desc won't change once the irq_desc data
structure is created. So cache irq_desc->affinity instead of irq_desc.
This also helps to hide struct irq_desc from device drivers.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 6 +++---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 5 +----
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 22da4d0d0f05..a03a01625398 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -31,6 +31,7 @@
*
*/
+#include <linux/irq.h>
#include <linux/mlx4/cq.h>
#include <linux/mlx4/qp.h>
#include <linux/mlx4/cmd.h>
@@ -135,9 +136,8 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
mdev->dev->caps.num_comp_vectors;
}
- cq->irq_desc =
- irq_to_desc(mlx4_eq_get_irq(mdev->dev,
- cq->vector));
+ cq->irq_affinity = irq_get_affinity_mask(
+ mlx4_eq_get_irq(mdev->dev, cq->vector));
} else {
/* For TX we use the same irq per
ring we assigned for the RX */
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 4fdd3c37e47b..59d8395ef31c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -1022,14 +1022,11 @@ int mlx4_en_poll_rx_cq(struct napi_struct *napi, int budget)
/* If we used up all the quota - we're probably not done yet... */
if (done == budget) {
int cpu_curr;
- const struct cpumask *aff;
INC_PERF_COUNTER(priv->pstats.napi_quota);
cpu_curr = smp_processor_id();
- aff = irq_desc_get_irq_data(cq->irq_desc)->affinity;
-
- if (likely(cpumask_test_cpu(cpu_curr, aff)))
+ if (likely(cpumask_test_cpu(cpu_curr, cq->irq_affinity)))
return budget;
/* Current cpu is not according to smp_irq_affinity -
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 9de30216b146..5de5e9a51d76 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -357,7 +357,7 @@ struct mlx4_en_cq {
#define CQ_USER_PEND (MLX4_EN_CQ_STATE_POLL | MLX4_EN_CQ_STATE_POLL_YIELD)
spinlock_t poll_lock; /* protects from LLS/napi conflicts */
#endif /* CONFIG_NET_RX_BUSY_POLL */
- struct irq_desc *irq_desc;
+ struct cpumask *irq_affinity;
};
struct mlx4_en_port_profile {
--
1.7.10.4
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Amir Vadai @ 2015-05-04 12:10 UTC
To: Jiang Liu
Cc: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
Yinghai Lu, Borislav Petkov, Amir Vadai, Ido Shamay,
David S. Miller, Or Gerlitz, Eric Dumazet, Hadar Hen Zion,
Eran Ben Elisha, Joe Perches, Saeed Mahameed, Matan Barak,
Konrad Rzeszutek Wilk, Tony Luck, x86,
linux-kernel@vger.kernel.org, linux-pci
On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
> The field 'affinity' in irq_desc won't change once the irq_desc data
> structure is created. So cache irq_desc->affinity instead of irq_desc.
> This also helps to hide struct irq_desc from device drivers.
Hi Jiang,
I might not understand the new changes to the irq core, but up until
now affinity was changed when the user changed it through
/proc/irq/<IRQ>/smp_affinity.
This code monitors the affinity from the napi_poll context to detect
affinity changes and prevent napi from continuing to run on the
wrong CPU.
Therefore, the affinity can't be cached at the beginning. Please
revert this caching.
Thanks,
Amir
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Jiang Liu @ 2015-05-04 14:00 UTC
To: Amir Vadai
Cc: Thomas Gleixner, Bjorn Helgaas, Benjamin Herrenschmidt,
Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
Yinghai Lu, Borislav Petkov, Ido Shamay, David S. Miller,
Or Gerlitz, Eric Dumazet, Hadar Hen Zion, Eran Ben Elisha,
Joe Perches, Saeed Mahameed, Matan Barak, Konrad Rzeszutek Wilk,
Tony Luck, x86, linux-kernel@vger.kernel.org, linux-pci,
linux-acpi
On 2015/5/4 20:10, Amir Vadai wrote:
> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>> The field 'affinity' in irq_desc won't change once the irq_desc data
>> structure is created. So cache irq_desc->affinity instead of irq_desc.
>> This also helps to hide struct irq_desc from device drivers.
>
> Hi Jiang,
>
> I might not understand the new changes to the irq core, but up until
> now affinity was changed when the user changed it through
> /proc/irq/<IRQ>/smp_affinity.
> This code monitors the affinity from the napi_poll context to detect
> affinity changes and prevent napi from continuing to run on the
> wrong CPU.
> Therefore, the affinity can't be cached at the beginning. Please
> revert this caching.
Hi Amir,
Thanks for the review :) We want to hide the irq_desc implementation
details from device drivers, so we made these changes.
The function irq_get_affinity_mask() returns a 'struct cpumask *'
and we cache the returned pointer. On the other hand, the user may
change the IRQ affinity through /proc/irq/<IRQ>/smp_affinity, but
that only changes the bitmap pointed to by the cached pointer and
won't change the pointer itself. So cpumask_test_cpu(cpu_curr,
cq->irq_affinity) should always reflect the latest affinity setting.
Or am I missing something here?
Thanks!
Gerry
>
> Thanks,
> Amir
>
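For illustration, a minimal sketch of the invariant described above: the
pointer returned by irq_get_affinity_mask() stays the same for the life
of the irq_desc, while writing /proc/irq/<IRQ>/smp_affinity only changes
the bits behind it. Only irq_get_affinity_mask() and cpumask_test_cpu()
come from the patch; the example_* names are invented here.

#include <linux/irq.h>
#include <linux/cpumask.h>
#include <linux/smp.h>

static struct cpumask *cached_affinity;

/* Cache the pointer once, e.g. when the completion queue is activated. */
static void example_cache_affinity(unsigned int irq)
{
        cached_affinity = irq_get_affinity_mask(irq);
}

/* Called later from the polling path; no irq_desc access needed. */
static bool example_on_affine_cpu(void)
{
        /*
         * Sees the latest smp_affinity setting even though the pointer
         * was cached earlier, because only the bitmap contents change.
         */
        return cpumask_test_cpu(smp_processor_id(), cached_affinity);
}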
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Thomas Gleixner @ 2015-05-04 15:10 UTC
To: Amir Vadai
Cc: Jiang Liu, Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
Borislav Petkov, Ido Shamay, David S. Miller, Or Gerlitz,
Eric Dumazet, Hadar Hen Zion, Eran Ben Elisha, Joe Perches,
Saeed Mahameed, Matan Barak, Konrad Rzeszutek Wilk, Tony Luck,
x86, linux-kernel@vger.kernel.org, linux-pci, linux-acpi
On Mon, 4 May 2015, Amir Vadai wrote:
> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
> > The field 'affinity' in irq_desc won't change once the irq_desc data
> > structure is created. So cache irq_desc->affinity instead of irq_desc.
> > This also helps to hide struct irq_desc from device drivers.
>
> Hi Jiang,
>
> I might not understand the new changes to the irq core, but up until
> now affinity was changed when the user changed it through
> /proc/irq/<IRQ>/smp_affinity.
> This code monitors the affinity from the napi_poll context to detect
> affinity changes and prevent napi from continuing to run on the
> wrong CPU.
> Therefore, the affinity can't be cached at the beginning. Please
> revert this caching.
Sigh. All of this is completely wrong. Polling in the internals of
irq_desc is just crap.
irq_set_affinity_notifier() has been added for exactly this purpose.
Please get rid of your irq_desc abuse.
Thanks,
tglx
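For reference, a rough sketch of the interface Thomas points at. The
struct irq_affinity_notify callbacks and irq_set_affinity_notifier() are
the genirq API; the my_* names and the driver-side mask copy are
invented for the example.

#include <linux/interrupt.h>
#include <linux/cpumask.h>
#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/kref.h>

struct my_queue {
        struct irq_affinity_notify notify;
        cpumask_var_t affinity;         /* driver-owned copy of the mask */
};

/* Called from a workqueue whenever the IRQ affinity changes. */
static void my_affinity_notify(struct irq_affinity_notify *notify,
                               const cpumask_t *mask)
{
        struct my_queue *q = container_of(notify, struct my_queue, notify);

        cpumask_copy(q->affinity, mask);
}

/* Dropped reference; nothing dynamically allocated to free in this sketch. */
static void my_affinity_release(struct kref *ref)
{
}

static int my_queue_register_notifier(struct my_queue *q, unsigned int irq)
{
        if (!zalloc_cpumask_var(&q->affinity, GFP_KERNEL))
                return -ENOMEM;

        q->notify.notify = my_affinity_notify;
        q->notify.release = my_affinity_release;
        return irq_set_affinity_notifier(irq, &q->notify);
}

One catch, relevant to the rest of the thread: only one such notifier can
be attached to an IRQ at a time, and the cpu_rmap code already registers
one, which is the constraint Amir raises below.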
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Amir Vadai @ 2015-05-05 9:07 UTC
To: Jiang Liu
Cc: Amir Vadai, Thomas Gleixner, Bjorn Helgaas,
Benjamin Herrenschmidt, Ingo Molnar, H. Peter Anvin,
Rafael J. Wysocki, Randy Dunlap, Yinghai Lu, Borislav Petkov,
Ido Shamay, David S. Miller, Or Gerlitz, Eric Dumazet,
Hadar Hen Zion, Eran Ben Elisha, Joe Perches, Saeed Mahameed,
Matan Barak, Konrad Rzeszutek Wilk, Tony Luck, x86,
linux-kernel@vger.kernel.org, linux-pci
On Mon, May 4, 2015 at 5:00 PM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
> On 2015/5/4 20:10, Amir Vadai wrote:
>> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>>> The field 'affinity' in irq_desc won't change once the irq_desc data
>>> structure is created. So cache irq_desc->affinity instead of irq_desc.
>>> This also helps to hide struct irq_desc from device drivers.
>>
>> Hi Jiang,
>>
>> I might not understand the new changes to the irq core, but up until
>> now affinity was changed when the user changed it through
>> /proc/irq/<IRQ>/smp_affinity.
>> This code monitors the affinity from the napi_poll context to detect
>> affinity changes and prevent napi from continuing to run on the
>> wrong CPU.
>> Therefore, the affinity can't be cached at the beginning. Please
>> revert this caching.
> Hi Amir,
> Thanks for the review :) We want to hide the irq_desc implementation
> details from device drivers, so we made these changes.
> The function irq_get_affinity_mask() returns a 'struct cpumask *'
> and we cache the returned pointer. On the other hand, the user may
> change the IRQ affinity through /proc/irq/<IRQ>/smp_affinity, but
> that only changes the bitmap pointed to by the cached pointer and
> won't change the pointer itself. So cpumask_test_cpu(cpu_curr,
> cq->irq_affinity) should always reflect the latest affinity setting.
> Or am I missing something here?
No - you are completely right - I missed it.
Anyway, I tested it to make sure, and everything works as expected.
Acked-by: Amir Vadai <amirv@mellanox.com>
Thanks,
Amir
> Thanks!
> Gerry
>>
>> Thanks,
>> Amir
>>
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Amir Vadai @ 2015-05-05 9:17 UTC
To: Thomas Gleixner
Cc: Amir Vadai, Jiang Liu, Bjorn Helgaas, Benjamin Herrenschmidt,
Ingo Molnar, H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap,
Yinghai Lu, Borislav Petkov, Ido Shamay, David S. Miller,
Or Gerlitz, Eric Dumazet, Hadar Hen Zion, Eran Ben Elisha,
Joe Perches, Saeed Mahameed, Matan Barak, Konrad Rzeszutek Wilk,
Tony Luck, x86, linux-kernel@vger.kernel.org, linux-pci
On Mon, May 4, 2015 at 6:10 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 4 May 2015, Amir Vadai wrote:
>
>> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>> > The field 'affinity' in irq_desc won't change once the irq_desc data
>> > structure is created. So cache irq_desc->affinity instead of irq_desc.
>> > This also helps to hide struct irq_desc from device drivers.
>>
>> Hi Jiang,
>>
>> I might not understand the new changes to the irq core, but up until
>> now affinity was changed when the user changed it through
>> /proc/irq/<IRQ>/smp_affinity.
>> This code monitors the affinity from the napi_poll context to detect
>> affinity changes and prevent napi from continuing to run on the
>> wrong CPU.
>> Therefore, the affinity can't be cached at the beginning. Please
>> revert this caching.
>
> Sigh. All of this is completely wrong. Polling in the internals of
> irq_desc is just crap.
Hi,
>
> irq_set_affinity_notifier() has been added for exactly this purpose.
The rmap is already registered via irq_set_affinity_notifier().
Actually, I did post an extension to the affinity notifier to use an
affinity notifier chain, which you rejected [1] (and which, BTW,
convinced me that this is not the way to do it).
The current code follows your suggestion.
[1] - https://lkml.org/lkml/2014/5/26/222
Amir
>
> Please get rid of your irq_desc abuse.
>
> Thanks,
>
> tglx
>
>
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Thomas Gleixner @ 2015-05-05 14:53 UTC
To: Amir Vadai
Cc: Jiang Liu, Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
Borislav Petkov, Ido Shamay, David S. Miller, Or Gerlitz,
Eric Dumazet, Hadar Hen Zion, Eran Ben Elisha, Joe Perches,
Saeed Mahameed, Matan Barak, Konrad Rzeszutek Wilk, Tony Luck,
x86, linux-kernel@vger.kernel.org, linux-pci, linux-acpi
On Tue, 5 May 2015, Amir Vadai wrote:
> On Mon, May 4, 2015 at 6:10 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 4 May 2015, Amir Vadai wrote:
> >
> >> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
> >> > The field 'affinity' in irq_desc won't change once the irq_desc data
> >> > structure is created. So cache irq_desc->affinity instead of irq_desc.
> >> > This also helps to hide struct irq_desc from device drivers.
> >>
> >> Hi Jiang,
> >>
>>>> I might not understand the new changes to the irq core, but up until
>>>> now affinity was changed when the user changed it through
>>>> /proc/irq/<IRQ>/smp_affinity.
>>>> This code monitors the affinity from the napi_poll context to detect
>>>> affinity changes and prevent napi from continuing to run on the
>>>> wrong CPU.
>>>> Therefore, the affinity can't be cached at the beginning. Please
>>>> revert this caching.
> >
> > Sigh. All of this is completely wrong. Polling in the internals of
> > irq_desc is just crap.
> Hi,
>
> >
> > irq_set_affinity_notifier() has been added for exactly this purpose.
> The rmap is already registered via irq_set_affinity_notifier().
> Actually, I did post an extension to the affinity notifier to use an
> affinity notifier chain, which you rejected [1] (and which, BTW,
> convinced me that this is not the way to do it).
> The current code follows your suggestion.
>
> [1] - https://lkml.org/lkml/2014/5/26/222
No, it's not. I wrote:
"So what's the point of adding notifier call chain complexity,
ordering problems etc., if you can simply note the fact that the
affinity changed in the rmap itself and check that in the RX path?"
I did not tell you to fiddle in irqdesc, really.
Thanks,
tglx
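A rough sketch of the direction described in the quote above: note the
affinity change somewhere the driver owns and check it in the RX path.
Every name below is invented, and where the notification comes from
(the rmap hook or something else) is deliberately left open.

#include <linux/atomic.h>
#include <linux/types.h>

struct my_rx_queue {
        atomic_t affinity_changed;      /* set when the affinity changes */
};

/* Called from whatever receives the affinity change notification. */
static void my_note_affinity_change(struct my_rx_queue *q)
{
        atomic_set(&q->affinity_changed, 1);
}

/* Called from the NAPI poll path instead of peeking into irq_desc. */
static bool my_napi_should_yield(struct my_rx_queue *q)
{
        /* Test-and-clear so we reschedule only once per affinity change. */
        return atomic_xchg(&q->affinity_changed, 0) != 0;
}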
* Re: [RFC v1 07/11] net/mlx4: Cache irq_desc->affinity instead of irq_desc
From: Amir Vadai @ 2015-05-07 10:41 UTC
To: Thomas Gleixner
Cc: Jiang Liu, Bjorn Helgaas, Benjamin Herrenschmidt, Ingo Molnar,
H. Peter Anvin, Rafael J. Wysocki, Randy Dunlap, Yinghai Lu,
Borislav Petkov, Ido Shamay, David S. Miller, Or Gerlitz,
Eric Dumazet, Hadar Hen Zion, Eran Ben Elisha, Joe Perches,
Saeed Mahameed, Matan Barak, Konrad Rzeszutek Wilk, Tony Luck,
x86@kernel.org, linux-kernel@vger.kernel.org,
"linux-pci@vger.kernel.org" <
On 5/5/2015 5:53 PM, Thomas Gleixner wrote:
> On Tue, 5 May 2015, Amir Vadai wrote:
>
>> On Mon, May 4, 2015 at 6:10 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>> On Mon, 4 May 2015, Amir Vadai wrote:
>>>
>>>> On Mon, May 4, 2015 at 6:15 AM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>>>>> The field 'affinity' in irq_desc won't change once the irq_desc data
>>>>> structure is created. So cache irq_desc->affinity instead of irq_desc.
>>>>> This also helps to hide struct irq_desc from device drivers.
>>>>
>>>> Hi Jiang,
>>>>
>>>> I might not understand the new changes to the irq core, but up until
>>>> now affinity was changed when the user changed it through
>>>> /proc/irq/<IRQ>/smp_affinity.
>>>> This code monitors the affinity from the napi_poll context to detect
>>>> affinity changes and prevent napi from continuing to run on the
>>>> wrong CPU.
>>>> Therefore, the affinity can't be cached at the beginning. Please
>>>> revert this caching.
>>>
>>> Sigh. All of this is completely wrong. Polling in the internals of
>>> irq_desc is just crap.
>> Hi,
>>
>>>
>>> irq_set_affinity_notifier() has been added for exactly this purpose.
>> The rmap is already registered via irq_set_affinity_notifier().
>> Actually, I did post an extension to the affinity notifier to use an
>> affinity notifier chain, which you rejected [1] (and which, BTW,
>> convinced me that this is not the way to do it).
>> The current code follows your suggestion.
>>
>> [1] - https://lkml.org/lkml/2014/5/26/222
>
> No, it's not. I wrote:
>
> "So what's the point of adding notifier call chain complexity,
> ordering problems etc., if you can simply note the fact that the
> affinity changed in the rmap itself and check that in the RX path?"
>
> I did not tell you to fiddle in irqdesc, really.
>
> Thanks,
>
> tglx
>
It is problematic, since I didn't want to tie the napi polling affinity
issue to CONFIG_CPU_RMAP being selected. That's why I CC'd you when I
submitted the patch [1]. I guess it was my bad that I didn't do it in a
better way that would have given you a chance to fully review it.
Anyway, I understand now that I shouldn't access irq_desc attributes
from a device driver.
I will see if I can live with disabling this whole feature (migrating a
currently looping NAPI when the affinity is changed) when
CONFIG_CPU_RMAP is not selected. If not, I might modify the rmap code
to give me this information even when it is not selected. I assume that
the rmap notifier callback can access this information.
Thanks,
Amir