* Re: [PATCH 3/3] powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache
From: Gautham R Shenoy @ 2020-12-08 17:56 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, Gautham R. Shenoy, Michael Neuling,
Vaidyanathan Srinivasan, Peter Zijlstra, linux-kernel,
Nicholas Piggin, linuxppc-dev, Valentin Schneider
In-Reply-To: <20201207131138.GJ528281@linux.vnet.ibm.com>
On Mon, Dec 07, 2020 at 06:41:38PM +0530, Srikar Dronamraju wrote:
> * Gautham R. Shenoy <ego@linux.vnet.ibm.com> [2020-12-04 10:18:47]:
>
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >
> > +extern bool thread_group_shares_l2;
> > /*
> > * On big-core systems, each core has two groups of CPUs each of which
> > * has its own L1-cache. The thread-siblings which share l1-cache with
> > * @cpu can be obtained via cpu_smallcore_mask().
> > + *
> > + * On some big-core systems, the L2 cache is shared only between some
> > + * groups of siblings. This is already parsed and encoded in
> > + * cpu_l2_cache_mask().
> > */
> > static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache *cache)
> > {
> > if (cache->level == 1)
> > return cpu_smallcore_mask(cpu);
> > + if (cache->level == 2 && thread_group_shares_l2)
> > + return cpu_l2_cache_mask(cpu);
> >
> > return &cache->shared_cpu_map;
>
> As pointed with lkp@intel.org, we need to do this only with #CONFIG_SMP,
> even for cache->level = 1 too.
Yes, I have fixed that in the next version.
>
> I agree that we are displaying shared_cpu_map correctly. Should we have also
> update /clear shared_cpu_map in the first place. For example:- If for a P9
> core with CPUs 0-7, the cache->shared_cpu_map for L1 would have 0-7 but
> would display 0,2,4,6.
>
> The drawback of this is even if cpus 0,2,4,6 are released L1 cache will not
> be released. Is this as expected?
cacheinfo populates the cache->shared_cpu_map on the basis of which
CPUs share the common device-tree node for a particular cache. There
is one l1-cache object in the device-tree for a CPU node corresponding
to a big-core. That the L1 is further split between the threads of the
core is shown using ibm,thread-groups.
The ideal thing would be to add a "group_leader" field to "struct
cache" so that we can create separate cache objects , one per thread
group. I will take a stab at this in the v2.
Thanks for the review comments.
>
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 2/3] powerpc/smp: Add support detecting thread-groups sharing L2 cache
From: Gautham R Shenoy @ 2020-12-08 17:42 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, Gautham R. Shenoy, Michael Neuling,
Vaidyanathan Srinivasan, Peter Zijlstra, linux-kernel,
Nicholas Piggin, linuxppc-dev, Valentin Schneider
In-Reply-To: <20201207124039.GI528281@linux.vnet.ibm.com>
Hello Srikar,
On Mon, Dec 07, 2020 at 06:10:39PM +0530, Srikar Dronamraju wrote:
> * Gautham R. Shenoy <ego@linux.vnet.ibm.com> [2020-12-04 10:18:46]:
>
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > On POWER systems, groups of threads within a core sharing the L2-cache
> > can be indicated by the "ibm,thread-groups" property array with the
> > identifier "2".
> >
> > This patch adds support for detecting this, and when present, populate
> > the populating the cpu_l2_cache_mask of every CPU to the core-siblings
> > which share L2 with the CPU as specified in the by the
> > "ibm,thread-groups" property array.
> >
> > On a platform with the following "ibm,thread-group" configuration
> > 00000001 00000002 00000004 00000000
> > 00000002 00000004 00000006 00000001
> > 00000003 00000005 00000007 00000002
> > 00000002 00000004 00000000 00000002
> > 00000004 00000006 00000001 00000003
> > 00000005 00000007
> >
> > Without this patch, the sched-domain hierarchy for CPUs 0,1 would be
> > CPU0 attaching sched-domain(s):
> > domain-0: span=0,2,4,6 level=SMT
> > domain-1: span=0-7 level=CACHE
> > domain-2: span=0-15,24-39,48-55 level=MC
> > domain-3: span=0-55 level=DIE
> >
> > CPU1 attaching sched-domain(s):
> > domain-0: span=1,3,5,7 level=SMT
> > domain-1: span=0-7 level=CACHE
> > domain-2: span=0-15,24-39,48-55 level=MC
> > domain-3: span=0-55 level=DIE
> >
> > The CACHE domain at 0-7 is incorrect since the ibm,thread-groups
> > sub-array
> > [00000002 00000002 00000004
> > 00000000 00000002 00000004 00000006
> > 00000001 00000003 00000005 00000007]
> > indicates that L2 (Property "2") is shared only between the threads of a single
> > group. There are "2" groups of threads where each group contains "4"
> > threads each. The groups being {0,2,4,6} and {1,3,5,7}.
> >
> > With this patch, the sched-domain hierarchy for CPUs 0,1 would be
> > CPU0 attaching sched-domain(s):
> > domain-0: span=0,2,4,6 level=SMT
> > domain-1: span=0-15,24-39,48-55 level=MC
> > domain-2: span=0-55 level=DIE
> >
> > CPU1 attaching sched-domain(s):
> > domain-0: span=1,3,5,7 level=SMT
> > domain-1: span=0-15,24-39,48-55 level=MC
> > domain-2: span=0-55 level=DIE
> >
> > The CACHE domain with span=0,2,4,6 for CPU 0 (span=1,3,5,7 for CPU 1
> > resp.) gets degenerated into the SMT domain. Furthermore, the
> > last-level-cache domain gets correctly set to the SMT sched-domain.
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> > arch/powerpc/kernel/smp.c | 66 +++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 58 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 6a242a3..a116d2d 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -76,6 +76,7 @@
> > struct task_struct *secondary_current;
> > bool has_big_cores;
> > bool coregroup_enabled;
> > +bool thread_group_shares_l2;
>
> Either keep this as static in this patch or add its declaration
>
This will be used in Patch 3 in kernel/cacheinfo.c, but not any other
place. Hence I am not making it static here.
> >
> > DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
> > DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
> > @@ -99,6 +100,7 @@ enum {
> >
> > #define MAX_THREAD_LIST_SIZE 8
> > #define THREAD_GROUP_SHARE_L1 1
> > +#define THREAD_GROUP_SHARE_L2 2
> > struct thread_groups {
> > unsigned int property;
> > unsigned int nr_groups;
> > @@ -107,7 +109,7 @@ struct thread_groups {
> > };
> >
> > /* Maximum number of properties that groups of threads within a core can share */
> > -#define MAX_THREAD_GROUP_PROPERTIES 1
> > +#define MAX_THREAD_GROUP_PROPERTIES 2
> >
> > struct thread_groups_list {
> > unsigned int nr_properties;
> > @@ -121,6 +123,13 @@ struct thread_groups_list {
> > */
> > DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
> >
> > +/*
> > + * On some big-cores system, thread_group_l2_cache_map for each CPU
> > + * corresponds to the set its siblings within the core that share the
> > + * L2-cache.
> > + */
> > +DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map);
> > +
>
> NIT:
> We are trying to confuse ourselves with the names.
> For L1 we have cpu_l2_cache_map to store the tasks from the thread group.
> but cpu_smallcore_map for keeping track of tasks.
>
I suppose you mean cpu_l1_cache_map here. We are using
cpu_smallcore_map, because when the ibm,thread-groups-property=1, it
shares both L1 and the instruction data flow (SMT). We already have a
cpu_smt_map, hence, this was named cpu_smallcore_map a couple of years
ago when I wrote that patch.
> For L2 we have thread_group_l2_cache_map to store the tasks from the thread
> group. but cpu_l2_cache_map for keeping track of tasks.
>
> I think we should do some renaming to keep the names consistent.
> I would say probably say move the current cpu_l2_cache_map to
> cpu_llc_cache_map and move the new aka thread_group_l2_cache_map as
> cpu_l2_cache_map to be somewhat consistent.
Hmm.. cpu_llc_cache_map is still very generic. We want to have
something that defines l2 map.
I agree that we need to keep it consistent. How about renaming
cpu_l1_cache_map to thread_groups_l1_cache_map ?
That way thread_groups_l1_cache_map and thread_groups_l2_cache_map
refer to the corresponding L1 and L2 siblings as discovered from
ibm,thread-groups property.
We have the cpu_smallcore_mask and the cpu_l2_cache_map unchanged as
it was before.
>
> > /* SMP operations for this machine */
> > struct smp_ops_t *smp_ops;
> >
> > @@ -840,7 +851,8 @@ static int init_cpu_cache_map(int cpu, unsigned int cache_property)
> > if (!dn)
> > return -ENODATA;
> >
> > - if (!(cache_property == THREAD_GROUP_SHARE_L1))
> > + if (!(cache_property == THREAD_GROUP_SHARE_L1 ||
> > + cache_property == THREAD_GROUP_SHARE_L2))
> > return -EINVAL;
> >
> > if (!cpu_tgl->nr_properties) {
> > @@ -867,7 +879,10 @@ static int init_cpu_cache_map(int cpu, unsigned int cache_property)
> > goto out;
> > }
> >
> > - mask = &per_cpu(cpu_l1_cache_map, cpu);
> > + if (cache_property == THREAD_GROUP_SHARE_L1)
> > + mask = &per_cpu(cpu_l1_cache_map, cpu);
> > + else if (cache_property == THREAD_GROUP_SHARE_L2)
> > + mask = &per_cpu(thread_group_l2_cache_map, cpu);
> >
> > zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> >
> > @@ -973,6 +988,16 @@ static int init_big_cores(void)
> > }
> >
> > has_big_cores = true;
> > +
> > + for_each_possible_cpu(cpu) {
> > + int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L2);
> > +
> > + if (err)
> > + return err;
> > + }
> > +
> > + thread_group_shares_l2 = true;
>
> Why do we need a separate loop. Why cant we merge this in the above loop
> itself?
No, there are platforms where one THREAD_GROUP_SHARE_L1 exists while
THREAD_GROUP_SHARE_L2 doesn't exist. It becomes easier if these are
separately tracked. Also, what do we gain if we put this in the same
loop? It will be (nr_possible_cpus * 2 * invocations of
init_cpu_cache_map()) as opposed to 2 * (nr_possible_cpus *
invocations of init_cpu_cache_map()). Isn't it ?
>
> > + pr_info("Thread-groups in a core share L2-cache\n");
>
> Can this be moved to a pr_debug? Does it help any regular user/admins to
> know if thread-groups shared l2 cache. Infact it may confuse users on what
> thread groups are and which thread groups dont share cache.
> I would prefer some other name than thread_group_shares_l2 but dont know any
> better alternatives and may be my choices are even worse.
Would you be ok with "L2 cache shared by threads of the small core" ?
>
> > return 0;
> > }
> >
> > @@ -1287,6 +1312,31 @@ static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
> > if (has_big_cores)
> > submask_fn = cpu_smallcore_mask;
> >
> > +
>
> NIT: extra blank line?
Will remove this.
>
> > + /*
> > + * If the threads in a thread-group share L2 cache, then then
> > + * the L2-mask can be obtained from thread_group_l2_cache_map.
> > + */
> > + if (thread_group_shares_l2) {
> > + /* Siblings that share L1 is a subset of siblings that share L2.*/
> > + or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
> > + if (*mask) {
> > + cpumask_andnot(*mask,
> > + per_cpu(thread_group_l2_cache_map, cpu),
> > + cpu_l2_cache_mask(cpu));
> > + } else {
> > + mask = &per_cpu(thread_group_l2_cache_map, cpu);
> > + }
> > +
> > + for_each_cpu(i, *mask) {
> > + if (!cpu_online(i))
> > + continue;
> > + set_cpus_related(i, cpu, cpu_l2_cache_mask);
> > + }
> > +
> > + return true;
> > + }
> > +
>
> Ah this can be simplified to:
> if (thread_group_shares_l2) {
> cpumask_set_cpu(cpu, cpu_l2_cache_mask(cpu));
>
> for_each_cpu(i, per_cpu(thread_group_l2_cache_map, cpu)) {
> if (cpu_online(i))
> set_cpus_related(i, cpu, cpu_l2_cache_mask);
> }
Don't we want to enforce that the siblings sharing L1 be a subset of
the siblings sharing L2 ? Or do you recommend putting in a check for
that somewhere ?
> }
>
> No?
>
> > l2_cache = cpu_to_l2cache(cpu);
> > if (!l2_cache || !*mask) {
> > /* Assume only core siblings share cache with this CPU */
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
From: Greg Kurz @ 2020-12-08 17:39 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: linuxppc-dev
In-Reply-To: <20201208151124.1329942-10-clg@kaod.org>
On Tue, 8 Dec 2020 16:11:20 +0100
Cédric Le Goater <clg@kaod.org> wrote:
> This flag was used to support the PHB4 LSIs on P9 DD1 and we have
> stopped supporting this CPU when DD2 came out. See skiboot commit:
>
> https://github.com/open-power/skiboot/commit/0b0d15e3c170
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
Reviewed-by: Greg Kurz <groug@kaod.org>
Just a minor suggestion in case you need to post a v2. See below.
> arch/powerpc/include/asm/opal-api.h | 2 +-
> arch/powerpc/include/asm/xive.h | 2 +-
> arch/powerpc/kvm/book3s_xive_native.c | 3 ---
> arch/powerpc/kvm/book3s_xive_template.c | 3 ---
> arch/powerpc/sysdev/xive/common.c | 8 --------
> arch/powerpc/sysdev/xive/native.c | 2 --
> 6 files changed, 2 insertions(+), 18 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 1dffa3cb16ba..48ee604ca39a 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -1091,7 +1091,7 @@ enum {
> OPAL_XIVE_IRQ_TRIGGER_PAGE = 0x00000001,
> OPAL_XIVE_IRQ_STORE_EOI = 0x00000002,
> OPAL_XIVE_IRQ_LSI = 0x00000004,
> - OPAL_XIVE_IRQ_SHIFT_BUG = 0x00000008,
> + OPAL_XIVE_IRQ_SHIFT_BUG = 0x00000008, /* P9 DD1.0 workaround */
Maybe you can even comment the entire line so that any future
tentative to use that flag breaks build ?
> OPAL_XIVE_IRQ_MASK_VIA_FW = 0x00000010,
> OPAL_XIVE_IRQ_EOI_VIA_FW = 0x00000020,
> };
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index d332dd9a18de..ff805885a028 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -60,7 +60,7 @@ struct xive_irq_data {
> };
> #define XIVE_IRQ_FLAG_STORE_EOI 0x01
> #define XIVE_IRQ_FLAG_LSI 0x02
> -#define XIVE_IRQ_FLAG_SHIFT_BUG 0x04
> +#define XIVE_IRQ_FLAG_SHIFT_BUG 0x04 /* P9 DD1.0 workaround */
Same here, with an extra cleanup to stop using it when initializing
xive_irq_flags[] in common.c.
> #define XIVE_IRQ_FLAG_MASK_FW 0x08
> #define XIVE_IRQ_FLAG_EOI_FW 0x10
> #define XIVE_IRQ_FLAG_H_INT_ESB 0x20
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 9b395381179d..170d1d04e1d1 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -37,9 +37,6 @@ static u8 xive_vm_esb_load(struct xive_irq_data *xd, u32 offset)
> * ordering.
> */
>
> - if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> - offset |= offset << 4;
> -
> val = in_be64(xd->eoi_mmio + offset);
> return (u8)val;
> }
> diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
> index 4ad3c0279458..ece36e024a8f 100644
> --- a/arch/powerpc/kvm/book3s_xive_template.c
> +++ b/arch/powerpc/kvm/book3s_xive_template.c
> @@ -61,9 +61,6 @@ static u8 GLUE(X_PFX,esb_load)(struct xive_irq_data *xd, u32 offset)
> if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
> offset |= XIVE_ESB_LD_ST_MO;
>
> - if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> - offset |= offset << 4;
> -
> val =__x_readq(__x_eoi_page(xd) + offset);
> #ifdef __LITTLE_ENDIAN__
> val >>= 64-8;
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 411cba12d73b..a9259470bf9f 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -200,10 +200,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
> if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
> offset |= XIVE_ESB_LD_ST_MO;
>
> - /* Handle HW errata */
> - if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> - offset |= offset << 4;
> -
> if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
> val = xive_ops->esb_rw(xd->hw_irq, offset, 0, 0);
> else
> @@ -214,10 +210,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
>
> static void xive_esb_write(struct xive_irq_data *xd, u32 offset, u64 data)
> {
> - /* Handle HW errata */
> - if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> - offset |= offset << 4;
> -
> if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
> xive_ops->esb_rw(xd->hw_irq, offset, data, 1);
> else
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 5f1e5aed8ab4..0310783241b5 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
> data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
> if (opal_flags & OPAL_XIVE_IRQ_LSI)
> data->flags |= XIVE_IRQ_FLAG_LSI;
> - if (opal_flags & OPAL_XIVE_IRQ_SHIFT_BUG)
> - data->flags |= XIVE_IRQ_FLAG_SHIFT_BUG;
> if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
> data->flags |= XIVE_IRQ_FLAG_MASK_FW;
> if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups with multiple properties
From: Gautham R Shenoy @ 2020-12-08 17:25 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, Gautham R. Shenoy, Michael Neuling,
Vaidyanathan Srinivasan, Peter Zijlstra, linux-kernel,
Nicholas Piggin, linuxppc-dev, Valentin Schneider
In-Reply-To: <20201207121042.GH528281@linux.vnet.ibm.com>
Hello Srikar,
Thanks for taking a look at the patch.
On Mon, Dec 07, 2020 at 05:40:42PM +0530, Srikar Dronamraju wrote:
> * Gautham R. Shenoy <ego@linux.vnet.ibm.com> [2020-12-04 10:18:45]:
>
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> <snipped>
>
> >
> > static int parse_thread_groups(struct device_node *dn,
> > - struct thread_groups *tg,
> > - unsigned int property)
> > + struct thread_groups_list *tglp)
> > {
> > - int i;
> > - u32 thread_group_array[3 + MAX_THREAD_LIST_SIZE];
> > + int i = 0;
> > + u32 *thread_group_array;
> > u32 *thread_list;
> > size_t total_threads;
> > - int ret;
> > + int ret = 0, count;
> > + unsigned int property_idx = 0;
>
> NIT:
> tglx mentions in one of his recent comments to try keep a reverse fir tree
> ordering of variables where possible.
I suppose you mean moving the longer local variable declarations to to
the top and shorter ones to the bottom. Thanks. Will fix this.
>
> >
> > + count = of_property_count_u32_elems(dn, "ibm,thread-groups");
> > + thread_group_array = kcalloc(count, sizeof(u32), GFP_KERNEL);
> > ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > - thread_group_array, 3);
> > + thread_group_array, count);
> > if (ret)
> > - return ret;
> > -
> > - tg->property = thread_group_array[0];
> > - tg->nr_groups = thread_group_array[1];
> > - tg->threads_per_group = thread_group_array[2];
> > - if (tg->property != property ||
> > - tg->nr_groups < 1 ||
> > - tg->threads_per_group < 1)
> > - return -ENODATA;
> > + goto out_free;
> >
> > - total_threads = tg->nr_groups * tg->threads_per_group;
> > + while (i < count && property_idx < MAX_THREAD_GROUP_PROPERTIES) {
> > + int j;
> > + struct thread_groups *tg = &tglp->property_tgs[property_idx++];
>
> NIT: same as above.
Ok.
>
> >
> > - ret = of_property_read_u32_array(dn, "ibm,thread-groups",
> > - thread_group_array,
> > - 3 + total_threads);
> > - if (ret)
> > - return ret;
> > + tg->property = thread_group_array[i];
> > + tg->nr_groups = thread_group_array[i + 1];
> > + tg->threads_per_group = thread_group_array[i + 2];
> > + total_threads = tg->nr_groups * tg->threads_per_group;
> > +
> > + thread_list = &thread_group_array[i + 3];
> >
> > - thread_list = &thread_group_array[3];
> > + for (j = 0; j < total_threads; j++)
> > + tg->thread_list[j] = thread_list[j];
> > + i = i + 3 + total_threads;
>
> Can't we simply use memcpy instead?
We could. But this one makes it more explicit.
>
> > + }
> >
> > - for (i = 0 ; i < total_threads; i++)
> > - tg->thread_list[i] = thread_list[i];
> > + tglp->nr_properties = property_idx;
> >
> > - return 0;
> > +out_free:
> > + kfree(thread_group_array);
> > + return ret;
> > }
> >
> > /*
> > @@ -805,24 +827,39 @@ static int get_cpu_thread_group_start(int cpu, struct thread_groups *tg)
> > return -1;
> > }
> >
> > -static int init_cpu_l1_cache_map(int cpu)
> > +static int init_cpu_cache_map(int cpu, unsigned int cache_property)
> >
> > {
> > struct device_node *dn = of_get_cpu_node(cpu, NULL);
> > - struct thread_groups tg = {.property = 0,
> > - .nr_groups = 0,
> > - .threads_per_group = 0};
> > + struct thread_groups *tg = NULL;
> > int first_thread = cpu_first_thread_sibling(cpu);
> > int i, cpu_group_start = -1, err = 0;
> > + cpumask_var_t *mask;
> > + struct thread_groups_list *cpu_tgl = &tgl[cpu];
>
> NIT: same as 1st comment.
Sure, will fix this.
>
> >
> > if (!dn)
> > return -ENODATA;
> >
> > - err = parse_thread_groups(dn, &tg, THREAD_GROUP_SHARE_L1);
> > - if (err)
> > - goto out;
> > + if (!(cache_property == THREAD_GROUP_SHARE_L1))
> > + return -EINVAL;
> >
> > - cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
> > + if (!cpu_tgl->nr_properties) {
> > + err = parse_thread_groups(dn, cpu_tgl);
> > + if (err)
> > + goto out;
> > + }
> > +
> > + for (i = 0; i < cpu_tgl->nr_properties; i++) {
> > + if (cpu_tgl->property_tgs[i].property == cache_property) {
> > + tg = &cpu_tgl->property_tgs[i];
> > + break;
> > + }
> > + }
> > +
> > + if (!tg)
> > + return -EINVAL;
> > +
> > + cpu_group_start = get_cpu_thread_group_start(cpu, tg);
>
> This whole hunk should be moved to a new function and called before
> init_cpu_cache_map. It will simplify the logic to great extent.
I suppose you are referring to the part where we select the correct
tg. Yeah, that can move to a different helper.
>
> >
> > if (unlikely(cpu_group_start == -1)) {
> > WARN_ON_ONCE(1);
> > @@ -830,11 +867,12 @@ static int init_cpu_l1_cache_map(int cpu)
> > goto out;
> > }
> >
> > - zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
> > - GFP_KERNEL, cpu_to_node(cpu));
> > + mask = &per_cpu(cpu_l1_cache_map, cpu);
> > +
> > + zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> >
>
> This hunk (and the next hunk) should be moved to next patch.
>
The next patch is only about introducing THREAD_GROUP_SHARE_L2. Hence
I put in any other code in this patch, since it seems to be a logical
place to collate whatever we have in a generic form.
> > for (i = first_thread; i < first_thread + threads_per_core; i++) {
> > - int i_group_start = get_cpu_thread_group_start(i, &tg);
> > + int i_group_start = get_cpu_thread_group_start(i, tg);
> >
> > if (unlikely(i_group_start == -1)) {
> > WARN_ON_ONCE(1);
> > @@ -843,7 +881,7 @@ static int init_cpu_l1_cache_map(int cpu)
> > }
> >
> > if (i_group_start == cpu_group_start)
> > - cpumask_set_cpu(i, per_cpu(cpu_l1_cache_map, cpu));
> > + cpumask_set_cpu(i, *mask);
> > }
> >
> > out:
> > @@ -924,7 +962,7 @@ static int init_big_cores(void)
> > int cpu;
> >
> > for_each_possible_cpu(cpu) {
> > - int err = init_cpu_l1_cache_map(cpu);
> > + int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L1);
> >
> > if (err)
> > return err;
> > --
> > 1.9.4
> >
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests
From: Greg Kurz @ 2020-12-08 17:23 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: linuxppc-dev
In-Reply-To: <20201208151124.1329942-9-clg@kaod.org>
On Tue, 8 Dec 2020 16:11:19 +0100
Cédric Le Goater <clg@kaod.org> wrote:
> PowerNV systems can handle up to 4K guests and 1M interrupt numbers
> per chip. Increase the range of allowed interrupts to support a larger
> number of guests.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
Reviewed-by: Greg Kurz <groug@kaod.org>
> arch/powerpc/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 5181872f9452..c250fbd430d1 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -66,7 +66,7 @@ config NEED_PER_CPU_PAGE_FIRST_CHUNK
>
> config NR_IRQS
> int "Number of virtual interrupt numbers"
> - range 32 32768
> + range 32 1048576
> default "512"
> help
> This defines the number of virtual interrupt numbers the kernel
^ permalink raw reply
* Re: [PATCH v2 1/2] ASoC: dt-bindings: imx-hdmi: Add binding doc for hdmi machine driver
From: Mark Brown @ 2020-12-08 17:09 UTC (permalink / raw)
To: Shengjiu Wang, timur, festevam, linux-kernel, alsa-devel,
linuxppc-dev, devicetree, perex, robh+dt, nicoleotsuka, Xiubo.Lee,
lgirdwood, tiwai
In-Reply-To: <1607251319-5821-1-git-send-email-shengjiu.wang@nxp.com>
On Sun, 6 Dec 2020 18:41:58 +0800, Shengjiu Wang wrote:
> Imx-hdmi is a new added machine driver for supporting hdmi devices
> on i.MX platforms. There is HDMI IP or external HDMI modules connect
> with SAI or AUD2HTX interface.
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/2] ASoC: dt-bindings: imx-hdmi: Add binding doc for hdmi machine driver
commit: e344cf5e4871f99495396f78d4401b8ac4c92465
[2/2] ASoC: fsl: Add imx-hdmi machine driver
commit: 6a5f850aa83a1d844d27e3e53ca2f247e55d438b
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply
* Re: [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ
From: Greg Kurz @ 2020-12-08 17:06 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: linuxppc-dev
In-Reply-To: <20201208151124.1329942-4-clg@kaod.org>
On Tue, 8 Dec 2020 16:11:14 +0100
Cédric Le Goater <clg@kaod.org> wrote:
> The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has
> its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or
> at the hypervisor level for pSeries. In practice, these interrupts are
> not always used. pSeries/PowerVM prefers local doorbells for local
> threads since they are faster. On PowerNV, global doorbells are also
> preferred for the same reason.
>
> The mapping in the Linux is reduced to a single interrupt using HW
> interrupt number 0 and a custom irq_chip to handle EOI. This can cause
> performance issues in some benchmark (ipistorm) on multichip systems.
>
> Clarify the use of the 0 value, it will help in improving multichip
> support.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
Reviewed-by: Greg Kurz <groug@kaod.org>
> arch/powerpc/sysdev/xive/xive-internal.h | 2 ++
> arch/powerpc/sysdev/xive/common.c | 10 +++++-----
> 2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
> index b7b901da2168..d701af7fb48c 100644
> --- a/arch/powerpc/sysdev/xive/xive-internal.h
> +++ b/arch/powerpc/sysdev/xive/xive-internal.h
> @@ -5,6 +5,8 @@
> #ifndef __XIVE_INTERNAL_H
> #define __XIVE_INTERNAL_H
>
> +#define XIVE_IPI_HW_IRQ 0 /* interrupt source # for IPIs */
> +
> /*
> * A "disabled" interrupt should never fire, to catch problems
> * we set its logical number to this
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 65af34ac1fa2..ee375daf8114 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1142,7 +1142,7 @@ static void __init xive_request_ipi(void)
> return;
>
> /* Initialize it */
> - virq = irq_create_mapping(xive_irq_domain, 0);
> + virq = irq_create_mapping(xive_irq_domain, XIVE_IPI_HW_IRQ);
> xive_ipi_irq = virq;
>
> WARN_ON(request_irq(virq, xive_muxed_ipi_action,
> @@ -1242,7 +1242,7 @@ static int xive_irq_domain_map(struct irq_domain *h, unsigned int virq,
>
> #ifdef CONFIG_SMP
> /* IPIs are special and come up with HW number 0 */
> - if (hw == 0) {
> + if (hw == XIVE_IPI_HW_IRQ) {
> /*
> * IPIs are marked per-cpu. We use separate HW interrupts under
> * the hood but associated with the same "linux" interrupt
> @@ -1271,7 +1271,7 @@ static void xive_irq_domain_unmap(struct irq_domain *d, unsigned int virq)
> if (!data)
> return;
> hw_irq = (unsigned int)irqd_to_hwirq(data);
> - if (hw_irq)
> + if (hw_irq != XIVE_IPI_HW_IRQ)
> xive_irq_free_data(virq);
> }
>
> @@ -1421,7 +1421,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
> * Ignore anything that isn't a XIVE irq and ignore
> * IPIs, so can just be dropped.
> */
> - if (d->domain != xive_irq_domain || hw_irq == 0)
> + if (d->domain != xive_irq_domain || hw_irq == XIVE_IPI_HW_IRQ)
> continue;
>
> /*
> @@ -1655,7 +1655,7 @@ static int xive_core_debug_show(struct seq_file *m, void *private)
> hw_irq = (unsigned int)irqd_to_hwirq(d);
>
> /* IPIs are special (HW number 0) */
> - if (hw_irq)
> + if (hw_irq != XIVE_IPI_HW_IRQ)
> xive_debug_show_irq(m, hw_irq, d);
> }
> return 0;
^ permalink raw reply
* Re: [PATCH v2 0/5] drop unused BACKLIGHT_GENERIC option
From: Thomas Bogendoerfer @ 2020-12-08 17:00 UTC (permalink / raw)
To: Arnd Bergmann
Cc: alexandre.belloni, sam, tony, linux-kernel, James.Bottomley, wens,
thierry.reding, paulus, will, daniel.thompson, deller, linux,
krzk, jonathanh, ludovic.desroches, catalin.marinas, linux-mips,
Arnd Bergmann, mripard, Andrey Zhizhikin, soc, linux-tegra,
lee.jones, linux-omap, linux-arm-kernel, jernej.skrabec,
linux-parisc, emil.l.velikov, nicolas.ferre, linuxppc-dev
In-Reply-To: <160744514229.359082.11487352663734358657.b4-ty@arndb.de>
On Tue, Dec 08, 2020 at 05:34:46PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> On Tue, 1 Dec 2020 22:29:17 +0000, Andrey Zhizhikin wrote:
> > Since the removal of generic_bl driver from the source tree in commit
> > 7ecdea4a0226 ("backlight: generic_bl: Remove this driver as it is
> > unused") BACKLIGHT_GENERIC config option became obsolete as well and
> > therefore subject to clean-up from all configuration files.
> >
> > This series introduces patches to address this removal, separated by
> > architectures in the kernel tree.
> >
> > [...]
>
> While my plan was to only take the arm specific patches, it seems
> nobody else has applied the other architecture specific ones,
> but there have been a lot of Acks. Also, b4 makes it easy to
> merge the entire branch, so I'll just take all of these.
>
> Applied to arm/defconfig, thanks!
>
> [1/5] ARM: configs: drop unused BACKLIGHT_GENERIC option
> commit: 0437141b4e2233ae0109a9584e7a003cd05b0a20
> [2/5] arm64: defconfig: drop unused BACKLIGHT_GENERIC option
> commit: 717c4c8336486781630893508b3347ae18953fae
> [3/5] MIPS: configs: drop unused BACKLIGHT_GENERIC option
> commit: 2257682282531de45929c6006152f6e2ee881b42
this one is already in mips-next.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply
* Re: [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag
From: Greg Kurz @ 2020-12-08 16:59 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: linuxppc-dev
In-Reply-To: <20201208151124.1329942-3-clg@kaod.org>
On Tue, 8 Dec 2020 16:11:13 +0100
Cédric Le Goater <clg@kaod.org> wrote:
> This is a simple cleanup to identify easily all flags of the XIVE
> interrupt structure. The interrupts flagged with XIVE_IRQ_FLAG_NO_EOI
> are the escalations used to wake up vCPUs in KVM. They are handled
> very differently from the rest.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
Reviewed-by: Greg Kurz <groug@kaod.org>
> arch/powerpc/include/asm/xive.h | 2 +-
> arch/powerpc/kvm/book3s_xive.c | 4 ++--
> arch/powerpc/sysdev/xive/common.c | 2 +-
> 3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index 309b4d65b74f..d332dd9a18de 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -66,7 +66,7 @@ struct xive_irq_data {
> #define XIVE_IRQ_FLAG_H_INT_ESB 0x20
>
> /* Special flag set by KVM for excalation interrupts */
> -#define XIVE_IRQ_NO_EOI 0x80
> +#define XIVE_IRQ_FLAG_NO_EOI 0x80
>
> #define XIVE_INVALID_CHIP_ID -1
>
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 18a6b75a3bfd..fae1c2e8da29 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -219,7 +219,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
> /* In single escalation mode, we grab the ESB MMIO of the
> * interrupt and mask it. Also populate the VCPU v/raddr
> * of the ESB page for use by asm entry/exit code. Finally
> - * set the XIVE_IRQ_NO_EOI flag which will prevent the
> + * set the XIVE_IRQ_FLAG_NO_EOI flag which will prevent the
> * core code from performing an EOI on the escalation
> * interrupt, thus leaving it effectively masked after
> * it fires once.
> @@ -231,7 +231,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
> xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_01);
> vcpu->arch.xive_esc_raddr = xd->eoi_page;
> vcpu->arch.xive_esc_vaddr = (__force u64)xd->eoi_mmio;
> - xd->flags |= XIVE_IRQ_NO_EOI;
> + xd->flags |= XIVE_IRQ_FLAG_NO_EOI;
> }
>
> return 0;
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index a80440af491a..65af34ac1fa2 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -416,7 +416,7 @@ static void xive_irq_eoi(struct irq_data *d)
> * been passed-through to a KVM guest
> */
> if (!irqd_irq_disabled(d) && !irqd_is_forwarded_to_vcpu(d) &&
> - !(xd->flags & XIVE_IRQ_NO_EOI))
> + !(xd->flags & XIVE_IRQ_FLAG_NO_EOI))
> xive_do_source_eoi(irqd_to_hwirq(d), xd);
> else
> xd->stale_p = true;
^ permalink raw reply
* Re: [PATCH v2 0/5] drop unused BACKLIGHT_GENERIC option
From: Alexandre Belloni @ 2020-12-08 16:54 UTC (permalink / raw)
To: Arnd Bergmann
Cc: sam, tony, linux-kernel, James.Bottomley, wens, thierry.reding,
paulus, will, daniel.thompson, deller, linux, krzk, jonathanh,
ludovic.desroches, catalin.marinas, linux-mips, Arnd Bergmann,
mripard, Andrey Zhizhikin, soc, linux-tegra, lee.jones,
linux-omap, linux-arm-kernel, jernej.skrabec, tsbogend,
linux-parisc, emil.l.velikov, nicolas.ferre, linuxppc-dev
In-Reply-To: <160744514229.359082.11487352663734358657.b4-ty@arndb.de>
Hi,
On 08/12/2020 17:34:46+0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> On Tue, 1 Dec 2020 22:29:17 +0000, Andrey Zhizhikin wrote:
> > Since the removal of generic_bl driver from the source tree in commit
> > 7ecdea4a0226 ("backlight: generic_bl: Remove this driver as it is
> > unused") BACKLIGHT_GENERIC config option became obsolete as well and
> > therefore subject to clean-up from all configuration files.
> >
> > This series introduces patches to address this removal, separated by
> > architectures in the kernel tree.
> >
> > [...]
>
> While my plan was to only take the arm specific patches, it seems
> nobody else has applied the other architecture specific ones,
> but there have been a lot of Acks. Also, b4 makes it easy to
> merge the entire branch, so I'll just take all of these.
>
> Applied to arm/defconfig, thanks!
>
> [1/5] ARM: configs: drop unused BACKLIGHT_GENERIC option
> commit: 0437141b4e2233ae0109a9584e7a003cd05b0a20
> [2/5] arm64: defconfig: drop unused BACKLIGHT_GENERIC option
> commit: 717c4c8336486781630893508b3347ae18953fae
> [3/5] MIPS: configs: drop unused BACKLIGHT_GENERIC option
> commit: 2257682282531de45929c6006152f6e2ee881b42
Thomas did applied this one:
https://lore.kernel.org/linux-arm-kernel/20201204120632.GA10011@alpha.franken.de/
> [4/5] parisc: configs: drop unused BACKLIGHT_GENERIC option
> commit: 4e9c44b128d3eb5da129e53c7312240f838c2dbf
> [5/5] powerpc/configs: drop unused BACKLIGHT_GENERIC option
> commit: 4985c506303fb6a41a885d503a6e1f3d3126431d
>
> Arnd
--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply
* Re: [PATCH v2 0/5] drop unused BACKLIGHT_GENERIC option
From: Arnd Bergmann @ 2020-12-08 16:34 UTC (permalink / raw)
To: jonathanh, linux-kernel, thierry.reding, krzk, mripard, benh,
emil.l.velikov, alexandre.belloni, mpe, linux-parisc, paulus,
nicolas.ferre, tony, sam, linux, ludovic.desroches, soc,
James.Bottomley, will, linuxppc-dev, linux-tegra, linux-omap,
daniel.thompson, jernej.skrabec, linux-arm-kernel, deller,
tsbogend, catalin.marinas, wens, linux-mips, lee.jones,
Andrey Zhizhikin
Cc: Arnd Bergmann
In-Reply-To: <20201201222922.3183-1-andrey.zhizhikin@leica-geosystems.com>
From: Arnd Bergmann <arnd@arndb.de>
On Tue, 1 Dec 2020 22:29:17 +0000, Andrey Zhizhikin wrote:
> Since the removal of generic_bl driver from the source tree in commit
> 7ecdea4a0226 ("backlight: generic_bl: Remove this driver as it is
> unused") BACKLIGHT_GENERIC config option became obsolete as well and
> therefore subject to clean-up from all configuration files.
>
> This series introduces patches to address this removal, separated by
> architectures in the kernel tree.
>
> [...]
While my plan was to only take the arm specific patches, it seems
nobody else has applied the other architecture specific ones,
but there have been a lot of Acks. Also, b4 makes it easy to
merge the entire branch, so I'll just take all of these.
Applied to arm/defconfig, thanks!
[1/5] ARM: configs: drop unused BACKLIGHT_GENERIC option
commit: 0437141b4e2233ae0109a9584e7a003cd05b0a20
[2/5] arm64: defconfig: drop unused BACKLIGHT_GENERIC option
commit: 717c4c8336486781630893508b3347ae18953fae
[3/5] MIPS: configs: drop unused BACKLIGHT_GENERIC option
commit: 2257682282531de45929c6006152f6e2ee881b42
[4/5] parisc: configs: drop unused BACKLIGHT_GENERIC option
commit: 4e9c44b128d3eb5da129e53c7312240f838c2dbf
[5/5] powerpc/configs: drop unused BACKLIGHT_GENERIC option
commit: 4985c506303fb6a41a885d503a6e1f3d3126431d
Arnd
^ permalink raw reply
* Re: [PATCH v6 0/5] PCI: Unify ECAM constants in native PCI Express drivers
From: Qian Cai @ 2020-12-08 16:07 UTC (permalink / raw)
To: Krzysztof Wilczyński, Bjorn Helgaas
Cc: Heiko Stuebner, Shawn Lin, Paul Mackerras, Thomas Petazzoni,
Jonathan Chocron, Toan Le, Will Deacon, Rob Herring,
Lorenzo Pieralisi, Michal Simek, linux-rockchip,
Linux Next Mailing List, bcm-kernel-feedback-list,
Jonathan Derrick, linux-pci, Stephen Rothwell, Ray Jui,
Florian Fainelli, linux-rpi-kernel, Jonathan Cameron,
linux-arm-kernel, Scott Branden, Zhou Wang, Robert Richter,
linuxppc-dev, Nicolas Saenz Julienne
In-Reply-To: <20201129230743.3006978-1-kw@linux.com>
On Sun, 2020-11-29 at 23:07 +0000, Krzysztof Wilczyński wrote:
> Unify ECAM-related constants into a single set of standard constants
> defining memory address shift values for the byte-level address that can
> be used when accessing the PCI Express Configuration Space, and then
> move native PCI Express controller drivers to use newly introduced
> definitions retiring any driver-specific ones.
>
> The ECAM ("Enhanced Configuration Access Mechanism") is defined by the
> PCI Express specification (see PCI Express Base Specification, Revision
> 5.0, Version 1.0, Section 7.2.2, p. 676), thus most hardware should
> implement it the same way.
>
> Most of the native PCI Express controller drivers define their ECAM-related
> constants, many of these could be shared, or use open-coded values when
> setting the ".bus_shift" field of the "struct pci_ecam_ops".
>
> All of the newly added constants should remove ambiguity and reduce the
> number of open-coded values, and also correlate more strongly with the
> descriptions in the aforementioned specification (see Table 7-1
> "Enhanced Configuration Address Mapping", p. 677).
>
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Reverting this series on the top of today's linux-next fixed a boot crash on
arm64 Thunder X2 server.
.config: https://cailca.coding.net/public/linux/mm/git/files/master/arm64.config
[ 186.285957][ T1] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
[ 186.293127][ T1] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig Segments MSI HPX-Type3]
[ 186.317072][ T1] acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
[ 186.330336][ T1] acpi PNP0A08:00: ECAM area [mem 0x30000000-0x37ffffff] reserved by PNP0C02:00
[ 186.339538][ T1] acpi PNP0A08:00: ECAM at [mem 0x30000000-0x37ffffff] for [bus 00-7f]
[ 186.353258][ T1] PCI host bridge to bus 0000:00
[ 186.358162][ T1] pci_bus 0000:00: root bus resource [mem 0x40000000-0x5fffffff window]
[ 186.366509][ T1] pci_bus 0000:00: root bus resource [mem 0x10000000000-0x13fffffffff window]
[ 186.375366][ T1] pci_bus 0000:00: root bus resource [bus 00-7f]
[ 186.382652][ T1] pci 0000:00:00.0: [177d:af00] type 00 class 0x060000
[ 186.395174][ T1] pci 0000:00:01.0: [177d:af84] type 01 class 0x060400
[ 186.402433][ T1] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
[ 186.415652][ T1] Unable to handle kernel paging request at virtual address ffff800029370000
[ 186.424398][ T1] Mem abort info:
[ 186.427930][ T1] ESR = 0x96000007
[ 186.431725][ T1] EC = 0x25: DABT (current EL), IL = 32 bits
[ 186.437805][ T1] SET = 0, FnV = 0
[ 186.441599][ T1] EA = 0, S1PTW = 0
[ 186.445485][ T1] Data abort info:
[ 186.449104][ T1] ISV = 0, ISS = 0x00000007
[ 186.453687][ T1] CM = 0, WnR = 0
[ 186.457396][ T1] swapper pgtable: 64k pages, 48-bit VAs, pgdp=00000000da920000
[ 186.464979][ T1] [ffff800029370000] pgd=0000008ffcff0003, p4d=0000008ffcff0003, pud=0000008ffcff0003, pmd=00000000811d0003, pte=0000000000000000
[ 186.478424][ T1] Internal error: Oops: 96000007 [#1] SMP
[ 186.484059][ T1] Modules linked in:
[ 186.487854][ T1] CPU: 38 PID: 1 Comm: swapper/0 Tainted: G W L 5.10.0-rc7-next-20201208 #3
[ 186.497617][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.16 07/29/2020
[ 186.508174][ T1] pstate: 20400089 (nzCv daIf +PAN -UAO -TCO BTYPE=--)
[ 186.514954][ T1] pc : pci_generic_config_read+0x78/0x1d0
[ 186.520587][ T1] lr : pci_generic_config_read+0x64/0x1d0
pci_generic_config_read at drivers/pci/access.c:83
[ 186.526223][ T1] sp : ffff000005f0ef30
[ 186.530278][ T1] x29: ffff000005f0ef30 x28: 0000000000000010
[ 186.536359][ T1] x27: 0000000000000000 x26: 0000000000000087
[ 186.542441][ T1] x25: 0000000000000000 x24: ffff00084a3a5000
[ 186.548517][ T1] x23: ffff000005f0f150 x22: 0000000000000004
[ 186.554593][ T1] x21: ffff800011404588 x20: ffff000005f0eff0
[ 186.560669][ T1] x19: ffff00084a3a5000 x18: 1fffe001cf0d53ed
[ 186.566750][ T1] x17: 0000000000000000 x16: 0000000000000003
[ 186.572831][ T1] x15: 0000000000000000 x14: 0000000000000003
[ 186.578908][ T1] x13: ffff600000be1ddf x12: 1fffe00000be1dde
[ 186.584983][ T1] x11: 1fffe00000be1dde x10: ffff600000be1dde
[ 186.591059][ T1] x9 : ffff800010c4f59c x8 : ffff000005f0eef3
[ 186.597139][ T1] x7 : 0000000000000001 x6 : 0000000000000001
[ 186.603222][ T1] x5 : 1fffe00109474a1c x4 : 1fffe010fd074cb2
[ 186.609298][ T1] x3 : 0000000000000000 x2 : 0000000000000000
[ 186.615374][ T1] x1 : 0000000000010000 x0 : ffff800029370000
[ 186.621451][ T1] Call trace:
[ 186.624623][ T1] pci_generic_config_read+0x78/0x1d0
__raw_readl at arch/arm64/include/asm/io.h:75
(inlined by) pci_generic_config_read at drivers/pci/access.c:93
[ 186.629905][ T1] pci_bus_read_config_dword+0xfc/0x198
pci_bus_read_config_dword at drivers/pci/access.c:65 (discriminator 2)
[ 186.635362][ T1] pci_bus_generic_read_dev_vendor_id+0x3c/0x310
pci_bus_generic_read_dev_vendor_id at drivers/pci/probe.c:2300
[ 186.641611][ T1] pci_bus_read_dev_vendor_id+0x7c/0xd0
pci_bus_read_dev_vendor_id at drivers/pci/probe.c:2329
[ 186.647067][ T1] pci_scan_single_device+0xe0/0x1f8
pci_scan_device at drivers/pci/probe.c:2342
(inlined by) pci_scan_single_device at drivers/pci/probe.c:2511
(inlined by) pci_scan_single_device at drivers/pci/probe.c:2501
[ 186.652261][ T1] pci_scan_slot+0x88/0x258
pci_scan_slot at drivers/pci/probe.c:2590
[ 186.656663][ T1] pci_scan_child_bus_extend+0x88/0x608
pci_scan_child_bus_extend at drivers/pci/probe.c:2807
[ 186.662120][ T1] pci_scan_child_bus+0x18/0x20
pci_scan_child_bus at drivers/pci/probe.c:2938
[ 186.666875][ T1] acpi_pci_root_create+0x518/0x7a8
acpi_pci_root_create at drivers/acpi/pci_root.c:938
[ 186.671982][ T1] pci_acpi_scan_root+0x2bc/0x4c0
pci_acpi_scan_root at arch/arm64/kernel/pci.c:189
[ 186.676911][ T1] acpi_pci_root_add+0x45c/0x920
acpi_pci_root_add at drivers/acpi/pci_root.c:609
[ 186.681754][ T1] acpi_bus_attach+0x270/0x6d8
acpi_scan_attach_handler at drivers/acpi/scan.c:1969
(inlined by) acpi_bus_attach at drivers/acpi/scan.c:2013
[ 186.686419][ T1] acpi_bus_attach+0x140/0x6d8
[ 186.691085][ T1] acpi_bus_attach+0x140/0x6d8
acpi_bus_attach at drivers/acpi/scan.c:2033 (discriminator 9)
[ 186.695751][ T1] acpi_bus_scan+0x98/0xf8
acpi_bus_scan at drivers/acpi/scan.c:2087
[ 186.700067][ T1] acpi_scan_init+0x220/0x524
[ 186.704644][ T1] acpi_init+0x460/0x4e8
[ 186.708790][ T1] do_one_initcall+0x170/0xb70
[ 186.713461][ T1] kernel_init_freeable+0x6a8/0x734
[ 186.718574][ T1] kernel_init+0x18/0x12c
[ 186.722806][ T1] ret_from_fork+0x10/0x1c
[ 186.727123][ T1] Code: 710006df 540002e0 71000adf 540004e0 (b9400013)
[ 186.733988][ T1] ---[ end trace bad65ebbc8c09fe0 ]---
[ 186.739359][ T1] Kernel panic - not syncing: Oops: Fatal exception
[ 186.746115][ T1] SMP: stopping secondary CPUs
[ 186.750968][ T1] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
> ---
> Changed in v6:
> Converted single patch into a series.
> Dropped changes related to PPC 4xx platform.
> Refactored pci_ecam_map_bus() so that bus, device function and offset
> are correctly masked, limiting offset to 4K as per the PCI Express
> Specification. After the refactor this function will now use sensible
> defaults allowing for removal of the ".bus_shit" initialiser from all
> the users of the "ecam_ops" structure who do not use a non-standard
> ECAM bus shit values.
>
> Changed in v5:
> Removed unused constant "PCIE_ECAM_DEV_SHIFT".
> Refactored ppc4xx_pciex_get_config_base() so that the "offset"
> parameter can be passed to so that the PCIE_ECAM_OFFSET() macro
> can be used.
> Used the ALIGN_DOWN() macro where 32 bit alignment is required
> instead using the 0xffc mask.
> Added CFG_ADDR_CFG_TYPE_1 macro to pci/controller/pcie-iproc.c to
> denote that this is a configuration type 1 address and access type.
> Refactored boundary check in pci/controller/vmd.c as used by the
> vmd_cfg_addr() function following addition of the PCIE_ECAM_OFFSET()
> macro.
> Changed the "bus->number" to simply pass the "bus" argument in the
> PCIE_ECAM_OFFSET() macro.
>
> Changed in v4:
> Removed constants related to "CAM".
> Added more platforms and devices that can use new ECAM macros and
> constants.
> Removed unused ".bus_shift" initialisers from pci-xgene.c as
> xgene_pcie_map_bus() did not use these.
>
> Changes in v3:
> Updated commit message wording.
> Updated regarding custom ECAM bus shift values and concerning PCI base
> configuration space access for Type 1 access.
> Refactored rockchip_pcie_rd_other_conf() and rockchip_pcie_wr_other_conf()
> and removed the "busdev" variable.
> Removed surplus "relbus" variable from nwl_pcie_map_bus() and
> xilinx_pcie_map_bus().
> Renamed the PCIE_ECAM_ADDR() macro to PCIE_ECAM_OFFSET().
>
> Changes in v2:
> Use PCIE_ECAM_ADDR macro when computing ECAM address offset, but drop
> PCI_SLOT and PCI_FUNC macros from the PCIE_ECAM_ADDR macro in favour
> of using a single value for the device/function.
>
> Krzysztof Wilczyński (5):
> PCI: Unify ECAM constants in native PCI Express drivers
> PCI: thunder-pem: Add constant for custom ".bus_shit" initialiser
> PCI: iproc: Convert to use the new ECAM constants
> PCI: vmd: Update type of the __iomem pointers
> PCI: xgene: Removed unused ".bus_shift" initialisers from pci-xgene.c
>
> drivers/pci/controller/dwc/pcie-al.c | 12 ++------
> drivers/pci/controller/dwc/pcie-hisi.c | 2 --
> drivers/pci/controller/pci-aardvark.c | 13 ++-------
> drivers/pci/controller/pci-host-generic.c | 1 -
> drivers/pci/controller/pci-thunder-ecam.c | 1 -
> drivers/pci/controller/pci-thunder-pem.c | 13 +++++++--
> drivers/pci/controller/pci-xgene.c | 2 --
> drivers/pci/controller/pcie-brcmstb.c | 16 ++---------
> drivers/pci/controller/pcie-iproc.c | 31 ++++++---------------
> drivers/pci/controller/pcie-rockchip-host.c | 27 +++++++++---------
> drivers/pci/controller/pcie-rockchip.h | 8 +-----
> drivers/pci/controller/pcie-tango.c | 1 -
> drivers/pci/controller/pcie-xilinx-nwl.c | 9 ++----
> drivers/pci/controller/pcie-xilinx.c | 11 ++------
> drivers/pci/controller/vmd.c | 19 ++++++-------
> drivers/pci/ecam.c | 23 ++++++++++-----
> include/linux/pci-ecam.h | 27 ++++++++++++++++++
> 17 files changed, 96 insertions(+), 120 deletions(-)
>
^ permalink raw reply
* Re: [PATCH v6 0/5] PCI: Unify ECAM constants in native PCI Express drivers
From: Michael Walle @ 2020-12-08 15:41 UTC (permalink / raw)
To: lorenzo.pieralisi
Cc: kw, heiko, shawn.lin, paulus, Michael Walle, thomas.petazzoni,
jonnyc, toan, will, robh, f.fainelli, michal.simek,
linux-rockchip, bcm-kernel-feedback-list, jonathan.derrick,
linux-pci, rjui, linux-rpi-kernel, Jonathan.Cameron, bhelgaas,
linux-arm-kernel, sbranden, wangzhou1, rrichter, linuxppc-dev,
nsaenzjulienne
In-Reply-To: <160683676668.20365.13565676178464743008.b4-ty@arm.com>
Hi Lorenzo, Krzysztof,
>On Sun, 29 Nov 2020 23:07:38 +0000, Krzysztof Wilczyński wrote:
>> Unify ECAM-related constants into a single set of standard constants
>> defining memory address shift values for the byte-level address that can
>> be used when accessing the PCI Express Configuration Space, and then
>> move native PCI Express controller drivers to use newly introduced
>> definitions retiring any driver-specific ones.
>>
>> The ECAM ("Enhanced Configuration Access Mechanism") is defined by the
>> PCI Express specification (see PCI Express Base Specification, Revision
>> 5.0, Version 1.0, Section 7.2.2, p. 676), thus most hardware should
>> implement it the same way.
>>
>> [...]
>
>Applied to pci/ecam, thanks!
>
>[1/5] PCI: Unify ECAM constants in native PCI Express drivers
> https://git.kernel.org/lpieralisi/pci/c/f3c07cf692
>[2/5] PCI: thunder-pem: Add constant for custom ".bus_shift" initialiser
> https://git.kernel.org/lpieralisi/pci/c/3c38579263
>[3/5] PCI: iproc: Convert to use the new ECAM constants
> https://git.kernel.org/lpieralisi/pci/c/333ec9d3cc
>[4/5] PCI: vmd: Update type of the __iomem pointers
> https://git.kernel.org/lpieralisi/pci/c/89094c12ea
>[5/5] PCI: xgene: Removed unused ".bus_shift" initialisers from pci-xgene.c
> https://git.kernel.org/lpieralisi/pci/c/3dc62532a5
Patch 1/5 breaks LS1028A boards:
[..]
[ 1.144426] pci-host-generic 1f0000000.pcie: host bridge /soc/pcie@1f0000000 ranges:
[ 1.152276] pci-host-generic 1f0000000.pcie: MEM 0x01f8000000..0x01f815ffff -> 0x0000000000
[ 1.161161] pci-host-generic 1f0000000.pcie: MEM 0x01f8160000..0x01f81cffff -> 0x0000000000
[ 1.170043] pci-host-generic 1f0000000.pcie: MEM 0x01f81d0000..0x01f81effff -> 0x0000000000
[ 1.178924] pci-host-generic 1f0000000.pcie: MEM 0x01f81f0000..0x01f820ffff -> 0x0000000000
[ 1.187805] pci-host-generic 1f0000000.pcie: MEM 0x01f8210000..0x01f822ffff -> 0x0000000000
[ 1.196686] pci-host-generic 1f0000000.pcie: MEM 0x01f8230000..0x01f824ffff -> 0x0000000000
[ 1.205562] pci-host-generic 1f0000000.pcie: MEM 0x01fc000000..0x01fc3fffff -> 0x0000000000
[ 1.214465] pci-host-generic 1f0000000.pcie: ECAM at [mem 0x1f0000000-0x1f00fffff] for [bus 00]
[ 1.223318] pci-host-generic 1f0000000.pcie: PCI host bridge to bus 0000:00
[ 1.230350] pci_bus 0000:00: root bus resource [bus 00]
[ 1.235625] pci_bus 0000:00: root bus resource [mem 0x1f8000000-0x1f815ffff] (bus address [0x00000000-0x0015ffff])
[ 1.246077] pci_bus 0000:00: root bus resource [mem 0x1f8160000-0x1f81cffff pref] (bus address [0x00000000-0x0006ffff])
[ 1.256969] pci_bus 0000:00: root bus resource [mem 0x1f81d0000-0x1f81effff] (bus address [0x00000000-0x0001ffff])
[ 1.267427] pci_bus 0000:00: root bus resource [mem 0x1f81f0000-0x1f820ffff pref] (bus address [0x00000000-0x0001ffff])
[ 1.278326] pci_bus 0000:00: root bus resource [mem 0x1f8210000-0x1f822ffff] (bus address [0x00000000-0x0001ffff])
[ 1.288779] pci_bus 0000:00: root bus resource [mem 0x1f8230000-0x1f824ffff pref] (bus address [0x00000000-0x0001ffff])
[ 1.299669] pci_bus 0000:00: root bus resource [mem 0x1fc000000-0x1fc3fffff] (bus address [0x00000000-0x003fffff])
[ 1.310138] pci 0000:00:00.0: [1957:e100] type 00 class 0x020001
[ 1.316234] pci 0000:00:00.0: BAR 0: [mem 0x1f8000000-0x1f803ffff 64bit] (from Enhanced Allocation, properties 0x0)
[ 1.326776] pci 0000:00:00.0: BAR 2: [mem 0x1f8160000-0x1f816ffff 64bit pref] (from Enhanced Allocation, properties 0x1)
[ 1.337759] pci 0000:00:00.0: VF BAR 0: [mem 0x1f81d0000-0x1f81dffff 64bit] (from Enhanced Allocation, properties 0x4)
[ 1.348563] pci 0000:00:00.0: VF BAR 2: [mem 0x1f81f0000-0x1f81fffff 64bit pref] (from Enhanced Allocation, properties 0x3)
[ 1.359821] pci 0000:00:00.0: PME# supported from D0 D3hot
[ 1.365368] pci 0000:00:00.0: VF(n) BAR0 space: [mem 0x1f81d0000-0x1f81effff 64bit] (contains BAR0 for 2 VFs)
[ 1.375381] pci 0000:00:00.0: VF(n) BAR2 space: [mem 0x1f81f0000-0x1f820ffff 64bit pref] (contains BAR2 for 2 VFs)
[ 1.385983] Unable to handle kernel paging request at virtual address ffff800012132000
[ 1.393972] Mem abort info:
[ 1.396783] ESR = 0x96000007
[ 1.399859] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.405215] SET = 0, FnV = 0
[ 1.408290] EA = 0, S1PTW = 0
[ 1.411453] Data abort info:
[ 1.414352] ISV = 0, ISS = 0x00000007
[ 1.418216] CM = 0, WnR = 0
[ 1.421205] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000008369c000
[ 1.427966] [ffff800012132000] pgd=00000020fffff003, p4d=00000020fffff003, pud=00000020ffffe003, pmd=00000020ffffa003, pte=0000000000000000
[ 1.440618] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[ 1.446239] Modules linked in:
[ 1.449320] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc3-00101-g2f378db5c89 #191
[ 1.457484] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
[ 1.465827] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[ 1.471892] pc : pci_generic_config_read+0x38/0xe0
[ 1.476723] lr : pci_generic_config_read+0x24/0xe0
[ 1.481553] sp : ffff80001211b920
[ 1.484891] x29: ffff80001211b920 x28: 0000000000000000
[ 1.490252] x27: ffff8000116a04bc x26: 0000000000000000
[ 1.495612] x25: 0000000000000001 x24: ffff80001211ba54
[ 1.500972] x23: ffff0020009c3800 x22: 0000000000000000
[ 1.506332] x21: 0000000000000087 x20: ffff80001211b994
[ 1.511692] x19: 0000000000000004 x18: 0000000000000000
[ 1.517052] x17: 0000000000000000 x16: 00000000d5edfbc1
[ 1.522412] x15: ffffffffffffffff x14: ffff800011cf9948
[ 1.527772] x13: ffff002000305a1c x12: 0000000000000030
[ 1.533132] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 1.538491] x9 : 2c6b7173626d686f x8 : 000000000000ea60
[ 1.543851] x7 : ffff80001211ba54 x6 : 0000000000000000
[ 1.549211] x5 : 0000000000000000 x4 : ffff800012131000
[ 1.554570] x3 : 0000000000000000 x2 : 0000000000000000
[ 1.559930] x1 : 0000000000001000 x0 : ffff800012132000
[ 1.565290] Call trace:
[ 1.567752] pci_generic_config_read+0x38/0xe0
[ 1.572233] pci_bus_read_config_dword+0x84/0xd8
[ 1.576890] pci_bus_generic_read_dev_vendor_id+0x34/0x1b0
[ 1.582423] pci_bus_read_dev_vendor_id+0x4c/0x70
[ 1.587167] pci_scan_single_device+0x84/0xe0
[ 1.591559] pci_scan_slot+0x6c/0x120
[ 1.595250] pci_scan_child_bus_extend+0x54/0x298
[ 1.599994] pci_scan_root_bus_bridge+0xd4/0xf0
[ 1.604562] pci_host_probe+0x18/0xb0
[ 1.608254] pci_host_common_probe+0x13c/0x1a0
[ 1.612735] platform_drv_probe+0x54/0xa8
[ 1.616777] really_probe+0xe4/0x3b8
[ 1.620380] driver_probe_device+0x58/0xb8
[ 1.624509] device_driver_attach+0x74/0x80
[ 1.628725] __driver_attach+0x58/0xe0
[ 1.632503] bus_for_each_dev+0x74/0xc8
[ 1.636369] driver_attach+0x24/0x30
[ 1.639972] bus_add_driver+0x18c/0x1f0
[ 1.643838] driver_register+0x64/0x120
[ 1.647704] __platform_driver_register+0x48/0x58
[ 1.652449] gen_pci_driver_init+0x1c/0x28
[ 1.656580] do_one_initcall+0x4c/0x2c0
[ 1.660447] kernel_init_freeable+0x1e4/0x250
[ 1.664840] kernel_init+0x14/0x118
[ 1.668355] ret_from_fork+0x10/0x34
[ 1.671961] Code: 7100067f 540001c0 71000a7f 54000300 (b9400001)
[ 1.678114] ---[ end trace 0aca1b048661e8b3 ]---
[ 1.682770] note: swapper/0[1] exited with preempt_count 1
[ 1.688305] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 1.696031] SMP: stopping secondary CPUs
[ 1.699989] Kernel Offset: disabled
[ 1.703503] CPU features: 0x0240022,61006008
[ 1.707806] Memory Limit: none
[ 1.710884] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
There is a LS1028A eval board in kernelci here:
https://lavalab.nxp.com/scheduler/job/170566
I actually have this board which also have a LS1028A SoC:
https://lavalab.kontron.com/scheduler/job/1771
But in the latter you won't see much because earlycon isn't active. [I'm
about to fix that.]
By reverting patch 1/5, the board will work again.
-michael
^ permalink raw reply
* Re: [PATCH v5 10/19] dt-bindings: usb: Convert DWC USB3 bindings to DT schema
From: Rob Herring @ 2020-12-08 15:21 UTC (permalink / raw)
To: Serge Semin
Cc: Neil Armstrong, Bjorn Andersson, Pavel Parkhomenko, Kevin Hilman,
Krzysztof Kozlowski, Ahmad Zainie, Andy Gross, Chunfeng Yun,
linux-snps-arc, devicetree, Mathias Nyman, Martin Blumenstingl,
Lad Prabhakar, Alexey Malahov, Rob Herring, linux-arm-kernel,
Roger Quadros, Felipe Balbi, linux-mips, Greg Kroah-Hartman,
Yoshihiro Shimoda, linux-usb, linux-kernel, Serge Semin,
Manu Gautam, linuxppc-dev
In-Reply-To: <20201205152427.29537-11-Sergey.Semin@baikalelectronics.ru>
On Sat, 05 Dec 2020 18:24:17 +0300, Serge Semin wrote:
> DWC USB3 DT node is supposed to be compliant with the Generic xHCI
> Controller schema, but with additional vendor-specific properties, the
> controller-specific reference clocks and PHYs. So let's convert the
> currently available legacy text-based DWC USB3 bindings to the DT schema
> and make sure the DWC USB3 nodes are also validated against the
> usb-xhci.yaml schema.
>
> Note 1. we have to discard the nodename restriction of being prefixed with
> "dwc3@" string, since in accordance with the usb-hcd.yaml schema USB nodes
> are supposed to be named as "^usb(@.*)".
>
> Note 2. The clock-related properties are marked as optional to match the
> DWC USB3 driver expectation and to improve the bindings mainainability
> so in case if there is a glue-node it would the responsible for the
> clocks initialization.
>
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
>
> ---
>
> Changelog v2:
> - Discard '|' from the descriptions, since we don't need to preserve
> the text formatting in any of them.
> - Drop quotes from around the string constants.
> - Fix the "clock-names" prop description to be referring the enumerated
> clock-names instead of the ones from the Databook.
>
> Changelog v3:
> - Apply usb-xhci.yaml# schema only if the controller is supposed to work
> as either host or otg.
>
> Changelog v4:
> - Apply usb-drd.yaml schema first. If the controller is configured
> to work in a gadget mode only, then apply the usb.yaml schema too,
> otherwise apply the usb-xhci.yaml schema.
> - Discard the Rob'es Reviewed-by tag. Please review the patch one more
> time.
>
> Changelog v5:
> - Add "snps,dis-split-quirk" property to the DWC USB3 DT schema.
> - Add a commit log text about the clock-related property changes.
> - Make sure dr_mode exist to apply the USB-gadget-only schema.
> ---
> .../devicetree/bindings/usb/dwc3.txt | 128 -------
> .../devicetree/bindings/usb/snps,dwc3.yaml | 312 ++++++++++++++++++
> 2 files changed, 312 insertions(+), 128 deletions(-)
> delete mode 100644 Documentation/devicetree/bindings/usb/dwc3.txt
> create mode 100644 Documentation/devicetree/bindings/usb/snps,dwc3.yaml
>
Reviewed-by: Rob Herring <robh@kernel.org>
^ permalink raw reply
* Re: [PATCH 18/20] ethernet: ucc_geth: add helper to replace repeated switch statements
From: Christophe Leroy @ 2020-12-08 15:21 UTC (permalink / raw)
To: Rasmus Villemoes, Li Yang, David S. Miller, Jakub Kicinski
Cc: Vladimir Oltean, Zhao Qiang, linuxppc-dev, linux-kernel, netdev
In-Reply-To: <20201205191744.7847-19-rasmus.villemoes@prevas.dk>
Le 05/12/2020 à 20:17, Rasmus Villemoes a écrit :
> The translation from the ucc_geth_num_of_threads enum value to the
> actual count can be written somewhat more compactly with a small
> lookup table, allowing us to replace the four switch statements.
>
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
> drivers/net/ethernet/freescale/ucc_geth.c | 100 +++++-----------------
> 1 file changed, 22 insertions(+), 78 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
> index 3aebea191b52..a26d1feb7dab 100644
> --- a/drivers/net/ethernet/freescale/ucc_geth.c
> +++ b/drivers/net/ethernet/freescale/ucc_geth.c
> @@ -70,6 +70,20 @@ static struct {
> module_param_named(debug, debug.msg_enable, int, 0);
> MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 0xffff=all)");
>
> +static int ucc_geth_thread_count(enum ucc_geth_num_of_threads idx)
> +{
> + static const u8 count[] = {
> + [UCC_GETH_NUM_OF_THREADS_1] = 1,
> + [UCC_GETH_NUM_OF_THREADS_2] = 2,
> + [UCC_GETH_NUM_OF_THREADS_4] = 4,
> + [UCC_GETH_NUM_OF_THREADS_6] = 6,
> + [UCC_GETH_NUM_OF_THREADS_8] = 8,
> + };
> + if (idx >= ARRAY_SIZE(count))
> + return 0;
> + return count[idx];
> +}
I think you would allow GCC to provide a much better optimisation with something like:
static int ucc_geth_thread_count(enum ucc_geth_num_of_threads idx)
{
if (idx == UCC_GETH_NUM_OF_THREADS_1)
return 1;
if (idx == UCC_GETH_NUM_OF_THREADS_2)
return 2;
if (idx == UCC_GETH_NUM_OF_THREADS_4)
return 4;
if (idx == UCC_GETH_NUM_OF_THREADS_6)
return 6;
if (idx == UCC_GETH_NUM_OF_THREADS_8)
return 8;
return 0;
}
> +
> static const struct ucc_geth_info ugeth_primary_info = {
> .uf_info = {
> .rtsm = UCC_FAST_SEND_IDLES_BETWEEN_FRAMES,
> @@ -668,32 +682,12 @@ static void dump_regs(struct ucc_geth_private *ugeth)
> in_be32(&ugeth->ug_regs->scam));
>
> if (ugeth->p_thread_data_tx) {
> - int numThreadsTxNumerical;
> - switch (ugeth->ug_info->numThreadsTx) {
> - case UCC_GETH_NUM_OF_THREADS_1:
> - numThreadsTxNumerical = 1;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_2:
> - numThreadsTxNumerical = 2;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_4:
> - numThreadsTxNumerical = 4;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_6:
> - numThreadsTxNumerical = 6;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_8:
> - numThreadsTxNumerical = 8;
> - break;
> - default:
> - numThreadsTxNumerical = 0;
> - break;
> - }
> + int count = ucc_geth_thread_count(ugeth->ug_info->numThreadsTx);
>
> pr_info("Thread data TXs:\n");
> pr_info("Base address: 0x%08x\n",
> (u32)ugeth->p_thread_data_tx);
> - for (i = 0; i < numThreadsTxNumerical; i++) {
> + for (i = 0; i < count; i++) {
> pr_info("Thread data TX[%d]:\n", i);
> pr_info("Base address: 0x%08x\n",
> (u32)&ugeth->p_thread_data_tx[i]);
> @@ -702,32 +696,12 @@ static void dump_regs(struct ucc_geth_private *ugeth)
> }
> }
> if (ugeth->p_thread_data_rx) {
> - int numThreadsRxNumerical;
> - switch (ugeth->ug_info->numThreadsRx) {
> - case UCC_GETH_NUM_OF_THREADS_1:
> - numThreadsRxNumerical = 1;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_2:
> - numThreadsRxNumerical = 2;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_4:
> - numThreadsRxNumerical = 4;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_6:
> - numThreadsRxNumerical = 6;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_8:
> - numThreadsRxNumerical = 8;
> - break;
> - default:
> - numThreadsRxNumerical = 0;
> - break;
> - }
> + int count = ucc_geth_thread_count(ugeth->ug_info->numThreadsRx);
>
> pr_info("Thread data RX:\n");
> pr_info("Base address: 0x%08x\n",
> (u32)ugeth->p_thread_data_rx);
> - for (i = 0; i < numThreadsRxNumerical; i++) {
> + for (i = 0; i < count; i++) {
> pr_info("Thread data RX[%d]:\n", i);
> pr_info("Base address: 0x%08x\n",
> (u32)&ugeth->p_thread_data_rx[i]);
> @@ -2315,45 +2289,15 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)
> uf_regs = uccf->uf_regs;
> ug_regs = ugeth->ug_regs;
>
> - switch (ug_info->numThreadsRx) {
> - case UCC_GETH_NUM_OF_THREADS_1:
> - numThreadsRxNumerical = 1;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_2:
> - numThreadsRxNumerical = 2;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_4:
> - numThreadsRxNumerical = 4;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_6:
> - numThreadsRxNumerical = 6;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_8:
> - numThreadsRxNumerical = 8;
> - break;
> - default:
> + numThreadsRxNumerical = ucc_geth_thread_count(ug_info->numThreadsRx);
> + if (!numThreadsRxNumerical) {
> if (netif_msg_ifup(ugeth))
> pr_err("Bad number of Rx threads value\n");
> return -EINVAL;
> }
>
> - switch (ug_info->numThreadsTx) {
> - case UCC_GETH_NUM_OF_THREADS_1:
> - numThreadsTxNumerical = 1;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_2:
> - numThreadsTxNumerical = 2;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_4:
> - numThreadsTxNumerical = 4;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_6:
> - numThreadsTxNumerical = 6;
> - break;
> - case UCC_GETH_NUM_OF_THREADS_8:
> - numThreadsTxNumerical = 8;
> - break;
> - default:
> + numThreadsTxNumerical = ucc_geth_thread_count(ug_info->numThreadsTx);
> + if (!numThreadsTxNumerical) {
> if (netif_msg_ifup(ugeth))
> pr_err("Bad number of Tx threads value\n");
> return -EINVAL;
>
^ permalink raw reply
* Re: [PATCH 15/20] ethernet: ucc_geth: use UCC_GETH_{RX, TX}_BD_RING_ALIGNMENT macros directly
From: Christophe Leroy @ 2020-12-08 15:14 UTC (permalink / raw)
To: Rasmus Villemoes, Li Yang, David S. Miller, Jakub Kicinski
Cc: Vladimir Oltean, Zhao Qiang, linuxppc-dev, linux-kernel, netdev
In-Reply-To: <20201205191744.7847-16-rasmus.villemoes@prevas.dk>
Le 05/12/2020 à 20:17, Rasmus Villemoes a écrit :
> These macros both have the value 32, there's no point first
> initializing align to a lower value.
>
> If anything, one could throw in a
> BUILD_BUG_ON(UCC_GETH_TX_BD_RING_ALIGNMENT < 4), but it's not worth it
> - lots of code depends on named constants having sensible values.
>
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
> drivers/net/ethernet/freescale/ucc_geth.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
> index 273342233bba..ccde42f547b8 100644
> --- a/drivers/net/ethernet/freescale/ucc_geth.c
> +++ b/drivers/net/ethernet/freescale/ucc_geth.c
> @@ -2196,9 +2196,7 @@ static int ucc_geth_alloc_tx(struct ucc_geth_private *ugeth)
> UCC_GETH_TX_BD_RING_SIZE_MEMORY_ALIGNMENT)
> length += UCC_GETH_TX_BD_RING_SIZE_MEMORY_ALIGNMENT;
> if (uf_info->bd_mem_part == MEM_PART_SYSTEM) {
> - u32 align = 4;
> - if (UCC_GETH_TX_BD_RING_ALIGNMENT > 4)
> - align = UCC_GETH_TX_BD_RING_ALIGNMENT;
> + u32 align = UCC_GETH_TX_BD_RING_ALIGNMENT;
I think checkpatch.pl will expect an empty line here in addition.
> ugeth->tx_bd_ring_offset[j] =
> (u32) kmalloc((u32) (length + align), GFP_KERNEL);
>
> @@ -2274,9 +2272,7 @@ static int ucc_geth_alloc_rx(struct ucc_geth_private *ugeth)
> for (j = 0; j < ug_info->numQueuesRx; j++) {
> length = ug_info->bdRingLenRx[j] * sizeof(struct qe_bd);
> if (uf_info->bd_mem_part == MEM_PART_SYSTEM) {
> - u32 align = 4;
> - if (UCC_GETH_RX_BD_RING_ALIGNMENT > 4)
> - align = UCC_GETH_RX_BD_RING_ALIGNMENT;
> + u32 align = UCC_GETH_RX_BD_RING_ALIGNMENT;
Same
> ugeth->rx_bd_ring_offset[j] =
> (u32) kmalloc((u32) (length + align), GFP_KERNEL);
> if (ugeth->rx_bd_ring_offset[j] != 0)
>
^ permalink raw reply
* [PATCH 05/13] powerpc/xive: Fix allocation of pages donated to the XIVE controller
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
The XIVE interrupt controller uses a set of internal tables to handle
interrupt routing. The small tables (SBE, EAT) are direct tables and
allocated by OPAL. The bigger ones (NVT, ENDT) are indirect, i.e., the
first table entries point to a page which contains the XIVE structures
used by HW. For these, OPAL only allocates the first page level and
requests pages to be allocated by Linux when a new entry is inserted.
Make sure that these pages can not be reclaimed. The problem can be
seen while stressing a system with 4K KVM guests, 16 vCPUs each.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/sysdev/xive/xive-internal.h | 2 ++
arch/powerpc/sysdev/xive/common.c | 2 +-
arch/powerpc/sysdev/xive/native.c | 2 +-
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index d701af7fb48c..1eacc90f4dcf 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -73,6 +73,8 @@ static inline u32 xive_alloc_order(u32 queue_shift)
return (queue_shift > PAGE_SHIFT) ? (queue_shift - PAGE_SHIFT) : 0;
}
+#define XIVE_GFP (__GFP_NORETRY | __GFP_NOWARN | __GFP_NOMEMALLOC)
+
extern bool xive_cmdline_disabled;
#endif /* __XIVE_INTERNAL_H */
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 605238ca65e4..80fd97d764ab 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1544,7 +1544,7 @@ __be32 *xive_queue_page_alloc(unsigned int cpu, u32 queue_shift)
__be32 *qpage;
alloc_order = xive_alloc_order(queue_shift);
- pages = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, alloc_order);
+ pages = alloc_pages_node(cpu_to_node(cpu), XIVE_GFP, alloc_order);
if (!pages)
return ERR_PTR(-ENOMEM);
qpage = (__be32 *)page_address(pages);
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index cb58ec7ce77a..6afb44d0d816 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -643,7 +643,7 @@ static bool xive_native_provision_pages(void)
* XXX TODO: Try to make the allocation local to the node where
* the chip resides.
*/
- p = kmem_cache_alloc(xive_provision_cache, GFP_KERNEL);
+ p = kmem_cache_alloc(xive_provision_cache, XIVE_GFP);
if (!p) {
pr_err("Failed to allocate provisioning page\n");
return false;
--
2.26.2
^ permalink raw reply related
* Re: [PATCH 14/20] ethernet: ucc_geth: don't statically allocate eight ucc_geth_info
From: Christophe Leroy @ 2020-12-08 15:13 UTC (permalink / raw)
To: Rasmus Villemoes, Li Yang, David S. Miller, Jakub Kicinski
Cc: Vladimir Oltean, Zhao Qiang, linuxppc-dev, linux-kernel, netdev
In-Reply-To: <20201205191744.7847-15-rasmus.villemoes@prevas.dk>
Le 05/12/2020 à 20:17, Rasmus Villemoes a écrit :
> struct ucc_geth_info is somewhat large, and on systems with only one
> or two UCC instances, that just wastes a few KB of memory. So
> allocate and populate a chunk of memory at probe time instead of
> initializing them all during driver init.
>
> Note that the existing "ug_info == NULL" check was dead code, as the
> address of some static array element can obviously never be NULL.
>
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
> drivers/net/ethernet/freescale/ucc_geth.c | 32 +++++++++--------------
> 1 file changed, 12 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
> index a06744d8b4af..273342233bba 100644
> --- a/drivers/net/ethernet/freescale/ucc_geth.c
> +++ b/drivers/net/ethernet/freescale/ucc_geth.c
> @@ -157,8 +157,6 @@ static const struct ucc_geth_info ugeth_primary_info = {
> .riscRx = QE_RISC_ALLOCATION_RISC1_AND_RISC2,
> };
>
> -static struct ucc_geth_info ugeth_info[8];
> -
> #ifdef DEBUG
> static void mem_disp(u8 *addr, int size)
> {
> @@ -3714,25 +3712,23 @@ static int ucc_geth_probe(struct platform_device* ofdev)
> if ((ucc_num < 0) || (ucc_num > 7))
> return -ENODEV;
>
> - ug_info = &ugeth_info[ucc_num];
> - if (ug_info == NULL) {
> - if (netif_msg_probe(&debug))
> - pr_err("[%d] Missing additional data!\n", ucc_num);
> - return -ENODEV;
> - }
> + ug_info = kmalloc(sizeof(*ug_info), GFP_KERNEL);
Could we use dev_kmalloc() instead, to avoid the freeing on the wait out and the err_free_info: path ?
> + if (ug_info == NULL)
> + return -ENOMEM;
> + memcpy(ug_info, &ugeth_primary_info, sizeof(*ug_info));
>
> ug_info->uf_info.ucc_num = ucc_num;
>
> err = ucc_geth_parse_clock(np, "rx", &ug_info->uf_info.rx_clock);
> if (err)
> - return err;
> + goto err_free_info;
> err = ucc_geth_parse_clock(np, "tx", &ug_info->uf_info.tx_clock);
> if (err)
> - return err;
> + goto err_free_info;
>
> err = of_address_to_resource(np, 0, &res);
> if (err)
> - return -EINVAL;
> + goto err_free_info;
>
> ug_info->uf_info.regs = res.start;
> ug_info->uf_info.irq = irq_of_parse_and_map(np, 0);
> @@ -3745,7 +3741,7 @@ static int ucc_geth_probe(struct platform_device* ofdev)
> */
> err = of_phy_register_fixed_link(np);
> if (err)
> - return err;
> + goto err_free_info;
> ug_info->phy_node = of_node_get(np);
> }
>
> @@ -3876,6 +3872,8 @@ static int ucc_geth_probe(struct platform_device* ofdev)
> of_phy_deregister_fixed_link(np);
> of_node_put(ug_info->tbi_node);
> of_node_put(ug_info->phy_node);
> +err_free_info:
> + kfree(ug_info);
>
> return err;
> }
> @@ -3886,6 +3884,7 @@ static int ucc_geth_remove(struct platform_device* ofdev)
> struct ucc_geth_private *ugeth = netdev_priv(dev);
> struct device_node *np = ofdev->dev.of_node;
>
> + kfree(ugeth->ug_info);
> ucc_geth_memclean(ugeth);
> if (of_phy_is_fixed_link(np))
> of_phy_deregister_fixed_link(np);
> @@ -3920,17 +3919,10 @@ static struct platform_driver ucc_geth_driver = {
>
> static int __init ucc_geth_init(void)
> {
> - int i, ret;
> -
> if (netif_msg_drv(&debug))
> pr_info(DRV_DESC "\n");
> - for (i = 0; i < 8; i++)
> - memcpy(&(ugeth_info[i]), &ugeth_primary_info,
> - sizeof(ugeth_primary_info));
> -
> - ret = platform_driver_register(&ucc_geth_driver);
>
> - return ret;
> + return platform_driver_register(&ucc_geth_driver);
> }
>
> static void __exit ucc_geth_exit(void)
>
^ permalink raw reply
* [PATCH 06/13] powerpc/xive: Add a name to the IRQ domain
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
We hope one day to handle multiple irq_domain in the XIVE driver.
Start simple by setting the name using the DT node.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/sysdev/xive/xive-internal.h | 4 ++--
arch/powerpc/sysdev/xive/common.c | 10 +++++-----
arch/powerpc/sysdev/xive/native.c | 2 +-
arch/powerpc/sysdev/xive/spapr.c | 2 +-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index 1eacc90f4dcf..066d6fe3dc1d 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -63,8 +63,8 @@ struct xive_ops {
const char *name;
};
-bool xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 offset,
- u8 max_prio);
+bool xive_core_init(struct device_node *np, const struct xive_ops *ops,
+ void __iomem *area, u32 offset, u8 max_prio);
__be32 *xive_queue_page_alloc(unsigned int cpu, u32 queue_shift);
int xive_core_debug_init(void);
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 80fd97d764ab..721617f0f854 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1310,9 +1310,9 @@ static const struct irq_domain_ops xive_irq_domain_ops = {
.xlate = xive_irq_domain_xlate,
};
-static void __init xive_init_host(void)
+static void __init xive_init_host(struct device_node *np)
{
- xive_irq_domain = irq_domain_add_nomap(NULL, XIVE_MAX_IRQ,
+ xive_irq_domain = irq_domain_add_nomap(np, XIVE_MAX_IRQ,
&xive_irq_domain_ops, NULL);
if (WARN_ON(xive_irq_domain == NULL))
return;
@@ -1508,8 +1508,8 @@ void xive_shutdown(void)
xive_ops->shutdown();
}
-bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 offset,
- u8 max_prio)
+bool __init xive_core_init(struct device_node *np, const struct xive_ops *ops,
+ void __iomem *area, u32 offset, u8 max_prio)
{
xive_tima = area;
xive_tima_offset = offset;
@@ -1520,7 +1520,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
__xive_enabled = true;
pr_devel("Initializing host..\n");
- xive_init_host();
+ xive_init_host(np);
pr_devel("Initializing boot CPU..\n");
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 6afb44d0d816..5f1e5aed8ab4 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -622,7 +622,7 @@ bool __init xive_native_init(void)
xive_native_setup_pools();
/* Initialize XIVE core with our backend */
- if (!xive_core_init(&xive_native_ops, tima, TM_QW3_HV_PHYS,
+ if (!xive_core_init(np, &xive_native_ops, tima, TM_QW3_HV_PHYS,
max_prio)) {
opal_xive_reset(OPAL_XIVE_MODE_EMU);
return false;
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 1e3674d7ea7b..6610e5149d5a 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -857,7 +857,7 @@ bool __init xive_spapr_init(void)
}
/* Initialize XIVE core with our backend */
- if (!xive_core_init(&xive_spapr_ops, tima, TM_QW1_OS, max_prio))
+ if (!xive_core_init(np, &xive_spapr_ops, tima, TM_QW1_OS, max_prio))
return false;
pr_info("Using %dkB queues\n", 1 << (xive_queue_shift - 10));
--
2.26.2
^ permalink raw reply related
* [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
Introduce a vp_err() macro to standardize error reporting.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/sysdev/xive/native.c | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 4902d05ebbd1..42297a131a6e 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -122,6 +122,8 @@ static int xive_native_get_irq_config(u32 hw_irq, u32 *target, u8 *prio,
return rc == 0 ? 0 : -ENXIO;
}
+#define vp_err(vp, fmt, ...) pr_err("VP[0x%x]: " fmt, vp, ##__VA_ARGS__)
+
/* This can be called multiple time to change a queue configuration */
int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
__be32 *qpage, u32 order, bool can_escalate)
@@ -149,7 +151,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
&esc_irq_be,
NULL);
if (rc) {
- pr_err("Error %lld getting queue info prio %d\n", rc, prio);
+ vp_err(vp_id, "Failed to get queue %d info : %lld\n", prio, rc);
rc = -EIO;
goto fail;
}
@@ -172,7 +174,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
msleep(OPAL_BUSY_DELAY_MS);
}
if (rc) {
- pr_err("Error %lld setting queue for prio %d\n", rc, prio);
+ vp_err(vp_id, "Failed to set queue %d info: %lld\n", prio, rc);
rc = -EIO;
} else {
/*
@@ -199,7 +201,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
msleep(OPAL_BUSY_DELAY_MS);
}
if (rc)
- pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
+ vp_err(vp_id, "Failed to disable queue %d : %lld\n", prio, rc);
}
void xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
@@ -698,6 +700,8 @@ int xive_native_enable_vp(u32 vp_id, bool single_escalation)
break;
msleep(OPAL_BUSY_DELAY_MS);
}
+ if (rc)
+ vp_err(vp_id, "Failed to enable VP : %lld\n", rc);
return rc ? -EIO : 0;
}
EXPORT_SYMBOL_GPL(xive_native_enable_vp);
@@ -712,6 +716,8 @@ int xive_native_disable_vp(u32 vp_id)
break;
msleep(OPAL_BUSY_DELAY_MS);
}
+ if (rc)
+ vp_err(vp_id, "Failed to disable VP : %lld\n", rc);
return rc ? -EIO : 0;
}
EXPORT_SYMBOL_GPL(xive_native_disable_vp);
@@ -723,8 +729,10 @@ int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id)
s64 rc;
rc = opal_xive_get_vp_info(vp_id, NULL, &vp_cam_be, NULL, &vp_chip_id_be);
- if (rc)
+ if (rc) {
+ vp_err(vp_id, "Failed to get VP info : %lld\n", rc);
return -EIO;
+ }
*out_cam_id = be64_to_cpu(vp_cam_be) & 0xffffffffu;
*out_chip_id = be32_to_cpu(vp_chip_id_be);
@@ -755,8 +763,7 @@ int xive_native_get_queue_info(u32 vp_id, u32 prio,
rc = opal_xive_get_queue_info(vp_id, prio, &qpage, &qsize,
&qeoi_page, &escalate_irq, &qflags);
if (rc) {
- pr_err("OPAL failed to get queue info for VCPU %d/%d : %lld\n",
- vp_id, prio, rc);
+ vp_err(vp_id, "failed to get queue %d info : %lld\n", prio, rc);
return -EIO;
}
@@ -784,8 +791,7 @@ int xive_native_get_queue_state(u32 vp_id, u32 prio, u32 *qtoggle, u32 *qindex)
rc = opal_xive_get_queue_state(vp_id, prio, &opal_qtoggle,
&opal_qindex);
if (rc) {
- pr_err("OPAL failed to get queue state for VCPU %d/%d : %lld\n",
- vp_id, prio, rc);
+ vp_err(vp_id, "failed to get queue %d state : %lld\n", prio, rc);
return -EIO;
}
@@ -804,8 +810,7 @@ int xive_native_set_queue_state(u32 vp_id, u32 prio, u32 qtoggle, u32 qindex)
rc = opal_xive_set_queue_state(vp_id, prio, qtoggle, qindex);
if (rc) {
- pr_err("OPAL failed to set queue state for VCPU %d/%d : %lld\n",
- vp_id, prio, rc);
+ vp_err(vp_id, "failed to set queue %d state : %lld\n", prio, rc);
return -EIO;
}
@@ -827,8 +832,7 @@ int xive_native_get_vp_state(u32 vp_id, u64 *out_state)
rc = opal_xive_get_vp_state(vp_id, &state);
if (rc) {
- pr_err("OPAL failed to get vp state for VCPU %d : %lld\n",
- vp_id, rc);
+ vp_err(vp_id, "failed to get vp state : %lld\n", rc);
return -EIO;
}
--
2.26.2
^ permalink raw reply related
* [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
This flag was used to support the PHB4 LSIs on P9 DD1 and we have
stopped supporting this CPU when DD2 came out. See skiboot commit:
https://github.com/open-power/skiboot/commit/0b0d15e3c170
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/include/asm/opal-api.h | 2 +-
arch/powerpc/include/asm/xive.h | 2 +-
arch/powerpc/kvm/book3s_xive.c | 54 +++++------------------------
arch/powerpc/sysdev/xive/common.c | 39 +--------------------
arch/powerpc/sysdev/xive/native.c | 2 --
5 files changed, 11 insertions(+), 88 deletions(-)
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 48ee604ca39a..0455b679c050 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1092,7 +1092,7 @@ enum {
OPAL_XIVE_IRQ_STORE_EOI = 0x00000002,
OPAL_XIVE_IRQ_LSI = 0x00000004,
OPAL_XIVE_IRQ_SHIFT_BUG = 0x00000008, /* P9 DD1.0 workaround */
- OPAL_XIVE_IRQ_MASK_VIA_FW = 0x00000010,
+ OPAL_XIVE_IRQ_MASK_VIA_FW = 0x00000010, /* P9 DD1.0 workaround */
OPAL_XIVE_IRQ_EOI_VIA_FW = 0x00000020,
};
diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index ff805885a028..d62368d0ba91 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -61,7 +61,7 @@ struct xive_irq_data {
#define XIVE_IRQ_FLAG_STORE_EOI 0x01
#define XIVE_IRQ_FLAG_LSI 0x02
#define XIVE_IRQ_FLAG_SHIFT_BUG 0x04 /* P9 DD1.0 workaround */
-#define XIVE_IRQ_FLAG_MASK_FW 0x08
+#define XIVE_IRQ_FLAG_MASK_FW 0x08 /* P9 DD1.0 workaround */
#define XIVE_IRQ_FLAG_EOI_FW 0x10
#define XIVE_IRQ_FLAG_H_INT_ESB 0x20
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index fae1c2e8da29..59a986ae640b 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -419,37 +419,16 @@ static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
/* Get the right irq */
kvmppc_xive_select_irq(state, &hw_num, &xd);
- /*
- * If the interrupt is marked as needing masking via
- * firmware, we do it here. Firmware masking however
- * is "lossy", it won't return the old p and q bits
- * and won't set the interrupt to a state where it will
- * record queued ones. If this is an issue we should do
- * lazy masking instead.
- *
- * For now, we work around this in unmask by forcing
- * an interrupt whenever we unmask a non-LSI via FW
- * (if ever).
- */
- if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
- xive_native_configure_irq(hw_num,
- kvmppc_xive_vp(xive, state->act_server),
- MASKED, state->number);
- /* set old_p so we can track if an H_EOI was done */
- state->old_p = true;
- state->old_q = false;
- } else {
- /* Set PQ to 10, return old P and old Q and remember them */
- val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
- state->old_p = !!(val & 2);
- state->old_q = !!(val & 1);
+ /* Set PQ to 10, return old P and old Q and remember them */
+ val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
+ state->old_p = !!(val & 2);
+ state->old_q = !!(val & 1);
- /*
- * Synchronize hardware to sensure the queues are updated
- * when masking
+ /*
+ * Synchronize hardware to sensure the queues are updated
+ * when masking
*/
- xive_native_sync_source(hw_num);
- }
+ xive_native_sync_source(hw_num);
return old_prio;
}
@@ -483,23 +462,6 @@ static void xive_finish_unmask(struct kvmppc_xive *xive,
/* Get the right irq */
kvmppc_xive_select_irq(state, &hw_num, &xd);
- /*
- * See comment in xive_lock_and_mask() concerning masking
- * via firmware.
- */
- if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
- xive_native_configure_irq(hw_num,
- kvmppc_xive_vp(xive, state->act_server),
- state->act_priority, state->number);
- /* If an EOI is needed, do it here */
- if (!state->old_p)
- xive_vm_source_eoi(hw_num, xd);
- /* If this is not an LSI, force a trigger */
- if (!(xd->flags & OPAL_XIVE_IRQ_LSI))
- xive_irq_trigger(xd);
- goto bail;
- }
-
/* Old Q set, set PQ to 11 */
if (state->old_q)
xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_11);
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index a9259470bf9f..a71412fefb65 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -424,9 +424,7 @@ static void xive_irq_eoi(struct irq_data *d)
}
/*
- * Helper used to mask and unmask an interrupt source. This
- * is only called for normal interrupts that do not require
- * masking/unmasking via firmware.
+ * Helper used to mask and unmask an interrupt source.
*/
static void xive_do_source_set_mask(struct xive_irq_data *xd,
bool mask)
@@ -673,20 +671,6 @@ static void xive_irq_unmask(struct irq_data *d)
pr_devel("xive_irq_unmask: irq %d data @%p\n", d->irq, xd);
- /*
- * This is a workaround for PCI LSI problems on P9, for
- * these, we call FW to set the mask. The problems might
- * be fixed by P9 DD2.0, if that is the case, firmware
- * will no longer set that flag.
- */
- if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
- unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
- xive_ops->configure_irq(hw_irq,
- get_hard_smp_processor_id(xd->target),
- xive_irq_priority, d->irq);
- return;
- }
-
xive_do_source_set_mask(xd, false);
}
@@ -696,20 +680,6 @@ static void xive_irq_mask(struct irq_data *d)
pr_devel("xive_irq_mask: irq %d data @%p\n", d->irq, xd);
- /*
- * This is a workaround for PCI LSI problems on P9, for
- * these, we call OPAL to set the mask. The problems might
- * be fixed by P9 DD2.0, if that is the case, firmware
- * will no longer set that flag.
- */
- if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
- unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
- xive_ops->configure_irq(hw_irq,
- get_hard_smp_processor_id(xd->target),
- 0xff, d->irq);
- return;
- }
-
xive_do_source_set_mask(xd, true);
}
@@ -852,13 +822,6 @@ static int xive_irq_set_vcpu_affinity(struct irq_data *d, void *state)
int rc;
u8 pq;
- /*
- * We only support this on interrupts that do not require
- * firmware calls for masking and unmasking
- */
- if (xd->flags & XIVE_IRQ_FLAG_MASK_FW)
- return -EIO;
-
/*
* This is called by KVM with state non-NULL for enabling
* pass-through or NULL for disabling it
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 0310783241b5..deb97ad25d62 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
if (opal_flags & OPAL_XIVE_IRQ_LSI)
data->flags |= XIVE_IRQ_FLAG_LSI;
- if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
- data->flags |= XIVE_IRQ_FLAG_MASK_FW;
if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
data->flags |= XIVE_IRQ_FLAG_EOI_FW;
data->eoi_page = be64_to_cpu(eoi_page);
--
2.26.2
^ permalink raw reply related
* [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
Full state of the Linux interrupt descriptors can be dumped under
debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for
the XIVE interrupt controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/sysdev/xive/common.c | 58 +++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 721617f0f854..411cba12d73b 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1303,11 +1303,69 @@ static int xive_irq_domain_match(struct irq_domain *h, struct device_node *node,
return xive_ops->match(node);
}
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+static const char * const esb_names[] = { "RESET", "OFF", "PENDING", "QUEUED" };
+
+static const struct {
+ u64 mask;
+ char *name;
+} xive_irq_flags[] = {
+ { XIVE_IRQ_FLAG_STORE_EOI, "STORE_EOI" },
+ { XIVE_IRQ_FLAG_LSI, "LSI" },
+ { XIVE_IRQ_FLAG_SHIFT_BUG, "SHIFT_BUG" },
+ { XIVE_IRQ_FLAG_MASK_FW, "MASK_FW" },
+ { XIVE_IRQ_FLAG_EOI_FW, "EOI_FW" },
+ { XIVE_IRQ_FLAG_H_INT_ESB, "H_INT_ESB" },
+ { XIVE_IRQ_FLAG_NO_EOI, "NO_EOI" },
+};
+
+static void xive_irq_domain_debug_show(struct seq_file *m, struct irq_domain *d,
+ struct irq_data *irqd, int ind)
+{
+ struct xive_irq_data *xd;
+ u64 val;
+ int i;
+
+ /* No IRQ domain level information. To be done */
+ if (!irqd)
+ return;
+
+ if (!is_xive_irq(irq_data_get_irq_chip(irqd)))
+ return;
+
+ seq_printf(m, "%*sXIVE:\n", ind, "");
+ ind++;
+
+ xd = irq_data_get_irq_handler_data(irqd);
+ if (!xd) {
+ seq_printf(m, "%*snot assigned\n", ind, "");
+ return;
+ }
+
+ val = xive_esb_read(xd, XIVE_ESB_GET);
+ seq_printf(m, "%*sESB: %s\n", ind, "", esb_names[val & 0x3]);
+ seq_printf(m, "%*sPstate: %s %s\n", ind, "", xd->stale_p ? "stale" : "",
+ xd->saved_p ? "saved" : "");
+ seq_printf(m, "%*sTarget: %d\n", ind, "", xd->target);
+ seq_printf(m, "%*sChip: %d\n", ind, "", xd->src_chip);
+ seq_printf(m, "%*sTrigger: 0x%016llx\n", ind, "", xd->trig_page);
+ seq_printf(m, "%*sEOI: 0x%016llx\n", ind, "", xd->eoi_page);
+ seq_printf(m, "%*sFlags: 0x%llx\n", ind, "", xd->flags);
+ for (i = 0; i < ARRAY_SIZE(xive_irq_flags); i++) {
+ if (xd->flags & xive_irq_flags[i].mask)
+ seq_printf(m, "%*s%s\n", ind + 12, "", xive_irq_flags[i].name);
+ }
+}
+#endif
+
static const struct irq_domain_ops xive_irq_domain_ops = {
.match = xive_irq_domain_match,
.map = xive_irq_domain_map,
.unmap = xive_irq_domain_unmap,
.xlate = xive_irq_domain_xlate,
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+ .debug_show = xive_irq_domain_debug_show,
+#endif
};
static void __init xive_init_host(struct device_node *np)
--
2.26.2
^ permalink raw reply related
* [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has
its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or
at the hypervisor level for pSeries. In practice, these interrupts are
not always used. pSeries/PowerVM prefers local doorbells for local
threads since they are faster. On PowerNV, global doorbells are also
preferred for the same reason.
The mapping in the Linux is reduced to a single interrupt using HW
interrupt number 0 and a custom irq_chip to handle EOI. This can cause
performance issues in some benchmark (ipistorm) on multichip systems.
Clarify the use of the 0 value, it will help in improving multichip
support.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/sysdev/xive/xive-internal.h | 2 ++
arch/powerpc/sysdev/xive/common.c | 10 +++++-----
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index b7b901da2168..d701af7fb48c 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -5,6 +5,8 @@
#ifndef __XIVE_INTERNAL_H
#define __XIVE_INTERNAL_H
+#define XIVE_IPI_HW_IRQ 0 /* interrupt source # for IPIs */
+
/*
* A "disabled" interrupt should never fire, to catch problems
* we set its logical number to this
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 65af34ac1fa2..ee375daf8114 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1142,7 +1142,7 @@ static void __init xive_request_ipi(void)
return;
/* Initialize it */
- virq = irq_create_mapping(xive_irq_domain, 0);
+ virq = irq_create_mapping(xive_irq_domain, XIVE_IPI_HW_IRQ);
xive_ipi_irq = virq;
WARN_ON(request_irq(virq, xive_muxed_ipi_action,
@@ -1242,7 +1242,7 @@ static int xive_irq_domain_map(struct irq_domain *h, unsigned int virq,
#ifdef CONFIG_SMP
/* IPIs are special and come up with HW number 0 */
- if (hw == 0) {
+ if (hw == XIVE_IPI_HW_IRQ) {
/*
* IPIs are marked per-cpu. We use separate HW interrupts under
* the hood but associated with the same "linux" interrupt
@@ -1271,7 +1271,7 @@ static void xive_irq_domain_unmap(struct irq_domain *d, unsigned int virq)
if (!data)
return;
hw_irq = (unsigned int)irqd_to_hwirq(data);
- if (hw_irq)
+ if (hw_irq != XIVE_IPI_HW_IRQ)
xive_irq_free_data(virq);
}
@@ -1421,7 +1421,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
* Ignore anything that isn't a XIVE irq and ignore
* IPIs, so can just be dropped.
*/
- if (d->domain != xive_irq_domain || hw_irq == 0)
+ if (d->domain != xive_irq_domain || hw_irq == XIVE_IPI_HW_IRQ)
continue;
/*
@@ -1655,7 +1655,7 @@ static int xive_core_debug_show(struct seq_file *m, void *private)
hw_irq = (unsigned int)irqd_to_hwirq(d);
/* IPIs are special (HW number 0) */
- if (hw_irq)
+ if (hw_irq != XIVE_IPI_HW_IRQ)
xive_debug_show_irq(m, hw_irq, d);
}
return 0;
--
2.26.2
^ permalink raw reply related
* [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater, Greg Kurz
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
This is useful to track allocation of the HW resources on per guest
basis. Making sure IPIs are local to the chip of the vCPUs reduces
rerouting between interrupt controllers and gives better performance
in case of pinning. Checking the distribution of VP structures on the
chips also helps in reducing PowerBUS traffic.
Signed-off-by: Greg Kurz <groug@kaod.org>
[ clg: resurrected show_sources and reworked ouput ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/kvm/book3s_xive.h | 2 +
arch/powerpc/kvm/book3s_xive.c | 76 ++++++++++++++++++++++-----
arch/powerpc/kvm/book3s_xive_native.c | 21 ++++++--
3 files changed, 82 insertions(+), 17 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
index 382e3a56e789..d5d4fee7ac94 100644
--- a/arch/powerpc/kvm/book3s_xive.h
+++ b/arch/powerpc/kvm/book3s_xive.h
@@ -290,6 +290,8 @@ extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
*/
void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu);
int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu);
+void kvmppc_xive_debug_show_sources(struct seq_file *m,
+ struct kvmppc_xive_src_block *sb);
struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
struct kvmppc_xive *xive, int irq);
void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index a0ebc29f30b2..18a6b75a3bfd 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -2125,9 +2125,8 @@ int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
if (!q->qpage && !xc->esc_virq[i])
continue;
- seq_printf(m, " [q%d]: ", i);
-
if (q->qpage) {
+ seq_printf(m, " q[%d]: ", i);
idx = q->idx;
i0 = be32_to_cpup(q->qpage + idx);
idx = (idx + 1) & q->msk;
@@ -2141,16 +2140,54 @@ int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
irq_data_get_irq_handler_data(d);
u64 pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
- seq_printf(m, "E:%c%c I(%d:%llx:%llx)",
- (pq & XIVE_ESB_VAL_P) ? 'P' : 'p',
- (pq & XIVE_ESB_VAL_Q) ? 'Q' : 'q',
- xc->esc_virq[i], pq, xd->eoi_page);
+ seq_printf(m, " ESC %d %c%c EOI @%llx",
+ xc->esc_virq[i],
+ (pq & XIVE_ESB_VAL_P) ? 'P' : '-',
+ (pq & XIVE_ESB_VAL_Q) ? 'Q' : '-',
+ xd->eoi_page);
seq_puts(m, "\n");
}
}
return 0;
}
+void kvmppc_xive_debug_show_sources(struct seq_file *m,
+ struct kvmppc_xive_src_block *sb)
+{
+ int i;
+
+ seq_puts(m, " LISN HW/CHIP TYPE PQ EISN CPU/PRIO\n");
+ for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
+ struct kvmppc_xive_irq_state *state = &sb->irq_state[i];
+ struct xive_irq_data *xd;
+ u64 pq;
+ u32 hw_num;
+
+ if (!state->valid)
+ continue;
+
+ kvmppc_xive_select_irq(state, &hw_num, &xd);
+
+ pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
+
+ seq_printf(m, "%08x %08x/%02x", state->number, hw_num,
+ xd->src_chip);
+ if (state->lsi)
+ seq_printf(m, " %cLSI", state->asserted ? '^' : ' ');
+ else
+ seq_puts(m, " MSI");
+
+ seq_printf(m, " %s %c%c %08x % 4d/%d",
+ state->ipi_number == hw_num ? "IPI" : " PT",
+ pq & XIVE_ESB_VAL_P ? 'P' : '-',
+ pq & XIVE_ESB_VAL_Q ? 'Q' : '-',
+ state->eisn, state->act_server,
+ state->act_priority);
+
+ seq_puts(m, "\n");
+ }
+}
+
static int xive_debug_show(struct seq_file *m, void *private)
{
struct kvmppc_xive *xive = m->private;
@@ -2171,7 +2208,7 @@ static int xive_debug_show(struct seq_file *m, void *private)
if (!kvm)
return 0;
- seq_printf(m, "=========\nVCPU state\n=========\n");
+ seq_puts(m, "=========\nVCPU state\n=========\n");
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@@ -2179,11 +2216,12 @@ static int xive_debug_show(struct seq_file *m, void *private)
if (!xc)
continue;
- seq_printf(m, "cpu server %#x VP:%#x CPPR:%#x HWCPPR:%#x"
- " MFRR:%#x PEND:%#x h_xirr: R=%lld V=%lld\n",
- xc->server_num, xc->vp_id, xc->cppr, xc->hw_cppr,
- xc->mfrr, xc->pending,
- xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
+ seq_printf(m, "VCPU %d: VP:%#x/%02x\n"
+ " CPPR:%#x HWCPPR:%#x MFRR:%#x PEND:%#x h_xirr: R=%lld V=%lld\n",
+ xc->server_num, xc->vp_id, xc->vp_chip_id,
+ xc->cppr, xc->hw_cppr,
+ xc->mfrr, xc->pending,
+ xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
kvmppc_xive_debug_show_queues(m, vcpu);
@@ -2199,13 +2237,25 @@ static int xive_debug_show(struct seq_file *m, void *private)
t_vm_h_ipi += xc->stat_vm_h_ipi;
}
- seq_printf(m, "Hcalls totals\n");
+ seq_puts(m, "Hcalls totals\n");
seq_printf(m, " H_XIRR R=%10lld V=%10lld\n", t_rm_h_xirr, t_vm_h_xirr);
seq_printf(m, " H_IPOLL R=%10lld V=%10lld\n", t_rm_h_ipoll, t_vm_h_ipoll);
seq_printf(m, " H_CPPR R=%10lld V=%10lld\n", t_rm_h_cppr, t_vm_h_cppr);
seq_printf(m, " H_EOI R=%10lld V=%10lld\n", t_rm_h_eoi, t_vm_h_eoi);
seq_printf(m, " H_IPI R=%10lld V=%10lld\n", t_rm_h_ipi, t_vm_h_ipi);
+ seq_puts(m, "=========\nSources\n=========\n");
+
+ for (i = 0; i <= xive->max_sbid; i++) {
+ struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
+
+ if (sb) {
+ arch_spin_lock(&sb->lock);
+ kvmppc_xive_debug_show_sources(m, sb);
+ arch_spin_unlock(&sb->lock);
+ }
+ }
+
return 0;
}
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 6aaaa4bedaaf..9b395381179d 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -1257,18 +1257,31 @@ static int xive_native_debug_show(struct seq_file *m, void *private)
if (!xc)
continue;
- seq_printf(m, "cpu server %#x VP=%#x NSR=%02x CPPR=%02x IBP=%02x PIPR=%02x w01=%016llx w2=%08x\n",
- xc->server_num, xc->vp_id,
+ seq_printf(m, "VCPU %d: VP=%#x/%02x\n"
+ " NSR=%02x CPPR=%02x IBP=%02x PIPR=%02x w01=%016llx w2=%08x\n",
+ xc->server_num, xc->vp_id, xc->vp_chip_id,
vcpu->arch.xive_saved_state.nsr,
vcpu->arch.xive_saved_state.cppr,
vcpu->arch.xive_saved_state.ipb,
vcpu->arch.xive_saved_state.pipr,
- vcpu->arch.xive_saved_state.w01,
- (u32) vcpu->arch.xive_cam_word);
+ be64_to_cpu(vcpu->arch.xive_saved_state.w01),
+ be32_to_cpu(vcpu->arch.xive_cam_word));
kvmppc_xive_debug_show_queues(m, vcpu);
}
+ seq_puts(m, "=========\nSources\n=========\n");
+
+ for (i = 0; i <= xive->max_sbid; i++) {
+ struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
+
+ if (sb) {
+ arch_spin_lock(&sb->lock);
+ kvmppc_xive_debug_show_sources(m, sb);
+ arch_spin_unlock(&sb->lock);
+ }
+ }
+
return 0;
}
--
2.26.2
^ permalink raw reply related
* [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-1-clg@kaod.org>
This flag was used to support the P9 DD1 and we have stopped
supporting this CPU when DD2 came out. See skiboot commit:
https://github.com/open-power/skiboot/commit/0b0d15e3c170
Also, remove eoi handler which is now unused.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
arch/powerpc/include/asm/opal-api.h | 2 +-
arch/powerpc/include/asm/xive.h | 2 +-
arch/powerpc/sysdev/xive/xive-internal.h | 1 -
arch/powerpc/kvm/book3s_xive_template.c | 2 --
arch/powerpc/sysdev/xive/common.c | 13 +------------
arch/powerpc/sysdev/xive/native.c | 12 ------------
arch/powerpc/sysdev/xive/spapr.c | 6 ------
7 files changed, 3 insertions(+), 35 deletions(-)
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0455b679c050..0b63ba7d5917 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1093,7 +1093,7 @@ enum {
OPAL_XIVE_IRQ_LSI = 0x00000004,
OPAL_XIVE_IRQ_SHIFT_BUG = 0x00000008, /* P9 DD1.0 workaround */
OPAL_XIVE_IRQ_MASK_VIA_FW = 0x00000010, /* P9 DD1.0 workaround */
- OPAL_XIVE_IRQ_EOI_VIA_FW = 0x00000020,
+ OPAL_XIVE_IRQ_EOI_VIA_FW = 0x00000020, /* P9 DD1.0 workaround */
};
/* Flags for OPAL_XIVE_GET/SET_QUEUE_INFO */
diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index d62368d0ba91..f6150d7a757a 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -62,7 +62,7 @@ struct xive_irq_data {
#define XIVE_IRQ_FLAG_LSI 0x02
#define XIVE_IRQ_FLAG_SHIFT_BUG 0x04 /* P9 DD1.0 workaround */
#define XIVE_IRQ_FLAG_MASK_FW 0x08 /* P9 DD1.0 workaround */
-#define XIVE_IRQ_FLAG_EOI_FW 0x10
+#define XIVE_IRQ_FLAG_EOI_FW 0x10 /* P9 DD1.0 workaround */
#define XIVE_IRQ_FLAG_H_INT_ESB 0x20
/* Special flag set by KVM for excalation interrupts */
diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index 066d6fe3dc1d..3b7dd2cba9db 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -52,7 +52,6 @@ struct xive_ops {
void (*shutdown)(void);
void (*update_pending)(struct xive_cpu *xc);
- void (*eoi)(u32 hw_irq);
void (*sync_source)(u32 hw_irq);
u64 (*esb_rw)(u32 hw_irq, u32 offset, u64 data, bool write);
#ifdef CONFIG_SMP
diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
index ece36e024a8f..b0015e05d99a 100644
--- a/arch/powerpc/kvm/book3s_xive_template.c
+++ b/arch/powerpc/kvm/book3s_xive_template.c
@@ -74,8 +74,6 @@ static void GLUE(X_PFX,source_eoi)(u32 hw_irq, struct xive_irq_data *xd)
/* If the XIVE supports the new "store EOI facility, use it */
if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
__x_writeq(0, __x_eoi_page(xd) + XIVE_ESB_STORE_EOI);
- else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW)
- opal_int_eoi(hw_irq);
else if (xd->flags & XIVE_IRQ_FLAG_LSI) {
/*
* For LSIs the HW EOI cycle is used rather than PQ bits,
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index a71412fefb65..fe6229dd3241 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -354,18 +354,7 @@ static void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd)
/* If the XIVE supports the new "store EOI facility, use it */
if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
xive_esb_write(xd, XIVE_ESB_STORE_EOI, 0);
- else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW) {
- /*
- * The FW told us to call it. This happens for some
- * interrupt sources that need additional HW whacking
- * beyond the ESB manipulation. For example LPC interrupts
- * on P9 DD1.0 needed a latch to be clared in the LPC bridge
- * itself. The Firmware will take care of it.
- */
- if (WARN_ON_ONCE(!xive_ops->eoi))
- return;
- xive_ops->eoi(hw_irq);
- } else {
+ else {
u8 eoi_val;
/*
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index deb97ad25d62..4902d05ebbd1 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
if (opal_flags & OPAL_XIVE_IRQ_LSI)
data->flags |= XIVE_IRQ_FLAG_LSI;
- if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
- data->flags |= XIVE_IRQ_FLAG_EOI_FW;
data->eoi_page = be64_to_cpu(eoi_page);
data->trig_page = be64_to_cpu(trig_page);
data->esb_shift = be32_to_cpu(esb_shift);
@@ -380,15 +378,6 @@ static void xive_native_update_pending(struct xive_cpu *xc)
}
}
-static void xive_native_eoi(u32 hw_irq)
-{
- /*
- * Not normally used except if specific interrupts need
- * a workaround on EOI.
- */
- opal_int_eoi(hw_irq);
-}
-
static void xive_native_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
{
s64 rc;
@@ -471,7 +460,6 @@ static const struct xive_ops xive_native_ops = {
.match = xive_native_match,
.shutdown = xive_native_shutdown,
.update_pending = xive_native_update_pending,
- .eoi = xive_native_eoi,
.setup_cpu = xive_native_setup_cpu,
.teardown_cpu = xive_native_teardown_cpu,
.sync_source = xive_native_sync_source,
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 6610e5149d5a..01ccc0786ada 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -628,11 +628,6 @@ static void xive_spapr_update_pending(struct xive_cpu *xc)
}
}
-static void xive_spapr_eoi(u32 hw_irq)
-{
- /* Not used */;
-}
-
static void xive_spapr_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
{
/* Only some debug on the TIMA settings */
@@ -677,7 +672,6 @@ static const struct xive_ops xive_spapr_ops = {
.match = xive_spapr_match,
.shutdown = xive_spapr_shutdown,
.update_pending = xive_spapr_update_pending,
- .eoi = xive_spapr_eoi,
.setup_cpu = xive_spapr_setup_cpu,
.teardown_cpu = xive_spapr_teardown_cpu,
.sync_source = xive_spapr_sync_source,
--
2.26.2
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox