* [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs @ 2017-05-03 14:29 Frederic Barrat 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Frederic Barrat @ 2017-05-03 14:29 UTC (permalink / raw) To: alistair, aneesh.kumar, bsingharora, linuxppc-dev, mpe capi2 and opencapi require the TLB invalidations being sent for addresses used on the cxl adapter or opencapi device to be global, as there's a translation cache in the PSL (for capi2) or NPU (for opencapi). The CAPP (for PSL) and NPU snoop the power bus. This is not new: for the hash memory model, as soon as the cxl driver is active, all local TLBIs become global. We need a similar mechanism for the radix memory model. This patch tries to improve things a bit by flagging the contexts requiring global TLBIs, therefore limiting the "upgrade" and not affecting contexts not used by the card. Alistair: for nvlink2, it is my understanding that all the required invalidations are already in place through software mmio/ATSD, i.e. this patch is not useful for you. Submitting as an RFC, since I don't get to touch mmu.h everyday and would like to probe people's reaction. Frederic Barrat (2): powerpc/mm: Add marker for contexts requiring global TLB invalidations cxl: Mark context requiring global TLBIs arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ arch/powerpc/include/asm/tlb.h | 10 ++++++++-- arch/powerpc/mm/mmu_context_book3s64.c | 1 + drivers/misc/cxl/api.c | 5 ++++- drivers/misc/cxl/file.c | 5 ++++- 5 files changed, 26 insertions(+), 4 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-03 14:29 [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs Frederic Barrat @ 2017-05-03 14:29 ` Frederic Barrat 2017-05-04 6:41 ` Aneesh Kumar K.V ` (2 more replies) 2017-05-03 14:29 ` [RFC 2/2] cxl: Mark context requiring global TLBIs Frederic Barrat 2017-05-05 5:28 ` [RFC 0/2] powerpc/mm: Mark memory contexts " Alistair Popple 2 siblings, 3 replies; 12+ messages in thread From: Frederic Barrat @ 2017-05-03 14:29 UTC (permalink / raw) To: alistair, aneesh.kumar, bsingharora, linuxppc-dev, mpe Introduce a new 'flags' attribute per context and define its first bit to be a marker requiring all TLBIs for that context to be broadcasted globally. Once that marker is set on a context, it cannot be removed. Such a marker is useful for memory contexts used by devices behind the NPU and CAPP/PSL. The NPU and the PSL keep their own translation cache so they need to see all the TLBIs for those contexts. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> --- arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ arch/powerpc/include/asm/tlb.h | 10 ++++++++-- arch/powerpc/mm/mmu_context_book3s64.c | 1 + 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 77529a3e3811..7b640ab1cbeb 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -78,8 +78,12 @@ struct spinlock; /* Maximum possible number of NPUs in a system. */ #define NV_MAX_NPUS 8 +/* Bits definition for the context flags */ +#define MM_CONTEXT_GLOBAL_TLBI 1 /* TLBI must be global */ + typedef struct { mm_context_id_t id; + unsigned long flags; u16 user_psize; /* page size index */ /* NPU NMMU context */ @@ -164,5 +168,10 @@ extern void radix_init_pseries(void); static inline void radix_init_pseries(void) { }; #endif +static inline void mm_context_set_global_tlbi(mm_context_t *ctx) +{ + set_bit(MM_CONTEXT_GLOBAL_TLBI, &ctx->flags); +} + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index 609557569f65..bd18ed083011 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -71,8 +71,14 @@ static inline int mm_is_core_local(struct mm_struct *mm) static inline int mm_is_thread_local(struct mm_struct *mm) { - return cpumask_equal(mm_cpumask(mm), - cpumask_of(smp_processor_id())); + int rc; + + rc = cpumask_equal(mm_cpumask(mm), + cpumask_of(smp_processor_id())); +#ifdef CONFIG_PPC_BOOK3S_64 + rc = rc && !test_bit(MM_CONTEXT_GLOBAL_TLBI, &mm->context.flags); +#endif + return rc; } #else diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index c6dca2ae78ef..53983a0c220c 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -156,6 +156,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm) return index; mm->context.id = index; + mm->context.flags = 0; #ifdef CONFIG_PPC_ICSWX mm->context.cop_lockp = kmalloc(sizeof(spinlock_t), GFP_KERNEL); if (!mm->context.cop_lockp) { -- 2.11.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat @ 2017-05-04 6:41 ` Aneesh Kumar K.V 2017-05-04 17:24 ` Frederic Barrat 2017-05-04 7:25 ` Balbir Singh 2017-05-04 9:42 ` Michael Ellerman 2 siblings, 1 reply; 12+ messages in thread From: Aneesh Kumar K.V @ 2017-05-04 6:41 UTC (permalink / raw) To: Frederic Barrat, alistair, bsingharora, linuxppc-dev, mpe Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes: > Introduce a new 'flags' attribute per context and define its first bit > to be a marker requiring all TLBIs for that context to be broadcasted > globally. Once that marker is set on a context, it cannot be removed. > > Such a marker is useful for memory contexts used by devices behind the > NPU and CAPP/PSL. The NPU and the PSL keep their own > translation cache so they need to see all the TLBIs for those > contexts. Can we also switch existing cxl_ctx_in_use() to this ? -aneesh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-04 6:41 ` Aneesh Kumar K.V @ 2017-05-04 17:24 ` Frederic Barrat 0 siblings, 0 replies; 12+ messages in thread From: Frederic Barrat @ 2017-05-04 17:24 UTC (permalink / raw) To: Aneesh Kumar K.V, alistair, bsingharora, linuxppc-dev, mpe Le 04/05/2017 à 08:41, Aneesh Kumar K.V a écrit : > Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes: > >> Introduce a new 'flags' attribute per context and define its first bit >> to be a marker requiring all TLBIs for that context to be broadcasted >> globally. Once that marker is set on a context, it cannot be removed. >> >> Such a marker is useful for memory contexts used by devices behind the >> NPU and CAPP/PSL. The NPU and the PSL keep their own >> translation cache so they need to see all the TLBIs for those >> contexts. > > Can we also switch existing cxl_ctx_in_use() to this ? That was my initial intent. But in the hash code, when calling the tlbie, it seems that we no longer have the related context handy. Or did I miss it? so it would require quite a bit of changes. So I've just focused on fixing the pb for radix for the time being. That being said, we'll have to update what we do for hash if we ever want to support opencapi with hash/powervm (which seems like a strong possibility for next year), as we could have more than one opencapi drivers. Fred ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat 2017-05-04 6:41 ` Aneesh Kumar K.V @ 2017-05-04 7:25 ` Balbir Singh 2017-05-04 9:24 ` Michael Ellerman 2017-05-04 9:42 ` Michael Ellerman 2 siblings, 1 reply; 12+ messages in thread From: Balbir Singh @ 2017-05-04 7:25 UTC (permalink / raw) To: Frederic Barrat, alistair, aneesh.kumar, linuxppc-dev, mpe On Wed, 2017-05-03 at 16:29 +0200, Frederic Barrat wrote: > Introduce a new 'flags' attribute per context and define its first bit > to be a marker requiring all TLBIs for that context to be broadcasted > globally. Once that marker is set on a context, it cannot be removed. > > Such a marker is useful for memory contexts used by devices behind the > NPU and CAPP/PSL. The NPU and the PSL keep their own > translation cache so they need to see all the TLBIs for those > contexts. > > Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> > --- > arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ > arch/powerpc/include/asm/tlb.h | 10 ++++++++-- > arch/powerpc/mm/mmu_context_book3s64.c | 1 + > 3 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h > index 77529a3e3811..7b640ab1cbeb 100644 > --- a/arch/powerpc/include/asm/book3s/64/mmu.h > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h > @@ -78,8 +78,12 @@ struct spinlock; > /* Maximum possible number of NPUs in a system. */ > #define NV_MAX_NPUS 8 > > +/* Bits definition for the context flags */ > +#define MM_CONTEXT_GLOBAL_TLBI 1 /* TLBI must be global */ > + > typedef struct { > mm_context_id_t id; > + unsigned long flags; Should these flags be under #ifdef PPC_BOOK3S_64 as well? Not sure. > u16 user_psize; /* page size index */ > > /* NPU NMMU context */ > @@ -164,5 +168,10 @@ extern void radix_init_pseries(void); > static inline void radix_init_pseries(void) { }; > #endif > > +static inline void mm_context_set_global_tlbi(mm_context_t *ctx) > +{ > + set_bit(MM_CONTEXT_GLOBAL_TLBI, &ctx->flags); > +} > + > #endif /* __ASSEMBLY__ */ > #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */ > diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h > index 609557569f65..bd18ed083011 100644 > --- a/arch/powerpc/include/asm/tlb.h > +++ b/arch/powerpc/include/asm/tlb.h > @@ -71,8 +71,14 @@ static inline int mm_is_core_local(struct mm_struct *mm) > > static inline int mm_is_thread_local(struct mm_struct *mm) > { > - return cpumask_equal(mm_cpumask(mm), > - cpumask_of(smp_processor_id())); > + int rc; > + > + rc = cpumask_equal(mm_cpumask(mm), > + cpumask_of(smp_processor_id())); > +#ifdef CONFIG_PPC_BOOK3S_64 > + rc = rc && !test_bit(MM_CONTEXT_GLOBAL_TLBI, &mm->context.flags); > +#endif Acked-by: Balbir Singh <bsingharora@gmail.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-04 7:25 ` Balbir Singh @ 2017-05-04 9:24 ` Michael Ellerman 0 siblings, 0 replies; 12+ messages in thread From: Michael Ellerman @ 2017-05-04 9:24 UTC (permalink / raw) To: Balbir Singh, Frederic Barrat, alistair, aneesh.kumar, linuxppc-dev Balbir Singh <bsingharora@gmail.com> writes: > On Wed, 2017-05-03 at 16:29 +0200, Frederic Barrat wrote: >> Introduce a new 'flags' attribute per context and define its first bit >> to be a marker requiring all TLBIs for that context to be broadcasted >> globally. Once that marker is set on a context, it cannot be removed. >> >> Such a marker is useful for memory contexts used by devices behind the >> NPU and CAPP/PSL. The NPU and the PSL keep their own >> translation cache so they need to see all the TLBIs for those >> contexts. >> >> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> >> --- >> arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ >> arch/powerpc/include/asm/tlb.h | 10 ++++++++-- >> arch/powerpc/mm/mmu_context_book3s64.c | 1 + >> 3 files changed, 18 insertions(+), 2 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h >> index 77529a3e3811..7b640ab1cbeb 100644 >> --- a/arch/powerpc/include/asm/book3s/64/mmu.h >> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h >> @@ -78,8 +78,12 @@ struct spinlock; >> /* Maximum possible number of NPUs in a system. */ >> #define NV_MAX_NPUS 8 >> >> +/* Bits definition for the context flags */ >> +#define MM_CONTEXT_GLOBAL_TLBI 1 /* TLBI must be global */ >> + >> typedef struct { >> mm_context_id_t id; >> + unsigned long flags; > > Should these flags be under #ifdef PPC_BOOK3S_64 as well? Not sure. They shouldn't need to be, the whole file is Book3s 64 only. cheers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat 2017-05-04 6:41 ` Aneesh Kumar K.V 2017-05-04 7:25 ` Balbir Singh @ 2017-05-04 9:42 ` Michael Ellerman 2017-05-07 11:15 ` Frederic Barrat 2 siblings, 1 reply; 12+ messages in thread From: Michael Ellerman @ 2017-05-04 9:42 UTC (permalink / raw) To: Frederic Barrat, alistair, aneesh.kumar, bsingharora, linuxppc-dev Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes: > Introduce a new 'flags' attribute per context and define its first bit > to be a marker requiring all TLBIs for that context to be broadcasted > globally. Once that marker is set on a context, it cannot be removed. > > Such a marker is useful for memory contexts used by devices behind the > NPU and CAPP/PSL. The NPU and the PSL keep their own > translation cache so they need to see all the TLBIs for those > contexts. > > Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> > --- > arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ > arch/powerpc/include/asm/tlb.h | 10 ++++++++-- > arch/powerpc/mm/mmu_context_book3s64.c | 1 + > 3 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h > index 77529a3e3811..7b640ab1cbeb 100644 > --- a/arch/powerpc/include/asm/book3s/64/mmu.h > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h > @@ -78,8 +78,12 @@ struct spinlock; > /* Maximum possible number of NPUs in a system. */ > #define NV_MAX_NPUS 8 > > +/* Bits definition for the context flags */ > +#define MM_CONTEXT_GLOBAL_TLBI 1 /* TLBI must be global */ I think I'd prefer MM_GLOBAL_TLBIE, it's shorter and tlbie is the name of the instruction so is something people can search for. > @@ -164,5 +168,10 @@ extern void radix_init_pseries(void); > static inline void radix_init_pseries(void) { }; > #endif > > +static inline void mm_context_set_global_tlbi(mm_context_t *ctx) > +{ > + set_bit(MM_CONTEXT_GLOBAL_TLBI, &ctx->flags); > +} set_bit() and test_bit() are non-atomic, and unordered vs other loads and stores. So the caller will need to be careful they have a barrier between this and whatever it is they do that creates mappings that might need to be invalidated. Similarly on the read side we should have a barrier between the store that makes the PTE invalid and the load of the flag. Which makes me think cxl_ctx_in_use() is buggy :/, hmm. But it's late so hopefully I'm wrong :D > diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h > index 609557569f65..bd18ed083011 100644 > --- a/arch/powerpc/include/asm/tlb.h > +++ b/arch/powerpc/include/asm/tlb.h > @@ -71,8 +71,14 @@ static inline int mm_is_core_local(struct mm_struct *mm) > > static inline int mm_is_thread_local(struct mm_struct *mm) > { > - return cpumask_equal(mm_cpumask(mm), > - cpumask_of(smp_processor_id())); > + int rc; > + > + rc = cpumask_equal(mm_cpumask(mm), > + cpumask_of(smp_processor_id())); > +#ifdef CONFIG_PPC_BOOK3S_64 > + rc = rc && !test_bit(MM_CONTEXT_GLOBAL_TLBI, &mm->context.flags); > +#endif The ifdef's a bit ugly, but I guess it's not worth putting it in a static inline. I'd be interested to see the generated code for this, and for the reverse, ie. putting the test_bit() first, and doing an early return if it's true. That way once the bit is set we can just skip the cpumask comparison. cheers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations 2017-05-04 9:42 ` Michael Ellerman @ 2017-05-07 11:15 ` Frederic Barrat 0 siblings, 0 replies; 12+ messages in thread From: Frederic Barrat @ 2017-05-07 11:15 UTC (permalink / raw) To: Michael Ellerman, alistair, aneesh.kumar, bsingharora, linuxppc-dev Le 04/05/2017 à 11:42, Michael Ellerman a écrit : > Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes: > >> Introduce a new 'flags' attribute per context and define its first bit >> to be a marker requiring all TLBIs for that context to be broadcasted >> globally. Once that marker is set on a context, it cannot be removed. >> >> Such a marker is useful for memory contexts used by devices behind the >> NPU and CAPP/PSL. The NPU and the PSL keep their own >> translation cache so they need to see all the TLBIs for those >> contexts. >> >> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> >> --- >> arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ >> arch/powerpc/include/asm/tlb.h | 10 ++++++++-- >> arch/powerpc/mm/mmu_context_book3s64.c | 1 + >> 3 files changed, 18 insertions(+), 2 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h >> index 77529a3e3811..7b640ab1cbeb 100644 >> --- a/arch/powerpc/include/asm/book3s/64/mmu.h >> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h >> @@ -78,8 +78,12 @@ struct spinlock; >> /* Maximum possible number of NPUs in a system. */ >> #define NV_MAX_NPUS 8 >> >> +/* Bits definition for the context flags */ >> +#define MM_CONTEXT_GLOBAL_TLBI 1 /* TLBI must be global */ > > I think I'd prefer MM_GLOBAL_TLBIE, it's shorter and tlbie is the name > of the instruction so is something people can search for. OK >> @@ -164,5 +168,10 @@ extern void radix_init_pseries(void); >> static inline void radix_init_pseries(void) { }; >> #endif >> >> +static inline void mm_context_set_global_tlbi(mm_context_t *ctx) >> +{ >> + set_bit(MM_CONTEXT_GLOBAL_TLBI, &ctx->flags); >> +} > > set_bit() and test_bit() are non-atomic, and unordered vs other loads > and stores. > > So the caller will need to be careful they have a barrier between this > and whatever it is they do that creates mappings that might need to be > invalidated. Agreed, I missed the barrier. So I need to set the flag, have a write memory barrier. Then, in the case of cxl, we can attach to the accelerator. > Similarly on the read side we should have a barrier between the store > that makes the PTE invalid and the load of the flag. That one is more subtle, at least to me, but I think I now see what you mean. With no extra read barrier, we would be exposed to have the following order: CPU1 CPU2 device read flag=>local set global flag write barrier attach read PTE update PTE tlbiel not seen, hence broken Is that what you meant? That would mean an extra read barrier for each tlbie. > > Which makes me think cxl_ctx_in_use() is buggy :/, hmm. But it's late so > hopefully I'm wrong :D Unfortunately, I think you're right. And we're missing the same 2 barriers: a write barrier when cxl increments atomically the use count before attaching, and a read barrier like above. Fred > >> diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h >> index 609557569f65..bd18ed083011 100644 >> --- a/arch/powerpc/include/asm/tlb.h >> +++ b/arch/powerpc/include/asm/tlb.h >> @@ -71,8 +71,14 @@ static inline int mm_is_core_local(struct mm_struct *mm) >> >> static inline int mm_is_thread_local(struct mm_struct *mm) >> { >> - return cpumask_equal(mm_cpumask(mm), >> - cpumask_of(smp_processor_id())); >> + int rc; >> + >> + rc = cpumask_equal(mm_cpumask(mm), >> + cpumask_of(smp_processor_id())); >> +#ifdef CONFIG_PPC_BOOK3S_64 >> + rc = rc && !test_bit(MM_CONTEXT_GLOBAL_TLBI, &mm->context.flags); >> +#endif > > The ifdef's a bit ugly, but I guess it's not worth putting it in a > static inline. > > I'd be interested to see the generated code for this, and for the > reverse, ie. putting the test_bit() first, and doing an early return if > it's true. That way once the bit is set we can just skip the cpumask > comparison. > > cheers > ^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC 2/2] cxl: Mark context requiring global TLBIs 2017-05-03 14:29 [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs Frederic Barrat 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat @ 2017-05-03 14:29 ` Frederic Barrat 2017-05-04 7:39 ` Balbir Singh 2017-05-05 5:28 ` [RFC 0/2] powerpc/mm: Mark memory contexts " Alistair Popple 2 siblings, 1 reply; 12+ messages in thread From: Frederic Barrat @ 2017-05-03 14:29 UTC (permalink / raw) To: alistair, aneesh.kumar, bsingharora, linuxppc-dev, mpe The PSL needs to see all TLBI pertinent to the memory contexts used on the cxl adapter. For the hash memory model, it was done by making all TLBIs global as soon as the cxl driver is in us. For radix, we need something similar, but we can refine and only convert to global the invalidations for contexts actually used by the device. So mark the contexts being attached to the cxl adapter as requiring global TLBIs. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> --- drivers/misc/cxl/api.c | 5 ++++- drivers/misc/cxl/file.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index 1a138c83f877..86b2ad86fdee 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -347,7 +347,10 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed, cxl_context_mm_count_put(ctx); goto out; } - +#ifdef CONFIG_PPC_BOOK3S_64 + if (ctx->mm) + mm_context_set_global_tlbi(&ctx->mm->context); +#endif ctx->status = STARTED; out: mutex_unlock(&ctx->status_mutex); diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c index 17b433f1ce23..c7ead488eee2 100644 --- a/drivers/misc/cxl/file.c +++ b/drivers/misc/cxl/file.c @@ -239,7 +239,10 @@ static long afu_ioctl_start_work(struct cxl_context *ctx, cxl_context_mm_count_put(ctx); goto out; } - +#ifdef CONFIG_PPC_BOOK3S_64 + if (ctx->mm) + mm_context_set_global_tlbi(&ctx->mm->context); +#endif ctx->status = STARTED; rc = 0; out: -- 2.11.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC 2/2] cxl: Mark context requiring global TLBIs 2017-05-03 14:29 ` [RFC 2/2] cxl: Mark context requiring global TLBIs Frederic Barrat @ 2017-05-04 7:39 ` Balbir Singh 2017-05-07 10:41 ` Frederic Barrat 0 siblings, 1 reply; 12+ messages in thread From: Balbir Singh @ 2017-05-04 7:39 UTC (permalink / raw) To: Frederic Barrat, alistair, aneesh.kumar, linuxppc-dev, mpe On Wed, 2017-05-03 at 16:29 +0200, Frederic Barrat wrote: > The PSL needs to see all TLBI pertinent to the memory contexts used on > the cxl adapter. For the hash memory model, it was done by making all > TLBIs global as soon as the cxl driver is in us. For radix, we need > something similar, but we can refine and only convert to global the > invalidations for contexts actually used by the device. > > So mark the contexts being attached to the cxl adapter as requiring > global TLBIs. > <snip> > +#ifdef CONFIG_PPC_BOOK3S_64 > + if (ctx->mm) > + mm_context_set_global_tlbi(&ctx->mm->context); Just curious and wondering Could we do mm_context_set_global_tlbi() before ->attach_process() that way we won't need atomic tests (set_bit() and test_bit())? May be a memory barrier would suffice. Not 100% sure, hence checking Balbir Singh. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 2/2] cxl: Mark context requiring global TLBIs 2017-05-04 7:39 ` Balbir Singh @ 2017-05-07 10:41 ` Frederic Barrat 0 siblings, 0 replies; 12+ messages in thread From: Frederic Barrat @ 2017-05-07 10:41 UTC (permalink / raw) To: Balbir Singh, alistair, aneesh.kumar, linuxppc-dev, mpe Le 04/05/2017 à 09:39, Balbir Singh a écrit : > On Wed, 2017-05-03 at 16:29 +0200, Frederic Barrat wrote: >> The PSL needs to see all TLBI pertinent to the memory contexts used on >> the cxl adapter. For the hash memory model, it was done by making all >> TLBIs global as soon as the cxl driver is in us. For radix, we need >> something similar, but we can refine and only convert to global the >> invalidations for contexts actually used by the device. >> >> So mark the contexts being attached to the cxl adapter as requiring >> global TLBIs. >> > <snip> >> +#ifdef CONFIG_PPC_BOOK3S_64 >> + if (ctx->mm) >> + mm_context_set_global_tlbi(&ctx->mm->context); > > Just curious and wondering > > Could we do mm_context_set_global_tlbi() before ->attach_process() that > way we won't need atomic tests (set_bit() and test_bit())? May be a memory > barrier would suffice. Not 100% sure, hence checking You're right, I need to move mm_context_set_global_tlbi() before the attach and have a write memory barrier. If the attach fails, then the context will still be marked for global TLBIs (some other driver may also set the bit for a different reason). But I would expect the life expectancy of a process designed to use an accelerator and failing to attach to be pretty short. I still think we need the atomic set_bit() though, but that's really for the future and if somebody introduces new bits in the context 'flags'. Fred > Balbir Singh. > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs 2017-05-03 14:29 [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs Frederic Barrat 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat 2017-05-03 14:29 ` [RFC 2/2] cxl: Mark context requiring global TLBIs Frederic Barrat @ 2017-05-05 5:28 ` Alistair Popple 2 siblings, 0 replies; 12+ messages in thread From: Alistair Popple @ 2017-05-05 5:28 UTC (permalink / raw) To: Frederic Barrat; +Cc: aneesh.kumar, bsingharora, linuxppc-dev, mpe Hi Frederic, On Wed, 3 May 2017 04:29:04 PM Frederic Barrat wrote: > capi2 and opencapi require the TLB invalidations being sent for > addresses used on the cxl adapter or opencapi device to be global, as > there's a translation cache in the PSL (for capi2) or NPU (for > opencapi). The CAPP (for PSL) and NPU snoop the power bus. > > This is not new: for the hash memory model, as soon as the cxl driver > is active, all local TLBIs become global. We need a similar mechanism > for the radix memory model. This patch tries to improve things a bit > by flagging the contexts requiring global TLBIs, therefore limiting > the "upgrade" and not affecting contexts not used by the card. > > Alistair: for nvlink2, it is my understanding that all the required > invalidations are already in place through software mmio/ATSD, i.e. this > patch is not useful for you. Not quite true. I would like to drop the global TLBI from the MMU notifier so will need this to invalidate entries the NMMU cache. - Alistair > Submitting as an RFC, since I don't get to touch mmu.h everyday and > would like to probe people's reaction. > > > > Frederic Barrat (2): > powerpc/mm: Add marker for contexts requiring global TLB invalidations > cxl: Mark context requiring global TLBIs > > arch/powerpc/include/asm/book3s/64/mmu.h | 9 +++++++++ > arch/powerpc/include/asm/tlb.h | 10 ++++++++-- > arch/powerpc/mm/mmu_context_book3s64.c | 1 + > drivers/misc/cxl/api.c | 5 ++++- > drivers/misc/cxl/file.c | 5 ++++- > 5 files changed, 26 insertions(+), 4 deletions(-) > > ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-05-07 11:16 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-05-03 14:29 [RFC 0/2] powerpc/mm: Mark memory contexts requiring global TLBIs Frederic Barrat 2017-05-03 14:29 ` [RFC 1/2] powerpc/mm: Add marker for contexts requiring global TLB invalidations Frederic Barrat 2017-05-04 6:41 ` Aneesh Kumar K.V 2017-05-04 17:24 ` Frederic Barrat 2017-05-04 7:25 ` Balbir Singh 2017-05-04 9:24 ` Michael Ellerman 2017-05-04 9:42 ` Michael Ellerman 2017-05-07 11:15 ` Frederic Barrat 2017-05-03 14:29 ` [RFC 2/2] cxl: Mark context requiring global TLBIs Frederic Barrat 2017-05-04 7:39 ` Balbir Singh 2017-05-07 10:41 ` Frederic Barrat 2017-05-05 5:28 ` [RFC 0/2] powerpc/mm: Mark memory contexts " Alistair Popple
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).