linux-arm-kernel.lists.infradead.org archive mirror
* Re: [PATCH 5/5] cxl/region: Manage CPU caches relative to DPA invalidation events
       [not found]   ` <20221205192054.mwhzyjrfwfn3tma5@offworld>
@ 2022-12-05 20:10     ` Dan Williams
  2022-12-06  9:47       ` Jonathan Cameron
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Williams @ 2022-12-05 20:10 UTC (permalink / raw)
  To: Davidlohr Bueso, Dan Williams
  Cc: linux-cxl, Jonathan.Cameron, dave.jiang, nvdimm, linux-arm-kernel

[ add linux-arm-kernel@lists.infradead.org ]

Background for ARM folks, CXL can dynamically reconfigure the target
devices that back a given physical memory region. When that happens the
CPU cache can still hold data from the previous configuration. The
mitigation for that scenario on x86 is wbinvd; ARM does not have an
equivalent. As a result, dynamic region creation is disabled on ARM. In
the near term, most CXL is configured pre-boot, but going forward this
restriction is untenable.

Davidlohr Bueso wrote:
> On Thu, 01 Dec 2022, Dan Williams wrote:
> 
> >A "DPA invalidation event" is any scenario where the contents of a DPA
> >(Device Physical Address) are modified in a way that is incoherent with
> >CPU caches, or if the HPA (Host Physical Address) to DPA association
> >changes due to a remapping event.
> >
> >PMEM security events like Unlock and Passphrase Secure Erase already
> >manage caches through LIBNVDIMM,
> 
> Just to be clear, is this why you get rid of the explicit flushing
> for the respective commands in security.c?

Correct, because those commands can only be executed through libnvdimm.

> 
> >so that leaves HPA to DPA remap events
> >that need cache management by the CXL core. Those only happen when the
> >boot time CXL configuration has changed. That event occurs when
> >userspace attaches an endpoint decoder to a region configuration, and
> >that region is subsequently activated.
> >
> >The implication of not invalidating caches between remap events is that
> >reads from the region at different points in time may return different
> >results due to stale cached data from the previous HPA to DPA mapping.
> >Without a guarantee that the region contents after cxl_region_probe()
> >are written before being read (a layering-violation assumption that
> >cxl_region_probe() can not make) the CXL subsystem needs to ensure that
> >reads that precede writes see consistent results.
> 
> Hmm, where does this leave us remapping under arm64, which doesn't have
> ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION?
> 
> Back when we were discussing this it was all related to the security stuff,
> which under ARM could just be easily discarded as an unavailable feature.

I can throw out a few strawman options, but really need help from ARM
folks to decide where to go next.

1/ Map and loop cache flushing line by line. It works, but for Terabytes
   of CXL the cost is 10s of seconds of latency to reconfigure a region.
   That said, region configuration, outside of test scenarios, is typically
   a "once per bare metal provisioning" event.

2/ Set a configuration dependency that mandates that all CXL memory be
   routed through the page allocator where it is guaranteed that the memory
   will be written (zeroed) before use. This restricts some planned use
   cases for the "Dynamic Capacity Device" capability.

3/ Work with the CXL consortium to extend the back-invalidate concept
   for general purpose usage to make devices capable of invalidating caches
   for a new memory region they joined, and mandate it for ARM. This one
   has a long lead time and a gap for every device in flight currently.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH 5/5] cxl/region: Manage CPU caches relative to DPA invalidation events
  2022-12-05 20:10     ` [PATCH 5/5] cxl/region: Manage CPU caches relative to DPA invalidation events Dan Williams
@ 2022-12-06  9:47       ` Jonathan Cameron
  2022-12-06 15:17         ` James Morse
  0 siblings, 1 reply; 3+ messages in thread
From: Jonathan Cameron @ 2022-12-06  9:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, linux-cxl, dave.jiang, nvdimm, linux-arm-kernel,
	James Morse, Will Deacon, catalin.marinas, Anshuman Khandual,
	anthony.jebson, ardb

On Mon, 5 Dec 2022 12:10:22 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> [ add linux-arm-kernel@lists.infradead.org ]
> 
> Background for ARM folks, CXL can dynamically reconfigure the target
> devices that back a given physical memory region. When that happens the
> CPU cache can still hold data from the previous configuration. The
> mitigation for that scenario on x86 is wbinvd; ARM does not have an
> equivalent. As a result, dynamic region creation is disabled on ARM. In
> the near term, most CXL is configured pre-boot, but going forward this
> restriction is untenable.
> 
> Davidlohr Bueso wrote:
> > On Thu, 01 Dec 2022, Dan Williams wrote:
> >   
> > >A "DPA invalidation event" is any scenario where the contents of a DPA
> > >(Device Physical Address) are modified in a way that is incoherent with
> > >CPU caches, or if the HPA (Host Physical Address) to DPA association
> > >changes due to a remapping event.
> > >
> > >PMEM security events like Unlock and Passphrase Secure Erase already
> > >manage caches through LIBNVDIMM,  
> > 
> > Just to be clear, is this why you get rid of the explicit flushing
> > for the respective commands in security.c?  
> 
> Correct, because those commands can only be executed through libnvdimm.
> 
> >   
> > >so that leaves HPA to DPA remap events
> > >that need cache management by the CXL core. Those only happen when the
> > >boot time CXL configuration has changed. That event occurs when
> > >userspace attaches an endpoint decoder to a region configuration, and
> > >that region is subsequently activated.
> > >
> > >The implication of not invalidating caches between remap events is that
> > >reads from the region at different points in time may return different
> > >results due to stale cached data from the previous HPA to DPA mapping.
> > >Without a guarantee that the region contents after cxl_region_probe()
> > >are written before being read (a layering-violation assumption that
> > >cxl_region_probe() can not make) the CXL subsystem needs to ensure that
> > >reads that precede writes see consistent results.  
> > 
> > Hmm, where does this leave us remapping under arm64, which doesn't have
> > ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION?
> > 
> > Back when we were discussing this it was all related to the security stuff,
> > which under ARM could just be easily discarded as an unavailable feature.
> 
> I can throw out a few strawman options, but really need help from ARM
> folks to decide where to go next.

+Cc a bunch of relevant people. There are discussions underway, but I'm not sure
anyone will want to give more details here yet.

> 
> 1/ Map and loop cache flushing line by line. It works, but for Terabytes
>    of CXL the cost is 10s of seconds of latency to reconfigure a region.
>    That said, region configuration, outside of test scenarios, is typically
>    a "once per bare metal provisioning" event.
> 
> 2/ Set a configuration dependency that mandates that all CXL memory be
>    routed through the page allocator where it is guaranteed that the memory
>    will be written (zeroed) before use. This restricts some planned use
>    cases for the "Dynamic Capacity Device" capability.

This is the only case that's really a problem (to my mind). I hope we will have
a more general solution before there is much hardware out there, particularly
where sharing is involved. 

> 
> 3/ Work with the CXL consortium to extend the back-invalidate concept
>    for general purpose usage to make devices capable of invalidating caches
>    for a new memory region they joined, and mandate it for ARM. This one
>    has a long lead time and a gap for every device in flight currently.

There are significant disadvantages in doing this that I suspect will mean
this never happens for some classes of device, or is turned off for performance
reasons. For anyone curious, go look at the protocol requirements of back
invalidate in the CXL 3.0 spec.

Jonathan


* Re: [PATCH 5/5] cxl/region: Manage CPU caches relative to DPA invalidation events
  2022-12-06  9:47       ` Jonathan Cameron
@ 2022-12-06 15:17         ` James Morse
  0 siblings, 0 replies; 3+ messages in thread
From: James Morse @ 2022-12-06 15:17 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams
  Cc: Davidlohr Bueso, linux-cxl, dave.jiang, nvdimm, linux-arm-kernel,
	Will Deacon, catalin.marinas, Anshuman Khandual, anthony.jebson,
	ardb

Hi guys,

On 06/12/2022 09:47, Jonathan Cameron wrote:
> On Mon, 5 Dec 2022 12:10:22 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
>> [ add linux-arm-kernel@lists.infradead.org ]
>>
>> Background for ARM folks, CXL can dynamically reconfigure the target
>> devices that back a given physical memory region. When that happens the
>> CPU cache can still hold data from the previous configuration. The
>> mitigation for that scenario on x86 is wbinvd; ARM does not have an
>> equivalent. As a result, dynamic region creation is disabled on ARM. In
>> the near term, most CXL is configured pre-boot, but going forward this
>> restriction is untenable.
>>
>> Davidlohr Bueso wrote:
>>> On Thu, 01 Dec 2022, Dan Williams wrote:
>>>   
>>>> A "DPA invalidation event" is any scenario where the contents of a DPA
>>>> (Device Physical Address) are modified in a way that is incoherent with
>>>> CPU caches, or if the HPA (Host Physical Address) to DPA association
>>>> changes due to a remapping event.
>>>>
>>>> PMEM security events like Unlock and Passphrase Secure Erase already
>>>> manage caches through LIBNVDIMM,  
>>>
>>> Just to be clear, is this why you get rid of the explicit flushing
>>> for the respective commands in security.c?  
>>
>> Correct, because those commands can only be executed through libnvdimm.
>>
>>>   
>>>> so that leaves HPA to DPA remap events
>>>> that need cache management by the CXL core. Those only happen when the
>>>> boot time CXL configuration has changed. That event occurs when
>>>> userspace attaches an endpoint decoder to a region configuration, and
>>>> that region is subsequently activated.
>>>>
>>>> The implication of not invalidating caches between remap events is that
>>>> reads from the region at different points in time may return different
>>>> results due to stale cached data from the previous HPA to DPA mapping.
>>>> Without a guarantee that the region contents after cxl_region_probe()
>>>> are written before being read (a layering-violation assumption that
>>>> cxl_region_probe() can not make) the CXL subsystem needs to ensure that
>>>> reads that precede writes see consistent results.  
>>>
>>> Hmm, where does this leave us remapping under arm64, which doesn't have
>>> ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION?

For those reading along at home, ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION is wbinvd.
https://lore.kernel.org/linux-cxl/20220919110605.3696-1-dave@stgolabs.net/

We don't have an instruction for arm64 that 'invalidates all caches'.


>>> Back when we were discussing this it was all related to the security stuff,
>>> which under ARM could just be easily discarded as an unavailable feature.
>>
>> I can throw out a few strawman options, but really need help from ARM
>> folks to decide where to go next.

> +Cc a bunch of relevant people. There are discussions underway, but I'm not sure
> anyone will want to give more details here yet.

The best we can do today is to use the by-VA invalidate operations in the kernel.
This isn't guaranteed to invalidate 'invisible' system caches, which means it's not enough
for a one-size-fits-all kernel interface.
For the NVDIMM secure-erase users of this thing, if there were a system-cache between the
CPUs and the NVDIMM, there would be nothing the kernel could do to invalidate it.

If it's CXL-specific this would be okay for testing in QEMU, but performance would scale
with the size of the region, which would hurt in real-world cases.

The plan is to add a firmware call so firmware can do things that don't scale with the
size of the mapping, and do something platform-specific to the 'invisible' system cache,
if there is one.


Ideally we wait for the PSCI spec update that describes the firmware call, and make
support dependent on that. It looks like the timeline will be March-ish, but there should
be an alpha of the spec available much sooner.


>> 1/ Map and loop cache flushing line by line. It works, but for Terabytes
>>    of CXL the cost is 10s of seconds of latency to reconfigure a region.
>>    That said, region configuration, outside of test scenarios, is typically
>>    a "once per bare metal provisioning" event.

It works for CXL because you'd never have a system-cache in front of the CXL window.
Those things don't necessarily receive cache-maintenance because they are supposed to be
invisible.

D7.4.11 of DDI0487I.a "System level caches" has this horror:
| System caches which lie beyond the point of coherency and so are invisible to the
| software. The management of such caches is outside the scope of the architecture.

(The PoP stuff reaches beyond the PoC, but there isn't a DC CIVAP instruction)

Detecting which regions we can't do this for is problematic.


>> 2/ Set a configuration dependency that mandates that all CXL memory be
>>    routed through the page allocator where it is guaranteed that the memory
>>    will be written (zeroed) before use. This restricts some planned use
>>    cases for the "Dynamic Capacity Device" capability.

> This is the only case that's really a problem (to my mind). I hope we will have
> a more general solution before there is much hardware out there, particularly
> where sharing is involved. 


Thanks,

James


>> 3/ Work with the CXL consortium to extend the back-invalidate concept
>>    for general purpose usage to make devices capable of invalidating caches
>>    for a new memory region they joined, and mandate it for ARM. This one
>>    has a long lead time and a gap for every device in flight currently.
> 
> There are significant disadvantages in doing this that I suspect will mean
> this never happens for some classes of device, or is turned off for performance
> reasons. For anyone curious, go look at the protocol requirements of back
> invalidate in the CXL 3.0 spec.
> 
> Jonathan


