qemu-devel.nongnu.org archive mirror
* CXL emulation in QEMU contribution
       [not found] <20221011095228.00001546@huawei.com>
@ 2022-10-12 22:43 ` Viacheslav A.Dubeyko
  2022-10-13 15:09   ` Jonathan Cameron via
  0 siblings, 1 reply; 4+ messages in thread
From: Viacheslav A.Dubeyko @ 2022-10-12 22:43 UTC (permalink / raw)
  To: Jonathan.Cameron; +Cc: Adam Manzanares, Cong Wang, qemu-devel, linux-cxl

Hi Jonathan,

As we agreed, I am moving our discussion to the public mailing list.

So, I would like to contribute to the QEMU emulation of CXL memory support, and I would like to see a TODO list. I hope this list will be useful not only to me. As far as I can see, we can summarize:

1) Moving towards emulation of everything we need for Dynamic Capacity.
  a) Switch CCI - have a PoC but not yet doing tunneling to Type 3 EPs.
  b) Userspace tool to fake enough FM role that we can drive dynamics 
  c) Also need to do CXL 2.0 style HP of LDs on MLD devices (some demand
  for this to drive virtualization migration use cases)
  d) DCD implementation etc on the type 3 device.
2) Lots of smaller features from CXL 3.0 such as setting up BI.
3) Enough to test P2P UIO flows - probably need to invent an accelerator
  with appropriate support to test that - DMA engine or similar.
4) Bunch of small features:
  a) Multiple HDM decoders.
  b) Poisoning. Right now we have a prototype, but it's not wired up to actually report poison on reads.
  c) CXL non-function map DVSEC. Given QEMU lets you add any function to a given device by just setting the bus to be the same as another, this is a bit fiddly because we need to update it late in the QEMU bring-up, or possibly do it at read time (that may well be easier).
  d) Most useful of all, but most boring perhaps is review of what's already waiting for upstreaming.

Please correct me if I have missed something. I believe we need a TODO list to collaborate efficiently. Any ideas on what else could be added to the TODO list?

Thanks,
Slava. 

> Begin forwarded message:
> 
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Subject: Re: [External] CXL emulation in QEMU
> Date: October 11, 2022 at 1:52:28 AM PDT
> To: Viacheslav A.Dubeyko <viacheslav.dubeyko@bytedance.com>
> Cc: Adam Manzanares <a.manzanares@samsung.com>, Cong Wang <cong.wang@bytedance.com>
> 
> On Tue, 11 Oct 2022 09:45:50 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
>> On Mon, 10 Oct 2022 10:11:43 -0700
>> "Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com> wrote:
>> 
>>> Hi Jonathan,
>>> 
>>> It looks like my email was confusing or, maybe, you simply missed it.  
>> Hi Slava,
>> 
>> 
>> Apologies, I thought from the comment you made about being fine to take it to a
>> public list that you'd send a starting email to linux-cxl or qemu-devel
>> and we'd take the discussion on there.  No problem with carrying on here
>> as nothing technical so we are fine...
>> 
>> 
>>> 
>>> My point is that I am ready to start with any feature at first. Then
>>> I will elaborate my vision of what is most interesting for me. Which
>>> feature could I start to explore/implement?
>>> 
>> 
>> One thing I'm keen to get done, but haven't gotten to yet is doing a full
>> audit of spec vs what we have implemented and drawing up a todo list.
>> I can have a go at this perhaps later today.  Let's use the wiki
>> on my gitlab instance to build the list before sending it out for
>> wider review. 
>> 
>> https://gitlab.com/jic23/qemu/-/wikis/TODO-list
>> Send me an ID and I'll add you as a developer on the repo (which is
>> all you need to edit I think?)
>> 
>> I think there are a bunch of small features that we should wire up
>> that we haven't gotten to yet.
>> 
>> Examples of this include: 
>> * Multiple HDM decoders.
>> 
>> * Poisoning.  Right now we have prototype, but it's not wired up to
>> actually report poison on reads.
>> 
>> * CXL non-function map DVSEC
>>  Given QEMU lets you add any function to a given device by just setting
>>  the bus to be the same as another, this is a bit fiddly because we need
>>  to update it late in the QEMU bring up, or possibly easier to do it
>>  at read time (that may well be easier).
>> 
>> * Compliance DOE + maybe DVSEC for test capability if anyone cares about that.
>> 
>> Most useful of all, but most boring perhaps is review of what's already waiting
>> for upstreaming.  I cross post everything to linux-cxl@vger.kernel.org as
>> well as qemu-devel.  + there is a bunch of stuff on my gitlab tree above.
>> cxl-2022-10-08 branch though that has some cleanup needed.
>> 
>> I'm focusing short term on upstreaming what we already have + some
>> enablement to get a discussion going about how to handle open source fabric
>> manager. Primarily switch CCI as introduced in CXL 3.0.
>> 
>> 
>>> Thanks,
>>> Slava.
>>> 
>>>> On Oct 3, 2022, at 11:12 AM, Viacheslav A.Dubeyko <viacheslav.dubeyko@bytedance.com> wrote:
>>>> 
>>>> Hi Jonathan,
>>>> 
>> 
>>>> I don’t see any trouble with moving the discussion to the public mailing
>>>> list. I simply didn’t consider all the complications that you
>>>> shared.   
>> 
>>>>> If we want to do it on the phone, then I'm
>>>>> sure we can borrow a bit of the regular CXL Linux sync call that Dan Williams
>>>>> at Intel organizes, or we can organize something similar for QEMU side of
>>>>> things.    
>>>> 
>>>> It could be a good idea.
>>>> 
>>>>> Definitely interested to hear what sorts of features you are interested
>>>>> in + working together on getting more of CXL emulation in place.    
>>>> 
>> 
>>>> I spent some time thinking about what could be interesting to
>>>> implement. I realized that I don't have a complete picture of which
>>>> features have been implemented already and which are in progress.
>>>> So, it’s hard to say what could be interesting to
>>>> implement. Potentially, I could consider some features that
>>>> could be used to emulate a graph database use case or file system
>>>> related functionality. I assume I could start with any feature at
>>>> first and, as a result, elaborate my vision of what is really
>>>> interesting for me.
>>>> 
>> 
>> For graph DB, is your main interest memory only or are we talking accelerators?
>> 
>> Accelerators are tricky as they tend to be highly custom and hence each one needs
>> its own emulation + stuff like p2p emulation will make it more complex.
>> 
>> If type 3 / memory only devices then most of what you need should already work
>> subject to a couple of bugs on the kernel side of things that we know about and
>> the delights of error handling which is not yet in place.  There is work going
>> on for the event logs etc from one of the Intel team, but I don't think anything
>> has been posted yet.
> 
> I should have read the rest of my email. Ira Weiny posted the event log stuff
> last night.
> 
> https://lore.kernel.org/linux-cxl/Y0Sgiq+WMwOmqToe@iweiny-desk3/T/#t
> 
>> 
>> Also I've had a report that x86 support isn't currently working that
>> looks related to memory region reservations. Need to look into that as I mainly
>> test on arm64.
>> 
>> Jonathan
>> 
>> 
>>>> Thanks,
>>>> Slava.
>>>> 
>>>>> On Sep 30, 2022, at 5:31 AM, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>>>>> 
>>>>> Hi Slava,
>>>>> 
>>>>> Thanks for the intro Adam.
>>>>> 
>>>>> The one element that makes private technical discussion tricky is that
>>>>> my company is on the US entity list. I've no idea what Bytedance's policy
>>>>> on this is - and those policies tend not to be company wide so you'd
>>>>> probably need to pester relevant lawyers for your part of Bytedance.
>>>>> 
>>>>> Now if we keep it purely non technical (and roadmaps etc aren't technical)
>>>>> then it's all fine, or if we rely on one of the exemptions.
>>>>> 1) Work actually under the standards body is exempt - this doesn't include
>>>>> implementations though so only works if we are discussing spec problems.
>>>>> 2) Work 'to be published' is exempt.  This path is a bit tricky to use and
>>>>> makes lawyers nervous as how to prove it if the company is audited.
>>>>> 3) Published work - this covers anything on public mailing lists or on
>>>>> calls that are open to anyone / conference sessions etc.
>>>>> 
>>>>> Note, I can send anyone anything, others may just need to be careful replying if it
>>>>> gets near anything that might be called technology!
>>>>> 
>>>>> So for discussions involving me it's easiest to either keep them non technical
>>>>> or put them on the public mailing list.
>>>>> 
>>>>> So far I've not put out a public 'todo' list simply because the group
>>>>> working on QEMU CXL emulation was small enough we just emailed along the lines
>>>>> of 'shall I do this bit'.  Going forwards, seems we are growing enough we
>>>>> should have better tracking.   If we want to do it on the phone, then I'm
>>>>> sure we can borrow a bit of the regular CXL Linux sync call that Dan Williams
>>>>> at Intel organizes, or we can organize something similar for QEMU side of
>>>>> things.
>>>>> 
>>>>> Closest thing to a status report was the plumbers talk a few weeks ago.
>>>>> https://lpc.events/event/16/contributions/1248/
>>>>> 
>>>>> My focus is on Type 3 + all the fabric side of things (switches / RPs etc)
>>>>> and I care about ARM support (given I work for HiSilicon bit of Huawei, not
>>>>> supporting the architecture we build would be a bad thing)
>>>>> Majority of anything else will be heavily custom anyway, so emulation would
>>>>> need to be driven by who ever makes the type1/2 devices.
>>>>> 
>>>>> Short term, I want to clear some of the backlog of upstreaming.
>>>>> We got a lot into QEMU 7.1 but that had taken a few cycles, so various
>>>>> other prototype code exists.
>>>>> 
>>>>> 1) DOE + CDAT - should get that up for review next week. This is a rework
>>>>> of what the Avery design folk posted last year.
>>>>> 2) ARM support (bit longer as need to write DT support and deal with kernel
>>>>> driver for that *sigh*)
>>>>> 3) CXL PMU emulation (bit of work to do on kernel driver for that)
>>>>> 
>>>>> I also need to push a tree out with all the pending work on it.
>>>>> There are overlaps between different patch sets that need resolving.
>>>>> 
>>>>> Otherwise:
>>>>> 
>>>>> 1) Ira Weiny at Intel is working on Event support (see linux-cxl@vger.kernel.org postings)
>>>>> 2) I believe we'll have some volatile support from Samsung shortly.
>>>>> 
>>>>> 
>>>>> Longer term list from me.
>>>>> 1) Moving towards emulation of everything we need for Dynamic Capacity.
>>>>> a) Switch CCI - have a PoC but not yet doing tunneling to Type 3 EPs.
>>>>> b) Userspace tool to fake enough FM role that we can drive dynamics 
>>>>> c) Also need to do CXL 2.0 style HP of LDs on MLD devices (some demand
>>>>> for this to drive virtualization migration use cases)
>>>>> d) DCD implementation etc on the type 3 device.
>>>>> 2) Lots of smaller features from CXL 3.0 such as setting up BI.
>>>>> 3) Enough to test P2P UIO flows - probably need to invent an accelerator
>>>>> with appropriate support to test that - DMA engine or similar.
>>>>> 
>>>>> Obviously DCD stuff needs a load of kernel work as well.
>>>>> 
>>>>> I've probably forgotten a bunch of things....
>>>>> 
>>>>> Definitely interested to hear what sorts of features you are interested
>>>>> in + working together on getting more of CXL emulation in place.
>>>>> A bigger active group on this will aid with review as well and hopefully
>>>>> lead to faster pick up by Michael Tsirkin who has been applying the patches
>>>>> so far.
>>>>> 
>>>>> My end goal is to catch up with the Spec as I also used the QEMU emulation
>>>>> to prove out CXL 3.0 features when they were at draft stage (particularly
>>>>> the PMU) and it was very useful for closing various corners that would have
>>>>> been a lot harder to fix later.  That stuff is tricky because we can't
>>>>> share code until the spec is released but we can ensure those involved in
>>>>> the standards side know what others have code for.
>>>>> 
>>>>> Looking forward to hearing from you!
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Jonathan
>>>>> 
>>>>> 
>>>>> On Thu, 29 Sep 2022 14:25:17 -0700
>>>>> "Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com> wrote:
>>>>> 
>>>>>> Hi Adam,
>>>>>> 
>>>>>> Yes, we are talking about open-source activity. I am simply trying to understand what direction(s), TODOs, and task(s) we have right now. If we can summarize them somehow, then this discussion could potentially be useful for the mailing list too. Otherwise, I am not sure that the current discussion is useful for the mailing list.
>>>>>> 
>>>>>> Thanks,
>>>>>> Slava.
>>>>>> 
>>>>>>> On Sep 29, 2022, at 2:08 PM, Adam Manzanares <a.manzanares@samsung.com> wrote:
>>>>>>> 
>>>>>>> Hello Slava,
>>>>>>> 
>>>>>>> Added Jonathan to cc so you can connect. One ask, if this CXL QEMU work is planned to 
>>>>>>> be open source let's move discussion to the mailing list and cc me if you don't mind. 
>>>>>>> 
>>>>>>> Take care,
>>>>>>> Adam
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Viacheslav A.Dubeyko [mailto:viacheslav.dubeyko@bytedance.com] 
>>>>>>> Sent: Thursday, September 29, 2022 12:06 PM
>>>>>>> To: Adam Manzanares <a.manzanares@samsung.com>
>>>>>>> Cc: Cong Wang <cong.wang@bytedance.com>
>>>>>>> Subject: CXL emulation in QEMU
>>>>>>> 
>>>>>>> Hi Adam,
>>>>>>> 
>>>>>>> We are interested in participating in CXL emulation in QEMU. What is a good way to contact Jonathan Cameron? Is he the main person to contact? I believe you should know his email or some other way to communicate. Could you help us?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Slava.



* Re: CXL emulation in QEMU contribution
  2022-10-12 22:43 ` CXL emulation in QEMU contribution Viacheslav A.Dubeyko
@ 2022-10-13 15:09   ` Jonathan Cameron via
  2022-10-18 21:26     ` [External] " Viacheslav A.Dubeyko
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron via @ 2022-10-13 15:09 UTC (permalink / raw)
  To: Viacheslav A.Dubeyko; +Cc: Adam Manzanares, Cong Wang, qemu-devel, linux-cxl

On Wed, 12 Oct 2022 15:43:35 -0700
"Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com> wrote:

> Hi Jonathan,

Hi Slava,

Thanks for sending this.
> 
> As we agreed, I am moving our discussion to the public mailing list.
> 

> So, I would like to contribute to QEMU emulation of CXL memory
> support. And I would like to see a TODO list. I hope this list could
> be useful not only for me. As far as I can see, we can summarize:

Absolutely agree on the need for a TODO now that there are multiple groups involved.
https://gitlab.com/jic23/qemu/-/wikis/TODO-list
is my starting point for this, on the basis that a wiki is a cheap and cheerful way
to track it.

> 

> 1) Moving towards emulation of everything we need for Dynamic Capacity.
>   a) Switch CCI - have a PoC but not yet doing tunneling to Type 3 EPs.

See below. Initial support pushed out to gitlab. Doesn't do much yet beyond
walking some basic info for a static switch.

>   b) Userspace tool to fake enough FM role that we can drive dynamics 

Currently I'm using the IOCTL path used by the cxl tool in ndctl.
I would definitely not describe my test program as 'good userspace code'
though :)

>   c) Also need to do CXL 2.0 style HP of LDs on MLD devices (some demand
>   for this to drive virtualization migration use cases)
>   d) DCD implementation etc on the type 3 device.

May want to do this on multiport devices first. 

> 2) Lots of smaller features from CXL 3.0 such as setting up BI.
> 3) Enough to test P2P UIO flows - probably need to invent an accelerator
>   with appropriate support to test that - DMA engine or similar.
> 4) Bunch of small features:
>   a) Multiple HDM decoders.

This is a fairly urgent feature to test mixed volatile / non volatile stuff
using Gregory Price's emulation of volatile for the type 3 device.
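
As an illustration of what multiple HDM decoders buy for the mixed volatile / non-volatile case, here is a minimal Python sketch of looking up a host physical address (HPA) across several decoders. It is a model only, not QEMU code: the `HdmDecoder` class, its fields, the explicit `dpa_base`, and the example address ranges are all assumptions made for the sketch, and interleaving is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HdmDecoder:
    """Hypothetical model of one HDM decoder (no interleave)."""
    hpa_base: int            # start of the host physical address window
    size: int                # window size in bytes
    dpa_base: int            # assumed device physical address of the window start
    committed: bool = False  # only committed decoders take part in decode

    def contains(self, hpa: int) -> bool:
        return self.committed and self.hpa_base <= hpa < self.hpa_base + self.size


def decode(decoders: List[HdmDecoder], hpa: int) -> Optional[Tuple[int, int]]:
    """Return (decoder index, DPA) for the first matching decoder, else None."""
    for index, dec in enumerate(decoders):
        if dec.contains(hpa):
            return index, dec.dpa_base + (hpa - dec.hpa_base)
    return None


# Example: decoder 0 covers a volatile region, decoder 1 a persistent one.
decoders = [
    HdmDecoder(hpa_base=0x1_0000_0000, size=0x1000_0000, dpa_base=0x0, committed=True),
    HdmDecoder(hpa_base=0x2_0000_0000, size=0x1000_0000, dpa_base=0x1000_0000, committed=True),
]
```

With a single decoder only one of the two windows could be mapped at a time, which is why the feature matters for testing mixed configurations.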

>   b) Poisoning.  Right now we have prototype, but it's not wired up
> to actually report poison on reads.

The command handling stuff is on the tree below, but this needs exploring
on qemu side. I'm not even sure what 'poison' would look like.
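
One possible shape for "poison on reads", sketched in Python as a model rather than as QEMU code: keep a set of poisoned cachelines and have the read path consult it. Everything here is an assumption for illustration, including the 64-byte granularity, the `PoisonList` and `Type3Memory` names, and the choice to signal poison by raising an exception; how poison should actually be surfaced to the guest is exactly the open question above.

```python
CACHELINE = 64  # assumed poison tracking granularity in bytes


class PoisonedRead(Exception):
    """Raised by the model when a read touches a poisoned cacheline."""


class PoisonList:
    def __init__(self):
        self._lines = set()  # indices of poisoned cachelines

    @staticmethod
    def _span(dpa, length):
        # Cacheline indices covered by [dpa, dpa + length); length must be > 0.
        return range(dpa // CACHELINE, (dpa + length - 1) // CACHELINE + 1)

    def inject(self, dpa, length=CACHELINE):
        self._lines.update(self._span(dpa, length))

    def clear(self, dpa, length=CACHELINE):
        self._lines.difference_update(self._span(dpa, length))

    def is_poisoned(self, dpa, length):
        return any(line in self._lines for line in self._span(dpa, length))


class Type3Memory:
    """Hypothetical backing store whose read path checks the poison list."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.poison = PoisonList()

    def read(self, dpa, length):
        if self.poison.is_poisoned(dpa, length):
            raise PoisonedRead(f"poison in [{dpa:#x}, {dpa + length:#x})")
        return bytes(self.data[dpa:dpa + length])
```

The exception stands in for whatever error response the emulation would eventually return; the point of the sketch is only that the command handling and the read path share one list.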

>   c) CXL non-function map DVSEC. Given QEMU lets you add any function
> to a given device by just setting the bus to be the same as another,
> this is a bit fiddly because we need to update it late in the QEMU
> bring up, or possibly easier to do it at read time (that may well be
> easier).

This one should be a fairly small task for anyone interested.  Not super
high priority though as kernel driver doesn't care yet ;)
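
To make the "do it at read time" option concrete, here is a small Python model, not QEMU code, that computes the map register value whenever it is read instead of caching it during bring-up, so functions added late are still reflected. The flat layout (bit n of a 256-bit map means function number n, split into eight 32-bit registers) is an assumption for the sketch; the real bit assignment, including how it depends on ARI, has to be taken from the CXL specification. The `Device` and `Function` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

NUM_MAP_REGS = 8    # assumption: eight 32-bit map registers
BITS_PER_REG = 32


@dataclass
class Function:
    number: int     # function number on the device (0..255 in this model)
    is_cxl: bool    # True if the function is a CXL function


@dataclass
class Device:
    functions: List[Function] = field(default_factory=list)

    def add_function(self, number: int, is_cxl: bool) -> None:
        # Functions may be attached in any order, possibly late in bring-up.
        self.functions.append(Function(number, is_cxl))

    def read_non_cxl_map(self, reg_index: int) -> int:
        """Compute one map register on demand from the current function list."""
        if not 0 <= reg_index < NUM_MAP_REGS:
            raise IndexError("map register index out of range")
        lo = reg_index * BITS_PER_REG
        value = 0
        for fn in self.functions:
            if not fn.is_cxl and lo <= fn.number < lo + BITS_PER_REG:
                value |= 1 << (fn.number - lo)
        return value
```

Because nothing is cached, there is no ordering problem between device realization and adding sibling functions; the cost is a short walk over the function list on each config read.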

>   d) Most useful of all, but most boring perhaps is review of what's
> already waiting for upstreaming.

For this one I've pushed out what I currently have queued up.
https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-13

I'll also add that stuff to the todo list
I'm aware of two other series that have been posted.

> 
> Please, correct me if I miss something. I believe we need to have a
> TODO list to collaborate efficiently. Any ideas what else can be added into TODO list?

More error injection to support David Jiang's patch set testing.
Also, rather tangential to the rest, but we can wire up UEFI CPER record
reporting as well to test Smita Koralahalli's series.  I have old code
for doing that on arm64.

Also fairly high on what matters to me is arm64 support via DT to hopefully
help us get the arm-virt support upstream.  That's my daily test
platform so I'd rather not maintain it out of tree forever!

If anyone wants access to edit the page, DM me a registered gitlab ID and
I'll add you to the project.

Jonathan

> 
> Thanks,
> Slava. 
> 



* Re: [External] CXL emulation in QEMU contribution
  2022-10-13 15:09   ` Jonathan Cameron via
@ 2022-10-18 21:26     ` Viacheslav A.Dubeyko
  2022-10-19  9:52       ` Jonathan Cameron via
  0 siblings, 1 reply; 4+ messages in thread
From: Viacheslav A.Dubeyko @ 2022-10-18 21:26 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Adam Manzanares, Cong Wang, qemu-devel, linux-cxl

Hi Jonathan,

> On Oct 13, 2022, at 8:09 AM, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 

<skipped>

>> So, I would like to contribute to QEMU emulation of CXL memory
>> support. And I would like to see a TODO list. I hope this list could
>> be useful not only for me. As far as I can see, we can summarize:
> 
> Absolutely agree on need for a TODO now there are multiple groups involved

As far as I can see, Fabric Management looks like “uncharted territory” with a lot of work. I think it’s a pretty interesting direction for me to start with. I can read the Compute Express Link Specification (Revision 3.0, Version 1.0). Could you recommend some other docs or links to take a look at?

By the way, I see ARM64 support in the TODO list, but nothing related to RISC-V. Do we need to consider RISC-V too?

Thanks,
Slava.






* Re: [External] CXL emulation in QEMU contribution
  2022-10-18 21:26     ` [External] " Viacheslav A.Dubeyko
@ 2022-10-19  9:52       ` Jonathan Cameron via
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Cameron via @ 2022-10-19  9:52 UTC (permalink / raw)
  To: Viacheslav A.Dubeyko; +Cc: Adam Manzanares, Cong Wang, qemu-devel, linux-cxl

On Tue, 18 Oct 2022 14:26:41 -0700
"Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com> wrote:

> Hi Jonathan,
> 
> > On Oct 13, 2022, at 8:09 AM, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> 
> <skipped>
> 
> >> So, I would like to contribute to QEMU emulation of CXL memory
> >> support. And I would like to see a TODO list. I hope this list could
> >> be useful not only for me. As far as I can see, we can summarize:  
> > 
> > Absolutely agree on need for a TODO now there are multiple groups involved  
> 
> As far as I can see, Fabric Management looks like “uncharted territory” with 
> a lot of work. I think it’s a pretty interesting direction for me to start with. 
> I can read the Compute Express Link Specification (Revision 3.0, Version 1.0). 
> Could you recommend some other docs or links to take a look at?

So far the spec is all I'm aware of at the level of actually talking to the hardware.
Probably some wooly presentations on what people 'might build'.

There are some other efforts that may be related to higher level - e.g. what
talks to the FM that then talks to the CXL devices.

I've not looked into them but google feeds me:
https://www.dmtf.org/documents/redfish-spmf/redfish-cxl-device-management-models-bundle-08wip
https://www.dmtf.org/content/dmtf-and-cxl-consortium-establish-work-register

One interesting diversion in this space would be to get the MCTP interfaces
up and running (perhaps via i2c).  The last time I looked at that,
the issue was that there wasn't any overlap between suitable I2C controllers
(need to support master and subordinate roles) and ones with ACPI bindings.
Doing it over PCIe VDMs is also an option.  The interest here would be to
put a second transport option in place so that any userspace FM code would
work well with that and via the mailbox interfaces.

Early work on the i2c approach at:
https://lore.kernel.org/qemu-devel/20220520165909.4369-1-Jonathan.Cameron@huawei.com/

For now I've abandoned that as CXL 3.0 got published with the Switch mailbox
path (+ support for tunneling commands via a normal mailbox on multi head devices).

> 
> By the way, I see ARM64 support in the TODO list, but nothing related to RISC-V.
> Do we need to consider RISC-V too?

I'm only going to focus on architectures I have reason to support
- there's enough work to keep me busy without adding more!

More than happy to review / comment on support for risc-v though. The
same applies to Power (no idea on IBM's plans, but they are a BoD level
member of CXL so I assume they may have some.)

Jonathan

> 
> Thanks,
> Slava.
> 
> 
> 


