Date: Fri, 10 Feb 2023 12:43:07 +0000
From: Jonathan Cameron
To: Brice Goglin
CC: "Verma, Vishal L", "linux-cxl@vger.kernel.org", "Williams, Dan J",
 "gregory.price@memverge.com", "dave@stgolabs.net",
 "nvdimm@lists.linux.dev", "Weiny, Ira"
Subject: Re: [PATCH ndctl v2 0/7] cxl: add support for listing and creating volatile regions
Message-ID: <20230210124307.00003be0@Huawei.com>
In-Reply-To: <34a03b27-923c-7bb0-d77a-b0fddc535160@inria.fr>
References: <20230120-vv-volatile-regions-v2-0-4ea6253000e5@intel.com>
 <34a03b27-923c-7bb0-d77a-b0fddc535160@inria.fr>
Organization: Huawei Technologies Research and Development (UK) Ltd.
List-ID: linux-cxl@vger.kernel.org
On Fri, 10 Feb 2023 11:15:44 +0100 Brice Goglin wrote:

> On 09/02/2023 at 20:17, Verma, Vishal L wrote:
> > On Thu, 2023-02-09 at 12:04 +0100, Brice Goglin wrote:
> >> Hello Vishal
> >>
> >> I am trying to play with this but all my attempts have failed so far.
> >> Could you provide QEMU and cxl-cli command lines to get a volatile
> >> region enabled in a QEMU VM?
> >
> > Hi Brice,
> >
> > Greg posted his working config in another thread:
> > https://lore.kernel.org/linux-cxl/Y9sMs0FGulQSIe9t@memverge.com/
> >
> > I've also pasted below the qemu command line generated by the run_qemu
> > script I referenced. (Note that this adds a bunch of stuff not strictly
> > needed for a minimal CXL configuration - you can certainly trim a lot
> > of that out; this is just the default setup that gets generated and
> > that I usually run.)
>
> Hello Vishal
>
> Thanks a lot - things were failing because my kernel didn't have
> CONFIG_CXL_REGION_INVALIDATION_TEST=y. Now I am able to create a single
> ram region, either with a single device or with multiple interleaved
> ones.
>
> However, I can't get multiple separate ram regions. If I boot a config
> like yours below, I get 4 ram devices. How can I create one region for
> each? Once I create the first one, the others fail with something like
> the error below. I tried using other decoders but it didn't help (I
> still need to read more of the CXL docs about decoders, why new ones
> appear when creating a region, etc.).
>
> cxl region: collect_memdevs: no active memdevs found: decoder: decoder0.0 filter: mem3

Hi Brice,

QEMU emulation currently only supports a single HDM decoder at each
level - host bridge, switch USP and EP - with the exception of the
top-level CFMWS entries (the example below has two of those). We should
fix that... For now, you should be able to do it with multiple pxb-cxl
instances, with appropriate CFMWS entries for each one. Which is
horrible, but might work for you in the meantime.

>
> By the way, once configured as system ram, my CXL ram is merged into an
> existing "normal" NUMA node. How do I tell QEMU that a CXL region
> should be part of a new NUMA node? I assume that's what's going to
> happen on real hardware?

We don't yet have the kernel code to deal with assigning a new NUMA
node. It was on the todo list in the last sync call, I think.

> Thanks
> Brice
>
> >
> > $ run_qemu.sh -g --cxl --cxl-debug --rw -r none --cmdline
> > /home/vverma7/git/qemu/build/qemu-system-x86_64
> > -machine q35,accel=kvm,nvdimm=on,cxl=on
> > -m 8192M,slots=4,maxmem=40964M
> > -smp 8,sockets=2,cores=2,threads=2
> > -enable-kvm
> > -display none
> > -nographic
> > -drive if=pflash,format=raw,unit=0,file=OVMF_CODE.fd,readonly=on
> > -drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd
> > -debugcon file:uefi_debug.log
> > -global isa-debugcon.iobase=0x402
> > -drive file=root.img,format=raw,media=disk
> > -kernel ./mkosi.extra/lib/modules/6.2.0-rc6+/vmlinuz
> > -initrd mkosi.extra/boot/initramfs-6.2.0-rc2+.img
> > -append selinux=0 audit=0 console=tty0 console=ttyS0 root=/dev/sda2 ignore_loglevel rw cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl_core.dyndbg=+fplm cxl_mem.dyndbg=+fplm cxl_pmem.dyndbg=+fplm cxl_port.dyndbg=+fplm cxl_region.dyndbg=+fplm cxl_test.dyndbg=+fplm cxl_mock.dyndbg=+fplm cxl_mock_mem.dyndbg=+fplm memmap=2G!4G efi_fake_mem=2G@6G:0x40000
> > -device e1000,netdev=net0,mac=52:54:00:12:34:56
> > -netdev user,id=net0,hostfwd=tcp::10022-:22
> > -object memory-backend-file,id=cxl-mem0,share=on,mem-path=cxltest0.raw,size=256M
> > -object memory-backend-file,id=cxl-mem1,share=on,mem-path=cxltest1.raw,size=256M
> > -object memory-backend-file,id=cxl-mem2,share=on,mem-path=cxltest2.raw,size=256M
> > -object memory-backend-file,id=cxl-mem3,share=on,mem-path=cxltest3.raw,size=256M
> > -object memory-backend-ram,id=cxl-mem4,share=on,size=256M
> > -object memory-backend-ram,id=cxl-mem5,share=on,size=256M
> > -object memory-backend-ram,id=cxl-mem6,share=on,size=256M
> > -object memory-backend-ram,id=cxl-mem7,share=on,size=256M
> > -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=lsa0.raw,size=1K
> > -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=lsa1.raw,size=1K
> > -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=lsa2.raw,size=1K
> > -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=lsa3.raw,size=1K
> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=53
> > -device pxb-cxl,id=cxl.1,bus=pcie.0,bus_nr=191
> > -device cxl-rp,id=hb0rp0,bus=cxl.0,chassis=0,slot=0,port=0
> > -device cxl-rp,id=hb0rp1,bus=cxl.0,chassis=0,slot=1,port=1
> > -device cxl-rp,id=hb0rp2,bus=cxl.0,chassis=0,slot=2,port=2
> > -device cxl-rp,id=hb0rp3,bus=cxl.0,chassis=0,slot=3,port=3
> > -device cxl-rp,id=hb1rp0,bus=cxl.1,chassis=0,slot=4,port=0
> > -device cxl-rp,id=hb1rp1,bus=cxl.1,chassis=0,slot=5,port=1
> > -device cxl-rp,id=hb1rp2,bus=cxl.1,chassis=0,slot=6,port=2
> > -device cxl-rp,id=hb1rp3,bus=cxl.1,chassis=0,slot=7,port=3
> > -device cxl-type3,bus=hb0rp0,memdev=cxl-mem0,id=cxl-dev0,lsa=cxl-lsa0
> > -device cxl-type3,bus=hb0rp1,memdev=cxl-mem1,id=cxl-dev1,lsa=cxl-lsa1
> > -device cxl-type3,bus=hb1rp0,memdev=cxl-mem2,id=cxl-dev2,lsa=cxl-lsa2
> > -device cxl-type3,bus=hb1rp1,memdev=cxl-mem3,id=cxl-dev3,lsa=cxl-lsa3
> > -device cxl-type3,bus=hb0rp2,volatile-memdev=cxl-mem4,id=cxl-dev4
> > -device cxl-type3,bus=hb0rp3,volatile-memdev=cxl-mem5,id=cxl-dev5
> > -device cxl-type3,bus=hb1rp2,volatile-memdev=cxl-mem6,id=cxl-dev6
> > -device cxl-type3,bus=hb1rp3,volatile-memdev=cxl-mem7,id=cxl-dev7
> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k,cxl-fmw.1.targets.0=cxl.0,cxl-fmw.1.targets.1=cxl.1,cxl-fmw.1.size=4G,cxl-fmw.1.interleave-granularity=8k
> > -object memory-backend-ram,id=mem0,size=2048M
> > -numa node,nodeid=0,memdev=mem0,
> > -numa cpu,node-id=0,socket-id=0
> > -object memory-backend-ram,id=mem1,size=2048M
> > -numa node,nodeid=1,memdev=mem1,
> > -numa cpu,node-id=1,socket-id=1
> > -object memory-backend-ram,id=mem2,size=2048M
> > -numa node,nodeid=2,memdev=mem2,
> > -object memory-backend-ram,id=mem3,size=2048M
> > -numa node,nodeid=3,memdev=mem3,
> > -numa node,nodeid=4,
> > -object memory-backend-file,id=nvmem0,share=on,mem-path=nvdimm-0,size=16384M,align=1G
> > -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=4
> > -numa node,nodeid=5,
> > -object memory-backend-file,id=nvmem1,share=on,mem-path=nvdimm-1,size=16384M,align=1G
> > -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=5
> > -numa dist,src=0,dst=0,val=10
> > -numa dist,src=0,dst=1,val=21
> > -numa dist,src=0,dst=2,val=12
> > -numa dist,src=0,dst=3,val=21
> > -numa dist,src=0,dst=4,val=17
> > -numa dist,src=0,dst=5,val=28
> > -numa dist,src=1,dst=1,val=10
> > -numa dist,src=1,dst=2,val=21
> > -numa dist,src=1,dst=3,val=12
> > -numa dist,src=1,dst=4,val=28
> > -numa dist,src=1,dst=5,val=17
> > -numa dist,src=2,dst=2,val=10
> > -numa dist,src=2,dst=3,val=21
> > -numa dist,src=2,dst=4,val=28
> > -numa dist,src=2,dst=5,val=28
> > -numa dist,src=3,dst=3,val=10
> > -numa dist,src=3,dst=4,val=28
> > -numa dist,src=3,dst=5,val=28
> > -numa dist,src=4,dst=4,val=10
> > -numa dist,src=4,dst=5,val=28
> > -numa dist,src=5,dst=5,val=10
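For what it's worth, the multiple-pxb-cxl workaround mentioned above might
look something like the sketch below. This is untested; the ids, bus
numbers, and sizes are illustrative only. The key point is one cxl-fmw
entry per pxb-cxl, so each top-level (CFMWS) root decoder targets a single
host bridge and each device can end up in its own ram region:

```shell
# Hypothetical minimal config: two emulated CXL host bridges, one
# volatile type-3 device behind each, and one fixed memory window
# (CFMWS) per host bridge rather than one window interleaving both.
qemu-system-x86_64 -machine q35,cxl=on -m 4G -smp 2 -nographic \
  -object memory-backend-ram,id=vmem0,share=on,size=256M \
  -object memory-backend-ram,id=vmem1,share=on,size=256M \
  -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
  -device pxb-cxl,id=cxl.1,bus=pcie.0,bus_nr=180 \
  -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \
  -device cxl-rp,id=rp1,bus=cxl.1,chassis=0,slot=1 \
  -device cxl-type3,bus=rp0,volatile-memdev=vmem0,id=cxl-vmem0 \
  -device cxl-type3,bus=rp1,volatile-memdev=vmem1,id=cxl-vmem1 \
  -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G,cxl-fmw.1.targets.0=cxl.1,cxl-fmw.1.size=4G
```

With a layout like that, each memdev should be reachable through its own
root decoder, so in principle something like
`cxl create-region -d decoder0.0 -m mem0 -t ram` and then the same
against the other decoder/memdev pair would give two separate ram
regions - the exact decoder and memdev names will differ on a given boot.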