Linux CXL
From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "john@jagalactic.com" <john@jagalactic.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>
Cc: "johnny.li@montage-tech.com" <johnny.li@montage-tech.com>,
	"Widawsky, Ben" <ben.widawsky@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"jgroves@micron.com" <jgroves@micron.com>
Subject: Re: CXL 1.1 Support Plan
Date: Fri, 8 Oct 2021 20:20:52 +0000
Message-ID: <b9fddf71e17902af1326b3242d4f3d899754e674.camel@intel.com>
In-Reply-To: <0100017c61378e14-5efea3be-ca28-476b-bc64-4512c44dba50-000000@email.amazonses.com>

On Fri, 2021-10-08 at 18:43 +0000, John Groves wrote:
> On 10/7/21 4:30 PM, Verma, Vishal L wrote:
> > On Thu, 2021-09-30 at 22:17 +0000, John Groves wrote:
> > > I don't see your doc patch yet in the ndctl repo, but I recommend merging it there.
> > > 
> > > A related question: "daxctl reconfigure-device --mode=system-ram ..." works for me,
> > > with --force, but going the other way (--mode=devdax) fails.  But a reboot puts it back
> > > into devdax mode regardless of the pre-boot setting (i.e. --mode=system-ram reverts
> > > back to devdax on reboot).
> > Oh, I'm a bit confused. --force only applies when going from system-ram
> > to devdax -- it offlines the memory for you. Without force, you're
> > responsible for a prior 'daxctl offline-memory daxX.Y' step.
> > 
> > Going from devdax to system-ram should not need --force, and I don't
> > think force actually does anything there.
> 
> Hoping I've successfully de-mangled this message...
> 
> I definitely might be doing something wrong.  Here is a "typescript".
> 
> # grep dax /proc/iomem
>     880000000-107fffffff : *dax*0.0
> 
> # ls -al /dev/dax0.0
> crw------- 1 root root 252, 2 Oct  8 13:14 /dev/dax0.0
> 
> # numastat
>                            node0
> numa_hit                 2698949
> numa_miss                      0
> numa_foreign                   0
> interleave_hit             14565
> local_node               2698949
> other_node                     0
> 
> # daxctl reconfigure-device --mode=system-ram --region=0 dax0.0
> dax0.0: error: kernel policy will auto-online memory, aborting
> error reconfiguring devices: Device or resource busy
> reconfigured 0 devices

Ah yes - so this points to either

  CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y

or 

  $ cat /sys/devices/system/memory/auto_online_blocks 
  online

daxctl wants to online the new memory in ZONE_MOVABLE by default, but
either of the above would race daxctl to online it in ZONE_NORMAL -
that's the warning you got above. 
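
A rough sketch of checking the runtime policy before reconfiguring, in
case it helps. The helper names here are made up for illustration and
are not part of daxctl; the only real interface used is the
auto_online_blocks sysfs file shown above. Treating anything other than
"offline" as a potential race is my assumption.

```shell
# Hypothetical helpers, not part of daxctl.
# Print the kernel's runtime auto-online policy, or "unknown" if the
# sysfs file cannot be read. An alternate path may be passed for testing.
auto_online_policy() {
    policy_file="${1:-/sys/devices/system/memory/auto_online_blocks}"
    if [ -r "$policy_file" ]; then
        cat "$policy_file"
    else
        echo unknown
    fi
}

# Return 0 (true) if the kernel is expected to online new memory on its
# own, which would race with daxctl's onlining into ZONE_MOVABLE.
will_race_daxctl() {
    case "$(auto_online_policy "$1")" in
        offline|unknown) return 1 ;;
        *) return 0 ;;
    esac
}
```

If will_race_daxctl reports a race, writing "offline" to
auto_online_blocks (as root) before running
'daxctl reconfigure-device --mode=system-ram dax0.0' should leave the
onlining to daxctl.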

> 
> # daxctl reconfigure-device --mode=system-ram --region=0 --force dax0.0
> 
> dax0.0:
>   WARNING: detected a race while onlining memory
>   Some memory may not be in the expected zone. It is
>   recommended to disable any other onlining mechanisms,
>   and retry. If onlining is to be left to other agents,
>   use the --no-online option to suppress this warning

I guess the force does work in this case :)

> dax0.0: all memory sections (256) already online
> [
>   {
>     "chardev":"dax0.0",
>     "size":34359738368,
>     "target_node":1,
>     "align":2097152,
>     "mode":"system-ram",
>     "online_memblocks":256,
>     "total_memblocks":256,
>     "movable":false

This indicates that the new memory went into ZONE_NORMAL.
That can make it hard to convert back to devdax, but..


[snip]

> Starting in the state where I left off above:
> 
> # daxctl reconfigure-device --mode=devdax dax0.0
> error reconfiguring devices: Device or resource busy
> reconfigured 0 devices
> 
> # daxctl reconfigure-device --mode=devdax --force dax0.0
> libdaxctl: offline_one_memblock: dax0.0: Failed to offline /sys/devices/system/node/node1/memory272/state: Device or resource busy
> dax0.0: failed to offline memory: Device or resource busy
> 
> error reconfiguring devices: Device or resource busy
> reconfigured 0 devices
> 
> # daxctl offline-memory dax0.0
> dax0.0: 4 memory sections already offline
> libdaxctl: offline_one_memblock: dax0.0: Failed to offline /sys/devices/system/node/node1/memory272/state: Device or resource busy
> dax0.0: failed to offline memory: Device or resource busy
> error offlining memory: Device or resource busy
> offlined memory for 0 devices

This confused me a bit - I wonder if 'daxctl offline-memory' claiming
that some sections are already offline is a bug, as it then goes on to
say 'failed to offline..' for the same device.
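
To see what the kernel itself reports for each block, something along
these lines might help. This is only a sketch: the function names are
invented for illustration, and the default node1 path is taken from the
error message in your log.

```shell
# Hypothetical helpers, not part of daxctl.
# Print "<block> <state>" for every memory block under a node directory.
# Defaults to node1 from the log above; pass another directory to override.
list_memblock_states() {
    node_dir="${1:-/sys/devices/system/node/node1}"
    for state_file in "$node_dir"/memory*/state; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -r "$state_file" ] || continue
        block=$(basename "$(dirname "$state_file")")
        printf '%s %s\n' "$block" "$(cat "$state_file")"
    done
}

# Count the blocks in a given state, e.g. "online" or "offline".
# Usage: count_memblocks_in_state <state> [node_dir]
count_memblocks_in_state() {
    list_memblock_states "$2" |
        awk -v want="$1" '$2 == want { n++ } END { print n + 0 }'
}
```

Comparing 'count_memblocks_in_state offline' against the "4 memory
sections already offline" message would show whether daxctl and sysfs
agree.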

What is this range backed by? Most of the testing I did with this was
using a pmem device that gets its own 'target_node' (i.e. the new
memory ends up in a new numa node of its own).

If you carve out memory using memmap or efi_fake_mem, it would end up
getting hotplugged into an existing numa node. I wonder if that causes
problems with the hot-unplug. Let me take another look at this.

> 
> # daxctl offline-memory --force dax0.0
>   Error: unknown option `force'
>  usage: daxctl offline-memory <device> [<options>]
> 
>     -r, --region <region-id>
>                           filter by region
>     -u, --human           use human friendly number formats
>     -v, --verbose         emit more debug messages
> 
>  
> 
> I have to reboot to get back to dax memory, though it's very possible that my
> [questionable] doc reading skills are at fault.
> 
> Thanks,
> John

