From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Qemu Developers <qemu-devel@nongnu.org>,
	Xiao Guangrong <guangrong.xiao@gmail.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: QEMU NVDIMM as type 7 in e820 table
Date: Fri, 28 Jul 2017 13:45:43 -0600
Message-ID: <20170728194543.GA20726@linux.intel.com>
In-Reply-To: <CAPcyv4i5YWK6RHYEkH6C1oPWwhPLOD-atmkMVXDXc-qdv_0hgw@mail.gmail.com>

On Fri, Jul 28, 2017 at 11:11:10AM -0700, Dan Williams wrote:
> On Fri, Jul 28, 2017 at 11:04 AM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > I've been using the virtualized NVDIMM support in QEMU for testing, and I
> > noticed that the physical addresses used by the virtual NVDIMMs aren't present
> > in the guest's e820 table.
> >
> > Here is the e820 table on my QEMU instance where I have one 32 GiB virtual
> > NVDIMM:
> >
> > [    0.000000] e820: BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdefff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000bffdf000-0x00000000bfffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000023fffffff] usable
> >
> > The physical addresses used by the virtual NVDIMM are 0x240000000-0xA3FFFFFFF.
> > You can see this by looking at ndctl and the values we get from the NFIT:
> >
> > # ndctl list -R
> > {
> >   "dev":"region0",
> >   "size":34359738368,
> >   "available_size":0,
> >   "type":"pmem"
> > }
> >
> > # grep . /sys/bus/nd/devices/region0/{resource,size}
> > region0/resource:0x240000000
> > region0/size:34359738368
> >
> > Or you can see the same info by using iasl to dump
> > /sys/firmware/acpi/tables/NFIT:
> >
> > [028h 0040   2]                Subtable Type : 0000 [System Physical Address Range]
> > [02Ah 0042   2]                       Length : 0038
> >
> > [02Ch 0044   2]                  Range Index : 0002
> > [02Eh 0046   2]        Flags (decoded below) : 0003
> >                    Add/Online Operation Only : 1
> >                       Proximity Domain Valid : 1
> > [030h 0048   4]                     Reserved : 00000000
> > [034h 0052   4]             Proximity Domain : 00000000
> > [038h 0056  16]           Address Range GUID : 66F0D379-B4F3-4074-AC43-0D3318B78CDB
> > [048h 0072   8]           Address Range Base : 0000000240000000
> > [050h 0080   8]         Address Range Length : 0000000800000000
> > [058h 0088   8]         Memory Map Attribute : 0000000000008008
> >
> > I expected to see a type 7 region for the NVDIMM physical address range in the
> > e820 table, so something like:
> >
> > [    0.000000] e820: BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdefff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000bffdf000-0x00000000bfffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000023fffffff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000240000000-0x0000000a3fffffff] persistent (type 7)
> >
> 
> Do you need that information in e820? Linux effectively ignores type-7.
> As long as the range is treated as reserved it's not clear that you
> need the e820 entry. We also inject the persistent type back into the
> memory map when the NFIT driver loads. /proc/iomem should show the
> right data.

[ Adding Linda & Toshi to see if they have an opinion. ]

I guess maybe we don't need it.  Yep, /proc/iomem looks good:

  # cat /proc/iomem
  00000000-00000fff : Reserved
  00001000-0009fbff : System RAM
  ...
  100000000-23fffffff : System RAM
  240000000-a3fffffff : Persistent Memory
    240000000-a3fffffff : namespace0.0
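
The cross-check between /proc/iomem and the NFIT values can be scripted. What follows is only a minimal Python sketch, not anything from ndctl or the kernel: the helper names parse_iomem and pmem_ranges are made up for illustration, and the sample text and NFIT numbers are copied from this message rather than read from a live system.

```python
import re

# Sample text copied from the /proc/iomem output in this message (abbreviated there).
IOMEM_SAMPLE = """\
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
...
100000000-23fffffff : System RAM
240000000-a3fffffff : Persistent Memory
  240000000-a3fffffff : namespace0.0
"""

# Values from the NFIT System Physical Address Range subtable quoted in this message.
NFIT_BASE = 0x240000000
NFIT_LENGTH = 0x800000000

# One /proc/iomem entry: "<start>-<end> : <name>", possibly indented for child resources.
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def parse_iomem(text):
    """Return (start, end_inclusive, name) tuples; lines that don't match are skipped."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3).strip()))
    return entries


def pmem_ranges(entries):
    """Pick out the ranges labelled 'Persistent Memory'."""
    return [(start, end) for start, end, name in entries if name == "Persistent Memory"]


if __name__ == "__main__":
    # /proc/iomem prints inclusive end addresses, so compare against base + length - 1.
    expected = (NFIT_BASE, NFIT_BASE + NFIT_LENGTH - 1)
    for start, end in pmem_ranges(parse_iomem(IOMEM_SAMPLE)):
        status = "matches NFIT" if (start, end) == expected else "differs from NFIT"
        print(f"{start:#x}-{end:#x} ({(end - start + 1) >> 30} GiB): {status}")
```

On a real guest the sample string would be replaced by the contents of /proc/iomem, and the base and length by whatever the NFIT dump reports.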

I was just worried that this was an inconsistency between the way that virtual
NVDIMMs are presented vs the way that they will be presented on bare metal.  I
at least look at the e820 table to get my bearings of how memory is laid out -
maybe I just need to look at /proc/iomem instead?
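
For getting bearings from the e820 side, the "is this range in the table at all?" question can be answered mechanically. This is a rough Python sketch under stated assumptions: parse_e820 and overlapping are hypothetical helper names, the regular expression only targets the "BIOS-e820: [mem start-end] type" lines shown in the quoted boot log, and the sample is pasted from that log instead of being read from dmesg.

```python
import re

# e820 lines copied from the boot log quoted in this message.
E820_SAMPLE = """\
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdefff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bffdf000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000023fffffff] usable
"""

E820_RE = re.compile(r"BIOS-e820: \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\] (.+)$")


def parse_e820(text):
    """Return (start, end_inclusive, type) tuples for each BIOS-e820 line."""
    entries = []
    for line in text.splitlines():
        m = E820_RE.search(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3).strip()))
    return entries


def overlapping(entries, start, end):
    """Entries whose inclusive range intersects [start, end]."""
    return [(s, e, t) for s, e, t in entries if s <= end and e >= start]


if __name__ == "__main__":
    # NVDIMM range from the NFIT: base 0x240000000, length 0x800000000 (inclusive end).
    pmem_start = 0x240000000
    pmem_end = pmem_start + 0x800000000 - 1
    hits = overlapping(parse_e820(E820_SAMPLE), pmem_start, pmem_end)
    if hits:
        for s, e, t in hits:
            print(f"{s:#x}-{e:#x} {t}")
    else:
        print(f"no e820 entry covers {pmem_start:#x}-{pmem_end:#x}")
```

With the table quoted in this message, no entry overlaps the NVDIMM range, which is the gap described at the start of the thread.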