From: Dan Williams <dan.j.williams@intel.com>
To: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
Cc: "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
Neil Brown <neilb@suse.de>, Dave Chinner <david@fromorbit.com>,
"H. Peter Anvin" <hpa@zytor.com>, Christoph Hellwig <hch@lst.de>,
"Wysocki, Rafael J" <rafael.j.wysocki@intel.com>,
"Moore, Robert" <robert.moore@intel.com>,
Ingo Molnar <mingo@kernel.org>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
Jens Axboe <axboe@fb.com>, Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Greg KH <gregkh@linuxfoundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andy Lutomirski <luto@amacapital.net>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [Linux-nvdimm] [PATCH v2 00/20] libnd: non-volatile memory device support
Date: Tue, 28 Apr 2015 15:15:54 -0700
Message-ID: <CAPcyv4g-M6NPVV4C3nB7DebhwuuzNKbNyYGDPKh_3N54P2b4Cw@mail.gmail.com>
In-Reply-To: <94D0CD8314A33A4D9D801C0FE68B40295A8C934B@G9W0745.americas.hpqcorp.net>
On Tue, Apr 28, 2015 at 2:24 PM, Elliott, Robert (Server Storage)
<Elliott@hp.com> wrote:
>> -----Original Message-----
>> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On Behalf Of
>> Dan Williams
>> Sent: Tuesday, April 28, 2015 1:24 PM
>> To: linux-nvdimm@lists.01.org
>> Cc: Neil Brown; Dave Chinner; H. Peter Anvin; Christoph Hellwig; Rafael J.
>> Wysocki; Robert Moore; Ingo Molnar; linux-acpi@vger.kernel.org; Jens Axboe;
>> Borislav Petkov; Thomas Gleixner; Greg KH; linux-kernel@vger.kernel.org;
>> Andy Lutomirski; Andrew Morton; Linus Torvalds
>> Subject: [Linux-nvdimm] [PATCH v2 00/20] libnd: non-volatile memory device
>> support
>>
>> Changes since v1 [1]: Incorporates feedback received prior to April 24.
>
> Here are some comments on the sysfs properties reported for a pmem device.
> They are based on v1, but I don't think v2 changes anything.
>
> 1. This confuses lsblk (part of util-linux):
> /sys/block/pmem0/device/type:4
>
> lsblk shows:
> NAME  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> pmem0 251:0    0   8G  0 worm
> pmem1 251:16   0   8G  0 worm
> pmem2 251:32   0   8G  0 worm
> pmem3 251:48   0   8G  0 worm
> pmem4 251:64   0   8G  0 worm
> pmem5 251:80   0   8G  0 worm
> pmem6 251:96   0   8G  0 worm
> pmem7 251:112  0   8G  0 worm
>
> lsblk's blkdev_scsi_type_to_name() considers 4 to mean
> SCSI_TYPE_WORM (write once read many ... used for certain optical
> and tape drives).
Why is lsblk assuming these are SCSI devices? I'll need to go check that out.
> I'm not sure what nd and pmem are doing to result in that value.
That is their libnd-specific device type number from
include/uapi/ndctl.h. 4 == ND_DEVICE_NAMESPACE_IO. lsblk has no
business interpreting this as something SCSI-specific.
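
For illustration only, here is a minimal sketch of how the "worm" label can arise when a tool treats every device/type value as a SCSI peripheral device type. This is not the util-linux source: scsi_type_to_name() is a hypothetical stand-in for lsblk's blkdev_scsi_type_to_name(), and the table is abbreviated. The value 4 for ND_DEVICE_NAMESPACE_IO is taken from the reply above.

```c
#include <stddef.h>

/* Abbreviated set of SCSI peripheral device types; 0x04 is write-once. */
#define SCSI_TYPE_DISK 0x00
#define SCSI_TYPE_TAPE 0x01
#define SCSI_TYPE_WORM 0x04
#define SCSI_TYPE_ROM  0x05

/* libnd device type, value as stated in this thread. */
#define ND_DEVICE_NAMESPACE_IO 4

/*
 * Hypothetical stand-in for a SCSI-only type-to-name lookup.
 * It has no way to tell that the number came from a non-SCSI bus.
 */
static const char *scsi_type_to_name(int type)
{
	switch (type) {
	case SCSI_TYPE_DISK:
		return "disk";
	case SCSI_TYPE_TAPE:
		return "tape";
	case SCSI_TYPE_WORM:
		return "worm";
	case SCSI_TYPE_ROM:
		return "rom";
	default:
		return NULL;
	}
}
```

Passing ND_DEVICE_NAMESPACE_IO to such a lookup collides with SCSI_TYPE_WORM, which is consistent with the lsblk output quoted above.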
> 2. To avoid confusing software trying to detect fast storage vs.
> slow storage devices via sysfs, this value should be 0:
> /sys/block/pmem0/queue/rotational:1
>
> That can be done by adding this shortly after the blk_alloc_queue call:
> queue_flag_set_unlocked(QUEUE_FLAG_NONROT, pmem->pmem_queue);
Yeah, good catch.
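
As a rough sketch of the consumer side, this is how userspace software might read the attribute to tell fast storage from slow. The helper names parse_rotational() and read_rotational() are made up for this example. Only the sysfs path layout comes from the report above.

```c
#include <stdio.h>

/*
 * Interpret the text of a queue/rotational attribute.
 * Returns 1 for rotational, 0 for non-rotational, -1 if unrecognized.
 */
static int parse_rotational(const char *text)
{
	if (text == NULL)
		return -1;
	if (text[0] != '0' && text[0] != '1')
		return -1;
	if (text[1] != '\n' && text[1] != '\0')
		return -1;
	return text[0] - '0';
}

/*
 * Read /sys/block/<disk>/queue/rotational for the named disk.
 * Returns the parsed value, or -1 if the file cannot be read.
 */
static int read_rotational(const char *disk)
{
	char path[256];
	char buf[8];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/block/%s/queue/rotational", disk);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_rotational(buf);
}
```

A caller would use read_rotational("pmem0") and treat a result of 0 as fast storage. With the value reported above, such a caller would misclassify pmem as a spinning disk.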
> 3. Is there any reason to have a 512 KiB limit on the transfer
> length?
> /sys/block/pmem0/queue/max_hw_sectors_kb:512
>
> That is from:
> blk_queue_max_hw_sectors(pmem->pmem_queue, 1024);
I'd only change this from the default if performance testing showed it
made a non-trivial difference.
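
For reference, the 512 KiB figure follows from the 1024 passed to blk_queue_max_hw_sectors(), assuming the block layer's usual 512-byte sector unit. The helper below, sectors_to_kib(), is a hypothetical name for the arithmetic and is not a kernel function.

```c
/* Assumption: block-layer sectors are 512 bytes. */
#define SECTOR_SIZE_BYTES 512ULL

/* Convert a count of 512-byte sectors to KiB (1 KiB = 1024 bytes). */
static unsigned long long sectors_to_kib(unsigned long long sectors)
{
	return (sectors * SECTOR_SIZE_BYTES) / 1024ULL;
}
```

sectors_to_kib(1024) gives 512, matching the max_hw_sectors_kb value quoted above.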
> 4. These are read-writeable, but IOs never reach a queue, so
> the queue size is irrelevant and merging never happens:
> /sys/block/pmem0/queue/nomerges:0
> /sys/block/pmem0/queue/nr_requests:128
>
> Consider making them both read-only with:
> * nomerges set to 2 (no merging happening)
> * nr_requests as small as the block layer allows to avoid
> wasting memory.
>
> 5. No scatter-gather lists are created by the driver, so these
> read-only fields are meaningless:
> /sys/block/pmem0/queue/max_segments:128
> /sys/block/pmem0/queue/max_segment_size:65536
>
> Is there a better way to report them as irrelevant?
Again it comes back to the question of whether these default settings
are actively harmful.
>
> 6. There is no completion processing, so the read-writeable
> cpu affinity is not used:
> /sys/block/pmem0/queue/rq_affinity:0
>
> Consider making it read-only and set to 2, meaning the
> completions always run on the requesting CPU.
There are no completions with pmem; the entire I/O path is
synchronous. Ideally, this attribute would disappear for a pmem
queue, not be set to 2.
> 7. With mmap() allowing less than logical block sized accesses
> to the device, this could be considered misleading:
> /sys/block/pmem0/queue/physical_block_size:512
I don't see how it is misleading. If you access it as a block device,
the block size is 512. If the application is mmap() + DAX aware, it
knows that the physical_block_size is being bypassed.
>
> Perhaps that needs to be 1 byte or a cacheline size (64 bytes
> on x86) to indicate that direct partial logical block accesses
> are possible.
No, because that breaks the definition of a block device. Through the
bdev interface it's always accessed a block at a time.
> The btt driver could report 512 as one indication
> it is different.
>
> I wouldn't be surprised if smaller values than the logical block
> size confused some software, though.
Precisely why we shouldn't go there with pmem.