From: Minchan Kim <minchan@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	Jerome Marchand <jmarchan@redhat.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Dave Chinner <david@fromorbit.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@lst.de>,
	seungho1.park@lge.com, Jan Kara <jack@suse.cz>,
	"karam . lee" <karam.lee@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nitin Gupta <ngupta@vflare.org>
Subject: Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt
Date: Mon, 7 Aug 2017 17:23:47 +0900
Message-ID: <20170807082347.GA24466@bbox>
In-Reply-To: <CAPcyv4hsicQybj1091n1n9aKDtQ1JB2fEhjK+_21mi4ta5S46Q@mail.gmail.com>

On Fri, Aug 04, 2017 at 11:24:49AM -0700, Dan Williams wrote:
> On Fri, Aug 4, 2017 at 11:21 AM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > On Fri, Aug 04, 2017 at 11:01:08AM -0700, Dan Williams wrote:
> >> [ adding Dave who is working on a blk-mq + dma offload version of the
> >> pmem driver ]
> >>
> >> On Fri, Aug 4, 2017 at 1:17 AM, Minchan Kim <minchan@kernel.org> wrote:
> >> > On Fri, Aug 04, 2017 at 12:54:41PM +0900, Minchan Kim wrote:
> >> [..]
> >> >> Thanks for the testing. Are your test numbers within the noise level?
> >> >>
> >> >> I cannot understand why PMEM doesn't show much gain while BTT is a significant
> >> >> win (8%). I guess the no-rw_page BTT test had more chances to wait on dynamic bio
> >> >> allocation, and both my patch and rw_page reduced that significantly. However,
> >> >> with no rw_page on pmem, there weren't many cases of waiting on bio allocation
> >> >> because the device is so fast, so the number comes purely from the number of
> >> >> instructions executed. At a quick glance, bio init/submit is not trivial,
> >> >> so indeed, I understand where the 12% enhancement comes from, but I'm not sure
> >> >> it's really a big difference in practice at the cost of the maintenance burden.
> >> >
> >> > I ran pmbench 10 times on my local machine (4 cores) with zram-swap.
> >> > On my machine, on-stack bio is even faster than rw_page. Unbelievable.
> >> >
> >> > I guess it's really hard to get stable results under severe memory pressure.
> >> > The result is probably within the noise level (see the stddev below).
> >> > So, I think it's hard to conclude rw_page is far faster than on-stack bio.
> >> >
> >> > rw_page
> >> > avg     5.54us
> >> > stddev  8.89%
> >> > max     6.02us
> >> > min     4.20us
> >> >
> >> > onstack bio
> >> > avg     5.27us
> >> > stddev  13.03%
> >> > max     5.96us
> >> > min     3.55us
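
The noise-level argument above can be checked with quick arithmetic. The
sketch below only reuses the avg and stddev figures quoted above. The
"gap smaller than the smaller stddev" rule is an assumed rough heuristic,
not a proper significance test, and the function names are made up for
illustration.

```python
# Figures copied from the pmbench results quoted above: average latency in
# microseconds, standard deviation as a percentage of the average.
rw_page = {"avg_us": 5.54, "stddev_pct": 8.89}
onstack_bio = {"avg_us": 5.27, "stddev_pct": 13.03}


def relative_gap_pct(a, b):
    """Gap between two averages, as a percentage of the first average."""
    return abs(a["avg_us"] - b["avg_us"]) / a["avg_us"] * 100.0


def within_noise(a, b):
    """Assumed heuristic: treat the gap as noise when it is smaller than
    the smaller of the two reported standard deviations."""
    return relative_gap_pct(a, b) < min(a["stddev_pct"], b["stddev_pct"])


gap = relative_gap_pct(rw_page, onstack_bio)
print(f"gap between averages: {gap:.2f}%")
print("within noise:", within_noise(rw_page, onstack_bio))
```

With these inputs the gap between the two averages is a bit under 5%,
which is below both reported standard deviations.
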
> >>
> >> The maintenance burden of having alternative submission paths is
> >> significant especially as we consider the pmem driver using more
> >> services of the core block layer. Ideally, I'd want to complete the
> >> rw_page removal work before we look at the blk-mq + dma offload
> >> reworks.
> >>
> >> The change to introduce BDI_CAP_SYNC is interesting because we might
> >> have use for switching between dma offload and cpu copy based on
> >> whether the I/O is synchronous or otherwise hinted to be a low latency
> >> request. Right now the dma offload patches are using "bio_segments() >
> >> 1" as the gate for selecting offload vs cpu copy which seems
> >> inadequate.
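
To make the difference between the two gates concrete, here is a small
model of the decision only. It is not kernel code. The Request class, its
fields and both function names are hypothetical stand-ins. The first
function mirrors the "bio_segments() > 1" gate mentioned above. The second
is one assumed reading of "switch on whether the I/O is synchronous".

```python
from dataclasses import dataclass


@dataclass
class Request:
    segments: int      # stand-in for the value bio_segments() would return
    synchronous: bool  # stand-in for a sync / low-latency hint


def offload_by_segments(req):
    """Gate described in the thread: offload any multi-segment request."""
    return req.segments > 1


def offload_by_sync_hint(req):
    """Hypothetical alternative: keep synchronous (latency-sensitive)
    requests on the CPU copy path and offload everything else."""
    return not req.synchronous


# A synchronous multi-segment request is where the two gates disagree.
sync_multi = Request(segments=4, synchronous=True)
print("segments gate offloads:", offload_by_segments(sync_multi))
print("sync-hint gate offloads:", offload_by_sync_hint(sync_multi))
```

The segment count says nothing about whether the submitter is waiting on
the result, which is one way to read why that gate looks inadequate.
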
> >
> > Okay, so based on the feedback above and from Jens[1], it sounds like we want
> > to go forward with removing the rw_page() interface, and instead optimize the
> > regular I/O path via on-stack bios and dma offload, correct?
> >
> > If so, I'll prepare patches that fully remove the rw_page() code, and let
> > Minchan and Dave work on their optimizations.
> 
> I think the conversion to on-stack-bio should be done in the same
> patchset that removes rw_page. We don't want to leave a known
> performance regression while the on-stack-bio work is in-flight.

Okay. It seems everyone agrees on on-stack-bio.
I will send my formal patchset, including Ross's patches which
remove rw_page.

Thanks.
