From: indou.takao@fujitsu.com (Takao Indoh)
Subject: [PATCH] nvme: Enable acceleration feature of A64FX processor
Date: Wed, 20 Feb 2019 18:46:11 +0900
Message-ID: <20190220094610.GB3559@esprimo>
In-Reply-To: <AT5PR8401MB1169E9738C0A4754D0190F2FAB670@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM>

On Thu, Feb 14, 2019 at 08:44:48PM +0000, Elliott, Robert (Persistent Memory) wrote:
> 
> 
> > -----Original Message-----
> > From: Linux-nvme [mailto:linux-nvme-bounces@lists.infradead.org] On Behalf Of Keith Busch
> > Sent: Tuesday, February 5, 2019 8:39 AM
> > To: Takao Indoh <indou.takao@fujitsu.com>
> > Cc: Takao Indoh <indou.takao@jp.fujitsu.com>; sagi@grimberg.me; linux-kernel@vger.kernel.org; linux-
> > nvme@lists.infradead.org; axboe@fb.com; hch@lst.de
> > Subject: Re: [PATCH] nvme: Enable acceleration feature of A64FX processor
> > 
> > On Tue, Feb 05, 2019 at 09:56:05PM +0900, Takao Indoh wrote:
> > > On Fri, Feb 01, 2019 at 07:54:14AM -0700, Keith Busch wrote:
> > > > On Fri, Feb 01, 2019 at 09:46:15PM +0900, Takao Indoh wrote:
> > > > > From: Takao Indoh <indou.takao@fujitsu.com>
> > > > >
> > > > > Fujitsu A64FX processor has a feature to accelerate data transfer of
> > > > > internal bus by relaxed ordering. It is enabled when the bit 56 of dma
> > > > > address is set to 1.
> > > >
> > > > Wait, what? RO is a standard PCIe TLP attribute. Why would we need this?
> > >
> > > I should have explained this patch more carefully.
> > >
> > > Standard PCIe devices can use Relaxed Ordering (RO) by setting Attr
> > > field in the TLP header, however, this mechanism cannot be utilized if
> > > the device does not support RO feature. Fujitsu A64FX processor has an
> > > alternate feature to enable RO in its Root Port by setting the bit 56 of
> > > DMA address. This mechanism enables to utilize RO feature even if the
> > > device does not support standard PCIe RO.
> > 
> > I think you're better off just purchasing devices that support the
> > capability per spec rather than with a non-standard work around.
> > 
> 
> The PCIe and NVMe specifications don't standardize a way to tell the device
> when to use RO, which leads to system workarounds like this.
> 
> The Enable Relaxed Ordering bit defined by PCIe tells the device when it
> cannot use RO, but doesn't advise when it should or shall use RO.
> 
> For SCSI Express (SOP+PQI), we were going to allow specifying these
> on a per-command basis:
> * TLP attributes (No Snoop, Relaxed Ordering, ID-based Ordering)
> * TLP processing hints (Processing Hints and Steering Tags)
> 
> to be used by the data transfers for the command. In some systems, one
> setting per queue or per device might suffice. Transactions to the
> queues and doorbells require stronger ordering.
> 
> For this workaround:
> * making an extra pass through the SGL to set the address bit is 
> inefficient; it should be done as the SGL is created.

Thanks for your comment. Do you mean this should be done in
nvme_pci_setup_sgls()/nvme_pci_setup_prps()?
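
For illustration only, here is a minimal sketch of tagging each address
in the same loop that builds the descriptors, so no second pass over the
SGL is needed. The types `sgl_desc` and `dma_seg`, the function
`build_sgl()`, and the macro `A64FX_RO_BIT` are hypothetical stand-ins,
not the structures or names used in the driver or in the patch, and
endianness conversion is left out:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 56 of the DMA address requests relaxed ordering. */
#define A64FX_RO_BIT (UINT64_C(1) << 56)

/* Hypothetical, simplified stand-in for an SGL data block descriptor. */
struct sgl_desc {
	uint64_t addr;
	uint32_t length;
};

/* Hypothetical, simplified stand-in for one DMA-mapped segment. */
struct dma_seg {
	uint64_t dma_addr;
	uint32_t len;
};

/*
 * Fill one descriptor per segment and set the relaxed-ordering bit
 * while each descriptor is written, instead of in a later pass.
 * Returns the number of descriptors written.
 */
static size_t build_sgl(struct sgl_desc *out, const struct dma_seg *segs,
			size_t nseg, int use_ro)
{
	uint64_t tag = use_ro ? A64FX_RO_BIT : 0;
	size_t i;

	for (i = 0; i < nseg; i++) {
		out[i].addr = segs[i].dma_addr | tag;
		out[i].length = segs[i].len;
	}
	return nseg;
}
```

With `use_ro` set to 0 the addresses are written unchanged, so the same
loop would serve systems without this feature.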

> * why doesn't it support PRP Lists?

This patch does not support PRP because PRP is used for small transfers,
where this feature does not give enough of a performance improvement. But
I can add PRP support to improve performance for devices that are
compliant with NVMe Spec 1.0 or that do not support SGL.

> * how does this interact with an iommu, if there is one? Must the 
> address with bit 56 also be granted permission, or is that
> stripped off before any iommu comparisons?

The latter. Bit 56 is cleared in the Root Port before the address is
passed to the IOMMU.
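
As a rough software model of that behavior (the real logic is in
hardware; `rp_request`, `root_port_decode()` and `A64FX_RO_BIT` are
hypothetical names used only for this sketch), the Root Port can be
thought of as splitting the incoming address into an ordering hint and
the address that the IOMMU actually checks:

```c
#include <stdint.h>

/* Assumption: bit 56 of the DMA address requests relaxed ordering. */
#define A64FX_RO_BIT (UINT64_C(1) << 56)

/* Hypothetical model of what the Root Port forwards. */
struct rp_request {
	uint64_t iommu_addr;	/* address seen by the IOMMU, bit 56 cleared */
	int relaxed_ordering;	/* nonzero if bit 56 was set by the driver */
};

/* Record the hint, then strip the bit before any IOMMU lookup. */
static struct rp_request root_port_decode(uint64_t bus_addr)
{
	struct rp_request req;

	req.relaxed_ordering = (bus_addr & A64FX_RO_BIT) != 0;
	req.iommu_addr = bus_addr & ~A64FX_RO_BIT;
	return req;
}
```

In this model the IOMMU only ever sees the untagged address, so no
separate mapping or permission for the bit-56 alias would be needed.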

Thanks,
Takao Indoh
