From: Krishnamraju Eraparaju <krishna2@chelsio.com>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-rdma@vger.kernel.org,
	Potnuri Bharat Teja <bharat@chelsio.com>,
	Nirranjan Kirubaharan <nirranjan@chelsio.com>,
	linux-nvme@lists.infradead.org,
	Bernard Metzler <BMT@zurich.ibm.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: broken CRCs at NVMeF target with SIW & NVMe/TCP transports
Date: Wed, 18 Mar 2020 00:47:44 +0530
Message-ID: <20200317191743.GA22065@chelsio.com>
In-Reply-To: <70b13212-faa6-d634-8beb-55ba39891d7f@grimberg.me>

On Tuesday, March 03/17/20, 2020 at 09:39:39 -0700, Sagi Grimberg wrote:
> 
> >>>For TCP we can set BDI_CAP_STABLE_WRITES.  For RDMA I don't think
> >>that
> >>>is a good idea as pretty much all RDMA block drivers rely on the
> >>>DMA behavior above.  The answer is to bounce buffer the data in
> >>>SoftiWARP / SoftRoCE.
> >>
> >>We already do, see nvme_alloc_ns.
> >>
> >>
> >
> >Krishna was getting the issue when testing TCP/NVMeF with -G
> >during connect. That enables data digest and STABLE_WRITES
> >I think. So to me it seems we don't get stable pages, but
> >pages which are touched after handover to the provider.
> 
> None of the transports modifies the data at any point; both will
> scan it to compute the CRC. So surely this is coming from the fs.
> Krishna, does this happen with xfs as well?
Yes, but rarely (it took ~15 min to reproduce), whereas with ext3/4
it's almost immediate. Here is the error log for NVMe/TCP with xfs.

dmesg at Host:
[  +0.000323] nvme nvme2: creating 12 I/O queues.
[  +0.008991] nvme nvme2: Successfully reconnected (1 attempt)
[ +25.277733] blk_update_request: I/O error, dev nvme2n1, sector 0 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[  +6.043879] XFS (nvme2n1): Mounting V5 Filesystem
[  +0.017745] XFS (nvme2n1): Ending clean mount
[  +0.000174] xfs filesystem being mounted at /mnt supports timestamps until 2038 (0x7fffffff)
[Mar18 00:14] nvme nvme2: Reconnecting in 10 seconds...
[  +0.000453] nvme nvme2: creating 12 I/O queues.
[  +0.009216] nvme nvme2: Successfully reconnected (1 attempt)
[Mar18 00:43] nvme nvme2: Reconnecting in 10 seconds...
[  +0.000383] nvme nvme2: creating 12 I/O queues.
[  +0.009239] nvme nvme2: Successfully reconnected (1 attempt)


dmesg at Target:
[Mar18 00:14] nvmet_tcp: queue 9: cmd 17 pdu (4) data digest error: recv 0x8e85d882 expected 0x9a46fac3
[  +0.000011] nvmet: ctrl 1 fatal error occurred!
[ +10.240266] nvmet: creating controller 1 for subsystem nvme-ram0 for NQN nqn.2014-08.org.nvmexpress.chelsio.
[Mar18 00:42] nvmet_tcp: queue 7: cmd 89 pdu (4) data digest error: recv 0xc0ce3dfd expected 0x7ee136b5
[  +0.000012] nvmet: ctrl 1 fatal error occurred!
[Mar18 00:43] nvmet: creating controller 1 for subsystem nvme-ram0 for NQN nqn.2014-08.org.nvmexpress.chelsio.
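
For readers following the "data digest error" lines above: the NVMe/TCP
data digest is a CRC32C over the PDU data, so the target reports a mismatch
whenever the bytes it receives differ from the bytes the digest was computed
over. The following is only a toy user-space sketch of that failure mode,
not the kernel code path. The names send_with_digest, target_check and the
modify_in_flight flag are hypothetical, and the ordering "digest first, then
the page is re-dirtied, then the data goes on the wire" is an assumption
made for illustration.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; XOR in the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def send_with_digest(page: bytearray, modify_in_flight: bool):
    """Hypothetical sender: compute the digest, then put the page on the wire.

    If modify_in_flight is set, the page is changed between the two steps,
    modelling an upper layer that re-dirties a page that is not stable.
    """
    digest = crc32c(bytes(page))      # digest over the data as first seen
    if modify_in_flight:
        page[0] ^= 0xFF               # assumed: page touched after handover
    payload = bytes(page)             # bytes that actually get transmitted
    return payload, digest


def target_check(payload: bytes, digest: int) -> bool:
    """Hypothetical receiver: recompute CRC32C and compare with the digest."""
    return crc32c(payload) == digest


if __name__ == "__main__":
    stable = send_with_digest(bytearray(4096), modify_in_flight=False)
    unstable = send_with_digest(bytearray(4096), modify_in_flight=True)
    print("stable page digest ok:  ", target_check(*stable))
    print("unstable page digest ok:", target_check(*unstable))
```

In this model neither side corrupts anything in transit; the mismatch comes
solely from the page changing after the digest was taken, which is the
scenario the stable-pages discussion in this thread is about.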

