From: jthumshirn@suse.de (Johannes Thumshirn)
Subject: I/O Errors due to keepalive timeouts with NVMf RDMA
Date: Fri, 14 Jul 2017 13:25:54 +0200
Message-ID: <20170714112554.GF8497@linux-x5ow.site>
In-Reply-To: <a0656ff3-f2b9-4580-bf6f-8206cd73b680@grimberg.me>

On Tue, Jul 11, 2017 at 12:19:12PM +0300, Sagi Grimberg wrote:
> I didn't mean that the fabric is broken for sure, I was simply saying
> that having a 64 byte send not making it through a switch port sounds
> like a problem to me.

So, just for the record: I now have a third setup, RoCE over mlx5 with a
Mellanox switch, and I can reproduce the problem on this setup as well.

host# ibstat
CA 'mlx5_0'
	CA type: MT4115
	Number of ports: 1
	Firmware version: 12.20.1010
	Hardware version: 0
	Node GUID: 0x248a070300554504
	System image GUID: 0x248a070300554504
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 56
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x04010000
		Port GUID: 0x268a07fffe554504
		Link layer: Ethernet

target# ibstat
CA 'mlx5_0'
	CA type: MT4117
	Number of ports: 1
	Firmware version: 14.20.1010
	Hardware version: 0
	Node GUID: 0x248a070300937248
	System image GUID: 0x248a070300937248
	Port 1:
		State: Down
		Physical state: Disabled
		Rate: 25
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x04010000
		Port GUID: 0x268a07fffe937248
		Link layer: Ethernet


host# dmesg
nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 9.9.9.6:4420
nvme nvme0: creating 24 I/O queues.
nvme nvme0: new ctrl: NQN "nvmf-test", addr 9.9.9.6:4420
test start
nvme nvme0: failed nvme_keep_alive_end_io error=-5
nvme nvme0: Reconnecting in 10 seconds...
blk_update_request: I/O error, dev nvme0n1, sector 23000728
blk_update_request: I/O error, dev nvme0n1, sector 32385208
blk_update_request: I/O error, dev nvme0n1, sector 13965416
blk_update_request: I/O error, dev nvme0n1, sector 32825384
blk_update_request: I/O error, dev nvme0n1, sector 47701688
blk_update_request: I/O error, dev nvme0n1, sector 994584
blk_update_request: I/O error, dev nvme0n1, sector 26306816
blk_update_request: I/O error, dev nvme0n1, sector 27715008
blk_update_request: I/O error, dev nvme0n1, sector 32470064
blk_update_request: I/O error, dev nvme0n1, sector 29905512
nvme0n1: detected capacity change from 68719476736 to -67550056326088704
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
ldm_validate_partition_table(): Disk read failed.
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
Buffer I/O error on dev nvme0n1, logical block 3, async page read
Buffer I/O error on dev nvme0n1, logical block 0, async page read
nvme0n1: unable to read partition table
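
To compare the failing sectors across setups, the blk_update_request lines
above can be pulled out of the log with a few lines of Python. This is only a
minimal sketch, not part of the original report: the function name
failed_sectors and the script name in the usage note are made up for
illustration, and it assumes the sector numbers are in the block layer's usual
512-byte units.

```python
import re
import sys

# Matches lines such as:
#   blk_update_request: I/O error, dev nvme0n1, sector 23000728
IO_ERROR_RE = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>[^,\s]+), sector (?P<sector>\d+)"
)

# Assumption: sectors are reported in 512-byte units.
SECTOR_SIZE = 512


def failed_sectors(log_text):
    """Return (device, sector, byte_offset) for every I/O error line found."""
    results = []
    for line in log_text.splitlines():
        match = IO_ERROR_RE.search(line)
        if match:
            sector = int(match.group("sector"))
            results.append((match.group("dev"), sector, sector * SECTOR_SIZE))
    return results


if __name__ == "__main__":
    # Reads a kernel log from stdin and prints one line per failed request.
    for dev, sector, offset in failed_sectors(sys.stdin.read()):
        print(f"{dev}: sector {sector} (byte offset {offset})")
```

Saved under a hypothetical name such as failed_sectors.py, it could be run as
"dmesg | python3 failed_sectors.py".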

The fio command used was:
fio --name=test --iodepth=128 --numjobs=$(nproc) --size=23g --time_based \
    --runtime=15m --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
    --rw=randrw


-- 
Johannes Thumshirn                                          Storage
jthumshirn at suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
