public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mingbao Sun <sunmingbao@tom.com>
To: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	Chaitanya Kulkarni <kch@nvidia.com>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: tyler.sun@dell.com, ping.gan@dell.com, yanxiu.cai@dell.com,
	libin.zhang@dell.com, ao.sun@dell.com
Subject: Re: [PATCH 0/2] NVMe_over_TCP: support specifying the congestion-control
Date: Tue, 8 Mar 2022 21:03:20 +0800	[thread overview]
Message-ID: <20220308210320.000013ca@tom.com> (raw)
In-Reply-To: <20220304092754.2721-1-sunmingbao@tom.com>

I feel that I'd better address this a little bit more to express the
meaning behind this feature.

You know, InfiniBand/RoCE provides NVMe-oF a lossless network
environment (that is zero packet loss), which is a great advantage
to performance.

In contrast, 'TCP/IP + ethernet' is often used as a lossy network
environment (packet dropping often occurs). 
And once packet dropping occurs, timeout-retransmission would be
triggered. But once timeout-retransmission was triggered, it’s a great
damage to the performance.

So although NVMe/TCP may have a bandwidth competitive to that of
NVMe/RDMA, but the packet dropping of the former is a flaw to
its performance.

However, with the combination of the following conditions, NVMe/TCP
can almost be as competitive as NVMe/RDMA in the data center.

  - Ethernet NICs supporting QoS configuration (support mapping TOS/DSCP
    in IP header into priority, support PFC)

  - Ethernet Switches supporting ECN marking, supporting adjusting
    buffer size of each priority.

  - NVMe/TCP supports specifying the tos for its TCP traffic
    (already implemented)

  - NVMe/TCP supports specifying dctcp as the congestion-control of its
    TCP sockets (the work of this feature)

So this feature is the last item from the software aspect to form up the
above combination.


      parent reply	other threads:[~2022-03-08 13:03 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-04  9:27 [PATCH 0/2] NVMe_over_TCP: support specifying the congestion-control Mingbao Sun
2022-03-04  9:27 ` [PATCH 1/2] nvmet-tcp: " Mingbao Sun
2022-03-04  9:27 ` [PATCH 2/2] nvme-tcp: " Mingbao Sun
2022-03-04 16:20   ` Christoph Hellwig
2022-03-05  7:09     ` Mingbao Sun
2022-03-08  7:12       ` Christoph Hellwig
2022-03-08  7:57         ` Mingbao Sun
2022-03-08 13:03 ` Mingbao Sun [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220308210320.000013ca@tom.com \
    --to=sunmingbao@tom.com \
    --cc=ao.sun@dell.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=kch@nvidia.com \
    --cc=libin.zhang@dell.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=ping.gan@dell.com \
    --cc=sagi@grimberg.me \
    --cc=tyler.sun@dell.com \
    --cc=yanxiu.cai@dell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox