Netdev List
 help / color / mirror / Atom feed
* [PATCH 1/2] bpf: sockmap, double free in __sock_map_ctx_update_elem()
From: Dan Carpenter @ 2018-05-18  7:58 UTC (permalink / raw)
  To: Alexei Starovoitov, John Fastabend
  Cc: Daniel Borkmann, netdev, kernel-janitors

We accidentally free "e" twice.

Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index c6de1393df63..216d5c9b0eb3 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -1833,7 +1833,6 @@ static int __sock_map_ctx_update_elem(struct bpf_map *map,
 	if (tx_msg)
 		bpf_prog_put(tx_msg);
 	write_unlock_bh(&sock->sk_callback_lock);
-	kfree(e);
 	return err;
 }
 

^ permalink raw reply related

* Re: [PATCH net-next ] net: mscc: Add SPDX identifier
From: Alexandre Belloni @ 2018-05-18  7:52 UTC (permalink / raw)
  To: Joe Perches
  Cc: David S . Miller, Allan Nielsen, razvan.stefanescu, po.liu,
	Thomas Petazzoni, Andrew Lunn, Florian Fainelli, netdev,
	linux-kernel
In-Reply-To: <ba4f4d2279b5f71e698136272637b42eeca05c54.camel@perches.com>

On 17/05/2018 18:13:25-0700, Joe Perches wrote:
> On Thu, 2018-05-17 at 21:39 +0200, Alexandre Belloni wrote:
> > On 17/05/2018 12:28:59-0700, Joe Perches wrote:
> > > On Thu, 2018-05-17 at 21:23 +0200, Alexandre Belloni wrote:
> > > > ocelot_qsys.h is missing the SPDX identfier, fix that.
> > > > 
> > > > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > > 
> > > Only the copyright holders should ideally be modifying
> > > these and also removing other license content.
> > > 
> > > For instance, what's the real intent here?
> > > 
> > 
> > Well, if you have a look, I submitted that file this cycle and it is the
> > only one that doesn't have the proper SPDX identifier. This is a mistake
> > I'm fixing.
> 
> Just because you submitted it does not mean you
> are the copyright holder.
> 
> > > > diff --git a/drivers/net/ethernet/mscc/ocelot_qsys.h b/drivers/net/ethernet/mscc/ocelot_qsys.h
> > > 
> > > []
> > > > @@ -1,7 +1,7 @@
> > > > +/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
> > > 
> > > GPL 2.0+ or 2.0?
> > > 
> > 
> > 2.0
> 
> How do you know that?
> 

The fact is that it should have been submitted with the SPDX identifier
to be consistent with the rest of the driver. If you can't trust me on
that, then, you can't probably trust anyone submitting anything new
driver to the kernel.

I still don't understand why you are making an issue out of this.

Because my email address doesn't match the copyright holder's name
doesn't mean I'm not allowed to fix my own mistake.

-- 
Alexandre Belloni, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH] netfilter: ebtables: handle string from userspace with care
From: Pablo Neira Ayuso @ 2018-05-18  7:51 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netfilter-devel, syzbot, fw, coreteam, syzkaller-bugs, netdev
In-Reply-To: <8710122d42aa1f3e081812f2abf406973f834982.1524818458.git.pabeni@redhat.com>

On Fri, Apr 27, 2018 at 10:45:31AM +0200, Paolo Abeni wrote:
> strlcpy() can't be safely used on a user-space provided string,
> as it can try to read beyond the buffer's end, if the latter is
> not NULL terminated.

Applied, thanks!

^ permalink raw reply

* Re: [Cake] [PATCH net-next v12 3/7] sch_cake: Add optional ACK filter
From: Ryan Mounce @ 2018-05-18  7:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Toke Høiland-Jørgensen, netdev, Cake List
In-Reply-To: <4cc36c02-d656-5d0c-7313-3a1128213f12@gmail.com>

On 18 May 2018 at 13:38, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
> On 05/17/2018 07:36 PM, Ryan Mounce wrote:
>> On 17 May 2018 at 22:41, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>> Eric Dumazet <eric.dumazet@gmail.com> writes:
>>>
>>>> On 05/17/2018 04:23 AM, Toke Høiland-Jørgensen wrote:
>>>>
>>>>>
>>>>> We don't do full parsing of SACKs, no; we were trying to keep things
>>>>> simple... We do detect the presence of SACK options, though, and the
>>>>> presence of SACK options on an ACK will make previous ACKs be considered
>>>>> redundant.
>>>>>
>>>>
>>>> But they are not redundant in some cases, particularly when reorders
>>>> happen in the network.
>>>
>>> Huh. I was under the impression that SACKs were basically cumulative
>>> until cleared.
>>>
>>> I.e., in packet sequence ABCDE where B and D are lost, C would have
>>> SACK(B) and E would have SACK(B,D). Are you saying that E would only
>>> have SACK(D)?
>>
>> SACK works by acknowledging additional ranges above those that have
>> been ACKed, rather than ACKing up to the largest seen sequence number
>> and reporting missing ranges before that.
>>
>> A - ACK(A)
>> B - lost
>> C - ACK(A) + SACK(C)
>> D - lost
>> E - ACK(A) + SACK(C, E)
>>
>> Cake does check that the ACK sequence number is greater, or if it is
>> equal and the 'newer' ACK has the SACK option present. It doesn't
>> compare the sequence numbers inside two SACKs. If the two SACKs in the
>> above example had been reordered before reaching cake's ACK filter in
>> aggressive mode, the wrong one will be filtered.
>>
>> This is a limitation of my naive SACK handling in cake. The default
>> 'conservative' mode happens to mitigate the problem in the above
>> scenario, but the issue could still present itself in more
>> pathological cases. It's fixable, however I'm not sure this corner
>> case is sufficiently common or severe to warrant the extra complexity.
>
> The extra complexity is absolutely requested for inclusion in upstream linux.
>
> I recommend reading rfc 2018, whole section 4 (Generating Sack Options: Data Receiver Behavior
> )
>
> Proposed ACK filter in Cake is messing the protocol, since the first rule is not respected
>
> * The first SACK block (i.e., the one immediately following the
>       kind and length fields in the option) MUST specify the contiguous
>       block of data containing the segment which triggered this ACK,
>       unless that segment advanced the Acknowledgment Number field in
>       the header.  This assures that the ACK with the SACK option
>       reflects the most recent change in the data receiver's buffer
>       queue.
>
>
> An ACK filter must either :
>
> Not merge ACK if they contain different SACK blocks.
>
> Or make a precise analysis of the SACK blocks to determine if the merge is allowed,
> ie no useful information is lost.
>
> The sender should get all the information as which segments were received correctly,
> assuming no ACK are dropped because of congestion on return path.

I have submitted a patch to cake's ACK filter, now it will only filter
an old SACK if its highest right block edge value is smaller than that
of the most recently received SACK. If there is reordering, neither
will be filtered.

^ permalink raw reply

* [PATCH net-next 9/9] net/smc: restructure client and server code in af_smc
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

This patch splits up the functions smc_connect_rdma and smc_listen_work
into smaller functions.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c | 559 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 318 insertions(+), 241 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 9871604ebb53..48530dab5c94 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -396,165 +396,186 @@ static void smc_link_save_peer_info(struct smc_link *link,
 	link->peer_mtu = clc->qp_mtu;
 }
 
-/* setup for RDMA connection of client */
-static int smc_connect_rdma(struct smc_sock *smc)
+/* fall back during connect */
+static int smc_connect_fallback(struct smc_sock *smc)
 {
-	struct smc_clc_msg_accept_confirm aclc;
-	int local_contact = SMC_FIRST_CONTACT;
-	struct smc_ib_device *smcibdev;
-	struct smc_link *link;
-	u8 srv_first_contact;
-	int reason_code = 0;
-	int rc = 0;
-	u8 ibport;
-
-	sock_hold(&smc->sk); /* sock put in passive closing */
+	smc->use_fallback = true;
+	smc_copy_sock_settings_to_clc(smc);
+	if (smc->sk.sk_state == SMC_INIT)
+		smc->sk.sk_state = SMC_ACTIVE;
+	return 0;
+}
 
-	if (smc->use_fallback)
-		goto out_connected;
+/* decline and fall back during connect */
+static int smc_connect_decline_fallback(struct smc_sock *smc, int reason_code)
+{
+	int rc;
 
-	if (!tcp_sk(smc->clcsock->sk)->syn_smc) {
-		/* peer has not signalled SMC-capability */
-		smc->use_fallback = true;
-		goto out_connected;
+	if (reason_code < 0) /* error, fallback is not possible */
+		return reason_code;
+	if (reason_code != SMC_CLC_DECL_REPLY) {
+		rc = smc_clc_send_decline(smc, reason_code);
+		if (rc < 0)
+			return rc;
 	}
+	return smc_connect_fallback(smc);
+}
 
-	/* IPSec connections opt out of SMC-R optimizations */
-	if (using_ipsec(smc)) {
-		reason_code = SMC_CLC_DECL_IPSEC;
-		goto decline_rdma;
-	}
+/* abort connecting */
+static int smc_connect_abort(struct smc_sock *smc, int reason_code,
+			     int local_contact)
+{
+	if (local_contact == SMC_FIRST_CONTACT)
+		smc_lgr_forget(smc->conn.lgr);
+	mutex_unlock(&smc_create_lgr_pending);
+	smc_conn_free(&smc->conn);
+	if (reason_code < 0 && smc->sk.sk_state == SMC_INIT)
+		sock_put(&smc->sk); /* passive closing */
+	return reason_code;
+}
+
+/* check if there is a rdma device available for this connection. */
+/* called for connect and listen */
+static int smc_check_rdma(struct smc_sock *smc, struct smc_ib_device **ibdev,
+			  u8 *ibport)
+{
+	int reason_code = 0;
 
 	/* PNET table look up: search active ib_device and port
 	 * within same PNETID that also contains the ethernet device
 	 * used for the internal TCP socket
 	 */
-	smc_pnet_find_roce_resource(smc->clcsock->sk, &smcibdev, &ibport);
-	if (!smcibdev) {
+	smc_pnet_find_roce_resource(smc->clcsock->sk, ibdev, ibport);
+	if (!(*ibdev))
 		reason_code = SMC_CLC_DECL_CNFERR; /* configuration error */
-		goto decline_rdma;
-	}
+
+	return reason_code;
+}
+
+/* CLC handshake during connect */
+static int smc_connect_clc(struct smc_sock *smc,
+			   struct smc_clc_msg_accept_confirm *aclc,
+			   struct smc_ib_device *ibdev, u8 ibport)
+{
+	int rc = 0;
 
 	/* do inband token exchange */
-	reason_code = smc_clc_send_proposal(smc, smcibdev, ibport);
-	if (reason_code < 0) {
-		rc = reason_code;
-		goto out_err;
-	}
-	if (reason_code > 0) /* configuration error */
-		goto decline_rdma;
+	rc = smc_clc_send_proposal(smc, ibdev, ibport);
+	if (rc)
+		return rc;
 	/* receive SMC Accept CLC message */
-	reason_code = smc_clc_wait_msg(smc, &aclc, sizeof(aclc),
-				       SMC_CLC_ACCEPT);
-	if (reason_code < 0) {
-		rc = reason_code;
-		goto out_err;
-	}
-	if (reason_code > 0)
-		goto decline_rdma;
+	return smc_clc_wait_msg(smc, aclc, sizeof(*aclc), SMC_CLC_ACCEPT);
+}
+
+/* setup for RDMA connection of client */
+static int smc_connect_rdma(struct smc_sock *smc,
+			    struct smc_clc_msg_accept_confirm *aclc,
+			    struct smc_ib_device *ibdev, u8 ibport)
+{
+	int local_contact = SMC_FIRST_CONTACT;
+	struct smc_link *link;
+	int reason_code = 0;
 
-	srv_first_contact = aclc.hdr.flag;
 	mutex_lock(&smc_create_lgr_pending);
-	local_contact = smc_conn_create(smc, smcibdev, ibport, &aclc.lcl,
-					srv_first_contact);
+	local_contact = smc_conn_create(smc, ibdev, ibport, &aclc->lcl,
+					aclc->hdr.flag);
 	if (local_contact < 0) {
-		rc = local_contact;
-		if (rc == -ENOMEM)
+		if (local_contact == -ENOMEM)
 			reason_code = SMC_CLC_DECL_MEM;/* insufficient memory*/
-		else if (rc == -ENOLINK)
+		else if (local_contact == -ENOLINK)
 			reason_code = SMC_CLC_DECL_SYNCERR; /* synchr. error */
 		else
 			reason_code = SMC_CLC_DECL_INTERR; /* other error */
-		goto decline_rdma_unlock;
+		return smc_connect_abort(smc, reason_code, 0);
 	}
 	link = &smc->conn.lgr->lnk[SMC_SINGLE_LINK];
 
-	smc_conn_save_peer_info(smc, &aclc);
+	smc_conn_save_peer_info(smc, aclc);
 
 	/* create send buffer and rmb */
-	rc = smc_buf_create(smc);
-	if (rc) {
-		reason_code = SMC_CLC_DECL_MEM;
-		goto decline_rdma_unlock;
-	}
+	if (smc_buf_create(smc))
+		return smc_connect_abort(smc, SMC_CLC_DECL_MEM, local_contact);
 
 	if (local_contact == SMC_FIRST_CONTACT)
-		smc_link_save_peer_info(link, &aclc);
+		smc_link_save_peer_info(link, aclc);
 
-	rc = smc_rmb_rtoken_handling(&smc->conn, &aclc);
-	if (rc) {
-		reason_code = SMC_CLC_DECL_INTERR;
-		goto decline_rdma_unlock;
-	}
+	if (smc_rmb_rtoken_handling(&smc->conn, aclc))
+		return smc_connect_abort(smc, SMC_CLC_DECL_INTERR,
+					 local_contact);
 
 	smc_close_init(smc);
 	smc_rx_init(smc);
 
 	if (local_contact == SMC_FIRST_CONTACT) {
-		rc = smc_ib_ready_link(link);
-		if (rc) {
-			reason_code = SMC_CLC_DECL_INTERR;
-			goto decline_rdma_unlock;
-		}
+		if (smc_ib_ready_link(link))
+			return smc_connect_abort(smc, SMC_CLC_DECL_INTERR,
+						 local_contact);
 	} else {
-		if (!smc->conn.rmb_desc->reused) {
-			if (smc_reg_rmb(link, smc->conn.rmb_desc, true)) {
-				reason_code = SMC_CLC_DECL_INTERR;
-				goto decline_rdma_unlock;
-			}
-		}
+		if (!smc->conn.rmb_desc->reused &&
+		    smc_reg_rmb(link, smc->conn.rmb_desc, true))
+			return smc_connect_abort(smc, SMC_CLC_DECL_INTERR,
+						 local_contact);
 	}
 	smc_rmb_sync_sg_for_device(&smc->conn);
 
-	rc = smc_clc_send_confirm(smc);
-	if (rc)
-		goto out_err_unlock;
+	reason_code = smc_clc_send_confirm(smc);
+	if (reason_code)
+		return smc_connect_abort(smc, reason_code, local_contact);
+
+	smc_tx_init(smc);
 
 	if (local_contact == SMC_FIRST_CONTACT) {
 		/* QP confirmation over RoCE fabric */
 		reason_code = smc_clnt_conf_first_link(smc);
-		if (reason_code < 0) {
-			rc = reason_code;
-			goto out_err_unlock;
-		}
-		if (reason_code > 0)
-			goto decline_rdma_unlock;
+		if (reason_code)
+			return smc_connect_abort(smc, reason_code,
+						 local_contact);
 	}
-
 	mutex_unlock(&smc_create_lgr_pending);
-	smc_tx_init(smc);
 
-out_connected:
 	smc_copy_sock_settings_to_clc(smc);
 	if (smc->sk.sk_state == SMC_INIT)
 		smc->sk.sk_state = SMC_ACTIVE;
 
-	return rc ? rc : local_contact;
+	return 0;
+}
 
-decline_rdma_unlock:
-	if (local_contact == SMC_FIRST_CONTACT)
-		smc_lgr_forget(smc->conn.lgr);
-	mutex_unlock(&smc_create_lgr_pending);
-	smc_conn_free(&smc->conn);
-decline_rdma:
-	/* RDMA setup failed, switch back to TCP */
-	smc->use_fallback = true;
-	if (reason_code && (reason_code != SMC_CLC_DECL_REPLY)) {
-		rc = smc_clc_send_decline(smc, reason_code);
-		if (rc < 0)
-			goto out_err;
-	}
-	goto out_connected;
+/* perform steps before actually connecting */
+static int __smc_connect(struct smc_sock *smc)
+{
+	struct smc_clc_msg_accept_confirm aclc;
+	struct smc_ib_device *ibdev;
+	int rc = 0;
+	u8 ibport;
 
-out_err_unlock:
-	if (local_contact == SMC_FIRST_CONTACT)
-		smc_lgr_forget(smc->conn.lgr);
-	mutex_unlock(&smc_create_lgr_pending);
-	smc_conn_free(&smc->conn);
-out_err:
-	if (smc->sk.sk_state == SMC_INIT)
-		sock_put(&smc->sk); /* passive closing */
-	return rc;
+	sock_hold(&smc->sk); /* sock put in passive closing */
+
+	if (smc->use_fallback)
+		return smc_connect_fallback(smc);
+
+	/* if peer has not signalled SMC-capability, fall back */
+	if (!tcp_sk(smc->clcsock->sk)->syn_smc)
+		return smc_connect_fallback(smc);
+
+	/* IPSec connections opt out of SMC-R optimizations */
+	if (using_ipsec(smc))
+		return smc_connect_decline_fallback(smc, SMC_CLC_DECL_IPSEC);
+
+	/* check if a RDMA device is available; if not, fall back */
+	if (smc_check_rdma(smc, &ibdev, &ibport))
+		return smc_connect_decline_fallback(smc, SMC_CLC_DECL_CNFERR);
+
+	/* perform CLC handshake */
+	rc = smc_connect_clc(smc, &aclc, ibdev, ibport);
+	if (rc)
+		return smc_connect_decline_fallback(smc, rc);
+
+	/* connect using rdma */
+	rc = smc_connect_rdma(smc, &aclc, ibdev, ibport);
+	if (rc)
+		return smc_connect_decline_fallback(smc, rc);
+
+	return 0;
 }
 
 static int smc_connect(struct socket *sock, struct sockaddr *addr,
@@ -590,8 +611,7 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr,
 	if (rc)
 		goto out;
 
-	/* setup RDMA connection */
-	rc = smc_connect_rdma(smc);
+	rc = __smc_connect(smc);
 	if (rc < 0)
 		goto out;
 	else
@@ -789,182 +809,239 @@ static int smc_serv_conf_first_link(struct smc_sock *smc)
 	return 0;
 }
 
-/* setup for RDMA connection of server */
-static void smc_listen_work(struct work_struct *work)
+/* listen worker: finish */
+static void smc_listen_out(struct smc_sock *new_smc)
 {
-	struct smc_sock *new_smc = container_of(work, struct smc_sock,
-						smc_listen_work);
-	struct smc_clc_msg_proposal_prefix *pclc_prfx;
-	struct socket *newclcsock = new_smc->clcsock;
 	struct smc_sock *lsmc = new_smc->listen_smc;
-	struct smc_clc_msg_accept_confirm cclc;
-	int local_contact = SMC_REUSE_CONTACT;
 	struct sock *newsmcsk = &new_smc->sk;
-	struct smc_clc_msg_proposal *pclc;
-	struct smc_ib_device *smcibdev;
-	u8 buf[SMC_CLC_MAX_LEN];
-	struct smc_link *link;
-	int reason_code = 0;
-	int rc = 0;
-	u8 ibport;
 
-	if (new_smc->use_fallback)
-		goto out_connected;
-
-	/* check if peer is smc capable */
-	if (!tcp_sk(newclcsock->sk)->syn_smc) {
-		new_smc->use_fallback = true;
-		goto out_connected;
+	lock_sock_nested(&lsmc->sk, SINGLE_DEPTH_NESTING);
+	if (lsmc->sk.sk_state == SMC_LISTEN) {
+		smc_accept_enqueue(&lsmc->sk, newsmcsk);
+	} else { /* no longer listening */
+		smc_close_non_accepted(newsmcsk);
 	}
+	release_sock(&lsmc->sk);
 
-	/* do inband token exchange -
-	 *wait for and receive SMC Proposal CLC message
-	 */
-	reason_code = smc_clc_wait_msg(new_smc, &buf, sizeof(buf),
-				       SMC_CLC_PROPOSAL);
-	if (reason_code < 0)
-		goto out_err;
-	if (reason_code > 0)
-		goto decline_rdma;
+	/* Wake up accept */
+	lsmc->sk.sk_data_ready(&lsmc->sk);
+	sock_put(&lsmc->sk); /* sock_hold in smc_tcp_listen_work */
+}
 
-	/* IPSec connections opt out of SMC-R optimizations */
-	if (using_ipsec(new_smc)) {
-		reason_code = SMC_CLC_DECL_IPSEC;
-		goto decline_rdma;
-	}
+/* listen worker: finish in state connected */
+static void smc_listen_out_connected(struct smc_sock *new_smc)
+{
+	struct sock *newsmcsk = &new_smc->sk;
 
-	/* PNET table look up: search active ib_device and port
-	 * within same PNETID that also contains the ethernet device
-	 * used for the internal TCP socket
-	 */
-	smc_pnet_find_roce_resource(newclcsock->sk, &smcibdev, &ibport);
-	if (!smcibdev) {
-		reason_code = SMC_CLC_DECL_CNFERR; /* configuration error */
-		goto decline_rdma;
+	sk_refcnt_debug_inc(newsmcsk);
+	if (newsmcsk->sk_state == SMC_INIT)
+		newsmcsk->sk_state = SMC_ACTIVE;
+
+	smc_listen_out(new_smc);
+}
+
+/* listen worker: finish in error state */
+static void smc_listen_out_err(struct smc_sock *new_smc)
+{
+	struct sock *newsmcsk = &new_smc->sk;
+
+	if (newsmcsk->sk_state == SMC_INIT)
+		sock_put(&new_smc->sk); /* passive closing */
+	newsmcsk->sk_state = SMC_CLOSED;
+	smc_conn_free(&new_smc->conn);
+
+	smc_listen_out(new_smc);
+}
+
+/* listen worker: decline and fall back if possible */
+static void smc_listen_decline(struct smc_sock *new_smc, int reason_code,
+			       int local_contact)
+{
+	/* RDMA setup failed, switch back to TCP */
+	if (local_contact == SMC_FIRST_CONTACT)
+		smc_lgr_forget(new_smc->conn.lgr);
+	if (reason_code < 0) { /* error, no fallback possible */
+		smc_listen_out_err(new_smc);
+		return;
+	}
+	smc_conn_free(&new_smc->conn);
+	new_smc->use_fallback = true;
+	if (reason_code && reason_code != SMC_CLC_DECL_REPLY) {
+		if (smc_clc_send_decline(new_smc, reason_code) < 0) {
+			smc_listen_out_err(new_smc);
+			return;
+		}
 	}
+	smc_listen_out_connected(new_smc);
+}
+
+/* listen worker: check prefixes */
+static int smc_listen_rdma_check(struct smc_sock *new_smc,
+				 struct smc_clc_msg_proposal *pclc)
+{
+	struct smc_clc_msg_proposal_prefix *pclc_prfx;
+	struct socket *newclcsock = new_smc->clcsock;
 
-	pclc = (struct smc_clc_msg_proposal *)&buf;
 	pclc_prfx = smc_clc_proposal_get_prefix(pclc);
+	if (smc_clc_prfx_match(newclcsock, pclc_prfx))
+		return SMC_CLC_DECL_CNFERR;
 
-	rc = smc_clc_prfx_match(newclcsock, pclc_prfx);
-	if (rc) {
-		reason_code = SMC_CLC_DECL_CNFERR; /* configuration error */
-		goto decline_rdma;
-	}
+	return 0;
+}
 
+/* listen worker: initialize connection and buffers */
+static int smc_listen_rdma_init(struct smc_sock *new_smc,
+				struct smc_clc_msg_proposal *pclc,
+				struct smc_ib_device *ibdev, u8 ibport,
+				int *local_contact)
+{
 	/* allocate connection / link group */
-	mutex_lock(&smc_create_lgr_pending);
-	local_contact = smc_conn_create(new_smc, smcibdev, ibport, &pclc->lcl,
-					0);
-	if (local_contact < 0) {
-		rc = local_contact;
-		if (rc == -ENOMEM)
-			reason_code = SMC_CLC_DECL_MEM;/* insufficient memory*/
-		goto decline_rdma_unlock;
+	*local_contact = smc_conn_create(new_smc, ibdev, ibport, &pclc->lcl, 0);
+	if (*local_contact < 0) {
+		if (*local_contact == -ENOMEM)
+			return SMC_CLC_DECL_MEM;/* insufficient memory*/
+		return SMC_CLC_DECL_INTERR; /* other error */
 	}
-	link = &new_smc->conn.lgr->lnk[SMC_SINGLE_LINK];
 
 	/* create send buffer and rmb */
-	rc = smc_buf_create(new_smc);
-	if (rc) {
-		reason_code = SMC_CLC_DECL_MEM;
-		goto decline_rdma_unlock;
-	}
+	if (smc_buf_create(new_smc))
+		return SMC_CLC_DECL_MEM;
 
-	smc_close_init(new_smc);
-	smc_rx_init(new_smc);
+	return 0;
+}
+
+/* listen worker: register buffers */
+static int smc_listen_rdma_reg(struct smc_sock *new_smc, int local_contact)
+{
+	struct smc_link *link = &new_smc->conn.lgr->lnk[SMC_SINGLE_LINK];
 
 	if (local_contact != SMC_FIRST_CONTACT) {
 		if (!new_smc->conn.rmb_desc->reused) {
-			if (smc_reg_rmb(link, new_smc->conn.rmb_desc, true)) {
-				reason_code = SMC_CLC_DECL_INTERR;
-				goto decline_rdma_unlock;
-			}
+			if (smc_reg_rmb(link, new_smc->conn.rmb_desc, true))
+				return SMC_CLC_DECL_INTERR;
 		}
 	}
 	smc_rmb_sync_sg_for_device(&new_smc->conn);
 
-	rc = smc_clc_send_accept(new_smc, local_contact);
-	if (rc)
-		goto out_err_unlock;
+	return 0;
+}
+
+/* listen worker: finish RDMA setup */
+static void smc_listen_rdma_finish(struct smc_sock *new_smc,
+				   struct smc_clc_msg_accept_confirm *cclc,
+				   int local_contact)
+{
+	struct smc_link *link = &new_smc->conn.lgr->lnk[SMC_SINGLE_LINK];
+	int reason_code = 0;
 
-	/* receive SMC Confirm CLC message */
-	reason_code = smc_clc_wait_msg(new_smc, &cclc, sizeof(cclc),
-				       SMC_CLC_CONFIRM);
-	if (reason_code < 0)
-		goto out_err_unlock;
-	if (reason_code > 0)
-		goto decline_rdma_unlock;
-	smc_conn_save_peer_info(new_smc, &cclc);
 	if (local_contact == SMC_FIRST_CONTACT)
-		smc_link_save_peer_info(link, &cclc);
+		smc_link_save_peer_info(link, cclc);
 
-	rc = smc_rmb_rtoken_handling(&new_smc->conn, &cclc);
-	if (rc) {
+	if (smc_rmb_rtoken_handling(&new_smc->conn, cclc)) {
 		reason_code = SMC_CLC_DECL_INTERR;
-		goto decline_rdma_unlock;
+		goto decline;
 	}
 
 	if (local_contact == SMC_FIRST_CONTACT) {
-		rc = smc_ib_ready_link(link);
-		if (rc) {
+		if (smc_ib_ready_link(link)) {
 			reason_code = SMC_CLC_DECL_INTERR;
-			goto decline_rdma_unlock;
+			goto decline;
 		}
 		/* QP confirmation over RoCE fabric */
 		reason_code = smc_serv_conf_first_link(new_smc);
-		if (reason_code < 0)
-			/* peer is not aware of a problem */
-			goto out_err_unlock;
-		if (reason_code > 0)
-			goto decline_rdma_unlock;
+		if (reason_code)
+			goto decline;
 	}
+	return;
 
-	smc_tx_init(new_smc);
+decline:
 	mutex_unlock(&smc_create_lgr_pending);
+	smc_listen_decline(new_smc, reason_code, local_contact);
+}
 
-out_connected:
-	sk_refcnt_debug_inc(newsmcsk);
-	if (newsmcsk->sk_state == SMC_INIT)
-		newsmcsk->sk_state = SMC_ACTIVE;
-enqueue:
-	lock_sock_nested(&lsmc->sk, SINGLE_DEPTH_NESTING);
-	if (lsmc->sk.sk_state == SMC_LISTEN) {
-		smc_accept_enqueue(&lsmc->sk, newsmcsk);
-	} else { /* no longer listening */
-		smc_close_non_accepted(newsmcsk);
+/* setup for RDMA connection of server */
+static void smc_listen_work(struct work_struct *work)
+{
+	struct smc_sock *new_smc = container_of(work, struct smc_sock,
+						smc_listen_work);
+	struct socket *newclcsock = new_smc->clcsock;
+	struct smc_clc_msg_accept_confirm cclc;
+	struct smc_clc_msg_proposal *pclc;
+	struct smc_ib_device *ibdev;
+	u8 buf[SMC_CLC_MAX_LEN];
+	int local_contact = 0;
+	int reason_code = 0;
+	int rc = 0;
+	u8 ibport;
+
+	if (new_smc->use_fallback) {
+		smc_listen_out_connected(new_smc);
+		return;
 	}
-	release_sock(&lsmc->sk);
 
-	/* Wake up accept */
-	lsmc->sk.sk_data_ready(&lsmc->sk);
-	sock_put(&lsmc->sk); /* sock_hold in smc_tcp_listen_work */
-	return;
+	/* check if peer is smc capable */
+	if (!tcp_sk(newclcsock->sk)->syn_smc) {
+		new_smc->use_fallback = true;
+		smc_listen_out_connected(new_smc);
+		return;
+	}
 
-decline_rdma_unlock:
-	if (local_contact == SMC_FIRST_CONTACT)
-		smc_lgr_forget(new_smc->conn.lgr);
-	mutex_unlock(&smc_create_lgr_pending);
-decline_rdma:
-	/* RDMA setup failed, switch back to TCP */
-	smc_conn_free(&new_smc->conn);
-	new_smc->use_fallback = true;
-	if (reason_code && (reason_code != SMC_CLC_DECL_REPLY)) {
-		if (smc_clc_send_decline(new_smc, reason_code) < 0)
-			goto out_err;
+	/* do inband token exchange -
+	 * wait for and receive SMC Proposal CLC message
+	 */
+	pclc = (struct smc_clc_msg_proposal *)&buf;
+	reason_code = smc_clc_wait_msg(new_smc, pclc, SMC_CLC_MAX_LEN,
+				       SMC_CLC_PROPOSAL);
+	if (reason_code) {
+		smc_listen_decline(new_smc, reason_code, 0);
+		return;
 	}
-	goto out_connected;
 
-out_err_unlock:
-	if (local_contact == SMC_FIRST_CONTACT)
-		smc_lgr_forget(new_smc->conn.lgr);
+	/* IPSec connections opt out of SMC-R optimizations */
+	if (using_ipsec(new_smc)) {
+		smc_listen_decline(new_smc, SMC_CLC_DECL_IPSEC, 0);
+		return;
+	}
+
+	mutex_lock(&smc_create_lgr_pending);
+	smc_close_init(new_smc);
+	smc_rx_init(new_smc);
+	smc_tx_init(new_smc);
+
+	/* check if RDMA is available */
+	if (smc_check_rdma(new_smc, &ibdev, &ibport) ||
+	    smc_listen_rdma_check(new_smc, pclc) ||
+	    smc_listen_rdma_init(new_smc, pclc, ibdev, ibport,
+				 &local_contact) ||
+	    smc_listen_rdma_reg(new_smc, local_contact)) {
+		/* SMC not supported, decline */
+		mutex_unlock(&smc_create_lgr_pending);
+		smc_listen_decline(new_smc, SMC_CLC_DECL_CNFERR, local_contact);
+		return;
+	}
+
+	/* send SMC Accept CLC message */
+	rc = smc_clc_send_accept(new_smc, local_contact);
+	if (rc) {
+		mutex_unlock(&smc_create_lgr_pending);
+		smc_listen_decline(new_smc, rc, local_contact);
+		return;
+	}
+
+	/* receive SMC Confirm CLC message */
+	reason_code = smc_clc_wait_msg(new_smc, &cclc, sizeof(cclc),
+				       SMC_CLC_CONFIRM);
+	if (reason_code) {
+		mutex_unlock(&smc_create_lgr_pending);
+		smc_listen_decline(new_smc, reason_code, local_contact);
+		return;
+	}
+
+	/* finish worker */
+	smc_listen_rdma_finish(new_smc, &cclc, local_contact);
+	smc_conn_save_peer_info(new_smc, &cclc);
 	mutex_unlock(&smc_create_lgr_pending);
-out_err:
-	if (newsmcsk->sk_state == SMC_INIT)
-		sock_put(&new_smc->sk); /* passive closing */
-	newsmcsk->sk_state = SMC_CLOSED;
-	smc_conn_free(&new_smc->conn);
-	goto enqueue; /* queue new sock with sk_err set */
+	smc_listen_out_connected(new_smc);
 }
 
 static void smc_tcp_listen_work(struct work_struct *work)
@@ -1225,7 +1302,7 @@ static __poll_t smc_poll(struct file *file, struct socket *sock,
 			if (sk->sk_state == SMC_INIT &&
 			    mask & EPOLLOUT &&
 			    smc->clcsock->sk->sk_state != TCP_CLOSE) {
-				rc = smc_connect_rdma(smc);
+				rc = __smc_connect(smc);
 				if (rc < 0)
 					mask |= EPOLLERR;
 				/* success cases including fallback */
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 8/9] net/smc: change smc_buf_free function parameters
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

This patch changes the function smc_buf_free to use the SMC link group
instead of the link as function parameter. Also, it changes the order of
the other two parameters.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc_core.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index c63cf7b07ccc..1e5c0e90a706 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -36,8 +36,8 @@ static struct smc_lgr_list smc_lgr_list = {	/* established link groups */
 	.num = 0,
 };
 
-static void smc_buf_free(struct smc_buf_desc *buf_desc, struct smc_link *lnk,
-			 bool is_rmb);
+static void smc_buf_free(struct smc_link_group *lgr, bool is_rmb,
+			 struct smc_buf_desc *buf_desc);
 
 static void smc_lgr_schedule_free_work(struct smc_link_group *lgr)
 {
@@ -249,14 +249,12 @@ static void smc_buf_unuse(struct smc_connection *conn)
 		} else {
 			/* buf registration failed, reuse not possible */
 			struct smc_link_group *lgr = conn->lgr;
-			struct smc_link *lnk;
 
 			write_lock_bh(&lgr->rmbs_lock);
 			list_del(&conn->rmb_desc->list);
 			write_unlock_bh(&lgr->rmbs_lock);
 
-			lnk = &lgr->lnk[SMC_SINGLE_LINK];
-			smc_buf_free(conn->rmb_desc, lnk, true);
+			smc_buf_free(lgr, true, conn->rmb_desc);
 		}
 	}
 }
@@ -282,9 +280,11 @@ static void smc_link_clear(struct smc_link *lnk)
 	smc_wr_free_link_mem(lnk);
 }
 
-static void smc_buf_free(struct smc_buf_desc *buf_desc, struct smc_link *lnk,
-			 bool is_rmb)
+static void smc_buf_free(struct smc_link_group *lgr, bool is_rmb,
+			 struct smc_buf_desc *buf_desc)
 {
+	struct smc_link *lnk = &lgr->lnk[SMC_SINGLE_LINK];
+
 	if (is_rmb) {
 		if (buf_desc->mr_rx[SMC_SINGLE_LINK])
 			smc_ib_put_memory_region(
@@ -303,7 +303,6 @@ static void smc_buf_free(struct smc_buf_desc *buf_desc, struct smc_link *lnk,
 
 static void __smc_lgr_free_bufs(struct smc_link_group *lgr, bool is_rmb)
 {
-	struct smc_link *lnk = &lgr->lnk[SMC_SINGLE_LINK];
 	struct smc_buf_desc *buf_desc, *bf_desc;
 	struct list_head *buf_list;
 	int i;
@@ -316,7 +315,7 @@ static void __smc_lgr_free_bufs(struct smc_link_group *lgr, bool is_rmb)
 		list_for_each_entry_safe(buf_desc, bf_desc, buf_list,
 					 list) {
 			list_del(&buf_desc->list);
-			smc_buf_free(buf_desc, lnk, is_rmb);
+			smc_buf_free(lgr, is_rmb, buf_desc);
 		}
 	}
 }
@@ -627,7 +626,7 @@ static struct smc_buf_desc *smc_new_buf_create(struct smc_link_group *lgr,
 	rc = sg_alloc_table(&buf_desc->sgt[SMC_SINGLE_LINK], 1,
 			    GFP_KERNEL);
 	if (rc) {
-		smc_buf_free(buf_desc, lnk, is_rmb);
+		smc_buf_free(lgr, is_rmb, buf_desc);
 		return ERR_PTR(rc);
 	}
 	sg_set_buf(buf_desc->sgt[SMC_SINGLE_LINK].sgl,
@@ -638,7 +637,7 @@ static struct smc_buf_desc *smc_new_buf_create(struct smc_link_group *lgr,
 			       is_rmb ? DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	/* SMC protocol depends on mapping to one DMA address only */
 	if (rc != 1)  {
-		smc_buf_free(buf_desc, lnk, is_rmb);
+		smc_buf_free(lgr, is_rmb, buf_desc);
 		return ERR_PTR(-EAGAIN);
 	}
 
@@ -649,7 +648,7 @@ static struct smc_buf_desc *smc_new_buf_create(struct smc_link_group *lgr,
 					      IB_ACCESS_LOCAL_WRITE,
 					      buf_desc);
 		if (rc) {
-			smc_buf_free(buf_desc, lnk, is_rmb);
+			smc_buf_free(lgr, is_rmb, buf_desc);
 			return ERR_PTR(rc);
 		}
 	}
@@ -775,8 +774,7 @@ int smc_buf_create(struct smc_sock *smc)
 	/* create rmb */
 	rc = __smc_buf_create(smc, true);
 	if (rc)
-		smc_buf_free(smc->conn.sndbuf_desc,
-			     &smc->conn.lgr->lnk[SMC_SINGLE_LINK], false);
+		smc_buf_free(smc->conn.lgr, false, smc->conn.sndbuf_desc);
 	return rc;
 }
 
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 7/9] net/smc: do a few smc_core.c cleanups
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

This patch consists of Christmas tree fixes and removal of an unneeded
function parameter.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc_core.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 66afa086ae72..c63cf7b07ccc 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -474,10 +474,10 @@ int smc_conn_create(struct smc_sock *smc,
 		    struct smc_clc_msg_local *lcl, int srv_first_contact)
 {
 	struct smc_connection *conn = &smc->conn;
+	int local_contact = SMC_FIRST_CONTACT;
 	struct smc_link_group *lgr;
 	unsigned short vlan_id;
 	enum smc_lgr_role role;
-	int local_contact = SMC_FIRST_CONTACT;
 	int rc = 0;
 
 	role = smc->listen_smc ? SMC_SERV : SMC_CLNT;
@@ -573,11 +573,9 @@ int smc_uncompress_bufsize(u8 compressed)
 /* try to reuse a sndbuf or rmb description slot for a certain
  * buffer size; if not available, return NULL
  */
-static inline
-struct smc_buf_desc *smc_buf_get_slot(struct smc_link_group *lgr,
-				      int compressed_bufsize,
-				      rwlock_t *lock,
-				      struct list_head *buf_list)
+static struct smc_buf_desc *smc_buf_get_slot(int compressed_bufsize,
+					     rwlock_t *lock,
+					     struct list_head *buf_list)
 {
 	struct smc_buf_desc *buf_slot;
 
@@ -662,9 +660,9 @@ static struct smc_buf_desc *smc_new_buf_create(struct smc_link_group *lgr,
 
 static int __smc_buf_create(struct smc_sock *smc, bool is_rmb)
 {
+	struct smc_buf_desc *buf_desc = ERR_PTR(-ENOMEM);
 	struct smc_connection *conn = &smc->conn;
 	struct smc_link_group *lgr = conn->lgr;
-	struct smc_buf_desc *buf_desc = ERR_PTR(-ENOMEM);
 	struct list_head *buf_list;
 	int bufsize, bufsize_short;
 	int sk_buf_size;
@@ -692,7 +690,7 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_rmb)
 			continue;
 
 		/* check for reusable slot in the link group */
-		buf_desc = smc_buf_get_slot(lgr, bufsize_short, lock, buf_list);
+		buf_desc = smc_buf_get_slot(bufsize_short, lock, buf_list);
 		if (buf_desc) {
 			memset(buf_desc->cpu_addr, 0, bufsize);
 			break; /* found reusable slot */
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 6/9] net/smc: restructure CDC message reception
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

This patch moves a CDC sanity check from smc_cdc_msg_recv_action() to
the other sanity checks in smc_cdc_rx_handler(). While doing this, it
simplifies smc_cdc_msg_recv() and removes unneeded function parameters.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc_cdc.c | 47 ++++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 2809f9902446..8d2c079c87b0 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -165,19 +165,12 @@ static inline bool smc_cdc_before(u16 seq1, u16 seq2)
 }
 
 static void smc_cdc_msg_recv_action(struct smc_sock *smc,
-				    struct smc_link *link,
 				    struct smc_cdc_msg *cdc)
 {
 	union smc_host_cursor cons_old, prod_old;
 	struct smc_connection *conn = &smc->conn;
 	int diff_cons, diff_prod;
 
-	if (!cdc->prod_flags.failover_validation) {
-		if (smc_cdc_before(ntohs(cdc->seqno),
-				   conn->local_rx_ctrl.seqno))
-			/* received seqno is old */
-			return;
-	}
 	smc_curs_write(&prod_old,
 		       smc_curs_read(&conn->local_rx_ctrl.prod, conn),
 		       conn);
@@ -236,26 +229,11 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
 }
 
 /* called under tasklet context */
-static inline void smc_cdc_msg_recv(struct smc_cdc_msg *cdc,
-				    struct smc_link *link, u64 wr_id)
+static void smc_cdc_msg_recv(struct smc_sock *smc, struct smc_cdc_msg *cdc)
 {
-	struct smc_link_group *lgr = container_of(link, struct smc_link_group,
-						  lnk[SMC_SINGLE_LINK]);
-	struct smc_connection *connection;
-	struct smc_sock *smc;
-
-	/* lookup connection */
-	read_lock_bh(&lgr->conns_lock);
-	connection = smc_lgr_find_conn(ntohl(cdc->token), lgr);
-	if (!connection) {
-		read_unlock_bh(&lgr->conns_lock);
-		return;
-	}
-	smc = container_of(connection, struct smc_sock, conn);
 	sock_hold(&smc->sk);
-	read_unlock_bh(&lgr->conns_lock);
 	bh_lock_sock(&smc->sk);
-	smc_cdc_msg_recv_action(smc, link, cdc);
+	smc_cdc_msg_recv_action(smc, cdc);
 	bh_unlock_sock(&smc->sk);
 	sock_put(&smc->sk); /* no free sk in softirq-context */
 }
@@ -266,12 +244,31 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
 {
 	struct smc_link *link = (struct smc_link *)wc->qp->qp_context;
 	struct smc_cdc_msg *cdc = buf;
+	struct smc_connection *conn;
+	struct smc_link_group *lgr;
+	struct smc_sock *smc;
 
 	if (wc->byte_len < offsetof(struct smc_cdc_msg, reserved))
 		return; /* short message */
 	if (cdc->len != SMC_WR_TX_SIZE)
 		return; /* invalid message */
-	smc_cdc_msg_recv(cdc, link, wc->wr_id);
+
+	/* lookup connection */
+	lgr = container_of(link, struct smc_link_group, lnk[SMC_SINGLE_LINK]);
+	read_lock_bh(&lgr->conns_lock);
+	conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
+	read_unlock_bh(&lgr->conns_lock);
+	if (!conn)
+		return;
+	smc = container_of(conn, struct smc_sock, conn);
+
+	if (!cdc->prod_flags.failover_validation) {
+		if (smc_cdc_before(ntohs(cdc->seqno),
+				   conn->local_rx_ctrl.seqno))
+			/* received seqno is old */
+			return;
+	}
+	smc_cdc_msg_recv(smc, cdc);
 }
 
 static struct smc_wr_rx_handler smc_cdc_rx_handlers[] = {
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 5/9] net/smc: move smc_core specific code from smc.h to smc_core
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

SMC connection and buffer handling belong to smc_core. So, this patch
moves this code from smc.h to smc_core.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc.h      | 41 -----------------------------------------
 net/smc/smc_core.c | 27 +++++++++++++++++++++++++++
 net/smc/smc_core.h | 12 ++++++++++++
 3 files changed, 39 insertions(+), 41 deletions(-)

diff --git a/net/smc/smc.h b/net/smc/smc.h
index 9bc37645e7d5..a1467e411645 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -220,41 +220,6 @@ static inline u32 ntoh24(u8 *net)
 	return be32_to_cpu(t);
 }
 
-#define SMC_BUF_MIN_SIZE 16384		/* minimum size of an RMB */
-
-#define SMC_RMBE_SIZES	16	/* number of distinct sizes for an RMBE */
-/* theoretically, the RFC states that largest size would be 512K,
- * i.e. compressed 5 and thus 6 sizes (0..5), despite
- * struct smc_clc_msg_accept_confirm.rmbe_size being a 4 bit value (0..15)
- */
-
-/* convert the RMB size into the compressed notation - minimum 16K.
- * In contrast to plain ilog2, this rounds towards the next power of 2,
- * so the socket application gets at least its desired sndbuf / rcvbuf size.
- */
-static inline u8 smc_compress_bufsize(int size)
-{
-	u8 compressed;
-
-	if (size <= SMC_BUF_MIN_SIZE)
-		return 0;
-
-	size = (size - 1) >> 14;
-	compressed = ilog2(size) + 1;
-	if (compressed >= SMC_RMBE_SIZES)
-		compressed = SMC_RMBE_SIZES - 1;
-	return compressed;
-}
-
-/* convert the RMB size from compressed notation into integer */
-static inline int smc_uncompress_bufsize(u8 compressed)
-{
-	u32 size;
-
-	size = 0x00000001 << (((int)compressed) + 14);
-	return (int)size;
-}
-
 #ifdef CONFIG_XFRM
 static inline bool using_ipsec(struct smc_sock *smc)
 {
@@ -268,12 +233,6 @@ static inline bool using_ipsec(struct smc_sock *smc)
 }
 #endif
 
-struct smc_clc_msg_local;
-
-void smc_conn_free(struct smc_connection *conn);
-int smc_conn_create(struct smc_sock *smc,
-		    struct smc_ib_device *smcibdev, u8 ibport,
-		    struct smc_clc_msg_local *lcl, int srv_first_contact);
 struct sock *smc_accept_dequeue(struct sock *parent, struct socket *new_sock);
 void smc_close_non_accepted(struct sock *sk);
 
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 70ab2f716f83..66afa086ae72 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -543,6 +543,33 @@ int smc_conn_create(struct smc_sock *smc,
 	return rc ? rc : local_contact;
 }
 
+/* convert the RMB size into the compressed notation - minimum 16K.
+ * In contrast to plain ilog2, this rounds towards the next power of 2,
+ * so the socket application gets at least its desired sndbuf / rcvbuf size.
+ */
+static u8 smc_compress_bufsize(int size)
+{
+	u8 compressed;
+
+	if (size <= SMC_BUF_MIN_SIZE)
+		return 0;
+
+	size = (size - 1) >> 14;
+	compressed = ilog2(size) + 1;
+	if (compressed >= SMC_RMBE_SIZES)
+		compressed = SMC_RMBE_SIZES - 1;
+	return compressed;
+}
+
+/* convert the RMB size from compressed notation into integer */
+int smc_uncompress_bufsize(u8 compressed)
+{
+	u32 size;
+
+	size = 0x00000001 << (((int)compressed) + 14);
+	return (int)size;
+}
+
 /* try to reuse a sndbuf or rmb description slot for a certain
  * buffer size; if not available, return NULL
  */
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 3d4bd7011ab3..93cb3523bf50 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -141,6 +141,12 @@ struct smc_rtoken {				/* address/key of remote RMB */
 };
 
 #define SMC_LGR_ID_SIZE		4
+#define SMC_BUF_MIN_SIZE	16384	/* minimum size of an RMB */
+#define SMC_RMBE_SIZES		16	/* number of distinct RMBE sizes */
+/* theoretically, the RFC states that largest size would be 512K,
+ * i.e. compressed 5 and thus 6 sizes (0..5), despite
+ * struct smc_clc_msg_accept_confirm.rmbe_size being a 4 bit value (0..15)
+ */
 
 struct smc_link_group {
 	struct list_head	list;
@@ -205,12 +211,14 @@ static inline struct smc_connection *smc_lgr_find_conn(
 
 struct smc_sock;
 struct smc_clc_msg_accept_confirm;
+struct smc_clc_msg_local;
 
 void smc_lgr_free(struct smc_link_group *lgr);
 void smc_lgr_forget(struct smc_link_group *lgr);
 void smc_lgr_terminate(struct smc_link_group *lgr);
 void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport);
 int smc_buf_create(struct smc_sock *smc);
+int smc_uncompress_bufsize(u8 compressed);
 int smc_rmb_rtoken_handling(struct smc_connection *conn,
 			    struct smc_clc_msg_accept_confirm *clc);
 int smc_rtoken_add(struct smc_link_group *lgr, __be64 nw_vaddr, __be32 nw_rkey);
@@ -219,5 +227,9 @@ void smc_sndbuf_sync_sg_for_cpu(struct smc_connection *conn);
 void smc_sndbuf_sync_sg_for_device(struct smc_connection *conn);
 void smc_rmb_sync_sg_for_cpu(struct smc_connection *conn);
 void smc_rmb_sync_sg_for_device(struct smc_connection *conn);
+void smc_conn_free(struct smc_connection *conn);
+int smc_conn_create(struct smc_sock *smc,
+		    struct smc_ib_device *smcibdev, u8 ibport,
+		    struct smc_clc_msg_local *lcl, int srv_first_contact);
 void smc_core_exit(void);
 #endif
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 4/9] net/smc: calculate write offset in RMB only once per connection
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

Currently, the write offset within the RMB is calculated on each write
operation although it is fixed for each connection. With this patch, the
offset is calculated once and stored in a connection specific variable.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c | 5 ++++-
 net/smc/smc.h    | 1 +
 net/smc/smc_tx.c | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bd67430f3730..9871604ebb53 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -377,10 +377,13 @@ static int smc_clnt_conf_first_link(struct smc_sock *smc)
 static void smc_conn_save_peer_info(struct smc_sock *smc,
 				    struct smc_clc_msg_accept_confirm *clc)
 {
+	int bufsize = smc_uncompress_bufsize(clc->rmbe_size);
+
 	smc->conn.peer_rmbe_idx = clc->rmbe_idx;
 	smc->conn.local_tx_ctrl.token = ntohl(clc->rmbe_alert_token);
-	smc->conn.peer_rmbe_size = smc_uncompress_bufsize(clc->rmbe_size);
+	smc->conn.peer_rmbe_size = bufsize;
 	atomic_set(&smc->conn.peer_rmbe_space, smc->conn.peer_rmbe_size);
+	smc->conn.tx_off = bufsize * (smc->conn.peer_rmbe_idx - 1);
 }
 
 static void smc_link_save_peer_info(struct smc_link *link,
diff --git a/net/smc/smc.h b/net/smc/smc.h
index fb8dec8bc17f..9bc37645e7d5 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -151,6 +151,7 @@ struct smc_connection {
 	u16			tx_cdc_seq;	/* sequence # for CDC send */
 	spinlock_t		send_lock;	/* protect wr_sends */
 	struct delayed_work	tx_work;	/* retry of smc_cdc_msg_send */
+	u32			tx_off;		/* base offset in peer rmb */
 
 	struct smc_host_cdc_msg	local_rx_ctrl;	/* filled during event_handl.
 						 * .prod cf. TCP rcv_nxt
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 58a56c992b26..1f4a38b857f0 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -261,7 +261,7 @@ static int smc_tx_rdma_write(struct smc_connection *conn, int peer_rmbe_offset,
 	rdma_wr.remote_addr =
 		lgr->rtokens[conn->rtoken_idx][SMC_SINGLE_LINK].dma_addr +
 		/* RMBE within RMB */
-		((conn->peer_rmbe_idx - 1) * conn->peer_rmbe_size) +
+		conn->tx_off +
 		/* offset within RMBE */
 		peer_rmbe_offset;
 	rdma_wr.rkey = lgr->rtokens[conn->rtoken_idx][SMC_SINGLE_LINK].rkey;
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 3/9] net/smc: rename connection index to RMBE index
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

The connection index is actually a RMBE index. So, this patch changes
the name accordingly.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c  | 2 +-
 net/smc/smc.h     | 2 +-
 net/smc/smc_clc.c | 4 ++--
 net/smc/smc_clc.h | 2 +-
 net/smc/smc_tx.c  | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 4bb320ae11a9..bd67430f3730 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -377,7 +377,7 @@ static int smc_clnt_conf_first_link(struct smc_sock *smc)
 static void smc_conn_save_peer_info(struct smc_sock *smc,
 				    struct smc_clc_msg_accept_confirm *clc)
 {
-	smc->conn.peer_conn_idx = clc->conn_idx;
+	smc->conn.peer_rmbe_idx = clc->rmbe_idx;
 	smc->conn.local_tx_ctrl.token = ntohl(clc->rmbe_alert_token);
 	smc->conn.peer_rmbe_size = smc_uncompress_bufsize(clc->rmbe_size);
 	atomic_set(&smc->conn.peer_rmbe_space, smc->conn.peer_rmbe_size);
diff --git a/net/smc/smc.h b/net/smc/smc.h
index ad6694f516dd..fb8dec8bc17f 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -118,7 +118,7 @@ struct smc_connection {
 	struct rb_node		alert_node;
 	struct smc_link_group	*lgr;		/* link group of connection */
 	u32			alert_token_local; /* unique conn. id */
-	u8			peer_conn_idx;	/* from tcp handshake */
+	u8			peer_rmbe_idx;	/* from tcp handshake */
 	int			peer_rmbe_size;	/* size of peer rx buffer */
 	atomic_t		peer_rmbe_space;/* remaining free bytes in peer
 						 * rmbe
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 236cb3f12c71..717449b1da0b 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -442,7 +442,7 @@ int smc_clc_send_confirm(struct smc_sock *smc)
 	hton24(cclc.qpn, link->roce_qp->qp_num);
 	cclc.rmb_rkey =
 		htonl(conn->rmb_desc->mr_rx[SMC_SINGLE_LINK]->rkey);
-	cclc.conn_idx = 1; /* for now: 1 RMB = 1 RMBE */
+	cclc.rmbe_idx = 1; /* for now: 1 RMB = 1 RMBE */
 	cclc.rmbe_alert_token = htonl(conn->alert_token_local);
 	cclc.qp_mtu = min(link->path_mtu, link->peer_mtu);
 	cclc.rmbe_size = conn->rmbe_size_short;
@@ -494,7 +494,7 @@ int smc_clc_send_accept(struct smc_sock *new_smc, int srv_first_contact)
 	hton24(aclc.qpn, link->roce_qp->qp_num);
 	aclc.rmb_rkey =
 		htonl(conn->rmb_desc->mr_rx[SMC_SINGLE_LINK]->rkey);
-	aclc.conn_idx = 1;			/* as long as 1 RMB = 1 RMBE */
+	aclc.rmbe_idx = 1;			/* as long as 1 RMB = 1 RMBE */
 	aclc.rmbe_alert_token = htonl(conn->alert_token_local);
 	aclc.qp_mtu = link->path_mtu;
 	aclc.rmbe_size = conn->rmbe_size_short,
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index 63bf1dc2c1f9..41ff9ea96139 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -97,7 +97,7 @@ struct smc_clc_msg_accept_confirm {	/* clc accept / confirm message */
 	struct smc_clc_msg_local lcl;
 	u8 qpn[3];		/* QP number */
 	__be32 rmb_rkey;	/* RMB rkey */
-	u8 conn_idx;		/* Connection index, which RMBE in RMB */
+	u8 rmbe_idx;		/* Index of RMBE in RMB */
 	__be32 rmbe_alert_token;/* unique connection id */
 #if defined(__BIG_ENDIAN_BITFIELD)
 	u8 rmbe_size : 4,	/* RMBE buf size (compressed notation) */
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 58bc7ca3101a..58a56c992b26 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -261,7 +261,7 @@ static int smc_tx_rdma_write(struct smc_connection *conn, int peer_rmbe_offset,
 	rdma_wr.remote_addr =
 		lgr->rtokens[conn->rtoken_idx][SMC_SINGLE_LINK].dma_addr +
 		/* RMBE within RMB */
-		((conn->peer_conn_idx - 1) * conn->peer_rmbe_size) +
+		((conn->peer_rmbe_idx - 1) * conn->peer_rmbe_size) +
 		/* offset within RMBE */
 		peer_rmbe_offset;
 	rdma_wr.rkey = lgr->rtokens[conn->rtoken_idx][SMC_SINGLE_LINK].rkey;
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 2/9] net/smc: move link group list to smc_core
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

This patch moves the global link group list to smc_core where the link
group functions are. To make this work, it moves code in af_smc and
smc_ib that operates on the link group list to smc_core as well.

While at it, the link group counter is integrated into the list
structure and initialized to zero.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c   | 19 +------------------
 net/smc/smc_core.c | 40 +++++++++++++++++++++++++++++++++++++---
 net/smc/smc_core.h |  5 +++--
 net/smc/smc_ib.c   | 13 +------------
 4 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 409b1367970c..4bb320ae11a9 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -46,11 +46,6 @@ static DEFINE_MUTEX(smc_create_lgr_pending);	/* serialize link group
 						 * creation
 						 */
 
-struct smc_lgr_list smc_lgr_list = {		/* established link groups */
-	.lock = __SPIN_LOCK_UNLOCKED(smc_lgr_list.lock),
-	.list = LIST_HEAD_INIT(smc_lgr_list.list),
-};
-
 static void smc_tcp_listen_work(struct work_struct *);
 
 static void smc_set_keepalive(struct sock *sk, int val)
@@ -1637,19 +1632,7 @@ static int __init smc_init(void)
 
 static void __exit smc_exit(void)
 {
-	struct smc_link_group *lgr, *lg;
-	LIST_HEAD(lgr_freeing_list);
-
-	spin_lock_bh(&smc_lgr_list.lock);
-	if (!list_empty(&smc_lgr_list.list))
-		list_splice_init(&smc_lgr_list.list, &lgr_freeing_list);
-	spin_unlock_bh(&smc_lgr_list.lock);
-	list_for_each_entry_safe(lgr, lg, &lgr_freeing_list, list) {
-		list_del_init(&lgr->list);
-		smc_llc_link_inactive(&lgr->lnk[SMC_SINGLE_LINK]);
-		cancel_delayed_work_sync(&lgr->free_work);
-		smc_lgr_free(lgr); /* free link group */
-	}
+	smc_core_exit();
 	static_branch_disable(&tcp_have_smc);
 	smc_ib_unregister_client();
 	sock_unregister(PF_SMC);
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 6396426080f9..70ab2f716f83 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -30,7 +30,11 @@
 #define SMC_LGR_FREE_DELAY_SERV		(600 * HZ)
 #define SMC_LGR_FREE_DELAY_CLNT		(SMC_LGR_FREE_DELAY_SERV + 10)
 
-static u32 smc_lgr_num;			/* unique link group number */
+static struct smc_lgr_list smc_lgr_list = {	/* established link groups */
+	.lock = __SPIN_LOCK_UNLOCKED(smc_lgr_list.lock),
+	.list = LIST_HEAD_INIT(smc_lgr_list.list),
+	.num = 0,
+};
 
 static void smc_buf_free(struct smc_buf_desc *buf_desc, struct smc_link *lnk,
 			 bool is_rmb);
@@ -181,8 +185,8 @@ static int smc_lgr_create(struct smc_sock *smc,
 		INIT_LIST_HEAD(&lgr->sndbufs[i]);
 		INIT_LIST_HEAD(&lgr->rmbs[i]);
 	}
-	smc_lgr_num += SMC_LGR_NUM_INCR;
-	memcpy(&lgr->id, (u8 *)&smc_lgr_num, SMC_LGR_ID_SIZE);
+	smc_lgr_list.num += SMC_LGR_NUM_INCR;
+	memcpy(&lgr->id, (u8 *)&smc_lgr_list.num, SMC_LGR_ID_SIZE);
 	INIT_DELAYED_WORK(&lgr->free_work, smc_lgr_free_work);
 	lgr->conns_all = RB_ROOT;
 
@@ -374,6 +378,18 @@ void smc_lgr_terminate(struct smc_link_group *lgr)
 	smc_lgr_schedule_free_work(lgr);
 }
 
+/* Called when IB port is terminated */
+void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport)
+{
+	struct smc_link_group *lgr, *l;
+
+	list_for_each_entry_safe(lgr, l, &smc_lgr_list.list, list) {
+		if (lgr->lnk[SMC_SINGLE_LINK].smcibdev == smcibdev &&
+		    lgr->lnk[SMC_SINGLE_LINK].ibport == ibport)
+			smc_lgr_terminate(lgr);
+	}
+}
+
 /* Determine vlan of internal TCP socket.
  * @vlan_id: address to store the determined vlan id into
  */
@@ -802,3 +818,21 @@ int smc_rmb_rtoken_handling(struct smc_connection *conn,
 		return conn->rtoken_idx;
 	return 0;
 }
+
+/* Called (from smc_exit) when module is removed */
+void smc_core_exit(void)
+{
+	struct smc_link_group *lgr, *lg;
+	LIST_HEAD(lgr_freeing_list);
+
+	spin_lock_bh(&smc_lgr_list.lock);
+	if (!list_empty(&smc_lgr_list.list))
+		list_splice_init(&smc_lgr_list.list, &lgr_freeing_list);
+	spin_unlock_bh(&smc_lgr_list.lock);
+	list_for_each_entry_safe(lgr, lg, &lgr_freeing_list, list) {
+		list_del_init(&lgr->list);
+		smc_llc_link_inactive(&lgr->lnk[SMC_SINGLE_LINK]);
+		cancel_delayed_work_sync(&lgr->free_work);
+		smc_lgr_free(lgr); /* free link group */
+	}
+}
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 4885a7efa84f..3d4bd7011ab3 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -23,10 +23,9 @@
 struct smc_lgr_list {			/* list of link group definition */
 	struct list_head	list;
 	spinlock_t		lock;	/* protects list of link groups */
+	u32			num;	/* unique link group number */
 };
 
-extern struct smc_lgr_list	smc_lgr_list; /* list of link groups */
-
 enum smc_lgr_role {		/* possible roles of a link group */
 	SMC_CLNT,	/* client */
 	SMC_SERV	/* server */
@@ -210,6 +209,7 @@ struct smc_clc_msg_accept_confirm;
 void smc_lgr_free(struct smc_link_group *lgr);
 void smc_lgr_forget(struct smc_link_group *lgr);
 void smc_lgr_terminate(struct smc_link_group *lgr);
+void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport);
 int smc_buf_create(struct smc_sock *smc);
 int smc_rmb_rtoken_handling(struct smc_connection *conn,
 			    struct smc_clc_msg_accept_confirm *clc);
@@ -219,4 +219,5 @@ void smc_sndbuf_sync_sg_for_cpu(struct smc_connection *conn);
 void smc_sndbuf_sync_sg_for_device(struct smc_connection *conn);
 void smc_rmb_sync_sg_for_cpu(struct smc_connection *conn);
 void smc_rmb_sync_sg_for_device(struct smc_connection *conn);
+void smc_core_exit(void);
 #endif
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 26df554f7588..0eed7ab9f28b 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -143,17 +143,6 @@ int smc_ib_ready_link(struct smc_link *lnk)
 	return rc;
 }
 
-static void smc_ib_port_terminate(struct smc_ib_device *smcibdev, u8 ibport)
-{
-	struct smc_link_group *lgr, *l;
-
-	list_for_each_entry_safe(lgr, l, &smc_lgr_list.list, list) {
-		if (lgr->lnk[SMC_SINGLE_LINK].smcibdev == smcibdev &&
-		    lgr->lnk[SMC_SINGLE_LINK].ibport == ibport)
-			smc_lgr_terminate(lgr);
-	}
-}
-
 /* process context wrapper for might_sleep smc_ib_remember_port_attr */
 static void smc_ib_port_event_work(struct work_struct *work)
 {
@@ -165,7 +154,7 @@ static void smc_ib_port_event_work(struct work_struct *work)
 		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
 		clear_bit(port_idx, &smcibdev->port_event_mask);
 		if (!smc_ib_port_active(smcibdev, port_idx + 1))
-			smc_ib_port_terminate(smcibdev, port_idx + 1);
+			smc_port_terminate(smcibdev, port_idx + 1);
 	}
 }
 
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 1/9] net/smc: add common buffer size in send and receive buffer descriptors
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20180518073418.91878-1-ubraun@linux.ibm.com>

From: Hans Wippel <hwippel@linux.ibm.com>

In addition to the buffer references, SMC currently stores the sizes of
the receive and send buffers in each connection as separate variables.
This patch introduces a buffer length variable in the common buffer
descriptor and uses this length instead.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/af_smc.c   |  2 +-
 net/smc/smc.h      |  2 --
 net/smc/smc_cdc.c  |  8 ++++----
 net/smc/smc_core.c |  8 ++------
 net/smc/smc_core.h |  1 +
 net/smc/smc_diag.c |  5 +++--
 net/smc/smc_rx.c   | 10 +++++-----
 net/smc/smc_tx.c   | 28 ++++++++++++++--------------
 net/smc/smc_tx.h   |  2 +-
 9 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 6ad4f6c771c3..409b1367970c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1421,7 +1421,7 @@ static int smc_ioctl(struct socket *sock, unsigned int cmd,
 		/* output queue size (not send + not acked) */
 		if (smc->sk.sk_state == SMC_LISTEN)
 			return -EINVAL;
-		answ = smc->conn.sndbuf_size -
+		answ = smc->conn.sndbuf_desc->len -
 					atomic_read(&smc->conn.sndbuf_space);
 		break;
 	case SIOCOUTQNSD:
diff --git a/net/smc/smc.h b/net/smc/smc.h
index ec209cd48d42..ad6694f516dd 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -126,9 +126,7 @@ struct smc_connection {
 	int			rtoken_idx;	/* idx to peer RMB rkey/addr */
 
 	struct smc_buf_desc	*sndbuf_desc;	/* send buffer descriptor */
-	int			sndbuf_size;	/* sndbuf size <== sock wmem */
 	struct smc_buf_desc	*rmb_desc;	/* RMBE descriptor */
-	int			rmbe_size;	/* RMBE size <== sock rmem */
 	int			rmbe_size_short;/* compressed notation */
 	int			rmbe_update_limit;
 						/* lower limit for consumer
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 42ad57365eca..2809f9902446 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -44,13 +44,13 @@ static void smc_cdc_tx_handler(struct smc_wr_tx_pend_priv *pnd_snd,
 	smc = container_of(cdcpend->conn, struct smc_sock, conn);
 	bh_lock_sock(&smc->sk);
 	if (!wc_status) {
-		diff = smc_curs_diff(cdcpend->conn->sndbuf_size,
+		diff = smc_curs_diff(cdcpend->conn->sndbuf_desc->len,
 				     &cdcpend->conn->tx_curs_fin,
 				     &cdcpend->cursor);
 		/* sndbuf_space is decreased in smc_sendmsg */
 		smp_mb__before_atomic();
 		atomic_add(diff, &cdcpend->conn->sndbuf_space);
-		/* guarantee 0 <= sndbuf_space <= sndbuf_size */
+		/* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */
 		smp_mb__after_atomic();
 		smc_curs_write(&cdcpend->conn->tx_curs_fin,
 			       smc_curs_read(&cdcpend->cursor, cdcpend->conn),
@@ -198,13 +198,13 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
 		smp_mb__after_atomic();
 	}
 
-	diff_prod = smc_curs_diff(conn->rmbe_size, &prod_old,
+	diff_prod = smc_curs_diff(conn->rmb_desc->len, &prod_old,
 				  &conn->local_rx_ctrl.prod);
 	if (diff_prod) {
 		/* bytes_to_rcv is decreased in smc_recvmsg */
 		smp_mb__before_atomic();
 		atomic_add(diff_prod, &conn->bytes_to_rcv);
-		/* guarantee 0 <= bytes_to_rcv <= rmbe_size */
+		/* guarantee 0 <= bytes_to_rcv <= rmb_desc->len */
 		smp_mb__after_atomic();
 		smc->sk.sk_data_ready(&smc->sk);
 	} else if ((conn->local_rx_ctrl.prod_flags.write_blocked) ||
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 08c05cd0bbae..6396426080f9 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -236,15 +236,12 @@ static int smc_lgr_create(struct smc_sock *smc,
 
 static void smc_buf_unuse(struct smc_connection *conn)
 {
-	if (conn->sndbuf_desc) {
+	if (conn->sndbuf_desc)
 		conn->sndbuf_desc->used = 0;
-		conn->sndbuf_size = 0;
-	}
 	if (conn->rmb_desc) {
 		if (!conn->rmb_desc->regerr) {
 			conn->rmb_desc->reused = 1;
 			conn->rmb_desc->used = 0;
-			conn->rmbe_size = 0;
 		} else {
 			/* buf registration failed, reuse not possible */
 			struct smc_link_group *lgr = conn->lgr;
@@ -616,6 +613,7 @@ static struct smc_buf_desc *smc_new_buf_create(struct smc_link_group *lgr,
 		}
 	}
 
+	buf_desc->len = bufsize;
 	return buf_desc;
 }
 
@@ -675,14 +673,12 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_rmb)
 
 	if (is_rmb) {
 		conn->rmb_desc = buf_desc;
-		conn->rmbe_size = bufsize;
 		conn->rmbe_size_short = bufsize_short;
 		smc->sk.sk_rcvbuf = bufsize * 2;
 		atomic_set(&conn->bytes_to_rcv, 0);
 		conn->rmbe_update_limit = smc_rmb_wnd_update_limit(bufsize);
 	} else {
 		conn->sndbuf_desc = buf_desc;
-		conn->sndbuf_size = bufsize;
 		smc->sk.sk_sndbuf = bufsize * 2;
 		atomic_set(&conn->sndbuf_space, bufsize);
 	}
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 845dc073de13..4885a7efa84f 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -124,6 +124,7 @@ struct smc_buf_desc {
 	struct list_head	list;
 	void			*cpu_addr;	/* virtual address of buffer */
 	struct page		*pages;
+	int			len;		/* length of buffer */
 	struct sg_table		sgt[SMC_LINKS_PER_LGR_MAX];/* virtual buffer */
 	struct ib_mr		*mr_rx[SMC_LINKS_PER_LGR_MAX];
 						/* for rmb only: memory region
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index 05dd7e6d314d..839354402215 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -101,8 +101,9 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
 		struct smc_connection *conn = &smc->conn;
 		struct smc_diag_conninfo cinfo = {
 			.token = conn->alert_token_local,
-			.sndbuf_size = conn->sndbuf_size,
-			.rmbe_size = conn->rmbe_size,
+			.sndbuf_size = conn->sndbuf_desc ?
+				conn->sndbuf_desc->len : 0,
+			.rmbe_size = conn->rmb_desc ? conn->rmb_desc->len : 0,
 			.peer_rmbe_size = conn->peer_rmbe_size,
 
 			.rx_prod.wrap = conn->local_rx_ctrl.prod.wrap,
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index ed45569289f5..290a434471d1 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -51,7 +51,7 @@ static void smc_rx_wake_up(struct sock *sk)
 static void smc_rx_update_consumer(struct smc_connection *conn,
 				   union smc_host_cursor cons, size_t len)
 {
-	smc_curs_add(conn->rmbe_size, &cons, len);
+	smc_curs_add(conn->rmb_desc->len, &cons, len);
 	smc_curs_write(&conn->local_tx_ctrl.cons, smc_curs_read(&cons, conn),
 		       conn);
 	/* send consumer cursor update if required */
@@ -288,11 +288,11 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
 			       conn);
 		/* subsequent splice() calls pick up where previous left */
 		if (splbytes)
-			smc_curs_add(conn->rmbe_size, &cons, splbytes);
+			smc_curs_add(conn->rmb_desc->len, &cons, splbytes);
 		/* determine chunks where to read from rcvbuf */
 		/* either unwrapped case, or 1st chunk of wrapped case */
-		chunk_len = min_t(size_t,
-				  copylen, conn->rmbe_size - cons.count);
+		chunk_len = min_t(size_t, copylen, conn->rmb_desc->len -
+				  cons.count);
 		chunk_len_sum = chunk_len;
 		chunk_off = cons.count;
 		smc_rmb_sync_sg_for_cpu(conn);
@@ -331,7 +331,7 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
 			/* increased in recv tasklet smc_cdc_msg_rcv() */
 			smp_mb__before_atomic();
 			atomic_sub(copylen, &conn->bytes_to_rcv);
-			/* guarantee 0 <= bytes_to_rcv <= rmbe_size */
+			/* guarantee 0 <= bytes_to_rcv <= rmb_desc->len */
 			smp_mb__after_atomic();
 			if (msg)
 				smc_rx_update_consumer(conn, cons, copylen);
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 08a7de98bb03..58bc7ca3101a 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -180,8 +180,8 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		tx_cnt_prep = prep.count;
 		/* determine chunks where to write into sndbuf */
 		/* either unwrapped case, or 1st chunk of wrapped case */
-		chunk_len = min_t(size_t,
-				  copylen, conn->sndbuf_size - tx_cnt_prep);
+		chunk_len = min_t(size_t, copylen, conn->sndbuf_desc->len -
+				  tx_cnt_prep);
 		chunk_len_sum = chunk_len;
 		chunk_off = tx_cnt_prep;
 		smc_sndbuf_sync_sg_for_cpu(conn);
@@ -206,21 +206,21 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		}
 		smc_sndbuf_sync_sg_for_device(conn);
 		/* update cursors */
-		smc_curs_add(conn->sndbuf_size, &prep, copylen);
+		smc_curs_add(conn->sndbuf_desc->len, &prep, copylen);
 		smc_curs_write(&conn->tx_curs_prep,
 			       smc_curs_read(&prep, conn),
 			       conn);
 		/* increased in send tasklet smc_cdc_tx_handler() */
 		smp_mb__before_atomic();
 		atomic_sub(copylen, &conn->sndbuf_space);
-		/* guarantee 0 <= sndbuf_space <= sndbuf_size */
+		/* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */
 		smp_mb__after_atomic();
 		/* since we just produced more new data into sndbuf,
 		 * trigger sndbuf consumer: RDMA write into peer RMBE and CDC
 		 */
 		if ((msg->msg_flags & MSG_MORE || smc_tx_is_corked(smc)) &&
 		    (atomic_read(&conn->sndbuf_space) >
-						(conn->sndbuf_size >> 1)))
+						(conn->sndbuf_desc->len >> 1)))
 			/* for a corked socket defer the RDMA writes if there
 			 * is still sufficient sndbuf_space available
 			 */
@@ -286,7 +286,7 @@ static inline void smc_tx_advance_cursors(struct smc_connection *conn,
 	atomic_sub(len, &conn->peer_rmbe_space);
 	/* guarantee 0 <= peer_rmbe_space <= peer_rmbe_size */
 	smp_mb__after_atomic();
-	smc_curs_add(conn->sndbuf_size, sent, len);
+	smc_curs_add(conn->sndbuf_desc->len, sent, len);
 }
 
 /* sndbuf consumer: prepare all necessary (src&dst) chunks of data transmit;
@@ -309,7 +309,7 @@ static int smc_tx_rdma_writes(struct smc_connection *conn)
 	smc_curs_write(&sent, smc_curs_read(&conn->tx_curs_sent, conn), conn);
 	smc_curs_write(&prep, smc_curs_read(&conn->tx_curs_prep, conn), conn);
 	/* cf. wmem_alloc - (snd_max - snd_una) */
-	to_send = smc_curs_diff(conn->sndbuf_size, &sent, &prep);
+	to_send = smc_curs_diff(conn->sndbuf_desc->len, &sent, &prep);
 	if (to_send <= 0)
 		return 0;
 
@@ -351,12 +351,12 @@ static int smc_tx_rdma_writes(struct smc_connection *conn)
 	dst_len_sum = dst_len;
 	src_off = sent.count;
 	/* dst_len determines the maximum src_len */
-	if (sent.count + dst_len <= conn->sndbuf_size) {
+	if (sent.count + dst_len <= conn->sndbuf_desc->len) {
 		/* unwrapped src case: single chunk of entire dst_len */
 		src_len = dst_len;
 	} else {
 		/* wrapped src case: 2 chunks of sum dst_len; start with 1st: */
-		src_len = conn->sndbuf_size - sent.count;
+		src_len = conn->sndbuf_desc->len - sent.count;
 	}
 	src_len_sum = src_len;
 	dma_addr = sg_dma_address(conn->sndbuf_desc->sgt[SMC_SINGLE_LINK].sgl);
@@ -368,8 +368,8 @@ static int smc_tx_rdma_writes(struct smc_connection *conn)
 			sges[srcchunk].lkey = link->roce_pd->local_dma_lkey;
 			num_sges++;
 			src_off += src_len;
-			if (src_off >= conn->sndbuf_size)
-				src_off -= conn->sndbuf_size;
+			if (src_off >= conn->sndbuf_desc->len)
+				src_off -= conn->sndbuf_desc->len;
 						/* modulo in send ring */
 			if (src_len_sum == dst_len)
 				break; /* either on 1st or 2nd iteration */
@@ -387,7 +387,7 @@ static int smc_tx_rdma_writes(struct smc_connection *conn)
 		dst_len = len - dst_len; /* remainder */
 		dst_len_sum += dst_len;
 		src_len = min_t(int,
-				dst_len, conn->sndbuf_size - sent.count);
+				dst_len, conn->sndbuf_desc->len - sent.count);
 		src_len_sum = src_len;
 	}
 
@@ -484,11 +484,11 @@ void smc_tx_consumer_update(struct smc_connection *conn)
 	smc_curs_write(&cfed,
 		       smc_curs_read(&conn->rx_curs_confirmed, conn),
 		       conn);
-	to_confirm = smc_curs_diff(conn->rmbe_size, &cfed, &cons);
+	to_confirm = smc_curs_diff(conn->rmb_desc->len, &cfed, &cons);
 
 	if (conn->local_rx_ctrl.prod_flags.cons_curs_upd_req ||
 	    ((to_confirm > conn->rmbe_update_limit) &&
-	     ((to_confirm > (conn->rmbe_size / 2)) ||
+	     ((to_confirm > (conn->rmb_desc->len / 2)) ||
 	      conn->local_rx_ctrl.prod_flags.write_blocked))) {
 		if ((smc_cdc_get_slot_and_msg_send(conn) < 0) &&
 		    conn->alert_token_local) { /* connection healthy */
diff --git a/net/smc/smc_tx.h b/net/smc/smc_tx.h
index 8f64b12bf03c..44d077942976 100644
--- a/net/smc/smc_tx.h
+++ b/net/smc/smc_tx.h
@@ -24,7 +24,7 @@ static inline int smc_tx_prepared_sends(struct smc_connection *conn)
 
 	smc_curs_write(&sent, smc_curs_read(&conn->tx_curs_sent, conn), conn);
 	smc_curs_write(&prep, smc_curs_read(&conn->tx_curs_prep, conn), conn);
-	return smc_curs_diff(conn->sndbuf_size, &sent, &prep);
+	return smc_curs_diff(conn->sndbuf_desc->len, &sent, &prep);
 }
 
 void smc_tx_work(struct work_struct *work);
-- 
2.16.3

^ permalink raw reply related

* [PATCH net-next 0/9] net/smc: cleanups 2018-05-18
From: Ursula Braun @ 2018-05-18  7:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

From: Ursula Braun <ursula.braun@de.ibm.com>

Hi Dave,

here are SMC patches for net-next providing restructuring and cleanup
in different areas.

Thanks, Ursula

Hans Wippel (9):
  net/smc: add common buffer size in send and receive buffer descriptors
  net/smc: move link group list to smc_core
  net/smc: rename connection index to RMBE index
  net/smc: calculate write offset in RMB only once per connection
  net/smc: move smc_core specific code from smc.h to smc_core
  net/smc: restructure CDC message reception
  net/smc: do a few smc_core.c cleanups
  net/smc: change smc_buf_free function parameters
  net/smc: restructure client and server code in af_smc

 net/smc/af_smc.c   | 587 +++++++++++++++++++++++++++++------------------------
 net/smc/smc.h      |  46 +----
 net/smc/smc_cdc.c  |  55 +++--
 net/smc/smc_clc.c  |   4 +-
 net/smc/smc_clc.h  |   2 +-
 net/smc/smc_core.c | 115 ++++++++---
 net/smc/smc_core.h |  18 +-
 net/smc/smc_diag.c |   5 +-
 net/smc/smc_ib.c   |  13 +-
 net/smc/smc_rx.c   |  10 +-
 net/smc/smc_tx.c   |  30 +--
 net/smc/smc_tx.h   |   2 +-
 12 files changed, 481 insertions(+), 406 deletions(-)

-- 
2.16.3

^ permalink raw reply

* [PATCH net-next] net: mvpp2: Add missing VLAN tag detection
From: Maxime Chevallier @ 2018-05-18  7:33 UTC (permalink / raw)
  To: davem
  Cc: Maxime Chevallier, netdev, linux-kernel, Antoine Tenart,
	thomas.petazzoni, gregory.clement, miquel.raynal, nadavh, stefanc,
	ymarkman, mw

Marvell PPv2 Header Parser sets some bits in the 'result_info' field in
each lookup iteration, to identify different packet attributes such as
DSA / VLAN tag, protocol infos, etc. This is used in further
classification stages in the controller.

It's the DSA tag detection entry that is in charge of detecting when there
is a single VLAN tag.

This commits adds the missing update of the result_info in this case.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 7ba7e5938c2e..47e582406bd3 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -2130,6 +2130,9 @@ static void mvpp2_prs_dsa_tag_set(struct mvpp2 *priv, int port, bool add,
 				mvpp2_prs_sram_ai_update(&pe, 0,
 							MVPP2_PRS_SRAM_AI_MASK);
 
+			/* Set result info bits to 'single vlan' */
+			mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_VLAN_SINGLE,
+						 MVPP2_PRS_RI_VLAN_MASK);
 			/* If packet is tagged continue check vid filtering */
 			mvpp2_prs_sram_next_lu_set(&pe, MVPP2_PRS_LU_VID);
 		} else {
-- 
2.11.0

^ permalink raw reply related

* [patch net-next 5/5] mlxsw: use devlink helper to generate physical port name
From: Jiri Pirko @ 2018-05-18  7:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Since devlink knows the info needed to generate the physical port name
in a generic way for all devlink users, use the helper to do the job.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c     | 11 +++++++++++
 drivers/net/ethernet/mellanox/mlxsw/core.h     |  2 ++
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 17 +++--------------
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c |  9 +++------
 4 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index a720aa11bcc0..a38faec45b30 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1763,6 +1763,17 @@ enum devlink_port_type mlxsw_core_port_type_get(struct mlxsw_core *mlxsw_core,
 }
 EXPORT_SYMBOL(mlxsw_core_port_type_get);
 
+int mlxsw_core_port_get_phys_port_name(struct mlxsw_core *mlxsw_core,
+				       u8 local_port, char *name, size_t len)
+{
+	struct mlxsw_core_port *mlxsw_core_port =
+					&mlxsw_core->ports[local_port];
+	struct devlink_port *devlink_port = &mlxsw_core_port->devlink_port;
+
+	return devlink_port_get_phys_port_name(devlink_port, name, len);
+}
+EXPORT_SYMBOL(mlxsw_core_port_get_phys_port_name);
+
 static void mlxsw_core_buf_dump_dbg(struct mlxsw_core *mlxsw_core,
 				    const char *buf, size_t size)
 {
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index e65287c15afd..4eac7fbd07d5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -209,6 +209,8 @@ void mlxsw_core_port_clear(struct mlxsw_core *mlxsw_core, u8 local_port,
 			   void *port_driver_priv);
 enum devlink_port_type mlxsw_core_port_type_get(struct mlxsw_core *mlxsw_core,
 						u8 local_port);
+int mlxsw_core_port_get_phys_port_name(struct mlxsw_core *mlxsw_core,
+				       u8 local_port, char *name, size_t len);
 
 int mlxsw_core_schedule_dw(struct delayed_work *dwork, unsigned long delay);
 bool mlxsw_core_schedule_work(struct work_struct *work);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 96f6d8af1fd8..bb252b36994d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1238,21 +1238,10 @@ static int mlxsw_sp_port_get_phys_port_name(struct net_device *dev, char *name,
 					    size_t len)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
-	u8 module = mlxsw_sp_port->mapping.module;
-	u8 width = mlxsw_sp_port->mapping.width;
-	u8 lane = mlxsw_sp_port->mapping.lane;
-	int err;
-
-	if (!mlxsw_sp_port->split)
-		err = snprintf(name, len, "p%d", module + 1);
-	else
-		err = snprintf(name, len, "p%ds%d", module + 1,
-			       lane / width);
 
-	if (err >= len)
-		return -EINVAL;
-
-	return 0;
+	return mlxsw_core_port_get_phys_port_name(mlxsw_sp_port->mlxsw_sp->core,
+						  mlxsw_sp_port->local_port,
+						  name, len);
 }
 
 static struct mlxsw_sp_port_mall_tc_entry *
diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index 551e05067b1e..3922c1cfe5f5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -417,13 +417,10 @@ static int mlxsw_sx_port_get_phys_port_name(struct net_device *dev, char *name,
 					    size_t len)
 {
 	struct mlxsw_sx_port *mlxsw_sx_port = netdev_priv(dev);
-	int err;
-
-	err = snprintf(name, len, "p%d", mlxsw_sx_port->mapping.module + 1);
-	if (err >= len)
-		return -EINVAL;
 
-	return 0;
+	return mlxsw_core_port_get_phys_port_name(mlxsw_sx_port->mlxsw_sx->core,
+						  mlxsw_sx_port->local_port,
+						  name, len);
 }
 
 static const struct net_device_ops mlxsw_sx_port_netdev_ops = {
-- 
2.14.3

^ permalink raw reply related

* [patch net-next 4/5] dsa: set devlink port attrs for dsa ports
From: Jiri Pirko @ 2018-05-18  7:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Set the attrs and allow to expose port flavour to user via devlink.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
RFC->v1:
- added missing break pointed out by Andrew
---
 net/dsa/dsa2.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index adf50fbc4c13..00126cda4319 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -270,7 +270,28 @@ static int dsa_port_setup(struct dsa_port *dp)
 	case DSA_PORT_TYPE_UNUSED:
 		break;
 	case DSA_PORT_TYPE_CPU:
+		/* dp->index is used now as port_number. However
+		 * CPU ports should have separate numbering
+		 * independent from front panel port numbers.
+		 */
+		devlink_port_attrs_set(&dp->devlink_port,
+				       DEVLINK_PORT_FLAVOUR_CPU,
+				       dp->index, false, 0);
+		err = dsa_port_link_register_of(dp);
+		if (err) {
+			dev_err(ds->dev, "failed to setup link for port %d.%d\n",
+				ds->index, dp->index);
+			return err;
+		}
+		break;
 	case DSA_PORT_TYPE_DSA:
+		/* dp->index is used now as port_number. However
+		 * DSA ports should have separate numbering
+		 * independent from front panel port numbers.
+		 */
+		devlink_port_attrs_set(&dp->devlink_port,
+				       DEVLINK_PORT_FLAVOUR_DSA,
+				       dp->index, false, 0);
 		err = dsa_port_link_register_of(dp);
 		if (err) {
 			dev_err(ds->dev, "failed to setup link for port %d.%d\n",
@@ -279,6 +300,9 @@ static int dsa_port_setup(struct dsa_port *dp)
 		}
 		break;
 	case DSA_PORT_TYPE_USER:
+		devlink_port_attrs_set(&dp->devlink_port,
+				       DEVLINK_PORT_FLAVOUR_PHYSICAL,
+				       dp->index, false, 0);
 		err = dsa_slave_create(dp);
 		if (err)
 			dev_err(ds->dev, "failed to create slave for port %d.%d\n",
-- 
2.14.3

^ permalink raw reply related

* [patch net-next 3/5] devlink: introduce a helper to generate physical port names
From: Jiri Pirko @ 2018-05-18  7:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Each driver implements physical port name generation by itself. However
as devlink has all needed info, it can easily do the job for all its
users. So implement this helper in devlink.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
RFC->v1:
- rebased
---
 include/net/devlink.h |  9 +++++++++
 net/core/devlink.c    | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 6eae15cf0f57..9686a1aa4ec9 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -378,6 +378,8 @@ void devlink_port_attrs_set(struct devlink_port *devlink_port,
 			    enum devlink_port_flavour flavour,
 			    u32 port_number, bool split,
 			    u32 split_subport_number);
+int devlink_port_get_phys_port_name(struct devlink_port *devlink_port,
+				    char *name, size_t len);
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
 			u32 size, u16 ingress_pools_count,
 			u16 egress_pools_count, u16 ingress_tc_count,
@@ -482,6 +484,13 @@ static inline void devlink_port_attrs_set(struct devlink_port *devlink_port,
 {
 }
 
+static inline int
+devlink_port_get_phys_port_name(struct devlink_port *devlink_port,
+				char *name, size_t len)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int devlink_sb_register(struct devlink *devlink,
 				      unsigned int sb_index, u32 size,
 				      u16 ingress_pools_count,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index af90d237cbc2..5c8a40e1a01e 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3016,6 +3016,39 @@ void devlink_port_attrs_set(struct devlink_port *devlink_port,
 }
 EXPORT_SYMBOL_GPL(devlink_port_attrs_set);
 
+int devlink_port_get_phys_port_name(struct devlink_port *devlink_port,
+				    char *name, size_t len)
+{
+	struct devlink_port_attrs *attrs = &devlink_port->attrs;
+	int n = 0;
+
+	if (!attrs->set)
+		return -EOPNOTSUPP;
+
+	switch (attrs->flavour) {
+	case DEVLINK_PORT_FLAVOUR_PHYSICAL:
+		if (!attrs->split)
+			n = snprintf(name, len, "p%u", attrs->port_number);
+		else
+			n = snprintf(name, len, "p%us%u", attrs->port_number,
+				     attrs->split_subport_number);
+		break;
+	case DEVLINK_PORT_FLAVOUR_CPU:
+	case DEVLINK_PORT_FLAVOUR_DSA:
+		/* As CPU and DSA ports do not have a netdevice associated
+		 * case should not ever happen.
+		 */
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
+	if (n >= len)
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devlink_port_get_phys_port_name);
+
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
 			u32 size, u16 ingress_pools_count,
 			u16 egress_pools_count, u16 ingress_tc_count,
-- 
2.14.3

^ permalink raw reply related

* [patch net-next 2/5] devlink: extend attrs_set for setting port flavours
From: Jiri Pirko @ 2018-05-18  7:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Devlink ports can have specific flavour according to the purpose of use.
This patch extend attrs_set so the driver can say which flavour port
has. Initial flavours are:
physical, cpu, dsa
User can query this to see right away what is the purpose of each port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
RFC->v1:
- rebased
- removed pf/vf reps flavours
---
 drivers/net/ethernet/mellanox/mlxsw/core.c       |  4 ++--
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c |  5 +++--
 include/net/devlink.h                            |  3 +++
 include/uapi/linux/devlink.h                     | 11 +++++++++++
 net/core/devlink.c                               |  5 +++++
 5 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 958769689ca2..a720aa11bcc0 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1722,8 +1722,8 @@ void mlxsw_core_port_eth_set(struct mlxsw_core *mlxsw_core, u8 local_port,
 	struct devlink_port *devlink_port = &mlxsw_core_port->devlink_port;
 
 	mlxsw_core_port->port_driver_priv = port_driver_priv;
-	devlink_port_attrs_set(devlink_port, port_number,
-			       split, split_port_subnumber);
+	devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PHYSICAL,
+			       port_number, split, split_port_subnumber);
 	devlink_port_type_eth_set(devlink_port, dev);
 }
 EXPORT_SYMBOL(mlxsw_core_port_eth_set);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index d7a768ff3112..b1e67cf4257a 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -175,8 +175,9 @@ int nfp_devlink_port_register(struct nfp_app *app, struct nfp_port *port)
 		return ret;
 
 	devlink_port_type_eth_set(&port->dl_port, port->netdev);
-	devlink_port_attrs_set(&port->dl_port, eth_port.label_port,
-			       eth_port.is_split, eth_port.label_subport);
+	devlink_port_attrs_set(&port->dl_port, DEVLINK_PORT_FLAVOUR_PHYSICAL,
+			       eth_port.label_port, eth_port.is_split,
+			       eth_port.label_subport);
 
 	devlink = priv_to_devlink(app->pf);
 
diff --git a/include/net/devlink.h b/include/net/devlink.h
index d6ca92266709..6eae15cf0f57 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -37,6 +37,7 @@ struct devlink {
 
 struct devlink_port_attrs {
 	bool set;
+	enum devlink_port_flavour flavour;
 	u32 port_number; /* same value as "split group" */
 	bool split;
 	u32 split_subport_number;
@@ -374,6 +375,7 @@ void devlink_port_type_ib_set(struct devlink_port *devlink_port,
 			      struct ib_device *ibdev);
 void devlink_port_type_clear(struct devlink_port *devlink_port);
 void devlink_port_attrs_set(struct devlink_port *devlink_port,
+			    enum devlink_port_flavour flavour,
 			    u32 port_number, bool split,
 			    u32 split_subport_number);
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
@@ -474,6 +476,7 @@ static inline void devlink_port_type_clear(struct devlink_port *devlink_port)
 }
 
 static inline void devlink_port_attrs_set(struct devlink_port *devlink_port,
+					  enum devlink_port_flavour flavour,
 					  u32 port_number, bool split,
 					  u32 split_subport_number)
 {
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 15b031a5ee7a..75cb5450c851 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -132,6 +132,16 @@ enum devlink_eswitch_encap_mode {
 	DEVLINK_ESWITCH_ENCAP_MODE_BASIC,
 };
 
+enum devlink_port_flavour {
+	DEVLINK_PORT_FLAVOUR_PHYSICAL, /* Any kind of a port physically
+					* facing the user.
+					*/
+	DEVLINK_PORT_FLAVOUR_CPU, /* CPU port */
+	DEVLINK_PORT_FLAVOUR_DSA, /* Distributed switch architecture
+				   * interconnect port.
+				   */
+};
+
 enum devlink_attr {
 	/* don't change the order or add anything between, this is ABI! */
 	DEVLINK_ATTR_UNSPEC,
@@ -224,6 +234,7 @@ enum devlink_attr {
 	DEVLINK_ATTR_DPIPE_TABLE_RESOURCE_ID,	/* u64 */
 	DEVLINK_ATTR_DPIPE_TABLE_RESOURCE_UNITS,/* u64 */
 
+	DEVLINK_ATTR_PORT_FLAVOUR,		/* u16 */
 	DEVLINK_ATTR_PORT_NUMBER,		/* u32 */
 	DEVLINK_ATTR_PORT_SPLIT_SUBPORT_NUMBER,	/* u32 */
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 8fde7d2df9b0..af90d237cbc2 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -460,6 +460,8 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg,
 
 	if (!attrs->set)
 		return 0;
+	if (nla_put_u16(msg, DEVLINK_ATTR_PORT_FLAVOUR, attrs->flavour))
+		return -EMSGSIZE;
 	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_NUMBER, attrs->port_number))
 		return -EMSGSIZE;
 	if (!attrs->split)
@@ -2991,6 +2993,7 @@ EXPORT_SYMBOL_GPL(devlink_port_type_clear);
  *	devlink_port_attrs_set - Set port attributes
  *
  *	@devlink_port: devlink port
+ *	@flavour: flavour of the port
  *	@port_number: number of the port that is facing user, for example
  *	              the front panel port number
  *	@split: indicates if this is split port
@@ -2998,12 +3001,14 @@ EXPORT_SYMBOL_GPL(devlink_port_type_clear);
  *	                       of subport.
  */
 void devlink_port_attrs_set(struct devlink_port *devlink_port,
+			    enum devlink_port_flavour flavour,
 			    u32 port_number, bool split,
 			    u32 split_subport_number)
 {
 	struct devlink_port_attrs *attrs = &devlink_port->attrs;
 
 	attrs->set = true;
+	attrs->flavour = flavour;
 	attrs->port_number = port_number;
 	attrs->split = split;
 	attrs->split_subport_number = split_subport_number;
-- 
2.14.3

^ permalink raw reply related

* [patch net-next 1/5] devlink: introduce devlink_port_attrs_set
From: Jiri Pirko @ 2018-05-18  7:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa
In-Reply-To: <20180518072904.29523-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Change existing setter for split port information into more generic
attrs setter. Alongside with that, allow to set port number and subport
number for split ports.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
RFC->v1:
- Reduced the nfp change just to simply use newly created attr_set func
---
 drivers/net/ethernet/mellanox/mlxsw/core.c       |  7 ++--
 drivers/net/ethernet/mellanox/mlxsw/core.h       |  3 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c   |  4 +--
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c   |  2 +-
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c |  4 +--
 include/net/devlink.h                            | 20 +++++++----
 include/uapi/linux/devlink.h                     |  3 ++
 net/core/devlink.c                               | 46 ++++++++++++++++++------
 8 files changed, 64 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index e13ac3b8dff7..958769689ca2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1714,15 +1714,16 @@ EXPORT_SYMBOL(mlxsw_core_port_fini);
 
 void mlxsw_core_port_eth_set(struct mlxsw_core *mlxsw_core, u8 local_port,
 			     void *port_driver_priv, struct net_device *dev,
-			     bool split, u32 split_group)
+			     u32 port_number, bool split,
+			     u32 split_port_subnumber)
 {
 	struct mlxsw_core_port *mlxsw_core_port =
 					&mlxsw_core->ports[local_port];
 	struct devlink_port *devlink_port = &mlxsw_core_port->devlink_port;
 
 	mlxsw_core_port->port_driver_priv = port_driver_priv;
-	if (split)
-		devlink_port_split_set(devlink_port, split_group);
+	devlink_port_attrs_set(devlink_port, port_number,
+			       split, split_port_subnumber);
 	devlink_port_type_eth_set(devlink_port, dev);
 }
 EXPORT_SYMBOL(mlxsw_core_port_eth_set);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 092d39399f3c..e65287c15afd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -201,7 +201,8 @@ int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port);
 void mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port);
 void mlxsw_core_port_eth_set(struct mlxsw_core *mlxsw_core, u8 local_port,
 			     void *port_driver_priv, struct net_device *dev,
-			     bool split, u32 split_group);
+			     u32 port_number, bool split,
+			     u32 split_port_subnumber);
 void mlxsw_core_port_ib_set(struct mlxsw_core *mlxsw_core, u8 local_port,
 			    void *port_driver_priv);
 void mlxsw_core_port_clear(struct mlxsw_core *mlxsw_core, u8 local_port,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 94132f6cec61..96f6d8af1fd8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -2927,8 +2927,8 @@ static int mlxsw_sp_port_create(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 	}
 
 	mlxsw_core_port_eth_set(mlxsw_sp->core, mlxsw_sp_port->local_port,
-				mlxsw_sp_port, dev, mlxsw_sp_port->split,
-				module);
+				mlxsw_sp_port, dev, module + 1,
+				mlxsw_sp_port->split, lane / width);
 	mlxsw_core_schedule_dw(&mlxsw_sp_port->periodic_hw_stats.update_dw, 0);
 	return 0;
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index a655c5850aa6..551e05067b1e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -1149,7 +1149,7 @@ static int __mlxsw_sx_port_eth_create(struct mlxsw_sx *mlxsw_sx, u8 local_port,
 	}
 
 	mlxsw_core_port_eth_set(mlxsw_sx->core, mlxsw_sx_port->local_port,
-				mlxsw_sx_port, dev, false, 0);
+				mlxsw_sx_port, dev, module + 1, false, 0);
 	mlxsw_sx->ports[local_port] = mlxsw_sx_port;
 	return 0;
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index eb0fc614673d..d7a768ff3112 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -175,8 +175,8 @@ int nfp_devlink_port_register(struct nfp_app *app, struct nfp_port *port)
 		return ret;
 
 	devlink_port_type_eth_set(&port->dl_port, port->netdev);
-	if (eth_port.is_split)
-		devlink_port_split_set(&port->dl_port, eth_port.label_port);
+	devlink_port_attrs_set(&port->dl_port, eth_port.label_port,
+			       eth_port.is_split, eth_port.label_subport);
 
 	devlink = priv_to_devlink(app->pf);
 
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 2e4f71e16e95..d6ca92266709 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -35,6 +35,13 @@ struct devlink {
 	char priv[0] __aligned(NETDEV_ALIGN);
 };
 
+struct devlink_port_attrs {
+	bool set;
+	u32 port_number; /* same value as "split group" */
+	bool split;
+	u32 split_subport_number;
+};
+
 struct devlink_port {
 	struct list_head list;
 	struct devlink *devlink;
@@ -43,8 +50,7 @@ struct devlink_port {
 	enum devlink_port_type type;
 	enum devlink_port_type desired_type;
 	void *type_dev;
-	bool split;
-	u32 split_group;
+	struct devlink_port_attrs attrs;
 };
 
 struct devlink_sb_pool_info {
@@ -367,8 +373,9 @@ void devlink_port_type_eth_set(struct devlink_port *devlink_port,
 void devlink_port_type_ib_set(struct devlink_port *devlink_port,
 			      struct ib_device *ibdev);
 void devlink_port_type_clear(struct devlink_port *devlink_port);
-void devlink_port_split_set(struct devlink_port *devlink_port,
-			    u32 split_group);
+void devlink_port_attrs_set(struct devlink_port *devlink_port,
+			    u32 port_number, bool split,
+			    u32 split_subport_number);
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
 			u32 size, u16 ingress_pools_count,
 			u16 egress_pools_count, u16 ingress_tc_count,
@@ -466,8 +473,9 @@ static inline void devlink_port_type_clear(struct devlink_port *devlink_port)
 {
 }
 
-static inline void devlink_port_split_set(struct devlink_port *devlink_port,
-					  u32 split_group)
+static inline void devlink_port_attrs_set(struct devlink_port *devlink_port,
+					  u32 port_number, bool split,
+					  u32 split_subport_number)
 {
 }
 
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 1df65a4c2044..15b031a5ee7a 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -224,6 +224,9 @@ enum devlink_attr {
 	DEVLINK_ATTR_DPIPE_TABLE_RESOURCE_ID,	/* u64 */
 	DEVLINK_ATTR_DPIPE_TABLE_RESOURCE_UNITS,/* u64 */
 
+	DEVLINK_ATTR_PORT_NUMBER,		/* u32 */
+	DEVLINK_ATTR_PORT_SPLIT_SUBPORT_NUMBER,	/* u32 */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index ad1317376798..8fde7d2df9b0 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -453,6 +453,25 @@ static void devlink_notify(struct devlink *devlink, enum devlink_command cmd)
 				msg, 0, DEVLINK_MCGRP_CONFIG, GFP_KERNEL);
 }
 
+static int devlink_nl_port_attrs_put(struct sk_buff *msg,
+				     struct devlink_port *devlink_port)
+{
+	struct devlink_port_attrs *attrs = &devlink_port->attrs;
+
+	if (!attrs->set)
+		return 0;
+	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_NUMBER, attrs->port_number))
+		return -EMSGSIZE;
+	if (!attrs->split)
+		return 0;
+	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_SPLIT_GROUP, attrs->port_number))
+		return -EMSGSIZE;
+	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_SPLIT_SUBPORT_NUMBER,
+			attrs->split_subport_number))
+		return -EMSGSIZE;
+	return 0;
+}
+
 static int devlink_nl_port_fill(struct sk_buff *msg, struct devlink *devlink,
 				struct devlink_port *devlink_port,
 				enum devlink_command cmd, u32 portid,
@@ -492,9 +511,7 @@ static int devlink_nl_port_fill(struct sk_buff *msg, struct devlink *devlink,
 				   ibdev->name))
 			goto nla_put_failure;
 	}
-	if (devlink_port->split &&
-	    nla_put_u32(msg, DEVLINK_ATTR_PORT_SPLIT_GROUP,
-			devlink_port->split_group))
+	if (devlink_nl_port_attrs_put(msg, devlink_port))
 		goto nla_put_failure;
 
 	genlmsg_end(msg, hdr);
@@ -2971,19 +2988,28 @@ void devlink_port_type_clear(struct devlink_port *devlink_port)
 EXPORT_SYMBOL_GPL(devlink_port_type_clear);
 
 /**
- *	devlink_port_split_set - Set port is split
+ *	devlink_port_attrs_set - Set port attributes
  *
  *	@devlink_port: devlink port
- *	@split_group: split group - identifies group split port is part of
+ *	@port_number: number of the port that is facing user, for example
+ *	              the front panel port number
+ *	@split: indicates if this is split port
+ *	@split_subport_number: if the port is split, this is the number
+ *	                       of subport.
  */
-void devlink_port_split_set(struct devlink_port *devlink_port,
-			    u32 split_group)
+void devlink_port_attrs_set(struct devlink_port *devlink_port,
+			    u32 port_number, bool split,
+			    u32 split_subport_number)
 {
-	devlink_port->split = true;
-	devlink_port->split_group = split_group;
+	struct devlink_port_attrs *attrs = &devlink_port->attrs;
+
+	attrs->set = true;
+	attrs->port_number = port_number;
+	attrs->split = split;
+	attrs->split_subport_number = split_subport_number;
 	devlink_port_notify(devlink_port, DEVLINK_CMD_PORT_NEW);
 }
-EXPORT_SYMBOL_GPL(devlink_port_split_set);
+EXPORT_SYMBOL_GPL(devlink_port_attrs_set);
 
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
 			u32 size, u16 ingress_pools_count,
-- 
2.14.3

^ permalink raw reply related

* [patch net-next 0/5] devlink: introduce port flavours and common phys_port_name generation
From: Jiri Pirko @ 2018-05-18  7:28 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
	satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
	sathya.perla, vasundhara-v.volam, tariqt, eranbe,
	jeffrey.t.kirsher, roopa

From: Jiri Pirko <jiri@mellanox.com>

This patchset resolves 2 issues we have right now:
1) There are many netdevices / ports in the system, for port, pf, vf
   represenatation but the user has no way to see which is which
2) The ndo_get_phys_port_name is implemented in each driver separatelly,
   which may lead to inconsistent names between drivers.

This patchset introduces port flavours which should address the first
problem. In this initial patchset, I focus on DSA and their port
flavours. As a follow-up, I plan to add PF and VF representor flavours.
However, that needs additional dependencies in drivers (nfp, mlx5).

The common phys_port_name generation is used by mlxsw. An example output
for mlxsw looks like this:

# devlink port
...
pci/0000:03:00.0/59: type eth netdev enp3s0np4 flavour physical number 4
pci/0000:03:00.0/61: type eth netdev enp3s0np1 flavour physical number 1
pci/0000:03:00.0/63: type eth netdev enp3s0np2 flavour physical number 2
pci/0000:03:00.0/49: type eth netdev enp3s0np8s0 flavour physical number 8 split_group 8 subport 0
pci/0000:03:00.0/50: type eth netdev enp3s0np8s1 flavour physical number 8 split_group 8 subport 1
pci/0000:03:00.0/51: type eth netdev enp3s0np8s2 flavour physical number 8 split_group 8 subport 2
pci/0000:03:00.0/52: type eth netdev enp3s0np8s3 flavour physical number 8 split_group 8 subport 3

As you can see, the netdev names are generated according to the flavour
and port number. In case the port is split, the split subnumber is also
included.

An example output for dsa_loop testing module looks like this:
# devlink port
mdio_bus/fixed-0:1f/0: type eth netdev lan1 flavour physical number 0
mdio_bus/fixed-0:1f/1: type eth netdev lan2 flavour physical number 1
mdio_bus/fixed-0:1f/2: type eth netdev lan3 flavour physical number 2
mdio_bus/fixed-0:1f/3: type eth netdev lan4 flavour physical number 3
mdio_bus/fixed-0:1f/4: type notset
mdio_bus/fixed-0:1f/5: type notset flavour cpu number 5
mdio_bus/fixed-0:1f/6: type notset
mdio_bus/fixed-0:1f/7: type notset
mdio_bus/fixed-0:1f/8: type notset
mdio_bus/fixed-0:1f/9: type notset
mdio_bus/fixed-0:1f/10: type notset
mdio_bus/fixed-0:1f/11: type notset

---
RFC->v1:
-removed nfp patches, removed DSA patch that used name generation helper
-patch 1:
 - Reduced the nfp change just to simply use newly created attr_set func
-patch 2:
 - rebased
 - removed pf/vf reps flavours
-patch 3:
 - rebased
-patch 4:
 - added missing break pointed out by Andrew

Jiri Pirko (5):
  devlink: introduce devlink_port_attrs_set
  devlink: extend attrs_set for setting port flavours
  devlink: introduce a helper to generate physical port names
  dsa: set devlink port attrs for dsa ports
  mlxsw: use devlink helper to generate physical port name

 drivers/net/ethernet/mellanox/mlxsw/core.c       | 18 ++++-
 drivers/net/ethernet/mellanox/mlxsw/core.h       |  5 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c   | 21 ++----
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 11 ++-
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c |  5 +-
 include/net/devlink.h                            | 32 +++++++--
 include/uapi/linux/devlink.h                     | 14 ++++
 net/core/devlink.c                               | 86 +++++++++++++++++++++---
 net/dsa/dsa2.c                                   | 24 +++++++
 9 files changed, 170 insertions(+), 46 deletions(-)

-- 
2.14.3

^ permalink raw reply

* [PATCH bpf-next 5/5] selftests/bpf: test_sockmap, print additional test options
From: Prashant Bhole @ 2018-05-18  7:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, John Fastabend
  Cc: Prashant Bhole, David S . Miller, Shuah Khan, netdev
In-Reply-To: <20180518071753.4768-1-bhole_prashant_q7@lab.ntt.co.jp>

Print values of test options like apply, cork, start, end so that
individual failed tests can be identified for manual run

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
---
 tools/testing/selftests/bpf/test_sockmap.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index e8faf4596dac..7970d48c4acb 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -869,6 +869,8 @@ static char *test_to_str(int test)
 #define OPTSTRING 60
 static void test_options(char *options)
 {
+	char tstr[OPTSTRING];
+
 	memset(options, 0, OPTSTRING);
 
 	if (txmsg_pass)
@@ -881,14 +883,22 @@ static void test_options(char *options)
 		strncat(options, "redir_noisy,", OPTSTRING);
 	if (txmsg_drop)
 		strncat(options, "drop,", OPTSTRING);
-	if (txmsg_apply)
-		strncat(options, "apply,", OPTSTRING);
-	if (txmsg_cork)
-		strncat(options, "cork,", OPTSTRING);
-	if (txmsg_start)
-		strncat(options, "start,", OPTSTRING);
-	if (txmsg_end)
-		strncat(options, "end,", OPTSTRING);
+	if (txmsg_apply) {
+		snprintf(tstr, OPTSTRING, "apply %d,", txmsg_apply);
+		strncat(options, tstr, OPTSTRING);
+	}
+	if (txmsg_cork) {
+		snprintf(tstr, OPTSTRING, "cork %d,", txmsg_cork);
+		strncat(options, tstr, OPTSTRING);
+	}
+	if (txmsg_start) {
+		snprintf(tstr, OPTSTRING, "start %d,", txmsg_start);
+		strncat(options, tstr, OPTSTRING);
+	}
+	if (txmsg_end) {
+		snprintf(tstr, OPTSTRING, "end %d,", txmsg_end);
+		strncat(options, tstr, OPTSTRING);
+	}
 	if (txmsg_ingress)
 		strncat(options, "ingress,", OPTSTRING);
 	if (txmsg_skb)
@@ -897,7 +907,7 @@ static void test_options(char *options)
 
 static int __test_exec(int cgrp, int test, struct sockmap_options *opt)
 {
-	char *options = calloc(60, sizeof(char));
+	char *options = calloc(OPTSTRING, sizeof(char));
 	int err;
 
 	if (test == SENDPAGE)
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 4/5] selftests/bpf: test_sockmap, fix data verification
From: Prashant Bhole @ 2018-05-18  7:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, John Fastabend
  Cc: Prashant Bhole, David S . Miller, Shuah Khan, netdev
In-Reply-To: <20180518071753.4768-1-bhole_prashant_q7@lab.ntt.co.jp>

When data verification is enabled, some tests fail because verification is done
incorrectly. Following changes fix it.

- Identify the size of data block to be verified
- Reset verification counter when data block size is reached
- Fixed the value printed in case of verification failure

Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
---
 tools/testing/selftests/bpf/test_sockmap.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index f79397513362..e8faf4596dac 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -337,8 +337,15 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt,
 		int fd_flags = O_NONBLOCK;
 		struct timeval timeout;
 		float total_bytes;
+		int bytes_cnt = 0;
+		int chunk_sz;
 		fd_set w;
 
+		if (opt->sendpage)
+			chunk_sz = iov_length * cnt;
+		else
+			chunk_sz = iov_length * iov_count;
+
 		fcntl(fd, fd_flags);
 		total_bytes = (float)iov_count * (float)iov_length * (float)cnt;
 		err = clock_gettime(CLOCK_MONOTONIC, &s->start);
@@ -388,9 +395,14 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt,
 							errno = -EIO;
 							fprintf(stderr,
 								"detected data corruption @iov[%i]:%i %02x != %02x, %02x ?= %02x\n",
-								i, j, d[j], k - 1, d[j+1], k + 1);
+								i, j, d[j], k - 1, d[j+1], k);
 							goto out_errno;
 						}
+						bytes_cnt++;
+						if (bytes_cnt == chunk_sz) {
+							k = 0;
+							bytes_cnt = 0;
+						}
 						recv--;
 					}
 				}
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 3/5] selftests/bpf: test_sockmap, fix test timeout
From: Prashant Bhole @ 2018-05-18  7:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, John Fastabend
  Cc: Prashant Bhole, David S . Miller, Shuah Khan, netdev
In-Reply-To: <20180518071753.4768-1-bhole_prashant_q7@lab.ntt.co.jp>

In order to reduce runtime of tests, recently timout for select() call
was reduced from 1sec to 10usec. This was causing many tests failures.
It was caught with failure handling commits in this series.

Restoring the timeout from 10usec to 1sec

Fixes: a18fda1a62c3 ("bpf: reduce runtime of test_sockmap tests")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
---
 tools/testing/selftests/bpf/test_sockmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 8a81ea0e9fb6..f79397513362 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -345,8 +345,8 @@ static int msg_loop(int fd, int iov_count, int iov_length, int cnt,
 		if (err < 0)
 			perror("recv start time: ");
 		while (s->bytes_recvd < total_bytes) {
-			timeout.tv_sec = 0;
-			timeout.tv_usec = 10;
+			timeout.tv_sec = 1;
+			timeout.tv_usec = 0;
 
 			/* FD sets */
 			FD_ZERO(&w);
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 2/5] selftests/bpf: test_sockmap, join cgroup in selftest mode
From: Prashant Bhole @ 2018-05-18  7:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, John Fastabend
  Cc: Prashant Bhole, David S . Miller, Shuah Khan, netdev
In-Reply-To: <20180518071753.4768-1-bhole_prashant_q7@lab.ntt.co.jp>

In case of selftest mode, temporary cgroup environment is created but
cgroup is not joined. It causes test failures. Fixed by joining the
cgroup

Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
---
 tools/testing/selftests/bpf/test_sockmap.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 34feb74c95c4..8a81ea0e9fb6 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -1342,6 +1342,11 @@ static int __test_suite(char *bpf_file)
 		return cg_fd;
 	}
 
+	if (join_cgroup(CG_PATH)) {
+		fprintf(stderr, "ERROR: failed to join cgroup\n");
+		return -EINVAL;
+	}
+
 	/* Tests basic commands and APIs with range of iov values */
 	txmsg_start = txmsg_end = 0;
 	err = test_txmsg(cg_fd);
-- 
2.14.3

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox