* [PATCH net-next 05/11] nfp: tls: count TSO segments separately for the TLS offload
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem
Cc: netdev, oss-drivers, alexei.starovoitov, Jakub Kicinski,
Dirk van der Merwe
In-Reply-To: <20190709025318.5534-1-jakub.kicinski@netronome.com>
Count the number of successfully submitted TLS segments,
not skbs. This will make it easier to compare the TLS
encryption count against other counters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
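The counting rule described above can be sketched in plain userspace C. This is only an illustration: `struct pkt_model` and both helper names are hypothetical stand-ins, not driver code, and the model assumes `gso_segs` is already populated for GSO packets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two skb properties the patch looks at. */
struct pkt_model {
	bool is_gso;       /* models skb_is_gso(skb) */
	uint16_t gso_segs; /* models skb_shinfo(skb)->gso_segs */
};

/* Segments one successfully submitted packet adds to the TLS TX counter. */
static uint64_t tls_tx_segs(const struct pkt_model *pkt)
{
	if (!pkt->is_gso)
		return 1;
	/* Model assumption: gso_segs is already populated for GSO packets. */
	assert(pkt->gso_segs > 0);
	return pkt->gso_segs;
}

/* Accumulate the counter over a burst of submitted packets. */
static uint64_t tls_tx_count(const struct pkt_model *pkts, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += tls_tx_segs(&pkts[i]);
	return total;
}
```

Counting this way, one non-GSO packet plus GSO packets of 4 and 2 segments add 7 to the counter rather than 3, which is what makes it comparable to per-segment counters.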
drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 270334427448..9a4421df9be9 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -880,7 +880,10 @@ nfp_net_tls_tx(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec,
if (datalen) {
u64_stats_update_begin(&r_vec->tx_sync);
- r_vec->hw_tls_tx++;
+ if (!skb_is_gso(skb))
+ r_vec->hw_tls_tx++;
+ else
+ r_vec->hw_tls_tx += skb_shinfo(skb)->gso_segs;
u64_stats_update_end(&r_vec->tx_sync);
}
--
2.21.0
* [PATCH net-next 04/11] nfp: ccm: increase message limits
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem
Cc: netdev, oss-drivers, alexei.starovoitov, Dirk van der Merwe,
Jakub Kicinski
In-Reply-To: <20190709025318.5534-1-jakub.kicinski@netronome.com>
From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Increase the batch limit to consume small message bursts more
effectively. In practice, the effect on the 'add' messages is not
significant, since the mailbox is sized such that the 'add' messages are
still limited to the same order of magnitude as the original limit.
Furthermore, increase the queue size limit to 1024 entries. This further
improves the handling of bursts of small control messages.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
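A back-of-the-envelope check of the new limits, as a minimal sketch. The helper name is hypothetical, and the calculation assumes every batch is filled to its limit, which is an idealization: as noted above, the mailbox size still caps how many 'add' messages fit.

```c
#include <assert.h>

/* Batches needed to drain qlen queued messages, batch_limit per batch. */
static unsigned int batches_to_drain(unsigned int qlen, unsigned int batch_limit)
{
	assert(batch_limit > 0);
	/* Round up: a partially filled last batch still costs one mailbox run. */
	return (qlen + batch_limit - 1) / batch_limit;
}
```

Under that assumption a full queue needs the same number of mailbox runs before and after the change (256 / 16 and 1024 / 64 are both 16), while holding four times as many small messages.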
drivers/net/ethernet/netronome/nfp/ccm_mbox.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
index d160ac794d98..f0783aa9e66e 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
+++ b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
@@ -13,7 +13,7 @@
* form a batch. Threads come in with CMSG formed in an skb, then
* enqueue that skb onto the request queue. If threads skb is first
* in queue this thread will handle the mailbox operation. It copies
- * up to 16 messages into the mailbox (making sure that both requests
+ * up to 64 messages into the mailbox (making sure that both requests
* and replies will fit. After FW is done processing the batch it
* copies the data out and wakes waiting threads.
* If a thread is waiting it either gets its the message completed
@@ -23,9 +23,9 @@
* to limit potential cache line bounces.
*/
-#define NFP_CCM_MBOX_BATCH_LIMIT 16
+#define NFP_CCM_MBOX_BATCH_LIMIT 64
#define NFP_CCM_TIMEOUT (NFP_NET_POLL_TIMEOUT * 1000)
-#define NFP_CCM_MAX_QLEN 256
+#define NFP_CCM_MAX_QLEN 1024
enum nfp_net_mbox_cmsg_state {
NFP_NET_MBOX_CMSG_STATE_QUEUED,
--
2.21.0
* [PATCH net-next 03/11] nfp: tls: use unique connection ids instead of 4-tuple for TX
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem
Cc: netdev, oss-drivers, alexei.starovoitov, Jakub Kicinski,
Dirk van der Merwe
In-Reply-To: <20190709025318.5534-1-jakub.kicinski@netronome.com>
Connection 4-tuple reuse is slightly problematic - the TLS socket
and context do not get destroyed until all the associated skbs have
left the system and all references are released. This leaves a
stale connection entry in the device, preventing the addition
of a new one if the 4-tuple is reused quickly enough.
Instead of using the real 4-tuple as the key, use a unique ID.
Set the protocol to TCP and the ports to 0 to ensure no collisions
with real connections.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
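The key layout described above can be modelled in userspace as follows. This is a sketch under assumptions, not the driver code: `conn_id_gen` and `assign_conn_id` are hypothetical names, and C11 atomics stand in for the kernel's atomic64_t. The address area is 8 bytes for IPv4 (two 4-byte addresses) and 32 bytes for IPv6.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-device generator; models nn->ktls_conn_id_gen. */
static _Atomic uint64_t conn_id_gen;

/*
 * Fill the address part of a connection key with a fresh ID and
 * zero whatever is left of it. Returns the ID that was written.
 */
static uint64_t assign_conn_id(uint8_t *addrs, size_t addr_len)
{
	/* fetch_add returns the old value; +1 mirrors atomic64_inc_return(). */
	uint64_t id = atomic_fetch_add(&conn_id_gen, 1) + 1;

	assert(addr_len >= sizeof(id));
	/* The ID is stored in host byte order, as a plain memcpy would. */
	memcpy(addrs, &id, sizeof(id));
	/* For IPv4 this clears 0 bytes, for IPv6 the remaining 24. */
	memset(addrs + sizeof(id), 0, addr_len - sizeof(id));
	return id;
}
```

Because the ID only ever grows, a stale entry left behind by an old socket cannot collide with the key of a new connection, even when the new connection reuses the same addresses and ports.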
.../net/ethernet/netronome/nfp/crypto/fw.h | 2 +
.../net/ethernet/netronome/nfp/crypto/tls.c | 43 +++++++++++++------
drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 ++
3 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/fw.h b/drivers/net/ethernet/netronome/nfp/crypto/fw.h
index 192ba907d91b..67413d946c4a 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/fw.h
+++ b/drivers/net/ethernet/netronome/nfp/crypto/fw.h
@@ -31,6 +31,8 @@ struct nfp_crypto_req_add_front {
u8 key_len;
__be16 ipver_vlan __packed;
u8 l4_proto;
+#define NFP_NET_TLS_NON_ADDR_KEY_LEN 8
+ u8 l3_addrs[0];
};
struct nfp_crypto_req_add_back {
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index b13b3dbd4843..b49405b4af55 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -155,17 +155,30 @@ nfp_net_tls_set_ipver_vlan(struct nfp_crypto_req_add_front *front, u8 ipver)
NFP_NET_TLS_VLAN_UNUSED));
}
+static void
+nfp_net_tls_assign_conn_id(struct nfp_net *nn,
+ struct nfp_crypto_req_add_front *front)
+{
+ u32 len;
+ u64 id;
+
+ id = atomic64_inc_return(&nn->ktls_conn_id_gen);
+ len = front->key_len - NFP_NET_TLS_NON_ADDR_KEY_LEN;
+
+ memcpy(front->l3_addrs, &id, sizeof(id));
+ memset(front->l3_addrs + sizeof(id), 0, len - sizeof(id));
+}
+
static struct nfp_crypto_req_add_back *
-nfp_net_tls_set_ipv4(struct nfp_crypto_req_add_v4 *req, struct sock *sk,
- int direction)
+nfp_net_tls_set_ipv4(struct nfp_net *nn, struct nfp_crypto_req_add_v4 *req,
+ struct sock *sk, int direction)
{
struct inet_sock *inet = inet_sk(sk);
req->front.key_len += sizeof(__be32) * 2;
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
- req->src_ip = inet->inet_saddr;
- req->dst_ip = inet->inet_daddr;
+ nfp_net_tls_assign_conn_id(nn, &req->front);
} else {
req->src_ip = inet->inet_daddr;
req->dst_ip = inet->inet_saddr;
@@ -175,8 +188,8 @@ nfp_net_tls_set_ipv4(struct nfp_crypto_req_add_v4 *req, struct sock *sk,
}
static struct nfp_crypto_req_add_back *
-nfp_net_tls_set_ipv6(struct nfp_crypto_req_add_v6 *req, struct sock *sk,
- int direction)
+nfp_net_tls_set_ipv6(struct nfp_net *nn, struct nfp_crypto_req_add_v6 *req,
+ struct sock *sk, int direction)
{
#if IS_ENABLED(CONFIG_IPV6)
struct ipv6_pinfo *np = inet6_sk(sk);
@@ -184,8 +197,7 @@ nfp_net_tls_set_ipv6(struct nfp_crypto_req_add_v6 *req, struct sock *sk,
req->front.key_len += sizeof(struct in6_addr) * 2;
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
- memcpy(req->src_ip, &np->saddr, sizeof(req->src_ip));
- memcpy(req->dst_ip, &sk->sk_v6_daddr, sizeof(req->dst_ip));
+ nfp_net_tls_assign_conn_id(nn, &req->front);
} else {
memcpy(req->src_ip, &sk->sk_v6_daddr, sizeof(req->src_ip));
memcpy(req->dst_ip, &np->saddr, sizeof(req->dst_ip));
@@ -205,8 +217,8 @@ nfp_net_tls_set_l4(struct nfp_crypto_req_add_front *front,
front->l4_proto = IPPROTO_TCP;
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
- back->src_port = inet->inet_sport;
- back->dst_port = inet->inet_dport;
+ back->src_port = 0;
+ back->dst_port = 0;
} else {
back->src_port = inet->inet_dport;
back->dst_port = inet->inet_sport;
@@ -260,6 +272,7 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
struct nfp_crypto_reply_add *reply;
struct sk_buff *skb;
size_t req_sz;
+ void *req;
bool ipv6;
int err;
@@ -302,16 +315,17 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
front = (void *)skb->data;
front->ep_id = 0;
- front->key_len = 8;
+ front->key_len = NFP_NET_TLS_NON_ADDR_KEY_LEN;
front->opcode = nfp_tls_1_2_dir_to_opcode(direction);
memset(front->resv, 0, sizeof(front->resv));
nfp_net_tls_set_ipver_vlan(front, ipv6 ? 6 : 4);
+ req = (void *)skb->data;
if (ipv6)
- back = nfp_net_tls_set_ipv6((void *)skb->data, sk, direction);
+ back = nfp_net_tls_set_ipv6(nn, req, sk, direction);
else
- back = nfp_net_tls_set_ipv4((void *)skb->data, sk, direction);
+ back = nfp_net_tls_set_ipv4(nn, req, sk, direction);
nfp_net_tls_set_l4(front, back, sk, direction);
@@ -329,7 +343,8 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
err = nfp_ccm_mbox_communicate(nn, skb, NFP_CCM_TYPE_CRYPTO_ADD,
sizeof(*reply), sizeof(*reply));
if (err) {
- nn_dp_warn(&nn->dp, "failed to add TLS: %d\n", err);
+ nn_dp_warn(&nn->dp, "failed to add TLS: %d (%d)\n",
+ err, direction == TLS_OFFLOAD_CTX_DIR_TX);
/* communicate frees skb on error */
goto err_conn_remove;
}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 0659756bf2bb..5d6c3738b494 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -583,6 +583,7 @@ struct nfp_net_dp {
* @tlv_caps: Parsed TLV capabilities
* @ktls_tx_conn_cnt: Number of offloaded kTLS TX connections
* @ktls_rx_conn_cnt: Number of offloaded kTLS RX connections
+ * @ktls_conn_id_gen: Trivial generator for kTLS connection ids (for TX)
* @ktls_no_space: Counter of firmware rejecting kTLS connection due to
* lack of space
* @mbox_cmsg: Common Control Message via vNIC mailbox state
@@ -670,6 +671,8 @@ struct nfp_net {
unsigned int ktls_tx_conn_cnt;
unsigned int ktls_rx_conn_cnt;
+ atomic64_t ktls_conn_id_gen;
+
atomic_t ktls_no_space;
struct {
--
2.21.0
* [PATCH net-next 02/11] nfp: tls: move setting ipver_vlan to a helper
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem
Cc: netdev, oss-drivers, alexei.starovoitov, Jakub Kicinski,
Dirk van der Merwe
In-Reply-To: <20190709025318.5534-1-jakub.kicinski@netronome.com>
Long lines are ugly. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
drivers/net/ethernet/netronome/nfp/crypto/tls.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index 086bea0a7f2d..b13b3dbd4843 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -147,6 +147,14 @@ static void nfp_net_tls_del_fw(struct nfp_net *nn, __be32 *fw_handle)
NFP_CCM_TYPE_CRYPTO_DEL);
}
+static void
+nfp_net_tls_set_ipver_vlan(struct nfp_crypto_req_add_front *front, u8 ipver)
+{
+ front->ipver_vlan = cpu_to_be16(FIELD_PREP(NFP_NET_TLS_IPVER, ipver) |
+ FIELD_PREP(NFP_NET_TLS_VLAN,
+ NFP_NET_TLS_VLAN_UNUSED));
+}
+
static struct nfp_crypto_req_add_back *
nfp_net_tls_set_ipv4(struct nfp_crypto_req_add_v4 *req, struct sock *sk,
int direction)
@@ -154,9 +162,6 @@ nfp_net_tls_set_ipv4(struct nfp_crypto_req_add_v4 *req, struct sock *sk,
struct inet_sock *inet = inet_sk(sk);
req->front.key_len += sizeof(__be32) * 2;
- req->front.ipver_vlan = cpu_to_be16(FIELD_PREP(NFP_NET_TLS_IPVER, 4) |
- FIELD_PREP(NFP_NET_TLS_VLAN,
- NFP_NET_TLS_VLAN_UNUSED));
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
req->src_ip = inet->inet_saddr;
@@ -177,9 +182,6 @@ nfp_net_tls_set_ipv6(struct nfp_crypto_req_add_v6 *req, struct sock *sk,
struct ipv6_pinfo *np = inet6_sk(sk);
req->front.key_len += sizeof(struct in6_addr) * 2;
- req->front.ipver_vlan = cpu_to_be16(FIELD_PREP(NFP_NET_TLS_IPVER, 6) |
- FIELD_PREP(NFP_NET_TLS_VLAN,
- NFP_NET_TLS_VLAN_UNUSED));
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
memcpy(req->src_ip, &np->saddr, sizeof(req->src_ip));
@@ -304,6 +306,8 @@ nfp_net_tls_add(struct net_device *netdev, struct sock *sk,
front->opcode = nfp_tls_1_2_dir_to_opcode(direction);
memset(front->resv, 0, sizeof(front->resv));
+ nfp_net_tls_set_ipver_vlan(front, ipv6 ? 6 : 4);
+
if (ipv6)
back = nfp_net_tls_set_ipv6((void *)skb->data, sk, direction);
else
--
2.21.0
* [PATCH net-next 01/11] nfp: tls: ignore queue limits for delete commands
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem
Cc: netdev, oss-drivers, alexei.starovoitov, Jakub Kicinski,
Dirk van der Merwe
In-Reply-To: <20190709025318.5534-1-jakub.kicinski@netronome.com>
We need to do our best not to drop delete commands, otherwise
we will have stale entries in the connection table. Ignore
the control message queue limits for delete commands.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
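The queue-limit rule described above, reduced to a userspace sketch. `struct mbox_queue_model`, `mbox_enqueue` and `MODEL_MAX_QLEN` are hypothetical names; the limit of 256 is the value NFP_CCM_MAX_QLEN has at this point in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_QLEN 256 /* NFP_CCM_MAX_QLEN at this point in the series */

/* Hypothetical stand-in for the mailbox request queue. */
struct mbox_queue_model {
	unsigned int qlen;
};

/* Queue one message; only non-critical messages are subject to the limit. */
static int mbox_enqueue(struct mbox_queue_model *q, bool critical)
{
	assert(q != NULL);
	if (!critical && q->qlen >= MODEL_MAX_QLEN)
		return -EBUSY;
	q->qlen++;
	return 0;
}
```

With the queue full, an ordinary request is still refused with -EBUSY, while a delete command passed as critical is queued anyway, so the connection table entry gets cleaned up.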
drivers/net/ethernet/netronome/nfp/ccm.h | 4 +++
drivers/net/ethernet/netronome/nfp/ccm_mbox.c | 25 +++++++++++++------
.../net/ethernet/netronome/nfp/crypto/tls.c | 5 ++--
3 files changed, 24 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/ccm.h b/drivers/net/ethernet/netronome/nfp/ccm.h
index da1b1e20df51..a460c75522be 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm.h
+++ b/drivers/net/ethernet/netronome/nfp/ccm.h
@@ -118,6 +118,10 @@ bool nfp_ccm_mbox_fits(struct nfp_net *nn, unsigned int size);
struct sk_buff *
nfp_ccm_mbox_msg_alloc(struct nfp_net *nn, unsigned int req_size,
unsigned int reply_size, gfp_t flags);
+int __nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
+ enum nfp_ccm_type type,
+ unsigned int reply_size,
+ unsigned int max_reply_size, bool critical);
int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
enum nfp_ccm_type type,
unsigned int reply_size,
diff --git a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
index 02fccd90961d..d160ac794d98 100644
--- a/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
+++ b/drivers/net/ethernet/netronome/nfp/ccm_mbox.c
@@ -515,13 +515,13 @@ nfp_ccm_mbox_msg_prepare(struct nfp_net *nn, struct sk_buff *skb,
static int
nfp_ccm_mbox_msg_enqueue(struct nfp_net *nn, struct sk_buff *skb,
- enum nfp_ccm_type type)
+ enum nfp_ccm_type type, bool critical)
{
struct nfp_ccm_hdr *hdr;
assert_spin_locked(&nn->mbox_cmsg.queue.lock);
- if (nn->mbox_cmsg.queue.qlen >= NFP_CCM_MAX_QLEN) {
+ if (!critical && nn->mbox_cmsg.queue.qlen >= NFP_CCM_MAX_QLEN) {
nn_dp_warn(&nn->dp, "mailbox request queue too long\n");
return -EBUSY;
}
@@ -536,10 +536,10 @@ nfp_ccm_mbox_msg_enqueue(struct nfp_net *nn, struct sk_buff *skb,
return 0;
}
-int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
- enum nfp_ccm_type type,
- unsigned int reply_size,
- unsigned int max_reply_size)
+int __nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
+ enum nfp_ccm_type type,
+ unsigned int reply_size,
+ unsigned int max_reply_size, bool critical)
{
int err;
@@ -550,7 +550,7 @@ int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
spin_lock_bh(&nn->mbox_cmsg.queue.lock);
- err = nfp_ccm_mbox_msg_enqueue(nn, skb, type);
+ err = nfp_ccm_mbox_msg_enqueue(nn, skb, type, critical);
if (err)
goto err_unlock;
@@ -594,6 +594,15 @@ int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
return err;
}
+int nfp_ccm_mbox_communicate(struct nfp_net *nn, struct sk_buff *skb,
+ enum nfp_ccm_type type,
+ unsigned int reply_size,
+ unsigned int max_reply_size)
+{
+ return __nfp_ccm_mbox_communicate(nn, skb, type, reply_size,
+ max_reply_size, false);
+}
+
static void nfp_ccm_mbox_post_runq_work(struct work_struct *work)
{
struct sk_buff *skb;
@@ -650,7 +659,7 @@ int nfp_ccm_mbox_post(struct nfp_net *nn, struct sk_buff *skb,
spin_lock_bh(&nn->mbox_cmsg.queue.lock);
- err = nfp_ccm_mbox_msg_enqueue(nn, skb, type);
+ err = nfp_ccm_mbox_msg_enqueue(nn, skb, type, false);
if (err)
goto err_unlock;
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
index 9f7ccb7da417..086bea0a7f2d 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c
@@ -112,8 +112,9 @@ nfp_net_tls_communicate_simple(struct nfp_net *nn, struct sk_buff *skb,
struct nfp_crypto_reply_simple *reply;
int err;
- err = nfp_ccm_mbox_communicate(nn, skb, type,
- sizeof(*reply), sizeof(*reply));
+ err = __nfp_ccm_mbox_communicate(nn, skb, type,
+ sizeof(*reply), sizeof(*reply),
+ type == NFP_CCM_TYPE_CRYPTO_DEL);
if (err) {
nn_dp_warn(&nn->dp, "failed to %s TLS: %d\n", name, err);
return err;
--
2.21.0
* [PATCH net-next 00/11] nfp: tls: fixes for initial TLS support
From: Jakub Kicinski @ 2019-07-09 2:53 UTC
To: davem; +Cc: netdev, oss-drivers, alexei.starovoitov, Jakub Kicinski
Hi!
This series brings various fixes to the nfp TLS offload recently added
to net-next.
The first 4 patches revolve around device mailbox communication, trying
to make it more reliable. The next patch fixes a statistical counter.
Patch 6 improves the TX resync if device communication failed.
Patch 7 makes sure we remove keys from memory after talking to FW.
Patch 8 adds missing TLS context initialization: we fill in the
context information from various places based on the configuration,
and it looks like we missed the init in the case where TX is
offloaded but RX wasn't initialized yet. Patches 9 and 10 make
the nfp driver undo TLS state changes if we need to drop the
frame (e.g. due to a DMA mapping error).
Last but not least, TLS fallback should not adjust socket memory
after skb_orphan_partial(). This code will go away once we forbid
orphaning of skbs in need of crypto, but that's "real" -next
material, so let's do a quick fix.
Dirk van der Merwe (2):
nfp: ccm: increase message limits
net/tls: don't clear TX resync flag on error
Jakub Kicinski (9):
nfp: tls: ignore queue limits for delete commands
nfp: tls: move setting ipver_vlan to a helper
nfp: tls: use unique connection ids instead of 4-tuple for TX
nfp: tls: count TSO segments separately for the TLS offload
nfp: tls: don't leave key material in freed FW cmsg skbs
net/tls: add missing prot info init
nfp: tls: avoid one of the ifdefs for TLS
nfp: tls: undo TLS sequence tracking when dropping the frame
net/tls: fix socket wmem accounting on fallback with netem
.../mellanox/mlx5/core/en_accel/tls.c | 8 +-
drivers/net/ethernet/netronome/nfp/ccm.h | 4 +
drivers/net/ethernet/netronome/nfp/ccm_mbox.c | 31 ++++---
.../net/ethernet/netronome/nfp/crypto/fw.h | 2 +
.../net/ethernet/netronome/nfp/crypto/tls.c | 93 +++++++++++++------
drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 +
.../ethernet/netronome/nfp/nfp_net_common.c | 32 ++++++-
include/net/tls.h | 6 +-
net/tls/tls_device.c | 10 +-
net/tls/tls_device_fallback.c | 4 +
10 files changed, 143 insertions(+), 50 deletions(-)
--
2.21.0
* Re: [RFC v2] vhost: introduce mdev based hardware vhost backend
From: Jason Wang @ 2019-07-09 2:50 UTC
To: Tiwei Bie, Alex Williamson
Cc: mst, maxime.coquelin, linux-kernel, kvm, virtualization, netdev,
dan.daly, cunming.liang, zhihong.wang, idos, Rob Miller,
Ariel Adam
In-Reply-To: <20190708061625.GA15936@___>
On 2019/7/8 2:16 PM, Tiwei Bie wrote:
> On Fri, Jul 05, 2019 at 08:49:46AM -0600, Alex Williamson wrote:
>> On Thu, 4 Jul 2019 14:21:34 +0800
>> Tiwei Bie <tiwei.bie@intel.com> wrote:
>>> On Thu, Jul 04, 2019 at 12:31:48PM +0800, Jason Wang wrote:
>>>> On 2019/7/3 9:08 PM, Tiwei Bie wrote:
>>>>> On Wed, Jul 03, 2019 at 08:16:23PM +0800, Jason Wang wrote:
>>>>>> On 2019/7/3 7:52 PM, Tiwei Bie wrote:
>>>>>>> On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote:
>>>>>>>> On 2019/7/3 5:13 PM, Tiwei Bie wrote:
>>>>>>>>> Details about this can be found here:
>>>>>>>>>
>>>>>>>>> https://lwn.net/Articles/750770/
>>>>>>>>>
>>>>>>>>> What's new in this version
>>>>>>>>> ==========================
>>>>>>>>>
>>>>>>>>> A new VFIO device type is introduced - vfio-vhost. This addressed
>>>>>>>>> some comments from here:https://patchwork.ozlabs.org/cover/984763/
>>>>>>>>>
>>>>>>>>> Below is the updated device interface:
>>>>>>>>>
>>>>>>>>> Currently, there are two regions of this device: 1) CONFIG_REGION
>>>>>>>>> (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to setup the
>>>>>>>>> device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which
>>>>>>>>> can be used to notify the device.
>>>>>>>>>
>>>>>>>>> 1. CONFIG_REGION
>>>>>>>>>
>>>>>>>>> The region described by CONFIG_REGION is the main control interface.
>>>>>>>>> Messages will be written to or read from this region.
>>>>>>>>>
>>>>>>>>> The message type is determined by the `request` field in message
>>>>>>>>> header. The message size is encoded in the message header too.
>>>>>>>>> The message format looks like this:
>>>>>>>>>
>>>>>>>>> struct vhost_vfio_op {
>>>>>>>>> __u64 request;
>>>>>>>>> __u32 flags;
>>>>>>>>> /* Flag values: */
>>>>>>>>> #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
>>>>>>>>> __u32 size;
>>>>>>>>> union {
>>>>>>>>> __u64 u64;
>>>>>>>>> struct vhost_vring_state state;
>>>>>>>>> struct vhost_vring_addr addr;
>>>>>>>>> } payload;
>>>>>>>>> };
>>>>>>>>>
>>>>>>>>> The existing vhost-kernel ioctl cmds are reused as the message
>>>>>>>>> requests in above structure.
>>>>>>>> Still the same comment as for V1: what's the advantage of inventing a new protocol?
>>>>>>> I'm trying to make it work in VFIO's way..
>>>>>>>
>>>>>>>> I believe either of the following should be better:
>>>>>>>>
>>>>>>>> - using vhost ioctl, we can start from SET_VRING_KICK/SET_VRING_CALL and
>>>>>>>> extend it with e.g. a notify region. The advantage is that all existing userspace
>>>>>>>> programs could be reused without modification (or with minimal modification). And
>>>>>>>> the vhost API hides lots of details that do not need to be understood by the
>>>>>>>> application (e.g. in the case of containers).
>>>>>>> Do you mean reusing vhost's ioctl on VFIO device fd directly,
>>>>>>> or introducing another mdev driver (i.e. vhost_mdev instead of
>>>>>>> using the existing vfio_mdev) for mdev device?
>>>>>> Can we simply add them into ioctl of mdev_parent_ops?
>>>>> Right, either way, these ioctls have to be and just need to be
>>>>> added in the ioctl of the mdev_parent_ops. But another thing we
>>>>> also need to consider is that which file descriptor the userspace
>>>>> will do the ioctl() on. So I'm wondering do you mean let the
>>>>> userspace do the ioctl() on the VFIO device fd of the mdev
>>>>> device?
>>>>>
>>>> Yes.
>>> Got it! I'm not sure what Alex's opinion on this is. If we all
>>> agree with this, I can do it in this way.
>>>
>>>> Is there any other way btw?
>>> Just a quick thought.. Maybe totally a bad idea. I was thinking
>>> whether it would be odd to do non-VFIO's ioctls on VFIO's device
>>> fd. So I was wondering whether it's possible to allow binding
>>> another mdev driver (e.g. vhost_mdev) to the supported mdev
>>> devices. The new mdev driver, vhost_mdev, can provide similar
>>> ways to let userspace open the mdev device and do the vhost ioctls
>>> on it. To distinguish with the vfio_mdev compatible mdev devices,
>>> the device API of the new vhost_mdev compatible mdev devices
>>> might be e.g. "vhost-net" for net?
>>>
>>> So in VFIO case, the device will be for passthru directly. And
>>> in VHOST case, the device can be used to accelerate the existing
>>> virtualized devices.
>>>
>>> What do you think?
>> VFIO really can't prevent vendor specific ioctls on the device file
>> descriptor for mdevs, but a) we'd want to be sure the ioctl address
>> space can't collide with ioctls we'd use for vfio defined purposes and
>> b) maybe the VFIO user API isn't what you want in the first place if
>> you intend to mostly/entirely ignore the defined ioctl set and replace
>> them with your own. In the case of the latter, you're also not getting
>> the advantages of the existing VFIO userspace code, so why expose a
>> VFIO device at all.
> Yeah, I totally agree.
I guess the original idea is to reuse the VFIO DMA/IOMMU API for this.
Then we have the chance to reuse the vfio code in qemu for dealing with e.g.
vIOMMU.
>
>> The mdev interface does provide a general interface for creating and
>> managing virtual devices, vfio-mdev is just one driver on the mdev
>> bus. Parav (Mellanox) has been doing work on mdev-core to help clean
>> out vfio-isms from the interface, aiui, with the intent of implementing
>> another mdev bus driver for using the devices within the kernel.
> Great to know this! I found below series after some searching:
>
> https://lkml.org/lkml/2019/3/8/821
>
> In above series, the new mlx5_core mdev driver will do the probe
> by calling mlx5_get_core_dev() first on the parent device of the
> mdev device. In vhost_mdev, maybe we can also keep track of all
> the compatible mdev devices and use this info to do the probe.
I don't get why this is needed. My understanding is that if we want to go
this way, there are actually two parts: 1) vhost mdev, which implements the
device management and the vhost ioctls; 2) vhost itself, which can accept
an mdev fd as its backend through VHOST_NET_SET_BACKEND.
> But we also need a way to allow vfio_mdev driver to distinguish
> and reject the incompatible mdev devices.
One issue for this series is that it doesn't consider DMA isolation at all.
>
>> It
>> seems like this vhost-mdev driver might be similar, using mdev but not
>> necessarily vfio-mdev to expose devices. Thanks,
> Yeah, I also think so!
I've CCed some driver developers for their input. I think we need a
sample parent driver in the next version so that we can understand the full
picture.
Thanks
>
> Thanks!
> Tiwei
>
>> Alex
* Re: [PATCH 1/4] dt-bindings: allow up to four clocks for orion-mdio
From: Andrew Lunn @ 2019-07-09 2:41 UTC
To: Rob Herring; +Cc: josua, netdev, stable, David S. Miller, Mark Rutland
In-Reply-To: <CAL_JsqJJA6=2b=VzDzS1ipOatpRuVBUmReYoOMf-9p39=jyF8Q@mail.gmail.com>
> > Optional properties:
> > - interrupts: interrupt line number for the SMI error/done interrupt
> > -- clocks: phandle for up to three required clocks for the MDIO instance
> > +- clocks: phandle for up to four required clocks for the MDIO instance
>
> This needs to enumerate exactly what the clocks are. Shouldn't there
> be an additional clock-names value too?
Hi Rob
The driver does not care what they are called. It just turns them all
on, and turns them off again when removed.
Andrew
* Re: [PATCH net-next v2] skbuff: increase verbosity when dumping skb data
From: David Miller @ 2019-07-09 2:39 UTC
To: willemdebruijn.kernel; +Cc: netdev, linyunsheng, willemb
In-Reply-To: <20190707095155.58578-1-willemdebruijn.kernel@gmail.com>
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Sun, 7 Jul 2019 05:51:55 -0400
> From: Willem de Bruijn <willemb@google.com>
>
> skb_warn_bad_offload and netdev_rx_csum_fault trigger on hard to debug
> issues. Dump more state and the header.
>
> Optionally dump the entire packet and linear segment. This is required
> to debug checksum bugs that may include bytes past skb_tail_pointer().
>
> Both call sites call this function inside a net_ratelimit() block.
> Limit full packet log further to a hard limit of can_dump_full (5).
>
> Based on an earlier patch by Cong Wang, see link below.
>
> Changes v1 -> v2
> - dump frag_list only on full_pkt
>
> Link: https://patchwork.ozlabs.org/patch/1000841/
> Signed-off-by: Willem de Bruijn <willemb@google.com>
Nice to finally have this, applied.
* Re: [PATCH net-next] ipv6: elide flowlabel check if no exclusive leases exist
From: David Miller @ 2019-07-09 2:38 UTC
To: willemdebruijn.kernel; +Cc: netdev, willemb
In-Reply-To: <20190707093445.15121-1-willemdebruijn.kernel@gmail.com>
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Sun, 7 Jul 2019 05:34:45 -0400
> From: Willem de Bruijn <willemb@google.com>
>
> Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO.
> If not set, by default an autogenerated flowlabel is selected.
>
> Explicit flowlabels require a control operation per label plus a
> datapath check on every connection (every datagram if unconnected).
> This is particularly expensive on unconnected sockets multiplexing
> many flows, such as QUIC.
>
> In the common case, where no lease is exclusive, the check can be
> safely elided, as both lease request and check trivially succeed.
> Indeed, autoflowlabel does the same even with exclusive leases.
>
> Elide the check if no process has requested an exclusive lease.
>
> fl6_sock_lookup previously returned either a reference to a lease or
> NULL to denote failure. Modify it to return a real error and update
> all callers. On a NULL return, they can use the label and will elide
> the atomic_dec in fl6_sock_release.
>
> This is an optimization. Robust applications still have to revert to
> requesting leases if the fast path fails due to an exclusive lease.
>
> Changes RFC->v1:
> - use static_key_false_deferred to rate limit jump label operations
> - call static_key_deferred_flush to stop timers on exit
> - move decrement out of RCU context
> - defer optimization also if opt data is associated with a lease
> - updated all fl6_sock_lookup callers, not just udp
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
Looks good, applied, thanks Willem.
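
The three-way result of the lookup described in the quoted commit message can be illustrated with a small userspace model. Everything here is hypothetical: the struct and function names are made up, an int return plus an out-parameter stands in for the kernel's pointer-or-error convention, and -ENOENT is an assumed error code that the message above does not name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a flowlabel lease held by a socket. */
struct lease_model {
	uint32_t label;
};

/* Hypothetical stand-in for the state the lookup consults. */
struct flowlabel_state_model {
	unsigned int exclusive_leases;    /* > 0 means the slow path is needed */
	const struct lease_model *leases; /* leases held by this socket */
	size_t n_leases;
};

/*
 * Returns 0 with *out == NULL when the check is elided (fast path),
 * 0 with *out set when a lease was found, and -ENOENT when an
 * exclusive lease exists somewhere and this socket holds no lease.
 */
static int lease_lookup(const struct flowlabel_state_model *st,
			uint32_t label, const struct lease_model **out)
{
	size_t i;

	assert(st != NULL && out != NULL);
	*out = NULL;
	if (st->exclusive_leases == 0)
		return 0; /* nobody asked for exclusivity: skip the check */

	for (i = 0; i < st->n_leases; i++) {
		if (st->leases[i].label == label) {
			*out = &st->leases[i];
			return 0;
		}
	}
	return -ENOENT;
}
```

A caller that gets 0 with a NULL lease simply uses the label and has no reference to drop later, which matches the elided atomic_dec mentioned in the quoted text.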
* [PATCH 2/2 net-next] net: stmmac: add support for hash table size 128/256 in dwmac4
From: Biao Huang @ 2019-07-09 2:36 UTC
To: davem, Jose Abreu, andrew
Cc: Giuseppe Cavallaro, Alexandre Torgue, Maxime Coquelin,
Matthias Brugger, netdev, linux-stm32, linux-arm-kernel,
linux-kernel, linux-mediatek, yt.shen, biao.huang, jianguo.zhang,
boon.leong.ong
In-Reply-To: <20190709023623.8358-1-biao.huang@mediatek.com>
1. Get the hash table size from the HW feature register, and add support
for larger hash tables (128/256) in dwmac4.
2. Only clear the GMAC_PACKET_FILTER bits used in this function,
to avoid side effects on the functions of the other bits.
stmmac selftests output log with flow control on:
ethtool -t eth0
The test result is PASS
The test extra info:
1. MAC Loopback 0
2. PHY Loopback -95
3. MMC Counters 0
4. EEE -95
5. Hash Filter MC 0
6. Perfect Filter UC 0
7. MC Filter 0
8. UC Filter 0
9. Flow Control 0
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
---
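The arithmetic behind the larger hash table can be sketched in userspace C. The helper names are hypothetical. The field position (bits 25:24) and the register offsets follow the definitions in this patch, but the encoding of the field (0 = no hash table, 1 = 64 bins, 2 = 128, 3 = 256) is an assumption that is not shown in the commit message above.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_TB_SZ_SHIFT 24
#define HASH_TB_SZ_MASK  (UINT32_C(0x3) << HASH_TB_SZ_SHIFT) /* bits 25:24 */

/* Extract the hash table size field from a HW feature1 register value. */
static unsigned int hash_tb_sz_field(uint32_t hw_feature1)
{
	return (hw_feature1 & HASH_TB_SZ_MASK) >> HASH_TB_SZ_SHIFT;
}

/* ASSUMED encoding: 0 = no hash table, 1 = 64 bins, 2 = 128, 3 = 256. */
static unsigned int hash_bins(unsigned int field)
{
	assert(field <= 3); /* the field is two bits wide */
	return field ? 32u << field : 0;
}

/* One 32-bit hash register per 32 bins, as in numhashregs. */
static unsigned int hash_regs(unsigned int bins)
{
	return bins >> 5;
}

/* Register offset, mirroring the GMAC_HASH_TAB(x) macro. */
static unsigned int hash_tab_offset(unsigned int x)
{
	return 0x10 + x * 4;
}
```

Under that assumed encoding a 256-bin table needs 8 registers, which is consistent with mc_filter growing from 2 to 8 words, and registers 0 and 1 land on the old GMAC_HASH_TAB_0_31 and GMAC_HASH_TAB_32_63 offsets.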
drivers/net/ethernet/stmicro/stmmac/common.h | 7 +--
drivers/net/ethernet/stmicro/stmmac/dwmac4.h | 4 +-
.../net/ethernet/stmicro/stmmac/dwmac4_core.c | 49 +++++++++++--------
.../net/ethernet/stmicro/stmmac/dwmac4_dma.c | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++
5 files changed, 42 insertions(+), 25 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 2403a65167b2..af91e6b15eaa 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -325,6 +325,7 @@ struct dma_features {
/* 802.3az - Energy-Efficient Ethernet (EEE) */
unsigned int eee;
unsigned int av;
+ unsigned int hash_tb_sz;
unsigned int tsoen;
/* TX and RX csum */
unsigned int tx_coe;
@@ -423,9 +424,9 @@ struct mac_device_info {
struct mii_regs mii; /* MII register Addresses */
struct mac_link link;
void __iomem *pcsr; /* vpointer to device CSRs */
- int multicast_filter_bins;
- int unicast_filter_entries;
- int mcast_bits_log2;
+ unsigned int multicast_filter_bins;
+ unsigned int unicast_filter_entries;
+ unsigned int mcast_bits_log2;
unsigned int rx_csum;
unsigned int pcs;
unsigned int pmt;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
index 15a9f3c7cc6a..2ed11a581d80 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
@@ -15,8 +15,7 @@
/* MAC registers */
#define GMAC_CONFIG 0x00000000
#define GMAC_PACKET_FILTER 0x00000008
-#define GMAC_HASH_TAB_0_31 0x00000010
-#define GMAC_HASH_TAB_32_63 0x00000014
+#define GMAC_HASH_TAB(x) (0x10 + (x) * 4)
#define GMAC_RX_FLOW_CTRL 0x00000090
#define GMAC_QX_TX_FLOW_CTRL(x) (0x70 + x * 4)
#define GMAC_TXQ_PRTY_MAP0 0x98
@@ -181,6 +180,7 @@ enum power_event {
#define GMAC_HW_FEAT_MIISEL BIT(0)
/* MAC HW features1 bitmap */
+#define GMAC_HW_HASH_TB_SZ GENMASK(25, 24)
#define GMAC_HW_FEAT_AVSEL BIT(20)
#define GMAC_HW_TSOEN BIT(18)
#define GMAC_HW_TXFIFOSIZE GENMASK(10, 6)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index 776077ec1a23..01c2e2d83e76 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -400,41 +400,50 @@ static void dwmac4_set_filter(struct mac_device_info *hw,
struct net_device *dev)
{
void __iomem *ioaddr = (void __iomem *)dev->base_addr;
- unsigned int value = 0;
+ int numhashregs = (hw->multicast_filter_bins >> 5);
+ int mcbitslog2 = hw->mcast_bits_log2;
+ unsigned int value;
+ int i;
+ value = readl(ioaddr + GMAC_PACKET_FILTER);
+ value &= ~GMAC_PACKET_FILTER_HMC;
+ value &= ~GMAC_PACKET_FILTER_HPF;
+ value &= ~GMAC_PACKET_FILTER_PCF;
+ value &= ~GMAC_PACKET_FILTER_PM;
+ value &= ~GMAC_PACKET_FILTER_PR;
if (dev->flags & IFF_PROMISC) {
value = GMAC_PACKET_FILTER_PR | GMAC_PACKET_FILTER_PCF;
} else if ((dev->flags & IFF_ALLMULTI) ||
- (netdev_mc_count(dev) > HASH_TABLE_SIZE)) {
+ (netdev_mc_count(dev) > hw->multicast_filter_bins)) {
/* Pass all multi */
- value = GMAC_PACKET_FILTER_PM;
- /* Set the 64 bits of the HASH tab. To be updated if taller
- * hash table is used
- */
- writel(0xffffffff, ioaddr + GMAC_HASH_TAB_0_31);
- writel(0xffffffff, ioaddr + GMAC_HASH_TAB_32_63);
+ value |= GMAC_PACKET_FILTER_PM;
+ /* Set all the bits of the HASH tab */
+ for (i = 0; i < numhashregs; i++)
+ writel(0xffffffff, ioaddr + GMAC_HASH_TAB(i));
} else if (!netdev_mc_empty(dev)) {
- u32 mc_filter[2];
struct netdev_hw_addr *ha;
+ u32 mc_filter[8];
/* Hash filter for multicast */
- value = GMAC_PACKET_FILTER_HMC;
+ value |= GMAC_PACKET_FILTER_HMC;
memset(mc_filter, 0, sizeof(mc_filter));
netdev_for_each_mc_addr(ha, dev) {
- /* The upper 6 bits of the calculated CRC are used to
- * index the content of the Hash Table Reg 0 and 1.
+ /* The upper n bits of the calculated CRC are used to
+ * index the contents of the hash table. The number of
+ * bits used depends on the hardware configuration
+ * selected at core configuration time.
*/
- int bit_nr =
- (bitrev32(~crc32_le(~0, ha->addr, 6)) >> 26);
- /* The most significant bit determines the register
- * to use while the other 5 bits determines the bit
- * within the selected register
+ int bit_nr = bitrev32(~crc32_le(~0, ha->addr,
+ ETH_ALEN)) >> (32 - mcbitslog2);
+ /* The most significant bit determines the register to
+ * use (H/L) while the other 5 bits determine the bit
+ * within the register.
*/
- mc_filter[bit_nr >> 5] |= (1 << (bit_nr & 0x1F));
+ mc_filter[bit_nr >> 5] |= (1 << (bit_nr & 0x1f));
}
- writel(mc_filter[0], ioaddr + GMAC_HASH_TAB_0_31);
- writel(mc_filter[1], ioaddr + GMAC_HASH_TAB_32_63);
+ for (i = 0; i < numhashregs; i++)
+ writel(mc_filter[i], ioaddr + GMAC_HASH_TAB(i));
}
value |= GMAC_PACKET_FILTER_HPF;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 0f208e13da9f..6af79fd65ef7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -351,6 +351,7 @@ static void dwmac4_get_hw_feature(void __iomem *ioaddr,
/* MAC HW feature1 */
hw_cap = readl(ioaddr + GMAC_HW_FEATURE1);
+ dma_cap->hash_tb_sz = (hw_cap & GMAC_HW_HASH_TB_SZ) >> 24;
dma_cap->av = (hw_cap & GMAC_HW_FEAT_AVSEL) >> 20;
dma_cap->tsoen = (hw_cap & GMAC_HW_TSOEN) >> 18;
/* RX and TX FIFO sizes are encoded as log2(n / 128). Undo that by
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3425d4dda03d..3a04ace0379a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4174,6 +4174,12 @@ static int stmmac_hw_init(struct stmmac_priv *priv)
priv->plat->enh_desc = priv->dma_cap.enh_desc;
priv->plat->pmt = priv->dma_cap.pmt_remote_wake_up;
priv->hw->pmt = priv->plat->pmt;
+ if (priv->dma_cap.hash_tb_sz) {
+ priv->hw->multicast_filter_bins =
+ (BIT(priv->dma_cap.hash_tb_sz) << 5);
+ priv->hw->mcast_bits_log2 =
+ ilog2(priv->hw->multicast_filter_bins);
+ }
/* TXCOE doesn't work in thresh DMA mode */
if (priv->plat->force_thresh_dma_mode)
--
2.18.0
^ permalink raw reply related
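For readers following the hash-table patch above, the bin selection can be sketched in user-space C. This is only an illustration under stated assumptions: crc32_le_sketch() and bitrev32_sketch() are hypothetical stand-ins for the kernel's crc32_le() and bitrev32(), hash_bins() mirrors the BIT(hash_tb_sz) << 5 computation from the stmmac_main.c hunk, and the table is assumed to hold between 64 and 256 bins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, no final inversion. */
static uint32_t crc32_le_sketch(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Reverse the bit order of a 32-bit word. */
static uint32_t bitrev32_sketch(uint32_t x)
{
	uint32_t r = 0;

	for (int i = 0; i < 32; i++, x >>= 1)
		r = (r << 1) | (x & 1u);
	return r;
}

/* HW feature field (1..3) -> number of hash bins (64, 128, 256). */
static unsigned int hash_bins(unsigned int hash_tb_sz)
{
	return (1u << hash_tb_sz) << 5;
}

/* log2 of a power-of-two bin count. */
static unsigned int bins_log2(unsigned int bins)
{
	unsigned int n = 0;

	while (bins > 1) {
		bins >>= 1;
		n++;
	}
	return n;
}

/* Set the filter bit for one MAC address and return the bin index used. */
static unsigned int hash_add_addr(uint32_t *mc_filter, unsigned int bins,
				  const uint8_t addr[6])
{
	uint32_t crc;
	unsigned int bit_nr;

	assert(bins >= 64 && bins <= 256);	/* assumed table sizes */
	crc = crc32_le_sketch(~0u, addr, 6);
	/* The upper log2(bins) bits of the reversed CRC pick the bin. */
	bit_nr = bitrev32_sketch(~crc) >> (32 - bins_log2(bins));
	/* Upper bits of bit_nr pick the register, low 5 bits pick the bit. */
	mc_filter[bit_nr >> 5] |= 1u << (bit_nr & 0x1f);
	return bit_nr;
}
```

With a 256-bin table the filter spans eight 32-bit registers, which matches the mc_filter[8] array in the patch.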
* [PATCH 1/2 net-next] net: stmmac: dwmac4: mac address array boudary violation issue
From: Biao Huang @ 2019-07-09 2:36 UTC (permalink / raw)
To: davem, Jose Abreu, andrew
Cc: Giuseppe Cavallaro, Alexandre Torgue, Maxime Coquelin,
Matthias Brugger, netdev, linux-stm32, linux-arm-kernel,
linux-kernel, linux-mediatek, yt.shen, biao.huang, jianguo.zhang,
boon.leong.ong
In-Reply-To: <20190709023623.8358-1-biao.huang@mediatek.com>
The MAC address array size is GMAC_MAX_PERFECT_ADDRESSES, so 'reg'
must be less than it; otherwise the loop writes past the array and
affects other registers.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index 8d9f6cda4012..776077ec1a23 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -454,7 +454,7 @@ static void dwmac4_set_filter(struct mac_device_info *hw,
reg++;
}
- while (reg <= GMAC_MAX_PERFECT_ADDRESSES) {
+ while (reg < GMAC_MAX_PERFECT_ADDRESSES) {
writel(0, ioaddr + GMAC_ADDR_HIGH(reg));
writel(0, ioaddr + GMAC_ADDR_LOW(reg));
reg++;
--
2.18.0
^ permalink raw reply related
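The off-by-one fixed in the patch above can be shown with a small stand-alone sketch. The arrays below are plain memory standing in for the GMAC_ADDR_HIGH/LOW register banks, and MAX_PERFECT_ADDRESSES is a hypothetical size chosen for illustration, not the driver's value.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PERFECT_ADDRESSES 8	/* hypothetical table size */

/*
 * Clear the unused entries [first, MAX_PERFECT_ADDRESSES) and return how
 * many entries were written. Using '<' keeps 'reg' inside the table; with
 * '<=' the loop would also write index MAX_PERFECT_ADDRESSES, one past
 * the end.
 */
static unsigned int clear_unused(uint32_t *high, uint32_t *low,
				 unsigned int first)
{
	unsigned int reg, written = 0;

	assert(high && low);
	for (reg = first; reg < MAX_PERFECT_ADDRESSES; reg++) {
		high[reg] = 0;
		low[reg] = 0;
		written++;
	}
	return written;
}
```

Placing a sentinel value just past the end of each array is a simple way to check that the loop no longer touches the neighbouring location.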
* [PATCH 0/2 net-next] fix out-of-boundary issue and add taller hash table support
From: Biao Huang @ 2019-07-09 2:36 UTC (permalink / raw)
To: davem, Jose Abreu, andrew
Cc: Giuseppe Cavallaro, Alexandre Torgue, Maxime Coquelin,
Matthias Brugger, netdev, linux-stm32, linux-arm-kernel,
linux-kernel, linux-mediatek, yt.shen, biao.huang, jianguo.zhang,
boon.leong.ong
Fix a MAC address out-of-bounds issue in the net-next tree,
and resend the patch that was discussed in
https://lore.kernel.org/patchwork/patch/1082117
but made no further progress.
Biao Huang (2):
net: stmmac: dwmac4: mac address array boudary violation issue
net: stmmac: add support for hash table size 128/256 in dwmac4
drivers/net/ethernet/stmicro/stmmac/common.h | 7 +--
drivers/net/ethernet/stmicro/stmmac/dwmac4.h | 4 +-
.../net/ethernet/stmicro/stmmac/dwmac4_core.c | 51 +++++++++++--------
.../net/ethernet/stmicro/stmmac/dwmac4_dma.c | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++
5 files changed, 43 insertions(+), 26 deletions(-)
--
2.18.0
^ permalink raw reply
* RE: [PATCH net-next v5 3/5] devlink: Introduce PCI PF port flavour and port attribute
From: Parav Pandit @ 2019-07-09 2:36 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev@vger.kernel.org, Jiri Pirko, Saeed Mahameed
In-Reply-To: <20190708141403.1c01c5de@cakuba.netronome.com>
> -----Original Message-----
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Sent: Tuesday, July 9, 2019 2:44 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: netdev@vger.kernel.org; Jiri Pirko <jiri@mellanox.com>; Saeed Mahameed
> <saeedm@mellanox.com>
> Subject: Re: [PATCH net-next v5 3/5] devlink: Introduce PCI PF port flavour and
> port attribute
>
> On Sun, 7 Jul 2019 23:15:47 -0500, Parav Pandit wrote:
> > diff --git a/net/core/devlink.c b/net/core/devlink.c index
> > 3e5f8204c36f..88b2cf207cb2 100644
> > --- a/net/core/devlink.c
> > +++ b/net/core/devlink.c
> > @@ -519,6 +519,11 @@ static int devlink_nl_port_attrs_put(struct sk_buff
> *msg,
> > if (devlink_port->attrs.flavour != DEVLINK_PORT_FLAVOUR_PHYSICAL
> &&
> > devlink_port->attrs.flavour != DEVLINK_PORT_FLAVOUR_CPU &&
> > devlink_port->attrs.flavour != DEVLINK_PORT_FLAVOUR_DSA)
> > return 0;
> > + if (devlink_port->attrs.flavour == DEVLINK_PORT_FLAVOUR_PCI_PF) {
>
> Thanks for making the changes! I'm not sure how this would work, tho.
> We return early if flavour is not phys/cpu/dsa, so how can flavour be pci here?..
>
My bad. Hunk got applied at wrong place when I split the patch.
Correcting it along with physical to phys name change that Jiri suggested.
> > + if (nla_put_u16(msg, DEVLINK_ATTR_PORT_PCI_PF_NUMBER,
> > + attrs->pci_pf.pf))
> > + return -EMSGSIZE;
> > + }
> > if (nla_put_u32(msg, DEVLINK_ATTR_PORT_NUMBER,
> > attrs->physical.port_number))
> > return -EMSGSIZE;
^ permalink raw reply
* Re: [PATCH v3 net-next 13/19] ionic: Add initial ethtool support
From: Andrew Lunn @ 2019-07-09 2:30 UTC (permalink / raw)
To: Shannon Nelson; +Cc: netdev
In-Reply-To: <20190708192532.27420-14-snelson@pensando.io>
> +static int ionic_nway_reset(struct net_device *netdev)
> +{
> + struct lif *lif = netdev_priv(netdev);
> + int err = 0;
> +
> + if (netif_running(netdev))
> + err = ionic_reset_queues(lif);
What does ionic_reset_queues() do? It sounds nothing like restarting
auto negotiation?
Andrew
^ permalink raw reply
* Re: [PATCH net] tcp: Reset bytes_acked and bytes_received when disconnecting
From: David Miller @ 2019-07-09 2:30 UTC (permalink / raw)
To: cpaasch; +Cc: netdev, edumazet
In-Reply-To: <20190706231307.98483-1-cpaasch@apple.com>
From: Christoph Paasch <cpaasch@apple.com>
Date: Sat, 06 Jul 2019 16:13:07 -0700
> If an app is playing tricks to reuse a socket via tcp_disconnect(),
> bytes_acked/received needs to be reset to 0. Otherwise tcp_info will
> report the sum of the current and the old connection.
>
> Cc: Eric Dumazet <edumazet@google.com>
> Fixes: 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info")
> Fixes: bdd1f9edacb5 ("tcp: add tcpi_bytes_received to tcp_info")
> Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Applied and queued up for -stable.
^ permalink raw reply
* Re: [net-next] bonding: fix value exported by Netlink for peer_notif_delay
From: David Miller @ 2019-07-09 2:30 UTC (permalink / raw)
To: vincent; +Cc: j.vosburgh, vfalico, andy, netdev
In-Reply-To: <20190706210108.15293-1-vincent@bernat.ch>
From: Vincent Bernat <vincent@bernat.ch>
Date: Sat, 6 Jul 2019 23:01:08 +0200
> IFLA_BOND_PEER_NOTIF_DELAY was set to the value of downdelay instead
> of peer_notif_delay. After this change, the correct value is exported.
>
> Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications")
> Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Applied.
^ permalink raw reply
* Re: [PATCH v3 net-next 13/19] ionic: Add initial ethtool support
From: Andrew Lunn @ 2019-07-09 2:27 UTC (permalink / raw)
To: Shannon Nelson; +Cc: netdev
In-Reply-To: <20190708192532.27420-14-snelson@pensando.io>
> +static int ionic_get_module_eeprom(struct net_device *netdev,
> + struct ethtool_eeprom *ee,
> + u8 *data)
> +{
> + struct lif *lif = netdev_priv(netdev);
> + struct ionic_dev *idev = &lif->ionic->idev;
> + struct xcvr_status *xcvr;
> + u32 len;
> +
> + /* copy the module bytes into data */
> + xcvr = &idev->port_info->status.xcvr;
> + len = min_t(u32, sizeof(xcvr->sprom), ee->len);
> + memcpy(data, xcvr->sprom, len);
Hi Shannon
This also looks odd. Where is the call into the firmware to get the
eeprom contents? Even though it is called 'eeprom', the data is not
static. It contains real time diagnostic values, temperature, transmit
power, receiver power, voltages etc.
> +
> + dev_dbg(&lif->netdev->dev, "notifyblock eid=0x%llx link_status=0x%x link_speed=0x%x link_down_cnt=0x%x\n",
> + lif->info->status.eid,
> + lif->info->status.link_status,
> + lif->info->status.link_speed,
> + lif->info->status.link_down_count);
> + dev_dbg(&lif->netdev->dev, " port_status id=0x%x status=0x%x speed=0x%x\n",
> + idev->port_info->status.id,
> + idev->port_info->status.status,
> + idev->port_info->status.speed);
> + dev_dbg(&lif->netdev->dev, " xcvr status state=0x%x phy=0x%x pid=0x%x\n",
> + xcvr->state, xcvr->phy, xcvr->pid);
> + dev_dbg(&lif->netdev->dev, " port_config state=0x%x speed=0x%x mtu=0x%x an_enable=0x%x fec_type=0x%x pause_type=0x%x loopback_mode=0x%x\n",
> + idev->port_info->config.state,
> + idev->port_info->config.speed,
> + idev->port_info->config.mtu,
> + idev->port_info->config.an_enable,
> + idev->port_info->config.fec_type,
> + idev->port_info->config.pause_type,
> + idev->port_info->config.loopback_mode);
It is a funny place to have all the debug code.
Andrew
^ permalink raw reply
* Re: [PATCH] coallocate socket_wq with socket itself
From: David Miller @ 2019-07-09 2:25 UTC (permalink / raw)
To: viro; +Cc: netdev
In-Reply-To: <20190705191416.GL17978@ZenIV.linux.org.uk>
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Fri, 5 Jul 2019 20:14:16 +0100
> socket->wq is assign-once, set when we are initializing both
> struct socket it's in and struct socket_wq it points to. As the
> matter of fact, the only reason for separate allocation was the
> ability to RCU-delay freeing of socket_wq. RCU-delaying the
> freeing of socket itself gets rid of that need, so we can just
> fold struct socket_wq into the end of struct socket and simplify
> the life both for sock_alloc_inode() (one allocation instead of
> two) and for tun/tap oddballs, where we used to embed struct socket
> and struct socket_wq into the same structure (now - embedding just
> the struct socket).
>
> Note that reference to struct socket_wq in struct sock does remain
> a reference - that's unchanged.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Applied.
^ permalink raw reply
* Re: [PATCH] sockfs: switch to ->free_inode()
From: David Miller @ 2019-07-09 2:25 UTC (permalink / raw)
To: viro; +Cc: netdev
In-Reply-To: <20190705191322.GK17978@ZenIV.linux.org.uk>
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Fri, 5 Jul 2019 20:13:22 +0100
> we do have an RCU-delayed part there already (freeing the wq),
> so it's not like the pipe situation; moreover, it might be
> worth considering coallocating wq with the rest of struct sock_alloc.
> ->sk_wq in struct sock would remain a pointer as it is, but
> the object it normally points to would be coallocated with
> struct socket...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Applied.
^ permalink raw reply
* Re: pull-request: bpf-next 2019-07-09
From: David Miller @ 2019-07-09 2:15 UTC (permalink / raw)
To: daniel; +Cc: ast, netdev, bpf
In-Reply-To: <20190709001351.8848-1-daniel@iogearbox.net>
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue, 9 Jul 2019 02:13:51 +0200
> The following pull-request contains BPF updates for your *net-next* tree.
>
> The main changes are:
...
> Please consider pulling these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
Pulled, thanks Daniel.
^ permalink raw reply
* Re: [PATCH v3 net-next 13/19] ionic: Add initial ethtool support
From: Andrew Lunn @ 2019-07-09 2:14 UTC (permalink / raw)
To: Shannon Nelson; +Cc: netdev
In-Reply-To: <20190708192532.27420-14-snelson@pensando.io>
> +static int ionic_set_pauseparam(struct net_device *netdev,
> + struct ethtool_pauseparam *pause)
> +{
> + struct lif *lif = netdev_priv(netdev);
> + struct ionic *ionic = lif->ionic;
> + struct ionic_dev *idev = &lif->ionic->idev;
> +
> + u32 requested_pause;
> + u32 cur_autoneg;
> + int err;
> +
> + cur_autoneg = idev->port_info->config.an_enable ? AUTONEG_ENABLE :
> + AUTONEG_DISABLE;
> + if (pause->autoneg != cur_autoneg) {
> + netdev_info(netdev, "Please use 'ethtool -s ...' to change autoneg\n");
> + return -EOPNOTSUPP;
> + }
> +
> + /* change both at the same time */
> + requested_pause = PORT_PAUSE_TYPE_LINK;
> + if (pause->rx_pause)
> + requested_pause |= IONIC_PAUSE_F_RX;
> + if (pause->tx_pause)
> + requested_pause |= IONIC_PAUSE_F_TX;
> +
> + if (requested_pause == idev->port_info->config.pause_type)
> + return 0;
> +
> + idev->port_info->config.pause_type = requested_pause;
> +
> + mutex_lock(&ionic->dev_cmd_lock);
> + ionic_dev_cmd_port_pause(idev, requested_pause);
> + err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> + mutex_unlock(&ionic->dev_cmd_lock);
> + if (err)
> + return err;
Hi Shannon
I've no idea what the firmware black box is doing, but this looks
wrong.
pause->autoneg is about if the results of auto-neg should be used or
not. If false, just configure the MAC with the pause settings and you
are done. If the interface is being forced, so autoneg in general is
disabled, just configure the MAC and you are done.
If pause->autoneg is true and the interface is using auto-neg as a
whole, you pass the pause values to the PHY for it to advertise and
trigger an auto-neg. Once autoneg has completed, and the resolved
settings are available, the MAC is configured with the resolved
values.
Looking at this code, I don't see any difference between configuring
the MAC or configuring the PHY. I would expect pause->autoneg to be
part of requested_pause somehow, so the firmware knows what it should
do.
Andrew
^ permalink raw reply
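The decision described in the review above can be sketched as a small pure function. The enum and function names are hypothetical and are not part of the ionic driver or of ethtool; the sketch only encodes the three cases from the mail: pause autoneg off, link forced, or both pause autoneg and link autoneg on.

```c
#include <stdbool.h>

/* Hypothetical outcome of a set_pauseparam request. */
enum pause_action {
	PAUSE_PROGRAM_MAC,		 /* write the requested values to the MAC */
	PAUSE_ADVERTISE_AND_RENEGOTIATE, /* hand values to the PHY, restart aneg */
};

/*
 * pause_autoneg: the user asked for negotiated pause (pause->autoneg).
 * link_autoneg:  the link as a whole is using auto-negotiation.
 *
 * Only when both are true are the requested values advertised; the MAC is
 * then programmed later with the resolved result. In every other case the
 * MAC is configured directly with the requested values.
 */
static enum pause_action pause_decide(bool pause_autoneg, bool link_autoneg)
{
	if (pause_autoneg && link_autoneg)
		return PAUSE_ADVERTISE_AND_RENEGOTIATE;
	return PAUSE_PROGRAM_MAC;
}
```

A driver that passes this decision to its firmware alongside the RX/TX flags would let the firmware tell the two cases apart, which is the point raised in the review.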
* Re: [EXT] [PATCH net-next 07/16] qlge: Deduplicate rx buffer queue management
From: Benjamin Poirier @ 2019-07-09 2:10 UTC (permalink / raw)
To: Manish Chopra; +Cc: GR-Linux-NIC-Dev, netdev@vger.kernel.org
In-Reply-To: <DM6PR18MB2697AC678152A26AC676A1B2ABFD0@DM6PR18MB2697.namprd18.prod.outlook.com>
On 2019/06/27 10:02, Manish Chopra wrote:
> > while (curr_idx != clean_idx) {
> > - lbq_desc = &rx_ring->lbq[curr_idx];
> > + struct qlge_bq_desc *lbq_desc = &rx_ring-
> > >lbq.queue[curr_idx];
> >
> > if (lbq_desc->p.pg_chunk.offset == last_offset)
> > - pci_unmap_page(qdev->pdev, lbq_desc-
> > >p.pg_chunk.map,
> > + pci_unmap_page(qdev->pdev, lbq_desc->dma_addr,
> > ql_lbq_block_size(qdev),
> > PCI_DMA_FROMDEVICE);
>
> In this patch, lbq_desc->dma_addr points to offset in the page. So unmapping is broken within this patch.
> Would have been nicer to fix this in the same patch although it might have been taken care in next patches probably.
>
Indeed, thanks. The same applies in ql_get_curr_lchunk().
Replaced with the following for v2:
+ pci_unmap_page(qdev->pdev, lbq_desc->dma_addr -
+ last_offset, ql_lbq_block_size(qdev),
^ permalink raw reply
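The address arithmetic behind the v2 change above can be sketched as follows. The type and helper names are hypothetical, and the sketch assumes that each descriptor stores the DMA address of its chunk inside a larger mapped block, and that last_offset equals the block size minus the chunk size. The block must be unmapped at its base address, not at the chunk address.

```c
#include <stdint.h>

typedef uint64_t dma_addr_sketch_t;	/* stand-in for dma_addr_t */

/* Offset of the last chunk in a block (assumed: block size - chunk size). */
static uint32_t last_chunk_offset(uint32_t block_size, uint32_t chunk_size)
{
	return block_size - chunk_size;
}

/*
 * Recover the base address of the mapped block from the DMA address of
 * one chunk and that chunk's offset within the block.
 */
static dma_addr_sketch_t block_base(dma_addr_sketch_t chunk_addr,
				    uint32_t chunk_offset)
{
	return chunk_addr - chunk_offset;
}
```

For the last chunk of a block, chunk_offset equals last_offset, so block_base(dma_addr, last_offset) gives the address that was originally mapped.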
* RE: [EXT] Re: [PATCH net-next v2 4/4] qed*: Add devlink support for configuration attributes.
From: Sudarsana Reddy Kalluru @ 2019-07-09 2:08 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem@davemloft.net, netdev@vger.kernel.org, Michal Kalderon,
Ariel Elior, Jiri Pirko
In-Reply-To: <20190708144706.46ad7a50@cakuba.netronome.com>
> -----Original Message-----
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Sent: Tuesday, July 9, 2019 3:17 AM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> Cc: davem@davemloft.net; netdev@vger.kernel.org; Michal Kalderon
> <mkalderon@marvell.com>; Ariel Elior <aelior@marvell.com>; Jiri Pirko
> <jiri@resnulli.us>
> Subject: Re: [EXT] Re: [PATCH net-next v2 4/4] qed*: Add devlink support for
> configuration attributes.
>
> On Mon, 8 Jul 2019 02:31:15 +0000, Sudarsana Reddy Kalluru wrote:
> > > > > > + Type: u8
> > > > > > + Configuration mode: Permanent
> > > > > > +
> > > > > > +dcbx_mode [PORT, DRIVER-SPECIFIC]
> > > > > > + Configure DCBX mode for the device.
> > > > > > + Supported dcbx modes are,
> > > > > > + Disabled(0), IEEE(1), CEE(2) and
> > > > > > Dynamic(3)
> > > > > > + Type: u8
> > > > > > + Configuration mode: Permanent
> > > > >
> > > > > Why is this a permanent parameter?
> > > > >
> > > > This specifies the dcbx_mode to be configured in non-volatile memory.
> > > > The value is persistent and is used in the next load of OS or the mfw.
> > >
> > > And it can't be changed at runtime?
> >
> > Run time dcbx params are not affected via this interface, it only
> > updates config on permanent storage of the port.
>
> IOW it affects the defaults after boot? It'd be preferable to have the DCB
> uAPI handle "persistent"/default settings if that's the case.
Yes, it's effective after the reboot. Thanks for your suggestion.
^ permalink raw reply
* Re: [PATCH] phy: added a PHY_BUSY state into phy_state_machine
From: kbuild test robot @ 2019-07-09 1:58 UTC (permalink / raw)
To: kwangdo.yi; +Cc: kbuild-all, netdev, kwangdo.yi
In-Reply-To: <1562538732-20700-1-git-send-email-kwangdo.yi@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4357 bytes --]
Hi "kwangdo.yi",
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v5.2]
[cannot apply to next-20190708]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/kwangdo-yi/phy-added-a-PHY_BUSY-state-into-phy_state_machine/20190709-075228
config: arm-omap2plus_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 7.4.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=arm
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
All error/warnings (new ones prefixed by >>):
drivers/net/phy/phy.c: In function 'phy_state_to_str':
>> drivers/net/phy/phy.c:38:2: warning: enumeration value 'PHY_BUSY' not handled in switch [-Wswitch]
switch (st) {
^~~~~~
drivers/net/phy/phy.c: In function 'phy_state_machine':
>> drivers/net/phy/phy.c:925:4: error: 'phy' undeclared (first use in this function)
phy->state = PHY_BUSY;
^~~
drivers/net/phy/phy.c:925:4: note: each undeclared identifier is reported only once for each function it appears in
vim +/phy +925 drivers/net/phy/phy.c
894
895 /**
896 * phy_state_machine - Handle the state machine
897 * @work: work_struct that describes the work to be done
898 */
899 void phy_state_machine(struct work_struct *work)
900 {
901 struct delayed_work *dwork = to_delayed_work(work);
902 struct phy_device *phydev =
903 container_of(dwork, struct phy_device, state_queue);
904 bool needs_aneg = false, do_suspend = false;
905 enum phy_state old_state;
906 int err = 0;
907
908 mutex_lock(&phydev->lock);
909
910 old_state = phydev->state;
911
912 switch (phydev->state) {
913 case PHY_DOWN:
914 case PHY_READY:
915 break;
916 case PHY_UP:
917 needs_aneg = true;
918
919 break;
920 case PHY_NOLINK:
921 case PHY_RUNNING:
922 case PHY_BUSY:
923 err = phy_check_link_status(phydev);
924 if (err == -ETIMEDOUT && old_state == PHY_RUNNING) {
> 925 phy->state = PHY_BUSY;
926 err = 0;
927
928 }
929 break;
930 case PHY_FORCING:
931 err = genphy_update_link(phydev);
932 if (err)
933 break;
934
935 if (phydev->link) {
936 phydev->state = PHY_RUNNING;
937 phy_link_up(phydev);
938 } else {
939 if (0 == phydev->link_timeout--)
940 needs_aneg = true;
941 phy_link_down(phydev, false);
942 }
943 break;
944 case PHY_HALTED:
945 if (phydev->link) {
946 phydev->link = 0;
947 phy_link_down(phydev, true);
948 do_suspend = true;
949 }
950 break;
951 }
952
953 mutex_unlock(&phydev->lock);
954
955 if (needs_aneg)
956 err = phy_start_aneg(phydev);
957 else if (do_suspend)
958 phy_suspend(phydev);
959
960 if (err < 0)
961 phy_error(phydev);
962
963 if (old_state != phydev->state) {
964 phydev_dbg(phydev, "PHY state change %s -> %s\n",
965 phy_state_to_str(old_state),
966 phy_state_to_str(phydev->state));
967 if (phydev->drv && phydev->drv->link_change_notify)
968 phydev->drv->link_change_notify(phydev);
969 }
970
971 /* Only re-schedule a PHY state machine change if we are polling the
972 * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving
973 * between states from phy_mac_interrupt().
974 *
975 * In state PHY_HALTED the PHY gets suspended, so rescheduling the
976 * state machine would be pointless and possibly error prone when
977 * called from phy_disconnect() synchronously.
978 */
979 mutex_lock(&phydev->lock);
980 if (phy_polling_mode(phydev) && phy_is_started(phydev))
981 phy_queue_state_machine(phydev, PHY_STATE_TIME);
982 mutex_unlock(&phydev->lock);
983 }
984
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36418 bytes --]
^ permalink raw reply