* [PATCH net 9/9] net/smc: no close wait in case of process shut down
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
In-Reply-To: <20170921071634.16883-1-ubraun@linux.vnet.ibm.com>
Usually socket closing is delayed if there is still data available in
the send buffer to be transmitted. If a process is killed, the delay
should be avoided.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/smc_close.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
index 5201bc103bd8..f0d16fb825f7 100644
--- a/net/smc/smc_close.c
+++ b/net/smc/smc_close.c
@@ -174,15 +174,15 @@ int smc_close_active(struct smc_sock *smc)
{
struct smc_cdc_conn_state_flags *txflags =
&smc->conn.local_tx_ctrl.conn_state_flags;
- long timeout = SMC_MAX_STREAM_WAIT_TIMEOUT;
struct smc_connection *conn = &smc->conn;
struct sock *sk = &smc->sk;
int old_state;
+ long timeout;
int rc = 0;
- if (sock_flag(sk, SOCK_LINGER) &&
- !(current->flags & PF_EXITING))
- timeout = sk->sk_lingertime;
+ timeout = current->flags & PF_EXITING ?
+ 0 : sock_flag(sk, SOCK_LINGER) ?
+ sk->sk_lingertime : SMC_MAX_STREAM_WAIT_TIMEOUT;
again:
old_state = sk->sk_state;
@@ -413,13 +413,14 @@ void smc_close_sock_put_work(struct work_struct *work)
int smc_close_shutdown_write(struct smc_sock *smc)
{
struct smc_connection *conn = &smc->conn;
- long timeout = SMC_MAX_STREAM_WAIT_TIMEOUT;
struct sock *sk = &smc->sk;
int old_state;
+ long timeout;
int rc = 0;
- if (sock_flag(sk, SOCK_LINGER))
- timeout = sk->sk_lingertime;
+ timeout = current->flags & PF_EXITING ?
+ 0 : sock_flag(sk, SOCK_LINGER) ?
+ sk->sk_lingertime : SMC_MAX_STREAM_WAIT_TIMEOUT;
again:
old_state = sk->sk_state;
--
2.13.5
^ permalink raw reply related
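The timeout selection introduced by the patch above can be restated as a small userspace function. This is a minimal sketch, not kernel code: the function name, the integer flag parameters and the placeholder default of 200 ticks are assumptions for illustration. In the kernel the inputs come from current->flags, sock_flag(sk, SOCK_LINGER) and sk->sk_lingertime, and the default is SMC_MAX_STREAM_WAIT_TIMEOUT.

```c
/* Placeholder default wait in ticks (assumed value, for illustration only). */
#define DEFAULT_STREAM_WAIT 200L

/*
 * Pick how long a close may wait for unsent data:
 *   - a process that is exiting never waits,
 *   - otherwise SO_LINGER, if set, supplies the wait,
 *   - otherwise the default applies.
 */
long close_wait_timeout(int exiting, int linger, long lingertime)
{
	if (exiting)
		return 0;
	return linger ? lingertime : DEFAULT_STREAM_WAIT;
}
```

Writing the decision as an early return keeps the priority order explicit: process exit overrides SO_LINGER, which in turn overrides the default.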
* [PATCH net 0/9] net/smc: bug fixes 2017-09-20
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
Hi Dave,
here is a collection of small SMC patches for the net tree, fixing
SMC problems in different areas.
Thanks,
Ursula
Hans Wippel (2):
net/smc: add missing dev_put
net/smc: add receive timeout check
Ursula Braun (7):
net/smc: take RCU read lock for routing cache lookup
net/smc: adjust net_device refcount
net/smc: adapt send request completion notification
net/smc: longer delay for client link group removal
net/smc: terminate link group if out-of-sync is received
net/smc: introduce a delay
net/smc: no close wait in case of process shut down
net/smc/af_smc.c | 16 +++++++++-------
net/smc/smc.h | 2 +-
net/smc/smc_clc.c | 10 +++++-----
net/smc/smc_clc.h | 3 +--
net/smc/smc_close.c | 27 +++++++++++++++------------
net/smc/smc_core.c | 16 ++++++++++++----
net/smc/smc_ib.c | 1 +
net/smc/smc_pnet.c | 4 +++-
net/smc/smc_rx.c | 2 ++
net/smc/smc_tx.c | 12 ++++++++----
net/smc/smc_wr.c | 2 +-
11 files changed, 58 insertions(+), 37 deletions(-)
--
2.13.5
^ permalink raw reply
* [PATCH net 3/9] net/smc: take RCU read lock for routing cache lookup
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
In-Reply-To: <20170921071634.16883-1-ubraun@linux.vnet.ibm.com>
smc_netinfo_by_tcpsk() looks up the routing cache. Such a lookup requires
protection by an RCU read lock.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/af_smc.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 8c6d24b2995d..2e8d2dabac0c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -282,6 +282,7 @@ int smc_netinfo_by_tcpsk(struct socket *clcsock,
__be32 *subnet, u8 *prefix_len)
{
struct dst_entry *dst = sk_dst_get(clcsock->sk);
+ struct in_device *in_dev;
struct sockaddr_in addr;
int rc = -ENOENT;
int len;
@@ -298,14 +299,17 @@ int smc_netinfo_by_tcpsk(struct socket *clcsock,
/* get address to which the internal TCP socket is bound */
kernel_getsockname(clcsock, (struct sockaddr *)&addr, &len);
/* analyze IPv4 specific data of net_device belonging to TCP socket */
- for_ifa(dst->dev->ip_ptr) {
- if (ifa->ifa_address != addr.sin_addr.s_addr)
+ rcu_read_lock();
+ in_dev = __in_dev_get_rcu(dst->dev);
+ for_ifa(in_dev) {
+ if (!inet_ifa_match(addr.sin_addr.s_addr, ifa))
continue;
*prefix_len = inet_mask_len(ifa->ifa_mask);
*subnet = ifa->ifa_address & ifa->ifa_mask;
rc = 0;
break;
- } endfor_ifa(dst->dev->ip_ptr);
+ } endfor_ifa(in_dev);
+ rcu_read_unlock();
out_rel:
dst_release(dst);
--
2.13.5
^ permalink raw reply related
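Besides taking the RCU read lock, the patch above replaces the exact address comparison with inet_ifa_match(), which tests whether an address falls into the subnet of an interface address. The userspace sketch below illustrates that test and the prefix-length computation. It works on host-byte-order values and uses hypothetical helper names; the kernel helpers operate on __be32 and struct in_ifaddr.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr lies in the subnet described by ifa_address and ifa_mask. */
bool subnet_match(uint32_t addr, uint32_t ifa_address, uint32_t ifa_mask)
{
	return ((addr ^ ifa_address) & ifa_mask) == 0;
}

/* Number of leading one bits in a contiguous netmask (host byte order). */
unsigned int mask_len(uint32_t mask)
{
	unsigned int len = 0;

	while (mask & 0x80000000u) {
		len++;
		mask <<= 1;
	}
	return len;
}
```

For example, 192.168.1.23 (0xC0A80117) matches an interface address 192.168.1.1 with mask 255.255.255.0, while 192.168.2.23 does not.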
* [PATCH net 2/9] net/smc: add receive timeout check
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
In-Reply-To: <20170921071634.16883-1-ubraun@linux.vnet.ibm.com>
From: Hans Wippel <hwippel@linux.vnet.ibm.com>
The SMC receive function currently lacks a timeout check under the
condition that no data were received and no data are available. This
patch adds such a check.
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/smc_rx.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index b17a333e9bb0..3e631ae4b6b6 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -148,6 +148,8 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg, size_t len,
read_done = sock_intr_errno(timeo);
break;
}
+ if (!timeo)
+ return -EAGAIN;
}
if (!atomic_read(&conn->bytes_to_rcv)) {
--
2.13.5
^ permalink raw reply related
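Seen from an application, the check added above should give the usual nonblocking result: with a zero timeout and nothing to read, the receive call fails with EAGAIN instead of waiting. The sketch below shows that behaviour with an AF_UNIX datagram socket pair as a stand-in, since an SMC socket needs suitable RDMA hardware. The helper name is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try a nonblocking read on an empty socket.
 * Returns the errno of the failed recv(), 0 if recv() unexpectedly
 * succeeded, or -1 if the socket pair could not be created.
 */
int empty_nonblocking_recv_errno(void)
{
	int sv[2];
	char buf[16];
	int err = 0;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	/* Nothing was sent on sv[1], so sv[0] has no data queued. */
	if (recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT) < 0)
		err = errno;

	close(sv[0]);
	close(sv[1]);
	return err;
}
```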
* [PATCH net 4/9] net/smc: adjust net_device refcount
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
In-Reply-To: <20170921071634.16883-1-ubraun@linux.vnet.ibm.com>
smc_pnet_fill_entry() uses dev_get_by_name() adding a refcount to ndev.
The following smc_pnet_enter() has to reduce the refcount if the entry
to be added exists already in the pnet table.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/smc_pnet.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 78f7af28ae4f..31f8453c25c5 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -181,8 +181,10 @@ static int smc_pnet_enter(struct smc_pnetentry *new_pnetelem)
sizeof(new_pnetelem->ndev->name)) ||
smc_pnet_same_ibname(pnetelem,
new_pnetelem->smcibdev->ibdev->name,
- new_pnetelem->ib_port))
+ new_pnetelem->ib_port)) {
+ dev_put(pnetelem->ndev);
goto found;
+ }
}
list_add_tail(&new_pnetelem->list, &smc_pnettable.pnetlist);
rc = 0;
--
2.13.5
^ permalink raw reply related
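The refcount rule described in the patch above (a lookup takes a reference, and the reference must be given back when the entry is not added) can be illustrated with a toy table in userspace. Everything in this sketch is hypothetical: struct toy_dev, the helpers and the error codes only mimic the pattern and are not the SMC pnet code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a net_device: a name plus a reference counter. */
struct toy_dev {
	char name[16];
	int refcnt;
};

#define TOY_TABLE_MAX 4
static struct toy_dev *toy_table[TOY_TABLE_MAX];
static size_t toy_table_len;

/* Take a reference, as a lookup by name would. */
struct toy_dev *toy_dev_get(struct toy_dev *dev)
{
	dev->refcnt++;
	return dev;
}

/* Drop a reference. */
void toy_dev_put(struct toy_dev *dev)
{
	dev->refcnt--;
}

/*
 * Add dev to the table. The caller passes in one reference; the table
 * keeps it on success, and it is dropped here if the entry is not added.
 */
int toy_table_enter(struct toy_dev *dev)
{
	size_t i;

	for (i = 0; i < toy_table_len; i++) {
		if (strcmp(toy_table[i]->name, dev->name) == 0) {
			toy_dev_put(dev);	/* duplicate: give the reference back */
			return -EEXIST;
		}
	}
	if (toy_table_len == TOY_TABLE_MAX) {
		toy_dev_put(dev);	/* table full: give the reference back */
		return -ENOSPC;
	}
	toy_table[toy_table_len++] = dev;
	return 0;
}

/* Usage: enter the same device twice and report its final refcount. */
int demo_duplicate_enter_refcnt(void)
{
	static struct toy_dev eth0 = { "eth0", 0 };

	toy_table_enter(toy_dev_get(&eth0));	/* first entry keeps its reference */
	toy_table_enter(toy_dev_get(&eth0));	/* duplicate: reference is dropped */
	return eth0.refcnt;
}
```

Without the put on the duplicate path, the second call would leave the counter at 2 and the device could never be released.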
* [PATCH net 8/9] net/smc: introduce a delay
From: Ursula Braun @ 2017-09-21 7:16 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
In-Reply-To: <20170921071634.16883-1-ubraun@linux.vnet.ibm.com>
The number of outstanding work requests is limited. If all work
requests are in use, tx processing is postponed to another scheduling
of the tx worker. Switch to a delayed worker to have a gap for tx
completion queue events before the next retry.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/smc.h | 2 +-
net/smc/smc_close.c | 12 +++++++-----
net/smc/smc_tx.c | 12 ++++++++----
3 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/net/smc/smc.h b/net/smc/smc.h
index 6e44313e4467..0ccd6fa387ad 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -149,7 +149,7 @@ struct smc_connection {
atomic_t sndbuf_space; /* remaining space in sndbuf */
u16 tx_cdc_seq; /* sequence # for CDC send */
spinlock_t send_lock; /* protect wr_sends */
- struct work_struct tx_work; /* retry of smc_cdc_msg_send */
+ struct delayed_work tx_work; /* retry of smc_cdc_msg_send */
struct smc_host_cdc_msg local_rx_ctrl; /* filled during event_handl.
* .prod cf. TCP rcv_nxt
diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
index 3c2e166b5d22..5201bc103bd8 100644
--- a/net/smc/smc_close.c
+++ b/net/smc/smc_close.c
@@ -208,7 +208,7 @@ int smc_close_active(struct smc_sock *smc)
case SMC_ACTIVE:
smc_close_stream_wait(smc, timeout);
release_sock(sk);
- cancel_work_sync(&conn->tx_work);
+ cancel_delayed_work_sync(&conn->tx_work);
lock_sock(sk);
if (sk->sk_state == SMC_ACTIVE) {
/* send close request */
@@ -234,7 +234,7 @@ int smc_close_active(struct smc_sock *smc)
if (!smc_cdc_rxed_any_close(conn))
smc_close_stream_wait(smc, timeout);
release_sock(sk);
- cancel_work_sync(&conn->tx_work);
+ cancel_delayed_work_sync(&conn->tx_work);
lock_sock(sk);
if (sk->sk_err != ECONNABORTED) {
/* confirm close from peer */
@@ -263,7 +263,9 @@ int smc_close_active(struct smc_sock *smc)
/* peer sending PeerConnectionClosed will cause transition */
break;
case SMC_PROCESSABORT:
- cancel_work_sync(&conn->tx_work);
+ release_sock(sk);
+ cancel_delayed_work_sync(&conn->tx_work);
+ lock_sock(sk);
smc_close_abort(conn);
sk->sk_state = SMC_CLOSED;
smc_close_wait_tx_pends(smc);
@@ -425,7 +427,7 @@ int smc_close_shutdown_write(struct smc_sock *smc)
case SMC_ACTIVE:
smc_close_stream_wait(smc, timeout);
release_sock(sk);
- cancel_work_sync(&conn->tx_work);
+ cancel_delayed_work_sync(&conn->tx_work);
lock_sock(sk);
/* send close wr request */
rc = smc_close_wr(conn);
@@ -439,7 +441,7 @@ int smc_close_shutdown_write(struct smc_sock *smc)
if (!smc_cdc_rxed_any_close(conn))
smc_close_stream_wait(smc, timeout);
release_sock(sk);
- cancel_work_sync(&conn->tx_work);
+ cancel_delayed_work_sync(&conn->tx_work);
lock_sock(sk);
/* confirm close from peer */
rc = smc_close_wr(conn);
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3c656beb8820..3866573288dd 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -24,6 +24,8 @@
#include "smc_cdc.h"
#include "smc_tx.h"
+#define SMC_TX_WORK_DELAY HZ
+
/***************************** sndbuf producer *******************************/
/* callback implementation for sk.sk_write_space()
@@ -406,7 +408,8 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
goto out_unlock;
}
rc = 0;
- schedule_work(&conn->tx_work);
+ schedule_delayed_work(&conn->tx_work,
+ SMC_TX_WORK_DELAY);
}
goto out_unlock;
}
@@ -430,7 +433,7 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
*/
static void smc_tx_work(struct work_struct *work)
{
- struct smc_connection *conn = container_of(work,
+ struct smc_connection *conn = container_of(to_delayed_work(work),
struct smc_connection,
tx_work);
struct smc_sock *smc = container_of(conn, struct smc_sock, conn);
@@ -468,7 +471,8 @@ void smc_tx_consumer_update(struct smc_connection *conn)
if (!rc)
rc = smc_cdc_msg_send(conn, wr_buf, pend);
if (rc < 0) {
- schedule_work(&conn->tx_work);
+ schedule_delayed_work(&conn->tx_work,
+ SMC_TX_WORK_DELAY);
return;
}
smc_curs_write(&conn->rx_curs_confirmed,
@@ -487,6 +491,6 @@ void smc_tx_consumer_update(struct smc_connection *conn)
void smc_tx_init(struct smc_sock *smc)
{
smc->sk.sk_write_space = smc_tx_write_space;
- INIT_WORK(&smc->conn.tx_work, smc_tx_work);
+ INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
spin_lock_init(&smc->conn.send_lock);
}
--
2.13.5
^ permalink raw reply related
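One detail of the conversion above is the two-step pointer recovery in smc_tx_work(): the handler still receives a struct work_struct pointer, but that work item is now embedded in a struct delayed_work, which is in turn embedded in the connection. The userspace sketch below mimics the nesting with simplified stand-in types. The struct layouts, conn_from_work() and demo_conn_id() are assumptions for illustration; only the container_of() idea is taken from the kernel.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the nesting matters here. */
struct work_struct { int pending; };
struct delayed_work { struct work_struct work; long delay; };
struct connection { int id; struct delayed_work tx_work; };

/* Recover a pointer to the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Step 1: from the inner work item to the delayed_work wrapping it. */
struct delayed_work *to_delayed_work(struct work_struct *work)
{
	return container_of(work, struct delayed_work, work);
}

/* Step 2: from the delayed_work to the connection that embeds it. */
struct connection *conn_from_work(struct work_struct *work)
{
	return container_of(to_delayed_work(work), struct connection, tx_work);
}

/* Usage: start from the innermost member and find the connection again. */
int demo_conn_id(void)
{
	struct connection c = { .id = 42 };

	return conn_from_work(&c.tx_work.work)->id;
}
```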
* [PATCH net-next 1/1] net/smc: parameter cleanup in smc_cdc_get_free_slot()
From: Ursula Braun @ 2017-09-21 7:17 UTC (permalink / raw)
To: davem
Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
raspl, ubraun
Use the smc_connection as the first parameter of smc_cdc_get_free_slot().
This is just a small code cleanup, no functional change.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
net/smc/smc_cdc.c | 7 ++++---
net/smc/smc_cdc.h | 3 ++-
net/smc/smc_tx.c | 6 ++----
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index a7294edbc221..5ef97e5a5f78 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -62,10 +62,12 @@ static void smc_cdc_tx_handler(struct smc_wr_tx_pend_priv *pnd_snd,
bh_unlock_sock(&smc->sk);
}
-int smc_cdc_get_free_slot(struct smc_link *link,
+int smc_cdc_get_free_slot(struct smc_connection *conn,
struct smc_wr_buf **wr_buf,
struct smc_cdc_tx_pend **pend)
{
+ struct smc_link *link = &conn->lgr->lnk[SMC_SINGLE_LINK];
+
return smc_wr_tx_get_free_slot(link, smc_cdc_tx_handler, wr_buf,
(struct smc_wr_tx_pend_priv **)pend);
}
@@ -118,8 +120,7 @@ int smc_cdc_get_slot_and_msg_send(struct smc_connection *conn)
struct smc_wr_buf *wr_buf;
int rc;
- rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK], &wr_buf,
- &pend);
+ rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
if (rc)
return rc;
diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
index 8e1d76f26007..56f883d1159c 100644
--- a/net/smc/smc_cdc.h
+++ b/net/smc/smc_cdc.h
@@ -206,7 +206,8 @@ static inline void smc_cdc_msg_to_host(struct smc_host_cdc_msg *local,
struct smc_cdc_tx_pend;
-int smc_cdc_get_free_slot(struct smc_link *link, struct smc_wr_buf **wr_buf,
+int smc_cdc_get_free_slot(struct smc_connection *conn,
+ struct smc_wr_buf **wr_buf,
struct smc_cdc_tx_pend **pend);
void smc_cdc_tx_dismiss_slots(struct smc_connection *conn);
int smc_cdc_msg_send(struct smc_connection *conn, struct smc_wr_buf *wr_buf,
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3c656beb8820..e2228f6d1c25 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -394,8 +394,7 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
int rc;
spin_lock_bh(&conn->send_lock);
- rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK], &wr_buf,
- &pend);
+ rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
if (rc < 0) {
if (rc == -EBUSY) {
struct smc_sock *smc =
@@ -463,8 +462,7 @@ void smc_tx_consumer_update(struct smc_connection *conn)
((to_confirm > conn->rmbe_update_limit) &&
((to_confirm > (conn->rmbe_size / 2)) ||
conn->local_rx_ctrl.prod_flags.write_blocked))) {
- rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK],
- &wr_buf, &pend);
+ rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
if (!rc)
rc = smc_cdc_msg_send(conn, wr_buf, pend);
if (rc < 0) {
--
2.13.5
^ permalink raw reply related
* Re: [PATCHv3 iproute2 1/2] lib/libnetlink: re malloc buff if size is not enough
From: Hangbin Liu @ 2017-09-21 7:20 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, Michal Kubecek, Phil Sutter
In-Reply-To: <20170920095605.1ea527fc@xeon-e3>
Hi Stephen,
On Wed, Sep 20, 2017 at 09:56:05AM -0700, Stephen Hemminger wrote:
> > +realloc:
> > + bufp = realloc(buf, buf_len);
> > +
> > + if (bufp == NULL) {
>
> Minor personal style issue:
> To me, blank lines are like paragraphs in writing.
> Code reads better assignment and condition check are next to
> each other.
OK, I will remove the blank lines.
>
> > +recv:
> > + len = recvmsg(fd, msg, flag);
> > +
> > + if (len < 0) {
> > + if (errno == EINTR || errno == EAGAIN)
> > + goto recv;
> > + fprintf(stderr, "netlink receive error %s (%d)\n",
> > + strerror(errno), errno);
> > + free(buf);
> > + return -errno;
> > + }
> > +
> > + if (len == 0) {
> > + fprintf(stderr, "EOF on netlink\n");
> > + free(buf);
> > + return -ENODATA;
> > + }
> > +
> > + if (len > buf_len) {
> > + buf_len = len;
> > + flag = 0;
> > + goto realloc;
> > + }
> > +
> > + if (flag != 0) {
> > + flag = 0;
> > + goto recv;
>
> Although I programmed in BASIC years ago. I never liked code
> with loops via goto. To me it indicates the logic is not well thought
> through. Not sure exactly how to rearrange the control flow, but it
> should be possible to rewrite this so that it reads cleaner.
Hmm, if we remove the gotos, the logic would look like this:
bufp = realloc(buf, buf_len);
/* check bufp and set msg */
len = recvmsg(fd, msg, flag);
/* check len */
if (len > buf_len) {
	buf_len = len;
	bufp = realloc(buf, buf_len);
	/* check bufp and set msg */
	len = recvmsg(fd, msg, flag);
	/* check len */
}
len = recvmsg(fd, msg, flag);
/* check len */
Or maybe we can set buf_len very small at first, which forces a realloc on
the second pass. The code would then look like this:
int buf_len = 16;
bufp = realloc(buf, buf_len);
/* check bufp and set msg */
len = recvmsg(fd, msg, flag);
/* check len */
buf_len = len;
bufp = realloc(buf, buf_len);
/* check bufp and set msg */
len = recvmsg(fd, msg, flag);
/* check len */
What do you think?
Thanks
Hangbin
^ permalink raw reply
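One way to express the flow discussed above without goto is a loop that peeks, grows the buffer when the peek reports a longer message, and only then consumes the message. The sketch below is a userspace illustration, not the libnetlink code: it uses recv() on an AF_UNIX datagram socket instead of recvmsg() on a netlink socket, assumes Linux semantics where MSG_PEEK | MSG_TRUNC reports the full datagram length, and all function names are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* recv() that retries when interrupted by a signal. */
static ssize_t recv_retry(int fd, void *buf, size_t len, int flags)
{
	ssize_t n;

	do {
		n = recv(fd, buf, len, flags);
	} while (n < 0 && errno == EINTR);
	return n;
}

/*
 * Receive one whole datagram into a heap buffer that is grown to fit.
 * Returns the length and stores the buffer in *out (caller frees it),
 * or returns a negative errno value on failure.
 */
ssize_t recv_datagram_alloc(int fd, char **out)
{
	size_t cap = 16;	/* deliberately small to exercise the growth path */
	char *buf = NULL;
	ssize_t len;

	for (;;) {
		char *tmp = realloc(buf, cap);

		if (!tmp) {
			free(buf);
			return -ENOMEM;
		}
		buf = tmp;

		/* Peek only: the datagram stays queued, MSG_TRUNC reports its full length. */
		len = recv_retry(fd, buf, cap, MSG_PEEK | MSG_TRUNC);
		if (len <= 0) {
			int err = len < 0 ? errno : ENODATA;

			free(buf);
			return -err;
		}
		if ((size_t)len <= cap)
			break;		/* the buffer is large enough */
		cap = (size_t)len;	/* grow and peek again */
	}

	/* The datagram fits now, so consume it. */
	len = recv_retry(fd, buf, cap, 0);
	if (len < 0) {
		int err = errno;

		free(buf);
		return -err;
	}
	*out = buf;
	return len;
}

/* Usage: push n bytes through a socket pair and return the length read back. */
ssize_t demo_roundtrip(size_t n)
{
	int sv[2];
	char *msg, *got = NULL;
	ssize_t len = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	msg = calloc(1, n);
	if (msg && send(sv[1], msg, n, 0) == (ssize_t)n) {
		len = recv_datagram_alloc(sv[0], &got);
		if (len >= 0)
			free(got);
	}
	free(msg);
	close(sv[0]);
	close(sv[1]);
	return len;
}
```

The loop runs at most twice per datagram: once with the current capacity and, if that was too small, once more with the reported length.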
* [PATCH net-next] cxgb4: avoid stall while shutting down the adapter
From: Ganesh Goudar @ 2017-09-21 7:20 UTC (permalink / raw)
To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar
Do not wait for completion while deleting filters when the adapter
is shutting down, because the response may never arrive once
interrupts are disabled.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 7 ++++++-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 4 ++++
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index ea72d2d..c4e997f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -549,6 +549,7 @@ enum { /* adapter flags */
MASTER_PF = (1 << 7),
FW_OFLD_CONN = (1 << 9),
ROOT_NO_RELAXED_ORDERING = (1 << 10),
+ SHUTTING_DOWN = (1 << 11),
};
enum {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 45b5853..97ead2c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -191,7 +191,8 @@ static int del_filter_wr(struct adapter *adapter, int fidx)
return -ENOMEM;
fwr = __skb_put(skb, len);
- t4_mk_filtdelwr(f->tid, fwr, adapter->sge.fw_evtq.abs_id);
+ t4_mk_filtdelwr(f->tid, fwr, (adapter->flags & SHUTTING_DOWN) ? -1
+ : adapter->sge.fw_evtq.abs_id);
/* Mark the filter as "pending" and ship off the Filter Work Request.
* When we get the Work Request Reply we'll clear the pending status.
@@ -636,6 +637,10 @@ int cxgb4_del_filter(struct net_device *dev, int filter_id)
struct filter_ctx ctx;
int ret;
+ /* If we are shutting down the adapter do not wait for completion */
+ if (netdev2adap(dev)->flags & SHUTTING_DOWN)
+ return __cxgb4_del_filter(dev, filter_id, NULL);
+
init_completion(&ctx.completion);
ret = __cxgb4_del_filter(dev, filter_id, &ctx);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 92d9d79..5fe81a4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -5254,6 +5254,8 @@ static void remove_one(struct pci_dev *pdev)
return;
}
+ adapter->flags |= SHUTTING_DOWN;
+
if (adapter->pf == 4) {
int i;
@@ -5339,6 +5341,8 @@ static void shutdown_one(struct pci_dev *pdev)
return;
}
+ adapter->flags |= SHUTTING_DOWN;
+
if (adapter->pf == 4) {
int i;
--
2.1.0
^ permalink raw reply related
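The two decisions the patch above makes on the SHUTTING_DOWN flag can be summarized in a short userspace sketch. The helper names are hypothetical, and the reading of -1 as "no reply requested" is an assumption based on how the patch passes it to t4_mk_filtdelwr(); only the flag bit value is taken from the diff.

```c
#include <stdbool.h>

enum { SHUTTING_DOWN = 1 << 11 };	/* adapter flag bit as added by the patch */

/* Reply queue id for the delete request; -1 is assumed to mean no reply. */
int delete_reply_queue(unsigned int adapter_flags, int evtq_id)
{
	return (adapter_flags & SHUTTING_DOWN) ? -1 : evtq_id;
}

/* Whether the caller should block until the delete completes. */
bool should_wait_for_delete(unsigned int adapter_flags)
{
	return !(adapter_flags & SHUTTING_DOWN);
}
```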
* Re: [PATCHv3 iproute2 1/2] lib/libnetlink: re malloc buff if size is not enough
From: Michal Kubecek @ 2017-09-21 7:34 UTC (permalink / raw)
To: Hangbin Liu; +Cc: Stephen Hemminger, netdev, Phil Sutter
In-Reply-To: <20170921072002.GM5465@leo.usersys.redhat.com>
On Thu, Sep 21, 2017 at 03:20:02PM +0800, Hangbin Liu wrote:
>
> Or maybe we can set buf_len very small first. Then it will force to realloc at
> the second time. And the code would like
>
> int buf_len = 16;
> bufp = realloc(buf, buf_len);
> /* check bufp and set msg */
>
> len = recvmsg(fd, msg, flag);
> /* check len */
>
> buf_len = len;
> bufp = realloc(buf, buf_len);
> /* check bufp and set msg */
>
> len = recvmsg(fd, msg, flag);
> /* check len */
>
> What do you think?
I will have to check but IIRC it might be possible to use zero length
for the peek to only check the length which could help you to avoid both
the reallocation and copying the same data from kernel to userspace
twice.
Michal Kubecek
^ permalink raw reply
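The zero-length peek suggested above can be sketched as follows. This is an illustration under stated assumptions, not the libnetlink implementation: it relies on Linux semantics where MSG_TRUNC makes a datagram receive return the real message length even when the buffer is shorter, it uses an AF_UNIX datagram socket pair rather than a netlink socket, and the function names are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Length of the next queued datagram, obtained with a zero-length peek:
 * nothing is copied to userspace and the datagram stays queued.
 * Returns a negative errno value on failure.
 */
ssize_t peek_datagram_len(int fd)
{
	char dummy;
	ssize_t n;

	do {
		n = recv(fd, &dummy, 0, MSG_PEEK | MSG_TRUNC);
	} while (n < 0 && errno == EINTR);
	return n < 0 ? -errno : n;
}

/* Usage: send n bytes, size the buffer from the peek, then read once. */
ssize_t demo_peek_then_read(size_t n)
{
	int sv[2];
	char *msg, *buf;
	ssize_t want, got = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	msg = calloc(1, n);
	if (msg && send(sv[1], msg, n, 0) == (ssize_t)n) {
		want = peek_datagram_len(sv[0]);
		buf = want > 0 ? malloc((size_t)want) : NULL;
		if (buf) {
			got = recv(sv[0], buf, (size_t)want, 0);
			free(buf);
		}
	}
	free(msg);
	close(sv[0]);
	close(sv[1]);
	return got;
}
```

With this approach the buffer is allocated exactly once per message and the payload is copied from the kernel only once.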
* [PATCH net-next 0/4] cxgb4: add support to offload tc flower
From: Rahul Lakkireddy @ 2017-09-21 7:33 UTC (permalink / raw)
To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
This series of patches adds support to offload tc flower onto Chelsio
NICs.
Patch 1 adds basic skeleton to prepare for offloading tc flower flows.
Patch 2 adds support to add/remove flows for offload. Flows can have
accompanying masks. The following matches and actions are currently
supported for offload:
Match: ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
Action: drop, redirect to another port on the device.
Patch 3 adds support to offload tc-flower flows having
vlan actions: pop, push, and modify.
Patch 4 adds support to fetch stats for the offloaded tc flower flows
from hardware.
Support for offloading more match and action types will follow in a
subsequent series.
Thanks,
Rahul
Kumar Sanghvi (4):
cxgb4: add tc flower offload skeleton
cxgb4: add basic tc flower offload support
cxgb4: add support to offload action vlan
cxgb4: fetch stats for offloaded tc flower flows
drivers/net/ethernet/chelsio/cxgb4/Makefile | 4 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 4 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 102 +++++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 25 ++
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 458 +++++++++++++++++++++
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h | 66 +++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h | 3 +
7 files changed, 661 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
--
2.14.1
^ permalink raw reply
* [PATCH net-next 1/4] cxgb4: add tc flower offload skeleton
From: Rahul Lakkireddy @ 2017-09-21 7:33 UTC (permalink / raw)
To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>
From: Kumar Sanghvi <kumaras@chelsio.com>
Add basic skeleton to prepare for offloading tc-flower flows.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/Makefile | 4 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 22 +++++++++
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 57 ++++++++++++++++++++++
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h | 46 +++++++++++++++++
4 files changed, 128 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index 817212702f0a..fecd7aab673b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,7 +4,9 @@
obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o cxgb4_ptp.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o \
+ cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o \
+ cxgb4_ptp.o cxgb4_tc_flower.o
cxgb4-$(CONFIG_CHELSIO_T4_DCB) += cxgb4_dcb.o
cxgb4-$(CONFIG_CHELSIO_T4_FCOE) += cxgb4_fcoe.o
cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 92d9d795d874..8923affbdaf8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -79,6 +79,7 @@
#include "l2t.h"
#include "sched.h"
#include "cxgb4_tc_u32.h"
+#include "cxgb4_tc_flower.h"
#include "cxgb4_ptp.h"
char cxgb4_driver_name[] = KBUILD_MODNAME;
@@ -2873,6 +2874,25 @@ static int cxgb_set_tx_maxrate(struct net_device *dev, int index, u32 rate)
return err;
}
+static int cxgb_setup_tc_flower(struct net_device *dev,
+ struct tc_cls_flower_offload *cls_flower)
+{
+ if (!is_classid_clsact_ingress(cls_flower->common.classid) ||
+ cls_flower->common.chain_index)
+ return -EOPNOTSUPP;
+
+ switch (cls_flower->command) {
+ case TC_CLSFLOWER_REPLACE:
+ return cxgb4_tc_flower_replace(dev, cls_flower);
+ case TC_CLSFLOWER_DESTROY:
+ return cxgb4_tc_flower_destroy(dev, cls_flower);
+ case TC_CLSFLOWER_STATS:
+ return cxgb4_tc_flower_stats(dev, cls_flower);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
static int cxgb_setup_tc_cls_u32(struct net_device *dev,
struct tc_cls_u32_offload *cls_u32)
{
@@ -2907,6 +2927,8 @@ static int cxgb_setup_tc(struct net_device *dev, enum tc_setup_type type,
switch (type) {
case TC_SETUP_CLSU32:
return cxgb_setup_tc_cls_u32(dev, type_data);
+ case TC_SETUP_CLSFLOWER:
+ return cxgb_setup_tc_flower(dev, type_data);
default:
return -EOPNOTSUPP;
}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
new file mode 100644
index 000000000000..16dff71e4d02
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -0,0 +1,57 @@
+/*
+ * This file is part of the Chelsio T4/T5/T6 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2017 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
+#include "cxgb4.h"
+#include "cxgb4_tc_flower.h"
+
+int cxgb4_tc_flower_replace(struct net_device *dev,
+ struct tc_cls_flower_offload *cls)
+{
+ return -EOPNOTSUPP;
+}
+
+int cxgb4_tc_flower_destroy(struct net_device *dev,
+ struct tc_cls_flower_offload *cls)
+{
+ return -EOPNOTSUPP;
+}
+
+int cxgb4_tc_flower_stats(struct net_device *dev,
+ struct tc_cls_flower_offload *cls)
+{
+ return -EOPNOTSUPP;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
new file mode 100644
index 000000000000..b321fc205b5a
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -0,0 +1,46 @@
+/*
+ * This file is part of the Chelsio T4/T5/T6 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2017 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_FLOWER_H
+#define __CXGB4_TC_FLOWER_H
+
+#include <net/pkt_cls.h>
+
+int cxgb4_tc_flower_replace(struct net_device *dev,
+ struct tc_cls_flower_offload *cls);
+int cxgb4_tc_flower_destroy(struct net_device *dev,
+ struct tc_cls_flower_offload *cls);
+int cxgb4_tc_flower_stats(struct net_device *dev,
+ struct tc_cls_flower_offload *cls);
+#endif /* __CXGB4_TC_FLOWER_H */
--
2.14.1
^ permalink raw reply related
* [PATCH net-next 3/4] cxgb4: add support to offload action vlan
From: Rahul Lakkireddy @ 2017-09-21 7:33 UTC (permalink / raw)
To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>
From: Kumar Sanghvi <kumaras@chelsio.com>
Add support for offloading tc-flower flows having
vlan actions: pop, push and modify.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 43 ++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 1af01101faaf..fddb0c419edc 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -34,6 +34,7 @@
#include <net/tc_act/tc_gact.h>
#include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_vlan.h>
#include "cxgb4.h"
#include "cxgb4_tc_flower.h"
@@ -185,6 +186,27 @@ static void cxgb4_process_flow_actions(struct net_device *in,
fs->action = FILTER_SWITCH;
fs->eport = pi->port_id;
+ } else if (is_tcf_vlan(a)) {
+ u32 vlan_action = tcf_vlan_action(a);
+ u8 prio = tcf_vlan_push_prio(a);
+ u16 vid = tcf_vlan_push_vid(a);
+ u16 vlan_tci = (prio << VLAN_PRIO_SHIFT) | vid;
+
+ switch (vlan_action) {
+ case TCA_VLAN_ACT_POP:
+ fs->newvlan |= VLAN_REMOVE;
+ break;
+ case TCA_VLAN_ACT_PUSH:
+ fs->newvlan |= VLAN_INSERT;
+ fs->vlan = vlan_tci;
+ break;
+ case TCA_VLAN_ACT_MODIFY:
+ fs->newvlan |= VLAN_REWRITE;
+ fs->vlan = vlan_tci;
+ break;
+ default:
+ break;
+ }
}
}
}
@@ -222,6 +244,27 @@ static int cxgb4_validate_flow_actions(struct net_device *dev,
__func__);
return -EINVAL;
}
+ } else if (is_tcf_vlan(a)) {
+ u16 proto = be16_to_cpu(tcf_vlan_push_proto(a));
+ u32 vlan_action = tcf_vlan_action(a);
+
+ switch (vlan_action) {
+ case TCA_VLAN_ACT_POP:
+ break;
+ case TCA_VLAN_ACT_PUSH:
+ case TCA_VLAN_ACT_MODIFY:
+ if (proto != ETH_P_8021Q) {
+ netdev_err(dev,
+ "%s: Unsupp. vlan proto\n",
+ __func__);
+ return -EOPNOTSUPP;
+ }
+ break;
+ default:
+ netdev_err(dev, "%s: Unsupported vlan action\n",
+ __func__);
+ return -EOPNOTSUPP;
+ }
} else {
netdev_err(dev, "%s: Unsupported action\n", __func__);
return -EOPNOTSUPP;
--
2.14.1
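The VLAN patch above builds the 16-bit tag control information stored in fs->vlan by shifting the priority into the top three bits and OR-ing in the VLAN ID. A minimal userspace sketch of that composition follows. It assumes the standard 802.1Q layout (a shift of 13, matching the kernel's VLAN_PRIO_SHIFT); make_vlan_tci and the SKETCH_* constants are hypothetical names used only for illustration and are not part of the driver. Unlike the patch, the sketch also masks the VID to 12 bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VLAN_PRIO_SHIFT 13      /* assumption: mirrors the kernel's VLAN_PRIO_SHIFT */
#define SKETCH_VLAN_VID_MASK   0x0fffu /* VLAN ID occupies the low 12 bits */

/* Hypothetical helper: combine a 3-bit priority and a 12-bit VLAN ID
 * into a host-order 802.1Q TCI value.
 */
uint16_t make_vlan_tci(uint8_t prio, uint16_t vid)
{
	assert(prio < 8); /* priority must fit in 3 bits */
	return (uint16_t)(((unsigned int)prio << SKETCH_VLAN_PRIO_SHIFT) |
			  (vid & SKETCH_VLAN_VID_MASK));
}
```

For example, priority 3 with VLAN ID 100 gives 0x6064.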
* [PATCH net-next 2/4] cxgb4: add basic tc flower offload support
From: Rahul Lakkireddy @ 2017-09-21 7:33 UTC
To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>
From: Kumar Sanghvi <kumaras@chelsio.com>
Add support for adding and removing flows for offload. The
following matches and actions are supported when offloading a flow:
Match: ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
Action: drop, redirect to another port on the device.
The qualifying flows can have accompanying mask information.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 3 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 26 ++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 2 +
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 285 ++++++++++++++++++++-
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h | 17 ++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h | 1 +
6 files changed, 332 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index ea72d2d2e1b4..26eac599ab2c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -904,6 +904,9 @@ struct adapter {
/* TC u32 offload */
struct cxgb4_tc_u32_table *tc_u32;
struct chcr_stats_debug chcr_stats;
+
+ /* TC flower offload */
+ DECLARE_HASHTABLE(flower_anymatch_tbl, 9);
};
/* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 45b5853ca2f1..07a4619e2164 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -148,6 +148,32 @@ static int get_filter_steerq(struct net_device *dev,
return iq;
}
+int cxgb4_get_free_ftid(struct net_device *dev, int family)
+{
+ struct adapter *adap = netdev2adap(dev);
+ struct tid_info *t = &adap->tids;
+ int ftid;
+
+ spin_lock_bh(&t->ftid_lock);
+ if (family == PF_INET) {
+ ftid = find_first_zero_bit(t->ftid_bmap, t->nftids);
+ if (ftid >= t->nftids)
+ ftid = -1;
+ } else {
+ ftid = bitmap_find_free_region(t->ftid_bmap, t->nftids, 2);
+ if (ftid < 0) {
+ ftid = -1;
+ goto out_unlock;
+ }
+
+ /* this is only a lookup, keep the found region unallocated */
+ bitmap_release_region(t->ftid_bmap, ftid, 2);
+ }
+out_unlock:
+ spin_unlock_bh(&t->ftid_lock);
+ return ftid;
+}
+
static int cxgb4_set_ftid(struct tid_info *t, int fidx, int family)
{
spin_lock_bh(&t->ftid_lock);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 8923affbdaf8..3ba4e1ff8486 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -5105,6 +5105,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (!adapter->tc_u32)
dev_warn(&pdev->dev,
"could not offload tc u32, continuing\n");
+
+ cxgb4_init_tc_flower(adapter);
}
if (is_offload(adapter)) {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 16dff71e4d02..1af01101faaf 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -38,16 +38,292 @@
#include "cxgb4.h"
#include "cxgb4_tc_flower.h"
+static struct ch_tc_flower_entry *allocate_flower_entry(void)
+{
+ struct ch_tc_flower_entry *new = kzalloc(sizeof(*new), GFP_KERNEL);
+ return new;
+}
+
+/* Must be called with either RTNL or rcu_read_lock */
+static struct ch_tc_flower_entry *ch_flower_lookup(struct adapter *adap,
+ unsigned long flower_cookie)
+{
+ struct ch_tc_flower_entry *flower_entry;
+
+ hash_for_each_possible_rcu(adap->flower_anymatch_tbl, flower_entry,
+ link, flower_cookie)
+ if (flower_entry->tc_flower_cookie == flower_cookie)
+ return flower_entry;
+ return NULL;
+}
+
+static void cxgb4_process_flow_match(struct net_device *dev,
+ struct tc_cls_flower_offload *cls,
+ struct ch_filter_specification *fs)
+{
+ u16 addr_type = 0;
+
+ if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_CONTROL)) {
+ struct flow_dissector_key_control *key =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_CONTROL,
+ cls->key);
+
+ addr_type = key->addr_type;
+ }
+
+ if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+ struct flow_dissector_key_basic *key =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_BASIC,
+ cls->key);
+ struct flow_dissector_key_basic *mask =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_BASIC,
+ cls->mask);
+ u16 ethtype_key = ntohs(key->n_proto);
+ u16 ethtype_mask = ntohs(mask->n_proto);
+
+ if (ethtype_key == ETH_P_ALL) {
+ ethtype_key = 0;
+ ethtype_mask = 0;
+ }
+
+ fs->val.ethtype = ethtype_key;
+ fs->mask.ethtype = ethtype_mask;
+ fs->val.proto = key->ip_proto;
+ fs->mask.proto = mask->ip_proto;
+ }
+
+ if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+ struct flow_dissector_key_ipv4_addrs *key =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+ cls->key);
+ struct flow_dissector_key_ipv4_addrs *mask =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+ cls->mask);
+ fs->type = 0;
+ memcpy(&fs->val.lip[0], &key->dst, sizeof(key->dst));
+ memcpy(&fs->val.fip[0], &key->src, sizeof(key->src));
+ memcpy(&fs->mask.lip[0], &mask->dst, sizeof(mask->dst));
+ memcpy(&fs->mask.fip[0], &mask->src, sizeof(mask->src));
+ }
+
+ if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+ struct flow_dissector_key_ipv6_addrs *key =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+ cls->key);
+ struct flow_dissector_key_ipv6_addrs *mask =
+ skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+ cls->mask);
+
+ fs->type = 1;
+ memcpy(&fs->val.lip[0], key->dst.s6_addr, sizeof(key->dst));
+ memcpy(&fs->val.fip[0], key->src.s6_addr, sizeof(key->src));
+ memcpy(&fs->mask.lip[0], mask->dst.s6_addr, sizeof(mask->dst));
+ memcpy(&fs->mask.fip[0], mask->src.s6_addr, sizeof(mask->src));
+ }
+
+ if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_PORTS)) {
+ struct flow_dissector_key_ports *key, *mask;
+
+ key = skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_PORTS,
+ cls->key);
+ mask = skb_flow_dissector_target(cls->dissector,
+ FLOW_DISSECTOR_KEY_PORTS,
+ cls->mask);
+ fs->val.lport = be16_to_cpu(key->dst);
+ fs->mask.lport = be16_to_cpu(mask->dst);
+ fs->val.fport = be16_to_cpu(key->src);
+ fs->mask.fport = be16_to_cpu(mask->src);
+ }
+
+ /* Match only packets coming from the ingress port where this
+ * filter will be created.
+ */
+ fs->val.iport = netdev2pinfo(dev)->port_id;
+ fs->mask.iport = ~0;
+}
+
+static int cxgb4_validate_flow_match(struct net_device *dev,
+ struct tc_cls_flower_offload *cls)
+{
+ if (cls->dissector->used_keys &
+ ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+ BIT(FLOW_DISSECTOR_KEY_BASIC) |
+ BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
+ BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
+ BIT(FLOW_DISSECTOR_KEY_PORTS))) {
+ netdev_warn(dev, "Unsupported key used: 0x%x\n",
+ cls->dissector->used_keys);
+ return -EOPNOTSUPP;
+ }
+ return 0;
+}
+
+static void cxgb4_process_flow_actions(struct net_device *in,
+ struct tc_cls_flower_offload *cls,
+ struct ch_filter_specification *fs)
+{
+ const struct tc_action *a;
+ LIST_HEAD(actions);
+
+ tcf_exts_to_list(cls->exts, &actions);
+ list_for_each_entry(a, &actions, list) {
+ if (is_tcf_gact_shot(a)) {
+ fs->action = FILTER_DROP;
+ } else if (is_tcf_mirred_egress_redirect(a)) {
+ int ifindex = tcf_mirred_ifindex(a);
+ struct net_device *out = __dev_get_by_index(dev_net(in),
+ ifindex);
+ struct port_info *pi = netdev_priv(out);
+
+ fs->action = FILTER_SWITCH;
+ fs->eport = pi->port_id;
+ }
+ }
+}
+
+static int cxgb4_validate_flow_actions(struct net_device *dev,
+ struct tc_cls_flower_offload *cls)
+{
+ const struct tc_action *a;
+ LIST_HEAD(actions);
+
+ tcf_exts_to_list(cls->exts, &actions);
+ list_for_each_entry(a, &actions, list) {
+ if (is_tcf_gact_shot(a)) {
+ /* Do nothing */
+ } else if (is_tcf_mirred_egress_redirect(a)) {
+ struct adapter *adap = netdev2adap(dev);
+ struct net_device *n_dev;
+ unsigned int i, ifindex;
+ bool found = false;
+
+ ifindex = tcf_mirred_ifindex(a);
+ for_each_port(adap, i) {
+ n_dev = adap->port[i];
+ if (ifindex == n_dev->ifindex) {
+ found = true;
+ break;
+ }
+ }
+
+ /* If interface doesn't belong to our hw, then
+ * the provided output port is not valid
+ */
+ if (!found) {
+ netdev_err(dev, "%s: Out port invalid\n",
+ __func__);
+ return -EINVAL;
+ }
+ } else {
+ netdev_err(dev, "%s: Unsupported action\n", __func__);
+ return -EOPNOTSUPP;
+ }
+ }
+ return 0;
+}
+
int cxgb4_tc_flower_replace(struct net_device *dev,
struct tc_cls_flower_offload *cls)
{
- return -EOPNOTSUPP;
+ struct adapter *adap = netdev2adap(dev);
+ struct ch_tc_flower_entry *ch_flower;
+ struct ch_filter_specification *fs;
+ struct filter_ctx ctx;
+ int fidx;
+ int ret;
+
+ if (cxgb4_validate_flow_actions(dev, cls))
+ return -EOPNOTSUPP;
+
+ if (cxgb4_validate_flow_match(dev, cls))
+ return -EOPNOTSUPP;
+
+ ch_flower = allocate_flower_entry();
+ if (!ch_flower) {
+ netdev_err(dev, "%s: ch_flower alloc failed.\n", __func__);
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ fs = &ch_flower->fs;
+ fs->hitcnts = 1;
+ cxgb4_process_flow_actions(dev, cls, fs);
+ cxgb4_process_flow_match(dev, cls, fs);
+
+ fidx = cxgb4_get_free_ftid(dev, fs->type ? PF_INET6 : PF_INET);
+ if (fidx < 0) {
+ netdev_err(dev, "%s: No fidx for offload.\n", __func__);
+ ret = -ENOMEM;
+ goto free_entry;
+ }
+
+ init_completion(&ctx.completion);
+ ret = __cxgb4_set_filter(dev, fidx, fs, &ctx);
+ if (ret) {
+ netdev_err(dev, "%s: filter creation err %d\n",
+ __func__, ret);
+ goto free_entry;
+ }
+
+ /* Wait for reply */
+ ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+ if (!ret) {
+ ret = -ETIMEDOUT;
+ goto free_entry;
+ }
+
+ ret = ctx.result;
+ /* Check if hw returned error for filter creation */
+ if (ret) {
+ netdev_err(dev, "%s: filter creation err %d\n",
+ __func__, ret);
+ goto free_entry;
+ }
+
+ INIT_HLIST_NODE(&ch_flower->link);
+ ch_flower->tc_flower_cookie = cls->cookie;
+ ch_flower->filter_id = ctx.tid;
+ hash_add_rcu(adap->flower_anymatch_tbl, &ch_flower->link, cls->cookie);
+
+ return ret;
+
+free_entry:
+ kfree(ch_flower);
+err:
+ return ret;
}
int cxgb4_tc_flower_destroy(struct net_device *dev,
struct tc_cls_flower_offload *cls)
{
- return -EOPNOTSUPP;
+ struct adapter *adap = netdev2adap(dev);
+ struct ch_tc_flower_entry *ch_flower;
+ int ret;
+
+ ch_flower = ch_flower_lookup(adap, cls->cookie);
+ if (!ch_flower) {
+ ret = -ENOENT;
+ goto err;
+ }
+
+ ret = cxgb4_del_filter(dev, ch_flower->filter_id);
+ if (ret)
+ goto err;
+
+ hash_del_rcu(&ch_flower->link);
+ kfree_rcu(ch_flower, rcu);
+ return ret;
+
+err:
+ return ret;
}
int cxgb4_tc_flower_stats(struct net_device *dev,
@@ -55,3 +331,8 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
{
return -EOPNOTSUPP;
}
+
+void cxgb4_init_tc_flower(struct adapter *adap)
+{
+ hash_init(adap->flower_anymatch_tbl);
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index b321fc205b5a..6145a9e056eb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -37,10 +37,27 @@
#include <net/pkt_cls.h>
+struct ch_tc_flower_stats {
+ u64 packet_count;
+ u64 byte_count;
+ u64 last_used;
+};
+
+struct ch_tc_flower_entry {
+ struct ch_filter_specification fs;
+ struct ch_tc_flower_stats stats;
+ unsigned long tc_flower_cookie;
+ struct hlist_node link;
+ struct rcu_head rcu;
+ u32 filter_id;
+};
+
int cxgb4_tc_flower_replace(struct net_device *dev,
struct tc_cls_flower_offload *cls);
int cxgb4_tc_flower_destroy(struct net_device *dev,
struct tc_cls_flower_offload *cls);
int cxgb4_tc_flower_stats(struct net_device *dev,
struct tc_cls_flower_offload *cls);
+
+void cxgb4_init_tc_flower(struct adapter *adap);
#endif /* __CXGB4_TC_FLOWER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 84541fce94c5..88487095d14f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -212,6 +212,7 @@ struct filter_ctx {
struct ch_filter_specification;
+int cxgb4_get_free_ftid(struct net_device *dev, int family);
int __cxgb4_set_filter(struct net_device *dev, int filter_id,
struct ch_filter_specification *fs,
struct filter_ctx *ctx);
--
2.14.1
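The free-filter lookup in cxgb4_get_free_ftid() above treats the two address families differently: an IPv4 filter takes the first free slot, while an IPv6 filter needs an aligned run of four free slots (order 2), which is only looked up and not reserved. A rough userspace model of that search is sketched below. It assumes a bitmap of at most 64 slots held in a single uint64_t; find_free_slot and find_free_region are hypothetical names and stand in for, but are not, the kernel's find_first_zero_bit() and bitmap_find_free_region().

```c
#include <stdint.h>

/* Hypothetical model: bit i set means filter slot i is in use.
 * Returns the first free slot below nslots, or -1 if none is free.
 */
int find_free_slot(uint64_t bmap, unsigned int nslots)
{
	unsigned int i;

	for (i = 0; i < nslots && i < 64; i++)
		if (!((bmap >> i) & 1))
			return (int)i;
	return -1;
}

/* Returns the start of the first free region of (1 << order) slots,
 * aligned to its own size, or -1. Nothing is marked as allocated,
 * so this is a pure lookup.
 */
int find_free_region(uint64_t bmap, unsigned int nslots, unsigned int order)
{
	unsigned int len = 1u << order;
	unsigned int pos;

	for (pos = 0; pos + len <= nslots && pos + len <= 64; pos += len) {
		uint64_t mask = (len == 64) ? ~0ULL
					    : ((1ULL << len) - 1) << pos;

		if (!(bmap & mask))
			return (int)pos;
	}
	return -1;
}
```

With slots 0 to 2 in use, an IPv4 lookup would return slot 3, while an order-2 lookup skips the partly used first group and returns slot 4.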
* [PATCH net-next 4/4] cxgb4: fetch stats for offloaded tc flower flows
From: Rahul Lakkireddy @ 2017-09-21 7:33 UTC
To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>
From: Kumar Sanghvi <kumaras@chelsio.com>
Add support to retrieve stats from hardware for offloaded tc flower
flows. Also, poll for the stats of offloaded flows via timer callback.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 76 +++++++++++++++++++++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 1 +
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 79 +++++++++++++++++++++-
.../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h | 3 +
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h | 2 +
6 files changed, 161 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 26eac599ab2c..8a94d97df025 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -907,6 +907,7 @@ struct adapter {
/* TC flower offload */
DECLARE_HASHTABLE(flower_anymatch_tbl, 9);
+ struct timer_list flower_stats_timer;
};
/* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 07a4619e2164..c09c4de8c9fb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -148,6 +148,82 @@ static int get_filter_steerq(struct net_device *dev,
return iq;
}
+static int get_filter_count(struct adapter *adapter, unsigned int fidx,
+ u64 *pkts, u64 *bytes)
+{
+ unsigned int tcb_base, tcbaddr;
+ unsigned int word_offset;
+ struct filter_entry *f;
+ __be64 be64_byte_count;
+ int ret;
+
+ tcb_base = t4_read_reg(adapter, TP_CMM_TCB_BASE_A);
+ if ((fidx != (adapter->tids.nftids + adapter->tids.nsftids - 1)) &&
+ fidx >= adapter->tids.nftids)
+ return -E2BIG;
+
+ f = &adapter->tids.ftid_tab[fidx];
+ if (!f->valid)
+ return -EINVAL;
+
+ tcbaddr = tcb_base + f->tid * TCB_SIZE;
+
+ spin_lock(&adapter->win0_lock);
+ if (is_t4(adapter->params.chip)) {
+ __be64 be64_count;
+
+ /* T4 doesn't maintain byte counts in hw */
+ *bytes = 0;
+
+ /* Get pkts */
+ word_offset = 4;
+ ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+ tcbaddr + (word_offset * sizeof(__be32)),
+ sizeof(be64_count),
+ (__be32 *)&be64_count,
+ T4_MEMORY_READ);
+ if (ret < 0)
+ goto out;
+ *pkts = be64_to_cpu(be64_count);
+ } else {
+ __be32 be32_count;
+
+ /* Get bytes */
+ word_offset = 4;
+ ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+ tcbaddr + (word_offset * sizeof(__be32)),
+ sizeof(be64_byte_count),
+ &be64_byte_count,
+ T4_MEMORY_READ);
+ if (ret < 0)
+ goto out;
+ *bytes = be64_to_cpu(be64_byte_count);
+
+ /* Get pkts */
+ word_offset = 6;
+ ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+ tcbaddr + (word_offset * sizeof(__be32)),
+ sizeof(be32_count),
+ &be32_count,
+ T4_MEMORY_READ);
+ if (ret < 0)
+ goto out;
+ *pkts = (u64)be32_to_cpu(be32_count);
+ }
+
+out:
+ spin_unlock(&adapter->win0_lock);
+ return ret;
+}
+
+int cxgb4_get_filter_counters(struct net_device *dev, unsigned int fidx,
+ u64 *hitcnt, u64 *bytecnt)
+{
+ struct adapter *adapter = netdev2adap(dev);
+
+ return get_filter_count(adapter, fidx, hitcnt, bytecnt);
+}
+
int cxgb4_get_free_ftid(struct net_device *dev, int family)
{
struct adapter *adap = netdev2adap(dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 3ba4e1ff8486..d634098d52ab 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4637,6 +4637,7 @@ static void free_some_resources(struct adapter *adapter)
kvfree(adapter->l2t);
t4_cleanup_sched(adapter);
kvfree(adapter->tids.tid_tab);
+ cxgb4_cleanup_tc_flower(adapter);
cxgb4_cleanup_tc_u32(adapter);
kfree(adapter->sge.egr_map);
kfree(adapter->sge.ingr_map);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index fddb0c419edc..7a47d4e88a57 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -39,9 +39,13 @@
#include "cxgb4.h"
#include "cxgb4_tc_flower.h"
+#define STATS_CHECK_PERIOD (HZ / 2)
+
static struct ch_tc_flower_entry *allocate_flower_entry(void)
{
struct ch_tc_flower_entry *new = kzalloc(sizeof(*new), GFP_KERNEL);
+ if (new)
+ spin_lock_init(&new->lock);
return new;
}
@@ -369,13 +373,87 @@ int cxgb4_tc_flower_destroy(struct net_device *dev,
return ret;
}
+void ch_flower_stats_cb(unsigned long data)
+{
+ struct adapter *adap = (struct adapter *)data;
+ struct ch_tc_flower_entry *flower_entry;
+ struct ch_tc_flower_stats *ofld_stats;
+ unsigned int i;
+ u64 packets;
+ u64 bytes;
+ int ret;
+
+ rcu_read_lock();
+ hash_for_each_rcu(adap->flower_anymatch_tbl, i, flower_entry, link) {
+ ret = cxgb4_get_filter_counters(adap->port[0],
+ flower_entry->filter_id,
+ &packets, &bytes);
+ if (!ret) {
+ spin_lock(&flower_entry->lock);
+ ofld_stats = &flower_entry->stats;
+
+ if (ofld_stats->prev_packet_count != packets) {
+ ofld_stats->prev_packet_count = packets;
+ ofld_stats->last_used = jiffies;
+ }
+ spin_unlock(&flower_entry->lock);
+ }
+ }
+ rcu_read_unlock();
+ mod_timer(&adap->flower_stats_timer, jiffies + STATS_CHECK_PERIOD);
+}
+
int cxgb4_tc_flower_stats(struct net_device *dev,
struct tc_cls_flower_offload *cls)
{
- return -EOPNOTSUPP;
+ struct adapter *adap = netdev2adap(dev);
+ struct ch_tc_flower_stats *ofld_stats;
+ struct ch_tc_flower_entry *ch_flower;
+ u64 packets;
+ u64 bytes;
+ int ret;
+
+ ch_flower = ch_flower_lookup(adap, cls->cookie);
+ if (!ch_flower) {
+ ret = -ENOENT;
+ goto err;
+ }
+
+ ret = cxgb4_get_filter_counters(dev, ch_flower->filter_id,
+ &packets, &bytes);
+ if (ret < 0)
+ goto err;
+
+ spin_lock_bh(&ch_flower->lock);
+ ofld_stats = &ch_flower->stats;
+ if (ofld_stats->packet_count != packets) {
+ if (ofld_stats->prev_packet_count != packets)
+ ofld_stats->last_used = jiffies;
+ tcf_exts_stats_update(cls->exts, bytes - ofld_stats->byte_count,
+ packets - ofld_stats->packet_count,
+ ofld_stats->last_used);
+
+ ofld_stats->packet_count = packets;
+ ofld_stats->byte_count = bytes;
+ ofld_stats->prev_packet_count = packets;
+ }
+ spin_unlock_bh(&ch_flower->lock);
+ return 0;
+
+err:
+ return ret;
}
void cxgb4_init_tc_flower(struct adapter *adap)
{
hash_init(adap->flower_anymatch_tbl);
+ setup_timer(&adap->flower_stats_timer, ch_flower_stats_cb,
+ (unsigned long)adap);
+ mod_timer(&adap->flower_stats_timer, jiffies + STATS_CHECK_PERIOD);
+}
+
+void cxgb4_cleanup_tc_flower(struct adapter *adap)
+{
+ if (adap->flower_stats_timer.function)
+ del_timer_sync(&adap->flower_stats_timer);
}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index 6145a9e056eb..604feffc752e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -38,6 +38,7 @@
#include <net/pkt_cls.h>
struct ch_tc_flower_stats {
+ u64 prev_packet_count;
u64 packet_count;
u64 byte_count;
u64 last_used;
@@ -49,6 +50,7 @@ struct ch_tc_flower_entry {
unsigned long tc_flower_cookie;
struct hlist_node link;
struct rcu_head rcu;
+ spinlock_t lock; /* lock for stats */
u32 filter_id;
};
@@ -60,4 +62,5 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
struct tc_cls_flower_offload *cls);
void cxgb4_init_tc_flower(struct adapter *adap);
+void cxgb4_cleanup_tc_flower(struct adapter *adap);
#endif /* __CXGB4_TC_FLOWER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 88487095d14f..52324c77a4fe 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -221,6 +221,8 @@ int __cxgb4_del_filter(struct net_device *dev, int filter_id,
int cxgb4_set_filter(struct net_device *dev, int filter_id,
struct ch_filter_specification *fs);
int cxgb4_del_filter(struct net_device *dev, int filter_id);
+int cxgb4_get_filter_counters(struct net_device *dev, unsigned int fidx,
+ u64 *hitcnt, u64 *bytecnt);
static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
{
--
2.14.1
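The stats patch above reports deltas rather than absolute values: the hardware counters only grow, so cxgb4_tc_flower_stats() hands the stack the difference from the last reported values and then caches the new totals. The bookkeeping can be sketched in userspace as follows. This is an illustration under assumptions, not the driver code: struct flow_stats mirrors the fields of ch_tc_flower_stats, while struct stats_delta, flow_stats_update and the plain integer timestamp standing in for jiffies are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cached per-flow counters, mirroring ch_tc_flower_stats. */
struct flow_stats {
	uint64_t prev_packet_count; /* last value seen by the poller */
	uint64_t packet_count;      /* last value reported upward */
	uint64_t byte_count;
	uint64_t last_used;
};

/* Hypothetical result type: what would be passed to the stack. */
struct stats_delta {
	uint64_t packets;
	uint64_t bytes;
	bool changed;
};

/* Compare fresh hardware counters against the cache and return the
 * increments since the last report. "now" stands in for jiffies.
 */
struct stats_delta flow_stats_update(struct flow_stats *s,
				     uint64_t hw_packets, uint64_t hw_bytes,
				     uint64_t now)
{
	struct stats_delta d = { 0, 0, false };

	if (s->packet_count == hw_packets)
		return d; /* nothing new to report */

	/* Only refresh last_used if the poller has not already seen
	 * this packet count and stamped it.
	 */
	if (s->prev_packet_count != hw_packets)
		s->last_used = now;

	d.packets = hw_packets - s->packet_count;
	d.bytes = hw_bytes - s->byte_count;
	d.changed = true;

	s->packet_count = hw_packets;
	s->byte_count = hw_bytes;
	s->prev_packet_count = hw_packets;
	return d;
}
```

Calling it twice with the same hardware values yields a delta on the first call and no change on the second, which is why an idle flow keeps its old last_used time.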
* [PATCH iproute2 master 0/2] BPF/XDP json follow-up
From: Daniel Borkmann @ 2017-09-21 8:42 UTC
To: stephen; +Cc: ast, netdev, Daniel Borkmann
After merging net-next branch into master, Stephen asked to
fix up json dump for XDP as there were some merge conflicts,
so here it is.
Thanks!
Daniel Borkmann (2):
json: move json printer to common library
bpf: properly output json for xdp
include/json_print.h | 71 ++++++++++++++++
ip/Makefile | 2 +-
ip/ip_common.h | 65 ++------------
ip/ip_print.c | 233 ---------------------------------------------------
ip/iplink_xdp.c | 74 +++++++++-------
lib/Makefile | 2 +-
lib/bpf.c | 19 +++--
lib/json_print.c | 231 ++++++++++++++++++++++++++++++++++++++++++++++++++
8 files changed, 369 insertions(+), 328 deletions(-)
create mode 100644 include/json_print.h
delete mode 100644 ip/ip_print.c
create mode 100644 lib/json_print.c
--
1.9.3
* [PATCH iproute2 master 1/2] json: move json printer to common library
From: Daniel Borkmann @ 2017-09-21 8:42 UTC
To: stephen; +Cc: ast, netdev, Daniel Borkmann
In-Reply-To: <cover.1505956723.git.daniel@iogearbox.net>
Move the json printer, which is based on the json writer, into
the iproute2 library so that it can be used by library code and
by tools other than ip. It arguably should have lived there from
the beginning, given that the json writer is already in the
library.
No functional changes.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/json_print.h | 71 ++++++++++++++++
ip/Makefile | 2 +-
ip/ip_common.h | 65 ++------------
ip/ip_print.c | 233 ---------------------------------------------------
lib/Makefile | 2 +-
lib/json_print.c | 231 ++++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 312 insertions(+), 292 deletions(-)
create mode 100644 include/json_print.h
delete mode 100644 ip/ip_print.c
create mode 100644 lib/json_print.c
diff --git a/include/json_print.h b/include/json_print.h
new file mode 100644
index 0000000..44cf5ac
--- /dev/null
+++ b/include/json_print.h
@@ -0,0 +1,71 @@
+/*
+ * json_print.h "print regular or json output, based on json_writer".
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors: Julien Fortin, <julien@cumulusnetworks.com>
+ */
+
+#ifndef _JSON_PRINT_H_
+#define _JSON_PRINT_H_
+
+#include "json_writer.h"
+#include "color.h"
+
+json_writer_t *get_json_writer(void);
+
+/*
+ * use:
+ * - PRINT_ANY for context based output
+ * - PRINT_FP for non json specific output
+ * - PRINT_JSON for json specific output
+ */
+enum output_type {
+ PRINT_FP = 1,
+ PRINT_JSON = 2,
+ PRINT_ANY = 4,
+};
+
+void new_json_obj(int json, FILE *fp);
+void delete_json_obj(void);
+
+bool is_json_context(void);
+
+void set_current_fp(FILE *fp);
+
+void fflush_fp(void);
+
+void open_json_object(const char *str);
+void close_json_object(void);
+void open_json_array(enum output_type type, const char *delim);
+void close_json_array(enum output_type type, const char *delim);
+
+#define _PRINT_FUNC(type_name, type) \
+ void print_color_##type_name(enum output_type t, \
+ enum color_attr color, \
+ const char *key, \
+ const char *fmt, \
+ type value); \
+ \
+ static inline void print_##type_name(enum output_type t, \
+ const char *key, \
+ const char *fmt, \
+ type value) \
+ { \
+ print_color_##type_name(t, -1, key, fmt, value); \
+ }
+_PRINT_FUNC(int, int);
+_PRINT_FUNC(bool, bool);
+_PRINT_FUNC(null, const char*);
+_PRINT_FUNC(string, const char*);
+_PRINT_FUNC(uint, uint64_t);
+_PRINT_FUNC(hu, unsigned short);
+_PRINT_FUNC(hex, unsigned int);
+_PRINT_FUNC(0xhex, unsigned int);
+_PRINT_FUNC(lluint, unsigned long long int);
+#undef _PRINT_FUNC
+
+#endif /* _JSON_PRINT_H_ */
diff --git a/ip/Makefile b/ip/Makefile
index 52c9a2e..5a1c7ad 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -9,7 +9,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
iplink_geneve.o iplink_vrf.o iproute_lwtunnel.o ipmacsec.o ipila.o \
- ipvrf.o iplink_xstats.o ipseg6.o ip_print.o
+ ipvrf.o iplink_xstats.o ipseg6.o
RTMONOBJ=rtmon.o
diff --git a/ip/ip_common.h b/ip/ip_common.h
index efc789c..4b8b0a7 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -1,3 +1,10 @@
+#ifndef _IP_COMMON_H_
+#define _IP_COMMON_H_
+
+#include <stdbool.h>
+
+#include "json_print.h"
+
struct link_filter {
int ifindex;
int family;
@@ -101,8 +108,6 @@ static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
extern struct rtnl_handle rth;
-#include <stdbool.h>
-
struct link_util {
struct link_util *next;
const char *id;
@@ -141,58 +146,4 @@ int name_is_vrf(const char *name);
void print_num(FILE *fp, unsigned int width, uint64_t count);
-#include "json_writer.h"
-
-json_writer_t *get_json_writer(void);
-/*
- * use:
- * - PRINT_ANY for context based output
- * - PRINT_FP for non json specific output
- * - PRINT_JSON for json specific output
- */
-enum output_type {
- PRINT_FP = 1,
- PRINT_JSON = 2,
- PRINT_ANY = 4,
-};
-
-void new_json_obj(int json, FILE *fp);
-void delete_json_obj(void);
-
-bool is_json_context(void);
-
-void set_current_fp(FILE *fp);
-
-void fflush_fp(void);
-
-void open_json_object(const char *str);
-void close_json_object(void);
-void open_json_array(enum output_type type, const char *delim);
-void close_json_array(enum output_type type, const char *delim);
-
-#include "color.h"
-
-#define _PRINT_FUNC(type_name, type) \
- void print_color_##type_name(enum output_type t, \
- enum color_attr color, \
- const char *key, \
- const char *fmt, \
- type value); \
- \
- static inline void print_##type_name(enum output_type t, \
- const char *key, \
- const char *fmt, \
- type value) \
- { \
- print_color_##type_name(t, -1, key, fmt, value); \
- }
-_PRINT_FUNC(int, int);
-_PRINT_FUNC(bool, bool);
-_PRINT_FUNC(null, const char*);
-_PRINT_FUNC(string, const char*);
-_PRINT_FUNC(uint, uint64_t);
-_PRINT_FUNC(hu, unsigned short);
-_PRINT_FUNC(hex, unsigned int);
-_PRINT_FUNC(0xhex, unsigned int);
-_PRINT_FUNC(lluint, unsigned long long int);
-#undef _PRINT_FUNC
+#endif /* _IP_COMMON_H_ */
diff --git a/ip/ip_print.c b/ip/ip_print.c
deleted file mode 100644
index 4cd6a0b..0000000
--- a/ip/ip_print.c
+++ /dev/null
@@ -1,233 +0,0 @@
-/*
- * ip_print.c "ip print regular or json output".
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- * Authors: Julien Fortin, <julien@cumulusnetworks.com>
- *
- */
-
-#include <stdarg.h>
-#include <stdio.h>
-
-#include "utils.h"
-#include "ip_common.h"
-#include "json_writer.h"
-
-static json_writer_t *_jw;
-static FILE *_fp;
-
-#define _IS_JSON_CONTEXT(type) ((type & PRINT_JSON || type & PRINT_ANY) && _jw)
-#define _IS_FP_CONTEXT(type) (!_jw && (type & PRINT_FP || type & PRINT_ANY))
-
-void new_json_obj(int json, FILE *fp)
-{
- if (json) {
- _jw = jsonw_new(fp);
- if (!_jw) {
- perror("json object");
- exit(1);
- }
- jsonw_pretty(_jw, true);
- jsonw_start_array(_jw);
- }
- set_current_fp(fp);
-}
-
-void delete_json_obj(void)
-{
- if (_jw) {
- jsonw_end_array(_jw);
- jsonw_destroy(&_jw);
- }
-}
-
-bool is_json_context(void)
-{
- return _jw != NULL;
-}
-
-void set_current_fp(FILE *fp)
-{
- if (!fp) {
- fprintf(stderr, "Error: invalid file pointer.\n");
- exit(1);
- }
- _fp = fp;
-}
-
-json_writer_t *get_json_writer(void)
-{
- return _jw;
-}
-
-void open_json_object(const char *str)
-{
- if (_IS_JSON_CONTEXT(PRINT_JSON)) {
- if (str)
- jsonw_name(_jw, str);
- jsonw_start_object(_jw);
- }
-}
-
-void close_json_object(void)
-{
- if (_IS_JSON_CONTEXT(PRINT_JSON))
- jsonw_end_object(_jw);
-}
-
-/*
- * Start json array or string array using
- * the provided string as json key (if not null)
- * or as array delimiter in non-json context.
- */
-void open_json_array(enum output_type type, const char *str)
-{
- if (_IS_JSON_CONTEXT(type)) {
- if (str)
- jsonw_name(_jw, str);
- jsonw_start_array(_jw);
- } else if (_IS_FP_CONTEXT(type)) {
- fprintf(_fp, "%s", str);
- }
-}
-
-/*
- * End json array or string array
- */
-void close_json_array(enum output_type type, const char *str)
-{
- if (_IS_JSON_CONTEXT(type)) {
- jsonw_pretty(_jw, false);
- jsonw_end_array(_jw);
- jsonw_pretty(_jw, true);
- } else if (_IS_FP_CONTEXT(type)) {
- fprintf(_fp, "%s", str);
- }
-}
-
-/*
- * pre-processor directive to generate similar
- * functions handling different types
- */
-#define _PRINT_FUNC(type_name, type) \
- void print_color_##type_name(enum output_type t, \
- enum color_attr color, \
- const char *key, \
- const char *fmt, \
- type value) \
- { \
- if (_IS_JSON_CONTEXT(t)) { \
- if (!key) \
- jsonw_##type_name(_jw, value); \
- else \
- jsonw_##type_name##_field(_jw, key, value); \
- } else if (_IS_FP_CONTEXT(t)) { \
- color_fprintf(_fp, color, fmt, value); \
- } \
- }
-_PRINT_FUNC(int, int);
-_PRINT_FUNC(hu, unsigned short);
-_PRINT_FUNC(uint, uint64_t);
-_PRINT_FUNC(lluint, unsigned long long int);
-#undef _PRINT_FUNC
-
-void print_color_string(enum output_type type,
- enum color_attr color,
- const char *key,
- const char *fmt,
- const char *value)
-{
- if (_IS_JSON_CONTEXT(type)) {
- if (key && !value)
- jsonw_name(_jw, key);
- else if (!key && value)
- jsonw_string(_jw, value);
- else
- jsonw_string_field(_jw, key, value);
- } else if (_IS_FP_CONTEXT(type)) {
- color_fprintf(_fp, color, fmt, value);
- }
-}
-
-/*
- * value's type is bool. When using this function in FP context you can't pass
- * a value to it, you will need to use "is_json_context()" to have different
- * branch for json and regular output. grep -r "print_bool" for example
- */
-void print_color_bool(enum output_type type,
- enum color_attr color,
- const char *key,
- const char *fmt,
- bool value)
-{
- if (_IS_JSON_CONTEXT(type)) {
- if (key)
- jsonw_bool_field(_jw, key, value);
- else
- jsonw_bool(_jw, value);
- } else if (_IS_FP_CONTEXT(type)) {
- color_fprintf(_fp, color, fmt, value ? "true" : "false");
- }
-}
-
-/*
- * In JSON context uses hardcode %#x format: 42 -> 0x2a
- */
-void print_color_0xhex(enum output_type type,
- enum color_attr color,
- const char *key,
- const char *fmt,
- unsigned int hex)
-{
- if (_IS_JSON_CONTEXT(type)) {
- SPRINT_BUF(b1);
-
- snprintf(b1, sizeof(b1), "%#x", hex);
- print_string(PRINT_JSON, key, NULL, b1);
- } else if (_IS_FP_CONTEXT(type)) {
- color_fprintf(_fp, color, fmt, hex);
- }
-}
-
-void print_color_hex(enum output_type type,
- enum color_attr color,
- const char *key,
- const char *fmt,
- unsigned int hex)
-{
- if (_IS_JSON_CONTEXT(type)) {
- SPRINT_BUF(b1);
-
- snprintf(b1, sizeof(b1), "%x", hex);
- if (key)
- jsonw_string_field(_jw, key, b1);
- else
- jsonw_string(_jw, b1);
- } else if (_IS_FP_CONTEXT(type)) {
- color_fprintf(_fp, color, fmt, hex);
- }
-}
-
-/*
- * In JSON context we don't use the argument "value" we simply call jsonw_null
- * whereas FP context can use "value" to output anything
- */
-void print_color_null(enum output_type type,
- enum color_attr color,
- const char *key,
- const char *fmt,
- const char *value)
-{
- if (_IS_JSON_CONTEXT(type)) {
- if (key)
- jsonw_null_field(_jw, key);
- else
- jsonw_null(_jw);
- } else if (_IS_FP_CONTEXT(type)) {
- color_fprintf(_fp, color, fmt, value);
- }
-}
diff --git a/lib/Makefile b/lib/Makefile
index 5e9f72f..0fbdf4c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -3,7 +3,7 @@ include ../config.mk
CFLAGS += -fPIC
UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \
- inet_proto.o namespace.o json_writer.o \
+ inet_proto.o namespace.o json_writer.o json_print.o \
names.o color.o bpf.o exec.o fs.o
NLOBJ=libgenl.o ll_map.o libnetlink.o
diff --git a/lib/json_print.c b/lib/json_print.c
new file mode 100644
index 0000000..93b4119
--- /dev/null
+++ b/lib/json_print.c
@@ -0,0 +1,231 @@
+/*
+ * json_print.c "print regular or json output, based on json_writer".
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors: Julien Fortin, <julien@cumulusnetworks.com>
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+
+#include "utils.h"
+#include "json_print.h"
+
+static json_writer_t *_jw;
+static FILE *_fp;
+
+#define _IS_JSON_CONTEXT(type) ((type & PRINT_JSON || type & PRINT_ANY) && _jw)
+#define _IS_FP_CONTEXT(type) (!_jw && (type & PRINT_FP || type & PRINT_ANY))
+
+void new_json_obj(int json, FILE *fp)
+{
+ if (json) {
+ _jw = jsonw_new(fp);
+ if (!_jw) {
+ perror("json object");
+ exit(1);
+ }
+ jsonw_pretty(_jw, true);
+ jsonw_start_array(_jw);
+ }
+ set_current_fp(fp);
+}
+
+void delete_json_obj(void)
+{
+ if (_jw) {
+ jsonw_end_array(_jw);
+ jsonw_destroy(&_jw);
+ }
+}
+
+bool is_json_context(void)
+{
+ return _jw != NULL;
+}
+
+void set_current_fp(FILE *fp)
+{
+ if (!fp) {
+ fprintf(stderr, "Error: invalid file pointer.\n");
+ exit(1);
+ }
+ _fp = fp;
+}
+
+json_writer_t *get_json_writer(void)
+{
+ return _jw;
+}
+
+void open_json_object(const char *str)
+{
+ if (_IS_JSON_CONTEXT(PRINT_JSON)) {
+ if (str)
+ jsonw_name(_jw, str);
+ jsonw_start_object(_jw);
+ }
+}
+
+void close_json_object(void)
+{
+ if (_IS_JSON_CONTEXT(PRINT_JSON))
+ jsonw_end_object(_jw);
+}
+
+/*
+ * Start json array or string array using
+ * the provided string as json key (if not null)
+ * or as array delimiter in non-json context.
+ */
+void open_json_array(enum output_type type, const char *str)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ if (str)
+ jsonw_name(_jw, str);
+ jsonw_start_array(_jw);
+ } else if (_IS_FP_CONTEXT(type)) {
+ fprintf(_fp, "%s", str);
+ }
+}
+
+/*
+ * End json array or string array
+ */
+void close_json_array(enum output_type type, const char *str)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ jsonw_pretty(_jw, false);
+ jsonw_end_array(_jw);
+ jsonw_pretty(_jw, true);
+ } else if (_IS_FP_CONTEXT(type)) {
+ fprintf(_fp, "%s", str);
+ }
+}
+
+/*
+ * pre-processor directive to generate similar
+ * functions handling different types
+ */
+#define _PRINT_FUNC(type_name, type) \
+ void print_color_##type_name(enum output_type t, \
+ enum color_attr color, \
+ const char *key, \
+ const char *fmt, \
+ type value) \
+ { \
+ if (_IS_JSON_CONTEXT(t)) { \
+ if (!key) \
+ jsonw_##type_name(_jw, value); \
+ else \
+ jsonw_##type_name##_field(_jw, key, value); \
+ } else if (_IS_FP_CONTEXT(t)) { \
+ color_fprintf(_fp, color, fmt, value); \
+ } \
+ }
+_PRINT_FUNC(int, int);
+_PRINT_FUNC(hu, unsigned short);
+_PRINT_FUNC(uint, uint64_t);
+_PRINT_FUNC(lluint, unsigned long long int);
+#undef _PRINT_FUNC
+
+void print_color_string(enum output_type type,
+ enum color_attr color,
+ const char *key,
+ const char *fmt,
+ const char *value)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ if (key && !value)
+ jsonw_name(_jw, key);
+ else if (!key && value)
+ jsonw_string(_jw, value);
+ else
+ jsonw_string_field(_jw, key, value);
+ } else if (_IS_FP_CONTEXT(type)) {
+ color_fprintf(_fp, color, fmt, value);
+ }
+}
+
+/*
+ * value's type is bool. When using this function in FP context you can't pass
+ * a value to it, you will need to use "is_json_context()" to have different
+ * branch for json and regular output. grep -r "print_bool" for example
+ */
+void print_color_bool(enum output_type type,
+ enum color_attr color,
+ const char *key,
+ const char *fmt,
+ bool value)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ if (key)
+ jsonw_bool_field(_jw, key, value);
+ else
+ jsonw_bool(_jw, value);
+ } else if (_IS_FP_CONTEXT(type)) {
+ color_fprintf(_fp, color, fmt, value ? "true" : "false");
+ }
+}
+
+/*
+ * In JSON context uses hardcode %#x format: 42 -> 0x2a
+ */
+void print_color_0xhex(enum output_type type,
+ enum color_attr color,
+ const char *key,
+ const char *fmt,
+ unsigned int hex)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ SPRINT_BUF(b1);
+
+ snprintf(b1, sizeof(b1), "%#x", hex);
+ print_string(PRINT_JSON, key, NULL, b1);
+ } else if (_IS_FP_CONTEXT(type)) {
+ color_fprintf(_fp, color, fmt, hex);
+ }
+}
+
+void print_color_hex(enum output_type type,
+ enum color_attr color,
+ const char *key,
+ const char *fmt,
+ unsigned int hex)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ SPRINT_BUF(b1);
+
+ snprintf(b1, sizeof(b1), "%x", hex);
+ if (key)
+ jsonw_string_field(_jw, key, b1);
+ else
+ jsonw_string(_jw, b1);
+ } else if (_IS_FP_CONTEXT(type)) {
+ color_fprintf(_fp, color, fmt, hex);
+ }
+}
+
+/*
+ * In JSON context we don't use the argument "value" we simply call jsonw_null
+ * whereas FP context can use "value" to output anything
+ */
+void print_color_null(enum output_type type,
+ enum color_attr color,
+ const char *key,
+ const char *fmt,
+ const char *value)
+{
+ if (_IS_JSON_CONTEXT(type)) {
+ if (key)
+ jsonw_null_field(_jw, key);
+ else
+ jsonw_null(_jw);
+ } else if (_IS_FP_CONTEXT(type)) {
+ color_fprintf(_fp, color, fmt, value);
+ }
+}
--
1.9.3
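For readers skimming the patch: the two context macros in json_print.c decide which output path a print_*() call takes. Below is a minimal userspace sketch of that selection logic, not the iproute2 code itself. The enum values are assumed (json_print.h is not shown here), json_enabled stands in for the static _jw pointer, and route_output() is a hypothetical helper added only for illustration.

```c
#include <stdbool.h>

/* Assumed flag values; json_print.h is not shown in this patch. */
enum output_type {
	PRINT_FP = 1,
	PRINT_JSON = 2,
	PRINT_ANY = 4,
};

/* Stand-in for "_jw != NULL": set once a JSON writer has been created. */
bool json_enabled;

/* Mirrors _IS_JSON_CONTEXT(): a JSON or ANY call, and a writer exists. */
bool is_json_ctx(enum output_type t)
{
	return (t & (PRINT_JSON | PRINT_ANY)) && json_enabled;
}

/* Mirrors _IS_FP_CONTEXT(): an FP or ANY call, and no writer exists. */
bool is_fp_ctx(enum output_type t)
{
	return !json_enabled && (t & (PRINT_FP | PRINT_ANY));
}

/* Hypothetical helper: 'J' = JSON writer, 'F' = FILE stream, '-' = no output. */
char route_output(enum output_type t)
{
	if (is_json_ctx(t))
		return 'J';
	if (is_fp_ctx(t))
		return 'F';
	return '-';
}
```

Under these assumptions a PRINT_JSON-only call produces nothing in plain-text mode and a PRINT_FP-only call produces nothing in JSON mode, while PRINT_ANY always goes to whichever output is active.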
* [PATCH iproute2 master 2/2] bpf: properly output json for xdp
From: Daniel Borkmann @ 2017-09-21 8:42 UTC (permalink / raw)
To: stephen; +Cc: ast, netdev, Daniel Borkmann
In-Reply-To: <cover.1505956723.git.daniel@iogearbox.net>
After merging the net-next branch into master, Stephen asked
to fix up the json dump for XDP. Thus, rework the json dump a
bit, such that 'ip -json l' looks as below:
[{
        "ifindex": 1,
        "ifname": "lo",
        "flags": ["LOOPBACK","UP","LOWER_UP"],
        "mtu": 65536,
        "xdp": {
            "mode": 2,
            "prog": {
                "id": 5,
                "tag": "e1e9d0ec0f55d638",
                "jited": 1
            }
        },
        "qdisc": "noqueue",
        "operstate": "UNKNOWN",
        "linkmode": "DEFAULT",
        "group": "default",
        "txqlen": 1000,
        "link_type": "loopback",
        "address": "00:00:00:00:00:00",
        "broadcast": "00:00:00:00:00:00"
    },[...]
]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
ip/iplink_xdp.c | 74 ++++++++++++++++++++++++++++++++++-----------------------
lib/bpf.c | 19 ++++++++++-----
2 files changed, 57 insertions(+), 36 deletions(-)
diff --git a/ip/iplink_xdp.c b/ip/iplink_xdp.c
index 71f7798..2d2953a 100644
--- a/ip/iplink_xdp.c
+++ b/ip/iplink_xdp.c
@@ -14,9 +14,9 @@
#include <linux/bpf.h>
+#include "json_print.h"
#include "xdp.h"
#include "bpf_util.h"
-#include "ip_common.h"
extern int force;
@@ -82,6 +82,22 @@ int xdp_parse(int *argc, char ***argv, struct iplink_req *req, bool generic,
return 0;
}
+static void xdp_dump_json(struct rtattr *tb[IFLA_XDP_MAX + 1])
+{
+ __u32 prog_id = 0;
+ __u8 mode;
+
+ mode = rta_getattr_u8(tb[IFLA_XDP_ATTACHED]);
+ if (tb[IFLA_XDP_PROG_ID])
+ prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
+
+ open_json_object("xdp");
+ print_uint(PRINT_JSON, "mode", NULL, mode);
+ if (prog_id)
+ bpf_dump_prog_info(NULL, prog_id);
+ close_json_object();
+}
+
void xdp_dump(FILE *fp, struct rtattr *xdp, bool link, bool details)
{
struct rtattr *tb[IFLA_XDP_MAX + 1];
@@ -94,34 +110,32 @@ void xdp_dump(FILE *fp, struct rtattr *xdp, bool link, bool details)
return;
mode = rta_getattr_u8(tb[IFLA_XDP_ATTACHED]);
- if (is_json_context()) {
- print_uint(PRINT_JSON, "attached", NULL, mode);
- } else {
- if (mode == XDP_ATTACHED_NONE)
- return;
- else if (details && link)
- fprintf(fp, "%s prog/xdp", _SL_);
- else if (mode == XDP_ATTACHED_DRV)
- fprintf(fp, "xdp");
- else if (mode == XDP_ATTACHED_SKB)
- fprintf(fp, "xdpgeneric");
- else if (mode == XDP_ATTACHED_HW)
- fprintf(fp, "xdpoffload");
- else
- fprintf(fp, "xdp[%u]", mode);
-
- if (tb[IFLA_XDP_PROG_ID])
- prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
- if (!details) {
- if (prog_id && !link)
- fprintf(fp, "/id:%u", prog_id);
- fprintf(fp, " ");
- return;
- }
-
- if (prog_id) {
- fprintf(fp, " ");
- bpf_dump_prog_info(fp, prog_id);
- }
+ if (mode == XDP_ATTACHED_NONE)
+ return;
+ else if (is_json_context())
+ return details ? (void)0 : xdp_dump_json(tb);
+ else if (details && link)
+ fprintf(fp, "%s prog/xdp", _SL_);
+ else if (mode == XDP_ATTACHED_DRV)
+ fprintf(fp, "xdp");
+ else if (mode == XDP_ATTACHED_SKB)
+ fprintf(fp, "xdpgeneric");
+ else if (mode == XDP_ATTACHED_HW)
+ fprintf(fp, "xdpoffload");
+ else
+ fprintf(fp, "xdp[%u]", mode);
+
+ if (tb[IFLA_XDP_PROG_ID])
+ prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
+ if (!details) {
+ if (prog_id && !link)
+ fprintf(fp, "/id:%u", prog_id);
+ fprintf(fp, " ");
+ return;
+ }
+
+ if (prog_id) {
+ fprintf(fp, " ");
+ bpf_dump_prog_info(fp, prog_id);
}
}
diff --git a/lib/bpf.c b/lib/bpf.c
index cfa1f79..10ea23a 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -40,6 +40,7 @@
#include <arpa/inet.h>
#include "utils.h"
+#include "json_print.h"
#include "bpf_util.h"
#include "bpf_elf.h"
@@ -186,23 +187,29 @@ int bpf_dump_prog_info(FILE *f, uint32_t id)
int fd, ret, dump_ok = 0;
SPRINT_BUF(tmp);
- fprintf(f, "id %u ", id);
+ open_json_object("prog");
+ print_uint(PRINT_ANY, "id", "id %u ", id);
fd = bpf_prog_fd_by_id(id);
if (fd < 0)
- return dump_ok;
+ goto out;
ret = bpf_prog_info_by_fd(fd, &info, &len);
if (!ret && len) {
- fprintf(f, "tag %s ",
- hexstring_n2a(info.tag, sizeof(info.tag),
- tmp, sizeof(tmp)));
- if (info.jited_prog_len)
+ int jited = !!info.jited_prog_len;
+
+ print_string(PRINT_ANY, "tag", "tag %s ",
+ hexstring_n2a(info.tag, sizeof(info.tag),
+ tmp, sizeof(tmp)));
+ print_uint(PRINT_JSON, "jited", NULL, jited);
+ if (jited && !is_json_context())
fprintf(f, "jited ");
dump_ok = 1;
}
close(fd);
+out:
+ close_json_object();
return dump_ok;
}
--
1.9.3
* Re: [PATCH net-next 3/4] cxgb4: add support to offload action vlan
From: Jiri Pirko @ 2017-09-21 8:55 UTC (permalink / raw)
To: Rahul Lakkireddy; +Cc: netdev, davem, kumaras, ganeshgr, nirranjan, indranil
In-Reply-To: <016c3bf21a7bfe45e73275d3191cf61cceffd362.1505977744.git.rahul.lakkireddy@chelsio.com>
Thu, Sep 21, 2017 at 09:33:36AM CEST, rahul.lakkireddy@chelsio.com wrote:
>From: Kumar Sanghvi <kumaras@chelsio.com>
>
>Add support for offloading tc-flower flows having
>vlan actions: pop, push and modify.
>
>Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
>Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
>Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
>---
> .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 43 ++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
>diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
[...]
>+ switch (vlan_action) {
>+ case TCA_VLAN_ACT_POP:
>+ break;
>+ case TCA_VLAN_ACT_PUSH:
>+ case TCA_VLAN_ACT_MODIFY:
>+ if (proto != ETH_P_8021Q) {
>+ netdev_err(dev,
>+ "%s: Unsupp. vlan proto\n",
Don't wrap this. Also, "Unsupp." vs. "Unsupported": please be consistent.
>+ __func__);
>+ return -EOPNOTSUPP;
>+ }
>+ break;
>+ default:
>+ netdev_err(dev, "%s: Unsupported vlan action\n",
>+ __func__);
>+ return -EOPNOTSUPP;
>+ }
> } else {
> netdev_err(dev, "%s: Unsupported action\n", __func__);
> return -EOPNOTSUPP;
>--
>2.14.1
>
* Re: [PATCH net-next 0/4] cxgb4: add support to offload tc flower
From: Jiri Pirko @ 2017-09-21 8:56 UTC (permalink / raw)
To: Rahul Lakkireddy; +Cc: netdev, davem, kumaras, ganeshgr, nirranjan, indranil
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>
Thu, Sep 21, 2017 at 09:33:33AM CEST, rahul.lakkireddy@chelsio.com wrote:
>This series of patches add support to offload tc flower onto Chelsio
>NICs.
>
>Patch 1 adds basic skeleton to prepare for offloading tc flower flows.
>
>Patch 2 adds support to add/remove flows for offload. Flows can have
>accompanying masks. Following match and action are currently supported
>for offload:
>Match: ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
>Action: drop, redirect to another port on the device.
>
>Patch 3 adds support to offload tc-flower flows having
>vlan actions: pop, push, and modify.
>
>Patch 4 adds support to fetch stats for the offloaded tc flower flows
>from hardware.
>
>Support for offloading more match and action types are to be followed
>in subsequent series.
Looks good to me. Thanks!
* Re: Latest net-next from GIT panic
From: Paweł Staszewski @ 2017-09-21 9:06 UTC (permalink / raw)
To: Eric Dumazet, Wei Wang
Cc: Cong Wang, Linux Kernel Network Developers, Eric Dumazet
In-Reply-To: <1505956639.29839.108.camel@edumazet-glaptop3.roam.corp.google.com>
On 2017-09-21 at 03:17, Eric Dumazet wrote:
> On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote:
>>> Thanks very much Pawel for the feedback.
>>>
>>> I was looking into the code (specifically IPv4 part) and found that in
>>> free_fib_info_rcu(), we call free_nh_exceptions() without holding the
>>> fnhe_lock. I am wondering if that could cause some race condition on
>>> fnhe->fnhe_rth_input/output so a double call on dst_dev_put() on the
>>> same dst could be happening.
>>>
>>> But as we call free_fib_info_rcu() only after the grace period, and
>>> the lookup code which could potentially modify
>>> fnhe->fnhe_rth_input/output all holds rcu_read_lock(), it seems
>>> fine...
>>>
>> Hi Pawel,
>>
>> Could you try the following debug patch on top of net-next branch and
>> reproduce the issue check if there are warning msg showing?
>>
>> diff --git a/include/net/dst.h b/include/net/dst.h
>> index 93568bd0a352..82aff41c6f63 100644
>> --- a/include/net/dst.h
>> +++ b/include/net/dst.h
>> @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry
>> *dst, unsigned long time)
>> static inline struct dst_entry *dst_clone(struct dst_entry *dst)
>> {
>> if (dst)
>> - atomic_inc(&dst->__refcnt);
>> + dst_hold(dst);
>> return dst;
>> }
>>
>> Thanks.
>> Wei
>>
>
> Yes, we believe skb_dst_force() and skb_dst_force_safe() should be
> unified (to the 'safe' version)
>
> We no longer have gc to protect from 0 -> 1 transition of dst refcount.
>
>
>
>
After adding the patch from Wei:
https://bugzilla.kernel.org/show_bug.cgi?id=197005#c14
* Re: [PATCH net-next 2/5] net: allow early demux to fetch noref socket
From: Paolo Abeni @ 2017-09-21 9:13 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Pablo Neira Ayuso, Florian Westphal,
Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com>
On Wed, 2017-09-20 at 18:54 +0200, Paolo Abeni wrote:
> We must be careful to avoid leaking such sockets outside
> the RCU section containing the early demux call; we clear
> them on nonlocal delivery.
>
> For ipv4 we must take care of local mcast delivery, too,
> since udp early demux works also for mcast addresses.
>
> Also update all iptables/nftables extension that can
> happen in the input chain and can transmit the skb outside
> such patch, namely TEE, nft_dup and nfqueue.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/ipv4/ip_input.c | 12 ++++++++++++
> net/ipv4/ipmr.c | 18 ++++++++++++++----
> net/ipv4/netfilter/nf_dup_ipv4.c | 3 +++
> net/ipv6/ip6_input.c | 7 ++++++-
> net/ipv6/netfilter/nf_dup_ipv6.c | 3 +++
> net/netfilter/nf_queue.c | 3 +++
> 6 files changed, 41 insertions(+), 5 deletions(-)
>
> diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
> index fa2dc8f692c6..e71abc8b698c 100644
> --- a/net/ipv4/ip_input.c
> +++ b/net/ipv4/ip_input.c
> @@ -349,6 +349,18 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
> __NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);
> goto drop;
> }
> +
> + /* Since the sk has no reference to the socket, we must
> + * clear it before escaping this RCU section.
> + * The sk is just an hint and we know we are not going to use
> + * it outside the input path.
> + */
> + if (skb_dst(skb)->input != ip_local_deliver
> +#ifdef CONFIG_IP_MROUTE
> + && skb_dst(skb)->input != ip_mr_input
> +#endif
> + )
> + skb_clear_noref_sk(skb);
> }
The above is to allow early demux for multicast sockets even on hosts
acting as multicast routers. This is probably overkill: a host will
probably either act as a multicast router or receive a large amount of
locally terminated mcast traffic.
We can instead preserve the sknoref only for ip_local_deliver(),
dropping the early demux optimization in the above scenario, which
should not be very relevant. That will simplify the above chunk and drop
the need for the ipmr.c changes below; overall this patch will become much
simpler.
Paolo
* Re: [PATCH net-next 1/5] net: add support for noref skb->sk
From: Paolo Abeni @ 2017-09-21 9:14 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, David S. Miller, Pablo Neira Ayuso, Florian Westphal,
Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <1505929295.29839.103.camel@edumazet-glaptop3.roam.corp.google.com>
Hi,
Thank you for looking at it!
On Wed, 2017-09-20 at 10:41 -0700, Eric Dumazet wrote:
> On Wed, 2017-09-20 at 18:54 +0200, Paolo Abeni wrote:
> > Noref sk do not carry a socket refcount, are valid
> > only inside the current RCU section and must be
> > explicitly cleared before exiting such section.
> >
> > They will be used in a later patch to allow early demux
> > without sock refcounting.
>
>
>
>
> > +/* dummy destructor used by noref sockets */
> > +void sock_dummyfree(struct sk_buff *skb)
> > +{
>
> BUG();
>
> > +}
> > +EXPORT_SYMBOL(sock_dummyfree);
> > +
We can call sock_dummyfree() in legitimate paths, see below, but we can
add a:
WARN_ON_ONCE(!rcu_read_lock_held());
here and in skb_clear_noref_sk(). That should help a lot in catching
possible bugs.
> I do not see how you ensure we do not leave RCU section with an skb
> destructor pointing to this sock_dummyfree()
>
> This patch series looks quite dangerous to me.
The idea is to explicitly clear the sknoref references before leaving
the RCU section. This is much like what we currently do for dst noref, but
here the only place where we get a noref socket is the socket early
demux, thus the scope of this change is more limited than what we have
with noref dst_entries.
The relevant code is in the next 2 patches; after the demux we preserve
the sknoref only if the skb has a local destination. The UDP socket
will then set the noref on early demux lookup, and the skb will either:
* land on the corresponding UDP socket; the receive function will steal
the sknoref
* be dropped by some nft/iptables target; the dummy destructor is
called
* be forwarded by some nft/iptables target outside the input path; we
clear the sknoref explicitly in such targets.
Currently there are a handful of places affected, and we can simplify
the code by dropping the early demux result for locally terminated
multicast sockets on a host acting as a multicast router; please see
the comment on the next patch.
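To make the three outcomes above concrete, here is a toy userspace model. It is only a sketch under stated assumptions: the toy_sock and toy_skb types, the sk_noref flag and all toy_* function names are hypothetical stand-ins, not the kernel structures or the helpers from this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct sock and struct sk_buff; not the kernel types. */
struct toy_sock {
	int refcnt;
};

struct toy_skb {
	struct toy_sock *sk;
	bool sk_noref; /* true while sk is borrowed without a refcount */
};

/* Early demux: attach the socket as a hint, taking no reference. */
void toy_set_noref_sk(struct toy_skb *skb, struct toy_sock *sk)
{
	skb->sk = sk;
	skb->sk_noref = true;
}

/* Must run before the skb leaves the RCU section (e.g. when forwarded). */
void toy_clear_noref_sk(struct toy_skb *skb)
{
	if (skb->sk_noref) {
		skb->sk = NULL;
		skb->sk_noref = false;
	}
}

/* Local delivery: the receive path takes the hint and detaches it. */
struct toy_sock *toy_steal_sk(struct toy_skb *skb)
{
	struct toy_sock *sk = skb->sk;

	skb->sk = NULL;
	skb->sk_noref = false;
	return sk;
}
```

In this model the socket refcount never changes; the only invariant is that no skb with sk_noref set survives past the point where the RCU section would end.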
> Do we really have real applications using connected UDP sockets and
> wanting very high pps throughput ?
The ultimate goal is to improve the unconnected UDP sockets scenario;
we do actually have use cases for that: DNS servers and VoIP SBCs.
Thanks,
Paolo
* [RFC PATCH 0/0] Introduce MPLS over GRE
From: Amine Kherbouche @ 2017-09-21 9:25 UTC (permalink / raw)
To: netdev, xeb, roopa; +Cc: amine.kherbouche, equinox
This series introduces the MPLS over GRE encapsulation (RFC 4023).
Various applications of MPLS make use of label stacks with multiple
entries. In some cases, it is possible to replace the top label of
the stack with an IP-based encapsulation, so that two LSRs that are
adjacent on an LSP can be separated by an IP network,
even if that IP network does not provide MPLS.
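As a rough illustration of the encapsulation: with no optional GRE fields, the MPLS label stack directly follows a 4-byte GRE header whose protocol type is the MPLS unicast Ethertype (0x8847). The sketch below writes such a header plus one label stack entry. build_gre_mpls() is a hypothetical helper, and the outer IP header and the optional GRE checksum/key/sequence fields are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define GRE_PROTO_MPLS_UC 0x8847u /* Ethertype for MPLS unicast */

/*
 * Hypothetical helper: write a base GRE header (no checksum, key or
 * sequence number) followed by one MPLS label stack entry into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t build_gre_mpls(uint8_t *buf, size_t len, uint32_t label,
		      uint8_t tc, int bos, uint8_t ttl)
{
	uint32_t lse;

	if (len < 8)
		return 0;

	/* GRE: 16 bits of flags/version (all zero), 16 bits of protocol type */
	buf[0] = 0;
	buf[1] = 0;
	buf[2] = GRE_PROTO_MPLS_UC >> 8;
	buf[3] = GRE_PROTO_MPLS_UC & 0xff;

	/* MPLS: label (20 bits) | TC (3 bits) | S (1 bit) | TTL (8 bits) */
	lse = ((label & 0xfffffu) << 12) | ((uint32_t)(tc & 0x7) << 9) |
	      ((bos ? 1u : 0u) << 8) | ttl;
	buf[4] = lse >> 24;
	buf[5] = (lse >> 16) & 0xff;
	buf[6] = (lse >> 8) & 0xff;
	buf[7] = lse & 0xff;

	return 8;
}
```

For label 222 with the bottom-of-stack bit set and TTL 64, the eight bytes come out as 00 00 88 47 00 0d e1 40.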
An example configuration:
 node1                LER1                             LER2                node2
+-----+             +------+                   +------+             +-----+
|     |             |      |                   |      |             |     |
|     |             |      |p3  GRE tunnel   p4|      |             |     |
|     |p1         p2|      +-------------------+      |p5         p6|     |
|     +-------------+      +-------------------+      +-------------+     |
|     |10.100.0.0/24|      |                   |      |10.200.0.0/24|     |
|     |fd00:100::/64|      |   10.125.0.0/24   |      |fd00:200::/64|     |
|     |             |      |   fd00:125::/64   |      |             |     |
|     |             |      |                   |      |             |     |
|     |             |      |                   |      |             |     |
|     |             |      |                   |      |             |     |
|     |             |      |                   |      |             |     |
+-----+             +------+                   +------+             +-----+
### node1 ###
ip link set p1 up
ip addr add 10.100.0.1/24 dev p1
### LER1 ###
ip link set p2 up
ip addr add 10.100.0.2/24 dev p2
ip link set p3 up
ip addr add 10.125.0.1/24 dev p3
modprobe mpls_router
sysctl -w net.mpls.conf.p2.input=1
sysctl -w net.mpls.conf.p3.input=1
sysctl -w net.mpls.platform_labels=1000
ip link add gre1 type gre ttl 64 local 10.125.0.1 remote 10.125.0.2 dev p3
ip link set dev gre1 up
ip -M route add 111 as 222 dev gre1
ip -M route add 555 as 666 via inet 10.100.0.1 dev p2
### LER2 ###
ip link set p5 up
ip addr add 10.200.0.2/24 dev p5
ip link set p4 up
ip addr add 10.125.0.2/24 dev p4
modprobe mpls_router
sysctl -w net.mpls.conf.p4.input=1
sysctl -w net.mpls.conf.p5.input=1
sysctl -w net.mpls.platform_labels=1000
ip link add gre1 type gre ttl 64 local 10.125.0.2 remote 10.125.0.1 dev p4
ip link set dev gre1 up
ip -M route add 444 as 555 dev gre1
ip -M route add 222 as 333 via inet 10.200.0.1 dev p5
### node2 ###
ip link set p6 up
ip addr add 10.200.0.1/24 dev p6
Now use the following scapy script to forge and send packets from port p1 of node1:
p = Ether(src='de:ed:01:0c:41:09', dst='de:ed:01:2f:3b:ba')
p /= MPLS(s=1, ttl=64, label=111)/Raw(load='\xde')
sendp(p, iface="p1", count=20, inter=0.1)
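For reference, the label stack entry forged above (label=111, s=1, ttl=64) is the 4 bytes 00 06 f1 40 on the wire. The sketch below decodes such an entry and looks up the outgoing label in a table that mirrors the "ip -M route add X as Y" lines configured on LER1. The struct layout, the ler1_routes table and both function names are hypothetical and only meant to illustrate the swap; this is not how af_mpls.c stores its routes.

```c
#include <stddef.h>
#include <stdint.h>

struct mpls_lse {
	uint32_t label; /* 20 bits */
	uint8_t tc;     /* 3 bits */
	uint8_t bos;    /* bottom-of-stack flag */
	uint8_t ttl;
};

/* Decode one label stack entry from 4 bytes in network byte order. */
struct mpls_lse mpls_lse_decode(const uint8_t b[4])
{
	uint32_t v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
		     ((uint32_t)b[2] << 8) | b[3];
	struct mpls_lse e = {
		.label = v >> 12,
		.tc = (v >> 9) & 0x7,
		.bos = (v >> 8) & 0x1,
		.ttl = v & 0xff,
	};

	return e;
}

/* Hypothetical swap table mirroring the LER1 routes in the example. */
struct swap_route {
	uint32_t in_label;
	uint32_t out_label;
};

static const struct swap_route ler1_routes[] = {
	{ 111, 222 }, /* out via gre1 */
	{ 555, 666 }, /* out via p2 */
};

/* Returns the outgoing label, or -1 if no route matches. */
long ler1_swap(uint32_t in_label)
{
	size_t i;

	for (i = 0; i < sizeof(ler1_routes) / sizeof(ler1_routes[0]); i++)
		if (ler1_routes[i].in_label == in_label)
			return (long)ler1_routes[i].out_label;
	return -1;
}
```

With the configuration above, a packet arriving on LER1 with label 111 would leave over gre1 with label 222.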
* [PATCH 1/2] mpls: expose stack entry function
From: Amine Kherbouche @ 2017-09-21 9:25 UTC (permalink / raw)
To: netdev, xeb, roopa; +Cc: amine.kherbouche, equinox
In-Reply-To: <1505985924-12479-1-git-send-email-amine.kherbouche@6wind.com>
Expose the mpls_forward() function so that it can be called from elsewhere,
such as the MPLS over GRE code added in the next commit.
Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
---
include/linux/mpls.h | 3 +++
net/mpls/af_mpls.c | 5 +++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/linux/mpls.h b/include/linux/mpls.h
index 384fb22..d5c7599 100644
--- a/include/linux/mpls.h
+++ b/include/linux/mpls.h
@@ -2,10 +2,13 @@
#define _LINUX_MPLS_H
#include <uapi/linux/mpls.h>
+#include <linux/netdevice.h>
#define MPLS_TTL_MASK (MPLS_LS_TTL_MASK >> MPLS_LS_TTL_SHIFT)
#define MPLS_BOS_MASK (MPLS_LS_S_MASK >> MPLS_LS_S_SHIFT)
#define MPLS_TC_MASK (MPLS_LS_TC_MASK >> MPLS_LS_TC_SHIFT)
#define MPLS_LABEL_MASK (MPLS_LS_LABEL_MASK >> MPLS_LS_LABEL_SHIFT)
+int mpls_forward(struct sk_buff *skb, struct net_device *dev,
+ struct packet_type *pt, struct net_device *orig_dev);
#endif /* _LINUX_MPLS_H */
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index c5b9ce4..36ea2ad 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -307,8 +307,8 @@ static bool mpls_egress(struct net *net, struct mpls_route *rt,
return success;
}
-static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
- struct packet_type *pt, struct net_device *orig_dev)
+int mpls_forward(struct sk_buff *skb, struct net_device *dev,
+ struct packet_type *pt, struct net_device *orig_dev)
{
struct net *net = dev_net(dev);
struct mpls_shim_hdr *hdr;
@@ -442,6 +442,7 @@ static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
kfree_skb(skb);
return NET_RX_DROP;
}
+EXPORT_SYMBOL(mpls_forward);
static struct packet_type mpls_packet_type __read_mostly = {
.type = cpu_to_be16(ETH_P_MPLS_UC),
--
2.1.4