* Re: Configuring ethernet link fails with No such device
From: David Miller @ 2016-04-12 15:44 UTC (permalink / raw)
To: bob.ham
Cc: fabio.estevam, systemd-devel, netdev, bryan.wu, u.kleine-koenig,
l.stach
In-Reply-To: <1460451492.6333.6.camel@collabora.com>
From: Bob Ham <bob.ham@collabora.com>
Date: Tue, 12 Apr 2016 09:58:12 +0100
> On Mon, 2016-04-11 at 15:46 -0700, Stefan Agner wrote:
>
>> Or in other words: Is this a Kernel or systemd issue?
>
> From what I recall, both; an issue with the FEC driver, and issues in
> systemd/udevd's handling of link-level settings.
This is my impression of the situation as well.
* Re: [PATCH net-next 00/11] FUJITSU Extended Socket driver version 1.1
From: David Miller @ 2016-04-12 15:43 UTC (permalink / raw)
To: izumi.taku; +Cc: netdev
In-Reply-To: <E86EADE93E2D054CBCD4E708C38D364A734EF16E@G01JPEXMBYT01>
From: "Izumi, Taku" <izumi.taku@jp.fujitsu.com>
Date: Tue, 12 Apr 2016 08:35:09 +0000
> But I'd like to keep some debugfs facility for status information
> and some specific stats other than net_stats.
We have a facility for arbitrary driver stats, remove this debugfs crap
please.
I'm not going to say this again.
* Re: [Lsf] [Lsf-pc] [LSF/MM TOPIC] Generic page-pool recycle facility?
From: Alexander Duyck @ 2016-04-12 15:37 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: lsf@lists.linux-foundation.org, James Bottomley, Sagi Grimberg,
Tom Herbert, Brenden Blanco, Christoph Hellwig, linux-mm,
netdev@vger.kernel.org, Bart Van Assche,
lsf-pc@lists.linux-foundation.org, Alexei Starovoitov
In-Reply-To: <20160412082838.4ce17c1a@redhat.com>
On Mon, Apr 11, 2016 at 11:28 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 11 Apr 2016 15:02:51 -0700 Alexander Duyck <alexander.duyck@gmail.com> wrote:
>
>> Have you taken a look at possibly trying to optimize the DMA pool API
>> to work with pages? It sounds like it is supposed to do something
>> similar to what you are wanting to do.
>
> Yes, I have looked at the mm/dmapool.c API. AFAIK this is for DMA
> coherent memory (see use of dma_alloc_coherent/dma_free_coherent).
>
> What we are doing is "streaming" DMA memory, when processing the RX
> ring.
>
> (NIC are only using DMA coherent memory for the descriptors, which are
> allocated on driver init)
Yes, I know that but it shouldn't take much to extend the API to
provide the option for a streaming DMA mapping. That was why I
thought you might want to look in this direction.
- Alex
* [PATCH 3/3] rxrpc: Use the listen() system call to move to listening state
From: David Howells @ 2016-04-12 15:05 UTC (permalink / raw)
To: linux-afs; +Cc: dhowells, netdev, linux-kernel
In-Reply-To: <20160412150533.20637.23952.stgit@warthog.procyon.org.uk>
Use the listen() system call to move to listening state and to set the
socket backlog queue size. A limit is placed on the maximum queue size by
way of:
/proc/sys/net/rxrpc/max_backlog
Signed-off-by: David Howells <dhowells@redhat.com>
---
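For readers who want the backlog rule in isolation, here is a minimal
userspace sketch of the check that rxrpc_listen() performs in this patch.
It is not kernel code: model_max_backlog and model_listen_backlog() are
hypothetical names standing in for the rxrpc_max_backlog sysctl and the
RXRPC_SERVER_BOUND branch, and the default of 10 is taken from misc.c below.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the rxrpc_max_backlog sysctl (default 10). */
static unsigned int model_max_backlog = 10;

/*
 * Model of the backlog check: INT_MAX means "use the configured maximum",
 * anything above the maximum is rejected.  Returns the backlog that would
 * be stored on the socket, or -EINVAL.  A negative request converts to a
 * large unsigned value and is therefore rejected as well.
 */
static int model_listen_backlog(int backlog)
{
	if (backlog == INT_MAX)
		backlog = (int)model_max_backlog;
	if ((unsigned int)backlog > model_max_backlog)
		return -EINVAL;
	return backlog;
}

/* Usage example for the model above. */
void model_listen_demo(void)
{
	assert(model_listen_backlog(4) == 4);
	assert(model_listen_backlog(INT_MAX) == 10);	/* what kafs asks for */
	assert(model_listen_backlog(11) == -EINVAL);
}
```

The INT_MAX convention is what lets fs/afs/rxrpc.c call kernel_listen()
without knowing the current sysctl value.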
fs/afs/rxrpc.c | 34 +++++++++++++++++++---------------
net/rxrpc/af_rxrpc.c | 26 ++++++++++++++------------
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/misc.c | 6 ++++++
net/rxrpc/sysctl.c | 10 ++++++++++
5 files changed, 50 insertions(+), 27 deletions(-)
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 63cd9f939f19..4832de84d52c 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -85,18 +85,14 @@ int afs_open_socket(void)
skb_queue_head_init(&afs_incoming_calls);
+ ret = -ENOMEM;
afs_async_calls = create_singlethread_workqueue("kafsd");
- if (!afs_async_calls) {
- _leave(" = -ENOMEM [wq]");
- return -ENOMEM;
- }
+ if (!afs_async_calls)
+ goto error_0;
ret = sock_create_kern(&init_net, AF_RXRPC, SOCK_DGRAM, PF_INET, &socket);
- if (ret < 0) {
- destroy_workqueue(afs_async_calls);
- _leave(" = %d [socket]", ret);
- return ret;
- }
+ if (ret < 0)
+ goto error_1;
socket->sk->sk_allocation = GFP_NOFS;
@@ -111,18 +107,26 @@ int afs_open_socket(void)
sizeof(srx.transport.sin.sin_addr));
ret = kernel_bind(socket, (struct sockaddr *) &srx, sizeof(srx));
- if (ret < 0) {
- sock_release(socket);
- destroy_workqueue(afs_async_calls);
- _leave(" = %d [bind]", ret);
- return ret;
- }
+ if (ret < 0)
+ goto error_2;
+
+ ret = kernel_listen(socket, INT_MAX);
+ if (ret < 0)
+ goto error_2;
rxrpc_kernel_intercept_rx_messages(socket, afs_rx_interceptor);
afs_socket = socket;
_leave(" = 0");
return 0;
+
+error_2:
+ sock_release(socket);
+error_1:
+ destroy_workqueue(afs_async_calls);
+error_0:
+ _leave(" = %d", ret);
+ return ret;
}
/*
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index dd462352a79c..7b1aedd79b7c 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -31,8 +31,6 @@ unsigned int rxrpc_debug; // = RXRPC_DEBUG_KPROTO;
module_param_named(debug, rxrpc_debug, uint, S_IWUSR | S_IRUGO);
MODULE_PARM_DESC(debug, "RxRPC debugging mask");
-static int sysctl_rxrpc_max_qlen __read_mostly = 10;
-
static struct proto rxrpc_proto;
static const struct proto_ops rxrpc_rpc_ops;
@@ -191,7 +189,7 @@ static int rxrpc_listen(struct socket *sock, int backlog)
struct rxrpc_sock *rx = rxrpc_sk(sk);
int ret;
- _enter("%p,%d", rx, backlog);
+ _enter("%p{%d},%d", rx, rx->sk.sk_state, backlog);
lock_sock(&rx->sk);
@@ -199,16 +197,20 @@ static int rxrpc_listen(struct socket *sock, int backlog)
case RXRPC_UNBOUND:
ret = -EADDRNOTAVAIL;
break;
- case RXRPC_CLIENT_UNBOUND:
- case RXRPC_CLIENT_BOUND:
- default:
- ret = -EBUSY;
- break;
case RXRPC_SERVER_BOUND:
ASSERT(rx->local != NULL);
- sk->sk_max_ack_backlog = backlog;
- rx->sk.sk_state = RXRPC_SERVER_LISTENING;
- ret = 0;
+ if (backlog == INT_MAX)
+ backlog = rxrpc_max_backlog;
+ if (backlog > rxrpc_max_backlog) {
+ ret = -EINVAL;
+ } else {
+ sk->sk_max_ack_backlog = backlog;
+ rx->sk.sk_state = RXRPC_SERVER_LISTENING;
+ ret = 0;
+ }
+ break;
+ default:
+ ret = -EBUSY;
break;
}
@@ -549,7 +551,7 @@ static int rxrpc_create(struct net *net, struct socket *sock, int protocol,
sock_init_data(sock, sk);
sk->sk_state = RXRPC_UNBOUND;
sk->sk_write_space = rxrpc_write_space;
- sk->sk_max_ack_backlog = sysctl_rxrpc_max_qlen;
+ sk->sk_max_ack_backlog = 0;
sk->sk_destruct = rxrpc_sock_destructor;
rx = rxrpc_sk(sk);
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index bbf2443af875..4c29cf236dea 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -640,6 +640,7 @@ extern const struct rxrpc_security rxrpc_no_security;
/*
* misc.c
*/
+extern unsigned int rxrpc_max_backlog __read_mostly;
extern unsigned int rxrpc_requested_ack_delay;
extern unsigned int rxrpc_soft_ack_delay;
extern unsigned int rxrpc_idle_ack_delay;
diff --git a/net/rxrpc/misc.c b/net/rxrpc/misc.c
index 1afe9876e79f..bdc5e42fe600 100644
--- a/net/rxrpc/misc.c
+++ b/net/rxrpc/misc.c
@@ -15,6 +15,12 @@
#include "ar-internal.h"
/*
+ * The maximum listening backlog queue size that may be set on a socket by
+ * listen().
+ */
+unsigned int rxrpc_max_backlog __read_mostly = 10;
+
+/*
* How long to wait before scheduling ACK generation after seeing a
* packet with RXRPC_REQUEST_ACK set (in jiffies).
*/
diff --git a/net/rxrpc/sysctl.c b/net/rxrpc/sysctl.c
index d20ed575acf4..a99690a8a3da 100644
--- a/net/rxrpc/sysctl.c
+++ b/net/rxrpc/sysctl.c
@@ -18,6 +18,7 @@ static struct ctl_table_header *rxrpc_sysctl_reg_table;
static const unsigned int zero = 0;
static const unsigned int one = 1;
static const unsigned int four = 4;
+static const unsigned int thirtytwo = 32;
static const unsigned int n_65535 = 65535;
static const unsigned int n_max_acks = RXRPC_MAXACKS;
@@ -100,6 +101,15 @@ static struct ctl_table rxrpc_sysctl_table[] = {
/* Non-time values */
{
+ .procname = "max_backlog",
+ .data = &rxrpc_max_backlog,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = (void *)&four,
+ .extra2 = (void *)&thirtytwo,
+ },
+ {
.procname = "rx_window_size",
.data = &rxrpc_rx_window_size,
.maxlen = sizeof(unsigned int),
* [PATCH 2/3] rxrpc: The RXRPC_ACCEPT control message should not have an address
From: David Howells @ 2016-04-12 15:05 UTC (permalink / raw)
To: linux-afs; +Cc: dhowells, netdev, linux-kernel
In-Reply-To: <20160412150533.20637.23952.stgit@warthog.procyon.org.uk>
When sendmsg() is called with the RXRPC_ACCEPT control message, sendmsg()
shouldn't also be given an address in msg_name.
Signed-off-by: David Howells <dhowells@redhat.com>
---
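As an illustration only, the condition this patch adds can be modelled in
userspace as below. The enum and model_check_accept() are hypothetical
names invented for the sketch; only the rule itself (the socket must be
listening and msg_name must be unset) comes from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of the socket states relevant to RXRPC_ACCEPT. */
enum model_state {
	MODEL_SERVER_BOUND,
	MODEL_SERVER_LISTENING,
};

/*
 * Model of the RXRPC_CMD_ACCEPT precondition: the socket must be in the
 * listening state and the caller must not supply a target address.
 */
static int model_check_accept(enum model_state state, const void *msg_name)
{
	if (state != MODEL_SERVER_LISTENING || msg_name != NULL)
		return -EINVAL;
	return 0;
}

/* Usage example for the model above. */
void model_accept_demo(void)
{
	int addr = 0;	/* any non-NULL pointer plays the role of msg_name */

	assert(model_check_accept(MODEL_SERVER_LISTENING, NULL) == 0);
	assert(model_check_accept(MODEL_SERVER_LISTENING, &addr) == -EINVAL);
	assert(model_check_accept(MODEL_SERVER_BOUND, NULL) == -EINVAL);
}
```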
net/rxrpc/ar-output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index b87fda075b45..044de9bf34a4 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -199,7 +199,8 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
return ret;
if (cmd == RXRPC_CMD_ACCEPT) {
- if (rx->sk.sk_state != RXRPC_SERVER_LISTENING)
+ if (rx->sk.sk_state != RXRPC_SERVER_LISTENING ||
+ msg->msg_name)
return -EINVAL;
call = rxrpc_accept_call(rx, user_call_ID);
if (IS_ERR(call))
* [PATCH 1/3] rxrpc: Don't permit use of connect() op and simplify sendmsg() op
From: David Howells @ 2016-04-12 15:05 UTC (permalink / raw)
To: linux-afs; +Cc: dhowells, netdev, linux-kernel
In-Reply-To: <20160412150533.20637.23952.stgit@warthog.procyon.org.uk>
Simplify the RxRPC user interface and remove the use of connect() to direct
client calls. It is redundant given that sendmsg() can be given the target
address and calls to multiple targets are permitted from a client socket
and also from a service socket.
Simplify sendmsg() also. If we can't find a call immediately, we create
one, as now, but if the call then exists when we try and add it, we give an
error rather than using the call we found at the second attempt. We should
never see this situation unless two threads are racing, trying to create a
call with the same ID - which would be an error.
It also isn't required to provide sendmsg() with an address - provided the
control message data holds a user ID that maps to a currently active call.
Signed-off-by: David Howells <dhowells@redhat.com>
---
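To make the change in call-creation semantics concrete, here is a small
userspace sketch. It assumes a flat array in place of the per-socket
rb-tree, and struct model_sock, model_find_call() and model_new_call() are
hypothetical names; the point is only that creating a call with a user ID
that is already present now fails with -EEXIST instead of returning the
existing call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_CALLS 8	/* arbitrary capacity for the sketch */

struct model_sock {
	unsigned long ids[MODEL_MAX_CALLS];	/* user call IDs of active calls */
	size_t nr;				/* number of active calls */
};

/* Look up a call by user ID; returns its slot, or -1 if there is none. */
static int model_find_call(const struct model_sock *rx,
			   unsigned long user_call_ID)
{
	for (size_t i = 0; i < rx->nr; i++)
		if (rx->ids[i] == user_call_ID)
			return (int)i;
	return -1;
}

/*
 * Create a call for a new user ID.  A duplicate ID is an error rather than
 * a second reference to the call that is already there.
 */
static int model_new_call(struct model_sock *rx, unsigned long user_call_ID)
{
	if (model_find_call(rx, user_call_ID) >= 0)
		return -EEXIST;
	if (rx->nr == MODEL_MAX_CALLS)
		return -ENOMEM;
	rx->ids[rx->nr++] = user_call_ID;
	return 0;
}

/* Usage example for the model above. */
void model_call_demo(void)
{
	struct model_sock rx = { .nr = 0 };

	assert(model_new_call(&rx, 42) == 0);
	assert(model_new_call(&rx, 42) == -EEXIST);	/* same ID twice */
	assert(model_find_call(&rx, 42) == 0);
}
```

In the real code the duplicate can only be seen if two threads race to
create a call with the same ID, since sendmsg() looks the ID up first.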
Documentation/networking/rxrpc.txt | 8 --
include/linux/rxrpc.h | 18 ++-
net/rxrpc/af_rxrpc.c | 185 +++++++++---------------------------
net/rxrpc/ar-call.c | 158 ++++++++++++-------------------
net/rxrpc/ar-connection.c | 17 ---
net/rxrpc/ar-internal.h | 21 ++--
net/rxrpc/ar-output.c | 186 +++++++++++++++++-------------------
7 files changed, 219 insertions(+), 374 deletions(-)
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
index 16a924c486bf..a1089f93e4ce 100644
--- a/Documentation/networking/rxrpc.txt
+++ b/Documentation/networking/rxrpc.txt
@@ -216,12 +216,8 @@ Interaction with the user of the RxRPC socket:
be used in all other sendmsgs or recvmsgs associated with that call. The
tag is carried in the control data.
- (*) connect() is used to supply a default destination address for a client
- socket. This may be overridden by supplying an alternate address to the
- first sendmsg() of a call (struct msghdr::msg_name).
-
- (*) If connect() is called on an unbound client, a random local port will
- bound before the operation takes place.
+ (*) connect() is not used. The target address for a client call must be
+ supplied to the first sendmsg() of a call (struct msghdr::msg_name).
(*) A server socket may also be used to make client calls. To do this, the
first sendmsg() of the call must specify the target address. The server's
diff --git a/include/linux/rxrpc.h b/include/linux/rxrpc.h
index a53915cd5581..e4182d3f8c8b 100644
--- a/include/linux/rxrpc.h
+++ b/include/linux/rxrpc.h
@@ -40,16 +40,18 @@ struct sockaddr_rxrpc {
/*
* RxRPC control messages
+ * - data type is specified by default (ie. not abort or accept)
* - terminal messages mean that a user call ID tag can be recycled
+ * - s/r/- indicate whether these are applicable to sendmsg() and/or recvmsg()
*/
-#define RXRPC_USER_CALL_ID 1 /* user call ID specifier */
-#define RXRPC_ABORT 2 /* abort request / notification [terminal] */
-#define RXRPC_ACK 3 /* [Server] RPC op final ACK received [terminal] */
-#define RXRPC_NET_ERROR 5 /* network error received [terminal] */
-#define RXRPC_BUSY 6 /* server busy received [terminal] */
-#define RXRPC_LOCAL_ERROR 7 /* local error generated [terminal] */
-#define RXRPC_NEW_CALL 8 /* [Server] new incoming call notification */
-#define RXRPC_ACCEPT 9 /* [Server] accept request */
+#define RXRPC_USER_CALL_ID 1 /* sr: user call ID specifier */
+#define RXRPC_ABORT 2 /* sr: abort request / notification [terminal] */
+#define RXRPC_ACK 3 /* -r: [Service] RPC op final ACK received [terminal] */
+#define RXRPC_NET_ERROR 5 /* -r: network error received [terminal] */
+#define RXRPC_BUSY 6 /* -r: server busy received [terminal] */
+#define RXRPC_LOCAL_ERROR 7 /* -r: local error generated [terminal] */
+#define RXRPC_NEW_CALL 8 /* -r: [Service] new incoming call notification */
+#define RXRPC_ACCEPT 9 /* s-: [Service] accept request */
/*
* RxRPC security levels
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index e45e94ca030f..dd462352a79c 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -137,33 +137,33 @@ static int rxrpc_bind(struct socket *sock, struct sockaddr *saddr, int len)
lock_sock(&rx->sk);
- if (rx->sk.sk_state != RXRPC_UNCONNECTED) {
+ if (rx->sk.sk_state != RXRPC_UNBOUND) {
ret = -EINVAL;
goto error_unlock;
}
memcpy(&rx->srx, srx, sizeof(rx->srx));
- /* Find or create a local transport endpoint to use */
local = rxrpc_lookup_local(&rx->srx);
if (IS_ERR(local)) {
ret = PTR_ERR(local);
goto error_unlock;
}
- rx->local = local;
- if (srx->srx_service) {
+ if (rx->srx.srx_service) {
write_lock_bh(&local->services_lock);
list_for_each_entry(prx, &local->services, listen_link) {
- if (prx->srx.srx_service == srx->srx_service)
+ if (prx->srx.srx_service == rx->srx.srx_service)
goto service_in_use;
}
+ rx->local = local;
list_add_tail(&rx->listen_link, &local->services);
write_unlock_bh(&local->services_lock);
rx->sk.sk_state = RXRPC_SERVER_BOUND;
} else {
+ rx->local = local;
rx->sk.sk_state = RXRPC_CLIENT_BOUND;
}
@@ -172,8 +172,9 @@ static int rxrpc_bind(struct socket *sock, struct sockaddr *saddr, int len)
return 0;
service_in_use:
- ret = -EADDRINUSE;
write_unlock_bh(&local->services_lock);
+ rxrpc_put_local(local);
+ ret = -EADDRINUSE;
error_unlock:
release_sock(&rx->sk);
error:
@@ -195,11 +196,11 @@ static int rxrpc_listen(struct socket *sock, int backlog)
lock_sock(&rx->sk);
switch (rx->sk.sk_state) {
- case RXRPC_UNCONNECTED:
+ case RXRPC_UNBOUND:
ret = -EADDRNOTAVAIL;
break;
+ case RXRPC_CLIENT_UNBOUND:
case RXRPC_CLIENT_BOUND:
- case RXRPC_CLIENT_CONNECTED:
default:
ret = -EBUSY;
break;
@@ -219,20 +220,18 @@ static int rxrpc_listen(struct socket *sock, int backlog)
/*
* find a transport by address
*/
-static struct rxrpc_transport *rxrpc_name_to_transport(struct socket *sock,
- struct sockaddr *addr,
- int addr_len, int flags,
- gfp_t gfp)
+struct rxrpc_transport *rxrpc_name_to_transport(struct rxrpc_sock *rx,
+ struct sockaddr *addr,
+ int addr_len, int flags,
+ gfp_t gfp)
{
struct sockaddr_rxrpc *srx = (struct sockaddr_rxrpc *) addr;
struct rxrpc_transport *trans;
- struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
struct rxrpc_peer *peer;
_enter("%p,%p,%d,%d", rx, addr, addr_len, flags);
ASSERT(rx->local != NULL);
- ASSERT(rx->sk.sk_state > RXRPC_UNCONNECTED);
if (rx->srx.transport_type != srx->transport_type)
return ERR_PTR(-ESOCKTNOSUPPORT);
@@ -254,7 +253,7 @@ static struct rxrpc_transport *rxrpc_name_to_transport(struct socket *sock,
/**
* rxrpc_kernel_begin_call - Allow a kernel service to begin a call
* @sock: The socket on which to make the call
- * @srx: The address of the peer to contact (defaults to socket setting)
+ * @srx: The address of the peer to contact
* @key: The security context to use (defaults to socket setting)
* @user_call_ID: The ID to use
*
@@ -280,25 +279,14 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *sock,
lock_sock(&rx->sk);
- if (srx) {
- trans = rxrpc_name_to_transport(sock, (struct sockaddr *) srx,
- sizeof(*srx), 0, gfp);
- if (IS_ERR(trans)) {
- call = ERR_CAST(trans);
- trans = NULL;
- goto out_notrans;
- }
- } else {
- trans = rx->trans;
- if (!trans) {
- call = ERR_PTR(-ENOTCONN);
- goto out_notrans;
- }
- atomic_inc(&trans->usage);
+ trans = rxrpc_name_to_transport(rx, (struct sockaddr *)srx,
+ sizeof(*srx), 0, gfp);
+ if (IS_ERR(trans)) {
+ call = ERR_CAST(trans);
+ trans = NULL;
+ goto out_notrans;
}
- if (!srx)
- srx = &rx->srx;
if (!key)
key = rx->key;
if (key && !key->payload.data[0])
@@ -310,8 +298,7 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *sock,
goto out;
}
- call = rxrpc_get_client_call(rx, trans, bundle, user_call_ID, true,
- gfp);
+ call = rxrpc_new_client_call(rx, trans, bundle, user_call_ID, gfp);
rxrpc_put_bundle(trans, bundle);
out:
rxrpc_put_transport(trans);
@@ -360,69 +347,14 @@ void rxrpc_kernel_intercept_rx_messages(struct socket *sock,
EXPORT_SYMBOL(rxrpc_kernel_intercept_rx_messages);
/*
- * connect an RxRPC socket
- * - this just targets it at a specific destination; no actual connection
- * negotiation takes place
+ * We don't permit connection of an RxRPC socket. It's pointless since
+ * sendmsg() takes the target address for a new call and a socket can support
+ * calls to multiple servers simultaneously.
*/
static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
int addr_len, int flags)
{
- struct sockaddr_rxrpc *srx = (struct sockaddr_rxrpc *) addr;
- struct sock *sk = sock->sk;
- struct rxrpc_transport *trans;
- struct rxrpc_local *local;
- struct rxrpc_sock *rx = rxrpc_sk(sk);
- int ret;
-
- _enter("%p,%p,%d,%d", rx, addr, addr_len, flags);
-
- ret = rxrpc_validate_address(rx, srx, addr_len);
- if (ret < 0) {
- _leave(" = %d [bad addr]", ret);
- return ret;
- }
-
- lock_sock(&rx->sk);
-
- switch (rx->sk.sk_state) {
- case RXRPC_UNCONNECTED:
- /* find a local transport endpoint if we don't have one already */
- ASSERTCMP(rx->local, ==, NULL);
- rx->srx.srx_family = AF_RXRPC;
- rx->srx.srx_service = 0;
- rx->srx.transport_type = srx->transport_type;
- rx->srx.transport_len = sizeof(sa_family_t);
- rx->srx.transport.family = srx->transport.family;
- local = rxrpc_lookup_local(&rx->srx);
- if (IS_ERR(local)) {
- release_sock(&rx->sk);
- return PTR_ERR(local);
- }
- rx->local = local;
- rx->sk.sk_state = RXRPC_CLIENT_BOUND;
- case RXRPC_CLIENT_BOUND:
- break;
- case RXRPC_CLIENT_CONNECTED:
- release_sock(&rx->sk);
- return -EISCONN;
- default:
- release_sock(&rx->sk);
- return -EBUSY; /* server sockets can't connect as well */
- }
-
- trans = rxrpc_name_to_transport(sock, addr, addr_len, flags,
- GFP_KERNEL);
- if (IS_ERR(trans)) {
- release_sock(&rx->sk);
- _leave(" = %ld", PTR_ERR(trans));
- return PTR_ERR(trans);
- }
-
- rx->trans = trans;
- rx->sk.sk_state = RXRPC_CLIENT_CONNECTED;
-
- release_sock(&rx->sk);
- return 0;
+ return -EOPNOTSUPP;
}
/*
@@ -436,7 +368,7 @@ static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
*/
static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
{
- struct rxrpc_transport *trans;
+ struct rxrpc_local *local;
struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
int ret;
@@ -453,48 +385,33 @@ static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
}
}
- trans = NULL;
lock_sock(&rx->sk);
- if (m->msg_name) {
- ret = -EISCONN;
- trans = rxrpc_name_to_transport(sock, m->msg_name,
- m->msg_namelen, 0, GFP_KERNEL);
- if (IS_ERR(trans)) {
- ret = PTR_ERR(trans);
- trans = NULL;
- goto out;
- }
- } else {
- trans = rx->trans;
- if (trans)
- atomic_inc(&trans->usage);
- }
-
switch (rx->sk.sk_state) {
- case RXRPC_SERVER_LISTENING:
- if (!m->msg_name) {
- ret = rxrpc_server_sendmsg(rx, m, len);
- break;
+ case RXRPC_UNBOUND:
+ local = rxrpc_lookup_local(&rx->srx);
+ if (IS_ERR(local)) {
+ ret = PTR_ERR(local);
+ goto error_unlock;
}
- case RXRPC_SERVER_BOUND:
+
+ rx->local = local;
+ rx->sk.sk_state = RXRPC_CLIENT_UNBOUND;
+ /* Fall through */
+
+ case RXRPC_CLIENT_UNBOUND:
case RXRPC_CLIENT_BOUND:
- if (!m->msg_name) {
- ret = -ENOTCONN;
- break;
- }
- case RXRPC_CLIENT_CONNECTED:
- ret = rxrpc_client_sendmsg(rx, trans, m, len);
+ case RXRPC_SERVER_BOUND:
+ case RXRPC_SERVER_LISTENING:
+ ret = rxrpc_do_sendmsg(rx, m, len);
break;
default:
- ret = -ENOTCONN;
+ ret = -EINVAL;
break;
}
-out:
+error_unlock:
release_sock(&rx->sk);
- if (trans)
- rxrpc_put_transport(trans);
_leave(" = %d", ret);
return ret;
}
@@ -521,7 +438,7 @@ static int rxrpc_setsockopt(struct socket *sock, int level, int optname,
if (optlen != 0)
goto error;
ret = -EISCONN;
- if (rx->sk.sk_state != RXRPC_UNCONNECTED)
+ if (rx->sk.sk_state != RXRPC_UNBOUND)
goto error;
set_bit(RXRPC_SOCK_EXCLUSIVE_CONN, &rx->flags);
goto success;
@@ -531,7 +448,7 @@ static int rxrpc_setsockopt(struct socket *sock, int level, int optname,
if (rx->key)
goto error;
ret = -EISCONN;
- if (rx->sk.sk_state != RXRPC_UNCONNECTED)
+ if (rx->sk.sk_state != RXRPC_UNBOUND)
goto error;
ret = rxrpc_request_key(rx, optval, optlen);
goto error;
@@ -541,7 +458,7 @@ static int rxrpc_setsockopt(struct socket *sock, int level, int optname,
if (rx->key)
goto error;
ret = -EISCONN;
- if (rx->sk.sk_state != RXRPC_UNCONNECTED)
+ if (rx->sk.sk_state != RXRPC_UNBOUND)
goto error;
ret = rxrpc_server_keyring(rx, optval, optlen);
goto error;
@@ -551,7 +468,7 @@ static int rxrpc_setsockopt(struct socket *sock, int level, int optname,
if (optlen != sizeof(unsigned int))
goto error;
ret = -EISCONN;
- if (rx->sk.sk_state != RXRPC_UNCONNECTED)
+ if (rx->sk.sk_state != RXRPC_UNBOUND)
goto error;
ret = get_user(min_sec_level,
(unsigned int __user *) optval);
@@ -630,7 +547,7 @@ static int rxrpc_create(struct net *net, struct socket *sock, int protocol,
return -ENOMEM;
sock_init_data(sock, sk);
- sk->sk_state = RXRPC_UNCONNECTED;
+ sk->sk_state = RXRPC_UNBOUND;
sk->sk_write_space = rxrpc_write_space;
sk->sk_max_ack_backlog = sysctl_rxrpc_max_qlen;
sk->sk_destruct = rxrpc_sock_destructor;
@@ -703,14 +620,6 @@ static int rxrpc_release_sock(struct sock *sk)
rx->conn = NULL;
}
- if (rx->bundle) {
- rxrpc_put_bundle(rx->trans, rx->bundle);
- rx->bundle = NULL;
- }
- if (rx->trans) {
- rxrpc_put_transport(rx->trans);
- rx->trans = NULL;
- }
if (rx->local) {
rxrpc_put_local(rx->local);
rx->local = NULL;
diff --git a/net/rxrpc/ar-call.c b/net/rxrpc/ar-call.c
index 571a41fd5a32..9296bdb26c24 100644
--- a/net/rxrpc/ar-call.c
+++ b/net/rxrpc/ar-call.c
@@ -194,6 +194,43 @@ struct rxrpc_call *rxrpc_find_call_hash(
}
/*
+ * find an extant server call
+ * - called in process context with IRQs enabled
+ */
+struct rxrpc_call *rxrpc_find_call_by_user_ID(struct rxrpc_sock *rx,
+ unsigned long user_call_ID)
+{
+ struct rxrpc_call *call;
+ struct rb_node *p;
+
+ _enter("%p,%lx", rx, user_call_ID);
+
+ read_lock(&rx->call_lock);
+
+ p = rx->calls.rb_node;
+ while (p) {
+ call = rb_entry(p, struct rxrpc_call, sock_node);
+
+ if (user_call_ID < call->user_call_ID)
+ p = p->rb_left;
+ else if (user_call_ID > call->user_call_ID)
+ p = p->rb_right;
+ else
+ goto found_extant_call;
+ }
+
+ read_unlock(&rx->call_lock);
+ _leave(" = NULL");
+ return NULL;
+
+found_extant_call:
+ rxrpc_get_call(call);
+ read_unlock(&rx->call_lock);
+ _leave(" = %p [%d]", call, atomic_read(&call->usage));
+ return call;
+}
+
+/*
* allocate a new call
*/
static struct rxrpc_call *rxrpc_alloc_call(gfp_t gfp)
@@ -309,51 +346,27 @@ static struct rxrpc_call *rxrpc_alloc_client_call(
* set up a call for the given data
* - called in process context with IRQs enabled
*/
-struct rxrpc_call *rxrpc_get_client_call(struct rxrpc_sock *rx,
+struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
struct rxrpc_transport *trans,
struct rxrpc_conn_bundle *bundle,
unsigned long user_call_ID,
- int create,
gfp_t gfp)
{
- struct rxrpc_call *call, *candidate;
- struct rb_node *p, *parent, **pp;
+ struct rxrpc_call *call, *xcall;
+ struct rb_node *parent, **pp;
- _enter("%p,%d,%d,%lx,%d",
- rx, trans ? trans->debug_id : -1, bundle ? bundle->debug_id : -1,
- user_call_ID, create);
+ _enter("%p,%d,%d,%lx",
+ rx, trans->debug_id, bundle ? bundle->debug_id : -1,
+ user_call_ID);
- /* search the extant calls first for one that matches the specified
- * user ID */
- read_lock(&rx->call_lock);
-
- p = rx->calls.rb_node;
- while (p) {
- call = rb_entry(p, struct rxrpc_call, sock_node);
-
- if (user_call_ID < call->user_call_ID)
- p = p->rb_left;
- else if (user_call_ID > call->user_call_ID)
- p = p->rb_right;
- else
- goto found_extant_call;
+ call = rxrpc_alloc_client_call(rx, trans, bundle, gfp);
+ if (IS_ERR(call)) {
+ _leave(" = %ld", PTR_ERR(call));
+ return call;
}
- read_unlock(&rx->call_lock);
-
- if (!create || !trans)
- return ERR_PTR(-EBADSLT);
-
- /* not yet present - create a candidate for a new record and then
- * redo the search */
- candidate = rxrpc_alloc_client_call(rx, trans, bundle, gfp);
- if (IS_ERR(candidate)) {
- _leave(" = %ld", PTR_ERR(candidate));
- return candidate;
- }
-
- candidate->user_call_ID = user_call_ID;
- __set_bit(RXRPC_CALL_HAS_USERID, &candidate->flags);
+ call->user_call_ID = user_call_ID;
+ __set_bit(RXRPC_CALL_HAS_USERID, &call->flags);
write_lock(&rx->call_lock);
@@ -361,19 +374,16 @@ struct rxrpc_call *rxrpc_get_client_call(struct rxrpc_sock *rx,
parent = NULL;
while (*pp) {
parent = *pp;
- call = rb_entry(parent, struct rxrpc_call, sock_node);
+ xcall = rb_entry(parent, struct rxrpc_call, sock_node);
- if (user_call_ID < call->user_call_ID)
+ if (user_call_ID < xcall->user_call_ID)
pp = &(*pp)->rb_left;
- else if (user_call_ID > call->user_call_ID)
+ else if (user_call_ID > xcall->user_call_ID)
pp = &(*pp)->rb_right;
else
- goto found_extant_second;
+ goto found_user_ID_now_present;
}
- /* second search also failed; add the new call */
- call = candidate;
- candidate = NULL;
rxrpc_get_call(call);
rb_link_node(&call->sock_node, parent, pp);
@@ -389,20 +399,16 @@ struct rxrpc_call *rxrpc_get_client_call(struct rxrpc_sock *rx,
_leave(" = %p [new]", call);
return call;
- /* we found the call in the list immediately */
-found_extant_call:
- rxrpc_get_call(call);
- read_unlock(&rx->call_lock);
- _leave(" = %p [extant %d]", call, atomic_read(&call->usage));
- return call;
-
- /* we found the call on the second time through the list */
-found_extant_second:
- rxrpc_get_call(call);
+ /* We unexpectedly found the user ID in the list after taking
+ * the call_lock. This shouldn't happen unless the user races
+ * with itself and tries to add the same user ID twice at the
+ * same time in different threads.
+ */
+found_user_ID_now_present:
write_unlock(&rx->call_lock);
- rxrpc_put_call(candidate);
- _leave(" = %p [second %d]", call, atomic_read(&call->usage));
- return call;
+ rxrpc_put_call(call);
+ _leave(" = -EEXIST [%p]", call);
+ return ERR_PTR(-EEXIST);
}
/*
@@ -564,46 +570,6 @@ old_call:
}
/*
- * find an extant server call
- * - called in process context with IRQs enabled
- */
-struct rxrpc_call *rxrpc_find_server_call(struct rxrpc_sock *rx,
- unsigned long user_call_ID)
-{
- struct rxrpc_call *call;
- struct rb_node *p;
-
- _enter("%p,%lx", rx, user_call_ID);
-
- /* search the extant calls for one that matches the specified user
- * ID */
- read_lock(&rx->call_lock);
-
- p = rx->calls.rb_node;
- while (p) {
- call = rb_entry(p, struct rxrpc_call, sock_node);
-
- if (user_call_ID < call->user_call_ID)
- p = p->rb_left;
- else if (user_call_ID > call->user_call_ID)
- p = p->rb_right;
- else
- goto found_extant_call;
- }
-
- read_unlock(&rx->call_lock);
- _leave(" = NULL");
- return NULL;
-
- /* we found the call in the list immediately */
-found_extant_call:
- rxrpc_get_call(call);
- read_unlock(&rx->call_lock);
- _leave(" = %p [%d]", call, atomic_read(&call->usage));
- return call;
-}
-
-/*
* detach a call from a socket and set up for release
*/
void rxrpc_release_call(struct rxrpc_call *call)
diff --git a/net/rxrpc/ar-connection.c b/net/rxrpc/ar-connection.c
index 97f4fae74bca..5307dba4a13a 100644
--- a/net/rxrpc/ar-connection.c
+++ b/net/rxrpc/ar-connection.c
@@ -78,11 +78,6 @@ struct rxrpc_conn_bundle *rxrpc_get_bundle(struct rxrpc_sock *rx,
_enter("%p{%x},%x,%hx,",
rx, key_serial(key), trans->debug_id, service_id);
- if (rx->trans == trans && rx->bundle) {
- atomic_inc(&rx->bundle->usage);
- return rx->bundle;
- }
-
/* search the extant bundles first for one that matches the specified
* user ID */
spin_lock(&trans->client_lock);
@@ -136,10 +131,6 @@ struct rxrpc_conn_bundle *rxrpc_get_bundle(struct rxrpc_sock *rx,
rb_insert_color(&bundle->node, &trans->bundles);
spin_unlock(&trans->client_lock);
_net("BUNDLE new on trans %d", trans->debug_id);
- if (!rx->bundle && rx->sk.sk_state == RXRPC_CLIENT_CONNECTED) {
- atomic_inc(&bundle->usage);
- rx->bundle = bundle;
- }
_leave(" = %p [new]", bundle);
return bundle;
@@ -148,10 +139,6 @@ found_extant_bundle:
atomic_inc(&bundle->usage);
spin_unlock(&trans->client_lock);
_net("BUNDLE old on trans %d", trans->debug_id);
- if (!rx->bundle && rx->sk.sk_state == RXRPC_CLIENT_CONNECTED) {
- atomic_inc(&bundle->usage);
- rx->bundle = bundle;
- }
_leave(" = %p [extant %d]", bundle, atomic_read(&bundle->usage));
return bundle;
@@ -161,10 +148,6 @@ found_extant_second:
spin_unlock(&trans->client_lock);
kfree(candidate);
_net("BUNDLE old2 on trans %d", trans->debug_id);
- if (!rx->bundle && rx->sk.sk_state == RXRPC_CLIENT_CONNECTED) {
- atomic_inc(&bundle->usage);
- rx->bundle = bundle;
- }
_leave(" = %p [second %d]", bundle, atomic_read(&bundle->usage));
return bundle;
}
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index f0b807a163fa..bbf2443af875 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -39,9 +39,9 @@ struct rxrpc_crypt {
* sk_state for RxRPC sockets
*/
enum {
- RXRPC_UNCONNECTED = 0,
+ RXRPC_UNBOUND = 0,
+ RXRPC_CLIENT_UNBOUND, /* Unbound socket used as client */
RXRPC_CLIENT_BOUND, /* client local address bound */
- RXRPC_CLIENT_CONNECTED, /* client is connected */
RXRPC_SERVER_BOUND, /* server local address bound */
RXRPC_SERVER_LISTENING, /* server listening for connections */
RXRPC_CLOSE, /* socket is being closed */
@@ -55,8 +55,6 @@ struct rxrpc_sock {
struct sock sk;
rxrpc_interceptor_t interceptor; /* kernel service Rx interceptor function */
struct rxrpc_local *local; /* local endpoint */
- struct rxrpc_transport *trans; /* transport handler */
- struct rxrpc_conn_bundle *bundle; /* virtual connection bundle */
struct rxrpc_connection *conn; /* exclusive virtual connection */
struct list_head listen_link; /* link in the local endpoint's listen list */
struct list_head secureq; /* calls awaiting connection security clearance */
@@ -477,6 +475,10 @@ extern u32 rxrpc_epoch;
extern atomic_t rxrpc_debug_id;
extern struct workqueue_struct *rxrpc_workqueue;
+struct rxrpc_transport *rxrpc_name_to_transport(struct rxrpc_sock *,
+ struct sockaddr *,
+ int, int, gfp_t);
+
/*
* ar-accept.c
*/
@@ -502,14 +504,15 @@ extern rwlock_t rxrpc_call_lock;
struct rxrpc_call *rxrpc_find_call_hash(struct rxrpc_host_header *,
void *, sa_family_t, const void *);
-struct rxrpc_call *rxrpc_get_client_call(struct rxrpc_sock *,
+struct rxrpc_call *rxrpc_find_call_by_user_ID(struct rxrpc_sock *,
+ unsigned long);
+struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *,
struct rxrpc_transport *,
struct rxrpc_conn_bundle *,
- unsigned long, int, gfp_t);
+ unsigned long, gfp_t);
struct rxrpc_call *rxrpc_incoming_call(struct rxrpc_sock *,
struct rxrpc_connection *,
struct rxrpc_host_header *);
-struct rxrpc_call *rxrpc_find_server_call(struct rxrpc_sock *, unsigned long);
void rxrpc_release_call(struct rxrpc_call *);
void rxrpc_release_calls_on_socket(struct rxrpc_sock *);
void __rxrpc_put_call(struct rxrpc_call *);
@@ -581,9 +584,7 @@ int rxrpc_get_server_data_key(struct rxrpc_connection *, const void *, time_t,
extern unsigned int rxrpc_resend_timeout;
int rxrpc_send_packet(struct rxrpc_transport *, struct sk_buff *);
-int rxrpc_client_sendmsg(struct rxrpc_sock *, struct rxrpc_transport *,
- struct msghdr *, size_t);
-int rxrpc_server_sendmsg(struct rxrpc_sock *, struct msghdr *, size_t);
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *, size_t);
/*
* ar-peer.c
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index 51cb10062a8d..b87fda075b45 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -30,13 +30,13 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
/*
* extract control messages from the sendmsg() control buffer
*/
-static int rxrpc_sendmsg_cmsg(struct rxrpc_sock *rx, struct msghdr *msg,
+static int rxrpc_sendmsg_cmsg(struct msghdr *msg,
unsigned long *user_call_ID,
enum rxrpc_command *command,
- u32 *abort_code,
- bool server)
+ u32 *abort_code)
{
struct cmsghdr *cmsg;
+ bool got_user_ID = false;
int len;
*command = RXRPC_CMD_SEND_DATA;
@@ -68,6 +68,7 @@ static int rxrpc_sendmsg_cmsg(struct rxrpc_sock *rx, struct msghdr *msg,
CMSG_DATA(cmsg);
}
_debug("User Call ID %lx", *user_call_ID);
+ got_user_ID = true;
break;
case RXRPC_ABORT:
@@ -88,8 +89,6 @@ static int rxrpc_sendmsg_cmsg(struct rxrpc_sock *rx, struct msghdr *msg,
*command = RXRPC_CMD_ACCEPT;
if (len != 0)
return -EINVAL;
- if (!server)
- return -EISCONN;
break;
default:
@@ -97,6 +96,8 @@ static int rxrpc_sendmsg_cmsg(struct rxrpc_sock *rx, struct msghdr *msg,
}
}
+ if (!got_user_ID)
+ return -EINVAL;
_leave(" = 0");
return 0;
}
@@ -124,55 +125,96 @@ static void rxrpc_send_abort(struct rxrpc_call *call, u32 abort_code)
}
/*
+ * Create a new client call for sendmsg().
+ */
+static struct rxrpc_call *
+rxrpc_new_client_call_for_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg,
+ unsigned long user_call_ID)
+{
+ struct rxrpc_conn_bundle *bundle;
+ struct rxrpc_transport *trans;
+ struct rxrpc_call *call;
+ struct key *key;
+ long ret;
+
+ DECLARE_SOCKADDR(struct sockaddr_rxrpc *, srx, msg->msg_name);
+
+ _enter("");
+
+ if (!msg->msg_name)
+ return ERR_PTR(-EDESTADDRREQ);
+
+ trans = rxrpc_name_to_transport(rx, msg->msg_name, msg->msg_namelen, 0,
+ GFP_KERNEL);
+ if (IS_ERR(trans)) {
+ ret = PTR_ERR(trans);
+ goto out;
+ }
+
+ key = rx->key;
+ if (key && !rx->key->payload.data[0])
+ key = NULL;
+ bundle = rxrpc_get_bundle(rx, trans, key, srx->srx_service, GFP_KERNEL);
+ if (IS_ERR(bundle)) {
+ ret = PTR_ERR(bundle);
+ goto out_trans;
+ }
+
+ call = rxrpc_new_client_call(rx, trans, bundle, user_call_ID,
+ GFP_KERNEL);
+ rxrpc_put_bundle(trans, bundle);
+ rxrpc_put_transport(trans);
+ if (IS_ERR(call)) {
+ ret = PTR_ERR(call);
+ goto out;
+ }
+
+ _leave(" = %p", call);
+ return call;
+
+out_trans:
+ rxrpc_put_transport(trans);
+out:
+ _leave(" = %ld", ret);
+ return ERR_PTR(ret);
+}
+
+/*
* send a message forming part of a client call through an RxRPC socket
* - caller holds the socket locked
* - the socket may be either a client socket or a server socket
*/
-int rxrpc_client_sendmsg(struct rxrpc_sock *rx, struct rxrpc_transport *trans,
- struct msghdr *msg, size_t len)
+int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
{
- struct rxrpc_conn_bundle *bundle;
enum rxrpc_command cmd;
struct rxrpc_call *call;
unsigned long user_call_ID = 0;
- struct key *key;
- u16 service_id;
u32 abort_code = 0;
int ret;
_enter("");
- ASSERT(trans != NULL);
-
- ret = rxrpc_sendmsg_cmsg(rx, msg, &user_call_ID, &cmd, &abort_code,
- false);
+ ret = rxrpc_sendmsg_cmsg(msg, &user_call_ID, &cmd, &abort_code);
if (ret < 0)
return ret;
- bundle = NULL;
- if (trans) {
- service_id = rx->srx.srx_service;
- if (msg->msg_name) {
- DECLARE_SOCKADDR(struct sockaddr_rxrpc *, srx,
- msg->msg_name);
- service_id = srx->srx_service;
- }
- key = rx->key;
- if (key && !rx->key->payload.data[0])
- key = NULL;
- bundle = rxrpc_get_bundle(rx, trans, key, service_id,
- GFP_KERNEL);
- if (IS_ERR(bundle))
- return PTR_ERR(bundle);
+ if (cmd == RXRPC_CMD_ACCEPT) {
+ if (rx->sk.sk_state != RXRPC_SERVER_LISTENING)
+ return -EINVAL;
+ call = rxrpc_accept_call(rx, user_call_ID);
+ if (IS_ERR(call))
+ return PTR_ERR(call);
+ rxrpc_put_call(call);
+ return 0;
}
- call = rxrpc_get_client_call(rx, trans, bundle, user_call_ID,
- abort_code == 0, GFP_KERNEL);
- if (trans)
- rxrpc_put_bundle(trans, bundle);
- if (IS_ERR(call)) {
- _leave(" = %ld", PTR_ERR(call));
- return PTR_ERR(call);
+ call = rxrpc_find_call_by_user_ID(rx, user_call_ID);
+ if (!call) {
+ if (cmd != RXRPC_CMD_SEND_DATA)
+ return -EBADSLT;
+ call = rxrpc_new_client_call_for_sendmsg(rx, msg, user_call_ID);
+ if (IS_ERR(call))
+ return PTR_ERR(call);
}
_debug("CALL %d USR %lx ST %d on CONN %p",
@@ -180,14 +222,21 @@ int rxrpc_client_sendmsg(struct rxrpc_sock *rx, struct rxrpc_transport *trans,
if (call->state >= RXRPC_CALL_COMPLETE) {
/* it's too late for this call */
- ret = -ESHUTDOWN;
+ ret = -ECONNRESET;
} else if (cmd == RXRPC_CMD_SEND_ABORT) {
rxrpc_send_abort(call, abort_code);
+ ret = 0;
} else if (cmd != RXRPC_CMD_SEND_DATA) {
ret = -EINVAL;
- } else if (call->state != RXRPC_CALL_CLIENT_SEND_REQUEST) {
+ } else if (!call->in_clientflag &&
+ call->state != RXRPC_CALL_CLIENT_SEND_REQUEST) {
/* request phase complete for this client call */
ret = -EPROTO;
+ } else if (call->in_clientflag &&
+ call->state != RXRPC_CALL_SERVER_ACK_REQUEST &&
+ call->state != RXRPC_CALL_SERVER_SEND_REPLY) {
+ /* Reply phase not begun or not complete for service call. */
+ ret = -EPROTO;
} else {
ret = rxrpc_send_data(rx, call, msg, len);
}
@@ -266,67 +315,6 @@ void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code)
EXPORT_SYMBOL(rxrpc_kernel_abort_call);
/*
- * send a message through a server socket
- * - caller holds the socket locked
- */
-int rxrpc_server_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
-{
- enum rxrpc_command cmd;
- struct rxrpc_call *call;
- unsigned long user_call_ID = 0;
- u32 abort_code = 0;
- int ret;
-
- _enter("");
-
- ret = rxrpc_sendmsg_cmsg(rx, msg, &user_call_ID, &cmd, &abort_code,
- true);
- if (ret < 0)
- return ret;
-
- if (cmd == RXRPC_CMD_ACCEPT) {
- call = rxrpc_accept_call(rx, user_call_ID);
- if (IS_ERR(call))
- return PTR_ERR(call);
- rxrpc_put_call(call);
- return 0;
- }
-
- call = rxrpc_find_server_call(rx, user_call_ID);
- if (!call)
- return -EBADSLT;
- if (call->state >= RXRPC_CALL_COMPLETE) {
- ret = -ESHUTDOWN;
- goto out;
- }
-
- switch (cmd) {
- case RXRPC_CMD_SEND_DATA:
- if (call->state != RXRPC_CALL_CLIENT_SEND_REQUEST &&
- call->state != RXRPC_CALL_SERVER_ACK_REQUEST &&
- call->state != RXRPC_CALL_SERVER_SEND_REPLY) {
- /* Tx phase not yet begun for this call */
- ret = -EPROTO;
- break;
- }
-
- ret = rxrpc_send_data(rx, call, msg, len);
- break;
-
- case RXRPC_CMD_SEND_ABORT:
- rxrpc_send_abort(call, abort_code);
- break;
- default:
- BUG();
- }
-
- out:
- rxrpc_put_call(call);
- _leave(" = %d", ret);
- return ret;
-}
-
-/*
* send a packet through the transport endpoint
*/
int rxrpc_send_packet(struct rxrpc_transport *trans, struct sk_buff *skb)
^ permalink raw reply related
* [PATCH 0/3] RxRPC: 2nd rewrite part 2
From: David Howells @ 2016-04-12 15:05 UTC (permalink / raw)
To: linux-afs; +Cc: dhowells, netdev, linux-kernel
Here's the next part of the AF_RXRPC rewrite. In this set I make some
changes to the user interface for AF_RXRPC:
(1) connect() is no longer supported on an AF_RXRPC socket. It is
redundant given that sendmsg() can be given the target address;
indeed, even on a connected client socket, sendmsg() can still be used
with an address other than the connection address.
(2) listen() is required to allow a service socket to begin accepting
incoming calls. Previously, bind() with a service ID set in the
address caused the socket to begin listening, and listen() only
adjusted the backlog parameter on the socket.
(3) The maximum backlog size can be adjusted through a sysctl - though it
is still limited to the range 4-32. At some point I would like to
have some preallocated rxrpc_call structs prepared for incoming calls,
using the backlog to limit the preallocation. Passing INT_MAX to
listen() requests the maximum allowed.
(4) Calling sendmsg() on a socket that is not yet bound shifts the socket
to be a purely client socket and binds a random local UDP port.
(5) sendmsg() with a RXRPC_ACCEPT control message must not also have an
address specified in msg_name. It doesn't make sense to supply an
address here.
(6) If sendmsg() is asked to make a call with a particular user call ID
which doesn't yet exist, the user call ID must not come into existence
whilst sendmsg() is off creating a new call. Previously it would just
add its data to the call.
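To make the userspace side of (1) and (4)-(6) a little more concrete, here is a minimal sketch of how the user call ID control message could be attached to a sendmsg() call. It is only an illustration: build_call_cmsg() is a hypothetical helper name, and the fallback values for SOL_RXRPC and RXRPC_USER_CALL_ID are assumed to match the kernel headers so that the sketch compiles without them.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed to match the kernel headers; defined here only so that the
 * sketch builds on systems without the rxrpc definitions. */
#ifndef SOL_RXRPC
#define SOL_RXRPC 272
#endif
#ifndef RXRPC_USER_CALL_ID
#define RXRPC_USER_CALL_ID 1
#endif

/* Hypothetical helper: attach a user call ID to msg using a caller-supplied
 * control buffer.  Returns 0 on success, -1 if the buffer is too small. */
static int build_call_cmsg(struct msghdr *msg, void *ctrl, size_t ctrl_len,
			   unsigned long user_call_ID)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(user_call_ID)))
		return -1;

	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(user_call_ID));

	cmsg = CMSG_FIRSTHDR(msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(user_call_ID));
	memcpy(CMSG_DATA(cmsg), &user_call_ID, sizeof(user_call_ID));
	return 0;
}
```

The caller would then point msg_name at the target address and call sendmsg() directly, with no connect() beforehand. With this series applied, a sendmsg() without the call ID control message should fail with -EINVAL, and starting a new call without msg_name should fail with -EDESTADDRREQ.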
I would also like to consider making further changes, but I think they'd
probably be too much of a change:
(*) Require a control message of RXRPC_NEW_CALL to be passed to sendmsg()
when beginning a new call to make it clear that we're instituting a
new user call ID, not expecting the user call ID to already exist with
the socket. This would make (6) above cleaner.
(*) Provide RXRPC_LOCALLY_ABORTED and RXRPC_REMOTELY_ABORTED control
messages for recvmsg() to return instead of RXRPC_ABORT (which would
then be for sendmsg() only). Another way to do this is to return an
additional control message that, say, indicates that the termination
was remote.
(*) Allow userspace to presupply user call IDs for incoming calls to use.
These would be used instead of RXRPC_ACCEPT. A control message would
be required: one for sendmsg() to supply a user ID (RXRPC_PREACCEPT
say) and then RXRPC_NEW_CALL would be given a parameter through
recvmsg() to indicate the number of user call IDs available.
The patches can be found here also:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-rewrite
Tagged thusly:
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
rxrpc-rewrite-20160412
This is based on net-next/master
David
---
David Howells (3):
rxrpc: Don't permit use of connect() op and simplify sendmsg() op
rxrpc: The RXRPC_ACCEPT control message should not have an address
rxrpc: Use the listen() system call to move to listening state
Documentation/networking/rxrpc.txt | 8 -
fs/afs/rxrpc.c | 34 +++---
include/linux/rxrpc.h | 18 ++-
net/rxrpc/af_rxrpc.c | 209 ++++++++++--------------------------
net/rxrpc/ar-call.c | 158 +++++++++++----------------
net/rxrpc/ar-connection.c | 17 ---
net/rxrpc/ar-internal.h | 22 ++--
net/rxrpc/ar-output.c | 187 +++++++++++++++-----------------
net/rxrpc/misc.c | 6 +
net/rxrpc/sysctl.c | 10 ++
10 files changed, 269 insertions(+), 400 deletions(-)
^ permalink raw reply
* Re: TCP reaching to maximum throughput after a long time
From: Ben Greear @ 2016-04-12 15:04 UTC (permalink / raw)
To: Machani, Yaniv, netdev
Cc: Eric Dumazet, David S. Miller, Eric Dumazet, Neal Cardwell,
Yuchung Cheng, Nandita Dukkipati, open list, Kama, Meirav
In-Reply-To: <1460472764.6473.589.camel@edumazet-glaptop3.roam.corp.google.com>
On 04/12/2016 07:52 AM, Eric Dumazet wrote:
> On Tue, 2016-04-12 at 12:17 +0000, Machani, Yaniv wrote:
>> Hi,
>> After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance degradation over Wi-Fi.
>> In 3.14 kernel, TCP got to its max throughput after less than a second, while in the 4.4 it is taking ~20-30 seconds.
>> UDP TX/RX and TCP RX performance is as expected.
>> We are using a Beagle Bone Black and a WiLink8 device.
>>
>> Were there any related changes that might cause such behavior ?
>> Kernel configuration and sysctl values were compared, but no significant differences have been found.
If you are using 'Cubic' TCP congestion control, then please try something different.
It was broken last I checked, at least when used with the ath10k driver.
https://marc.info/?l=linux-netdev&m=144405216005715&w=2
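If changing the system-wide default is inconvenient, the algorithm can also be chosen per socket with the TCP_CONGESTION socket option. A minimal sketch, assuming a Linux host; set_congestion_control() is a hypothetical helper name, and "reno" is only one example of an algorithm that is normally built in:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: select congestion control `name` on one TCP socket
 * and read back what the kernel actually applied into `out`.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
static int set_congestion_control(int fd, const char *name,
				  char *out, socklen_t out_len)
{
	assert(name != NULL && out != NULL);

	if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
		       name, (socklen_t)strlen(name)) < 0)
		return -1;
	if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, out, &out_len) < 0)
		return -1;
	return 0;
}
```

Calling this on the sending socket before connect() and re-running the iperf-style test would show whether the slow ramp-up follows the congestion control algorithm or something else, such as the driver.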
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* Re: [PATCH RFT 2/2] macb: kill PHY reset code
From: Nicolas Ferre @ 2016-04-12 14:57 UTC (permalink / raw)
To: Sergei Shtylyov, Andrew Lunn; +Cc: netdev, linux-kernel, Gregory CLEMENT
In-Reply-To: <570CFE13.3040100@cogentembedded.com>
Le 12/04/2016 15:54, Sergei Shtylyov a écrit :
> Hello.
>
> On 4/12/2016 12:22 PM, Nicolas Ferre wrote:
>
>>>> With the 'phylib' now being aware of the "reset-gpios" PHY node property,
>>>> there should be no need to frob the PHY reset in this driver anymore...
>>>>
>>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>>>>
>>>> ---
>>>> drivers/net/ethernet/cadence/macb.c | 17 -----------------
>>>> drivers/net/ethernet/cadence/macb.h | 1 -
>>>> 2 files changed, 18 deletions(-)
>>>>
>>>> Index: net-next/drivers/net/ethernet/cadence/macb.c
>>>> ===================================================================
>>>> --- net-next.orig/drivers/net/ethernet/cadence/macb.c
>>>> +++ net-next/drivers/net/ethernet/cadence/macb.c
> [...]
>>>> @@ -2977,18 +2976,6 @@ static int macb_probe(struct platform_de
>>>> else
>>>> macb_get_hwaddr(bp);
>>>>
>>>> - /* Power up the PHY if there is a GPIO reset */
>>>> - phy_node = of_get_next_available_child(np, NULL);
>>>> - if (phy_node) {
>>>> - int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
>>>> -
>>>> - if (gpio_is_valid(gpio)) {
>>>> - bp->reset_gpio = gpio_to_desc(gpio);
>>>> - gpiod_direction_output(bp->reset_gpio, 1);
>>>
>>> Hi Sergei
>>>
>>> The code you are deleting would have ignored the flags in the gpio
>>
>> I don't parse this.
>
>> The code deleted does take the flag into account.
>
> Not really -- you need to call of_get_named_gpio_flags() (with a valid
> last argument) for that.
Yep,
>> And the DT property
>> associated to it seems correct to me (I mean, with proper flag
>> specification).
>
> It apparently is not as it has GPIO_ACTIVE_HIGH and the driver assumes
> active-low reset signal.
Yes, the logic was inverted and... anyway, the flag was never really used...
Thanks Sergei.
No problem for me accepting a patch for the at91-vinco.dts.
Bye,
--
Nicolas Ferre
^ permalink raw reply
* Re: TCP reaching to maximum throughput after a long time
From: Eric Dumazet @ 2016-04-12 14:52 UTC (permalink / raw)
To: Machani, Yaniv, netdev
Cc: David S. Miller, Eric Dumazet, Neal Cardwell, Yuchung Cheng,
Nandita Dukkipati, open list, Kama, Meirav
In-Reply-To: <AE1C82FB3D0EC64DB1F752C81CBD110139100057@DFRE01.ent.ti.com>
On Tue, 2016-04-12 at 12:17 +0000, Machani, Yaniv wrote:
> Hi,
> After updating from Kernel 3.14 to Kernel 4.4 we have seen a TCP performance degradation over Wi-Fi.
> In 3.14 kernel, TCP got to its max throughput after less than a second, while in the 4.4 it is taking ~20-30 seconds.
> UDP TX/RX and TCP RX performance is as expected.
> We are using a Beagle Bone Black and a WiLink8 device.
>
> Were there any related changes that might cause such behavior ?
> Kernel configuration and sysctl values were compared, but no significant differences have been found.
>
> See a log of the behavior below :
> -----------------------------------------------------------
> Client connecting to 10.2.46.5, TCP port 5001
> TCP window size: 320 KByte (WARNING: requested 256 KByte)
> ------------------------------------------------------------
> [ 3] local 10.2.46.6 port 49282 connected with 10.2.46.5 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0- 1.0 sec 5.75 MBytes 48.2 Mbits/sec
> [ 3] 1.0- 2.0 sec 6.50 MBytes 54.5 Mbits/sec
> [ 3] 2.0- 3.0 sec 6.50 MBytes 54.5 Mbits/sec
> [ 3] 3.0- 4.0 sec 6.50 MBytes 54.5 Mbits/sec
> [ 3] 4.0- 5.0 sec 6.75 MBytes 56.6 Mbits/sec
> [ 3] 5.0- 6.0 sec 3.38 MBytes 28.3 Mbits/sec
> [ 3] 6.0- 7.0 sec 6.38 MBytes 53.5 Mbits/sec
> [ 3] 7.0- 8.0 sec 6.88 MBytes 57.7 Mbits/sec
> [ 3] 8.0- 9.0 sec 7.12 MBytes 59.8 Mbits/sec
> [ 3] 9.0-10.0 sec 7.12 MBytes 59.8 Mbits/sec
> [ 3] 10.0-11.0 sec 7.12 MBytes 59.8 Mbits/sec
> [ 3] 11.0-12.0 sec 7.25 MBytes 60.8 Mbits/sec
> [ 3] 12.0-13.0 sec 7.12 MBytes 59.8 Mbits/sec
> [ 3] 13.0-14.0 sec 7.25 MBytes 60.8 Mbits/sec
> [ 3] 14.0-15.0 sec 7.62 MBytes 64.0 Mbits/sec
> [ 3] 15.0-16.0 sec 7.88 MBytes 66.1 Mbits/sec
> [ 3] 16.0-17.0 sec 8.12 MBytes 68.2 Mbits/sec
> [ 3] 17.0-18.0 sec 8.25 MBytes 69.2 Mbits/sec
> [ 3] 18.0-19.0 sec 8.50 MBytes 71.3 Mbits/sec
> [ 3] 19.0-20.0 sec 8.88 MBytes 74.4 Mbits/sec
> [ 3] 20.0-21.0 sec 8.75 MBytes 73.4 Mbits/sec
> [ 3] 21.0-22.0 sec 8.62 MBytes 72.4 Mbits/sec
> [ 3] 22.0-23.0 sec 8.75 MBytes 73.4 Mbits/sec
> [ 3] 23.0-24.0 sec 8.50 MBytes 71.3 Mbits/sec
> [ 3] 24.0-25.0 sec 8.62 MBytes 72.4 Mbits/sec
> [ 3] 25.0-26.0 sec 8.62 MBytes 72.4 Mbits/sec
> [ 3] 26.0-27.0 sec 8.62 MBytes 72.4 Mbits/sec
>
CC netdev, where this is better discussed.
This could be a lot of different factors, and caused by a sender
problem, a receiver problem, ...
TCP behavior depends on the drivers, so maybe a change there can explain
this.
You could capture the first 5000 frames of the flow and post the pcap ?
(-s 128 to capture only the headers)
tcpdump -p -s 128 -i eth0 -c 5000 host 10.2.46.5 -w flow.pcap
Also, while the test is running, you could fetch
ss -temoi dst 10.2.46.5:5001
^ permalink raw reply
* VLAN aux info for AF_PACKET available only with ETH_P_ALL
From: Peter Palúch @ 2016-04-12 14:40 UTC (permalink / raw)
To: netdev
Greetings,
I am running vanilla Linux kernel v4.4.6.
When using AF_PACKET sockets with PACKET_AUXDATA socket option to access
the VLAN TCI information of received frames, I have noticed that the
VLAN information in struct tpacket_auxdata, namely,
- tp_vlan_tci
- tp_vlan_tpid
- TP_STATUS_VLAN_VALID and TP_STATUS_VLAN_TPID_VALID flags
is filled in only when the socket is bound to htons (ETH_P_ALL). If the
socket is bound to any specific protocol, the VLAN information fields in
struct tpacket_auxdata are set to 0 even if the datagram of the specific
protocol was received in an 802.1Q-tagged frame.
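For reference, the receive side being described looks roughly like the following from userspace. This is a minimal sketch, assuming PACKET_AUXDATA has already been enabled with setsockopt() and that msg was filled in by recvmsg(); vlan_tci_from_auxdata() is a hypothetical helper name, and the SOL_PACKET fallback value is an assumption for libcs that do not define it:

```c
#include <assert.h>
#include <linux/if_packet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263	/* assumed value, normally provided by libc */
#endif

/* Hypothetical helper: scan the control messages returned by recvmsg() on
 * an AF_PACKET socket.  Returns 1 and stores the TCI if the kernel reported
 * a valid VLAN tag, 0 otherwise. */
static int vlan_tci_from_auxdata(struct msghdr *msg, uint16_t *tci)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && tci != NULL);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		struct tpacket_auxdata aux;

		if (cmsg->cmsg_level != SOL_PACKET ||
		    cmsg->cmsg_type != PACKET_AUXDATA ||
		    cmsg->cmsg_len < CMSG_LEN(sizeof(aux)))
			continue;

		/* Copy out to avoid relying on the alignment of the buffer. */
		memcpy(&aux, CMSG_DATA(cmsg), sizeof(aux));
		if (aux.tp_status & TP_STATUS_VLAN_VALID) {
			*tci = aux.tp_vlan_tci;
			return 1;
		}
	}
	return 0;
}
```

With the socket bound to htons(ETH_P_ALL) this returns 1 for tagged frames; with a specific protocol the same code returns 0 because TP_STATUS_VLAN_VALID is not set. The VLAN ID itself is the low 12 bits of the TCI (tci & 0x0fff).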
Is this behavior intentional? If not, I would be honored to try to
provide a patch but I am not well-versed in kernel internals so any
guidance would be most appreciated.
Thanks!
Best regards,
Peter
^ permalink raw reply
* Re: [PATCH RFT 2/2] macb: kill PHY reset code
From: Nicolas Ferre @ 2016-04-12 14:45 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Sergei Shtylyov, netdev, linux-kernel, Gregory CLEMENT
In-Reply-To: <20160412134001.GB29895@lunn.ch>
Le 12/04/2016 15:40, Andrew Lunn a écrit :
> On Tue, Apr 12, 2016 at 11:22:10AM +0200, Nicolas Ferre wrote:
>> Le 11/04/2016 04:28, Andrew Lunn a écrit :
>>> On Sat, Apr 09, 2016 at 01:25:03AM +0300, Sergei Shtylyov wrote:
>>>> With the 'phylib' now being aware of the "reset-gpios" PHY node property,
>>>> there should be no need to frob the PHY reset in this driver anymore...
>>>>
>>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>>>>
>>>> ---
>>>> drivers/net/ethernet/cadence/macb.c | 17 -----------------
>>>> drivers/net/ethernet/cadence/macb.h | 1 -
>>>> 2 files changed, 18 deletions(-)
>>>>
>>>> Index: net-next/drivers/net/ethernet/cadence/macb.c
>>>> ===================================================================
>>>> --- net-next.orig/drivers/net/ethernet/cadence/macb.c
>>>> +++ net-next/drivers/net/ethernet/cadence/macb.c
>>>> @@ -2884,7 +2884,6 @@ static int macb_probe(struct platform_de
>>>> = macb_clk_init;
>>>> int (*init)(struct platform_device *) = macb_init;
>>>> struct device_node *np = pdev->dev.of_node;
>>>> - struct device_node *phy_node;
>>>> const struct macb_config *macb_config = NULL;
>>>> struct clk *pclk, *hclk = NULL, *tx_clk = NULL;
>>>> unsigned int queue_mask, num_queues;
>>>> @@ -2977,18 +2976,6 @@ static int macb_probe(struct platform_de
>>>> else
>>>> macb_get_hwaddr(bp);
>>>>
>>>> - /* Power up the PHY if there is a GPIO reset */
>>>> - phy_node = of_get_next_available_child(np, NULL);
>>>> - if (phy_node) {
>>>> - int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
>>>> -
>>>> - if (gpio_is_valid(gpio)) {
>>>> - bp->reset_gpio = gpio_to_desc(gpio);
>>>> - gpiod_direction_output(bp->reset_gpio, 1);
>>>
>>> Hi Sergei
>>>
>>> The code you are deleting would have ignored the flags in the gpio
>> I don't parse this.
>>
>> The code deleted does take the flag into account. And the DT property
>> associated to it seems correct to me (I mean, with proper flag
>> specification).
>
> Hi Nicolas
>
> of_get_named_gpio() does not do anything with the flags. So for
> example,
>
> gpios = <&gpio0 12 GPIO_ACTIVE_LOW>;
>
> the GPIO_ACTIVE_LOW would be ignored. If you want the flags to be
> respected, you need to use the gpiod API for all calls, in particular,
> you need to use something which calls gpiod_get_index(), since that is
> the only function to call gpiod_parse_flags() to translate
> GPIO_ACTIVE_LOW into a flag within the gpio descriptor.
OK, I remember what confused me now: this code used to be something like:
devm_gpiod_get_optional(&bp->pdev->dev, "phy-reset", GPIOD_OUT_HIGH);
before it was changed to the chunk above... So, yes, the DT flag
was not handled anyway...
Sorry for the noise and thanks for the clarification.
Bye,
--
Nicolas Ferre
^ permalink raw reply
* [PATCH net,stable] cdc_mbim: apply "NDP to end" quirk to all Huawei devices
From: Bjørn Mork @ 2016-04-12 14:11 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA
Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, Andreas Fett, Andrei Burd,
Bjørn Mork
We now have a positive report of another Huawei device needing
this quirk: The ME906s-158 (12d1:15c1). This is an m.2 form
factor modem with no obvious relationship to the E3372 (12d1:157d)
we already have a quirk entry for. This is reason enough to
believe the quirk might be necessary for any number of current
and future Huawei devices.
Apply the quirk to all Huawei devices, since it is crucial
to any device affected by the firmware bug, while the impact
on non-affected devices is negligible.
The quirk can if necessary be disabled per-device by writing
N to /sys/class/net/<iface>/cdc_ncm/ndp_to_end
Reported-by: Andreas Fett <andreas.fett-opNxpl+3fjRBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
---
I'm requesting this for stable, but it depends on commit
f8c0cfa5eca9 ("net: cdc_mbim: add "NDP to end" quirk for Huawei E3372")
so it is only applicable to v4.3 (where the dependency is
backported), v4.4 and v4.5
Bjørn
drivers/net/usb/cdc_mbim.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/cdc_mbim.c b/drivers/net/usb/cdc_mbim.c
index bdd83d95ec0a..96a5028621c8 100644
--- a/drivers/net/usb/cdc_mbim.c
+++ b/drivers/net/usb/cdc_mbim.c
@@ -617,8 +617,13 @@ static const struct usb_device_id mbim_devs[] = {
{ USB_VENDOR_AND_INTERFACE_INFO(0x0bdb, USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long)&cdc_mbim_info,
},
- /* Huawei E3372 fails unless NDP comes after the IP packets */
- { USB_DEVICE_AND_INTERFACE_INFO(0x12d1, 0x157d, USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
+
+ /* Some Huawei devices, ME906s-158 (12d1:15c1) and E3372
+ * (12d1:157d), are known to fail unless the NDP is placed
+ * after the IP packets. Applying the quirk to all Huawei
+ * devices is broader than necessary, but harmless.
+ */
+ { USB_VENDOR_AND_INTERFACE_INFO(0x12d1, USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long)&cdc_mbim_info_ndp_to_end,
},
/* default entry */
--
2.1.4
^ permalink raw reply related
* Re: [PATCH] mwifiex: fix possible NULL dereference
From: Andy Shevchenko @ 2016-04-12 14:08 UTC (permalink / raw)
To: Sudip Mukherjee
Cc: Amitkumar Karwar, Nishant Sarmukadam, Kalle Valo,
linux-kernel@vger.kernel.org, open list:TI WILINK WIRELES...,
netdev, Sudip Mukherjee
In-Reply-To: <1460388459-21090-1-git-send-email-sudipm.mukherjee@gmail.com>
On Mon, Apr 11, 2016 at 6:27 PM, Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
> From: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
>
> We have a check for card just after dereferencing it. So if it is NULL
> we have already dereferenced it before its check. Let's dereference it
> after checking card for NULL.
IIUC, no real dereference happens here.
I would have NAKed this if I were the maintainer.
>
> Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> ---
> drivers/net/wireless/marvell/mwifiex/pcie.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index edf8b07..84562d0 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -2884,10 +2884,11 @@ static void mwifiex_unregister_dev(struct mwifiex_adapter *adapter)
> {
> struct pcie_service_card *card = adapter->card;
Let's say it's 0.
> const struct mwifiex_pcie_card_reg *reg;
> - struct pci_dev *pdev = card->dev;
This would be equal to the offset of the dev member in struct pcie_service_card.
Nothing wrong here.
> + struct pci_dev *pdev;
> int i;
>
> if (card) {
> + pdev = card->dev;
> if (card->msix_enable) {
> for (i = 0; i < MWIFIEX_NUM_MSIX_VECTORS; i++)
> synchronize_irq(card->msix_entries[i].vector);
> --
> 1.9.1
>
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: Unhandled fault during system suspend in sky2_shutdown
From: Sudeep Holla @ 2016-04-12 14:06 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Sudeep Holla, linux-kernel, netdev, Mirko Lindner
In-Reply-To: <20160411112430.2a03e296@xeon-e3>
On 11/04/16 19:24, Stephen Hemminger wrote:
> On Mon, 11 Apr 2016 17:24:37 +0100
> Sudeep Holla <sudeep.holla@arm.com> wrote:
>
[...]
>>
>> diff --git i/drivers/net/ethernet/marvell/sky2.c
>> w/drivers/net/ethernet/marvell/sky2.c
>> index ec0a22119e09..0ff0434e32fc 100644
>> --- i/drivers/net/ethernet/marvell/sky2.c
>> +++ w/drivers/net/ethernet/marvell/sky2.c
>> @@ -5220,6 +5220,13 @@ static SIMPLE_DEV_PM_OPS(sky2_pm_ops,
>> sky2_suspend, sky2_resume);
>>
>> static void sky2_shutdown(struct pci_dev *pdev)
>> {
>> + struct sky2_hw *hw = pci_get_drvdata(pdev);
>> + int i;
>> +
>> + for (i = hw->ports - 1; i >= 0; --i) {
>> + sky2_detach(hw->dev[i]);
>> + unregister_netdev(hw->dev[i]);
>> + }
>> sky2_suspend(&pdev->dev);
>> pci_wake_from_d3(pdev, device_may_wakeup(&pdev->dev));
>> pci_set_power_state(pdev, PCI_D3hot);
>
> This is not the correct fix, the device is supposed to stay
> registered. The correct way to fix this would be to make get_stats
> ignore requests for device when suspended.
Yes, I agree it's not the correct fix. I tried ignoring the request in
get_stats32(), but the crash just moves elsewhere. IMO patching every
place to check the suspended state might be a bit heavy?
E.g. with something like the patch below, the crash moves to the
sky2_get_eeprom_len() function.
sky2_get_eeprom_len+0x10/0x30
dev_ethtool+0x29c/0x1d78
dev_ioctl+0x31c/0x5a8
sock_ioctl+0x2ac/0x310
do_vfs_ioctl+0xa4/0x750
SyS_ioctl+0x8c/0xa0
el0_svc_naked+0x24/0x2
Sorry if I am missing something fundamental; I am a bit new to net drivers.
Regards,
Sudeep
-->8
diff --git i/drivers/net/ethernet/marvell/sky2.c
w/drivers/net/ethernet/marvell/sky2.c
index ec0a22119e09..d4cfcd89e7e5 100644
--- i/drivers/net/ethernet/marvell/sky2.c
+++ w/drivers/net/ethernet/marvell/sky2.c
@@ -5175,6 +5175,7 @@ static int sky2_suspend(struct device *dev)
}
sky2_power_aux(hw);
+ hw->suspended = true;
rtnl_unlock();
return 0;
@@ -5198,6 +5199,7 @@ static int sky2_resume(struct device *dev)
}
rtnl_lock();
+ hw->suspended = false;
sky2_reset(hw);
sky2_all_up(hw);
rtnl_unlock();
diff --git i/drivers/net/ethernet/marvell/sky2.h
w/drivers/net/ethernet/marvell/sky2.h
index ec6dcd80152b..1386e5b635ff 100644
--- i/drivers/net/ethernet/marvell/sky2.h
+++ w/drivers/net/ethernet/marvell/sky2.h
@@ -2308,6 +2308,7 @@ struct sky2_hw {
wait_queue_head_t msi_wait;
+ bool suspended;
char irq_name[0];
};
static inline int sky2_is_copper(const struct sky2_hw *hw)
@@ -2378,6 +2379,9 @@ static inline u32 get_stats32(struct sky2_hw *hw,
unsigned port, unsigned reg)
{
u32 val;
+ if (hw->suspended)
+ return 0;
+
do {
val = gma_read32(hw, port, reg);
} while (gma_read32(hw, port, reg) != val);
^ permalink raw reply related
* Re: AF_VSOCK status
From: Stefan Hajnoczi @ 2016-04-12 14:03 UTC (permalink / raw)
To: Antoine Martin; +Cc: netdev
In-Reply-To: <570B9021.5050709@nagafix.co.uk>
On Mon, Apr 11, 2016 at 06:53:05PM +0700, Antoine Martin wrote:
> > There are a few existing ways to achieve that without involving
> > virtio-vsock: vhost-user or ivshmem.
> Yes, I've looked at those and they seem a bit overkill for what we
> want to achieve. We don't want sharing with multiple guests, or
> interrupts.
> All we want is a chunk of host memory to be accessible from the guest.
ivshmem does that. It operates in two modes; it sounds like you want
the first and simpler mode ("ivshmem-plain").
Take a look at hw/misc/ivshmem.c in QEMU.
Stefan
^ permalink raw reply
* Re: [RFC v5 0/5] Add virtio transport for AF_VSOCK
From: Stefan Hajnoczi @ 2016-04-12 13:59 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Stefan Hajnoczi, Ian Campbell, kvm, netdev, Matt Benjamin,
Christoffer Dall, Alex Bennée, marius vlad, areis,
Claudio Imbrenda, Greg Kurz, virtualization
In-Reply-To: <20160411154517-mutt-send-email-mst@redhat.com>
On Mon, Apr 11, 2016 at 03:54:08PM +0300, Michael S. Tsirkin wrote:
> On Mon, Apr 11, 2016 at 11:45:48AM +0100, Stefan Hajnoczi wrote:
> > On Fri, Apr 08, 2016 at 04:35:05PM +0100, Ian Campbell wrote:
> > > On Fri, 2016-04-01 at 15:23 +0100, Stefan Hajnoczi wrote:
> > > > This series is based on Michael Tsirkin's vhost branch (v4.5-rc6).
> > > >
> > > > I'm about to process Claudio Imbrenda's locking fixes for virtio-vsock but
> > > > first I want to share the latest version of the code. Several people are
> > > > playing with vsock now so sharing the latest code should avoid duplicate work.
> > >
> > > Thanks for this, I've been using it in my project and it mostly seems
> > > fine.
> > >
> > > One wrinkle I came across, which I'm not sure if it is by design or a
> > > problem is that I can see this sequence coming from the guest (with
> > > other activity in between):
> > >
> > > 1) OP_SHUTDOWN w/ flags == SHUTDOWN_RX
> > > 2) OP_SHUTDOWN w/ flags == SHUTDOWN_TX
> > > 3) OP_SHUTDOWN w/ flags == SHUTDOWN_TX|SHUTDOWN_RX
How did you trigger this sequence? I'd like to reproduce it.
> > > I originally had my backend close things down at #2, however this meant
> > > that when #3 arrived it was for a non-existent socket (or, worse, an
> > > active one if the ports got reused). I checked v5 of the spec
> > > proposal[0] which says:
> > > If these bits are set and there are no more virtqueue buffers
> > > pending the socket is disconnected.
> > >
> > > but I'm not entirely sure if this behaviour contradicts this or not
> > > (the bits have both been set at #2, but not at the same time).
> > >
> > > BTW, how does one tell if there are no more virtqueue buffers pending
> > > or not while processing the op?
> >
> > #2 is odd. The shutdown bits are sticky so they cannot be cleared once
> > set. I would have expected just #1 and #3. The behavior you observe
> > looks like a bug.
> >
> > The spec text does not convey the meaning of OP_SHUTDOWN well.
> > OP_SHUTDOWN SHUTDOWN_TX|SHUTDOWN_RX means no further rx/tx is possible
> > for this connection. "there are no more virtqueue buffers pending the
> > socket" really means that this isn't an immediate close from the
> > perspective of the application. If the application still has unread rx
> > buffers then the socket stays readable until the rx data has been fully
> > read.
>
> Yes but you also wrote:
> If these bits are set and there are no more virtqueue buffers
> pending the socket is disconnected.
>
> how does remote know that there are no buffers pending and so it's safe
> to reuse the same source/destination address now? Maybe destination
> should send RST at that point?
You are right, the source/destination address could be reused while the
remote still has the connection in their table. Connection
establishment would fail with a RST reply.
I can think of two solutions:
1. Implementations must remove connections from their table as soon as
SHUTDOWN_TX|SHUTDOWN_RX is received. This way the source/destination
address tuple can be reused immediately, i.e. new connections with
the same source/destination would be possible while an application is
still draining the receive buffers of an old connection.
2. Extend the connection lifecycle so that an A->B
SHUTDOWN_TX|SHUTDOWN_RX must be followed by a B->A RST to close
a connection. This way the source/destination address is only in use
once at a time.
Option #2 seems safer because there is no overlap in source/destination
address usage.
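The lifecycle in option #2 can be sketched as a small state machine. This is only an illustration under stated assumptions: the type and function names (conn_state, conn_send_shutdown, conn_recv_rst, conn_tuple_reusable) are hypothetical, and the flag bit values are assumed rather than taken from the spec proposal.

```c
#include <stdbool.h>

/* Assumed bit values, for illustration only. */
enum { SHUTDOWN_RX = 1, SHUTDOWN_TX = 2 };

/* Hypothetical per-connection states as seen by side A. */
enum conn_state {
    CONN_ESTABLISHED,    /* normal data transfer */
    CONN_SHUTDOWN_SENT,  /* A sent TX|RX, waiting for B's RST */
    CONN_CLOSED          /* RST received, address tuple is free */
};

struct conn {
    enum conn_state state;
    unsigned int shutdown_flags;  /* sticky: bits are set, never cleared */
};

/* Side A sends OP_SHUTDOWN carrying the given flags. */
static void conn_send_shutdown(struct conn *c, unsigned int flags)
{
    c->shutdown_flags |= flags;
    if (c->state == CONN_ESTABLISHED &&
        c->shutdown_flags == (SHUTDOWN_RX | SHUTDOWN_TX))
        c->state = CONN_SHUTDOWN_SENT;
}

/* Side A receives the B->A RST that completes the close. */
static void conn_recv_rst(struct conn *c)
{
    c->state = CONN_CLOSED;
}

/* The source/destination tuple may be reused only after the RST. */
static bool conn_tuple_reusable(const struct conn *c)
{
    return c->state == CONN_CLOSED;
}
```

In this sketch, setting both shutdown bits does not by itself free the tuple; only the peer's RST does, which is what removes the overlap in address usage.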
^ permalink raw reply
* Re: [RFC PATCH 0/2] selinux: avoid nf hooks overhead when not needed
From: Casey Schaufler @ 2016-04-12 13:57 UTC (permalink / raw)
To: Paolo Abeni, Paul Moore
Cc: Florian Westphal, linux-security-module, David S. Miller,
James Morris, Andreas Gruenbacher, Stephen Smalley, netdev,
selinux, Casey Schaufler
In-Reply-To: <1460451162.5965.16.camel@redhat.com>
On 4/12/2016 1:52 AM, Paolo Abeni wrote:
> On Thu, 2016-04-07 at 14:55 -0400, Paul Moore wrote:
>> On Thursday, April 07, 2016 01:45:32 AM Florian Westphal wrote:
>>> Paul Moore <paul@paul-moore.com> wrote:
>>>> On Wed, Apr 6, 2016 at 6:14 PM, Florian Westphal <fw@strlen.de> wrote:
>>>>> netfilter hooks are per namespace -- so there is hook unregister when
>>>>> netns is destroyed.
>>>> Looking around, I see the global and per-namespace registration
>>>> functions (nf_register_hook and nf_register_net_hook, respectively),
>>>> but I'm looking to see if/how newly created namespace inherit
>>>> netfilter hooks from the init network namespace ... if you can create
>>>> a network namespace and dodge the SELinux hooks, that isn't a good
>>>> thing from a SELinux point of view, although it might be a plus
>>>> depending on where you view Paolo's original patches ;)
>>> Heh :-)
>>>
>>> If you use nf_register_net_hook, the hook is only registered in the
>>> namespace.
>>>
>>> If you use nf_register_hook, the hook is put on a global list and
>>> registered in all existing namespaces.
>>>
>>> New namespaces will have the hook added as well (see
>>> netfilter_net_init -> nf_register_hook_list in netfilter/core.c )
>>>
>>> Since nf_register_hook is used it should be impossible to get a netns
>>> that doesn't call these hooks.
>> Great, thanks.
>>
>>>>> Do you think it makes sense to rework the patch to delay registering
>>>>> of the netfilter hooks until the system is in a state where they're
>>>>> needed, without the 'unregister' aspect?
>>>> I would need to see the patch to say for certain, but in principle
>>>> that seems perfectly reasonable and I think would satisfy both the
>>>> netdev and SELinux camps - good suggestion. My main goal is to drop
>>>> the selinux_nf_ip_init() entirely so it can't be used as a ROP gadget.
>>>>
>>>> We might even be able to trim the secmark_active and peerlbl_active
>>>> checks in the SELinux netfilter hooks (an earlier attempt at
>>>> optimization; contrary to popular belief, I do care about SELinux
>>>> performance), although that would mean that enabling the network
>>>> access controls would be one way ... I guess you can disregard that
>>>> last bit, I'm thinking aloud again.
>>> One way is fine I think.
>> Yes, just disregard my second paragraph above.
>>
>>>>> Ideally this would even be per netns -- in perfect world we would
>>>>> be able to make it so that a new netns are created with an empty
>>>>> hook list.
>>>> In general SELinux doesn't care about namespaces, for reasons that are
>>>> sorta beyond the scope of this conversation, so I would like to stick
>>>> to a all or nothing approach to enabling the SELinux netfilter hooks
>>>> across namespaces. Perhaps we can revisit this at a later time, but
>>>> let's keep it simple right now.
>>> Okay, I'd prefer to stick to your recommendation anyway wrt. to selinux
>>> (Casey, I read your comment regarding smack. Noted, we don't want to
>>> break smack either...)
>>>
>>> I think that in this case the entire question is:
>>>
>>> In your experience, how likely is a config where selinux is enabled BUT the
>>> hooks are not needed (i.e., where we hit the
>>>
>>> if (!selinux_policycap_netpeer)
>>> return NF_ACCEPT;
>>>
>>> if (!secmark_active && !peerlbl_active)
>>> return NF_ACCEPT;
>>>
>>> tests inside the hooks)? If such setups are uncommon we should just
>>> drop this idea or at least put it on the back burner until the more
>>> expensive netfilter hooks (conntrack, cough) are out of the way.
>> A few years ago I would have said that it is relatively uncommon for admins to
>> enable the SELinux network access controls; it was typically just
>> government/intelligence agencies who had very strict access control
>> requirements and represented a small portion of SELinux users. However, over
>> the past few years I've been fielding more and more questions from admins/devs
>> in the virtualization space who are interested in some of these capabilities;
>> it isn't clear to me how many of these people are switching it on, but there
>> is definitely more interest than I have seen in the past and the interest is
>> centered around some rather common use cases.
>>
>> So, to summarize, I don't know ;)
>>
>> If you've got bigger sources of overhead, my opinion would be to go tackle
>> those first. Perhaps I can even find the time to work on the
>> SELinux/netfilter stuff while you are off slaying the bigger dragons, no
>> promises at the moment.
> Double checking if I got the above correctly.
>
> Will it be OK if we post a v2 version of this series, removing the hooks
> de-registration bits, but preserving the selinux nf-hooks and
> socket_sock_rcv_skb() on-demand/delayed registration ?
Imagine that I have two security modules that control sockets.
The work I'm knee deep in will allow this. If adding hooks after
the init phase is allowed you have to face the possibility that
blob sizes (in this case sock->sk_security) may change. That
requires checking on every hook that uses blobs to determine
whether the blob has data for all the modules using it. I know
that that is a simple matter of arithmetic, but it's additional
overhead on every hook call. It also makes creating kmem caches
for security blobs much more difficult. Another performance
optimization that becomes unavailable.
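The per-hook check described above can be sketched roughly as follows. This is a hypothetical illustration, not code from any security module: the structure names, the offset/size layout, and blob_has_slot() are assumptions made only to show the arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: the region of a shared blob owned by one security module. */
struct blob_slot {
    size_t offset;  /* start of this module's data within the blob */
    size_t size;    /* number of bytes this module needs */
};

/* Hypothetical: bookkeeping for one allocated blob (e.g. for a socket). */
struct blob {
    size_t allocated;  /* bytes allocated when the object was created */
};

/*
 * If modules can register after init, an object created earlier may have
 * a blob that is too small for a late module's slot, so every hook would
 * have to run a check like this before touching its data.
 */
static bool blob_has_slot(const struct blob *b, const struct blob_slot *s)
{
    return s->offset + s->size <= b->allocated;
}
```

With a blob size fixed at init time, this check is always true and can be dropped, and a single fixed-size kmem cache can back all blobs.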
We know of a number of ways we can improve networking performance
in the face of security modules. Many would make the code a whole
lot cleaner. Your proposed change is clever, but targets one case
at the expense of the general case. If there really is concern
about the performance of networking in the presence of security
modules I would suggest that we revisit some of the changes that
have already been proposed.
> Will that fit
> with the post-init read only memory usage that you are planning ?
>
> Regards,
>
> Paolo
^ permalink raw reply
* Re: [PATCH RFT 2/2] macb: kill PHY reset code
From: Sergei Shtylyov @ 2016-04-12 13:54 UTC (permalink / raw)
To: Nicolas Ferre, Andrew Lunn; +Cc: netdev, linux-kernel
In-Reply-To: <570CBE42.50309@atmel.com>
Hello.
On 4/12/2016 12:22 PM, Nicolas Ferre wrote:
>>> With the 'phylib' now being aware of the "reset-gpios" PHY node property,
>>> there should be no need to frob the PHY reset in this driver anymore...
>>>
>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>>>
>>> ---
>>> drivers/net/ethernet/cadence/macb.c | 17 -----------------
>>> drivers/net/ethernet/cadence/macb.h | 1 -
>>> 2 files changed, 18 deletions(-)
>>>
>>> Index: net-next/drivers/net/ethernet/cadence/macb.c
>>> ===================================================================
>>> --- net-next.orig/drivers/net/ethernet/cadence/macb.c
>>> +++ net-next/drivers/net/ethernet/cadence/macb.c
[...]
>>> @@ -2977,18 +2976,6 @@ static int macb_probe(struct platform_de
>>> else
>>> macb_get_hwaddr(bp);
>>>
>>> - /* Power up the PHY if there is a GPIO reset */
>>> - phy_node = of_get_next_available_child(np, NULL);
>>> - if (phy_node) {
>>> - int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
>>> -
>>> - if (gpio_is_valid(gpio)) {
>>> - bp->reset_gpio = gpio_to_desc(gpio);
>>> - gpiod_direction_output(bp->reset_gpio, 1);
>>
>> Hi Sergei
>>
>> The code you are deleting would have ignored the flags in the gpio
>
> I don't parse this.
> The code deleted does take the flag into account.
Not really -- you need to call of_get_named_gpio_flags() (with a valid
last argument) for that.
> And the DT property
> associated to it seems correct to me (I mean, with proper flag
> specification).
It apparently is not, as it has GPIO_ACTIVE_HIGH while the driver assumes
an active-low reset signal.
[...]
> Bye,
MBR, Sergei
^ permalink raw reply
* Re: [PATCH RFT 2/2] macb: kill PHY reset code
From: Andrew Lunn @ 2016-04-12 13:40 UTC (permalink / raw)
To: Nicolas Ferre; +Cc: Sergei Shtylyov, netdev, linux-kernel
In-Reply-To: <570CBE42.50309@atmel.com>
On Tue, Apr 12, 2016 at 11:22:10AM +0200, Nicolas Ferre wrote:
> On 11/04/2016 04:28, Andrew Lunn wrote:
> > On Sat, Apr 09, 2016 at 01:25:03AM +0300, Sergei Shtylyov wrote:
> >> With the 'phylib' now being aware of the "reset-gpios" PHY node property,
> >> there should be no need to frob the PHY reset in this driver anymore...
> >>
> >> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> >>
> >> ---
> >> drivers/net/ethernet/cadence/macb.c | 17 -----------------
> >> drivers/net/ethernet/cadence/macb.h | 1 -
> >> 2 files changed, 18 deletions(-)
> >>
> >> Index: net-next/drivers/net/ethernet/cadence/macb.c
> >> ===================================================================
> >> --- net-next.orig/drivers/net/ethernet/cadence/macb.c
> >> +++ net-next/drivers/net/ethernet/cadence/macb.c
> >> @@ -2884,7 +2884,6 @@ static int macb_probe(struct platform_de
> >> = macb_clk_init;
> >> int (*init)(struct platform_device *) = macb_init;
> >> struct device_node *np = pdev->dev.of_node;
> >> - struct device_node *phy_node;
> >> const struct macb_config *macb_config = NULL;
> >> struct clk *pclk, *hclk = NULL, *tx_clk = NULL;
> >> unsigned int queue_mask, num_queues;
> >> @@ -2977,18 +2976,6 @@ static int macb_probe(struct platform_de
> >> else
> >> macb_get_hwaddr(bp);
> >>
> >> - /* Power up the PHY if there is a GPIO reset */
> >> - phy_node = of_get_next_available_child(np, NULL);
> >> - if (phy_node) {
> >> - int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
> >> -
> >> - if (gpio_is_valid(gpio)) {
> >> - bp->reset_gpio = gpio_to_desc(gpio);
> >> - gpiod_direction_output(bp->reset_gpio, 1);
> >
> > Hi Sergei
> >
> > The code you are deleting would have ignored the flags in the gpio
> I don't parse this.
>
> The code deleted does take the flag into account. And the DT property
> associated to it seems correct to me (I mean, with proper flag
> specification).
Hi Nicolas
of_get_named_gpio() does not do anything with the flags. So, for
example, with
    gpios = <&gpio0 12 GPIO_ACTIVE_LOW>;
the GPIO_ACTIVE_LOW would be ignored. If you want the flags to be
respected, you need to use the gpiod API for all calls, in particular,
you need to use something which calls gpiod_get_index(), since that is
the only function to call gpiod_parse_flags() to translate
GPIO_ACTIVE_LOW into a flag within the gpio descriptor.
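To illustrate why the flag matters: with the descriptor API, the value passed to gpiod_direction_output() is a logical value, and it is inverted on the wire when the descriptor is marked active-low. The sketch below is a standalone model of that translation, not kernel code; struct fake_gpio_desc and physical_level() are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a GPIO descriptor (the real one is opaque). */
struct fake_gpio_desc {
    bool active_low;  /* set only if the GPIO_ACTIVE_LOW flag was parsed */
};

/* Physical line level that results from writing a logical value. */
static int physical_level(const struct fake_gpio_desc *d, int logical)
{
    int value = !!logical;  /* normalize to 0 or 1 */

    return d->active_low ? !value : value;
}
```

A descriptor obtained without the flag being parsed behaves like active_low == false here, so writing 1 drives the line high regardless of what the device tree says.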
Andrew
^ permalink raw reply
* Re: GMII2RGMII Converter support in macb driver
From: Phil Reid @ 2016-04-12 13:37 UTC (permalink / raw)
To: Nicolas Ferre, Appana Durga Kedareswara Rao,
netdev@vger.kernel.org, Michal Simek
Cc: Punnaiah Choudary Kalluri, Harini Katakam, Anirudha Sarangi,
Appana Durga Kedareswara Rao
In-Reply-To: <570CF663.4000908@atmel.com>
On 12/04/2016 9:21 PM, Nicolas Ferre wrote:
> On 12/04/2016 15:03, Appana Durga Kedareswara Rao wrote:
>> Hi All,
>>
>> There is a Xilinx custom IP for GMII to RGMII conversion; the data sheet is here:
>> (http://www.xilinx.com/support/documentation/ip_documentation/gmii_to_rgmii/v4_0/pg160-gmii-to-rgmii.pdf)
>>
>> Unlike other PHYs, this IP does not support auto-negotiation or the
>> other features that normal PHYs usually support.
>>
>> This IP has only one register (the control register), which needs to be
>> programmed based on the speed negotiated by the external PHY.
>>
>> I am able to make it work for the GEM driver by making the changes below in
>> the driver (drivers/net/ethernet/cadence/macb.c).
>>
>> +#define XEMACPS_GMII2RGMII_FULLDPLX     BMCR_FULLDPLX
>> +#define XEMACPS_GMII2RGMII_SPEED1000    BMCR_SPEED1000
>> +#define XEMACPS_GMII2RGMII_SPEED100     BMCR_SPEED100
>> +#define XEMACPS_GMII2RGMII_REG_NUM      0x10
>> +
>>  /*
>>   * Graceful stop timeouts in us. We should allow up to
>>   * 1 frame time (10 Mbits/s, full-duplex, ignoring collisions)
>> @@ -311,8 +317,10 @@ static void macb_handle_link_change(struct net_device *dev)
>>  {
>>          struct macb *bp = netdev_priv(dev);
>>          struct phy_device *phydev = bp->phy_dev;
>> +        struct phy_device *gmii2rgmii_phydev = bp->gmii2rgmii_phy_dev;
>>          unsigned long flags;
>>          int status_change = 0;
>> +        u16 gmii2rgmii_reg = 0;
>>
>>          spin_lock_irqsave(&bp->lock, flags);
>>
>> @@ -326,15 +334,27 @@ static void macb_handle_link_change(struct net_device *dev)
>>                          if (macb_is_gem(bp))
>>                                  reg &= ~GEM_BIT(GBE);
>>
>> -                        if (phydev->duplex)
>> +                        if (phydev->duplex) {
>>                                  reg |= MACB_BIT(FD);
>> -                        if (phydev->speed == SPEED_100)
>> +                                gmii2rgmii_reg |= XEMACPS_GMII2RGMII_FULLDPLX;
>> +                        }
>> +                        if (phydev->speed == SPEED_100) {
>>                                  reg |= MACB_BIT(SPD);
>> +                                gmii2rgmii_reg |= XEMACPS_GMII2RGMII_SPEED100;
>> +                        }
>>                          if (phydev->speed == SPEED_1000 &&
>> -                            bp->caps & MACB_CAPS_GIGABIT_MODE_AVAILABLE)
>> +                            bp->caps & MACB_CAPS_GIGABIT_MODE_AVAILABLE) {
>>                                  reg |= GEM_BIT(GBE);
>> +                                gmii2rgmii_reg |= XEMACPS_GMII2RGMII_SPEED1000;
>> +                        }
>>
>>                          macb_or_gem_writel(bp, NCFGR, reg);
>> +                        if (gmii2rgmii_phydev != NULL) {
>> +                                macb_mdio_write(bp->mii_bus,
>> +                                                gmii2rgmii_phydev->addr,
>> +                                                XEMACPS_GMII2RGMII_REG_NUM,
>> +                                                gmii2rgmii_reg);
>> +                        }
>>
>>                          bp->speed = phydev->speed;
>>                          bp->duplex = phydev->duplex;
>> @@ -382,6 +402,19 @@ static int macb_mii_probe(struct net_device *dev)
>>          int phy_irq;
>>          int ret;
>>
>> +        if (bp->gmii2rgmii_phy_node) {
>> +                phydev = of_phy_attach(bp->dev,
>> +                                       bp->gmii2rgmii_phy_node,
>> +                                       0, 0);
>> +                if (!phydev) {
>> +                        dev_err(&bp->pdev->dev, "%s: no gmii to rgmii converter found\n",
>> +                                dev->name);
>> +                        return -1;
>> +                }
>> +                bp->gmii2rgmii_phy_dev = phydev;
>> +        } else
>> +                bp->gmii2rgmii_phy_dev = NULL;
>> +
>>          phydev = phy_find_first(bp->mii_bus);
>>          if (!phydev) {
>>                  netdev_err(dev, "no PHY found\n");
>> @@ -402,6 +435,8 @@ static int macb_mii_probe(struct net_device *dev)
>>                                   bp->phy_interface);
>>          if (ret) {
>>                  netdev_err(dev, "Could not attach to PHY\n");
>> +                if (bp->gmii2rgmii_phy_dev)
>> +                        phy_disconnect(bp->gmii2rgmii_phy_dev);
>>                  return ret;
>>          }
>> @@ -3368,6 +3403,9 @@ static int macb_probe(struct platform_device *pdev)
>>                  bp->phy_interface = err;
>>          }
>> +        bp->gmii2rgmii_phy_node = of_parse_phandle(bp->pdev->dev.of_node,
>> +                                                   "gmii2rgmii-phy-handle", 0);
>> +
>>          macb_reset_phy(pdev);
>>          /* IP specific init */
>> @@ -3422,6 +3460,8 @@ static int macb_remove(struct platform_device *pdev)
>>                  bp = netdev_priv(dev);
>>                  if (bp->phy_dev)
>>                          phy_disconnect(bp->phy_dev);
>> +                if (bp->gmii2rgmii_phy_dev)
>> +                        phy_disconnect(bp->gmii2rgmii_phy_dev);
>>
>> But making the above changes makes the driver look odd.
>>
>> Could you please suggest a better option to add support for this IP in
>> the macb driver?
>
> Appana,
>
> I certainly can't prototype the solution based on your datasheet and the
> code sent... do a sensible proposal, then we can evaluate.
>
> As the IP is separated from the Eth controller, make it a separate
> driver (an emulated phy one for instance... even if I don't know if it
> makes sense).
>
> I don't know if others have already made such an adaptation layer
> between GMII to RGMII but I'm pretty sure it can't be inserted into the
> macb driver.
>
> Bye,
>
This sounds very similar to the altera emac-splitter.
See stmmac driver for how they handled this.
--
Regards
Phil Reid
^ permalink raw reply
* RE: GMII2RGMII Converter support in macb driver
From: Appana Durga Kedareswara Rao @ 2016-04-12 13:31 UTC (permalink / raw)
To: Nicolas Ferre, netdev@vger.kernel.org, Michal Simek
Cc: Punnaiah Choudary Kalluri, Harini Katakam, Anirudha Sarangi
In-Reply-To: <570CF663.4000908@atmel.com>
Hi Nicolas Ferre,
> -----Original Message-----
> From: Nicolas Ferre [mailto:nicolas.ferre@atmel.com]
> Sent: Tuesday, April 12, 2016 6:52 PM
> To: Appana Durga Kedareswara Rao <appanad@xilinx.com>;
> netdev@vger.kernel.org; Michal Simek <michals@xilinx.com>
> Cc: Punnaiah Choudary Kalluri <punnaia@xilinx.com>; Harini Katakam
> <harinik@xilinx.com>; Anirudha Sarangi <anirudh@xilinx.com>; Appana Durga
> Kedareswara Rao <appanad@xilinx.com>
> Subject: Re: GMII2RGMII Converter support in macb driver
>
> On 12/04/2016 15:03, Appana Durga Kedareswara Rao wrote:
> > Hi All,
> >
> >
> >
> >
> >
> > There is a Xilinx custom IP for GMII to RGMII
> > conversion data sheet here
> > (http://www.xilinx.com/support/documentation/ip_documentation/gmii_to_
> > rgmii/v4_0/pg160-gmii-to-rgmii.pdf
> > )
> >
> >
> >
> >
> >
> > Unlike other Phy's this IP won't support auto
> > negotiation and other features that usually normal Phy's support.
> >
> > This IP has only one register (Control register) which needs to be
> > programmed based on the external phy auto negotiation
> >
> > (Based on the external phy negotiated speed).
> >
> >
> >
> > I am able to make it work for GEM driver by doing the below changes in
> > the driver (drivers/net/ethernet/cadence/macb.c).
> >
> >
> >
> > [...]
> >
> >
> >
> > But doing above changes making driver looks odd.
> >
> > could you please suggest any better option to add support for this IP
> > in the macb driver?
>
> Appana,
>
> I certainly can't prototype the solution based on your datasheet and the code
> sent... do a sensible proposal, then we can evaluate.
Thanks for the quick response; I will come up with a sensible proposal soon.
Regards,
Kedar.
>
> As the IP is separated from the Eth controller, make it a separate driver (an
> emulated phy one for instance... even if I don't know if it makes sense).
>
> I don't know if others have already made such an adaptation layer between
> GMII to RGMII but I'm pretty sure it can't be inserted into the macb driver.
>
> Bye,
> --
> Nicolas Ferre
^ permalink raw reply
* Re: [PATCH net-next v2 1/2] rtnetlink: add new RTM_GETSTATS message to dump link stats
From: Thomas Graf @ 2016-04-12 13:21 UTC (permalink / raw)
To: roopa; +Cc: netdev, jhs, davem, Nikolay Aleksandrov
In-Reply-To: <570C7133.8070109@cumulusnetworks.com>
On 04/11/16 at 08:53pm, roopa wrote:
> Top level stats attributes can be netdev or global attributes: We can include string "LINK" in
> the names of all stats belonging to a netdev to make it easier to recognize the netdev stats (example):
> IFLA_STATS_LINK64, (netdev)
> IFLA_STATS_LINK_INET6, (netdev)
> IFLA_STATS_TCP, (non-netdev, global tcp stats)
This is fine as well. It means that we can't mix netdev and non-netdev
stats or stats for multiple netdevs in the same request, which would
not be the case if you nested it and had a top-level attribute which
is a list of requests. That may be borderline overengineering
though, so I'm fine with this as well.
> We will need a field in netlink_callback to indicate global or netdev stats when the stats
> crosses skb boundaries. A single nlmsg cannot have both netdev and global stats.
I would treat each IFLA_STATS_ as its own nlmsg in the reply and
enforce an NLM_F_DUMP request for any multi request message.
^ permalink raw reply
* Re: GMII2RGMII Converter support in macb driver
From: Nicolas Ferre @ 2016-04-12 13:21 UTC (permalink / raw)
To: Appana Durga Kedareswara Rao, netdev@vger.kernel.org,
Michal Simek
Cc: Punnaiah Choudary Kalluri, Harini Katakam, Anirudha Sarangi,
Appana Durga Kedareswara Rao
In-Reply-To: <C246CAC1457055469EF09E3A7AC4E11A4A57592E@XAP-PVEXMBX01.xlnx.xilinx.com>
On 12/04/2016 15:03, Appana Durga Kedareswara Rao wrote:
> Hi All,
>
>
>
>
>
> There is a Xilinx custom IP for GMII to RGMII conversion
> data sheet here
> (http://www.xilinx.com/support/documentation/ip_documentation/gmii_to_rgmii/v4_0/pg160-gmii-to-rgmii.pdf
> )
>
>
>
>
>
> Unlike other Phy’s this IP won’t support auto negotiation
> and other features that usually normal Phy’s support.
>
> This IP has only one register (Control register) which needs to be
> programmed based on the external phy auto negotiation
>
> (Based on the external phy negotiated speed).
>
>
>
> I am able to make it work for GEM driver by doing the below changes in
> the driver (drivers/net/ethernet/cadence/macb.c).
>
>
>
> [...]
>
>
>
> But doing above changes making driver looks odd.
>
> could you please suggest any better option to add support for this IP in
> the macb driver?
Appana,
I certainly can't prototype the solution based on your datasheet and the
code sent... do a sensible proposal, then we can evaluate.
As the IP is separated from the Eth controller, make it a separate
driver (an emulated phy one for instance... even if I don't know if it
makes sense).
I don't know if others have already made such an adaptation layer
between GMII to RGMII but I'm pretty sure it can't be inserted into the
macb driver.
Bye,
--
Nicolas Ferre
^ permalink raw reply
* Re: [PATCH RFC] net: decrease the length of backlog queue immediately after it's detached from sk
From: Yang Yingliang @ 2016-04-12 12:31 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, davem, Ding Tianhong
In-Reply-To: <570C6483.5020502@huawei.com>
On 2016/4/12 10:59, Yang Yingliang wrote:
>
>
> On 2016/4/11 20:13, Eric Dumazet wrote:
>> On Mon, 2016-04-11 at 19:57 +0800, Yang Yingliang wrote:
>>>
>>> On 2016/4/8 22:44, Eric Dumazet wrote:
>>>> On Fri, 2016-04-08 at 19:18 +0800, Yang Yingliang wrote:
>>>>
>>>>> I expand tcp_adv_win_scale and tcp_rmem. It has no effect.
>>>>
>>>> Try :
>>>>
>>>> echo -2 >/proc/sys/net/ipv4/tcp_adv_win_scale
>>>>
>>>> And restart your flows.
>>>>
>>> cat /proc/sys/net/ipv4/tcp_rmem
>>> 10240 2097152 10485760
>>
>> What about leaving the default values ?
> I tried, it did not work.
>
>>
>> $ cat /proc/sys/net/ipv4/tcp_rmem
>> 4096 87380 6291456
>>
>>>
>>> echo 102400 20971520 104857600 > /proc/sys/net/ipv4/tcp_rmem
>>> echo -2 >/proc/sys/net/ipv4/tcp_adv_win_scale
>>>
>>> It seems has not effect.
>>>
>>
>> I have no idea what you did on the sender side to allow it to send more
>> than 1.5 MB then.
>
> We are doing performance test. The sender send 256KB per-block with 128
> threads to one socket. And the receiver uses 10Gb NIC to handle the
> data on ARM64. The data flow is driver->ip layer->tcp layer->iscsi.
>
> I added some debug messages and found handling backlog packets in
> __release_sock() cost about 11ms at most. This can cause backlog queue
> overflow. The sk_data_ready is re-assigned, it may cost time in our
> program. I will check it out.
>
I traced the cycles spent handling backlog packets in
__release_sock().
It takes 16.97 ms to handle about 12 MB of backlog packets, of which
13.66 ms is spent in sk_data_ready.
TCP therefore handles packets at about 5.65 Gb/s, which is lower than
the NIC's bandwidth, so packets will be dropped.
If the cost of sk_data_ready cannot be reduced, do we have any choice
other than dropping packets?
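As a sanity check, the 5.65 Gb/s figure follows from the measurements above if 12 MB is read as 12 * 10^6 bytes (an assumption; read as 12 MiB it would be about 5.93 Gb/s). The helper names below are hypothetical and exist only to show the arithmetic.

```c
/* Throughput in Gb/s for `bytes` handled in `ms` milliseconds. */
static double throughput_gbps(double bytes, double ms)
{
    double bits = bytes * 8.0;
    double seconds = ms / 1000.0;

    return bits / seconds / 1e9;
}

/* Fraction of the total measured time spent in one sub-step. */
static double time_share(double part_ms, double total_ms)
{
    return part_ms / total_ms;
}
```

throughput_gbps(12e6, 16.97) gives roughly 5.66, and time_share(13.66, 16.97) gives roughly 0.80, i.e. about four fifths of the drain time goes to sk_data_ready.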
^ permalink raw reply