* Re: [PATCH net-next 1/4] ipip: allow to deactivate the creation of fb dev
From: Nicolas Dichtel @ 2012-11-16 16:46 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, davem
In-Reply-To: <20121116082926.1c6cccd2@nehalam.linuxnetplumber.net>
On 16/11/2012 17:29, Stephen Hemminger wrote:
> On Fri, 16 Nov 2012 17:14:13 +0100
> Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
>
>> Now that tunnels can be configured via rtnetlink, this device is not mandatory.
>> The default is conservative.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>
>
> Although I am in favor of reducing clutter, and we even have to put in special case
> code to ignore these stub devices in the Vyatta scripts, module parameters are a bit of a nuisance
> to deal with, but maybe the only way to do this kind of thing and keep the required ABI.
>
> Not sure if I can fully endorse this. The device may still have uses.
> It is still useful for capturing "none of the above" packets
If you need to capture these packets, you can still create a tunnel with local
any and remote any, even if the fb_device has not been created.
> and is used to auto-load module via module aliases.
Right, but if the user uses netlink, the problem exists without these patches too.
By default, the fb device is created, so nothing changes unless you explicitly
set setup_fb to 0.
^ permalink raw reply
* [RFC PATCH] tcp: introduce raw access to experimental options
From: elelueck @ 2012-11-16 16:54 UTC (permalink / raw)
To: netdev; +Cc: frankbla, raspl, ubacher, samudrala, Einar Lueck, davem
From: Einar Lueck <elelueck@linux.vnet.ibm.com>
This patch adds means for raw access to TCP experimental options
253 and 254. The intention of this is to enable user space
applications to implement communication behaviour that depends
on experimental options. For that, new (set|get)sockopts are
introduced:
TCP_EXPOPTS (get & set): TCP experimental options to be added to
packets
TCP_RECV_EXPOPTS (get): experimental options received with last
packet
TCP_RECV_SYN_EXPOPTS (get): experimental options received with
SYN packet
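A rough user-space sketch of the checks that the setsockopt handler in this patch applies to a TCP_EXPOPTS buffer: at most 40 bytes, a multiple of 4, and only kind/length/data entries of kind 253 or 254. The name expopts_valid and the EXPOPT_* constants are illustrative and not part of the patch. The rejection of a length byte below 2 is an extra guard assumed here, since such an entry would never advance the scan.

```c
#include <stddef.h>
#include <stdint.h>

#define EXPOPT_MAXLEN   40   /* mirrors TCP_EXPOP_MAXLEN in the patch */
#define EXPOPT_KIND_253 253
#define EXPOPT_KIND_254 254

/* Returns 1 if buf would be an acceptable TCP_EXPOPTS value, 0 otherwise.
 * Hypothetical helper; only the rules themselves come from the patch. */
int expopts_valid(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    if (len == 0)
        return 1;               /* an empty buffer clears the configuration */
    if (buf == NULL || len > EXPOPT_MAXLEN || len % 4 != 0)
        return 0;

    while (i < len) {
        uint8_t kind = buf[i];
        uint8_t optlen;

        if (kind != EXPOPT_KIND_253 && kind != EXPOPT_KIND_254)
            return 0;           /* only experimental kinds are allowed */
        if (i + 1 >= len)
            return 0;           /* length byte is missing */
        optlen = buf[i + 1];
        if (optlen < 2)
            return 0;           /* added guard: entry would not advance */
        if (optlen > len - i)
            return 0;           /* entry runs past the end of the buffer */
        i += optlen;
    }
    return 1;
}
```

An application could run such a check before calling setsockopt() with TCP_EXPOPTS to catch a malformed buffer locally instead of relying on -EINVAL from the kernel.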
TCP experimental options 253 and 254 configured via TCP_EXPOPTS on
any TCP socket are appended to every packet that is sent as long
as there is enough room left. If there is not enough room left they
are silently dropped.
Listening sockets reply to SYN packets with SYN ACK packets containing
TCP experimental options 253 and 254 as configured via TCP_EXPOPTS, too.
If a TCP connection gets established the configured experimental options
are the defaults for the new socket, too. Thus, a getsockopt on the
resulting accept socket for TCP_EXPOPTS returns the same stuff configured
on the listening socket.
As mentioned above, even after the three-way handshake is complete, experimental options
are sent with every packet. To enable user space applications to distinguish
between what has been advertized via SYN and what has been received with the
last packet the aforementioned TCP_RECV_SYN_EXPOPTS and TCP_RECV_EXPOPTS are
introduced.
Today, experimental options 253 (COOKIE) and 254 (FASTOPEN) are already
in use. For co-existence, the following approach has been taken:
General remarks:
* Interface to COOKIE and FASTOPEN stays the same
Sender side:
1. COOKIE and FASTOPEN code add their own options first (if applicable)
2. Finally, if enough room is left, TCP_EXPOPTS experimental options are
appended
Receiver side:
1. ALL 253 and 254 experimental options are made available via
TCP_RECV(_SYN)_EXPOPTS
2. COOKIE and FASTOPEN code check if there is any option relevant for them
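The sender and receiver rules above can be sketched in plain C roughly as follows. This is a simplified user-space model, not the kernel code: reserve_exp_opts, collect_exp_opts and struct exp_opts_rx are names made up for the illustration, and the 40-byte limit mirrors the maximum TCP option space used by the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TCPOPT_EOL 0
#define TCPOPT_NOP 1
#define EXP_MAXLEN 40   /* maximum TCP option space */

/* Hypothetical receive-side buffer: raw kind/length/data entries. */
struct exp_opts_rx {
    uint8_t len;
    uint8_t data[EXP_MAXLEN];
};

/* Sender side: the configured options are appended last, and only if
 * they still fit; otherwise they are silently dropped. */
int reserve_exp_opts(unsigned int *remaining, unsigned int conf_len)
{
    if (conf_len == 0 || conf_len > *remaining)
        return 0;
    *remaining -= conf_len;
    return 1;
}

/* Receiver side: walk the option area and copy every option of kind
 * 253 or 254 into rx, regardless of who will interpret it later. */
void collect_exp_opts(const uint8_t *opts, size_t length,
                      struct exp_opts_rx *rx)
{
    size_t i = 0;

    rx->len = 0;
    while (i < length) {
        uint8_t kind = opts[i];
        uint8_t size;

        if (kind == TCPOPT_EOL)
            return;
        if (kind == TCPOPT_NOP) {
            i++;
            continue;
        }
        if (i + 1 >= length)
            return;             /* truncated: no length byte */
        size = opts[i + 1];
        if (size < 2 || size > length - i)
            return;             /* malformed or partial option */
        if ((kind == 253 || kind == 254) &&
            (size_t)rx->len + size <= EXP_MAXLEN) {
            memcpy(rx->data + rx->len, opts + i, size);
            rx->len = (uint8_t)(rx->len + size);
        }
        i += size;
    }
}
```

In this model the COOKIE and FASTOPEN handlers would run after the copy and inspect the same options, which is what keeps their existing interfaces unchanged.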
References:
http://tools.ietf.org/html/draft-ietf-tcpm-experimental-options-02
Signed-off-by: Einar Lueck <elelueck@linux.vnet.ibm.com>
---
include/linux/tcp.h | 25 ++++++++++
include/net/tcp.h | 3 ++
net/ipv4/tcp.c | 110 +++++++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 119 +++++++++++++++++++++++++++++++----------------
net/ipv4/tcp_ipv4.c | 14 ++++++
net/ipv4/tcp_minisocks.c | 17 +++++++
net/ipv4/tcp_output.c | 37 ++++++++++++---
7 files changed, 279 insertions(+), 46 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index eb125a4..b2a6451 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -110,6 +110,10 @@ enum {
#define TCP_REPAIR_QUEUE 20
#define TCP_QUEUE_SEQ 21
#define TCP_REPAIR_OPTIONS 22
+#define TCP_EXPOPTS 23 /* TCP exp. options (configured) */
+#define TCP_RECV_EXPOPTS 24 /* TCP exp. options (received) */
+#define TCP_RECV_SYN_EXPOPTS 25 /* TCP exp. options
+ (received with syn) */
struct tcp_repair_opt {
__u32 opt_code;
@@ -269,6 +273,8 @@ struct tcp_sack_block {
#define TCP_FACK_ENABLED (1 << 1) /*1 = FACK is enabled locally*/
#define TCP_DSACK_SEEN (1 << 2) /*1 = DSACK was received from peer*/
+#define TCP_EXPOP_MAXLEN 40
+
struct tcp_options_received {
/* PAWS/RTTM data */
long ts_recent_stamp;/* Time we stored ts_recent (for aging) */
@@ -288,6 +294,9 @@ struct tcp_options_received {
u8 num_sacks; /* Number of SACK blocks */
u16 user_mss; /* mss requested by user in ioctl */
u16 mss_clamp; /* Maximal mss, negotiated at connection setup */
+ u8 exp_opts_len; /* length of buffer containing all exp
+ options in format: kind length data */
+ u8 exp_opts[TCP_EXPOP_MAXLEN]; /* experimental options */
};
static inline void tcp_clear_options(struct tcp_options_received *rx_opt)
@@ -295,6 +304,7 @@ static inline void tcp_clear_options(struct tcp_options_received *rx_opt)
rx_opt->tstamp_ok = rx_opt->sack_ok = 0;
rx_opt->wscale_ok = rx_opt->snd_wscale = 0;
rx_opt->cookie_plus = 0;
+ rx_opt->exp_opts_len = 0;
}
/* This is the max number of SACKS that we'll generate and process. It's safe
@@ -315,6 +325,10 @@ struct tcp_request_sock {
u32 rcv_isn;
u32 snt_isn;
u32 snt_synack; /* synack sent time */
+
+ u8 syn_expopts[TCP_EXPOP_MAXLEN]; /* experimental options
+ received with SYN */
+ u8 syn_expopts_len;
};
static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
@@ -406,6 +420,17 @@ struct tcp_sock {
u32 snd_up; /* Urgent pointer */
u8 keepalive_probes; /* num of allowed keep alive probes */
+
+ /* for raw access to experimental options */
+ struct {
+ u8 *conf; /* lazy allocation of TCP_EXPOP_MAXLEN bytes
+ for raw access to experimental options */
+ u8 conf_len; /* bytes actually used for experimental opts */
+ u8 *syn; /* experimental options received with SYN,
+ allocated only if received */
+ u8 syn_len; /* bytes of experimental options actually
+ received with SYN */
+ } exp_opts;
/*
* Options received (usually on last packet, some only on SYN packets).
*/
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1f000ff..b63d5c9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -170,6 +170,8 @@ extern void tcp_time_wait(struct sock *sk, int state, int timeo);
#define TCPOPT_TIMESTAMP 8 /* Better RTT estimations/PAWS */
#define TCPOPT_MD5SIG 19 /* MD5 Signature (RFC2385) */
#define TCPOPT_COOKIE 253 /* Cookie extension (experimental) */
+#define TCPOPT_EXP253 253 /* TCP experimental option 253 */
+#define TCPOPT_EXP254 254 /* TCP experimental option 254 */
#define TCPOPT_EXP 254 /* Experimental */
/* Magic number to be after the option value for sharing TCP
* experimental options. See draft-ietf-tcpm-experimental-options-00.txt
@@ -180,6 +182,7 @@ extern void tcp_time_wait(struct sock *sk, int state, int timeo);
* TCP option lengths
*/
+#define TCPOLEN_MAX_ANYEXP 40
#define TCPOLEN_MSS 4
#define TCPOLEN_WINDOW 3
#define TCPOLEN_SACK_PERM 2
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5f64193..e7e4947 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -423,6 +423,12 @@ void tcp_init_sock(struct sock *sk)
sk->sk_sndbuf = sysctl_tcp_wmem[1];
sk->sk_rcvbuf = sysctl_tcp_rmem[1];
+ /* memory for raw access to experimental options is allocated lazily */
+ tp->exp_opts.conf = NULL;
+ tp->exp_opts.conf_len = 0;
+ tp->exp_opts.syn = NULL;
+ tp->exp_opts.syn_len = 0;
+
local_bh_disable();
sock_update_memcg(sk);
sk_sockets_allocated_inc(sk);
@@ -2376,6 +2382,53 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
/* These are data/string values, all the others are ints */
switch (optname) {
+ case TCP_EXPOPTS: {
+ u8 conf[TCP_EXPOP_MAXLEN];
+
+ if (optlen > TCP_EXPOP_MAXLEN || (optlen < 4 && optlen > 0) ||
+ (optlen % 4 > 0))
+ return -EINVAL;
+ if (optlen > 0 && !optval)
+ return -EINVAL;
+
+ /* filter for raw access to supported options */
+ if (optlen) {
+ u8 i;
+
+ if (copy_from_user(conf, optval, optlen))
+ return -EFAULT;
+
+ i = 0;
+ while (i < optlen) {
+ if (conf[i] != TCPOPT_EXP253 &&
+ conf[i] != TCPOPT_EXP254)
+ return -EINVAL;
+
+ if (i + 1 < optlen) {
+ i += conf[i+1];
+ if (i > optlen)
+ return -EINVAL;
+ } else {
+ return -EINVAL;
+ }
+ }
+ }
+
+ lock_sock(sk);
+ if (!optlen) {
+ tp->exp_opts.conf_len = 0;
+ release_sock(sk);
+ return 0;
+ }
+ if (!tp->exp_opts.conf) {
+ tp->exp_opts.conf = kzalloc(TCP_EXPOP_MAXLEN,
+ sk->sk_allocation);
+ }
+ memcpy(tp->exp_opts.conf, conf, optlen);
+ tp->exp_opts.conf_len = optlen;
+ release_sock(sk);
+ return err;
+ }
case TCP_CONGESTION: {
char name[TCP_CA_NAME_MAX];
@@ -2947,6 +3000,63 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
case TCP_USER_TIMEOUT:
val = jiffies_to_msecs(icsk->icsk_user_timeout);
break;
+ case TCP_EXPOPTS: {
+ u8 exp_opts_len;
+
+ if (get_user(len, optlen))
+ return -EFAULT;
+ if (len < 0)
+ return -EINVAL;
+
+ exp_opts_len = tp->exp_opts.conf_len;
+
+ if (exp_opts_len > len)
+ return -EINVAL;
+ if (put_user(exp_opts_len, optlen))
+ return -EFAULT;
+ if (exp_opts_len && copy_to_user(optval, tp->exp_opts.conf,
+ exp_opts_len))
+ return -EFAULT;
+ return 0;
+ }
+ case TCP_RECV_EXPOPTS:
+ if (get_user(len, optlen))
+ return -EFAULT;
+ if (len < 0)
+ return -EINVAL;
+
+ if (len < tp->rx_opt.exp_opts_len)
+ return -EINVAL;
+
+ if (put_user(tp->rx_opt.exp_opts_len, optlen))
+ return -EFAULT;
+ if (copy_to_user(optval, tp->rx_opt.exp_opts,
+ tp->rx_opt.exp_opts_len))
+ return -EFAULT;
+ return 0;
+ case TCP_RECV_SYN_EXPOPTS: {
+ u8 exp_opts_len;
+
+ if (get_user(len, optlen))
+ return -EFAULT;
+ if (len < 0)
+ return -EINVAL;
+
+ if (!tp->exp_opts.syn)
+ exp_opts_len = 0;
+ else
+ exp_opts_len = tp->exp_opts.syn_len;
+
+ if (exp_opts_len > len)
+ return -EINVAL;
+ if (put_user(exp_opts_len, optlen))
+ return -EFAULT;
+ if (exp_opts_len && copy_to_user(optval, tp->exp_opts.syn,
+ exp_opts_len)) {
+ return -EFAULT;
+ }
+ return 0;
+ }
default:
return -ENOPROTOOPT;
}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d377f48..130d4f4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3726,11 +3726,32 @@ old_ack:
return 0;
}
+static inline void tcp_parse_fastopen_cookie(int opcode,
+ int opsize,
+ const unsigned char *ptr,
+ struct tcp_fastopen_cookie *foc,
+ const struct tcphdr *th) {
+ /* Fast Open option shares code 254 using a 16 bits magic number. It's
+ * valid only in SYN or SYN-ACK with an even size.
+ */
+ if (opsize < TCPOLEN_EXP_FASTOPEN_BASE ||
+ get_unaligned_be16(ptr) != TCPOPT_FASTOPEN_MAGIC || foc == NULL ||
+ !th->syn || (opsize & 1))
+ return;
+ foc->len = opsize - TCPOLEN_EXP_FASTOPEN_BASE;
+ if (foc->len >= TCP_FASTOPEN_COOKIE_MIN &&
+ foc->len <= TCP_FASTOPEN_COOKIE_MAX)
+ memcpy(foc->val, ptr + 2, foc->len);
+ else if (foc->len != 0)
+ foc->len = -1;
+}
+
/* Look for tcp options. Normally only called on SYN and SYNACK packets.
* But, this can also be called on packets in the established flow when
* the fast version below fails.
*/
-void tcp_parse_options(const struct sk_buff *skb, struct tcp_options_received *opt_rx,
+void tcp_parse_options(const struct sk_buff *skb,
+ struct tcp_options_received *opt_rx,
const u8 **hvpp, int estab,
struct tcp_fastopen_cookie *foc)
{
@@ -3740,6 +3761,7 @@ void tcp_parse_options(const struct sk_buff *skb, struct tcp_options_received *o
ptr = (const unsigned char *)(th + 1);
opt_rx->saw_tstamp = 0;
+ opt_rx->exp_opts_len = 0;
while (length > 0) {
int opcode = *ptr++;
@@ -3815,48 +3837,56 @@ void tcp_parse_options(const struct sk_buff *skb, struct tcp_options_received *o
*/
break;
#endif
- case TCPOPT_COOKIE:
- /* This option is variable length.
+ case TCPOPT_EXP253:
+ case TCPOPT_EXP254:
+ /* First parse options into raw access area for
+ * experimental options. Then handle
+ * potential exploitations
*/
- switch (opsize) {
- case TCPOLEN_COOKIE_BASE:
- /* not yet implemented */
- break;
- case TCPOLEN_COOKIE_PAIR:
- /* not yet implemented */
- break;
- case TCPOLEN_COOKIE_MIN+0:
- case TCPOLEN_COOKIE_MIN+2:
- case TCPOLEN_COOKIE_MIN+4:
- case TCPOLEN_COOKIE_MIN+6:
- case TCPOLEN_COOKIE_MAX:
- /* 16-bit multiple */
- opt_rx->cookie_plus = opsize;
- *hvpp = ptr;
- break;
- default:
- /* ignore option */
- break;
+ if (opsize <= TCPOLEN_MAX_ANYEXP &&
+ opsize >= 2 &&
+ (opt_rx->exp_opts_len + opsize <=
+ TCPOLEN_MAX_ANYEXP)) {
+ opt_rx->exp_opts[
+ opt_rx->exp_opts_len] = opcode;
+ opt_rx->exp_opts[
+ opt_rx->exp_opts_len + 1] =
+ opsize;
+ memcpy(opt_rx->exp_opts +
+ opt_rx->exp_opts_len + 2, ptr,
+ opsize - 2);
+ opt_rx->exp_opts_len += opsize;
}
- break;
- case TCPOPT_EXP:
- /* Fast Open option shares code 254 using a
- * 16 bits magic number. It's valid only in
- * SYN or SYN-ACK with an even size.
- */
- if (opsize < TCPOLEN_EXP_FASTOPEN_BASE ||
- get_unaligned_be16(ptr) != TCPOPT_FASTOPEN_MAGIC ||
- foc == NULL || !th->syn || (opsize & 1))
- break;
- foc->len = opsize - TCPOLEN_EXP_FASTOPEN_BASE;
- if (foc->len >= TCP_FASTOPEN_COOKIE_MIN &&
- foc->len <= TCP_FASTOPEN_COOKIE_MAX)
- memcpy(foc->val, ptr + 2, foc->len);
- else if (foc->len != 0)
- foc->len = -1;
+ /* handle potential exploitations */
+ if (opcode == TCPOPT_COOKIE) {
+ /* This option is variable length. */
+ switch (opsize) {
+ case TCPOLEN_COOKIE_BASE:
+ /* not yet implemented */
+ break;
+ case TCPOLEN_COOKIE_PAIR:
+ /* not yet implemented */
+ break;
+ case TCPOLEN_COOKIE_MIN+0:
+ case TCPOLEN_COOKIE_MIN+2:
+ case TCPOLEN_COOKIE_MIN+4:
+ case TCPOLEN_COOKIE_MIN+6:
+ case TCPOLEN_COOKIE_MAX:
+ /* 16-bit multiple */
+ opt_rx->cookie_plus = opsize;
+ *hvpp = ptr;
+ break;
+ default:
+ /* ignore option */
+ break;
+ }
+ } else {
+ tcp_parse_fastopen_cookie(opcode,
+ opsize, ptr,
+ foc, th);
+ }
break;
-
}
ptr += opsize-2;
length -= opsize;
@@ -3888,6 +3918,9 @@ static bool tcp_fast_parse_options(const struct sk_buff *skb,
const struct tcphdr *th,
struct tcp_sock *tp, const u8 **hvpp)
{
+ /* required if exp options are not used anymore by the counterpart */
+ tp->rx_opt.exp_opts_len = 0;
+
/* In the spirit of fast parsing, compare doff directly to constant
* values. Because equality is used, short doff can be ignored here.
*/
@@ -5806,6 +5839,14 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
}
}
+ if (unlikely(tp->rx_opt.exp_opts_len > 0)) {
+ tp->exp_opts.syn = kzalloc(tp->rx_opt.exp_opts_len,
+ sk->sk_allocation);
+ tp->exp_opts.syn_len = tp->rx_opt.exp_opts_len;
+ memcpy(tp->exp_opts.syn, &tp->rx_opt.exp_opts,
+ tp->rx_opt.exp_opts_len);
+ }
+
smp_mb();
tcp_finish_connect(sk, skb);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 00a748d..2f66bd5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1321,6 +1321,16 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
tmp_opt.user_mss = tp->rx_opt.user_mss;
tcp_parse_options(skb, &tmp_opt, &hash_location, 0, NULL);
+ /* for raw access to experimental options in SYN packet */
+ tcp_rsk(req)->syn_expopts_len = tmp_opt.exp_opts_len;
+ if (tcp_rsk(req)->syn_expopts_len) {
+ /* transport experimental options via request socket to big
+ * socket
+ */
+ memcpy(tcp_rsk(req)->syn_expopts, tmp_opt.exp_opts,
+ tcp_rsk(req)->syn_expopts_len);
+ }
+
if (tmp_opt.cookie_plus > 0 &&
tmp_opt.saw_tstamp &&
!tp->rx_opt.cookie_out_never &&
@@ -1978,6 +1988,10 @@ void tcp_v4_destroy_sock(struct sock *sk)
tp->cookie_values = NULL;
}
+ /* buffers for raw access to experimental options */
+ kfree(tp->exp_opts.conf);
+ kfree(tp->exp_opts.syn);
+
/* If socket is aborted during connect operation */
tcp_free_fastopen_req(tp);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6ff7f10..dc25875 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -466,6 +466,23 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
newtp->urg_data = 0;
+ if (tcp_rsk(req)->syn_expopts_len) {
+ newtp->exp_opts.syn_len =
+ tcp_rsk(req)->syn_expopts_len;
+ newtp->exp_opts.syn = kzalloc(newtp->exp_opts.syn_len,
+ GFP_ATOMIC);
+ memcpy(newtp->exp_opts.syn, tcp_rsk(req)->syn_expopts,
+ newtp->exp_opts.syn_len);
+ }
+
+ if (oldtp->exp_opts.conf_len > 0) {
+ newtp->exp_opts.conf_len = oldtp->exp_opts.conf_len;
+ newtp->exp_opts.conf = kzalloc(TCP_EXPOP_MAXLEN,
+ GFP_ATOMIC);
+ memcpy(newtp->exp_opts.conf, oldtp->exp_opts.conf,
+ oldtp->exp_opts.conf_len);
+ }
+
if (sock_flag(newsk, SOCK_KEEPOPEN))
inet_csk_reset_keepalive_timer(newsk,
keepalive_time_when(newtp));
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d046326..8d7cf51 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -385,6 +385,7 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp)
#define OPTION_MD5 (1 << 2)
#define OPTION_WSCALE (1 << 3)
#define OPTION_COOKIE_EXTENSION (1 << 4)
+#define OPTION_EXP (1 << 5)
#define OPTION_FAST_OPEN_COOKIE (1 << 8)
struct tcp_out_options {
@@ -581,6 +582,12 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
}
ptr += (foc->len + 3) >> 2;
}
+ if (unlikely(OPTION_EXP & options && tp->exp_opts.conf_len > 0)) {
+ __u8 *p = (__u8 *) ptr;
+ memcpy(ptr, tp->exp_opts.conf, tp->exp_opts.conf_len);
+ p += tp->exp_opts.conf_len;
+ ptr = (__be32 *) p;
+ }
}
/* Compute TCP options for SYN packets. This is not the final
@@ -693,6 +700,11 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
remaining -= need;
}
}
+ if (unlikely(tp->exp_opts.conf_len > 0 &&
+ tp->exp_opts.conf_len <= remaining)) {
+ opts->options |= OPTION_EXP;
+ remaining -= tp->exp_opts.conf_len;
+ }
return MAX_TCP_OPTION_SPACE - remaining;
}
@@ -747,6 +759,11 @@ static unsigned int tcp_synack_options(struct sock *sk,
if (unlikely(!ireq->tstamp_ok))
remaining -= TCPOLEN_SACKPERM_ALIGNED;
}
+ if (unlikely(tcp_sk(sk)->exp_opts.conf_len > 0 &&
+ tcp_sk(sk)->exp_opts.conf_len <= remaining)) {
+ opts->options |= OPTION_EXP;
+ remaining -= tcp_sk(sk)->exp_opts.conf_len;
+ }
/* Similar rationale to tcp_syn_options() applies here, too.
* If the <SYN> options fit, the same options should fit now!
@@ -782,38 +799,44 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
{
struct tcp_skb_cb *tcb = skb ? TCP_SKB_CB(skb) : NULL;
struct tcp_sock *tp = tcp_sk(sk);
- unsigned int size = 0;
+ unsigned remaining = MAX_TCP_OPTION_SPACE;
unsigned int eff_sacks;
#ifdef CONFIG_TCP_MD5SIG
*md5 = tp->af_specific->md5_lookup(sk, sk);
if (unlikely(*md5)) {
opts->options |= OPTION_MD5;
- size += TCPOLEN_MD5SIG_ALIGNED;
+ remaining -= TCPOLEN_MD5SIG_ALIGNED;
}
#else
*md5 = NULL;
#endif
- if (likely(tp->rx_opt.tstamp_ok)) {
+ if (likely(tp->rx_opt.tstamp_ok &&
+ remaining >= TCPOLEN_TSTAMP_ALIGNED)) {
opts->options |= OPTION_TS;
opts->tsval = tcb ? tcb->when : 0;
opts->tsecr = tp->rx_opt.ts_recent;
- size += TCPOLEN_TSTAMP_ALIGNED;
+ remaining -= TCPOLEN_TSTAMP_ALIGNED;
}
eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack;
if (unlikely(eff_sacks)) {
- const unsigned int remaining = MAX_TCP_OPTION_SPACE - size;
opts->num_sack_blocks =
min_t(unsigned int, eff_sacks,
(remaining - TCPOLEN_SACK_BASE_ALIGNED) /
TCPOLEN_SACK_PERBLOCK);
- size += TCPOLEN_SACK_BASE_ALIGNED +
+ remaining -= TCPOLEN_SACK_BASE_ALIGNED +
opts->num_sack_blocks * TCPOLEN_SACK_PERBLOCK;
}
- return size;
+ if (unlikely(tp->exp_opts.conf_len > 0 &&
+ tp->exp_opts.conf_len <= remaining)) {
+ opts->options |= OPTION_EXP;
+ remaining -= tp->exp_opts.conf_len;
+ }
+
+ return MAX_TCP_OPTION_SPACE - remaining;
}
--
1.7.12.4
^ permalink raw reply related
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: Jesse Gross @ 2012-11-16 17:33 UTC (permalink / raw)
To: Vlad Yasevich
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353080434-14165-1-git-send-email-vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Fri, Nov 16, 2012 at 7:40 AM, Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Openvswitch attempts to use IPv6 packet parsing functions without
> any dependency on IPv6 (unlike every other place in kernel). Pull
> the IPv6 code in openvswitch together and put a conditional that's
> dependent on CONFIG_IPV6.
>
> Resolves:
> net/built-in.o: In function `ovs_flow_extract':
> (.text+0xbf5d5): undefined reference to `ipv6_skip_exthdr'
>
> Signed-off-by: Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Doesn't this move in the opposite direction of your patches to make IPv6
GSO/GRO always available? The packets being processed here
are generally created by the guest but with Open vSwitch running on the
host. Also, ipv6_skip_exthdr() is in exthdrs_core.c, so it actually is
always available. I suspect that the real problem is that the dependency
on the ipv6 directory changed to CONFIG_INET and Open vSwitch should now
depend on this.
^ permalink raw reply
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: Jesse Gross @ 2012-11-16 17:36 UTC (permalink / raw)
To: Vlad Yasevich
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1353080434-14165-1-git-send-email-vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Fri, Nov 16, 2012 at 7:40 AM, Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Openvswitch attempts to use IPv6 packet parsing functions without
> any dependency on IPv6 (unlike every other place in kernel). Pull
> the IPv6 code in openvswitch together and put a conditional that's
> dependent on CONFIG_IPV6.
>
> Resolves:
> net/built-in.o: In function `ovs_flow_extract':
> (.text+0xbf5d5): undefined reference to `ipv6_skip_exthdr'
>
> Signed-off-by: Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
(Sorry for duplicates received, the original message had a typo in the
mailing list address.)
Doesn't this move in the opposite direction of your patches to make
IPv6 GSO/GRO always available? The packets being processed here are
generally created by the guest but with Open vSwitch running on the
host. Also, ipv6_skip_exthdr() is in exthdrs_core.c, so it actually
is always available. I suspect that the real problem is that the
dependency on the ipv6 directory changed to CONFIG_INET and Open
vSwitch should now depend on this.
^ permalink raw reply
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: Vlad Yasevich @ 2012-11-16 17:43 UTC (permalink / raw)
To: Jesse Gross
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <CAEP_g=9ge1dq8ahinM075hjdfdmXaou4a9fqPyLPJQinths6pQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 11/16/2012 12:26 PM, Jesse Gross wrote:
> On Fri, Nov 16, 2012 at 7:40 AM, Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>
>> Openvswitch attempts to use IPv6 packet parsing functions without
>> any dependency on IPv6 (unlike every other place in kernel). Pull
>> the IPv6 code in openvswitch together and put a conditional that's
>> dependent on CONFIG_IPV6.
>>
>> Resolves:
>> net/built-in.o: In function `ovs_flow_extract':
>> (.text+0xbf5d5): undefined reference to `ipv6_skip_exthdr'
>>
>> Signed-off-by: Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
>
> Doesn't this move in the opposite direction of your patches to make IPv6
> GSO/GRO always available? The packets being processed here
> are generally created by the guest but with Open vSwitch running on the
> host. Also, ipv6_skip_exthdr() is in exthdrs_core.c, so it actually is
> always available. I suspect that the real problem is that the dependency
> on the ipv6 directory changed to CONFIG_INET and Open vSwitch should now
> depend on this.
>
Yes and no... :) IPv6 uses a bunch of IPv4 code all over. IPv4 is
enabled with CONFIG_INET and IPv6 with CONFIG_NET, which creates a strange
imbalance. By shifting IPv6 to CONFIG_INET (which is where it
lives and what enables its selection during the config process), we now have
a dependency with openvswitch.
All other users of ipv6_skip_exthdr have it either under the IS_ENABLED
conditional or through some other means that don't build it when INET is
completely turned off. This patch does the same for openvswitch.
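As a rough illustration of the IS_ENABLED-style guard mentioned here, the following self-contained sketch re-creates the idea in user space. The DEMO_* macros are a simplified stand-in modelled on the placeholder trick in the kernel's kconfig.h (they do not handle the _MODULE variant), and CONFIG_DEMO_IPV6, demo_skip_exthdr and demo_parse are made-up names, not the symbols touched by the patch.

```c
/* Made-up config symbol; comment it out to compile the IPv6 path away. */
#define CONFIG_DEMO_IPV6 1

/* Simplified re-creation of the "is this symbol defined to 1" test.
 * If the symbol is defined as 1, the placeholder expands to "0," and
 * shifts a 1 into the second argument position; otherwise the second
 * argument stays 0. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg1_or_junk) \
    DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#if DEMO_IS_DEFINED(CONFIG_DEMO_IPV6)
/* Hypothetical stand-in for an IPv6-only helper that only exists
 * when the feature is configured in. */
int demo_skip_exthdr(int nexthdr)
{
    return nexthdr;
}
#endif

/* The caller references the helper only under the same condition, so
 * no undefined reference is left behind when the feature is off. */
int demo_parse(int nexthdr)
{
#if DEMO_IS_DEFINED(CONFIG_DEMO_IPV6)
    return demo_skip_exthdr(nexthdr);
#else
    (void)nexthdr;
    return -1;
#endif
}
```

The point of the pattern is that both the definition and every call site disappear together, which is what avoids a link error of the kind quoted in the original report.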
I see 2 alternatives to this:
1) Make openvswitch depend on CONFIG_INET.
2) Pull a ton of code out of CONFIG_INET (v4 and v6) and into
CONFIG_NET. This could start with IPv6 header parsing and maybe even
include GSO/TSO (but not sure how much sense that would make).
What's your take?
-vlad
^ permalink raw reply
* Re: [PATCH 0/4] netfilter updates for nf-next (try 2)
From: David Miller @ 2012-11-16 18:00 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1353069653-3231-1-git-send-email-pablo@netfilter.org>
From: pablo@netfilter.org
Date: Fri, 16 Nov 2012 13:40:49 +0100
> This is the second try to include the following four patches that contain
> updates for your net-next tree, they are:
>
> * Little cleanup for IPVS to remove the use of a strange notation to assign the
> conntrack object, from Alan Cox.
>
> * Another little cleanup for nf_nat to save a couple of lines by using
> PTR_RET, from Wu Fengguang.
>
> * getsockopt support to obtain the original IPv6 address after NAT,
> similar to the one that IPv4 provides, from Florian Westphal.
>
> * Follow-up patch pointed out by YOSHIFUJI Hideaki to only provide
> the scope_id when the address is link-local, again from Florian Westphal.
>
> You can pull these changes from:
>
> git://1984.lsi.us.es/nf-next master
Pulled, thanks.
There was a conflict I had to resolve because the net-next tree
had a series of IS_ENABLED conversions that overlapped with some
of the changes in your tree.
^ permalink raw reply
* [PATCH net-next] net: use right lock in __dev_remove_offload
From: Eric Dumazet @ 2012-11-16 18:08 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Vlad Yasevich
From: Eric Dumazet <edumazet@google.com>
offload_base is protected by offload_lock, not ptype_lock
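The rule behind this fix, that a list must be walked and modified under the lock that actually protects it, can be sketched in user space as follows. This is an illustrative model using a C11 atomic_flag as a tiny spinlock; the demo_* names and the singly linked list are assumptions made for the example and do not reflect the kernel's list or spinlock implementation.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packet_offload. */
struct demo_offload {
    struct demo_offload *next;
    int id;
};

/* The list head and the lock that protects it live side by side. */
static atomic_flag offload_lock = ATOMIC_FLAG_INIT;
static struct demo_offload *offload_base;

static void demo_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void demo_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

void demo_add_offload(struct demo_offload *po)
{
    demo_lock(&offload_lock);
    po->next = offload_base;
    offload_base = po;
    demo_unlock(&offload_lock);
}

/* Returns 1 if po was found and unlinked, 0 otherwise. Removal takes
 * the same lock as insertion; taking an unrelated lock here would
 * leave the list unprotected against concurrent adds. */
int demo_remove_offload(struct demo_offload *po)
{
    struct demo_offload **pp;
    int found = 0;

    demo_lock(&offload_lock);
    for (pp = &offload_base; *pp; pp = &(*pp)->next) {
        if (*pp == po) {
            *pp = po->next;
            found = 1;
            break;
        }
    }
    demo_unlock(&offload_lock);
    return found;
}
```
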
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
---
net/core/dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index cf105e8..2705a2a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -513,7 +513,7 @@ void __dev_remove_offload(struct packet_offload *po)
struct list_head *head = &offload_base;
struct packet_offload *po1;
- spin_lock(&ptype_lock);
+ spin_lock(&offload_lock);
list_for_each_entry(po1, head, list) {
if (po == po1) {
@@ -524,7 +524,7 @@ void __dev_remove_offload(struct packet_offload *po)
pr_warn("dev_remove_offload: %p not found\n", po);
out:
- spin_unlock(&ptype_lock);
+ spin_unlock(&offload_lock);
}
EXPORT_SYMBOL(__dev_remove_offload);
^ permalink raw reply related
* Re: [PATCH net-next] net: use right lock in __dev_remove_offload
From: Vlad Yasevich @ 2012-11-16 18:15 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1353089303.10798.38.camel@edumazet-glaptop>
On 11/16/2012 01:08 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> offload_base is protected by offload_lock, not ptype_lock
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Vlad Yasevich <vyasevic@redhat.com>
> ---
> net/core/dev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cf105e8..2705a2a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -513,7 +513,7 @@ void __dev_remove_offload(struct packet_offload *po)
> struct list_head *head = &offload_base;
> struct packet_offload *po1;
>
> - spin_lock(&ptype_lock);
> + spin_lock(&offload_lock);
>
> list_for_each_entry(po1, head, list) {
> if (po == po1) {
> @@ -524,7 +524,7 @@ void __dev_remove_offload(struct packet_offload *po)
>
> pr_warn("dev_remove_offload: %p not found\n", po);
> out:
> - spin_unlock(&ptype_lock);
> + spin_unlock(&offload_lock);
> }
> EXPORT_SYMBOL(__dev_remove_offload);
>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
I distinctly remember changing that, but I just looked at my patches and
it's there.... Sorry...
-vlad
^ permalink raw reply
* Re: [PATCH 2/5] drivers/net/wireless/ti/wlcore/main.c: eliminate possible double power off
From: Luciano Coelho @ 2012-11-16 18:18 UTC (permalink / raw)
To: Julia Lawall
Cc: kernel-janitors, John W. Linville, linux-wireless, netdev,
linux-kernel
In-Reply-To: <1350816727-1381-3-git-send-email-Julia.Lawall@lip6.fr>
On Sun, 2012-10-21 at 12:52 +0200, Julia Lawall wrote:
> From: Julia Lawall <Julia.Lawall@lip6.fr>
>
> The function wl12xx_set_power_on is only called twice, once in
> wl12xx_chip_wakeup and once in wl12xx_get_hw_info. On the failure of the
> call in wl12xx_chip_wakeup, the containing function just returns, but on
> the failure of the call in wl12xx_get_hw_info, the containing function
> calls wl1271_power_off. This does not seem necessary, because if
> wl12xx_set_power_on has set the power on and then fails, it has already
> turned the power off.
[...]
Applied and pushed, thanks!
--
Luca.
^ permalink raw reply
* Re: [PATCH 14/14] wlcore: Remove redundant check on unsigned variable
From: Luciano Coelho @ 2012-11-16 18:24 UTC (permalink / raw)
To: Tushar Behera; +Cc: linux-kernel, patches, linux-wireless, netdev
In-Reply-To: <1353048646-10935-15-git-send-email-tushar.behera@linaro.org>
On Fri, 2012-11-16 at 12:20 +0530, Tushar Behera wrote:
> No need to check whether unsigned variable is less than 0.
>
> CC: Luciano Coelho <coelho@ti.com>
> CC: linux-wireless@vger.kernel.org
> CC: netdev@vger.kernel.org
> Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
> ---
Applied in the wl12xx.git tree. Thanks!
--
Luca.
^ permalink raw reply
* pull request: wireless 2012-11-16
From: John W. Linville @ 2012-11-16 18:30 UTC (permalink / raw)
To: davem; +Cc: linux-wireless, netdev, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 17964 bytes --]
commit 26c6e80892d8c160dffaba85889bd4e65b1dacf6
Dave,
This batch of fixes is intended for the 3.7 stream...
This includes a pull of the Bluetooth tree. Gustavo says:
"A few important fixes to go into 3.7. There is a new hw support by Marcos
Chaparro. Johan added a memory leak fix and an hci device index list fix.
Also Marcel fixed a race condition in the device setup that was preventing the
bt monitor from working properly. Last, Paulo Sérgio added a fix to the error
status when pairing for LE fails. This was preventing userspace from handling
the failure properly."
Regarding the mac80211 pull, Johannes says:
"I have a locking fix for some SKB queues, a variable initialization to
avoid crashes in a certain failure case, another free_txskb fix from
Felix and another fix from him to avoid calling a stopped driver, a fix
for a (very unlikely) memory leak and a fix to not send null data
packets when resuming while not associated."
Regarding the iwlwifi pull, Johannes says:
"Two more fixes for iwlwifi ... one to use ieee80211_free_txskb(), and
one to check DMA mapping errors, please pull."
On top of that, Johannes also included a wireless regulatory fix
to allow 40 MHz on channels 12 and 13 in world roaming mode. Also,
Hauke Mehrtens fixes a #ifdef typo in brcmfmac.
Please let me know if there are problems!
Thanks,
John
---
The following changes since commit 6fc4adca6ce3e1d57a42707019dddcb883578a91:
tilegx: request_irq with a non-null device name (2012-11-16 01:40:41 -0500)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless.git for-davem
for you to fetch changes up to 26c6e80892d8c160dffaba85889bd4e65b1dacf6:
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem (2012-11-16 12:59:13 -0500)
----------------------------------------------------------------
Arik Nemtsov (1):
mac80211: sync acccess to tx_filtered/ps_tx_buf queues
David Spinadel (1):
mac80211: init sched_scan_ies
Felix Fietkau (2):
mac80211: do not call ieee80211_configure_filter if no interfaces are up
mac80211: call skb_dequeue/ieee80211_free_txskb instead of __skb_queue_purge
Hauke Mehrtens (1):
brcmfmac: fix typo in CONFIG_BRCMISCAN
Johan Hedberg (2):
Bluetooth: Fix having bogus entries in mgmt_read_index_list reply
Bluetooth: Fix memory leak when removing a UUID
Johannes Berg (5):
iwlwifi: handle DMA mapping failures
iwlwifi: use ieee80211_free_txskb
mac80211: fix memory leak in device registration error path
mac80211: don't send null data packet when not associated
wireless: allow 40 MHz on world roaming channels 12/13
John W. Linville (4):
Merge branch 'for-john' of git://git.kernel.org/.../jberg/mac80211
Merge branch 'for-john' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
Merge branch 'master' of git://git.kernel.org/.../bluetooth/bluetooth
Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem
Marcel Holtmann (1):
Bluetooth: Notify about device registration before power on
Marcos Chaparro (1):
Bluetooth: ath3k: Add support for VAIO VPCEH [0489:e027]
Paulo Sérgio (1):
Bluetooth: Fix error status when pairing fails
drivers/bluetooth/ath3k.c | 1 +
drivers/bluetooth/btusb.c | 1 +
.../net/wireless/brcm80211/brcmfmac/wl_cfg80211.c | 2 +-
drivers/net/wireless/iwlwifi/dvm/mac80211.c | 2 +-
drivers/net/wireless/iwlwifi/dvm/main.c | 2 +-
drivers/net/wireless/iwlwifi/pcie/rx.c | 23 ++++++++++++++++++++--
net/bluetooth/hci_core.c | 4 ++--
net/bluetooth/mgmt.c | 12 ++++++-----
net/bluetooth/smp.c | 2 +-
net/mac80211/cfg.c | 3 +++
net/mac80211/ieee80211_i.h | 2 ++
net/mac80211/main.c | 6 ++++--
net/mac80211/scan.c | 2 +-
net/mac80211/sta_info.c | 11 ++++++++---
net/mac80211/status.c | 9 +++++++++
net/mac80211/tx.c | 9 ++++++---
net/mac80211/util.c | 2 ++
net/wireless/reg.c | 5 ++---
18 files changed, 73 insertions(+), 25 deletions(-)
diff --git a/drivers/bluetooth/ath3k.c b/drivers/bluetooth/ath3k.c
index fc2de55..b00000e 100644
--- a/drivers/bluetooth/ath3k.c
+++ b/drivers/bluetooth/ath3k.c
@@ -67,6 +67,7 @@ static struct usb_device_id ath3k_table[] = {
{ USB_DEVICE(0x13d3, 0x3304) },
{ USB_DEVICE(0x0930, 0x0215) },
{ USB_DEVICE(0x0489, 0xE03D) },
+ { USB_DEVICE(0x0489, 0xE027) },
/* Atheros AR9285 Malbec with sflash firmware */
{ USB_DEVICE(0x03F0, 0x311D) },
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index debda27..ee82f2f 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -124,6 +124,7 @@ static struct usb_device_id blacklist_table[] = {
{ USB_DEVICE(0x13d3, 0x3304), .driver_info = BTUSB_IGNORE },
{ USB_DEVICE(0x0930, 0x0215), .driver_info = BTUSB_IGNORE },
{ USB_DEVICE(0x0489, 0xe03d), .driver_info = BTUSB_IGNORE },
+ { USB_DEVICE(0x0489, 0xe027), .driver_info = BTUSB_IGNORE },
/* Atheros AR9285 Malbec with sflash firmware */
{ USB_DEVICE(0x03f0, 0x311d), .driver_info = BTUSB_IGNORE },
diff --git a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
index a6f1e81..481345c 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
@@ -4401,7 +4401,7 @@ static s32 brcmf_mode_to_nl80211_iftype(s32 mode)
static void brcmf_wiphy_pno_params(struct wiphy *wiphy)
{
-#ifndef CONFIG_BRCMFISCAN
+#ifndef CONFIG_BRCMISCAN
/* scheduled scan settings */
wiphy->max_sched_scan_ssids = BRCMF_PNO_MAX_PFN_COUNT;
wiphy->max_match_sets = BRCMF_PNO_MAX_PFN_COUNT;
diff --git a/drivers/net/wireless/iwlwifi/dvm/mac80211.c b/drivers/net/wireless/iwlwifi/dvm/mac80211.c
index ff8162d..fa4d1b8 100644
--- a/drivers/net/wireless/iwlwifi/dvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/dvm/mac80211.c
@@ -521,7 +521,7 @@ static void iwlagn_mac_tx(struct ieee80211_hw *hw,
ieee80211_get_tx_rate(hw, IEEE80211_SKB_CB(skb))->bitrate);
if (iwlagn_tx_skb(priv, control->sta, skb))
- dev_kfree_skb_any(skb);
+ ieee80211_free_txskb(hw, skb);
}
static void iwlagn_mac_update_tkip_key(struct ieee80211_hw *hw,
diff --git a/drivers/net/wireless/iwlwifi/dvm/main.c b/drivers/net/wireless/iwlwifi/dvm/main.c
index 7ff3f14..408132c 100644
--- a/drivers/net/wireless/iwlwifi/dvm/main.c
+++ b/drivers/net/wireless/iwlwifi/dvm/main.c
@@ -2114,7 +2114,7 @@ static void iwl_free_skb(struct iwl_op_mode *op_mode, struct sk_buff *skb)
info = IEEE80211_SKB_CB(skb);
iwl_trans_free_tx_cmd(priv->trans, info->driver_data[1]);
- dev_kfree_skb_any(skb);
+ ieee80211_free_txskb(priv->hw, skb);
}
static void iwl_set_hw_rfkill_state(struct iwl_op_mode *op_mode, bool state)
diff --git a/drivers/net/wireless/iwlwifi/pcie/rx.c b/drivers/net/wireless/iwlwifi/pcie/rx.c
index 17c8e5d..bb69f8f 100644
--- a/drivers/net/wireless/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/iwlwifi/pcie/rx.c
@@ -321,6 +321,14 @@ static void iwl_rx_allocate(struct iwl_trans *trans, gfp_t priority)
dma_map_page(trans->dev, page, 0,
PAGE_SIZE << trans_pcie->rx_page_order,
DMA_FROM_DEVICE);
+ if (dma_mapping_error(trans->dev, rxb->page_dma)) {
+ rxb->page = NULL;
+ spin_lock_irqsave(&rxq->lock, flags);
+ list_add(&rxb->list, &rxq->rx_used);
+ spin_unlock_irqrestore(&rxq->lock, flags);
+ __free_pages(page, trans_pcie->rx_page_order);
+ return;
+ }
/* dma address must be no more than 36 bits */
BUG_ON(rxb->page_dma & ~DMA_BIT_MASK(36));
/* and also 256 byte aligned! */
@@ -488,8 +496,19 @@ static void iwl_rx_handle_rxbuf(struct iwl_trans *trans,
dma_map_page(trans->dev, rxb->page, 0,
PAGE_SIZE << trans_pcie->rx_page_order,
DMA_FROM_DEVICE);
- list_add_tail(&rxb->list, &rxq->rx_free);
- rxq->free_count++;
+ if (dma_mapping_error(trans->dev, rxb->page_dma)) {
+ /*
+ * free the page(s) as well to not break
+ * the invariant that the items on the used
+ * list have no page(s)
+ */
+ __free_pages(rxb->page, trans_pcie->rx_page_order);
+ rxb->page = NULL;
+ list_add_tail(&rxb->list, &rxq->rx_used);
+ } else {
+ list_add_tail(&rxb->list, &rxq->rx_free);
+ rxq->free_count++;
+ }
} else
list_add_tail(&rxb->list, &rxq->rx_used);
spin_unlock_irqrestore(&rxq->lock, flags);
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 8a0ce70..a0a2f97 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -1754,11 +1754,11 @@ int hci_register_dev(struct hci_dev *hdev)
if (hdev->dev_type != HCI_AMP)
set_bit(HCI_AUTO_OFF, &hdev->dev_flags);
- schedule_work(&hdev->power_on);
-
hci_notify(hdev, HCI_DEV_REG);
hci_dev_hold(hdev);
+ schedule_work(&hdev->power_on);
+
return id;
err_wqueue:
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index aa2ea0a..91de423 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -326,7 +326,7 @@ static int read_index_list(struct sock *sk, struct hci_dev *hdev, void *data,
struct hci_dev *d;
size_t rp_len;
u16 count;
- int i, err;
+ int err;
BT_DBG("sock %p", sk);
@@ -347,9 +347,7 @@ static int read_index_list(struct sock *sk, struct hci_dev *hdev, void *data,
return -ENOMEM;
}
- rp->num_controllers = cpu_to_le16(count);
-
- i = 0;
+ count = 0;
list_for_each_entry(d, &hci_dev_list, list) {
if (test_bit(HCI_SETUP, &d->dev_flags))
continue;
@@ -357,10 +355,13 @@ static int read_index_list(struct sock *sk, struct hci_dev *hdev, void *data,
if (!mgmt_valid_hdev(d))
continue;
- rp->index[i++] = cpu_to_le16(d->id);
+ rp->index[count++] = cpu_to_le16(d->id);
BT_DBG("Added hci%u", d->id);
}
+ rp->num_controllers = cpu_to_le16(count);
+ rp_len = sizeof(*rp) + (2 * count);
+
read_unlock(&hci_dev_list_lock);
err = cmd_complete(sk, MGMT_INDEX_NONE, MGMT_OP_READ_INDEX_LIST, 0, rp,
@@ -1366,6 +1367,7 @@ static int remove_uuid(struct sock *sk, struct hci_dev *hdev, void *data,
continue;
list_del(&match->list);
+ kfree(match);
found++;
}
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 2ac8d50..a592337 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -267,7 +267,7 @@ static void smp_failure(struct l2cap_conn *conn, u8 reason, u8 send)
clear_bit(HCI_CONN_ENCRYPT_PEND, &conn->hcon->flags);
mgmt_auth_failed(conn->hcon->hdev, conn->dst, hcon->type,
- hcon->dst_type, reason);
+ hcon->dst_type, HCI_ERROR_AUTH_FAILURE);
cancel_delayed_work_sync(&conn->security_timer);
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 05f3a31..7371f67 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -2594,6 +2594,9 @@ static void ieee80211_mgmt_frame_register(struct wiphy *wiphy,
else
local->probe_req_reg--;
+ if (!local->open_count)
+ break;
+
ieee80211_queue_work(&local->hw, &local->reconfig_filter);
break;
default:
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 8c80455..156e583 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1314,6 +1314,8 @@ netdev_tx_t ieee80211_monitor_start_xmit(struct sk_buff *skb,
struct net_device *dev);
netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff *skb,
struct net_device *dev);
+void ieee80211_purge_tx_queue(struct ieee80211_hw *hw,
+ struct sk_buff_head *skbs);
/* HT */
void ieee80211_apply_htcap_overrides(struct ieee80211_sub_if_data *sdata,
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index c80c449..f57f597 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -871,8 +871,10 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
local->hw.wiphy->cipher_suites,
sizeof(u32) * local->hw.wiphy->n_cipher_suites,
GFP_KERNEL);
- if (!suites)
- return -ENOMEM;
+ if (!suites) {
+ result = -ENOMEM;
+ goto fail_wiphy_register;
+ }
for (r = 0; r < local->hw.wiphy->n_cipher_suites; r++) {
u32 suite = local->hw.wiphy->cipher_suites[r];
if (suite == WLAN_CIPHER_SUITE_WEP40 ||
diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
index c4cdbde..43e60b5 100644
--- a/net/mac80211/scan.c
+++ b/net/mac80211/scan.c
@@ -917,7 +917,7 @@ int ieee80211_request_sched_scan_start(struct ieee80211_sub_if_data *sdata,
struct cfg80211_sched_scan_request *req)
{
struct ieee80211_local *local = sdata->local;
- struct ieee80211_sched_scan_ies sched_scan_ies;
+ struct ieee80211_sched_scan_ies sched_scan_ies = {};
int ret, i;
mutex_lock(&local->mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 0a4e4c0..d2eb64e 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -117,8 +117,8 @@ static void free_sta_work(struct work_struct *wk)
for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
local->total_ps_buffered -= skb_queue_len(&sta->ps_tx_buf[ac]);
- __skb_queue_purge(&sta->ps_tx_buf[ac]);
- __skb_queue_purge(&sta->tx_filtered[ac]);
+ ieee80211_purge_tx_queue(&local->hw, &sta->ps_tx_buf[ac]);
+ ieee80211_purge_tx_queue(&local->hw, &sta->tx_filtered[ac]);
}
#ifdef CONFIG_MAC80211_MESH
@@ -141,7 +141,7 @@ static void free_sta_work(struct work_struct *wk)
tid_tx = rcu_dereference_raw(sta->ampdu_mlme.tid_tx[i]);
if (!tid_tx)
continue;
- __skb_queue_purge(&tid_tx->pending);
+ ieee80211_purge_tx_queue(&local->hw, &tid_tx->pending);
kfree(tid_tx);
}
@@ -961,6 +961,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
struct ieee80211_local *local = sdata->local;
struct sk_buff_head pending;
int filtered = 0, buffered = 0, ac;
+ unsigned long flags;
clear_sta_flag(sta, WLAN_STA_SP);
@@ -976,12 +977,16 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
int count = skb_queue_len(&pending), tmp;
+ spin_lock_irqsave(&sta->tx_filtered[ac].lock, flags);
skb_queue_splice_tail_init(&sta->tx_filtered[ac], &pending);
+ spin_unlock_irqrestore(&sta->tx_filtered[ac].lock, flags);
tmp = skb_queue_len(&pending);
filtered += tmp - count;
count = tmp;
+ spin_lock_irqsave(&sta->ps_tx_buf[ac].lock, flags);
skb_queue_splice_tail_init(&sta->ps_tx_buf[ac], &pending);
+ spin_unlock_irqrestore(&sta->ps_tx_buf[ac].lock, flags);
tmp = skb_queue_len(&pending);
buffered += tmp - count;
}
diff --git a/net/mac80211/status.c b/net/mac80211/status.c
index 3af0cc4..101eb88 100644
--- a/net/mac80211/status.c
+++ b/net/mac80211/status.c
@@ -668,3 +668,12 @@ void ieee80211_free_txskb(struct ieee80211_hw *hw, struct sk_buff *skb)
dev_kfree_skb_any(skb);
}
EXPORT_SYMBOL(ieee80211_free_txskb);
+
+void ieee80211_purge_tx_queue(struct ieee80211_hw *hw,
+ struct sk_buff_head *skbs)
+{
+ struct sk_buff *skb;
+
+ while ((skb = __skb_dequeue(skbs)))
+ ieee80211_free_txskb(hw, skb);
+}
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index c9bf83f..b858ebe 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1358,7 +1358,7 @@ static int invoke_tx_handlers(struct ieee80211_tx_data *tx)
if (tx->skb)
ieee80211_free_txskb(&tx->local->hw, tx->skb);
else
- __skb_queue_purge(&tx->skbs);
+ ieee80211_purge_tx_queue(&tx->local->hw, &tx->skbs);
return -1;
} else if (unlikely(res == TX_QUEUED)) {
I802_DEBUG_INC(tx->local->tx_handlers_queued);
@@ -2120,10 +2120,13 @@ netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff *skb,
*/
void ieee80211_clear_tx_pending(struct ieee80211_local *local)
{
+ struct sk_buff *skb;
int i;
- for (i = 0; i < local->hw.queues; i++)
- skb_queue_purge(&local->pending[i]);
+ for (i = 0; i < local->hw.queues; i++) {
+ while ((skb = skb_dequeue(&local->pending[i])) != NULL)
+ ieee80211_free_txskb(&local->hw, skb);
+ }
}
/*
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 2393918..0151ae3 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1491,6 +1491,8 @@ int ieee80211_reconfig(struct ieee80211_local *local)
list_for_each_entry(sdata, &local->interfaces, list) {
if (sdata->vif.type != NL80211_IFTYPE_STATION)
continue;
+ if (!sdata->u.mgd.associated)
+ continue;
ieee80211_send_nullfunc(local, sdata, 0);
}
diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index bcc7d7e..b75756b 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -141,9 +141,8 @@ static const struct ieee80211_regdomain world_regdom = {
.reg_rules = {
/* IEEE 802.11b/g, channels 1..11 */
REG_RULE(2412-10, 2462+10, 40, 6, 20, 0),
- /* IEEE 802.11b/g, channels 12..13. No HT40
- * channel fits here. */
- REG_RULE(2467-10, 2472+10, 20, 6, 20,
+ /* IEEE 802.11b/g, channels 12..13. */
+ REG_RULE(2467-10, 2472+10, 40, 6, 20,
NL80211_RRF_PASSIVE_SCAN |
NL80211_RRF_NO_IBSS),
/* IEEE 802.11 channel 14 - Only JP enables
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
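For readers skimming the net/bluetooth/mgmt.c hunk above: the fix derives the reported controller count from the same loop that fills the index array, so skipped devices can no longer leave bogus trailing entries. A minimal userspace sketch of that pattern follows. The demo_dev structure and demo_fill_index_list() are hypothetical stand-ins for illustration, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hci_dev; only what the sketch needs. */
struct demo_dev {
	uint16_t id;
	int in_setup;	/* models test_bit(HCI_SETUP, &d->dev_flags) */
	int valid;	/* models mgmt_valid_hdev(d) */
};

/*
 * Copy the IDs of reportable devices into out[] and return how many were
 * written. The count comes from the filtering loop itself, so it always
 * matches the number of entries actually stored.
 */
static size_t demo_fill_index_list(const struct demo_dev *devs, size_t n,
				   uint16_t *out)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].in_setup)
			continue;
		if (!devs[i].valid)
			continue;
		out[count++] = devs[i].id;
	}

	return count;
}
```

The caller would size the reply from the returned count, which is what the added rp_len assignment does in the real patch.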
^ permalink raw reply related
* Re: [PATCH] tcp: handle tcp_net_metrics_init() order-5 memory allocation failures
From: David Miller @ 2012-11-16 18:37 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, jln
In-Reply-To: <1353079913.10798.31.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 16 Nov 2012 07:31:53 -0800
> Well, we don't really know what the size needs to be, and your proposal
> reduces the size by a factor of 4, even for the initial namespace.
>
> Julien's report was about the Chrome browser's own netns, on a suspend/resume
> cycle (or something like that)
>
> If size can influence behavior, we could try a vmalloc() if kmalloc()
> fails...
Agreed.
> [PATCH v3] tcp: handle tcp_net_metrics_init() order-5 memory allocation failures
>
> order-5 allocations can fail with current kernels, we should
> try vmalloc() as well.
>
> Reported-by: Julien Tinnes <jln@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
This looks great, applied, thanks.
^ permalink raw reply
* Re: pull request: batman-adv 2012-11-16
From: David Miller @ 2012-11-16 18:39 UTC (permalink / raw)
To: ordex; +Cc: netdev, simon.wunderlich, lindner_marek, sven
In-Reply-To: <1353055758-2901-1-git-send-email-ordex@autistici.org>
From: Antonio Quartulli <ordex@autistici.org>
Date: Fri, 16 Nov 2012 09:49:14 +0100
> here is a small set of fixes intended for net/linux-3.7.
> These patches fix some interoperability problems caused by the features we
> added in 3.7. There are two main issues: the first prevents clients connected
> to the mesh network from contacting any other host, due to improper
> translation table handling; the second breaks the AP isolation feature,
> making it completely useless whether it is on or off.
Pulled, thanks.
^ permalink raw reply
* Re: [PATCH net-next] net: use right lock in __dev_remove_offload
From: David Miller @ 2012-11-16 18:41 UTC (permalink / raw)
To: vyasevic; +Cc: eric.dumazet, netdev
In-Reply-To: <50A682DA.2060201@redhat.com>
From: Vlad Yasevich <vyasevic@redhat.com>
Date: Fri, 16 Nov 2012 13:15:54 -0500
> On 11/16/2012 01:08 PM, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
>>
>> offload_base is protected by offload_lock, not ptype_lock
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
...
> Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Applied, thanks everyone.
^ permalink raw reply
* Re: [RFC PATCH] tcp: introduce raw access to experimental options
From: David Miller @ 2012-11-16 18:44 UTC (permalink / raw)
To: elelueck; +Cc: netdev, frankbla, raspl, ubacher, samudrala
In-Reply-To: <1353084898-42264-1-git-send-email-elelueck@linux.vnet.ibm.com>
Unprivileged access to set and fetch these things? I don't think
that's a good idea.
Also, your code has a lot of coding style errors.
^ permalink raw reply
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: Jesse Gross @ 2012-11-16 18:46 UTC (permalink / raw)
To: vyasevic-H+wXaHxf7aLQT0dZR+AlfA
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <50A67B44.9040508-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Fri, Nov 16, 2012 at 9:43 AM, Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 11/16/2012 12:26 PM, Jesse Gross wrote:
>>
>> On Fri, Nov 16, 2012 at 7:40 AM, Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>> wrote:
>>
>>> Openvswitch attempts to use IPv6 packet parsing functions without
>>> any dependency on IPv6 (unlike every other place in kernel). Pull
>>> the IPv6 code in openvswitch together and put a conditional that's
>>> dependent on CONFIG_IPV6.
>>>
>>> Resolves:
>>> net/built-in.o: In function `ovs_flow_extract':
>>> (.text+0xbf5d5): undefined reference to `ipv6_skip_exthdr'
>>>
>>> Signed-off-by: Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>>
>>
>>
>> Doesn't this move in the opposite direction of your patches to make IPv6
>> GSO/GRO always available? The packets being processed here
>> are generally created by the guest but with Open vSwitch running on the
>> host. Also, ipv6_skip_exthdr() is in exthdrs_core.c, so it actually is
>> always available. I suspect that the real problem is that the dependency
>> on the ipv6 directory changed to CONFIG_INET and Open vSwitch should now
>> depend on this.
>>
>
> Yes and no... :) IPv6 uses a bunch of IPv4 code all over. IPv4 is enabled
> with CONFIG_INET and IPv6 with CONFIG_NET. This creates a strange imbalance.
> By shifting IPv6 to CONFIG_INET (which is where it
> lives and what enables its selection during config process), we now have a
> dependency with openvswitch.
>
> All other users of ipv6_skip_exthdr have it either under the IS_ENABLED
> conditional or through some other means that don't build it when INET is
> completely turned off. This patch does the same for openvswitch.
>
> I see 2 alternatives to this:
> 1) Make openvswitch depend on CONFIG_INET.
> 2) Pull a ton of code out of CONFIG_INET (v4 and v6) and into CONFIG_NET.
> This could start with IPv6 header parsing and maybe even
> include GSO/TSO (but not sure how much sense that would make).
>
> What's your take?
I agree the IPv4 and IPv6 code is all tangled together and that IPv6
should use CONFIG_INET as well. I think in an ideal world we would
separate them out but it seems like a lot of work for not much
practical benefit.
I think if you took this to the logical extension and restricted all
the protocols based on the kernel config (i.e. IPv4, TCP, and UDP are
conditional on CONFIG_INET) then you end up with a confusing mess. On
the other hand, if you do it only for IPv6 it's also confusing because
the fact that the other protocols are simple enough to parse on their
own and IPv6 is more complicated is really an implementation detail
that shouldn't be exposed.
I guess the simplest thing to do seems to just make Open vSwitch
depend on CONFIG_INET seeing as it is practically useless without
upper layer protocol support anyways.
^ permalink raw reply
* Re: [PATCH] tcp: handle tcp_net_metrics_init() order-5 memory allocation failures
From: Julien Tinnes @ 2012-11-16 18:51 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1353079913.10798.31.camel@edumazet-glaptop>
On Fri, Nov 16, 2012 at 7:31 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-11-16 at 01:39 -0500, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Thu, 15 Nov 2012 15:41:04 -0800
>>
>> > From: Eric Dumazet <edumazet@google.com>
>> >
>> > order-5 allocations can fail with current kernels, we should
>> > try to reduce allocation sizes to allow network namespace
>> > creation.
>> >
>> > Reported-by: Julien Tinnes <jln@google.com>
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>>
>> Indeed, this has to be done better.
>>
>> But this kind of retry solution results in non-deterministic behavior.
>> Yes the tcp metrics cache is best effort, but its size can influence
>> behavior in a substantial way depending upon the workload.
>>
>> I would suggest that we instead use different limits, ones which the
>> page allocator will satisfy for us always with GFP_KERNEL.
>>
>> 1) include linux/mmzone.h
>>
>> 2) Make the two limits based upon PAGE_ALLOC_COSTLY_ORDER.
>>
>> That is, make the larger table size PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER
>> and the smaller one PAGE_SIZE << (PAGE_ALLOC_COSTLY_ORDER - 1).
>
> Well, we don't really know what the size needs to be, and your proposal
> reduces the size by a factor of 4, even for the initial namespace.
>
> Julien's report was about the Chrome browser's own netns, on a suspend/resume
> cycle (or something like that)
It happens when users start Chrome. Chrome will create one new network
NS (for the sandbox).
This has been used for a few years now, but we had our first report in
January of this year and we've been getting a few reports very
recently at a rate that is starting to worry me (crbug.com/110756).
Thanks a lot for helping with this!
Julien
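As a side note on the sizes quoted above, the "factor of 4" follows directly from the allocation orders. The sketch below works through the arithmetic under two assumptions that are not stated in the thread: 4 KiB pages, and PAGE_ALLOC_COSTLY_ORDER being 3 as in mainline. The DEMO_ names are made up for illustration.

```c
#include <stddef.h>

/* Assumptions: 4 KiB pages; PAGE_ALLOC_COSTLY_ORDER is 3 in mainline. */
#define DEMO_PAGE_SIZE			4096UL
#define DEMO_PAGE_ALLOC_COSTLY_ORDER	3

/* Bytes covered by one physically contiguous allocation of this order. */
static unsigned long demo_order_to_bytes(unsigned int order)
{
	return DEMO_PAGE_SIZE << order;
}
```

Under those assumptions an order-5 allocation is 128 KiB, while PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER is 32 KiB, which is the fourfold reduction Eric objects to.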
^ permalink raw reply
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: David Miller @ 2012-11-16 18:53 UTC (permalink / raw)
To: jesse; +Cc: vyasevic, dev, netdev, fengguang.wu
In-Reply-To: <CAEP_g=80_P=sLkQCGXwdTLKOFN5FgqYYoDbVDXMTipbtM-GhbA@mail.gmail.com>
From: Jesse Gross <jesse@nicira.com>
Date: Fri, 16 Nov 2012 10:46:17 -0800
> I guess the simplest thing to do seems to just make Open vSwitch
> depend on CONFIG_INET seeing as it is practically useless without
> upper layer protocol support anyways.
The reason we have the ipv6 extension header parsing in a separate,
always compiled statically into the kernel, module is exactly for
situations like this.
We need to think seriously if we want to go down this road of only
using INET as protection for every module that has some kind of ipv6
component to it.
^ permalink raw reply
* [PATCH] net-rps: Fix brokeness causing OOO packets
From: Eric Dumazet @ 2012-11-16 19:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Ben Hutchings
From: Tom Herbert <therbert@google.com>
In commit c445477d74ab3779 which adds aRFS to the kernel, the CPU
selected for RFS is not set correctly when CPU is changing.
This is causing OOO packets and probably other issues.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
---
net/core/dev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index bda6d00..c0946cb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2818,8 +2818,10 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
if (unlikely(tcpu != next_cpu) &&
(tcpu == RPS_NO_CPU || !cpu_online(tcpu) ||
((int)(per_cpu(softnet_data, tcpu).input_queue_head -
- rflow->last_qtail)) >= 0))
+ rflow->last_qtail)) >= 0)) {
+ tcpu = next_cpu;
rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
+ }
if (tcpu != RPS_NO_CPU && cpu_online(tcpu)) {
*rflowp = rflow;
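To make the one-line fix easier to follow, here is a simplified userspace sketch of the decision the hunk above changes. It is not the kernel code: demo_flow, DEMO_NO_CPU and demo_pick_cpu() are hypothetical names, the per-CPU queue head is passed in as a plain argument, and the set_rps_cpu() call is reduced to recording the new CPU in the flow.

```c
#include <stdint.h>

#define DEMO_NO_CPU 0xffffu	/* hypothetical stand-in for RPS_NO_CPU */

struct demo_flow {
	uint16_t cpu;		/* CPU currently recorded for the flow */
	uint32_t last_qtail;	/* queue tail when the flow was last enqueued */
};

/*
 * Return the CPU the packet should be steered to. old_cpu_online models
 * cpu_online(tcpu) for the CPU currently recorded in the flow, and
 * queue_head models that CPU's input_queue_head.
 */
static uint16_t demo_pick_cpu(struct demo_flow *flow, uint16_t next_cpu,
			      uint32_t queue_head, int old_cpu_online)
{
	uint16_t tcpu = flow->cpu;

	/* Switch only once the old CPU has drained this flow's packets. */
	if (tcpu != next_cpu &&
	    (tcpu == DEMO_NO_CPU || !old_cpu_online ||
	     (int32_t)(queue_head - flow->last_qtail) >= 0)) {
		tcpu = next_cpu;	/* the assignment the patch adds */
		flow->cpu = next_cpu;	/* simplified model of set_rps_cpu() */
	}

	return tcpu;
}
```

Without the added assignment, the function would keep returning the stale CPU even though the flow had already been moved, which is consistent with the out-of-order delivery the changelog describes.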
^ permalink raw reply related
* Re: [PATCH] tcp: handle tcp_net_metrics_init() order-5 memory allocation failures
From: Eric Dumazet @ 2012-11-16 19:08 UTC (permalink / raw)
To: Julien Tinnes; +Cc: David Miller, netdev
In-Reply-To: <CAKyRK=iKnoRZbjyEXSYbKoFq8=wtVKAJYTQYeE9y_84YevdagA@mail.gmail.com>
On Fri, 2012-11-16 at 10:51 -0800, Julien Tinnes wrote:
> It happens when users start Chrome. Chrome will create one new network
> NS (for the sandbox).
>
> This has been used for a few years now, but we had our first report in
> January of this year and we've been getting a few reports very
> recently at a rate that is starting to worry me (crbug.com/110756).
>
> Thanks a lot for helping with this!
Thanks for bringing this issue to our attention!
^ permalink raw reply
* Re: [PATCH] net-rps: Fix brokeness causing OOO packets
From: David Miller @ 2012-11-16 19:36 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, bhutchings
In-Reply-To: <1353092655.10798.44.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 16 Nov 2012 11:04:15 -0800
> From: Tom Herbert <therbert@google.com>
>
> In commit c445477d74ab3779 which adds aRFS to the kernel, the CPU
> selected for RFS is not set correctly when CPU is changing.
> This is causing OOO packets and probably other issues.
>
> Signed-off-by: Tom Herbert <therbert@google.com>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Applied and queued up for -stable, thanks everyone.
^ permalink raw reply
* Re: Optics (SFP) monitoring on ixgbe and igbe
From: Ben Hutchings @ 2012-11-16 19:38 UTC (permalink / raw)
To: footplus; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <CAPN4dA_ymo-Bx+GM+JLLKGHghq+qBYxR0zO7K6H6Nn0pqZsJRg@mail.gmail.com>
On Fri, 2012-11-16 at 03:23 +0100, Aurélien wrote:
> On Fri, Nov 16, 2012 at 12:30 AM, Ben Hutchings
> <bhutchings@solarflare.com> wrote:
> >
> > Yes, Jeff's the one you should be talking to about these drivers. I
> > just look after the ethtool utility and API.
> >
>
> Ok, so I will discuss the ixgbe patch with Jeff :)
>
> Ben, on the ethtool side, attached is a patch to enable the following
> option and output. It's still missing externally calibrated optics
> support (my current one is internally calibrated, so it's difficult
> to test that path). What do you think? Is there any other data that
> could be interesting to show with -O or -m options?
[...]
> --- a/configure.ac
> +++ b/configure.ac
> @@ -13,9 +13,11 @@ AC_PROG_GCC_TRADITIONAL
> AM_PROG_CC_C_O
>
> dnl Checks for libraries.
> +AC_CHECK_LIB([m], [log10])
>
> dnl Checks for header files.
> AC_CHECK_HEADERS(sys/ioctl.h)
> +AC_CHECK_HEADERS(math.h)
This is silly; log10() and <math.h> are part of standard C and -lm is
standard on Unix. Just use <math.h> and -lm unconditionally.
> dnl Checks for typedefs, structures, and compiler characteristics.
> AC_MSG_CHECKING([whether <linux/types.h> defines big-endian types])
> diff --git a/ethtool.c b/ethtool.c
> index 3db7fec..e18fc85 100644
> --- a/ethtool.c
> +++ b/ethtool.c
> @@ -3549,6 +3549,47 @@ static int do_tsinfo(struct cmd_context *ctx)
> return 0;
> }
>
> +static int do_getmoduleoptics(struct cmd_context *ctx)
> +{
> + struct ethtool_modinfo modinfo;
> + struct ethtool_eeprom *eeprom;
> + int err;
> +
> + modinfo.cmd = ETHTOOL_GMODULEINFO;
> + err = send_ioctl(ctx, &modinfo);
> + if (err < 0) {
> + perror("Cannot get module information");
> + return 1;
> + }
> +
> + if (modinfo.type != ETH_MODULE_SFF_8472)
> + {
> + perror("Module is not SFF-8472 (DOM) compliant");
> + return 1;
> + }
> +
> + eeprom = calloc(1, sizeof(*eeprom) + modinfo.eeprom_len);
> + if (!eeprom) {
> + perror("Cannot allocate memory for module EEPROM data");
> + return 1;
> + }
> +
> + eeprom->cmd = ETHTOOL_GMODULEEEPROM;
> + eeprom->len = modinfo.eeprom_len;
> + eeprom->offset = 0;
> + err = send_ioctl(ctx, eeprom);
> + if (err < 0) {
> + perror("Cannot access module EEPROM");
> + free(eeprom);
> + return 1;
> + }
> +
> + printf("Physical interface: %s\n", ctx->devname);
> + sff8472_show_all(eeprom->data);
> + free(eeprom);
> + return 0;
> +}
Please merge this with the existing -m option and update the
documentation to say that this covers diagnostics where available. You
could add a long option alias like --dump-module or --module-info that
covers the two types of information.
> static int do_getmodule(struct cmd_context *ctx)
> {
> struct ethtool_modinfo modinfo;
> @@ -3832,11 +3873,13 @@ static const struct option {
> { "--set-priv-flags", 1, do_sprivflags, "Set private flags",
> " FLAG on|off ...\n" },
> { "-m|--dump-module-eeprom", 1, do_getmodule,
> - "Qeuery/Decode Module EEPROM information",
> + "Query/Decode Module EEPROM information",
> " [ raw on|off ]\n"
> " [ hex on|off ]\n"
> " [ offset N ]\n"
> " [ length N ]\n" },
> + { "-O|--module-optics", 1, do_getmoduleoptics,
> + "Show module optical diagnostics" },
> { "--show-eee", 1, do_geee, "Show EEE settings"},
> { "--set-eee", 1, do_seee, "Set EEE settings",
> " [ eee on|off ]\n"
> diff --git a/internal.h b/internal.h
> index 4f96fd5..e977a81 100644
> --- a/internal.h
> +++ b/internal.h
> @@ -253,4 +253,7 @@ int rxclass_rule_del(struct cmd_context *ctx, __u32 loc);
> /* Module EEPROM parsing code */
> void sff8079_show_all(const __u8 *id);
>
> +/* Optics diagnostics */
> +void sff8472_show_all(const __u8 *id);
> +
> #endif /* ETHTOOL_INTERNAL_H__ */
> diff --git a/sfpdiag.c b/sfpdiag.c
> new file mode 100644
> index 0000000..aa7c14c
> --- /dev/null
> +++ b/sfpdiag.c
[...]
> +#define SFF_A2_TEMP 0x100 + 96
> +#define SFF_A2_TEMP_HALRM 0x100 + 0
[...]
> +#define SFF_A2_ALRM_FLG 0x100 + 112
> +#define SFF_A2_WARN_FLG 0x100 + 116
All the above offsets need parentheses around their definitions.
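To illustrate why, here is a small sketch of how an unparenthesized offset can go wrong once it is used inside a larger expression. The DEMO_ macros and helper functions are made up for this example; only the 0x100 + 96 value is taken from the quoted patch.

```c
/* Unparenthesized, as in the quoted patch: the expansion is purely textual. */
#define DEMO_BAD_TEMP	0x100 + 96
/* Parenthesized, so the macro behaves as a single value. */
#define DEMO_GOOD_TEMP	(0x100 + 96)

/* Expands to 2 * 0x100 + 96, i.e. 608, not twice the offset. */
static int demo_bad_scaled(void)
{
	return 2 * DEMO_BAD_TEMP;
}

/* Expands to 2 * (0x100 + 96), i.e. 704. */
static int demo_good_scaled(void)
{
	return 2 * DEMO_GOOD_TEMP;
}
```

Plain array indexing such as id[SFF_A2_TEMP] happens to work either way, which is how this kind of bug tends to stay hidden until someone writes SFF_A2_TEMP * 2 or similar.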
> +struct sff8472_diags {
> +
> +#define MCURR 0
> +#define LWARN 1
> +#define HWARN 2
> +#define LALRM 3
> +#define HALRM 4
> +
> + /* [5] tables are current, low/high warn, low/high alarm */
> + __u8 supports_dom; /* Supports DOM */
> + __u8 supports_alarms; /* Supports alarm/warning thold */
> + __u8 calibrated_int; /* Is internally calibrated */
> + __u16 bias_cur[5]; /* Measured bias current in 2uA units (cur, l/h warn, l/h alarm) */
> + __u16 tx_power[5]; /* Measured TX Power in 0.1uW units (cur, warn, alarm) */
> + __u16 rx_power[5]; /* Measured RX Power (cur, warn, alarm) */
> + __u8 rx_power_type; /* 0 = OMA, 1 = Average power */
> + __s16 sfp_temp[5]; /* SFP Temp in 0.1 Celsius (cur, warn, alarm) */
> + __u16 sfp_voltage[5]; /* SFP voltage in 0.1mV units (cur, warn, alarm) */
> +
> +};
> +
> +static struct sff8472_aw_flags {
> + const char *str; /* Human-readable string, null at the end */
> + int offset; /* A2-relative address offset */
This is commented as an offset in the A2 'EEPROM' but the offsets
actually used include the 0x100 offset from the start of the
concatenated 'EEPROM'.
> + __u8 value; /* 1-bit mask, alarm is on if offset & value != 0. */
> +} sff8472_aw_flags[] =
> +{
> + { "Laser bias current high alarm", SFF_A2_ALRM_FLG, (1 << 3) },
> + { "Laser bias current low alarm", SFF_A2_ALRM_FLG, (1 << 2) },
> + { "Laser bias current high warning", SFF_A2_WARN_FLG, (1 << 3) },
> + { "Laser bias current low warning", SFF_A2_WARN_FLG, (1 << 2) },
> +
> + { "Laser output power high alarm", SFF_A2_ALRM_FLG, (1 << 1) },
> + { "Laser output power low alarm", SFF_A2_ALRM_FLG, (1 << 0) },
> + { "Laser output power high warning", SFF_A2_WARN_FLG, (1 << 1) },
> + { "Laser output power low warning", SFF_A2_WARN_FLG, (1 << 0) },
> +
> + { "Module temperature high alarm", SFF_A2_ALRM_FLG, (1 << 7) },
> + { "Module temperature low alarm", SFF_A2_ALRM_FLG, (1 << 6) },
> + { "Module temperature high warning", SFF_A2_WARN_FLG, (1 << 7) },
> + { "Module temperature low warning", SFF_A2_WARN_FLG, (1 << 6) },
> +
> + { "Module voltage high alarm", SFF_A2_ALRM_FLG, (1 << 5) },
> + { "Module voltage low alarm", SFF_A2_ALRM_FLG, (1 << 4) },
> + { "Module voltage high warning", SFF_A2_WARN_FLG, (1 << 5) },
> + { "Module voltage low warning", SFF_A2_WARN_FLG, (1 << 4) },
> +
> + { "Laser rx power high alarm", SFF_A2_ALRM_FLG + 1, (1 << 7) },
> + { "Laser rx power low alarm", SFF_A2_ALRM_FLG + 1, (1 << 6) },
> + { "Laser rx power high warning", SFF_A2_WARN_FLG + 1, (1 << 7) },
> + { "Laser rx power low warning", SFF_A2_WARN_FLG + 1, (1 << 6) },
> +
> + { NULL, 0, 0 },
> +};
> +
> +#ifdef HAVE_LIBM
> +
> +static double convert_mw_to_dbm(double mw)
> +{
> + return (10.f * log10(mw / 1000.f)) + 30.f;
Why are all the literals explicitly float and not double?
> +}
> +
> +#endif
> +
> +/* Externally calibrated SFP calculations */
> +#define ECAL(v, s, o) (( ((double) (s>>8)) + (s & 0xFF)) * (double) v + o)
Please follow kernel coding style for spacing. checkpatch.pl will show
you what should be changed.
> +static void sff8472_parse_eeprom(const __u8 *id, struct sff8472_diags *sd)
> +{
> + sd->supports_dom = id[SFF_A0_DOM] & SFF_A0_DOM_IMPL;
> + sd->supports_alarms = id[SFF_A0_OPTIONS] & SFF_A0_OPTIONS_AW;
> + sd->calibrated_int = id[SFF_A0_DOM] & SFF_A0_DOM_INTCAL;
> + sd->rx_power_type = id[SFF_A0_DOM] & SFF_A0_DOM_PWRT;
> +
> +
> +#define OFFSET_TO_U16(offset) (id[(offset)] << 8 | id[(offset) + 1])
> +
> + sd->bias_cur[MCURR] = OFFSET_TO_U16(SFF_A2_BIAS);
> + sd->bias_cur[HALRM] = OFFSET_TO_U16(SFF_A2_BIAS_HALRM);
> + sd->bias_cur[LALRM] = OFFSET_TO_U16(SFF_A2_BIAS_LALRM);
> + sd->bias_cur[HWARN] = OFFSET_TO_U16(SFF_A2_BIAS_HWARN);
> + sd->bias_cur[LWARN] = OFFSET_TO_U16(SFF_A2_BIAS_LWARN);
> +
> + sd->sfp_voltage[MCURR] = OFFSET_TO_U16(SFF_A2_VCC);
> + sd->sfp_voltage[HALRM] = OFFSET_TO_U16(SFF_A2_VCC_HALRM);
> + sd->sfp_voltage[LALRM] = OFFSET_TO_U16(SFF_A2_VCC_LALRM);
> + sd->sfp_voltage[HWARN] = OFFSET_TO_U16(SFF_A2_VCC_HWARN);
> + sd->sfp_voltage[LWARN] = OFFSET_TO_U16(SFF_A2_VCC_LWARN);
> +
> + sd->tx_power[MCURR] = OFFSET_TO_U16(SFF_A2_TX_PWR);
> + sd->tx_power[HALRM] = OFFSET_TO_U16(SFF_A2_TX_PWR_HALRM);
> + sd->tx_power[LALRM] = OFFSET_TO_U16(SFF_A2_TX_PWR_LALRM);
> + sd->tx_power[HWARN] = OFFSET_TO_U16(SFF_A2_TX_PWR_HWARN);
> + sd->tx_power[LWARN] = OFFSET_TO_U16(SFF_A2_TX_PWR_LWARN);
> +
> + sd->rx_power[MCURR] = OFFSET_TO_U16(SFF_A2_RX_PWR);
> + sd->rx_power[HALRM] = OFFSET_TO_U16(SFF_A2_RX_PWR_HALRM);
> + sd->rx_power[LALRM] = OFFSET_TO_U16(SFF_A2_RX_PWR_LALRM);
> + sd->rx_power[HWARN] = OFFSET_TO_U16(SFF_A2_RX_PWR_HWARN);
> + sd->rx_power[LWARN] = OFFSET_TO_U16(SFF_A2_RX_PWR_LWARN);
> +
> + /* Temperature conversions */
> +#define OFFSET_TO_TEMP(offset) \
> + ((*(__s8 *)(&id[(offset)])) * 1000 + ((id[(offset) + 1] * 1000) / 256)) / 100;
This seems awfully complicated; why not:
#define OFFSET_TO_TEMP(offset) (((s16)OFFSET_TO_U16(offset)) * 10 / 256)
But why round to tenths of a degree here and then round again to whole
degrees Celsius/Fahrenheit when printing?
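A rough sketch of both options, assuming the temperature field is a big-endian signed 16-bit value in units of 1/256 degree C (which is what the divide by 256 in the patch implies). The demo_* helpers are hypothetical names, not part of the patch, and the cast to int16_t relies on the usual two's-complement behaviour.

```c
#include <stdint.h>

/* Read a big-endian 16-bit field starting at the given offset. */
static uint16_t demo_be16(const uint8_t *buf, unsigned int offset)
{
	return (uint16_t)((buf[offset] << 8) | buf[offset + 1]);
}

/* Tenths of a degree C, truncated, as in the suggested macro. */
static int demo_temp_tenths(const uint8_t *buf, unsigned int offset)
{
	return (int16_t)demo_be16(buf, offset) * 10 / 256;
}

/* Degrees C as a double, with no intermediate rounding step. */
static double demo_temp_celsius(const uint8_t *buf, unsigned int offset)
{
	return (int16_t)demo_be16(buf, offset) / 256.0;
}
```

Keeping the value as a double until it is printed would avoid the double rounding, at the cost of storing something other than the __s16 that the struct currently holds.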
[...]
> +#define PRINT_TEMP(string, index) \
> + printf(" %-41s : %.0f degrees C / %.0f degrees F\n", (string), \
> + (double)(sd.sfp_temp[(index)] / 10.f), \
> + (double)(sd.sfp_temp[(index)] / 10.f * 1.8f + 32.f));
[...]
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] openvswitch: Make IPv6 packet parsing dependent on IPv6 config
From: Vlad Yasevich @ 2012-11-16 19:41 UTC (permalink / raw)
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <20121116.135341.453792886356015492.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On 11/16/2012 01:53 PM, David Miller wrote:
> From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> Date: Fri, 16 Nov 2012 10:46:17 -0800
>
>> I guess the simplest thing to do seems to just make Open vSwitch
>> depend on CONFIG_INET seeing as it is practically useless without
>> upper layer protocol support anyways.
>
> The reason we have the ipv6 extension header parsing in a separate,
> always compiled statically into the kernel, module is exactly for
> situations like this.
>
> We need to think seriously if we want to go down this road of only
> using INET as protection for every module that has some kind of ipv6
> component to it.
OK. How about this approach instead? It keeps the core functions we need
dependent on CONFIG_NET and makes the new GSO code depend on CONFIG_INET,
since it's quite useless without CONFIG_INET anyway...
-vlad
-- >8 --
Subject: [PATCH] ipv6: Preserve ipv6 functionality needed by NET
Some parts of the networking code use core pieces of the IPv6 stack.
Keep them available while letting the new GSO offload pieces depend
on CONFIG_INET.
Signed-off-by: Vlad Yasevich <vyasevic-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
net/Makefile | 2 +-
net/ipv6/Makefile | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/Makefile b/net/Makefile
index e050d9d..4f4ee08 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -19,7 +19,7 @@ obj-$(CONFIG_NETFILTER) += netfilter/
obj-$(CONFIG_INET) += ipv4/
obj-$(CONFIG_XFRM) += xfrm/
obj-$(CONFIG_UNIX) += unix/
-obj-$(CONFIG_INET) += ipv6/
+obj-$(CONFIG_NET) += ipv6/
obj-$(CONFIG_PACKET) += packet/
obj-$(CONFIG_NET_KEY) += key/
obj-$(CONFIG_BRIDGE) += bridge/
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 04a475d..2068ac4 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -40,7 +40,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
-obj-y += addrconf_core.o exthdrs_core.o output_core.o protocol.o
-obj-y += $(ipv6-offload)
+obj-y += addrconf_core.o exthdrs_core.o
+obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
--
1.7.7.6
^ permalink raw reply related
* Re: pull request: wireless 2012-11-16
From: David Miller @ 2012-11-16 19:41 UTC (permalink / raw)
To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20121116183011.GA29426@tuxdriver.com>
From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 16 Nov 2012 13:30:11 -0500
> This batch of fixes is intended for the 3.7 stream...
Pulled, thanks John.
^ permalink raw reply
* [PATCH] checkpatch: add double empty line check
From: Eilon Greenstein @ 2012-11-16 19:41 UTC (permalink / raw)
To: Andy Whitcroft, linux-kernel; +Cc: Joe Perches, netdev
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
scripts/checkpatch.pl | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 21a9f5d..7a9c153 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3579,6 +3579,13 @@ sub process {
WARN("EXPORTED_WORLD_WRITABLE",
"Exporting world writable files is usually an error. Consider more restrictive permissions.\n" . $herecurr);
}
+
+# check for double empty lines
+ if ($line =~ /^\+\s*$/ &&
+ ($prevline =~ /^\+?\s*$/ || $rawlines[$linenr] =~ /^\s*$/)) {
+ WARN("DOUBLE_EMPTY_LINE",
+ "One empty line should be sufficient. Consider removing this one.\n" . $herecurr);
+ }
}
# If we have no input at all, then there is nothing to report on
--
1.7.9.5
^ permalink raw reply related