* Re: [PATCH net-next 1/1] ipvlan: Initial check-in of the IPVLAN driver.
From: Alexei Starovoitov @ 2014-11-13 23:25 UTC (permalink / raw)
To: Mahesh Bandewar
Cc: netdev, Eric Dumazet, Maciej Zenczykowski, Laurent Chavey,
Tim Hockin, David Miller, Brandon Philips, Pavel Emelianov
On Tue, Nov 11, 2014 at 2:29 PM, Mahesh Bandewar <maheshb@google.com> wrote:
> The device operates in two different modes and the difference
> in these two modes in primarily in the TX side.
>
> (a) L2 mode : In this mode, the device behaves as a L2 device.
> TX processing upto L2 happens on the stack of the virtual device
> associated with (namespace). Packets are switched after that
> into the main device (default-ns) and queued for xmit.
>
> RX processing is simple and all multicast, broadcast (if
> applicable), and unicast belonging to the address(es) are
> delivered to the virtual devices.
>
> (b) L3 mode : In this mode, the device behaves like a L3 device.
> TX processing upto L3 happens on the stack of the virtual device
> associated with (namespace). Packets are switched to the
> main-device (default-ns) for the L2 processing. Hence the routing
> table of the default-ns will be used in this mode.
>
> RX processins is somewhat similar to the L2 mode except that in
> this mode only Unicast packets are delivered to the virtual device
> while main-dev will handle all other packets.
Great stuff. It would be interesting to see a 'typical use'
scenario for l2 vs l3 mode. Why would users pick one
over the other?
The only case I can think of is that a different default ip in a
different ns would force l2. Anything else?
Few comments:
> +++ b/drivers/net/ipvlan/ipvlan.h
...
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/errno.h>
> +#include <linux/slab.h>
> +#include <linux/string.h>
> +#include <linux/rculist.h>
> +#include <linux/notifier.h>
> +#include <linux/netdevice.h>
> +#include <linux/etherdevice.h>
> +#include <linux/ethtool.h>
> +#include <linux/if_arp.h>
> +#include <linux/if_link.h>
> +#include <linux/atomic.h>
> +#include <linux/if_vlan.h>
> +#include <linux/inet.h>
> +#include <linux/hash.h>
> +#include <linux/ip.h>
> +#include <linux/inetdevice.h>
> +#include <net/rtnetlink.h>
> +#include <net/gre.h>
> +#include <net/route.h>
> +#include <net/addrconf.h>
I don't think it's good style to put every header that any
.c file needs into a common .h.
Rather, put them into the individual .c files.
> +static void *ipvlan_get_L3_hdr(struct sk_buff *skb, int *type)
> +{
> + void *lyr3h = NULL;
> +
> + switch (skb->protocol) {
> + case htons(ETH_P_ARP): {
> + struct arphdr *arph;
> +
> + if (unlikely(!pskb_may_pull(skb, sizeof(struct arphdr))))
> + return NULL;
> +
> + arph = arp_hdr(skb);
> + *type = IPVL_ARP;
> + lyr3h = arph;
> + break;
> + }
...
> +static struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port,
> + void *lyr3h, int addr_type,
> + bool use_dest)
> +{
> + struct ipvl_addr *addr = NULL;
> +
> + if (addr_type == IPVL_IPV6) {
> + struct ipv6hdr *ip6h = NULL;
> + struct in6_addr *i6addr;
> +
> + ip6h = (struct ipv6hdr *)lyr3h;
> + i6addr = use_dest ? &ip6h->daddr : &ip6h->saddr;
> + addr = ipvlan_ht_addr_lookup(port, i6addr, true);
imo it looks very artificial to split logically single
lookup function into two: get() that returns 'type'/
'void * lyr3h' and lookup() that uses them.
It feels error prone.
Also everywhere lookup() follows get() immediately.
I think single lookup() would be much cleaner.
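A minimal userspace sketch of the single-lookup shape suggested above, for
illustration only. The types and the name port_addr_lookup() are hypothetical
stand-ins, not the ipvlan structures from the patch; the point is only that
the caller passes the packet and never handles an untyped header pointer plus
a separate type code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's types; not the real ipvlan structures. */
enum pkt_type { PKT_IPV4, PKT_IPV6 };

struct pkt {
	enum pkt_type type;
	uint8_t saddr[16];	/* IPv4 uses only the first 4 bytes */
	uint8_t daddr[16];
};

struct addr_entry {
	enum pkt_type type;
	uint8_t addr[16];
};

struct port {
	const struct addr_entry *addrs;
	size_t naddrs;
};

/*
 * One entry point: select the header field and search the table in the
 * same function, so the type and the pointer can never get out of sync.
 */
static const struct addr_entry *port_addr_lookup(const struct port *port,
						 const struct pkt *pkt,
						 int use_dest)
{
	const uint8_t *key = use_dest ? pkt->daddr : pkt->saddr;
	size_t len = pkt->type == PKT_IPV6 ? 16 : 4;
	size_t i;

	for (i = 0; i < port->naddrs; i++) {
		const struct addr_entry *e = &port->addrs[i];

		if (e->type == pkt->type && !memcmp(e->addr, key, len))
			return e;
	}
	return NULL;
}
```

The linear scan is a simplification; the patch uses a hash table, which this
sketch does not try to model.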
^ permalink raw reply
* Re: [PATCH 2/3] r8169: Use load_acquire() and store_release() to reduce memory barrier overhead
From: Alexander Duyck @ 2014-11-13 23:11 UTC (permalink / raw)
To: Francois Romieu, Alexander Duyck
Cc: linux-arch, netdev, linux-kernel, mikey, tony.luck,
mathieu.desnoyers, donald.c.skidmore, peterz, benh,
heiko.carstens, oleg, will.deacon, davem, michael, matthew.vick,
nic_swsd, geert, jeffrey.t.kirsher, fweisbec, schwidefsky, linux,
paulmck, torvalds, mingo
In-Reply-To: <20141113213049.GA12297@electric-eye.fr.zoreil.com>
On 11/13/2014 01:30 PM, Francois Romieu wrote:
> Alexander Duyck <alexander.h.duyck@redhat.com> :
> [...]
>> In addition the r8169 uses a rmb() however I believe it is placed incorrectly
>> as I assume it supposed to be ordering descriptor reads after the check for
>> ownership.
> Not exactly. It's a barrier against compiler optimization from 2004.
> It should not matter.
Okay. Do you recall what kind of problem you were seeing?
The origin of the rmb() for the Intel drivers was a PowerPC issue in
which it was fetching the length of a buffer before it checked the DD
bit (equivalent of DescOwn). I'm wondering if the issue you were seeing
was something similar where it had reordered reads in the descriptor to
cause that type of result.
> However I disagree with the change below:
>
>> @@ -7284,11 +7280,11 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, u32 budget
>> struct RxDesc *desc = tp->RxDescArray + entry;
>> u32 status;
>>
>> - rmb();
>> - status = le32_to_cpu(desc->opts1) & tp->opts1_mask;
>> -
>> + status = cpu_to_le32(load_acquire(&desc->opts1));
>> if (status & DescOwn)
>> break;
>> +
>> + status &= tp->opts1_mask;
> -> tp->opts1_mask is not __le32 tainted.
Sorry, I just noticed I got my byte ordering messed up on that. It
should have been le32_to_cpu. desc->opts1 is le32, and status should be
CPU ordered. I will have that updated for v2.
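As an illustration of the intended order of operations (convert the
little-endian descriptor word to CPU order once, then test the ownership bit,
then apply the mask), here is a small userspace sketch. The helper names and
the DESC_OWN bit position are assumptions made for the example; it does not
use the kernel's le32_to_cpu() or load_acquire().

```c
#include <assert.h>
#include <stdint.h>

#define DESC_OWN (UINT32_C(1) << 31)	/* assumed bit position, for illustration */

/* Portable stand-in for le32_to_cpu(): assemble the value from LE bytes. */
static uint32_t le32_bytes_to_cpu(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/*
 * Returns 1 and stores the masked status if the CPU owns the descriptor,
 * 0 if the NIC still owns it. All bit tests happen on the CPU-order value.
 */
static int desc_status(const uint8_t opts1_le[4], uint32_t mask, uint32_t *status)
{
	uint32_t v = le32_bytes_to_cpu(opts1_le);

	if (v & DESC_OWN)
		return 0;
	*status = v & mask;
	return 1;
}
```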
> Btw, should I consider the sketch above as a skeleton in my r8169 closet ?
>
> NIC CPU0 CPU1
> | CPU | NIC | CPU | CPU |
>
> | CPU | NIC | CPU | CPU |
> ^ tx_dirty
>
> [start_xmit...
>
> | CPU | CPU | CPU | CPU |
> (NIC did it's job)
> [rtl_tx...
> | ... | ... | NIC | NIC |
> (ring update)
> (tx_dirty increases)
>
> | CPU | CPU | ??? | ??? |
> tx_dirty ?
> reaping about-to-be-sent
> buffers on some platforms ?
> ...start_xmit]
Actually it looks like that could be due to the tp->cur_tx update and
the txd->opts1 update being placed in the same spot in start_xmit
with no barrier to separate them. As such the compiler is free to
update tp->cur_tx first, and then update txd->opts1 to set the
DescOwn bit.
I will move the update of tp->cur_tx down a few lines past where the
second wmb is/was. That should provide enough buffer to guarantee that
cur_tx update is only visible after the descriptors have been updated so
the reaping should only occur if the CPU has written back.
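The ordering described above can be sketched in userspace with C11 atomics.
This is an analogue only: the struct and function names are hypothetical, a
release store stands in for the kernel barrier, and the sketch does not model
ordering against the device (DMA), which the real driver also needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4
#define DESC_OWN (UINT32_C(1) << 31)	/* assumed bit position, for illustration */

struct tx_ring {
	uint32_t opts1[RING_SIZE];	/* descriptor words */
	atomic_uint cur_tx;		/* producer index, read by the reaper */
	unsigned int dirty_tx;		/* consumer index */
};

/* Producer: fill the descriptor first, then publish the new index. */
static void ring_xmit(struct tx_ring *r, uint32_t len)
{
	unsigned int cur = atomic_load_explicit(&r->cur_tx, memory_order_relaxed);

	r->opts1[cur % RING_SIZE] = DESC_OWN | len;
	/* Release: the descriptor write cannot be reordered after this store. */
	atomic_store_explicit(&r->cur_tx, cur + 1, memory_order_release);
}

/* Consumer: number of descriptors that were published and not yet reaped. */
static unsigned int ring_pending(struct tx_ring *r)
{
	unsigned int cur = atomic_load_explicit(&r->cur_tx, memory_order_acquire);

	return cur - r->dirty_tx;
}
```

With this pairing, a reaper that observes the incremented index is guaranteed
to also observe the descriptor contents written before it.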
Thanks,
Alex
^ permalink raw reply
* Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
From: David Howells @ 2014-11-13 23:07 UTC (permalink / raw)
To: Pranith Kumar
Cc: dhowells, David S. Miller, Dan Carpenter,
open list:NETWORKING [GENERAL], open list, paulmck
In-Reply-To: <546528BF.5040902@gmail.com>
Pranith Kumar <bobby.prani@gmail.com> wrote:
> OK. Should I send in a patch removing these barriers then?
No. There need to be stronger barriers, at least in some of the cases.
circular-buffers.txt details what is required, but not all of the cases match
the pattern there, so it needs a bit more consideration.
David
^ permalink raw reply
* Re: [PATCH 3/3] sh_eth: Fix dma mapping issue
From: Sergei Shtylyov @ 2014-11-13 23:05 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862301-28032-4-git-send-email-ykaneko0929@gmail.com>
On 11/13/2014 10:05 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> When CONFIG_DMA_API_DEBUG=y, many DMA error messages reports.
> In order to use DMA debug, This patch fix following issues.
> Issue 1:
> If dma_mapping_error function is not called appropriately after
> DMA mapping, DMA debug will report error message when DMA unmap
> function is called.
> Issue 2:
> If skb_reserve function is called after DMA mapping, the relationship
> between mapping addr and mapping size will be broken.
> In this case, DMA debug will report error messages when DMA sync
> function and DMA unmap function are called.
> Issue 3:
> If the size of frame data is less than ETH_ZLEN, the size is resized
> to ETH_ZLEN after DMA map function is called.
> In the TX skb freeing function, dma unmap function is called with that
> resized value. So, unmap size error will reported.
> Issue 4:
> In the rx function, DMA map function is called without DMA unmap function
> is called for RX skb reallocating.
> It will case the DMA debug error that number of debug entry is full and
> DMA debug logic is stopped.
The rule of thumb is "fix one issue per patch". Please split accordingly.
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Thanks for beating me to it. Fixing these issues has been on my agenda for
a long time... :-)
> ---
> drivers/net/ethernet/renesas/sh_eth.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 0e4a407..23318cf 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1136,6 +1136,11 @@ static void sh_eth_ring_format(struct net_device *ndev)
> dma_map_single(&ndev->dev, skb->data, rxdesc->buffer_length,
> DMA_FROM_DEVICE);
> rxdesc->addr = virt_to_phys(skb->data);
Can't we get rid of these bogus virt_to_phys() calls, while at it?
dma_map_single() returns a DMA address, no?
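A userspace mock of the pattern being asked for: keep the handle returned by
the mapping call, check that handle for errors, and store it in the
descriptor instead of recomputing an address. Everything prefixed mock_ or
MOCK_ is hypothetical and only imitates the shape of the kernel DMA API; it
is not the API itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_dma_addr_t;
#define MOCK_DMA_ERROR UINT64_MAX

/* Hypothetical mock: offsets the CPU address to mimic a bus translation. */
static mock_dma_addr_t mock_dma_map(const void *cpu_addr, size_t len)
{
	if (!cpu_addr || !len)
		return MOCK_DMA_ERROR;
	return (mock_dma_addr_t)(uintptr_t)cpu_addr + 0x1000;
}

static int mock_dma_mapping_error(mock_dma_addr_t addr)
{
	return addr == MOCK_DMA_ERROR;
}

struct mock_rxdesc {
	mock_dma_addr_t addr;
	size_t buffer_length;
};

/* Returns 0 on success, -1 if the mapping failed (descriptor left untouched). */
static int rxdesc_attach(struct mock_rxdesc *desc, const void *buf, size_t len)
{
	mock_dma_addr_t handle = mock_dma_map(buf, len);

	if (mock_dma_mapping_error(handle))
		return -1;
	desc->addr = handle;	/* the returned handle, not a recomputed address */
	desc->buffer_length = len;
	return 0;
}
```

The mock deliberately returns something different from the CPU address, so a
caller that ignored the return value would store the wrong thing.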
> + if (dma_mapping_error(&ndev->dev, rxdesc->addr)) {
> + dev_kfree_skb(mdp->rx_skbuff[i]);
> + mdp->rx_skbuff[i] = NULL;
> + break;
> + }
> rxdesc->status = cpu_to_edmac(mdp, RD_RACT | RD_RFP);
>
> /* Rx descriptor address set */
> @@ -1364,7 +1369,7 @@ static int sh_eth_txfree(struct net_device *ndev)
> if (mdp->tx_skbuff[entry]) {
> dma_unmap_single(&ndev->dev, txdesc->addr,
> txdesc->buffer_length, DMA_TO_DEVICE);
> - dev_kfree_skb_irq(mdp->tx_skbuff[entry]);
> + dev_kfree_skb_any(mdp->tx_skbuff[entry]);
Hm, I'm not sure where this is described in the changelog...
> mdp->tx_skbuff[entry] = NULL;
> free_num++;
> }
> @@ -1466,11 +1471,19 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
> if (skb == NULL)
> break; /* Better luck next round. */
> sh_eth_set_receive_align(skb);
> + dma_unmap_single(&ndev->dev, rxdesc->addr,
> + rxdesc->buffer_length,
> + DMA_FROM_DEVICE);
> dma_map_single(&ndev->dev, skb->data,
> rxdesc->buffer_length, DMA_FROM_DEVICE);
>
> skb_checksum_none_assert(skb);
> rxdesc->addr = virt_to_phys(skb->data);
Likewise, can we get rid of this bogus call?
> + if (dma_mapping_error(&ndev->dev, rxdesc->addr)) {
> + dev_kfree_skb_any(mdp->rx_skbuff[entry]);
> + mdp->rx_skbuff[entry] = NULL;
> + break;
> + }
> }
> if (entry >= mdp->num_rx_ring - 1)
> rxdesc->status |=
> @@ -2104,12 +2117,18 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> if (!mdp->cd->hw_swap)
> sh_eth_soft_swap(phys_to_virt(ALIGN(txdesc->addr, 4)),
> skb->len + 2);
> - txdesc->addr = dma_map_single(&ndev->dev, skb->data, skb->len,
> - DMA_TO_DEVICE);
> if (skb->len < ETH_ZLEN)
> txdesc->buffer_length = ETH_ZLEN;
> else
> txdesc->buffer_length = skb->len;
> + txdesc->addr = dma_map_single(&ndev->dev, skb->data,
> + txdesc->buffer_length,
> + DMA_TO_DEVICE);
> + if (dma_mapping_error(&ndev->dev, txdesc->addr)) {
> + dev_kfree_skb_any(mdp->tx_skbuff[entry]);
> + mdp->tx_skbuff[entry] = NULL;
> + goto out;
Why not just *return*?!
[...]
WBR, Sergei
^ permalink raw reply
* [PATCH net-next] icmp: Remove some spurious dropped packet profile hits from the ICMP path
From: Rick Jones @ 2014-11-13 22:54 UTC (permalink / raw)
To: netdev; +Cc: davem
From: Rick Jones <rick.jones2@hp.com>
If icmp_rcv() has successfully processed the incoming ICMP datagram, we
should use consume_skb() rather than kfree_skb() because a hit on the likes
of perf -e skb:kfree_skb is not called-for.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
---
A test system hit with a flood ping shows hits on perf top -e skb:kfree_skb
before the change and none after for the normal/success path. The IPv6 path
would be somewhat more ugly. For the time being, just deal with the overlap on
ping_rcv() between the two to avoid a possible double free of an skb.
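The ownership convention this patch introduces can be sketched in userspace
as follows. The buffer type, the two counters and all function names are
hypothetical; the counters merely stand in for "would show up as a drop" and
"would not".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct buf { int len; };

static int consumed, dropped;	/* stand-ins for the two free paths */

static void buf_consume(struct buf *b) { consumed++; free(b); }
static void buf_drop(struct buf *b)    { dropped++;  free(b); }

/* Handler contract: return true and leave b alone, or drop b and return false. */
static bool handle_echo(struct buf *b)
{
	if (b->len < 8) {
		buf_drop(b);
		return false;
	}
	return true;
}

static void rcv(struct buf *b, bool (*handler)(struct buf *))
{
	if (handler(b))
		buf_consume(b);	/* success: not counted as a drop */
	/* On failure the handler already freed b; freeing again would double free. */
}

static struct buf *buf_new(int len)
{
	struct buf *b = malloc(sizeof(*b));

	if (b)
		b->len = len;
	return b;
}
```

The single rule "whoever reports failure has already freed the buffer" is
what lets the caller avoid a double free.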
diff --git a/include/net/ping.h b/include/net/ping.h
index 026479b..f074060 100644
--- a/include/net/ping.h
+++ b/include/net/ping.h
@@ -82,7 +82,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
int ping_v6_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
size_t len);
int ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
-void ping_rcv(struct sk_buff *skb);
+bool ping_rcv(struct sk_buff *skb);
#ifdef CONFIG_PROC_FS
struct ping_seq_afinfo {
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 36b7bfa..b9f3653 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -190,7 +190,7 @@ EXPORT_SYMBOL(icmp_err_convert);
*/
struct icmp_control {
- void (*handler)(struct sk_buff *skb);
+ bool (*handler)(struct sk_buff *skb);
short error; /* This ICMP is classed as an error message */
};
@@ -746,7 +746,7 @@ static bool icmp_tag_validation(int proto)
* ICMP_PARAMETERPROB.
*/
-static void icmp_unreach(struct sk_buff *skb)
+static bool icmp_unreach(struct sk_buff *skb)
{
const struct iphdr *iph;
struct icmphdr *icmph;
@@ -839,10 +839,11 @@ static void icmp_unreach(struct sk_buff *skb)
icmp_socket_deliver(skb, info);
out:
- return;
+ return true;
out_err:
ICMP_INC_STATS_BH(net, ICMP_MIB_INERRORS);
- goto out;
+ kfree_skb(skb);
+ return false;
}
@@ -850,17 +851,22 @@ out_err:
* Handle ICMP_REDIRECT.
*/
-static void icmp_redirect(struct sk_buff *skb)
+static bool icmp_redirect(struct sk_buff *skb)
{
if (skb->len < sizeof(struct iphdr)) {
ICMP_INC_STATS_BH(dev_net(skb->dev), ICMP_MIB_INERRORS);
- return;
+ kfree_skb(skb);
+ return false;
}
- if (!pskb_may_pull(skb, sizeof(struct iphdr)))
- return;
+ if (!pskb_may_pull(skb, sizeof(struct iphdr))) {
+ /* there ought to be a stat */
+ kfree_skb(skb);
+ return false;
+ }
icmp_socket_deliver(skb, icmp_hdr(skb)->un.gateway);
+ return true;
}
/*
@@ -875,7 +881,7 @@ static void icmp_redirect(struct sk_buff *skb)
* See also WRT handling of options once they are done and working.
*/
-static void icmp_echo(struct sk_buff *skb)
+static bool icmp_echo(struct sk_buff *skb)
{
struct net *net;
@@ -891,6 +897,8 @@ static void icmp_echo(struct sk_buff *skb)
icmp_param.head_len = sizeof(struct icmphdr);
icmp_reply(&icmp_param, skb);
}
+ /* should there be an ICMP stat for ignored echos? */
+ return true;
}
/*
@@ -900,7 +908,7 @@ static void icmp_echo(struct sk_buff *skb)
* MUST be accurate to a few minutes.
* MUST be updated at least at 15Hz.
*/
-static void icmp_timestamp(struct sk_buff *skb)
+static bool icmp_timestamp(struct sk_buff *skb)
{
struct timespec tv;
struct icmp_bxm icmp_param;
@@ -927,15 +935,18 @@ static void icmp_timestamp(struct sk_buff *skb)
icmp_param.data_len = 0;
icmp_param.head_len = sizeof(struct icmphdr) + 12;
icmp_reply(&icmp_param, skb);
-out:
- return;
+ return true;
+
out_err:
ICMP_INC_STATS_BH(dev_net(skb_dst(skb)->dev), ICMP_MIB_INERRORS);
- goto out;
+ kfree_skb(skb);
+ return false;
}
-static void icmp_discard(struct sk_buff *skb)
+static bool icmp_discard(struct sk_buff *skb)
{
+ /* pretend it was a success */
+ return true;
}
/*
@@ -946,6 +957,7 @@ int icmp_rcv(struct sk_buff *skb)
struct icmphdr *icmph;
struct rtable *rt = skb_rtable(skb);
struct net *net = dev_net(rt->dst.dev);
+ bool success;
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) {
struct sec_path *sp = skb_sec_path(skb);
@@ -1012,7 +1024,12 @@ int icmp_rcv(struct sk_buff *skb)
}
}
- icmp_pointers[icmph->type].handler(skb);
+ success = icmp_pointers[icmph->type].handler(skb);
+
+ if (success)
+ consume_skb(skb);
+
+ return 0;
drop:
kfree_skb(skb);
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 736236c..7d54eed 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -955,7 +955,7 @@ EXPORT_SYMBOL_GPL(ping_queue_rcv_skb);
* All we need to do is get the socket.
*/
-void ping_rcv(struct sk_buff *skb)
+bool ping_rcv(struct sk_buff *skb)
{
struct sock *sk;
struct net *net = dev_net(skb->dev);
@@ -974,11 +974,13 @@ void ping_rcv(struct sk_buff *skb)
pr_debug("rcv on socket %p\n", sk);
ping_queue_rcv_skb(sk, skb_get(skb));
sock_put(sk);
- return;
+ return true;
}
pr_debug("no socket, dropping\n");
- /* We're called from icmp_rcv(). kfree_skb() is done there. */
+ /* Do the kfree_skb() here to get a better drop profile */
+ kfree_skb(skb);
+ return false;
}
EXPORT_SYMBOL_GPL(ping_rcv);
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 0929340..6c0f805 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -679,6 +679,7 @@ static int icmpv6_rcv(struct sk_buff *skb)
const struct in6_addr *saddr, *daddr;
struct icmp6hdr *hdr;
u8 type;
+ bool success = true;
if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
struct sec_path *sp = skb_sec_path(skb);
@@ -726,7 +727,7 @@ static int icmpv6_rcv(struct sk_buff *skb)
break;
case ICMPV6_ECHO_REPLY:
- ping_rcv(skb);
+ success = ping_rcv(skb);
break;
case ICMPV6_PKT_TOOBIG:
@@ -790,7 +791,15 @@ static int icmpv6_rcv(struct sk_buff *skb)
icmpv6_notify(skb, type, hdr->icmp6_code, hdr->icmp6_mtu);
}
- kfree_skb(skb);
+ /* until the v6 path can be better sorted we may still need
+ * to kfree_skb() here but want to avoid a double free from
+ * the ping_rcv() path, which shares code with IPv4. assume
+ * success and preserve the status quo behaviour for the rest
+ * of the paths to here
+ */
+ if (success)
+ kfree_skb(skb);
+
return 0;
csum_error:
^ permalink raw reply related
* Re: [PATCH 2/3] sh_eth: Fix skb alloc size and alignment adjust rule.
From: Sergei Shtylyov @ 2014-11-13 22:48 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862301-28032-3-git-send-email-ykaneko0929@gmail.com>
On 11/13/2014 10:05 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> In the current driver, allocation size of skb does not care the alignment
> adjust after allocation.
> And also, in the current implementation, buffer alignment method by
> sh_eth_set_receive_align function has a bug that this function displace
> buffer start address forcedly when the alignment is corrected.
> In the result, tail of the skb will exceed allocated area and kernel panic
> will be occurred.
Oh, I have never seen a panic, but Geert has reported WARNINGs from the DMA
debug code...
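For reference, the arithmetic behind the fix can be sketched in plain C. The
helper names are hypothetical; the sketch assumes the alignment is a power of
two, as both values in the driver are. The old calculation reserved a full
alignment unit even for an already aligned buffer, while the allocation had
no slack for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes to skip so that addr becomes a multiple of align (a power of two). */
static size_t align_reserve(uintptr_t addr, size_t align)
{
	size_t off = addr & (align - 1);

	return off ? align - off : 0;
}

/* Worst-case allocation that still leaves buf_sz usable bytes after aligning. */
static size_t alloc_size(size_t buf_sz, size_t align)
{
	return buf_sz + align - 1;
}
```

Since the reserve is at most align - 1 bytes, reserve plus buf_sz never
exceeds alloc_size(buf_sz, align).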
> This patch fix this issue.
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> drivers/net/ethernet/renesas/sh_eth.c | 35 ++++++++++++++---------------------
> drivers/net/ethernet/renesas/sh_eth.h | 2 ++
> 2 files changed, 16 insertions(+), 21 deletions(-)
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 49e963e..0e4a407 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -917,21 +917,12 @@ static int sh_eth_reset(struct net_device *ndev)
> return ret;
> }
>
> -#if defined(CONFIG_CPU_SH4) || defined(CONFIG_ARCH_SHMOBILE)
> static void sh_eth_set_receive_align(struct sk_buff *skb)
> {
> - int reserve;
> -
> - reserve = SH4_SKB_RX_ALIGN - ((u32)skb->data & (SH4_SKB_RX_ALIGN - 1));
> + u32 reserve = (u32)skb->data & (SH_ETH_RX_ALIGN - 1);
Please keep an empty line after declaration, as it was before this patch.
> if (reserve)
> - skb_reserve(skb, reserve);
> -}
> -#else
> -static void sh_eth_set_receive_align(struct sk_buff *skb)
> -{
> - skb_reserve(skb, SH2_SH3_SKB_RX_ALIGN);
> + skb_reserve(skb, SH_ETH_RX_ALIGN - reserve);
> }
> -#endif
>
>
> /* CPU <-> EDMAC endian convert */
> @@ -1119,6 +1110,7 @@ static void sh_eth_ring_format(struct net_device *ndev)
> struct sh_eth_txdesc *txdesc = NULL;
> int rx_ringsize = sizeof(*rxdesc) * mdp->num_rx_ring;
> int tx_ringsize = sizeof(*txdesc) * mdp->num_tx_ring;
> + int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN - 1;
>
> mdp->cur_rx = 0;
> mdp->cur_tx = 0;
> @@ -1131,21 +1123,21 @@ static void sh_eth_ring_format(struct net_device *ndev)
> for (i = 0; i < mdp->num_rx_ring; i++) {
> /* skb */
> mdp->rx_skbuff[i] = NULL;
> - skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz);
> + skb = netdev_alloc_skb(ndev, skbuff_size);
> mdp->rx_skbuff[i] = skb;
> if (skb == NULL)
> break;
> - dma_map_single(&ndev->dev, skb->data, mdp->rx_buf_sz,
> - DMA_FROM_DEVICE);
> sh_eth_set_receive_align(skb);
>
> /* RX descriptor */
> rxdesc = &mdp->rx_ring[i];
> + /* The size of the buffer is 16 byte boundary. */
Is *on* 16 byte boundary, you mean?
> + rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16);
> + dma_map_single(&ndev->dev, skb->data, rxdesc->buffer_length,
> + DMA_FROM_DEVICE);
> rxdesc->addr = virt_to_phys(skb->data);
> rxdesc->status = cpu_to_edmac(mdp, RD_RACT | RD_RFP);
>
> - /* The size of the buffer is 16 byte boundary. */
Ah, you're just copying an existing comment... well, this seems a good time
to fix it then. :-)
[...]
> @@ -1448,8 +1441,8 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
> if (mdp->cd->rpadir)
> skb_reserve(skb, NET_IP_ALIGN);
> dma_sync_single_for_cpu(&ndev->dev, rxdesc->addr,
> - mdp->rx_buf_sz,
> - DMA_FROM_DEVICE);
> + ALIGN(mdp->rx_buf_sz, 16),
> + DMA_FROM_DEVICE);
Please keep the original alignment of the continuation lines.
> skb_put(skb, pkt_len);
> skb->protocol = eth_type_trans(skb, ndev);
> netif_receive_skb(skb);
> @@ -1468,13 +1461,13 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
> rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16);
>
> if (mdp->rx_skbuff[entry] == NULL) {
> - skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz);
> + skb = netdev_alloc_skb(ndev, skbuff_size);
> mdp->rx_skbuff[entry] = skb;
> if (skb == NULL)
> break; /* Better luck next round. */
> - dma_map_single(&ndev->dev, skb->data, mdp->rx_buf_sz,
> - DMA_FROM_DEVICE);
> sh_eth_set_receive_align(skb);
> + dma_map_single(&ndev->dev, skb->data,
> + rxdesc->buffer_length, DMA_FROM_DEVICE);
>
> skb_checksum_none_assert(skb);
> rxdesc->addr = virt_to_phys(skb->data);
> diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
> index b37c427..d138ebe 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.h
> +++ b/drivers/net/ethernet/renesas/sh_eth.h
> @@ -163,8 +163,10 @@ enum {
> /* Driver's parameters */
> #if defined(CONFIG_CPU_SH4) || defined(CONFIG_ARCH_SHMOBILE)
> #define SH4_SKB_RX_ALIGN 32
> +#define SH_ETH_RX_ALIGN (SH4_SKB_RX_ALIGN)
() not needed.
> #else
> #define SH2_SH3_SKB_RX_ALIGN 2
> +#define SH_ETH_RX_ALIGN (SH2_SH3_SKB_RX_ALIGN)
Likewise.
And I don't think we still need {SH2_SH3|SH4}_SKB_RX_ALIGN after this patch.
[...]
WBR, Sergei
^ permalink raw reply
* Re: [PATCH 1/3] sh_eth: Remove redundant alignment adjustment
From: Sergei Shtylyov @ 2014-11-13 22:37 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862301-28032-2-git-send-email-ykaneko0929@gmail.com>
On 11/13/2014 10:04 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> PTR_ALIGN macro after skb_reserve is redundant, because skb_reserve
> function adjusts the alignment of skb->data.
OK, but where is the bug? There must be one if you base this patch on the
'net' tree...
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
WBR, Sergei
^ permalink raw reply
* Re: [PATCH 1/2] sh_eth: Fix sleeping function called from invalid context
From: Sergei Shtylyov @ 2014-11-13 22:33 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862135-27972-2-git-send-email-ykaneko0929@gmail.com>
On 11/13/2014 10:02 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Fix the bug as follows:
> ----
> [ 1238.161349] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:952
> [ 1238.188279] in_atomic(): 1, irqs_disabled(): 0, pid: 1388, name: cat
> [ 1238.207425] CPU: 0 PID: 1388 Comm: cat Not tainted 3.10.31-ltsi-00046-gefa0b46 #1087
> [ 1238.230737] Backtrace:
> [ 1238.238123] [<c0012e64>] (dump_backtrace+0x0/0x10c) from [<c0013000>] (show_stack+0x18/0x1c)
> [ 1238.263499] r6:000003b8 r5:c06160c0 r4:c0669e00 r3:00404000
> [ 1238.280583] [<c0012fe8>] (show_stack+0x0/0x1c) from [<c04515a4>] (dump_stack+0x20/0x28)
> [ 1238.304631] [<c0451584>] (dump_stack+0x0/0x28) from [<c004970c>] (__might_sleep+0xf8/0x118)
> [ 1238.329734] [<c0049614>] (__might_sleep+0x0/0x118) from [<c02465ac>] (__pm_runtime_resume+0x38/0x90)
> [ 1238.357170] r7:d616f000 r6:c049c458 r5:00000004 r4:d6a17210
> [ 1238.374251] [<c0246574>] (__pm_runtime_resume+0x0/0x90) from [<c029b1c4>] (sh_eth_get_stats+0x44/0x280)
> [ 1238.402468] r7:d616f000 r6:c049c458 r5:d5c21000 r4:d5c21000
> [ 1238.419552] [<c029b180>] (sh_eth_get_stats+0x0/0x280) from [<c03ae39c>] (dev_get_stats+0x54/0x88)
> [ 1238.446204] r5:d5c21000 r4:d5ed7e08
> [ 1238.456980] [<c03ae348>] (dev_get_stats+0x0/0x88) from [<c03c677c>] (netstat_show.isra.15+0x54/0x9c)
> [ 1238.484413] r6:d5c21000 r5:d5c21238 r4:00000028 r3:00000001
> [ 1238.501495] [<c03c6728>] (netstat_show.isra.15+0x0/0x9c) from [<c03c69b8>] (show_tx_errors+0x18/0x1c)
> [ 1238.529196] r7:d5f945d8 r6:d5f945c0 r5:c049716c r4:c0650e7c
> [ 1238.546279] [<c03c69a0>] (show_tx_errors+0x0/0x1c) from [<c023963c>] (dev_attr_show+0x24/0x50)
> [ 1238.572157] [<c0239618>] (dev_attr_show+0x0/0x50) from [<c010c148>] (sysfs_read_file+0xb0/0x140)
> [ 1238.598554] r5:c049716c r4:d5c21240
> [ 1238.609326] [<c010c098>] (sysfs_read_file+0x0/0x140) from [<c00b9ee4>] (vfs_read+0xb0/0x13c)
> [ 1238.634679] [<c00b9e34>] (vfs_read+0x0/0x13c) from [<c00ba0ac>] (SyS_read+0x44/0x74)
> [ 1238.657944] r8:bef45bf0 r7:00000000 r6:d6ac0600 r5:00000000 r4:00000000
> [ 1238.678172] [<c00ba068>] (SyS_read+0x0/0x74) from [<c000eec0>] (ret_fast_syscall+0x0/0x30)
> ----
How to reproduce this?
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
[...]
> diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
> index b37c427..9a1c550 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.h
> +++ b/drivers/net/ethernet/renesas/sh_eth.h
> @@ -508,6 +508,7 @@ struct sh_eth_private {
> u32 rx_buf_sz; /* Based on MTU+slack. */
> int edmac_endian;
> struct napi_struct napi;
> + bool is_opened;
Placing it after 'vlan_num_ids' (and making it a bitfield?) would probably
save some space.
WBR, Sergei
^ permalink raw reply
* Re: [PATCH] net: skb_fclone_busy() needs to detect orphaned skb
From: Luis Henriques @ 2014-11-13 22:32 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, Neal Cardwell, Joseph Salisbury
In-Reply-To: <1415913622.17262.24.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, Nov 13, 2014 at 01:20:22PM -0800, Eric Dumazet wrote:
> On Thu, 2014-11-13 at 19:15 +0000, Luis Henriques wrote:
> > Hi Eric,
> >
> > On Thu, Oct 30, 2014 at 10:32:34AM -0700, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > Some drivers are unable to perform TX completions in a bound time.
> > > They instead call skb_orphan()
> > >
> > > Problem is skb_fclone_busy() has to detect this case, otherwise
> > > we block TCP retransmits and can freeze unlucky tcp sessions on
> > > mostly idle hosts.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Fixes: 1f3279ae0c13 ("tcp: avoid retransmits of TCP packets hanging in host queues")
> > > ---
> > > This is a stable candidate.
> > > This problem is known to hurt users of linux-3.16 kernels used by guests kernels.
> > > David, I can provide backports if you want.
> > > Thanks !
> > >
> >
> > We got a bug report[0] where a backport for 3.16 was provided. Since
> > I couldn't find the original backport post, I'm not sure who's the
> > actual author. Could you please confirm if this backport is correct?
> > (I'm copying the patch below).
> >
> > [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1390604
> >
> > Cheers,
> > --
> > Luís
> >
> >
>
> Sure ! I provided this patch indeed, I am 'The Google engineer'
> mentioned in this bug report ;)
>
Awesome, Thanks! I'll queue it for the 3.16 kernel. Since I couldn't
find the original patch, I could only guess who 'The Google engineer'
was :-)
Cheers,
--
Luís
> Signed-off-by: Eric Dumazet <edumazet@google.com>
>
>
> Thanks !
>
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index 4e4932b5079b..a8794367cd20 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -2082,7 +2082,8 @@ static bool skb_still_in_host_queue(const struct sock *sk,
> > const struct sk_buff *fclone = skb + 1;
> >
> > if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
> > - fclone->fclone == SKB_FCLONE_CLONE)) {
> > + fclone->fclone == SKB_FCLONE_CLONE &&
> > + fclone->sk == sk)) {
> > NET_INC_STATS_BH(sock_net(sk),
> > LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
> > return true;
>
>
>
>
^ permalink raw reply
* Re: [PATCH] sh_eth: Optimization for RX excess judgement
From: Sergei Shtylyov @ 2014-11-13 22:27 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862031-27925-1-git-send-email-ykaneko0929@gmail.com>
On 11/13/2014 10:00 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Both of 'boguscnt' and 'quota' have nearly meaning as the condition of
> the reception loop.
> In order to cut down redundant processing, this patch changes excess judgement.
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> This patch is based on net tree.
> drivers/net/ethernet/renesas/sh_eth.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 60e9c2c..7d46326 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1394,10 +1394,15 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
>
> int entry = mdp->cur_rx % mdp->num_rx_ring;
> int boguscnt = (mdp->dirty_rx + mdp->num_rx_ring) - mdp->cur_rx;
> + int limit = boguscnt;
> struct sk_buff *skb;
> u16 pkt_len = 0;
> u32 desc_status;
>
> + if (quota) {
> + boguscnt = min(boguscnt, *quota);
> + limit = boguscnt;
> + }
> rxdesc = &mdp->rx_ring[entry];
> while (!(rxdesc->status & cpu_to_edmac(mdp, RD_RACT))) {
> desc_status = edmac_to_cpu(mdp, rxdesc->status);
[...]
> @@ -1501,7 +1501,10 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
> sh_eth_write(ndev, EDRRR_R, EDRRR);
> }
>
> - return *quota <= 0;
> + if (quota)
> + *quota -= limit - (++boguscnt);
Just 'limit - boguscnt + 1'.
> +
> + return (boguscnt <= 0);
Hm... why change the *return* statement at all? I'm not sure this is at
all correct.
WBR, Sergei
^ permalink raw reply
* Re: [PATCH] net: sh_eth: Add RMII mode setting in probe
From: Sergei Shtylyov @ 2014-11-13 22:20 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415861645-27685-1-git-send-email-ykaneko0929@gmail.com>
Hello.
On 11/13/2014 09:54 AM, Yoshihiro Kaneko wrote:
> From: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> When using RMMI mode, it is necessary to change in probe.
I'd like this need to be explained in more detail.
> Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> This patch is based on net-next tree.
> drivers/net/ethernet/renesas/sh_eth.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index dbe8606..1f79ed6 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1,5 +1,6 @@
> /* SuperH Ethernet device driver
> *
> + * Copyright (C) 2014 Renesas Electronics Corporation
> * Copyright (C) 2006-2012 Nobuhiro Iwamatsu
> * Copyright (C) 2008-2014 Renesas Solutions Corp.
> * Copyright (C) 2013-2014 Cogent Embedded, Inc.
> @@ -2883,6 +2884,9 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
> }
> }
>
> + if (mdp->cd->rmiimode)
> + sh_eth_write(ndev, 0x1, RMIIMODE);
> +
Doesn't such code then need to be removed from sh_eth_dev_init()?
WBR, Sergei
* Re: [PATCH] sh_eth: Optimization for RX excess judgement
From: Sergei Shtylyov @ 2014-11-13 22:09 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh
In-Reply-To: <1415862031-27925-1-git-send-email-ykaneko0929@gmail.com>
Hello.
On 11/13/2014 10:00 AM, Yoshihiro Kaneko wrote:
> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Both 'boguscnt' and 'quota' have nearly the same meaning as the condition of
> the reception loop.
> In order to cut down redundant processing, this patch changes the excess judgement.
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> This patch is based on net tree.
This is clearly 'net-next' material.
> drivers/net/ethernet/renesas/sh_eth.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 60e9c2c..7d46326 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1394,10 +1394,15 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
>
> int entry = mdp->cur_rx % mdp->num_rx_ring;
> int boguscnt = (mdp->dirty_rx + mdp->num_rx_ring) - mdp->cur_rx;
> + int limit = boguscnt;
> struct sk_buff *skb;
> u16 pkt_len = 0;
> u32 desc_status;
>
> + if (quota) {
I don't see what's the point in checking -- quota is always non-NULL.
> + boguscnt = min(boguscnt, *quota);
> + limit = boguscnt;
> + }
> rxdesc = &mdp->rx_ring[entry];
> while (!(rxdesc->status & cpu_to_edmac(mdp, RD_RACT))) {
> desc_status = edmac_to_cpu(mdp, rxdesc->status);
[...]
> @@ -1501,7 +1501,10 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota)
> sh_eth_write(ndev, EDRRR_R, EDRRR);
> }
>
> - return *quota <= 0;
> + if (quota)
Again, seeing no sense in this check.
> + *quota -= limit - (++boguscnt);
> +
> + return (boguscnt <= 0);
Parens not needed.
[...]
WBR, Sergei
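
Editorial sketch of the accounting under review, kept outside the driver: the loop is capped by the smaller of the free ring slots and the caller's quota, and the quota is then charged with the number of frames actually processed. Everything here (struct rx_ring, rx_poll, min_int) is a hypothetical stand-in and not the sh_eth code; it uses a single "done" counter instead of the patch's boguscnt/limit pair, so it sidesteps the off-by-one question raised in the review rather than answering it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the RX ring state; not the sh_eth structures. */
struct rx_ring {
    int ready;  /* descriptors the hardware has already filled */
    int free;   /* upper bound on descriptors to consume in one call */
};

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Process at most min(ring->free, *quota) descriptors, charge the number
 * actually processed against *quota, and report whether the limit was hit.
 */
static bool rx_poll(struct rx_ring *ring, int *quota)
{
    int limit = min_int(ring->free, *quota);
    int done = 0;

    while (ring->ready > 0 && done < limit) {
        ring->ready--;  /* "receive" one frame */
        done++;
    }

    *quota -= done;

    /* true when the limit was reached, so more frames may be pending */
    return done == limit;
}
```

With 10 ready frames and a quota of 4, the call leaves the quota at 0 and returns true; with 3 ready frames and a quota of 16, it leaves the quota at 13 and returns false.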
* Re: [PATCH] sh_eth: r8a779x: Enable automatically fetch receive descriptor
From: Sergei Shtylyov @ 2014-11-13 21:59 UTC (permalink / raw)
To: David Miller, ykaneko0929
Cc: netdev, horms, magnus.damm, linux-sh, grant.likely
In-Reply-To: <20141113.150311.1961794581032201784.davem@davemloft.net>
Hello.
On 11/13/2014 11:03 PM, David Miller wrote:
>> From: Kouei Abe <kouei.abe.cp@renesas.com>
>> HDMAC automatically fetches the receive descriptor and receives frames.
>> Continuous reception of multiple frames is possible.
>> Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com>
>> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
>> ---
>>
>> This patch is based on net-next tree.
>
> This doesn't even compile, or, it depends upon another patch which you have
> not mentioned.
This patch is just very outdated -- this issue has been fixed for all SoCs
before.
> Because sh_eth_cpu_data does not have an rmcr_value field.
Right, it was removed by the above mentioned patch (based on your
feedback BTW ;-).
WBR, Sergei
* Re: [PATCH] sh_eth: r8a779x: Enable automatically fetch receive descriptor
From: Sergei Shtylyov @ 2014-11-13 21:57 UTC (permalink / raw)
To: Yoshihiro Kaneko, netdev
Cc: David S. Miller, Simon Horman, Magnus Damm, linux-sh,
Grant Likely
In-Reply-To: <1415861819-27812-1-git-send-email-ykaneko0929@gmail.com>
Hello.
On 11/13/2014 09:56 AM, Yoshihiro Kaneko wrote:
> From: Kouei Abe <kouei.abe.cp@renesas.com>
> HDMAC automatically fetches the receive descriptor and receives frames.
> Continuous reception of multiple frames is possible.
> Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> This patch is based on net-next tree.
This patch is not needed any more because of an earlier patch by Ben Dooks
that set RMCR.RNC for all Ether devices.
> drivers/net/ethernet/renesas/sh_eth.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index dbe8606..badb734 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -494,6 +494,7 @@ static struct sh_eth_cpu_data r8a779x_data = {
> .eesr_err_check = EESR_TWB | EESR_TABT | EESR_RABT | EESR_RFE |
> EESR_RDE | EESR_RFRMER | EESR_TFE | EESR_TDE |
> EESR_ECI,
> + .rmcr_value = RMCR_RNC,
Looks like you didn't even bother to compile.
WBR, Sergei
* Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
From: Pranith Kumar @ 2014-11-13 21:55 UTC (permalink / raw)
To: David Howells
Cc: David S. Miller, Dan Carpenter, open list:NETWORKING [GENERAL],
open list, paulmck
In-Reply-To: <24601.1415911646@warthog.procyon.org.uk>
On 11/13/2014 03:47 PM, David Howells wrote:
> Pranith Kumar <bobby.prani@gmail.com> wrote:
>
>> Recently lockless_dereference() was added which can be used in place of
>> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Actually, the use of smp_read_barrier_depends() is wrong in circular
> buffering. See Documentation/circular-buffers.txt
>
OK. Should I send in a patch removing these barriers then?
--
Pranith
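
Editorial sketch for the circular-buffer point above: in a single-producer/single-consumer ring, the index published by one side is normally paired with an acquire load on the other side, rather than relying on a dependency barrier. The code below is a userspace analogue using C11 atomics, not kernel code; struct ring, ring_push and ring_pop are hypothetical names, and the mapping of memory_order_acquire/memory_order_release onto the kernel's own primitives is an assumption made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u  /* must be a power of two */

struct ring {
    int slots[RING_SIZE];
    _Atomic size_t head;  /* written only by the producer */
    _Atomic size_t tail;  /* written only by the consumer */
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct ring *r, int value)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire pairs with the consumer's release store of tail. */
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;

    r->slots[head & (RING_SIZE - 1)] = value;
    /* Release publishes the slot contents before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct ring *r, int *value)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire pairs with the producer's release store of head. */
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;

    *value = r->slots[tail & (RING_SIZE - 1)];
    /* Release hands the slot back to the producer after the read. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

The indices are free-running counters that are only masked when a slot is accessed, which lets "full" and "empty" be told apart without sacrificing a slot.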
* Re: [PATCH 0/4] move pci_assigned_vfs() check (while disabling VFs) to pci sub-system
From: Don Dutile @ 2014-11-13 21:36 UTC (permalink / raw)
To: Sathya Perla, Alex Williamson
Cc: linux-pci@vger.kernel.org, netdev@vger.kernel.org,
ariel.elior@qlogic.com, linux.nics@intel.com,
shahed.shaikh@qlogic.com
In-Reply-To: <CF9D1877D81D214CB0CA0669EFAE020C68CD95E1@CMEXMB1.ad.emulex.com>
On 11/13/2014 02:04 AM, Sathya Perla wrote:
>> -----Original Message-----
>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>
>> On Tue, 2014-11-11 at 14:09 -0500, Don Dutile wrote:
>>> On 11/10/2014 06:53 AM, Sathya Perla wrote:
>>>> A user must not be allowed to disable VFs while they are already assigned to
>>>> a guest. This check is being made in each individual driver that implements
>>>> the sriov_configure PCI method.
>>>> This patch-set fixes this code duplication by moving this check from
>>>> drivers to the sriov_numvfs_store() routine just before invoking
>>>> sriov_configure() when num_vfs is equal to 0.
>>>>
>>>> Vasundhara Volam (4):
>>>> pci: move pci_assigned_vfs() check while disabling VFs to pci
>>>> sub-system
>>>> bnx2x: remove pci_assigned_vfs() check while disabling VFs
>>>> i40e: remove pci_assigned_vfs() check while disabling VFs
>>>> qlcnic: remove pci_assigned_vfs() check while disabling VFs
>>>>
>>>> drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 2 +-
>>>> drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 7 +------
>>>> .../net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c | 10 ----------
>>>> drivers/pci/pci-sysfs.c | 5 +++++
>>>> 4 files changed, 7 insertions(+), 17 deletions(-)
>>>>
>>> I have had a side conversation with Alex Williamson, VFIO author.
>>>
>>> VFIO is the upstream method by which device-assignment is managed/handled
>>> on kvm now.
>>> It does not set the PCI_DEV_FLAGS_ASSIGNED pci dev-flags, and thus,
>>> this check will not work when VFIO is used.
>>>
>>> This patch set will only work for the former, kvm-managed, device-assignment
>>> method, which is currently being deprecated in qemu as well.
>>>
>>> So, yes, it works for kvm managed device-assignment, but not the
>>> newer, VFIO-based device-assignment.
>>>
>>> Note, also, that the pci_assigned_vfs() check in the drivers will
>>> always return 0 when VFIO is used for device assignment, so keeping
>>> these checks in the drivers doesn't do what they imply either.
>>>
>>> So, taking in the patch solves old, kvm-managed, device assignment,
>>> but a new method is needed when VFIO is involved.
>>>
>>> - Don
>>>
>>> ps -- Note: just adding the flag setting in vfio-pci does not necessarily
>>> solve this problem. VFIO does not know if a device is assigned to a guest;
>>> it only knows that a caller of the ioctl has requested that the device be
>>> assigned to vfio and be dma-mapped for a region of memory.
>>> So, a new PF<->VF mechanism needs to be put in place to
>>> determine the equivalent information.
>>
>> pps -- Note: testing pci_assigned_vfs() is racy, nothing prevents the flag
>> being added to a device between your check and removing the VF
>> device.
>> This is one of the reasons that vfio-pci doesn't use it and that this
>> interface should be discouraged in the kernel.
>
> Alex/Don, I agree with the points you've raised.
> But, I'd like to know whether you think this patch-set should be accepted or not.
> Even though this patch-set doesn't fix any of the pending issues raised here,
> it's a small step forward as it reduces the number of invocations of pci_assigned_vfs()
> check which is a good thing.
>
> thanks,
> -Sathya
>
IMO, it's only a fix for XEN. Upstream has moved on to VFIO-based device assignment,
and these patches do not fix the issue. They don't make it any worse, either,
but I don't want someone scanning the patch list thinking it's been fixed either.
So, it's ok (but racy, as it is today) for kvm-managed device-assignment & Xen
pci passthrough, but does zip for VFIO-based assignment.
We need a new api - maybe another/new state in sysfs, so userspace can set it
(like qemu &/or libvirt), as well as (xen) kernel ... and meeting non-racy condition(s).
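
Editorial sketch of the race discussed in this thread: a separate "is anything assigned?" test followed by "disable" leaves a window in which an assignment can slip in. One way to close such a window, shown here purely as an illustration, is to fold both facts into a single word that is updated with compare-and-swap. This is a userspace C11 model with hypothetical names (struct vf_state, vf_assign, vf_release, vfs_try_disable); it is not a proposed kernel or sysfs interface and says nothing about how VFIO tracks devices.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define VFS_DISABLED (-1)

/* Hypothetical: a value >= 0 counts assigned VFs, VFS_DISABLED means gone. */
struct vf_state {
    _Atomic int assigned;
};

/* Take a reference on behalf of a guest; fails once VFs are disabled. */
static bool vf_assign(struct vf_state *s)
{
    int cur = atomic_load(&s->assigned);

    while (cur != VFS_DISABLED) {
        /* On failure, cur is reloaded and the disabled check is repeated. */
        if (atomic_compare_exchange_weak(&s->assigned, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Drop a reference; callers must only release what they assigned. */
static void vf_release(struct vf_state *s)
{
    atomic_fetch_sub(&s->assigned, 1);
}

/* Disable succeeds only if, atomically, no VF is assigned at that instant. */
static bool vfs_try_disable(struct vf_state *s)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&s->assigned, &expected,
                                          VFS_DISABLED);
}
```

Because the check and the state change are one atomic operation, an assignment either lands before the disable (which then fails) or after it (and is itself refused).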
* [PATCH 54/56] net/ipv4: support compiling out splice
From: Pieter Smith @ 2014-11-13 21:23 UTC (permalink / raw)
To: pieter
Cc: Josh Triplett, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
open list:NETWORKING [IPv4/..., open list
In-Reply-To: <1415913813-362-1-git-send-email-pieter@boesman.nl>
Compile out splice support from ipv4 networking when the splice-family of
syscalls is not supported by the system (i.e. CONFIG_SYSCALL_SPLICE is
undefined).
Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
net/ipv4/af_inet.c | 2 +-
net/ipv4/tcp.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index d156b3c..e025478 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -917,7 +917,7 @@ const struct proto_ops inet_stream_ops = {
.recvmsg = inet_recvmsg,
.mmap = sock_no_mmap,
.sendpage = inet_sendpage,
- .splice_read = tcp_splice_read,
+ SPLICE_READ_INIT(tcp_splice_read)
#ifdef CONFIG_COMPAT
.compat_setsockopt = compat_sock_common_setsockopt,
.compat_getsockopt = compat_sock_common_getsockopt,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 541f26a..afc825f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -686,6 +686,7 @@ static void tcp_push(struct sock *sk, int flags, int mss_now,
__tcp_push_pending_frames(sk, mss_now, nonagle);
}
+#ifdef CONFIG_SYSCALL_SPLICE
static int tcp_splice_data_recv(read_descriptor_t *rd_desc, struct sk_buff *skb,
unsigned int offset, size_t len)
{
@@ -805,6 +806,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
return ret;
}
EXPORT_SYMBOL(tcp_splice_read);
+#endif /* #ifdef CONFIG_SYSCALL_SPLICE */
struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp)
{
--
1.9.1
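
Editorial sketch: SPLICE_READ_INIT() is used in this diff but defined elsewhere in the series, so its definition is not visible here. The following shows one way such a macro could work, assuming it expands to a designated initializer with a trailing comma when CONFIG_SYSCALL_SPLICE is set and to nothing otherwise. The definition below is a guess, and struct demo_ops and the demo_* functions are hypothetical stand-ins for the kernel's ops tables.

```c
/* Comment this out to model a kernel built without the splice syscalls. */
#define CONFIG_SYSCALL_SPLICE 1

#ifdef CONFIG_SYSCALL_SPLICE
/* Assumed shape: the macro carries its own trailing comma. */
#define SPLICE_READ_INIT(fn) .splice_read = (fn),
#else
/* Expands to nothing, so the member is left zero-initialized (NULL). */
#define SPLICE_READ_INIT(fn)
#endif

/* Hypothetical, trimmed-down ops table; not the kernel's struct proto_ops. */
struct demo_ops {
    int (*recvmsg)(void);
    int (*sendpage)(void);
    int (*splice_read)(void);
};

static int demo_recvmsg(void)     { return 1; }
static int demo_sendpage(void)    { return 2; }
static int demo_splice_read(void) { return 3; }

static const struct demo_ops demo_stream_ops = {
    .recvmsg  = demo_recvmsg,
    .sendpage = demo_sendpage,
    SPLICE_READ_INIT(demo_splice_read)
};
```

Putting the comma inside the macro is what would allow the call site to appear without one, as it does in the hunk above, and still be followed by further initializers.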
* Re: [PATCH 2/3] r8169: Use load_acquire() and store_release() to reduce memory barrier overhead
From: Francois Romieu @ 2014-11-13 21:30 UTC (permalink / raw)
To: Alexander Duyck
Cc: linux-arch, netdev, linux-kernel, mikey, tony.luck,
mathieu.desnoyers, donald.c.skidmore, peterz, benh,
heiko.carstens, oleg, will.deacon, davem, michael, matthew.vick,
nic_swsd, geert, jeffrey.t.kirsher, fweisbec, schwidefsky, linux,
paulmck, torvalds, mingo
In-Reply-To: <20141113192735.12579.22892.stgit@ahduyck-server>
Alexander Duyck <alexander.h.duyck@redhat.com> :
[...]
> In addition the r8169 uses a rmb() however I believe it is placed incorrectly
> as I assume it is supposed to be ordering descriptor reads after the check for
> ownership.
Not exactly. It's a barrier against compiler optimization from 2004.
It should not matter.
However I disagree with the change below:
> @@ -7284,11 +7280,11 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, u32 budget
> struct RxDesc *desc = tp->RxDescArray + entry;
> u32 status;
>
> - rmb();
> - status = le32_to_cpu(desc->opts1) & tp->opts1_mask;
> -
> + status = cpu_to_le32(load_acquire(&desc->opts1));
> if (status & DescOwn)
> break;
> +
> + status &= tp->opts1_mask;
-> tp->opts1_mask is not __le32 tainted.
Btw, should I consider the sketch above as a skeleton in my r8169 closet ?
NIC CPU0 CPU1
| CPU | NIC | CPU | CPU |
| CPU | NIC | CPU | CPU |
^ tx_dirty
[start_xmit...
| CPU | CPU | CPU | CPU |
(NIC did its job)
[rtl_tx...
| ... | ... | NIC | NIC |
(ring update)
(tx_dirty increases)
| CPU | CPU | ??? | ??? |
tx_dirty ?
reaping about-to-be-sent
buffers on some platforms ?
...start_xmit]
--
Ueimor
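
Editorial sketch for the byte-order remark above, as I read it: the mask is a CPU-order value, so the little-endian descriptor word has to be converted to CPU order before it is tested or masked. The code is a portable userspace illustration; DESC_OWN, le32_decode and desc_status are hypothetical names and values, not the r8169 definitions.

```c
#include <stdint.h>

#define DESC_OWN 0x80000000u  /* hypothetical ownership bit, CPU byte order */

/* Decode a little-endian 32-bit descriptor word into CPU byte order. */
static uint32_t le32_decode(const unsigned char raw[4])
{
    return (uint32_t)raw[0] |
           ((uint32_t)raw[1] << 8) |
           ((uint32_t)raw[2] << 16) |
           ((uint32_t)raw[3] << 24);
}

/*
 * Returns 1 if the device still owns the descriptor. Otherwise returns 0
 * and stores the masked status word in *status.
 */
static int desc_status(const unsigned char raw[4], uint32_t opts_mask,
                       uint32_t *status)
{
    uint32_t opts = le32_decode(raw);  /* convert first ... */

    if (opts & DESC_OWN)
        return 1;

    *status = opts & opts_mask;        /* ... then mask in CPU order */
    return 0;
}
```

Decoding byte by byte gives the same result on little-endian and big-endian hosts, which is the property the kernel's typed byte-order helpers are meant to guarantee.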
* pull request: wireless 2014-11-13
From: John W. Linville @ 2014-11-13 21:28 UTC (permalink / raw)
To: davem; +Cc: linux-wireless, netdev, linux-kernel
Dave,
Please pull this set of a few more wireless fixes intended for the
3.18 stream...
For the mac80211 bits, Johannes says:
"This has just one fix, for an issue with the CCMP decryption
that can cause a kernel crash. I'm not sure it's remotely
exploitable, but it's an important fix nonetheless."
For the iwlwifi bits, Emmanuel says:
"Two fixes here - we weren't updating mac80211 if a scan
was cut short by RFKILL which confused cfg80211. As a
result, the latter wouldn't allow another scan to run.
Liad fixes a small bug in the firmware dump."
On top of that...
Arend van Spriel corrects a channel width conversion that caused a
WARNING in brcmfmac.
Hauke Mehrtens avoids a NULL pointer dereference in b43.
Larry Finger hits a trio of rtlwifi bugs left over from recent
backporting from the Realtek vendor driver.
Miaoqing Pan fixes a clocking problem in ath9k that could affect
packet timestamps and such.
Stanislaw Gruszka addresses a payload alignment issue that has been
plaguing rt2x00.
Please let me know if there are problems!
John
---
The following changes since commit 0c9a67c8f1d2b71a89f66349362412e9bf6becab:
Merge tag 'mac80211-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 (2014-11-04 15:56:33 -0500)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless.git tags/master-2014-11-11
for you to fetch changes up to 4e6ce4dc7ce71d0886908d55129d5d6482a27ff9:
ath9k: Fix RTC_DERIVED_CLK usage (2014-11-11 16:24:18 -0500)
----------------------------------------------------------------
Arend van Spriel (1):
brcmfmac: fix conversion of channel width 20MHZ_NOHT
Emmanuel Grumbach (1):
iwlwifi: mvm: abort scan upon RFKILL
Hauke Mehrtens (1):
b43: fix NULL pointer dereference in b43_phy_copy()
John W. Linville (2):
Merge tag 'mac80211-for-john-2014-11-10' of git://git.kernel.org/.../jberg/mac80211
Merge tag 'iwlwifi-for-john-2014-11-10' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
Larry Finger (3):
rtlwifi: Fix setting of tx descriptor for new trx flow
rtlwifi: Fix errors in descriptor manipulation
rtlwifi: rtl8192se: Fix connection problems
Liad Kaufman (1):
iwlwifi: pcie: fix prph dump length
Miaoqing Pan (1):
ath9k: Fix RTC_DERIVED_CLK usage
Ronald Wahl (1):
mac80211: Fix regression that triggers a kernel BUG with CCMP
Stanislaw Gruszka (1):
rt2x00: do not align payload on modern H/W
drivers/net/wireless/ath/ath9k/ar9003_phy.c | 13 ++++++
drivers/net/wireless/ath/ath9k/hw.c | 13 ------
drivers/net/wireless/b43/phy_common.c | 4 +-
.../net/wireless/brcm80211/brcmfmac/wl_cfg80211.c | 6 +++
drivers/net/wireless/iwlwifi/mvm/scan.c | 20 ++++-----
drivers/net/wireless/iwlwifi/pcie/trans.c | 3 +-
drivers/net/wireless/rt2x00/rt2x00queue.c | 50 ++++++----------------
drivers/net/wireless/rtlwifi/pci.c | 19 +++++---
drivers/net/wireless/rtlwifi/rtl8192se/hw.c | 7 ++-
drivers/net/wireless/rtlwifi/rtl8192se/phy.c | 2 +
drivers/net/wireless/rtlwifi/rtl8192se/sw.c | 16 +++++++
net/mac80211/aes_ccm.c | 3 ++
12 files changed, 81 insertions(+), 75 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ar9003_phy.c b/drivers/net/wireless/ath/ath9k/ar9003_phy.c
index 697c4ae90af0..1e8ea5e4d4ca 100644
--- a/drivers/net/wireless/ath/ath9k/ar9003_phy.c
+++ b/drivers/net/wireless/ath/ath9k/ar9003_phy.c
@@ -664,6 +664,19 @@ static void ar9003_hw_override_ini(struct ath_hw *ah)
ah->enabled_cals |= TX_CL_CAL;
else
ah->enabled_cals &= ~TX_CL_CAL;
+
+ if (AR_SREV_9340(ah) || AR_SREV_9531(ah) || AR_SREV_9550(ah)) {
+ if (ah->is_clk_25mhz) {
+ REG_WRITE(ah, AR_RTC_DERIVED_CLK, 0x17c << 1);
+ REG_WRITE(ah, AR_SLP32_MODE, 0x0010f3d7);
+ REG_WRITE(ah, AR_SLP32_INC, 0x0001e7ae);
+ } else {
+ REG_WRITE(ah, AR_RTC_DERIVED_CLK, 0x261 << 1);
+ REG_WRITE(ah, AR_SLP32_MODE, 0x0010f400);
+ REG_WRITE(ah, AR_SLP32_INC, 0x0001e800);
+ }
+ udelay(100);
+ }
}
static void ar9003_hw_prog_ini(struct ath_hw *ah,
diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c
index 8be4b1453394..2ad605760e21 100644
--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -861,19 +861,6 @@ static void ath9k_hw_init_pll(struct ath_hw *ah,
udelay(RTC_PLL_SETTLE_DELAY);
REG_WRITE(ah, AR_RTC_SLEEP_CLK, AR_RTC_FORCE_DERIVED_CLK);
-
- if (AR_SREV_9340(ah) || AR_SREV_9550(ah)) {
- if (ah->is_clk_25mhz) {
- REG_WRITE(ah, AR_RTC_DERIVED_CLK, 0x17c << 1);
- REG_WRITE(ah, AR_SLP32_MODE, 0x0010f3d7);
- REG_WRITE(ah, AR_SLP32_INC, 0x0001e7ae);
- } else {
- REG_WRITE(ah, AR_RTC_DERIVED_CLK, 0x261 << 1);
- REG_WRITE(ah, AR_SLP32_MODE, 0x0010f400);
- REG_WRITE(ah, AR_SLP32_INC, 0x0001e800);
- }
- udelay(100);
- }
}
static void ath9k_hw_init_interrupt_masks(struct ath_hw *ah,
diff --git a/drivers/net/wireless/b43/phy_common.c b/drivers/net/wireless/b43/phy_common.c
index 1dfc682a8055..ee27b06074e1 100644
--- a/drivers/net/wireless/b43/phy_common.c
+++ b/drivers/net/wireless/b43/phy_common.c
@@ -300,9 +300,7 @@ void b43_phy_write(struct b43_wldev *dev, u16 reg, u16 value)
void b43_phy_copy(struct b43_wldev *dev, u16 destreg, u16 srcreg)
{
- assert_mac_suspended(dev);
- dev->phy.ops->phy_write(dev, destreg,
- dev->phy.ops->phy_read(dev, srcreg));
+ b43_phy_write(dev, destreg, b43_phy_read(dev, srcreg));
}
void b43_phy_mask(struct b43_wldev *dev, u16 offset, u16 mask)
diff --git a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
index 28fa25b509db..39b45c038a93 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
@@ -299,6 +299,7 @@ static u16 chandef_to_chanspec(struct brcmu_d11inf *d11inf,
primary_offset = ch->center_freq1 - ch->chan->center_freq;
switch (ch->width) {
case NL80211_CHAN_WIDTH_20:
+ case NL80211_CHAN_WIDTH_20_NOHT:
ch_inf.bw = BRCMU_CHAN_BW_20;
WARN_ON(primary_offset != 0);
break;
@@ -323,6 +324,10 @@ static u16 chandef_to_chanspec(struct brcmu_d11inf *d11inf,
ch_inf.sb = BRCMU_CHAN_SB_LU;
}
break;
+ case NL80211_CHAN_WIDTH_80P80:
+ case NL80211_CHAN_WIDTH_160:
+ case NL80211_CHAN_WIDTH_5:
+ case NL80211_CHAN_WIDTH_10:
default:
WARN_ON_ONCE(1);
}
@@ -333,6 +338,7 @@ static u16 chandef_to_chanspec(struct brcmu_d11inf *d11inf,
case IEEE80211_BAND_5GHZ:
ch_inf.band = BRCMU_CHAN_BAND_5G;
break;
+ case IEEE80211_BAND_60GHZ:
default:
WARN_ON_ONCE(1);
}
diff --git a/drivers/net/wireless/iwlwifi/mvm/scan.c b/drivers/net/wireless/iwlwifi/mvm/scan.c
index b280d5d87127..7554f7053830 100644
--- a/drivers/net/wireless/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/iwlwifi/mvm/scan.c
@@ -602,16 +602,6 @@ static int iwl_mvm_cancel_regular_scan(struct iwl_mvm *mvm)
SCAN_COMPLETE_NOTIFICATION };
int ret;
- if (mvm->scan_status == IWL_MVM_SCAN_NONE)
- return 0;
-
- if (iwl_mvm_is_radio_killed(mvm)) {
- ieee80211_scan_completed(mvm->hw, true);
- iwl_mvm_unref(mvm, IWL_MVM_REF_SCAN);
- mvm->scan_status = IWL_MVM_SCAN_NONE;
- return 0;
- }
-
iwl_init_notification_wait(&mvm->notif_wait, &wait_scan_abort,
scan_abort_notif,
ARRAY_SIZE(scan_abort_notif),
@@ -1400,6 +1390,16 @@ int iwl_mvm_unified_sched_scan_lmac(struct iwl_mvm *mvm,
int iwl_mvm_cancel_scan(struct iwl_mvm *mvm)
{
+ if (mvm->scan_status == IWL_MVM_SCAN_NONE)
+ return 0;
+
+ if (iwl_mvm_is_radio_killed(mvm)) {
+ ieee80211_scan_completed(mvm->hw, true);
+ iwl_mvm_unref(mvm, IWL_MVM_REF_SCAN);
+ mvm->scan_status = IWL_MVM_SCAN_NONE;
+ return 0;
+ }
+
if (mvm->fw->ucode_capa.api[0] & IWL_UCODE_TLV_API_LMAC_SCAN)
return iwl_mvm_scan_offload_stop(mvm, true);
return iwl_mvm_cancel_regular_scan(mvm);
diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c
index 160c3ebc48d0..dd2f3f8baa9d 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1894,8 +1894,7 @@ static u32 iwl_trans_pcie_dump_prph(struct iwl_trans *trans,
int reg;
__le32 *val;
- prph_len += sizeof(*data) + sizeof(*prph) +
- num_bytes_in_chunk;
+ prph_len += sizeof(**data) + sizeof(*prph) + num_bytes_in_chunk;
(*data)->type = cpu_to_le32(IWL_FW_ERROR_DUMP_PRPH);
(*data)->len = cpu_to_le32(sizeof(*prph) +
diff --git a/drivers/net/wireless/rt2x00/rt2x00queue.c b/drivers/net/wireless/rt2x00/rt2x00queue.c
index 8e68f87ab13c..66ff36447b94 100644
--- a/drivers/net/wireless/rt2x00/rt2x00queue.c
+++ b/drivers/net/wireless/rt2x00/rt2x00queue.c
@@ -158,55 +158,29 @@ void rt2x00queue_align_frame(struct sk_buff *skb)
skb_trim(skb, frame_length);
}
-void rt2x00queue_insert_l2pad(struct sk_buff *skb, unsigned int header_length)
+/*
+ * H/W needs L2 padding between the header and the paylod if header size
+ * is not 4 bytes aligned.
+ */
+void rt2x00queue_insert_l2pad(struct sk_buff *skb, unsigned int hdr_len)
{
- unsigned int payload_length = skb->len - header_length;
- unsigned int header_align = ALIGN_SIZE(skb, 0);
- unsigned int payload_align = ALIGN_SIZE(skb, header_length);
- unsigned int l2pad = payload_length ? L2PAD_SIZE(header_length) : 0;
+ unsigned int l2pad = (skb->len > hdr_len) ? L2PAD_SIZE(hdr_len) : 0;
- /*
- * Adjust the header alignment if the payload needs to be moved more
- * than the header.
- */
- if (payload_align > header_align)
- header_align += 4;
-
- /* There is nothing to do if no alignment is needed */
- if (!header_align)
+ if (!l2pad)
return;
- /* Reserve the amount of space needed in front of the frame */
- skb_push(skb, header_align);
-
- /*
- * Move the header.
- */
- memmove(skb->data, skb->data + header_align, header_length);
-
- /* Move the payload, if present and if required */
- if (payload_length && payload_align)
- memmove(skb->data + header_length + l2pad,
- skb->data + header_length + l2pad + payload_align,
- payload_length);
-
- /* Trim the skb to the correct size */
- skb_trim(skb, header_length + l2pad + payload_length);
+ skb_push(skb, l2pad);
+ memmove(skb->data, skb->data + l2pad, hdr_len);
}
-void rt2x00queue_remove_l2pad(struct sk_buff *skb, unsigned int header_length)
+void rt2x00queue_remove_l2pad(struct sk_buff *skb, unsigned int hdr_len)
{
- /*
- * L2 padding is only present if the skb contains more than just the
- * IEEE 802.11 header.
- */
- unsigned int l2pad = (skb->len > header_length) ?
- L2PAD_SIZE(header_length) : 0;
+ unsigned int l2pad = (skb->len > hdr_len) ? L2PAD_SIZE(hdr_len) : 0;
if (!l2pad)
return;
- memmove(skb->data + l2pad, skb->data, header_length);
+ memmove(skb->data + l2pad, skb->data, hdr_len);
skb_pull(skb, l2pad);
}
diff --git a/drivers/net/wireless/rtlwifi/pci.c b/drivers/net/wireless/rtlwifi/pci.c
index 25daa8715219..61f5d36eca6a 100644
--- a/drivers/net/wireless/rtlwifi/pci.c
+++ b/drivers/net/wireless/rtlwifi/pci.c
@@ -842,7 +842,8 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
break;
}
/* handle command packet here */
- if (rtlpriv->cfg->ops->rx_command_packet(hw, stats, skb)) {
+ if (rtlpriv->cfg->ops->rx_command_packet &&
+ rtlpriv->cfg->ops->rx_command_packet(hw, stats, skb)) {
dev_kfree_skb_any(skb);
goto end;
}
@@ -1127,9 +1128,14 @@ static void _rtl_pci_prepare_bcn_tasklet(struct ieee80211_hw *hw)
__skb_queue_tail(&ring->queue, pskb);
- rtlpriv->cfg->ops->set_desc(hw, (u8 *)pdesc, true, HW_DESC_OWN,
- &temp_one);
-
+ if (rtlpriv->use_new_trx_flow) {
+ temp_one = 4;
+ rtlpriv->cfg->ops->set_desc(hw, (u8 *)pbuffer_desc, true,
+ HW_DESC_OWN, (u8 *)&temp_one);
+ } else {
+ rtlpriv->cfg->ops->set_desc(hw, (u8 *)pdesc, true, HW_DESC_OWN,
+ &temp_one);
+ }
return;
}
@@ -1370,9 +1376,9 @@ static void _rtl_pci_free_tx_ring(struct ieee80211_hw *hw,
ring->desc = NULL;
if (rtlpriv->use_new_trx_flow) {
pci_free_consistent(rtlpci->pdev,
- sizeof(*ring->desc) * ring->entries,
+ sizeof(*ring->buffer_desc) * ring->entries,
ring->buffer_desc, ring->buffer_desc_dma);
- ring->desc = NULL;
+ ring->buffer_desc = NULL;
}
}
@@ -1543,7 +1549,6 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
true,
HW_DESC_TXBUFF_ADDR),
skb->len, PCI_DMA_TODEVICE);
- ring->idx = (ring->idx + 1) % ring->entries;
kfree_skb(skb);
ring->idx = (ring->idx + 1) % ring->entries;
}
diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/hw.c b/drivers/net/wireless/rtlwifi/rtl8192se/hw.c
index 00e067044c08..5761d5b49e39 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/hw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/hw.c
@@ -1201,6 +1201,9 @@ static int _rtl92se_set_media_status(struct ieee80211_hw *hw,
}
+ if (type != NL80211_IFTYPE_AP &&
+ rtlpriv->mac80211.link_state < MAC80211_LINKED)
+ bt_msr = rtl_read_byte(rtlpriv, MSR) & ~MSR_LINK_MASK;
rtl_write_byte(rtlpriv, (MSR), bt_msr);
temp = rtl_read_dword(rtlpriv, TCR);
@@ -1262,6 +1265,7 @@ void rtl92se_enable_interrupt(struct ieee80211_hw *hw)
rtl_write_dword(rtlpriv, INTA_MASK, rtlpci->irq_mask[0]);
/* Support Bit 32-37(Assign as Bit 0-5) interrupt setting now */
rtl_write_dword(rtlpriv, INTA_MASK + 4, rtlpci->irq_mask[1] & 0x3F);
+ rtlpci->irq_enabled = true;
}
void rtl92se_disable_interrupt(struct ieee80211_hw *hw)
@@ -1276,8 +1280,7 @@ void rtl92se_disable_interrupt(struct ieee80211_hw *hw)
rtlpci = rtl_pcidev(rtl_pcipriv(hw));
rtl_write_dword(rtlpriv, INTA_MASK, 0);
rtl_write_dword(rtlpriv, INTA_MASK + 4, 0);
-
- synchronize_irq(rtlpci->pdev->irq);
+ rtlpci->irq_enabled = false;
}
static u8 _rtl92s_set_sysclk(struct ieee80211_hw *hw, u8 data)
diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/phy.c b/drivers/net/wireless/rtlwifi/rtl8192se/phy.c
index 77c5b5f35244..4b4612fe2fdb 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/phy.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/phy.c
@@ -399,6 +399,8 @@ static bool _rtl92s_phy_sw_chnl_step_by_step(struct ieee80211_hw *hw,
case 2:
currentcmd = &postcommoncmd[*step];
break;
+ default:
+ return true;
}
if (currentcmd->cmdid == CMDID_END) {
diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/sw.c b/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
index aadba29c167a..fb003868bdef 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
@@ -236,6 +236,19 @@ static void rtl92s_deinit_sw_vars(struct ieee80211_hw *hw)
}
}
+static bool rtl92se_is_tx_desc_closed(struct ieee80211_hw *hw, u8 hw_queue,
+ u16 index)
+{
+ struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
+ struct rtl8192_tx_ring *ring = &rtlpci->tx_ring[hw_queue];
+ u8 *entry = (u8 *)(&ring->desc[ring->idx]);
+ u8 own = (u8)rtl92se_get_desc(entry, true, HW_DESC_OWN);
+
+ if (own)
+ return false;
+ return true;
+}
+
static struct rtl_hal_ops rtl8192se_hal_ops = {
.init_sw_vars = rtl92s_init_sw_vars,
.deinit_sw_vars = rtl92s_deinit_sw_vars,
@@ -269,6 +282,7 @@ static struct rtl_hal_ops rtl8192se_hal_ops = {
.led_control = rtl92se_led_control,
.set_desc = rtl92se_set_desc,
.get_desc = rtl92se_get_desc,
+ .is_tx_desc_closed = rtl92se_is_tx_desc_closed,
.tx_polling = rtl92se_tx_polling,
.enable_hw_sec = rtl92se_enable_hw_security_config,
.set_key = rtl92se_set_key,
@@ -306,6 +320,8 @@ static struct rtl_hal_cfg rtl92se_hal_cfg = {
.maps[MAC_RCR_ACRC32] = RCR_ACRC32,
.maps[MAC_RCR_ACF] = RCR_ACF,
.maps[MAC_RCR_AAP] = RCR_AAP,
+ .maps[MAC_HIMR] = INTA_MASK,
+ .maps[MAC_HIMRE] = INTA_MASK + 4,
.maps[EFUSE_TEST] = REG_EFUSE_TEST,
.maps[EFUSE_CTRL] = REG_EFUSE_CTRL,
diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
index ec24378caaaf..09d9caaec591 100644
--- a/net/mac80211/aes_ccm.c
+++ b/net/mac80211/aes_ccm.c
@@ -53,6 +53,9 @@ int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
__aligned(__alignof__(struct aead_request));
struct aead_request *aead_req = (void *) aead_req_data;
+ if (data_len == 0)
+ return -EINVAL;
+
memset(aead_req, 0, sizeof(aead_req_data));
sg_init_one(&pt, data, data_len);
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
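
Editorial sketch of the rt2x00 change in the diff above: when the header length is not a multiple of 4, the simplified rt2x00queue_insert_l2pad() grows the frame at the front by the pad size and slides only the header forward, which leaves the gap between header and payload. The model below replays that on a plain buffer. struct frame, l2pad_size and frame_insert_l2pad are hypothetical names, and the pad formula is an assumption about what L2PAD_SIZE() computes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat stand-in for an skb with some headroom. */
struct frame {
    unsigned char buf[64];
    size_t off;  /* start of frame data inside buf; headroom lies before it */
    size_t len;  /* header + payload length */
};

/* Assumed pad rule: bytes needed to round hdr_len up to a multiple of 4. */
static unsigned int l2pad_size(unsigned int hdr_len)
{
    return (4 - (hdr_len & 3)) & 3;
}

/* Returns the pad inserted (0 if none needed), or -1 without headroom. */
static int frame_insert_l2pad(struct frame *f, unsigned int hdr_len)
{
    unsigned int l2pad = (f->len > hdr_len) ? l2pad_size(hdr_len) : 0;

    if (!l2pad)
        return 0;
    if (f->off < l2pad)
        return -1;

    /* Equivalent of skb_push(): extend the frame into the headroom. */
    f->off -= l2pad;
    f->len += l2pad;

    /* Slide the header to the new start; the payload does not move. */
    memmove(f->buf + f->off, f->buf + f->off + l2pad, hdr_len);
    return (int)l2pad;
}
```

For a 26-byte header the pad is 2 bytes, so the payload ends up starting at a 4-byte multiple relative to the frame start; a 24-byte header needs no pad and the frame is left untouched.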
* [PATCH 55/56] net/core: support compiling out splice
From: Pieter Smith @ 2014-11-13 21:23 UTC (permalink / raw)
To: pieter
Cc: Josh Triplett, David S. Miller, Tom Herbert, Willem de Bruijn,
Eric Dumazet, Daniel Borkmann, Florian Westphal,
Michael S. Tsirkin, Vlad Yasevich, Paul Durrant, Thomas Graf,
Herbert Xu, Jan Beulich, Miklos Szeredi, open list,
open list:NETWORKING [GENERAL]
In-Reply-To: <1415913813-362-1-git-send-email-pieter@boesman.nl>
Compile out splice support from networking core when the splice-family of
syscalls is not supported by the system (i.e. CONFIG_SYSCALL_SPLICE is
undefined).
Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
include/linux/skbuff.h | 2 ++
net/core/skbuff.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index abde271..5a67427 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2543,9 +2543,11 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len);
__wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, u8 *to,
int len, __wsum csum);
+#ifdef CONFIG_SYSCALL_SPLICE
int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
struct pipe_inode_info *pipe, unsigned int len,
unsigned int flags);
+#endif /* #ifdef CONFIG_SYSCALL_SPLICE */
void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
unsigned int skb_zerocopy_headlen(const struct sk_buff *from);
int skb_zerocopy(struct sk_buff *to, struct sk_buff *from,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 163b673..5610904 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1649,6 +1649,7 @@ fault:
}
EXPORT_SYMBOL(skb_copy_bits);
+#ifdef CONFIG_SYSCALL_SPLICE
/*
* Callback from splice_to_pipe(), if we need to release some pages
* at the end of the spd in case we error'ed out in filling the pipe.
@@ -1851,6 +1852,7 @@ done:
return ret;
}
+#endif /* #ifdef CONFIG_SYSCALL_SPLICE */
/**
* skb_store_bits - store bits from kernel buffer to skb
--
1.9.1
* [PATCH 53/56] net/ipv6: support compiling out splice
From: Pieter Smith @ 2014-11-13 21:23 UTC (permalink / raw)
To: pieter
Cc: Josh Triplett, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
open list:NETWORKING [IPv4/..., open list
In-Reply-To: <1415913813-362-1-git-send-email-pieter@boesman.nl>
Compile out splice support from ipv6 networking when the splice-family of
syscalls is not supported by the system (i.e. CONFIG_SYSCALL_SPLICE is
undefined).
Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
net/ipv6/af_inet6.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 2daa3a1..3d17064 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -523,7 +523,7 @@ const struct proto_ops inet6_stream_ops = {
.recvmsg = inet_recvmsg, /* ok */
.mmap = sock_no_mmap,
.sendpage = inet_sendpage,
- .splice_read = tcp_splice_read,
+ SPLICE_READ_INIT(tcp_splice_read)
#ifdef CONFIG_COMPAT
.compat_setsockopt = compat_sock_common_setsockopt,
.compat_getsockopt = compat_sock_common_getsockopt,
--
1.9.1
^ permalink raw reply related
* [PATCH 49/56] net/socket: support compiling out splice
From: Pieter Smith @ 2014-11-13 21:23 UTC (permalink / raw)
To: pieter
Cc: Josh Triplett, David S. Miller, open list:NETWORKING [GENERAL],
open list
In-Reply-To: <1415913813-362-1-git-send-email-pieter@boesman.nl>
Compile out splice support from socket when the splice-family of syscalls is not
supported by the system (i.e. CONFIG_SYSCALL_SPLICE is undefined).
Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
net/socket.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/socket.c b/net/socket.c
index 95ee7d8..5cb347a 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -155,8 +155,8 @@ static const struct file_operations socket_file_ops = {
.release = sock_close,
.fasync = sock_fasync,
.sendpage = sock_sendpage,
- .splice_write = generic_splice_sendpage,
- .splice_read = sock_splice_read,
+ SPLICE_WRITE_INIT(generic_splice_sendpage)
+ SPLICE_READ_INIT(sock_splice_read)
};
/*
@@ -881,7 +881,8 @@ static ssize_t sock_sendpage(struct file *file, struct page *page,
return kernel_sendpage(sock, page, offset, size, flags);
}
-static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
+static ssize_t __maybe_unused sock_splice_read(
+ struct file *file, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len,
unsigned int flags)
{
--
1.9.1
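
Note: SPLICE_READ_INIT() and SPLICE_WRITE_INIT() are not defined in this
patch; they presumably come from an earlier patch in the series. The sketch
below is only an assumption about how such helpers could work, using
hypothetical demo_* names and a cut-down ops structure rather than the real
struct file_operations. It shows why the call sites carry no trailing comma
(the macro supplies it) and why the handler needs an "unused" annotation
when the option is off.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct file_operations. */
struct demo_file_ops {
	int (*sendpage)(void);
	int (*splice_write)(void);
	int (*splice_read)(void);
};

/*
 * Assumed shape of the helpers: with the option enabled they expand to a
 * designated initializer including the trailing comma; otherwise they
 * expand to nothing and the member stays NULL.
 */
#ifdef CONFIG_SYSCALL_SPLICE
#define SPLICE_READ_INIT(fn)	.splice_read = (fn),
#define SPLICE_WRITE_INIT(fn)	.splice_write = (fn),
#else
#define SPLICE_READ_INIT(fn)
#define SPLICE_WRITE_INIT(fn)
#endif

/* GCC/Clang attribute; stands in for the kernel's __maybe_unused. */
#define demo_maybe_unused __attribute__((unused))

static int demo_sendpage(void) { return 1; }

/* Unreferenced when the option is off, hence the annotation. */
static int demo_maybe_unused demo_splice_write(void) { return 2; }
static int demo_maybe_unused demo_splice_read(void) { return 3; }

static const struct demo_file_ops demo_socket_file_ops = {
	.sendpage = demo_sendpage,
	SPLICE_WRITE_INIT(demo_splice_write)
	SPLICE_READ_INIT(demo_splice_read)
};

/* Returns 1 if both splice handlers were wired up, 0 otherwise. */
int demo_has_splice(void)
{
	return demo_socket_file_ops.splice_read != NULL &&
	       demo_socket_file_ops.splice_write != NULL;
}
```

Built without -DCONFIG_SYSCALL_SPLICE, both splice members are left NULL and
the two handlers are dropped without an unused-function warning; built with
it, the table is identical to the pre-patch initializer.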
^ permalink raw reply related
* Re: [PATCH net-next 2/2] r8152: adjust rtl_start_rx
From: David Miller @ 2014-11-13 21:22 UTC (permalink / raw)
To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel, linux-usb
In-Reply-To: <20141112.223146.2221136950144767962.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Wed, 12 Nov 2014 22:31:46 -0500 (EST)
> From: Hayes Wang <hayeswang@realtek.com>
> Date: Thu, 13 Nov 2014 02:31:14 +0000
>
>> My last method which I mentioned yesterday is similar to
>> this one. The difference is that I would re-use the rx
>> buffers, so I have to add them to the list for re-submitting,
>> not always allocate a new one.
>>
>> Although one rx buffer could contain many packets, I don't
>> think the whole size of the rx buffer is always used.
>> Therefore, I re-use the rx buffers to avoid always allocating
>> the 16K bytes rx buffer. This also makes sure that
>> I always have the buffers to submit without allocating a new
>> one.
>>
>> If you could accept this, I would modify this patch in
>> this way.
>
> I'll reread your original patch and think some more about this.
What if even the first r8152_submit_rx() fails? What ever will cause
any of these retries to trigger at all?
Second, why does your patch increment 'i' with 'i++;' in the error
break path? You should mark the first failed entry as unallocated
with actual_length == 0 and place it on the rx_done queue.
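
One possible reading of the second point, as a standalone sketch. All names
here (demo_rx_agg, demo_submit_rx, demo_start_rx) are hypothetical and only
model the bookkeeping; this is not the r8152 code. The index is not advanced
past the entry whose submit failed, so that entry is itself marked with
actual_length == 0 and queued for a later resubmit along with the rest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-buffer rx state. */
struct demo_rx_agg {
	int actual_length;	/* 0 marks a buffer that holds no data */
	int on_done_list;	/* 1 when queued for a later resubmit */
};

/*
 * Hypothetical submit function: fails for every index >= fail_at.
 * A negative fail_at means submission never fails.
 */
static int demo_submit_rx(int idx, int fail_at)
{
	return (fail_at >= 0 && idx >= fail_at) ? -1 : 0;
}

/* Returns the number of buffers that were submitted successfully. */
int demo_start_rx(struct demo_rx_agg *aggs, int n, int fail_at)
{
	int i, submitted = 0;

	assert(aggs != NULL && n >= 0);

	for (i = 0; i < n; i++) {
		aggs[i].actual_length = 0;
		aggs[i].on_done_list = 0;
	}

	for (i = 0; i < n; i++) {
		if (demo_submit_rx(i, fail_at) < 0)
			break;	/* no i++ here: entry i is the failed one */
		submitted++;
	}

	/* Queue the failed entry and everything after it for retry. */
	for (; i < n; i++) {
		aggs[i].actual_length = 0;
		aggs[i].on_done_list = 1;
	}

	return submitted;
}
```

Note that the sketch does not address the first question: if even the first
submit fails, every buffer ends up on the done list and something else still
has to trigger the retry.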
^ permalink raw reply
* [PULL] vhost: cleanups and fixes
From: Michael S. Tsirkin @ 2014-11-13 21:22 UTC (permalink / raw)
To: Linus Torvalds; +Cc: kvm, mst, netdev, linux-kernel, virtualization, tgraf
The following changes since commit 206c5f60a3d902bc4b56dab2de3e88de5eb06108:
Linux 3.18-rc4 (2014-11-09 14:55:29 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 65eca3a20264a8999570c269406196bd1ae23be7:
virtio_console: move early VQ enablement (2014-11-13 09:53:26 +0200)
It seems like a good idea to merge this bugfix now, as it's clearly
a regression and several people complained.
----------------------------------------------------------------
virtio: bugfix for 3.18
This fixes a crash in virtio console
multi-channel mode that got introduced in -rc1.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
----------------------------------------------------------------
Cornelia Huck (1):
virtio_console: move early VQ enablement
drivers/char/virtio_console.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
^ permalink raw reply
* Fw: [Bug 88161] New: High traffic causes a lot of softirqs
From: Stephen Hemminger @ 2014-11-13 21:21 UTC (permalink / raw)
To: netdev
Begin forwarded message:
Date: Thu, 13 Nov 2014 06:18:28 -0800
From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
To: "stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: [Bug 88161] New: High traffic causes a lot of softirqs
https://bugzilla.kernel.org/show_bug.cgi?id=88161
Bug ID: 88161
Summary: High traffic causes a lot of softirqs
Product: Networking
Version: 2.5
Kernel Version: 3.17.2
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Other
Assignee: shemminger@linux-foundation.org
Reporter: mike@zcentric.com
Regression: No
I'm using packaged rpms from centos and elrepo, with the same results, and I can
replicate this on any server in our cluster.
I have tried installing
kernel-3.10.56-11.el6.centos.alt.x86_64
from the centosplus repo to solve a problem where 2.6 was locking up the process
tree under high CPU load. That fixed the lockups, but it introduced another issue:
we see a lot of softirq requests when under heavy traffic load.
I am also currently running
[root@web125-east.domain.com /var/www/html]# uname -a
Linux web125-east.domain.com 3.17.2-1.el6.elrepo.x86_64 #1 SMP Fri Oct 31
10:37:44 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
Here is a powertop from a 2.6 series server
Summary: 42492.1 wakeups/second, 0.0 GPU ops/seconds, 0.0 VFS ops/sec and
2422.0% CPU use
Usage Events/s Category Description
22613 ms/s 23637.4 Process php-fpm: pool www
716.9 ms/s 15783.2 Process nginx: worker process
21.3 ms/s 1096.1 Process /usr/bin/java -Xms200m -Xmx2000m
-Xss256k -XX:MaxDirectMemorySize=516m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-Dage
5.8 ms/s 674.4 Process /usr/sbin/gmond
130.0 ms/s 494.5 Process /usr/bin/redis-server 127.0.0.1:6379
73.2 ms/s 487.4 Process python /usr/bin/statsd-relay.py
3.8 ms/s 82.7 Process java -Xmx6g -server -Dfile.encoding=utf-8
-XX:OnOutOfMemoryError=kill -9 %p -XX:+HeapDumpOnOutOfMemoryError -XX:HeapD
212.4 ms/s 0.00 Interrupt [3] net_rx(softirq)
Here it is from 3.10
Usage Events/s Category Description
10.2 ms/s 1033.6 Timer hrtimer_wakeup
3.3 ms/s 932.7 Process /usr/bin/java -Xms200m -Xmx2000m -Xss256k
591.1 ms/s 624.3 Process php-fpm: pool www
41.5 ms/s 724.0 Interrupt [3] net_rx(softirq)
The load pretty much just keeps crawling up into the 500s.
There is also a lot of CPU usage from
116 root 20 0 0 0 0 R 75.0 0.0 0:04.57 kworker/u66:0
which, from my understanding, handles a lot of the acpi calls that softirq is
doing.
I've tried many other 3.x kernels above 3.10 with the same results, so I'm
wondering if this is a known issue.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply