* RE: [PATCHv2 6/6] net: fec: fix regression on i.MX28 introduced by rx_copybreak support
From: David Laight @ 2014-10-28 11:12 UTC (permalink / raw)
To: 'Lothar Waßmann', netdev@vger.kernel.org
Cc: David S. Miller, Russell King, Frank Li, Fabio Estevam,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <1414494104-27943-7-git-send-email-LW@KARO-electronics.de>
From: Lothar Waßmann
> commit 1b7bde6d659d ("net: fec: implement rx_copybreak to improve rx performance")
> introduced a regression for i.MX28. The swap_buffer() function doing
> the endian conversion of the received data on i.MX28 may access memory
> beyond the actual packet size in the DMA buffer. fec_enet_copybreak()
> does not copy those bytes, so that the last bytes of a packet may be
> filled with invalid data after swapping.
> This will likely lead to checksum errors on received packets.
> E.g. when trying to mount an NFS rootfs:
> UDP: bad checksum. From 192.168.1.225:111 to 192.168.100.73:44662 ulen 36
>
> Do the byte swapping and copying to the new skb in one go if
> necessary.
>
> Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
> ---
> drivers/net/ethernet/freescale/fec_main.c | 25 +++++++++++++++++++++----
> 1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index 404fb9d..b92324c 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -339,6 +339,18 @@ static void *swap_buffer(void *bufaddr, int len)
> return bufaddr;
> }
>
> +static void *swap_buffer2(void *dst_buf, void *src_buf, int len)
> +{
> + int i;
> + unsigned int *src = src_buf;
> + unsigned int *dst = dst_buf;
> +
> + for (i = 0; i < len; i += 4, src++, dst++)
> + swab32s(src);
This will probably benefit from being unrolled slightly.
Neither 'dst' nor the return value is used.
> +
> + return dst_buf;
> +}
> +
> static void fec_dump(struct net_device *ndev)
> {
> struct fec_enet_private *fep = netdev_priv(ndev);
> @@ -1334,7 +1346,7 @@ fec_enet_new_rxbdp(struct net_device *ndev, struct bufdesc *bdp, struct sk_buff
> }
>
> static bool fec_enet_copybreak(struct net_device *ndev, struct sk_buff **skb,
> - struct bufdesc *bdp, u32 length)
> + struct bufdesc *bdp, u32 length, bool swap)
> {
> struct fec_enet_private *fep = netdev_priv(ndev);
> struct sk_buff *new_skb;
> @@ -1349,7 +1361,10 @@ static bool fec_enet_copybreak(struct net_device *ndev, struct sk_buff **skb,
> dma_sync_single_for_cpu(&fep->pdev->dev, bdp->cbd_bufaddr,
> FEC_ENET_RX_FRSIZE - fep->rx_align,
> DMA_FROM_DEVICE);
> - memcpy(new_skb->data, (*skb)->data, length);
> + if (!swap)
> + memcpy(new_skb->data, (*skb)->data, length);
> + else
> + swap_buffer2(new_skb->data, (*skb)->data, length);
> *skb = new_skb;
>
> return true;
> @@ -1377,6 +1392,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
> u16 vlan_tag;
> int index = 0;
> bool is_copybreak;
> + bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
>
> #ifdef CONFIG_M532x
> flush_cache_all();
> @@ -1440,7 +1456,8 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
> * include that when passing upstream as it messes up
> * bridging applications.
> */
> - is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, pkt_len - 4);
> + is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, pkt_len - 4,
> + need_swap);
> if (!is_copybreak) {
> skb_new = netdev_alloc_skb(ndev, FEC_ENET_RX_FRSIZE);
> if (unlikely(!skb_new)) {
> @@ -1455,7 +1472,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
> prefetch(skb->data - NET_IP_ALIGN);
> skb_put(skb, pkt_len - 4);
> data = skb->data;
> - if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
> + if (!is_copybreak && need_swap)
> swap_buffer(data, pkt_len);
It has to be better to set the 'copybreak' limit to be larger than the
maximum frame size and so always go through the 'copybreak' paths.
>
> /* Extract the enhanced buffer descriptor */
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
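The remark in the reply above that neither 'dst' nor the return value is used can be made concrete with a small userspace sketch. It is not the driver code: swap32() and copy_swap32() are hypothetical stand-ins for swab32() and swap_buffer2(), and the memcpy()-based load/store is an assumption made only to keep the example alignment-safe outside the kernel. The point it shows is that each swapped word has to be stored to the destination, and the source is left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of one 32-bit word (userspace stand-in for swab32()). */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Copy len bytes from src to dst, rounded up to whole 32-bit words,
 * swapping the bytes of each word on the way. Both buffers must have
 * room for the rounded-up length.
 */
static void copy_swap32(void *dst_buf, const void *src_buf, size_t len)
{
	unsigned char *dst = dst_buf;
	const unsigned char *src = src_buf;
	size_t i;

	assert(dst_buf && src_buf);

	for (i = 0; i < len; i += 4) {
		uint32_t w;

		memcpy(&w, src + i, sizeof(w));	/* alignment-safe load */
		w = swap32(w);
		memcpy(dst + i, &w, sizeof(w));	/* the store the quoted loop lacks */
	}
}
```

Because the function returns nothing and writes only through dst, both points of the review comment go away in this shape; whether the loop is worth unrolling is left open here.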
* [PATCHv2 6/6] net: fec: fix regression on i.MX28 introduced by rx_copybreak support
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Russell King, Frank Li, Fabio Estevam,
linux-kernel, Lothar Waßmann, linux-arm-kernel
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
commit 1b7bde6d659d ("net: fec: implement rx_copybreak to improve rx performance")
introduced a regression for i.MX28. The swap_buffer() function doing
the endian conversion of the received data on i.MX28 may access memory
beyond the actual packet size in the DMA buffer. fec_enet_copybreak()
does not copy those bytes, so that the last bytes of a packet may be
filled with invalid data after swapping.
This will likely lead to checksum errors on received packets.
E.g. when trying to mount an NFS rootfs:
UDP: bad checksum. From 192.168.1.225:111 to 192.168.100.73:44662 ulen 36
Do the byte swapping and copying to the new skb in one go if
necessary.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec_main.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 404fb9d..b92324c 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -339,6 +339,18 @@ static void *swap_buffer(void *bufaddr, int len)
return bufaddr;
}
+static void *swap_buffer2(void *dst_buf, void *src_buf, int len)
+{
+ int i;
+ unsigned int *src = src_buf;
+ unsigned int *dst = dst_buf;
+
+ for (i = 0; i < len; i += 4, src++, dst++)
+ swab32s(src);
+
+ return dst_buf;
+}
+
static void fec_dump(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
@@ -1334,7 +1346,7 @@ fec_enet_new_rxbdp(struct net_device *ndev, struct bufdesc *bdp, struct sk_buff
}
static bool fec_enet_copybreak(struct net_device *ndev, struct sk_buff **skb,
- struct bufdesc *bdp, u32 length)
+ struct bufdesc *bdp, u32 length, bool swap)
{
struct fec_enet_private *fep = netdev_priv(ndev);
struct sk_buff *new_skb;
@@ -1349,7 +1361,10 @@ static bool fec_enet_copybreak(struct net_device *ndev, struct sk_buff **skb,
dma_sync_single_for_cpu(&fep->pdev->dev, bdp->cbd_bufaddr,
FEC_ENET_RX_FRSIZE - fep->rx_align,
DMA_FROM_DEVICE);
- memcpy(new_skb->data, (*skb)->data, length);
+ if (!swap)
+ memcpy(new_skb->data, (*skb)->data, length);
+ else
+ swap_buffer2(new_skb->data, (*skb)->data, length);
*skb = new_skb;
return true;
@@ -1377,6 +1392,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
u16 vlan_tag;
int index = 0;
bool is_copybreak;
+ bool need_swap = fep->quirks & FEC_QUIRK_SWAP_FRAME;
#ifdef CONFIG_M532x
flush_cache_all();
@@ -1440,7 +1456,8 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
* include that when passing upstream as it messes up
* bridging applications.
*/
- is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, pkt_len - 4);
+ is_copybreak = fec_enet_copybreak(ndev, &skb, bdp, pkt_len - 4,
+ need_swap);
if (!is_copybreak) {
skb_new = netdev_alloc_skb(ndev, FEC_ENET_RX_FRSIZE);
if (unlikely(!skb_new)) {
@@ -1455,7 +1472,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
prefetch(skb->data - NET_IP_ALIGN);
skb_put(skb, pkt_len - 4);
data = skb->data;
- if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
+ if (!is_copybreak && need_swap)
swap_buffer(data, pkt_len);
/* Extract the enhanced buffer descriptor */
--
1.7.10.4
^ permalink raw reply related
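The failure described in the commit message above can be replayed in userspace. This is a sketch under stated assumptions, not the driver code: swap_in_place() and copy_then_swap() are hypothetical names, and the buffer layout assumes that a 6-byte frame "ABCDEF" sits word-swapped in the DMA buffer, so its last two bytes live in the upper half of the second 32-bit word. Copying exactly 6 bytes and swapping afterwards then pulls stale bytes into the frame tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-place 32-bit byte swap over len bytes, rounded up to whole words. */
static void swap_in_place(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i += 4) {
		unsigned char t;

		t = buf[i];
		buf[i] = buf[i + 3];
		buf[i + 3] = t;

		t = buf[i + 1];
		buf[i + 1] = buf[i + 2];
		buf[i + 2] = t;
	}
}

/*
 * Mimic the problematic order: copy exactly len bytes out of the
 * (still word-swapped) DMA buffer, then swap the copy in place.
 * dst must have room for len rounded up to a multiple of 4.
 */
static void copy_then_swap(unsigned char *dst, const unsigned char *dma,
			   size_t len)
{
	assert(dst && dma);
	memcpy(dst, dma, len);		/* bytes at offsets 6 and 7 are not copied */
	swap_in_place(dst, len);	/* ...but the swap reads them anyway */
}
```

With a DMA buffer of { 'D','C','B','A', 0, 0, 'F','E' } and a destination pre-filled with a marker byte, the first word comes out as "ABCD" while offsets 4 and 5 receive the marker instead of 'E' and 'F'. That is the kind of corruption that would show up as a bad checksum.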
* [PATCHv2 5/6] net: fec: simplify loop counter handling in swap_buffer()
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: Fabio Estevam, Frank Li, linux-kernel, Russell King,
David S. Miller, linux-arm-kernel, Lothar Waßmann
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
Eliminate the DIV_ROUND_UP() and increment the loop counter by 4
instead. This saves 6 instructions in the function's assembly code.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 3a103e9..404fb9d 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -333,7 +333,7 @@ static void *swap_buffer(void *bufaddr, int len)
int i;
unsigned int *buf = bufaddr;
- for (i = 0; i < DIV_ROUND_UP(len, 4); i++, buf++)
+ for (i = 0; i < len; i += 4, buf++)
swab32s(buf);
return bufaddr;
--
1.7.10.4
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related
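A quick userspace check of the loop change in the patch above: both loop forms visit the same number of 32-bit words for any non-negative length. The helper names words_old(), words_new() and bytes_touched() are hypothetical, and DIV_ROUND_UP() is reproduced here in its usual form so the sketch compiles outside the kernel. The last helper also makes visible that either loop touches up to 3 bytes past len when len is not a multiple of 4.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Iteration count of the original loop: i counts words. */
static int words_old(int len)
{
	int i, n = 0;

	assert(len >= 0);
	for (i = 0; i < DIV_ROUND_UP(len, 4); i++)
		n++;
	return n;
}

/* Iteration count of the simplified loop: i counts bytes in steps of 4. */
static int words_new(int len)
{
	int i, n = 0;

	assert(len >= 0);
	for (i = 0; i < len; i += 4)
		n++;
	return n;
}

/* Number of bytes the swap loop reads and writes for a given length. */
static int bytes_touched(int len)
{
	return 4 * words_new(len);
}
```

For example, a length of 6 gives two iterations in both forms and 8 bytes touched, so the caller's buffer needs to be padded to a multiple of 4.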
* [PATCHv2 4/6] net: fec: use swab32s() instead of cpu_to_be32()
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: Fabio Estevam, Frank Li, linux-kernel, Russell King,
David S. Miller, linux-arm-kernel, Lothar Waßmann
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
When swap_buffer() is called, we know for sure that we need to
byte swap the data. This function is also called for swapping data in
both directions, so cpu_to_be32() is not semantically correct in
every use case. Use swab32s() to reflect this.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 323ae2e..3a103e9 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -334,7 +334,7 @@ static void *swap_buffer(void *bufaddr, int len)
unsigned int *buf = bufaddr;
for (i = 0; i < DIV_ROUND_UP(len, 4); i++, buf++)
- *buf = cpu_to_be32(*buf);
+ swab32s(buf);
return bufaddr;
}
--
1.7.10.4
^ permalink raw reply related
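The distinction drawn in the patch above can be sketched in userspace. The names swab32_sketch() and cpu_to_be32_sketch() are hypothetical stand-ins for the kernel helpers, with endianness detected at run time purely for illustration. An unconditional swap is its own inverse and so reads correctly in both directions, whereas a CPU-to-big-endian conversion swaps only on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unconditional byte swap of a 32-bit value. */
static uint32_t swab32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 1 when the least significant byte is stored first. */
static int host_is_little_endian(void)
{
	const uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Swaps only when the host is little-endian; a no-op otherwise. */
static uint32_t cpu_to_be32_sketch(uint32_t v)
{
	return host_is_little_endian() ? swab32_sketch(v) : v;
}

/* Usage example: the two helpers answer different questions. */
static void example(void)
{
	uint32_t be = cpu_to_be32_sketch(0x11223344u);
	unsigned char bytes[4];

	memcpy(bytes, &be, sizeof(be));
	/* Big-endian byte layout in memory, whatever the host is. */
	assert(bytes[0] == 0x11 && bytes[3] == 0x44);
	/* Swapping twice restores the value, so one helper serves RX and TX. */
	assert(swab32_sketch(swab32_sketch(0x11223344u)) == 0x11223344u);
}
```

On a little-endian host the two helpers happen to produce the same result, which is presumably why the original code worked; the unconditional form simply states the intent.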
* [PATCHv2 3/6] net: fec: improve access to quirk flags by copying them into fec_enet_private struct
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: Fabio Estevam, Frank Li, linux-kernel, Russell King,
David S. Miller, linux-arm-kernel, Lothar Waßmann
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec.h | 1 +
drivers/net/ethernet/freescale/fec_main.c | 105 ++++++++++-------------------
2 files changed, 38 insertions(+), 68 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 2634de2..d43c1d3 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -453,6 +453,7 @@ struct fec_enet_private {
int irq[FEC_IRQ_NUM];
bool bufdesc_ex;
int pause_flag;
+ u32 quirks;
struct napi_struct napi;
int csum_flags;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index f7d344f..323ae2e 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -392,8 +392,6 @@ fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq,
struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct bufdesc *bdp = txq->cur_tx;
struct bufdesc_ex *ebdp;
int nr_frags = skb_shinfo(skb)->nr_frags;
@@ -429,7 +427,7 @@ fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq,
}
if (fep->bufdesc_ex) {
- if (id_entry->driver_data & FEC_QUIRK_HAS_AVB)
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
estatus |= FEC_TX_BD_FTYPE(queue);
if (skb->ip_summed == CHECKSUM_PARTIAL)
estatus |= BD_ENET_TX_PINS | BD_ENET_TX_IINS;
@@ -441,11 +439,11 @@ fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq,
index = fec_enet_get_bd_index(txq->tx_bd_base, bdp, fep);
if (((unsigned long) bufaddr) & fep->tx_align ||
- id_entry->driver_data & FEC_QUIRK_SWAP_FRAME) {
+ fep->quirks & FEC_QUIRK_SWAP_FRAME) {
memcpy(txq->tx_bounce[index], bufaddr, frag_len);
bufaddr = txq->tx_bounce[index];
- if (id_entry->driver_data & FEC_QUIRK_SWAP_FRAME)
+ if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
swap_buffer(bufaddr, frag_len);
}
@@ -481,8 +479,6 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
struct sk_buff *skb, struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
int nr_frags = skb_shinfo(skb)->nr_frags;
struct bufdesc *bdp, *last_bdp;
void *bufaddr;
@@ -521,11 +517,11 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
queue = skb_get_queue_mapping(skb);
index = fec_enet_get_bd_index(txq->tx_bd_base, bdp, fep);
if (((unsigned long) bufaddr) & fep->tx_align ||
- id_entry->driver_data & FEC_QUIRK_SWAP_FRAME) {
+ fep->quirks & FEC_QUIRK_SWAP_FRAME) {
memcpy(txq->tx_bounce[index], skb->data, buflen);
bufaddr = txq->tx_bounce[index];
- if (id_entry->driver_data & FEC_QUIRK_SWAP_FRAME)
+ if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
swap_buffer(bufaddr, buflen);
}
@@ -560,7 +556,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
fep->hwts_tx_en))
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
- if (id_entry->driver_data & FEC_QUIRK_HAS_AVB)
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
estatus |= FEC_TX_BD_FTYPE(queue);
if (skb->ip_summed == CHECKSUM_PARTIAL)
@@ -604,8 +600,6 @@ fec_enet_txq_put_data_tso(struct fec_enet_priv_tx_q *txq, struct sk_buff *skb,
int size, bool last_tcp, bool is_last)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc);
unsigned short queue = skb_get_queue_mapping(skb);
unsigned short status;
@@ -618,11 +612,11 @@ fec_enet_txq_put_data_tso(struct fec_enet_priv_tx_q *txq, struct sk_buff *skb,
status |= (BD_ENET_TX_TC | BD_ENET_TX_READY);
if (((unsigned long) data) & fep->tx_align ||
- id_entry->driver_data & FEC_QUIRK_SWAP_FRAME) {
+ fep->quirks & FEC_QUIRK_SWAP_FRAME) {
memcpy(txq->tx_bounce[index], data, size);
data = txq->tx_bounce[index];
- if (id_entry->driver_data & FEC_QUIRK_SWAP_FRAME)
+ if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
swap_buffer(data, size);
}
@@ -638,7 +632,7 @@ fec_enet_txq_put_data_tso(struct fec_enet_priv_tx_q *txq, struct sk_buff *skb,
bdp->cbd_bufaddr = addr;
if (fep->bufdesc_ex) {
- if (id_entry->driver_data & FEC_QUIRK_HAS_AVB)
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
estatus |= FEC_TX_BD_FTYPE(queue);
if (skb->ip_summed == CHECKSUM_PARTIAL)
estatus |= BD_ENET_TX_PINS | BD_ENET_TX_IINS;
@@ -666,8 +660,6 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
struct bufdesc *bdp, int index)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc);
unsigned short queue = skb_get_queue_mapping(skb);
@@ -683,11 +675,11 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
dmabuf = txq->tso_hdrs_dma + index * TSO_HEADER_SIZE;
if (((unsigned long)bufaddr) & fep->tx_align ||
- id_entry->driver_data & FEC_QUIRK_SWAP_FRAME) {
+ fep->quirks & FEC_QUIRK_SWAP_FRAME) {
memcpy(txq->tx_bounce[index], skb->data, hdr_len);
bufaddr = txq->tx_bounce[index];
- if (id_entry->driver_data & FEC_QUIRK_SWAP_FRAME)
+ if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
swap_buffer(bufaddr, hdr_len);
dmabuf = dma_map_single(&fep->pdev->dev, bufaddr,
@@ -704,7 +696,7 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
bdp->cbd_datlen = hdr_len;
if (fep->bufdesc_ex) {
- if (id_entry->driver_data & FEC_QUIRK_HAS_AVB)
+ if (fep->quirks & FEC_QUIRK_HAS_AVB)
estatus |= FEC_TX_BD_FTYPE(queue);
if (skb->ip_summed == CHECKSUM_PARTIAL)
estatus |= BD_ENET_TX_PINS | BD_ENET_TX_IINS;
@@ -729,8 +721,6 @@ static int fec_enet_txq_submit_tso(struct fec_enet_priv_tx_q *txq,
struct tso_t tso;
unsigned int index = 0;
int ret;
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
if (tso_count_descs(skb) >= fec_enet_get_free_txdesc_num(fep, txq)) {
dev_kfree_skb_any(skb);
@@ -792,7 +782,7 @@ static int fec_enet_txq_submit_tso(struct fec_enet_priv_tx_q *txq,
txq->cur_tx = bdp;
/* Trigger transmission start */
- if (!(id_entry->driver_data & FEC_QUIRK_ERR007885) ||
+ if (!(fep->quirks & FEC_QUIRK_ERR007885) ||
!readl(fep->hwp + FEC_X_DES_ACTIVE(queue)) ||
!readl(fep->hwp + FEC_X_DES_ACTIVE(queue)) ||
!readl(fep->hwp + FEC_X_DES_ACTIVE(queue)) ||
@@ -955,8 +945,6 @@ static void
fec_restart(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
u32 val;
u32 temp_mac[2];
u32 rcntl = OPT_FRAME_SIZE | 0x04;
@@ -966,7 +954,7 @@ fec_restart(struct net_device *ndev)
* For i.MX6SX SOC, enet use AXI bus, we use disable MAC
* instead of reset MAC itself.
*/
- if (id_entry && id_entry->driver_data & FEC_QUIRK_HAS_AVB) {
+ if (fep->quirks & FEC_QUIRK_HAS_AVB) {
writel(0, fep->hwp + FEC_ECNTRL);
} else {
writel(1, fep->hwp + FEC_ECNTRL);
@@ -977,7 +965,7 @@ fec_restart(struct net_device *ndev)
* enet-mac reset will reset mac address registers too,
* so need to reconfigure it.
*/
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC) {
+ if (fep->quirks & FEC_QUIRK_ENET_MAC) {
memcpy(&temp_mac, ndev->dev_addr, ETH_ALEN);
writel(cpu_to_be32(temp_mac[0]), fep->hwp + FEC_ADDR_LOW);
writel(cpu_to_be32(temp_mac[1]), fep->hwp + FEC_ADDR_HIGH);
@@ -1023,7 +1011,7 @@ fec_restart(struct net_device *ndev)
* The phy interface and speed need to get configured
* differently on enet-mac.
*/
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC) {
+ if (fep->quirks & FEC_QUIRK_ENET_MAC) {
/* Enable flow control and length check */
rcntl |= 0x40000000 | 0x00000020;
@@ -1046,7 +1034,7 @@ fec_restart(struct net_device *ndev)
}
} else {
#ifdef FEC_MIIGSK_ENR
- if (id_entry->driver_data & FEC_QUIRK_USE_GASKET) {
+ if (fep->quirks & FEC_QUIRK_USE_GASKET) {
u32 cfgr;
/* disable the gasket and wait */
writel(0, fep->hwp + FEC_MIIGSK_ENR);
@@ -1099,7 +1087,7 @@ fec_restart(struct net_device *ndev)
writel(0, fep->hwp + FEC_HASH_TABLE_LOW);
#endif
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC) {
+ if (fep->quirks & FEC_QUIRK_ENET_MAC) {
/* enable ENET endian swap */
ecntl |= (1 << 8);
/* enable ENET store and forward mode */
@@ -1133,8 +1121,6 @@ static void
fec_stop(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
u32 rmii_mode = readl(fep->hwp + FEC_R_CNTRL) & (1 << 8);
/* We cannot expect a graceful transmit stop without link !!! */
@@ -1149,7 +1135,7 @@ fec_stop(struct net_device *ndev)
* For i.MX6SX SOC, enet use AXI bus, we use disable MAC
* instead of reset MAC itself.
*/
- if (id_entry && id_entry->driver_data & FEC_QUIRK_HAS_AVB) {
+ if (fep->quirks & FEC_QUIRK_HAS_AVB) {
writel(0, fep->hwp + FEC_ECNTRL);
} else {
writel(1, fep->hwp + FEC_ECNTRL);
@@ -1159,7 +1145,7 @@ fec_stop(struct net_device *ndev)
writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
/* We have to keep ENET enabled to have MII interrupt stay working */
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC) {
+ if (fep->quirks & FEC_QUIRK_ENET_MAC) {
writel(2, fep->hwp + FEC_ECNTRL);
writel(rmii_mode, fep->hwp + FEC_R_CNTRL);
}
@@ -1378,8 +1364,6 @@ static int
fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct fec_enet_priv_rx_q *rxq;
struct bufdesc *bdp;
unsigned short status;
@@ -1471,7 +1455,7 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
prefetch(skb->data - NET_IP_ALIGN);
skb_put(skb, pkt_len - 4);
data = skb->data;
- if (id_entry->driver_data & FEC_QUIRK_SWAP_FRAME)
+ if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
swap_buffer(data, pkt_len);
/* Extract the enhanced buffer descriptor */
@@ -1905,8 +1889,6 @@ failed_clk_ipg:
static int fec_enet_mii_probe(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct phy_device *phy_dev = NULL;
char mdio_bus_id[MII_BUS_ID_SIZE];
char phy_name[MII_BUS_ID_SIZE + 3];
@@ -1952,7 +1934,7 @@ static int fec_enet_mii_probe(struct net_device *ndev)
}
/* mask with MAC supported features */
- if (id_entry->driver_data & FEC_QUIRK_HAS_GBIT) {
+ if (fep->quirks & FEC_QUIRK_HAS_GBIT) {
phy_dev->supported &= PHY_GBIT_FEATURES;
phy_dev->supported &= ~SUPPORTED_1000baseT_Half;
#if !defined(CONFIG_M5272)
@@ -1980,8 +1962,6 @@ static int fec_enet_mii_init(struct platform_device *pdev)
static struct mii_bus *fec0_mii_bus;
struct net_device *ndev = platform_get_drvdata(pdev);
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct device_node *node;
int err = -ENXIO, i;
@@ -2001,7 +1981,7 @@ static int fec_enet_mii_init(struct platform_device *pdev)
* mdio interface in board design, and need to be configured by
* fec0 mii_bus.
*/
- if ((id_entry->driver_data & FEC_QUIRK_ENET_MAC) && fep->dev_id > 0) {
+ if ((fep->quirks & FEC_QUIRK_ENET_MAC) && fep->dev_id > 0) {
/* fec1 uses fec0 mii_bus */
if (mii_cnt && fec0_mii_bus) {
fep->mii_bus = fec0_mii_bus;
@@ -2022,7 +2002,7 @@ static int fec_enet_mii_init(struct platform_device *pdev)
* document.
*/
fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk_ipg), 5000000);
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC)
+ if (fep->quirks & FEC_QUIRK_ENET_MAC)
fep->phy_speed--;
fep->phy_speed <<= 1;
writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
@@ -2064,7 +2044,7 @@ static int fec_enet_mii_init(struct platform_device *pdev)
mii_cnt++;
/* save fec0 mii_bus */
- if (id_entry->driver_data & FEC_QUIRK_ENET_MAC)
+ if (fep->quirks & FEC_QUIRK_ENET_MAC)
fec0_mii_bus = fep->mii_bus;
return 0;
@@ -2333,11 +2313,9 @@ static int fec_enet_us_to_itr_clock(struct net_device *ndev, int us)
static void fec_enet_itr_coal_set(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
int rx_itr, tx_itr;
- if (!(id_entry->driver_data & FEC_QUIRK_HAS_AVB))
+ if (!(fep->quirks & FEC_QUIRK_HAS_AVB))
return;
/* Must be greater than zero to avoid unpredictable behavior */
@@ -2372,10 +2350,8 @@ static int
fec_enet_get_coalesce(struct net_device *ndev, struct ethtool_coalesce *ec)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
- if (!(id_entry->driver_data & FEC_QUIRK_HAS_AVB))
+ if (!(fep->quirks & FEC_QUIRK_HAS_AVB))
return -EOPNOTSUPP;
ec->rx_coalesce_usecs = fep->rx_time_itr;
@@ -2391,12 +2367,9 @@ static int
fec_enet_set_coalesce(struct net_device *ndev, struct ethtool_coalesce *ec)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
-
unsigned int cycle;
- if (!(id_entry->driver_data & FEC_QUIRK_HAS_AVB))
+ if (!(fep->quirks & FEC_QUIRK_HAS_AVB))
return -EOPNOTSUPP;
if (ec->rx_max_coalesced_frames > 255) {
@@ -2976,8 +2949,6 @@ static const struct net_device_ops fec_netdev_ops = {
static int fec_enet_init(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- const struct platform_device_id *id_entry =
- platform_get_device_id(fep->pdev);
struct fec_enet_priv_tx_q *txq;
struct fec_enet_priv_rx_q *rxq;
struct bufdesc *cbd_base;
@@ -3056,11 +3027,11 @@ static int fec_enet_init(struct net_device *ndev)
writel(FEC_RX_DISABLED_IMASK, fep->hwp + FEC_IMASK);
netif_napi_add(ndev, &fep->napi, fec_enet_rx_napi, NAPI_POLL_WEIGHT);
- if (id_entry->driver_data & FEC_QUIRK_HAS_VLAN)
+ if (fep->quirks & FEC_QUIRK_HAS_VLAN)
/* enable hw VLAN support */
ndev->features |= NETIF_F_HW_VLAN_CTAG_RX;
- if (id_entry->driver_data & FEC_QUIRK_HAS_CSUM) {
+ if (fep->quirks & FEC_QUIRK_HAS_CSUM) {
ndev->gso_max_segs = FEC_MAX_TSO_SEGS;
/* enable hw accelerator */
@@ -3069,7 +3040,7 @@ static int fec_enet_init(struct net_device *ndev)
fep->csum_flags |= FLAG_RX_CSUM_ENABLED;
}
- if (id_entry->driver_data & FEC_QUIRK_HAS_AVB) {
+ if (fep->quirks & FEC_QUIRK_HAS_AVB) {
fep->tx_align = 0;
fep->rx_align = 0x3f;
}
@@ -3169,10 +3140,6 @@ fec_probe(struct platform_device *pdev)
int num_tx_qs;
int num_rx_qs;
- of_id = of_match_device(fec_dt_ids, &pdev->dev);
- if (of_id)
- pdev->id_entry = of_id->data;
-
fec_enet_get_queue_num(pdev, &num_tx_qs, &num_rx_qs);
/* Init network device */
@@ -3186,13 +3153,16 @@ fec_probe(struct platform_device *pdev)
/* setup board info structure */
fep = netdev_priv(ndev);
+ of_id = of_match_device(fec_dt_ids, &pdev->dev);
+ if (of_id)
+ fep->quirks = (u32)of_id->data;
+
fep->num_rx_queues = num_rx_qs;
fep->num_tx_queues = num_tx_qs;
#if !defined(CONFIG_M5272)
/* default enable pause frame auto negotiation */
- if (pdev->id_entry &&
- (pdev->id_entry->driver_data & FEC_QUIRK_HAS_GBIT))
+ if (fep->quirks & FEC_QUIRK_HAS_GBIT)
fep->pause_flag |= FEC_PAUSE_FLAG_AUTONEG;
#endif
@@ -3261,9 +3231,8 @@ fec_probe(struct platform_device *pdev)
if (IS_ERR(fep->clk_ref))
fep->clk_ref = NULL;
+ fep->bufdesc_ex = fep->quirks & FEC_QUIRK_HAS_BUFDESC_EX;
fep->clk_ptp = devm_clk_get(&pdev->dev, "ptp");
- fep->bufdesc_ex =
- pdev->id_entry->driver_data & FEC_QUIRK_HAS_BUFDESC_EX;
if (IS_ERR(fep->clk_ptp)) {
fep->clk_ptp = NULL;
fep->bufdesc_ex = false;
--
1.7.10.4
^ permalink raw reply related
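The pattern behind the patch above, resolving the quirk flags once at probe time and testing a cached word on the hot paths, can be sketched in userspace. Everything here is hypothetical: struct device_id, struct priv, the QUIRK_* bit values and the function names are illustration only and do not mirror the driver's definitions. The sketch assumes the flags are carried in a driver_data field, as the removed id_entry->driver_data accesses suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk bits, not the driver's actual values. */
#define QUIRK_SWAP_FRAME	(1u << 1)
#define QUIRK_HAS_AVB		(1u << 8)

/* Hypothetical stand-in for a platform device id table entry. */
struct device_id {
	const char *name;
	unsigned long driver_data;	/* quirk flags for this device */
};

/* Hypothetical stand-in for the driver's private data. */
struct priv {
	uint32_t quirks;		/* cached copy, filled in once */
};

/* Probe time: look the flags up once; a missing entry means no quirks. */
static void probe_cache_quirks(struct priv *p, const struct device_id *id)
{
	assert(p);
	p->quirks = id ? (uint32_t)id->driver_data : 0;
}

/* Hot path: a single load and mask, no lookup and no NULL check. */
static int needs_swap(const struct priv *p)
{
	return (p->quirks & QUIRK_SWAP_FRAME) != 0;
}
```

Handling the "no matching entry" case once in the probe step is what lets the per-packet checks drop their pointer tests.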
* [PATCHv2 2/6] net: fec: declare bufdesc_ex flag as bool
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Russell King, Frank Li, Fabio Estevam,
linux-kernel, Lothar Waßmann, linux-arm-kernel
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
fep->bufdesc_ex is used as boolean flag; thus declare it as such.
Also remove an unnecessary initialization to 0.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec.h | 2 +-
drivers/net/ethernet/freescale/fec_main.c | 4 +---
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 1e65917..2634de2 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -451,7 +451,7 @@ struct fec_enet_private {
int speed;
struct completion mdio_done;
int irq[FEC_IRQ_NUM];
- int bufdesc_ex;
+ bool bufdesc_ex;
int pause_flag;
struct napi_struct napi;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index decea57..f7d344f 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3209,8 +3209,6 @@ fec_probe(struct platform_device *pdev)
fep->pdev = pdev;
fep->dev_id = dev_id++;
- fep->bufdesc_ex = 0;
-
platform_set_drvdata(pdev, ndev);
phy_node = of_parse_phandle(np, "phy-handle", 0);
@@ -3268,7 +3266,7 @@ fec_probe(struct platform_device *pdev)
pdev->id_entry->driver_data & FEC_QUIRK_HAS_BUFDESC_EX;
if (IS_ERR(fep->clk_ptp)) {
fep->clk_ptp = NULL;
- fep->bufdesc_ex = 0;
+ fep->bufdesc_ex = false;
}
ret = fec_enet_clk_enable(ndev, true);
--
1.7.10.4
^ permalink raw reply related
* [PATCHv2 1/6] net: fec: indentation cleanup; no functional change
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: Fabio Estevam, Frank Li, linux-kernel, Russell King,
David S. Miller, linux-arm-kernel, Lothar Waßmann
In-Reply-To: <1414494104-27943-1-git-send-email-LW@KARO-electronics.de>
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
drivers/net/ethernet/freescale/fec_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index e364d1f..decea57 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -101,7 +101,7 @@ static void fec_enet_itr_coal_init(struct net_device *ndev);
* frames not being transmitted until there is a 0-to-1 transition on
* ENET_TDAR[TDAR].
*/
-#define FEC_QUIRK_ERR006358 (1 << 7)
+#define FEC_QUIRK_ERR006358 (1 << 7)
/* ENET IP hw AVB
*
* i.MX6SX ENET IP add Audio Video Bridging (AVB) feature support.
--
1.7.10.4
^ permalink raw reply related
* [PATCHv2 0/6] net: fec: fix regression on i.MX28 introduced by rx_copybreak support
From: Lothar Waßmann @ 2014-10-28 11:01 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Russell King, Frank Li, Fabio Estevam,
linux-kernel, Lothar Waßmann, linux-arm-kernel
Changes wrt. v1:
- added some cleanup patches
- simplify handling of 'quirks' flags as suggested by Russell King.
- remove DIV_ROUND_UP() from byte swapping loop as suggested by
Eric Dumazet
^ permalink raw reply
* Re: [net 1/2] sctp: add transport state in /proc/net/sctp/remaddr
From: Neil Horman @ 2014-10-28 10:27 UTC (permalink / raw)
To: David Miller; +Cc: michele, linux-sctp, vyasevich, netdev, dborkman
In-Reply-To: <20141027.185545.551457974536550723.davem@davemloft.net>
On Mon, Oct 27, 2014 at 06:55:45PM -0400, David Miller wrote:
> From: Michele Baldessari <michele@acksyn.org>
> Date: Thu, 23 Oct 2014 21:48:40 +0200
>
> > It is often quite helpful to be able to know the state of a transport
> > outside of the application itself (for troubleshooting purposes or for
> > monitoring purposes). Add it under /proc/net/sctp/remaddr.
> >
> > Signed-off-by: Michele Baldessari <michele@acksyn.org>
>
> You can't change the layout of procfs files, applications parse
> these files and any modification can potentially break such tools.
>
> Secondly, even if this change were acceptable, targeting this
> change at anything other than the net-next tree is not appropriate
> because it is a new feature.
>
Agree on the net-next submission, though there is precedent for extending this
procfile: we've done it a few times in the past, both to this file and to other
files in the sctp area (see commits f406c8b9693f2f71ef2caeb0b68521a7d22d00f0 and
58fbbed4fbc0094fc808a568fe99a915f85402ee).
Neil
^ permalink raw reply
* Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
From: Jesper Dangaard Brouer @ 2014-10-28 10:11 UTC (permalink / raw)
To: billbonaparte
Cc: linux-kernel, 'Netfilter Developer Mailing List',
'Pablo Neira Ayuso', 'Patrick McHardy', kadlec,
davem, 'Changli Gao', 'Andrey Vagin', brouer,
netdev@vger.kernel.org
In-Reply-To: <02f201cff260$8622e610$9268b230$@gmail.com>
On Tue, 28 Oct 2014 11:37:31 +0800 "billbonaparte" <programme110@gmail.com> wrote:
> Hi, all:
> sorry for sending this mail again, the last mail doesn't show text
> clearly.
This one also mangles the text, so I cannot follow the race you are
describing. I'll try to reconstruct...
> In function __nf_conntrack_confirm, we check whether the conntrack is
> already dead before inserting it into the hash-table.
> We do this because if we insert an already 'dead' hash, it will
> block further use of that particular connection.
Have you run into this problem in practice, or is this based on a
theory?
> but we don't do that right.
> let's consider the following case:
>
[tried to reconstruct]
> cpu1                                           cpu2
> __nf_conntrack_confirm                         get_next_corpse
> lock corresponding hash-list                   ....
> check nf_ct_is_dying(ct)                       ....
> .....                                          for_each_possible_cpu(cpu) {
> .....                                            (processing &pcpu->unconfirmed)
> .....                                            spin_lock_bh(&pcpu->lock);
> .....                                            set_bit(IPS_DYING_BIT, &ct->status);
> .....                                            spin_unlock_bh(&pcpu->lock);
> spin_lock_bh(&pcpu->lock);
> nf_ct_del_from_dying_or_unconfirmed_list(ct);
> spin_unlock_bh(&pcpu->lock);
>
> add_timer(&ct->timeout);
> ct->status |= IPS_CONFIRMED;
> __nf_conntrack_hash_insert(ct);
> /* the conntrack has already been set as dying */
Yes, I think you are correct. There is a race, as we are modifying
ct->status without holding the hash bucket lock.
> The above case reveals two problems:
> 1. we may insert a dead conntrack into the hash-table, which will block
> further use of that particular connection.
> 2. operations on ct->status should be atomic, because they race against
> get_next_corpse.
> For the same reason, the operation on ct->status in
> nf_nat_setup_info should be atomic as well.
>
> if we want to resolve the first problem, we must delete the
> unconfirmed conntrack from unconfirmed-list first, then check if it is
> already dead.
Guess that would be one approach.
> Am I right to do this ?
> Appreciate any comments and reply.
Perhaps we could get rid of unconfirmed list handling in get_next_corpse?
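A minimal user-space sketch of the ordering proposed above (unlink from the unconfirmed list first, then test the dying bit), with ct->status updated only through atomic read-modify-write operations. This is a model under stated assumptions, not the kernel code: the model_* names, the bit numbers and the pthread mutex standing in for the per-cpu spinlock are all hypothetical, and only the unconfirmed-list walk of get_next_corpse is modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the kernel's IPS_* values are not reproduced. */
enum { MODEL_CONFIRMED_BIT = 0, MODEL_DYING_BIT = 1 };

struct model_ct {
	atomic_ulong status;		/* only touched with atomic operations */
	bool on_unconfirmed_list;	/* protected by model_pcpu.lock */
	bool hashed;
};

struct model_pcpu {
	pthread_mutex_t lock;		/* stands in for pcpu->lock */
};

/* Corpse-walker side: mark the entry dying only while it is still listed. */
static bool model_corpse_mark(struct model_pcpu *pcpu, struct model_ct *ct)
{
	bool marked = false;

	pthread_mutex_lock(&pcpu->lock);
	if (ct->on_unconfirmed_list) {
		atomic_fetch_or(&ct->status, 1UL << MODEL_DYING_BIT);
		marked = true;
	}
	pthread_mutex_unlock(&pcpu->lock);
	return marked;
}

/* Confirm side: unlink first, then test the dying bit. */
static bool model_confirm(struct model_pcpu *pcpu, struct model_ct *ct)
{
	pthread_mutex_lock(&pcpu->lock);
	ct->on_unconfirmed_list = false;
	pthread_mutex_unlock(&pcpu->lock);

	/* After the unlock, the walker can no longer find this entry. */
	if (atomic_load(&ct->status) & (1UL << MODEL_DYING_BIT))
		return false;		/* lost the race: do not insert */

	atomic_fetch_or(&ct->status, 1UL << MODEL_CONFIRMED_BIT);
	ct->hashed = true;
	return true;
}
```

In this model the walker either marks the entry before it is unlinked (and the confirm path sees the bit and backs off) or never finds it at all, so a dying entry is not inserted. Other kernel paths that set the dying bit are outside the scope of the sketch.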
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [PATCH net] inet: frags: fix a race between inet_evict_bucket and inet_frag_kill
From: Florian Westphal @ 2014-10-28 10:03 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: netdev, Florian Westphal, Eric Dumazet, Patrick McLean
In-Reply-To: <1414488634-28412-1-git-send-email-nikolay@redhat.com>
Nikolay Aleksandrov <nikolay@redhat.com> wrote:
> When the evictor is running it adds some chosen frags to a local list to
> be evicted once the chain lock has been released but at the same time
> the *frag_queue can be running for some of the same queues and it
> may call inet_frag_kill which will wait on the chain lock and
> will then delete the queue from the wrong list since it was added in the
> eviction one.
I had to read that twice...
cpu1                                          cpu2
inet_evict_bucket                             inet_frag_kill
 chain_lock()                                  chain_lock() ..
 for_each_frag_queue                            spin
  set fragqueue INET_FRAG_EVICTED flag [A]      .
  hlist_del()                                   spin
  hlist_add (to private list)                   .
                                                spin
 chain_unlock                                   .
                                               chain_lock returns
 for_each_frag_queue_on_private_list           hlist_del() [B]
  frag_expire(fq) // destroy/free queue
At [B] we may delete an entry that is on the evictor's private list.
Since [A] is only set with the chain lock held, other cpus
killing an entry can test INET_FRAG_EVICTED to see whether the
entry is about to be removed by the evictor.
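The interleaving above can be replayed in a small user-space model. This is only a sketch: the model_* list helpers mimic the shape of the kernel's hlist but are hypothetical, the flag value is made up, and the chain lock is assumed to be held by each caller rather than modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FRAG_EVICTED 0x1u	/* hypothetical flag value */

struct model_node { struct model_node *next, **pprev; };
struct model_head { struct model_node *first; };

struct model_fq {
	struct model_node list;
	unsigned int flags;
};

static void model_add_head(struct model_node *n, struct model_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

static void model_del(struct model_node *n)
{
	assert(n->pprev != NULL);	/* node must be on some list */
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Evictor, chain lock held: set the flag [A], then move to the private list. */
static void model_evict(struct model_fq *fq, struct model_head *expired)
{
	fq->flags |= MODEL_FRAG_EVICTED;
	model_del(&fq->list);
	model_add_head(&fq->list, expired);
}

/* Kill path, chain lock held: skip the unlink if the evictor owns the entry. */
static void model_unlink(struct model_fq *fq)
{
	if (!(fq->flags & MODEL_FRAG_EVICTED))
		model_del(&fq->list);
}
```

With the flag test in model_unlink(), an evicted queue stays on the private list until the evictor expires it; without the test, the kill path at [B] would unlink it from a list it does not hold a lock for.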
> The fix is simple - check if the queue has the evict flag
> set under the chain lock before deleting it, this is safe because the
> evict flag is set only under that lock and having the flag set also means
> that the queue has been detached from the chain list, so no need to delete
> it again.
Right, thanks everyone.
> ---
> A few more eyes to confirm all of this would be much appreciated.
Looks correct,
Reviewed-by: Florian Westphal <fw@strlen.de>
^ permalink raw reply
* [PATCH net] inet: frags: remove the WARN_ON from inet_evict_bucket
From: Nikolay Aleksandrov @ 2014-10-28 9:44 UTC (permalink / raw)
To: netdev; +Cc: Nikolay Aleksandrov, Florian Westphal, Eric Dumazet,
Patrick McLean
In-Reply-To: <1414455409.4845.1.camel@edumazet-glaptop2.roam.corp.google.com>
The WARN_ON in inet_evict_bucket can be triggered by a valid case:
inet_frag_kill and inet_evict_bucket can be running in parallel on the
same queue which means that there has been at least one more ref added
by a previous inet_frag_find call, but inet_frag_kill can delete the
timer before inet_evict_bucket which will cause the WARN_ON() there to
trigger since we'll have refcnt!=1. Now, this case is valid because the
queue is being "killed" for some reason (removed from the chain list and
its timer deleted) so it will get destroyed in the end by one of the
inet_frag_put() calls which reaches 0 i.e. refcnt is still valid.
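To illustrate why refcnt != 1 at that point is harmless, here is a minimal user-space sketch of put-frees-at-zero reference counting. The model_* names are hypothetical and the starting state (one reference still held by a parallel lookup) is an assumption chosen to match the case described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_fq {
	atomic_int refcnt;
	bool destroyed;
};

static void model_get(struct model_fq *fq)
{
	atomic_fetch_add(&fq->refcnt, 1);
}

/* Drop one reference; whoever drops the last one destroys the queue. */
static bool model_put(struct model_fq *fq)
{
	int old = atomic_fetch_sub(&fq->refcnt, 1);

	assert(old > 0);	/* dropping a reference we do not hold is a bug */
	if (old == 1) {
		fq->destroyed = true;
		return true;
	}
	return false;
}
```

If the evictor takes its temporary reference while a lookup caller still holds one, the count is 2 when the old WARN_ON would have run. The evictor's put then returns false and the lookup caller's later put destroys the queue, so the balance is intact.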
CC: Florian Westphal <fw@strlen.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McLean <chutzpah@gentoo.org>
Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
I'm sending this as a separate patch so the race fix doesn't get blocked
in case I'm wrong and also it's a different issue.
net/ipv4/inet_fragment.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 894ec30c5896..19419b60cb37 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -146,7 +146,6 @@ evict_again:
atomic_inc(&fq->refcnt);
spin_unlock(&hb->chain_lock);
del_timer_sync(&fq->timer);
- WARN_ON(atomic_read(&fq->refcnt) != 1);
inet_frag_put(fq, f);
goto evict_again;
}
--
1.9.3
^ permalink raw reply related
* [PATCH net] inet: frags: fix a race between inet_evict_bucket and inet_frag_kill
From: Nikolay Aleksandrov @ 2014-10-28 9:30 UTC (permalink / raw)
To: netdev; +Cc: Nikolay Aleksandrov, Florian Westphal, Eric Dumazet,
Patrick McLean
In-Reply-To: <1414455409.4845.1.camel@edumazet-glaptop2.roam.corp.google.com>
When the evictor is running it adds some chosen frags to a local list to
be evicted once the chain lock has been released but at the same time
the *frag_queue can be running for some of the same queues and it
may call inet_frag_kill which will wait on the chain lock and
will then delete the queue from the wrong list since it was added in the
eviction one. The fix is simple - check if the queue has the evict flag
set under the chain lock before deleting it, this is safe because the
evict flag is set only under that lock and having the flag set also means
that the queue has been detached from the chain list, so no need to delete
it again.
An important note to make is that we're safe w.r.t refcnt because
inet_frag_kill and inet_evict_bucket will sync on the del_timer operation
where only one of the two can succeed (or if the timer is executing -
none of them), the cases are:
1. inet_frag_kill succeeds in del_timer
- then the timer ref is removed, but inet_evict_bucket will not add
this queue to its expire list but will restart eviction in that chain
2. inet_evict_bucket succeeds in del_timer
- then the timer ref is kept until the evictor "expires" the queue, but
inet_frag_kill will remove the initial ref and will set
INET_FRAG_COMPLETE which will make the frag_expire fn just to remove
its ref.
In the end all of the queue users will do an inet_frag_put and the one
that reaches 0 will free it. The refcount balance should be okay.
CC: Florian Westphal <fw@strlen.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McLean <chutzpah@gentoo.org>
Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Tested-by: Patrick McLean <chutzpah@gentoo.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
A few more eyes to confirm all of this would be much appreciated.
net/ipv4/inet_fragment.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 9eb89f3f0ee4..894ec30c5896 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -285,7 +285,8 @@ static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
struct inet_frag_bucket *hb;
hb = get_frag_bucket_locked(fq, f);
- hlist_del(&fq->list);
+ if (!(fq->flags & INET_FRAG_EVICTED))
+ hlist_del(&fq->list);
spin_unlock(&hb->chain_lock);
}
--
1.9.3
^ permalink raw reply related
* [PATCH v2 net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Erik Kline @ 2014-10-28 9:11 UTC (permalink / raw)
To: netdev; +Cc: davem, ben, lorenzo, hannes, Erik Kline
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
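As a rough illustration of the selection change, the two rules touched by the patch can be sketched in isolation. The MODEL_* flag values and model_* functions are illustrative stand-ins; the sketch omits the ipv6_saddr_preferred() short-circuit and folds the optimistic_dad and use_optimistic settings into a single boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the IFA_F_* flags named in the patch; values are illustrative. */
#define MODEL_IFA_F_OPTIMISTIC 0x04u
#define MODEL_IFA_F_DEPRECATED 0x20u

/* Rule 3: returns 1 when the address is not in the "avoid" set. */
static int model_rule_preferred(unsigned int flags, bool use_optimistic)
{
	unsigned int avoid = MODEL_IFA_F_DEPRECATED;

	if (!use_optimistic)
		avoid |= MODEL_IFA_F_OPTIMISTIC;
	return !(flags & avoid);
}

/* Late tie-breaker: a non-optimistic address still scores higher. */
static int model_rule_not_optimistic(unsigned int flags)
{
	return !(flags & MODEL_IFA_F_OPTIMISTIC);
}
```

With the sysctl off, an optimistic address loses at rule 3 as before. With it on, the address passes rule 3 and only loses at the final tie-breaker, which is what lets the wifi address be picked for the wifi route while DAD is still running.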
Signed-off-by: Erik Kline <ek@google.com>
---
Documentation/networking/ip-sysctl.txt | 13 ++++++++++
include/linux/ipv6.h | 1 +
include/uapi/linux/ipv6.h | 1 +
net/ipv6/addrconf.c | 46 ++++++++++++++++++++++++++++++++--
4 files changed, 59 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 0307e28..e03cf49 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1452,6 +1452,19 @@ suppress_frag_ndisc - INTEGER
1 - (default) discard fragmented neighbor discovery packets
0 - allow fragmented neighbor discovery packets
+optimistic_dad - BOOLEAN
+ Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
+ 0: disabled (default)
+ 1: enabled
+
+use_optimistic - BOOLEAN
+ If enabled, do not classify optimistic addresses as deprecated during
+ source address selection. Preferred addresses will still be chosen
+ before optimistic addresses, subject to other ranking in the source
+ address selection algorithm.
+ 0: disabled (default)
+ 1: enabled
+
icmp/*:
ratelimit - INTEGER
Limit the maximal rates for sending ICMPv6 packets.
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index ff56053..7121a2e 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -42,6 +42,7 @@ struct ipv6_devconf {
__s32 accept_ra_from_local;
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
__s32 optimistic_dad;
+ __s32 use_optimistic;
#endif
#ifdef CONFIG_IPV6_MROUTE
__s32 mc_forwarding;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index efa2666..e863d08 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -164,6 +164,7 @@ enum {
DEVCONF_MLDV2_UNSOLICITED_REPORT_INTERVAL,
DEVCONF_SUPPRESS_FRAG_NDISC,
DEVCONF_ACCEPT_RA_FROM_LOCAL,
+ DEVCONF_USE_OPTIMISTIC,
DEVCONF_MAX
};
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 50b95b2..8d12b7c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1170,6 +1170,9 @@ enum {
IPV6_SADDR_RULE_PRIVACY,
IPV6_SADDR_RULE_ORCHID,
IPV6_SADDR_RULE_PREFIX,
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ IPV6_SADDR_RULE_NOT_OPTIMISTIC,
+#endif
IPV6_SADDR_RULE_MAX
};
@@ -1197,6 +1200,15 @@ static inline int ipv6_saddr_preferred(int type)
return 0;
}
+static inline bool ipv6_use_optimistic_addr(struct inet6_dev *idev)
+{
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ return idev && idev->cnf.optimistic_dad && idev->cnf.use_optimistic;
+#else
+ return false;
+#endif
+}
+
static int ipv6_get_saddr_eval(struct net *net,
struct ipv6_saddr_score *score,
struct ipv6_saddr_dst *dst,
@@ -1257,10 +1269,16 @@ static int ipv6_get_saddr_eval(struct net *net,
score->scopedist = ret;
break;
case IPV6_SADDR_RULE_PREFERRED:
+ {
/* Rule 3: Avoid deprecated and optimistic addresses */
+ u8 avoid = IFA_F_DEPRECATED;
+
+ if (!ipv6_use_optimistic_addr(score->ifa->idev))
+ avoid |= IFA_F_OPTIMISTIC;
ret = ipv6_saddr_preferred(score->addr_type) ||
- !(score->ifa->flags & (IFA_F_DEPRECATED|IFA_F_OPTIMISTIC));
+ !(score->ifa->flags & avoid);
break;
+ }
#ifdef CONFIG_IPV6_MIP6
case IPV6_SADDR_RULE_HOA:
{
@@ -1306,6 +1324,14 @@ static int ipv6_get_saddr_eval(struct net *net,
ret = score->ifa->prefix_len;
score->matchlen = ret;
break;
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ case IPV6_SADDR_RULE_NOT_OPTIMISTIC:
+ /* Optimistic addresses still have lower precedence than other
+ * preferred addresses.
+ */
+ ret = !(score->ifa->flags & IFA_F_OPTIMISTIC);
+ break;
+#endif
default:
ret = 0;
}
@@ -3222,8 +3248,15 @@ static void addrconf_dad_begin(struct inet6_ifaddr *ifp)
* Optimistic nodes can start receiving
* Frames right away
*/
- if (ifp->flags & IFA_F_OPTIMISTIC)
+ if (ifp->flags & IFA_F_OPTIMISTIC) {
ip6_ins_rt(ifp->rt);
+ if (ipv6_use_optimistic_addr(idev)) {
+ /* Because optimistic nodes can use this address,
+ * notify listeners. If DAD fails, RTM_DELADDR is sent.
+ */
+ ipv6_ifa_notify(RTM_NEWADDR, ifp);
+ }
+ }
addrconf_dad_kick(ifp);
out:
@@ -4330,6 +4363,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
array[DEVCONF_ACCEPT_SOURCE_ROUTE] = cnf->accept_source_route;
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
array[DEVCONF_OPTIMISTIC_DAD] = cnf->optimistic_dad;
+ array[DEVCONF_USE_OPTIMISTIC] = cnf->use_optimistic;
#endif
#ifdef CONFIG_IPV6_MROUTE
array[DEVCONF_MC_FORWARDING] = cnf->mc_forwarding;
@@ -5155,6 +5189,14 @@ static struct addrconf_sysctl_table
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "use_optimistic",
+ .data = &ipv6_devconf.use_optimistic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+
+ },
#endif
#ifdef CONFIG_IPV6_MROUTE
{
--
2.1.0.rc2.206.gedb03e5
^ permalink raw reply related
* Re: [PATCH net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Erik Kline @ 2014-10-28 8:59 UTC (permalink / raw)
To: Lorenzo Colitti
Cc: netdev@vger.kernel.org, David Miller, Ben Hutchings,
Hannes Frederic Sowa
In-Reply-To: <CAKD1Yr13oLyBsbuh92F91Fjno3Xv4vBQ1B__8Csoi0XUjJv9nw@mail.gmail.com>
>> * Configure the address for reception. Now it is valid.
>> */
>>
>> - ipv6_ifa_notify(RTM_NEWADDR, ifp);
>> + /* If optimistic DAD is in use, the notification was already sent
>> + * in addrconf_dad_begin().
>> + */
>> + if (!ipv6_use_optimistic_addr(ifp->idev))
>> + ipv6_ifa_notify(RTM_NEWADDR, ifp);
>
> Won't this result in not sending RTM_NEWADDR messages if
> use_optimistic is enabled on the interface, but the IP address that
> has just completed DAD is not an optimistic address (e.g., if it's a
> manually-configured address)?
Gah, yes. I originally unconditionally sent the RTM_NEWADDR, but
there was some concern about sending duplicates so this was a weak
attempt to reduce spurious messages.
I still think sending the RTM_NEWADDR unconditionally is the right
thing, since we send them all the time when something about the
address changes (including timer-refresh on receipt of RAs).
I'll revert that bit and send an updated patch ASAP.
Thanks.
^ permalink raw reply
* Re: [PATCH iproute2] xfrm: add support of ESN and anti-replay window
From: Rongqing Li @ 2014-10-28 8:30 UTC (permalink / raw)
To: Nicolas Dichtel; +Cc: shemminger, netdev, dingzhi, Adrien Mazarguil
In-Reply-To: <1413796984-9867-1-git-send-email-nicolas.dichtel@6wind.com>
On 10/20/2014 05:23 PM, Nicolas Dichtel wrote:
> From: dingzhi <zhi.ding@6wind.com>
>
> This patch allows to configure ESN and anti-replay window.
>
> Signed-off-by: dingzhi <zhi.ding@6wind.com>
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---
> ip/ipxfrm.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ip/xfrm_state.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
The docs or man page need to be updated.
-Roy
^ permalink raw reply
* Re: [PATCH net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Lorenzo Colitti @ 2014-10-28 8:19 UTC (permalink / raw)
To: Erik Kline
Cc: netdev@vger.kernel.org, David Miller, Ben Hutchings,
Hannes Frederic Sowa
In-Reply-To: <1414482141-27912-1-git-send-email-ek@google.com>
On Tue, Oct 28, 2014 at 4:42 PM, Erik Kline <ek@google.com> wrote:
> * Configure the address for reception. Now it is valid.
> */
>
> - ipv6_ifa_notify(RTM_NEWADDR, ifp);
> + /* If optimistic DAD is in use, the notification was already sent
> + * in addrconf_dad_begin().
> + */
> + if (!ipv6_use_optimistic_addr(ifp->idev))
> + ipv6_ifa_notify(RTM_NEWADDR, ifp);
Won't this result in not sending RTM_NEWADDR messages if
use_optimistic is enabled on the interface, but the IP address that
has just completed DAD is not an optimistic address (e.g., if it's a
manually-configured address)?
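The concern can be made concrete by counting the RTM_NEWADDR messages a listener would see. This is a sketch only: the model_* functions are hypothetical reductions of the two notification sites, the flag value is illustrative, and the interface setting is reduced to one boolean.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_IFA_F_OPTIMISTIC 0x04u	/* illustrative value */

/* DAD start: early notification only for optimistic addresses. */
static int model_notify_at_dad_begin(bool iface_use_optimistic,
				     unsigned int addr_flags)
{
	return (addr_flags & MODEL_IFA_F_OPTIMISTIC) && iface_use_optimistic;
}

/* DAD completed, as in the quoted hunk: keyed on the interface setting only. */
static int model_notify_at_dad_done_v1(bool iface_use_optimistic)
{
	return !iface_use_optimistic;
}

/* DAD completed, unconditional variant. */
static int model_notify_at_dad_done_v2(void)
{
	return 1;
}
```

For a non-optimistic address on an interface with use_optimistic enabled, the first variant yields zero notifications in total, while the unconditional variant yields one (and two for a genuinely optimistic address).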
^ permalink raw reply
* Re: localed stuck in recent 3.18 git in copy_net_ns?
From: Yanko Kaneti @ 2014-10-28 8:12 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Jay Vosburgh, Josh Boyer, Eric W. Biederman, Cong Wang,
Kevin Fenzi, netdev, Linux-Kernel@Vger. Kernel. Org, mroos, tj
In-Reply-To: <20141027174539.GC27568@linux.vnet.ibm.com>
On Mon-10/27/14-2014 10:45, Paul E. McKenney wrote:
> On Sat, Oct 25, 2014 at 11:18:27AM -0700, Paul E. McKenney wrote:
> > On Sat, Oct 25, 2014 at 09:38:16AM -0700, Jay Vosburgh wrote:
> > > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > >On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh wrote:
> > > >> Looking at the dmesg, the early boot messages seem to be
> > > >> confused as to how many CPUs there are, e.g.,
> > > >>
> > > >> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> > > >> [ 0.000000] Hierarchical RCU implementation.
> > > >> [ 0.000000] RCU debugfs-based tracing is enabled.
> > > >> [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled.
> > > >> [ 0.000000] RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
> > > >> [ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
> > > >> [ 0.000000] NR_IRQS:16640 nr_irqs:456 0
> > > >> [ 0.000000] Offload RCU callbacks from all CPUs
> > > >> [ 0.000000] Offload RCU callbacks from CPUs: 0-3.
> > > >>
> > > >> but later shows 2:
> > > >>
> > > >> [ 0.233703] x86: Booting SMP configuration:
> > > >> [ 0.236003] .... node #0, CPUs: #1
> > > >> [ 0.255528] x86: Booted up 1 node, 2 CPUs
> > > >>
> > > >> In any event, the E8400 is a 2 core CPU with no hyperthreading.
> > > >
> > > >Well, this might explain some of the difficulties. If RCU decides to wait
> > > >on CPUs that don't exist, we will of course get a hang. And rcu_barrier()
> > > >was definitely expecting four CPUs.
> > > >
> > > >So what happens if you boot with maxcpus=2? (Or build with
> > > >CONFIG_NR_CPUS=2.) I suspect that this might avoid the hang. If so,
> > > >I might have some ideas for a real fix.
> > >
> > > Booting with maxcpus=2 makes no difference (the dmesg output is
> > > the same).
> > >
> > > Rebuilding with CONFIG_NR_CPUS=2 makes the problem go away, and
> > > dmesg has different CPU information at boot:
> > >
> > > [ 0.000000] smpboot: 4 Processors exceeds NR_CPUS limit of 2
> > > [ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
> > > [...]
> > > [ 0.000000] setup_percpu: NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
> > > [...]
> > > [ 0.000000] Hierarchical RCU implementation.
> > > [ 0.000000] RCU debugfs-based tracing is enabled.
> > > [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled.
> > > [ 0.000000] NR_IRQS:4352 nr_irqs:440 0
> > > [ 0.000000] Offload RCU callbacks from all CPUs
> > > [ 0.000000] Offload RCU callbacks from CPUs: 0-1.
> >
> > Thank you -- this confirms my suspicions on the fix, though I must admit
> > to being surprised that maxcpus made no difference.
>
> And here is an alleged fix, lightly tested at this end. Does this patch
> help?
Tested this on top of rc2 (as found in Fedora, and failing without the patch)
with all my modprobe scenarios and it seems to have fixed it.
Thanks
-Yanko
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Make rcu_barrier() understand about missing rcuo kthreads
>
> Commit 35ce7f29a44a (rcu: Create rcuo kthreads only for onlined CPUs)
> avoids creating rcuo kthreads for CPUs that never come online. This
> fixes a bug in many instances of firmware: Instead of lying about their
> age, these systems instead lie about the number of CPUs that they have.
> Before commit 35ce7f29a44a, this could result in huge numbers of useless
> rcuo kthreads being created.
>
> It appears that experience indicates that I should have told the
> people suffering from this problem to fix their broken firmware, but
> I instead produced what turned out to be a partial fix. The missing
> piece supplied by this commit makes sure that rcu_barrier() knows not to
> post callbacks for no-CBs CPUs that have not yet come online, because
> otherwise rcu_barrier() will hang on systems having firmware that lies
> about the number of CPUs.
>
> It is tempting to simply have rcu_barrier() refuse to post a callback on
> any no-CBs CPU that does not have an rcuo kthread. This unfortunately
> does not work because rcu_barrier() is required to wait for all pending
> callbacks. It is therefore required to wait even for those callbacks
> that cannot possibly be invoked. Even if doing so hangs the system.
>
> Given that posting a callback to a no-CBs CPU that does not yet have an
> rcuo kthread can hang rcu_barrier(), it is tempting to report an error
> in this case. Unfortunately, this will result in false positives at
> boot time, when it is perfectly legal to post callbacks to the boot CPU
> before the scheduler has started, in other words, before it is legal
> to invoke rcu_barrier().
>
> So this commit instead has rcu_barrier() avoid posting callbacks to
> CPUs having neither rcuo kthread nor pending callbacks, and has it
> complain bitterly if it finds CPUs having no rcuo kthread but some
> pending callbacks. And when rcu_barrier() does find CPUs having no rcuo
> kthread but pending callbacks, as noted earlier, it has no choice but
> to hang indefinitely.
>
> Reported-by: Yanko Kaneti <yaneti@declera.com>
> Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
> Reported-by: Meelis Roos <mroos@linux.ee>
> Reported-by: Eric B Munson <emunson@akamai.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index aa8e5eea3ab4..c78e88ce5ea3 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -660,18 +660,18 @@ TRACE_EVENT(rcu_torture_read,
> /*
> * Tracepoint for _rcu_barrier() execution. The string "s" describes
> * the _rcu_barrier phase:
> - * "Begin": rcu_barrier_callback() started.
> - * "Check": rcu_barrier_callback() checking for piggybacking.
> - * "EarlyExit": rcu_barrier_callback() piggybacked, thus early exit.
> - * "Inc1": rcu_barrier_callback() piggyback check counter incremented.
> - * "Offline": rcu_barrier_callback() found offline CPU
> - * "OnlineNoCB": rcu_barrier_callback() found online no-CBs CPU.
> - * "OnlineQ": rcu_barrier_callback() found online CPU with callbacks.
> - * "OnlineNQ": rcu_barrier_callback() found online CPU, no callbacks.
> + * "Begin": _rcu_barrier() started.
> + * "Check": _rcu_barrier() checking for piggybacking.
> + * "EarlyExit": _rcu_barrier() piggybacked, thus early exit.
> + * "Inc1": _rcu_barrier() piggyback check counter incremented.
> + * "OfflineNoCB": _rcu_barrier() found callback on never-online CPU
> + * "OnlineNoCB": _rcu_barrier() found online no-CBs CPU.
> + * "OnlineQ": _rcu_barrier() found online CPU with callbacks.
> + * "OnlineNQ": _rcu_barrier() found online CPU, no callbacks.
> * "IRQ": An rcu_barrier_callback() callback posted on remote CPU.
> * "CB": An rcu_barrier_callback() invoked a callback, not the last.
> * "LastCB": An rcu_barrier_callback() invoked the last callback.
> - * "Inc2": rcu_barrier_callback() piggyback check counter incremented.
> + * "Inc2": _rcu_barrier() piggyback check counter incremented.
> * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument
> * is the count of remaining callbacks, and "done" is the piggybacking count.
> */
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f6880052b917..7680fc275036 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3312,11 +3312,16 @@ static void _rcu_barrier(struct rcu_state *rsp)
> continue;
> rdp = per_cpu_ptr(rsp->rda, cpu);
> if (rcu_is_nocb_cpu(cpu)) {
> - _rcu_barrier_trace(rsp, "OnlineNoCB", cpu,
> - rsp->n_barrier_done);
> - atomic_inc(&rsp->barrier_cpu_count);
> - __call_rcu(&rdp->barrier_head, rcu_barrier_callback,
> - rsp, cpu, 0);
> + if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) {
> + _rcu_barrier_trace(rsp, "OfflineNoCB", cpu,
> + rsp->n_barrier_done);
> + } else {
> + _rcu_barrier_trace(rsp, "OnlineNoCB", cpu,
> + rsp->n_barrier_done);
> + atomic_inc(&rsp->barrier_cpu_count);
> + __call_rcu(&rdp->barrier_head,
> + rcu_barrier_callback, rsp, cpu, 0);
> + }
> } else if (ACCESS_ONCE(rdp->qlen)) {
> _rcu_barrier_trace(rsp, "OnlineQ", cpu,
> rsp->n_barrier_done);
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 4beab3d2328c..8e7b1843896e 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -587,6 +587,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu);
> static void print_cpu_stall_info_end(void);
> static void zero_cpu_stall_ticks(struct rcu_data *rdp);
> static void increment_cpu_stall_ticks(void);
> +static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu);
> static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq);
> static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp);
> static void rcu_init_one_nocb(struct rcu_node *rnp);
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 927c17b081c7..68c5b23b7173 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -2050,6 +2050,33 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
> }
>
> /*
> + * Does the specified CPU need an RCU callback for the specified flavor
> + * of rcu_barrier()?
> + */
> +static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu)
> +{
> + struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
> + struct rcu_head *rhp;
> +
> + /* No-CBs CPUs might have callbacks on any of three lists. */
> + rhp = ACCESS_ONCE(rdp->nocb_head);
> + if (!rhp)
> + rhp = ACCESS_ONCE(rdp->nocb_gp_head);
> + if (!rhp)
> + rhp = ACCESS_ONCE(rdp->nocb_follower_head);
> +
> + /* Having no rcuo kthread but CBs after scheduler starts is bad! */
> + if (!ACCESS_ONCE(rdp->nocb_kthread) && rhp) {
> + /* RCU callback enqueued before CPU first came online??? */
> + pr_err("RCU: Never-onlined no-CBs CPU %d has CB %p\n",
> + cpu, rhp->func);
> + WARN_ON_ONCE(1);
> + }
> +
> + return !!rhp;
> +}
> +
> +/*
> * Enqueue the specified string of rcu_head structures onto the specified
> * CPU's no-CBs lists. The CPU is specified by rdp, the head of the
> * string by rhp, and the tail of the string by rhtp. The non-lazy/lazy
> @@ -2646,6 +2673,11 @@ static bool init_nocb_callback_list(struct rcu_data *rdp)
>
> #else /* #ifdef CONFIG_RCU_NOCB_CPU */
>
> +static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu)
> +{
> + return false; /* No no-CBs CPUs in this configuration. */
> +}
> +
> static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
> {
> }
>
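The decision made by rcu_nocb_cpu_needs_barrier() in the quoted patch can be summarised with a small user-space model. The struct and function names below are hypothetical, the three list heads are plain pointers rather than ACCESS_ONCE() loads, and the pr_err()/WARN_ON_ONCE() pair is reduced to a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_rdp {
	const void *nocb_head;
	const void *nocb_gp_head;
	const void *nocb_follower_head;
	bool has_kthread;		/* rcuo kthread exists for this CPU */
};

/* Returns true when the barrier must post a callback to this no-CBs CPU. */
static bool model_needs_barrier(const struct model_rdp *rdp, bool *warned)
{
	const void *rhp = rdp->nocb_head;

	/* Callbacks may sit on any of the three lists. */
	if (!rhp)
		rhp = rdp->nocb_gp_head;
	if (!rhp)
		rhp = rdp->nocb_follower_head;

	/* Callbacks without a kthread to invoke them: complain. */
	*warned = !rdp->has_kthread && rhp != NULL;

	return rhp != NULL;
}
```

A never-onlined CPU with empty lists is skipped silently, a CPU with pending callbacks gets a barrier callback, and a CPU with callbacks but no kthread both warns and gets the callback, which, as the changelog says, can then only hang.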
^ permalink raw reply
* Re: [PATCH] ovs: Turn vports with dependencies into separate modules
From: Thomas Graf @ 2014-10-28 8:10 UTC (permalink / raw)
To: Pravin Shelar; +Cc: dev@openvswitch.org, netdev
In-Reply-To: <CALnjE+p5b5EzLkY6_6J7jvwyf9rUdu-JGbdf5r3bDuhgndoeeg@mail.gmail.com>
On 10/27/14 at 05:27pm, Pravin Shelar wrote:
> On Mon, Oct 27, 2014 at 2:47 PM, Thomas Graf <tgraf@suug.ch> wrote:
> > What I mean specifically is the following dependency logic which will
> > no longer be required:
> >
> > depends on NET_IPGRE_DEMUX && !(OPENVSWITCH=y && NET_IPGRE_DEMUX=m)
> >
> > The patch also brings additional flexibility to users of
> > distributions. Distros typically ship something like an allmodconfig
> > so a user can either run openvswitch.ko with all encaps compiled in
> > or not run openvswitch.ko. With vports as module, a user can blacklist
> > a certain encap type.
> >
> > Another advantage is obviously that users can run additional vport
> > types on top of their distribution kernels.
> >
> > Is there anything specific that you are concerned with in regard
> > to this proposed change?
>
> OVS vport code is not a lot, and making it a pluggable module does not save
> much space. Even with this patch a user cannot load any vport type,
> since we still need to define the type in the kernel interface and add the
> support in the userspace netdev layer. Therefore this patch adds
> complexity without much gain.
Defining the type in the header now only serves the purpose of
reserving unique vport types. It will be perfectly fine to compile a
vport module of a newer OVS user space against an older kernel (that
has the vport API) and load the vport module even though that kernel
version does not have any explicit awareness of that type. This is
something users of distribution kernels like to do because they
typically can't recompile the kernel without breaking support contracts.
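A minimal sketch of the registration idea, assuming a simple list keyed by the reserved type number. The model_* names and the list layout are hypothetical and not the API from the patch; locking and module reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_vport_ops {
	int type;			/* reserved type number from the header */
	const char *name;
	struct model_vport_ops *next;
};

static struct model_vport_ops *model_vport_list;

static struct model_vport_ops *model_vport_lookup(int type)
{
	struct model_vport_ops *ops;

	for (ops = model_vport_list; ops; ops = ops->next)
		if (ops->type == type)
			return ops;
	return NULL;
}

/* Called from a vport module's init: the core learns the type at load time. */
static int model_vport_register(struct model_vport_ops *ops)
{
	if (model_vport_lookup(ops->type))
		return -EEXIST;
	ops->next = model_vport_list;
	model_vport_list = ops;
	return 0;
}

/* Called from a vport module's exit. */
static void model_vport_unregister(struct model_vport_ops *ops)
{
	struct model_vport_ops **pp;

	for (pp = &model_vport_list; *pp; pp = &(*pp)->next) {
		if (*pp == ops) {
			*pp = ops->next;
			ops->next = NULL;
			return;
		}
	}
}
```

The core only needs the lookup at vport creation time, so a type it was never compiled with works as soon as some module has registered it.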
^ permalink raw reply
* [PATCH net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Erik Kline @ 2014-10-28 7:42 UTC (permalink / raw)
To: netdev; +Cc: davem, ben, lorenzo, hannes, Erik Kline
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s)), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try to use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts. If DAD fails, a separate
RTM_DELADDR is always sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
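For userspace that wants to check the new knob, a small sketch of reading it follows. It assumes the entry appears alongside optimistic_dad under /proc/sys/net/ipv6/conf/<ifname>/; the model_* helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the procfs path of the knob; returns 0, or -1 if buf is too small. */
static int model_use_optimistic_path(char *buf, size_t len, const char *ifname)
{
	int n = snprintf(buf, len,
			 "/proc/sys/net/ipv6/conf/%s/use_optimistic", ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns the knob's value, or -1 if it is missing or unreadable. */
static int model_read_use_optimistic(const char *ifname)
{
	char path[128];
	int val = -1;
	FILE *f;

	if (model_use_optimistic_path(path, sizeof(path), ifname) != 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;	/* e.g. no such interface, or kernel without the knob */
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

A return of -1 is expected on kernels built without CONFIG_IPV6_OPTIMISTIC_DAD, since the entry sits inside that #ifdef in the sysctl table.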
Signed-off-by: Erik Kline <ek@google.com>
---
Documentation/networking/ip-sysctl.txt | 13 +++++++++
include/linux/ipv6.h | 1 +
include/uapi/linux/ipv6.h | 1 +
net/ipv6/addrconf.c | 52 ++++++++++++++++++++++++++++++++--
4 files changed, 64 insertions(+), 3 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 0307e28..e03cf49 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1452,6 +1452,19 @@ suppress_frag_ndisc - INTEGER
1 - (default) discard fragmented neighbor discovery packets
0 - allow fragmented neighbor discovery packets
+optimistic_dad - BOOLEAN
+ Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
+ 0: disabled (default)
+ 1: enabled
+
+use_optimistic - BOOLEAN
+ If enabled, do not classify optimistic addresses as deprecated during
+ source address selection. Preferred addresses will still be chosen
+ before optimistic addresses, subject to other ranking in the source
+ address selection algorithm.
+ 0: disabled (default)
+ 1: enabled
+
icmp/*:
ratelimit - INTEGER
Limit the maximal rates for sending ICMPv6 packets.
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index ff56053..7121a2e 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -42,6 +42,7 @@ struct ipv6_devconf {
__s32 accept_ra_from_local;
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
__s32 optimistic_dad;
+ __s32 use_optimistic;
#endif
#ifdef CONFIG_IPV6_MROUTE
__s32 mc_forwarding;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index efa2666..e863d08 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -164,6 +164,7 @@ enum {
DEVCONF_MLDV2_UNSOLICITED_REPORT_INTERVAL,
DEVCONF_SUPPRESS_FRAG_NDISC,
DEVCONF_ACCEPT_RA_FROM_LOCAL,
+ DEVCONF_USE_OPTIMISTIC,
DEVCONF_MAX
};
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 50b95b2..7161743 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1170,6 +1170,9 @@ enum {
IPV6_SADDR_RULE_PRIVACY,
IPV6_SADDR_RULE_ORCHID,
IPV6_SADDR_RULE_PREFIX,
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ IPV6_SADDR_RULE_NOT_OPTIMISTIC,
+#endif
IPV6_SADDR_RULE_MAX
};
@@ -1197,6 +1200,15 @@ static inline int ipv6_saddr_preferred(int type)
return 0;
}
+static inline bool ipv6_use_optimistic_addr(struct inet6_dev *idev)
+{
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ return idev && idev->cnf.optimistic_dad && idev->cnf.use_optimistic;
+#else
+ return false;
+#endif
+}
+
static int ipv6_get_saddr_eval(struct net *net,
struct ipv6_saddr_score *score,
struct ipv6_saddr_dst *dst,
@@ -1257,10 +1269,16 @@ static int ipv6_get_saddr_eval(struct net *net,
score->scopedist = ret;
break;
case IPV6_SADDR_RULE_PREFERRED:
+ {
/* Rule 3: Avoid deprecated and optimistic addresses */
+ u8 avoid = IFA_F_DEPRECATED;
+
+ if (!ipv6_use_optimistic_addr(score->ifa->idev))
+ avoid |= IFA_F_OPTIMISTIC;
ret = ipv6_saddr_preferred(score->addr_type) ||
- !(score->ifa->flags & (IFA_F_DEPRECATED|IFA_F_OPTIMISTIC));
+ !(score->ifa->flags & avoid);
break;
+ }
#ifdef CONFIG_IPV6_MIP6
case IPV6_SADDR_RULE_HOA:
{
@@ -1306,6 +1324,14 @@ static int ipv6_get_saddr_eval(struct net *net,
ret = score->ifa->prefix_len;
score->matchlen = ret;
break;
+#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
+ case IPV6_SADDR_RULE_NOT_OPTIMISTIC:
+ /* Optimistic addresses still have lower precedence than other
+ * preferred addresses.
+ */
+ ret = !(score->ifa->flags & IFA_F_OPTIMISTIC);
+ break;
+#endif
default:
ret = 0;
}
@@ -3222,8 +3248,15 @@ static void addrconf_dad_begin(struct inet6_ifaddr *ifp)
* Optimistic nodes can start receiving
* Frames right away
*/
- if (ifp->flags & IFA_F_OPTIMISTIC)
+ if (ifp->flags & IFA_F_OPTIMISTIC) {
ip6_ins_rt(ifp->rt);
+ if (ipv6_use_optimistic_addr(idev)) {
+ /* Because optimistic nodes can use this address,
+ * notify listeners. If DAD fails, RTM_DELADDR is sent.
+ */
+ ipv6_ifa_notify(RTM_NEWADDR, ifp);
+ }
+ }
addrconf_dad_kick(ifp);
out:
@@ -3354,7 +3387,11 @@ static void addrconf_dad_completed(struct inet6_ifaddr *ifp)
* Configure the address for reception. Now it is valid.
*/
- ipv6_ifa_notify(RTM_NEWADDR, ifp);
+ /* If optimistic DAD is in use, the notification was already sent
+ * in addrconf_dad_begin().
+ */
+ if (!ipv6_use_optimistic_addr(ifp->idev))
+ ipv6_ifa_notify(RTM_NEWADDR, ifp);
/* If added prefix is link local and we are prepared to process
router advertisements, start sending router solicitations.
@@ -4330,6 +4367,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
array[DEVCONF_ACCEPT_SOURCE_ROUTE] = cnf->accept_source_route;
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
array[DEVCONF_OPTIMISTIC_DAD] = cnf->optimistic_dad;
+ array[DEVCONF_USE_OPTIMISTIC] = cnf->use_optimistic;
#endif
#ifdef CONFIG_IPV6_MROUTE
array[DEVCONF_MC_FORWARDING] = cnf->mc_forwarding;
@@ -5155,6 +5193,14 @@ static struct addrconf_sysctl_table
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "use_optimistic",
+ .data = &ipv6_devconf.use_optimistic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+
+ },
#endif
#ifdef CONFIG_IPV6_MROUTE
{
--
2.1.0.rc2.206.gedb03e5
^ permalink raw reply related
* Re: [net 1/2] sctp: add transport state in /proc/net/sctp/remaddr
From: Michele Baldessari @ 2014-10-28 7:20 UTC (permalink / raw)
To: David Miller; +Cc: linux-sctp, vyasevich, nhorman, netdev, dborkman
In-Reply-To: <20141027.185545.551457974536550723.davem@davemloft.net>
Hi David,
On Mon, Oct 27, 2014 at 06:55:45PM -0400, David Miller wrote:
> From: Michele Baldessari <michele@acksyn.org>
> Date: Thu, 23 Oct 2014 21:48:40 +0200
>
> > It is often quite helpful to be able to know the state of a transport
> > outside of the application itself (for troubleshooting purposes or for
> > monitoring purposes). Add it under /proc/net/sctp/remaddr.
> >
> > Signed-off-by: Michele Baldessari <michele@acksyn.org>
>
> You can't change the layout of procfs files, applications parse
> these files and any modification can potentially break such tools.
Thanks for the review. I assumed that extending a procfile by adding
a column at the end is ok and that tools must cope with that anyway.
(as was done in commit f19c29e3e391a66a273e9afebaf01917245148cd)
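As a rough sketch of what "coping" would look like on the tool side, a
parser that reads only the leading columns it knows keeps working when a
column is appended. The four-column layout and the names below are
assumptions made for the example, not the real remaddr format, and a
tool that instead checks for an exact column count would still break:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Leading columns a hypothetical monitoring tool cares about. */
struct remaddr_row {
	char addr[64];
	int assoc_id;
	int hb_act;
	int rto;
};

/*
 * Parse the known leading columns of one line. Anything after the
 * fourth field is ignored, so appended columns do not cause a failure.
 * Returns 0 on success, -1 if a known column is missing or malformed.
 */
static int parse_remaddr_row(const char *line, struct remaddr_row *row)
{
	int n = sscanf(line, "%63s %d %d %d",
		       row->addr, &row->assoc_id, &row->hb_act, &row->rto);

	return n == 4 ? 0 : -1;
}
```

A line with one extra trailing field parses to the same struct as the
line without it; only a line with fewer than four fields is rejected.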
> Secondly, even if this change were acceptable, targeting this
> change at anything other than the net-next tree is not appropriate
> because it is a new feature.
Ok. Unless you are against adding a column, I'll resubmit to net-next
later this week.
Thanks,
Michele
--
Michele Baldessari <michele@acksyn.org>
C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D
^ permalink raw reply
* [PATCH net-next 2/2] r8152: support nway_reset of ethtool
From: Hayes Wang @ 2014-10-28 6:05 UTC (permalink / raw)
To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-66-Taiwan-albertk@realtek.com>
Support the nway_reset() function for ethtool.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
drivers/net/usb/r8152.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index fdea194..e1810bc 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3558,11 +3558,33 @@ out:
return ret;
}
+static int rtl8152_nway_reset(struct net_device *dev)
+{
+ struct r8152 *tp = netdev_priv(dev);
+ int ret;
+
+ ret = usb_autopm_get_interface(tp->intf);
+ if (ret < 0)
+ goto out;
+
+ mutex_lock(&tp->control);
+
+ ret = mii_nway_restart(&tp->mii);
+
+ mutex_unlock(&tp->control);
+
+ usb_autopm_put_interface(tp->intf);
+
+out:
+ return ret;
+}
+
static struct ethtool_ops ops = {
.get_drvinfo = rtl8152_get_drvinfo,
.get_settings = rtl8152_get_settings,
.set_settings = rtl8152_set_settings,
.get_link = ethtool_op_get_link,
+ .nway_reset = rtl8152_nway_reset,
.get_msglevel = rtl8152_get_msglevel,
.set_msglevel = rtl8152_set_msglevel,
.get_wol = rtl8152_get_wol,
--
1.9.3
^ permalink raw reply related
* [PATCH net-next 1/2] r8152: rename tx_underun
From: Hayes Wang @ 2014-10-28 6:05 UTC (permalink / raw)
To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-66-Taiwan-albertk@realtek.com>
Replace tx_underun with tx_underrun for checkpatch.pl.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
drivers/net/usb/r8152.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index e3d84c3..fdea194 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -486,7 +486,7 @@ struct tally_counter {
__le64 rx_broadcast;
__le32 rx_multicast;
__le16 tx_aborted;
- __le16 tx_underun;
+ __le16 tx_underrun;
};
struct rx_desc {
@@ -3420,7 +3420,7 @@ static void rtl8152_get_ethtool_stats(struct net_device *dev,
data[9] = le64_to_cpu(tally.rx_broadcast);
data[10] = le32_to_cpu(tally.rx_multicast);
data[11] = le16_to_cpu(tally.tx_aborted);
- data[12] = le16_to_cpu(tally.tx_underun);
+ data[12] = le16_to_cpu(tally.tx_underrun);
}
static void rtl8152_get_strings(struct net_device *dev, u32 stringset, u8 *data)
--
1.9.3
^ permalink raw reply related
* [PATCH net-next 0/2] r8152: support nway_reset
From: Hayes Wang @ 2014-10-28 6:05 UTC (permalink / raw)
To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
Fix the CHECK from checkpatch.pl and support nway_reset.
Hayes Wang (2):
r8152: rename tx_underun
r8152: support nway_reset of ethtool
drivers/net/usb/r8152.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
--
1.9.3
^ permalink raw reply
* [PATCH net 1/1] cnic: Update the rcu_access_pointer() usages
From: Nilesh Javali @ 2014-10-28 5:18 UTC (permalink / raw)
To: davem
Cc: netdev, Dept-GELinuxNICDev, sudarsana.kalluru, vikas.chaudhary,
giridhar.malavali, tej.parkash
In-Reply-To: <1414473495-24790-1-git-send-email-nilesh.javali@qlogic.com>
From: Tej Parkash <tej.parkash@qlogic.com>
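1. Remove the rcu_read_lock()/rcu_read_unlock() pair around
   rcu_access_pointer(), which only tests the pointer and does not
   dereference it.
2. Replace rcu_dereference() with rcu_access_pointer() where the pointer
   is only compared against NULL.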
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
---
drivers/net/ethernet/broadcom/cnic.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 23f23c9..f05fab6 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -382,10 +382,8 @@ static int cnic_iscsi_nl_msg_recv(struct cnic_dev *dev, u32 msg_type,
if (l5_cid >= MAX_CM_SK_TBL_SZ)
break;
- rcu_read_lock();
if (!rcu_access_pointer(cp->ulp_ops[CNIC_ULP_L4])) {
rc = -ENODEV;
- rcu_read_unlock();
break;
}
csk = &cp->csk_tbl[l5_cid];
@@ -414,7 +412,6 @@ static int cnic_iscsi_nl_msg_recv(struct cnic_dev *dev, u32 msg_type,
}
}
csk_put(csk);
- rcu_read_unlock();
rc = 0;
}
}
@@ -615,7 +612,7 @@ static int cnic_unregister_device(struct cnic_dev *dev, int ulp_type)
cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL);
mutex_lock(&cnic_lock);
- if (rcu_dereference(cp->ulp_ops[ulp_type])) {
+ if (rcu_access_pointer(cp->ulp_ops[ulp_type])) {
RCU_INIT_POINTER(cp->ulp_ops[ulp_type], NULL);
cnic_put(dev);
} else {
--
1.5.6
^ permalink raw reply related