* Re: [PATCH bpf-next 3/3] bpf: Add mtu checking to FIB forwarding helper
From: Daniel Borkmann @ 2018-05-18 14:01 UTC (permalink / raw)
To: David Ahern, netdev, borkmann, ast; +Cc: davem
In-Reply-To: <a61bc84d-a413-47fd-77c8-11f793e35b84@gmail.com>
On 05/18/2018 02:34 AM, David Ahern wrote:
> On 5/17/18 4:22 PM, Daniel Borkmann wrote:
>> On 05/17/2018 06:09 PM, David Ahern wrote:
>>> Add a check that the egress MTU can handle the packet to be forwarded. If
>>> the MTU is less than the packet length, return 0, meaning the
>>> packet is expected to continue up the stack for help, e.g.,
>>> fragmenting the packet or sending an ICMP.
>>>
>>> Signed-off-by: David Ahern <dsahern@gmail.com>
>>> ---
>>> net/core/filter.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 6d0d1560bd70..c47c47a75d4b 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -4098,6 +4098,7 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>>> struct fib_nh *nh;
>>> struct flowi4 fl4;
>>> int err;
>>> + u32 mtu;
>>>
>>> dev = dev_get_by_index_rcu(net, params->ifindex);
>>> if (unlikely(!dev))
>>> @@ -4149,6 +4150,10 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>>> if (res.fi->fib_nhs > 1)
>>> fib_select_path(net, &res, &fl4, NULL);
>>>
>>> + mtu = ip_mtu_from_fib_result(&res, params->ipv4_dst);
>>> + if (params->tot_len > mtu)
>>> + return 0;
>>> +
>>> nh = &res.fi->fib_nh[res.nh_sel];
>>>
>>> /* do not handle lwt encaps right now */
>>> @@ -4188,6 +4193,7 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>>> struct flowi6 fl6;
>>> int strict = 0;
>>> int oif;
>>> + u32 mtu;
>>>
>>> /* link local addresses are never forwarded */
>>> if (rt6_need_strict(dst) || rt6_need_strict(src))
>>> @@ -4250,6 +4256,10 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>>> fl6.flowi6_oif, NULL,
>>> strict);
>>>
>>> + mtu = ip6_mtu_from_fib6(f6i, dst, src);
>>> + if (params->tot_len > mtu)
>>> + return 0;
>>> +
>>> if (f6i->fib6_nh.nh_lwtstate)
>>> return 0;
>>
>> Could you elaborate on how this interacts with the tc BPF use case, where you
>> have e.g. GSO packets and tot_len from aggregated packets would definitely be
>> larger than the MTU (e.g. see is_skb_forwardable() as one example of such
>> checks)? Should this be an opt-in via a new flag for the helper?
>
> It should not be opt-in for XDP.
Yes, correct, for XDP it should not.
> I could add a flag to the internal call -- bpf_skb_fib_lookup sets the
> flag to skip the MTU check in bpf_ipv4_fib_lookup and bpf_ipv6_fib_lookup.
>
> For the skb case do you want bpf_skb_fib_lookup to call is_skb_forwardable
> or leave that to the BPF program?
I think it probably makes sense to add an internal (unexposed) flag or bool
where we propagate skb_is_gso(skb) from bpf_skb_fib_lookup() call-site and
have similar logic where we first check this bool and if false do the MTU
check (so it still can get enforced for control packets). Thus probably nothing
of that implementation detail needs to be exposed to the program author.
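A minimal sketch of that logic, written as standalone C rather than kernel code. The function name fib_lookup_exceeds_mtu() and its parameters are hypothetical; in the helper itself the bool would be fed from skb_is_gso(skb) at the bpf_skb_fib_lookup() call-site, and a true result would correspond to returning 0 so the packet continues up the stack.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the internal decision discussed above.
 * Returns true when the packet should be handed back to the stack
 * because it does not fit the egress MTU.
 */
static bool fib_lookup_exceeds_mtu(bool is_gso, uint32_t tot_len, uint32_t mtu)
{
	/*
	 * A GSO packet is segmented later, so its aggregate length says
	 * nothing about the on-wire size; skip the MTU check for it.
	 */
	if (is_gso)
		return false;

	/* Non-GSO packets (e.g. control packets) are still checked. */
	return tot_len > mtu;
}
```

With this shape the XDP path would always pass false for is_gso, so the check stays unconditional there.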
^ permalink raw reply
* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
From: Jason Wang @ 2018-05-18 14:00 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet
In-Reply-To: <e1ecf6b0-c751-52ad-a52e-0e29c3328196@redhat.com>
On 2018-05-18 21:26, Jason Wang wrote:
>
>
> On 2018-05-18 21:13, Michael S. Tsirkin wrote:
>> On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
>>> We return -EIO on device down but cannot raise EPOLLOUT after it is
>>> up again. This may confuse users like vhost, which expects tuntap to raise
>>> EPOLLOUT to re-enable its TX routine after tuntap was down. This can
>>> be easily reproduced by transmitting packets from a VM while bringing
>>> the tap device down and up. Fix this by setting SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>
>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> Cc: Eric Dumazet <edumazet@google.com>
>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>> drivers/net/tun.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>> index d45ac37..1b29761 100644
>>> --- a/drivers/net/tun.c
>>> +++ b/drivers/net/tun.c
>>> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct tun_struct
>>> *tun, struct tun_file *tfile,
>>> int skb_xdp = 1;
>>> bool frags = tun_napi_frags_enabled(tun);
>>> - if (!(tun->dev->flags & IFF_UP))
>>> + if (!(tun->dev->flags & IFF_UP)) {
>> Isn't this racy? What if flag is cleared at this point?
>
> I think you mean "set at this point"? Then yes, so we probably need to
> set the bit during tun_net_close().
>
> Thanks
Looks like there is no need; vhost will poll the socket after it sees EIO. So we are OK here?
Thanks
^ permalink raw reply
* [PATCH v3 net-next 12/12] net: stmmac: Remove if condition by taking advantage of hwif return code
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
We can remove the if condition and instead check whether the return code
differs from -EINVAL, which means the callback is present.
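A minimal standalone sketch of the pattern, not the driver code: a dispatcher returns -EINVAL when the callback is missing, and the caller only merges real status bits. The names mtl_ops, call_mtl_irq_status(), accumulate_status() and demo_mtl_status() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; the callback may be NULL on some cores. */
struct mtl_ops {
	int (*host_mtl_irq_status)(unsigned int queue);
};

/* Dispatcher: report -EINVAL when the callback is absent. */
static int call_mtl_irq_status(const struct mtl_ops *ops, unsigned int queue)
{
	if (!ops || !ops->host_mtl_irq_status)
		return -EINVAL;
	return ops->host_mtl_irq_status(queue);
}

/* Demo callback: pretend queue N raises status bit N. */
static int demo_mtl_status(unsigned int queue)
{
	return 1 << queue;
}

/* Merge per-queue status, ignoring the "no callback" return code. */
static int accumulate_status(const struct mtl_ops *ops, int status,
			     unsigned int queues)
{
	unsigned int q;

	for (q = 0; q < queues; q++) {
		int mtl_status = call_mtl_irq_status(ops, q);

		/* OR-ing -EINVAL in would set many unrelated bits. */
		if (mtl_status != -EINVAL)
			status |= mtl_status;
	}
	return status;
}
```

The explicit comparison matters because a negative errno is a value with most bits set, so merging it unchecked would make later bit tests on status fire spuriously.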
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 23 ++++++++++-----------
1 files changed, 11 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index f2687ec..c32de53 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3644,6 +3644,7 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
/* To handle GMAC own interrupts */
if ((priv->plat->has_gmac) || (priv->plat->has_gmac4)) {
int status = stmmac_host_irq_status(priv, priv->hw, &priv->xstats);
+ int mtl_status;
if (unlikely(status)) {
/* For LPI we need to save the tx status */
@@ -3653,20 +3654,18 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
priv->tx_path_in_lpi_mode = false;
}
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- for (queue = 0; queue < queues_count; queue++) {
- struct stmmac_rx_queue *rx_q =
- &priv->rx_queue[queue];
+ for (queue = 0; queue < queues_count; queue++) {
+ struct stmmac_rx_queue *rx_q = &priv->rx_queue[queue];
- status |= stmmac_host_mtl_irq_status(priv,
- priv->hw, queue);
+ mtl_status = stmmac_host_mtl_irq_status(priv, priv->hw,
+ queue);
+ if (mtl_status != -EINVAL)
+ status |= mtl_status;
- if (status & CORE_IRQ_MTL_RX_OVERFLOW)
- stmmac_set_rx_tail_ptr(priv,
- priv->ioaddr,
- rx_q->rx_tail_addr,
- queue);
- }
+ if (status & CORE_IRQ_MTL_RX_OVERFLOW)
+ stmmac_set_rx_tail_ptr(priv, priv->ioaddr,
+ rx_q->rx_tail_addr,
+ queue);
}
/* PCS link status */
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 11/12] net: stmmac: Let descriptor code get skbuff address
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Stop using if conditions that depend on the GMAC version to get the
descriptor skbuff address and instead use a helper implemented in the
descriptor files.
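A minimal standalone sketch of the idea, assuming a simplified descriptor and ops table. The names desc_ops, gmac4_get_addr(), legacy_get_addr() and desc_buffer_addr() are hypothetical, and the little-endian conversion done in the driver is left out for brevity.

```c
#include <stdint.h>

/* Simplified four-word DMA descriptor. */
struct dma_desc {
	uint32_t des0, des1, des2, des3;
};

/* Hypothetical per-descriptor-format ops table. */
struct desc_ops {
	void (*get_addr)(const struct dma_desc *p, unsigned int *addr);
};

/* GMAC4-style layout: buffer address lives in des0. */
static void gmac4_get_addr(const struct dma_desc *p, unsigned int *addr)
{
	*addr = p->des0;
}

/* Older layouts: buffer address lives in des2. */
static void legacy_get_addr(const struct dma_desc *p, unsigned int *addr)
{
	*addr = p->des2;
}

static const struct desc_ops gmac4_desc_ops = { .get_addr = gmac4_get_addr };
static const struct desc_ops legacy_desc_ops = { .get_addr = legacy_get_addr };

/* Caller no longer needs to know which core version it is running on. */
static unsigned int desc_buffer_addr(const struct desc_ops *ops,
				     const struct dma_desc *p)
{
	unsigned int addr = 0;

	ops->get_addr(p, &addr);
	return addr;
}
```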
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/hwif.h | 4 ++++
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +-----
5 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index 63f869c..20299f6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -424,6 +424,11 @@ static void dwmac4_set_mss_ctxt(struct dma_desc *p, unsigned int mss)
p->des3 = cpu_to_le32(TDES3_CONTEXT_TYPE | TDES3_CTXT_TCMSSV);
}
+static void dwmac4_get_addr(struct dma_desc *p, unsigned int *addr)
+{
+ *addr = le32_to_cpu(p->des0);
+}
+
static void dwmac4_set_addr(struct dma_desc *p, dma_addr_t addr)
{
p->des0 = cpu_to_le32(addr);
@@ -459,6 +464,7 @@ static void dwmac4_clear(struct dma_desc *p)
.init_tx_desc = dwmac4_rd_init_tx_desc,
.display_ring = dwmac4_display_ring,
.set_mss = dwmac4_set_mss_ctxt,
+ .get_addr = dwmac4_get_addr,
.set_addr = dwmac4_set_addr,
.clear = dwmac4_clear,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 743a60f..77914c8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -437,6 +437,11 @@ static void enh_desc_display_ring(void *head, unsigned int size, bool rx)
pr_info("\n");
}
+static void enh_desc_get_addr(struct dma_desc *p, unsigned int *addr)
+{
+ *addr = le32_to_cpu(p->des2);
+}
+
static void enh_desc_set_addr(struct dma_desc *p, dma_addr_t addr)
{
p->des2 = cpu_to_le32(addr);
@@ -467,6 +472,7 @@ static void enh_desc_clear(struct dma_desc *p)
.get_timestamp = enh_desc_get_timestamp,
.get_rx_timestamp_status = enh_desc_get_rx_timestamp_status,
.display_ring = enh_desc_display_ring,
+ .get_addr = enh_desc_get_addr,
.set_addr = enh_desc_set_addr,
.clear = enh_desc_clear,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 06b5e5b..f499a7f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -79,6 +79,8 @@ struct stmmac_desc_ops {
void (*display_ring)(void *head, unsigned int size, bool rx);
/* set MSS via context descriptor */
void (*set_mss)(struct dma_desc *p, unsigned int mss);
+ /* get descriptor skbuff address */
+ void (*get_addr)(struct dma_desc *p, unsigned int *addr);
/* set descriptor skbuff address */
void (*set_addr)(struct dma_desc *p, dma_addr_t addr);
/* clear descriptor */
@@ -127,6 +129,8 @@ struct stmmac_desc_ops {
stmmac_do_void_callback(__priv, desc, display_ring, __args)
#define stmmac_set_mss(__priv, __args...) \
stmmac_do_void_callback(__priv, desc, set_mss, __args)
+#define stmmac_get_desc_addr(__priv, __args...) \
+ stmmac_do_void_callback(__priv, desc, get_addr, __args)
#define stmmac_set_desc_addr(__priv, __args...) \
stmmac_do_void_callback(__priv, desc, set_addr, __args)
#define stmmac_clear_desc(__priv, __args...) \
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 2facdb5..de65bb2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -297,6 +297,11 @@ static void ndesc_display_ring(void *head, unsigned int size, bool rx)
pr_info("\n");
}
+static void ndesc_get_addr(struct dma_desc *p, unsigned int *addr)
+{
+ *addr = le32_to_cpu(p->des2);
+}
+
static void ndesc_set_addr(struct dma_desc *p, dma_addr_t addr)
{
p->des2 = cpu_to_le32(addr);
@@ -326,6 +331,7 @@ static void ndesc_clear(struct dma_desc *p)
.get_timestamp = ndesc_get_timestamp,
.get_rx_timestamp_status = ndesc_get_rx_timestamp_status,
.display_ring = ndesc_display_ring,
+ .get_addr = ndesc_get_addr,
.set_addr = ndesc_set_addr,
.clear = ndesc_clear,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 35ccf3f..f2687ec 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3350,11 +3350,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
int frame_len;
unsigned int des;
- if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00))
- des = le32_to_cpu(p->des0);
- else
- des = le32_to_cpu(p->des2);
-
+ stmmac_get_desc_addr(priv, p, &des);
frame_len = stmmac_get_rx_frame_len(priv, p, coe);
/* If frame length is greater than skb buffer size
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 10/12] net: stmmac: Uniformize set_rx_owner()
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Currently an if condition is used to select the correct callback to set
rx_owner in the descriptor. Let's keep this simple and always use the same
callback.
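A minimal standalone sketch of the unified signature, assuming a simplified descriptor. The DEMO_* bit values are placeholders chosen for illustration and are not taken from the driver headers; the function names are hypothetical as well.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define DEMO_OWN		(1u << 31)
#define DEMO_INT_ON_COMPLETION	(1u << 30)
#define DEMO_BUF1_VALID		(1u << 24)

struct dma_desc {
	uint32_t des0, des1, des2, des3;
};

/* GMAC4-style: rewrite des3 and optionally request a completion IRQ. */
static void demo_gmac4_set_rx_owner(struct dma_desc *p, int disable_rx_ic)
{
	p->des3 = DEMO_OWN | DEMO_BUF1_VALID;

	if (!disable_rx_ic)
		p->des3 |= DEMO_INT_ON_COMPLETION;
}

/* Older style: only hand ownership back; the extra argument is unused. */
static void demo_legacy_set_rx_owner(struct dma_desc *p, int disable_rx_ic)
{
	(void)disable_rx_ic;
	p->des0 |= DEMO_OWN;
}
```

Because both variants now share one prototype, the refill path can call the same callback unconditionally and pass the interrupt-coalescing setting along.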
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 12 ++++++------
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +-
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 5 +----
5 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index 119a2f9..63f869c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -189,9 +189,12 @@ static void dwmac4_set_tx_owner(struct dma_desc *p)
p->des3 |= cpu_to_le32(TDES3_OWN);
}
-static void dwmac4_set_rx_owner(struct dma_desc *p)
+static void dwmac4_set_rx_owner(struct dma_desc *p, int disable_rx_ic)
{
- p->des3 |= cpu_to_le32(RDES3_OWN);
+ p->des3 = cpu_to_le32(RDES3_OWN | RDES3_BUFFER1_VALID_ADDR);
+
+ if (!disable_rx_ic)
+ p->des3 |= cpu_to_le32(RDES3_INT_ON_COMPLETION_EN);
}
static int dwmac4_get_tx_ls(struct dma_desc *p)
@@ -292,10 +295,7 @@ static int dwmac4_wrback_get_rx_timestamp_status(void *desc, void *next_desc,
static void dwmac4_rd_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
int mode, int end)
{
- p->des3 = cpu_to_le32(RDES3_OWN | RDES3_BUFFER1_VALID_ADDR);
-
- if (!disable_rx_ic)
- p->des3 |= cpu_to_le32(RDES3_INT_ON_COMPLETION_EN);
+ dwmac4_set_rx_owner(p, disable_rx_ic);
}
static void dwmac4_rd_init_tx_desc(struct dma_desc *p, int mode, int end)
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 17cd26f..743a60f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -292,7 +292,7 @@ static void enh_desc_set_tx_owner(struct dma_desc *p)
p->des0 |= cpu_to_le32(ETDES0_OWN);
}
-static void enh_desc_set_rx_owner(struct dma_desc *p)
+static void enh_desc_set_rx_owner(struct dma_desc *p, int disable_rx_ic)
{
p->des0 |= cpu_to_le32(RDES0_OWN);
}
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 1c674d6..06b5e5b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -59,7 +59,7 @@ struct stmmac_desc_ops {
/* Get the buffer size from the descriptor */
int (*get_tx_len)(struct dma_desc *p);
/* Handle extra events on specific interrupts hw dependent */
- void (*set_rx_owner)(struct dma_desc *p);
+ void (*set_rx_owner)(struct dma_desc *p, int disable_rx_ic);
/* Get the receive frame size */
int (*get_rx_frame_len)(struct dma_desc *p, int rx_coe_type);
/* Return the reception status looking at the RDES1 */
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index a7b221b..2facdb5 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -168,7 +168,7 @@ static void ndesc_set_tx_owner(struct dma_desc *p)
p->des0 |= cpu_to_le32(TDES0_OWN);
}
-static void ndesc_set_rx_owner(struct dma_desc *p)
+static void ndesc_set_rx_owner(struct dma_desc *p, int disable_rx_ic)
{
p->des0 |= cpu_to_le32(RDES0_OWN);
}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 1e7ded6..35ccf3f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3262,10 +3262,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv, u32 queue)
}
dma_wmb();
- if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00))
- stmmac_init_rx_desc(priv, p, priv->use_riwt, 0, 0);
- else
- stmmac_set_rx_owner(priv, p);
+ stmmac_set_rx_owner(priv, p, priv->use_riwt);
dma_wmb();
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 09/12] net: stmmac: Remove uneeded check for GMAC version in stmmac_xmit
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
We have either .enable_dma_transmission or .set_tx_tail_ptr in the HW
table callbacks, never both, so there is no need to check for the
GMAC version.
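A minimal standalone sketch of why calling both is safe when only one is populated. The types tx_state and tx_kick_ops, the demo callbacks, and kick_tx() are hypothetical; the driver's real dispatch goes through its hwif macros.

```c
#include <stddef.h>

/* Hypothetical state touched by the demo callbacks. */
struct tx_state {
	int poll_demands;	/* transmit poll demand writes */
	int tail_updates;	/* TX tail pointer writes */
};

/* Hypothetical ops table: each core fills in only what it supports. */
struct tx_kick_ops {
	void (*enable_dma_transmission)(struct tx_state *st);
	void (*set_tx_tail_ptr)(struct tx_state *st);
};

static void demo_poll_demand(struct tx_state *st) { st->poll_demands++; }
static void demo_tail_update(struct tx_state *st) { st->tail_updates++; }

static const struct tx_kick_ops older_core_ops = {
	.enable_dma_transmission = demo_poll_demand,
};
static const struct tx_kick_ops newer_core_ops = {
	.set_tx_tail_ptr = demo_tail_update,
};

/* Call both unconditionally; an absent callback is simply skipped. */
static void kick_tx(const struct tx_kick_ops *ops, struct tx_state *st)
{
	if (ops->enable_dma_transmission)
		ops->enable_dma_transmission(st);
	if (ops->set_tx_tail_ptr)
		ops->set_tx_tail_ptr(st);
}
```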
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 1 -
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++-----
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 8474bf9..c63c1fe 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -184,7 +184,6 @@
#define DMA_CHAN0_DBG_STAT_RPS_SHIFT 8
int dwmac4_dma_reset(void __iomem *ioaddr);
-void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan);
void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan);
void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 34c1fcc..1e7ded6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3166,11 +3166,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
netdev_tx_sent_queue(netdev_get_tx_queue(dev, queue), skb->len);
- if (priv->synopsys_id < DWMAC_CORE_4_00)
- stmmac_enable_dma_transmission(priv, priv->ioaddr);
- else
- stmmac_set_tx_tail_ptr(priv, priv->ioaddr, tx_q->tx_tail_addr,
- queue);
+ stmmac_enable_dma_transmission(priv, priv->ioaddr);
+ stmmac_set_tx_tail_ptr(priv, priv->ioaddr, tx_q->tx_tail_addr, queue);
return NETDEV_TX_OK;
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 08/12] net: stmmac: Uniformize the use of dma_init_* callbacks
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Instead of relying on the GMAC version to choose between the dma_init and
dma_init_{rx/tx}_chan callbacks, let's uniformize this and
always use the dma_init_{rx/tx}_chan callbacks.
While at it, fix the use of the dma_init_chan callback, which shall be
called for as many channels as the max of rx/tx channels.
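A minimal standalone sketch of the resulting loop structure, counting how often each kind of init would run. The names init_counts and plan_dma_init() are hypothetical and no hardware is touched.

```c
/* How many times each hypothetical init step runs. */
struct init_counts {
	unsigned int rx_chan;	/* per-RX-channel init calls */
	unsigned int tx_chan;	/* per-TX-channel init calls */
	unsigned int csr_chan;	/* common per-channel (CSR) init calls */
};

static struct init_counts plan_dma_init(unsigned int rx_channels,
					unsigned int tx_channels)
{
	struct init_counts c = { 0, 0, 0 };
	/* Common channel setup must cover the larger of the two counts. */
	unsigned int csr_channels =
		rx_channels > tx_channels ? rx_channels : tx_channels;
	unsigned int chan;

	for (chan = 0; chan < rx_channels; chan++)
		c.rx_chan++;
	for (chan = 0; chan < tx_channels; chan++)
		c.tx_chan++;
	for (chan = 0; chan < csr_channels; chan++)
		c.csr_chan++;

	return c;
}
```

With 4 RX and 2 TX channels, the common init would run for 4 channels, whereas tying it to the TX loop would have covered only 2.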
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 25 ++++++--
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 25 ++++++--
drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c | 25 ++++++--
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 3 +-
drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 62 ++++++++-----------
6 files changed, 83 insertions(+), 59 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index 11c287a..2e6e2a9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -276,17 +276,28 @@ static int sun8i_dwmac_dma_reset(void __iomem *ioaddr)
* Called from stmmac via stmmac_dma_ops->init
*/
static void sun8i_dwmac_dma_init(void __iomem *ioaddr,
- struct stmmac_dma_cfg *dma_cfg,
- u32 dma_tx, u32 dma_rx, int atds)
+ struct stmmac_dma_cfg *dma_cfg, int atds)
{
- /* Write TX and RX descriptors address */
- writel(dma_rx, ioaddr + EMAC_RX_DESC_LIST);
- writel(dma_tx, ioaddr + EMAC_TX_DESC_LIST);
-
writel(EMAC_RX_INT | EMAC_TX_INT, ioaddr + EMAC_INT_EN);
writel(0x1FFFFFF, ioaddr + EMAC_INT_STA);
}
+static void sun8i_dwmac_dma_init_rx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_rx_phy, u32 chan)
+{
+ /* Write RX descriptors address */
+ writel(dma_rx_phy, ioaddr + EMAC_RX_DESC_LIST);
+}
+
+static void sun8i_dwmac_dma_init_tx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_tx_phy, u32 chan)
+{
+ /* Write TX descriptors address */
+ writel(dma_tx_phy, ioaddr + EMAC_TX_DESC_LIST);
+}
+
/* sun8i_dwmac_dump_regs() - Dump EMAC address space
* Called from stmmac_dma_ops->dump_regs
* Used for ethtool
@@ -492,6 +503,8 @@ static void sun8i_dwmac_dma_operation_mode_tx(void __iomem *ioaddr, int mode,
static const struct stmmac_dma_ops sun8i_dwmac_dma_ops = {
.reset = sun8i_dwmac_dma_reset,
.init = sun8i_dwmac_dma_init,
+ .init_rx_chan = sun8i_dwmac_dma_init_rx,
+ .init_tx_chan = sun8i_dwmac_dma_init_tx,
.dump_regs = sun8i_dwmac_dump_regs,
.dma_rx_mode = sun8i_dwmac_dma_operation_mode_rx,
.dma_tx_mode = sun8i_dwmac_dma_operation_mode_tx,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index d7447b0..aacc4aa 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -81,8 +81,7 @@ static void dwmac1000_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi)
}
static void dwmac1000_dma_init(void __iomem *ioaddr,
- struct stmmac_dma_cfg *dma_cfg,
- u32 dma_tx, u32 dma_rx, int atds)
+ struct stmmac_dma_cfg *dma_cfg, int atds)
{
u32 value = readl(ioaddr + DMA_BUS_MODE);
int txpbl = dma_cfg->txpbl ?: dma_cfg->pbl;
@@ -119,12 +118,22 @@ static void dwmac1000_dma_init(void __iomem *ioaddr,
/* Mask interrupts by writing to CSR7 */
writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
+}
- /* RX/TX descriptor base address lists must be written into
- * DMA CSR3 and CSR4, respectively
- */
- writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
- writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
+static void dwmac1000_dma_init_rx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_rx_phy, u32 chan)
+{
+ /* RX descriptor base address list must be written into DMA CSR3 */
+ writel(dma_rx_phy, ioaddr + DMA_RCV_BASE_ADDR);
+}
+
+static void dwmac1000_dma_init_tx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_tx_phy, u32 chan)
+{
+ /* TX descriptor base address list must be written into DMA CSR4 */
+ writel(dma_tx_phy, ioaddr + DMA_TX_BASE_ADDR);
}
static u32 dwmac1000_configure_fc(u32 csr6, int rxfifosz)
@@ -264,6 +273,8 @@ static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt,
const struct stmmac_dma_ops dwmac1000_dma_ops = {
.reset = dwmac_dma_reset,
.init = dwmac1000_dma_init,
+ .init_rx_chan = dwmac1000_dma_init_rx,
+ .init_tx_chan = dwmac1000_dma_init_tx,
.axi = dwmac1000_dma_axi,
.dump_regs = dwmac1000_dump_dma_regs,
.dma_rx_mode = dwmac1000_dma_operation_mode_rx,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index 80339d3..21dee25 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -29,8 +29,7 @@
#include "dwmac_dma.h"
static void dwmac100_dma_init(void __iomem *ioaddr,
- struct stmmac_dma_cfg *dma_cfg,
- u32 dma_tx, u32 dma_rx, int atds)
+ struct stmmac_dma_cfg *dma_cfg, int atds)
{
/* Enable Application Access by writing to DMA CSR0 */
writel(DMA_BUS_MODE_DEFAULT | (dma_cfg->pbl << DMA_BUS_MODE_PBL_SHIFT),
@@ -38,12 +37,22 @@ static void dwmac100_dma_init(void __iomem *ioaddr,
/* Mask interrupts by writing to CSR7 */
writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
+}
- /* RX/TX descriptor base addr lists must be written into
- * DMA CSR3 and CSR4, respectively
- */
- writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
- writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
+static void dwmac100_dma_init_rx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_rx_phy, u32 chan)
+{
+ /* RX descriptor base addr lists must be written into DMA CSR3 */
+ writel(dma_rx_phy, ioaddr + DMA_RCV_BASE_ADDR);
+}
+
+static void dwmac100_dma_init_tx(void __iomem *ioaddr,
+ struct stmmac_dma_cfg *dma_cfg,
+ u32 dma_tx_phy, u32 chan)
+{
+ /* TX descriptor base addr lists must be written into DMA CSR4 */
+ writel(dma_tx_phy, ioaddr + DMA_TX_BASE_ADDR);
}
/* Store and Forward capability is not used at all.
@@ -112,6 +121,8 @@ static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x,
const struct stmmac_dma_ops dwmac100_dma_ops = {
.reset = dwmac_dma_reset,
.init = dwmac100_dma_init,
+ .init_rx_chan = dwmac100_dma_init_rx,
+ .init_tx_chan = dwmac100_dma_init_tx,
.dump_regs = dwmac100_dump_dma_regs,
.dma_tx_mode = dwmac100_dma_operation_mode_tx,
.dma_diagnostic_fr = dwmac100_dma_diagnostic_fr,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 9aab5b3..bf8e5a1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -120,8 +120,7 @@ static void dwmac4_dma_init_channel(void __iomem *ioaddr,
}
static void dwmac4_dma_init(void __iomem *ioaddr,
- struct stmmac_dma_cfg *dma_cfg,
- u32 dma_tx, u32 dma_rx, int atds)
+ struct stmmac_dma_cfg *dma_cfg, int atds)
{
u32 value = readl(ioaddr + DMA_SYS_BUS_MODE);
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 06fb20b..1c674d6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -140,7 +140,7 @@ struct stmmac_dma_ops {
/* DMA core initialization */
int (*reset)(void __iomem *ioaddr);
void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
- u32 dma_tx, u32 dma_rx, int atds);
+ int atds);
void (*init_chan)(void __iomem *ioaddr,
struct stmmac_dma_cfg *dma_cfg, u32 chan);
void (*init_rx_chan)(void __iomem *ioaddr,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a4d6ea7..34c1fcc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2138,10 +2138,9 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
{
u32 rx_channels_count = priv->plat->rx_queues_to_use;
u32 tx_channels_count = priv->plat->tx_queues_to_use;
+ u32 dma_csr_ch = max(rx_channels_count, tx_channels_count);
struct stmmac_rx_queue *rx_q;
struct stmmac_tx_queue *tx_q;
- u32 dummy_dma_rx_phy = 0;
- u32 dummy_dma_tx_phy = 0;
u32 chan = 0;
int atds = 0;
int ret = 0;
@@ -2160,48 +2159,39 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
return ret;
}
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- /* DMA Configuration */
- stmmac_dma_init(priv, priv->ioaddr, priv->plat->dma_cfg,
- dummy_dma_tx_phy, dummy_dma_rx_phy, atds);
-
- /* DMA RX Channel Configuration */
- for (chan = 0; chan < rx_channels_count; chan++) {
- rx_q = &priv->rx_queue[chan];
-
- stmmac_init_rx_chan(priv, priv->ioaddr,
- priv->plat->dma_cfg, rx_q->dma_rx_phy,
- chan);
-
- rx_q->rx_tail_addr = rx_q->dma_rx_phy +
- (DMA_RX_SIZE * sizeof(struct dma_desc));
- stmmac_set_rx_tail_ptr(priv, priv->ioaddr,
- rx_q->rx_tail_addr, chan);
- }
-
- /* DMA TX Channel Configuration */
- for (chan = 0; chan < tx_channels_count; chan++) {
- tx_q = &priv->tx_queue[chan];
+ /* DMA RX Channel Configuration */
+ for (chan = 0; chan < rx_channels_count; chan++) {
+ rx_q = &priv->rx_queue[chan];
- stmmac_init_chan(priv, priv->ioaddr,
- priv->plat->dma_cfg, chan);
+ stmmac_init_rx_chan(priv, priv->ioaddr, priv->plat->dma_cfg,
+ rx_q->dma_rx_phy, chan);
- stmmac_init_tx_chan(priv, priv->ioaddr,
- priv->plat->dma_cfg, tx_q->dma_tx_phy,
- chan);
+ rx_q->rx_tail_addr = rx_q->dma_rx_phy +
+ (DMA_RX_SIZE * sizeof(struct dma_desc));
+ stmmac_set_rx_tail_ptr(priv, priv->ioaddr,
+ rx_q->rx_tail_addr, chan);
+ }
- tx_q->tx_tail_addr = tx_q->dma_tx_phy +
- (DMA_TX_SIZE * sizeof(struct dma_desc));
- stmmac_set_tx_tail_ptr(priv, priv->ioaddr,
- tx_q->tx_tail_addr, chan);
- }
- } else {
- rx_q = &priv->rx_queue[chan];
+ /* DMA TX Channel Configuration */
+ for (chan = 0; chan < tx_channels_count; chan++) {
tx_q = &priv->tx_queue[chan];
- stmmac_dma_init(priv, priv->ioaddr, priv->plat->dma_cfg,
- tx_q->dma_tx_phy, rx_q->dma_rx_phy, atds);
+
+ stmmac_init_tx_chan(priv, priv->ioaddr, priv->plat->dma_cfg,
+ tx_q->dma_tx_phy, chan);
+
+ tx_q->tx_tail_addr = tx_q->dma_tx_phy +
+ (DMA_TX_SIZE * sizeof(struct dma_desc));
+ stmmac_set_tx_tail_ptr(priv, priv->ioaddr,
+ tx_q->tx_tail_addr, chan);
}
+ /* DMA CSR Channel configuration */
+ for (chan = 0; chan < dma_csr_ch; chan++)
+ stmmac_init_chan(priv, priv->ioaddr, priv->plat->dma_cfg, chan);
+
+ /* DMA Configuration */
+ stmmac_dma_init(priv, priv->ioaddr, priv->plat->dma_cfg, atds);
+
if (priv->plat->axi)
stmmac_axi(priv, priv->ioaddr, priv->plat->axi);
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 07/12] net: stmmac: Move PTP and MMC base address calculation to hwif.c
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
The base addresses of the PTP and MMC modules can depend on the GMAC version.
As this is HW specific, let's move this base address calculation to hwif.c.
Also, add an entry in the HW table so that we can specify the module offset.
This can later be extended to more modules, if deemed necessary.
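A minimal standalone sketch of a table-driven offset lookup. The struct names, the table contents and resolve_bases() are hypothetical, and the offset values are illustrative rather than copied from the driver headers.

```c
#include <stdint.h>

/* Per-core module offsets, mirroring the idea of a regs entry. */
struct regs_off {
	uint32_t ptp_off;
	uint32_t mmc_off;
};

/* Hypothetical HW table entry. */
struct hw_entry {
	const char *name;
	struct regs_off regs;
};

/* Illustrative offsets only. */
static const struct hw_entry demo_hw_table[] = {
	{ "gmac3x", { 0x700, 0x100 } },
	{ "gmac4",  { 0xb00, 0x700 } },
};

struct module_bases {
	uintptr_t ptp;
	uintptr_t mmc;
};

/* Derive module base addresses from the mapped base and the table entry. */
static struct module_bases resolve_bases(uintptr_t ioaddr,
					 const struct hw_entry *entry)
{
	struct module_bases b = {
		ioaddr + entry->regs.ptp_off,
		ioaddr + entry->regs.mmc_off,
	};

	return b;
}
```

Adding another module later would then be a matter of one more field in regs_off and one more line in the resolver.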
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/hwif.c | 34 +++++++++++++++++++++
drivers/net/ethernet/stmicro/stmmac/hwif.h | 5 +++
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 8 -----
3 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
index 9acc8d2..23a1264 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
@@ -6,6 +6,7 @@
#include "common.h"
#include "stmmac.h"
+#include "stmmac_ptp.h"
static u32 stmmac_get_id(struct stmmac_priv *priv, u32 id_reg)
{
@@ -72,6 +73,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
bool gmac;
bool gmac4;
u32 min_id;
+ const struct stmmac_regs_off regs;
const void *desc;
const void *dma;
const void *mac;
@@ -86,6 +88,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = false,
.gmac4 = false,
.min_id = 0,
+ .regs = {
+ .ptp_off = PTP_GMAC3_X_OFFSET,
+ .mmc_off = MMC_GMAC3_X_OFFSET,
+ },
.desc = NULL,
.dma = &dwmac100_dma_ops,
.mac = &dwmac100_ops,
@@ -98,6 +104,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = true,
.gmac4 = false,
.min_id = 0,
+ .regs = {
+ .ptp_off = PTP_GMAC3_X_OFFSET,
+ .mmc_off = MMC_GMAC3_X_OFFSET,
+ },
.desc = NULL,
.dma = &dwmac1000_dma_ops,
.mac = &dwmac1000_ops,
@@ -110,6 +120,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = false,
.gmac4 = true,
.min_id = 0,
+ .regs = {
+ .ptp_off = PTP_GMAC4_OFFSET,
+ .mmc_off = MMC_GMAC4_OFFSET,
+ },
.desc = &dwmac4_desc_ops,
.dma = &dwmac4_dma_ops,
.mac = &dwmac4_ops,
@@ -122,6 +136,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = false,
.gmac4 = true,
.min_id = DWMAC_CORE_4_00,
+ .regs = {
+ .ptp_off = PTP_GMAC4_OFFSET,
+ .mmc_off = MMC_GMAC4_OFFSET,
+ },
.desc = &dwmac4_desc_ops,
.dma = &dwmac4_dma_ops,
.mac = &dwmac410_ops,
@@ -134,6 +152,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = false,
.gmac4 = true,
.min_id = DWMAC_CORE_4_10,
+ .regs = {
+ .ptp_off = PTP_GMAC4_OFFSET,
+ .mmc_off = MMC_GMAC4_OFFSET,
+ },
.desc = &dwmac4_desc_ops,
.dma = &dwmac410_dma_ops,
.mac = &dwmac410_ops,
@@ -146,6 +168,10 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
.gmac = false,
.gmac4 = true,
.min_id = DWMAC_CORE_5_10,
+ .regs = {
+ .ptp_off = PTP_GMAC4_OFFSET,
+ .mmc_off = MMC_GMAC4_OFFSET,
+ },
.desc = &dwmac4_desc_ops,
.dma = &dwmac410_dma_ops,
.mac = &dwmac510_ops,
@@ -175,6 +201,12 @@ int stmmac_hwif_init(struct stmmac_priv *priv)
/* Save ID for later use */
priv->synopsys_id = id;
+ /* Lets assume some safe values first */
+ priv->ptpaddr = priv->ioaddr +
+ (needs_gmac4 ? PTP_GMAC4_OFFSET : PTP_GMAC3_X_OFFSET);
+ priv->mmcaddr = priv->ioaddr +
+ (needs_gmac4 ? MMC_GMAC4_OFFSET : MMC_GMAC3_X_OFFSET);
+
/* Check for HW specific setup first */
if (priv->plat->setup) {
priv->hw = priv->plat->setup(priv);
@@ -206,6 +238,8 @@ int stmmac_hwif_init(struct stmmac_priv *priv)
mac->tc = entry->tc;
priv->hw = mac;
+ priv->ptpaddr = priv->ioaddr + entry->regs.ptp_off;
+ priv->mmcaddr = priv->ioaddr + entry->regs.mmc_off;
/* Entry found */
ret = entry->setup(priv);
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 3ff4afe..06fb20b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -442,6 +442,11 @@ struct stmmac_tc_ops {
#define stmmac_tc_setup_cls_u32(__priv, __args...) \
stmmac_do_callback(__priv, tc, setup_cls_u32, __args)
+struct stmmac_regs_off {
+ u32 ptp_off;
+ u32 mmc_off;
+};
+
extern const struct stmmac_ops dwmac100_ops;
extern const struct stmmac_dma_ops dwmac100_dma_ops;
extern const struct stmmac_ops dwmac1000_ops;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index ce6f839..a4d6ea7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2085,14 +2085,6 @@ static void stmmac_mmc_setup(struct stmmac_priv *priv)
unsigned int mode = MMC_CNTRL_RESET_ON_READ | MMC_CNTRL_COUNTER_RESET |
MMC_CNTRL_PRESET | MMC_CNTRL_FULL_HALF_PRESET;
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- priv->ptpaddr = priv->ioaddr + PTP_GMAC4_OFFSET;
- priv->mmcaddr = priv->ioaddr + MMC_GMAC4_OFFSET;
- } else {
- priv->ptpaddr = priv->ioaddr + PTP_GMAC3_X_OFFSET;
- priv->mmcaddr = priv->ioaddr + MMC_GMAC3_X_OFFSET;
- }
-
dwmac_mmc_intr_all_mask(priv->mmcaddr);
if (priv->dma_cap.rmon) {
--
1.7.1
* [PATCH v3 net-next 06/12] net: stmmac: Remove unneeded checks for GMAC version
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
With the introduction of the callback checks in hwif.h we only call a
callback if the HW supports it, so there is no longer any need to check
the GMAC version.
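The pattern relied on here can be sketched in standalone C as follows. It
is a simplified illustration, not the macro from hwif.h: the sketch_* names
and the single-argument callback are assumptions made for the example. The
point is that a missing callback yields -EINVAL, which callers treat as
"not supported" rather than as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_mac_ops {
	/* NULL when the core does not implement safety features. */
	int (*safety_feat_irq_status)(int asp);
};

struct sketch_priv {
	const struct sketch_mac_ops *mac;
};

/* Guarded call: -EINVAL means the HW has no such callback. */
int sketch_safety_feat_irq_status(const struct sketch_priv *priv, int asp)
{
	assert(priv != NULL);
	if (!priv->mac || !priv->mac->safety_feat_irq_status)
		return -EINVAL;
	return priv->mac->safety_feat_irq_status(asp);
}

/* Caller without any version check: only a real status counts as an event. */
int sketch_safety_feat_interrupt(const struct sketch_priv *priv, int asp)
{
	int ret = sketch_safety_feat_irq_status(priv, asp);

	return ret && (ret != -EINVAL);
}

/* Hypothetical callback for a core that has safety features. */
static int sketch_core510_status(int asp)
{
	return asp ? 1 : 0;
}

const struct sketch_mac_ops sketch_old_core_ops = { NULL };
const struct sketch_mac_ops sketch_core510_ops = { sketch_core510_status };
```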
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 16 +++++-----------
1 files changed, 5 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index beb7ec1..ce6f839 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1973,11 +1973,8 @@ static void stmmac_set_dma_operation_mode(struct stmmac_priv *priv, u32 txmode,
static bool stmmac_safety_feat_interrupt(struct stmmac_priv *priv)
{
- int ret = false;
+ int ret;
- /* Safety features are only available in cores >= 5.10 */
- if (priv->synopsys_id < DWMAC_CORE_5_10)
- return ret;
ret = stmmac_safety_feat_irq_status(priv, priv->dev,
priv->ioaddr, priv->dma_cap.asp, &priv->sstats);
if (ret && (ret != -EINVAL)) {
@@ -2495,12 +2492,10 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_core_init(priv, priv->hw, dev);
/* Initialize MTL*/
- if (priv->synopsys_id >= DWMAC_CORE_4_00)
- stmmac_mtl_configuration(priv);
+ stmmac_mtl_configuration(priv);
/* Initialize Safety Features */
- if (priv->synopsys_id >= DWMAC_CORE_5_10)
- stmmac_safety_feat_configuration(priv);
+ stmmac_safety_feat_configuration(priv);
ret = stmmac_rx_ipc(priv, priv->hw);
if (!ret) {
@@ -3054,10 +3049,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
if (enh_desc)
is_jumbo = stmmac_is_jumbo_frm(priv, skb->len, enh_desc);
- if (unlikely(is_jumbo) && likely(priv->synopsys_id <
- DWMAC_CORE_4_00)) {
+ if (unlikely(is_jumbo)) {
entry = stmmac_jumbo_frm(priv, tx_q, skb, csum_insertion);
- if (unlikely(entry < 0))
+ if (unlikely(entry < 0) && (entry != -EINVAL))
goto dma_map_err;
}
--
1.7.1
* [PATCH v3 net-next 05/12] net: stmmac: Uniformize the use of dma_{rx/tx}_mode callbacks
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Instead of relying on the GMAC version to choose between the
dma_{rx/tx}_mode callbacks and the single dma_mode callback, let's
uniformize this and always use the dma_{rx/tx}_mode callbacks.
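A rough standalone sketch of the resulting call flow is shown below. The
sketch_* names, the fake register and the bit positions are made up for
illustration and the callback signature is reduced compared to the driver.
Single-channel cores simply ignore the channel argument, and a core with
no RX mode setting leaves that callback unset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SF_MODE 1000      /* hypothetical "store and forward" marker */
#define SKETCH_RSF     (1u << 0) /* hypothetical RX store-and-forward bit */
#define SKETCH_TSF     (1u << 1) /* hypothetical TX store-and-forward bit */

struct sketch_dma_ops {
	void (*dma_rx_mode)(uint32_t *reg, int mode, uint32_t channel);
	void (*dma_tx_mode)(uint32_t *reg, int mode, uint32_t channel);
};

static void sketch_rx_mode(uint32_t *reg, int mode, uint32_t channel)
{
	(void)channel; /* single-channel core: argument is ignored */
	if (mode == SKETCH_SF_MODE)
		*reg |= SKETCH_RSF;
	else
		*reg &= ~SKETCH_RSF;
}

static void sketch_tx_mode(uint32_t *reg, int mode, uint32_t channel)
{
	(void)channel;
	if (mode == SKETCH_SF_MODE)
		*reg |= SKETCH_TSF;
	else
		*reg &= ~SKETCH_TSF;
}

const struct sketch_dma_ops sketch_gmac_ops = { sketch_rx_mode, sketch_tx_mode };
const struct sketch_dma_ops sketch_mac100_ops = { NULL, sketch_tx_mode };

/* One code path for every core: per-channel RX loop, then per-channel TX loop. */
void sketch_dma_operation_mode(const struct sketch_dma_ops *ops, uint32_t *reg,
			       int rxmode, int txmode,
			       uint32_t rx_chans, uint32_t tx_chans)
{
	uint32_t chan;

	assert(ops != NULL && reg != NULL);
	for (chan = 0; chan < rx_chans; chan++)
		if (ops->dma_rx_mode)
			ops->dma_rx_mode(reg, rxmode, chan);
	for (chan = 0; chan < tx_chans; chan++)
		if (ops->dma_tx_mode)
			ops->dma_tx_mode(reg, txmode, chan);
}
```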
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 57 +++++++++-------
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 67 +++++++++++---------
drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c | 10 ++--
drivers/net/ethernet/stmicro/stmmac/hwif.h | 6 --
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 32 +++------
5 files changed, 86 insertions(+), 86 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index 2f7f091..11c287a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -437,13 +437,36 @@ static int sun8i_dwmac_dma_interrupt(void __iomem *ioaddr,
return ret;
}
-static void sun8i_dwmac_dma_operation_mode(void __iomem *ioaddr, int txmode,
- int rxmode, int rxfifosz)
+static void sun8i_dwmac_dma_operation_mode_rx(void __iomem *ioaddr, int mode,
+ u32 channel, int fifosz, u8 qmode)
+{
+ u32 v;
+
+ v = readl(ioaddr + EMAC_RX_CTL1);
+ if (mode == SF_DMA_MODE) {
+ v |= EMAC_RX_MD;
+ } else {
+ v &= ~EMAC_RX_MD;
+ v &= ~EMAC_RX_TH_MASK;
+ if (mode < 32)
+ v |= EMAC_RX_TH_32;
+ else if (mode < 64)
+ v |= EMAC_RX_TH_64;
+ else if (mode < 96)
+ v |= EMAC_RX_TH_96;
+ else if (mode < 128)
+ v |= EMAC_RX_TH_128;
+ }
+ writel(v, ioaddr + EMAC_RX_CTL1);
+}
+
+static void sun8i_dwmac_dma_operation_mode_tx(void __iomem *ioaddr, int mode,
+ u32 channel, int fifosz, u8 qmode)
{
u32 v;
v = readl(ioaddr + EMAC_TX_CTL1);
- if (txmode == SF_DMA_MODE) {
+ if (mode == SF_DMA_MODE) {
v |= EMAC_TX_MD;
/* Undocumented bit (called TX_NEXT_FRM in BSP), the original
* comment is
@@ -454,40 +477,24 @@ static void sun8i_dwmac_dma_operation_mode(void __iomem *ioaddr, int txmode,
} else {
v &= ~EMAC_TX_MD;
v &= ~EMAC_TX_TH_MASK;
- if (txmode < 64)
+ if (mode < 64)
v |= EMAC_TX_TH_64;
- else if (txmode < 128)
+ else if (mode < 128)
v |= EMAC_TX_TH_128;
- else if (txmode < 192)
+ else if (mode < 192)
v |= EMAC_TX_TH_192;
- else if (txmode < 256)
+ else if (mode < 256)
v |= EMAC_TX_TH_256;
}
writel(v, ioaddr + EMAC_TX_CTL1);
-
- v = readl(ioaddr + EMAC_RX_CTL1);
- if (rxmode == SF_DMA_MODE) {
- v |= EMAC_RX_MD;
- } else {
- v &= ~EMAC_RX_MD;
- v &= ~EMAC_RX_TH_MASK;
- if (rxmode < 32)
- v |= EMAC_RX_TH_32;
- else if (rxmode < 64)
- v |= EMAC_RX_TH_64;
- else if (rxmode < 96)
- v |= EMAC_RX_TH_96;
- else if (rxmode < 128)
- v |= EMAC_RX_TH_128;
- }
- writel(v, ioaddr + EMAC_RX_CTL1);
}
static const struct stmmac_dma_ops sun8i_dwmac_dma_ops = {
.reset = sun8i_dwmac_dma_reset,
.init = sun8i_dwmac_dma_init,
.dump_regs = sun8i_dwmac_dump_regs,
- .dma_mode = sun8i_dwmac_dma_operation_mode,
+ .dma_rx_mode = sun8i_dwmac_dma_operation_mode_rx,
+ .dma_tx_mode = sun8i_dwmac_dma_operation_mode_tx,
.enable_dma_transmission = sun8i_dwmac_enable_dma_transmission,
.enable_dma_irq = sun8i_dwmac_enable_dma_irq,
.disable_dma_irq = sun8i_dwmac_disable_dma_irq,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 7ecf549..d7447b0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -148,12 +148,40 @@ static u32 dwmac1000_configure_fc(u32 csr6, int rxfifosz)
return csr6;
}
-static void dwmac1000_dma_operation_mode(void __iomem *ioaddr, int txmode,
- int rxmode, int rxfifosz)
+static void dwmac1000_dma_operation_mode_rx(void __iomem *ioaddr, int mode,
+ u32 channel, int fifosz, u8 qmode)
{
u32 csr6 = readl(ioaddr + DMA_CONTROL);
- if (txmode == SF_DMA_MODE) {
+ if (mode == SF_DMA_MODE) {
+ pr_debug("GMAC: enable RX store and forward mode\n");
+ csr6 |= DMA_CONTROL_RSF;
+ } else {
+ pr_debug("GMAC: disable RX SF mode (threshold %d)\n", mode);
+ csr6 &= ~DMA_CONTROL_RSF;
+ csr6 &= DMA_CONTROL_TC_RX_MASK;
+ if (mode <= 32)
+ csr6 |= DMA_CONTROL_RTC_32;
+ else if (mode <= 64)
+ csr6 |= DMA_CONTROL_RTC_64;
+ else if (mode <= 96)
+ csr6 |= DMA_CONTROL_RTC_96;
+ else
+ csr6 |= DMA_CONTROL_RTC_128;
+ }
+
+ /* Configure flow control based on rx fifo size */
+ csr6 = dwmac1000_configure_fc(csr6, fifosz);
+
+ writel(csr6, ioaddr + DMA_CONTROL);
+}
+
+static void dwmac1000_dma_operation_mode_tx(void __iomem *ioaddr, int mode,
+ u32 channel, int fifosz, u8 qmode)
+{
+ u32 csr6 = readl(ioaddr + DMA_CONTROL);
+
+ if (mode == SF_DMA_MODE) {
pr_debug("GMAC: enable TX store and forward mode\n");
/* Transmit COE type 2 cannot be done in cut-through mode. */
csr6 |= DMA_CONTROL_TSF;
@@ -162,42 +190,22 @@ static void dwmac1000_dma_operation_mode(void __iomem *ioaddr, int txmode,
*/
csr6 |= DMA_CONTROL_OSF;
} else {
- pr_debug("GMAC: disabling TX SF (threshold %d)\n", txmode);
+ pr_debug("GMAC: disabling TX SF (threshold %d)\n", mode);
csr6 &= ~DMA_CONTROL_TSF;
csr6 &= DMA_CONTROL_TC_TX_MASK;
/* Set the transmit threshold */
- if (txmode <= 32)
+ if (mode <= 32)
csr6 |= DMA_CONTROL_TTC_32;
- else if (txmode <= 64)
+ else if (mode <= 64)
csr6 |= DMA_CONTROL_TTC_64;
- else if (txmode <= 128)
+ else if (mode <= 128)
csr6 |= DMA_CONTROL_TTC_128;
- else if (txmode <= 192)
+ else if (mode <= 192)
csr6 |= DMA_CONTROL_TTC_192;
else
csr6 |= DMA_CONTROL_TTC_256;
}
- if (rxmode == SF_DMA_MODE) {
- pr_debug("GMAC: enable RX store and forward mode\n");
- csr6 |= DMA_CONTROL_RSF;
- } else {
- pr_debug("GMAC: disable RX SF mode (threshold %d)\n", rxmode);
- csr6 &= ~DMA_CONTROL_RSF;
- csr6 &= DMA_CONTROL_TC_RX_MASK;
- if (rxmode <= 32)
- csr6 |= DMA_CONTROL_RTC_32;
- else if (rxmode <= 64)
- csr6 |= DMA_CONTROL_RTC_64;
- else if (rxmode <= 96)
- csr6 |= DMA_CONTROL_RTC_96;
- else
- csr6 |= DMA_CONTROL_RTC_128;
- }
-
- /* Configure flow control based on rx fifo size */
- csr6 = dwmac1000_configure_fc(csr6, rxfifosz);
-
writel(csr6, ioaddr + DMA_CONTROL);
}
@@ -258,7 +266,8 @@ static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt,
.init = dwmac1000_dma_init,
.axi = dwmac1000_dma_axi,
.dump_regs = dwmac1000_dump_dma_regs,
- .dma_mode = dwmac1000_dma_operation_mode,
+ .dma_rx_mode = dwmac1000_dma_operation_mode_rx,
+ .dma_tx_mode = dwmac1000_dma_operation_mode_tx,
.enable_dma_transmission = dwmac_enable_dma_transmission,
.enable_dma_irq = dwmac_enable_dma_irq,
.disable_dma_irq = dwmac_disable_dma_irq,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index 6502b9a..80339d3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -51,14 +51,14 @@ static void dwmac100_dma_init(void __iomem *ioaddr,
* The transmit threshold can be programmed by setting the TTC bits in the DMA
* control register.
*/
-static void dwmac100_dma_operation_mode(void __iomem *ioaddr, int txmode,
- int rxmode, int rxfifosz)
+static void dwmac100_dma_operation_mode_tx(void __iomem *ioaddr, int mode,
+ u32 channel, int fifosz, u8 qmode)
{
u32 csr6 = readl(ioaddr + DMA_CONTROL);
- if (txmode <= 32)
+ if (mode <= 32)
csr6 |= DMA_CONTROL_TTC_32;
- else if (txmode <= 64)
+ else if (mode <= 64)
csr6 |= DMA_CONTROL_TTC_64;
else
csr6 |= DMA_CONTROL_TTC_128;
@@ -113,7 +113,7 @@ static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x,
.reset = dwmac_dma_reset,
.init = dwmac100_dma_init,
.dump_regs = dwmac100_dump_dma_regs,
- .dma_mode = dwmac100_dma_operation_mode,
+ .dma_tx_mode = dwmac100_dma_operation_mode_tx,
.dma_diagnostic_fr = dwmac100_dma_diagnostic_fr,
.enable_dma_transmission = dwmac_enable_dma_transmission,
.enable_dma_irq = dwmac_enable_dma_irq,
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index a6b9c97..3ff4afe 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -153,10 +153,6 @@ struct stmmac_dma_ops {
void (*axi)(void __iomem *ioaddr, struct stmmac_axi *axi);
/* Dump DMA registers */
void (*dump_regs)(void __iomem *ioaddr, u32 *reg_space);
- /* Set tx/rx threshold in the csr6 register
- * An invalid value enables the store-and-forward mode */
- void (*dma_mode)(void __iomem *ioaddr, int txmode, int rxmode,
- int rxfifosz);
void (*dma_rx_mode)(void __iomem *ioaddr, int mode, u32 channel,
int fifosz, u8 qmode);
void (*dma_tx_mode)(void __iomem *ioaddr, int mode, u32 channel,
@@ -199,8 +195,6 @@ struct stmmac_dma_ops {
stmmac_do_void_callback(__priv, dma, axi, __args)
#define stmmac_dump_dma_regs(__priv, __args...) \
stmmac_do_void_callback(__priv, dma, dump_regs, __args)
-#define stmmac_dma_mode(__priv, __args...) \
- stmmac_do_void_callback(__priv, dma, dma_mode, __args)
#define stmmac_dma_rx_mode(__priv, __args...) \
stmmac_do_void_callback(__priv, dma, dma_rx_mode, __args)
#define stmmac_dma_tx_mode(__priv, __args...) \
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 0ccee6a..beb7ec1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1787,22 +1787,18 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
}
/* configure all channels */
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- for (chan = 0; chan < rx_channels_count; chan++) {
- qmode = priv->plat->rx_queues_cfg[chan].mode_to_use;
+ for (chan = 0; chan < rx_channels_count; chan++) {
+ qmode = priv->plat->rx_queues_cfg[chan].mode_to_use;
- stmmac_dma_rx_mode(priv, priv->ioaddr, rxmode, chan,
- rxfifosz, qmode);
- }
+ stmmac_dma_rx_mode(priv, priv->ioaddr, rxmode, chan,
+ rxfifosz, qmode);
+ }
- for (chan = 0; chan < tx_channels_count; chan++) {
- qmode = priv->plat->tx_queues_cfg[chan].mode_to_use;
+ for (chan = 0; chan < tx_channels_count; chan++) {
+ qmode = priv->plat->tx_queues_cfg[chan].mode_to_use;
- stmmac_dma_tx_mode(priv, priv->ioaddr, txmode, chan,
- txfifosz, qmode);
- }
- } else {
- stmmac_dma_mode(priv, priv->ioaddr, txmode, rxmode, rxfifosz);
+ stmmac_dma_tx_mode(priv, priv->ioaddr, txmode, chan,
+ txfifosz, qmode);
}
}
@@ -1971,14 +1967,8 @@ static void stmmac_set_dma_operation_mode(struct stmmac_priv *priv, u32 txmode,
rxfifosz /= rx_channels_count;
txfifosz /= tx_channels_count;
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- stmmac_dma_rx_mode(priv, priv->ioaddr, rxmode, chan, rxfifosz,
- rxqmode);
- stmmac_dma_tx_mode(priv, priv->ioaddr, txmode, chan, txfifosz,
- txqmode);
- } else {
- stmmac_dma_mode(priv, priv->ioaddr, txmode, rxmode, rxfifosz);
- }
+ stmmac_dma_rx_mode(priv, priv->ioaddr, rxmode, chan, rxfifosz, rxqmode);
+ stmmac_dma_tx_mode(priv, priv->ioaddr, txmode, chan, txfifosz, txqmode);
}
static bool stmmac_safety_feat_interrupt(struct stmmac_priv *priv)
--
1.7.1
* [PATCH v3 net-next 04/12] net: stmmac: Let descriptor code clear the descriptor
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Stop using if conditions depending on the GMAC version for clearing the
descriptor and instead use a helper implemented in the descriptor files.
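As a standalone illustration of the helper approach (the sketch_* names and
the host-endian descriptor struct are assumptions made for the example),
each descriptor flavour provides its own clear function and the caller
goes through the ops pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_desc {
	uint32_t des0;
	uint32_t des1;
	uint32_t des2;
	uint32_t des3;
};

struct sketch_desc_ops {
	void (*clear)(struct sketch_desc *p);
};

/* GMAC4-style descriptors: all four words are reset. */
static void sketch_dwmac4_clear(struct sketch_desc *p)
{
	p->des0 = 0;
	p->des1 = 0;
	p->des2 = 0;
	p->des3 = 0;
}

/* Normal/enhanced descriptors: only the buffer address word is reset. */
static void sketch_ndesc_clear(struct sketch_desc *p)
{
	p->des2 = 0;
}

const struct sketch_desc_ops sketch_dwmac4_ops = { sketch_dwmac4_clear };
const struct sketch_desc_ops sketch_ndesc_ops = { sketch_ndesc_clear };

/* The caller no longer needs to know which core it is running on. */
void sketch_clear_desc(const struct sketch_desc_ops *ops, struct sketch_desc *p)
{
	assert(p != NULL);
	if (ops && ops->clear)
		ops->clear(p);
}
```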
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 9 +++++++++
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/hwif.h | 4 ++++
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +--------
5 files changed, 26 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index f67caa1..119a2f9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -430,6 +430,14 @@ static void dwmac4_set_addr(struct dma_desc *p, dma_addr_t addr)
p->des1 = 0;
}
+static void dwmac4_clear(struct dma_desc *p)
+{
+ p->des0 = 0;
+ p->des1 = 0;
+ p->des2 = 0;
+ p->des3 = 0;
+}
+
const struct stmmac_desc_ops dwmac4_desc_ops = {
.tx_status = dwmac4_wrback_get_tx_status,
.rx_status = dwmac4_wrback_get_rx_status,
@@ -452,6 +460,7 @@ static void dwmac4_set_addr(struct dma_desc *p, dma_addr_t addr)
.display_ring = dwmac4_display_ring,
.set_mss = dwmac4_set_mss_ctxt,
.set_addr = dwmac4_set_addr,
+ .clear = dwmac4_clear,
};
const struct stmmac_mode_ops dwmac4_ring_mode_ops = { };
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 02749e4..17cd26f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -442,6 +442,11 @@ static void enh_desc_set_addr(struct dma_desc *p, dma_addr_t addr)
p->des2 = cpu_to_le32(addr);
}
+static void enh_desc_clear(struct dma_desc *p)
+{
+ p->des2 = 0;
+}
+
const struct stmmac_desc_ops enh_desc_ops = {
.tx_status = enh_desc_get_tx_status,
.rx_status = enh_desc_get_rx_status,
@@ -463,4 +468,5 @@ static void enh_desc_set_addr(struct dma_desc *p, dma_addr_t addr)
.get_rx_timestamp_status = enh_desc_get_rx_timestamp_status,
.display_ring = enh_desc_display_ring,
.set_addr = enh_desc_set_addr,
+ .clear = enh_desc_clear,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index d66d194..a6b9c97 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -81,6 +81,8 @@ struct stmmac_desc_ops {
void (*set_mss)(struct dma_desc *p, unsigned int mss);
/* set descriptor skbuff address */
void (*set_addr)(struct dma_desc *p, dma_addr_t addr);
+ /* clear descriptor */
+ void (*clear)(struct dma_desc *p);
};
#define stmmac_init_rx_desc(__priv, __args...) \
@@ -127,6 +129,8 @@ struct stmmac_desc_ops {
stmmac_do_void_callback(__priv, desc, set_mss, __args)
#define stmmac_set_desc_addr(__priv, __args...) \
stmmac_do_void_callback(__priv, desc, set_addr, __args)
+#define stmmac_clear_desc(__priv, __args...) \
+ stmmac_do_void_callback(__priv, desc, clear, __args)
struct stmmac_dma_cfg;
struct dma_features;
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 6cf2c7c..a7b221b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -302,6 +302,11 @@ static void ndesc_set_addr(struct dma_desc *p, dma_addr_t addr)
p->des2 = cpu_to_le32(addr);
}
+static void ndesc_clear(struct dma_desc *p)
+{
+ p->des2 = 0;
+}
+
const struct stmmac_desc_ops ndesc_ops = {
.tx_status = ndesc_get_tx_status,
.rx_status = ndesc_get_rx_status,
@@ -322,4 +327,5 @@ static void ndesc_set_addr(struct dma_desc *p, dma_addr_t addr)
.get_rx_timestamp_status = ndesc_get_rx_timestamp_status,
.display_ring = ndesc_display_ring,
.set_addr = ndesc_set_addr,
+ .clear = ndesc_clear,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3f559d7..0ccee6a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1341,14 +1341,7 @@ static int init_dma_tx_desc_rings(struct net_device *dev)
else
p = tx_q->dma_tx + i;
- if (priv->synopsys_id >= DWMAC_CORE_4_00) {
- p->des0 = 0;
- p->des1 = 0;
- p->des2 = 0;
- p->des3 = 0;
- } else {
- p->des2 = 0;
- }
+ stmmac_clear_desc(priv, p);
tx_q->tx_skbuff_dma[i].buf = 0;
tx_q->tx_skbuff_dma[i].map_as_page = false;
--
1.7.1
* [PATCH v3 net-next 03/12] net: stmmac: Let descriptor code set skbuff address
From: Jose Abreu @ 2018-05-18 13:56 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Stop using if conditions depending on the GMAC version for setting the
descriptor skbuff address and instead use a helper implemented in
the descriptor files.
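The difference between the two layouts can be sketched in standalone C as
below. The sketch_* names are hypothetical, and sketch_cpu_to_le32() is a
portable stand-in for the kernel's cpu_to_le32(), written so the example
builds outside the kernel. As in the patch, only the low 32 bits of the
address are stored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct sketch_desc {
	uint32_t des0;
	uint32_t des1;
	uint32_t des2;
	uint32_t des3;
};

/* Return a value whose in-memory byte order is little endian. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
	uint8_t b[4];
	uint32_t out;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
	memcpy(&out, b, sizeof(out));
	return out;
}

/* GMAC4-style layout: address in des0, des1 cleared. */
void sketch_dwmac4_set_addr(struct sketch_desc *p, uint64_t addr)
{
	assert(p != NULL);
	p->des0 = sketch_cpu_to_le32((uint32_t)addr);
	p->des1 = 0;
}

/* Normal/enhanced layout: address in des2. */
void sketch_ndesc_set_addr(struct sketch_desc *p, uint64_t addr)
{
	assert(p != NULL);
	p->des2 = sketch_cpu_to_le32((uint32_t)addr);
}
```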
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 7 +++++
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 ++++
drivers/net/ethernet/stmicro/stmmac/hwif.h | 4 +++
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 ++++
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 25 ++++---------------
5 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index 65ed896..f67caa1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -424,6 +424,12 @@ static void dwmac4_set_mss_ctxt(struct dma_desc *p, unsigned int mss)
p->des3 = cpu_to_le32(TDES3_CONTEXT_TYPE | TDES3_CTXT_TCMSSV);
}
+static void dwmac4_set_addr(struct dma_desc *p, dma_addr_t addr)
+{
+ p->des0 = cpu_to_le32(addr);
+ p->des1 = 0;
+}
+
const struct stmmac_desc_ops dwmac4_desc_ops = {
.tx_status = dwmac4_wrback_get_tx_status,
.rx_status = dwmac4_wrback_get_rx_status,
@@ -445,6 +451,7 @@ static void dwmac4_set_mss_ctxt(struct dma_desc *p, unsigned int mss)
.init_tx_desc = dwmac4_rd_init_tx_desc,
.display_ring = dwmac4_display_ring,
.set_mss = dwmac4_set_mss_ctxt,
+ .set_addr = dwmac4_set_addr,
};
const struct stmmac_mode_ops dwmac4_ring_mode_ops = { };
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 3bfb3f5..02749e4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -437,6 +437,11 @@ static void enh_desc_display_ring(void *head, unsigned int size, bool rx)
pr_info("\n");
}
+static void enh_desc_set_addr(struct dma_desc *p, dma_addr_t addr)
+{
+ p->des2 = cpu_to_le32(addr);
+}
+
const struct stmmac_desc_ops enh_desc_ops = {
.tx_status = enh_desc_get_tx_status,
.rx_status = enh_desc_get_rx_status,
@@ -457,4 +462,5 @@ static void enh_desc_display_ring(void *head, unsigned int size, bool rx)
.get_timestamp = enh_desc_get_timestamp,
.get_rx_timestamp_status = enh_desc_get_rx_timestamp_status,
.display_ring = enh_desc_display_ring,
+ .set_addr = enh_desc_set_addr,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index b7539a1..d66d194 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -79,6 +79,8 @@ struct stmmac_desc_ops {
void (*display_ring)(void *head, unsigned int size, bool rx);
/* set MSS via context descriptor */
void (*set_mss)(struct dma_desc *p, unsigned int mss);
+ /* set descriptor skbuff address */
+ void (*set_addr)(struct dma_desc *p, dma_addr_t addr);
};
#define stmmac_init_rx_desc(__priv, __args...) \
@@ -123,6 +125,8 @@ struct stmmac_desc_ops {
stmmac_do_void_callback(__priv, desc, display_ring, __args)
#define stmmac_set_mss(__priv, __args...) \
stmmac_do_void_callback(__priv, desc, set_mss, __args)
+#define stmmac_set_desc_addr(__priv, __args...) \
+ stmmac_do_void_callback(__priv, desc, set_addr, __args)
struct stmmac_dma_cfg;
struct dma_features;
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 7b1d901..6cf2c7c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -297,6 +297,11 @@ static void ndesc_display_ring(void *head, unsigned int size, bool rx)
pr_info("\n");
}
+static void ndesc_set_addr(struct dma_desc *p, dma_addr_t addr)
+{
+ p->des2 = cpu_to_le32(addr);
+}
+
const struct stmmac_desc_ops ndesc_ops = {
.tx_status = ndesc_get_tx_status,
.rx_status = ndesc_get_rx_status,
@@ -316,4 +321,5 @@ static void ndesc_display_ring(void *head, unsigned int size, bool rx)
.get_timestamp = ndesc_get_timestamp,
.get_rx_timestamp_status = ndesc_get_rx_timestamp_status,
.display_ring = ndesc_display_ring,
+ .set_addr = ndesc_set_addr,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 789bc22..3f559d7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1156,10 +1156,7 @@ static int stmmac_init_rx_buffers(struct stmmac_priv *priv, struct dma_desc *p,
return -EINVAL;
}
- if (priv->synopsys_id >= DWMAC_CORE_4_00)
- p->des0 = cpu_to_le32(rx_q->rx_skbuff_dma[i]);
- else
- p->des2 = cpu_to_le32(rx_q->rx_skbuff_dma[i]);
+ stmmac_set_desc_addr(priv, p, rx_q->rx_skbuff_dma[i]);
if (priv->dma_buf_sz == BUF_SIZE_16KiB)
stmmac_init_desc3(priv, p);
@@ -3100,10 +3097,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
goto dma_map_err; /* should reuse desc w/o issues */
tx_q->tx_skbuff_dma[entry].buf = des;
- if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00))
- desc->des0 = cpu_to_le32(des);
- else
- desc->des2 = cpu_to_le32(des);
+
+ stmmac_set_desc_addr(priv, desc, des);
tx_q->tx_skbuff_dma[entry].map_as_page = true;
tx_q->tx_skbuff_dma[entry].len = len;
@@ -3185,10 +3180,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
goto dma_map_err;
tx_q->tx_skbuff_dma[first_entry].buf = des;
- if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00))
- first->des0 = cpu_to_le32(des);
- else
- first->des2 = cpu_to_le32(des);
+
+ stmmac_set_desc_addr(priv, first, des);
tx_q->tx_skbuff_dma[first_entry].len = nopaged_len;
tx_q->tx_skbuff_dma[first_entry].last_segment = last_segment;
@@ -3302,13 +3295,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv, u32 queue)
break;
}
- if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00)) {
- p->des0 = cpu_to_le32(rx_q->rx_skbuff_dma[entry]);
- p->des1 = 0;
- } else {
- p->des2 = cpu_to_le32(rx_q->rx_skbuff_dma[entry]);
- }
-
+ stmmac_set_desc_addr(priv, p, rx_q->rx_skbuff_dma[entry]);
stmmac_refill_desc3(priv, rx_q, p);
if (rx_q->rx_zeroc_thresh > 0)
--
1.7.1
* [PATCH v3 net-next 02/12] net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit
From: Jose Abreu @ 2018-05-18 13:55 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
Rearming the coalesce timer on every transmit is cutting down
performance. Once the timer is armed it should run after the time
expires for the first packet sent and not the last one.
After this change, running iperf, the performance gain is +/- 24%.
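The decision made on each transmit can be modelled in standalone C as
below. This is a sketch under assumptions: the sketch_* names and the two
counters are made up, and the counters stand in for mod_timer() and for
setting the interrupt-on-completion bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tx_state {
	unsigned int tx_count_frames;
	unsigned int tx_coal_frames;
	bool tx_timer_armed;
	unsigned int timer_arms; /* how often the timer was (re)armed */
	unsigned int ic_bits;    /* how often the IC bit was requested */
};

/* Same branch structure as the patched hunk in stmmac_xmit(). */
void sketch_xmit_coalesce(struct sketch_tx_state *s, unsigned int nfrags)
{
	assert(s != NULL);
	s->tx_count_frames += nfrags + 1;
	if (s->tx_coal_frames > s->tx_count_frames && !s->tx_timer_armed) {
		s->timer_arms++; /* stands in for mod_timer() */
		s->tx_timer_armed = true;
	} else {
		s->tx_count_frames = 0;
		s->ic_bits++;    /* stands in for stmmac_set_tx_ic() */
		s->tx_timer_armed = false;
	}
}
```

In this model a frame that arrives while the timer is already armed takes
the else branch, because both conditions share one if statement.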
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 5 ++++-
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 42fc76e..4d425b1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -105,6 +105,7 @@ struct stmmac_priv {
u32 tx_count_frames;
u32 tx_coal_frames;
u32 tx_coal_timer;
+ bool tx_timer_armed;
int tx_coalesce;
int hwts_tx_en;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d9dbe13..789bc22 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3158,13 +3158,16 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
* element in case of no SG.
*/
priv->tx_count_frames += nfrags + 1;
- if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
+ if (likely(priv->tx_coal_frames > priv->tx_count_frames) &&
+ !priv->tx_timer_armed) {
mod_timer(&priv->txtimer,
STMMAC_COAL_TIMER(priv->tx_coal_timer));
+ priv->tx_timer_armed = true;
} else {
priv->tx_count_frames = 0;
stmmac_set_tx_ic(priv, desc);
priv->xstats.tx_set_ic_bit++;
+ priv->tx_timer_armed = false;
}
skb_tx_timestamp(skb);
--
1.7.1
* [PATCH v3 net-next 01/12] net: stmmac: Enable OSP for GMAC4
From: Jose Abreu @ 2018-05-18 13:55 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <cover.1526651009.git.joabreu@synopsys.com>
This enables OSP (Operate on Second Packet) for GMAC4. The feature
allows the DMA to fetch the second descriptor while it is still processing the
first one.
Running iperf, the performance gain is +/- 38%.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 117c3a5..9aab5b3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -94,6 +94,10 @@ static void dwmac4_dma_init_tx_chan(void __iomem *ioaddr,
value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
+
+ /* Enable OSP to get best performance */
+ value |= DMA_CONTROL_OSP;
+
writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
writel(dma_tx_phy, ioaddr + DMA_CHAN_TX_BASE_ADDR(chan));
--
1.7.1
^ permalink raw reply related
* [PATCH v3 net-next 00/12] net: stmmac: Clean-up and tune-up
From: Jose Abreu @ 2018-05-18 13:55 UTC (permalink / raw)
To: netdev
Cc: Jose Abreu, David S. Miller, Joao Pinto, Vitor Soares,
Giuseppe Cavallaro, Alexandre Torgue
This series aims to uniformize the handling of the different GMAC versions in
the stmmac_main.c file and also to tune up the HW.
Currently there are some if/else conditions in the main source file which
call different callbacks depending on the ID of the GMAC.
With the introduction of a generic HW interface handling which automatically
selects the GMAC callbacks to be used, it is now unpleasant to see if
conditions in the main code because this should be completely agnostic of the
GMAC version.
This series removes most of these conditions. There are some if conditions
that remain untouched but the callback handling is now uniformized.
Tested on GMAC5; I hope I didn't break any previous versions.
Please check [1] for the performance analysis of patches 3-12.
---
David,
This will probably generate a merge conflict with [2] (which was not merged
yet). I'm waiting for Corentin's input and then, if this series is merged
before, I will rebase [2]. Or the other way around if you prefer :D
Thanks
---
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
[1] https://marc.info/?l=linux-netdev&m=152656352607905&w=2
[2] https://patchwork.ozlabs.org/patch/915286/
Jose Abreu (12):
net: stmmac: Enable OSP for GMAC4
net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit
net: stmmac: Let descriptor code set skbuff address
net: stmmac: Let descriptor code clear the descriptor
net: stmmac: Uniformize the use of dma_{rx/tx}_mode callbacks
net: stmmac: Remove uneeded checks for GMAC version
net: stmmac: Move PTP and MMC base address calculation to hwif.c
net: stmmac: Uniformize the use of dma_init_* callbacks
net: stmmac: Remove uneeded check for GMAC version in stmmac_xmit
net: stmmac: Uniformize set_rx_owner()
net: stmmac: Let descriptor code get skbuff address
net: stmmac: Remove if condition by taking advantage of hwif return
code
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 82 +++++---
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 92 ++++++----
drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c | 35 +++--
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 34 +++-
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 7 +-
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 1 -
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 20 ++-
drivers/net/ethernet/stmicro/stmmac/hwif.c | 34 ++++
drivers/net/ethernet/stmicro/stmmac/hwif.h | 27 ++-
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 20 ++-
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 198 +++++++-------------
12 files changed, 323 insertions(+), 228 deletions(-)
^ permalink raw reply
* Re: [RFC PATCH ghak32 V2 01/13] audit: add container id
From: Steve Grubb @ 2018-05-18 13:56 UTC (permalink / raw)
To: Richard Guy Briggs
Cc: simo-H+wXaHxf7aLQT0dZR+AlfA, jlayton-H+wXaHxf7aLQT0dZR+AlfA,
carlos-H+wXaHxf7aLQT0dZR+AlfA, linux-api-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML,
eparis-FjpueFixGhCM4zKIHC2jIg, dhowells-H+wXaHxf7aLQT0dZR+AlfA,
Linux-Audit Mailing List, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
luto-DgEjT+Ai2ygdnm+yROfE0A, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <20180517215600.dyswlkvqdtgjwr5y-bcJWsdo4jJjeVoXN4CMphl7TgLCtbB0G@public.gmane.org>
On Thu, 17 May 2018 17:56:00 -0400
Richard Guy Briggs <rgb-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > During syscall events, the path info is returned in a record
> > simply called AUDIT_PATH, cwd info is returned in AUDIT_CWD. So,
> > rather than calling the record that gets attached to everything
> > AUDIT_CONTAINER_INFO, how about simply AUDIT_CONTAINER.
>
> Considering the container initiation record is different than the
> record to document the container involved in an otherwise normal
> syscall, we need two names. I don't have a strong opinion what they
> are.
>
> I'd prefer AUDIT_CONTAIN and AUDIT_CONTAINER_INFO so that the two
> are different enough to be visually distinct while leaving
> AUDIT_CONTAINERID for the field type in patch 4 ("audit: add
> containerid filtering")
How about AUDIT_CONTAINER for the auxiliary record? The one that starts
the container, I don't have a strong opinion on. Could be
AUDIT_CONTAINER_INIT, AUDIT_CONTAINER_START, AUDIT_CONTAINERID,
AUDIT_CONTAINER_ID, or something else. The API call that sets the ID
for filtering could be AUDIT_CID or AUDIT_CONTID if that helps decide
what the initial event might be. Normally, it should match the field
being filtered.
Best Regards,
-Steve
^ permalink raw reply
* Re: [PATCH v3 2/2] bpf: add selftest for rawir_event type program
From: Quentin Monnet @ 2018-05-18 13:48 UTC (permalink / raw)
To: Sean Young
Cc: Y Song, linux-media, linux-kernel, Alexei Starovoitov,
Mauro Carvalho Chehab, Daniel Borkmann, netdev, Matthias Reichl,
Devin Heitmueller
In-Reply-To: <20180518133329.fafkew5nkr2bmzah@gofer.mess.org>
2018-05-18 14:33 UTC+0100 ~ Sean Young <sean@mess.org>
> On Fri, May 18, 2018 at 11:13:07AM +0100, Quentin Monnet wrote:
>> 2018-05-17 22:01 UTC+0100 ~ Sean Young <sean@mess.org>
>>> On Thu, May 17, 2018 at 10:17:59AM -0700, Y Song wrote:
>>>> On Wed, May 16, 2018 at 2:04 PM, Sean Young <sean@mess.org> wrote:
>>>>> This is simple test over rc-loopback.
>>>>>
>>>>> Signed-off-by: Sean Young <sean@mess.org>
>>>>> ---
>>>>> tools/bpf/bpftool/prog.c | 1 +
>>>>> tools/include/uapi/linux/bpf.h | 57 +++++++-
>>>>> tools/lib/bpf/libbpf.c | 1 +
>>>>> tools/testing/selftests/bpf/Makefile | 8 +-
>>>>> tools/testing/selftests/bpf/bpf_helpers.h | 6 +
>>>>> tools/testing/selftests/bpf/test_rawir.sh | 37 +++++
>>>>> .../selftests/bpf/test_rawir_event_kern.c | 26 ++++
>>>>> .../selftests/bpf/test_rawir_event_user.c | 130 ++++++++++++++++++
>>>>> 8 files changed, 261 insertions(+), 5 deletions(-)
>>>>> create mode 100755 tools/testing/selftests/bpf/test_rawir.sh
>>>>> create mode 100644 tools/testing/selftests/bpf/test_rawir_event_kern.c
>>>>> create mode 100644 tools/testing/selftests/bpf/test_rawir_event_user.c
>>
>> [...]
>>
>>>> Most people probably not really familiar with lircN device. It would be
>>>> good to provide more information about how to enable this, e.g.,
>>>> CONFIG_RC_CORE=y
>>>> CONFIG_BPF_RAWIR_EVENT=y
>>>> CONFIG_RC_LOOPBACK=y
>>>> ......
>>>
>>> Good point. I'll add some words explaining what it is and how to make it work.
>>>
>>> Thanks
>>> Sean
>>
>>
>> By the way, shouldn't the two eBPF helpers bpf_rc_keydown() and
>> bpf_rc_repeat() be compiled out in patch 1 if e.g.
>> CONFIG_BPF_RAWIR_EVENT is not set? There are some other helpers that are
>> compiled only if relevant config options are set (bpf_get_xfrm_state()
>> for example).
>
> So if CONFIG_BPF_RAWIR_EVENT is not set, then bpf-rawir-event.c is not
> compiled. Stubs are created in include/linux/bpf_rcdev.h, so this is
> already the case if I understand you correctly.
This is correct, sorry for the mistake.
>> (If you were to change that, please also update helper documentations to
>> indicate what configuration options are required to be able to use the
>> helpers.)
>
> Ok, I'll add that.
Thanks a lot!
Quentin
^ permalink raw reply
* Re: [patch net-next RFC 04/12] dsa: set devlink port attrs for dsa ports
From: Andrew Lunn @ 2018-05-18 13:45 UTC (permalink / raw)
To: Jiri Pirko
Cc: Florian Fainelli, netdev, davem, idosch, jakub.kicinski, mlxsw,
vivien.didelot, michael.chan, ganeshgr, saeedm, simon.horman,
pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
alexander.h.duyck, ogerlitz, dsahern, vijaya.guvva,
satananda.burla, raghu.vatsavayi, felix.manlunas, gospo,
sathya.perla, vasundhara-v.volam, tariqt, eranbe,
jeffrey.t.kirsher
In-Reply-To: <20180518063735.GY1972@nanopsycho>
> What benefit does it have to register unused ports? What is a usecase
> for them. Like Florian, I also think they should not be registered.
Hi Jiri
They physically exist, so we are accurately describing the hardware by
registering them.
Andrew
^ permalink raw reply
* [PATCH net] cxgb4: fix offset in collecting TX rate limit info
From: Rahul Lakkireddy @ 2018-05-18 13:43 UTC (permalink / raw)
To: netdev; +Cc: davem, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
Correct the indirect register offsets in collecting TX rate limit info
in UP CIM logs.
Also, T5 doesn't support these indirect register offsets, so remove
them from collection logic.
Fixes: be6e36d916b1 ("cxgb4: collect TX rate limit info in UP CIM logs")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h | 28 ++++++++---------------
1 file changed, 9 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
index b57acb8dc35b..dc25066c59a1 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
@@ -419,15 +419,15 @@ static const u32 t6_up_cim_reg_array[][IREG_NUM_ELEM + 1] = {
{0x7b50, 0x7b54, 0x280, 0x20, 0}, /* up_cim_280_to_2fc */
{0x7b50, 0x7b54, 0x300, 0x20, 0}, /* up_cim_300_to_37c */
{0x7b50, 0x7b54, 0x380, 0x14, 0}, /* up_cim_380_to_3cc */
- {0x7b50, 0x7b54, 0x2900, 0x4, 0x4}, /* up_cim_2900_to_3d40 */
- {0x7b50, 0x7b54, 0x2904, 0x4, 0x4}, /* up_cim_2904_to_3d44 */
- {0x7b50, 0x7b54, 0x2908, 0x4, 0x4}, /* up_cim_2908_to_3d48 */
- {0x7b50, 0x7b54, 0x2910, 0x4, 0x4}, /* up_cim_2910_to_3d4c */
- {0x7b50, 0x7b54, 0x2914, 0x4, 0x4}, /* up_cim_2914_to_3d50 */
- {0x7b50, 0x7b54, 0x2920, 0x10, 0x10}, /* up_cim_2920_to_2a10 */
- {0x7b50, 0x7b54, 0x2924, 0x10, 0x10}, /* up_cim_2924_to_2a14 */
- {0x7b50, 0x7b54, 0x2928, 0x10, 0x10}, /* up_cim_2928_to_2a18 */
- {0x7b50, 0x7b54, 0x292c, 0x10, 0x10}, /* up_cim_292c_to_2a1c */
+ {0x7b50, 0x7b54, 0x4900, 0x4, 0x4}, /* up_cim_4900_to_4c60 */
+ {0x7b50, 0x7b54, 0x4904, 0x4, 0x4}, /* up_cim_4904_to_4c64 */
+ {0x7b50, 0x7b54, 0x4908, 0x4, 0x4}, /* up_cim_4908_to_4c68 */
+ {0x7b50, 0x7b54, 0x4910, 0x4, 0x4}, /* up_cim_4910_to_4c70 */
+ {0x7b50, 0x7b54, 0x4914, 0x4, 0x4}, /* up_cim_4914_to_4c74 */
+ {0x7b50, 0x7b54, 0x4920, 0x10, 0x10}, /* up_cim_4920_to_4a10 */
+ {0x7b50, 0x7b54, 0x4924, 0x10, 0x10}, /* up_cim_4924_to_4a14 */
+ {0x7b50, 0x7b54, 0x4928, 0x10, 0x10}, /* up_cim_4928_to_4a18 */
+ {0x7b50, 0x7b54, 0x492c, 0x10, 0x10}, /* up_cim_492c_to_4a1c */
};
static const u32 t5_up_cim_reg_array[][IREG_NUM_ELEM + 1] = {
@@ -444,16 +444,6 @@ static const u32 t5_up_cim_reg_array[][IREG_NUM_ELEM + 1] = {
{0x7b50, 0x7b54, 0x280, 0x20, 0}, /* up_cim_280_to_2fc */
{0x7b50, 0x7b54, 0x300, 0x20, 0}, /* up_cim_300_to_37c */
{0x7b50, 0x7b54, 0x380, 0x14, 0}, /* up_cim_380_to_3cc */
- {0x7b50, 0x7b54, 0x2900, 0x4, 0x4}, /* up_cim_2900_to_3d40 */
- {0x7b50, 0x7b54, 0x2904, 0x4, 0x4}, /* up_cim_2904_to_3d44 */
- {0x7b50, 0x7b54, 0x2908, 0x4, 0x4}, /* up_cim_2908_to_3d48 */
- {0x7b50, 0x7b54, 0x2910, 0x4, 0x4}, /* up_cim_2910_to_3d4c */
- {0x7b50, 0x7b54, 0x2914, 0x4, 0x4}, /* up_cim_2914_to_3d50 */
- {0x7b50, 0x7b54, 0x2918, 0x4, 0x4}, /* up_cim_2918_to_3d54 */
- {0x7b50, 0x7b54, 0x291c, 0x4, 0x4}, /* up_cim_291c_to_3d58 */
- {0x7b50, 0x7b54, 0x2924, 0x10, 0x10}, /* up_cim_2924_to_2914 */
- {0x7b50, 0x7b54, 0x2928, 0x10, 0x10}, /* up_cim_2928_to_2a18 */
- {0x7b50, 0x7b54, 0x292c, 0x10, 0x10}, /* up_cim_292c_to_2a1c */
};
static const u32 t6_hma_ireg_array[][IREG_NUM_ELEM] = {
--
2.14.1
^ permalink raw reply related
* Re: [PATCH bpf-next v3 00/15] Introducing AF_XDP support
From: Daniel Borkmann @ 2018-05-18 13:43 UTC (permalink / raw)
To: Alexei Starovoitov, Björn Töpel, Alexei Starovoitov
Cc: Karlsson, Magnus, Duyck, Alexander H, Alexander Duyck,
John Fastabend, Jesper Dangaard Brouer, Willem de Bruijn,
Michael S. Tsirkin, Netdev, Björn Töpel,
michael.lundkvist, Brandeburg, Jesse, Singhai, Anjali,
Zhang, Qi Z
In-Reply-To: <cb3aa2a3-f72c-54bf-e883-88922e372c58@fb.com>
On 05/18/2018 05:38 AM, Alexei Starovoitov wrote:
> On 5/16/18 11:46 PM, Björn Töpel wrote:
>> 2018-05-04 1:38 GMT+02:00 Alexei Starovoitov <alexei.starovoitov@gmail.com>:
>>> On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
>>>> On 05/02/2018 01:01 PM, Björn Töpel wrote:
>>>>> From: Björn Töpel <bjorn.topel@intel.com>
>>>>>
>>>>> This patch set introduces a new address family called AF_XDP that is
>>>>> optimized for high performance packet processing and, in upcoming
>>>>> patch sets, zero-copy semantics. In this patch set, we have removed
>>>>> all zero-copy related code in order to make it smaller, simpler and
>>>>> hopefully more review friendly. This patch set only supports copy-mode
>>>>> for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
>>>>> for RX using the XDP_DRV path. Zero-copy support requires XDP and
>>>>> driver changes that Jesper Dangaard Brouer is working on. Some of his
>>>>> work has already been accepted. We will publish our zero-copy support
>>>>> for RX and TX on top of his patch sets at a later point in time.
>>>>
>>>> +1, would be great to see it land this cycle. Saw few minor nits here
>>>> and there but nothing to hold it up, for the series:
>>>>
>>>> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>>>>
>>>> Thanks everyone!
>>>
>>> Great stuff!
>>>
>>> Applied to bpf-next, with one condition.
>>> Upcoming zero-copy patches for both RX and TX need to be posted
>>> and reviewed within this release window.
>>> If netdev community as a whole won't be able to agree on the zero-copy
>>> bits we'd need to revert this feature before the next merge window.
>>>
>>> Few other minor nits:
>>> patch 3:
>>> +struct xdp_ring {
>>> + __u32 producer __attribute__((aligned(64)));
>>> + __u32 consumer __attribute__((aligned(64)));
>>> +};
>>> It kinda begs for ____cacheline_aligned_in_smp to be introduced for uapi headers.
>>
>> Hmm, I need some guidance on what a sane uapi variant would be. We
>> can't have the uapi depend on the kernel build. ARM64, e.g., can have
>> both 64B and 128B according to the specs. Contemporary IA processors
>> have 64B.
>>
>> The simplest, and maybe most future-proof, would be 128B aligned for
>> all. Another is having 128B for ARM and 64B for all IA. A third option
>> is having a hand-shaking API (I think virtio has that) for determine
>> the cache line size, but I'd rather not go down that route.
>>
>> Thoughts/ideas on how a uapi ____cacheline_aligned_in_smp version
>> would look like?
>
> I suspect i40e+arm combination wasn't tested anyway.
> The api may have endianness issues too on something like sparc.
> I think the way to be backwards compatible in this area
> is to make the api usable on x86 only by adding
> to include/uapi/linux/if_xdp.h
> #if defined(__x86_64__)
> #define AF_XDP_CACHE_BYTES 64
> #else
> #error "AF_XDP support is not yet available for this architecture"
> #endif
> and doing:
> __u32 producer __attribute__((aligned(AF_XDP_CACHE_BYTES)));
> __u32 consumer __attribute__((aligned(AF_XDP_CACHE_BYTES)));
>
> And progressively add to this for arm64 and few other archs.
> Eventually removing #error and adding some generic define
> that's good enough for long tail of architectures that
> we really cannot test.
I was looking into this a bit yesterday as well, and it's a bit of a mess what
uapi headers do in this regard (though there are just a handful of such headers).
Some of the kernel uapi headers hard-code generally 64 bytes regardless of the
underlying arch. In general, the kernel does expose it to user space via sysfs
(coherency_line_size). Here's what perf does to retrieve it:
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
#define cache_line_size(cacheline_sizep) *cacheline_sizep = sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
#else
static void cache_line_size(int *cacheline_sizep)
{
if (sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size", cacheline_sizep))
pr_debug("cannot determine cache line size");
}
#endif
The sysconf() implementation for _SC_LEVEL1_DCACHE_LINESIZE seems also only
available for x86, arm64, s390 and ppc on a cursory glance in the glibc code.
In the x86 case it retrieves the info from cpuid insn. In order to generically
use it in combination with the header you'd have some probe which would then
set this as a define before including the header.
Then projects like urcu, they do ...
#define ____cacheline_internodealigned_in_smp \
__attribute__((__aligned__(CAA_CACHE_LINE_SIZE)))
... and then hard code CAA_CACHE_LINE_SIZE for x86 (== 128), s390 (== 128),
ppc (== 256) and sparc64 (== 256) with a generic fallback to 64.
Hmm, perhaps a combination of the two would make sense where in case of known
cacheline size it can still be used and we only have the fallback in such way.
Like:
#ifndef XDP_CACHE_BYTES
# if defined(__x86_64__)
# define XDP_CACHE_BYTES 64
# else
# error "Please define XDP_CACHE_BYTES for this architecture!"
# endif
#endif
Too bad there's no asm uapi header at least for the archs where it's fixed
anyway such that not every project out there has to redefine all of it from
scratch and we could just include it (and the generic-asm one would throw
a compile error if it's not externally defined or such).
Cheers,
Daniel
^ permalink raw reply
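The probe-then-define idea from the message above could look roughly like the following on the user-space side. This is a minimal sketch under stated assumptions: probe_cache_line_size() and FALLBACK_CACHE_LINE are hypothetical names, the 64-byte fallback is an arbitrary choice for illustration, and the sysfs path is the one quoted from perf with the usual /sys prefix added.

```c
#include <stdio.h>
#include <unistd.h>

/* Assumption for this sketch only: value used when nothing can be probed. */
#define FALLBACK_CACHE_LINE 64

/*
 * Try sysconf() first, where the libc provides the constant, then fall
 * back to the sysfs attribute, then to a fixed default.
 */
static long probe_cache_line_size(void)
{
	long sz = -1;

#ifdef _SC_LEVEL1_DCACHE_LINESIZE
	sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
	if (sz <= 0) {
		FILE *f = fopen("/sys/devices/system/cpu/cpu0/cache/index0/"
				"coherency_line_size", "r");

		if (f) {
			if (fscanf(f, "%ld", &sz) != 1)
				sz = -1;
			fclose(f);
		}
	}

	return sz > 0 ? sz : FALLBACK_CACHE_LINE;
}
```

A build-time probe could print this value and pass it as -DXDP_CACHE_BYTES=<n> before the header is included. A runtime value cannot feed __attribute__((aligned(...))) directly, which is why the define has to be fixed at compile time.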
* [PATCH net-next] cxgb4: collect SGE PF/VF queue map
From: Rahul Lakkireddy @ 2018-05-18 13:42 UTC (permalink / raw)
To: netdev; +Cc: davem, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
For T6, collect info on queue mapping to corresponding PF/VF in SGE.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h | 17 ++++++++
drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c | 47 ++++++++++++++++++++++-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c | 3 +-
3 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
index 740a18ba4229..c333e25620a7 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
@@ -62,6 +62,18 @@ struct cudbg_hw_sched {
u32 map;
};
+#define SGE_QBASE_DATA_REG_NUM 4
+
+struct sge_qbase_reg_field {
+ u32 reg_addr;
+ u32 reg_data[SGE_QBASE_DATA_REG_NUM];
+ /* Max supported PFs */
+ u32 pf_data_value[PCIE_FW_MASTER_M + 1][SGE_QBASE_DATA_REG_NUM];
+ /* Max supported VFs */
+ u32 vf_data_value[T6_VF_M + 1][SGE_QBASE_DATA_REG_NUM];
+ u32 vfcount; /* Actual number of max vfs in current configuration */
+};
+
struct ireg_field {
u32 ireg_addr;
u32 ireg_data;
@@ -357,6 +369,11 @@ static const u32 t5_sge_dbg_index_array[2][IREG_NUM_ELEM] = {
{0x10cc, 0x10d4, 0x0, 16},
};
+static const u32 t6_sge_qbase_index_array[] = {
+ /* 1 addr reg SGE_QBASE_INDEX and 4 data reg SGE_QBASE_MAP[0-3] */
+ 0x1250, 0x1240, 0x1244, 0x1248, 0x124c,
+};
+
static const u32 t5_pcie_pdbg_array[][IREG_NUM_ELEM] = {
{0x5a04, 0x5a0c, 0x00, 0x20}, /* t5_pcie_pdbg_regs_00_to_20 */
{0x5a04, 0x5a0c, 0x21, 0x20}, /* t5_pcie_pdbg_regs_21_to_40 */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
index 4feb7eca0acf..0afcfe99bff3 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
@@ -1339,16 +1339,39 @@ int cudbg_collect_tp_indirect(struct cudbg_init *pdbg_init,
return cudbg_write_and_release_buff(pdbg_init, &temp_buff, dbg_buff);
}
+static void cudbg_read_sge_qbase_indirect_reg(struct adapter *padap,
+ struct sge_qbase_reg_field *qbase,
+ u32 func, bool is_pf)
+{
+ u32 *buff, i;
+
+ if (is_pf) {
+ buff = qbase->pf_data_value[func];
+ } else {
+ buff = qbase->vf_data_value[func];
+ /* In SGE_QBASE_INDEX,
+ * Entries 0->7 are PF0->7, Entries 8->263 are VFID0->256.
+ */
+ func += 8;
+ }
+
+ t4_write_reg(padap, qbase->reg_addr, func);
+ for (i = 0; i < SGE_QBASE_DATA_REG_NUM; i++, buff++)
+ *buff = t4_read_reg(padap, qbase->reg_data[i]);
+}
+
int cudbg_collect_sge_indirect(struct cudbg_init *pdbg_init,
struct cudbg_buffer *dbg_buff,
struct cudbg_error *cudbg_err)
{
struct adapter *padap = pdbg_init->adap;
struct cudbg_buffer temp_buff = { 0 };
+ struct sge_qbase_reg_field *sge_qbase;
struct ireg_buf *ch_sge_dbg;
int i, rc;
- rc = cudbg_get_buff(pdbg_init, dbg_buff, sizeof(*ch_sge_dbg) * 2,
+ rc = cudbg_get_buff(pdbg_init, dbg_buff,
+ sizeof(*ch_sge_dbg) * 2 + sizeof(*sge_qbase),
&temp_buff);
if (rc)
return rc;
@@ -1370,6 +1393,28 @@ int cudbg_collect_sge_indirect(struct cudbg_init *pdbg_init,
sge_pio->ireg_local_offset);
ch_sge_dbg++;
}
+
+ if (CHELSIO_CHIP_VERSION(padap->params.chip) > CHELSIO_T5) {
+ sge_qbase = (struct sge_qbase_reg_field *)ch_sge_dbg;
+ /* 1 addr reg SGE_QBASE_INDEX and 4 data reg
+ * SGE_QBASE_MAP[0-3]
+ */
+ sge_qbase->reg_addr = t6_sge_qbase_index_array[0];
+ for (i = 0; i < SGE_QBASE_DATA_REG_NUM; i++)
+ sge_qbase->reg_data[i] =
+ t6_sge_qbase_index_array[i + 1];
+
+ for (i = 0; i <= PCIE_FW_MASTER_M; i++)
+ cudbg_read_sge_qbase_indirect_reg(padap, sge_qbase,
+ i, true);
+
+ for (i = 0; i < padap->params.arch.vfcount; i++)
+ cudbg_read_sge_qbase_indirect_reg(padap, sge_qbase,
+ i, false);
+
+ sge_qbase->vfcount = padap->params.arch.vfcount;
+ }
+
return cudbg_write_and_release_buff(pdbg_init, &temp_buff, dbg_buff);
}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
index 085691eb2b95..8d751efcb90e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
@@ -214,7 +214,8 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
len = sizeof(struct ireg_buf) * n;
break;
case CUDBG_SGE_INDIRECT:
- len = sizeof(struct ireg_buf) * 2;
+ len = sizeof(struct ireg_buf) * 2 +
+ sizeof(struct sge_qbase_reg_field);
break;
case CUDBG_ULPRX_LA:
len = sizeof(struct cudbg_ulprx_la);
--
2.14.1
^ permalink raw reply related
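The index arithmetic in cudbg_read_sge_qbase_indirect_reg() above can be isolated as a small helper. This is a sketch only: sge_qbase_index() and QBASE_NUM_PF_ENTRIES are hypothetical names, and the layout (entries 0 to 7 for PFs, VFs starting at entry 8) is taken from the comment in the diff, not from hardware documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* From the patch comment: SGE_QBASE_INDEX entries 0..7 are PF0..PF7. */
#define QBASE_NUM_PF_ENTRIES 8u

/*
 * Map a PF or VF number to the value written to SGE_QBASE_INDEX,
 * following the "func += 8" step in the diff for VFs.
 */
static uint32_t sge_qbase_index(uint32_t func, bool is_pf)
{
	return is_pf ? func : func + QBASE_NUM_PF_ENTRIES;
}
```

With this mapping, entries 8 to 263 hold 256 slots, so they would cover VF IDs 0 to 255.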
* [bpf-next V4 PATCH 8/8] samples/bpf: xdp_monitor use err code from tracepoint xdp:xdp_devmap_xmit
From: Jesper Dangaard Brouer @ 2018-05-18 13:35 UTC (permalink / raw)
To: netdev, Daniel Borkmann, Alexei Starovoitov,
Jesper Dangaard Brouer
Cc: Christoph Hellwig, Björn Töpel, Magnus Karlsson,
makita.toshiaki
In-Reply-To: <152665044141.21055.1276346542020340263.stgit@firesoul>
Update xdp_monitor to use the err code recently added to
tracepoint xdp:xdp_devmap_xmit, to show whether the drop count is
caused by a general driver delivery problem. Other kinds of drops
will likely just be the more normal TX space issues.
---
samples/bpf/xdp_monitor_kern.c | 10 ++++++++++
samples/bpf/xdp_monitor_user.c | 35 ++++++++++++++++++++++++++++++-----
2 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/samples/bpf/xdp_monitor_kern.c b/samples/bpf/xdp_monitor_kern.c
index 2854aa0665ea..ad10fe700d7d 100644
--- a/samples/bpf/xdp_monitor_kern.c
+++ b/samples/bpf/xdp_monitor_kern.c
@@ -125,6 +125,7 @@ struct datarec {
u64 processed;
u64 dropped;
u64 info;
+ u64 err;
};
#define MAX_CPUS 64
@@ -228,6 +229,7 @@ struct devmap_xmit_ctx {
int sent; // offset:24; size:4; signed:1;
int from_ifindex; // offset:28; size:4; signed:1;
int to_ifindex; // offset:32; size:4; signed:1;
+ int err; // offset:36; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_devmap_xmit")
@@ -245,5 +247,13 @@ int trace_xdp_devmap_xmit(struct devmap_xmit_ctx *ctx)
/* Record bulk events, then userspace can calc average bulk size */
rec->info += 1;
+ /* Record error cases, where no frame were sent */
+ if (ctx->err)
+ rec->err++;
+
+ /* Catch API error of drv ndo_xdp_xmit sent more than count */
+ if (ctx->drops < 0)
+ rec->err++;
+
return 1;
}
diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c
index 7e18a454924c..dd558cbb2309 100644
--- a/samples/bpf/xdp_monitor_user.c
+++ b/samples/bpf/xdp_monitor_user.c
@@ -117,6 +117,7 @@ struct datarec {
__u64 processed;
__u64 dropped;
__u64 info;
+ __u64 err;
};
#define MAX_CPUS 64
@@ -152,6 +153,7 @@ static bool map_collect_record(int fd, __u32 key, struct record *rec)
__u64 sum_processed = 0;
__u64 sum_dropped = 0;
__u64 sum_info = 0;
+ __u64 sum_err = 0;
int i;
if ((bpf_map_lookup_elem(fd, &key, values)) != 0) {
@@ -170,10 +172,13 @@ static bool map_collect_record(int fd, __u32 key, struct record *rec)
sum_dropped += values[i].dropped;
rec->cpu[i].info = values[i].info;
sum_info += values[i].info;
+ rec->cpu[i].err = values[i].err;
+ sum_err += values[i].err;
}
rec->total.processed = sum_processed;
rec->total.dropped = sum_dropped;
rec->total.info = sum_info;
+ rec->total.err = sum_err;
return true;
}
@@ -274,6 +279,18 @@ static double calc_info(struct datarec *r, struct datarec *p, double period)
return pps;
}
+static double calc_err(struct datarec *r, struct datarec *p, double period)
+{
+ __u64 packets = 0;
+ double pps = 0;
+
+ if (period > 0) {
+ packets = r->err - p->err;
+ pps = packets / period;
+ }
+ return pps;
+}
+
static void stats_print(struct stats_record *stats_rec,
struct stats_record *stats_prev,
bool err_only)
@@ -412,11 +429,12 @@ static void stats_print(struct stats_record *stats_rec,
/* devmap ndo_xdp_xmit stats */
{
- char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.2f %s\n";
- char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.2f %s\n";
+ char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.2f %s %s\n";
+ char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.2f %s %s\n";
struct record *rec, *prev;
- double drop, info;
+ double drop, info, err;
char *i_str = "";
+ char *err_str = "";
rec = &stats_rec->xdp_devmap_xmit;
prev = &stats_prev->xdp_devmap_xmit;
@@ -428,22 +446,29 @@ static void stats_print(struct stats_record *stats_rec,
pps = calc_pps(r, p, t);
drop = calc_drop(r, p, t);
info = calc_info(r, p, t);
+ err = calc_err(r, p, t);
if (info > 0) {
i_str = "bulk-average";
info = (pps+drop) / info; /* calc avg bulk */
}
+ if (err > 0)
+ err_str = "drv-err";
if (pps > 0 || drop > 0)
printf(fmt1, "devmap-xmit",
- i, pps, drop, info, i_str);
+ i, pps, drop, info, i_str, err_str);
}
pps = calc_pps(&rec->total, &prev->total, t);
drop = calc_drop(&rec->total, &prev->total, t);
info = calc_info(&rec->total, &prev->total, t);
+ err = calc_err(&rec->total, &prev->total, t);
if (info > 0) {
i_str = "bulk-average";
info = (pps+drop) / info; /* calc avg bulk */
}
- printf(fmt2, "devmap-xmit", "total", pps, drop, info, i_str);
+ if (err > 0)
+ err_str = "drv-err";
+ printf(fmt2, "devmap-xmit", "total", pps, drop,
+ info, i_str, err_str);
}
printf("\n");
^ permalink raw reply related
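The per-interval error rate that calc_err() computes in the diff above reduces to a counter delta divided by the sampling period. The following is a minimal stand-alone sketch; counter_rate() and drv_err_label() are hypothetical names and are not part of xdp_monitor.

```c
#include <stdint.h>

/*
 * Events per second between two samples of a monotonically increasing
 * counter. Returns 0 when the period is not positive, as calc_err() does.
 */
static double counter_rate(uint64_t now, uint64_t prev, double period)
{
	if (period <= 0)
		return 0.0;
	return (double)(now - prev) / period;
}

/* Label for one output row, derived only from that row's own rate. */
static const char *drv_err_label(double err_rate)
{
	return err_rate > 0 ? "drv-err" : "";
}
```

Computing the label per row is a deliberate choice in this sketch. In the diff, err_str is declared once before the per-CPU loop and is only ever assigned "drv-err", so it appears to stay set for later rows once any CPU has reported an error.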
* [bpf-next V4 PATCH 7/8] xdp/trace: extend tracepoint in devmap with an err
From: Jesper Dangaard Brouer @ 2018-05-18 13:35 UTC (permalink / raw)
To: netdev, Daniel Borkmann, Alexei Starovoitov,
Jesper Dangaard Brouer
Cc: Christoph Hellwig, Björn Töpel, Magnus Karlsson,
makita.toshiaki
In-Reply-To: <152665044141.21055.1276346542020340263.stgit@firesoul>
Extending tracepoint xdp:xdp_devmap_xmit in devmap with an err code
allows people to more easily identify why the ndo_xdp_xmit
call to a given driver is failing.
---
include/trace/events/xdp.h | 10 ++++++----
kernel/bpf/devmap.c | 2 +-
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index 2e9ef0650144..1ecf4c67fcf7 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -234,9 +234,9 @@ TRACE_EVENT(xdp_devmap_xmit,
TP_PROTO(const struct bpf_map *map, u32 map_index,
int sent, int drops,
const struct net_device *from_dev,
- const struct net_device *to_dev),
+ const struct net_device *to_dev, int err),
- TP_ARGS(map, map_index, sent, drops, from_dev, to_dev),
+ TP_ARGS(map, map_index, sent, drops, from_dev, to_dev, err),
TP_STRUCT__entry(
__field(int, map_id)
@@ -246,6 +246,7 @@ TRACE_EVENT(xdp_devmap_xmit,
__field(int, sent)
__field(int, from_ifindex)
__field(int, to_ifindex)
+ __field(int, err)
),
TP_fast_assign(
@@ -256,16 +257,17 @@ TRACE_EVENT(xdp_devmap_xmit,
__entry->sent = sent;
__entry->from_ifindex = from_dev->ifindex;
__entry->to_ifindex = to_dev->ifindex;
+ __entry->err = err;
),
TP_printk("ndo_xdp_xmit"
" map_id=%d map_index=%d action=%s"
" sent=%d drops=%d"
- " from_ifindex=%d to_ifindex=%d",
+ " from_ifindex=%d to_ifindex=%d err=%d",
__entry->map_id, __entry->map_index,
__print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
__entry->sent, __entry->drops,
- __entry->from_ifindex, __entry->to_ifindex)
+ __entry->from_ifindex, __entry->to_ifindex, __entry->err)
);
#endif /* _TRACE_XDP_H */
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 1317629662ae..4dd8f0e3a8d9 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -245,7 +245,7 @@ static int bq_xmit_all(struct bpf_dtab_netdev *obj,
bq->count = 0;
trace_xdp_devmap_xmit(&obj->dtab->map, obj->bit,
- sent, drops, bq->dev_rx, dev);
+ sent, drops, bq->dev_rx, dev, err);
bq->dev_rx = NULL;
return 0;
error:
^ permalink raw reply related
* [bpf-next V4 PATCH 6/8] xdp: change ndo_xdp_xmit API to support bulking
From: Jesper Dangaard Brouer @ 2018-05-18 13:35 UTC (permalink / raw)
To: netdev, Daniel Borkmann, Alexei Starovoitov,
Jesper Dangaard Brouer
Cc: Christoph Hellwig, Björn Töpel, Magnus Karlsson,
makita.toshiaki
In-Reply-To: <152665044141.21055.1276346542020340263.stgit@firesoul>
This patch changes the API for ndo_xdp_xmit to support bulking
xdp_frames.
When the kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge slowdown.
Most of the slowdown is caused by DMA API indirect function calls, but
also by the net_device->ndo_xdp_xmit() call.
Benchmarked patch with CONFIG_RETPOLINE, using xdp_redirect_map with
single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed
performance improved:
for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps
for driver i40e : 6,187,169 pps -> 6,724,519 pps = +537,350 pps
With frames available as a bulk inside the driver's ndo_xdp_xmit call,
further optimizations are possible, like bulk DMA-mapping for TX.
Testing without CONFIG_RETPOLINE shows the same performance for
physical NIC drivers.
The virtual NIC driver tun sees a huge performance boost, as it can
avoid doing per frame producer locking, but instead amortize the
locking cost over the bulk.
V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>
V4: Isolated ndo, driver changes and callers.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 26 +++++++---
drivers/net/ethernet/intel/i40e/i40e_txrx.h | 2 -
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 21 ++++++--
drivers/net/tun.c | 37 +++++++++-----
drivers/net/virtio_net.c | 66 +++++++++++++++++++------
include/linux/netdevice.h | 14 +++--
kernel/bpf/devmap.c | 31 ++++++++----
net/core/filter.c | 8 ++-
8 files changed, 141 insertions(+), 64 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 5efa68de935b..9b698c5acd05 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -3664,14 +3664,19 @@ netdev_tx_t i40e_lan_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
* @dev: netdev
* @xdp: XDP buffer
*
- * Returns Zero if sent, else an error code
+ * Returns number of frames successfully sent. Frames that fail are
+ * freed via XDP return API.
+ *
+ * For error cases, a negative errno code is returned and no frames
+ * are transmitted (caller must handle freeing frames).
**/
-int i40e_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
+int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames)
{
struct i40e_netdev_priv *np = netdev_priv(dev);
unsigned int queue_index = smp_processor_id();
struct i40e_vsi *vsi = np->vsi;
- int err;
+ int drops = 0;
+ int i;
if (test_bit(__I40E_VSI_DOWN, vsi->state))
return -ENETDOWN;
@@ -3679,11 +3684,18 @@ int i40e_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
if (!i40e_enabled_xdp_vsi(vsi) || queue_index >= vsi->num_queue_pairs)
return -ENXIO;
- err = i40e_xmit_xdp_ring(xdpf, vsi->xdp_rings[queue_index]);
- if (err != I40E_XDP_TX)
- return -ENOSPC;
+ for (i = 0; i < n; i++) {
+ struct xdp_frame *xdpf = frames[i];
+ int err;
- return 0;
+ err = i40e_xmit_xdp_ring(xdpf, vsi->xdp_rings[queue_index]);
+ if (err != I40E_XDP_TX) {
+ xdp_return_frame_rx_napi(xdpf);
+ drops++;
+ }
+ }
+
+ return n - drops;
}
/**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index fdd2c55f03a6..eb8804b3d7b6 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -487,7 +487,7 @@ u32 i40e_get_tx_pending(struct i40e_ring *ring, bool in_sw);
void i40e_detect_recover_hung(struct i40e_vsi *vsi);
int __i40e_maybe_stop_tx(struct i40e_ring *tx_ring, int size);
bool __i40e_chk_linearize(struct sk_buff *skb);
-int i40e_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf);
+int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames);
void i40e_xdp_flush(struct net_device *dev);
/**
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6652b201df5b..9645619f7729 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10017,11 +10017,13 @@ static int ixgbe_xdp(struct net_device *dev, struct netdev_bpf *xdp)
}
}
-static int ixgbe_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
+static int ixgbe_xdp_xmit(struct net_device *dev, int n,
+ struct xdp_frame **frames)
{
struct ixgbe_adapter *adapter = netdev_priv(dev);
struct ixgbe_ring *ring;
- int err;
+ int drops = 0;
+ int i;
if (unlikely(test_bit(__IXGBE_DOWN, &adapter->state)))
return -ENETDOWN;
@@ -10033,11 +10035,18 @@ static int ixgbe_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
if (unlikely(!ring))
return -ENXIO;
- err = ixgbe_xmit_xdp_ring(adapter, xdpf);
- if (err != IXGBE_XDP_TX)
- return -ENOSPC;
+ for (i = 0; i < n; i++) {
+ struct xdp_frame *xdpf = frames[i];
+ int err;
- return 0;
+ err = ixgbe_xmit_xdp_ring(adapter, xdpf);
+ if (err != IXGBE_XDP_TX) {
+ xdp_return_frame_rx_napi(xdpf);
+ drops++;
+ }
+ }
+
+ return n - drops;
}
static void ixgbe_xdp_flush(struct net_device *dev)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 44d4f3d25350..d3dcfcb1c4b3 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -70,6 +70,7 @@
#include <net/netns/generic.h>
#include <net/rtnetlink.h>
#include <net/sock.h>
+#include <net/xdp.h>
#include <linux/seq_file.h>
#include <linux/uio.h>
#include <linux/skb_array.h>
@@ -1290,34 +1291,44 @@ static const struct net_device_ops tun_netdev_ops = {
.ndo_get_stats64 = tun_net_get_stats64,
};
-static int tun_xdp_xmit(struct net_device *dev, struct xdp_frame *frame)
+static int tun_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames)
{
struct tun_struct *tun = netdev_priv(dev);
struct tun_file *tfile;
u32 numqueues;
- int ret = 0;
+ int drops = 0;
+ int cnt = n;
+ int i;
rcu_read_lock();
numqueues = READ_ONCE(tun->numqueues);
if (!numqueues) {
- ret = -ENOSPC;
- goto out;
+ rcu_read_unlock();
+ return -ENXIO; /* Caller will free/return all frames */
}
tfile = rcu_dereference(tun->tfiles[smp_processor_id() %
numqueues]);
- /* Encode the XDP flag into lowest bit for consumer to differ
- * XDP buffer from sk_buff.
- */
- if (ptr_ring_produce(&tfile->tx_ring, tun_xdp_to_ptr(frame))) {
- this_cpu_inc(tun->pcpu_stats->tx_dropped);
- ret = -ENOSPC;
+
+ spin_lock(&tfile->tx_ring.producer_lock);
+ for (i = 0; i < n; i++) {
+ struct xdp_frame *xdp = frames[i];
+ /* Encode the XDP flag into lowest bit for consumer to differ
+ * XDP buffer from sk_buff.
+ */
+ void *frame = tun_xdp_to_ptr(xdp);
+
+ if (__ptr_ring_produce(&tfile->tx_ring, frame)) {
+ this_cpu_inc(tun->pcpu_stats->tx_dropped);
+ xdp_return_frame_rx_napi(xdp);
+ drops++;
+ }
}
+ spin_unlock(&tfile->tx_ring.producer_lock);
-out:
rcu_read_unlock();
- return ret;
+ return cnt - drops;
}
static int tun_xdp_tx(struct net_device *dev, struct xdp_buff *xdp)
@@ -1327,7 +1338,7 @@ static int tun_xdp_tx(struct net_device *dev, struct xdp_buff *xdp)
if (unlikely(!frame))
return -EOVERFLOW;
- return tun_xdp_xmit(dev, frame);
+ return tun_xdp_xmit(dev, 1, &frame);
}
static void tun_xdp_flush(struct net_device *dev)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f34794a76c4d..39a0783d1cde 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -419,23 +419,13 @@ static void virtnet_xdp_flush(struct net_device *dev)
virtqueue_kick(sq->vq);
}
-static int __virtnet_xdp_xmit(struct virtnet_info *vi,
- struct xdp_frame *xdpf)
+static int __virtnet_xdp_xmit_one(struct virtnet_info *vi,
+ struct send_queue *sq,
+ struct xdp_frame *xdpf)
{
struct virtio_net_hdr_mrg_rxbuf *hdr;
- struct xdp_frame *xdpf_sent;
- struct send_queue *sq;
- unsigned int len;
- unsigned int qp;
int err;
- qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
- sq = &vi->sq[qp];
-
- /* Free up any pending old buffers before queueing new ones. */
- while ((xdpf_sent = virtqueue_get_buf(sq->vq, &len)) != NULL)
- xdp_return_frame(xdpf_sent);
-
/* virtqueue want to use data area in-front of packet */
if (unlikely(xdpf->metasize > 0))
return -EOPNOTSUPP;
@@ -459,11 +449,40 @@ static int __virtnet_xdp_xmit(struct virtnet_info *vi,
return 0;
}
-static int virtnet_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
+static int __virtnet_xdp_tx_xmit(struct virtnet_info *vi,
+ struct xdp_frame *xdpf)
+{
+ struct xdp_frame *xdpf_sent;
+ struct send_queue *sq;
+ unsigned int len;
+ unsigned int qp;
+
+ qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
+ sq = &vi->sq[qp];
+
+ /* Free up any pending old buffers before queueing new ones. */
+ while ((xdpf_sent = virtqueue_get_buf(sq->vq, &len)) != NULL)
+ xdp_return_frame(xdpf_sent);
+
+ return __virtnet_xdp_xmit_one(vi, sq, xdpf);
+}
+
+static int virtnet_xdp_xmit(struct net_device *dev,
+ int n, struct xdp_frame **frames)
{
struct virtnet_info *vi = netdev_priv(dev);
struct receive_queue *rq = vi->rq;
+ struct xdp_frame *xdpf_sent;
struct bpf_prog *xdp_prog;
+ struct send_queue *sq;
+ unsigned int len;
+ unsigned int qp;
+ int drops = 0;
+ int err;
+ int i;
+
+ qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
+ sq = &vi->sq[qp];
/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
* indicate XDP resources have been successfully allocated.
@@ -472,7 +491,20 @@ static int virtnet_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf)
if (!xdp_prog)
return -ENXIO;
- return __virtnet_xdp_xmit(vi, xdpf);
+ /* Free up any pending old buffers before queueing new ones. */
+ while ((xdpf_sent = virtqueue_get_buf(sq->vq, &len)) != NULL)
+ xdp_return_frame(xdpf_sent);
+
+ for (i = 0; i < n; i++) {
+ struct xdp_frame *xdpf = frames[i];
+
+ err = __virtnet_xdp_xmit_one(vi, sq, xdpf);
+ if (err) {
+ xdp_return_frame_rx_napi(xdpf);
+ drops++;
+ }
+ }
+ return n - drops;
}
static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
@@ -616,7 +648,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
xdpf = convert_to_xdp_frame(&xdp);
if (unlikely(!xdpf))
goto err_xdp;
- err = __virtnet_xdp_xmit(vi, xdpf);
+ err = __virtnet_xdp_tx_xmit(vi, xdpf);
if (unlikely(err)) {
trace_xdp_exception(vi->dev, xdp_prog, act);
goto err_xdp;
@@ -779,7 +811,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
xdpf = convert_to_xdp_frame(&xdp);
if (unlikely(!xdpf))
goto err_xdp;
- err = __virtnet_xdp_xmit(vi, xdpf);
+ err = __virtnet_xdp_tx_xmit(vi, xdpf);
if (unlikely(err)) {
trace_xdp_exception(vi->dev, xdp_prog, act);
if (unlikely(xdp_page != page))
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 03ed492c4e14..debdb6286170 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1185,9 +1185,13 @@ struct dev_ifalias {
* This function is used to set or query state related to XDP on the
* netdevice and manage BPF offload. See definition of
* enum bpf_netdev_command for details.
- * int (*ndo_xdp_xmit)(struct net_device *dev, struct xdp_frame *xdp);
- * This function is used to submit a XDP packet for transmit on a
- * netdevice.
+ * int (*ndo_xdp_xmit)(struct net_device *dev, int n, struct xdp_frame **xdp);
+ * This function is used to submit @n XDP packets for transmit on a
+ * netdevice. Returns the number of frames successfully transmitted;
+ * frames that got dropped are freed/returned via xdp_return_frame().
+ * A negative return value means a general error invoking the ndo:
+ * no frames were xmit'ed and the core caller will free all frames.
+ * TODO: Consider adding a flag to allow sending a flush operation.
* void (*ndo_xdp_flush)(struct net_device *dev);
* This function is used to inform the driver to flush a particular
* xdp tx queue. Must be called on same CPU as xdp_xmit.
@@ -1375,8 +1379,8 @@ struct net_device_ops {
int needed_headroom);
int (*ndo_bpf)(struct net_device *dev,
struct netdev_bpf *bpf);
- int (*ndo_xdp_xmit)(struct net_device *dev,
- struct xdp_frame *xdp);
+ int (*ndo_xdp_xmit)(struct net_device *dev, int n,
+ struct xdp_frame **xdp);
void (*ndo_xdp_flush)(struct net_device *dev);
};
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 6f84100723b0..1317629662ae 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -222,7 +222,7 @@ static int bq_xmit_all(struct bpf_dtab_netdev *obj,
struct xdp_bulk_queue *bq)
{
struct net_device *dev = obj->dev;
- int sent = 0, drops = 0;
+ int sent = 0, drops = 0, err = 0;
int i;
if (unlikely(!bq->count))
@@ -234,23 +234,32 @@ static int bq_xmit_all(struct bpf_dtab_netdev *obj,
prefetch(xdpf);
}
- for (i = 0; i < bq->count; i++) {
- struct xdp_frame *xdpf = bq->q[i];
- int err;
-
- err = dev->netdev_ops->ndo_xdp_xmit(dev, xdpf);
- if (err) {
- drops++;
- xdp_return_frame(xdpf);
- }
- sent++;
+ sent = dev->netdev_ops->ndo_xdp_xmit(dev, bq->count, bq->q);
+ if (sent < 0) {
+ err = sent;
+ sent = 0;
+ goto error;
}
+ drops = bq->count - sent;
+out:
bq->count = 0;
trace_xdp_devmap_xmit(&obj->dtab->map, obj->bit,
sent, drops, bq->dev_rx, dev);
bq->dev_rx = NULL;
return 0;
+error:
+ /* If ndo_xdp_xmit fails with an errno, no frames have been
+ * xmit'ed and it's our responsibility to free them all.
+ */
+ for (i = 0; i < bq->count; i++) {
+ struct xdp_frame *xdpf = bq->q[i];
+
+ /* RX path under NAPI protection, can return frames faster */
+ xdp_return_frame_rx_napi(xdpf);
+ drops++;
+ }
+ goto out;
}
/* __dev_map_flush is called from xdp_do_flush_map() which _must_ be signaled
diff --git a/net/core/filter.c b/net/core/filter.c
index 4a93423cc5ea..19504b7f4959 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3035,7 +3035,7 @@ static int __bpf_tx_xdp(struct net_device *dev,
u32 index)
{
struct xdp_frame *xdpf;
- int err;
+ int sent;
if (!dev->netdev_ops->ndo_xdp_xmit) {
return -EOPNOTSUPP;
@@ -3045,9 +3045,9 @@ static int __bpf_tx_xdp(struct net_device *dev,
if (unlikely(!xdpf))
return -EOVERFLOW;
- err = dev->netdev_ops->ndo_xdp_xmit(dev, xdpf);
- if (err)
- return err;
+ sent = dev->netdev_ops->ndo_xdp_xmit(dev, 1, &xdpf);
+ if (sent <= 0)
+ return sent;
dev->netdev_ops->ndo_xdp_flush(dev);
return 0;
}
* [bpf-next V4 PATCH 5/8] xdp: introduce xdp_return_frame_rx_napi
From: Jesper Dangaard Brouer @ 2018-05-18 13:34 UTC (permalink / raw)
To: netdev, Daniel Borkmann, Alexei Starovoitov,
Jesper Dangaard Brouer
Cc: Christoph Hellwig, Björn Töpel, Magnus Karlsson,
makita.toshiaki
In-Reply-To: <152665044141.21055.1276346542020340263.stgit@firesoul>
When sending an xdp_frame through an xdp_do_redirect call, error cases
can occur where the xdp_frame needs to be dropped, and returning an
-errno code is no longer sufficient or possible (e.g. for the cpumap
case). This is already fully supported, by simply calling
xdp_return_frame.
This patch is an optimization that provides xdp_return_frame_rx_napi,
a faster variant for these error cases. It takes advantage of the
protection provided by XDP RX running under NAPI.
This change is mostly relevant for drivers using the page_pool
allocator as it can take advantage of this. (Tested with mlx5).
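The structure of the change (one internal helper taking a bool, two
thin public wrappers) can be sketched in user space like this. The
struct pool and its "direct cache" are a toy assumption used only to
make the flag observable; they do not describe how the real page_pool
recycles pages, and all names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 4

/* Toy allocator: a small direct cache plus a counter for the slow path. */
struct pool {
	void *direct_cache[CACHE_SIZE];
	int direct_count;
	int slow_returns;
};

/* Internal helper: allow_direct selects the fast path when there is room. */
static void pool_put(struct pool *p, void *page, bool allow_direct)
{
	if (allow_direct && p->direct_count < CACHE_SIZE) {
		p->direct_cache[p->direct_count++] = page;
		return;
	}
	p->slow_returns++;
}

/* Safe from any context in this model: never uses the direct cache. */
static void frame_return(struct pool *p, void *page)
{
	pool_put(p, page, false);
}

/* Only for callers that hold the RX-side protection (NAPI in the patch). */
static void frame_return_rx_napi(struct pool *p, void *page)
{
	pool_put(p, page, true);
}
```

Keeping the flag out of the public wrappers means call sites state
their context by the function they pick rather than by passing a bare
true/false.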
---
include/net/page_pool.h | 5 +++--
include/net/xdp.h | 1 +
kernel/bpf/cpumap.c | 2 +-
net/core/xdp.c | 20 ++++++++++++++++----
4 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index c79087153148..694d055e01ef 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -115,13 +115,14 @@ void page_pool_destroy(struct page_pool *pool);
void __page_pool_put_page(struct page_pool *pool,
struct page *page, bool allow_direct);
-static inline void page_pool_put_page(struct page_pool *pool, struct page *page)
+static inline void page_pool_put_page(struct page_pool *pool,
+ struct page *page, bool allow_direct)
{
/* When page_pool isn't compiled-in, net/core/xdp.c doesn't
* allow registering MEM_TYPE_PAGE_POOL, but shield linker.
*/
#ifdef CONFIG_PAGE_POOL
- __page_pool_put_page(pool, page, false);
+ __page_pool_put_page(pool, page, allow_direct);
#endif
}
/* Very limited use-cases allow recycle direct */
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 0b689cf561c7..7ad779237ae8 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -104,6 +104,7 @@ struct xdp_frame *convert_to_xdp_frame(struct xdp_buff *xdp)
}
void xdp_return_frame(struct xdp_frame *xdpf);
+void xdp_return_frame_rx_napi(struct xdp_frame *xdpf);
void xdp_return_buff(struct xdp_buff *xdp);
int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq,
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index c95b04ec103e..e0918d180f08 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -578,7 +578,7 @@ static int bq_flush_to_queue(struct bpf_cpu_map_entry *rcpu,
err = __ptr_ring_produce(q, xdpf);
if (err) {
drops++;
- xdp_return_frame(xdpf);
+ xdp_return_frame_rx_napi(xdpf);
}
processed++;
}
diff --git a/net/core/xdp.c b/net/core/xdp.c
index bf6758f74339..cb8c4e061a5a 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -308,7 +308,13 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
}
EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
-static void xdp_return(void *data, struct xdp_mem_info *mem)
+/* XDP RX runs under NAPI protection, and in different delivery error
+ * scenarios (e.g. queue full), it is possible to return the xdp_frame
+ * while still leveraging this protection. The @napi_direct boolean
+ * is used for those call sites, thus allowing for faster recycling
+ * of xdp_frames/pages in those cases.
+ */
+static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct)
{
struct xdp_mem_allocator *xa;
struct page *page;
@@ -320,7 +326,7 @@ static void xdp_return(void *data, struct xdp_mem_info *mem)
xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params);
page = virt_to_head_page(data);
if (xa)
- page_pool_put_page(xa->page_pool, page);
+ page_pool_put_page(xa->page_pool, page, napi_direct);
else
put_page(page);
rcu_read_unlock();
@@ -340,12 +346,18 @@ static void xdp_return(void *data, struct xdp_mem_info *mem)
void xdp_return_frame(struct xdp_frame *xdpf)
{
- xdp_return(xdpf->data, &xdpf->mem);
+ __xdp_return(xdpf->data, &xdpf->mem, false);
}
EXPORT_SYMBOL_GPL(xdp_return_frame);
+void xdp_return_frame_rx_napi(struct xdp_frame *xdpf)
+{
+ __xdp_return(xdpf->data, &xdpf->mem, true);
+}
+EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi);
+
void xdp_return_buff(struct xdp_buff *xdp)
{
- xdp_return(xdp->data, &xdp->rxq->mem);
+ __xdp_return(xdp->data, &xdp->rxq->mem, true);
}
EXPORT_SYMBOL_GPL(xdp_return_buff);