* [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation
@ 2013-10-16 15:24 Jimmy Perchet
2013-10-16 15:24 ` [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size Jimmy Perchet
` (5 more replies)
0 siblings, 6 replies; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
Hello,
I began using the Synopsys IP a few weeks ago and figured out that jumbo frames
are not well supported by the stmmac driver.
This patch series addresses several issues which prevent them from working
properly:
*(1/5) Threshold dma mode is needed on rx path if jumbo frames are expected.
*(2/5) RX buffers are not allocated with the needed size because
priv->dma_buf_sz is updated too late (i.e. after stmmac_init_rx_buffers call)
*(3/5) On a low speed link (10MBit/s), some TX descriptors can remain dirty
if the tx coalescence timer expires before they have been processed. Re-arm the
timer in this case.
*(4/5) There are several confusions regarding the descriptor's max buffer size;
typically the "-1" is often forgotten.
*(4/5) Jumbo frames' last descriptor is never "closed", resulting in truncated
frame transfers.
*(4/5) Frags could not be jumbo.
Regarding these last points, I didn't find a simpler way than writing
new "prepare frame" functions for both ring and chain mode and updating the
xmit function accordingly.
The last patch is not related to jumbo frames but concerns a possible
optimisation:
*(5/5) Tx descriptor cleanup and preparation are serialized, which is not
necessary and decreases performance. In addition, TX descriptor cleanup is
performed in the NET_RX softirq, which is confusing.
By taking care of "cur_tx" and "dirty_tx" it is possible to avoid this
serialization and defer cleanup to a workqueue.
On my smp embedded system, with a 1Gbit/s link which is cpu bound, this increases
throughput by about 90MBit/s (400MBit/s to 490MBit/s).
Best Regards,
Jimmy Perchet
Jimmy Perchet (5):
net:stmmac: set threshold/store and forward mode according to mtu
size.
net:stmmac: fix rx buffer allocation.
net:stmmac: ensure we reclaim all dirty descriptors.
net:stmmac: fix jumbo frame handling.
net:stmmac: asynchronous tx_clean
drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 99 +++++-----
drivers/net/ethernet/stmicro/stmmac/common.h | 6 +
drivers/net/ethernet/stmicro/stmmac/descs_com.h | 8 +-
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 +
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 +
drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 90 ++++-----
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 219 +++++++++++++---------
8 files changed, 233 insertions(+), 207 deletions(-)
--
1.8.1.2
* [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size.
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
@ 2013-10-16 15:24 ` Jimmy Perchet
2013-10-21 8:47 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation Jimmy Perchet
` (4 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
Threshold mode dma is needed on rx path if jumbo frames are expected.
Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 48 ++++++++++++++++-------
1 file changed, 33 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 8d4ccd3..170f043 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1222,22 +1222,40 @@ static void free_dma_desc_resources(struct stmmac_priv *priv)
* Description: it sets the DMA operation mode: tx/rx DMA thresholds
* or Store-And-Forward capability.
*/
-static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
-{
- if (priv->plat->force_thresh_dma_mode)
- priv->hw->dma->dma_mode(priv->ioaddr, tc, tc);
- else if (priv->plat->force_sf_dma_mode || priv->plat->tx_coe) {
- /*
- * In case of GMAC, SF mode can be enabled
- * to perform the TX COE in HW. This depends on:
- * 1) TX COE if actually supported
- * 2) There is no bugged Jumbo frame support
- * that needs to not insert csum in the TDES.
- */
- priv->hw->dma->dma_mode(priv->ioaddr, SF_DMA_MODE, SF_DMA_MODE);
+static void stmmac_dma_operation_mode(int mtu, struct stmmac_priv *priv)
+{
+ int rx_tc, tx_tc;
+
+ /*
+ * In case of GMAC, SF mode can be enabled
+ * to perform the TX COE in HW. This depends on:
+ * 1) TX COE if actually supported
+ * 2) There is no bugged Jumbo frame support
+ * that needs to not insert csum in the TDES.
+ */
+ if (priv->plat->tx_coe &&
+ !(priv->plat->bugged_jumbo && (mtu > ETH_DATA_LEN))) {
tc = SF_DMA_MODE;
+ tx_tc = SF_DMA_MODE;
} else
- priv->hw->dma->dma_mode(priv->ioaddr, tc, SF_DMA_MODE);
+ tx_tc = tc;
+
+ if (mtu > ETH_DATA_LEN && priv->hw_cap_support
+ && !priv->dma_cap.rxfifo_over_2048)
+ rx_tc = tc;
+ else
+ rx_tc = SF_DMA_MODE;
+
+ if (priv->plat->force_sf_dma_mode) {
+ tc = SF_DMA_MODE;
+ tx_tc = SF_DMA_MODE;
+ rx_tc = SF_DMA_MODE;
+ } else if (priv->plat->force_thresh_dma_mode) {
+ tx_tc = tc;
+ rx_tc = tc;
+ }
+
+ priv->hw->dma->dma_mode(priv->ioaddr, tx_tc, rx_tc);
}
/**
@@ -1687,7 +1705,7 @@ static int stmmac_open(struct net_device *dev)
stmmac_set_mac(priv->ioaddr, true);
/* Set the HW DMA mode and the COE */
- stmmac_dma_operation_mode(priv);
+ stmmac_dma_operation_mode(dev->mtu, priv);
/* Extra statistics */
memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
--
1.8.1.2
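In short, the mode selection the new stmmac_dma_operation_mode() implements,
restated as a sketch (same names as the patch above; the update of the
module-wide 'tc' in the COE case is omitted here):

	/* TX: store-and-forward when TX COE is usable, threshold otherwise */
	tx_tc = (priv->plat->tx_coe &&
		 !(priv->plat->bugged_jumbo && mtu > ETH_DATA_LEN))
		? SF_DMA_MODE : tc;

	/* RX: a FIFO capped at 2KiB cannot store-and-forward a jumbo frame */
	rx_tc = (mtu > ETH_DATA_LEN && priv->hw_cap_support &&
		 !priv->dma_cap.rxfifo_over_2048)
		? tc : SF_DMA_MODE;

	/* platform overrides still win on both paths */
	if (priv->plat->force_sf_dma_mode)
		tx_tc = rx_tc = SF_DMA_MODE;
	else if (priv->plat->force_thresh_dma_mode)
		tx_tc = rx_tc = tc;

	priv->hw->dma->dma_mode(priv->ioaddr, tx_tc, rx_tc);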
* [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation.
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
2013-10-16 15:24 ` [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size Jimmy Perchet
@ 2013-10-16 15:24 ` Jimmy Perchet
2013-10-21 8:54 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors Jimmy Perchet
` (3 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
Rx buffers used the wrong size because priv->dma_buf_sz was updated after allocation.
Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 170f043..0015175 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -998,6 +998,9 @@ static int init_dma_desc_rings(struct net_device *dev)
if (bfsize < BUF_SIZE_16KiB)
bfsize = stmmac_set_bfsize(dev->mtu, priv->dma_buf_sz);
+ priv->dma_buf_sz = bfsize;
+ buf_sz = bfsize;
+
if (netif_msg_probe(priv))
pr_debug("%s: txsize %d, rxsize %d, bfsize %d\n", __func__,
txsize, rxsize, bfsize);
@@ -1087,8 +1090,6 @@ static int init_dma_desc_rings(struct net_device *dev)
}
priv->cur_rx = 0;
priv->dirty_rx = (unsigned int)(i - rxsize);
- priv->dma_buf_sz = bfsize;
- buf_sz = bfsize;
/* Setup the chained descriptor addresses */
if (priv->mode == STMMAC_CHAIN_MODE) {
--
1.8.1.2
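In short, the bug fixed here is a plain ordering issue: stmmac_init_rx_buffers(),
called from the allocation loop that sits between the two hunks, reads
priv->dma_buf_sz, which was only assigned afterwards. Schematically (a
simplification, not the actual call sequence):

	/* before: the consumer ran with a stale value */
	stmmac_init_rx_buffers(priv, ...);   /* reads priv->dma_buf_sz */
	priv->dma_buf_sz = bfsize;           /* published too late */

	/* after: publish the computed size first */
	priv->dma_buf_sz = bfsize;
	stmmac_init_rx_buffers(priv, ...);   /* now sees the right size */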
* [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
2013-10-16 15:24 ` [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size Jimmy Perchet
2013-10-16 15:24 ` [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation Jimmy Perchet
@ 2013-10-16 15:24 ` Jimmy Perchet
2013-10-16 17:46 ` Sergei Shtylyov
2013-10-21 9:07 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling Jimmy Perchet
` (2 subsequent siblings)
5 siblings, 2 replies; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
On a low speed link (10MBit/s), some TX descriptors can remain dirty
if the tx coalescence timer expires before they have been processed. Re-arm the
timer in this case.
Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 0015175..af04b5d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1284,8 +1284,12 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
p = priv->dma_tx + entry;
/* Check if the descriptor is owned by the DMA. */
- if (priv->hw->desc->get_tx_owner(p))
+ if (priv->hw->desc->get_tx_owner(p)) {
+ /* Be sure to harvest remaining descriptor. */
+ mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
break;
+ }
/* Verify tx error by looking at the last segment. */
last = priv->hw->desc->get_tx_ls(p);
--
1.8.1.2
* [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
` (2 preceding siblings ...)
2013-10-16 15:24 ` [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors Jimmy Perchet
@ 2013-10-16 15:24 ` Jimmy Perchet
2013-10-21 13:40 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean Jimmy Perchet
2013-10-16 20:37 ` [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Giuseppe CAVALLARO
5 siblings, 1 reply; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
This patch addresses several issues which prevent jumbo frames from working properly:
.jumbo frames' last descriptor was not closed
.several confusions regarding the descriptor's max buffer size
.frags could not be jumbo
Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
---
drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 95 ++++++++----------
drivers/net/ethernet/stmicro/stmmac/common.h | 6 ++
drivers/net/ethernet/stmicro/stmmac/descs_com.h | 8 +-
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 ++
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 ++
drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 90 ++++++++---------
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 112 +++++++++++-----------
7 files changed, 158 insertions(+), 165 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index d234ab5..d6ed0ce 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -28,70 +28,58 @@
#include "stmmac.h"
-static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
+static unsigned int stmmac_prepare_frm(void *p, void *data, unsigned int len,
+ int csum, unsigned int entry)
{
struct stmmac_priv *priv = (struct stmmac_priv *)p;
- unsigned int txsize = priv->dma_tx_size;
- unsigned int entry = priv->cur_tx % txsize;
- struct dma_desc *desc = priv->dma_tx + entry;
- unsigned int nopaged_len = skb_headlen(skb);
+ unsigned int entry_count = 0;
+ struct dma_desc *desc;
unsigned int bmax;
- unsigned int i = 1, len;
- if (priv->plat->enh_desc)
- bmax = BUF_SIZE_8KiB;
- else
- bmax = BUF_SIZE_2KiB;
-
- len = nopaged_len - bmax;
-
- desc->des2 = dma_map_single(priv->device, skb->data,
- bmax, DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum, STMMAC_CHAIN_MODE);
-
- while (len != 0) {
- entry = (++priv->cur_tx) % txsize;
+ if (priv->plat->enh_desc) {
+ desc = (struct dma_desc *)(priv->dma_etx + entry);
+ bmax = BUF_SIZE_8KiB - 1;
+ } else{
desc = priv->dma_tx + entry;
-
- if (len > bmax) {
- desc->des2 = dma_map_single(priv->device,
- (skb->data + bmax * i),
- bmax, DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
- STMMAC_CHAIN_MODE);
- priv->hw->desc->set_tx_owner(desc);
- priv->tx_skbuff[entry] = NULL;
- len -= bmax;
- i++;
- } else {
- desc->des2 = dma_map_single(priv->device,
- (skb->data + bmax * i), len,
- DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
- STMMAC_CHAIN_MODE);
- priv->hw->desc->set_tx_owner(desc);
- priv->tx_skbuff[entry] = NULL;
- len = 0;
- }
+ bmax = BUF_SIZE_2KiB - 1;
}
- return entry;
-}
-static unsigned int stmmac_is_jumbo_frm(int len, int enh_desc)
-{
- unsigned int ret = 0;
+ while (len > bmax) {
+ desc->des2 = dma_map_single(priv->device,
+ data + bmax*entry_count,
+ bmax, DMA_TO_DEVICE);
+ priv->tx_skbuff_dma[entry] = desc->des2;
+ priv->tx_skbuff[entry] = NULL;
+ priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
+ STMMAC_CHAIN_MODE);
+
+ len -= bmax;
+ entry++;
+ entry %= priv->dma_tx_size;
+ entry_count++;
+
+ if (priv->plat->enh_desc)
+ desc = (struct dma_desc *)
+ (((struct dma_extended_desc *)desc)+1);
+ else
+ desc++;
+ }
- if ((enh_desc && (len > BUF_SIZE_8KiB)) ||
- (!enh_desc && (len > BUF_SIZE_2KiB))) {
- ret = 1;
+ if (len) {
+ desc->des2 = dma_map_single(priv->device,
+ data + bmax*entry_count,
+ len, DMA_TO_DEVICE);
+ priv->tx_skbuff_dma[entry] = desc->des2;
+ priv->tx_skbuff[entry] = NULL;
+ priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
+ STMMAC_CHAIN_MODE);
+ entry_count++;
}
- return ret;
+ return entry_count;
}
+
static void stmmac_init_dma_chain(void *des, dma_addr_t phy_addr,
unsigned int size, unsigned int extend_desc)
{
@@ -154,8 +142,7 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
const struct stmmac_chain_mode_ops chain_mode_ops = {
.init = stmmac_init_dma_chain,
- .is_jumbo_frm = stmmac_is_jumbo_frm,
- .jumbo_frm = stmmac_jumbo_frm,
+ .prepare_frm = stmmac_prepare_frm,
.refill_desc3 = stmmac_refill_desc3,
.clean_desc3 = stmmac_clean_desc3,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 7eb8bab..5d3f734 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -323,6 +323,8 @@ struct stmmac_desc_ops {
/* Handle extra events on specific interrupts hw dependent */
int (*get_rx_owner) (struct dma_desc *p);
void (*set_rx_owner) (struct dma_desc *p);
+
+ void (*set_tx_first) (struct dma_desc *p);
/* Get the receive frame size */
int (*get_rx_frame_len) (struct dma_desc *p, int rx_coe_type);
/* Return the reception status looking at the RDES1 */
@@ -421,6 +423,8 @@ struct mii_regs {
struct stmmac_ring_mode_ops {
unsigned int (*is_jumbo_frm) (int len, int ehn_desc);
unsigned int (*jumbo_frm) (void *priv, struct sk_buff *skb, int csum);
+ unsigned int (*prepare_frm) (void *p, void *data, unsigned int len,
+ int csum, unsigned int entry);
void (*refill_desc3) (void *priv, struct dma_desc *p);
void (*init_desc3) (struct dma_desc *p);
void (*clean_desc3) (void *priv, struct dma_desc *p);
@@ -432,6 +436,8 @@ struct stmmac_chain_mode_ops {
unsigned int extend_desc);
unsigned int (*is_jumbo_frm) (int len, int ehn_desc);
unsigned int (*jumbo_frm) (void *priv, struct sk_buff *skb, int csum);
+ unsigned int (*prepare_frm) (void *p, void *data, unsigned int len,
+ int csum, unsigned int entry);
void (*refill_desc3) (void *priv, struct dma_desc *p);
void (*clean_desc3) (void *priv, struct dma_desc *p);
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/descs_com.h b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
index 6f2cc78..cf199d6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/descs_com.h
+++ b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
@@ -53,9 +53,9 @@ static inline void enh_desc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
static inline void enh_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
{
- if (unlikely(len > BUF_SIZE_4KiB)) {
- p->des01.etx.buffer1_size = BUF_SIZE_4KiB;
- p->des01.etx.buffer2_size = len - BUF_SIZE_4KiB;
+ if (unlikely(len >= BUF_SIZE_8KiB)) {
+ p->des01.etx.buffer1_size = BUF_SIZE_8KiB - 1;
+ p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
} else
p->des01.etx.buffer1_size = len;
}
@@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
{
- if (unlikely(len > BUF_SIZE_2KiB)) {
+ if (unlikely(len >= BUF_SIZE_2KiB)) {
p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
} else
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 7e6628a..915a7ab 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -297,6 +297,11 @@ static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
enh_desc_end_tx_desc_on_ring(p, ter);
}
+static void enh_desc_set_tx_first(struct dma_desc *p)
+{
+ p->des01.etx.first_segment = 1;
+}
+
static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
int csum_flag, int mode)
{
@@ -393,6 +398,7 @@ const struct stmmac_desc_ops enh_desc_ops = {
.get_tx_ls = enh_desc_get_tx_ls,
.set_tx_owner = enh_desc_set_tx_owner,
.set_rx_owner = enh_desc_set_rx_owner,
+ .set_tx_first = enh_desc_set_tx_first,
.get_rx_frame_len = enh_desc_get_rx_frame_len,
.rx_extended_status = enh_desc_get_ext_status,
.enable_tx_timestamp = enh_desc_enable_tx_timestamp,
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 35ad4f4..6363776 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -180,6 +180,11 @@ static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
ndesc_end_tx_desc_on_ring(p, ter);
}
+static void ndesc_set_tx_first(struct dma_desc *p)
+{
+ p->des01.etx.first_segment = 1;
+}
+
static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
int csum_flag, int mode)
{
@@ -265,6 +270,7 @@ const struct stmmac_desc_ops ndesc_ops = {
.get_tx_ls = ndesc_get_tx_ls,
.set_tx_owner = ndesc_set_tx_owner,
.set_rx_owner = ndesc_set_rx_owner,
+ .set_tx_first = ndesc_set_tx_first,
.get_rx_frame_len = ndesc_get_rx_frame_len,
.enable_tx_timestamp = ndesc_enable_tx_timestamp,
.get_tx_timestamp_status = ndesc_get_tx_timestamp_status,
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index 1ef9d8a..7faa42a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -28,73 +28,60 @@
#include "stmmac.h"
-static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
+
+static unsigned int stmmac_prepare_frm(void *p, void *data, unsigned int len,
+ int csum, unsigned int entry)
{
struct stmmac_priv *priv = (struct stmmac_priv *)p;
- unsigned int txsize = priv->dma_tx_size;
- unsigned int entry = priv->cur_tx % txsize;
+ unsigned int entry_count = 0;
struct dma_desc *desc;
- unsigned int nopaged_len = skb_headlen(skb);
- unsigned int bmax, len;
+ unsigned int bmax;
- if (priv->extend_desc)
+ if (priv->plat->enh_desc) {
desc = (struct dma_desc *)(priv->dma_etx + entry);
- else
+ bmax = BUF_SIZE_8KiB - 1;
+ } else{
desc = priv->dma_tx + entry;
+ bmax = BUF_SIZE_2KiB - 1;
+ }
- if (priv->plat->enh_desc)
- bmax = BUF_SIZE_8KiB;
- else
- bmax = BUF_SIZE_2KiB;
-
- len = nopaged_len - bmax;
-
- if (nopaged_len > BUF_SIZE_8KiB) {
-
- desc->des2 = dma_map_single(priv->device, skb->data,
- bmax, DMA_TO_DEVICE);
+ while (len > 2*bmax) {
+ desc->des2 = dma_map_single(priv->device,
+ data + 2*bmax*entry_count,
+ 2*bmax, DMA_TO_DEVICE);
priv->tx_skbuff_dma[entry] = desc->des2;
- desc->des3 = desc->des2 + BUF_SIZE_4KiB;
- priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
+ priv->tx_skbuff[entry] = NULL;
+ desc->des3 = desc->des2 + bmax;
+ priv->hw->desc->prepare_tx_desc(desc, 0, 2*bmax, csum,
STMMAC_RING_MODE);
- wmb();
- entry = (++priv->cur_tx) % txsize;
- if (priv->extend_desc)
- desc = (struct dma_desc *)(priv->dma_etx + entry);
+ len -= 2*bmax;
+ entry++;
+ entry %= priv->dma_tx_size;
+ entry_count++;
+
+ if (priv->plat->enh_desc)
+ desc = (struct dma_desc *)
+ (((struct dma_extended_desc *)desc)+1);
else
- desc = priv->dma_tx + entry;
+ desc++;
+ }
+ if (len) {
- desc->des2 = dma_map_single(priv->device, skb->data + bmax,
- len, DMA_TO_DEVICE);
+ desc->des2 = dma_map_single(priv->device,
+ data + 2*bmax*entry_count,
+ len, DMA_TO_DEVICE);
priv->tx_skbuff_dma[entry] = desc->des2;
- desc->des3 = desc->des2 + BUF_SIZE_4KiB;
- priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
- STMMAC_RING_MODE);
- wmb();
- priv->hw->desc->set_tx_owner(desc);
priv->tx_skbuff[entry] = NULL;
- } else {
- desc->des2 = dma_map_single(priv->device, skb->data,
- nopaged_len, DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- desc->des3 = desc->des2 + BUF_SIZE_4KiB;
- priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len, csum,
+ desc->des3 = desc->des2 + bmax;
+ priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
STMMAC_RING_MODE);
+ entry_count++;
}
- return entry;
+ return entry_count;
}
-static unsigned int stmmac_is_jumbo_frm(int len, int enh_desc)
-{
- unsigned int ret = 0;
-
- if (len >= BUF_SIZE_4KiB)
- ret = 1;
-
- return ret;
-}
static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
{
@@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
if (unlikely(priv->plat->has_gmac))
/* Fill DES3 in case of RING mode */
if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
- p->des3 = p->des2 + BUF_SIZE_8KiB;
+ p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
}
/* In ring mode we need to fill the desc3 because it is used as buffer */
static void stmmac_init_desc3(struct dma_desc *p)
{
- p->des3 = p->des2 + BUF_SIZE_8KiB;
+ p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
}
static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
@@ -127,8 +114,7 @@ static int stmmac_set_16kib_bfsize(int mtu)
}
const struct stmmac_ring_mode_ops ring_mode_ops = {
- .is_jumbo_frm = stmmac_is_jumbo_frm,
- .jumbo_frm = stmmac_jumbo_frm,
+ .prepare_frm = stmmac_prepare_frm,
.refill_desc3 = stmmac_refill_desc3,
.init_desc3 = stmmac_init_desc3,
.clean_desc3 = stmmac_clean_desc3,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index af04b5d..5873246 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1832,6 +1832,20 @@ static int stmmac_release(struct net_device *dev)
return 0;
}
+
+static struct dma_desc *get_desc(struct stmmac_priv *priv, unsigned int entry)
+{
+ struct dma_desc *desc;
+
+ if (priv->plat->enh_desc)
+ desc = (struct dma_desc *)(priv->dma_etx + entry);
+ else
+ desc = priv->dma_tx + entry;
+
+ return desc;
+}
+
+
/**
* stmmac_xmit: Tx entry point of the driver
* @skb : the socket buffer
@@ -1844,11 +1858,10 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct stmmac_priv *priv = netdev_priv(dev);
unsigned int txsize = priv->dma_tx_size;
- unsigned int entry;
- int i, csum_insertion = 0, is_jumbo = 0;
+ unsigned int entry, first_entry, nb_desc = 0;
+ int i, csum_insertion = 0;
int nfrags = skb_shinfo(skb)->nr_frags;
- struct dma_desc *desc, *first;
- unsigned int nopaged_len = skb_headlen(skb);
+ struct dma_desc *desc = NULL, *first;
if (unlikely(stmmac_tx_avail(priv) < nfrags + 1)) {
if (!netif_queue_stopped(dev)) {
@@ -1858,73 +1871,53 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
}
return NETDEV_TX_BUSY;
}
-
spin_lock(&priv->tx_lock);
if (priv->tx_path_in_lpi_mode)
stmmac_disable_eee_mode(priv);
- entry = priv->cur_tx % txsize;
+ first_entry = entry = priv->cur_tx % txsize;
csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
-
- if (priv->extend_desc)
- desc = (struct dma_desc *)(priv->dma_etx + entry);
+ /* To program the descriptors according to the size of the frame */
+ if (priv->mode == STMMAC_RING_MODE)
+ entry += priv->hw->ring->prepare_frm(priv, skb->data,
+ skb_headlen(skb), csum_insertion, entry);
else
- desc = priv->dma_tx + entry;
+ entry += priv->hw->chain->prepare_frm(priv, skb->data,
+ skb_headlen(skb), csum_insertion, entry);
- first = desc;
-
- priv->tx_skbuff[entry] = skb;
-
- /* To program the descriptors according to the size of the frame */
- if (priv->mode == STMMAC_RING_MODE) {
- is_jumbo = priv->hw->ring->is_jumbo_frm(skb->len,
- priv->plat->enh_desc);
- if (unlikely(is_jumbo))
- entry = priv->hw->ring->jumbo_frm(priv, skb,
- csum_insertion);
- } else {
- is_jumbo = priv->hw->chain->is_jumbo_frm(skb->len,
- priv->plat->enh_desc);
- if (unlikely(is_jumbo))
- entry = priv->hw->chain->jumbo_frm(priv, skb,
- csum_insertion);
- }
- if (likely(!is_jumbo)) {
- desc->des2 = dma_map_single(priv->device, skb->data,
- nopaged_len, DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len,
- csum_insertion, priv->mode);
- } else
- desc = first;
+ entry %= txsize;
for (i = 0; i < nfrags; i++) {
const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- int len = skb_frag_size(frag);
- entry = (++priv->cur_tx) % txsize;
- if (priv->extend_desc)
- desc = (struct dma_desc *)(priv->dma_etx + entry);
+ if (priv->mode == STMMAC_RING_MODE)
+ entry += priv->hw->ring->prepare_frm(priv,
+ skb_frag_address(frag),
+ skb_frag_size(frag),
+ csum_insertion, entry);
else
- desc = priv->dma_tx + entry;
-
- desc->des2 = skb_frag_dma_map(priv->device, frag, 0, len,
- DMA_TO_DEVICE);
- priv->tx_skbuff_dma[entry] = desc->des2;
- priv->tx_skbuff[entry] = NULL;
- priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion,
- priv->mode);
- wmb();
- priv->hw->desc->set_tx_owner(desc);
- wmb();
- }
-
+ entry += priv->hw->chain->prepare_frm(priv,
+ skb_frag_address(frag),
+ skb_frag_size(frag),
+ csum_insertion, entry);
+
+ entry %= txsize;
+ }
+ /*Set owner for all segment but the first one */
+ for (i = first_entry; i != entry;) {
+ desc = get_desc(priv, i);
+ nb_desc++;
+ if (i != first_entry)
+ priv->hw->desc->set_tx_owner(desc);
+ i++;
+ i %= txsize;
+ }
+ BUG_ON(desc == NULL);
/* Finalize the latest segment. */
priv->hw->desc->close_tx_desc(desc);
- wmb();
/* According to the coalesce parameter the IC bit for the latest
* segment could be reset and the timer re-started to invoke the
* stmmac_tx function. This approach takes care about the fragments.
@@ -1938,11 +1931,20 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
} else
priv->tx_count_frames = 0;
+ /*Prepare first segment */
+ priv->tx_skbuff[first_entry] = skb;
+
+ first = get_desc(priv, first_entry);
+
+ priv->hw->desc->set_tx_first(first);
+
+ wmb();
+
/* To avoid raise condition */
priv->hw->desc->set_tx_owner(first);
wmb();
- priv->cur_tx++;
+ priv->cur_tx += nb_desc;
if (netif_msg_pktdata(priv)) {
pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
--
1.8.1.2
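To make the recurring "-1" concrete: as the commit message notes, a
descriptor's buffer length field can encode at most BUF_SIZE_xKiB - 1 bytes,
so any split must use that bound. A standalone sketch of the chain-mode split
arithmetic the new prepare_frm loop performs (assumed frame length, not driver
code):

	#include <stdio.h>

	#define BUF_SIZE_8KiB 8192

	int main(void)
	{
		unsigned int len = 9014;               /* e.g. a 9KiB jumbo frame */
		unsigned int bmax = BUF_SIZE_8KiB - 1; /* max encodable length */
		unsigned int ndesc = 0;

		while (len > bmax) {    /* one buffer per chained descriptor */
			len -= bmax;
			ndesc++;
		}
		if (len)                /* remainder fills the last descriptor */
			ndesc++;
		printf("descriptors needed: %u\n", ndesc);  /* prints 2 */
		return 0;
	}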
* [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
` (3 preceding siblings ...)
2013-10-16 15:24 ` [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling Jimmy Perchet
@ 2013-10-16 15:24 ` Jimmy Perchet
2013-10-21 13:52 ` Giuseppe CAVALLARO
2013-10-16 20:37 ` [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Giuseppe CAVALLARO
5 siblings, 1 reply; 26+ messages in thread
From: Jimmy Perchet @ 2013-10-16 15:24 UTC (permalink / raw)
To: peppe.cavallaro; +Cc: netdev, Jimmy Perchet
Tx descriptor cleanup and preparation are serialized, which is not necessary
and decreases performance.
In addition, TX descriptor cleanup is performed in the NET_RX softirq, which is
confusing.
This patch unserializes tx descriptor cleanup and preparation
and defers cleanup to a workqueue.
Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
---
drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 6 +--
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 ++-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 52 +++++++++++++----------
3 files changed, 37 insertions(+), 27 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index d6ed0ce..8ab83cb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -120,7 +120,7 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
* to keep explicit chaining in the descriptor.
*/
p->des3 = (unsigned int)(priv->dma_rx_phy +
- (((priv->dirty_rx) + 1) %
+ ((priv->dirty_rx + 1) %
priv->dma_rx_size) *
sizeof(struct dma_desc));
}
@@ -135,9 +135,9 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
* to keep explicit chaining in the descriptor.
*/
p->des3 = (unsigned int)(priv->dma_tx_phy +
- (((priv->dirty_tx + 1) %
+ ((atomic_read(&priv->dirty_tx) + 1) %
priv->dma_tx_size) *
- sizeof(struct dma_desc)));
+ sizeof(struct dma_desc));
}
const struct stmmac_chain_mode_ops chain_mode_ops = {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index f16a9bd..d5b8906 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -38,8 +38,8 @@ struct stmmac_priv {
struct dma_extended_desc *dma_etx ____cacheline_aligned_in_smp;
struct dma_desc *dma_tx;
struct sk_buff **tx_skbuff;
- unsigned int cur_tx;
- unsigned int dirty_tx;
+ atomic_t cur_tx;
+ atomic_t dirty_tx;
unsigned int dma_tx_size;
u32 tx_count_frames;
u32 tx_coal_frames;
@@ -106,6 +106,8 @@ struct stmmac_priv {
u32 adv_ts;
int use_riwt;
spinlock_t ptp_lock;
+ struct work_struct txclean_work;
+ struct workqueue_struct *wq;
};
extern int phyaddr;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 5873246..de614ad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -48,6 +48,7 @@
#include <linux/seq_file.h>
#endif /* CONFIG_STMMAC_DEBUG_FS */
#include <linux/net_tstamp.h>
+#include <linux/workqueue.h>
#include "stmmac_ptp.h"
#include "stmmac.h"
@@ -205,7 +206,8 @@ static void print_pkt(unsigned char *buf, int len)
static inline u32 stmmac_tx_avail(struct stmmac_priv *priv)
{
- return priv->dirty_tx + priv->dma_tx_size - priv->cur_tx - 1;
+ return atomic_read(&priv->dirty_tx) + priv->dma_tx_size
+ - atomic_read(&priv->cur_tx) - 1;
}
/**
@@ -230,7 +232,7 @@ static inline void stmmac_hw_fix_mac_speed(struct stmmac_priv *priv)
static void stmmac_enable_eee_mode(struct stmmac_priv *priv)
{
/* Check and enter in LPI mode */
- if ((priv->dirty_tx == priv->cur_tx) &&
+ if ((atomic_read(&priv->dirty_tx) == atomic_read(&priv->cur_tx)) &&
(priv->tx_path_in_lpi_mode == false))
priv->hw->mac->set_eee_mode(priv->ioaddr);
}
@@ -1118,8 +1120,8 @@ static int init_dma_desc_rings(struct net_device *dev)
priv->tx_skbuff[i] = NULL;
}
- priv->dirty_tx = 0;
- priv->cur_tx = 0;
+ atomic_set(&priv->dirty_tx, 0);
+ atomic_set(&priv->cur_tx, 0);
stmmac_clear_descriptors(priv);
@@ -1264,17 +1266,17 @@ static void stmmac_dma_operation_mode(int mtu, struct stmmac_priv *priv)
* @priv: driver private structure
* Description: it reclaims resources after transmission completes.
*/
-static void stmmac_tx_clean(struct stmmac_priv *priv)
+static void stmmac_tx_clean(struct work_struct *work)
{
+ struct stmmac_priv *priv =
+ container_of(work, struct stmmac_priv, txclean_work);
unsigned int txsize = priv->dma_tx_size;
- spin_lock(&priv->tx_lock);
-
priv->xstats.tx_clean++;
- while (priv->dirty_tx != priv->cur_tx) {
+ while (atomic_read(&priv->dirty_tx) != atomic_read(&priv->cur_tx)) {
int last;
- unsigned int entry = priv->dirty_tx % txsize;
+ unsigned int entry = atomic_read(&priv->dirty_tx) % txsize;
struct sk_buff *skb = priv->tx_skbuff[entry];
struct dma_desc *p;
@@ -1308,7 +1310,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
}
if (netif_msg_tx_done(priv))
pr_debug("%s: curr %d, dirty %d\n", __func__,
- priv->cur_tx, priv->dirty_tx);
+ atomic_read(&priv->cur_tx),
+ atomic_read(&priv->dirty_tx));
if (likely(priv->tx_skbuff_dma[entry])) {
dma_unmap_single(priv->device,
@@ -1326,7 +1329,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
priv->hw->desc->release_tx_desc(p, priv->mode);
- priv->dirty_tx++;
+ wmb();
+ atomic_inc(&priv->dirty_tx);
}
if (unlikely(netif_queue_stopped(priv->dev) &&
stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv))) {
@@ -1344,7 +1348,6 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
stmmac_enable_eee_mode(priv);
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
}
- spin_unlock(&priv->tx_lock);
}
static inline void stmmac_enable_dma_irq(struct stmmac_priv *priv)
@@ -1380,8 +1383,8 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
priv->hw->desc->init_tx_desc(&priv->dma_tx[i],
priv->mode,
(i == txsize - 1));
- priv->dirty_tx = 0;
- priv->cur_tx = 0;
+ atomic_set(&priv->dirty_tx, 0);
+ atomic_set(&priv->cur_tx, 0);
priv->hw->dma->start_tx(priv->ioaddr);
priv->dev->stats.tx_errors++;
@@ -1401,12 +1404,16 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
int status;
status = priv->hw->dma->dma_interrupt(priv->ioaddr, &priv->xstats);
- if (likely((status & handle_rx)) || (status & handle_tx)) {
+ if (likely(status & handle_rx)) {
if (likely(napi_schedule_prep(&priv->napi))) {
stmmac_disable_dma_irq(priv);
__napi_schedule(&priv->napi);
}
}
+ if (likely(status & handle_tx))
+ queue_work(priv->wq, &priv->txclean_work);
+
+
if (unlikely(status & tx_hard_error_bump_tc)) {
/* Try to bump up the dma threshold on this failure */
if (unlikely(tc != SF_DMA_MODE) && (tc <= 256)) {
@@ -1596,8 +1603,7 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
static void stmmac_tx_timer(unsigned long data)
{
struct stmmac_priv *priv = (struct stmmac_priv *)data;
-
- stmmac_tx_clean(priv);
+ queue_work(priv->wq, &priv->txclean_work);
}
/**
@@ -1876,7 +1882,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
if (priv->tx_path_in_lpi_mode)
stmmac_disable_eee_mode(priv);
- first_entry = entry = priv->cur_tx % txsize;
+ first_entry = entry = atomic_read(&priv->cur_tx) % txsize;
csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
/* To program the descriptors according to the size of the frame */
@@ -1944,12 +1950,13 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
priv->hw->desc->set_tx_owner(first);
wmb();
- priv->cur_tx += nb_desc;
+ atomic_add(nb_desc, &priv->cur_tx);
if (netif_msg_pktdata(priv)) {
pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
- __func__, (priv->cur_tx % txsize),
- (priv->dirty_tx % txsize), entry, first, nfrags);
+ __func__, (atomic_read(&priv->cur_tx) % txsize),
+ (atomic_read(&priv->dirty_tx) % txsize), entry,
+ first, nfrags);
if (priv->extend_desc)
stmmac_display_ring((void *)priv->dma_etx, txsize, 1);
@@ -2171,7 +2178,6 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
int work_done = 0;
priv->xstats.napi_poll++;
- stmmac_tx_clean(priv);
work_done = stmmac_rx(priv, budget);
if (work_done < budget) {
@@ -2747,6 +2753,8 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
spin_lock_init(&priv->lock);
spin_lock_init(&priv->tx_lock);
+ priv->wq = create_singlethread_workqueue("stmmac_wq");
+ INIT_WORK(&priv->txclean_work, stmmac_tx_clean);
ret = register_netdev(ndev);
if (ret) {
--
1.8.1.2
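The locking change leans on the classic single-producer/single-consumer ring
invariant: xmit only ever advances cur_tx, the cleanup work only ever advances
dirty_tx, and each side issues a wmb() before publishing its index so the other
side can trust everything behind the index it reads. A minimal standalone
sketch of that invariant (simplified; plain unsigned ints stand in for the
atomic_t ops):

	#include <stdio.h>

	/* Same formula as stmmac_tx_avail() in the patch: entries the
	 * producer may still claim without catching up to the consumer. */
	static unsigned int tx_avail(unsigned int dirty, unsigned int cur,
				     unsigned int size)
	{
		return dirty + size - cur - 1;
	}

	int main(void)
	{
		/* producer (ndo_start_xmit)       consumer (txclean_work)
		 *   fill descriptors [cur..cur+n)   check owner bit, unmap skb
		 *   wmb();                          wmb();
		 *   cur_tx += n;                    dirty_tx++;
		 * Neither side writes the other's index, so they need no
		 * common lock; unsigned wraparound keeps the math valid. */
		printf("%u\n", tx_avail(0, 0, 256));	/* 255 free entries */
		return 0;
	}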
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-16 15:24 ` [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors Jimmy Perchet
@ 2013-10-16 17:46 ` Sergei Shtylyov
2013-10-18 8:32 ` Jimmy PERCHET
2013-10-21 9:07 ` Giuseppe CAVALLARO
1 sibling, 1 reply; 26+ messages in thread
From: Sergei Shtylyov @ 2013-10-16 17:46 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: peppe.cavallaro, netdev
Hello.
On 10/16/2013 07:24 PM, Jimmy Perchet wrote:
> On a low speed link (10MBit/s), some TX descriptors can remain dirty
> if the tx coalescence timer expires before they have been processed. Re-arm the
> timer in this case.
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 0015175..af04b5d 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1284,8 +1284,12 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
> p = priv->dma_tx + entry;
>
> /* Check if the descriptor is owned by the DMA. */
> - if (priv->hw->desc->get_tx_owner(p))
> + if (priv->hw->desc->get_tx_owner(p)) {
> + /* Be sure to harvest remaining descriptor. */
> + mod_timer(&priv->txtimer,
> + STMMAC_COAL_TIMER(priv->tx_coal_timer));
The continuation line should start right under &, according to the networking
coding style.
WBR, Sergei
* Re: [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
` (4 preceding siblings ...)
2013-10-16 15:24 ` [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean Jimmy Perchet
@ 2013-10-16 20:37 ` Giuseppe CAVALLARO
2013-10-18 16:24 ` Jimmy PERCHET
5 siblings, 1 reply; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-16 20:37 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
Hello Jimmy
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> Hello,
>
> I began using the Synopsys IP a few weeks ago and figured out that jumbo frames
> are not well supported by the stmmac driver.
I tested jumbo on chips w/o enhanced descriptors some time ago, so further
tests like yours (maybe on new chips) are welcome.
>
> This patch series addresses several issues which prevent them from working
> properly:
> *(1/5) Threshold dma mode is needed on rx path if jumbo frames are expected.
hmm, this depends on the HW. In the past I used HW with a FIFO that is
16KiB for rx buffers and 8KiB for tx.
So this could be managed according to these sizes, I guess...
>
> *(2/5) RX buffers are not allocated with the needed size because
> priv->dma_buf_sz is updated too late (i.e. after stmmac_init_rx_buffers call)
I'll look at the patch in detail
>
> *(3/5) On a low speed link (10MBit/s), some TX descriptors can remain dirty
> if the tx coalescence timer expires before they have been processed. Re-arm the
> timer in this case.
hmm, not clear to me, let me look at the patch. I would hope the link speed
has no impact... never seen on my side.
>
> *(4/5) There are several confusions regarding the descriptor's max buffer size;
> typically the "-1" is often forgotten.
also for this I need to look at the patch
> *(4/5) Jumbo frames' last descriptor is never "closed", resulting in truncated
> frame transfers.
it sounds like a bug... let me check
> *(4/5) Frags could not be jumbo.
> Regarding these last points, I didn't find a simpler way than writing
> new "prepare frame" functions for both ring and chain mode and updating the
> xmit function accordingly.
this is strange and not clear to me either. At any rate, both modes must be
supported.
>
> The last patch is not related to jumbo frames but concerns a possible
> optimisation:
> *(5/5) Tx descriptor cleanup and preparation are serialized, which is not
> necessary and decreases performance. In addition, TX descriptor cleanup is
> performed in the NET_RX softirq, which is confusing.
> By taking care of "cur_tx" and "dirty_tx" it is possible to avoid this
> serialization and defer cleanup to a workqueue.
> On my smp embedded system, with a 1Gbit/s link which is cpu bound, this increases
> throughput by about 90MBit/s (400MBit/s to 490MBit/s).
This sounds like another good point; let me look into the patch.
BR
Peppe
>
>
> Best Regards,
> Jimmy Perchet
>
> Jimmy Perchet (5):
> net:stmmac: set threshold/store and forward mode according to mtu
> size.
> net:stmmac: fix rx buffer allocation.
> net:stmmac: ensure we reclaim all dirty descriptors.
> net:stmmac: fix jumbo frame handling.
> net:stmmac: asynchronous tx_clean
>
> drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 99 +++++-----
> drivers/net/ethernet/stmicro/stmmac/common.h | 6 +
> drivers/net/ethernet/stmicro/stmmac/descs_com.h | 8 +-
> drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 +
> drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 +
> drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 90 ++++-----
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 +-
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 219 +++++++++++++---------
> 8 files changed, 233 insertions(+), 207 deletions(-)
>
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-16 17:46 ` Sergei Shtylyov
@ 2013-10-18 8:32 ` Jimmy PERCHET
0 siblings, 0 replies; 26+ messages in thread
From: Jimmy PERCHET @ 2013-10-18 8:32 UTC (permalink / raw)
To: Sergei Shtylyov; +Cc: peppe.cavallaro, netdev
Hello,
>
> The continuation line should start right under &, according to the networking coding style.
Thanks for your comment,
I realized that there are several other issues regarding continuation lines in my patch series.
I'll fix them in v2.
Best Regards,
Jimmy
* Re: [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation
2013-10-16 20:37 ` [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Giuseppe CAVALLARO
@ 2013-10-18 16:24 ` Jimmy PERCHET
0 siblings, 0 replies; 26+ messages in thread
From: Jimmy PERCHET @ 2013-10-18 16:24 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: netdev, Jimmy Perchet
Hello Peppe,
Thanks for your feedback,
Let me add some details:
> I tested jumbo on chips w/o enhanced some time ago, so welcome further
> tests as you did (maybe on new chips).
I'm working with the 3.71a version, in a 2KiB + 2KiB configuration.
>> *(1/5) Threshold dma mode is needed on rx path if jumbo frames are expected.
>
> hmm, this depends on the HW. In the past I used HW with a FIFO that is
> 16KiB for rx buffers and 8KiB for tx.
I used the rxfifo_over_2048 flag to decide whether threshold mode is necessary.
>> *(3/5) On a low speed link (10MBit/s), some TX descriptors can remain dirty
>> if the tx coalescence timer expires before they have been processed. Re-arm the
>> timer in this case.
>
> hmm, not clear to me, let me look at the patch. I would hope the link speed
> has no impact... never seen on my side.
>
The tx coalescence default parameters are: one interrupt every 64 descriptors
and a 40ms timer.
Let's say one is transferring 63 jumbo frames (9KiB) over a 10Mb/s link:
* 63 < 64, so there is no interrupt.
* when the timer expires, only 5 descriptors have to be cleaned (40ms @ 10Mb/s).
* at the end, 58 dirty descriptors remain.
Normally, they will be cleaned at the next transfer. The real problem appears
if the socket's "wmem" is too small. The transfer stalls:
* the socket is waiting for buffer cleanup before performing a new transfer.
* the driver is waiting for a new transfer before performing cleanup.
Re-arming the timer allows cleanup to continue, so the socket's
wake-up threshold will be reached.
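The arithmetic, spelled out (back-of-the-envelope, ignoring framing overhead):

	frames queued            : 63 (< 64, the IC threshold, so no interrupt)
	drain rate at 10Mb/s     : (10e6 / 8) / 9000 ~= 139 frames/s
	drained in one 40ms tick : 0.040 * 139 ~= 5 frames
	left dirty when it fires : 63 - 5 = 58 descriptors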
Best Regards,
Jimmy
* Re: [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size.
2013-10-16 15:24 ` [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size Jimmy Perchet
@ 2013-10-21 8:47 ` Giuseppe CAVALLARO
2013-10-21 9:58 ` Rayagond K
0 siblings, 1 reply; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 8:47 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> Threshold mode dma is needed on rx path if jumbo frames are expected.
I think we should not force the threshold mode depending on the mtu.
For example, I worked on hw with rx buffers of ~16KiB, so it was able to SF
an oversized frame.
Maybe we could solve this configuration problem at platform time.
We added the fields force_thresh_dma_mode and force_sf_dma_mode
to cover these scenarios. If these need to be enforced, we can
prepare a patch.
let me know
peppe
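
For reference, these override hooks live in struct plat_stmmacenet_data, so a
platform that knows its FIFO sizes can already pin the mode. A sketch
(hypothetical board file; only the two fields referenced by the driver above):

	#include <linux/stmmac.h>

	static struct plat_stmmacenet_data my_board_plat = {
		/* small RX FIFO: force threshold mode on both paths */
		.force_thresh_dma_mode = 1,
		/* or, for large FIFOs: .force_sf_dma_mode = 1, */
	};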
>
>
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 48 ++++++++++++++++-------
> 1 file changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 8d4ccd3..170f043 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1222,22 +1222,40 @@ static void free_dma_desc_resources(struct stmmac_priv *priv)
> * Description: it sets the DMA operation mode: tx/rx DMA thresholds
> * or Store-And-Forward capability.
> */
> -static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
> -{
> - if (priv->plat->force_thresh_dma_mode)
> - priv->hw->dma->dma_mode(priv->ioaddr, tc, tc);
> - else if (priv->plat->force_sf_dma_mode || priv->plat->tx_coe) {
> - /*
> - * In case of GMAC, SF mode can be enabled
> - * to perform the TX COE in HW. This depends on:
> - * 1) TX COE if actually supported
> - * 2) There is no bugged Jumbo frame support
> - * that needs to not insert csum in the TDES.
> - */
> - priv->hw->dma->dma_mode(priv->ioaddr, SF_DMA_MODE, SF_DMA_MODE);
> +static void stmmac_dma_operation_mode(int mtu, struct stmmac_priv *priv)
> +{
> + int rx_tc, tx_tc;
> +
> + /*
> + * In case of GMAC, SF mode can be enabled
> + * to perform the TX COE in HW. This depends on:
> + * 1) TX COE if actually supported
> + * 2) There is no bugged Jumbo frame support
> + * that needs to not insert csum in the TDES.
> + */
> + if (priv->plat->tx_coe &&
> + !(priv->plat->bugged_jumbo && (mtu > ETH_DATA_LEN))) {
> tc = SF_DMA_MODE;
> + tx_tc = SF_DMA_MODE;
> } else
> - priv->hw->dma->dma_mode(priv->ioaddr, tc, SF_DMA_MODE);
> + tx_tc = tc;
> +
> + if (mtu > ETH_DATA_LEN && priv->hw_cap_support
> + && !priv->dma_cap.rxfifo_over_2048)
> + rx_tc = tc;
> + else
> + rx_tc = SF_DMA_MODE;
> +
> + if (priv->plat->force_sf_dma_mode) {
> + tc = SF_DMA_MODE;
> + tx_tc = SF_DMA_MODE;
> + rx_tc = SF_DMA_MODE;
> + } else if (priv->plat->force_thresh_dma_mode) {
> + tx_tc = tc;
> + rx_tc = tc;
> + }
> +
> + priv->hw->dma->dma_mode(priv->ioaddr, tx_tc, rx_tc);
> }
>
> /**
> @@ -1687,7 +1705,7 @@ static int stmmac_open(struct net_device *dev)
> stmmac_set_mac(priv->ioaddr, true);
>
> /* Set the HW DMA mode and the COE */
> - stmmac_dma_operation_mode(priv);
> + stmmac_dma_operation_mode(dev->mtu, priv);
>
> /* Extra statistics */
> memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
>
* Re: [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation.
2013-10-16 15:24 ` [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation Jimmy Perchet
@ 2013-10-21 8:54 ` Giuseppe CAVALLARO
0 siblings, 0 replies; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 8:54 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> Rx buffers used the wrong size because priv->dma_buf_sz was updated after allocation.
>
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 170f043..0015175 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -998,6 +998,9 @@ static int init_dma_desc_rings(struct net_device *dev)
> if (bfsize < BUF_SIZE_16KiB)
> bfsize = stmmac_set_bfsize(dev->mtu, priv->dma_buf_sz);
>
> + priv->dma_buf_sz = bfsize;
> + buf_sz = bfsize;
> +
> if (netif_msg_probe(priv))
> pr_debug("%s: txsize %d, rxsize %d, bfsize %d\n", __func__,
> txsize, rxsize, bfsize);
> @@ -1087,8 +1090,6 @@ static int init_dma_desc_rings(struct net_device *dev)
> }
> priv->cur_rx = 0;
> priv->dirty_rx = (unsigned int)(i - rxsize);
> - priv->dma_buf_sz = bfsize;
> - buf_sz = bfsize;
>
> /* Setup the chained descriptor addresses */
> if (priv->mode == STMMAC_CHAIN_MODE) {
>
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-16 15:24 ` [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors Jimmy Perchet
2013-10-16 17:46 ` Sergei Shtylyov
@ 2013-10-21 9:07 ` Giuseppe CAVALLARO
2013-10-21 13:10 ` Jimmy PERCHET
1 sibling, 1 reply; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 9:07 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
Hello Jimmy
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> On a low speed link (10MBit/s), some TX descriptors can remain dirty
> if the tx coalescence timer expires before they have been processed. Re-arm the
> timer in this case.
>
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 0015175..af04b5d 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1284,8 +1284,12 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
> p = priv->dma_tx + entry;
>
> /* Check if the descriptor is owned by the DMA. */
> - if (priv->hw->desc->get_tx_owner(p))
> + if (priv->hw->desc->get_tx_owner(p)) {
> + /* Be sure to harvest remaining descriptor. */
> + mod_timer(&priv->txtimer,
> + STMMAC_COAL_TIMER(priv->tx_coal_timer));
> break;
> + }
why should we reload the timer when cleaning the tx resources?
This is done in the xmit, where it makes sense to reload the timer as soon
as a frame has to be transmitted.
Also, it is not clear to me why the problem happens at 10MBit/s speed.
Maybe there is a hidden problem (lock protection)
that should be fixed.
How did you get this problem? Just on a low speed link while stressing the net?
I have never seen it.
peppe
>
> /* Verify tx error by looking at the last segment. */
> last = priv->hw->desc->get_tx_ls(p);
>
* Re: [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size.
2013-10-21 8:47 ` Giuseppe CAVALLARO
@ 2013-10-21 9:58 ` Rayagond K
2013-10-21 13:49 ` Giuseppe CAVALLARO
0 siblings, 1 reply; 26+ messages in thread
From: Rayagond K @ 2013-10-21 9:58 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: Jimmy Perchet, netdev
On Mon, Oct 21, 2013 at 2:17 PM, Giuseppe CAVALLARO
<peppe.cavallaro@st.com> wrote:
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>
>> Threshold mode dma is needed on rx path if jumbo frames are expected.
>
>
> I think we should not force the threshold mode depending on the mtu.
I remember the SNPS HW team suggesting that I program the rx path in
threshold mode for jumbo frames, as it takes care of a small RX FIFO size.
>
> For example, I worked on hw with rx buffers of ~16KiB, so it was able to SF
> an oversized frame.
>
> Maybe we could solve this configuration problem at platform time.
> We added the fields force_thresh_dma_mode and force_sf_dma_mode
> to cover these scenarios. If these need to be enforced, we can
> prepare a patch.
>
> let me know
> peppe
>
>
>>
>>
>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 48
>> ++++++++++++++++-------
>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 8d4ccd3..170f043 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -1222,22 +1222,40 @@ static void free_dma_desc_resources(struct
>> stmmac_priv *priv)
>> * Description: it sets the DMA operation mode: tx/rx DMA thresholds
>> * or Store-And-Forward capability.
>> */
>> -static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
>> -{
>> - if (priv->plat->force_thresh_dma_mode)
>> - priv->hw->dma->dma_mode(priv->ioaddr, tc, tc);
>> - else if (priv->plat->force_sf_dma_mode || priv->plat->tx_coe) {
>> - /*
>> - * In case of GMAC, SF mode can be enabled
>> - * to perform the TX COE in HW. This depends on:
>> - * 1) TX COE if actually supported
>> - * 2) There is no bugged Jumbo frame support
>> - * that needs to not insert csum in the TDES.
>> - */
>> - priv->hw->dma->dma_mode(priv->ioaddr, SF_DMA_MODE,
>> SF_DMA_MODE);
>> +static void stmmac_dma_operation_mode(int mtu, struct stmmac_priv *priv)
>> +{
>> + int rx_tc, tx_tc;
>> +
>> + /*
>> + * In case of GMAC, SF mode can be enabled
>> + * to perform the TX COE in HW. This depends on:
>> + * 1) TX COE if actually supported
>> + * 2) There is no bugged Jumbo frame support
>> + * that needs to not insert csum in the TDES.
>> + */
>> + if (priv->plat->tx_coe &&
>> + !(priv->plat->bugged_jumbo && (mtu > ETH_DATA_LEN))) {
>> tc = SF_DMA_MODE;
>> + tx_tc = SF_DMA_MODE;
>> } else
>> - priv->hw->dma->dma_mode(priv->ioaddr, tc, SF_DMA_MODE);
>> + tx_tc = tc;
>> +
>> + if (mtu > ETH_DATA_LEN && priv->hw_cap_support
>> + && !priv->dma_cap.rxfifo_over_2048)
>> + rx_tc = tc;
>> + else
>> + rx_tc = SF_DMA_MODE;
>> +
>> + if (priv->plat->force_sf_dma_mode) {
>> + tc = SF_DMA_MODE;
>> + tx_tc = SF_DMA_MODE;
>> + rx_tc = SF_DMA_MODE;
>> + } else if (priv->plat->force_thresh_dma_mode) {
>> + tx_tc = tc;
>> + rx_tc = tc;
>> + }
>> +
>> + priv->hw->dma->dma_mode(priv->ioaddr, tx_tc, rx_tc);
>> }
>>
>> /**
>> @@ -1687,7 +1705,7 @@ static int stmmac_open(struct net_device *dev)
>> stmmac_set_mac(priv->ioaddr, true);
>>
>> /* Set the HW DMA mode and the COE */
>> - stmmac_dma_operation_mode(priv);
>> + stmmac_dma_operation_mode(dev->mtu, priv);
>>
>> /* Extra statistics */
>> memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
>>
>
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-21 9:07 ` Giuseppe CAVALLARO
@ 2013-10-21 13:10 ` Jimmy PERCHET
2013-10-21 18:32 ` Eric Dumazet
0 siblings, 1 reply; 26+ messages in thread
From: Jimmy PERCHET @ 2013-10-21 13:10 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: netdev
Hello Peppe,
On 21/10/2013 11:07, Giuseppe CAVALLARO wrote:
> Hello Jimmy
>
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>> On a low speed link (10MBit/s), some TX descriptors can remain dirty
>> if the tx coalescence timer expires before they have been processed. Re-arm the
>> timer in this case.
>>
>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 0015175..af04b5d 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -1284,8 +1284,12 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
>> p = priv->dma_tx + entry;
>>
>> /* Check if the descriptor is owned by the DMA. */
>> - if (priv->hw->desc->get_tx_owner(p))
>> + if (priv->hw->desc->get_tx_owner(p)) {
>> + /* Be sure to harvest remaining descriptor. */
>> + mod_timer(&priv->txtimer,
>> + STMMAC_COAL_TIMER(priv->tx_coal_timer));
>> break;
>> + }
>
>
> why should we reload the timer when cleaning the tx resources?
> This is done in the xmit path, where it makes sense to reload the
> timer as soon as a frame has to be transmitted.
>
> Also it is not clear to me why the problem happens at 10Mbit/s.
> Maybe there is a hidden problem (lock protection)
> that should be fixed.
>
> How did you hit this problem? Just at low speed while stressing the net?
> I have never seen it.
I can reproduce this problem by issuing 9KiB jumbo frames on a 10Mbit/s link.
If the socket's wmem size is about 500KiB (or less), the transfer stalls.
(I guess it is reproducible with 1500-byte frames by decreasing the
socket's wmem to 90KB.)
Re-arming the timer fixes this behaviour.
Here is my understanding of this issue:
with 9KiB frames and 500KiB of wmem, only about 60 frames can be
prepared in a row. That is below the tx coalescence threshold,
so there will be no interrupt. When the tx coalescence timer
expires (40ms later), only five descriptors have to be
freed (9000*5 bytes @ 10Mbit/s = 36ms); that is not enough to reach
the socket's wake-up threshold. We get into a deadlock:
*The socket is waiting for free buffers before performing a new transfer.
*The driver is waiting for a new transfer before performing cleanup.
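A quick back-of-the-envelope check, assuming the driver's defaults
(40ms coalescence timer, 64-frame threshold):

  500 KiB wmem / ~9 KiB per frame   -> ~56 frames queued in one burst
  56 < 64                           -> no coalescence interrupt fires
  40 ms at 10 Mbit/s                -> ~50 KB on the wire, i.e. ~5 frames
  5 * 9 KiB = ~45 KiB freed         -> well below the socket wake-up
                                       threshold (roughly wmem/2 = 250 KiB)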
Maybe it is not a real-life use case and is not worth
a patch. What do you think?
Best Regards,
Jimmy
>
> peppe
>
>>
>> /* Verify tx error by looking at the last segment. */
>> last = priv->hw->desc->get_tx_ls(p);
>>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
2013-10-16 15:24 ` [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling Jimmy Perchet
@ 2013-10-21 13:40 ` Giuseppe CAVALLARO
2013-10-21 16:28 ` Jimmy PERCHET
0 siblings, 1 reply; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 13:40 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> This patch addresses several issues which prevent jumbo frames from working properly:
> .jumbo frames' last descriptor was not closed
> .several confusions regarding the descriptor's max buffer size
> .frags could not be jumbo
>
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
Jimmy, thanks for this patch. Below are my first notes.
I'll continue to look at the patch to verify whether I missed
something. I kindly ask you, for the next version, to add
more comments, especially in the function that prepares the
tx descriptors, to help me with the review.
I hope to run some tests asap.
peppe
> ---
> drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 95 ++++++++----------
> drivers/net/ethernet/stmicro/stmmac/common.h | 6 ++
> drivers/net/ethernet/stmicro/stmmac/descs_com.h | 8 +-
> drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 6 ++
> drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 6 ++
> drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 90 ++++++++---------
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 112 +++++++++++-----------
> 7 files changed, 158 insertions(+), 165 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> index d234ab5..d6ed0ce 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> @@ -28,70 +28,58 @@
>
> #include "stmmac.h"
>
> -static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
> +static unsigned int stmmac_prepare_frm(void *p, void *data, unsigned int len,
> + int csum, unsigned int entry)
> {
> struct stmmac_priv *priv = (struct stmmac_priv *)p;
> - unsigned int txsize = priv->dma_tx_size;
> - unsigned int entry = priv->cur_tx % txsize;
> - struct dma_desc *desc = priv->dma_tx + entry;
> - unsigned int nopaged_len = skb_headlen(skb);
> + unsigned int entry_count = 0;
> + struct dma_desc *desc;
> unsigned int bmax;
> - unsigned int i = 1, len;
>
> - if (priv->plat->enh_desc)
> - bmax = BUF_SIZE_8KiB;
> - else
> - bmax = BUF_SIZE_2KiB;
> -
> - len = nopaged_len - bmax;
> -
> - desc->des2 = dma_map_single(priv->device, skb->data,
> - bmax, DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum, STMMAC_CHAIN_MODE);
> -
> - while (len != 0) {
> - entry = (++priv->cur_tx) % txsize;
> + if (priv->plat->enh_desc) {
> + desc = (struct dma_desc *)(priv->dma_etx + entry);
> + bmax = BUF_SIZE_8KiB - 1;
> + } else{
> desc = priv->dma_tx + entry;
> -
> - if (len > bmax) {
> - desc->des2 = dma_map_single(priv->device,
> - (skb->data + bmax * i),
> - bmax, DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
> - STMMAC_CHAIN_MODE);
> - priv->hw->desc->set_tx_owner(desc);
> - priv->tx_skbuff[entry] = NULL;
> - len -= bmax;
> - i++;
> - } else {
> - desc->des2 = dma_map_single(priv->device,
> - (skb->data + bmax * i), len,
> - DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
> - STMMAC_CHAIN_MODE);
> - priv->hw->desc->set_tx_owner(desc);
> - priv->tx_skbuff[entry] = NULL;
> - len = 0;
> - }
> + bmax = BUF_SIZE_2KiB - 1;
> }
> - return entry;
> -}
>
> -static unsigned int stmmac_is_jumbo_frm(int len, int enh_desc)
> -{
> - unsigned int ret = 0;
> + while (len > bmax) {
> + desc->des2 = dma_map_single(priv->device,
> + data + bmax*entry_count,
> + bmax, DMA_TO_DEVICE);
> + priv->tx_skbuff_dma[entry] = desc->des2;
> + priv->tx_skbuff[entry] = NULL;
> + priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
> + STMMAC_CHAIN_MODE);
> +
> + len -= bmax;
> + entry++;
> + entry %= priv->dma_tx_size;
> + entry_count++;
> +
> + if (priv->plat->enh_desc)
> + desc = (struct dma_desc *)
> + (((struct dma_extended_desc *)desc)+1);
> + else
> + desc++;
> + }
>
> - if ((enh_desc && (len > BUF_SIZE_8KiB)) ||
> - (!enh_desc && (len > BUF_SIZE_2KiB))) {
> - ret = 1;
> + if (len) {
> + desc->des2 = dma_map_single(priv->device,
> + data + bmax*entry_count,
> + len, DMA_TO_DEVICE);
> + priv->tx_skbuff_dma[entry] = desc->des2;
> + priv->tx_skbuff[entry] = NULL;
> + priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
> + STMMAC_CHAIN_MODE);
> + entry_count++;
> }
>
> - return ret;
> + return entry_count;
> }
>
> +
> static void stmmac_init_dma_chain(void *des, dma_addr_t phy_addr,
> unsigned int size, unsigned int extend_desc)
> {
> @@ -154,8 +142,7 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
>
> const struct stmmac_chain_mode_ops chain_mode_ops = {
> .init = stmmac_init_dma_chain,
> - .is_jumbo_frm = stmmac_is_jumbo_frm,
> - .jumbo_frm = stmmac_jumbo_frm,
> + .prepare_frm = stmmac_prepare_frm,
> .refill_desc3 = stmmac_refill_desc3,
> .clean_desc3 = stmmac_clean_desc3,
> };
> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> index 7eb8bab..5d3f734 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> @@ -323,6 +323,8 @@ struct stmmac_desc_ops {
> /* Handle extra events on specific interrupts hw dependent */
> int (*get_rx_owner) (struct dma_desc *p);
> void (*set_rx_owner) (struct dma_desc *p);
> +
> + void (*set_tx_first) (struct dma_desc *p);
please can you use set_tx_fs to be aligned with the doc?
Add it just below the ls callback and without the empty line.
... see my other comment below about prepare_tx_desc.
> /* Get the receive frame size */
> int (*get_rx_frame_len) (struct dma_desc *p, int rx_coe_type);
> /* Return the reception status looking at the RDES1 */
> @@ -421,6 +423,8 @@ struct mii_regs {
> struct stmmac_ring_mode_ops {
> unsigned int (*is_jumbo_frm) (int len, int ehn_desc);
> unsigned int (*jumbo_frm) (void *priv, struct sk_buff *skb, int csum);
you are replacing two callbacks that will never be used, so you
can remove them from the header.
> + unsigned int (*prepare_frm) (void *p, void *data, unsigned int len,
> + int csum, unsigned int entry);
ok, I like it. Please review the indentation.
> void (*refill_desc3) (void *priv, struct dma_desc *p);
> void (*init_desc3) (struct dma_desc *p);
> void (*clean_desc3) (void *priv, struct dma_desc *p);
> @@ -432,6 +436,8 @@ struct stmmac_chain_mode_ops {
> unsigned int extend_desc);
> unsigned int (*is_jumbo_frm) (int len, int ehn_desc);
> unsigned int (*jumbo_frm) (void *priv, struct sk_buff *skb, int csum);
> + unsigned int (*prepare_frm) (void *p, void *data, unsigned int len,
> + int csum, unsigned int entry);
> void (*refill_desc3) (void *priv, struct dma_desc *p);
> void (*clean_desc3) (void *priv, struct dma_desc *p);
> };
> diff --git a/drivers/net/ethernet/stmicro/stmmac/descs_com.h b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
> index 6f2cc78..cf199d6 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/descs_com.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
> @@ -53,9 +53,9 @@ static inline void enh_desc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>
> static inline void enh_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
> {
> - if (unlikely(len > BUF_SIZE_4KiB)) {
> - p->des01.etx.buffer1_size = BUF_SIZE_4KiB;
> - p->des01.etx.buffer2_size = len - BUF_SIZE_4KiB;
> + if (unlikely(len >= BUF_SIZE_8KiB)) {
> + p->des01.etx.buffer1_size = BUF_SIZE_8KiB - 1;
> + p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
When I wrote this code, the default sizes were 8KiB for tx buffers and
16KiB for rx buffers (from our RTL configuration).
For this reason, in ring mode a buffer bigger than 4KiB but smaller than
8KiB was managed in a single descriptor.
Indeed, for enhanced descriptors TBS1 and TBS2 can each hold up to
8KiB, so a frame of 16KiB fits.
This is true for the old gmac (I verified the databooks from the
3.30a to the 3.70a).
Anyway I accept this change; on HW with limitations this will be
managed in another way.
> } else
> p->des01.etx.buffer1_size = len;
> }
> @@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>
> static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
> {
> - if (unlikely(len > BUF_SIZE_2KiB)) {
> + if (unlikely(len >= BUF_SIZE_2KiB)) {
we cannot manage a size of 2048 on normal descriptors.
Please verify that you do not break backward compatibility.
> p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
> p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
> } else
> diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
> index 7e6628a..915a7ab 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
> @@ -297,6 +297,11 @@ static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
> enh_desc_end_tx_desc_on_ring(p, ter);
> }
>
> +static void enh_desc_set_tx_first(struct dma_desc *p)
> +{
> + p->des01.etx.first_segment = 1;
> +}
> +
> static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
> int csum_flag, int mode)
> {
> @@ -393,6 +398,7 @@ const struct stmmac_desc_ops enh_desc_ops = {
> .get_tx_ls = enh_desc_get_tx_ls,
> .set_tx_owner = enh_desc_set_tx_owner,
> .set_rx_owner = enh_desc_set_rx_owner,
> + .set_tx_first = enh_desc_set_tx_first,
> .get_rx_frame_len = enh_desc_get_rx_frame_len,
> .rx_extended_status = enh_desc_get_ext_status,
> .enable_tx_timestamp = enh_desc_enable_tx_timestamp,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
> index 35ad4f4..6363776 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
> @@ -180,6 +180,11 @@ static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
> ndesc_end_tx_desc_on_ring(p, ter);
> }
>
> +static void ndesc_set_tx_first(struct dma_desc *p)
> +{
> + p->des01.etx.first_segment = 1;
> +}
> +
> static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
> int csum_flag, int mode)
> {
> @@ -265,6 +270,7 @@ const struct stmmac_desc_ops ndesc_ops = {
> .get_tx_ls = ndesc_get_tx_ls,
> .set_tx_owner = ndesc_set_tx_owner,
> .set_rx_owner = ndesc_set_rx_owner,
> + .set_tx_first = ndesc_set_tx_first,
> .get_rx_frame_len = ndesc_get_rx_frame_len,
> .enable_tx_timestamp = ndesc_enable_tx_timestamp,
> .get_tx_timestamp_status = ndesc_get_tx_timestamp_status,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
> index 1ef9d8a..7faa42a 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
> @@ -28,73 +28,60 @@
>
> #include "stmmac.h"
>
> -static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
> +
> +static unsigned int stmmac_prepare_frm(void *p, void *data, unsigned int len,
> + int csum, unsigned int entry)
> {
> struct stmmac_priv *priv = (struct stmmac_priv *)p;
> - unsigned int txsize = priv->dma_tx_size;
> - unsigned int entry = priv->cur_tx % txsize;
> + unsigned int entry_count = 0;
> struct dma_desc *desc;
> - unsigned int nopaged_len = skb_headlen(skb);
> - unsigned int bmax, len;
> + unsigned int bmax;
>
> - if (priv->extend_desc)
> + if (priv->plat->enh_desc) {
> desc = (struct dma_desc *)(priv->dma_etx + entry);
> - else
> + bmax = BUF_SIZE_8KiB - 1;
> + } else{
> desc = priv->dma_tx + entry;
> + bmax = BUF_SIZE_2KiB - 1;
> + }
>
> - if (priv->plat->enh_desc)
> - bmax = BUF_SIZE_8KiB;
> - else
> - bmax = BUF_SIZE_2KiB;
> -
> - len = nopaged_len - bmax;
> -
> - if (nopaged_len > BUF_SIZE_8KiB) {
> -
> - desc->des2 = dma_map_single(priv->device, skb->data,
> - bmax, DMA_TO_DEVICE);
> + while (len > 2*bmax) {
> + desc->des2 = dma_map_single(priv->device,
> + data + 2*bmax*entry_count,
> + 2*bmax, DMA_TO_DEVICE);
> priv->tx_skbuff_dma[entry] = desc->des2;
> - desc->des3 = desc->des2 + BUF_SIZE_4KiB;
> - priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
> + priv->tx_skbuff[entry] = NULL;
> + desc->des3 = desc->des2 + bmax;
> + priv->hw->desc->prepare_tx_desc(desc, 0, 2*bmax, csum,
> STMMAC_RING_MODE);
hmm, not sure, but is it still useful to pass is_fs as a parameter
to prepare_tx_desc? It could be removed because we have the new
callback.
> - wmb();
> - entry = (++priv->cur_tx) % txsize;
>
> - if (priv->extend_desc)
> - desc = (struct dma_desc *)(priv->dma_etx + entry);
> + len -= 2*bmax;
please use a variable for 2*bmax, which is used often. Or bmax should
directly be 2*bmax ;-)
> + entry++;
> + entry %= priv->dma_tx_size;
> + entry_count++;
> +
> + if (priv->plat->enh_desc)
> + desc = (struct dma_desc *)
> + (((struct dma_extended_desc *)desc)+1);
> else
> - desc = priv->dma_tx + entry;
> + desc++;
> + }
> + if (len) {
>
> - desc->des2 = dma_map_single(priv->device, skb->data + bmax,
> - len, DMA_TO_DEVICE);
> + desc->des2 = dma_map_single(priv->device,
> + data + 2*bmax*entry_count,
> + len, DMA_TO_DEVICE);
> priv->tx_skbuff_dma[entry] = desc->des2;
> - desc->des3 = desc->des2 + BUF_SIZE_4KiB;
> - priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
> - STMMAC_RING_MODE);
> - wmb();
> - priv->hw->desc->set_tx_owner(desc);
> priv->tx_skbuff[entry] = NULL;
> - } else {
> - desc->des2 = dma_map_single(priv->device, skb->data,
> - nopaged_len, DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - desc->des3 = desc->des2 + BUF_SIZE_4KiB;
> - priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len, csum,
> + desc->des3 = desc->des2 + bmax;
> + priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
> STMMAC_RING_MODE);
> + entry_count++;
> }
>
> - return entry;
> + return entry_count;
> }
>
> -static unsigned int stmmac_is_jumbo_frm(int len, int enh_desc)
> -{
> - unsigned int ret = 0;
> -
> - if (len >= BUF_SIZE_4KiB)
> - ret = 1;
> -
> - return ret;
> -}
>
> static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
> {
> @@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
> if (unlikely(priv->plat->has_gmac))
> /* Fill DES3 in case of RING mode */
> if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
> - p->des3 = p->des2 + BUF_SIZE_8KiB;
> + p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
is it correct? can you check?
> }
>
> /* In ring mode we need to fill the desc3 because it is used as buffer */
> static void stmmac_init_desc3(struct dma_desc *p)
> {
> - p->des3 = p->des2 + BUF_SIZE_8KiB;
> + p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
> }
>
> static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
> @@ -127,8 +114,7 @@ static int stmmac_set_16kib_bfsize(int mtu)
> }
>
> const struct stmmac_ring_mode_ops ring_mode_ops = {
> - .is_jumbo_frm = stmmac_is_jumbo_frm,
> - .jumbo_frm = stmmac_jumbo_frm,
> + .prepare_frm = stmmac_prepare_frm,
> .refill_desc3 = stmmac_refill_desc3,
> .init_desc3 = stmmac_init_desc3,
> .clean_desc3 = stmmac_clean_desc3,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index af04b5d..5873246 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1832,6 +1832,20 @@ static int stmmac_release(struct net_device *dev)
> return 0;
> }
>
> +
> +static struct dma_desc *get_desc(struct stmmac_priv *priv, unsigned int entry)
> +{
> + struct dma_desc *desc;
> +
> + if (priv->plat->enh_desc)
> + desc = (struct dma_desc *)(priv->dma_etx + entry);
> + else
> + desc = priv->dma_tx + entry;
> +
> + return desc;
> +}
this could go in a separate patch, and it could also be used
in other places inside the driver to get the descriptor
and distinguish between enhanced and non-enhanced descriptors.
I didn't add it before because this was only done in a few places, but if
you need it (and it is used in many parts) I agree that we can have a
dedicated function (maybe inline, and in a header if shared).
> +
> /**
> * stmmac_xmit: Tx entry point of the driver
> * @skb : the socket buffer
> @@ -1844,11 +1858,10 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> struct stmmac_priv *priv = netdev_priv(dev);
> unsigned int txsize = priv->dma_tx_size;
> - unsigned int entry;
> - int i, csum_insertion = 0, is_jumbo = 0;
> + unsigned int entry, first_entry, nb_desc = 0;
> + int i, csum_insertion = 0;
> int nfrags = skb_shinfo(skb)->nr_frags;
> - struct dma_desc *desc, *first;
> - unsigned int nopaged_len = skb_headlen(skb);
> + struct dma_desc *desc = NULL, *first;
>
> if (unlikely(stmmac_tx_avail(priv) < nfrags + 1)) {
> if (!netif_queue_stopped(dev)) {
> @@ -1858,73 +1871,53 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> }
> return NETDEV_TX_BUSY;
> }
> -
> spin_lock(&priv->tx_lock);
>
> if (priv->tx_path_in_lpi_mode)
> stmmac_disable_eee_mode(priv);
>
> - entry = priv->cur_tx % txsize;
> + first_entry = entry = priv->cur_tx % txsize;
entry = priv->cur_tx % txsize;
first_entry = entry;
>
> csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
> -
> - if (priv->extend_desc)
> - desc = (struct dma_desc *)(priv->dma_etx + entry);
> + /* To program the descriptors according to the size of the frame */
> + if (priv->mode == STMMAC_RING_MODE)
> + entry += priv->hw->ring->prepare_frm(priv, skb->data,
> + skb_headlen(skb), csum_insertion, entry);
> else
> - desc = priv->dma_tx + entry;
> + entry += priv->hw->chain->prepare_frm(priv, skb->data,
> + skb_headlen(skb), csum_insertion, entry);
>
> - first = desc;
> -
> - priv->tx_skbuff[entry] = skb;
> -
> - /* To program the descriptors according to the size of the frame */
> - if (priv->mode == STMMAC_RING_MODE) {
> - is_jumbo = priv->hw->ring->is_jumbo_frm(skb->len,
> - priv->plat->enh_desc);
> - if (unlikely(is_jumbo))
> - entry = priv->hw->ring->jumbo_frm(priv, skb,
> - csum_insertion);
> - } else {
> - is_jumbo = priv->hw->chain->is_jumbo_frm(skb->len,
> - priv->plat->enh_desc);
> - if (unlikely(is_jumbo))
> - entry = priv->hw->chain->jumbo_frm(priv, skb,
> - csum_insertion);
> - }
> - if (likely(!is_jumbo)) {
> - desc->des2 = dma_map_single(priv->device, skb->data,
> - nopaged_len, DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len,
> - csum_insertion, priv->mode);
> - } else
> - desc = first;
> + entry %= txsize;
>
> for (i = 0; i < nfrags; i++) {
> const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
> - int len = skb_frag_size(frag);
>
> - entry = (++priv->cur_tx) % txsize;
> - if (priv->extend_desc)
> - desc = (struct dma_desc *)(priv->dma_etx + entry);
> + if (priv->mode == STMMAC_RING_MODE)
> + entry += priv->hw->ring->prepare_frm(priv,
> + skb_frag_address(frag),
> + skb_frag_size(frag),
> + csum_insertion, entry);
> else
> - desc = priv->dma_tx + entry;
> -
> - desc->des2 = skb_frag_dma_map(priv->device, frag, 0, len,
> - DMA_TO_DEVICE);
> - priv->tx_skbuff_dma[entry] = desc->des2;
> - priv->tx_skbuff[entry] = NULL;
> - priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion,
> - priv->mode);
> - wmb();
> - priv->hw->desc->set_tx_owner(desc);
> - wmb();
> - }
> -
> + entry += priv->hw->chain->prepare_frm(priv,
> + skb_frag_address(frag),
> + skb_frag_size(frag),
> + csum_insertion, entry);
> +
> + entry %= txsize;
> + }
> + /*Set owner for all segment but the first one */
> + for (i = first_entry; i != entry;) {
> + desc = get_desc(priv, i);
> + nb_desc++;
> + if (i != first_entry)
> + priv->hw->desc->set_tx_owner(desc);
> + i++;
> + i %= txsize;
> + }
this sounds quite inefficient: we do another loop in a
critical path.
Also, you didn't add the barriers that were needed and that fixed
problems on some platforms in the past.
Maybe we could improve this operation and prepare the descriptors
setting the own bit for all of them except the first one.
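For instance, something along these lines inside prepare_frm() (an
untested sketch; the first_entry parameter is hypothetical and would
have to be passed down from stmmac_xmit()):

	/* Sketch: arm the OWN bit while each descriptor is prepared, so
	 * no second walk over the ring is needed.  The first descriptor
	 * of the frame stays un-owned and is armed last, after a wmb(),
	 * in stmmac_xmit() as today. */
	priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
					STMMAC_CHAIN_MODE);
	wmb();
	if (entry != first_entry)
		priv->hw->desc->set_tx_owner(desc);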
> + BUG_ON(desc == NULL);
> /* Finalize the latest segment. */
> priv->hw->desc->close_tx_desc(desc);
>
> - wmb();
> /* According to the coalesce parameter the IC bit for the latest
> * segment could be reset and the timer re-started to invoke the
> * stmmac_tx function. This approach takes care about the fragments.
> @@ -1938,11 +1931,20 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> } else
> priv->tx_count_frames = 0;
>
> + /*Prepare first segment */
> + priv->tx_skbuff[first_entry] = skb;
> +
> + first = get_desc(priv, first_entry);
> +
> + priv->hw->desc->set_tx_first(first);
> +
> + wmb();
> +
> /* To avoid raise condition */
> priv->hw->desc->set_tx_owner(first);
> wmb();
>
> - priv->cur_tx++;
> + priv->cur_tx += nb_desc;
can we avoid using nb_desc?
>
> if (netif_msg_pktdata(priv)) {
> pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size.
2013-10-21 9:58 ` Rayagond K
@ 2013-10-21 13:49 ` Giuseppe CAVALLARO
0 siblings, 0 replies; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 13:49 UTC (permalink / raw)
To: Rayagond K; +Cc: Jimmy Perchet, netdev
On 10/21/2013 11:58 AM, Rayagond K wrote:
> On Mon, Oct 21, 2013 at 2:17 PM, Giuseppe CAVALLARO
> <peppe.cavallaro@st.com> wrote:
>> >On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>> >>
>>> >>Threshold mode dma is needed on rx path if jumbo frames are expected.
>> >
>> >
>> >I think we should not force the threshold mode depending on the mtu.
> I remember the SNPS HW team suggesting that I program the rx path in
> threshold mode for jumbo frames, as it copes with a small RX FIFO size.
>
ok, no problem keeping it as the default in stmmac.
Cheers
Peppe
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
2013-10-16 15:24 ` [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean Jimmy Perchet
@ 2013-10-21 13:52 ` Giuseppe CAVALLARO
2013-10-21 16:30 ` Eric Dumazet
0 siblings, 1 reply; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-21 13:52 UTC (permalink / raw)
To: Jimmy Perchet; +Cc: netdev
Hello Jimmy
On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> Tx descriptor cleanup and preparation are serialized, which is not necessary
> and decreases performance.
> In addition, TX descriptor cleanup is performed in the NET_RX softirq, which is
> confusing.
hmm, here you are changing the logic behind the tx/rx processes.
As done in many drivers, the stmmac cleans the tx resources in
NAPI context, and this is not a confusing approach ;-).
It gave me some performance improvements, especially on TCP benchmarks.
> This patch unserializes tx descriptor cleanup and preparation
> and defers cleanup to a workqueue.
So you decided to use a workqueue; I kindly ask you to give me more
details about the performance improvements (UDP/TCP) and CPU usage.
I can try to do some tests on my side too. This could take a while,
unfortunately.
peppe
>
> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 6 +--
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 ++-
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 52 +++++++++++++----------
> 3 files changed, 37 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> index d6ed0ce..8ab83cb 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
> @@ -120,7 +120,7 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
> * to keep explicit chaining in the descriptor.
> */
> p->des3 = (unsigned int)(priv->dma_rx_phy +
> - (((priv->dirty_rx) + 1) %
> + ((priv->dirty_rx + 1) %
> priv->dma_rx_size) *
> sizeof(struct dma_desc));
> }
> @@ -135,9 +135,9 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
> * to keep explicit chaining in the descriptor.
> */
> p->des3 = (unsigned int)(priv->dma_tx_phy +
> - (((priv->dirty_tx + 1) %
> + ((atomic_read(&priv->dirty_tx) + 1) %
> priv->dma_tx_size) *
> - sizeof(struct dma_desc)));
> + sizeof(struct dma_desc));
> }
>
> const struct stmmac_chain_mode_ops chain_mode_ops = {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> index f16a9bd..d5b8906 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> @@ -38,8 +38,8 @@ struct stmmac_priv {
> struct dma_extended_desc *dma_etx ____cacheline_aligned_in_smp;
> struct dma_desc *dma_tx;
> struct sk_buff **tx_skbuff;
> - unsigned int cur_tx;
> - unsigned int dirty_tx;
> + atomic_t cur_tx;
> + atomic_t dirty_tx;
> unsigned int dma_tx_size;
> u32 tx_count_frames;
> u32 tx_coal_frames;
> @@ -106,6 +106,8 @@ struct stmmac_priv {
> u32 adv_ts;
> int use_riwt;
> spinlock_t ptp_lock;
> + struct work_struct txclean_work;
> + struct workqueue_struct *wq;
> };
>
> extern int phyaddr;
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 5873246..de614ad 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -48,6 +48,7 @@
> #include <linux/seq_file.h>
> #endif /* CONFIG_STMMAC_DEBUG_FS */
> #include <linux/net_tstamp.h>
> +#include <linux/workqueue.h>
> #include "stmmac_ptp.h"
> #include "stmmac.h"
>
> @@ -205,7 +206,8 @@ static void print_pkt(unsigned char *buf, int len)
>
> static inline u32 stmmac_tx_avail(struct stmmac_priv *priv)
> {
> - return priv->dirty_tx + priv->dma_tx_size - priv->cur_tx - 1;
> + return atomic_read(&priv->dirty_tx) + priv->dma_tx_size
> + - atomic_read(&priv->cur_tx) - 1;
> }
>
> /**
> @@ -230,7 +232,7 @@ static inline void stmmac_hw_fix_mac_speed(struct stmmac_priv *priv)
> static void stmmac_enable_eee_mode(struct stmmac_priv *priv)
> {
> /* Check and enter in LPI mode */
> - if ((priv->dirty_tx == priv->cur_tx) &&
> + if ((atomic_read(&priv->dirty_tx) == atomic_read(&priv->cur_tx)) &&
> (priv->tx_path_in_lpi_mode == false))
> priv->hw->mac->set_eee_mode(priv->ioaddr);
> }
> @@ -1118,8 +1120,8 @@ static int init_dma_desc_rings(struct net_device *dev)
> priv->tx_skbuff[i] = NULL;
> }
>
> - priv->dirty_tx = 0;
> - priv->cur_tx = 0;
> + atomic_set(&priv->dirty_tx, 0);
> + atomic_set(&priv->cur_tx, 0);
>
> stmmac_clear_descriptors(priv);
>
> @@ -1264,17 +1266,17 @@ static void stmmac_dma_operation_mode(int mtu, struct stmmac_priv *priv)
> * @priv: driver private structure
> * Description: it reclaims resources after transmission completes.
> */
> -static void stmmac_tx_clean(struct stmmac_priv *priv)
> +static void stmmac_tx_clean(struct work_struct *work)
> {
> + struct stmmac_priv *priv =
> + container_of(work, struct stmmac_priv, txclean_work);
> unsigned int txsize = priv->dma_tx_size;
>
> - spin_lock(&priv->tx_lock);
> -
> priv->xstats.tx_clean++;
>
> - while (priv->dirty_tx != priv->cur_tx) {
> + while (atomic_read(&priv->dirty_tx) != atomic_read(&priv->cur_tx)) {
> int last;
> - unsigned int entry = priv->dirty_tx % txsize;
> + unsigned int entry = atomic_read(&priv->dirty_tx) % txsize;
> struct sk_buff *skb = priv->tx_skbuff[entry];
> struct dma_desc *p;
>
> @@ -1308,7 +1310,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
> }
> if (netif_msg_tx_done(priv))
> pr_debug("%s: curr %d, dirty %d\n", __func__,
> - priv->cur_tx, priv->dirty_tx);
> + atomic_read(&priv->cur_tx),
> + atomic_read(&priv->dirty_tx));
>
> if (likely(priv->tx_skbuff_dma[entry])) {
> dma_unmap_single(priv->device,
> @@ -1326,7 +1329,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
>
> priv->hw->desc->release_tx_desc(p, priv->mode);
>
> - priv->dirty_tx++;
> + wmb();
> + atomic_inc(&priv->dirty_tx);
> }
> if (unlikely(netif_queue_stopped(priv->dev) &&
> stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv))) {
> @@ -1344,7 +1348,6 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
> stmmac_enable_eee_mode(priv);
> mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
> }
> - spin_unlock(&priv->tx_lock);
> }
>
> static inline void stmmac_enable_dma_irq(struct stmmac_priv *priv)
> @@ -1380,8 +1383,8 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
> priv->hw->desc->init_tx_desc(&priv->dma_tx[i],
> priv->mode,
> (i == txsize - 1));
> - priv->dirty_tx = 0;
> - priv->cur_tx = 0;
> + atomic_set(&priv->dirty_tx, 0);
> + atomic_set(&priv->cur_tx, 0);
> priv->hw->dma->start_tx(priv->ioaddr);
>
> priv->dev->stats.tx_errors++;
> @@ -1401,12 +1404,16 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
> int status;
>
> status = priv->hw->dma->dma_interrupt(priv->ioaddr, &priv->xstats);
> - if (likely((status & handle_rx)) || (status & handle_tx)) {
> + if (likely(status & handle_rx)) {
> if (likely(napi_schedule_prep(&priv->napi))) {
> stmmac_disable_dma_irq(priv);
> __napi_schedule(&priv->napi);
> }
> }
> + if (likely(status & handle_tx))
> + queue_work(priv->wq, &priv->txclean_work);
> +
> +
> if (unlikely(status & tx_hard_error_bump_tc)) {
> /* Try to bump up the dma threshold on this failure */
> if (unlikely(tc != SF_DMA_MODE) && (tc <= 256)) {
> @@ -1596,8 +1603,7 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
> static void stmmac_tx_timer(unsigned long data)
> {
> struct stmmac_priv *priv = (struct stmmac_priv *)data;
> -
> - stmmac_tx_clean(priv);
> + queue_work(priv->wq, &priv->txclean_work);
> }
>
> /**
> @@ -1876,7 +1882,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> if (priv->tx_path_in_lpi_mode)
> stmmac_disable_eee_mode(priv);
>
> - first_entry = entry = priv->cur_tx % txsize;
> + first_entry = entry = atomic_read(&priv->cur_tx) % txsize;
>
> csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
> /* To program the descriptors according to the size of the frame */
> @@ -1944,12 +1950,13 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> priv->hw->desc->set_tx_owner(first);
> wmb();
>
> - priv->cur_tx += nb_desc;
> + atomic_add(nb_desc, &priv->cur_tx);
>
> if (netif_msg_pktdata(priv)) {
> pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
> - __func__, (priv->cur_tx % txsize),
> - (priv->dirty_tx % txsize), entry, first, nfrags);
> + __func__, (atomic_read(&priv->cur_tx) % txsize),
> + (atomic_read(&priv->dirty_tx) % txsize), entry,
> + first, nfrags);
>
> if (priv->extend_desc)
> stmmac_display_ring((void *)priv->dma_etx, txsize, 1);
> @@ -2171,7 +2178,6 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
> int work_done = 0;
>
> priv->xstats.napi_poll++;
> - stmmac_tx_clean(priv);
>
> work_done = stmmac_rx(priv, budget);
> if (work_done < budget) {
> @@ -2747,6 +2753,8 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
>
> spin_lock_init(&priv->lock);
> spin_lock_init(&priv->tx_lock);
> + priv->wq = create_singlethread_workqueue("stmmac_wq");
> + INIT_WORK(&priv->txclean_work, stmmac_tx_clean);
>
> ret = register_netdev(ndev);
> if (ret) {
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
2013-10-21 13:40 ` Giuseppe CAVALLARO
@ 2013-10-21 16:28 ` Jimmy PERCHET
2013-10-22 13:24 ` Giuseppe CAVALLARO
0 siblings, 1 reply; 26+ messages in thread
From: Jimmy PERCHET @ 2013-10-21 16:28 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: netdev, jimmy.perchet
On 21/10/2013 15:40, Giuseppe CAVALLARO wrote:
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>> This patch addresses several issues which prevent jumbo frames from working properly:
>> .jumbo frames' last descriptor was not closed
>> .several confusions regarding the descriptor's max buffer size
>> .frags could not be jumbo
>>
>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
>
>
> Jimmy, thanks for this patch. Below are my first notes.
Thanks a lot for this first review.
> I'll continue to look at the patch to verify whether I missed
> something. I kindly ask you, for the next version, to add
> more comments, especially in the function that prepares the
> tx descriptors, to help me with the review.
Sure ;)
I hope to do a v2 by next week.
I'm OK with most of your comments. Some additional
notes below:
>> }
>> @@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>>
>> static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
>> {
>> - if (unlikely(len > BUF_SIZE_2KiB)) {
>> + if (unlikely(len >= BUF_SIZE_2KiB)) {
>
> we cannot manage a size of 2048 on normal descriptors.
>
> Please verify that you do not break backward compatibility.
IMHO, this actually fixes the problem you think I created.
In the current code, if len is equal to 2048, buffer1_size is set to 2048;
this is wrong because the max size is actually 2047...
>
>> p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
>> p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
>> } else
>>
>> static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>> {
>> @@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>> if (unlikely(priv->plat->has_gmac))
>> /* Fill DES3 in case of RING mode */
>> if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
>> - p->des3 = p->des2 + BUF_SIZE_8KiB;
>> + p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
>
> is it correct? can you check?
The actual buffer's max size is 8191, so, in ring mode,
the second buffer must start at p->des2 + 8191.
>> - priv->cur_tx++;
>> + priv->cur_tx += nb_desc;
>
> can we avoid using nb_desc?
Actually, it is preparation for my 5th patch: I want to write cur_tx only once.
I can split this.
Best Regards,
Jimmy
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
2013-10-21 13:52 ` Giuseppe CAVALLARO
@ 2013-10-21 16:30 ` Eric Dumazet
2013-10-21 18:05 ` Jimmy PERCHET
0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2013-10-21 16:30 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: Jimmy Perchet, netdev
On Mon, 2013-10-21 at 15:52 +0200, Giuseppe CAVALLARO wrote:
> Hello Jimmy
>
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> > Tx descriptor cleanup and preparation are serialized, which is not necessary
> > and decreases performance.
> > In addition, TX descriptor cleanup is performed in the NET_RX softirq, which is
> > confusing.
>
> hmm, here you are changing the logic behind the tx/rx processes.
>
> As done in many drivers, the stmmac cleans the tx resources in
> NAPI context, and this is not a confusing approach ;-).
>
> It gave me some performance improvements, especially on TCP benchmarks.
>
> > This patch unserializes tx descriptor cleanup and preparation
> > and defers cleanup to a workqueue.
>
> So you decided to use a workqueue; I kindly ask you to give me more
> details about the performance improvements (UDP/TCP) and CPU usage.
>
> I can try to do some tests on my side too. This could take a while,
> unfortunately.
Anyway this patch is buggy.
1) Removing the tx_lock spinlock in TX completion adds a race in
stmmac_xmit()
2) Generally speaking, we should not rely on a work queue to perform TX
completions.
Think about being flooded by incoming frames.
The work queue might never be scheduled.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
2013-10-21 16:30 ` Eric Dumazet
@ 2013-10-21 18:05 ` Jimmy PERCHET
2013-10-21 18:24 ` Eric Dumazet
0 siblings, 1 reply; 26+ messages in thread
From: Jimmy PERCHET @ 2013-10-21 18:05 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Giuseppe CAVALLARO, netdev, Jimmy Perchet
On 21/10/2013 18:30, Eric Dumazet wrote:
> On Mon, 2013-10-21 at 15:52 +0200, Giuseppe CAVALLARO wrote:
>> Hello Jimmy
>>
>> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>> Tx descriptor cleanup and preparation are serialized, which is not necessary
>>> and decreases performance.
>>> In addition, TX descriptor cleanup is performed in the NET_RX softirq, which is
>>> confusing.
>>
>> hmm, here you are changing the logic behind the tx/rx processes.
>>
>> As done in many drivers, the stmmac cleans the tx resources in
>> NAPI context, and this is not a confusing approach ;-).
>>
>> It gave me some performance improvements, especially on TCP benchmarks.
>>
>>> This patch unserializes tx descriptor cleanup and preparation
>>> and defers cleanup to a workqueue.
>>
>> So you decided to use a workqueue; I kindly ask you to give me more
>> details about the performance improvements (UDP/TCP) and CPU usage.
>>
>> I can try to do some tests on my side too. This could take a while,
>> unfortunately.
>
> Anyway this patch is buggy.
>
> 1) Removing the tx_lock spinlock in TX completion adds a race in
> stmmac_xmit()
>
> 2) Generally speaking, we should not rely on a work queue to perform TX
> completions.
>
> Think about being flooded by incoming frames.
>
> The work queue might never be scheduled.
>
>
I understand your point. Nevertheless, I think it is still possible
to avoid serialization, and therefore increase performance, even if
completions must remain in softirq. What do you think?
In my patch I tried to avoid any race condition (by updating both
descriptor cursors only once, for instance).
Could you explain the possible race you see?
Best Regards,
Jimmy
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
2013-10-21 18:05 ` Jimmy PERCHET
@ 2013-10-21 18:24 ` Eric Dumazet
0 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2013-10-21 18:24 UTC (permalink / raw)
To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
On Mon, 2013-10-21 at 20:05 +0200, Jimmy PERCHET wrote:
> I understand your point. Nevertheless, I think it is still possible
> to avoid serialization, and therefore increase performance, even if
> completions must remain in softirq. What do you think?
I think this will break over time. This kind of 'optimization' works on
a particular workload, and that's it. It reminds me of the 'skb recycle'
thing.
The workload you try to optimize might conflict with other workloads.
Have you tried reducing the 64 value in netif_napi_add()?
>
> In my patch I tried to avoid any race condition (by updating both
> descriptor cursors only once, for instance).
> Could you explain the possible race you see?
Well, take a look at the netif_queue_stopped(), netif_wake_queue(),
and netif_queue_stopped() calls for a start.
It's really hard to make start_xmit() lockless (versus TX completion).
Adding atomic_t won't be enough (and btw that's often not needed at all).
drivers/net/ethernet/broadcom/tg3.c is a good reference if you really
want to do that properly.
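A minimal sketch of that tg3-style pattern, assuming cur_tx is only
written by the xmit path and dirty_tx only by the completion path:

	/* xmit side: stop first, then re-check after the barrier, so a
	 * completion running in parallel cannot leave the queue stopped
	 * while room is actually available. */
	if (unlikely(stmmac_tx_avail(priv) < nfrags + 1)) {
		netif_stop_queue(dev);
		smp_mb();
		if (stmmac_tx_avail(priv) < nfrags + 1)
			return NETDEV_TX_BUSY;
		netif_wake_queue(dev);
	}

	/* completion side: publish the dirty_tx update before testing
	 * the queue state, mirroring the barrier on the xmit side. */
	smp_mb();
	if (netif_queue_stopped(priv->dev) &&
	    stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv))
		netif_wake_queue(priv->dev);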
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-21 13:10 ` Jimmy PERCHET
@ 2013-10-21 18:32 ` Eric Dumazet
2013-10-21 18:49 ` Eric Dumazet
0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2013-10-21 18:32 UTC (permalink / raw)
To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
On Mon, 2013-10-21 at 15:10 +0200, Jimmy PERCHET wrote:
> Hello Peppe,
>
> I can reproduce this problem by issuing 9KiB jumbo frames on a 10Mbit/s link.
> If the socket's wmem size is about 500KiB (or less), the transfer stalls.
> (I guess it is reproducible with 1500-byte frames by decreasing the
> socket's wmem to 90KB.)
> Re-arming the timer fixes this behaviour.
>
> Here is my understanding of this issue:
> with 9KiB frames and 500KiB of wmem, only about 60 frames can be
> prepared in a row. That is below the tx coalescence threshold,
> so there will be no interrupt. When the tx coalescence timer
> expires (40ms later), only five descriptors have to be
> freed (9000*5 bytes @ 10Mbit/s = 36ms); that is not enough to reach
> the socket's wake-up threshold. We get into a deadlock:
> *The socket is waiting for free buffers before performing a new transfer.
> *The driver is waiting for a new transfer before performing cleanup.
>
> Maybe it is not a real-life use case and is not worth
> a patch. What do you think?
>
I think there is probably a bug in the driver, a race of some sort,
and it would be better to find it and fix it ;)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-21 18:32 ` Eric Dumazet
@ 2013-10-21 18:49 ` Eric Dumazet
2013-10-22 13:33 ` Giuseppe CAVALLARO
0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2013-10-21 18:49 UTC (permalink / raw)
To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
On Mon, 2013-10-21 at 11:32 -0700, Eric Dumazet wrote:
> On Mon, 2013-10-21 at 15:10 +0200, Jimmy PERCHET wrote:
> > Hello Peppe,
> >
>
> > I can reproduce this problem by issuing 9KiB jumbo frames on a 10Mbit/s link.
> > If the socket's wmem size is about 500KiB (or less), the transfer stalls.
> > (I guess it is reproducible with 1500-byte frames by decreasing the
> > socket's wmem to 90KB.)
> > Re-arming the timer fixes this behaviour.
> >
> > Here is my understanding of this issue:
> > with 9KiB frames and 500KiB of wmem, only about 60 frames can be
> > prepared in a row. That is below the tx coalescence threshold,
> > so there will be no interrupt. When the tx coalescence timer
> > expires (40ms later), only five descriptors have to be
> > freed (9000*5 bytes @ 10Mbit/s = 36ms); that is not enough to reach
> > the socket's wake-up threshold. We get into a deadlock:
> > *The socket is waiting for free buffers before performing a new transfer.
> > *The driver is waiting for a new transfer before performing cleanup.
> >
> > Maybe it is not a real-life use case and is not worth
> > a patch. What do you think?
> >
>
> I think there is probably a bug in the driver, a race of some sort,
> and it would be better to find it and fix it ;)
>
coalesce params should not be hardcoded, but should depend on link speed
and mtu.
At 10Mbit/s and MTU=9000 there is really no point in using coalescing!
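One way to do it would be a hook in the adjust_link path; a rough,
untested sketch (stmmac_tune_tx_coal() is a hypothetical helper and the
thresholds are illustrative only):

	/* Re-tune tx coalescing whenever the PHY reports a new speed:
	 * at low speed, or with a jumbo mtu, interrupt on every frame. */
	static void stmmac_tune_tx_coal(struct stmmac_priv *priv, int speed)
	{
		if (speed <= SPEED_10 || priv->dev->mtu > ETH_DATA_LEN)
			priv->tx_coal_frames = 1;
		else
			priv->tx_coal_frames = STMMAC_TX_FRAMES;
	}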
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
2013-10-21 16:28 ` Jimmy PERCHET
@ 2013-10-22 13:24 ` Giuseppe CAVALLARO
0 siblings, 0 replies; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-22 13:24 UTC (permalink / raw)
To: Jimmy PERCHET; +Cc: netdev, jimmy.perchet
On 10/21/2013 6:28 PM, Jimmy PERCHET wrote:
> On 21/10/2013 15:40, Giuseppe CAVALLARO wrote:
>> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>> This patch addresses several issues which prevent jumbo frames from working properly :
>>> .jumbo frames' last descriptor was not closed
>>> .several confusions regarding the descriptor's max buffer size
>>> .frags could not be jumbo
>>>
>>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
>>
>>
>> Jimmy, thanks for this patch. Below are my first notes.
>
> Thanks a lot for this first review.
welcome
>
>> I'll continue to look at the patch to verify whether I missed
>> something. I kindly ask you, for the next version, to add
>> more comments, especially in the function that prepares the
>> tx descriptors, to help me with the review.
>
> Sure ;)
>
> I hope to do a v2 by next week.
ok, thanks. I'll try to help with reviewing the v2 again.
>
> I'm OK with most of your comments. Some additional
> notes below:
>
>>> }
>>> @@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>>>
>>> static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
>>> {
>>> - if (unlikely(len > BUF_SIZE_2KiB)) {
>>> + if (unlikely(len >= BUF_SIZE_2KiB)) {
>>
>> we cannot manage a size of 2048 on normal descriptors.
>>
>> Please verify that you do not break backward compatibility.
>
> IMHO, this actually fixes the problem you think I created.
> In the current code, if len is equal to 2048, buffer1_size is set to 2048;
> this is wrong because the max size is actually 2047...
IIRC, for normal descriptors, the TBS2/1 are just 11 bits
so the max programmable size is 2047 (0x7ff).
>
>>
>>> p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
>>> p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
>>> } else
>
>
>
>>>
>>> static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>> {
>>> @@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>> if (unlikely(priv->plat->has_gmac))
>>> /* Fill DES3 in case of RING mode */
>>> if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
>>> - p->des3 = p->des2 + BUF_SIZE_8KiB;
>>> + p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
>>
>> is it correct? can you check?
>
> The actual buffer's max size is 8191, so, in ring mode,
> the second buffer must start at p->des2 + 8191.
>
>>> - priv->cur_tx++;
>>> + priv->cur_tx += nb_desc;
>>
>> can we avoid using nb_desc?
> Actually, it is preparation for my 5th patch: I want to write cur_tx only once.
> I can split this.
ok
>
>
>
> Best Regards,
> Jimmy
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
2013-10-21 18:49 ` Eric Dumazet
@ 2013-10-22 13:33 ` Giuseppe CAVALLARO
0 siblings, 0 replies; 26+ messages in thread
From: Giuseppe CAVALLARO @ 2013-10-22 13:33 UTC (permalink / raw)
To: Eric Dumazet, Jimmy PERCHET; +Cc: netdev
On 10/21/2013 8:49 PM, Eric Dumazet wrote:
> On Mon, 2013-10-21 at 11:32 -0700, Eric Dumazet wrote:
>> On Mon, 2013-10-21 at 15:10 +0200, Jimmy PERCHET wrote:
>>> Hello Peppe,
>>>
>>
>>> I can reproduce this problem by issuing 9KiB jumbo frames on a 10Mbit/s link.
>>> If the socket's wmem size is about 500KiB (or less), the transfer stalls.
>>> (I guess it is reproducible with 1500-byte frames by decreasing the
>>> socket's wmem to 90KB.)
>>> Re-arming the timer fixes this behaviour.
>>>
>>> Here is my understanding of this issue:
>>> with 9KiB frames and 500KiB of wmem, only about 60 frames can be
>>> prepared in a row. That is below the tx coalescence threshold,
>>> so there will be no interrupt. When the tx coalescence timer
>>> expires (40ms later), only five descriptors have to be
>>> freed (9000*5 bytes @ 10Mbit/s = 36ms); that is not enough to reach
>>> the socket's wake-up threshold. We get into a deadlock:
>>> *The socket is waiting for free buffers before performing a new transfer.
>>> *The driver is waiting for a new transfer before performing cleanup.
>>>
>>> Maybe it is not a real-life use case and is not worth
>>> a patch. What do you think?
>>>
>>
>> I think there is probably a bug in the driver, a race of some sort,
>> and it would be better to find it and fix it ;)
>>
>
> coalesce params should not be hardcoded, but should depend on link speed
> and mtu.
>
> At 10Mbit/s and MTU=9000 there is really no point in using coalescing!
so the final patch could be to tune/disable the tx coalescing according
to speed and mtu.
Indeed, I had already added something that can help with that:
we can tune tx_coal_frames and decide when to set the IC bit
(interrupt-on-completion bit) in the frame to be transmitted.
This can be done via ethtool.
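For instance, assuming the tx-frames knob is wired into the driver's
ethtool_ops (exact option support depends on the kernel version):

  # request an interrupt for every transmitted frame
  ethtool -C eth0 tx-frames 1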
This should reduce the mitigation, so you can tune everything in case
of low speed or jumbo frames. IIRC, you could even decide to disable
mitigation entirely. Jimmy, can you try this? In any case, let me know.
Peppe
^ permalink raw reply [flat|nested] 26+ messages in thread
Thread overview: 26+ messages
2013-10-16 15:24 [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Jimmy Perchet
2013-10-16 15:24 ` [PATCH RFC 1/5] net:stmmac: set threshold/store and forward mode according to mtu size Jimmy Perchet
2013-10-21 8:47 ` Giuseppe CAVALLARO
2013-10-21 9:58 ` Rayagond K
2013-10-21 13:49 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 2/5] net:stmmac: fix rx buffer allocation Jimmy Perchet
2013-10-21 8:54 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors Jimmy Perchet
2013-10-16 17:46 ` Sergei Shtylyov
2013-10-18 8:32 ` Jimmy PERCHET
2013-10-21 9:07 ` Giuseppe CAVALLARO
2013-10-21 13:10 ` Jimmy PERCHET
2013-10-21 18:32 ` Eric Dumazet
2013-10-21 18:49 ` Eric Dumazet
2013-10-22 13:33 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling Jimmy Perchet
2013-10-21 13:40 ` Giuseppe CAVALLARO
2013-10-21 16:28 ` Jimmy PERCHET
2013-10-22 13:24 ` Giuseppe CAVALLARO
2013-10-16 15:24 ` [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean Jimmy Perchet
2013-10-21 13:52 ` Giuseppe CAVALLARO
2013-10-21 16:30 ` Eric Dumazet
2013-10-21 18:05 ` Jimmy PERCHET
2013-10-21 18:24 ` Eric Dumazet
2013-10-16 20:37 ` [PATCH RFC 0/5] net:stmmac: fix jumbo frames handling and optimisation Giuseppe CAVALLARO
2013-10-18 16:24 ` Jimmy PERCHET