* [PATCH RFC 04/14] drivers: wireless: rt2x00: move skb_dma to queue entry [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> @ 2015-03-02 17:40 ` Florian Westphal 2015-03-02 17:40 ` [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of Florian Westphal ` (5 subsequent siblings) 6 siblings, 0 replies; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless ... to shrink size needed for the skb frame desc. Note that this is just the lowest hanging fruit I found, if you find better/other means to shrink the space needed in skb->cb (or to shrink it even further) -- great. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- drivers/net/wireless/rt2x00/rt2400pci.c | 5 ++--- drivers/net/wireless/rt2x00/rt2500pci.c | 5 ++--- drivers/net/wireless/rt2x00/rt2800mmio.c | 7 +++---- drivers/net/wireless/rt2x00/rt2x00queue.c | 10 +++++----- drivers/net/wireless/rt2x00/rt2x00queue.h | 17 ++++++++--------- drivers/net/wireless/rt2x00/rt61pci.c | 5 ++--- 6 files changed, 22 insertions(+), 27 deletions(-) diff --git a/drivers/net/wireless/rt2x00/rt2400pci.c b/drivers/net/wireless/rt2x00/rt2400pci.c index bdf5590..5c7927c 100644 --- a/drivers/net/wireless/rt2x00/rt2400pci.c +++ b/drivers/net/wireless/rt2x00/rt2400pci.c @@ -739,7 +739,6 @@ static bool rt2400pci_get_entry_state(struct queue_entry *entry) static void rt2400pci_clear_entry(struct queue_entry *entry) { struct queue_entry_priv_mmio *entry_priv = entry->priv_data; - struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); u32 word; if (entry->queue->qid == QID_RX) { @@ -748,7 +747,7 @@ static void rt2400pci_clear_entry(struct queue_entry *entry) rt2x00_desc_write(entry_priv->desc, 2, word); rt2x00_desc_read(entry_priv->desc, 1, &word); - rt2x00_set_field32(&word, RXD_W1_BUFFER_ADDRESS, skbdesc->skb_dma); + rt2x00_set_field32(&word, RXD_W1_BUFFER_ADDRESS, entry->skb_dma); rt2x00_desc_write(entry_priv->desc, 1, word); rt2x00_desc_read(entry_priv->desc, 0, &word); @@ -1111,7 +1110,7 @@ static void rt2400pci_write_tx_desc(struct queue_entry *entry, * Start writing the descriptor words. */ rt2x00_desc_read(txd, 1, &word); - rt2x00_set_field32(&word, TXD_W1_BUFFER_ADDRESS, skbdesc->skb_dma); + rt2x00_set_field32(&word, TXD_W1_BUFFER_ADDRESS, entry->skb_dma); rt2x00_desc_write(txd, 1, word); rt2x00_desc_read(txd, 2, &word); diff --git a/drivers/net/wireless/rt2x00/rt2500pci.c b/drivers/net/wireless/rt2x00/rt2500pci.c index 79f4fe6..907d7c2 100644 --- a/drivers/net/wireless/rt2x00/rt2500pci.c +++ b/drivers/net/wireless/rt2x00/rt2500pci.c @@ -828,12 +828,11 @@ static bool rt2500pci_get_entry_state(struct queue_entry *entry) static void rt2500pci_clear_entry(struct queue_entry *entry) { struct queue_entry_priv_mmio *entry_priv = entry->priv_data; - struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); u32 word; if (entry->queue->qid == QID_RX) { rt2x00_desc_read(entry_priv->desc, 1, &word); - rt2x00_set_field32(&word, RXD_W1_BUFFER_ADDRESS, skbdesc->skb_dma); + rt2x00_set_field32(&word, RXD_W1_BUFFER_ADDRESS, entry->skb_dma); rt2x00_desc_write(entry_priv->desc, 1, word); rt2x00_desc_read(entry_priv->desc, 0, &word); @@ -1264,7 +1263,7 @@ static void rt2500pci_write_tx_desc(struct queue_entry *entry, * Start writing the descriptor words. */ rt2x00_desc_read(txd, 1, &word); - rt2x00_set_field32(&word, TXD_W1_BUFFER_ADDRESS, skbdesc->skb_dma); + rt2x00_set_field32(&word, TXD_W1_BUFFER_ADDRESS, entry->skb_dma); rt2x00_desc_write(txd, 1, word); rt2x00_desc_read(txd, 2, &word); diff --git a/drivers/net/wireless/rt2x00/rt2800mmio.c b/drivers/net/wireless/rt2x00/rt2800mmio.c index de4790b..45772a1 100644 --- a/drivers/net/wireless/rt2x00/rt2800mmio.c +++ b/drivers/net/wireless/rt2x00/rt2800mmio.c @@ -66,7 +66,7 @@ void rt2800mmio_write_tx_desc(struct queue_entry *entry, * Initialize TX descriptor */ word = 0; - rt2x00_set_field32(&word, TXD_W0_SD_PTR0, skbdesc->skb_dma); + rt2x00_set_field32(&word, TXD_W0_SD_PTR0, entry->skb_dma); rt2x00_desc_write(txd, 0, word); word = 0; @@ -82,7 +82,7 @@ void rt2800mmio_write_tx_desc(struct queue_entry *entry, word = 0; rt2x00_set_field32(&word, TXD_W2_SD_PTR1, - skbdesc->skb_dma + txwi_size); + entry->skb_dma + txwi_size); rt2x00_desc_write(txd, 2, word); word = 0; @@ -710,13 +710,12 @@ EXPORT_SYMBOL_GPL(rt2800mmio_get_entry_state); void rt2800mmio_clear_entry(struct queue_entry *entry) { struct queue_entry_priv_mmio *entry_priv = entry->priv_data; - struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); struct rt2x00_dev *rt2x00dev = entry->queue->rt2x00dev; u32 word; if (entry->queue->qid == QID_RX) { rt2x00_desc_read(entry_priv->desc, 0, &word); - rt2x00_set_field32(&word, RXD_W0_SDP0, skbdesc->skb_dma); + rt2x00_set_field32(&word, RXD_W0_SDP0, entry->skb_dma); rt2x00_desc_write(entry_priv->desc, 0, word); rt2x00_desc_read(entry_priv->desc, 1, &word); diff --git a/drivers/net/wireless/rt2x00/rt2x00queue.c b/drivers/net/wireless/rt2x00/rt2x00queue.c index 68b620b..6c79266 100644 --- a/drivers/net/wireless/rt2x00/rt2x00queue.c +++ b/drivers/net/wireless/rt2x00/rt2x00queue.c @@ -95,7 +95,7 @@ struct sk_buff *rt2x00queue_alloc_rxskb(struct queue_entry *entry, gfp_t gfp) return NULL; } - skbdesc->skb_dma = skb_dma; + entry->skb_dma = skb_dma; skbdesc->flags |= SKBDESC_DMA_MAPPED_RX; } @@ -107,10 +107,10 @@ int rt2x00queue_map_txskb(struct queue_entry *entry) struct device *dev = entry->queue->rt2x00dev->dev; struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); - skbdesc->skb_dma = + entry->skb_dma = dma_map_single(dev, entry->skb->data, entry->skb->len, DMA_TO_DEVICE); - if (unlikely(dma_mapping_error(dev, skbdesc->skb_dma))) + if (unlikely(dma_mapping_error(dev, entry->skb_dma))) return -ENOMEM; skbdesc->flags |= SKBDESC_DMA_MAPPED_TX; @@ -124,11 +124,11 @@ void rt2x00queue_unmap_skb(struct queue_entry *entry) struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); if (skbdesc->flags & SKBDESC_DMA_MAPPED_RX) { - dma_unmap_single(dev, skbdesc->skb_dma, entry->skb->len, + dma_unmap_single(dev, entry->skb_dma, entry->skb->len, DMA_FROM_DEVICE); skbdesc->flags &= ~SKBDESC_DMA_MAPPED_RX; } else if (skbdesc->flags & SKBDESC_DMA_MAPPED_TX) { - dma_unmap_single(dev, skbdesc->skb_dma, entry->skb->len, + dma_unmap_single(dev, entry->skb_dma, entry->skb->len, DMA_TO_DEVICE); skbdesc->flags &= ~SKBDESC_DMA_MAPPED_TX; } diff --git a/drivers/net/wireless/rt2x00/rt2x00queue.h b/drivers/net/wireless/rt2x00/rt2x00queue.h index 2233b91..e6f6a2e 100644 --- a/drivers/net/wireless/rt2x00/rt2x00queue.h +++ b/drivers/net/wireless/rt2x00/rt2x00queue.h @@ -101,23 +101,19 @@ enum skb_frame_desc_flags { * Note that this pointer could point to something outside * of the scope of the skb->data pointer. * @iv: IV/EIV data used during encryption/decryption. - * @skb_dma: (PCI-only) the DMA address associated with the sk buffer. * @entry: The entry to which this sk buffer belongs. */ struct skb_frame_desc { - u8 flags; - - u8 desc_len; - u8 tx_rate_idx; - u8 tx_rate_flags; - + struct queue_entry *entry; void *desc; __le32 iv[2]; - dma_addr_t skb_dma; + u8 flags; - struct queue_entry *entry; + u8 desc_len; + u8 tx_rate_idx; + u8 tx_rate_flags; }; /** @@ -372,6 +368,7 @@ enum queue_entry_flags { * @entry_idx: The entry index number. * @priv_data: Private data belonging to this queue entry. The pointer * points to data specific to a particular driver and queue type. + * @skb_dma: (PCI-only) the DMA address associated with the sk buffer. * @status: Device specific status */ struct queue_entry { @@ -386,6 +383,8 @@ struct queue_entry { u32 status; + dma_addr_t skb_dma; + void *priv_data; }; diff --git a/drivers/net/wireless/rt2x00/rt61pci.c b/drivers/net/wireless/rt2x00/rt61pci.c index 8194550..a0c8de1 100644 --- a/drivers/net/wireless/rt2x00/rt61pci.c +++ b/drivers/net/wireless/rt2x00/rt61pci.c @@ -1397,13 +1397,12 @@ static bool rt61pci_get_entry_state(struct queue_entry *entry) static void rt61pci_clear_entry(struct queue_entry *entry) { struct queue_entry_priv_mmio *entry_priv = entry->priv_data; - struct skb_frame_desc *skbdesc = get_skb_frame_desc(entry->skb); u32 word; if (entry->queue->qid == QID_RX) { rt2x00_desc_read(entry_priv->desc, 5, &word); rt2x00_set_field32(&word, RXD_W5_BUFFER_PHYSICAL_ADDRESS, - skbdesc->skb_dma); + entry->skb_dma); rt2x00_desc_write(entry_priv->desc, 5, word); rt2x00_desc_read(entry_priv->desc, 0, &word); @@ -1913,7 +1912,7 @@ static void rt61pci_write_tx_desc(struct queue_entry *entry, if (entry->queue->qid != QID_BEACON) { rt2x00_desc_read(txd, 6, &word); rt2x00_set_field32(&word, TXD_W6_BUFFER_PHYSICAL_ADDRESS, - skbdesc->skb_dma); + entry->skb_dma); rt2x00_desc_write(txd, 6, word); rt2x00_desc_read(txd, 11, &word); -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> 2015-03-02 17:40 ` [PATCH RFC 04/14] drivers: wireless: rt2x00: move skb_dma to queue entry Florian Westphal @ 2015-03-02 17:40 ` Florian Westphal 2015-03-03 9:16 ` Pontus Fuchs 2015-03-02 17:40 ` [PATCH RFC 06/14] drivers: wireless: carl9170: shrink carl9170_tx_info Florian Westphal ` (4 subsequent siblings) 6 siblings, 1 reply; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless Compile tested only due to lack of hw. If we want to shrink skb->cb then we'd have to see about reducing struct ieee80211_tx_info, which gets embedded inside skb->cb[]. It provides a scratch space to be used by wireless drivers. ar5523 uses the maximum space available today (40 bytes), but it seems we don't need this -- data->skb pointer seems to always point back to the skb whose cb buffer the data structure resides, iow, given a pointer to the embedded control buffer we can infer the skb address. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- drivers/net/wireless/ath/ar5523/ar5523.c | 8 ++++---- drivers/net/wireless/ath/ar5523/ar5523.h | 1 - 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c index f920506..7410ac0 100644 --- a/drivers/net/wireless/ath/ar5523/ar5523.c +++ b/drivers/net/wireless/ath/ar5523/ar5523.c @@ -779,8 +779,6 @@ static void ar5523_tx(struct ieee80211_hw *hw, ieee80211_stop_queues(hw); } - data->skb = skb; - spin_lock_irqsave(&ar->tx_data_list_lock, flags); list_add_tail(&data->list, &ar->tx_queue_pending); spin_unlock_irqrestore(&ar->tx_data_list_lock, flags); @@ -817,10 +815,12 @@ static void ar5523_tx_work_locked(struct ar5523 *ar) if (!data) break; - skb = data->skb; + txi = container_of((void *)data, struct ieee80211_tx_info, driver_data); txqid = 0; - txi = IEEE80211_SKB_CB(skb); + + skb = container_of((void *) txi, struct sk_buff, cb); paylen = skb->len; + urb = usb_alloc_urb(0, GFP_KERNEL); if (!urb) { ar5523_err(ar, "Failed to allocate TX urb\n"); diff --git a/drivers/net/wireless/ath/ar5523/ar5523.h b/drivers/net/wireless/ath/ar5523/ar5523.h index 00c6fd3..9a322a6 100644 --- a/drivers/net/wireless/ath/ar5523/ar5523.h +++ b/drivers/net/wireless/ath/ar5523/ar5523.h @@ -74,7 +74,6 @@ struct ar5523_tx_cmd { struct ar5523_tx_data { struct list_head list; struct ar5523 *ar; - struct sk_buff *skb; struct urb *urb; }; -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of 2015-03-02 17:40 ` [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of Florian Westphal @ 2015-03-03 9:16 ` Pontus Fuchs 0 siblings, 0 replies; 19+ messages in thread From: Pontus Fuchs @ 2015-03-03 9:16 UTC (permalink / raw) To: Florian Westphal, netdev; +Cc: linux-wireless On 2015-03-02 18:40, Florian Westphal wrote: > Compile tested only due to lack of hw. > > If we want to shrink skb->cb then we'd have to see about > reducing struct ieee80211_tx_info, which gets embedded inside > skb->cb[]. > > It provides a scratch space to be used by wireless drivers. > ar5523 uses the maximum space available today (40 bytes), but it seems > we don't need this -- data->skb pointer seems to always point back to the > skb whose cb buffer the data structure resides, iow, given a pointer to the > embedded control buffer we can infer the skb address. > > Cc: linux-wireless@vger.kernel.org > Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Pontus Fuchs <pontus.fuchs@gmail.com> ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH RFC 06/14] drivers: wireless: carl9170: shrink carl9170_tx_info [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> 2015-03-02 17:40 ` [PATCH RFC 04/14] drivers: wireless: rt2x00: move skb_dma to queue entry Florian Westphal 2015-03-02 17:40 ` [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of Florian Westphal @ 2015-03-02 17:40 ` Florian Westphal 2015-03-02 17:40 ` [PATCH RFC 07/14] net: wireless: iwlwifi: shrink status private area Florian Westphal ` (3 subsequent siblings) 6 siblings, 0 replies; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless compile tested only. its embededded inside rate_driver_data of the ieee80211_tx_info struct, which in turn is stored in skb->cb[]. In order to shrink cb, we need to shrink ieee80211_tx_info which means to downsize the users. Alternatively, one might be able to remove kref but its less intrusive/simpler to use a u32 for timeout handling. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- drivers/net/wireless/ath/carl9170/carl9170.h | 4 ++-- drivers/net/wireless/ath/carl9170/debug.c | 2 +- drivers/net/wireless/ath/carl9170/tx.c | 22 +++++++++++++++++----- 3 files changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireless/ath/carl9170/carl9170.h b/drivers/net/wireless/ath/carl9170/carl9170.h index 237d0cd..a785300 100644 --- a/drivers/net/wireless/ath/carl9170/carl9170.h +++ b/drivers/net/wireless/ath/carl9170/carl9170.h @@ -491,9 +491,9 @@ struct carl9170_sta_info { }; struct carl9170_tx_info { - unsigned long timeout; - struct ar9170 *ar; + u32 timeout32; struct kref ref; + struct ar9170 *ar; }; #define CHK_DEV_STATE(a, s) (((struct ar9170 *)a)->state >= (s)) diff --git a/drivers/net/wireless/ath/carl9170/debug.c b/drivers/net/wireless/ath/carl9170/debug.c index 6808db4..d00ab9d 100644 --- a/drivers/net/wireless/ath/carl9170/debug.c +++ b/drivers/net/wireless/ath/carl9170/debug.c @@ -291,7 +291,7 @@ static void carl9170_debugfs_format_frame(struct ar9170 *ar, "pc:%.8x, to:%d ms\n", prefix, skb, txc->s.cookie, ieee80211_get_DA(hdr), get_seq_h(hdr), le16_to_cpu(txc->f.mac_control), le32_to_cpu(txc->f.phy_control), - jiffies_to_msecs(jiffies - arinfo->timeout)); + jiffies_to_msecs(((u32)jiffies) - arinfo->timeout32)); } diff --git a/drivers/net/wireless/ath/carl9170/tx.c b/drivers/net/wireless/ath/carl9170/tx.c index ae86a600..85687aaf4 100644 --- a/drivers/net/wireless/ath/carl9170/tx.c +++ b/drivers/net/wireless/ath/carl9170/tx.c @@ -555,6 +555,18 @@ static void carl9170_tx_fill_rateinfo(struct ar9170 *ar, unsigned int rix, } } +static inline bool carl9170_time_is_before_jiffies(u32 timeout) +{ + u32 now = (u32) jiffies; + return (s32)(now - timeout) >= 0; +} + +static inline bool carl9170_time_is_after_jiffies(u32 timeout) +{ + u32 now = (u32) jiffies; + return (s32)(now - timeout) < 0; +} + static void carl9170_check_queue_stop_timeout(struct ar9170 *ar) { int i; @@ -574,8 +586,8 @@ static void carl9170_check_queue_stop_timeout(struct ar9170 *ar) txinfo = IEEE80211_SKB_CB(skb); arinfo = (void *) txinfo->rate_driver_data; - if (time_is_before_jiffies(arinfo->timeout + - msecs_to_jiffies(CARL9170_QUEUE_STUCK_TIMEOUT)) == true) + if (carl9170_time_is_before_jiffies(arinfo->timeout32 + + (u32)msecs_to_jiffies(CARL9170_QUEUE_STUCK_TIMEOUT)) == true) restart = true; next: @@ -620,7 +632,7 @@ static void carl9170_tx_ampdu_timeout(struct ar9170 *ar) txinfo = IEEE80211_SKB_CB(skb); arinfo = (void *)txinfo->rate_driver_data; - if (time_is_after_jiffies(arinfo->timeout + + if (carl9170_time_is_after_jiffies(arinfo->timeout32 + msecs_to_jiffies(CARL9170_QUEUE_TIMEOUT))) goto unlock; @@ -1066,7 +1078,7 @@ static int carl9170_tx_prepare(struct ar9170 *ar, txc->f.mac_control = mac_tmp; arinfo = (void *)info->rate_driver_data; - arinfo->timeout = jiffies; + arinfo->timeout32 = (u32) jiffies; arinfo->ar = ar; kref_init(&arinfo->ref); return 0; @@ -1259,7 +1271,7 @@ static struct sk_buff *carl9170_tx_pick_skb(struct ar9170 *ar, info = IEEE80211_SKB_CB(skb); arinfo = (void *) info->rate_driver_data; - arinfo->timeout = jiffies; + arinfo->timeout32 = (u32) jiffies; return skb; err_unlock: -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH RFC 07/14] net: wireless: iwlwifi: shrink status private area [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> ` (2 preceding siblings ...) 2015-03-02 17:40 ` [PATCH RFC 06/14] drivers: wireless: carl9170: shrink carl9170_tx_info Florian Westphal @ 2015-03-02 17:40 ` Florian Westphal 2015-03-02 17:40 ` [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info Florian Westphal ` (2 subsequent siblings) 6 siblings, 0 replies; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless iwlwifi is only driver to use status_driver_data, and it only wants to store u8 counter. This is one patch needed to reduce size of ieee80211_tx_info, which in turn is needed to allow reducing size of skb->cb[]. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- drivers/net/wireless/iwlwifi/mvm/rs.c | 2 +- drivers/net/wireless/iwlwifi/mvm/tx.c | 8 +++----- include/net/mac80211.h | 2 +- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/rs.c b/drivers/net/wireless/iwlwifi/mvm/rs.c index 194bd1f..e29092b 100644 --- a/drivers/net/wireless/iwlwifi/mvm/rs.c +++ b/drivers/net/wireless/iwlwifi/mvm/rs.c @@ -1054,7 +1054,7 @@ void iwl_mvm_rs_tx_status(struct iwl_mvm *mvm, struct ieee80211_sta *sta, u32 ucode_rate; struct rs_rate rate; struct iwl_scale_tbl_info *curr_tbl, *other_tbl, *tmp_tbl; - u8 reduced_txp = (uintptr_t)info->status.status_driver_data[0]; + u8 reduced_txp = (u8)info->status.status_driver_data[0]; struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta); struct iwl_lq_sta *lq_sta = &mvmsta->lq_sta; diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index 07304e1..450635f 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -683,9 +683,8 @@ static void iwl_mvm_rx_tx_cmd_single(struct iwl_mvm *mvm, */ info->status.tx_time = le16_to_cpu(tx_resp->wireless_media_time); - BUILD_BUG_ON(ARRAY_SIZE(info->status.status_driver_data) < 1); - info->status.status_driver_data[0] = - (void *)(uintptr_t)tx_resp->reduced_tpc; + BUILD_BUG_ON(sizeof(info->status.status_driver_data) < 1); + info->status.status_driver_data[0] = tx_resp->reduced_tpc; ieee80211_tx_status(mvm->hw, skb); } @@ -907,8 +906,7 @@ static void iwl_mvm_tx_info_from_ba_notif(struct ieee80211_tx_info *info, info); /* TODO: not accounted if the whole A-MPDU failed */ info->status.tx_time = tid_data->tx_time; - info->status.status_driver_data[0] = - (void *)(uintptr_t)tid_data->reduced_tpc; + info->status.status_driver_data[0] = tid_data->reduced_tpc; } int iwl_mvm_rx_ba_notif(struct iwl_mvm *mvm, struct iwl_rx_cmd_buffer *rxb, diff --git a/include/net/mac80211.h b/include/net/mac80211.h index d52914b..63c3708 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -755,7 +755,7 @@ struct ieee80211_tx_info { u8 ampdu_len; u8 antenna; u16 tx_time; - void *status_driver_data[19 / sizeof(void *)]; + u8 status_driver_data[1]; } status; struct { struct ieee80211_tx_rate driver_rates[ -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> ` (3 preceding siblings ...) 2015-03-02 17:40 ` [PATCH RFC 07/14] net: wireless: iwlwifi: shrink status private area Florian Westphal @ 2015-03-02 17:40 ` Florian Westphal 2015-03-02 18:53 ` Johannes Berg 2015-03-02 17:40 ` [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area Florian Westphal 2015-03-02 19:49 ` [PATCH RFC 00/14] shrink skb cb to 44 bytes Eric Dumazet 6 siblings, 1 reply; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless to make it fit into (future) 44-byte sized skb->cb[]. This works, since flags is only used to store values from mac80211_tx_control_flags enum, and these are just 2 bits. We can thus move this to the padding hole inside the union. Also add BUILD_BUG_ON magic to make sure that the new flags field doesn't share storage w. other members of the union. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- include/net/mac80211.h | 11 ++++++++--- net/mac80211/main.c | 3 +++ 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/include/net/mac80211.h b/include/net/mac80211.h index 63c3708..36c2599 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -737,16 +737,21 @@ struct ieee80211_tx_info { u8 use_cts_prot:1; u8 short_preamble:1; u8 skip_table:1; - /* 2 bytes free */ + enum mac80211_tx_control_flags flags:2; + /* used for BUILD_BUG_ON validation that ->flags won't + * overlap with jiffies below */ + char flags_end[0]; }; /* only needed before rate control */ unsigned long jiffies; + + /* used for BUILD_BUG_ON validation that ->flags won't + * overlap with other members of this union. */ + char union_end[0]; }; /* NB: vif can be NULL for injected frames */ struct ieee80211_vif *vif; struct ieee80211_key_conf *hw_key; - u32 flags; - /* 4 bytes free */ } control; struct { struct ieee80211_tx_rate rates[IEEE80211_TX_MAX_RATES]; diff --git a/net/mac80211/main.c b/net/mac80211/main.c index 5e09d35..9e8c807 100644 --- a/net/mac80211/main.c +++ b/net/mac80211/main.c @@ -1218,6 +1218,9 @@ static int __init ieee80211_init(void) BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, driver_data) + IEEE80211_TX_INFO_DRIVER_DATA_SIZE > sizeof(skb->cb)); + BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, control.flags_end) < + offsetof(struct ieee80211_tx_info, control.union_end)); + ret = rc80211_minstrel_init(); if (ret) return ret; -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info 2015-03-02 17:40 ` [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info Florian Westphal @ 2015-03-02 18:53 ` Johannes Berg 2015-03-02 19:03 ` Florian Westphal 0 siblings, 1 reply; 19+ messages in thread From: Johannes Berg @ 2015-03-02 18:53 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, linux-wireless On Mon, 2015-03-02 at 18:40 +0100, Florian Westphal wrote: > to make it fit into (future) 44-byte sized skb->cb[]. > > This works, since flags is only used to store values > from mac80211_tx_control_flags enum, and these are just 2 bits. > We can thus move this to the padding hole inside the union. > > Also add BUILD_BUG_ON magic to make sure that the new flags > field doesn't share storage w. other members of the union. This is really ugly - what's the point of this? Mind you - we are actually acutely out of space and would rather have *more*, not less. johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info 2015-03-02 18:53 ` Johannes Berg @ 2015-03-02 19:03 ` Florian Westphal 2015-03-02 19:18 ` Johannes Berg 0 siblings, 1 reply; 19+ messages in thread From: Florian Westphal @ 2015-03-02 19:03 UTC (permalink / raw) To: Johannes Berg; +Cc: Florian Westphal, netdev, linux-wireless Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2015-03-02 at 18:40 +0100, Florian Westphal wrote: > > to make it fit into (future) 44-byte sized skb->cb[]. > > > > This works, since flags is only used to store values > > from mac80211_tx_control_flags enum, and these are just 2 bits. > > We can thus move this to the padding hole inside the union. > > > > Also add BUILD_BUG_ON magic to make sure that the new flags > > field doesn't share storage w. other members of the union. > > This is really ugly - what's the point of this? Eventually reducing skb size to make it fit into 3 cachelines again even on 64bit architectures. For that 40 bytes need to go. > Mind you - we are actually acutely out of space and would rather have > *more*, not less. :-( I'm not familiar with mac80211, aside from that it seemed to me that 40 byte cb would be doable, given enough work. Where are to main problems, exactly? I known that pushing something into ->cb is a lot easier than e.g. keeping extra state on stack, but, IMO cb should really only be used when you need to associate data strictly with an skb so that this data is still availabe even when skb gets queued somewehere. Is there a document somewhere that lists all of the per-skb data that mac80211 needs to store (or wants to store)? Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info 2015-03-02 19:03 ` Florian Westphal @ 2015-03-02 19:18 ` Johannes Berg 2015-03-02 19:30 ` Florian Westphal 0 siblings, 1 reply; 19+ messages in thread From: Johannes Berg @ 2015-03-02 19:18 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, linux-wireless On Mon, 2015-03-02 at 20:03 +0100, Florian Westphal wrote: > Eventually reducing skb size to make it fit into 3 cachelines again even > on 64bit architectures. For that 40 bytes need to go. That seems like a worthy goal I guess (I guess you should've copied a patch 0/N to us). > > Mind you - we are actually acutely out of space and would rather have > > *more*, not less. > > :-( > > I'm not familiar with mac80211, aside from that it seemed to me > that 40 byte cb would be doable, given enough work. Right now it is, mostly, yes. > Where are to main problems, exactly? Well, that depends. Right now clearly we're not really using all of it as you saw (even if this patch moving bits here and there is really ugly) but there are multiple things: 1) Of course mac80211 isn't static, it keeps getting developed! Right now for example we need to fix single TCP flow throughput over wifi, and for that we need a timestamp. That won't even fit into skb->cb any more right now; we'll probably be able to get away with (ab)using skb->tstamp or doing reshuffling similar to yours to get some space, but that's just lucky this time. 2) We're actually out of flags that are kept from TX generation to TX destruction and it's almost certain that we'll need to add more things. Also, we already do a lot of bit twiddling here, doing it even more makes the code even harder to follow. It's bad enough as is if you ask me. > I known that pushing something into ->cb is a lot easier than e.g. keeping > extra state on stack, but, IMO cb should really only be used when you > need to associate data strictly with an skb so that this data is still > availabe even when skb gets queued somewehere. Right, and we do that. We've in the past moved out data from here to elsewhere, but it's extremely tedious and error-prone, and I'm not sure we have much that we can possibly move now, since we do need to hang on to SKBs in many cases like client powersaving etc. > Is there a document somewhere that lists all of the per-skb data that > mac80211 needs to store (or wants to store)? Not really, sorry. > + enum mac80211_tx_control_flags flags:2; > + /* used for BUILD_BUG_ON validation that ->flags won't > + * overlap with jiffies below */ > + char flags_end[0]; That "jiffies" should probably say "other members (currently jiffies)" or something ... I got a bit confused here. johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info 2015-03-02 19:18 ` Johannes Berg @ 2015-03-02 19:30 ` Florian Westphal 0 siblings, 0 replies; 19+ messages in thread From: Florian Westphal @ 2015-03-02 19:30 UTC (permalink / raw) To: Johannes Berg; +Cc: Florian Westphal, netdev, linux-wireless Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2015-03-02 at 20:03 +0100, Florian Westphal wrote: > > > Eventually reducing skb size to make it fit into 3 cachelines again even > > on 64bit architectures. For that 40 bytes need to go. > > That seems like a worthy goal I guess (I guess you should've copied a > patch 0/N to us). Indeed, sorry. archived cover letter: http://marc.info/?l=linux-netdev&m=142531804325076&w=2 > > I'm not familiar with mac80211, aside from that it seemed to me > > that 40 byte cb would be doable, given enough work. > > Right now it is, mostly, yes. > > > Where are to main problems, exactly? > > Well, that depends. Right now clearly we're not really using all of it > as you saw (even if this patch moving bits here and there is really > ugly) but there are multiple things: > 1) Of course mac80211 isn't static, it keeps getting developed! Right > now for > example we need to fix single TCP flow throughput over wifi, and for > that we > need a timestamp. That won't even fit into skb->cb any more right > now; we'll > probably be able to get away with (ab)using skb->tstamp or doing > reshuffling > similar to yours to get some space, but that's just lucky this time. > 2) We're actually out of flags that are kept from TX generation to TX > destruction and it's almost certain that we'll need to add more > things. > > Also, we already do a lot of bit twiddling here, doing it even more > makes the code even harder to follow. It's bad enough as is if you ask > me. True. > > I known that pushing something into ->cb is a lot easier than e.g. keeping > > extra state on stack, but, IMO cb should really only be used when you > > need to associate data strictly with an skb so that this data is still > > availabe even when skb gets queued somewehere. > > Right, and we do that. We've in the past moved out data from here to > elsewhere, but it's extremely tedious and error-prone, and I'm not sure > we have much that we can possibly move now, since we do need to hang on > to SKBs in many cases like client powersaving etc. I see. I cannot comment any further at the moment, I will have to dig into mac80211 more. Thanks for your comments! ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> ` (4 preceding siblings ...) 2015-03-02 17:40 ` [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info Florian Westphal @ 2015-03-02 17:40 ` Florian Westphal 2015-03-02 18:52 ` Johannes Berg 2015-03-02 19:49 ` [PATCH RFC 00/14] shrink skb cb to 44 bytes Eric Dumazet 6 siblings, 1 reply; 19+ messages in thread From: Florian Westphal @ 2015-03-02 17:40 UTC (permalink / raw) To: netdev; +Cc: Florian Westphal, linux-wireless This makes the structure fit 44-byte sized skb->cb[]. Cc: linux-wireless@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> --- drivers/net/wireless/ath/ath9k/ath9k.h | 2 +- include/net/mac80211.h | 8 +++----- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h index 0f8e946..2ed77cf 100644 --- a/drivers/net/wireless/ath/ath9k/ath9k.h +++ b/drivers/net/wireless/ath/ath9k/ath9k.h @@ -184,12 +184,12 @@ struct ath_frame_info { struct ath_buf *bf; u16 framelen; s8 txq; - enum ath9k_key_type keytype; u8 keyix; u8 rtscts_rate; u8 retries : 7; u8 baw_tracked : 1; u8 tx_power; + enum ath9k_key_type keytype:2; }; struct ath_rxbuf { diff --git a/include/net/mac80211.h b/include/net/mac80211.h index 36c2599..372c25a 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -627,11 +627,11 @@ enum mac80211_rate_control_flags { }; -/* there are 40 bytes if you don't need the rateset to be kept */ -#define IEEE80211_TX_INFO_DRIVER_DATA_SIZE 40 +/* there are 32 bytes if you don't need the rateset to be kept */ +#define IEEE80211_TX_INFO_DRIVER_DATA_SIZE 32 /* if you do need the rateset, then you have less space */ -#define IEEE80211_TX_INFO_RATE_DRIVER_DATA_SIZE 24 +#define IEEE80211_TX_INFO_RATE_DRIVER_DATA_SIZE 16 /* maximum number of rate stages */ #define IEEE80211_TX_MAX_RATES 4 @@ -765,8 +765,6 @@ struct ieee80211_tx_info { struct { struct ieee80211_tx_rate driver_rates[ IEEE80211_TX_MAX_RATES]; - u8 pad[4]; - void *rate_driver_data[ IEEE80211_TX_INFO_RATE_DRIVER_DATA_SIZE / sizeof(void *)]; }; -- 2.0.5 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area 2015-03-02 17:40 ` [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area Florian Westphal @ 2015-03-02 18:52 ` Johannes Berg 0 siblings, 0 replies; 19+ messages in thread From: Johannes Berg @ 2015-03-02 18:52 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, linux-wireless On Mon, 2015-03-02 at 18:40 +0100, Florian Westphal wrote: > This makes the structure fit 44-byte sized skb->cb[]. > > Cc: linux-wireless@vger.kernel.org > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > drivers/net/wireless/ath/ath9k/ath9k.h | 2 +- > include/net/mac80211.h | 8 +++----- > 2 files changed, 4 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h > index 0f8e946..2ed77cf 100644 > --- a/drivers/net/wireless/ath/ath9k/ath9k.h > +++ b/drivers/net/wireless/ath/ath9k/ath9k.h This is clearly not a mac80211 patch. johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes [not found] <1425318028-26531-1-git-send-email-fw@strlen.de> ` (5 preceding siblings ...) 2015-03-02 17:40 ` [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area Florian Westphal @ 2015-03-02 19:49 ` Eric Dumazet 2015-03-02 20:42 ` Florian Westphal 2015-03-02 22:17 ` David Miller 6 siblings, 2 replies; 19+ messages in thread From: Eric Dumazet @ 2015-03-02 19:49 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, Johannes Berg, linux-wireless On Mon, 2015-03-02 at 18:40 +0100, Florian Westphal wrote: > Following patches shrink all in-tree users of skb->cb[] so that kernel > still builds with skb->cb[] set to 44 bytes. > > This would create a 4-byte hole, it would be easy to reorder skb > members so this hole is filled. > > pahole dump for vmlinux with allyesconfig (+this patch series): > > http://www.strlen.de/fw/pahole.txt.gz > > remarks/known issues: > - adds __packet attribute to a few structures. > Its needed to not have padding at end of the structure, else > we get build assertion errors even if all members fit into cb[]. > - checkpatch isn't happy yet. > - dccp changes are untested (its on my todo list) > - rxrpc change is untested (on todo list). > - wireless changes are untested, I don't own any of the affected hw. > > The idea is to figure out what needs to be done to make build_skb() and > friends only touch the first 3 cachelines of the skb on all architectures. > > We'd need to reduce skbuff by 40 bytes to achieve this for allyesconfig. > > Other sk_buff reduction ideas being worked: > > - move truesize to shinfo (which has 4 byte hole) > - turn ->data into offset on 64bit platforms (intrusive, > 1000 files > affected) > - move pointers that are ususally not needed (nf_bridge, secpath) to the end, > with flag field that tells us when pointer is valid (so we don't have > to memset() them unconditionally at allocation time). > - seems we could already to this for the inner header fields since the're only > valid once ->encapsulation / inner_protocol_type is set. > > Comments and more ideas welcome. Size of skb->cb[] is not the major factor. Trying to gain 4 or 8 bytes is not going to improve performance a lot. The real problem is that we clear it in skb_alloc()/build_skb(), instead of each layer doing so at demand, and only the part that matters for this layer. Basically, skb->cb[] could be 80 or 160 bytes instead of 48, and we should not care, as long as no layer does a stupid/lazy memset(skb->cb, 0, sizeof(skb->cb)) Presumably skb_clone() could be extended to receive the length of skb->cb[] that current layer cares about. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-02 19:49 ` [PATCH RFC 00/14] shrink skb cb to 44 bytes Eric Dumazet @ 2015-03-02 20:42 ` Florian Westphal 2015-03-02 21:56 ` Eric Dumazet 2015-03-02 22:17 ` David Miller 1 sibling, 1 reply; 19+ messages in thread From: Florian Westphal @ 2015-03-02 20:42 UTC (permalink / raw) To: Eric Dumazet; +Cc: Florian Westphal, netdev, Johannes Berg, linux-wireless Eric Dumazet <eric.dumazet@gmail.com> wrote: > > remarks/known issues: > > - adds __packet attribute to a few structures. > > Its needed to not have padding at end of the structure, else > > we get build assertion errors even if all members fit into cb[]. > > - checkpatch isn't happy yet. > > - dccp changes are untested (its on my todo list) > > - rxrpc change is untested (on todo list). > > - wireless changes are untested, I don't own any of the affected hw. > > > > The idea is to figure out what needs to be done to make build_skb() and > > friends only touch the first 3 cachelines of the skb on all architectures. > > > > We'd need to reduce skbuff by 40 bytes to achieve this for allyesconfig. > > > > Other sk_buff reduction ideas being worked: > > > > - move truesize to shinfo (which has 4 byte hole) > > - turn ->data into offset on 64bit platforms (intrusive, > 1000 files > > affected) > > - move pointers that are ususally not needed (nf_bridge, secpath) to the end, > > with flag field that tells us when pointer is valid (so we don't have > > to memset() them unconditionally at allocation time). > > - seems we could already to this for the inner header fields since the're only > > valid once ->encapsulation / inner_protocol_type is set. > > > > Comments and more ideas welcome. > > Size of skb->cb[] is not the major factor. Trying to gain 4 or 8 bytes > is not going to improve performance a lot. Thats right, goal is to have build_skb etc. only touch 3 cachelines, and have those layers that need some particular feature initialise it on demand. We could move skb->cb to end of skb, so that its not covered by the initial allocation memset anymore. Obviously that won't buy us anything at this point since gro needs to zero it out (GRO skb cb is 48 bytes). > The real problem is that we clear it in skb_alloc()/build_skb(), instead > of each layer doing so at demand, and only the part that matters for > this layer. Thats right. Do you think its worth to already move cb[] near the end of skb and alter build_skb to not clear it anymore? Which of the ideas, in your opinion, is worth pursuing first (if any)? I'd love to get rid of nf_bridge* pointer and use percpu storage area for it but I did not yet find a simple way to deal with when skb that uses this percpu state leaves bridge code and is then dropped or queued later -- we need to make sure that another skb isn't accidentally believed to have valid nf_bridge context. kfree_skb doesn't seem to be a nice place to add bridge crap to, and neither is netif_rx... ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-02 20:42 ` Florian Westphal @ 2015-03-02 21:56 ` Eric Dumazet 0 siblings, 0 replies; 19+ messages in thread From: Eric Dumazet @ 2015-03-02 21:56 UTC (permalink / raw) To: Florian Westphal; +Cc: netdev, Johannes Berg, linux-wireless On Mon, 2015-03-02 at 21:42 +0100, Florian Westphal wrote: > Thats right. Do you think its worth to already move cb[] near the end > of skb and alter build_skb to not clear it anymore? > > Which of the ideas, in your opinion, is worth pursuing first (if any)? moving cb[] near the end will void my patches to use one cache line per skb in TCP receive queue ( or write queue) 971f10eca186ca tcp: better TCP_SKB_CB layout to reduce cache line misses I have worked a bit (3 months ago) about doing the skb->cb[] selective clearing, but a lot of alloc_skb() users _assume_ it is already cleared. That seemed a lot of work to me, because of the many alloc_skb() variants we have. But definitely worth trying to complete. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-02 19:49 ` [PATCH RFC 00/14] shrink skb cb to 44 bytes Eric Dumazet 2015-03-02 20:42 ` Florian Westphal @ 2015-03-02 22:17 ` David Miller 2015-03-03 4:02 ` Eric Dumazet 1 sibling, 1 reply; 19+ messages in thread From: David Miller @ 2015-03-02 22:17 UTC (permalink / raw) To: eric.dumazet; +Cc: fw, netdev, johannes, linux-wireless From: Eric Dumazet <eric.dumazet@gmail.com> Date: Mon, 02 Mar 2015 11:49:23 -0800 > Size of skb->cb[] is not the major factor. Trying to gain 4 or 8 bytes > is not going to improve performance a lot. > > The real problem is that we clear it in skb_alloc()/build_skb(), instead > of each layer doing so at demand, and only the part that matters for > this layer. > > Basically, skb->cb[] could be 80 or 160 bytes instead of 48, and we > should not care, as long as no layer does a stupid/lazy > > memset(skb->cb, 0, sizeof(skb->cb)) > > Presumably skb_clone() could be extended to receive the length of > skb->cb[] that current layer cares about. Regardless, I think Florian's work has value. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-02 22:17 ` David Miller @ 2015-03-03 4:02 ` Eric Dumazet 2015-03-03 4:05 ` David Miller 0 siblings, 1 reply; 19+ messages in thread From: Eric Dumazet @ 2015-03-03 4:02 UTC (permalink / raw) To: David Miller; +Cc: fw, netdev, johannes, linux-wireless On Mon, 2015-03-02 at 17:17 -0500, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Mon, 02 Mar 2015 11:49:23 -0800 > > > Size of skb->cb[] is not the major factor. Trying to gain 4 or 8 bytes > > is not going to improve performance a lot. > > > > The real problem is that we clear it in skb_alloc()/build_skb(), instead > > of each layer doing so at demand, and only the part that matters for > > this layer. > > > > Basically, skb->cb[] could be 80 or 160 bytes instead of 48, and we > > should not care, as long as no layer does a stupid/lazy > > > > memset(skb->cb, 0, sizeof(skb->cb)) > > > > Presumably skb_clone() could be extended to receive the length of > > skb->cb[] that current layer cares about. > > Regardless, I think Florian's work has value. Of course. I hope my answer was not implying the contrary ! 48 -> 44 cb change, with the 8 bytes alignment and various __packed tricks that might confuse compilers on some arches, will be hard to quantify in term of performances on all arches. About the GRO layout change, reason why 'struct sk_buff *last;' is at the end of struct napi_gro_cb is that this field is not used in fast path. Note : We could try to use one bit in skb to advertise zero shinfo(skb). Many skbs have a zeroed shinfo() (but shinfo->dataref == 1) , and dereferencing skb_shinfo adds a cache line miss. -> We could avoid memset(shinfo, 0, offsetof(struct skb_shared_info, dataref)) & atomic_set(&shinfo->dataref, 1); in alloc_skb() and friends completely. Unfortunately this kind of change would be quite invasive... ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-03 4:02 ` Eric Dumazet @ 2015-03-03 4:05 ` David Miller 2015-03-03 11:43 ` Florian Westphal 0 siblings, 1 reply; 19+ messages in thread From: David Miller @ 2015-03-03 4:05 UTC (permalink / raw) To: eric.dumazet; +Cc: fw, netdev, johannes, linux-wireless From: Eric Dumazet <eric.dumazet@gmail.com> Date: Mon, 02 Mar 2015 20:02:05 -0800 > About the GRO layout change, reason why 'struct sk_buff *last;' is at > the end of struct napi_gro_cb is that this field is not used in fast > path. Understood. While reviewing this I noticed that the jiffies timestamp in GRO cb could really be u32 if we want instead of full "unsigned long". > Note : We could try to use one bit in skb to advertise zero shinfo(skb). > > Many skbs have a zeroed shinfo() (but shinfo->dataref == 1) , and > dereferencing skb_shinfo adds a cache line miss. > > -> We could avoid memset(shinfo, 0, offsetof(struct skb_shared_info, > dataref)) & atomic_set(&shinfo->dataref, 1); > > in alloc_skb() and friends completely. > > Unfortunately this kind of change would be quite invasive... Right, all these kinds of things touch everything. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH RFC 00/14] shrink skb cb to 44 bytes 2015-03-03 4:05 ` David Miller @ 2015-03-03 11:43 ` Florian Westphal 0 siblings, 0 replies; 19+ messages in thread From: Florian Westphal @ 2015-03-03 11:43 UTC (permalink / raw) To: David Miller; +Cc: eric.dumazet, fw, netdev, johannes, linux-wireless David Miller <davem@davemloft.net> wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Mon, 02 Mar 2015 20:02:05 -0800 > > > About the GRO layout change, reason why 'struct sk_buff *last;' is at > > the end of struct napi_gro_cb is that this field is not used in fast > > path. > > Understood. Moved it back to end, thanks! > While reviewing this I noticed that the jiffies timestamp in GRO cb > could really be u32 if we want instead of full "unsigned long". Made it an u16 at the moment -- then __packed is no longer needed and it fits in 40 bytes. > > Note : We could try to use one bit in skb to advertise zero shinfo(skb). > > > > Many skbs have a zeroed shinfo() (but shinfo->dataref == 1) , and > > dereferencing skb_shinfo adds a cache line miss. > > > > -> We could avoid memset(shinfo, 0, offsetof(struct skb_shared_info, > > dataref)) & atomic_set(&shinfo->dataref, 1); > > > > in alloc_skb() and friends completely. > > > > Unfortunately this kind of change would be quite invasive... > > Right, all these kinds of things touch everything. Indeed, but thanks for the hint Eric -- I'll investigate. ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2015-03-03 11:43 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1425318028-26531-1-git-send-email-fw@strlen.de>
2015-03-02 17:40 ` [PATCH RFC 04/14] drivers: wireless: rt2x00: move skb_dma to queue entry Florian Westphal
2015-03-02 17:40 ` [PATCH RFC 05/14] drivers: wireless: ar5523: use container_of Florian Westphal
2015-03-03 9:16 ` Pontus Fuchs
2015-03-02 17:40 ` [PATCH RFC 06/14] drivers: wireless: carl9170: shrink carl9170_tx_info Florian Westphal
2015-03-02 17:40 ` [PATCH RFC 07/14] net: wireless: iwlwifi: shrink status private area Florian Westphal
2015-03-02 17:40 ` [PATCH RFC 08/14] net: wireless: mac80211: shrink ieee80211_tx_info Florian Westphal
2015-03-02 18:53 ` Johannes Berg
2015-03-02 19:03 ` Florian Westphal
2015-03-02 19:18 ` Johannes Berg
2015-03-02 19:30 ` Florian Westphal
2015-03-02 17:40 ` [PATCH RFC 09/14] net: wireless: mac80211: shrink private driver area Florian Westphal
2015-03-02 18:52 ` Johannes Berg
2015-03-02 19:49 ` [PATCH RFC 00/14] shrink skb cb to 44 bytes Eric Dumazet
2015-03-02 20:42 ` Florian Westphal
2015-03-02 21:56 ` Eric Dumazet
2015-03-02 22:17 ` David Miller
2015-03-03 4:02 ` Eric Dumazet
2015-03-03 4:05 ` David Miller
2015-03-03 11:43 ` Florian Westphal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).