* [PATCH net-next 7/8] ipv4/tunnel: use __vlan_hwaccel helpers
From: Michał Mirosław @ 2017-01-04 0:24 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <cover.1483488960.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/ipv4/ip_tunnel_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index fed3d29f9eb3..0004a54373f0 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -120,7 +120,7 @@ int __iptunnel_pull_header(struct sk_buff *skb, int hdr_len,
}
skb_clear_hash_if_not_l4(skb);
- skb->vlan_tci = 0;
+ __vlan_hwaccel_clear_tag(skb);
skb_set_queue_mapping(skb, 0);
skb_scrub_packet(skb, xnet);
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 6/8] 8021q: use __vlan_hwaccel helpers
From: Michał Mirosław @ 2017-01-04 0:24 UTC (permalink / raw)
To: netdev; +Cc: Patrick McHardy
In-Reply-To: <cover.1483488960.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/8021q/vlan_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e2ed69850489..604a67abdeb6 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -50,7 +50,7 @@ bool vlan_do_receive(struct sk_buff **skbp)
}
skb->priority = vlan_get_ingress_priority(vlan_dev, skb->vlan_tci);
- skb->vlan_tci = 0;
+ __vlan_hwaccel_clear_tag(skb);
rx_stats = this_cpu_ptr(vlan_dev_priv(vlan_dev)->vlan_pcpu_stats);
--
2.11.0
^ permalink raw reply related
* [PATCH net 1/2] net: systemport: Utilize skb_put_padto()
From: Florian Fainelli @ 2017-01-04 0:34 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
In-Reply-To: <20170104003449.27078-1-f.fainelli@gmail.com>
Since we need to pad our packets, utilize skb_put_padto() which
increases skb->len by how much we need to pad, allowing us to eliminate
the test on skb->len right below.
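
As a rough illustration of the difference described above, here is a minimal user-space sketch. It does not use the kernel API: struct toy_skb and the toy_* helpers are hypothetical stand-ins, and the 4-byte tag length is an assumption. It only models that skb_padto() zero-fills the tail without touching the length, while skb_put_padto() also grows the length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ZLEN     60   /* minimum Ethernet frame length without FCS */
#define TOY_BRCM_TAG_LEN 4    /* assumed Broadcom tag length */
#define TOY_BUF_SIZE     128

/* Hypothetical stand-in for struct sk_buff: a buffer plus a length. */
struct toy_skb {
	unsigned char data[TOY_BUF_SIZE];
	size_t len;
};

/* Models skb_padto(): zero the tail up to min_len, leave len untouched. */
static int toy_padto(struct toy_skb *skb, size_t min_len)
{
	assert(min_len <= TOY_BUF_SIZE);
	if (skb->len < min_len)
		memset(skb->data + skb->len, 0, min_len - skb->len);
	return 0;
}

/* Models skb_put_padto(): zero the tail and also grow len to min_len. */
static int toy_put_padto(struct toy_skb *skb, size_t min_len)
{
	int ret = toy_padto(skb, min_len);

	if (!ret && skb->len < min_len)
		skb->len = min_len;
	return ret;
}

/* Old path: pad, then recompute the length to map by hand. */
static size_t toy_dma_len_old(struct toy_skb *skb)
{
	size_t min_len = TOY_ETH_ZLEN + TOY_BRCM_TAG_LEN;

	toy_padto(skb, min_len);
	return skb->len < min_len ? min_len : skb->len;
}

/* New path: the padded length is already reflected in skb->len. */
static size_t toy_dma_len_new(struct toy_skb *skb)
{
	toy_put_padto(skb, TOY_ETH_ZLEN + TOY_BRCM_TAG_LEN);
	return skb->len;
}
```

Both paths yield the same mapped length; the new one simply no longer needs the conditional.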
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 25d1eb4933d0..e67908b5edfe 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1028,13 +1028,12 @@ static netdev_tx_t bcm_sysport_xmit(struct sk_buff *skb,
* (including FCS and tag) because the length verification is done after
* the Broadcom tag is stripped off the ingress packet.
*/
- if (skb_padto(skb, ETH_ZLEN + ENET_BRCM_TAG_LEN)) {
+ if (skb_put_padto(skb, ETH_ZLEN + ENET_BRCM_TAG_LEN)) {
ret = NETDEV_TX_OK;
goto out;
}
- skb_len = skb->len < ETH_ZLEN + ENET_BRCM_TAG_LEN ?
- ETH_ZLEN + ENET_BRCM_TAG_LEN : skb->len;
+ skb_len = skb->len;
mapping = dma_map_single(kdev, skb->data, skb_len, DMA_TO_DEVICE);
if (dma_mapping_error(kdev, mapping)) {
--
2.9.3
^ permalink raw reply related
* [PATCH net 2/2] net: systemport: Pad packet before inserting TSB
From: Florian Fainelli @ 2017-01-04 0:34 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
In-Reply-To: <20170104003449.27078-1-f.fainelli@gmail.com>
Inserting the TSB means adding an extra 8 bytes in front of the packet
that is going to be used as metadata information by the TDMA engine, but
stripped off, so it does not really help with the packet padding.
For some odd packet sizes that fall below the 60 bytes payload (e.g. ARP)
we can end up padding them after the TSB insertion, thus making them 64
bytes, but with the TDMA stripping off the first 8 bytes, they could
still be smaller than 64 bytes which is required to ingress the switch.
Fix this by swapping the padding and TSB insertion, guaranteeing that
the packets have the right sizes.
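
The arithmetic above can be checked with a small user-space sketch. The toy_* names are hypothetical, and the 64-byte minimum assumes ETH_ZLEN (60) plus a 4-byte Broadcom tag; the functions return the frame length left once the TDMA engine has stripped the 8-byte TSB.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TSB_LEN 8    /* transmit status block prepended for the TDMA engine */
#define TOY_MIN_LEN 64   /* assumed: ETH_ZLEN (60) + 4-byte Broadcom tag */

/* Pad a length up to the minimum the switch accepts. */
static size_t toy_pad(size_t len)
{
	return len < TOY_MIN_LEN ? TOY_MIN_LEN : len;
}

/* Old order: insert the TSB, then pad; the TSB counts towards the padding. */
static size_t toy_len_tsb_then_pad(size_t frame_len)
{
	size_t len = toy_pad(frame_len + TOY_TSB_LEN);

	assert(len >= TOY_TSB_LEN);
	return len - TOY_TSB_LEN;   /* what is left after TDMA strips the TSB */
}

/* New order: pad the frame first, then insert the TSB in front of it. */
static size_t toy_len_pad_then_tsb(size_t frame_len)
{
	size_t len = toy_pad(frame_len) + TOY_TSB_LEN;

	return len - TOY_TSB_LEN;   /* what is left after TDMA strips the TSB */
}
```

For a 42-byte ARP frame the old order leaves 56 bytes after stripping, while the new order leaves the full 64.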
Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index e67908b5edfe..7e8cf213fd81 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1012,15 +1012,6 @@ static netdev_tx_t bcm_sysport_xmit(struct sk_buff *skb,
goto out;
}
- /* Insert TSB and checksum infos */
- if (priv->tsb_en) {
- skb = bcm_sysport_insert_tsb(skb, dev);
- if (!skb) {
- ret = NETDEV_TX_OK;
- goto out;
- }
- }
-
/* The Ethernet switch we are interfaced with needs packets to be at
* least 64 bytes (including FCS) otherwise they will be discarded when
* they enter the switch port logic. When Broadcom tags are enabled, we
@@ -1033,6 +1024,15 @@ static netdev_tx_t bcm_sysport_xmit(struct sk_buff *skb,
goto out;
}
+ /* Insert TSB and checksum infos */
+ if (priv->tsb_en) {
+ skb = bcm_sysport_insert_tsb(skb, dev);
+ if (!skb) {
+ ret = NETDEV_TX_OK;
+ goto out;
+ }
+ }
+
skb_len = skb->len;
mapping = dma_map_single(kdev, skb->data, skb_len, DMA_TO_DEVICE);
--
2.9.3
^ permalink raw reply related
* [PATCH net 0/2] net: systemport: Fix padding vs. TSB insertion
From: Florian Fainelli @ 2017-01-04 0:34 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
Hi David,
This patch series fixes how we pad the packets submitted to the SYSTEMPORT
adapter, and how the transmit status block (prepended 8 bytes) fits in the
picture. The first patch is not technically a bug fix, but is required for the
second patch to be applied and to greatly simplify the skb length calculation.
Thanks and happy new year!
Florian Fainelli (2):
net: systemport: Utilize skb_put_padto()
net: systemport: Pad packet before inserting TSB
drivers/net/ethernet/broadcom/bcmsysport.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
--
2.9.3
^ permalink raw reply
* [PATCH 1/2] PCI: introduce locked pci_add/remove_virtfn
From: Emil Tantilov @ 2017-01-04 0:48 UTC (permalink / raw)
To: linux-pci, intel-wired-lan; +Cc: alexander.h.duyck, netdev, linux-kernel
This is to allow moving the mutex lock outside of
pci_iov_add/rem_virtfn() for enabling/disabling SRIOV, while
still making it possible to call the _locked version, as is
the case for PPC's eeh_driver.
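
The split described above follows a common pattern: an unlocked worker that expects the caller to hold the lock, plus a thin _locked wrapper for callers that do not. A minimal user-space sketch of that pattern follows. Everything named toy_* is hypothetical, and the "lock" is a plain flag used only to make the locking rules visible; it is not a real mutex and is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SR-IOV mutex, for illustration only. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);   /* a real mutex would deadlock on re-entry */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_pf {
	struct toy_lock lock;
	int num_vfs;
};

/* Unlocked worker: the caller must already hold pf->lock. */
static int toy_add_virtfn(struct toy_pf *pf)
{
	assert(pf->lock.held);
	return ++pf->num_vfs;
}

/* Locked wrapper for callers that add a single VF on their own. */
static int toy_add_virtfn_locked(struct toy_pf *pf)
{
	int rc;

	toy_lock_acquire(&pf->lock);
	rc = toy_add_virtfn(pf);
	toy_lock_release(&pf->lock);
	return rc;
}

/* Enable path: take the lock once around the whole batch of VFs. */
static int toy_enable(struct toy_pf *pf, int n)
{
	int i;

	toy_lock_acquire(&pf->lock);
	for (i = 0; i < n; i++)
		toy_add_virtfn(pf);
	toy_lock_release(&pf->lock);
	return pf->num_vfs;
}
```

With this shape the enable/disable path can hold the lock for the entire operation, which the wrapper alone could not do.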
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
---
arch/powerpc/kernel/eeh_driver.c | 4 ++--
drivers/pci/iov.c | 31 ++++++++++++++++++++++++-------
include/linux/pci.h | 4 ++--
3 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index d88573b..81aaea7 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -441,7 +441,7 @@ static void *eeh_add_virt_device(void *data, void *userdata)
}
#ifdef CONFIG_PPC_POWERNV
- pci_iov_add_virtfn(edev->physfn, pdn->vf_index, 0);
+ pci_iov_add_virtfn_locked(edev->physfn, pdn->vf_index, 0);
#endif
return NULL;
}
@@ -499,7 +499,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
#ifdef CONFIG_PPC_POWERNV
struct pci_dn *pdn = eeh_dev_to_pdn(edev);
- pci_iov_remove_virtfn(edev->physfn, pdn->vf_index, 0);
+ pci_iov_remove_virtfn_locked(edev->physfn, pdn->vf_index, 0);
edev->pdev = NULL;
/*
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4722782..fea322db 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -113,7 +113,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
}
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
+static int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
{
int i;
int rc = -ENOMEM;
@@ -124,7 +124,6 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
struct pci_sriov *iov = dev->sriov;
struct pci_bus *bus;
- mutex_lock(&iov->dev->sriov->lock);
bus = virtfn_add_bus(dev->bus, pci_iov_virtfn_bus(dev, id));
if (!bus)
goto failed;
@@ -162,7 +161,6 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
__pci_reset_function(virtfn);
pci_device_add(virtfn, virtfn->bus);
- mutex_unlock(&iov->dev->sriov->lock);
pci_bus_add_device(virtfn);
sprintf(buf, "virtfn%u", id);
@@ -191,11 +189,22 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
return rc;
}
-void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset)
+int pci_iov_add_virtfn_locked(struct pci_dev *dev, int id, int reset)
+{
+ struct pci_sriov *iov = dev->sriov;
+ int rc;
+
+ mutex_lock(&iov->dev->sriov->lock);
+ rc = pci_iov_add_virtfn(dev, id, reset);
+ mutex_unlock(&iov->dev->sriov->lock);
+
+ return rc;
+}
+
+static void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset)
{
char buf[VIRTFN_ID_LEN];
struct pci_dev *virtfn;
- struct pci_sriov *iov = dev->sriov;
virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
pci_iov_virtfn_bus(dev, id),
@@ -218,16 +227,24 @@ void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset)
if (virtfn->dev.kobj.sd)
sysfs_remove_link(&virtfn->dev.kobj, "physfn");
- mutex_lock(&iov->dev->sriov->lock);
pci_stop_and_remove_bus_device(virtfn);
virtfn_remove_bus(dev->bus, virtfn->bus);
- mutex_unlock(&iov->dev->sriov->lock);
/* balance pci_get_domain_bus_and_slot() */
pci_dev_put(virtfn);
pci_dev_put(dev);
}
+void pci_iov_remove_virtfn_locked(struct pci_dev *dev, int id, int reset)
+{
+ struct pci_sriov *iov = dev->sriov;
+
+ mutex_lock(&iov->dev->sriov->lock);
+ pci_iov_remove_virtfn(dev, id, reset);
+ mutex_unlock(&iov->dev->sriov->lock);
+}
+
+
int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
{
return 0;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e2d1a12..4351ceb7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1872,8 +1872,8 @@ static inline void pci_mmcfg_late_init(void) { }
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
void pci_disable_sriov(struct pci_dev *dev);
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset);
-void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset);
+int pci_iov_add_virtfn_locked(struct pci_dev *dev, int id, int reset);
+void pci_iov_remove_virtfn_locked(struct pci_dev *dev, int id, int reset);
int pci_num_vf(struct pci_dev *dev);
int pci_vfs_assigned(struct pci_dev *dev);
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
^ permalink raw reply related
* [PATCH 2/2] PCI: lock each enable/disable num_vfs operation in sysfs
From: Emil Tantilov @ 2017-01-04 0:48 UTC (permalink / raw)
To: linux-pci, intel-wired-lan; +Cc: alexander.h.duyck, netdev, linux-kernel
In-Reply-To: <20170104004826.17866.77074.stgit@localhost6.localdomain6>
Enabling/disabling SRIOV via sysfs by echo-ing multiple values
simultaneously:
echo 63 > /sys/class/net/ethX/device/sriov_numvfs&
echo 63 > /sys/class/net/ethX/device/sriov_numvfs
sleep 5
echo 0 > /sys/class/net/ethX/device/sriov_numvfs&
echo 0 > /sys/class/net/ethX/device/sriov_numvfs
Results in the following bug:
kernel BUG at drivers/pci/iov.c:495!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 8050 Comm: bash Tainted: G W 4.9.0-rc7-net-next #2092
RIP: 0010:[<ffffffff813b1647>]
[<ffffffff813b1647>] pci_iov_release+0x57/0x60
Call Trace:
[<ffffffff81391726>] pci_release_dev+0x26/0x70
[<ffffffff8155be6e>] device_release+0x3e/0xb0
[<ffffffff81365ee7>] kobject_cleanup+0x67/0x180
[<ffffffff81365d9d>] kobject_put+0x2d/0x60
[<ffffffff8155bc27>] put_device+0x17/0x20
[<ffffffff8139c08a>] pci_dev_put+0x1a/0x20
[<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90
[<ffffffff8139cca5>] pci_get_subsys+0x35/0x40
[<ffffffff8139ccc8>] pci_get_device+0x18/0x20
[<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60
[<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180
[<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140
[<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe]
[<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe]
[<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130
...
RIP [<ffffffff813b1647>] pci_iov_release+0x57/0x60
Use the existing mutex lock to protect each enable/disable operation.
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
---
drivers/pci/pci-sysfs.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0666287..5b54cf5 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -472,7 +472,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
const char *buf, size_t count)
{
struct pci_dev *pdev = to_pci_dev(dev);
+ struct pci_sriov *iov = pdev->sriov;
int ret;
+
u16 num_vfs;
ret = kstrtou16(buf, 0, &num_vfs);
@@ -482,38 +484,46 @@ static ssize_t sriov_numvfs_store(struct device *dev,
if (num_vfs > pci_sriov_get_totalvfs(pdev))
return -ERANGE;
+ mutex_lock(&iov->dev->sriov->lock);
+
if (num_vfs == pdev->sriov->num_VFs)
- return count; /* no change */
+ goto exit;
/* is PF driver loaded w/callback */
if (!pdev->driver || !pdev->driver->sriov_configure) {
dev_info(&pdev->dev, "Driver doesn't support SRIOV configuration via sysfs\n");
- return -ENOSYS;
+ ret = -EINVAL;
+ goto exit;
}
if (num_vfs == 0) {
/* disable VFs */
ret = pdev->driver->sriov_configure(pdev, 0);
- if (ret < 0)
- return ret;
- return count;
+ goto exit;
}
/* enable VFs */
if (pdev->sriov->num_VFs) {
dev_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
pdev->sriov->num_VFs, num_vfs);
- return -EBUSY;
+ ret = -EBUSY;
+ goto exit;
}
ret = pdev->driver->sriov_configure(pdev, num_vfs);
if (ret < 0)
- return ret;
+ goto exit;
if (ret != num_vfs)
dev_warn(&pdev->dev, "%d VFs requested; only %d enabled\n",
num_vfs, ret);
+exit:
+ mutex_unlock(&iov->dev->sriov->lock);
+
+ if (ret < 0)
+ return ret;
+
return count;
}
^ permalink raw reply related
* [PATCH net-next 1/6] net/skbuff: add macros for VLAN_PRESENT bit
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev
Cc: Russell King, Ralf Baechle, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, David S. Miller,
linux-arm-kernel, linux-mips, linuxppc-dev, sparclinux
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/skbuff.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b53c0cfd417e..168c3e486bd4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -768,6 +768,12 @@ struct sk_buff {
__u32 priority;
int skb_iif;
__u32 hash;
+#define PKT_VLAN_PRESENT_BIT 4 // CFI (12-th bit) in TCI
+#ifdef __BIG_ENDIAN
+#define PKT_VLAN_PRESENT_OFFSET() offsetof(struct sk_buff, vlan_tci)
+#else
+#define PKT_VLAN_PRESENT_OFFSET() (offsetof(struct sk_buff, vlan_tci) + 1)
+#endif
__be16 vlan_proto;
__u16 vlan_tci;
#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 2/6] net/bpf_jit: ARM: split VLAN_PRESENT bit handling from VLAN_TCI
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev; +Cc: Russell King, linux-arm-kernel
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
v2: remove one insn for big-endians
arch/arm/net/bpf_jit_32.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 93d0b6d0b63e..0700cbbe4f14 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -915,17 +915,21 @@ static int build_body(struct jit_ctx *ctx)
emit(ARM_LDR_I(r_A, r_skb, off), ctx);
break;
case BPF_ANC | SKF_AD_VLAN_TAG:
- case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
ctx->seen |= SEEN_SKB;
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
off = offsetof(struct sk_buff, vlan_tci);
emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
- if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
- OP_IMM3(ARM_AND, r_A, r_A, ~VLAN_TAG_PRESENT, ctx);
- else {
- OP_IMM3(ARM_LSR, r_A, r_A, 12, ctx);
+#ifdef VLAN_TAG_PRESENT
+ OP_IMM3(ARM_AND, r_A, r_A, ~VLAN_TAG_PRESENT, ctx);
+#endif
+ break;
+ case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+ off = PKT_VLAN_PRESENT_OFFSET();
+ emit(ARM_LDRB_I(r_A, r_skb, off), ctx);
+ if (PKT_VLAN_PRESENT_BIT)
+ OP_IMM3(ARM_LSR, r_A, r_A, PKT_VLAN_PRESENT_BIT, ctx);
+ if (PKT_VLAN_PRESENT_BIT < 7)
OP_IMM3(ARM_AND, r_A, r_A, 0x1, ctx);
- }
break;
case BPF_ANC | SKF_AD_PKTTYPE:
ctx->seen |= SEEN_SKB;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 4/6] net/bpf_jit: PPC: split VLAN_PRESENT bit handling from VLAN_TCI
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev; +Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
arch/powerpc/net/bpf_jit_comp.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 7e706f36e364..22ae63fb9b7d 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -377,18 +377,19 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
hash));
break;
case BPF_ANC | SKF_AD_VLAN_TAG:
- case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
- BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
PPC_LHZ_OFFS(r_A, r_skb, offsetof(struct sk_buff,
vlan_tci));
- if (code == (BPF_ANC | SKF_AD_VLAN_TAG)) {
- PPC_ANDI(r_A, r_A, ~VLAN_TAG_PRESENT);
- } else {
- PPC_ANDI(r_A, r_A, VLAN_TAG_PRESENT);
- PPC_SRWI(r_A, r_A, 12);
- }
+#ifdef VLAN_TAG_PRESENT
+ PPC_ANDI(r_A, r_A, ~VLAN_TAG_PRESENT);
+#endif
+ break;
+ case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+ PPC_LBZ_OFFS(r_A, r_skb, PKT_VLAN_PRESENT_OFFSET());
+ if (PKT_VLAN_PRESENT_BIT)
+ PPC_SRWI(r_A, r_A, PKT_VLAN_PRESENT_BIT);
+ PPC_ANDI(r_A, r_A, 1);
break;
case BPF_ANC | SKF_AD_QUEUE:
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 3/6] net/bpf_jit: MIPS: split VLAN_PRESENT bit handling from VLAN_TCI
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev; +Cc: Ralf Baechle, linux-mips
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
arch/mips/net/bpf_jit.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
index 49a2e2226fee..d06722294ede 100644
--- a/arch/mips/net/bpf_jit.c
+++ b/arch/mips/net/bpf_jit.c
@@ -1138,19 +1138,21 @@ static int build_body(struct jit_ctx *ctx)
emit_load(r_A, r_skb, off, ctx);
break;
case BPF_ANC | SKF_AD_VLAN_TAG:
- case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
ctx->flags |= SEEN_SKB | SEEN_A;
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
vlan_tci) != 2);
off = offsetof(struct sk_buff, vlan_tci);
emit_half_load(r_s0, r_skb, off, ctx);
- if (code == (BPF_ANC | SKF_AD_VLAN_TAG)) {
- emit_andi(r_A, r_s0, (u16)~VLAN_TAG_PRESENT, ctx);
- } else {
- emit_andi(r_A, r_s0, VLAN_TAG_PRESENT, ctx);
- /* return 1 if present */
- emit_sltu(r_A, r_zero, r_A, ctx);
- }
+#ifdef VLAN_TAG_PRESENT
+ emit_andi(r_A, r_s0, (u16)~VLAN_TAG_PRESENT, ctx);
+#endif
+ break;
+ case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+ ctx->flags |= SEEN_SKB | SEEN_A;
+ emit_load_byte(r_A, r_skb, PKT_VLAN_PRESENT_OFFSET(), ctx);
+ if (PKT_VLAN_PRESENT_BIT)
+ emit_srl(r_A, r_A, PKT_VLAN_PRESENT_BIT, ctx);
+ emit_andi(r_A, r_A, 1, ctx);
break;
case BPF_ANC | SKF_AD_PKTTYPE:
ctx->flags |= SEEN_SKB;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 5/6] net/bpf_jit: SPARC: split VLAN_PRESENT bit handling from VLAN_TCI
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, sparclinux
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
arch/sparc/net/bpf_jit_comp.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index a6d9204a6a0b..61cc15dc86f7 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -601,15 +601,17 @@ void bpf_jit_compile(struct bpf_prog *fp)
emit_skb_load32(hash, r_A);
break;
case BPF_ANC | SKF_AD_VLAN_TAG:
- case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
emit_skb_load16(vlan_tci, r_A);
- if (code != (BPF_ANC | SKF_AD_VLAN_TAG)) {
- emit_alu_K(SRL, 12);
- emit_andi(r_A, 1, r_A);
- } else {
- emit_loadimm(~VLAN_TAG_PRESENT, r_TMP);
- emit_and(r_A, r_TMP, r_A);
- }
+#ifdef VLAN_TAG_PRESENT
+ emit_loadimm(~VLAN_TAG_PRESENT, r_TMP);
+ emit_and(r_A, r_TMP, r_A);
+#endif
+ break;
+ case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
+ __emit_skb_load8(__pkt_vlan_present_offset, r_A);
+ if (PKT_VLAN_PRESENT_BIT)
+ emit_alu_K(SRL, PKT_VLAN_PRESENT_BIT);
+ emit_andi(r_A, 1, r_A);
break;
case BPF_LD | BPF_W | BPF_LEN:
emit_skb_load32(len, r_A);
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 6/6] net/bpf: split VLAN_PRESENT bit handling from VLAN_TCI
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Alexei Starovoitov, Daniel Borkmann, Thomas Graf,
Martin KaFai Lau, Craig Gallek
In-Reply-To: <cover.1483492355.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
v2: save an insn on big-endians
net/core/filter.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 1969b3f118c1..7caf0bbbd092 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -188,22 +188,21 @@ static u32 convert_skb_access(int skb_field, int dst_reg, int src_reg,
break;
case SKF_AD_VLAN_TAG:
- case SKF_AD_VLAN_TAG_PRESENT:
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
- BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
/* dst_reg = *(u16 *) (src_reg + offsetof(vlan_tci)) */
*insn++ = BPF_LDX_MEM(BPF_H, dst_reg, src_reg,
offsetof(struct sk_buff, vlan_tci));
- if (skb_field == SKF_AD_VLAN_TAG) {
- *insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg,
- ~VLAN_TAG_PRESENT);
- } else {
- /* dst_reg >>= 12 */
- *insn++ = BPF_ALU32_IMM(BPF_RSH, dst_reg, 12);
- /* dst_reg &= 1 */
+#ifdef VLAN_TAG_PRESENT
+ *insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg, ~VLAN_TAG_PRESENT);
+#endif
+ break;
+ case SKF_AD_VLAN_TAG_PRESENT:
+ *insn++ = BPF_LDX_MEM(BPF_B, dst_reg, src_reg, PKT_VLAN_PRESENT_OFFSET());
+ if (PKT_VLAN_PRESENT_BIT)
+ *insn++ = BPF_ALU32_IMM(BPF_RSH, dst_reg, PKT_VLAN_PRESENT_BIT);
+ if (PKT_VLAN_PRESENT_BIT < 7)
*insn++ = BPF_ALU32_IMM(BPF_AND, dst_reg, 1);
- }
break;
}
--
2.11.0
^ permalink raw reply related
* Re: [PATCH 2/2] PCI: lock each enable/disable num_vfs operation in sysfs
From: Gavin Shan @ 2017-01-04 2:15 UTC (permalink / raw)
To: Emil Tantilov
Cc: linux-pci, intel-wired-lan, alexander.h.duyck, netdev,
linux-kernel
In-Reply-To: <20170104004831.17866.11537.stgit@localhost6.localdomain6>
On Tue, Jan 03, 2017 at 04:48:31PM -0800, Emil Tantilov wrote:
>Enabling/disabling SRIOV via sysfs by echo-ing multiple values
>simultaneously:
>
>echo 63 > /sys/class/net/ethX/device/sriov_numvfs&
>echo 63 > /sys/class/net/ethX/device/sriov_numvfs
>
>sleep 5
>
>echo 0 > /sys/class/net/ethX/device/sriov_numvfs&
>echo 0 > /sys/class/net/ethX/device/sriov_numvfs
>
>Results in the following bug:
>
>kernel BUG at drivers/pci/iov.c:495!
>invalid opcode: 0000 [#1] SMP
>CPU: 1 PID: 8050 Comm: bash Tainted: G W 4.9.0-rc7-net-next #2092
>RIP: 0010:[<ffffffff813b1647>]
> [<ffffffff813b1647>] pci_iov_release+0x57/0x60
>
>Call Trace:
> [<ffffffff81391726>] pci_release_dev+0x26/0x70
> [<ffffffff8155be6e>] device_release+0x3e/0xb0
> [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180
> [<ffffffff81365d9d>] kobject_put+0x2d/0x60
> [<ffffffff8155bc27>] put_device+0x17/0x20
> [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20
> [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90
> [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40
> [<ffffffff8139ccc8>] pci_get_device+0x18/0x20
> [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60
> [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180
> [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140
> [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe]
> [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe]
> [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130
>...
>RIP [<ffffffff813b1647>] pci_iov_release+0x57/0x60
>
>Use the existing mutex lock to protect each enable/disable operation.
>
>CC: Alexander Duyck <alexander.h.duyck@intel.com>
>Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Emil, this is going to change the semantics of pci_enable_sriov() and pci_disable_sriov().
They can be invoked when writing to the sysfs entry, or when loading the PF's driver. With
the change applied, the lock (pf->sriov->lock) isn't acquired and released in the
PF's driver loading path.
I think the reasonable way would be adding a flag in "struct sriov", to indicate
someone is accessing the IOV capability through sysfs file. With this, the code
returns with "-EBUSY" immediately for contenders. With it, nothing is going to
be changed in PF's driver loading path.
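
A rough user-space sketch of the flag idea above, using C11 atomics. The sysfs_busy field and all toy_* names are hypothetical and not part of the real struct pci_sriov; the point is only that a second writer sees the flag set and backs off with -EBUSY instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the SR-IOV state of a PF. */
struct toy_sriov {
	atomic_flag sysfs_busy;   /* set while a sysfs write is in progress */
	int num_vfs;
};

/* Claim the flag; contenders get -EBUSY immediately. */
static int toy_numvfs_begin(struct toy_sriov *iov)
{
	if (atomic_flag_test_and_set(&iov->sysfs_busy))
		return -EBUSY;
	return 0;
}

/* Drop the flag once the enable/disable operation is finished. */
static void toy_numvfs_end(struct toy_sriov *iov)
{
	atomic_flag_clear(&iov->sysfs_busy);
}

/* Models a sysfs store: the whole operation runs with the flag held. */
static int toy_numvfs_store(struct toy_sriov *iov, int num_vfs)
{
	int ret;

	assert(num_vfs >= 0);
	ret = toy_numvfs_begin(iov);
	if (ret)
		return ret;

	iov->num_vfs = num_vfs;   /* stands in for the driver's sriov_configure() */
	toy_numvfs_end(iov);
	return 0;
}
```

Since only the sysfs path touches the flag, a driver calling the enable/disable functions directly while loading would be unaffected in this sketch.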
Also, there are some minor comments as below and I guess most of them won't be
applied if you take my suggestion eventually. However, I'm trying to make the
comments complete.
>---
> drivers/pci/pci-sysfs.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
>diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>index 0666287..5b54cf5 100644
>--- a/drivers/pci/pci-sysfs.c
>+++ b/drivers/pci/pci-sysfs.c
>@@ -472,7 +472,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
> const char *buf, size_t count)
> {
> struct pci_dev *pdev = to_pci_dev(dev);
>+ struct pci_sriov *iov = pdev->sriov;
> int ret;
>+
Unnecessary change.
> u16 num_vfs;
>
> ret = kstrtou16(buf, 0, &num_vfs);
>@@ -482,38 +484,46 @@ static ssize_t sriov_numvfs_store(struct device *dev,
> if (num_vfs > pci_sriov_get_totalvfs(pdev))
> return -ERANGE;
>
>+ mutex_lock(&iov->dev->sriov->lock);
>+
> if (num_vfs == pdev->sriov->num_VFs)
>- return count; /* no change */
>+ goto exit;
>
> /* is PF driver loaded w/callback */
> if (!pdev->driver || !pdev->driver->sriov_configure) {
> dev_info(&pdev->dev, "Driver doesn't support SRIOV configuration via sysfs\n");
>- return -ENOSYS;
>+ ret = -EINVAL;
>+ goto exit;
Why do we need to change the error code here?
> }
>
> if (num_vfs == 0) {
> /* disable VFs */
> ret = pdev->driver->sriov_configure(pdev, 0);
>- if (ret < 0)
>- return ret;
>- return count;
>+ goto exit;
> }
>
> /* enable VFs */
> if (pdev->sriov->num_VFs) {
> dev_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
> pdev->sriov->num_VFs, num_vfs);
>- return -EBUSY;
>+ ret = -EBUSY;
>+ goto exit;
> }
>
> ret = pdev->driver->sriov_configure(pdev, num_vfs);
> if (ret < 0)
>- return ret;
>+ goto exit;
>
> if (ret != num_vfs)
> dev_warn(&pdev->dev, "%d VFs requested; only %d enabled\n",
> num_vfs, ret);
>
>+exit:
>+ mutex_unlock(&iov->dev->sriov->lock);
>+
>+ if (ret < 0)
>+ return ret;
>+
> return count;
The code might be clearer if @ret is returned here. In that case, we need
to set it properly in the error paths.
> }
>
Thanks,
Gavin
^ permalink raw reply
* Re: [PATCH net-next 2/2] net:dsa: check for EPROBE_DEFER from dsa_dst_parse()
From: Andrew Lunn @ 2017-01-04 2:19 UTC (permalink / raw)
To: Volodymyr Bendiuga
Cc: vivien.didelot, f.fainelli, davem, netdev, volodymyr.bendiuga
In-Reply-To: <1481129766-10235-1-git-send-email-volodymyr.bendiuga@westermo.se>
On Wed, Dec 07, 2016 at 05:56:06PM +0100, Volodymyr Bendiuga wrote:
> Since there can be multiple dsa switches stacked together but
> not all of devicetree nodes available at the time of calling
> dsa_dst_parse(), EPROBE_DEFER can be returned by it. When this
> happens, only the last dsa switch has to be deleted by
> dsa_dst_del_ds(), but not the whole list, because next time linux
> cames back to this function it will try to add only the last dsa
> switch which returned EPROBE_DEFER.
>
> Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@westermo.se>
> ---
> net/dsa/dsa2.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
> index 7924c92..0a5ddaa 100644
> --- a/net/dsa/dsa2.c
> +++ b/net/dsa/dsa2.c
> @@ -673,8 +673,14 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device_node *np)
> }
>
> err = dsa_dst_parse(dst);
> - if (err)
> + if (err){
> + if (-EPROBE_DEFER == err) {
Hi Volodymyr
Please can you turn this around, err == -EPROBE_DEFER, to make it
consistent with all the other network code.
With that change
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* [PATCH net-next 0/6] Prepare BPF for VLAN_TAG_PRESENT cleanup
From: Michał Mirosław @ 2017-01-04 1:18 UTC (permalink / raw)
To: netdev
These patches prepare BPF and its JITs for removal of VLAN_TAG_PRESENT.
The set depends on "Preparation for VLAN_TAG_PRESENT cleanup" patchset.
The series is supposed to be bisect-friendly and that requires temporary
insertion of #define VLAN_TAG_PRESENT in BPF code to be able to split
JIT changes per architecture.
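
To illustrate what the split buys, here is a small user-space sketch (toy_* names are hypothetical, and endianness is probed at run time rather than with __BIG_ENDIAN). It checks that loading the single byte that holds bits 15..8 of the TCI and shifting by 4 gives the same answer as loading the whole 16-bit field and shifting by 12.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_VLAN_PRESENT_BIT 4   /* bit 12 of the TCI is bit 4 of its high byte */

/* Byte offset, within a 16-bit field, of the byte holding bits 15..8. */
static size_t toy_vlan_present_offset(void)
{
	const uint16_t probe = 0x0100;
	unsigned char bytes[sizeof(probe)];

	memcpy(bytes, &probe, sizeof(probe));
	assert(bytes[0] == 1 || bytes[1] == 1);
	return bytes[0] ? 0 : 1;   /* 0 on big-endian, 1 on little-endian */
}

/* Old approach: load the 16-bit TCI, shift by 12, mask. */
static unsigned int toy_present_via_halfword(uint16_t tci)
{
	return (tci >> 12) & 1;
}

/* New approach: load one byte at the right offset, shift by 4, mask. */
static unsigned int toy_present_via_byte(uint16_t tci)
{
	unsigned char bytes[sizeof(tci)];

	memcpy(bytes, &tci, sizeof(tci));
	return (bytes[toy_vlan_present_offset()] >> TOY_VLAN_PRESENT_BIT) & 1;
}
```

The byte load only depends on where the bit lives, not on the layout of the rest of vlan_tci, which is what lets each JIT be converted separately.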
Michał Mirosław (6):
net/skbuff: add macros for VLAN_PRESENT bit
net/bpf_jit: ARM: split VLAN_PRESENT bit handling from VLAN_TCI
net/bpf_jit: MIPS: split VLAN_PRESENT bit handling from VLAN_TCI
net/bpf_jit: PPC: split VLAN_PRESENT bit handling from VLAN_TCI
net/bpf_jit: SPARC: split VLAN_PRESENT bit handling from VLAN_TCI
net/bpf: split VLAN_PRESENT bit handling from VLAN_TCI
arch/arm/net/bpf_jit_32.c | 16 ++++++++++------
arch/mips/net/bpf_jit.c | 18 ++++++++++--------
arch/powerpc/net/bpf_jit_comp.c | 17 +++++++++--------
arch/sparc/net/bpf_jit_comp.c | 18 ++++++++++--------
include/linux/skbuff.h | 6 ++++++
net/core/filter.c | 19 +++++++++----------
6 files changed, 54 insertions(+), 40 deletions(-)
--
2.11.0
^ permalink raw reply
* Re: [PATCH net-next] ibmvnic: fix accelerated VLAN handling
From: kbuild test robot @ 2017-01-04 2:36 UTC (permalink / raw)
To: Michał Mirosław; +Cc: kbuild-all, netdev, Thomas Falcon, John Allen
In-Reply-To: <8e3c0fc229bbbc549e2529e3c174b7ef477b181c.1483487887.git.mirq-linux@rere.qmqm.pl>
[-- Attachment #1: Type: text/plain, Size: 2034 bytes --]
Hi Michał,
[auto build test ERROR on net-next/master]
url: https://github.com/0day-ci/linux/commits/Micha-Miros-aw/ibmvnic-fix-accelerated-VLAN-handling/20170104-095210
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
drivers/net/ethernet/ibm/ibmvnic.c: In function 'ibmvnic_xmit':
>> drivers/net/ethernet/ibm/ibmvnic.c:768:40: error: implicit declaration of function 'skb_vlan_tag_present' [-Werror=implicit-function-declaration]
if (adapter->vlan_header_insertion && skb_vlan_tag_present(skb)) {
^~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'ibmvnic_poll':
>> drivers/net/ethernet/ibm/ibmvnic.c:968:4: error: implicit declaration of function '__vlan_hwaccel_put_tag' [-Werror=implicit-function-declaration]
__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), be16_to_cpu(next->rx_comp.vlan_tci));
^~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/skb_vlan_tag_present +768 drivers/net/ethernet/ibm/ibmvnic.c
762 tx_crq.v1.flags1 = IBMVNIC_TX_COMP_NEEDED;
763 tx_crq.v1.correlator = cpu_to_be32(index);
764 tx_crq.v1.dma_reg = cpu_to_be16(tx_pool->long_term_buff.map_id);
765 tx_crq.v1.sge_len = cpu_to_be32(skb->len);
766 tx_crq.v1.ioba = cpu_to_be64(data_dma_addr);
767
> 768 if (adapter->vlan_header_insertion && skb_vlan_tag_present(skb)) {
769 tx_crq.v1.flags2 |= IBMVNIC_TX_VLAN_INSERT;
770 tx_crq.v1.vlan_id = cpu_to_be16(skb->vlan_tci);
771 }
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51750 bytes --]
^ permalink raw reply
* Re: [PATCH net-next 1/2] net:dsa: fix dsa_dst_del_ds()
From: Andrew Lunn @ 2017-01-04 2:45 UTC (permalink / raw)
To: Volodymyr Bendiuga
Cc: vivien.didelot, f.fainelli, davem, netdev, volodymyr.bendiuga
In-Reply-To: <1481129585-9084-1-git-send-email-volodymyr.bendiuga@westermo.se>
On Wed, Dec 07, 2016 at 05:53:05PM +0100, Volodymyr Bendiuga wrote:
> When dsa_dst_del_ds() is called, do not free the whole list,
> instead, only decrement refcount for the switch tree. The list
> will be deleted in dsa_put_dst() if refcount is 0. Nothing
> really needs to be freed for dsa switch, therefore dsa_free_ds()
> is empty. kref_put() will print warning if dsa_free_ds() is not
> passed as a parameter to it.
This does not look correct. I would expect there to be some symmetry.
The dst gets allocated in _dsa_register_switch(), so it should be
freed somewhere in or under _dsa_unregister_switch(). As you say, it
can be freed from dsa_free_dst(), but that is not called from
_dsa_unregister_switch().
dsa_dst_add_ds() and dsa_dst_del_ds() currently look symmetric. Add
increments the ref count for the tree, del decrements it. When it
reaches zero, the tree is freed.
dsa_dst_del_ds() is called from _dsa_unregister_switch(), which gives
us the symmetry with _dsa_register_switch().
What problem are you actually seeing? A double free? A use after free?
Thanks
Andrew
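The add/del symmetry described above can be sketched in user-space C. This is not the DSA code: struct tree, tree_alloc(), tree_add_switch() and tree_del_switch() are hypothetical names, and the sketch assumes a tree starts with no references and each added switch takes exactly one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical switch tree that lives as long as any switch refers to it. */
struct tree {
	unsigned int refcount;
};

/* Allocate a tree with no switches attached yet. */
static struct tree *tree_alloc(void)
{
	return calloc(1, sizeof(struct tree));
}

/* Adding a switch takes one reference on the tree. */
static void tree_add_switch(struct tree *t)
{
	t->refcount++;
}

/*
 * Removing a switch drops one reference. The tree is freed when the
 * last reference goes away; returns true in that case, after which
 * the caller must not touch the pointer again.
 */
static bool tree_del_switch(struct tree *t)
{
	assert(t->refcount > 0);
	if (--t->refcount == 0) {
		free(t);
		return true;
	}
	return false;
}
```

In this shape every tree_add_switch() has exactly one matching tree_del_switch(), so a double free or a use after free would point at an unbalanced call rather than at the delete path itself.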
^ permalink raw reply
* Re: [PATCH net-next V2 3/3] tun: rx batching
From: Jason Wang @ 2017-01-04 3:03 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: netdev, virtualization, linux-kernel, kvm, mst
In-Reply-To: <20170103133303.GC14707@stefanha-x1.localdomain>
On 2017-01-03 21:33, Stefan Hajnoczi wrote:
> On Wed, Dec 28, 2016 at 04:09:31PM +0800, Jason Wang wrote:
>> +static int tun_rx_batched(struct tun_file *tfile, struct sk_buff *skb,
>> + int more)
>> +{
>> + struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
>> + struct sk_buff_head process_queue;
>> + int qlen;
>> + bool rcv = false;
>> +
>> + spin_lock(&queue->lock);
> Should this be spin_lock_bh()? Below and in tun_get_user() there are
> explicit local_bh_disable() calls so I guess BHs can interrupt us here
> and this would deadlock.
sk_write_queue is accessed only in this function, which runs in
process context, so there is no need for spin_lock_bh() here.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [PATCH net 9/9] virtio-net: XDP support for small buffers
From: Jason Wang @ 2017-01-04 3:05 UTC (permalink / raw)
To: John Fastabend, mst, virtualization, netdev, linux-kernel
Cc: john.r.fastabend
In-Reply-To: <586BD408.9030009@gmail.com>
On 2017-01-04 00:40, John Fastabend wrote:
> On 17-01-02 10:16 PM, Jason Wang wrote:
>>
>> On 2017-01-03 06:43, John Fastabend wrote:
>>> On 16-12-23 06:37 AM, Jason Wang wrote:
>>>> Commit f600b6905015 ("virtio_net: Add XDP support") leaves the case of
>>>> small receive buffer untouched. This will confuse users who want to
>>>> set XDP but use small buffers. Rather than forbidding XDP in small buffer
>>>> mode, let's make it work. XDP then can only work at skb->data since
>>>> virtio-net creates skbs during refill; this is suboptimal and could
>>>> be optimized in the future.
>>>>
>>>> Cc: John Fastabend <john.r.fastabend@intel.com>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>> drivers/net/virtio_net.c | 112 ++++++++++++++++++++++++++++++++++++-----------
>>>> 1 file changed, 87 insertions(+), 25 deletions(-)
>>>>
>>> Hi Jason,
>>>
>>> I was doing some more testing on this. What do you think about doing this
>>> so that free_unused_bufs() handles the buffer free with dev_kfree_skb()
>>> instead of put_page in small receive mode? Seems more correct to me.
>>>
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 783e842..27ff76c 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -1898,6 +1898,10 @@ static void free_receive_page_frags(struct virtnet_info
>>> *vi)
>>>
>>> static bool is_xdp_queue(struct virtnet_info *vi, int q)
>>> {
>>> + /* For small receive mode always use kfree_skb variants */
>>> + if (!vi->mergeable_rx_bufs)
>>> + return false;
>>> +
>>> if (q < (vi->curr_queue_pairs - vi->xdp_queue_pairs))
>>> return false;
>>> else if (q < vi->curr_queue_pairs)
>>>
>>>
>>> Patch is untested; just spotted while doing code review.
>>>
>>> Thanks,
>>> John
>> We probably need a better name for this function.
>>
>> Acked-by: Jason Wang <jasowang@redhat.com>
>>
> How about is_xdp_raw_buffer_queue()?
>
> I'll submit a proper patch today.
Sounds good, thanks.
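For reference, the predicate under the proposed name could look roughly like the user-space sketch below. It is not the submitted patch: struct vnet_queues is a hypothetical stand-in for the fields of struct virtnet_info used here, and the last two return values are an assumption, since the quoted diff context ends before them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the virtnet_info fields the check reads. */
struct vnet_queues {
	bool mergeable_rx_bufs;
	int curr_queue_pairs;
	int xdp_queue_pairs;
};

/*
 * Returns true when queue q holds raw XDP buffers, that is, buffers
 * that are not skbs and so must not be released with the kfree_skb
 * variants.
 */
static bool is_xdp_raw_buffer_queue(const struct vnet_queues *vi, int q)
{
	assert(vi->xdp_queue_pairs <= vi->curr_queue_pairs);

	/* For small receive mode always use kfree_skb variants */
	if (!vi->mergeable_rx_bufs)
		return false;

	if (q < (vi->curr_queue_pairs - vi->xdp_queue_pairs))
		return false;
	else if (q < vi->curr_queue_pairs)
		return true;   /* assumed: the trailing queues are XDP queues */
	else
		return false;  /* assumed: q is beyond the active queues */
}
```

With 8 active queue pairs of which 2 are reserved for XDP, this sketch reports queues 6 and 7 as raw-buffer queues in mergeable mode and none in small receive mode.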
^ permalink raw reply
* Re: [net PATCH] net: virtio: cap mtu when XDP programs are running
From: Jason Wang @ 2017-01-04 3:16 UTC (permalink / raw)
To: John Fastabend, mst; +Cc: john.r.fastabend, netdev, alexei.starovoitov, daniel
In-Reply-To: <586BD5D5.6020100@gmail.com>
On 2017-01-04 00:48, John Fastabend wrote:
> On 17-01-02 10:14 PM, Jason Wang wrote:
>>
>> On 2017-01-03 06:30, John Fastabend wrote:
>>> XDP programs can not consume multiple pages so we cap the MTU to
>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>> program load and does not block MTU changes after the program
>>> has loaded.
>>>
>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>> ---
>>> drivers/net/virtio_net.c | 9 ++++++---
>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 5deeda6..783e842 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -1699,6 +1699,9 @@ static void virtnet_init_settings(struct net_device *dev)
>>> .set_settings = virtnet_set_settings,
>>> };
>>> +#define MIN_MTU ETH_MIN_MTU
>>> +#define MAX_MTU ETH_MAX_MTU
>>> +
>>> static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
>>> {
>>> unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
>>> @@ -1748,6 +1751,9 @@ static int virtnet_xdp_set(struct net_device *dev,
>>> struct bpf_prog *prog)
>>> virtnet_set_queues(vi, curr_qp);
>>> return PTR_ERR(prog);
>>> }
>>> + dev->max_mtu = max_sz;
>>> + } else {
>>> + dev->max_mtu = ETH_MAX_MTU;
>> Or use ETH_DATA_LEN here, considering we only allocate a size of GOOD_PACKET_LEN for
>> each small buffer?
>>
>> Thanks
> OK so this logic is a bit too simple. When it resets the max_mtu I guess it
> needs to read the mtu via
>
> virtio_cread16(vdev, ...)
>
> or we may break the negotiated mtu.
Yes, this is a problem (even if we use ETH_MAX_MTU). We may need a way to
notify the device about the MTU in this case, which virtio does not
support now.
>
> As for capping it at GOOD_PACKET_LEN this has the nice benefit of avoiding any
> underestimates in EWMA predictions because it appears min estimates are capped
> at GOOD_PACKET_LEN via get_mergeable_buf_len().
There seems to be a misunderstanding here: I meant using
GOOD_PACKET_LEN only for small buffers (which do not use EWMA).
Thanks
>
> Thanks,
> John
>
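The set/clear logic discussed in this thread can be sketched in user-space C as follows. It only illustrates the idea of restoring a remembered limit instead of a constant when the program is removed: struct sketch_dev, its negotiated_max_mtu field and sketch_xdp_set() are hypothetical, and the sketch assumes the driver has saved the negotiated value somewhere it can read back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; not the kernel's struct net_device. */
struct sketch_dev {
	unsigned int max_mtu;            /* limit enforced on MTU changes */
	unsigned int negotiated_max_mtu; /* assumed: saved at probe time */
};

/*
 * Cap max_mtu to what fits in one page while an XDP program is
 * attached, and restore the remembered limit when it is removed.
 */
static void sketch_xdp_set(struct sketch_dev *dev, bool prog_attached,
			   unsigned int page_size, unsigned int hdr_len)
{
	assert(hdr_len < page_size);

	if (prog_attached)
		dev->max_mtu = page_size - hdr_len;
	else
		dev->max_mtu = dev->negotiated_max_mtu;
}
```

Restoring the saved value rather than ETH_MAX_MTU is one way to avoid the problem raised above, where clearing the cap would also discard the limit agreed with the device.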
^ permalink raw reply
* Re: [RFC PATCH] virtio_net: XDP support for adjust_head
From: Jason Wang @ 2017-01-04 3:21 UTC (permalink / raw)
To: John Fastabend, mst; +Cc: john.r.fastabend, netdev, alexei.starovoitov, daniel
In-Reply-To: <586BD734.7020105@gmail.com>
On 2017-01-04 00:54, John Fastabend wrote:
>>> + /* Changing the headroom in buffers is a disruptive operation because
>>> + * existing buffers must be flushed and reallocated. This will happen
>>> + * when a xdp program is initially added or xdp is disabled by removing
>>> + * the xdp program.
>>> + */
>> We probably need to reset the device here, but maybe Michael has more ideas. And if
>> we do this, another interesting thing to do is to disable EWMA and always use a
>> single page for each packet, this could almost eliminate linearizing.
> Well with normal MTU 1500 size we should not hit the linearizing case right?
My reply may not have been clear; for 1500 I mean small buffers only.
Thanks
> The
> question is should we cap the MTU at GOOD_PACKET_LEN vs the current cap of
> (PAGE_SIZE - overhead).
>
^ permalink raw reply
* Re: [RFC PATCH] virtio_net: XDP support for adjust_head
From: Jason Wang @ 2017-01-04 3:22 UTC (permalink / raw)
To: John Fastabend, mst; +Cc: john.r.fastabend, netdev, alexei.starovoitov, daniel
In-Reply-To: <586BD7F3.60109@gmail.com>
On 2017-01-04 00:57, John Fastabend wrote:
>>>> + /* Changing the headroom in buffers is a disruptive operation because
>>>> + * existing buffers must be flushed and reallocated. This will happen
>>>> + * when a xdp program is initially added or xdp is disabled by removing
>>>> + * the xdp program.
>>>> + */
>>> We probably need to reset the device here, but maybe Michael has more ideas. And if
>>> we do this, another interesting thing to do is to disable EWMA and always use a
>>> single page for each packet, this could almost eliminate linearizing.
>> Well with normal MTU 1500 size we should not hit the linearizing case right? The
>> question is should we cap the MTU at GOOD_PACKET_LEN vs the current cap of
>> (PAGE_SIZE - overhead).
> Sorry responding to my own post with a bit more detail. I don't really like
> going to a page for each packet because we end up with double the pages in use
> for the "normal" 1500 MTU case. We could make the xdp allocation scheme smarter
> and allocate a page per packet when MTU is greater than 2k instead of using the
> EWMA but I would push those types of things at net-next and live with the
> linearizing behavior for now or capping the MTU.
>
Yes, agree.
Thanks
^ permalink raw reply
* Re: [PATCH net-next 0/3] Preparation for VLAN_TAG_PRESENT cleanup
From: David Miller @ 2017-01-04 3:23 UTC (permalink / raw)
To: mirq-linux; +Cc: netdev
In-Reply-To: <cover.1483487429.git.mirq-linux@rere.qmqm.pl>
By submitting these in sections, but all at once, you are subverting
my requirement to submit only small, self-contained patch series.
Please do not do this.
The whole point is to not have a lot of patches in flight for one
thing for people to review at one time. Contributors can more easily
review small, easily digestible pieces.
Start over, and only submit small numbers of patches at one time. Do
not submit new patches until the first series has been processed.
Thank you.
^ permalink raw reply