* Re: [PATCH net-next 1/5] tipc: silence sparse warnings
From: Ying Xue @ 2013-09-27 8:01 UTC (permalink / raw)
To: David Miller
Cc: jon.maloy, netdev, paul.gortmaker, erik.hugne, maloy,
tipc-discussion, andreas.bofjall
In-Reply-To: <20130927.015908.1293107524454870319.davem@davemloft.net>
On 09/27/2013 01:59 PM, David Miller wrote:
> From: Jon Maloy <jon.maloy@ericsson.com>
> Date: Tue, 24 Sep 2013 04:27:44 -0500
>
>> From: Ying Xue <ying.xue@windriver.com>
>>
>> Eliminate below sparse warnings:
>>
>> net/tipc/link.c:1210:37: warning: cast removes address space of expression
>> net/tipc/link.c:1218:59: warning: incorrect type in argument 2 (different address spaces)
>> net/tipc/link.c:1218:59: expected void const [noderef] <asn:1>*from
>> net/tipc/link.c:1218:59: got unsigned char const [usertype] *[assigned] sect_crs
>> net/tipc/msg.c:96:61: warning: incorrect type in argument 3 (different address spaces)
>> net/tipc/msg.c:96:61: expected void const *from
>> net/tipc/msg.c:96:61: got void [noderef] <asn:1>*const iov_base
>> net/tipc/socket.c:341:49: warning: Using plain integer as NULL pointer
>> net/tipc/socket.c:1371:36: warning: Using plain integer as NULL pointer
>> net/tipc/socket.c:1694:57: warning: Using plain integer as NULL pointer
>>
>> Signed-off-by: Ying Xue <ying.xue@windriver.com>
>> Signed-off-by: Andreas Bofjäll <andreas.bofjall@ericsson.com>
>> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
>> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
>
> These warnings are not just for fun, and they are certainly not an
> invitation to stick casts all over the place to make them go away.
>
> They indicate real problems in the TIPC code.
>
> There really are user pointers in the iovecs here, and that's why the
> iov_base member is annotated with "__user".
>
> These iovecs carry pointers that come from userspace via socket calls.
>
> And you absolutely cannot pass user pointers to skb_copy_to_linear_data()
> and friends as they access the source pointer using memcpy().
>
Good point!
It's better for us to use memcpy_fromiovecend() instead of
skb_copy_to_linear_data() and its friends.
We will submit another version to correct this error soon.
Regards,
Ying
> You have to use the proper interfaces for accessing userspace memory,
> ones that have their arguments annotated with __user.
>
> I'm not applying this series, sorry.
>
>
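The point made in this thread is that iovec segments coming from a socket call must be walked with a copy routine that understands where the data lives. The following is a minimal userspace sketch of that walk, not the kernel's memcpy_fromiovecend(); the name toy_copy_from_iovec() and its signature are made up for illustration. Userspace has no user/kernel address split, so the plain memcpy() here only marks the step where kernel code would need a __user-aware copy instead.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical userspace model: copy `len` bytes into `dst`, starting
 * `offset` bytes into the data described by iov[0..iovcnt).
 * Returns 0 on success, -1 if the iovec array holds too little data.
 * In the kernel, the memcpy() below is the step that would have to be
 * a __user-aware copy, because iov_base points into user memory.
 */
static int toy_copy_from_iovec(unsigned char *dst, const struct iovec *iov,
			       int iovcnt, size_t offset, size_t len)
{
	int i;

	for (i = 0; i < iovcnt && len > 0; i++) {
		size_t seg = iov[i].iov_len;
		size_t n;

		if (offset >= seg) {	/* segment lies before the offset */
			offset -= seg;
			continue;
		}
		n = seg - offset;	/* bytes available in this segment */
		if (n > len)
			n = len;
		memcpy(dst, (const unsigned char *)iov[i].iov_base + offset, n);
		dst += n;
		len -= n;
		offset = 0;		/* later segments start at byte 0 */
	}
	return len == 0 ? 0 : -1;
}
```

A caller would pass the destination buffer, the iovec array from the message, the byte offset already consumed and the number of bytes wanted.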
^ permalink raw reply
* Re: [PATCH net 2/2] ip_tunnel: Add fallback tunnels to the hash lists
From: Steffen Klassert @ 2013-09-27 7:56 UTC (permalink / raw)
To: Pravin Shelar; +Cc: David Miller, netdev
In-Reply-To: <CALnjE+pu6CEwB+p3TNH56N09Cwabr9QbaKotKiwbfyLvZUsSpA@mail.gmail.com>
On Thu, Sep 26, 2013 at 11:24:07AM -0700, Pravin Shelar wrote:
> On Thu, Sep 26, 2013 at 1:13 AM, Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > On Wed, Sep 25, 2013 at 09:03:11AM -0700, Pravin Shelar wrote:
> >> fallback tunnel is not required to be in hash table, it is returned if
> >> none of hashed tunnels are matched, ref ip_tunnel_lookup().
> >> Can you post command to reproduce this issue?
> >>
> >
> > Something like
> >
> > ip tunnel change tunl0 mode ipip remote 0.0.0.0 local 0.0.0.0 ttl 0 tos 1
> >
> > worked until v3.9 and stopped working with v3.10.
>
> OK, I see the bug, tunnel exact match lookup does not check fb tunnel.
> There are two options.
> 1. Fix ip_tunnel_find() to check for fb tunnel.
> 2. Add fb tunnel to hash table, which is what your patch does.
> I think your patch is the better solution as it gets rid of the special case.
> But the patch is not complete. It needs to remove fb tunnel checks on
> netdev unregister.
It looks like this is another bug that requires an additional patch.
We add the fallback tunnel to the unregister list when we iterate over
all netdevices in the namespace at the beginning of ip_tunnel_destroy()
and then again explicitly at the end of ip_tunnel_destroy().
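The double queueing described here can be illustrated with a toy model. This is only a sketch under assumptions: struct toy_dev, toy_unregister_queue() and toy_destroy() are hypothetical names, and the "queued" flag is an illustration device, not how the kernel tracks its unregister list and not the fix proposed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
	const char *name;
	bool queued;	/* already on the (modelled) unregister list? */
};

/* Queue a device for unregistration; returns false if already queued. */
static bool toy_unregister_queue(struct toy_dev *dev, size_t *nqueued)
{
	if (dev->queued)
		return false;
	dev->queued = true;
	(*nqueued)++;
	return true;
}

/*
 * Models the teardown order described above: first every device in the
 * namespace (which includes the fallback device), then the fallback
 * device explicitly. Returns how many queue attempts were duplicates.
 */
static size_t toy_destroy(struct toy_dev *devs, size_t ndevs,
			  struct toy_dev *fallback, size_t *nqueued)
{
	size_t dups = 0;
	size_t i;

	for (i = 0; i < ndevs; i++)
		if (!toy_unregister_queue(&devs[i], nqueued))
			dups++;
	if (!toy_unregister_queue(fallback, nqueued))
		dups++;
	return dups;
}
```

With the fallback device also present in the per-namespace array, the second attempt is reported as a duplicate, which is the situation the message points out.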
^ permalink raw reply
* Re: [PATCH net 1/2] ip_tunnel: Fix a memory corruption in ip_tunnel_xmit
From: Steffen Klassert @ 2013-09-27 7:45 UTC (permalink / raw)
To: Pravin Shelar; +Cc: David Miller, netdev
In-Reply-To: <CALnjE+ofaXQa3gsPPxqdWZeBxgML6M1vnC9mmc5dufwy0gJO-Q@mail.gmail.com>
On Thu, Sep 26, 2013 at 11:25:01AM -0700, Pravin Shelar wrote:
> On Thu, Sep 26, 2013 at 1:25 AM, Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > On Wed, Sep 25, 2013 at 09:55:50AM -0700, Pravin Shelar wrote:
> >> On Tue, Sep 24, 2013 at 10:54 PM, Steffen Klassert
> >> <steffen.klassert@secunet.com> wrote:
> >> > We might extend the used area of a skb beyond the total
> >> > headroom when we install the ipip header. Fix this by
> >> > calling skb_cow_head() unconditionally.
> >> >
> >> It is better to call skb_cow_head() from ipip_tunnel_xmit() as it is
> >> consistent with gre.
> >
> > I think this would just move the bug from ipip to gre. ipgre_xmit()
> > uses dev->needed_headroom which is based on the guessed output device
> > in ip_tunnel_bind_dev(). If the device we get from the route lookup
> > in ip_tunnel_xmit() is different from the guessed one and the resulting
> > max_headroom is bigger than dev->needed_headroom, we run into that bug
> > because skb_cow_head() will not be called with the updated
> > dev->needed_headroom.
> >
> That's why ip_tunnel_xmit() updates dev->needed_headroom.
> Just to be clear, I was talking about calling skb_cow_head() with
> dev->needed_headroom in ipip_xmit and leaving ip_tunnel_xmit as it is.
> So that in most cases we will not need to adjust headroom in ip_tunnel
> xmit.
skb_cow_head() does not do much if there is enough headroom, so
calling it here is uncritical. But we should adjust the headroom
as soon as we know that it is insufficient.
Also, I really wonder how you want to adjust the headroom in
ipip_tunnel_xmit() to a correct value. We know the needed
headroom after the route lookup in ip_tunnel_xmit() and
we have to adjust it here because ip_tunnel_xmit() calls
iptunnel_xmit() which does a __skb_push() before it
installs the IP header.
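The argument that growing the headroom is cheap when there is already enough of it, and must happen before the header is pushed, can be sketched in userspace. This is a toy model only: struct toy_buf and toy_cow_head() are hypothetical and much simpler than sk_buff and skb_cow_head(), which also deal with cloned and shared buffers.

```c
#include <stdlib.h>
#include <string.h>

struct toy_buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the used area */
	size_t len;		/* bytes used from data onwards */
};

static size_t toy_headroom(const struct toy_buf *b)
{
	return (size_t)(b->data - b->head);
}

/*
 * Ensure at least `needed` bytes of headroom. This is a cheap no-op when
 * there is already enough room; otherwise the buffer is reallocated and
 * the payload moved. Returns 0 on success, -1 on allocation failure.
 */
static int toy_cow_head(struct toy_buf *b, size_t needed)
{
	unsigned char *nhead;

	if (toy_headroom(b) >= needed)
		return 0;

	nhead = malloc(needed + b->len);
	if (!nhead)
		return -1;
	memcpy(nhead + needed, b->data, b->len);
	free(b->head);
	b->head = nhead;
	b->data = nhead + needed;
	return 0;
}
```

In this model the caller would invoke toy_cow_head() once the real requirement is known (here: after the route lookup) and only then move the data pointer back to install a header.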
Please keep in mind that this is a bug fix that might be interesting
for stable too, we should try to keep the changes at a minimum.
Another thing that I noticed, with commit 0e6fbc5b
(ip_tunnels: extend iptunnel_xmit()) you moved the IP header
installation to iptunnel_xmit() and changed skb_push()
to __skb_push(). This made this bug quite hard to track
down because instead of triggering an skb_under_panic(),
it did a silent memory corruption and crashed at random
other places. Maybe we should change this back to skb_push().
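The difference between a checked and an unchecked push can be shown with a small userspace model. The names struct toy_pkt, toy_push_unchecked() and toy_push_checked() are invented for this sketch; the checked variant returns NULL where the kernel's skb_push() would panic, which is an assumption made only to keep the example testable.

```c
#include <stddef.h>

struct toy_pkt {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the used area */
	size_t len;		/* bytes used from data onwards */
};

/* Unchecked push: trusts the caller to have verified the headroom. */
static unsigned char *toy_push_unchecked(struct toy_pkt *p, size_t n)
{
	p->data -= n;
	p->len += n;
	return p->data;
}

/*
 * Checked push: refuses to move data below head and leaves the packet
 * untouched in that case. Returns NULL on under-run.
 */
static unsigned char *toy_push_checked(struct toy_pkt *p, size_t n)
{
	if ((size_t)(p->data - p->head) < n)
		return NULL;
	return toy_push_unchecked(p, n);
}
```

With the checked variant an oversized push fails at the call site; with the unchecked one the same call would silently write below the start of the buffer, which matches the hard-to-track corruption described above.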
^ permalink raw reply
* [PATCH net 1/1] qlcnic: Fix register device in FAILED state for 82xx.
From: Sucheta Chakraborty @ 2013-09-27 6:12 UTC (permalink / raw)
To: davem; +Cc: netdev, Dept-HSGLinuxNICDev
o Commit 7e2cf4feba058476324dc545e3d1b316998c91e6
("qlcnic: change driver hardware interface mechanism")
has overwritten
commit b43e5ee76a4320c070cf0fe65cf4927198fbb4d1
("qlcnic: Register device in FAILED state")
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
---
.../net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c | 8 +++++
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 39 ++++++++++++++++++++--
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c | 12 +++++++
3 files changed, 57 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
index 4d7ad00..ebe4c86 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
@@ -1794,3 +1794,11 @@ const struct ethtool_ops qlcnic_sriov_vf_ethtool_ops = {
.set_msglevel = qlcnic_set_msglevel,
.get_msglevel = qlcnic_get_msglevel,
};
+
+const struct ethtool_ops qlcnic_ethtool_failed_ops = {
+ .get_settings = qlcnic_get_settings,
+ .get_drvinfo = qlcnic_get_drvinfo,
+ .set_msglevel = qlcnic_set_msglevel,
+ .get_msglevel = qlcnic_get_msglevel,
+ .set_dump = qlcnic_set_dump,
+};
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index c4c5023..21d00a0 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -431,6 +431,9 @@ static void qlcnic_82xx_cancel_idc_work(struct qlcnic_adapter *adapter)
while (test_and_set_bit(__QLCNIC_RESETTING, &adapter->state))
usleep_range(10000, 11000);
+ if (!adapter->fw_work.work.func)
+ return;
+
cancel_delayed_work_sync(&adapter->fw_work);
}
@@ -2275,8 +2278,9 @@ qlcnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
adapter->portnum = adapter->ahw->pci_func;
err = qlcnic_start_firmware(adapter);
if (err) {
- dev_err(&pdev->dev, "Loading fw failed.Please Reboot\n");
- goto err_out_free_hw;
+ dev_err(&pdev->dev, "Loading fw failed.Please Reboot\n"
+ "\t\tIf reboot doesn't help, try flashing the card\n");
+ goto err_out_maintenance_mode;
}
qlcnic_get_multiq_capability(adapter);
@@ -2408,6 +2412,22 @@ err_out_disable_pdev:
pci_set_drvdata(pdev, NULL);
pci_disable_device(pdev);
return err;
+
+err_out_maintenance_mode:
+ netdev->netdev_ops = &qlcnic_netdev_failed_ops;
+ SET_ETHTOOL_OPS(netdev, &qlcnic_ethtool_failed_ops);
+ err = register_netdev(netdev);
+
+ if (err) {
+ dev_err(&pdev->dev, "Failed to register net device\n");
+ qlcnic_clr_all_drv_state(adapter, 0);
+ goto err_out_free_hw;
+ }
+
+ pci_set_drvdata(pdev, adapter);
+ qlcnic_add_sysfs(adapter);
+
+ return 0;
}
static void qlcnic_remove(struct pci_dev *pdev)
@@ -2518,8 +2538,16 @@ static int qlcnic_resume(struct pci_dev *pdev)
static int qlcnic_open(struct net_device *netdev)
{
struct qlcnic_adapter *adapter = netdev_priv(netdev);
+ u32 state;
int err;
+ state = QLC_SHARED_REG_RD32(adapter, QLCNIC_CRB_DEV_STATE);
+ if (state == QLCNIC_DEV_FAILED || state == QLCNIC_DEV_BADBAD) {
+ netdev_err(netdev, "%s: Device is in FAILED state\n", __func__);
+
+ return -EIO;
+ }
+
netif_carrier_off(netdev);
err = qlcnic_attach(adapter);
@@ -3228,6 +3256,13 @@ void qlcnic_82xx_dev_request_reset(struct qlcnic_adapter *adapter, u32 key)
return;
state = QLC_SHARED_REG_RD32(adapter, QLCNIC_CRB_DEV_STATE);
+ if (state == QLCNIC_DEV_FAILED || state == QLCNIC_DEV_BADBAD) {
+ netdev_err(adapter->netdev, "%s: Device is in FAILED state\n",
+ __func__);
+ qlcnic_api_unlock(adapter);
+
+ return;
+ }
if (state == QLCNIC_DEV_READY) {
QLC_SHARED_REG_WR32(adapter, QLCNIC_CRB_DEV_STATE,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
index c6165d0..019f437 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
@@ -1272,6 +1272,7 @@ void qlcnic_remove_sysfs_entries(struct qlcnic_adapter *adapter)
void qlcnic_create_diag_entries(struct qlcnic_adapter *adapter)
{
struct device *dev = &adapter->pdev->dev;
+ u32 state;
if (device_create_bin_file(dev, &bin_attr_port_stats))
dev_info(dev, "failed to create port stats sysfs entry");
@@ -1285,8 +1286,13 @@ void qlcnic_create_diag_entries(struct qlcnic_adapter *adapter)
if (device_create_bin_file(dev, &bin_attr_mem))
dev_info(dev, "failed to create mem sysfs entry\n");
+ state = QLC_SHARED_REG_RD32(adapter, QLCNIC_CRB_DEV_STATE);
+ if (state == QLCNIC_DEV_FAILED || state == QLCNIC_DEV_BADBAD)
+ return;
+
if (device_create_bin_file(dev, &bin_attr_pci_config))
dev_info(dev, "failed to create pci config sysfs entry");
+
if (device_create_file(dev, &dev_attr_beacon))
dev_info(dev, "failed to create beacon sysfs entry");
@@ -1307,6 +1313,7 @@ void qlcnic_create_diag_entries(struct qlcnic_adapter *adapter)
void qlcnic_remove_diag_entries(struct qlcnic_adapter *adapter)
{
struct device *dev = &adapter->pdev->dev;
+ u32 state;
device_remove_bin_file(dev, &bin_attr_port_stats);
@@ -1315,6 +1322,11 @@ void qlcnic_remove_diag_entries(struct qlcnic_adapter *adapter)
device_remove_file(dev, &dev_attr_diag_mode);
device_remove_bin_file(dev, &bin_attr_crb);
device_remove_bin_file(dev, &bin_attr_mem);
+
+ state = QLC_SHARED_REG_RD32(adapter, QLCNIC_CRB_DEV_STATE);
+ if (state == QLCNIC_DEV_FAILED || state == QLCNIC_DEV_BADBAD)
+ return;
+
device_remove_bin_file(dev, &bin_attr_pci_config);
device_remove_file(dev, &dev_attr_beacon);
if (!(adapter->flags & QLCNIC_ESWITCH_ENABLED))
--
1.8.1.4
^ permalink raw reply related
* [PATCH net-next 2/2] netxen_nic: Update version to 4.0.82
From: Shahed Shaikh @ 2013-09-27 5:42 UTC (permalink / raw)
To: davem; +Cc: netdev, Dept-HSGLinuxNICDev, Shahed Shaikh
In-Reply-To: <1380260547-25903-1-git-send-email-shahed.shaikh@qlogic.com>
From: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
---
drivers/net/ethernet/qlogic/netxen/netxen_nic.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic.h b/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
index e8eff3e..9adcdbb 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
@@ -53,8 +53,8 @@
#define _NETXEN_NIC_LINUX_MAJOR 4
#define _NETXEN_NIC_LINUX_MINOR 0
-#define _NETXEN_NIC_LINUX_SUBVERSION 81
-#define NETXEN_NIC_LINUX_VERSIONID "4.0.81"
+#define _NETXEN_NIC_LINUX_SUBVERSION 82
+#define NETXEN_NIC_LINUX_VERSIONID "4.0.82"
#define NETXEN_VERSION_CODE(a, b, c) (((a) << 24) + ((b) << 16) + (c))
#define _major(v) (((v) >> 24) & 0xff)
--
1.5.6
^ permalink raw reply related
* [PATCH net-next 1/2] netxen_nic: Print ULA information
From: Shahed Shaikh @ 2013-09-27 5:42 UTC (permalink / raw)
To: davem; +Cc: netdev, Dept-HSGLinuxNICDev, Shahed Shaikh
In-Reply-To: <1380260547-25903-1-git-send-email-shahed.shaikh@qlogic.com>
From: Shahed Shaikh <shahed.shaikh@qlogic.com>
This patch reads CAMRAM(0x178) where FW writes a key for ULA and non-ULA
adapter and based on the key, driver logs the message.
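As a standalone illustration of the key-to-message decision, here is a minimal sketch. The two key values are taken from the patch below; the function name toy_ula_message() and the TOY_ prefixed macros are hypothetical, and reading the register itself is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ULA_ADAPTER_KEY	0xdaddad01u
#define TOY_NON_ULA_ADAPTER_KEY	0xdaddad00u

/* Map the key to a log message, or NULL when nothing should be logged. */
static const char *toy_ula_message(uint32_t key)
{
	switch (key) {
	case TOY_ULA_ADAPTER_KEY:
		return "ULA adapter";
	case TOY_NON_ULA_ADAPTER_KEY:
		return "non ULA adapter";
	default:
		return NULL;	/* unknown key: stay silent */
	}
}
```

Any value other than the two known keys produces no message, mirroring the default branch of the switch in the patch.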
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
---
.../net/ethernet/qlogic/netxen/netxen_nic_hdr.h | 1 +
.../net/ethernet/qlogic/netxen/netxen_nic_main.c | 28 ++++++++++++++++++++
2 files changed, 29 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hdr.h b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hdr.h
index 32c7906..0c64c82 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hdr.h
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hdr.h
@@ -958,6 +958,7 @@ enum {
#define NETXEN_PEG_HALT_STATUS2 (NETXEN_CAM_RAM(0xac))
#define NX_CRB_DEV_REF_COUNT (NETXEN_CAM_RAM(0x138))
#define NX_CRB_DEV_STATE (NETXEN_CAM_RAM(0x140))
+#define NETXEN_ULA_KEY (NETXEN_CAM_RAM(0x178))
/* MiniDIMM related macros */
#define NETXEN_DIMM_CAPABILITY (NETXEN_CAM_RAM(0x258))
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
index cbd75f9..5ec21c5 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
@@ -1415,6 +1415,32 @@ netxen_setup_netdev(struct netxen_adapter *adapter,
return 0;
}
+#define NETXEN_ULA_ADAPTER_KEY (0xdaddad01)
+#define NETXEN_NON_ULA_ADAPTER_KEY (0xdaddad00)
+
+static void netxen_read_ula_info(struct netxen_adapter *adapter)
+{
+ u32 temp;
+
+ /* Print ULA info only once for an adapter */
+ if (adapter->portnum != 0)
+ return;
+
+ temp = NXRD32(adapter, NETXEN_ULA_KEY);
+ switch (temp) {
+ case NETXEN_ULA_ADAPTER_KEY:
+ dev_info(&adapter->pdev->dev, "ULA adapter");
+ break;
+ case NETXEN_NON_ULA_ADAPTER_KEY:
+ dev_info(&adapter->pdev->dev, "non ULA adapter");
+ break;
+ default:
+ break;
+ }
+
+ return;
+}
+
#ifdef CONFIG_PCIEAER
static void netxen_mask_aer_correctable(struct netxen_adapter *adapter)
{
@@ -1561,6 +1587,8 @@ netxen_nic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
goto err_out_disable_msi;
}
+ netxen_read_ula_info(adapter);
+
err = netxen_setup_netdev(adapter, netdev);
if (err)
goto err_out_disable_msi;
--
1.5.6
^ permalink raw reply related
* [PATCH net-next 0/2] netxen_nic: Minor enhancement
From: Shahed Shaikh @ 2013-09-27 5:42 UTC (permalink / raw)
To: davem; +Cc: netdev, Dept-HSGLinuxNICDev, Shahed Shaikh
From: Shahed Shaikh <shahed.shaikh@qlogic.com>
This patch series contains following changes
* Log a message about ULA adapter type.
* Update the driver version to 4.0.82.
Please apply to net-next.
Thanks,
Shahed
Shahed Shaikh (2):
netxen_nic: Print ULA information
netxen_nic: Update version to 4.0.82
drivers/net/ethernet/qlogic/netxen/netxen_nic.h | 4 +-
.../net/ethernet/qlogic/netxen/netxen_nic_hdr.h | 1 +
.../net/ethernet/qlogic/netxen/netxen_nic_main.c | 28 ++++++++++++++++++++
3 files changed, 31 insertions(+), 2 deletions(-)
^ permalink raw reply
* Re: pull-request: can-next 2013-09-21
From: David Miller @ 2013-09-27 6:10 UTC (permalink / raw)
To: mkl; +Cc: netdev, linux-can, kernel
In-Reply-To: <523DA52E.2030701@pengutronix.de>
From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Sat, 21 Sep 2013 15:54:54 +0200
> this is a pull request for net-next. It consists of two patches by Uwe
> Kleine-König, they add explicit copyrights to the uapi CAN headers
> (including Acked-bys of the header authors). And 12 cleanup patches by
> Jingoo Han. Several drivers are converted to use dev_get_platdata()
> instead of open coding it, two unnecessary pci_set_drvdata() are
> removed from the CAN PCI drivers.
Pulled, thanks.
^ permalink raw reply
* Re: pull-request: can 2013-09-21
From: David Miller @ 2013-09-27 6:08 UTC (permalink / raw)
To: mkl; +Cc: netdev, linux-can, kernel
In-Reply-To: <1379772493-7856-1-git-send-email-mkl@pengutronix.de>
From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Sat, 21 Sep 2013 16:08:12 +0200
> here is a fix for the v3.12 release cycle. Alexey Khoroshilov from the Linux
> Driver Verification project submitted a patch that fixes a memory leak in the
> failure paths of the peak USB driver.
Pulled, thanks.
^ permalink raw reply
* Re: [PATCH net-next 1/5] tipc: silence sparse warnings
From: David Miller @ 2013-09-27 5:59 UTC (permalink / raw)
To: jon.maloy
Cc: netdev, paul.gortmaker, erik.hugne, ying.xue, maloy,
tipc-discussion, andreas.bofjall
In-Reply-To: <1380014868-2797-2-git-send-email-jon.maloy@ericsson.com>
From: Jon Maloy <jon.maloy@ericsson.com>
Date: Tue, 24 Sep 2013 04:27:44 -0500
> From: Ying Xue <ying.xue@windriver.com>
>
> Eliminate below sparse warnings:
>
> net/tipc/link.c:1210:37: warning: cast removes address space of expression
> net/tipc/link.c:1218:59: warning: incorrect type in argument 2 (different address spaces)
> net/tipc/link.c:1218:59: expected void const [noderef] <asn:1>*from
> net/tipc/link.c:1218:59: got unsigned char const [usertype] *[assigned] sect_crs
> net/tipc/msg.c:96:61: warning: incorrect type in argument 3 (different address spaces)
> net/tipc/msg.c:96:61: expected void const *from
> net/tipc/msg.c:96:61: got void [noderef] <asn:1>*const iov_base
> net/tipc/socket.c:341:49: warning: Using plain integer as NULL pointer
> net/tipc/socket.c:1371:36: warning: Using plain integer as NULL pointer
> net/tipc/socket.c:1694:57: warning: Using plain integer as NULL pointer
>
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Andreas Bofjäll <andreas.bofjall@ericsson.com>
> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
These warnings are not just for fun, and they are certainly not an
invitation to stick casts all over the place to make them go away.
They indicate real problems in the TIPC code.
There really are user pointers in the iovecs here, and that's why the
iov_base member is annotated with "__user".
These iovecs carry pointers that come from userspace via socket calls.
And you absolutely cannot pass user pointers to skb_copy_to_linear_data()
and friends as they access the source pointer using memcpy().
You have to use the proper interfaces for accessing userspace memory,
ones that have their arguments annotated with __user.
I'm not applying this series, sorry.
^ permalink raw reply
* [PATCH net-next] virtio-net: switch to use XPS to choose txq
From: Jason Wang @ 2013-09-27 5:57 UTC (permalink / raw)
To: netdev, linux-kernel, virtualization; +Cc: Michael S. Tsirkin
We used to use a percpu structure vq_index to record the cpu to queue
mapping, this is suboptimal since it duplicates the work of XPS and
loses all other XPS functionality such as allowing users to configure
their own transmission steering strategy.
So this patch switches to use XPS and suggests a default mapping when
the number of cpus is equal to the number of queues. With XPS support,
there's no need for keeping per-cpu vq_index and .ndo_select_queue(),
so they were removed also.
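The suggested default (one CPU per queue when the counts match) can be modelled in a few lines. This is a sketch under assumptions: struct toy_xps_map, toy_xps_default() and toy_xps_pick() are hypothetical, CPUs are limited to 64 by the bitmask, and falling back to queue 0 is a simplification rather than what the networking core does when no XPS entry matches.

```c
#include <stdint.h>

#define TOY_MAX_QUEUES 8

/* One CPU bitmask per transmit queue; bit N set means CPU N may use it. */
struct toy_xps_map {
	uint64_t cpus[TOY_MAX_QUEUES];
	int nqueues;
};

/* Default mapping when ncpus == nqueues: CPU i gets queue i exclusively. */
static int toy_xps_default(struct toy_xps_map *map, int ncpus, int nqueues)
{
	int i;

	if (ncpus != nqueues || nqueues <= 0 || nqueues > TOY_MAX_QUEUES)
		return -1;
	map->nqueues = nqueues;
	for (i = 0; i < nqueues; i++)
		map->cpus[i] = UINT64_C(1) << i;
	return 0;
}

/* Pick the first queue whose mask contains `cpu`; fall back to queue 0. */
static int toy_xps_pick(const struct toy_xps_map *map, int cpu)
{
	int q;

	if (cpu < 0 || cpu >= 64)
		return 0;
	for (q = 0; q < map->nqueues; q++)
		if (map->cpus[q] & (UINT64_C(1) << cpu))
			return q;
	return 0;
}
```

Because the mapping lives in a per-queue mask rather than a private per-cpu index, an administrator could overwrite the masks with a different steering policy without touching the driver.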
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/net/virtio_net.c | 55 +++++++--------------------------------------
1 files changed, 9 insertions(+), 46 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index defec2b..4102c1b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -127,9 +127,6 @@ struct virtnet_info {
/* Does the affinity hint is set for virtqueues? */
bool affinity_hint_set;
- /* Per-cpu variable to show the mapping from CPU to virtqueue */
- int __percpu *vq_index;
-
/* CPU hot plug notifier */
struct notifier_block nb;
};
@@ -1063,7 +1060,6 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev,
static void virtnet_clean_affinity(struct virtnet_info *vi, long hcpu)
{
int i;
- int cpu;
if (vi->affinity_hint_set) {
for (i = 0; i < vi->max_queue_pairs; i++) {
@@ -1073,20 +1069,11 @@ static void virtnet_clean_affinity(struct virtnet_info *vi, long hcpu)
vi->affinity_hint_set = false;
}
-
- i = 0;
- for_each_online_cpu(cpu) {
- if (cpu == hcpu) {
- *per_cpu_ptr(vi->vq_index, cpu) = -1;
- } else {
- *per_cpu_ptr(vi->vq_index, cpu) =
- ++i % vi->curr_queue_pairs;
- }
- }
}
static void virtnet_set_affinity(struct virtnet_info *vi)
{
+ cpumask_var_t cpumask;
int i;
int cpu;
@@ -1100,15 +1087,21 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
return;
}
+ if (!alloc_cpumask_var(&cpumask, GFP_KERNEL))
+ return;
+
i = 0;
for_each_online_cpu(cpu) {
virtqueue_set_affinity(vi->rq[i].vq, cpu);
virtqueue_set_affinity(vi->sq[i].vq, cpu);
- *per_cpu_ptr(vi->vq_index, cpu) = i;
+ cpumask_clear(cpumask);
+ cpumask_set_cpu(cpu, cpumask);
+ netif_set_xps_queue(vi->dev, cpumask, i);
i++;
}
vi->affinity_hint_set = true;
+ free_cpumask_var(cpumask);
}
static int virtnet_cpu_callback(struct notifier_block *nfb,
@@ -1217,28 +1210,6 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
return 0;
}
-/* To avoid contending a lock hold by a vcpu who would exit to host, select the
- * txq based on the processor id.
- */
-static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
-{
- int txq;
- struct virtnet_info *vi = netdev_priv(dev);
-
- if (skb_rx_queue_recorded(skb)) {
- txq = skb_get_rx_queue(skb);
- } else {
- txq = *__this_cpu_ptr(vi->vq_index);
- if (txq == -1)
- txq = 0;
- }
-
- while (unlikely(txq >= dev->real_num_tx_queues))
- txq -= dev->real_num_tx_queues;
-
- return txq;
-}
-
static const struct net_device_ops virtnet_netdev = {
.ndo_open = virtnet_open,
.ndo_stop = virtnet_close,
@@ -1250,7 +1221,6 @@ static const struct net_device_ops virtnet_netdev = {
.ndo_get_stats64 = virtnet_stats,
.ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
- .ndo_select_queue = virtnet_select_queue,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = virtnet_netpoll,
#endif
@@ -1559,10 +1529,6 @@ static int virtnet_probe(struct virtio_device *vdev)
if (vi->stats == NULL)
goto free;
- vi->vq_index = alloc_percpu(int);
- if (vi->vq_index == NULL)
- goto free_stats;
-
mutex_init(&vi->config_lock);
vi->config_enable = true;
INIT_WORK(&vi->config_work, virtnet_config_changed_work);
@@ -1589,7 +1555,7 @@ static int virtnet_probe(struct virtio_device *vdev)
/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
err = init_vqs(vi);
if (err)
- goto free_index;
+ goto free_stats;
netif_set_real_num_tx_queues(dev, 1);
netif_set_real_num_rx_queues(dev, 1);
@@ -1640,8 +1606,6 @@ free_recv_bufs:
free_vqs:
cancel_delayed_work_sync(&vi->refill);
virtnet_del_vqs(vi);
-free_index:
- free_percpu(vi->vq_index);
free_stats:
free_percpu(vi->stats);
free:
@@ -1678,7 +1642,6 @@ static void virtnet_remove(struct virtio_device *vdev)
flush_work(&vi->config_work);
- free_percpu(vi->vq_index);
free_percpu(vi->stats);
free_netdev(vi->dev);
}
--
1.7.1
^ permalink raw reply related
* Re: [PATCH 1/4] [RFC] net: Explicitly initialize u64_stats_sync structures for lockdep
From: Ingo Molnar @ 2013-09-27 5:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: John Stultz, LKML, Thomas Petazzoni, Mirko Lindner,
Stephen Hemminger, Roger Luethi, Patrick McHardy, Rusty Russell,
Michael S. Tsirkin, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Wensong Zhang, Simon Horman, Julian Anastasov,
Jesse Gross, Mathieu Desnoyers, Steven Rostedt, Peter Zijlstra,
Thomas Gleixner, David S. Miller, netdev, netfilter-devel
In-Reply-To: <1380223585.3165.205.camel@edumazet-glaptop>
* Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2013-09-26 at 11:34 -0700, John Stultz wrote:
> > In order to enable lockdep on seqcount/seqlock structures, we
> > must explicitly initialize any locks.
> >
>
> > diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
> > index 8da8c4e..c450e11 100644
> > --- a/include/linux/u64_stats_sync.h
> > +++ b/include/linux/u64_stats_sync.h
> > @@ -67,6 +67,13 @@ struct u64_stats_sync {
> > #endif
> > };
> >
> > +
> > +#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> > +#define u64_stats_init(syncp) seqcount_init(syncp.seq)
> > +#else
> > +#define u64_stats_init(syncp)
> > +#endif
> > +
>
> I would prefer a function.
C cannot pass along symbolic names, unfortunately, so we are stuck with
1970's tech and the C preprocessor.
There's a way to make such macros look a tiny bit more structured and thus
be more palatable:
#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
# define u64_stats_init(syncp) seqcount_init(syncp.seq)
#else
# define u64_stats_init(syncp)
#endif
Note, the 'else' branch should probably be:
# define u64_stats_init(syncp) do { } while (0)
Thanks,
Ingo
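The reason for preferring "do { } while (0)" over an empty expansion can be seen in a small standalone example. Everything here is hypothetical (struct toy_sync, toy_stats_init(), the TOY_NEEDS_SEQCOUNT switch); it only demonstrates that the macro behaves as one statement in both configurations, whereas an empty expansion would leave a bare ";" as the if body, which some compiler warning settings flag.

```c
#include <stdbool.h>

struct toy_sync {
	unsigned int seq;
};

#ifdef TOY_NEEDS_SEQCOUNT
# define toy_stats_init(syncp)	do { (syncp)->seq = 0; } while (0)
#else
# define toy_stats_init(syncp)	do { } while (0)
#endif

/*
 * Uses the macro as the sole body of an if/else. This compiles the same
 * way whether or not TOY_NEEDS_SEQCOUNT is defined, because the macro
 * always expands to exactly one statement that needs a trailing ';'.
 */
static bool toy_setup(struct toy_sync *s, bool want_init)
{
	if (want_init)
		toy_stats_init(s);
	else
		return false;
	return true;
}
```

Building once with -DTOY_NEEDS_SEQCOUNT and once without exercises both branches of the #if.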
^ permalink raw reply
* RE: [PATCH 08/11] iwlwifi: Remove extern from function prototypes
From: Grumbach, Emmanuel @ 2013-09-27 4:25 UTC (permalink / raw)
To: Joe Perches, netdev@vger.kernel.org
Cc: David S. Miller, Berg, Johannes, Intel Linux Wireless,
John W. Linville, linux-wireless@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <b3818394ccdf68d834ca1060c1d5a1fc44e2daee.1380137610.git.joe@perches.com>
> Subject: [PATCH 08/11] iwlwifi: Remove extern from function prototypes
>
> There are a mix of function prototypes with and without extern in the kernel
> sources. Standardize on not using extern for function prototypes.
>
> Function prototypes don't need to be written with extern.
> extern is assumed by the compiler. Its use is as unnecessary as using auto to
> declare automatic/local variables in a block.
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> drivers/net/wireless/iwlwifi/dvm/agn.h | 2 +-
> drivers/net/wireless/iwlwifi/dvm/dev.h | 2 +-
> drivers/net/wireless/iwlwifi/dvm/rs.h | 8 ++++----
> drivers/net/wireless/iwlwifi/mvm/rs.h | 9 ++++-----
> 4 files changed, 10 insertions(+), 11 deletions(-)
>
ACK
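The claim in the quoted commit message, that extern is implied on function prototypes, can be checked with a short example. The function toy_add() is made up for illustration; the two declarations below are compatible and both give the function external linkage.

```c
/* Both lines declare the same function with external linkage. */
extern int toy_add(int a, int b);
int toy_add(int a, int b);

/* A single definition satisfies both declarations. */
int toy_add(int a, int b)
{
	return a + b;
}
```

The same does not hold for objects: for a variable at file scope, dropping extern turns a declaration into a tentative definition, which is why the cleanup is restricted to function prototypes.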
^ permalink raw reply
* linux-next: manual merge of the wireless-next tree with the net-next tree
From: Stephen Rothwell @ 2013-09-27 3:19 UTC (permalink / raw)
To: John W. Linville
Cc: linux-next, linux-kernel, Joe Perches, David Miller, netdev,
Catalin Iacob
[-- Attachment #1: Type: text/plain, Size: 1545 bytes --]
Hi John,
Today's linux-next merge of the wireless-next tree got a conflict in
drivers/net/wireless/rtlwifi/rtl8192ce/phy.h between commit a958df5dc306
("rtlwifi: Remove extern from function prototypes") from the net-next
tree and commit 3a1ea9fd9351 ("rtlwifi: remove duplicate declarations and
macros in headers") from the wireless-next tree.
I fixed it up (see below) and can carry the fix as necessary (no action
is required).
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
diff --cc drivers/net/wireless/rtlwifi/rtl8192ce/phy.h
index f8973e5,80a0893..0000000
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/phy.h
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/phy.h
@@@ -217,10 -222,10 +215,9 @@@ void _rtl92ce_phy_lc_calibrate(struct i
void rtl92c_phy_set_rfpath_switch(struct ieee80211_hw *hw, bool bmain);
bool rtl92c_phy_config_rf_with_headerfile(struct ieee80211_hw *hw,
enum radio_path rfpath);
-bool rtl8192_phy_check_is_legal_rfpath(struct ieee80211_hw *hw,
- u32 rfpath);
+bool rtl8192_phy_check_is_legal_rfpath(struct ieee80211_hw *hw, u32 rfpath);
- bool rtl92c_phy_set_io_cmd(struct ieee80211_hw *hw, enum io_type iotype);
bool rtl92ce_phy_set_rf_power_state(struct ieee80211_hw *hw,
- enum rf_pwrstate rfpwr_state);
+ enum rf_pwrstate rfpwr_state);
void rtl92ce_phy_set_rf_on(struct ieee80211_hw *hw);
bool rtl92c_phy_set_io_cmd(struct ieee80211_hw *hw, enum io_type iotype);
void rtl92c_phy_set_io(struct ieee80211_hw *hw);
^ permalink raw reply
* linux-next: manual merge of the net-next tree with the wireless tree
From: Stephen Rothwell @ 2013-09-27 3:03 UTC (permalink / raw)
To: David Miller, netdev
Cc: linux-next, linux-kernel, Arend van Spriel, John W. Linville,
Joe Perches
[-- Attachment #1: Type: text/plain, Size: 2860 bytes --]
Hi all,
Today's linux-next merge of the net-next tree got a conflict in
drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h between commit
db4efbbeb457 ("brcmfmac: obtain platform data upon module
initialization") from the wireless tree and commit 9bd91f3c00bd
("brcm80211: Remove extern from function prototypes") from the net-next
tree.
I fixed it up (see below) and can carry the fix as necessary (no action
is required).
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
diff --cc drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
index 74156f8,5bc0276..0000000
--- a/drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
+++ b/drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
@@@ -132,35 -132,33 +132,34 @@@ struct pktq *brcmf_bus_gettxq(struct br
* interface functions from common layer
*/
- extern bool brcmf_c_prec_enq(struct device *dev, struct pktq *q,
- struct sk_buff *pkt, int prec);
+ bool brcmf_c_prec_enq(struct device *dev, struct pktq *q, struct sk_buff *pkt,
+ int prec);
/* Receive frame for delivery to OS. Callee disposes of rxp. */
- extern void brcmf_rx_frames(struct device *dev, struct sk_buff_head *rxlist);
+ void brcmf_rx_frames(struct device *dev, struct sk_buff_head *rxlist);
/* Indication from bus module regarding presence/insertion of dongle. */
- extern int brcmf_attach(uint bus_hdrlen, struct device *dev);
+ int brcmf_attach(uint bus_hdrlen, struct device *dev);
/* Indication from bus module regarding removal/absence of dongle */
- extern void brcmf_detach(struct device *dev);
+ void brcmf_detach(struct device *dev);
/* Indication from bus module that dongle should be reset */
- extern void brcmf_dev_reset(struct device *dev);
+ void brcmf_dev_reset(struct device *dev);
/* Indication from bus module to change flow-control state */
- extern void brcmf_txflowblock(struct device *dev, bool state);
+ void brcmf_txflowblock(struct device *dev, bool state);
/* Notify the bus has transferred the tx packet to firmware */
- extern void brcmf_txcomplete(struct device *dev, struct sk_buff *txp,
- bool success);
+ void brcmf_txcomplete(struct device *dev, struct sk_buff *txp, bool success);
- extern int brcmf_bus_start(struct device *dev);
+ int brcmf_bus_start(struct device *dev);
#ifdef CONFIG_BRCMFMAC_SDIO
- extern void brcmf_sdio_exit(void);
- extern void brcmf_sdio_init(void);
- extern void brcmf_sdio_register(void);
+ void brcmf_sdio_exit(void);
+ void brcmf_sdio_init(void);
++void brcmf_sdio_register(void);
#endif
#ifdef CONFIG_BRCMFMAC_USB
- extern void brcmf_usb_exit(void);
- extern void brcmf_usb_register(void);
+ void brcmf_usb_exit(void);
-void brcmf_usb_init(void);
++void brcmf_usb_register(void);
#endif
#endif /* _BRCMF_BUS_H_ */
* Re: linux-next: manual merge of the ipsec-next tree with the net-next tree
From: Stephen Rothwell @ 2013-09-27 2:01 UTC (permalink / raw)
To: Steffen Klassert
Cc: linux-next, linux-kernel, Fan Du, Joe Perches, David Miller,
netdev
In-Reply-To: <20130925052923.GU7660@secunet.com>
Hi Steffen,
On Wed, 25 Sep 2013 07:29:23 +0200 Steffen Klassert <steffen.klassert@secunet.com> wrote:
>
> On Wed, Sep 25, 2013 at 09:59:19AM +1000, Stephen Rothwell wrote:
> >
> > On Tue, 24 Sep 2013 12:25:05 +0200 Steffen Klassert <steffen.klassert@secunet.com> wrote:
> > >
> > > On Tue, Sep 24, 2013 at 12:16:29PM +1000, Stephen Rothwell wrote:
> > > >
> > > > Today's linux-next merge of the ipsec-next tree got a conflict in
> > > > include/net/xfrm.h between commit d511337a1eda ("xfrm.h: Remove extern
> > > > from function prototypes") from the net-next tree and commit aba826958830
> > > > ("{ipv4,xfrm}: Introduce xfrm_tunnel_notifier for xfrm tunnel mode
> > > > callback") from the ipsec-next tree.
> > >
> > > Thanks for the information, I'll do a rebase of the ipsec-next
> > > tree tomorrow.
> >
> > Did you miss the end of the next paragraph: "no action is required"?
> > Dave can fix this up (like I did) when he merges your tree into his.
>
> I applied this patch shortly before the merge window opened; it is a
> leftover from the last development cycle. I already rebased my tree onto
> net-next in the past if that happened, even if there were no merge
> conflicts. I did that just to see if everything still works. But I
> could also do a test merge to see if everything still works and ask
> to pull without a rebase then if this is the preferred way. Would make
> my life easier :)
That would be up to Dave ...
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
* Re: [PATCH 31/51] DMA-API: media: omap3isp: use dma_coerce_mask_and_coherent()
From: Laurent Pinchart @ 2013-09-27 1:56 UTC (permalink / raw)
To: Russell King
Cc: alsa-devel, b43-dev, devel, devicetree, dri-devel, e1000-devel,
linux-arm-kernel, linux-crypto, linux-doc, linux-fbdev, linux-ide,
linux-media, linux-mmc, linux-nvme, linux-omap, linuxppc-dev,
linux-samsung-soc, linux-scsi, linux-tegra, linux-usb, linux-wireless,
netdev, Solarflare linux maintainers, uclinux-dist-devel,
Mauro Carvalho Chehab
In-Reply-To: <E1VMmCg-0007j1-Pi-eh5Bv4kxaXIANfyc6IWni62ZND6+EDdj@public.gmane.org>
Hi Russell,
Thank you for the patch.
On Thursday 19 September 2013 22:56:02 Russell King wrote:
> The code sequence:
> isp->raw_dmamask = DMA_BIT_MASK(32);
> isp->dev->dma_mask = &isp->raw_dmamask;
> isp->dev->coherent_dma_mask = DMA_BIT_MASK(32);
> bypasses the architecture's check on the DMA mask. It can be replaced
> with dma_coerce_mask_and_coherent(), avoiding the direct initialization
> of this mask.
>
> Signed-off-by: Russell King <rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>
Acked-by: Laurent Pinchart <laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw@public.gmane.org>
> ---
> drivers/media/platform/omap3isp/isp.c | 6 +++---
> drivers/media/platform/omap3isp/isp.h | 3 ---
> 2 files changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/media/platform/omap3isp/isp.c b/drivers/media/platform/omap3isp/isp.c
> index df3a0ec..1c36080 100644
> --- a/drivers/media/platform/omap3isp/isp.c
> +++ b/drivers/media/platform/omap3isp/isp.c
> @@ -2182,9 +2182,9 @@ static int isp_probe(struct platform_device *pdev)
> isp->pdata = pdata;
> isp->ref_count = 0;
>
> - isp->raw_dmamask = DMA_BIT_MASK(32);
> - isp->dev->dma_mask = &isp->raw_dmamask;
> - isp->dev->coherent_dma_mask = DMA_BIT_MASK(32);
> + ret = dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
> + if (ret)
> + return ret;
>
> platform_set_drvdata(pdev, isp);
>
> diff --git a/drivers/media/platform/omap3isp/isp.h b/drivers/media/platform/omap3isp/isp.h
> index cd3eff4..ce65d3a 100644
> --- a/drivers/media/platform/omap3isp/isp.h
> +++ b/drivers/media/platform/omap3isp/isp.h
> @@ -152,7 +152,6 @@ struct isp_xclk {
> * @mmio_base_phys: Array with physical L4 bus addresses for ISP register
> * regions.
> * @mmio_size: Array with ISP register regions size in bytes.
> - * @raw_dmamask: Raw DMA mask
> * @stat_lock: Spinlock for handling statistics
> * @isp_mutex: Mutex for serializing requests to ISP.
> * @crashed: Bitmask of crashed entities (indexed by entity ID)
> @@ -190,8 +189,6 @@ struct isp_device {
> unsigned long mmio_base_phys[OMAP3_ISP_IOMEM_LAST];
> resource_size_t mmio_size[OMAP3_ISP_IOMEM_LAST];
>
> - u64 raw_dmamask;
> -
> /* ISP Obj */
> spinlock_t stat_lock; /* common lock for statistic drivers */
> struct mutex isp_mutex; /* For handling ref_count field */
--
Regards,
Laurent Pinchart
* [PATCH net-next] qdisc: basic classifier - remove unnecessary initialization
From: Stephen Hemminger @ 2013-09-27 0:42 UTC (permalink / raw)
To: Jamal Hadi Salim, David Miller; +Cc: netdev
err is initialized to -EINVAL, but the first statement that uses it
overwrites that value:
err = tcf_exts_validate(...)
so the initialization is unnecessary.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
--- a/net/sched/cls_basic.c 2013-08-10 10:36:11.657498301 -0700
+++ b/net/sched/cls_basic.c 2013-09-05 18:05:14.718200833 -0700
@@ -137,7 +137,7 @@ static int basic_set_parms(struct net *n
struct nlattr **tb,
struct nlattr *est)
{
- int err = -EINVAL;
+ int err;
struct tcf_exts e;
struct tcf_ematch_tree t;
* [PATCH net-next] qdisc: meta return ENOMEM on alloc failure
From: Stephen Hemminger @ 2013-09-27 0:40 UTC (permalink / raw)
To: David Miller, Thomas Graf; +Cc: netdev
Rather than returning the earlier value (EINVAL), return ENOMEM if
kzalloc fails. Found while reviewing the code to track down another
EINVAL condition.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
--- a/net/sched/em_meta.c 2013-08-10 10:36:11.657498301 -0700
+++ b/net/sched/em_meta.c 2013-09-05 16:47:43.915846185 -0700
@@ -793,8 +793,10 @@ static int em_meta_change(struct tcf_pro
goto errout;
meta = kzalloc(sizeof(*meta), GFP_KERNEL);
- if (meta == NULL)
+ if (meta == NULL) {
+ err = -ENOMEM;
goto errout;
+ }
memcpy(&meta->lvalue.hdr, &hdr->left, sizeof(hdr->left));
memcpy(&meta->rvalue.hdr, &hdr->right, sizeof(hdr->right));
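
The pattern fixed above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the em_meta code: meta_change(), struct meta_match, alloc_fn and failing_alloc() are hypothetical names invented for this illustration, and malloc() stands in for kzalloc(). It shows why err has to be reassigned before the goto when the function starts with err = -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the match state allocated in the patch. */
struct meta_match {
	int left;
	int right;
};

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

/* Allocator that always fails, used to simulate an out-of-memory case. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Returns 0 on success or a negative errno value on failure. */
static int meta_change(const int *cfg, size_t len, alloc_fn alloc,
		       struct meta_match **out)
{
	int err = -EINVAL;
	struct meta_match *meta;

	assert(out != NULL);

	if (cfg == NULL || len < 2)
		goto errout;		/* malformed input: -EINVAL is correct */

	meta = alloc(sizeof(*meta));
	if (meta == NULL) {
		err = -ENOMEM;		/* otherwise the stale -EINVAL leaks out */
		goto errout;
	}

	meta->left = cfg[0];
	meta->right = cfg[1];
	*out = meta;
	err = 0;
errout:
	return err;
}
```

Calling meta_change() with failing_alloc reports -ENOMEM, while a NULL configuration still reports -EINVAL, so a caller can tell the two failures apart.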
* [PATCH v2.40 7/7] datapath: Add basic MPLS support to kernel
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
In-Reply-To: <1380241116-7661-1-git-send-email-horms@verge.net.au>
Allow datapath to recognize and extract MPLS labels into flow keys
and execute actions which push, pop, and set labels on packets.
Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer.
Cc: Ravi K <rkerur@gmail.com>
Cc: Leo Alterman <lalterman@nicira.com>
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v2.40
* Rebase for:
+ New dev_queue_xmit compat code
+ Updated put_vlan()
* As suggested by Jesse Gross
+ Remove bogus mac_len update from push_mpls()
+ Slightly simplify push_mpls() by using eth_hdr()
+ Remove dubious condition !eth_p_mpls(inner_protocol) on
an skb being considered to be MPLS in netdev_send()
+ Only use compatibility code for MPLS GSO segmentation on kernels
older than 3.11
+ Revamp setting of inner_protocol
1. Do not unconditionally set inner_protocol to the value of
skb->protocol in ovs_execute_actions().
2. Initialise inner_protocol to zero only if compatibility code is in
use. In the case where compatibility code is not in use it will either
be zero since the allocation of the skb or hold some other value set
by some other user.
3. Conditionally set the inner_protocol in push_mpls() to the value of
skb->protocol when entering push_mpls(). The condition is that
inner_protocol is zero and the value of skb->protocol is not an MPLS
ethernet type.
- This new scheme:
+ Pushes logic to set inner_protocol closer to the case where it is
needed.
+ Avoids over-writing values set by other users.
* As suggested by Pravin Shelar
+ Only set and restore skb->protocol in rpl___skb_gso_segment() in the
case of MPLS
+ Add inner_protocol field to struct ovs_gso_cb instead of ovs_skb_cb.
This moves compatibility code closer to where it is used
and creates fewer differences with mainline.
* Update comment on mac_len updates in datapath/actions.c
* Remove HAVE_INNER_PROTOCOL and instead just check
against kernel version 3.11 directly.
HAVE_INNER_PROTOCOL is a hangover from work done prior
to the merge of inner_protocol into the kernel.
* Remove dubious condition !eth_p_mpls(inner_protocol) on
using inner_protocol as the type in rpl_skb_network_protocol()
* Do not update type of features in rpl_dev_queue_xmit.
Though arguably correct this is not an inherent part of
the changes made by this patch.
* Use skb_cow_head() in push_mpls()
+ Call skb_cow_head(skb, MPLS_HLEN) instead of
make_writable(skb, skb->mac_len) to ensure that there is enough head
room to push an MPLS LSE regardless of whether the skb is cloned or not.
+ This is consistent with the behaviour of rpl__vlan_put_tag().
+ This is a fix for crashes reported when performing mpls_push
with headroom less than 4. This problem was introduced in v3.36.
* Skip popping in mpls_pop if the skb is too short to contain an MPLS LSE
v2.39
* Rebase for removal of vlan, checksum and skb->mark compat code
v2.38
* Rebase for SCTP support
* Refactor validate_tp_port() to iterate over eth_types rather
than open-coding the loop. With the addition of SCTP this logic
is now used three times.
v2.36 - v2.37
* Rebase
* Do not add set_ethertype() to datapath/actions.c.
As this patch has evolved this function has devolved into
two sets of functionality wrapped into a single function with
only one line of common code. Refactor things to simply
open-code setting the ether type in the two locations where
set_ethertype() was previously used. The aim here is to improve
readability.
* Update setting skb->ethertype after mpls push and pop.
- In the case of push_mpls it should be set unconditionally,
as since v2.35 the behaviour of this function is to always push
an MPLS LSE before any VLAN tags.
- In the case of mpls_pop eth_p_mpls(skb->protocol) is a better
test than skb->protocol != htons(ETH_P_8021Q) as it will give the
correct behaviour in the presence of other VLAN ethernet types,
for example 0x88a8 which is used by 802.1ad. Moreover, it seems
correct to update the ethernet type if it was previously set
according to the top-most MPLS LSE.
* Deaccelerate VLANs when pushing MPLS tags
- Since v2.35 MPLS push will insert an MPLS LSE before any VLAN tags.
This means that if an accelerated tag is present it should be
deaccelerated to ensure it ends up in the correct position.
* Update skb->mac_len in push_mpls() so that it will be correct
when used by a subsequent call to pop_mpls().
As things stand I do not believe this is strictly necessary as
ovs-vswitchd will not send a pop MPLS action after a push MPLS action.
However, I have added this in order to code more defensively as I believe
that if such a sequence did occur it would be rather unobvious why
it didn't work.
* Do not add skb_cow_head() call in push_mpls().
It is unnecessary as there is a make_writable() call.
This change was also made in v2.30 but somehow the
code regressed between then and v2.35.
v2.35
* Rebase
* Move MPLS constants to mpls.h
* Push MPLS tags after ethernet, before VLAN tags
- This is consistent with the OpenFlow 1.3 specification
- Compatibility with OpenFlow 1.2 and earlier versions
may be provided by ovs-vswitchd.
* Correct GSO behaviour in the presence of MPLS but absence of VLANs
v2.34
* Rebase for megaflow changes
v2.33
* Ensure that inner_protocol is always set to the current
skb->protocol value in ovs_execute_actions(). This ensures
it is set to the correct value in the absence of a push_mpls action.
Also remove setting of inner_protocol in push_mpls() as
it duplicates the code now in ovs_execute_actions().
* Call __skb_gso_segment() instead of skb_gso_segment() from
rpl___skb_gso_segment() in the case that HAVE___SKB_GSO_SEGMENT is set.
This was a typo.
v2.32
* As suggested by Jesse Gross
- Use int instead of size_t in validate_and_copy_actions__().
- Fix crazy edit mess in pop_mpls() action comment
- Move eth_p_mpls() into mpls.h
- Refactor skb_gso_segment MPLS handling into rpl_skb_gso_segment
Address Jesse's comments regarding this code:
"Can we push this completely into the skb_gso_segment() compatibility
code? It's both nicer and may make the interactions with the vlan code
less confusing."
- Move GSO compatibility code into linux/compat/gso.*
- Set skb->protocol on mpls_push and mpls_pop in the presence
of an offloaded VLAN.
v2.31
* As suggested by Jesse Gross
- There is no need to make mac_header_end inline as it is not in a header file
- Remove dubious if (*skb_ethertype == ethertype) optimisation from
set_ethertype
- Only set skb->protocol in push_mpls() or pop_mpls() for non-VLAN packets
- Use MAX_ETH_TYPES instead of SAMPLE_ACTION_DEPTH for array size
of types in struct eth_types. This corrects a typo/thinko.
- Correct eth type tracking logic such that start isn't advanced
when entering a sample action, ensuring that all possible types
are checked when verifying nested actions.
* Define HAVE_INNER_PROTOCOL based on kernel version.
inner_protocol has been merged into net-next and should appear in
v3.11 so there is no longer a need for an acinclude.m4 test to check for it.
* Add MPLS GSO compatibility code.
This is for use on kernels that do not have MPLS GSO support.
Thanks to Joe Stringer for his work on this.
v2.30
* As suggested by Jesse Gross
- Use skb_cow_head in push_mpls to ensure there is sufficient headroom for
skb_push
- Call make_writable with skb->mac_len instead of skb->mac_len + MPLS_HLEN
in push_mpls as only the first skb->mac_len bytes of existing packet data
are modified.
- Rename skb_mac_header_end as mac_header_end, this seems
to be a more appropriate name for a local function.
- Remove OVS_CSUM_COMPLETE code from set_ethertype().
Inside OVS the ethernet header is not covered by OVS_CSUM_COMPLETE.
- Use __skb_pull() instead of skb_pull() in pop_mpls()
- Decrement and increment skb->mac_len when popping and pushing VLAN tags.
Previously mac_len was reset, but this would result in forgetting
the MPLS label stack.
- Remove spurious comment from before do_execute_actions().
- Move OVS_KEY_ATTR_MPLS attribute to its final, upstreamable, location.
- Correct ethertype check for OVS_ACTION_ATTR_POP_MPLS case in
validate_and_copy_actions() to check for MPLS ethertypes rather than
ETH_P_IP.
- Rewrite tracking of eth types used to verify actions in the presence
of sample actions. There is a large comment above struct eth_types
describing the new implementation.
v2.29
* Break include/ and lib/ portions of the patch out into a
separate patch "datapath: Add basic MPLS support to kernel"
* Update for new MPLS GSO scheme
- skb->protocol is set to the new ethertype of the packet
on MPLS push and pop
- When pushing the first MPLS LSE onto a previously non-MPLS
packet set skb->inner_protocol to the original ethertype.
- skb->inner_protocol may be used by the network stack
for GSO of the inner-packet.
* Drop const from ethertype parameter of set_ethertype.
This appears to be a legacy of this parameter being a pointer.
* Pass the ethertype parameter of pop_mpls as a value rather
than a pointer.
v2.28
* Kernel Datapath changes as suggested by Jarno Rajahalme
+ Correct the logic introduced in v2.27 to set the network_header
to after the MPLS label stack in the case of an MPLS packet.
- Increment stack_len offset so that label stacks of depth greater
than two do not cause an infinite loop.
- Correct offset passed to check_header to include skb->mac_len
v2.27
* Kernel Datapath changes as suggested by Jarno Rajahalme and Jesse Gross:
+ Previously the mac_len and network_header of an skb corresponded
to the end of the L2 header. To support GSO these were adjusted just
before transmission, in do_output, with the results as follows:
Input: non-MPLS skb: Output: network header and mac_len correspond
to the beginning of the L3 headers
Input: MPLS: Output: network header and mac_len correspond to the
end of the L2 headers.
This is somewhat confusing.
+ The new scheme is as follows:
- The mac_len always corresponds to the end of the L2 header.
- The network header always corresponds to the beginning of the
L3 header.
+ Note that in the case of MPLS output the end of the L2 headers and the
beginning of the L3 headers will differ.
* Remove unused declaration of skb_cb_mpls_stack()
v2.26
* Rebase on master
* Kernel Datapath changes as suggested by Jarno Rajahalme
- Use skb_network_header() instead of skb_mac_header() to locate
the ethertype to set in set_ethertype() as the latter will
be wrong in the presence of VLAN tags. This resolves
a regression introduced in v2.24.
- Enhance comment in do_output()
- do_execute_actions(): Do not alter mpls_stack_depth if
a MPLS push or pop action fail. This is achieved by altering
mpls_stack_depth at the end of push_mpls() and pop_mpls().
v2.25
* Rebase on master
* Pass big-endian value as the last argument of eth_types_set() in
validate_and_copy_actions__()
* Use revised GSO support as provided by the patch series
"[PATCH 0/2] Small Modifications to GSO to allow segmentation of MPLS"
- Set skb->mac_len to the length of the l2 header + MPLS stack length
- Update skb->network_header accordingly
- Set skb->encapsulated_features
v2.24
* Use skb_mac_header() in set_ethertype()
* Set skb->encapsulation in set_ethertype() to support MPLS GSO.
Also add a note about the other requirements for MPLS GSO.
MPLS GSO support will be posted as a patch to net-next (Linux mainline)
"MPLS: Add limited GSO support"
* Do not add ETH_TYPE_MIN, it is no longer used
v2.23
* As suggested by Jesse Gross:
- Verify the current ethernet type when validating sample actions
both for the taken and not-taken path of the sample action.
- Document that the OVS_KEY_ATTR_MPLS attribute accepts a list of
struct ovs_key_mpls but that an implementation may restrict
the length it accepts.
- Restrict the array length of the OVS_KEY_ATTR_MPLS to one.
+ Don't add ovs_flow_verify_key_len as it was added to
handle attributes whose values are arrays but there are
no attributes with values that are arrays (of length greater than one).
v2.22
* As suggested by Jesse Gross:
- Fix sparse warning in validate_and_copy_actions()
I have no idea why sparse doesn't show this up on my system.
- Remove call to skb_cow_head() from push_mpls() as it
is already covered by a call to make_writable()
- Check (key_type > OVS_KEY_ATTR_MAX) in ovs_flow_verify_key_len()
- Disallow set actions on l2.5+ data and MPLS push and pop actions
after an MPLS pop action as there is no verification that the packet
is actually of the new ethernet type. This may later be supported
using recirculation or by other means.
- Do not add spurious debugging message to ovs_flow_cmd_new_or_set()
v2.21
* As suggested by Jesse Gross:
- Verify that l3 and l4 actions always occur prior to
a push_mpls action and use the network header pointer of an skb
to track the top of the MPLS stack. This avoids adding an l2_size
element to the skb callback.
v2.20
* As suggested by Jesse Gross:
- Do not add ovs_dp_ioctl_hook
+ This appears to be garbage from a rebase
- Do not add skb_cb_set_l2_size. Instead set OVS_CB(skb)->l2_size
in ovs_flow_extract().
- Do not free skb on error in push_mpls(), it is freed in the caller
- Call skb_reset_mac_len() in pop_mpls() and push_mpls()
- Update checksums in pop_mpls(), push_mpls() and set_mpls().
- Rename skb_cb_mpls_bos() as skb_cb_mpls_stack().
It returns the top not the bottom of the stack.
- Track the current eth_type in validate_and_copy_actions
which is initially the eth_type of the flow and may be modified
by push_mpls and pop_mpls actions. Use this to correctly validate
mpls_set actions. This is to allow mpls_set actions to be applied
to a non-MPLS frame after an mpls_push action (although ovs-vswitchd
doesn't currently do that).
Also:
+ Remove the check of the eth_type in set_mpls() as the new validation
scheme should ensure it cannot be incorrect.
+ Use the current eth_type to validate mpls_pop actions and remove
the eth_type check from pop_mpls().
- Move OVS_KEY_ATTR_MPLS to non-upstream group in ovs_key_lens
- Remove unnecessary memset of mpls_key in ovs_flow_to_nlattrs()
- Make a union of the mpls and ip elements of struct sw_flow_key.
Currently the code stops parsing after an MPLS header so it is
not possible for the ip and mpls elements to be used simultaneously
and some space can be saved by using a union.
- Allow an array of MPLS key attributes
+ Currently all but the first element is ignored
+ User-space needs to be updated to accept more than one element,
currently it will treat their presence as an error
- Do not update network header in ovs_flow_extract() after parsing
the MPLS stack as it is never used because no l3+ processing
occurs on MPLS frames.
- Allow multiple MPLS entries in a match by allowing the OVS_KEY_ATTR_MPLS
to be an array of struct ovs_key_mpls with at least one entry.
Currently only one entry is used which is byte-for-byte compatible with
the previous scheme of having OVS_KEY_ATTR_MPLS as a struct
ovs_key_mpls.
* Make skb writable in pop_mpls(), push_mpls() and set_mpls().
v2.18 - v2.19
* No change
v2.17
* As suggested by Ben Pfaff
- Use consistent terminology for MPLS.
+ Consistently refer to the MPLS component of a packet as the
MPLS label stack and entries in the stack as MPLS label stack entries
(LSE). An MPLS label is a component of an MPLS label stack entry.
The other components are the traffic class (TC), time to live (TTL)
and bottom of stack (BoS) bit.
- Rename compose_.*mpls_ functions as execute_.*mpls_
v2.16
* No change
v2.15
* As suggested by Ben Pfaff
- Use OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS instead of
OVS_ACTION_ATTR_SET_MPLS
v2.14
* Remove include/linux/openvswitch.h portion which added
new key and action attributes. This is
now present in "User-Space MPLS actions and matches"
which is now a dependency of this patch
v2.13
* As suggested by Jarno Rajahalme
- Rename mpls_bos element of ovs_skb_cb as l2_size as it is set and used
regardless of if an MPLS stack is present or not. Update the name of
helper functions and documentation accordingly.
- Ensure that skb_cb_mpls_bos() never returns NULL
* Correct endianness in eth_p_mpls()
v2.12
* Update skb and network header on MPLS extraction in ovs_flow_extract()
* Use NULL in skb_cb_mpls_bos()
* Add eth_p_mpls helper
v2.10 - v2.11
* No change
v2.9
* datapath: Always update the mpls bos if vlan_pop is successful
Regardless of the details of how a successful
vlan_pop is achieved, the mpls bos needs to be updated.
Without this fix it has been observed that the following
results in malformed packets
v2.8
* No change
v2.7
* Rebase
v2.6
* As suggested by Yamahata-san
- Do not guard against label == 0 for
OVS_ACTION_ATTR_SET_MPLS in validate_actions().
A label of 0 is valid
- Remove comment stipulating that if
the top_label element of struct sw_flow_key is 0 then
there is no MPLS label. An MPLS label of 0 is valid
and the correct check is whether the ethertype is
ntohs(ETH_TYPE_MPLS) or ntohs(ETH_TYPE_MPLS_MCAST)
v2.4 - v2.5
* No change
v2.3
* s/mpls_stack/mpls_bos/
This is in keeping with the naming used in the OpenFlow 1.3 specification
v2.2
* Call skb_reset_mac_header() in skb_cb_set_mpls_stack()
eth_hdr(skb) is non-NULL when called in skb_cb_set_mpls_stack().
* Add a call to skb_cb_set_mpls_stack() in ovs_packet_cmd_execute().
I apologise that I have mislaid my notes on this but
it avoids a kernel panic. I can investigate again if necessary.
* Use struct ovs_action_push_mpls instead of
__be16 to decode OVS_ACTION_ATTR_PUSH_MPLS in validate_actions(). This is
consistent with the data format for the attribute.
* Indentation fix in skb_cb_mpls_stack(). [cosmetic]
v2.1
* Manual rebase
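
For readers following the terminology settled on in v2.17 (label, TC, BoS and TTL as components of a label stack entry), the layout can be sketched in plain C. This is an illustration only and not code from this patch: the lse_*() helpers, the LSE_* macros and ethertype_is_mpls() are hypothetical names, the field positions follow RFC 3032, and values are in host byte order, whereas the datapath keeps the LSE as a big-endian __be32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout of one 32-bit label stack entry (RFC 3032), host byte order. */
#define LSE_LABEL_SHIFT	12
#define LSE_LABEL_MASK	0xFFFFF000u	/* 20-bit label */
#define LSE_TC_SHIFT	9
#define LSE_TC_MASK	0x00000E00u	/* 3-bit traffic class */
#define LSE_BOS_MASK	0x00000100u	/* bottom of stack bit */
#define LSE_TTL_MASK	0x000000FFu	/* 8-bit time to live */

/* Build an LSE from its four components. */
static uint32_t lse_make(uint32_t label, uint8_t tc, bool bos, uint8_t ttl)
{
	assert(label <= 0xFFFFFu);
	assert(tc <= 7u);
	return (label << LSE_LABEL_SHIFT) |
	       ((uint32_t)tc << LSE_TC_SHIFT) |
	       (bos ? LSE_BOS_MASK : 0u) |
	       (uint32_t)ttl;
}

static uint32_t lse_label(uint32_t lse)
{
	return (lse & LSE_LABEL_MASK) >> LSE_LABEL_SHIFT;
}

static uint8_t lse_tc(uint32_t lse)
{
	return (uint8_t)((lse & LSE_TC_MASK) >> LSE_TC_SHIFT);
}

static bool lse_bos(uint32_t lse)
{
	return (lse & LSE_BOS_MASK) != 0;
}

static uint8_t lse_ttl(uint32_t lse)
{
	return (uint8_t)(lse & LSE_TTL_MASK);
}

/* A label of 0 is valid, so the presence of MPLS has to be decided from
 * the ethertype (unicast 0x8847 or multicast 0x8848), not from the label.
 */
static bool ethertype_is_mpls(uint16_t ethertype)
{
	return ethertype == 0x8847u || ethertype == 0x8848u;
}
```

The ethertype test also reflects the v2.6 and v2.37 notes: a zero label cannot signal "no MPLS", and VLAN ethertypes such as 0x88a8 are simply not MPLS ethertypes.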
---
datapath/Modules.mk | 1 +
datapath/actions.c | 129 +++++++++++-
datapath/datapath.c | 259 +++++++++++++++++++++---
datapath/datapath.h | 2 +
datapath/flow.c | 58 +++++-
datapath/flow.h | 17 +-
datapath/linux/compat/gso.c | 117 +++++++++--
datapath/linux/compat/gso.h | 53 +++++
datapath/linux/compat/include/linux/netdevice.h | 14 +-
datapath/linux/compat/netdevice.c | 28 ---
datapath/mpls.h | 15 ++
include/linux/openvswitch.h | 7 +-
12 files changed, 609 insertions(+), 91 deletions(-)
create mode 100644 datapath/mpls.h
diff --git a/datapath/Modules.mk b/datapath/Modules.mk
index 7ddf79c..b54dc5b 100644
--- a/datapath/Modules.mk
+++ b/datapath/Modules.mk
@@ -22,6 +22,7 @@ openvswitch_headers = \
compat.h \
datapath.h \
flow.h \
+ mpls.h \
vlan.h \
vport.h \
vport-internal_dev.h \
diff --git a/datapath/actions.c b/datapath/actions.c
index d961e5d..bfab9ec 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -35,6 +35,8 @@
#include <net/sctp/checksum.h>
#include "datapath.h"
+#include "gso.h"
+#include "mpls.h"
#include "vlan.h"
#include "vport.h"
@@ -71,7 +73,8 @@ static int __pop_vlan_tci(struct sk_buff *skb, __be16 *current_tci)
vlan_set_encap_proto(skb, vhdr);
skb->mac_header += VLAN_HLEN;
- skb_reset_mac_len(skb);
+ /* Update mac_len for subsequent MPLS actions */
+ skb->mac_len -= VLAN_HLEN;
return 0;
}
@@ -113,6 +116,9 @@ static int put_vlan(struct sk_buff *skb)
if (!__vlan_put_tag(skb, skb->vlan_proto, current_tag))
return -ENOMEM;
+ /* update mac_len for subsequent MPLS actions */
+ skb->mac_len += VLAN_HLEN;
+
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->csum = csum_add(skb->csum, csum_partial(skb->data
+ (2 * ETH_ALEN), VLAN_HLEN, 0));
@@ -134,6 +140,114 @@ static int push_vlan(struct sk_buff *skb, const struct ovs_action_push_vlan *vla
return 0;
}
+/* The end of the mac header.
+ *
+ * For non-MPLS skbs this will correspond to the network header.
+ * For MPLS skbs it will be before the network_header as the MPLS
+ * label stack lies between the end of the mac header and the network
+ * header. That is, for MPLS skbs the end of the mac header
+ * is the top of the MPLS label stack.
+ */
+static unsigned char *mac_header_end(const struct sk_buff *skb)
+{
+ return skb_mac_header(skb) + skb->mac_len;
+}
+
+/* Push MPLS after the ethernet header. */
+static int push_mpls(struct sk_buff *skb,
+ const struct ovs_action_push_mpls *mpls)
+{
+ __be32 *new_mpls_lse;
+ struct ethhdr *hdr;
+
+ if (unlikely(vlan_tx_tag_present(skb))) {
+ int err;
+
+ err = put_vlan(skb);
+ if (unlikely(err))
+ return err;
+
+ vlan_set_tci(skb, 0);
+ }
+
+ if (skb_cow_head(skb, MPLS_HLEN) < 0) {
+ kfree_skb(skb);
+ return -ENOMEM;
+ }
+ skb_push(skb, MPLS_HLEN);
+
+ memmove(skb_mac_header(skb) - MPLS_HLEN, skb_mac_header(skb),
+ ETH_HLEN);
+ skb_reset_mac_header(skb);
+
+ new_mpls_lse = (__be32 *)(skb_mac_header(skb) + ETH_HLEN);
+ *new_mpls_lse = mpls->mpls_lse;
+
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_add(skb->csum, csum_partial(new_mpls_lse,
+ MPLS_HLEN, 0));
+
+ hdr = eth_hdr(skb);
+ hdr->h_proto = mpls->mpls_ethertype;
+ if (!eth_p_mpls(skb->protocol) && !ovs_skb_get_inner_protocol(skb))
+ ovs_skb_set_inner_protocol(skb, skb->protocol);
+ skb->protocol = mpls->mpls_ethertype;
+ return 0;
+}
+
+static int pop_mpls(struct sk_buff *skb, const __be16 ethertype)
+{
+ struct ethhdr *hdr;
+ int err;
+
+ err = make_writable(skb, skb->mac_len + MPLS_HLEN);
+ if (unlikely(err))
+ return err;
+
+ if (unlikely(skb->len < skb->mac_len + MPLS_HLEN))
+ return -ENOMEM;
+
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_sub(skb->csum,
+ csum_partial(mac_header_end(skb),
+ MPLS_HLEN, 0));
+
+ memmove(skb_mac_header(skb) + MPLS_HLEN, skb_mac_header(skb),
+ skb->mac_len);
+
+ __skb_pull(skb, MPLS_HLEN);
+ skb_reset_mac_header(skb);
+
+ /* mac_header_end() is used to locate the ethertype
+ * field correctly in the presence of VLAN tags.
+ */
+ hdr = (struct ethhdr *)(mac_header_end(skb) - ETH_HLEN);
+ hdr->h_proto = ethertype;
+ if (eth_p_mpls(skb->protocol))
+ skb->protocol = ethertype;
+ return 0;
+}
+
+static int set_mpls(struct sk_buff *skb, const __be32 *mpls_lse)
+{
+ __be32 *stack = (__be32 *)mac_header_end(skb);
+ int err;
+
+ err = make_writable(skb, skb->mac_len + MPLS_HLEN);
+ if (unlikely(err))
+ return err;
+
+ if (skb->ip_summed == CHECKSUM_COMPLETE) {
+ __be32 diff[] = { ~(*stack), *mpls_lse };
+ skb->csum = ~csum_partial((char *)diff, sizeof(diff),
+ ~skb->csum);
+ }
+
+ *stack = *mpls_lse;
+
+ return 0;
+}
+
static int set_eth_addr(struct sk_buff *skb,
const struct ovs_key_ethernet *eth_key)
{
@@ -509,6 +623,9 @@ static int execute_set_action(struct sk_buff *skb,
case OVS_KEY_ATTR_SCTP:
err = set_sctp(skb, nla_data(nested_attr));
+
+ case OVS_KEY_ATTR_MPLS:
+ err = set_mpls(skb, nla_data(nested_attr));
break;
}
@@ -545,6 +662,14 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
output_userspace(dp, skb, a);
break;
+ case OVS_ACTION_ATTR_PUSH_MPLS:
+ err = push_mpls(skb, nla_data(a));
+ break;
+
+ case OVS_ACTION_ATTR_POP_MPLS:
+ err = pop_mpls(skb, nla_get_be16(a));
+ break;
+
case OVS_ACTION_ATTR_PUSH_VLAN:
err = push_vlan(skb, nla_data(a));
if (unlikely(err)) /* skb already freed. */
@@ -618,6 +743,8 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb)
goto out_loop;
}
+ ovs_skb_init_inner_protocol(skb);
+
OVS_CB(skb)->tun_key = NULL;
error = do_execute_actions(dp, skb, acts->actions,
acts->actions_len, false);
diff --git a/datapath/datapath.c b/datapath/datapath.c
index 4defcdb..5a62201 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -56,6 +56,8 @@
#include "datapath.h"
#include "flow.h"
+#include "gso.h"
+#include "mpls.h"
#include "vlan.h"
#include "vport-internal_dev.h"
#include "vport-netdev.h"
@@ -543,18 +545,132 @@ static inline void add_nested_action_end(struct sw_flow_actions *sfa, int st_off
a->nla_len = sfa->actions_len - st_offset;
}
-static int validate_and_copy_actions(const struct nlattr *attr,
+#define MAX_ETH_TYPES 16 /* Arbitrary Limit */
+
+/* struct eth_types - possible eth types
+ * @types: provides storage for the possible eth types.
+ * @start: is the index of the first entry of types which is possible.
+ * @end: is the index of the last entry of types which is possible.
+ * @cursor: is the index of the entry which should be updated if an action
+ * changes the eth type.
+ *
+ * Due to the sample action there may be multiple possible eth types.
+ * In order to correctly validate actions all possible types are tracked
+ * and verified. This is done using struct eth_types.
+ *
+ * Initially start, end and cursor should be 0, and the first element of
+ * types should be set to the eth type of the flow.
+ *
+ * When an action changes the eth type then the values of start and end are
+ * updated to the value of cursor. The new type is stored at types[cursor].
+ *
+ * When entering a sample action the start and cursor values are saved. The
+ * value of cursor is set to the value of end plus one.
+ *
+ * When leaving a sample action the start and cursor values are restored to
+ * their saved values.
+ *
+ * An example follows.
+ *
+ * actions: pop_mpls(A),sample(pop_mpls(B)),sample(pop_mpls(C)),pop_mpls(D)
+ *
+ * 0. Initial state:
+ * types = { original_eth_type }
+ * start = end = cursor = 0;
+ *
+ * 1. pop_mpls(A)
+ * a. Check types from start (0) to end (0) inclusive
+ * i.e. Check against original_eth_type
+ * b. Set start = end = cursor
+ * c. Set types[cursor] = A
+ * New state:
+ * types = { A }
+ * start = end = cursor = 0;
+ *
+ * 2. Enter first sample()
+ * a. Save start and cursor
+ * b. Set cursor = end + 1
+ * New state:
+ * types = { A }
+ * start = end = 0;
+ * cursor = 1;
+ *
+ * 3. pop_mpls(B)
+ * a. Check types from start (0) to end (0)
+ * i.e: Check against A
+ * b. Set start = end = cursor
+ * c. Set types[cursor] = B
+ * New state:
+ * types = { A, B }
+ * start = end = cursor = 1;
+ *
+ * 4. Leave first sample()
+ * a. Restore start and cursor to the values when entering 2.
+ * New state:
+ * types = { A, B }
+ * start = cursor = 0;
+ * end = 1;
+ *
+ * 5. Enter second sample()
+ * a. Save start and cursor
+ * b. Set cursor = end + 1
+ * New state:
+ * types = { A, B }
+ * start = 0;
+ * end = 1;
+ * cursor = 2;
+ *
+ * 6. pop_mpls(C)
+ * a. Check types from start (0) to end (1) inclusive
+ * i.e: Check against A and B
+ * b. Set start = end = cursor
+ * c. Set types[cursor] = C
+ * New state:
+ * types = { A, B, C }
+ * start = end = cursor = 2;
+ *
+ * 7. Leave second sample()
+ * a. Restore start and cursor to the values when entering 5.
+ * New state:
+ * types = { A, B, C }
+ * start = cursor = 0;
+ * end = 2;
+ *
+ * 8. pop_mpls(D)
+ * a. Check types from start (0) to end (2) inclusive
+ * i.e: Check against A, B and C
+ * b. Set start = end = cursor
+ * c. Set types[cursor] = D
+ * New state:
+ * types = { D } // Trailing entries of types are no longer used since end = 0
+ * start = end = cursor = 0;
+ */
+struct eth_types {
+ int start, end, cursor;
+ __be16 types[MAX_ETH_TYPES];
+};
+
+static void eth_types_set(struct eth_types *types, __be16 type)
+{
+ types->start = types->end = types->cursor;
+ types->types[types->cursor] = type;
+}
+
+static int validate_and_copy_actions__(const struct nlattr *attr,
const struct sw_flow_key *key, int depth,
- struct sw_flow_actions **sfa);
+ struct sw_flow_actions **sfa,
+ struct eth_types *eth_types);
static int validate_and_copy_sample(const struct nlattr *attr,
const struct sw_flow_key *key, int depth,
- struct sw_flow_actions **sfa)
+ struct sw_flow_actions **sfa,
+ struct eth_types *eth_types)
{
const struct nlattr *attrs[OVS_SAMPLE_ATTR_MAX + 1];
const struct nlattr *probability, *actions;
const struct nlattr *a;
int rem, start, err, st_acts;
+ int saved_eth_types_start, saved_eth_types_cursor;
memset(attrs, 0, sizeof(attrs));
nla_for_each_nested(a, attr, rem) {
@@ -585,22 +701,39 @@ static int validate_and_copy_sample(const struct nlattr *attr,
if (st_acts < 0)
return st_acts;
- err = validate_and_copy_actions(actions, key, depth + 1, sfa);
+ /* Save and update eth_types cursor and start. Please see the
+ * comment for struct eth_types for a discussion of this.
+ */
+ saved_eth_types_start = eth_types->start;
+ saved_eth_types_cursor = eth_types->cursor;
+ eth_types->cursor = eth_types->end + 1;
+ if (eth_types->cursor == MAX_ETH_TYPES)
+ return -EINVAL;
+
+ err = validate_and_copy_actions__(actions, key, depth + 1, sfa,
+ eth_types);
if (err)
return err;
+ /* Restore eth_types cursor and start. Please see the
+ * comment for struct eth_types for a discussion of this.
+ */
+ eth_types->cursor = saved_eth_types_cursor;
+ eth_types->start = saved_eth_types_start;
+
add_nested_action_end(*sfa, st_acts);
add_nested_action_end(*sfa, start);
return 0;
}
-static int validate_tp_port(const struct sw_flow_key *flow_key)
+static int validate_tp_port__(const struct sw_flow_key *flow_key,
+ __be16 eth_type)
{
- if (flow_key->eth.type == htons(ETH_P_IP)) {
+ if (eth_type == htons(ETH_P_IP)) {
if (flow_key->ipv4.tp.src || flow_key->ipv4.tp.dst)
return 0;
- } else if (flow_key->eth.type == htons(ETH_P_IPV6)) {
+ } else if (eth_type == htons(ETH_P_IPV6)) {
if (flow_key->ipv6.tp.src || flow_key->ipv6.tp.dst)
return 0;
}
@@ -608,6 +741,21 @@ static int validate_tp_port(const struct sw_flow_key *flow_key)
return -EINVAL;
}
+static int validate_tp_port(const struct sw_flow_key *flow_key,
+ const struct eth_types *eth_types)
+{
+ int i;
+
+ for (i = eth_types->start; i <= eth_types->end; i++) {
+ int ret = validate_tp_port__(flow_key, eth_types->types[i]);
+
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static int validate_and_copy_set_tun(const struct nlattr *attr,
struct sw_flow_actions **sfa)
{
@@ -634,7 +782,7 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
static int validate_set(const struct nlattr *a,
const struct sw_flow_key *flow_key,
struct sw_flow_actions **sfa,
- bool *set_tun)
+ bool *set_tun, struct eth_types *eth_types)
{
const struct nlattr *ovs_key = nla_data(a);
int key_type = nla_type(ovs_key);
@@ -665,9 +813,12 @@ static int validate_set(const struct nlattr *a,
return err;
break;
- case OVS_KEY_ATTR_IPV4:
- if (flow_key->eth.type != htons(ETH_P_IP))
- return -EINVAL;
+ case OVS_KEY_ATTR_IPV4: {
+ int i;
+
+ for (i = eth_types->start; i <= eth_types->end; i++)
+ if (eth_types->types[i] != htons(ETH_P_IP))
+ return -EINVAL;
if (!flow_key->ip.proto)
return -EINVAL;
@@ -680,10 +831,14 @@ static int validate_set(const struct nlattr *a,
return -EINVAL;
break;
+ }
- case OVS_KEY_ATTR_IPV6:
- if (flow_key->eth.type != htons(ETH_P_IPV6))
- return -EINVAL;
+ case OVS_KEY_ATTR_IPV6: {
+ int i;
+
+ for (i = eth_types->start; i <= eth_types->end; i++)
+ if (eth_types->types[i] != htons(ETH_P_IPV6))
+ return -EINVAL;
if (!flow_key->ip.proto)
return -EINVAL;
@@ -699,24 +854,34 @@ static int validate_set(const struct nlattr *a,
return -EINVAL;
break;
+ }
case OVS_KEY_ATTR_TCP:
if (flow_key->ip.proto != IPPROTO_TCP)
return -EINVAL;
- return validate_tp_port(flow_key);
+ return validate_tp_port(flow_key, eth_types);
case OVS_KEY_ATTR_UDP:
if (flow_key->ip.proto != IPPROTO_UDP)
return -EINVAL;
- return validate_tp_port(flow_key);
+ return validate_tp_port(flow_key, eth_types);
+
+ case OVS_KEY_ATTR_MPLS: {
+ int i;
+
+ for (i = eth_types->start; i <= eth_types->end; i++)
+ if (!eth_p_mpls(eth_types->types[i]))
+ return -EINVAL;
+ break;
+ }
case OVS_KEY_ATTR_SCTP:
if (flow_key->ip.proto != IPPROTO_SCTP)
return -EINVAL;
- return validate_tp_port(flow_key);
+ return validate_tp_port(flow_key, eth_types);
default:
return -EINVAL;
@@ -760,10 +925,10 @@ static int copy_action(const struct nlattr *from,
return 0;
}
-static int validate_and_copy_actions(const struct nlattr *attr,
- const struct sw_flow_key *key,
- int depth,
- struct sw_flow_actions **sfa)
+static int validate_and_copy_actions__(const struct nlattr *attr,
+ const struct sw_flow_key *key, int depth,
+ struct sw_flow_actions **sfa,
+ struct eth_types *eth_types)
{
const struct nlattr *a;
int rem, err;
@@ -776,6 +941,8 @@ static int validate_and_copy_actions(const struct nlattr *attr,
static const u32 action_lens[OVS_ACTION_ATTR_MAX + 1] = {
[OVS_ACTION_ATTR_OUTPUT] = sizeof(u32),
[OVS_ACTION_ATTR_USERSPACE] = (u32)-1,
+ [OVS_ACTION_ATTR_PUSH_MPLS] = sizeof(struct ovs_action_push_mpls),
+ [OVS_ACTION_ATTR_POP_MPLS] = sizeof(__be16),
[OVS_ACTION_ATTR_PUSH_VLAN] = sizeof(struct ovs_action_push_vlan),
[OVS_ACTION_ATTR_POP_VLAN] = 0,
[OVS_ACTION_ATTR_SET] = (u32)-1,
@@ -806,6 +973,33 @@ static int validate_and_copy_actions(const struct nlattr *attr,
return -EINVAL;
break;
+ case OVS_ACTION_ATTR_PUSH_MPLS: {
+ const struct ovs_action_push_mpls *mpls = nla_data(a);
+ if (!eth_p_mpls(mpls->mpls_ethertype))
+ return -EINVAL;
+ eth_types_set(eth_types, mpls->mpls_ethertype);
+ break;
+ }
+
+ case OVS_ACTION_ATTR_POP_MPLS: {
+ int i;
+
+ for (i = eth_types->start; i <= eth_types->end; i++)
+ if (!eth_p_mpls(eth_types->types[i]))
+ return -EINVAL;
+
+ /* Disallow subsequent L2.5+ set and mpls_pop actions
+ * as there is no check here to ensure that the new
+ * eth_type is valid and thus set actions could
+ * write off the end of the packet or otherwise
+ * corrupt it.
+ *
+ * Support for these actions is planned using packet
+ * recirculation.
+ */
+ eth_types_set(eth_types, htons(0));
+ break;
+ }
case OVS_ACTION_ATTR_POP_VLAN:
break;
@@ -819,13 +1013,14 @@ static int validate_and_copy_actions(const struct nlattr *attr,
break;
case OVS_ACTION_ATTR_SET:
- err = validate_set(a, key, sfa, &skip_copy);
+ err = validate_set(a, key, sfa, &skip_copy, eth_types);
if (err)
return err;
break;
case OVS_ACTION_ATTR_SAMPLE:
- err = validate_and_copy_sample(a, key, depth, sfa);
+ err = validate_and_copy_sample(a, key, depth, sfa,
+ eth_types);
if (err)
return err;
skip_copy = true;
@@ -847,6 +1042,20 @@ static int validate_and_copy_actions(const struct nlattr *attr,
return 0;
}
+static int validate_and_copy_actions(const struct nlattr *attr,
+ const struct sw_flow_key *key,
+ struct sw_flow_actions **sfa)
+{
+ struct eth_types eth_type = {
+ .start = 0,
+ .end = 0,
+ .cursor = 0,
+ .types = { key->eth.type, },
+ };
+
+ return validate_and_copy_actions__(attr, key, 0, sfa, &eth_type);
+}
+
static void clear_stats(struct sw_flow *flow)
{
flow->used = 0;
@@ -910,7 +1119,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
if (IS_ERR(acts))
goto err_flow_free;
- err = validate_and_copy_actions(a[OVS_PACKET_ATTR_ACTIONS], &flow->key, 0, &acts);
+ err = validate_and_copy_actions(a[OVS_PACKET_ATTR_ACTIONS], &flow->key, &acts);
rcu_assign_pointer(flow->sf_acts, acts);
if (err)
goto err_flow_free;
@@ -1268,7 +1477,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info)
ovs_flow_key_mask(&masked_key, &key, &mask);
error = validate_and_copy_actions(a[OVS_FLOW_ATTR_ACTIONS],
- &masked_key, 0, &acts);
+ &masked_key, &acts);
if (error) {
OVS_NLERR("Flow actions may not be safe on all matching packets.\n");
goto err_kfree;
diff --git a/datapath/datapath.h b/datapath/datapath.h
index 4a49a7d..31fe10a 100644
--- a/datapath/datapath.h
+++ b/datapath/datapath.h
@@ -95,6 +95,8 @@ struct datapath {
* @pkt_key: The flow information extracted from the packet. Must be nonnull.
* @tun_key: Key for the tunnel that encapsulated this packet. NULL if the
* packet is not being tunneled.
+ * @inner_protocol: Provides a substitute for the skb->inner_protocol field on
+ * kernels before 3.11.
*/
struct ovs_skb_cb {
struct sw_flow *flow;
diff --git a/datapath/flow.c b/datapath/flow.c
index 29122af..51e7965 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -44,6 +44,7 @@
#include <net/ipv6.h>
#include <net/ndisc.h>
+#include "mpls.h"
#include "vlan.h"
static struct kmem_cache *flow_cache;
@@ -140,7 +141,8 @@ static bool ovs_match_validate(const struct sw_flow_match *match,
| (1ULL << OVS_KEY_ATTR_ICMP)
| (1ULL << OVS_KEY_ATTR_ICMPV6)
| (1ULL << OVS_KEY_ATTR_ARP)
- | (1ULL << OVS_KEY_ATTR_ND));
+ | (1ULL << OVS_KEY_ATTR_ND)
+ | (1ULL << OVS_KEY_ATTR_MPLS));
/* Always allowed mask fields. */
mask_allowed |= ((1ULL << OVS_KEY_ATTR_TUNNEL)
@@ -155,6 +157,12 @@ static bool ovs_match_validate(const struct sw_flow_match *match,
mask_allowed |= 1ULL << OVS_KEY_ATTR_ARP;
}
+ if (eth_p_mpls(match->key->eth.type)) {
+ key_expected |= 1ULL << OVS_KEY_ATTR_MPLS;
+ if (match->mask && (match->mask->key.eth.type == htons(0xffff)))
+ mask_allowed |= 1ULL << OVS_KEY_ATTR_MPLS;
+ }
+
if (match->key->eth.type == htons(ETH_P_IP)) {
key_expected |= 1ULL << OVS_KEY_ATTR_IPV4;
if (match->mask && (match->mask->key.eth.type == htons(0xffff)))
@@ -879,6 +887,7 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key)
return -ENOMEM;
skb_reset_network_header(skb);
+ skb_reset_mac_len(skb);
__skb_push(skb, skb->data - skb_mac_header(skb));
/* Network layer. */
@@ -961,6 +970,33 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key)
memcpy(key->ipv4.arp.sha, arp->ar_sha, ETH_ALEN);
memcpy(key->ipv4.arp.tha, arp->ar_tha, ETH_ALEN);
}
+ } else if (eth_p_mpls(key->eth.type)) {
+ size_t stack_len = MPLS_HLEN;
+
+ /* In the presence of an MPLS label stack the end of the L2
+ * header and the beginning of the L3 header differ.
+ *
+ * Advance network_header to the beginning of the L3
+ * header. mac_len corresponds to the end of the L2 header.
+ */
+ while (1) {
+ __be32 lse;
+
+ error = check_header(skb, skb->mac_len + stack_len);
+ if (unlikely(error))
+ return 0;
+
+ memcpy(&lse, skb_network_header(skb), MPLS_HLEN);
+
+ if (stack_len == MPLS_HLEN)
+ memcpy(&key->mpls.top_lse, &lse, MPLS_HLEN);
+
+ skb_set_network_header(skb, skb->mac_len + stack_len);
+ if (lse & htonl(MPLS_BOS_MASK))
+ break;
+
+ stack_len += MPLS_HLEN;
+ }
} else if (key->eth.type == htons(ETH_P_IPV6)) {
int nh_len; /* IPv6 Header + Extensions */
@@ -1154,6 +1190,7 @@ const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
[OVS_KEY_ATTR_ARP] = sizeof(struct ovs_key_arp),
[OVS_KEY_ATTR_ND] = sizeof(struct ovs_key_nd),
[OVS_KEY_ATTR_TUNNEL] = -1,
+ [OVS_KEY_ATTR_MPLS] = sizeof(struct ovs_key_mpls),
};
static bool is_all_zero(const u8 *fp, size_t size)
@@ -1528,6 +1565,17 @@ static int ovs_key_from_nlattrs(struct sw_flow_match *match, u64 attrs,
attrs &= ~(1ULL << OVS_KEY_ATTR_ARP);
}
+
+ if (attrs & (1ULL << OVS_KEY_ATTR_MPLS)) {
+ const struct ovs_key_mpls *mpls_key;
+
+ mpls_key = nla_data(a[OVS_KEY_ATTR_MPLS]);
+ SW_FLOW_KEY_PUT(match, mpls.top_lse,
+ mpls_key->mpls_lse, is_mask);
+
+ attrs &= ~(1ULL << OVS_KEY_ATTR_MPLS);
+ }
+
if (attrs & (1ULL << OVS_KEY_ATTR_TCP)) {
const struct ovs_key_tcp *tcp_key;
@@ -1891,6 +1939,14 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey,
arp_key->arp_op = htons(output->ip.proto);
memcpy(arp_key->arp_sha, output->ipv4.arp.sha, ETH_ALEN);
memcpy(arp_key->arp_tha, output->ipv4.arp.tha, ETH_ALEN);
+ } else if (eth_p_mpls(swkey->eth.type)) {
+ struct ovs_key_mpls *mpls_key;
+
+ nla = nla_reserve(skb, OVS_KEY_ATTR_MPLS, sizeof(*mpls_key));
+ if (!nla)
+ goto nla_put_failure;
+ mpls_key = nla_data(nla);
+ mpls_key->mpls_lse = output->mpls.top_lse;
}
if ((swkey->eth.type == htons(ETH_P_IP) ||
diff --git a/datapath/flow.h b/datapath/flow.h
index 03eae03..9376802 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
@@ -87,12 +87,17 @@ struct sw_flow_key {
__be16 tci; /* 0 if no VLAN, VLAN_TAG_PRESENT set otherwise. */
__be16 type; /* Ethernet frame type. */
} eth;
- struct {
- u8 proto; /* IP protocol or lower 8 bits of ARP opcode. */
- u8 tos; /* IP ToS. */
- u8 ttl; /* IP TTL/hop limit. */
- u8 frag; /* One of OVS_FRAG_TYPE_*. */
- } ip;
+ union {
+ struct {
+ __be32 top_lse; /* top label stack entry */
+ } mpls;
+ struct {
+ u8 proto; /* IP protocol or lower 8 bits of ARP opcode. */
+ u8 tos; /* IP ToS. */
+ u8 ttl; /* IP TTL/hop limit. */
+ u8 frag; /* One of OVS_FRAG_TYPE_*. */
+ } ip;
+ };
union {
struct {
struct {
diff --git a/datapath/linux/compat/gso.c b/datapath/linux/compat/gso.c
index 32f906c..f917356 100644
--- a/datapath/linux/compat/gso.c
+++ b/datapath/linux/compat/gso.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/if.h>
#include <linux/if_tunnel.h>
+#include <linux/if_vlan.h>
#include <linux/icmp.h>
#include <linux/in.h>
#include <linux/ip.h>
@@ -35,6 +36,8 @@
#include <net/xfrm.h>
#include "gso.h"
+#include "mpls.h"
+#include "vlan.h"
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,37) && \
!defined(HAVE_VLAN_BUG_WORKAROUND)
@@ -47,10 +50,12 @@ MODULE_PARM_DESC(vlan_tso, "Enable TSO for VLAN packets");
#define vlan_tso true
#endif
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,37)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,11,0)
static bool dev_supports_vlan_tx(struct net_device *dev)
{
-#if defined(HAVE_VLAN_BUG_WORKAROUND)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,37)
+ return true;
+#elif defined(HAVE_VLAN_BUG_WORKAROUND)
return dev->features & NETIF_F_HW_VLAN_TX;
#else
/* Assume that the driver is buggy. */
@@ -58,24 +63,66 @@ static bool dev_supports_vlan_tx(struct net_device *dev)
#endif
}
+/* Strictly this is not needed and will be optimised out
+ * as this code is guarded by if LINUX_VERSION_CODE < KERNEL_VERSION(3,11,0).
+ * It is here to make things explicit should the compatibility
+ * code be extended in some way prior to extending its life-span
+ * beyond v3.11.
+ */
+static bool supports_mpls_gso(void)
+{
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,11,0)
+ return true;
+#else
+ return false;
+#endif
+}
+
int rpl_dev_queue_xmit(struct sk_buff *skb)
{
#undef dev_queue_xmit
int err = -ENOMEM;
+ __be16 inner_protocol;
+ bool vlan, mpls;
- if (vlan_tx_tag_present(skb) && !dev_supports_vlan_tx(skb->dev)) {
+ vlan = mpls = false;
+
+ inner_protocol = ovs_skb_get_inner_protocol(skb);
+ if (eth_p_mpls(skb->protocol) && !supports_mpls_gso())
+ mpls = true;
+
+ if (vlan_tx_tag_present(skb) && !dev_supports_vlan_tx(skb->dev))
+ vlan = true;
+
+ if (vlan || mpls) {
int features;
features = netif_skb_features(skb);
- if (!vlan_tso)
- features &= ~(NETIF_F_TSO | NETIF_F_TSO6 |
- NETIF_F_UFO | NETIF_F_FSO);
+ if (vlan) {
+ if (!vlan_tso)
+ features &= ~(NETIF_F_TSO | NETIF_F_TSO6 |
+ NETIF_F_UFO | NETIF_F_FSO);
- skb = __vlan_put_tag(skb, skb->vlan_proto, vlan_tx_tag_get(skb));
- if (unlikely(!skb))
- return err;
- vlan_set_tci(skb, 0);
+ skb = __vlan_put_tag(skb, skb->vlan_proto,
+ vlan_tx_tag_get(skb));
+ if (unlikely(!skb))
+ return err;
+ vlan_set_tci(skb, 0);
+ }
+
+ /* As of v3.11 the kernel provides an mpls_features field in
+ * struct net_device which allows devices to advertise which
+ * features it supports for MPLS. This value defaults to
+ * NETIF_F_SG as of v3.11.
+ *
+ * This compatibility code is intended for kernels older
+ * than v3.11 that do not support MPLS GSO and thus do not
+ * provide mpls_features. Thus this code uses NETIF_F_SG
+ * directly in place of mpls_features.
+ */
+ if (mpls)
+ features &= NETIF_F_SG;
if (netif_needs_gso(skb, features)) {
struct sk_buff *nskb;
@@ -114,13 +161,17 @@ drop:
kfree_skb(skb);
return err;
}
-#endif /* kernel version < 2.6.37 */
-static __be16 __skb_network_protocol(struct sk_buff *skb)
+__be16 rpl_skb_network_protocol(struct sk_buff *skb)
{
__be16 type = skb->protocol;
+ __be16 inner_proto;
int vlan_depth = ETH_HLEN;
+ inner_proto = ovs_skb_get_inner_protocol(skb);
+ if (eth_p_mpls(skb->protocol))
+ type = inner_proto;
+
while (type == htons(ETH_P_8021Q) || type == htons(ETH_P_8021AD)) {
struct vlan_hdr *vh;
@@ -135,6 +186,46 @@ static __be16 __skb_network_protocol(struct sk_buff *skb)
return type;
}
+struct sk_buff *rpl___skb_gso_segment(struct sk_buff *skb,
+ netdev_features_t features,
+ bool tx_path)
+{
+ struct sk_buff *skb_gso;
+ __be16 type = skb->protocol;
+ bool mpls;
+
+ mpls = eth_p_mpls(type);
+ if (mpls)
+ skb->protocol = skb_network_protocol(skb);
+
+ /* this hack needed to get regular skb_gso_segment() */
+#ifdef HAVE___SKB_GSO_SEGMENT
+#undef __skb_gso_segment
+ skb_gso = __skb_gso_segment(skb, features, tx_path);
+#else
+#undef skb_gso_segment
+ skb_gso = skb_gso_segment(skb, features);
+#endif
+
+ if (!skb_gso || IS_ERR(skb_gso) || !mpls)
+ return skb_gso;
+
+ skb = skb_gso;
+ while (skb) {
+ skb->protocol = type;
+ skb = skb->next;
+ }
+
+ return skb_gso;
+}
+
+struct sk_buff *rpl_skb_gso_segment(struct sk_buff *skb,
+ netdev_features_t features)
+{
+ return rpl___skb_gso_segment(skb, features, true);
+}
+#endif /* kernel version < 3.11.0 */
+
static struct sk_buff *tnl_skb_gso_segment(struct sk_buff *skb,
netdev_features_t features,
bool tx_path)
@@ -149,7 +240,7 @@ static struct sk_buff *tnl_skb_gso_segment(struct sk_buff *skb,
/* setup whole inner packet to get protocol. */
__skb_pull(skb, mac_offset);
- skb->protocol = __skb_network_protocol(skb);
+ skb->protocol = skb_network_protocol(skb);
/* setup l3 packet to gso, to get around segmentation bug on older kernel.*/
__skb_pull(skb, (pkt_hlen - mac_offset));
diff --git a/datapath/linux/compat/gso.h b/datapath/linux/compat/gso.h
index 44fd213..c6cd8fa 100644
--- a/datapath/linux/compat/gso.h
+++ b/datapath/linux/compat/gso.h
@@ -1,6 +1,7 @@
#ifndef __LINUX_GSO_WRAPPER_H
#define __LINUX_GSO_WRAPPER_H
+#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <net/protocol.h>
@@ -11,6 +12,9 @@ struct ovs_gso_cb {
sk_buff_data_t inner_network_header;
sk_buff_data_t inner_mac_header;
void (*fix_segment)(struct sk_buff *);
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,11,0)
+ __be16 inner_protocol;
+#endif
};
#define OVS_GSO_CB(skb) ((struct ovs_gso_cb *)(skb)->cb)
@@ -69,4 +73,53 @@ static inline void skb_reset_inner_headers(struct sk_buff *skb)
#define ip_local_out rpl_ip_local_out
int ip_local_out(struct sk_buff *skb);
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,11,0)
+#define skb_network_protocol rpl_skb_network_protocol
+__be16 rpl_skb_network_protocol(struct sk_buff *skb);
+
+#define skb_gso_segment rpl_skb_gso_segment
+struct sk_buff *rpl_skb_gso_segment(struct sk_buff *skb,
+ netdev_features_t features);
+
+#define __skb_gso_segment rpl___skb_gso_segment
+struct sk_buff *rpl___skb_gso_segment(struct sk_buff *skb,
+ netdev_features_t features,
+ bool tx_path);
+
+static inline void ovs_skb_init_inner_protocol(struct sk_buff *skb) {
+ OVS_GSO_CB(skb)->inner_protocol = htons(0);
+}
+
+static inline void ovs_skb_set_inner_protocol(struct sk_buff *skb,
+ __be16 ethertype) {
+ OVS_GSO_CB(skb)->inner_protocol = ethertype;
+}
+
+static inline __be16 ovs_skb_get_inner_protocol(struct sk_buff *skb)
+{
+ return OVS_GSO_CB(skb)->inner_protocol;
+}
+
+#else
+
+static inline void ovs_skb_init_inner_protocol(struct sk_buff *skb) {
+ /* Nothing to do. The inner_protocol is either zero or
+ * has been set to a value by another user.
+ * Either way it may be considered initialised.
+ */
+}
+
+static inline void ovs_skb_set_inner_protocol(struct sk_buff *skb,
+ __be16 ethertype)
+{
+ skb->inner_protocol = ethertype;
+}
+
+static inline __be16 ovs_skb_get_inner_protocol(struct sk_buff *skb)
+{
+ return skb->inner_protocol;
+}
+#endif
+
#endif
diff --git a/datapath/linux/compat/include/linux/netdevice.h b/datapath/linux/compat/include/linux/netdevice.h
index 2b2c855..958ea81 100644
--- a/datapath/linux/compat/include/linux/netdevice.h
+++ b/datapath/linux/compat/include/linux/netdevice.h
@@ -74,9 +74,6 @@ static inline struct net_device *dev_get_by_index_rcu(struct net *net, int ifind
#endif
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,38)
-#define skb_gso_segment rpl_skb_gso_segment
-struct sk_buff *rpl_skb_gso_segment(struct sk_buff *skb, u32 features);
-
#define netif_skb_features rpl_netif_skb_features
u32 rpl_netif_skb_features(struct sk_buff *skb);
@@ -92,15 +89,6 @@ static inline int rpl_netif_needs_gso(struct sk_buff *skb, int features)
typedef u32 netdev_features_t;
#endif
-#ifndef HAVE___SKB_GSO_SEGMENT
-static inline struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
- netdev_features_t features,
- bool tx_path)
-{
- return skb_gso_segment(skb, features);
-}
-#endif
-
#if LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0)
/* XEN dom0 networking assumes dev->master is bond device
@@ -120,7 +108,7 @@ static inline void netdev_upper_dev_unlink(struct net_device *dev,
}
#endif
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,37)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,11,0)
#define dev_queue_xmit rpl_dev_queue_xmit
int dev_queue_xmit(struct sk_buff *skb);
#endif
diff --git a/datapath/linux/compat/netdevice.c b/datapath/linux/compat/netdevice.c
index 248066d..5f190b9 100644
--- a/datapath/linux/compat/netdevice.c
+++ b/datapath/linux/compat/netdevice.c
@@ -71,32 +71,4 @@ u32 rpl_netif_skb_features(struct sk_buff *skb)
return harmonize_features(skb, protocol, features);
}
}
-
-struct sk_buff *rpl_skb_gso_segment(struct sk_buff *skb, u32 features)
-{
- int vlan_depth = ETH_HLEN;
- __be16 type = skb->protocol;
- __be16 skb_proto;
- struct sk_buff *skb_gso;
-
- while (type == htons(ETH_P_8021Q)) {
- struct vlan_hdr *vh;
-
- if (unlikely(!pskb_may_pull(skb, vlan_depth + VLAN_HLEN)))
- return ERR_PTR(-EINVAL);
-
- vh = (struct vlan_hdr *)(skb->data + vlan_depth);
- type = vh->h_vlan_encapsulated_proto;
- vlan_depth += VLAN_HLEN;
- }
-
- /* this hack needed to get regular skb_gso_segment() */
-#undef skb_gso_segment
- skb_proto = skb->protocol;
- skb->protocol = type;
-
- skb_gso = skb_gso_segment(skb, features);
- skb->protocol = skb_proto;
- return skb_gso;
-}
#endif /* kernel version < 2.6.38 */
diff --git a/datapath/mpls.h b/datapath/mpls.h
new file mode 100644
index 0000000..7eab104
--- /dev/null
+++ b/datapath/mpls.h
@@ -0,0 +1,15 @@
+#ifndef MPLS_H
+#define MPLS_H 1
+
+#include <linux/if_ether.h>
+
+#define MPLS_BOS_MASK 0x00000100
+#define MPLS_HLEN 4
+
+static inline bool eth_p_mpls(__be16 eth_type)
+{
+ return eth_type == htons(ETH_P_MPLS_UC) ||
+ eth_type == htons(ETH_P_MPLS_MC);
+}
+
+#endif
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index 09c26b5..1ef98a8 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -283,14 +283,13 @@ enum ovs_key_attr {
OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */
OVS_KEY_ATTR_TUNNEL, /* Nested set of ovs_tunnel attributes */
OVS_KEY_ATTR_SCTP, /* struct ovs_key_sctp */
+ OVS_KEY_ATTR_MPLS, /* array of struct ovs_key_mpls.
+ * The implementation may restrict
+ * the accepted length of the array. */
#ifdef __KERNEL__
OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */
#endif
-
- OVS_KEY_ATTR_MPLS = 62, /* array of struct ovs_key_mpls.
- * The implementation may restrict
- * the accepted length of the array. */
__OVS_KEY_ATTR_MAX
};
--
1.8.4
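
Reader's note: the struct eth_types comment in datapath/datapath.c above
walks through how start, end and cursor move across nested sample()
actions. The following is a minimal userspace sketch of that bookkeeping,
not the datapath code itself. The names sample_enter(), sample_leave() and
struct saved_pos are hypothetical, and uint16_t stands in for __be16. Unlike
the patch, the sketch checks for a free slot before moving cursor.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ETH_TYPES 16 /* same arbitrary limit as in the patch */

struct eth_types {
    int start, end, cursor;
    uint16_t types[MAX_ETH_TYPES];
};

/* Hypothetical holder for the two values saved around a sample(). */
struct saved_pos {
    int start, cursor;
};

/* An action changed the eth type: collapse the range onto cursor. */
static void eth_types_set(struct eth_types *t, uint16_t type)
{
    t->start = t->end = t->cursor;
    t->types[t->cursor] = type;
}

/* Enter a sample(): save start and cursor, then point cursor at the
 * slot after end. Returns -1 if no slot is left, 0 otherwise. */
static int sample_enter(struct eth_types *t, struct saved_pos *s)
{
    if (t->end + 1 >= MAX_ETH_TYPES)
        return -1;
    s->start = t->start;
    s->cursor = t->cursor;
    t->cursor = t->end + 1;
    return 0;
}

/* Leave a sample(): restore start and cursor, keep end as it is so
 * that types written inside the sample stay in the checked range. */
static void sample_leave(struct eth_types *t, const struct saved_pos *s)
{
    t->start = s->start;
    t->cursor = s->cursor;
}

/* Number of eth types a later action has to be validated against. */
static int eth_types_count(const struct eth_types *t)
{
    return t->end - t->start + 1;
}
```

Running the example from the comment, pop_mpls(A), sample(pop_mpls(B)),
sample(pop_mpls(C)), pop_mpls(D), through these helpers leaves three types
to check before the final pop_mpls(D) and one type, D, afterwards.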
* [PATCH v2.40 5/7] lib: Push MPLS tags in the OpenFlow 1.3 ordering
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
In-Reply-To: <1380241116-7661-1-git-send-email-horms@verge.net.au>
From: Joe Stringer <joe@wand.net.nz>
This patch modifies the push_mpls behaviour to follow the OpenFlow 1.3
specification in the presence of VLAN tagged packets. From the spec:
"Newly pushed tags should always be inserted as the outermost tag in the
outermost valid location for that tag. When a new VLAN tag is pushed, it
should be the outermost tag inserted, immediately after the Ethernet
header and before other tags. Likewise, when a new MPLS tag is pushed,
it should be the outermost tag inserted, immediately after the Ethernet
header and before other tags."
When the push_mpls action was inserted using OpenFlow 1.2, we implement
the previous behaviour by inserting VLAN actions around the MPLS action
in the odp translation: VLAN tags are popped before committing MPLS
actions, and the expected VLAN tag is pushed afterwards. The trigger
condition for this is based on the ofpact->compat field.
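
To make the two orderings concrete, here is a minimal sketch that works on
a raw frame buffer rather than on struct ofpbuf. It is an illustration
only: push_mpls_sketch() and mpls_push_offset() are hypothetical names, at
most one 802.1Q tag is assumed, and the caller is assumed to provide room
for four extra bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ETH_HDR_LEN = 14, VLAN_TAG_LEN = 4, MPLS_LSE_LEN = 4 };
enum { ETHERTYPE_VLAN = 0x8100, ETHERTYPE_MPLS = 0x8847 };

/* Read and write 16-bit values in network byte order. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Offset at which the new label stack entry goes.
 * OF1.3 ordering: directly after the Ethernet header.
 * Older ordering: after a single VLAN tag, if one is present. */
static size_t mpls_push_offset(const uint8_t *frame, int of13_ordering)
{
    if (!of13_ordering && get_be16(frame + 12) == ETHERTYPE_VLAN)
        return ETH_HDR_LEN + VLAN_TAG_LEN;
    return ETH_HDR_LEN;
}

/* Insert one LSE (host byte order) and return the new frame length.
 * Assumes len >= insertion offset and capacity >= len + 4. */
static size_t push_mpls_sketch(uint8_t *frame, size_t len, uint32_t lse,
                               int of13_ordering)
{
    size_t off = mpls_push_offset(frame, of13_ordering);

    /* Make room, then rewrite the ethertype field that immediately
     * precedes the insertion point so it announces MPLS. */
    memmove(frame + off + MPLS_LSE_LEN, frame + off, len - off);
    put_be16(frame + off - 2, ETHERTYPE_MPLS);

    frame[off + 0] = (uint8_t)(lse >> 24);
    frame[off + 1] = (uint8_t)(lse >> 16);
    frame[off + 2] = (uint8_t)(lse >> 8);
    frame[off + 3] = (uint8_t)lse;
    return len + MPLS_LSE_LEN;
}
```

With of13_ordering set, a VLAN-tagged frame ends up with the LSE at byte 14
and the former VLAN TCI following it. With it cleared, the outer 0x8100 is
left alone and the LSE lands at byte 18.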
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v2.40
* Trivial rebase for removal of set_ethertype()
v2.36 - v2.39
* No change
v2.35
* First post
---
lib/flow.c | 2 +-
lib/packets.c | 10 +-
lib/packets.h | 2 +-
ofproto/ofproto-dpif-xlate.c | 10 +-
tests/ofproto-dpif.at | 237 +++++++++++++++++++++++++++++++++++++++++++
5 files changed, 253 insertions(+), 8 deletions(-)
diff --git a/lib/flow.c b/lib/flow.c
index 0ce694d..1039222 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -1064,7 +1064,7 @@ flow_compose(struct ofpbuf *b, const struct flow *flow)
}
if (eth_type_mpls(flow->dl_type)) {
- b->l2_5 = b->l3;
+ b->l2_5 = (char*)b->l2 + ETH_HEADER_LEN;
push_mpls(b, flow->dl_type, flow->mpls_lse);
}
}
diff --git a/lib/packets.c b/lib/packets.c
index 922c5db..f8a58b6 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -220,11 +220,11 @@ eth_pop_vlan(struct ofpbuf *packet)
/* Set ethertype of the packet. */
void
-set_ethertype(struct ofpbuf *packet, ovs_be16 eth_type)
+set_ethertype(struct ofpbuf *packet, ovs_be16 eth_type, bool inner)
{
struct eth_header *eh = packet->data;
- if (eh->eth_type == htons(ETH_TYPE_VLAN)) {
+ if (inner && eh->eth_type == htons(ETH_TYPE_VLAN)) {
ovs_be16 *p;
p = ALIGNED_CAST(ovs_be16 *,
(char *)(packet->l2_5 ? packet->l2_5 : packet->l3) - 2);
@@ -332,8 +332,8 @@ push_mpls(struct ofpbuf *packet, ovs_be16 ethtype, ovs_be32 lse)
if (!is_mpls(packet)) {
/* Set ethtype and MPLS label stack entry. */
- set_ethertype(packet, ethtype);
- packet->l2_5 = packet->l3;
+ set_ethertype(packet, ethtype, false);
+ packet->l2_5 = (char*)packet->l2 + ETH_HEADER_LEN;
}
/* Push new MPLS shim header onto packet. */
@@ -354,7 +354,7 @@ pop_mpls(struct ofpbuf *packet, ovs_be16 ethtype)
size_t len;
mh = packet->l2_5;
len = (char*)packet->l2_5 - (char*)packet->l2;
- set_ethertype(packet, ethtype);
+ set_ethertype(packet, ethtype, true);
if (mh->mpls_lse & htonl(MPLS_BOS_MASK)) {
packet->l2_5 = NULL;
} else {
diff --git a/lib/packets.h b/lib/packets.h
index 7388152..38fec70 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -143,7 +143,7 @@ void compose_rarp(struct ofpbuf *, const uint8_t eth_src[ETH_ADDR_LEN]);
void eth_push_vlan(struct ofpbuf *, ovs_be16 tci);
void eth_pop_vlan(struct ofpbuf *);
-void set_ethertype(struct ofpbuf *packet, ovs_be16 eth_type);
+void set_ethertype(struct ofpbuf *packet, ovs_be16 eth_type, bool inner);
const char *eth_from_hex(const char *hex, struct ofpbuf **packetp);
void eth_format_masked(const uint8_t eth[ETH_ADDR_LEN],
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 1cf5d52..9f7298b 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2235,6 +2235,12 @@ may_receive(const struct xport *xport, struct xlate_ctx *ctx)
return true;
}
+static bool
+mpls_compat_behaviour(enum ofputil_action_code compat)
+{
+ return (compat != OFPUTIL_OFPAT13_PUSH_MPLS);
+}
+
static void
vlan_tci_restore(struct xlate_in *xin, ovs_be16 *tci_ptr, ovs_be16 orig_tci)
{
@@ -2423,7 +2429,9 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
/* Save and pop any existing VLAN tags if running in OF1.2 mode. */
ctx->xin->vlan_tci = *vlan_tci;
- flow->vlan_tci = htons(0);
+ if (mpls_compat_behaviour(a->compat)) {
+ flow->vlan_tci = htons(0);
+ }
vlan_tci = &ctx->xin->vlan_tci;
break;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index c07c64e..17b2b30 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -1078,6 +1078,243 @@ NXST_FLOW reply:
OVS_VSWITCHD_STOP
AT_CLEANUP
+AT_SETUP([ofproto-dpif - OF1.3+ VLAN+MPLS handling])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_DATA([flows.txt], [dnl
+cookie=0xa dl_src=40:44:44:44:55:44 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller
+cookie=0xa dl_src=40:44:44:44:55:45 actions=push_vlan:0x8100,mod_vlan_vid:99,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller
+cookie=0xa dl_src=40:44:44:44:55:46 actions=push_vlan:0x8100,mod_vlan_vid:99,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller
+cookie=0xa dl_src=40:44:44:44:55:47 actions=push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller
+cookie=0xa dl_src=40:44:44:44:55:48 actions=push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],controller
+cookie=0xa dl_src=40:44:44:44:55:49 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],push_vlan:0x8100,mod_vlan_vid:99,controller
+cookie=0xa dl_src=40:44:44:44:55:50 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],push_vlan:0x8100,mod_vlan_vid:99,controller
+cookie=0xa dl_src=40:44:44:44:55:51 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,controller
+cookie=0xa dl_src=40:44:44:44:55:52 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],load:3->OXM_OF_MPLS_TC[[]],push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,controller
+])
+AT_CHECK([ovs-ofctl --protocols=OpenFlow13 add-flows br0 flows.txt])
+
+dnl Modified MPLS controller action.
+dnl The input packet has a VLAN tag, but because we push an MPLS tag in
+dnl OF1.3 mode, we can no longer see it.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:44,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:44,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:44,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:44,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push a VLAN tag, then an MPLS tag in OF1.3 mode, so we
+dnl can only see the MPLS tag in the result.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:45,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:45,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:45,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:45,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged; we update this tag then
+dnl push an MPLS tag in OF1.3 mode. As such, we can only see the MPLS tag in
+dnl the result.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:46,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:46,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:46,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:46,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push a VLAN tag, then an MPLS tag in OF1.3 mode, so we
+dnl can only see the MPLS tag in the result.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:47,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:47,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:47,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:47,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged; we update this tag then
+dnl push an MPLS tag in OF1.3 mode. As such, we can only see the MPLS tag in
+dnl the result.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:48,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:48,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:48,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,vlan_tci=0x0000,dl_src=40:44:44:44:55:48,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the MPLS tag before pushing a VLAN tag, so we see
+dnl both of these in the final flow.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:49,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:49,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:49,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:49,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:50,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=0,dl_src=40:44:44:44:55:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the MPLS tag before pushing a VLAN tag, so we see
+dnl both of these in the final flow.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:51,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:55:52,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:52,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:52,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:55:52,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=3,mpls_ttl=64,mpls_bos=1
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:44 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:45 actions=mod_vlan_vid:99,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:46 actions=mod_vlan_vid:99,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:47 actions=load:0x63->OXM_OF_VLAN_VID[[]],push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:48 actions=load:0x63->OXM_OF_VLAN_VID[[]],push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:49 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],mod_vlan_vid:99,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:50 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],mod_vlan_vid:99,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:51 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:55:52 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x3->OXM_OF_MPLS_TC[[]],load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,CONTROLLER:65535
+NXST_FLOW reply:
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
AT_SETUP([ofproto-dpif - fragment handling])
OVS_VSWITCHD_START
ADD_OF_PORTS([br0], [1], [2], [3], [4], [5], [6], [90])
--
1.8.4
* [PATCH v2.40 6/7] datapath: Break out deacceleration portion of vlan_push
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
In-Reply-To: <1380241116-7661-1-git-send-email-horms@verge.net.au>
Break out the deacceleration portion of push_vlan() into put_vlan()
so that it may be re-used by mpls_push.
For both push_vlan() and mpls_push, if an accelerated VLAN tag is
present then it should be deaccelerated, adding it to the data of
the skb, before the new tag is added.
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v2.40
* As suggested by Jesse Gross
+ Simplify vlan_push by returning an error code
rather than an error code encoded as a struct sk_buff *
v2.39
* First post
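
For readers following the refactoring, here is a rough userspace model of
the split described above. It is only a sketch: struct model_skb,
model_put_vlan() and model_push_vlan() are hypothetical names invented for
illustration and are not kernel APIs. The model keeps one out-of-band
("accelerated") tag plus a small array standing in for tags written into
the packet data, and it skips the checksum fix-up that the real code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_TAGS 4    /* hypothetical capacity of the toy packet */

/* Toy stand-in for struct sk_buff (an assumption, not the kernel type). */
struct model_skb {
    bool     accel_present;              /* an out-of-band tag is set */
    uint16_t accel_tci;                  /* value of the out-of-band tag */
    uint16_t data_tags[MODEL_MAX_TAGS];  /* data_tags[0] is outermost */
    int      n_data_tags;
};

/* Deaccelerate: move the out-of-band tag into the packet data.
 * Returns 0 on success or a negative errno, as put_vlan() does. */
static int model_put_vlan(struct model_skb *skb)
{
    assert(skb->accel_present);
    if (skb->n_data_tags == MODEL_MAX_TAGS)
        return -ENOMEM;

    /* Shift existing tags inward so the new one becomes outermost. */
    for (int i = skb->n_data_tags; i > 0; i--)
        skb->data_tags[i] = skb->data_tags[i - 1];
    skb->data_tags[0] = skb->accel_tci;
    skb->n_data_tags++;
    skb->accel_present = false;
    return 0;
}

/* Push a new tag; any existing accelerated tag is deaccelerated first. */
static int model_push_vlan(struct model_skb *skb, uint16_t tci)
{
    if (skb->accel_present) {
        int err = model_put_vlan(skb);
        if (err)
            return err;
    }
    skb->accel_present = true;
    skb->accel_tci = tci;
    return 0;
}
```

In this model, pushing tag 88 and then tag 99 leaves 99 as the accelerated
tag and 88 in the packet data. An MPLS push helper could call
model_put_vlan() in the same way before adding its own header.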
---
datapath/actions.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/datapath/actions.c b/datapath/actions.c
index 30ea1d2..d961e5d 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -105,22 +105,31 @@ static int pop_vlan(struct sk_buff *skb)
return 0;
}
-static int push_vlan(struct sk_buff *skb, const struct ovs_action_push_vlan *vlan)
+/* push down current VLAN tag */
+static int put_vlan(struct sk_buff *skb)
{
- if (unlikely(vlan_tx_tag_present(skb))) {
- u16 current_tag;
+ u16 current_tag = vlan_tx_tag_get(skb);
- /* push down current VLAN tag */
- current_tag = vlan_tx_tag_get(skb);
+ if (!__vlan_put_tag(skb, skb->vlan_proto, current_tag))
+ return -ENOMEM;
- if (!__vlan_put_tag(skb, skb->vlan_proto, current_tag))
- return -ENOMEM;
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_add(skb->csum, csum_partial(skb->data
+ + (2 * ETH_ALEN), VLAN_HLEN, 0));
- if (skb->ip_summed == CHECKSUM_COMPLETE)
- skb->csum = csum_add(skb->csum, csum_partial(skb->data
- + (2 * ETH_ALEN), VLAN_HLEN, 0));
+ return 0;
+}
+static int push_vlan(struct sk_buff *skb, const struct ovs_action_push_vlan *vlan)
+{
+ if (unlikely(vlan_tx_tag_present(skb))) {
+ int err;
+
+ err = put_vlan(skb);
+ if (unlikely(err))
+ return err;
}
+
__vlan_hwaccel_put_tag(skb, vlan->vlan_tpid, ntohs(vlan->vlan_tci) & ~VLAN_TAG_PRESENT);
return 0;
}
--
1.8.4
* [PATCH v2.40 2/7] odp: Allow VLAN actions after MPLS actions
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
In-Reply-To: <1380241116-7661-1-git-send-email-horms@verge.net.au>
From: Joe Stringer <joe@wand.net.nz>
OpenFlow 1.2 and 1.3 differ on their handling of MPLS actions in the
presence of VLAN tags. To allow correct behaviour to be committed in
each situation, this patch adds a second round of VLAN tag action
handling to commit_odp_actions(), which occurs after MPLS actions. This
is implemented with a new field in 'struct xlate_in' called 'vlan_tci'.
When a push_mpls action is composed, the flow's current VLAN state is
stored into xin->vlan_tci, and flow->vlan_tci is set to 0 (pop_vlan). If
a VLAN tag is present, it is stripped; if not, then there is no change.
Any later modifications to the VLAN state are written to xin->vlan_tci.
When committing the actions, flow->vlan_tci is used before MPLS actions,
and xin->vlan_tci is used afterwards. This retains the current datapath
behaviour, but allows VLAN actions to be applied in a more flexible
manner.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v2.40
* Rebase for removal of mpls_depth from struct flow
v2.38 - v2.39
* No change
v2.37
* Rebase
v2.36
* No change
v2.5
* First post
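
The two-phase commit described in the commit message can be sketched with a
small standalone model. This is an illustration under stated assumptions,
not the code in lib/odp-util.c: every model_* name is hypothetical, a
vlan_tci of 0 stands for "no tag" (the real code tests VLAN_CFI), and
actions are recorded as enum values rather than netlink attributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_act { MODEL_POP_VLAN, MODEL_PUSH_VLAN, MODEL_PUSH_MPLS };

struct model_actions {
    enum model_act act[8];
    size_t n;
};

/* Toy flow state; vlan_tci == 0 means "no VLAN tag" in this model. */
struct model_flow {
    uint16_t vlan_tci;
    int mpls_depth;
};

static void model_emit(struct model_actions *out, enum model_act a)
{
    assert(out->n < sizeof out->act / sizeof out->act[0]);
    out->act[out->n++] = a;
}

/* Emit pop/push so that 'base' ends up with the wanted VLAN state. */
static void model_commit_vlan(uint16_t want, struct model_flow *base,
                              struct model_actions *out)
{
    if (base->vlan_tci == want)
        return;
    if (base->vlan_tci)
        model_emit(out, MODEL_POP_VLAN);
    if (want)
        model_emit(out, MODEL_PUSH_VLAN);
    base->vlan_tci = want;
}

/* Emit one push_mpls per label that 'flow' has and 'base' lacks. */
static void model_commit_mpls(const struct model_flow *flow,
                              struct model_flow *base,
                              struct model_actions *out)
{
    while (base->mpls_depth < flow->mpls_depth) {
        model_emit(out, MODEL_PUSH_MPLS);
        base->mpls_depth++;
    }
}

/* VLAN state from 'flow' is committed before MPLS, and
 * 'final_vlan_tci' is committed afterwards. */
static void model_commit(const struct model_flow *flow,
                         struct model_flow *base,
                         struct model_actions *out,
                         uint16_t final_vlan_tci)
{
    model_commit_vlan(flow->vlan_tci, base, out);
    model_commit_mpls(flow, base, out);
    model_commit_vlan(final_vlan_tci, base, out);
}
```

For a packet that arrives tagged (base vlan_tci 88) and is then subject to
push_mpls followed by a VLAN push of 99, the model is called with
flow.vlan_tci = 0, flow.mpls_depth = 1 and final_vlan_tci = 99, and it
records pop_vlan, push_mpls, push_vlan in that order.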
---
lib/odp-util.c | 9 +-
lib/odp-util.h | 2 +-
ofproto/ofproto-dpif-xlate.c | 90 ++++++++++++++-----
ofproto/ofproto-dpif-xlate.h | 5 ++
tests/ofproto-dpif.at | 209 +++++++++++++++++++++++++++++++++++++++++++
5 files changed, 292 insertions(+), 23 deletions(-)
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 0785c6a..fcfa91b 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -3549,11 +3549,15 @@ commit_set_pkt_mark_action(const struct flow *flow, struct flow *base,
* key from 'base' into 'flow', and then changes 'base' the same way. Does not
* commit set_tunnel actions. Users should call commit_odp_tunnel_action()
* in addition to this function if needed. Sets fields in 'wc' that are
- * used as part of the action. */
+ * used as part of the action.
+ *
+ * VLAN actions may be committed twice. If vlan_tci in 'flow' differs from the
+ * one in 'base', then it is committed before MPLS actions. If 'final_vlan_tci'
+ * differs from 'flow->vlan_tci', it is committed afterwards. */
void
commit_odp_actions(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions, struct flow_wildcards *wc,
- int *mpls_depth_delta)
+ int *mpls_depth_delta, ovs_be16 final_vlan_tci)
{
commit_set_ether_addr_action(flow, base, odp_actions, wc);
commit_vlan_action(flow->vlan_tci, base, odp_actions, wc);
@@ -3564,6 +3568,7 @@ commit_odp_actions(const struct flow *flow, struct flow *base,
* that it is no longer IP and thus nw and port actions are no longer valid.
*/
commit_mpls_action(flow, base, odp_actions, wc, mpls_depth_delta);
+ commit_vlan_action(final_vlan_tci, base, odp_actions, wc);
commit_set_priority_action(flow, base, odp_actions, wc);
commit_set_pkt_mark_action(flow, base, odp_actions, wc);
}
diff --git a/lib/odp-util.h b/lib/odp-util.h
index 4abf543..c7fc1eb 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -131,7 +131,7 @@ void commit_odp_tunnel_action(const struct flow *, struct flow *base,
struct ofpbuf *odp_actions);
void commit_odp_actions(const struct flow *, struct flow *base,
struct ofpbuf *odp_actions, struct flow_wildcards *wc,
- int *mpls_depth_delta);
+ int *mpls_depth_delta, ovs_be16 final_vlan_tci);
\f
/* ofproto-dpif interface.
*
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 5482323..1cf5d52 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -982,10 +982,11 @@ static void
output_normal(struct xlate_ctx *ctx, const struct xbundle *out_xbundle,
uint16_t vlan)
{
- ovs_be16 *flow_tci = &ctx->xin->flow.vlan_tci;
+ ovs_be16 *flow_tci = &ctx->xin->vlan_tci;
uint16_t vid;
ovs_be16 tci, old_tci;
struct xport *xport;
+ bool flow_tci_equal_to_xin = (*flow_tci == ctx->xin->flow.vlan_tci);
vid = output_vlan_to_vid(out_xbundle, vlan);
if (list_is_empty(&out_xbundle->xports)) {
@@ -1016,9 +1017,15 @@ output_normal(struct xlate_ctx *ctx, const struct xbundle *out_xbundle,
}
}
*flow_tci = tci;
+ if (flow_tci_equal_to_xin) {
+ ctx->xin->flow.vlan_tci = tci;
+ }
compose_output_action(ctx, xport->ofp_port);
*flow_tci = old_tci;
+ if (flow_tci_equal_to_xin) {
+ ctx->xin->flow.vlan_tci = old_tci;
+ }
}
/* A VM broadcasts a gratuitous ARP to indicate that it has resumed after
@@ -1251,7 +1258,7 @@ xlate_normal(struct xlate_ctx *ctx)
/* Drop malformed frames. */
if (flow->dl_type == htons(ETH_TYPE_VLAN) &&
- !(flow->vlan_tci & htons(VLAN_CFI))) {
+ !(ctx->xin->vlan_tci & htons(VLAN_CFI))) {
if (ctx->xin->packet != NULL) {
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
VLOG_WARN_RL(&rl, "bridge %s: dropping packet with partial "
@@ -1275,7 +1282,7 @@ xlate_normal(struct xlate_ctx *ctx)
}
/* Check VLAN. */
- vid = vlan_tci_to_vid(flow->vlan_tci);
+ vid = vlan_tci_to_vid(ctx->xin->vlan_tci);
if (!input_vid_is_valid(vid, in_xbundle, ctx->xin->packet != NULL)) {
xlate_report(ctx, "disallowed VLAN VID for this input port, dropping");
return;
@@ -1533,7 +1540,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
const struct xport *xport = get_ofp_port(ctx->xbridge, ofp_port);
struct flow_wildcards *wc = &ctx->xout->wc;
struct flow *flow = &ctx->xin->flow;
- ovs_be16 flow_vlan_tci;
+ ovs_be16 flow_vlan_tci, xin_vlan_tci;
uint32_t flow_pkt_mark;
uint8_t flow_nw_tos;
odp_port_t out_port, odp_port;
@@ -1602,6 +1609,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
}
flow_vlan_tci = flow->vlan_tci;
+ xin_vlan_tci = ctx->xin->vlan_tci;
flow_pkt_mark = flow->pkt_mark;
flow_nw_tos = flow->nw_tos;
@@ -1641,19 +1649,20 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
wc->masks.vlan_tci |= htons(VLAN_VID_MASK | VLAN_CFI);
}
vlandev_port = vsp_realdev_to_vlandev(ctx->xbridge->ofproto, ofp_port,
- flow->vlan_tci);
+ ctx->xin->vlan_tci);
if (vlandev_port == ofp_port) {
out_port = odp_port;
} else {
out_port = ofp_port_to_odp_port(ctx->xbridge, vlandev_port);
flow->vlan_tci = htons(0);
+ ctx->xin->vlan_tci = htons(0);
}
}
if (out_port != ODPP_NONE) {
commit_odp_actions(flow, &ctx->base_flow,
&ctx->xout->odp_actions, &ctx->xout->wc,
- &ctx->mpls_depth_delta);
+ &ctx->mpls_depth_delta, ctx->xin->vlan_tci);
nl_msg_put_odp_port(&ctx->xout->odp_actions, OVS_ACTION_ATTR_OUTPUT,
out_port);
@@ -1665,6 +1674,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
out:
/* Restore flow */
flow->vlan_tci = flow_vlan_tci;
+ ctx->xin->vlan_tci = xin_vlan_tci;
flow->pkt_mark = flow_pkt_mark;
flow->nw_tos = flow_nw_tos;
}
@@ -1809,7 +1819,7 @@ execute_controller_action(struct xlate_ctx *ctx, int len,
commit_odp_actions(&ctx->xin->flow, &ctx->base_flow,
&ctx->xout->odp_actions, &ctx->xout->wc,
- &ctx->mpls_depth_delta);
+ &ctx->mpls_depth_delta, ctx->xin->vlan_tci);
odp_execute_actions(NULL, packet, &key, ctx->xout->odp_actions.data,
ctx->xout->odp_actions.size, NULL, NULL);
@@ -2197,7 +2207,7 @@ xlate_sample_action(struct xlate_ctx *ctx,
commit_odp_actions(&ctx->xin->flow, &ctx->base_flow,
&ctx->xout->odp_actions, &ctx->xout->wc,
- &ctx->mpls_depth_delta);
+ &ctx->mpls_depth_delta, ctx->xin->vlan_tci);
compose_flow_sample_cookie(os->probability, os->collector_set_id,
os->obs_domain_id, os->obs_point_id, &cookie);
@@ -2226,11 +2236,23 @@ may_receive(const struct xport *xport, struct xlate_ctx *ctx)
}
static void
+vlan_tci_restore(struct xlate_in *xin, ovs_be16 *tci_ptr, ovs_be16 orig_tci)
+{
+ /* If VLAN actions were executed after MPLS, copy the final vlan_tci out
+ * and restore the intermediate VLAN state. */
+ if (xin->flow.vlan_tci != orig_tci && tci_ptr == &xin->vlan_tci) {
+ xin->vlan_tci = xin->flow.vlan_tci;
+ xin->flow.vlan_tci = orig_tci;
+ }
+}
+
+static void
do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
struct xlate_ctx *ctx)
{
struct flow_wildcards *wc = &ctx->xout->wc;
struct flow *flow = &ctx->xin->flow;
+ ovs_be16 *vlan_tci = &ctx->xin->flow.vlan_tci;
const struct ofpact *a;
OFPACT_FOR_EACH (a, ofpacts, ofpacts_len) {
@@ -2241,6 +2263,15 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
break;
}
+ /* Update the final vlan state to be equal to the current state.
+ * - If 'vlan_tci' points to 'xin->flow.vlan_tci', then additional
+ * VLAN actions will be applied before MPLS actions. 'xin->vlan_tci'
+ * is updated to reflect the final state of the flow.
+ * - If 'vlan_tci' already points to 'xin->vlan_tci', then additional
+ * VLAN actions will be applied after MPLS actions. 'xin->vlan_tci'
+ * is already equal to the current state. */
+ ctx->xin->vlan_tci = *vlan_tci;
+
switch (a->type) {
case OFPACT_OUTPUT:
xlate_output_action(ctx, ofpact_get_OUTPUT(a)->port,
@@ -2264,28 +2295,28 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
case OFPACT_SET_VLAN_VID:
wc->masks.vlan_tci |= htons(VLAN_VID_MASK | VLAN_CFI);
- flow->vlan_tci &= ~htons(VLAN_VID_MASK);
- flow->vlan_tci |= (htons(ofpact_get_SET_VLAN_VID(a)->vlan_vid)
- | htons(VLAN_CFI));
+ *vlan_tci &= ~htons(VLAN_VID_MASK);
+ *vlan_tci |= (htons(ofpact_get_SET_VLAN_VID(a)->vlan_vid)
+ | htons(VLAN_CFI));
break;
case OFPACT_SET_VLAN_PCP:
- wc->masks.vlan_tci |= htons(VLAN_PCP_MASK | VLAN_CFI);
- flow->vlan_tci &= ~htons(VLAN_PCP_MASK);
- flow->vlan_tci |=
+ wc->masks.vlan_tci |= htons(VLAN_PCP_MASK | VLAN_CFI);
+ *vlan_tci &= ~htons(VLAN_PCP_MASK);
+ *vlan_tci |=
htons((ofpact_get_SET_VLAN_PCP(a)->vlan_pcp << VLAN_PCP_SHIFT)
| VLAN_CFI);
break;
case OFPACT_STRIP_VLAN:
memset(&wc->masks.vlan_tci, 0xff, sizeof wc->masks.vlan_tci);
- flow->vlan_tci = htons(0);
+ *vlan_tci = htons(0);
break;
case OFPACT_PUSH_VLAN:
/* XXX 802.1AD(QinQ) */
memset(&wc->masks.vlan_tci, 0xff, sizeof wc->masks.vlan_tci);
- flow->vlan_tci = htons(VLAN_CFI);
+ *vlan_tci = htons(VLAN_CFI);
break;
case OFPACT_SET_ETH_SRC:
@@ -2353,29 +2384,47 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
flow->skb_priority = ctx->orig_skb_priority;
break;
- case OFPACT_REG_MOVE:
+ case OFPACT_REG_MOVE: {
+ ovs_be16 orig_tci = flow->vlan_tci;
nxm_execute_reg_move(ofpact_get_REG_MOVE(a), flow, wc);
+ vlan_tci_restore(ctx->xin, vlan_tci, orig_tci);
break;
+ }
- case OFPACT_REG_LOAD:
+ case OFPACT_REG_LOAD: {
+ ovs_be16 orig_tci = flow->vlan_tci;
nxm_execute_reg_load(ofpact_get_REG_LOAD(a), flow);
+ vlan_tci_restore(ctx->xin, vlan_tci, orig_tci);
break;
+ }
- case OFPACT_STACK_PUSH:
+ case OFPACT_STACK_PUSH: {
+ ovs_be16 orig_tci = flow->vlan_tci;
+ flow->vlan_tci = *vlan_tci;
nxm_execute_stack_push(ofpact_get_STACK_PUSH(a), flow, wc,
&ctx->stack);
+ flow->vlan_tci = orig_tci;
break;
+ }
- case OFPACT_STACK_POP:
+ case OFPACT_STACK_POP: {
+ ovs_be16 orig_tci = flow->vlan_tci;
nxm_execute_stack_pop(ofpact_get_STACK_POP(a), flow, wc,
&ctx->stack);
+ vlan_tci_restore(ctx->xin, vlan_tci, orig_tci);
break;
+ }
case OFPACT_PUSH_MPLS:
if (compose_mpls_push_action(ctx,
ofpact_get_PUSH_MPLS(a)->ethertype)) {
return;
}
+
+ /* Save and pop any existing VLAN tags if running in OF1.2 mode. */
+ ctx->xin->vlan_tci = *vlan_tci;
+ flow->vlan_tci = htons(0);
+ vlan_tci = &ctx->xin->vlan_tci;
break;
case OFPACT_POP_MPLS:
@@ -2477,6 +2526,7 @@ xlate_in_init(struct xlate_in *xin, struct ofproto_dpif *ofproto,
{
xin->ofproto = ofproto;
xin->flow = *flow;
+ xin->vlan_tci = flow->vlan_tci;
xin->packet = packet;
xin->may_learn = packet != NULL;
xin->rule = rule;
diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h
index a54a9e4..6ce3b31 100644
--- a/ofproto/ofproto-dpif-xlate.h
+++ b/ofproto/ofproto-dpif-xlate.h
@@ -60,6 +60,11 @@ struct xlate_in {
* this flow when actions change header fields. */
struct flow flow;
+ /* If MPLS and VLAN actions were both present in the translation, and VLAN
+ * actions should occur after the MPLS actions, then this field is used
+ * to store the final vlan_tci state. */
+ ovs_be16 vlan_tci;
+
/* The packet corresponding to 'flow', or a null pointer if we are
* revalidating without a packet to refer to. */
const struct ofpbuf *packet;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 652304e..c07c64e 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -869,6 +869,215 @@ done
OVS_VSWITCHD_STOP
AT_CLEANUP
+AT_SETUP([ofproto-dpif - OF1.2 VLAN+MPLS handling])
+OVS_VSWITCHD_START([dnl
+ add-port br0 p1 -- set Interface p1 type=dummy
+])
+ON_EXIT([kill `cat ovs-ofctl.pid`])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_DATA([flows.txt], [dnl
+cookie=0xa dl_src=40:44:44:44:54:50 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],push_vlan:0x8100,mod_vlan_vid:99,mod_vlan_pcp:1,controller
+cookie=0xa dl_src=40:44:44:44:54:51 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],push_vlan:0x8100,mod_vlan_vid:99,mod_vlan_pcp:1,controller
+cookie=0xa dl_src=40:44:44:44:54:52 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,controller
+cookie=0xa dl_src=40:44:44:44:54:53 actions=push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,controller
+cookie=0xa dl_src=40:44:44:44:54:54 actions=push_vlan:0x8100,mod_vlan_vid:99,mod_vlan_pcp:1,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],controller
+cookie=0xa dl_src=40:44:44:44:54:55 actions=push_vlan:0x8100,mod_vlan_vid:99,mod_vlan_pcp:1,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],controller
+cookie=0xa dl_src=40:44:44:44:54:56 actions=push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],controller
+cookie=0xa dl_src=40:44:44:44:54:57 actions=push_vlan:0x8100,load:99->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,push_mpls:0x8847,load:10->OXM_OF_MPLS_LABEL[[]],controller
+])
+AT_CHECK([ovs-ofctl --protocols=OpenFlow12 add-flows br0 flows.txt])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the MPLS tag before pushing a VLAN tag, so we see
+dnl both of these in the final flow.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:50,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:50,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:51,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:51,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the MPLS tag before pushing a VLAN tag, so we see
+dnl both of these in the final flow
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:52,dst=52:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:52,dl_dst=52:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:52,dl_dst=52:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:52,dl_dst=52:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:53,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:53,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:53,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:53,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the VLAN tag before pushing a MPLS tag, but these
+dnl actions are reordered, so we see both of these in the final flow.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:54,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:54,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:54,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:54,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:55,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:55,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:55,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:55,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, we push the VLAN tag before pushing a MPLS tag, but these
+dnl actions are reordered, so we see both of these in the final flow.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:56,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no)'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:56,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:56,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=68 in_port=1 (via action) data_len=68 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:56,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+dnl Modified MPLS controller action.
+dnl In this test, the input packet is vlan-tagged, which should be stripped
+dnl before we push the MPLS and VLAN tags.
+AT_CHECK([ovs-ofctl monitor br0 65534 -P nxm --detach --pidfile 2> ofctl_monitor.log])
+
+for i in 1 2 3; do
+ ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=40:44:44:44:54:57,dst=50:54:00:00:00:07),eth_type(0x8100),vlan(vid=88,pcp=7),encap(eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no))'
+done
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 6])
+ovs-appctl -t ovs-ofctl exit
+
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:57,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:57,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0xa total_len=64 in_port=1 (via action) data_len=64 (unbuffered)
+mpls,metadata=0,in_port=0,dl_vlan=99,dl_vlan_pcp=1,dl_src=40:44:44:44:54:57,dl_dst=50:54:00:00:00:07,mpls_label=10,mpls_tc=0,mpls_ttl=64,mpls_bos=1
+])
+
+AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
+AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:50 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],mod_vlan_vid:99,mod_vlan_pcp:1,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:51 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],mod_vlan_vid:99,mod_vlan_pcp:1,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:52 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:53 actions=push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:54 actions=mod_vlan_vid:99,mod_vlan_pcp:1,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:55 actions=mod_vlan_vid:99,mod_vlan_pcp:1,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:56 actions=load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],CONTROLLER:65535
+ cookie=0xa, n_packets=3, n_bytes=180, dl_src=40:44:44:44:54:57 actions=load:0x63->OXM_OF_VLAN_VID[[]],mod_vlan_pcp:1,push_mpls:0x8847,load:0xa->OXM_OF_MPLS_LABEL[[]],CONTROLLER:65535
+NXST_FLOW reply:
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
AT_SETUP([ofproto-dpif - fragment handling])
OVS_VSWITCHD_START
ADD_OF_PORTS([br0], [1], [2], [3], [4], [5], [6], [90])
--
1.8.4
* [PATCH v2.40 1/7] odp: Only pass vlan_tci to commit_vlan_action()
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
In-Reply-To: <1380241116-7661-1-git-send-email-horms@verge.net.au>
From: Joe Stringer <joe@wand.net.nz>
This allows for future patches to pass different tci values to
commit_vlan_action() without passing an entire flow structure.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v2.36 - v2.39
* No change
v2.35
* First post
---
lib/odp-util.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 85256b7..0785c6a 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -3318,10 +3318,10 @@ commit_set_ether_addr_action(const struct flow *flow, struct flow *base,
}
static void
-commit_vlan_action(const struct flow *flow, struct flow *base,
+commit_vlan_action(ovs_be16 vlan_tci, struct flow *base,
struct ofpbuf *odp_actions, struct flow_wildcards *wc)
{
- if (base->vlan_tci == flow->vlan_tci) {
+ if (base->vlan_tci == vlan_tci) {
return;
}
@@ -3331,15 +3331,15 @@ commit_vlan_action(const struct flow *flow, struct flow *base,
nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_VLAN);
}
- if (flow->vlan_tci & htons(VLAN_CFI)) {
+ if (vlan_tci & htons(VLAN_CFI)) {
struct ovs_action_push_vlan vlan;
vlan.vlan_tpid = htons(ETH_TYPE_VLAN);
- vlan.vlan_tci = flow->vlan_tci;
+ vlan.vlan_tci = vlan_tci;
nl_msg_put_unspec(odp_actions, OVS_ACTION_ATTR_PUSH_VLAN,
&vlan, sizeof vlan);
}
- base->vlan_tci = flow->vlan_tci;
+ base->vlan_tci = vlan_tci;
}
static void
@@ -3556,7 +3556,7 @@ commit_odp_actions(const struct flow *flow, struct flow *base,
int *mpls_depth_delta)
{
commit_set_ether_addr_action(flow, base, odp_actions, wc);
- commit_vlan_action(flow, base, odp_actions, wc);
+ commit_vlan_action(flow->vlan_tci, base, odp_actions, wc);
commit_set_nw_action(flow, base, odp_actions, wc);
commit_set_port_action(flow, base, odp_actions, wc);
/* Committing MPLS actions should occur after committing nw and port
--
1.8.4
* [PATCH v2.40 0/7] MPLS actions and matches
From: Simon Horman @ 2013-09-27 0:18 UTC (permalink / raw)
To: dev, netdev, Jesse Gross, Ben Pfaff
Cc: Pravin B Shelar, Ravi K, Isaku Yamahata, Joe Stringer
Hi,
This series implements MPLS actions and matches based on work by
Ravi K, Leo Alterman, Yamahata-san and Joe Stringer.
This series provides two changes:
* Patches 1 - 5
Provide user-space support for the VLAN/MPLS tag insertion order
up to and including OpenFlow 1.2, and the different ordering
specified from OpenFlow 1.3. In a nutshell the datapath always
uses the OpenFlow 1.3 ordering, which is to always insert tags
immediately after the L2 header, regardless of the presence of other
tags. And ovs-vswitchd provides compatibility for the behaviour up
to OpenFlow 1.2, which is that MPLS tags should follow VLAN tags
if present.
Ben, these are for you to review.
* Patches 6 and 7
Add basic MPLS action and match support to the kernel datapath.
Jesse, these are for you to review.
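As a rough illustration of the two tag orderings described under patches 1 - 5, the C sketch below computes the byte offset at which a new MPLS LSE would be placed. It is a simplified model and not code from this series: it assumes a 14-byte Ethernet header with each VLAN tag occupying a further 4 bytes, and the names tag_order and mpls_push_offset() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HEADER_LEN 14 /* dst MAC + src MAC + EtherType/TPID */
#define VLAN_TAG_LEN    4 /* bytes added to the frame per VLAN tag */

/* Hypothetical names, used for illustration only. */
enum tag_order {
    TAG_ORDER_OF12, /* up to OpenFlow 1.2: MPLS follows any VLAN tags */
    TAG_ORDER_OF13  /* from OpenFlow 1.3: MPLS directly after the L2 header */
};

/* Return the byte offset at which a pushed MPLS LSE lands in the frame. */
size_t mpls_push_offset(enum tag_order order, unsigned int n_vlan_tags)
{
    /* The simplified model only covers untagged, single and double tagging. */
    assert(n_vlan_tags <= 2);

    if (order == TAG_ORDER_OF13) {
        /* Insert immediately after the L2 header, ignoring other tags. */
        return ETH_HEADER_LEN;
    }
    /* OpenFlow 1.2 and earlier: skip over the VLAN tags that are present. */
    return ETH_HEADER_LEN + (size_t)n_vlan_tags * VLAN_TAG_LEN;
}
```

For an untagged frame both orderings give the same offset; they differ only when a VLAN tag is already present.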
Differences between v2.40 and v2.39:
* Rebase for:
+ New dev_queue_xmit compat code
+ Updated put_vlan()
+ Removal of mpls_depth field from struct flow
* As suggested by Jesse Gross
+ Remove bogus mac_len update from push_mpls()
+ Slightly simplify push_mpls() by using eth_hdr()
+ Remove dubious condition !eth_p_mpls(inner_protocol) on
an skb being considered to be MPLS in netdev_send()
+ Only use compatibility code for MPLS GSO segmentation on kernels
older than 3.11
+ Revamp setting of inner_protocol
1. Do not unconditionally set inner_protocol to the value of
skb->protocol in ovs_execute_actions().
2. Initialise inner_protocol to zero only if compatibility code is in
use. In the case where compatibility code is not in use it will either
be zero since the allocation of the skb or some other value set
by some other user.
3. Conditionally set the inner_protocol in push_mpls() to the value of
skb->protocol when entering push_mpls(). The condition is that
inner_protocol is zero and the value of skb->protocol is not an MPLS
ethernet type.
- This new scheme:
+ Pushes logic to set inner_protocol closer to the case where it is
needed.
+ Avoids over-writing values set by other users.
* As suggested by Pravin Shelar
+ Only set and restore skb->protocol in rpl___skb_gso_segment() in the
case of MPLS
+ Add inner_protocol field to struct ovs_gso_cb instead of ovs_skb_cb.
This moves compatibility code closer to where it is used
and creates fewer differences with mainline.
* Update comment on mac_len updates in datapath/actions.c
* Remove HAVE_INNER_PROCOTOL and instead just check
against kernel version 3.11 directly.
HAVE_INNER_PROCOTOL is a hang-over from work done prior
to the merge of inner_protocol into the kernel.
* Remove dubious condition !eth_p_mpls(inner_protocol) on
using inner_protocol as the type in rpl_skb_network_protocol()
* Do not update type of features in rpl_dev_queue_xmit.
Though arguably correct this is not an inherent part of
the changes made by this patch.
* Use skb_cow_head() in push_mpls()
+ Call skb_cow_head(skb, MPLS_HLEN) instead of
make_writable(skb, skb->mac_len) to ensure that there is enough head
room to push an MPLS LSE regardless of whether the skb is cloned or not.
+ This is consistent with the behaviour of rpl__vlan_put_tag().
+ This is a fix for crashes reported when performing mpls_push
with headroom less than 4. This problem was introduced in v2.36.
* Skip popping in mpls_pop if the skb is too short to contain an MPLS LSE
Differences between v2.39 and v2.38:
* Rebase for removal of vlan, checksum and skb->mark compat code
- This includes adding a new patch,
"[PATCH v2.39 6/7] datapath: Break out deacceleration portion of
vlan_push" to allow re-use of some existing code.
Differences between v2.38 and v2.37:
* Rebase for SCTP support
* Refactor validate_tp_port() to iterate over eth_types rather
than open-coding the loop. With the addition of SCTP this logic
is now used three times.
Differences between v2.37 and v2.36:
* Rebase
Differences between v2.36 and v2.35:
* Rebase
* Do not add set_ethertype() to datapath/actions.c.
As this patch has evolved this function has devolved into
two sets of functionality wrapped into a single function with
only one line of common code. Refactor things to simply
open-code setting the ether type in the two locations where
set_ethertype() was previously used. The aim here is to improve
readability.
* Update setting skb->ethertype after mpls push and pop.
- In the case of push_mpls it should be set unconditionally
as in v2.35 the behaviour of this function is to always push
an MPLS LSE before any VLAN tags.
- In the case of mpls_pop eth_p_mpls(skb->protocol) is a better
test than skb->protocol != htons(ETH_P_8021Q) as it will give the
correct behaviour in the presence of other VLAN ethernet types,
for example 0x88a8 which is used by 802.1ad. Moreover, it seems
correct to update the ethernet type if it was previously set
according to the top-most MPLS LSE.
* Deaccelerate VLANs when pushing MPLS tags
- Since v2.35 MPLS push will insert an MPLS LSE before any VLAN tags.
This means that if an accelerated tag is present it should be
deaccelerated to ensure it ends up in the correct position.
* Update skb->mac_len in push_mpls() so that it will be correct
when used by a subsequent call to pop_mpls().
As things stand I do not believe this is strictly necessary as
ovs-vswitchd will not send a pop MPLS action after a push MPLS action.
However, I have added this in order to code more defensively as I believe
that if such a sequence did occur it would be rather unobvious why
it didn't work.
* Do not add skb_cow_head() call in push_mpls().
It is unnecessary as there is a make_writable() call.
This change was also made in v2.30 but somehow the
code regressed between then and v2.35.
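To illustrate why eth_p_mpls(skb->protocol) is the better test in mpls_pop, the C sketch below compares the two conditions for the 802.1ad ethernet type 0x88a8. It is a user-space approximation and not the datapath code: the helper names are hypothetical, and values are compared in host byte order for simplicity, whereas the kernel compares network-order values.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETHERTYPE_VLAN_8021Q  0x8100 /* 802.1Q VLAN */
#define ETHERTYPE_VLAN_8021AD 0x88a8 /* 802.1ad (QinQ) VLAN */
#define ETHERTYPE_MPLS_UC     0x8847 /* MPLS unicast */
#define ETHERTYPE_MPLS_MC     0x8848 /* MPLS multicast */

/* Hypothetical stand-in for eth_p_mpls(): true only for MPLS types. */
bool is_mpls_ethertype(uint16_t eth_type)
{
    return eth_type == ETHERTYPE_MPLS_UC || eth_type == ETHERTYPE_MPLS_MC;
}

/* The older style of test: anything that is not 802.1Q passes. */
bool is_not_8021q(uint16_t eth_type)
{
    return eth_type != ETHERTYPE_VLAN_8021Q;
}
```

With ETHERTYPE_VLAN_8021AD as input, is_not_8021q() returns true even though the frame is VLAN-tagged, while is_mpls_ethertype() returns false, which is the distinction the changelog entry above relies on.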
Differences between v2.35 and v2.34:
* Add support for the tag ordering specified up until OpenFlow 1.2 and
the ordering specified from OpenFlow 1.3.
* Correct error in datapath patch's handling of GSO in the presence
of MPLS and absence of VLANs.
Pre-requisites.
This series applies on top of "[PATCH v3] Remove mpls_depth field from flow"
To aid review this series and its pre-requisite is available in git at:
git://github.com/horms/openvswitch.git devel/mpls-v2.40
Patch list and overall diffstat:
Joe Stringer (5):
odp: Only pass vlan_tci to commit_vlan_action()
odp: Allow VLAN actions after MPLS actions
ofp-actions: Add OFPUTIL_OFPAT13_PUSH_MPLS
ofp-actions: Add separate OpenFlow 1.3 action parser
lib: Push MPLS tags in the OpenFlow 1.3 ordering
Simon Horman (2):
datapath: Break out deacceleration portion of vlan_push
datapath: Add basic MPLS support to kernel
datapath/Modules.mk | 1 +
datapath/actions.c | 156 ++++++++-
datapath/datapath.c | 259 ++++++++++++--
datapath/datapath.h | 2 +
datapath/flow.c | 58 ++-
datapath/flow.h | 17 +-
datapath/linux/compat/gso.c | 117 ++++++-
datapath/linux/compat/gso.h | 53 +++
datapath/linux/compat/include/linux/netdevice.h | 14 +-
datapath/linux/compat/netdevice.c | 28 --
datapath/mpls.h | 15 +
include/linux/openvswitch.h | 7 +-
lib/flow.c | 2 +-
lib/odp-util.c | 21 +-
lib/odp-util.h | 2 +-
lib/ofp-actions.c | 68 +++-
lib/ofp-parse.c | 1 +
lib/ofp-util.c | 3 +
lib/ofp-util.h | 1 +
lib/packets.c | 10 +-
lib/packets.h | 2 +-
ofproto/ofproto-dpif-xlate.c | 98 ++++--
ofproto/ofproto-dpif-xlate.h | 5 +
tests/ofproto-dpif.at | 446 ++++++++++++++++++++++++
24 files changed, 1246 insertions(+), 140 deletions(-)
create mode 100644 datapath/mpls.h