* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-11 11:01 UTC (permalink / raw)
To: Varlese, Marco
Cc: John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R,
roopa@cumulusnetworks.com, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC3914@IRSMSX108.ger.corp.intel.com>
Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
>> -----Original Message-----
>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>> Sent: Wednesday, December 10, 2014 5:04 PM
>> To: Jiri Pirko
>> Cc: Varlese, Marco; netdev@vger.kernel.org;
>> stephen@networkplumber.org; Fastabend, John R;
>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> configuration
>>
>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
>> > Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com wrote:
>> >> From: Marco Varlese <marco.varlese@intel.com>
>> >>
>> >> Switch hardware offers a list of attributes that are configurable on
>> >> a per port basis.
>> >> This patch provides a mechanism to configure switch ports by adding
>> >> an NDO for setting specific values to specific attributes.
>> >> There will be a separate patch that extends iproute2 to call the new
>> >> NDO.
>> >
>> >
>> > What are these attributes? Can you give some examples. I'm asking
>> > because there is a plan to pass generic attributes to switch ports
>> > replacing current specific ndo_switch_port_stp_update. In this case,
>> > bridge is setting that attribute.
>> >
>> > Is there need to set something directly from userspace or does it make
>> > rather sense to use involved bridge/ovs/bond ? I think that both will
>> > be needed.
>>
>> +1
>>
>> I think for many attributes it would be best to have both. The in kernel callers
>> and netlink userspace can use the same driver ndo_ops.
>>
>> But then we don't _require_ any specific bridge/ovs/etc module. And we
>> may have some attributes that are not specific to any existing software
>> module. I'm guessing Marco has some examples of these.
>>
>> [...]
>>
>>
>> --
>> John Fastabend Intel Corporation
>
>We do have a need to configure the attributes directly from user-space and I have identified the tool to do that in iproute2.
>
>Examples of such attributes are:
>* enabling/disabling of learning of source addresses on a given port (you can imagine the attribute called LEARNING for example);
>* internal loopback control (i.e. LOOPBACK) which will control how the flow of traffic behaves from the switch fabric towards an egress port;
>* flooding for broadcast/multicast/unicast type of packets (i.e. BFLOODING, MFLOODING, UFLOODING);
>
>Some attributes would be of the enabled/disabled type, while others will accept specific values so that the user can configure different behaviours of that feature on that particular port on that platform.
>
>One thing to mention - as John stated as well - there might be some attributes that are not specific to any software module but rather have to do with the actual hardware/platform to configure.
>
>I hope this clarifies some points.
It does. Makes sense. We need to expose this attr set/get for both
in-kernel and userspace use cases.
Please adjust your patch for this. Also, as a second patch, it would be
great if you could convert ndo_switch_port_stp_update to this new ndo.
Thanks.
>
>-----------------------------------------------------------
>Marco Varlese - Intel Corporation
>-----------------------------------------------------------
>
>
^ permalink raw reply
* Re: [PATCH iproute2] ip: Simplify executing ip cmd within namespace
From: vadim4j @ 2014-12-11 10:57 UTC (permalink / raw)
To: Nicolas Dichtel; +Cc: Vadim Kochan, netdev
In-Reply-To: <548978CD.80404@6wind.com>
On Thu, Dec 11, 2014 at 11:58:21AM +0100, Nicolas Dichtel wrote:
> Le 10/12/2014 23:56, Vadim Kochan a écrit :
> >From: Vadim Kochan <vadim4j@gmail.com>
> >
> >Added new '-ns' option to simplify executing following cmd:
> >
> > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> >
> > to
> >
> > ip -ns NETNS OPTIONS COMMAND OBJECT
> >
> >e.g.:
> >
> > ip -ns vnet0 link add br0 type bridge
> >
> >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> >---
> >May be new option should have better name than '-ns' ?
> What about 'ip -netns' to be explicit like other options?
> user may still use 'ip -n' at the end.
>
>
> Regards,
> Nicolas
Maybe leave '-n' for some other future option, and use the following
options instead: -net[ns] and -ns? What do you think?
Thanks,
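To make the "-net[ns] and -ns" proposal concrete, here is a small sketch of how such abbreviations could be matched while leaving '-n' unclaimed. The helper names opt_abbrev() and is_netns_opt() are hypothetical and this is not iproute2 code; it only illustrates prefix matching with a minimum length.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if arg is a prefix of full and at least min_len characters long. */
static bool opt_abbrev(const char *arg, const char *full, size_t min_len)
{
	size_t len;

	assert(arg != NULL && full != NULL);
	len = strlen(arg);
	if (len < min_len || len > strlen(full))
		return false;
	return strncmp(arg, full, len) == 0;
}

/* Accept -net, -netn, -netns and the exact short form -ns; reject -n. */
static bool is_netns_opt(const char *arg)
{
	return opt_abbrev(arg, "-netns", 4) || strcmp(arg, "-ns") == 0;
}
```

With this rule '-n' and '-ne' match nothing, so they stay free for a future option.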
^ permalink raw reply
* Re: default enable sparse __CHECK_ENDIAN__ (was: Re: [PATCH v7 2/3] net: Add Keystone NetCP ethernet driver)
From: Marcel Holtmann @ 2014-12-11 11:18 UTC (permalink / raw)
To: Joe Perches
Cc: David S. Miller, Andrew Morton, Christopher Li, Michal Marek,
m-karicheri2, Network Development, linux-arm-kernel, kernel list,
devicetree, Rob Herring, grant.likely, linux-sparse
In-Reply-To: <1418267099.18092.28.camel@perches.com>
Hi Joe,
>>> Are you referring to the static code analyser sparse that is invoked
>>> through?
>> You have to explicitly enable endian checking, it's not on by
>> default.
>
> There don't seem to be thousands of warnings anymore.
>
> Maybe it's time to default enable it when using C=?
>
> from: Documentation/sparse.txt:
>
> The optional make variable CF can be used to pass arguments to sparse. The
> build system passes -Wbitwise to sparse automatically. To perform endianness
> checks, you may define __CHECK_ENDIAN__:
>
> make C=2 CF="-D__CHECK_ENDIAN__"
>
> These checks are disabled by default as they generate a host of warnings.
actually a few subsystems use this in their Makefile:
subdir-ccflags-y += -D__CHECK_ENDIAN__
We could start with that to enable endian checks by default in various places.
Regards
Marcel
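For readers unfamiliar with what the define changes: a rough userspace sketch of the mechanism, assuming sparse's bitwise and force attributes and its predefined __CHECKER__ macro. The demo_* names are made up for this example; the kernel's own types and helpers are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Only sparse with endian checks enabled sees the attributes;
 * a regular compiler sees a plain uint32_t. */
#if defined(__CHECKER__) && defined(__CHECK_ENDIAN__)
#define demo_bitwise	__attribute__((bitwise))
#define demo_force	__attribute__((force))
#else
#define demo_bitwise
#define demo_force
#endif

typedef uint32_t demo_bitwise demo_le32;

/* Build the little-endian byte layout regardless of host byte order. */
static demo_le32 demo_cpu_to_le32(uint32_t v)
{
	unsigned char b[4];
	uint32_t raw;

	assert(sizeof(raw) == sizeof(b));
	b[0] = (unsigned char)(v & 0xff);
	b[1] = (unsigned char)((v >> 8) & 0xff);
	b[2] = (unsigned char)((v >> 16) & 0xff);
	b[3] = (unsigned char)((v >> 24) & 0xff);
	memcpy(&raw, b, sizeof(raw));
	return (demo_force demo_le32)raw;	/* explicit, checked conversion */
}

/* Reverse conversion: read the bytes back in little-endian order. */
static uint32_t demo_le32_to_cpu(demo_le32 v)
{
	unsigned char b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

With the checks enabled, assigning a demo_le32 to a plain integer without going through the conversion helper is the kind of mix-up sparse is meant to report.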
^ permalink raw reply
* Re: OOPS: net/ipv6/datagram.c (line 260) ipv6_local_error
From: Steffen Klassert @ 2014-12-11 11:37 UTC (permalink / raw)
To: Chris Ruehl; +Cc: netdev, davem
In-Reply-To: <5487CDE9.4070606@gtsys.com.hk>
On Wed, Dec 10, 2014 at 12:36:57PM +0800, Chris Ruehl wrote:
> Hi all,
>
> We are running a Dell server which crashes frequently (Dell crash
> video snapshot) with vanilla 3.14.25
>
>
>
> Sadly the capture doesn't show the full trace, so we lack
> information.
> 1st line I can see in the crash video from the idrac :
> tcp_transmit_skb+0x461
>
> The null pointer dereference happens:
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from net/ipv6/datagram.o...done.
> (gdb) list *(ipv6_local_error+0x17)
> 0xae7 is in ipv6_local_error (net/ipv6/datagram.c:260).
> 255 struct ipv6_pinfo *np = inet6_sk(sk);
> 256 struct sock_exterr_skb *serr;
> 257 struct ipv6hdr *iph;
> 258 struct sk_buff *skb;
> 259
> 260 if (!np->recverr)
> 261 return;
> 262
> 263 skb = alloc_skb(sizeof(struct ipv6hdr), GFP_ATOMIC);
> 264 if (!skb)
> (gdb) quit
>
>
> We are running 6in4 with an ipsec tunnel on the 6. I found a pull request from
> Steffen Klassert
> here:
> http://article.gmane.org/gmane.linux.network/281469
>
> Which might be relevant to this problem.
>
> For the time being I added a
>
> if (np == NULL){
> LIMIT_NETDEBUG(KERN_DEBUG "ipv6_pinfo is NULL\n");
> return;
> }
>
> as a workaround to stop the server crashing
Looks like ipv6_local_error() got an ipv4 socket. You could
extend your workaround to something like the below. This
should give a full backtrace and the socket family.
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index cc11396..cf3a5d8 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -258,6 +258,13 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info)
struct ipv6hdr *iph;
struct sk_buff *skb;
+ if (np == NULL) {
+ WARN_ON_ONCE(1);
+ if (net_ratelimit())
+ printk(KERN_DEBUG "ipv6_pinfo is NULL, sk family %d\n", sk->sk_family);
+ return;
+ }
+
if (!np->recverr)
return;
^ permalink raw reply related
* [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
To: zhangfei.gao-QSEj5FYQhm4dnm+yROfE0A, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
linux-lFZ/pmaqli7XmaaqVzeoHQ, arnd-r2nGTMty4D4,
f.fainelli-Re5JQEeQqe8AvxtiuMwx3w,
sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8,
mark.rutland-5wv7dgnIgG8, David.Laight-ZS65k/vG3HxXrIkS9f7CXA,
eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
xuwei5-C8/M+/jPZTeaMJb+Lgu22Q
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA
v9:
- There are no tx completion interrupts to free DMAed Tx packets, which means that
we rely on new tx packets arriving to run the destructors of completed packets,
which opens up space in their sockets' send queues. Sometimes no such new
packets arrive, causing Tx to stall; a single UDP transmitter is a good example of
this situation. So we need a cleanup workqueue to reclaim completed packets;
the workqueue only frees the last packets once they have been pending for several jiffies.
Also fix some formatting issues.
v8:
- Use poll to reclaim xmitted buffer as workaround since no tx done interrupt
v7:
- Remove select NET_CORE in 0002
v6:
- Suggested by Russell: Use netdev_sent_queue & netdev_completed_queue to solve latency issue
Also shorten the period of the timer, which is used to wake up the queue since
there is no tx completion interrupt.
v5:
no big change, fix typo
v4:
- Modify according to the suggestions from Arnd, Florian, Eric, David
Use of_parse_phandle_with_fixed_args & syscon_node_to_regmap get ppe info
Add skb_orphan() and tx_timer for reclaim since no tx_finished interrupt
Update timeout, and move of_phy_connect to probe to reuse open/stop
v3:
- Suggested by Arnd: use syscon & regmap_write/read to replace static void __iomem *ppebase.
Modify hisilicon-hip04-net.txt according to suggestions from Florian and Sergei.
v2:
- Got many suggestions from Russell, Arnd, Florian, Mark and Sergei
Remove memcpy, use dma_map/unmap_single, use dma_alloc_coherent rather than dma_pool, etc.
Refer property in ethernet.txt, change ppe description, etc.
Zhangfei Gao (3):
Documentation: add Device tree bindings for Hisilicon hip04 ethernet
net: hisilicon: new hip04 MDIO driver
net: hisilicon: new hip04 ethernet driver
.../bindings/net/hisilicon-hip04-net.txt | 88 +++
drivers/net/ethernet/hisilicon/Kconfig | 9 +
drivers/net/ethernet/hisilicon/Makefile | 1 +
drivers/net/ethernet/hisilicon/hip04_eth.c | 876 +++++++++++++++++++++
drivers/net/ethernet/hisilicon/hip04_mdio.c | 186 +++++
5 files changed, 1160 insertions(+)
create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
create mode 100644 drivers/net/ethernet/hisilicon/hip04_eth.c
create mode 100644 drivers/net/ethernet/hisilicon/hip04_mdio.c
--
1.8.0
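The v9 note describes reclaiming completed Tx descriptors from a delayed work item when no new transmit has arrived for a while. Below is a minimal userspace model of that idea, not the driver code itself: struct tx_ring, tx_queue(), tx_reclaim() and tx_clean_monitor() are names invented for the sketch, ticks stand in for jiffies, and a descriptor counts as completed once its send_addr has been cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TX_DESC_NUM	256		/* power of two, so masking wraps */
#define TX_NEXT(n)	(((n) + 1) & (TX_DESC_NUM - 1))

struct tx_ring {
	uint32_t send_addr[TX_DESC_NUM];	/* nonzero while in flight */
	unsigned int head;			/* next slot to fill */
	unsigned int tail;			/* oldest unreclaimed slot */
	unsigned int count;			/* slots currently in flight */
	unsigned long last_tx;			/* tick of the last transmit */
};

/* Queue one packet; addr must be nonzero. Returns false if the ring is full. */
static bool tx_queue(struct tx_ring *r, uint32_t addr, unsigned long now)
{
	assert(addr != 0);
	if (r->count >= TX_DESC_NUM)
		return false;
	r->send_addr[r->head] = addr;
	r->head = TX_NEXT(r->head);
	r->count++;
	r->last_tx = now;
	return true;
}

/* Free slots from the tail until one is still owned by the hardware. */
static unsigned int tx_reclaim(struct tx_ring *r)
{
	unsigned int freed = 0;

	while (r->count > 0 && r->send_addr[r->tail] == 0) {
		r->tail = TX_NEXT(r->tail);
		r->count--;
		freed++;
	}
	return freed;
}

/* Deferred cleanup: only reclaim if the transmit path has been idle. */
static unsigned int tx_clean_monitor(struct tx_ring *r, unsigned long now,
				     unsigned long delay)
{
	if (now - r->last_tx <= delay)
		return 0;	/* a recent transmit already ran the reclaim */
	return tx_reclaim(r);
}
```

The idle check is what keeps the periodic work from competing with the normal transmit path, which reclaims on its own whenever new packets arrive.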
^ permalink raw reply
* [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
mark.rutland, David.Laight, eric.dumazet, xuwei5
Cc: linux-arm-kernel, netdev, devicetree
In-Reply-To: <1418298150-4944-1-git-send-email-dingtianhong@huawei.com>
From: Zhangfei Gao <zhangfei.gao@linaro.org>
This patch adds the Device Tree bindings for the Hisilicon hip04
Ethernet controller, including 100M / 1000M controller.
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
.../bindings/net/hisilicon-hip04-net.txt | 88 ++++++++++++++++++++++
1 file changed, 88 insertions(+)
create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
new file mode 100644
index 0000000..988fc69
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
@@ -0,0 +1,88 @@
+Hisilicon hip04 Ethernet Controller
+
+* Ethernet controller node
+
+Required properties:
+- compatible: should be "hisilicon,hip04-mac".
+- reg: address and length of the register set for the device.
+- interrupts: interrupt for the device.
+- port-handle: <phandle port channel>
+ phandle, specifies a reference to the syscon ppe node
+ port, port number connected to the controller
+ channel, recv channel start from channel * number (RX_DESC_NUM)
+- phy-mode: see ethernet.txt [1].
+
+Optional properties:
+- phy-handle: see ethernet.txt [1].
+
+[1] Documentation/devicetree/bindings/net/ethernet.txt
+
+
+* Ethernet ppe node:
+Control rx & tx fifos of all ethernet controllers.
+Has 2048 recv channels shared by all ethernet controllers; ranges must not overlap.
+Each controller's recv channels start from channel * number (RX_DESC_NUM).
+
+Required properties:
+- compatible: "hisilicon,hip04-ppe", "syscon".
+- reg: address and length of the register set for the device.
+
+
+* MDIO bus node:
+
+Required properties:
+
+- compatible: should be "hisilicon,hip04-mdio".
+- Inherits from MDIO bus node binding [2]
+[2] Documentation/devicetree/bindings/net/phy.txt
+
+Example:
+ mdio {
+ compatible = "hisilicon,hip04-mdio";
+ reg = <0x28f1000 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ phy0: ethernet-phy@0 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <0>;
+ marvell,reg-init = <18 0x14 0 0x8001>;
+ };
+
+ phy1: ethernet-phy@1 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <1>;
+ marvell,reg-init = <18 0x14 0 0x8001>;
+ };
+ };
+
+ ppe: ppe@28c0000 {
+ compatible = "hisilicon,hip04-ppe", "syscon";
+ reg = <0x28c0000 0x10000>;
+ };
+
+ fe: ethernet@28b0000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x28b0000 0x10000>;
+ interrupts = <0 413 4>;
+ phy-mode = "mii";
+ port-handle = <&ppe 31 0>;
+ };
+
+ ge0: ethernet@2800000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x2800000 0x10000>;
+ interrupts = <0 402 4>;
+ phy-mode = "sgmii";
+ port-handle = <&ppe 0 1>;
+ phy-handle = <&phy0>;
+ };
+
+ ge8: ethernet@2880000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x2880000 0x10000>;
+ interrupts = <0 410 4>;
+ phy-mode = "sgmii";
+ port-handle = <&ppe 8 2>;
+ phy-handle = <&phy1>;
+ };
--
1.8.0
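The binding says each controller's recv channels start at channel * RX_DESC_NUM and that the 2048 shared channels must not overlap. A small sketch of that arithmetic follows, assuming RX_DESC_NUM is 128 as in the driver patch of this series; the helper names and PPE_CHAN_TOTAL are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define RX_DESC_NUM	128	/* per-controller window, as in the driver patch */
#define PPE_CHAN_TOTAL	2048	/* recv channels shared by all controllers */

/* First recv channel for a given "channel" cell of port-handle,
 * or -1 if the window would not fit into the shared pool. */
static int rx_chan_start(unsigned int channel)
{
	assert(PPE_CHAN_TOTAL % RX_DESC_NUM == 0);
	if (channel >= PPE_CHAN_TOTAL / RX_DESC_NUM)
		return -1;
	return (int)(channel * RX_DESC_NUM);
}

/* True if two windows of RX_DESC_NUM channels share any channel. */
static bool rx_windows_overlap(unsigned int start_a, unsigned int start_b)
{
	return start_a < start_b + RX_DESC_NUM &&
	       start_b < start_a + RX_DESC_NUM;
}
```

For the example nodes above (channel cells 0, 1 and 2) this yields start channels 0, 128 and 256, which do not overlap.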
^ permalink raw reply related
* [PATCH net-next v9 2/3] net: hisilicon: new hip04 MDIO driver
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
mark.rutland, David.Laight, eric.dumazet, xuwei5
Cc: linux-arm-kernel, netdev, devicetree
In-Reply-To: <1418298150-4944-1-git-send-email-dingtianhong@huawei.com>
From: Zhangfei Gao <zhangfei.gao@linaro.org>
Hisilicon hip04 platform mdio driver
Reuse Marvell phy drivers/net/phy/marvell.c
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/ethernet/hisilicon/Kconfig | 9 ++
drivers/net/ethernet/hisilicon/Makefile | 1 +
drivers/net/ethernet/hisilicon/hip04_mdio.c | 186 ++++++++++++++++++++++++++++
3 files changed, 196 insertions(+)
create mode 100644 drivers/net/ethernet/hisilicon/hip04_mdio.c
diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig
index e942173..a54d897 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -24,4 +24,13 @@ config HIX5HD2_GMAC
help
This selects the hix5hd2 mac family network device.
+config HIP04_ETH
+ tristate "HISILICON P04 Ethernet support"
+ select PHYLIB
+ select MARVELL_PHY
+ select MFD_SYSCON
+ ---help---
+ If you wish to compile a kernel for a hardware with hisilicon p04 SoC and
+ want to use the internal ethernet then you should answer Y to this.
+
endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
index 9175e846..40115a7 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -3,3 +3,4 @@
#
obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
+obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o
diff --git a/drivers/net/ethernet/hisilicon/hip04_mdio.c b/drivers/net/ethernet/hisilicon/hip04_mdio.c
new file mode 100644
index 0000000..b3bac25
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hip04_mdio.c
@@ -0,0 +1,186 @@
+/* Copyright (c) 2014 Linaro Ltd.
+ * Copyright (c) 2014 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
+#include <linux/of_mdio.h>
+#include <linux/delay.h>
+
+#define MDIO_CMD_REG 0x0
+#define MDIO_ADDR_REG 0x4
+#define MDIO_WDATA_REG 0x8
+#define MDIO_RDATA_REG 0xc
+#define MDIO_STA_REG 0x10
+
+#define MDIO_START BIT(14)
+#define MDIO_R_VALID BIT(1)
+#define MDIO_READ (BIT(12) | BIT(11) | MDIO_START)
+#define MDIO_WRITE (BIT(12) | BIT(10) | MDIO_START)
+
+struct hip04_mdio_priv {
+ void __iomem *base;
+};
+
+#define WAIT_TIMEOUT 10
+static int hip04_mdio_wait_ready(struct mii_bus *bus)
+{
+ struct hip04_mdio_priv *priv = bus->priv;
+ int i;
+
+ for (i = 0; readl_relaxed(priv->base + MDIO_CMD_REG) & MDIO_START; i++) {
+ if (i == WAIT_TIMEOUT)
+ return -ETIMEDOUT;
+ msleep(20);
+ }
+
+ return 0;
+}
+
+static int hip04_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
+{
+ struct hip04_mdio_priv *priv = bus->priv;
+ u32 val;
+ int ret;
+
+ ret = hip04_mdio_wait_ready(bus);
+ if (ret < 0)
+ goto out;
+
+ val = regnum | (mii_id << 5) | MDIO_READ;
+ writel_relaxed(val, priv->base + MDIO_CMD_REG);
+
+ ret = hip04_mdio_wait_ready(bus);
+ if (ret < 0)
+ goto out;
+
+ val = readl_relaxed(priv->base + MDIO_STA_REG);
+ if (val & MDIO_R_VALID) {
+ dev_err(bus->parent, "SMI bus read not valid\n");
+ ret = -ENODEV;
+ goto out;
+ }
+
+ val = readl_relaxed(priv->base + MDIO_RDATA_REG);
+ ret = val & 0xFFFF;
+out:
+ return ret;
+}
+
+static int hip04_mdio_write(struct mii_bus *bus, int mii_id,
+ int regnum, u16 value)
+{
+ struct hip04_mdio_priv *priv = bus->priv;
+ u32 val;
+ int ret;
+
+ ret = hip04_mdio_wait_ready(bus);
+ if (ret < 0)
+ goto out;
+
+ writel_relaxed(value, priv->base + MDIO_WDATA_REG);
+ val = regnum | (mii_id << 5) | MDIO_WRITE;
+ writel_relaxed(val, priv->base + MDIO_CMD_REG);
+out:
+ return ret;
+}
+
+static int hip04_mdio_reset(struct mii_bus *bus)
+{
+ int temp, i;
+
+ for (i = 0; i < PHY_MAX_ADDR; i++) {
+ hip04_mdio_write(bus, i, 22, 0);
+ temp = hip04_mdio_read(bus, i, MII_BMCR);
+ if (temp < 0)
+ continue;
+
+ temp |= BMCR_RESET;
+ if (hip04_mdio_write(bus, i, MII_BMCR, temp) < 0)
+ continue;
+ }
+
+ mdelay(500);
+ return 0;
+}
+
+static int hip04_mdio_probe(struct platform_device *pdev)
+{
+ struct resource *r;
+ struct mii_bus *bus;
+ struct hip04_mdio_priv *priv;
+ int ret;
+
+ bus = mdiobus_alloc_size(sizeof(struct hip04_mdio_priv));
+ if (!bus) {
+ dev_err(&pdev->dev, "Cannot allocate MDIO bus\n");
+ return -ENOMEM;
+ }
+
+ bus->name = "hip04_mdio_bus";
+ bus->read = hip04_mdio_read;
+ bus->write = hip04_mdio_write;
+ bus->reset = hip04_mdio_reset;
+ snprintf(bus->id, MII_BUS_ID_SIZE, "%s-mii", dev_name(&pdev->dev));
+ bus->parent = &pdev->dev;
+ priv = bus->priv;
+
+ r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ priv->base = devm_ioremap_resource(&pdev->dev, r);
+ if (IS_ERR(priv->base)) {
+ ret = PTR_ERR(priv->base);
+ goto out_mdio;
+ }
+
+ ret = of_mdiobus_register(bus, pdev->dev.of_node);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "Cannot register MDIO bus (%d)\n", ret);
+ goto out_mdio;
+ }
+
+ platform_set_drvdata(pdev, bus);
+
+ return 0;
+
+out_mdio:
+ mdiobus_free(bus);
+ return ret;
+}
+
+static int hip04_mdio_remove(struct platform_device *pdev)
+{
+ struct mii_bus *bus = platform_get_drvdata(pdev);
+
+ mdiobus_unregister(bus);
+ mdiobus_free(bus);
+
+ return 0;
+}
+
+static const struct of_device_id hip04_mdio_match[] = {
+ { .compatible = "hisilicon,hip04-mdio" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, hip04_mdio_match);
+
+static struct platform_driver hip04_mdio_driver = {
+ .probe = hip04_mdio_probe,
+ .remove = hip04_mdio_remove,
+ .driver = {
+ .name = "hip04-mdio",
+ .owner = THIS_MODULE,
+ .of_match_table = hip04_mdio_match,
+ },
+};
+
+module_platform_driver(hip04_mdio_driver);
+
+MODULE_DESCRIPTION("HISILICON P04 MDIO interface driver");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:hip04-mdio");
--
1.8.0
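For reference, the command word written to MDIO_CMD_REG in the patch above is "regnum | (mii_id << 5) | op". The standalone sketch below reuses the bit definitions from the patch; mdio_cmd() is a hypothetical helper, and the 5-bit limits assume Clause 22 addressing.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define MDIO_START	BIT(14)
#define MDIO_READ	(BIT(12) | BIT(11) | MDIO_START)
#define MDIO_WRITE	(BIT(12) | BIT(10) | MDIO_START)

/* Compose a command word: register in bits 4:0, PHY address in bits 9:5. */
static uint32_t mdio_cmd(unsigned int mii_id, unsigned int regnum, uint32_t op)
{
	assert(mii_id < 32 && regnum < 32);	/* Clause 22: 5 bits each */
	return regnum | (mii_id << 5) | op;
}
```

As a worked value, reading register 1 of PHY 1 gives 0x5800 | 0x20 | 0x1 = 0x5821.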
^ permalink raw reply related
* [PATCH net-next v9 3/3] net: hisilicon: new hip04 ethernet driver
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
mark.rutland, David.Laight, eric.dumazet, xuwei5
Cc: linux-arm-kernel, netdev, devicetree
In-Reply-To: <1418298150-4944-1-git-send-email-dingtianhong@huawei.com>
From: Zhangfei Gao <zhangfei.gao@linaro.org>
Add a driver for the Hisilicon hip04 ethernet controller, including the 100M / 1000M controller.
The controller has no tx done interrupt, so transmitted buffers are reclaimed in the poll.
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/ethernet/hisilicon/Makefile | 2 +-
drivers/net/ethernet/hisilicon/hip04_eth.c | 876 +++++++++++++++++++++++++++++
2 files changed, 877 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/hisilicon/hip04_eth.c
diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
index 40115a7..6c14540 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -3,4 +3,4 @@
#
obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
-obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o
+obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
new file mode 100644
index 0000000..9d37b67
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -0,0 +1,876 @@
+
+/* Copyright (c) 2014 Linaro Ltd.
+ * Copyright (c) 2014 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/etherdevice.h>
+#include <linux/platform_device.h>
+#include <linux/interrupt.h>
+#include <linux/of_address.h>
+#include <linux/phy.h>
+#include <linux/of_mdio.h>
+#include <linux/of_net.h>
+#include <linux/mfd/syscon.h>
+#include <linux/regmap.h>
+
+#define PPE_CFG_RX_ADDR 0x100
+#define PPE_CFG_POOL_GRP 0x300
+#define PPE_CFG_RX_BUF_SIZE 0x400
+#define PPE_CFG_RX_FIFO_SIZE 0x500
+#define PPE_CURR_BUF_CNT 0xa200
+
+#define GE_DUPLEX_TYPE 0x08
+#define GE_MAX_FRM_SIZE_REG 0x3c
+#define GE_PORT_MODE 0x40
+#define GE_PORT_EN 0x44
+#define GE_SHORT_RUNTS_THR_REG 0x50
+#define GE_TX_LOCAL_PAGE_REG 0x5c
+#define GE_TRANSMIT_CONTROL_REG 0x60
+#define GE_CF_CRC_STRIP_REG 0x1b0
+#define GE_MODE_CHANGE_REG 0x1b4
+#define GE_RECV_CONTROL_REG 0x1e0
+#define GE_STATION_MAC_ADDRESS 0x210
+#define PPE_CFG_CPU_ADD_ADDR 0x580
+#define PPE_CFG_MAX_FRAME_LEN_REG 0x408
+#define PPE_CFG_BUS_CTRL_REG 0x424
+#define PPE_CFG_RX_CTRL_REG 0x428
+#define PPE_CFG_RX_PKT_MODE_REG 0x438
+#define PPE_CFG_QOS_VMID_GEN 0x500
+#define PPE_CFG_RX_PKT_INT 0x538
+#define PPE_INTEN 0x600
+#define PPE_INTSTS 0x608
+#define PPE_RINT 0x604
+#define PPE_CFG_STS_MODE 0x700
+#define PPE_HIS_RX_PKT_CNT 0x804
+
+/* REG_INTERRUPT */
+#define RCV_INT BIT(10)
+#define RCV_NOBUF BIT(8)
+#define RCV_DROP BIT(7)
+#define TX_DROP BIT(6)
+#define DEF_INT_ERR (RCV_NOBUF | RCV_DROP | TX_DROP)
+#define DEF_INT_MASK (RCV_INT | DEF_INT_ERR)
+
+/* TX descriptor config */
+#define TX_FREE_MEM BIT(0)
+#define TX_READ_ALLOC_L3 BIT(1)
+#define TX_FINISH_CACHE_INV BIT(2)
+#define TX_CLEAR_WB BIT(4)
+#define TX_L3_CHECKSUM BIT(5)
+#define TX_LOOP_BACK BIT(11)
+
+/* RX error */
+#define RX_PKT_DROP BIT(0)
+#define RX_L2_ERR BIT(1)
+#define RX_PKT_ERR (RX_PKT_DROP | RX_L2_ERR)
+
+#define SGMII_SPEED_1000 0x08
+#define SGMII_SPEED_100 0x07
+#define SGMII_SPEED_10 0x06
+#define MII_SPEED_100 0x01
+#define MII_SPEED_10 0x00
+
+#define GE_DUPLEX_FULL BIT(0)
+#define GE_DUPLEX_HALF 0x00
+#define GE_MODE_CHANGE_EN BIT(0)
+
+#define GE_TX_AUTO_NEG BIT(5)
+#define GE_TX_ADD_CRC BIT(6)
+#define GE_TX_SHORT_PAD_THROUGH BIT(7)
+
+#define GE_RX_STRIP_CRC BIT(0)
+#define GE_RX_STRIP_PAD BIT(3)
+#define GE_RX_PAD_EN BIT(4)
+
+#define GE_AUTO_NEG_CTL BIT(0)
+
+#define GE_RX_INT_THRESHOLD BIT(6)
+#define GE_RX_TIMEOUT 0x04
+
+#define GE_RX_PORT_EN BIT(1)
+#define GE_TX_PORT_EN BIT(2)
+
+#define PPE_CFG_STS_RX_PKT_CNT_RC BIT(12)
+
+#define PPE_CFG_RX_PKT_ALIGN BIT(18)
+#define PPE_CFG_QOS_VMID_MODE BIT(14)
+#define PPE_CFG_QOS_VMID_GRP_SHIFT 8
+
+#define PPE_CFG_RX_FIFO_FSFU BIT(11)
+#define PPE_CFG_RX_DEPTH_SHIFT 16
+#define PPE_CFG_RX_START_SHIFT 0
+#define PPE_CFG_RX_CTRL_ALIGN_SHIFT 11
+
+#define PPE_CFG_BUS_LOCAL_REL BIT(14)
+#define PPE_CFG_BUS_BIG_ENDIEN BIT(0)
+
+#define RX_DESC_NUM 128
+#define TX_DESC_NUM 256
+#define TX_NEXT(N) (((N) + 1) & (TX_DESC_NUM-1))
+#define RX_NEXT(N) (((N) + 1) & (RX_DESC_NUM-1))
+
+#define GMAC_PPE_RX_PKT_MAX_LEN 379
+#define GMAC_MAX_PKT_LEN 1516
+#define GMAC_MIN_PKT_LEN 31
+#define RX_BUF_SIZE 1600
+#define RESET_TIMEOUT 1000
+#define TX_TIMEOUT (6 * HZ)
+
+#define DRV_NAME "hip04-ether"
+
+struct tx_desc {
+ u32 send_addr;
+ u32 send_size;
+ u32 next_addr;
+ u32 cfg;
+ u32 wb_addr;
+} __aligned(64);
+
+struct rx_desc {
+ u16 reserved_16;
+ u16 pkt_len;
+ u32 reserve1[3];
+ u32 pkt_err;
+ u32 reserve2[4];
+};
+
+struct hip04_priv {
+ void __iomem *base;
+ int phy_mode;
+ int chan;
+ unsigned int port;
+ unsigned int speed;
+ unsigned int duplex;
+ unsigned int reg_inten;
+
+ struct napi_struct napi;
+ struct net_device *ndev;
+
+ struct tx_desc *tx_desc;
+ dma_addr_t tx_desc_dma;
+ struct sk_buff *tx_skb[TX_DESC_NUM];
+ dma_addr_t tx_phys[TX_DESC_NUM];
+ unsigned int tx_head;
+ unsigned int tx_tail;
+ int tx_count;
+ unsigned long last_tx;
+
+ unsigned char *rx_buf[RX_DESC_NUM];
+ dma_addr_t rx_phys[RX_DESC_NUM];
+ unsigned int rx_head;
+ unsigned int rx_buf_size;
+
+ struct device_node *phy_node;
+ struct phy_device *phy;
+ struct regmap *map;
+ struct work_struct tx_timeout_task;
+
+ struct workqueue_struct *wq;
+ struct delayed_work tx_clean_task;
+};
+
+static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ u32 val;
+
+ priv->speed = speed;
+ priv->duplex = duplex;
+
+ switch (priv->phy_mode) {
+ case PHY_INTERFACE_MODE_SGMII:
+ if (speed == SPEED_1000)
+ val = SGMII_SPEED_1000;
+ else if (speed == SPEED_100)
+ val = SGMII_SPEED_100;
+ else
+ val = SGMII_SPEED_10;
+ break;
+ case PHY_INTERFACE_MODE_MII:
+ if (speed == SPEED_100)
+ val = MII_SPEED_100;
+ else
+ val = MII_SPEED_10;
+ break;
+ default:
+ netdev_warn(ndev, "not supported mode\n");
+ val = MII_SPEED_10;
+ break;
+ }
+ writel_relaxed(val, priv->base + GE_PORT_MODE);
+
+ val = duplex ? GE_DUPLEX_FULL : GE_DUPLEX_HALF;
+ writel_relaxed(val, priv->base + GE_DUPLEX_TYPE);
+
+ val = GE_MODE_CHANGE_EN;
+ writel_relaxed(val, priv->base + GE_MODE_CHANGE_REG);
+}
+
+static void hip04_reset_ppe(struct hip04_priv *priv)
+{
+ u32 val, tmp, timeout = 0;
+
+ do {
+ regmap_read(priv->map, priv->port * 4 + PPE_CURR_BUF_CNT, &val);
+ regmap_read(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, &tmp);
+ if (timeout++ > RESET_TIMEOUT)
+ break;
+ } while (val & 0xfff);
+}
+
+static void hip04_config_fifo(struct hip04_priv *priv)
+{
+ u32 val;
+
+ val = readl_relaxed(priv->base + PPE_CFG_STS_MODE);
+ val |= PPE_CFG_STS_RX_PKT_CNT_RC;
+ writel_relaxed(val, priv->base + PPE_CFG_STS_MODE);
+
+ val = BIT(priv->port);
+ regmap_write(priv->map, priv->port * 4 + PPE_CFG_POOL_GRP, val);
+
+ val = priv->port << PPE_CFG_QOS_VMID_GRP_SHIFT;
+ val |= PPE_CFG_QOS_VMID_MODE;
+ writel_relaxed(val, priv->base + PPE_CFG_QOS_VMID_GEN);
+
+ val = RX_BUF_SIZE;
+ regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_BUF_SIZE, val);
+
+ val = RX_DESC_NUM << PPE_CFG_RX_DEPTH_SHIFT;
+ val |= PPE_CFG_RX_FIFO_FSFU;
+ val |= priv->chan << PPE_CFG_RX_START_SHIFT;
+ regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_FIFO_SIZE, val);
+
+ val = NET_IP_ALIGN << PPE_CFG_RX_CTRL_ALIGN_SHIFT;
+ writel_relaxed(val, priv->base + PPE_CFG_RX_CTRL_REG);
+
+ val = PPE_CFG_RX_PKT_ALIGN;
+ writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_MODE_REG);
+
+ val = PPE_CFG_BUS_LOCAL_REL | PPE_CFG_BUS_BIG_ENDIEN;
+ writel_relaxed(val, priv->base + PPE_CFG_BUS_CTRL_REG);
+
+ val = GMAC_PPE_RX_PKT_MAX_LEN;
+ writel_relaxed(val, priv->base + PPE_CFG_MAX_FRAME_LEN_REG);
+
+ val = GMAC_MAX_PKT_LEN;
+ writel_relaxed(val, priv->base + GE_MAX_FRM_SIZE_REG);
+
+ val = GMAC_MIN_PKT_LEN;
+ writel_relaxed(val, priv->base + GE_SHORT_RUNTS_THR_REG);
+
+ val = readl_relaxed(priv->base + GE_TRANSMIT_CONTROL_REG);
+ val |= GE_TX_AUTO_NEG | GE_TX_ADD_CRC | GE_TX_SHORT_PAD_THROUGH;
+ writel_relaxed(val, priv->base + GE_TRANSMIT_CONTROL_REG);
+
+ val = GE_RX_STRIP_CRC;
+ writel_relaxed(val, priv->base + GE_CF_CRC_STRIP_REG);
+
+ val = readl_relaxed(priv->base + GE_RECV_CONTROL_REG);
+ val |= GE_RX_STRIP_PAD | GE_RX_PAD_EN;
+ writel_relaxed(val, priv->base + GE_RECV_CONTROL_REG);
+
+ val = GE_AUTO_NEG_CTL;
+ writel_relaxed(val, priv->base + GE_TX_LOCAL_PAGE_REG);
+}
+
+static void hip04_mac_enable(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ u32 val;
+
+ /* enable tx & rx */
+ val = readl_relaxed(priv->base + GE_PORT_EN);
+ val |= GE_RX_PORT_EN | GE_TX_PORT_EN;
+ writel_relaxed(val, priv->base + GE_PORT_EN);
+
+ /* clear rx int */
+ val = RCV_INT;
+ writel_relaxed(val, priv->base + PPE_RINT);
+
+ /* config recv int */
+ val = GE_RX_INT_THRESHOLD | GE_RX_TIMEOUT;
+ writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_INT);
+
+ /* enable interrupt */
+ priv->reg_inten = DEF_INT_MASK;
+ writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+}
+
+static void hip04_mac_disable(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ u32 val;
+
+ /* disable int */
+ priv->reg_inten &= ~(DEF_INT_MASK);
+ writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+
+ /* disable tx & rx */
+ val = readl_relaxed(priv->base + GE_PORT_EN);
+ val &= ~(GE_RX_PORT_EN | GE_TX_PORT_EN);
+ writel_relaxed(val, priv->base + GE_PORT_EN);
+}
+
+static void hip04_set_xmit_desc(struct hip04_priv *priv, dma_addr_t phys)
+{
+ writel(phys, priv->base + PPE_CFG_CPU_ADD_ADDR);
+}
+
+static void hip04_set_recv_desc(struct hip04_priv *priv, dma_addr_t phys)
+{
+ regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, phys);
+}
+
+static u32 hip04_recv_cnt(struct hip04_priv *priv)
+{
+ return readl(priv->base + PPE_HIS_RX_PKT_CNT);
+}
+
+static void hip04_update_mac_address(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+
+ writel_relaxed(((ndev->dev_addr[0] << 8) | (ndev->dev_addr[1])),
+ priv->base + GE_STATION_MAC_ADDRESS);
+ writel_relaxed(((ndev->dev_addr[2] << 24) | (ndev->dev_addr[3] << 16) |
+ (ndev->dev_addr[4] << 8) | (ndev->dev_addr[5])),
+ priv->base + GE_STATION_MAC_ADDRESS + 4);
+}
+
+static int hip04_set_mac_address(struct net_device *ndev, void *addr)
+{
+ eth_mac_addr(ndev, addr);
+ hip04_update_mac_address(ndev);
+ return 0;
+}
+
+static void hip04_tx_reclaim(struct net_device *ndev, bool force)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ unsigned tx_tail = priv->tx_tail;
+ struct tx_desc *desc;
+ unsigned int bytes_compl = 0, pkts_compl = 0;
+
+ if (priv->tx_count == 0)
+ goto out;
+
+ while ((tx_tail != priv->tx_head) || (priv->tx_count == TX_DESC_NUM)) {
+ desc = &priv->tx_desc[tx_tail];
+ if (desc->send_addr != 0) {
+ if (force)
+ desc->send_addr = 0;
+ else
+ break;
+ }
+
+ if (priv->tx_phys[tx_tail]) {
+ dma_unmap_single(&ndev->dev, priv->tx_phys[tx_tail],
+ priv->tx_skb[tx_tail]->len,
+ DMA_TO_DEVICE);
+ priv->tx_phys[tx_tail] = 0;
+ }
+ pkts_compl++;
+ bytes_compl += priv->tx_skb[tx_tail]->len;
+ dev_kfree_skb(priv->tx_skb[tx_tail]);
+ priv->tx_skb[tx_tail] = NULL;
+ tx_tail = TX_NEXT(tx_tail);
+ priv->tx_count--;
+
+ if (priv->tx_count <= 0)
+ break;
+ }
+
+ priv->tx_tail = tx_tail;
+
+ /* Ensure tx_tail & tx_count visible to xmit */
+ smp_mb();
+out:
+
+ if (pkts_compl || bytes_compl)
+ netdev_completed_queue(ndev, pkts_compl, bytes_compl);
+
+ if (unlikely(netif_queue_stopped(ndev)) &&
+ (priv->tx_count < TX_DESC_NUM))
+ netif_wake_queue(ndev);
+}
+
+static void hip04_tx_clean_monitor(struct work_struct *work)
+{
+ struct hip04_priv *priv = container_of(work, struct hip04_priv,
+ tx_clean_task.work);
+ struct net_device *ndev = priv->ndev;
+ int delta_in_ticks = msecs_to_jiffies(1000);
+
+ if (!time_in_range(jiffies, priv->last_tx,
+ priv->last_tx + delta_in_ticks)) {
+ netif_tx_lock(ndev);
+ hip04_tx_reclaim(ndev, false);
+ netif_tx_unlock(ndev);
+ }
+ queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
+}
+
+static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ struct net_device_stats *stats = &ndev->stats;
+ unsigned int tx_head = priv->tx_head;
+ struct tx_desc *desc = &priv->tx_desc[tx_head];
+ dma_addr_t phys;
+
+ if (priv->tx_count >= TX_DESC_NUM) {
+ netif_stop_queue(ndev);
+ return NETDEV_TX_BUSY;
+ }
+
+ hip04_tx_reclaim(ndev, false);
+
+ phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
+ if (dma_mapping_error(&ndev->dev, phys)) {
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+
+ priv->tx_skb[tx_head] = skb;
+ priv->tx_phys[tx_head] = phys;
+ desc->send_addr = cpu_to_be32(phys);
+ desc->send_size = cpu_to_be32(skb->len);
+ desc->cfg = cpu_to_be32(TX_CLEAR_WB | TX_FINISH_CACHE_INV);
+ phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
+ desc->wb_addr = cpu_to_be32(phys);
+ skb_tx_timestamp(skb);
+
+ /* Don't wait up for transmitted skbs to be freed. */
+ skb_orphan(skb);
+
+ hip04_set_xmit_desc(priv, phys);
+ priv->tx_head = TX_NEXT(tx_head);
+ netdev_sent_queue(ndev, skb->len);
+
+ stats->tx_bytes += skb->len;
+ stats->tx_packets++;
+ priv->tx_count++;
+ priv->last_tx = jiffies;
+
+ /* Ensure tx_head & tx_count update visible to tx reclaim */
+ smp_mb();
+
+ return NETDEV_TX_OK;
+}
+
+static int hip04_rx_poll(struct napi_struct *napi, int budget)
+{
+ struct hip04_priv *priv = container_of(napi, struct hip04_priv, napi);
+ struct net_device *ndev = priv->ndev;
+ struct net_device_stats *stats = &ndev->stats;
+ unsigned int cnt = hip04_recv_cnt(priv);
+ struct rx_desc *desc;
+ struct sk_buff *skb;
+ unsigned char *buf;
+ bool last = false;
+ dma_addr_t phys;
+ int rx = 0;
+ u16 len;
+ u32 err;
+
+ while (cnt && !last) {
+ buf = priv->rx_buf[priv->rx_head];
+ skb = build_skb(buf, priv->rx_buf_size);
+ if (unlikely(!skb))
+ net_dbg_ratelimited("build_skb failed\n");
+
+ dma_unmap_single(&ndev->dev, priv->rx_phys[priv->rx_head],
+ RX_BUF_SIZE, DMA_FROM_DEVICE);
+ priv->rx_phys[priv->rx_head] = 0;
+
+ desc = (struct rx_desc *)skb->data;
+ len = be16_to_cpu(desc->pkt_len);
+ err = be32_to_cpu(desc->pkt_err);
+
+ if (0 == len) {
+ dev_kfree_skb_any(skb);
+ last = true;
+ } else if ((err & RX_PKT_ERR) || (len >= GMAC_MAX_PKT_LEN)) {
+ dev_kfree_skb_any(skb);
+ stats->rx_dropped++;
+ stats->rx_errors++;
+ } else {
+ skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
+ skb_put(skb, len);
+ skb->protocol = eth_type_trans(skb, ndev);
+ napi_gro_receive(&priv->napi, skb);
+ stats->rx_packets++;
+ stats->rx_bytes += len;
+ rx++;
+ }
+
+ buf = netdev_alloc_frag(priv->rx_buf_size);
+ if (!buf)
+ return -ENOMEM;
+ phys = dma_map_single(&ndev->dev, buf,
+ RX_BUF_SIZE, DMA_FROM_DEVICE);
+ if (dma_mapping_error(&ndev->dev, phys))
+ return -EIO;
+ priv->rx_buf[priv->rx_head] = buf;
+ priv->rx_phys[priv->rx_head] = phys;
+ hip04_set_recv_desc(priv, phys);
+
+ priv->rx_head = RX_NEXT(priv->rx_head);
+ if (rx >= budget)
+ goto done;
+
+ if (--cnt == 0)
+ cnt = hip04_recv_cnt(priv);
+ }
+
+ if (!(priv->reg_inten & RCV_INT)) {
+ /* enable rx interrupt */
+ priv->reg_inten |= RCV_INT;
+ writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+ }
+ napi_complete(napi);
+done:
+ return rx;
+}
+
+static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
+{
+ struct net_device *ndev = (struct net_device *)dev_id;
+ struct hip04_priv *priv = netdev_priv(ndev);
+ struct net_device_stats *stats = &ndev->stats;
+ u32 ists = readl_relaxed(priv->base + PPE_INTSTS);
+
+ writel_relaxed(DEF_INT_MASK, priv->base + PPE_RINT);
+
+ if (unlikely(ists & DEF_INT_ERR)) {
+ if (ists & (RCV_NOBUF | RCV_DROP)) {
+ stats->rx_errors++;
+ stats->rx_dropped++;
+ netdev_err(ndev, "rx drop\n");
+ }
+ if (ists & TX_DROP) {
+ stats->tx_dropped++;
+ netdev_err(ndev, "tx drop\n");
+ }
+ }
+
+ if (ists & RCV_INT) {
+ /* disable rx interrupt */
+ priv->reg_inten &= ~(RCV_INT);
+ writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+ napi_schedule(&priv->napi);
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void hip04_adjust_link(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ struct phy_device *phy = priv->phy;
+
+ if ((priv->speed != phy->speed) || (priv->duplex != phy->duplex)) {
+ hip04_config_port(ndev, phy->speed, phy->duplex);
+ phy_print_status(phy);
+ }
+}
+
+static int hip04_mac_open(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ int i;
+
+ priv->rx_head = 0;
+ priv->tx_head = 0;
+ priv->tx_tail = 0;
+ priv->tx_count = 0;
+ hip04_reset_ppe(priv);
+
+ for (i = 0; i < RX_DESC_NUM; i++) {
+ dma_addr_t phys;
+
+ phys = dma_map_single(&ndev->dev, priv->rx_buf[i],
+ RX_BUF_SIZE, DMA_FROM_DEVICE);
+ if (dma_mapping_error(&ndev->dev, phys))
+ return -EIO;
+
+ priv->rx_phys[i] = phys;
+ hip04_set_recv_desc(priv, phys);
+ }
+
+ if (priv->phy)
+ phy_start(priv->phy);
+
+ netdev_reset_queue(ndev);
+ netif_start_queue(ndev);
+ hip04_mac_enable(ndev);
+ napi_enable(&priv->napi);
+
+ INIT_DELAYED_WORK(&priv->tx_clean_task, hip04_tx_clean_monitor);
+ queue_delayed_work(priv->wq, &priv->tx_clean_task, 0);
+
+ return 0;
+}
+
+static int hip04_mac_stop(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ int i;
+
+ cancel_delayed_work_sync(&priv->tx_clean_task);
+
+ napi_disable(&priv->napi);
+ netif_stop_queue(ndev);
+ hip04_mac_disable(ndev);
+ hip04_tx_reclaim(ndev, true);
+ hip04_reset_ppe(priv);
+
+ if (priv->phy)
+ phy_stop(priv->phy);
+
+ for (i = 0; i < RX_DESC_NUM; i++) {
+ if (priv->rx_phys[i]) {
+ dma_unmap_single(&ndev->dev, priv->rx_phys[i],
+ RX_BUF_SIZE, DMA_FROM_DEVICE);
+ priv->rx_phys[i] = 0;
+ }
+ }
+
+ return 0;
+}
+
+static void hip04_timeout(struct net_device *ndev)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+
+ schedule_work(&priv->tx_timeout_task);
+}
+
+static void hip04_tx_timeout_task(struct work_struct *work)
+{
+ struct hip04_priv *priv;
+
+ priv = container_of(work, struct hip04_priv, tx_timeout_task);
+ hip04_mac_stop(priv->ndev);
+ hip04_mac_open(priv->ndev);
+}
+
+static struct net_device_stats *hip04_get_stats(struct net_device *ndev)
+{
+ return &ndev->stats;
+}
+
+static struct net_device_ops hip04_netdev_ops = {
+ .ndo_open = hip04_mac_open,
+ .ndo_stop = hip04_mac_stop,
+ .ndo_get_stats = hip04_get_stats,
+ .ndo_start_xmit = hip04_mac_start_xmit,
+ .ndo_set_mac_address = hip04_set_mac_address,
+ .ndo_tx_timeout = hip04_timeout,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_change_mtu = eth_change_mtu,
+};
+
+static int hip04_alloc_ring(struct net_device *ndev, struct device *d)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ int i;
+
+ priv->tx_desc = dma_alloc_coherent(d,
+ TX_DESC_NUM * sizeof(struct tx_desc),
+ &priv->tx_desc_dma, GFP_KERNEL);
+ if (!priv->tx_desc)
+ return -ENOMEM;
+
+ priv->rx_buf_size = RX_BUF_SIZE +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ for (i = 0; i < RX_DESC_NUM; i++) {
+ priv->rx_buf[i] = netdev_alloc_frag(priv->rx_buf_size);
+ if (!priv->rx_buf[i])
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void hip04_free_ring(struct net_device *ndev, struct device *d)
+{
+ struct hip04_priv *priv = netdev_priv(ndev);
+ int i;
+
+ for (i = 0; i < RX_DESC_NUM; i++)
+ if (priv->rx_buf[i])
+ put_page(virt_to_head_page(priv->rx_buf[i]));
+
+ for (i = 0; i < TX_DESC_NUM; i++)
+ if (priv->tx_skb[i])
+ dev_kfree_skb_any(priv->tx_skb[i]);
+
+ dma_free_coherent(d, TX_DESC_NUM * sizeof(struct tx_desc),
+ priv->tx_desc, priv->tx_desc_dma);
+}
+
+static int hip04_mac_probe(struct platform_device *pdev)
+{
+ struct device *d = &pdev->dev;
+ struct device_node *node = d->of_node;
+ struct of_phandle_args arg;
+ struct net_device *ndev;
+ struct hip04_priv *priv;
+ struct resource *res;
+ unsigned int irq;
+ int ret;
+
+ ndev = alloc_etherdev(sizeof(struct hip04_priv));
+ if (!ndev)
+ return -ENOMEM;
+
+ priv = netdev_priv(ndev);
+ priv->ndev = ndev;
+ platform_set_drvdata(pdev, ndev);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ priv->base = devm_ioremap_resource(d, res);
+ if (IS_ERR(priv->base)) {
+ ret = PTR_ERR(priv->base);
+ goto init_fail;
+ }
+
+ ret = of_parse_phandle_with_fixed_args(node, "port-handle", 2, 0, &arg);
+ if (ret < 0) {
+ dev_warn(d, "no port-handle\n");
+ goto init_fail;
+ }
+
+ priv->port = arg.args[0];
+ priv->chan = arg.args[1] * RX_DESC_NUM;
+
+ priv->map = syscon_node_to_regmap(arg.np);
+ if (IS_ERR(priv->map)) {
+ dev_warn(d, "no syscon hisilicon,hip04-ppe\n");
+ ret = PTR_ERR(priv->map);
+ goto init_fail;
+ }
+
+ priv->phy_mode = of_get_phy_mode(node);
+ if (priv->phy_mode < 0) {
+ dev_warn(d, "not find phy-mode\n");
+ ret = -EINVAL;
+ goto init_fail;
+ }
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq <= 0) {
+ ret = -EINVAL;
+ goto init_fail;
+ }
+
+ ret = devm_request_irq(d, irq, hip04_mac_interrupt,
+ 0, pdev->name, ndev);
+ if (ret) {
+ netdev_err(ndev, "devm_request_irq failed\n");
+ goto init_fail;
+ }
+
+ priv->phy_node = of_parse_phandle(node, "phy-handle", 0);
+ if (priv->phy_node) {
+ priv->phy = of_phy_connect(ndev, priv->phy_node,
+ &hip04_adjust_link, 0, priv->phy_mode);
+ if (!priv->phy) {
+ ret = -EPROBE_DEFER;
+ goto init_fail;
+ }
+ }
+
+ priv->wq = create_singlethread_workqueue(ndev->name);
+ if (!priv->wq) {
+ ret = -ENOMEM;
+ goto init_fail;
+ }
+
+ INIT_WORK(&priv->tx_timeout_task, hip04_tx_timeout_task);
+
+ ether_setup(ndev);
+ ndev->netdev_ops = &hip04_netdev_ops;
+ ndev->watchdog_timeo = TX_TIMEOUT;
+ ndev->priv_flags |= IFF_UNICAST_FLT;
+ ndev->irq = irq;
+ netif_napi_add(ndev, &priv->napi, hip04_rx_poll, NAPI_POLL_WEIGHT);
+ SET_NETDEV_DEV(ndev, &pdev->dev);
+
+ hip04_reset_ppe(priv);
+ if (priv->phy_mode == PHY_INTERFACE_MODE_MII)
+ hip04_config_port(ndev, SPEED_100, DUPLEX_FULL);
+
+ hip04_config_fifo(priv);
+ random_ether_addr(ndev->dev_addr);
+ hip04_update_mac_address(ndev);
+
+ ret = hip04_alloc_ring(ndev, d);
+ if (ret) {
+ netdev_err(ndev, "alloc ring fail\n");
+ goto alloc_fail;
+ }
+
+ ret = register_netdev(ndev);
+ if (ret)
+ goto alloc_fail;
+
+ return 0;
+
+alloc_fail:
+ hip04_free_ring(ndev, d);
+init_fail:
+ of_node_put(priv->phy_node);
+ free_netdev(ndev);
+ return ret;
+}
+
+static int hip04_remove(struct platform_device *pdev)
+{
+ struct net_device *ndev = platform_get_drvdata(pdev);
+ struct hip04_priv *priv = netdev_priv(ndev);
+ struct device *d = &pdev->dev;
+
+ if (priv->phy)
+ phy_disconnect(priv->phy);
+
+ hip04_free_ring(ndev, d);
+ unregister_netdev(ndev);
+ free_irq(ndev->irq, ndev);
+ of_node_put(priv->phy_node);
+ cancel_work_sync(&priv->tx_timeout_task);
+ if (priv->wq)
+ destroy_workqueue(priv->wq);
+ free_netdev(ndev);
+
+ return 0;
+}
+
+static const struct of_device_id hip04_mac_match[] = {
+ { .compatible = "hisilicon,hip04-mac" },
+ { }
+};
+
+static struct platform_driver hip04_mac_driver = {
+ .probe = hip04_mac_probe,
+ .remove = hip04_remove,
+ .driver = {
+ .name = DRV_NAME,
+ .owner = THIS_MODULE,
+ .of_match_table = hip04_mac_match,
+ },
+};
+module_platform_driver(hip04_mac_driver);
+
+MODULE_DESCRIPTION("HISILICON P04 Ethernet driver");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:hip04-ether");
--
1.8.0
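The register packing done by hip04_update_mac_address() in the patch above can be checked in user space. The following is only a sketch: struct mac_regs and pack_mac() are hypothetical names, and the split (bytes 0-1 in the first word, bytes 2-5 in the second) is read off the two writel_relaxed() calls rather than any datasheet. The casts to uint32_t are an addition of this sketch, so that a byte of 0x80 or above shifted by 24 is never shifted as a promoted int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the two values the patch writes. */
struct mac_regs {
	uint32_t hi;	/* bytes 0-1, for GE_STATION_MAC_ADDRESS */
	uint32_t lo;	/* bytes 2-5, for GE_STATION_MAC_ADDRESS + 4 */
};

static struct mac_regs pack_mac(const uint8_t addr[6])
{
	struct mac_regs r;

	/* Widen each byte before shifting to stay in unsigned arithmetic. */
	r.hi = ((uint32_t)addr[0] << 8) | addr[1];
	r.lo = ((uint32_t)addr[2] << 24) | ((uint32_t)addr[3] << 16) |
	       ((uint32_t)addr[4] << 8) | addr[5];
	return r;
}
```

For the address 02:11:22:33:44:55 this gives hi = 0x0211 and lo = 0x22334455.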
* Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Jamal Hadi Salim @ 2014-12-11 11:49 UTC (permalink / raw)
To: David Miller, h.sokolowski; +Cc: netdev, Roopa Prabhu, Vlad Yasevich
In-Reply-To: <20141210.233239.472984361665334371.davem@davemloft.net>
On 12/10/14 23:32, David Miller wrote:
> From: "Hubert Sokolowski" <h.sokolowski@wit.edu.pl>
> Date: Wed, 10 Dec 2014 19:37:01 -0000
>
>> This change restores the semantic that was present
>> before 5e6d243587990a588143b9da3974833649595587
>> "bridge: netlink dump interface at par with brctl"
>> on how ndo_dflt_fdb_dump is called.
>> This semantic is still used for add and del operations
>> so let's keep it consistent.
>> Driver can still call ndo_dflt_fdb_dump from inside
>> its own fdb_dump routine when needed.
>>
>> Signed-off-by: Hubert Sokolowski <h.sokolowski@wit.edu.pl>
>
> Jamal, please review.
>
It won't work. As pointed out by Roopa in
the other email dev->uc/mc will not get dumped with this
change. Vlad will be in a better position to comment.
CCing Vlad.
Hubert, immediate gratification never works on netdev.
I advised you to run the commit tests in at least
2 emails when you contacted me privately before posting.
It would have chewed about 5 minutes of your time.
I am sure it cost Roopa at least 1 hour. And if Dave
had sucked in your innocent looking patch we'd be playing
damage control after which is a lot more expensive.
cheers,
jamal
* [PATCH net v2] Fix race condition between vxlan_sock_add and vxlan_sock_release
From: Marcelo Ricardo Leitner @ 2014-12-11 12:02 UTC (permalink / raw)
To: netdev
Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.
But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic, but since we currently have
deferred works that each hold a vs->refcnt, the following might happen,
leading to a use after free (especially after vxlan_igmp_leave):
CPU 1 CPU 2
deferred work vxlan_sock_add
... ...
spin_lock(&vn->sock_lock)
vs = vxlan_find_sock();
vxlan_sock_release
dec vs->refcnt, reaches 0
spin_lock(&vn->sock_lock)
vxlan_sock_hold(vs), refcnt=1
spin_unlock(&vn->sock_lock)
hlist_del_rcu(&vs->hlist);
vxlan_notify_del_rx_port(vs)
spin_unlock(&vn->sock_lock)
So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue")
---
Notes:
v1->v2: addressed Dave's comment that it is better to use atomic_add_unless()
than to grab the lock earlier in vxlan_sock_release()
Note that there are two search&reuse places, on vxlan_init() and
vxlan_sock_add(), both handled.
drivers/net/vxlan.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 31ecb03368c6dc3d581fdbd30b409b88190f3c71..49d9f229199851c48f5a9e6f1b282b42cedc2a41 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1985,9 +1985,8 @@ static int vxlan_init(struct net_device *dev)
spin_lock(&vn->sock_lock);
vs = vxlan_find_sock(vxlan->net, ipv6 ? AF_INET6 : AF_INET,
vxlan->dst_port);
- if (vs) {
+ if (vs && atomic_add_unless(&vs->refcnt, 1, 0)) {
/* If we have a socket with same port already, reuse it */
- atomic_inc(&vs->refcnt);
vxlan_vs_add_dev(vs, vxlan);
} else {
/* otherwise make new socket outside of RTNL */
@@ -2389,12 +2388,9 @@ struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
spin_lock(&vn->sock_lock);
vs = vxlan_find_sock(net, ipv6 ? AF_INET6 : AF_INET, port);
- if (vs) {
- if (vs->rcv == rcv)
- atomic_inc(&vs->refcnt);
- else
+ if (vs && ((vs->rcv != rcv) ||
+ !atomic_add_unless(&vs->refcnt, 1, 0)))
vs = ERR_PTR(-EBUSY);
- }
spin_unlock(&vn->sock_lock);
if (!vs)
--
1.9.3
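The "increment unless already zero" idea behind the patch above can be sketched in user space with C11 atomics. This is an illustration under stated assumptions, not the kernel implementation: add_unless() is a stand-in for the kernel's atomic_add_unless(), and struct mini_sock and mini_sock_tryhold() are hypothetical names that model only the refcount of a vxlan_sock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for atomic_add_unless(v, a, u):
 * add a to *v unless *v == u; return true if the add happened. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* Hypothetical miniature of a socket: only the refcount is modelled. */
struct mini_sock {
	atomic_int refcnt;
};

/* Take a reference only if the socket has not already dropped to 0. */
static bool mini_sock_tryhold(struct mini_sock *vs)
{
	return add_unless(&vs->refcnt, 1, 0);
}
```

A socket whose count already reached 0 is refused, so the caller falls back to creating a new socket (or reporting busy) instead of resurrecting one that is being torn down.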
* [PULL] virtio: virtio 1.0 support, misc patches
From: Michael S. Tsirkin @ 2014-12-11 12:02 UTC (permalink / raw)
To: Linus Torvalds
Cc: sergei.shtylyov, kvm, mst, netdev, linux-kernel, virtualization,
pbonzini, ben, David Miller, thuth
The following changes since commit b2776bf7149bddd1f4161f14f79520f17fc1d71d:
Linux 3.18 (2014-12-07 14:21:05 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 803cd18f7b5e6c7ad6bee9571ae8f4450190ab58:
virtio_ccw: finalize_features error handling (2014-12-09 16:32:41 +0200)
Note: some net drivers are affected by these patches.
David said he's fine with merging these patches through
my tree.
Rusty's on vacation, he acked using my tree for these, too.
----------------------------------------------------------------
virtio: virtio 1.0 support, misc patches
This adds a lot of infrastructure for virtio 1.0 support.
Notable missing pieces: virtio pci, virtio balloon (needs spec extension),
vhost scsi.
Plus, there are some minor fixes in a couple of places.
Cc: David Miller <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
----------------------------------------------------------------
Cornelia Huck (4):
virtio: allow transports to get avail/used addresses
KVM: s390: virtio-ccw revision 1 SET_VQ
KVM: s390: enable virtio-ccw revision 1
virtio_ccw: finalize_features error handling
Jason Wang (1):
vhost: remove unnecessary forward declarations in vhost.h
Michael S. Tsirkin (64):
virtio: add low-level APIs for feature bits
virtio: use u32, not bitmap for features
mic_virtio: robust feature array size calculation
virtio: add support for 64 bit features.
virtio: assert 32 bit features in transports
virtio_ccw: add support for 64 bit features.
virtio: add virtio 1.0 feature bit
virtio: memory access APIs
virtio_ring: switch to new memory access APIs
virtio_config: endian conversion for v1.0
virtio: set FEATURES_OK
virtio: simplify feature bit handling
virtio: add legacy feature table support
virtio_net: v1.0 endianness
virtio_blk: v1.0 support
KVM: s390 allow virtio_ccw status writes to fail
virtio_blk: make serial attribute static
virtio_blk: fix race at module removal
virtio_net: pass vi around
virtio_net: get rid of virtio_net_hdr/skb_vnet_hdr
virtio_net: stricter short buffer length checks
virtio_net: bigger header when VERSION_1 is set
virtio_net: disable mac write for virtio 1.0
virtio_net: enable v1.0 support
vhost: make features 64 bit
vhost: add memory access wrappers
vhost/net: force len for TX to host endian
vhost: switch to __get/__put_user exclusively
vhost: virtio 1.0 endian-ness support
vhost/net: virtio 1.0 byte swap
vhost/net: larger header for virtio 1.0
vhost/net: enable virtio 1.0
tun: move internal flag defines out of uapi
tun: drop most type defines
tun: add VNET_LE flag
tun: TUN_VNET_LE support, fix sparse warnings for virtio headers
macvtap: TUN_VNET_LE support
virtio_scsi: v1.0 support
virtio_scsi: move to uapi
virtio_scsi: export to userspace
vhost/scsi: partial virtio 1.0 support
af_packet: virtio 1.0 stubs
virtio_console: virtio 1.0 support
virtio_balloon: add legacy_only flag
virtio: make VIRTIO_F_VERSION_1 a transport bit
virtio: drop VIRTIO_F_VERSION_1 from drivers
virtio_console: fix sparse warnings
virtio: add API to detect legacy devices
virtio_ccw: legacy: don't negotiate rev 1/features
virtio: allow finalize_features to fail
virtio_ccw: rev 1 devices set VIRTIO_F_VERSION_1
virtio_balloon: drop legacy_only driver flag
virtio: drop legacy_only driver flag
virtio_pci: add isr field
virtio_pci: fix coding style for structs
virtio_pci: free up vq->priv
virtio_pci: use priv for vq notification
virtio_pci: delete vqs indirectly
virtio_pci: setup vqs indirectly
virtio_pci: setup config vector indirectly
virtio_pci: split out legacy device support
virtio_pci: update file descriptions and copyright
virtio_pci: rename virtio_pci -> virtio_pci_common
virtio_ccw: future-proof finalize_features
Thomas Huth (1):
KVM: s390: Set virtio-ccw transport revision
drivers/vhost/vhost.h | 41 +-
drivers/virtio/virtio_pci_common.h | 136 ++++++
include/linux/virtio.h | 12 +-
include/linux/virtio_byteorder.h | 59 +++
include/linux/virtio_config.h | 103 ++++-
include/uapi/linux/if_tun.h | 17 +-
include/uapi/linux/virtio_blk.h | 15 +-
include/uapi/linux/virtio_config.h | 9 +-
include/uapi/linux/virtio_console.h | 7 +-
include/uapi/linux/virtio_net.h | 15 +-
include/uapi/linux/virtio_ring.h | 45 +-
include/{ => uapi}/linux/virtio_scsi.h | 106 ++---
include/uapi/linux/virtio_types.h | 46 ++
tools/virtio/linux/virtio.h | 22 +-
tools/virtio/linux/virtio_config.h | 2 +-
drivers/block/virtio_blk.c | 74 +--
drivers/char/virtio_console.c | 39 +-
drivers/lguest/lguest_device.c | 17 +-
drivers/misc/mic/card/mic_virtio.c | 14 +-
drivers/net/macvtap.c | 68 ++-
drivers/net/tun.c | 168 +++----
drivers/net/virtio_net.c | 161 +++----
drivers/remoteproc/remoteproc_virtio.c | 11 +-
drivers/s390/kvm/kvm_virtio.c | 11 +-
drivers/s390/kvm/virtio_ccw.c | 203 +++++++--
drivers/scsi/virtio_scsi.c | 50 +-
drivers/vhost/net.c | 31 +-
drivers/vhost/scsi.c | 22 +-
drivers/vhost/vhost.c | 93 ++--
drivers/virtio/virtio.c | 102 ++++-
drivers/virtio/virtio_mmio.c | 17 +-
drivers/virtio/virtio_pci.c | 802 ---------------------------------
drivers/virtio/virtio_pci_common.c | 464 +++++++++++++++++++
drivers/virtio/virtio_pci_legacy.c | 326 ++++++++++++++
drivers/virtio/virtio_ring.c | 109 +++--
net/packet/af_packet.c | 35 +-
tools/virtio/virtio_test.c | 5 +-
tools/virtio/vringh_test.c | 16 +-
drivers/virtio/Makefile | 1 +
include/uapi/linux/Kbuild | 2 +
40 files changed, 2048 insertions(+), 1428 deletions(-)
create mode 100644 drivers/virtio/virtio_pci_common.h
create mode 100644 include/linux/virtio_byteorder.h
rename include/{ => uapi}/linux/virtio_scsi.h (73%)
create mode 100644 include/uapi/linux/virtio_types.h
delete mode 100644 drivers/virtio/virtio_pci.c
create mode 100644 drivers/virtio/virtio_pci_common.c
create mode 100644 drivers/virtio/virtio_pci_legacy.c
* RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Varlese, Marco @ 2014-12-11 12:02 UTC (permalink / raw)
To: Jiri Pirko
Cc: John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R,
roopa@cumulusnetworks.com, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <20141211110115.GA1979@nanopsycho.lan>
> -----Original Message-----
> From: Jiri Pirko [mailto:jiri@resnulli.us]
> Sent: Thursday, December 11, 2014 11:01 AM
> To: Varlese, Marco
> Cc: John Fastabend; netdev@vger.kernel.org;
> stephen@networkplumber.org; Fastabend, John R;
> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> configuration
>
> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
> >> -----Original Message-----
> >> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >> Sent: Wednesday, December 10, 2014 5:04 PM
> >> To: Jiri Pirko
> >> Cc: Varlese, Marco; netdev@vger.kernel.org;
> >> stephen@networkplumber.org; Fastabend, John R;
> >> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> >> kernel@vger.kernel.org
> >> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> >> configuration
> >>
> >> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
> >> > Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com wrote:
> >> >> From: Marco Varlese <marco.varlese@intel.com>
> >> >>
> >> >> Switch hardware offers a list of attributes that are configurable
> >> >> on a per port basis.
> >> >> This patch provides a mechanism to configure switch ports by
> >> >> adding an NDO for setting specific values to specific attributes.
> >> >> There will be a separate patch that extends iproute2 to call the
> >> >> new NDO.
> >> >
> >> >
> >> > What are these attributes? Can you give some examples. I'm asking
> >> > because there is a plan to pass generic attributes to switch ports
> >> > replacing current specific ndo_switch_port_stp_update. In this
> >> > case, bridge is setting that attribute.
> >> >
> >> > Is there need to set something directly from userspace or does it
> >> > make rather sense to use involved bridge/ovs/bond ? I think that
> >> > both will be needed.
> >>
> >> +1
> >>
> >> I think for many attributes it would be best to have both. The in
> >> kernel callers and netlink userspace can use the same driver ndo_ops.
> >>
> >> But then we don't _require_ any specific bridge/ovs/etc module. And
> >> we may have some attributes that are not specific to any existing
> >> software module. I'm guessing Marco has some examples of these.
> >>
> >> [...]
> >>
> >>
> >> --
> >> John Fastabend Intel Corporation
> >
> >We do have a need to configure the attributes directly from user-space and
> I have identified the tool to do that in iproute2.
> >
> >Examples of attributes are:
> >* enabling/disabling of learning of source addresses on a given port
> >(you can imagine the attribute called LEARNING for example);
> >* internal loopback control (i.e. LOOPBACK) which will control how the
> >flow of traffic behaves from the switch fabric towards an egress port;
> >* flooding for broadcast/multicast/unicast type of packets (i.e.
> >BFLOODING, MFLOODING, UFLOODING);
> >
> >Some attributes would be of the type enabled/disabled while others will
> allow specific values to allow the user to configure different behaviours of
> that feature on that particular port on that platform.
> >
> >One thing to mention - as John stated as well - there might be some
> attributes that are not specific to any software module but rather have to do
> with the actual hardware/platform to configure.
> >
> >I hope this clarifies some points.
>
> It does. Makes sense. We need to expose this attr set/get for both in-kernel
> and userspace use cases.
>
> Please adjust your patch for this. Also, as a second patch, it would be great if
> you can convert ndo_switch_port_stp_update to this new ndo.
>
> Thanks.
>
>
I was thinking of leaving the get side of things implemented via sysfs rather than implementing an NDO for it. Would this not be appropriate?
- - -
Marco Varlese
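To make the attribute discussion above concrete, here is a rough user-space sketch of a per-port attribute set/get pair. Everything in it is hypothetical: the enum values are merely named after the examples given in the thread (LEARNING, LOOPBACK, BFLOODING, MFLOODING, UFLOODING), and sw_port_attr_set()/sw_port_attr_get() are not the proposed NDO or any existing kernel API. It only shows the shape of a single entry point that both in-kernel callers and a netlink handler could share.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical attribute IDs, named after the examples in the thread. */
enum sw_port_attr {
	SW_ATTR_LEARNING,
	SW_ATTR_LOOPBACK,
	SW_ATTR_BFLOODING,
	SW_ATTR_MFLOODING,
	SW_ATTR_UFLOODING,
	SW_ATTR_MAX,
};

/* Hypothetical per-port state; a real driver would program hardware. */
struct sw_port {
	uint64_t attr[SW_ATTR_MAX];
};

/* Single setter that any caller (kernel or userspace path) would use. */
static int sw_port_attr_set(struct sw_port *p, unsigned int attr,
			    uint64_t val)
{
	if (attr >= SW_ATTR_MAX)
		return -EINVAL;
	p->attr[attr] = val;
	return 0;
}

/* Matching getter, so values can be read back through the same path. */
static int sw_port_attr_get(const struct sw_port *p, unsigned int attr,
			    uint64_t *val)
{
	if (attr >= SW_ATTR_MAX)
		return -EINVAL;
	*val = p->attr[attr];
	return 0;
}
```

On/off attributes would carry 0 or 1 in val, while attributes with several modes would carry a driver-defined value.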
* Re: [PULL] virtio: virtio 1.0 support, misc patches
From: Michael S. Tsirkin @ 2014-12-11 12:14 UTC (permalink / raw)
To: Linus Torvalds
Cc: sergei.shtylyov, kvm, netdev, linux-kernel, virtualization,
pbonzini, ben, David Miller, thuth
In-Reply-To: <20141211120248.GA8838@redhat.com>
On Thu, Dec 11, 2014 at 02:02:48PM +0200, Michael S. Tsirkin wrote:
> The following changes since commit b2776bf7149bddd1f4161f14f79520f17fc1d71d:
>
> Linux 3.18 (2014-12-07 14:21:05 -0800)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
>
> for you to fetch changes up to 803cd18f7b5e6c7ad6bee9571ae8f4450190ab58:
Actually the commit hash in this mail is wrong:
The correct one is
f01a2a811ae04124fc9382925038fcbbd2f0b7c8
the reason I got this wrong is I prepared the pull request mail several
days ago, and since then I have rebased, pushed, and several people
tested this correct latest hash.
It's all signed correctly, so
Linus, do I need to resend?
Sorry about the noise.
>
> virtio_ccw: finalize_features error handling (2014-12-09 16:32:41 +0200)
>
> Note: some net drivers are affected by these patches.
> David said he's fine with merging these patches through
> my tree.
> Rusty's on vacation, he acked using my tree for these, too.
>
> ----------------------------------------------------------------
> virtio: virtio 1.0 support, misc patches
>
> This adds a lot of infrastructure for virtio 1.0 support.
> Notable missing pieces: virtio pci, virtio balloon (needs spec extension),
> vhost scsi.
>
> Plus, there are some minor fixes in a couple of places.
>
> Cc: David Miller <davem@davemloft.net>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> ----------------------------------------------------------------
> Cornelia Huck (4):
> virtio: allow transports to get avail/used addresses
> KVM: s390: virtio-ccw revision 1 SET_VQ
> KVM: s390: enable virtio-ccw revision 1
> virtio_ccw: finalize_features error handling
>
> Jason Wang (1):
> vhost: remove unnecessary forward declarations in vhost.h
>
> Michael S. Tsirkin (64):
> virtio: add low-level APIs for feature bits
> virtio: use u32, not bitmap for features
> mic_virtio: robust feature array size calculation
> virtio: add support for 64 bit features.
> virtio: assert 32 bit features in transports
> virtio_ccw: add support for 64 bit features.
> virtio: add virtio 1.0 feature bit
> virtio: memory access APIs
> virtio_ring: switch to new memory access APIs
> virtio_config: endian conversion for v1.0
> virtio: set FEATURES_OK
> virtio: simplify feature bit handling
> virtio: add legacy feature table support
> virtio_net: v1.0 endianness
> virtio_blk: v1.0 support
> KVM: s390 allow virtio_ccw status writes to fail
> virtio_blk: make serial attribute static
> virtio_blk: fix race at module removal
> virtio_net: pass vi around
> virtio_net: get rid of virtio_net_hdr/skb_vnet_hdr
> virtio_net: stricter short buffer length checks
> virtio_net: bigger header when VERSION_1 is set
> virtio_net: disable mac write for virtio 1.0
> virtio_net: enable v1.0 support
> vhost: make features 64 bit
> vhost: add memory access wrappers
> vhost/net: force len for TX to host endian
> vhost: switch to __get/__put_user exclusively
> vhost: virtio 1.0 endian-ness support
> vhost/net: virtio 1.0 byte swap
> vhost/net: larger header for virtio 1.0
> vhost/net: enable virtio 1.0
> tun: move internal flag defines out of uapi
> tun: drop most type defines
> tun: add VNET_LE flag
> tun: TUN_VNET_LE support, fix sparse warnings for virtio headers
> macvtap: TUN_VNET_LE support
> virtio_scsi: v1.0 support
> virtio_scsi: move to uapi
> virtio_scsi: export to userspace
> vhost/scsi: partial virtio 1.0 support
> af_packet: virtio 1.0 stubs
> virtio_console: virtio 1.0 support
> virtio_balloon: add legacy_only flag
> virtio: make VIRTIO_F_VERSION_1 a transport bit
> virtio: drop VIRTIO_F_VERSION_1 from drivers
> virtio_console: fix sparse warnings
> virtio: add API to detect legacy devices
> virtio_ccw: legacy: don't negotiate rev 1/features
> virtio: allow finalize_features to fail
> virtio_ccw: rev 1 devices set VIRTIO_F_VERSION_1
> virtio_balloon: drop legacy_only driver flag
> virtio: drop legacy_only driver flag
> virtio_pci: add isr field
> virtio_pci: fix coding style for structs
> virtio_pci: free up vq->priv
> virtio_pci: use priv for vq notification
> virtio_pci: delete vqs indirectly
> virtio_pci: setup vqs indirectly
> virtio_pci: setup config vector indirectly
> virtio_pci: split out legacy device support
> virtio_pci: update file descriptions and copyright
> virtio_pci: rename virtio_pci -> virtio_pci_common
> virtio_ccw: future-proof finalize_features
>
> Thomas Huth (1):
> KVM: s390: Set virtio-ccw transport revision
>
> drivers/vhost/vhost.h | 41 +-
> drivers/virtio/virtio_pci_common.h | 136 ++++++
> include/linux/virtio.h | 12 +-
> include/linux/virtio_byteorder.h | 59 +++
> include/linux/virtio_config.h | 103 ++++-
> include/uapi/linux/if_tun.h | 17 +-
> include/uapi/linux/virtio_blk.h | 15 +-
> include/uapi/linux/virtio_config.h | 9 +-
> include/uapi/linux/virtio_console.h | 7 +-
> include/uapi/linux/virtio_net.h | 15 +-
> include/uapi/linux/virtio_ring.h | 45 +-
> include/{ => uapi}/linux/virtio_scsi.h | 106 ++---
> include/uapi/linux/virtio_types.h | 46 ++
> tools/virtio/linux/virtio.h | 22 +-
> tools/virtio/linux/virtio_config.h | 2 +-
> drivers/block/virtio_blk.c | 74 +--
> drivers/char/virtio_console.c | 39 +-
> drivers/lguest/lguest_device.c | 17 +-
> drivers/misc/mic/card/mic_virtio.c | 14 +-
> drivers/net/macvtap.c | 68 ++-
> drivers/net/tun.c | 168 +++----
> drivers/net/virtio_net.c | 161 +++----
> drivers/remoteproc/remoteproc_virtio.c | 11 +-
> drivers/s390/kvm/kvm_virtio.c | 11 +-
> drivers/s390/kvm/virtio_ccw.c | 203 +++++++--
> drivers/scsi/virtio_scsi.c | 50 +-
> drivers/vhost/net.c | 31 +-
> drivers/vhost/scsi.c | 22 +-
> drivers/vhost/vhost.c | 93 ++--
> drivers/virtio/virtio.c | 102 ++++-
> drivers/virtio/virtio_mmio.c | 17 +-
> drivers/virtio/virtio_pci.c | 802 ---------------------------------
> drivers/virtio/virtio_pci_common.c | 464 +++++++++++++++++++
> drivers/virtio/virtio_pci_legacy.c | 326 ++++++++++++++
> drivers/virtio/virtio_ring.c | 109 +++--
> net/packet/af_packet.c | 35 +-
> tools/virtio/virtio_test.c | 5 +-
> tools/virtio/vringh_test.c | 16 +-
> drivers/virtio/Makefile | 1 +
> include/uapi/linux/Kbuild | 2 +
> 40 files changed, 2048 insertions(+), 1428 deletions(-)
> create mode 100644 drivers/virtio/virtio_pci_common.h
> create mode 100644 include/linux/virtio_byteorder.h
> rename include/{ => uapi}/linux/virtio_scsi.h (73%)
> create mode 100644 include/uapi/linux/virtio_types.h
> delete mode 100644 drivers/virtio/virtio_pci.c
> create mode 100644 drivers/virtio/virtio_pci_common.c
> create mode 100644 drivers/virtio/virtio_pci_legacy.c
* Re: [PATCH iproute2] ip: Simplify executing ip cmd within namespace
From: Nicolas Dichtel @ 2014-12-11 12:50 UTC
To: vadim4j; +Cc: netdev
In-Reply-To: <20141211105733.GA17601@angus-think.wlc.globallogic.com>
On 11/12/2014 11:57, vadim4j@gmail.com wrote:
> On Thu, Dec 11, 2014 at 11:58:21AM +0100, Nicolas Dichtel wrote:
>> On 10/12/2014 23:56, Vadim Kochan wrote:
>>> From: Vadim Kochan <vadim4j@gmail.com>
>>>
>>> Added new '-ns' option to simplify executing following cmd:
>>>
>>> ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>>>
>>> to
>>>
>>> ip -ns NETNS OPTIONS COMMAND OBJECT
>>>
>>> e.g.:
>>>
>>> ip -ns vnet0 link add br0 type bridge
>>>
>>> Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
>>> ---
>>> May be new option should have better name than '-ns' ?
>> What about 'ip -netns' to be explicit like other options?
>> user may still use 'ip -n' at the end.
>>
>>
>> Regards,
>> Nicolas
>
> May be left '-n' for some other future option, but use the following
Option parsing in iproute2 will match -netns when you type -n, because no
other option begins with an 'n' (I had a quick look; I may have missed
one). It works like -d, which matches -details, etc.
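To illustrate the prefix matching described here, a minimal sketch in C follows. It is modeled on how the option loop in ip.c compares what the user typed against each full option name, but prefix_matches() and resolve_option() are hypothetical names made up for this example, and the table lists only a handful of ip's options.

```c
#include <stddef.h>
#include <string.h>

/* Return 0 when `arg` is a non-empty prefix of `name`, nonzero otherwise. */
int prefix_matches(const char *arg, const char *name)
{
	size_t len = strlen(arg);

	if (len == 0 || len > strlen(name))
		return -1;
	return memcmp(name, arg, len) != 0;
}

/* Illustrative subset of options; the first entry that matches wins. */
const char *const example_options[] = {
	"-details", "-family", "-oneline", "-resolve", "-netns",
};

/* Return the full option name selected by `arg`, or NULL if none matches. */
const char *resolve_option(const char *arg)
{
	size_t i;
	size_t n = sizeof(example_options) / sizeof(example_options[0]);

	for (i = 0; i < n; i++)
		if (prefix_matches(arg, example_options[i]) == 0)
			return example_options[i];
	return NULL;
}
```

With this table, "-n", "-net" and "-netns" all select "-netns", and "-d" selects "-details". "-ns" selects nothing, because it is not a prefix of "-netns".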
> options: -net[ns] and -ns ? What do you think ?
One option is enough. '-netns' is an explicit reference to 'ip netns'.
Regards,
Nicolas
* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-11 13:08 UTC
To: Varlese, Marco
Cc: John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R,
roopa@cumulusnetworks.com, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC3E77@IRSMSX108.ger.corp.intel.com>
Thu, Dec 11, 2014 at 01:02:36PM CET, marco.varlese@intel.com wrote:
>> -----Original Message-----
>> From: Jiri Pirko [mailto:jiri@resnulli.us]
>> Sent: Thursday, December 11, 2014 11:01 AM
>> To: Varlese, Marco
>> Cc: John Fastabend; netdev@vger.kernel.org;
>> stephen@networkplumber.org; Fastabend, John R;
>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> configuration
>>
>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
>> >> -----Original Message-----
>> >> From: John Fastabend [mailto:john.fastabend@gmail.com]
>> >> Sent: Wednesday, December 10, 2014 5:04 PM
>> >> To: Jiri Pirko
>> >> Cc: Varlese, Marco; netdev@vger.kernel.org;
>> >> stephen@networkplumber.org; Fastabend, John R;
>> >> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>> >> kernel@vger.kernel.org
>> >> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> >> configuration
>> >>
>> >> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
>> >> > Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com wrote:
>> >> >> From: Marco Varlese <marco.varlese@intel.com>
>> >> >>
>> >> >> Switch hardware offers a list of attributes that are configurable
>> >> >> on a per port basis.
>> >> >> This patch provides a mechanism to configure switch ports by
>> >> >> adding an NDO for setting specific values to specific attributes.
>> >> >> There will be a separate patch that extends iproute2 to call the
>> >> >> new NDO.
>> >> >
>> >> >
>> >> > What are these attributes? Can you give some examples. I'm asking
>> >> > because there is a plan to pass generic attributes to switch ports
>> >> > replacing current specific ndo_switch_port_stp_update. In this
>> >> > case, bridge is setting that attribute.
>> >> >
>> >> > Is there need to set something directly from userspace or does it
>> >> > make rather sense to use involved bridge/ovs/bond ? I think that
>> >> > both will be needed.
>> >>
>> >> +1
>> >>
>> >> I think for many attributes it would be best to have both. The in
>> >> kernel callers and netlink userspace can use the same driver ndo_ops.
>> >>
>> >> But then we don't _require_ any specific bridge/ovs/etc module. And
>> >> we may have some attributes that are not specific to any existing
>> >> software module. I'm guessing Marco has some examples of these.
>> >>
>> >> [...]
>> >>
>> >>
>> >> --
>> >> John Fastabend Intel Corporation
>> >
>> >We do have a need to configure the attributes directly from user-space and
>> I have identified the tool to do that in iproute2.
>> >
>> >An example of attributes are:
>> >* enabling/disabling of learning of source addresses on a given port
>> >(you can imagine the attribute called LEARNING for example);
>> >* internal loopback control (i.e. LOOPBACK) which will control how the
>> >flow of traffic behaves from the switch fabric towards an egress port;
>> >* flooding for broadcast/multicast/unicast type of packets (i.e.
>> >BFLOODING, MFLOODING, UFLOODING);
>> >
>> >Some attributes would be of the type enabled/disabled while other will
>> allow specific values to allow the user to configure different behaviours of
>> that feature on that particular port on that platform.
>> >
>> >One thing to mention - as John stated as well - there might be some
>> attributes that are not specific to any software module but rather have to do
>> with the actual hardware/platform to configure.
>> >
>> >I hope this clarifies some points.
>>
>> It does. Makes sense. We need to expose this attr set/get for both in-kernel
>> and userspace use cases.
>>
>> Please adjust you patch for this. Also, as a second patch, it would be great if
>> you can convert ndo_switch_port_stp_update to this new ndo.
>>
>> Thanks.
>>
>>
>
>I was thinking of leaving the get side of things implemented via sysfs rather than implementing an NDO for it. Would this not be appropriate?
I believe it is preferable to have both get and set exposed via ndo and
netlink. They can be exposed via sysfs as well, but that is "nice to have",
not "must have".
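To make the set/get pairing concrete, here is a user-space sketch of what a symmetric pair of callbacks could look like. Everything in it is hypothetical: the attribute names are borrowed from the examples quoted above (LEARNING, LOOPBACK, BFLOODING, MFLOODING, UFLOODING), while struct port_ops, port_attr_set() and port_attr_get() are invented for illustration and are not the NDO signatures from the RFC patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-port attributes, named after the examples in the thread. */
enum port_attr {
	PORT_ATTR_LEARNING,
	PORT_ATTR_LOOPBACK,
	PORT_ATTR_BFLOODING,
	PORT_ATTR_MFLOODING,
	PORT_ATTR_UFLOODING,
	PORT_ATTR_MAX,
};

/* Stand-in for a switch port; a real driver would program hardware. */
struct port {
	uint64_t attrs[PORT_ATTR_MAX];
};

/* One set and one get callback, usable by in-kernel callers and netlink. */
struct port_ops {
	int (*attr_set)(struct port *p, enum port_attr attr, uint64_t val);
	int (*attr_get)(const struct port *p, enum port_attr attr, uint64_t *val);
};

int port_attr_set(struct port *p, enum port_attr attr, uint64_t val)
{
	if ((unsigned int)attr >= PORT_ATTR_MAX)
		return -EOPNOTSUPP;
	p->attrs[attr] = val;
	return 0;
}

int port_attr_get(const struct port *p, enum port_attr attr, uint64_t *val)
{
	if ((unsigned int)attr >= PORT_ATTR_MAX)
		return -EOPNOTSUPP;
	*val = p->attrs[attr];
	return 0;
}

const struct port_ops example_ops = {
	.attr_set = port_attr_set,
	.attr_get = port_attr_get,
};
```

The point of the sketch is only that one table carries both directions, so a bridge calling in from the kernel and a netlink request from userspace would go through the same two entry points.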
* Re: [PATCH iproute2] ip: Simplify executing ip cmd within namespace
From: vadim4j @ 2014-12-11 13:02 UTC
To: Nicolas Dichtel; +Cc: vadim4j, netdev
In-Reply-To: <54899309.7050109@6wind.com>
On Thu, Dec 11, 2014 at 01:50:17PM +0100, Nicolas Dichtel wrote:
> On 11/12/2014 11:57, vadim4j@gmail.com wrote:
> >On Thu, Dec 11, 2014 at 11:58:21AM +0100, Nicolas Dichtel wrote:
> >>On 10/12/2014 23:56, Vadim Kochan wrote:
> >>>From: Vadim Kochan <vadim4j@gmail.com>
> >>>
> >>>Added new '-ns' option to simplify executing following cmd:
> >>>
> >>> ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> >>>
> >>> to
> >>>
> >>> ip -ns NETNS OPTIONS COMMAND OBJECT
> >>>
> >>>e.g.:
> >>>
> >>> ip -ns vnet0 link add br0 type bridge
> >>>
> >>>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> >>>---
> >>>May be new option should have better name than '-ns' ?
> >>What about 'ip -netns' to be explicit like other options?
> >>user may still use 'ip -n' at the end.
> >>
> >>
> >>Regards,
> >>Nicolas
> >
> >May be left '-n' for some other future option, but use the following
> Option parsing in iproute2 will match -netns when you type -n, because no
> other option begins with an 'n' (I had a quick look; I may have missed
> one). It works like -d, which matches -details, etc.
>
> >options: -net[ns] and -ns ? What do you think ?
> One option is enough. '-netns' is an explicit reference to 'ip netns'.
>
>
> Regards,
> Nicolas
OK, I agree.
I will re-work and resend v2.
Thanks,
* [PATCH iproute2 v2] ip: Simplify executing ip cmd within network ns
From: Vadim Kochan @ 2014-12-11 13:38 UTC
To: netdev; +Cc: Vadim Kochan
From: Vadim Kochan <vadim4j@gmail.com>
Add a new '-netns' option to simplify executing the following command:
ip netns exec NETNS ip OPTIONS COMMAND OBJECT
to
ip -n[etns] NETNS OPTIONS COMMAND OBJECT
e.g.:
ip -net vnet0 link add br0 type bridge
ip -n vnet0 link
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
Changes v1 -> v2
use -n[etns] option name: suggested by Nicolas Dichtel
changed man ip.8 page
ip/ip.c | 6 ++++++
ip/ip_common.h | 1 +
ip/ipnetns.c | 2 +-
man/man8/ip.8 | 24 +++++++++++++++++++++++-
4 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/ip/ip.c b/ip/ip.c
index 5f759d5..f3c2cdb 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -262,6 +262,12 @@ int main(int argc, char **argv)
rcvbuf = size;
} else if (matches(opt, "-help") == 0) {
usage();
+ } else if (matches(opt, "-netns") == 0) {
+ argc--;
+ argv++;
+ argv[0] = argv[1];
+ argv[1] = basename;
+ return netns_exec(argc, argv);
} else {
fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
exit(-1);
diff --git a/ip/ip_common.h b/ip/ip_common.h
index 75bfb82..d4f7e1f 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -88,6 +88,7 @@ struct link_util
struct link_util *get_link_kind(const char *kind);
struct link_util *get_link_slave_kind(const char *slave_kind);
int get_netns_fd(const char *name);
+int netns_exec(int argc, char **argv);
#ifndef INFINITY_LIFE_TIME
#define INFINITY_LIFE_TIME 0xFFFFFFFFU
diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 1c8aa02..367841c 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -129,7 +129,7 @@ static void bind_etc(const char *name)
closedir(dir);
}
-static int netns_exec(int argc, char **argv)
+int netns_exec(int argc, char **argv)
{
/* Setup the proper environment for apps that are not netns
* aware, and execute a program in that environment.
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 2d42e98..389c808 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
\fB\-r\fR[\fIesolve\fR] |
\fB\-f\fR[\fIamily\fR] {
.BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
-\fB\-o\fR[\fIneline\fR] }
+\fB\-o\fR[\fIneline\fR] |
+\fB\-n\fR[\fIetns\fR] }
.SH OPTIONS
@@ -134,6 +135,27 @@ the output.
use the system's name resolver to print DNS names instead of
host addresses.
+.TP
+.BR "\-n" , " \-net" , " \-netns " <NETNS>
+executes the following
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+in the specified network namespace
+.IR NETNS .
+Actually it just simplifies executing of:
+
+.B ip netns exec
+.IR NETNS
+.B ip
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
+to
+
+.B ip
+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+
.SH IP - COMMAND SYNTAX
.SS
--
2.1.3
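As a rough illustration of the argument handling in the ip.c hunk above, the sketch below replays the same pointer shuffle on a sample command line. It assumes that, as in ip's option loop, argv[1] is the option being examined when the branch is entered. shift_for_netns_exec() is a hypothetical helper written for this example, and unlike the hunk it checks that a namespace name actually follows the option.

```c
#include <stddef.h>

/*
 * On entry argv[0] is the program name, argv[1] the -netns option and
 * argv[2] the namespace name. On success *argcp and *argvp describe the
 * vector { NETNS, prog, rest... } that the patch hands to netns_exec().
 * Returns 0 on success, -1 if no namespace name follows the option.
 */
int shift_for_netns_exec(int *argcp, char ***argvp, char *prog)
{
	int argc = *argcp;
	char **argv = *argvp;

	if (argc < 3)
		return -1;	/* "-netns" given without a name */

	argc--;
	argv++;			/* argv[0] = "-netns", argv[1] = NETNS */
	argv[0] = argv[1];	/* argv[0] = NETNS */
	argv[1] = prog;		/* argv[1] = program name, e.g. "ip" */

	*argcp = argc;
	*argvp = argv;
	return 0;
}
```

For "ip -n vnet0 link show" the resulting vector is { "vnet0", "ip", "link", "show" }, so netns_exec() sees the namespace name first and the command to run after it.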
* RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Varlese, Marco @ 2014-12-11 13:55 UTC
To: Jiri Pirko
Cc: John Fastabend, netdev@vger.kernel.org,
stephen@networkplumber.org, Fastabend, John R,
roopa@cumulusnetworks.com, sfeldma@gmail.com,
linux-kernel@vger.kernel.org
In-Reply-To: <20141211130830.GA1912@nanopsycho.orion>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Jiri Pirko
> Sent: Thursday, December 11, 2014 1:09 PM
> To: Varlese, Marco
> Cc: John Fastabend; netdev@vger.kernel.org;
> stephen@networkplumber.org; Fastabend, John R;
> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> configuration
>
> Thu, Dec 11, 2014 at 01:02:36PM CET, marco.varlese@intel.com wrote:
> >> -----Original Message-----
> >> From: Jiri Pirko [mailto:jiri@resnulli.us]
> >> Sent: Thursday, December 11, 2014 11:01 AM
> >> To: Varlese, Marco
> >> Cc: John Fastabend; netdev@vger.kernel.org;
> >> stephen@networkplumber.org; Fastabend, John R;
> >> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> >> kernel@vger.kernel.org
> >> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> >> configuration
> >>
> >> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
> >> >> -----Original Message-----
> >> >> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >> >> Sent: Wednesday, December 10, 2014 5:04 PM
> >> >> To: Jiri Pirko
> >> >> Cc: Varlese, Marco; netdev@vger.kernel.org;
> >> >> stephen@networkplumber.org; Fastabend, John R;
> >> >> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> >> >> kernel@vger.kernel.org
> >> >> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> >> >> configuration
> >> >>
> >> >> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
> >> >> > Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
> wrote:
> >> >> >> From: Marco Varlese <marco.varlese@intel.com>
> >> >> >>
> >> >> >> Switch hardware offers a list of attributes that are
> >> >> >> configurable on a per port basis.
> >> >> >> This patch provides a mechanism to configure switch ports by
> >> >> >> adding an NDO for setting specific values to specific attributes.
> >> >> >> There will be a separate patch that extends iproute2 to call
> >> >> >> the new NDO.
> >> >> >
> >> >> >
> >> >> > What are these attributes? Can you give some examples. I'm
> >> >> > asking because there is a plan to pass generic attributes to
> >> >> > switch ports replacing current specific
> >> >> > ndo_switch_port_stp_update. In this case, bridge is setting that
> attribute.
> >> >> >
> >> >> > Is there need to set something directly from userspace or does
> >> >> > it make rather sense to use involved bridge/ovs/bond ? I think
> >> >> > that both will be needed.
> >> >>
> >> >> +1
> >> >>
> >> >> I think for many attributes it would be best to have both. The in
> >> >> kernel callers and netlink userspace can use the same driver ndo_ops.
> >> >>
> >> >> But then we don't _require_ any specific bridge/ovs/etc module.
> >> >> And we may have some attributes that are not specific to any
> >> >> existing software module. I'm guessing Marco has some examples of
> these.
> >> >>
> >> >> [...]
> >> >>
> >> >>
> >> >> --
> >> >> John Fastabend Intel Corporation
> >> >
> >> >We do have a need to configure the attributes directly from
> >> >user-space and
> >> I have identified the tool to do that in iproute2.
> >> >
> >> >An example of attributes are:
> >> >* enabling/disabling of learning of source addresses on a given port
> >> >(you can imagine the attribute called LEARNING for example);
> >> >* internal loopback control (i.e. LOOPBACK) which will control how
> >> >the flow of traffic behaves from the switch fabric towards an egress
> >> >port;
> >> >* flooding for broadcast/multicast/unicast type of packets (i.e.
> >> >BFLOODING, MFLOODING, UFLOODING);
> >> >
> >> >Some attributes would be of the type enabled/disabled while other
> >> >will
> >> allow specific values to allow the user to configure different
> >> behaviours of that feature on that particular port on that platform.
> >> >
> >> >One thing to mention - as John stated as well - there might be some
> >> attributes that are not specific to any software module but rather
> >> have to do with the actual hardware/platform to configure.
> >> >
> >> >I hope this clarifies some points.
> >>
> >> It does. Makes sense. We need to expose this attr set/get for both
> >> in-kernel and userspace use cases.
> >>
> >> Please adjust you patch for this. Also, as a second patch, it would
> >> be great if you can convert ndo_switch_port_stp_update to this new ndo.
> >>
> >> Thanks.
> >>
> >>
> >
> >I was thinking of leaving the get side of things implemented via sysfs rather
> than implementing an NDO for it. Would this not be appropriate?
>
> I believe that it is preferred to have both get and set exposed via ndo and
> netlink. It can be exposed via sysfs as well, but it is "nice to have"
> not "must have"
>
I'll add the get ndo to my patch now. Thanks.
* Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
From: Smart Weblications GmbH - Florian Wiessner @ 2014-12-11 14:04 UTC
To: Julian Anastasov
Cc: Steffen Klassert, netdev, LKML, stable, Simon Horman, lvs-devel
In-Reply-To: <alpine.LFD.2.11.1412102331230.4043@ja.home.ssi.bg>
Hi,
On 10.12.2014 22:41, Julian Anastasov wrote:
> Hello,
>
> On Tue, 9 Dec 2014, Smart Weblications GmbH - Florian Wiessner wrote:
>
>> I rebuild everything with the two provided patches and still get:
>>
>> [ 512.475449] BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000014
>> [ 512.481277] IP: [<ffffffffa013d470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack]
>
> Same place, hm...
>
>> [ 512.485323] CPU: 4 PID: 28142 Comm: vsftpd Not tainted 3.12.33 #5
>
> Above "#5" is same as previous oops. It means kernel
> is not updated. Or you updated only the IPVS modules after
> applying the both patches?
Correct, I did it with make-kpkg --initrd linux_image, which only rebuilt the
modules. I can retry with make clean before building the package.
>
> You can also try without FTP tests to see if there
> are oopses in xfrm, so that we can close this topic and then
> to continue for the FTP problem on IPVS lists without
> bothering non-IPVS people.
>
Yes, it seems that the xfrm issue has gone away.
--
Kind regards,
Florian Wiessner
Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila
fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de
--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof (district court)
*from German landlines; prices from mobile networks may differ
* Re: [PATCH v7 2/3] net: Add Keystone NetCP ethernet driver
From: Murali Karicheri @ 2014-12-11 14:14 UTC
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
grant.likely-QSEj5FYQhm4dnm+yROfE0A
In-Reply-To: <20141210.204110.618599360537141819.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On 12/10/2014 08:41 PM, David Miller wrote:
> From: Murali Karicheri<m-karicheri2-l0cyMroinI0@public.gmane.org>
> Date: Wed, 10 Dec 2014 16:31:02 -0500
>
>> Are you referring to the static code analyser sparse that is invoked
>> through?
>
> You have to explicitly enable endian checking, it's not on by
> default.
Thanks, David and everyone else who responded. Let me do this and resolve any
warnings before the next post.
BTW, could you suggest anything that would help us get this series merged
upstream? It has been sitting on this list for a while now.
Thanks and regards,
--
Murali Karicheri
Linux Kernel, Texas Instruments
* Re: [PATCH iproute2 v2] ip: Simplify executing ip cmd within network ns
From: Jiri Pirko @ 2014-12-11 14:21 UTC
To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1418305103-5994-1-git-send-email-vadim4j@gmail.com>
Thu, Dec 11, 2014 at 02:38:23PM CET, vadim4j@gmail.com wrote:
>From: Vadim Kochan <vadim4j@gmail.com>
>
>Added new '-netns' option to simplify executing following cmd:
>
> ip netns exec NETNS ip OPTIONS COMMAND OBJECT
>
> to
>
> ip -n[etns] NETNS OPTIONS COMMAND OBJECT
>
>e.g.:
>
> ip -net vnet0 link add br0 type bridge
> ip -n vnet0 link
>
>Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
>---
>Changes v1 -> v2
> use -n[etns] option name: suggested by Nicolas Dichtel
> changed man ip.8 page
>
> ip/ip.c | 6 ++++++
> ip/ip_common.h | 1 +
> ip/ipnetns.c | 2 +-
> man/man8/ip.8 | 24 +++++++++++++++++++++++-
> 4 files changed, 31 insertions(+), 2 deletions(-)
>
>diff --git a/ip/ip.c b/ip/ip.c
>index 5f759d5..f3c2cdb 100644
>--- a/ip/ip.c
>+++ b/ip/ip.c
>@@ -262,6 +262,12 @@ int main(int argc, char **argv)
> rcvbuf = size;
> } else if (matches(opt, "-help") == 0) {
> usage();
>+ } else if (matches(opt, "-netns") == 0) {
>+ argc--;
>+ argv++;
>+ argv[0] = argv[1];
>+ argv[1] = basename;
>+ return netns_exec(argc, argv);
Can't the same functionality be done within the same ip process, that is,
without calling execvp on ip again? That would seem clearer to me.
How about the other tools (tc, bridge, ...)? It would be nice to have the same
option there as well.
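One way to read this suggestion is to move the running process into the namespace with setns(2) instead of executing ip a second time. The sketch below only illustrates that idea and is not code from the patch: netns_path() and switch_to_netns() are hypothetical names, it assumes named namespaces are bound under /var/run/netns (where "ip netns add" creates them), and it leaves out the /sys remount and /etc/netns bind mounts that "ip netns exec" also sets up. Calling setns() this way requires CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

#define NETNS_RUN_DIR "/var/run/netns"

/* Build the path of a named namespace; returns 0, or -1 if it does not fit. */
int netns_path(const char *name, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Move the calling process into the named network namespace. */
int switch_to_netns(const char *name)
{
	char path[4096];
	int fd, ret;

	if (netns_path(name, path, sizeof(path)) < 0)
		return -1;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;	/* no such namespace, or no permission */

	ret = setns(fd, CLONE_NEWNET);
	close(fd);
	return ret;
}
```

After a successful switch_to_netns("vnet0"), netlink sockets opened by the same process would talk to vnet0, so the rest of the command line could be handled without a second exec. The same helper could in principle be shared with tc and bridge.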
> } else {
> fprintf(stderr, "Option \"%s\" is unknown, try \"ip -help\".\n", opt);
> exit(-1);
>diff --git a/ip/ip_common.h b/ip/ip_common.h
>index 75bfb82..d4f7e1f 100644
>--- a/ip/ip_common.h
>+++ b/ip/ip_common.h
>@@ -88,6 +88,7 @@ struct link_util
> struct link_util *get_link_kind(const char *kind);
> struct link_util *get_link_slave_kind(const char *slave_kind);
> int get_netns_fd(const char *name);
>+int netns_exec(int argc, char **argv);
>
> #ifndef INFINITY_LIFE_TIME
> #define INFINITY_LIFE_TIME 0xFFFFFFFFU
>diff --git a/ip/ipnetns.c b/ip/ipnetns.c
>index 1c8aa02..367841c 100644
>--- a/ip/ipnetns.c
>+++ b/ip/ipnetns.c
>@@ -129,7 +129,7 @@ static void bind_etc(const char *name)
> closedir(dir);
> }
>
>-static int netns_exec(int argc, char **argv)
>+int netns_exec(int argc, char **argv)
> {
> /* Setup the proper environment for apps that are not netns
> * aware, and execute a program in that environment.
>diff --git a/man/man8/ip.8 b/man/man8/ip.8
>index 2d42e98..389c808 100644
>--- a/man/man8/ip.8
>+++ b/man/man8/ip.8
>@@ -31,7 +31,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
> \fB\-r\fR[\fIesolve\fR] |
> \fB\-f\fR[\fIamily\fR] {
> .BR inet " | " inet6 " | " ipx " | " dnet " | " link " } | "
>-\fB\-o\fR[\fIneline\fR] }
>+\fB\-o\fR[\fIneline\fR] |
>+\fB\-n\fR[\fIetns\fR] }
>
>
> .SH OPTIONS
>@@ -134,6 +135,27 @@ the output.
> use the system's name resolver to print DNS names instead of
> host addresses.
>
>+.TP
>+.BR "\-n" , " \-net" , " \-netns " <NETNS>
>+executes the following
>+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+in the specified network namespace
>+.IR NETNS .
>+Actually it just simplifies executing of:
>+
>+.B ip netns exec
>+.IR NETNS
>+.B ip
>+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
>+to
>+
>+.B ip
>+.RI "-n[etns] " NETNS " [ " OPTIONS " ] " OBJECT " { " COMMAND " | "
>+.BR help " }"
>+
> .SH IP - COMMAND SYNTAX
>
> .SS
>--
>2.1.3
>
* Re: [PATCH net] be2net: Export tunnel offloads only when a VxLAN tunnel is created
From: Sergei Shtylyov @ 2014-12-11 14:35 UTC
To: Sathya Perla, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <CF9D1877D81D214CB0CA0669EFAE020C68D24ADD@CMEXMB1.ad.emulex.com>
Hello.
On 12/11/2014 10:24 AM, Sathya Perla wrote:
>>> + netdev->hw_enc_features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
>>> + NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_GSO_UDP_TUNNEL);
>> Please indent this properly:
>>     netdev->hw_enc_features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
>>                                 NETIF_F_TSO | NETIF_F_TSO6 |
>>                                 NETIF_F_GSO_UDP_TUNNEL);
> Oops, checkpatch didn't seem to catch this...will fix it up and send out a v2 right away...
> thanks!
The parens are not needed here either.
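The parentheses are redundant because `|` binds more tightly than `|=` in C, so the whole right-hand side is evaluated before the assignment either way. A small stand-alone check of that, using made-up flag values in place of the real NETIF_F_* bits (the F_* macros and both function names are placeholders for this example):

```c
#include <stdint.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define F_IP_CSUM         (1ULL << 0)
#define F_IPV6_CSUM       (1ULL << 1)
#define F_TSO             (1ULL << 2)
#define F_TSO6            (1ULL << 3)
#define F_GSO_UDP_TUNNEL  (1ULL << 4)

/* Right-hand side wrapped in parentheses, as in the patch under review. */
uint64_t enc_features_with_parens(uint64_t features)
{
	features |= (F_IP_CSUM | F_IPV6_CSUM |
		     F_TSO | F_TSO6 | F_GSO_UDP_TUNNEL);
	return features;
}

/* The same statement without the parentheses. */
uint64_t enc_features_without_parens(uint64_t features)
{
	features |= F_IP_CSUM | F_IPV6_CSUM |
		    F_TSO | F_TSO6 | F_GSO_UDP_TUNNEL;
	return features;
}
```

Both functions return the same value for any input, which is all the review comment relies on.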
WBR, Sergei
* Re: [PATCH 1/1] net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
From: Sergei Shtylyov @ 2014-12-11 14:39 UTC
To: Cyrille Pitchen, nicolas.ferre, davem, linux-arm-kernel, netdev,
soren.brinkmann
Cc: linux-kernel
In-Reply-To: <efa28485b430e77f5254248cb396da431d03fc5b.1418292741.git.cyrille.pitchen@atmel.com>
Hello.
On 12/11/2014 1:15 PM, Cyrille Pitchen wrote:
Citing the warning here would be a good idea.
> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
WBR, Sergei
* Re: [PATCH iproute2 v2] ip: Simplify executing ip cmd within network ns
From: vadim4j @ 2014-12-11 14:34 UTC
To: Jiri Pirko; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141211142131.GC1912@nanopsycho.orion>
On Thu, Dec 11, 2014 at 03:21:31PM +0100, Jiri Pirko wrote:
> Thu, Dec 11, 2014 at 02:38:23PM CET, vadim4j@gmail.com wrote:
> >From: Vadim Kochan <vadim4j@gmail.com>
> >
> >Added new '-netns' option to simplify executing following cmd:
> >
> > ip netns exec NETNS ip OPTIONS COMMAND OBJECT
> >
> > to
> >
> > ip -n[etns] NETNS OPTIONS COMMAND OBJECT
> >
> >e.g.:
> >
> > ip -net vnet0 link add br0 type bridge
> > ip -n vnet0 link
> >
> >Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> >---
> >Changes v1 -> v2
> > use -n[etns] option name: suggested by Nicolas Dichtel
> > changed man ip.8 page
> >
> > ip/ip.c | 6 ++++++
> > ip/ip_common.h | 1 +
> > ip/ipnetns.c | 2 +-
> > man/man8/ip.8 | 24 +++++++++++++++++++++++-
> > 4 files changed, 31 insertions(+), 2 deletions(-)
> >
> >diff --git a/ip/ip.c b/ip/ip.c
> >index 5f759d5..f3c2cdb 100644
> >--- a/ip/ip.c
> >+++ b/ip/ip.c
> >@@ -262,6 +262,12 @@ int main(int argc, char **argv)
> > rcvbuf = size;
> > } else if (matches(opt, "-help") == 0) {
> > usage();
> >+ } else if (matches(opt, "-netns") == 0) {
> >+ argc--;
> >+ argv++;
> >+ argv[0] = argv[1];
> >+ argv[1] = basename;
> >+ return netns_exec(argc, argv);
>
Hi,
>
> Can't the same functionality be done in the same ip process, meaning
> without execvp ip again? It would seem clearer to me.
>
Hm, yes, I will look into this.
> How about other tools (tc,bridge,..) ? It would be nice to have the same
> option there as well.
>
Sure, good idea, I will try.
Thanks,
* [PATCH net v9 0/7] cxgb4/cxgbi: misc. fixes for cxgb4i
From: Karen Xie @ 2014-12-11 15:25 UTC
To: linux-scsi, netdev
Cc: kxie, hariprasad, anish, hch, James.Bottomley, michaelc, davem
This patch set fixes cxgb4i's tx credit calculation and adds handling of
additional rx message and negative advice types. It also removes the duplicate
code in cxgb4i to set the outgoing queues of a packet.
Karen Xie (7):
cxgb4i: fix tx immediate data credit check
cxgb4i: fix credit check for tx_data_wr
cxgb4/cxgb4i: set max. outgoing pdu length in the f/w
cxgb4i: add more types of negative advice
cxgb4i: handle non pdu-aligned rx data
cxgb4i: use cxgb4's set_wr_txq() for setting outgoing queues
libcxgbi: fix the debug print accessing skb after it is freed
Sending to net, as the fixes are mostly in the network area and touch
cxgb4's header file (t4fw_api.h).
v2 corrects the "CHECK"s flagged by checkpatch.pl --strict.
v3 splits the 3rd patch from v2 into two separate patches, adds detailed commit
messages, and makes the subjects more concise. Patch 3/6 also changes the return
value of is_neg_adv() from int to bool.
v4 -- please ignore.
v5 splits the 1st patch from v3 into two separate patches and reduces code
duplication in make_tx_data_wr().
v6 removes the code style cleanup from the 2nd patch. The style update will be
addressed in a separate patch.
v7 updates the 7th patch with a more detailed commit message.
v8 removes the duplicate subject lines from the message bodies.
v9 reformats the commit messages to a maximum of 80 characters per line.