Netdev List


* Re: [bisected] tg3 broken in 3.18.0?
From: Michael Chan @ 2014-12-16 17:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rajat Jain, Marcelo Ricardo Leitner, Nils Holland, David Miller,
	netdev, linux-pci@vger.kernel.org, Rafael Wysocki,
	Prashant Sreedharan
In-Reply-To: <CAErSpo5dqQE7nZ6zf2odgpHBWA3ZpTjhbgQKnY8YxQW+a+298w@mail.gmail.com>

On Tue, 2014-12-16 at 09:20 -0700, Bjorn Helgaas wrote:
> I think we're in this path:
> 
>     tg3_init_hw
>       tg3_reset_hw
>         tg3_disable_ints
>         tg3_stop_fw
>         tg3_write_sig_pre_reset
>         tg3_chip_reset
>           pci_device_is_present
>             pci_bus_read_dev_vendor_id
> 
> and in this case pci_device_is_present() also passes a timeout of zero
> to pci_bus_read_dev_vendor_id().  My guess is that tg3 is resetting
> the device, so it's not too surprising that the config read returns
> CRS status immediately afterward.
> 
At the point of calling pci_device_is_present(), chip reset hasn't
started yet, so there should be no problem reading config space.

In all the newer tg3 chips, chip reset does not reset the PCIE block.
So I think config space should always be accessible even during reset.

^ permalink raw reply

* Re: BCM4313 & brcmsmac & 3.12: only semi-working?
From: Arend van Spriel @ 2014-12-16 16:51 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Maximilian Engelhardt, Rafał Miłecki, Seth Forshee,
	brcm80211 development, linux-wireless@vger.kernel.org,
	Network Development
In-Reply-To: <547F0575.7010104@broadcom.com>

On 12/03/14 13:43, Arend van Spriel wrote:
> On 12/02/14 22:40, Michael Tokarev wrote:
>> 30.11.2014 15:04, Arend van Spriel wrote:
>>
>>> Thanks. Did not find what I was looking for, but I started working on
>>> integrating btcoex related functionality. The attached patch will print
>>> some info so I can focus on the required functionality for your device.
>>> It is based on 3.18-rc5.
>>
>> With this patch applied against 3.18-rc5, the machine instantly reboots
>> once brcmsmac module is loaded. I'm still debugging this.

Hmm. The function brcms_btc_ecicoex_enab() is calling itself. Please 
remove that call as it causes endless recursion and eventually reboot.

Regards,
Arend

> Argh. Probably the register access I added ends up in limbo land or some
> other stupid mistake. I will double check my patch.
>
> Regards,
> Arend
>
>> Thanks,
>>
>> /mjt
>

^ permalink raw reply

* [PATCH_V2] dm9000: Add regulator and reset support to dm9000
From: Zubair Lutfullah Kakakhel @ 2014-12-16 16:46 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, paul.burton-1AXoQHu6uovQT0dZR+AlfA,
	Zubair.Kakakhel-1AXoQHu6uovQT0dZR+AlfA

In boards, the dm9000 chip's power and reset can be controlled by gpio.

It makes sense to add them to the dm9000 driver and let dt be used to
enable power and reset the phy.

Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Paul Burton <paul.burton-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
---
V2. Fixed a small blooper. dev_dgb -> dev_dbg

---
 .../devicetree/bindings/net/davicom-dm9000.txt     |  4 +++
 drivers/net/ethernet/davicom/dm9000.c              | 33 ++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/davicom-dm9000.txt b/Documentation/devicetree/bindings/net/davicom-dm9000.txt
index 28767ed..dba19a2 100644
--- a/Documentation/devicetree/bindings/net/davicom-dm9000.txt
+++ b/Documentation/devicetree/bindings/net/davicom-dm9000.txt
@@ -11,6 +11,8 @@ Required properties:
 Optional properties:
 - davicom,no-eeprom : Configuration EEPROM is not available
 - davicom,ext-phy : Use external PHY
+- reset-gpio : phandle of gpio that will be used to reset chip during probe
+- vcc-supply : phandle of regulator that will be used to enable power to chip
 
 Example:
 
@@ -21,4 +23,6 @@ Example:
 		interrupts = <7 4>;
 		local-mac-address = [00 00 de ad be ef];
 		davicom,no-eeprom;
+		reset-gpio = <&gpf 12 GPIO_ACTIVE_LOW>;
+		vcc-supply = <&eth0_power>;
 	};
diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index ef0bb58..97dbeec 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -36,6 +36,9 @@
 #include <linux/platform_device.h>
 #include <linux/irq.h>
 #include <linux/slab.h>
+#include <linux/regulator/consumer.h>
+#include <linux/gpio.h>
+#include <linux/of_gpio.h>
 
 #include <asm/delay.h>
 #include <asm/irq.h>
@@ -1426,11 +1429,41 @@ dm9000_probe(struct platform_device *pdev)
 	struct dm9000_plat_data *pdata = dev_get_platdata(&pdev->dev);
 	struct board_info *db;	/* Point a board information structure */
 	struct net_device *ndev;
+	struct device *dev = &pdev->dev;
 	const unsigned char *mac_src;
 	int ret = 0;
 	int iosize;
 	int i;
 	u32 id_val;
+	int reset_gpio;
+	enum of_gpio_flags flags;
+	struct regulator *power;
+
+	power = devm_regulator_get(dev, "vcc");
+	if (IS_ERR(power)) {
+		dev_dbg(dev, "no regulator provided\n");
+	} else if (!regulator_is_enabled(power)) {
+		ret = regulator_enable(power);
+		dev_dbg(dev, "regulator enabled\n");
+	}
+
+	reset_gpio = of_get_named_gpio_flags(dev->of_node, "reset-gpio", 0,
+					     &flags);
+	if (gpio_is_valid(reset_gpio)) {
+		ret = devm_gpio_request_one(dev, reset_gpio, flags,
+					    "dm9000_reset");
+		if (ret) {
+			dev_err(dev, "failed to request reset gpio %d: %d\n",
+				reset_gpio, ret);
+		} else {
+			gpio_direction_output(reset_gpio, 0);
+			/* According to manual PWRST# Low Period Min 1ms */
+			msleep(2);
+			gpio_direction_output(reset_gpio, 1);
+			/* Needs 3ms to read eeprom when PWRST is deasserted */
+			msleep(4);
+		}
+	}
 
 	if (!pdata) {
 		pdata = dm9000_parse_dt(&pdev->dev);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH net-next v2 2/4] swdevice: add new api to set and del bridge port attributes
From: John Fastabend @ 2014-12-16 16:41 UTC (permalink / raw)
  To: Arad, Ronen
  Cc: Roopa Prabhu, netdev@vger.kernel.org, Jamal Hadi Salim,
	Jiri Pirko, sfeldma@gmail.com, bcrl@kvack.org, tgraf@suug.ch,
	stephen@networkplumber.org, linville@tuxdriver.com,
	vyasevic@redhat.com, davem@davemloft.net, shm@cumulusnetworks.com,
	gospo@cumulusnetworks.com
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DB15CA@ORSMSX101.amr.corp.intel.com>

On 12/16/2014 03:01 AM, Arad, Ronen wrote:
>
> In my reply (inline) I elaborate on the validity of bridge-less and offloaded-bridge models for L2 switching.
>
> I also discuss the implied necessity of a bridge device for L3 routing and potential issues with the upcoming FIB offloading proposal.
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>> Sent: Tuesday, December 16, 2014 3:21 AM
>> To: Arad, Ronen
>> Cc: Jamal Hadi Salim; John Fastabend; netdev@vger.kernel.org; Jiri Pirko;
>> sfeldma@gmail.com; bcrl@kvack.org; tgraf@suug.ch;
>> stephen@networkplumber.org; linville@tuxdriver.com;
>> vyasevic@redhat.com; davem@davemloft.net;
>> shm@cumulusnetworks.com; gospo@cumulusnetworks.com
>> Subject: Re: [PATCH net-next v2 2/4] swdevice: add new api to set and del
>> bridge port attributes
>>
>> On 12/15/14, 4:58 PM, Arad, Ronen wrote:
>>>
>>>> -----Original Message-----
>>>> From: Jamal Hadi Salim [mailto:jhs@mojatatu.com]
>>>> Sent: Tuesday, December 16, 2014 1:28 AM
>>>> To: Arad, Ronen; John Fastabend; netdev@vger.kernel.org
>>>> Cc: Roopa Prabhu; Jiri Pirko; sfeldma@gmail.com; bcrl@kvack.org;
>>>> tgraf@suug.ch; stephen@networkplumber.org; linville@tuxdriver.com;
>>>> vyasevic@redhat.com; davem@davemloft.net;
>> shm@cumulusnetworks.com;
>>>> gospo@cumulusnetworks.com
>>>> Subject: Re: [PATCH net-next v2 2/4] swdevice: add new api to set and
>>>> del bridge port attributes
>>>>
>>>> On 12/15/14 13:36, Arad, Ronen wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>> The behavior of a driver could depend on the presence of a bridge
>>>>> and
>>>> features such as FDB LEARNING and LEARNING_SYNC.
>>>>
>>>> Indeed, those are bridge attributes.
>>>>
>>>>> A switch port driver which is not enslaved to a bridge might need to
>>>>> implement VLAN-aware FDB within the driver and report its content to
>>>>> user-
>>>> space using ndo_fdb_dump.
>>>>    >
>>>>> A switch port driver which is enslaved to a bridge could do with
>>>>> only pass through for static FDB configuration
>>>>    > to the HW when LEARNING_SYNC is configured. FDB reporting to
>>>> user- space and soft aging are left to the bridge module FDB.
>>>>> Such driver, without LEARNING_SYNC could still avoid maintaining
>>>>> in-driver
>>>> FDB as long as it could dump the HW FDB on demand.
>>>>> LEARNING_SYNC also requires periodic updates of freshness
>>>>> information
>>>> from the driver to the bridge module.
>>>>
>>>> If you have an fdb - shouldnt that be exposed only if you have a
>>>> bridge abstraction exposed? i.e thats where the Linux tools would work.
>>> I'm trying to find out what are the opinions of other people in the netdev
>> list.
>>> John has clearly stated that he'd like to see full L2 switching functionality
>> (at least) supported without making a bridge device mandatory.
>>> The existing bridge ndos (ndo_bridge_{set,del,get}link) already support that
>> with proper setting of SELF/MASTER flags by iproute2.
>>> I see the value in supporting both approaches (bridge device mandatory
>>> and bridge device optional). If the choice is left to user-driven policy decision,
>>> we need to document both use models and map traditional L2 features to
>>> each model.
>>> The L2 offloading (or NETFUNC as it is currently called), which is being
>>> discussed on a different patch-set, is only needed when a bridge device is
>>> used.
>>> Without a bridge device, all configuration has to be targeted at the switch
>>> port driver directly using the SELF flag. FDB remains relevant and it is used to
>>> configure static MAC table entries and dump the HW MAC table.
>
>> Your understanding is right here. So far all patches have kept both models in
>> mind.
>
>
>>> When the HW device is a L2 switch or a multi-layer switch (L2-L3 or even
>>> higher), there is a gap between what the HW is doing and what is explicitly
>>> modeled in Linux.
>
>
>> Can you elaborate more here ?. We use the linux model to accelerate a
>> multi-layer (l2-l3) switch today. There maybe a few gaps, but these gaps can
>> be closed by having equivalent functionality in the software path.
>
> What I meant is that without a bridge device the HW switch is seen as a collection of independent switch ports. Typical switch ASIC performs L2 switching by default. This is not expressed explicitly in Linux without a bridge device.
> The SELF flag is used to target typical bridge port and bridge configuration at a switch port device.
> Without an explicit bridge device, bridge attributes have to be directed at an arbitrary port (any port could represent the entire switch) and interpreted by the switch port driver as intended for the entire switch (this includes attributes like STP etc.)
> Each switch port device driver has to implement similar functionality (i.e. all bridge and fdb related ndos) independently without common functionality shared (e.g. FDB, soft aging).
> It is a valid use model and could avoid the complexity of having to deal with the presence of both SW and HW bridge and to deal with explicit offloading of data-path.
>
> I was trying to find out whether the intention was to continue and support both bridge-less and offloaded-bridge models and leave it to the end-user to choose the desirable model at configuration time.
> This would require dual support in the switch port driver in order to have best user experience across multiple switch ASICs or other kinds of devices.
>

I'm still missing why there are duplicate implementations in the driver.
If the driver implements the set of ndo ops why should it care who calls
them? I think you tried to explain this already but I'm not seeing it.

[...]

I'll need to think about the l3 stuff but I think Jiri/Scott/Roopa
might have worked some of it out.

-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* Re: [PATCH] dm9000: Add regulator and reset support to dm9000
From: Zubair Lutfullah Kakakhel @ 2014-12-16 16:41 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, paul.burton-1AXoQHu6uovQT0dZR+AlfA
In-Reply-To: <1418747624-2682-1-git-send-email-Zubair.Kakakhel-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>



On 16/12/14 16:33, Zubair Lutfullah Kakakhel wrote:
...

> +
> +	power = devm_regulator_get(dev, "vcc");
> +	if (IS_ERR(power)) {
> +		dev_dbg(dev, "no regulator provided\n");
> +	} else if (!regulator_is_enabled(power)) {
> +		ret = regulator_enable(power);
> +		dev_dgb(dev, "regulator enabled\n");
		^dev_dbg

Apologies. This fix wasn't squashed in. I'll resend.

> +	}
> +
> +	reset_gpio = of_get_named_gpio_flags(dev->of_node, "reset-gpio", 0,
> +					     &flags);

ZubairLK

^ permalink raw reply

* Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: John Fastabend @ 2014-12-16 16:35 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Hubert Sokolowski, Roopa Prabhu, netdev@vger.kernel.org,
	Vlad Yasevich
In-Reply-To: <54902E5E.2070405@mojatatu.com>

On 12/16/2014 05:06 AM, Jamal Hadi Salim wrote:
> On 12/15/14 19:45, John Fastabend wrote:
>> On 12/15/2014 06:29 AM, Jamal Hadi Salim wrote:
>
>>
>> hmm good question. When I implemented this on the host nics with SR-IOV,
>> VMDQ, etc. The multi/unicast addresses were propagated into the FDB by
>> the driver.
>
> So if i understand correctly, this is a NIC with an FDB. And there is no
> concept of a bridge to which it is attached. To the point of
> classical uni/multicast addresses on a netdev abstraction; these
> are typically stored in *much simpler tables* (used to be IO
> registers back in the day)

From a model perspective it looks like an edge relay. Only a single
downlink with multiple uplinks. No learning, no loops and so no
STP, et. al. required. It may or may not support MAC+VLAN forwarding
or just MAC forwarding.

It may be configured via register writes or more complicated firmware
requests or some other mechanism. This is device dependent; even across
devices from the same vendor the mechanisms change. But the driver
abstracts this.

> Do these NICs not have such a concept?
> An fdb entry has an egress port column; I have seen cases where the
> port is labeled as "Cpu port" which would mean it belongs to the host;

But in the SR-IOV case you have multiple "Cpu ports" and you want
to send packets to each of them depending on the configuration.

    port0   port1     port2  port3
     |        |        |      |      uplinks
  +------------------------------+
  |                              |
  |       SRIOV edge relay       |
  |                              |
  +------------------------------+
                  |                   downlink

In a host nic with SRIOV each port will be a PCIE function. So really
they are all CPU ports. For multi-function devices they might all be
physical functions.

In the hardware there needs to be a table to forward incoming traffic
to the correct port#. For L2 we use MAC+VLAN and an egress port column
to select the port. The model shouldn't care if the port is backed by
a VF or PF or set of queues. It just needs to forward packets to the
correct uplink.
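
The table described above can be sketched as a lookup keyed on MAC+VLAN
that yields an egress port. This is a minimal illustration only, not code
from any driver: the names fdb_entry, fdb_lookup and FDB_NO_PORT are
hypothetical, and a linear scan stands in for whatever hash or TCAM the
hardware actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDB_NO_PORT (-1)	/* hypothetical "no match" marker */

/* One row of the (hypothetical) forwarding table: key is MAC+VLAN. */
struct fdb_entry {
	uint8_t  mac[6];
	uint16_t vlan;
	int      port;		/* egress port column */
};

/*
 * Return the egress port for a MAC+VLAN pair, or FDB_NO_PORT when the
 * table has no matching entry. The caller decides what a miss means
 * (drop, flood, send to the downlink, ...).
 */
static int fdb_lookup(const struct fdb_entry *tbl, size_t n,
		      const uint8_t mac[6], uint16_t vlan)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].vlan == vlan && memcmp(tbl[i].mac, mac, 6) == 0)
			return tbl[i].port;
	}
	return FDB_NO_PORT;
}
```

Note that the lookup does not care whether the port is backed by a VF, a
PF or a set of queues; the port number is just an index for the caller.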

One issue we have today when writing software for these edge relays
is we don't have a netdev representing the downlink. Or a netdev
representing management functions of the device. So if I want to
say change the mode of the edge relay from VEB to VEPA I usually
just send the message to the PF. Or if I want to send packets out on
the wire but not through the edge relay usually we do this by sending
control packets over an elected PF and it will attach a tag or something
so the edge relay doesn't forward or flood them to other uplinks. Adding
a netdev for the downlink would probably clean some of this up. Now
we rely on some behaviour that is not well-defined.

> but in this case it just seems there is no such concept and as Or
> brought up in another email - what does "VLANid" mean in such a case?

I think most host nics with SR-IOV can forward using VLAN + MAC and
do filtering on VLANid. Many can also put a default VLAN on the packet.

> If we go with a CPU port concept,
> We could then use the concept of a vlan filter on a port basis
> but then what happens when you dont have an fdb (majority of cases)?

Not sure what the question is here.. I'm hoping the above helped
explain my thinking on this.

Don't have an FDB? This means you don't have any way to forward
between ports so you must have a 1:1 mapping between the physical
port and the netdev. I think it's fair to think of this as a TPMR
(two port mac relay) although not a very useful abstraction.

>
>> My logic was if some netdev ethx has a set of MAC addresses
>> above it well then any virtual function or virtual device also behind
>> the hardware shouldn't be sending those addresses out the egress switch
>> facing port. Otherwise the switch will see packets it knows are behind
>> that port and drop them. Or flood them if it hasn't learned the address
>> yet. Either way they will never get to the right netdev.
>>
>> Admittedly I wasn't thinking about switches with many ports at the time.
>>
>
> I often struggle with trying to "box" SRIOV into some concept of a
> switch abstraction and sometimes i am puzzled.
> Would exposing the SRIOV underlay as a switch not have solved this
> problem? Then the virtual ports essentially are bridge ports.

Yes this would help and this is how I view it. Although the
edge relay vs "real standards based" bridge distinction is important
because we don't do learning, only have a single uplink, don't run
loop detecting protocols, etc. All that stuff is not needed on a host
where you "know" your MAC addresses (at least for many use cases) and
can not build loops.

> Maybe what we need is a concept of a "edge relay" extended netdev?

This is effectively what the fdb table does, right? Sure, it's not as
explicit as it could be but this is how I treat the NIC when I learn
it has multiple downlinks and a single uplink. At the moment we use
a trick similar to Jiri's on rocker, when we get a switch op like
getlink, setlink we "know" what switch object it refers to because
the netdev maps to a single switch always.

> These things would have an fdb as well down and uplink relay ports that
> can be attached to them.
>

Right in the current code paths there is no "attach" operation we assume
the edge relay and ports are attached when the ports are created via
SR-IOV or hw-offload or whatever.

What are we missing? We have the FDB and a unique id to show ports on
the same edge relay. User space can build this abstraction from those
two things. A downlink netdev port would probably clean up the
abstraction a bit especially for sending control frames.

>
>>> Some of these drivers may be just doing the LinuxWay(aka cutnpaste what
>>> the other driver did).
>>
>> My original thinking here was... if it didn't implement fdb_add, fdb_del
>> and fdb_dump then if you wanted to think of it as having forwarding
>> database that was fine but it was really just a two port mac relay. In
>> which case just dump all the mac addresses it knows about. In this case
>> if it was something more fancy it could do its own dump like vxlan or
>> macvlan.
>>
>
> The challenge here is lack of separation between a NIC's uni/multicast
> ports which it owns - which is a traditional operation regardless of
> what capabilities the NIC has; vs an fdb which may have many
> other capabilities. Probably all NICs capable of many MACs implement
> fdbs?

Yes, they must to support forwarding. Agreed, it's a bit clunky the
way we overload uni/multicast address lists. But what does it mean
to add a unicast address to a port and not have it in the FDB? If
the port wants to receive traffic on a MAC because its added to the
unicast list doesn't it mean insert it into the FDB so the packets
actually get sent to the netdev?

Otherwise it's a two-step process: one, add it to the multicast list
and then add it to the FDB. I'm not sure why this is valuable.

>
>> For a host nic ucast/multicast and fdb are the same, I think? The
>> code we had was just short-hand to allow the common case a host nic
>> to work. Notice vxlan and bridge drivers didn't dump their addr lists
>> from fdb_dump until your patch.
>>
>> Perhaps my implementation of macvlan fdb_{add|del|dump} is buggy. And
>> I shouldn't overload the addr lists.
>>
>
> Not just those - I am wondering about the general utility of what
> Hubert was trying to do if all the driver does is call the default
> dumper based on some flags presence and the default dumper
> does a dump of uni/multicast host entries. Those are not really fdb
> entries in the traditional sense.

But as a practical matter any uni/multicast entry is in the FDB
so when the host nic has multiple ports we receive those mac addresses
on the port. The drivers do this today and it seems reasonable to me.

> Is there no way to get the unicast/multicast mac addresses for such
> a driver?

You can almost infer it from ip link by looking at all the stacked
drivers and figuring out how the addresses are propagated down. Then
look at the routes and figure out multicast address. But other than
the fdb dump mechanism I don't think there is anything.

> I think that would help bring clarity to my confusion.
>

clear as mud now?

>
>>
>> I'm interested to see what Vlad says as well. But the current situation
>> is previously some drivers dumped their addr lists others didn't.
>> Specifically, the more switch like devices (bridge, vxlan) didn't. Now
>> every device will dump the addr lists. I'm not entirely convinced that
>> is correct.
>>
>
> I am glad this happened ;-> Otherwise we wouldnt be having this
> discussion. When Vlad was asking me I was in a rush to get the patch
> out and didnt question because i thought this was something some crazy
> virtualization people needed.
> If Vlad's use case goes away, then Hubert's little restoration is fine.

Yep. Maybe we can talk about it at the netdev users conference.

>
>
>> It works OK for host nics (NICS that can't forward between ports) and
>> seems at best confusing for real switch asics.
>
> So if these NICs have fdb entries and i programmed it (meaning setting
> which port a given MAC should be sent to), would it not work?

You mean via 'bridge fdb add' yes this will work. But then as a short
hand we also program the ucast/multicast addresses. (have I beaten this
to death yet?)

>
>> On a related question do
>> you expect the switch asic to trap any packets with MAC addresses in
>> the multi/unicast address lists and send them to the correct netdev? Or
>> will the switch forward them using normal FDB tables?
>>
>
> I think there would be a separate table for that. Roopa, can you check
> with the ASICs you guys work on? The point i was trying to make above
> is today there is a uni/multicast list or table of sorts that all NICs
> expose.
> There's always the hack of a "cpu port". I have also seen the "cpu port"
> being conceptualized in L3 tables to imply "next hop is cpu" where you
> have an IP address owned by the host; so maybe we need a concept of a
> cpu port or again the revival of TheThing class device.

OK, the confusing part of "cpu port" to me is that trying to map this
abstraction onto a host nic implies the host nic may have many "cpu
ports".

Thanks,
.John

>
> cheers,
> jamal
>

-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* [PATCH] dm9000: Add regulator and reset support to dm9000
From: Zubair Lutfullah Kakakhel @ 2014-12-16 16:33 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, paul.burton-1AXoQHu6uovQT0dZR+AlfA,
	Zubair.Kakakhel-1AXoQHu6uovQT0dZR+AlfA

In boards, the dm9000 chip's power and reset can be controlled by gpio.

It makes sense to add them to the dm9000 driver and let dt be used to
enable power and reset the phy.

Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Paul Burton <paul.burton-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
---
 .../devicetree/bindings/net/davicom-dm9000.txt     |  4 +++
 drivers/net/ethernet/davicom/dm9000.c              | 33 ++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/davicom-dm9000.txt b/Documentation/devicetree/bindings/net/davicom-dm9000.txt
index 28767ed..dba19a2 100644
--- a/Documentation/devicetree/bindings/net/davicom-dm9000.txt
+++ b/Documentation/devicetree/bindings/net/davicom-dm9000.txt
@@ -11,6 +11,8 @@ Required properties:
 Optional properties:
 - davicom,no-eeprom : Configuration EEPROM is not available
 - davicom,ext-phy : Use external PHY
+- reset-gpio : phandle of gpio that will be used to reset chip during probe
+- vcc-supply : phandle of regulator that will be used to enable power to chip
 
 Example:
 
@@ -21,4 +23,6 @@ Example:
 		interrupts = <7 4>;
 		local-mac-address = [00 00 de ad be ef];
 		davicom,no-eeprom;
+		reset-gpio = <&gpf 12 GPIO_ACTIVE_LOW>;
+		vcc-supply = <&eth0_power>;
 	};
diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index ef0bb58..7333b8d 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -36,6 +36,9 @@
 #include <linux/platform_device.h>
 #include <linux/irq.h>
 #include <linux/slab.h>
+#include <linux/regulator/consumer.h>
+#include <linux/gpio.h>
+#include <linux/of_gpio.h>
 
 #include <asm/delay.h>
 #include <asm/irq.h>
@@ -1426,11 +1429,41 @@ dm9000_probe(struct platform_device *pdev)
 	struct dm9000_plat_data *pdata = dev_get_platdata(&pdev->dev);
 	struct board_info *db;	/* Point a board information structure */
 	struct net_device *ndev;
+	struct device *dev = &pdev->dev;
 	const unsigned char *mac_src;
 	int ret = 0;
 	int iosize;
 	int i;
 	u32 id_val;
+	int reset_gpio;
+	enum of_gpio_flags flags;
+	struct regulator *power;
+
+	power = devm_regulator_get(dev, "vcc");
+	if (IS_ERR(power)) {
+		dev_dbg(dev, "no regulator provided\n");
+	} else if (!regulator_is_enabled(power)) {
+		ret = regulator_enable(power);
+		dev_dgb(dev, "regulator enabled\n");
+	}
+
+	reset_gpio = of_get_named_gpio_flags(dev->of_node, "reset-gpio", 0,
+					     &flags);
+	if (gpio_is_valid(reset_gpio)) {
+		ret = devm_gpio_request_one(dev, reset_gpio, flags,
+					    "dm9000_reset");
+		if (ret) {
+			dev_err(dev, "failed to request reset gpio %d: %d\n",
+				reset_gpio, ret);
+		} else {
+			gpio_direction_output(reset_gpio, 0);
+			/* According to manual PWRST# Low Period Min 1ms */
+			msleep(2);
+			gpio_direction_output(reset_gpio, 1);
+			/* Needs 3ms to read eeprom when PWRST is deasserted */
+			msleep(4);
+		}
+	}
 
 	if (!pdata) {
 		pdata = dm9000_parse_dt(&pdev->dev);
-- 
1.9.1


^ permalink raw reply related

* FIXED_PHY is broken...
From: David Miller @ 2014-12-16 16:25 UTC (permalink / raw)
  To: netdev; +Cc: f.fainelli

I get this now when I run oldconfig:

warning: (NET_DSA_BCM_SF2 && BCMGENET && SYSTEMPORT) selects FIXED_PHY which has unmet direct dependencies (NETDEVICES && PHYLIB=y)

For the thousandth time, you cannot select Kconfig options which have
dependencies of any kind, because select does not recursively cause
dependencies to be enabled up to the root of the Kconfig tree.

If you select on something which has a "depends on", stop right there
because you can't do it.

It only works for pure leaf Kconfig nodes with no deps.

All you needed to do in order to test this was do an allmodconfig
build.
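
The point about select can be shown with a toy model. This is a sketch
under stated assumptions, not how the Kconfig tools are implemented: the
names ksym, ksym_select and ksym_has_unmet_deps are hypothetical, symbols
are plain booleans (tristate y/m is ignored), and only direct
dependencies are checked.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical model of a config symbol with a "depends on" list. */
struct ksym {
	const char *name;
	bool enabled;
	const struct ksym *deps[MAX_DEPS];	/* NULL-terminated */
};

/*
 * Model of "select": force the target on. Nothing here walks the
 * target's dependency list, which is exactly the problem.
 */
static void ksym_select(struct ksym *target)
{
	target->enabled = true;
}

/* True if sym is enabled while one of its direct dependencies is not. */
static bool ksym_has_unmet_deps(const struct ksym *sym)
{
	if (!sym->enabled)
		return false;
	for (size_t i = 0; i < MAX_DEPS && sym->deps[i]; i++) {
		if (!sym->deps[i]->enabled)
			return true;
	}
	return false;
}
```

With NETDEVICES on and PHYLIB off, selecting FIXED_PHY in this model
leaves FIXED_PHY enabled with an unmet dependency, which is the situation
the oldconfig warning reports.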

^ permalink raw reply

* Re: [bisected] tg3 broken in 3.18.0?
From: Bjorn Helgaas @ 2014-12-16 16:20 UTC (permalink / raw)
  To: Rajat Jain
  Cc: Marcelo Ricardo Leitner, Nils Holland, David Miller, netdev,
	linux-pci@vger.kernel.org, Rafael Wysocki, Prashant Sreedharan,
	Michael Chan
In-Reply-To: <CAA93t1qyZE-9tw8pg1KG6g4iyy0QMW=iass5w=6ZGMTMu+vi_A@mail.gmail.com>

[+cc Rafael, Prashant, Michael]

On Tue, Dec 16, 2014 at 9:04 AM, Rajat Jain <rajatxjain@gmail.com> wrote:
> Hello All,
>
> Apologies for jumping in late, but for some reason I do not see the
> original mail in my inbox. However I am taking a look at the mails as
> sent on linux-pci (and I will keep an eye out for the bug report that
> Bjorn asked for).
>
>
>>
>> I'm getting, with commit 89665a6a71408796565bfd29cfa6a7877b17a667:
>>
>> $ grep 'pci 0000:02' tg3.bad
>> [    0.190733] pci 0000:02:00.0: 1st 165a14e4 14e4
>> [    0.190736] pci 0000:02:00.0: 1st 165a14e4 14e4
>> [    0.190810] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
>> [    0.190885] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
>> [    0.191048] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
>> [    0.191382] pci 0000:02:00.0: PME# supported from D3hot D3cold
>> [    0.191438] pci 0000:02:00.0: System wakeup disabled by ACPI
>> [    1.561555] pci 0000:02:00.0: 1st 1 1
>> [    1.561558] pci 0000:02:00.0: crs_timeout: 0
>> [   20.412021] pci 0000:02:00.0: 1st 1 1
>> [   20.412022] pci 0000:02:00.0: crs_timeout: 0
>> [   20.413596] pci 0000:02:00.0: 1st 1 1
>> [   20.413598] pci 0000:02:00.0: crs_timeout: 0
>>
>> And without it:
>>
>> $ grep 'pci 0000:02' tg3.good
>> [    0.190734] pci 0000:02:00.0: 1st 165a14e4 14e4
>> [    0.190738] pci 0000:02:00.0: 1st 165a14e4 14e4
>> [    0.190811] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
>> [    0.190884] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
>> [    0.191047] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
>> [    0.191380] pci 0000:02:00.0: PME# supported from D3hot D3cold
>> [    0.191439] pci 0000:02:00.0: System wakeup disabled by ACPI
>> [    1.576778] pci 0000:02:00.0: 1st 1 1
>> [   19.068517] pci 0000:02:00.0: 1st 165a14e4 14e4
>>
>
> It seems that in the first 2 attempts that were made to probe the
> device are all OK and return regular device ID and vendor ID for TG3
> (CRS does not have a role to play). However, later attempts return a
> CRS.
>
> 1) May I ask if you are using acpihp or pciehp? I assume pciehp?
>
> 2) Can you please also send dmesg output while passing
> pciehp.pciehp_debug=1? In the fail case, do you see a message
> indicating the pciehp gave up since it got CRS for a long time
> (something like "pci 0000:02:00.0 id reading try 50 times with
> interval 20 ms to get ffff0001")?
>
> 3) Currently the pciehp passes "0" for the argument "crs_timeout" to
> pci_bus_read_dev_vendor_id(). Can you please try increasing it to, say
> 30 seconds (30 * 1000). (For comparison data, acpihp uses the value
> 60*1000 i.e. 60 seconds today) and run the fail case once again?

Using zero for the timeout seems bogus to me.  But I doubt pciehp is
involved in this situation.

I think we're in this path:

    tg3_init_hw
      tg3_reset_hw
        tg3_disable_ints
        tg3_stop_fw
        tg3_write_sig_pre_reset
        tg3_chip_reset
          pci_device_is_present
            pci_bus_read_dev_vendor_id

and in this case pci_device_is_present() also passes a timeout of zero
to pci_bus_read_dev_vendor_id().  My guess is that tg3 is resetting
the device, so it's not too surprising that the config read returns
CRS status immediately afterward.
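
To make the role of the timeout concrete, here is a small user-space
model of a vendor-ID read that retries while the device answers with CRS.
It is a sketch, not the kernel source: fake_dev, fake_read_vendor_id()
and read_dev_vendor_id() are hypothetical names, delays are only counted
rather than slept, and the only facts it relies on are that CRS shows up
as vendor ID 0x0001 and that a zero timeout means "do not retry".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device: answers CRS for the first
 * crs_reads_left config reads, then returns its real ID. */
struct fake_dev {
	int crs_reads_left;
	uint32_t real_id;	/* e.g. 0x165a14e4, as in the logs above */
};

static uint32_t fake_read_vendor_id(struct fake_dev *dev)
{
	if (dev->crs_reads_left > 0) {
		dev->crs_reads_left--;
		return 0xffff0001;	/* vendor ID 0x0001 signals CRS */
	}
	return dev->real_id;
}

/* Retry with a doubling delay until the device stops reporting CRS or
 * the time budget (in ms) is used up.  Returns true if *id is valid. */
static bool read_dev_vendor_id(struct fake_dev *dev, uint32_t *id,
			       int crs_timeout_ms)
{
	int delay_ms = 1, waited_ms = 0;

	assert(dev && id);
	*id = fake_read_vendor_id(dev);

	while ((*id & 0xffff) == 0x0001) {
		if (waited_ms >= crs_timeout_ms)
			return false;	/* zero timeout fails on first CRS */
		waited_ms += delay_ms;	/* real code would sleep here */
		delay_ms *= 2;
		*id = fake_read_vendor_id(dev);
	}
	return true;
}
```

With a timeout of zero, a single CRS reply is enough to make the read
fail, whatever the reason for the CRS; with a non-zero budget the same
device would be found once it starts answering.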

Bjorn

^ permalink raw reply

* Re: [PATCH 0/5] tun/macvtap: TUNSETIFF fixes
From: David Miller @ 2014-12-16 16:20 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, netdev, dan.carpenter, jasowang
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 16 Dec 2014 15:04:53 +0200

> Dan Carpenter reported the following:
 ...
> And that's true: we have run out of IFF flags in tun.
> 
> So let's not try to add more: add simple GET/SET ioctls
> instead. Easy to test, leads to clear semantics.
> 
> Alternatively we'll have to revert the whole thing for 3.19,
> but that seems more work as this has dependencies
> in other places.
> 
> While here, I noticed that macvtap was actually reading
> ifreq flags as a 32 bit field.
> Fix that up as well.

Looks good, series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next v2 2/4] swdevice: add new api to set and del bridge port attributes
From: Samudrala, Sridhar @ 2014-12-16 15:54 UTC (permalink / raw)
  To: Arad, Ronen, Roopa Prabhu, netdev@vger.kernel.org
  Cc: Jamal Hadi Salim, John Fastabend, Jiri Pirko, sfeldma@gmail.com,
	bcrl@kvack.org, tgraf@suug.ch, stephen@networkplumber.org,
	linville@tuxdriver.com, vyasevic@redhat.com, davem@davemloft.net,
	shm@cumulusnetworks.com, gospo@cumulusnetworks.com
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DB15CA@ORSMSX101.amr.corp.intel.com>


On 12/16/2014 3:01 AM, Arad, Ronen wrote:
> In my reply (inline) I elaborate on the validity of bridge-less and offloaded-bridge models for L2 switching.
>
> I also discuss the implied necessity of a bridge device for L3 routing and potential issues with the upcoming FIB offloading proposal.
>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Roopa Prabhu
>> Sent: Tuesday, December 16, 2014 3:21 AM
>> To: Arad, Ronen
>> Cc: Jamal Hadi Salim; John Fastabend; netdev@vger.kernel.org; Jiri Pirko;
>> sfeldma@gmail.com; bcrl@kvack.org; tgraf@suug.ch;
>> stephen@networkplumber.org; linville@tuxdriver.com;
>> vyasevic@redhat.com; davem@davemloft.net;
>> shm@cumulusnetworks.com; gospo@cumulusnetworks.com
>> Subject: Re: [PATCH net-next v2 2/4] swdevice: add new api to set and del
>> bridge port attributes
>>
>> On 12/15/14, 4:58 PM, Arad, Ronen wrote:
>>>> -----Original Message-----
>>>> From: Jamal Hadi Salim [mailto:jhs@mojatatu.com]
>>>> Sent: Tuesday, December 16, 2014 1:28 AM
>>>> To: Arad, Ronen; John Fastabend; netdev@vger.kernel.org
>>>> Cc: Roopa Prabhu; Jiri Pirko; sfeldma@gmail.com; bcrl@kvack.org;
>>>> tgraf@suug.ch; stephen@networkplumber.org; linville@tuxdriver.com;
>>>> vyasevic@redhat.com; davem@davemloft.net;
>> shm@cumulusnetworks.com;
>>>> gospo@cumulusnetworks.com
>>>> Subject: Re: [PATCH net-next v2 2/4] swdevice: add new api to set and
>>>> del bridge port attributes
>>>>
>>>> On 12/15/14 13:36, Arad, Ronen wrote:
>>>>>> -----Original Message-----
>>>>> The behavior of a driver could depend on the presence of a bridge
>>>>> and features such as FDB LEARNING and LEARNING_SYNC.
>>>>
>>>> Indeed, those are bridge attributes.
>>>>
>>>>> A switch port driver which is not enslaved to a bridge might need to
>>>>> implement VLAN-aware FDB within the driver and report its content to
>>>>> user-space using ndo_fdb_dump.
>>>>>
>>>>> A switch port driver which is enslaved to a bridge could do with
>>>>> only pass through for static FDB configuration to the HW when
>>>>> LEARNING_SYNC is configured. FDB reporting to user-space and soft
>>>>> aging are left to the bridge module FDB.
>>>>> Such driver, without LEARNING_SYNC could still avoid maintaining
>>>>> in-driver FDB as long as it could dump the HW FDB on demand.
>>>>> LEARNING_SYNC also requires periodic updates of freshness
>>>>> information from the driver to the bridge module.
>>>>
>>>>
>>>> If you have an fdb - shouldnt that be exposed only if you have a
>>>> bridge abstraction exposed? i.e thats where the Linux tools would work.
>>> I'm trying to find out what are the opinions of other people in the
>>> netdev list.
>>> John has clearly stated that he'd like to see full L2 switching
>>> functionality (at least) supported without making a bridge device
>>> mandatory.
>>> The existing bridge ndos (ndo_bridge_{set,del,get}link) already support
>>> that with proper setting of SELF/MASTER flags by iproute2.
>>> I see the value in supporting both approaches (bridge device mandatory
>>> and bridge device optional). If the choice is left to user-driven policy decision,
>>> we need to document both use models and map traditional L2 features to
>>> each model.
>>> The L2 offloading (or NETFUNC as it is currently called), which is being
>>> discussed on a different patch-set, is only needed when a bridge device is
>>> used.
>>> Without a bridge device, all configuration has to be targeted at the switch
>>> port driver directly using the SELF flag. FDB remains relevant and it is used to
>>> configure static MAC table entries and dump the HW MAC table.
>> Your understanding is right here. So far all patches have kept both models in
>> mind.
>
>>> When the HW device is a L2 switch or a multi-layer switch (L2-L3 or even
>>> higher), there is a gap between what the HW is doing and what is explicitly
>>> modeled in Linux.
>
>> Can you elaborate more here ?. We use the linux model to accelerate a
>> multi-layer (l2-l3) switch today. There maybe a few gaps, but these gaps can
>> be closed by having equivalent functionality in the software path.
> What I meant is that without a bridge device the HW switch is seen as a collection of independent switch ports. Typical switch ASIC performs L2 switching by default. This is not expressed explicitly in Linux without a bridge device.
> The SELF flag is used to target typical bridge port and bridge configuration at a switch port device.
> Without an explicit bridge device, bridge attributes have to be directed at an arbitrary port (any port could represent the entire switch) and interpreted by the switch port driver as intended for the entire switch (this includes attributes like STP etc.)
> Each switch port device driver has to implement similar functionality (i.e. all bridge and fdb related ndos) independently without common functionality shared (e.g. FDB, soft aging).
> It is a valid use model and could avoid the complexity of having to deal with the presence of both SW and HW bridge and to deal with explicit offloading of data-path.
>
> I was trying to find out whether the intention was to continue to support both bridge-less and offloaded-bridge models and leave it to the end-user to choose the desirable model at configuration time.
> This would require dual support in the switch port driver in order to have best user experience across multiple switch ASICs or other kinds of devices.

Also, is one of the use cases for an explicit bridge device to support
software switching, by causing data packets to be processed at the
software bridge through appropriate port attribute settings?
Or is it only a convenient way to maintain the fdb and represent
the hardware path?
>>> Without a bridge device, the HW is represented by a set of switch port
>>> devices and the bridging (both control and data planes) takes place only in
>>> the HW and switch port driver.
>>> Each switch port driver has to implement its own FDB as there is no
>>> common shared code among drivers for different HW devices.
>>> Using a bridge device could partially alleviate that, but it comes with a cost.
>>> There is a need to properly implement offloading of both configuration and
>>> data-path. The transmit and receive path in the bridge module should be
>>> somehow bypassed to avoid unnecessary overhead or duplicate packets
>>> coming from both software bridging and HW bridging.
>>>
>>>> What i was refering to was a scenario where i have no interest in the
>>>> fdb despite such a hardware capabilities. VLANs is a different issue;
>>>>
>>> VLAN is a fundamental feature of L2 and L3 switching and Linux is unclear
>>> about it. Bridge device could model bridging of untagged packets which
>>> requires a bridge device for each VLAN and a vlan device on each port
>>> that is a member of the bridge's VLAN.
>>> This is different from the behavior and configuration of classic
>>> closed-source switches.
>>> An alternative model is VLAN filtering where a bridge is VLAN-aware and
>>> switches tagged traffic. A bridge device represents multiple L2 domains with
>>> VLAN filtering policy that defines the switching rules within each domain.
>> And the linux bridge driver supports both models today.
>>
>>> Forwarding (e.g. L3 routing) is expected across such L2 domains using L3
>>> entities.
>>> The modeling of L3 entities per L2 domain (e.g. per-VLAN) in the VLAN
>>> filtering model is yet unclear to me.
>> In the vlan filtering bridge model, You can create a vlan device on the bridge
>> for l3 ...
>>
> That's what I'm thinking too (I experimented with such setup using veth interfaces, bridge device, and vlan interfaces). This, however, seems to require an explicit bridge for L3 support.
>
> Looking at the latest code of FIB offloading (not yet submitted to netdev), I noticed that a switch port device is expected as a lower descendent of the FIB destination device.
> This assumption is valid in the per-vlan bridge model where IP address is assigned to the bridge itself.
> This, however, is not consistent with the single multi-VLAN bridge model.
> Vlan interfaces on a bridge look like siblings of the switch port devices on the same bridge. They are not ancestors of the switch ports.
> The L3 domain ends at the bridge sub-interfaces. The only L3 entities are the vlan sub-interfaces on the bridge.
> Those are route next hops and the only possible fib_dev.
> L3 routing is not aware of the switch ports. Route is performed to next hop addresses on one of the vlan interfaces subnets. The actual resolution to a switch port device has to be performed by the neighbor subsystem (ARP/ND).
> It is unclear to me how the FIB offloading will be redirected to an ndo of a switch port device.
For L3, I would think we need to support offloading both the assignment
of the gateway IP and the actual route. For example, to create a route to
subnet 2.2.2.0/24 with the GW as 1.1.1.254/24, a user may do
     ip addr add 1.1.1.254/24 dev <swX>
     ip route add 2.2.2.0/24 via 1.1.1.254 dev <swX>

Here swX has to be a device corresponding to the switch (or CPU port),
not a switch port, and the ARP requests for this GW IP need to be passed
to the Linux stack so that it can send ARP replies and also add an ARP
entry in hardware.


>
>>>>>>> Will the decision about using a bridge device or avoiding it be
>>>>>>> left to the end-user?
>>>>>> Its a user policy decision. Again the offload bit gets us this in a
>>>>>> reasonably configurable way IMO.
>>>>>>
>>>>>>> (This requires switch port drivers to be able to work and provide
>>>>>>> similar functionality in both setups).
>>>>>> Right, but if the drivers "care" who is calling their ndo ops
>>>>>> something is seriously broken. For the driver it should not need to
>>>>>> know anything about the callers so it doesn't matter to the driver
>>>>>> if its a netlink call from user space or an internal call fro
>>>>>> bridge.ko
>>>>> LEARNING_SYNC only makes sense when a switch port driver is enslaved
>>>>> to a bridge.
>>>>> Rocker switch driver indeed monitors upper change notifications
>>>>> and keeps track of master bridge presence.
>>>>> So bridge presence is not transparent.
>>>>>
>>>> Agreed - the challenge so far is that people have been fascinated by
>>>> "switch" point of view. I think we are learning and the class device
>>>> will eventually become obvious as useful.
>>>>
>>>> cheers,
>>>> jamal
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org More majordomo
>> info
>>> at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [bisected] tg3 broken in 3.18.0?
From: Rajat Jain @ 2014-12-16 16:04 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Nils Holland, David Miller, netdev, linux-pci@vger.kernel.org
In-Reply-To: <548EF90A.5070607@gmail.com>

Hello All,

Apologies for jumping in late, but for some reason I do not see the
original mail in my inbox. However I am taking a look at the mails as
sent on linux-pci (and I will keep an eye out for the bug report that
Bjorn asked for).

>
> I'm getting, with commit 89665a6a71408796565bfd29cfa6a7877b17a667:
>
> $ grep 'pci 0000:02' tg3.bad
> [    0.190733] pci 0000:02:00.0: 1st 165a14e4 14e4
> [    0.190736] pci 0000:02:00.0: 1st 165a14e4 14e4
> [    0.190810] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
> [    0.190885] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
> [    0.191048] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
> [    0.191382] pci 0000:02:00.0: PME# supported from D3hot D3cold
> [    0.191438] pci 0000:02:00.0: System wakeup disabled by ACPI
> [    1.561555] pci 0000:02:00.0: 1st 1 1
> [    1.561558] pci 0000:02:00.0: crs_timeout: 0
> [   20.412021] pci 0000:02:00.0: 1st 1 1
> [   20.412022] pci 0000:02:00.0: crs_timeout: 0
> [   20.413596] pci 0000:02:00.0: 1st 1 1
> [   20.413598] pci 0000:02:00.0: crs_timeout: 0
>
> And without it:
>
> $ grep 'pci 0000:02' tg3.good
> [    0.190734] pci 0000:02:00.0: 1st 165a14e4 14e4
> [    0.190738] pci 0000:02:00.0: 1st 165a14e4 14e4
> [    0.190811] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
> [    0.190884] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
> [    0.191047] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
> [    0.191380] pci 0000:02:00.0: PME# supported from D3hot D3cold
> [    0.191439] pci 0000:02:00.0: System wakeup disabled by ACPI
> [    1.576778] pci 0000:02:00.0: 1st 1 1
> [   19.068517] pci 0000:02:00.0: 1st 165a14e4 14e4
>

It seems that the first 2 attempts made to probe the device are OK and
return the regular device ID and vendor ID for TG3 (CRS does not have a
role to play). However, later attempts return a CRS.

1) May I ask if you are using acpihp or pciehp? I assume pciehp?

2) Can you please also send dmesg output while passing
pciehp.pciehp_debug=1? In the fail case, do you see a message
indicating the pciehp gave up since it got CRS for a long time
(something like "pci 0000:02:00.0 id reading try 50 times with
interval 20 ms to get ffff0001")?

3) Currently pciehp passes "0" for the "crs_timeout" argument to
pci_bus_read_dev_vendor_id(). Can you please try increasing it to, say,
30 seconds (30 * 1000) and run the fail case once again? (For comparison,
acpihp uses 60 * 1000, i.e. 60 seconds, today.)

Thanks a lot in advance for the debugging help ;-)

Rajat

^ permalink raw reply

* Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Hubert Sokolowski @ 2014-12-16 14:35 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Roopa Prabhu, netdev@vger.kernel.org,
	Vlad Yasevich
In-Reply-To: <54902E5E.2070405@mojatatu.com>

>
> I am glad this happened ;-> Otherwise we wouldnt be having this
> discussion. When Vlad was asking me I was in a rush to get the patch
> out and didnt question because i thought this was something some crazy
> virtualization people needed.
> If Vlad's use case goes away, then Hubert's little restoration is fine.
>

I'm afraid there might be a little more to fix here. I just tested my
patch after moving the unconditional dflt_fdb_dump call inside
br_fdb_dump, so that "self" is returned again for the bridge, and saw
that br_fdb_dump is called twice in some cases. As a result I saw
duplicate "self" entries.
I think the problem is in how rtnl_fdb_dump invokes ndo_fdb_dump.
In 3.16 the algorithm was very simple; now it is a little more complicated.

I tested on 3.18 without my changes, only adding a pr_info inside br_fdb_dump:
pr_info("br fdb dump called dev %s filter dev %s.\n", dev->name,
			filter_dev->name);

I loaded dummy module, created bridge br0 with brctl and then attached dummy0
to that bridge:
    brctl  addif  br0 dummy0
Then when trying to filter by brport only:
./bridge fdb show brport dummy0
5e:e2:a0:21:0c:f5 vlan 1 master br0 permanent
5e:e2:a0:21:0c:f5 master br0 permanent
33:33:00:00:00:01 self permanent

Even though the output looks OK, I see in the journalctl logs the callback
was called twice with same attributes:
Dec 16 09:15:39 localhost.localdomain kernel: br fdb dump called dev br0 filter dev dummy0.
Dec 16 09:15:39 localhost.localdomain kernel: br fdb dump called dev br0 filter dev dummy0.

Do you have any idea why this is happening? I hope this test makes sense :).

Thanks,
Hubert

--
Hubert Sokolowski           Intel Corporation

^ permalink raw reply

* Re: [iproute2] tc: Show classes more hierarchically]
From: Eric Dumazet @ 2014-12-16 13:49 UTC (permalink / raw)
  To: vadim4j; +Cc: netdev
In-Reply-To: <20141215224851.GB6734@angus-think.lan>

On Tue, 2014-12-16 at 00:48 +0200, vadim4j@gmail.com wrote:
> Hi All,
> 
> I am playing with showing classes in more hierarchically format and I
> have some code and example of output from my TC looks like:
> 
> # tc/tc -t class show dev tap0
> 
>  \---1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>  \---1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:10 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>                \---1:11 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>                       \---1:111 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:20 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
>         \---1:30 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b 
> 
> 

> So I'd like to ask if it might be useful for the TC users (may be
> better format ?) to have this ?

Sure, this seems interesting, thanks !

^ permalink raw reply

* Re: [PATCH net-next 1/1] net: fec: Fix NAPI race
From: Marek Vasut @ 2014-12-16 13:34 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fabio Estevam, Fugang Duan, David S. Miller,
	netdev@vger.kernel.org, Estevam Fabio-R49496, Ben Hutchings,
	Stephen Hemminger, robert.daniels
In-Reply-To: <20141216114131.GQ11285@n2100.arm.linux.org.uk>

On Tuesday, December 16, 2014 at 12:41:31 PM, Russell King - ARM Linux wrote:
> On Tue, Dec 16, 2014 at 09:33:53AM -0200, Fabio Estevam wrote:
> > Hi Fugang,
> > 
> > On Tue, Dec 16, 2014 at 8:25 AM, Fugang Duan <b38611@freescale.com> wrote:
> > > Do camera capture test on i.MX6q sabresd board, and save the capture
> > > data to nfs rootfs. The command is:
> > > gst-launch-1.0 -e imxv4l2src device=/dev/video1 num-buffers=2592000 !
> > > tee name=t ! queue ! imxv4l2sink sync=false t. ! queue ! vpuenc !
> > > queue ! mux. pulsesrc num-buffers=3720937 blocksize=4096 !
> > > 'audio/x-raw, rate=44100, channels=2' ! queue ! imxmp3enc !
> > > mpegaudioparse ! queue ! mux. qtmux name=mux ! filesink
> > > location=video_recording_long.mov
> > > 
> > > After about 10 hours running, there have net watchdog timeout kernel
> > > dump: ...
> > > WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264
> > > dev_watchdog+0x2b4/0x2d8() NETDEV WATCHDOG: eth0 (fec): transmit queue
> > > 0 timed out
> > 
> > Adding more people who reported similar issues in the past.
> > 
> > Marek,
> > 
> > Does this patch solve the problem you reported at
> > http://www.spinics.net/lists/netdev/msg268167.html ?
> 
> My set of patches fixed stuff exactly like this...

I still keep your G+ post open, in case I ever manage to find free time to dive
into it. It'd be a terrible waste to let these patches go. Right now, I'm in the
process of finishing my degree (finally) so things are just crap, apologies.

Best regards,
Marek Vasut

^ permalink raw reply

* Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Jamal Hadi Salim @ 2014-12-16 13:06 UTC (permalink / raw)
  To: John Fastabend
  Cc: Hubert Sokolowski, Roopa Prabhu, netdev@vger.kernel.org,
	Vlad Yasevich
In-Reply-To: <548F80B2.80408@gmail.com>

On 12/15/14 19:45, John Fastabend wrote:
> On 12/15/2014 06:29 AM, Jamal Hadi Salim wrote:

>
> hmm good question. When I implemented this on the host nics with SR-IOV,
> VMDQ, etc. The multi/unicast addresses were propagated into the FDB by
> the driver.

So if i understand correctly, this is a NIC with an FDB. And there is no
concept of a bridge to which it is attached. To the point of
classical uni/multicast addresses on a netdev abstraction; these
are typically stored in *much simpler tables* (used to be IO
registers back in the day)
Do these NICs not have such a concept?
An fdb entry has an egress port column; I have seen cases where the
port is labeled as "Cpu port" which would mean it belongs to the host;
but in this case it just seems there is no such concept and as Or
brought up in another email - what does "VLANid" mean in such a case?
If we go with a CPU port concept, we could then use the concept of a
vlan filter on a per-port basis, but then what happens when you don't
have an fdb (the majority of cases)?

> My logic was if some netdev ethx has a set of MAC addresses
> above it well then any virtual function or virtual device also behind
> the hardware shouldn't be sending those addresses out the egress switch
> facing port. Otherwise the switch will see packets it knows are behind
> that port and drop them. Or flood them if it hasn't learned the address
> yet. Either way they will never get to the right netdev.
>
> Admittedly I wasn't thinking about switches with many ports at the time.
>

I often struggle with trying to "box" SRIOV into some concept of a
switch abstraction and sometimes i am puzzled.
Would exposing the SRIOV underlay as a switch not have solved this
problem? Then the virtual ports essentially are bridge ports.
Maybe what we need is a concept of a "edge relay" extended netdev?
These things would have an fdb as well down and uplink relay ports that
can be attached to them.

>> Some of these drivers may be just doing the LinuxWay(aka cutnpaste what
>> the other driver did).
>
> My original thinking here was... if it didn't implement fdb_add, fdb_del
> and fdb_dump then if you wanted to think of it as having forwarding
> database that was fine but it was really just a two port mac relay. In
> which case just dump all the mac addresses it knows about. In this case
> if it was something more fancy it could do its own dump like vxlan or
> macvlan.
>

The challenge here is the lack of separation between a NIC's uni/multicast
ports, which it owns - a traditional operation regardless of
what capabilities the NIC has - and an fdb, which may have many
other capabilities. Probably all NICs capable of many MACs implement
fdbs?

> For a host nic ucast/multicast and fdb are the same, I think? The
> code we had was just short-hand to allow the common case a host nic
> to work. Notice vxlan and bridge drivers didn't dump there addr lists
> from fdb_dump until your patch.
>
> Perhaps my implementation of macvlan fdb_{add|del|dump} is buggy. And
> I shouldn't overload the addr lists.
>

Not just those - I am wondering about the general utility of what
Hubert was trying to do if all the driver does is call the default
dumper based on some flags presence and the default dumper
does a dump of uni/multicast host entries. Those are not really fdb
entries in the traditional sense.
Is there no way to get the unicast/multicast mac addresses for such
a driver?
I think that would help bring clarity to my confusion.

>
> I'm interested to see what Vlad says as well. But the current situation
> is previously some drivers dumped their addr lists others didn't.
> Specifically, the more switch like devices (bridge, vxlan) didn't. Now
> every device will dump the addr lists. I'm not entirely convinced that
> is correct.
>

I am glad this happened ;-> Otherwise we wouldnt be having this
discussion. When Vlad was asking me I was in a rush to get the patch
out and didnt question because i thought this was something some crazy
virtualization people needed.
If Vlad's use case goes away, then Hubert's little restoration is fine.

> It works OK for host nics (NICS that can't forward between ports) and
> seems at best confusing for real switch asics.

So if these NICs have fdb entries and i programmed it (meaning setting
which port a given MAC should be sent to), would it not work?

> On a related question do
> you expect the switch asic to trap any packets with MAC addresses in
> the multi/unicast address lists and send them to the correct netdev? Or
> will the switch forward them using normal FDB tables?
>

I think there would be a separate table for that. Roopa, can you check
with the ASICs you guys work on? The point i was trying to make above
is today there is a uni/multicast list or table of sorts that all NICs
expose.
There's always the hack of a "cpu port". I have also seen the "cpu port"
being conceptualized in L3 tables to imply "next hop is cpu" where you
have an IP address owned by the host; so maybe we need a concept of a
cpu port or again the revival of TheThing class device.

cheers,
jamal

^ permalink raw reply

* [PATCH 5/5] if_tun: drop broken IFF_VNET_LE
From: Michael S. Tsirkin @ 2014-12-16 13:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Miller, netdev, Dan Carpenter, linux-api, Jason Wang
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

Everyone should use TUNSETVNETLE/TUNGETVNETLE instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
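Not part of the patch: a minimal sketch of what a userspace caller moving
off the flag might do. The helper names tun_set_vnet_le() and
tun_get_vnet_le() are hypothetical, the fd is assumed to be an already
configured tun/tap descriptor, and the fallback defines simply repeat the
ioctl numbers from patch 2/5 for headers that predate them.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/if_tun.h>

/* Fallback definitions copied from patch 2/5, for older headers. */
#ifndef TUNSETVNETLE
#define TUNSETVNETLE _IOW('T', 220, int)
#define TUNGETVNETLE _IOR('T', 221, int)
#endif

/* Set (le != 0) or clear (le == 0) the little-endian vnet header mode.
 * Returns 0 on success, -1 with errno set on failure. */
static int tun_set_vnet_le(int fd, int le)
{
	return ioctl(fd, TUNSETVNETLE, &le);
}

/* Read the current setting back into *le. */
static int tun_get_vnet_le(int fd, int *le)
{
	assert(le);
	return ioctl(fd, TUNGETVNETLE, le);
}
```

A caller would check the return value of the set call rather than
assuming the mode took effect, since older kernels will not know the
ioctl at all.
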
 include/uapi/linux/if_tun.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 274630c..50ae243 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -59,7 +59,6 @@
 #define IFF_ONE_QUEUE	0x2000
 #define IFF_VNET_HDR	0x4000
 #define IFF_TUN_EXCL	0x8000
-#define IFF_VNET_LE	0x10000
 #define IFF_MULTI_QUEUE 0x0100
 #define IFF_ATTACH_QUEUE 0x0200
 #define IFF_DETACH_QUEUE 0x0400
-- 
MST

^ permalink raw reply related

* [PATCH 4/5] macvtap: drop broken IFF_VNET_LE
From: Michael S. Tsirkin @ 2014-12-16 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, netdev, Dan Carpenter, Jason Wang, Tom Herbert,
	Ben Hutchings, Vlad Yasevich, Herbert Xu
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

Use TUNSETVNETLE/TUNGETVNETLE instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
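Not part of the patch: a user-space model of what the LE bit selects in
the two conversion helpers touched below. This is a sketch under the
assumption that, with the bit clear, header fields stay in native byte
order, and with it set they are decoded as little-endian; vnet16_to_cpu()
is a hypothetical name, not the kernel helper.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Decode a 16-bit vnet header field from its two bytes in memory. */
static uint16_t vnet16_to_cpu(bool little_endian, const uint8_t raw[2])
{
	uint16_t native;

	if (little_endian)	/* fixed little-endian layout */
		return (uint16_t)(raw[0] | (raw[1] << 8));

	/* bit clear: take the bytes in the host's own byte order */
	memcpy(&native, raw, sizeof(native));
	return native;
}
```

On a little-endian host both branches give the same answer, which is one
way a broken flag can go unnoticed in testing.
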
 drivers/net/macvtap.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index de88285..f1f0df1 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -45,16 +45,18 @@ struct macvtap_queue {
 	struct list_head next;
 };
 
-#define MACVTAP_FEATURES (IFF_VNET_HDR | IFF_VNET_LE | IFF_MULTI_QUEUE)
+#define MACVTAP_FEATURES (IFF_VNET_HDR | IFF_MULTI_QUEUE)
+
+#define MACVTAP_VNET_LE 0x80000000
 
 static inline u16 macvtap16_to_cpu(struct macvtap_queue *q, __virtio16 val)
 {
-	return __virtio16_to_cpu(q->flags & IFF_VNET_LE, val);
+	return __virtio16_to_cpu(q->flags & MACVTAP_VNET_LE, val);
 }
 
 static inline __virtio16 cpu_to_macvtap16(struct macvtap_queue *q, u16 val)
 {
-	return __cpu_to_virtio16(q->flags & IFF_VNET_LE, val);
+	return __cpu_to_virtio16(q->flags & MACVTAP_VNET_LE, val);
 }
 
 static struct proto macvtap_proto = {
@@ -1082,6 +1084,21 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 		q->vnet_hdr_sz = s;
 		return 0;
 
+	case TUNGETVNETLE:
+		s = !!(q->flags & MACVTAP_VNET_LE);
+		if (put_user(s, sp))
+			return -EFAULT;
+		return 0;
+
+	case TUNSETVNETLE:
+		if (get_user(s, sp))
+			return -EFAULT;
+		if (s)
+			q->flags |= MACVTAP_VNET_LE;
+		else
+			q->flags &= ~MACVTAP_VNET_LE;
+		return 0;
+
 	case TUNSETOFFLOAD:
 		/* let the user check for future flags */
 		if (arg & ~(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
-- 
MST

^ permalink raw reply related

* [PATCH 3/5] tun: drop broken IFF_VNET_LE
From: Michael S. Tsirkin @ 2014-12-16 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, netdev, Dan Carpenter, Jason Wang, Herbert Xu,
	Tom Herbert, Ben Hutchings, Xi Wang, Masatake YAMATO
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

Use TUNSETVNETLE/TUNGETVNETLE instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/tun.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index c052bd6b..e3e8a0e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -109,9 +109,11 @@ do {								\
  * overload it to mean fasync when stored there.
  */
 #define TUN_FASYNC	IFF_ATTACH_QUEUE
+/* High bits in flags field are unused. */
+#define TUN_VNET_LE     0x80000000
 
 #define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
-		      IFF_VNET_LE | IFF_MULTI_QUEUE)
+		      IFF_MULTI_QUEUE)
 #define GOODCOPY_LEN 128
 
 #define FLT_EXACT_COUNT 8
@@ -207,12 +209,12 @@ struct tun_struct {
 
 static inline u16 tun16_to_cpu(struct tun_struct *tun, __virtio16 val)
 {
-	return __virtio16_to_cpu(tun->flags & IFF_VNET_LE, val);
+	return __virtio16_to_cpu(tun->flags & TUN_VNET_LE, val);
 }
 
 static inline __virtio16 cpu_to_tun16(struct tun_struct *tun, u16 val)
 {
-	return __cpu_to_virtio16(tun->flags & IFF_VNET_LE, val);
+	return __cpu_to_virtio16(tun->flags & TUN_VNET_LE, val);
 }
 
 static inline u32 tun_hashfn(u32 rxhash)
@@ -1853,6 +1855,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	int sndbuf;
 	int vnet_hdr_sz;
 	unsigned int ifindex;
+	int le;
 	int ret;
 
 	if (cmd == TUNSETIFF || cmd == TUNSETQUEUE || _IOC_TYPE(cmd) == 0x89) {
@@ -2052,6 +2055,23 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		tun->vnet_hdr_sz = vnet_hdr_sz;
 		break;
 
+	case TUNGETVNETLE:
+		le = !!(tun->flags & TUN_VNET_LE);
+		if (put_user(le, (int __user *)argp))
+			ret = -EFAULT;
+		break;
+
+	case TUNSETVNETLE:
+		if (get_user(le, (int __user *)argp)) {
+			ret = -EFAULT;
+			break;
+		}
+		if (le)
+			tun->flags |= TUN_VNET_LE;
+		else
+			tun->flags &= ~TUN_VNET_LE;
+		break;
+
 	case TUNATTACHFILTER:
 		/* Can be set only for TAPs */
 		ret = -EINVAL;
-- 
MST

^ permalink raw reply related

* [PATCH 2/5] if_tun: add TUNSETVNETLE/TUNGETVNETLE
From: Michael S. Tsirkin @ 2014-12-16 13:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Miller, netdev, Dan Carpenter, linux-api, Jason Wang
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

The ifreq flags field is only 16 bits wide, so setting IFF_VNET_LE there
has no effect: the value doesn't fit in two bytes.

The tests passed apparently because they have an even number of bugs,
all cancelling out.

Luckily we didn't release a kernel with this flag, so it's
not too late to fix this.

Add TUNSETVNETLE/TUNGETVNETLE to really achieve the purpose
of IFF_VNET_LE.

This has an added benefit that if we ever want a BE flag,
we won't have to deal with weird configurations like
setting both LE and BE at the same time.

IFF_VNET_LE will be dropped in a follow-up patch.
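
As a rough illustration of the intended semantics (a userspace model only, not the driver code), the set/get pair can be sketched as below. MODEL_VNET_LE mirrors the TUN_VNET_LE bit used by the tun.c patch in this series; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors TUN_VNET_LE from the tun.c patch; the name is made up here. */
#define MODEL_VNET_LE 0x80000000u

/* Model of TUNSETVNETLE: any non-zero argument sets the bit, zero clears it. */
uint32_t model_set_vnet_le(uint32_t flags, int le)
{
	if (le)
		return flags | MODEL_VNET_LE;
	return flags & ~MODEL_VNET_LE;
}

/* Model of TUNGETVNETLE: report the bit as exactly 0 or 1. */
int model_get_vnet_le(uint32_t flags)
{
	return !!(flags & MODEL_VNET_LE);
}
```

Since the state is a single boolean behind one ioctl pair, a possible future BE setting could not end up enabled together with LE by accident.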

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/uapi/linux/if_tun.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 18b2403..274630c 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -48,6 +48,8 @@
 #define TUNSETQUEUE  _IOW('T', 217, int)
 #define TUNSETIFINDEX	_IOW('T', 218, unsigned int)
 #define TUNGETFILTER _IOR('T', 219, struct sock_fprog)
+#define TUNSETVNETLE _IOW('T', 220, int)
+#define TUNGETVNETLE _IOR('T', 221, int)

 /* TUNSETIFF ifr flags */
 #define IFF_TUN		0x0001
-- 
MST

^ permalink raw reply related

* [PATCH 1/5] macvtap: fix uninitialized access on TUNSETIFF
From: Michael S. Tsirkin @ 2014-12-16 13:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, netdev, Dan Carpenter, Jason Wang, Tom Herbert,
	Ben Hutchings, Vlad Yasevich, Herbert Xu
In-Reply-To: <1418732988-3535-1-git-send-email-mst@redhat.com>

The flags field in ifreq is only 16 bits wide, but
we read it as a 32-bit value.
If userspace doesn't zero-initialize unused fields,
this will lead to failures.
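
A minimal userspace sketch of the problem described above, assuming a hypothetical union fake_ifru that stands in for the real ifreq union. The 0xAA fill only simulates memory that userspace never initialized; none of these names exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the union inside struct ifreq. */
union fake_ifru {
	uint16_t flags;          /* the only part userspace fills in */
	unsigned char raw[16];   /* everything else may be stale */
};

/* Userspace sets the 16-bit flags and leaves the remaining bytes alone. */
void fill_request(union fake_ifru *u, uint16_t flags)
{
	memset(u->raw, 0xAA, sizeof(u->raw)); /* simulate uninitialized bytes */
	u->flags = flags;
}

/* Read 32 bits starting at the flags field, as the commit message describes. */
uint32_t read_flags_32(const union fake_ifru *u)
{
	uint32_t v;

	memcpy(&v, u->raw, sizeof(v));
	return v;
}

/* Read only the 16 bits that userspace actually wrote. */
uint16_t read_flags_16(const union fake_ifru *u)
{
	uint16_t v;

	memcpy(&v, u->raw, sizeof(v));
	return v;
}
```

With the 32-bit read, the two stale bytes next to the flags end up in the value, so a comparison against the expected flag set fails even though userspace passed valid flags.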

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/macvtap.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index af90ab5..de88285 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -1011,7 +1011,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 	void __user *argp = (void __user *)arg;
 	struct ifreq __user *ifr = argp;
 	unsigned int __user *up = argp;
-	unsigned int u;
+	unsigned short u;
 	int __user *sp = argp;
 	int s;
 	int ret;
@@ -1026,7 +1026,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 		if ((u & ~MACVTAP_FEATURES) != (IFF_NO_PI | IFF_TAP))
 			ret = -EINVAL;
 		else
-			q->flags = u;
+			q->flags = (q->flags & ~MACVTAP_FEATURES) | u;
 
 		return ret;
 
@@ -1039,8 +1039,9 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 		}
 
 		ret = 0;
+		u = q->flags;
 		if (copy_to_user(&ifr->ifr_name, vlan->dev->name, IFNAMSIZ) ||
-		    put_user(q->flags, &ifr->ifr_flags))
+		    put_user(u, &ifr->ifr_flags))
 			ret = -EFAULT;
 		macvtap_put_vlan(vlan);
 		rtnl_unlock();
-- 
MST

^ permalink raw reply related

* [PATCH 0/5] tun/macvtap: TUNSETIFF fixes
From: Michael S. Tsirkin @ 2014-12-16 13:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Miller, netdev, Dan Carpenter, Jason Wang

Dan Carpenter reported the following:
	static checker warning:

		drivers/net/tun.c:1694 tun_set_iff()
		warn: 0x17100 is larger than 16 bits

	drivers/net/tun.c
	  1692
	  1693          tun->flags = (tun->flags & ~TUN_FEATURES) |
	  1694                  (ifr->ifr_flags & TUN_FEATURES);
	  1695

	It's complaining because the "ifr->ifr_flags" variable is a short
	(should it be unsigned?).  The new define:

	#define IFF_VNET_LE    0x10000

	doesn't fit in two bytes.  Other suspect looking code could be:

		return __virtio16_to_cpu(q->flags & IFF_VNET_LE, val);

And that's true: we have run out of IFF flags in tun.
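
To make the truncation concrete, here is a small userspace sketch. It assumes a hypothetical struct fake_ifreq with an unsigned 16-bit flags field (the real ifr_flags is a short; unsigned is used here only to keep the arithmetic well defined). The FAKE_ names are illustrative, with values taken from the report above and from IFF_TAP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16-bit flags field of struct ifreq. */
struct fake_ifreq {
	uint16_t flags;
};

#define FAKE_IFF_TAP     0x0002u   /* same value as IFF_TAP */
#define FAKE_IFF_VNET_LE 0x10000u  /* the value the checker flagged */

/* Store a 32-bit flag word in the 16-bit field and return what survives. */
uint32_t store_and_reload(uint32_t requested)
{
	struct fake_ifreq ifr;

	ifr.flags = (uint16_t)(requested & 0xffffu); /* only the low 16 bits fit */
	return ifr.flags;
}
```

Calling store_and_reload(FAKE_IFF_TAP | FAKE_IFF_VNET_LE) gives back only FAKE_IFF_TAP: bit 16 is lost before the driver ever sees it.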

So let's not try to add more: add simple GET/SET ioctls
instead. Easy to test, leads to clear semantics.

Alternatively we'll have to revert the whole thing for 3.19,
but that seems more work as this has dependencies
in other places.

While here, I noticed that macvtap was actually reading
ifreq flags as a 32 bit field.
Fix that up as well.

Michael S. Tsirkin (5):
  macvtap: fix uninitialized access on TUNSETIFF
  if_tun: add TUNSETVNETLE/TUNGETVNETLE
  tun: drop broken IFF_VNET_LE
  macvtap: drop broken IFF_VNET_LE
  if_tun: drop broken IFF_VNET_LE

 include/uapi/linux/if_tun.h |  3 ++-
 drivers/net/macvtap.c       | 30 ++++++++++++++++++++++++------
 drivers/net/tun.c           | 26 +++++++++++++++++++++++---
 3 files changed, 49 insertions(+), 10 deletions(-)

-- 
MST

^ permalink raw reply

* Re: [PATCH net] net/mlx4: Cache line CQE/EQE stride fixes
From: Amir Vadai @ 2014-12-16 12:29 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev@vger.kernel.org, Yevgeny Petrilin, Or Gerlitz,
	clsoto@linux.vnet.ibm.com, Ido Shamay, Wei Yang
In-Reply-To: <1418729334-2974-1-git-send-email-amirv@mellanox.com>

On 12/16/2014 1:28 PM, Amir Vadai wrote:
> From: Ido Shamay <idos@mellanox.com>
> 
> This commit contains 2 fixes for the 128B CQE/EQE stride feature.
> Wei found that mlx4_QUERY_HCA function marked the wrong capability
> in flags (64B CQE/EQE), when CQE/EQE stride feature was enabled.
> Also added small fix in initial CQE ownership bit assignment, when CQE
> size is not default 32B.
> 
> Fixes: 77507aa24 (net/mlx4: Enable CQE/EQE stride support)
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> Signed-off-by: Ido Shamay <idos@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
> Dave Hi,
> 
> Please pull this patch also to stable (at least 3.17)
> 
> Thanks,
> Amir

Small correction: Should pull into -stable >= 3.18

Amir

^ permalink raw reply

* Re: [PATCH 1/5] e1000e: reset MAC-PHY interconnect on 82577/82578
From: Willy Tarreau @ 2014-12-16 12:17 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Zhu Yanjun, stable, netdev, Zhu Yanjun, Bruce Allan,
	David S. Miller
In-Reply-To: <1418731926.2467.15.camel@jtkirshe-mobl>

On Tue, Dec 16, 2014 at 04:12:06AM -0800, Jeff Kirsher wrote:
> On Tue, 2014-12-16 at 18:28 +0800, Zhu Yanjun wrote:
> > 2.6.x kernels require a similar logic change as commit 6dfaa76 
> > [e1000e: reset MAC-PHY interconnect on 82577/82578] introduces
> > for newer kernels.
> > 
> > During Sx->S0 transitions, the interconnect between the MAC and PHY on
> > 82577/82578 can remain in SMBus mode instead of transitioning to the
> > PCIe-like mode required during normal operation.  Toggling the LANPHYPC
> > Value bit essentially resets the interconnect, forcing it to the correct
> > mode.
> > 
> > after review of all intel drivers, found several instances where
> > drivers had the incorrect pattern of:
> > memory mapped write();
> > delay();
> > 
> > which should always be:
> > memory mapped write();
> > write flush(); /* aka memory mapped read */
> > delay();
> > 
> > explanation:
> > The reason for including the flush is that writes can be held
> > (posted) in PCI/PCIe bridges, but the read always has to complete
> > synchronously and therefore has to flush all pending writes to a
> > device.  If a write is held and followed by a delay, the delay
> > means nothing because the write may not have reached hardware
> > (maybe even not until the next read)
> > 
> > Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
> > ---
> >  drivers/net/e1000e/defines.h |  2 ++
> >  drivers/net/e1000e/ich8lan.c | 20 ++++++++++++++++++++
> >  2 files changed, 22 insertions(+)
> 
> To be clear, Zhu is wanting this applied to stable trees (yet did not CC
> stable@vger.kernel.org ).
> 
> Willy- I am fine with this series being applied to stable.

OK, thanks Jeff! I'm queuing the patches now.

Best regards,
Willy

^ permalink raw reply

* Re: [PATCH 1/5] e1000e: reset MAC-PHY interconnect on 82577/82578
From: Jeff Kirsher @ 2014-12-16 12:12 UTC (permalink / raw)
  To: Zhu Yanjun, stable, Willy Tarreau
  Cc: netdev, w, Zhu Yanjun, Bruce Allan, David S. Miller
In-Reply-To: <1418725700-31465-2-git-send-email-Yanjun.Zhu@windriver.com>

On Tue, 2014-12-16 at 18:28 +0800, Zhu Yanjun wrote:
> 2.6.x kernels require a similar logic change as commit 6dfaa76 
> [e1000e: reset MAC-PHY interconnect on 82577/82578] introduces
> for newer kernels.
> 
> During Sx->S0 transitions, the interconnect between the MAC and PHY on
> 82577/82578 can remain in SMBus mode instead of transitioning to the
> PCIe-like mode required during normal operation.  Toggling the LANPHYPC
> Value bit essentially resets the interconnect, forcing it to the correct
> mode.
> 
> after review of all intel drivers, found several instances where
> drivers had the incorrect pattern of:
> memory mapped write();
> delay();
> 
> which should always be:
> memory mapped write();
> write flush(); /* aka memory mapped read */
> delay();
> 
> explanation:
> The reason for including the flush is that writes can be held
> (posted) in PCI/PCIe bridges, but the read always has to complete
> synchronously and therefore has to flush all pending writes to a
> device.  If a write is held and followed by a delay, the delay
> means nothing because the write may not have reached hardware
> (maybe even not until the next read)
> 
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
> ---
>  drivers/net/e1000e/defines.h |  2 ++
>  drivers/net/e1000e/ich8lan.c | 20 ++++++++++++++++++++
>  2 files changed, 22 insertions(+)

To be clear, Zhu is wanting this applied to stable trees (yet did not CC
stable@vger.kernel.org ).

Willy- I am fine with this series being applied to stable.
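
For readers following the quoted explanation, the write/flush/delay ordering can be modelled with a toy userspace sketch. Everything here is hypothetical (struct toy_bridge and the toy_ functions are not kernel APIs); it only assumes a bridge that may hold one posted write until a read forces it out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one device register behind a bridge that may post one write. */
struct toy_bridge {
	uint32_t device_reg;  /* value the hardware has actually seen */
	uint32_t posted_val;  /* write still held in the bridge */
	bool posted;          /* true while a write is pending */
};

/* A memory-mapped write returns immediately; the bridge may hold it. */
void toy_mmio_write(struct toy_bridge *b, uint32_t val)
{
	b->posted_val = val;
	b->posted = true;
}

/* A read completes synchronously, so any pending write is pushed out first. */
uint32_t toy_mmio_read(struct toy_bridge *b)
{
	if (b->posted) {
		b->device_reg = b->posted_val;
		b->posted = false;
	}
	return b->device_reg;
}

/* Write followed by a flushing read; only then is it safe to start a delay. */
void toy_write_flush(struct toy_bridge *b, uint32_t val)
{
	toy_mmio_write(b, val);
	(void)toy_mmio_read(b); /* the "write flush", i.e. a read back */
}
```

In this model a delay placed right after toy_mmio_write() would time nothing, because device_reg has not changed yet; after toy_write_flush() the device has seen the value before the delay begins.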

^ permalink raw reply
