* Re: Flow Control and Port Mirroring Revisited
From: Michael S. Tsirkin @ 2011-01-14 4:58 UTC (permalink / raw)
To: Simon Horman
Cc: Jesse Gross, Eric Dumazet, Rusty Russell, virtualization, dev,
virtualization, netdev, kvm
In-Reply-To: <20110113234135.GC8426@verge.net.au>
On Fri, Jan 14, 2011 at 08:41:36AM +0900, Simon Horman wrote:
> On Thu, Jan 13, 2011 at 10:45:38AM -0500, Jesse Gross wrote:
> > On Thu, Jan 13, 2011 at 1:47 AM, Simon Horman <horms@verge.net.au> wrote:
> > > On Mon, Jan 10, 2011 at 06:31:55PM +0900, Simon Horman wrote:
> > >> On Fri, Jan 07, 2011 at 10:23:58AM +0900, Simon Horman wrote:
> > >> > On Thu, Jan 06, 2011 at 05:38:01PM -0500, Jesse Gross wrote:
> > >> >
> > >> > [ snip ]
> > >> > >
> > >> > > I know that everyone likes a nice netperf result but I agree with
> > >> > > Michael that this probably isn't the right question to be asking. I
> > >> > > don't think that socket buffers are a real solution to the flow
> > >> > > control problem: they happen to provide that functionality but it's
> > >> > > more of a side effect than anything. It's just that the amount of
> > >> > > memory consumed by packets in the queue(s) doesn't really have any
> > >> > > implicit meaning for flow control (think multiple physical adapters,
> > >> > > all with the same speed instead of a virtual device and a physical
> > >> > > device with wildly different speeds). The analog in the physical
> > >> > > world that you're looking for would be Ethernet flow control.
> > >> > > Obviously, if the question is limiting CPU or memory consumption then
> > >> > > that's a different story.
> > >> >
> > >> > Point taken. I will see if I can control CPU (and thus memory) consumption
> > >> > using cgroups and/or tc.
> > >>
> > >> I have found that I can successfully control the throughput using
> > >> the following techniques
> > >>
> > >> 1) Place a tc egress filter on dummy0
> > >>
> > >> 2) Use ovs-ofctl to add a flow that sends skbs to dummy0 and then eth1,
> > >> this is effectively the same as one of my hacks to the datapath
> > >> that I mentioned in an earlier mail. The result is that eth1
> > >> "paces" the connection.
This is actually a bug. This means that one slow connection will
affect fast ones. I intend to change the default for qemu to sndbuf=0 :
this will fix it but break your "pacing". So pls do not count on this behaviour.
> > > Further to this, I wonder if there is any interest in providing
> > > a method to switch the action order - using ovs-ofctl is a hack imho -
> > > and/or switching the default action order for mirroring.
> >
> > I'm not sure that there is a way to do this that is correct in the
> > generic case. It's possible that the destination could be a VM while
> > packets are being mirrored to a physical device or we could be
> > multicasting or some other arbitrarily complex scenario. Just think
> > of what a physical switch would do if it has ports with two different
> > speeds.
>
> Yes, I have considered that case. And I agree that perhaps there
> is no sensible default. But perhaps we could make it configurable somehow?
The fix is at the application level. Run netperf with -b and -w flags to
limit the speed to a sensible value.
--
MST
^ permalink raw reply
* Re: [PATCH] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-14 5:37 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, linux-kernel, Po-Yu Chuang
In-Reply-To: <1294927387.3044.76.camel@localhost>
Dear Ben,
On Thu, Jan 13, 2011 at 10:03 PM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> On Thu, 2011-01-13 at 19:49 +0800, Po-Yu Chuang wrote:
>> From: Po-Yu Chuang <ratbert@faraday-tech.com>
>>
>> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
>> MII. This driver has been working on some ARM/NDS32 SoC including
>> Faraday A320 and Andes AG101.
> [...]
>> +#define USE_NAPI
>
> All new network drivers should use NAPI only, so please remove the
> definition of USE_NAPI and change the conditional code to use NAPI
> always.
OK, fixed.
> [...]
>> + struct net_device_stats stats;
>
> There is a net_device_stats structure in the net_device structure; you
> should normally use that instead of adding another one.
OK, fixed.
> [...]
>> +static int ftmac100_reset(struct ftmac100 *priv)
>> +{
>> + struct device *dev = &priv->netdev->dev;
>> + unsigned long flags;
>> + int i;
>> +
>> + /* NOTE: reset clears all registers */
>> +
>> + spin_lock_irqsave(&priv->hw_lock, flags);
>> + iowrite32(FTMAC100_MACCR_SW_RST, priv->base + FTMAC100_OFFSET_MACCR);
>> + spin_unlock_irqrestore(&priv->hw_lock, flags);
>> +
>> + for (i = 0; i < 5; i++) {
>> + int maccr;
>> +
>> + spin_lock_irqsave(&priv->hw_lock, flags);
>> + maccr = ioread32(priv->base + FTMAC100_OFFSET_MACCR);
>> + spin_unlock_irqrestore(&priv->hw_lock, flags);
>> + if (!(maccr & FTMAC100_MACCR_SW_RST)) {
>> + /*
>> + * FTMAC100_MACCR_SW_RST cleared does not indicate
>> + * that hardware reset completed (what the f*ck).
>> + * We still need to wait for a while.
>> + */
>> + usleep_range(500, 1000);
>
> Sleeping here means this must be running in process context. But you
> used spin_lock_irqsave() above which implies you're not sure what the
> context is. One of these must be wrong.
>
> I wonder whether hw_lock is even needed; you seem to acquire and release
> it around each PIO (read or write). This should be unnecessary since
> each PIO is atomic.
OK, fixed.
> I think you can also get rid of rx_lock; it's only used in the RX data
> path which is already serialised by NAPI.
OK, fixed.
> [...]
>> + netdev->last_rx = jiffies;
>
> Don't set last_rx; the networking core takes care of it now.
OK, fixed.
>> + priv->stats.rx_packets++;
>> + priv->stats.rx_bytes += skb->len;
>
> This should be done earlier, so that these stats include packets that
> are dropped for any reason.
OK, fixed.
> [...]
>> + netdev->trans_start = jiffies;
>
> Don't set trans_start; the networking core takes care of it now.
OK, fixed.
> [...]
>> + priv->descs = dma_alloc_coherent(priv->dev,
>> + sizeof(struct ftmac100_descs), &priv->descs_dma_addr,
>> + GFP_KERNEL | GFP_DMA);
>
> On x86, GFP_DMA means the memory must be within the ISA DMA range
> (< 16 MB). I don't know quite what it means on ARM but it may not be
> what you want.
On ARM, all 4G address space can be used by DMA (at least for all the
hardwares I have ever used). All memory pages are in DMA zone AFAIK.
I put GFP_DMA here just to be safe if there were any constraints.
By checking drivers in drivers/net/arm/, ep93xx_eth.c also uses this flag,
so I guess this is acceptable?
> [...]
>> + if (status & (FTMAC100_INT_XPKT_OK | FTMAC100_INT_XPKT_LOST)) {
>> + /*
>> + * FTMAC100_INT_XPKT_OK:
>> + * packet transmitted to ethernet successfully
>> + *
>> + * FTMAC100_INT_XPKT_LOST:
>> + * packet transmitted to ethernet lost due to late
>> + * collision or excessive collision
>> + */
>> + ftmac100_tx_complete(priv);
>> + }
>
> TX completions should also be handled through NAPI if possible.
OK, I'll study how to do that.
> [...]
>> + priv->rx_pointer = 0;
>> + priv->tx_clean_pointer = 0;
>> + priv->tx_pointer = 0;
>> + spin_lock_init(&priv->hw_lock);
>> + spin_lock_init(&priv->rx_lock);
>> + spin_lock_init(&priv->tx_lock);
>
> These locks should be initialised in your probe function.
OK, fixed.
> [...]
>> + unregister_netdev(netdev);
>
> There should be a netif_napi_del() before this.
OK, fixed.
> A general comment: please use netdev_err(), netdev_info() etc. for
> logging. This ensures that both the platform device address and the
> network device name appears in the log messages.
OK, fixed.
Thanks a lot for your detailed review. I'll submit a new version ASAP.
Thanks,
Po-Yu Chuang
^ permalink raw reply
* Re: [PATCH] net: remove dev_txq_stats_fold()
From: David Miller @ 2011-01-14 5:45 UTC (permalink / raw)
To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
Cc: jarkao2-Re5JQEeQqe8AvxtiuMwx3w, david-b-yBeKhBN/0LDR7s880joybQ,
neiljay-Re5JQEeQqe8AvxtiuMwx3w, linux-usb-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA,
alexander.h.duyck-ral2JQCrhuEAvxtiuMwx3w,
jeffrey.t.kirsher-ral2JQCrhuEAvxtiuMwx3w,
mina86-deATy8a+UHjQT0dZR+AlfA,
sandeep.kumar-KZfg59tc24xl57MIdRCFDg
In-Reply-To: <1294870394.3335.75.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, 12 Jan 2011 23:13:14 +0100
> [PATCH v2] net: remove dev_txq_stats_fold()
>
> After recent changes, (percpu stats on vlan/tunnels...), we dont need
> anymore per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters.
>
> Only remaining users are ixgbe, sch_teql, gianfar & macvlan :
>
> 1) ixgbe can be converted to use existing tx_ring counters.
>
> 2) macvlan incremented txq->tx_dropped, it can use the
> dev->stats.tx_dropped counter.
>
> 3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats)
> Now we have ndo_get_stats64(), use it, even for "unsigned long"
> fields (No need to bring back a struct net_device_stats)
>
> 4) gianfar adds a stats structure per tx queue to hold
> tx_bytes/tx_packets
>
> This removes a lockdep warning (and possible lockup) in rndis gadget,
> calling dev_get_stats() from hard IRQ context.
>
> Ref: http://www.spinics.net/lists/netdev/msg149202.html
>
> Reported-by: Neil Jones <neiljay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Jarek Poplawski <jarkao2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: Alexander Duyck <alexander.h.duyck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> CC: Jeff Kirsher <jeffrey.t.kirsher-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> CC: Sandeep Gopalpet <sandeep.kumar-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
> CC: Michal Nazarewicz <mina86-deATy8a+UHjQT0dZR+AlfA@public.gmane.org>
Applied, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v4 05/10] net/fec: add dual fec support for mx28
From: Shawn Guo @ 2011-01-14 5:48 UTC (permalink / raw)
To: Uwe Kleine-König
Cc: davem, gerg, baruch, eric, bryan.wu, r64343, B32542, lw, w.sang,
s.hauer, jamie, jamie, netdev, linux-arm-kernel
In-Reply-To: <20110113144805.GS24920@pengutronix.de>
Hi Uwe,
On Thu, Jan 13, 2011 at 03:48:05PM +0100, Uwe Kleine-König wrote:
[...]
> > +/* Controller is ENET-MAC */
> > +#define FEC_QUIRK_ENET_MAC (1 << 0)
> does this really qualify to be a quirk?
>
My understanding is that ENET-MAC is a type of "quirky" FEC
controller.
> > +/* Controller needs driver to swap frame */
> > +#define FEC_QUIRK_SWAP_FRAME (1 << 1)
> IMHO this is a bit misnamed. FEC_QUIRK_NEEDS_BE_DATA or similar would
> be more accurate.
>
When your make this change, you may want to pick a better name for
function swap_buffer too.
[...]
> > +static void *swap_buffer(void *bufaddr, int len)
> > +{
> > + int i;
> > + unsigned int *buf = bufaddr;
> > +
> > + for (i = 0; i < (len + 3) / 4; i++, buf++)
> > + *buf = cpu_to_be32(*buf);
> if len isn't a multiple of 4 this accesses bytes behind len. Is this
> generally OK here? (E.g. because skbs always have a length that is a
> multiple of 4?)
The len may not be a multiple of 4. But I believe bufaddr is always
a buffer allocated in a length that is a multiple of 4, and the 1~3
bytes exceeding the len very likely has no data that matters. But
yes, it deserves a safer implementation.
[...]
> > + /*
> > + * The dual fec interfaces are not equivalent with enet-mac.
> > + * Here are the differences:
> > + *
> > + * - fec0 supports MII & RMII modes while fec1 only supports RMII
> > + * - fec0 acts as the 1588 time master while fec1 is slave
> > + * - external phys can only be configured by fec0
> > + *
> > + * That is to say fec1 can not work independently. It only works
> > + * when fec0 is working. The reason behind this design is that the
> > + * second interface is added primarily for Switch mode.
> > + *
> > + * Because of the last point above, both phys are attached on fec0
> > + * mdio interface in board design, and need to be configured by
> > + * fec0 mii_bus.
> > + */
> > + if ((id_entry->driver_data & FEC_QUIRK_ENET_MAC) && pdev->id) {
> > + /* fec1 uses fec0 mii_bus */
> > + fep->mii_bus = fec0_mii_bus;
> > + return 0;
> What happens if imx28-fec.1 is probed before imx28-fec.0?
It's something that generally should not happen, as these two fec are
not equivalent, and fec.1 should always be added after fec.0 if you
intend to get dual interfaces. But yes, we should add error checking
for this case in the driver.
--
Regards,
Shawn
^ permalink raw reply
* Re: [PATCH] ipsec: update MAX_AH_AUTH_LEN to support sha512
From: David Miller @ 2011-01-14 5:50 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: netdev
In-Reply-To: <4D2F73C7.8010107@6wind.com>
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 13 Jan 2011 22:51:03 +0100
>>From e330817aa2b33e9d1f44071072fdc4778acf8d76 Mon Sep 17 00:00:00 2001
> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Date: Thu, 13 Jan 2011 14:20:19 -0500
> Subject: [PATCH] ipsec: update MAX_AH_AUTH_LEN to support sha512
>
> icv_truncbits is set to 256 for sha512, so update
> MAX_AH_AUTH_LEN to 64.
>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Applied.
^ permalink raw reply
* Re: [PATCH] r8169: keep firmware in memory.
From: David Miller @ 2011-01-14 5:50 UTC (permalink / raw)
To: romieu; +Cc: netdev, jarek, hayeswang, benh, torvalds
In-Reply-To: <20110113230753.GA2750@electric-eye.fr.zoreil.com>
From: Francois Romieu <romieu@fr.zoreil.com>
Date: Fri, 14 Jan 2011 00:07:53 +0100
> The firmware agent is not available during resume. Loading the firmware
> during open() (see eee3a96c6368f47df8df5bd4ed1843600652b337) is not
> enough.
>
> close() is run during resume through rtl8169_reset_task(), whence the
> mildly natural release of firmware in the driver removal method instead.
>
> It will help with http://bugs.debian.org/609538. It will not avoid
> the 60 seconds delay when:
> - there is no firmware
> - the driver is loaded and the device is not up before a suspend/resume
>
> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
> Tested-by: Jarek Kamiński <jarek@vilo.eu.org>
> Cc: Hayes <hayeswang@realtek.com>
> Cc: Ben Hutchings <benh@debian.org>
Applied.
^ permalink raw reply
* Re: [PATCH] USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable.
From: David Miller @ 2011-01-14 5:50 UTC (permalink / raw)
To: jj
Cc: linux-kernel, oliver, gregkh, linux-usb, netdev, alexey.orishko,
hans.petter.selasky
In-Reply-To: <alpine.LNX.2.00.1101132229260.11347@swampdragon.chaosbits.net>
From: Jesper Juhl <jj@chaosbits.net>
Date: Thu, 13 Jan 2011 22:40:11 +0100 (CET)
> skb_clone() dynamically allocates memory and may fail. If it does it
> returns NULL. This means we'll dereference a NULL pointer in
> drivers/net/usb/cdc_ncm.c::cdc_ncm_rx_fixup().
> As far as I can tell, the proper way to deal with this is simply to goto
> the error label.
>
> Furthermore gcc complains that 'skb' may be used uninitialized:
> drivers/net/usb/cdc_ncm.c: In function ‘cdc_ncm_rx_fixup’:
> drivers/net/usb/cdc_ncm.c:922:18: warning: ‘skb’ may be used uninitialized in this function
> and I believe it is right. On the line where we
> pr_debug("invalid frame detected (ignored)" ...
> we are using the local variable 'skb' but nothing has ever been assigned
> to that variable yet. I believe the correct fix for that is to use
> 'skb_in' instead.
>
> Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Applied.
^ permalink raw reply
* Re: [PATCH 1/2] ks8695net: Disable non-working ethtool operations
From: David Miller @ 2011-01-14 5:50 UTC (permalink / raw)
To: bhutchings; +Cc: figo1802, zealcook, netdev
In-Reply-To: <1294941014.3946.46.camel@bwh-desktop>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 13 Jan 2011 17:50:14 +0000
> Some ethtool operations can only be implemented for the WAN port, and
> not all such operations are allowed to return an error code such as
> -EOPNOTSUPP. Therefore, define two separate ethtool_ops structures
> for WAN and non-WAN ports; simplify and rename the WAN-only functions.
>
> This is completely untested as I don't have an ARM build environment.
>
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Applied.
^ permalink raw reply
* Re: [PATCH] vxge: Remember to release firmware after upgrading firmware
From: David Miller @ 2011-01-14 5:50 UTC (permalink / raw)
To: jj
Cc: netdev, linux-kernel, ramkrishna.vepa, jon.mason,
sivakumar.subramani, sreenivasa.honnur
In-Reply-To: <alpine.LNX.2.00.1101132119080.11347@swampdragon.chaosbits.net>
From: Jesper Juhl <jj@chaosbits.net>
Date: Thu, 13 Jan 2011 21:25:20 +0100 (CET)
> Regardless of whether the firmware update being performed by
> vxge_fw_upgrade() is a success or not we must still remember to always
> release_firmware() before returning.
>
> Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Applied.
^ permalink raw reply
* Re: [PATCH 2/2] ks8695net: Use default implementation of ethtool_ops::get_link
From: David Miller @ 2011-01-14 5:51 UTC (permalink / raw)
To: bhutchings; +Cc: figo1802, zealcook, netdev
In-Reply-To: <1294941171.3946.49.camel@bwh-desktop>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 13 Jan 2011 17:52:51 +0000
> This is completely untested as I don't have an ARM build environment.
>
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
...
> -/**
> * ks8695_wan_get_pause - Retrieve network pause/flow-control
> advertising
> * @ndev: The device to retrieve settings from
> * @param: The structure to fill out with the information
Line-wrap damage from your mail client.
> @@ -1058,7 +1044,7 @@ static const struct ethtool_ops
> ks8695_wan_ethtool_ops = {
> .get_settings = ks8695_wan_get_settings,
Same here.
I fixed this up and applied your patch.
^ permalink raw reply
* Re: [PATCH net-next-2.6] netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr
From: David Miller @ 2011-01-14 5:51 UTC (permalink / raw)
To: joe; +Cc: vapier.adi, tklauser, michael.hennerich, uclinux-dist-devel,
netdev
In-Reply-To: <1294891685.4114.29.camel@Joe-Laptop>
From: Joe Perches <joe@perches.com>
Date: Wed, 12 Jan 2011 20:08:04 -0800
> Remove code that has no effect.
>
> Signed-off-by: Joe Perches <joe@perches.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next-2.6] etherdevice.h: Add is_unicast_ether_addr function
From: David Miller @ 2011-01-14 5:51 UTC (permalink / raw)
To: tklauser; +Cc: cmetcalf, netdev
In-Reply-To: <1294906496-14950-1-git-send-email-tklauser@distanz.ch>
From: Tobias Klauser <tklauser@distanz.ch>
Date: Thu, 13 Jan 2011 09:14:56 +0100
>>From a check for !is_multicast_ether_addr it is not always obvious that
> we're checking for a unicast address. So add this helper function to
> make those code paths easier to read.
>
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Applied.
^ permalink raw reply
* Re: [PATCH v2 net-next-2.6] netdev: tilepro: Use is_unicast_ether_addr helper
From: David Miller @ 2011-01-14 5:51 UTC (permalink / raw)
To: tklauser; +Cc: cmetcalf, netdev
In-Reply-To: <1294906508-14999-1-git-send-email-tklauser@distanz.ch>
From: Tobias Klauser <tklauser@distanz.ch>
Date: Thu, 13 Jan 2011 09:15:08 +0100
> Use is_unicast_ether_addr from linux/etherdevice.h instead of custom
> macros.
>
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Applied.
^ permalink raw reply
* Re: [PATCH 02/10] GRETH: added option to disable a device node from bootloader.
From: David Miller @ 2011-01-14 6:12 UTC (permalink / raw)
To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1294907135-24884-2-git-send-email-daniel@gaisler.com>
From: Daniel Hellstrom <daniel@gaisler.com>
Date: Thu, 13 Jan 2011 09:25:27 +0100
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
This is not how you do this.
Simply not present the device in the OpenFirmware tree at all. If you
can make this special properly appear, you can also toss the device
node away completely.
There is zero reason whatsoever to create a special hack-job
non-standardized device node properly to do this.
^ permalink raw reply
* Re: [PATCH 03/10] GRETH: added no_gbit option
From: David Miller @ 2011-01-14 6:13 UTC (permalink / raw)
To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1294907135-24884-3-git-send-email-daniel@gaisler.com>
From: Daniel Hellstrom <daniel@gaisler.com>
Date: Thu, 13 Jan 2011 09:25:28 +0100
> For debug only. The driver does not report that it is GBit capable, instead
> it will report 10/100 mode to the generic PHY layer.
>
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
You may not add special purpose module parameters to your ethernet driver.
Instead, users can use ethtool to control what speeds the device will
try to advertise for during auto-negotiation, or use with a fixed
link configuration.
^ permalink raw reply
* Re: [PATCH 04/10] GRETH: added greth_compat_mode module parameter
From: David Miller @ 2011-01-14 6:14 UTC (permalink / raw)
To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1294907135-24884-4-git-send-email-daniel@gaisler.com>
From: Daniel Hellstrom <daniel@gaisler.com>
Date: Thu, 13 Jan 2011 09:25:29 +0100
> The greth_compat_mode option can be used to set a GRETH GBit capable MAC
> in operate as if the GRETH 10/100 device was found. The GRETH GBit supports
> TCP/UDP checksum offloading, unaligned frame buffers, scatter gather etc.
> Enabling this mode allows the developer to test the GRETH 10/100 device
> without all features mentioned above on a GBit MAC capable of the above.
>
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
No special device specific module parameters please.
What if every single driver author added something like this and they
all named it something different or made it behave in slightly differing
ways?
What kind of user experience would that result in?
It would result in a sucky one, which is why we avoid adding all kinds
of hacky driver specific module options.
Find a generic way to provide this functionality.
^ permalink raw reply
* Re: [PATCH] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-14 6:35 UTC (permalink / raw)
To: Joe Perches; +Cc: netdev, linux-kernel, Po-Yu Chuang
In-Reply-To: <1294933197.4114.85.camel@Joe-Laptop>
Dear Joe,
On Thu, Jan 13, 2011 at 11:39 PM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2011-01-13 at 19:49 +0800, Po-Yu Chuang wrote:
>> From: Po-Yu Chuang <ratbert@faraday-tech.com>
>>
>> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
>> MII. This driver has been working on some ARM/NDS32 SoC including
>> Faraday A320 and Andes AG101.
>
> A couple of trivial comments not already mentioned by others.
>
>> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> []
>> @@ -2014,6 +2014,15 @@ config BCM63XX_ENET
>> This driver supports the ethernet MACs in the Broadcom 63xx
>> MIPS chipset family (BCM63XX).
>>
>> +config FTMAC100
>> + tristate "Faraday FTMAC100 10/100 Ethernet support"
>> + depends on ARM
>> + select MII
>> + help
>> + This driver supports the FTMAC100 Ethernet controller from
>> + Faraday. It is used on Faraday A320, Andes AG101, AG101P
>> + and some other ARM/NDS32 SoC's.
>> +
>
> ARM specific net drivers are for now in drivers/net/arm/
> Perhaps it's better to locate these files there?
This controller is used by not only ARM-base platforms, but also
NDS32-base ones.
NDS32 is an ISA designed by Andes tech. Although it is not yet in the mainline,
they plan to push their stuff to mainline and are working on that.
So... I don't know if it is better to put my driver to driver/net/arm/.
Should I leave my driver at driver/net/ or put it in drvier/net/arm/
and let Andes tech guys
move it out of driver/net/arm/ when they use it later?
>> + if (unlikely(ftmac100_rxdes_rx_error(rxdes))) {
>> + if (printk_ratelimit())
>> + dev_info(dev, "rx err\n");
>
> There are many printk_ratelimit() tests.
>
> It's better to use net_ratelimit() or a local ratelimit_state
> so there's less possible suppression of other subsystem
> messages.
OK, fixed.
Thanks a lot for your absolutely non-trivial review. :-)
I'll submit a new version ASAP.
Thanks,
Po-Yu Chuang
^ permalink raw reply
* Re: Flow Control and Port Mirroring Revisited
From: Simon Horman @ 2011-01-14 6:35 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jesse Gross, Eric Dumazet, Rusty Russell, virtualization, dev,
virtualization, netdev, kvm
In-Reply-To: <20110114045818.GA29738@redhat.com>
On Fri, Jan 14, 2011 at 06:58:18AM +0200, Michael S. Tsirkin wrote:
> On Fri, Jan 14, 2011 at 08:41:36AM +0900, Simon Horman wrote:
> > On Thu, Jan 13, 2011 at 10:45:38AM -0500, Jesse Gross wrote:
> > > On Thu, Jan 13, 2011 at 1:47 AM, Simon Horman <horms@verge.net.au> wrote:
> > > > On Mon, Jan 10, 2011 at 06:31:55PM +0900, Simon Horman wrote:
> > > >> On Fri, Jan 07, 2011 at 10:23:58AM +0900, Simon Horman wrote:
> > > >> > On Thu, Jan 06, 2011 at 05:38:01PM -0500, Jesse Gross wrote:
> > > >> >
> > > >> > [ snip ]
> > > >> > >
> > > >> > > I know that everyone likes a nice netperf result but I agree with
> > > >> > > Michael that this probably isn't the right question to be asking. I
> > > >> > > don't think that socket buffers are a real solution to the flow
> > > >> > > control problem: they happen to provide that functionality but it's
> > > >> > > more of a side effect than anything. It's just that the amount of
> > > >> > > memory consumed by packets in the queue(s) doesn't really have any
> > > >> > > implicit meaning for flow control (think multiple physical adapters,
> > > >> > > all with the same speed instead of a virtual device and a physical
> > > >> > > device with wildly different speeds). The analog in the physical
> > > >> > > world that you're looking for would be Ethernet flow control.
> > > >> > > Obviously, if the question is limiting CPU or memory consumption then
> > > >> > > that's a different story.
> > > >> >
> > > >> > Point taken. I will see if I can control CPU (and thus memory) consumption
> > > >> > using cgroups and/or tc.
> > > >>
> > > >> I have found that I can successfully control the throughput using
> > > >> the following techniques
> > > >>
> > > >> 1) Place a tc egress filter on dummy0
> > > >>
> > > >> 2) Use ovs-ofctl to add a flow that sends skbs to dummy0 and then eth1,
> > > >> this is effectively the same as one of my hacks to the datapath
> > > >> that I mentioned in an earlier mail. The result is that eth1
> > > >> "paces" the connection.
>
> This is actually a bug. This means that one slow connection will affect
> fast ones. I intend to change the default for qemu to sndbuf=0 : this
> will fix it but break your "pacing". So pls do not count on this
> behaviour.
Do you have a patch I could test?
> > > > Further to this, I wonder if there is any interest in providing
> > > > a method to switch the action order - using ovs-ofctl is a hack imho -
> > > > and/or switching the default action order for mirroring.
> > >
> > > I'm not sure that there is a way to do this that is correct in the
> > > generic case. It's possible that the destination could be a VM while
> > > packets are being mirrored to a physical device or we could be
> > > multicasting or some other arbitrarily complex scenario. Just think
> > > of what a physical switch would do if it has ports with two different
> > > speeds.
> >
> > Yes, I have considered that case. And I agree that perhaps there
> > is no sensible default. But perhaps we could make it configurable somehow?
>
> The fix is at the application level. Run netperf with -b and -w flags to
> limit the speed to a sensible value.
Perhaps I should have stated my goals more clearly.
I'm interested in situations where I don't control the application.
^ permalink raw reply
* Re: [PATCH v4 06/10] ARM: mx28: update clock and device name for dual fec support
From: Shawn Guo @ 2011-01-14 6:46 UTC (permalink / raw)
To: Uwe Kleine-König
Cc: davem, gerg, baruch, eric, bryan.wu, r64343, B32542, lw, w.sang,
s.hauer, jamie, jamie, netdev, linux-arm-kernel
In-Reply-To: <20110113150622.GV24920@pengutronix.de>
Hi Uwe,
On Thu, Jan 13, 2011 at 04:06:22PM +0100, Uwe Kleine-König wrote:
> Hi Shawn,
>
> On Thu, Jan 06, 2011 at 03:13:14PM +0800, Shawn Guo wrote:
> > Change device name from "fec" to "imx28-fec", so that fec driver
> > can distinguish mx28.
> >
> > Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
> > ---
> > Changes for v4:
> > - Use "imx28-fec" as fec device name
> >
> > Changes for v3:
> > - Change device name to "enet-mac"
> >
> > arch/arm/mach-mxs/clock-mx28.c | 3 ++-
> > arch/arm/mach-mxs/devices/platform-fec.c | 2 +-
> > 2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm/mach-mxs/clock-mx28.c b/arch/arm/mach-mxs/clock-mx28.c
> > index f20b254..e2a8b0f 100644
> > --- a/arch/arm/mach-mxs/clock-mx28.c
> > +++ b/arch/arm/mach-mxs/clock-mx28.c
> > @@ -606,7 +606,8 @@ static struct clk_lookup lookups[] = {
> > _REGISTER_CLOCK("duart", "apb_pclk", xbus_clk)
> > /* for amba-pl011 driver */
> > _REGISTER_CLOCK("duart", NULL, uart_clk)
> > - _REGISTER_CLOCK("fec.0", NULL, fec_clk)
> > + _REGISTER_CLOCK("imx28-fec.0", NULL, fec_clk)
> > + _REGISTER_CLOCK("imx28-fec.1", NULL, fec_clk)
> > _REGISTER_CLOCK("rtc", NULL, rtc_clk)
> > _REGISTER_CLOCK("pll2", NULL, pll2_clk)
> > _REGISTER_CLOCK(NULL, "hclk", hbus_clk)
> > diff --git a/arch/arm/mach-mxs/devices/platform-fec.c b/arch/arm/mach-mxs/devices/platform-fec.c
> > index c08168c..c42dff7 100644
> > --- a/arch/arm/mach-mxs/devices/platform-fec.c
> > +++ b/arch/arm/mach-mxs/devices/platform-fec.c
> > @@ -45,6 +45,6 @@ struct platform_device *__init mxs_add_fec(
> > },
> > };
> >
> > - return mxs_add_platform_device("fec", data->id,
> > + return mxs_add_platform_device("imx28-fec", data->id,
> IMHO "imx28-fec" shouldn't be hard coded here but come from data. See
> imx-spi device registration for an example.
>
Sascha has merged the patch, so I will send a follow-up patch.
--
Regards,
Shawn
^ permalink raw reply
* Re: [PATCH] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-14 6:44 UTC (permalink / raw)
To: Andres Salomon; +Cc: Eric Dumazet, netdev, linux-kernel, Po-Yu Chuang
In-Reply-To: <20110113082950.6747ebb8@queued.net>
Dear Andres,
On Fri, Jan 14, 2011 at 12:29 AM, Andres Salomon <dilinger@queued.net> wrote:
> On Thu, 13 Jan 2011 15:22:39 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> Le jeudi 13 janvier 2011 à 19:49 +0800, Po-Yu Chuang a écrit :
> No one else mentioned it, so I'll add:
>
> Don't explicitly inline functions unless they're in a header, or you
> have a really good reason (and that reason should probably be described
> in a comment). Otherwise, just leave off the 'inline' keyword; the
> compiler is smart enough to decide whether a function should be inlined
> or not.
OK, fixed.
Thanks,
Po-Yu Chuang
^ permalink raw reply
* Re: [PATCH] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-14 6:49 UTC (permalink / raw)
To: Joe Perches; +Cc: netdev, linux-kernel, Po-Yu Chuang
In-Reply-To: <1294959948.4114.189.camel@Joe-Laptop>
Dear Joe,
On Fri, Jan 14, 2011 at 7:05 AM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2011-01-13 at 19:49 +0800, Po-Yu Chuang wrote:
>> From: Po-Yu Chuang <ratbert@faraday-tech.com>
> []
>> + if (is_zero_ether_addr(netdev->dev_addr)) {
>> + random_ether_addr(netdev->dev_addr);
>> + dev_info(&netdev->dev, "generated random MAC address "
>> + "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x.\n",
>> + netdev->dev_addr[0], netdev->dev_addr[1],
>> + netdev->dev_addr[2], netdev->dev_addr[3],
>> + netdev->dev_addr[4], netdev->dev_addr[5]);
>
> Sorry, one other thing.
> There are kernel specific printf extensions for pointer types.
> (look at lib/vsprintf.c) There's one for mac addresses: "%pM".
> This could be done as:
>
> netdev_info(netdev, "Generated random MAC address: %pM\n",
> netdev->dev_addr);
Wow, this is pretty beautiful. Thanks.
best regards,
Po-Yu Chuang
^ permalink raw reply
* [PATCH] ARM: mxs: pass fec device name via platform data
From: Shawn Guo @ 2011-01-14 6:53 UTC (permalink / raw)
To: davem, gerg, baruch, eric, bryan.wu, r64343, B32542, lw; +Cc: Shawn Guo
In-Reply-To: <20110113150622.GV24920@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
---
arch/arm/mach-mxs/devices/platform-fec.c | 11 ++++++-----
arch/arm/mach-mxs/include/mach/devices-common.h | 1 +
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/arm/mach-mxs/devices/platform-fec.c b/arch/arm/mach-mxs/devices/platform-fec.c
index c42dff7..75eb26b 100644
--- a/arch/arm/mach-mxs/devices/platform-fec.c
+++ b/arch/arm/mach-mxs/devices/platform-fec.c
@@ -10,20 +10,21 @@
#include <mach/mx28.h>
#include <mach/devices-common.h>
-#define mxs_fec_data_entry_single(soc, _id) \
+#define mxs_fec_data_entry_single(soc, _devid, _id) \
{ \
+ .devid = _devid, \
.id = _id, \
.iobase = soc ## _ENET_MAC ## _id ## _BASE_ADDR, \
.irq = soc ## _INT_ENET_MAC ## _id, \
}
-#define mxs_fec_data_entry(soc, _id) \
- [_id] = mxs_fec_data_entry_single(soc, _id)
+#define mxs_fec_data_entry(soc, _devid, _id) \
+ [_id] = mxs_fec_data_entry_single(soc, _devid, _id)
#ifdef CONFIG_SOC_IMX28
const struct mxs_fec_data mx28_fec_data[] __initconst = {
#define mx28_fec_data_entry(_id) \
- mxs_fec_data_entry(MX28, _id)
+ mxs_fec_data_entry(MX28, "imx28-fec", _id)
mx28_fec_data_entry(0),
mx28_fec_data_entry(1),
};
@@ -45,6 +46,6 @@ struct platform_device *__init mxs_add_fec(
},
};
- return mxs_add_platform_device("imx28-fec", data->id,
+ return mxs_add_platform_device(data->devid, data->id,
res, ARRAY_SIZE(res), pdata, sizeof(*pdata));
}
diff --git a/arch/arm/mach-mxs/include/mach/devices-common.h b/arch/arm/mach-mxs/include/mach/devices-common.h
index 6c3d1a1..10fbcfd 100644
--- a/arch/arm/mach-mxs/include/mach/devices-common.h
+++ b/arch/arm/mach-mxs/include/mach/devices-common.h
@@ -33,6 +33,7 @@ int __init mxs_add_duart(const struct amba_device *dev);
/* fec */
#include <linux/fec.h>
struct mxs_fec_data {
+ const char *devid;
int id;
resource_size_t iobase;
resource_size_t iosize;
--
1.7.1
^ permalink raw reply related
* Re: [PATCH] r8169: keep firmware in memory.
From: Michael Tokarev @ 2011-01-14 6:52 UTC (permalink / raw)
To: Francois Romieu
Cc: David Miller, netdev, Jarek Kamiński, Hayes, Ben Hutchings,
Linus Torvalds
In-Reply-To: <20110113230753.GA2750@electric-eye.fr.zoreil.com>
14.01.2011 02:07, Francois Romieu wrote:
> The firmware agent is not available during resume. Loading the firmware
> during open() (see eee3a96c6368f47df8df5bd4ed1843600652b337) is not
> enough.
>
> close() is run during resume through rtl8169_reset_task(), whence the
> mildly natural release of firmware in the driver removal method instead.
>
> It will help with http://bugs.debian.org/609538. It will not avoid
> the 60 seconds delay when:
> - there is no firmware
> - the driver is loaded and the device is not up before a suspend/resume
Given all this I think this is somewhat clumsy still. How
does other NIC drivers handles this situation - e.g. tg3?
Maybe this needs to be a generic solution instead of per-driver?
/mjt
^ permalink raw reply
* Re: Flow Control and Port Mirroring Revisited
From: Michael S. Tsirkin @ 2011-01-14 6:54 UTC (permalink / raw)
To: Simon Horman
Cc: Jesse Gross, Eric Dumazet, Rusty Russell, virtualization, dev,
virtualization, netdev, kvm
In-Reply-To: <20110114063528.GB10957@verge.net.au>
On Fri, Jan 14, 2011 at 03:35:28PM +0900, Simon Horman wrote:
> On Fri, Jan 14, 2011 at 06:58:18AM +0200, Michael S. Tsirkin wrote:
> > On Fri, Jan 14, 2011 at 08:41:36AM +0900, Simon Horman wrote:
> > > On Thu, Jan 13, 2011 at 10:45:38AM -0500, Jesse Gross wrote:
> > > > On Thu, Jan 13, 2011 at 1:47 AM, Simon Horman <horms@verge.net.au> wrote:
> > > > > On Mon, Jan 10, 2011 at 06:31:55PM +0900, Simon Horman wrote:
> > > > >> On Fri, Jan 07, 2011 at 10:23:58AM +0900, Simon Horman wrote:
> > > > >> > On Thu, Jan 06, 2011 at 05:38:01PM -0500, Jesse Gross wrote:
> > > > >> >
> > > > >> > [ snip ]
> > > > >> > >
> > > > >> > > I know that everyone likes a nice netperf result but I agree with
> > > > >> > > Michael that this probably isn't the right question to be asking. I
> > > > >> > > don't think that socket buffers are a real solution to the flow
> > > > >> > > control problem: they happen to provide that functionality but it's
> > > > >> > > more of a side effect than anything. It's just that the amount of
> > > > >> > > memory consumed by packets in the queue(s) doesn't really have any
> > > > >> > > implicit meaning for flow control (think multiple physical adapters,
> > > > >> > > all with the same speed instead of a virtual device and a physical
> > > > >> > > device with wildly different speeds). The analog in the physical
> > > > >> > > world that you're looking for would be Ethernet flow control.
> > > > >> > > Obviously, if the question is limiting CPU or memory consumption then
> > > > >> > > that's a different story.
> > > > >> >
> > > > >> > Point taken. I will see if I can control CPU (and thus memory) consumption
> > > > >> > using cgroups and/or tc.
> > > > >>
> > > > >> I have found that I can successfully control the throughput using
> > > > >> the following techniques
> > > > >>
> > > > >> 1) Place a tc egress filter on dummy0
> > > > >>
> > > > >> 2) Use ovs-ofctl to add a flow that sends skbs to dummy0 and then eth1,
> > > > >> this is effectively the same as one of my hacks to the datapath
> > > > >> that I mentioned in an earlier mail. The result is that eth1
> > > > >> "paces" the connection.
> >
> > This is actually a bug. This means that one slow connection will affect
> > fast ones. I intend to change the default for qemu to sndbuf=0 : this
> > will fix it but break your "pacing". So pls do not count on this
> > behaviour.
>
> Do you have a patch I could test?
You can (and users already can) just run qemu with sndbuf=0. But if you
like, below.
> > > > > Further to this, I wonder if there is any interest in providing
> > > > > a method to switch the action order - using ovs-ofctl is a hack imho -
> > > > > and/or switching the default action order for mirroring.
> > > >
> > > > I'm not sure that there is a way to do this that is correct in the
> > > > generic case. It's possible that the destination could be a VM while
> > > > packets are being mirrored to a physical device or we could be
> > > > multicasting or some other arbitrarily complex scenario. Just think
> > > > of what a physical switch would do if it has ports with two different
> > > > speeds.
> > >
> > > Yes, I have considered that case. And I agree that perhaps there
> > > is no sensible default. But perhaps we could make it configurable somehow?
> >
> > The fix is at the application level. Run netperf with -b and -w flags to
> > limit the speed to a sensible value.
>
> Perhaps I should have stated my goals more clearly.
> I'm interested in situations where I don't control the application.
Well an application that streams UDP without any throttling
at the application level will break on a physical network, right?
So I am not sure why should one try to make it work on the virtual one.
But let's assume that you do want to throttle the guest
for reasons such as QOS. The proper approach seems
to be to throttle the sender, not have a dummy throttled
receiver "pacing" it. Place the qemu process in the
correct net_cls cgroup, set the class id and apply a rate limit?
---
diff --git a/net/tap-linux.c b/net/tap-linux.c
index f7aa904..0dbcdd4 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -87,7 +87,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, int vnet_hdr_required
* Ethernet NICs generally have txqueuelen=1000, so 1Mb is
* a good default, given a 1500 byte MTU.
*/
-#define TAP_DEFAULT_SNDBUF 1024*1024
+#define TAP_DEFAULT_SNDBUF 0
int tap_set_sndbuf(int fd, QemuOpts *opts)
{
^ permalink raw reply related
* Re: [PATCH] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-14 6:56 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, linux-kernel, Po-Yu Chuang
In-Reply-To: <1294928559.3570.130.camel@edumazet-laptop>
Dear Eric,
On Thu, Jan 13, 2011 at 10:22 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le jeudi 13 janvier 2011 à 19:49 +0800, Po-Yu Chuang a écrit :
>> From: Po-Yu Chuang <ratbert@faraday-tech.com>
>>
>> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
>> MII. This driver has been working on some ARM/NDS32 SoC including
>> Faraday A320 and Andes AG101.
>>
>> Signed-off-by: Po-Yu Chuang <ratbert@faraday-tech.com>
> Hi
>
> 1) please use netdev_alloc_skb_ip_align() instead of dev_alloc_skb() +
> skb_reserve(skb, NET_IP_ALIGN);
OK, fixed.
> 2) Dont include a "struct net_device_stats stats" in struct ftmac100,
> you can use the one included in struct net_device
> (You can then remove ftmac100_get_stats())
OK, fixed.
> 3) Dont "netdev->last_rx = jiffies;" : This is not driver job.
> Ditto for trans_start
OK, fixed.
> 4) You must comment why wmb() is needed in ftmac100_xmit()
OK, comment added.
> 5) Why isnt NAPI used too for TX completions ?
> BTW, I am not sure we want a define. All new drivers must use NAPI.
I'll study how to do that
> 6) Use is_valid_ether_addr() instead of is_zero_ether_addr() in
> ftmac100_probe()
OK, fixed.
> 7) "static struct ethtool_ops ftmac100_ethtool_ops" should be const :
> "static const struct ethtool_ops ftmac100_ethtool_ops"
> Ditto for :
> "static struct net_device_ops ftmac100_netdev_ops"
OK, both fixed.
> 8) Why an interrupt handler (ftmac100_interrupt()) needs to block
> interrupts with spin_lock_irqsave(&priv->hw_lock, flags); ?
Fixed. Not a problem anymore since hw_lock is removed now.
> 9) Instead of dev_info(&netdev->dev ...) , please consider netdev_info()
OK, fixed.
Thanks a lot for your detailed review. I'll submit a new version ASAP.
Thanks,
Po-Yu Chuang
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox