* Re: [PATCH] drivers/net: ks8842 Fix crash on received packet when in PIO mode.
From: David Miller @ 2011-05-31 22:15 UTC (permalink / raw)
To: dennis.aberilla; +Cc: info, netdev
In-Reply-To: <20110529214652.GA4707@dens-work>
From: Dennis Aberilla <dennis.aberilla@mimomax.com>
Date: Mon, 30 May 2011 09:46:54 +1200
> This patch fixes a driver crash during packet reception due to not enough
> bytes allocated in the skb. Since the loop reads out 4 bytes at a time, we
> need to allow for up to 3 bytes of slack space.
>
> Signed-off-by: Dennis Aberilla <denzzzhome@yahoo.com>
Applied, thanks.
> |Dennis
> =======================================================================
> This email, including any attachments, is only for the intended
> addressee. It is subject to copyright, is confidential and may be
> the subject of legal or other privilege, none of which is waived or
> lost by reason of this transmission.
> If the receiver is not the intended addressee, please accept our
> apologies, notify us by return, delete all copies and perform no
> other act on the email.
> Unfortunately, we cannot warrant that the email has not been
> altered or corrupted during transmission.
> =======================================================================
Please turn this off for future patch submissions or I will
completely ignore them. It is entirely inappropriate for emails
destined for a public mailing list.
Thanks.
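
For readers following the fix, here is a minimal userspace sketch of why a
loop that reads 4 bytes at a time needs up to 3 bytes of slack in the
receive buffer. The names pio_read_span and pio_read_packet are
hypothetical, and memcpy stands in for the 32-bit PIO register read; this
illustrates the idea only and is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of bytes touched when len bytes are
 * transferred in whole 32-bit words (len rounded up to a multiple of 4). */
static size_t pio_read_span(size_t len)
{
    return (len + 3) & ~(size_t)3;
}

/* Hypothetical stand-in for a word-at-a-time PIO read. The caller must
 * provide at least pio_read_span(len) readable bytes in fifo. */
static uint8_t *pio_read_packet(const uint8_t *fifo, size_t len)
{
    /* len + 3 bytes: the last word store may run up to 3 bytes past len. */
    uint8_t *buf = malloc(len + 3);
    size_t off;

    assert(fifo != NULL);
    if (!buf)
        return NULL;
    for (off = 0; off < len; off += 4)
        memcpy(buf + off, fifo + off, 4);   /* always stores a full word */
    return buf;
}
```

With len = 5 the loop stores bytes 0..7, so a buffer of exactly 5 bytes
would be overrun by 3; allocating len + 3 covers the worst case.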
* Re: [PATCH] ip_options_compile: properly handle unaligned pointer
From: David Miller @ 2011-05-31 22:13 UTC (permalink / raw)
To: cmetcalf; +Cc: netdev, linux-kernel, kaber
In-Reply-To: <201105292112.p4TLC6cN017178@farm-0002.internal.tilera.com>
From: Chris Metcalf <cmetcalf@tilera.com>
Date: Sun, 29 May 2011 16:55:44 -0400
> The current code takes an unaligned pointer and does htonl() on it to
> make it big-endian, then does a memcpy(). The problem is that the
> compiler decides that since the pointer is to a __be32, it is legal
> to optimize the copy into a processor word store. However, on an
> architecture that does not handle unaligned writes in kernel space,
> this produces an unaligned exception fault.
>
> The solution is to track the pointer as a "char *" (which removes a bunch
> of unpleasant casts in any case), and then just use put_unaligned_be32()
> to write the value to memory.
>
> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Applied, thanks Chris.
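
As an illustration of the approach described in the patch text, the sketch
below shows a byte-wise big-endian store through a byte pointer, which
never requires an aligned word access. The sketch_ functions are userspace
stand-ins written for this note (an assumption); they mirror the argument
order of the kernel's put_unaligned_be32(val, p) but are not its
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Store val in big-endian order one byte at a time, so the destination
 * may sit at any address (userspace stand-in, not the kernel helper). */
static void sketch_put_unaligned_be32(uint32_t val, unsigned char *p)
{
    assert(p != 0);
    p[0] = (unsigned char)(val >> 24);
    p[1] = (unsigned char)(val >> 16);
    p[2] = (unsigned char)(val >> 8);
    p[3] = (unsigned char)val;
}

/* Matching byte-wise load, used here only to check the store. */
static uint32_t sketch_get_unaligned_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because the pointer type is unsigned char *, the compiler has no basis for
assuming 4-byte alignment and turning the stores into a single word store
that could fault.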
* A question about Forward ACKed packets
From: Dominik Kaspar @ 2011-05-31 22:09 UTC (permalink / raw)
To: netdev
Hi,
I'm trying to understand the concept of Forward Acknowledgements
(FACK). What confuses me most is the variable fackets_out, which,
according to the source code, is equal to the "FACK'd packets".
Isn't by definition only a single packet "the most Forward ACKed
packet"...? If not, what are "FACK'd packets"? Is fackets_out the
number of "gaps" between SACKed data?
Greetings,
Dominik
* Re: [RFC/PATCH] sungem: Spring cleaning and GRO support
From: Benjamin Herrenschmidt @ 2011-05-31 21:58 UTC (permalink / raw)
To: Ben Hutchings; +Cc: David Miller, netdev, R. Herbst, Brian Hamilton
In-Reply-To: <1306875564.2866.39.camel@bwh-desktop>
> > Now the results .... on a dual G5 machine with a 1000Mb link, no
> > measurable netperf difference on Rx and a 3% loss on Tx.
>
> Is TX throughput now CPU-limited or is there some other problem?
I haven't had a chance to measure that properly yet (bloody perf needs
to be built 64-bit and I have a 32-bit distro on that machine, will need
to move over libs etc... today).
> Lacking TSO is going to hurt, but I know we managed multi-gigabit
> single-stream TCP throughput without TSO on x86 systems from 2005.
Right. It -could- be something else too, I need to investigate.
> [...]
> > @@ -736,6 +747,22 @@ static __inline__ void gem_post_rxds(struct gem *gp, int limit)
> > }
> > }
> >
> > +#define ALIGNED_RX_SKB_ADDR(addr) \
> > + ((((unsigned long)(addr) + (64UL - 1UL)) & ~(64UL - 1UL)) - (unsigned long)(addr))
>
> We already have a macro for most of this, so you can define this as:
This is just existing code moved around that I didn't get to cleanup
yet, in fact I was wondering if we really needed that... David, do you
remember if that's something you added for Sparc or I added back then
due to some obscure Apple errata ? I'd like to just switch to
netdev_alloc_skb()
> (PTR_ALIGN(addr, 64) - (addr))
>
> (assuming addr is always a byte pointer; otherwise you need ALIGN and
> the casts to unsigned long).
Yup, I know these :-)
> > +static __inline__ struct sk_buff *gem_alloc_skb(struct net_device *dev, int size,
> > + gfp_t gfp_flags)
> > +{
> > + struct sk_buff *skb = alloc_skb(size + 64, gfp_flags);
>
> You probably should be using netdev_alloc_skb().
As I said above. This is existing code mostly, I need to figure out if
there's a HW reason for the extra alignment first.
> > + if (likely(skb)) {
> > + int offset = (int) ALIGNED_RX_SKB_ADDR(skb->data);
> > + if (offset)
> > + skb_reserve(skb, offset);
>
> skb_reserve() is inline and very simple, so it may be cheaper to call it
> unconditionally.
Ok. Again, existing code :-)
> > + skb->dev = dev;
> > + }
> > + return skb;
> > +}
> > +
> [...]
> > @@ -951,11 +956,12 @@ static irqreturn_t gem_interrupt(int irq, void *dev_id)
> > #ifdef CONFIG_NET_POLL_CONTROLLER
> > static void gem_poll_controller(struct net_device *dev)
> > {
> > - /* gem_interrupt is safe to reentrance so no need
> > - * to disable_irq here.
> > - */
> > - gem_interrupt(dev->irq, dev);
> > -}
> > + struct gem *gp = netdev_priv(dev);
> > +
> > + disable_irq(gp->pdev->irq);
> > + gem_interrupt(gp->pdev->irq, dev);
> > + enable_irq(gp->pdev->irq);
> > +
> > #endif
>
> This might work better with the closing brace left in place...
Ah right, I haven't tested NETPOLL, thanks.
> The change from dev->irq to gp->pdev->irq looks unnecessary - though I
> hope that one day we can get rid of those I/O resource details in struct
> net_device.
That was my thinking. Other drivers I've looked at tend to use pdev->irq
and I don't want to overly rely on "irq" in the netdev, but that doesn't
matter much does it ?
> [...]
> > static int gem_do_start(struct net_device *dev)
> > {
> [...]
> > if (request_irq(gp->pdev->irq, gem_interrupt,
> > IRQF_SHARED, dev->name, (void *)dev)) {
> > netdev_err(dev, "failed to request irq !\n");
> >
> > - spin_lock_irqsave(&gp->lock, flags);
> > - spin_lock(&gp->tx_lock);
> > -
> > napi_disable(&gp->napi);
> > -
> > - gp->running = 0;
> > + netif_device_detach(dev);
>
> I don't think this can be right, as there seems to be no way for the
> device to be re-attached after this failure other than a suspend/resume
> cycle.
Indeed, brain fart. Will fix, thanks.
> > gem_reset(gp);
> > gem_clean_rings(gp);
> > - gem_put_cell(gp);
> >
> > - spin_unlock(&gp->tx_lock);
> > - spin_unlock_irqrestore(&gp->lock, flags);
> > + spin_lock_bh(&gp->lock);
> > + gem_put_cell(gp);
> > + spin_unlock_bh(&gp->lock);
> >
> > return -EAGAIN;
> > }
> [...]
>
> Is the pm_mutex really needed? All control operations should already be
> serialised by the RTNL lock, and you've started taking that in the
> suspend and resume functions.
Well, it's been there forever and I need to get my head around it, but
yes, the rtnl lock might be able to get rid of it, good point. I just
actually added that :-)
So all ndo_set_* are going to be covered by rtnl including the ethtool ?
I don't really want to take the rtnl lock in the reset task (at least
not for the whole duration of it), so I may have to be a bit creative on
synchronization there.
Part of the point of that patch is to remove the looooong locked region
under the private lock, ie most of the chip reset/init sequences are now
done without a lock held (I forgot to add that to the changeset comment
I suppose) and I want to keep it that way.
Thanks for your review, I'll give it another shot after I've managed to
do some measurements/profiling.
Cheers,
Ben.
* 3.0-rc1: DMAR errors from iwlagn
From: Jeremy Fitzhardinge @ 2011-05-31 21:14 UTC (permalink / raw)
To: NetDev, Linux Kernel Mailing List
I have a Lenovo X220 with:
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
Subsystem: Intel Corporation Centrino Ultimate-N 6300 3x3 AGN
Flags: bus master, fast devsel, latency 0, IRQ 50
Memory at d2500000 (64-bit, non-prefetchable) [size=8K]
Capabilities: <access denied>
Kernel driver in use: iwlagn
Kernel modules: iwlagn
With 3.0-rc1, I'm seeing these in dmesg:
DRHD: handling fault status reg 2
DMAR:[DMA Write] Request device [03:00.0] fault addr ffe8a000
DMAR:[fault reason 05] PTE Write access is not set
DRHD: handling fault status reg 2
DMAR:[DMA Write] Request device [03:00.0] fault addr ffd5a000
DMAR:[fault reason 05] PTE Write access is not set
DRHD: handling fault status reg 2
DMAR:[DMA Write] Request device [03:00.0] fault addr ffd52000
DMAR:[fault reason 05] PTE Write access is not set
but the device seems to be working OK.
Thanks,
J
* Re: [RFC/PATCH] sungem: Spring cleaning and GRO support
From: Ben Hutchings @ 2011-05-31 20:59 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: David Miller, netdev, R. Herbst, Brian Hamilton
In-Reply-To: <1306828745.7481.660.camel@pasglop>
On Tue, 2011-05-31 at 17:59 +1000, Benjamin Herrenschmidt wrote:
> Hi David !
>
> For RFC only at this stage, see below why.
>
> This patch simplifies the logic and locking in sungem significantly:
>
> - LLTX is gone, private tx lock is gone
> - We don't poll the PHY while the interface is down
> - The above allowed me to get rid of a pile of state flags
> using the proper interface state provided by the networking
> stack when needed
> - Allocate the bulk of RX skbs at init time using GFP_KERNEL
> - Fix a bug where the dev->features were set after register_netdev()
> - Added GRO while at it
>
> Now the results .... on a dual G5 machine with a 1000Mb link, no
> measurable netperf difference on Rx and a 3% loss on Tx.
Is TX throughput now CPU-limited or is there some other problem?
Lacking TSO is going to hurt, but I know we managed multi-gigabit
single-stream TCP throughput without TSO on x86 systems from 2005.
[...]
> @@ -736,6 +747,22 @@ static __inline__ void gem_post_rxds(struct gem *gp, int limit)
> }
> }
>
> +#define ALIGNED_RX_SKB_ADDR(addr) \
> + ((((unsigned long)(addr) + (64UL - 1UL)) & ~(64UL - 1UL)) - (unsigned long)(addr))
We already have a macro for most of this, so you can define this as:
(PTR_ALIGN(addr, 64) - (addr))
(assuming addr is always a byte pointer; otherwise you need ALIGN and
the casts to unsigned long).
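
To make the equivalence concrete, here is a small userspace sketch that
compares the open-coded offset from the quoted macro with a pointer-based
version. sketch_ptr_align64 is a stand-in written for this note that
behaves like PTR_ALIGN(p, 64) on a byte pointer (an assumption; it is not
the kernel macro itself).

```c
#include <assert.h>
#include <stdint.h>

/* Offset to the next 64-byte boundary, computed on integers in the same
 * way as the quoted ALIGNED_RX_SKB_ADDR() macro. */
static uintptr_t align_offset_open_coded(uintptr_t addr)
{
    return ((addr + (64UL - 1UL)) & ~(uintptr_t)(64UL - 1UL)) - addr;
}

/* Userspace stand-in for PTR_ALIGN(p, 64) on a byte pointer. */
static unsigned char *sketch_ptr_align64(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + 63) & ~(uintptr_t)63);
}

/* The same offset expressed as a pointer difference. */
static uintptr_t align_offset_ptr(unsigned char *p)
{
    assert(p != 0);
    return (uintptr_t)(sketch_ptr_align64(p) - p);
}
```

Both forms return a value in the range 0..63, and 0 when the address is
already aligned, which is why reserving the offset unconditionally is
harmless.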
> +static __inline__ struct sk_buff *gem_alloc_skb(struct net_device *dev, int size,
> + gfp_t gfp_flags)
> +{
> + struct sk_buff *skb = alloc_skb(size + 64, gfp_flags);
You probably should be using netdev_alloc_skb().
> + if (likely(skb)) {
> + int offset = (int) ALIGNED_RX_SKB_ADDR(skb->data);
> + if (offset)
> + skb_reserve(skb, offset);
skb_reserve() is inline and very simple, so it may be cheaper to call it
unconditionally.
> + skb->dev = dev;
> + }
> + return skb;
> +}
> +
[...]
> @@ -951,11 +956,12 @@ static irqreturn_t gem_interrupt(int irq, void *dev_id)
> #ifdef CONFIG_NET_POLL_CONTROLLER
> static void gem_poll_controller(struct net_device *dev)
> {
> - /* gem_interrupt is safe to reentrance so no need
> - * to disable_irq here.
> - */
> - gem_interrupt(dev->irq, dev);
> -}
> + struct gem *gp = netdev_priv(dev);
> +
> + disable_irq(gp->pdev->irq);
> + gem_interrupt(gp->pdev->irq, dev);
> + enable_irq(gp->pdev->irq);
> +
> #endif
This might work better with the closing brace left in place...
The change from dev->irq to gp->pdev->irq looks unnecessary - though I
hope that one day we can get rid of those I/O resource details in struct
net_device.
[...]
> static int gem_do_start(struct net_device *dev)
> {
[...]
> if (request_irq(gp->pdev->irq, gem_interrupt,
> IRQF_SHARED, dev->name, (void *)dev)) {
> netdev_err(dev, "failed to request irq !\n");
>
> - spin_lock_irqsave(&gp->lock, flags);
> - spin_lock(&gp->tx_lock);
> -
> napi_disable(&gp->napi);
> -
> - gp->running = 0;
> + netif_device_detach(dev);
I don't think this can be right, as there seems to be no way for the
device to be re-attached after this failure other than a suspend/resume
cycle.
> gem_reset(gp);
> gem_clean_rings(gp);
> - gem_put_cell(gp);
>
> - spin_unlock(&gp->tx_lock);
> - spin_unlock_irqrestore(&gp->lock, flags);
> + spin_lock_bh(&gp->lock);
> + gem_put_cell(gp);
> + spin_unlock_bh(&gp->lock);
>
> return -EAGAIN;
> }
[...]
Is the pm_mutex really needed? All control operations should already be
serialised by the RTNL lock, and you've started taking that in the
suspend and resume functions.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: IFB and iptables
From: Andrew Beverley @ 2011-05-31 20:33 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: netdev
In-Reply-To: <BANLkTimzL0_w=+CvbFNxNS=wAOZ61CrrdA@mail.gmail.com>
On Wed, 2011-05-25 at 18:21 -0400, Jérôme Poulin wrote:
> Hi,
>
> I'm trying to convert my IMQ based script to use the IFB device instead.
> Things appear to work quite right however the u32 classifier isn't
> aware of any connection tracking and I was wondering if it is at all
> possible to use match from iptables like layer7 when you use the IFB
> device?
It depends where you are attaching your IFB device. Unlike IMQ, IFB can
only be hooked on an interface (IMQ can be hooked between iptables
chains). Therefore, if you are doing it on the ingress interface,
traffic will not have been connection-tracked. Off the top of my head,
it should work on egress though.
Andy
* IPv6 support for route realm/flow attribute?
From: Chris Adams @ 2011-05-31 20:33 UTC (permalink / raw)
To: netdev
It looks like the IPv6 routing table doesn't support the realm/flow
attribute (RTA_FLOW). Is this something that somebody is planning to
add?
I ask because this is very useful for traffic control setups. I'm
importing routes via BGP and setting the realm based on the AS path, so
I can rate-limit outbound traffic based on the destination AS.
--
Chris Adams <cmadams@hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
* Re: [PATCH] uts: Make default hostname configurable, rather than always using "(none)"
From: David Miller @ 2011-05-31 19:46 UTC (permalink / raw)
To: torvalds; +Cc: josh, netdev, serge, akpm, linux-kernel, kel, pkg-sysvinit-devel
In-Reply-To: <BANLkTikgqkd0BckvTrFwAOmJQ0ObP4OOjg@mail.gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 31 May 2011 20:35:37 +0900
> On Tue, May 31, 2011 at 7:38 AM, Josh Triplett <josh@joshtriplett.org> wrote:
>>
>> The "hostname" tool falls back to setting the hostname to "localhost" if
>> /etc/hostname does not exist. Distribution init scripts have the same
>> fallback. However, if userspace never calls sethostname, such as when
>> booting with init=/bin/sh, or otherwise booting a minimal system without
>> the usual init scripts, the default hostname of "(none)" remains,
>> unhelpfully appearing in various places such as prompts
>> ("root@(none):~#") and logs. Furthermore, "(none)" doesn't typically
>> resolve to anything useful.
>
> Ok, I'm fine with this. So Ack as far as I'm concerned.
>
> Does this make most sense through the networking tree, or what?
Linus, you can just apply this directly.
Thanks!
* Re: [Xen-devel] Re: [PATCH] ethtool: ETHTOOL_SFEATURES: remove NETIF_F_COMPAT return
From: Jesse Gross @ 2011-05-31 18:43 UTC (permalink / raw)
To: Michał Mirosław
Cc: dev-yBygre7rU0TnMu66kgdUjQ,
xen-devel-GuqFBffKawuULHF6PoxzQEEOCMrvLtNR@public.gmane.org,
Ian Campbell, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
xen-api-GuqFBffKawuULHF6PoxzQEEOCMrvLtNR, Ben Hutchings,
David Miller
In-Reply-To: <20110529093849.GA5245-CoA6ZxLDdyEEUmgCuDUIdw@public.gmane.org>
2011/5/29 Michał Mirosław <mirq-linux@rere.qmqm.pl>:
> On Sat, May 28, 2011 at 10:31:03AM -0700, Jesse Gross wrote:
>> 2011/5/28 Ian Campbell <Ian.Campbell@citrix.com>:
>> > On Sat, 2011-05-28 at 08:35 +0100, Michał Mirosław wrote:
>> >> On Sat, May 28, 2011 at 12:25:55AM +0100, Ben Hutchings wrote:
>> >> > On Fri, 2011-05-27 at 18:34 +0200, Michał Mirosław wrote:
>> >> > > On Fri, May 27, 2011 at 04:45:50PM +0100, Ben Hutchings wrote:
>> >> > > > On Fri, 2011-05-27 at 17:28 +0200, Michał Mirosław wrote:
>> >> > [...]
>> >> > > > > (note: ETHTOOL_S{SG,...} are not ever going away)
>> >> > > > > - causes NETIF_F_* to be an ABI
>> >> > > > If feature flag numbers are not stable then what is the point of
>> >> > > > /sys/class/net/<name>/features? Also, I'm not aware that features have
>> >> > > > ever been renumbered in the past.
>> >> > > Since no NETIF_F_* were exported earlier, I assume /sys/class/net/*/features
>> >> > > is a debugging aid. What is it used for besides that?
>> >> > xen-api <https://github.com/xen-org/xen-api> uses it in
>> >> > scripts/InterfaceReconfigureVswitch.py. Though it doesn't seem to be
>> >> > used for a particularly good reason...
>> >> Looks like it should use ETHTOOL_GFLAGS instead for netdev_has_vlan_accel().
>> ETHTOOL_GFLAGS didn't expose the vlan acceleration flags until 2.6.37,
>> which is why /sys/class/net was used instead.
>
> https://github.com/xen-org/xen-api/commit/78b8078e6ae3cf48179859bed6350bb326987546
>
> The commit using it was introduced after 2.6.37 kernel was released
Well people do use kernels other than the most recently released one...
> and uses
> undocumented access path to the bits in question. What is the kernel patch
> this commit is referring to?
It's a temporary workaround to deal with the fact that many drivers
that support vlan acceleration do not properly handle vlan packets if
the corresponding group isn't configured on them.
* Re: [ath5k-devel] ath5k regression associating with APs in 2.6.38
From: Felix Fietkau @ 2011-05-31 17:31 UTC (permalink / raw)
To: Nick Kossifidis
Cc: John W. Linville, Jiri Slaby, Luis R. Rodriguez, Bob Copeland,
linux-wireless, ath5k-devel, netdev, linux-kernel
In-Reply-To: <BANLkTi=8ZRUVWn3FLAMtPh=4yY1F0k6i9w@mail.gmail.com>
On 2011-05-17 7:14 PM, Nick Kossifidis wrote:
> 2011/5/17 Seth Forshee<seth.forshee@canonical.com>:
>> On Mon, May 09, 2011 at 09:02:30AM +0200, Seth Forshee wrote:
>>> On Thu, May 05, 2011 at 05:30:42PM +0300, Nick Kossifidis wrote:
>>> > Hmm I don't see any errors from reset/phy code, can you disable
>>> > Network Manager/wpa-supplicant and test connection on an open network
>>> > using iw? It'll give us a better picture...
>>> >
>>> > If iw doesn't return any scan results we are probably hitting a PHY/RF
>>> > error specific to your device (not all vendors follow the reference
>>> > design). Maybe we should follow a blacklist/whitelist approach for
>>> > this feature.
>>>
>>> I got the results back from my tester. He was able to get scan results,
>>> but it took multiple tries and the direct probe failures appear in the
>>> log. He didn't enable ATH5K_DEBUG_RESET this time; let me know if you
>>> need that and I'll request he retest with the extra debug logs enabled.
>>
>> I got some more feedback. Most of the time iw does not get scan results,
>> but even when it does connecting to the AP isn't always successful. The
>> tester did note that he doesn't seem to have any trouble if his machine
>> is within a few feet of his AP. Let me know if you'd like something else
>> tested.
>>
>> I noticed that bugzilla #31922 (ath5k: Decreased throughput in IBSS or
>> 802.11n mode) is also fixed by reverting 8aec7af9. It seems like the
>> synth-only channel changes are resulting in poor connection quality.
>> Maybe that patch needs to be reverted?
>>
>> Thanks,
>> Seth
>>
>>
>
> http://www.kernel.org/pub/linux/kernel/people/mickflemm/01-fast-chan-switch-modparm
Disabling fast channel change also fixed a reproducible crash on
Broadcom based MIPS boards with some cards (AR2413, I think).
- Felix
* Re: Skipping past TCP lost packet in userspace
From: Yuchung Cheng @ 2011-05-31 17:23 UTC (permalink / raw)
To: Josh Lehan; +Cc: netdev, jiyengar
In-Reply-To: <4DE44218.4070306@krellan.com>
On Mon, May 30, 2011 at 6:19 PM, Josh Lehan <linux@krellan.com> wrote:
>
> Hello. I looked, but could not find an answer. Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?
>
> The kernel already will continue to queue up packets, and with TCP SACK,
> the kernel can acknowledge reception of further packets beyond the lost
> packet, allowing the queue to continue growing. However, all these
> queued packets won't be delivered to userspace until the original lost
> packet is received again, after it has been retransmitted.
>
> Is there a way for a userspace program to prevent this needless stall?
This paper may have a solution to your problem
"Minion—an All-Terrain Packet Packhorse to Jump-Start Stalled Internet
Transports"
http://csweb1.fandm.edu/jiyengar/lair/papers/minion-pfldnet2010.pdf
> It would be great if there was an ioctl() or similar call, that would
> tell the kernel that it's OK to leave a gap in the data stream, and
> resume supplying userspace with more data. An obvious application would
> be media streaming, and many high-level media protocols do their own
> block framing anyway, so resynchronization after the data gap would not
> be a problem.
>
> This sounds like something that would be a FAQ, and if so, please point
> me to the answer. Thank you!
>
> Josh Lehan
* Re: [PATCH] ftrace: tracepoint of net_dev_xmit sees freed skb and causes panic
From: Steven Rostedt @ 2011-05-31 16:21 UTC (permalink / raw)
To: Neil Horman
Cc: Koki Sanagi, linux-kernel, netdev, davem, mingo, fweisbec,
mathieu.desnoyers, tglx, kosaki.motohiro, izumi.taku,
kaneshige.kenji
In-Reply-To: <20110531161115.GA3267@hmsreliant.think-freely.org>
On Tue, 2011-05-31 at 12:11 -0400, Neil Horman wrote:
> skb_dst_drop(nskb);
> > >
> > > + skb_len = nskb->len;
> > > rc = ops->ndo_start_xmit(nskb, dev);
> > > - trace_net_dev_xmit(nskb, rc);
> > > + trace_net_dev_xmit(nskb, rc, dev, skb_len);
> >
> > What if you just put the tracepoint before the call to
> > ops->ndo_start_xmit?
> >
> Then you won't know the return code of ndo_start_xmit, which this tracepoint
> records.
Doh! Yeah, I see that now ;)
-- Steve
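
The constraint discussed in this exchange (the trace needs the return
code, yet the packet may be freed by the transmit call) can be sketched in
userspace as follows. struct pkt, struct xmit_event and fake_start_xmit
are hypothetical stand-ins for sk_buff, the tracepoint record and
ndo_start_xmit; this is an illustration of the pattern, not the patch.

```c
#include <assert.h>
#include <stdlib.h>

struct pkt {                    /* hypothetical stand-in for sk_buff */
    unsigned int len;
};

struct xmit_event {             /* what the "tracepoint" records */
    unsigned int len;
    int rc;
};

/* Hypothetical transmit hook: takes ownership and may free the packet. */
static int fake_start_xmit(struct pkt *p)
{
    free(p);
    return 0;                   /* stand-in for NETDEV_TX_OK */
}

static struct xmit_event xmit_and_trace(struct pkt *p)
{
    struct xmit_event ev;

    assert(p != NULL);
    ev.len = p->len;            /* copy the field before ownership moves */
    ev.rc = fake_start_xmit(p); /* p must not be dereferenced after this */
    return ev;                  /* record uses only the saved copy and rc */
}
```

Saving the length first keeps the record after the call, so the return
code is still available, without touching memory that may already be
freed.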
* Re: [PATCH] ftrace: tracepoint of net_dev_xmit sees freed skb and causes panic
From: Neil Horman @ 2011-05-31 16:11 UTC (permalink / raw)
To: Steven Rostedt
Cc: Koki Sanagi, linux-kernel, netdev, davem, mingo, fweisbec,
mathieu.desnoyers, tglx, kosaki.motohiro, izumi.taku,
kaneshige.kenji
In-Reply-To: <1306854791.11899.30.camel@gandalf.stny.rr.com>
On Tue, May 31, 2011 at 11:13:11AM -0400, Steven Rostedt wrote:
> On Tue, 2011-05-31 at 16:48 +0900, Koki Sanagi wrote:
> > Because there is a possibility that skb is kfree_skb()ed and zero cleared
> > after ndo_start_xmit, we should not see the contents of skb like skb->len and
> > skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that
> > and causes panic by NULL pointer dereference.
> > This patch fixes trace_net_dev_xmit not to see the contents of skb directly.
>
>
> >
> > if (likely(!skb->next)) {
> > u32 features;
> > @@ -2139,8 +2140,9 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> > }
> > }
> >
> > + skb_len = skb->len;
> > rc = ops->ndo_start_xmit(skb, dev);
> > - trace_net_dev_xmit(skb, rc);
> > + trace_net_dev_xmit(skb, rc, dev, skb_len);
> > if (rc == NETDEV_TX_OK)
> > txq_trans_update(txq);
> > return rc;
> > @@ -2160,8 +2162,9 @@ gso:
> > if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
> > skb_dst_drop(nskb);
> >
> > + skb_len = nskb->len;
> > rc = ops->ndo_start_xmit(nskb, dev);
> > - trace_net_dev_xmit(nskb, rc);
> > + trace_net_dev_xmit(nskb, rc, dev, skb_len);
>
> What if you just put the tracepoint before the call to
> ops->ndo_start_xmit?
>
Then you won't know the return code of ndo_start_xmit, which this tracepoint
records.
Neil
> -- Steve
>
> > if (unlikely(rc != NETDEV_TX_OK)) {
> > if (rc & ~NETDEV_TX_MASK)
> > goto out_kfree_gso_skb;
>
>
>
* [RFC PATCH 20/35] drivers/net changes for SMBIOS and System Firmware
From: Prarit Bhargava @ 2011-05-31 15:52 UTC (permalink / raw)
To: linux-kernel, netdev; +Cc: Prarit Bhargava
drivers/net changes for SMBIOS and System Firmware
---
drivers/net/skge.c | 11 ++++++-----
drivers/net/via-rhine.c | 18 ++++++++++--------
drivers/net/wireless/wl1251/sdio.c | 1 -
3 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index f4be5c7..1200c53 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -43,7 +43,7 @@
#include <linux/seq_file.h>
#include <linux/mii.h>
#include <linux/slab.h>
-#include <linux/dmi.h>
+#include <linux/sysfw.h>
#include <linux/prefetch.h>
#include <asm/irq.h>
@@ -4089,12 +4089,13 @@ static struct pci_driver skge_driver = {
.driver.pm = SKGE_PM_OPS,
};
-static struct dmi_system_id skge_32bit_dma_boards[] = {
+static struct sysfw_id skge_32bit_dma_boards[] = {
{
.ident = "Gigabyte nForce boards",
.matches = {
- DMI_MATCH(DMI_BOARD_VENDOR, "Gigabyte Technology Co"),
- DMI_MATCH(DMI_BOARD_NAME, "nForce"),
+ SYSFW_MATCH(SYSFW_BOARD_VENDOR,
+ "Gigabyte Technology Co"),
+ SYSFW_MATCH(SYSFW_BOARD_NAME, "nForce"),
},
},
{}
@@ -4102,7 +4103,7 @@ static struct dmi_system_id skge_32bit_dma_boards[] = {
static int __init skge_init_module(void)
{
- if (dmi_check_system(skge_32bit_dma_boards))
+ if (sysfw_callback(skge_32bit_dma_boards))
only_32bit_dma = 1;
skge_debug_init();
return pci_register_driver(&skge_driver);
diff --git a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c
index 7f23ab9..81725d5 100644
--- a/drivers/net/via-rhine.c
+++ b/drivers/net/via-rhine.c
@@ -110,7 +110,7 @@ static const int multicast_filter_limit = 32;
#include <asm/io.h>
#include <asm/irq.h>
#include <asm/uaccess.h>
-#include <linux/dmi.h>
+#include <linux/sysfw.h>
/* These identify the driver base version and may not be removed. */
static const char version[] __devinitconst =
@@ -2294,22 +2294,24 @@ static struct pci_driver rhine_driver = {
.shutdown = rhine_shutdown,
};
-static struct dmi_system_id __initdata rhine_dmi_table[] = {
+static struct sysfw_id __initdata rhine_id_table[] = {
{
.ident = "EPIA-M",
.matches = {
- DMI_MATCH(DMI_BIOS_VENDOR, "Award Software International, Inc."),
- DMI_MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+ SYSFW_MATCH(SYSFW_BIOS_VENDOR,
+ "Award Software International, Inc."),
+ SYSFW_MATCH(SYSFW_BIOS_VERSION, "6.00 PG"),
},
},
{
.ident = "KV7",
.matches = {
- DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies, LTD"),
- DMI_MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+ SYSFW_MATCH(SYSFW_BIOS_VENDOR,
+ "Phoenix Technologies, LTD"),
+ SYSFW_MATCH(SYSFW_BIOS_VERSION, "6.00 PG"),
},
},
- { NULL }
+ {}
};
static int __init rhine_init(void)
@@ -2318,7 +2320,7 @@ static int __init rhine_init(void)
#ifdef MODULE
pr_info("%s\n", version);
#endif
- if (dmi_check_system(rhine_dmi_table)) {
+ if (sysfw_callback(rhine_id_table)) {
/* these BIOSes fail at PXE boot if chip is in D3 */
avoid_D3 = 1;
pr_warn("Broken BIOS detected, avoid_D3 enabled\n");
diff --git a/drivers/net/wireless/wl1251/sdio.c b/drivers/net/wireless/wl1251/sdio.c
index f51a024..68e36d7 100644
--- a/drivers/net/wireless/wl1251/sdio.c
+++ b/drivers/net/wireless/wl1251/sdio.c
@@ -20,7 +20,6 @@
* Copyright (C) 2009 Bob Copeland (me@bobcopeland.com)
*/
#include <linux/module.h>
-#include <linux/mod_devicetable.h>
#include <linux/mmc/sdio_func.h>
#include <linux/mmc/sdio_ids.h>
#include <linux/platform_device.h>
--
1.7.5.1
* Re: [PATCH] tcp: Expose the initial RTO via a new sysctl.
From: Hagen Paul Pfeifer @ 2011-05-31 15:43 UTC (permalink / raw)
To: tsuna
Cc: H.K. Jerry Chu, David Miller, kuznet, pekkas, jmorris, yoshfuji,
kaber, netdev, linux-kernel
In-Reply-To: <BANLkTikuW7VF8AUPFYHYzGe7mXp5H_emjA@mail.gmail.com>
On Tue, 31 May 2011 08:28:18 -0700, tsuna <tsunanet@gmail.com> wrote:
> Sorry I meant a knob such as /proc/sys/net/ipv4/tcp_initrto.
That's the same! ;-)
>> The initRTO is the ideal candidate for a
>> per route knob. And happily you will solve 2) with the per route thing
>> too!
>
> You still need a knob for the default system-wide value, don't you?
Yes, try to re-read the emails. Sysctl is a no-go, with a per route
interface you have the ability to tune the values. Talk with Jerry once
again - he wrote that at Google they already have a patch for this. And
with a per route knob you can select an even smaller value for your local
network (e.g. datacenter) and a larger value for all other routes. It makes
sense to provide a knob for this on a route basis, not on a global sysctl
basis.
But once again: talk with Jerry - he has the expert knowledge!
Hagen
* Re: [PATCH] tcp: Expose the initial RTO via a new sysctl.
From: tsuna @ 2011-05-31 15:28 UTC (permalink / raw)
To: Hagen Paul Pfeifer
Cc: H.K. Jerry Chu, David Miller, kuznet, pekkas, jmorris, yoshfuji,
kaber, netdev, linux-kernel
In-Reply-To: <da8dbb76f0cb3762227d8d0a99eaff21@localhost>
On Tue, May 31, 2011 at 8:25 AM, Hagen Paul Pfeifer <hagen@jauu.net> wrote:
> Skip sysctl, it is deprecated.
Sorry I meant a knob such as /proc/sys/net/ipv4/tcp_initrto.
> The initRTO is the ideal candidate for a
> per route knob. And happily you will solve 2) with the per route thing too!
You still need a knob for the default system-wide value, don't you?
--
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
* Re: [PATCH] tcp: Expose the initial RTO via a new sysctl.
From: Hagen Paul Pfeifer @ 2011-05-31 15:25 UTC (permalink / raw)
To: tsuna
Cc: H.K. Jerry Chu, David Miller, kuznet, pekkas, jmorris, yoshfuji,
kaber, netdev, linux-kernel
In-Reply-To: <BANLkTimqtjskX5q2ovTp_c05MdqBDOWsRg@mail.gmail.com>
On Tue, 31 May 2011 07:48:09 -0700, tsuna <tsunanet@gmail.com> wrote:
> I talked to Jerry and he's agreed to share some patches that Google
> has been using internally for years.
Great!
> Personally what I think would be ideal would be:
> 1. A sysctl knob for initRTO, to allow people to adjust this
> appropriately for their environment.
> 2. Apply the srtt / rttvar seen on previous connections to new
> connections.
>
> Does that sound reasonable?
>
> For 2), I'm not sure how the details would work yet, I believe the
> kernel already has what's necessary to remember these things on a per
> peer basis, but it would be nice if I could specify things like "for
> 10.x.0.0/16 (local datacenter) use this aggressive setting, for
> 10.0.0.0/8 (my internal backend network) use that, for everything else
> (Internets etc.) use the default".
Skip sysctl, it is deprecated. The initRTO is the ideal candidate for a
per route knob. And happily you will solve 2) with the per route thing too!
;-)
Search the web, you will find some patches where you can see how to extend
the per route system - including iproute2.
Hagen
* Re: [PATCH] ftrace: tracepoint of net_dev_xmit sees freed skb and causes panic
From: Steven Rostedt @ 2011-05-31 15:14 UTC (permalink / raw)
To: Koki Sanagi
Cc: linux-kernel, netdev, davem, nhorman, mingo, fweisbec,
mathieu.desnoyers, tglx, kosaki.motohiro, izumi.taku,
kaneshige.kenji
In-Reply-To: <4DE49D52.709@jp.fujitsu.com>
Note, the subject should not be "ftrace:", but "net:" or maybe even
"net/tracing", as this really has nothing to do with ftrace code. The
tracepoints are more generic than ftrace.
-- Steve
^ permalink raw reply
* Re: [PATCH] ftrace: tracepoint of net_dev_xmit sees freed skb and causes panic
From: Steven Rostedt @ 2011-05-31 15:13 UTC (permalink / raw)
To: Koki Sanagi
Cc: linux-kernel, netdev, davem, nhorman, mingo, fweisbec,
mathieu.desnoyers, tglx, kosaki.motohiro, izumi.taku,
kaneshige.kenji
In-Reply-To: <4DE49D52.709@jp.fujitsu.com>
On Tue, 2011-05-31 at 16:48 +0900, Koki Sanagi wrote:
> Because there is a possibility that skb is kfree_skb()ed and zero cleared
> after ndo_start_xmit, we should not see the contents of skb like skb->len and
> skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that
> and causes panic by NULL pointer dereference.
> This patch fixes trace_net_dev_xmit not to see the contents of skb directly.
>
> if (likely(!skb->next)) {
> u32 features;
> @@ -2139,8 +2140,9 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> }
> }
>
> + skb_len = skb->len;
> rc = ops->ndo_start_xmit(skb, dev);
> - trace_net_dev_xmit(skb, rc);
> + trace_net_dev_xmit(skb, rc, dev, skb_len);
> if (rc == NETDEV_TX_OK)
> txq_trans_update(txq);
> return rc;
> @@ -2160,8 +2162,9 @@ gso:
> if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
> skb_dst_drop(nskb);
>
> + skb_len = nskb->len;
> rc = ops->ndo_start_xmit(nskb, dev);
> - trace_net_dev_xmit(nskb, rc);
> + trace_net_dev_xmit(nskb, rc, dev, skb_len);
What if you just put the tracepoint before the call to
ops->ndo_start_xmit?
-- Steve
> if (unlikely(rc != NETDEV_TX_OK)) {
> if (rc & ~NETDEV_TX_MASK)
> goto out_kfree_gso_skb;
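For readers following the thread, here is a minimal user-space sketch of the pattern the quoted patch uses: copy the fields the tracepoint needs before the transmit call, because the callee may free the buffer. All names here (struct pkt, start_xmit, trace_xmit, xmit_and_trace) are hypothetical stand-ins, not kernel APIs. Moving the tracepoint before the call, as suggested above, would also avoid the stale access, but the return code would not yet be known at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff; only the length matters here. */
struct pkt {
    size_t len;
};

/* Last values seen by the (hypothetical) tracepoint. */
size_t traced_len;
int traced_rc;

/* Stand-in for the tracepoint: takes plain values, never the packet. */
static void trace_xmit(size_t len, int rc)
{
    traced_len = len;
    traced_rc = rc;
}

/* Stand-in for ndo_start_xmit: the driver may free the packet
 * before returning, so the caller must not touch it afterwards. */
static int start_xmit(struct pkt *p)
{
    free(p);
    return 0;
}

int xmit_and_trace(struct pkt *p)
{
    size_t len;
    int rc;

    assert(p != NULL);
    len = p->len;          /* snapshot while p is still valid */
    rc = start_xmit(p);    /* p may be freed from here on */
    trace_xmit(len, rc);   /* uses only the snapshot and rc */
    return rc;
}
```

A caller would allocate a struct pkt with malloc(), set len, and hand it to xmit_and_trace(); after the call the pointer is no longer valid.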
^ permalink raw reply
* Re: [PATCH] tcp: Expose the initial RTO via a new sysctl.
From: tsuna @ 2011-05-31 14:48 UTC (permalink / raw)
To: H.K. Jerry Chu
Cc: Hagen Paul Pfeifer, David Miller, kuznet, pekkas, jmorris,
yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <BANLkTi=p1V1x+y=-cH-Q7bvueu_4_D1ywQ@mail.gmail.com>
On Fri, May 20, 2011 at 5:06 PM, H.K. Jerry Chu <hkjerry.chu@gmail.com> wrote:
> Yep, that's why we've had a knob for this for years.
I was traveling last week so sorry for not replying earlier to various
comments people made.
I talked to Jerry and he's agreed to share some patches that Google
has been using internally for years. I started this work because
after leaving Google and taking these changes for granted, I was
surprised to find that they weren't actually part of the mainline
Linux kernel.
It seems that David is willing to accept a change that will lower the
initRTO to 1s (compile-time constant), with a fallback to 3s
(compile-time constant), as per the draft rfc2988bis. Others are
legitimately worried about the impact this would cause in environments
where RTT is typically (or always) in the 1-3s range. Some would like
to see this as a per-destination thing.
Personally what I think would be ideal would be:
1. A sysctl knob for initRTO, to allow people to adjust this
appropriately for their environment.
2. Apply the srtt / rttvar seen on previous connections to new connections.
Does that sound reasonable?
For 2), I'm not sure how the details would work yet, I believe the
kernel already has what's necessary to remember these things on a per
peer basis, but it would be nice if I could specify things like "for
10.x.0.0/16 (local datacenter) use this aggressive setting, for
10.0.0.0/8 (my internal backend network) use that, for everything else
(Internets etc.) use the default".
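To make the per-prefix idea concrete, here is a rough user-space sketch of a longest-prefix-match lookup that picks an initial RTO for a destination. It is only an illustration under stated assumptions: struct initrto_rule and initrto_lookup() are hypothetical names, not existing kernel or iproute2 interfaces, and addresses are IPv4 in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy entry; not a kernel structure. */
struct initrto_rule {
    uint32_t prefix;      /* network address, host byte order */
    uint8_t  prefix_len;  /* 0..32 */
    uint32_t initrto_ms;  /* initial RTO for matching destinations */
};

/* Build a netmask from a prefix length; a length of 0 matches everything. */
static uint32_t prefix_mask(uint8_t len)
{
    assert(len <= 32);
    return len == 0 ? 0 : 0xFFFFFFFFu << (32 - len);
}

/* Return the initial RTO of the most specific matching rule,
 * or default_ms when no rule matches. */
uint32_t initrto_lookup(const struct initrto_rule *rules, size_t n,
                        uint32_t dst, uint32_t default_ms)
{
    uint32_t best_ms = default_ms;
    int best_len = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t mask = prefix_mask(rules[i].prefix_len);

        if ((dst & mask) == (rules[i].prefix & mask) &&
            (int)rules[i].prefix_len > best_len) {
            best_len = rules[i].prefix_len;
            best_ms = rules[i].initrto_ms;
        }
    }
    return best_ms;
}
```

With made-up rules for 10.1.0.0/16 at 200 ms and 10.0.0.0/8 at 1000 ms, plus a 3000 ms default, a destination of 10.1.2.3 would get the /16 value, 10.2.3.4 the /8 value, and anything else the default. The prefixes and timings are placeholders chosen for the example, not recommendations.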
--
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
^ permalink raw reply
* Re: [PATCH] uts: Make default hostname configurable, rather than always using "(none)"
From: Linus Torvalds @ 2011-05-31 11:35 UTC (permalink / raw)
To: Josh Triplett
Cc: David Miller, netdev, Serge E. Hallyn, Andrew Morton,
linux-kernel, Kel Modderman, pkg-sysvinit-devel
In-Reply-To: <20110530223847.GA29245@leaf>
On Tue, May 31, 2011 at 7:38 AM, Josh Triplett <josh@joshtriplett.org> wrote:
>
> The "hostname" tool falls back to setting the hostname to "localhost" if
> /etc/hostname does not exist. Distribution init scripts have the same
> fallback. However, if userspace never calls sethostname, such as when
> booting with init=/bin/sh, or otherwise booting a minimal system without
> the usual init scripts, the default hostname of "(none)" remains,
> unhelpfully appearing in various places such as prompts
> ("root@(none):~#") and logs. Furthermore, "(none)" doesn't typically
> resolve to anything useful.
Ok, I'm fine with this. So Ack as far as I'm concerned.
Does this make most sense through the networking tree, or what?
Linus
^ permalink raw reply
* Re: Skipping past TCP lost packet in userspace
From: Neil Horman @ 2011-05-31 11:12 UTC (permalink / raw)
To: Josh Lehan; +Cc: netdev
In-Reply-To: <4DE44218.4070306@krellan.com>
On Mon, May 30, 2011 at 06:19:20PM -0700, Josh Lehan wrote:
> Hello. I looked, but could not find an answer. Is there already an
> ioctl() or something like that in Linux, that would allow a userspace
> TCP socket to skip past a lost packet?
>
> The kernel already will continue to queue up packets, and with TCP SACK,
> the kernel can acknowledge reception of further packets beyond the lost
> packet, allowing the queue to continue growing. However, all these
> queued packets won't be delivered to userspace until the original lost
> packet is received again, after it has been retransmitted.
>
> Is there a way for a userspace program to prevent this needless stall?
> It would be great if there was an ioctl() or similar call, that would
> tell the kernel that it's OK to leave a gap in the data stream, and
> resume supplying userspace with more data. An obvious application would
> be media streaming, and many high-level media protocols do their own
> block framing anyway, so resynchronization after the data gap would not
> be a problem.
>
> This sounds like something that would be a FAQ, and if so, please point
> me to the answer. Thank you!
>
No, TCP doesn't and won't do that, by design.
If you want to allow frames to come in to an application as they arrive at the
system regardless of prior loss, use UDP.
If you still want ordering and reliability, look at SCTP.
Neil
> Josh Lehan
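As an illustration of the UDP route, here is a small sketch of the bookkeeping an application might do when it frames its own data over datagrams: each frame carries a sequence number, frames are delivered as they arrive, and gaps are counted rather than waited for. The names (struct gap_tracker, gap_tracker_accept) and the 32-bit sequence field are assumptions made for this example, not part of any existing protocol or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver state for application-level framing. */
struct gap_tracker {
    uint32_t next_seq;  /* next sequence number expected */
    uint32_t lost;      /* frames skipped so far */
};

/* Decide what to do with an incoming frame.
 * Returns 1 if the frame should be delivered to the decoder,
 * 0 if it is a duplicate or arrived after we already moved past it. */
int gap_tracker_accept(struct gap_tracker *t, uint32_t seq)
{
    int32_t delta;

    assert(t != NULL);
    /* Unsigned subtraction wraps; the cast assumes two's complement,
     * so the sign says whether seq is ahead of or behind next_seq. */
    delta = (int32_t)(seq - t->next_seq);
    if (delta < 0)
        return 0;                /* stale frame: drop it */

    t->lost += (uint32_t)delta;  /* frames skipped over, if any */
    t->next_seq = seq + 1;
    return 1;
}
```

A receive loop would read each datagram, extract the sequence number, and call gap_tracker_accept(); a frame that shows up after its slot was skipped is simply dropped, which is the trade-off being asked about.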
^ permalink raw reply
* Re: Kernel crash after using new Intel NIC (igb)
From: Ingo Molnar @ 2011-05-31 10:50 UTC (permalink / raw)
To: Arun Sharma
Cc: Eric Dumazet, David Miller, Maximilian Engelhardt, linux-kernel,
netdev, StuStaNet Vorstand, Yann Dupont, Denys Fedoryshchenko,
Thomas Gleixner
In-Reply-To: <4DE3E32E.5000302@fb.com>
* Arun Sharma <asharma@fb.com> wrote:
> On 5/29/11 5:33 AM, Ingo Molnar wrote:
> >
> >* Eric Dumazet<eric.dumazet@gmail.com> wrote:
> >
> >>I asked Arun if he wanted to make this himself, because initial
> >>idea was coming from him, not because I did not want to make it ;)
> >
> >Hey, fair enough and sorry about the fuss! :-)
>
> Sounds like there is general consensus that such a cleanup would be
> good. I'll try to post a patch that does this cleanup for all archs
> in the next couple of days - but I won't have a way of testing it
> on anything other than x86_64.
-mm is a good place for this kind of many-arch facility cleanup or
extension, so please Cc: Andrew if you think the patch is ready to be
tested at that level.
Thanks,
Ingo
^ permalink raw reply
* Re: netxen_nic: unregister_netdevice error
From: Jiri Pirko @ 2011-05-31 10:12 UTC (permalink / raw)
To: Eric Dumazet; +Cc: WeipingPan, open list:NETWORKING [GENERAL]
In-Reply-To: <1306835326.2809.1.camel@edumazet-laptop>
Tue, May 31, 2011 at 11:48:46AM CEST, eric.dumazet@gmail.com wrote:
>Le mardi 31 mai 2011 à 17:21 +0800, WeipingPan a écrit :
>> On 05/31/2011 02:15 PM, WeipingPan wrote:
>> > hi,
>> >
>> > When testing bonding broadcast mode I hit a problem:
>> > modprobe -r bonding fails after I ping the broadcast address.
>> >
>> > The problem only shows up when I use the netxen_nic driver.
>> > With the ixgbe and bnx2x drivers, no problem occurs.
>> >
>> > I use RHEL 6.2, kernel 2.6.32-131.0.15.el6.i686,
>> > Can anybody confirm this bug using 2.6.39 or upstream ?
>> >
>> > many thanks
>> > Weiping Pan
>> >
>> I got a machine to test upstream
>> (55922c9d1b84b89cb946c777fddccb3247e7df2c);
>> the problem disappears there.
>>
>> many thanks
>> Weiping Pan
>
>Might be solved by commit fc75fc8339e772716744 or
>
>commit 332dd96f7ac15e937088fe11f15c (net/dst: dst_dev_event() called
>after other notifiers) and commit ef885afbf8a37689 (net: use
>rcu_barrier() in rollback_registered_many)
Thanks for the suggestions, Eric. We'll try those.
^ permalink raw reply