* Re: ebtables on a stick
From: David Lamparter @ 2011-11-28 15:30 UTC (permalink / raw)
To: Greg Scott; +Cc: David Lamparter, netdev
In-Reply-To: <925A849792280C4E80C5461017A4B8A2A0488A@mail733.InfraSupportEtc.com>
On Mon, Nov 28, 2011 at 08:54:09AM -0600, Greg Scott wrote:
> > This doesn't answer your question, but your use case is better solved
> > with proxy arp.
>
> ...But maybe there's a way to make my NIC only answer ARPs for certain IP Addresses I care about? That would nicely solve the problem. If it works.
"ip neigh add proxy 1.2.3.4 dev eth0". Please look at the example i gave
you.
> In my earlier near-disaster, I did this:
>
> echo 1 > /proc/sys/net/ipv4/conf/${INET_IFACE}/proxy_arp
NO. You do not touch this switch. Be afraid of this switch.
The instructions i gave you are complete. This switch is not involved.
> to turn on proxy arp. I wonder if that ip neighbor stuff does it selectively?
We already had this discussion in July.
"ip neigh add proxy" is independent of /proc/.../proxy_arp.
-David
^ permalink raw reply
* Re: [ovs-dev] [GIT PULL v2] Open vSwitch
From: Jamal Hadi Salim @ 2011-11-28 15:32 UTC (permalink / raw)
To: Martin Casado; +Cc: Herbert Xu, dev, netdev, David Miller
In-Reply-To: <4ED3A85A.1030003@nicira.com>
On Mon, 2011-11-28 at 07:27 -0800, Martin Casado wrote:
> This is a common misunderstanding about OpenFlow. It does not require
> the first packet of each flow to go to the controller.
I am reading this to mean that the switch CPU will resolve things?
Typically those tend to be tiny cpus.
> In fact, no
> production system I'm aware of does this. Generally OpenFlow-based
> solutions targeted at large environments (e.g. data center, or WAN)
> send only traditional control traffic to the controller (e.g. BGP or
> OSPF), or none at all.
Even OSPF or BGP would be problematic imo if the architecture doesnt
allow prioritization of some form towards the controller.
cheers,
jamal
^ permalink raw reply
* Re: [PATCH] sctp: integer overflow in sctp_auth_create_key()
From: Vladislav Yasevich @ 2011-11-28 15:45 UTC (permalink / raw)
To: Xi Wang
Cc: linux-kernel, Sridhar Samudrala, David S. Miller, linux-sctp,
netdev, security
In-Reply-To: <426D7BA8-ECD0-44D6-A09F-2033F0C825FC@gmail.com>
On 11/22/2011 08:25 PM, Xi Wang wrote:
> The previous commit 30c2235c is incomplete and cannot prevent integer
> overflows. For example, when key_len is 0x80000000 (INT_MAX + 1), the
> left-hand side of the check, (INT_MAX - key_len), which is unsigned,
> becomes 0xffffffff (UINT_MAX) and bypasses the check.
>
> Signed-off-by: Xi Wang <xi.wang@gmail.com>
> ---
> net/sctp/auth.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
> index 865e68f..989e0fd 100644
> --- a/net/sctp/auth.c
> +++ b/net/sctp/auth.c
> @@ -82,7 +82,7 @@ static struct sctp_auth_bytes *sctp_auth_create_key(__u32 key_len, gfp_t gfp)
> struct sctp_auth_bytes *key;
>
> /* Verify that we are not going to overflow INT_MAX */
> - if ((INT_MAX - key_len) < sizeof(struct sctp_auth_bytes))
> + if (key_len > INT_MAX - sizeof(struct sctp_auth_bytes))
> return NULL;
>
> /* Allocate the shared key */
Hmm. Yes, this is a more correct check.
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
However, I don't think this is a security issue. As I've written before, this function is
called from 2 places:
1) setsockopt() code path
2) sctp_auth_asoc_set_secret() code path
In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user api. As such, The integer promotion will
not impact anything and the malloc() will never overflow.
In case (2), sca_keylength is computed based on the key the user provided
(MAX_USHORT) and the combination of protocol negotiated data where that
combination has a max size of 3 * MAX_USHORT (see sctp_auth_make_key_vector()).
So, even this case, our maximum key length can be 4* MAX_USHORT which still
will always be below MAX_INT and will not overflow.
So, I don't think there is big security consideration here, just a bad
check that just happens to always work.
-vlad
^ permalink raw reply
* Re: [ovs-dev] [GIT PULL v2] Open vSwitch
From: Martin Casado @ 2011-11-28 15:50 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: Herbert Xu, dev, netdev, David Miller
In-Reply-To: <1322494356.7338.75.camel@mojatatu>
>> This is a common misunderstanding about OpenFlow. It does not require
>> the first packet of each flow to go to the controller.
> I am reading this to mean that the switch CPU will resolve things?
> Typically those tend to be tiny cpus.
>
No, no datapath traffic leaves the switching chip. Generally
controllers calculate full forwarding tables using wildcarded rules and
use low priority rules to handle exception traffic (still keeping it on
the datapath).
>
>> In fact, no
>> production system I'm aware of does this. Generally OpenFlow-based
>> solutions targeted at large environments (e.g. data center, or WAN)
>> send only traditional control traffic to the controller (e.g. BGP or
>> OSPF), or none at all.
> Even OSPF or BGP would be problematic imo if the architecture doesnt
> allow prioritization of some form towards the controller.
Indeed. Controllers generally implement such prioritization by using
OpenFlow forwarding rules that match on control traffic and explicitly
set the controller as the destination. This allows the application of
any QoS primitives within OpenFlow to control traffic.
best,
.martin
>
> cheers,
> jamal
>
>
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Casado
Nicira Networks, Inc.
www.nicira.com
cell: 650-776-1457
~~~~~~~~~~~~~~~~~~~~~~~~~~~
^ permalink raw reply
* Re: [GIT PULL v2] Open vSwitch
From: Ben Pfaff @ 2011-11-28 16:01 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
Herbert Xu, David Miller
In-Reply-To: <1322488954.7338.66.camel@mojatatu>
On Mon, Nov 28, 2011 at 09:02:34AM -0500, Jamal Hadi Salim wrote:
> On Mon, 2011-11-28 at 21:04 +0800, Herbert Xu wrote:
> > However, what's more worrying for me right now is the gaping
> > DoS opportunities that exist in the patch as is.
> >
> > In particular, the whole design principle of punting all new
> > flows to user-space is an excellent way of attacking the system.
>
> Indeed this is an issue with openflow in general.
> The general solution is to rate limit how much goes to the controller
> but even that is insufficient.
Regarding OpenFlow rate limiting, in addition to Martin's response, Open
vSwitch has implemented controller rate limiting since day one. It is
documented in ovs-vswitchd.conf.db(5):
OpenFlow Rate Limiting:
controller_rate_limit: optional integer, at least 100
The maximum rate at which packets in unknown flows will be for-
warded to the OpenFlow controller, in packets per second. This
feature prevents a single bridge from overwhelming the con-
troller. If not specified, the default is implementation-spe-
cific.
In addition, when a high rate triggers rate-limiting, Open
vSwitch queues controller packets for each port and transmits
them to the controller at the configured rate. The number of
queued packets is limited by the controller_burst_limit value.
The packet queue is shared fairly among the ports on a bridge.
Open vSwitch maintains two such packet rate-limiters per bridge.
One of these applies to packets sent up to the controller
because they do not correspond to any flow. The other applies
to packets sent up to the controller by request through flow
actions. When both rate-limiters are filled with packets, the
actual rate that packets are sent to the controller is up to
twice the specified rate.
controller_burst_limit: optional integer, at least 25
In conjunction with controller_rate_limit, the maximum number of
unused packet credits that the bridge will allow to accumulate,
in packets. If not specified, the default is implementation-
specific.
^ permalink raw reply
* Re: [GIT PULL v2] Open vSwitch
From: Ben Pfaff @ 2011-11-28 16:04 UTC (permalink / raw)
To: Fischer, Anna
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Miller,
Herbert Xu, jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org
In-Reply-To: <0199E0D51A61344794750DC57738F58E7586A74137-1IhDuF6AwYvulpxXP3Mx0dVKv6DIAtwysh7EHKopUjU@public.gmane.org>
On Mon, Nov 28, 2011 at 01:54:16PM +0000, Fischer, Anna wrote:
> Yes, I mentioned this months ago, and I am surprised this critical
> issue has never been picked up on and addressed. With a flaw like this
> there is no chance this component can be used in any serious
> virtualization deployment where different customers share the same
> physical server.
It was addressed, by introducing additional queues and using them in
userspace.
This is documented in NEWS:
- Flow setups are now processed in a round-robin manner across ports
to prevent any single client from monopolizing the CPU and conducting
a denial of service attack.
^ permalink raw reply
* Re: [PATCH net-next] tcp: tcp_sendmsg() wrong access to sk_route_caps
From: Eric Dumazet @ 2011-11-28 16:04 UTC (permalink / raw)
To: Joe Perches; +Cc: David Miller, netdev, Michał Mirosław
In-Reply-To: <1322477197.2024.4.camel@Joe-Laptop>
Le lundi 28 novembre 2011 à 02:46 -0800, Joe Perches a écrit :
> On Mon, 2011-11-28 at 11:27 +0100, Eric Dumazet wrote:
> > Now sk_route_caps is u64, its dangerous to use an integer to store
> > result of an AND operator. It wont work if NETIF_F_SG is moved on the
> > upper part of u64.
>
> trivial comment below.
Well, not trivial at all ;)
>
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> []
> > @@ -917,9 +917,9 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
> > struct iovec *iov;
> > struct tcp_sock *tp = tcp_sk(sk);
> > struct sk_buff *skb;
> > - int iovlen, flags;
> > + int iovlen, flags, err, copied;
> > int mss_now, size_goal;
> > - int sg, err, copied;
> > + bool sg;
> > long timeo;
> >
> > lock_sock(sk);
> > @@ -946,7 +946,7 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
> > if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> > goto out_err;
> >
> > - sg = sk->sk_route_caps & NETIF_F_SG;
> > + sg = !!(sk->sk_route_caps & NETIF_F_SG);
>
> As sg is now bool, using !! is unnecessary.
>
> A commit was done recently to remove one.
> 3ad9b358e03fd9dbf6705721490c811b666b0fe2
>
Hmm... I find it dangerous and error prone. Obviously not at the time we
commit such changes, but later, because a future reader might be fooled.
Using !!(expr) is pretty clear about the potential problem, and
generates no extra code if a bool is used for the target.
^ permalink raw reply
* Re: [PATCH net-next v2 2/4] can: cc770: add legacy ISA bus driver for the CC770 and AN82527
From: Oliver Hartkopp @ 2011-11-28 16:06 UTC (permalink / raw)
To: info-PyqsHJVlJN8AvxtiuMwx3w, Wolfgang Grandegger
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
socketcan-users-0fE9KPoRgkgATYTw5x5z8w,
linux-can-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4ED37885.8080909-PyqsHJVlJN8AvxtiuMwx3w@public.gmane.org>
>> ... using a reasonable default for bcr of 0x40. But you may need to
>> provide better values for cir, bcr and cor.
>>
>
> OMG, sorry that I was bothering You but You are absolutely right and with Your
> statement You brought be back on track and You made my day. Thank You !!!!!!
>
> I just took the values for cpu and bus I was using in my lincan driver and funny
> enough now it's working absolutely perfect.
>
> Module: modprobe cc770_isa irq=0xa port=0x384 indirect=1 cir=0x61 bcr=0x4A
>
> Was just sending 50000 telegrams, perfect and no problem at all.
>
> Thanks a lot again!!!!
>
This looks like a Tested-by :-)
Thanks Wolfgang (who ever feels addressed)
^ permalink raw reply
* Re: [PATCH net-next 1/4] net: introduce skb_flow_dissect()
From: Dimitris Michailidis @ 2011-11-28 16:25 UTC (permalink / raw)
To: Eric Dumazet
Cc: dev-yBygre7rU0TnMu66kgdUjQ, chrisw-H+wXaHxf7aLQT0dZR+AlfA,
netdev-u79uwXL29TY76Z2rM5mHXA, Florian Westphal,
jhs-jkUAjuhPggJWk0Htik3J/w,
john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w,
herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw, Stephen Hemminger,
Dan Siemon, David Miller
In-Reply-To: <1322493738.2292.69.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On 11/28/2011 07:22 AM, Eric Dumazet wrote:
> We use at least two flow dissectors in network stack, with known
> limitations and code duplication.
>
> Introduce skb_flow_dissect() to factorize this, highly inspired from
> existing dissector from __skb_get_rxhash()
>
> Note : We extensively use skb_header_pointer(), this permits us to not
> touch skb at all.
>
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> include/net/flow_keys.h | 15 ++++
> net/core/Makefile | 2
> net/core/flow_dissector.c | 134 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 150 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/flow_keys.h b/include/net/flow_keys.h
> new file mode 100644
> index 0000000..7a61e21
> --- /dev/null
> +++ b/include/net/flow_keys.h
> @@ -0,0 +1,15 @@
> +#ifndef _NET_FLOW_KEYS_H
> +#define _NET_FLOW_KEYS_H
> +
> +struct flow_keys {
> + __be32 src;
> + __be32 dst;
> + union {
> + __be32 ports;
> + __be16 port16[2];
> + };
> + u8 ip_proto;
> +};
> +
> +extern bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys *flow);
> +#endif
> diff --git a/net/core/Makefile b/net/core/Makefile
> index 3606d40..c4ecc86 100644
> --- a/net/core/Makefile
> +++ b/net/core/Makefile
> @@ -3,7 +3,7 @@
> #
>
> obj-y := sock.o request_sock.o skbuff.o iovec.o datagram.o stream.o scm.o \
> - gen_stats.o gen_estimator.o net_namespace.o secure_seq.o
> + gen_stats.o gen_estimator.o net_namespace.o secure_seq.o flow_dissector.o
>
> obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
>
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> new file mode 100644
> index 0000000..d0e085b
> --- /dev/null
> +++ b/net/core/flow_dissector.c
> @@ -0,0 +1,134 @@
> +#include <linux/skbuff.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include <linux/if_vlan.h>
> +#include <net/ip.h>
> +#include <linux/if_tunnel.h>
> +#include <linux/if_pppox.h>
> +#include <linux/ppp_defs.h>
> +#include <net/flow_keys.h>
> +
> +
> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys *flow)
> +{
> + int poff, nhoff = skb_network_offset(skb);
> + u8 ip_proto;
> + u16 proto = skb->protocol;
__be16 instead of u16 for proto?
^ permalink raw reply
* Daughterboard Jetway JAD3RTLANG with three RTL-8110SC/8169SC Gigabit Ethernet is curious
From: Markus Feldmann @ 2011-11-28 16:35 UTC (permalink / raw)
To: netdev
Hi All,
i have a mini-ITX PC (Motherboard Jetway JNF92-270-LF), whit 3 ethernet
devices (RTL-8110SC/8169SC Gigabit Ethernet) on one additional
daughterboard (Jetway JAD3RTLANG).
Here is my <hwinfo --netcard> output:
http://pastebin.com/raw.php?i=Khy1NX61
and dmesg:
http://pastebin.com/raw.php?i=Ziq1mizy
I am using this mini-ITX PC as server/router with a debian
squeeze/stable system.
My problem is that only one of this three ethernet devices (from the
daughterboard not the internal one from motherboard) at once is working
correctly. Sometimes i have only one client to my server connected, but
only one specific ehernet port is working (mostly the physically upper
one). I didnt found any regularity yet.
Any hints? is there a specific person whom i schould contact?
regards markus
^ permalink raw reply
* Re: sky2: hw csum failure
From: Stephen Hemminger @ 2011-11-28 16:50 UTC (permalink / raw)
To: Vincent Blut; +Cc: netdev, Debian Bug Tracking System
In-Reply-To: <4ED37A2C.5040608@free.fr>
On Mon, 28 Nov 2011 12:10:20 +0000
Vincent Blut <vincent.debian@free.fr> wrote:
> Hi,
>
> [reference: http://bugs.debian.org/609994]
>
> I have a Marvell ethernet controller which presents some failures when
> 'rx checksumming' is enabled,
> here is the model:
>
> $ lspci -vvs 03:00.0
> 03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E
> Gigabit Ethernet Controller (rev 15)
> Subsystem: Micro-Star International Co., Ltd. Marvell 88E8053
> Gigabit Ethernet Controller (MSI)
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 32 bytes
> Interrupt: pin A routed to IRQ 44
> Region 0: Memory at fdbfc000 (64-bit, non-prefetchable) [size=16K]
> Region 2: I/O ports at 7c00 [size=256]
> [virtual] Expansion ROM at fda00000 [disabled] [size=128K]
> Capabilities: <access denied>
> Kernel driver in use: sky2
>
> At first I thought it was due to the MTU size, so I tested different
> values but unfortunately without positive effect.
> Overall this issue appears randomly when the incoming traffic is high. I
> tested 2.6.32, 3.1.1, and 3.2-rc3, sadly
> all are affected. Finally, the only way to avoid those failures is to
> disabled 'rx checksumming' (ethtool -K ethX rx off).
>
> Here is the stack trace:
>
> [ 14.615648] sky2 0000:03:00.0: eth1: enabling interface
> [ 14.616452] ADDRCONF(NETDEV_UP): eth1: link is not ready
> [ 17.094194] sky2 0000:03:00.0: eth1: Link is up at 1000 Mbps, full
> duplex, flow control both
> [ 17.094887] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> [ 28.080018] eth1: no IPv6 routers present
> [ 563.816032] sky2 0000:03:00.0: eth1: hung mac 124:22 fifo 195 (150:145)
> [ 563.816036] sky2 0000:03:00.0: eth1: receiver hang detected
> [ 567.005422] sky2 0000:03:00.0: eth1: Link is up at 1000 Mbps, full
> duplex, flow control both
> [ 1040.816314] sky2 0000:03:00.0: eth1: rx error, status 0x7ffc0001
> length 1004
> [ 2097.401616] sky2 0000:03:00.0: eth1: rx error, status 0x39a339a3 length 0
This isn't really a hardware checksum failure.
Your problem is deeper than that. The internal parts of the chip are not
communicating correctly. The "hung mac" is a problem only occurs if the PCI
is really stuck. There may be a timing issue on your motherboard, or the BIOS
isn't setting up the device properly. The timing then gets messed up between
the end of frame status and the PCI shared memory region. Turning checksum
off masks the problem, but the status is probably still corrupt.
In either case the problem is beyond the ability of the driver to fix or workaround.
Your best bet is to see if there is a BIOS update, or replace the hardware.
^ permalink raw reply
* Re: [PATCH net-next] tcp: tcp_sendmsg() wrong access to sk_route_caps
From: Joe Perches @ 2011-11-28 16:54 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, netdev, Michał Mirosław, Ben Hutchings,
Andrew Morton
In-Reply-To: <1322496298.2292.76.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Mon, 2011-11-28 at 17:04 +0100, Eric Dumazet wrote:
> Le lundi 28 novembre 2011 à 02:46 -0800, Joe Perches a écrit :
> > On Mon, 2011-11-28 at 11:27 +0100, Eric Dumazet wrote:
> > > + sg = !!(sk->sk_route_caps & NETIF_F_SG);
> > As sg is now bool, using !! is unnecessary.
> > A commit was done recently to remove one.
> > 3ad9b358e03fd9dbf6705721490c811b666b0fe2
> Hmm... I find it dangerous and error prone. Obviously not at the time we
> commit such changes, but later, because a future reader might be fooled.
> Using !!(expr) is pretty clear about the potential problem, and
> generates no extra code if a bool is used for the target.
Though I don't think
bool = expr;
is particularly error prone,
spatch shows 58 uses of
bool = !!expr;
in net-next drivers/net.
It might be useful to standardize on either the
implicit or explicit !! style.
I think !! is an intentional and sensible style
and should be the preferred use.
Perhaps a checkpatch warning should be issued
when the implicit cast is used.
^ permalink raw reply
* RE: [PATCH net-next] tcp: tcp_sendmsg() wrong access to sk_route_caps
From: David Laight @ 2011-11-28 16:53 UTC (permalink / raw)
To: Eric Dumazet, Joe Perches; +Cc: David Miller, netdev, Michal Miroslaw
In-Reply-To: <1322496298.2292.76.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
> > > - sg = sk->sk_route_caps & NETIF_F_SG;
> > > + sg = !!(sk->sk_route_caps & NETIF_F_SG);
> >
> > As sg is now bool, using !! is unnecessary.
> >
> > A commit was done recently to remove one.
> > 3ad9b358e03fd9dbf6705721490c811b666b0fe2
> >
>
> Hmm... I find it dangerous and error prone. Obviously not at
> the time we
> commit such changes, but later, because a future reader might
> be fooled.
>
> Using !!(expr) is pretty clear about the potential problem, and
> generates no extra code if a bool is used for the target.
Or leave 'sg' as 'unsigned int' and add:
COMPILE_ASSERT(NETIF_F_SG & ~0);
(or however linux spells the appropriate define)
David
^ permalink raw reply
* Re: [PATCH] 9p: Reduce object size with CONFIG_NET_9P_DEBUG
From: Joe Perches @ 2011-11-28 17:14 UTC (permalink / raw)
To: Eric Van Hensbergen
Cc: Ron Minnich, Latchesar Ionkov, David S. Miller, v9fs-developer,
linux-kernel, netdev
In-Reply-To: <CAFkjPTmyn9POfTjYYjjYEaPt6M6Wdu-P=SEc=ZeTsbA+1=VraA@mail.gmail.com>
On Mon, 2011-11-28 at 09:10 -0600, Eric Van Hensbergen wrote:
> sorry, just got to reviewing this because of holidays.
>
> On Fri, Nov 18, 2011 at 6:09 PM, Joe Perches <joe@perches.com> wrote:
> > Reduce object size by deduplicating formats.
> >
> > Use vsprintf extension %pV.
> > Rename P9_DPRINTK uses to p9_debug, align arguments.
> > Add function for _p9_debug and macro to add __func__.
> > Add missing "\n"s to p9_debug uses.
> > Remove embedded function names as p9_debug adds it.
> > Remove P9_EPRINTK macro and convert use to pr_<level>.
> > Add and use pr_fmt and pr_<level>.
> >
> > $ size fs/9p/built-in.o*
> > text data bss dec hex filename
> > 62133 984 16000 79117 1350d fs/9p/built-in.o.new
> > 67342 984 16928 85254 14d06 fs/9p/built-in.o.old
> >
> > Signed-off-by: Joe Perches <joe@perches.com>
>
> These changes seem reasonable enough, but since you only made changes
> in the fs/9p directory and not the net/9p directory which causes
> net/9p build to break with this patch.
>
> CC net/9p/client.o
> /home/ericvh/src/linux/v9fs-devel/net/9p/client.c: In function
> ‘get_protocol_version’:
> /home/ericvh/src/linux/v9fs-devel/net/9p/client.c:84:3: error:
> implicit declaration of function ‘P9_DPRINTK’
Sorry 'bout that.
I'll fix it and resubmit later.
^ permalink raw reply
* Re: [PATCH net-next] tcp: tcp_sendmsg() wrong access to sk_route_caps
From: Ben Hutchings @ 2011-11-28 17:20 UTC (permalink / raw)
To: Joe Perches
Cc: Eric Dumazet, David Miller, netdev, Michał Mirosław,
Andrew Morton
In-Reply-To: <1322499255.2024.22.camel@Joe-Laptop>
On Mon, 2011-11-28 at 08:54 -0800, Joe Perches wrote:
> On Mon, 2011-11-28 at 17:04 +0100, Eric Dumazet wrote:
> > Le lundi 28 novembre 2011 à 02:46 -0800, Joe Perches a écrit :
> > > On Mon, 2011-11-28 at 11:27 +0100, Eric Dumazet wrote:
> > > > + sg = !!(sk->sk_route_caps & NETIF_F_SG);
> > > As sg is now bool, using !! is unnecessary.
> > > A commit was done recently to remove one.
> > > 3ad9b358e03fd9dbf6705721490c811b666b0fe2
> > Hmm... I find it dangerous and error prone. Obviously not at the time we
> > commit such changes, but later, because a future reader might be fooled.
> > Using !!(expr) is pretty clear about the potential problem, and
> > generates no extra code if a bool is used for the target.
>
> Though I don't think
> bool = expr;
> is particularly error prone,
> spatch shows 58 uses of
> bool = !!expr;
> in net-next drivers/net.
>
> It might be useful to standardize on either the
> implicit or explicit !! style.
>
> I think !! is an intentional and sensible style
> and should be the preferred use.
I see it as a workaround for the historical lack of a real bool type in
C.
> Perhaps a checkpatch warning should be issued
> when the implicit cast is used.
checkpatch can't do type-checking; maybe you mean sparse.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [RFC PATCH 00/18] netfilter: IPv6 NAT
From: Stephen Clark @ 2011-11-28 17:14 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel, netdev, ulrich.weber
In-Reply-To: <1322153850-10533-1-git-send-email-kaber@trash.net>
On 11/24/2011 11:57 AM, kaber@trash.net wrote:
> The following patches contain the updated IPv6 NAT patchset forward ported
> to 3.2-rc3. Changes since last posting:
>
> - Layer 4 protocol registration has been fixed to work properly with modular
> layer 4 protocol modules
>
> - an inverted condition in the ip6t_NETMAP checkentry function causing
> rule addition to always fail has been fixed
>
> - use of HH_DATA_ALIGN in IPv6 headroom reallocation after rerouting to fix
> unaligned data access on Tile, as suggested by Eric
>
> - nla_policy fix has been moved to a separate patch
>
> The patchset is also available at:
>
> git://github.com/kaber/nf-nat-ipv6.git master
>
> Last open point is IPv6 fragmentation handling, I'll implement my suggested
> method of storing the biggest fragment size seen during reassembly in the CB
> and using that as a hard limit during refragmentation over the weekend.
>
>
> Patrick McHardy (18):
> netfilter: nf_nat: export NAT definitions to userspace
> netfilter: nf_nat: use hash random for bysource hash
> netfilter: nf_nat: add missing nla_policy entry for CTA_NAT_PROTO attribute
> netfilter: nat: remove module reference counting from NAT protocols
> netfilter: nf_nat: remove obsolete code from nf_nat_icmp_reply_translation()
> netfilter: nf_nat: remove obsolete check in nf_nat_mangle_udp_packet()
> netfilter: ctnetlink: remove dead NAT code
> netfilter: conntrack: restrict NAT helper invocation to IPv4
> netfilter: nf_nat: add protoff argument to packet mangling functions
> netfilter: add protocol independant NAT core
> netfilter: ipv6: expand skb head in ip6_route_me_harder after oif change
> net: core: add function for incremental IPv6 pseudo header checksum updates
> netfilter: ipv6: add IPv6 NAT support
> netfilter: ip6tables: add MASQUERADE target
> netfilter: ip6tables: add REDIRECT target
> netfilter: ip6tables: add NETMAP target
> netfilter: nf_nat: support IPv6 in FTP NAT helper
> netfilter: nf_nat: support IPv6 in amanda NAT helper
>
> include/linux/netfilter.h | 14 +-
> include/linux/netfilter/Kbuild | 1 +
> include/linux/netfilter/nf_conntrack_amanda.h | 1 +
> include/linux/netfilter/nf_conntrack_ftp.h | 1 +
> include/linux/netfilter/nf_conntrack_h323.h | 15 +-
> include/linux/netfilter/nf_conntrack_irc.h | 1 +
> include/linux/netfilter/nf_conntrack_pptp.h | 2 +
> include/linux/netfilter/nf_conntrack_sip.h | 12 +-
> .../linux/netfilter/nf_conntrack_tuple_common.h | 27 +
> include/linux/netfilter/nf_nat.h | 33 +
> include/linux/netfilter/nfnetlink_conntrack.h | 8 +-
> include/linux/netfilter_ipv4.h | 1 -
> include/linux/netfilter_ipv4/Kbuild | 1 -
> include/linux/netfilter_ipv4/nf_nat.h | 58 --
> include/net/addrconf.h | 2 +-
> include/net/checksum.h | 3 +
> include/net/netfilter/nf_conntrack_expect.h | 2 +-
> include/net/netfilter/nf_conntrack_tuple.h | 1 -
> include/net/netfilter/nf_nat.h | 8 +-
> include/net/netfilter/nf_nat_core.h | 7 +-
> include/net/netfilter/nf_nat_helper.h | 11 +-
> include/net/netfilter/nf_nat_l3proto.h | 52 ++
> include/net/netfilter/nf_nat_l4proto.h | 72 +++
> include/net/netfilter/nf_nat_protocol.h | 74 ---
> include/net/netfilter/nf_nat_rule.h | 15 -
> include/net/netns/conntrack.h | 4 +
> include/net/netns/ipv4.h | 2 -
> include/net/netns/ipv6.h | 1 +
> net/core/secure_seq.c | 1 +
> net/core/utils.c | 20 +
> net/ipv4/netfilter.c | 37 --
> net/ipv4/netfilter/Kconfig | 67 +--
> net/ipv4/netfilter/Makefile | 13 +-
> net/ipv4/netfilter/ipt_MASQUERADE.c | 26 +-
> net/ipv4/netfilter/ipt_NETMAP.c | 21 +-
> net/ipv4/netfilter/ipt_REDIRECT.c | 23 +-
> .../{nf_nat_standalone.c => iptable_nat.c} | 266 ++++-----
> net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 8 +-
> net/ipv4/netfilter/nf_nat_h323.c | 83 ++--
> net/ipv4/netfilter/nf_nat_irc.c | 4 +-
> net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 281 +++++++++
> net/ipv4/netfilter/nf_nat_pptp.c | 31 +-
> net/ipv4/netfilter/nf_nat_proto_common.c | 126 ----
> net/ipv4/netfilter/nf_nat_proto_gre.c | 36 +-
> net/ipv4/netfilter/nf_nat_proto_icmp.c | 26 +-
> net/ipv4/netfilter/nf_nat_rule.c | 214 -------
> net/ipv4/netfilter/nf_nat_sip.c | 121 ++--
> net/ipv4/netfilter/nf_nat_tftp.c | 1 -
> net/ipv6/addrconf.c | 2 +-
> net/ipv6/netfilter.c | 8 +
> net/ipv6/netfilter/Kconfig | 48 ++
> net/ipv6/netfilter/Makefile | 7 +
> net/ipv6/netfilter/ip6t_MASQUERADE.c | 135 +++++
> net/ipv6/netfilter/ip6t_NETMAP.c | 94 +++
> net/ipv6/netfilter/ip6t_REDIRECT.c | 98 +++
> net/ipv6/netfilter/ip6table_nat.c | 319 ++++++++++
> net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 14 +
> net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 285 +++++++++
> net/ipv6/netfilter/nf_nat_proto_icmpv6.c | 87 +++
> net/netfilter/Kconfig | 34 ++
> net/netfilter/Makefile | 15 +
> net/netfilter/core.c | 5 +
> net/netfilter/ipvs/ip_vs_ftp.c | 1 +
> net/netfilter/nf_conntrack_amanda.c | 5 +-
> net/netfilter/nf_conntrack_core.c | 7 +
> net/netfilter/nf_conntrack_ftp.c | 3 +-
> net/netfilter/nf_conntrack_h323_main.c | 232 +++++---
> net/netfilter/nf_conntrack_irc.c | 6 +-
> net/netfilter/nf_conntrack_netlink.c | 30 +-
> net/netfilter/nf_conntrack_pptp.c | 18 +-
> net/netfilter/nf_conntrack_proto_tcp.c | 8 +-
> net/netfilter/nf_conntrack_sip.c | 119 +++--
> net/netfilter/nf_conntrack_tftp.c | 3 +-
> net/{ipv4 => }/netfilter/nf_nat_amanda.c | 4 +-
> net/{ipv4 => }/netfilter/nf_nat_core.c | 621 ++++++++++----------
> net/{ipv4 => }/netfilter/nf_nat_ftp.c | 33 +-
> net/{ipv4 => }/netfilter/nf_nat_helper.c | 121 ++---
> net/netfilter/nf_nat_proto_common.c | 112 ++++
> net/{ipv4 => }/netfilter/nf_nat_proto_dccp.c | 58 +-
> net/{ipv4 => }/netfilter/nf_nat_proto_sctp.c | 55 +-
> net/{ipv4 => }/netfilter/nf_nat_proto_tcp.c | 42 +-
> net/{ipv4 => }/netfilter/nf_nat_proto_udp.c | 44 +-
> net/{ipv4 => }/netfilter/nf_nat_proto_udplite.c | 60 +-
> net/{ipv4 => }/netfilter/nf_nat_proto_unknown.c | 15 +-
> net/netfilter/xt_nat.c | 185 ++++++
> 85 files changed, 3132 insertions(+), 1635 deletions(-)
>
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
Probabably a dumb question but are these patches for natting ipv6 to
ipv6 or ipv4 to ipv6?
--
"They that give up essential liberty to obtain temporary safety,
deserve neither liberty nor safety." (Ben Franklin)
"The course of history shows that as a government grows, liberty
decreases." (Thomas Jefferson)
^ permalink raw reply
* Re: [PATCH 4/5] tcp: allow undo from reordered DSACKs
From: Neal Cardwell @ 2011-11-28 17:31 UTC (permalink / raw)
To: David Miller; +Cc: ilpo.jarvinen, netdev, nanditad, ycheng, therbert
In-Reply-To: <20111127.185857.475440859115872225.davem@davemloft.net>
On Sun, Nov 27, 2011 at 6:58 PM, David Miller <davem@davemloft.net> wrote:
> In order to make forward progress I've added Neil's patch to net-next.
Thanks!
>> Also, one (serious) word of caution! This change, by its extending of
>> CA_Open state, is somewhat similar to what I made FRTO years ago, and
>> managed to introduces subtle breaking to the state machine. Please check
>> that the problem similar to what was fixed by
>> a1c1f281b84a751fdb5ff919da3b09df7297619f does not reappear (possibly in
>> some other form causing spurious undos). ...To me it seems that
>> tcp_packet_delayed might be similarly confused after the given patch.
>
> Neil, please look into this so we can have any such issues fixed
> in time for the next merge window.
Absolutely. I will look into the areas that Ilpo mentioned.
neal
^ permalink raw reply
* Re: [PATCH net-next] tcp: tcp_sendmsg() wrong access to sk_route_caps
From: Joe Perches @ 2011-11-28 17:32 UTC (permalink / raw)
To: Ben Hutchings
Cc: Eric Dumazet, David Miller, netdev, Michał Mirosław,
Andrew Morton
In-Reply-To: <1322500843.26733.2.camel@bwh-desktop>
On Mon, 2011-11-28 at 17:20 +0000, Ben Hutchings wrote:
> On Mon, 2011-11-28 at 08:54 -0800, Joe Perches wrote:
> > Perhaps a checkpatch warning should be issued
> > when the implicit cast [to bool] is used.
> checkpatch can't do type-checking; maybe you mean sparse.
I think I could get checkpatch to track the bool
type implicit casts for the not struct member
uses of bool.
ie:
bool foo;
[...]
foo = expr;
but not:
struct foo {
bool bar;
};
[...]
foo.bar = expr;
It's pretty trivial to do it mostly correctly
with spatch/cocci though.
^ permalink raw reply
* Re: [PATCH 5/5] tcp: skip cwnd moderation in TCP_CA_Open in tcp_try_to_open
From: Neal Cardwell @ 2011-11-28 17:35 UTC (permalink / raw)
To: David Miller; +Cc: ilpo.jarvinen, ycheng, netdev, nanditad, therbert
In-Reply-To: <20111127.190120.2116987138205753451.davem@davemloft.net>
On Sun, Nov 27, 2011 at 7:01 PM, David Miller <davem@davemloft.net> wrote:
> From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
> Date: Tue, 22 Nov 2011 13:47:13 +0200 (EET)
>
>> On Mon, 21 Nov 2011, Yuchung Cheng wrote:
>>
>>> AFAIK cwnd moderation is to lower bursty drops not to discourage
>>> (dsack) cheating. I believe the reason is the same in
>>> tcp_try_to_open(). We are in common cases, e.g., loss-recovery, this
>>> logic hurts performance.
>>
>> Quote from the patch description: "Senders were overriding cwnd values
>> picked during an undo by calling tcp_moderate_cwnd()" ...so after all it
>> has to do with undo being limited. IMHO only up to orig_cwnd/2+IW is safe
>> due to cheating opportunities. Also FRTO uses orig_cwnd/2 due to same
>> reason (it could do the +IW too but IIRC it is only /2 currently). What
>> would be the safeguard there after this one is removed? I kind of see your
>> point about this particular line being related to burst mitigation but on
>> the same time the end result of removal is that undo becomes potentially
>> much more aggressive.
>
> I'm apply this patch to net-next now, but Neil and Yucheng are on the hook
> to fully look into the issues Ilpo has raised.
Thanks! We will spend some time looking into these issues.
neal
^ permalink raw reply
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: Stephen Hemminger @ 2011-11-28 17:42 UTC (permalink / raw)
To: Jason Wang
Cc: Krishna Kumar2, arnd, Michael S. Tsirkin, netdev, virtualization,
levinsasha928, davem
In-Reply-To: <4ED30D2B.10907@redhat.com>
On Mon, 28 Nov 2011 12:25:15 +0800
Jason Wang <jasowang@redhat.com> wrote:
> I'm using ixgbe for testing also, for host, its driver seems provide irq
> affinity hint, so no binding or irqbalance is needed.
The hint is for irqbalance to use. You need to still do manual
affinity or use irqbalance.
^ permalink raw reply
* Re: [PATCH v3 02/10] net: Add queue state xoff flag for stack
From: Ben Hutchings @ 2011-11-28 17:46 UTC (permalink / raw)
To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.2.00.1111222139180.14467@pokey.mtv.corp.google.com>
On Tue, 2011-11-22 at 21:52 -0800, Tom Herbert wrote:
> Create separate queue state flags so that either the stack or drivers
> can turn on XOFF. Added a set of functions used in the stack to determine
> if a queue is really stopped (either by stack or driver)
[...]
> static inline int netif_tx_queue_stopped(const struct netdev_queue *dev_queue)
> {
> - return test_bit(__QUEUE_STATE_XOFF, &dev_queue->state);
> + return test_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state);
> }
[...]
> +static inline int netif_xmit_stopped(const struct netdev_queue *dev_queue)
> {
> - return dev_queue->state & QUEUE_STATE_XOFF_OR_FROZEN;
> + return dev_queue->state & QUEUE_STATE_ANY_XOFF;
> +}
[...]
The difference in names here doesn't really suggest the difference in
semantics. I think they will be easy to confuse.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH v3 03/10] net: Add netdev interfaces recording send/compl
From: Ben Hutchings @ 2011-11-28 17:49 UTC (permalink / raw)
To: Tom Herbert; +Cc: David Miller, shemminger, netdev
In-Reply-To: <CA+mtBx-VSJ6U2SyUxQt7eK9+cXFFAEQXgmhc+yWwpTX8zd6TJA@mail.gmail.com>
On Wed, 2011-11-23 at 16:32 -0800, Tom Herbert wrote:
> >> I was referring to the transmit routine, there for netdev_sent_queue()
> >> is always called with one packet.
> >
> > Optimism for the future? :-)
> >
>
> Yep, I am looking forward to the day that someone figures out how to
> batch transmits in a meaningful way!
[...]
UFO/TSO do something like that. Should we count physical or logical
packets?
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] natsemi: make cable length magic configurable
From: Ben Hutchings @ 2011-11-28 17:51 UTC (permalink / raw)
To: David Miller; +Cc: jdelvare, netdev, thockin, okir
In-Reply-To: <20111125.013713.1783082973236989821.davem@davemloft.net>
On Fri, 2011-11-25 at 01:37 -0500, David Miller wrote:
> From: Jean Delvare <jdelvare@suse.de>
> Date: Thu, 24 Nov 2011 14:43:59 +0100
>
> > We had a customer report concerning problems with a Natsemi DP83815-D
> > and long cables. With 100m cables, the network would be essentially dead,
> > not a single packet would get through either way. We had to apply the
> > patch below to make it work.
>
> Please do not add new private device driver module knobs.
>
> Instead, add generic flags to the ethtool interface or similar to
> control device behavior.
This isn't a generic flag; it seems to be a workaround for a specific
bug that unfortunately can't be automatic.
That's why I suggested an ethtool private flag.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* RE: [PATCH net-next] igb: reset PHY after recovering from PHY power down
From: Wyborny, Carolyn @ 2011-11-28 17:57 UTC (permalink / raw)
To: Koki Sanagi, netdev@vger.kernel.org, Kirsher, Jeffrey T,
Brandeburg, Jesse, Allan, Bruce W, Skidmore, Donald C,
Rose, Gregory V, Waskiewicz Jr, Peter P, Duyck, Alexander H,
Ronciak, John, e1000-devel@lists.sourceforge.net
Cc: davem@davemloft.net
In-Reply-To: <4ECDB76F.1090104@jp.fujitsu.com>
>-----Original Message-----
>From: Koki Sanagi [mailto:sanagi.koki@jp.fujitsu.com]
>Sent: Wednesday, November 23, 2011 7:18 PM
>To: netdev@vger.kernel.org; Kirsher, Jeffrey T; Brandeburg, Jesse;
>Allan, Bruce W; Wyborny, Carolyn; Skidmore, Donald C; Rose, Gregory V;
>Waskiewicz Jr, Peter P; Duyck, Alexander H; Ronciak, John; e1000-
>devel@lists.sourceforge.net
>Cc: davem@davemloft.net
>Subject: [PATCH net-next] igb: reset PHY after recovering from PHY power
>down
>
>According to 82576_Datasheet.pdf, PHY setting is lost after PHY power
>down.
>So resetting PHY is needed when recovering from PHY power down to set a
>default
>setting to PHY register.
>
>Owing to this lack, NIC doesn't link up in some rare situation.
>The situation I encountered is following.
>
>
>1.Both ports connect to switch.
>+---------+ +--------+
>| |-----------| |
>| 82576NS | | switch |
>| |-----------| |
>+---------+ +--------+
>
>2.Detach both cables from switch.
>+---------+ +--------+
>| |------- | |
>| 82576NS | | switch |
>| |------- | |
>+---------+ +--------+
>
>3.Detach one cable from one port.
>+---------+ +--------+
>| |------- | |
>| 82576NS | | switch |
>| | | |
>+---------+ +--------+
>
>4.Attach that cable to the other port.(It means connecting directly each
>port)
>+---------+ +--------+
>| |-------+ | |
>| 82576NS | | | switch |
>| |-------+ | |
>+---------+ +--------+
>
>As a result, NIC doesn't link up sometimes.
>
>Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
>---
> drivers/net/ethernet/intel/igb/igb_main.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
>diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
>b/drivers/net/ethernet/intel/igb/igb_main.c
>index bd9b30e..4d4f065 100644
>--- a/drivers/net/ethernet/intel/igb/igb_main.c
>+++ b/drivers/net/ethernet/intel/igb/igb_main.c
>@@ -2496,6 +2496,7 @@ static int igb_open(struct net_device *netdev)
> goto err_setup_rx;
>
> igb_power_up_link(adapter);
>+ igb_reset_phy(hw);
>
> /* before we allocate an interrupt, we must be ready to handle it.
> * Setting DEBUG_SHIRQ in the kernel makes it fire an interrupt
Hello,
This seems reasonable, however it would only cover cases where igb_open is being called. It would be better to integrate the phy_reset call into igb_power_up_link, so that's the default. I can provide an alternate by end of the week.
Thanks,
Carolyn
Carolyn Wyborny
Linux Development
LAN Access Division
Intel Corporation
^ permalink raw reply
* Re: [PATCH 21/62] net: remove the second argument of k[un]map_atomic()
From: Alexander Duyck @ 2011-11-28 18:06 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, linux-kernel, akpm, Jeff Kirsher, Jesse Brandeburg,
Bruce Allan, Carolyn Wyborny, Don Skidmore, Greg Rose,
Peter P Waskiewicz Jr, John Ronciak, David S. Miller, Dean Nelson,
Ian Campbell, Jiri Pirko, e1000-devel, netdev
In-Reply-To: <1322381248.2826.1.camel@edumazet-laptop>
On 11/27/2011 12:07 AM, Eric Dumazet wrote:
> Le dimanche 27 novembre 2011 à 13:27 +0800, Cong Wang a écrit :
>> Signed-off-by: Cong Wang <amwang@redhat.com>
>> ---
>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> index cf480b5..b194beb 100644
>> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> @@ -3878,11 +3878,9 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
>> if (length <= copybreak &&
>> skb_tailroom(skb) >= length) {
>> u8 *vaddr;
>> - vaddr = kmap_atomic(buffer_info->page,
>> - KM_SKB_DATA_SOFTIRQ);
>> + vaddr = kmap_atomic(buffer_info->page);
>> memcpy(skb_tail_pointer(skb), vaddr, length);
>> - kunmap_atomic(vaddr,
>> - KM_SKB_DATA_SOFTIRQ);
>> + kunmap_atomic(vaddr);
>> /* re-use the page, so don't erase
>> * buffer_info->page */
>> skb_put(skb, length);
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index a855db1..8603c87 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -1272,9 +1272,9 @@ static bool e1000_clean_rx_irq_ps(struct e1000_adapter *adapter,
>> */
>> dma_sync_single_for_cpu(&pdev->dev, ps_page->dma,
>> PAGE_SIZE, DMA_FROM_DEVICE);
>> - vaddr = kmap_atomic(ps_page->page, KM_SKB_DATA_SOFTIRQ);
>> + vaddr = kmap_atomic(ps_page->page);
>> memcpy(skb_tail_pointer(skb), vaddr, l1);
>> - kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
>> + kunmap_atomic(vaddr);
>> dma_sync_single_for_device(&pdev->dev, ps_page->dma,
>> PAGE_SIZE, DMA_FROM_DEVICE);
>>
>> @@ -1465,12 +1465,10 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
>> if (length <= copybreak &&
>> skb_tailroom(skb) >= length) {
>> u8 *vaddr;
>> - vaddr = kmap_atomic(buffer_info->page,
>> - KM_SKB_DATA_SOFTIRQ);
>> + vaddr = kmap_atomic(buffer_info->page);
>> memcpy(skb_tail_pointer(skb), vaddr,
>> length);
>> - kunmap_atomic(vaddr,
>> - KM_SKB_DATA_SOFTIRQ);
>> + kunmap_atomic(vaddr);
>> /* re-use the page, so don't erase
>> * buffer_info->page */
>> skb_put(skb, length);
> But why are these drivers using kmap_atomic() in first place, since
> their fragments are allocated in regular zone (GFP_ATOMIC or
> GFP_KERNEL) ?
I was asking the same thing myself recently when I started working on
some copy-break like code for the ixgbe driver. I believe the main
reason is a lack of documentation. This code is based loosely on the
skb_copy_bits code which will use kmap_skb_frag over all of the paged
portions of the sk_buff. As such it was decided to map things via
kmap_atomic in order to guarantee the pages had a valid virtual address.
If I understand things correctly, what you are brining up is that pages
allocated with either GFP_ATOMIC or GFP_KERNEL will always be allocated
from the lowmem pool and as such page_address should always succeed. Is
that correct?
Thanks,
Alex
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox