* Re: [PATCH 5/7] can: clear ctrlmode when close candev
From: Marc Kleine-Budde @ 2014-11-03 20:47 UTC (permalink / raw)
To: Dong Aisheng, linux-can
Cc: wg, varkabhadram, netdev, socketcan, linux-arm-kernel
In-Reply-To: <1414579527-31100-5-git-send-email-b29396@freescale.com>
On 10/29/2014 11:45 AM, Dong Aisheng wrote:
> Currently priv->ctrlmode is not cleared in close_candev(), so the next
> time the interface is brought up the driver still uses this value to
> configure the controller even if the user does not set any ctrl mode.
> e.g.
> Step 1. ip link set can0 up type can bitrate 1000000 loopback on
> Controller will be in loopback mode
> Step 2. ip link set can0 down
> Step 3. ip link set can0 up type can bitrate 1000000
> Controller will still be set to loopback mode in driver due to saved
> priv->ctrlmode.
>
> This patch clears priv->ctrlmode when the CAN interface is closed,
> and sets it to the correct mode according to the next user setting.
>
> Signed-off-by: Dong Aisheng <b29396@freescale.com>
NACK, as discussed with Oliver.
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
^ permalink raw reply
* Re: [PATCH] uapi: resort Kbuild entries
From: David Miller @ 2014-11-03 20:45 UTC (permalink / raw)
To: stephen
Cc: gregkh, netdev, linux-api
In-Reply-To: <20141103124234.77377274@urahara>
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Mon, 3 Nov 2014 12:42:34 -0800
> The entries in the Kbuild files are incorrectly sorted.
> Matters for aesthetics only.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> ---
> Patch against -net tree since that is where my last change was.
As such, I'll apply this to net-next after my next merge.
^ permalink raw reply
* Re: [PATCH net 0/4] ipv6: Fix iflink setting for ipv6 tunnels
From: David Miller @ 2014-11-03 20:43 UTC (permalink / raw)
To: steffen.klassert; +Cc: netdev
In-Reply-To: <1415002770-5797-1-git-send-email-steffen.klassert@secunet.com>
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Mon, 3 Nov 2014 09:19:26 +0100
> The ipv6 tunnels do the dev->iflink setting too early, it gets
> overwritten by register_netdev(). So set dev->iflink from within
> a ndo_init function to keep the configured setting.
>
> This patchset fixes this for ip6_tunnel, vti6, sit and gre6.
Series applied, good catch, and thanks for checking all of the ipv6
tunnel drivers for this problem.
^ permalink raw reply
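The ordering problem described in the cover letter above can be illustrated with a small userspace model. This is only a sketch: `toy_dev`, `toy_register()` and `toy_tunnel_init()` are hypothetical stand-ins, and the assumed ordering (registration forgets any earlier `iflink`, runs the init hook, then defaults `iflink` to the device's own index) is a simplification for illustration, not a copy of what register_netdev() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a net device; not the kernel's struct net_device. */
struct toy_dev {
	int ifindex;                        /* index assigned at registration */
	int iflink;                         /* index of the underlying link */
	int configured_link;                /* link requested by the user */
	int (*init)(struct toy_dev *dev);   /* plays the role of ndo_init */
};

/* Init hook that applies the configured link from inside registration. */
static int toy_tunnel_init(struct toy_dev *dev)
{
	dev->iflink = dev->configured_link;
	return 0;
}

/*
 * Toy registration (assumed ordering): forget any earlier iflink value,
 * run the init hook, then default iflink to the device's own index if
 * the hook left it unset.
 */
static int toy_register(struct toy_dev *dev, int new_ifindex)
{
	int ret;

	assert(new_ifindex > 0);
	dev->iflink = -1;                   /* a value set before this point is lost */

	if (dev->init) {
		ret = dev->init(dev);
		if (ret)
			return ret;
	}

	dev->ifindex = new_ifindex;
	if (dev->iflink == -1)
		dev->iflink = dev->ifindex;
	return 0;
}
```

In this model, an `iflink` assigned before `toy_register()` ends up equal to the device's own index, while the same assignment made from the init hook survives registration.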
* [PATCH] uapi: resort Kbuild entries
From: Stephen Hemminger @ 2014-11-03 20:42 UTC (permalink / raw)
To: David Miller, Greg KH; +Cc: netdev, linux-api
The entries in the Kbuild files are incorrectly sorted.
Matters for aesthetics only.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
Patch against -net tree since that is where my last change was.
--- a/include/uapi/linux/Kbuild 2014-11-02 11:24:32.304658688 -0800
+++ b/include/uapi/linux/Kbuild 2014-11-02 11:27:09.017509917 -0800
@@ -37,27 +37,27 @@ header-y += aio_abi.h
header-y += apm_bios.h
header-y += arcfb.h
header-y += atalk.h
-header-y += atm.h
-header-y += atm_eni.h
-header-y += atm_he.h
-header-y += atm_idt77105.h
-header-y += atm_nicstar.h
-header-y += atm_tcp.h
-header-y += atm_zatm.h
header-y += atmapi.h
header-y += atmarp.h
header-y += atmbr2684.h
header-y += atmclip.h
header-y += atmdev.h
+header-y += atm_eni.h
+header-y += atm.h
+header-y += atm_he.h
+header-y += atm_idt77105.h
header-y += atmioc.h
header-y += atmlec.h
header-y += atmmpc.h
+header-y += atm_nicstar.h
header-y += atmppp.h
header-y += atmsap.h
header-y += atmsvc.h
+header-y += atm_tcp.h
+header-y += atm_zatm.h
header-y += audit.h
-header-y += auto_fs.h
header-y += auto_fs4.h
+header-y += auto_fs.h
header-y += auxvec.h
header-y += ax25.h
header-y += b1lli.h
@@ -67,8 +67,8 @@ header-y += bfs_fs.h
header-y += binfmts.h
header-y += blkpg.h
header-y += blktrace_api.h
-header-y += bpf.h
header-y += bpf_common.h
+header-y += bpf.h
header-y += bpqether.h
header-y += bsg.h
header-y += btrfs.h
@@ -93,21 +93,21 @@ header-y += cyclades.h
header-y += cycx_cfm.h
header-y += dcbnl.h
header-y += dccp.h
-header-y += dlm.h
+header-y += dlmconstants.h
header-y += dlm_device.h
+header-y += dlm.h
header-y += dlm_netlink.h
header-y += dlm_plock.h
-header-y += dlmconstants.h
header-y += dm-ioctl.h
header-y += dm-log-userspace.h
header-y += dn.h
header-y += dqblk_xfs.h
header-y += edd.h
header-y += efs_fs_sb.h
+header-y += elfcore.h
header-y += elf-em.h
header-y += elf-fdpic.h
header-y += elf.h
-header-y += elfcore.h
header-y += errno.h
header-y += errqueue.h
header-y += ethtool.h
@@ -131,15 +131,15 @@ header-y += fsl_hypervisor.h
header-y += fuse.h
header-y += futex.h
header-y += gameport.h
-header-y += gen_stats.h
header-y += genetlink.h
+header-y += gen_stats.h
header-y += gfs2_ondisk.h
header-y += gigaset_dev.h
-header-y += hdlc.h
header-y += hdlcdrv.h
+header-y += hdlc.h
header-y += hdreg.h
-header-y += hid.h
header-y += hiddev.h
+header-y += hid.h
header-y += hidraw.h
header-y += hpet.h
header-y += hsr_netlink.h
@@ -151,7 +151,6 @@ header-y += i2o-dev.h
header-y += i8k.h
header-y += icmp.h
header-y += icmpv6.h
-header-y += if.h
header-y += if_addr.h
header-y += if_addrlabel.h
header-y += if_alg.h
@@ -165,6 +164,7 @@ header-y += if_ether.h
header-y += if_fc.h
header-y += if_fddi.h
header-y += if_frad.h
+header-y += if.h
header-y += if_hippi.h
header-y += if_infiniband.h
header-y += if_link.h
@@ -182,40 +182,40 @@ header-y += if_tunnel.h
header-y += if_vlan.h
header-y += if_x25.h
header-y += igmp.h
-header-y += in.h
header-y += in6.h
-header-y += in_route.h
header-y += inet_diag.h
+header-y += in.h
header-y += inotify.h
header-y += input.h
+header-y += in_route.h
header-y += ioctl.h
-header-y += ip.h
header-y += ip6_tunnel.h
-header-y += ip_vs.h
header-y += ipc.h
+header-y += ip.h
header-y += ipmi.h
header-y += ipmi_msgdefs.h
header-y += ipsec.h
header-y += ipv6.h
header-y += ipv6_route.h
+header-y += ip_vs.h
header-y += ipx.h
header-y += irda.h
header-y += irqnr.h
-header-y += isdn.h
header-y += isdn_divertif.h
-header-y += isdn_ppp.h
+header-y += isdn.h
header-y += isdnif.h
+header-y += isdn_ppp.h
header-y += iso_fs.h
-header-y += ivtv.h
header-y += ivtvfb.h
+header-y += ivtv.h
header-y += ixjuser.h
header-y += jffs2.h
header-y += joystick.h
-header-y += kd.h
header-y += kdev_t.h
-header-y += kernel-page-flags.h
-header-y += kernel.h
+header-y += kd.h
header-y += kernelcapi.h
+header-y += kernel.h
+header-y += kernel-page-flags.h
header-y += kexec.h
header-y += keyboard.h
header-y += keyctl.h
@@ -231,6 +231,7 @@ ifneq ($(wildcard $(srctree)/arch/$(SRCA
header-y += kvm_para.h
endif
+header-y += hw_breakpoint.h
header-y += l2tp.h
header-y += libc-compat.h
header-y += limits.h
@@ -255,43 +256,43 @@ header-y += mman.h
header-y += mmtimer.h
header-y += mpls.h
header-y += mqueue.h
-header-y += mroute.h
header-y += mroute6.h
+header-y += mroute.h
header-y += msdos_fs.h
header-y += msg.h
header-y += mtio.h
-header-y += n_r3964.h
header-y += nbd.h
-header-y += ncp.h
header-y += ncp_fs.h
+header-y += ncp.h
header-y += ncp_mount.h
header-y += ncp_no.h
header-y += neighbour.h
-header-y += net.h
-header-y += net_dropmon.h
-header-y += net_tstamp.h
header-y += netconf.h
header-y += netdevice.h
-header-y += netlink_diag.h
-header-y += netfilter.h
+header-y += net_dropmon.h
header-y += netfilter_arp.h
header-y += netfilter_bridge.h
header-y += netfilter_decnet.h
+header-y += netfilter.h
header-y += netfilter_ipv4.h
header-y += netfilter_ipv6.h
+header-y += net.h
+header-y += netlink_diag.h
header-y += netlink.h
header-y += netrom.h
+header-y += net_tstamp.h
header-y += nfc.h
-header-y += nfs.h
header-y += nfs2.h
header-y += nfs3.h
header-y += nfs4.h
header-y += nfs4_mount.h
+header-y += nfsacl.h
header-y += nfs_fs.h
+header-y += nfs.h
header-y += nfs_idmap.h
header-y += nfs_mount.h
-header-y += nfsacl.h
header-y += nl80211.h
+header-y += n_r3964.h
header-y += nubus.h
header-y += nvme.h
header-y += nvram.h
@@ -311,16 +312,16 @@ header-y += pfkeyv2.h
header-y += pg.h
header-y += phantom.h
header-y += phonet.h
+header-y += pktcdvd.h
header-y += pkt_cls.h
header-y += pkt_sched.h
-header-y += pktcdvd.h
header-y += pmu.h
header-y += poll.h
header-y += posix_types.h
header-y += ppdev.h
header-y += ppp-comp.h
-header-y += ppp-ioctl.h
header-y += ppp_defs.h
+header-y += ppp-ioctl.h
header-y += pps.h
header-y += prctl.h
header-y += psci.h
@@ -352,13 +353,13 @@ header-y += seccomp.h
header-y += securebits.h
header-y += selinux_netlink.h
header-y += sem.h
-header-y += serial.h
header-y += serial_core.h
+header-y += serial.h
header-y += serial_reg.h
header-y += serio.h
header-y += shm.h
-header-y += signal.h
header-y += signalfd.h
+header-y += signal.h
header-y += smiapp.h
header-y += snmp.h
header-y += sock_diag.h
@@ -367,8 +368,8 @@ header-y += sockios.h
header-y += som.h
header-y += sonet.h
header-y += sonypi.h
-header-y += sound.h
header-y += soundcard.h
+header-y += sound.h
header-y += stat.h
header-y += stddef.h
header-y += string.h
@@ -387,11 +388,11 @@ header-y += time.h
header-y += times.h
header-y += timex.h
header-y += tiocl.h
-header-y += tipc.h
header-y += tipc_config.h
+header-y += tipc.h
header-y += toshiba.h
-header-y += tty.h
header-y += tty_flags.h
+header-y += tty.h
header-y += types.h
header-y += udf_fs_i.h
header-y += udp.h
@@ -437,6 +438,5 @@ header-y += wireless.h
header-y += x25.h
header-y += xattr.h
header-y += xfrm.h
-header-y += hw_breakpoint.h
header-y += zorro.h
header-y += zorro_ids.h
^ permalink raw reply
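The removed and added lines in the diff above are consistent with two different comparison rules: plain byte order, where '.' sorts before '_' and '_' before letters, and an order that only compares letters and digits. The sketch below shows both rules on three of the header names. It is an observation about the diff, not a statement of how the patch was generated; `cmp_alnum()` and `sort_headers()` are hypothetical helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Plain byte order: '.' (0x2e) < '_' (0x5f) < 'a' (0x61). */
static int cmp_bytes(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Advance past characters that are neither letters nor digits. */
static const char *skip_punct(const char *s)
{
	while (*s && !isalnum((unsigned char)*s))
		s++;
	return s;
}

/* Order that compares letters and digits only, ignoring punctuation. */
static int cmp_alnum(const void *a, const void *b)
{
	const char *x = *(const char *const *)a;
	const char *y = *(const char *const *)b;

	for (;;) {
		x = skip_punct(x);
		y = skip_punct(y);
		if (*x != *y || *x == '\0')
			return (unsigned char)*x - (unsigned char)*y;
		x++;
		y++;
	}
}

/* Sort an array of header names in place with the given comparison rule. */
static void sort_headers(const char **names, size_t n,
			 int (*cmp)(const void *, const void *))
{
	assert(cmp != NULL);
	qsort(names, n, sizeof(*names), cmp);
}
```

With `cmp_bytes`, the names atmapi.h, atm_eni.h and atm.h come out as atm.h, atm_eni.h, atmapi.h, which matches the removed lines. With `cmp_alnum` they come out as atmapi.h, atm_eni.h, atm.h, which matches the added lines.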
* Re: [PATCH] hamradio: 6pack: remove unnecessary check
From: David Miller @ 2014-11-03 20:34 UTC (permalink / raw)
To: sudipm.mukherjee; +Cc: ajk, linux-hams, netdev, linux-kernel
In-Reply-To: <1415016749-6825-1-git-send-email-sudipm.mukherjee@gmail.com>
From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Date: Mon, 3 Nov 2014 17:42:29 +0530
> this check for dev is unnecessary, as we are already checking dev
> after allocating it via alloc_netdev, and jumping to label: out
> if it is NULL.
>
> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH v2] PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instruction
From: David Miller @ 2014-11-03 20:29 UTC (permalink / raw)
To: alexei.starovoitov; +Cc: felix, kda, netdev, linuxppc-dev, mpe, matt
In-Reply-To: <CAADnVQJYNB15cQvDi0+AcL==n+f8PAb1=Dnp_vBWku4SA-Q_6Q@mail.gmail.com>
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Mon, 3 Nov 2014 09:21:03 -0800
> On Mon, Nov 3, 2014 at 9:06 AM, David Miller <davem@davemloft.net> wrote:
>> From: Denis Kirjanov <kda@linux-powerpc.org>
>> Date: Thu, 30 Oct 2014 09:12:15 +0300
>>
>>> Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load
>>> skb->pkt_type field.
>>>
>>> Before:
>>> [ 88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS
>>> [ 88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS
>>>
>>> After:
>>> [ 80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS
>>> [ 80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS
>>>
>>> CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
>>> CC: Michael Ellerman<mpe@ellerman.id.au>
>>> Cc: Matt Evans <matt@ozlabs.org>
>>> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
>>>
>>> v2: Added test results
>>
>> So, can I apply this now?
>
> I think this question is more towards ppc folks,
> since both Daniel and myself said before that it looks ok.
> Philippe just tested the previous version of this patch on ppc64le...
> I'm guessing that Matt (original author of bpf jit for ppc) is not replying,
> because he has no objections.
> Either way the addition is tiny and contained, so can go in now.
Ok, I have applied this to net-next, thanks everyone.
^ permalink raw reply
* Re: [PATCH] VNIC: Adding support for Cavium ThunderX network controller
From: Stephen Hemminger @ 2014-11-03 20:25 UTC (permalink / raw)
To: Robert Richter
Cc: Sunil Kovvuri, Robert Richter, David S. Miller, Sunil Goutham,
Stefan Assmann, LKML, LAKML, netdev
In-Reply-To: <20141103183345.GK31556@rric.localhost>
On Mon, 3 Nov 2014 19:33:45 +0100
Robert Richter <robert.richter@caviumnetworks.com> wrote:
> On 03.11.14 10:16:51, Stephen Hemminger wrote:
> > On Fri, 31 Oct 2014 22:44:11 +0530
> > Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> >
> > > On Fri, Oct 31, 2014 at 8:24 AM, Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > > > On Thu, 30 Oct 2014 17:54:34 +0100
> > > > Robert Richter <rric@kernel.org> wrote:
> > > >
> > > >> +#ifdef VNIC_RSS_SUPPORT
> > > >> +static int rss_config = RSS_IP_HASH_ENA | RSS_TCP_HASH_ENA | RSS_UDP_HASH_ENA;
> > > >> +module_param(rss_config, int, S_IRUGO);
> > > >> +MODULE_PARM_DESC(rss_config,
> > > >> + "RSS hash config [bits 8:0] (Bit0:L2 extended, 1:IP, 2:TCP, 3:TCP SYN, 4:UDP, 5:L4 extended, 6:ROCE 7:L3 bi-directional, 8:L4 bi-directional)");
> > > >> +#endif
> > > >
> > > > This should be managed via ethtool ETHTOOL_GRXFH rather than a module parameter.
> > > Thanks, I will add setting hash options via ETHTOOL_SRXFH as well.
> > > The idea here is to allow choosing the hash at module load (through
> > > module params) and, if it needs to be changed at runtime, to do that
> > > via ethtool.
> > >
> > > Sunil.
> >
> > Network developers do not like vendor unique module parameters.
> > Anything device specific doesn't work in a generic distro environment.
>
> Do you accept unique module parameters in parallel to ethtool support
> or should this be removed?
If there is ethtool support the module parameters are not needed.
Unneeded code is to be avoided.
^ permalink raw reply
* Re: [PATCH net-next 3/7] gue: Add infrastructure for flags and options
From: David Miller @ 2014-11-03 20:12 UTC (permalink / raw)
To: therbert; +Cc: netdev
In-Reply-To: <CA+mtBx8QZn8M8mnU84w2+sc4bt98kf-_SuW0jz15qYNX21vCag@mail.gmail.com>
From: Tom Herbert <therbert@google.com>
Date: Mon, 3 Nov 2014 10:39:14 -0800
> On Mon, Nov 3, 2014 at 9:18 AM, David Miller <davem@davemloft.net> wrote:
>> From: Tom Herbert <therbert@google.com>
>> Date: Sat, 1 Nov 2014 15:57:59 -0700
>>
>>> @@ -20,7 +20,16 @@ static size_t fou_encap_hlen(struct ip_tunnel_encap *e)
>>>
>>> static size_t gue_encap_hlen(struct ip_tunnel_encap *e)
>>> {
>>> - return sizeof(struct udphdr) + sizeof(struct guehdr);
>>> + size_t len;
>>> + bool need_priv = false;
>>> +
>>> + len = sizeof(struct udphdr) + sizeof(struct guehdr);
>>> +
>>> + /* Add in lengths flags */
>>> +
>>> + len += need_priv ? GUE_LEN_PRIV : 0;
>>
>> Add this need_priv logic in patch #6, not here.
>
> I would rather keep it in this patch. This is adding the common
> infrastructure to support private option field, remote checksum
> offload is an instance that uses that.
Tom, it evaluates always to a constant boolean, and contextually makes
no sense to someone reviewing this change in isolation.
Please, as I have asked, put this in the patch where the logic
actually matters.
^ permalink raw reply
* Re: [stable request <= 3.11] net/mlx4_en: Fix BlueFlame race
From: David Miller @ 2014-11-03 20:09 UTC (permalink / raw)
To: cwang; +Cc: ben, vlee, amirv, ogerlitz, jackm, eugenia, matanb, netdev
In-Reply-To: <CAHA+R7OMgOpnWyy5E55OtLS4naOcZ5EhzViE14Yffu=K=TqNcA@mail.gmail.com>
From: Cong Wang <cwang@twopensource.com>
Date: Mon, 3 Nov 2014 09:22:18 -0800
> On Sat, Nov 1, 2014 at 10:41 AM, David Miller <davem@davemloft.net> wrote:
>>
>> There is no documented way nor do I wish to state anything so strictly.
>> I want maximum flexibility for such a time consuming task.
>>
>> I tend to go back 3 or 4 releases at most, and it really depends upon
>> the difficulty of the backports and my own time constraints.
>
> You should really offload to developers, otherwise too much work for you. :)
That's exactly what I am doing by having the -stable maintainers for older
releases deal with the backports and other pains.
^ permalink raw reply
* Re: TCP NewReno and single retransmit
From: Neal Cardwell @ 2014-11-03 20:08 UTC (permalink / raw)
To: Marcelo Ricardo Leitner; +Cc: Yuchung Cheng, netdev, Eric Dumazet
In-Reply-To: <5457AF6D.6010105@redhat.com>
On Mon, Nov 3, 2014 at 11:38 AM, Marcelo Ricardo Leitner
<mleitner@redhat.com> wrote:
> On 31-10-2014 01:51, Yuchung Cheng wrote:
>>>>> if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
>>>>> /* Hold old state until something *above* high_seq
>>>>> * is ACKed. For Reno it is MUST to prevent false
>>>>> * fast retransmits (RFC2582). SACK TCP is safe. */
>>
>> Or we can just remove this strange state-holding logic?
>>
>> I couldn't find such a "MUST" statement in RFC2582. RFC2582 section 3
>> step 5 suggests exiting the recovery procedure when an ACK acknowledges
>> the "recover" variable (== tp->high_seq - 1).
>>
>> Since we've called tcp_reset_reno_sack() before tcp_try_undo_recovery(),
>> I couldn't see how false fast retransmits can be triggered without
>> this state-holding.
>>
>> Any insights?
>
>
> Nice one, me neither. Neal?
Since there are no literal IETF-style "MUST" statements in RFC2582, I
think the "MUST" in the code here is expressing the design philosophy
of the author. :-)
AFAICT the "Hold old state" code in tcp_try_undo_recovery() is a
pretty reasonable implementation of a mechanism specified in NewReno
RFCs to deal with the fundamental ambiguity between (1) dupacks for
packets the receiver received above a hole above snd_una, and (2)
dupacks for spurious retransmits below snd_una. I think the motivation
behind the "Hold old state" code is to stay in Recovery so that we do
not treat (2) dupacks as (1) dupacks.
I find RFC 2582 not very clear on this point, and the newest NewReno
RFC, RFC 6582, is also not so clear IMHO. But the RFC 3782 version of
NewReno - https://tools.ietf.org/html/rfc3782 - has a reasonable
discussion of this issue. There is a discussion in
https://tools.ietf.org/html/rfc3782#section-11
of this ambiguity:
   There are two separate scenarios in which the TCP sender could
   receive three duplicate acknowledgements acknowledging "recover" but
   no more than "recover". One scenario would be that the data sender
   transmitted four packets with sequence numbers higher than "recover",
   that the first packet was dropped in the network, and the following
   three packets triggered three duplicate acknowledgements
   acknowledging "recover". The second scenario would be that the
   sender unnecessarily retransmitted three packets below "recover", and
   that these three packets triggered three duplicate acknowledgements
   acknowledging "recover". In the absence of SACK, the TCP sender is
   unable to distinguish between these two scenarios.
AFAICT RFC 3782 uses the term "recover" for the sequence number Linux
calls tp->high_seq. The specification in RFC 3782 Sec 3 -
https://tools.ietf.org/html/rfc3782#section-3 - of the criteria for
entering Fast Recovery says that we shouldn't go into a new recovery if
the cumulative ACK field doesn't cover more than high_seq/"recover":
   1)  Three duplicate ACKs:
       When the third duplicate ACK is received and the sender is not
       already in the Fast Recovery procedure, check to see if the
       Cumulative Acknowledgement field covers more than "recover". If
       so, go to Step 1A. Otherwise, go to Step 1B.
   1A) Invoking Fast Retransmit:
       ...
       In addition, record the highest sequence number transmitted in
       the variable "recover", and go to Step 2.
   1B) Not invoking Fast Retransmit:
       ...
The last few slides of this presentation by Sally Floyd also talk
about this point:
http://www.icir.org/floyd/talks/newreno-Mar03.pdf
neal
^ permalink raw reply
* Re: [0/3] net: Kill skb_copy_datagram_const_iovec
From: David Miller @ 2014-11-03 20:05 UTC (permalink / raw)
To: herbert; +Cc: viro, netdev, linux-kernel, bcrl
In-Reply-To: <20141103053751.GA27845@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 13:37:51 +0800
> On Mon, Nov 03, 2014 at 12:45:03AM +0000, Al Viro wrote:
>>
>> Note, BTW, that there's a damn good reason to convert the socket side of
>> things to iov_iter - as it is, ->splice_write() there is basically done with
>> page-by-page mapping and doing kernel_sendmsg(); being able to deal with
>> "map and copy" stuff *inside* ->sendmsg() would not only reduce the overhead,
>> it would allow to get rid of ->sendpage() completely. Basically, let
>> ->sendmsg() instances check the iov_iter type and play zerocopy games if
>> it's an "array of kernel pages" kind. Compare ->sendpage() and ->sendmsg()
>> instances for the protocols that have nontrivial ->sendpage(); you'll see
>> that there's a lot of duplication. Merging them looks very feasible, with
>> divergence happening only very deep in the call chain.
>
> Honestly I don't really care which way we end up going as long as
> we pick one solution and stick with it. Right now we have an
> abomination in the form of skb_copy_datagram_const_iovec which is
> the worst of both worlds, plus it duplicates tons of code.
>
> So here's a few patches to kill this crap.
I totally agree with picking one direction and going with it.
But a patch set like this as an interim solution, I am not so happy
with.
If the method says const, we have a contract with the caller to not
modify the iovec. That caller can assume that we have not done so.
So this patch set violates that contract and can result in real bugs
either now or in the future.
I'll see if I can make some progress converting the networking over
to iov_iter. It can't be that difficult... albeit perhaps a little
time consuming.
^ permalink raw reply
* [PATCH v2 3/3] drivers: net: xgene: fix: Use separate resources
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
To: davem, netdev, devicetree
Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>
This patch fixes the following kernel crash during SGMII based 1GbE probe.
BUG: Bad page state in process swapper/0 pfn:40fe6ad
page:ffffffbee37a75d8 count:-1 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: nonzero _count
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0+ #7
Call trace:
[<ffffffc000087fa0>] dump_backtrace+0x0/0x12c
[<ffffffc0000880dc>] show_stack+0x10/0x1c
[<ffffffc0004d981c>] dump_stack+0x74/0xc4
[<ffffffc00012fe70>] bad_page+0xd8/0x128
[<ffffffc000133000>] get_page_from_freelist+0x4b8/0x640
[<ffffffc000133260>] __alloc_pages_nodemask+0xd8/0x834
[<ffffffc0004194f8>] __netdev_alloc_frag+0x124/0x1b8
[<ffffffc00041bfdc>] __netdev_alloc_skb+0x90/0x10c
[<ffffffc00039ff30>] xgene_enet_refill_bufpool+0x11c/0x280
[<ffffffc0003a11a4>] xgene_enet_process_ring+0x168/0x340
[<ffffffc0003a1498>] xgene_enet_napi+0x1c/0x50
[<ffffffc00042b454>] net_rx_action+0xc8/0x18c
[<ffffffc0000b0880>] __do_softirq+0x114/0x24c
[<ffffffc0000b0c34>] irq_exit+0x94/0xc8
[<ffffffc0000e68a0>] __handle_domain_irq+0x8c/0xf4
[<ffffffc000081288>] gic_handle_irq+0x30/0x7c
This was due to a hardware resource sharing conflict with the firmware. This
patch fixes this crash by using resources (descriptor ring, prefetch buffer)
that are not shared.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
---
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 6 +++---
drivers/net/ethernet/apm/xgene/xgene_enet_main.h | 3 +++
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index cc3f955..1236696 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -639,9 +639,9 @@ static int xgene_enet_create_desc_rings(struct net_device *ndev)
struct device *dev = ndev_to_dev(ndev);
struct xgene_enet_desc_ring *rx_ring, *tx_ring, *cp_ring;
struct xgene_enet_desc_ring *buf_pool = NULL;
- u8 cpu_bufnum = 0, eth_bufnum = 0;
- u8 bp_bufnum = 0x20;
- u16 ring_id, ring_num = 0;
+ u8 cpu_bufnum = 0, eth_bufnum = START_ETH_BUFNUM;
+ u8 bp_bufnum = START_BP_BUFNUM;
+ u16 ring_id, ring_num = START_RING_NUM;
int ret;
/* allocate rx descriptor ring */
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index dba647d..f9958fa 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -38,6 +38,9 @@
#define SKB_BUFFER_SIZE (XGENE_ENET_MAX_MTU - NET_IP_ALIGN)
#define NUM_PKT_BUF 64
#define NUM_BUFPOOL 32
+#define START_ETH_BUFNUM 2
+#define START_BP_BUFNUM 0x22
+#define START_RING_NUM 8
#define PHY_POLL_LINK_ON (10 * HZ)
#define PHY_POLL_LINK_OFF (PHY_POLL_LINK_ON / 5)
--
1.9.1
^ permalink raw reply related
* [PATCH v2 2/3] drivers: net: xgene: Backward compatibility with older firmware
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
To: davem, netdev, devicetree
Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>
This patch adds support for running with older firmware (<= 1.13.28).
- Added xgene_ring_mgr_init() to check whether ring manager is initialized
- Calling xgene_ring_mgr_init() from xgene_port_ops.reset()
- To handle errors, changed the return type of xgene_port_ops.reset()
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
---
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c | 18 +++++++++++++++++-
drivers/net/ethernet/apm/xgene/xgene_enet_hw.h | 4 ++++
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 5 ++++-
drivers/net/ethernet/apm/xgene/xgene_enet_main.h | 2 +-
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c | 7 ++++++-
drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c | 7 ++++++-
6 files changed, 38 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 63ea194..7ba83ff 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -575,10 +575,24 @@ static void xgene_gmac_tx_disable(struct xgene_enet_pdata *pdata)
xgene_enet_wr_mcx_mac(pdata, MAC_CONFIG_1_ADDR, data & ~TX_EN);
}
-static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
+bool xgene_ring_mgr_init(struct xgene_enet_pdata *p)
+{
+ if (!ioread32(p->ring_csr_addr + CLKEN_ADDR))
+ return false;
+
+ if (ioread32(p->ring_csr_addr + SRST_ADDR))
+ return false;
+
+ return true;
+}
+
+static int xgene_enet_reset(struct xgene_enet_pdata *pdata)
{
u32 val;
+ if (!xgene_ring_mgr_init(pdata))
+ return -ENODEV;
+
clk_prepare_enable(pdata->clk);
clk_disable_unprepare(pdata->clk);
clk_prepare_enable(pdata->clk);
@@ -590,6 +604,8 @@ static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
val |= SCAN_AUTO_INCR;
MGMT_CLOCK_SEL_SET(&val, 1);
xgene_enet_wr_mcx_mac(pdata, MII_MGMT_CONFIG_ADDR, val);
+
+ return 0;
}
static void xgene_gport_shutdown(struct xgene_enet_pdata *pdata)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index 3855858..ec45f32 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -104,6 +104,9 @@ enum xgene_enet_rm {
#define BLOCK_ETH_MAC_OFFSET 0x0000
#define BLOCK_ETH_MAC_CSR_OFFSET 0x2800
+#define CLKEN_ADDR 0xc208
+#define SRST_ADDR 0xc200
+
#define MAC_ADDR_REG_OFFSET 0x00
#define MAC_COMMAND_REG_OFFSET 0x04
#define MAC_WRITE_REG_OFFSET 0x08
@@ -318,6 +321,7 @@ void xgene_enet_parse_error(struct xgene_enet_desc_ring *ring,
int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata);
void xgene_enet_mdio_remove(struct xgene_enet_pdata *pdata);
+bool xgene_ring_mgr_init(struct xgene_enet_pdata *p);
extern struct xgene_mac_ops xgene_gmac_ops;
extern struct xgene_port_ops xgene_gport_ops;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 3c208cc..cc3f955 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -852,7 +852,9 @@ static int xgene_enet_init_hw(struct xgene_enet_pdata *pdata)
u16 dst_ring_num;
int ret;
- pdata->port_ops->reset(pdata);
+ ret = pdata->port_ops->reset(pdata);
+ if (ret)
+ return ret;
ret = xgene_enet_create_desc_rings(ndev);
if (ret) {
@@ -954,6 +956,7 @@ static int xgene_enet_probe(struct platform_device *pdev)
return ret;
err:
+ unregister_netdev(ndev);
free_netdev(ndev);
return ret;
}
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index 874e5a0..dba647d 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -83,7 +83,7 @@ struct xgene_mac_ops {
};
struct xgene_port_ops {
- void (*reset)(struct xgene_enet_pdata *pdata);
+ int (*reset)(struct xgene_enet_pdata *pdata);
void (*cle_bypass)(struct xgene_enet_pdata *pdata,
u32 dst_ring_num, u16 bufpool_id);
void (*shutdown)(struct xgene_enet_pdata *pdata);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
index c22f326..f5d4f68 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
@@ -311,14 +311,19 @@ static void xgene_sgmac_tx_disable(struct xgene_enet_pdata *p)
xgene_sgmac_rxtx(p, TX_EN, false);
}
-static void xgene_enet_reset(struct xgene_enet_pdata *p)
+static int xgene_enet_reset(struct xgene_enet_pdata *p)
{
+ if (!xgene_ring_mgr_init(p))
+ return -ENODEV;
+
clk_prepare_enable(p->clk);
clk_disable_unprepare(p->clk);
clk_prepare_enable(p->clk);
xgene_enet_ecc_init(p);
xgene_enet_config_ring_if_assoc(p);
+
+ return 0;
}
static void xgene_enet_cle_bypass(struct xgene_enet_pdata *p,
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
index 67d0720..a18a9d1 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
@@ -252,14 +252,19 @@ static void xgene_xgmac_tx_disable(struct xgene_enet_pdata *pdata)
xgene_enet_wr_mac(pdata, AXGMAC_CONFIG_1, data & ~HSTTFEN);
}
-static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
+static int xgene_enet_reset(struct xgene_enet_pdata *pdata)
{
+ if (!xgene_ring_mgr_init(pdata))
+ return -ENODEV;
+
clk_prepare_enable(pdata->clk);
clk_disable_unprepare(pdata->clk);
clk_prepare_enable(pdata->clk);
xgene_enet_ecc_init(pdata);
xgene_enet_config_ring_if_assoc(pdata);
+
+ return 0;
}
static void xgene_enet_xgcle_bypass(struct xgene_enet_pdata *pdata,
--
1.9.1
^ permalink raw reply related
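Patch 2/3 above reads two 32-bit registers at CLKEN_ADDR and SRST_ADDR relative to ring_csr_addr, and the first hunk of patch 1/3 grows a "ring_csr" reg length from 0X400 to 0Xc300. The arithmetic below shows why offsets like these need the larger window. Treating the two changes as connected is a reading of the series, not something the commit messages state, and `reg32_fits()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CLKEN_ADDR 0xc208u   /* register offsets as defined in patch 2/3 */
#define SRST_ADDR  0xc200u

/* True if a 32-bit register at offset lies fully inside a window of size bytes. */
static bool reg32_fits(uint32_t offset, uint32_t size)
{
	assert(offset % 4 == 0);  /* 32-bit registers are assumed word aligned */
	return size >= 4 && offset <= size - 4;
}
```

Here `reg32_fits(CLKEN_ADDR, 0x400)` is false, since 0xc208 lies far beyond a 0x400 byte window, while `reg32_fits(CLKEN_ADDR, 0xc300)` and `reg32_fits(SRST_ADDR, 0xc300)` are both true.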
* [PATCH v2 1/3] dtb: xgene: fix: Backward compatibility with older firmware
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
To: davem, netdev, devicetree
Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>
The following kernel crash was reported when using older firmware (<= 1.13.28).
[ 0.980000] libphy: APM X-Gene MDIO bus: probed
[ 1.130000] Unhandled fault: synchronous external abort (0x96000010) at 0xffffff800009a17c
[ 1.140000] Internal error: : 96000010 [#1] SMP
[ 1.140000] Modules linked in:
[ 1.140000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0+ #21
[ 1.140000] task: ffffffc3f0110000 ti: ffffffc3f0064000 task.ti: ffffffc3f0064000
[ 1.140000] PC is at ioread32+0x58/0x68
[ 1.140000] LR is at xgene_enet_setup_ring+0x18c/0x1cc
[ 1.140000] pc : [<ffffffc0003cec68>] lr : [<ffffffc00053dad8>] pstate: a0000045
[ 1.140000] sp : ffffffc3f0067b20
[ 1.140000] x29: ffffffc3f0067b20 x28: ffffffc000aa8ea0
[ 1.140000] x27: ffffffc000bb2000 x26: ffffffc000a64270
[ 1.140000] x25: ffffffc000b05ad8 x24: ffffffc0ff99ba58
[ 1.140000] x23: 0000000000004000 x22: 0000000000004000
[ 1.140000] x21: 0000000000000200 x20: 0000000000200000
[ 1.140000] x19: ffffffc0ff99ba18 x18: ffffffc0007a6000
[ 1.140000] x17: 0000000000000007 x16: 000000000000000e
[ 1.140000] x15: 0000000000000001 x14: 0000000000000000
[ 1.140000] x13: ffffffbeedb71320 x12: 00000000ffffff80
[ 1.140000] x11: 0000000000000002 x10: 0000000000000000
[ 1.140000] x9 : 0000000000000000 x8 : ffffffc3eb2a4000
[ 1.140000] x7 : 0000000000000000 x6 : 0000000000000000
[ 1.140000] x5 : 0000000001080000 x4 : 000000007d654010
[ 1.140000] x3 : ffffffffffffffff x2 : 000000000003ffff
[ 1.140000] x1 : ffffff800009a17c x0 : ffffff800009a17c
The issue was that the older firmware does not support 10GbE and
SGMII based 1GbE interfaces.
This patch changes the address length of the reg property of the sgmii0 and xgmii
nodes and serves as a preparatory patch for the fix.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Reported-by: Dann Frazier <dann.frazier@canonical.com>
---
arch/arm64/boot/dts/apm-storm.dtsi | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/boot/dts/apm-storm.dtsi b/arch/arm64/boot/dts/apm-storm.dtsi
index 295c72d..f1ad9c2 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -599,7 +599,7 @@
compatible = "apm,xgene-enet";
status = "disabled";
reg = <0x0 0x17020000 0x0 0xd100>,
- <0x0 0X17030000 0x0 0X400>,
+ <0x0 0X17030000 0x0 0Xc300>,
<0x0 0X10000000 0x0 0X200>;
reg-names = "enet_csr", "ring_csr", "ring_cmd";
interrupts = <0x0 0x3c 0x4>;
@@ -624,9 +624,9 @@
sgenet0: ethernet@1f210000 {
compatible = "apm,xgene-enet";
status = "disabled";
- reg = <0x0 0x1f210000 0x0 0x10000>,
- <0x0 0x1f200000 0x0 0X10000>,
- <0x0 0x1B000000 0x0 0X20000>;
+ reg = <0x0 0x1f210000 0x0 0xd100>,
+ <0x0 0x1f200000 0x0 0Xc300>,
+ <0x0 0x1B000000 0x0 0X200>;
reg-names = "enet_csr", "ring_csr", "ring_cmd";
interrupts = <0x0 0xA0 0x4>;
dma-coherent;
@@ -639,7 +639,7 @@
compatible = "apm,xgene-enet";
status = "disabled";
reg = <0x0 0x1f610000 0x0 0xd100>,
- <0x0 0x1f600000 0x0 0X400>,
+ <0x0 0x1f600000 0x0 0Xc300>,
<0x0 0x18000000 0x0 0X200>;
reg-names = "enet_csr", "ring_csr", "ring_cmd";
interrupts = <0x0 0x60 0x4>;
--
1.9.1
^ permalink raw reply related
* [PATCH v2 0/3] drivers: net: xgene: Fix crash for backward compatibility
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
To: davem, netdev, devicetree
Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
This patch set fixes the following issues that were reported during regression.
Patch 1,2 : Adds backward compatibility with the older firmware (<= 1.13.28).
Patch 3 : Use separate hardware resources (descriptor ring, prefetch buffer)
that are not shared with the firmware
---
Iyappan Subramanian (3):
dtb: xgene: fix: Backward compatibility with older firmware
drivers: net: xgene: Backward compatibility with older firmware
drivers: net: xgene: fix: Use separate resources
arch/arm64/boot/dts/apm-storm.dtsi | 10 +++++-----
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c | 18 +++++++++++++++++-
drivers/net/ethernet/apm/xgene/xgene_enet_hw.h | 4 ++++
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 11 +++++++----
drivers/net/ethernet/apm/xgene/xgene_enet_main.h | 5 ++++-
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c | 7 ++++++-
drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c | 7 ++++++-
7 files changed, 49 insertions(+), 13 deletions(-)
--
1.9.1
^ permalink raw reply
* Re: macvtap: Fix csum_start when VLAN tags are present
From: David Miller @ 2014-11-03 19:52 UTC (permalink / raw)
To: herbert; +Cc: netdev
In-Reply-To: <20141103060125.GA28295@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 14:01:25 +0800
> When VLAN is in use in macvtap_put_user, we end up setting
> csum_start to the wrong place. The result is that whoever
> ends up doing the checksum setting will corrupt the packet instead
> of writing the checksum to the expected location; usually this
> means writing the checksum with an offset of -4.
>
> This patch fixes this by adjusting csum_start when VLAN tags are
> detected.
>
> Fixes: f09e2249c4f5 ("macvtap: restore vlan header on user read")
> Cc: stable@vger.kernel.org
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied, thanks Herbert.
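A minimal sketch of the adjustment described in the quoted commit message, written as a userspace model rather than the actual macvtap code. The function name and the `VLAN_TAG_LEN` constant are hypothetical; the only facts relied on are that an 802.1Q tag is 4 bytes long and that the checksum was landing at an offset of -4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_TAG_LEN 4u /* an 802.1Q tag occupies 4 bytes (hypothetical name) */

/*
 * Return the csum_start value to report for a frame handed to user space.
 * When a VLAN tag is re-inserted in front of the payload, everything after
 * the tag moves by VLAN_TAG_LEN bytes, so csum_start must move with it.
 */
uint16_t csum_start_for_user(uint16_t csum_start, bool vlan_tag_present)
{
    uint32_t adjusted = csum_start;

    if (vlan_tag_present)
        adjusted += VLAN_TAG_LEN;

    /* The field is 16 bits wide in this model; refuse to wrap silently. */
    assert(adjusted <= UINT16_MAX);
    return (uint16_t)adjusted;
}
```

Without the adjustment, the value for a tagged frame would be 4 bytes too small, which matches the "-4" symptom in the report.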
^ permalink raw reply
* Re: [PATCH] net: fec: fix suspend broken on multiple MACs sillicons
From: David Miller @ 2014-11-03 19:50 UTC (permalink / raw)
To: b38611; +Cc: netdev, festevam
In-Reply-To: <1414992410-27886-1-git-send-email-b38611@freescale.com>
From: Fugang Duan <b38611@freescale.com>
Date: Mon, 3 Nov 2014 13:26:50 +0800
> On the i.MX6SX SDB platform, there are two identical enet MACs. After system boot,
> only eth0 is up, and then a suspend/resume test is run:
...
> The root cause is that eth1 is not opened and its clock is not enabled, yet .suspend() still
> calls .fec_enet_clk_enable() to disable the clock.
>
> To avoid the breakage, check whether the network device is up by calling .netif_running()
> before disabling/enabling clocks.
>
> Signed-off-by: Fugang Duan <B38611@freescale.com>
Applied, thanks!
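The guard described in the quoted commit message can be sketched as a small userspace model. This is not the fec driver code: `struct mac_model`, its fields and the three functions are hypothetical stand-ins, with `running` playing the role of netif_running() and `clk_refcount` the role of the clock enable count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one Ethernet MAC. */
struct mac_model {
    bool running;     /* stands in for netif_running(ndev) */
    int clk_refcount; /* stands in for the clock enable count */
};

/* Bringing the interface up takes one clock reference. */
void mac_open(struct mac_model *m)
{
    assert(m != NULL);
    m->running = true;
    m->clk_refcount++;
}

/* Suspend drops the clock only if open() previously took it. */
void mac_suspend(struct mac_model *m)
{
    assert(m != NULL);
    if (m->running)
        m->clk_refcount--;
}

/* Resume re-enables the clock only for interfaces that are up. */
void mac_resume(struct mac_model *m)
{
    assert(m != NULL);
    if (m->running)
        m->clk_refcount++;
}
```

With two MACs where only the first is opened, a suspend/resume cycle leaves the second MAC's count at zero instead of driving it negative, which is the unbalanced disable the patch is meant to avoid.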
^ permalink raw reply
* Re: [PATCH v1 1/2] dtb: xgene: fix: Disable 10GbE and SGMII based 1GbE by default
From: Iyappan Subramanian @ 2014-11-03 19:45 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-arm-kernel@lists.infradead.org, David Miller, netdev,
devicetree@vger.kernel.org, Keyur Chudgar, patches
In-Reply-To: <1991367.DGHysZpQVP@wuerfel>
Hi Arnd,
On Thu, Oct 30, 2014 at 3:13 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 29 October 2014 17:56:19 Iyappan Subramanian wrote:
>> @@ -621,7 +621,7 @@
>> };
>> };
>>
>> - sgenet0: ethernet@1f210000 {
>> + sgenet0: sgenet@1f210000 {
>> compatible = "apm,xgene-enet";
>> status = "disabled";
>> reg = <0x0 0x1f210000 0x0 0x10000>,
>>
>
> This looks like you accidentally reverted a bug fix made earlier.
> Network devices should always have the name 'ethernet@...'.
Thanks for the review. Since our firmware was patching the dtb based
on the node name, we thought that by changing the node name we could avoid the
patching and maintain backward compatibility.
Now that we know that network devices should be named 'ethernet@...', we will
handle the backward compatibility in a different way and will post
patch v2 shortly.
>
> Arnd
^ permalink raw reply
* Re: [PATCH] ipv6: do xfrm transform after nat if necessary
From: David Miller @ 2014-11-03 19:42 UTC (permalink / raw)
To: duanj.fnst; +Cc: netdev
In-Reply-To: <54570A2F.2070206@cn.fujitsu.com>
From: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Date: Mon, 3 Nov 2014 12:53:03 +0800
>
>
> In function nf_nat_ipv6_out, after nat is done, nf_xfrm_me_harder()
> will be called to look up xfrm dst.
>
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
This is far from sufficient of a commit log message for a change that
is as serious and has as many implications as this one.
You haven't answered many questions, first of which in my mind is
why we are bypassing all of the fragmentation checks?
We're also bypassing ip6_finish_output2() which does multicast and
hooks up the neighbour.
IPV4 doesn't do this, why doesn't it have the same supposed problem
you are trying to solve?
It is not even clear to me what the problem is, because your commit
message is way too terse.
^ permalink raw reply
* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Eric Dumazet @ 2014-11-03 19:41 UTC (permalink / raw)
To: Tomasz Mloduchowski; +Cc: netdev
In-Reply-To: <5457D095.9070502@qdot.me>
On Mon, 2014-11-03 at 19:59 +0100, Tomasz Mloduchowski wrote:
> Hi List,
>
> I hope this is the right place to report a networking issue with
> 3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
> rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
> quickly exhausts tcp_mem limits in a very strange sequence of events.
>
> It might be scheduler or networking subsystem related.
>
> It's 100% reproducible on my system, first observed under 3.18.0-rc2.
>
Sounds like a perfect case for a bisection, maybe?
^ permalink raw reply
* Re: [0/2] tun: Fix csum_start and TUN_PKT_STRIP
From: David Miller @ 2014-11-03 19:28 UTC (permalink / raw)
To: herbert; +Cc: netdev
In-Reply-To: <20141102202929.GA24935@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 04:29:30 +0800
> The first patch fixes a serious problem that breaks checksum offload
> in VMs while the second patch fixes a problem that probably affects
> no one.
Series applied, thanks Herbert.
^ permalink raw reply
* Re: [PATCH] net: eth: realtek: atp: checkpatch errors and warnings corrected
From: Roberto Medina @ 2014-11-03 19:09 UTC (permalink / raw)
To: Joe Perches; +Cc: netdev, linux-kernel
In-Reply-To: <1415040686.17743.31.camel@perches.com>
On 11/03/2014 07:51 PM, Joe Perches wrote:
>
> Some ancient drivers could be regarded as neolithic
> curiosities that never need updating. This may be one.
>
> But if you really want to change it, could you please
> make sure that objdiff shows no changes?
>
I see some changes with objdiff; maybe they are caused by the include
file that I changed to linux/io.h instead of asm/io.h.
--- /home/vov/Git/linux-next/.tmp_objdiff/a641d0e/drivers/net/ethernet/realtek/atp.dis 2014-11-03 19:59:18.723954900 +0100
+++ /home/vov/Git/linux-next/.tmp_objdiff/5f19b70/drivers/net/ethernet/realtek/atp.dis 2014-11-03 20:00:34.133954217 +0100
@@ -1753,9 +1753,8 @@
...
: e3 0a jrcxz 4c4 <__gcov_.atp_probe1+0x14>
: 00 00 add %al,(%rax)
-: e2 e2 loop 4a0 <__gcov_.atp_init+0x20>
-: 9c pushfq
-: 9a (bad)
+: 4b b4 98 rex.WXB mov $0x98,%r12b
+: ac lods %ds:(%rsi),%al
: 67 2f addr32 (bad)
: e4 e3 in $0xe3,%al
: 00 00 add %al,(%rax)
@@ -1767,30 +1766,37 @@
...
: e4 0a in $0xa,%al
: 00 00 add %al,(%rax)
-: a5 movsl %ds:(%rsi),%es:(%rdi)
-: 21 4f ca and %ecx,-0x36(%rdi)
-: a0 82 0e 4f 00 00 00 movabs 0x7000000004f0e82,%al
-: 00 07
+: d6 (bad)
+: ad lods %ds:(%rsi),%eax
+: 51 push %rcx
+: 70 a0 jo 491 <__gcov_.atp_init+0x11>
+: 82 (bad)
+: 0e (bad)
+: 4f 00 00 rex.WRXB add %r8b,(%r8)
+: 00 00 add %al,(%rax)
+: 07 (bad)
...
0000000000000510 <__gcov_.eeprom_op>:
...
: e5 0a in $0xa,%eax
: 00 00 add %al,(%rax)
-: cd c3 int $0xc3
-: 0f c4 5c 4b fc 84 pinsrw $0x84,-0x4(%rbx,%rcx,2),%mm3
-: 00 00 add %al,(%rax)
+: 54 push %rsp
+: 0e (bad)
+: dd ca (bad)
+: 5c pop %rsp
+: 4b fc rex.WXB cld
+: 84 00 test %al,(%rax)
: 00 00 add %al,(%rax)
-: 07 (bad)
+: 00 07 add %al,(%rdi)
...
0000000000000540 <__gcov_.net_open>:
...
: e6 0a out %al,$0xa
: 00 00 add %al,(%rax)
-: 52 push %rdx
-: e7 5d out %eax,$0x5d
-: 8f (bad)
+: 0f bf 0e movswl (%rsi),%ecx
+: 4f rex.WRXB
: 66 af scas %es:(%rdi),%ax
: e2 a2 loop 4f6 <__gcov_.get_node_ID+0x16>
: 00 00 add %al,(%rax)
@@ -1802,15 +1808,16 @@
...
: e7 0a out %eax,$0xa
: 00 00 add %al,(%rax)
-: a9 17 a9 64 dc test $0xdc64a917,%eax
-: 20 8c 1d 00 00 00 00 and %cl,0x0(%rbp,%rbx,1)
+: 21 8a b7 70 dc 20 and %ecx,0x20dc70b7(%rdx)
+: 8c 1d 00 00 00 00 mov %ds,0x0(%rip) # 588
<__gcov_.hardware_init+0x18>
: 0f 00 00 sldt (%rax)
...
00000000000005a0 <__gcov_.trigger_send>:
...
-: e8 0a 00 00 b9 callq ffffffffb90005b7
<net_open+0xffffffffb8ffeeb8>
-: d3 0c 12 rorl %cl,(%rdx,%rdx,1)
+: e8 0a 00 00 38 callq 380005b7 <net_open+0x37ffeeb8>
+: 2e cs
+: e7 bb out %eax,$0xbb
: b9 43 f3 81 00 mov $0x81f343,%ecx
: 00 00 add %al,(%rax)
: 00 04 00 add %al,(%rax,%rax,1)
@@ -1818,10 +1825,10 @@
00000000000005d0 <__gcov_.write_packet>:
...
-: e9 0a 00 00 63 jmpq 630005e7 <net_open+0x62ffeee8>
-: e5 05 in $0x5,%eax
-: 96 xchg %eax,%esi
-: 27 (bad)
+: e9 0a 00 00 05 jmpq 50005e7 <net_open+0x4ffeee8>
+: 54 push %rsp
+: 5f pop %rdi
+: 23 27 and (%rdi),%esp
: 0d e5 2c 00 00 or $0x2ce5,%eax
: 00 00 add %al,(%rax)
: 15 00 00 00 00 adc $0x0,%eax
@@ -1831,7 +1838,7 @@
...
: ea (bad)
: 0a 00 or (%rax),%al
-: 00 2d d0 35 4c 7a add %ch,0x7a4c35d0(%rip) # 7a4c3be1
<net_open+0x7a4c24e2>
+: 00 8e bf 72 94 7a add %cl,0x7a9472bf(%rsi)
: c4 61 e7 00 (bad)
: 00 00 add %al,(%rax)
: 00 06 add %al,(%rsi)
@@ -1841,9 +1848,11 @@
...
: eb 0a jmp 644 <__gcov_.atp_send_packet+0x14>
: 00 00 add %al,(%rax)
-: 78 ac js 5ea <__gcov_.write_packet+0x1a>
-: 9b fwait
-: 4f 63 b6 a7 f3 00 00 rex.WRXB movslq 0xf3a7(%r14),%r14
+: 4e ed rex.WRX in (%dx),%eax
+: 5b pop %rbx
+: dd 63 b6 frstor -0x4a(%rbx)
+: a7 cmpsl %es:(%rdi),%ds:(%rsi)
+: f3 00 00 repz add %al,(%rax)
: 00 00 add %al,(%rax)
: 0c 00 or $0x0,%al
...
@@ -1852,9 +1861,10 @@
...
: ec in (%dx),%al
: 0a 00 or (%rax),%al
-: 00 66 37 add %ah,0x37(%rsi)
-: 2f (bad)
-: d0 3e sarb (%rsi)
+: 00 f9 add %bh,%cl
+: 64 4d fs rex.WRB
+: 44 rex.R
+: 3e ds
: 16 (bad)
: 72 d7 jb 64b <__gcov_.atp_send_packet+0x1b>
: 00 00 add %al,(%rax)
@@ -1866,8 +1876,9 @@
...
: ed in (%dx),%eax
: 0a 00 or (%rax),%al
-: 00 50 2e add %dl,0x2e(%rax)
-: 24 2a and $0x2a,%al
+: 00 5a 73 add %bl,0x73(%rdx)
+: 56 push %rsi
+: cf iret
: b0 28 mov $0x28,%al
: 9e sahf
: 0c 00 or $0x0,%al
@@ -1879,9 +1890,9 @@
...
: ee out %al,(%dx)
: 0a 00 or (%rax),%al
-: 00 60 1b add %ah,0x1b(%rax)
-: f5 cmc
-: b9 ca 01 fa fa mov $0xfafa01ca,%ecx
+: 00 85 af 2d b0 ca add %al,-0x354fd251(%rbp)
+: 01 fa add %edi,%edx
+: fa cli
: 00 00 add %al,(%rax)
: 00 00 add %al,(%rax)
: 17 (bad)
@@ -1891,7 +1902,8 @@
...
: ef out %eax,(%dx)
: 0a 00 or (%rax),%al
-: 00 b3 fb 8e ed 58 add %dh,0x58ed8efb(%rbx)
+: 00 e8 add %ch,%al
+: 09 7c 81 58 or %edi,0x58(%rcx,%rax,4)
: 32 13 xor (%rbx),%dl
: 9d popfq
: 00 00 add %al,(%rax)
@@ -1902,7 +1914,8 @@
0000000000000720 <__gcov_.net_close>:
...
: f0 0a 00 lock or (%rax),%al
-: 00 ae b2 85 8a 2a add %ch,0x2a8a85b2(%rsi)
+: 00 54 b3 8a add %dl,-0x76(%rbx,%rsi,4)
+: 3b 2a cmp (%rdx),%ebp
: 24 53 and $0x53,%al
: 13 00 adc (%rax),%eax
: 00 00 add %al,(%rax)
@@ -1913,11 +1926,7 @@
...
: f1 icebp
: 0a 00 or (%rax),%al
-: 00 20 add %ah,(%rax)
-: f5 cmc
-: d4 (bad)
-: c7 (bad)
-: 96 xchg %eax,%esi
+: 00 b5 7a 16 07 96 add %dh,-0x69f8e986(%rbp)
: df 25 67 00 00 00 fbld 0x67(%rip) # 7ce
<__gcov_.atp_init_module+0x4e>
: 00 04 00 add %al,(%rax,%rax,1)
...
@@ -1925,10 +1934,7 @@
0000000000000780 <__gcov_.atp_init_module>:
...
: f2 0a 00 repnz or (%rax),%al
-: 00 3f add %bh,(%rdi)
-: 2e cs
-: 04 c9 add $0xc9,%al
-: 9e sahf
+: 00 a6 4d 6b ed 9e add %ah,-0x611294b3(%rsi)
: 66 data16
: b2 b6 mov $0xb6,%dl
: 00 00 add %al,(%rax)
@@ -1940,7 +1946,8 @@
: 30 34 00 xor %dh,(%rax,%rax,1)
...
: 00 00 add %al,(%rax)
-: 00 6c 48 04 add %ch,0x4(%rax,%rcx,2)
+: 00 2c c8 add %ch,(%rax,%rcx,8)
+: 06 (bad)
: 77 00 ja 7d5 <__gcov_.atp_init_module+0x55>
...
: 00 00 add %al,(%rax)
@@ -1950,9 +1957,8 @@
0000000000000840 <__gcov_.atp_cleanup_module>:
...
: f3 0a 00 repz or (%rax),%al
-: 00 20 add %ah,(%rax)
-: d2 18 rcrb %cl,(%rax)
-: bd 5e f3 45 61 mov $0x6145f35e,%ebp
+: 00 bb a8 99 b6 5e add %bh,0x5eb699a8(%rbx)
+: f3 45 61 repz rex.RB (bad)
: 00 00 add %al,(%rax)
: 00 00 add %al,(%rax)
: 04 00 add $0x0,%al
@@ -2076,13 +2082,13 @@
0000000000000c50 <__gcov0.atp_cleanup_module>:
...
-0000000000000c70 <num_tx_since_rx.44203>:
+0000000000000c70 <num_tx_since_rx.44241>:
...
-0000000000000c78 <__key.44146>:
+0000000000000c78 <__key.44184>:
...
-0000000000000c80 <__key.44121>:
+0000000000000c80 <__key.44159>:
...
0000000000000c88 <root_atp_dev>:
^ permalink raw reply
* Re: [PATCH 0/1] mv643xx_eth: Disable TSO by default
From: Eric Dumazet @ 2014-11-03 19:04 UTC (permalink / raw)
To: David Laight
Cc: Ezequiel Garcia, netdev@vger.kernel.org, David Miller,
Thomas Petazzoni, Gregory Clement, Tawfik Bayouk, Lior Amsalem,
Nadav Haklai
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D1C9E44F5@AcuExch.aculab.com>
On Mon, 2014-11-03 at 14:51 +0000, David Laight wrote:
> From: Eric Dumazet
> > On Sat, 2014-11-01 at 12:30 -0300, Ezequiel Garcia wrote:
> > > Several users ([1], [2]) have been reporting data corruption with TSO on
> > > Kirkwood platforms (i.e. using the mv643xx_eth driver).
> > >
> > > Until we manage to find what's causing this, this simple patch will make
> > > the TSO path disabled by default. This patch should be queued for stable,
> > > fixing the TSO feature introduced in v3.16.
> > >
> > > The corruption itself is very easy to reproduce: checking md5sum on a mounted
> > > NFS directory gives a different result each time. Same tests using the mvneta
> > > driver (Armada 370/38x/XP SoC) pass with no issues.
> > >
> > > Frankly, I'm a bit puzzled about this, and so any ideas or debugging hints
> > > are well received.
> >
> > lack of barriers maybe ?
> >
> > It seems you might need to populate all TX descriptors but delay the
> > first, like doing the populate in descending order.
> >
> > If you take a look at txq_submit_skb(), you'll see the final
> > desc->cmd_sts = cmd_sts (line 959) is done _after_ frags were cooked by
> > txq_submit_frag_skb()
> >
> > You should kick the NIC only when all TX descriptors are ready and
> > committed to memory.
>
> Don't forget that the NIC might process the first descriptor without
> being given a 'kick' - it will read it when it finishes processing the
> previous frame.
> This also means that you have to be careful about the order of the writes
> to the first descriptor.
This is what I implied and implemented in the patch ;)
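The ordering discussed above (fill every descriptor of a frame, and hand the first one to the hardware last) can be sketched in userspace as follows. This is neither the mv643xx_eth code nor the patch referred to: the structure, the ownership bit and the function names are hypothetical, and a C11 release store stands in for the write barrier a driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_OWNED_BY_HW 0x80000000u /* hypothetical ownership bit */

/* Hypothetical TX descriptor: a buffer address plus a command/status word. */
struct tx_desc_model {
    uint32_t buf;
    _Atomic uint32_t cmd_sts;
};

/* Mark every descriptor in the ring as owned by software. */
void ring_init(struct tx_desc_model *ring, int n)
{
    assert(ring != NULL && n > 0);
    for (int i = 0; i < n; i++) {
        ring[i].buf = 0;
        atomic_init(&ring[i].cmd_sts, 0);
    }
}

/*
 * Publish one frame spanning n descriptors. Descriptors n-1..1 are filled
 * first, in descending order. Descriptor 0 is written last, with release
 * ordering, so a reader that sees its ownership bit also sees the rest.
 */
void publish_frame(struct tx_desc_model *ring, const uint32_t *bufs, int n)
{
    assert(ring != NULL && bufs != NULL && n > 0);
    for (int i = n - 1; i >= 1; i--) {
        ring[i].buf = bufs[i];
        atomic_store_explicit(&ring[i].cmd_sts, DESC_OWNED_BY_HW,
                              memory_order_relaxed);
    }
    ring[0].buf = bufs[0];
    atomic_store_explicit(&ring[0].cmd_sts, DESC_OWNED_BY_HW,
                          memory_order_release);
}

/* Model of the NIC side: it starts on a frame once descriptor 0 is owned. */
int hw_first_desc_ready(struct tx_desc_model *ring)
{
    assert(ring != NULL);
    return (atomic_load_explicit(&ring[0].cmd_sts, memory_order_acquire) &
            DESC_OWNED_BY_HW) != 0;
}
```

Because the hardware may poll the first descriptor on its own, without waiting for a kick, keeping that descriptor back until the end is what prevents it from walking into half-written fragments.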
^ permalink raw reply
* TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Tomasz Mloduchowski @ 2014-11-03 18:59 UTC (permalink / raw)
To: netdev
Hi List,
I hope this is the right place to report a networking issue with
3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
quickly exhausts tcp_mem limits in a very strange sequence of events.
It might be scheduler or networking subsystem related.
It's 100% reproducible on my system, first observed under 3.18.0-rc2.
Neither terminating the offending application, nor removing the network
card seems to bring the 'mem' item in /proc/net/sockstat down from its
extreme values.
http://static.qdot.me/tcp_mem_issue.png contains the plot of the 'mem'
and 'sockets' fields from sockstat - violet is the 'mem', quickly
exhausting the default 512k pages limit after a short period of reliable
operation.
Best Regards,
Tomasz
-- snip --
sched: RT throttling activated
kworker/dying (792) used greatest stack depth: 11984 bytes left
TCP: out of memory -- consider tuning tcp_mem
TCP: out of memory -- consider tuning tcp_mem
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/core/stream.c:201 sk_stream_kill_queues+0x12c/0x130()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Not tainted 3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
0000000000000009 ffff8800cae2fd88 ffffffff81b6c0ff 0000000000000000
0000000000000000 ffff8800cae2fdc8 ffffffff810c235c ffff8800cae2fda8
ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
[<ffffffff81b6c0ff>] dump_stack+0x46/0x58
[<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
[<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
[<ffffffff81954fbc>] sk_stream_kill_queues+0x12c/0x130
[<ffffffff819b8b45>] inet_csk_destroy_sock+0x55/0x140
[<ffffffff819bd9ee>] tcp_close+0x22e/0x430
[<ffffffff819e3b82>] inet_release+0x72/0x80
[<ffffffff819455fa>] sock_release+0x1a/0x90
[<ffffffff8194567d>] sock_close+0xd/0x20
[<ffffffff811dfdf6>] __fput+0xc6/0x1d0
[<ffffffff811dff49>] ____fput+0x9/0x10
[<ffffffff810db94f>] task_work_run+0x8f/0xd0
[<ffffffff81046b12>] do_notify_resume+0x82/0xa0
[<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2571 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x1d9/0x1e0()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Tainted: G W 3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
0000000000000009 ffff8800cae2fd68 ffffffff81b6c0ff 0000000000000000
0000000000000000 ffff8800cae2fda8 ffffffff810c235c ffff8800cae2fd88
ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
[<ffffffff81b6c0ff>] dump_stack+0x46/0x58
[<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
[<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
[<ffffffff819e5179>] inet_sock_destruct+0x1d9/0x1e0
[<ffffffff81949e7e>] __sk_free+0x1e/0x100
[<ffffffff81949f79>] sk_free+0x19/0x20
[<ffffffff819bd918>] tcp_close+0x158/0x430
[<ffffffff819e3b82>] inet_release+0x72/0x80
[<ffffffff819455fa>] sock_release+0x1a/0x90
[<ffffffff8194567d>] sock_close+0xd/0x20
[<ffffffff811dfdf6>] __fput+0xc6/0x1d0
[<ffffffff811dff49>] ____fput+0x9/0x10
[<ffffffff810db94f>] task_work_run+0x8f/0xd0
[<ffffffff81046b12>] do_notify_resume+0x82/0xa0
[<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2572 ]---
kworker/dying (679) used greatest stack depth: 11784 bytes left
TCP: out of memory -- consider tuning tcp_mem
-- snip --
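For anyone wanting to track the 'mem' value mentioned in the report, here is a minimal sketch of extracting it from the "TCP:" line of /proc/net/sockstat. The function name is hypothetical, and the sketch assumes the usual line layout of "TCP: inuse N orphan N tw N alloc N mem N", with the value counted in pages.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the "mem" field from one line of /proc/net/sockstat.
 * Returns 0 and stores the value on success, -1 if the line is not the
 * "TCP:" line or carries no parsable "mem" field.
 */
int sockstat_tcp_mem(const char *line, long *mem_pages)
{
    const char *p;

    assert(line != NULL && mem_pages != NULL);

    if (strncmp(line, "TCP:", 4) != 0)
        return -1;

    /* Look for the field name surrounded by spaces to avoid partial matches. */
    p = strstr(line, " mem ");
    if (p == NULL)
        return -1;

    return sscanf(p, " mem %ld", mem_pages) == 1 ? 0 : -1;
}
```

Reading the file line by line with fgets() and passing each line to this function would give the series plotted in the report, assuming the same field is being sampled.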
^ permalink raw reply
* Re: DMA allocations from CMA and fatal_signal_pending check
From: Florian Fainelli @ 2014-11-03 18:51 UTC (permalink / raw)
To: Michal Nazarewicz, Joonsoo Kim
Cc: linux-arm-kernel, Brian Norris, Gregory Fong, linux-kernel,
linux-mm, lauraa, gioh.kim, aneesh.kumar, m.szyprowski, akpm,
netdev@vger.kernel.org
In-Reply-To: <xa1tlhnsw7v8.fsf@mina86.com>
On 11/03/2014 08:45 AM, Michal Nazarewicz wrote:
> On Fri, Oct 31 2014, Florian Fainelli wrote:
>> I agree that the CMA allocation should not be allowed to succeed, but
>> the dma_alloc_coherent() allocation should succeed. If we look at the
>> sysport driver, there are kmalloc() calls to initialize private
>> structures, those will succeed (except under high memory pressure), so
>> by the same token, a driver expects DMA allocations to succeed (unless
>> we are under high memory pressure)
>>
>> What are we trying to solve exactly with the fatal_signal_pending()
>> check here? Are we just optimizing for the case where a process has
>> allocated from a CMA region to allow this region to be returned to the
>> pool of free pages when it gets killed? Could there be another mechanism
>> used to reclaim those pages if we know the process is getting killed
>> anyway?
>
> We're guarding against situations where a process may hang around for an
> arbitrarily long time after receiving SIGKILL. If a user does “kill -9
> $pid” the usual expectation is that the $pid process will die within
> seconds and anything longer is perceived by the user as a bug.
>
> What problem are *you* trying to solve? If a user sent SIGKILL to
> a process that initiated device initialisation, what is the point of
> continuing initialising the device? Just recover and return -EINTR.
I have two problems with the current approach:
- behavior of a dma_alloc_coherent() call is not consistent between a
CONFIG_CMA=y vs. CONFIG_CMA=n build, which is probably fine as long as
we document that properly
- there is currently no way for a caller of dma_alloc_coherent() to tell
whether the allocation failed because it was interrupted by a signal, a
genuine OOM or something else; this is made considerably worse by problem 1
>
>> Well, not really. This driver is not an isolated case, there are tons of
>> other networking drivers that do exactly the same thing, and we do
>> expect these dma_alloc_* calls to succeed.
>
> Again, why do you expect them to succeed? The code must handle failures
> correctly anyway so why do you wish to ignore fatal signal?
I guess expecting them to succeed is probably not good, but we should
at least be able to report an accurate error code to the caller and down
to user-space.
Thanks
--
Florian
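To illustrate the second problem above, here is a sketch of what an allocation call that reports the reason for failure could look like. It is a userspace model under stated assumptions, not an existing kernel API: `model_alloc_coherent()` and its `fatal_signal_pending` parameter are hypothetical, and malloc() stands in for the real allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical allocation helper that returns an error code instead of a
 * bare NULL, so the caller can tell an interrupted allocation (-EINTR)
 * apart from a genuine out-of-memory condition (-ENOMEM).
 */
int model_alloc_coherent(size_t size, bool fatal_signal_pending, void **out)
{
    assert(out != NULL);
    *out = NULL;

    if (size == 0)
        return -EINVAL;

    /* Models the fatal_signal_pending() bail-out in the CMA path. */
    if (fatal_signal_pending)
        return -EINTR;

    *out = malloc(size);
    if (*out == NULL)
        return -ENOMEM;

    return 0;
}
```

A driver's open path could then propagate -EINTR unchanged to user space instead of collapsing every failure into -ENOMEM.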
^ permalink raw reply