Netdev List
 help / color / mirror / Atom feed
* Re: Requeues and ECN marking
From: Greg Kuperman @ 2014-02-03 17:48 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1391448503.28432.101.camel@edumazet-glaptop2.roam.corp.google.com>

Thanks for the response. I agree that requeues should have nothing to
do with ECN marking, and that is why I am confused about what is
happening.

The entire setup is as follows. I am using the kernel version 3.2.44.
I am running a network emulator (CORE
http://www.nrl.navy.mil/itd/ncs/products/core), within which I have
four nodes. Each node becomes its own linux container, running its own
network control on its interfaces. The four nodes are the sender node
s, receiver node r, and two intermediate nodes 1 and 2. Node s is
connected to node 1, which is connected to node 2, which is connected
to node r. Link (1,2) is rate-limited to 1 Mbps (this rate limiting is
handled by another application that applies back pressure to the node
when its buffers are full and it can no longer send packets; the
buffer for that application is variable, and I have set it to hold up
to 10 packets).

I am running RED queuing discipline on the egress of node 1 with the
following setup:
tc qdisc add dev eth1 root red burst 1000000 limit 10000000 avpkt 1000
ecn bandwidth 125 probability 1

I also run it with the following (and have no change in behavior):
tc qdisc add dev eth1 root red min 2000 max 10000 probability 1.0
limit 1000000 burst 10 avpkt 1000 bandwidth 125 ecn probability 1

The odd thing that seems to be happening is that I can see the backlog
and requeues increasing, and once they hit 1000, then packet marking
begins. This is even though I have the minimum in RED set to 1 byte,
and max set to 0 (which, from my understanding means that packet
marking should begin when the backlog is 1 byte be the maximum
probability of marking right away because the max is set to 0). The
explanation I came up with is that it had something to do with the
requeues, but that may be entirely off base. I have no idea why it
does not begin marking packets right away (which is the desired
behavior).

Thank you again for all of your time, and please let me know if there
is anymore info that you guys need.

Some more queue statistics (I'm not sure how helpful this will be):

qdisc red 8004: root refcnt 2 limit 10000000b min 1b max 0b ecn
 Sent 1044606 bytes 996 pkt (dropped 0, overlimits 0 requeues 905)
 backlog 993072b 913p requeues 905
  marked 0 early 0 pdrop 0 other 0

qdisc red 8004: root refcnt 2 limit 10000000b min 1b max 0b ecn
 Sent 1131390 bytes 1076 pkt (dropped 0, overlimits 0 requeues 984)
 backlog 1080870b 992p requeues 984
  marked 0 early 0 pdrop 0 other 0

qdisc red 8004: root refcnt 2 limit 10000000b min 1b max 0b ecn
 Sent 1231386 bytes 1168 pkt (dropped 0, overlimits 179 requeues 1075)
 backlog 1179690b 1082p requeues 1075
  marked 179 early 0 pdrop 0 other 0

qdisc red 8004: root refcnt 2 limit 10000000b min 1b max 0b ecn
 Sent 1334640 bytes 1263 pkt (dropped 0, overlimits 368 requeues 1169)
 backlog 1283958b 1176p requeues 1169
  marked 368 early 0 pdrop 0 other 0



On Mon, Feb 3, 2014 at 12:28 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2014-02-03 at 09:50 -0500, Greg Kuperman wrote:
>> Hi all,
>>
>> I am testing a new congestion control protocol that relies on explicit
>> congestion notifications (ECN) to notify the receiver of a congestion
>> event. I have a rate limited link of 1 Mbps, and I am using the RED
>> queuing discipline with ECN enabled. What I have noticed is that no
>> matter how small I set my queue size, or how low I set my minimum
>> marking level, the first ECN marked packet does not get sent out for
>> about 10 seconds after the input rate exceeds the output rate. Further
>> examination shows that ECN marking does not occur until the number or
>> requeues hits 1000. Below are two queries of tc -s -d qdisc ls dev
>> eth1.
>>
>> qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
>> 30 Plog 21 Scell_log 31
>>  Sent 1307892 bytes 1247 pkt (dropped 0, overlimits 0 requeues 960)
>>  backlog 1052118b 962p requeues 960
>>   marked 0 early 0 pdrop 0 other 0
>>
>> qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
>> 30 Plog 21 Scell_log 31
>>  Sent 1379262 bytes 1312 pkt (dropped 0, overlimits 72 requeues 1024)
>>  backlog 1122468b 1027p requeues 1024
>>   marked 72 early 0 pdrop 0 other 0
>>
>>
>> The txqueuelen defaults to 1000 for the interface, so I figured that
>> packets maybe buffering there, and then dequeuing, before any packets
>> are marked. I set txqueuelen to lower values (all the way down to 1),
>> but the exact same behavior occurs (no marked packets until number of
>> dequeues hits 1000). In contrast, if I set txqueuele to something very
>> high, I get no requeues, drops, or marked packets.
>>
>> My goal is for packets to be marked as soon as the ingress rate
>> exceeds the egress. Am I correct in thinking that the requeuing
>> operation is the culprit? Can I eliminate requeues? Is there something
>> else I can do to get the behavior I am looking for?
>>
>> Thank you all for the help. And please cc me in your replies; I'm not
>> 100% sure if I get all the messages from this mailing list.
>
> requeues have nothing to do with ECN marking.
>
> How is done your rate limiting ?
>
> Post the whole setup, not part of it, it will help to spot the problem
> in one go, instead of many mail exchanges.
>
>

^ permalink raw reply

* Re: Requeues and ECN marking
From: Eric Dumazet @ 2014-02-03 17:28 UTC (permalink / raw)
  To: Greg Kuperman; +Cc: netdev
In-Reply-To: <CAMvx-beWMC8awScfEtHs8sSkzz0fGNqBH1fn7hC=k0iaFpJvSA@mail.gmail.com>

On Mon, 2014-02-03 at 09:50 -0500, Greg Kuperman wrote:
> Hi all,
> 
> I am testing a new congestion control protocol that relies on explicit
> congestion notifications (ECN) to notify the receiver of a congestion
> event. I have a rate limited link of 1 Mbps, and I am using the RED
> queuing discipline with ECN enabled. What I have noticed is that no
> matter how small I set my queue size, or how low I set my minimum
> marking level, the first ECN marked packet does not get sent out for
> about 10 seconds after the input rate exceeds the output rate. Further
> examination shows that ECN marking does not occur until the number or
> requeues hits 1000. Below are two queries of tc -s -d qdisc ls dev
> eth1.
> 
> qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
> 30 Plog 21 Scell_log 31
>  Sent 1307892 bytes 1247 pkt (dropped 0, overlimits 0 requeues 960)
>  backlog 1052118b 962p requeues 960
>   marked 0 early 0 pdrop 0 other 0
> 
> qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
> 30 Plog 21 Scell_log 31
>  Sent 1379262 bytes 1312 pkt (dropped 0, overlimits 72 requeues 1024)
>  backlog 1122468b 1027p requeues 1024
>   marked 72 early 0 pdrop 0 other 0
> 
> 
> The txqueuelen defaults to 1000 for the interface, so I figured that
> packets maybe buffering there, and then dequeuing, before any packets
> are marked. I set txqueuelen to lower values (all the way down to 1),
> but the exact same behavior occurs (no marked packets until number of
> dequeues hits 1000). In contrast, if I set txqueuele to something very
> high, I get no requeues, drops, or marked packets.
> 
> My goal is for packets to be marked as soon as the ingress rate
> exceeds the egress. Am I correct in thinking that the requeuing
> operation is the culprit? Can I eliminate requeues? Is there something
> else I can do to get the behavior I am looking for?
> 
> Thank you all for the help. And please cc me in your replies; I'm not
> 100% sure if I get all the messages from this mailing list.

requeues have nothing to do with ECN marking.

How is done your rate limiting ?

Post the whole setup, not part of it, it will help to spot the problem
in one go, instead of many mail exchanges.

^ permalink raw reply

* usb interrupt storms from the ax88179 hardware
From: David Laight @ 2014-02-03 17:17 UTC (permalink / raw)
  To: netdev, linux-usb@vger.kernel.org, Freddy Xin

On one system (an amd motherboard with the ASMedia xhci controller)
I'm seeing almost back to back USB (7 or 8 a second) 'interrupt'
packets from an ax88179 Ge card.
It may be that other systems behave similarly.
I'm sure this hadn't used to happen!

I don't know what the interrupt status means, the value (as LE u32)
is 0x900a1 0xe1cd6d79
The only bit the driver looks at is the 0x10000 bit in the first word.
This is the 'link up/down' flag.
The two halves of the second word are probably different fields, the
high bits appear after about a second.

Now I'd guess that the driver ought to be doing something about some
of these values. While in this mode transmits are delayed for anything
upto 100ms, at least for some time they do get sent.

However the processing of the 'link up/down' flag is clearly wrong.
The code currently does:

348         event = urb->transfer_buffer;
349         le32_to_cpus((void *)&event->intdata1);
350 
351         link = (((__force u32)event->intdata1) & AX_INT_PPLS_LINK) >> 16;
352 
353         if (netif_carrier_ok(dev->net) != link) {
354                 usbnet_link_change(dev, link, 1);
355                 netdev_info(dev->net, "ax88179 - Link status is: %d\n", link);
356         }

Which ends up doing repeated calls to usbnet_link_change() and confusing
that code a lot.

I presume there is some delay before the return value from netif_carrier_ok()
matches the set state.

I think the code should be remembering the link state locally.
It might also need to clear some other interrupt flags, only ASIX know
what they mean.

	David

^ permalink raw reply

* Re: [PATCH net 1/5] openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
From: Sergei Shtylyov @ 2014-02-03 16:43 UTC (permalink / raw)
  To: Jesse Gross, David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, dev-yBygre7rU0SM8Zsap4Y0gw
In-Reply-To: <1391389686-34303-2-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>

Hello.

On 03-02-2014 5:08, Jesse Gross wrote:

> From: Thomas Graf <tgraf-G/eBtMaohhA@public.gmane.org>

> While the zerocopy method is correctly omitted if user space
> does not support unaligned Netlink messages. The attribute is
> still not padded correctly as skb_zerocopy() will not ensure
> padding and the attribute size is no longer pre calculated
> though nla_reserve() which ensured padding previously.

> This patch applies appropriate padding if a linear data copy
> was performed in skb_zerocopy().

> Signed-off-by: Thomas Graf <tgraf-G/eBtMaohhA@public.gmane.org>
> Acked-by: Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> ---
>   net/openvswitch/datapath.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)

> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index df46928..3ca9121 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
[...]
> @@ -466,6 +466,11 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
>
>   	skb_zerocopy(user_skb, skb, skb->len, hlen);
>
> +	/* Pad OVS_PACKET_ATTR_PACKET if linear copy was performed */
> +	if (!(dp->user_features & OVS_DP_F_UNALIGNED) &&
> +	    (plen = (ALIGN(user_skb->len, NLA_ALIGNTO) - user_skb->len)) > 0)

    This shouldn't pass checkpatch.pl which complains about assignments inside 
*if* statements.

WBR, Sergei

^ permalink raw reply

* Re: [PATCH net 4/5] openvswitch: Fix ovs_flow_free() ovs-lock assert.
From: Sergei Shtylyov @ 2014-02-03 16:42 UTC (permalink / raw)
  To: Jesse Gross, David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, dev-yBygre7rU0SM8Zsap4Y0gw
In-Reply-To: <1391389686-34303-5-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>

Hello.

On 03-02-2014 5:08, Jesse Gross wrote:

> From: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>

> ovs_flow_free() is not called under ovs-lock during packet
> execute path (ovs_packet_cmd_execute()). Since packet execute
> does not touch flow->mask, there is no need to take that
> lock either. So move assert in case where flow->mask is checked.

> Found by code inspection.

> Signed-off-by: Pravin B Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> ---
>   net/openvswitch/flow_table.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)

> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index bd14052..ad0bda0 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -158,11 +158,12 @@ void ovs_flow_free(struct sw_flow *flow, bool deferred)
>   	if (!flow)
>   		return;
>
> -	ASSERT_OVSL();
> -
>   	if (flow->mask) {
>   		struct sw_flow_mask *mask = flow->mask;
>
> +		/* ovs-lock is required to protect mask-refcount and
> +		 * mask list. */

    Networking multi-line comment style is:

/* bla
  * bla
  */

WBR, Sergei

^ permalink raw reply

* Re: [PATCH] net:phy:dp83640: Initialize PTP clocks at device init.
From: Richard Cochran @ 2014-02-03 16:33 UTC (permalink / raw)
  To: Stefan Sørensen; +Cc: netdev
In-Reply-To: <1391438218-21994-1-git-send-email-stefan.sorensen@spectralink.com>

On Mon, Feb 03, 2014 at 03:36:58PM +0100, Stefan Sørensen wrote:
> The trigger and events functionality can be useful even if packet
> timestamping is not used, but the required PTP clock is only enabled
> when packet timestamping is started. This patch moves the clock enable
> to when the interface is configured.

Hm, I vaguely recall that there might have been some reason not enable
the clock too early. (Maybe this was related to multiple PHYs?)

Quickly looking at the code once again, I can't see anything wrong with
this now, but I'll look at it again tomorrow.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH] ipv6: default route for link local address is not added while assigning a address
From: Nicolas Dichtel @ 2014-02-03 16:26 UTC (permalink / raw)
  To: Sohny Thomas, netdev, linux-kernel, yoshfuji, davem, kumuda
In-Reply-To: <20140203160838.GA17999@order.stressinduktion.org>

Le 03/02/2014 17:08, Hannes Frederic Sowa a écrit :
> Hello!
>
> On Mon, Feb 03, 2014 at 04:23:00PM +0100, Nicolas Dichtel wrote:
>> Le 03/02/2014 08:19, Sohny Thomas a écrit :
>>>
>>>> Actually I am not so sure, there is no defined semantic of flush. I would
>>>> be ok with all three solutions: leave it as is, always add link-local
>>>> address (it does not matter if we don't have a link-local address on
>> It matters. This address is required.
>> RFC 4291
>> Section 2.1:
>>     All interfaces are required to have at least one Link-Local unicast
>>     address (see Section 2.8 for additional required addresses).
>> Section 2.8:
>>        o Its required Link-Local address for each interface.
>
> Yes, sure, it is required. But you also can manually delete the LL address and
> we don't guard against that.
Sure. It's why I don't like this patch, it fix a user error.

>
>>>> that interface, as a global scoped one is just fine enough) or make flush
>>>> not
>>>> remove the link-local address (but this seems a bit too special cased for
>>>> me).
>>>
>>> 1) In case if we leave it as it is, there is rfc 6724 rule 2 to be
>>> considered (
>>> previously rfc 3484)
>>>
>>> Rule 2: Prefer appropriate scope.
>>>     If Scope(SA) < Scope(SB): If Scope(SA) < Scope(D), then prefer SB and
>>>     otherwise prefer SA.  Similarly, if Scope(SB) < Scope(SA): If
>>>     Scope(SB) < Scope(D), then prefer SA and otherwise prefer SB.
>>>
>>> Test:
>>>
>>>     Destination: fe80::2(LS)
>>>      Candidate Source Addresses: 3ffe::1(GS) or fec0::1(SS) or LLA(LS)
>>>      Result: LLA(LS)
>>>      Scope(LLA) < Scope(fec0::1): If Scope(LLA) < Scope(fe80::2),  no,
>>>      prefer LLA
>>>      Scope(LLA) < Scope(3ffe::1): If Scope(LLA) < Scope(fe80::2),  no,
>>>      prefer LLA
>>>
>>>
>>> Now the above test fails since the route itself is not present, and the
>>> test
>>> assumes that the route gets added since the LLA is not removed during the
>>> test
>> In your scenario, the link local route has been removed manually, not by the
>> kernel. What is your network manager?
>
> The test scenario is outlined here:
> <https://bugzilla.kernel.org/show_bug.cgi?id=68511>
>
> Basically, the command in question is this one:
>
> 	[root@localhost ~]# ip -6 -statistics -statistics route flush dev eth0
>
> which removes the fe80::/64 route.
>
>>> 2) having a LLA always helps in NDP i think
>> A link-local Address yes, it's a MUST. But having only the link local route
>> will
>> not help.
>
> Agreed, the LL address should be available, too. I currently don't know
> what will break if LL address is not available. I guess MLD won't work
> properly and thus even basic connectivity won't work with some switches.
>
>>> 3) making flush not remove link-local address will be chnaging
>>> functionality of
>>> ip flush command
>> You can flush by specifying the prototype:
>> ip -6 route flush proto static
>
> So we have four possiblities now:
>
> 1) leave it as is
>
> 	seems still acceptable to me
>
> 2) add fe80::/64 route unconditionally if any address gets added
>
> 	Sohny's patch already looks good in doing so at first look.
I don't like this solution, because it's a kernel patch to fix a configuration
problem.

>
> 3) add fe80::/64 route in case LL address gets added via inet6_rtm_newaddr
>
> 	would be ok, too. I tend towards this solution somehow by now.
This seems right also, but I'm not sure that this will fix Sohny's pb.

>
> 4) make flush not remove the fe80::/64 address
>
> 	Least favourable to me. I guess this also woud need iproute change
> 	and seems most difficult to do.
Why using this command 'ip -6 route flush proto static' isn't possible?
I think that we know what kind of route is added for these TAHI tests, hence
it's better to remove only routes added manually (or by a routing daemon if
it's the case).
Removing kernel routes may hide bugs: imagine the kernel adds a wrong route,
TAHI will not detect it.


Regards,
Nicolas

^ permalink raw reply

* Re: [PATCH] [RFC] netfilter: nf_conntrack: don't relase a conntrack with non-zero refcnt
From: Eric Dumazet @ 2014-02-03 16:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Florian Westphal, Andrew Vagin, Andrey Vagin, netfilter-devel,
	netfilter, netdev, linux-kernel, vvs, Cyrill Gorcunov,
	Vasiliy Averin
In-Reply-To: <20140202233046.GA4137@localhost>

On Mon, 2014-02-03 at 00:30 +0100, Pablo Neira Ayuso wrote:
>          */
>         smp_wmb();
> -       atomic_set(&ct->ct_general.use, 1);
> +       atomic_set(&ct->ct_general.use, 0);
>         return ct; 

Hi Pablo !

I think your patch is the way to go, but might need some extra care
with memory barriers.

I believe the smp_wmb() here is no longer needed.

If its a newly allocated memory, no other users can access to ct,
if its a recycled ct, content is already 0 anyway.

After your patch, nf_conntrack_get(&tmpl->ct_general) should increment 
an already non zero refcnt, so no memory barrier is needed.

But one smp_wmb() is needed right before this point :

	/* The caller holds a reference to this object */
	atomic_set(&ct->ct_general.use, 2);

Thanks !



^ permalink raw reply

* Re: [PATCH] net:phy:dp83640: Do not hardcode timestamping event edge
From: Richard Cochran @ 2014-02-03 16:20 UTC (permalink / raw)
  To: Stefan Sørensen; +Cc: netdev
In-Reply-To: <1391438210-21941-1-git-send-email-stefan.sorensen@spectralink.com>

On Mon, Feb 03, 2014 at 03:36:50PM +0100, Stefan Sørensen wrote:
> Currently the external timestamping code it hardcoded to use
> the rising edge even though the hardware has configurable event
> edge detection. This patch change the code to use falling edge
> detection if PTP_FALLING_EDGE is set in the user supplied flags.
> 
> Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
> ---

Acked-by: Richard Cochran <richardcochran@gmail.com>

^ permalink raw reply

* Re: [PATCH] net:phy:dp83640: Declare that TX timestamping possible
From: Richard Cochran @ 2014-02-03 16:19 UTC (permalink / raw)
  To: Stefan Sørensen; +Cc: netdev
In-Reply-To: <1391438195-21888-1-git-send-email-stefan.sorensen@spectralink.com>

On Mon, Feb 03, 2014 at 03:36:35PM +0100, Stefan Sørensen wrote:
> Set the SKBTX_IN_PROGRESS bit in tx_flags dp83640_txtstamp when doing
> tx timestamps as per Documentation/networking/timestamping.txt.
> 
> Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
> ---

Acked-by: Richard Cochran <richardcochran@gmail.com>

^ permalink raw reply

* Re: [PATCH] ipv6: default route for link local address is not added while assigning a address
From: Hannes Frederic Sowa @ 2014-02-03 16:08 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Sohny Thomas, netdev, linux-kernel, yoshfuji, davem, kumuda
In-Reply-To: <52EFB454.1040908@6wind.com>

Hello!

On Mon, Feb 03, 2014 at 04:23:00PM +0100, Nicolas Dichtel wrote:
> Le 03/02/2014 08:19, Sohny Thomas a écrit :
> >
> >>Actually I am not so sure, there is no defined semantic of flush. I would
> >>be ok with all three solutions: leave it as is, always add link-local
> >>address (it does not matter if we don't have a link-local address on
> It matters. This address is required.
> RFC 4291
> Section 2.1:
>    All interfaces are required to have at least one Link-Local unicast
>    address (see Section 2.8 for additional required addresses).
> Section 2.8:
>       o Its required Link-Local address for each interface.

Yes, sure, it is required. But you also can manually delete the LL address and
we don't guard against that.

> >>that interface, as a global scoped one is just fine enough) or make flush 
> >>not
> >>remove the link-local address (but this seems a bit too special cased for 
> >>me).
> >
> >1) In case if we leave it as it is, there is rfc 6724 rule 2 to be 
> >considered (
> >previously rfc 3484)
> >
> >Rule 2: Prefer appropriate scope.
> >    If Scope(SA) < Scope(SB): If Scope(SA) < Scope(D), then prefer SB and
> >    otherwise prefer SA.  Similarly, if Scope(SB) < Scope(SA): If
> >    Scope(SB) < Scope(D), then prefer SA and otherwise prefer SB.
> >
> >Test:
> >
> >    Destination: fe80::2(LS)
> >     Candidate Source Addresses: 3ffe::1(GS) or fec0::1(SS) or LLA(LS)
> >     Result: LLA(LS)
> >     Scope(LLA) < Scope(fec0::1): If Scope(LLA) < Scope(fe80::2),  no, 
> >     prefer LLA
> >     Scope(LLA) < Scope(3ffe::1): If Scope(LLA) < Scope(fe80::2),  no, 
> >     prefer LLA
> >
> >
> >Now the above test fails since the route itself is not present, and the 
> >test
> >assumes that the route gets added since the LLA is not removed during the 
> >test
> In your scenario, the link local route has been removed manually, not by the
> kernel. What is your network manager?

The test scenario is outlined here:
<https://bugzilla.kernel.org/show_bug.cgi?id=68511>

Basically, the command in question is this one:

	[root@localhost ~]# ip -6 -statistics -statistics route flush dev eth0

which removes the fe80::/64 route.

> >2) having a LLA always helps in NDP i think
> A link-local Address yes, it's a MUST. But having only the link local route 
> will
> not help.

Agreed, the LL address should be available, too. I currently don't know
what will break if LL address is not available. I guess MLD won't work
properly and thus even basic connectivity won't work with some switches.

> >3) making flush not remove link-local address will be chnaging 
> >functionality of
> >ip flush command
> You can flush by specifying the prototype:
> ip -6 route flush proto static

So we have four possiblities now:

1) leave it as is

	seems still acceptable to me

2) add fe80::/64 route unconditionally if any address gets added

	Sohny's patch already looks good in doing so at first look.

3) add fe80::/64 route in case LL address gets added via inet6_rtm_newaddr

	would be ok, too. I tend towards this solution somehow by now.

4) make flush not remove the fe80::/64 address

	Least favourable to me. I guess this also woud need iproute change
	and seems most difficult to do.

Any opionions?

Greetings,

  Hannes

^ permalink raw reply

* Re: [PATCH] ptp: Allow selecting trigger/event index in testptp
From: Richard Cochran @ 2014-02-03 15:59 UTC (permalink / raw)
  To: Stefan Sørensen; +Cc: netdev
In-Reply-To: <1391438187-21834-1-git-send-email-stefan.sorensen@spectralink.com>

On Mon, Feb 03, 2014 at 03:36:27PM +0100, Stefan Sørensen wrote:
> Currently the trigger/event is hardcoded to 0, this patch adds
> a new command line argument -i to select an arbitrary trigger/
> event.

This is a nice extension of the program, but ...

> diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
> index a74d0a8..04b21cd 100644
> --- a/Documentation/ptp/testptp.c
> +++ b/Documentation/ptp/testptp.c
> @@ -123,7 +123,8 @@ static void usage(char *progname)
>  		" -P val     enable or disable (val=1|0) the system clock PPS\n"
>  		" -s         set the ptp clock time from the system time\n"
>  		" -S         set the system time from the ptp clock time\n"
> -		" -t val     shift the ptp clock time by 'val' seconds\n",
> +		" -t val     shift the ptp clock time by 'val' seconds\n"
> +		" -i val     index for event/trigger\n",

can we please keep the options in alphabetical order?

Thanks,
Richard

^ permalink raw reply

* Re: OOPS in nf_ct_unlink_expect_report using Polycom RealPresence Mobile
From: astx @ 2014-02-03 15:46 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: linux-kernel, netdev, netfilter, Alexey Dobriyan, netfilter-devel
In-Reply-To: <20140203121415.GA12777@localhost>

Test results / tested kernel versions:

3.2.54
3.8.13
3.10.28

Above kernel versions without patch are dying with same error on  
trying to start h323 connections using "Polycom RealPresence Mobile".

I can confirm that with this patch all three kernel versions are  
pretty stable now again.

Thank you all for your fast and competent help.

Best Regards,

Toni


Zitat von Pablo Neira Ayuso <pablo@netfilter.org>:

> On Fri, Jan 31, 2014 at 05:04:02PM +0100, astx wrote:
>> Dear Alexey,
>>
>> seems to help. Thank you for your quick response. Kernel 3.10.28 is
>> now stable using h323 / Polycom.
>
> Thanks, if no objection, will pass this patch to David.

^ permalink raw reply

* Re: [PATCH] ipv6: default route for link local address is not added while assigning a address
From: Nicolas Dichtel @ 2014-02-03 15:23 UTC (permalink / raw)
  To: Sohny Thomas, netdev, linux-kernel, yoshfuji, davem, kumuda,
	Hannes Frederic Sowa
In-Reply-To: <52EF42FF.60907@linux.vnet.ibm.com>

Le 03/02/2014 08:19, Sohny Thomas a écrit :
>
>> Actually I am not so sure, there is no defined semantic of flush. I would
>> be ok with all three solutions: leave it as is, always add link-local
>> address (it does not matter if we don't have a link-local address on
It matters. This address is required.
RFC 4291
Section 2.1:
    All interfaces are required to have at least one Link-Local unicast
    address (see Section 2.8 for additional required addresses).
Section 2.8:
       o Its required Link-Local address for each interface.

>> that interface, as a global scoped one is just fine enough) or make flush not
>> remove the link-local address (but this seems a bit too special cased for me).
>
> 1) In case if we leave it as it is, there is rfc 6724 rule 2 to be considered (
> previously rfc 3484)
>
> Rule 2: Prefer appropriate scope.
>     If Scope(SA) < Scope(SB): If Scope(SA) < Scope(D), then prefer SB and
>     otherwise prefer SA.  Similarly, if Scope(SB) < Scope(SA): If
>     Scope(SB) < Scope(D), then prefer SA and otherwise prefer SB.
>
> Test:
>
>     Destination: fe80::2(LS)
>      Candidate Source Addresses: 3ffe::1(GS) or fec0::1(SS) or LLA(LS)
>      Result: LLA(LS)
>      Scope(LLA) < Scope(fec0::1): If Scope(LLA) < Scope(fe80::2),  no, prefer LLA
>      Scope(LLA) < Scope(3ffe::1): If Scope(LLA) < Scope(fe80::2),  no, prefer LLA
>
>
> Now the above test fails since the route itself is not present, and the test
> assumes that the route gets added since the LLA is not removed during the test
In your scenario, the link local route has been removed manually, not by the
kernel. What is your network manager?

>
> 2) having a LLA always helps in NDP i think
A link-local Address yes, it's a MUST. But having only the link local route will
not help.

>
> 3) making flush not remove link-local address will be chnaging functionality of
> ip flush command
You can flush by specifying the prototype:
ip -6 route flush proto static


Regards,
Nicolas

^ permalink raw reply

* [PATCH] ptp: Allow selecting trigger/event index in testptp
From: Stefan Sørensen @ 2014-02-03 14:36 UTC (permalink / raw)
  To: richardcochran, netdev; +Cc: Stefan Sørensen

Currently the trigger/event is hardcoded to 0, this patch adds
a new command line argument -i to select an arbitrary trigger/
event.

Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
---
 Documentation/ptp/testptp.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index a74d0a8..04b21cd 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -123,7 +123,8 @@ static void usage(char *progname)
 		" -P val     enable or disable (val=1|0) the system clock PPS\n"
 		" -s         set the ptp clock time from the system time\n"
 		" -S         set the system time from the ptp clock time\n"
-		" -t val     shift the ptp clock time by 'val' seconds\n",
+		" -t val     shift the ptp clock time by 'val' seconds\n"
+		" -i val     index for event/trigger\n",
 		progname);
 }
 
@@ -161,13 +162,14 @@ int main(int argc, char *argv[])
 	int perout = -1;
 	int pps = -1;
 	int settime = 0;
+	int index = 0;
 
 	int64_t t1, t2, tp;
 	int64_t interval, offset;
 
 	progname = strrchr(argv[0], '/');
 	progname = progname ? 1+progname : argv[0];
-	while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghk:p:P:sSt:v"))) {
+	while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghk:p:P:sSt:vi:"))) {
 		switch (c) {
 		case 'a':
 			oneshot = atoi(optarg);
@@ -209,6 +211,9 @@ int main(int argc, char *argv[])
 		case 't':
 			adjtime = atoi(optarg);
 			break;
+		case 'i':
+			index = atoi(optarg);
+			break;
 		case 'h':
 			usage(progname);
 			return 0;
@@ -301,7 +306,7 @@ int main(int argc, char *argv[])
 
 	if (extts) {
 		memset(&extts_request, 0, sizeof(extts_request));
-		extts_request.index = 0;
+		extts_request.index = index;
 		extts_request.flags = PTP_ENABLE_FEATURE;
 		if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) {
 			perror("PTP_EXTTS_REQUEST");
@@ -375,7 +380,7 @@ int main(int argc, char *argv[])
 			return -1;
 		}
 		memset(&perout_request, 0, sizeof(perout_request));
-		perout_request.index = 0;
+		perout_request.index = index;
 		perout_request.start.sec = ts.tv_sec + 2;
 		perout_request.start.nsec = 0;
 		perout_request.period.sec = 0;
-- 
1.8.5.3

^ permalink raw reply related

* [PATCH] net:phy:dp83640: Initialize PTP clocks at device init.
From: Stefan Sørensen @ 2014-02-03 14:36 UTC (permalink / raw)
  To: richardcochran, netdev; +Cc: Stefan Sørensen

The trigger and events functionality can be useful even if packet
timestamping is not used, but the required PTP clock is only enabled
when packet timestamping is started. This patch moves the clock enable
to when the interface is configured.

Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
---
 drivers/net/phy/dp83640.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 80c5fc8..14616e4 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1064,6 +1064,13 @@ static void dp83640_remove(struct phy_device *phydev)
 	kfree(dp83640);
 }
 
+static int dp83640_config_init(struct phy_device *phydev)
+{
+	enable_status_frames(phydev, true);
+	ext_write(0, phydev, PAGE4, PTP_CTL, PTP_ENABLE);
+	return 0;
+}
+
 static int dp83640_ack_interrupt(struct phy_device *phydev)
 {
 	int err = phy_read(phydev, MII_DP83640_MISR);
@@ -1201,11 +1208,6 @@ static int dp83640_hwtstamp(struct phy_device *phydev, struct ifreq *ifr)
 
 	mutex_lock(&dp83640->clock->extreg_lock);
 
-	if (dp83640->hwts_tx_en || dp83640->hwts_rx_en) {
-		enable_status_frames(phydev, true);
-		ext_write(0, phydev, PAGE4, PTP_CTL, PTP_ENABLE);
-	}
-
 	ext_write(0, phydev, PAGE5, PTP_TXCFG0, txcfg0);
 	ext_write(0, phydev, PAGE5, PTP_RXCFG0, rxcfg0);
 
@@ -1337,6 +1339,7 @@ static struct phy_driver dp83640_driver = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.probe		= dp83640_probe,
 	.remove		= dp83640_remove,
+	.config_init	= dp83640_config_init,
 	.config_aneg	= genphy_config_aneg,
 	.read_status	= genphy_read_status,
 	.ack_interrupt  = dp83640_ack_interrupt,
-- 
1.8.5.3

^ permalink raw reply related

* [PATCH] net:phy:dp83640: Do not hardcode timestamping event edge
From: Stefan Sørensen @ 2014-02-03 14:36 UTC (permalink / raw)
  To: richardcochran, netdev; +Cc: Stefan Sørensen

Currently the external timestamping code it hardcoded to use
the rising edge even though the hardware has configurable event
edge detection. This patch change the code to use falling edge
detection if PTP_FALLING_EDGE is set in the user supplied flags.

Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
---
 drivers/net/phy/dp83640.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index baa1a75..80c5fc8 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -440,7 +440,10 @@ static int ptp_dp83640_enable(struct ptp_clock_info *ptp,
 		if (on) {
 			gpio_num = extts_gpio[index];
 			evnt |= (gpio_num & EVNT_GPIO_MASK) << EVNT_GPIO_SHIFT;
-			evnt |= EVNT_RISE;
+			if (rq->extts.flags & PTP_FALLING_EDGE)
+				evnt |= EVNT_FALL;
+			else
+				evnt |= EVNT_RISE;
 		}
 		ext_write(0, phydev, PAGE5, PTP_EVNT, evnt);
 		return 0;
-- 
1.8.5.3

^ permalink raw reply related

* [PATCH] net:phy:dp83640: Declare that TX timestamping possible
From: Stefan Sørensen @ 2014-02-03 14:36 UTC (permalink / raw)
  To: richardcochran, netdev; +Cc: Stefan Sørensen

Set the SKBTX_IN_PROGRESS bit in tx_flags dp83640_txtstamp when doing
tx timestamps as per Documentation/networking/timestamping.txt.

Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
---
 drivers/net/phy/dp83640.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index dfb132e..ae95a9a 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1273,6 +1273,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
 		}
 		/* fall through */
 	case HWTSTAMP_TX_ON:
+		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 		skb_queue_tail(&dp83640->tx_queue, skb);
 		schedule_work(&dp83640->ts_work);
 		break;
-- 
1.8.5.3

^ permalink raw reply related

* Requeues and ECN marking
From: Greg Kuperman @ 2014-02-03 14:50 UTC (permalink / raw)
  To: netdev

Hi all,

I am testing a new congestion control protocol that relies on explicit
congestion notifications (ECN) to notify the receiver of a congestion
event. I have a rate limited link of 1 Mbps, and I am using the RED
queuing discipline with ECN enabled. What I have noticed is that no
matter how small I set my queue size, or how low I set my minimum
marking level, the first ECN marked packet does not get sent out for
about 10 seconds after the input rate exceeds the output rate. Further
examination shows that ECN marking does not occur until the number or
requeues hits 1000. Below are two queries of tc -s -d qdisc ls dev
eth1.

qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
30 Plog 21 Scell_log 31
 Sent 1307892 bytes 1247 pkt (dropped 0, overlimits 0 requeues 960)
 backlog 1052118b 962p requeues 960
  marked 0 early 0 pdrop 0 other 0

qdisc red 8028: root refcnt 2 limit 10000000b min 1b max 0b ecn ewma
30 Plog 21 Scell_log 31
 Sent 1379262 bytes 1312 pkt (dropped 0, overlimits 72 requeues 1024)
 backlog 1122468b 1027p requeues 1024
  marked 72 early 0 pdrop 0 other 0


The txqueuelen defaults to 1000 for the interface, so I figured that
packets maybe buffering there, and then dequeuing, before any packets
are marked. I set txqueuelen to lower values (all the way down to 1),
but the exact same behavior occurs (no marked packets until number of
dequeues hits 1000). In contrast, if I set txqueuele to something very
high, I get no requeues, drops, or marked packets.

My goal is for packets to be marked as soon as the ingress rate
exceeds the egress. Am I correct in thinking that the requeuing
operation is the culprit? Can I eliminate requeues? Is there something
else I can do to get the behavior I am looking for?

Thank you all for the help. And please cc me in your replies; I'm not
100% sure if I get all the messages from this mailing list.

Best,
Greg

^ permalink raw reply

* Re: [PATCH] [RFC] netfilter: nf_conntrack: don't relase a conntrack with non-zero refcnt
From: Andrew Vagin @ 2014-02-03 13:59 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Florian Westphal, Andrey Vagin, netfilter-devel, Eric Dumazet,
	netfilter, netdev, linux-kernel, vvs, Cyrill Gorcunov,
	Vasiliy Averin
In-Reply-To: <20140202233046.GA4137@localhost>

On Mon, Feb 03, 2014 at 12:30:46AM +0100, Pablo Neira Ayuso wrote:
> On Thu, Jan 16, 2014 at 10:23:01AM +0100, Florian Westphal wrote:
> > Andrew Vagin <avagin@parallels.com> wrote:
> > > > I think it would be nice if we could keep it that way.
> > > > If everything fails we could proably intoduce a 'larval' dummy list
> > > > similar to the one used by template conntracks?
> > > 
> > > I'm not sure, that this is required. Could you elaborate when this can
> > > be useful?
> > 
> > You can dump the lists via ctnetlink.  Its meant as a debugging aid in
> > case one suspects refcnt leaks.
> > 
> > Granted, in this situation there should be no leak since we put the newly
> > allocated entry in the error case.
> > 
> > > Now I see only overhead, because we need to take the nf_conntrack_lock
> > > lock to add conntrack in a list.
> > 
> > True. I don't have any preference, I guess I'd just do the insertion into the
> > unconfirmed list when we know we cannot track to keep the "unhashed"
> > bug trap in the destroy function.
> > 
> > Pablo, any preference?
> 
> I think we can initially set to zero the refcount and bump it once it
> gets into any of the lists, so Eric's golden rule also stands for
> conntracks that are released without being inserted in any list via
> nf_conntrack_free().
> 
> My idea was to use dying list to detect possible runtime leaks (ie.
> missing nf_ct_put somewhere), not simple leaks the initialization
> path, as you said, it would add too much overhead to catch them with
> the dying list, so we can skip those.
> 
> Please, let me know if you find any issue with this approach.

Hello Pablo,

I don't see any problem with this approach and I like it.
Thank you for the patch.

> diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
> index 01ea6ee..b2ac624 100644
> --- a/include/net/netfilter/nf_conntrack.h
> +++ b/include/net/netfilter/nf_conntrack.h
> @@ -284,6 +284,8 @@ extern unsigned int nf_conntrack_max;
>  extern unsigned int nf_conntrack_hash_rnd;
>  void init_nf_conntrack_hash_rnd(void);
>  
> +void nf_conntrack_tmpl_insert(struct net *net, struct nf_conn *tmpl);
> +
>  #define NF_CT_STAT_INC(net, count)	  __this_cpu_inc((net)->ct.stat->count)
>  #define NF_CT_STAT_INC_ATOMIC(net, count) this_cpu_inc((net)->ct.stat->count)
>  
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 4d1fb5d..bd5ec5a 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -448,7 +448,8 @@ nf_conntrack_hash_check_insert(struct nf_conn *ct)
>  			goto out;
>  
>  	add_timer(&ct->timeout);
> -	nf_conntrack_get(&ct->ct_general);
> +	/* The caller holds a reference to this object */
> +	atomic_set(&ct->ct_general.use, 2);
>  	__nf_conntrack_hash_insert(ct, hash, repl_hash);
>  	NF_CT_STAT_INC(net, insert);
>  	spin_unlock_bh(&nf_conntrack_lock);
> @@ -462,6 +463,21 @@ out:
>  }
>  EXPORT_SYMBOL_GPL(nf_conntrack_hash_check_insert);
>  
> +/* deletion from this larval template list happens via nf_ct_put() */
> +void nf_conntrack_tmpl_insert(struct net *net, struct nf_conn *tmpl)
> +{
> +	__set_bit(IPS_TEMPLATE_BIT, &tmpl->status);
> +	__set_bit(IPS_CONFIRMED_BIT, &tmpl->status);
> +	nf_conntrack_get(&tmpl->ct_general);
> +
> +	spin_lock_bh(&nf_conntrack_lock);
> +	/* Overload tuple linked list to put us in template list. */
> +	hlist_nulls_add_head_rcu(&tmpl->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
> +				 &net->ct.tmpl);
> +	spin_unlock_bh(&nf_conntrack_lock);
> +}
> +EXPORT_SYMBOL_GPL(nf_conntrack_tmpl_insert);
> +
>  /* Confirm a connection given skb; places it in hash table */
>  int
>  __nf_conntrack_confirm(struct sk_buff *skb)
> @@ -733,11 +749,11 @@ __nf_conntrack_alloc(struct net *net, u16 zone,
>  		nf_ct_zone->id = zone;
>  	}
>  #endif
> -	/*
> -	 * changes to lookup keys must be done before setting refcnt to 1
> +	/* Because we use RCU lookups, we set ct_general.use to zero before
> +	 * this is inserted in any list.
>  	 */
>  	smp_wmb();
> -	atomic_set(&ct->ct_general.use, 1);
> +	atomic_set(&ct->ct_general.use, 0);
>  	return ct;
>  
>  #ifdef CONFIG_NF_CONNTRACK_ZONES
> @@ -761,6 +777,11 @@ void nf_conntrack_free(struct nf_conn *ct)
>  {
>  	struct net *net = nf_ct_net(ct);
>  
> +	/* A freed object has refcnt == 0, thats
> +	 * the golden rule for SLAB_DESTROY_BY_RCU
> +	 */
> +	NF_CT_ASSERT(atomic_read(&ct->ct_general.use) == 0);
> +
>  	nf_ct_ext_destroy(ct);
>  	nf_ct_ext_free(ct);
>  	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
> @@ -856,6 +877,9 @@ init_conntrack(struct net *net, struct nf_conn *tmpl,
>  		NF_CT_STAT_INC(net, new);
>  	}
>  
> +	/* Now it is inserted into the hashes, bump refcount */
> +	nf_conntrack_get(&ct->ct_general);
> +
>  	/* Overload tuple linked list to put us in unconfirmed list. */
>  	hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
>  		       &net->ct.unconfirmed);
> diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
> index 9858e3e..52e20c9 100644
> --- a/net/netfilter/nf_synproxy_core.c
> +++ b/net/netfilter/nf_synproxy_core.c
> @@ -363,9 +363,8 @@ static int __net_init synproxy_net_init(struct net *net)
>  		goto err2;
>  	if (!nfct_synproxy_ext_add(ct))
>  		goto err2;
> -	__set_bit(IPS_TEMPLATE_BIT, &ct->status);
> -	__set_bit(IPS_CONFIRMED_BIT, &ct->status);
>  
> +	nf_conntrack_tmpl_insert(net, ct);
>  	snet->tmpl = ct;
>  
>  	snet->stats = alloc_percpu(struct synproxy_stats);
> @@ -390,7 +389,7 @@ static void __net_exit synproxy_net_exit(struct net *net)
>  {
>  	struct synproxy_net *snet = synproxy_pernet(net);
>  
> -	nf_conntrack_free(snet->tmpl);
> +	nf_ct_put(snet->tmpl);
>  	synproxy_proc_exit(net);
>  	free_percpu(snet->stats);
>  }
> diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
> index 5929be6..75747ae 100644
> --- a/net/netfilter/xt_CT.c
> +++ b/net/netfilter/xt_CT.c
> @@ -228,12 +228,7 @@ static int xt_ct_tg_check(const struct xt_tgchk_param *par,
>  			goto err3;
>  	}
>  
> -	__set_bit(IPS_TEMPLATE_BIT, &ct->status);
> -	__set_bit(IPS_CONFIRMED_BIT, &ct->status);
> -
> -	/* Overload tuple linked list to put us in template list. */
> -	hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
> -				 &par->net->ct.tmpl);
> +	nf_conntrack_tmpl_insert(par->net, ct);
>  out:
>  	info->ct = ct;
>  	return 0;


^ permalink raw reply

* Re: [Xen-devel] [PATCH v6] xen/grant-table: Avoid m2p_override during mapping
From: Zoltan Kiss @ 2014-02-03 13:27 UTC (permalink / raw)
  To: David Vrabel
  Cc: Julien Grall, Stefano Stabellini, jonathan.davies, wei.liu2,
	ian.campbell, netdev, linux-kernel, xen-devel
In-Reply-To: <52EF7618.7030402@citrix.com>

On 03/02/14 11:57, David Vrabel wrote:
>>
>> Hi,
>>
>> That's bad indeed. I think the best solution is to put those parts
>> behind an #ifdef x86. The ones moved from x86/p2m.c to grant-table.c.
>> David, Stefano, what do you think?
> I don't think we want (more) #ifdef CONFIG_X86 in grant-table.c and the
> arch-specific bits will have to factored out into their own functions
> with suitable stubs provided for ARM.
>
I've just sent in v7 with stubs, I guess that's something you suggested. 
Please review it, I'm especially curious about your thoughts regarding 
the new function name.

Zoli

^ permalink raw reply

* Re: OOPS in nf_ct_unlink_expect_report using Polycom RealPresence Mobile
From: Pablo Neira Ayuso @ 2014-02-03 12:14 UTC (permalink / raw)
  To: astx; +Cc: linux-kernel, netdev, netfilter, Alexey Dobriyan, netfilter-devel
In-Reply-To: <20140131170402.Horde.yIFUeQVjLycuS_8PGQoKmg5@aws-it.at>

[-- Attachment #1: Type: text/plain, Size: 234 bytes --]

On Fri, Jan 31, 2014 at 05:04:02PM +0100, astx wrote:
> Dear Alexey,
> 
> seems to help. Thank you for your quick response. Kernel 3.10.28 is
> now stable using h323 / Polycom.

Thanks, if no objection, will pass this patch to David.

[-- Attachment #2: 0001-netfilter-nf_nat_h323-fix-crash-in-nf_ct_unlink_expe.patch --]
[-- Type: text/x-diff, Size: 2377 bytes --]

>From d98506139d6e192705422ffba13bc2ff476ac513 Mon Sep 17 00:00:00 2001
From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Mon, 3 Feb 2014 13:07:24 +0100
Subject: [PATCH] netfilter: nf_nat_h323: fix crash in
 nf_ct_unlink_expect_report()

Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix
incorrect handling of EBUSY for RTCP expectation").

BUG: unable to handle kernel paging request at 00100104
IP: [<f8214f07>] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack]
...
Call Trace:
  [<c0244bd8>] ? del_timer+0x48/0x70
  [<f8215687>] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack]
  [<f8211c99>] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack]
  [<f8212e5e>] death_by_timeout+0x14e/0x1c0 [nf_conntrack]
  [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
  [<c024442d>] call_timer_fn+0x1d/0x80
  [<c024461e>] run_timer_softirq+0x18e/0x1a0
  [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
  [<c023e6f3>] __do_softirq+0xa3/0x170
  [<c023e650>] ? __local_bh_enable+0x70/0x70
  <IRQ>
  [<c023e587>] ? irq_exit+0x67/0xa0
  [<c0202af6>] ? do_IRQ+0x46/0xb0
  [<c027ad05>] ? clockevents_notify+0x35/0x110
  [<c066ac6c>] ? common_interrupt+0x2c/0x40
  [<c056e3c1>] ? cpuidle_enter_state+0x41/0xf0
  [<c056e6fb>] ? cpuidle_idle_call+0x8b/0x100
  [<c02085f8>] ? arch_cpu_idle+0x8/0x30
  [<c027314b>] ? cpu_idle_loop+0x4b/0x140
  [<c0273258>] ? cpu_startup_entry+0x18/0x20
  [<c066056d>] ? rest_init+0x5d/0x70
  [<c0813ac8>] ? start_kernel+0x2ec/0x2f2
  [<c081364f>] ? repair_env_string+0x5b/0x5b
  [<c0813269>] ? i386_start_kernel+0x33/0x35

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_nat_h323.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/nf_nat_h323.c b/net/ipv4/netfilter/nf_nat_h323.c
index 9eea059d..574f7eb 100644
--- a/net/ipv4/netfilter/nf_nat_h323.c
+++ b/net/ipv4/netfilter/nf_nat_h323.c
@@ -229,7 +229,10 @@ static int nat_rtp_rtcp(struct sk_buff *skb, struct nf_conn *ct,
 			ret = nf_ct_expect_related(rtcp_exp);
 			if (ret == 0)
 				break;
-			else if (ret != -EBUSY) {
+			else if (ret == -EBUSY) {
+				nf_ct_unexpect_related(rtp_exp);
+				continue;
+			} else if (ret < 0) {
 				nf_ct_unexpect_related(rtp_exp);
 				nated_port = 0;
 				break;
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH v1 3/3] ARM: STi: Add STiH416 ethernet support.
From: srinivas.kandagatla @ 2014-02-03 12:02 UTC (permalink / raw)
  To: netdev
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Rob Landley, Russell King, Srinivas Kandagatla, Stuart Menefy,
	Giuseppe Cavallaro, devicetree, linux-doc, linux-kernel,
	linux-arm-kernel, kernel, davem
In-Reply-To: <1391428787-27143-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch adds support to STiH416 SOC, which has two ethernet
snps,dwmac controllers version 3.710. With this patch B2000 and B2020
boards can boot with ethernet in MII and RGMII modes.

Tested on both B2020 and B2000.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
---
 arch/arm/boot/dts/stih416-clock.dtsi   |   14 ++++
 arch/arm/boot/dts/stih416-pinctrl.dtsi |  109 ++++++++++++++++++++++++++++++++
 arch/arm/boot/dts/stih416.dtsi         |   44 +++++++++++++
 3 files changed, 167 insertions(+)

diff --git a/arch/arm/boot/dts/stih416-clock.dtsi b/arch/arm/boot/dts/stih416-clock.dtsi
index 7026bf1..a6942c7 100644
--- a/arch/arm/boot/dts/stih416-clock.dtsi
+++ b/arch/arm/boot/dts/stih416-clock.dtsi
@@ -37,5 +37,19 @@
 			clock-frequency = <100000000>;
 			clock-output-names = "CLK_S_ICN_REG_0";
 		};
+
+		CLK_S_GMAC0_PHY: clockgenA1@7 {
+			#clock-cells = <0>;
+			compatible = "fixed-clock";
+			clock-frequency = <25000000>;
+			clock-output-names = "CLK_S_GMAC0_PHY";
+		};
+
+		CLK_S_ETH1_PHY: clockgenA0@7 {
+			#clock-cells = <0>;
+			compatible = "fixed-clock";
+			clock-frequency = <25000000>;
+			clock-output-names = "CLK_S_ETH1_PHY";
+		};
 	};
 };
diff --git a/arch/arm/boot/dts/stih416-pinctrl.dtsi b/arch/arm/boot/dts/stih416-pinctrl.dtsi
index 8863c38..c4beef2 100644
--- a/arch/arm/boot/dts/stih416-pinctrl.dtsi
+++ b/arch/arm/boot/dts/stih416-pinctrl.dtsi
@@ -132,6 +132,58 @@
 					};
 				};
 			};
+
+			gmac1 {
+				pinctrl_mii1: mii1 {
+					st,pins {
+						txd0 = <&PIO0 0 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txd1 = <&PIO0 1 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txd2 = <&PIO0 2 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txd3 = <&PIO0 3 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txer = <&PIO0 4 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txen = <&PIO0 5 ALT1 OUT SE_NICLK_IO 0 CLK_A>;
+						txclk = <&PIO0 6 ALT1 IN NICLK 0 CLK_A>;
+						col =   <&PIO0 7 ALT1 IN BYPASS 1000>;
+
+						mdio =  <&PIO1 0 ALT1 OUT BYPASS 1500>;
+						mdc =   <&PIO1 1 ALT1 OUT NICLK 0 CLK_A>;
+						crs =   <&PIO1 2 ALT1 IN BYPASS 1000>;
+						mdint = <&PIO1 3 ALT1 IN BYPASS 0>;
+						rxd0 =  <&PIO1 4 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+						rxd1 =  <&PIO1 5 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+						rxd2 =  <&PIO1 6 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+						rxd3 =  <&PIO1 7 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+
+						rxdv =  <&PIO2 0 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+						rx_er = <&PIO2 1 ALT1 IN SE_NICLK_IO 0 CLK_A>;
+						rxclk = <&PIO2 2 ALT1 IN NICLK 0 CLK_A>;
+					 	phyclk = <&PIO2 3 ALT1 OUT NICLK 0 CLK_A>;
+					};
+				};
+				pinctrl_rgmii1: rgmii1-0 {
+					st,pins {
+						txd0 =  <&PIO0 0 ALT1 OUT DE_IO 500 CLK_A>;
+						txd1 =  <&PIO0 1 ALT1 OUT DE_IO 500 CLK_A>;
+						txd2 =  <&PIO0 2 ALT1 OUT DE_IO 500 CLK_A>;
+						txd3 =  <&PIO0 3 ALT1 OUT DE_IO 500 CLK_A>;
+						txen =  <&PIO0 5 ALT1 OUT DE_IO 0   CLK_A>;
+						txclk = <&PIO0 6 ALT1 IN  NICLK 0   CLK_A>;
+
+						mdio = <&PIO1 0 ALT1 OUT BYPASS 0>;
+						mdc  = <&PIO1 1 ALT1 OUT NICLK  0 CLK_A>;
+						rxd0 = <&PIO1 4 ALT1 IN DE_IO 500 CLK_A>;
+						rxd1 = <&PIO1 5 ALT1 IN DE_IO 500 CLK_A>;
+						rxd2 = <&PIO1 6 ALT1 IN DE_IO 500 CLK_A>;
+						rxd3 = <&PIO1 7 ALT1 IN DE_IO 500 CLK_A>;
+
+						rxdv   = <&PIO2 0 ALT1 IN  DE_IO 500 CLK_A>;
+						rxclk  = <&PIO2 2 ALT1 IN  NICLK 0   CLK_A>;
+						phyclk = <&PIO2 3 ALT4 OUT NICLK 0   CLK_B>;
+
+						clk125= <&PIO3 7 ALT4 IN NICLK 0 CLK_A>;
+					};
+				};
+			};
 		};
 
 		pin-controller-front {
@@ -322,6 +374,63 @@
 					};
 				};
 			};
+
+			gmac0 {
+				pinctrl_mii0: mii0 {
+					st,pins {
+						mdint = <&PIO13 6 ALT2 IN  BYPASS      0>;
+						txen =  <&PIO13 7 ALT2 OUT SE_NICLK_IO 0 CLK_A>;
+						txd0 =  <&PIO14 0 ALT2 OUT SE_NICLK_IO 0 CLK_A>;
+						txd1 =  <&PIO14 1 ALT2 OUT SE_NICLK_IO 0 CLK_A>;
+						txd2 =  <&PIO14 2 ALT2 OUT SE_NICLK_IO 0 CLK_B>;
+						txd3 =  <&PIO14 3 ALT2 OUT SE_NICLK_IO 0 CLK_B>;
+
+						txclk = <&PIO15 0 ALT2 IN  NICLK       0 CLK_A>;
+						txer =  <&PIO15 1 ALT2 OUT SE_NICLK_IO 0 CLK_A>;
+						crs = <&PIO15 2 ALT2 IN  BYPASS 1000>;
+						col = <&PIO15 3 ALT2 IN  BYPASS 1000>;
+						mdio= <&PIO15 4 ALT2 OUT BYPASS 1500>;
+						mdc = <&PIO15 5 ALT2 OUT NICLK  0    CLK_B>;
+
+						rxd0 =  <&PIO16 0 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rxd1 =  <&PIO16 1 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rxd2 =  <&PIO16 2 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rxd3 =  <&PIO16 3 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rxdv =  <&PIO15 6 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rx_er = <&PIO15 7 ALT2 IN SE_NICLK_IO 0 CLK_A>;
+						rxclk = <&PIO17 0 ALT2 IN NICLK 0 CLK_A>;
+					 	phyclk = <&PIO13 5 ALT2 OUT NICLK 0 CLK_B>;
+					};
+				};
+
+				pinctrl_gmii0: gmii0 {
+					st,pins {
+						};
+				};
+				pinctrl_rgmii0: rgmii0 {
+					st,pins {
+						 phyclk = <&PIO13  5 ALT4 OUT NICLK 0 CLK_B>;
+						 txen = <&PIO13 7 ALT2 OUT DE_IO 0 CLK_A>;
+						 txd0  = <&PIO14 0 ALT2 OUT DE_IO 500 CLK_A>;
+						 txd1  = <&PIO14 1 ALT2 OUT DE_IO 500 CLK_A>;
+						 txd2  = <&PIO14 2 ALT2 OUT DE_IO 500 CLK_B>;
+						 txd3  = <&PIO14 3 ALT2 OUT DE_IO 500 CLK_B>;
+						 txclk = <&PIO15 0 ALT2 IN NICLK 0 CLK_A>;
+
+						 mdio = <&PIO15 4 ALT2 OUT BYPASS 0>;
+						 mdc = <&PIO15 5 ALT2 OUT NICLK 0 CLK_B>;
+
+						 rxdv = <&PIO15 6 ALT2 IN DE_IO 500 CLK_A>;
+						 rxd0 =<&PIO16 0 ALT2 IN DE_IO	500 CLK_A>;
+						 rxd1 =<&PIO16 1 ALT2 IN DE_IO	500 CLK_A>;
+						 rxd2 =<&PIO16 2 ALT2 IN DE_IO	500 CLK_A>;
+						 rxd3  =<&PIO16 3 ALT2 IN DE_IO 500 CLK_A>;
+						 rxclk =<&PIO17 0 ALT2 IN NICLK 0 CLK_A>;
+
+						 clk125=<&PIO17 6 ALT1 IN NICLK 0 CLK_A>;
+					};
+				};
+			};
 		};
 
 		pin-controller-fvdp-fe {
diff --git a/arch/arm/boot/dts/stih416.dtsi b/arch/arm/boot/dts/stih416.dtsi
index 788ba5b..a96055b 100644
--- a/arch/arm/boot/dts/stih416.dtsi
+++ b/arch/arm/boot/dts/stih416.dtsi
@@ -156,5 +156,49 @@
 
 			status		= "disabled";
 		};
+
+		ethernet0: dwmac@fe810000 {
+			device_type 	= "network";
+			compatible	= "st,stih416-dwmac", "snps,dwmac", "snps,dwmac-3.710";
+			status 		= "disabled";
+			reg 		= <0xfe810000 0x8000>, <0x8bc 0x4>;
+			reg-names	= "stmmaceth", "sti-ethconf";
+
+			interrupts = <0 133 0>, <0 134 0>, <0 135 0>;
+			interrupt-names = "macirq", "eth_wake_irq", "eth_lpi";
+
+			snps,pbl 	= <32>;
+			snps,mixed-burst;
+
+			st,syscon		= <&syscfg_rear>;
+			resets			= <&softreset STIH416_ETH0_SOFTRESET>;
+			reset-names		= "stmmaceth";
+			pinctrl-names 	= "default";
+			pinctrl-0	= <&pinctrl_mii0>;
+			clock-names	= "stmmaceth";
+			clocks		= <&CLK_S_GMAC0_PHY>;
+		};
+
+		ethernet1: dwmac@fef08000 {
+			device_type = "network";
+			compatible		= "st,stih416-dwmac", "snps,dwmac", "snps,dwmac-3.710";
+			status 		= "disabled";
+			reg		= <0xfef08000 0x8000>, <0x7f0 0x4>;
+			reg-names	= "stmmaceth", "sti-ethconf";
+			interrupts = <0 136 0>, <0 137 0>, <0 138 0>;
+			interrupt-names = "macirq", "eth_wake_irq", "eth_lpi";
+
+			snps,pbl	= <32>;
+			snps,mixed-burst;
+
+			st,syscon	= <&syscfg_sbc>;
+
+			resets		= <&softreset STIH416_ETH1_SOFTRESET>;
+			reset-names	= "stmmaceth";
+			pinctrl-names 	= "default";
+			pinctrl-0	= <&pinctrl_mii1>;
+			clock-names	= "stmmaceth";
+			clocks		= <&CLK_S_ETH1_PHY>;
+		};
 	};
 };
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH v1 2/3] ARM: STi: Add STiH415 ethernet support.
From: srinivas.kandagatla @ 2014-02-03 12:01 UTC (permalink / raw)
  To: netdev
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Rob Landley, Russell King, Srinivas Kandagatla, Stuart Menefy,
	Giuseppe Cavallaro, devicetree, linux-doc, linux-kernel,
	linux-arm-kernel, kernel, davem
In-Reply-To: <1391428787-27143-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

This patch adds support to STiH415 SOC, which has two ethernet
snps,dwmac controllers version 3.610. With this patch B2000 and B2020
boards can boot with ethernet in MII and RGMII modes.

Tested on both B2020 and B2000.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
---
 arch/arm/boot/dts/stih415-clock.dtsi   |   14 ++++
 arch/arm/boot/dts/stih415-pinctrl.dtsi |  121 ++++++++++++++++++++++++++++++++
 arch/arm/boot/dts/stih415.dtsi         |   48 +++++++++++++
 arch/arm/boot/dts/stih41x-b2000.dtsi   |   22 ++++++
 arch/arm/boot/dts/stih41x-b2020.dtsi   |   26 +++++++
 5 files changed, 231 insertions(+)

diff --git a/arch/arm/boot/dts/stih415-clock.dtsi b/arch/arm/boot/dts/stih415-clock.dtsi
index 174c799..d047dbc 100644
--- a/arch/arm/boot/dts/stih415-clock.dtsi
+++ b/arch/arm/boot/dts/stih415-clock.dtsi
@@ -34,5 +34,19 @@
 			compatible = "fixed-clock";
 			clock-frequency = <100000000>;
 		};
+
+		CLKS_GMAC0_PHY: clockgenA1@7 {
+			#clock-cells = <0>;
+			compatible = "fixed-clock";
+			clock-frequency = <25000000>;
+			clock-output-names = "CLKS_GMAC0_PHY";
+		};
+
+		CLKS_ETH1_PHY: clockgenA0@7 {
+			#clock-cells = <0>;
+			compatible = "fixed-clock";
+			clock-frequency = <25000000>;
+			clock-output-names = "CLKS_ETH1_PHY";
+		};
 	};
 };
diff --git a/arch/arm/boot/dts/stih415-pinctrl.dtsi b/arch/arm/boot/dts/stih415-pinctrl.dtsi
index 887c5e5..9ca20aa 100644
--- a/arch/arm/boot/dts/stih415-pinctrl.dtsi
+++ b/arch/arm/boot/dts/stih415-pinctrl.dtsi
@@ -119,6 +119,56 @@
 					};
 				};
 			};
+
+			gmac1 {
+				pinctrl_mii1: mii1 {
+						st,pins {
+						 txd0   = <&PIO0 0 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txd1   = <&PIO0 1 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txd2   = <&PIO0 2 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txd3   = <&PIO0 3 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txer   = <&PIO0 4 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txen   = <&PIO0 5 ALT1 OUT  SE_NICLK_IO	0	CLK_A>;
+						 txclk  = <&PIO0 6 ALT1 IN   NICLK	0	CLK_A>;
+						 col    = <&PIO0 7 ALT1 IN   BYPASS	1000>;
+						 mdio   = <&PIO1 0 ALT1 OUT  BYPASS	0>;
+						 mdc    = <&PIO1 1 ALT1 OUT  NICLK	0	CLK_A>;
+						 crs    = <&PIO1 2 ALT1 IN   BYPASS	1000>;
+						 mdint  = <&PIO1 3 ALT1 IN   BYPASS	0>;
+						 rxd0   = <&PIO1 4 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rxd1   = <&PIO1 5 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rxd2   = <&PIO1 6 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rxd3   = <&PIO1 7 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rxdv   = <&PIO2 0 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rx_er  = <&PIO2 1 ALT1 IN   SE_NICLK_IO	0	CLK_A>;
+						 rxclk  = <&PIO2 2 ALT1 IN   NICLK	0	CLK_A>;
+						 phyclk = <&PIO2 3 ALT1 IN   NICLK	1000	CLK_A>;
+					};
+				};
+
+				pinctrl_rgmii1: rgmii1-0 {
+					st,pins {
+						 txd0 =	 <&PIO0 0 ALT1 OUT DE_IO	1000	CLK_A>;
+						 txd1 =	 <&PIO0 1 ALT1 OUT DE_IO	1000	CLK_A>;
+						 txd2 =	 <&PIO0 2 ALT1 OUT DE_IO	1000	CLK_A>;
+						 txd3 =	 <&PIO0 3 ALT1 OUT DE_IO	1000	CLK_A>;
+						 txen =	 <&PIO0 5 ALT1 OUT DE_IO	0	CLK_A>;
+						 txclk = <&PIO0 6 ALT1 IN	NICLK	0	CLK_A>;
+						 mdio =	 <&PIO1 0 ALT1 OUT	BYPASS	0>;
+						 mdc =	 <&PIO1 1 ALT1 OUT	NICLK	0	CLK_A>;
+						 rxd0 =	 <&PIO1 4 ALT1 IN DE_IO	0	CLK_A>;
+						 rxd1 =	 <&PIO1 5 ALT1 IN DE_IO	0	CLK_A>;
+						 rxd2 =	 <&PIO1 6 ALT1 IN DE_IO	0	CLK_A>;
+						 rxd3 =	 <&PIO1 7 ALT1 IN DE_IO	0	CLK_A>;
+
+						 rxdv =	  <&PIO2 0 ALT1 IN DE_IO	500	CLK_A>;
+						 rxclk =  <&PIO2 2 ALT1 IN	NICLK	0	CLK_A>;
+						 phyclk = <&PIO2 3 ALT4 OUT	NICLK	0	CLK_B>;
+
+						 clk125= <&PIO3 7 ALT4 IN 	NICLK	0	CLK_A>;
+					};
+				};
+			};
 		};
 
 		pin-controller-front {
@@ -284,6 +334,77 @@
 					};
 				};
 			};
+
+			gmac0{
+				pinctrl_mii0: mii0 {
+					st,pins {
+					 mdint =	<&PIO13 6 ALT2	IN	BYPASS		0>;
+					 txen =		<&PIO13 7 ALT2	OUT	SE_NICLK_IO	0	CLK_A>;
+
+					 txd0 =		<&PIO14 0 ALT2	OUT	SE_NICLK_IO	0	CLK_A>;
+					 txd1 =		<&PIO14 1 ALT2	OUT	SE_NICLK_IO	0	CLK_A>;
+					 txd2 =		<&PIO14 2 ALT2	OUT	SE_NICLK_IO	0	CLK_B>;
+					 txd3 =		<&PIO14 3 ALT2	OUT	SE_NICLK_IO	0	CLK_B>;
+
+					 txclk =	<&PIO15 0 ALT2	IN	NICLK		0	CLK_A>;
+					 txer =		<&PIO15 1 ALT2	OUT	SE_NICLK_IO	0	CLK_A>;
+					 crs =		<&PIO15 2 ALT2	IN	BYPASS		1000>;
+					 col =		<&PIO15 3 ALT2	IN	BYPASS		1000>;
+					 mdio  =        <&PIO15 4 ALT2	OUT	BYPASS 	3000>;
+					 mdc   =        <&PIO15 5 ALT2	OUT     NICLK  	0    	CLK_B>;
+
+					 rxd0 =		<&PIO16 0 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rxd1 =		<&PIO16 1 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rxd2 =		<&PIO16 2 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rxd3 =		<&PIO16 3 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rxdv =		<&PIO15 6 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rx_er =	<&PIO15 7 ALT2	IN	SE_NICLK_IO	0	CLK_A>;
+					 rxclk =	<&PIO17 0 ALT2	IN	NICLK		0	CLK_A>;
+					 phyclk =	<&PIO13 5 ALT2	OUT	NICLK	1000	CLK_A>;
+
+					};
+				};
+
+			pinctrl_gmii0: gmii0 {
+				st,pins {
+					 mdint =	<&PIO13 6	ALT2 IN		BYPASS	0>;
+					 mdio  =        <&PIO15 4 	ALT2 OUT	BYPASS 	3000>;
+					 mdc   =        <&PIO15 5 	ALT2 OUT    	NICLK  	0    	CLK_B>;
+					 txen =		<&PIO13 7	ALT2 OUT	SE_NICLK_IO	3000	CLK_A>;
+
+					 txd0 =		<&PIO14 0	ALT2 OUT	SE_NICLK_IO	3000	CLK_A>;
+					 txd1 =		<&PIO14 1	ALT2 OUT	SE_NICLK_IO	3000	CLK_A>;
+					 txd2 =		<&PIO14 2	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+					 txd3 =		<&PIO14 3	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+					 txd4 =		<&PIO14 4	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+					 txd5 =		<&PIO14 5	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+					 txd6 =		<&PIO14 6	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+					 txd7 =		<&PIO14 7	ALT2 OUT	SE_NICLK_IO	3000	CLK_B>;
+
+					 txclk =	<&PIO15 0	ALT2 IN		NICLK	0	CLK_A>;
+					 txer =		<&PIO15 1	ALT2 OUT 	SE_NICLK_IO	3000	CLK_A>;
+					 crs =		<&PIO15 2	ALT2 IN		BYPASS	1000>;
+					 col =		<&PIO15 3	ALT2 IN		BYPASS	1000>;
+					 rxdv =		<&PIO15 6	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rx_er =	<&PIO15 7	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+
+					 rxd0 =		<&PIO16 0	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd1 =		<&PIO16 1	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd2 =		<&PIO16 2	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd3 =		<&PIO16 3	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd4 =		<&PIO16 4	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd5 =		<&PIO16 5	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd6 =		<&PIO16 6	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+					 rxd7 =		<&PIO16 7	ALT2 IN		SE_NICLK_IO	1500	CLK_A>;
+
+					 rxclk =	<&PIO17 0	ALT2 IN	NICLK	0	CLK_A>;
+					 clk125 =	<&PIO17 6	ALT1 IN	NICLK	0	CLK_A>;
+                                         phyclk =       <&PIO13 5       ALT4 OUT NICLK   0       CLK_B>;
+
+
+					};
+				};
+			};
 		};
 
 		pin-controller-left {
diff --git a/arch/arm/boot/dts/stih415.dtsi b/arch/arm/boot/dts/stih415.dtsi
index d52207c..cc9b22b 100644
--- a/arch/arm/boot/dts/stih415.dtsi
+++ b/arch/arm/boot/dts/stih415.dtsi
@@ -147,5 +147,53 @@
 
 			status		= "disabled";
 		};
+
+		ethernet0: dwmac@fe810000 {
+			device_type 	= "network";
+			compatible	= "st,stih415-dwmac", "snps,dwmac", "snps,dwmac-3.610";
+			status 		= "disabled";
+
+			reg 		= <0xfe810000 0x8000>, <0x148 0x4>;
+			reg-names	= "stmmaceth", "sti-ethconf";
+
+			interrupts 	= <0 147 0>, <0 148 0>, <0 149 0>;
+			interrupt-names = "macirq", "eth_wake_irq", "eth_lpi";
+			resets			= <&softreset STIH415_ETH0_SOFTRESET>;
+			reset-names		= "stmmaceth";
+
+			snps,pbl 	= <32>;
+			snps,mixed-burst;
+			snps,force_sf_dma_mode;
+
+			st,syscon	= <&syscfg_rear>;
+
+			pinctrl-names 	= "default";
+			pinctrl-0	= <&pinctrl_mii0>;
+			clock-names	= "stmmaceth";
+			clocks		= <&CLKS_GMAC0_PHY>;
+		};
+
+		ethernet1: dwmac@fef08000 {
+			device_type = "network";
+			compatible	= "st,stih415-dwmac", "snps,dwmac", "snps,dwmac-3.610";
+			status 		= "disabled";
+			reg		= <0xfef08000 0x8000>, <0x74 0x4>;
+			reg-names	= "stmmaceth", "sti-ethconf";
+			interrupts 	= <0 150 0>, <0 151 0>, <0 152 0>;
+			interrupt-names = "macirq", "eth_wake_irq", "eth_lpi";
+
+			snps,pbl	= <32>;
+			snps,mixed-burst;
+			snps,force_sf_dma_mode;
+
+			st,syscon		= <&syscfg_sbc>;
+
+			resets			= <&softreset STIH415_ETH1_SOFTRESET>;
+			reset-names		= "stmmaceth";
+			pinctrl-names 	= "default";
+			pinctrl-0	= <&pinctrl_mii1>;
+			clock-names	= "stmmaceth";
+			clocks		= <&CLKS_ETH1_PHY>;
+		};
 	};
 };
diff --git a/arch/arm/boot/dts/stih41x-b2000.dtsi b/arch/arm/boot/dts/stih41x-b2000.dtsi
index 1e6aa92..bf65c49 100644
--- a/arch/arm/boot/dts/stih41x-b2000.dtsi
+++ b/arch/arm/boot/dts/stih41x-b2000.dtsi
@@ -20,6 +20,8 @@
 
 	aliases {
 		ttyAS0 = &serial2;
+		ethernet0 = &ethernet0;
+		ethernet1 = &ethernet1;
 	};
 
 	soc {
@@ -46,5 +48,25 @@
 
 			status = "okay";
 		};
+
+		ethernet0: dwmac@fe810000 {
+			status			= "okay";
+			phy-mode		= "mii";
+			pinctrl-0		= <&pinctrl_mii0>;
+
+			snps,reset-gpio 	= <&PIO106 2>;
+			snps,reset-active-low;
+			snps,reset-delays-us 	= <0 10000 10000>;
+		};
+
+		ethernet1: dwmac@fef08000 {
+			status			= "disabled";
+			phy-mode		= "mii";
+			st,tx-retime-src	= "txclk";
+
+			snps,reset-gpio 	= <&PIO4 7>;
+			snps,reset-active-low;
+			snps,reset-delays-us 	= <0 10000 10000>;
+		};
 	};
 };
diff --git a/arch/arm/boot/dts/stih41x-b2020.dtsi b/arch/arm/boot/dts/stih41x-b2020.dtsi
index 0ef0a69..6c9a2ab 100644
--- a/arch/arm/boot/dts/stih41x-b2020.dtsi
+++ b/arch/arm/boot/dts/stih41x-b2020.dtsi
@@ -19,6 +19,7 @@
 
 	aliases {
 		ttyAS0 = &sbc_serial1;
+		ethernet1 = &ethernet1;
 	};
 	soc {
 		sbc_serial1: serial@fe531000 {
@@ -60,5 +61,30 @@
 		i2c@fe541000 {
 			status = "okay";
 		};
+
+		/**
+		* ethernet clk routing:
+		* for
+		* 	max-speed = <1000>;
+		* set
+		* 	st,tx-retime-src	= "clk_125";
+		*
+		* for
+		*	max-speed = <100>;
+		* set
+		*	st,tx-retime-src	= "clkgen";
+		*/
+
+		ethernet1: dwmac@fef08000 {
+			status			= "okay";
+			phy-mode		= "rgmii-id";
+			max-speed		= <1000>;
+			st,tx-retime-src	= "clk_125";
+			snps,reset-gpio 	= <&PIO3 0>;
+			snps,reset-active-low;
+			snps,reset-delays-us 	= <0 10000 10000>;
+
+			pinctrl-0	= <&pinctrl_rgmii1>;
+		};
 	};
 };
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH v1 1/3] net: stmmac:sti: Add STi SOC glue driver.
From: srinivas.kandagatla @ 2014-02-03 12:01 UTC (permalink / raw)
  To: netdev
  Cc: Mark Rutland, devicetree, Russell King, kernel, Pawel Moll,
	Ian Campbell, Srinivas Kandagatla, linux-doc, linux-kernel,
	Stuart Menefy, Rob Herring, Rob Landley, Kumar Gala,
	Giuseppe Cavallaro, davem, linux-arm-kernel
In-Reply-To: <1391428787-27143-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas Kandagatla <srinivas.kandagatla@st.com>

STi series SOCs have a glue layer on top of the synopsis gmac IP, this
glue layer needs to be configured before the gmac driver starts using
the IP.

This patch adds a support to this glue layer which is configured via
stmmac setup, init, exit callbacks.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
---
 .../devicetree/bindings/net/sti-dwmac.txt          |   58 ++++
 drivers/net/ethernet/stmicro/stmmac/Kconfig        |   11 +
 drivers/net/ethernet/stmicro/stmmac/Makefile       |    1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c    |  331 ++++++++++++++++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    3 +
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |    5 +
 6 files changed, 409 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/sti-dwmac.txt
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c

diff --git a/Documentation/devicetree/bindings/net/sti-dwmac.txt b/Documentation/devicetree/bindings/net/sti-dwmac.txt
new file mode 100644
index 0000000..3dd3d0b
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/sti-dwmac.txt
@@ -0,0 +1,58 @@
+STMicroelectronics SoC DWMAC glue layer controller
+
+The device node has following properties.
+
+Required properties:
+ - compatible	: Can be "st,stih415-dwmac", "st,stih416-dwmac" or
+   "st,stid127-dwmac".
+ - reg		: Offset of the glue configuration register map in system
+   configuration regmap pointed by st,syscon property and size.
+
+ - reg-names	: Should be "sti-ethconf".
+
+ - st,syscon	: Should be phandle to system configuration node which
+   encompases this glue registers.
+
+ - st,tx-retime-src: On STi Parts for Giga bit speeds, 125Mhz clocks can be
+   wired up in from different sources. One via TXCLK pin and other via CLK_125
+   pin. This wiring is totally board dependent. However the retiming glue
+   logic should be configured accordingly. Possible values for this property
+
+	   "txclk" - if 125Mhz clock is wired up via txclk line.
+	   "clk_125" - if 125Mhz clock is wired up via clk_125 line.
+
+   This property is only valid for Giga bit setup( GMII, RGMII), and it is
+   un-used for non-giga bit (MII and RMII) setups. Also note that internal
+   clockgen can not generate stable 125Mhz clock.
+
+ - st,ext-phyclk: This boolean property indicates who is generating the clock
+  for tx and rx. This property is only valid for RMII case where the clock can
+  be generated from the MAC or PHY.
+
+ - clock-names: should be "sti-ethclk".
+ - clocks: Should point to ethernet clockgen which can generate phyclk.
+
+
+Example:
+
+ethernet0: dwmac@fe810000 {
+	device_type 	= "network";
+	compatible	= "st,stih416-dwmac", "snps,dwmac", "snps,dwmac-3.710";
+	reg 		= <0xfe810000 0x8000>, <0x8bc 0x4>;
+	reg-names	= "stmmaceth", "sti-ethconf";
+	interrupts	= <0 133 0>, <0 134 0>, <0 135 0>;
+	interrupt-names	= "macirq", "eth_wake_irq", "eth_lpi";
+	phy-mode	= "mii";
+
+	st,syscon	= <&syscfg_rear>;
+
+	snps,pbl 	= <32>;
+	snps,mixed-burst;
+
+	resets		= <&softreset STIH416_ETH0_SOFTRESET>;
+	reset-names	= "stmmaceth";
+	pinctrl-0	= <&pinctrl_mii0>;
+	pinctrl-names 	= "default";
+	clocks		= <&CLK_S_GMAC0_PHY>;
+	clock-names	= "stmmaceth";
+};
diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index e2f202e..f2d7c70 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -37,6 +37,17 @@ config DWMAC_SUNXI
 	  stmmac device driver. This driver is used for A20/A31
 	  GMAC 	  ethernet controller.
 
+config DWMAC_STI
+	bool "STi GMAC support"
+	depends on STMMAC_PLATFORM && ARCH_STI
+	default y
+	---help---
+	  Support for ethernet controller on STi SOCs.
+
+	  This selects STi SoC glue layer support for the stmmac
+	  device driver. This driver is used on for the STi series
+	  SOCs GMAC ethernet controller.
+
 config STMMAC_PCI
 	bool "STMMAC PCI bus support"
 	depends on STMMAC_ETH && PCI
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile
index ecadece..dcef287 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_STMMAC_ETH) += stmmac.o
 stmmac-$(CONFIG_STMMAC_PLATFORM) += stmmac_platform.o
 stmmac-$(CONFIG_STMMAC_PCI) += stmmac_pci.o
 stmmac-$(CONFIG_DWMAC_SUNXI) += dwmac-sunxi.o
+stmmac-$(CONFIG_DWMAC_STI) += dwmac-sti.o
 stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o ring_mode.o	\
 	      chain_mode.o dwmac_lib.o dwmac1000_core.o  dwmac1000_dma.o \
 	      dwmac100_core.o dwmac100_dma.o enh_desc.o  norm_desc.o \
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c
new file mode 100644
index 0000000..d87584cb
--- /dev/null
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c
@@ -0,0 +1,331 @@
+/**
+ * dwmac-sti.c - STMicroelectronics DWMAC Specific Glue layer
+ *
+ * Copyright (C) 2003-2014 STMicroelectronics (R&D) Limited
+ * Author: Srinivas Kandagatla <srinivas.kandagatla@st.com>
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+#include <linux/stmmac.h>
+#include <linux/phy.h>
+#include <linux/mfd/syscon.h>
+#include <linux/regmap.h>
+#include <linux/clk.h>
+#include <linux/of.h>
+#include <linux/of_net.h>
+
+/**
+ *			STi GMAC glue logic.
+ *			--------------------
+ *
+ *		 _
+ *		|  \
+ *	--------|0  \ ETH_SEL_INTERNAL_NOTEXT_PHYCLK
+ * phyclk	|    |___________________________________________
+ *		|    |	|			(phyclk-in)
+ *	--------|1  /	|
+ * int-clk	|_ /	|
+ *			|	 _
+ *			|	|  \
+ *			|_______|0  \ ETH_SEL_TX_RETIME_CLK
+ *				|    |___________________________
+ *				|    |		(tx-retime-clk)
+ *			 _______|1  /
+ *			|	|_ /
+ *		 _	|
+ *		|  \	|
+ *	--------|0  \	|
+ * clk_125	|    |__|
+ *		|    |	ETH_SEL_TXCLK_NOT_CLK125
+ *	--------|1  /
+ * txclk	|_ /
+ *
+ *
+ * ETH_SEL_INTERNAL_NOTEXT_PHYCLK is valid only for RMII where PHY can
+ * generate 50MHz clock or MAC can generate it.
+ * This bit is configured by "st,ext-phyclk" property.
+ *
+ * ETH_SEL_TXCLK_NOT_CLK125 is only valid for gigabit modes, where the 125Mhz
+ * clock either comes from clk-125 pin or txclk pin. This configuration is
+ * totally driven by the board wiring. This bit is configured by
+ * "st,tx-retime-src" property.
+ *
+ * TXCLK configuration is different for different phy interface modes
+ * and changes according to link speed in modes like RGMII.
+ *
+ * Below table summarizes the clock requirement and clock sources for
+ * supported phy interface modes with link speeds.
+ * ________________________________________________
+ *|  PHY_MODE	| 1000 Mbit Link | 100 Mbit Link   |
+ * ------------------------------------------------
+ *|	MII	|	n/a	 |	25Mhz	   |
+ *|		|		 |	txclk	   |
+ * ------------------------------------------------
+ *|	GMII	|     125Mhz	 |	25Mhz	   |
+ *|		|  clk-125/txclk |	txclk	   |
+ * ------------------------------------------------
+ *|	RGMII	|     125Mhz	 |	25Mhz	   |
+ *|		|  clk-125/txclk |	clkgen     |
+ * ------------------------------------------------
+ *|	RMII	|	n/a	 |	25Mhz	   |
+ *|		|		 |clkgen/phyclk-in |
+ * ------------------------------------------------
+ *
+ * TX lines are always retimed with a clk, which can vary depending
+ * on the board configuration. Below is the table of these bits
+ * in eth configuration register depending on source of retime clk.
+ *
+ *---------------------------------------------------------------
+ * src	 | tx_rt_clk	| int_not_ext_phyclk	| txclk_n_clk125|
+ *---------------------------------------------------------------
+ * txclk |	0	|	n/a		|	1	|
+ *---------------------------------------------------------------
+ * ck_125|	0	|	n/a		|	0	|
+ *---------------------------------------------------------------
+ * phyclk|	1	|	0		|	n/a	|
+ *---------------------------------------------------------------
+ * clkgen|	1	|	1		|	n/a	|
+ *---------------------------------------------------------------
+ */
+
+ /* Register definition */
+
+ /* 3 bits [8:6]
+ *  [6:6]	ETH_SEL_TXCLK_NOT_CLK125
+ *  [7:7]	ETH_SEL_INTERNAL_NOTEXT_PHYCLK
+ *  [8:8]	ETH_SEL_TX_RETIME_CLK
+ *
+ */
+
+#define TX_RETIME_SRC_MASK		GENMASK(8, 6)
+#define ETH_SEL_TX_RETIME_CLK		BIT(8)
+#define ETH_SEL_INTERNAL_NOTEXT_PHYCLK	BIT(7)
+#define ETH_SEL_TXCLK_NOT_CLK125	BIT(6)
+
+#define ENMII_MASK			GENMASK(5, 5)
+#define ENMII				BIT(5)
+
+/**
+ * 3 bits [4:2]
+ *	000-GMII/MII
+ *	001-RGMII
+ *	010-SGMII
+ *	100-RMII
+*/
+#define MII_PHY_SEL_MASK		GENMASK(4, 2)
+#define ETH_PHY_SEL_RMII		BIT(4)
+#define ETH_PHY_SEL_SGMII		BIT(3)
+#define ETH_PHY_SEL_RGMII		BIT(2)
+#define ETH_PHY_SEL_GMII		0x0
+#define ETH_PHY_SEL_MII			0x0
+
+#define IS_PHY_IF_MODE_RGMII(iface)	(iface == PHY_INTERFACE_MODE_RGMII || \
+			iface == PHY_INTERFACE_MODE_RGMII_ID || \
+			iface == PHY_INTERFACE_MODE_RGMII_RXID || \
+			iface == PHY_INTERFACE_MODE_RGMII_TXID)
+
+#define IS_PHY_IF_MODE_GBIT(iface)	(IS_PHY_IF_MODE_RGMII(iface) || \
+			iface == PHY_INTERFACE_MODE_GMII)
+
+struct sti_dwmac {
+	int	interface;
+	bool	ext_phyclk;
+	bool	is_tx_retime_src_clk_125;
+	struct clk *clk;
+	int	reg;
+	struct	device *dev;
+	struct	regmap *regmap;
+};
+
+static u32 phy_intf_sels[] = {
+	[PHY_INTERFACE_MODE_MII]	= ETH_PHY_SEL_MII,
+	[PHY_INTERFACE_MODE_GMII]	= ETH_PHY_SEL_GMII,
+	[PHY_INTERFACE_MODE_RGMII]	= ETH_PHY_SEL_RGMII,
+	[PHY_INTERFACE_MODE_RGMII_ID]	= ETH_PHY_SEL_RGMII,
+	[PHY_INTERFACE_MODE_SGMII]	= ETH_PHY_SEL_SGMII,
+	[PHY_INTERFACE_MODE_RMII]	= ETH_PHY_SEL_RMII,
+};
+
+enum {
+	TX_RETIME_SRC_NA = 0,
+	TX_RETIME_SRC_TXCLK = 1,
+	TX_RETIME_SRC_CLK_125,
+	TX_RETIME_SRC_PHYCLK,
+	TX_RETIME_SRC_CLKGEN,
+};
+
+static const char * const tx_retime_srcs[] = {
+	[TX_RETIME_SRC_NA]		= "",
+	[TX_RETIME_SRC_TXCLK]		= "txclk",
+	[TX_RETIME_SRC_CLK_125]		= "clk_125",
+	[TX_RETIME_SRC_PHYCLK]		= "phyclk",
+	[TX_RETIME_SRC_CLKGEN]		= "clkgen",
+};
+
+static u32 tx_retime_val[] = {
+	[TX_RETIME_SRC_TXCLK]	= ETH_SEL_TXCLK_NOT_CLK125,
+	[TX_RETIME_SRC_CLK_125]	= 0x0,
+	[TX_RETIME_SRC_PHYCLK]	= ETH_SEL_TX_RETIME_CLK,
+	[TX_RETIME_SRC_CLKGEN]	= ETH_SEL_TX_RETIME_CLK |
+				ETH_SEL_INTERNAL_NOTEXT_PHYCLK,
+};
+
+static void setup_retime_src(struct sti_dwmac *dwmac, u32 spd)
+{
+	u32 src = 0, freq = 0;
+
+	if (spd == SPEED_100) {
+		if (dwmac->interface == PHY_INTERFACE_MODE_MII ||
+			dwmac->interface == PHY_INTERFACE_MODE_GMII) {
+				src = TX_RETIME_SRC_TXCLK;
+		} else if (dwmac->interface == PHY_INTERFACE_MODE_RMII) {
+			if (dwmac->ext_phyclk) {
+				src = TX_RETIME_SRC_PHYCLK;
+			} else {
+				src = TX_RETIME_SRC_CLKGEN;
+				freq = 50000000;
+			}
+
+		} else if (IS_PHY_IF_MODE_RGMII(dwmac->interface)) {
+			src = TX_RETIME_SRC_CLKGEN;
+			freq = 25000000;
+		}
+
+		if (src == TX_RETIME_SRC_CLKGEN && dwmac->clk)
+			clk_set_rate(dwmac->clk, freq);
+
+	} else if (spd == SPEED_1000) {
+		if (dwmac->is_tx_retime_src_clk_125)
+			src = TX_RETIME_SRC_CLK_125;
+		else
+			src = TX_RETIME_SRC_TXCLK;
+	}
+
+	regmap_update_bits(dwmac->regmap, dwmac->reg,
+				TX_RETIME_SRC_MASK, tx_retime_val[src]);
+}
+
+static void sti_dwmac_exit(struct platform_device *pdev, void *priv)
+{
+	struct sti_dwmac *dwmac = priv;
+
+	if (dwmac->clk)
+		clk_disable_unprepare(dwmac->clk);
+}
+
+static void sti_fix_mac_speed(void *priv, unsigned int spd)
+{
+	struct sti_dwmac *dwmac = priv;
+	setup_retime_src(dwmac, spd);
+	return;
+}
+
+static int sti_dwmac_parse_data(struct sti_dwmac *dwmac,
+		struct platform_device *pdev)
+{
+	struct resource *res;
+	struct device *dev	= &pdev->dev;
+	struct device_node *np	= dev->of_node;
+	struct regmap	*regmap;
+	int err;
+
+	if (!np)
+		return -EINVAL;
+
+	res = platform_get_resource_byname(pdev,
+				IORESOURCE_MEM, "sti-ethconf");
+	if (!res)
+		return -ENODATA;
+
+	regmap = syscon_regmap_lookup_by_phandle(np, "st,syscon");
+	if (IS_ERR(regmap))
+		return PTR_ERR(regmap);
+
+	dwmac->dev = dev;
+	dwmac->interface = of_get_phy_mode(np);
+	dwmac->regmap = regmap;
+	dwmac->reg = res->start;
+	dwmac->ext_phyclk = of_property_read_bool(np, "st,ext-phyclk");
+
+	dwmac->is_tx_retime_src_clk_125 = false;
+
+	if (IS_PHY_IF_MODE_GBIT(dwmac->interface)) {
+		const char *rs;
+		err = of_property_read_string(np, "st,tx-retime-src", &rs);
+		if (err < 0) {
+			dev_err(dev, "st,tx-retime-src not specified\n");
+			return err;
+		}
+
+		if (!strcasecmp(rs, "clk_125"))
+			dwmac->is_tx_retime_src_clk_125 = true;
+
+	}
+
+	dwmac->clk = devm_clk_get(dev, "sti-ethclk");
+
+	if (IS_ERR(dwmac->clk))
+		dwmac->clk = NULL;
+
+	return 0;
+}
+
+static int sti_dwmac_init(struct platform_device *pdev, void *priv)
+{
+	struct sti_dwmac *dwmac = priv;
+	struct regmap *regmap = dwmac->regmap;
+	int iface = dwmac->interface;
+	u32 reg = dwmac->reg;
+	u32 val, spd;
+
+	if (dwmac->clk)
+		clk_prepare_enable(dwmac->clk);
+
+	regmap_update_bits(regmap, reg, MII_PHY_SEL_MASK,
+					phy_intf_sels[iface]);
+
+	val = (iface == PHY_INTERFACE_MODE_REVMII) ? 0 : ENMII;
+	regmap_update_bits(regmap, reg, ENMII_MASK, val);
+
+	if (IS_PHY_IF_MODE_GBIT(iface))
+		spd = SPEED_1000;
+	else
+		spd = SPEED_100;
+
+	setup_retime_src(dwmac, spd);
+
+	return 0;
+}
+
+static void *sti_dwmac_setup(struct platform_device *pdev)
+{
+	struct sti_dwmac	*dwmac;
+	int ret;
+
+	dwmac = devm_kzalloc(&pdev->dev, sizeof(*dwmac), GFP_KERNEL);
+	if (!dwmac)
+		return ERR_PTR(-ENOMEM);
+
+	ret = sti_dwmac_parse_data(dwmac, pdev);
+	if (ret) {
+		dev_err(&pdev->dev, "Unable to parse OF data\n");
+		return ERR_PTR(ret);
+	}
+
+	return dwmac;
+}
+
+const struct stmmac_of_data sti_gmac_data = {
+	.fix_mac_speed = sti_fix_mac_speed,
+	.setup = sti_dwmac_setup,
+	.init = sti_dwmac_init,
+	.exit = sti_dwmac_exit,
+};
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index d9af26e..f9e60d7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -133,6 +133,9 @@ bool stmmac_eee_init(struct stmmac_priv *priv);
 #ifdef CONFIG_DWMAC_SUNXI
 extern const struct stmmac_of_data sun7i_gmac_data;
 #endif
+#ifdef CONFIG_DWMAC_STI
+extern const struct stmmac_of_data sti_gmac_data;
+#endif
 extern struct platform_driver stmmac_pltfr_driver;
 static inline int stmmac_register_platform(void)
 {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 5884a7d..c61bc72b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -33,6 +33,11 @@ static const struct of_device_id stmmac_dt_ids[] = {
 #ifdef CONFIG_DWMAC_SUNXI
 	{ .compatible = "allwinner,sun7i-a20-gmac", .data = &sun7i_gmac_data},
 #endif
+#ifdef CONFIG_DWMAC_STI
+	{ .compatible = "st,stih415-dwmac", .data = &sti_gmac_data},
+	{ .compatible = "st,stih416-dwmac", .data = &sti_gmac_data},
+	{ .compatible = "st,stih127-dwmac", .data = &sti_gmac_data},
+#endif
 	/* SoC specific glue layers should come before generic bindings */
 	{ .compatible = "st,spear600-gmac"},
 	{ .compatible = "snps,dwmac-3.610"},
-- 
1.7.9.5

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox