Netdev List

* Re: pull request: wireless-next-2.6 2009-10-28
From: Michael Buesch @ 2009-10-29 14:48 UTC
  To: Gertjan van Wingerde
  Cc: David Miller, bzolnier-Re5JQEeQqe8AvxtiuMwx3w,
	penberg-bbCR+/B0CizivPeTLB3BmA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linville-2XuSBdqkA4R54TAoqtyWWQ
In-Reply-To: <14add3d10910290744n3abd1cf8w42a6311108eb2fa7-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Thursday 29 October 2009 15:44:42 Gertjan van Wingerde wrote:
> Hold on here. In this case it is the driver maintainer (i.e. Ivo for the
> rt2x00 project) that submitted this driver for inclusion, so the driver
> maintainer has not been bypassed in this case.
>
> Apparently Bart has issues with the code submitted by the maintainers and
> has been unsuccessful in convincing others about these issues.

So Bart is not a maintainer? That of course changes the situation.

-- 
Greetings, Michael.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [RFC] net,socket: introduce build_sockaddr_check helper to catch overflow at build time
From: Cyrill Gorcunov @ 2009-10-29 14:50 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029.030019.44832583.davem@davemloft.net>

[David Miller - Thu, Oct 29, 2009 at 03:00:19AM -0700]
...
| > Eventually inet_getname is switched to use DECLARE_SOCKADDR
| > (to show example of usage).
| > 
| > Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
| 
| I like this, applied to net-next-2.6, thanks!
| 

Thanks David. I'll handle other protocols in a couple
of days.

	-- Cyrill

* [patch]convert kaweth to use usb_reset_configuration()
From: Oliver Neukum @ 2009-10-29 15:07 UTC
  To: Sarah Sharp, davem, netdev

For USB 3.0 it is necessary that all drivers use the standard
API to reset a configuration. This removes a home-grown
implementation.

Signed-off-by: Oliver Neukum <oliver@neukum.org>

Hi David,

please take this for the next merge window.

	Regards
		Oliver

--

--- a/drivers/net/usb/kaweth.c
+++ b/drivers/net/usb/kaweth.c
@@ -471,16 +471,7 @@ static int kaweth_reset(struct kaweth_device *kaweth)
 	int result;
 
 	dbg("kaweth_reset(%p)", kaweth);
-	result = kaweth_control(kaweth,
-				usb_sndctrlpipe(kaweth->dev, 0),
-				USB_REQ_SET_CONFIGURATION,
-				0,
-				kaweth->dev->config[0].desc.bConfigurationValue,
-				0,
-				NULL,
-				0,
-				KAWETH_CONTROL_TIMEOUT);
-
+	result = usb_reset_configuration(kaweth->dev);
 	mdelay(10);
 
 	dbg("kaweth_reset() returns %d.",result);


* Re: pull request: wireless-next-2.6 2009-10-28
From: Luis R. Rodriguez @ 2009-10-29 15:08 UTC
  To: Michael Buesch
  Cc: Gertjan van Wingerde, David Miller,
	bzolnier-Re5JQEeQqe8AvxtiuMwx3w, penberg-bbCR+/B0CizivPeTLB3BmA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linville-2XuSBdqkA4R54TAoqtyWWQ
In-Reply-To: <200910291548.27235.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>

On Thu, Oct 29, 2009 at 7:48 AM, Michael Buesch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org> wrote:
> On Thursday 29 October 2009 15:44:42 Gertjan van Wingerde wrote:
>> Hold on here. In this case it is the driver maintainer (i.e. Ivo for the
>> rt2x00 project) that submitted this driver for inclusion, so the driver
>> maintainer has not been bypassed in this case.
>>
>> Apparently Bart has issues with the code submitted by the maintainers and
>> has been unsuccessful in convincing others about these issues.
>
> So Bart is not a maintainer? That of course changes the situation.

But you were :-P

  Luis

* Re: Shared i2c adapter locking (Was: linux-next: manual merge of the net tree with the i2c tree)
From: Ben Hutchings @ 2009-10-29 15:09 UTC
  To: Jean Delvare
  Cc: Stephen Rothwell, David Miller, netdev, linux-next, linux-kernel,
	Mika Kuoppala, Linux I2C
In-Reply-To: <20091029154317.651904b9@hyperion.delvare>

On Thu, 2009-10-29 at 15:43 +0100, Jean Delvare wrote:
> Hi Stephen,
> 
> On Mon, 26 Oct 2009 13:37:57 +1100, Stephen Rothwell wrote:
> > Today's linux-next merge of the net tree got a conflict in
> > drivers/net/sfc/sfe4001.c between commit
> > 3f7c0648f727a6d5baf6117653e4001dc877b90b ("i2c: Prevent priority
> > inversion on top of bus lock") from the i2c tree and commit
> > c9597d4f89565b6562bd3026adbe6eac6c317f47 ("sfc: Merge sfe4001.c into
> > falcon_boards.c") from the net tree.
> > 
> > I have applied the following merge fixup patch (after removing
> > drivers/net/sfc/sfe4001.c) and can carry it as necessary.
> 
> Thanks for fixing it. The core problem here IMHO is that the sfc
> network driver touches i2c internals which it would rather leave alone.

I'm just a little proud of having the idea that we could avoid using an
I/O-expander on this board, but yes, the software side of this
multiplexing is a hack.

> This is the only driver I know of which does this.
> 
> I can think of 3 different ways to address the issue.
> 
> Method #1: add a public API to grab/release an I2C segment.
> 
> void i2c_adapter_lock(struct i2c_adapter *adapter)
> {
> 	rt_mutex_lock(&adapter->bus_lock);
> }
> 
> void i2c_adapter_unlock(struct i2c_adapter *adapter)
> {
> 	rt_mutex_unlock(&adapter->bus_lock);
> }
[...]
> I'm not really sure if I have a preference yet, so please speak up if
> you do.

Indirect lock operations are a recipe for deadlock, and there doesn't
seem to be any other user for this, so method 1 seems best.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: apetlund @ 2009-10-29 15:14 UTC
  To: Ilpo Järvinen
  Cc: Andreas Petlund, Netdev, LKML, shemminger, David Miller

I apologise that some of you received this mail more than once. My email
client played an HTML trick on me.

>> +		icsk->icsk_backoff = 0;
>> +		icsk->icsk_rto = min(((tp->srtt >> 3) + tp->rttvar), TCP_RTO_MAX);
>
> The first part is nowadays done with __tcp_set_rto(tp).
>
> --
>  i.
>

I will address this in the next iteration of the patch.

-AP

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: apetlund @ 2009-10-29 15:19 UTC
  To: Arnd Hannemann
  Cc: Eric Dumazet, Andreas Petlund, netdev, linux-kernel, shemminger,
	ilpo.jarvinen, davem

I apologise that some of you received this mail more than once. My email
client played an HTML trick on me.

> Eric Dumazet wrote:
>> Andreas Petlund wrote:
>>> This patch will make TCP use only linear timeouts if the stream is
>>> thin. This will help to avoid the very high latencies that thin streams
>>> suffer because of exponential backoff. This mechanism is only active if
>>> enabled by iocontrol or syscontrol and the stream is identified as thin.
>> Won't this reduce the session timeout to something very small, i.e. 15
>> retransmits, way under the minute?
>
> The session timeout no longer depends on the actual number of retransmits.
> Instead it's a time interval, which is roughly equivalent to the time a
> TCP performing exponential backoff would need to perform 15 retransmits.
>
> However, addressing the proposal:
> I wonder how one can seriously suggest to just skip congestion response
> during timeout-based loss recovery? I believe that in heavily congested
> scenarios, this would lead to a goodput disaster... Not to mention that in
> a heavily congested scenario, suddenly every flow will become "thin", so
> this will even amplify the problems. Or did I miss something?
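
For readers following the quoted point about the session timeout being a
time interval: the interval can be estimated by summing the capped,
doubling timeouts. This is only a rough sketch, not the kernel code. The
function name is made up for illustration, and the 200 ms base RTO and
120 s cap used in the usage note are assumed values.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name, not a kernel function).
 * Returns the total time in milliseconds covered by the initial RTO
 * plus `retries` retransmission timeouts, where the timeout doubles
 * after every expiry and is capped at rto_max_ms.
 */
uint64_t backoff_session_timeout_ms(uint32_t rto_base_ms,
				    uint32_t rto_max_ms,
				    unsigned int retries)
{
	uint64_t total = 0;
	uint64_t rto = rto_base_ms;
	unsigned int i;

	if (rto > rto_max_ms)
		rto = rto_max_ms;

	/* i == 0 is the original transmission's timer. */
	for (i = 0; i <= retries; i++) {
		total += rto;
		rto *= 2;
		if (rto > rto_max_ms)
			rto = rto_max_ms;
	}
	return total;
}
```

With the assumed values, backoff_session_timeout_ms(200, 120000, 15)
comes to 924600 ms, i.e. roughly 15 minutes, so a time-based limit of
that size is far above "under the minute" even when the individual
timeouts stay linear.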

We have found no noticeable degradation of the goodput in a series of
experiments we have performed in order to map the effects of the
modifications. Furthermore, the modifications implemented in the patches
are explicitly enabled only for applications where the developer knows
that streams will be thin, thus only a small subset of the streams will
apply the modifications.

Graphs presenting results from experiments performed to analyse latency
and fairness issues can be found here:
http://folk.uio.no/apetlund/lktmp/

-AP

* Re: [PATCH net-next-2.6 1/4] net: introduce mc list helpers
From: Jiri Pirko @ 2009-10-29 15:19 UTC
  To: Ben Hutchings
  Cc: eric.dumazet, e1000-devel, netdev, bruce.w.allan,
	jesse.brandeburg, mchehab, john.ronciak, jeffrey.t.kirsher, davem,
	linux-media
In-Reply-To: <1256221112.2785.13.camel@achroite>

Thu, Oct 22, 2009 at 04:18:32PM CEST, bhutchings@solarflare.com wrote:
>On Thu, 2009-10-22 at 15:52 +0200, Jiri Pirko wrote:
>> This helpers should be used by network drivers to access to netdev
>> multicast lists.
>[...]
>> +static inline void netdev_mc_walk(struct net_device *dev,
>> +				  void (*func)(void *, unsigned char *),
>> +				  void *data)
>> +{
>> +	struct dev_addr_list *mclist;
>> +	int i;
>> +
>> +	for (i = 0, mclist = dev->mc_list; mclist && i < dev->mc_count;
>> +	     i++, mclist = mclist->next)
>> +		func(data, mclist->dmi_addr);
>> +}
>[...]
>
>We usually implement iteration as macros so that any context doesn't
>have to be squeezed through a single untyped (void *) variable.  A macro
>for this would look something like:
>
>#define netdev_for_each_mc_addr(dev, addr)						\
>	for (addr = (dev)->mc_list ? (dev)->mc_list->dmi_addr : NULL;			\
>	     addr;									\
>	     addr = (container_of(addr, struct dev_addr_list, dmi_addr)->next ?		\
>		     container_of(addr, struct dev_addr_list, dmi_addr)->next->dmi_addr : \
>		     NULL))
>
>Once you change the list type this can presumably be made less ugly.

Looking at this, I'm not sure how to deal with this macro once we need to
convert it to work with list_head. I see two options:

1) traverse the list by hand in this macro (ugly)
2) introduce something like "list_for_each_struct_entry", which takes a
   pointer to a structure member as the cursor. netdev_for_each_mc_addr
   would then be just a wrapper around it.

What do you think?
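
To make option 2 concrete, here is a rough userspace sketch of how such
an iterator might look. Nothing here is existing kernel API: the
argument order of list_for_each_struct_entry is a guess, struct mc_entry,
mc_for_each_addr and mc_first_byte_sum are made-up names, and list_head
and container_of are minimal stand-ins for the kernel versions.

```c
#include <stddef.h>

/* Minimal userspace stand-ins for the kernel's list_head/container_of. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical multicast entry once the list is a list_head. */
struct mc_entry {
	unsigned char dmi_addr[6];
	struct list_head list;
};

/*
 * Hypothetical iterator: walks a list_head chain but hands the caller
 * a pointer to one member (cmember) of each entry instead of the entry.
 * lmember is the name of the embedded list_head.
 */
#define list_for_each_struct_entry(cursor, head, type, cmember, lmember) \
	for (struct list_head *pos_ = (head)->next;			 \
	     pos_ != (head) &&						 \
	     ((cursor) = container_of(pos_, type, lmember)->cmember, 1); \
	     pos_ = pos_->next)

/* The multicast wrapper then becomes a one-liner. */
#define mc_for_each_addr(cursor, head) \
	list_for_each_struct_entry(cursor, head, struct mc_entry, dmi_addr, list)

/* Example user: sums the first byte of every address on the list. */
unsigned int mc_first_byte_sum(struct list_head *head)
{
	unsigned char *addr;
	unsigned int sum = 0;

	mc_for_each_addr(addr, head)
		sum += addr[0];
	return sum;
}
```

The driver-facing macro stays typed (the cursor is an unsigned char *),
and the hand-written traversal lives in exactly one place.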

Thanks

Jirka
>
>Ben.
>
>-- 
>Ben Hutchings, Senior Software Engineer, Solarflare Communications
>Not speaking for my employer; that's the marketing department's job.
>They asked us to note that Solarflare product names are trademarked.
>

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel

* Re: [PATCH 3/3] net: TCP thin dupack
From: apetlund @ 2009-10-29 15:23 UTC
  To: Ilpo Järvinen
  Cc: Andreas Petlund, Netdev, LKML, shemminger, David Miller

I apologise that some of you received this mail more than once. My email
client played an HTML trick on me.

>> +	/* If a thin stream is detected, retransmit after first
>> +	 * received dupack */
>> +	if ((tp->thin_dupack || sysctl_tcp_force_thin_dupack) &&
>> +	    tcp_dupack_heurestics(tp) > 1 && tcp_stream_is_thin(tp))
>> +		return 1;
>> +
>>  	return 0;
>>  }
>
> Have you tested it? ...I doubt this will work like you say and retransmit
> something when the window is small. ...Besides, you should have built this
> patch on top of the function rename you submitted earlier as after DaveM
> applied that this will no longer even compile...
>
> --
>  i.
>

We have performed extensive tests mapping the effect of the patch you
commented on some months ago. Since then, the only change was the one you
requested: replacing tcp_fackets_out() with tcp_dupack_heurestics().
After inspecting the code, I believed the effect should be equal to the
previous one, only taking SACK and FACK availability into account.
Please tell me if this breaks the intended effect, and I will modify the
patch accordingly.

Graphs from our tests of the original patch can be found at the location
linked to below. I have tested the new one for functionality, but have
not yet performed tests of the same scope, as the changes were minor. I
will, of course, fix the function rename in the next iteration. Sorry for
that.

http://folk.uio.no/apetlund/lktmp/

-AP

* Connection tracking and vlan
From: Adayadil Thomas @ 2009-10-29 15:43 UTC
  To: netdev

Greetings!

If two connections have the same 5-tuple (src ip, dst ip, src port, dst
port, protocol (tcp/udp)) but are on different VLANs (different VLAN IDs),
does conntrack keep them separate?

I am using kernel version 2.6.20; the conntrack tuple structure does not
seem to have VLAN information.

Any information is much appreciated.

Thanks

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: apetlund @ 2009-10-29 15:43 UTC
  To: Eric Dumazet
  Cc: Andreas Petlund, Ilpo Järvinen, Arnd Hannemann, Netdev, LKML,
	shemminger, David Miller

> Andreas Petlund wrote:
>
>> The removal of exponential backoff on a general basis has been
>> investigated and discussed already, for instance here:
>> http://ccr.sigcomm.org/online/?q=node/416
>> Such steps are, however, considered drastic, and I agree that care
>> must be taken to thoroughly investigate the effects of such changes. The
>> changes introduced by the proposed patches, however, are not default
>> behaviour, but an option for applications that suffer from the
>> thin-stream TCP increased retransmission latencies. They will, as such,
>> not affect all streams. In addition, the changes will only be active for
>> streams which are perpetually thin or in the early phase of expanding
>> their cwnd. Also, experiments performed on congested bottlenecks with
>> tail-drop queues show very little (if any at all) effect on goodput for
>> the modified scenario compared to a scenario with unmodified TCP
>> streams.
>> Graphs both for latency-results and fairness tests can be found here:
>> http://folk.uio.no/apetlund/lktmp/
>
> There should be a limit to linear timeouts, to say ... no more than 6
> retransmits (eventually tunable), then switch to exponential backoff.
> Maybe your patch already implements such a heuristic?
>

The limitation you suggest to the linear timeouts makes very good sense.
Our experiments performed on the Internet indicate that it is extremely
rare that more than 6 retransmissions are needed to recover. It is not
included in the current patch, so I will include this in the next
iteration.
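
As a rough sketch of what such a limit could look like (this is not taken
from the patch; the function name, the parameters and the value 6 for the
limit are assumptions for illustration): keep the timeout constant for
the first linear_limit consecutive timeouts, then start doubling, capped
at the maximum RTO.

```c
#include <stdint.h>

/*
 * Illustrative only (hypothetical name and signature).
 * Returns the timeout in milliseconds to arm after `timeouts`
 * consecutive retransmission timeouts. The first `linear_limit`
 * timeouts keep the base RTO; each one after that doubles it,
 * capped at rto_max_ms.
 */
uint32_t thin_stream_rto_ms(uint32_t base_rto_ms, uint32_t rto_max_ms,
			    unsigned int timeouts, unsigned int linear_limit)
{
	uint64_t rto = base_rto_ms;
	unsigned int i;

	if (rto > rto_max_ms)
		rto = rto_max_ms;

	for (i = linear_limit; i < timeouts; i++) {
		rto *= 2;
		if (rto >= rto_max_ms) {
			rto = rto_max_ms;
			break;
		}
	}
	return (uint32_t)rto;
}
```

With a 200 ms base RTO and a limit of 6, timeouts 0 through 6 all arm
200 ms, the 7th arms 400 ms, the 8th 800 ms, and so on up to the cap.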

> True link collapses do happen, it would be good if not all streams wake up
> in the same second and make recovery very slow.
>

Each stream will have its own schedule for wakeup, so such events will
still be subject to coincidence. The timer granularity of the TCP wakeup
timer will also influence how many streams will wake at the same time. The
experiments we have performed on severely congested bottlenecks (link
above) indicate that the modifications will not create a large negative
effect. In fact, when goodput is drastically reduced due to severe
overload, regular TCP and the LT and dupACK modifications seem to perform
nearly identically. Other scenarios may exist where different effects can
be observed, and I am open to suggestions for further testing.

> That's too easy to accept possibly dangerous features with the excuse of
> saying "It won't be used very much", because you cannot predict the future.

I agree that it is no argument to say that it won't be used much; indeed,
my hope is that it will be used much. However, our experiments indicate no
negative effects while showing a large improvement on retransmission
latency for the scenario in question. I therefore think that the option
for such an improvement should be made available for time-dependent
thin-stream applications.

-AP

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: Eric Dumazet @ 2009-10-29 15:50 UTC
  To: apetlund
  Cc: Ilpo Järvinen, Arnd Hannemann, Netdev, LKML, shemminger,
	David Miller
In-Reply-To: <69812160e5682c9fb4acba05bc082664.squirrel@webmail.uio.no>

apetlund@simula.no wrote:
>> Andreas Petlund wrote:

>> There should be a limit to linear timeouts, to say ... no more than 6
>> retransmits (eventually tunable), then switch to exponential backoff.
>> Maybe your patch already implements such a heuristic?
>>
> 
> The limitation you suggest to the linear timeouts makes very good sense.
> Our experiments performed on the Internet indicate that it is extremely
> rare that more than 6 retransmissions are needed to recover. It is not
> included in the current patch, so I will include this in the next
> iteration.
> 
>> True link collapses do happen, it would be good if not all streams wake up
>> in the same second and make recovery very slow.
>>
> 
> Each stream will have its own schedule for wakeup, so such events will
> still be subject to coincidence. The timer granularity of the TCP wakeup
> timer will also influence how many streams will wake at the same time. The
> experiments we have performed on severely congested bottlenecks (link
> above) indicate that the modifications will not create a large negative
> effect. In fact, when goodput is drastically reduced due to severe
> overload, regular TCP and the LT and dupACK modifications seem to perform
> nearly identically. Other scenarios may exist where different effects can
> be observed, and I am open to suggestions for further testing.
> 
>> That's too easy to accept possibly dangerous features with the excuse of
>> saying "It won't be used very much", because you cannot predict the future.
> 
> I agree that it is no argument to say that it won't be used much; indeed,
> my hope is that it will be used much. However, our experiments indicate no
> negative effects while showing a large improvement on retransmission
> latency for the scenario in question. I therefore think that the option
> for such an improvement should be made available for time-dependent
> thin-stream applications.
> 

Thanks! I must say I am very interested in these experiments, and I am
looking forward to your next iteration.


* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: Arnd Hannemann @ 2009-10-29 16:11 UTC
  To: Andreas Petlund
  Cc: Eric Dumazet, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, shemminger@vyatta.com,
	ilpo.jarvinen@helsinki.fi, davem@davemloft.net
In-Reply-To: <07CD1135-C68B-4264-8CD3-C4BC0400FDA2@simula.no>

Andreas Petlund wrote:
> We have found no noticeable degradation of the goodput in a series of
> experiments we have performed in order to map the effects of the
> modifications. Furthermore, the modifications implemented in the patches
> are explicitly enabled only for applications where the developer knows
> that streams will be thin, thus only a small subset of the streams will
> apply the modifications.  
> 
> Graphs presenting results from experiments performed to analyse latency
> and fairness issues can be found here:
> http://folk.uio.no/apetlund/lktmp/

How often did you hit consecutive RTOs in these measurements?
As I see you did a measurement with 512 thick vs. 512 thin streams.
Let's do a hypothetical calculation with only 512 "thin" streams.
Let's further assume the RTT is low, so that the RTO is around 200 ms.
Assume each segment has 128 bytes (already very small...).
Assume that after a period of normal operation all streams are in
timeout-based loss recovery (e.g. because the destination endpoint
suddenly behaves like a black hole).
As all streams are in timeout-based loss recovery, each stream
will transmit 5 segments each second with your modification.
This would result in a throughput of around 512*5*1024 bit/s = 2560 kbit/s
and a goodput of 0 kbit/s (because the receiver is a black hole).
So you can easily saturate a 2 Mbit/s link with retransmissions alone.
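
The calculation above can be written down as a small helper, which makes
it easy to plug in other stream counts or RTO values. This is only a
sketch: the function name is made up, it counts payload bytes only (no
TCP/IP or link headers), and it assumes every stream retransmits exactly
one segment per RTO with no backoff.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name).
 * Aggregate retransmission load in bit/s when `streams` flows each
 * resend one segment of `segment_bytes` payload bytes every `rto_ms`
 * milliseconds, with no exponential backoff.
 */
uint64_t retransmit_load_bps(unsigned int streams, unsigned int rto_ms,
			     unsigned int segment_bytes)
{
	uint64_t segments_per_sec;

	if (rto_ms == 0)
		return 0;

	segments_per_sec = 1000 / rto_ms;
	return (uint64_t)streams * segments_per_sec * segment_bytes * 8;
}
```

retransmit_load_bps(512, 200, 128) gives 2621440 bit/s, which is the
2560 kbit/s figure when dividing by 1024 (about 2621 kbit/s in decimal
units). Real on-the-wire load would be higher once headers are counted.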

Unfortunately in Germany an ADSL uplink of 786 kbit/s is still quite
common, and it's already called "broadband"...

Regarding the "small subset", why have a global sysctl option, then?
And I think "tcp_stream_is_thin(tp)" will be true for every flow
in the RTO case, at least for consecutive RTOs.

Best regards,
Arnd Hannemann

* Re: [PATCH 1/3] net: TCP thin-stream detection
From: Arnd Hannemann @ 2009-10-29 16:32 UTC
  To: Andreas Petlund
  Cc: William Allen Simpson, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, shemminger@vyatta.com,
	ilpo.jarvinen@helsinki.fi, davem@davemloft.net
In-Reply-To: <38EB8C31-A96A-4C5C-88D8-8F6BF0E9225F@simula.no>

Andreas Petlund wrote:
> On 28 Oct 2009, at 04:09, William Allen Simpson wrote:
> 
>> Andreas Petlund wrote:
>>> +/* Determines whether this is a thin stream (which may suffer from
>>> + * increased latency). Used to trigger latency-reducing mechanisms.
>>> + */
>>> +static inline unsigned int tcp_stream_is_thin(const struct tcp_sock *tp)
>>> +{
>>> +	return tp->packets_out < 4;
>>> +}
>>> +
>> This bothers me a bit.  Having just looked at your Linux presentation,
>> and not (yet) read your papers, it seems much of your justification was
>> with 1 packet per RTT.  Here, you seem to be concentrating on 4, probably
>> because many implementations quickly ramp up to 4.
>>
>>
> 
> The limit of 4 packets in flight is based on the fact that less than 4  
> packets in flight makes fast retransmissions impossible, thus limiting  
> the retransmit options to timeout-retransmissions. The criterion is  

There is Limited Transmit! So this is generally not true.

> therefore as conservative as possible while still serving its purpose.  
> If further losses occur, the exponential backoff will increase latency  
> further. The concept of using this limit is also discussed in the  
> Internet draft for Early Retransmit by Allman et al.:
> http://www.icir.org/mallman/papers/draft-ietf-tcpm-early-rexmt-01.txt

This ID is covering exactly the cases which Limited Transmit does not
cover and works "automagically" without help of application. So why not
just implement this ID?
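
For reference, the arithmetic behind the quoted "less than 4 packets in
flight" threshold can be sketched as below. This deliberately ignores
Limited Transmit, which is exactly the objection raised above: sending
new segments on the first duplicate ACKs can produce further duplicate
ACKs. The helper names are made up; only the "< 4" test mirrors the
quoted tcp_stream_is_thin().

```c
#include <stdbool.h>

/* Classic fast-retransmit threshold of three duplicate ACKs. */
#define FAST_RETRANS_DUPACKS 3

/*
 * Upper bound on duplicate ACKs when exactly one of packets_out
 * segments is lost and the sender transmits nothing new: every other
 * outstanding segment can trigger at most one duplicate ACK.
 */
unsigned int max_dupacks(unsigned int packets_out)
{
	return packets_out ? packets_out - 1 : 0;
}

/* True if the bound above can reach the fast-retransmit threshold. */
bool fast_retransmit_possible(unsigned int packets_out)
{
	return max_dupacks(packets_out) >= FAST_RETRANS_DUPACKS;
}

/* Same test as the quoted tcp_stream_is_thin(), on a plain integer. */
bool stream_is_thin(unsigned int packets_out)
{
	return packets_out < 4;
}
```

Under these simplifying assumptions the two predicates are exact
complements: a stream counts as thin precisely when it cannot collect
three duplicate ACKs on its own.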

Best regards,
Arnd

* Re: [RFC] multiqueue changes
From: Patrick McHardy @ 2009-10-29 16:37 UTC
  To: Jarek Poplawski; +Cc: Eric Dumazet, David S. Miller, Linux Netdev List
In-Reply-To: <20091028212337.GA3218@ami.dom.local>

Jarek Poplawski wrote:
> On Wed, Oct 28, 2009 at 06:27:10PM +0100, Patrick McHardy wrote:
>> We don't seem to be supporting changing real_num_tx_queues for
>> registered devices currently (at least I couldn't find it).
>> So I guess it depends on how this would be implemented.
>>
>> Simply changing the dev->real_num_tx_queues value while the
>> device is down would require qdisc operations to operate on
>> all possible queues since the amount of queues in use could
>> be changed after the qdisc is created/configured, but before
>> the device is set up. This approach has more complications
>> like switching between mq and non-mq root qdiscs, taking care
>> of non-default root qdisc (cloning them to the new queues), etc.
>>
>> A simpler alternative would be to destroy the existing root
>> qdisc on any change to real_num_tx_queues and have dev_activate()
>> set it up from scratch. In this case, we could (as you suggested)
>> use real_num_tx_queues, which should fix the problem Eric reported.
> 
> Actually, I changed my mind after Eric's and especially David's
> explanations. Probably there will be needed some changes in handling
> the real_num_tx_queues, but there is no reason to misuse them for
> masking a totally useless num_tx_queues value, like in this case. So,
> IMHO, its mainly about the driver(s) (and maybe a bit of API change)
> here.

Well, we do need both values for supporting changes to the actually
used numbers of TX queues. If I understood Dave's explanation correctly,
this is also what's intended. It also doesn't seem unreasonable
what bnx2 is doing.

But getting back to the problem Eric reported - so you're suggesting
that bnx2.c should also adjust num_tx_queues in case the hardware
doesn't support multiqueue? That seems reasonable as well.

* [PATCH 6/6] sky2: version 1.26
From: Stephen Hemminger @ 2009-10-29 16:37 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-1.26.patch --]
[-- Type: text/plain, Size: 343 bytes --]

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/sky2.c	2009-10-29 08:57:43.512001253 -0700
+++ b/drivers/net/sky2.c	2009-10-29 08:59:34.782375629 -0700
@@ -50,7 +50,7 @@
 #include "sky2.h"
 
 #define DRV_NAME		"sky2"
-#define DRV_VERSION		"1.25"
+#define DRV_VERSION		"1.26"
 #define PFX			DRV_NAME " "
 
 /*

-- 


* [PATCH 2/6] sky2: add register definitions for new chips
From: Stephen Hemminger @ 2009-10-29 16:37 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-reg-update.patch --]
[-- Type: text/plain, Size: 11754 bytes --]

This adds infrastructure for the newer chip versions and workarounds.
Extracted from the vendor (GPL) driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/sky2.h	2009-10-29 08:34:51.778438561 -0700
+++ b/drivers/net/sky2.h	2009-10-29 08:35:53.313438448 -0700
@@ -16,6 +16,13 @@ enum {
 	PCI_DEV_REG5    = 0x88,
 	PCI_CFG_REG_0	= 0x90,
 	PCI_CFG_REG_1	= 0x94,
+
+	PSM_CONFIG_REG0  = 0x98,
+	PSM_CONFIG_REG1	 = 0x9C,
+	PSM_CONFIG_REG2  = 0x160,
+	PSM_CONFIG_REG3  = 0x164,
+	PSM_CONFIG_REG4  = 0x168,
+
 };
 
 /* Yukon-2 */
@@ -48,6 +55,37 @@ enum pci_dev_reg_2 {
 	PCI_USEDATA64	= 1<<0,		/* Use 64Bit Data bus ext */
 };
 
+/*	PCI_OUR_REG_3		32 bit	Our Register 3 (Yukon-ECU only) */
+enum pci_dev_reg_3 {
+	P_CLK_ASF_REGS_DIS	= 1<<18,/* Disable Clock ASF (Yukon-Ext.) */
+	P_CLK_COR_REGS_D0_DIS	= 1<<17,/* Disable Clock Core Regs D0 */
+	P_CLK_MACSEC_DIS	= 1<<17,/* Disable Clock MACSec (Yukon-Ext.) */
+	P_CLK_PCI_REGS_D0_DIS	= 1<<16,/* Disable Clock PCI  Regs D0 */
+	P_CLK_COR_YTB_ARB_DIS	= 1<<15,/* Disable Clock YTB  Arbiter */
+	P_CLK_MAC_LNK1_D3_DIS	= 1<<14,/* Disable Clock MAC  Link1 D3 */
+	P_CLK_COR_LNK1_D0_DIS	= 1<<13,/* Disable Clock Core Link1 D0 */
+	P_CLK_MAC_LNK1_D0_DIS	= 1<<12,/* Disable Clock MAC  Link1 D0 */
+	P_CLK_COR_LNK1_D3_DIS	= 1<<11,/* Disable Clock Core Link1 D3 */
+	P_CLK_PCI_MST_ARB_DIS	= 1<<10,/* Disable Clock PCI  Master Arb. */
+	P_CLK_COR_REGS_D3_DIS	= 1<<9,	/* Disable Clock Core Regs D3 */
+	P_CLK_PCI_REGS_D3_DIS	= 1<<8,	/* Disable Clock PCI  Regs D3 */
+	P_CLK_REF_LNK1_GM_DIS	= 1<<7,	/* Disable Clock Ref. Link1 GMAC */
+	P_CLK_COR_LNK1_GM_DIS	= 1<<6,	/* Disable Clock Core Link1 GMAC */
+	P_CLK_PCI_COMMON_DIS	= 1<<5,	/* Disable Clock PCI  Common */
+	P_CLK_COR_COMMON_DIS	= 1<<4,	/* Disable Clock Core Common */
+	P_CLK_PCI_LNK1_BMU_DIS	= 1<<3,	/* Disable Clock PCI  Link1 BMU */
+	P_CLK_COR_LNK1_BMU_DIS	= 1<<2,	/* Disable Clock Core Link1 BMU */
+	P_CLK_PCI_LNK1_BIU_DIS	= 1<<1,	/* Disable Clock PCI  Link1 BIU */
+	P_CLK_COR_LNK1_BIU_DIS	= 1<<0,	/* Disable Clock Core Link1 BIU */
+	PCIE_OUR3_WOL_D3_COLD_SET = P_CLK_ASF_REGS_DIS |
+				    P_CLK_COR_REGS_D0_DIS |
+				    P_CLK_COR_LNK1_D0_DIS |
+				    P_CLK_MAC_LNK1_D0_DIS |
+				    P_CLK_PCI_MST_ARB_DIS |
+				    P_CLK_COR_COMMON_DIS |
+				    P_CLK_COR_LNK1_BMU_DIS,
+};
+
 /*	PCI_OUR_REG_4		32 bit	Our Register 4 (Yukon-ECU only) */
 enum pci_dev_reg_4 {
 				/* (Link Training & Status State Machine) */
@@ -114,7 +152,7 @@ enum pci_dev_reg_5 {
 				     P_GAT_PCIE_RX_EL_IDLE,
 };
 
-#/*	PCI_CFG_REG_1			32 bit	Config Register 1 (Yukon-Ext only) */
+/*	PCI_CFG_REG_1			32 bit	Config Register 1 (Yukon-Ext only) */
 enum pci_cfg_reg1 {
 	P_CF1_DIS_REL_EVT_RST	= 1<<24, /* Dis. Rel. Event during PCIE reset */
 										/* Bit 23..21: Release Clock on Event */
@@ -145,6 +183,72 @@ enum pci_cfg_reg1 {
 					P_CF1_ENA_TXBMU_WR_IDLE,
 };
 
+/* Yukon-Optima */
+enum {
+	PSM_CONFIG_REG1_AC_PRESENT_STATUS = 1<<31,   /* AC Present Status */
+
+	PSM_CONFIG_REG1_PTP_CLK_SEL	  = 1<<29,   /* PTP Clock Select */
+	PSM_CONFIG_REG1_PTP_MODE	  = 1<<28,   /* PTP Mode */
+
+	PSM_CONFIG_REG1_MUX_PHY_LINK	  = 1<<27,   /* PHY Energy Detect Event */
+
+	PSM_CONFIG_REG1_EN_PIN63_AC_PRESENT = 1<<26,  /* Enable LED_DUPLEX for ac_present */
+	PSM_CONFIG_REG1_EN_PCIE_TIMER	  = 1<<25,    /* Enable PCIe Timer */
+	PSM_CONFIG_REG1_EN_SPU_TIMER	  = 1<<24,    /* Enable SPU Timer */
+	PSM_CONFIG_REG1_POLARITY_AC_PRESENT = 1<<23,  /* AC Present Polarity */
+
+	PSM_CONFIG_REG1_EN_AC_PRESENT	  = 1<<21,    /* Enable AC Present */
+
+	PSM_CONFIG_REG1_EN_GPHY_INT_PSM	= 1<<20,      /* Enable GPHY INT for PSM */
+	PSM_CONFIG_REG1_DIS_PSM_TIMER	= 1<<19,      /* Disable PSM Timer */
+};
+
+/* Yukon-Supreme */
+enum {
+	PSM_CONFIG_REG1_GPHY_ENERGY_STS	= 1<<31, /* GPHY Energy Detect Status */
+
+	PSM_CONFIG_REG1_UART_MODE_MSK	= 3<<29, /* UART_Mode */
+	PSM_CONFIG_REG1_CLK_RUN_ASF	= 1<<28, /* Enable Clock Free Running for ASF Subsystem */
+	PSM_CONFIG_REG1_UART_CLK_DISABLE= 1<<27, /* Disable UART clock */
+	PSM_CONFIG_REG1_VAUX_ONE	= 1<<26, /* Tie internal Vaux to 1'b1 */
+	PSM_CONFIG_REG1_UART_FC_RI_VAL	= 1<<25, /* Default value for UART_RI_n */
+	PSM_CONFIG_REG1_UART_FC_DCD_VAL	= 1<<24, /* Default value for UART_DCD_n */
+	PSM_CONFIG_REG1_UART_FC_DSR_VAL	= 1<<23, /* Default value for UART_DSR_n */
+	PSM_CONFIG_REG1_UART_FC_CTS_VAL	= 1<<22, /* Default value for UART_CTS_n */
+	PSM_CONFIG_REG1_LATCH_VAUX	= 1<<21, /* Enable Latch current Vaux_avlbl */
+	PSM_CONFIG_REG1_FORCE_TESTMODE_INPUT= 1<<20, /* Force Testmode pin as input PAD */
+	PSM_CONFIG_REG1_UART_RST	= 1<<19, /* UART_RST */
+	PSM_CONFIG_REG1_PSM_PCIE_L1_POL	= 1<<18, /* PCIE L1 Event Polarity for PSM */
+	PSM_CONFIG_REG1_TIMER_STAT	= 1<<17, /* PSM Timer Status */
+	PSM_CONFIG_REG1_GPHY_INT	= 1<<16, /* GPHY INT Status */
+	PSM_CONFIG_REG1_FORCE_TESTMODE_ZERO= 1<<15, /* Force internal Testmode as 1'b0 */
+	PSM_CONFIG_REG1_EN_INT_ASPM_CLKREQ = 1<<14, /* ENABLE INT for CLKRUN on ASPM and CLKREQ */
+	PSM_CONFIG_REG1_EN_SND_TASK_ASPM_CLKREQ	= 1<<13, /* ENABLE Snd_task for CLKRUN on ASPM and CLKREQ */
+	PSM_CONFIG_REG1_DIS_CLK_GATE_SND_TASK	= 1<<12, /* Disable CLK_GATE control snd_task */
+	PSM_CONFIG_REG1_DIS_FF_CHIAN_SND_INTA	= 1<<11, /* Disable flip-flop chain for sndmsg_inta */
+
+	PSM_CONFIG_REG1_DIS_LOADER	= 1<<9, /* Disable Loader SM after PSM Goes back to IDLE */
+	PSM_CONFIG_REG1_DO_PWDN		= 1<<8, /* Do Power Down, Start PSM Scheme */
+	PSM_CONFIG_REG1_DIS_PIG		= 1<<7, /* Disable Plug-in-Go SM after PSM Goes back to IDLE */
+	PSM_CONFIG_REG1_DIS_PERST	= 1<<6, /* Disable Internal PCIe Reset after PSM Goes back to IDLE */
+	PSM_CONFIG_REG1_EN_REG18_PD	= 1<<5, /* Enable REG18 Power Down for PSM */
+	PSM_CONFIG_REG1_EN_PSM_LOAD	= 1<<4, /* Disable EEPROM Loader after PSM Goes back to IDLE */
+	PSM_CONFIG_REG1_EN_PSM_HOT_RST	= 1<<3, /* Enable PCIe Hot Reset for PSM */
+	PSM_CONFIG_REG1_EN_PSM_PERST	= 1<<2, /* Enable PCIe Reset Event for PSM */
+	PSM_CONFIG_REG1_EN_PSM_PCIE_L1	= 1<<1, /* Enable PCIe L1 Event for PSM */
+	PSM_CONFIG_REG1_EN_PSM		= 1<<0, /* Enable PSM Scheme */
+};
+
+/*	PSM_CONFIG_REG4				0x0168	PSM Config Register 4 */
+enum {
+						/* PHY Link Detect Timer */
+	PSM_CONFIG_REG4_TIMER_PHY_LINK_DETECT_MSK = 0xf<<4,
+	PSM_CONFIG_REG4_TIMER_PHY_LINK_DETECT_BASE = 4,
+
+	PSM_CONFIG_REG4_DEBUG_TIMER	    = 1<<1, /* Debug Timer */
+	PSM_CONFIG_REG4_RST_PHY_LINK_DETECT = 1<<0, /* Reset GPHY Link Detect */
+};
+
 
 #define PCI_STATUS_ERROR_BITS (PCI_STATUS_DETECTED_PARITY | \
 			       PCI_STATUS_SIG_SYSTEM_ERROR | \
@@ -197,6 +301,9 @@ enum csr_regs {
 	B2_I2C_IRQ	= 0x0168,
 	B2_I2C_SW	= 0x016c,
 
+	Y2_PEX_PHY_DATA = 0x0170,
+	Y2_PEX_PHY_ADDR = 0x0172,
+
 	B3_RAM_ADDR	= 0x0180,
 	B3_RAM_DATA_LO	= 0x0184,
 	B3_RAM_DATA_HI	= 0x0188,
@@ -317,6 +424,10 @@ enum {
 	Y2_IS_CHK_TXS2	= 1<<9,		/* Descriptor error TXS 2 */
 	Y2_IS_CHK_TXA2	= 1<<8,		/* Descriptor error TXA 2 */
 
+	Y2_IS_PSM_ACK	= 1<<7,		/* PSM Acknowledge (Yukon-Optima only) */
+	Y2_IS_PTP_TIST	= 1<<6,		/* PTP Time Stamp (Yukon-Optima only) */
+	Y2_IS_PHY_QLNK	= 1<<5,		/* PHY Quick Link (Yukon-Optima only) */
+
 	Y2_IS_IRQ_PHY1	= 1<<4,		/* Interrupt from PHY 1 */
 	Y2_IS_IRQ_MAC1	= 1<<3,		/* Interrupt from MAC 1 */
 	Y2_IS_CHK_RX1	= 1<<2,		/* Descriptor error Rx 1 */
@@ -435,6 +546,7 @@ enum {
  	CHIP_ID_YUKON_FE_P = 0xb8, /* YUKON-2 FE+ */
 	CHIP_ID_YUKON_SUPR = 0xb9, /* YUKON-2 Supreme */
 	CHIP_ID_YUKON_UL_2 = 0xba, /* YUKON-2 Ultra 2 */
+	CHIP_ID_YUKON_OPT  = 0xbc, /* YUKON-2 Optima */
 };
 enum yukon_ec_rev {
 	CHIP_REV_YU_EC_A1    = 0,  /* Chip Rev. for Yukon-EC A1/A0 */
@@ -459,6 +571,8 @@ enum yukon_ex_rev {
 };
 enum yukon_supr_rev {
 	CHIP_REV_YU_SU_A0    = 0,
+	CHIP_REV_YU_SU_B0    = 1,
+	CHIP_REV_YU_SU_B1    = 3,
 };
 
 
@@ -513,6 +627,12 @@ enum {
 	TIM_T_STEP	= 1<<0,	/* Test step */
 };
 
+/*	Y2_PEX_PHY_ADDR/DATA		PEX PHY address and data reg  (Yukon-2 only) */
+enum {
+	PEX_RD_ACCESS	= 1<<31, /* Access Mode Read = 1, Write = 0 */
+	PEX_DB_ACCESS	= 1<<30, /* Access to debug register */
+};
+
 /*	B3_RAM_ADDR		32 bit	RAM Address, to read or write */
 					/* Bit 31..19:	reserved */
 #define RAM_ADR_RAN	0x0007ffffL	/* Bit 18.. 0:	RAM Address Range */
@@ -754,6 +874,42 @@ enum {
 	BMU_TX_CLR_IRQ_TCP	= 1<<11, /* Clear IRQ on TCP segment length mismatch */
 };
 
+/*	TBMU_TEST			0x06B8	Transmit BMU Test Register */
+enum {
+	TBMU_TEST_BMU_TX_CHK_AUTO_OFF		= 1<<31, /* BMU Tx Checksum Auto Calculation Disable */
+	TBMU_TEST_BMU_TX_CHK_AUTO_ON		= 1<<30, /* BMU Tx Checksum Auto Calculation Enable */
+	TBMU_TEST_HOME_ADD_PAD_FIX1_EN		= 1<<29, /* Home Address Padding FIX1 Enable */
+	TBMU_TEST_HOME_ADD_PAD_FIX1_DIS		= 1<<28, /* Home Address Padding FIX1 Disable */
+	TBMU_TEST_ROUTING_ADD_FIX_EN		= 1<<27, /* Routing Address Fix Enable */
+	TBMU_TEST_ROUTING_ADD_FIX_DIS		= 1<<26, /* Routing Address Fix Disable */
+	TBMU_TEST_HOME_ADD_FIX_EN		= 1<<25, /* Home address checksum fix enable */
+	TBMU_TEST_HOME_ADD_FIX_DIS		= 1<<24, /* Home address checksum fix disable */
+
+	TBMU_TEST_TEST_RSPTR_ON			= 1<<22, /* Testmode Shadow Read Ptr On */
+	TBMU_TEST_TEST_RSPTR_OFF		= 1<<21, /* Testmode Shadow Read Ptr Off */
+	TBMU_TEST_TESTSTEP_RSPTR		= 1<<20, /* Teststep Shadow Read Ptr */
+
+	TBMU_TEST_TEST_RPTR_ON			= 1<<18, /* Testmode Read Ptr On */
+	TBMU_TEST_TEST_RPTR_OFF			= 1<<17, /* Testmode Read Ptr Off */
+	TBMU_TEST_TESTSTEP_RPTR			= 1<<16, /* Teststep Read Ptr */
+
+	TBMU_TEST_TEST_WSPTR_ON			= 1<<14, /* Testmode Shadow Write Ptr On */
+	TBMU_TEST_TEST_WSPTR_OFF		= 1<<13, /* Testmode Shadow Write Ptr Off */
+	TBMU_TEST_TESTSTEP_WSPTR		= 1<<12, /* Teststep Shadow Write Ptr */
+
+	TBMU_TEST_TEST_WPTR_ON			= 1<<10, /* Testmode Write Ptr On */
+	TBMU_TEST_TEST_WPTR_OFF			= 1<<9, /* Testmode Write Ptr Off */
+	TBMU_TEST_TESTSTEP_WPTR			= 1<<8, /* Teststep Write Ptr */
+
+	TBMU_TEST_TEST_REQ_NB_ON		= 1<<6, /* Testmode Req Nbytes/Addr On */
+	TBMU_TEST_TEST_REQ_NB_OFF		= 1<<5, /* Testmode Req Nbytes/Addr Off */
+	TBMU_TEST_TESTSTEP_REQ_NB		= 1<<4, /* Teststep Req Nbytes/Addr */
+
+	TBMU_TEST_TEST_DONE_IDX_ON		= 1<<2, /* Testmode Done Index On */
+	TBMU_TEST_TEST_DONE_IDX_OFF		= 1<<1, /* Testmode Done Index Off */
+	TBMU_TEST_TESTSTEP_DONE_IDX		= 1<<0,	/* Teststep Done Index */
+};
+
 /* Queue Prefetch Unit Offsets, use Y2_QADDR() to address (Yukon-2 only)*/
 /* PREF_UNIT_CTRL	32 bit	Prefetch Control register */
 enum {
@@ -1674,6 +1830,12 @@ enum {
 
 /*	RX_GMF_CTRL_T	32 bit	Rx GMAC FIFO Control/Test */
 enum {
+	RX_GCLKMAC_ENA	= 1<<31,	/* RX MAC Clock Gating Enable */
+	RX_GCLKMAC_OFF	= 1<<30,
+
+	RX_STFW_DIS	= 1<<29,	/* RX Store and Forward Enable */
+	RX_STFW_ENA	= 1<<28,
+
 	RX_TRUNC_ON	= 1<<27,  	/* enable  packet truncation */
 	RX_TRUNC_OFF	= 1<<26, 	/* disable packet truncation */
 	RX_VLAN_STRIP_ON = 1<<25,	/* enable  VLAN stripping */
@@ -1711,6 +1873,20 @@ enum {
 	GMF_RX_CTRL_DEF	= GMF_OPER_ON | GMF_RX_F_FL_ON,
 };
 
+/*	RX_GMF_FL_CTRL	16 bit	Rx GMAC FIFO Flush Control (Yukon-Supreme) */
+enum {
+	RX_IPV6_SA_MOB_ENA	= 1<<9,	/* IPv6 SA Mobility Support Enable */
+	RX_IPV6_SA_MOB_DIS	= 1<<8,	/* IPv6 SA Mobility Support Disable */
+	RX_IPV6_DA_MOB_ENA	= 1<<7,	/* IPv6 DA Mobility Support Enable */
+	RX_IPV6_DA_MOB_DIS	= 1<<6,	/* IPv6 DA Mobility Support Disable */
+	RX_PTR_SYNCDLY_ENA	= 1<<5,	/* Pointers Delay Synch Enable */
+	RX_PTR_SYNCDLY_DIS	= 1<<4,	/* Pointers Delay Synch Disable */
+	RX_ASF_NEWFLAG_ENA	= 1<<3,	/* RX ASF Flag New Logic Enable */
+	RX_ASF_NEWFLAG_DIS	= 1<<2,	/* RX ASF Flag New Logic Disable */
+	RX_FLSH_MISSPKT_ENA	= 1<<1,	/* RX Flush Miss-Packet Enable */
+	RX_FLSH_MISSPKT_DIS	= 1<<0,	/* RX Flush Miss-Packet Disable */
+};
+
 /*	TX_GMF_EA		32 bit	Tx GMAC FIFO End Address */
 enum {
 	TX_DYN_WM_ENA	= 3,	/* Yukon-FE+ specific */

-- 


^ permalink raw reply

* [PATCH 0/6] sky2: driver update
From: Stephen Hemminger @ 2009-10-29 16:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

This patch set adds support for Yukon Optima (88E8059) based
on the vendor driver.  Hopefully, it can go in 2.6.32.

-- 


^ permalink raw reply

* [PATCH 1/6] sky2: add SK-9E21M device id
From: Stephen Hemminger @ 2009-10-29 16:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-sk-9e21m.patch --]
[-- Type: text/plain, Size: 749 bytes --]

This is a new ID that just showed up in the latest vendor driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/sky2.c	2009-10-28 11:01:10.454251343 -0700
+++ b/drivers/net/sky2.c	2009-10-28 11:01:44.867001737 -0700
@@ -102,6 +102,7 @@ MODULE_PARM_DESC(disable_msi, "Disable M
 static DEFINE_PCI_DEVICE_TABLE(sky2_id_table) = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9000) }, /* SK-9Sxx */
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E00) }, /* SK-9Exx */
+	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E01) }, /* SK-9E21M */
 	{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b00) },	/* DGE-560T */
 	{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4001) }, 	/* DGE-550SX */
 	{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4B02) },	/* DGE-560SX */

-- 


^ permalink raw reply

* [PATCH 4/6] sky2: workarounds for Yukon-2 supreme
From: Stephen Hemminger @ 2009-10-29 16:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-supr2.patch --]
[-- Type: text/plain, Size: 2178 bytes --]

Changes related to support of the Yukon Supreme chip.
I don't have this chip version to test on;
these are reverse engineered from the vendor (GPL) driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--- a/drivers/net/sky2.c	2009-10-29 08:52:37.995315808 -0700
+++ b/drivers/net/sky2.c	2009-10-29 08:55:47.116003224 -0700
@@ -787,8 +787,7 @@ static void sky2_set_tx_stfwd(struct sky
 
 	if ( (hw->chip_id == CHIP_ID_YUKON_EX &&
 	      hw->chip_rev != CHIP_REV_YU_EX_A0) ||
-	     hw->chip_id == CHIP_ID_YUKON_FE_P ||
-	     hw->chip_id == CHIP_ID_YUKON_SUPR) {
+	     hw->chip_id >= CHIP_ID_YUKON_FE_P) {
 		/* Yukon-Extreme B0 and further Extreme devices */
 		/* enable Store & Forward mode for TX */
 
@@ -1404,6 +1403,31 @@ static int sky2_rx_start(struct sky2_por
 
 	/* Tell chip about available buffers */
 	sky2_rx_update(sky2, rxq);
+
+	if (hw->chip_id == CHIP_ID_YUKON_EX ||
+	    hw->chip_id == CHIP_ID_YUKON_SUPR) {
+		/*
+		 * Disable flushing of non ASF packets;
+		 * must be done after initializing the BMUs;
+		 * drivers without ASF support should do this too, otherwise
+		 * it may happen that they cannot run on ASF devices;
+		 * remember that the MAC FIFO isn't reset during initialization.
+		 */
+		sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_MACSEC_FLUSH_OFF);
+	}
+
+	if (hw->chip_id >= CHIP_ID_YUKON_SUPR) {
+		/* Enable RX Home Address & Routing Header checksum fix */
+		sky2_write16(hw, SK_REG(sky2->port, RX_GMF_FL_CTRL),
+			     RX_IPV6_SA_MOB_ENA | RX_IPV6_DA_MOB_ENA);
+
+		/* Enable TX Home Address & Routing Header checksum fix */
+		sky2_write32(hw, Q_ADDR(txqaddr[sky2->port], Q_TEST),
+			     TBMU_TEST_HOME_ADD_FIX_EN | TBMU_TEST_ROUTING_ADD_FIX_EN);
+	}
+
+
+
 	return 0;
 nomem:
 	sky2_rx_clean(sky2);
@@ -2993,6 +3017,12 @@ static void sky2_reset(struct sky2_hw *h
 			sky2_write16(hw, SK_REG(i, GMAC_CTRL),
 				     GMC_BYP_MACSECRX_ON | GMC_BYP_MACSECTX_ON
 				     | GMC_BYP_RETR_ON);
+
+	}
+
+	if (hw->chip_id == CHIP_ID_YUKON_SUPR && hw->chip_rev > CHIP_REV_YU_SU_B0) {
+		/* enable MACSec clock gating */
+		sky2_pci_write32(hw, PCI_DEV_REG3, P_CLK_MACSEC_DIS);
 	}
 
 	/* Clear I2C IRQ noise */

-- 


^ permalink raw reply

* [PATCH 3/6] sky2: fix receive pause thresholds
From: Stephen Hemminger @ 2009-10-29 16:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-rxthresh.patch --]
[-- Type: text/plain, Size: 1964 bytes --]

Program the receive pause thresholds differently depending on
chip version. This is cloned from the vendor (GPL) driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--- a/drivers/net/sky2.h	2009-10-29 08:35:53.313438448 -0700
+++ b/drivers/net/sky2.h	2009-10-29 08:37:04.959438572 -0700
@@ -808,10 +808,11 @@ enum {
 	RX_GMF_AF_THR	= 0x0c44,/* 32 bit	Rx GMAC FIFO Almost Full Thresh. */
 	RX_GMF_CTRL_T	= 0x0c48,/* 32 bit	Rx GMAC FIFO Control/Test */
 	RX_GMF_FL_MSK	= 0x0c4c,/* 32 bit	Rx GMAC FIFO Flush Mask */
-	RX_GMF_FL_THR	= 0x0c50,/* 32 bit	Rx GMAC FIFO Flush Threshold */
+	RX_GMF_FL_THR	= 0x0c50,/* 16 bit	Rx GMAC FIFO Flush Threshold */
+	RX_GMF_FL_CTRL	= 0x0c52,/* 16 bit	Rx GMAC FIFO Flush Control */
 	RX_GMF_TR_THR	= 0x0c54,/* 32 bit	Rx Truncation Threshold (Yukon-2) */
-	RX_GMF_UP_THR	= 0x0c58,/*  8 bit	Rx Upper Pause Thr (Yukon-EC_U) */
-	RX_GMF_LP_THR	= 0x0c5a,/*  8 bit	Rx Lower Pause Thr (Yukon-EC_U) */
+	RX_GMF_UP_THR	= 0x0c58,/* 16 bit	Rx Upper Pause Thr (Yukon-EC_U) */
+	RX_GMF_LP_THR	= 0x0c5a,/* 16 bit	Rx Lower Pause Thr (Yukon-EC_U) */
 	RX_GMF_VLAN	= 0x0c5c,/* 32 bit	Rx VLAN Type Register (Yukon-2) */
 	RX_GMF_WP	= 0x0c60,/* 32 bit	Rx GMAC FIFO Write Pointer */
 
--- a/drivers/net/sky2.c	2009-10-29 08:34:04.765191254 -0700
+++ b/drivers/net/sky2.c	2009-10-29 08:52:37.995315808 -0700
@@ -926,8 +926,14 @@ static void sky2_mac_init(struct sky2_hw
 
 	/* On chips without ram buffer, pause is controled by MAC level */
 	if (!(hw->flags & SKY2_HW_RAM_BUFFER)) {
-		sky2_write8(hw, SK_REG(port, RX_GMF_LP_THR), 768/8);
-		sky2_write8(hw, SK_REG(port, RX_GMF_UP_THR), 1024/8);
+		/* Pause threshold is scaled by 8 in bytes */
+		if (hw->chip_id == CHIP_ID_YUKON_FE_P
+			&& hw->chip_rev == CHIP_REV_YU_FE2_A0)
+			reg = 1568 / 8;
+		else
+			reg = 1024 / 8;
+		sky2_write16(hw, SK_REG(port, RX_GMF_UP_THR), reg);
+		sky2_write16(hw, SK_REG(port, RX_GMF_LP_THR), 768 / 8);
 
 		sky2_set_tx_stfwd(hw, port);
 	}
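
The threshold selection in the hunk above can be sketched in isolation.
This is a minimal illustration, not driver code: it reads the patch
comment as meaning that the register holds the byte count divided by 8,
and the EX_* constants and helper names are hypothetical stand-ins (the
FE+ chip id 0xb8 comes from sky2.h; the A0 revision value of 0 is an
assumption).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for CHIP_ID_YUKON_FE_P and CHIP_REV_YU_FE2_A0.
 * 0xb8 is taken from sky2.h; the revision value 0 is an assumption. */
enum { EX_CHIP_FE_P = 0xb8 };
enum { EX_REV_FE2_A0 = 0 };

/* Convert a byte count into register units, assuming one unit is 8 bytes. */
static uint16_t pause_thr_units(unsigned int bytes)
{
	assert(bytes % 8 == 0);
	return (uint16_t)(bytes / 8);
}

/* Upper pause threshold: 1568 bytes on FE+ rev A0, 1024 bytes otherwise. */
static uint16_t rx_upper_pause_thr(uint8_t chip_id, uint8_t chip_rev)
{
	if (chip_id == EX_CHIP_FE_P && chip_rev == EX_REV_FE2_A0)
		return pause_thr_units(1568);
	return pause_thr_units(1024);
}

/* Lower pause threshold: 768 bytes on every chip. */
static uint16_t rx_lower_pause_thr(void)
{
	return pause_thr_units(768);
}
```

With these helpers, FE+ A0 would program 196 into the upper threshold
register, every other chip 128, and all chips 96 into the lower one.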

-- 


^ permalink raw reply

* [PATCH 5/6] sky2: 88E8059 support
From: Stephen Hemminger @ 2009-10-29 16:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20091029163704.793246334@vyatta.com>

[-- Attachment #1: sky2-yukon-ul_2.patch --]
[-- Type: text/plain, Size: 4547 bytes --]

Tentative support for newer Marvell hardware including
the Yukon-2 Optima chip. I do not have hardware to test this yet;
the code is based on the vendor driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/sky2.c	2009-10-28 11:02:38.031251525 -0700
+++ b/drivers/net/sky2.c	2009-10-28 11:02:44.719000693 -0700
@@ -140,6 +140,7 @@ static DEFINE_PCI_DEVICE_TABLE(sky2_id_t
 	{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436D) }, /* 88E8055 */
 	{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4370) }, /* 88E8075 */
 	{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4380) }, /* 88E8057 */
+	{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4381) }, /* 88E8059 */
 	{ 0 }
 };
 
@@ -603,6 +604,16 @@ static void sky2_phy_init(struct sky2_hw
 		/* apply workaround for integrated resistors calibration */
 		gm_phy_write(hw, port, PHY_MARV_PAGE_ADDR, 17);
 		gm_phy_write(hw, port, PHY_MARV_PAGE_DATA, 0x3f60);
+	} else if (hw->chip_id == CHIP_ID_YUKON_OPT && hw->chip_rev == 0) {
+		/* apply fixes in PHY AFE */
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 0x00ff);
+
+		/* apply RDAC termination workaround */
+		gm_phy_write(hw, port, 24, 0x2800);
+		gm_phy_write(hw, port, 23, 0x2001);
+
+		/* set page register back to 0 */
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 0);
 	} else if (hw->chip_id != CHIP_ID_YUKON_EX &&
 		   hw->chip_id < CHIP_ID_YUKON_SUPR) {
 		/* no effect on Yukon-XL */
@@ -2133,6 +2144,25 @@ out:
 	spin_unlock(&sky2->phy_lock);
 }
 
+/* Special quick link interrupt (Yukon-2 Optima only) */
+static void sky2_qlink_intr(struct sky2_hw *hw)
+{
+	struct sky2_port *sky2 = netdev_priv(hw->dev[0]);
+	u32 imask;
+	u16 phy;
+
+	/* disable irq */
+	imask = sky2_read32(hw, B0_IMSK);
+	imask &= ~Y2_IS_PHY_QLNK;
+	sky2_write32(hw, B0_IMSK, imask);
+
+	/* reset PHY Link Detect */
+	phy = sky2_pci_read16(hw, PSM_CONFIG_REG4);
+	sky2_pci_write16(hw, PSM_CONFIG_REG4, phy | 1);
+
+	sky2_link_up(sky2);
+}
+
 /* Transmit timeout is only called if we are running, carrier is up
  * and tx queue is full (stopped).
  */
@@ -2803,6 +2833,9 @@ static int sky2_poll(struct napi_struct 
 	if (status & Y2_IS_IRQ_PHY2)
 		sky2_phy_intr(hw, 1);
 
+	if (status & Y2_IS_PHY_QLNK)
+		sky2_qlink_intr(hw);
+
 	while ((idx = sky2_read16(hw, STAT_PUT_IDX)) != hw->st_idx) {
 		work_done += sky2_status_intr(hw, work_limit - work_done, idx);
 
@@ -2852,6 +2885,7 @@ static u32 sky2_mhz(const struct sky2_hw
 	case CHIP_ID_YUKON_EX:
 	case CHIP_ID_YUKON_SUPR:
 	case CHIP_ID_YUKON_UL_2:
+	case CHIP_ID_YUKON_OPT:
 		return 125;
 
 	case CHIP_ID_YUKON_FE:
@@ -2941,6 +2975,7 @@ static int __devinit sky2_init(struct sk
 		break;
 
 	case CHIP_ID_YUKON_UL_2:
+	case CHIP_ID_YUKON_OPT:
 		hw->flags = SKY2_HW_GIGABIT
 			| SKY2_HW_ADV_POWER_CTL;
 		break;
@@ -3031,6 +3066,46 @@ static void sky2_reset(struct sky2_hw *h
 		sky2_pci_write32(hw, PCI_DEV_REG3, P_CLK_MACSEC_DIS);
 	}
 
+	if (hw->chip_id == CHIP_ID_YUKON_OPT) {
+		u16 reg;
+		u32 msk;
+
+		if (hw->chip_rev == 0) {
+			/* disable PCI-E PHY power down (set PHY reg 0x80, bit 7) */
+			sky2_write32(hw, Y2_PEX_PHY_DATA, (0x80UL << 16) | (1 << 7));
+
+			/* set PHY Link Detect Timer to 1.1 second (11x 100ms) */
+			reg = 10;
+		} else {
+			/* set PHY Link Detect Timer to 0.4 second (4x 100ms) */
+			reg = 3;
+		}
+
+		reg <<= PSM_CONFIG_REG4_TIMER_PHY_LINK_DETECT_BASE;
+
+		/* reset PHY Link Detect */
+		sky2_pci_write16(hw, PSM_CONFIG_REG4,
+				 reg | PSM_CONFIG_REG4_RST_PHY_LINK_DETECT);
+		sky2_pci_write16(hw, PSM_CONFIG_REG4, reg);
+
+
+		/* enable PHY Quick Link */
+		msk = sky2_read32(hw, B0_IMSK);
+		msk |= Y2_IS_PHY_QLNK;
+		sky2_write32(hw, B0_IMSK, msk);
+
+		/* check if PSMv2 was running before */
+		reg = sky2_pci_read16(hw, PSM_CONFIG_REG3);
+		if (reg & PCI_EXP_LNKCTL_ASPMC) {
+			int cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+			/* restore the PCIe Link Control register */
+			sky2_pci_write16(hw, cap + PCI_EXP_LNKCTL, reg);
+		}
+
+		/* re-enable PEX PM in PEX PHY debug reg. 8 (clear bit 12) */
+		sky2_write32(hw, Y2_PEX_PHY_DATA, PEX_DB_ACCESS | (0x08UL << 16));
+	}
+
 	/* Clear I2C IRQ noise */
 	sky2_write32(hw, B2_I2C_IRQ, 1);
 
@@ -4451,9 +4526,11 @@ static const char *sky2_name(u8 chipid, 
 		"FE+",		/* 0xb8 */
 		"Supreme",	/* 0xb9 */
 		"UL 2",		/* 0xba */
+		"Unknown",	/* 0xbb */
+		"Optima",	/* 0xbc */
 	};
 
-	if (chipid >= CHIP_ID_YUKON_XL && chipid < CHIP_ID_YUKON_UL_2)
+	if (chipid >= CHIP_ID_YUKON_XL && chipid <= CHIP_ID_YUKON_OPT)
 		strncpy(buf, name[chipid - CHIP_ID_YUKON_XL], sz);
 	else
 		snprintf(buf, sz, "(chip %#x)", chipid);
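
The "reset PHY Link Detect" sequence in sky2_reset() above packs a
timer count into bits 7..4 of PSM_CONFIG_REG4. A minimal sketch of that
packing follows; the mask, shift and reset bit mirror the
PSM_CONFIG_REG4 definitions added to sky2.h in this series, while the
macro and function names here are hypothetical and chosen only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror PSM_CONFIG_REG4_TIMER_PHY_LINK_DETECT_MSK/_BASE and
 * PSM_CONFIG_REG4_RST_PHY_LINK_DETECT; the names are illustrative. */
#define LINK_DETECT_MSK		(0xfu << 4)
#define LINK_DETECT_BASE	4
#define RST_LINK_DETECT		(1u << 0)

/* Build the register value for a timer count in the range 0..15. */
static uint16_t link_detect_reg(unsigned int ticks)
{
	assert(ticks <= 0xf);
	return (uint16_t)((ticks << LINK_DETECT_BASE) & LINK_DETECT_MSK);
}
```

Following the patch, the value would be written twice: first as
link_detect_reg(n) | RST_LINK_DETECT to pulse the reset bit, then as
plain link_detect_reg(n) to release it.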

-- 


^ permalink raw reply

* RE: [PATCH] udev: create empty regular files to represent net interfaces
From: Narendra_K @ 2009-10-29 16:44 UTC (permalink / raw)
  To: greg, Matt_Domsch
  Cc: kay.sievers, dannf, linux-hotplug, netdev, Jordan_Hargrave,
	Charles_Rose, bhutchings
In-Reply-To: <20091029142554.GA16869@kroah.com>

[-- Attachment #1: Type: text/plain, Size: 3054 bytes --]

>> Netdev team - are you in agreement that having multiple names to 
>> address the same netdevice is a worthwhile thing to add, to allow a 
>> variety of naming schemes to exist simultaneously?  If not, this whole
>> discussion will be moot, and my basic problem, that the ethX naming 
>> convention is nondeterministic, but we need determinism, remains 
>> unresolved.
>
>I'm still totally confused as to why you think this.  What is 
>wrong with what we do today, which is name network devices in 
>a deterministic manner by their MAC in userspace?  That name 
>goes into the kernel, and everyone uses the same name and is happy.

The interface name as assigned by the OS is determined by how the
interface is named first during the OS installation. This name is made
persistent by associating the name with its MAC address in userspace,
either by udev or ifcfg-eth files. In cases where there are one or more
add-in cards along with one or more integrated cards (Lan on
Motherboard), the integrated port 1, which is designated as Gb1 on the
chassis may or may not get the name "eth0". And that is the customer
expectation, most of the time.
Unattended installs and large scale image based installs are the most
affected scenarios. 

>If you don't like naming by MAC, then pick some other 
>deterministic naming scheme that works for your hardware and 
>write udev rules for it.
>
>You could easily name them in a way that could keep the lowest number
>(eth0) for the lowest PCI id if you so desired and your BIOS 
>guaranteed it.
>

This is how the lspci tree view on a PER710 (PowerEdge R710) server with
Four BCM5709 integrated NIC ports and One add-in Intel NIC port looks
like. The integrated ports are always found before the add-in nic (or
nics) by the BIOS consistently and BIOS guarantees it across every
reboot. If the OS also found and named the network ports in the same
manner, then there is no issue as integrated NIC port 1, designated Gb1
on the chassis, is always named "eth0". But we observe that this is
not always the case.

-[0000:00]-+-00.0  Intel Corporation 5520 I/O Hub to ESI Port
           +-01.0-[0000:01]--+-00.0  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           |                 \-00.1  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           +-03.0-[0000:02]--+-00.0  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           |                 \-00.1  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           +-04.0-[0000:03]----00.0  LSI Logic / Symbios Logic MegaRAID SAS 1078
           +-05.0-[0000:04]--
           +-06.0-[0000:05]--
           +-07.0-[0000:06]--
           +-09.0-[0000:07]----00.0  Intel Corporation 82598EB 10-Gigabit AT Network Connection

In such cases, pathnames like Embedded_NIC_1 -> eth[01..], point to the
right interface, and communicate a more meaningful name without any
state embedded in them.

With regards,
Narendra K

[-- Attachment #2: PER710-lspci-tv.output --]
[-- Type: application/octet-stream, Size: 735 bytes --]

-[0000:00]-+-00.0  Intel Corporation 5520 I/O Hub to ESI Port
           +-01.0-[0000:01]--+-00.0  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           |                 \-00.1  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           +-03.0-[0000:02]--+-00.0  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           |                 \-00.1  Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
           +-04.0-[0000:03]----00.0  LSI Logic / Symbios Logic MegaRAID SAS 1078
           +-05.0-[0000:04]--
           +-06.0-[0000:05]--
           +-07.0-[0000:06]--
           +-09.0-[0000:07]----00.0  Intel Corporation 82598EB 10-Gigabit AT Network Connection

^ permalink raw reply

* Re: [PATCH] udev: create empty regular files to represent net interfaces
From: Ben Hutchings @ 2009-10-29 16:49 UTC (permalink / raw)
  To: Greg KH
  Cc: Matt Domsch, Kay Sievers, dann frazier, linux-hotplug, Narendra_K,
	netdev, Jordan_Hargrave, Charles_Rose
In-Reply-To: <20091029142554.GA16869@kroah.com>

On Thu, 2009-10-29 at 07:25 -0700, Greg KH wrote:
> On Thu, Oct 29, 2009 at 08:11:25AM -0500, Matt Domsch wrote:
> > Netdev team - are you in agreement that having multiple names to
> > address the same netdevice is a worthwhile thing to add, to allow a
> > variety of naming schemes to exist simultaneously?  If not, this whole
> > discussion will be moot, and my basic problem, that the ethX naming
> > convention is nondeterministic, but we need determinism, remains
> > unresolved.
> 
> I'm still totally confused as to why you think this.  What is wrong with
> what we do today, which is name network devices in a deterministic
> manner by their MAC in userspace?  That name goes into the kernel, and
> everyone uses the same name and is happy.
> 
> If you don't like naming by MAC, then pick some other deterministic
> naming scheme that works for your hardware and write udev rules for it.
> 
> You could easily name them in a way that could keep the lowest number
> (eth0) for the lowest PCI id if you so desired and your BIOS guaranteed
> it.
> 
> This way the kernel has only one name, and so does userspace, and
> everyone is happy.

I thought there was a general trend in udev development to provide
default rules that work for almost everyone, so few users/administrators
need to override or add to them.  Compare disks and net devices:

1. Stable kernel device id
Disks: block device number
Net devices: ifindex

2. Unique identifier (across reboot)
Disks: label or UUID (each with limitations)
Net devices: (MAC address, subtype)

3. Name assignment mechanism
Disks: kernel suggests a name; udev can assign any number
Net devices: kernel assigns a single name; udev can override it

4. Default name assignment policy
Disks: names disk by device path (id), label and UUID
Net devices: assigns arbitrary stable names per (MAC address, subtype)

5. Naming by users
Disks: user can identify by any method without having to choose on a
system-wide basis
Net devices: user must identify by single name; policy can be overridden
on a system-wide basis

I fully understand the technical reasons for differences 3-5, but why
should users have to put up with it?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

--
To unsubscribe from this list: send the line "unsubscribe linux-hotplug" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] udev: create empty regular files to represent net interfaces
From: Greg KH @ 2009-10-29 16:52 UTC (permalink / raw)
  To: Narendra_K
  Cc: Matt_Domsch, kay.sievers, dannf, linux-hotplug, netdev,
	Jordan_Hargrave, Charles_Rose, bhutchings
In-Reply-To: <EDA0A4495861324DA2618B4C45DCB3EE589662@blrx3m08.blr.amer.dell.com>

On Thu, Oct 29, 2009 at 10:14:08PM +0530, Narendra_K@Dell.com wrote:
> 
> >> Netdev team - are you in agreement that having multiple names to 
> >> address the same netdevice is a worthwhile thing to add, to allow a 
> >> variety of naming schemes to exist simultaneously?  If not, this whole
> >> discussion will be moot, and my basic problem, that the ethX naming 
> >> convention is nondeterministic, but we need determinism, remains 
> >> unresolved.
> >
> >I'm still totally confused as to why you think this.  What is 
> >wrong with what we do today, which is name network devices in 
> >a deterministic manner by their MAC in userspace?  That name 
> >goes into the kernel, and everyone uses the same name and is happy.
> 
> The interface name as assigned by the OS is determined by how the
> interface is named first during the OS installation.

That sounds like a distro install issue to me, why not fix it there?

> This name is made persistent by associating the name with its MAC
> address in userspace, either by udev or ifcfg-eth files. In cases
> where there are one or more add-in cards along with one or more
> integrated cards (Lan on Motherboard), the integrated port 1, which is
> designated as Gb1 on the chassis may or may not get the name "eth0".

Exactly, who cares about "eth0" as a name?

> And that is the customer expectation, most of the time.

Then again, fix the installer to allow you to either pick the name, or
specify some rule to use to pick the name.

> Unattended installs and large scale image based installs are the most
> affected scenarios. 

Then fix the installer.

> >If you don't like naming by MAC, then pick some other 
> >deterministic naming scheme that works for your hardware and 
> >write udev rules for it.
> >
> >You could easily name them in a way that could keep the lowest number
> >(eth0) for the lowest PCI id if you so desired and your BIOS 
> >guaranteed it.
> >
> 
> This is how the lspci tree view on a PER710 (PowerEdge R710) server with
> Four BCM5709 integrated NIC ports and One add-in Intel NIC port looks
> like. The integrated ports are always found before the add-in nic (or
> nics) by the BIOS consistently and BIOS guarantees it across every
> reboot.

Great, then you are set to write a udev rule for this, right?

> If the OS also found and named the network ports in the same manner,
> then there is no issue as integrated NIC port 1, designated Gb1 on the
> chassis, is always named "eth0". But we observe that this is not
> always the case.

Sure, it's never guaranteed by the kernel that this will happen,
especially as we speed up the boot process by doing things async.

So again, just fix your installer, or write a new udev rule for your
hardware platforms, or both.  But I still fail to see why multiple names
for network devices _in the kernel_ is a solution for your issue.

> In such cases, pathnames like Embedded_NIC_1 -> eth[01..], point to the
> right interface, and communicate a more meaningful name without any
> state embedded in them.

Yes, it would be nice if pathnames worked for network devices, but
unfortunately, that's just not how network devices work :)

thanks,

greg k-h
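
The "lowest number for the lowest PCI id" policy mentioned in this
thread can be sketched as a plain sort. This is only an illustration of
the ordering rule, not a udev rule and not something the kernel does:
struct nic_port and assign_eth_names() are hypothetical names, and the
sketch assumes the platform enumerates PCI addresses consistently
across reboots.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical description of one NIC port by its PCI address. */
struct nic_port {
	unsigned int domain, bus, dev, fn;
	int eth_index;		/* filled in by assign_eth_names() */
};

/* Order ports by domain, then bus, then device, then function. */
static int cmp_pci_addr(const void *a, const void *b)
{
	const struct nic_port *x = a, *y = b;

	if (x->domain != y->domain)
		return x->domain < y->domain ? -1 : 1;
	if (x->bus != y->bus)
		return x->bus < y->bus ? -1 : 1;
	if (x->dev != y->dev)
		return x->dev < y->dev ? -1 : 1;
	if (x->fn != y->fn)
		return x->fn < y->fn ? -1 : 1;
	return 0;
}

/* Sort ports by PCI address and number them 0, 1, 2, ... in that order. */
static void assign_eth_names(struct nic_port *ports, size_t n)
{
	size_t i;

	if (n == 0)
		return;
	assert(ports != NULL);
	qsort(ports, n, sizeof(*ports), cmp_pci_addr);
	for (i = 0; i < n; i++)
		ports[i].eth_index = (int)i;
}
```

Applied to the R710 listing quoted earlier, the four integrated ports
at 01:00.0, 01:00.1, 02:00.0 and 02:00.1 would receive indices 0 to 3
and the add-in port at 07:00.0 would receive index 4.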

^ permalink raw reply
