Netdev List
* [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: fan.du @ 2012-06-18  8:24 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev, fdu


First, I'm not sure whether I have Cc'ed the right people; if not,
I apologize for the noise.


*Background*:
Once IPsec SAs are created between two peers, the kernel sets up a timer to
monitor two events: soft and hard expiration. However, the timer handler uses
xtime (wall-clock time) to calculate whether the event is a soft or a hard
expiration, so the result is affected when the system date is changed.

Normal code flow (hard expire time: 100s, soft expire time: 82s):

a) When new SAs are created, xfrm_timer_handler is called one second
after their creation. At this point it calculates the soft expire
interval (81s) and sets up the timer.

b) Soft expiration occurs: the timer is rearmed with the hard expire
interval (18s), then racoon2 is notified about the soft expire event.
racoon2 will create new SAs.

c) Hard expiration occurs: racoon2 is notified about it and will delete
the old SAs.

*Scenario*:
Setting a new date after a) but before b) can cause c) to happen first.
As a result, the old SAs are deleted before new ones are created. Normally
new SAs will be created by the next network traffic, but there
is a short window during which the network connection is down. This can
cause upper-layer connections to fail in the telecom area, so it is
better to keep the events strictly in sequence.

*Workaround*:
Setting a new time can happen:
1) before a): the SAs are then updated with the new time.
2) after a) but before b):
2a) When new SAs are created, xfrm_timer_handler is called one second
after their creation. At this point it calculates the soft expire
interval (81s) and sets up the timer (and sets a flag to mark that the
next expiration should be the soft one).

<<---- new date comes

2b) The timer fires: the calculation results in a hard expire
event, but the flag is set, so the case is caught. Sync the addtime, rearm
the timer with the hard expire interval (18s), then notify racoon2
about the soft expire event.

2c) Hard expiration occurs and racoon2 is notified about it,
so everything is in order.

3) after b): the hard expiration always happens anyway.
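
The flag-based workaround in case 2 above can be sketched as a small
user-space model. This is only an illustration under stated assumptions,
not kernel code: struct sa, the expect_soft flag, and timer_handler() are
hypothetical names, the wall clock is passed in as a plain integer instead
of being read from xtime, and the way addtime is resynced is one possible
reading of "sync the addtime".

```c
/* Hypothetical model of one SA's lifetime state (not the kernel's struct). */
struct sa {
    long add_time;    /* wall-clock second at which the SA was added */
    long soft_secs;   /* soft lifetime, e.g. 82 */
    long hard_secs;   /* hard lifetime, e.g. 100 */
    int  expect_soft; /* flag: the next timer event should be the soft one */
};

enum event { EV_NONE, EV_SOFT, EV_HARD };

/*
 * Called when the timer fires at wall-clock time `now`.
 * Returns the event reported to the key manager and stores the next
 * timer interval in *next (0 means "do not rearm").
 */
static enum event timer_handler(struct sa *x, long now, long *next)
{
    long elapsed = now - x->add_time;

    if (elapsed >= x->hard_secs) {
        if (x->expect_soft) {
            /* The clock jumped forward before the soft event was
             * delivered: resync add_time so the soft point is "now",
             * report soft first, then rearm for the remaining interval. */
            x->add_time = now - x->soft_secs;
            x->expect_soft = 0;
            *next = x->hard_secs - x->soft_secs;
            return EV_SOFT;
        }
        *next = 0;
        return EV_HARD;
    }
    if (elapsed >= x->soft_secs) {
        x->expect_soft = 0;
        *next = x->hard_secs - elapsed;
        return EV_SOFT;
    }
    /* Neither limit reached yet: arm for the soft limit and set the flag. */
    x->expect_soft = 1;
    *next = x->soft_secs - elapsed;
    return EV_NONE;
}
```

With an SA added at t=1000 (soft 82s, hard 100s), the first call at t=1001
arms an 81s timer. If the date then jumps forward, the next call still
reports the soft event and rearms for 18s, and only the call after that
reports the hard event.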


So, could you please give your comments on this?

thanks


* RE: [PATCH] bnx2x: fix panic when TX ring is full
From: Eric Dumazet @ 2012-06-18  7:38 UTC (permalink / raw)
  To: Dmitry Kravkov
  Cc: 'David Miller', netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com, thruby@google.com
In-Reply-To: <504C9EFCA2D0054393414C9CB605C37F1CF19E@SJEXCHMB06.corp.ad.broadcom.com>

On Sat, 2012-06-16 at 07:40 +0000, Dmitry Kravkov wrote:
> Hi Eric and Tomas
> 
> > From: netdev-owner@vger.kernel.org [mailto:netdev-
> > owner@vger.kernel.org] On Behalf Of David Miller
> > Sent: Saturday, June 16, 2012 1:31 AM
> > To: eric.dumazet@gmail.com
> > Cc: netdev@vger.kernel.org; therbert@google.com; evansr@google.com;
> > Eilon Greenstein; Merav Sicron; Yaniv Rosner; willemb@google.com;
> > thruby@google.com
> > Subject: Re: [PATCH] bnx2x: fix panic when TX ring is full
> > 
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 13 Jun 2012 21:45:16 +0200
> > 
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > There is a off by one error in the minimal number of BD in
> > > bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming tx
> > queue.
> > >
> > > A full size GSO packet, with data included in skb->head really needs
> > > (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split()
> > >
> > > This error triggers if BQL is disabled and heavy TCP transmit traffic
> > > occurs.
> > >
> > > bnx2x_tx_split() definitely can be called, remove a wrong comment.
> > >
> > > Reported-by: Tomas Hruby <thruby@google.com>
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Theoretically I can't see how we can reach the case where 4 BDs are required apart from the frags.
> Usually we need 2; when split is invoked, 3:
> 1.Start
> 2.Start(split)
> 3.Parsing
> + Frags
> 
> Next-page descriptors and the 2 extras for full indication are not counted as available.
> 
> Practically, I have been running the traffic for more than a day without hitting the panic.
> 
> Can you describe in detail the scenario in which you reproduced this? And which code panicked?

That's pretty immediate.

Disable BQL on your NIC.

Say you have 4 queues:

for q in 0 1 2 3
do
  echo max >/sys/class/net/eth0/queues/tx-$q/byte_queue_limits/limit_min
done

Then start 40 netperf instances:

for i in `seq 1 40`
do
 netperf -H 192.168.1.4 &
done


[  811.369026] bnx2x: [bnx2x_attn_int_deasserted3:3647(eth0)]MC assert!
[  811.369030] bnx2x:
[bnx2x_mc_assert:597(eth0)]XSTORM_ASSERT_LIST_INDEX 0x2
[  811.369036] bnx2x: [bnx2x_mc_assert:614(eth0)]XSTORM_ASSERT_INDEX 0x0
= 0x00110000 0x00000042 0x06981000 0x0001003a
[  811.369052] bnx2x: [bnx2x_attn_int_deasserted3:3653(eth0)]driver
assert
[  811.369054] bnx2x: [bnx2x_panic_dump:773(eth0)]begin crash dump
-----------------
[  811.369056] bnx2x: [bnx2x_panic_dump:780(eth0)]def_idx(0x327)
def_att_idx(0x4)  attn_state(0x1)  spq_prod_idx(0x2f)
next_stats_cnt(0x31c)
[  811.369058] bnx2x: [bnx2x_panic_dump:785(eth0)]DSB: attn bits(0x0)
ack(0x1)  id(0x0)  idx(0x4)
[  811.369060] bnx2x: [bnx2x_panic_dump:786(eth0)]     def (0x0 0x0 0x0
0x0 0x0 0x0 0x0 0x32a 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0)  igu_sb_id(0x0)
igu_seg_id(0x1) pf_id(0x0)  vnic_id(0x0)  vf_id(0xff)  vf_valid (0x0)
state(0x1)
[  811.369073] bnx2x: [bnx2x_panic_dump:830(eth0)]fp0:
rx_bd_prod(0xdfae)  rx_bd_cons(0xfb0)  rx_comp_prod(0xe86f)
rx_comp_cons(0xd86f)  *rx_cons_sb(0xd86f)
[  811.369076] bnx2x: [bnx2x_panic_dump:834(eth0)]
rx_sge_prod(0x400)  last_max_sge(0x2b)  fp_hc_idx(0x6a2f)
[  811.369078] bnx2x: [bnx2x_panic_dump:846(eth0)]fp0:
tx_pkt_prod(0xb184)  tx_pkt_cons(0xb088)  tx_bd_prod(0xa48)
tx_bd_cons(0xf943)  *tx_cons_sb(0xb088)
[  811.369080] bnx2x: [bnx2x_panic_dump:846(eth0)]fp0: tx_pkt_prod(0x0)
tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[  811.369082] bnx2x: [bnx2x_panic_dump:858(eth0)]     run indexes
(0x6a2f 0x0)
[  811.369084] bnx2x: [bnx2x_panic_dump:864(eth0)]     indexes (0x0
0xd86f 0x0 0x0 0x0 0xb088 0x0 0x0)pf_id(0x0)  vf_id(0xff)  vf_valid(0x0)
vnic_id(0x0)  same_igu_sb_1b(0x1) state(0x1)
[  811.369103] SM[0] __flags (0x0) igu_sb_id (0x1)  igu_seg_id(0x0)
time_to_expire (0x2f68cf91) timer_value(0xff)
[  811.369104] SM[1] __flags (0x0) igu_sb_id (0x1)  igu_seg_id(0x0)
time_to_expire (0x2f68cfc0) timer_value(0xff)
[  811.369106] INDEX[0] flags (0x0) timeout (0x0)
[  811.369107] INDEX[1] flags (0x2) timeout (0x6)
[  811.369108] INDEX[2] flags (0x0) timeout (0x0)
[  811.369110] INDEX[3] flags (0x0) timeout (0x0)
[  811.369111] INDEX[4] flags (0x1) timeout (0x0)
[  811.369112] INDEX[5] flags (0x3) timeout (0xc)
[  811.369113] INDEX[6] flags (0x1) timeout (0x0)
[  811.369114] INDEX[7] flags (0x1) timeout (0x0)
[  811.369116] bnx2x: [bnx2x_panic_dump:830(eth0)]fp1:
rx_bd_prod(0x11ee)  rx_bd_cons(0x1f0)  rx_comp_prod(0x1e54)
rx_comp_cons(0xe54)  *rx_cons_sb(0xe54)
[  811.369119] bnx2x: [bnx2x_panic_dump:834(eth0)]
rx_sge_prod(0x400)  last_max_sge(0x29)  fp_hc_idx(0x969d)
[  811.369121] bnx2x: [bnx2x_panic_dump:846(eth0)]fp1:
tx_pkt_prod(0xba3f)  tx_pkt_cons(0xb92f)  tx_bd_prod(0x1aee)
tx_bd_cons(0xb18)  *tx_cons_sb(0xb92f)
[  811.369123] bnx2x: [bnx2x_panic_dump:846(eth0)]fp1: tx_pkt_prod(0x0)
tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[  811.369125] bnx2x: [bnx2x_panic_dump:858(eth0)]     run indexes
(0x969d 0x0)
[  811.369128] bnx2x: [bnx2x_panic_dump:864(eth0)]     indexes (0x0
0xe54 0x0 0x0 0x0 0xb92f 0x0 0x0)pf_id(0x0)  vf_id(0xff)  vf_valid(0x0)
vnic_id(0x0)  same_igu_sb_1b(0x1) state(0x1)
[  811.369146] SM[0] __flags (0x0) igu_sb_id (0x2)  igu_seg_id(0x0)
time_to_expire (0x2f68cf8a) timer_value(0xff)
[  811.369148] SM[1] __flags (0x1) igu_sb_id (0x2)  igu_seg_id(0x0)
time_to_expire (0x2f68cfc4) timer_value(0xc)
[  811.369150] INDEX[0] flags (0x0) timeout (0x0)
[  811.369151] INDEX[1] flags (0x2) timeout (0x6)
[  811.369152] INDEX[2] flags (0x0) timeout (0x0)
[  811.369153] INDEX[3] flags (0x0) timeout (0x0)
[  811.369154] INDEX[4] flags (0x1) timeout (0x0)
[  811.369156] INDEX[5] flags (0x3) timeout (0xc)
[  811.369157] INDEX[6] flags (0x1) timeout (0x0)
[  811.369158] INDEX[7] flags (0x1) timeout (0x0)
[  811.369160] bnx2x: [bnx2x_panic_dump:830(eth0)]fp2:
rx_bd_prod(0xa84f)  rx_bd_cons(0x851)  rx_comp_prod(0xb367)
rx_comp_cons(0xa367)  *rx_cons_sb(0xa367)
[  811.369162] bnx2x: [bnx2x_panic_dump:834(eth0)]
rx_sge_prod(0x400)  last_max_sge(0x1f)  fp_hc_idx(0x2bf7)
[  811.369164] bnx2x: [bnx2x_panic_dump:846(eth0)]fp2:
tx_pkt_prod(0xb38d)  tx_pkt_cons(0xb296)  tx_bd_prod(0x1947)
tx_bd_cons(0xb59)  *tx_cons_sb(0xb296)
[  811.369166] bnx2x: [bnx2x_panic_dump:846(eth0)]fp2: tx_pkt_prod(0x0)
tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[  811.369168] bnx2x: [bnx2x_panic_dump:858(eth0)]     run indexes
(0x2bf7 0x0)
[  811.369170] bnx2x: [bnx2x_panic_dump:864(eth0)]     indexes (0x0
0xa367 0x0 0x0 0x0 0xb296 0x0 0x0)pf_id(0x0)  vf_id(0xff)  vf_valid(0x0)
vnic_id(0x0)  same_igu_sb_1b(0x1) state(0x1)
[  811.369189] SM[0] __flags (0x1) igu_sb_id (0x3)  igu_seg_id(0x0)
time_to_expire (0x2f68cfc7) timer_value(0x6)
[  811.369191] SM[1] __flags (0x0) igu_sb_id (0x3)  igu_seg_id(0x0)
time_to_expire (0x2f68cf9d) timer_value(0xff)
[  811.369192] INDEX[0] flags (0x0) timeout (0x0)
[  811.369193] INDEX[1] flags (0x2) timeout (0x6)
[  811.369194] INDEX[2] flags (0x0) timeout (0x0)
[  811.369196] INDEX[3] flags (0x0) timeout (0x0)
[  811.369197] INDEX[4] flags (0x1) timeout (0x0)
[  811.369198] INDEX[5] flags (0x3) timeout (0xc)
[  811.369199] INDEX[6] flags (0x1) timeout (0x0)
[  811.369200] INDEX[7] flags (0x1) timeout (0x0)
[  811.369202] bnx2x: [bnx2x_panic_dump:830(eth0)]fp3:
rx_bd_prod(0x7df1)  rx_bd_cons(0xdf3)  rx_comp_prod(0x8887)
rx_comp_cons(0x7887)  *rx_cons_sb(0x7887)
[  811.369204] bnx2x: [bnx2x_panic_dump:834(eth0)]
rx_sge_prod(0x400)  last_max_sge(0x1a)  fp_hc_idx(0x5cb)
[  811.369206] bnx2x: [bnx2x_panic_dump:846(eth0)]fp3:
tx_pkt_prod(0xb8ce)  tx_pkt_cons(0xb7cc)  tx_bd_prod(0x15cf)
tx_bd_cons(0x8c5)  *tx_cons_sb(0xb7cc)
[  811.369208] bnx2x: [bnx2x_panic_dump:846(eth0)]fp3: tx_pkt_prod(0x0)
tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[  811.369210] bnx2x: [bnx2x_panic_dump:858(eth0)]     run indexes
(0x5cb 0x0)
[  811.369212] bnx2x: [bnx2x_panic_dump:864(eth0)]     indexes (0x0
0x7887 0x0 0x0 0x0 0xb7cc 0x0 0x0)pf_id(0x0)  vf_id(0xff)  vf_valid(0x0)
vnic_id(0x0)  same_igu_sb_1b(0x1) state(0x1)
[  811.369231] SM[0] __flags (0x0) igu_sb_id (0x4)  igu_seg_id(0x0)
time_to_expire (0x2f68cf9e) timer_value(0xff)
[  811.369233] SM[1] __flags (0x1) igu_sb_id (0x4)  igu_seg_id(0x0)
time_to_expire (0x2f68cfe5) timer_value(0xc)
[  811.369235] INDEX[0] flags (0x0) timeout (0x0)
[  811.369236] INDEX[1] flags (0x2) timeout (0x6)
[  811.369237] INDEX[2] flags (0x0) timeout (0x0)
[  811.369238] INDEX[3] flags (0x0) timeout (0x0)
[  811.369239] INDEX[4] flags (0x1) timeout (0x0)
[  811.369240] INDEX[5] flags (0x3) timeout (0xc)
[  811.369241] INDEX[6] flags (0x1) timeout (0x0)
[  811.369242] INDEX[7] flags (0x1) timeout (0x0)
[  811.369246] bnx2x 0000:03:00.0: eth0: bc 6.2.8
[  811.369250] begin fw dump (mark 0x3c65b0)
[  811.369257] ttn 0x0->0x10
[  811.369257] attn 0x10->0x0
[  811.369258] attn 0x0->0x10
[  811.369265] attn 0x10->0x0
[  811.369265] attn 0x0->0x10
[  811.369272] attn 0x10->0x0
[  811.369272] attn 0x0->0x10
[  811.369279] attn 0x10->0x0
[  811.369279] attn 0x0->0x10
[  811.369286] attn 0x10->0x0
[  811.369286] attn 0x0->0x10
[  811.369293] attn 0x10->0x0
[  811.369294] attn 0x0->0x10
[  811.369301] attn 0x10->0x0
[  811.369301] attn 0x0->0x10
[  811.369308] attn 0x10->0x0
[  811.369308] attn 0x0->0x10
[  811.369309] attn 0x10->0x0
[  811.369316] attn 0x0->0x10
[  811.369316] attn 0x10->0x0
[  811.369323] attn 0x0->0x10
[  811.369324] attn 0x10->0x0
[  811.369331] attn 0x0->0x10
[  811.369331] attn 0x10->0x0
[  811.369338] attn 0x0->0x10
[  811.369338] attn 0x10->0x0
[  811.369345] attn 0x0->0x10
[  811.369345] attn 0x10->0x0
[  811.369352] attn 0x0->0x10
[  811.369352] attn 0x10->0x0
[  811.369359] attn 0x0->0x10
[  811.369360] attn 0x10->0x0
[  811.369360] attn 0x0->0x10
[  811.369367] attn 0x10->0x0
[  811.369367] attn 0x0->0x10
[  811.369374] attn 0x10->0x0
[  811.369375] attn 0x0->0x10
[  811.369382] attn 0x10->0x0
[  811.369382] attn 0x0->0x10
[  811.369389] attn 0x10->0x0
[  811.369389] attn 0x0->0x10
[  811.369396] attn 0x10->0x0
[  811.369396] attn 0x0->0x10
[  811.369403] attn 0x10->0x0
[  811.369403] attn 0x0->0x10
[  811.369410] attn 0x10->0x0
[  811.369411] attn 0x0->0x10
[  811.369417] attn 0x10->0x0
[  811.369418] attn 0x0->0x10
[  811.369418] attn 0x10->0x0
[  811.369439] attn 0x0->0x10
[  811.369439] attn 0x10->0x0
[  811.369446] attn 0x0->0x10
[  811.369446] attn 0x10->0x0
[  811.369453] attn 0x0->0x10
[  811.369454] attn 0x10->0x0
[  811.369461] attn 0x0->0x10
[  811.369461] attn 0x10->0x0
[  811.369469] attn 0x0->0x10
[  811.369469] attn 0x10->0x0
[  811.369477] attn 0x0->0x10
[  811.369477] attn 0x10->0x0
[  811.369484] f0: UNLOAD_REQ_WOL_DIS 0x4
[  811.369485] evnt[0.0] 0x1mcp intr[0.0]: 0x10:RSV ACCESS => 0x0 PC
0x800dd28
[  811.369507] >0x0
[  811.369507] attn 0x0->0x10
[  811.369507] attn 0x10->0x0
[  811.369515] f0: UNLOAD_DONE 0x5
[  811.369515] mcp intr[0.0]: 0x10:RSV ACCESS => 0x0 PC 0x8015e88
[  811.369530] lnk_cmn_int[0]: (rc 0)
[  811.369537] link_init[1.0]: 0x4
[  811.369538] idx = 0, req_line_speed = 0x3e8, req_duplex=0x1
[  811.369552] idx = 1, req_line_speed = 0x0, req_duplex=0x1
[  811.369560] ML recurs lvl 1
[  811.369568] init_phy[1.0]: done
[  811.369568] ML recurs lvl 1
[  811.369576] link_init[0.0]: 0x4
[  811.369576] idx = 0, req_line_speed = 0x3e8, req_duplex=0x1
[  811.369590] idx = 1, req_line_speed = 0x0, req_duplex=0x1
[  811.369598] ML recurs lvl 1
[  811.369605] init_phy[0.0]: done
[  811.369606] f0: LOAD_REQ 0x6
[  811.369613] 0.0:PMF->f0
[  811.369614] f0: LOAD_DONE 0x7
[  811.369621] evnt[0.0] 0x0->0x1000
[  811.369629] evnt[0.0] 0x1000->0x0
[  811.369629] f0: UNLOAD_REQ_WOL_DIS 0x8
[  811.369637] f0: UNLOAD_DONE 0x9
[  811.369644] lnk_cmn_int[0]: (rc 0)
[  811.369645] link_init[1.0]: 0x4
[  811.369652] idx = 0, req_line_speed = 0x3e8, req_duplex=0x1
[  811.369667] idx = 1, req_line_speed = 0x0, req_duplex=0x1
[  811.369675] Ap\x01�AML recurs lvl 1
[  811.369682] init_phy[1.0]: done
[  811.369690] ML recurs lvl 1
[  811.369690] link_init[0.0]: 0x4
[  811.369698] idx = 0, req_line_speed = 0x3e8, req_duplex=0x1
[  811.369706] idx = 1, req_line_speed = 0x0, req_duplex=0x1
[  811.369720] ML recurs lvl 1
[  811.369720] init_phy[0.0]: done
[  811.369728] f0: LOAD_REQ 0xa
[  811.369728] 0.0:PMF->f0
[  811.369735] f0: LOAD_DONE 0xb
[  811.369735] evnt[0.0] 0x0->0x1000
[  811.369742] evnt[0.0] 0x1000->0x0
[  811.369749] end of fw dump
[  811.369752] bnx2x:
[bnx2x_mc_assert:597(eth0)]XSTORM_ASSERT_LIST_INDEX 0x2
[  811.369758] bnx2x: [bnx2x_mc_assert:614(eth0)]XSTORM_ASSERT_INDEX 0x0
= 0x00110000 0x00000042 0x06981000 0x0001003a
[  811.369776] bnx2x: [bnx2x_panic_dump:996(eth0)]end crash dump
-----------------
[  819.771607] ------------[ cut here ]------------
[  819.771618] WARNING: at net/sched/sch_generic.c:263 dev_watchdog
+0x267/0x270()
[  819.771622] Hardware name: TBG,ICH10
[  819.771625] NETDEV WATCHDOG: eth0 (bnx2x): transmit queue 1 timed out
[  819.771627] Modules linked in: act_mirred cls_u32 cls_tcindex
sch_dsmark xt_multiport iptable_mangle ip_tables x_tables pca954x
i2c_mux processor cdc_acm uhci_hcd ehci_hcd i2c_dev i2c_i801 i2c_core
i2c_debug msr cpuid bnx2x crc32c libcrc32c mdio ipv6 genrtc
[  819.771661] Pid: 0, comm: swapper/6 Tainted: G        W
3.3.6-dbg-DEV #5
[  819.771664] Call Trace:
[  819.771666]  <IRQ>  [<ffffffff8107fc0f>] warn_slowpath_common
+0x7f/0xc0
[  819.771679]  [<ffffffff8107fd06>] warn_slowpath_fmt+0x46/0x50
[  819.771685]  [<ffffffff814cd327>] dev_watchdog+0x267/0x270
[  819.771692]  [<ffffffff81090343>] run_timer_softirq+0x183/0x440
[  819.771696]  [<ffffffff810902b4>] ? run_timer_softirq+0xf4/0x440
[  819.771701]  [<ffffffff814cd0c0>] ? pfifo_fast_init+0x90/0x90
[  819.771707]  [<ffffffff810875ad>] __do_softirq+0xbd/0x250
[  819.771714]  [<ffffffff810ca054>] ? tick_program_event+0x24/0x30
[  819.771724]  [<ffffffff8156169c>] call_softirq+0x1c/0x30
[  819.771730]  [<ffffffff8104f21d>] do_softirq+0x8d/0xc0
[  819.771732]  [<ffffffff810879de>] irq_exit+0x9e/0xc0
[  819.771735]  [<ffffffff81003dee>] smp_apic_timer_interrupt+0x6e/0x99
[  819.771738]  [<ffffffff81560cb0>] apic_timer_interrupt+0x70/0x80
[  819.771739]  <EOI>  [<ffffffffa011503c>] ? acpi_idle_enter_bm
+0x220/0x260 [processor]
[  819.771746]  [<ffffffffa0115037>] ? acpi_idle_enter_bm+0x21b/0x260
[processor]
[  819.771751]  [<ffffffff8146c2cd>] cpuidle_idle_call+0xcd/0x2a0
[  819.771755]  [<ffffffff8155be60>] ? notifier_call_chain+0x70/0x70
[  819.771758]  [<ffffffff8104c1d5>] cpu_idle+0x85/0xe0
[  819.771763]  [<ffffffff8154b726>] start_secondary+0x1e3/0x1ea
[  819.771764] ---[ end trace e93713a9d40cd06f ]---
[  820.851700] bnx2x: [bnx2x_clean_tx_queue:1381(eth0)]timeout waiting
for queue[0]: txdata->tx_pkt_prod(45546) != txdata->tx_pkt_cons(45192)
[  821.926357] bnx2x: [bnx2x_clean_tx_queue:1381(eth0)]timeout waiting
for queue[1]: txdata->tx_pkt_prod(47679) != txdata->tx_pkt_cons(47407)
[  823.000384] bnx2x: [bnx2x_clean_tx_queue:1381(eth0)]timeout waiting
for queue[2]: txdata->tx_pkt_prod(46070) != txdata->tx_pkt_cons(45718)
[  824.082211] bnx2x: [bnx2x_clean_tx_queue:1381(eth0)]timeout waiting
for queue[3]: txdata->tx_pkt_prod(47370) != txdata->tx_pkt_cons(47052)
[  824.084515] bnx2x: [bnx2x_del_all_macs:7179(eth0)]Failed to delete
MACs: -5
[  824.084520] bnx2x: [bnx2x_chip_cleanup:7952(eth0)]Failed to schedule
DEL commands for UC MACs list: -5
[  824.097445] bnx2x: [bnx2x_func_stop:7758(eth0)]FUNC_STOP ramrod
failed. Running a dry transaction
[  824.960780] bnx2x 0000:03:00.0: eth0: using MSI-X  IRQs: sp 41  fp[0]
42 ... fp[3] 45
[  824.967241] bnx2x: [bnx2x_nic_load:1857(eth0)]Function start failed!
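
The off-by-one described in the quoted patch description can be
illustrated with a toy ring model. This is a sketch under stated
assumptions, not the bnx2x code: MAX_FRAGS stands in for MAX_SKB_FRAGS
with an illustrative value, and struct tx_ring, ring_avail(),
must_stop_queue() and queue_packet() are hypothetical names. It only shows
why a stop threshold one BD below the worst-case packet size lets a packet
be accepted when there are not enough free descriptors for it.

```c
#include <stdbool.h>

#define MAX_FRAGS      17              /* illustrative stand-in for MAX_SKB_FRAGS */
#define WORST_CASE_BDS (MAX_FRAGS + 4) /* worst case per the patch description */

/* Hypothetical, simplified view of a TX descriptor ring. */
struct tx_ring {
    unsigned int size; /* usable BDs in the ring */
    unsigned int used; /* BDs currently queued to hardware */
};

static unsigned int ring_avail(const struct tx_ring *r)
{
    return r->size - r->used;
}

/* The queue must be stopped if a worst-case packet might no longer fit. */
static bool must_stop_queue(const struct tx_ring *r, unsigned int threshold)
{
    return ring_avail(r) < threshold;
}

/* Try to queue a packet needing `bds` descriptors; false means overflow. */
static bool queue_packet(struct tx_ring *r, unsigned int bds)
{
    if (bds > ring_avail(r))
        return false;
    r->used += bds;
    return true;
}
```

With exactly MAX_FRAGS + 3 free BDs, a threshold of MAX_FRAGS + 3 leaves
the queue running, yet a packet needing MAX_FRAGS + 4 BDs does not fit.
A threshold of MAX_FRAGS + 4 stops the queue in that state.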


* Re: [PATCH] usbnet: Activate halt interrupt endpoint before re-submit URB
From: Oliver Neukum @ 2012-06-18  7:23 UTC (permalink / raw)
  To: David Miller
  Cc: huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w,
	tom.leiming-Re5JQEeQqe8AvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20120617.163017.1067800063889498786.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

Am Montag, 18. Juni 2012, 01:30:17 schrieb David Miller:
> From: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Wed, 13 Jun 2012 20:50:31 +0800
> 
> > intr_complete() submits URB even the interrupt endpoint stalls.
> > This patch will try to activate the endpoint once the exception
> > occurs, and then re-submit the URB if the endpoint works again.
> > 
> > Signed-off-by: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> 
> Review from USB experts would be appreciated.

The code implements a minimum error handler correctly.
Did you observe a stall in actual hardware or is this a just
in case patch?

	Regards
		Oliver


* [net-next.git 3/4 (v5)] stmmac: add the Energy Efficient Ethernet support
From: Giuseppe CAVALLARO @ 2012-06-18  6:49 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, bhutchings, rayagond, davem, yuvalmin,
	Giuseppe Cavallaro
In-Reply-To: <1340002187-9248-1-git-send-email-peppe.cavallaro@st.com>

This patch adds Energy Efficient Ethernet (EEE) support to the stmmac driver.

Please see the driver's documentation for further details about this support.

Thanks also go to Rayagond Kokatanur for his first implementation.

Note:
 to clearly manage and expose the LPI interrupt status and EEE ethtool
 stats I've had to make some modifications to the driver's design, and I
 found it really useful to move other parts of the code (e.g. mmc irq stat)
 directly into the main code. This means that some core code has been
 reworked to introduce EEE.
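
As a rough illustration of that rework: host_irq_status now returns a
bitmask (see enum core_specific_irq_mask in the diff) that the caller can
turn into per-event counters. The sketch below is a user-space model, not
the driver code: struct xstats_model and count_core_irqs() are
hypothetical, and only the four LPI counters are modelled.

```c
/* Bit values copied from enum core_specific_irq_mask in this patch. */
enum core_specific_irq_mask {
    core_mmc_tx_irq = 1,
    core_mmc_rx_irq = 2,
    core_mmc_rx_csum_offload_irq = 4,
    core_irq_receive_pmt_irq = 8,
    core_irq_tx_path_in_lpi_mode = 16,
    core_irq_tx_path_exit_lpi_mode = 32,
    core_irq_rx_path_in_lpi_mode = 64,
    core_irq_rx_path_exit_lpi_mode = 128,
};

/* Hypothetical subset of struct stmmac_extra_stats. */
struct xstats_model {
    unsigned long irq_tx_path_in_lpi_mode_n;
    unsigned long irq_tx_path_exit_lpi_mode_n;
    unsigned long irq_rx_path_in_lpi_mode_n;
    unsigned long irq_rx_path_exit_lpi_mode_n;
};

/*
 * Update the counters from one status word returned by the core.
 * Takes the current "TX path is in LPI" state and returns the new one.
 */
static int count_core_irqs(struct xstats_model *s, int status, int tx_in_lpi)
{
    if (status & core_irq_tx_path_in_lpi_mode) {
        s->irq_tx_path_in_lpi_mode_n++;
        tx_in_lpi = 1;
    }
    if (status & core_irq_tx_path_exit_lpi_mode) {
        s->irq_tx_path_exit_lpi_mode_n++;
        tx_in_lpi = 0;
    }
    if (status & core_irq_rx_path_in_lpi_mode)
        s->irq_rx_path_in_lpi_mode_n++;
    if (status & core_irq_rx_path_exit_lpi_mode)
        s->irq_rx_path_exit_lpi_mode_n++;

    return tx_in_lpi;
}
```

Returning a mask keeps the register decoding in the core-specific file
while the counters and the TX LPI state stay with the caller.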

v1: initial patch
v2: fixed some sparse issues (typos)
v3: erroneously sent the v2 renamed as v3
v4:
	o Fixed the return value of the stmmac_eee_init as suggested by D.Miller
	o Totally reviewed the ethtool support for EEE
	o Added a new internal parameter to tune the SW timer for TX LPI.
v5: do not change any EEE setting if stmmac_ethtool_op_set_eee fails
    (it has to return -EOPNOTSUPP in that case).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
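For reference, the LPI timer packing done by dwmac1000_set_eee_timer() in
this patch can be checked in isolation. This is a minimal user-space
sketch, assuming the same masks as the patch (TW in bits 0-15, LS masked
with 0x7ff and shifted left by 16); lpi_timer_value() is a hypothetical
helper, not a driver function.

```c
#include <stdint.h>

/* Default LPI timers, values copied from common.h in this patch. */
#define STMMAC_DEFAULT_LIT_LS_TIMER 0x3E8
#define STMMAC_DEFAULT_TWT_LS_TIMER 0x0

/* Pack the LS and TW timers the same way the patch does before writel(). */
static uint32_t lpi_timer_value(int ls, int tw)
{
    return ((uint32_t)tw & 0xffff) | (((uint32_t)ls & 0x7ff) << 16);
}
```

With the defaults above, lpi_timer_value(STMMAC_DEFAULT_LIT_LS_TIMER,
STMMAC_DEFAULT_TWT_LS_TIMER) gives 0x03E80000. An LS value with bits set
above bit 10 is silently truncated by the mask.
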
 drivers/net/ethernet/stmicro/stmmac/common.h       |   31 ++++-
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   20 +++
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |  101 +++++++++++-
 .../net/ethernet/stmicro/stmmac/dwmac100_core.c    |    4 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    8 +
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |   57 +++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  166 +++++++++++++++++++-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |    2 +
 9 files changed, 372 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index bcd54d6..e2d0832 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -95,6 +95,16 @@ struct stmmac_extra_stats {
 	unsigned long poll_n;
 	unsigned long sched_timer_n;
 	unsigned long normal_irq_n;
+	unsigned long mmc_tx_irq_n;
+	unsigned long mmc_rx_irq_n;
+	unsigned long mmc_rx_csum_offload_irq_n;
+	/* EEE */
+	unsigned long irq_receive_pmt_irq_n;
+	unsigned long irq_tx_path_in_lpi_mode_n;
+	unsigned long irq_tx_path_exit_lpi_mode_n;
+	unsigned long irq_rx_path_in_lpi_mode_n;
+	unsigned long irq_rx_path_exit_lpi_mode_n;
+	unsigned long phy_eee_wakeup_error_n;
 };
 
 /* CSR Frequency Access Defines*/
@@ -162,6 +172,17 @@ enum tx_dma_irq_status {
 	handle_tx_rx = 3,
 };
 
+enum core_specific_irq_mask {
+	core_mmc_tx_irq = 1,
+	core_mmc_rx_irq = 2,
+	core_mmc_rx_csum_offload_irq = 4,
+	core_irq_receive_pmt_irq = 8,
+	core_irq_tx_path_in_lpi_mode = 16,
+	core_irq_tx_path_exit_lpi_mode = 32,
+	core_irq_rx_path_in_lpi_mode = 64,
+	core_irq_rx_path_exit_lpi_mode = 128,
+};
+
 /* DMA HW capabilities */
 struct dma_features {
 	unsigned int mbps_10_100;
@@ -208,6 +229,10 @@ struct dma_features {
 #define MAC_ENABLE_TX		0x00000008	/* Transmitter Enable */
 #define MAC_RNABLE_RX		0x00000004	/* Receiver Enable */
 
+/* Default LPI timers */
+#define STMMAC_DEFAULT_LIT_LS_TIMER	0x3E8
+#define STMMAC_DEFAULT_TWT_LS_TIMER	0x0
+
 struct stmmac_desc_ops {
 	/* DMA RX descriptor ring initialization */
 	void (*init_rx_desc) (struct dma_desc *p, unsigned int ring_size,
@@ -278,7 +303,7 @@ struct stmmac_ops {
 	/* Dump MAC registers */
 	void (*dump_regs) (void __iomem *ioaddr);
 	/* Handle extra events on specific interrupts hw dependent */
-	void (*host_irq_status) (void __iomem *ioaddr);
+	int (*host_irq_status) (void __iomem *ioaddr);
 	/* Multicast filter setting */
 	void (*set_filter) (struct net_device *dev, int id);
 	/* Flow control setting */
@@ -291,6 +316,10 @@ struct stmmac_ops {
 			       unsigned int reg_n);
 	void (*get_umac_addr) (void __iomem *ioaddr, unsigned char *addr,
 			       unsigned int reg_n);
+	void (*set_eee_mode) (void __iomem *ioaddr);
+	void (*reset_eee_mode) (void __iomem *ioaddr);
+	void (*set_eee_timer) (void __iomem *ioaddr, int ls, int tw);
+	void (*set_eee_pls) (void __iomem *ioaddr, int link);
 };
 
 struct mac_link {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 23478bf..f90fcb5 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -36,6 +36,7 @@
 
 #define GMAC_INT_STATUS		0x00000038	/* interrupt status register */
 enum dwmac1000_irq_status {
+	lpiis_irq = 0x400,
 	time_stamp_irq = 0x0200,
 	mmc_rx_csum_offload_irq = 0x0080,
 	mmc_tx_irq = 0x0040,
@@ -60,6 +61,25 @@ enum power_event {
 	power_down = 0x00000001,
 };
 
+/* Energy Efficient Ethernet (EEE)
+ *
+ * LPI status, timer and control register offset
+ */
+#define LPI_CTRL_STATUS	0x0030
+#define LPI_TIMER_CTRL	0x0034
+
+/* LPI control and status defines */
+#define LPI_CTRL_STATUS_LPITXA	0x00080000	/* Enable LPI TX Automate */
+#define LPI_CTRL_STATUS_PLSEN	0x00040000	/* Enable PHY Link Status */
+#define LPI_CTRL_STATUS_PLS	0x00020000	/* PHY Link Status */
+#define LPI_CTRL_STATUS_LPIEN	0x00010000	/* LPI Enable */
+#define LPI_CTRL_STATUS_RLPIST	0x00000200	/* Receive LPI state */
+#define LPI_CTRL_STATUS_TLPIST	0x00000100	/* Transmit LPI state */
+#define LPI_CTRL_STATUS_RLPIEX	0x00000008	/* Receive LPI Exit */
+#define LPI_CTRL_STATUS_RLPIEN	0x00000004	/* Receive LPI Entry */
+#define LPI_CTRL_STATUS_TLPIEX	0x00000002	/* Transmit LPI Exit */
+#define LPI_CTRL_STATUS_TLPIEN	0x00000001	/* Transmit LPI Entry */
+
 /* GMAC HW ADDR regs */
 #define GMAC_ADDR_HIGH(reg)	(((reg > 15) ? 0x00000800 : 0x00000040) + \
 				(reg * 8))
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index b5e4d02..bfe0226 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -194,26 +194,107 @@ static void dwmac1000_pmt(void __iomem *ioaddr, unsigned long mode)
 }
 
 
-static void dwmac1000_irq_status(void __iomem *ioaddr)
+static int dwmac1000_irq_status(void __iomem *ioaddr)
 {
 	u32 intr_status = readl(ioaddr + GMAC_INT_STATUS);
+	int status = 0;
 
 	/* Not used events (e.g. MMC interrupts) are not handled. */
-	if ((intr_status & mmc_tx_irq))
-		CHIP_DBG(KERN_DEBUG "GMAC: MMC tx interrupt: 0x%08x\n",
+	if ((intr_status & mmc_tx_irq)) {
+		CHIP_DBG(KERN_INFO "GMAC: MMC tx interrupt: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_TX_INTR));
-	if (unlikely(intr_status & mmc_rx_irq))
-		CHIP_DBG(KERN_DEBUG "GMAC: MMC rx interrupt: 0x%08x\n",
+		status |= core_mmc_tx_irq;
+	}
+	if (unlikely(intr_status & mmc_rx_irq)) {
+		CHIP_DBG(KERN_INFO "GMAC: MMC rx interrupt: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_RX_INTR));
-	if (unlikely(intr_status & mmc_rx_csum_offload_irq))
-		CHIP_DBG(KERN_DEBUG "GMAC: MMC rx csum offload: 0x%08x\n",
+		status |= core_mmc_rx_irq;
+	}
+	if (unlikely(intr_status & mmc_rx_csum_offload_irq)) {
+		CHIP_DBG(KERN_INFO "GMAC: MMC rx csum offload: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_RX_CSUM_OFFLOAD));
+		status |= core_mmc_rx_csum_offload_irq;
+	}
 	if (unlikely(intr_status & pmt_irq)) {
-		CHIP_DBG(KERN_DEBUG "GMAC: received Magic frame\n");
+		CHIP_DBG(KERN_INFO "GMAC: received Magic frame\n");
 		/* clear the PMT bits 5 and 6 by reading the PMT
 		 * status register. */
 		readl(ioaddr + GMAC_PMT);
+		status |= core_irq_receive_pmt_irq;
 	}
+	/* MAC tx/rx EEE LPI entry/exit interrupts */
+	if (intr_status & lpiis_irq) {
+		/* Clean LPI interrupt by reading the Reg 12 */
+		u32 lpi_status = readl(ioaddr + LPI_CTRL_STATUS);
+
+		if (lpi_status & LPI_CTRL_STATUS_TLPIEN) {
+			CHIP_DBG(KERN_INFO "GMAC TX entered in LPI\n");
+			status |= core_irq_tx_path_in_lpi_mode;
+		}
+		if (lpi_status & LPI_CTRL_STATUS_TLPIEX) {
+			CHIP_DBG(KERN_INFO "GMAC TX exit from LPI\n");
+			status |= core_irq_tx_path_exit_lpi_mode;
+		}
+		if (lpi_status & LPI_CTRL_STATUS_RLPIEN) {
+			CHIP_DBG(KERN_INFO "GMAC RX entered in LPI\n");
+			status |= core_irq_rx_path_in_lpi_mode;
+		}
+		if (lpi_status & LPI_CTRL_STATUS_RLPIEX) {
+			CHIP_DBG(KERN_INFO "GMAC RX exit from LPI\n");
+			status |= core_irq_rx_path_exit_lpi_mode;
+		}
+	}
+
+	return status;
+}
+
+static void  dwmac1000_set_eee_mode(void __iomem *ioaddr)
+{
+	u32 value;
+
+	/* Enable the link status receive on RGMII, SGMII or SMII
+	 * receive path and instruct the transmit to enter in LPI
+	 * state. */
+	value = readl(ioaddr + LPI_CTRL_STATUS);
+	value |= LPI_CTRL_STATUS_LPIEN | LPI_CTRL_STATUS_LPITXA;
+	writel(value, ioaddr + LPI_CTRL_STATUS);
+}
+
+static void  dwmac1000_reset_eee_mode(void __iomem *ioaddr)
+{
+	u32 value;
+
+	value = readl(ioaddr + LPI_CTRL_STATUS);
+	value &= ~(LPI_CTRL_STATUS_LPIEN | LPI_CTRL_STATUS_LPITXA);
+	writel(value, ioaddr + LPI_CTRL_STATUS);
+}
+
+static void  dwmac1000_set_eee_pls(void __iomem *ioaddr, int link)
+{
+	u32 value;
+
+	value = readl(ioaddr + LPI_CTRL_STATUS);
+
+	if (link)
+		value |= LPI_CTRL_STATUS_PLS;
+	else
+		value &= ~LPI_CTRL_STATUS_PLS;
+
+	writel(value, ioaddr + LPI_CTRL_STATUS);
+}
+
+static void  dwmac1000_set_eee_timer(void __iomem *ioaddr, int ls, int tw)
+{
+	int value = ((tw & 0xffff)) | ((ls & 0x7ff) << 16);
+
+	/* Program the timers in the LPI timer control register:
+	 * LS: minimum time (ms) for which the link
+	 *  status from PHY should be ok before transmitting
+	 *  the LPI pattern.
+	 * TW: minimum time (us) for which the core waits
+	 *  after it has stopped transmitting the LPI pattern.
+	 */
+	writel(value, ioaddr + LPI_TIMER_CTRL);
 }
 
 static const struct stmmac_ops dwmac1000_ops = {
@@ -226,6 +307,10 @@ static const struct stmmac_ops dwmac1000_ops = {
 	.pmt = dwmac1000_pmt,
 	.set_umac_addr = dwmac1000_set_umac_addr,
 	.get_umac_addr = dwmac1000_get_umac_addr,
+	.set_eee_mode =  dwmac1000_set_eee_mode,
+	.reset_eee_mode =  dwmac1000_reset_eee_mode,
+	.set_eee_timer =  dwmac1000_set_eee_timer,
+	.set_eee_pls =  dwmac1000_set_eee_pls,
 };
 
 struct mac_device_info *dwmac1000_setup(void __iomem *ioaddr)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
index 19e0f4e..f83210e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
@@ -72,9 +72,9 @@ static int dwmac100_rx_ipc_enable(void __iomem *ioaddr)
 	return 0;
 }
 
-static void dwmac100_irq_status(void __iomem *ioaddr)
+static int dwmac100_irq_status(void __iomem *ioaddr)
 {
-	return;
+	return 0;
 }
 
 static void dwmac100_set_umac_addr(void __iomem *ioaddr, unsigned char *addr,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 6e0360f..e678ce3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -70,6 +70,7 @@
 #define DMA_INTR_DEFAULT_MASK	(DMA_INTR_NORMAL | DMA_INTR_ABNORMAL)
 
 /* DMA Status register defines */
+#define DMA_STATUS_GLPII	0x40000000	/* GMAC LPI interrupt */
 #define DMA_STATUS_GPI		0x10000000	/* PMT interrupt */
 #define DMA_STATUS_GMI		0x08000000	/* MMC interrupt */
 #define DMA_STATUS_GLI		0x04000000	/* GMAC Line interface int */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index dc20c56..ab4c376 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -87,6 +87,12 @@ struct stmmac_priv {
 #endif
 	int clk_csr;
 	int synopsys_id;
+	struct timer_list eee_ctrl_timer;
+	bool tx_path_in_lpi_mode;
+	int lpi_irq;
+	int eee_enabled;
+	int eee_active;
+	int tx_lpi_timer;
 };
 
 extern int phyaddr;
@@ -104,6 +110,8 @@ int stmmac_dvr_remove(struct net_device *ndev);
 struct stmmac_priv *stmmac_dvr_probe(struct device *device,
 				     struct plat_stmmacenet_data *plat_dat,
 				     void __iomem *addr);
+void stmmac_disable_eee_mode(struct stmmac_priv *priv);
+bool stmmac_eee_init(struct stmmac_priv *priv);
 
 #ifdef CONFIG_HAVE_CLK
 static inline int stmmac_clk_enable(struct stmmac_priv *priv)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index ce43184..76fd61a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -93,6 +93,16 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
 	STMMAC_STAT(poll_n),
 	STMMAC_STAT(sched_timer_n),
 	STMMAC_STAT(normal_irq_n),
+	STMMAC_STAT(normal_irq_n),
+	STMMAC_STAT(mmc_tx_irq_n),
+	STMMAC_STAT(mmc_rx_irq_n),
+	STMMAC_STAT(mmc_rx_csum_offload_irq_n),
+	STMMAC_STAT(irq_receive_pmt_irq_n),
+	STMMAC_STAT(irq_tx_path_in_lpi_mode_n),
+	STMMAC_STAT(irq_tx_path_exit_lpi_mode_n),
+	STMMAC_STAT(irq_rx_path_in_lpi_mode_n),
+	STMMAC_STAT(irq_rx_path_exit_lpi_mode_n),
+	STMMAC_STAT(phy_eee_wakeup_error_n),
 };
 #define STMMAC_STATS_LEN ARRAY_SIZE(stmmac_gstrings_stats)
 
@@ -366,6 +376,11 @@ static void stmmac_get_ethtool_stats(struct net_device *dev,
 					     (*(u32 *)p);
 			}
 		}
+		if (priv->eee_enabled) {
+			int val = phy_get_eee_err(priv->phydev);
+			if (val)
+				priv->xstats.phy_eee_wakeup_error_n = val;
+		}
 	}
 	for (i = 0; i < STMMAC_STATS_LEN; i++) {
 		char *p = (char *)priv + stmmac_gstrings_stats[i].stat_offset;
@@ -464,6 +479,46 @@ static int stmmac_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 	return 0;
 }
 
+static int stmmac_ethtool_op_get_eee(struct net_device *dev,
+				     struct ethtool_eee *edata)
+{
+	struct stmmac_priv *priv = netdev_priv(dev);
+
+	if (!priv->dma_cap.eee)
+		return -EOPNOTSUPP;
+
+	edata->eee_enabled = priv->eee_enabled;
+	edata->eee_active = priv->eee_active;
+	edata->tx_lpi_timer = priv->tx_lpi_timer;
+
+	return phy_ethtool_get_eee(priv->phydev, edata);
+}
+
+static int stmmac_ethtool_op_set_eee(struct net_device *dev,
+				     struct ethtool_eee *edata)
+{
+	struct stmmac_priv *priv = netdev_priv(dev);
+
+	priv->eee_enabled = edata->eee_enabled;
+
+	if (!priv->eee_enabled)
+		stmmac_disable_eee_mode(priv);
+	else {
+		/* We are asking for enabling the EEE but it is safe
+		 * to verify all by invoking the eee_init function.
+		 * In case of failure it will return an error.
+		 */
+		priv->eee_enabled = stmmac_eee_init(priv);
+		if (!priv->eee_enabled)
+			return -EOPNOTSUPP;
+
+		/* Do not change tx_lpi_timer in case of failure */
+		priv->tx_lpi_timer = edata->tx_lpi_timer;
+	}
+
+	return phy_ethtool_set_eee(priv->phydev, edata);
+}
+
 static const struct ethtool_ops stmmac_ethtool_ops = {
 	.begin = stmmac_check_if_running,
 	.get_drvinfo = stmmac_ethtool_getdrvinfo,
@@ -480,6 +535,8 @@ static const struct ethtool_ops stmmac_ethtool_ops = {
 	.get_strings = stmmac_get_strings,
 	.get_wol = stmmac_get_wol,
 	.set_wol = stmmac_set_wol,
+	.get_eee = stmmac_ethtool_op_get_eee,
+	.set_eee = stmmac_ethtool_op_set_eee,
 	.get_sset_count	= stmmac_get_sset_count,
 	.get_ts_info = ethtool_op_get_ts_info,
 };
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index eba49cb..ea3bc09 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -133,6 +133,12 @@ static const u32 default_msg_level = (NETIF_MSG_DRV | NETIF_MSG_PROBE |
 				      NETIF_MSG_LINK | NETIF_MSG_IFUP |
 				      NETIF_MSG_IFDOWN | NETIF_MSG_TIMER);
 
+#define STMMAC_DEFAULT_LPI_TIMER	1000
+static int eee_timer = STMMAC_DEFAULT_LPI_TIMER;
+module_param(eee_timer, int, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(eee_timer, "LPI tx expiration time in msec");
+#define STMMAC_LPI_TIMER(x) (jiffies + msecs_to_jiffies(x))
+
 static irqreturn_t stmmac_interrupt(int irq, void *dev_id);
 
 #ifdef CONFIG_STMMAC_DEBUG_FS
@@ -161,6 +167,8 @@ static void stmmac_verify_args(void)
 		flow_ctrl = FLOW_OFF;
 	if (unlikely((pause < 0) || (pause > 0xffff)))
 		pause = PAUSE_TIME;
+	if (eee_timer < 0)
+		eee_timer = STMMAC_DEFAULT_LPI_TIMER;
 }
 
 static void stmmac_clk_csr_set(struct stmmac_priv *priv)
@@ -229,6 +237,85 @@ static inline void stmmac_hw_fix_mac_speed(struct stmmac_priv *priv)
 					  phydev->speed);
 }
 
+static void stmmac_enable_eee_mode(struct stmmac_priv *priv)
+{
+	/* Check and enter in LPI mode */
+	if ((priv->dirty_tx == priv->cur_tx) &&
+	    (priv->tx_path_in_lpi_mode == false))
+		priv->hw->mac->set_eee_mode(priv->ioaddr);
+}
+
+void stmmac_disable_eee_mode(struct stmmac_priv *priv)
+{
+	/* Exit and disable EEE in case we are in LPI state. */
+	priv->hw->mac->reset_eee_mode(priv->ioaddr);
+	del_timer_sync(&priv->eee_ctrl_timer);
+	priv->tx_path_in_lpi_mode = false;
+}
+
+/**
+ * stmmac_eee_ctrl_timer
+ * @arg : data hook
+ * Description:
+ *  If there is no data transfer and if we are not in LPI state,
+ *  then MAC Transmitter can be moved to LPI state.
+ */
+static void stmmac_eee_ctrl_timer(unsigned long arg)
+{
+	struct stmmac_priv *priv = (struct stmmac_priv *)arg;
+
+	stmmac_enable_eee_mode(priv);
+	mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(eee_timer));
+}
+
+/**
+ * stmmac_eee_init
+ * @priv: private device pointer
+ * Description:
+ *  If the EEE support has been enabled while configuring the driver,
+ *  if the GMAC actually supports the EEE (from the HW cap reg) and the
+ *  phy can also manage EEE, so enable the LPI state and start the timer
+ *  to verify if the tx path can enter in LPI state.
+ */
+bool stmmac_eee_init(struct stmmac_priv *priv)
+{
+	bool ret = false;
+
+	/* MAC core supports the EEE feature. */
+	if (priv->dma_cap.eee) {
+		/* Check if the PHY supports EEE */
+		if (phy_init_eee(priv->phydev, 1))
+			goto out;
+
+		priv->eee_active = 1;
+		init_timer(&priv->eee_ctrl_timer);
+		priv->eee_ctrl_timer.function = stmmac_eee_ctrl_timer;
+		priv->eee_ctrl_timer.data = (unsigned long)priv;
+		priv->eee_ctrl_timer.expires = STMMAC_LPI_TIMER(eee_timer);
+		add_timer(&priv->eee_ctrl_timer);
+
+		priv->hw->mac->set_eee_timer(priv->ioaddr,
+					     STMMAC_DEFAULT_LIT_LS_TIMER,
+					     priv->tx_lpi_timer);
+
+		pr_info("stmmac: Energy-Efficient Ethernet initialized\n");
+
+		ret = true;
+	}
+out:
+	return ret;
+}
+
+static void stmmac_eee_adjust(struct stmmac_priv *priv)
+{
+	/* When EEE has already been initialised we have to
+	 * modify the PLS bit in the LPI ctrl & status reg according
+	 * to the PHY link status.
+	 */
+	if (priv->eee_enabled)
+		priv->hw->mac->set_eee_pls(priv->ioaddr, priv->phydev->link);
+}
+
 /**
  * stmmac_adjust_link
  * @dev: net device structure
@@ -249,6 +336,7 @@ static void stmmac_adjust_link(struct net_device *dev)
 	    phydev->addr, phydev->link);
 
 	spin_lock_irqsave(&priv->lock, flags);
+
 	if (phydev->link) {
 		u32 ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
 
@@ -315,6 +403,8 @@ static void stmmac_adjust_link(struct net_device *dev)
 	if (new_state && netif_msg_link(priv))
 		phy_print_status(phydev);
 
+	stmmac_eee_adjust(priv);
+
 	spin_unlock_irqrestore(&priv->lock, flags);
 
 	DBG(probe, DEBUG, "stmmac_adjust_link: exiting\n");
@@ -332,7 +422,7 @@ static int stmmac_init_phy(struct net_device *dev)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
 	struct phy_device *phydev;
-	char phy_id[MII_BUS_ID_SIZE + 3];
+	char phy_id_fmt[MII_BUS_ID_SIZE + 3];
 	char bus_id[MII_BUS_ID_SIZE];
 	int interface = priv->plat->interface;
 	priv->oldlink = 0;
@@ -346,11 +436,12 @@ static int stmmac_init_phy(struct net_device *dev)
 		snprintf(bus_id, MII_BUS_ID_SIZE, "stmmac-%x",
 				priv->plat->bus_id);
 
-	snprintf(phy_id, MII_BUS_ID_SIZE + 3, PHY_ID_FMT, bus_id,
+	snprintf(phy_id_fmt, MII_BUS_ID_SIZE + 3, PHY_ID_FMT, bus_id,
 		 priv->plat->phy_addr);
-	pr_debug("stmmac_init_phy:  trying to attach to %s\n", phy_id);
+	pr_debug("stmmac_init_phy:  trying to attach to %s\n", phy_id_fmt);
 
-	phydev = phy_connect(dev, phy_id, &stmmac_adjust_link, 0, interface);
+	phydev = phy_connect(dev, phy_id_fmt, &stmmac_adjust_link, 0,
+			     interface);
 
 	if (IS_ERR(phydev)) {
 		pr_err("%s: Could not attach to PHY\n", dev->name);
@@ -689,6 +780,11 @@ static void stmmac_tx(struct stmmac_priv *priv)
 		}
 		netif_tx_unlock(priv->dev);
 	}
+
+	if ((priv->eee_enabled) && (!priv->tx_path_in_lpi_mode)) {
+		stmmac_enable_eee_mode(priv);
+		mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(eee_timer));
+	}
 	spin_unlock(&priv->tx_lock);
 }
 
@@ -1027,6 +1123,17 @@ static int stmmac_open(struct net_device *dev)
 		}
 	}
 
+	/* Request the IRQ lines */
+	if (priv->lpi_irq != -ENXIO) {
+		ret = request_irq(priv->lpi_irq, stmmac_interrupt, IRQF_SHARED,
+				  dev->name, dev);
+		if (unlikely(ret < 0)) {
+			pr_err("%s: ERROR: allocating the LPI IRQ %d (%d)\n",
+			       __func__, priv->lpi_irq, ret);
+			goto open_error_lpiirq;
+		}
+	}
+
 	/* Enable the MAC Rx/Tx */
 	stmmac_set_mac(priv->ioaddr, true);
 
@@ -1062,12 +1169,19 @@ static int stmmac_open(struct net_device *dev)
 	if (priv->phydev)
 		phy_start(priv->phydev);
 
+	priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS_TIMER;
+	priv->eee_enabled = stmmac_eee_init(priv);
+
 	napi_enable(&priv->napi);
 	skb_queue_head_init(&priv->rx_recycle);
 	netif_start_queue(dev);
 
 	return 0;
 
+open_error_lpiirq:
+	if (priv->wol_irq != dev->irq)
+		free_irq(priv->wol_irq, dev);
+
 open_error_wolirq:
 	free_irq(dev->irq, dev);
 
@@ -1093,6 +1207,9 @@ static int stmmac_release(struct net_device *dev)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
 
+	if (priv->eee_enabled)
+		del_timer_sync(&priv->eee_ctrl_timer);
+
 	/* Stop and disconnect the PHY */
 	if (priv->phydev) {
 		phy_stop(priv->phydev);
@@ -1115,6 +1232,8 @@ static int stmmac_release(struct net_device *dev)
 	free_irq(dev->irq, dev);
 	if (priv->wol_irq != dev->irq)
 		free_irq(priv->wol_irq, dev);
+	if (priv->lpi_irq != -ENXIO)
+		free_irq(priv->lpi_irq, dev);
 
 	/* Stop TX/RX DMA and clear the descriptors */
 	priv->hw->dma->stop_tx(priv->ioaddr);
@@ -1164,6 +1283,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	spin_lock(&priv->tx_lock);
 
+	if (priv->tx_path_in_lpi_mode)
+		stmmac_disable_eee_mode(priv);
+
 	entry = priv->cur_tx % txsize;
 
 #ifdef STMMAC_XMIT_DEBUG
@@ -1540,10 +1662,37 @@ static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
 		return IRQ_NONE;
 	}
 
-	if (priv->plat->has_gmac)
-		/* To handle GMAC own interrupts */
-		priv->hw->mac->host_irq_status((void __iomem *) dev->base_addr);
+	/* To handle GMAC own interrupts */
+	if (priv->plat->has_gmac) {
+		int status = priv->hw->mac->host_irq_status((void __iomem *)
+							    dev->base_addr);
+		if (unlikely(status)) {
+			if (status & core_mmc_tx_irq)
+				priv->xstats.mmc_tx_irq_n++;
+			if (status & core_mmc_rx_irq)
+				priv->xstats.mmc_rx_irq_n++;
+			if (status & core_mmc_rx_csum_offload_irq)
+				priv->xstats.mmc_rx_csum_offload_irq_n++;
+			if (status & core_irq_receive_pmt_irq)
+				priv->xstats.irq_receive_pmt_irq_n++;
+
+			/* For LPI we need to save the tx status */
+			if (status & core_irq_tx_path_in_lpi_mode) {
+				priv->xstats.irq_tx_path_in_lpi_mode_n++;
+				priv->tx_path_in_lpi_mode = true;
+			}
+			if (status & core_irq_tx_path_exit_lpi_mode) {
+				priv->xstats.irq_tx_path_exit_lpi_mode_n++;
+				priv->tx_path_in_lpi_mode = false;
+			}
+			if (status & core_irq_rx_path_in_lpi_mode)
+				priv->xstats.irq_rx_path_in_lpi_mode_n++;
+			if (status & core_irq_rx_path_exit_lpi_mode)
+				priv->xstats.irq_rx_path_exit_lpi_mode_n++;
+		}
+	}
 
+	/* To handle DMA interrupts */
 	stmmac_dma_interrupt(priv);
 
 	return IRQ_HANDLED;
@@ -2155,6 +2304,9 @@ static int __init stmmac_cmdline_opt(char *str)
 		} else if (!strncmp(opt, "pause:", 6)) {
 			if (kstrtoint(opt + 6, 0, &pause))
 				goto err;
+		} else if (!strncmp(opt, "eee_timer:", 10)) {
+			if (kstrtoint(opt + 10, 0, &eee_timer))
+				goto err;
 #ifdef CONFIG_STMMAC_TIMER
 		} else if (!strncmp(opt, "tmrate:", 7)) {
 			if (kstrtoint(opt + 7, 0, &tmrate))
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 20eb502..7d36163 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -156,6 +156,8 @@ static int stmmac_pltfr_probe(struct platform_device *pdev)
 	if (priv->wol_irq == -ENXIO)
 		priv->wol_irq = priv->dev->irq;
 
+	priv->lpi_irq = platform_get_irq_byname(pdev, "eth_lpi");
+
 	platform_set_drvdata(pdev, priv->dev);
 
 	pr_debug("STMMAC platform driver registration completed");
-- 
1.7.4.4


* [net-next.git 4/4 (v6)] phy: add the EEE support and the way to access to the MMD registers.
From: Giuseppe CAVALLARO @ 2012-06-18  6:49 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, bhutchings, rayagond, davem, yuvalmin,
	Giuseppe Cavallaro
In-Reply-To: <1340002187-9248-1-git-send-email-peppe.cavallaro@st.com>

This patch adds the support for the Energy-Efficient Ethernet (EEE)
to the Physical Abstraction Layer.
To support the EEE we have to access the MMD registers 3.20 and
7.60/61. So two new functions have been added to read/write the MMD
registers (clause 45).
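
For readers who want to see the indirect access sequence in isolation,
here is a minimal userspace sketch. It is not the patch code: fake_phy,
fake_write, fake_read, mmd_select, mmd_read and mmd_write are
hypothetical names, and the simulated PHY is an assumption that models
only registers 13 and 14 with the "data, no post increment" function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_MMD_CTRL   0x0d    /* clause 22 register 13 */
#define REG_MMD_DATA   0x0e    /* clause 22 register 14 */
#define MMD_FUNC_MASK  0xc000
#define MMD_FUNC_ADDR  0x0000  /* register 14 holds the MMD address */
#define MMD_FUNC_DATA  0x4000  /* register 14 holds data, no post increment */
#define MMD_DEVAD_MASK 0x001f

/* Hypothetical simulated PHY: 32 MMD devices with 64 registers each. */
struct fake_phy {
	uint16_t ctrl;         /* last value written to register 13 */
	uint16_t addr[32];     /* per-device address latch */
	uint16_t mmd[32][64];  /* backing store for the MMD registers */
};

static void fake_write(struct fake_phy *phy, int reg, uint16_t val)
{
	int devad;

	assert(phy != NULL);
	if (reg == REG_MMD_CTRL) {
		phy->ctrl = val;
		return;
	}
	if (reg != REG_MMD_DATA)
		return;
	devad = phy->ctrl & MMD_DEVAD_MASK;
	if ((phy->ctrl & MMD_FUNC_MASK) == MMD_FUNC_ADDR)
		phy->addr[devad] = val;                       /* latch address */
	else
		phy->mmd[devad][phy->addr[devad] % 64] = val; /* store data */
}

static uint16_t fake_read(struct fake_phy *phy, int reg)
{
	int devad;

	assert(phy != NULL);
	if (reg == REG_MMD_CTRL)
		return phy->ctrl;
	if (reg != REG_MMD_DATA)
		return 0xffff;
	devad = phy->ctrl & MMD_DEVAD_MASK;
	if ((phy->ctrl & MMD_FUNC_MASK) == MMD_FUNC_ADDR)
		return phy->addr[devad];
	return phy->mmd[devad][phy->addr[devad] % 64];
}

/* Steps 1-3: select the device, latch the register, switch to data mode. */
static void mmd_select(struct fake_phy *phy, int devad, int regnum)
{
	fake_write(phy, REG_MMD_CTRL, (uint16_t)devad);
	fake_write(phy, REG_MMD_DATA, (uint16_t)regnum);
	fake_write(phy, REG_MMD_CTRL, (uint16_t)(devad | MMD_FUNC_DATA));
}

/* Step 4 (read): register 14 now returns the selected MMD register. */
static uint16_t mmd_read(struct fake_phy *phy, int devad, int regnum)
{
	mmd_select(phy, devad, regnum);
	return fake_read(phy, REG_MMD_DATA);
}

/* Step 4 (write): register 14 now stores into the selected MMD register. */
static void mmd_write(struct fake_phy *phy, int devad, int regnum, uint16_t val)
{
	mmd_select(phy, devad, regnum);
	fake_write(phy, REG_MMD_DATA, val);
}
```

With this model, writing 0x0006 to register 7.60 and reading it back
goes through the same four bus operations in each direction.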

An Ethernet driver (I tested the stmmac) can invoke phy_init_eee to
check whether the EEE is supported by the PHY; it can also set the clock
stop enable bit in the 3.0 register.
phy_get_eee_err can be used to report the number of times the PHY
failed to complete its normal wake sequence.

In the end, this patch also adds the EEE ethtool support implementing:
 o phy_ethtool_set_eee
 o phy_ethtool_get_eee

v1: initial patch
v2: fixed some errors especially on naming convention
v3: renamed again the mmd read/write functions thanks to Ben's feedback
v4: moved file to phy.c and added the ethtool support.
v5: fixed phy_adv_to_eee, phy_eee_to_supported, phy_eee_to_adv return
    values according to ethtool API (thanks to Ben's feedback).
    Renamed some macros to avoid too long names.
v6: fixed kernel-doc comments to be properly parsed.
    Fixed the phy_init_eee function: we need to check which link mode
    was autonegotiated and then the corresponding bits in 7.60 and 7.61
    registers.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/phy/phy.c |  274 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mdio.h  |   21 +++-
 include/linux/mii.h   |    9 ++
 include/linux/phy.h   |    5 +
 4 files changed, 305 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 2e1c237..7551364 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -35,6 +35,7 @@
 #include <linux/phy.h>
 #include <linux/timer.h>
 #include <linux/workqueue.h>
+#include <linux/mdio.h>
 
 #include <linux/atomic.h>
 #include <asm/io.h>
@@ -967,3 +968,276 @@ void phy_state_machine(struct work_struct *work)
 
 	schedule_delayed_work(&phydev->state_queue, PHY_STATE_TIME * HZ);
 }
+
+static inline void mmd_phy_indirect(struct mii_bus *bus, int prtad, int devad,
+				    int addr)
+{
+	/* Write the desired MMD Devad */
+	bus->write(bus, addr, MII_MMD_CTRL, devad);
+
+	/* Write the desired MMD register address */
+	bus->write(bus, addr, MII_MMD_DATA, prtad);
+
+	/* Select the Function : DATA with no post increment */
+	bus->write(bus, addr, MII_MMD_CTRL, (devad | MII_MMD_CTRL_NOINCR));
+}
+
+/**
+ * phy_read_mmd_indirect - reads data from the MMD registers
+ * @bus: the target MII bus
+ * @prtad: MMD Address
+ * @devad: MMD DEVAD
+ * @addr: PHY address on the MII bus
+ *
+ * Description: it reads data from the MMD registers (clause 22 access to
+ * clause 45 registers) of the specified phy address.
+ * To read these registers we have to:
+ * 1) Write reg 13 // DEVAD
+ * 2) Write reg 14 // MMD Address
+ * 3) Write reg 13 // MMD Data Command for MMD DEVAD
+ * 4) Read  reg 14 // Read MMD data
+ */
+static int phy_read_mmd_indirect(struct mii_bus *bus, int prtad, int devad,
+				 int addr)
+{
+	u32 ret;
+
+	mmd_phy_indirect(bus, prtad, devad, addr);
+
+	/* Read the content of the MMD's selected register */
+	ret = bus->read(bus, addr, MII_MMD_DATA);
+
+	return ret;
+}
+
+/**
+ * phy_write_mmd_indirect - writes data to the MMD registers
+ * @bus: the target MII bus
+ * @prtad: MMD Address
+ * @devad: MMD DEVAD
+ * @addr: PHY address on the MII bus
+ * @data: data to write in the MMD register
+ *
+ * Description: Write data to the MMD registers of the specified
+ * phy address.
+ * To write these registers we have to:
+ * 1) Write reg 13 // DEVAD
+ * 2) Write reg 14 // MMD Address
+ * 3) Write reg 13 // MMD Data Command for MMD DEVAD
+ * 4) Write reg 14 // Write MMD data
+ */
+static void phy_write_mmd_indirect(struct mii_bus *bus, int prtad, int devad,
+				   int addr, u32 data)
+{
+	mmd_phy_indirect(bus, prtad, devad, addr);
+
+	/* Write the data into MMD's selected register */
+	bus->write(bus, addr, MII_MMD_DATA, data);
+}
+
+static u32 phy_eee_to_adv(u16 eee_adv)
+{
+	u32 adv = 0;
+
+	if (eee_adv & MDIO_EEE_100TX)
+		adv |= ADVERTISED_100baseT_Full;
+	if (eee_adv & MDIO_EEE_1000T)
+		adv |= ADVERTISED_1000baseT_Full;
+	if (eee_adv & MDIO_EEE_10GT)
+		adv |= ADVERTISED_10000baseT_Full;
+	if (eee_adv & MDIO_EEE_1000KX)
+		adv |= ADVERTISED_1000baseKX_Full;
+	if (eee_adv & MDIO_EEE_10GKX4)
+		adv |= ADVERTISED_10000baseKX4_Full;
+	if (eee_adv & MDIO_EEE_10GKR)
+		adv |= ADVERTISED_10000baseKR_Full;
+
+	return adv;
+}
+
+static u32 phy_eee_to_supported(u16 eee_caported)
+{
+	u32 supported = 0;
+
+	if (eee_caported & MDIO_EEE_100TX)
+		supported |= SUPPORTED_100baseT_Full;
+	if (eee_caported & MDIO_EEE_1000T)
+		supported |= SUPPORTED_1000baseT_Full;
+	if (eee_caported & MDIO_EEE_10GT)
+		supported |= SUPPORTED_10000baseT_Full;
+	if (eee_caported & MDIO_EEE_1000KX)
+		supported |= SUPPORTED_1000baseKX_Full;
+	if (eee_caported & MDIO_EEE_10GKX4)
+		supported |= SUPPORTED_10000baseKX4_Full;
+	if (eee_caported & MDIO_EEE_10GKR)
+		supported |= SUPPORTED_10000baseKR_Full;
+
+	return supported;
+}
+
+static u16 phy_adv_to_eee(u32 adv)
+{
+	u16 reg = 0;
+
+	if (adv & ADVERTISED_100baseT_Full)
+		reg |= MDIO_EEE_100TX;
+	if (adv & ADVERTISED_1000baseT_Full)
+		reg |= MDIO_EEE_1000T;
+	if (adv & ADVERTISED_10000baseT_Full)
+		reg |= MDIO_EEE_10GT;
+	if (adv & ADVERTISED_1000baseKX_Full)
+		reg |= MDIO_EEE_1000KX;
+	if (adv & ADVERTISED_10000baseKX4_Full)
+		reg |= MDIO_EEE_10GKX4;
+	if (adv & ADVERTISED_10000baseKR_Full)
+		reg |= MDIO_EEE_10GKR;
+
+	return reg;
+}
+
+/**
+ * phy_init_eee - init and check the EEE feature
+ * @phydev: target phy_device struct
+ * @clk_stop_enable: PHY may stop the clock during LPI
+ *
+ * Description: it checks if the Energy-Efficient Ethernet (EEE)
+ * is supported by looking at the MMD registers 3.20 and 7.60/61
+ * and it programs the MMD register 3.0 setting the "Clock stop enable"
+ * bit if required.
+ */
+int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable)
+{
+	int ret = -EPROTONOSUPPORT;
+
+	/* According to 802.3az, EEE is supported only in full-duplex mode.
+	 * Also the EEE feature is active when the core is operating with
+	 * MII, GMII or RGMII.
+	 */
+	if ((phydev->duplex == DUPLEX_FULL) &&
+	    ((phydev->interface == PHY_INTERFACE_MODE_MII) ||
+	    (phydev->interface == PHY_INTERFACE_MODE_GMII) ||
+	    (phydev->interface == PHY_INTERFACE_MODE_RGMII))) {
+		int eee_lp, eee_cap, eee_adv;
+		u32 lp, cap, adv;
+
+		/* First check if the EEE ability is supported */
+		eee_cap = phy_read_mmd_indirect(phydev->bus, MDIO_PCS_EEE_ABLE,
+						MDIO_MMD_PCS, phydev->addr);
+		if (eee_cap < 0)
+			return eee_cap;
+
+		cap = phy_eee_to_supported(eee_cap);
+		if (!cap)
+			goto eee_exit;
+
+		/* Check which link mode was autonegotiated and verify it in
+		 * the EEE advertising registers.
+		 */
+		eee_lp = phy_read_mmd_indirect(phydev->bus, MDIO_AN_EEE_LPABLE,
+					       MDIO_MMD_AN, phydev->addr);
+		if (eee_lp < 0)
+			return eee_lp;
+
+		eee_adv = phy_read_mmd_indirect(phydev->bus, MDIO_AN_EEE_ADV,
+						MDIO_MMD_AN, phydev->addr);
+		if (eee_adv < 0)
+			return eee_adv;
+
+		adv = phy_eee_to_adv(eee_adv);
+		lp = phy_eee_to_adv(eee_lp);
+		if (!(lp & adv & phydev->advertising))
+			goto eee_exit;
+
+		if (clk_stop_enable) {
+			/* Configure the PHY to stop receiving xMII
+			 * clock while it is signaling LPI.
+			 */
+			int val = phy_read_mmd_indirect(phydev->bus, MDIO_CTRL1,
+							MDIO_MMD_PCS,
+							phydev->addr);
+			if (val < 0)
+				return val;
+
+			val |= MDIO_PCS_CTRL1_CLKSTOP_EN;
+			phy_write_mmd_indirect(phydev->bus, MDIO_CTRL1,
+					       MDIO_MMD_PCS, phydev->addr, val);
+		}
+
+		ret = 0; /* EEE supported */
+	}
+
+eee_exit:
+	return ret;
+}
+EXPORT_SYMBOL(phy_init_eee);
+
+/**
+ * phy_get_eee_err - report the EEE wake error count
+ * @phydev: target phy_device struct
+ *
+ * Description: it reports the number of times the PHY
+ * failed to complete its normal wake sequence.
+ */
+int phy_get_eee_err(struct phy_device *phydev)
+{
+	return phy_read_mmd_indirect(phydev->bus, MDIO_PCS_EEE_WK_ERR,
+				     MDIO_MMD_PCS, phydev->addr);
+
+}
+EXPORT_SYMBOL(phy_get_eee_err);
+
+/**
+ * phy_ethtool_get_eee - get EEE supported and status
+ * @phydev: target phy_device struct
+ * @data: ethtool_eee data
+ *
+ * Description: it reports the Supported/Advertisement/LP Advertisement
+ * capabilities.
+ */
+int phy_ethtool_get_eee(struct phy_device *phydev, struct ethtool_eee *data)
+{
+	int val;
+
+	/* Get Supported EEE */
+	val = phy_read_mmd_indirect(phydev->bus, MDIO_PCS_EEE_ABLE,
+				    MDIO_MMD_PCS, phydev->addr);
+	if (val < 0)
+		return val;
+	data->supported = phy_eee_to_supported(val);
+
+	/* Get advertisement EEE */
+	val = phy_read_mmd_indirect(phydev->bus, MDIO_AN_EEE_ADV,
+				    MDIO_MMD_AN, phydev->addr);
+	if (val < 0)
+		return val;
+	data->advertised = phy_eee_to_adv(val);
+
+	/* Get LP advertisement EEE */
+	val = phy_read_mmd_indirect(phydev->bus, MDIO_AN_EEE_LPABLE,
+				    MDIO_MMD_AN, phydev->addr);
+	if (val < 0)
+		return val;
+	data->lp_advertised = phy_eee_to_adv(val);
+
+	return 0;
+}
+EXPORT_SYMBOL(phy_ethtool_get_eee);
+
+/**
+ * phy_ethtool_set_eee - set EEE supported and status
+ * @phydev: target phy_device struct
+ * @data: ethtool_eee data
+ *
+ * Description: it is to program the Advertisement EEE register.
+ */
+int phy_ethtool_set_eee(struct phy_device *phydev, struct ethtool_eee *data)
+{
+	int val;
+
+	val = phy_adv_to_eee(data->advertised);
+	phy_write_mmd_indirect(phydev->bus, MDIO_AN_EEE_ADV, MDIO_MMD_AN,
+			       phydev->addr, val);
+
+	return 0;
+}
+EXPORT_SYMBOL(phy_ethtool_set_eee);
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index dfb9479..4ad8f0e 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -43,7 +43,11 @@
 #define MDIO_PKGID2		15
 #define MDIO_AN_ADVERTISE	16	/* AN advertising (base page) */
 #define MDIO_AN_LPA		19	/* AN LP abilities (base page) */
+#define MDIO_PCS_EEE_ABLE	20	/* EEE Capability register */
+#define MDIO_PCS_EEE_WK_ERR	22	/* EEE wake error counter */
 #define MDIO_PHYXS_LNSTAT	24	/* PHY XGXS lane state */
+#define MDIO_AN_EEE_ADV		60	/* EEE advertisement */
+#define MDIO_AN_EEE_LPABLE	61	/* EEE link partner ability */
 
 /* Media-dependent registers. */
 #define MDIO_PMA_10GBT_SWAPPOL	130	/* 10GBASE-T pair swap & polarity */
@@ -56,7 +60,6 @@
 #define MDIO_PCS_10GBRT_STAT2	33	/* 10GBASE-R/-T PCS status 2 */
 #define MDIO_AN_10GBT_CTRL	32	/* 10GBASE-T auto-negotiation control */
 #define MDIO_AN_10GBT_STAT	33	/* 10GBASE-T auto-negotiation status */
-#define MDIO_AN_EEE_ADV		60	/* EEE advertisement */
 
 /* LASI (Link Alarm Status Interrupt) registers, defined by XENPAK MSA. */
 #define MDIO_PMA_LASI_RXCTRL	0x9000	/* RX_ALARM control */
@@ -82,6 +85,7 @@
 #define MDIO_AN_CTRL1_RESTART		BMCR_ANRESTART
 #define MDIO_AN_CTRL1_ENABLE		BMCR_ANENABLE
 #define MDIO_AN_CTRL1_XNP		0x2000	/* Enable extended next page */
+#define MDIO_PCS_CTRL1_CLKSTOP_EN	0x400	/* Stop the clock during LPI */
 
 /* 10 Gb/s */
 #define MDIO_CTRL1_SPEED10G		(MDIO_CTRL1_SPEEDSELEXT | 0x00)
@@ -237,9 +241,18 @@
 #define MDIO_AN_10GBT_STAT_MS		0x4000	/* Master/slave config */
 #define MDIO_AN_10GBT_STAT_MSFLT	0x8000	/* Master/slave config fault */
 
-/* AN EEE Advertisement register. */
-#define MDIO_AN_EEE_ADV_100TX		0x0002	/* Advertise 100TX EEE cap */
-#define MDIO_AN_EEE_ADV_1000T		0x0004	/* Advertise 1000T EEE cap */
+/* EEE Supported/Advertisement/LP Advertisement registers.
+ *
+ * EEE capability Register (3.20), Advertisement (7.60) and
+ * Link partner ability (7.61) registers have and can use the same identical
+ * bit masks.
+ */
+#define MDIO_EEE_100TX			0x0002	/* 100TX EEE cap */
+#define MDIO_EEE_1000T			0x0004	/* 1000T EEE cap */
+#define MDIO_EEE_10GT			0x0008	/* 10GT EEE cap */
+#define MDIO_EEE_1000KX			0x0010	/* 1000KX EEE cap */
+#define MDIO_EEE_10GKX4			0x0020	/* 10G KX4 EEE cap */
+#define MDIO_EEE_10GKR			0x0040	/* 10G KR EEE cap */
 
 /* LASI RX_ALARM control/status registers. */
 #define MDIO_PMA_LASI_RX_PHYXSLFLT	0x0001	/* PHY XS RX local fault */
diff --git a/include/linux/mii.h b/include/linux/mii.h
index 2783eca..8ef3a7a 100644
--- a/include/linux/mii.h
+++ b/include/linux/mii.h
@@ -21,6 +21,8 @@
 #define MII_EXPANSION		0x06	/* Expansion register          */
 #define MII_CTRL1000		0x09	/* 1000BASE-T control          */
 #define MII_STAT1000		0x0a	/* 1000BASE-T status           */
+#define	MII_MMD_CTRL		0x0d	/* MMD Access Control Register */
+#define	MII_MMD_DATA		0x0e	/* MMD Access Data Register */
 #define MII_ESTATUS		0x0f	/* Extended Status             */
 #define MII_DCOUNTER		0x12	/* Disconnect counter          */
 #define MII_FCSCOUNTER		0x13	/* False carrier counter       */
@@ -141,6 +143,13 @@
 #define FLOW_CTRL_TX		0x01
 #define FLOW_CTRL_RX		0x02
 
+/* MMD Access Control register fields */
+#define MII_MMD_CTRL_DEVAD_MASK	0x1f	/* Mask MMD DEVAD*/
+#define MII_MMD_CTRL_ADDR	0x0000	/* Address */
+#define MII_MMD_CTRL_NOINCR	0x4000	/* no post increment */
+#define MII_MMD_CTRL_INCR_RDWT	0x8000	/* post increment on reads & writes */
+#define MII_MMD_CTRL_INCR_ON_WT	0xC000	/* post increment on writes only */
+
 /* This structure is used in all SIOCxMIIxxx ioctl calls */
 struct mii_ioctl_data {
 	__u16		phy_id;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index c291cae..97fc4cf 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -532,6 +532,11 @@ int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
 		int (*run)(struct phy_device *));
 int phy_scan_fixups(struct phy_device *phydev);
 
+int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable);
+int phy_get_eee_err(struct phy_device *phydev);
+int phy_ethtool_set_eee(struct phy_device *phydev, struct ethtool_eee *data);
+int phy_ethtool_get_eee(struct phy_device *phydev, struct ethtool_eee *data);
+
 int __init mdio_bus_init(void);
 void mdio_bus_exit(void);
 
-- 
1.7.4.4


* [net-next.git 2/4] stmmac: update the driver Documentation and add EEE
From: Giuseppe CAVALLARO @ 2012-06-18  6:49 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, bhutchings, rayagond, davem, yuvalmin,
	Giuseppe Cavallaro
In-Reply-To: <1340002187-9248-1-git-send-email-peppe.cavallaro@st.com>

This patch updates the stmmac's documentation adding
some missing files in the section used to describe the
internal driver's structure.

Also the patch adds a new section to describe the EEE support.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 Documentation/networking/stmmac.txt |   36 +++++++++++++++++++++++++++++-----
 1 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 5cb9a19..c676b9c 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -257,9 +257,11 @@ reset procedure etc).
  o Makefile
  o stmmac_main.c: main network device driver;
  o stmmac_mdio.c: mdio functions;
+ o stmmac_pci.c: PCI driver;
+ o stmmac_platform.c: platform driver;
  o stmmac_ethtool.c: ethtool support;
  o stmmac_timer.[ch]: timer code used for mitigating the driver dma interrupts
-		      Only tested on ST40 platforms based.
+		      (only tested on ST40-based platforms);
  o stmmac.h: private driver structure;
  o common.h: common definitions and VFTs;
  o descs.h: descriptor structure definitions;
@@ -269,9 +271,11 @@ reset procedure etc).
  o dwmac100_core: MAC 100 core and dma code;
  o dwmac100_dma.c: dma funtions for the MAC chip;
  o dwmac1000.h: specific header file for the MAC;
- o dwmac_lib.c: generic DMA functions shared among chips
- o enh_desc.c: functions for handling enhanced descriptors
- o norm_desc.c: functions for handling normal descriptors
+ o dwmac_lib.c: generic DMA functions shared among chips;
+ o enh_desc.c: functions for handling enhanced descriptors;
+ o norm_desc.c: functions for handling normal descriptors;
+ o chain_mode.c/ring_mode.c: functions to manage RING/CHAINED modes;
+ o mmc_core.c/mmc.h: Management MAC Counters;
 
 5) Debug Information
 
@@ -304,7 +308,27 @@ All these are only useful during the developing stage
 and should never enabled inside the code for general usage.
 In fact, these can generate an huge amount of debug messages.
 
-6) TODO:
+6) Energy Efficient Ethernet
+
+Energy Efficient Ethernet (EEE) enables the IEEE 802.3 MAC sublayer along
+with a family of Physical layers to operate in the Low Power Idle (LPI)
+mode. The EEE mode supports the IEEE 802.3 MAC operation at 100Mbps,
+1000Mbps & 10Gbps.
+
+The LPI mode allows power saving by switching off parts of the
+communication device functionality when there is no data to be
+transmitted & received. The systems on both sides of the link can
+disable some functionalities & save power during periods of low link
+utilization. The MAC controls whether the system should enter or exit
+the LPI mode & communicates this to the PHY.
+
+As soon as the interface is opened, the driver verifies if the EEE can
+be supported. This is done by looking at both the DMA HW capability
+register and the PHY device's MMD registers.
+To enter the Tx LPI mode the driver needs a software timer
+that enables and disables the LPI mode when there is nothing to be
+transmitted.
+
+7) TODO:
  o XGMAC is not supported.
- o Add the EEE - Energy Efficient Ethernet
  o Add the PTP - precision time protocol
-- 
1.7.4.4


* [net-next.git 0/4] EEE for PAL and stmmac (V4)
From: Giuseppe CAVALLARO @ 2012-06-18  6:49 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, bhutchings, rayagond, davem, yuvalmin,
	Giuseppe Cavallaro

These patches add the EEE support in the stmmac device driver
restoring an old work I had done some months ago and not
completed in time.

I've tested all on ST STB with the IC+ 101G PHY device that has
this feature.

The initial EEE support for the stmmac has been written by Rayagond
but I have reworked all his code adding new parts and especially
performing tests on real hardware. Thx Rayagond!

In these patches, we can see that the stmmac supports the EEE
only if the DMA HW capability register says that this
feature is actually available. In that case, the driver can enter
in the Tx LPI mode by using a timer as recommended by Synopsys.
Note that EEE is supported in new chip generations; in particular
I used the 3.61a.
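
The timer-driven Tx LPI entry can be modelled in a few lines of plain C.
This is only a sketch under stated assumptions: tx_model,
lpi_timer_expired, model_xmit and model_tx_clean are hypothetical names,
and the in_lpi flag is set directly here, whereas the driver learns the
Tx LPI state from the MAC interrupt status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the Tx path state used by the LPI timer. */
struct tx_model {
	unsigned int cur_tx;    /* next descriptor to queue */
	unsigned int dirty_tx;  /* next descriptor to reclaim */
	bool in_lpi;            /* Tx path currently in LPI */
};

/* Timer expiry: enter LPI only if nothing is pending and we are not
 * already there. Returns true when the MAC would be asked to enter LPI.
 */
static bool lpi_timer_expired(struct tx_model *tx)
{
	assert(tx != NULL);
	if (tx->dirty_tx == tx->cur_tx && !tx->in_lpi) {
		tx->in_lpi = true;  /* assumption: no interrupt in this model */
		return true;
	}
	return false;
}

/* Transmit: leave LPI first, then queue one frame. */
static void model_xmit(struct tx_model *tx)
{
	assert(tx != NULL);
	tx->in_lpi = false;
	tx->cur_tx++;
}

/* Tx completion: reclaim one descriptor if any is outstanding. */
static void model_tx_clean(struct tx_model *tx)
{
	assert(tx != NULL);
	if (tx->dirty_tx != tx->cur_tx)
		tx->dirty_tx++;
}
```

In this model the timer has no effect while a frame is still
outstanding, and LPI is entered again only once the Tx ring is clean.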

At any rate, further information about how the driver treats the EEE
can be found in the stmmac.txt file (there is a patch for that).

Another patch is for the Physical Abstraction Layer, which is now able to
manage the MMD registers (clause 45); it also provides the ethtool
support to manage supported/advertisement/lp adv features.

v3: fixed the "stmmac: do not use strict_strtoul but kstrtoint"
    to use the kstrtoint.
v4: fixed the function to enable the EEE and added a check that verifies
    that the auto-negotiated link mode matches the bits in the adv and lp
    registers.
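
The check mentioned for v4 boils down to a bitwise intersection. A
minimal sketch follows; the EEE_* masks use the values the series defines
for registers 3.20, 7.60 and 7.61, while MODE_100_FULL, MODE_1000_FULL,
eee_to_modes and eee_usable are hypothetical names standing in for the
ethtool ADVERTISED_* flags and the real helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EEE bit masks shared by registers 3.20, 7.60 and 7.61. */
#define EEE_100TX  0x0002
#define EEE_1000T  0x0004

/* Hypothetical link-mode flags standing in for the ethtool bits. */
#define MODE_100_FULL   (1u << 0)
#define MODE_1000_FULL  (1u << 1)

/* Translate EEE register bits into link-mode flags. */
static uint32_t eee_to_modes(uint16_t eee)
{
	uint32_t modes = 0;

	if (eee & EEE_100TX)
		modes |= MODE_100_FULL;
	if (eee & EEE_1000T)
		modes |= MODE_1000_FULL;
	return modes;
}

/* EEE is usable only if the PHY is capable and at least one mode is
 * advertised locally for EEE, advertised by the link partner for EEE,
 * and part of the link modes the PHY advertises.
 */
static bool eee_usable(uint16_t cap, uint16_t adv, uint16_t lp,
		       uint32_t link_modes)
{
	if (!eee_to_modes(cap))
		return false;
	return (eee_to_modes(adv) & eee_to_modes(lp) & link_modes) != 0;
}
```

For example, a PHY advertising only 100TX EEE against a partner
advertising only 1000T EEE yields no common mode, so EEE stays off.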

Giuseppe Cavallaro (4):
  stmmac: do not use strict_strtoul but kstrtoint
  stmmac: update the driver Documentation and add EEE
  stmmac: add the Energy Efficient Ethernet support
  phy: add the EEE support and the way to access to the MMD registers.

 Documentation/networking/stmmac.txt                |   36 +++-
 drivers/net/ethernet/stmicro/stmmac/common.h       |   31 +++-
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   20 ++
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |  101 +++++++-
 .../net/ethernet/stmicro/stmmac/dwmac100_core.c    |    4 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    8 +
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |   57 ++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  193 ++++++++++++--
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |    2 +
 drivers/net/phy/phy.c                              |  274 ++++++++++++++++++++
 include/linux/mdio.h                               |   21 ++-
 include/linux/mii.h                                |    9 +
 include/linux/phy.h                                |    5 +
 14 files changed, 717 insertions(+), 45 deletions(-)

-- 
1.7.4.4

^ permalink raw reply

* [net-next.git 1/4 (v3)] stmmac: do not use strict_strtoul but kstrtoint
From: Giuseppe CAVALLARO @ 2012-06-18  6:49 UTC (permalink / raw)
  To: netdev
  Cc: eric.dumazet, bhutchings, rayagond, davem, yuvalmin,
	Giuseppe Cavallaro
In-Reply-To: <1340002187-9248-1-git-send-email-peppe.cavallaro@st.com>

This patch replaces the obsolete strict_strtoul with kstrtoint.

v2: also removed the casts on kstrtoul.
v3: use kstrtoint instead of kstrtoul because all the vars are integers.
    Thanks to E. Dumazet.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |   27 +++++++-------------
 1 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 590e95b..eba49cb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2129,42 +2129,35 @@ static int __init stmmac_cmdline_opt(char *str)
 		return -EINVAL;
 	while ((opt = strsep(&str, ",")) != NULL) {
 		if (!strncmp(opt, "debug:", 6)) {
-			if (strict_strtoul(opt + 6, 0, (unsigned long *)&debug))
+			if (kstrtoint(opt + 6, 0, &debug))
 				goto err;
 		} else if (!strncmp(opt, "phyaddr:", 8)) {
-			if (strict_strtoul(opt + 8, 0,
-					   (unsigned long *)&phyaddr))
+			if (kstrtoint(opt + 8, 0, &phyaddr))
 				goto err;
 		} else if (!strncmp(opt, "dma_txsize:", 11)) {
-			if (strict_strtoul(opt + 11, 0,
-					   (unsigned long *)&dma_txsize))
+			if (kstrtoint(opt + 11, 0, &dma_txsize))
 				goto err;
 		} else if (!strncmp(opt, "dma_rxsize:", 11)) {
-			if (strict_strtoul(opt + 11, 0,
-					   (unsigned long *)&dma_rxsize))
+			if (kstrtoint(opt + 11, 0, &dma_rxsize))
 				goto err;
 		} else if (!strncmp(opt, "buf_sz:", 7)) {
-			if (strict_strtoul(opt + 7, 0,
-					   (unsigned long *)&buf_sz))
+			if (kstrtoint(opt + 7, 0, &buf_sz))
 				goto err;
 		} else if (!strncmp(opt, "tc:", 3)) {
-			if (strict_strtoul(opt + 3, 0, (unsigned long *)&tc))
+			if (kstrtoint(opt + 3, 0, &tc))
 				goto err;
 		} else if (!strncmp(opt, "watchdog:", 9)) {
-			if (strict_strtoul(opt + 9, 0,
-					   (unsigned long *)&watchdog))
+			if (kstrtoint(opt + 9, 0, &watchdog))
 				goto err;
 		} else if (!strncmp(opt, "flow_ctrl:", 10)) {
-			if (strict_strtoul(opt + 10, 0,
-					   (unsigned long *)&flow_ctrl))
+			if (kstrtoint(opt + 10, 0, &flow_ctrl))
 				goto err;
 		} else if (!strncmp(opt, "pause:", 6)) {
-			if (strict_strtoul(opt + 6, 0, (unsigned long *)&pause))
+			if (kstrtoint(opt + 6, 0, &pause))
 				goto err;
 #ifdef CONFIG_STMMAC_TIMER
 		} else if (!strncmp(opt, "tmrate:", 7)) {
-			if (strict_strtoul(opt + 7, 0,
-					   (unsigned long *)&tmrate))
+			if (kstrtoint(opt + 7, 0, &tmrate))
 				goto err;
 #endif
 		}
-- 
1.7.4.4
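
For readers following along outside the kernel tree, here is a minimal userspace sketch of what the conversion buys. The old casts handed an `int` to a parser that writes an `unsigned long`, which is wider than `int` on 64-bit builds; parsing straight into an `int` with a range check avoids that. `parse_int` is a hypothetical stand-in built on `strtol`, not the kernel's `kstrtoint`, and it does not reproduce every detail of the kernel helper.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Userspace stand-in for an int parser in the style of kstrtoint():
 * the whole string must be a number that fits in an int.
 * Returns 0 on success, -EINVAL for malformed input, -ERANGE on overflow.
 * parse_int is a hypothetical name used only for this sketch.
 */
static int parse_int(const char *s, unsigned int base, int *res)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, (int)base);
	if (end == s || *end != '\0')
		return -EINVAL;		/* empty input or trailing garbage */
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -ERANGE;		/* does not fit in an int */

	*res = (int)val;
	return 0;
}
```

With base 0, as in the patch, the prefix selects the radix, so an option such as `debug:0x10` would yield 16 in this sketch.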

^ permalink raw reply related

* [PATCH] tc: prio: Perform more strict check on priomap.
From: Li Wei @ 2012-06-18  6:33 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger


Since band numbers count from zero, band must be less than
opt.bands.
---
 tc/q_prio.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tc/q_prio.c b/tc/q_prio.c
index 79b4fd0..bacc702 100644
--- a/tc/q_prio.c
+++ b/tc/q_prio.c
@@ -67,7 +67,7 @@ static int prio_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct n
 				fprintf(stderr, "Illegal \"priomap\" element\n");
 				return -1;
 			}
-			if (band > opt.bands) {
+			if (band >= opt.bands) {
 				fprintf(stderr, "\"priomap\" element is out of bands\n");
 				return -1;
 			}
-- 
1.7.1
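
The off-by-one fixed above can be illustrated with a small standalone check. This is only a sketch; `first_bad_priomap_entry` is a hypothetical helper and not part of tc.

```c
#include <stddef.h>

/*
 * Return the index of the first priomap entry that does not name a valid
 * band, or -1 if every entry is valid. Bands are numbered 0..bands-1, so
 * an entry equal to `bands` is already out of range.
 */
static int first_bad_priomap_entry(const unsigned int *priomap, size_t len,
				   unsigned int bands)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (priomap[i] >= bands)
			return (int)i;
	}
	return -1;
}
```

With 3 bands, an entry of 3 is rejected here, whereas the old `>` comparison would have let it through.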

^ permalink raw reply related

* Re: [net-next.git 1/4 (v5)] phy: add the EEE support and the way to access to the MMD registers.
From: Giuseppe CAVALLARO @ 2012-06-18  6:23 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, eric.dumazet, rayagond, davem, yuvalmin
In-Reply-To: <1339778248.2555.9.camel@bwh-desktop.uk.solarflarecom.com>

On 6/15/2012 6:37 PM, Ben Hutchings wrote:
> On Fri, 2012-06-15 at 08:06 +0200, Giuseppe CAVALLARO wrote:
>> Hello Ben
>>
>> On 6/14/2012 1:28 AM, Ben Hutchings wrote:
> [...]
>>> But you also use this condition to decide whether to enable TX LPI, so
>>> it's important that it does match the specification (§78.3) for whether
>>> EEE is supported - but it doesn't.  You need to work out what mode was
>>> autonegotiated, then check that the relevant bit is set in both our EEE
>>> advertising (*not* supported) and the LP EEE advertising masks.
>>
>> I've some doubts and, before resending the patch, I kindly ask you some
>> further details just on this point.
>>
>> In the code, I check if the EEE is supported and on GMII, MII and RGMII
>> and duplex mode; in case of success the Ethernet driver can enable the
>> TX LPI.
>> Indeed, I am only using the 3.20 and 7.61 registers w/o looking at the
>> 7.60. So this should be fixed, shouldn't it?
>> Am I missing anything else?
>> What do you mean when say that it doesn't match the specification
>> (§78.3)? I'm pointing to the '78.3 Capabilities Negotiation' chapter of
>> the IEEE802-3az, is it ok?
> 
> Yes that's what I mean.  As I read it, you need to check which link mode
> was autonegotiated, then the corresponding bit in 7.60 and 7.61.  If
> they're both set then EEE is supported on the current link.  (But, let
> me repeat, I have not done any work on implementing EEE, so it's
> entirely possible that I have misunderstood some things.)

Ben, you are right, the code needs this kind of check.
For example, my phy device only supports the 100BASE-TX and, with the
current implementation, phy_init_eee could enable EEE in 10/full
link mode, which is not good.
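
A rough sketch of the check discussed above: resolve the negotiated mode, then require the corresponding bit in both the local (7.60) and link-partner (7.61) EEE advertisement values. The bit values and the `demo_` names are assumptions made for this sketch and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions assumed to follow the EEE advertisement registers
 * (7.60 local, 7.61 link partner); names are local to this sketch.
 */
#define DEMO_EEE_100TX	0x0002u
#define DEMO_EEE_1000T	0x0004u

/* Map the negotiated speed (Mb/s) and duplex to the matching EEE bit. */
static uint16_t demo_eee_bit_for_link(int speed, bool full_duplex)
{
	if (!full_duplex)
		return 0;	/* assumption: EEE is only considered on full duplex */
	switch (speed) {
	case 100:
		return DEMO_EEE_100TX;
	case 1000:
		return DEMO_EEE_1000T;
	default:
		return 0;	/* e.g. 10 Mb/s: no EEE bit in this sketch */
	}
}

/* EEE is usable only if both sides advertise it for the resolved mode. */
static bool demo_eee_active(int speed, bool full_duplex,
			    uint16_t adv, uint16_t lp_adv)
{
	uint16_t bit = demo_eee_bit_for_link(speed, full_duplex);

	return bit != 0 && (adv & lp_adv & bit) != 0;
}
```

In this sketch a 10/full link never reports EEE as active, even when both sides advertise EEE for 100BASE-TX, which is the case described above.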

I'll send the new patch asap.

Thanks
Peppe

> 
> Ben.
> 

^ permalink raw reply

* [PATCH] tc: man: Fix incorrect parameter format in prio.
From: Li Wei @ 2012-06-18  6:23 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger


The priomap parameter uses blanks instead of commas to separate bands;
update the manpage to conform to this.
---
 man/man8/tc-prio.8 |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/man/man8/tc-prio.8 b/man/man8/tc-prio.8
index 1625fcc..55a5f3d 100644
--- a/man/man8/tc-prio.8
+++ b/man/man8/tc-prio.8
@@ -11,7 +11,7 @@ major:
 .B ] prio [ bands 
 bands
 .B ] [ priomap
-band,band,band... 
+band band band... 
 .B ] [ estimator 
 interval timeconstant
 .B ]
@@ -134,7 +134,7 @@ showing to which Priority they are mapped.
 The last column shows the result of the default priomap. On the command line,
 the default priomap looks like this:
 
-    1, 2, 2, 2, 1, 2, 0, 0 , 1, 1, 1, 1, 1, 1, 1, 1
+    1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 
 This means that priority 4, for example, gets mapped to band number 1.
 The priomap also allows you to list higher priorities (> 7) which do not
-- 
1.7.1
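
To make the corrected format concrete, the sketch below parses a blank-separated priomap and looks up the band for a priority. `parse_priomap` and `PRIOMAP_LEN` are hypothetical names for illustration; this is not how tc itself parses its arguments.

```c
#include <stdlib.h>

#define PRIOMAP_LEN 16	/* assumed length: one slot per priority 0..15 */

/*
 * Parse a blank-separated list such as "1 2 2 2 ..." into map[].
 * Returns the number of entries parsed, or -1 on malformed input.
 */
static int parse_priomap(const char *s, unsigned int map[PRIOMAP_LEN])
{
	int n = 0;

	for (;;) {
		char *end;
		unsigned long v;

		while (*s == ' ')
			s++;
		if (*s == '\0')
			break;
		if (n == PRIOMAP_LEN)
			return -1;	/* too many entries */
		if (*s < '0' || *s > '9')
			return -1;	/* not a number */
		v = strtoul(s, &end, 10);
		if (*end != ' ' && *end != '\0')
			return -1;	/* e.g. a stray comma */
		map[n++] = (unsigned int)v;
		s = end;
	}
	return n;
}
```

Feeding it the default priomap from the manpage gives `map[4] == 1`, matching the statement that priority 4 is mapped to band number 1, while the old comma-separated form is rejected.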

^ permalink raw reply related

* Re: [PATCH] net: remove my future former mail address
From: Rémi Denis-Courmont @ 2012-06-18  3:45 UTC (permalink / raw)
  To: netdev
In-Reply-To: <4FDAE7D5.5040709@nokia.com>

On Friday 15 June 2012 10:44:21 Sakari Ailus, you wrote:
> Hi Rémi,
> 
> Rémi Denis-Courmont wrote:
> > From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> > 
> > Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
> > Cc: Sakari Ailus <sakari.ailus@nokia.com>
> 
> Hmm. While you're at it, could you please remove mine from the same
> files as well, please?
> 
> I wonder if this also means that the code will be without a maintainer
> in the future.

I don't have a crystal ball. As far as I know, the ISI modem "assets" were
sold to Renesas several years ago, so it would be natural for them to take
over. Or a Renesas licensee (ST-E seemed to be one at some point).

In any case, the Phonet stack has been essentially code complete from the 
beginning: the changes with new modems occur either above in userspace, or 
below in device drivers. In the past few years, it boiled down to keeping it 
up to date with kernel internal changes.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis

^ permalink raw reply

* [PATCH] ipv4: Cap ADVMSS metric in the FIB rather than the routing cache.
From: David Miller @ 2012-06-18  2:53 UTC (permalink / raw)
  To: netdev


It makes no sense to execute this limit test every time we create a
routing cache entry.

We can't simply error out on these things since we've silently
accepted and truncated them forever.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/fib_semantics.c |    7 ++++++-
 net/ipv4/route.c         |    2 --
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index e5b7182..415f823 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -779,9 +779,14 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 			int type = nla_type(nla);
 
 			if (type) {
+				u32 val;
+
 				if (type > RTAX_MAX)
 					goto err_inval;
-				fi->fib_metrics[type - 1] = nla_get_u32(nla);
+				val = nla_get_u32(nla);
+				if (type == RTAX_ADVMSS && val > 65535 - 40)
+					val = 65535 - 40;
+				fi->fib_metrics[type - 1] = val;
 			}
 		}
 	}
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 41df529..a91f6d3 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1951,8 +1951,6 @@ static void rt_set_nexthop(struct rtable *rt, const struct flowi4 *fl4,
 
 	if (dst_mtu(dst) > IP_MAX_MTU)
 		dst_metric_set(dst, RTAX_MTU, IP_MAX_MTU);
-	if (dst_metric_raw(dst, RTAX_ADVMSS) > 65535 - 40)
-		dst_metric_set(dst, RTAX_ADVMSS, 65535 - 40);
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
 #ifdef CONFIG_IP_MULTIPLE_TABLES
-- 
1.7.10
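
The idea of the patch, clamping once when the metric is stored rather than on every cache-entry creation, can be sketched in isolation as follows. The `DEMO_` constants and `store_metric` are hypothetical stand-ins, not the kernel's definitions; only the `65535 - 40` limit comes from the patch.

```c
#include <stdint.h>

#define ADVMSS_MAX		(65535u - 40u)	/* same limit as in the patch */
#define DEMO_METRIC_ADVMSS	8		/* hypothetical metric index */
#define DEMO_METRIC_MAX		16		/* hypothetical table size */

/*
 * Store one metric at configuration time. Oversized ADVMSS values are
 * truncated silently rather than rejected, preserving the old behaviour.
 * Returns 0 on success, -1 if the metric type is out of range.
 */
static int store_metric(uint32_t metrics[DEMO_METRIC_MAX], int type,
			uint32_t val)
{
	if (type < 1 || type > DEMO_METRIC_MAX)
		return -1;
	if (type == DEMO_METRIC_ADVMSS && val > ADVMSS_MAX)
		val = ADVMSS_MAX;
	metrics[type - 1] = val;
	return 0;
}
```

Once the stored value is already capped, every later consumer can read it without repeating the test.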

^ permalink raw reply related

* Re: [PATCH 3/3] usbnet: handle remote wakeup asap
From: Ming Lei @ 2012-06-18  1:55 UTC (permalink / raw)
  To: David Miller
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, oneukum-l3A5Bk7waGM,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	stable-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20120617.162233.868029041800646826.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Mon, Jun 18, 2012 at 7:22 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Ming Lei <ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Date: Fri, 15 Jun 2012 10:22:16 +0800
>
>> David, sorry, the 'GFP_ATOMIC' above should be 'flags', so could
>> you take the fixed version from the attachment? Or could you do it
>> yourself?
>
> You are rushing this patch submission if you are finding such
> errors right after you post the patch.
>
> Take your time and properly audit your work, then resubmit your
> entire series again once it is really ready.

Good suggestion, I will check the patches again and test them further.

Thanks,
--
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next v2 01/12] netfilter: fix problem with proto register
From: Gao feng @ 2012-06-18  0:59 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, netfilter-devel
In-Reply-To: <20120616105037.GA18251@1984>

On 2012-06-16 18:50, Pablo Neira Ayuso wrote:
> On Sat, Jun 16, 2012 at 11:41:12AM +0800, Gao feng wrote:
>> commit 2c352f444ccfa966a1aa4fd8e9ee29381c467448
>> (netfilter: nf_conntrack: prepare namespace support for
>> l4 protocol trackers) registers the proto before registering sysctl.
>>
>> This changes the earlier behavior, where the proto was not
>> registered if sysctl registration failed.
>>
>> So change back to registering sysctl before registering the protos.
> 
> Could you explain why we need to change the order in the registration?
> ie. now first proto->init_net then sysctl things.

Before commit 2c352f444ccfa966a1aa4fd8e9ee29381c467448, we registered sysctl before
registering the protos, so if sysctl registration failed, the protos were not registered.

But now we register the protos first, so when sysctl registration fails the protos
can still be used, which is different from before.
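
The intended ordering can be sketched as a small standalone simulation: sysctl first, proto last, with the sysctl step undone if the proto step fails. All `demo_` names are hypothetical stand-ins and not the kernel's structures or functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-protocol state; not the kernel's structures. */
struct demo_proto {
	bool sysctl_registered;
	bool proto_registered;
	bool fail_sysctl;	/* force the sysctl step to fail */
	bool fail_proto;	/* force the proto step to fail */
};

static int demo_register_sysctl(struct demo_proto *p)
{
	if (p->fail_sysctl)
		return -1;
	p->sysctl_registered = true;
	return 0;
}

static void demo_unregister_sysctl(struct demo_proto *p)
{
	p->sysctl_registered = false;
}

static int demo_register_proto(struct demo_proto *p)
{
	if (p->fail_proto)
		return -1;
	p->proto_registered = true;
	return 0;
}

/* sysctl first, proto last; undo sysctl if the proto step fails. */
static int demo_register(struct demo_proto *p)
{
	int ret;

	ret = demo_register_sysctl(p);
	if (ret < 0)
		return ret;

	ret = demo_register_proto(p);
	if (ret < 0)
		demo_unregister_sysctl(p);

	return ret;
}
```

With this order, a sysctl failure can never leave a usable proto behind, which is the behavior from before the commit.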

> 
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/netfilter/nf_conntrack_proto.c |   37 ++++++++++++++++++++++-------------
>>  1 files changed, 23 insertions(+), 14 deletions(-)
>>
>> diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
>> index 1ea9194..a434dd7 100644
>> --- a/net/netfilter/nf_conntrack_proto.c
>> +++ b/net/netfilter/nf_conntrack_proto.c
>> @@ -253,18 +253,23 @@ int nf_conntrack_l3proto_register(struct net *net,
>>  {
>>  	int ret = 0;
>>  
>> -	if (net == &init_net)
>> -		ret = nf_conntrack_l3proto_register_net(proto);
>> +	if (proto->init_net) {
>> +		ret = proto->init_net(net);
>> +		if (ret < 0)
>> +			return ret;
>> +	}
>>  
>> +	ret = nf_ct_l3proto_register_sysctl(net, proto);
>>  	if (ret < 0)
>>  		return ret;
>>  
>> -	if (proto->init_net) {
>> -		ret = proto->init_net(net);
>> +	if (net == &init_net) {
>> +		ret = nf_conntrack_l3proto_register_net(proto);
>>  		if (ret < 0)
>> -			return ret;
>> +			nf_ct_l3proto_unregister_sysctl(net, proto);
>>  	}
>> -	return nf_ct_l3proto_register_sysctl(net, proto);
>> +
>> +	return ret;
>>  }
>>  EXPORT_SYMBOL_GPL(nf_conntrack_l3proto_register);
>>  
>> @@ -454,19 +459,23 @@ int nf_conntrack_l4proto_register(struct net *net,
>>  				  struct nf_conntrack_l4proto *l4proto)
>>  {
>>  	int ret = 0;
>> -	if (net == &init_net)
>> -		ret = nf_conntrack_l4proto_register_net(l4proto);
>> -
>> -	if (ret < 0)
>> -		return ret;
>> -
>> -	if (l4proto->init_net)
>> +	if (l4proto->init_net) {
>>  		ret = l4proto->init_net(net);
>> +		if (ret < 0)
>> +			return ret;
>> +	}
>>  
>> +	ret = nf_ct_l4proto_register_sysctl(net, l4proto);
>>  	if (ret < 0)
>>  		return ret;
>>  
>> -	return nf_ct_l4proto_register_sysctl(net, l4proto);
>> +	if (net == &init_net) {
>> +		ret = nf_conntrack_l4proto_register_net(l4proto);
>> +		if (ret < 0)
>> +			nf_ct_l4proto_unregister_sysctl(net, l4proto);
>> +	}
>> +
>> +	return ret;
>>  }
>>  EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_register);
>>  
>> -- 
>> 1.7.7.6
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3] fec: fix clk handling for Coldfire.
From: David Miller @ 2012-06-17 23:47 UTC (permalink / raw)
  To: sfking; +Cc: netdev
In-Reply-To: <201206171639.43651.sfking@fdwdc.com>

From: Steven King <sfking@fdwdc.com>
Date: Sun, 17 Jun 2012 16:39:43 -0700

> commit f4d40de39a23f0c39cca55ac63e1175c69c3d2f7
> 'net fec: do not depend on grouped clocks' broke fec for Coldfire.
> 
> Hide the details of the fec clk management in a trio of new functions:
> fec_clk_get, fec_clk_prepare and fec_clk_disable_unprepare.  These can then
> be modified as needed to manage the clks for the different arches that use
> the fec controller.
> 
> Signed-off-by: Steven King <sfking@fdwdc.com>

I said keep the ifdefs out of the foo.c code.

Put this ifdef'y into one of the driver's header files, such
as fec.h
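
One possible shape for that split, as a sketch only: the header carries the `#ifdef` and exposes a single inline accessor, so the `.c` code calls it unconditionally. The struct below models clocks as plain rates, and every `demo_` name is hypothetical; the real driver would go through `struct clk` and `clk_get_rate()`.

```c
/* ---- would live in a header such as fec.h (hypothetical layout) ---- */
struct demo_fec_priv {
#ifdef CONFIG_COLDFIRE
	unsigned long clk_rate;		/* single clock */
#else
	unsigned long clk_ipg_rate;
	unsigned long clk_ahb_rate;	/* clock used for the MII speed */
#endif
};

static inline unsigned long
demo_fec_mii_clk_rate(const struct demo_fec_priv *fep)
{
#ifdef CONFIG_COLDFIRE
	return fep->clk_rate;
#else
	return fep->clk_ahb_rate;
#endif
}

/* ---- would live in the .c file: no #ifdef needed here ---- */
#define DEMO_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

static unsigned long demo_fec_phy_speed(const struct demo_fec_priv *fep)
{
	/* Same 5 MHz divisor as the phy_speed computation in the patch. */
	return DEMO_DIV_ROUND_UP(demo_fec_mii_clk_rate(fep), 5000000UL);
}
```

The same pattern would extend to the get, prepare and disable helpers, keeping each arch difference in one place.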

^ permalink raw reply

* [PATCH v3] fec: fix clk handling for Coldfire.
From: Steven King @ 2012-06-17 23:39 UTC (permalink / raw)
  To: netdev

commit f4d40de39a23f0c39cca55ac63e1175c69c3d2f7
'net fec: do not depend on grouped clocks' broke fec for Coldfire.

Hide the details of the fec clk management in a trio of new functions:
fec_clk_get, fec_clk_prepare and fec_clk_disable_unprepare.  These can then
be modified as needed to manage the clks for the different arches that use
the fec controller.

Signed-off-by: Steven King <sfking@fdwdc.com>
---
 drivers/net/ethernet/freescale/fec.c |   84 ++++++++++++++++++++++++++--------
 1 file changed, 65 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
index ff7f4c5..1072da6 100644
--- a/drivers/net/ethernet/freescale/fec.c
+++ b/drivers/net/ethernet/freescale/fec.c
@@ -207,8 +207,12 @@ struct fec_enet_private {
 
 	struct net_device *netdev;
 
+#ifdef CONFIG_COLDFIRE
+	struct clk *clk;
+#else
 	struct clk *clk_ipg;
 	struct clk *clk_ahb;
+#endif
 
 	/* The saved address of a sent-in-place packet/buffer, for skfree(). */
 	unsigned char *tx_bounce[TX_RING_SIZE];
@@ -248,6 +252,56 @@ struct fec_enet_private {
 	int	irq[FEC_IRQ_NUM];
 };
 
+static int fec_clk_get(struct platform_device *pdev,
+		       struct fec_enet_private *fep)
+{
+	int ret;
+
+#ifdef CONFIG_COLDFIRE
+	fep->clk = clk_get(&pdev->dev, NULL);
+	if (IS_ERR(fep->clk)) {
+		ret = PTR_ERR(fep->clk);
+		goto failed_clk;
+	}
+#else
+	fep->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
+	if (IS_ERR(fep->clk_ipg)) {
+		ret = PTR_ERR(fep->clk_ipg);
+		goto failed_clk;
+	}
+
+	fep->clk_ahb = devm_clk_get(&pdev->dev, "ahb");
+	if (IS_ERR(fep->clk_ahb)) {
+		ret = PTR_ERR(fep->clk_ahb);
+		goto failed_clk;
+	}
+#endif
+	return 0;
+
+failed_clk:
+	return ret;
+}
+
+static void fec_clk_prepare(struct fec_enet_private *fep)
+{
+#ifdef CONFIG_COLDFIRE
+	clk_prepare_enable(fep->clk);
+#else
+	clk_prepare_enable(fep->clk_ahb);
+	clk_prepare_enable(fep->clk_ipg);
+#endif
+}
+
+static void fec_clk_disable_unprepare(struct fec_enet_private *fep)
+{
+#ifdef CONFIG_COLDFIRE
+	clk_disable_unprepare(fep->clk);
+#else
+	clk_disable_unprepare(fep->clk_ahb);
+	clk_disable_unprepare(fep->clk_ipg);
+#endif
+}
+
 /* FEC MII MMFR bits definition */
 #define FEC_MMFR_ST		(1 << 30)
 #define FEC_MMFR_OP_READ	(2 << 28)
@@ -1066,7 +1120,11 @@ static int fec_enet_mii_init(struct platform_device *pdev)
 	 * Reference Manual has an error on this, and gets fixed on i.MX6Q
 	 * document.
 	 */
+#ifdef CONFIG_COLDFIRE
+	fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk), 5000000);
+#else
 	fep->phy_speed = DIV_ROUND_UP(clk_get_rate(fep->clk_ahb), 5000000);
+#endif
 	if (id_entry->driver_data & FEC_QUIRK_ENET_MAC)
 		fep->phy_speed--;
 	fep->phy_speed <<= 1;
@@ -1619,20 +1677,11 @@ fec_probe(struct platform_device *pdev)
 		goto failed_pin;
 	}
 
-	fep->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
-	if (IS_ERR(fep->clk_ipg)) {
-		ret = PTR_ERR(fep->clk_ipg);
-		goto failed_clk;
-	}
 
-	fep->clk_ahb = devm_clk_get(&pdev->dev, "ahb");
-	if (IS_ERR(fep->clk_ahb)) {
-		ret = PTR_ERR(fep->clk_ahb);
+	if (fec_clk_get(pdev, fep))
 		goto failed_clk;
-	}
 
-	clk_prepare_enable(fep->clk_ahb);
-	clk_prepare_enable(fep->clk_ipg);
+	fec_clk_prepare(fep);
 
 	ret = fec_enet_init(ndev);
 	if (ret)
@@ -1655,8 +1704,7 @@ failed_register:
 	fec_enet_mii_remove(fep);
 failed_mii_init:
 failed_init:
-	clk_disable_unprepare(fep->clk_ahb);
-	clk_disable_unprepare(fep->clk_ipg);
+	fec_clk_disable_unprepare(fep);
 failed_pin:
 failed_clk:
 	for (i = 0; i < FEC_IRQ_NUM; i++) {
@@ -1689,8 +1737,7 @@ fec_drv_remove(struct platform_device *pdev)
 		if (irq > 0)
 			free_irq(irq, ndev);
 	}
-	clk_disable_unprepare(fep->clk_ahb);
-	clk_disable_unprepare(fep->clk_ipg);
+	fec_clk_disable_unprepare(fep);
 	iounmap(fep->hwp);
 	free_netdev(ndev);
 
@@ -1714,8 +1761,7 @@ fec_suspend(struct device *dev)
 		fec_stop(ndev);
 		netif_device_detach(ndev);
 	}
-	clk_disable_unprepare(fep->clk_ahb);
-	clk_disable_unprepare(fep->clk_ipg);
+	fec_clk_disable_unprepare(fep);
 
 	return 0;
 }
@@ -1726,8 +1772,8 @@ fec_resume(struct device *dev)
 	struct net_device *ndev = dev_get_drvdata(dev);
 	struct fec_enet_private *fep = netdev_priv(ndev);
 
-	clk_prepare_enable(fep->clk_ahb);
-	clk_prepare_enable(fep->clk_ipg);
+	fec_clk_prepare(fep);
+
 	if (netif_running(ndev)) {
 		fec_restart(ndev, fep->full_duplex);
 		netif_device_attach(ndev);

^ permalink raw reply related

* Re: [PATCH] usbnet: Activate halt interrupt endpoint before re-submit URB
From: David Miller @ 2012-06-17 23:30 UTC (permalink / raw)
  To: huajun.li.lee; +Cc: tom.leiming, stern, linux-usb, netdev
In-Reply-To: <CA+v9cxad+THL8u7tr_ubD7H7o+6x+qTOXOgowmPDnWUxa8aJtQ@mail.gmail.com>

From: Huajun Li <huajun.li.lee@gmail.com>
Date: Wed, 13 Jun 2012 20:50:31 +0800

> intr_complete() submits the URB even if the interrupt endpoint stalls.
> This patch tries to activate the endpoint once the exception
> occurs, and then re-submits the URB if the endpoint works again.
> 
> Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>

Review from USB experts would be appreciated.

Thanks.

^ permalink raw reply

* Re: [PATCH] net: remove my future former mail address
From: David Miller @ 2012-06-17 23:29 UTC (permalink / raw)
  To: remi; +Cc: netdev, remi.denis-courmont, sakari.ailus
In-Reply-To: <1339662543-7203-1-git-send-email-remi@remlab.net>

From: Rémi Denis-Courmont <remi@remlab.net>
Date: Thu, 14 Jun 2012 11:29:03 +0300

> From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> 
> Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: lpc_eth: free skbs in start_xmit
From: David Miller @ 2012-06-17 23:28 UTC (permalink / raw)
  To: eric.dumazet
  Cc: stigge, netdev, kevin.wells, srinivas.bakki, aletes.xgr,
	linux-arm-kernel
In-Reply-To: <1339581496.22704.346.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 13 Jun 2012 11:58:16 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Transmitted skbs can be freed immediately in lpc_eth_hard_start_xmit()
> instead of at TX completion, since the driver copies the frames into the
> DMA area.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Tested-by: Roland Stigge <stigge@antcom.de>

Applied.

^ permalink raw reply

* Re: [net-next PATCH 02/02] net/ipv4: VTI support new module for ip_vti.
From: David Miller @ 2012-06-17 23:27 UTC (permalink / raw)
  To: steffen.klassert; +Cc: saurabh.mohan, netdev
In-Reply-To: <20120615053707.GV27795@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Fri, 15 Jun 2012 07:37:07 +0200

> VTI should be independent of the IPsec protocol.
> Our IPsec implementation supports AH (and IPCOMP)
> so VTI should support these protocols too.

I agree with Steffen, it shouldn't require very much work to support
all of the 3 major IPSEC protocols.  So if we are going to add a
feature like this I require that you completely flesh out the support
and support IPV6 as well.

Doing otherwise creates a terrible user experience.

^ permalink raw reply

* Re: [PATCH] net: remove my future former mail address
From: David Miller @ 2012-06-17 23:26 UTC (permalink / raw)
  To: sakari.ailus; +Cc: remi, netdev, remi.denis-courmont
In-Reply-To: <4FDAE7D5.5040709@nokia.com>

From: Sakari Ailus <sakari.ailus@nokia.com>
Date: Fri, 15 Jun 2012 10:44:21 +0300

> I wonder if this also means that the code will be without a
> maintainer in the future.

I really want to thank Nokia for submitting code then abandoning
it completely.

I am warning future Nokia patch submitters that I'm now, as a
result, extremely reluctant to apply anything they submit other
than the absolutely most trivial bug fixes.

^ permalink raw reply

* Re: [Patch] bonding: show all the link status of slaves
From: David Miller @ 2012-06-17 23:24 UTC (permalink / raw)
  To: amwang; +Cc: netdev, fubar, andy
In-Reply-To: <1339749567-20393-1-git-send-email-amwang@redhat.com>

From: Cong Wang <amwang@redhat.com>
Date: Fri, 15 Jun 2012 16:39:27 +0800

> There are four link statuses of a bonding slave, but the procfs
> code shows a wrong status when using downdelay/updelay:
> 
> 	(slave->link == BOND_LINK_UP) ?  "up" : "down"
> 
> It ignores the other two statuses. This patch fixes it.
> 
> Cc: Jay Vosburgh <fubar@us.ibm.com>
> Cc: Andy Gospodarek <andy@greyhouse.net>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>

Applied, thanks.
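
For illustration, a four-way mapping could look like the sketch below. The enum is a local stand-in for the bonding link states, and both its values and the status strings are assumptions, not copied from the applied patch.

```c
/* Local stand-ins for the four slave link states; values are assumptions. */
enum demo_bond_link {
	DEMO_BOND_LINK_UP,	/* link is up */
	DEMO_BOND_LINK_FAIL,	/* link went down, downdelay still running */
	DEMO_BOND_LINK_DOWN,	/* link is down */
	DEMO_BOND_LINK_BACK	/* link came back, updelay still running */
};

/* Return a distinct status string for each state instead of up/down only. */
static const char *demo_bond_link_status(enum demo_bond_link link)
{
	switch (link) {
	case DEMO_BOND_LINK_UP:
		return "up";
	case DEMO_BOND_LINK_FAIL:
		return "going down";
	case DEMO_BOND_LINK_DOWN:
		return "down";
	case DEMO_BOND_LINK_BACK:
		return "going back";
	default:
		return "unknown";
	}
}
```

Compared with the two-way ternary quoted above, the transitional states no longer collapse into "down".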

^ permalink raw reply

* Re: [PATCH 3/3] usbnet: handle remote wakeup asap
From: David Miller @ 2012-06-17 23:22 UTC (permalink / raw)
  To: ming.lei; +Cc: gregkh, oneukum, netdev, linux-usb, stable
In-Reply-To: <CACVXFVMO0Z23kW59YRyfbVFu77vG6_yAMxAhOXk619V0dhF7Fw@mail.gmail.com>

From: Ming Lei <ming.lei@canonical.com>
Date: Fri, 15 Jun 2012 10:22:16 +0800

> David, sorry, the 'GFP_ATOMIC' above should be 'flags', so could
> you take the fixed version from the attachment? Or could you do it
> yourself?

You are rushing this patch submission if you are finding such
errors right after you post the patch.

Take your time and properly audit your work, then resubmit your
entire series again once it is really ready.

^ permalink raw reply

* Re: [PATCH] usbnet: sanitise overlong driver information strings
From: David Miller @ 2012-06-17 23:20 UTC (permalink / raw)
  To: phil.sutter; +Cc: netdev
In-Reply-To: <1339672722-22793-1-git-send-email-phil.sutter@viprinet.com>

From: Phil Sutter <phil.sutter@viprinet.com>
Date: Thu, 14 Jun 2012 13:18:42 +0200

> As seen on smsc75xx, driver_info->description being longer than 32
> characters messes up 'ethtool -i' output.
> 
> Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>

Applied.
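
A minimal userspace sketch of the kind of sanitising involved: copy into a fixed-size field, truncate if needed, and always leave a terminating NUL. The 32-byte field size is taken from the report above; `demo_copy_info` is a hypothetical helper, not the function used in the patch.

```c
#include <stdio.h>

#define DEMO_INFO_LEN 32	/* field size assumed from the report above */

/* Copy src into a fixed-size field, truncating and always NUL-terminating. */
static void demo_copy_info(char dst[DEMO_INFO_LEN], const char *src)
{
	snprintf(dst, DEMO_INFO_LEN, "%s", src);
}
```

An overlong description is cut to 31 characters plus the terminator, so a reader of the field never runs past its end.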

^ permalink raw reply

