Netdev List

* Re: [patch net-next 1/2 v3] tc: add BPF based action
From: Alexei Starovoitov @ 2015-01-14 15:39 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jiri Pirko, Network Development, David S. Miller,
	Jamal Hadi Salim, Hannes Frederic Sowa
In-Reply-To: <54B66F08.2010305@redhat.com>

On Wed, Jan 14, 2015 at 5:28 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>
> I'm still wondering about the drop semantics ... wouldn't it be more
> intuitive to use 0 for drops in this context?

good point.
I think it must be 0 to match behavior of socket filters, etc.
If program tries to access beyond packet size or does divide
by zero it will be terminated and will return 0.
So zero should be the safest action from caller point of view.
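
A minimal user-space sketch of that argument, not kernel code: the
names run_filter(), load_byte() and the VERDICT_* values are
hypothetical and only illustrate the idea. Both failure modes collapse
to 0, so a caller that treats 0 as "drop" never passes a packet the
program could not evaluate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts for illustration; not the names used by the patch. */
enum { VERDICT_DROP = 0, VERDICT_PASS = 1 };

/* Load one byte; report failure instead of reading past the packet end. */
static int load_byte(const uint8_t *pkt, size_t len, size_t off, uint32_t *out)
{
	if (off >= len)
		return -1;
	*out = pkt[off];
	return 0;
}

/*
 * Toy "program": pass the packet when pkt[off] / divisor is non-zero.
 * Either failure mode terminates the program with 0, i.e. a drop.
 */
static int run_filter(const uint8_t *pkt, size_t len, size_t off,
		      uint32_t divisor)
{
	uint32_t v;

	if (load_byte(pkt, len, off, &v) < 0)
		return VERDICT_DROP;	/* access beyond packet size */
	if (divisor == 0)
		return VERDICT_DROP;	/* divide by zero */
	return (v / divisor) ? VERDICT_PASS : VERDICT_DROP;
}
```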

^ permalink raw reply

* Re: non-OVS based vxlan config broken on 3.19-rc ?!
From: Thomas Graf @ 2015-01-14 15:52 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Tom Herbert, Marcelo Leitner, Jesse Gross, netdev@vger.kernel.org
In-Reply-To: <54B688D9.8030101@mellanox.com>

On 01/14/15 at 05:18pm, Or Gerlitz wrote:
> Guys, just realized that non-OVS based vxlan config is broken with
> 3.19-rc... I see that it works for me on 3.18.2 and breaks on 3.19-rc3
> (Linus tree). Tested over mlx4 (both offloaded and non offloaded modes) and
> igb, see below the simplest form I can see it breaks on 3.19-rc and works on
> 3.18
> 
> Looking on tcpdump and stats, the arp reply arrives to the 3.19-rc host NIC
> driver but is dropped along the stack before being handed to the vxlan driver, not
> sure where and why...

As an additional data point: I tested VXLAN-GBP with iproute2-based tunnels
on net-next and that works fine. The driver used was an e1000 in KVM.

^ permalink raw reply

* Re: [PATCH 2/2] net/macb: improved ethtool statistics support
From: Nicolas Ferre @ 2015-01-14 15:53 UTC (permalink / raw)
  To: Xander Huff, netdev, David Miller
  Cc: jaeden.amero, rich.tollerton, ben.shelton, brad.mouring,
	linux-kernel, Cyrille Pitchen
In-Reply-To: <1421187351-27279-2-git-send-email-xander.huff@ni.com>

Le 13/01/2015 23:15, Xander Huff a écrit :
> Currently `ethtool -S` simply returns "no stats available". It
> would be more useful to see what the various ethtool statistics
> registers' values are. This change implements get_ethtool_stats,
> get_strings, and get_sset_count functions to accomplish this.
> 
> Read all GEM statistics registers and sum them into
> macb.ethtool_stats. Add the necessary infrastructure to make this
> accessible via `ethtool -S`.
> 
> Update gem_update_stats to utilize ethtool_stats.
> 
> Signed-off-by: Xander Huff <xander.huff@ni.com>

David,

I see some issues with this patch: can you hold it a little bit please
(aka NAK)?

Remarks enclosed:

> ---
>  drivers/net/ethernet/cadence/macb.c |  55 +++++++-
>  drivers/net/ethernet/cadence/macb.h | 256 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 307 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index 3767271..dd8c202 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -1827,12 +1827,23 @@ static int macb_close(struct net_device *dev)
>  
>  static void gem_update_stats(struct macb *bp)
>  {
> -	u32 __iomem *reg = bp->regs + GEM_OTX;
> +	int i;
>  	u32 *p = &bp->hw_stats.gem.tx_octets_31_0;
> -	u32 *end = &bp->hw_stats.gem.rx_udp_checksum_errors + 1;
>  
> -	for (; p < end; p++, reg++)
> -		*p += __raw_readl(reg);
> +	for (i = 0; i < GEM_STATS_LEN; ++i, ++p) {
> +		u32 offset = gem_statistics[i].offset;
> +		u64 val = __raw_readl(bp->regs+offset);
> +
> +		bp->ethtool_stats[i] += val;
> +		*p += val;
> +
> +		if (offset == GEM_OCTTXL || offset == GEM_OCTRXL) {
> +			/* Add GEM_OCTTXH, GEM_OCTRXH */
> +			val = __raw_readl(bp->regs+offset+4);

style: whitespace around '+'

> +			bp->ethtool_stats[i] += ((u64)val)<<32;

style: ditto

> +			*(++p) += val;
> +		}
> +	}
>  }
>  
>  static struct net_device_stats *gem_get_stats(struct macb *bp)
> @@ -1873,6 +1884,39 @@ static struct net_device_stats *gem_get_stats(struct macb *bp)
>  	return nstat;
>  }
>  
> +static void gem_get_ethtool_stats(struct net_device *dev,
> +				  struct ethtool_stats *stats, u64 *data)
> +{
> +	struct macb *bp;
> +
> +	bp = netdev_priv(dev);
> +	gem_update_stats(bp);
> +	memcpy(data, &bp->ethtool_stats, sizeof(u64)*GEM_STATS_LEN);

style: ditto

> +}
> +
> +static int gem_get_sset_count(struct net_device *dev, int sset)
> +{
> +	switch (sset) {
> +	case ETH_SS_STATS:
> +		return GEM_STATS_LEN;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static void gem_get_ethtool_strings(struct net_device *dev, u32 sset, u8 *p)
> +{
> +	int i;
> +
> +	switch (sset) {
> +	case ETH_SS_STATS:
> +		for (i = 0; i < GEM_STATS_LEN; i++, p += ETH_GSTRING_LEN)
> +			memcpy(p, gem_statistics[i].stat_string,
> +			       ETH_GSTRING_LEN);
> +		break;
> +	}
> +}
> +
>  struct net_device_stats *macb_get_stats(struct net_device *dev)
>  {
>  	struct macb *bp = netdev_priv(dev);
> @@ -1988,6 +2032,9 @@ const struct ethtool_ops macb_ethtool_ops = {
>  	.get_regs		= macb_get_regs,
>  	.get_link		= ethtool_op_get_link,
>  	.get_ts_info		= ethtool_op_get_ts_info,
> +	.get_ethtool_stats	= gem_get_ethtool_stats,
> +	.get_strings		= gem_get_ethtool_strings,
> +	.get_sset_count		= gem_get_sset_count,

I think that the 10/100 macb version of this IP doesn't have the same
statistics capabilities, so you shouldn't register these functions for
all the variants of the IP.
Can you please verify this and only register these functions in the proper
if (macb_is_gem(bp)) branch?
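
Something along these lines, as a rough user-space sketch only: the
toy_* types and names are simplified stand-ins I made up for
illustration, not the driver's structures, and TOY_STATS_LEN is an
arbitrary placeholder. The idea is to keep two ops tables and pick one
depending on the variant.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the sketch needs. */
struct toy_ethtool_ops {
	int (*get_sset_count)(int sset);
};

struct toy_macb {
	bool is_gem;	/* stand-in for the result of macb_is_gem(bp) */
	const struct toy_ethtool_ops *ethtool_ops;
};

#define TOY_SS_STATS	1	/* placeholder for ETH_SS_STATS */
#define TOY_STATS_LEN	4	/* arbitrary; stands in for GEM_STATS_LEN */

static int toy_gem_get_sset_count(int sset)
{
	return sset == TOY_SS_STATS ? TOY_STATS_LEN : -EOPNOTSUPP;
}

/* 10/100 MACB variant: statistics callbacks are not registered. */
static const struct toy_ethtool_ops toy_macb_ops = {
	.get_sset_count = NULL,
};

/* GEM variant: statistics callbacks are registered. */
static const struct toy_ethtool_ops toy_gem_ops = {
	.get_sset_count = toy_gem_get_sset_count,
};

/* Choose the ops table once, at init time, based on the variant. */
static void toy_select_ethtool_ops(struct toy_macb *bp)
{
	bp->ethtool_ops = bp->is_gem ? &toy_gem_ops : &toy_macb_ops;
}
```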


>  };
>  EXPORT_SYMBOL_GPL(macb_ethtool_ops);
>  
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index 8e8c3c9..378b218 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -82,6 +82,159 @@
>  #define GEM_SA4B				0x00A0 /* Specific4 Bottom */
>  #define GEM_SA4T				0x00A4 /* Specific4 Top */
>  #define GEM_OTX					0x0100 /* Octets transmitted */
> +#define GEM_OCTTXL				0x0100 /* Octets transmitted
> +							* [31:0]
> +							*/

Please place each comment on a single line. Even if it exceeds 80
characters, an exception can be made for this.
Check with David, but personally I feel that this formatting is
not good...

> +#define GEM_OCTTXH				0x0104 /* Octets transmitted
> +							* [47:32]
> +							*/
> +#define GEM_TXCNT				0x0108 /* Error-free Frames
> +							* Transmitted counter
> +							*/
> +#define GEM_TXBCCNT				0x010c /* Error-free Broadcast
> +							* Frames counter
> +							*/
> +#define GEM_TXMCCNT				0x0110 /* Error-free Multicast
> +							* Frames counter
> +							*/
> +#define GEM_TXPAUSECNT				0x0114 /* Pause Frames
> +							* Transmitted Counter
> +							*/
> +#define GEM_TX64CNT				0x0118 /* Error-free 64 byte
> +							* Frames Transmitted
> +							* counter
> +							*/

... particularly when it reaches 3 lines like above ^^^^^^

> +#define GEM_TX65CNT				0x011c /* Error-free 65-127 byte
> +							* Frames Transmitted
> +							* counter
> +							*/
> +#define GEM_TX128CNT				0x0120 /* Error-free 128-255
> +							* byte Frames
> +							* Transmitted counter
> +							*/
> +#define GEM_TX256CNT				0x0124 /* Error-free 256-511
> +							* byte Frames
> +							* transmitted counter
> +							*/
> +#define GEM_TX512CNT				0x0128 /* Error-free 512-1023
> +							* byte Frames
> +							* transmitted counter
> +							*/
> +#define GEM_TX1024CNT				0x012c /* Error-free 1024-1518
> +							* byte Frames
> +							* transmitted counter
> +							*/
> +#define GEM_TX1519CNT				0x0130 /* Error-free larger than
> +							* 1519 byte Frames
> +							* transmitted counter
> +							*/
> +#define GEM_TXURUNCNT				0x0134 /* TX under run error
> +							* counter
> +							*/
> +#define GEM_SNGLCOLLCNT				0x0138 /* Single Collision Frame
> +							* Counter
> +							*/
> +#define GEM_MULTICOLLCNT			0x013c /* Multiple Collision
> +							* Frame Counter
> +							*/
> +#define GEM_EXCESSCOLLCNT			0x0140 /* Excessive Collision
> +							* Frame Counter
> +							*/
> +#define GEM_LATECOLLCNT				0x0144 /* Late Collision Frame
> +							* Counter
> +							*/
> +#define GEM_TXDEFERCNT				0x0148 /* Deferred Transmission
> +							* Frame Counter
> +							*/
> +#define GEM_TXCSENSECNT				0x014c /* Carrier Sense Error
> +							* Counter
> +							*/
> +#define GEM_ORX					0x0150 /* Octets received */
> +#define GEM_OCTRXL				0x0150 /* Octets received
> +							* [31:0]
> +							*/
> +#define GEM_OCTRXH				0x0154 /* Octets received
> +							* [47:32]
> +							*/
> +#define GEM_RXCNT				0x0158 /* Error-free Frames
> +							* Received Counter
> +							*/
> +#define GEM_RXBROADCNT				0x015c /* Error-free Broadcast
> +							* Frames Received
> +							* Counter
> +							*/
> +#define GEM_RXMULTICNT				0x0160 /* Error-free Multicast
> +							* Frames Received
> +							* Counter
> +							*/
> +#define GEM_RXPAUSECNT				0x0164 /* Error-free Pause
> +							* Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX64CNT				0x0168 /* Error-free 64 byte
> +							* Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX65CNT				0x016c /* Error-free 65-127 byte
> +							* Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX128CNT				0x0170 /* Error-free 128-255
> +							* byte Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX256CNT				0x0174 /* Error-free 256-511
> +							* byte Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX512CNT				0x0178 /* Error-free 512-1023
> +							* byte Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX1024CNT				0x017c /* Error-free 1024-1518
> +							* byte Frames Received
> +							* Counter
> +							*/
> +#define GEM_RX1519CNT				0x0180 /* Error-free larger than
> +							* 1519 Frames Received
> +							* Counter
> +							*/
> +#define GEM_RXUNDRCNT				0x0184 /* Undersize Frames
> +							* Received Counter
> +							*/
> +#define GEM_RXOVRCNT				0x0188 /* Oversize Frames
> +							* Received Counter
> +							*/
> +#define GEM_RXJABCNT				0x018c /* Jabbers Received
> +							* Counter
> +							*/
> +#define GEM_RXFCSCNT				0x0190 /* Frame Check Sequence
> +							* Error Counter
> +							*/
> +#define GEM_RXLENGTHCNT				0x0194 /* Length Field Error
> +							* Counter
> +							*/
> +#define GEM_RXSYMBCNT				0x0198 /* Symbol Error
> +							* Counter
> +							*/
> +#define GEM_RXALIGNCNT				0x019c /* Alignment Error
> +							* Counter
> +							*/
> +#define GEM_RXRESERRCNT				0x01a0 /* Receive Resource Error
> +							* Counter
> +							*/
> +#define GEM_RXORCNT				0x01a4 /* Receive Overrun
> +							* Counter
> +							*/
> +#define GEM_RXIPCCNT				0x01a8 /* IP header Checksum
> +							* Error Counter
> +							*/
> +#define GEM_RXTCPCCNT				0x01ac /* TCP Checksum Error
> +							* Counter
> +							*/
> +#define GEM_RXUDPCCNT				0x01b0 /* UDP Checksum Error
> +							* Counter
> +							*/
>  #define GEM_DCFG1				0x0280 /* Design Config 1 */
>  #define GEM_DCFG2				0x0284 /* Design Config 2 */
>  #define GEM_DCFG3				0x0288 /* Design Config 3 */
> @@ -650,6 +803,107 @@ struct gem_stats {
>  	u32	rx_udp_checksum_errors;
>  };
>  
> +/* Describes the name and offset of an individual statistic register, as

style: should be like this:
/*
 * bla bla bla
 */

> + * returned by `ethtool -S`. Also describes which net_device_stats statistics
> + * this register should contribute to.
> + */
> +struct gem_statistic {
> +	char stat_string[ETH_GSTRING_LEN];
> +	int offset;
> +	u32 stat_bits;
> +};
> +
> +/* Bitfield defs for net_device_stat statistics */
> +#define GEM_NDS_RXERR_OFFSET		0
> +#define GEM_NDS_RXLENERR_OFFSET		1
> +#define GEM_NDS_RXOVERERR_OFFSET	2
> +#define GEM_NDS_RXCRCERR_OFFSET		3
> +#define GEM_NDS_RXFRAMEERR_OFFSET	4
> +#define GEM_NDS_RXFIFOERR_OFFSET	5
> +#define GEM_NDS_TXERR_OFFSET		6
> +#define GEM_NDS_TXABORTEDERR_OFFSET	7
> +#define GEM_NDS_TXCARRIERERR_OFFSET	8
> +#define GEM_NDS_TXFIFOERR_OFFSET	9
> +#define GEM_NDS_COLLISIONS_OFFSET	10
> +
> +#define GEM_STAT_TITLE(name, title) GEM_STAT_TITLE_BITS(name, title, 0)
> +#define GEM_STAT_TITLE_BITS(name, title, bits) {	\
> +	.stat_string = title,				\
> +	.offset = GEM_##name,				\
> +	.stat_bits = bits				\
> +}
> +
> +/* list of gem statistic registers. The names MUST match the
> + * corresponding GEM_* definitions.
> + */
> +static const struct gem_statistic gem_statistics[] = {
> +	GEM_STAT_TITLE(OCTTXL, "tx_octets"), /* OCTTXH combined with OCTTXL */
> +	GEM_STAT_TITLE(TXCNT, "tx_frames"),
> +	GEM_STAT_TITLE(TXBCCNT, "tx_broadcast_frames"),
> +	GEM_STAT_TITLE(TXMCCNT, "tx_multicast_frames"),
> +	GEM_STAT_TITLE(TXPAUSECNT, "tx_pause_frames"),
> +	GEM_STAT_TITLE(TX64CNT, "tx_64_byte_frames"),
> +	GEM_STAT_TITLE(TX65CNT, "tx_65_127_byte_frames"),
> +	GEM_STAT_TITLE(TX128CNT, "tx_128_255_byte_frames"),
> +	GEM_STAT_TITLE(TX256CNT, "tx_256_511_byte_frames"),
> +	GEM_STAT_TITLE(TX512CNT, "tx_512_1023_byte_frames"),
> +	GEM_STAT_TITLE(TX1024CNT, "tx_1024_1518_byte_frames"),
> +	GEM_STAT_TITLE(TX1519CNT, "tx_greater_than_1518_byte_frames"),
> +	GEM_STAT_TITLE_BITS(TXURUNCNT, "tx_underrun",
> +			    GEM_BIT(NDS_TXERR)|GEM_BIT(NDS_TXFIFOERR)),
> +	GEM_STAT_TITLE_BITS(SNGLCOLLCNT, "tx_single_collision_frames",
> +			    GEM_BIT(NDS_TXERR)|GEM_BIT(NDS_COLLISIONS)),
> +	GEM_STAT_TITLE_BITS(MULTICOLLCNT, "tx_multiple_collision_frames",
> +			    GEM_BIT(NDS_TXERR)|GEM_BIT(NDS_COLLISIONS)),
> +	GEM_STAT_TITLE_BITS(EXCESSCOLLCNT, "tx_excessive_collisions",
> +			    GEM_BIT(NDS_TXERR)|
> +			    GEM_BIT(NDS_TXABORTEDERR)|
> +			    GEM_BIT(NDS_COLLISIONS)),
> +	GEM_STAT_TITLE_BITS(LATECOLLCNT, "tx_late_collisions",
> +			    GEM_BIT(NDS_TXERR)|GEM_BIT(NDS_COLLISIONS)),
> +	GEM_STAT_TITLE(TXDEFERCNT, "tx_deferred_frames"),
> +	GEM_STAT_TITLE_BITS(TXCSENSECNT, "tx_carrier_sense_errors",
> +			    GEM_BIT(NDS_TXERR)|GEM_BIT(NDS_COLLISIONS)),
> +	GEM_STAT_TITLE(OCTRXL, "rx_octets"), /* OCTRXH combined with OCTRXL */
> +	GEM_STAT_TITLE(RXCNT, "rx_frames"),
> +	GEM_STAT_TITLE(RXBROADCNT, "rx_broadcast_frames"),
> +	GEM_STAT_TITLE(RXMULTICNT, "rx_multicast_frames"),
> +	GEM_STAT_TITLE(RXPAUSECNT, "rx_pause_frames"),
> +	GEM_STAT_TITLE(RX64CNT, "rx_64_byte_frames"),
> +	GEM_STAT_TITLE(RX65CNT, "rx_65_127_byte_frames"),
> +	GEM_STAT_TITLE(RX128CNT, "rx_128_255_byte_frames"),
> +	GEM_STAT_TITLE(RX256CNT, "rx_256_511_byte_frames"),
> +	GEM_STAT_TITLE(RX512CNT, "rx_512_1023_byte_frames"),
> +	GEM_STAT_TITLE(RX1024CNT, "rx_1024_1518_byte_frames"),
> +	GEM_STAT_TITLE(RX1519CNT, "rx_greater_than_1518_byte_frames"),
> +	GEM_STAT_TITLE_BITS(RXUNDRCNT, "rx_undersized_frames",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXLENERR)),
> +	GEM_STAT_TITLE_BITS(RXOVRCNT, "rx_oversize_frames",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXLENERR)),
> +	GEM_STAT_TITLE_BITS(RXJABCNT, "rx_jabbers",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXLENERR)),
> +	GEM_STAT_TITLE_BITS(RXFCSCNT, "rx_frame_check_sequence_errors",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXCRCERR)),
> +	GEM_STAT_TITLE_BITS(RXLENGTHCNT, "rx_length_field_frame_errors",
> +			    GEM_BIT(NDS_RXERR)),
> +	GEM_STAT_TITLE_BITS(RXSYMBCNT, "rx_symbol_errors",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXFRAMEERR)),
> +	GEM_STAT_TITLE_BITS(RXALIGNCNT, "rx_alignment_errors",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXOVERERR)),
> +	GEM_STAT_TITLE_BITS(RXRESERRCNT, "rx_resource_errors",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXOVERERR)),
> +	GEM_STAT_TITLE_BITS(RXORCNT, "rx_overruns",
> +			    GEM_BIT(NDS_RXERR)|GEM_BIT(NDS_RXFIFOERR)),
> +	GEM_STAT_TITLE_BITS(RXIPCCNT, "rx_ip_header_checksum_errors",
> +			    GEM_BIT(NDS_RXERR)),
> +	GEM_STAT_TITLE_BITS(RXTCPCCNT, "rx_tcp_checksum_errors",
> +			    GEM_BIT(NDS_RXERR)),
> +	GEM_STAT_TITLE_BITS(RXUDPCCNT, "rx_udp_checksum_errors",
> +			    GEM_BIT(NDS_RXERR)),
> +};
> +
> +#define GEM_STATS_LEN ARRAY_SIZE(gem_statistics)
> +
>  struct macb;
>  
>  struct macb_or_gem_ops {
> @@ -728,6 +982,8 @@ struct macb {
>  	dma_addr_t skb_physaddr;		/* phys addr from pci_map_single */
>  	int skb_length;				/* saved skb length for pci_unmap_single */
>  	unsigned int		max_tx_length;
> +
> +	u64			ethtool_stats[GEM_STATS_LEN];
>  };
>  
>  extern const struct ethtool_ops macb_ethtool_ops;
> 


-- 
Nicolas Ferre

^ permalink raw reply

* Re: [patch net-next 1/2 v3] tc: add BPF based action
From: Jiri Pirko @ 2015-01-14 15:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Network Development, David S. Miller,
	Jamal Hadi Salim, Hannes Frederic Sowa
In-Reply-To: <CAMEtUux+HOOegzyi82fYT_JMDX_+d0dZCQ=Zc5GWC-awQmJu1A@mail.gmail.com>

Wed, Jan 14, 2015 at 04:39:34PM CET, ast@plumgrid.com wrote:
>On Wed, Jan 14, 2015 at 5:28 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>>
>> I'm still wondering about the drop semantics ... wouldn't it be more
>> intuitive to use 0 for drops in this context?
>
>good point.
>I think it must be 0 to match behavior of socket filters, etc.
>If program tries to access beyond packet size or does divide
>by zero it will be terminated and will return 0.
>So zero should be the safest action from caller point of view.


Will do. Thanks!

^ permalink raw reply

* Re: [PATCH net 0/3]tg3: synchronize_irq() should be called without taking locks
From: Peter Hurley @ 2015-01-14 15:57 UTC (permalink / raw)
  To: Prashant Sreedharan, davem; +Cc: netdev, mchan
In-Reply-To: <1421217039-11689-1-git-send-email-prashant@broadcom.com>

On 01/14/2015 01:30 AM, Prashant Sreedharan wrote:
> Prashant Sreedharan (3):
>   tg3_timer() should grab tp->lock before checking for tp->irq_sync
>   tg3_reset_task() needs to use rtnl_lock to synchronize
>   Release tp->lock before invoking synchronize_irq()

Thanks!

For series:

Reported-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Peter Hurley <peter@hurleysoftware.com>

But maybe one of these patches should reference that this fixes
BUG: sleeping function... so that others can quickly find this
fix (if they're bisecting or whatever). For the same reason, it
might be useful for this series to be just one patch.

Regards,
Peter Hurley

^ permalink raw reply

* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
From: Peter Hurley @ 2015-01-14 16:06 UTC (permalink / raw)
  To: Prashant Sreedharan; +Cc: Michael Chan, netdev, Linux kernel
In-Reply-To: <1421116255.16485.14.camel@prashant>

On 01/12/2015 09:30 PM, Prashant Sreedharan wrote:
> On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
>> On 3.19-rc3, I'm seeing this might_sleep() warning [1] from the tg3_open()
>> call stack. Let me know if I need to bisect this.
>>
>> Regards,
>> Peter Hurley
>>
>> [1]
>>
>> [   17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
>> [   17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
>> [   17.203092] 2 locks held by ip/1106:
>> [   17.205255]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
>> [   17.207445]  #1:  (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
>> [   17.209725] CPU: 2 PID: 1106 Comm: ip Not tainted 3.19.0-rc3+wip-xeon+lockdep #rc3+wip
>> [   17.211900] Hardware name: Dell Inc. Precision WorkStation T5400  /0RW203, BIOS A11 04/30/2012
>> [   17.214086]  0000000000000068 ffff8802ac823498 ffffffff817af7e8 0000000000000005
>> [   17.216265]  ffffffff81a9be78 ffff8802ac8234a8 ffffffff810998a5 ffff8802ac8234d8
>> [   17.218446]  ffffffff8109991a ffff8802ac8234c8 ffff8802af0aae00 ffffffffa00ed000
>> [   17.220636] Call Trace:
>> [   17.222743]  [<ffffffff817af7e8>] dump_stack+0x4f/0x7b
>> [   17.224808]  [<ffffffff810998a5>] ___might_sleep+0x105/0x140
>> [   17.226842]  [<ffffffff8109991a>] __might_sleep+0x3a/0xa0
>> [   17.228869]  [<ffffffffa00ed000>] ? 0xffffffffa00ed000
>> [   17.230939]  [<ffffffff810d7d78>] synchronize_irq+0x38/0xa0
>> [   17.232967]  [<ffffffffa00ed000>] ? 0xffffffffa00ed000
>> [   17.234991]  [<ffffffffa010105f>] tg3_chip_reset+0x13f/0x9c0 [tg3]
>> [   17.236988]  [<ffffffffa01020ae>] tg3_reset_hw+0x7e/0x2d20 [tg3]
>> [   17.238996]  [<ffffffff813bfaff>] ? __udelay+0x2f/0x40
>> [   17.241007]  [<ffffffffa00ef2f7>] ? _tw32_flush+0x47/0x80 [tg3]
>> [   17.243066]  [<ffffffffa0104dac>] tg3_init_hw+0x5c/0x70 [tg3]
>> [   17.245438]  [<ffffffffa010740b>] tg3_start+0xc2b/0x11f0 [tg3]
>> [   17.247444]  [<ffffffffa0107ad7>] ? tg3_open+0x107/0x2e0 [tg3]
>> [   17.249556]  [<ffffffff810c338d>] ? trace_hardirqs_on+0xd/0x10
>> [   17.251581]  [<ffffffff8107806f>] ? __local_bh_enable_ip+0x6f/0x100
>> [   17.253710]  [<ffffffffa0107af8>] tg3_open+0x128/0x2e0 [tg3]
>> [   17.255758]  [<ffffffff816ba3f5>] ? netpoll_poll_disable+0x5/0xa0
>> [   17.257932]  [<ffffffff816a14af>] __dev_open+0xbf/0x140
>> [   17.260091]  [<ffffffff816a17c1>] __dev_change_flags+0xa1/0x160
>> [   17.262222]  [<ffffffff816a18a9>] dev_change_flags+0x29/0x60
>> [   17.264360]  [<ffffffff816b0e02>] do_setlink+0x2f2/0xa30
>> [   17.266431]  [<ffffffff816b1b7f>] rtnl_newlink+0x51f/0x750
>> [   17.268485]  [<ffffffff816b1749>] ? rtnl_newlink+0xe9/0x750
>> [   17.270483]  [<ffffffff811869c2>] ? free_pages_prepare+0x1d2/0x270
>> [   17.272507]  [<ffffffff810c32bd>] ? trace_hardirqs_on_caller+0x11d/0x1e0
>> [   17.274531]  [<ffffffff813dd1b2>] ? nla_parse+0x32/0x120
>> [   17.276531]  [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
>> [   17.278514]  [<ffffffff816adfd5>] rtnetlink_rcv_msg+0x95/0x250
>> [   17.280485]  [<ffffffff8109f699>] ? preempt_count_sub+0x49/0x50
>> [   17.282448]  [<ffffffff817b4a02>] ? mutex_lock_nested+0x382/0x530
>> [   17.284402]  [<ffffffff816adf1f>] ? rtnetlink_rcv+0x1f/0x40
>> [   17.286290]  [<ffffffff816adf1f>] ? rtnetlink_rcv+0x1f/0x40
>> [   17.288142]  [<ffffffff816adf40>] ? rtnetlink_rcv+0x40/0x40
>> [   17.290031]  [<ffffffff816cedc1>] netlink_rcv_skb+0xc1/0xe0
>> [   17.291836]  [<ffffffff816adf2e>] rtnetlink_rcv+0x2e/0x40
>> [   17.293615]  [<ffffffff816ce473>] netlink_unicast+0xf3/0x1d0
>> [   17.295420]  [<ffffffff816ce863>] netlink_sendmsg+0x313/0x690
>> [   17.297132]  [<ffffffff811ada4f>] ? might_fault+0x5f/0xb0
>> [   17.298799]  [<ffffffff8168253c>] do_sock_sendmsg+0x8c/0x100
>> [   17.300493]  [<ffffffff81681e3e>] ? copy_msghdr_from_user+0x15e/0x1f0
>> [   17.302173]  [<ffffffff81682aeb>] ___sys_sendmsg+0x30b/0x320
>> [   17.303798]  [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
>> [   17.305431]  [<ffffffff810bdee0>] ? cpuacct_account_field+0x80/0xb0
>> [   17.307085]  [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
>> [   17.308744]  [<ffffffff810a4f35>] ? sched_clock_local+0x25/0x90
>> [   17.310375]  [<ffffffff810a5dc1>] ? vtime_account_user+0x91/0xa0
>> [   17.311948]  [<ffffffff810a5198>] ? sched_clock_cpu+0xb8/0xe0
>> [   17.313509]  [<ffffffff810bf8be>] ? put_lock_stats.isra.26+0xe/0x30
>> [   17.315069]  [<ffffffff810c007e>] ? lock_release_holdtime.part.27+0x12e/0x1b0
>> [   17.316618]  [<ffffffff810a5dc1>] ? vtime_account_user+0x91/0xa0
>> [   17.318162]  [<ffffffff8109f5d1>] ? get_parent_ip+0x11/0x50
>> [   17.319703]  [<ffffffff8109f699>] ? preempt_count_sub+0x49/0x50
>> [   17.321235]  [<ffffffff811807e5>] ? context_tracking_user_exit+0x55/0x130
>> [   17.322732]  [<ffffffff811807e5>] ? context_tracking_user_exit+0x55/0x130
>> [   17.324197]  [<ffffffff816834f2>] __sys_sendmsg+0x42/0x80
>> [   17.325634]  [<ffffffff81683542>] SyS_sendmsg+0x12/0x20
>> [   17.327048]  [<ffffffff817ba12d>] system_call_fastpath+0x16/0x1b
> 
> Please bisect, there haven't been any tg3 code changes in this path that
> might cause this.

What triggers this is the new debugging code added to catch nested
sleeps; specifically e22b886 ("sched/wait: Add might_sleep() checks").
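
To illustrate the kind of check involved, here is a toy user-space
model. All toy_* names are hypothetical and this is not the kernel
implementation: a counter tracks held spinlock-like locks, and a
might_sleep()-style check records a report when a function that may
sleep is called while the counter is non-zero. Dropping the lock
around the sleeping call, as the fix series does for
synchronize_irq(), avoids the report.

```c
/* Toy model: number of spinlock-like locks currently held. */
static int toy_atomic_depth;
/* Number of "sleeping function called from invalid context" reports. */
static int toy_bug_reports;

static void toy_spin_lock(void)   { toy_atomic_depth++; }
static void toy_spin_unlock(void) { toy_atomic_depth--; }

/* Models the debug check: sleeping is invalid in atomic context. */
static void toy_might_sleep(void)
{
	if (toy_atomic_depth > 0)
		toy_bug_reports++;
}

/* Stand-in for a function that may sleep, such as synchronize_irq(). */
static void toy_sync_irq(void)
{
	toy_might_sleep();
}

/* Buggy pattern: the sleeping call is made with the lock held. */
static void toy_reset_locked(void)
{
	toy_spin_lock();
	toy_sync_irq();
	toy_spin_unlock();
}

/* Fixed pattern: release the lock around the sleeping call. */
static void toy_reset_unlocked(void)
{
	toy_spin_lock();
	/* ... work that needs the lock ... */
	toy_spin_unlock();
	toy_sync_irq();
	toy_spin_lock();
	/* ... remaining work under the lock ... */
	toy_spin_unlock();
}
```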

Regards,
Peter Hurley

^ permalink raw reply

* Re: [PATCH iproute2 0/3] ip netns: Run over all netns
From: Vadim Kochan @ 2015-01-14 16:13 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vadim Kochan, netdev
In-Reply-To: <20150113172837.793f0969@urahara>

On Tue, Jan 13, 2015 at 05:28:37PM -0800, Stephen Hemminger wrote:
> On Wed,  7 Jan 2015 13:04:19 +0200
> Vadim Kochan <vadim4j@gmail.com> wrote:
> 
> > From: Vadim Kochan <vadim4j@gmail.com>
> > 
> > Allow 'ip netns del' and 'ip netns exec' to run over each network namespace name.
> > 
> > 'ip netns exec' executes the command forcibly on each nsname.
> > 
> > Vadim Kochan (3):
> >   lib: Exec func on each netns
> >   ip netns: Allow exec on each netns
> >   ip netns: Delete all netns
> > 
> >  include/namespace.h |  6 ++++
> >  include/utils.h     |  5 +++
> >  ip/ipnetns.c        | 96 ++++++++++++++++++++++++++++++++---------------------
> >  lib/namespace.c     | 22 ++++++++++++
> >  lib/utils.c         | 28 ++++++++++++++++
> >  man/man8/ip-netns.8 | 26 ++++++++++++---
> >  6 files changed, 141 insertions(+), 42 deletions(-)
> > 
> 
> It is a useful concept but as others have pointed out the idea of reserving
> a keyword 'all' at this point is likely to cause somebody to break.
> So sorry.
OK, then what about an additional option like:

    $ ip -all netns exec ...

?
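
A rough sketch of the parsing I have in mind, with hypothetical names
(toy_opts, toy_parse) that are not from the patch set: the leading
option selects "every namespace", so "all" stays an ordinary namespace
name and nothing is reserved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed result; field names are illustrative only. */
struct toy_opts {
	bool do_all;		/* set by the leading "-all" option */
	const char *nsname;	/* namespace name, NULL when do_all is set */
};

/*
 * Parse: [-all] netns exec [NAME] ...
 * Returns 0 on success, -1 on malformed input.
 */
static int toy_parse(int argc, const char **argv, struct toy_opts *o)
{
	int i = 0;

	o->do_all = false;
	o->nsname = NULL;

	if (i < argc && strcmp(argv[i], "-all") == 0) {
		o->do_all = true;
		i++;
	}
	if (i + 1 >= argc || strcmp(argv[i], "netns") != 0 ||
	    strcmp(argv[i + 1], "exec") != 0)
		return -1;
	i += 2;

	if (!o->do_all) {
		if (i >= argc)
			return -1;
		o->nsname = argv[i];	/* "all" is an ordinary name here */
	}
	return 0;
}
```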

Thanks,

^ permalink raw reply

* Re: [PATCH v4 20/20] kbuild: add a new kselftest_install make target to install selftests
From: Shuah Khan @ 2015-01-14 16:32 UTC (permalink / raw)
  To: mmarek, masami.hiramatsu.pt
  Cc: gregkh, akpm, rostedt, mingo, davem, keescook, tranmanphong, mpe,
	cov, dh.herrmann, hughd, bobby.prani, serge.hallyn, ebiederm,
	tim.bird, josh, koct9i, linux-kbuild, linux-kernel, linux-api,
	netdev
In-Reply-To: <2c5a28faaa79d9c2415854a08817ada509fcb943.1420571615.git.shuahkh@osg.samsung.com>

On 01/06/2015 12:43 PM, Shuah Khan wrote:
> Add a new make target to install kernel selftests.
> This new target will build and install selftests. kselftest
> target now depends on kselftest_install and runs the generated
> kselftest script to reduce duplicate work and for common look
> and feel when running tests.
> 
> make kselftest_target:
> -- exports kselftest INSTALL_KSFT_PATH
>    default $(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
> -- exports INSTALL_KSFT_PATH
> -- runs selftests make install target
> 
> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
> ---
>  Makefile | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)

Hi Marek,

Could you please Ack this patch, if this version looks good,
so I can take this through the kselftest tree.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shuahkh@osg.samsung.com | (970) 217-8978

^ permalink raw reply

* Re: [PATCH net-next v13 3/3] net: hisilicon: new hip04 ethernet driver
From: Eric Dumazet @ 2015-01-14 16:34 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: arnd, robh+dt, davem, grant.likely, agraf, sergei.shtylyov,
	linux-arm-kernel, xuwei5, zhangfei.gao, netdev, devicetree, linux
In-Reply-To: <1421217254-12008-4-git-send-email-dingtianhong@huawei.com>

On Wed, 2015-01-14 at 14:34 +0800, Ding Tianhong wrote:
> Support Hisilicon hip04 ethernet driver, including 100M / 1000M controller.
> The controller has no tx done interrupt, so transmitted buffers are reclaimed in the poll.
> 
> v13: Fix the alignment of function parameters and a checkpatch warning.
> 
> v12: According to Alex's suggestion, modify the changelog and add MODULE_DEVICE_TABLE
>      for hip04 ethernet.
> 
> v11: Add ethtool support for getting and setting tx coalesce; xmit_more
>      is not supported in this patch, but I think it could work for hip04,
>      and will support it later after some performance tests.
> 
>      Here are some performance test results from ping and iperf (with tx_coalesce_frames/usecs added);
>      both throughput and latency look better with tx_coalesce_frames/usecs.
> 
>      - Before:
>      $ ping 192.168.1.1 ...
>      === 192.168.1.1 ping statistics ===
>      24 packets transmitted, 24 received, 0% packet loss, time 22999ms
>      rtt min/avg/max/mdev = 0.180/0.202/0.403/0.043 ms
> 
>      $ iperf -c 192.168.1.1 ...
>      [ ID] Interval       Transfer     Bandwidth
>      [  3]  0.0- 1.0 sec   115 MBytes   945 Mbits/sec
> 
>      - After:
>      $ ping 192.168.1.1 ...
>      === 192.168.1.1 ping statistics ===
>      24 packets transmitted, 24 received, 0% packet loss, time 22999ms
>      rtt min/avg/max/mdev = 0.178/0.190/0.380/0.041 ms
> 
>      $ iperf -c 192.168.1.1 ...
>      [ ID] Interval       Transfer     Bandwidth
>      [  3]  0.0- 1.0 sec   115 MBytes   965 Mbits/sec
> 
> v10: According to David Miller's and Arnd Bergmann's suggestions, make some modifications
>      to the v9 version
>      - drop the workqueue
>      - batch cleanup based on tx_coalesce_frames/usecs for better throughput
>      - use a reasonable default tx timeout (200us, could be shortened
>        based on measurements) with a range timer
>      - fix napi poll function return value
>      - use a lockless queue for cleanup
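
For readers following the changelog, the batching idea in the list
above can be sketched roughly as below. This is a user-space
illustration under my own assumptions, not the driver's code: the
toy_* names are hypothetical, and the only detail taken from the
changelog is that reclaim is driven by a frame threshold plus a
timeout.

```c
#include <stdbool.h>

/* Hypothetical coalescing parameters for illustration. */
struct toy_coalesce {
	unsigned int frames;	/* reclaim once this many frames are pending */
	unsigned int usecs;	/* otherwise reclaim after this delay */
};

enum toy_action { TOY_NONE, TOY_RECLAIM_NOW, TOY_ARM_TIMER };

/*
 * Decide what to do after queueing a frame for transmit.
 * pending:     sent-but-not-reclaimed frames, including the new one.
 * timer_armed: whether a reclaim timer is already running.
 */
static enum toy_action toy_tx_decide(const struct toy_coalesce *c,
				     unsigned int pending, bool timer_armed)
{
	if (pending >= c->frames)
		return TOY_RECLAIM_NOW;	/* frame threshold reached */
	if (!timer_armed)
		return TOY_ARM_TIMER;	/* bound the wait by c->usecs */
	return TOY_NONE;		/* a timer will reclaim later */
}
```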
> 
> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
>  drivers/net/ethernet/hisilicon/Makefile    |   2 +-
>  drivers/net/ethernet/hisilicon/hip04_eth.c | 969 +++++++++++++++++++++++++++++
>  2 files changed, 970 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/hisilicon/hip04_eth.c
> 
> diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
> index 40115a7..6c14540 100644
> --- a/drivers/net/ethernet/hisilicon/Makefile
> +++ b/drivers/net/ethernet/hisilicon/Makefile
> @@ -3,4 +3,4 @@
>  #
>  
>  obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
> -obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o
> +obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
> diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
> new file mode 100644
> index 0000000..525214e
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
> @@ -0,0 +1,969 @@
> +
> +/* Copyright (c) 2014 Linaro Ltd.
> + * Copyright (c) 2014 Hisilicon Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/etherdevice.h>
> +#include <linux/platform_device.h>
> +#include <linux/interrupt.h>
> +#include <linux/ktime.h>
> +#include <linux/of_address.h>
> +#include <linux/phy.h>
> +#include <linux/of_mdio.h>
> +#include <linux/of_net.h>
> +#include <linux/mfd/syscon.h>
> +#include <linux/regmap.h>
> +
> +#define PPE_CFG_RX_ADDR			0x100
> +#define PPE_CFG_POOL_GRP		0x300
> +#define PPE_CFG_RX_BUF_SIZE		0x400
> +#define PPE_CFG_RX_FIFO_SIZE		0x500
> +#define PPE_CURR_BUF_CNT		0xa200
> +
> +#define GE_DUPLEX_TYPE			0x08
> +#define GE_MAX_FRM_SIZE_REG		0x3c
> +#define GE_PORT_MODE			0x40
> +#define GE_PORT_EN			0x44
> +#define GE_SHORT_RUNTS_THR_REG		0x50
> +#define GE_TX_LOCAL_PAGE_REG		0x5c
> +#define GE_TRANSMIT_CONTROL_REG		0x60
> +#define GE_CF_CRC_STRIP_REG		0x1b0
> +#define GE_MODE_CHANGE_REG		0x1b4
> +#define GE_RECV_CONTROL_REG		0x1e0
> +#define GE_STATION_MAC_ADDRESS		0x210
> +#define PPE_CFG_CPU_ADD_ADDR		0x580
> +#define PPE_CFG_MAX_FRAME_LEN_REG	0x408
> +#define PPE_CFG_BUS_CTRL_REG		0x424
> +#define PPE_CFG_RX_CTRL_REG		0x428
> +#define PPE_CFG_RX_PKT_MODE_REG		0x438
> +#define PPE_CFG_QOS_VMID_GEN		0x500
> +#define PPE_CFG_RX_PKT_INT		0x538
> +#define PPE_INTEN			0x600
> +#define PPE_INTSTS			0x608
> +#define PPE_RINT			0x604
> +#define PPE_CFG_STS_MODE		0x700
> +#define PPE_HIS_RX_PKT_CNT		0x804
> +
> +/* REG_INTERRUPT */
> +#define RCV_INT				BIT(10)
> +#define RCV_NOBUF			BIT(8)
> +#define RCV_DROP			BIT(7)
> +#define TX_DROP				BIT(6)
> +#define DEF_INT_ERR			(RCV_NOBUF | RCV_DROP | TX_DROP)
> +#define DEF_INT_MASK			(RCV_INT | DEF_INT_ERR)
> +
> +/* TX descriptor config */
> +#define TX_FREE_MEM			BIT(0)
> +#define TX_READ_ALLOC_L3		BIT(1)
> +#define TX_FINISH_CACHE_INV		BIT(2)
> +#define TX_CLEAR_WB			BIT(4)
> +#define TX_L3_CHECKSUM			BIT(5)
> +#define TX_LOOP_BACK			BIT(11)
> +
> +/* RX error */
> +#define RX_PKT_DROP			BIT(0)
> +#define RX_L2_ERR			BIT(1)
> +#define RX_PKT_ERR			(RX_PKT_DROP | RX_L2_ERR)
> +
> +#define SGMII_SPEED_1000		0x08
> +#define SGMII_SPEED_100			0x07
> +#define SGMII_SPEED_10			0x06
> +#define MII_SPEED_100			0x01
> +#define MII_SPEED_10			0x00
> +
> +#define GE_DUPLEX_FULL			BIT(0)
> +#define GE_DUPLEX_HALF			0x00
> +#define GE_MODE_CHANGE_EN		BIT(0)
> +
> +#define GE_TX_AUTO_NEG			BIT(5)
> +#define GE_TX_ADD_CRC			BIT(6)
> +#define GE_TX_SHORT_PAD_THROUGH		BIT(7)
> +
> +#define GE_RX_STRIP_CRC			BIT(0)
> +#define GE_RX_STRIP_PAD			BIT(3)
> +#define GE_RX_PAD_EN			BIT(4)
> +
> +#define GE_AUTO_NEG_CTL			BIT(0)
> +
> +#define GE_RX_INT_THRESHOLD		BIT(6)
> +#define GE_RX_TIMEOUT			0x04
> +
> +#define GE_RX_PORT_EN			BIT(1)
> +#define GE_TX_PORT_EN			BIT(2)
> +
> +#define PPE_CFG_STS_RX_PKT_CNT_RC	BIT(12)
> +
> +#define PPE_CFG_RX_PKT_ALIGN		BIT(18)
> +#define PPE_CFG_QOS_VMID_MODE		BIT(14)
> +#define PPE_CFG_QOS_VMID_GRP_SHIFT	8
> +
> +#define PPE_CFG_RX_FIFO_FSFU		BIT(11)
> +#define PPE_CFG_RX_DEPTH_SHIFT		16
> +#define PPE_CFG_RX_START_SHIFT		0
> +#define PPE_CFG_RX_CTRL_ALIGN_SHIFT	11
> +
> +#define PPE_CFG_BUS_LOCAL_REL		BIT(14)
> +#define PPE_CFG_BUS_BIG_ENDIEN		BIT(0)
> +
> +#define RX_DESC_NUM			128
> +#define TX_DESC_NUM			256
> +#define TX_NEXT(N)			(((N) + 1) & (TX_DESC_NUM-1))
> +#define RX_NEXT(N)			(((N) + 1) & (RX_DESC_NUM-1))
> +
> +#define GMAC_PPE_RX_PKT_MAX_LEN		379
> +#define GMAC_MAX_PKT_LEN		1516
> +#define GMAC_MIN_PKT_LEN		31
> +#define RX_BUF_SIZE			1600
> +#define RESET_TIMEOUT			1000
> +#define TX_TIMEOUT			(6 * HZ)
> +
> +#define DRV_NAME			"hip04-ether"
> +#define DRV_VERSION			"v1.0"
> +
> +#define HIP04_MAX_TX_COALESCE_USECS	200
> +#define HIP04_MIN_TX_COALESCE_USECS	100
> +#define HIP04_MAX_TX_COALESCE_FRAMES	200
> +#define HIP04_MIN_TX_COALESCE_FRAMES	100
> +
> +struct tx_desc {
> +	u32 send_addr;

	__be32 send_addr; ?

> +	u32 send_size;

	__be32

> +	u32 next_addr;
	__be32

> +	u32 cfg;
	__be32

> +	u32 wb_addr;
	__be32 wb_addr ?

> +} __aligned(64);
> +
> +struct rx_desc {
> +	u16 reserved_16;
> +	u16 pkt_len;
> +	u32 reserve1[3];
> +	u32 pkt_err;
> +	u32 reserve2[4];
> +};
> +
> +struct hip04_priv {
> +	void __iomem *base;
> +	int phy_mode;
> +	int chan;
> +	unsigned int port;
> +	unsigned int speed;
> +	unsigned int duplex;
> +	unsigned int reg_inten;
> +
> +	struct napi_struct napi;
> +	struct net_device *ndev;
> +
> +	struct tx_desc *tx_desc;
> +	dma_addr_t tx_desc_dma;
> +	struct sk_buff *tx_skb[TX_DESC_NUM];
> +	dma_addr_t tx_phys[TX_DESC_NUM];

This is not an efficient way to store skb/phys: for each skb, the info
will be stored in 2 separate cache lines.

It would be better to use a 

struct hip04_tx_desc {
   struct sk_buff   *skb;
   dma_addr_t       phys;
} 
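
As a rough userspace illustration of the locality argument (a sketch, not
driver code): the names tx_split, tx_packed, hip04_tx_slot, split_distance
and packed_distance are made up here, and sk_buff / dma_addr_t are
stand-ins for the kernel types. It only compares how far apart the skb
pointer and the DMA address of one slot end up in each layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_DESC_NUM 256

struct sk_buff;               /* opaque stand-in for the kernel type */
typedef uint64_t dma_addr_t;  /* stand-in; the kernel typedef is arch dependent */

/* Layout used in the patch: two parallel arrays. */
struct tx_split {
	struct sk_buff *tx_skb[TX_DESC_NUM];
	dma_addr_t tx_phys[TX_DESC_NUM];
};

/* Suggested layout: one small struct per slot (hypothetical name). */
struct hip04_tx_slot {
	struct sk_buff *skb;
	dma_addr_t phys;
};

struct tx_packed {
	struct hip04_tx_slot tx[TX_DESC_NUM];
};

/* Bytes between the skb pointer and the DMA address of slot i (split layout). */
static inline size_t split_distance(size_t i)
{
	assert(i < TX_DESC_NUM);
	return (offsetof(struct tx_split, tx_phys) + i * sizeof(dma_addr_t)) -
	       (offsetof(struct tx_split, tx_skb) + i * sizeof(struct sk_buff *));
}

/* Bytes between the skb pointer and the DMA address of any slot (packed layout). */
static inline size_t packed_distance(void)
{
	return offsetof(struct hip04_tx_slot, phys) -
	       offsetof(struct hip04_tx_slot, skb);
}
```

With the parallel arrays the two fields of a slot are at least a whole
pointer array apart, so touching both costs two cache lines; in the packed
layout they sit next to each other.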

> +	unsigned int tx_head;
> +
> +	int tx_coalesce_frames;
> +	int tx_coalesce_usecs;
> +	struct hrtimer tx_coalesce_timer;
> +
> +	unsigned char *rx_buf[RX_DESC_NUM];
> +	dma_addr_t rx_phys[RX_DESC_NUM];

Same thing here: use a struct to get better data locality.

> +	unsigned int rx_head;
> +	unsigned int rx_buf_size;
> +
> +	struct device_node *phy_node;
> +	struct phy_device *phy;
> +	struct regmap *map;
> +	struct work_struct tx_timeout_task;
> +
> +	/* written only by tx cleanup */
> +	unsigned int tx_tail ____cacheline_aligned_in_smp;
> +};
> +
> +static inline unsigned int tx_count(unsigned int head, unsigned int tail)
> +{
> +	return (head - tail) % (TX_DESC_NUM - 1);
> +}
> +
> +static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +	u32 val;
> +
> +	priv->speed = speed;
> +	priv->duplex = duplex;
> +
> +	switch (priv->phy_mode) {
> +	case PHY_INTERFACE_MODE_SGMII:
> +		if (speed == SPEED_1000)
> +			val = SGMII_SPEED_1000;
> +		else if (speed == SPEED_100)
> +			val = SGMII_SPEED_100;
> +		else
> +			val = SGMII_SPEED_10;
> +		break;
> +	case PHY_INTERFACE_MODE_MII:
> +		if (speed == SPEED_100)
> +			val = MII_SPEED_100;
> +		else
> +			val = MII_SPEED_10;
> +		break;
> +	default:
> +		netdev_warn(ndev, "not supported mode\n");
> +		val = MII_SPEED_10;
> +		break;
> +	}
> +	writel_relaxed(val, priv->base + GE_PORT_MODE);
> +
> +	val = duplex ? GE_DUPLEX_FULL : GE_DUPLEX_HALF;
> +	writel_relaxed(val, priv->base + GE_DUPLEX_TYPE);
> +
> +	val = GE_MODE_CHANGE_EN;
> +	writel_relaxed(val, priv->base + GE_MODE_CHANGE_REG);
> +}
> +
> +static void hip04_reset_ppe(struct hip04_priv *priv)
> +{
> +	u32 val, tmp, timeout = 0;
> +
> +	do {
> +		regmap_read(priv->map, priv->port * 4 + PPE_CURR_BUF_CNT, &val);
> +		regmap_read(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, &tmp);
> +		if (timeout++ > RESET_TIMEOUT)
> +			break;
> +	} while (val & 0xfff);
> +}
> +
> +static void hip04_config_fifo(struct hip04_priv *priv)
> +{
> +	u32 val;
> +
> +	val = readl_relaxed(priv->base + PPE_CFG_STS_MODE);
> +	val |= PPE_CFG_STS_RX_PKT_CNT_RC;
> +	writel_relaxed(val, priv->base + PPE_CFG_STS_MODE);
> +
> +	val = BIT(priv->port);
> +	regmap_write(priv->map, priv->port * 4 + PPE_CFG_POOL_GRP, val);
> +
> +	val = priv->port << PPE_CFG_QOS_VMID_GRP_SHIFT;
> +	val |= PPE_CFG_QOS_VMID_MODE;
> +	writel_relaxed(val, priv->base + PPE_CFG_QOS_VMID_GEN);
> +
> +	val = RX_BUF_SIZE;
> +	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_BUF_SIZE, val);
> +
> +	val = RX_DESC_NUM << PPE_CFG_RX_DEPTH_SHIFT;
> +	val |= PPE_CFG_RX_FIFO_FSFU;
> +	val |= priv->chan << PPE_CFG_RX_START_SHIFT;
> +	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_FIFO_SIZE, val);
> +
> +	val = NET_IP_ALIGN << PPE_CFG_RX_CTRL_ALIGN_SHIFT;
> +	writel_relaxed(val, priv->base + PPE_CFG_RX_CTRL_REG);
> +
> +	val = PPE_CFG_RX_PKT_ALIGN;
> +	writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_MODE_REG);
> +
> +	val = PPE_CFG_BUS_LOCAL_REL | PPE_CFG_BUS_BIG_ENDIEN;
> +	writel_relaxed(val, priv->base + PPE_CFG_BUS_CTRL_REG);
> +
> +	val = GMAC_PPE_RX_PKT_MAX_LEN;
> +	writel_relaxed(val, priv->base + PPE_CFG_MAX_FRAME_LEN_REG);
> +
> +	val = GMAC_MAX_PKT_LEN;
> +	writel_relaxed(val, priv->base + GE_MAX_FRM_SIZE_REG);
> +
> +	val = GMAC_MIN_PKT_LEN;
> +	writel_relaxed(val, priv->base + GE_SHORT_RUNTS_THR_REG);
> +
> +	val = readl_relaxed(priv->base + GE_TRANSMIT_CONTROL_REG);
> +	val |= GE_TX_AUTO_NEG | GE_TX_ADD_CRC | GE_TX_SHORT_PAD_THROUGH;
> +	writel_relaxed(val, priv->base + GE_TRANSMIT_CONTROL_REG);
> +
> +	val = GE_RX_STRIP_CRC;
> +	writel_relaxed(val, priv->base + GE_CF_CRC_STRIP_REG);
> +
> +	val = readl_relaxed(priv->base + GE_RECV_CONTROL_REG);
> +	val |= GE_RX_STRIP_PAD | GE_RX_PAD_EN;
> +	writel_relaxed(val, priv->base + GE_RECV_CONTROL_REG);
> +
> +	val = GE_AUTO_NEG_CTL;
> +	writel_relaxed(val, priv->base + GE_TX_LOCAL_PAGE_REG);
> +}
> +
> +static void hip04_mac_enable(struct net_device *ndev)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +	u32 val;
> +
> +	/* enable tx & rx */
> +	val = readl_relaxed(priv->base + GE_PORT_EN);
> +	val |= GE_RX_PORT_EN | GE_TX_PORT_EN;
> +	writel_relaxed(val, priv->base + GE_PORT_EN);
> +
> +	/* clear rx int */
> +	val = RCV_INT;
> +	writel_relaxed(val, priv->base + PPE_RINT);
> +
> +	/* config recv int */
> +	val = GE_RX_INT_THRESHOLD | GE_RX_TIMEOUT;
> +	writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_INT);
> +
> +	/* enable interrupt */
> +	priv->reg_inten = DEF_INT_MASK;
> +	writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
> +}
> +
> +static void hip04_mac_disable(struct net_device *ndev)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +	u32 val;
> +
> +	/* disable int */
> +	priv->reg_inten &= ~(DEF_INT_MASK);
> +	writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
> +
> +	/* disable tx & rx */
> +	val = readl_relaxed(priv->base + GE_PORT_EN);
> +	val &= ~(GE_RX_PORT_EN | GE_TX_PORT_EN);
> +	writel_relaxed(val, priv->base + GE_PORT_EN);
> +}
> +
> +static void hip04_set_xmit_desc(struct hip04_priv *priv, dma_addr_t phys)
> +{
> +	writel(phys, priv->base + PPE_CFG_CPU_ADD_ADDR);
> +}
> +
> +static void hip04_set_recv_desc(struct hip04_priv *priv, dma_addr_t phys)
> +{
> +	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, phys);
> +}
> +
> +static u32 hip04_recv_cnt(struct hip04_priv *priv)
> +{
> +	return readl(priv->base + PPE_HIS_RX_PKT_CNT);
> +}
> +
> +static void hip04_update_mac_address(struct net_device *ndev)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +
> +	writel_relaxed(((ndev->dev_addr[0] << 8) | (ndev->dev_addr[1])),
> +		       priv->base + GE_STATION_MAC_ADDRESS);
> +	writel_relaxed(((ndev->dev_addr[2] << 24) | (ndev->dev_addr[3] << 16) |
> +			(ndev->dev_addr[4] << 8) | (ndev->dev_addr[5])),
> +		       priv->base + GE_STATION_MAC_ADDRESS + 4);
> +}
> +
> +static int hip04_set_mac_address(struct net_device *ndev, void *addr)
> +{
> +	eth_mac_addr(ndev, addr);
> +	hip04_update_mac_address(ndev);
> +	return 0;
> +}
> +
> +static int hip04_tx_reclaim(struct net_device *ndev, bool force)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +	unsigned tx_tail = priv->tx_tail;
> +	struct tx_desc *desc;
> +	unsigned int bytes_compl = 0, pkts_compl = 0;
> +	unsigned int count;
> +
> +	smp_rmb();
> +	count = tx_count(ACCESS_ONCE(priv->tx_head), tx_tail);
> +	if (count == 0)
> +		goto out;
> +
> +	while (count) {
> +		desc = &priv->tx_desc[tx_tail];
> +		if (desc->send_addr != 0) {
> +			if (force)
> +				desc->send_addr = 0;
> +			else
> +				break;
> +		}
> +
> +		if (priv->tx_phys[tx_tail]) {
> +			dma_unmap_single(&ndev->dev, priv->tx_phys[tx_tail],
> +					 priv->tx_skb[tx_tail]->len,
> +					 DMA_TO_DEVICE);
> +			priv->tx_phys[tx_tail] = 0;
> +		}
> +		pkts_compl++;
> +		bytes_compl += priv->tx_skb[tx_tail]->len;
> +		dev_kfree_skb(priv->tx_skb[tx_tail]);
> +		priv->tx_skb[tx_tail] = NULL;
> +		tx_tail = TX_NEXT(tx_tail);
> +		count--;
> +	}
> +
> +	priv->tx_tail = tx_tail;
> +	smp_wmb(); /* Ensure tx_tail visible to xmit */
> +
> +out:
> +	if (pkts_compl || bytes_compl)

Testing bytes_compl should be enough: there is no way pkts_compl could
be 0 if bytes_compl is not 0.

> +		netdev_completed_queue(ndev, pkts_compl, bytes_compl);
> +
> +	if (unlikely(netif_queue_stopped(ndev)) && (count < (TX_DESC_NUM - 1)))
> +		netif_wake_queue(ndev);
> +
> +	return count;
> +}
> +
> +static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +{
> +	struct hip04_priv *priv = netdev_priv(ndev);
> +	struct net_device_stats *stats = &ndev->stats;
> +	unsigned int tx_head = priv->tx_head, count;
> +	struct tx_desc *desc = &priv->tx_desc[tx_head];
> +	dma_addr_t phys;
> +
> +	smp_rmb();
> +	count = tx_count(tx_head, ACCESS_ONCE(priv->tx_tail));
> +	if (count == (TX_DESC_NUM - 1)) {
> +		netif_stop_queue(ndev);
> +		return NETDEV_TX_BUSY;
> +	}
> +
> +	phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
> +	if (dma_mapping_error(&ndev->dev, phys)) {
> +		dev_kfree_skb(skb);
> +		return NETDEV_TX_OK;
> +	}
> +
> +	priv->tx_skb[tx_head] = skb;
> +	priv->tx_phys[tx_head] = phys;
> +	desc->send_addr = cpu_to_be32(phys);
> +	desc->send_size = cpu_to_be32(skb->len);
> +	desc->cfg = cpu_to_be32(TX_CLEAR_WB | TX_FINISH_CACHE_INV);
> +	phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
> +	desc->wb_addr = cpu_to_be32(phys);
> +	skb_tx_timestamp(skb);
> +
> +	hip04_set_xmit_desc(priv, phys);
> +	priv->tx_head = TX_NEXT(tx_head);
> +	count++;

Starting from this point, skb might already have been freed by TX
completion.

It's racy to access skb->len here.
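
One usual way out is to read the length into a local while the packet is
still owned by the xmit path, and only use that local afterwards. A minimal
userspace sketch of the idea follows; struct pkt, struct tx_stats,
handoff_fn, xmit_account and poison_handoff are all hypothetical names for
this example, not part of the driver or of any kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct pkt {                 /* hypothetical stand-in for struct sk_buff */
	unsigned int len;
};

struct tx_stats {
	unsigned long bytes;
	unsigned long packets;
};

/* Hypothetical hand-off hook: once it returns, the packet may be gone. */
typedef void (*handoff_fn)(struct pkt *p);

static inline void xmit_account(struct pkt *p, handoff_fn handoff,
				struct tx_stats *st)
{
	unsigned int len;

	assert(p != NULL);
	len = p->len;        /* read while we still own the packet */
	handoff(p);          /* ownership passes; p must not be touched again */

	st->bytes += len;    /* accounting uses the saved copy only */
	st->packets++;
}

/* Simulates an immediate TX completion by clobbering the packet. */
static inline void poison_handoff(struct pkt *p)
{
	p->len = 0xdeadbeef;
}
```

In the driver this would presumably mean saving skb->len before
hip04_set_xmit_desc() and passing the saved value to netdev_sent_queue()
and the stats update.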

> +	netdev_sent_queue(ndev, skb->len);
> +
> +	stats->tx_bytes += skb->len;
> +	stats->tx_packets++;
> +
> +	/* Ensure tx_head update visible to tx reclaim */
> +	smp_wmb();
> +
> +	/* queue is getting full, better start cleaning up now */
> +	if (count >= priv->tx_coalesce_frames) {
> +		if (napi_schedule_prep(&priv->napi)) {
> +			/* disable rx interrupt and timer */
> +			priv->reg_inten &= ~(RCV_INT);
> +			writel_relaxed(DEF_INT_MASK & ~RCV_INT,
> +				       priv->base + PPE_INTEN);
> +			hrtimer_cancel(&priv->tx_coalesce_timer);
> +			__napi_schedule(&priv->napi);
> +		}
> +	} else if (!hrtimer_is_queued(&priv->tx_coalesce_timer)) {
> +		/* cleanup not pending yet, start a new timer */
> +		hrtimer_start_expires(&priv->tx_coalesce_timer,
> +				      HRTIMER_MODE_REL);
> +	}
> +
> +	return NETDEV_TX_OK;
> +}
> +
> +static int hip04_rx_poll(struct napi_struct *napi, int budget)
> +{
> +	struct hip04_priv *priv = container_of(napi, struct hip04_priv, napi);
> +	struct net_device *ndev = priv->ndev;
> +	struct net_device_stats *stats = &ndev->stats;
> +	unsigned int cnt = hip04_recv_cnt(priv);
> +	struct rx_desc *desc;
> +	struct sk_buff *skb;
> +	unsigned char *buf;
> +	bool last = false;
> +	dma_addr_t phys;
> +	int rx = 0;
> +	int tx_remaining;
> +	u16 len;
> +	u32 err;
> +
> +	while (cnt && !last) {
> +		buf = priv->rx_buf[priv->rx_head];
> +		skb = build_skb(buf, priv->rx_buf_size);
> +		if (unlikely(!skb))
> +			net_dbg_ratelimited("build_skb failed\n");

Well, if skb is NULL, you're crashing later...
You really have to handle a memory allocation error much better than
that!
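
One common recovery pattern (an assumption on my side about what fits this
hardware, not something taken from the patch) is to allocate the
replacement buffer first and, if that fails, drop the packet and leave the
old buffer in the ring so the slot never goes empty. A userspace sketch
with made-up names (rx_ring, alloc_fn, rx_one, failing_alloc), where
free() stands in for handing the old buffer up the stack:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4

struct rx_ring {
	void *buf[RING_SIZE];
	unsigned int head;
	unsigned long delivered;
	unsigned long dropped;
};

typedef void *(*alloc_fn)(size_t size);

/* Handle one RX slot. Returns true if the packet was delivered. */
static inline bool rx_one(struct rx_ring *r, alloc_fn alloc, size_t size)
{
	void *fresh;
	bool ok;

	assert(r->buf[r->head] != NULL);

	/* Get the replacement first, before giving the old buffer away. */
	fresh = alloc(size);
	ok = fresh != NULL;
	if (ok) {
		free(r->buf[r->head]);      /* stand-in for delivering the packet */
		r->buf[r->head] = fresh;    /* slot is refilled immediately */
		r->delivered++;
	} else {
		r->dropped++;               /* old buffer stays in the slot */
	}

	r->head = (r->head + 1) % RING_SIZE;
	return ok;
}

/* Allocator that always fails, to exercise the error path. */
static inline void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

The same ordering covers the netdev_alloc_frag() and dma_map_single()
failures further down: nothing is taken out of the ring until its
replacement is known to be usable.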

> +
> +		dma_unmap_single(&ndev->dev, priv->rx_phys[priv->rx_head],
> +				 RX_BUF_SIZE, DMA_FROM_DEVICE);
> +		priv->rx_phys[priv->rx_head] = 0;
> +
> +		desc = (struct rx_desc *)skb->data;
> +		len = be16_to_cpu(desc->pkt_len);
> +		err = be32_to_cpu(desc->pkt_err);
> +
> +		if (0 == len) {
> +			dev_kfree_skb_any(skb);
> +			last = true;
> +		} else if ((err & RX_PKT_ERR) || (len >= GMAC_MAX_PKT_LEN)) {
> +			dev_kfree_skb_any(skb);
> +			stats->rx_dropped++;
> +			stats->rx_errors++;
> +		} else {
> +			skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
> +			skb_put(skb, len);
> +			skb->protocol = eth_type_trans(skb, ndev);
> +			napi_gro_receive(&priv->napi, skb);
> +			stats->rx_packets++;
> +			stats->rx_bytes += len;
> +			rx++;
> +		}
> +
> +		buf = netdev_alloc_frag(priv->rx_buf_size);
> +		if (!buf)
> +			goto done;

Same problem here : In case of memory allocation error, your driver is
totally screwed.

> +		phys = dma_map_single(&ndev->dev, buf,
> +				      RX_BUF_SIZE, DMA_FROM_DEVICE);
> +		if (dma_mapping_error(&ndev->dev, phys))
> +			goto done;

Same problem here : You really have to recover properly.

> +		priv->rx_buf[priv->rx_head] = buf;
> +		priv->rx_phys[priv->rx_head] = phys;
> +		hip04_set_recv_desc(priv, phys);
> +
> +		priv->rx_head = RX_NEXT(priv->rx_head);
> +		if (rx >= budget)
> +			goto done;
> +
> +		if (--cnt == 0)
> +			cnt = hip04_recv_cnt(priv);
> +	}
> +
> +	if (!(priv->reg_inten & RCV_INT)) {
> +		/* enable rx interrupt */
> +		priv->reg_inten |= RCV_INT;
> +		writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
> +	}
> +	napi_complete(napi);
> +done:
> +	/* clean up tx descriptors and start a new timer if necessary */
> +	tx_remaining = hip04_tx_reclaim(ndev, false);
> +	if (rx < budget && tx_remaining)
> +		hrtimer_start_expires(&priv->tx_coalesce_timer, HRTIMER_MODE_REL);
> +
> +	return rx;
> +}
> +

^ permalink raw reply

* [patch-net-next v2 2/3] net: ethernet: cpsw: split out IRQ handler
From: Felipe Balbi @ 2015-01-14 16:58 UTC (permalink / raw)
  To: davem
  Cc: Tony Lindgren, Linux OMAP Mailing List, mugunthanvnm, netdev,
	Felipe Balbi
In-Reply-To: <1421254729-10602-1-git-send-email-balbi@ti.com>

Now we can introduce dedicated IRQ handlers
for each of the IRQ events. This helps with
cleaning up a little bit of the clutter in
cpsw_interrupt() while also making sure that
TX IRQs will try to handle TX buffers while
RX IRQs will try to handle RX buffers.

Signed-off-by: Felipe Balbi <balbi@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 8e1af51e4b76..c6c483f3e49f 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -754,18 +754,36 @@ requeue:
 		dev_kfree_skb_any(new_skb);
 }
 
-static irqreturn_t cpsw_interrupt(int irq, void *dev_id)
+static irqreturn_t cpsw_dummy_interrupt(int irq, void *dev_id)
 {
 	struct cpsw_priv *priv = dev_id;
 	int value = irq - priv->irqs_table[0];
 
-	/* NOTICE: Ending IRQ here. The trick with the 'value' variable above
-	 * is to make sure we will always write the correct value to the EOI
-	 * register. Namely 0 for RX_THRESH Interrupt, 1 for RX Interrupt, 2
-	 * for TX Interrupt and 3 for MISC Interrupt.
-	 */
 	cpdma_ctlr_eoi(priv->dma, value);
 
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id)
+{
+	struct cpsw_priv *priv = dev_id;
+
+	cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_TX);
+	cpdma_chan_process(priv->txch, 128);
+
+	priv = cpsw_get_slave_priv(priv, 1);
+	if (priv)
+		cpdma_chan_process(priv->txch, 128);
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cpsw_rx_interrupt(int irq, void *dev_id)
+{
+	struct cpsw_priv *priv = dev_id;
+
+	cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_RX);
+
 	cpsw_intr_disable(priv);
 	if (priv->irq_enabled == true) {
 		cpsw_disable_irq(priv);
@@ -1617,7 +1635,8 @@ static void cpsw_ndo_poll_controller(struct net_device *ndev)
 
 	cpsw_intr_disable(priv);
 	cpdma_ctlr_int_ctrl(priv->dma, false);
-	cpsw_interrupt(ndev->irq, priv);
+	cpsw_rx_interrupt(priv->irq[1], priv);
+	cpsw_tx_interrupt(priv->irq[2], priv);
 	cpdma_ctlr_int_ctrl(priv->dma, true);
 	cpsw_intr_enable(priv);
 }
@@ -2351,7 +2370,7 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 
 	priv->irqs_table[0] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_dummy_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
@@ -2363,7 +2382,7 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 
 	priv->irqs_table[1] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_rx_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
@@ -2375,7 +2394,7 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 
 	priv->irqs_table[2] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_tx_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
@@ -2387,7 +2406,7 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 
 	priv->irqs_table[3] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_dummy_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-- 
2.2.0


^ permalink raw reply related

* [patch-net-next v2 1/3] net: ethernet: cpsw: unroll IRQ request loop
From: Felipe Balbi @ 2015-01-14 16:58 UTC (permalink / raw)
  To: davem
  Cc: Tony Lindgren, Linux OMAP Mailing List, mugunthanvnm, netdev,
	Felipe Balbi

This patch is in preparation for a nicer IRQ
handling scheme where we use different IRQ
handlers for each IRQ line (as it should be).

Later, we will also drop IRQs offset 0 and 3
because they are always disabled in this driver.

Signed-off-by: Felipe Balbi <balbi@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c | 62 ++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index e61ee8351272..8e1af51e4b76 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2156,7 +2156,8 @@ static int cpsw_probe(struct platform_device *pdev)
 	void __iomem			*ss_regs;
 	struct resource			*res, *ss_res;
 	u32 slave_offset, sliver_offset, slave_size;
-	int ret = 0, i, k = 0;
+	int ret = 0, i;
+	int irq;
 
 	ndev = alloc_etherdev(sizeof(struct cpsw_priv));
 	if (!ndev) {
@@ -2345,24 +2346,55 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 	}
 
-	while ((res = platform_get_resource(priv->pdev, IORESOURCE_IRQ, k))) {
-		if (k >= ARRAY_SIZE(priv->irqs_table)) {
-			ret = -EINVAL;
-			goto clean_ale_ret;
-		}
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0)
+		goto clean_ale_ret;
 
-		ret = devm_request_irq(&pdev->dev, res->start, cpsw_interrupt,
-				       0, dev_name(&pdev->dev), priv);
-		if (ret < 0) {
-			dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-			goto clean_ale_ret;
-		}
+	priv->irqs_table[0] = irq;
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+			       0, dev_name(&pdev->dev), priv);
+	if (ret < 0) {
+		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
+		goto clean_ale_ret;
+	}
 
-		priv->irqs_table[k] = res->start;
-		k++;
+	irq = platform_get_irq(pdev, 1);
+	if (irq < 0)
+		goto clean_ale_ret;
+
+	priv->irqs_table[1] = irq;
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+			       0, dev_name(&pdev->dev), priv);
+	if (ret < 0) {
+		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
+		goto clean_ale_ret;
+	}
+
+	irq = platform_get_irq(pdev, 2);
+	if (irq < 0)
+		goto clean_ale_ret;
+
+	priv->irqs_table[2] = irq;
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+			       0, dev_name(&pdev->dev), priv);
+	if (ret < 0) {
+		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
+		goto clean_ale_ret;
+	}
+
+	irq = platform_get_irq(pdev, 3);
+	if (irq < 0)
+		goto clean_ale_ret;
+
+	priv->irqs_table[3] = irq;
+	ret = devm_request_irq(&pdev->dev, irq, cpsw_interrupt,
+			       0, dev_name(&pdev->dev), priv);
+	if (ret < 0) {
+		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
+		goto clean_ale_ret;
 	}
 
-	priv->num_irqs = k;
+	priv->num_irqs = 4;
 
 	ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
-- 
2.2.0

^ permalink raw reply related

* [patch-net-next v2 3/3] net: ethernet: cpsw: don't requests IRQs we don't use
From: Felipe Balbi @ 2015-01-14 16:58 UTC (permalink / raw)
  To: davem
  Cc: Tony Lindgren, Linux OMAP Mailing List, mugunthanvnm, netdev,
	Felipe Balbi
In-Reply-To: <1421254729-10602-1-git-send-email-balbi@ti.com>

CPSW never uses RX_THRESHOLD or MISC interrupts. In
fact, they are always kept masked in their appropriate
IRQ Enable register.

Instead of allocating an IRQ that never fires, it's best
to remove that code altogether and let future patches
implement it if anybody needs those.

Signed-off-by: Felipe Balbi <balbi@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c | 55 ++++++++++++------------------------------
 1 file changed, 15 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index c6c483f3e49f..ba09ff3c1695 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -754,16 +754,6 @@ requeue:
 		dev_kfree_skb_any(new_skb);
 }
 
-static irqreturn_t cpsw_dummy_interrupt(int irq, void *dev_id)
-{
-	struct cpsw_priv *priv = dev_id;
-	int value = irq - priv->irqs_table[0];
-
-	cpdma_ctlr_eoi(priv->dma, value);
-
-	return IRQ_HANDLED;
-}
-
 static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id)
 {
 	struct cpsw_priv *priv = dev_id;
@@ -1635,8 +1625,8 @@ static void cpsw_ndo_poll_controller(struct net_device *ndev)
 
 	cpsw_intr_disable(priv);
 	cpdma_ctlr_int_ctrl(priv->dma, false);
-	cpsw_rx_interrupt(priv->irq[1], priv);
-	cpsw_tx_interrupt(priv->irq[2], priv);
+	cpsw_rx_interrupt(priv->irq[0], priv);
+	cpsw_tx_interrupt(priv->irq[1], priv);
 	cpdma_ctlr_int_ctrl(priv->dma, true);
 	cpsw_intr_enable(priv);
 }
@@ -2358,30 +2348,27 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_dma_ret;
 	}
 
-	ndev->irq = platform_get_irq(pdev, 0);
+	ndev->irq = platform_get_irq(pdev, 1);
 	if (ndev->irq < 0) {
 		dev_err(priv->dev, "error getting irq resource\n");
 		ret = -ENOENT;
 		goto clean_ale_ret;
 	}
 
-	irq = platform_get_irq(pdev, 0);
-	if (irq < 0)
-		goto clean_ale_ret;
-
-	priv->irqs_table[0] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_dummy_interrupt,
-			       0, dev_name(&pdev->dev), priv);
-	if (ret < 0) {
-		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
-	}
+	/* Grab RX and TX IRQs. Note that we also have RX_THRESHOLD and
+	 * MISC IRQs which are always kept disabled with this driver so
+	 * we will not request them.
+	 *
+	 * If anyone wants to implement support for those, make sure to
+	 * first request and append them to irqs_table array.
+	 */
 
+	/* RX IRQ */
 	irq = platform_get_irq(pdev, 1);
 	if (irq < 0)
 		goto clean_ale_ret;
 
-	priv->irqs_table[1] = irq;
+	priv->irqs_table[0] = irq;
 	ret = devm_request_irq(&pdev->dev, irq, cpsw_rx_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
@@ -2389,31 +2376,19 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_ale_ret;
 	}
 
+	/* TX IRQ */
 	irq = platform_get_irq(pdev, 2);
 	if (irq < 0)
 		goto clean_ale_ret;
 
-	priv->irqs_table[2] = irq;
+	priv->irqs_table[1] = irq;
 	ret = devm_request_irq(&pdev->dev, irq, cpsw_tx_interrupt,
 			       0, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
 		goto clean_ale_ret;
 	}
-
-	irq = platform_get_irq(pdev, 3);
-	if (irq < 0)
-		goto clean_ale_ret;
-
-	priv->irqs_table[3] = irq;
-	ret = devm_request_irq(&pdev->dev, irq, cpsw_dummy_interrupt,
-			       0, dev_name(&pdev->dev), priv);
-	if (ret < 0) {
-		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
-	}
-
-	priv->num_irqs = 4;
+	priv->num_irqs = 2;
 
 	ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
-- 
2.2.0

^ permalink raw reply related

* [patch net] team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin
From: Jiri Pirko @ 2015-01-14 17:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, jbenc

This patch fixes a race condition that may set count_pending to -1,
which results in an unwanted large burst of ARP messages (in the case of
"notify peers").

Consider the following scenario:

count_pending == 2
   CPU0                                           CPU1
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
 team_notify_peers
   atomic_add (adding 1 to count_pending)
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 0)
   schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to -1)

Fix this race by using atomic_dec_if_positive, which prevents
count_pending from dropping below 0.
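
For readers unfamiliar with the helper, here is a userspace sketch of the
semantics relied on above, written with C11 atomics rather than the kernel
atomic_t API. The names dec_if_positive and demo are made up for this
example. Like the kernel helper, it returns the old value minus one even
when no decrement took place, so a negative return means the counter was
already at zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Decrement *v only if the result stays >= 0; return old value minus 1. */
static inline int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);
	int dec;

	do {
		dec = old - 1;
		if (dec < 0)
			break;          /* would go negative: leave *v untouched */
		/* on failure, old is reloaded with the current value and we retry */
	} while (!atomic_compare_exchange_weak(v, &old, dec));

	return dec;
}

/* Two work items racing for one pending notification. */
void demo(void)
{
	atomic_int pending = 1;

	assert(dec_if_positive(&pending) == 0);   /* first worker takes it */
	assert(dec_if_positive(&pending) == -1);  /* second worker backs off */
	assert(atomic_load(&pending) == 0);       /* counter never underflows */
}
```

This is why the work functions in the diff below can bail out on val < 0
and reschedule only when val is still non-zero.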

Fixes: fc423ff00df3a1955441 ("team: add peer notification")
Fixes: 492b200efdd20b8fcfd  ("team: add support for sending multicast rejoins")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 drivers/net/team/team.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 43bcfff..4b2bfc5 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -622,6 +622,7 @@ static int team_change_mode(struct team *team, const char *kind)
 static void team_notify_peers_work(struct work_struct *work)
 {
 	struct team *team;
+	int val;
 
 	team = container_of(work, struct team, notify_peers.dw.work);
 
@@ -629,9 +630,14 @@ static void team_notify_peers_work(struct work_struct *work)
 		schedule_delayed_work(&team->notify_peers.dw, 0);
 		return;
 	}
+	val = atomic_dec_if_positive(&team->notify_peers.count_pending);
+	if (val < 0) {
+		rtnl_unlock();
+		return;
+	}
 	call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, team->dev);
 	rtnl_unlock();
-	if (!atomic_dec_and_test(&team->notify_peers.count_pending))
+	if (val)
 		schedule_delayed_work(&team->notify_peers.dw,
 				      msecs_to_jiffies(team->notify_peers.interval));
 }
@@ -662,6 +668,7 @@ static void team_notify_peers_fini(struct team *team)
 static void team_mcast_rejoin_work(struct work_struct *work)
 {
 	struct team *team;
+	int val;
 
 	team = container_of(work, struct team, mcast_rejoin.dw.work);
 
@@ -669,9 +676,14 @@ static void team_mcast_rejoin_work(struct work_struct *work)
 		schedule_delayed_work(&team->mcast_rejoin.dw, 0);
 		return;
 	}
+	val = atomic_dec_if_positive(&team->mcast_rejoin.count_pending);
+	if (val < 0) {
+		rtnl_unlock();
+		return;
+	}
 	call_netdevice_notifiers(NETDEV_RESEND_IGMP, team->dev);
 	rtnl_unlock();
-	if (!atomic_dec_and_test(&team->mcast_rejoin.count_pending))
+	if (val)
 		schedule_delayed_work(&team->mcast_rejoin.dw,
 				      msecs_to_jiffies(team->mcast_rejoin.interval));
 }
-- 
1.9.3

^ permalink raw reply related

* Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Eric Dumazet @ 2015-01-14 17:20 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <1719052.SGOfRAJhfQ@storm>

On Wed, 2015-01-14 at 16:32 +0100, Thomas Jarosch wrote:
> Hello,
> 
> after updating a good bunch of production level machines
> from kernel 3.4.101 to kernel 3.14.25, a few of them started
> to show serious trouble when there was a lot of network traffic.
> 
> ---------------------------------------------------------------
> Jan 14 11:14:57 intrartc kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
> Jan 14 11:14:57 intrartc kernel:  TDH                  <3b>
> Jan 14 11:14:57 intrartc kernel:  TDT                  <76>
> Jan 14 11:14:57 intrartc kernel:  next_to_use          <76>
> Jan 14 11:14:57 intrartc kernel:  next_to_clean        <31>
> Jan 14 11:14:57 intrartc kernel: buffer_info[next_to_clean]:
> Jan 14 11:14:57 intrartc kernel:  time_stamp           <ffff328c>
> Jan 14 11:14:57 intrartc kernel:  next_to_watch        <3b>
> Jan 14 11:14:57 intrartc kernel:  jiffies              <ffff33b9>
> Jan 14 11:14:57 intrartc kernel:  next_to_watch.status <0>
> Jan 14 11:14:57 intrartc kernel: MAC Status             <40080083>
> Jan 14 11:14:57 intrartc kernel: PHY Status             <796d>
> Jan 14 11:14:57 intrartc kernel: PHY 1000BASE-T Status  <3800>
> Jan 14 11:14:57 intrartc kernel: PHY Extended Status    <3000>
> Jan 14 11:14:57 intrartc kernel: PCI Status             <10>
> Jan 14 11:14:59 intrartc kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
> ..
> ---------------------------------------------------------------
> 
> All of those troubled machines use an Intel DH61CR board and
> are driven by the e1000e driver. Kernels 3.7.0 to 3.19-rc4 are affected.
> 
> The problem vanishes when you disable TSO. This is the
> recommended "solution" on serverfault and others.
> http://ehc.ac/p/e1000/bugs/378/
> http://serverfault.com/questions/616485/e1000e-reset-adapter-unexpectedly-detected-hardware-unit-hang
> 
> I have a test setup that can trigger the problem within seconds
> and bisected it down to this commit (hi Eric!):
> ---------------------------------------------------------------
> commit 69b08f62e17439ee3d436faf0b9a7ca6fffb78db
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Wed Sep 26 06:46:57 2012 +0000
> 
>     net: use bigger pages in __netdev_alloc_frag
> 
>     We currently use percpu order-0 pages in __netdev_alloc_frag
>     to deliver fragments used by __netdev_alloc_skb()
> 
>     Depending on NIC driver and arch being 32 or 64 bit, it allows a page to
>     be split in several fragments (between 1 and 8), assuming PAGE_SIZE=4096
> 
>     Switching to bigger pages (32768 bytes for PAGE_SIZE=4096 case) allows :
> 
>     - Better filling of space (the ending hole overhead is less an issue)
> 
>     - Less calls to page allocator or accesses to page->_count
> 
>     - Could allow struct skb_shared_info futures changes without major
>     performance impact.
> 
>     This patch implements a transparent fallback to smaller
>     pages in case of memory pressure.
> 
>     It also uses a standard "struct page_frag" instead of a custom one.
> 
>     Signed-off-by: Eric Dumazet <edumazet@google.com>
>     Cc: Alexander Duyck <alexander.h.duyck@intel.com>
>     Cc: Benjamin LaHaise <bcrl@kvack.org>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> ---------------------------------------------------------------
> 
> Reverting the commit f.e. in kernel 3.7.0  solves the issue.
> I've done some more tests:
> 
>     3.18.0 32bit + PAE: broken
>     3.6.0 32bit + PAE: works
>     3.7.0 32bit + PAE: broken
>     3.7.0 32bit + PAE + revert 69b08f62e17439ee3d436faf0b9a7ca6fffb78db -> works
> 
>     3.7.0 32bit (without PAE) -> broken
>     3.7.0 32bit + "GFP_COMP" flag removed in __netdev_alloc_frag(): broken
>     3.7.0 32bit + "GFP_COMP" flag replaced with
>                               "GFP_DMA" in __netdev_alloc_frag(): works!
>     3.7.0 32bit + "GFP_COMP" flag + "GFP_DMA" flag: broken
>     3.19-rc4 32bit: broken
> 
> 
> The problem is triggered only when the traffic is forwarded to another client.
> (this client is behind NAT). Generating traffic directly
> on the system did not trigger the issue.
> 
> To me it looks like Eric's change uncovered a memory allocation
> issue in the e1000e driver: It probably uses a memory address
> unsuitable for DMA or so. This is just a guess though.
> 
> Funny fact: I have another Intel DH61CR board that does not show the problem.
> I've borrowed (...) the mainboard from one affected box for my bisect test setup.
> 
> Please CC: comments. Thanks.

I would try lowering the amount of data per TX descriptor (txd). I am not
sure 24KB is really supported.

(See commit d821a4c4d11ad160925dab2bb009b8444beff484 for details.)
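
For reference, a small standalone sketch of the arithmetic behind
tx_fifo_limit. It assumes, as the driver expression suggests, that the
upper 16 bits of PBA hold the TX packet buffer size in KB. The helper
name and the PBA values are made up for illustration; this is not
driver code.

```c
#include <stdint.h>

/* Same expression as in e1000e_reset(): TX buffer size in KB taken from
 * the upper half of PBA, converted to bytes, minus 96, then clamped to
 * cap_bytes. A PBA upper half of 0 would underflow, so callers of this
 * sketch must pass a non-zero TX allocation.
 */
static inline uint32_t tx_fifo_limit(uint32_t pba, uint32_t cap_bytes)
{
	uint32_t fifo_bytes = ((pba >> 16) << 10) - 96;

	return fifo_bytes < cap_bytes ? fifo_bytes : cap_bytes;
}
```

With a hypothetical 14KB TX allocation, the 24KB cap never applies and
the limit is 14336 - 96 = 14240 bytes; with the cap lowered to 8KB the
limit becomes 8192 bytes. On parts with a TX allocation of 8KB or less
the change has no effect.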

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e14fd85f64eb..8d973f7edfbd 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3897,7 +3897,7 @@ void e1000e_reset(struct e1000_adapter *adapter)
 	 * limit of 24KB due to receive synchronization limitations.
 	 */
 	adapter->tx_fifo_limit = min_t(u32, ((er32(PBA) >> 16) << 10) - 96,
-				       24 << 10);
+				       8 << 10);
 
 	/* Disable Adaptive Interrupt Moderation if 2 full packets cannot
 	 * fit in receive buffer.


* [PATCH v3 02/16] virtio/9p: verify device has config space
From: Michael S. Tsirkin @ 2015-01-14 17:27 UTC (permalink / raw)
  To: linux-kernel, virtualization
  Cc: Eric Van Hensbergen, netdev, v9fs-developer, Ron Minnich,
	David S. Miller
In-Reply-To: <1421256142-11512-1-git-send-email-mst@redhat.com>

Some devices might not implement config space access
(e.g. remoteproc did not before 3.9).
virtio/9p needs config space access, so make probe
fail gracefully if it is not available.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 net/9p/trans_virtio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index daa749c..d8e376a 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -524,6 +524,12 @@ static int p9_virtio_probe(struct virtio_device *vdev)
 	int err;
 	struct virtio_chan *chan;
 
+	if (!vdev->config->get) {
+		dev_err(&vdev->dev, "%s failure: config access disabled\n",
+			__func__);
+		return -EINVAL;
+	}
+
 	chan = kmalloc(sizeof(struct virtio_chan), GFP_KERNEL);
 	if (!chan) {
 		pr_err("Failed to allocate virtio 9P channel\n");
-- 
MST


* [PATCH v3 05/16] virtio/net: verify device has config space
From: Michael S. Tsirkin @ 2015-01-14 17:27 UTC (permalink / raw)
  To: linux-kernel, virtualization
  Cc: Rusty Russell, cornelia.huck, virtualization, netdev
In-Reply-To: <1421256142-11512-1-git-send-email-mst@redhat.com>

Some devices might not implement config space access
(e.g. remoteproc did not before 3.9).
virtio/net needs config space access, so make probe
fail gracefully if it is not available.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5ca9771..9bc1072 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1713,6 +1713,12 @@ static int virtnet_probe(struct virtio_device *vdev)
 	struct virtnet_info *vi;
 	u16 max_queue_pairs;

+	if (!vdev->config->get) {
+		dev_err(&vdev->dev, "%s failure: config access disabled\n",
+			__func__);
+		return -EINVAL;
+	}
+
 	if (!virtnet_validate_features(vdev))
 		return -EINVAL;

-- 
MST


* [PATCH for 3.19 0/3] rtlwifi: Various updates/fixes
From: Larry Finger @ 2015-01-14 17:37 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, Larry Finger, netdev

The first of these patches removes a logging statement that is no longer
needed. The other two fix a number of bugs in rtl8192ee that have been
found since the original inclusion of this driver.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>


Larry Finger (1):
  rtlwifi: Remove logging statement that is no longer needed

Troy Tan (2):
  rtlwifi: Fix handling of new style descriptors
  rtlwifi: rtl8192ee: Implement new handling of FIFO descriptor buffer

 drivers/net/wireless/rtlwifi/pci.c           |  36 ++++--
 drivers/net/wireless/rtlwifi/rtl8192ee/hw.c  | 167 +++++++++++++++++++++----
 drivers/net/wireless/rtlwifi/rtl8192ee/reg.h |   2 +
 drivers/net/wireless/rtlwifi/rtl8192ee/sw.c  |   3 +-
 drivers/net/wireless/rtlwifi/rtl8192ee/trx.c | 175 +++++++++++----------------
 drivers/net/wireless/rtlwifi/rtl8192ee/trx.h |   4 +-
 drivers/net/wireless/rtlwifi/wifi.h          |   1 +
 7 files changed, 242 insertions(+), 146 deletions(-)

-- 
2.1.2


* [PATCH for 3.19 1/3] rtlwifi: Remove logging statement that is no longer needed
From: Larry Finger @ 2015-01-14 17:37 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, Larry Finger, netdev
In-Reply-To: <1421257036-5382-1-git-send-email-Larry.Finger@lwfinger.net>

In commit e9538cf4f907 ("rtlwifi: Fix error when accessing unmapped memory
in skb"), a printk was included to indicate that the condition had been
reached. There is now enough evidence from other users that the fix is
working. That logging statement can now be removed.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
 drivers/net/wireless/rtlwifi/pci.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rtlwifi/pci.c b/drivers/net/wireless/rtlwifi/pci.c
index c70efb9..e25faac 100644
--- a/drivers/net/wireless/rtlwifi/pci.c
+++ b/drivers/net/wireless/rtlwifi/pci.c
@@ -816,11 +816,8 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 
 		/* get a new skb - if fail, old one will be reused */
 		new_skb = dev_alloc_skb(rtlpci->rxbuffersize);
-		if (unlikely(!new_skb)) {
-			pr_err("Allocation of new skb failed in %s\n",
-			       __func__);
+		if (unlikely(!new_skb))
 			goto no_new;
-		}
 		if (rtlpriv->use_new_trx_flow) {
 			buffer_desc =
 			  &rtlpci->rx_ring[rxring_idx].buffer_desc
-- 
2.1.2


* [PATCH for 3.19 2/3] rtlwifi: Fix handling of new style descriptors
From: Larry Finger @ 2015-01-14 17:37 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, Troy Tan, netdev, Larry Finger
In-Reply-To: <1421257036-5382-1-git-send-email-Larry.Finger@lwfinger.net>

From: Troy Tan <troy_tan@realsil.com.cn>

The hardware and firmware for the RTL8192EE utilize a FIFO list of
descriptors. There were some problems with the initial implementation.
The worst of these failed to detect that the FIFO was becoming full,
which led to the device needing to be power cycled. As this condition
is not relevant to most of the devices supported by rtlwifi, a callback
routine was added to detect this situation. This patch implements the
necessary changes in the pci handler.
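
As a rough illustration of what such a callback has to compute, here is
a standalone sketch of counting free TX descriptors from the hardware
read index and the host write index, following the approach the
rtl8192ee implementation in this series takes. The function name is
hypothetical and the ring size is passed in rather than taken from
TX_DESC_NUM_92E.

```c
#include <stdint.h>

/* Free TX descriptors in a circular ring, given the hardware read
 * index (rp) and the host write index (wp). One slot always stays
 * unused so that rp == wp can only mean "ring empty".
 */
static inline uint16_t ring_free_desc(uint16_t rp, uint16_t wp,
				      uint16_t entries)
{
	if (rp == wp)
		return entries - 1;
	if (rp < wp)
		return entries - (wp - rp) - 1;
	return rp - wp - 1;
}
```

A result of 0 means the FIFO is full and the caller must not queue
another frame until the hardware has advanced its read index.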

Signed-off-by: Troy Tan <troy_tan@realsil.com.cn>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
 drivers/net/wireless/rtlwifi/pci.c  | 31 +++++++++++++++++++++++--------
 drivers/net/wireless/rtlwifi/wifi.h |  1 +
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/rtlwifi/pci.c b/drivers/net/wireless/rtlwifi/pci.c
index e25faac..a62170e 100644
--- a/drivers/net/wireless/rtlwifi/pci.c
+++ b/drivers/net/wireless/rtlwifi/pci.c
@@ -578,6 +578,13 @@ static void _rtl_pci_tx_isr(struct ieee80211_hw *hw, int prio)
 		else
 			entry = (u8 *)(&ring->desc[ring->idx]);
 
+		if (rtlpriv->cfg->ops->get_available_desc &&
+		    rtlpriv->cfg->ops->get_available_desc(hw, prio) <= 1) {
+			RT_TRACE(rtlpriv, (COMP_INTR | COMP_SEND), DBG_DMESG,
+				 "no available desc!\n");
+			return;
+		}
+
 		if (!rtlpriv->cfg->ops->is_tx_desc_closed(hw, prio, ring->idx))
 			return;
 		ring->idx = (ring->idx + 1) % ring->entries;
@@ -641,10 +648,9 @@ static void _rtl_pci_tx_isr(struct ieee80211_hw *hw, int prio)
 
 		ieee80211_tx_status_irqsafe(hw, skb);
 
-		if ((ring->entries - skb_queue_len(&ring->queue))
-				== 2) {
+		if ((ring->entries - skb_queue_len(&ring->queue)) <= 4) {
 
-			RT_TRACE(rtlpriv, COMP_ERR, DBG_LOUD,
+			RT_TRACE(rtlpriv, COMP_ERR, DBG_DMESG,
 				 "more desc left, wake skb_queue@%d, ring->idx = %d, skb_queue_len = 0x%x\n",
 				 prio, ring->idx,
 				 skb_queue_len(&ring->queue));
@@ -793,7 +799,7 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 			rx_remained_cnt =
 				rtlpriv->cfg->ops->rx_desc_buff_remained_cnt(hw,
 								      hw_queue);
-			if (rx_remained_cnt < 1)
+			if (rx_remained_cnt == 0)
 				return;
 
 		} else {	/* rx descriptor */
@@ -845,18 +851,18 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 			else
 				skb_reserve(skb, stats.rx_drvinfo_size +
 					    stats.rx_bufshift);
-
 		} else {
 			RT_TRACE(rtlpriv, COMP_ERR, DBG_WARNING,
 				 "skb->end - skb->tail = %d, len is %d\n",
 				 skb->end - skb->tail, len);
-			break;
+			dev_kfree_skb_any(skb);
+			goto new_trx_end;
 		}
 		/* handle command packet here */
 		if (rtlpriv->cfg->ops->rx_command_packet &&
 		    rtlpriv->cfg->ops->rx_command_packet(hw, stats, skb)) {
 				dev_kfree_skb_any(skb);
-				goto end;
+				goto new_trx_end;
 		}
 
 		/*
@@ -906,6 +912,7 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 		} else {
 			dev_kfree_skb_any(skb);
 		}
+new_trx_end:
 		if (rtlpriv->use_new_trx_flow) {
 			rtlpci->rx_ring[hw_queue].next_rx_rp += 1;
 			rtlpci->rx_ring[hw_queue].next_rx_rp %=
@@ -921,7 +928,6 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 			rtlpriv->enter_ps = false;
 			schedule_work(&rtlpriv->works.lps_change_work);
 		}
-end:
 		skb = new_skb;
 no_new:
 		if (rtlpriv->use_new_trx_flow) {
@@ -1685,6 +1691,15 @@ static int rtl_pci_tx(struct ieee80211_hw *hw,
 		}
 	}
 
+	if (rtlpriv->cfg->ops->get_available_desc &&
+	    rtlpriv->cfg->ops->get_available_desc(hw, hw_queue) == 0) {
+			RT_TRACE(rtlpriv, COMP_ERR, DBG_WARNING,
+				 "get_available_desc fail\n");
+			spin_unlock_irqrestore(&rtlpriv->locks.irq_th_lock,
+					       flags);
+			return skb->len;
+	}
+
 	if (ieee80211_is_data_qos(fc)) {
 		tid = rtl_get_tid(skb);
 		if (sta) {
diff --git a/drivers/net/wireless/rtlwifi/wifi.h b/drivers/net/wireless/rtlwifi/wifi.h
index 3b3453a..413c2ab 100644
--- a/drivers/net/wireless/rtlwifi/wifi.h
+++ b/drivers/net/wireless/rtlwifi/wifi.h
@@ -2172,6 +2172,7 @@ struct rtl_hal_ops {
 	void (*add_wowlan_pattern)(struct ieee80211_hw *hw,
 				   struct rtl_wow_pattern *rtl_pattern,
 				   u8 index);
+	u16 (*get_available_desc)(struct ieee80211_hw *hw, u8 q_idx);
 };
 
 struct rtl_intf_ops {
-- 
2.1.2


* [PATCH for 3.19 3/3] rtlwifi: rtl8192ee: Fix several bugs
From: Larry Finger @ 2015-01-14 17:37 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, Troy Tan, netdev, Larry Finger
In-Reply-To: <1421257036-5382-1-git-send-email-Larry.Finger@lwfinger.net>

From: Troy Tan <troy_tan@realsil.com.cn>

The following bugs are fixed in this driver:
1. Problems parsing C2H CMD.
2. An ad-hoc connection can cause a TX freeze.
3. There are additional conditions that cause a TX freeze.
4. The previous code failed to handle situations where an RX
   descriptor was unavailable.
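
For item 4, a standalone sketch of how the number of pending RX
descriptors can be derived from a single index register, in the spirit
of the reworked rtl92ee_rx_desc_buff_remained_cnt(). The names and the
SKETCH_IDX_MASK constant are illustrative; the register layout (hardware
index in the upper half, host index in the lower half, 9 significant
bits each) is taken from the diff below.

```c
#include <stdint.h>

#define SKETCH_IDX_MASK 0x1ff	/* 9-bit ring indices, as in the patch */

/* Number of RX descriptors the hardware has filled but the host has not
 * yet consumed. Equal indices are treated as "nothing to receive".
 */
static inline uint16_t rx_desc_pending(uint32_t reg, uint16_t entries)
{
	uint16_t hw = (reg >> 16) & SKETCH_IDX_MASK;
	uint16_t host = reg & SKETCH_IDX_MASK;

	if (hw == host)
		return 0;
	if (hw > host)
		return hw - host;
	/* hardware index has wrapped around the end of the ring */
	return entries - (host - hw);
}
```

Unlike the code it replaces, this form keeps no static state between
calls, so the result depends only on the current register value.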

Signed-off-by: Troy Tan <troy_tan@realsil.com.cn>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
 drivers/net/wireless/rtlwifi/rtl8192ee/hw.c  | 167 +++++++++++++++++++++----
 drivers/net/wireless/rtlwifi/rtl8192ee/reg.h |   2 +
 drivers/net/wireless/rtlwifi/rtl8192ee/sw.c  |   3 +-
 drivers/net/wireless/rtlwifi/rtl8192ee/trx.c | 175 +++++++++++----------------
 drivers/net/wireless/rtlwifi/rtl8192ee/trx.h |   4 +-
 5 files changed, 217 insertions(+), 134 deletions(-)

diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c b/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
index 47beb49..215b970 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/hw.c
@@ -85,29 +85,6 @@ static void _rtl92ee_enable_bcn_sub_func(struct ieee80211_hw *hw)
 	_rtl92ee_set_bcn_ctrl_reg(hw, 0, BIT(1));
 }
 
-static void _rtl92ee_return_beacon_queue_skb(struct ieee80211_hw *hw)
-{
-	struct rtl_priv *rtlpriv = rtl_priv(hw);
-	struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
-	struct rtl8192_tx_ring *ring = &rtlpci->tx_ring[BEACON_QUEUE];
-	unsigned long flags;
-
-	spin_lock_irqsave(&rtlpriv->locks.irq_th_lock, flags);
-	while (skb_queue_len(&ring->queue)) {
-		struct rtl_tx_buffer_desc *entry =
-						&ring->buffer_desc[ring->idx];
-		struct sk_buff *skb = __skb_dequeue(&ring->queue);
-
-		pci_unmap_single(rtlpci->pdev,
-				 rtlpriv->cfg->ops->get_desc(
-				 (u8 *)entry, true, HW_DESC_TXBUFF_ADDR),
-				 skb->len, PCI_DMA_TODEVICE);
-		kfree_skb(skb);
-		ring->idx = (ring->idx + 1) % ring->entries;
-	}
-	spin_unlock_irqrestore(&rtlpriv->locks.irq_th_lock, flags);
-}
-
 static void _rtl92ee_disable_bcn_sub_func(struct ieee80211_hw *hw)
 {
 	_rtl92ee_set_bcn_ctrl_reg(hw, BIT(1), 0);
@@ -403,9 +380,6 @@ static void _rtl92ee_download_rsvd_page(struct ieee80211_hw *hw)
 		rtl_write_byte(rtlpriv, REG_DWBCN0_CTRL + 2,
 			       bcnvalid_reg | BIT(0));
 
-		/* Return Beacon TCB */
-		_rtl92ee_return_beacon_queue_skb(hw);
-
 		/* download rsvd page */
 		rtl92ee_set_fw_rsvdpagepkt(hw, false);
 
@@ -1163,6 +1137,140 @@ void rtl92ee_enable_hw_security_config(struct ieee80211_hw *hw)
 	rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_WPA_CONFIG, &sec_reg_value);
 }
 
+static bool _rtl8192ee_check_pcie_dma_hang(struct rtl_priv *rtlpriv)
+{
+	u8 tmp;
+
+	/* write reg 0x350 Bit[26]=1. Enable debug port. */
+	tmp = rtl_read_byte(rtlpriv, REG_BACKDOOR_DBI_DATA + 3);
+	if (!(tmp & BIT(2))) {
+		rtl_write_byte(rtlpriv, REG_BACKDOOR_DBI_DATA + 3,
+			       (tmp | BIT(2)));
+		mdelay(100); /* Suggested by DD Justin_tsai. */
+	}
+
+	/* read reg 0x350 Bit[25] if 1 : RX hang
+	 * read reg 0x350 Bit[24] if 1 : TX hang
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_BACKDOOR_DBI_DATA + 3);
+	if ((tmp & BIT(0)) || (tmp & BIT(1))) {
+		RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
+			 "CheckPcieDMAHang8192EE(): true!!\n");
+		return true;
+	}
+	return false;
+}
+
+static void _rtl8192ee_reset_pcie_interface_dma(struct rtl_priv *rtlpriv,
+						bool mac_power_on)
+{
+	u8 tmp;
+	bool release_mac_rx_pause;
+	u8 backup_pcie_dma_pause;
+
+	RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
+		 "ResetPcieInterfaceDMA8192EE()\n");
+
+	/* Revise Note: Follow the document "PCIe RX DMA Hang Reset Flow_v03"
+	 * released by SD1 Alan.
+	 * 2013.05.07, by tynli.
+	 */
+
+	/* 1. disable register write lock
+	 *	write 0x1C bit[1:0] = 2'h0
+	 *	write 0xCC bit[2] = 1'b1
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_RSV_CTRL);
+	tmp &= ~(BIT(1) | BIT(0));
+	rtl_write_byte(rtlpriv, REG_RSV_CTRL, tmp);
+	tmp = rtl_read_byte(rtlpriv, REG_PMC_DBG_CTRL2);
+	tmp |= BIT(2);
+	rtl_write_byte(rtlpriv, REG_PMC_DBG_CTRL2, tmp);
+
+	/* 2. Check and pause TRX DMA
+	 *	write 0x284 bit[18] = 1'b1
+	 *	write 0x301 = 0xFF
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_RXDMA_CONTROL);
+	if (tmp & BIT(2)) {
+		/* Already pause before the function for another purpose. */
+		release_mac_rx_pause = false;
+	} else {
+		rtl_write_byte(rtlpriv, REG_RXDMA_CONTROL, (tmp | BIT(2)));
+		release_mac_rx_pause = true;
+	}
+
+	backup_pcie_dma_pause = rtl_read_byte(rtlpriv, REG_PCIE_CTRL_REG + 1);
+	if (backup_pcie_dma_pause != 0xFF)
+		rtl_write_byte(rtlpriv, REG_PCIE_CTRL_REG + 1, 0xFF);
+
+	if (mac_power_on) {
+		/* 3. reset TRX function
+		 *	write 0x100 = 0x00
+		 */
+		rtl_write_byte(rtlpriv, REG_CR, 0);
+	}
+
+	/* 4. Reset PCIe DMA
+	 *	write 0x003 bit[0] = 0
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_SYS_FUNC_EN + 1);
+	tmp &= ~(BIT(0));
+	rtl_write_byte(rtlpriv, REG_SYS_FUNC_EN + 1, tmp);
+
+	/* 5. Enable PCIe DMA
+	 *	write 0x003 bit[0] = 1
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_SYS_FUNC_EN + 1);
+	tmp |= BIT(0);
+	rtl_write_byte(rtlpriv, REG_SYS_FUNC_EN + 1, tmp);
+
+	if (mac_power_on) {
+		/* 6. enable TRX function
+		 *	write 0x100 = 0xFF
+		 */
+		rtl_write_byte(rtlpriv, REG_CR, 0xFF);
+
+		/* We should init LLT & RQPN and
+		 * prepare Tx/Rx descrptor address later
+		 * because MAC function is reset.
+		 */
+	}
+
+	/* 7. Restore PCIe autoload down bit
+	 *	write 0xF8 bit[17] = 1'b1
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_MAC_PHY_CTRL_NORMAL + 2);
+	tmp |= BIT(1);
+	rtl_write_byte(rtlpriv, REG_MAC_PHY_CTRL_NORMAL + 2, tmp);
+
+	/* In MAC power on state, BB and RF maybe in ON state,
+	 * if we release TRx DMA here
+	 * it will cause packets to be started to Tx/Rx,
+	 * so we release Tx/Rx DMA later.
+	 */
+	if (!mac_power_on) {
+		/* 8. release TRX DMA
+		 *	write 0x284 bit[18] = 1'b0
+		 *	write 0x301 = 0x00
+		 */
+		if (release_mac_rx_pause) {
+			tmp = rtl_read_byte(rtlpriv, REG_RXDMA_CONTROL);
+			rtl_write_byte(rtlpriv, REG_RXDMA_CONTROL,
+				       (tmp & (~BIT(2))));
+		}
+		rtl_write_byte(rtlpriv, REG_PCIE_CTRL_REG + 1,
+			       backup_pcie_dma_pause);
+	}
+
+	/* 9. lock system register
+	 *	write 0xCC bit[2] = 1'b0
+	 */
+	tmp = rtl_read_byte(rtlpriv, REG_PMC_DBG_CTRL2);
+	tmp &= ~(BIT(2));
+	rtl_write_byte(rtlpriv, REG_PMC_DBG_CTRL2, tmp);
+}
+
 int rtl92ee_hw_init(struct ieee80211_hw *hw)
 {
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
@@ -1188,6 +1296,13 @@ int rtl92ee_hw_init(struct ieee80211_hw *hw)
 		rtlhal->fw_ps_state = FW_PS_STATE_ALL_ON_92E;
 	}
 
+	if (_rtl8192ee_check_pcie_dma_hang(rtlpriv)) {
+		RT_TRACE(rtlpriv, COMP_INIT, DBG_DMESG, "92ee dma hang!\n");
+		_rtl8192ee_reset_pcie_interface_dma(rtlpriv,
+						    rtlhal->mac_func_enable);
+		rtlhal->mac_func_enable = false;
+	}
+
 	rtstatus = _rtl92ee_init_mac(hw);
 
 	rtl_write_byte(rtlpriv, 0x577, 0x03);
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/reg.h b/drivers/net/wireless/rtlwifi/rtl8192ee/reg.h
index 3f2a959..696ae188 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/reg.h
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/reg.h
@@ -77,9 +77,11 @@
 #define REG_HIMRE				0x00B8
 #define REG_HISRE				0x00BC
 
+#define REG_PMC_DBG_CTRL2			0x00CC
 #define REG_EFUSE_ACCESS			0x00CF
 #define REG_HPON_FSM				0x00EC
 #define REG_SYS_CFG1				0x00F0
+#define REG_MAC_PHY_CTRL_NORMAL		0x00F8
 #define REG_SYS_CFG2				0x00FC
 
 #define REG_CR					0x0100
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/sw.c b/drivers/net/wireless/rtlwifi/rtl8192ee/sw.c
index f30c916..100d6fc 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/sw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/sw.c
@@ -114,8 +114,6 @@ int rtl92ee_init_sw_vars(struct ieee80211_hw *hw)
 				  RCR_AMF			|
 				  RCR_ACF			|
 				  RCR_ADF			|
-				  RCR_AICV			|
-				  RCR_ACRC32			|
 				  RCR_AB			|
 				  RCR_AM			|
 				  RCR_APM			|
@@ -241,6 +239,7 @@ static struct rtl_hal_ops rtl8192ee_hal_ops = {
 	.set_desc = rtl92ee_set_desc,
 	.get_desc = rtl92ee_get_desc,
 	.is_tx_desc_closed = rtl92ee_is_tx_desc_closed,
+	.get_available_desc = rtl92ee_get_available_desc,
 	.tx_polling = rtl92ee_tx_polling,
 	.enable_hw_sec = rtl92ee_enable_hw_security_config,
 	.init_sw_leds = rtl92ee_init_sw_leds,
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
index 51806ac..ee1df82 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.c
@@ -354,6 +354,11 @@ bool rtl92ee_rx_query_desc(struct ieee80211_hw *hw,
 	struct ieee80211_hdr *hdr;
 	u32 phystatus = GET_RX_DESC_PHYST(pdesc);
 
+	if (GET_RX_STATUS_DESC_RPT_SEL(pdesc) == 0)
+		status->packet_report_type = NORMAL_RX;
+	else
+		status->packet_report_type = C2H_PACKET;
+
 	status->length = (u16)GET_RX_DESC_PKT_LEN(pdesc);
 	status->rx_drvinfo_size = (u8)GET_RX_DESC_DRV_INFO_SIZE(pdesc) *
 				  RX_DRV_INFO_SIZE_UNIT;
@@ -472,44 +477,22 @@ u16 rtl92ee_rx_desc_buff_remained_cnt(struct ieee80211_hw *hw, u8 queue_index)
 {
 	struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
-	u16 read_point = 0, write_point = 0, remind_cnt = 0;
-	u32 tmp_4byte = 0;
-	static u16 last_read_point;
-	static bool start_rx;
-
-	tmp_4byte = rtl_read_dword(rtlpriv, REG_RXQ_TXBD_IDX);
-	read_point = (u16)((tmp_4byte>>16) & 0x7ff);
-	write_point = (u16)(tmp_4byte & 0x7ff);
-
-	if (write_point != rtlpci->rx_ring[queue_index].next_rx_rp) {
-		RT_TRACE(rtlpriv, COMP_RXDESC, DBG_DMESG,
-			 "!!!write point is 0x%x, reg 0x3B4 value is 0x%x\n",
-			  write_point, tmp_4byte);
-		tmp_4byte = rtl_read_dword(rtlpriv, REG_RXQ_TXBD_IDX);
-		read_point = (u16)((tmp_4byte>>16) & 0x7ff);
-		write_point = (u16)(tmp_4byte & 0x7ff);
-	}
-
-	if (read_point > 0)
-		start_rx = true;
-	if (!start_rx)
-		return 0;
+	u16 desc_idx_hw = 0, desc_idx_host = 0, remind_cnt = 0;
+	u32 tmp_4byte = rtl_read_dword(rtlpriv, REG_RXQ_TXBD_IDX);
+	u32 rw_mask = 0x1ff;
 
-	if ((last_read_point > (RX_DESC_NUM_92E / 2)) &&
-	    (read_point <= (RX_DESC_NUM_92E / 2))) {
-		remind_cnt = RX_DESC_NUM_92E - write_point;
-	} else {
-		remind_cnt = (read_point >= write_point) ?
-			     (read_point - write_point) :
-			     (RX_DESC_NUM_92E - write_point + read_point);
-	}
+	desc_idx_hw = (u16)((tmp_4byte>>16) & rw_mask);
+	desc_idx_host = (u16)(tmp_4byte & rw_mask);
 
-	if (remind_cnt == 0)
+	/* may be no data, donot rx */
+	if (desc_idx_hw == desc_idx_host)
 		return 0;
 
-	rtlpci->rx_ring[queue_index].next_rx_rp = write_point;
+	remind_cnt = (desc_idx_hw > desc_idx_host) ?
+		     (desc_idx_hw - desc_idx_host) :
+		     (RX_DESC_NUM_92E - (desc_idx_host - desc_idx_hw));
 
-	last_read_point = read_point;
+	rtlpci->rx_ring[queue_index].next_rx_rp = desc_idx_host;
 	return remind_cnt;
 }
 
@@ -551,7 +534,8 @@ static u16 get_desc_addr_fr_q_idx(u16 queue_index)
 	return desc_address;
 }
 
-void rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 q_idx)
+/*free  desc that can be used */
+u16 rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 q_idx)
 {
 	struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
@@ -561,15 +545,25 @@ void rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 q_idx)
 
 	tmp_4byte = rtl_read_dword(rtlpriv,
 				   get_desc_addr_fr_q_idx(q_idx));
-	current_tx_read_point = (u16)((tmp_4byte >> 16) & 0x0fff);
-	current_tx_write_point = (u16)((tmp_4byte) & 0x0fff);
+	current_tx_read_point = (u16)((tmp_4byte >> 16) & 0x01ff);
+	current_tx_write_point = (u16)((tmp_4byte) & 0x01ff);
+
+	if (current_tx_read_point == current_tx_write_point)
+		point_diff = TX_DESC_NUM_92E - 1;
+	else if (current_tx_read_point < current_tx_write_point)
+		point_diff = TX_DESC_NUM_92E - (current_tx_write_point -
+			     current_tx_read_point) - 1;
+	else
+		point_diff = current_tx_read_point - current_tx_write_point - 1;
 
-	point_diff = ((current_tx_read_point > current_tx_write_point) ?
-		      (current_tx_read_point - current_tx_write_point) :
-		      (TX_DESC_NUM_92E - current_tx_write_point +
-		       current_tx_read_point));
+	if (0 == point_diff) {
+		RT_TRACE(rtlpriv, COMP_SEND, DBG_DMESG,
+			 "CR:%d,CW:%d\n",
+			 current_tx_read_point, current_tx_write_point);
+	}
 
 	rtlpci->tx_ring[q_idx].avl_desc = point_diff;
+	return point_diff;
 }
 
 void rtl92ee_pre_fill_tx_bd_desc(struct ieee80211_hw *hw,
@@ -706,7 +700,7 @@ void rtl92ee_tx_fill_desc(struct ieee80211_hw *hw,
 	mapping = pci_map_single(rtlpci->pdev, skb->data, skb->len,
 				 PCI_DMA_TODEVICE);
 	if (pci_dma_mapping_error(rtlpci->pdev, mapping)) {
-		RT_TRACE(rtlpriv, COMP_SEND, DBG_TRACE,
+		RT_TRACE(rtlpriv, COMP_SEND, DBG_DMESG,
 			 "DMA mapping error");
 		return;
 	}
@@ -870,7 +864,7 @@ void rtl92ee_tx_fill_cmddesc(struct ieee80211_hw *hw,
 	u8 txdesc_len = 40;
 
 	if (pci_dma_mapping_error(rtlpci->pdev, mapping)) {
-		RT_TRACE(rtlpriv, COMP_SEND, DBG_TRACE,
+		RT_TRACE(rtlpriv, COMP_SEND, DBG_DMESG,
 			 "DMA mapping error");
 		return;
 	}
@@ -918,8 +912,6 @@ void rtl92ee_set_desc(struct ieee80211_hw *hw, u8 *pdesc, bool istx,
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
 	u16 cur_tx_rp = 0;
 	u16 cur_tx_wp = 0;
-	static u16 last_txw_point;
-	static bool over_run;
 	u32 tmp = 0;
 	u8 q_idx = *val;
 
@@ -932,6 +924,7 @@ void rtl92ee_set_desc(struct ieee80211_hw *hw, u8 *pdesc, bool istx,
 			struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
 			struct rtl8192_tx_ring *ring = &rtlpci->tx_ring[q_idx];
 			u16 max_tx_desc = ring->entries;
+			u16 point_diff = 0;
 
 			if (q_idx == BEACON_QUEUE) {
 				ring->cur_tx_wp = 0;
@@ -940,44 +933,32 @@ void rtl92ee_set_desc(struct ieee80211_hw *hw, u8 *pdesc, bool istx,
 				return;
 			}
 
+			tmp = rtl_read_dword(rtlpriv,
+					     get_desc_addr_fr_q_idx(q_idx));
+			cur_tx_rp = (u16)((tmp >> 16) & 0x0fff);
+			cur_tx_wp = (u16)(tmp & 0x0fff);
+
+			ring->cur_tx_wp = cur_tx_wp;
+			ring->cur_tx_rp = cur_tx_rp;
+			point_diff = ((cur_tx_rp > cur_tx_wp) ?
+					      (cur_tx_rp - cur_tx_wp) :
+					      (TX_DESC_NUM_92E - 1 -
+					       cur_tx_wp + cur_tx_rp));
+
+			ring->avl_desc = point_diff;
+
 			ring->cur_tx_wp = ((ring->cur_tx_wp + 1) % max_tx_desc);
 
-			if (over_run) {
-				ring->cur_tx_wp = 0;
-				over_run = false;
-			}
-			if (ring->avl_desc > 1) {
+			if (ring->avl_desc >= 1) {
 				ring->avl_desc--;
-
 				rtl_write_word(rtlpriv,
 					       get_desc_addr_fr_q_idx(q_idx),
 					       ring->cur_tx_wp);
-
-				if (q_idx == 1)
-					last_txw_point = cur_tx_wp;
-			}
-
-			if (ring->avl_desc < (max_tx_desc - 15)) {
-				u16 point_diff = 0;
-
-				tmp =
-				  rtl_read_dword(rtlpriv,
-						 get_desc_addr_fr_q_idx(q_idx));
-				cur_tx_rp = (u16)((tmp >> 16) & 0x0fff);
-				cur_tx_wp = (u16)(tmp & 0x0fff);
-
-				ring->cur_tx_wp = cur_tx_wp;
-				ring->cur_tx_rp = cur_tx_rp;
-				point_diff = ((cur_tx_rp > cur_tx_wp) ?
-					      (cur_tx_rp - cur_tx_wp) :
-					      (TX_DESC_NUM_92E - 1 -
-					       cur_tx_wp + cur_tx_rp));
-
-				ring->avl_desc = point_diff;
+			} else {
+				pr_err("critical error! ring->avl_desc == 0\n");
 			}
 		}
-		break;
-		}
+		break; }
 	} else {
 		switch (desc_name) {
 		case HW_DESC_RX_PREPARE:
@@ -1043,38 +1024,29 @@ bool rtl92ee_is_tx_desc_closed(struct ieee80211_hw *hw, u8 hw_queue, u16 index)
 {
 	struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
-	u16 read_point, write_point, available_desc_num;
+	u16 read_point, write_point;
 	bool ret = false;
-	static u8 stop_report_cnt;
 	struct rtl8192_tx_ring *ring = &rtlpci->tx_ring[hw_queue];
+	u16 cur_tx_rp, cur_tx_wp;
+	u32 tmpu32 = 0;
+
+	tmpu32 = rtl_read_dword(rtlpriv,
+				get_desc_addr_fr_q_idx(hw_queue));
+	cur_tx_rp = (u16)((tmpu32 >> 16) & 0x01ff);
+	cur_tx_wp = (u16)(tmpu32 & 0x01ff);
+
+	ring->cur_tx_wp = cur_tx_wp;
+	ring->cur_tx_rp = cur_tx_rp;
+	ring->avl_desc = ((cur_tx_rp > cur_tx_wp) ? (cur_tx_rp - cur_tx_wp) :
+			  (TX_DESC_NUM_92E - cur_tx_wp + cur_tx_rp));
 
-	/*checking Read/Write Point each interrupt wastes CPU */
-	if (stop_report_cnt > 15 || !rtlpriv->link_info.busytraffic) {
-		u16 point_diff = 0;
-		u16 cur_tx_rp, cur_tx_wp;
-		u32 tmpu32 = 0;
-
-		tmpu32 =
-		  rtl_read_dword(rtlpriv,
-				 get_desc_addr_fr_q_idx(hw_queue));
-		cur_tx_rp = (u16)((tmpu32 >> 16) & 0x0fff);
-		cur_tx_wp = (u16)(tmpu32 & 0x0fff);
-
-		ring->cur_tx_wp = cur_tx_wp;
-		ring->cur_tx_rp = cur_tx_rp;
-		point_diff = ((cur_tx_rp > cur_tx_wp) ?
-			      (cur_tx_rp - cur_tx_wp) :
-			      (TX_DESC_NUM_92E - cur_tx_wp + cur_tx_rp));
-
-		ring->avl_desc = point_diff;
-	}
 
 	read_point = ring->cur_tx_rp;
 	write_point = ring->cur_tx_wp;
-	available_desc_num = ring->avl_desc;
+
 
 	if (write_point > read_point) {
-		if (index < write_point && index >= read_point)
+		if (index <= write_point && index >= read_point)
 			ret = false;
 		else
 			ret = true;
@@ -1095,13 +1067,6 @@ bool rtl92ee_is_tx_desc_closed(struct ieee80211_hw *hw, u8 hw_queue, u16 index)
 	    rtlpriv->psc.rfoff_reason > RF_CHANGE_BY_PS)
 		ret = true;
 
-	if (hw_queue < BEACON_QUEUE) {
-		if (!ret)
-			stop_report_cnt++;
-		else
-			stop_report_cnt = 0;
-	}
-
 	return ret;
 }
 
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
index 45fd9db..8f78ac9 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
+++ b/drivers/net/wireless/rtlwifi/rtl8192ee/trx.h
@@ -542,6 +542,8 @@
 	LE_BITS_TO_4BYTE(__pdesc+8, 12, 4)
 #define GET_RX_DESC_RX_IS_QOS(__pdesc)			\
 	LE_BITS_TO_4BYTE(__pdesc+8, 16, 1)
+#define GET_RX_STATUS_DESC_RPT_SEL(__pdesc)		\
+	LE_BITS_TO_4BYTE(__pdesc+8, 28, 1)
 
 #define GET_RX_DESC_RXMCS(__pdesc)			\
 	LE_BITS_TO_4BYTE(__pdesc+12, 0, 7)
@@ -829,7 +831,7 @@ void rtl92ee_rx_check_dma_ok(struct ieee80211_hw *hw, u8 *header_desc,
 			     u8 queue_index);
 u16	rtl92ee_rx_desc_buff_remained_cnt(struct ieee80211_hw *hw,
 					  u8 queue_index);
-void rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 queue_index);
+u16 rtl92ee_get_available_desc(struct ieee80211_hw *hw, u8 queue_index);
 void rtl92ee_pre_fill_tx_bd_desc(struct ieee80211_hw *hw,
 				 u8 *tx_bd_desc, u8 *desc, u8 queue_index,
 				 struct sk_buff *skb, dma_addr_t addr);
-- 
2.1.2


* [patch net-next v4 1/2] tc: add BPF based action
From: Jiri Pirko @ 2015-01-14 17:43 UTC (permalink / raw)
  To: netdev; +Cc: davem, jhs, dborkman, ast, hannes

This action makes it possible to execute custom BPF code.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
v3->v4:
 - fixed Kconfig typo spotted by Daniel
 - added some description to Kconfig as suggested by Daniel
 - fixed code flow in tcf_bpf to avoid gotos as suggested by Daniel
 - drop retval changed from -1 to 0 as suggested by Daniel and agreed to by Alexei
 - added a little comment to tcf_bpf drop code as suggested by Daniel
v2->v3:
 - s/bpf_len/bpf_num_ops/ per DaveM's suggestion
v1->v2:
 - fixed error path in _init
 - added cleanup function to kill filter prog
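
A minimal sketch of the v4 drop convention, for readers skimming the
changelog. Only "a result of 0 means drop" comes from the notes above;
the enum, the function name and the handling of non-zero results
(falling back to the action's configured opcode) are assumptions made
for illustration, and the patch body remains authoritative.

```c
/* Hypothetical verdicts, not the kernel's TC_ACT_* values. */
enum sketch_verdict {
	SKETCH_DROP,
	SKETCH_CONFIGURED_ACTION,
};

/* Interpret the value returned by the BPF program. A program that is
 * terminated early (out-of-bounds load, divide by zero) returns 0, so
 * 0 is the safe choice for "drop".
 */
static inline enum sketch_verdict sketch_bpf_verdict(unsigned int filter_res)
{
	return filter_res == 0 ? SKETCH_DROP : SKETCH_CONFIGURED_ACTION;
}
```

This matches the behaviour of socket filters, where a return value of 0
also means the packet is discarded.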
---
 include/net/tc_act/tc_bpf.h        |  25 +++++
 include/uapi/linux/tc_act/Kbuild   |   1 +
 include/uapi/linux/tc_act/tc_bpf.h |  31 ++++++
 net/sched/Kconfig                  |  12 +++
 net/sched/Makefile                 |   1 +
 net/sched/act_bpf.c                | 205 +++++++++++++++++++++++++++++++++++++
 6 files changed, 275 insertions(+)
 create mode 100644 include/net/tc_act/tc_bpf.h
 create mode 100644 include/uapi/linux/tc_act/tc_bpf.h
 create mode 100644 net/sched/act_bpf.c

diff --git a/include/net/tc_act/tc_bpf.h b/include/net/tc_act/tc_bpf.h
new file mode 100644
index 0000000..86a070f
--- /dev/null
+++ b/include/net/tc_act/tc_bpf.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (c) 2015 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __NET_TC_BPF_H
+#define __NET_TC_BPF_H
+
+#include <linux/filter.h>
+#include <net/act_api.h>
+
+struct tcf_bpf {
+	struct tcf_common	common;
+	struct bpf_prog		*filter;
+	struct sock_filter	*bpf_ops;
+	u16			bpf_num_ops;
+};
+#define to_bpf(a) \
+	container_of(a->priv, struct tcf_bpf, common)
+
+#endif /* __NET_TC_BPF_H */
diff --git a/include/uapi/linux/tc_act/Kbuild b/include/uapi/linux/tc_act/Kbuild
index b057da2..19d5219 100644
--- a/include/uapi/linux/tc_act/Kbuild
+++ b/include/uapi/linux/tc_act/Kbuild
@@ -8,3 +8,4 @@ header-y += tc_nat.h
 header-y += tc_pedit.h
 header-y += tc_skbedit.h
 header-y += tc_vlan.h
+header-y += tc_bpf.h
diff --git a/include/uapi/linux/tc_act/tc_bpf.h b/include/uapi/linux/tc_act/tc_bpf.h
new file mode 100644
index 0000000..5288bd77
--- /dev/null
+++ b/include/uapi/linux/tc_act/tc_bpf.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (c) 2015 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __LINUX_TC_BPF_H
+#define __LINUX_TC_BPF_H
+
+#include <linux/pkt_cls.h>
+
+#define TCA_ACT_BPF 13
+
+struct tc_act_bpf {
+	tc_gen;
+};
+
+enum {
+	TCA_ACT_BPF_UNSPEC,
+	TCA_ACT_BPF_TM,
+	TCA_ACT_BPF_PARMS,
+	TCA_ACT_BPF_OPS_LEN,
+	TCA_ACT_BPF_OPS,
+	__TCA_ACT_BPF_MAX,
+};
+#define TCA_ACT_BPF_MAX (__TCA_ACT_BPF_MAX - 1)
+
+#endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index c54c9d9..29a0f95 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -698,6 +698,18 @@ config NET_ACT_VLAN
 	  To compile this code as a module, choose M here: the
 	  module will be called act_vlan.
 
+config NET_ACT_BPF
+	tristate "BPF based action"
+	depends on NET_CLS_ACT
+	---help---
+	  Say Y here to execute BPF code on packets. The BPF code will decide
+	  if the packet should be dropped or not.
+
+	  If unsure, say N.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called act_bpf.
+
 config NET_CLS_IND
 	bool "Incoming device classification"
 	depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 679f24a..7ca2b4e 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_NET_ACT_SIMP)	+= act_simple.o
 obj-$(CONFIG_NET_ACT_SKBEDIT)	+= act_skbedit.o
 obj-$(CONFIG_NET_ACT_CSUM)	+= act_csum.o
 obj-$(CONFIG_NET_ACT_VLAN)	+= act_vlan.o
+obj-$(CONFIG_NET_ACT_BPF)	+= act_bpf.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
new file mode 100644
index 0000000..1bd257e
--- /dev/null
+++ b/net/sched/act_bpf.c
@@ -0,0 +1,205 @@
+/*
+ * Copyright (c) 2015 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/filter.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+
+#include <linux/tc_act/tc_bpf.h>
+#include <net/tc_act/tc_bpf.h>
+
+#define BPF_TAB_MASK     15
+
+static int tcf_bpf(struct sk_buff *skb, const struct tc_action *a,
+		   struct tcf_result *res)
+{
+	struct tcf_bpf *b = a->priv;
+	int action;
+	int filter_res;
+
+	spin_lock(&b->tcf_lock);
+	b->tcf_tm.lastuse = jiffies;
+	bstats_update(&b->tcf_bstats, skb);
+	action = b->tcf_action;
+
+	filter_res = BPF_PROG_RUN(b->filter, skb);
+	if (filter_res == 0) {
+		/* Return code 0 from the BPF program
+		 * is being interpreted as a drop here.
+		 */
+		action = TC_ACT_SHOT;
+		b->tcf_qstats.drops++;
+	}
+
+	spin_unlock(&b->tcf_lock);
+	return action;
+}
+
+static int tcf_bpf_dump(struct sk_buff *skb, struct tc_action *a,
+			int bind, int ref)
+{
+	unsigned char *tp = skb_tail_pointer(skb);
+	struct tcf_bpf *b = a->priv;
+	struct tc_act_bpf opt = {
+		.index    = b->tcf_index,
+		.refcnt   = b->tcf_refcnt - ref,
+		.bindcnt  = b->tcf_bindcnt - bind,
+		.action   = b->tcf_action,
+	};
+	struct tcf_t t;
+	struct nlattr *nla;
+
+	if (nla_put(skb, TCA_ACT_BPF_PARMS, sizeof(opt), &opt))
+		goto nla_put_failure;
+
+	if (nla_put_u16(skb, TCA_ACT_BPF_OPS_LEN, b->bpf_num_ops))
+		goto nla_put_failure;
+
+	nla = nla_reserve(skb, TCA_ACT_BPF_OPS, b->bpf_num_ops *
+			  sizeof(struct sock_filter));
+	if (!nla)
+		goto nla_put_failure;
+
+	memcpy(nla_data(nla), b->bpf_ops, nla_len(nla));
+
+	t.install = jiffies_to_clock_t(jiffies - b->tcf_tm.install);
+	t.lastuse = jiffies_to_clock_t(jiffies - b->tcf_tm.lastuse);
+	t.expires = jiffies_to_clock_t(b->tcf_tm.expires);
+	if (nla_put(skb, TCA_ACT_BPF_TM, sizeof(t), &t))
+		goto nla_put_failure;
+	return skb->len;
+
+nla_put_failure:
+	nlmsg_trim(skb, tp);
+	return -1;
+}
+
+static const struct nla_policy act_bpf_policy[TCA_ACT_BPF_MAX + 1] = {
+	[TCA_ACT_BPF_PARMS]	= { .len = sizeof(struct tc_act_bpf) },
+	[TCA_ACT_BPF_OPS_LEN]	= { .type = NLA_U16 },
+	[TCA_ACT_BPF_OPS]	= { .type = NLA_BINARY,
+				    .len = sizeof(struct sock_filter) * BPF_MAXINSNS },
+};
+
+static int tcf_bpf_init(struct net *net, struct nlattr *nla,
+			struct nlattr *est, struct tc_action *a,
+			int ovr, int bind)
+{
+	struct nlattr *tb[TCA_ACT_BPF_MAX + 1];
+	struct tc_act_bpf *parm;
+	struct tcf_bpf *b;
+	u16 bpf_size, bpf_num_ops;
+	struct sock_filter *bpf_ops;
+	struct sock_fprog_kern tmp;
+	struct bpf_prog *fp;
+	int ret;
+
+	if (!nla)
+		return -EINVAL;
+
+	ret = nla_parse_nested(tb, TCA_ACT_BPF_MAX, nla, act_bpf_policy);
+	if (ret < 0)
+		return ret;
+
+	if (!tb[TCA_ACT_BPF_PARMS] ||
+	    !tb[TCA_ACT_BPF_OPS_LEN] || !tb[TCA_ACT_BPF_OPS])
+		return -EINVAL;
+	parm = nla_data(tb[TCA_ACT_BPF_PARMS]);
+
+	bpf_num_ops = nla_get_u16(tb[TCA_ACT_BPF_OPS_LEN]);
+	if (bpf_num_ops > BPF_MAXINSNS || bpf_num_ops == 0)
+		return -EINVAL;
+
+	bpf_size = bpf_num_ops * sizeof(*bpf_ops);
+	bpf_ops = kzalloc(bpf_size, GFP_KERNEL);
+	if (!bpf_ops)
+		return -ENOMEM;
+
+	memcpy(bpf_ops, nla_data(tb[TCA_ACT_BPF_OPS]), bpf_size);
+
+	tmp.len = bpf_num_ops;
+	tmp.filter = bpf_ops;
+
+	ret = bpf_prog_create(&fp, &tmp);
+	if (ret)
+		goto free_bpf_ops;
+
+	if (!tcf_hash_check(parm->index, a, bind)) {
+		ret = tcf_hash_create(parm->index, est, a, sizeof(*b), bind);
+		if (ret)
+			goto destroy_fp;
+
+		ret = ACT_P_CREATED;
+	} else {
+		if (bind)
+			goto destroy_fp;
+		tcf_hash_release(a, bind);
+		if (!ovr) {
+			ret = -EEXIST;
+			goto destroy_fp;
+		}
+	}
+
+	b = to_bpf(a);
+	spin_lock_bh(&b->tcf_lock);
+	b->tcf_action = parm->action;
+	b->bpf_num_ops = bpf_num_ops;
+	b->bpf_ops = bpf_ops;
+	b->filter = fp;
+	spin_unlock_bh(&b->tcf_lock);
+
+	if (ret == ACT_P_CREATED)
+		tcf_hash_insert(a);
+	return ret;
+
+destroy_fp:
+	bpf_prog_destroy(fp);
+free_bpf_ops:
+	kfree(bpf_ops);
+	return ret;
+}
+
+static void tcf_bpf_cleanup(struct tc_action *a, int bind)
+{
+	struct tcf_bpf *b = a->priv;
+
+	bpf_prog_destroy(b->filter);
+}
+
+static struct tc_action_ops act_bpf_ops = {
+	.kind =		"bpf",
+	.type =		TCA_ACT_BPF,
+	.owner =	THIS_MODULE,
+	.act =		tcf_bpf,
+	.dump =		tcf_bpf_dump,
+	.cleanup =	tcf_bpf_cleanup,
+	.init =		tcf_bpf_init,
+};
+
+static int __init bpf_init_module(void)
+{
+	return tcf_register_action(&act_bpf_ops, BPF_TAB_MASK);
+}
+
+static void __exit bpf_cleanup_module(void)
+{
+	tcf_unregister_action(&act_bpf_ops);
+}
+
+module_init(bpf_init_module);
+module_exit(bpf_cleanup_module);
+
+MODULE_AUTHOR("Jiri Pirko <jiri@resnulli.us>");
+MODULE_DESCRIPTION("TC BPF based action");
+MODULE_LICENSE("GPL v2");
-- 
1.9.3

* [patch net-next v4 2/2] tc: cls_bpf: rename bpf_len to bpf_num_ops
From: Jiri Pirko @ 2015-01-14 17:43 UTC (permalink / raw)
  To: netdev; +Cc: davem, jhs, dborkman, ast, hannes
In-Reply-To: <1421257404-25452-1-git-send-email-jiri@resnulli.us>

DaveM suggested changing the name, as "len" might be read as a size in
bytes rather than a number of BPF instructions.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
---
v3->v4:
 - no change
patch added in v3
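
To illustrate the distinction the rename is after, here is a small
user-space sketch of how an instruction count relates to a byte size.
It is not part of the patch: struct sock_filter_sketch is a local copy
of the struct sock_filter layout, and bpf_ops_bytes() and
bpf_ops_bytes_demo() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct sock_filter from include/uapi/linux/filter.h,
 * repeated only to keep the sketch self-contained.
 */
struct sock_filter_sketch {
	uint16_t code;	/* opcode */
	uint8_t  jt;	/* jump if true */
	uint8_t  jf;	/* jump if false */
	uint32_t k;	/* generic operand */
};

#define BPF_MAXINSNS 4096	/* same value as the kernel's limit */

/* Hypothetical helper: byte size of a program made of bpf_num_ops
 * instructions, or 0 if the count would be rejected by the same
 * range check the code applies (0 or more than BPF_MAXINSNS).
 */
static size_t bpf_ops_bytes(uint16_t bpf_num_ops)
{
	if (bpf_num_ops == 0 || bpf_num_ops > BPF_MAXINSNS)
		return 0;
	return (size_t)bpf_num_ops * sizeof(struct sock_filter_sketch);
}

/* Usage example: four instructions occupy 32 bytes, not 4. */
void bpf_ops_bytes_demo(void)
{
	assert(sizeof(struct sock_filter_sketch) == 8);
	assert(bpf_ops_bytes(4) == 32);
}
```

With 8-byte instructions, the largest accepted program is 32768 bytes,
which still fits the u16 bpf_size variable used in the code.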
---
 net/sched/cls_bpf.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 84c8219..1029923 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -37,7 +37,7 @@ struct cls_bpf_prog {
 	struct tcf_result res;
 	struct list_head link;
 	u32 handle;
-	u16 bpf_len;
+	u16 bpf_num_ops;
 	struct tcf_proto *tp;
 	struct rcu_head rcu;
 };
@@ -160,7 +160,7 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
 	struct tcf_exts exts;
 	struct sock_fprog_kern tmp;
 	struct bpf_prog *fp;
-	u16 bpf_size, bpf_len;
+	u16 bpf_size, bpf_num_ops;
 	u32 classid;
 	int ret;
 
@@ -173,13 +173,13 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
 		return ret;
 
 	classid = nla_get_u32(tb[TCA_BPF_CLASSID]);
-	bpf_len = nla_get_u16(tb[TCA_BPF_OPS_LEN]);
-	if (bpf_len > BPF_MAXINSNS || bpf_len == 0) {
+	bpf_num_ops = nla_get_u16(tb[TCA_BPF_OPS_LEN]);
+	if (bpf_num_ops > BPF_MAXINSNS || bpf_num_ops == 0) {
 		ret = -EINVAL;
 		goto errout;
 	}
 
-	bpf_size = bpf_len * sizeof(*bpf_ops);
+	bpf_size = bpf_num_ops * sizeof(*bpf_ops);
 	bpf_ops = kzalloc(bpf_size, GFP_KERNEL);
 	if (bpf_ops == NULL) {
 		ret = -ENOMEM;
@@ -188,14 +188,14 @@ static int cls_bpf_modify_existing(struct net *net, struct tcf_proto *tp,
 
 	memcpy(bpf_ops, nla_data(tb[TCA_BPF_OPS]), bpf_size);
 
-	tmp.len = bpf_len;
+	tmp.len = bpf_num_ops;
 	tmp.filter = bpf_ops;
 
 	ret = bpf_prog_create(&fp, &tmp);
 	if (ret)
 		goto errout_free;
 
-	prog->bpf_len = bpf_len;
+	prog->bpf_num_ops = bpf_num_ops;
 	prog->bpf_ops = bpf_ops;
 	prog->filter = fp;
 	prog->res.classid = classid;
@@ -303,10 +303,10 @@ static int cls_bpf_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
 
 	if (nla_put_u32(skb, TCA_BPF_CLASSID, prog->res.classid))
 		goto nla_put_failure;
-	if (nla_put_u16(skb, TCA_BPF_OPS_LEN, prog->bpf_len))
+	if (nla_put_u16(skb, TCA_BPF_OPS_LEN, prog->bpf_num_ops))
 		goto nla_put_failure;
 
-	nla = nla_reserve(skb, TCA_BPF_OPS, prog->bpf_len *
+	nla = nla_reserve(skb, TCA_BPF_OPS, prog->bpf_num_ops *
 			  sizeof(struct sock_filter));
 	if (nla == NULL)
 		goto nla_put_failure;
-- 
1.9.3

* [PATCH net-next 0/2] net: DSA fixes for bridge and ip-autoconf
From: Florian Fainelli @ 2015-01-14 17:52 UTC (permalink / raw)
  To: netdev; +Cc: Florian Fainelli, bridge, kaber, davem, buytenh

Hi David,

These two patches address some real world use cases of the DSA master and slave
network devices.

You have seen patch 1 before and rejected it because my explanation did
not justify why it is useful; hopefully the explanation is better this
time.

Patch 2 solves a different, yet very real problem as well at the bridge layer
when using DSA network devices.

Thanks!

Florian Fainelli (2):
  net: ipv4: handle DSA enabled master network devices
  net: bridge: reject DSA-enabled master netdevices as bridge members

 net/bridge/br_if.c  | 10 ++++++++--
 net/ipv4/ipconfig.c | 10 +++++++---
 2 files changed, 15 insertions(+), 5 deletions(-)

-- 
2.1.0

* [PATCH net-next 1/2] net: ipv4: handle DSA enabled master network devices
From: Florian Fainelli @ 2015-01-14 17:52 UTC (permalink / raw)
  To: netdev; +Cc: davem, stephen, kaber, bridge, buytenh, Florian Fainelli
In-Reply-To: <1421257932-11073-1-git-send-email-f.fainelli@gmail.com>

The logic to configure a network interface for kernel IP
auto-configuration is very simplistic, and does not handle the case
where a device is stacked onto another, such as with DSA. As a result,
the kernel does not open and configure the master network device in a
DSA switch tree, and slave network devices using this master network
device as their conduit device cannot be opened.

This restriction comes from a check in net/dsa/slave.c, which tests
the master netdev flags for IFF_UP and returns -ENETDOWN if the flag
is not set.

Automatically bringing up DSA master network devices allows DSA slave
network devices to be used as valid interfaces for, e.g., NFS root
booting, by allowing kernel IP autoconfiguration to succeed on these
interfaces.

On the reverse path, make sure we do not attempt to close a DSA-enabled
device, as this would implicitly prevent the slave DSA network device
from operating.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
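The two decisions described above can be sketched in user space as
follows. This is an illustration only and not part of the patch:
struct dev_sketch is a hypothetical stand-in for struct net_device, its
uses_dsa field models what netdev_uses_dsa() reports, and the helper
names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IFF_LOOPBACK	0x8	/* value as in include/uapi/linux/if.h */

/* Hypothetical stand-in for struct net_device. */
struct dev_sketch {
	unsigned int flags;
	bool uses_dsa;	/* models the result of netdev_uses_dsa() */
};

/* First loop of ic_open_devs(): devices that are brought up early. */
static bool ic_open_first(const struct dev_sketch *dev)
{
	assert(dev != NULL);
	return (dev->flags & IFF_LOOPBACK) || dev->uses_dsa;
}

/* ic_close_devs(): devices whose original flags are restored. The
 * chosen device (ic_dev) and DSA masters are left alone.
 */
static bool ic_should_close(const struct dev_sketch *dev,
			    const struct dev_sketch *ic_dev)
{
	assert(dev != NULL);
	return dev != ic_dev && !dev->uses_dsa;
}

/* Usage example with a DSA master and a slave chosen as ic_dev. */
void ic_dsa_demo(void)
{
	struct dev_sketch master = { 0, true };
	struct dev_sketch slave = { 0, false };

	assert(ic_open_first(&master));
	assert(!ic_should_close(&master, &slave));
	assert(!ic_should_close(&slave, &slave));
}
```
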
 net/ipv4/ipconfig.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 7fa18bc7e47f..d10073d2be0f 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -209,9 +209,9 @@ static int __init ic_open_devs(void)
 	last = &ic_first_dev;
 	rtnl_lock();
 
-	/* bring loopback device up first */
+	/* bring loopback and DSA master network devices up first */
 	for_each_netdev(&init_net, dev) {
-		if (!(dev->flags & IFF_LOOPBACK))
+		if (!(dev->flags & IFF_LOOPBACK) && !netdev_uses_dsa(dev))
 			continue;
 		if (dev_change_flags(dev, dev->flags | IFF_UP) < 0)
 			pr_err("IP-Config: Failed to open %s\n", dev->name);
@@ -228,6 +228,7 @@ static int __init ic_open_devs(void)
 			if (!(dev->flags & IFF_NOARP))
 				able |= IC_RARP;
 			able &= ic_proto_enabled;
+
 			if (ic_proto_enabled && !able)
 				continue;
 			oflags = dev->flags;
@@ -306,7 +307,10 @@ static void __init ic_close_devs(void)
 	while ((d = next)) {
 		next = d->next;
 		dev = d->dev;
-		if (dev != ic_dev) {
+		/* Only bring down unused devices and not DSA enabled master
+		 * devices
+		 */
+		if (dev != ic_dev && !netdev_uses_dsa(dev)) {
 			DBG(("IP-Config: Downing %s\n", dev->name));
 			dev_change_flags(dev, d->flags);
 		}
-- 
2.1.0
