Netdev List


* Re: [PATCH 1/2] ASIX: Simplify condition in rx_fixup()
From: David Miller @ 2011-07-28  5:40 UTC (permalink / raw)
  To: marek.vasut; +Cc: linux-kernel, netdev, linux-usb, gregkh
In-Reply-To: <1311734687-23551-1-git-send-email-marek.vasut@gmail.com>

From: Marek Vasut <marek.vasut@gmail.com>
Date: Wed, 27 Jul 2011 04:44:46 +0200

> Signed-off-by: Marek Vasut <marek.vasut@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 0/9] tg3: Add 4k workaround for 5719
From: David Miller @ 2011-07-28  5:39 UTC (permalink / raw)
  To: mcarlson; +Cc: netdev
In-Reply-To: <1311812454-18197-1-git-send-email-mcarlson@broadcom.com>

From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Wed, 27 Jul 2011 17:20:45 -0700

> This patchset adds a necessary 4k RDMA limit workaround for 5719 devices.

Series applied, thanks Matt.

^ permalink raw reply

* Could I export the udp socket security contexts to /proc/net/udp
From: Rongqing Li @ 2011-07-28  5:38 UTC (permalink / raw)
  To: netdev

Hi Linux-netdev folks:

Could I export the socket security contexts in the udp, tcp, raw,
and unix files under /proc/net/?

If not, could you tell me where and how I should export this
information?

The sk_security member of struct sock represents the socket
security context ID, which most of the time is inherited from the
process that creates the socket.

But when an SELinux type_transition rule is applied to the socket, or
the application sets /proc/xxx/attr/sockcreate, the socket security
context differs from that of the creating process. In that case
"netstat -Z" returns the wrong value, since it only reports the
process security context as the socket security context.

I want to fix "netstat -Z", but first the kernel must export this
information, just as /proc/xxx/attr/current exports the process
security context. Hence this request.

I look forward to your guidance.

Thanks.

-- 
Best Regards,
Roy | RongQing Li
-------------------------------------------------------------
WIND RIVER Beijing | China Development Center
Phone: +86-10-6483-5025, Cell: +86-135-2202-9864, Fax: +86-10-6479-0367

^ permalink raw reply

* Re: [PATCH 0/8] bna: Driver Fixes and Support for Re-architecture
From: David Miller @ 2011-07-28  5:32 UTC (permalink / raw)
  To: rmody; +Cc: netdev, adapter_linux_open_src_team
In-Reply-To: <1311732648-29876-1-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Tue, 26 Jul 2011 19:10:40 -0700

>    This patch-set consists of a few fixes, HW reg consolidation and adds support
>    for re-architecture and re-organisation of the driver.

Please do not mix bug fixes and feature changes.

If you do this, I can't put the bug fixes into the current release.

Separate out the real pure bug fixes into a separate series against
the main 'net' GIT tree.

Then you can submit your feature changes and cleanups separately for
'net-next'.

^ permalink raw reply

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
From: David Miller @ 2011-07-28  5:27 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev, bhutchings, jeffrey.t.kirsher
In-Reply-To: <20110727221759.8435.11589.stgit@gitlad.jf.intel.com>

From: Greg Rose <gregory.v.rose@intel.com>
Date: Wed, 27 Jul 2011 15:17:59 -0700

> Add new set commands to configure the number of SR-IOV VFs, the
> number of VM queues and spoof checking on/off switch.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> ---
> 
>  include/linux/ethtool.h |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index c6e427a..c4972ba 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -36,12 +36,14 @@ struct ethtool_cmd {
>  	__u8	mdio_support;
>  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> +	__u32	num_vmqs;	/* Set number of queues for VMDq */

You can't change the layout of this data structure in this way without
breaking every ethtool binary out there.

You have to find another place to add these knobs.
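
A minimal sketch of why this breaks, assuming simplified stand-in structs
(cmd_old and cmd_new are hypothetical and are not the real struct
ethtool_cmd). Inserting fields in the middle shifts the offset of every
later member, so a binary built against the old layout and a kernel built
against the new one disagree about where speed_hi lives:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the tail of a userspace-visible struct. */
struct cmd_old {
	uint32_t cmd;
	uint32_t maxtxpkt;
	uint32_t maxrxpkt;
	uint16_t speed_hi;
	uint8_t  eth_tp_mdix;
	uint8_t  reserved2;
};

/* Same struct with two fields inserted in the middle, as in the RFC. */
struct cmd_new {
	uint32_t cmd;
	uint32_t maxtxpkt;
	uint32_t maxrxpkt;
	uint32_t num_vfs;
	uint32_t num_vmqs;
	uint16_t speed_hi;
	uint8_t  eth_tp_mdix;
	uint8_t  reserved2;
};

/*
 * Model an old binary handing its buffer to code built against the
 * new layout: speed_hi is read at the new offset, which the old
 * binary never filled in.
 */
static uint16_t new_layout_reads_speed_hi(const struct cmd_old *from_user)
{
	unsigned char buf[sizeof(struct cmd_new)] = {0};
	uint16_t v;

	memcpy(buf, from_user, sizeof(*from_user));
	memcpy(&v, buf + offsetof(struct cmd_new, speed_hi), sizeof(v));
	return v;
}
```

With these stand-ins the struct grows by 8 bytes and speed_hi moves from
offset 12 to offset 20, so the value an old binary stored is no longer
where the new layout looks for it.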

^ permalink raw reply

* Re: [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init
From: David Miller @ 2011-07-28  5:26 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev, bhutchings, jeffrey.t.kirsher
In-Reply-To: <20110727221754.8435.99712.stgit@gitlad.jf.intel.com>

From: Greg Rose <gregory.v.rose@intel.com>
Date: Wed, 27 Jul 2011 15:17:54 -0700

> +	int i;
> +	for (i = 0; i < adapter->num_vfs; i++) {
> +		if (adapter->vfinfo[i].vfdev->dev_flags &
> +			PCI_DEV_FLAGS_ASSIGNED) {
> +		return true;
> +		}
> +	}

Bad formatting and indentation, please fix this.

> +		pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id, NULL);
> +		while (pvfdev) {
> +			if (pvfdev->devfn == thisvf_devfn)
> +				break;
> +			pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
> +						device_id, pvfdev);
> +		}
> +		if (pvfdev)
> +			adapter->vfinfo[vfn].vfdev = pvfdev;

pci_get_*() grabs a reference to any non-NULL pci device object
returned, where does this reference get released?  I scanned
all uses of x.vfdev and x->vfdev and could not find the necessary
release.
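
A userspace sketch of the reference-counting contract in question. All
names here (fake_dev, fake_get_device, fake_dev_put) are hypothetical
stand-ins, not kernel APIs. The model assumes the documented iterator
behaviour: the reference on the 'from' device is dropped and the returned
device comes back with its count raised, so breaking out of the loop
leaves the caller holding one reference:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: a refcount and a devfn. */
struct fake_dev {
	int refcount;
	unsigned int devfn;
};

static struct fake_dev bus[3] = { {1, 0x10}, {1, 0x12}, {1, 0x14} };

/* Stand-in for the release call the review asks about. */
static void fake_dev_put(struct fake_dev *dev)
{
	if (dev) {
		assert(dev->refcount > 0);
		dev->refcount--;
	}
}

/*
 * Models the iterator: drop the reference on 'from', return the next
 * device with its reference count raised, or NULL at the end.
 */
static struct fake_dev *fake_get_device(struct fake_dev *from)
{
	size_t next = from ? (size_t)(from - bus) + 1 : 0;

	fake_dev_put(from);
	if (next >= sizeof(bus) / sizeof(bus[0]))
		return NULL;
	bus[next].refcount++;
	return &bus[next];
}

/*
 * Same search loop shape as the quoted hunk. On a match the caller
 * owns one reference and must eventually call fake_dev_put().
 */
static struct fake_dev *find_by_devfn(unsigned int devfn)
{
	struct fake_dev *dev = fake_get_device(NULL);

	while (dev) {
		if (dev->devfn == devfn)
			break;
		dev = fake_get_device(dev);
	}
	return dev;
}
```

In this model a pointer stored from find_by_devfn() keeps the count
elevated until a matching put is issued, for example when the VF
bookkeeping is torn down.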


^ permalink raw reply

* Re: pull request: wireless-next-2.6 2011-07-27
From: David Miller @ 2011-07-28  5:22 UTC (permalink / raw)
  To: linville-2XuSBdqkA4R54TAoqtyWWQ
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20110727184920.GB16431-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>

From: "John W. Linville" <linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Date: Wed, 27 Jul 2011 14:49:21 -0400

> Here is a handful of fixes intended for 3.1.  This includes a
> user-visible typo fix, a fix for a use after free in the new pn533
> NFC driver, a cfg80211 fix for a possible NULL pointer dereference,
> a fix for an invalid memory access in b43, and another b43 fix for
> a memory corruption problem.
> 
> On top of that b43 memory corruption fix, there is a patch to remove
> BROKEN from the B43_BCMA Kconfig entry, which is key to enabling
> support for some of the more modern Broadcom wireless hardware.
> I'm sure Rafał (and a number of others) would love to see that
> merged while the 3.1 merge window is still open as well.
> 
> Please let me know if there are problems...

Yep, removing BROKEN from b43 seems reasonable.

Pulled, thanks!

^ permalink raw reply

* Re: [patch 01/11] [PATCH] iucv: introduce loadable iucv interface
From: David Miller @ 2011-07-28  5:10 UTC (permalink / raw)
  To: blaschka; +Cc: netdev, linux-s390
In-Reply-To: <20110728041331.GA10890@tuxmaker.boeblingen.de.ibm.com>

From: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Date: Thu, 28 Jul 2011 06:13:31 +0200

> yes, zero expectation for the current merge window :-)
> This RFC post is for review only. I will send a respin of the
> patch set when net-next opens again.

Ok, perfect. :)

^ permalink raw reply

* [PATCH] vfs: avoid call to inode_lru_list_del() if possible
From: Eric Dumazet @ 2011-07-28  4:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tim Chen, Al Viro, David Miller, Andi Kleen, Matthew Wilcox,
	Anton Blanchard, npiggin, linux-kernel, linux-fsdevel, netdev
In-Reply-To: <20110727204415.GA13308@infradead.org>

On Wednesday, 27 July 2011 at 16:44 -0400, Christoph Hellwig wrote:
> On Wed, Jul 27, 2011 at 05:21:05PM +0200, Eric Dumazet wrote:
> > If I am not mistaken, we can add unlocked checks on the three hot spots.
> > 
> > After following patch, a close(socket(PF_INET, SOCK_DGRAM, 0)) pair on
> > my dev machine takes ~3us instead of ~9us.
> > 
> > Maybe it's better to split it in three patches, just let me know.
> 
> I think three patches would be a lot cleaner.
> 
> As for safety of the unlocked checks:
> 
>  - inodes are either hashed when created or never, so that one looks
>    fine.
>  - same for the sb list.
>  - the writeback list is a bit more dynamic as we move things around
>    quite a bit.  But in addition to the inode_wb_list_del call from
>    evict() it only ever gets removed in writeback_single_inode, which
>    for a freeing inode can only be called from the callers of evict().
> 
> Btw, I wonder if you should micro-optimize things a bit further by
> moving the unhashed checks from the deletion functions into the callers
> and thus save a function call for each of them.
> 

Here is the last patch, addressing the inode_lru_list_del() call.

Only the call made from iput_final() can benefit from checking whether
i_lru is empty, so it makes sense to perform the check at the call
site instead of inside inode_lru_list_del().
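
A single-threaded userspace sketch of the caller-side check, assuming a
minimal list and a counter standing in for the per-superblock LRU lock
(list_node, lru_del_locked and final_put are hypothetical names). It only
shows that items never put on the list skip the locked path; it says
nothing about why the unlocked test is safe in the kernel:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Counts entries into the "expensive" path that would take the lock. */
static int lru_lock_acquisitions;

static void lru_del_locked(struct list_node *item)
{
	lru_lock_acquisitions++;	/* a spinlock would be taken here */
	if (!list_is_empty(item))
		list_del_init_node(item);
	/* ... and released here */
}

/* Caller-side check, in the spirit of the patched iput_final(). */
static void final_put(struct list_node *item)
{
	if (!list_is_empty(item))
		lru_del_locked(item);
}
```

An item that was never linked leaves lru_lock_acquisitions untouched,
which is the case pipes and sockets are expected to hit.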

[PATCH] vfs: avoid call to inode_lru_list_del() if possible

inode_lru_list_del() is expensive because of per-superblock lru locking,
while some inodes are never on the lru list.

Adding a check in iput_final() can speed up pipe/socket workloads on
SMP.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 fs/inode.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/inode.c b/fs/inode.c
index d0c72ff..b8b8939 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1328,7 +1328,8 @@ static void iput_final(struct inode *inode)
 	}
 
 	inode->i_state |= I_FREEING;
-	inode_lru_list_del(inode);
+	if (!list_empty(&inode->i_lru))
+		inode_lru_list_del(inode);
 	spin_unlock(&inode->i_lock);
 
 	evict(inode);



^ permalink raw reply related

* Re: [PATCH] vfs: avoid taking locks if inode not in lists
From: Eric Dumazet @ 2011-07-28  4:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tim Chen, Al Viro, David Miller, Andi Kleen, Matthew Wilcox,
	Anton Blanchard, npiggin, linux-kernel, linux-fsdevel, netdev
In-Reply-To: <20110727204415.GA13308@infradead.org>

On Wednesday, 27 July 2011 at 16:44 -0400, Christoph Hellwig wrote:
> On Wed, Jul 27, 2011 at 05:21:05PM +0200, Eric Dumazet wrote:
> > If I am not mistaken, we can add unlocked checks on the three hot spots.
> > 
> > After following patch, a close(socket(PF_INET, SOCK_DGRAM, 0)) pair on
> > my dev machine takes ~3us instead of ~9us.
> > 
> > Maybe it's better to split it in three patches, just let me know.
> 
> I think three patches would be a lot cleaner.
> 
> As for safety of the unlocked checks:
> 
>  - inodes are either hashed when created or never, so that one looks
>    fine.
>  - same for the sb list.
>  - the writeback list is a bit more dynamic as we move things around
>    quite a bit.  But in addition to the inode_wb_list_del call from
>    evict() it only ever gets removed in writeback_single_inode, which
>    for a freeing inode can only be called from the callers of evict().
> 
> Btw, I wonder if you should micro-optimize things a bit further by
> moving the unhashed checks from the deletion functions into the callers
> and thus save a function call for each of them.
> 

What about the following patch, addressing the micro-optimization and Andi
Kleen's concern about evict() readability?

Thanks!

[PATCH] vfs: avoid taking inode_hash_lock on pipes and sockets

Some inodes (pipes, sockets, ...) are not hashed, so there is no need to
take the contended inode_hash_lock at dismantle time.

Nice speedup on SMP machines for socket-intensive workloads.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 fs/inode.c         |    6 +++---
 include/linux/fs.h |    9 ++++++++-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index d0c72ff..73b5598 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -399,12 +399,12 @@ void __insert_inode_hash(struct inode *inode, unsigned long hashval)
 EXPORT_SYMBOL(__insert_inode_hash);
 
 /**
- *	remove_inode_hash - remove an inode from the hash
+ *	__remove_inode_hash - remove an inode from the hash
  *	@inode: inode to unhash
  *
  *	Remove an inode from the superblock.
  */
-void remove_inode_hash(struct inode *inode)
+void __remove_inode_hash(struct inode *inode)
 {
 	spin_lock(&inode_hash_lock);
 	spin_lock(&inode->i_lock);
@@ -412,7 +412,7 @@ void remove_inode_hash(struct inode *inode)
 	spin_unlock(&inode->i_lock);
 	spin_unlock(&inode_hash_lock);
 }
-EXPORT_SYMBOL(remove_inode_hash);
+EXPORT_SYMBOL(__remove_inode_hash);
 
 void end_writeback(struct inode *inode)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f23bcb7..786b3b1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2317,11 +2317,18 @@ extern int should_remove_suid(struct dentry *);
 extern int file_remove_suid(struct file *);
 
 extern void __insert_inode_hash(struct inode *, unsigned long hashval);
-extern void remove_inode_hash(struct inode *);
 static inline void insert_inode_hash(struct inode *inode)
 {
 	__insert_inode_hash(inode, inode->i_ino);
 }
+
+extern void __remove_inode_hash(struct inode *);
+static inline void remove_inode_hash(struct inode *inode)
+{
+	if (!inode_unhashed(inode))
+		__remove_inode_hash(inode);
+}
+
 extern void inode_sb_list_add(struct inode *inode);
 
 #ifdef CONFIG_BLOCK



^ permalink raw reply related

* Re: [patch 01/11] [PATCH] iucv: introduce loadable iucv interface
From: Frank Blaschka @ 2011-07-28  4:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-s390
In-Reply-To: <20110727.162958.467070802431920516.davem@davemloft.net>

On Wed, Jul 27, 2011 at 04:29:58PM -0700, David Miller wrote:
> 
> I sincerely hope you have exactly zero expectations of me merging
> these changes in this merge window, if you wanted that you should
> have sent this stuff at least a week ago.
> 
> I plan on adding these changes to net-next once that opens up.
Hi Dave,

yes, zero expectation for the current merge window :-)
This RFC post is for review only. I will send a respin of the
patch set when net-next opens again.

Frank


^ permalink raw reply

* [PATCH] vfs: conditionally call inode_wb_list_del()
From: Eric Dumazet @ 2011-07-28  4:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andi Kleen, Tim Chen, Al Viro, David Miller, Matthew Wilcox,
	Anton Blanchard, npiggin, linux-kernel, linux-fsdevel, netdev
In-Reply-To: <20110727210104.GA9066@infradead.org>

On Wednesday, 27 July 2011 at 17:01 -0400, Christoph Hellwig wrote:
> On Wed, Jul 27, 2011 at 10:59:57PM +0200, Andi Kleen wrote:
> > > Btw, I wonder if you should micro-optimize things a bit further by
> > > moving the unhashed checks from the deletion functions into the callers
> > > and thus save a function call for each of them.
> > 
> > If the caller is in the same file modern gcc is able to do that automatically
> > if you're lucky enough ("partial inlining")
> > 
> > I would not uglify the code for it.
> 
> Depending on how you look at it the code might actually be a tad
> cleaner.  One of called functions is outside of inode.c.
> 

That's right, thanks again for your valuable input, Christoph.

The following is a clear win, since we avoid the call to an external
function.

[PATCH] vfs: conditionally call inode_wb_list_del()

Some inodes (pipes, sockets, ...) are not on the bdi writeback list.

evict() can avoid calling inode_wb_list_del() and its expensive spinlock
by checking whether the inode's i_wb_list is empty.

At this point, no other cpu/user can concurrently manipulate this inode's
i_wb_list.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 fs/inode.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/inode.c b/fs/inode.c
index d0c72ff..9dab13a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -454,7 +454,9 @@ static void evict(struct inode *inode)
 	BUG_ON(!(inode->i_state & I_FREEING));
 	BUG_ON(!list_empty(&inode->i_lru));
 
-	inode_wb_list_del(inode);
+	if (!list_empty(&inode->i_wb_list))
+		inode_wb_list_del(inode);
+
 	inode_sb_list_del(inode);
 
 	if (op->evict_inode) {



^ permalink raw reply related

* Re: [PATCH] net: Fix security_socket_sendmsg() bypass problem.
From: Tetsuo Handa @ 2011-07-28  3:36 UTC (permalink / raw)
  To: eparis, anton, casey, mjt, davem; +Cc: netdev, linux-security-module
In-Reply-To: <CACLa4pvJ2+eTRf3SRzoXgiW8H+U=BDH-2DKVZ7pGRrPem=JYgg@mail.gmail.com>

Here is an optimized version. Only compile tested.

Regarding SELinux, there should be little performance loss from this change.

Regarding SMACK, please test both functionality and the performance improvement.
The unoptimized version can be measured by applying
http://www.spinics.net/linux/fedora/linux-security-module/msg11504.html .

Regarding TOMOYO, I'll update tomoyo_socket_sendmsg() like SMACK does.

Regarding AppArmor, please update apparmor_socket_sendmsg() in Oneiric's patch
like SELinux does.

Regarding the no-LSM case, there should be little performance loss from this change.
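
The duplicate-destination bookkeeping that the patch below adds as
security_sendmsg_uniq_address() can be sketched in userspace roughly as
follows. This is a model under stated assumptions, not the kernel code:
seen_addr, addr_is_new, seen_free and the ADDR_MAX bound are hypothetical,
malloc() stands in for kmalloc(), and the explicit length check is an
addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_MAX 128	/* assumed upper bound on an address length */

struct seen_addr {
	struct seen_addr *next;
	size_t len;
	unsigned char addr[ADDR_MAX];
};

/*
 * Returns true the first time an address is seen in this batch and
 * false for repeats. A NULL 'seen' models the plain sendmsg() case,
 * where every address counts as new.
 */
static bool addr_is_new(struct seen_addr **seen, const void *addr, size_t len)
{
	struct seen_addr *p;

	if (!seen)
		return true;
	if (len > ADDR_MAX)
		return true;	/* too long to remember: always re-check */
	for (p = *seen; p; p = p->next)
		if (p->len == len && !memcmp(p->addr, addr, len))
			return false;
	p = malloc(sizeof(*p));
	if (p) {	/* allocation failure only costs a repeated check */
		p->len = len;
		memcpy(p->addr, addr, len);
		p->next = *seen;
		*seen = p;
	}
	return true;
}

/* Releases everything remembered for one batch. */
static void seen_free(struct seen_addr **seen)
{
	while (*seen) {
		struct seen_addr *p = *seen;

		*seen = p->next;
		free(p);
	}
}
```

A caller would run the permission check only when addr_is_new() returns
true, and call seen_free() once the whole batch has been processed.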
----------------------------------------
[PATCH] net: Fix security_socket_sendmsg() bypass problem.

The sendmmsg() introduced by commit 228e548e "net: Add sendmmsg socket system
call" is capable of sending to multiple different destinations. However,
security_socket_sendmsg() is called only once even if multiple different
destination addresses are passed to sendmmsg().

SMACK uses the destination address for checking sendmsg() permission.
Therefore, we need to call security_socket_sendmsg() for each destination
address rather than only the first destination address.

Fix this regression by
(1) passing an "int datagrams" argument to security_socket_sendmsg() so that
    SELinux can omit sock_has_perm() checks on the 2nd and later datagrams.
(2) passing a "struct list_head *list" argument to security_socket_sendmsg() so
    that SMACK can omit smack_netlabel_send() checks for duplicated destination
    addresses.
(3) letting __sys_sendmmsg() provide "struct list_head list" for
    security_socket_sendmsg() and clean it up before return.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.0+]
---
 include/linux/security.h   |   16 ++++++++++---
 include/linux/socket.h     |    8 ++++++
 net/socket.c               |   42 +++++++++++++++++++++++++-----------
 security/capability.c      |    3 +-
 security/security.c        |   52 +++++++++++++++++++++++++++++++++++++++++++--
 security/selinux/hooks.c   |    5 ++--
 security/smack/smack_lsm.c |    5 +++-
 7 files changed, 108 insertions(+), 23 deletions(-)

--- linux-3.0.orig/include/linux/security.h
+++ linux-3.0/include/linux/security.h
@@ -93,6 +93,7 @@ struct xfrm_policy;
 struct xfrm_state;
 struct xfrm_user_sec_ctx;
 struct seq_file;
+struct list_head;
 
 extern int cap_netlink_send(struct sock *sk, struct sk_buff *skb);
 extern int cap_netlink_recv(struct sk_buff *skb, int cap);
@@ -880,6 +881,10 @@ static inline void security_free_mnt_opt
  *	@sock contains the socket structure.
  *	@msg contains the message to be transmitted.
  *	@size contains the size of message.
+ *	@datagrams contains the index of messages in sendmmsg(). This is 0 if
+ *	not sendmmsg().
+ *	@list contains the list head which can be used for holding
+ *	already-checked destination address. This is NULL if not sendmmsg().
  *	Return 0 if permission is granted.
  * @socket_recvmsg:
  *	Check permission before receiving a message from a socket.
@@ -1584,8 +1589,8 @@ struct security_operations {
 			       struct sockaddr *address, int addrlen);
 	int (*socket_listen) (struct socket *sock, int backlog);
 	int (*socket_accept) (struct socket *sock, struct socket *newsock);
-	int (*socket_sendmsg) (struct socket *sock,
-			       struct msghdr *msg, int size);
+	int (*socket_sendmsg) (struct socket *sock, struct msghdr *msg,
+			       int size, int datagrams, struct list_head *list);
 	int (*socket_recvmsg) (struct socket *sock,
 			       struct msghdr *msg, int size, int flags);
 	int (*socket_getsockname) (struct socket *sock);
@@ -2551,7 +2556,9 @@ int security_socket_bind(struct socket *
 int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen);
 int security_socket_listen(struct socket *sock, int backlog);
 int security_socket_accept(struct socket *sock, struct socket *newsock);
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
+bool security_sendmsg_uniq_address(struct msghdr *msg, struct list_head *list);
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size,
+			    int datagrams, struct list_head *list);
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
 			    int size, int flags);
 int security_socket_getsockname(struct socket *sock);
@@ -2636,7 +2643,8 @@ static inline int security_socket_accept
 }
 
 static inline int security_socket_sendmsg(struct socket *sock,
-					  struct msghdr *msg, int size)
+					  struct msghdr *msg, int size,
+					  int datagrams, struct list_head *list)
 {
 	return 0;
 }
--- linux-3.0.orig/include/linux/socket.h
+++ linux-3.0/include/linux/socket.h
@@ -23,6 +23,7 @@ struct __kernel_sockaddr_storage {
 #include <linux/uio.h>			/* iovec support		*/
 #include <linux/types.h>		/* pid_t			*/
 #include <linux/compiler.h>		/* __user			*/
+#include <linux/list.h>			/* struct list_head             */
 
 struct pid;
 struct cred;
@@ -75,6 +76,13 @@ struct mmsghdr {
 	unsigned        msg_len;
 };
 
+/* For remembering destination's address passed to sendmmsg(). */
+struct sendmmsg_dest_info {
+	struct list_head list;
+	unsigned int address_len;
+	struct sockaddr_storage address;
+};
+
 /*
  *	POSIX 1003.1g - ancillary data object information
  *	Ancillary data consits of a sequence of pairs of
--- linux-3.0.orig/net/socket.c
+++ linux-3.0/net/socket.c
@@ -558,9 +558,10 @@ static inline int __sock_sendmsg_nosec(s
 }
 
 static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock,
-				 struct msghdr *msg, size_t size)
+				 struct msghdr *msg, size_t size,
+				 int datagrams, struct list_head *list)
 {
-	int err = security_socket_sendmsg(sock, msg, size);
+	int err = security_socket_sendmsg(sock, msg, size, datagrams, list);
 
 	return err ?: __sock_sendmsg_nosec(iocb, sock, msg, size);
 }
@@ -573,14 +574,16 @@ int sock_sendmsg(struct socket *sock, st
 
 	init_sync_kiocb(&iocb, NULL);
 	iocb.private = &siocb;
-	ret = __sock_sendmsg(&iocb, sock, msg, size);
+	ret = __sock_sendmsg(&iocb, sock, msg, size, 0, NULL);
 	if (-EIOCBQUEUED == ret)
 		ret = wait_on_sync_kiocb(&iocb);
 	return ret;
 }
 EXPORT_SYMBOL(sock_sendmsg);
 
-int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg, size_t size)
+static int sock_send_datagrams(struct socket *sock, struct msghdr *msg,
+			       size_t size, int datagrams,
+			       struct list_head *list)
 {
 	struct kiocb iocb;
 	struct sock_iocb siocb;
@@ -588,7 +591,7 @@ int sock_sendmsg_nosec(struct socket *so
 
 	init_sync_kiocb(&iocb, NULL);
 	iocb.private = &siocb;
-	ret = __sock_sendmsg_nosec(&iocb, sock, msg, size);
+	ret = __sock_sendmsg(&iocb, sock, msg, size, datagrams, list);
 	if (-EIOCBQUEUED == ret)
 		ret = wait_on_sync_kiocb(&iocb);
 	return ret;
@@ -888,7 +891,7 @@ static ssize_t do_sock_write(struct msgh
 	if (sock->type == SOCK_SEQPACKET)
 		msg->msg_flags |= MSG_EOR;
 
-	return __sock_sendmsg(iocb, sock, msg, size);
+	return __sock_sendmsg(iocb, sock, msg, size, 0, NULL);
 }
 
 static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
@@ -1872,7 +1875,8 @@ SYSCALL_DEFINE2(shutdown, int, fd, int, 
 #define COMPAT_FLAGS(msg)	COMPAT_MSG(msg, msg_flags)
 
 static int __sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
-			 struct msghdr *msg_sys, unsigned flags, int nosec)
+			 struct msghdr *msg_sys, unsigned flags, int datagrams,
+			 struct list_head *list)
 {
 	struct compat_msghdr __user *msg_compat =
 	    (struct compat_msghdr __user *)msg;
@@ -1953,8 +1957,7 @@ static int __sys_sendmsg(struct socket *
 
 	if (sock->file->f_flags & O_NONBLOCK)
 		msg_sys->msg_flags |= MSG_DONTWAIT;
-	err = (nosec ? sock_sendmsg_nosec : sock_sendmsg)(sock, msg_sys,
-							  total_len);
+	err = sock_send_datagrams(sock, msg_sys, total_len, datagrams, list);
 
 out_freectl:
 	if (ctl_buf != ctl)
@@ -1979,7 +1982,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct
 	if (!sock)
 		goto out;
 
-	err = __sys_sendmsg(sock, msg, &msg_sys, flags, 0);
+	err = __sys_sendmsg(sock, msg, &msg_sys, flags, 0, NULL);
 
 	fput_light(sock->file, fput_needed);
 out:
@@ -1998,6 +2001,7 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	struct mmsghdr __user *entry;
 	struct compat_mmsghdr __user *compat_entry;
 	struct msghdr msg_sys;
+	LIST_HEAD(list); /* List for finding duplicated destination address. */
 
 	datagrams = 0;
 
@@ -2014,18 +2018,19 @@ int __sys_sendmmsg(int fd, struct mmsghd
 
 	while (datagrams < vlen) {
 		/*
-		 * No need to ask LSM for more than the first datagram.
+		 * If datagrams == 0, LSM module will check. Otherwise, it will
+		 * check depending on its implementation.
 		 */
 		if (MSG_CMSG_COMPAT & flags) {
 			err = __sys_sendmsg(sock, (struct msghdr __user *)compat_entry,
-					    &msg_sys, flags, datagrams);
+					    &msg_sys, flags, datagrams, &list);
 			if (err < 0)
 				break;
 			err = __put_user(err, &compat_entry->msg_len);
 			++compat_entry;
 		} else {
 			err = __sys_sendmsg(sock, (struct msghdr __user *)entry,
-					    &msg_sys, flags, datagrams);
+					    &msg_sys, flags, datagrams, &list);
 			if (err < 0)
 				break;
 			err = put_user(err, &entry->msg_len);
@@ -2038,6 +2043,17 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	}
 
 out_put:
+#ifdef CONFIG_SECURITY_NETWORK
+	{ /* Clean up destination addresses. */
+		struct sendmmsg_dest_info *ptr;
+		struct sendmmsg_dest_info *tmp;
+
+		list_for_each_entry_safe(ptr, tmp, &list, list) {
+			list_del(&ptr->list);
+			kfree(ptr);
+		}
+	}
+#endif
 	fput_light(sock->file, fput_needed);
 
 	if (err == 0)
--- linux-3.0.orig/security/capability.c
+++ linux-3.0/security/capability.c
@@ -593,7 +593,8 @@ static int cap_socket_accept(struct sock
 	return 0;
 }
 
-static int cap_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
+static int cap_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size,
+			      int datagrams, struct list_head *list)
 {
 	return 0;
 }
--- linux-3.0.orig/security/security.c
+++ linux-3.0/security/security.c
@@ -17,6 +17,7 @@
 #include <linux/kernel.h>
 #include <linux/security.h>
 #include <linux/ima.h>
+#include <linux/socket.h>
 
 /* Boot-time LSM user choice */
 static __initdata char chosen_lsm[SECURITY_NAME_MAX + 1] =
@@ -1036,9 +1037,56 @@ int security_socket_accept(struct socket
 	return security_ops->socket_accept(sock, newsock);
 }
 
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size,
+			    int datagrams, struct list_head *list)
 {
-	return security_ops->socket_sendmsg(sock, msg, size);
+	return security_ops->socket_sendmsg(sock, msg, size, datagrams, list);
+}
+
+/**
+ * security_sendmsg_uniq_address - Check for duplicated address.
+ *
+ * @msg:  Pointer to "struct msghdr".
+ * @list: Pointer to "struct list_head".
+ *
+ * Returns false if @msg->msg_name is already in @list, true otherwise.
+ * @msg->msg_name will be duplicated and added to @list (unless out-of-memory
+ * occurs) if this function returns true. __sys_sendmmsg() provides @list and
+ * will clean up allocated memory before return.
+ *
+ * Some LSM modules check permission based on destination address at
+ * security_socket_sendmsg(). But checking for duplicated destination
+ * address at common code path is waste of time unless such LSM module is
+ * selected. Therefore, let such LSM modules call this function if they want to
+ * check permission only once for each uniq destination address.
+ */
+bool security_sendmsg_uniq_address(struct msghdr *msg, struct list_head *list)
+{
+	struct sendmmsg_dest_info *ptr;
+
+	/* If not sendmmsg(), this address is uniq. */
+	if (!list)
+		return true;
+	/* If sendmmsg(), check if this address was already used. */
+	list_for_each_entry(ptr, list, list) {
+		if (ptr->address_len != msg->msg_namelen ||
+		    memcmp(&ptr->address, msg->msg_name, ptr->address_len))
+			continue;
+		return false;
+	}
+	/*
+	 * Remember this address so that subsequent call will return false.
+	 *
+	 * Out of memory error is not fatal here because checking more than
+	 * once should be harmless other than the performance loss.
+	 */
+	ptr = kmalloc(sizeof(*ptr), GFP_KERNEL);
+	if (ptr) {
+		ptr->address_len = msg->msg_namelen;
+		memcpy(&ptr->address, msg->msg_name, ptr->address_len);
+		list_add(&ptr->list, list);
+	}
+	return true;
 }
 
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
--- linux-3.0.orig/security/selinux/hooks.c
+++ linux-3.0/security/selinux/hooks.c
@@ -3967,9 +3967,10 @@ static int selinux_socket_accept(struct 
 }
 
 static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  int size)
+				  int size, int datagrams,
+				  struct list_head *list)
 {
-	return sock_has_perm(current, sock->sk, SOCKET__WRITE);
+	return datagrams ? 0 : sock_has_perm(current, sock->sk, SOCKET__WRITE);
 }
 
 static int selinux_socket_recvmsg(struct socket *sock, struct msghdr *msg,
--- linux-3.0.orig/security/smack/smack_lsm.c
+++ linux-3.0/security/smack/smack_lsm.c
@@ -2799,7 +2799,7 @@ static int smack_unix_may_send(struct so
  * label host.
  */
 static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				int size)
+				int size, int datagrams, struct list_head *list)
 {
 	struct sockaddr_in *sip = (struct sockaddr_in *) msg->msg_name;
 
@@ -2809,6 +2809,9 @@ static int smack_socket_sendmsg(struct s
 	if (sip == NULL || sip->sin_family != AF_INET)
 		return 0;
 
+	if (!security_sendmsg_uniq_address(msg, list))
+		return 0;
+
 	return smack_netlabel_send(sock->sk, sip);
 }
 

^ permalink raw reply

* Re: [PATCH 02/14] allow root in container to copy namespaces
From: Serge E. Hallyn @ 2011-07-28  2:13 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-kernel, netdev, containers, dhowells
In-Reply-To: <m1hb67fh9l.fsf@fess.ebiederm.org>

Quoting Eric W. Biederman (ebiederm@xmission.com):
> Serge Hallyn <serge@hallyn.com> writes:
> 
> > From: Serge E. Hallyn <serge.hallyn@canonical.com>
> >
> > Otherwise nested containers with user namespaces won't be possible.
> >
> > It's true that user namespaces are not yet fully isolated, but for
> > that same reason there are far worse things that root in a child
> > user ns can do.  Spawning a child user ns is not in itself bad.
> >
> > This patch also allows setns for root in a container:
> > @Eric Biederman: are there gotchas in allowing setns from child
> > userns?
> 
> Yes.  We need to ensure that the target namespaces are namespaces
> that have been created from this user_namespace or from a child of this
> user_namespace.
> 
> Aka we need to ensure that we have CAP_SYS_ADMIN for the new namespace.

Thanks - so the last hunk in this patch is wrong.

> Eric
> 
> > Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
> > Cc: Eric W. Biederman <ebiederm@xmission.com>
> > ---
> >  kernel/fork.c    |    4 ++--
> >  kernel/nsproxy.c |    6 +++---
> >  2 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 17bf7c8..22d0cf0 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -1473,8 +1473,8 @@ long do_fork(unsigned long clone_flags,
> >  		/* hopefully this check will go away when userns support is
> >  		 * complete
> >  		 */
> > -		if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SETUID) ||
> > -				!capable(CAP_SETGID))
> > +		if (!nsown_capable(CAP_SYS_ADMIN) || !nsown_capable(CAP_SETUID) ||
> > +				!nsown_capable(CAP_SETGID))
> >  			return -EPERM;
> >  	}
> >  
> > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > index 9aeab4b..f50542d 100644
> > --- a/kernel/nsproxy.c
> > +++ b/kernel/nsproxy.c
> > @@ -134,7 +134,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> >  				CLONE_NEWPID | CLONE_NEWNET)))
> >  		return 0;
> >  
> > -	if (!capable(CAP_SYS_ADMIN)) {
> > +	if (!nsown_capable(CAP_SYS_ADMIN)) {
> >  		err = -EPERM;
> >  		goto out;
> >  	}
> > @@ -191,7 +191,7 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags,
> >  			       CLONE_NEWNET)))
> >  		return 0;
> >  
> > -	if (!capable(CAP_SYS_ADMIN))
> > +	if (!nsown_capable(CAP_SYS_ADMIN))
> >  		return -EPERM;
> >  
> >  	*new_nsp = create_new_namespaces(unshare_flags, current,
> > @@ -241,7 +241,7 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype)
> >  	struct file *file;
> >  	int err;
> >  
> > -	if (!capable(CAP_SYS_ADMIN))
> > +	if (!nsown_capable(CAP_SYS_ADMIN))
> >  		return -EPERM;
> >  
> >  	file = proc_ns_fget(fd);

^ permalink raw reply

* Re: Oops when insmod rtl8192ce
From: hubert Liao @ 2011-07-28  1:21 UTC (permalink / raw)
  To: Larry Finger
  Cc: Chaoming Li, John W. Linville,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4E302299.2070309-tQ5ms3gMjBLk1uMJSBkQmQ@public.gmane.org>

2011/7/27 Larry Finger <Larry.Finger-tQ5ms3gMjBLk1uMJSBkQmQ@public.gmane.org>:
> On 07/27/2011 04:26 AM, hubert Liao wrote:
>>
>> Hi,
>> We got an oops when loading the rtl8192ce module with insmod (the board
>> is an ARM SoC). According to the oops message, it happens because
>> rtl_pci_probe() calls _rtl_pci_find_adapter(), and in that function
>> pdev->bus->self is a NULL pointer.
>> static bool _rtl_pci_find_adapter(struct pci_dev *pdev,
>>               struct ieee80211_hw *hw)
>> {
>> struct pci_dev *bridge_pdev = pdev->bus->self;   //line 1601
>> ...
>> pcipriv->ndis_adapter.pcibridge_vendorid = bridge_pdev->vendor;<--
>> [oops here] line 1700
>> ...
>> }
>> What I want to know is why bus->self is NULL here.
>> ----
>> [  148.186632] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000020
>
> As John Linville suggested, please open a bugzilla report.
>
Ok, I'll try it, but I am not familiar with it.
> I would also like some additional information. What kernel are you using? In
> addition, please post the 'lspci -nnk' information for your card.
>
The kernel is from the latest Linus kernel git tree (3.0.0-05684-ge371d46-dirty).
I have also tested the 2.6.38.8 stable release; it has the same problem.

lspci -nnk
00:00.0 Class [0580]: Device [11ab:6192] (rev 03)
        Subsystem: Device [11ab:11ab]
00:01.0 Class [0280]: Device [10ec:8178] (rev 01)
        Subsystem: Device [1a3b:1178]

> I also think that pdev->bus should have been setup before the initialization
> code in rtl8192ce was called. I have not tested the driver on other than x86
> and x86_64 architectures because of hardware availability, thus ARM may
> expose some problems. Is this soc little-endian?
>
Yes, it is little-endian.

cat /proc/cpuinfo
Processor       : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS        : 789.70
Features        : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant     : 0x2
CPU part        : 0x131
CPU revision    : 1

Hardware        : Marvell RD-88F6192-NAS Development Board
Revision        : 0000
Serial          : 0000000000000000

> Thanks,
> Larry
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH net-next 2/9] tg3: Simplify tx bd assignments
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

In the following patches, the process the driver will use to assign skb
fragments to transmit BDs will get more complicated.  To prepare for
that new code, this patch seeks to simplify how transmit BDs are
populated.  It does this by separating the code that assigns the BD
members from the logic that controls how the fields are set.
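
To illustrate the separation described above, here is a minimal standalone
sketch, not the driver code itself.  The struct, the function name and the
SKETCH_* shift values are assumptions made for illustration; the real
definitions live in tg3.h.  The point is that the helper only packs fields,
while decisions such as "is this the last BD" or "is there a VLAN tag" are
made by the caller and passed in as plain values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's shift constants; the real
 * values live in tg3.h and are assumed here for illustration only. */
#define SKETCH_LEN_SHIFT       16
#define SKETCH_MSS_SHIFT       16
#define SKETCH_VLAN_TAG_SHIFT  0

/* Simplified transmit buffer descriptor (not the driver's struct). */
struct sketch_txbd {
	uint32_t addr_hi;
	uint32_t addr_lo;
	uint32_t len_flags;
	uint32_t vlan_tag;
};

/* Pure field assignment: no decisions about END or VLAN are made here. */
static void sketch_set_bd(struct sketch_txbd *bd, uint64_t mapping,
			  uint32_t len, uint32_t flags,
			  uint32_t mss, uint32_t vlan)
{
	assert(bd != NULL);

	bd->addr_hi = (uint32_t)(mapping >> 32);
	bd->addr_lo = (uint32_t)(mapping & 0xffffffff);
	/* Only the low 16 bits of flags share a word with the length. */
	bd->len_flags = (len << SKETCH_LEN_SHIFT) | (flags & 0x0000ffff);
	bd->vlan_tag = (mss << SKETCH_MSS_SHIFT) |
		       (vlan << SKETCH_VLAN_TAG_SHIFT);
}
```

A caller would compute the END flag and the VLAN value first and then pass
them in, which is the shape the call sites in the diff below take.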

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   76 +++++++++++++++++++++++++---------------------------
 1 files changed, 37 insertions(+), 39 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 3708159..8dfde34 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5901,27 +5901,16 @@ static inline int tg3_40bit_overflow_test(struct tg3 *tp, dma_addr_t mapping,
 #endif
 }
 
-static void tg3_set_txd(struct tg3_napi *tnapi, int entry,
-			dma_addr_t mapping, int len, u32 flags,
-			u32 mss_and_is_end)
+static inline void tg3_tx_set_bd(struct tg3_napi *tnapi, u32 entry,
+				 dma_addr_t mapping, u32 len, u32 flags,
+				 u32 mss, u32 vlan)
 {
-	struct tg3_tx_buffer_desc *txd = &tnapi->tx_ring[entry];
-	int is_end = (mss_and_is_end & 0x1);
-	u32 mss = (mss_and_is_end >> 1);
-	u32 vlan_tag = 0;
+	struct tg3_tx_buffer_desc *txbd = &tnapi->tx_ring[entry];
 
-	if (is_end)
-		flags |= TXD_FLAG_END;
-	if (flags & TXD_FLAG_VLAN) {
-		vlan_tag = flags >> 16;
-		flags &= 0xffff;
-	}
-	vlan_tag |= (mss << TXD_MSS_SHIFT);
-
-	txd->addr_hi = ((u64) mapping >> 32);
-	txd->addr_lo = ((u64) mapping & 0xffffffff);
-	txd->len_flags = (len << TXD_LEN_SHIFT) | flags;
-	txd->vlan_tag = vlan_tag << TXD_VLAN_TAG_SHIFT;
+	txbd->addr_hi = ((u64) mapping >> 32);
+	txbd->addr_lo = ((u64) mapping & 0xffffffff);
+	txbd->len_flags = (len << TXD_LEN_SHIFT) | (flags & 0x0000ffff);
+	txbd->vlan_tag = (mss << TXD_MSS_SHIFT) | (vlan << TXD_VLAN_TAG_SHIFT);
 }
 
 static void tg3_skb_error_unmap(struct tg3_napi *tnapi,
@@ -5950,7 +5939,7 @@ static void tg3_skb_error_unmap(struct tg3_napi *tnapi,
 /* Workaround 4GB and 40-bit hardware DMA bugs. */
 static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 				       struct sk_buff *skb,
-				       u32 base_flags, u32 mss)
+				       u32 base_flags, u32 mss, u32 vlan)
 {
 	struct tg3 *tp = tnapi->tp;
 	struct sk_buff *new_skb;
@@ -5988,12 +5977,14 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 			ret = -1;
 			dev_kfree_skb(new_skb);
 		} else {
+			base_flags |= TXD_FLAG_END;
+
 			tnapi->tx_buffers[entry].skb = new_skb;
 			dma_unmap_addr_set(&tnapi->tx_buffers[entry],
 					   mapping, new_addr);
 
-			tg3_set_txd(tnapi, entry, new_addr, new_skb->len,
-				    base_flags, 1 | (mss << 1));
+			tg3_tx_set_bd(tnapi, entry, new_addr, new_skb->len,
+				      base_flags, mss, vlan);
 		}
 	}
 
@@ -6051,7 +6042,7 @@ tg3_tso_bug_end:
 static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct tg3 *tp = netdev_priv(dev);
-	u32 len, entry, base_flags, mss;
+	u32 len, entry, base_flags, mss, vlan = 0;
 	int i = -1, would_hit_hwbug;
 	dma_addr_t mapping;
 	struct tg3_napi *tnapi;
@@ -6153,9 +6144,12 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 	}
 
-	if (vlan_tx_tag_present(skb))
-		base_flags |= (TXD_FLAG_VLAN |
-			       (vlan_tx_tag_get(skb) << 16));
+#ifdef BCM_KERNEL_SUPPORTS_8021Q
+	if (vlan_tx_tag_present(skb)) {
+		base_flags |= TXD_FLAG_VLAN;
+		vlan = vlan_tx_tag_get(skb);
+	}
+#endif
 
 	if (tg3_flag(tp, USE_JUMBO_BDFLAG) &&
 	    !mss && skb->len > VLAN_ETH_FRAME_LEN)
@@ -6186,13 +6180,21 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tg3_flag(tp, 5701_DMA_BUG))
 		would_hit_hwbug = 1;
 
-	tg3_set_txd(tnapi, entry, mapping, len, base_flags,
-		    (skb_shinfo(skb)->nr_frags == 0) | (mss << 1));
+	tg3_tx_set_bd(tnapi, entry, mapping, len, base_flags |
+		      ((skb_shinfo(skb)->nr_frags == 0) ? TXD_FLAG_END : 0),
+		      mss, vlan);
 
 	entry = NEXT_TX(entry);
 
 	/* Now loop through additional data fragments, and queue them. */
 	if (skb_shinfo(skb)->nr_frags > 0) {
+		u32 tmp_mss = mss;
+
+		if (!tg3_flag(tp, HW_TSO_1) &&
+		    !tg3_flag(tp, HW_TSO_2) &&
+		    !tg3_flag(tp, HW_TSO_3))
+			tmp_mss = 0;
+
 		last = skb_shinfo(skb)->nr_frags - 1;
 		for (i = 0; i <= last; i++) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
@@ -6219,14 +6221,9 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			if (tg3_40bit_overflow_test(tp, mapping, len))
 				would_hit_hwbug = 1;
 
-			if (tg3_flag(tp, HW_TSO_1) ||
-			    tg3_flag(tp, HW_TSO_2) ||
-			    tg3_flag(tp, HW_TSO_3))
-				tg3_set_txd(tnapi, entry, mapping, len,
-					    base_flags, (i == last)|(mss << 1));
-			else
-				tg3_set_txd(tnapi, entry, mapping, len,
-					    base_flags, (i == last));
+			tg3_tx_set_bd(tnapi, entry, mapping, len, base_flags |
+				      ((i == last) ? TXD_FLAG_END : 0),
+				      tmp_mss, vlan);
 
 			entry = NEXT_TX(entry);
 		}
@@ -6238,7 +6235,8 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* If the workaround fails due to memory/mapping
 		 * failure, silently drop this packet.
 		 */
-		if (tigon3_dma_hwbug_workaround(tnapi, skb, base_flags, mss))
+		if (tigon3_dma_hwbug_workaround(tnapi, skb, base_flags,
+						mss, vlan))
 			goto out_unlock;
 
 		entry = NEXT_TX(tnapi->tx_prod);
@@ -11370,8 +11368,8 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 
 	rx_start_idx = rnapi->hw_status->idx[0].rx_producer;
 
-	tg3_set_txd(tnapi, tnapi->tx_prod, map, tx_len,
-		    base_flags, (mss << 1) | 1);
+	tg3_tx_set_bd(tnapi, tnapi->tx_prod, map, tx_len,
+		      base_flags | TXD_FLAG_END, mss, 0);
 
 	tnapi->tx_prod++;
 
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 4/9] tg3: Generalize tg3_skb_error_unmap()
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

In the following patches, unmapping skb fragments will get just as
complicated as mapping them.  This patch generalizes
tg3_skb_error_unmap() and makes it the one-stop-shop for skb unmapping.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   48 ++++++++++++++++--------------------------------
 1 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 0f5bcf7..3f69f1a 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5913,13 +5913,15 @@ static inline void tg3_tx_set_bd(struct tg3_napi *tnapi, u32 entry,
 	txbd->vlan_tag = (mss << TXD_MSS_SHIFT) | (vlan << TXD_VLAN_TAG_SHIFT);
 }
 
-static void tg3_skb_error_unmap(struct tg3_napi *tnapi,
-				struct sk_buff *skb, int last)
+static void tg3_tx_skb_unmap(struct tg3_napi *tnapi, u32 entry, int last)
 {
 	int i;
-	u32 entry = tnapi->tx_prod;
+	struct sk_buff *skb;
 	struct tg3_tx_ring_info *txb = &tnapi->tx_buffers[entry];
 
+	skb = txb->skb;
+	txb->skb = NULL;
+
 	pci_unmap_single(tnapi->tp->pdev,
 			 dma_unmap_addr(txb, mapping),
 			 skb_headlen(skb),
@@ -6227,7 +6229,7 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	if (would_hit_hwbug) {
-		tg3_skb_error_unmap(tnapi, skb, i);
+		tg3_tx_skb_unmap(tnapi, tnapi->tx_prod, i);
 
 		/* If the workaround fails due to memory/mapping
 		 * failure, silently drop this packet.
@@ -6264,7 +6266,7 @@ out_unlock:
 	return NETDEV_TX_OK;
 
 dma_error:
-	tg3_skb_error_unmap(tnapi, skb, i);
+	tg3_tx_skb_unmap(tnapi, tnapi->tx_prod, i);
 	dev_kfree_skb(skb);
 	tnapi->tx_buffers[tnapi->tx_prod].skb = NULL;
 	return NETDEV_TX_OK;
@@ -6597,35 +6599,13 @@ static void tg3_free_rings(struct tg3 *tp)
 		if (!tnapi->tx_buffers)
 			continue;
 
-		for (i = 0; i < TG3_TX_RING_SIZE; ) {
-			struct tg3_tx_ring_info *txp;
-			struct sk_buff *skb;
-			unsigned int k;
+		for (i = 0; i < TG3_TX_RING_SIZE; i++) {
+			struct sk_buff *skb = tnapi->tx_buffers[i].skb;
 
-			txp = &tnapi->tx_buffers[i];
-			skb = txp->skb;
-
-			if (skb == NULL) {
-				i++;
+			if (!skb)
 				continue;
-			}
 
-			pci_unmap_single(tp->pdev,
-					 dma_unmap_addr(txp, mapping),
-					 skb_headlen(skb),
-					 PCI_DMA_TODEVICE);
-			txp->skb = NULL;
-
-			i++;
-
-			for (k = 0; k < skb_shinfo(skb)->nr_frags; k++) {
-				txp = &tnapi->tx_buffers[i & (TG3_TX_RING_SIZE - 1)];
-				pci_unmap_page(tp->pdev,
-					       dma_unmap_addr(txp, mapping),
-					       skb_shinfo(skb)->frags[k].size,
-					       PCI_DMA_TODEVICE);
-				i++;
-			}
+			tg3_tx_skb_unmap(tnapi, i, skb_shinfo(skb)->nr_frags);
 
 			dev_kfree_skb_any(skb);
 		}
@@ -11358,6 +11338,10 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 		return -EIO;
 	}
 
+	val = tnapi->tx_prod;
+	tnapi->tx_buffers[val].skb = skb;
+	dma_unmap_addr_set(&tnapi->tx_buffers[val], mapping, map);
+
 	tw32_f(HOSTCC_MODE, tp->coalesce_mode | HOSTCC_MODE_ENABLE |
 	       rnapi->coal_now);
 
@@ -11389,7 +11373,7 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 			break;
 	}
 
-	pci_unmap_single(tp->pdev, map, tx_len, PCI_DMA_TODEVICE);
+	tg3_tx_skb_unmap(tnapi, tnapi->tx_prod - 1, 0);
 	dev_kfree_skb(skb);
 
 	if (tx_idx != tnapi->tx_prod)
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 9/9] tg3: Remove 5719 jumbo frames and TSO blocks
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

The A0 revision of this chip is the only device that requires these
features to be disabled.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
 drivers/net/tg3.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index c77a39d..dc3fbf6 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -8406,7 +8406,7 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
 	/* Program the jumbo buffer descriptor ring control
 	 * blocks on those devices that have them.
 	 */
-	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719 ||
+	if (tp->pci_chip_rev_id == CHIPREV_ID_5719_A0 ||
 	    (tg3_flag(tp, JUMBO_CAPABLE) && !tg3_flag(tp, 5780_CLASS))) {
 
 		if (tg3_flag(tp, JUMBO_RING_ENABLE)) {
@@ -13873,7 +13873,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 		tg3_flag_set(tp, 5705_PLUS);
 
 	/* Determine TSO capabilities */
-	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
+	if (tp->pci_chip_rev_id == CHIPREV_ID_5719_A0)
 		; /* Do nothing. HW bug. */
 	else if (tg3_flag(tp, 57765_PLUS))
 		tg3_flag_set(tp, HW_TSO_3);
@@ -13943,7 +13943,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 		tg3_flag_set(tp, LRG_PROD_RING_CAP);
 
 	if (tg3_flag(tp, 57765_PLUS) &&
-	    GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5719)
+	    tp->pci_chip_rev_id != CHIPREV_ID_5719_A0)
 		tg3_flag_set(tp, USE_JUMBO_BDFLAG);
 
 	if (!tg3_flag(tp, 5705_PLUS) ||
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 1/9] tg3: Reintroduce tg3_tx_ring_info
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

The following patches will require the use of an additional flag in the
ring_info structure.  The use of this flag is tx path specific, so this
patch defines a specialized ring_info structure.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   12 ++++++------
 drivers/net/tg3.h |    7 ++++++-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 8035765..3708159 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4824,7 +4824,7 @@ static void tg3_tx(struct tg3_napi *tnapi)
 	txq = netdev_get_tx_queue(tp->dev, index);
 
 	while (sw_idx != hw_idx) {
-		struct ring_info *ri = &tnapi->tx_buffers[sw_idx];
+		struct tg3_tx_ring_info *ri = &tnapi->tx_buffers[sw_idx];
 		struct sk_buff *skb = ri->skb;
 		int i, tx_bug = 0;
 
@@ -5929,7 +5929,7 @@ static void tg3_skb_error_unmap(struct tg3_napi *tnapi,
 {
 	int i;
 	u32 entry = tnapi->tx_prod;
-	struct ring_info *txb = &tnapi->tx_buffers[entry];
+	struct tg3_tx_ring_info *txb = &tnapi->tx_buffers[entry];
 
 	pci_unmap_single(tnapi->tp->pdev,
 			 dma_unmap_addr(txb, mapping),
@@ -6603,7 +6603,7 @@ static void tg3_free_rings(struct tg3 *tp)
 			continue;
 
 		for (i = 0; i < TG3_TX_RING_SIZE; ) {
-			struct ring_info *txp;
+			struct tg3_tx_ring_info *txp;
 			struct sk_buff *skb;
 			unsigned int k;
 
@@ -6762,9 +6762,9 @@ static int tg3_alloc_consistent(struct tg3 *tp)
 		 */
 		if ((!i && !tg3_flag(tp, ENABLE_TSS)) ||
 		    (i && tg3_flag(tp, ENABLE_TSS))) {
-			tnapi->tx_buffers = kzalloc(sizeof(struct ring_info) *
-						    TG3_TX_RING_SIZE,
-						    GFP_KERNEL);
+			tnapi->tx_buffers = kzalloc(
+					       sizeof(struct tg3_tx_ring_info) *
+					       TG3_TX_RING_SIZE, GFP_KERNEL);
 			if (!tnapi->tx_buffers)
 				goto err_out;
 
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 691539b..f6986ca 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2652,6 +2652,11 @@ struct ring_info {
 	DEFINE_DMA_UNMAP_ADDR(mapping);
 };
 
+struct tg3_tx_ring_info {
+	struct sk_buff			*skb;
+	DEFINE_DMA_UNMAP_ADDR(mapping);
+};
+
 struct tg3_link_config {
 	/* Describes what we're trying to get. */
 	u32				advertising;
@@ -2816,7 +2821,7 @@ struct tg3_napi {
 	u32				last_tx_cons;
 	u32				prodmbox;
 	struct tg3_tx_buffer_desc	*tx_ring;
-	struct ring_info		*tx_buffers;
+	struct tg3_tx_ring_info		*tx_buffers;
 
 	dma_addr_t			status_mapping;
 	dma_addr_t			rx_rcb_mapping;
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 6/9] tg3: Consolidate code that calls tg3_tx_set_bd()
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

This patch consolidates all code that populates tx BDs into a single
routine.  Setting tx BDs needs to be more carefully controlled to see if
workarounds need to be applied.
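
As a rough sketch of the idea above, assuming simplified checks: every
descriptor goes through one routine that runs the hardware-bug tests and
reports whether the caller needs a workaround.  The names below are
hypothetical, and the 4 GiB boundary test is a simplified stand-in rather
than the driver's own tg3_4g_overflow_test().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified boundary test: true if [addr, addr + len) spans a 4 GiB
 * boundary.  The driver's own test is not reproduced here. */
static bool sketch_crosses_4g(uint64_t addr, uint32_t len)
{
	assert(len > 0);
	return (addr >> 32) != ((addr + len - 1) >> 32);
}

/* Single entry point for every descriptor: run the checks, then report
 * whether the caller must fall back to a workaround. */
static bool sketch_frag_check(bool short_dma_bug, uint64_t addr, uint32_t len)
{
	bool hwbug = false;

	if (short_dma_bug && len <= 8)
		hwbug = true;
	if (sketch_crosses_4g(addr, len))
		hwbug = true;

	return hwbug;
}
```

Keeping the checks in one place means a new workaround only has to be added
once instead of at each call site.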

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   79 ++++++++++++++++++++++++++++-------------------------
 1 files changed, 42 insertions(+), 37 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 90b68a2..7f816a0 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5914,18 +5914,37 @@ static inline int tg3_40bit_overflow_test(struct tg3 *tp, dma_addr_t mapping,
 #endif
 }
 
-static inline void tg3_tx_set_bd(struct tg3_napi *tnapi, u32 entry,
+static inline void tg3_tx_set_bd(struct tg3_tx_buffer_desc *txbd,
 				 dma_addr_t mapping, u32 len, u32 flags,
 				 u32 mss, u32 vlan)
 {
-	struct tg3_tx_buffer_desc *txbd = &tnapi->tx_ring[entry];
-
 	txbd->addr_hi = ((u64) mapping >> 32);
 	txbd->addr_lo = ((u64) mapping & 0xffffffff);
 	txbd->len_flags = (len << TXD_LEN_SHIFT) | (flags & 0x0000ffff);
 	txbd->vlan_tag = (mss << TXD_MSS_SHIFT) | (vlan << TXD_VLAN_TAG_SHIFT);
 }
 
+static bool tg3_tx_frag_set(struct tg3_napi *tnapi, u32 entry,
+			    dma_addr_t map, u32 len, u32 flags,
+			    u32 mss, u32 vlan)
+{
+	struct tg3 *tp = tnapi->tp;
+	bool hwbug = false;
+
+	if (tg3_flag(tp, SHORT_DMA_BUG) && len <= 8)
+		hwbug = 1;
+
+	if (tg3_4g_overflow_test(map, len))
+		hwbug = 1;
+
+	if (tg3_40bit_overflow_test(tp, map, len))
+		hwbug = 1;
+
+	tg3_tx_set_bd(&tnapi->tx_ring[entry], map, len, flags, mss, vlan);
+
+	return hwbug;
+}
+
 static void tg3_tx_skb_unmap(struct tg3_napi *tnapi, u32 entry, int last)
 {
 	int i;
@@ -5993,17 +6012,8 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 					  PCI_DMA_TODEVICE);
 		/* Make sure the mapping succeeded */
 		if (pci_dma_mapping_error(tp->pdev, new_addr)) {
-			ret = -1;
 			dev_kfree_skb(new_skb);
-
-		/* Make sure new skb does not cross any 4G boundaries.
-		 * Drop the packet if it does.
-		 */
-		} else if (tg3_4g_overflow_test(new_addr, new_skb->len)) {
-			pci_unmap_single(tp->pdev, new_addr, new_skb->len,
-					 PCI_DMA_TODEVICE);
 			ret = -1;
-			dev_kfree_skb(new_skb);
 		} else {
 			base_flags |= TXD_FLAG_END;
 
@@ -6011,8 +6021,13 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 			dma_unmap_addr_set(&tnapi->tx_buffers[entry],
 					   mapping, new_addr);
 
-			tg3_tx_set_bd(tnapi, entry, new_addr, new_skb->len,
-				      base_flags, mss, vlan);
+			if (tg3_tx_frag_set(tnapi, entry, new_addr,
+					    new_skb->len, base_flags,
+					    mss, vlan)) {
+				tg3_tx_skb_unmap(tnapi, entry, 0);
+				dev_kfree_skb(new_skb);
+				ret = -1;
+			}
 		}
 	}
 
@@ -6196,18 +6211,13 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	would_hit_hwbug = 0;
 
-	if (tg3_4g_overflow_test(mapping, len))
-		would_hit_hwbug = 1;
-
-	if (tg3_40bit_overflow_test(tp, mapping, len))
-		would_hit_hwbug = 1;
-
 	if (tg3_flag(tp, 5701_DMA_BUG))
 		would_hit_hwbug = 1;
 
-	tg3_tx_set_bd(tnapi, entry, mapping, len, base_flags |
-		      ((skb_shinfo(skb)->nr_frags == 0) ? TXD_FLAG_END : 0),
-		      mss, vlan);
+	if (tg3_tx_frag_set(tnapi, entry, mapping, len, base_flags |
+			  ((skb_shinfo(skb)->nr_frags == 0) ? TXD_FLAG_END : 0),
+			    mss, vlan))
+		would_hit_hwbug = 1;
 
 	entry = NEXT_TX(entry);
 
@@ -6236,20 +6246,11 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			if (pci_dma_mapping_error(tp->pdev, mapping))
 				goto dma_error;
 
-			if (tg3_flag(tp, SHORT_DMA_BUG) &&
-			    len <= 8)
-				would_hit_hwbug = 1;
-
-			if (tg3_4g_overflow_test(mapping, len))
-				would_hit_hwbug = 1;
-
-			if (tg3_40bit_overflow_test(tp, mapping, len))
+			if (tg3_tx_frag_set(tnapi, entry, mapping, len,
+				  base_flags | ((i == last) ? TXD_FLAG_END : 0),
+					    tmp_mss, vlan))
 				would_hit_hwbug = 1;
 
-			tg3_tx_set_bd(tnapi, entry, mapping, len, base_flags |
-				      ((i == last) ? TXD_FLAG_END : 0),
-				      tmp_mss, vlan);
-
 			entry = NEXT_TX(entry);
 		}
 	}
@@ -11375,8 +11376,12 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 
 	rx_start_idx = rnapi->hw_status->idx[0].rx_producer;
 
-	tg3_tx_set_bd(tnapi, tnapi->tx_prod, map, tx_len,
-		      base_flags | TXD_FLAG_END, mss, 0);
+	if (tg3_tx_frag_set(tnapi, tnapi->tx_prod, map, tx_len,
+			    base_flags | TXD_FLAG_END, mss, 0)) {
+		tnapi->tx_buffers[val].skb = NULL;
+		dev_kfree_skb(skb);
+		return -EIO;
+	}
 
 	tnapi->tx_prod++;
 
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 5/9] tg3: Add partial fragment unmapping code
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

The following patches are going to break skb fragments into smaller
sizes.  This patch attempts to make the change easier to digest by only
addressing the skb teardown portion.

The patch modifies the driver to skip over any BDs that have a flag set
that indicates the BD isn't the beginning of an skb fragment.  Such BDs
were a result of segmentation and do not need a pci_unmap_page() call.
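
The skipping logic can be sketched in isolation as follows.  This is only
an illustration under assumptions: the ring size, the struct and the
function name are hypothetical, and a boolean stands in for the stored DMA
mapping.  As in the diff below, the flag is carried by the entry that is
followed by a continuation BD, so the walk clears flags until it reaches
the last BD of the mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 8   /* hypothetical; must be a power of two */
#define SKETCH_NEXT(i)   (((i) + 1) & (SKETCH_RING_SIZE - 1))

struct sketch_ring_info {
	bool mapped;       /* stands in for the stored DMA mapping */
	bool fragmented;   /* set when the next BD continues this mapping */
};

/* Unmap the mapping that starts at `entry` and return the index of the
 * first ring entry after it, skipping continuation BDs. */
static unsigned int sketch_unmap_one(struct sketch_ring_info *ring,
				     unsigned int entry)
{
	assert(ring != NULL && entry < SKETCH_RING_SIZE);

	ring[entry].mapped = false;      /* exactly one unmap per mapping */

	while (ring[entry].fragmented) { /* continuation BDs follow */
		ring[entry].fragmented = false;
		entry = SKETCH_NEXT(entry);
	}

	return SKETCH_NEXT(entry);
}
```

The index arithmetic wraps around the end of the ring, so a mapping that
starts near the last entry is handled the same way as any other.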

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   26 ++++++++++++++++++++++++++
 drivers/net/tg3.h |    1 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 3f69f1a..90b68a2 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4840,6 +4840,12 @@ static void tg3_tx(struct tg3_napi *tnapi)
 
 		ri->skb = NULL;
 
+		while (ri->fragmented) {
+			ri->fragmented = false;
+			sw_idx = NEXT_TX(sw_idx);
+			ri = &tnapi->tx_buffers[sw_idx];
+		}
+
 		sw_idx = NEXT_TX(sw_idx);
 
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
@@ -4851,6 +4857,13 @@ static void tg3_tx(struct tg3_napi *tnapi)
 				       dma_unmap_addr(ri, mapping),
 				       skb_shinfo(skb)->frags[i].size,
 				       PCI_DMA_TODEVICE);
+
+			while (ri->fragmented) {
+				ri->fragmented = false;
+				sw_idx = NEXT_TX(sw_idx);
+				ri = &tnapi->tx_buffers[sw_idx];
+			}
+
 			sw_idx = NEXT_TX(sw_idx);
 		}
 
@@ -5926,6 +5939,13 @@ static void tg3_tx_skb_unmap(struct tg3_napi *tnapi, u32 entry, int last)
 			 dma_unmap_addr(txb, mapping),
 			 skb_headlen(skb),
 			 PCI_DMA_TODEVICE);
+
+	while (txb->fragmented) {
+		txb->fragmented = false;
+		entry = NEXT_TX(entry);
+		txb = &tnapi->tx_buffers[entry];
+	}
+
 	for (i = 0; i < last; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
@@ -5935,6 +5955,12 @@ static void tg3_tx_skb_unmap(struct tg3_napi *tnapi, u32 entry, int last)
 		pci_unmap_page(tnapi->tp->pdev,
 			       dma_unmap_addr(txb, mapping),
 			       frag->size, PCI_DMA_TODEVICE);
+
+		while (txb->fragmented) {
+			txb->fragmented = false;
+			entry = NEXT_TX(entry);
+			txb = &tnapi->tx_buffers[entry];
+		}
 	}
 }
 
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index f6986ca..466dd7a 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2655,6 +2655,7 @@ struct ring_info {
 struct tg3_tx_ring_info {
 	struct sk_buff			*skb;
 	DEFINE_DMA_UNMAP_ADDR(mapping);
+	bool				fragmented;
 };
 
 struct tg3_link_config {
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 7/9] tg3: Add tx BD budgeting code
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

As the driver breaks large skb fragments into smaller submissions to the
hardware, there is a new danger that BDs might get exhausted before all
fragments have been mapped.  This patch adds code to make sure tx BDs
aren't oversubscribed and to flag the condition if it happens.
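
A minimal sketch of the budgeting idea, with hypothetical names and ring
size: each descriptor claim decrements a budget, and a claim made with no
budget left is reported to the caller instead of writing a descriptor.
As in the diff below, the entry index advances in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 512   /* hypothetical ring size, power of two */
#define SKETCH_NEXT(i)   (((i) + 1) & (SKETCH_RING_SIZE - 1))

/* Claim one descriptor slot.  Returns true if the ring was already
 * exhausted, so the caller can treat it like any other hwbug case. */
static bool sketch_claim_bd(uint32_t *entry, uint32_t *budget)
{
	bool exhausted = false;

	assert(entry != NULL && budget != NULL);

	if (*budget)
		(*budget)--;        /* a real driver would fill the BD here */
	else
		exhausted = true;   /* no slot left: report, do not write */

	*entry = SKETCH_NEXT(*entry);
	return exhausted;
}
```

Passing the entry and budget by pointer lets one helper consume several
descriptors per call, which matters once fragments are split.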

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   49 +++++++++++++++++++++++++++++--------------------
 1 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 7f816a0..b93ba3d 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5924,7 +5924,7 @@ static inline void tg3_tx_set_bd(struct tg3_tx_buffer_desc *txbd,
 	txbd->vlan_tag = (mss << TXD_MSS_SHIFT) | (vlan << TXD_VLAN_TAG_SHIFT);
 }
 
-static bool tg3_tx_frag_set(struct tg3_napi *tnapi, u32 entry,
+static bool tg3_tx_frag_set(struct tg3_napi *tnapi, u32 *entry, u32 *budget,
 			    dma_addr_t map, u32 len, u32 flags,
 			    u32 mss, u32 vlan)
 {
@@ -5940,7 +5940,14 @@ static bool tg3_tx_frag_set(struct tg3_napi *tnapi, u32 entry,
 	if (tg3_40bit_overflow_test(tp, map, len))
 		hwbug = 1;
 
-	tg3_tx_set_bd(&tnapi->tx_ring[entry], map, len, flags, mss, vlan);
+	if (*budget) {
+		tg3_tx_set_bd(&tnapi->tx_ring[*entry], map,
+			      len, flags, mss, vlan);
+		(*budget)--;
+	} else
+		hwbug = 1;
+
+	*entry = NEXT_TX(*entry);
 
 	return hwbug;
 }
@@ -5986,12 +5993,12 @@ static void tg3_tx_skb_unmap(struct tg3_napi *tnapi, u32 entry, int last)
 /* Workaround 4GB and 40-bit hardware DMA bugs. */
 static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 				       struct sk_buff *skb,
+				       u32 *entry, u32 *budget,
 				       u32 base_flags, u32 mss, u32 vlan)
 {
 	struct tg3 *tp = tnapi->tp;
 	struct sk_buff *new_skb;
 	dma_addr_t new_addr = 0;
-	u32 entry = tnapi->tx_prod;
 	int ret = 0;
 
 	if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5701)
@@ -6017,14 +6024,14 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 		} else {
 			base_flags |= TXD_FLAG_END;
 
-			tnapi->tx_buffers[entry].skb = new_skb;
-			dma_unmap_addr_set(&tnapi->tx_buffers[entry],
+			tnapi->tx_buffers[*entry].skb = new_skb;
+			dma_unmap_addr_set(&tnapi->tx_buffers[*entry],
 					   mapping, new_addr);
 
-			if (tg3_tx_frag_set(tnapi, entry, new_addr,
+			if (tg3_tx_frag_set(tnapi, entry, budget, new_addr,
 					    new_skb->len, base_flags,
 					    mss, vlan)) {
-				tg3_tx_skb_unmap(tnapi, entry, 0);
+				tg3_tx_skb_unmap(tnapi, *entry, 0);
 				dev_kfree_skb(new_skb);
 				ret = -1;
 			}
@@ -6086,6 +6093,7 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct tg3 *tp = netdev_priv(dev);
 	u32 len, entry, base_flags, mss, vlan = 0;
+	u32 budget;
 	int i = -1, would_hit_hwbug;
 	dma_addr_t mapping;
 	struct tg3_napi *tnapi;
@@ -6097,12 +6105,14 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tg3_flag(tp, ENABLE_TSS))
 		tnapi++;
 
+	budget = tg3_tx_avail(tnapi);
+
 	/* We are running in BH disabled context with netif_tx_lock
 	 * and TX reclaim runs via tp->napi.poll inside of a software
 	 * interrupt.  Furthermore, IRQ processing runs lockless so we have
 	 * no IRQ context deadlocks to worry about either.  Rejoice!
 	 */
-	if (unlikely(tg3_tx_avail(tnapi) <= (skb_shinfo(skb)->nr_frags + 1))) {
+	if (unlikely(budget <= (skb_shinfo(skb)->nr_frags + 1))) {
 		if (!netif_tx_queue_stopped(txq)) {
 			netif_tx_stop_queue(txq);
 
@@ -6214,13 +6224,11 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tg3_flag(tp, 5701_DMA_BUG))
 		would_hit_hwbug = 1;
 
-	if (tg3_tx_frag_set(tnapi, entry, mapping, len, base_flags |
+	if (tg3_tx_frag_set(tnapi, &entry, &budget, mapping, len, base_flags |
 			  ((skb_shinfo(skb)->nr_frags == 0) ? TXD_FLAG_END : 0),
 			    mss, vlan))
 		would_hit_hwbug = 1;
 
-	entry = NEXT_TX(entry);
-
 	/* Now loop through additional data fragments, and queue them. */
 	if (skb_shinfo(skb)->nr_frags > 0) {
 		u32 tmp_mss = mss;
@@ -6246,12 +6254,11 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			if (pci_dma_mapping_error(tp->pdev, mapping))
 				goto dma_error;
 
-			if (tg3_tx_frag_set(tnapi, entry, mapping, len,
-				  base_flags | ((i == last) ? TXD_FLAG_END : 0),
+			if (tg3_tx_frag_set(tnapi, &entry, &budget, mapping,
+					    len, base_flags |
+					    ((i == last) ? TXD_FLAG_END : 0),
 					    tmp_mss, vlan))
 				would_hit_hwbug = 1;
-
-			entry = NEXT_TX(entry);
 		}
 	}
 
@@ -6261,11 +6268,11 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* If the workaround fails due to memory/mapping
 		 * failure, silently drop this packet.
 		 */
-		if (tigon3_dma_hwbug_workaround(tnapi, skb, base_flags,
-						mss, vlan))
+		entry = tnapi->tx_prod;
+		budget = tg3_tx_avail(tnapi);
+		if (tigon3_dma_hwbug_workaround(tnapi, skb, &entry, &budget,
+						base_flags, mss, vlan))
 			goto out_unlock;
-
-		entry = NEXT_TX(tnapi->tx_prod);
 	}
 
 	skb_tx_timestamp(skb);
@@ -11206,6 +11213,7 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 {
 	u32 mac_mode, rx_start_idx, rx_idx, tx_idx, opaque_key;
 	u32 base_flags = 0, mss = 0, desc_idx, coal_now, data_off, val;
+	u32 budget;
 	struct sk_buff *skb, *rx_skb;
 	u8 *tx_data;
 	dma_addr_t map;
@@ -11376,7 +11384,8 @@ static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
 
 	rx_start_idx = rnapi->hw_status->idx[0].rx_producer;
 
-	if (tg3_tx_frag_set(tnapi, tnapi->tx_prod, map, tx_len,
+	budget = tg3_tx_avail(tnapi);
+	if (tg3_tx_frag_set(tnapi, &val, &budget, map, tx_len,
 			    base_flags | TXD_FLAG_END, mss, 0)) {
 		tnapi->tx_buffers[val].skb = NULL;
 		dev_kfree_skb(skb);
-- 
1.7.3.4



^ permalink raw reply related

* [PATCH net-next 0/9] tg3: Add 4k workaround for 5719
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

This patchset adds a necessary 4k RDMA limit workaround for 5719 devices.




* [PATCH net-next 8/9] tg3: Break larger frags into 4k chunks for 5719
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

The 5719 has a bug where RDMAs larger than 4k can cause problems.  This
patch works around the problem by dividing larger DMA requests into
something the hardware can handle.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |   52 ++++++++++++++++++++++++++++++++++++++++++++++------
 drivers/net/tg3.h |    1 +
 2 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index b93ba3d..c77a39d 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -190,6 +190,7 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS flag, unsigned long *bits)
 
 /* minimum number of free TX descriptors required to wake up TX process */
 #define TG3_TX_WAKEUP_THRESH(tnapi)		((tnapi)->tx_pending / 4)
+#define TG3_TX_BD_DMA_MAX		4096
 
 #define TG3_RAW_IP_ALIGN 2
 
@@ -5940,14 +5941,50 @@ static bool tg3_tx_frag_set(struct tg3_napi *tnapi, u32 *entry, u32 *budget,
 	if (tg3_40bit_overflow_test(tp, map, len))
 		hwbug = 1;
 
-	if (*budget) {
+	if (tg3_flag(tp, 4K_FIFO_LIMIT)) {
+		u32 tmp_flag = flags & ~TXD_FLAG_END;
+		while (len > TG3_TX_BD_DMA_MAX) {
+			u32 frag_len = TG3_TX_BD_DMA_MAX;
+			len -= TG3_TX_BD_DMA_MAX;
+
+			if (len) {
+				tnapi->tx_buffers[*entry].fragmented = true;
+				/* Avoid the 8byte DMA problem */
+				if (len <= 8) {
+					len += TG3_TX_BD_DMA_MAX / 2;
+					frag_len = TG3_TX_BD_DMA_MAX / 2;
+				}
+			} else
+				tmp_flag = flags;
+
+			if (*budget) {
+				tg3_tx_set_bd(&tnapi->tx_ring[*entry], map,
+					      frag_len, tmp_flag, mss, vlan);
+				(*budget)--;
+				*entry = NEXT_TX(*entry);
+			} else {
+				hwbug = 1;
+				break;
+			}
+
+			map += frag_len;
+		}
+
+		if (len) {
+			if (*budget) {
+				tg3_tx_set_bd(&tnapi->tx_ring[*entry], map,
+					      len, flags, mss, vlan);
+				(*budget)--;
+				*entry = NEXT_TX(*entry);
+			} else {
+				hwbug = 1;
+			}
+		}
+	} else {
 		tg3_tx_set_bd(&tnapi->tx_ring[*entry], map,
 			      len, flags, mss, vlan);
-		(*budget)--;
-	} else
-		hwbug = 1;
-
-	*entry = NEXT_TX(*entry);
+		*entry = NEXT_TX(*entry);
+	}
 
 	return hwbug;
 }
@@ -13899,6 +13936,9 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	if (tg3_flag(tp, 5755_PLUS))
 		tg3_flag_set(tp, SHORT_DMA_BUG);
 
+	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
+		tg3_flag_set(tp, 4K_FIFO_LIMIT);
+
 	if (tg3_flag(tp, 5717_PLUS))
 		tg3_flag_set(tp, LRG_PROD_RING_CAP);
 
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 466dd7a..2ea456d 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2905,6 +2905,7 @@ enum TG3_FLAGS {
 	TG3_FLAG_57765_PLUS,
 	TG3_FLAG_APE_HAS_NCSI,
 	TG3_FLAG_5717_PLUS,
+	TG3_FLAG_4K_FIFO_LIMIT,
 
 	/* Add new flags before this comment and TG3_FLAG_NUMBER_OF_FLAGS */
 	TG3_FLAG_NUMBER_OF_FLAGS,	/* Last entry in enum TG3_FLAGS */
-- 
1.7.3.4
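
The splitting arithmetic in the 8/9 patch above can be tried outside the
driver.  What follows is only a minimal userspace sketch, not driver code:
split_dma_len() and CHUNK_MAX are hypothetical names made up for this
illustration (CHUNK_MAX mirrors TG3_TX_BD_DMA_MAX), and the out_cap
argument stands in for the descriptor budget.  Descriptor flags, DMA
addresses and the ring itself are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MAX 4096u	/* assumption: mirrors TG3_TX_BD_DMA_MAX */

/*
 * Hypothetical helper: split len into chunks of at most CHUNK_MAX bytes,
 * never leaving a final chunk of 8 bytes or fewer.  Chunk sizes are
 * written to out[].  Returns the number of chunks, or -1 if out_cap
 * (the stand-in for the TX descriptor budget) is too small.
 */
static int split_dma_len(uint32_t len, uint32_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(out != NULL);

	while (len > CHUNK_MAX) {
		uint32_t frag_len = CHUNK_MAX;

		len -= CHUNK_MAX;

		/* A tail of <= 8 bytes would hit the short DMA problem,
		 * so emit a half-size chunk and grow the tail instead.
		 */
		if (len <= 8) {
			len += CHUNK_MAX / 2;
			frag_len = CHUNK_MAX / 2;
		}

		if (n == out_cap)
			return -1;	/* out of budget */
		out[n++] = frag_len;
	}

	if (len) {
		if (n == out_cap)
			return -1;	/* out of budget */
		out[n++] = len;
	}

	return (int)n;
}
```

With this sketch, a 4100 byte request comes out as 2048 + 2052 rather than
4096 + 4, and a 10000 byte request comes out as 4096 + 4096 + 1808.  In the
patch itself, running out of budget sets hwbug rather than returning an
error, so the -1 return here is a simplification.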




* [PATCH net-next 3/9] tg3: Remove short DMA check for 1st fragment
From: Matt Carlson @ 2011-07-28  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

The first fragment of an skb should always be greater than 8 bytes.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/tg3.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 8dfde34..0f5bcf7 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6168,9 +6168,6 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	would_hit_hwbug = 0;
 
-	if (tg3_flag(tp, SHORT_DMA_BUG) && len <= 8)
-		would_hit_hwbug = 1;
-
 	if (tg3_4g_overflow_test(mapping, len))
 		would_hit_hwbug = 1;
 
-- 
1.7.3.4



