Netdev List
* Re: slow performance on disk/network i/o full speed after drop_caches
From: Stefan Priebe - Profihost AG @ 2011-08-31  7:11 UTC (permalink / raw)
  To: Zhu Yanhai
  Cc: Wu Fengguang, Pekka Enberg, LKML, linux-mm@kvack.org,
	Andrew Morton, Mel Gorman, Jens Axboe, Linux Netdev List
In-Reply-To: <4E573A99.4060309@profihost.ag>

Hi Fengguang,
Hi Yanhai,

> you're absolutely correct, zone_reclaim_mode is on - but why?
> There must be some Linux software which switches it on.
>
> ~# grep 'zone_reclaim_mode' /etc/sysctl.* -r -i
> ~#
>
> also
> ~# grep 'zone_reclaim_mode' /etc/sysctl.* -r -i
> ~#
>
> tells us nothing.
>
> I've then read this:
>
> "zone_reclaim_mode is set during bootup to 1 if it is determined that
> pages from remote zones will cause a measurable performance reduction.
> The page allocator will then reclaim easily reusable pages (those page
> cache pages that are currently not used) before allocating off node pages."
>
> Why does the kernel do that here in our case on these machines?

Can anybody explain why the kernel set it to 1 in this case?
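
Since grepping the sysctl configuration only shows what userspace sets, the runtime value can be read directly. A minimal sketch, assuming the standard procfs location /proc/sys/vm/zone_reclaim_mode; the helper name read_int_file is hypothetical and only for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a procfs/sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
static int read_int_file(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling read_int_file("/proc/sys/vm/zone_reclaim_mode", &mode) right after boot, before any init script has run, would show whether the value comes from the kernel itself rather than from userspace.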

Stefan

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Jiri Pirko @ 2011-08-31  8:26 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: netdev, davem, eric.dumazet, bhutchings, shemminger
In-Reply-To: <CAHXqBFJh0xjRYv_-2AzwAvpHOUfQ+pD_153qVtNA3Ybo9r3b=w@mail.gmail.com>

Tue, Aug 30, 2011 at 08:11:37PM CEST, mirqus@gmail.com wrote:
>2011/8/30 Jiri Pirko <jpirko@redhat.com>:
>> Allow to write to "carrier" attribute. Devices may implement ndo_change_carrier
>> callback to allow changing carrier from userspace.
>
>Do you expect drivers using implementation different than just calling
>netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?

Yes, in general it can also be used to enable/disable the PHY for testing
purposes, if the hardware and driver support it.

>
>BTW, I like this feature!
>
>Best Regards,
>Michał Mirosław

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Jiri Pirko @ 2011-08-31  8:31 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Michał Mirosław, netdev, davem, eric.dumazet,
	bhutchings
In-Reply-To: <20110830112553.012ffa85@nehalam.ftrdhcpuser.net>

Tue, Aug 30, 2011 at 08:25:53PM CEST, shemminger@vyatta.com wrote:
>On Tue, 30 Aug 2011 20:11:37 +0200
>Michał Mirosław <mirqus@gmail.com> wrote:
>
>> 2011/8/30 Jiri Pirko <jpirko@redhat.com>:
>> > Allow to write to "carrier" attribute. Devices may implement ndo_change_carrier
>> > callback to allow changing carrier from userspace.
>> 
>> Do you expect drivers using implementation different than just calling
>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>> 
>> BTW, I like this feature!
>
>Ok for virtual devices, but please don't implement it in real hardware.
>There is already enough breakage in carrier management in applications.
>It also overlaps with operstate; perhaps that is a more complete
>solution.

Looking at the operstate documentation, I'm not sure what exactly you
mean by "overlapping".

The main purpose of my patch is to give certain virtual devices the
opportunity to emulate carrier loss. But I do not see a reason why this
can't be implemented by real hardware. Of course, the purpose of this
feature should be explicitly documented.
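
To make the intended use concrete, here is a minimal userspace sketch. It assumes the attribute is writable at /sys/class/net/<dev>/carrier and accepts "0" or "1", which is what this patch proposes rather than something every kernel or driver supports. The helper names carrier_path and set_carrier are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>/<ifname>/carrier" into buf.
 * Returns 0 on success, -1 on a bad name or if the path does not fit. */
static int carrier_path(char *buf, size_t len, const char *base,
			const char *ifname)
{
	int n;

	assert(buf != NULL && base != NULL && ifname != NULL);

	/* Reject empty names and anything that could escape the directory. */
	if (ifname[0] == '\0' || strchr(ifname, '/'))
		return -1;

	n = snprintf(buf, len, "%s/%s/carrier", base, ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" (carrier on) or "0" (carrier off) to the attribute.
 * Returns 0 on success, -1 on any error. */
static int set_carrier(const char *base, const char *ifname, int on)
{
	char path[256];
	FILE *f;
	int ret = 0;

	if (carrier_path(path, sizeof(path), base, ifname) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(on ? "1\n" : "0\n", f) == EOF)
		ret = -1;
	/* The sysfs store error, if any, is reported when the buffer is flushed. */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

With this, set_carrier("/sys/class/net", "dummy0", 0) would emulate carrier loss on a dummy device; a driver without the callback would be expected to fail the write.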

Jirka
>

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Michał Mirosław @ 2011-08-31  8:33 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, eric.dumazet, bhutchings, shemminger
In-Reply-To: <20110831082655.GB2010@minipsycho.brq.redhat.com>

On 31 August 2011 at 10:26, Jiri Pirko <jpirko@redhat.com> wrote:
> Tue, Aug 30, 2011 at 08:11:37PM CEST, mirqus@gmail.com wrote:
>>2011/8/30 Jiri Pirko <jpirko@redhat.com>:
>>> Allow to write to "carrier" attribute. Devices may implement ndo_change_carrier
>>> callback to allow changing carrier from userspace.
>>Do you expect drivers using implementation different than just calling
>>netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
> Yes, in general it can also be used to enable/disable the PHY for testing
> purposes, if the hardware and driver support it.

I'd like to see this working for GRE tunnel devices (so that a keepalive
daemon can indicate to routing daemons whether the tunnel is really
working); the implementation would be identical to the dummy device's case.
Should I prepare a patch or can I leave it to you?

Best Regards,
Michał Mirosław

* [PATCH 0/2] Dump the sock's security context
From: rongqing.li @ 2011-08-31  8:36 UTC (permalink / raw)
  To: netdev, selinux, linux-security-module

-------
    Any review would be much appreciated.
 
Comments:
--------
    Add a netlink attribute INET_DIAG_SECCTX
    
    Add a new netlink attribute INET_DIAG_SECCTX to dump the security
    context of TCP sockets.
    
    The element sk_security of struct sock represents the socket
    security context ID, which is inherited from the parent process
    when the socket is created.
    
    However, when an SELinux type_transition rule is applied to the
    socket, or the application sets /proc/xxx/attr/sockcreate, the
    socket security context differs from that of the creating process.
    In these cases "netstat -Z" returns the wrong value, since it only
    reports the security context of the owning process as the socket's
    security context.
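
    A minimal sketch of how a client requests the new attribute through
    the idiag_ext bitmask. The DIAG_* names are hypothetical local copies
    of the attribute numbers; DIAG_SECCTX = 5 is the value proposed by this
    series (INET_DIAG_CONG + 1), not an established upstream constant:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the inet_diag attribute numbers (hypothetical names). */
enum {
	DIAG_NONE,
	DIAG_MEMINFO,
	DIAG_INFO,
	DIAG_VEGASINFO,
	DIAG_CONG,
	DIAG_SECCTX,	/* assumption: CONG + 1, as proposed in this series */
};

/* Extension N is requested by setting bit (N - 1) of the 8-bit idiag_ext. */
static uint8_t diag_ext_bit(int attr)
{
	assert(attr >= 1 && attr <= 8);
	return (uint8_t)(1u << (attr - 1));
}

/* Non-zero if the request mask asks for the given extension. */
static int diag_ext_requested(uint8_t ext, int attr)
{
	return (ext & diag_ext_bit(attr)) != 0;
}
```

    The kernel side of the series uses the same test, so a request that
    does not set this bit should see no change in the reply.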


Application to verify the new netlink attribute:
------
See attached file

test:
--------
1. SELinux enabled at compile time and at boot.
	root@qemu-host:/root> ./printsocketsec
	 inode:7141 system_u:system_r:rpcbind_t:s0 
	 inode:7136 system_u:system_r:rpcbind_t:s0 
	 inode:7604 system_u:system_r:initrc_t:s0 
	 inode:7227 system_u:system_r:rpcd_t:s0 
	 inode:7471 system_u:system_r:sshd_t:s0-s0:c0.c1023 
	 inode:7469 system_u:system_r:sshd_t:s0-s0:c0.c1023 
	 inode:7552 system_u:system_r:sendmail_t:s0 
	 inode:7348 system_u:system_r:initrc_t:s0 
	 inode:7553 system_u:system_r:sendmail_t:s0 
	root@qemu-host:/root> 

2. SELinux disabled at boot.
	root@qemu-host:/root> ./printsocketsec 
	inode:3221 
	inode:2942 
	inode:2861 
	inode:3256 
	inode:3156 
	inode:3220 
	inode:3060
	root@qemu-host:/root>

3. SELinux disabled at compile time and at boot.
	root@qemu-host:/root> ./printsocketsec 
	inode:3221 
	inode:2942 
	inode:2861 
	inode:3256 
	inode:3156 
	inode:3220 
	inode:3060
	root@qemu-host:/root>

* [PATCH 1/2] Define security_sk_getsecctx
From: rongqing.li @ 2011-08-31  8:36 UTC (permalink / raw)
  To: netdev, selinux, linux-security-module
In-Reply-To: <1314779777-12669-1-git-send-email-rongqing.li@windriver.com>

From: Roy.Li <rongqing.li@windriver.com>

Define security_sk_getsecctx to return the security
context of a sock.

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
---
 include/linux/security.h |   13 +++++++++++++
 security/capability.c    |    6 ++++++
 security/security.c      |    6 ++++++
 security/selinux/hooks.c |    9 +++++++++
 4 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index ebd2a53..6bb8e0c 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -959,6 +959,12 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  * @sk_getsecid:
  *	Retrieve the LSM-specific secid for the sock to enable caching of network
  *	authorizations.
+ * @sk_getsecctx:
+ *	Retrieve the security context string of a sock.
+ *	@sk is the sock whose security context we wish to get.
+ *	@ctx is the address of the pointer in which to place the allocated
+ *	security context.
+ *	@ctxlen points to where the length of the security context is stored.
  * @sock_graft:
  *	Sets the socket's isec sid to the sock's sid.
  * @inet_conn_request:
@@ -1600,6 +1606,7 @@ struct security_operations {
 	void (*sk_free_security) (struct sock *sk);
 	void (*sk_clone_security) (const struct sock *sk, struct sock *newsk);
 	void (*sk_getsecid) (struct sock *sk, u32 *secid);
+	int (*sk_getsecctx) (struct sock *sk, void **ctx, u32 *ctxlen);
 	void (*sock_graft) (struct sock *sk, struct socket *parent);
 	int (*inet_conn_request) (struct sock *sk, struct sk_buff *skb,
 				  struct request_sock *req);
@@ -2574,6 +2581,7 @@ void security_secmark_refcount_dec(void);
 int security_tun_dev_create(void);
 void security_tun_dev_post_create(struct sock *sk);
 int security_tun_dev_attach(struct sock *sk);
+int security_sk_getsecctx(struct sock *sk, void **ctx, u32 *ctxlen);
 
 #else	/* CONFIG_SECURITY_NETWORK */
 static inline int security_unix_stream_connect(struct sock *sock,
@@ -2751,6 +2759,11 @@ static inline int security_tun_dev_attach(struct sock *sk)
 {
 	return 0;
 }
+
+static inline int security_sk_getsecctx(struct sock *sk, void **ctx, u32 *ctxlen)
+{
+	return -EOPNOTSUPP;
+}
 #endif	/* CONFIG_SECURITY_NETWORK */
 
 #ifdef CONFIG_SECURITY_NETWORK_XFRM
diff --git a/security/capability.c b/security/capability.c
index 2984ea4..89256a6 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -664,6 +664,11 @@ static void cap_sk_getsecid(struct sock *sk, u32 *secid)
 {
 }
 
+static int cap_sk_getsecctx(struct sock *sk, void **ctx, u32 *ctxlen)
+{
+	return 0;
+}
+
 static void cap_sock_graft(struct sock *sk, struct socket *parent)
 {
 }
@@ -1032,6 +1037,7 @@ void __init security_fixup_ops(struct security_operations *ops)
 	set_to_cap_if_null(ops, sk_free_security);
 	set_to_cap_if_null(ops, sk_clone_security);
 	set_to_cap_if_null(ops, sk_getsecid);
+	set_to_cap_if_null(ops, sk_getsecctx);
 	set_to_cap_if_null(ops, sock_graft);
 	set_to_cap_if_null(ops, inet_conn_request);
 	set_to_cap_if_null(ops, inet_csk_clone);
diff --git a/security/security.c b/security/security.c
index 0e4fccf..a939f5c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -757,6 +757,12 @@ void security_task_getsecid(struct task_struct *p, u32 *secid)
 }
 EXPORT_SYMBOL(security_task_getsecid);
 
+int security_sk_getsecctx(struct sock *sk, void **ctx, u32 *ctxlen)
+{
+	return security_ops->sk_getsecctx(sk, ctx, ctxlen);
+}
+EXPORT_SYMBOL(security_sk_getsecctx);
+
 int security_task_setnice(struct task_struct *p, int nice)
 {
 	return security_ops->task_setnice(p, nice);
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 266a229..6e96f01 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4284,6 +4284,14 @@ static void selinux_sk_getsecid(struct sock *sk, u32 *secid)
 	}
 }
 
+static int selinux_sk_getsecctx(struct sock *sk, void **ctx, u32 *ctxlen)
+{
+	u32 secid;
+
+	selinux_sk_getsecid(sk, &secid);
+	return security_sid_to_context(secid, (char **)ctx, ctxlen);
+}
+
 static void selinux_sock_graft(struct sock *sk, struct socket *parent)
 {
 	struct inode_security_struct *isec = SOCK_INODE(parent)->i_security;
@@ -5613,6 +5621,7 @@ static struct security_operations selinux_ops = {
 	.sk_free_security =		selinux_sk_free_security,
 	.sk_clone_security =		selinux_sk_clone_security,
 	.sk_getsecid =			selinux_sk_getsecid,
+	.sk_getsecctx =			selinux_sk_getsecctx,
 	.sock_graft =			selinux_sock_graft,
 	.inet_conn_request =		selinux_inet_conn_request,
 	.inet_csk_clone =		selinux_inet_csk_clone,
-- 
1.7.1

* [PATCH 2/2] Add a netlink attribute INET_DIAG_SECCTX
From: rongqing.li @ 2011-08-31  8:36 UTC (permalink / raw)
  To: netdev, selinux, linux-security-module
In-Reply-To: <1314779777-12669-1-git-send-email-rongqing.li@windriver.com>

From: Roy.Li <rongqing.li@windriver.com>

Add a new netlink attribute INET_DIAG_SECCTX to dump the security
context of TCP sockets.

The element sk_security of struct sock represents the socket
security context ID, which is inherited from the parent process
when the socket is created.

However, when an SELinux type_transition rule is applied to the socket,
or the application sets /proc/xxx/attr/sockcreate, the socket security
context differs from that of the creating process. In these cases
"netstat -Z" returns the wrong value, since it only reports the security
context of the owning process as the socket's security context.

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
---
 include/linux/inet_diag.h |    3 ++-
 net/ipv4/inet_diag.c      |   38 +++++++++++++++++++++++++++++++++-----
 2 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index bc8c490..00382b4 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -97,9 +97,10 @@ enum {
 	INET_DIAG_INFO,
 	INET_DIAG_VEGASINFO,
 	INET_DIAG_CONG,
+	INET_DIAG_SECCTX,
 };
 
-#define INET_DIAG_MAX INET_DIAG_CONG
+#define INET_DIAG_MAX INET_DIAG_SECCTX
 
 
 /* INET_DIAG_MEM */
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 389a2e6..1faf752 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -34,6 +34,8 @@
 
 #include <linux/inet_diag.h>
 
+#define MAX_SECCTX_LEN 128
+
 static const struct inet_diag_handler **inet_diag_table;
 
 struct inet_diag_entry {
@@ -108,6 +110,25 @@ static int inet_csk_diag_fill(struct sock *sk,
 		       icsk->icsk_ca_ops->name);
 	}
 
+	if (ext & (1 << (INET_DIAG_SECCTX - 1))) {
+		u32 ctxlen = 0;
+		void *secctx;
+		int error;
+
+		error = security_sk_getsecctx(sk, &secctx, &ctxlen);
+
+		if (!error && ctxlen) {
+			if (ctxlen < MAX_SECCTX_LEN) {
+				strcpy(INET_DIAG_PUT(skb, INET_DIAG_SECCTX,
+					ctxlen + 1), secctx);
+			} else {
+				strcpy(INET_DIAG_PUT(skb, INET_DIAG_SECCTX,
+					2), "-");
+			}
+			security_release_secctx(secctx, ctxlen);
+		}
+	}
+
 	r->idiag_family = sk->sk_family;
 	r->idiag_state = sk->sk_state;
 	r->idiag_timer = 0;
@@ -246,7 +267,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff *skb,
 static int inet_diag_get_exact(struct sk_buff *in_skb,
 			       const struct nlmsghdr *nlh)
 {
-	int err;
+	int err, len;
 	struct sock *sk;
 	struct inet_diag_req *req = NLMSG_DATA(nlh);
 	struct sk_buff *rep;
@@ -293,10 +314,17 @@ static int inet_diag_get_exact(struct sk_buff *in_skb,
 		goto out;
 
 	err = -ENOMEM;
-	rep = alloc_skb(NLMSG_SPACE((sizeof(struct inet_diag_msg) +
-				     sizeof(struct inet_diag_meminfo) +
-				     handler->idiag_info_size + 64)),
-			GFP_KERNEL);
+	len = sizeof(struct inet_diag_msg) + 64;
+
+	len += (req->idiag_ext & (1 << (INET_DIAG_MEMINFO - 1))) ?
+		sizeof(struct inet_diag_meminfo) : 0;
+	len += (req->idiag_ext & (1 << (INET_DIAG_INFO - 1))) ?
+		handler->idiag_info_size : 0;
+	len += (req->idiag_ext & (1 << (INET_DIAG_SECCTX - 1))) ?
+		MAX_SECCTX_LEN : 0;
+
+	rep = alloc_skb(NLMSG_SPACE(len), GFP_KERNEL);
+
 	if (!rep)
 		goto out;
 
-- 
1.7.1

* Re: [PATCH 0/2] Dump the sock's security context
From: Rongqing Li @ 2011-08-31  8:38 UTC (permalink / raw)
  To: rongqing.li; +Cc: netdev, selinux, linux-security-module
In-Reply-To: <1314779777-12669-1-git-send-email-rongqing.li@windriver.com>

[-- Attachment #1: Type: text/plain, Size: 2153 bytes --]

On 08/31/2011 04:36 PM, rongqing.li@windriver.com wrote:
> -------
>      Any review would be much appreciated.
>
> Comments:
> --------
>      Add a netlink attribute INET_DIAG_SECCTX
>
>      Add a new netlink attribute INET_DIAG_SECCTX to dump the security
>      context of TCP sockets.
>
>      The element sk_security of struct sock represents the socket
>      security context ID, which is inherited from the parent process
>      when the socket is created.
>
>      However, when an SELinux type_transition rule is applied to the
>      socket, or the application sets /proc/xxx/attr/sockcreate, the
>      socket security context differs from that of the creating process.
>      In these cases "netstat -Z" returns the wrong value, since it only
>      reports the security context of the owning process as the socket's
>      security context.
>
>
> The application to verify the netlink new attribute.
> ------
> See attached file
>
> test:
> --------
> 1. Enable SELinux when compile and startup .
> 	root@qemu-host:/root>  ./printsocketsec
> 	 inode:7141 system_u:system_r:rpcbind_t:s0
> 	 inode:7136 system_u:system_r:rpcbind_t:s0
> 	 inode:7604 system_u:system_r:initrc_t:s0
> 	 inode:7227 system_u:system_r:rpcd_t:s0
> 	 inode:7471 system_u:system_r:sshd_t:s0-s0:c0.c1023
> 	 inode:7469 system_u:system_r:sshd_t:s0-s0:c0.c1023
> 	 inode:7552 system_u:system_r:sendmail_t:s0
> 	 inode:7348 system_u:system_r:initrc_t:s0
> 	 inode:7553 system_u:system_r:sendmail_t:s0
> 	root@qemu-host:/root>
>
> 2. Disable SELinux when startup.
> 	root@qemu-host:/root>  ./printsocketsec
> 	inode:3221
> 	inode:2942
> 	inode:2861
> 	inode:3256
> 	inode:3156
> 	inode:3220
> 	inode:3060
> 	root@qemu-host:/root>
>
> 3. Disable SELinux when compile and startup
> 	root@qemu-host:/root>  ./printsocketsec
> 	inode:3221
> 	inode:2942
> 	inode:2861
> 	inode:3256
> 	inode:3156
> 	inode:3220
> 	inode:3060
> 	root@qemu-host:/root>

-- 
Best Regards,
Roy | RongQing Li

[-- Attachment #2: printsocketsec.c --]
[-- Type: text/x-csrc, Size: 2876 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

#include "libnetlink.h"

#include <netinet/tcp.h>
#include <linux/inet_diag.h>

enum {
        SS_UNKNOWN,
        SS_ESTABLISHED,
        SS_SYN_SENT,
        SS_SYN_RECV,
        SS_FIN_WAIT1,
        SS_FIN_WAIT2,
        SS_TIME_WAIT,
        SS_CLOSE,
        SS_CLOSE_WAIT,
        SS_LAST_ACK,
        SS_LISTEN,
        SS_CLOSING,
        SS_MAX
};

#define SS_ALL ((1<<SS_MAX)-1)

/* INET_DIAG_SECCTX should eventually be defined in inet_diag.h;
 * to simplify the test, it is defined locally here. */
#define INET_DIAG_SECCTX (INET_DIAG_CONG+1)
#define LOCAL_MAX (INET_DIAG_SECCTX+1)


void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r)
{

	struct rtattr * tb[ LOCAL_MAX + 1];

	printf(" inode:%u", r->idiag_inode);

	parse_rtattr(tb, LOCAL_MAX, (struct rtattr*)(r+1),
		     nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));


	if (tb[INET_DIAG_SECCTX])
		printf(" %s", (char *) RTA_DATA(tb[INET_DIAG_SECCTX]));
	printf("\n");
}

static int tcp_show_netlink( int socktype)
{
	int fd;
	struct sockaddr_nl nladdr;
	struct {
		struct nlmsghdr nlh;
		struct inet_diag_req r;
	} req;

	struct msghdr msg;
	struct rtattr rta;
	char	buf[8192];
	struct iovec iov[3];

	if ((fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_INET_DIAG)) < 0)
		return -1;

	memset(&nladdr, 0, sizeof(nladdr));
	nladdr.nl_family = AF_NETLINK;

	req.nlh.nlmsg_len = sizeof(req);
	req.nlh.nlmsg_type = socktype;
	req.nlh.nlmsg_flags = NLM_F_ROOT|NLM_F_MATCH|NLM_F_REQUEST;
	req.nlh.nlmsg_pid = 0;
	req.nlh.nlmsg_seq = 123456;
	memset(&req.r, 0, sizeof(req.r));
	req.r.idiag_family = AF_INET;
	req.r.idiag_states = SS_ALL;

	req.r.idiag_ext |= (1<<(INET_DIAG_SECCTX-1));

	iov[0] = (struct iovec){
		.iov_base = &req,
		.iov_len = sizeof(req)
	};

	msg = (struct msghdr) {
		.msg_name = (void*)&nladdr,
		.msg_namelen = sizeof(nladdr),
		.msg_iov = iov,
		.msg_iovlen = 1,
	};

	if (sendmsg(fd, &msg, 0) < 0)
		return -1;

	iov[0] = (struct iovec){
		.iov_base = buf,
		.iov_len = sizeof(buf)
	};

	while (1) {
		int status;
		struct nlmsghdr *h;

		msg = (struct msghdr) {
			(void*)&nladdr, sizeof(nladdr),
			iov,	1,
			NULL,	0,
			0
		};

		status = recvmsg(fd, &msg, 0);

		if (status < 0) {
			if (errno == EINTR)
				continue;
			perror("OVERRUN");
			continue;
		}
		if (status == 0) {
			fprintf(stderr, "EOF on netlink\n");
			return 0;
		}

		h = (struct nlmsghdr*)buf;
		while (NLMSG_OK(h, status)) {
			struct inet_diag_msg *r = NLMSG_DATA(h);

			if (/*h->nlmsg_pid != rth->local.nl_pid ||*/
			    h->nlmsg_seq != 123456)
				goto skip_it;

			if (h->nlmsg_type == NLMSG_DONE)
				return 0;

			if (h->nlmsg_type == NLMSG_ERROR) 
				return 0;

			tcp_show_info(h, r);
skip_it:
			h = NLMSG_NEXT(h, status);
		}
	}
	return 0;
}
int main(void)
{
	return tcp_show_netlink(TCPDIAG_GETSOCK) < 0 ? 1 : 0;
}

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Jiri Pirko @ 2011-08-31  8:45 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: netdev, davem, eric.dumazet, bhutchings, shemminger
In-Reply-To: <CAHXqBFJpSZNhkur7QNYbjG-=Bkq2HtHKEHsn+H3TnEJ5NezJoA@mail.gmail.com>

Wed, Aug 31, 2011 at 10:33:50AM CEST, mirqus@gmail.com wrote:
>On 31 August 2011 at 10:26, Jiri Pirko <jpirko@redhat.com> wrote:
>> Tue, Aug 30, 2011 at 08:11:37PM CEST, mirqus@gmail.com wrote:
>>>2011/8/30 Jiri Pirko <jpirko@redhat.com>:
>>>> Allow to write to "carrier" attribute. Devices may implement ndo_change_carrier
>>>> callback to allow changing carrier from userspace.
>>>Do you expect drivers using implementation different than just calling
>>>netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>> Yes, in general it can also be used to enable/disable the PHY for testing
>> purposes, if the hardware and driver support it.
>
>I'd like to see this working for GRE tunnel devices (so that a keepalive
>daemon can indicate to routing daemons whether the tunnel is really
>working); the implementation would be identical to the dummy device's case.
>Should I prepare a patch or can I leave it to you?

OK, I can include it in this patchset (I'm going to repost the first
patch anyway).
>
>Best Regards,
>Michał Mirosław

* [PATCH net-next] net: linkwatch: allow vlans to get carrier changes faster
From: Eric Dumazet @ 2011-08-31  9:31 UTC (permalink / raw)
  To: HAYASAKA Mitsuo
  Cc: Herbert Xu, Stephen Hemminger, Patrick McHardy, David S. Miller,
	Michał Mirosław, Tom Herbert, Jesse Gross, netdev,
	linux-kernel, yrl.pp-manager.tt
In-Reply-To: <1314540589.3036.12.camel@edumazet-laptop>

The IFF_RUNNING flag of a vlan device lags behind that of its real
device when the real device has a problem such as a broken link or
cable.

This degrades availability, for example by delaying failover in HA
systems that use vlans, because detection of the problem on the real
device is delayed.

We can avoid the linkwatch delay (~1 sec) for devices that are linked to
another device, since the delay has already been applied to the real
device.

Based on a previous patch from Mitsuo Hayasaka.

Reported-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Tom Herbert <therbert@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jesse Gross <jesse@nicira.com>
---
 net/core/link_watch.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 357bd4e..c3519c6 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -78,8 +78,13 @@ static void rfc2863_policy(struct net_device *dev)
 
 static bool linkwatch_urgent_event(struct net_device *dev)
 {
-	return netif_running(dev) && netif_carrier_ok(dev) &&
-		qdisc_tx_changing(dev);
+	if (!netif_running(dev))
+		return false;
+
+	if (dev->ifindex != dev->iflink)
+		return true;
+
+	return netif_carrier_ok(dev) && qdisc_tx_changing(dev);
 }
 
 

* [PATCH 1/2] Remove requirement to set tx-usecs-irq for shared channels when modifying coalescing parameters
From: Ripduman Sohan @ 2011-08-31  9:38 UTC (permalink / raw)
  To: linux-net-drivers, shodgson, bhutchings; +Cc: netdev, Ripduman Sohan

Shared TX/RX channels possess a single channel timer controlled by the
rx-usecs-irq parameter.  Changing coalescing parameters required
explicitly setting the tx-usecs-irq parameter to 0.  Ethtool (to HEAD
of tree) does not do this and instead retrieves and re-submits the
current tx-usecs-irq value resulting in an unsupported operation
error.  I found this behaviour counter-intuitive and was only able to
work out correct moderation parameters by studying the driver code.

This patch relaxes the requirement to set tx-usecs-irq to 0 by only
erring if the presented tx-usecs-irq value differs from the current
value.  I acknowledge, however, that there may be existing scripts
relying on the old behaviour and so this condition is only triggered
if a value for tx-usecs-irq is actually presented.
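
The relaxed check can be sketched in isolation as follows. The rounding rule in mod_ticks (never round a positive value down to 0) is an assumption about the part of the driver helper not shown in the hunk below, and both function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Convert microseconds to timer ticks. Assumed to mirror the driver's
 * helper: non-positive input gives 0, small positive input gives 1. */
static unsigned mod_ticks(int usecs, int resolution)
{
	assert(resolution > 0);

	if (usecs <= 0)
		return 0;
	if (usecs < resolution)
		return 1;
	return (unsigned)(usecs / resolution);
}

/* Rule for a shared TX/RX channel after this patch: reject only when a
 * TX value is presented and it differs, in ticks, from the current one. */
static bool shared_channel_rejects(int tx_usecs, unsigned current_ticks,
				   int resolution)
{
	return tx_usecs != 0 &&
	       mod_ticks(tx_usecs, resolution) != current_ticks;
}
```

With a resolution of 5 and a current moderation of 12 ticks, re-submitting 60 usecs is accepted, 0 is still accepted as before, and 50 is rejected.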
---
 drivers/net/sfc/efx.c     |    6 +++---
 drivers/net/sfc/efx.h     |    1 +
 drivers/net/sfc/ethtool.c |    4 +++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index faca764..9a313cd 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1556,7 +1556,7 @@ static void efx_remove_all(struct efx_nic *efx)
  *
  **************************************************************************/
 
-static unsigned irq_mod_ticks(int usecs, int resolution)
+unsigned efx_irq_mod_ticks(int usecs, int resolution)
 {
 	if (usecs <= 0)
 		return 0; /* cannot receive interrupts ahead of time :-) */
@@ -1570,8 +1570,8 @@ void efx_init_irq_moderation(struct efx_nic *efx, int tx_usecs, int rx_usecs,
 			     bool rx_adaptive)
 {
 	struct efx_channel *channel;
-	unsigned tx_ticks = irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION);
-	unsigned rx_ticks = irq_mod_ticks(rx_usecs, EFX_IRQ_MOD_RESOLUTION);
+	unsigned tx_ticks = efx_irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION);
+	unsigned rx_ticks = efx_irq_mod_ticks(rx_usecs, EFX_IRQ_MOD_RESOLUTION);
 
 	EFX_ASSERT_RESET_SERIALISED(efx);
 
diff --git a/drivers/net/sfc/efx.h b/drivers/net/sfc/efx.h
index b0d1209..ddfcc7e 100644
--- a/drivers/net/sfc/efx.h
+++ b/drivers/net/sfc/efx.h
@@ -113,6 +113,7 @@ extern int efx_reset_up(struct efx_nic *efx, enum reset_type method, bool ok);
 extern void efx_schedule_reset(struct efx_nic *efx, enum reset_type type);
 extern void efx_init_irq_moderation(struct efx_nic *efx, int tx_usecs,
 				    int rx_usecs, bool rx_adaptive);
+extern unsigned efx_irq_mod_ticks(int usecs, int resolution);
 
 /* Dummy PHY ops for PHY drivers */
 extern int efx_port_dummy_op_int(struct efx_nic *efx);
diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index bc4643a..0a52447 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -644,7 +644,9 @@ static int efx_ethtool_set_coalesce(struct net_device *net_dev,
 	efx_for_each_channel(channel, efx) {
 		if (efx_channel_has_rx_queue(channel) &&
 		    efx_channel_has_tx_queues(channel) &&
-		    tx_usecs) {
+		    tx_usecs &&
+		    efx_irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION) !=
+		    channel->irq_moderation) {
 			netif_err(efx, drv, efx->net_dev, "Channel is shared. "
 				  "Only RX coalescing may be set\n");
 			return -EOPNOTSUPP;
-- 
1.7.1

* [PATCH 1/2] sfc: Remove requirement to set tx-usecs-irq for shared channels when modifying coalescing parameters
From: Ripduman Sohan @ 2011-08-31  9:52 UTC (permalink / raw)
  To: linux-net-drivers, shodgson, bhutchings; +Cc: netdev, Ripduman Sohan

Shared TX/RX channels possess a single channel timer controlled by the
rx-usecs-irq parameter.  Changing coalescing parameters required
explicitly setting the tx-usecs-irq parameter to 0.  Ethtool (to HEAD
of tree) does not do this and instead retrieves and re-submits the
current tx-usecs-irq value resulting in an unsupported operation
error.  I found this behaviour counter-intuitive and was only able to
work out correct moderation parameters by studying the driver code.

This patch relaxes the requirement to set tx-usecs-irq to 0 by only
erring if the presented tx-usecs-irq value differs from the current
value.  I acknowledge, however, that there may be existing scripts
relying on the old behaviour and so this condition is only triggered
if a value for tx-usecs-irq is actually presented.
---
 drivers/net/sfc/efx.c     |    6 +++---
 drivers/net/sfc/efx.h     |    1 +
 drivers/net/sfc/ethtool.c |    4 +++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index faca764..9a313cd 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1556,7 +1556,7 @@ static void efx_remove_all(struct efx_nic *efx)
  *
  **************************************************************************/
 
-static unsigned irq_mod_ticks(int usecs, int resolution)
+unsigned efx_irq_mod_ticks(int usecs, int resolution)
 {
 	if (usecs <= 0)
 		return 0; /* cannot receive interrupts ahead of time :-) */
@@ -1570,8 +1570,8 @@ void efx_init_irq_moderation(struct efx_nic *efx, int tx_usecs, int rx_usecs,
 			     bool rx_adaptive)
 {
 	struct efx_channel *channel;
-	unsigned tx_ticks = irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION);
-	unsigned rx_ticks = irq_mod_ticks(rx_usecs, EFX_IRQ_MOD_RESOLUTION);
+	unsigned tx_ticks = efx_irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION);
+	unsigned rx_ticks = efx_irq_mod_ticks(rx_usecs, EFX_IRQ_MOD_RESOLUTION);
 
 	EFX_ASSERT_RESET_SERIALISED(efx);
 
diff --git a/drivers/net/sfc/efx.h b/drivers/net/sfc/efx.h
index b0d1209..ddfcc7e 100644
--- a/drivers/net/sfc/efx.h
+++ b/drivers/net/sfc/efx.h
@@ -113,6 +113,7 @@ extern int efx_reset_up(struct efx_nic *efx, enum reset_type method, bool ok);
 extern void efx_schedule_reset(struct efx_nic *efx, enum reset_type type);
 extern void efx_init_irq_moderation(struct efx_nic *efx, int tx_usecs,
 				    int rx_usecs, bool rx_adaptive);
+extern unsigned efx_irq_mod_ticks(int usecs, int resolution);
 
 /* Dummy PHY ops for PHY drivers */
 extern int efx_port_dummy_op_int(struct efx_nic *efx);
diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index bc4643a..0a52447 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -644,7 +644,9 @@ static int efx_ethtool_set_coalesce(struct net_device *net_dev,
 	efx_for_each_channel(channel, efx) {
 		if (efx_channel_has_rx_queue(channel) &&
 		    efx_channel_has_tx_queues(channel) &&
-		    tx_usecs) {
+		    tx_usecs &&
+		    efx_irq_mod_ticks(tx_usecs, EFX_IRQ_MOD_RESOLUTION) !=
+		    channel->irq_moderation) {
 			netif_err(efx, drv, efx->net_dev, "Channel is shared. "
 				  "Only RX coalescing may be set\n");
 			return -EOPNOTSUPP;
-- 
1.7.1

* [PATCH 2/2] sfc: Report correct tx-usecs-irq value when driver is loaded with separate_tx_channels set
From: Ripduman Sohan @ 2011-08-31  9:53 UTC (permalink / raw)
  To: linux-net-drivers, shodgson, bhutchings; +Cc: netdev, Ripduman Sohan

If the driver is loaded with the separate_tx_channels parameter set it
incorrectly reports TX moderation as 0 usecs regardless of the current
value.  This patch fixes this oversight.
---
 drivers/net/sfc/ethtool.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 0a52447..dac0a84 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -600,11 +600,8 @@ static int efx_ethtool_get_coalesce(struct net_device *net_dev,
 		if (!efx_channel_has_tx_queues(channel))
 			continue;
 		if (channel->irq_moderation < coalesce->tx_coalesce_usecs_irq) {
-			if (channel->channel < efx->n_rx_channels)
-				coalesce->tx_coalesce_usecs_irq =
-					channel->irq_moderation;
-			else
-				coalesce->tx_coalesce_usecs_irq = 0;
+			coalesce->tx_coalesce_usecs_irq =
+				channel->irq_moderation;
 		}
 	}
 
-- 
1.7.1

* [PATCH 2/2] Report correct tx-usecs-irq value when driver is loaded with separate_tx_channels set
From: Ripduman Sohan @ 2011-08-31  9:38 UTC (permalink / raw)
  To: linux-net-drivers, shodgson, bhutchings; +Cc: netdev, Ripduman Sohan

If the driver is loaded with the separate_tx_channels parameter set it
incorrectly reports TX moderation as 0 usecs regardless of the current
value.  This patch fixes this oversight.
---
 drivers/net/sfc/ethtool.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 0a52447..dac0a84 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -600,11 +600,8 @@ static int efx_ethtool_get_coalesce(struct net_device *net_dev,
 		if (!efx_channel_has_tx_queues(channel))
 			continue;
 		if (channel->irq_moderation < coalesce->tx_coalesce_usecs_irq) {
-			if (channel->channel < efx->n_rx_channels)
-				coalesce->tx_coalesce_usecs_irq =
-					channel->irq_moderation;
-			else
-				coalesce->tx_coalesce_usecs_irq = 0;
+			coalesce->tx_coalesce_usecs_irq =
+				channel->irq_moderation;
 		}
 	}
 
-- 
1.7.1

^ permalink raw reply related

* [PATCH 1/1] net/can/af_can.c: Change del_timer to del_timer_sync
From: Rajan Aggarwal @ 2011-08-31  9:57 UTC (permalink / raw)
  To: Oliver Hartkopp, Urs Thuermann, David S. Miller; +Cc: netdev

From: Rajan Aggarwal <rajan.aggarwal85@gmail.com>

On SMP platforms the timer function may still be executing on another
CPU when the timer is deleted.  del_timer_sync() waits for a running
handler to finish before it returns, so use it in the module exit path.
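
The difference can be sketched in userspace with a thread standing in for the
timer handler. This is only an analogy under stated assumptions: struct
fake_timer and the fake_*() helpers are hypothetical names, pthread_join()
plays the role of del_timer_sync(), and none of this is the kernel timer API.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a kernel timer whose handler runs elsewhere. */
struct fake_timer {
	pthread_t thread;
	atomic_int running;  /* handler has started */
	atomic_int finished; /* handler has returned */
};

static void *fake_timer_handler(void *arg)
{
	struct fake_timer *t = arg;
	int i;

	atomic_store(&t->running, 1);
	for (i = 0; i < 1000; i++)   /* pretend to touch module state */
		sched_yield();
	atomic_store(&t->finished, 1);
	return NULL;
}

static int fake_timer_start(struct fake_timer *t)
{
	atomic_init(&t->running, 0);
	atomic_init(&t->finished, 0);
	return pthread_create(&t->thread, NULL, fake_timer_handler, t);
}

/* Analogue of del_timer_sync(): do not return while the handler runs. */
static void fake_del_timer_sync(struct fake_timer *t)
{
	pthread_join(t->thread, NULL);
}

/* Returns 1 if the handler had finished when teardown returned, -1 on error. */
static int teardown_waits_for_handler(void)
{
	struct fake_timer t;

	if (fake_timer_start(&t))
		return -1;
	while (!atomic_load(&t.running))
		sched_yield();       /* wait until the handler is in flight */
	fake_del_timer_sync(&t);
	return atomic_load(&t.finished);
}
```

A teardown that returned without waiting could free state the handler is
still using; the synchronous variant rules that out.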

Signed-off-by: Rajan Aggarwal <rajan.aggarwal85@gmail.com>
---
 net/can/af_can.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 8ce926d..9b0c32a 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -857,7 +857,7 @@ static __exit void can_exit(void)
 	struct net_device *dev;
 
 	if (stats_timer)
-		del_timer(&can_stattimer);
+		del_timer_sync(&can_stattimer);
 
 	can_remove_proc();
 
-- 
1.7.4.1

^ permalink raw reply related

* Re: [PATCH 2/7] bnx2x: remove the 'leading' arguments
From: Vlad Zolotarov @ 2011-08-31  9:57 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: netdev@vger.kernel.org, Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <1314714646-3642-3-git-send-email-mschmidt@redhat.com>

On Tuesday 30 August 2011 17:30:41 Michal Schmidt wrote:
> Whether a queue is leading can be deduced from its index.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x.h      |    1 +
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c  |    2 +-
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h  |    4 +---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |   21
> +++++++++------------ 4 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h index 735e491..c0d2d9c
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> @@ -531,6 +531,7 @@ struct bnx2x_fastpath {
> 
>  #define IS_ETH_FP(fp)			(fp->index < \
>  					 BNX2X_NUM_ETH_QUEUES(fp->bp))
> +#define IS_LEADING_FP(fp)		((fp)->index == 0)
>  #ifdef BCM_CNIC
>  #define IS_FCOE_FP(fp)			(fp->index == FCOE_IDX)
>  #define IS_FCOE_IDX(idx)		((idx) == FCOE_IDX)
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index 5c3eb17..448e301
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> @@ -1881,7 +1881,7 @@ int bnx2x_nic_load(struct bnx2x *bp, int load_mode)
>  #endif
> 
>  	for_each_nondefault_queue(bp, i) {
> -		rc = bnx2x_setup_queue(bp, &bp->fp[i], 0);
> +		rc = bnx2x_setup_queue(bp, &bp->fp[i]);
>  		if (rc)
>  			LOAD_ERROR_EXIT(bp, load_error4);
>  	}
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h index 5b1f9b5..54d50b7
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
> @@ -109,11 +109,9 @@ void bnx2x__init_func_obj(struct bnx2x *bp);
>   *
>   * @bp:		driver handle
>   * @fp:		pointer to the fastpath structure
> - * @leading:	boolean
>   *
>   */
> -int bnx2x_setup_queue(struct bnx2x *bp, struct bnx2x_fastpath *fp,
> -		       bool leading);
> +int bnx2x_setup_queue(struct bnx2x *bp, struct bnx2x_fastpath *fp);
> 
>  /**
>   * bnx2x_setup_leading - bring up a leading eth queue.
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index e7b584b..64314f7
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -2697,9 +2697,8 @@ static inline unsigned long
> bnx2x_get_common_flags(struct bnx2x *bp, return flags;
>  }
> 
> -static inline unsigned long bnx2x_get_q_flags(struct bnx2x *bp,
> -					      struct bnx2x_fastpath *fp,
> -					      bool leading)
> +static unsigned long bnx2x_get_q_flags(struct bnx2x *bp,
> +				       struct bnx2x_fastpath *fp)
>  {
>  	unsigned long flags = 0;
> 
> @@ -2715,7 +2714,7 @@ static inline unsigned long bnx2x_get_q_flags(struct
> bnx2x *bp, __set_bit(BNX2X_Q_FLG_TPA_IPV6, &flags);
>  	}
> 
> -	if (leading) {
> +	if (IS_LEADING_FP(fp)) {
>  		__set_bit(BNX2X_Q_FLG_LEADING_RSS, &flags);
>  		__set_bit(BNX2X_Q_FLG_MCAST, &flags);
>  	}
> @@ -6966,7 +6965,7 @@ int bnx2x_set_eth_mac(struct bnx2x *bp, bool set)
> 
>  int bnx2x_setup_leading(struct bnx2x *bp)
>  {
> -	return bnx2x_setup_queue(bp, &bp->fp[0], 1);
> +	return bnx2x_setup_queue(bp, &bp->fp[0]);
>  }
> 
>  /**
> @@ -7177,10 +7176,10 @@ static inline void bnx2x_pf_q_prep_init(struct
> bnx2x *bp, &bp->context.vcxt[fp->txdata[cos].cid].eth;
>  }
> 
> -int bnx2x_setup_tx_only(struct bnx2x *bp, struct bnx2x_fastpath *fp,
> +static int bnx2x_setup_tx_only(struct bnx2x *bp, struct bnx2x_fastpath
> *fp, struct bnx2x_queue_state_params *q_params,
> 			struct bnx2x_queue_setup_tx_only_params *tx_only_params,
> -			int tx_index, bool leading)
> +			int tx_index)
>  {
>  	memset(tx_only_params, 0, sizeof(*tx_only_params));
> 
> @@ -7216,14 +7215,12 @@ int bnx2x_setup_tx_only(struct bnx2x *bp, struct
> bnx2x_fastpath *fp, *
>   * @bp:		driver handle
>   * @fp:		pointer to fastpath
> - * @leading:	is leading
>   *
>   * This function performs 2 steps in a Queue state machine
>   *      actually: 1) RESET->INIT 2) INIT->SETUP
>   */
> 
> -int bnx2x_setup_queue(struct bnx2x *bp, struct bnx2x_fastpath *fp,
> -		       bool leading)
> +int bnx2x_setup_queue(struct bnx2x *bp, struct bnx2x_fastpath *fp)
>  {
>  	struct bnx2x_queue_state_params q_params = {0};
>  	struct bnx2x_queue_setup_params *setup_params =
> @@ -7264,7 +7261,7 @@ int bnx2x_setup_queue(struct bnx2x *bp, struct
> bnx2x_fastpath *fp, memset(setup_params, 0, sizeof(*setup_params));
> 
>  	/* Set QUEUE flags */
> -	setup_params->flags = bnx2x_get_q_flags(bp, fp, leading);
> +	setup_params->flags = bnx2x_get_q_flags(bp, fp);
> 
>  	/* Set general SETUP parameters */
>  	bnx2x_pf_q_prep_general(bp, fp, &setup_params->gen_params,
> @@ -7293,7 +7290,7 @@ int bnx2x_setup_queue(struct bnx2x *bp, struct
> bnx2x_fastpath *fp,
> 
>  		/* prepare and send tx-only ramrod*/
>  		rc = bnx2x_setup_tx_only(bp, fp, &q_params,
> -					  tx_only_params, tx_index, leading);
> +					  tx_only_params, tx_index);
>  		if (rc) {
>  			BNX2X_ERR("Queue(%d.%d) TX_ONLY_SETUP failed\n",
>  				  fp->index, tx_index);

NACK

Removing this parameter would decrease the flexibility of our code.
For instance, we use this function in our KVM code, which is currently under
development, where we define a few RSS groups on the same PF, and there the
"leading" fp may have an index other than 0.
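
A rough sketch of the concern, assuming a hypothetical layout in which each
RSS group records the index of its first queue. struct demo_fp and both
helpers are made up for illustration and are not the bnx2x definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fastpath descriptor; rss_group_base is an assumed field. */
struct demo_fp {
	int index;          /* queue index within the PF */
	int rss_group_base; /* index of the first queue in this RSS group */
};

/* What IS_LEADING_FP() in the quoted patch amounts to. */
static bool demo_is_leading_by_index(const struct demo_fp *fp)
{
	return fp->index == 0;
}

/* With several RSS groups per PF, each group has its own leading queue. */
static bool demo_is_leading_in_group(const struct demo_fp *fp)
{
	return fp->index == fp->rss_group_base;
}
```

For a queue with index 4 that opens a second RSS group, the index-based test
says "not leading" while the per-group test says "leading".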

It's a shame to remove this code now in order to submit it back later...

thanks,
vlad

^ permalink raw reply

* Re: [PATCH 1/7] bnx2x: remove unused fields in struct bnx2x_func_init_params
From: Vlad Zolotarov @ 2011-08-31 10:07 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: netdev@vger.kernel.org, Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <1314714646-3642-2-git-send-email-mschmidt@redhat.com>

On Tuesday 30 August 2011 17:30:40 Michal Schmidt wrote:
> func_flgs is not used for anything. The only flag that's ever checked
> (FUNC_FLG_SPQ) is always set. The other flags are never read.
> 
> fw_stat_map is not used at all.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x.h      |   15 ++-------------
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |   18 +++---------------
>  2 files changed, 5 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h index f127768..735e491
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> @@ -1490,24 +1490,13 @@ extern int num_queues;
>  #define RSS_IPV6_TCP_CAP_MASK						\
>  	TSTORM_ETH_FUNCTION_COMMON_CONFIG_RSS_IPV6_TCP_CAPABILITY
> 
> -/* func init flags */
> -#define FUNC_FLG_RSS		0x0001
> -#define FUNC_FLG_STATS		0x0002
> -/* removed  FUNC_FLG_UNMATCHED	0x0004 */
> -#define FUNC_FLG_TPA		0x0008
> -#define FUNC_FLG_SPQ		0x0010
> -#define FUNC_FLG_LEADING	0x0020	/* PF only */
> -
> -
>  struct bnx2x_func_init_params {
>  	/* dma */
> -	dma_addr_t	fw_stat_map;	/* valid iff FUNC_FLG_STATS */
> -	dma_addr_t	spq_map;	/* valid iff FUNC_FLG_SPQ */
> +	dma_addr_t	spq_map;
> 
> -	u16		func_flgs;
>  	u16		func_id;	/* abs fid */
>  	u16		pf_id;
> -	u16		spq_prod;	/* valid iff FUNC_FLG_SPQ */
> +	u16		spq_prod;
>  };
> 
>  #define for_each_eth_queue(bp, var) \
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 85dd294..e7b584b
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -2661,11 +2661,9 @@ void bnx2x_func_init(struct bnx2x *bp, struct
> bnx2x_func_init_params *p) storm_memset_func_en(bp, p->func_id, 1);
> 
>  	/* spq */
> -	if (p->func_flgs & FUNC_FLG_SPQ) {
> -		storm_memset_spq_addr(bp, p->spq_map, p->func_id);
> -		REG_WR(bp, XSEM_REG_FAST_MEMORY +
> -		       XSTORM_SPQ_PROD_OFFSET(p->func_id), p->spq_prod);
> -	}
> +	storm_memset_spq_addr(bp, p->spq_map, p->func_id);
> +	REG_WR(bp, XSEM_REG_FAST_MEMORY +
> +	       XSTORM_SPQ_PROD_OFFSET(p->func_id), p->spq_prod);
>  }
> 
>  /**
> @@ -2838,7 +2836,6 @@ static void bnx2x_pf_init(struct bnx2x *bp)
>  {
>  	struct bnx2x_func_init_params func_init = {0};
>  	struct event_ring_data eq_data = { {0} };
> -	u16 flags;
> 
>  	if (!CHIP_IS_E1x(bp)) {
>  		/* reset IGU PF statistics: MSIX + ATTN */
> @@ -2855,15 +2852,6 @@ static void bnx2x_pf_init(struct bnx2x *bp)
>  				BP_FUNC(bp) : BP_VN(bp))*4, 0);
>  	}
> 
> -	/* function setup flags */
> -	flags = (FUNC_FLG_STATS | FUNC_FLG_LEADING | FUNC_FLG_SPQ);
> -
> -	/* This flag is relevant for E1x only.
> -	 * E2 doesn't have a TPA configuration in a function level.
> -	 */
> -	flags |= (bp->flags & TPA_ENABLE_FLAG) ? FUNC_FLG_TPA : 0;
> -
> -	func_init.func_flgs = flags;
>  	func_init.pf_id = BP_FUNC(bp);
>  	func_init.func_id = BP_FUNC(bp);
>  	func_init.spq_map = bp->spq_mapping;

Acked-by: Vladislav Zolotarov <vladz@broadcom.com>

^ permalink raw reply

* Re: [PATCH 06/24] netfilter: Remove unnecessary OOM logging messages
From: Patrick McHardy @ 2011-08-31 10:13 UTC (permalink / raw)
  To: David Miller
  Cc: joe, bart.de.schuymer, wensong, horms, ja, shemminger, kuznet,
	jmorris, yoshfuji, netfilter-devel, netfilter, coreteam, bridge,
	netdev, linux-kernel, lvs-devel
In-Reply-To: <20110830.135502.179848097213434762.davem@davemloft.net>

On 30.08.2011 19:55, David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue, 30 Aug 2011 14:46:34 +0200
> 
>> On 29.08.2011 23:17, Joe Perches wrote:
>>> Removing unnecessary messages saves code and text.
>>>
>>> Site specific OOM messages are duplications of a generic MM
>>> out of memory message and aren't really useful, so just
>>> delete them.
>>
>> Looks good to me. Do you want me to apply this patch or are you
>> intending to have the entire series go through Dave?
> 
> I'm happy with subsystem folks taking things in if they want, the
> B.A.T.M.A.N. guys did this earlier today for example.

OK, thanks.

Applied after fixing up some minor rejects in nf_nat_snmp_basic.c,
thanks Joe.

^ permalink raw reply

* Re: [PATCH 4/7] bnx2x: simplify TPA sanity check
From: Vlad Zolotarov @ 2011-08-31 10:22 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: netdev@vger.kernel.org, Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <1314714646-3642-5-git-send-email-mschmidt@redhat.com>

On Tuesday 30 August 2011 17:30:43 Michal Schmidt wrote:
> In the TPA branch we already know the CQE type is either START or STOP.
> No need to test for that. Even if the type were to differ, we wouldn't
> want to suppress the error message.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index f1fea58..fe5be0c
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> @@ -634,9 +634,7 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
>  		if (!CQE_TYPE_FAST(cqe_fp_type)) {
>  #ifdef BNX2X_STOP_ON_ERROR
>  			/* sanity check */
> -			if (fp->disable_tpa &&
> -			    (CQE_TYPE_START(cqe_fp_type) ||
> -			     CQE_TYPE_STOP(cqe_fp_type)))
> +			if (fp->disable_tpa)
>  				BNX2X_ERR("START/STOP packet while "
>  					  "disable_tpa type %x\n",
>  					  CQE_TYPE(cqe_fp_type));

Acked-by: Vladislav Zolotarov <vladz@broadcom.com>

^ permalink raw reply

* Re: [PATCH 3/7] bnx2x: decrease indentation in bnx2x_rx_int()
From: Vlad Zolotarov @ 2011-08-31 10:33 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: netdev@vger.kernel.org, Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <1314714646-3642-4-git-send-email-mschmidt@redhat.com>

On Tuesday 30 August 2011 17:30:42 Michal Schmidt wrote:
> For better readability decrease the indentation in bnx2x_rx_int().
> 'else' is unnecessary when the positive branch ends with a 'goto'.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Acked-by: Dmitry Kravkov <dmitry@broadcom.com>

> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |  194
> +++++++++++------------ 1 files changed, 92 insertions(+), 102
> deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index 448e301..f1fea58
> 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> @@ -624,135 +624,125 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int
> budget) if (unlikely(CQE_TYPE_SLOW(cqe_fp_type))) {
>  			bnx2x_sp_event(fp, cqe);
>  			goto next_cqe;
> +		}
> 
>  		/* this is an rx packet */
> -		} else {
> -			rx_buf = &fp->rx_buf_ring[bd_cons];
> -			skb = rx_buf->skb;
> -			prefetch(skb);
> +		rx_buf = &fp->rx_buf_ring[bd_cons];
> +		skb = rx_buf->skb;
> +		prefetch(skb);
> 
> -			if (!CQE_TYPE_FAST(cqe_fp_type)) {
> +		if (!CQE_TYPE_FAST(cqe_fp_type)) {
>  #ifdef BNX2X_STOP_ON_ERROR
> -				/* sanity check */
> -				if (fp->disable_tpa &&
> -				    (CQE_TYPE_START(cqe_fp_type) ||
> -				     CQE_TYPE_STOP(cqe_fp_type)))
> -					BNX2X_ERR("START/STOP packet while "
> -						  "disable_tpa type %x\n",
> -						  CQE_TYPE(cqe_fp_type));
> +			/* sanity check */
> +			if (fp->disable_tpa &&
> +			    (CQE_TYPE_START(cqe_fp_type) ||
> +			     CQE_TYPE_STOP(cqe_fp_type)))
> +				BNX2X_ERR("START/STOP packet while "
> +					  "disable_tpa type %x\n",
> +					  CQE_TYPE(cqe_fp_type));
>  #endif
> 
> -				if (CQE_TYPE_START(cqe_fp_type)) {
> -					u16 queue = cqe_fp->queue_index;
> -					DP(NETIF_MSG_RX_STATUS,
> -					   "calling tpa_start on queue %d\n",
> -					   queue);
> +			if (CQE_TYPE_START(cqe_fp_type)) {
> +				u16 queue = cqe_fp->queue_index;
> +				DP(NETIF_MSG_RX_STATUS,
> +				   "calling tpa_start on queue %d\n", queue);
> 
> -					bnx2x_tpa_start(fp, queue, skb,
> -							bd_cons, bd_prod,
> -							cqe_fp);
> +				bnx2x_tpa_start(fp, queue, skb,
> +						bd_cons, bd_prod, cqe_fp);
> 
> -					/* Set Toeplitz hash for LRO skb */
> -					bnx2x_set_skb_rxhash(bp, cqe, skb);
> +				/* Set Toeplitz hash for LRO skb */
> +				bnx2x_set_skb_rxhash(bp, cqe, skb);
> 
> -					goto next_rx;
> +				goto next_rx;
> 
> -				} else {
> -					u16 queue =
> -						cqe->end_agg_cqe.queue_index;
> -					DP(NETIF_MSG_RX_STATUS,
> -					   "calling tpa_stop on queue %d\n",
> -					   queue);
> +			} else {
> +				u16 queue =
> +					cqe->end_agg_cqe.queue_index;
> +				DP(NETIF_MSG_RX_STATUS,
> +				   "calling tpa_stop on queue %d\n", queue);
> 
> -					bnx2x_tpa_stop(bp, fp, queue,
> -						       &cqe->end_agg_cqe,
> -						       comp_ring_cons);
> +				bnx2x_tpa_stop(bp, fp, queue, &cqe->end_agg_cqe,
> +					       comp_ring_cons);
>  #ifdef BNX2X_STOP_ON_ERROR
> -					if (bp->panic)
> -						return 0;
> +				if (bp->panic)
> +					return 0;
>  #endif
> 
> -					bnx2x_update_sge_prod(fp, cqe_fp);
> -					goto next_cqe;
> -				}
> +				bnx2x_update_sge_prod(fp, cqe_fp);
> +				goto next_cqe;
>  			}
> -			/* non TPA */
> -			len = le16_to_cpu(cqe_fp->pkt_len);
> -			pad = cqe_fp->placement_offset;
> -			dma_sync_single_for_cpu(&bp->pdev->dev,
> +		}
> +		/* non TPA */
> +		len = le16_to_cpu(cqe_fp->pkt_len);
> +		pad = cqe_fp->placement_offset;
> +		dma_sync_single_for_cpu(&bp->pdev->dev,
>  					dma_unmap_addr(rx_buf, mapping),
> -						       pad + RX_COPY_THRESH,
> -						       DMA_FROM_DEVICE);
> -			prefetch(((char *)(skb)) + L1_CACHE_BYTES);
> +					pad + RX_COPY_THRESH, DMA_FROM_DEVICE);
> +		prefetch(((char *)(skb)) + L1_CACHE_BYTES);
> 
> -			/* is this an error packet? */
> -			if (unlikely(cqe_fp_flags & ETH_RX_ERROR_FALGS)) {
> +		/* is this an error packet? */
> +		if (unlikely(cqe_fp_flags & ETH_RX_ERROR_FALGS)) {
> +			DP(NETIF_MSG_RX_ERR, "ERROR  flags %x  rx packet %u\n",
> +			   cqe_fp_flags, sw_comp_cons);
> +			fp->eth_q_stats.rx_err_discard_pkt++;
> +			goto reuse_rx;
> +		}
> +
> +		/*
> +		 * Since we don't have a jumbo ring,
> +		 * copy small packets if mtu > 1500
> +		 */
> +		if ((bp->dev->mtu > ETH_MAX_PACKET_SIZE) &&
> +		    (len <= RX_COPY_THRESH)) {
> +			struct sk_buff *new_skb;
> +
> +			new_skb = netdev_alloc_skb(bp->dev, len + pad);
> +			if (new_skb == NULL) {
>  				DP(NETIF_MSG_RX_ERR,
> -				   "ERROR  flags %x  rx packet %u\n",
> -				   cqe_fp_flags, sw_comp_cons);
> -				fp->eth_q_stats.rx_err_discard_pkt++;
> +				   "ERROR  packet dropped "
> +				   "because of alloc failure\n");
> +				fp->eth_q_stats.rx_skb_alloc_failed++;
>  				goto reuse_rx;
>  			}
> 
> -			/* Since we don't have a jumbo ring
> -			 * copy small packets if mtu > 1500
> -			 */
> -			if ((bp->dev->mtu > ETH_MAX_PACKET_SIZE) &&
> -			    (len <= RX_COPY_THRESH)) {
> -				struct sk_buff *new_skb;
> -
> -				new_skb = netdev_alloc_skb(bp->dev, len + pad);
> -				if (new_skb == NULL) {
> -					DP(NETIF_MSG_RX_ERR,
> -					   "ERROR  packet dropped "
> -					   "because of alloc failure\n");
> -					fp->eth_q_stats.rx_skb_alloc_failed++;
> -					goto reuse_rx;
> -				}
> -
> -				/* aligned copy */
> -				skb_copy_from_linear_data_offset(skb, pad,
> -						    new_skb->data + pad, len);
> -				skb_reserve(new_skb, pad);
> -				skb_put(new_skb, len);
> +			/* aligned copy */
> +			skb_copy_from_linear_data_offset(skb, pad,
> +						new_skb->data + pad, len);
> +			skb_reserve(new_skb, pad);
> +			skb_put(new_skb, len);
> 
> -				bnx2x_reuse_rx_skb(fp, bd_cons, bd_prod);
> +			bnx2x_reuse_rx_skb(fp, bd_cons, bd_prod);
> 
> -				skb = new_skb;
> +			skb = new_skb;
> 
> -			} else
> -			if (likely(bnx2x_alloc_rx_skb(bp, fp, bd_prod) == 0)) {
> -				dma_unmap_single(&bp->pdev->dev,
> -					dma_unmap_addr(rx_buf, mapping),
> -						 fp->rx_buf_size,
> -						 DMA_FROM_DEVICE);
> -				skb_reserve(skb, pad);
> -				skb_put(skb, len);
> +		} else if (likely(bnx2x_alloc_rx_skb(bp, fp, bd_prod) == 0)) {
> +			dma_unmap_single(&bp->pdev->dev,
> +					 dma_unmap_addr(rx_buf, mapping),
> +					 fp->rx_buf_size, DMA_FROM_DEVICE);
> +			skb_reserve(skb, pad);
> +			skb_put(skb, len);
> 
> -			} else {
> -				DP(NETIF_MSG_RX_ERR,
> -				   "ERROR  packet dropped because "
> -				   "of alloc failure\n");
> -				fp->eth_q_stats.rx_skb_alloc_failed++;
> +		} else {
> +			DP(NETIF_MSG_RX_ERR,
> +			   "ERROR  packet dropped because of alloc failure\n");
> +			fp->eth_q_stats.rx_skb_alloc_failed++;
>  reuse_rx:
> -				bnx2x_reuse_rx_skb(fp, bd_cons, bd_prod);
> -				goto next_rx;
> -			}
> -
> -			skb->protocol = eth_type_trans(skb, bp->dev);
> +			bnx2x_reuse_rx_skb(fp, bd_cons, bd_prod);
> +			goto next_rx;
> +		}
> 
> -			/* Set Toeplitz hash for a none-LRO skb */
> -			bnx2x_set_skb_rxhash(bp, cqe, skb);
> +		skb->protocol = eth_type_trans(skb, bp->dev);
> 
> -			skb_checksum_none_assert(skb);
> +		/* Set Toeplitz hash for a none-LRO skb */
> +		bnx2x_set_skb_rxhash(bp, cqe, skb);
> 
> -			if (bp->dev->features & NETIF_F_RXCSUM) {
> +		skb_checksum_none_assert(skb);
> 
> -				if (likely(BNX2X_RX_CSUM_OK(cqe)))
> -					skb->ip_summed = CHECKSUM_UNNECESSARY;
> -				else
> -					fp->eth_q_stats.hw_csum_err++;
> -			}
> +		if (bp->dev->features & NETIF_F_RXCSUM) {
> +			if (likely(BNX2X_RX_CSUM_OK(cqe)))
> +				skb->ip_summed = CHECKSUM_UNNECESSARY;
> +			else
> +				fp->eth_q_stats.hw_csum_err++;
>  		}
> 
>  		skb_record_rx_queue(skb, fp->index);

^ permalink raw reply

* [PATCH 0/14] skb fragment API: convert network drivers (part II)
From: Ian Campbell @ 2011-08-31 10:46 UTC (permalink / raw)
  To: netdev@vger.kernel.org

The following series converts the second batch of network drivers to the
SKB paged fragment API introduced in 131ea6675c76. I expect it will take
~4 similarly sized batches to convert all the drivers.

This is part of my series to enable visibility into the lifecycle of SKB
paged fragments; [0] contains more background and rationale. Basically,
the completed series will allow entities which inject pages into the
networking stack to receive a notification when the stack has really
finished with those pages (i.e. including retransmissions, clones,
pull-ups etc.) and not just when the original skb is finished with. This
is beneficial to the many subsystems which wish to inject pages into the
network stack without giving up full ownership of those pages'
lifecycle. It implements something broadly along the lines of what was
described in [1].
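
As a rough illustration of why drivers are moved to accessors instead of
touching the page pointer directly, consider the userspace sketch below. The
demo_* types and helpers are hypothetical; the real helpers used in this
series are skb_frag_page(), skb_frag_address() and skb_frag_dma_map(), and
the wrapped layout shown is only an assumption about how the field could
change later.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's struct page or skb_frag_t. */
struct demo_page {
	int id;
};

struct demo_frag {
	/*
	 * The page pointer is wrapped so that direct `frag->page` users
	 * stop compiling, while accessor users are unaffected.
	 */
	struct {
		struct demo_page *p;
	} page;
	unsigned int page_offset;
	unsigned int size;
};

/* Accessor in the spirit of skb_frag_page(). */
static struct demo_page *demo_frag_page(const struct demo_frag *frag)
{
	return frag->page.p;
}

/* Setter counterpart; callers never see how the page is stored. */
static void demo_frag_set_page(struct demo_frag *frag, struct demo_page *page)
{
	frag->page.p = page;
}
```

Once every driver goes through such helpers, the fragment's representation
can grow extra bookkeeping without another tree-wide conversion.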

Cheers,
Ian.

[0] http://marc.info/?l=linux-netdev&m=131072801125521&w=2
[1] http://marc.info/?l=linux-netdev&m=130925719513084&w=2

^ permalink raw reply

* [PATCH 01/14] ibmveth: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-31 10:46 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Santiago Leon
In-Reply-To: <1314787608.28989.41.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Santiago Leon <santil@linux.vnet.ibm.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/ibm/ibmveth.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index bba1ffc..8cca4a6 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1002,9 +1002,8 @@ retry_bounce:
 		unsigned long dma_addr;
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		dma_addr = dma_map_page(&adapter->vdev->dev, frag->page,
-					frag->page_offset, frag->size,
-					DMA_TO_DEVICE);
+		dma_addr = skb_frag_dma_map(&adapter->vdev->dev, frag, 0,
+					    frag->size, DMA_TO_DEVICE);
 
 		if (dma_mapping_error(&adapter->vdev->dev, dma_addr))
 			goto map_failed_frags;
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 02/14] jme: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-31 10:46 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell, Guo-Fu Tseng
In-Reply-To: <1314787608.28989.41.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Guo-Fu Tseng <cooldavid@cooldavid.org>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/jme.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index a869ee4..48a0a23 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -1928,8 +1928,9 @@ jme_map_tx_skb(struct jme_adapter *jme, struct sk_buff *skb, int idx)
 		ctxdesc = txdesc + ((idx + i + 2) & (mask));
 		ctxbi = txbi + ((idx + i + 2) & (mask));
 
-		jme_fill_tx_map(jme->pdev, ctxdesc, ctxbi, frag->page,
-				 frag->page_offset, frag->size, hidma);
+		jme_fill_tx_map(jme->pdev, ctxdesc, ctxbi,
+				skb_frag_page(frag),
+				frag->page_offset, frag->size, hidma);
 	}
 
 	len = skb_is_nonlinear(skb) ? skb_headlen(skb) : skb->len;
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 03/14] ksz884x: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-31 10:46 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell
In-Reply-To: <1314787608.28989.41.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/micrel/ksz884x.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c
index 27418d3..710c4ae 100644
--- a/drivers/net/ethernet/micrel/ksz884x.c
+++ b/drivers/net/ethernet/micrel/ksz884x.c
@@ -4704,8 +4704,7 @@ static void send_packet(struct sk_buff *skb, struct net_device *dev)
 
 			dma_buf->dma = pci_map_single(
 				hw_priv->pdev,
-				page_address(this_frag->page) +
-				this_frag->page_offset,
+				skb_frag_address(this_frag),
 				dma_buf->len,
 				PCI_DMA_TODEVICE);
 			set_tx_buf(desc, dma_buf->dma);
-- 
1.7.2.5

^ permalink raw reply related

* [PATCH 04/14] macvtap: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-31 10:46 UTC (permalink / raw)
  To: netdev; +Cc: Ian Campbell
In-Reply-To: <1314787608.28989.41.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/macvtap.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index ab96c31..7c3f84a 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -503,10 +503,10 @@ static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
 		skb->truesize += len;
 		atomic_add(len, &skb->sk->sk_wmem_alloc);
 		while (len) {
-			f = &skb_shinfo(skb)->frags[i];
-			f->page = page[i];
-			f->page_offset = base & ~PAGE_MASK;
-			f->size = min_t(int, len, PAGE_SIZE - f->page_offset);
+			__skb_fill_page_desc(
+				skb, i, page[i],
+				base & ~PAGE_MASK,
+				min_t(int, len, PAGE_SIZE - f->page_offset));
 			skb_shinfo(skb)->nr_frags++;
 			/* increase sk_wmem_alloc */
 			base += f->size;
-- 
1.7.2.5
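
Note that in the hunk as posted the assignment to `f` is removed, while
`f->page_offset` and `f->size` still appear to be read. One way to avoid
depending on `f` is to compute the offset and size into locals before filling
the descriptor. The sketch below shows that arithmetic in userspace; the
DEMO_* constants, struct demo_frag and demo_fill_frags() are hypothetical and
this is not the macvtap code.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Hypothetical fragment descriptor holding only offset and size. */
struct demo_frag {
	unsigned long page_offset;
	unsigned long size;
};

/*
 * Split the byte range [base, base + len) into per-page fragments.
 * Offset and size are computed into locals first, so nothing reads a
 * descriptor before it has been filled.  Returns the number of
 * fragments written (at most max).
 */
static int demo_fill_frags(struct demo_frag *frags, int max,
			   unsigned long base, unsigned long len)
{
	int i = 0;

	while (len && i < max) {
		unsigned long off = base & ~DEMO_PAGE_MASK;
		unsigned long room = DEMO_PAGE_SIZE - off;
		unsigned long size = len < room ? len : room;

		frags[i].page_offset = off;
		frags[i].size = size;
		base += size;
		len -= size;
		i++;
	}
	return i;
}
```

For example, 5000 bytes starting at offset 4000 split into fragments of 96,
4096 and 808 bytes.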

^ permalink raw reply related

