Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Marc Kleine-Budde @ 2010-10-29 17:58 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Masayuki Ohtake,
	Samuel Ortiz, margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Wolfgang Grandegger,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, David S. Miller,
	Christian Pellegrin, qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CCAFA68.2030903-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 2851 bytes --]

On 10/29/2010 06:46 PM, Oliver Hartkopp wrote:
> Indeed the access to the data registers does not seem to be endian aware.
> 
> See in pch_can_rx_normal():
> 
> +			cf->can_dlc = get_can_dlc((ioread32(&priv->regs->
> +						   if1_mcont)) & 0xF);
> +			*(u16 *)(cf->data + 0) = ioread16(&priv->regs->
> +							  if1_dataa1);
> +			*(u16 *)(cf->data + 2) = ioread16(&priv->regs->
> +							  if1_dataa2);
> +			*(u16 *)(cf->data + 4) = ioread16(&priv->regs->
> +							  if1_datab1);
> +			*(u16 *)(cf->data + 6) = ioread16(&priv->regs->
> +							  if1_datab2);
> 
> See in pch_xmit():
> 
> +	/* Copy data to register */
> +	if (cf->can_dlc > 0) {
> +		u32 data1 = *((u16 *)&cf->data[0]);
> +		iowrite32(data1, &priv->regs->if2_dataa1);
> +	}
> +	if (cf->can_dlc > 2) {
> +		u32 data1 = *((u16 *)&cf->data[2]);
> +		iowrite32(data1, &priv->regs->if2_dataa2);
> +	}
> +	if (cf->can_dlc > 4) {
> +		u32 data1 = *((u16 *)&cf->data[4]);
> +		iowrite32(data1, &priv->regs->if2_datab1);
> +	}
> +	if (cf->can_dlc > 6) {
> +		u32 data1 = *((u16 *)&cf->data[6]);
> +		iowrite32(data1, &priv->regs->if2_datab2);
> +	}
> +
> 
> It's just a question if the driver for an Intel Chipset should/could ignore
> the endian problematic.
> 
> If this is intended the driver should only appear in Kconfig depending on X86
> or little endian systems.
> 
> Besides this remark, the struct pch_can_regs could also be defined in a way
> that every single CAN payload data byte could be accessed directly to allow

That _should_ work on x86. On the contrary on ARM some registers in SOCs
allow only full 32 bit access.

> e.g. this code in pch_can_rx_normal():
> 
> cf->data[0] = ioread8(&priv->regs->if1_data0);
> cf->data[1] = ioread8(&priv->regs->if1_data1);
> cf->data[2] = ioread8(&priv->regs->if1_data2);
> (..)
> cf->data[6] = ioread8(&priv->regs->if1_data6);
> cf->data[7] = ioread8(&priv->regs->if1_data7);

> This is easy to understand and additionally endian aware.

or use this, (as proposed in a earlier mail)

	for (i = 0; i < cf->can_dlc; i += 2) {
		reg = ioread32(&priv->regs->if1_data[i >> 1]);
		*(__be16 *)cf->data[i] = cpu_to_be16(reg);
	}

> In opposite to this:
> 
> +	if (cf->can_dlc > 2) {
> +		u32 data1 = *((u16 *)&cf->data[2]);
> +		iowrite32(data1, &priv->regs->if2_dataa2);
> +	}
> 
> Which puts a little? endian u16 into the big endian register layout.
> Sorry i'm just getting curious on this register access implementation.

According to the datasheet the layout is LE.

cheers, Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* Re: [PATCH] netfilter: fix nf_conntrack_l4proto_register()
From: Patrick McHardy @ 2010-10-29 18:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Netfilter Development Mailinglist
In-Reply-To: <1288375022.2560.306.camel@edumazet-laptop>

Am 29.10.2010 19:57, schrieb Eric Dumazet:
> While doing __rcu annotations work on net/netfilter I found following
> bug. On some arches, it is possible we publish a table while its content
> is not yet committed to memory, and lockless reader can dereference wild
> pointer.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH 1/2] vmxnet3: remove unnecessary byteswapping in BAR writing macros
From: Shreyas Bhatewara @ 2010-10-29 18:17 UTC (permalink / raw)
  To: Harvey Harrison; +Cc: netdev@vger.kernel.org, pv-drivers
In-Reply-To: <1288235555-24675-1-git-send-email-harvey.harrison@gmail.com>

Harvey,
Thanks for working on this. The change looks good.

Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>


On Wed, 27 Oct 2010, Harvey Harrison wrote:

> readl/writel swap to little-endian internally.
> 
> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_int.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
> index 8a2f471..edf2288 100644
> --- a/drivers/net/vmxnet3/vmxnet3_int.h
> +++ b/drivers/net/vmxnet3/vmxnet3_int.h
> @@ -330,14 +330,14 @@ struct vmxnet3_adapter {
>  };
>  
>  #define VMXNET3_WRITE_BAR0_REG(adapter, reg, val)  \
> -	writel(cpu_to_le32(val), (adapter)->hw_addr0 + (reg))
> +	writel((val), (adapter)->hw_addr0 + (reg))
>  #define VMXNET3_READ_BAR0_REG(adapter, reg)        \
> -	le32_to_cpu(readl((adapter)->hw_addr0 + (reg)))
> +	readl((adapter)->hw_addr0 + (reg))
>  
>  #define VMXNET3_WRITE_BAR1_REG(adapter, reg, val)  \
> -	writel(cpu_to_le32(val), (adapter)->hw_addr1 + (reg))
> +	writel((val), (adapter)->hw_addr1 + (reg))
>  #define VMXNET3_READ_BAR1_REG(adapter, reg)        \
> -	le32_to_cpu(readl((adapter)->hw_addr1 + (reg)))
> +	readl((adapter)->hw_addr1 + (reg))
>  
>  #define VMXNET3_WAKE_QUEUE_THRESHOLD(tq)  (5)
>  #define VMXNET3_RX_ALLOC_THRESHOLD(rq, ring_idx, adapter) \
> -- 
> 1.7.1
> 
> 

^ permalink raw reply

* Re: [PATCH 2/2] vmxnet: trivial annotation of protocol constant
From: Shreyas Bhatewara @ 2010-10-29 18:20 UTC (permalink / raw)
  To: Harvey Harrison; +Cc: netdev@vger.kernel.org, pv-drivers
In-Reply-To: <1288235555-24675-2-git-send-email-harvey.harrison@gmail.com>


Thanks.

Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>


On Wed, 27 Oct 2010, Harvey Harrison wrote:

> Noticed by sparse:
> drivers/net/vmxnet3/vmxnet3_drv.c:876:38: warning: cast from restricted __be16
> drivers/net/vmxnet3/vmxnet3_drv.c:876:38: warning: cast from restricted __be16
> drivers/net/vmxnet3/vmxnet3_drv.c:876:24: warning: restricted __be16 degrades to integer
> 
> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_drv.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
> index e3658e1..21314e0 100644
> --- a/drivers/net/vmxnet3/vmxnet3_drv.c
> +++ b/drivers/net/vmxnet3/vmxnet3_drv.c
> @@ -873,7 +873,7 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
>  	count = VMXNET3_TXD_NEEDED(skb_headlen(skb)) +
>  		skb_shinfo(skb)->nr_frags + 1;
>  
> -	ctx.ipv4 = (skb->protocol == __constant_ntohs(ETH_P_IP));
> +	ctx.ipv4 = (skb->protocol == cpu_to_be16(ETH_P_IP));
>  
>  	ctx.mss = skb_shinfo(skb)->gso_size;
>  	if (ctx.mss) {
> -- 
> 1.7.1
> 
> 

^ permalink raw reply

* [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Alban Crequy @ 2010-10-29 18:18 UTC (permalink / raw)
  To: Alban Crequy, David S. Miller, Eric Dumazet, Stephen Hemminger,
	Cyrill 

Hi,

When a process calls the poll or select, the kernel calls (struct
file_operations)->poll on every file descriptor and returns a mask of
events which are ready. If the process is only interested by POLLIN
events, the mask is still computed for POLLOUT and it can be expensive.
For example, on Unix datagram sockets, a process running poll() with
POLLIN will wakes-up when the remote end call read(). This is a
performance regression introduced when fixing another bug by
3c73419c09a5ef73d56472dbfdade9e311496e9b and
ec0d215f9420564fc8286dcf93d2d068bb53a07e.

The attached program illustrates the problem. It compares the
performance of sending/receiving data on an Unix datagram socket and
select(). When the datagram sockets are not connected, the performance
problem is not triggered, but when they are connected it becomes a lot
slower. On my computer, I have the following time:

Connected datagram sockets: >4 seconds
Non-connected datagram sockets: <1 second

The patch attached in the next email fixes the performance problem: it
becomes <1 second for both cases. I am not suggesting the patch for
inclusion; I would like to change the prototype of (struct
file_operations)->poll instead of adding ->poll2. But there is a lot of
poll functions to change (grep tells me 337 functions).

Any opinions?

Best regards,
Alban Crequy


---------------
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>

int
main(int argc, char *argv[])
{
#define NB_CLIENTS 800
  int client_fds[NB_CLIENTS];
  int server_fd;
  int max_fds = 0;
  struct sockaddr_un addr_server;
  struct sockaddr_un addr_client;
  fd_set rfds;
  struct timeval tv;
  int retval;
  int i;
  char buffer[1024];
  int pair[2] = {0,0};
  time_t t1, t2;
  int trigger;

  if (argc != 2)
    exit(1);

  if (strcmp(argv[1], "connected") == 0)
    {
      printf("The performance problem will be triggered\n");
      trigger = 1;
    }
  else
    {
      printf("The performance will be good\n");
      trigger = 0;
    }

  memset(&addr_server, 0, sizeof(addr_server));
  addr_server.sun_family = AF_UNIX;
  addr_server.sun_path[0] = '\0';
  strcpy(addr_server.sun_path + 1, "dgram_perfs_server");

  memset(&addr_client, 0, sizeof(addr_client));
  addr_client.sun_family = AF_UNIX;
  addr_client.sun_path[0] = '\0';
  strcpy(addr_client.sun_path + 1, "dgram_perfs_client");

  server_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
  bind(server_fd, (struct sockaddr*)&addr_server, sizeof(addr_server));

  FD_ZERO(&rfds);
  for (i = 0 ; i < NB_CLIENTS ; i++)
    {
      client_fds[i] = socket(AF_UNIX, SOCK_DGRAM, 0);
      FD_SET(client_fds[i], &rfds);
      if (client_fds[i] > max_fds)
        max_fds = client_fds[i];
    }
  max_fds++;

  /* Do not connect the first one */
  for (i = 1 ; i < NB_CLIENTS ; i++)
    {
      if (trigger)
        connect(client_fds[i], (struct sockaddr*)&addr_server, sizeof(addr_server));
    }
  bind(client_fds[0], (struct sockaddr*)&addr_client, sizeof(addr_client));

  tv.tv_sec = 500;
  tv.tv_usec = 0;

  socketpair(AF_UNIX, SOCK_STREAM, 0, pair);

  if (fork() != 0)
  {
    recv(pair[1], buffer, 1024, 0);
    usleep(1000);
    for (i = 0 ; i < 20000 ; i++)
    {
      sendto(client_fds[0], "S", 1, 0, (struct sockaddr*)&addr_server, sizeof(addr_server));
      recv(server_fd, buffer, 1024, 0);
    }
    sendto(client_fds[0], "Z", 1, 0, (struct sockaddr*)&addr_client, sizeof(addr_client));
    wait(NULL);
    exit(0);
  }

  send(pair[0], "W", 1, 0);
  printf("select: begin\n");
  t1 = time(NULL);
  retval = select(max_fds, &rfds, NULL, NULL, &tv);
  t2 = time(NULL);
  printf("select: end: %d seconds\n", t2 - t1);

  return 0;
}
---------------


^ permalink raw reply

* [PATCH] poll/select performance on datagram sockets
From: Alban Crequy @ 2010-10-29 18:21 UTC (permalink / raw)
  To: Alban Crequy
  Cc: Alban Crequy, David S. Miller, Eric Dumazet, netdev, linux-kernel,
	Pauli Nieminen, Rainer Weikusat
In-Reply-To: <20101029191857.5f789d56@chocolatine.cbg.collabora.co.uk>

---
 fs/select.c         |   27 ++++++++++++++++++++++++---
 include/linux/fs.h  |    1 +
 include/linux/net.h |    3 +++
 net/socket.c        |   15 ++++++++++-----
 net/unix/af_unix.c  |   17 +++++++++++------
 5 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 500a669..ff46afa 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -449,9 +449,21 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
 				if (file) {
 					f_op = file->f_op;
 					mask = DEFAULT_POLLMASK;
-					if (f_op && f_op->poll) {
+					if (f_op && f_op->poll2) {
+						unsigned long filter_mask = 0;
+						filter_mask |= (in & bit)
+							? POLLIN_SET : 0;
+						filter_mask |= (out & bit)
+							? POLLOUT_SET : 0;
+						filter_mask |= (ex & bit)
+							? POLLEX_SET : 0;
 						wait_key_set(wait, in, out, bit);
-						mask = (*f_op->poll)(file, wait);
+						mask = (*f_op->poll2)(file,
+							wait, filter_mask);
+					} else if (f_op && f_op->poll) {
+						wait_key_set(wait, in, out, bit);
+						mask = (*f_op->poll)(file,
+							wait);
 					}
 					fput_light(file, fput_needed);
 					if ((mask & POLLIN_SET) && (in & bit)) {
@@ -738,7 +750,16 @@ static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait)
 		mask = POLLNVAL;
 		if (file != NULL) {
 			mask = DEFAULT_POLLMASK;
-			if (file->f_op && file->f_op->poll) {
+			if (file->f_op && file->f_op->poll2) {
+				unsigned long filter_mask;
+				filter_mask = pollfd->events
+						| POLLERR | POLLHUP;
+				if (pwait)
+					pwait->key = pollfd->events |
+							POLLERR | POLLHUP;
+				mask = file->f_op->poll2(file, pwait,
+							filter_mask);
+			} else if (file->f_op && file->f_op->poll) {
 				if (pwait)
 					pwait->key = pollfd->events |
 							POLLERR | POLLHUP;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 509ca14..dd0eb7a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1493,6 +1493,7 @@ struct file_operations {
 	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
 	int (*readdir) (struct file *, void *, filldir_t);
 	unsigned int (*poll) (struct file *, struct poll_table_struct *);
+	unsigned int (*poll2) (struct file *, struct poll_table_struct *, unsigned long);
 	int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
diff --git a/include/linux/net.h b/include/linux/net.h
index dee0b11..1f33d1c 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -175,6 +175,9 @@ struct proto_ops {
 				      int *sockaddr_len, int peer);
 	unsigned int	(*poll)	     (struct file *file, struct socket *sock,
 				      struct poll_table_struct *wait);
+	unsigned int	(*poll2)     (struct file *file, struct socket *sock,
+				      struct poll_table_struct *wait,
+				      unsigned long filter_mask);
 	int		(*ioctl)     (struct socket *sock, unsigned int cmd,
 				      unsigned long arg);
 #ifdef CONFIG_COMPAT
diff --git a/net/socket.c b/net/socket.c
index 367d547..38b7786 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -113,8 +113,9 @@ static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
 static int sock_mmap(struct file *file, struct vm_area_struct *vma);
 
 static int sock_close(struct inode *inode, struct file *file);
-static unsigned int sock_poll(struct file *file,
-			      struct poll_table_struct *wait);
+static unsigned int sock_poll2(struct file *file,
+			      struct poll_table_struct *wait,
+			      unsigned long filter_mask);
 static long sock_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 #ifdef CONFIG_COMPAT
 static long compat_sock_ioctl(struct file *file,
@@ -137,7 +138,7 @@ static const struct file_operations socket_file_ops = {
 	.llseek =	no_llseek,
 	.aio_read =	sock_aio_read,
 	.aio_write =	sock_aio_write,
-	.poll =		sock_poll,
+	.poll2 =	sock_poll2,
 	.unlocked_ioctl = sock_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl = compat_sock_ioctl,
@@ -1049,7 +1050,8 @@ out_release:
 }
 
 /* No kernel lock held - perfect */
-static unsigned int sock_poll(struct file *file, poll_table *wait)
+static unsigned int sock_poll2(struct file *file, poll_table *wait,
+			       unsigned long filter_mask)
 {
 	struct socket *sock;
 
@@ -1057,7 +1059,10 @@ static unsigned int sock_poll(struct file *file, poll_table *wait)
 	 *      We can't return errors to poll, so it's either yes or no.
 	 */
 	sock = file->private_data;
-	return sock->ops->poll(file, sock, wait);
+	if (sock->ops->poll2)
+		return sock->ops->poll2(file, sock, wait, filter_mask);
+	else
+		return sock->ops->poll(file, sock, wait);
 }
 
 static int sock_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 617bea4..60e30ea 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -489,7 +489,7 @@ static int unix_accept(struct socket *, struct socket *, int);
 static int unix_getname(struct socket *, struct sockaddr *, int *, int);
 static unsigned int unix_poll(struct file *, struct socket *, poll_table *);
 static unsigned int unix_dgram_poll(struct file *, struct socket *,
-				    poll_table *);
+				    poll_table *, unsigned long);
 static int unix_ioctl(struct socket *, unsigned int, unsigned long);
 static int unix_shutdown(struct socket *, int);
 static int unix_stream_sendmsg(struct kiocb *, struct socket *,
@@ -535,7 +535,7 @@ static const struct proto_ops unix_dgram_ops = {
 	.socketpair =	unix_socketpair,
 	.accept =	sock_no_accept,
 	.getname =	unix_getname,
-	.poll =		unix_dgram_poll,
+	.poll2 =	unix_dgram_poll,
 	.ioctl =	unix_ioctl,
 	.listen =	sock_no_listen,
 	.shutdown =	unix_shutdown,
@@ -556,7 +556,7 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.socketpair =	unix_socketpair,
 	.accept =	unix_accept,
 	.getname =	unix_getname,
-	.poll =		unix_dgram_poll,
+	.poll2 =	unix_dgram_poll,
 	.ioctl =	unix_ioctl,
 	.listen =	unix_listen,
 	.shutdown =	unix_shutdown,
@@ -2031,7 +2031,8 @@ static unsigned int unix_poll(struct file *file, struct socket *sock, poll_table
 }
 
 static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
-				    poll_table *wait)
+				    poll_table *wait,
+				    unsigned long filter_mask)
 {
 	struct sock *sk = sock->sk, *other;
 	unsigned int mask, writable;
@@ -2061,8 +2062,12 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 			return mask;
 	}
 
-	/* writable? */
-	writable = unix_writable(sk);
+	/* writable?
+	 * sock_poll_wait() is expensive. Don't bother checking if it is not
+	 * requested.
+	 */
+	writable = (filter_mask & (POLLOUT | POLLWRNORM | POLLWRBAND))
+		    && unix_writable(sk);
 	if (writable) {
 		other = unix_peer_get(sk);
 		if (other) {
-- 
1.7.1


^ permalink raw reply related

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: Rick Jones @ 2010-10-29 18:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dan Rosenberg, David Miller, netdev, jon.maloy, allan.stephens,
	Al Viro
In-Reply-To: <AANLkTikeaXncdUohUiGF2EkhujYETK5dtyOJASffz2qR@mail.gmail.com>

It may be but a paint splatter on the bikeshed, or considered a case of "Doctor! 
Doctor! It hurts when I do this" "Then don't do that!" but elsewhere (not in the 
context of Linux) I've seen mention made of ISV software posting some 
particularly large receives and such - one case I saw was over 1GB, where that 
was tied to the size of a log buffer the creation of which I believe was not 
limited to ~INT_MAX by the application software.

rick jones

^ permalink raw reply

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: Linus Torvalds @ 2010-10-29 18:59 UTC (permalink / raw)
  To: Rick Jones
  Cc: Dan Rosenberg, David Miller, netdev, jon.maloy, allan.stephens,
	Al Viro
In-Reply-To: <4CCB17B7.1010908@hp.com>

On Fri, Oct 29, 2010 at 11:51 AM, Rick Jones <rick.jones2@hp.com> wrote:
> It may be but a paint splatter on the bikeshed, or considered a case of
> "Doctor! Doctor! It hurts when I do this" "Then don't do that!" but
> elsewhere (not in the context of Linux) I've seen mention made of ISV
> software posting some particularly large receives and such - one case I saw
> was over 1GB, where that was tied to the size of a log buffer the creation
> of which I believe was not limited to ~INT_MAX by the application software.

Sure. And that is why I very much don't think it's a good idea to
disallow large reads or writes. Returning an error would be bad,
because it's not entirely unreasonable for some user to just have a
really big object (presumably for unrelated reasons), and do IO on it
in one go.

But expecting anybody to _fill_ that really big object in one go is
unreasonable. Even for regular files, anybody who has ever worked with
NFS and interruptible mounts knows that they have to be able to handle
partial IO. And with non-files you obviously have that all the time.

So I think it's perfectly fine to do a terabyte read() or write(). The
fact that the kernel will then internally limit it, and won't ever
actually then write more than 2GB-1 at a time ends up being just an
"implementation issue" that happens to protect the kernel against
subsystems that happen to have issues.

And an application that cannot handle partial reads or writes, yet
works with gigabyte+ data sets is _not_ an application that I consider
reasonable. That's a user space bug, pure and simple, and no amount of
"but but ..." makes any difference what-so-ever.

                        Linus

^ permalink raw reply

* Re: mmotm 2010-10-20 - netfilter Kconfig whinge
From: Patrick McHardy @ 2010-10-29 19:11 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Andrew Morton, linux-kernel, netfilter, netdev
In-Reply-To: <83468.1287659685@localhost>

[-- Attachment #1: Type: text/plain, Size: 763 bytes --]

Am 21.10.2010 13:14, schrieb Valdis.Kletnieks@vt.edu:
> On Wed, 20 Oct 2010 15:01:43 PDT, akpm@linux-foundation.org said:
>> The mm-of-the-moment snapshot 2010-10-20-15-01 has been uploaded to
>>
>>    http://userweb.kernel.org/~akpm/mmotm/
> 
> Seen during 'make oldconfig':
> 
>       Input subsystem LEDs Trigger (LEDS_TRIGGER_INPUT) [N/m/y/?] (NEW) m
>       *
>       * iptables trigger is under Netfilter config (LED target)
>       *
> warning: (NETFILTER_XT_MATCH_REALM && NET && INET && NETFILTER && NETFILTER_XTABLES && NETFILTER_ADVANCED || NET_CLS_ROUTE4 && NET && NET_SCHED) selects NET_CLS_ROUTE which has unmet direct dependencies (NET && NET_SCHED)
> #
> # configuration written to .config

Thanks for the report, this is how I intend to fix it:


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 12101 bytes --]

commit f6ce9a8f0a591865534f1fdd78848c328e36f08d
Author: Patrick McHardy <kaber@trash.net>
Date:   Fri Oct 29 19:09:49 2010 +0200

    netfilter: fix Kconfig dependencies
    
    Fix dependencies of netfilter realm match: it depends on NET_CLS_ROUTE,
    which itself depends on NET_SCHED; this dependency is missing from netfilter.
    
    Since matching on realms is also useful without having NET_SCHED enabled and
    the option really only controls whether the tclassid member is included in
    route and dst entries, rename the config option to IP_ROUTE_CLASSID and move
    it outside of traffic scheduling context to get rid of the NET_SCHED dependeny.
    
    Reported-by: Vladis Kletnieks <Valdis.Kletnieks@vt.edu>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/net/dst.h b/include/net/dst.h
index ffe9cb7..4639ca5 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -72,7 +72,7 @@ struct dst_entry {
 
 	u32			metrics[RTAX_MAX];
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	__u32			tclassid;
 #else
 	__u32			__pad2;
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index ba3666d..626b448 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -55,7 +55,7 @@ struct fib_nh {
 	int			nh_weight;
 	int			nh_power;
 #endif
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	__u32			nh_tclassid;
 #endif
 	int			nh_oif;
@@ -199,7 +199,7 @@ static inline int fib_lookup(struct net *net, const struct flowi *flp,
 extern int __net_init fib4_rules_init(struct net *net);
 extern void __net_exit fib4_rules_exit(struct net *net);
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 extern u32 fib_rules_tclass(struct fib_result *res);
 #endif
 
@@ -233,7 +233,7 @@ extern struct fib_table *fib_hash_table(u32 id);
 
 static inline void fib_combine_itag(u32 *itag, struct fib_result *res)
 {
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	u32 rtag;
 #endif
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 9e95d7f..dcb2e18 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -140,6 +140,9 @@ config IP_ROUTE_VERBOSE
 	  handled by the klogd daemon which is responsible for kernel messages
 	  ("man klogd").
 
+config IP_ROUTE_CLASSID
+	bool
+
 config IP_PNP
 	bool "IP: kernel level autoconfiguration"
 	help
@@ -655,4 +658,3 @@ config TCP_MD5SIG
 	  on the Internet.
 
 	  If unsure, say N.
-
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 7981a24..9cefe72 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -41,12 +41,12 @@ struct fib4_rule {
 	__be32			srcmask;
 	__be32			dst;
 	__be32			dstmask;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	u32			tclassid;
 #endif
 };
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 u32 fib_rules_tclass(struct fib_result *res)
 {
 	return res->r ? ((struct fib4_rule *) res->r)->tclassid : 0;
@@ -165,7 +165,7 @@ static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
 	if (frh->dst_len)
 		rule4->dst = nla_get_be32(tb[FRA_DST]);
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (tb[FRA_FLOW])
 		rule4->tclassid = nla_get_u32(tb[FRA_FLOW]);
 #endif
@@ -195,7 +195,7 @@ static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
 	if (frh->tos && (rule4->tos != frh->tos))
 		return 0;
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (tb[FRA_FLOW] && (rule4->tclassid != nla_get_u32(tb[FRA_FLOW])))
 		return 0;
 #endif
@@ -224,7 +224,7 @@ static int fib4_rule_fill(struct fib_rule *rule, struct sk_buff *skb,
 	if (rule4->src_len)
 		NLA_PUT_BE32(skb, FRA_SRC, rule4->src);
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (rule4->tclassid)
 		NLA_PUT_U32(skb, FRA_FLOW, rule4->tclassid);
 #endif
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3e0da3e..a72c62d 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -200,7 +200,7 @@ static inline int nh_comp(const struct fib_info *fi, const struct fib_info *ofi)
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 		    nh->nh_weight != onh->nh_weight ||
 #endif
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		    nh->nh_tclassid != onh->nh_tclassid ||
 #endif
 		    ((nh->nh_flags ^ onh->nh_flags) & ~RTNH_F_DEAD))
@@ -422,7 +422,7 @@ static int fib_get_nhs(struct fib_info *fi, struct rtnexthop *rtnh,
 
 			nla = nla_find(attrs, attrlen, RTA_GATEWAY);
 			nexthop_nh->nh_gw = nla ? nla_get_be32(nla) : 0;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 			nla = nla_find(attrs, attrlen, RTA_FLOW);
 			nexthop_nh->nh_tclassid = nla ? nla_get_u32(nla) : 0;
 #endif
@@ -476,7 +476,7 @@ int fib_nh_match(struct fib_config *cfg, struct fib_info *fi)
 			nla = nla_find(attrs, attrlen, RTA_GATEWAY);
 			if (nla && nla_get_be32(nla) != nh->nh_gw)
 				return 1;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 			nla = nla_find(attrs, attrlen, RTA_FLOW);
 			if (nla && nla_get_u32(nla) != nh->nh_tclassid)
 				return 1;
@@ -783,7 +783,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 			goto err_inval;
 		if (cfg->fc_gw && fi->fib_nh->nh_gw != cfg->fc_gw)
 			goto err_inval;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		if (cfg->fc_flow && fi->fib_nh->nh_tclassid != cfg->fc_flow)
 			goto err_inval;
 #endif
@@ -796,7 +796,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 		nh->nh_oif = cfg->fc_oif;
 		nh->nh_gw = cfg->fc_gw;
 		nh->nh_flags = cfg->fc_flags;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		nh->nh_tclassid = cfg->fc_flow;
 #endif
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
@@ -1006,7 +1006,7 @@ int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event,
 
 		if (fi->fib_nh->nh_oif)
 			NLA_PUT_U32(skb, RTA_OIF, fi->fib_nh->nh_oif);
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		if (fi->fib_nh[0].nh_tclassid)
 			NLA_PUT_U32(skb, RTA_FLOW, fi->fib_nh[0].nh_tclassid);
 #endif
@@ -1031,7 +1031,7 @@ int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event,
 
 			if (nh->nh_gw)
 				NLA_PUT_BE32(skb, RTA_GATEWAY, nh->nh_gw);
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 			if (nh->nh_tclassid)
 				NLA_PUT_U32(skb, RTA_FLOW, nh->nh_tclassid);
 #endif
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index d859bcc..d7b2b09 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -340,7 +340,7 @@ static int ip_rcv_finish(struct sk_buff *skb)
 		}
 	}
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (unlikely(skb_dst(skb)->tclassid)) {
 		struct ip_rt_acct *st = this_cpu_ptr(ip_rt_acct);
 		u32 idx = skb_dst(skb)->tclassid;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 987bf9a..866b226 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -509,7 +509,7 @@ static const struct file_operations rt_cpu_seq_fops = {
 	.release = seq_release,
 };
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 static int rt_acct_proc_show(struct seq_file *m, void *v)
 {
 	struct ip_rt_acct *dst, *src;
@@ -562,14 +562,14 @@ static int __net_init ip_rt_do_proc_init(struct net *net)
 	if (!pde)
 		goto err2;
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	pde = proc_create("rt_acct", 0, net->proc_net, &rt_acct_proc_fops);
 	if (!pde)
 		goto err3;
 #endif
 	return 0;
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 err3:
 	remove_proc_entry("rt_cache", net->proc_net_stat);
 #endif
@@ -583,7 +583,7 @@ static void __net_exit ip_rt_do_proc_exit(struct net *net)
 {
 	remove_proc_entry("rt_cache", net->proc_net_stat);
 	remove_proc_entry("rt_cache", net->proc_net);
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	remove_proc_entry("rt_acct", net->proc_net);
 #endif
 }
@@ -1804,7 +1804,7 @@ void ip_rt_get_source(u8 *addr, struct rtable *rt)
 	memcpy(addr, &src, 4);
 }
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 static void set_class_tag(struct rtable *rt, u32 tag)
 {
 	if (!(rt->dst.tclassid & 0xFFFF))
@@ -1831,7 +1831,7 @@ static void rt_set_nexthop(struct rtable *rt, struct fib_result *res, u32 itag)
 			    rt->dst.dev->mtu > 576)
 				rt->dst.metrics[RTAX_MTU-1] = 576;
 		}
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = FIB_RES_NH(*res).nh_tclassid;
 #endif
 	} else
@@ -1847,7 +1847,7 @@ static void rt_set_nexthop(struct rtable *rt, struct fib_result *res, u32 itag)
 	if (dst_metric(&rt->dst, RTAX_ADVMSS) > 65535 - 40)
 		rt->dst.metrics[RTAX_ADVMSS-1] = 65535 - 40;
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	set_class_tag(rt, fib_rules_tclass(res));
 #endif
@@ -1903,7 +1903,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	rth->fl.mark    = skb->mark;
 	rth->fl.fl4_src	= saddr;
 	rth->rt_src	= saddr;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	rth->dst.tclassid = itag;
 #endif
 	rth->rt_iif	=
@@ -2224,7 +2224,7 @@ local_input:
 	rth->fl.mark    = skb->mark;
 	rth->fl.fl4_src	= saddr;
 	rth->rt_src	= saddr;
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	rth->dst.tclassid = itag;
 #endif
 	rth->rt_iif	=
@@ -2849,7 +2849,7 @@ static int rt_fill_info(struct net *net,
 	}
 	if (rt->dst.dev)
 		NLA_PUT_U32(skb, RTA_OIF, rt->dst.dev->ifindex);
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (rt->dst.tclassid)
 		NLA_PUT_U32(skb, RTA_FLOW, rt->dst.tclassid);
 #endif
@@ -3274,9 +3274,9 @@ static __net_initdata struct pernet_operations rt_genid_ops = {
 };
 
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 struct ip_rt_acct __percpu *ip_rt_acct __read_mostly;
-#endif /* CONFIG_NET_CLS_ROUTE */
+#endif /* CONFIG_IP_ROUTE_CLASSID */
 
 static __initdata unsigned long rhash_entries;
 static int __init set_rhash_entries(char *str)
@@ -3292,7 +3292,7 @@ int __init ip_rt_init(void)
 {
 	int rc = 0;
 
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	ip_rt_acct = __alloc_percpu(256 * sizeof(struct ip_rt_acct), __alignof__(struct ip_rt_acct));
 	if (!ip_rt_acct)
 		panic("IP: failed to allocate ip_rt_acct\n");
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 1534f2b..1b79353 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -886,7 +886,7 @@ config NETFILTER_XT_MATCH_RATEEST
 config NETFILTER_XT_MATCH_REALM
 	tristate  '"realm" match support'
 	depends on NETFILTER_ADVANCED
-	select NET_CLS_ROUTE
+	select IP_ROUTE_CLASSID
 	help
 	  This option adds a `realm' match, which allows you to use the realm
 	  key from the routing subsystem inside iptables.
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a36270a..4b753ef 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -243,7 +243,7 @@ config NET_CLS_TCINDEX
 
 config NET_CLS_ROUTE4
 	tristate "Routing decision (ROUTE)"
-	select NET_CLS_ROUTE
+	select IP_ROUTE_CLASSID
 	select NET_CLS
 	---help---
 	  If you say Y here, you will be able to classify packets
@@ -252,9 +252,6 @@ config NET_CLS_ROUTE4
 	  To compile this code as a module, choose M here: the
 	  module will be called cls_route.
 
-config NET_CLS_ROUTE
-	bool
-
 config NET_CLS_FW
 	tristate "Netfilter mark (FW)"
 	select NET_CLS
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 5b271a1..a3b293d 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -276,7 +276,7 @@ fallback:
 
 static u32 flow_get_rtclassid(const struct sk_buff *skb)
 {
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 	if (skb_dst(skb))
 		return skb_dst(skb)->tclassid;
 #endif
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index 34da5e2..0d66e58 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -255,7 +255,7 @@ META_COLLECTOR(int_rtclassid)
 	if (unlikely(skb_dst(skb) == NULL))
 		*err = -1;
 	else
-#ifdef CONFIG_NET_CLS_ROUTE
+#ifdef CONFIG_IP_ROUTE_CLASSID
 		dst->value = skb_dst(skb)->tclassid;
 #else
 		dst->value = 0;

^ permalink raw reply related

* pull request: wireless-2.6 2010-10-29
From: John W. Linville @ 2010-10-29 19:14 UTC (permalink / raw)
  To: linux-wireless; +Cc: netdev, linux-kernel

Dave,

Here are more fixes intended for 2.6.37.  Most are obvious one-liners
(more-or-less).  The series from Luis fixes a bug as documented in the
changelog.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit 19449bfc10d163f0024dd5ae5808e28cda32e7b4:

  stmmac: enable/disable rx/tx in the core with a single write. (2010-10-28 11:47:54 -0700)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Felix Fietkau (1):
      ath9k: fix tx aggregation flush on AR9003

Jesper Juhl (1):
      mac80211: fix failure to check kmalloc return value in key_key_read

Jones Desougi (1):
      ath5k: Fix double free on hw attach error path

Jouni Malinen (1):
      mac80211: Fix scan_ies_len to include DS Params

Larry Finger (1):
      b43: Fix warning at drivers/mmc/core/core.c:237 in mmc_wait_for_cmd

Luis R. Rodriguez (4):
      ath9k: add locking for stopping RX
      ath9k: add locking for starting the PCU on RX
      ath9k: rename rxflushlock to pcu_lock
      ath9k: lock reset and PCU start/stopping

Mohammed Shafi Shajakhan (1):
      ath9k: Fix incorrect access of rate flags in RC

Paul Fox (1):
      libertas: Fix sd8686 firmware reload

Rajkumar Manoharan (1):
      ath9k_htc: Set proper firmware offset for Netgear WNDA3200

 drivers/net/wireless/ath/ath5k/attach.c  |   17 +++++++--------
 drivers/net/wireless/ath/ath9k/ath9k.h   |    2 +-
 drivers/net/wireless/ath/ath9k/hif_usb.c |   10 +++++++-
 drivers/net/wireless/ath/ath9k/main.c    |   31 +++++++++++++++++++++++++++-
 drivers/net/wireless/ath/ath9k/rc.c      |    2 +-
 drivers/net/wireless/ath/ath9k/recv.c    |   15 ++++++-------
 drivers/net/wireless/ath/ath9k/xmit.c    |   18 ++++++++--------
 drivers/net/wireless/b43/sdio.c          |    2 +
 drivers/net/wireless/libertas/if_sdio.c  |   32 +++++++++++++++++++++++++++--
 net/mac80211/debugfs_key.c               |    6 ++++-
 net/mac80211/main.c                      |    5 ++-
 11 files changed, 102 insertions(+), 38 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/attach.c b/drivers/net/wireless/ath/ath5k/attach.c
index cd0b14a..fbe8aca 100644
--- a/drivers/net/wireless/ath/ath5k/attach.c
+++ b/drivers/net/wireless/ath/ath5k/attach.c
@@ -139,12 +139,12 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	/* Fill the ath5k_hw struct with the needed functions */
 	ret = ath5k_hw_init_desc_functions(ah);
 	if (ret)
-		goto err_free;
+		goto err;
 
 	/* Bring device out of sleep and reset its units */
 	ret = ath5k_hw_nic_wakeup(ah, 0, true);
 	if (ret)
-		goto err_free;
+		goto err;
 
 	/* Get MAC, PHY and RADIO revisions */
 	ah->ah_mac_srev = srev;
@@ -234,7 +234,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 		} else {
 			ATH5K_ERR(sc, "Couldn't identify radio revision.\n");
 			ret = -ENODEV;
-			goto err_free;
+			goto err;
 		}
 	}
 
@@ -244,7 +244,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	(srev < AR5K_SREV_AR2425)) {
 		ATH5K_ERR(sc, "Device not yet supported.\n");
 		ret = -ENODEV;
-		goto err_free;
+		goto err;
 	}
 
 	/*
@@ -252,7 +252,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	 */
 	ret = ath5k_hw_post(ah);
 	if (ret)
-		goto err_free;
+		goto err;
 
 	/* Enable pci core retry fix on Hainan (5213A) and later chips */
 	if (srev >= AR5K_SREV_AR5213A)
@@ -265,7 +265,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	ret = ath5k_eeprom_init(ah);
 	if (ret) {
 		ATH5K_ERR(sc, "unable to init EEPROM\n");
-		goto err_free;
+		goto err;
 	}
 
 	ee = &ah->ah_capabilities.cap_eeprom;
@@ -307,7 +307,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	if (ret) {
 		ATH5K_ERR(sc, "unable to get device capabilities: 0x%04x\n",
 			sc->pdev->device);
-		goto err_free;
+		goto err;
 	}
 
 	/* Crypto settings */
@@ -341,8 +341,7 @@ int ath5k_hw_attach(struct ath5k_softc *sc)
 	ath5k_hw_set_ledstate(ah, AR5K_LED_INIT);
 
 	return 0;
-err_free:
-	kfree(ah);
+err:
 	return ret;
 }
 
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 973c919..9b8e7e3 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -310,7 +310,7 @@ struct ath_rx {
 	u8 rxotherant;
 	u32 *rxlink;
 	unsigned int rxfilter;
-	spinlock_t rxflushlock;
+	spinlock_t pcu_lock;
 	spinlock_t rxbuflock;
 	struct list_head rxbuf;
 	struct ath_descdma rxdma;
diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
index 728d904..6576f68 100644
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -801,10 +801,16 @@ static int ath9k_hif_usb_download_fw(struct hif_device_usb *hif_dev)
 	}
 	kfree(buf);
 
-	if ((hif_dev->device_id == 0x7010) || (hif_dev->device_id == 0x7015))
+	switch (hif_dev->device_id) {
+	case 0x7010:
+	case 0x7015:
+	case 0x9018:
 		firm_offset = AR7010_FIRMWARE_TEXT;
-	else
+		break;
+	default:
 		firm_offset = AR9271_FIRMWARE_TEXT;
+		break;
+	}
 
 	/*
 	 * Issue FW download complete command to firmware.
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index c6ec800..b52f1cf 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -241,6 +241,9 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
 	 */
 	ath9k_hw_set_interrupts(ah, 0);
 	ath_drain_all_txq(sc, false);
+
+	spin_lock_bh(&sc->rx.pcu_lock);
+
 	stopped = ath_stoprecv(sc);
 
 	/* XXX: do not flush receive queue here. We don't want
@@ -268,6 +271,7 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
 			  "reset status %d\n",
 			  channel->center_freq, r);
 		spin_unlock_bh(&sc->sc_resetlock);
+		spin_unlock_bh(&sc->rx.pcu_lock);
 		goto ps_restore;
 	}
 	spin_unlock_bh(&sc->sc_resetlock);
@@ -276,9 +280,12 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
 		ath_print(common, ATH_DBG_FATAL,
 			  "Unable to restart recv logic\n");
 		r = -EIO;
+		spin_unlock_bh(&sc->rx.pcu_lock);
 		goto ps_restore;
 	}
 
+	spin_unlock_bh(&sc->rx.pcu_lock);
+
 	ath_update_txpow(sc);
 	ath9k_hw_set_interrupts(ah, ah->imask);
 
@@ -613,7 +620,7 @@ void ath9k_tasklet(unsigned long data)
 		rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);
 
 	if (status & rxmask) {
-		spin_lock_bh(&sc->rx.rxflushlock);
+		spin_lock_bh(&sc->rx.pcu_lock);
 
 		/* Check for high priority Rx first */
 		if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) &&
@@ -621,7 +628,7 @@ void ath9k_tasklet(unsigned long data)
 			ath_rx_tasklet(sc, 0, true);
 
 		ath_rx_tasklet(sc, 0, false);
-		spin_unlock_bh(&sc->rx.rxflushlock);
+		spin_unlock_bh(&sc->rx.pcu_lock);
 	}
 
 	if (status & ATH9K_INT_TX) {
@@ -876,6 +883,7 @@ void ath_radio_enable(struct ath_softc *sc, struct ieee80211_hw *hw)
 	if (!ah->curchan)
 		ah->curchan = ath_get_curchannel(sc, sc->hw);
 
+	spin_lock_bh(&sc->rx.pcu_lock);
 	spin_lock_bh(&sc->sc_resetlock);
 	r = ath9k_hw_reset(ah, ah->curchan, ah->caldata, false);
 	if (r) {
@@ -890,8 +898,10 @@ void ath_radio_enable(struct ath_softc *sc, struct ieee80211_hw *hw)
 	if (ath_startrecv(sc) != 0) {
 		ath_print(common, ATH_DBG_FATAL,
 			  "Unable to restart recv logic\n");
+		spin_unlock_bh(&sc->rx.pcu_lock);
 		return;
 	}
+	spin_unlock_bh(&sc->rx.pcu_lock);
 
 	if (sc->sc_flags & SC_OP_BEACONS)
 		ath_beacon_config(sc, NULL);	/* restart beacons */
@@ -930,6 +940,9 @@ void ath_radio_disable(struct ath_softc *sc, struct ieee80211_hw *hw)
 	ath9k_hw_set_interrupts(ah, 0);
 
 	ath_drain_all_txq(sc, false);	/* clear pending tx frames */
+
+	spin_lock_bh(&sc->rx.pcu_lock);
+
 	ath_stoprecv(sc);		/* turn off frame recv */
 	ath_flushrecv(sc);		/* flush recv queue */
 
@@ -947,6 +960,9 @@ void ath_radio_disable(struct ath_softc *sc, struct ieee80211_hw *hw)
 	spin_unlock_bh(&sc->sc_resetlock);
 
 	ath9k_hw_phy_disable(ah);
+
+	spin_unlock_bh(&sc->rx.pcu_lock);
+
 	ath9k_hw_configpcipowersave(ah, 1, 1);
 	ath9k_ps_restore(sc);
 	ath9k_setpower(sc, ATH9K_PM_FULL_SLEEP);
@@ -966,6 +982,9 @@ int ath_reset(struct ath_softc *sc, bool retry_tx)
 
 	ath9k_hw_set_interrupts(ah, 0);
 	ath_drain_all_txq(sc, retry_tx);
+
+	spin_lock_bh(&sc->rx.pcu_lock);
+
 	ath_stoprecv(sc);
 	ath_flushrecv(sc);
 
@@ -980,6 +999,8 @@ int ath_reset(struct ath_softc *sc, bool retry_tx)
 		ath_print(common, ATH_DBG_FATAL,
 			  "Unable to start recv logic\n");
 
+	spin_unlock_bh(&sc->rx.pcu_lock);
+
 	/*
 	 * We may be doing a reset in response to a request
 	 * that changes the channel so update any state that
@@ -1142,6 +1163,7 @@ static int ath9k_start(struct ieee80211_hw *hw)
 	 * be followed by initialization of the appropriate bits
 	 * and then setup of the interrupt mask.
 	 */
+	spin_lock_bh(&sc->rx.pcu_lock);
 	spin_lock_bh(&sc->sc_resetlock);
 	r = ath9k_hw_reset(ah, init_channel, ah->caldata, false);
 	if (r) {
@@ -1150,6 +1172,7 @@ static int ath9k_start(struct ieee80211_hw *hw)
 			  "(freq %u MHz)\n", r,
 			  curchan->center_freq);
 		spin_unlock_bh(&sc->sc_resetlock);
+		spin_unlock_bh(&sc->rx.pcu_lock);
 		goto mutex_unlock;
 	}
 	spin_unlock_bh(&sc->sc_resetlock);
@@ -1171,8 +1194,10 @@ static int ath9k_start(struct ieee80211_hw *hw)
 		ath_print(common, ATH_DBG_FATAL,
 			  "Unable to start recv logic\n");
 		r = -EIO;
+		spin_unlock_bh(&sc->rx.pcu_lock);
 		goto mutex_unlock;
 	}
+	spin_unlock_bh(&sc->rx.pcu_lock);
 
 	/* Setup our intr mask. */
 	ah->imask = ATH9K_INT_TX | ATH9K_INT_RXEOL |
@@ -1371,12 +1396,14 @@ static void ath9k_stop(struct ieee80211_hw *hw)
 	 * before setting the invalid flag. */
 	ath9k_hw_set_interrupts(ah, 0);
 
+	spin_lock_bh(&sc->rx.pcu_lock);
 	if (!(sc->sc_flags & SC_OP_INVALID)) {
 		ath_drain_all_txq(sc, false);
 		ath_stoprecv(sc);
 		ath9k_hw_phy_disable(ah);
 	} else
 		sc->rx.rxlink = NULL;
+	spin_unlock_bh(&sc->rx.pcu_lock);
 
 	/* disable HAL and put h/w to sleep */
 	ath9k_hw_disable(ah);
diff --git a/drivers/net/wireless/ath/ath9k/rc.c b/drivers/net/wireless/ath/ath9k/rc.c
index 0cee90c..89978d7 100644
--- a/drivers/net/wireless/ath/ath9k/rc.c
+++ b/drivers/net/wireless/ath/ath9k/rc.c
@@ -527,7 +527,7 @@ static u8 ath_rc_setvalid_rates(struct ath_rate_priv *ath_rc_priv,
 	for (i = 0; i < rateset->rs_nrates; i++) {
 		for (j = 0; j < rate_table->rate_cnt; j++) {
 			u32 phy = rate_table->info[j].phy;
-			u16 rate_flags = rate_table->info[i].rate_flags;
+			u16 rate_flags = rate_table->info[j].rate_flags;
 			u8 rate = rateset->rs_rates[i];
 			u8 dot11rate = rate_table->info[j].dot11rate;
 
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index fe73fc5..fddb012 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -297,19 +297,17 @@ static void ath_edma_start_recv(struct ath_softc *sc)
 	ath_rx_addbuffer_edma(sc, ATH9K_RX_QUEUE_LP,
 			      sc->rx.rx_edma[ATH9K_RX_QUEUE_LP].rx_fifo_hwsize);
 
-	spin_unlock_bh(&sc->rx.rxbuflock);
-
 	ath_opmode_init(sc);
 
 	ath9k_hw_startpcureceive(sc->sc_ah, (sc->sc_flags & SC_OP_OFFCHANNEL));
+
+	spin_unlock_bh(&sc->rx.rxbuflock);
 }
 
 static void ath_edma_stop_recv(struct ath_softc *sc)
 {
-	spin_lock_bh(&sc->rx.rxbuflock);
 	ath_rx_remove_buffer(sc, ATH9K_RX_QUEUE_HP);
 	ath_rx_remove_buffer(sc, ATH9K_RX_QUEUE_LP);
-	spin_unlock_bh(&sc->rx.rxbuflock);
 }
 
 int ath_rx_init(struct ath_softc *sc, int nbufs)
@@ -319,7 +317,7 @@ int ath_rx_init(struct ath_softc *sc, int nbufs)
 	struct ath_buf *bf;
 	int error = 0;
 
-	spin_lock_init(&sc->rx.rxflushlock);
+	spin_lock_init(&sc->rx.pcu_lock);
 	sc->sc_flags &= ~SC_OP_RXFLUSH;
 	spin_lock_init(&sc->rx.rxbuflock);
 
@@ -506,10 +504,11 @@ int ath_startrecv(struct ath_softc *sc)
 	ath9k_hw_rxena(ah);
 
 start_recv:
-	spin_unlock_bh(&sc->rx.rxbuflock);
 	ath_opmode_init(sc);
 	ath9k_hw_startpcureceive(ah, (sc->sc_flags & SC_OP_OFFCHANNEL));
 
+	spin_unlock_bh(&sc->rx.rxbuflock);
+
 	return 0;
 }
 
@@ -518,6 +517,7 @@ bool ath_stoprecv(struct ath_softc *sc)
 	struct ath_hw *ah = sc->sc_ah;
 	bool stopped;
 
+	spin_lock_bh(&sc->rx.rxbuflock);
 	ath9k_hw_stoppcurecv(ah);
 	ath9k_hw_setrxfilter(ah, 0);
 	stopped = ath9k_hw_stopdmarecv(ah);
@@ -526,19 +526,18 @@ bool ath_stoprecv(struct ath_softc *sc)
 		ath_edma_stop_recv(sc);
 	else
 		sc->rx.rxlink = NULL;
+	spin_unlock_bh(&sc->rx.rxbuflock);
 
 	return stopped;
 }
 
 void ath_flushrecv(struct ath_softc *sc)
 {
-	spin_lock_bh(&sc->rx.rxflushlock);
 	sc->sc_flags |= SC_OP_RXFLUSH;
 	if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
 		ath_rx_tasklet(sc, 1, true);
 	ath_rx_tasklet(sc, 1, false);
 	sc->sc_flags &= ~SC_OP_RXFLUSH;
-	spin_unlock_bh(&sc->rx.rxflushlock);
 }
 
 static bool ath_beacon_dtim_pending_cab(struct sk_buff *skb)
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 30ef2df..f2ade24 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -1089,15 +1089,6 @@ void ath_draintxq(struct ath_softc *sc, struct ath_txq *txq, bool retry_tx)
 	txq->axq_tx_inprogress = false;
 	spin_unlock_bh(&txq->axq_lock);
 
-	/* flush any pending frames if aggregation is enabled */
-	if (sc->sc_flags & SC_OP_TXAGGR) {
-		if (!retry_tx) {
-			spin_lock_bh(&txq->axq_lock);
-			ath_txq_drain_pending_buffers(sc, txq);
-			spin_unlock_bh(&txq->axq_lock);
-		}
-	}
-
 	if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) {
 		spin_lock_bh(&txq->axq_lock);
 		while (!list_empty(&txq->txq_fifo_pending)) {
@@ -1118,6 +1109,15 @@ void ath_draintxq(struct ath_softc *sc, struct ath_txq *txq, bool retry_tx)
 		}
 		spin_unlock_bh(&txq->axq_lock);
 	}
+
+	/* flush any pending frames if aggregation is enabled */
+	if (sc->sc_flags & SC_OP_TXAGGR) {
+		if (!retry_tx) {
+			spin_lock_bh(&txq->axq_lock);
+			ath_txq_drain_pending_buffers(sc, txq);
+			spin_unlock_bh(&txq->axq_lock);
+		}
+	}
 }
 
 void ath_drain_all_txq(struct ath_softc *sc, bool retry_tx)
diff --git a/drivers/net/wireless/b43/sdio.c b/drivers/net/wireless/b43/sdio.c
index 45933cf..9a55338 100644
--- a/drivers/net/wireless/b43/sdio.c
+++ b/drivers/net/wireless/b43/sdio.c
@@ -175,7 +175,9 @@ static void b43_sdio_remove(struct sdio_func *func)
 	struct b43_sdio *sdio = sdio_get_drvdata(func);
 
 	ssb_bus_unregister(&sdio->ssb);
+	sdio_claim_host(func);
 	sdio_disable_func(func);
+	sdio_release_host(func);
 	kfree(sdio);
 	sdio_set_drvdata(func, NULL);
 }
diff --git a/drivers/net/wireless/libertas/if_sdio.c b/drivers/net/wireless/libertas/if_sdio.c
index 296fd00..e5685dc 100644
--- a/drivers/net/wireless/libertas/if_sdio.c
+++ b/drivers/net/wireless/libertas/if_sdio.c
@@ -684,18 +684,40 @@ static int if_sdio_prog_firmware(struct if_sdio_card *card)
 
 	lbs_deb_enter(LBS_DEB_SDIO);
 
+	/*
+	 * Disable interrupts
+	 */
+	sdio_claim_host(card->func);
+	sdio_writeb(card->func, 0x00, IF_SDIO_H_INT_MASK, &ret);
+	sdio_release_host(card->func);
+
 	sdio_claim_host(card->func);
 	scratch = if_sdio_read_scratch(card, &ret);
 	sdio_release_host(card->func);
 
+	lbs_deb_sdio("firmware status = %#x\n", scratch);
+	lbs_deb_sdio("scratch ret = %d\n", ret);
+
 	if (ret)
 		goto out;
 
-	lbs_deb_sdio("firmware status = %#x\n", scratch);
 
+	/*
+	 * The manual clearly describes that FEDC is the right code to use
+	 * to detect firmware presence, but for SD8686 it is not that simple.
+	 * Scratch is also used to store the RX packet length, so we lose
+	 * the FEDC value early on. So we use a non-zero check in order
+	 * to validate firmware presence.
+	 * Additionally, the SD8686 in the Gumstix always has the high scratch
+	 * bit set, even when the firmware is not loaded. So we have to
+	 * exclude that from the test.
+	 */
 	if (scratch == IF_SDIO_FIRMWARE_OK) {
 		lbs_deb_sdio("firmware already loaded\n");
 		goto success;
+	} else if ((card->model == MODEL_8686) && (scratch & 0x7fff)) {
+		lbs_deb_sdio("firmware may be running\n");
+		goto success;
 	}
 
 	ret = lbs_get_firmware(&card->func->dev, lbs_helper_name, lbs_fw_name,
@@ -709,10 +731,14 @@ static int if_sdio_prog_firmware(struct if_sdio_card *card)
 	if (ret)
 		goto out;
 
+	lbs_deb_sdio("Helper firmware loaded\n");
+
 	ret = if_sdio_prog_real(card, mainfw);
 	if (ret)
 		goto out;
 
+	lbs_deb_sdio("Firmware loaded\n");
+
 success:
 	sdio_claim_host(card->func);
 	sdio_set_block_size(card->func, IF_SDIO_BLOCK_SIZE);
@@ -1042,8 +1068,6 @@ static int if_sdio_probe(struct sdio_func *func,
 	priv->exit_deep_sleep = if_sdio_exit_deep_sleep;
 	priv->reset_deep_sleep_wakeup = if_sdio_reset_deep_sleep_wakeup;
 
-	priv->fw_ready = 1;
-
 	sdio_claim_host(func);
 
 	/*
@@ -1064,6 +1088,8 @@ static int if_sdio_probe(struct sdio_func *func,
 	if (ret)
 		goto reclaim;
 
+	priv->fw_ready = 1;
+
 	/*
 	 * FUNC_INIT is required for SD8688 WLAN/BT multiple functions
 	 */
diff --git a/net/mac80211/debugfs_key.c b/net/mac80211/debugfs_key.c
index 4aa47d0..1243d1d 100644
--- a/net/mac80211/debugfs_key.c
+++ b/net/mac80211/debugfs_key.c
@@ -203,9 +203,13 @@ static ssize_t key_key_read(struct file *file, char __user *userbuf,
 			    size_t count, loff_t *ppos)
 {
 	struct ieee80211_key *key = file->private_data;
-	int i, res, bufsize = 2 * key->conf.keylen + 2;
+	int i, bufsize = 2 * key->conf.keylen + 2;
 	char *buf = kmalloc(bufsize, GFP_KERNEL);
 	char *p = buf;
+	ssize_t res;
+
+	if (!buf)
+		return -ENOMEM;
 
 	for (i = 0; i < key->conf.keylen; i++)
 		p += scnprintf(p, bufsize + buf - p, "%02x", key->conf.key[i]);
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 6b322fa..107a0cb 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -677,10 +677,11 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 	/*
 	 * Calculate scan IE length -- we need this to alloc
 	 * memory and to subtract from the driver limit. It
-	 * includes the (extended) supported rates and HT
+	 * includes the DS Params, (extended) supported rates, and HT
 	 * information -- SSID is the driver's responsibility.
 	 */
-	local->scan_ies_len = 4 + max_bitrates; /* (ext) supp rates */
+	local->scan_ies_len = 4 + max_bitrates /* (ext) supp rates */ +
+		3 /* DS Params */;
 	if (supp_ht)
 		local->scan_ies_len += 2 + sizeof(struct ieee80211_ht_cap);
 
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply related

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Eric Dumazet @ 2010-10-29 19:27 UTC (permalink / raw)
  To: Alban Crequy
  Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat
In-Reply-To: <20101029191857.5f789d56@chocolatine.cbg.collabora.co.uk>

Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> Hi,
> 
> When a process calls the poll or select, the kernel calls (struct
> file_operations)->poll on every file descriptor and returns a mask of
> events which are ready. If the process is only interested by POLLIN
> events, the mask is still computed for POLLOUT and it can be expensive.
> For example, on Unix datagram sockets, a process running poll() with
> POLLIN will wakes-up when the remote end call read(). This is a
> performance regression introduced when fixing another bug by
> 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> 
> The attached program illustrates the problem. It compares the
> performance of sending/receiving data on an Unix datagram socket and
> select(). When the datagram sockets are not connected, the performance
> problem is not triggered, but when they are connected it becomes a lot
> slower. On my computer, I have the following time:
> 
> Connected datagram sockets: >4 seconds
> Non-connected datagram sockets: <1 second
> 
> The patch attached in the next email fixes the performance problem: it
> becomes <1 second for both cases. I am not suggesting the patch for
> inclusion; I would like to change the prototype of (struct
> file_operations)->poll instead of adding ->poll2. But there is a lot of
> poll functions to change (grep tells me 337 functions).
> 
> Any opinions?

My opinion would be to use epoll() for this kind of workload.

Also, about unix_datagram_poll() being slow, it probably can be
addressed separately.




^ permalink raw reply

* Re: pull request: wireless-2.6 2010-10-29
From: David Miller @ 2010-10-29 19:31 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20101029191422.GE2399@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 29 Oct 2010 15:14:22 -0400

> Here are more fixes intended for 2.6.37.  Most are obvious one-liners
> (more-or-less).  The series from Luis fixes a bug as documented in the
> changelog.

Pulled, thanks John.

^ permalink raw reply

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: David Miller @ 2010-10-29 19:32 UTC (permalink / raw)
  To: torvalds; +Cc: viro, netdev, jon.maloy, allan.stephens, drosenberg
In-Reply-To: <AANLkTi=sUDRzGnkp57OCQUi9APHWLYLShhiTVHLyq1eV@mail.gmail.com>

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 29 Oct 2010 09:21:16 -0700

> NOTE! UNTESTED! It compiles, but I didn't actually boot to check.
> Maybe it has some stupid thinko in it. Also, the patch ended up being
> a bit bigger than it needed to be, because I found some annoying
> whitespace issues in rw_copy_check_uvector (spaces at the beginning of
> lines that had tabs in them) that I couldn't keep myself from not
> fixing.
> 
> Comments?

I just got out of a long dentist appointment, will look at this right
now, thanks!

^ permalink raw reply

* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Wolfgang Grandegger @ 2010-10-29 19:32 UTC (permalink / raw)
  To: Tomoya
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Masayuki Ohtake,
	Samuel Ortiz, margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w, Marc Kleine-Budde,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, David S. Miller,
	Christian Pellegrin, qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CCAF517.2000409-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

Hello,

some more comments from my side.

On 10/29/2010 06:23 PM, Marc Kleine-Budde wrote:
> Hello,
> 
> On 10/29/2010 02:57 PM, Marc Kleine-Budde wrote:
>> The driver has already been merged. Please send incremental patches
>> against david's net-2.6 branch.
> 
> Here a review, find comments inline. Lets talk about my remarks, please
> answer inline and don't delete the code.
> 
> Can you please explain me your locking sheme? If I understand the
> documenation correctly the two message interfaces can be used mutual.
> And you use one for rx the other one for tx.
> 
> Please use netdev_<level> instead of dev_<level> for debug.
> 
>> --- /dev/null
>> +++ b/drivers/net/can/pch_can.c
>> @@ -0,0 +1,1436 @@
>> +/*
>> + * Copyright (C) 1999 - 2010 Intel Corporation.
>> + * Copyright (C) 2010 OKI SEMICONDUCTOR CO., LTD.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; version 2 of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307, USA.
>> + */
>> +
>> +#include <linux/interrupt.h>
>> +#include <linux/delay.h>
>> +#include <linux/io.h>
>> +#include <linux/module.h>
>> +#include <linux/sched.h>
>> +#include <linux/pci.h>
>> +#include <linux/init.h>
>> +#include <linux/kernel.h>
>> +#include <linux/types.h>
>> +#include <linux/errno.h>
>> +#include <linux/netdevice.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/can.h>
>> +#include <linux/can/dev.h>
>> +#include <linux/can/error.h>
>> +
>> +#define MAX_MSG_OBJ		32
>> +#define MSG_OBJ_RX		0 /* The receive message object flag. */
>> +#define MSG_OBJ_TX		1 /* The transmit message object flag. */
>> +
>> +#define CAN_CTRL_INIT		0x0001 /* The INIT bit of CANCONT register. */
>> +#define CAN_CTRL_IE		0x0002 /* The IE bit of CAN control register */
>> +#define CAN_CTRL_IE_SIE_EIE	0x000e
>> +#define CAN_CTRL_CCE		0x0040
>> +#define CAN_CTRL_OPT		0x0080 /* The OPT bit of CANCONT register. */
>> +#define CAN_OPT_SILENT		0x0008 /* The Silent bit of CANOPT reg. */
>> +#define CAN_OPT_LBACK		0x0010 /* The LoopBack bit of CANOPT reg. */
>> +#define CAN_CMASK_RX_TX_SET	0x00f3
>> +#define CAN_CMASK_RX_TX_GET	0x0073
>> +#define CAN_CMASK_ALL		0xff
>> +#define CAN_CMASK_RDWR		0x80
>> +#define CAN_CMASK_ARB		0x20
>> +#define CAN_CMASK_CTRL		0x10
>> +#define CAN_CMASK_MASK		0x40
>> +#define CAN_CMASK_NEWDAT	0x04
>> +#define CAN_CMASK_CLRINTPND	0x08
>> +
>> +#define CAN_IF_MCONT_NEWDAT	0x8000
>> +#define CAN_IF_MCONT_INTPND	0x2000
>> +#define CAN_IF_MCONT_UMASK	0x1000
>> +#define CAN_IF_MCONT_TXIE	0x0800
>> +#define CAN_IF_MCONT_RXIE	0x0400
>> +#define CAN_IF_MCONT_RMTEN	0x0200
>> +#define CAN_IF_MCONT_TXRQXT	0x0100
>> +#define CAN_IF_MCONT_EOB	0x0080
>> +#define CAN_IF_MCONT_DLC	0x000f
>> +#define CAN_IF_MCONT_MSGLOST	0x4000
>> +#define CAN_MASK2_MDIR_MXTD	0xc000
>> +#define CAN_ID2_DIR		0x2000
>> +#define CAN_ID_MSGVAL		0x8000
>> +
>> +#define CAN_STATUS_INT		0x8000
>> +#define CAN_IF_CREQ_BUSY	0x8000
>> +#define CAN_ID2_XTD		0x4000
>> +
>> +#define CAN_REC			0x00007f00
>> +#define CAN_TEC			0x000000ff
> 
> A prefix for like PCH_ instead of CAN_ for all those define above would
> be fine to avoid namespace clashes and/or confusion with the defines from the socketcan framework.
> 
>> +
>> +#define PCH_RX_OK		0x00000010
>> +#define PCH_TX_OK		0x00000008
>> +#define PCH_BUS_OFF		0x00000080
>> +#define PCH_EWARN		0x00000040
>> +#define PCH_EPASSIV		0x00000020
>> +#define PCH_LEC0		0x00000001
>> +#define PCH_LEC1		0x00000002
>> +#define PCH_LEC2		0x00000004
> 
> These are just single set bit, please use BIT()
> Consider adding the name of the corresponding register to the define's
> name.
> 
>> +#define PCH_LEC_ALL		(PCH_LEC0 | PCH_LEC1 | PCH_LEC2)
>> +#define PCH_STUF_ERR		PCH_LEC0
>> +#define PCH_FORM_ERR		PCH_LEC1
>> +#define PCH_ACK_ERR		(PCH_LEC0 | PCH_LEC1)
>> +#define PCH_BIT1_ERR		PCH_LEC2
>> +#define PCH_BIT0_ERR		(PCH_LEC0 | PCH_LEC2)
>> +#define PCH_CRC_ERR		(PCH_LEC1 | PCH_LEC2)

This is an enumeration:

enum {
	PCH_STUF_ERR = 1,
	PCH_FORM_ERR,
	PCH_ACK_ERR,
	PCH_BIT1_ERR;
	PCH_BIT0_ERR,
	PCH_CRC_ERR,
	PCH_LEC_ALL;
}

Also, s/PCH_/PCH_LEC_/ would be nice.

>> +/* bit position of certain controller bits. */
>> +#define BIT_BITT_BRP		0
>> +#define BIT_BITT_SJW		6
>> +#define BIT_BITT_TSEG1		8
>> +#define BIT_BITT_TSEG2		12
>> +#define BIT_IF1_MCONT_RXIE	10
>> +#define BIT_IF2_MCONT_TXIE	11
>> +#define BIT_BRPE_BRPE		6
>> +#define BIT_ES_TXERRCNT		0
>> +#define BIT_ES_RXERRCNT		8
> 
> these are usually called SHIFT
> 
>> +#define MSK_BITT_BRP		0x3f
>> +#define MSK_BITT_SJW		0xc0
>> +#define MSK_BITT_TSEG1		0xf00
>> +#define MSK_BITT_TSEG2		0x7000
>> +#define MSK_BRPE_BRPE		0x3c0
>> +#define MSK_BRPE_GET		0x0f
>> +#define MSK_CTRL_IE_SIE_EIE	0x07
>> +#define MSK_MCONT_TXIE		0x08
>> +#define MSK_MCONT_RXIE		0x10
> 
> MSK or MASK is okay, however the last two are just single bits.
> 
> Please add a PCH_ prefix here, too.
> 
>> +#define PCH_CAN_NO_TX_BUFF	1
>> +#define COUNTER_LIMIT		10
> 
> dito
> 
>> +
>> +#define PCH_CAN_CLK		50000000	/* 50MHz */
>> +
>> +/*
>> + * Define the number of message object.
>> + * PCH CAN communications are done via Message RAM.
>> + * The Message RAM consists of 32 message objects.
>> + */
>> +#define PCH_RX_OBJ_NUM		26  /* 1~ PCH_RX_OBJ_NUM is Rx*/
>> +#define PCH_TX_OBJ_NUM		6  /* PCH_RX_OBJ_NUM is RX ~ Tx*/
>> +#define PCH_OBJ_NUM		(PCH_TX_OBJ_NUM + PCH_RX_OBJ_NUM)
> 
> You define MAX_MSG_OBJ earlier, seems like two names for the same value.
> 
>> +
>> +#define PCH_FIFO_THRESH		16
>> +
>> +enum pch_can_mode {
>> +	PCH_CAN_ENABLE,
>> +	PCH_CAN_DISABLE,
>> +	PCH_CAN_ALL,
>> +	PCH_CAN_NONE,
>> +	PCH_CAN_STOP,
>> +	PCH_CAN_RUN,
>> +};
>> +
>> +struct pch_can_regs {
>> +	u32 cont;
>> +	u32 stat;
>> +	u32 errc;
>> +	u32 bitt;
>> +	u32 intr;
>> +	u32 opt;
>> +	u32 brpe;
>> +	u32 reserve1;
> 
> VVVV
>> +	u32 if1_creq;
>> +	u32 if1_cmask;
>> +	u32 if1_mask1;
>> +	u32 if1_mask2;
>> +	u32 if1_id1;
>> +	u32 if1_id2;
>> +	u32 if1_mcont;
>> +	u32 if1_dataa1;
>> +	u32 if1_dataa2;
>> +	u32 if1_datab1;
>> +	u32 if1_datab2;
> ^^^^
> 
> these registers and....
> 
>> +	u32 reserve2;
>> +	u32 reserve3[12];
> 
> ...and these
> 
> VVVV
>> +	u32 if2_creq;
>> +	u32 if2_cmask;
>> +	u32 if2_mask1;
>> +	u32 if2_mask2;
>> +	u32 if2_id1;
>> +	u32 if2_id2;
>> +	u32 if2_mcont;
>> +	u32 if2_dataa1;
>> +	u32 if2_dataa2;
>> +	u32 if2_datab1;
>> +	u32 if2_datab2;
> 
> ^^^^
> 
> ...are identical. I suggest to make a struct defining a complete
> "Message Interface Register Set". If you include the correct number of
> reserved bytes in the struct, you can have an array of two of these
> structs in the struct pch_can_regs.

Yep, that would be nice. Using it consequently would also allow to
remove duplicated code efficiently. I will name it "struct pch_can_if"
for latter references.

> 
>> +	u32 reserve4;
>> +	u32 reserve5[20];
>> +	u32 treq1;
>> +	u32 treq2;
>> +	u32 reserve6[2];
>> +	u32 reserve7[56];
>> +	u32 reserve8[3];

Why not just one reserveX ?

>> +	u32 srst;
>> +};
>> +
>> +struct pch_can_priv {
>> +	struct can_priv can;
>> +	struct pci_dev *dev;
>> +	unsigned int tx_enable[MAX_MSG_OBJ];
>> +	unsigned int rx_enable[MAX_MSG_OBJ];
>> +	unsigned int rx_link[MAX_MSG_OBJ];
>> +	unsigned int int_enables;
>> +	unsigned int int_stat;
>> +	struct net_device *ndev;
>> +	spinlock_t msgif_reg_lock; /* Message Interface Registers Access Lock*/
>                                                                             ^^^
> please add a whitespace
> 
>> +	unsigned int msg_obj[MAX_MSG_OBJ];
>> +	struct pch_can_regs __iomem *regs;
>> +	struct napi_struct napi;
>> +	unsigned int tx_obj;	/* Point next Tx Obj index */
>> +	unsigned int use_msi;
>> +};
>> +
>> +static struct can_bittiming_const pch_can_bittiming_const = {
>> +	.name = "pch_can",
>> +	.tseg1_min = 1,
>> +	.tseg1_max = 16,
>> +	.tseg2_min = 1,
>> +	.tseg2_max = 8,
>> +	.sjw_max = 4,
>> +	.brp_min = 1,
>> +	.brp_max = 1024, /* 6bit + extended 4bit */
>> +	.brp_inc = 1,
>> +};
>> +
>> +static DEFINE_PCI_DEVICE_TABLE(pch_pci_tbl) = {
>> +	{PCI_VENDOR_ID_INTEL, 0x8818, PCI_ANY_ID, PCI_ANY_ID,},
>> +	{0,}
>> +};
>> +MODULE_DEVICE_TABLE(pci, pch_pci_tbl);
>> +
>> +static inline void pch_can_bit_set(u32 *addr, u32 mask)
>                                       ^^^^^
> 
> that should be an void __iomem *, see mail I've send the other day.
> Please use sparse to check for this kinds of errors.
> (Compile the driver with C=2, i.e.: make drivers/net/can/pch_can.ko C=2)
> 
>> +{
>> +	iowrite32(ioread32(addr) | mask, addr);
>> +}
>> +
>> +static inline void pch_can_bit_clear(u32 *addr, u32 mask)
>                                         ^^^^^
> 
> dito
> 
>> +{
>> +	iowrite32(ioread32(addr) & ~mask, addr);
>> +}
>> +
>> +static void pch_can_set_run_mode(struct pch_can_priv *priv,
>> +				 enum pch_can_mode mode)
>> +{
>> +	switch (mode) {
>> +	case PCH_CAN_RUN:
>> +		pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_INIT);
>> +		break;
>> +
>> +	case PCH_CAN_STOP:
>> +		pch_can_bit_set(&priv->regs->cont, CAN_CTRL_INIT);
>> +		break;
>> +
>> +	default:
>> +		dev_err(&priv->ndev->dev, "%s -> Invalid Mode.\n", __func__);
>> +		break;
>> +	}
>> +}
>> +
>> +static void pch_can_set_optmode(struct pch_can_priv *priv)
>> +{
>> +	u32 reg_val = ioread32(&priv->regs->opt);
>> +
>> +	if (priv->can.ctrlmode & CAN_CTRLMODE_LISTENONLY)
>> +		reg_val |= CAN_OPT_SILENT;
>> +
>> +	if (priv->can.ctrlmode & CAN_CTRLMODE_LOOPBACK)
>> +		reg_val |= CAN_OPT_LBACK;
>> +
>> +	pch_can_bit_set(&priv->regs->cont, CAN_CTRL_OPT);
>> +	iowrite32(reg_val, &priv->regs->opt);
>> +}
>> +
> 
> IMHO the function name is missleading, if I understand the code
> correctly, this functions triggers the transmission of the message.
> After this it checks for busy, but 
> 
>> +static void pch_can_check_if_busy(u32 __iomem *creq_addr, u32 num)
>                                      ^^^^
> 
> that should probaby be a void

With separate structs for if1 and i2, a pointer to the relevant "struct
pch_can_if" could be passed instead.

>> +{
>> +	u32 counter = COUNTER_LIMIT;
>> +	u32 ifx_creq;
>> +
>> +	iowrite32(num, creq_addr);
>> +	while (counter) {
>> +		ifx_creq = ioread32(creq_addr) & CAN_IF_CREQ_BUSY;
>> +		if (!ifx_creq)
>> +			break;
>> +		counter--;
>> +		udelay(1);
>> +	}
>> +	if (!counter)
>> +		pr_err("%s:IF1 BUSY Flag is set forever.\n", __func__);
>> +}
>> +
>> +static void pch_can_set_int_enables(struct pch_can_priv *priv,
>> +				    enum pch_can_mode interrupt_no)
>> +{
>> +	switch (interrupt_no) {
>> +	case PCH_CAN_ENABLE:
>> +		pch_can_bit_set(&priv->regs->cont, CAN_CTRL_IE);
> 
> noone uses this case.
> 
>> +		break;
>> +
>> +	case PCH_CAN_DISABLE:
>> +		pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE);
>> +		break;
>> +
>> +	case PCH_CAN_ALL:
>> +		pch_can_bit_set(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
>> +		break;
>> +
>> +	case PCH_CAN_NONE:
>> +		pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
>> +		break;
>> +
>> +	default:
>> +		dev_err(&priv->ndev->dev, "Invalid interrupt number.\n");
>> +		break;
>> +	}
>> +}
>> +
>> +static void pch_can_set_rx_enable(struct pch_can_priv *priv, u32 buff_num,
>> +				  int set)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	/* Reading the receive buffer data from RAM to Interface1 registers */
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
>> +
>> +	/* Setting the IF1MASK1 register to access MsgVal and RxIE bits */
>> +	iowrite32(CAN_CMASK_RDWR | CAN_CMASK_ARB | CAN_CMASK_CTRL,
>> +		  &priv->regs->if1_cmask);
>> +
>> +	if (set == 1) {
>> +		/* Setting the MsgVal and RxIE bits */
>> +		pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_RXIE);
>> +		pch_can_bit_set(&priv->regs->if1_id2, CAN_ID_MSGVAL);
>> +
>> +	} else if (set == 0) {
>> +		/* Resetting the MsgVal and RxIE bits */
>> +		pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_RXIE);
>> +		pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID_MSGVAL);
>> +	}

Why not just?

	if (set)
	else


>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +}
>> +
>> +static void pch_can_rx_enable_all(struct pch_can_priv *priv)
>> +{
>> +	int i;
>> +
>> +	/* Traversing to obtain the object configured as receivers. */
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
>> +		pch_can_set_rx_enable(priv, i, 1);
>> +}
>> +
>> +static void pch_can_rx_disable_all(struct pch_can_priv *priv)
>> +{
>> +	int i;
>> +
>> +	/* Traversing to obtain the object configured as receivers. */
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
>> +		pch_can_set_rx_enable(priv, i, 0);
>> +}
>> +
>> +static void pch_can_set_tx_enable(struct pch_can_priv *priv, u32 buff_num,
>> +				 u32 set)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	/* Reading the Msg buffer from Message RAM to Interface2 registers. */
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
>> +
>> +	/* Setting the IF2CMASK register for accessing the
>> +		MsgVal and TxIE bits */
>> +	iowrite32(CAN_CMASK_RDWR | CAN_CMASK_ARB | CAN_CMASK_CTRL,
>> +		 &priv->regs->if2_cmask);
>> +
>> +	if (set == 1) {
>> +		/* Setting the MsgVal and TxIE bits */
>> +		pch_can_bit_set(&priv->regs->if2_mcont, CAN_IF_MCONT_TXIE);
>> +		pch_can_bit_set(&priv->regs->if2_id2, CAN_ID_MSGVAL);
>> +	} else if (set == 0) {
>> +		/* Resetting the MsgVal and TxIE bits. */
>> +		pch_can_bit_clear(&priv->regs->if2_mcont, CAN_IF_MCONT_TXIE);
>> +		pch_can_bit_clear(&priv->regs->if2_id2, CAN_ID_MSGVAL);
>> +	}
>> +
>> +	pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +}

That function is almost identical to pch_can_set_rx_enable. Just if2 is
used instead of if1 and CAN_IF_MCONT_TXIE instead of CAN_IF_MCONT_RXIE.
With separate "struct  pch_can_if" for if1 and if2, it could be handled
by a common function.

>> +static void pch_can_tx_enable_all(struct pch_can_priv *priv)
>> +{
>> +	int i;
>> +
>> +	/* Traversing to obtain the object configured as transmit object. */
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
>> +		pch_can_set_tx_enable(priv, i, 1);
>> +}
>> +
>> +static void pch_can_tx_disable_all(struct pch_can_priv *priv)
>> +{
>> +	int i;
>> +
>> +	/* Traversing to obtain the object configured as transmit object. */
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
>> +		pch_can_set_tx_enable(priv, i, 0);
>> +}

I think there is no need for separate functions for enable and disable.
Just pass "enable" 0 or 1 like you do with "set" above.

>> +static int pch_can_int_pending(struct pch_can_priv *priv)
>           ^^^
> 
> make it u32 as it returns a register value, or a u16 as you only use
> the 16 lower bits.
> 
>> +{
>> +	return ioread32(&priv->regs->intr) & 0xffff;
>> +}
>> +
>> +static void pch_can_clear_buffers(struct pch_can_priv *priv)
>> +{
>> +	int i; /* Msg Obj ID (1~32) */
>> +
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
> 
> IMHO the readability would be improved if you define something like
> PCH_RX_OBJ_START and PCH_RX_OBJ_END.
> 
>> +		iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if1_cmask);
>> +		iowrite32(0xffff, &priv->regs->if1_mask1);
>> +		iowrite32(0xffff, &priv->regs->if1_mask2);
>> +		iowrite32(0x0, &priv->regs->if1_id1);
>> +		iowrite32(0x0, &priv->regs->if1_id2);
>> +		iowrite32(0x0, &priv->regs->if1_mcont);
>> +		iowrite32(0x0, &priv->regs->if1_dataa1);
>> +		iowrite32(0x0, &priv->regs->if1_dataa2);
>> +		iowrite32(0x0, &priv->regs->if1_datab1);
>> +		iowrite32(0x0, &priv->regs->if1_datab2);
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK |
>> +			  CAN_CMASK_ARB | CAN_CMASK_CTRL,
>> +			  &priv->regs->if1_cmask);
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, i);
>> +	}
>> +
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
>                  ^^^^^^^^^^^^^^^^^^
> dito for TX objects
> 
>> +		iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if2_cmask);
>> +		iowrite32(0xffff, &priv->regs->if2_mask1);
>> +		iowrite32(0xffff, &priv->regs->if2_mask2);
>> +		iowrite32(0x0, &priv->regs->if2_id1);
>> +		iowrite32(0x0, &priv->regs->if2_id2);
>> +		iowrite32(0x0, &priv->regs->if2_mcont);
>> +		iowrite32(0x0, &priv->regs->if2_dataa1);
>> +		iowrite32(0x0, &priv->regs->if2_dataa2);
>> +		iowrite32(0x0, &priv->regs->if2_datab1);
>> +		iowrite32(0x0, &priv->regs->if2_datab2);
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
>> +			  CAN_CMASK_CTRL, &priv->regs->if2_cmask);
>> +		pch_can_check_if_busy(&priv->regs->if2_creq, i);

This is almost the same code as above, just if2 instead of if1. With
separate "struct  pch_can_if" for if1 and i2, it could be handled by a
common function.

>> +	}
>> +}
>> +
>> +static void pch_can_config_rx_tx_buffers(struct pch_can_priv *priv)
>> +{
>> +	int i;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
>> +		iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, i);
> 
> If I understand the code correctly, the about function triggers a
> transfer. Why do you first trigger a transfer, then set the message contents....
>> +
>> +		iowrite32(0x0, &priv->regs->if1_id1);
>> +		iowrite32(0x0, &priv->regs->if1_id2);
>> +
>> +		pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_UMASK);
> 
>     Why do you set the "Use acceptance mask" bit? We want to receive
>     all can messages.
> 
>> +
>> +		/* Set FIFO mode set to 0 except last Rx Obj*/
>> +		pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
>> +		/* In case FIFO mode, Last EoB of Rx Obj must be 1 */
>> +		if (i == (PCH_RX_OBJ_NUM - 1))
>> +			pch_can_bit_set(&priv->regs->if1_mcont,
>> +					CAN_IF_MCONT_EOB);
> 
>     Make it if () { } else { }, please.
> 
>> +
>> +		iowrite32(0, &priv->regs->if1_mask1);
>> +		pch_can_bit_clear(&priv->regs->if1_mask2,
>> +				  0x1fff | CAN_MASK2_MDIR_MXTD);
>> +
>> +		/* Setting CMASK for writing */
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
>> +			  CAN_CMASK_CTRL, &priv->regs->if1_cmask);
>> +
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, i);
> 
> ...and then trigger the transfer again?
> 
>> +	}
>> +
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
>> +		iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
>> +		pch_can_check_if_busy(&priv->regs->if2_creq, i);
> 
> same question about triggering the transfer 2 times applied here, too
>> +
>> +		/* Resetting DIR bit for reception */
>> +		iowrite32(0x0, &priv->regs->if2_id1);
>> +		iowrite32(0x0, &priv->regs->if2_id2);
>> +		pch_can_bit_set(&priv->regs->if2_id2, CAN_ID2_DIR);
> 
> Can you combine the two accesses to >if2_id2 into one?
> 
>> +
>> +		/* Setting EOB bit for transmitter */
>> +		iowrite32(CAN_IF_MCONT_EOB, &priv->regs->if2_mcont);
>> +
>> +		pch_can_bit_set(&priv->regs->if2_mcont,
>> +				CAN_IF_MCONT_UMASK);
> 
> dito for if2_mcont
> 
>> +
>> +		iowrite32(0, &priv->regs->if2_mask1);
>> +		pch_can_bit_clear(&priv->regs->if2_mask2, 0x1fff);
>> +
>> +		/* Setting CMASK for writing */
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
>> +			  CAN_CMASK_CTRL, &priv->regs->if2_cmask);
>> +
>> +		pch_can_check_if_busy(&priv->regs->if2_creq, i);
>> +	}
>> +
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +}
>> +
>> +static void pch_can_init(struct pch_can_priv *priv)
>> +{
>> +	/* Stopping the Can device. */
>> +	pch_can_set_run_mode(priv, PCH_CAN_STOP);
>> +
>> +	/* Clearing all the message object buffers. */
>> +	pch_can_clear_buffers(priv);
>> +
>> +	/* Configuring the respective message object as either rx/tx object. */
>> +	pch_can_config_rx_tx_buffers(priv);
>> +
>> +	/* Enabling the interrupts. */
>> +	pch_can_set_int_enables(priv, PCH_CAN_ALL);
>> +}
>> +
>> +static void pch_can_release(struct pch_can_priv *priv)
>> +{
>> +	/* Stooping the CAN device. */
>> +	pch_can_set_run_mode(priv, PCH_CAN_STOP);
>> +
>> +	/* Disabling the interrupts. */
>> +	pch_can_set_int_enables(priv, PCH_CAN_NONE);
>> +
>> +	/* Disabling all the receive object. */
>> +	pch_can_rx_disable_all(priv);
>> +
>> +	/* Disabling all the transmit object. */
>> +	pch_can_tx_disable_all(priv);
>> +}
>> +
>> +/* This function clears interrupt(s) from the CAN device. */
>> +static void pch_can_int_clr(struct pch_can_priv *priv, u32 mask)
>> +{
>> +	if (mask == CAN_STATUS_INT) {
> 
> is this a valid case?
> 
>> +		ioread32(&priv->regs->stat);
>> +		return;
>> +	}
>> +
>> +	/* Clear interrupt for transmit object */
>> +	if ((mask >= 1) && (mask <= PCH_RX_OBJ_NUM)) {
>> +		/* Setting CMASK for clearing the reception interrupts. */
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL | CAN_CMASK_ARB,
>> +			  &priv->regs->if1_cmask);
>> +
>> +		/* Clearing the Dir bit. */
>> +		pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID2_DIR);
>> +
>> +		/* Clearing NewDat & IntPnd */
>> +		pch_can_bit_clear(&priv->regs->if1_mcont,
>> +				  CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_INTPND);
>> +
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, mask);
>> +	} else if ((mask > PCH_RX_OBJ_NUM) && (mask <= PCH_OBJ_NUM)) {
>> +		/* Setting CMASK for clearing interrupts for
>> +		   frame transmission. */
> 
> /*
>  * this is the prefered style of multi line comments,
>  * please adjust you comments
>  */
> 
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL | CAN_CMASK_ARB,
>> +			  &priv->regs->if2_cmask);
>> +
>> +		/* Resetting the ID registers. */
>> +		pch_can_bit_set(&priv->regs->if2_id2,
>> +			       CAN_ID2_DIR | (0x7ff << 2));
>> +		iowrite32(0x0, &priv->regs->if2_id1);
>> +
>> +		/* Claring NewDat, TxRqst & IntPnd */
>> +		pch_can_bit_clear(&priv->regs->if2_mcont,
>> +				  CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_INTPND |
>> +				  CAN_IF_MCONT_TXRQXT);
>> +		pch_can_check_if_busy(&priv->regs->if2_creq, mask);
>> +	}
>> +}
>> +
>> +static u32 pch_can_get_buffer_status(struct pch_can_priv *priv)
>> +{
>> +	return (ioread32(&priv->regs->treq1) & 0xffff) |
>> +	       ((ioread32(&priv->regs->treq2) & 0xffff) << 16);
> 
> the second 0xffff is not needed, as the return value is u32 and you shift by 16.
> 
>> +}
>> +
>> +static void pch_can_reset(struct pch_can_priv *priv)
>> +{
>> +	/* write to sw reset register */
>> +	iowrite32(1, &priv->regs->srst);
>> +	iowrite32(0, &priv->regs->srst);
>> +}
>> +
>> +static void pch_can_error(struct net_device *ndev, u32 status)
>> +{
>> +	struct sk_buff *skb;
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	struct can_frame *cf;
>> +	u32 errc;
>> +	struct net_device_stats *stats = &(priv->ndev->stats);
>> +	enum can_state state = priv->can.state;
>> +
>> +	skb = alloc_can_err_skb(ndev, &cf);
>> +	if (!skb)
>> +		return;
>> +
>> +	if (status & PCH_BUS_OFF) {
>> +		pch_can_tx_disable_all(priv);
>> +		pch_can_rx_disable_all(priv);
>> +		state = CAN_STATE_BUS_OFF;
>> +		cf->can_id |= CAN_ERR_BUSOFF;
>> +		can_bus_off(ndev);
>> +	}
>> +
>> +	/* Warning interrupt. */
>> +	if (status & PCH_EWARN) {
>> +		state = CAN_STATE_ERROR_WARNING;
>> +		priv->can.can_stats.error_warning++;
>> +		cf->can_id |= CAN_ERR_CRTL;
>> +		errc = ioread32(&priv->regs->errc);
>> +		if (((errc & CAN_REC) >> 8) > 96)
>> +			cf->data[1] |= CAN_ERR_CRTL_RX_WARNING;
>> +		if ((errc & CAN_TEC) > 96)
>> +			cf->data[1] |= CAN_ERR_CRTL_TX_WARNING;
>> +		dev_warn(&ndev->dev,
>> +			"%s -> Error Counter is more than 96.\n", __func__);
> 
> Please use just "debug" level not warning here. Consider to use
> netdev_dbg() instead. IMHO the __func__ can be dropped and the
> "official" name for the error is "Error Warning".
> 
>> +	}
>> +	/* Error passive interrupt. */
>> +	if (status & PCH_EPASSIV) {
>> +		priv->can.can_stats.error_passive++;
>> +		state = CAN_STATE_ERROR_PASSIVE;
>> +		cf->can_id |= CAN_ERR_CRTL;
>> +		errc = ioread32(&priv->regs->errc);
>> +		if (((errc & CAN_REC) >> 8) > 127)
>> +			cf->data[1] |= CAN_ERR_CRTL_RX_PASSIVE;
>> +		if ((errc & CAN_TEC) > 127)
>> +			cf->data[1] |= CAN_ERR_CRTL_TX_PASSIVE;
>> +		dev_err(&ndev->dev,
>> +			"%s -> CAN controller is ERROR PASSIVE .\n", __func__);
> 
> dito
> 
>> +	}
>> +
>> +	if (status & PCH_LEC_ALL) {
>> +		priv->can.can_stats.bus_error++;
>> +		stats->rx_errors++;
>> +		switch (status & PCH_LEC_ALL) {
> 
> I suggest to convert to a if-bit-set because there might be more than
> one bit set.

Marc, what do you mean here. It's an enumeraton. Maybe the following
code is more clear:

	lec = status & PCH_LEC_ALL;
	if (lec > 0) {
		switch (lec) {

>> +		case PCH_STUF_ERR:
>> +			cf->data[2] |= CAN_ERR_PROT_STUFF;
>> +			break;
>> +		case PCH_FORM_ERR:
>> +			cf->data[2] |= CAN_ERR_PROT_FORM;
>> +			break;
>> +		case PCH_ACK_ERR:
>> +			cf->data[2] |= CAN_ERR_PROT_LOC_ACK |
>> +				       CAN_ERR_PROT_LOC_ACK_DEL;

Could you check what that type of bus error that is? Usually it's a ack
lost error.

>> +			break;
>> +		case PCH_BIT1_ERR:
>> +		case PCH_BIT0_ERR:
>> +			cf->data[2] |= CAN_ERR_PROT_BIT;
>> +			break;
>> +		case PCH_CRC_ERR:
>> +			cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
>> +				       CAN_ERR_PROT_LOC_CRC_DEL;
>> +			break;
>> +		default:
>> +			iowrite32(status | PCH_LEC_ALL, &priv->regs->stat);
>> +			break;
>> +		}
>> +
>> +	}

Also, could you please add the TEC and REC:

	cf->data[6] = ioread32(&priv->regs->errc) & CAN_TEC;
	cf->data[7] = (ioread32(&priv->regs->errc) & CAN_REC) >> 8;

>> +	priv->can.state = state;
>> +	netif_receive_skb(skb);
>> +
>> +	stats->rx_packets++;
>> +	stats->rx_bytes += cf->can_dlc;
>> +}
>> +
>> +static irqreturn_t pch_can_interrupt(int irq, void *dev_id)
>> +{
>> +	struct net_device *ndev = (struct net_device *)dev_id;
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +
>> +	pch_can_set_int_enables(priv, PCH_CAN_NONE);
>> +	napi_schedule(&priv->napi);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static void pch_fifo_thresh(struct pch_can_priv *priv, int obj_id)
>> +{
>> +	if (obj_id < PCH_FIFO_THRESH) {
>> +		iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL |
>> +			  CAN_CMASK_ARB, &priv->regs->if1_cmask);
>> +
>> +		/* Clearing the Dir bit. */
>> +		pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID2_DIR);
>> +
>> +		/* Clearing NewDat & IntPnd */
>> +		pch_can_bit_clear(&priv->regs->if1_mcont,
>> +				  CAN_IF_MCONT_INTPND);
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, obj_id);
>> +	} else if (obj_id > PCH_FIFO_THRESH) {
>> +		pch_can_int_clr(priv, obj_id);
>> +	} else if (obj_id == PCH_FIFO_THRESH) {
>> +		int cnt;
>> +		for (cnt = 0; cnt < PCH_FIFO_THRESH; cnt++)
>> +			pch_can_int_clr(priv, cnt+1);
>> +	}
>> +}
>> +
>> +static int pch_can_rx_msg_lost(struct net_device *ndev, int obj_id)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	struct net_device_stats *stats = &(priv->ndev->stats);
>> +	struct sk_buff *skb;
>> +	struct can_frame *cf;
>> +
>> +	dev_err(&priv->ndev->dev, "Msg Obj is overwritten.\n");

Please use dev_dbg or even remove the line above.

>> +	pch_can_bit_clear(&priv->regs->if1_mcont,
>> +			  CAN_IF_MCONT_MSGLOST);
>> +	iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL,
>> +		  &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, obj_id);

I think the if busy checks could be improved. Why do you need to wait here?

>> +
>> +	skb = alloc_can_err_skb(ndev, &cf);
>> +	if (!skb)
>> +		return -ENOMEM;
>> +
>> +	priv->can.can_stats.error_passive++;
>> +	priv->can.state = CAN_STATE_ERROR_PASSIVE;

Please remove the above two bogus lines.

>> +	cf->can_id |= CAN_ERR_CRTL;
>> +	cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
>> +	stats->rx_over_errors++;
>> +	stats->rx_errors++;
>> +
>> +	netif_receive_skb(skb);
>> +
>> +	return 0;
>> +}
>> +
>> +static int pch_can_rx_normal(struct net_device *ndev, u32 obj_num, int quota)
>> +{
>> +	u32 reg;
>> +	canid_t id;
>> +	u32 ide;
>> +	u32 rtr;
>> +	int rcv_pkts = 0;
>> +	int rtn;
>> +	int next_flag = 0;
>> +	struct sk_buff *skb;
>> +	struct can_frame *cf;
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	struct net_device_stats *stats = &(priv->ndev->stats);
>> +
>> +	/* Reading the messsage object from the Message RAM */
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, obj_num);
>> +
>> +	/* Reading the MCONT register. */
>> +	reg = ioread32(&priv->regs->if1_mcont);
>> +	reg &= 0xffff;
>> +
>> +	for (; (!(reg & CAN_IF_MCONT_EOB)) && (quota > 0);
>> +						obj_num++, next_flag = 0) {
>> +		/* If MsgLost bit set. */
>> +		if (reg & CAN_IF_MCONT_MSGLOST) {
>> +			rtn = pch_can_rx_msg_lost(ndev, obj_num);
>> +			if (!rtn)
>> +				return rtn;
>> +			rcv_pkts++;
>> +			quota--;
>> +			next_flag = 1;
>> +		} else if (!(reg & CAN_IF_MCONT_NEWDAT))
>> +			next_flag = 1;
>> +
> 
> after rearanging the code (see below..) you should be able to use a continue here.
> 
>> +		if (!next_flag) {
>> +			skb = alloc_can_skb(priv->ndev, &cf);
>> +			if (!skb)
>> +				return -ENOMEM;
>> +
>> +			/* Get Received data */
>> +			ide = ((ioread32(&priv->regs->if1_id2)) & CAN_ID2_XTD);
>> +			if (ide) {
>> +				id = (ioread32(&priv->regs->if1_id1) & 0xffff);
>> +				id |= (((ioread32(&priv->regs->if1_id2)) &
>> +						    0x1fff) << 16);
>> +				cf->can_id = (id & CAN_EFF_MASK) | CAN_EFF_FLAG;
>                                               ^^^^^^^^^^^^^^^^^
> 
> is the mask needed, you mask the if1_id{1,2} already
> 
>> +			} else {
>> +				id = (((ioread32(&priv->regs->if1_id2)) &
>> +						  (CAN_SFF_MASK << 2)) >> 2);
>> +				cf->can_id = (id & CAN_SFF_MASK);
> 
> one mask can go away
> 
>> +			}
>> +
>> +			rtr = ioread32(&priv->regs->if1_id2) &  CAN_ID2_DIR;
>                                                               ^^
> 
> remove one space
> 
>> +
>> +			if (rtr)
>> +				cf->can_id |= CAN_RTR_FLAG;
>> +
>> +			cf->can_dlc = get_can_dlc((ioread32(&priv->regs->
>> +						   if1_mcont)) & 0xF);
>> +			*(u16 *)(cf->data + 0) = ioread16(&priv->regs->
>> +							  if1_dataa1);
>> +			*(u16 *)(cf->data + 2) = ioread16(&priv->regs->
>> +							  if1_dataa2);
>> +			*(u16 *)(cf->data + 4) = ioread16(&priv->regs->
>> +							  if1_datab1);
>> +			*(u16 *)(cf->data + 6) = ioread16(&priv->regs->
>> +							  if1_datab2);
> 
> are you sure, the bytes in the can package a in the correct order.
> Please test your pch_can against a non pch_can system.
> 
>> +
>> +			netif_receive_skb(skb);
>> +			rcv_pkts++;
>> +			stats->rx_packets++;
>> +			quota--;
>> +			stats->rx_bytes += cf->can_dlc;
>> +
>> +			pch_fifo_thresh(priv, obj_num);
>> +		}
>> +
>> +		/* Reading the messsage object from the Message RAM */
>> +		iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +		pch_can_check_if_busy(&priv->regs->if1_creq, obj_num + 1);
>> +		reg = ioread32(&priv->regs->if1_mcont);
> 
> this is almost the same code as before the the loop, can you rearange
> the code to avoid duplication?
>  
>> +	}
>> +
>> +	return rcv_pkts;
>> +}
>> +
>> +static void pch_can_tx_complete(struct net_device *ndev, u32 int_stat)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	struct net_device_stats *stats = &(priv->ndev->stats);
>> +	unsigned long flags;
>> +	u32 dlc;
>> +
>> +	can_get_echo_skb(ndev, int_stat - PCH_RX_OBJ_NUM - 1);
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	iowrite32(CAN_CMASK_RX_TX_GET | CAN_CMASK_CLRINTPND,
>> +		  &priv->regs->if2_cmask);
>> +	dlc = ioread32(&priv->regs->if2_mcont) & CAN_IF_MCONT_DLC;
>> +	pch_can_check_if_busy(&priv->regs->if2_creq, int_stat);
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +	if (dlc > 8)
>> +		dlc = 8;
> 
> use get_can_dlc
> 
>> +	stats->tx_bytes += dlc;
>> +	stats->tx_packets++;
>> +}
>> +
>> +static int pch_can_rx_poll(struct napi_struct *napi, int quota)
>> +{
>> +	struct net_device *ndev = napi->dev;
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	u32 int_stat;
>> +	int rcv_pkts = 0;
>> +	u32 reg_stat;
>> +	unsigned long flags;
>> +
>> +	int_stat = pch_can_int_pending(priv);
>> +	if (!int_stat)
>> +		goto END;

Labels should be lowercase as well.

>> +
>> +	if ((int_stat == CAN_STATUS_INT) && (quota > 0)) {
>> +		reg_stat = ioread32(&priv->regs->stat);
>> +		if (reg_stat & (PCH_BUS_OFF | PCH_LEC_ALL)) {
>> +			if ((reg_stat & PCH_LEC_ALL) != PCH_LEC_ALL) {
>> +				pch_can_error(ndev, reg_stat);
>> +				quota--;
>> +			}
>> +		}
>> +
>> +		if (reg_stat & PCH_TX_OK) {
>> +			spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +			iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
>> +			pch_can_check_if_busy(&priv->regs->if2_creq,
>> +					       ioread32(&priv->regs->intr));
>                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Isn't this "int_stat". Might it be possilbe that regs->intr changes
> between the pch_can_int_pending and here?
> 
> What should this transfer do?
> 
>> +			spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +			pch_can_bit_clear(&priv->regs->stat, PCH_TX_OK);
>> +		}
>> +
>> +		if (reg_stat & PCH_RX_OK)
>> +			pch_can_bit_clear(&priv->regs->stat, PCH_RX_OK);
>> +
>> +		int_stat = pch_can_int_pending(priv);
>> +	}
>> +
>> +	if (quota == 0)
>> +		goto END;
>> +
>> +	if ((int_stat >= 1) && (int_stat <= PCH_RX_OBJ_NUM)) {
>> +		spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +		rcv_pkts += pch_can_rx_normal(ndev, int_stat, quota);
>> +		spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +		quota -= rcv_pkts;
>> +		if (rcv_pkts < 0)
> 
> how can this happen?
> 
>> +			goto END;
>> +	} else if ((int_stat > PCH_RX_OBJ_NUM) && (int_stat <= PCH_OBJ_NUM)) {
>> +		/* Handle transmission interrupt */
>> +		pch_can_tx_complete(ndev, int_stat);
>> +	}
>> +
>> +END:
>> +	napi_complete(napi);
>> +	pch_can_set_int_enables(priv, PCH_CAN_ALL);
>> +
>> +	return rcv_pkts;
>> +}
>> +
>> +static int pch_set_bittiming(struct net_device *ndev)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	const struct can_bittiming *bt = &priv->can.bittiming;
>> +	u32 canbit;
>> +	u32 bepe;
>> +
>> +	/* Setting the CCE bit for accessing the Can Timing register. */
>> +	pch_can_bit_set(&priv->regs->cont, CAN_CTRL_CCE);
>> +
>> +	canbit = (bt->brp - 1) & MSK_BITT_BRP;
>> +	canbit |= (bt->sjw - 1) << BIT_BITT_SJW;
>> +	canbit |= (bt->phase_seg1 + bt->prop_seg - 1) << BIT_BITT_TSEG1;
>> +	canbit |= (bt->phase_seg2 - 1) << BIT_BITT_TSEG2;
>> +	bepe = ((bt->brp - 1) & MSK_BRPE_BRPE) >> BIT_BRPE_BRPE;
>> +	iowrite32(canbit, &priv->regs->bitt);
>> +	iowrite32(bepe, &priv->regs->brpe);
>> +	pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_CCE);
>> +
>> +	return 0;
>> +}
>> +
>> +static void pch_can_start(struct net_device *ndev)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +
>> +	if (priv->can.state != CAN_STATE_STOPPED)
>> +		pch_can_reset(priv);
>> +
>> +	pch_set_bittiming(ndev);
>> +	pch_can_set_optmode(priv);
>> +
>> +	pch_can_tx_enable_all(priv);
>> +	pch_can_rx_enable_all(priv);
>> +
>> +	/* Setting the CAN to run mode. */
>> +	pch_can_set_run_mode(priv, PCH_CAN_RUN);
>> +
>> +	priv->can.state = CAN_STATE_ERROR_ACTIVE;
>> +
>> +	return;
>> +}
>> +
>> +static int pch_can_do_set_mode(struct net_device *ndev, enum can_mode mode)
>> +{
>> +	int ret = 0;
>> +
>> +	switch (mode) {
>> +	case CAN_MODE_START:
>> +		pch_can_start(ndev);
>> +		netif_wake_queue(ndev);
>> +		break;
>> +	default:
>> +		ret = -EOPNOTSUPP;
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static int pch_can_open(struct net_device *ndev)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	int retval;
>> +
>> +	/* Regsitering the interrupt. */

Typo!

>> +	retval = request_irq(priv->dev->irq, pch_can_interrupt, IRQF_SHARED,
>> +			     ndev->name, ndev);
>> +	if (retval) {
>> +		dev_err(&ndev->dev, "request_irq failed.\n");
>> +		goto req_irq_err;
>> +	}
>> +
>> +	/* Open common can device */
>> +	retval = open_candev(ndev);
>> +	if (retval) {
>> +		dev_err(ndev->dev.parent, "open_candev() failed %d\n", retval);
>> +		goto err_open_candev;
>> +	}
>> +
>> +	pch_can_init(priv);
>> +	pch_can_start(ndev);
>> +	napi_enable(&priv->napi);
>> +	netif_start_queue(ndev);
>> +
>> +	return 0;
>> +
>> +err_open_candev:
>> +	free_irq(priv->dev->irq, ndev);
>> +req_irq_err:
>> +	pch_can_release(priv);
>> +
>> +	return retval;
>> +}
>> +
>> +static int pch_close(struct net_device *ndev)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +
>> +	netif_stop_queue(ndev);
>> +	napi_disable(&priv->napi);
>> +	pch_can_release(priv);
>> +	free_irq(priv->dev->irq, ndev);
>> +	close_candev(ndev);
>> +	priv->can.state = CAN_STATE_STOPPED;
>> +	return 0;
>> +}
>> +
>> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
>> +{
>> +	unsigned long flags;
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +	struct can_frame *cf = (struct can_frame *)skb->data;
>> +	int tx_buffer_avail = 0;
> 
> What I'm totally missing is the TX flow controll. Your driver has to
> ensure that the package leave the controller in the order that come
> into the xmit function. Further you have to stop your xmit queue if
> you're out of tx objects and reenable if you have a object free.
> 
> Use netif_stop_queue() and netif_wake_queue() for this.
> 
>> +
>> +	if (can_dropped_invalid_skb(ndev, skb))
>> +		return NETDEV_TX_OK;
>> +
>> +	if (priv->tx_obj == (PCH_OBJ_NUM + 1)) { /* Point tail Obj + 1 */
>> +		while (ioread32(&priv->regs->treq2) & 0xfc00)
>> +			udelay(1);
> 
> please no (possible) infinite delays!
> 
>> +		priv->tx_obj = PCH_RX_OBJ_NUM + 1; /* Point head of Tx Obj ID */
>> +	}
> 
>> +
>> +	tx_buffer_avail = priv->tx_obj;
> 
> why has the "object" become a "buffer" now? :)
> 
>> +	priv->tx_obj++;
>> +
>> +	/* Attaining the lock. */
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +
>> +	/* Setting the CMASK register to set value*/
>                                                  ^^^
> 
> pleas add a whitespace
> 
>> +	iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if2_cmask);
>> +
>> +	/* If ID extended is set. */
>> +	if (cf->can_id & CAN_EFF_FLAG) {
>> +		iowrite32(cf->can_id & 0xffff, &priv->regs->if2_id1);
>> +		iowrite32(((cf->can_id >> 16) & 0x1fff) | CAN_ID2_XTD,
>> +			    &priv->regs->if2_id2);
>> +	} else {
>> +		iowrite32(0, &priv->regs->if2_id1);
>> +		iowrite32((cf->can_id & CAN_SFF_MASK) << 2,
>> +			   &priv->regs->if2_id2);
>> +	}
>> +
>> +	pch_can_bit_set(&priv->regs->if2_id2, CAN_ID_MSGVAL);
> 
> Do you need to do a read-modify-write of the hardware register? Please
> prepare the values you want to write to hardware, then do it.
> 
>> +
>> +	/* If remote frame has to be transmitted.. */
>> +	if (!(cf->can_id & CAN_RTR_FLAG))
>> +		pch_can_bit_set(&priv->regs->if2_id2, CAN_ID2_DIR);
> dito
>> +	/* If remote frame has to be transmitted.. */
>> +	if (cf->can_id & CAN_RTR_FLAG)
>> +		pch_can_bit_clear(&priv->regs->if2_id2, CAN_ID2_DIR);
> dito
>> +
>> +	/* Copy data to register */
>> +	if (cf->can_dlc > 0) {
>> +		u32 data1 = *((u16 *)&cf->data[0]);
>> +		iowrite32(data1, &priv->regs->if2_dataa1);
> 
> do you think you send the bytes in correct order?
> 
>> +	}
>> +	if (cf->can_dlc > 2) {
>> +		u32 data1 = *((u16 *)&cf->data[2]);
>> +		iowrite32(data1, &priv->regs->if2_dataa2);
>> +	}
>> +	if (cf->can_dlc > 4) {
>> +		u32 data1 = *((u16 *)&cf->data[4]);
>> +		iowrite32(data1, &priv->regs->if2_datab1);
>> +	}
>> +	if (cf->can_dlc > 6) {
>> +		u32 data1 = *((u16 *)&cf->data[6]);
>> +		iowrite32(data1, &priv->regs->if2_datab2);
>> +	}

Could be handled by a loop.

>> +	can_put_echo_skb(skb, ndev, tx_buffer_avail - PCH_RX_OBJ_NUM - 1);
>> +
>> +	/* Set the size of the data. */
>> +	iowrite32(cf->can_dlc, &priv->regs->if2_mcont);
>> +
>> +	/* Update if2_mcont */
>> +	pch_can_bit_set(&priv->regs->if2_mcont,
>> +			CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_TXRQXT |
>> +			CAN_IF_MCONT_TXIE);
> 
> pleae first perpare your value, then write to hardware.
> 
>> +
>> +	if (tx_buffer_avail == PCH_RX_OBJ_NUM) /* If points tail of FIFO  */
>> +		pch_can_bit_set(&priv->regs->if2_mcont, CAN_IF_MCONT_EOB);
> 
> dito
> 
> Is EOB relevant for TX objects?
> 
>> +	pch_can_check_if_busy(&priv->regs->if2_creq, tx_buffer_avail);
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +
>> +	return NETDEV_TX_OK;
>> +}
>> +
>> +static const struct net_device_ops pch_can_netdev_ops = {
>> +	.ndo_open		= pch_can_open,
>> +	.ndo_stop		= pch_close,
>> +	.ndo_start_xmit		= pch_xmit,
>> +};
>> +
>> +static void __devexit pch_can_remove(struct pci_dev *pdev)
>> +{
>> +	struct net_device *ndev = pci_get_drvdata(pdev);
>> +	struct pch_can_priv *priv = netdev_priv(ndev);
>> +
>> +	unregister_candev(priv->ndev);
>> +	pci_iounmap(pdev, priv->regs);
>> +	if (priv->use_msi)
>> +		pci_disable_msi(priv->dev);
>> +	pci_release_regions(pdev);
>> +	pci_disable_device(pdev);
>> +	pci_set_drvdata(pdev, NULL);
>> +	free_candev(priv->ndev);
>> +}
>> +
>> +#ifdef CONFIG_PM
>> +static void pch_can_set_int_custom(struct pch_can_priv *priv)
>> +{
>> +	/* Clearing the IE, SIE and EIE bits of Can control register. */
>> +	pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
>> +
>> +	/* Appropriately setting them. */
>> +	pch_can_bit_set(&priv->regs->cont,
>> +			((priv->int_enables & MSK_CTRL_IE_SIE_EIE) << 1));
>> +}
>> +
>> +/* This function retrieves interrupt enabled for the CAN device. */
>> +static u32 pch_can_get_int_enables(struct pch_can_priv *priv)
>> +{
>> +	/* Obtaining the status of IE, SIE and EIE interrupt bits. */
>> +	return (ioread32(&priv->regs->cont) & CAN_CTRL_IE_SIE_EIE) >> 1;
>> +}
>> +
>> +static u32 pch_can_get_rx_enable(struct pch_can_priv *priv, u32 buff_num)
>> +{
>> +	unsigned long flags;
>> +	u32 enable;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
>> +
>> +	if (((ioread32(&priv->regs->if1_id2)) & CAN_ID_MSGVAL) &&
>> +			((ioread32(&priv->regs->if1_mcont)) &
>> +			CAN_IF_MCONT_RXIE))
>> +		enable = 1;
>> +	else
>> +		enable = 0;
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +	return enable;
>> +}
>> +
>> +static u32 pch_can_get_tx_enable(struct pch_can_priv *priv, u32 buff_num)
>> +{
>> +	unsigned long flags;
>> +	u32 enable;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
>> +	if (((ioread32(&priv->regs->if2_id2)) & CAN_ID_MSGVAL) &&
>> +			((ioread32(&priv->regs->if2_mcont)) &
>> +			CAN_IF_MCONT_TXIE)) {
>> +		enable = 1;
>> +	} else {
>> +		enable = 0;
>> +	}
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +
>> +	return enable;
>> +}

The above two functions could be handled by a common one passing "struct
pch_can_if". See similar comments above.

>> +static void pch_can_set_rx_buffer_link(struct pch_can_priv *priv,
>> +				       u32 buffer_num, u32 set)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
>> +	iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL, &priv->regs->if1_cmask);
>> +	if (set == 1)
>> +		pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
>> +	else
>> +		pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
>> +
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +}
>> +
>> +static u32 pch_can_get_rx_buffer_link(struct pch_can_priv *priv, u32 buffer_num)
>> +{
>> +	unsigned long flags;
>> +	u32 link;
>> +
>> +	spin_lock_irqsave(&priv->msgif_reg_lock, flags);
>> +	iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
>> +	pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
>> +
>> +	if (ioread32(&priv->regs->if1_mcont) & CAN_IF_MCONT_EOB)
>> +		link = 0;
>> +	else
>> +		link = 1;
>> +	spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
>> +	return link;
>> +}
>> +
>> +static int pch_can_suspend(struct pci_dev *pdev, pm_message_t state)
>> +{
>> +	int i;
>> +	int retval;
>> +	u32 buf_stat;	/* Variable for reading the transmit buffer status. */
>> +	u32 counter = COUNTER_LIMIT;
>> +
>> +	struct net_device *dev = pci_get_drvdata(pdev);
>> +	struct pch_can_priv *priv = netdev_priv(dev);
>> +
>> +	/* Stop the CAN controller */
>> +	pch_can_set_run_mode(priv, PCH_CAN_STOP);
>> +
>> +	/* Indicate that we are aboutto/in suspend */
>> +	priv->can.state = CAN_STATE_STOPPED;
>> +
>> +	/* Waiting for all transmission to complete. */
>> +	while (counter) {
>> +		buf_stat = pch_can_get_buffer_status(priv);
>> +		if (!buf_stat)
>> +			break;
>> +		counter--;
>> +		udelay(1);
>> +	}
>> +	if (!counter)
>> +		dev_err(&pdev->dev, "%s -> Transmission time out.\n", __func__);
>> +
>> +	/* Save interrupt configuration and then disable them */
>> +	priv->int_enables = pch_can_get_int_enables(priv);
>> +	pch_can_set_int_enables(priv, PCH_CAN_DISABLE);
>> +
>> +	/* Save Tx buffer enable state */
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
>> +		priv->tx_enable[i] = pch_can_get_tx_enable(priv, i);
>> +
>> +	/* Disable all Transmit buffers */
>> +	pch_can_tx_disable_all(priv);
>> +
>> +	/* Save Rx buffer enable state */
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
>> +		priv->rx_enable[i] = pch_can_get_rx_enable(priv, i);
>> +		priv->rx_link[i] = pch_can_get_rx_buffer_link(priv, i);
>> +	}
>> +
>> +	/* Disable all Receive buffers */
>> +	pch_can_rx_disable_all(priv);
>> +	retval = pci_save_state(pdev);
>> +	if (retval) {
>> +		dev_err(&pdev->dev, "pci_save_state failed.\n");
>> +	} else {
>> +		pci_enable_wake(pdev, PCI_D3hot, 0);
>> +		pci_disable_device(pdev);
>> +		pci_set_power_state(pdev, pci_choose_state(pdev, state));
>> +	}
>> +
>> +	return retval;
>> +}
>> +
>> +static int pch_can_resume(struct pci_dev *pdev)
>> +{
>> +	int i;
>> +	int retval;
>> +	struct net_device *dev = pci_get_drvdata(pdev);
>> +	struct pch_can_priv *priv = netdev_priv(dev);
>> +
>> +	pci_set_power_state(pdev, PCI_D0);
>> +	pci_restore_state(pdev);
>> +	retval = pci_enable_device(pdev);
>> +	if (retval) {
>> +		dev_err(&pdev->dev, "pci_enable_device failed.\n");
>> +		return retval;
>> +	}
>> +
>> +	pci_enable_wake(pdev, PCI_D3hot, 0);
>> +
>> +	priv->can.state = CAN_STATE_ERROR_ACTIVE;
>> +
>> +	/* Disabling all interrupts. */
>> +	pch_can_set_int_enables(priv, PCH_CAN_DISABLE);
>> +
>> +	/* Setting the CAN device in Stop Mode. */
>> +	pch_can_set_run_mode(priv, PCH_CAN_STOP);
>> +
>> +	/* Configuring the transmit and receive buffers. */
>> +	pch_can_config_rx_tx_buffers(priv);
>> +
>> +	/* Restore the CAN state */
>> +	pch_set_bittiming(dev);
>> +
>> +	/* Listen/Active */
>> +	pch_can_set_optmode(priv);
>> +
>> +	/* Enabling the transmit buffer. */
>> +	for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
>> +		pch_can_set_tx_enable(priv, i, priv->tx_enable[i]);
>> +
>> +	/* Configuring the receive buffer and enabling them. */
>> +	for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
>> +		/* Restore buffer link */
>> +		pch_can_set_rx_buffer_link(priv, i, priv->rx_link[i]);
>> +
>> +		/* Restore buffer enables */
>> +		pch_can_set_rx_enable(priv, i, priv->rx_enable[i]);
>> +	}
>> +
>> +	/* Enable CAN Interrupts */
>> +	pch_can_set_int_custom(priv);
>> +
>> +	/* Restore Run Mode */
>> +	pch_can_set_run_mode(priv, PCH_CAN_RUN);
>> +
>> +	return retval;
>> +}
>> +#else
>> +#define pch_can_suspend NULL
>> +#define pch_can_resume NULL
>> +#endif
>> +
>> +static int pch_can_get_berr_counter(const struct net_device *dev,
>> +				    struct can_berr_counter *bec)
>> +{
>> +	struct pch_can_priv *priv = netdev_priv(dev);
>> +
>> +	bec->txerr = ioread32(&priv->regs->errc) & CAN_TEC;
>> +	bec->rxerr = (ioread32(&priv->regs->errc) & CAN_REC) >> 8;
>> +
>> +	return 0;
>> +}
>> +
>> +static int __devinit pch_can_probe(struct pci_dev *pdev,
>> +				   const struct pci_device_id *id)
>> +{
>> +	struct net_device *ndev;
>> +	struct pch_can_priv *priv;
>> +	int rc;
>> +	void __iomem *addr;
>> +
>> +	rc = pci_enable_device(pdev);
>> +	if (rc) {
>> +		dev_err(&pdev->dev, "Failed pci_enable_device %d\n", rc);
>> +		goto probe_exit_endev;
>> +	}
>> +
>> +	rc = pci_request_regions(pdev, KBUILD_MODNAME);
>> +	if (rc) {
>> +		dev_err(&pdev->dev, "Failed pci_request_regions %d\n", rc);
>> +		goto probe_exit_pcireq;
>> +	}
>> +
>> +	addr = pci_iomap(pdev, 1, 0);
>> +	if (!addr) {
>> +		rc = -EIO;
>> +		dev_err(&pdev->dev, "Failed pci_iomap\n");
>> +		goto probe_exit_ipmap;
>> +	}
>> +
>> +	ndev = alloc_candev(sizeof(struct pch_can_priv), PCH_TX_OBJ_NUM);
>> +	if (!ndev) {
>> +		rc = -ENOMEM;
>> +		dev_err(&pdev->dev, "Failed alloc_candev\n");
>> +		goto probe_exit_alloc_candev;
>> +	}
>> +
>> +	priv = netdev_priv(ndev);
>> +	priv->ndev = ndev;
>> +	priv->regs = addr;
>> +	priv->dev = pdev;
>> +	priv->can.bittiming_const = &pch_can_bittiming_const;
>> +	priv->can.do_set_mode = pch_can_do_set_mode;
>> +	priv->can.do_get_berr_counter = pch_can_get_berr_counter;
>> +	priv->can.ctrlmode_supported = CAN_CTRLMODE_LISTENONLY |
>> +				       CAN_CTRLMODE_LOOPBACK;

I'm missing CAN_CTRLMODE_3_SAMPLES here?

>> +	priv->tx_obj = PCH_RX_OBJ_NUM + 1; /* Point head of Tx Obj */
>> +
>> +	ndev->irq = pdev->irq;
>> +	ndev->flags |= IFF_ECHO;
>> +
>> +	pci_set_drvdata(pdev, ndev);
>> +	SET_NETDEV_DEV(ndev, &pdev->dev);
>> +	ndev->netdev_ops = &pch_can_netdev_ops;
>> +	priv->can.clock.freq = PCH_CAN_CLK; /* Hz */
>> +
>> +	netif_napi_add(ndev, &priv->napi, pch_can_rx_poll, PCH_RX_OBJ_NUM);
>> +
>> +	rc = pci_enable_msi(priv->dev);
>> +	if (rc) {
>> +		dev_info(&ndev->dev, "PCH CAN opened without MSI\n");
>> +		priv->use_msi = 0;
>> +	} else {
>> +		dev_info(&ndev->dev, "PCH CAN opened with MSI\n");
>> +		priv->use_msi = 1;
>> +	}
>> +
>> +	rc = register_candev(ndev);
>> +	if (rc) {
>> +		dev_err(&pdev->dev, "Failed register_candev %d\n", rc);
>> +		goto probe_exit_reg_candev;
>> +	}
>> +
>> +	return 0;
>> +
>> +probe_exit_reg_candev:
>> +	free_candev(ndev);
>> +probe_exit_alloc_candev:
>> +	pci_iounmap(pdev, addr);
>> +probe_exit_ipmap:
>> +	pci_release_regions(pdev);
>> +probe_exit_pcireq:
>> +	pci_disable_device(pdev);
>> +probe_exit_endev:
>> +	return rc;
>> +}
>> +
>> +static struct pci_driver pch_can_pcidev = {
>> +	.name = "pch_can",
>> +	.id_table = pch_pci_tbl,
>> +	.probe = pch_can_probe,
>> +	.remove = __devexit_p(pch_can_remove),
>> +	.suspend = pch_can_suspend,
>> +	.resume = pch_can_resume,
>> +};
>> +
>> +static int __init pch_can_pci_init(void)
>> +{
>> +	return pci_register_driver(&pch_can_pcidev);
>> +}
>> +module_init(pch_can_pci_init);
>> +
>> +static void __exit pch_can_pci_exit(void)
>> +{
>> +	pci_unregister_driver(&pch_can_pcidev);
>> +}
>> +module_exit(pch_can_pci_exit);
>> +
>> +MODULE_DESCRIPTION("Intel EG20T PCH CAN(Controller Area Network) Driver");
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_VERSION("0.94");

As the driver has already been merged. Please provide incremental
patches against the net-2.6 branch. Also, it would be nice if you could
check in-order transmission and reception, e.g., with the can-utils
program canfdtest:

http://svn.berlios.de/wsvn/socketcan/trunk/can-utils/canfdtest.c

Thanks,

Wolfgang.

^ permalink raw reply

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: Linus Torvalds @ 2010-10-29 19:37 UTC (permalink / raw)
  To: David Miller; +Cc: viro, netdev, jon.maloy, allan.stephens, drosenberg
In-Reply-To: <20101029.123230.226768623.davem@davemloft.net>

On Fri, Oct 29, 2010 at 12:32 PM, David Miller <davem@davemloft.net> wrote:
>
> I just got out of a long dentist appointment, will look at this right
> now, thanks!

I booted with it and committed it as "obvious". Let's see if there is
any fallout. I doubt it, but I also doubt we'll find any until we have
lots of testers, unless I made some subtly totally buggy change that
just didn't happen to show up during a normal boot.

                      Linus

^ permalink raw reply

* Re: Fwd: NULL pointer dereference at netxen_nic_probe+0x813/0x9a0
From: Denis Kirjanov @ 2010-10-29 19:47 UTC (permalink / raw)
  To: bjorn.helgaas; +Cc: amit.salecha, netdev, linux-kernel
In-Reply-To: <AANLkTim8coGDmxxCnhzXXLEswf1OYptswbA59BYcbp-N@mail.gmail.com>

> This is on current Linus upstream as of this morning (8128057)
> on an HP DL785:
> 
> QLogic/NetXen Network Driver v4.0.74
> netxen_nic 0000:07:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
> netxen_nic 0000:07:00.0: setting latency timer to 64
> netxen_nic 0000:07:00.0: 2MB memory map
> netxen_nic 0000:07:00.0: loading firmware from flash
> netxen_nic 0000:07:00.0: using 64-bit dma mask
> kernel: Quad Gig LP Board S/N TI9ABK0266  Chip rev 0x42
> netxen_nic 0000:07:00.0: firmware v4.0.520 [legacy]
> netxen_nic 0000:07:00.0: irq 72 for MSI/MSI-X
> netxen_nic 0000:07:00.0: irq 73 for MSI/MSI-X
> netxen_nic 0000:07:00.0: irq 74 for MSI/MSI-X
> netxen_nic 0000:07:00.0: irq 75 for MSI/MSI-X
> netxen_nic 0000:07:00.0: using msi-x interrupts
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> IP: [<ffffffff8160afda>] netxen_nic_probe+0x813/0x9a0
> PGD 0
> Oops: 0002 [#1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> 
> Pid: 1650, comm: work_for_cpu Not tainted 2.6.36-07338-g8128057 #269
> /ProLiant DL785 G5
> RIP: 0010:[<ffffffff8160afda>]  [<ffffffff8160afda>]
> netxen_nic_probe+0x813/0x9a0
> RSP: 0018:ffff8806138abe30  EFLAGS: 00010246
> RAX: 0000000000000010 RBX: ffff8806139126c0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff880613895616 RDI: ffff880613912000
> RBP: ffff8806138abe90 R08: 0000000000000000 R09: ffff8806138abb80
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880613912000
> R13: ffff8812174f7000 R14: ffff880613912000 R15: ffff8812174f7000
> FS:  0000000000000000(0000) GS:ffff8800cfa00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000010 CR3: 0000000001c07000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process work_for_cpu (pid: 1650, threadinfo ffff8806138aa000, task
> ffff880616f12be0)
> Stack:
>  ffff8812174f7090 0000000000000246 ffff8806138abe90 ffff8812174f7000
>  00008806138abfd8 0000000000000282 68cd0025b30068cc ffff880c17439d30
>  ffff8812174f7090 ffff8812174f7000 ffff8812174f7208 0000000000000000
> Call Trace:
>  [<ffffffff81203696>] local_pci_probe+0x48/0x91
>  [<ffffffff81052bae>] ? do_work_for_cpu+0x0/0x26
>  [<ffffffff81052bc1>] do_work_for_cpu+0x13/0x26
>  [<ffffffff81052bae>] ? do_work_for_cpu+0x0/0x26
>  [<ffffffff81057a7b>] kthread+0x81/0x89
>  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
>  [<ffffffff810579fa>] ? kthread+0x0/0x89
>  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
> Code: 00 eb 15 49 8d bf 90 00 00 00 48 c7 c6 1b 2e aa 81 31 c0 e8 c0
> 4e cd ff 4c 89 f7 e8 d6 bb ee ff 49 8b 96 00 03 00 00 48 8d 42 10 <f0>
> 80 4a 10 01 4c 89 f7 e8 a3 7e ed ff 85 c0 41 89 c4 74 2a 49
> RIP  [<ffffffff8160afda>] netxen_nic_probe+0x813/0x9a0
>  RSP <ffff8806138abe30>
> CR2: 0000000000000010
> ---[ end trace 059c7071bbf8de1f ]---
Could you please try the following patch.  

diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index 50820be..5766475 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -1239,15 +1239,15 @@ netxen_setup_netdev(struct netxen_adapter *adapter,
 	if (netxen_read_mac_addr(adapter))
 		dev_warn(&pdev->dev, "failed to read mac addr\n");
 
-	netif_carrier_off(netdev);
-	netif_stop_queue(netdev);
-
 	err = register_netdev(netdev);
 	if (err) {
 		dev_err(&pdev->dev, "failed to register net device\n");
 		return err;
 	}
 
+	netif_carrier_off(netdev);
+	netif_stop_queue(netdev);
+
 	return 0;
 }


^ permalink raw reply related

* Re: NULL pointer dereference at netxen_nic_probe+0x813/0x9a0
From: David Miller @ 2010-10-29 19:54 UTC (permalink / raw)
  To: dkirjanov; +Cc: bjorn.helgaas, amit.salecha, netdev, linux-kernel
In-Reply-To: <4CCB24CA.3050106@kernel.org>

From: Denis Kirjanov <dkirjanov@kernel.org>
Date: Fri, 29 Oct 2010 23:47:22 +0400

>> This is on current Linus upstream as of this morning (8128057)
>> on an HP DL785:
>> 
>> QLogic/NetXen Network Driver v4.0.74
>> netxen_nic 0000:07:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
>> netxen_nic 0000:07:00.0: setting latency timer to 64
>> netxen_nic 0000:07:00.0: 2MB memory map
>> netxen_nic 0000:07:00.0: loading firmware from flash
>> netxen_nic 0000:07:00.0: using 64-bit dma mask
>> kernel: Quad Gig LP Board S/N TI9ABK0266  Chip rev 0x42
>> netxen_nic 0000:07:00.0: firmware v4.0.520 [legacy]
>> netxen_nic 0000:07:00.0: irq 72 for MSI/MSI-X
>> netxen_nic 0000:07:00.0: irq 73 for MSI/MSI-X
>> netxen_nic 0000:07:00.0: irq 74 for MSI/MSI-X
>> netxen_nic 0000:07:00.0: irq 75 for MSI/MSI-X
>> netxen_nic 0000:07:00.0: using msi-x interrupts
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
>> IP: [<ffffffff8160afda>] netxen_nic_probe+0x813/0x9a0
>> PGD 0
>> Oops: 0002 [#1] SMP
>> last sysfs file:
>> CPU 0
>> Modules linked in:
>> 
>> Pid: 1650, comm: work_for_cpu Not tainted 2.6.36-07338-g8128057 #269
>> /ProLiant DL785 G5
>> RIP: 0010:[<ffffffff8160afda>]  [<ffffffff8160afda>]
>> netxen_nic_probe+0x813/0x9a0
>> RSP: 0018:ffff8806138abe30  EFLAGS: 00010246
>> RAX: 0000000000000010 RBX: ffff8806139126c0 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: ffff880613895616 RDI: ffff880613912000
>> RBP: ffff8806138abe90 R08: 0000000000000000 R09: ffff8806138abb80
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880613912000
>> R13: ffff8812174f7000 R14: ffff880613912000 R15: ffff8812174f7000
>> FS:  0000000000000000(0000) GS:ffff8800cfa00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 0000000000000010 CR3: 0000000001c07000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process work_for_cpu (pid: 1650, threadinfo ffff8806138aa000, task
>> ffff880616f12be0)
>> Stack:
>>  ffff8812174f7090 0000000000000246 ffff8806138abe90 ffff8812174f7000
>>  00008806138abfd8 0000000000000282 68cd0025b30068cc ffff880c17439d30
>>  ffff8812174f7090 ffff8812174f7000 ffff8812174f7208 0000000000000000
>> Call Trace:
>>  [<ffffffff81203696>] local_pci_probe+0x48/0x91
>>  [<ffffffff81052bae>] ? do_work_for_cpu+0x0/0x26
>>  [<ffffffff81052bc1>] do_work_for_cpu+0x13/0x26
>>  [<ffffffff81052bae>] ? do_work_for_cpu+0x0/0x26
>>  [<ffffffff81057a7b>] kthread+0x81/0x89
>>  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff810579fa>] ? kthread+0x0/0x89
>>  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
>> Code: 00 eb 15 49 8d bf 90 00 00 00 48 c7 c6 1b 2e aa 81 31 c0 e8 c0
>> 4e cd ff 4c 89 f7 e8 d6 bb ee ff 49 8b 96 00 03 00 00 48 8d 42 10 <f0>
>> 80 4a 10 01 4c 89 f7 e8 a3 7e ed ff 85 c0 41 89 c4 74 2a 49
>> RIP  [<ffffffff8160afda>] netxen_nic_probe+0x813/0x9a0
>>  RSP <ffff8806138abe30>
>> CR2: 0000000000000010
>> ---[ end trace 059c7071bbf8de1f ]---
> Could you please try the following patch.  

Why do you need to touch the queue state at all in the probing code?

Until the first ->open() occurs, the queue state is "don't care."

The netif_carrier_off() call is fine.

^ permalink raw reply

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: David Miller @ 2010-10-29 19:55 UTC (permalink / raw)
  To: torvalds; +Cc: viro, netdev, jon.maloy, allan.stephens, drosenberg
In-Reply-To: <AANLkTikGkfcHdemZwURrYLPBC1QQnbQP_g-7YadzeE-D@mail.gmail.com>

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 29 Oct 2010 12:37:29 -0700

> On Fri, Oct 29, 2010 at 12:32 PM, David Miller <davem@davemloft.net> wrote:
>>
>> I just got out of a long dentist appointment, will look at this right
>> now, thanks!
> 
> I booted with it and committed it as "obvious". Let's see if there is
> any fallout. I doubt it, but I also doubt we'll find any until we have
> lots of testers, unless I made some subtly totally buggy change that
> just didn't happen to show up during a normal boot.

It ought to be ok.

Let me send you a pull request so you can get the verify_iovec() change.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-10-29 19:59 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


This has the verify_iovec() INT_MAX limiter change as well as:

1) kdump fix in netxen from Rajesh Borundia

2) 8390 oops in probe regression fix from Pavel Emelyanov

3) ipv4 routing table memory leak on namespace stop, also from Pavel
   Emelyanov.

4) cxgb3 probe OOPS fix from Nishanth Aravamudan

5) Limit kernel stack usage of root in pktgen, from Nelson Elhage,
   although this needs a few more tweaks I think.

6) Fix panic in cxgb3 tx desc freeing, from Krishna Kumar.

7) DCCP updates from Gerrit Renker

8) pch_gbe build fix, due to missing dependency

9) Signedness fix in netfilte xt_socket code.

10) Atheros driver fixes from Felix Fietkau, Jones Desougi, Luis
    R. Rodriguez, Mohammed Shafi Shajakhan, and Rajkumar Manoharan

Please pull, thanks a lot!

The following changes since commit 1e431a9d6478940c0b5fcfa1c17a336fc0683409:

  Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb (2010-10-29 11:49:38 -0700)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Carolyn Wyborny (2):
      e1000e: reset PHY after errors detected
      e1000e: Add check for reset flags before displaying reset message

David S. Miller (4):
      pch_gbe: Select MII.
      net: Limit socket I/O iovec total length to INT_MAX.
      netfilter: xt_socket: Make tproto signed in socket_mt6_v1().
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

Dmitry Artamonow (1):
      USB: gadget: fix ethernet gadget crash in gether_setup

Emil Tantilov (2):
      ixgb: call pci_disable_device in ixgb_remove
      igbvf: fix panic on load

Felix Fietkau (1):
      ath9k: fix tx aggregation flush on AR9003

Geert Uytterhoeven (1):
      net: atarilance - flags should be unsigned long

Gerrit Renker (4):
      dccp: Return-value convention of hc_tx_send_packet()
      dccp: Extend CCID packet dequeueing interface
      dccp: Refine the wait-for-ccid mechanism
      dccp ccid-2: Stop polling

Jesper Juhl (1):
      mac80211: fix failure to check kmalloc return value in key_key_read

Jesse Gross (1):
      igb: Fix unused variable warning.

John Fastabend (1):
      ixgbe: DCB, fix TX hang occurring in stress condition with PFC

Jones Desougi (1):
      ath5k: Fix double free on hw attach error path

Jouni Malinen (1):
      mac80211: Fix scan_ies_len to include DS Params

Krishna Kumar (1):
      cxgb3: Fix panic in free_tx_desc()

Larry Finger (1):
      b43: Fix warning at drivers/mmc/core/core.c:237 in mmc_wait_for_cmd

Luis R. Rodriguez (4):
      ath9k: add locking for stopping RX
      ath9k: add locking for starting the PCU on RX
      ath9k: rename rxflushlock to pcu_lock
      ath9k: lock reset and PCU start/stopping

Mohammed Shafi Shajakhan (1):
      ath9k: Fix incorrect access of rate flags in RC

Nelson Elhage (1):
      pktgen: Limit how much data we copy onto the stack.

Nishanth Aravamudan (1):
      cxgb3: fix crash due to manipulating queues before registration

Paul Fox (1):
      libertas: Fix sd8686 firmware reload

Pavel Emelyanov (2):
      8390: Don't oops on starting dev queue
      fib: Fix fib zone and its hash leak on namespace stop

Rajesh Borundia (1):
      netxen: fix kdump

Rajkumar Manoharan (1):
      ath9k_htc: Set proper firmware offset for Netgear WNDA3200

avisconti (1):
      stmmac: enable/disable rx/tx in the core with a single write.

 drivers/net/Kconfig                      |    1 +
 drivers/net/atarilance.c                 |    2 +-
 drivers/net/cxgb3/cxgb3_main.c           |    2 +-
 drivers/net/cxgb3/sge.c                  |    4 +-
 drivers/net/e1000e/82571.c               |   38 ++++++
 drivers/net/e1000e/e1000.h               |    3 +
 drivers/net/e1000e/netdev.c              |   29 ++++-
 drivers/net/igb/igb_main.c               |    1 -
 drivers/net/igbvf/netdev.c               |    8 +-
 drivers/net/ixgb/ixgb_main.c             |    1 +
 drivers/net/ixgbe/ixgbe_dcb.c            |   39 +++++-
 drivers/net/ixgbe/ixgbe_dcb.h            |    5 +-
 drivers/net/ixgbe/ixgbe_dcb_82599.c      |    5 +
 drivers/net/ixgbe/ixgbe_dcb_82599.h      |    3 +
 drivers/net/ixgbe/ixgbe_main.c           |   12 ++-
 drivers/net/lib8390.c                    |    1 -
 drivers/net/netxen/netxen_nic_ctx.c      |   15 --
 drivers/net/netxen/netxen_nic_main.c     |    7 +
 drivers/net/stmmac/stmmac_main.c         |   40 ++-----
 drivers/net/wireless/ath/ath5k/attach.c  |   17 +--
 drivers/net/wireless/ath/ath9k/ath9k.h   |    2 +-
 drivers/net/wireless/ath/ath9k/hif_usb.c |   10 +-
 drivers/net/wireless/ath/ath9k/main.c    |   31 ++++-
 drivers/net/wireless/ath/ath9k/rc.c      |    2 +-
 drivers/net/wireless/ath/ath9k/recv.c    |   15 +-
 drivers/net/wireless/ath/ath9k/xmit.c    |   18 ++--
 drivers/net/wireless/b43/sdio.c          |    2 +
 drivers/net/wireless/libertas/if_sdio.c  |   32 ++++-
 drivers/usb/gadget/u_ether.c             |    2 +-
 include/linux/dccp.h                     |    4 +-
 include/linux/socket.h                   |    2 +-
 include/net/ip_fib.h                     |    2 +
 net/compat.c                             |   10 +-
 net/core/iovec.c                         |   20 ++--
 net/core/pktgen.c                        |    7 +-
 net/dccp/ccid.h                          |   34 +++++-
 net/dccp/ccids/ccid2.c                   |   23 ++-
 net/dccp/ccids/ccid2.h                   |    5 +
 net/dccp/ccids/ccid3.c                   |   12 +-
 net/dccp/dccp.h                          |    5 +-
 net/dccp/output.c                        |  209 ++++++++++++++++++------------
 net/dccp/proto.c                         |   21 +++-
 net/dccp/timer.c                         |   27 ++--
 net/ipv4/fib_frontend.c                  |    2 +-
 net/ipv4/fib_hash.c                      |   18 +++
 net/ipv4/fib_trie.c                      |    5 +
 net/mac80211/debugfs_key.c               |    6 +-
 net/mac80211/main.c                      |    5 +-
 net/netfilter/xt_socket.c                |    7 +-
 49 files changed, 525 insertions(+), 246 deletions(-)

^ permalink raw reply

* Re: [PATCH] cxgb4vf: fix crash due to manipulating queues before registration
From: David Miller @ 2010-10-29 20:05 UTC (permalink / raw)
  To: dm; +Cc: leedom, netdev
In-Reply-To: <8A71B368A89016469F72CD08050AD334088DA49E@maui.asicdesigners.com>

From: "Dimitrios Michailidis" <dm@chelsio.com>
Date: Fri, 29 Oct 2010 00:36:22 -0700

> Further, I believe moving the call after register_netdev is buggy as
> open can be called after registration and it can clash with the
> queue stopping.  It seems then that these netif_tx_stop_all_queues
> calls have to go now.

This is a good explanation of why no driver should be touching the
queue state before the first ->open() call.

^ permalink raw reply

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Davide Libenzi @ 2010-10-29 20:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alban Crequy, David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, Linux Kernel Mailing List,
	Pauli Nieminen, Rainer Weikusat
In-Reply-To: <1288380431.2680.3.camel@edumazet-laptop>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1857 bytes --]

On Fri, 29 Oct 2010, Eric Dumazet wrote:

> Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > Hi,
> > 
> > When a process calls the poll or select, the kernel calls (struct
> > file_operations)->poll on every file descriptor and returns a mask of
> > events which are ready. If the process is only interested by POLLIN
> > events, the mask is still computed for POLLOUT and it can be expensive.
> > For example, on Unix datagram sockets, a process running poll() with
> > POLLIN will wakes-up when the remote end call read(). This is a
> > performance regression introduced when fixing another bug by
> > 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > 
> > The attached program illustrates the problem. It compares the
> > performance of sending/receiving data on an Unix datagram socket and
> > select(). When the datagram sockets are not connected, the performance
> > problem is not triggered, but when they are connected it becomes a lot
> > slower. On my computer, I have the following time:
> > 
> > Connected datagram sockets: >4 seconds
> > Non-connected datagram sockets: <1 second
> > 
> > The patch attached in the next email fixes the performance problem: it
> > becomes <1 second for both cases. I am not suggesting the patch for
> > inclusion; I would like to change the prototype of (struct
> > file_operations)->poll instead of adding ->poll2. But there is a lot of
> > poll functions to change (grep tells me 337 functions).
> > 
> > Any opinions?
> 
> My opinion would be to use epoll() for this kind of workload.

Yeah, epoll does check for event hints coming with the callback wakeup, 
and avoid waking up epoll_wait() waiters, for non matching events.
Most of the devices we care about, have been modified to report the event 
mask with the wakeup call.


- Davide


^ permalink raw reply

* Re: "src" attribute ignored for IPv6 (preferred source address selection)
From: David Miller @ 2010-10-29 20:11 UTC (permalink / raw)
  To: dr; +Cc: netdev
In-Reply-To: <20101017011205.GA10610@srv03.cluenet.de>

From: Daniel Roesen <dr@cluenet.de>
Date: Sun, 17 Oct 2010 03:12:08 +0200

> http://lkml.indiana.edu/hypermail/linux/kernel/0409.0/1768.html
> 
> Unfortunately I don't have the time (and most probably not the necessary
> kernel knowhow) to hack this up myself, but I'm still interested. :-)

The routing table in ipv6 can very much handle routing by source
address now, if the ip commands mention do not work it's some
bug in the tool or the routing table rule adding code in the
kernel.

It's not a fundamental limitation any more.

^ permalink raw reply

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Jesper Juhl @ 2010-10-29 20:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alban Crequy, David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat
In-Reply-To: <1288380431.2680.3.camel@edumazet-laptop>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2140 bytes --]

On Fri, 29 Oct 2010, Eric Dumazet wrote:

> Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > Hi,
> > 
> > When a process calls the poll or select, the kernel calls (struct
> > file_operations)->poll on every file descriptor and returns a mask of
> > events which are ready. If the process is only interested by POLLIN
> > events, the mask is still computed for POLLOUT and it can be expensive.
> > For example, on Unix datagram sockets, a process running poll() with
> > POLLIN will wakes-up when the remote end call read(). This is a
> > performance regression introduced when fixing another bug by
> > 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > 
> > The attached program illustrates the problem. It compares the
> > performance of sending/receiving data on an Unix datagram socket and
> > select(). When the datagram sockets are not connected, the performance
> > problem is not triggered, but when they are connected it becomes a lot
> > slower. On my computer, I have the following time:
> > 
> > Connected datagram sockets: >4 seconds
> > Non-connected datagram sockets: <1 second
> > 
> > The patch attached in the next email fixes the performance problem: it
> > becomes <1 second for both cases. I am not suggesting the patch for
> > inclusion; I would like to change the prototype of (struct
> > file_operations)->poll instead of adding ->poll2. But there is a lot of
> > poll functions to change (grep tells me 337 functions).
> > 
> > Any opinions?
> 
> My opinion would be to use epoll() for this kind of workload.
> 
Sorry to intrude out of the blue without really understanding the kernel 
side of most of the code in question, but if there's a performance 
regression for applications using poll() shouldn't we address that so we 
get back to the prior performance level rather than requireing all 
userspace apps to switch to epoll() ??

-- 
Jesper Juhl <jj@chaosbits.net>             http://www.chaosbits.net/
Plain text mails only, please      http://www.expita.com/nomime.html
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html

^ permalink raw reply

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Eric Dumazet @ 2010-10-29 20:20 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Alban Crequy, David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, Linux Kernel Mailing List,
	Pauli Nieminen, Rainer Weikusat
In-Reply-To: <alpine.DEB.2.00.1010291306390.8517@davide-lnx1>

Le vendredi 29 octobre 2010 à 13:08 -0700, Davide Libenzi a écrit :

> Yeah, epoll does check for event hints coming with the callback wakeup, 
> and avoid waking up epoll_wait() waiters, for non matching events.
> Most of the devices we care about, have been modified to report the event 
> mask with the wakeup call.

Alban test program is _very_ pathological :

All the time is consumed in do_select() because of false sharing between
two tasks.

We can probably rearrange variables in do_select() to make this false
sharing less problematic. I am taking a look at this.

Events: 3K cycles
+     26.14%  uclient  [kernel.kallsyms]  [k] do_raw_spin_lock              
+     21.11%  uclient  [kernel.kallsyms]  [k] do_select                     
+     13.38%  uclient  [kernel.kallsyms]  [k] pollwake                      
+      9.22%  uclient  [kernel.kallsyms]  [k] unix_dgram_poll               
+      5.24%  uclient  [kernel.kallsyms]  [k] unix_peer_get                 
+      3.04%  uclient  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore   
+      3.03%  uclient  [kernel.kallsyms]  [k] task_rq_lock                  
+      2.85%  uclient  [kernel.kallsyms]  [k] do_raw_spin_unlock            
+      1.84%  uclient  [kernel.kallsyms]  [k] try_to_wake_up                
+      1.55%  uclient  [kernel.kallsyms]  [k] fget_light                    
+      1.34%  uclient  [kernel.kallsyms]  [k] core_kernel_text              


annotate :

   5.66 :        410fb342:       85 ff                   test   %edi,%edi                  
    0.00 :        410fb344:       74 1f                   je     410fb365 <do_select+0x3d5> 
    0.13 :        410fb346:       85 b5 6c fd ff ff       test   %esi,-0x294(%ebp)          
    0.00 :        410fb34c:       74 17                   je     410fb365 <do_select+0x3d5> 
         :                                                        res_out |= bit;           
    0.00 :        410fb34e:       09 b5 5c fd ff ff       or     %esi,-0x2a4(%ebp)          
         :                                                        retval++;                 
    0.00 :        410fb354:       83 85 64 fd ff ff 01    addl   $0x1,-0x29c(%ebp)          
         :                                                        wait = NULL;              
    0.00 :        410fb35b:       c7 85 7c fd ff ff 00    movl   $0x0,-0x284(%ebp)          
    0.00 :        410fb362:       00 00 00                                                  
         :                                               }            
         :                                                if ((mask & POLLEX_SET) && (ex & bit)) {
   43.27 :        410fb365:       85 d2                   test   %edx,%edx                 
    0.00 :        410fb367:       0f 84 f3 fe ff ff       je     410fb260 <do_select+0x2d0>
    0.00 :        410fb36d:       85 b5 74 fd ff ff       test   %esi,-0x28c(%ebp)         
    0.00 :        410fb373:       0f 84 e7 fe ff ff       je     410fb260 <do_select+0x2d0>
         :                                                        res_ex |= bit;           
    0.00 :        410fb379:       09 b5 58 fd ff ff       or     %esi,-0x2a8(%ebp)         
         :                                if (all_bits == 0) {   
         :                                        i += __NFDBITS;
         :                                        continue;
         :                                }

^ permalink raw reply

* Re: [PATCH] net: Limit socket I/O iovec total length to INT_MAX.
From: Dan Rosenberg @ 2010-10-29 20:22 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, viro, netdev, jon.maloy, allan.stephens
In-Reply-To: <20101029.125509.104071725.davem@davemloft.net>

Thanks for your work on this.  Just a friendly reminder not to forget
the compat code.  :)

-Dan

On Fri, 2010-10-29 at 12:55 -0700, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Fri, 29 Oct 2010 12:37:29 -0700
> 
> > On Fri, Oct 29, 2010 at 12:32 PM, David Miller <davem@davemloft.net> wrote:
> >>
> >> I just got out of a long dentist appointment, will look at this right
> >> now, thanks!
> > 
> > I booted with it and committed it as "obvious". Let's see if there is
> > any fallout. I doubt it, but I also doubt we'll find any until we have
> > lots of testers, unless I made some subtly totally buggy change that
> > just didn't happen to show up during a normal boot.
> 
> It ought to be ok.
> 
> Let me send you a pull request so you can get the verify_iovec() change.



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox