Netdev List
 help / color / mirror / Atom feed
* Re: Netlink limitations
From: Patrick McHardy @ 2010-11-09 12:24 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: David S. Miller, Holger Eitzenberger, Pablo Neira Ayuso, netdev
In-Reply-To: <alpine.LNX.2.01.1011091303250.21752@obet.zrqbmnf.qr>

Am 09.11.2010 13:10, schrieb Jan Engelhardt:
> 
> On Sunday 2010-11-07 18:17, Patrick McHardy wrote:
>> On 07.11.2010 17:44, Jan Engelhardt wrote:
>>> we mentioned it only briefly at the Netfilter workshop a few weeks ago, 
>>> but as I am trying to figure out how to use Netlink in Xtables, 
>>> Netlink's limitations really start ruining my day.
>>>
>>> The well-known issue is that NL messages[sic] the kernel is supposed to 
>>> receive have a max size of 64K, due to nlmsghdr's use of uint16_t. This 
>>> is very problematic because attributes can easily amass more than 64K. 
>>> Think of a chain full of rules, represented by a top-level attribute 
>>> that nests attributes. The problem is bidirectional, a table 
>>> dump has the same problem.
>>
>> Messages are not limited to 64k, individual attributes are. Holger
>> started working on a nlattr32, which uses 32 bit for the length
>> value.
> 
> Does he have a format specification available?

As I said, the basic idea is to use a length value of zero to indicate
that the length should be read from a second length member. Basically:

struct nlattr32 {
	__u16 nla_len;
	__u16 nla_type;
	__u32 nla_len2;
};


^ permalink raw reply

* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Tomoya MORINAGA @ 2010-11-09 12:26 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Masayuki Ohtake,
	Samuel Ortiz, margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA, LKML,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Wolfgang Grandegger,
	Marc Kleine-Budde, joel.clark-ral2JQCrhuEAvxtiuMwx3w,
	David S. Miller, Christian Pellegrin,
	qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CCAFA68.2030903@hartkopp.net>

Sorry, for late response.
On Saturday, October 30, 2010 1:46 AM,  Oliver Hartkopp wrote:

> 
> It's just a question if the driver for an Intel Chipset should/could ignore
> the endian problematic.
> 
> If this is intended the driver should only appear in Kconfig depending on X86
> or little endian systems.

This driver is for only x86 processor.
I have no intention to support processor except x86.

> 
> Besides this remark, the struct pch_can_regs could also be defined in a way
> that every single CAN payload data byte could be accessed directly to allow
> e.g. this code in pch_can_rx_normal():
> 
> cf->data[0] = ioread8(&priv->regs->if1_data0);
> cf->data[1] = ioread8(&priv->regs->if1_data1);
> cf->data[2] = ioread8(&priv->regs->if1_data2);
> (..)
> cf->data[6] = ioread8(&priv->regs->if1_data6);
> cf->data[7] = ioread8(&priv->regs->if1_data7);
> 
> This is easy to understand and additionally endian aware.

I agree. Thease codes are very simple.
I will modify like above.

> 
> In opposite to this:
> 
> + if (cf->can_dlc > 2) {
> + u32 data1 = *((u16 *)&cf->data[2]);
> + iowrite32(data1, &priv->regs->if2_dataa2);
> + }
> 
> Which puts a little? endian u16 into the big endian register layout.
> Sorry i'm just getting curious on this register access implementation.

I think cf->data is like below
MSB----------LSB
D3-D2-D1-D0
D7-D6-D5-D4

*((u16 *)&cf->data[2]) means "D3-D2".
This order coprresponds to register order.
data register is like below
MSB----------LSB
**-**-D3-D2

** means reserved area.

Thanks,

Tomoya MORINAGA
OKI SEMICONDUCTOR CO., LTD.

^ permalink raw reply

* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Tomoya MORINAGA @ 2010-11-09 12:26 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Wolfgang Grandegger, David S. Miller, Wolfram Sang,
	Christian Pellegrin, Barry Song, Samuel Ortiz, socketcan-core,
	netdev, LKML, andrew.chih.howe.khor, qi.wang, margie.foster,
	yong.y.wang, Masayuki Ohtake, kok.howg.ewe, joel.clark
In-Reply-To: <005301cb7ffa$5b63cd90$66f8800a@maildom.okisemi.com>

Sorry, for late response.
----- Original Message ----- 
From: "Tomoya MORINAGA" <tomoya-linux@dsn.okisemi.com>
To: "Tomoya MORINAGA" <tomoya-linux@dsn.okisemi.com>
Sent: Tuesday, November 09, 2010 7:39 PM
Subject: Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings


> On 11/02/2010 11:27 AM, Tomoya MORINAGA wrote:
> > On Saturday, October 30, 2010 1:23 AM,  Marc Kleine-Budde wrote:
> > 
> >>> The driver has already been merged. Please send incremental patches
> >>> against david's net-2.6 branch.
> > 
> > I agree.
> > 
> >>
> >> Here a review, find comments inline. Lets talk about my remarks, please
> >> answer inline and don't delete the code.
> >>
> >> Can you please explain me your locking sheme? If I understand the
> >> documenation correctly the two message interfaces can be used mutual.
> >> And you use one for rx the other one for tx.
> > 
> > I show our locking scheme.
> > When CPU accesses MessageRAM via IF1, CPU protect until read-modify-write
> > so that IF2 access not occurred, vice versa.
> 
> Why is that needed?

For MessageRAM data consistency.

> 
> > 
> >>
> >> Please use netdev_<level> instead of dev_<level> for debug.
> > 
> > I agree.
> > 
> >>
> >>> --- /dev/null
> >>> +++ b/drivers/net/can/pch_can.c
> >>> @@ -0,0 +1,1436 @@
> >>> +/*
> >>> + * Copyright (C) 1999 - 2010 Intel Corporation.
> >>> + * Copyright (C) 2010 OKI SEMICONDUCTOR CO., LTD.
> >>> + *
> >>> + * This program is free software; you can redistribute it and/or modify
> >>> + * it under the terms of the GNU General Public License as published by
> >>> + * the Free Software Foundation; version 2 of the License.
> >>> + *
> >>> + * This program is distributed in the hope that it will be useful,
> >>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >>> + * GNU General Public License for more details.
> >>> + *
> >>> + * You should have received a copy of the GNU General Public License
> >>> + * along with this program; if not, write to the Free Software
> >>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307, USA.
> >>> + */
> >>> +
> >>> +#include <linux/interrupt.h>
> >>> +#include <linux/delay.h>
> >>> +#include <linux/io.h>
> >>> +#include <linux/module.h>
> >>> +#include <linux/sched.h>
> >>> +#include <linux/pci.h>
> >>> +#include <linux/init.h>
> >>> +#include <linux/kernel.h>
> >>> +#include <linux/types.h>
> >>> +#include <linux/errno.h>
> >>> +#include <linux/netdevice.h>
> >>> +#include <linux/skbuff.h>
> >>> +#include <linux/can.h>
> >>> +#include <linux/can/dev.h>
> >>> +#include <linux/can/error.h>
> >>> +
> >>> +#define MAX_MSG_OBJ  32
> >>> +#define MSG_OBJ_RX  0 /* The receive message object flag. */
> >>> +#define MSG_OBJ_TX  1 /* The transmit message object flag. */
> >>> +
> >>> +#define CAN_CTRL_INIT  0x0001 /* The INIT bit of CANCONT register. */
> >>> +#define CAN_CTRL_IE  0x0002 /* The IE bit of CAN control register */
> >>> +#define CAN_CTRL_IE_SIE_EIE 0x000e
> >>> +#define CAN_CTRL_CCE  0x0040
> >>> +#define CAN_CTRL_OPT  0x0080 /* The OPT bit of CANCONT register. */
> >>> +#define CAN_OPT_SILENT  0x0008 /* The Silent bit of CANOPT reg. */
> >>> +#define CAN_OPT_LBACK  0x0010 /* The LoopBack bit of CANOPT reg. */
> >>> +#define CAN_CMASK_RX_TX_SET 0x00f3
> >>> +#define CAN_CMASK_RX_TX_GET 0x0073
> >>> +#define CAN_CMASK_ALL  0xff
> >>> +#define CAN_CMASK_RDWR  0x80
> >>> +#define CAN_CMASK_ARB  0x20
> >>> +#define CAN_CMASK_CTRL  0x10
> >>> +#define CAN_CMASK_MASK  0x40
> >>> +#define CAN_CMASK_NEWDAT 0x04
> >>> +#define CAN_CMASK_CLRINTPND 0x08
> >>> +
> >>> +#define CAN_IF_MCONT_NEWDAT 0x8000
> >>> +#define CAN_IF_MCONT_INTPND 0x2000
> >>> +#define CAN_IF_MCONT_UMASK 0x1000
> >>> +#define CAN_IF_MCONT_TXIE 0x0800
> >>> +#define CAN_IF_MCONT_RXIE 0x0400
> >>> +#define CAN_IF_MCONT_RMTEN 0x0200
> >>> +#define CAN_IF_MCONT_TXRQXT 0x0100
> >>> +#define CAN_IF_MCONT_EOB 0x0080
> >>> +#define CAN_IF_MCONT_DLC 0x000f
> >>> +#define CAN_IF_MCONT_MSGLOST 0x4000
> >>> +#define CAN_MASK2_MDIR_MXTD 0xc000
> >>> +#define CAN_ID2_DIR  0x2000
> >>> +#define CAN_ID_MSGVAL  0x8000
> >>> +
> >>> +#define CAN_STATUS_INT  0x8000
> >>> +#define CAN_IF_CREQ_BUSY 0x8000
> >>> +#define CAN_ID2_XTD  0x4000
> >>> +
> >>> +#define CAN_REC   0x00007f00
> >>> +#define CAN_TEC   0x000000ff
> >>
> >> A prefix for like PCH_ instead of CAN_ for all those define above would
> >> be fine to avoid namespace clashes and/or confusion with the defines from the socketcan framework.
> >>
> > 
> > I agree.
> > 
> >>> +
> >>> +#define PCH_RX_OK  0x00000010
> >>> +#define PCH_TX_OK  0x00000008
> >>> +#define PCH_BUS_OFF  0x00000080
> >>> +#define PCH_EWARN  0x00000040
> >>> +#define PCH_EPASSIV  0x00000020
> >>> +#define PCH_LEC0  0x00000001
> >>> +#define PCH_LEC1  0x00000002
> >>> +#define PCH_LEC2  0x00000004
> >>
> >> These are just single set bit, please use BIT()
> >> Consider adding the name of the corresponding register to the define's
> >> name.
> > 
> > I agree.
> > 
> >>
> >>> +#define PCH_LEC_ALL  (PCH_LEC0 | PCH_LEC1 | PCH_LEC2)
> >>> +#define PCH_STUF_ERR  PCH_LEC0
> >>> +#define PCH_FORM_ERR  PCH_LEC1
> >>> +#define PCH_ACK_ERR  (PCH_LEC0 | PCH_LEC1)
> >>> +#define PCH_BIT1_ERR  PCH_LEC2
> >>> +#define PCH_BIT0_ERR  (PCH_LEC0 | PCH_LEC2)
> >>> +#define PCH_CRC_ERR  (PCH_LEC1 | PCH_LEC2)
> >>> +
> >>> +/* bit position of certain controller bits. */
> >>> +#define BIT_BITT_BRP  0
> >>> +#define BIT_BITT_SJW  6
> >>> +#define BIT_BITT_TSEG1  8
> >>> +#define BIT_BITT_TSEG2  12
> >>> +#define BIT_IF1_MCONT_RXIE 10
> >>> +#define BIT_IF2_MCONT_TXIE 11
> >>> +#define BIT_BRPE_BRPE  6
> >>> +#define BIT_ES_TXERRCNT  0
> >>> +#define BIT_ES_RXERRCNT  8
> >>
> >> these are usually called SHIFT
> > 
> > I agree.  Is the below TRUE ?
> > e.g.#define PCH_SHIFT_BITT_BRP 0
> 
> I would put the SHIFT at the end, YMMV
> 
> #define PCH_BIT_BRP_SHIFT

I agree.

> 
> > 
> >>
> >>> +#define MSK_BITT_BRP  0x3f
> >>> +#define MSK_BITT_SJW  0xc0
> >>> +#define MSK_BITT_TSEG1  0xf00
> >>> +#define MSK_BITT_TSEG2  0x7000
> >>> +#define MSK_BRPE_BRPE  0x3c0
> >>> +#define MSK_BRPE_GET  0x0f
> >>> +#define MSK_CTRL_IE_SIE_EIE 0x07
> >>> +#define MSK_MCONT_TXIE  0x08
> >>> +#define MSK_MCONT_RXIE  0x10
> >>
> >> MSK or MASK is okay, however the last two are just single bits.
> >>
> >> Please add a PCH_ prefix here, too.
> > 
> > I agree.
> > 
> >>
> >>> +#define PCH_CAN_NO_TX_BUFF 1
> >>> +#define COUNTER_LIMIT  10
> >>
> >> dito
> > 
> > I agree.
> > 
> >>
> >>> +
> >>> +#define PCH_CAN_CLK  50000000 /* 50MHz */
> >>> +
> >>> +/*
> >>> + * Define the number of message object.
> >>> + * PCH CAN communications are done via Message RAM.
> >>> + * The Message RAM consists of 32 message objects.
> >>> + */
> >>> +#define PCH_RX_OBJ_NUM  26  /* 1~ PCH_RX_OBJ_NUM is Rx*/
> >>> +#define PCH_TX_OBJ_NUM  6  /* PCH_RX_OBJ_NUM is RX ~ Tx*/
> >>> +#define PCH_OBJ_NUM  (PCH_TX_OBJ_NUM + PCH_RX_OBJ_NUM)
> >>
> >> You define MAX_MSG_OBJ earlier, seems like two names for the same value.
> > 
> > In case, a use uses all message objects(=32), you are right.
> > But user does not alway use all message object.
> 
> No one will change these values if the driver isn't buggy. And it
> doesn't make any sense to not use all objects.

I see.
I will delete PCH_OBJ_NUM or MAX_MSG_OBJ.

> 
> > 
> > 
> >>
> >>> +
> >>> +#define PCH_FIFO_THRESH  16
> >>> +
> >>> +enum pch_can_mode {
> >>> + PCH_CAN_ENABLE,
> >>> + PCH_CAN_DISABLE,
> >>> + PCH_CAN_ALL,
> >>> + PCH_CAN_NONE,
> >>> + PCH_CAN_STOP,
> >>> + PCH_CAN_RUN,
> >>> +};
> >>> +
> >>> +struct pch_can_regs {
> >>> + u32 cont;
> >>> + u32 stat;
> >>> + u32 errc;
> >>> + u32 bitt;
> >>> + u32 intr;
> >>> + u32 opt;
> >>> + u32 brpe;
> >>> + u32 reserve1;
> >>
> >> VVVV
> >>> + u32 if1_creq;
> >>> + u32 if1_cmask;
> >>> + u32 if1_mask1;
> >>> + u32 if1_mask2;
> >>> + u32 if1_id1;
> >>> + u32 if1_id2;
> >>> + u32 if1_mcont;
> >>> + u32 if1_dataa1;
> >>> + u32 if1_dataa2;
> >>> + u32 if1_datab1;
> >>> + u32 if1_datab2;
> >> ^^^^
> >>
> >> these registers and....
> >>
> >>> + u32 reserve2;
> >>> + u32 reserve3[12];
> >>
> >> ...and these
> >>
> >> VVVV
> >>> + u32 if2_creq;
> >>> + u32 if2_cmask;
> >>> + u32 if2_mask1;
> >>> + u32 if2_mask2;
> >>> + u32 if2_id1;
> >>> + u32 if2_id2;
> >>> + u32 if2_mcont;
> >>> + u32 if2_dataa1;
> >>> + u32 if2_dataa2;
> >>> + u32 if2_datab1;
> >>> + u32 if2_datab2;
> >>
> >> ^^^^
> >>
> >> ...are identical. I suggest to make a struct defining a complete
> >> "Message Interface Register Set". If you include the correct number of
> >> reserved bytes in the struct, you can have an array of two of these
> >> structs in the struct pch_can_regs.
> > 
> > To me, IMHOHO, it looks insignificant point.
> > Please show the merit ?
> 
> See Wolfgangs comments. You can get rid of duplicated code....

Using this method, I can't image to be able to reduce code size, now.
However I will try it.


> 
> > 
> >>
> >>> + u32 reserve4;
> >>> + u32 reserve5[20];
> >>> + u32 treq1;
> >>> + u32 treq2;
> >>> + u32 reserve6[2];
> >>> + u32 reserve7[56];
> >>> + u32 reserve8[3];
> >>> + u32 srst;
> >>> +};
> >>> +
> >>> +struct pch_can_priv {
> >>> + struct can_priv can;
> >>> + struct pci_dev *dev;
> >>> + unsigned int tx_enable[MAX_MSG_OBJ];
> >>> + unsigned int rx_enable[MAX_MSG_OBJ];
> >>> + unsigned int rx_link[MAX_MSG_OBJ];
> >>> + unsigned int int_enables;
> >>> + unsigned int int_stat;
> >>> + struct net_device *ndev;
> >>> + spinlock_t msgif_reg_lock; /* Message Interface Registers Access Lock*/
> >>                                                                             ^^^
> >> please add a whitespace
> > 
> > I agree.
> > 
> >>
> >>> + unsigned int msg_obj[MAX_MSG_OBJ];
> >>> + struct pch_can_regs __iomem *regs;
> >>> + struct napi_struct napi;
> >>> + unsigned int tx_obj; /* Point next Tx Obj index */
> >>> + unsigned int use_msi;
> >>> +};
> >>> +
> >>> +static struct can_bittiming_const pch_can_bittiming_const = {
> >>> + .name = "pch_can",
> >>> + .tseg1_min = 1,
> >>> + .tseg1_max = 16,
> >>> + .tseg2_min = 1,
> >>> + .tseg2_max = 8,
> >>> + .sjw_max = 4,
> >>> + .brp_min = 1,
> >>> + .brp_max = 1024, /* 6bit + extended 4bit */
> >>> + .brp_inc = 1,
> >>> +};
> >>> +
> >>> +static DEFINE_PCI_DEVICE_TABLE(pch_pci_tbl) = {
> >>> + {PCI_VENDOR_ID_INTEL, 0x8818, PCI_ANY_ID, PCI_ANY_ID,},
> >>> + {0,}
> >>> +};
> >>> +MODULE_DEVICE_TABLE(pci, pch_pci_tbl);
> >>> +
> >>> +static inline void pch_can_bit_set(u32 *addr, u32 mask)
> >>                                       ^^^^^
> >>
> >> that should be an void __iomem *, see mail I've send the other day.
> >> Please use sparse to check for this kinds of errors.
> >> (Compile the driver with C=2, i.e.: make drivers/net/can/pch_can.ko C=2)
> >>
> > 
> > I agree.
> > 
> >>> +{
> >>> + iowrite32(ioread32(addr) | mask, addr);
> >>> +}
> >>> +
> >>> +static inline void pch_can_bit_clear(u32 *addr, u32 mask)
> >>                                         ^^^^^
> >>
> >> dito
> > 
> > I agree.
> > 
> >>
> >>> +{
> >>> + iowrite32(ioread32(addr) & ~mask, addr);
> >>> +}
> >>> +
> >>> +static void pch_can_set_run_mode(struct pch_can_priv *priv,
> >>> +     enum pch_can_mode mode)
> >>> +{
> >>> + switch (mode) {
> >>> + case PCH_CAN_RUN:
> >>> +  pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_INIT);
> >>> +  break;
> >>> +
> >>> + case PCH_CAN_STOP:
> >>> +  pch_can_bit_set(&priv->regs->cont, CAN_CTRL_INIT);
> >>> +  break;
> >>> +
> >>> + default:
> >>> +  dev_err(&priv->ndev->dev, "%s -> Invalid Mode.\n", __func__);
> >>> +  break;
> >>> + }
> >>> +}
> >>> +
> >>> +static void pch_can_set_optmode(struct pch_can_priv *priv)
> >>> +{
> >>> + u32 reg_val = ioread32(&priv->regs->opt);
> >>> +
> >>> + if (priv->can.ctrlmode & CAN_CTRLMODE_LISTENONLY)
> >>> +  reg_val |= CAN_OPT_SILENT;
> >>> +
> >>> + if (priv->can.ctrlmode & CAN_CTRLMODE_LOOPBACK)
> >>> +  reg_val |= CAN_OPT_LBACK;
> >>> +
> >>> + pch_can_bit_set(&priv->regs->cont, CAN_CTRL_OPT);
> >>> + iowrite32(reg_val, &priv->regs->opt);
> >>> +}
> >>> +
> >>
> >> IMHO the function name is missleading, if I understand the code
> >> correctly, this functions triggers the transmission of the message.
> >> After this it checks for busy, 
> > 
> > Yes, your understanding is TRUE.
> > 
> >> but
> >>
> >>> +static void pch_can_check_if_busy(u32 __iomem *creq_addr, u32 num)
> >>                                      ^^^^
> >>
> > 
> > Yes, me too.
> > I will rename the function name.
> > 
> > How about "pch_can_rw_msg_obj"
> > 
> >> that should probaby be a void
> > What't the above mean ?
> > pch_can_check_if_busy is already "void" function.
> 
> >>> +static void pch_can_check_if_busy(u32 __iomem *creq_addr, u32 num)
> >>                                     ^^^^
> 
> That u32 should be a void.

I agree.

> 
> BTW: Does the Intel chip support x64? If so, have you tested the driver
> on a 64 bit kernel.
> 
> > 
> >>
> >>> +{
> >>> + u32 counter = COUNTER_LIMIT;
> >>> + u32 ifx_creq;
> >>> +
> >>> + iowrite32(num, creq_addr);
> >>> + while (counter) {
> >>> +  ifx_creq = ioread32(creq_addr) & CAN_IF_CREQ_BUSY;
> >>> +  if (!ifx_creq)
> >>> +   break;
> >>> +  counter--;
> >>> +  udelay(1);
> >>> + }
> >>> + if (!counter)
> >>> +  pr_err("%s:IF1 BUSY Flag is set forever.\n", __func__);
> >>> +}
> >>> +
> >>> +static void pch_can_set_int_enables(struct pch_can_priv *priv,
> >>> +        enum pch_can_mode interrupt_no)
> >>> +{
> >>> + switch (interrupt_no) {
> >>> + case PCH_CAN_ENABLE:
> >>> +  pch_can_bit_set(&priv->regs->cont, CAN_CTRL_IE);
> >>
> >> noone uses this case.
> > 
> > I agree.
> > 
> >>
> >>> +  break;
> >>> +
> >>> + case PCH_CAN_DISABLE:
> >>> +  pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE);
> >>> +  break;
> >>> +
> >>> + case PCH_CAN_ALL:
> >>> +  pch_can_bit_set(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
> >>> +  break;
> >>> +
> >>> + case PCH_CAN_NONE:
> >>> +  pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
> >>> +  break;
> >>> +
> >>> + default:
> >>> +  dev_err(&priv->ndev->dev, "Invalid interrupt number.\n");
> >>> +  break;
> >>> + }
> >>> +}
> >>> +
> >>> +static void pch_can_set_rx_enable(struct pch_can_priv *priv, u32 buff_num,
> >>> +      int set)
> >>> +{
> >>> + unsigned long flags;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + /* Reading the receive buffer data from RAM to Interface1 registers */
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
> >>> +
> >>> + /* Setting the IF1MASK1 register to access MsgVal and RxIE bits */
> >>> + iowrite32(CAN_CMASK_RDWR | CAN_CMASK_ARB | CAN_CMASK_CTRL,
> >>> +    &priv->regs->if1_cmask);
> >>> +
> >>> + if (set == 1) {
> >>> +  /* Setting the MsgVal and RxIE bits */
> >>> +  pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_RXIE);
> >>> +  pch_can_bit_set(&priv->regs->if1_id2, CAN_ID_MSGVAL);
> >>> +
> >>> + } else if (set == 0) {
> >>> +  /* Resetting the MsgVal and RxIE bits */
> >>> +  pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_RXIE);
> >>> +  pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID_MSGVAL);
> >>> + }
> >>> +
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +}
> >>> +
> >>> +static void pch_can_rx_enable_all(struct pch_can_priv *priv)
> >>> +{
> >>> + int i;
> >>> +
> >>> + /* Traversing to obtain the object configured as receivers. */
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
> >>> +  pch_can_set_rx_enable(priv, i, 1);
> >>> +}
> >>> +
> >>> +static void pch_can_rx_disable_all(struct pch_can_priv *priv)
> >>> +{
> >>> + int i;
> >>> +
> >>> + /* Traversing to obtain the object configured as receivers. */
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
> >>> +  pch_can_set_rx_enable(priv, i, 0);
> >>> +}
> >>> +
> >>> +static void pch_can_set_tx_enable(struct pch_can_priv *priv, u32 buff_num,
> >>> +     u32 set)
> >>> +{
> >>> + unsigned long flags;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + /* Reading the Msg buffer from Message RAM to Interface2 registers. */
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
> >>> +
> >>> + /* Setting the IF2CMASK register for accessing the
> >>> +  MsgVal and TxIE bits */
> >>> + iowrite32(CAN_CMASK_RDWR | CAN_CMASK_ARB | CAN_CMASK_CTRL,
> >>> +   &priv->regs->if2_cmask);
> >>> +
> >>> + if (set == 1) {
> >>> +  /* Setting the MsgVal and TxIE bits */
> >>> +  pch_can_bit_set(&priv->regs->if2_mcont, CAN_IF_MCONT_TXIE);
> >>> +  pch_can_bit_set(&priv->regs->if2_id2, CAN_ID_MSGVAL);
> >>> + } else if (set == 0) {
> >>> +  /* Resetting the MsgVal and TxIE bits. */
> >>> +  pch_can_bit_clear(&priv->regs->if2_mcont, CAN_IF_MCONT_TXIE);
> >>> +  pch_can_bit_clear(&priv->regs->if2_id2, CAN_ID_MSGVAL);
> >>> + }
> >>> +
> >>> + pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +}
> >>> +
> >>> +static void pch_can_tx_enable_all(struct pch_can_priv *priv)
> >>> +{
> >>> + int i;
> >>> +
> >>> + /* Traversing to obtain the object configured as transmit object. */
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
> >>> +  pch_can_set_tx_enable(priv, i, 1);
> >>> +}
> >>> +
> >>> +static void pch_can_tx_disable_all(struct pch_can_priv *priv)
> >>> +{
> >>> + int i;
> >>> +
> >>> + /* Traversing to obtain the object configured as transmit object. */
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
> >>> +  pch_can_set_tx_enable(priv, i, 0);
> >>> +}
> >>> +
> >>> +static int pch_can_int_pending(struct pch_can_priv *priv)
> >>           ^^^
> >>
> >> make it u32 as it returns a register value, or a u16 as you only use
> >> the 16 lower bits.
> > 
> > I agree. I will modify to u32.
> > 
> >>
> >>> +{
> >>> + return ioread32(&priv->regs->intr) & 0xffff;
> >>> +}
> >>> +
> >>> +static void pch_can_clear_buffers(struct pch_can_priv *priv)
> >>> +{
> >>> + int i; /* Msg Obj ID (1~32) */
> >>> +
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
> >>
> >> IMHO the readability would be improved if you define something like
> >> PCH_RX_OBJ_START and PCH_RX_OBJ_END.
> > 
> > I agree.
> > 
> >>
> >>> +  iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if1_cmask);
> >>> +  iowrite32(0xffff, &priv->regs->if1_mask1);
> >>> +  iowrite32(0xffff, &priv->regs->if1_mask2);
> >>> +  iowrite32(0x0, &priv->regs->if1_id1);
> >>> +  iowrite32(0x0, &priv->regs->if1_id2);
> >>> +  iowrite32(0x0, &priv->regs->if1_mcont);
> >>> +  iowrite32(0x0, &priv->regs->if1_dataa1);
> >>> +  iowrite32(0x0, &priv->regs->if1_dataa2);
> >>> +  iowrite32(0x0, &priv->regs->if1_datab1);
> >>> +  iowrite32(0x0, &priv->regs->if1_datab2);
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK |
> >>> +     CAN_CMASK_ARB | CAN_CMASK_CTRL,
> >>> +     &priv->regs->if1_cmask);
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, i);
> >>> + }
> >>> +
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
> >>                  ^^^^^^^^^^^^^^^^^^
> >> dito for TX objects
> > 
> > I agree.
> > 
> >>
> >>> +  iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if2_cmask);
> >>> +  iowrite32(0xffff, &priv->regs->if2_mask1);
> >>> +  iowrite32(0xffff, &priv->regs->if2_mask2);
> >>> +  iowrite32(0x0, &priv->regs->if2_id1);
> >>> +  iowrite32(0x0, &priv->regs->if2_id2);
> >>> +  iowrite32(0x0, &priv->regs->if2_mcont);
> >>> +  iowrite32(0x0, &priv->regs->if2_dataa1);
> >>> +  iowrite32(0x0, &priv->regs->if2_dataa2);
> >>> +  iowrite32(0x0, &priv->regs->if2_datab1);
> >>> +  iowrite32(0x0, &priv->regs->if2_datab2);
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
> >>> +     CAN_CMASK_CTRL, &priv->regs->if2_cmask);
> >>> +  pch_can_check_if_busy(&priv->regs->if2_creq, i);
> >>> + }
> >>> +}
> >>> +
> >>> +static void pch_can_config_rx_tx_buffers(struct pch_can_priv *priv)
> >>> +{
> >>> + int i;
> >>> + unsigned long flags;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> +
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
> >>> +  iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, i);
> >>
> >> If I understand the code correctly, the about function triggers a
> >> transfer. Why do you first trigger a transfer, then set the message contents....
> > 
> > For doing Read-Modify-Write.
> > As to fixed parameter of message object, it doesn't be modified every access.
> 
> I see.
> 
> > We will modify to write only.
> > 
> >>> +
> >>> +  iowrite32(0x0, &priv->regs->if1_id1);
> >>> +  iowrite32(0x0, &priv->regs->if1_id2);
> >>> +
> >>> +  pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_UMASK);
> >>
> >>     Why do you set the "Use acceptance mask" bit? We want to receive
> >>     all can messages.
> > 
> > Without "Use acceptance mask" means received packet matched ID[28:0] only.
> > As a result, filter is enabled.
> > 
> > With "Use acceptance mask" and setting Msk[0:28]=all 1, all packets can be received(=No filter state) 
> 
> Thanks for the explenation.
> 
> > 
> >>
> >>> +
> >>> +  /* Set FIFO mode set to 0 except last Rx Obj*/
> >>> +  pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
> >>> +  /* In case FIFO mode, Last EoB of Rx Obj must be 1 */
> >>> +  if (i == (PCH_RX_OBJ_NUM - 1))
> >>> +   pch_can_bit_set(&priv->regs->if1_mcont,
> >>> +     CAN_IF_MCONT_EOB);
> >>
> >>     Make it if () { } else { }, please.
> > 
> > Sorry, I can't understand.
> > else {} is not necessary.
> 
> Please look at the code block above, again. You frist clean the bit
> unconditionally, then you set the bit in the if. Please make it:
> 
> if (last)
>  set_bit
> else
>  clear_bit

I understand.
I will modify like above.
Thanks.

> 
> > 
> >>
> >>> +
> >>> +  iowrite32(0, &priv->regs->if1_mask1);
> >>> +  pch_can_bit_clear(&priv->regs->if1_mask2,
> >>> +      0x1fff | CAN_MASK2_MDIR_MXTD);
> >>> +
> >>> +  /* Setting CMASK for writing */
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
> >>> +     CAN_CMASK_CTRL, &priv->regs->if1_cmask);
> >>> +
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, i);
> >>
> >> ...and then trigger the transfer again?
> > 
> > This means Read-Modify-Write.
> 
> ic
> 
> > 
> >>
> >>> + }
> >>> +
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
> >>> +  iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
> >>> +  pch_can_check_if_busy(&priv->regs->if2_creq, i);
> >>
> >> same question about triggering the transfer 2 times applied here, too
> > 
> > ditto.
> > 
> >>> +
> >>> +  /* Resetting DIR bit for reception */
> >>> +  iowrite32(0x0, &priv->regs->if2_id1);
> >>> +  iowrite32(0x0, &priv->regs->if2_id2);
> >>> +  pch_can_bit_set(&priv->regs->if2_id2, CAN_ID2_DIR);
> >>
> >> Can you combine the two accesses to >if2_id2 into one?
> > 
> > I agree.
> > 
> >>
> >>> +
> >>> +  /* Setting EOB bit for transmitter */
> >>> +  iowrite32(CAN_IF_MCONT_EOB, &priv->regs->if2_mcont);
> >>> +
> >>> +  pch_can_bit_set(&priv->regs->if2_mcont,
> >>> +    CAN_IF_MCONT_UMASK);
> >>
> >> dito for if2_mcont
> > 
> > ditto.
> > 
> >>
> >>> +
> >>> +  iowrite32(0, &priv->regs->if2_mask1);
> >>> +  pch_can_bit_clear(&priv->regs->if2_mask2, 0x1fff);
> >>> +
> >>> +  /* Setting CMASK for writing */
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_MASK | CAN_CMASK_ARB |
> >>> +     CAN_CMASK_CTRL, &priv->regs->if2_cmask);
> >>> +
> >>> +  pch_can_check_if_busy(&priv->regs->if2_creq, i);
> >>> + }
> >>> +
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +}
> >>> +
> >>> +static void pch_can_init(struct pch_can_priv *priv)
> >>> +{
> >>> + /* Stopping the Can device. */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
> >>> +
> >>> + /* Clearing all the message object buffers. */
> >>> + pch_can_clear_buffers(priv);
> >>> +
> >>> + /* Configuring the respective message object as either rx/tx object. */
> >>> + pch_can_config_rx_tx_buffers(priv);
> >>> +
> >>> + /* Enabling the interrupts. */
> >>> + pch_can_set_int_enables(priv, PCH_CAN_ALL);
> >>> +}
> >>> +
> >>> +static void pch_can_release(struct pch_can_priv *priv)
> >>> +{
> >>> + /* Stooping the CAN device. */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
> >>> +
> >>> + /* Disabling the interrupts. */
> >>> + pch_can_set_int_enables(priv, PCH_CAN_NONE);
> >>> +
> >>> + /* Disabling all the receive object. */
> >>> + pch_can_rx_disable_all(priv);
> >>> +
> >>> + /* Disabling all the transmit object. */
> >>> + pch_can_tx_disable_all(priv);
> >>> +}
> >>> +
> >>> +/* This function clears interrupt(s) from the CAN device. */
> >>> +static void pch_can_int_clr(struct pch_can_priv *priv, u32 mask)
> >>> +{
> >>> + if (mask == CAN_STATUS_INT) {
> >>
> >> is this a valid case?
> > 
> > This "if" is always false.
> > I will delete this condition.
> > 
> >>
> >>> +  ioread32(&priv->regs->stat);
> >>> +  return;
> >>> + }
> >>> +
> >>> + /* Clear interrupt for transmit object */
> >>> + if ((mask >= 1) && (mask <= PCH_RX_OBJ_NUM)) {
> >>> +  /* Setting CMASK for clearing the reception interrupts. */
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL | CAN_CMASK_ARB,
> >>> +     &priv->regs->if1_cmask);
> >>> +
> >>> +  /* Clearing the Dir bit. */
> >>> +  pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID2_DIR);
> >>> +
> >>> +  /* Clearing NewDat & IntPnd */
> >>> +  pch_can_bit_clear(&priv->regs->if1_mcont,
> >>> +      CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_INTPND);
> >>> +
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, mask);
> >>> + } else if ((mask > PCH_RX_OBJ_NUM) && (mask <= PCH_OBJ_NUM)) {
> >>> +  /* Setting CMASK for clearing interrupts for
> >>> +     frame transmission. */
> >>
> >> /*
> >>  * this is the prefered style of multi line comments,
> >>  * please adjust you comments
> >>  */
> > 
> > I understand.
> > 
> >>
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL | CAN_CMASK_ARB,
> >>> +     &priv->regs->if2_cmask);
> >>> +
> >>> +  /* Resetting the ID registers. */
> >>> +  pch_can_bit_set(&priv->regs->if2_id2,
> >>> +          CAN_ID2_DIR | (0x7ff << 2));
> >>> +  iowrite32(0x0, &priv->regs->if2_id1);
> >>> +
> >>> +  /* Claring NewDat, TxRqst & IntPnd */
> >>> +  pch_can_bit_clear(&priv->regs->if2_mcont,
> >>> +      CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_INTPND |
> >>> +      CAN_IF_MCONT_TXRQXT);
> >>> +  pch_can_check_if_busy(&priv->regs->if2_creq, mask);
> >>> + }
> >>> +}
> >>> +
> >>> +static u32 pch_can_get_buffer_status(struct pch_can_priv *priv)
> >>> +{
> >>> + return (ioread32(&priv->regs->treq1) & 0xffff) |
> >>> +        ((ioread32(&priv->regs->treq2) & 0xffff) << 16);
> >>
> >> the second 0xffff is not needed, as the return value is u32 and you shift by 16.
> > 
> > I agree.
> > 
> >>
> >>> +}
> >>> +
> >>> +static void pch_can_reset(struct pch_can_priv *priv)
> >>> +{
> >>> + /* write to sw reset register */
> >>> + iowrite32(1, &priv->regs->srst);
> >>> + iowrite32(0, &priv->regs->srst);
> >>> +}
> >>> +
> >>> +static void pch_can_error(struct net_device *ndev, u32 status)
> >>> +{
> >>> + struct sk_buff *skb;
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + struct can_frame *cf;
> >>> + u32 errc;
> >>> + struct net_device_stats *stats = &(priv->ndev->stats);
> >>> + enum can_state state = priv->can.state;
> >>> +
> >>> + skb = alloc_can_err_skb(ndev, &cf);
> >>> + if (!skb)
> >>> +  return;
> >>> +
> >>> + if (status & PCH_BUS_OFF) {
> >>> +  pch_can_tx_disable_all(priv);
> >>> +  pch_can_rx_disable_all(priv);
> >>> +  state = CAN_STATE_BUS_OFF;
> >>> +  cf->can_id |= CAN_ERR_BUSOFF;
> >>> +  can_bus_off(ndev);
> >>> + }
> >>> +
> >>> + /* Warning interrupt. */
> >>> + if (status & PCH_EWARN) {
> >>> +  state = CAN_STATE_ERROR_WARNING;
> >>> +  priv->can.can_stats.error_warning++;
> >>> +  cf->can_id |= CAN_ERR_CRTL;
> >>> +  errc = ioread32(&priv->regs->errc);
> >>> +  if (((errc & CAN_REC) >> 8) > 96)
> >>> +   cf->data[1] |= CAN_ERR_CRTL_RX_WARNING;
> >>> +  if ((errc & CAN_TEC) > 96)
> >>> +   cf->data[1] |= CAN_ERR_CRTL_TX_WARNING;
> >>> +  dev_warn(&ndev->dev,
> >>> +   "%s -> Error Counter is more than 96.\n", __func__);
> >>
> >> Please use just "debug" level not warning here. Consider to use
> >> netdev_dbg() instead. IMHO the __func__ can be dropped and the
> >> "official" name for the error is "Error Warning".
> > 
> > I want to know the reason.
> > Why is it not dev_warn but netdev_dbg ?
> 
> If you use warning level it would end up on the console or and in the
> syslog. It's quite complicated (for programs) to get information from
> there. This is why we send CAN error frames. They hold the same
> information but int a binary form, thus it's easier to process.

I understand the reason.
BTW, Why do you say not dev_dbg but netdev_dbg ?

> 
> > 
> >>
> >>> + }
> >>> + /* Error passive interrupt. */
> >>> + if (status & PCH_EPASSIV) {
> >>> +  priv->can.can_stats.error_passive++;
> >>> +  state = CAN_STATE_ERROR_PASSIVE;
> >>> +  cf->can_id |= CAN_ERR_CRTL;
> >>> +  errc = ioread32(&priv->regs->errc);
> >>> +  if (((errc & CAN_REC) >> 8) > 127)
> >>> +   cf->data[1] |= CAN_ERR_CRTL_RX_PASSIVE;
> >>> +  if ((errc & CAN_TEC) > 127)
> >>> +   cf->data[1] |= CAN_ERR_CRTL_TX_PASSIVE;
> >>> +  dev_err(&ndev->dev,
> >>> +   "%s -> CAN controller is ERROR PASSIVE .\n", __func__);
> >>
> >> dito
> > 
> > ditto
> > 
> >>
> >>> + }
> >>> +
> >>> + if (status & PCH_LEC_ALL) {
> >>> +  priv->can.can_stats.bus_error++;
> >>> +  stats->rx_errors++;
> >>> +  switch (status & PCH_LEC_ALL) {
> >>
> >> I suggest to convert to a if-bit-set because there might be more than
> >> one bit set.
> > 
> > I agree.
> > 
> >>
> >>> +  case PCH_STUF_ERR:
> >>> +   cf->data[2] |= CAN_ERR_PROT_STUFF;
> >>> +   break;
> >>> +  case PCH_FORM_ERR:
> >>> +   cf->data[2] |= CAN_ERR_PROT_FORM;
> >>> +   break;
> >>> +  case PCH_ACK_ERR:
> >>> +   cf->data[2] |= CAN_ERR_PROT_LOC_ACK |
> >>> +           CAN_ERR_PROT_LOC_ACK_DEL;
> >>> +   break;
> >>> +  case PCH_BIT1_ERR:
> >>> +  case PCH_BIT0_ERR:
> >>> +   cf->data[2] |= CAN_ERR_PROT_BIT;
> >>> +   break;
> >>> +  case PCH_CRC_ERR:
> >>> +   cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
> >>> +           CAN_ERR_PROT_LOC_CRC_DEL;
> >>> +   break;
> >>> +  default:
> >>> +   iowrite32(status | PCH_LEC_ALL, &priv->regs->stat);
> >>> +   break;
> >>> +  }
> >>> +
> >>> + }
> >>> +
> >>> + priv->can.state = state;
> >>> + netif_receive_skb(skb);
> >>> +
> >>> + stats->rx_packets++;
> >>> + stats->rx_bytes += cf->can_dlc;
> >>> +}
> >>> +
> >>> +static irqreturn_t pch_can_interrupt(int irq, void *dev_id)
> >>> +{
> >>> + struct net_device *ndev = (struct net_device *)dev_id;
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> +
> >>> + pch_can_set_int_enables(priv, PCH_CAN_NONE);
> >>> + napi_schedule(&priv->napi);
> >>> +
> >>> + return IRQ_HANDLED;
> >>> +}
> >>> +
> >>> +static void pch_fifo_thresh(struct pch_can_priv *priv, int obj_id)
> >>> +{
> >>> + if (obj_id < PCH_FIFO_THRESH) {
> >>> +  iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL |
> >>> +     CAN_CMASK_ARB, &priv->regs->if1_cmask);
> >>> +
> >>> +  /* Clearing the Dir bit. */
> >>> +  pch_can_bit_clear(&priv->regs->if1_id2, CAN_ID2_DIR);
> >>> +
> >>> +  /* Clearing NewDat & IntPnd */
> >>> +  pch_can_bit_clear(&priv->regs->if1_mcont,
> >>> +      CAN_IF_MCONT_INTPND);
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, obj_id);
> >>> + } else if (obj_id > PCH_FIFO_THRESH) {
> >>> +  pch_can_int_clr(priv, obj_id);
> >>> + } else if (obj_id == PCH_FIFO_THRESH) {
> >>> +  int cnt;
> >>> +  for (cnt = 0; cnt < PCH_FIFO_THRESH; cnt++)
> >>> +   pch_can_int_clr(priv, cnt+1);
> >>> + }
> >>> +}
> >>> +
> >>> +static int pch_can_rx_msg_lost(struct net_device *ndev, int obj_id)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + struct net_device_stats *stats = &(priv->ndev->stats);
> >>> + struct sk_buff *skb;
> >>> + struct can_frame *cf;
> >>> +
> >>> + dev_err(&priv->ndev->dev, "Msg Obj is overwritten.\n");
> >>> + pch_can_bit_clear(&priv->regs->if1_mcont,
> >>> +     CAN_IF_MCONT_MSGLOST);
> >>> + iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL,
> >>> +    &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, obj_id);
> >>> +
> >>> + skb = alloc_can_err_skb(ndev, &cf);
> >>> + if (!skb)
> >>> +  return -ENOMEM;
> >>> +
> >>> + priv->can.can_stats.error_passive++;
> >>> + priv->can.state = CAN_STATE_ERROR_PASSIVE;
> >>> + cf->can_id |= CAN_ERR_CRTL;
> >>> + cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
> >>> + stats->rx_over_errors++;
> >>> + stats->rx_errors++;
> >>> +
> >>> + netif_receive_skb(skb);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int pch_can_rx_normal(struct net_device *ndev, u32 obj_num, int quota)
> >>> +{
> >>> + u32 reg;
> >>> + canid_t id;
> >>> + u32 ide;
> >>> + u32 rtr;
> >>> + int rcv_pkts = 0;
> >>> + int rtn;
> >>> + int next_flag = 0;
> >>> + struct sk_buff *skb;
> >>> + struct can_frame *cf;
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + struct net_device_stats *stats = &(priv->ndev->stats);
> >>> +
> >>> + /* Reading the messsage object from the Message RAM */
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, obj_num);
> >>> +
> >>> + /* Reading the MCONT register. */
> >>> + reg = ioread32(&priv->regs->if1_mcont);
> >>> + reg &= 0xffff;
> >>> +
> >>> + for (; (!(reg & CAN_IF_MCONT_EOB)) && (quota > 0);
> >>> +      obj_num++, next_flag = 0) {
> >>> +  /* If MsgLost bit set. */
> >>> +  if (reg & CAN_IF_MCONT_MSGLOST) {
> >>> +   rtn = pch_can_rx_msg_lost(ndev, obj_num);
> >>> +   if (!rtn)
> >>> +    return rtn;
> >>> +   rcv_pkts++;
> >>> +   quota--;
> >>> +   next_flag = 1;
> >>> +  } else if (!(reg & CAN_IF_MCONT_NEWDAT))
> >>> +   next_flag = 1;
> >>> +
> >>
> >> after rearanging the code (see below..) you should be able to use a continue here.
> >>
> >>> +  if (!next_flag) {
> >>> +   skb = alloc_can_skb(priv->ndev, &cf);
> >>> +   if (!skb)
> >>> +    return -ENOMEM;
> >>> +
> >>> +   /* Get Received data */
> >>> +   ide = ((ioread32(&priv->regs->if1_id2)) & CAN_ID2_XTD);
> >>> +   if (ide) {
> >>> +    id = (ioread32(&priv->regs->if1_id1) & 0xffff);
> >>> +    id |= (((ioread32(&priv->regs->if1_id2)) &
> >>> +          0x1fff) << 16);
> >>> +    cf->can_id = (id & CAN_EFF_MASK) | CAN_EFF_FLAG;
> >>                                               ^^^^^^^^^^^^^^^^^
> >>
> >> is the mask needed, you mask the if1_id{1,2} already
> > 
> > I will delete
> > 
> >>
> >>> +   } else {
> >>> +    id = (((ioread32(&priv->regs->if1_id2)) &
> >>> +        (CAN_SFF_MASK << 2)) >> 2);
> >>> +    cf->can_id = (id & CAN_SFF_MASK);
> >>
> >> one mask can go away
> > 
> > I agree.
> > 
> >>
> >>> +   }
> >>> +
> >>> +   rtr = ioread32(&priv->regs->if1_id2) &  CAN_ID2_DIR;
> >>                                                               ^^
> >>
> >> remove one space
> > 
> > I agree.
> > 
> >>
> >>> +
> >>> +   if (rtr)
> >>> +    cf->can_id |= CAN_RTR_FLAG;
> >>> +
> >>> +   cf->can_dlc = get_can_dlc((ioread32(&priv->regs->
> >>> +         if1_mcont)) & 0xF);
> >>> +   *(u16 *)(cf->data + 0) = ioread16(&priv->regs->
> >>> +         if1_dataa1);
> >>> +   *(u16 *)(cf->data + 2) = ioread16(&priv->regs->
> >>> +         if1_dataa2);
> >>> +   *(u16 *)(cf->data + 4) = ioread16(&priv->regs->
> >>> +         if1_datab1);
> >>> +   *(u16 *)(cf->data + 6) = ioread16(&priv->regs->
> >>> +         if1_datab2);
> >>
> >> are you sure, the bytes in the can package a in the correct order.
> >> Please test your pch_can against a non pch_can system.
> > 
> > Unfortunately, we don't have non pch_can system.
> 
> Have a look a the driver/net/can/usb subdir and buy one of those. It
> really hard to find bugs if you test against your own driver.

We can't buy this right now for few budget.
But I heard we have "CANalyzer".
Using this, we can test this endian concern.
But we may need much time for studying how to use.

> 
> > 
> >>
> >>> +
> >>> +   netif_receive_skb(skb);
> >>> +   rcv_pkts++;
> >>> +   stats->rx_packets++;
> >>> +   quota--;
> >>> +   stats->rx_bytes += cf->can_dlc;
> >>> +
> >>> +   pch_fifo_thresh(priv, obj_num);
> >>> +  }
> >>> +
> >>> +  /* Reading the messsage object from the Message RAM */
> >>> +  iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> +  pch_can_check_if_busy(&priv->regs->if1_creq, obj_num + 1);
> >>> +  reg = ioread32(&priv->regs->if1_mcont);
> >>
> >> this is almost the same code as before the the loop, can you rearange
> >> the code to avoid duplication?
> > 
> > I agree.
> > 
> >>
> >>> + }
> >>> +
> >>> + return rcv_pkts;
> >>> +}
> >>> +
> >>> +static void pch_can_tx_complete(struct net_device *ndev, u32 int_stat)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + struct net_device_stats *stats = &(priv->ndev->stats);
> >>> + unsigned long flags;
> >>> + u32 dlc;
> >>> +
> >>> + can_get_echo_skb(ndev, int_stat - PCH_RX_OBJ_NUM - 1);
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + iowrite32(CAN_CMASK_RX_TX_GET | CAN_CMASK_CLRINTPND,
> >>> +    &priv->regs->if2_cmask);
> >>> + dlc = ioread32(&priv->regs->if2_mcont) & CAN_IF_MCONT_DLC;
> >>> + pch_can_check_if_busy(&priv->regs->if2_creq, int_stat);
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> + if (dlc > 8)
> >>> +  dlc = 8;
> >>
> >> use get_can_dlc
> > 
> > I agree.
> > 
> >>
> >>> + stats->tx_bytes += dlc;
> >>> + stats->tx_packets++;
> >>> +}
> >>> +
> >>> +static int pch_can_rx_poll(struct napi_struct *napi, int quota)
> >>> +{
> >>> + struct net_device *ndev = napi->dev;
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + u32 int_stat;
> >>> + int rcv_pkts = 0;
> >>> + u32 reg_stat;
> >>> + unsigned long flags;
> >>> +
> >>> + int_stat = pch_can_int_pending(priv);
> >>> + if (!int_stat)
> >>> +  goto END;
> >>> +
> >>> + if ((int_stat == CAN_STATUS_INT) && (quota > 0)) {
> >>> +  reg_stat = ioread32(&priv->regs->stat);
> >>> +  if (reg_stat & (PCH_BUS_OFF | PCH_LEC_ALL)) {
> >>> +   if ((reg_stat & PCH_LEC_ALL) != PCH_LEC_ALL) {
> >>> +    pch_can_error(ndev, reg_stat);
> >>> +    quota--;
> >>> +   }
> >>> +  }
> >>> +
> >>> +  if (reg_stat & PCH_TX_OK) {
> >>> +   spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> +   iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
> >>> +   pch_can_check_if_busy(&priv->regs->if2_creq,
> >>> +            ioread32(&priv->regs->intr));
> >>                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>
> >> Isn't this "int_stat". Might it be possilbe that regs->intr changes
> >> between the pch_can_int_pending and here?
> > 
> > This code was mistake.
> > This condition, message object is not acccessed.
> > Thus, pch_can_check_if_busy can be deleted.
> > 
> >>
> >> What should this transfer do?
> >>
> >>> +   spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +   pch_can_bit_clear(&priv->regs->stat, PCH_TX_OK);
> >>> +  }
> >>> +
> >>> +  if (reg_stat & PCH_RX_OK)
> >>> +   pch_can_bit_clear(&priv->regs->stat, PCH_RX_OK);
> >>> +
> >>> +  int_stat = pch_can_int_pending(priv);
> >>> + }
> >>> +
> >>> + if (quota == 0)
> >>> +  goto END;
> >>> +
> >>> + if ((int_stat >= 1) && (int_stat <= PCH_RX_OBJ_NUM)) {
> >>> +  spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> +  rcv_pkts += pch_can_rx_normal(ndev, int_stat, quota);
> >>> +  spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +  quota -= rcv_pkts;
> >>> +  if (rcv_pkts < 0)
> >>
> >> how can this happen?
> > 
> > My mistake.
> > if (quota < 0) is TRUE.
> > 
> > 
> >>
> >>> +   goto END;
> >>> + } else if ((int_stat > PCH_RX_OBJ_NUM) && (int_stat <= PCH_OBJ_NUM)) {
> >>> +  /* Handle transmission interrupt */
> >>> +  pch_can_tx_complete(ndev, int_stat);
> >>> + }
> >>> +
> >>> +END:
> >>> + napi_complete(napi);
> >>> + pch_can_set_int_enables(priv, PCH_CAN_ALL);
> >>> +
> >>> + return rcv_pkts;
> >>> +}
> >>> +
> >>> +static int pch_set_bittiming(struct net_device *ndev)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + const struct can_bittiming *bt = &priv->can.bittiming;
> >>> + u32 canbit;
> >>> + u32 bepe;
> >>> +
> >>> + /* Setting the CCE bit for accessing the Can Timing register. */
> >>> + pch_can_bit_set(&priv->regs->cont, CAN_CTRL_CCE);
> >>> +
> >>> + canbit = (bt->brp - 1) & MSK_BITT_BRP;
> >>> + canbit |= (bt->sjw - 1) << BIT_BITT_SJW;
> >>> + canbit |= (bt->phase_seg1 + bt->prop_seg - 1) << BIT_BITT_TSEG1;
> >>> + canbit |= (bt->phase_seg2 - 1) << BIT_BITT_TSEG2;
> >>> + bepe = ((bt->brp - 1) & MSK_BRPE_BRPE) >> BIT_BRPE_BRPE;
> >>> + iowrite32(canbit, &priv->regs->bitt);
> >>> + iowrite32(bepe, &priv->regs->brpe);
> >>> + pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_CCE);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static void pch_can_start(struct net_device *ndev)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> +
> >>> + if (priv->can.state != CAN_STATE_STOPPED)
> >>> +  pch_can_reset(priv);
> >>> +
> >>> + pch_set_bittiming(ndev);
> >>> + pch_can_set_optmode(priv);
> >>> +
> >>> + pch_can_tx_enable_all(priv);
> >>> + pch_can_rx_enable_all(priv);
> >>> +
> >>> + /* Setting the CAN to run mode. */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_RUN);
> >>> +
> >>> + priv->can.state = CAN_STATE_ERROR_ACTIVE;
> >>> +
> >>> + return;
> >>> +}
> >>> +
> >>> +static int pch_can_do_set_mode(struct net_device *ndev, enum can_mode mode)
> >>> +{
> >>> + int ret = 0;
> >>> +
> >>> + switch (mode) {
> >>> + case CAN_MODE_START:
> >>> +  pch_can_start(ndev);
> >>> +  netif_wake_queue(ndev);
> >>> +  break;
> >>> + default:
> >>> +  ret = -EOPNOTSUPP;
> >>> +  break;
> >>> + }
> >>> +
> >>> + return ret;
> >>> +}
> >>> +
> >>> +static int pch_can_open(struct net_device *ndev)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + int retval;
> >>> +
> >>> + /* Regsitering the interrupt. */
> >>> + retval = request_irq(priv->dev->irq, pch_can_interrupt, IRQF_SHARED,
> >>> +        ndev->name, ndev);
> >>> + if (retval) {
> >>> +  dev_err(&ndev->dev, "request_irq failed.\n");
> >>> +  goto req_irq_err;
> >>> + }
> >>> +
> >>> + /* Open common can device */
> >>> + retval = open_candev(ndev);
> >>> + if (retval) {
> >>> +  dev_err(ndev->dev.parent, "open_candev() failed %d\n", retval);
> >>> +  goto err_open_candev;
> >>> + }
> >>> +
> >>> + pch_can_init(priv);
> >>> + pch_can_start(ndev);
> >>> + napi_enable(&priv->napi);
> >>> + netif_start_queue(ndev);
> >>> +
> >>> + return 0;
> >>> +
> >>> +err_open_candev:
> >>> + free_irq(priv->dev->irq, ndev);
> >>> +req_irq_err:
> >>> + pch_can_release(priv);
> >>> +
> >>> + return retval;
> >>> +}
> >>> +
> >>> +static int pch_close(struct net_device *ndev)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> +
> >>> + netif_stop_queue(ndev);
> >>> + napi_disable(&priv->napi);
> >>> + pch_can_release(priv);
> >>> + free_irq(priv->dev->irq, ndev);
> >>> + close_candev(ndev);
> >>> + priv->can.state = CAN_STATE_STOPPED;
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
> >>> +{
> >>> + unsigned long flags;
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> + struct can_frame *cf = (struct can_frame *)skb->data;
> >>> + int tx_buffer_avail = 0;
> >>
> >> What I'm totally missing is the TX flow controll. Your driver has to
> >> ensure that the package leave the controller in the order that come
> >> into the xmit function. Further you have to stop your xmit queue if
> >> you're out of tx objects and reenable if you have a object free.
> >>
> >> Use netif_stop_queue() and netif_wake_queue() for this.
> > 
> > In this code, I think "out of tx objects" cannot be  occurred.
> 
> It's not a matter of code it's the hardware. You cannot put more than a
> certain number of CAN frames into the hardware. If you have a CAN bus at
> a certain speed, you can only send a certain number of CAN frames in a
> second. So you cannot push more than this amount of frames/s into the
> hardware.
> 
> > Nevertheless, are netif_stop_queue() and netif_wake_queue() is necessary ?
> 
> Yes.

I can' understand your issue.
Please can you hear my opinion?

Please see the head of pch_xmit.

> > + if (priv->tx_obj == (PCH_OBJ_NUM + 1)) { /* Point tail Obj + 1 */
> > +  while (ioread32(&priv->regs->treq2) & 0xfc00)
> > +   udelay(1);

When points tail of Tx message object,
this driver waits until completion of all tx messaeg objects.
Thus, application/driver ought not to be able to put Tx object exceed the number of tx message object.
Thus I think these code(netif_stop_queue/netif_wake_queue) are completely redundant.

> 
> > 
> >>
> >>> +
> >>> + if (can_dropped_invalid_skb(ndev, skb))
> >>> +  return NETDEV_TX_OK;
> >>> +
> >>> + if (priv->tx_obj == (PCH_OBJ_NUM + 1)) { /* Point tail Obj + 1 */
> >>> +  while (ioread32(&priv->regs->treq2) & 0xfc00)
> >>> +   udelay(1);
> >>
> >> please no (possible) infinite delays!
> > 
> > I will add break processing.
> > 
> >>
> >>> +  priv->tx_obj = PCH_RX_OBJ_NUM + 1; /* Point head of Tx Obj ID */
> >>> + }
> >>
> >>> +
> >>> + tx_buffer_avail = priv->tx_obj;
> >>
> >> why has the "object" become a "buffer" now? :)
> > 
> > You are right.
> > I will modify the name.
> > 
> >>
> >>> + priv->tx_obj++;
> >>> +
> >>> + /* Attaining the lock. */
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> +
> >>> + /* Setting the CMASK register to set value*/
> >>                                                  ^^^
> >>
> >> pleas add a whitespace
> > 
> > I agree.
> > 
> >>
> >>> + iowrite32(CAN_CMASK_RX_TX_SET, &priv->regs->if2_cmask);
> >>> +
> >>> + /* If ID extended is set. */
> >>> + if (cf->can_id & CAN_EFF_FLAG) {
> >>> +  iowrite32(cf->can_id & 0xffff, &priv->regs->if2_id1);
> >>> +  iowrite32(((cf->can_id >> 16) & 0x1fff) | CAN_ID2_XTD,
> >>> +       &priv->regs->if2_id2);
> >>> + } else {
> >>> +  iowrite32(0, &priv->regs->if2_id1);
> >>> +  iowrite32((cf->can_id & CAN_SFF_MASK) << 2,
> >>> +      &priv->regs->if2_id2);
> >>> + }
> >>> +
> >>> + pch_can_bit_set(&priv->regs->if2_id2, CAN_ID_MSGVAL);
> >>
> >> Do you need to do a read-modify-write of the hardware register? Please
> >> prepare the values you want to write to hardware, then do it.
> > 
> > Current design policy for read/write message object,
> > the driver is designed with Read-Modify-Write.
> > 
> > I will modify to Write only for reducing accessing Message RAM.
> > 
> >>
> >>> +
> >>> + /* If remote frame has to be transmitted.. */
> >>> + if (!(cf->can_id & CAN_RTR_FLAG))
> >>> +  pch_can_bit_set(&priv->regs->if2_id2, CAN_ID2_DIR);
> >> dito
> >>> + /* If remote frame has to be transmitted.. */
> >>> + if (cf->can_id & CAN_RTR_FLAG)
> >>> +  pch_can_bit_clear(&priv->regs->if2_id2, CAN_ID2_DIR);
> >> dito
> >>> +
> >>> + /* Copy data to register */
> >>> + if (cf->can_dlc > 0) {
> >>> +  u32 data1 = *((u16 *)&cf->data[0]);
> >>> +  iowrite32(data1, &priv->regs->if2_dataa1);
> >>
> >> do you think you send the bytes in correct order?
> > 
> > Let me study this endianess.
> > 
> >>
> >>> + }
> >>> + if (cf->can_dlc > 2) {
> >>> +  u32 data1 = *((u16 *)&cf->data[2]);
> >>> +  iowrite32(data1, &priv->regs->if2_dataa2);
> >>> + }
> >>> + if (cf->can_dlc > 4) {
> >>> +  u32 data1 = *((u16 *)&cf->data[4]);
> >>> +  iowrite32(data1, &priv->regs->if2_datab1);
> >>> + }
> >>> + if (cf->can_dlc > 6) {
> >>> +  u32 data1 = *((u16 *)&cf->data[6]);
> >>> +  iowrite32(data1, &priv->regs->if2_datab2);
> >>> + }
> >>> +
> >>> + can_put_echo_skb(skb, ndev, tx_buffer_avail - PCH_RX_OBJ_NUM - 1);
> >>> +
> >>> + /* Set the size of the data. */
> >>> + iowrite32(cf->can_dlc, &priv->regs->if2_mcont);
> >>> +
> >>> + /* Update if2_mcont */
> >>> + pch_can_bit_set(&priv->regs->if2_mcont,
> >>> +   CAN_IF_MCONT_NEWDAT | CAN_IF_MCONT_TXRQXT |
> >>> +   CAN_IF_MCONT_TXIE);
> >>
> >> pleae first perpare your value, then write to hardware.
> > 
> > ditto.
> > 
> >>
> >>> +
> >>> + if (tx_buffer_avail == PCH_RX_OBJ_NUM) /* If points tail of FIFO  */
> >>> +  pch_can_bit_set(&priv->regs->if2_mcont, CAN_IF_MCONT_EOB);
> >>
> >> dito
> >>
> >> Is EOB relevant for TX objects?
> > 
> > This is mistake. No meaning for tx.
> > I will modify.
> > 
> >>
> >>> + pch_can_check_if_busy(&priv->regs->if2_creq, tx_buffer_avail);
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +
> >>> + return NETDEV_TX_OK;
> >>> +}
> >>> +
> >>> +static const struct net_device_ops pch_can_netdev_ops = {
> >>> + .ndo_open  = pch_can_open,
> >>> + .ndo_stop  = pch_close,
> >>> + .ndo_start_xmit  = pch_xmit,
> >>> +};
> >>> +
> >>> +static void __devexit pch_can_remove(struct pci_dev *pdev)
> >>> +{
> >>> + struct net_device *ndev = pci_get_drvdata(pdev);
> >>> + struct pch_can_priv *priv = netdev_priv(ndev);
> >>> +
> >>> + unregister_candev(priv->ndev);
> >>> + pci_iounmap(pdev, priv->regs);
> >>> + if (priv->use_msi)
> >>> +  pci_disable_msi(priv->dev);
> >>> + pci_release_regions(pdev);
> >>> + pci_disable_device(pdev);
> >>> + pci_set_drvdata(pdev, NULL);
> >>> + free_candev(priv->ndev);
> >>> +}
> >>> +
> >>> +#ifdef CONFIG_PM
> >>> +static void pch_can_set_int_custom(struct pch_can_priv *priv)
> >>> +{
> >>> + /* Clearing the IE, SIE and EIE bits of Can control register. */
> >>> + pch_can_bit_clear(&priv->regs->cont, CAN_CTRL_IE_SIE_EIE);
> >>> +
> >>> + /* Appropriately setting them. */
> >>> + pch_can_bit_set(&priv->regs->cont,
> >>> +   ((priv->int_enables & MSK_CTRL_IE_SIE_EIE) << 1));
> >>> +}
> >>> +
> >>> +/* This function retrieves interrupt enabled for the CAN device. */
> >>> +static u32 pch_can_get_int_enables(struct pch_can_priv *priv)
> >>> +{
> >>> + /* Obtaining the status of IE, SIE and EIE interrupt bits. */
> >>> + return (ioread32(&priv->regs->cont) & CAN_CTRL_IE_SIE_EIE) >> 1;
> >>> +}
> >>> +
> >>> +static u32 pch_can_get_rx_enable(struct pch_can_priv *priv, u32 buff_num)
> >>> +{
> >>> + unsigned long flags;
> >>> + u32 enable;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buff_num);
> >>> +
> >>> + if (((ioread32(&priv->regs->if1_id2)) & CAN_ID_MSGVAL) &&
> >>> +   ((ioread32(&priv->regs->if1_mcont)) &
> >>> +   CAN_IF_MCONT_RXIE))
> >>> +  enable = 1;
> >>> + else
> >>> +  enable = 0;
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> + return enable;
> >>> +}
> >>> +
> >>> +static u32 pch_can_get_tx_enable(struct pch_can_priv *priv, u32 buff_num)
> >>> +{
> >>> + unsigned long flags;
> >>> + u32 enable;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> +
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if2_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if2_creq, buff_num);
> >>> + if (((ioread32(&priv->regs->if2_id2)) & CAN_ID_MSGVAL) &&
> >>> +   ((ioread32(&priv->regs->if2_mcont)) &
> >>> +   CAN_IF_MCONT_TXIE)) {
> >>> +  enable = 1;
> >>> + } else {
> >>> +  enable = 0;
> >>> + }
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +
> >>> + return enable;
> >>> +}
> >>> +
> >>> +static void pch_can_set_rx_buffer_link(struct pch_can_priv *priv,
> >>> +           u32 buffer_num, u32 set)
> >>> +{
> >>> + unsigned long flags;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
> >>> + iowrite32(CAN_CMASK_RDWR | CAN_CMASK_CTRL, &priv->regs->if1_cmask);
> >>> + if (set == 1)
> >>> +  pch_can_bit_clear(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
> >>> + else
> >>> +  pch_can_bit_set(&priv->regs->if1_mcont, CAN_IF_MCONT_EOB);
> >>> +
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> +}
> >>> +
> >>> +static u32 pch_can_get_rx_buffer_link(struct pch_can_priv *priv, u32 buffer_num)
> >>> +{
> >>> + unsigned long flags;
> >>> + u32 link;
> >>> +
> >>> + spin_lock_irqsave(&priv->msgif_reg_lock, flags);
> >>> + iowrite32(CAN_CMASK_RX_TX_GET, &priv->regs->if1_cmask);
> >>> + pch_can_check_if_busy(&priv->regs->if1_creq, buffer_num);
> >>> +
> >>> + if (ioread32(&priv->regs->if1_mcont) & CAN_IF_MCONT_EOB)
> >>> +  link = 0;
> >>> + else
> >>> +  link = 1;
> >>> + spin_unlock_irqrestore(&priv->msgif_reg_lock, flags);
> >>> + return link;
> >>> +}
> >>> +
> >>> +static int pch_can_suspend(struct pci_dev *pdev, pm_message_t state)
> >>> +{
> >>> + int i;
> >>> + int retval;
> >>> + u32 buf_stat; /* Variable for reading the transmit buffer status. */
> >>> + u32 counter = COUNTER_LIMIT;
> >>> +
> >>> + struct net_device *dev = pci_get_drvdata(pdev);
> >>> + struct pch_can_priv *priv = netdev_priv(dev);
> >>> +
> >>> + /* Stop the CAN controller */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
> >>> +
> >>> + /* Indicate that we are aboutto/in suspend */
> >>> + priv->can.state = CAN_STATE_STOPPED;
> >>> +
> >>> + /* Waiting for all transmission to complete. */
> >>> + while (counter) {
> >>> +  buf_stat = pch_can_get_buffer_status(priv);
> >>> +  if (!buf_stat)
> >>> +   break;
> >>> +  counter--;
> >>> +  udelay(1);
> >>> + }
> >>> + if (!counter)
> >>> +  dev_err(&pdev->dev, "%s -> Transmission time out.\n", __func__);
> >>> +
> >>> + /* Save interrupt configuration and then disable them */
> >>> + priv->int_enables = pch_can_get_int_enables(priv);
> >>> + pch_can_set_int_enables(priv, PCH_CAN_DISABLE);
> >>> +
> >>> + /* Save Tx buffer enable state */
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++)
> >>> +  priv->tx_enable[i] = pch_can_get_tx_enable(priv, i);
> >>> +
> >>> + /* Disable all Transmit buffers */
> >>> + pch_can_tx_disable_all(priv);
> >>> +
> >>> + /* Save Rx buffer enable state */
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++) {
> >>> +  priv->rx_enable[i] = pch_can_get_rx_enable(priv, i);
> >>> +  priv->rx_link[i] = pch_can_get_rx_buffer_link(priv, i);
> >>> + }
> >>> +
> >>> + /* Disable all Receive buffers */
> >>> + pch_can_rx_disable_all(priv);
> >>> + retval = pci_save_state(pdev);
> >>> + if (retval) {
> >>> +  dev_err(&pdev->dev, "pci_save_state failed.\n");
> >>> + } else {
> >>> +  pci_enable_wake(pdev, PCI_D3hot, 0);
> >>> +  pci_disable_device(pdev);
> >>> +  pci_set_power_state(pdev, pci_choose_state(pdev, state));
> >>> + }
> >>> +
> >>> + return retval;
> >>> +}
> >>> +
> >>> +static int pch_can_resume(struct pci_dev *pdev)
> >>> +{
> >>> + int i;
> >>> + int retval;
> >>> + struct net_device *dev = pci_get_drvdata(pdev);
> >>> + struct pch_can_priv *priv = netdev_priv(dev);
> >>> +
> >>> + pci_set_power_state(pdev, PCI_D0);
> >>> + pci_restore_state(pdev);
> >>> + retval = pci_enable_device(pdev);
> >>> + if (retval) {
> >>> +  dev_err(&pdev->dev, "pci_enable_device failed.\n");
> >>> +  return retval;
> >>> + }
> >>> +
> >>> + pci_enable_wake(pdev, PCI_D3hot, 0);
> >>> +
> >>> + priv->can.state = CAN_STATE_ERROR_ACTIVE;
> >>> +
> >>> + /* Disabling all interrupts. */
> >>> + pch_can_set_int_enables(priv, PCH_CAN_DISABLE);
> >>> +
> >>> + /* Setting the CAN device in Stop Mode. */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
> >>> +
> >>> + /* Configuring the transmit and receive buffers. */
> >>> + pch_can_config_rx_tx_buffers(priv);
> >>> +
> >>> + /* Restore the CAN state */
> >>> + pch_set_bittiming(dev);
> >>> +
> >>> + /* Listen/Active */
> >>> + pch_can_set_optmode(priv);
> >>> +
> >>> + /* Enabling the transmit buffer. */
> >>> + for (i = 1; i <= PCH_RX_OBJ_NUM; i++)
> >>> +  pch_can_set_tx_enable(priv, i, priv->tx_enable[i]);
> >>> +
> >>> + /* Configuring the receive buffer and enabling them. */
> >>> + for (i = PCH_RX_OBJ_NUM + 1; i <= PCH_OBJ_NUM; i++) {
> >>> +  /* Restore buffer link */
> >>> +  pch_can_set_rx_buffer_link(priv, i, priv->rx_link[i]);
> >>> +
> >>> +  /* Restore buffer enables */
> >>> +  pch_can_set_rx_enable(priv, i, priv->rx_enable[i]);
> >>> + }
> >>> +
> >>> + /* Enable CAN Interrupts */
> >>> + pch_can_set_int_custom(priv);
> >>> +
> >>> + /* Restore Run Mode */
> >>> + pch_can_set_run_mode(priv, PCH_CAN_RUN);
> >>> +
> >>> + return retval;
> >>> +}
> >>> +#else
> >>> +#define pch_can_suspend NULL
> >>> +#define pch_can_resume NULL
> >>> +#endif
> >>> +
> >>> +static int pch_can_get_berr_counter(const struct net_device *dev,
> >>> +        struct can_berr_counter *bec)
> >>> +{
> >>> + struct pch_can_priv *priv = netdev_priv(dev);
> >>> +
> >>> + bec->txerr = ioread32(&priv->regs->errc) & CAN_TEC;
> >>> + bec->rxerr = (ioread32(&priv->regs->errc) & CAN_REC) >> 8;
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int __devinit pch_can_probe(struct pci_dev *pdev,
> >>> +       const struct pci_device_id *id)
> >>> +{
> >>> + struct net_device *ndev;
> >>> + struct pch_can_priv *priv;
> >>> + int rc;
> >>> + void __iomem *addr;
> >>> +
> >>> + rc = pci_enable_device(pdev);
> >>> + if (rc) {
> >>> +  dev_err(&pdev->dev, "Failed pci_enable_device %d\n", rc);
> >>> +  goto probe_exit_endev;
> >>> + }
> >>> +
> >>> + rc = pci_request_regions(pdev, KBUILD_MODNAME);
> >>> + if (rc) {
> >>> +  dev_err(&pdev->dev, "Failed pci_request_regions %d\n", rc);
> >>> +  goto probe_exit_pcireq;
> >>> + }
> >>> +
> >>> + addr = pci_iomap(pdev, 1, 0);
> >>> + if (!addr) {
> >>> +  rc = -EIO;
> >>> +  dev_err(&pdev->dev, "Failed pci_iomap\n");
> >>> +  goto probe_exit_ipmap;
> >>> + }
> >>> +
> >>> + ndev = alloc_candev(sizeof(struct pch_can_priv), PCH_TX_OBJ_NUM);
> >>> + if (!ndev) {
> >>> +  rc = -ENOMEM;
> >>> +  dev_err(&pdev->dev, "Failed alloc_candev\n");
> >>> +  goto probe_exit_alloc_candev;
> >>> + }
> >>> +
> >>> + priv = netdev_priv(ndev);
> >>> + priv->ndev = ndev;
> >>> + priv->regs = addr;
> >>> + priv->dev = pdev;
> >>> + priv->can.bittiming_const = &pch_can_bittiming_const;
> >>> + priv->can.do_set_mode = pch_can_do_set_mode;
> >>> + priv->can.do_get_berr_counter = pch_can_get_berr_counter;
> >>> + priv->can.ctrlmode_supported = CAN_CTRLMODE_LISTENONLY |
> >>> +           CAN_CTRLMODE_LOOPBACK;
> >>> + priv->tx_obj = PCH_RX_OBJ_NUM + 1; /* Point head of Tx Obj */
> >>> +
> >>> + ndev->irq = pdev->irq;
> >>> + ndev->flags |= IFF_ECHO;
> >>> +
> >>> + pci_set_drvdata(pdev, ndev);
> >>> + SET_NETDEV_DEV(ndev, &pdev->dev);
> >>> + ndev->netdev_ops = &pch_can_netdev_ops;
> >>> + priv->can.clock.freq = PCH_CAN_CLK; /* Hz */
> >>> +
> >>> + netif_napi_add(ndev, &priv->napi, pch_can_rx_poll, PCH_RX_OBJ_NUM);
> >>> +
> >>> + rc = pci_enable_msi(priv->dev);
> >>> + if (rc) {
> >>> +  dev_info(&ndev->dev, "PCH CAN opened without MSI\n");
> >>> +  priv->use_msi = 0;
> >>> + } else {
> >>> +  dev_info(&ndev->dev, "PCH CAN opened with MSI\n");
> >>> +  priv->use_msi = 1;
> >>> + }
> >>> +
> >>> + rc = register_candev(ndev);
> >>> + if (rc) {
> >>> +  dev_err(&pdev->dev, "Failed register_candev %d\n", rc);
> >>> +  goto probe_exit_reg_candev;
> >>> + }
> >>> +
> >>> + return 0;
> >>> +
> >>> +probe_exit_reg_candev:
> >>> + free_candev(ndev);
> >>> +probe_exit_alloc_candev:
> >>> + pci_iounmap(pdev, addr);
> >>> +probe_exit_ipmap:
> >>> + pci_release_regions(pdev);
> >>> +probe_exit_pcireq:
> >>> + pci_disable_device(pdev);
> >>> +probe_exit_endev:
> >>> + return rc;
> >>> +}
> >>> +
> >>> +static struct pci_driver pch_can_pcidev = {
> >>> + .name = "pch_can",
> >>> + .id_table = pch_pci_tbl,
> >>> + .probe = pch_can_probe,
> >>> + .remove = __devexit_p(pch_can_remove),
> >>> + .suspend = pch_can_suspend,
> >>> + .resume = pch_can_resume,
> >>> +};
> >>> +
> >>> +static int __init pch_can_pci_init(void)
> >>> +{
> >>> + return pci_register_driver(&pch_can_pcidev);
> >>> +}
> >>> +module_init(pch_can_pci_init);
> >>> +
> >>> +static void __exit pch_can_pci_exit(void)
> >>> +{
> >>> + pci_unregister_driver(&pch_can_pcidev);
> >>> +}
> >>> +module_exit(pch_can_pci_exit);
> >>> +
> >>> +MODULE_DESCRIPTION("Intel EG20T PCH CAN(Controller Area Network) Driver");
> >>> +MODULE_LICENSE("GPL v2");
> >>> +MODULE_VERSION("0.94");
> >>
> >> cheers, Marc
> >>
> >> -- 
> >> Pengutronix e.K.                  | Marc Kleine-Budde           |
> >> Industrial Linux Solutions        | Phone: +49-231-2826-924     |
> >> Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
> >> Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
> >>
> >>
> > 
> > Thanks, Tomoya(OKI SEMICONDUCTOR CO., LTD.)
> 
> cheers, Marc
> 
> -- 
> Pengutronix e.K.                  | Marc Kleine-Budde           |
> Industrial Linux Solutions        | Phone: +49-231-2826-924     |
> Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
> Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
> 
>

^ permalink raw reply

* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Marc Kleine-Budde @ 2010-11-09 12:59 UTC (permalink / raw)
  To: Tomoya MORINAGA
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Masayuki Ohtake,
	Samuel Ortiz, margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA, LKML,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Wolfgang Grandegger,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, David S. Miller,
	Christian Pellegrin, qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <00fe01cb8009$62e11410$66f8800a-a06+6cuVnkTSQfdrb5gaxUEOCMrvLtNR@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 3570 bytes --]

On 11/09/2010 01:26 PM, Tomoya MORINAGA wrote:
>>>> Can you please explain me your locking sheme? If I understand the
>>>> documenation correctly the two message interfaces can be used mutual.
>>>> And you use one for rx the other one for tx.
>>>
>>> I show our locking scheme.
>>> When CPU accesses MessageRAM via IF1, CPU protect until read-modify-write
>>> so that IF2 access not occurred, vice versa.
>>
>> Why is that needed?
> 
> For MessageRAM data consistency.

As far as I understand the datasheet the access to IF1 and IF2 is
completely independent. Why do you lock here?

[...]

>>>> Please use just "debug" level not warning here. Consider to use
>>>> netdev_dbg() instead. IMHO the __func__ can be dropped and the
>>>> "official" name for the error is "Error Warning".
>>>
>>> I want to know the reason.
>>> Why is it not dev_warn but netdev_dbg ?
>>
>> If you use warning level it would end up on the console or and in the
>> syslog. It's quite complicated (for programs) to get information from
>> there. This is why we send CAN error frames. They hold the same
>> information but int a binary form, thus it's easier to process.
> 
> I understand the reason.
> BTW, Why do you say not dev_dbg but netdev_dbg ?

Sorry - netdev_dbg() is easier to use, its first argument is the
netdevice, while dev_dbg needs a device and that's deeply hidden in the
netdevice.

[...]

>>>>> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
>>>>> +{
>>>>> + unsigned long flags;
>>>>> + struct pch_can_priv *priv = netdev_priv(ndev);
>>>>> + struct can_frame *cf = (struct can_frame *)skb->data;
>>>>> + int tx_buffer_avail = 0;
>>>>
>>>> What I'm totally missing is the TX flow controll. Your driver has to
>>>> ensure that the package leave the controller in the order that come
>>>> into the xmit function. Further you have to stop your xmit queue if
>>>> you're out of tx objects and reenable if you have a object free.
>>>>
>>>> Use netif_stop_queue() and netif_wake_queue() for this.
>>>
>>> In this code, I think "out of tx objects" cannot be  occurred.
>>
>> It's not a matter of code it's the hardware. You cannot put more than a
>> certain number of CAN frames into the hardware. If you have a CAN bus at
>> a certain speed, you can only send a certain number of CAN frames in a
>> second. So you cannot push more than this amount of frames/s into the
>> hardware.
>>
>>> Nevertheless, are netif_stop_queue() and netif_wake_queue() is necessary ?
>>
>> Yes.
> 
> I can' understand your issue.
> Please can you hear my opinion?
> 
> Please see the head of pch_xmit.
> 
>>> + if (priv->tx_obj == (PCH_OBJ_NUM + 1)) { /* Point tail Obj + 1 */
>>> +  while (ioread32(&priv->regs->treq2) & 0xfc00)
>>> +   udelay(1);
> 
> When points tail of Tx message object,
> this driver waits until completion of all tx messaeg objects.

Looping busy it not an option here.

> Thus, application/driver ought not to be able to put Tx object exceed the number of tx message object.
> Thus I think these code(netif_stop_queue/netif_wake_queue) are completely redundant.

Nope - please remove the waiting completely and convert your flow
control to netif_stop_queue/netif_wake_queue.

cheers, Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-09 13:15 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: Rusty Russell, davem, netdev, yvugenfi
In-Reply-To: <OF5977B5FB.669A3250-ON652577D6.00168221-652577D6.00180738@in.ibm.com>

On Tue, Nov 09, 2010 at 09:56:03AM +0530, Krishna Kumar2 wrote:
> Rusty Russell <rusty@rustcorp.com.au> wrote on 11/08/2010 04:38:47 AM:
> 
> > Re: [PATCH] virtio_net: Fix queue full check
> >
> > On Thu, 4 Nov 2010 10:54:24 pm Michael S. Tsirkin wrote:
> > > I thought about this some more.  I think the original
> > > code is actually correct in returning ENOSPC: indirect
> > > buffers are nice, but it's a mistake
> > > to rely on them as a memory allocation might fail.
> > >
> > > And if you look at virtio-net, it is dropping packets
> > > under memory pressure which is not really a happy outcome:
> > > the packet will get freed, reallocated and we get another one,
> > > adding pressure on the allocator instead of releasing it
> > > until we free up some buffers.
> > >
> > > So I now think we should calculate the capacity
> > > assuming non-indirect entries, and if we manage to
> > > use indirect, all the better.
> >
> > I've long said it's a weakness in the network stack that it insists
> > drivers stop the tx queue before they *might* run out of room, leading to
> > worst-case assumptions and underutilization of the tx ring.
> >
> > However, I lost that debate, and so your patch is the way it's supposed
> to
> > work.  The other main indirect user (block) doesn't care as its queue
> > allows for post-attempt blocking.
> >
> > I enhanced your commentry a little:
> >
> > Subject: virtio: return correct capacity to users
> > Date: Thu, 4 Nov 2010 14:24:24 +0200
> > From: "Michael S. Tsirkin" <mst@redhat.com>
> >
> > We can't rely on indirect buffers for capacity
> > calculations because they need a memory allocation
> > which might fail.  In particular, virtio_net can get
> > into this situation under stress, and it drops packets
> > and performs badly.
> >
> > So return the number of buffers we can guarantee users.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> 
> I have tested this patch for 3-4 hours but so far I have not got the tx
> full
> error. I am not sure if "Tested-By" applies to this situation, but just in
> case:
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> Tested-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> 
> I think both this patch and the original patch I submitted
> are needed? That patch removes ENOMEM check and the increment
> of dev->stats.tx_fifo_errors, and reports "memory failure".
> 
> Thanks,
> 
> - KK

So I think your patch on top of this one would be wrong:
we actually make sure out of memory does not affect TX path
at all, so any error would be unexpected.

Incrementing tx fifo errors is probably also helpful for debugging.

If you like, we could kill the special handling for ENOMEM.
Not sure whether it matters.

-- 
MST

^ permalink raw reply

* Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
From: Michael S. Tsirkin @ 2010-11-09 13:22 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: anthony, arnd, avi, davem, eric.dumazet, kvm, netdev, rusty
In-Reply-To: <OF529E89EE.84DB37B8-ON652577D6.001665CF-652577D6.00192784@in.ibm.com>

On Tue, Nov 09, 2010 at 10:08:21AM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 10/26/2010 02:27:09 PM:
> 
> > Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
> >
> > On Mon, Oct 25, 2010 at 09:20:38PM +0530, Krishna Kumar2 wrote:
> > > > Krishna Kumar2/India/IBM@IBMIN wrote on 10/20/2010 02:24:52 PM:
> > >
> > > Any feedback, comments, objections, issues or bugs about the
> > > patches? Please let me know if something needs to be done.
> > >
> > > Some more test results:
> > > _____________________________________________________
> > >          Host->Guest BW (numtxqs=2)
> > > #       BW%     CPU%    RCPU%   SD%     RSD%
> > > _____________________________________________________
> >
> > I think we discussed the need for external to guest testing
> > over 10G. For large messages we should not see any change
> > but you should be able to get better numbers for small messages
> > assuming a MQ NIC card.
> 
> I had to make a few changes to qemu (and a minor change in macvtap
> driver) to get multiple TXQ support using macvtap working. The NIC
> is a ixgbe card.
> 
> __________________________________________________________________________
>             Org vs New (I/O: 512 bytes, #numtxqs=2, #vhosts=3)
> #      BW1     BW2 (%)       SD1    SD2 (%)        RSD1    RSD2 (%)
> __________________________________________________________________________
> 1      14367   13142 (-8.5)  56     62 (10.7)      8        8 (0)
> 2      3652    3855 (5.5)    37     35 (-5.4)      7        6 (-14.2)
> 4      12529   12059 (-3.7)  65     77 (18.4)      35       35 (0)
> 8      13912   14668 (5.4)   288    332 (15.2)     175      184 (5.1)
> 16     13433   14455 (7.6)   1218   1321 (8.4)     920      943 (2.5)
> 24     12750   13477 (5.7)   2876   2985 (3.7)     2514     2348 (-6.6)
> 32     11729   12632 (7.6)   5299   5332 (.6)      4934     4497 (-8.8)
> 40     11061   11923 (7.7)   8482   8364 (-1.3)    8374     7495 (-10.4)
> 48     10624   11267 (6.0)   12329  12258 (-.5)    12762    11538 (-9.5)
> 64     10524   10596 (.6)    21689  22859 (5.3)    23626    22403 (-5.1)
> 80     9856    10284 (4.3)   35769  36313 (1.5)    39932    36419 (-8.7)
> 96     9691    10075 (3.9)   52357  52259 (-.1)    58676    53463 (-8.8)
> 128    9351    9794 (4.7)    114707 94275 (-17.8)  114050   97337 (-14.6)
> __________________________________________________________________________
> Avg:      BW: (3.3)      SD: (-7.3)      RSD: (-11.0)
> 
> __________________________________________________________________________
>             Org vs New (I/O: 1K, #numtxqs=8, #vhosts=5)
> #      BW1      BW2 (%)       SD1   SD2 (%)        RSD1   RSD2 (%)
> __________________________________________________________________________
> 1      16509    15985 (-3.1)  45    47 (4.4)       7       7 (0)
> 2      6963     4499 (-35.3)  17    51 (200.0)     7       7 (0)
> 4      12932    11080 (-14.3) 49    74 (51.0)      35      35 (0)
> 8      13878    14095 (1.5)   223   292 (30.9)     175     181 (3.4)
> 16     13440    13698 (1.9)   980   1131 (15.4)    926     942 (1.7)
> 24     12680    12927 (1.9)   2387  2463 (3.1)     2526    2342 (-7.2)
> 32     11714    12261 (4.6)   4506  4486 (-.4)     4941    4463 (-9.6)
> 40     11059    11651 (5.3)   7244  7081 (-2.2)    8349    7437 (-10.9)
> 48     10580    11095 (4.8)   10811 10500 (-2.8)   12809   11403 (-10.9)
> 64     10569    10566 (0)     19194 19270 (.3)     23648   21717 (-8.1)
> 80     9827     10753 (9.4)   31668 29425 (-7.0)   39991   33824 (-15.4)
> 96     10043    10150 (1.0)   45352 44227 (-2.4)   57766   51131 (-11.4)
> 128    9360     9979 (6.6)    92058 79198 (-13.9)  114381  92873 (-18.8)
> __________________________________________________________________________
> Avg:      BW: (-.5)      SD: (-7.5)      RSD: (-14.7)
> 
> Is there anything else you would like me to test/change, or shall
> I submit the next version (with the above macvtap changes)?
> 
> Thanks,
> 
> - KK

Something strange here, right?
1. You are consistently getting >10G/s here, and even with a single stream?
2. With 2 streams, is where we get < 10G/s originally. Instead of
   doubling that we get a marginal improvement with 2 queues and
   about 30% worse with 1 queue.

Is your card MQ?

-- 
MST

^ permalink raw reply

* Re: Loopback performance from kernel 2.6.12 to 2.6.37
From: Eric Dumazet @ 2010-11-09 14:06 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: netdev
In-Reply-To: <1289311159.17448.9.camel@traveldev.cxnet.dk>

Le mardi 09 novembre 2010 à 14:59 +0100, Jesper Dangaard Brouer a
écrit :
> On Mon, 2010-11-08 at 16:06 +0100, Eric Dumazet wrote:
> ...
> > > > > > I noticed that the loopback performance has gotten quite bad:
> > > > > > 
> > > > > > http://www.phoronix.com/scan.php?page=article&item=linux_2612_2637&num=6
> > >
> 
> > CC netdev, if you dont mind.
> 
> No problem :-)
> 
> > Their network test is basically :
> > 
> > netcat -l 9999 >/dev/null &
> > time dd if=/dev/zero bs=1M count=10000 | netcat  127.0.0.1 9999
> 
> Should it not be:
>  netcat -l -p 9999 >/dev/null &
> 

It depends on netcat version. On yours, yes, you need the "-p"

> When I run the commands "dd | netcat", netcat never finish/exits, I have
> to press Ctrl-C to stop it.  What am I doing wrong? Any tricks?
> 
> "dd" reports 17.54 sec, and but the "time" measurement cannot be uses as
> the netcat just hangs/waits for more data...
> 

Your netcat version needs a "-c" switch

time dd if=/dev/zero bs=1M count=10000 | netcat -c 127.0.0.1 9999


> time dd if=/dev/zero bs=1M count=10000 | netcat 127.0.0.1 9999
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes (10 GB) copied, 17,5419 s, 598 MB/s
> 
> When I run the commands, I see 261682 context switches per sec...
> 
> This quick test were on a Core i7 920 using kernel
> 2.6.32-rc3-net-next-dev-mp2t.
> 

32 or 64bit kernel ?

Thanks !




^ permalink raw reply

* Re: Loopback performance from kernel 2.6.12 to 2.6.37
From: Jesper Dangaard Brouer @ 2010-11-09 13:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1289228785.2820.203.camel@edumazet-laptop>

On Mon, 2010-11-08 at 16:06 +0100, Eric Dumazet wrote:
...
> > > > > I noticed that the loopback performance has gotten quite bad:
> > > > > 
> > > > > http://www.phoronix.com/scan.php?page=article&item=linux_2612_2637&num=6
> >

> CC netdev, if you dont mind.

No problem :-)

> Their network test is basically :
> 
> netcat -l 9999 >/dev/null &
> time dd if=/dev/zero bs=1M count=10000 | netcat  127.0.0.1 9999

Should it not be:
 netcat -l -p 9999 >/dev/null &

When I run the commands "dd | netcat", netcat never finish/exits, I have
to press Ctrl-C to stop it.  What am I doing wrong? Any tricks?

"dd" reports 17.54 sec, and but the "time" measurement cannot be uses as
the netcat just hangs/waits for more data...

time dd if=/dev/zero bs=1M count=10000 | netcat 127.0.0.1 9999
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 17,5419 s, 598 MB/s

When I run the commands, I see 261682 context switches per sec...

This quick test were on a Core i7 920 using kernel
2.6.32-rc3-net-next-dev-mp2t.

-- 
Jesper Dangaard Brouer
ComX Networks A/S


^ permalink raw reply

* Re: Loopback performance from kernel 2.6.12 to 2.6.37
From: Jesper Dangaard Brouer @ 2010-11-09 14:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1289311159.17448.9.camel@traveldev.cxnet.dk>

On Tue, 2010-11-09 at 14:59 +0100, Jesper Dangaard Brouer wrote:
> On Mon, 2010-11-08 at 16:06 +0100, Eric Dumazet wrote:
> ...
> > > > > > I noticed that the loopback performance has gotten quite bad:
> > > > > > 
> > > > > > http://www.phoronix.com/scan.php?page=article&item=linux_2612_2637&num=6

> > Their network test is basically :
> > 
> > netcat -l 9999 >/dev/null &
> > time dd if=/dev/zero bs=1M count=10000 | netcat  127.0.0.1 9999
> 
> Should it not be:
>  netcat -l -p 9999 >/dev/null &
> 
> When I run the commands "dd | netcat", netcat never finish/exits, I have
> to press Ctrl-C to stop it.  What am I doing wrong? Any tricks?

To fix this I added "-q 0" to netcat.  Thus my working commands are:

 netcat -l -p 9999 >/dev/null &
 time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999

Running this on my "big" 10G testlab system, Dual Xeon 5550 2.67GHz,
kernel version 2.6.32-5-amd64 (which I usually don't use)
The results are 7.487 sec:

time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 7,48562 s, 1,4 GB/s

real    0m7.487s
user    0m0.224s
sys     0m9.785s

Using vmstat I see approx 400000 context switches per sec.

Perf top says:
            samples  pcnt function                  DSO
             _______ _____ _________________________ ___________

             6442.00 16.3% copy_user_generic_string  [kernel]   
             2226.00  5.6% __clear_user              [kernel]   
              912.00  2.3% _spin_lock_irqsave        [kernel]   
              773.00  2.0% _spin_lock_bh             [kernel]   
              736.00  1.9% schedule                  [kernel]   
              582.00  1.5% ipt_do_table              [ip_tables]
              569.00  1.4% _spin_lock                [kernel]   
              505.00  1.3% get_page_from_freelist    [kernel]   
              451.00  1.1% _spin_unlock_irqrestore   [kernel]   
              434.00  1.1% do_select                 [kernel]   
              354.00  0.9% tcp_sendmsg               [kernel]   
              348.00  0.9% tick_nohz_stop_sched_tick [kernel]   
              347.00  0.9% tcp_transmit_skb          [kernel]   
              345.00  0.9% zero_fd_set               [kernel]   

Perf also complains about it cannot resolve netcat debug symbols.


-- 
Jesper Dangaard Brouer
ComX Networks A/S


^ permalink raw reply

* SEASON LOAN!
From: PRIVATE HOME LENDER INC @ 2010-11-09 14:18 UTC (permalink / raw)
  To: aitierfamily

Apply for a loan to establish your business.
Our interest is very affordable and our loan process is very fast as well as
percentage rate of 2.5% yea rly from $5 000.00 Min. To $100 000 000.00 M a x.
Contact us via email if you are interested with the following details


NAME:
PHONE:
DURATION:
ADDRESS:
AMOUNT:

Regards,
Private Money Lender Inc
phl111@kkwl.ac.th.


^ permalink raw reply

* Re: Loopback performance from kernel 2.6.12 to 2.6.37
From: Eric Dumazet @ 2010-11-09 14:25 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: netdev
In-Reply-To: <1289312178.17448.20.camel@traveldev.cxnet.dk>

Le mardi 09 novembre 2010 à 15:16 +0100, Jesper Dangaard Brouer a
écrit :
> On Tue, 2010-11-09 at 14:59 +0100, Jesper Dangaard Brouer wrote:
> > On Mon, 2010-11-08 at 16:06 +0100, Eric Dumazet wrote:
> > ...
> > > > > > > I noticed that the loopback performance has gotten quite bad:
> > > > > > > 
> > > > > > > http://www.phoronix.com/scan.php?page=article&item=linux_2612_2637&num=6
> 
> > > Their network test is basically :
> > > 
> > > netcat -l 9999 >/dev/null &
> > > time dd if=/dev/zero bs=1M count=10000 | netcat  127.0.0.1 9999
> > 
> > Should it not be:
> >  netcat -l -p 9999 >/dev/null &
> > 
> > When I run the commands "dd | netcat", netcat never finish/exits, I have
> > to press Ctrl-C to stop it.  What am I doing wrong? Any tricks?
> 
> To fix this I added "-q 0" to netcat.  Thus my working commands are:
> 
>  netcat -l -p 9999 >/dev/null &
>  time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999
> 
> Running this on my "big" 10G testlab system, Dual Xeon 5550 2.67GHz,
> kernel version 2.6.32-5-amd64 (which I usually don't use)
> The results are 7.487 sec:
> 
> time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes (10 GB) copied, 7,48562 s, 1,4 GB/s
> 
> real    0m7.487s
> user    0m0.224s
> sys     0m9.785s

So far, so good. These are the expected numbers. Now we have to
understand why corei7 gets 38 seconds instead of 8 :)




^ permalink raw reply

* Re: Loopback performance from kernel 2.6.12 to 2.6.37
From: Jesper Dangaard Brouer @ 2010-11-09 14:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1289312178.17448.20.camel@traveldev.cxnet.dk>

On Tue, 2010-11-09 at 15:16 +0100, Jesper Dangaard Brouer wrote:
> On Tue, 2010-11-09 at 14:59 +0100, Jesper Dangaard Brouer wrote:
> > On Mon, 2010-11-08 at 16:06 +0100, Eric Dumazet wrote:
> > ...
> > > > > > > I noticed that the loopback performance has gotten quite bad:
> > > > > > > 
> > > > > > > http://www.phoronix.com/scan.php?page=article&item=linux_2612_2637&num=6
> 
> > > Their network test is basically :
> > > 
> > > netcat -l 9999 >/dev/null &
> > > time dd if=/dev/zero bs=1M count=10000 | netcat  127.0.0.1 9999
> > 
> > Should it not be:
> >  netcat -l -p 9999 >/dev/null &
> > 
> > When I run the commands "dd | netcat", netcat never finish/exits, I have
> > to press Ctrl-C to stop it.  What am I doing wrong? Any tricks?
> 
> To fix this I added "-q 0" to netcat.  Thus my working commands are:
> 
>  netcat -l -p 9999 >/dev/null &
>  time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999
> 
> Running this on my "big" 10G testlab system, Dual Xeon 5550 2.67GHz,
> kernel version 2.6.32-5-amd64 (which I usually don't use)
> The results are 7.487 sec

Using kernel 2.6.35.8-comx01+ (which is 35-stable with some minor
patches of my own) on the same type of hardware (our preprod server).
The result is 12 sec.

time dd if=/dev/zero bs=1M count=10000 | netcat -q0 127.0.0.1 9999
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 12,0805 s, 868 MB/s

real    0m12.082s
user    0m0.311s
sys     0m15.896s

BUT perf top reveals that its probably related to the function
'find_busiest_group' ... any kernel config hints how I get rid of that?


              samples  pcnt function                    DSO
             _______ _____ ___________________________ ______________

             4152.00 12.8% copy_user_generic_string    [kernel]      
             1802.00  5.6% find_busiest_group          [kernel]      
              852.00  2.6% __clear_user                [kernel]      
              836.00  2.6% _raw_spin_lock_bh           [kernel]      
              819.00  2.5% ipt_do_table                [ip_tables]   
              628.00  1.9% rebalance_domains           [kernel]      
              564.00  1.7% _raw_spin_lock              [kernel]      
              562.00  1.7% _raw_spin_lock_irqsave      [kernel]      
              522.00  1.6% schedule                    [kernel]      
              441.00  1.4% find_next_bit               [kernel]      
              413.00  1.3% _raw_spin_unlock_irqrestore [kernel]      
              394.00  1.2% tcp_sendmsg                 [kernel]      
              391.00  1.2% tcp_packet                  [nf_conntrack]
              368.00  1.1% do_select                   [kernel]      



> Using vmstat I see approx 400000 context switches per sec.
>
Previous:
> Perf top says:
>             samples  pcnt function                  DSO
>              _______ _____ _________________________ ___________
> 
>              6442.00 16.3% copy_user_generic_string  [kernel]   
>              2226.00  5.6% __clear_user              [kernel]   
>               912.00  2.3% _spin_lock_irqsave        [kernel]   
>               773.00  2.0% _spin_lock_bh             [kernel]   
>               736.00  1.9% schedule                  [kernel]   
>               582.00  1.5% ipt_do_table              [ip_tables]
>               569.00  1.4% _spin_lock                [kernel]   
>               505.00  1.3% get_page_from_freelist    [kernel]   
>               451.00  1.1% _spin_unlock_irqrestore   [kernel]   
>               434.00  1.1% do_select                 [kernel]   
>               354.00  0.9% tcp_sendmsg               [kernel]   
>               348.00  0.9% tick_nohz_stop_sched_tick [kernel]   
>               347.00  0.9% tcp_transmit_skb          [kernel]   
>               345.00  0.9% zero_fd_set               [kernel]   


-- 
Jesper Dangaard Brouer
ComX Networks A/S


^ permalink raw reply

* Re: [PATCH net-next-2.6 v2] can: Topcliff: PCH_CAN driver: Fix build warnings
From: Marc Kleine-Budde @ 2010-11-09 14:47 UTC (permalink / raw)
  To: Tomoya MORINAGA
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Masayuki Ohtake,
	Samuel Ortiz, margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	Oliver Hartkopp, LKML, socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Wolfgang Grandegger,
	netdev-u79uwXL29TY76Z2rM5mHXA, joel.clark-ral2JQCrhuEAvxtiuMwx3w,
	David S. Miller, Christian Pellegrin,
	qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <00fd01cb8009$5efc5fd0$66f8800a-a06+6cuVnkTSQfdrb5gaxUEOCMrvLtNR@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 2312 bytes --]

On 11/09/2010 01:26 PM, Tomoya MORINAGA wrote:
>> It's just a question if the driver for an Intel Chipset should/could ignore
>> the endian problematic.
>>
>> If this is intended the driver should only appear in Kconfig depending on X86
>> or little endian systems.
> 
> This driver is for only x86 processor.
> I have no intention to support processor except x86.

Fair enough, if you do the endianess conversion correct, either with my
proposed cpu_to_be16 or with the simpler and probably unnoticeable
slower version from Oliver. With Olivers version it's a bit harder to
build a loop that just copies the number of bytes specified in dlc.

>> Besides this remark, the struct pch_can_regs could also be defined in a way
>> that every single CAN payload data byte could be accessed directly to allow
>> e.g. this code in pch_can_rx_normal():
>>
>> cf->data[0] = ioread8(&priv->regs->if1_data0);
>> cf->data[1] = ioread8(&priv->regs->if1_data1);
>> cf->data[2] = ioread8(&priv->regs->if1_data2);
>> (..)
>> cf->data[6] = ioread8(&priv->regs->if1_data6);
>> cf->data[7] = ioread8(&priv->regs->if1_data7);
>>
>> This is easy to understand and additionally endian aware.
> 
> I agree. Thease codes are very simple.
> I will modify like above.
> 
>>
>> In opposite to this:
>>
>> + if (cf->can_dlc > 2) {
>> + u32 data1 = *((u16 *)&cf->data[2]);
>> + iowrite32(data1, &priv->regs->if2_dataa2);
>> + }
>>
>> Which puts a little? endian u16 into the big endian register layout.
>> Sorry i'm just getting curious on this register access implementation.
> 
> I think cf->data is like below
> MSB----------LSB
> D3-D2-D1-D0
> D7-D6-D5-D4

No, cf->data is an array of 8 u8, so it looks this way:

D0 D1 D2 D3  D4 D5 D6 D7

> *((u16 *)&cf->data[2]) means "D3-D2".

No, it's D2 D3. This is why you need the endianess conversion.

> This order coprresponds to register order.
> data register is like below
> MSB----------LSB
> **-**-D3-D2
> 
> ** means reserved area.

cheers, Marc


-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* Re: Netlink limitations
From: Thomas Graf @ 2010-11-09 14:49 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jan Engelhardt, David S. Miller, pablo, netdev
In-Reply-To: <4CD91406.4000603@trash.net>

On Tue, Nov 09, 2010 at 10:27:34AM +0100, Patrick McHardy wrote:
> Am 08.11.2010 16:16, schrieb Thomas Graf:
> > On Sun, Nov 07, 2010 at 06:17:43PM +0100, Patrick McHardy wrote:
> >> On 07.11.2010 17:44, Jan Engelhardt wrote:
> >>> ...
> > 
> >> That's somewhat similar to the nlattr32 idea, but a length of 0 makes
> >> more sense since that's currently not used. In that case the length
> >> would be read from a second length field which has 32 bits.
> > 
> > I agree. This makes perfect sense. I was just thinking wheter we want
> > to encode more information into the attribute header if we extend it
> > anyways.
> 
> What kind of additional information are you thinking about?

We have tried to come up with ways of forwarding netlink messages to
other machines several times. It always failed due to the fact that
protocols encode attributes/data differently without having the
ability to specify the encoding. E.g. we have nfnetlink encoding
everything in network byte order, most others encode it in host byte
order. Therefore all parsers must be aware of all protocols if they
want to forward/do something with the data. A flag would help there.

I haven't given up on the idea of self describing netlink protocols
yet. For example we could encode the attribute type
(i8|u16|u32|u16|string) in additional to the existing nested attribute
flag.

Given this additional meta data to each attribute and given we limit
ourselves to only using strict netlink attribtes going forward, i.e.
we no longer encode whole structs in attributes we'd be given netlink
protocols which can be forwarded, saved, audited while still being
fairly efficient.

^ permalink raw reply

* Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
From: Krishna Kumar2 @ 2010-11-09 15:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: anthony, arnd, avi, davem, eric.dumazet, kvm, netdev, rusty
In-Reply-To: <20101109132239.GF22705@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/09/2010 06:52:39 PM:

> > > Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
> > >
> > > On Mon, Oct 25, 2010 at 09:20:38PM +0530, Krishna Kumar2 wrote:
> > > > > Krishna Kumar2/India/IBM@IBMIN wrote on 10/20/2010 02:24:52 PM:
> > > >
> > > > Any feedback, comments, objections, issues or bugs about the
> > > > patches? Please let me know if something needs to be done.
> > > >
> > > > Some more test results:
> > > > _____________________________________________________
> > > >          Host->Guest BW (numtxqs=2)
> > > > #       BW%     CPU%    RCPU%   SD%     RSD%
> > > > _____________________________________________________
> > >
> > > I think we discussed the need for external to guest testing
> > > over 10G. For large messages we should not see any change
> > > but you should be able to get better numbers for small messages
> > > assuming a MQ NIC card.
> >
> > I had to make a few changes to qemu (and a minor change in macvtap
> > driver) to get multiple TXQ support using macvtap working. The NIC
> > is a ixgbe card.
> >
> >
__________________________________________________________________________
> >             Org vs New (I/O: 512 bytes, #numtxqs=2, #vhosts=3)
> > #      BW1     BW2 (%)       SD1    SD2 (%)        RSD1    RSD2 (%)
> >
__________________________________________________________________________
> > 1      14367   13142 (-8.5)  56     62 (10.7)      8        8 (0)
> > 2      3652    3855 (5.5)    37     35 (-5.4)      7        6 (-14.2)
> > 4      12529   12059 (-3.7)  65     77 (18.4)      35       35 (0)
> > 8      13912   14668 (5.4)   288    332 (15.2)     175      184 (5.1)
> > 16     13433   14455 (7.6)   1218   1321 (8.4)     920      943 (2.5)
> > 24     12750   13477 (5.7)   2876   2985 (3.7)     2514     2348 (-6.6)
> > 32     11729   12632 (7.6)   5299   5332 (.6)      4934     4497 (-8.8)
> > 40     11061   11923 (7.7)   8482   8364 (-1.3)    8374     7495
(-10.4)
> > 48     10624   11267 (6.0)   12329  12258 (-.5)    12762    11538
(-9.5)
> > 64     10524   10596 (.6)    21689  22859 (5.3)    23626    22403
(-5.1)
> > 80     9856    10284 (4.3)   35769  36313 (1.5)    39932    36419
(-8.7)
> > 96     9691    10075 (3.9)   52357  52259 (-.1)    58676    53463
(-8.8)
> > 128    9351    9794 (4.7)    114707 94275 (-17.8)  114050   97337
(-14.6)
> >
__________________________________________________________________________
> > Avg:      BW: (3.3)      SD: (-7.3)      RSD: (-11.0)
> >
> >
__________________________________________________________________________
> >             Org vs New (I/O: 1K, #numtxqs=8, #vhosts=5)
> > #      BW1      BW2 (%)       SD1   SD2 (%)        RSD1   RSD2 (%)
> >
__________________________________________________________________________
> > 1      16509    15985 (-3.1)  45    47 (4.4)       7       7 (0)
> > 2      6963     4499 (-35.3)  17    51 (200.0)     7       7 (0)
> > 4      12932    11080 (-14.3) 49    74 (51.0)      35      35 (0)
> > 8      13878    14095 (1.5)   223   292 (30.9)     175     181 (3.4)
> > 16     13440    13698 (1.9)   980   1131 (15.4)    926     942 (1.7)
> > 24     12680    12927 (1.9)   2387  2463 (3.1)     2526    2342 (-7.2)
> > 32     11714    12261 (4.6)   4506  4486 (-.4)     4941    4463 (-9.6)
> > 40     11059    11651 (5.3)   7244  7081 (-2.2)    8349    7437 (-10.9)
> > 48     10580    11095 (4.8)   10811 10500 (-2.8)   12809   11403
(-10.9)
> > 64     10569    10566 (0)     19194 19270 (.3)     23648   21717 (-8.1)
> > 80     9827     10753 (9.4)   31668 29425 (-7.0)   39991   33824
(-15.4)
> > 96     10043    10150 (1.0)   45352 44227 (-2.4)   57766   51131
(-11.4)
> > 128    9360     9979 (6.6)    92058 79198 (-13.9)  114381  92873
(-18.8)
> >
__________________________________________________________________________
> > Avg:      BW: (-.5)      SD: (-7.5)      RSD: (-14.7)
> >
> > Is there anything else you would like me to test/change, or shall
> > I submit the next version (with the above macvtap changes)?
> >
> > Thanks,
> >
> > - KK
>
> Something strange here, right?
> 1. You are consistently getting >10G/s here, and even with a single
stream?

Sorry, I should have mentioned this though I had stated in my
earlier mails. Each test result has two iterations, each of 60
seconds, except when #netperfs is 1 for which I do 10 iteration
(sum across 10 iterations).  I started doing many more iterations
for 1 netperf after finding the issue earlier with single stream.
So the BW is only 4.5-7 Gbps.

> 2. With 2 streams, is where we get < 10G/s originally. Instead of
>    doubling that we get a marginal improvement with 2 queues and
>    about 30% worse with 1 queue.

(doubling happens consistently for guest -> host, but never for
remote host) I tried 512/txqs=2 and 1024/txqs=8 to get a varied
testing scenario. In first case, there is a slight improvement in
BW and good reduction in SD. In the second case, only SD improves
(though BW drops for 2 stream for some reason).  In both cases,
BW and SD improves as the number of sessions increase.

> Is your card MQ?

Yes, the card is MQ. ixgbe 10g card.

Thanks,

- KK


^ permalink raw reply

* Re: [PATCH] virtio_net: Fix queue full check
From: Krishna Kumar2 @ 2010-11-09 15:30 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: davem, netdev, Rusty Russell, yvugenfi
In-Reply-To: <20101109131555.GE22705@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/09/2010 06:45:55 PM:

> Re: [PATCH] virtio_net: Fix queue full check
>
> On Tue, Nov 09, 2010 at 09:56:03AM +0530, Krishna Kumar2 wrote:
> > Rusty Russell <rusty@rustcorp.com.au> wrote on 11/08/2010 04:38:47 AM:
> >
> > > Re: [PATCH] virtio_net: Fix queue full check
> > >
> > > On Thu, 4 Nov 2010 10:54:24 pm Michael S. Tsirkin wrote:
> > > > I thought about this some more.  I think the original
> > > > code is actually correct in returning ENOSPC: indirect
> > > > buffers are nice, but it's a mistake
> > > > to rely on them as a memory allocation might fail.
> > > >
> > > > And if you look at virtio-net, it is dropping packets
> > > > under memory pressure which is not really a happy outcome:
> > > > the packet will get freed, reallocated and we get another one,
> > > > adding pressure on the allocator instead of releasing it
> > > > until we free up some buffers.
> > > >
> > > > So I now think we should calculate the capacity
> > > > assuming non-indirect entries, and if we manage to
> > > > use indirect, all the better.
> > >
> > > I've long said it's a weakness in the network stack that it insists
> > > drivers stop the tx queue before they *might* run out of room,
leading to
> > > worst-case assumptions and underutilization of the tx ring.
> > >
> > > However, I lost that debate, and so your patch is the way it's
supposed
> > to
> > > work.  The other main indirect user (block) doesn't care as its queue
> > > allows for post-attempt blocking.
> > >
> > > I enhanced your commentry a little:
> > >
> > > Subject: virtio: return correct capacity to users
> > > Date: Thu, 4 Nov 2010 14:24:24 +0200
> > > From: "Michael S. Tsirkin" <mst@redhat.com>
> > >
> > > We can't rely on indirect buffers for capacity
> > > calculations because they need a memory allocation
> > > which might fail.  In particular, virtio_net can get
> > > into this situation under stress, and it drops packets
> > > and performs badly.
> > >
> > > So return the number of buffers we can guarantee users.
> > >
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> >
> > I have tested this patch for 3-4 hours but so far I have not got the tx
> > full
> > error. I am not sure if "Tested-By" applies to this situation, but just
in
> > case:
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> > Tested-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> >
> > I think both this patch and the original patch I submitted
> > are needed? That patch removes ENOMEM check and the increment
> > of dev->stats.tx_fifo_errors, and reports "memory failure".
> >
> > Thanks,
> >
> > - KK
>
> So I think your patch on top of this one would be wrong:
> we actually make sure out of memory does not affect TX path
> at all, so any error would be unexpected.
>
> Incrementing tx fifo errors is probably also helpful for debugging.
>
> If you like, we could kill the special handling for ENOMEM.
> Not sure whether it matters.

Since that is dead code, we could remove it (and the fifo error
should disappear too - tx_dropped should be the only counter to
be incremented?). Sorry if I misunderstood something.

Thanks,

- KK


^ permalink raw reply

* Re: [PATCH] virtio_net: Fix queue full check
From: Michael S. Tsirkin @ 2010-11-09 15:30 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: davem, netdev, Rusty Russell, yvugenfi
In-Reply-To: <OF5BF09BF3.7DE39268-ON652577D6.004B4830-652577D6.0054E713@in.ibm.com>

On Tue, Nov 09, 2010 at 09:00:58PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 11/09/2010 06:45:55 PM:
> 
> > Re: [PATCH] virtio_net: Fix queue full check
> >
> > On Tue, Nov 09, 2010 at 09:56:03AM +0530, Krishna Kumar2 wrote:
> > > Rusty Russell <rusty@rustcorp.com.au> wrote on 11/08/2010 04:38:47 AM:
> > >
> > > > Re: [PATCH] virtio_net: Fix queue full check
> > > >
> > > > On Thu, 4 Nov 2010 10:54:24 pm Michael S. Tsirkin wrote:
> > > > > I thought about this some more.  I think the original
> > > > > code is actually correct in returning ENOSPC: indirect
> > > > > buffers are nice, but it's a mistake
> > > > > to rely on them as a memory allocation might fail.
> > > > >
> > > > > And if you look at virtio-net, it is dropping packets
> > > > > under memory pressure which is not really a happy outcome:
> > > > > the packet will get freed, reallocated and we get another one,
> > > > > adding pressure on the allocator instead of releasing it
> > > > > until we free up some buffers.
> > > > >
> > > > > So I now think we should calculate the capacity
> > > > > assuming non-indirect entries, and if we manage to
> > > > > use indirect, all the better.
> > > >
> > > > I've long said it's a weakness in the network stack that it insists
> > > > drivers stop the tx queue before they *might* run out of room,
> leading to
> > > > worst-case assumptions and underutilization of the tx ring.
> > > >
> > > > However, I lost that debate, and so your patch is the way it's
> supposed
> > > to
> > > > work.  The other main indirect user (block) doesn't care as its queue
> > > > allows for post-attempt blocking.
> > > >
> > > > I enhanced your commentry a little:
> > > >
> > > > Subject: virtio: return correct capacity to users
> > > > Date: Thu, 4 Nov 2010 14:24:24 +0200
> > > > From: "Michael S. Tsirkin" <mst@redhat.com>
> > > >
> > > > We can't rely on indirect buffers for capacity
> > > > calculations because they need a memory allocation
> > > > which might fail.  In particular, virtio_net can get
> > > > into this situation under stress, and it drops packets
> > > > and performs badly.
> > > >
> > > > So return the number of buffers we can guarantee users.
> > > >
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > > Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> > >
> > > I have tested this patch for 3-4 hours but so far I have not got the tx
> > > full
> > > error. I am not sure if "Tested-By" applies to this situation, but just
> in
> > > case:
> > >
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> > > Tested-By: Krishna Kumar2 <krkumar2@in.ibm.com>
> > >
> > > I think both this patch and the original patch I submitted
> > > are needed? That patch removes ENOMEM check and the increment
> > > of dev->stats.tx_fifo_errors, and reports "memory failure".
> > >
> > > Thanks,
> > >
> > > - KK
> >
> > So I think your patch on top of this one would be wrong:
> > we actually make sure out of memory does not affect TX path
> > at all, so any error would be unexpected.
> >
> > Incrementing tx fifo errors is probably also helpful for debugging.
> >
> > If you like, we could kill the special handling for ENOMEM.
> > Not sure whether it matters.
> 
> Since that is dead code, we could remove it (and the fifo error
> should disappear too - tx_dropped should be the only counter to
> be incremented?). Sorry if I misunderstood something.
> 
> Thanks,
> 
> - KK

It's just a sanity check. The capacity checking is tricky enough
that I'm happier with some way to detect overflow both from ifconfig
and dmesg.

I don't really care which counter gets incremented really,
since this is some TX bug fifo error seems appropriate
but I don't really care much.


-- 
MST

^ permalink raw reply

* Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
From: Michael S. Tsirkin @ 2010-11-09 15:33 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: anthony, arnd, avi, davem, eric.dumazet, kvm, netdev, rusty
In-Reply-To: <OF24E08752.2087FFA4-ON652577D6.00532DF1-652577D6.0054B291@in.ibm.com>

On Tue, Nov 09, 2010 at 08:58:44PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 11/09/2010 06:52:39 PM:
> 
> > > > Re: [v3 RFC PATCH 0/4] Implement multiqueue virtio-net
> > > >
> > > > On Mon, Oct 25, 2010 at 09:20:38PM +0530, Krishna Kumar2 wrote:
> > > > > > Krishna Kumar2/India/IBM@IBMIN wrote on 10/20/2010 02:24:52 PM:
> > > > >
> > > > > Any feedback, comments, objections, issues or bugs about the
> > > > > patches? Please let me know if something needs to be done.
> > > > >
> > > > > Some more test results:
> > > > > _____________________________________________________
> > > > >          Host->Guest BW (numtxqs=2)
> > > > > #       BW%     CPU%    RCPU%   SD%     RSD%
> > > > > _____________________________________________________
> > > >
> > > > I think we discussed the need for external to guest testing
> > > > over 10G. For large messages we should not see any change
> > > > but you should be able to get better numbers for small messages
> > > > assuming a MQ NIC card.
> > >
> > > I had to make a few changes to qemu (and a minor change in macvtap
> > > driver) to get multiple TXQ support using macvtap working. The NIC
> > > is a ixgbe card.
> > >
> > >
> __________________________________________________________________________
> > >             Org vs New (I/O: 512 bytes, #numtxqs=2, #vhosts=3)
> > > #      BW1     BW2 (%)       SD1    SD2 (%)        RSD1    RSD2 (%)
> > >
> __________________________________________________________________________
> > > 1      14367   13142 (-8.5)  56     62 (10.7)      8        8 (0)
> > > 2      3652    3855 (5.5)    37     35 (-5.4)      7        6 (-14.2)
> > > 4      12529   12059 (-3.7)  65     77 (18.4)      35       35 (0)
> > > 8      13912   14668 (5.4)   288    332 (15.2)     175      184 (5.1)
> > > 16     13433   14455 (7.6)   1218   1321 (8.4)     920      943 (2.5)
> > > 24     12750   13477 (5.7)   2876   2985 (3.7)     2514     2348 (-6.6)
> > > 32     11729   12632 (7.6)   5299   5332 (.6)      4934     4497 (-8.8)
> > > 40     11061   11923 (7.7)   8482   8364 (-1.3)    8374     7495
> (-10.4)
> > > 48     10624   11267 (6.0)   12329  12258 (-.5)    12762    11538
> (-9.5)
> > > 64     10524   10596 (.6)    21689  22859 (5.3)    23626    22403
> (-5.1)
> > > 80     9856    10284 (4.3)   35769  36313 (1.5)    39932    36419
> (-8.7)
> > > 96     9691    10075 (3.9)   52357  52259 (-.1)    58676    53463
> (-8.8)
> > > 128    9351    9794 (4.7)    114707 94275 (-17.8)  114050   97337
> (-14.6)
> > >
> __________________________________________________________________________
> > > Avg:      BW: (3.3)      SD: (-7.3)      RSD: (-11.0)
> > >
> > >
> __________________________________________________________________________
> > >             Org vs New (I/O: 1K, #numtxqs=8, #vhosts=5)
> > > #      BW1      BW2 (%)       SD1   SD2 (%)        RSD1   RSD2 (%)
> > >
> __________________________________________________________________________
> > > 1      16509    15985 (-3.1)  45    47 (4.4)       7       7 (0)
> > > 2      6963     4499 (-35.3)  17    51 (200.0)     7       7 (0)
> > > 4      12932    11080 (-14.3) 49    74 (51.0)      35      35 (0)
> > > 8      13878    14095 (1.5)   223   292 (30.9)     175     181 (3.4)
> > > 16     13440    13698 (1.9)   980   1131 (15.4)    926     942 (1.7)
> > > 24     12680    12927 (1.9)   2387  2463 (3.1)     2526    2342 (-7.2)
> > > 32     11714    12261 (4.6)   4506  4486 (-.4)     4941    4463 (-9.6)
> > > 40     11059    11651 (5.3)   7244  7081 (-2.2)    8349    7437 (-10.9)
> > > 48     10580    11095 (4.8)   10811 10500 (-2.8)   12809   11403
> (-10.9)
> > > 64     10569    10566 (0)     19194 19270 (.3)     23648   21717 (-8.1)
> > > 80     9827     10753 (9.4)   31668 29425 (-7.0)   39991   33824
> (-15.4)
> > > 96     10043    10150 (1.0)   45352 44227 (-2.4)   57766   51131
> (-11.4)
> > > 128    9360     9979 (6.6)    92058 79198 (-13.9)  114381  92873
> (-18.8)
> > >
> __________________________________________________________________________
> > > Avg:      BW: (-.5)      SD: (-7.5)      RSD: (-14.7)
> > >
> > > Is there anything else you would like me to test/change, or shall
> > > I submit the next version (with the above macvtap changes)?
> > >
> > > Thanks,
> > >
> > > - KK
> >
> > Something strange here, right?
> > 1. You are consistently getting >10G/s here, and even with a single
> stream?
> 
> Sorry, I should have mentioned this though I had stated in my
> earlier mails. Each test result has two iterations, each of 60
> seconds, except when #netperfs is 1 for which I do 10 iteration
> (sum across 10 iterations).

So need to divide the number by 10?

>  I started doing many more iterations
> for 1 netperf after finding the issue earlier with single stream.
> So the BW is only 4.5-7 Gbps.
> 
> > 2. With 2 streams, is where we get < 10G/s originally. Instead of
> >    doubling that we get a marginal improvement with 2 queues and
> >    about 30% worse with 1 queue.
> 
> (doubling happens consistently for guest -> host, but never for
> remote host) I tried 512/txqs=2 and 1024/txqs=8 to get a varied
> testing scenario. In first case, there is a slight improvement in
> BW and good reduction in SD. In the second case, only SD improves
> (though BW drops for 2 stream for some reason).  In both cases,
> BW and SD improves as the number of sessions increase.

I guess this is another indication that something's wrong.
We are quite far from line rate, the fact BW does not scale
means there's some contention in the code.

> > Is your card MQ?
> 
> Yes, the card is MQ. ixgbe 10g card.
> 
> Thanks,
> 
> - KK

^ permalink raw reply

* Re: ip_summed setting for TCP pure-ACK packets
From: Ben Hutchings @ 2010-11-09 16:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, sf-linux-drivers, netdev
In-Reply-To: <1289235366.15376.3.camel@edumazet-laptop>

On Mon, 2010-11-08 at 17:56 +0100, Eric Dumazet wrote: 
> Le lundi 08 novembre 2010 à 16:35 +0000, Ben Hutchings a écrit :
> > As we discussed at LPC:
> > 
> > Current controllers handled by the sfc driver have a per-queue (rather
> > than per-packet) option for checksum generation.  Currently pure-ACK
> > packets sent by TCP have ip_summed == CHECKSUM_NONE and we must send
> > them on hardware queues with checksum generation disabled.  To support
> > this, we allocate 2 hardware queues per core TX queue.
> > 
> > To reduce the risk of reordering (and possibly the number of hardware TX
> > queues required), it would be helpful for TCP to set ip_summed ==
> > CHECKSUM_PARTIAL on pure-ACK packets when the output device is known to
> > support checksum generation.
> > 
> > Ben.
> > 
> 
> Hmm
> 
> Do you mean commit 2e8e18ef52e7dd1af0a3bd1 is not enough ?

It might well be... I must admit I hadn't thought to check whether this
issue had gone away.

Yes, that does the trick.  Sorry for wasting people's time on this.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



^ permalink raw reply

* Re: [PATCH] inet: fix ip_mc_drop_socket()
From: David Miller @ 2010-11-09 16:26 UTC (permalink / raw)
  To: eric.dumazet
  Cc: markus, paulmck, miles.lane, ilpo.jarvinen, linux-kernel, lenb,
	netdev
In-Reply-To: <1289250954.2790.11.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 08 Nov 2010 22:15:54 +0100

> Hmm, I believe I found the bug.
> 
> Thanks guys !
> 
> [PATCH] inet: fix ip_mc_drop_socket()
> 
> commit 8723e1b4ad9be4444 (inet: RCU changes in inetdev_by_index())
> forgot one call site in ip_mc_drop_socket()
> 
> We should not decrease idev refcount after inetdev_by_index() call,
> since refcount is not increased anymore.
> 
> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
> Reported-by: Miles Lane <miles.lane@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks a lot!

^ permalink raw reply

* Re: [PATCH 1/2] r8169: revert "Handle rxfifo errors on 8168 chips"
From: David Miller @ 2010-11-09 16:27 UTC (permalink / raw)
  To: romieu; +Cc: netdev, a.radke, mjg, daniel.blueman
In-Reply-To: <20101108232305.GA13720@electric-eye.fr.zoreil.com>

From: Francois Romieu <romieu@fr.zoreil.com>
Date: Tue, 9 Nov 2010 00:23:05 +0100

> The original patch helps under obscure conditions (no pun) but
> some 8168 do not like it. The change needs to be tightened with
> a specific 8168 version.
> 
> This reverts commit 801e147cde02f04b5c2f42764cd43a89fc7400a2.
> 
> Regression at https://bugzilla.kernel.org/show_bug.cgi?id=20882
> 
> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
> Tested-by: Andreas Radke <a.radke@arcor.de>
> Cc: Matthew Garrett <mjg@redhat.com>
> Cc: Daniel J Blueman <daniel.blueman@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/2] r8169: fix sleeping while holding spinlock.
From: David Miller @ 2010-11-09 16:27 UTC (permalink / raw)
  To: romieu; +Cc: netdev, daniel.blueman, rjw
In-Reply-To: <20101108232358.GB13720@electric-eye.fr.zoreil.com>

From: Francois Romieu <romieu@fr.zoreil.com>
Date: Tue, 9 Nov 2010 00:23:58 +0100

> As device_set_wakeup_enable can now sleep, move the call to outside
> the critical section.
> 
> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>

Applied.

^ permalink raw reply

* Re: ip_summed setting for TCP pure-ACK packets
From: Eric Dumazet @ 2010-11-09 16:28 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, sf-linux-drivers, netdev
In-Reply-To: <1289319889.2238.195.camel@achroite.uk.solarflarecom.com>

Le mardi 09 novembre 2010 à 16:24 +0000, Ben Hutchings a écrit :
> On Mon, 2010-11-08 at 17:56 +0100, Eric Dumazet wrote: 
> > Le lundi 08 novembre 2010 à 16:35 +0000, Ben Hutchings a écrit :
> > > As we discussed at LPC:
> > > 
> > > Current controllers handled by the sfc driver have a per-queue (rather
> > > than per-packet) option for checksum generation.  Currently pure-ACK
> > > packets sent by TCP have ip_summed == CHECKSUM_NONE and we must send
> > > them on hardware queues with checksum generation disabled.  To support
> > > this, we allocate 2 hardware queues per core TX queue.
> > > 
> > > To reduce the risk of reordering (and possibly the number of hardware TX
> > > queues required), it would be helpful for TCP to set ip_summed ==
> > > CHECKSUM_PARTIAL on pure-ACK packets when the output device is known to
> > > support checksum generation.
> > > 
> > > Ben.
> > > 
> > 
> > Hmm
> > 
> > Do you mean commit 2e8e18ef52e7dd1af0a3bd1 is not enough ?
> 
> It might well be... I must admit I hadn't thought to check whether this
> issue had gone away.
> 
> Yes, that does the trick.  Sorry for wasting people's time on this.
> 

You're welcome ;)




^ permalink raw reply

* [PATCH] net: replace min with min_t in pktgen_if_write
From: Davidlohr Bueso @ 2010-11-09 16:37 UTC (permalink / raw)
  To: David Miller, Robert Olsson; +Cc: LKML, Network Development

From: Davidlohr Bueso <dave@gnu.org>

This addresses the following compiler (gcc 4.4.5) warning:

  CC [M]  net/core/pktgen.o
net/core/pktgen.c: In function ‘pktgen_if_write’:
net/core/pktgen.c:890: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
---
 net/core/pktgen.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index fbce4b0..1992cd0 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -887,7 +887,7 @@ static ssize_t pktgen_if_write(struct file *file,
 	i += len;
 
 	if (debug) {
-		size_t copy = min(count, 1023);
+		size_t copy = min_t(size_t, count, 1023);
 		char tb[copy + 1];
 		if (copy_from_user(tb, user_buffer, copy))
 			return -EFAULT;
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH] net: replace min with min_t in pktgen_if_write
From: David Miller @ 2010-11-09 16:40 UTC (permalink / raw)
  To: dave; +Cc: robert.olsson, linux-kernel, netdev
In-Reply-To: <1289320633.2260.12.camel@cowboy>

From: Davidlohr Bueso <dave@gnu.org>
Date: Tue, 09 Nov 2010 13:37:13 -0300

> From: Davidlohr Bueso <dave@gnu.org>
> 
> This addresses the following compiler (gcc 4.4.5) warning:
> 
>   CC [M]  net/core/pktgen.o
> net/core/pktgen.c: In function ‘pktgen_if_write’:
> net/core/pktgen.c:890: warning: comparison of distinct pointer types lacks a cast
> 
> Signed-off-by: Davidlohr Bueso <dave@gnu.org>

Already fixed in net-2.6:

commit 86c2c0a8a4965ae5cbc0ff97ed39a4472e8e9b23
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Sat Nov 6 20:11:38 2010 +0000

    NET: pktgen - fix compile warning
    
    This should fix the following warning:
    
    net/core/pktgen.c: In function ‘pktgen_if_write’:
    net/core/pktgen.c:890: warning: comparison of distinct pointer types lacks a cast
    
    Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
    Reviewed-by: Nelson Elhage <nelhage@ksplice.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index fbce4b0..1992cd0 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -887,7 +887,7 @@ static ssize_t pktgen_if_write(struct file *file,
 	i += len;
 
 	if (debug) {
-		size_t copy = min(count, 1023);
+		size_t copy = min_t(size_t, count, 1023);
 		char tb[copy + 1];
 		if (copy_from_user(tb, user_buffer, copy))
 			return -EFAULT;

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox