Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 4/8] trivial: fix some typos and punctuation in comments
From: Ivo van Doorn @ 2009-10-15 20:07 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: trivial, linville, johannes, users, linux-kernel, linux-wireless,
	netdev
In-Reply-To: <1255636103-5817-1-git-send-email-cascardo@holoscopio.com>

On Thursday 15 October 2009, Thadeu Lima de Souza Cascardo wrote:
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>

Acked-by: Ivo van Doorn <IvDoorn@gmail.com>

> ---
>  drivers/net/wireless/rt2x00/rt61pci.c |   36 ++++++++++++++++----------------
>  1 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/wireless/rt2x00/rt61pci.c b/drivers/net/wireless/rt2x00/rt61pci.c
> index b20e3ea..343e565 100644
> --- a/drivers/net/wireless/rt2x00/rt61pci.c
> +++ b/drivers/net/wireless/rt2x00/rt61pci.c
> @@ -51,7 +51,7 @@ MODULE_PARM_DESC(nohwcrypt, "Disable hardware encryption.");
>   * These indirect registers work with busy bits,
>   * and we will try maximal REGISTER_BUSY_COUNT times to access
>   * the register while taking a REGISTER_BUSY_DELAY us delay
> - * between each attampt. When the busy bit is still set at that time,
> + * between each attempt. When the busy bit is still set at that time,
>   * the access attempt is considered to have failed,
>   * and we will print an error.
>   */
> @@ -386,7 +386,7 @@ static int rt61pci_config_shared_key(struct rt2x00_dev *rt2x00dev,
>  		 * The driver does not support the IV/EIV generation
>  		 * in hardware. However it doesn't support the IV/EIV
>  		 * inside the ieee80211 frame either, but requires it
> -		 * to be provided seperately for the descriptor.
> +		 * to be provided separately for the descriptor.
>  		 * rt2x00lib will cut the IV/EIV data out of all frames
>  		 * given to us by mac80211, but we must tell mac80211
>  		 * to generate the IV/EIV data.
> @@ -397,7 +397,7 @@ static int rt61pci_config_shared_key(struct rt2x00_dev *rt2x00dev,
>  	/*
>  	 * SEC_CSR0 contains only single-bit fields to indicate
>  	 * a particular key is valid. Because using the FIELD32()
> -	 * defines directly will cause a lot of overhead we use
> +	 * defines directly will cause a lot of overhead, we use
>  	 * a calculation to determine the correct bit directly.
>  	 */
>  	mask = 1 << key->hw_key_idx;
> @@ -425,11 +425,11 @@ static int rt61pci_config_pairwise_key(struct rt2x00_dev *rt2x00dev,
>  		/*
>  		 * rt2x00lib can't determine the correct free
>  		 * key_idx for pairwise keys. We have 2 registers
> -		 * with key valid bits. The goal is simple, read
> -		 * the first register, if that is full move to
> +		 * with key valid bits. The goal is simple: read
> +		 * the first register. If that is full, move to
>  		 * the next register.
> -		 * When both registers are full, we drop the key,
> -		 * otherwise we use the first invalid entry.
> +		 * When both registers are full, we drop the key.
> +		 * Otherwise, we use the first invalid entry.
>  		 */
>  		rt2x00pci_register_read(rt2x00dev, SEC_CSR2, &reg);
>  		if (reg && reg == ~0) {
> @@ -464,8 +464,8 @@ static int rt61pci_config_pairwise_key(struct rt2x00_dev *rt2x00dev,
>  					      &addr_entry, sizeof(addr_entry));
>  
>  		/*
> -		 * Enable pairwise lookup table for given BSS idx,
> -		 * without this received frames will not be decrypted
> +		 * Enable pairwise lookup table for given BSS idx.
> +		 * Without this, received frames will not be decrypted
>  		 * by the hardware.
>  		 */
>  		rt2x00pci_register_read(rt2x00dev, SEC_CSR4, &reg);
> @@ -487,7 +487,7 @@ static int rt61pci_config_pairwise_key(struct rt2x00_dev *rt2x00dev,
>  	/*
>  	 * SEC_CSR2 and SEC_CSR3 contain only single-bit fields to indicate
>  	 * a particular key is valid. Because using the FIELD32()
> -	 * defines directly will cause a lot of overhead we use
> +	 * defines directly will cause a lot of overhead, we use
>  	 * a calculation to determine the correct bit directly.
>  	 */
>  	if (key->hw_key_idx < 32) {
> @@ -556,7 +556,7 @@ static void rt61pci_config_intf(struct rt2x00_dev *rt2x00dev,
>  	if (flags & CONFIG_UPDATE_TYPE) {
>  		/*
>  		 * Clear current synchronisation setup.
> -		 * For the Beacon base registers we only need to clear
> +		 * For the Beacon base registers, we only need to clear
>  		 * the first byte since that byte contains the VALID and OWNER
>  		 * bits which (when set to 0) will invalidate the entire beacon.
>  		 */
> @@ -1168,8 +1168,8 @@ static int rt61pci_check_firmware(struct rt2x00_dev *rt2x00dev,
>  		return FW_BAD_LENGTH;
>  
>  	/*
> -	 * The last 2 bytes in the firmware array are the crc checksum itself,
> -	 * this means that we should never pass those 2 bytes to the crc
> +	 * The last 2 bytes in the firmware array are the crc checksum itself.
> +	 * This means that we should never pass those 2 bytes to the crc
>  	 * algorithm.
>  	 */
>  	fw_crc = (data[len - 2] << 8 | data[len - 1]);
> @@ -1986,7 +1986,7 @@ static void rt61pci_fill_rxdone(struct queue_entry *entry,
>  
>  		/*
>  		 * Hardware has stripped IV/EIV data from 802.11 frame during
> -		 * decryption. It has provided the data seperately but rt2x00lib
> +		 * decryption. It has provided the data separately but rt2x00lib
>  		 * should decide if it should be reinserted.
>  		 */
>  		rxdesc->flags |= RX_FLAG_IV_STRIPPED;
> @@ -2042,7 +2042,7 @@ static void rt61pci_txdone(struct rt2x00_dev *rt2x00dev)
>  	 * During each loop we will compare the freshly read
>  	 * STA_CSR4 register value with the value read from
>  	 * the previous loop. If the 2 values are equal then
> -	 * we should stop processing because the chance it
> +	 * we should stop processing because the chance is
>  	 * quite big that the device has been unplugged and
>  	 * we risk going into an endless loop.
>  	 */
> @@ -2330,7 +2330,7 @@ static int rt61pci_init_eeprom(struct rt2x00_dev *rt2x00dev)
>  		__set_bit(CONFIG_FRAME_TYPE, &rt2x00dev->flags);
>  
>  	/*
> -	 * Detect if this device has an hardware controlled radio.
> +	 * Detect if this device has a hardware controlled radio.
>  	 */
>  	if (rt2x00_get_field16(eeprom, EEPROM_ANTENNA_HARDWARE_RADIO))
>  		__set_bit(CONFIG_SUPPORT_HW_BUTTON, &rt2x00dev->flags);
> @@ -2355,7 +2355,7 @@ static int rt61pci_init_eeprom(struct rt2x00_dev *rt2x00dev)
>  		__set_bit(CONFIG_EXTERNAL_LNA_BG, &rt2x00dev->flags);
>  
>  	/*
> -	 * When working with a RF2529 chip without double antenna
> +	 * When working with a RF2529 chip without double antenna,
>  	 * the antenna settings should be gathered from the NIC
>  	 * eeprom word.
>  	 */
> @@ -2668,7 +2668,7 @@ static int rt61pci_conf_tx(struct ieee80211_hw *hw, u16 queue_idx,
>  
>  	/*
>  	 * We only need to perform additional register initialization
> -	 * for WMM queues/
> +	 * for WMM queues.
>  	 */
>  	if (queue_idx >= 4)
>  		return 0;



^ permalink raw reply

* Re: [PATCH] Re: PACKET_TX_RING: packet size is too long
From: Gabor Gombas @ 2009-10-15 20:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, johann.baudy
In-Reply-To: <20091013.005438.32630424.davem@davemloft.net>

Currently PACKET_TX_RING forces certain amount of every frame to remain
unused. This probably originates from an early version of the
PACKET_TX_RING patch that in fact used the extra space when the (since
removed) CONFIG_PACKET_MMAP_ZERO_COPY option was enabled. The current
code does not make any use of this extra space.

This patch removes the extra space reservation and lets userspace make
use of the full frame size.

Signed-off-by: Gabor Gombas <gombasg@sztaki.hu>
---

On Tue, Oct 13, 2009 at 12:54:38AM -0700, David Miller wrote:

> Gabor, please resubmit your patch with a proper Signed-off-by:
> tag so I can apply it if it is correct.

Sure. I have pumped out more than 150 GiB of data using this patch
without apparent ill effects so it is also tested now.

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f9f7177..745a016 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -985,10 +985,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		goto out_put;
 
 	size_max = po->tx_ring.frame_size
-		- sizeof(struct skb_shared_info)
-		- po->tp_hdrlen
-		- LL_ALLOCATED_SPACE(dev)
-		- sizeof(struct sockaddr_ll);
+		- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
 
 	if (size_max > dev->mtu + reserve)
 		size_max = dev->mtu + reserve;
-- 
1.6.5

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

^ permalink raw reply related

* [PATCH 5/8] trivial: fix typo s/assocate/associate/ in comment
From: Thadeu Lima de Souza Cascardo @ 2009-10-15 20:15 UTC (permalink / raw)
  To: trivial-DgEjT+Ai2ygdnm+yROfE0A
  Cc: Jiri Slaby, Nick Kossifidis, Luis R. Rodriguez, Bob Copeland,
	John W. Linville, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	ath5k-devel-xDcbHBWguxEUs3QNXV6qNA, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Thadeu Lima de Souza Cascardo

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo-DmMZpsCg3uxeGPcbtGPokg@public.gmane.org>
---
 drivers/net/wireless/ath/ath5k/base.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/base.h b/drivers/net/wireless/ath/ath5k/base.h
index b14ba07..b73e7d3 100644
--- a/drivers/net/wireless/ath/ath5k/base.h
+++ b/drivers/net/wireless/ath/ath5k/base.h
@@ -192,7 +192,7 @@ struct ath5k_softc {
 	struct ath5k_txq	*cabq;		/* content after beacon */
 
 	int 			power_level;	/* Requested tx power in dbm */
-	bool			assoc;		/* assocate state */
+	bool			assoc;		/* associate state */
 	bool			enable_beacon;	/* true if beacons are on */
 };
 
-- 
1.6.3.3

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 6/8] trivial: fix typo s/refference/reference/ in comment
From: Thadeu Lima de Souza Cascardo @ 2009-10-15 20:16 UTC (permalink / raw)
  To: trivial
  Cc: Marcel Holtmann, David S. Miller, Thadeu Lima de Souza Cascardo,
	Stephen Hemminger, Wang Chen, linux-bluetooth, netdev,
	linux-kernel

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
---
 net/bluetooth/bnep/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c
index cafe9f5..1bd9398 100644
--- a/net/bluetooth/bnep/core.c
+++ b/net/bluetooth/bnep/core.c
@@ -78,7 +78,7 @@ static struct bnep_session *__bnep_get_session(u8 *dst)
 static void __bnep_link_session(struct bnep_session *s)
 {
 	/* It's safe to call __module_get() here because sessions are added
-	   by the socket layer which has to hold the refference to this module.
+	   by the socket layer which has to hold the reference to this module.
 	 */
 	__module_get(THIS_MODULE);
 	list_add(&s->list, &bnep_session_list);
-- 
1.6.3.3

^ permalink raw reply related

* rx_buffer_len and LPE in intel drivers
From: Andrew Swan @ 2009-10-15 20:09 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev


Hi,

I've been looking at skbuff memory usage with e1000, e1000e, and igb  
and have a few questions related to some common code that sets  
rx_buffer_len to ~1500 bytes when LPE is disabled.

For context, there is a huge advantage to holding rx_buffer_len to  
~1500 bytes: a few hundred bytes of overhead are added to each skbuff  
allocation so when rx_buffer_len is ~1500, each allocation fits within  
the 2048 byte allocation slab.  In contrast, when rx_buffer_len is  
2048, the skb overhead means that the allocations get pushed into the  
4096 byte slab, doubling the memory consumed by a typical 1500 byte  
packet.

e1000 and e1000e both have code in _setup_rctl() to disable LPE when  
the mtu is <= 1500.  Then there is this related code in _change_mtu():

e1000:
	/* adjust allocation if LPE protects us, and we aren't using SBP */
	if (!hw->tbi_compatibility_on &&
	    ((max_frame == MAXIMUM_ETHERNET_FRAME_SIZE) ||
	     (max_frame == MAXIMUM_ETHERNET_VLAN_SIZE)))
		adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE;

e1000e:
	/* adjust allocation if LPE protects us, and we aren't using SBP */
	if ((max_frame == ETH_FRAME_LEN + ETH_FCS_LEN) ||
	     (max_frame == ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN))
		adapter->rx_buffer_len = ETH_FRAME_LEN + VLAN_HLEN
					 + ETH_FCS_LEN;

My first question is about e1000 specifically -- it doesn't use this  
optimization if "tbi compatability" is on.  My understanding is that  
tbi compatability implies "store back packets" (SBP) is on and SBP  
overrides the discard of large packets when LPE is off.  But  
e1000_sw_init() initializes rx_buffer_len to ~1500 bytes, regardless  
of the tbi compatability setting.  I don't know exactly what tbi  
compatability is but is there a bug where a device that uses this mode  
comes up in an unsafe mode?  (i.e., the driver is allocating 1500 byte  
skb's but tbi compatability is on so the hardware may deliver large  
packets and overflow the skb?)

The larger question though is why both drivers test the mtu against  
these two specific values before using a smaller rx_buffer_len.  It  
seems that it would be safe to just use the test "if max_frame <=  
MAXIMUM_ETHERNET_VLAN_SIZE".  Or am I overlooking something else?

Finally, igb has the same logic as e1000 and e1000e in _change_mtu()  
but it is overriden by code in igb_setup_rctl() that unconditionally  
sets LPE and rounds rx_buffer_len up to a power of 2.  Is there some  
reason why it wouldn't be safe to implement the same logic from e1000  
and e1000e for clearing LPE and shrinking rx_buffer_len?

I'd be happy to provide patches for these fixes but want to make sure  
I'm on a reasonable path first..

Thanks!

-Andrew



^ permalink raw reply

* [PATCH 7/8] trivial: fix many typos s/untill/until/
From: Thadeu Lima de Souza Cascardo @ 2009-10-15 20:21 UTC (permalink / raw)
  To: trivial
  Cc: linux-rdma, linux-kernel, bonding-devel, netdev, linux-wireless,
	users, linux-scsi, devel, linux-ext4, linux-bluetooth, linux-sctp,
	Thadeu Lima de Souza Cascardo

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
---
 drivers/infiniband/ulp/iser/iser_verbs.c |    2 +-
 drivers/net/bonding/bond_alb.c           |    2 +-
 drivers/net/wireless/rt2x00/rt2800usb.c  |    2 +-
 drivers/scsi/bnx2i/bnx2i_iscsi.c         |    2 +-
 drivers/staging/rtl8187se/r8180.h        |    2 +-
 fs/ext4/balloc.c                         |    2 +-
 net/bluetooth/bnep/core.c                |    2 +-
 net/sctp/sm_statefuns.c                  |    2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index ea9e155..8579f32 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -499,7 +499,7 @@ void iser_conn_init(struct iser_conn *ib_conn)
 
  /**
  * starts the process of connecting to the target
- * sleeps untill the connection is established or rejected
+ * sleeps until the connection is established or rejected
  */
 int iser_connect(struct iser_conn   *ib_conn,
 		 struct sockaddr_in *src_addr,
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 9b5936f..a1e7eb9 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -559,7 +559,7 @@ static void rlb_update_rx_clients(struct bonding *bond)
 		}
 	}
 
-	/* do not update the entries again untill this counter is zero so that
+	/* do not update the entries again until this counter is zero so that
 	 * not to confuse the clients.
 	 */
 	bond_info->rlb_update_delay_counter = RLB_UPDATE_DELAY;
diff --git a/drivers/net/wireless/rt2x00/rt2800usb.c b/drivers/net/wireless/rt2x00/rt2800usb.c
index a084077..449886c 100644
--- a/drivers/net/wireless/rt2x00/rt2800usb.c
+++ b/drivers/net/wireless/rt2x00/rt2800usb.c
@@ -1257,7 +1257,7 @@ static int rt2800usb_init_registers(struct rt2x00_dev *rt2x00dev)
 	unsigned int i;
 
 	/*
-	 * Wait untill BBP and RF are ready.
+	 * Wait until BBP and RF are ready.
 	 */
 	for (i = 0; i < REGISTER_BUSY_COUNT; i++) {
 		rt2x00usb_register_read(rt2x00dev, MAC_CSR0, &reg);
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index cafb888..10110be 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -1883,7 +1883,7 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
 
 	bnx2i_ep = ep->dd_data;
 
-	/* driver should not attempt connection cleanup untill TCP_CONNECT
+	/* driver should not attempt connection cleanup until TCP_CONNECT
 	 * completes either successfully or fails. Timeout is 9-secs, so
 	 * wait for it to complete
 	 */
diff --git a/drivers/staging/rtl8187se/r8180.h b/drivers/staging/rtl8187se/r8180.h
index 8216d7e..35ed60b 100644
--- a/drivers/staging/rtl8187se/r8180.h
+++ b/drivers/staging/rtl8187se/r8180.h
@@ -521,7 +521,7 @@ typedef struct r8180_priv
 	//u32 NumTxOkInPeriod;   //YJ,del,080828
 	u8   TxPollingTimes;
 
-	bool	bApBufOurFrame;// TRUE if AP buffer our unicast data , we will keep eAwake untill receive data or timeout.
+	bool	bApBufOurFrame;// TRUE if AP buffer our unicast data , we will keep eAwake until receive data or timeout.
 	u8	WaitBufDataBcnCount;
 	u8	WaitBufDataTimeOut;
 
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 1d04189..2565b8c 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -519,7 +519,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 		metadata = 1;
 
 	/* We need to make sure we don't reuse
-	 * block released untill the transaction commit.
+	 * block released until the transaction commit.
 	 * writeback mode have weak data consistency so
 	 * don't force data as metadata when freeing block
 	 * for writeback mode.
diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c
index 1bd9398..9ac0497 100644
--- a/net/bluetooth/bnep/core.c
+++ b/net/bluetooth/bnep/core.c
@@ -629,7 +629,7 @@ int bnep_del_connection(struct bnep_conndel_req *req)
 	s = __bnep_get_session(req->dst);
 	if (s) {
 		/* Wakeup user-space which is polling for socket errors.
-		 * This is temporary hack untill we have shutdown in L2CAP */
+		 * This is temporary hack until we have shutdown in L2CAP */
 		s->sock->sk->sk_err = EUNATCH;
 
 		/* Kill session thread */
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index c8fae19..ba2f66d 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3569,7 +3569,7 @@ sctp_disposition_t sctp_sf_do_asconf(const struct sctp_endpoint *ep,
 	 * To do this properly, we'll set the destination address of the chunk
 	 * and at the transmit time, will try look up the transport to use.
 	 * Since ASCONFs may be bundled, the correct transport may not be
-	 * created untill we process the entire packet, thus this workaround.
+	 * created until we process the entire packet, thus this workaround.
 	 */
 	asconf_ack->dest = chunk->source;
 	sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(asconf_ack));
-- 
1.6.3.3


^ permalink raw reply related

* [patch 3/5] sunrpc: Convert to unlocked_ioctl and remove stray smp_lock.h
From: Thomas Gleixner @ 2009-10-15 20:28 UTC (permalink / raw)
  To: LKML; +Cc: ALan Cox, Arnd Bergmann, David S. Miller, Trond Myklebust, netdev
In-Reply-To: <20091015202722.372890083@linutronix.de>

[-- Attachment #1: net-sunrpc-convert-to-unlocked-ioctl.patch --]
[-- Type: text/plain, Size: 4905 bytes --]

The ioctls in sunrpc/cache.c can be converted to unlocked_ioctl safely
because the operation is already protected by queue_lock.

The ioctl in rpc_pipe.c can be converted by serializing via
inode->i_mutex.

Remove the stray smp_lock.h include in svc_xprt.c which was not
removed when the BKL was replaced in commit ed2d8aed

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: netdev@vger.kernel.org
---
 net/sunrpc/cache.c    |   25 ++++++++++++-------------
 net/sunrpc/rpc_pipe.c |   16 ++++++++++------
 net/sunrpc/svc_xprt.c |    1 -
 3 files changed, 22 insertions(+), 20 deletions(-)

Index: linux-2.6-tip/net/sunrpc/cache.c
===================================================================
--- linux-2.6-tip.orig/net/sunrpc/cache.c
+++ linux-2.6-tip/net/sunrpc/cache.c
@@ -815,9 +815,8 @@ static unsigned int cache_poll(struct fi
 	return mask;
 }
 
-static int cache_ioctl(struct inode *ino, struct file *filp,
-		       unsigned int cmd, unsigned long arg,
-		       struct cache_detail *cd)
+static long cache_ioctl(struct file *filp, unsigned int cmd, unsigned long arg,
+			struct cache_detail *cd)
 {
 	int len = 0;
 	struct cache_reader *rp = filp->private_data;
@@ -1332,12 +1331,12 @@ static unsigned int cache_poll_procfs(st
 	return cache_poll(filp, wait, cd);
 }
 
-static int cache_ioctl_procfs(struct inode *inode, struct file *filp,
-			      unsigned int cmd, unsigned long arg)
+static long cache_ioctl_procfs(struct file *filp, unsigned int cmd,
+			       unsigned long arg)
 {
-	struct cache_detail *cd = PDE(inode)->data;
+	struct cache_detail *cd = PDE(filp->f_path.dentry->d_inode)->data;
 
-	return cache_ioctl(inode, filp, cmd, arg, cd);
+	return cache_ioctl(filp, cmd, arg, cd);
 }
 
 static int cache_open_procfs(struct inode *inode, struct file *filp)
@@ -1360,7 +1359,7 @@ static const struct file_operations cach
 	.read		= cache_read_procfs,
 	.write		= cache_write_procfs,
 	.poll		= cache_poll_procfs,
-	.ioctl		= cache_ioctl_procfs, /* for FIONREAD */
+	.unlocked_ioctl	= cache_ioctl_procfs, /* for FIONREAD */
 	.open		= cache_open_procfs,
 	.release	= cache_release_procfs,
 };
@@ -1526,12 +1525,12 @@ static unsigned int cache_poll_pipefs(st
 	return cache_poll(filp, wait, cd);
 }
 
-static int cache_ioctl_pipefs(struct inode *inode, struct file *filp,
-			      unsigned int cmd, unsigned long arg)
+static long cache_ioctl_pipefs(struct file *filp, unsigned int cmd,
+			       unsigned long arg)
 {
-	struct cache_detail *cd = RPC_I(inode)->private;
+	struct cache_detail *cd = RPC_I(filp->f_path.dentry->d_inode)->private;
 
-	return cache_ioctl(inode, filp, cmd, arg, cd);
+	return cache_ioctl(filp, cmd, arg, cd);
 }
 
 static int cache_open_pipefs(struct inode *inode, struct file *filp)
@@ -1554,7 +1553,7 @@ const struct file_operations cache_file_
 	.read		= cache_read_pipefs,
 	.write		= cache_write_pipefs,
 	.poll		= cache_poll_pipefs,
-	.ioctl		= cache_ioctl_pipefs, /* for FIONREAD */
+	.unlocked_ioctl	= cache_ioctl_pipefs, /* for FIONREAD */
 	.open		= cache_open_pipefs,
 	.release	= cache_release_pipefs,
 };
Index: linux-2.6-tip/net/sunrpc/rpc_pipe.c
===================================================================
--- linux-2.6-tip.orig/net/sunrpc/rpc_pipe.c
+++ linux-2.6-tip/net/sunrpc/rpc_pipe.c
@@ -308,23 +308,27 @@ rpc_pipe_poll(struct file *filp, struct 
 	return mask;
 }
 
-static int
-rpc_pipe_ioctl(struct inode *ino, struct file *filp,
-		unsigned int cmd, unsigned long arg)
+static long
+rpc_pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
-	struct rpc_inode *rpci = RPC_I(filp->f_path.dentry->d_inode);
+	struct inode *inode = filp->f_path.dentry->d_inode;
+	struct rpc_inode *rpci = RPC_I(inode);
 	int len;
 
 	switch (cmd) {
 	case FIONREAD:
-		if (rpci->ops == NULL)
+		mutex_lock(&inode->i_mutex);
+		if (rpci->ops == NULL) {
+			mutex_unlock(&inode->i_mutex);
 			return -EPIPE;
+		}
 		len = rpci->pipelen;
 		if (filp->private_data) {
 			struct rpc_pipe_msg *msg;
 			msg = (struct rpc_pipe_msg *)filp->private_data;
 			len += msg->len - msg->copied;
 		}
+		mutex_unlock(&inode->i_mutex);
 		return put_user(len, (int __user *)arg);
 	default:
 		return -EINVAL;
@@ -337,7 +341,7 @@ static const struct file_operations rpc_
 	.read		= rpc_pipe_read,
 	.write		= rpc_pipe_write,
 	.poll		= rpc_pipe_poll,
-	.ioctl		= rpc_pipe_ioctl,
+	.unlocked_ioctl	= rpc_pipe_ioctl,
 	.open		= rpc_pipe_open,
 	.release	= rpc_pipe_release,
 };
Index: linux-2.6-tip/net/sunrpc/svc_xprt.c
===================================================================
--- linux-2.6-tip.orig/net/sunrpc/svc_xprt.c
+++ linux-2.6-tip/net/sunrpc/svc_xprt.c
@@ -5,7 +5,6 @@
  */
 
 #include <linux/sched.h>
-#include <linux/smp_lock.h>
 #include <linux/errno.h>
 #include <linux/freezer.h>
 #include <linux/kthread.h>

^ permalink raw reply

* Re: [BUILD-FAILURE] next-20091015 - vbus_enet driver breaks with allmodconfig
From: Gregory Haskins @ 2009-10-15 22:25 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Gregory Haskins, Kamalesh Babulal, sfr, greg, netdev, LKML,
	linux-next
In-Reply-To: <20091015114210.65e1be72.randy.dunlap@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 826 bytes --]

Randy Dunlap wrote:
> On Thu, 15 Oct 2009 13:39:44 -0400 Gregory Haskins wrote:
> 
>>
>> I am having difficulty reproducing the problem.  When I look at the
>> Kconfig, I see that VBUS_ENET has a "select VBUS_PROXY" as I would
>> expect.  Additionally, if I run allmodconfig I can confirm that I get
>> VBUS_ENET=m, VBUS_PROXY=y:
> 
> Greg,
> 
> Did you try to reproduce it with the ppc64 config file?
> allmodconfig on what arch?
> 
> I think the problem is that arch/x86/Kconfig says:
> 
> source "drivers/vbus/Kconfig"
> 
> and that should actually be in drivers/Kconfig, so that it applies
> to ppc64 et al, not just x86.

Ah, indeed.  I only have/tried x86, and that is the issue.  I will have
to fix this (and the create_irq() dependency you mention in the next mail).

Thanks Randy.

-Greg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 267 bytes --]

^ permalink raw reply

* [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Randy Dunlap @ 2009-10-15 22:29 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: linux-next, LKML, netdev, David Miller, Shreyas Bhatewara,
	pv-drivers
In-Reply-To: <20091015152538.74a1cb15.sfr@canb.auug.org.au>

From: Randy Dunlap <randy.dunlap@oracle.com>

vmxnet3 was using dprintk() for debugging output.  This was
defined in <linux/dst.h> and was the only thing that was
used from that header file.  This caused compile errors
when CONFIG_BLOCK was not enabled due to bio* and BIO*
uses in the header file, so change this driver to use
dev_dbg() for debugging output.

include/linux/dst.h:520: error: dereferencing pointer to incomplete type
include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use in this function)
include/linux/dst.h:521: error: dereferencing pointer to incomplete type
include/linux/dst.h:522: error: dereferencing pointer to incomplete type
include/linux/dst.h:525: error: dereferencing pointer to incomplete type
make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 drivers/net/vmxnet3/vmxnet3_drv.c |   27 ++++++++++++++++++---------
 drivers/net/vmxnet3/vmxnet3_int.h |    2 +-
 2 files changed, 19 insertions(+), 10 deletions(-)

--- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_int.h
+++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_int.h
@@ -30,6 +30,7 @@
 #include <linux/types.h>
 #include <linux/ethtool.h>
 #include <linux/delay.h>
+#include <linux/device.h>
 #include <linux/netdevice.h>
 #include <linux/pci.h>
 #include <linux/ethtool.h>
@@ -59,7 +60,6 @@
 #include <linux/if_vlan.h>
 #include <linux/if_arp.h>
 #include <linux/inetdevice.h>
-#include <linux/dst.h>
 
 #include "vmxnet3_defs.h"
 
--- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_drv.c
+++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -481,7 +481,8 @@ vmxnet3_rq_alloc_rx_buf(struct vmxnet3_r
 	}
 	rq->uncommitted[ring_idx] += num_allocated;
 
-	dprintk(KERN_ERR "alloc_rx_buf: %d allocated, next2fill %u, next2comp "
+	dev_dbg(&adapter->netdev->dev,
+		"alloc_rx_buf: %d allocated, next2fill %u, next2comp "
 		"%u, uncommited %u\n", num_allocated, ring->next2fill,
 		ring->next2comp, rq->uncommitted[ring_idx]);
 
@@ -539,7 +540,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
 		tbi = tq->buf_info + tq->tx_ring.next2fill;
 		tbi->map_type = VMXNET3_MAP_NONE;
 
-		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
+		dev_dbg(&adapter->netdev->dev,
+			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
 			tq->tx_ring.next2fill, ctx->sop_txd->txd.addr,
 			ctx->sop_txd->dword[2], ctx->sop_txd->dword[3]);
 		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
@@ -572,7 +574,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
 		gdesc->dword[2] = dw2 | buf_size;
 		gdesc->dword[3] = 0;
 
-		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
+		dev_dbg(&adapter->netdev->dev,
+			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
 			tq->tx_ring.next2fill, gdesc->txd.addr,
 			gdesc->dword[2], gdesc->dword[3]);
 		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
@@ -600,7 +603,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
 		gdesc->dword[2] = dw2 | frag->size;
 		gdesc->dword[3] = 0;
 
-		dprintk(KERN_ERR "txd[%u]: 0x%llu %u %u\n",
+		dev_dbg(&adapter->netdev->dev,
+			"txd[%u]: 0x%llu %u %u\n",
 			tq->tx_ring.next2fill, gdesc->txd.addr,
 			gdesc->dword[2], gdesc->dword[3]);
 		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
@@ -697,7 +701,8 @@ vmxnet3_parse_and_copy_hdr(struct sk_buf
 	tdd = tq->data_ring.base + tq->tx_ring.next2fill;
 
 	memcpy(tdd->data, skb->data, ctx->copy_size);
-	dprintk(KERN_ERR "copy %u bytes to dataRing[%u]\n",
+	dev_dbg(&adapter->netdev->dev,
+		"copy %u bytes to dataRing[%u]\n",
 		ctx->copy_size, tq->tx_ring.next2fill);
 	return 1;
 
@@ -808,7 +813,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
 
 	if (count > vmxnet3_cmd_ring_desc_avail(&tq->tx_ring)) {
 		tq->stats.tx_ring_full++;
-		dprintk(KERN_ERR "tx queue stopped on %s, next2comp %u"
+		dev_dbg(&adapter->netdev->dev,
+			"tx queue stopped on %s, next2comp %u"
 			" next2fill %u\n", adapter->netdev->name,
 			tq->tx_ring.next2comp, tq->tx_ring.next2fill);
 
@@ -853,7 +859,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
 
 	/* finally flips the GEN bit of the SOP desc */
 	gdesc->dword[2] ^= VMXNET3_TXD_GEN;
-	dprintk(KERN_ERR "txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
+	dev_dbg(&adapter->netdev->dev,
+		"txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
 		(u32)((union Vmxnet3_GenericDesc *)ctx.sop_txd -
 		tq->tx_ring.base), gdesc->txd.addr, gdesc->dword[2],
 		gdesc->dword[3]);
@@ -990,7 +997,8 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx
 			if (unlikely(rcd->len == 0)) {
 				/* Pretend the rx buffer is skipped. */
 				BUG_ON(!(rcd->sop && rcd->eop));
-				dprintk(KERN_ERR "rxRing[%u][%u] 0 length\n",
+				dev_dbg(&adapter->netdev->dev,
+					"rxRing[%u][%u] 0 length\n",
 					ring_idx, idx);
 				goto rcd_done;
 			}
@@ -1676,7 +1684,8 @@ vmxnet3_activate_dev(struct vmxnet3_adap
 	int err;
 	u32 ret;
 
-	dprintk(KERN_ERR "%s: skb_buf_size %d, rx_buf_per_pkt %d, ring sizes"
+	dev_dbg(&adapter->netdev->dev,
+		"%s: skb_buf_size %d, rx_buf_per_pkt %d, ring sizes"
 		" %u %u %u\n", adapter->netdev->name, adapter->skb_buf_size,
 		adapter->rx_buf_per_pkt, adapter->tx_queue.tx_ring.size,
 		adapter->rx_queue.rx_ring[0].size,

^ permalink raw reply

* RE: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Bhavesh Davda @ 2009-10-15 22:35 UTC (permalink / raw)
  To: Randy Dunlap, Stephen Rothwell
  Cc: pv-drivers@vmware.com, netdev, LKML, linux-next@vger.kernel.org,
	David Miller
In-Reply-To: <4AD7A24F.6000900@oracle.com>

Who ever compiles with CONFIG_BLOCK=n? Just kidding...

Thanks again for making this change! Ship it!

Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>

- Bhavesh

> -----Original Message-----
> From: pv-drivers-bounces@vmware.com [mailto:pv-drivers-
> bounces@vmware.com] On Behalf Of Randy Dunlap
> Sent: Thursday, October 15, 2009 3:30 PM
> To: Stephen Rothwell
> Cc: pv-drivers@vmware.com; netdev; LKML; linux-next@vger.kernel.org;
> David Miller
> Subject: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for
> CONFIG_BLOCK=n
> 
> From: Randy Dunlap <randy.dunlap@oracle.com>
> 
> vmxnet3 was using dprintk() for debugging output.  This was
> defined in <linux/dst.h> and was the only thing that was
> used from that header file.  This caused compile errors
> when CONFIG_BLOCK was not enabled due to bio* and BIO*
> uses in the header file, so change this driver to use
> dev_dbg() for debugging output.
> 
> include/linux/dst.h:520: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use
> in this function)
> include/linux/dst.h:521: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:522: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:525: error: dereferencing pointer to incomplete
> type
> make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_drv.c |   27 ++++++++++++++++++---------
>  drivers/net/vmxnet3/vmxnet3_int.h |    2 +-
>  2 files changed, 19 insertions(+), 10 deletions(-)
> 
> --- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_int.h
> +++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_int.h
> @@ -30,6 +30,7 @@
>  #include <linux/types.h>
>  #include <linux/ethtool.h>
>  #include <linux/delay.h>
> +#include <linux/device.h>
>  #include <linux/netdevice.h>
>  #include <linux/pci.h>
>  #include <linux/ethtool.h>
> @@ -59,7 +60,6 @@
>  #include <linux/if_vlan.h>
>  #include <linux/if_arp.h>
>  #include <linux/inetdevice.h>
> -#include <linux/dst.h>
> 
>  #include "vmxnet3_defs.h"
> 
> --- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_drv.c
> +++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_drv.c
> @@ -481,7 +481,8 @@ vmxnet3_rq_alloc_rx_buf(struct vmxnet3_r
>  	}
>  	rq->uncommitted[ring_idx] += num_allocated;
> 
> -	dprintk(KERN_ERR "alloc_rx_buf: %d allocated, next2fill %u,
> next2comp "
> +	dev_dbg(&adapter->netdev->dev,
> +		"alloc_rx_buf: %d allocated, next2fill %u, next2comp "
>  		"%u, uncommited %u\n", num_allocated, ring->next2fill,
>  		ring->next2comp, rq->uncommitted[ring_idx]);
> 
> @@ -539,7 +540,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>  		tbi = tq->buf_info + tq->tx_ring.next2fill;
>  		tbi->map_type = VMXNET3_MAP_NONE;
> 
> -		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
> +		dev_dbg(&adapter->netdev->dev,
> +			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
>  			tq->tx_ring.next2fill, ctx->sop_txd->txd.addr,
>  			ctx->sop_txd->dword[2], ctx->sop_txd->dword[3]);
>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
> @@ -572,7 +574,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>  		gdesc->dword[2] = dw2 | buf_size;
>  		gdesc->dword[3] = 0;
> 
> -		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
> +		dev_dbg(&adapter->netdev->dev,
> +			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
>  			tq->tx_ring.next2fill, gdesc->txd.addr,
>  			gdesc->dword[2], gdesc->dword[3]);
>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
> @@ -600,7 +603,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>  		gdesc->dword[2] = dw2 | frag->size;
>  		gdesc->dword[3] = 0;
> 
> -		dprintk(KERN_ERR "txd[%u]: 0x%llu %u %u\n",
> +		dev_dbg(&adapter->netdev->dev,
> +			"txd[%u]: 0x%llu %u %u\n",
>  			tq->tx_ring.next2fill, gdesc->txd.addr,
>  			gdesc->dword[2], gdesc->dword[3]);
>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
> @@ -697,7 +701,8 @@ vmxnet3_parse_and_copy_hdr(struct sk_buf
>  	tdd = tq->data_ring.base + tq->tx_ring.next2fill;
> 
>  	memcpy(tdd->data, skb->data, ctx->copy_size);
> -	dprintk(KERN_ERR "copy %u bytes to dataRing[%u]\n",
> +	dev_dbg(&adapter->netdev->dev,
> +		"copy %u bytes to dataRing[%u]\n",
>  		ctx->copy_size, tq->tx_ring.next2fill);
>  	return 1;
> 
> @@ -808,7 +813,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
> 
>  	if (count > vmxnet3_cmd_ring_desc_avail(&tq->tx_ring)) {
>  		tq->stats.tx_ring_full++;
> -		dprintk(KERN_ERR "tx queue stopped on %s, next2comp %u"
> +		dev_dbg(&adapter->netdev->dev,
> +			"tx queue stopped on %s, next2comp %u"
>  			" next2fill %u\n", adapter->netdev->name,
>  			tq->tx_ring.next2comp, tq->tx_ring.next2fill);
> 
> @@ -853,7 +859,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
> 
>  	/* finally flips the GEN bit of the SOP desc */
>  	gdesc->dword[2] ^= VMXNET3_TXD_GEN;
> -	dprintk(KERN_ERR "txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
> +	dev_dbg(&adapter->netdev->dev,
> +		"txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
>  		(u32)((union Vmxnet3_GenericDesc *)ctx.sop_txd -
>  		tq->tx_ring.base), gdesc->txd.addr, gdesc->dword[2],
>  		gdesc->dword[3]);
> @@ -990,7 +997,8 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx
>  			if (unlikely(rcd->len == 0)) {
>  				/* Pretend the rx buffer is skipped. */
>  				BUG_ON(!(rcd->sop && rcd->eop));
> -				dprintk(KERN_ERR "rxRing[%u][%u] 0 length\n",
> +				dev_dbg(&adapter->netdev->dev,
> +					"rxRing[%u][%u] 0 length\n",
>  					ring_idx, idx);
>  				goto rcd_done;
>  			}
> @@ -1676,7 +1684,8 @@ vmxnet3_activate_dev(struct vmxnet3_adap
>  	int err;
>  	u32 ret;
> 
> -	dprintk(KERN_ERR "%s: skb_buf_size %d, rx_buf_per_pkt %d, ring
> sizes"
> +	dev_dbg(&adapter->netdev->dev,
> +		"%s: skb_buf_size %d, rx_buf_per_pkt %d, ring sizes"
>  		" %u %u %u\n", adapter->netdev->name, adapter-
> >skb_buf_size,
>  		adapter->rx_buf_per_pkt, adapter->tx_queue.tx_ring.size,
>  		adapter->rx_queue.rx_ring[0].size,
> _______________________________________________
> Pv-drivers mailing list
> Pv-drivers@vmware.com
> http://mailman2.vmware.com/mailman/listinfo/pv-drivers

^ permalink raw reply

* RE: [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Shreyas Bhatewara @ 2009-10-15 22:37 UTC (permalink / raw)
  To: Randy Dunlap, Stephen Rothwell
  Cc: linux-next@vger.kernel.org, LKML, netdev, David Miller,
	pv-drivers@vmware.com
In-Reply-To: <4AD7A24F.6000900@oracle.com>

> -----Original Message-----
> From: Randy Dunlap [mailto:randy.dunlap@oracle.com]
> Sent: Thursday, October 15, 2009 3:30 PM
> To: Stephen Rothwell
> Cc: linux-next@vger.kernel.org; LKML; netdev; David Miller; Shreyas
> Bhatewara; pv-drivers@vmware.com
> Subject: [PATCH -next] vmxnet3: use dev_dbg, fix build for
> CONFIG_BLOCK=n
> 
> From: Randy Dunlap <randy.dunlap@oracle.com>
> 
> vmxnet3 was using dprintk() for debugging output.  This was
> defined in <linux/dst.h> and was the only thing that was
> used from that header file.  This caused compile errors
> when CONFIG_BLOCK was not enabled due to bio* and BIO*
> uses in the header file, so change this driver to use
> dev_dbg() for debugging output.
> 
> include/linux/dst.h:520: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use
> in this function)
> include/linux/dst.h:521: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:522: error: dereferencing pointer to incomplete
> type
> include/linux/dst.h:525: error: dereferencing pointer to incomplete
> type
> make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>


Thanks for the changes Randy, appreciate it.

->Shreyas

^ permalink raw reply

* Re: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Randy Dunlap @ 2009-10-15 22:42 UTC (permalink / raw)
  To: Bhavesh Davda
  Cc: Randy Dunlap, Stephen Rothwell, pv-drivers@vmware.com, netdev,
	LKML, linux-next@vger.kernel.org, David Miller
In-Reply-To: <8B1F619C9F5F454E81D90D3C161698D7017DD56435@EXCH-MBX-3.vmware.com>

Bhavesh Davda wrote:
> Who ever compiles with CONFIG_BLOCK=n? Just kidding...

or why does networking need CONFIG_BLOCK at all?  ;)

> Thanks again for making this change! Ship it!

So could I have used adapter->netdev->dev or adapter->pdev->dev ?
either of them?  are they the same?


> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
> 
> - Bhavesh
> 
>> -----Original Message-----
>> From: pv-drivers-bounces@vmware.com [mailto:pv-drivers-
>> bounces@vmware.com] On Behalf Of Randy Dunlap
>> Sent: Thursday, October 15, 2009 3:30 PM
>> To: Stephen Rothwell
>> Cc: pv-drivers@vmware.com; netdev; LKML; linux-next@vger.kernel.org;
>> David Miller
>> Subject: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for
>> CONFIG_BLOCK=n
>>
>> From: Randy Dunlap <randy.dunlap@oracle.com>
>>
>> vmxnet3 was using dprintk() for debugging output.  This was
>> defined in <linux/dst.h> and was the only thing that was
>> used from that header file.  This caused compile errors
>> when CONFIG_BLOCK was not enabled due to bio* and BIO*
>> uses in the header file, so change this driver to use
>> dev_dbg() for debugging output.
>>
>> include/linux/dst.h:520: error: dereferencing pointer to incomplete
>> type
>> include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use
>> in this function)
>> include/linux/dst.h:521: error: dereferencing pointer to incomplete
>> type
>> include/linux/dst.h:522: error: dereferencing pointer to incomplete
>> type
>> include/linux/dst.h:525: error: dereferencing pointer to incomplete
>> type
>> make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1
>>
>> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
>> ---
>>  drivers/net/vmxnet3/vmxnet3_drv.c |   27 ++++++++++++++++++---------
>>  drivers/net/vmxnet3/vmxnet3_int.h |    2 +-
>>  2 files changed, 19 insertions(+), 10 deletions(-)
>>
>> --- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_int.h
>> +++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_int.h
>> @@ -30,6 +30,7 @@
>>  #include <linux/types.h>
>>  #include <linux/ethtool.h>
>>  #include <linux/delay.h>
>> +#include <linux/device.h>
>>  #include <linux/netdevice.h>
>>  #include <linux/pci.h>
>>  #include <linux/ethtool.h>
>> @@ -59,7 +60,6 @@
>>  #include <linux/if_vlan.h>
>>  #include <linux/if_arp.h>
>>  #include <linux/inetdevice.h>
>> -#include <linux/dst.h>
>>
>>  #include "vmxnet3_defs.h"
>>
>> --- linux-next-20091015.orig/drivers/net/vmxnet3/vmxnet3_drv.c
>> +++ linux-next-20091015/drivers/net/vmxnet3/vmxnet3_drv.c
>> @@ -481,7 +481,8 @@ vmxnet3_rq_alloc_rx_buf(struct vmxnet3_r
>>  	}
>>  	rq->uncommitted[ring_idx] += num_allocated;
>>
>> -	dprintk(KERN_ERR "alloc_rx_buf: %d allocated, next2fill %u,
>> next2comp "
>> +	dev_dbg(&adapter->netdev->dev,
>> +		"alloc_rx_buf: %d allocated, next2fill %u, next2comp "
>>  		"%u, uncommited %u\n", num_allocated, ring->next2fill,
>>  		ring->next2comp, rq->uncommitted[ring_idx]);
>>
>> @@ -539,7 +540,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>>  		tbi = tq->buf_info + tq->tx_ring.next2fill;
>>  		tbi->map_type = VMXNET3_MAP_NONE;
>>
>> -		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
>> +		dev_dbg(&adapter->netdev->dev,
>> +			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
>>  			tq->tx_ring.next2fill, ctx->sop_txd->txd.addr,
>>  			ctx->sop_txd->dword[2], ctx->sop_txd->dword[3]);
>>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
>> @@ -572,7 +574,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>>  		gdesc->dword[2] = dw2 | buf_size;
>>  		gdesc->dword[3] = 0;
>>
>> -		dprintk(KERN_ERR "txd[%u]: 0x%Lx 0x%x 0x%x\n",
>> +		dev_dbg(&adapter->netdev->dev,
>> +			"txd[%u]: 0x%Lx 0x%x 0x%x\n",
>>  			tq->tx_ring.next2fill, gdesc->txd.addr,
>>  			gdesc->dword[2], gdesc->dword[3]);
>>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
>> @@ -600,7 +603,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, str
>>  		gdesc->dword[2] = dw2 | frag->size;
>>  		gdesc->dword[3] = 0;
>>
>> -		dprintk(KERN_ERR "txd[%u]: 0x%llu %u %u\n",
>> +		dev_dbg(&adapter->netdev->dev,
>> +			"txd[%u]: 0x%llu %u %u\n",
>>  			tq->tx_ring.next2fill, gdesc->txd.addr,
>>  			gdesc->dword[2], gdesc->dword[3]);
>>  		vmxnet3_cmd_ring_adv_next2fill(&tq->tx_ring);
>> @@ -697,7 +701,8 @@ vmxnet3_parse_and_copy_hdr(struct sk_buf
>>  	tdd = tq->data_ring.base + tq->tx_ring.next2fill;
>>
>>  	memcpy(tdd->data, skb->data, ctx->copy_size);
>> -	dprintk(KERN_ERR "copy %u bytes to dataRing[%u]\n",
>> +	dev_dbg(&adapter->netdev->dev,
>> +		"copy %u bytes to dataRing[%u]\n",
>>  		ctx->copy_size, tq->tx_ring.next2fill);
>>  	return 1;
>>
>> @@ -808,7 +813,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
>>
>>  	if (count > vmxnet3_cmd_ring_desc_avail(&tq->tx_ring)) {
>>  		tq->stats.tx_ring_full++;
>> -		dprintk(KERN_ERR "tx queue stopped on %s, next2comp %u"
>> +		dev_dbg(&adapter->netdev->dev,
>> +			"tx queue stopped on %s, next2comp %u"
>>  			" next2fill %u\n", adapter->netdev->name,
>>  			tq->tx_ring.next2comp, tq->tx_ring.next2fill);
>>
>> @@ -853,7 +859,8 @@ vmxnet3_tq_xmit(struct sk_buff *skb, str
>>
>>  	/* finally flips the GEN bit of the SOP desc */
>>  	gdesc->dword[2] ^= VMXNET3_TXD_GEN;
>> -	dprintk(KERN_ERR "txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
>> +	dev_dbg(&adapter->netdev->dev,
>> +		"txd[%u]: SOP 0x%Lx 0x%x 0x%x\n",
>>  		(u32)((union Vmxnet3_GenericDesc *)ctx.sop_txd -
>>  		tq->tx_ring.base), gdesc->txd.addr, gdesc->dword[2],
>>  		gdesc->dword[3]);
>> @@ -990,7 +997,8 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx
>>  			if (unlikely(rcd->len == 0)) {
>>  				/* Pretend the rx buffer is skipped. */
>>  				BUG_ON(!(rcd->sop && rcd->eop));
>> -				dprintk(KERN_ERR "rxRing[%u][%u] 0 length\n",
>> +				dev_dbg(&adapter->netdev->dev,
>> +					"rxRing[%u][%u] 0 length\n",
>>  					ring_idx, idx);
>>  				goto rcd_done;
>>  			}
>> @@ -1676,7 +1684,8 @@ vmxnet3_activate_dev(struct vmxnet3_adap
>>  	int err;
>>  	u32 ret;
>>
>> -	dprintk(KERN_ERR "%s: skb_buf_size %d, rx_buf_per_pkt %d, ring
>> sizes"
>> +	dev_dbg(&adapter->netdev->dev,
>> +		"%s: skb_buf_size %d, rx_buf_per_pkt %d, ring sizes"
>>  		" %u %u %u\n", adapter->netdev->name, adapter-
>>> skb_buf_size,
>>  		adapter->rx_buf_per_pkt, adapter->tx_queue.tx_ring.size,
>>  		adapter->rx_queue.rx_ring[0].size,
>> _______________________________________________

^ permalink raw reply

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Julian Anastasov @ 2009-10-15 22:44 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: David Miller, netdev, eric.dumazet
In-Reply-To: <20091015124134.GB1073@1wt.eu>


	Hello,

On Thu, 15 Oct 2009, Willy Tarreau wrote:

> Hello Julian,
> 
> On Thu, Oct 15, 2009 at 11:47:51AM +0300, Julian Anastasov wrote:
> (...)
> > 	If one changes TCP_DEFER_ACCEPT to create socket it
> > will save wakeups but not resources. I'm wondering if the
> > behavior should be changed at all. For me the options are two:
> > 
> > a) you want to save resources: use TCP_DEFER_ACCEPT. To help
> > proxies use large values for TCP_SYNCNT and TCP_DEFER_ACCEPT.
> > 
> > b) you can live with wakeups and many sockets: do not use
> > TCP_DEFER_ACCEPT. Suitable for servers using short timeouts
> > for first request.
> 
> and c) you want to avoid wakeups as much as possible and you'd like
> to drop just one empty ACK packet, so that as soon as you accept a
> an HTTP connection, you can read the request without polling at all.
> 
> Right now I'm able to process a complete HTTP request without
> registering the any FD in epoll *at all* for most requests if the first
> two ACKs are close enough and the server responds quickly. This saves a
> substantial amount of CPU cycles. Epoll is fast, but calling epoll_ctl()
> 100000 times a second still has a measurable cost. Doing an accept() on
> an empty connection implies this cost. Waiting for data always saves this
> cost, but causes the undesirable side effects that have been reported.
> Waiting for data just a few milliseconds is enough to save this cost
> 99.99% of the time, just as skipping the first empty packet.
> 
> Since you're saying that updating the value is wrong when it's used as
> a flag, would a patch to implement a specific option for this usage be
> accepted ?  Either by passing a negative value to TCP_DEFER_ACCEPT, or
> by using another flag ?

	Sorry for the long mail ...

	I was not clear enough in previous email. Your goal
is to decrease period per client while the actually decreased
threshold is on the listener's socket. 256 conns will be enough
to completely disable TCP_DEFER_ACCEPT on the listener (u8). I'm not
sure that you tested what happens after Nth client (where
N matches your TCP_DEFER_ACCEPT value as retransmissions), do you
still see accept deferring for next clients? Now if your patch
is applied the deferring will be disabled soon after server start.

	The open requests are just 64 bytes for me:
cat /proc/slabinfo | grep request_sock_TCP
and there is no rskq_defer_accept flag in struct tcp_request_sock,
struct inet_request_sock or struct request_sock. It is
present only for normal sockets. These open connection
requests have only 'retrans' and 'acked' flag. Please,
check again what your patch does and test it with some simple
client that sleeps after connect().

	As for new flags, may be we should not change
TCP_DEFER_ACCEPT values because current applications can
depend on it. There is some free space in
struct request_sock_queue just after u8 rskq_defer_accept.
May be new flags/modes can go there to define another
behavior but it means also changes in applications to support
it.

	Because using TCP_DEFER_ACCEPT as flag is not documented
one solution can be the following change, may be it matches your
idea but implemented correctly:

        /* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
-       if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
+       if (req->retrans < inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
            TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
                inet_rsk(req)->acked = 1;
                return NULL;
        }

	with the meaning "When TCP_DEFER_ACCEPT retransmission
period is shorter than SYN-ACK retrans period (eg. TCP_SYNCNT
or sysctl_tcp_synack_retries) move the open request as
established on client's packet after the TCP_DEFER_ACCEPT
period has expired". Then if you set TCP_SYNCNT=5 and
TCP_DEFER_ACCEPT=1retrans first ACK will set inet_rsk(req)->acked=1
because req->retrans is 0 and rskq_defer_accept is 1,
later timer will send one SYN-ACK (which marks the end
of TCP_DEFER_ACCEPT period), then client will send DATA or it
will be forced by our SYN-ACK retransmission to send 2nd
ACK packet for which we will create established socket.

	Such change will affect all servers that use
TCP_DEFER_ACCEPT retransmissions less than TCP_SYNCNT. They
will start to see wakeups without data after the TCP_DEFER_ACCEPT
period.

	To summarize:

SECOND	CLIENT			SERVER
---------------------------------------------------------
0	SYN			SYN-ACK
	if DATA => ESTABLISH
	if ACK => acked=1
3				SYN-ACK (set retrans=1)
	if ACK and TCP_DEFER_ACCEPT=1retrans => ESTABLISH
	if DATA => ESTABLISH
	if ACK => acked=1
9				if TCP_SYNCNT=1 => expire
				else SYN-ACK (set retrans=2)
	if ACK and TCP_DEFER_ACCEPT=2retrans => ESTABLISH
	if DATA => ESTABLISH
	if ACK => acked=1
...

PRO:

- if TCP_DEFER_ACCEPT<TCP_SYNCNT and client properly resends
ACK on every SYN-ACK retransmission then we always will
switch to established state on TCP_DEFER_ACCEPT expiration.
Such conns will never expire in SYN_RECV state. They
will be terminated by client's FIN or will be accepted
by server application and terminated properly. Of course,
there is some chance if client delays its ACKs or if SYN-ACK
is lost the open request to expire in SYN_RECV state.

CON:

- if client refuses to send DATA we still need these SYN-ACKs
to trigger ACK retransmissions from client because the only
way to switch to established state is when packet is received,
I don't know how TCP_DEFER_ACCEPT expiration can directly change
the open request to established state.

	May be it is possible to send first SYN-ACK and
if one ACK is received to send more SYN-ACKs after
TCP_DEFER_ACCEPT period expires. Then client still has chance
to send ACK or DATA that will switch open request to established
socket. So, our timer will be silent when acked=1 while
TCP_DEFER_ACCEPT period is active, for example:

SYN
	SYN-ACK
ACK
	...
	acked=1 => no SYN-ACKs retrans (assume they are sent and lost)
	...
	TCP_DEFER_ACCEPT expires => send 2nd SYN-ACK
	... If no client's ACK then resend SYN-ACK while retrans<TCP_SYNCNT
	...
ACK or DATA => ESTABLISHED

	This will need little change in inet_csk_reqsk_queue_prune()
but it saves SYN-ACK traffic during deferring period in the
common case when client sends ACK. If such compromise is
acceptable I can prepare and test some patch.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Dmitry Torokhov @ 2009-10-15 22:47 UTC (permalink / raw)
  To: pv-drivers
  Cc: Randy Dunlap, Bhavesh Davda, Stephen Rothwell, netdev, LKML,
	linux-next@vger.kernel.org, David Miller
In-Reply-To: <4AD7A569.1040308@oracle.com>

Hi Randy,

On Thursday 15 October 2009 03:42:49 pm Randy Dunlap wrote:
> Bhavesh Davda wrote:
> > Who ever compiles with CONFIG_BLOCK=n? Just kidding...
> 
> or why does networking need CONFIG_BLOCK at all?  ;)
> 
> > Thanks again for making this change! Ship it!
> 
> So could I have used adapter->netdev->dev or adapter->pdev->dev ?
> either of them?  are they the same?
> 

They are the same.

-- 
Dmitry

^ permalink raw reply

* Re: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg,  fix build for CONFIG_BLOCK=n
From: Dmitry Torokhov @ 2009-10-15 22:53 UTC (permalink / raw)
  To: pv-drivers
  Cc: Randy Dunlap, Stephen Rothwell, netdev, LKML,
	linux-next@vger.kernel.org, David Miller
In-Reply-To: <200910151547.48247.dtor@vmware.com>

On Thursday 15 October 2009 03:47:48 pm Dmitry Torokhov wrote:
> Hi Randy,
> 
> On Thursday 15 October 2009 03:42:49 pm Randy Dunlap wrote:
> > Bhavesh Davda wrote:
> > > Who ever compiles with CONFIG_BLOCK=n? Just kidding...
> >
> > or why does networking need CONFIG_BLOCK at all?  ;)
> >
> > > Thanks again for making this change! Ship it!
> >
> > So could I have used adapter->netdev->dev or adapter->pdev->dev ?
> > either of them?  are they the same?
> 
> They are the same.
> 

Hmm.. I take this back, adapter->netdev->dev.parent == adapter->pdev->dev.

-- 
Dmitry

^ permalink raw reply

* RE: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
From: Bhavesh Davda @ 2009-10-15 22:53 UTC (permalink / raw)
  To: Dmitry Torokhov, pv-drivers@vmware.com
  Cc: Randy Dunlap, Stephen Rothwell, netdev, LKML,
	linux-next@vger.kernel.org, David Miller
In-Reply-To: <200910151547.48247.dtor@vmware.com>

> From: Dmitry Torokhov [mailto:dtor@vmware.com]
> Sent: Thursday, October 15, 2009 3:48 PM
> To: pv-drivers@vmware.com
> Cc: Randy Dunlap; Bhavesh Davda; Stephen Rothwell; netdev; LKML; linux-
> next@vger.kernel.org; David Miller
> Subject: Re: [Pv-drivers] [PATCH -next] vmxnet3: use dev_dbg, fix build
> for CONFIG_BLOCK=n
> 
> Hi Randy,
> 
> On Thursday 15 October 2009 03:42:49 pm Randy Dunlap wrote:
> > Bhavesh Davda wrote:
> > > Who ever compiles with CONFIG_BLOCK=n? Just kidding...
> >
> > or why does networking need CONFIG_BLOCK at all?  ;)
> >
> > > Thanks again for making this change! Ship it!
> >
> > So could I have used adapter->netdev->dev or adapter->pdev->dev ?
> > either of them?  are they the same?
> >
> 
> They are the same.


Right:

vmxnet3_probe_device(struct pci_dev *pdev,...)

	struct net_device *netdev;
	...
	netdev = alloc_etherdev(sizeof(struct vmxnet3_adapter));
	...
	pci_set_drvdata(pdev, netdev);
	adapter = netdev_priv(netdev);
	adapter->netdev = netdev;
	adapter->pdev = pdev;

- Bhavesh

^ permalink raw reply

* Re: [BUILD-FAILURE] next-20091015 - vbus_enet driver breaks with allmodconfig
From: Gregory Haskins @ 2009-10-16  0:29 UTC (permalink / raw)
  To: Kamalesh Babulal; +Cc: ghaskins, linux-next, LKML, sfr, netdev, greg
In-Reply-To: <20091015104852.GA6740@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 1518 bytes --]

Kamalesh Babulal wrote:
> Hi Gregory,
> 
> 	While building next-20091015 with allmodconfig on the powerpc
> vbus-enet driver breaks
> 
> MODPOST 2492 modules
> ERROR: ".vbus_driver_register" [drivers/net/vbus-enet.ko] undefined!
> ERROR: ".vbus_driver_unregister" [drivers/net/vbus-enet.ko] undefined!
> ERROR: ".vbus_driver_ioq_alloc" [drivers/net/vbus-enet.ko] undefined!
> 
> CONFIG_VBUS_ENET=m
> CONFIG_VBUS_ENET_DEBUG=y
> CONFIG_VBUS_PROXY=n

Hi Kamalesh,

Please try the following patch:

commit 1de440616ac84679902d045b4476effcebfae400
Author: Gregory Haskins <ghaskins@novell.com>
Date:   Thu Oct 15 20:25:05 2009 -0400

    net: fix vbus-enet Kconfig dependencies

    We currently select VBUS_PROXY when vbus-enet is enabled, which is
    the wrong direction.  Not all platforms will define VBUS-PROXY, and
    venet depends on its inclusion.  Therefore, lets fix vbus-enet to
    properly depend on the presence of VBUS_PROXY to get this right.

    Signed-off-by: Gregory Haskins <ghaskins@novell.com>

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 47dfa04..c9128ea 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -3233,7 +3233,7 @@ config VIRTIO_NET
 config VBUS_ENET
        tristate "VBUS Ethernet Driver"
        default n
-       select VBUS_PROXY
+       depends on VBUS_PROXY
        help
           A virtualized 802.x network device based on the VBUS
           "virtual-ethernet" interface.  It can be used with any




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 267 bytes --]

^ permalink raw reply related

* Re: PATCH: Network Device Naming mechanism and policy
From: dann frazier @ 2009-10-16  0:32 UTC (permalink / raw)
  To: Narendra_K
  Cc: netdev, linux-hotplug, Matt_Domsch, Jordan_Hargrave, Charles_Rose
In-Reply-To: <20091013173638.GE1119@ldl.fc.hp.com>

On Tue, Oct 13, 2009 at 11:36:38AM -0600, dann frazier wrote:
> 1;2202;0cOn Tue, Oct 13, 2009 at 10:43:49PM +0530, Narendra_K@Dell.com wrote:
> > 
> > >> These device nodes are not functional at the moment - open() returns 
> > >> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
> > >> name to ifindex mapping, in a form that udev can easily manage.
> > >
> > >If the idea is just to provide a userspace-visible mapping 
> > >(and presumably take advantage of udev's infrastructure for 
> > >naming) does this need kernel changes? Could this be a 
> > >hierarchy under e.g. /etc/udev instead, using plain text 
> > >files? It still means we need something like libnetdevname for 
> > >apps to do the translation, but I'm not seeing why it matters 
> > >how this map is stored. Is there some special property of the 
> > >character devices (e.g. uevents) that we're not already 
> > >getting with the existing interfaces?
> > 
> > Yes. The char device by itself doesn't help in any way. But it provides
> > a flexible mechanism to provide multiple names for the same device, just
> > the way it is for disks.
> 
> Right - so any reason this couldn't be implemented completely in
> userspace by having udev manipulate plain text files under say
> /etc/udev/net/?
> 
> I do agree that it would be nice for admins/installers to tweak/use
> nic names in a similar way to storage names (udev rules), and it might
> let us take advantage of a lot of the existing udev code.

Is there interest in this approach?

 - modify udev to manage network devices names as regular (non-device)
   files (stored in /etc/udev, /dev/netdev, or wherever)
 - use the existing udev rules to manage symlinks to these files
 - point libnetdevname at these text files for its name resolution

I've started prototyping this, and it certainly looks possible w/o any
kernel changes. However, I could probably use some advice from a udev
person to do a proper implementation.

-- 
dann frazier


^ permalink raw reply

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Eric Dumazet @ 2009-10-16  3:51 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Willy Tarreau, David Miller, netdev
In-Reply-To: <Pine.LNX.4.58.0910152257410.3047@u.domain.uli>

Julian Anastasov a écrit :
> 	Sorry for the long mail ...
> 
> 	I was not clear enough in previous email. Your goal
> is to decrease period per client while the actually decreased
> threshold is on the listener's socket. 256 conns will be enough
> to completely disable TCP_DEFER_ACCEPT on the listener (u8). I'm not
> sure that you tested what happens after Nth client (where
> N matches your TCP_DEFER_ACCEPT value as retransmissions), do you
> still see accept deferring for next clients? Now if your patch
> is applied the deferring will be disabled soon after server start.


So, it appears defer_accept value is not an inherited attribute,
but shared by all embryons. Therefore we should not touch it.

No need to type so much text Julian.

Of course it should be done, or add a new connection field to count number
of pure ACKS received on each SYN_RECV embryon.

Do you volunter for this patch ?

^ permalink raw reply

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Eric Dumazet @ 2009-10-16  5:00 UTC (permalink / raw)
  Cc: Julian Anastasov, Willy Tarreau, David Miller, netdev
In-Reply-To: <4AD7EDB4.7060907@gmail.com>

Eric Dumazet a écrit :
> 
> 
> So, it appears defer_accept value is not an inherited attribute,
> but shared by all embryons. Therefore we should not touch it.
> 
> Of course it should be done, or add a new connection field to count number
> of pure ACKS received on each SYN_RECV embryon.
> 

Could be something like this ? (on top of net-next-2.6 of course)

7 bits is more than enough, we could take 5 bits IMHO.


Thanks

[PATCH net-next-2.6] tcp: TCP_DEFER_ACCEPT fix

Julian Anastasov pointed out defer_accept was an attribute of listening socket,
shared by all embryons. We therefore need a new per connection attribute.

We can split current u8 used by cookie_ts into 7 bits used to store
number of pure ACKS received by the embryon, and 1 bit to store cookie_ts.

Note, I use an unsigned int field so that kmemcheck doesnt shoot,
so patch also touches cookie_v4_init_sequence()/cookie_v6_init_sequence()
implementations.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/request_sock.h |    9 ++++++---
 include/net/tcp.h          |    6 ++++--
 net/ipv4/syncookies.c      |    7 ++++---
 net/ipv4/tcp_ipv4.c        |    2 +-
 net/ipv4/tcp_minisocks.c   |    4 ++--
 net/ipv6/syncookies.c      |    6 +++---
 net/ipv6/tcp_ipv6.c        |    2 +-
 7 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index c719084..3464d90 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -45,9 +45,12 @@ struct request_sock_ops {
  */
 struct request_sock {
 	struct request_sock		*dl_next; /* Must be first member! */
-	u16				mss;
-	u8				retrans;
-	u8				cookie_ts; /* syncookie: encode tcpopts in timestamp */
+	kmemcheck_bitfield_begin(flags);
+	unsigned int			mss : 16,
+					retrans : 8,
+					req_acks : 7,
+					cookie_ts : 1; /* syncookie: encode tcpopts in timestamp */
+	kmemcheck_bitfield_end(flags);
 	/* The following two fields can be easily recomputed I think -AK */
 	u32				window_clamp; /* window clamp at creation time */
 	u32				rcv_wnd;	  /* rcv_wnd offered first time */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 03a49c7..a2c0439 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -453,7 +453,7 @@ extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
 extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, 
 				    struct ip_options *opt);
 extern __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, 
-				     __u16 *mss);
+				     struct request_sock *req);
 
 extern __u32 cookie_init_timestamp(struct request_sock *req);
 extern void cookie_check_timestamp(struct tcp_options_received *tcp_opt);
@@ -461,7 +461,7 @@ extern void cookie_check_timestamp(struct tcp_options_received *tcp_opt);
 /* From net/ipv6/syncookies.c */
 extern struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb);
 extern __u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb,
-				     __u16 *mss);
+				     struct request_sock *req);
 
 /* tcp_output.c */
 
@@ -1000,7 +1000,9 @@ static inline void tcp_openreq_init(struct request_sock *req,
 	struct inet_request_sock *ireq = inet_rsk(req);
 
 	req->rcv_wnd = 0;		/* So that tcp_send_synack() knows! */
+	kmemcheck_annotate_bitfield(req, flags);
 	req->cookie_ts = 0;
+	req->req_acks = 0;
 	tcp_rsk(req)->rcv_isn = TCP_SKB_CB(skb)->seq;
 	req->mss = rx_opt->mss_clamp;
 	req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 5ec678a..8af9261 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -160,19 +160,20 @@ static __u16 const msstab[] = {
  * Generate a syncookie.  mssp points to the mss, which is returned
  * rounded down to the value encoded in the cookie.
  */
-__u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
+__u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb,
+			      struct request_sock *req)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	const struct tcphdr *th = tcp_hdr(skb);
 	int mssind;
-	const __u16 mss = *mssp;
+	const __u16 mss = req->mss;
 
 	tcp_synq_overflow(sk);
 
 	/* XXX sort msstab[] by probability?  Binary search? */
 	for (mssind = 0; mss > msstab[mssind + 1]; mssind++)
 		;
-	*mssp = msstab[mssind] + 1;
+	req->mss = msstab[mssind] + 1;
 
 	NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_SYNCOOKIESSENT);
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 9971870..3a2dfc5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1286,7 +1286,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 		syn_flood_warning(skb);
 		req->cookie_ts = tmp_opt.tstamp_ok;
 #endif
-		isn = cookie_v4_init_sequence(sk, skb, &req->mss);
+		isn = cookie_v4_init_sequence(sk, skb, req);
 	} else if (!isn) {
 		struct inet_peer *peer = NULL;
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e320afe..92778e6 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -642,9 +642,9 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 		return NULL;
 
 	/* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
-	if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
+	if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept > req->req_acks &&
 	    TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
-		inet_csk(sk)->icsk_accept_queue.rskq_defer_accept--;
+		req->req_acks++;
 		inet_rsk(req)->acked = 1;
 		return NULL;
 	}
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index cbe55e5..60d024d 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -125,18 +125,18 @@ static __u32 check_tcp_syn_cookie(__u32 cookie, struct in6_addr *saddr,
 		& COOKIEMASK;
 }
 
-__u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
+__u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb, struct request_sock *req)
 {
 	struct ipv6hdr *iph = ipv6_hdr(skb);
 	const struct tcphdr *th = tcp_hdr(skb);
 	int mssind;
-	const __u16 mss = *mssp;
+	const __u16 mss = req->mss;
 
 	tcp_synq_overflow(sk);
 
 	for (mssind = 0; mss > msstab[mssind + 1]; mssind++)
 		;
-	*mssp = msstab[mssind] + 1;
+	req->mss = msstab[mssind] + 1;
 
 	NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_SYNCOOKIESSENT);
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 4517630..2717bcb 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1219,7 +1219,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 		TCP_ECN_create_request(req, tcp_hdr(skb));
 
 	if (want_cookie) {
-		isn = cookie_v6_init_sequence(sk, skb, &req->mss);
+		isn = cookie_v6_init_sequence(sk, skb, req);
 		req->cookie_ts = tmp_opt.tstamp_ok;
 	} else if (!isn) {
 		if (ipv6_opt_accepted(sk, skb) ||


^ permalink raw reply related

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Willy Tarreau @ 2009-10-16  5:03 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: David Miller, netdev, eric.dumazet
In-Reply-To: <Pine.LNX.4.58.0910152257410.3047@u.domain.uli>

Hello Julian,

On Fri, Oct 16, 2009 at 01:44:34AM +0300, Julian Anastasov wrote:
(...)
> 	I was not clear enough in previous email. Your goal
> is to decrease period per client while the actually decreased
> threshold is on the listener's socket. 256 conns will be enough
> to completely disable TCP_DEFER_ACCEPT on the listener (u8).

Damn! Now that you're explaining this, it seems obvious and makes
so much sense ! Of course you're right. We should not touch the
listening socket, only the new socket being created ! Thanks for
this clarification. David, Julian is right, please drop my patch.

(...)
> 	As for new flags, may be we should not change
> TCP_DEFER_ACCEPT values because current applications can
> depend on it.

I have no problem with that, eventhough I'm really doubting
that many applications depend on its current behaviour.

> There is some free space in
> struct request_sock_queue just after u8 rskq_defer_accept.
> May be new flags/modes can go there to define another
> behavior but it means also changes in applications to support
> it.

One idea I had was to make it signed, because there is currently
no use for negative values. But let's see your proposal below.

>         /* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
> -       if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
> +       if (req->retrans < inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
>             TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
>                 inet_rsk(req)->acked = 1;
>                 return NULL;
>         }

Yes, of course that looks a lot better !

(...)
> 	Such change will affect all servers that use
> TCP_DEFER_ACCEPT retransmissions less than TCP_SYNCNT. They
> will start to see wakeups without data after the TCP_DEFER_ACCEPT
> period.

Indeed. But anyway a server must always be prepared to see wakeup
without data, because there are some situations where it will still
happen. For instance, if hardware RX cksum is not possible, a data
packet will cause a wakeup and during the recv(), the copy_and_cksum
will detect a cksum error and will not send anything to userland,
so the application will get an empty read.

> 	To summarize:
> 
> SECOND	CLIENT			SERVER
> ---------------------------------------------------------
> 0	SYN			SYN-ACK
> 	if DATA => ESTABLISH
> 	if ACK => acked=1
> 3				SYN-ACK (set retrans=1)
> 	if ACK and TCP_DEFER_ACCEPT=1retrans => ESTABLISH
> 	if DATA => ESTABLISH
> 	if ACK => acked=1
> 9				if TCP_SYNCNT=1 => expire
> 				else SYN-ACK (set retrans=2)
> 	if ACK and TCP_DEFER_ACCEPT=2retrans => ESTABLISH
> 	if DATA => ESTABLISH
> 	if ACK => acked=1
> ...
> 
> PRO:
> 
> - if TCP_DEFER_ACCEPT<TCP_SYNCNT and client properly resends
> ACK on every SYN-ACK retransmission then we always will
> switch to established state on TCP_DEFER_ACCEPT expiration.

That's what I wanted to achieve :-)

> Such conns will never expire in SYN_RECV state. They
> will be terminated by client's FIN or will be accepted
> by server application and terminated properly. Of course,
> there is some chance if client delays its ACKs or if SYN-ACK
> is lost the open request to expire in SYN_RECV state.

Yes, but that must be covered by both ends stacks. Network
losses are normal and such events already happen in minor
quantities everyday.

> CON:
> 
> - if client refuses to send DATA we still need these SYN-ACKs
> to trigger ACK retransmissions from client because the only
> way to switch to established state is when packet is received,
> I don't know how TCP_DEFER_ACCEPT expiration can directly change
> the open request to established state.

Well anyway this is already better than the current situation where
an apparently established connection silently dies. With this
proposal, applications have a way to get a normal behaviour.

> 	May be it is possible to send first SYN-ACK and
> if one ACK is received to send more SYN-ACKs after
> TCP_DEFER_ACCEPT period expires. Then client still has chance
> to send ACK or DATA that will switch open request to established
> socket. So, our timer will be silent when acked=1 while
> TCP_DEFER_ACCEPT period is active, for example:
> 
> SYN
> 	SYN-ACK
> ACK
> 	...
> 	acked=1 => no SYN-ACKs retrans (assume they are sent and lost)
>
> 	...
> 	TCP_DEFER_ACCEPT expires => send 2nd SYN-ACK
> 	... If no client's ACK then resend SYN-ACK while retrans<TCP_SYNCNT
> 	...
> ACK or DATA => ESTABLISHED

On first reading I found it a bit dangerous because the client will assume
an established connection when not seeing any new SYN-ACKs. And if it has
no data to send, it will never retransmit its ACK. But after some thinking,
it still matches the purpose of TCP_DEFER_ACCEPT, without the extra
SYN-ACK / ACK dance that we currently observe. I was also worried about
middle firewalls, but they will have no problem because they'd see an
established connection (SYN,SYN-ACK,ACK), so their expiration timer will
be large enough to cover the lack of SYN-ACK during TCP_DEFER_ACCEPT.

> 	This will need little change in inet_csk_reqsk_queue_prune()
> but it saves SYN-ACK traffic during deferring period in the
> common case when client sends ACK. If such compromise is
> acceptable I can prepare and test some patch.

I would personally like this a lot ! This will satisfy people who
expect it to establish at the end of the "TCP_DEFER_ACCEPT delay"
as can be interpreted from the man page, will reduce the number of
useless SYN-ACKs that annoy other people while still making no
visible change for anyone who would rely on the current behaviour.

Regards,
Willy


^ permalink raw reply

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Willy Tarreau @ 2009-10-16  5:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Julian Anastasov, David Miller, netdev
In-Reply-To: <4AD7FE01.1010805@gmail.com>

On Fri, Oct 16, 2009 at 07:00:49AM +0200, Eric Dumazet wrote:
> Eric Dumazet a écrit :
> > 
> > 
> > So, it appears defer_accept value is not an inherited attribute,
> > but shared by all embryons. Therefore we should not touch it.
> > 
> > Of course it should be done, or add a new connection field to count number
> > of pure ACKS received on each SYN_RECV embryon.
> > 
> 
> Could be something like this ? (on top of net-next-2.6 of course)
> 
> 7 bits is more than enough, we could take 5 bits IMHO.

Couldn't we just rely on the retrans vs rskq_defer_accept comparison ?

Willy


^ permalink raw reply

* Re: [PATCH] cxgb3: No need to wake queue in xmit handler
From: Divy Le Ray @ 2009-10-16  5:09 UTC (permalink / raw)
  To: Krishna Kumar, davem; +Cc: netdev
In-Reply-To: <20091015055419.30109.16062.sendpatchset@localhost.localdomain>

On Wed, 14 Oct 2009 22:54:19 -0700, Krishna Kumar <krkumar2@in.ibm.com>  
wrote:

> The xmit handler doesn't need to wake the queue after stopping
> it temporarily (some other drivers are doing the same).
>
> Patch on net-next-2.6, multiple netperf sessions tested.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
> ---
>
>  drivers/net/cxgb3/sge.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -ruNp org/drivers/net/cxgb3/sge.c new/drivers/net/cxgb3/sge.c
> --- org/drivers/net/cxgb3/sge.c	2009-09-07 10:09:03.000000000 +0530
> +++ new/drivers/net/cxgb3/sge.c	2009-09-07 10:11:11.000000000 +0530
> @@ -1260,7 +1260,7 @@ netdev_tx_t t3_eth_xmit(struct sk_buff *
>  		if (should_restart_tx(q) &&
>  		    test_and_clear_bit(TXQ_ETH, &qs->txq_stopped)) {
>  			q->restarts++;
> -			netif_tx_wake_queue(txq);
> +			netif_tx_start_queue(txq);
>  		}
>  	}
>



^ permalink raw reply

* Re: TCP_DEFER_ACCEPT is missing counter update
From: Eric Dumazet @ 2009-10-16  6:05 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Julian Anastasov, David Miller, netdev
In-Reply-To: <20091016052913.GB5574@1wt.eu>

Willy Tarreau a écrit :
> On Fri, Oct 16, 2009 at 07:00:49AM +0200, Eric Dumazet wrote:
>> Eric Dumazet a écrit :
>>>
>>> So, it appears defer_accept value is not an inherited attribute,
>>> but shared by all embryons. Therefore we should not touch it.
>>>
>>> Of course it should be done, or add a new connection field to count number
>>> of pure ACKS received on each SYN_RECV embryon.
>>>
>> Could be something like this ? (on top of net-next-2.6 of course)
>>
>> 7 bits is more than enough, we could take 5 bits IMHO.
> 
> Couldn't we just rely on the retrans vs rskq_defer_accept comparison ?
> 

In this case, we lose TCP_DEFER_ACCEPT advantage in case one SYN-ACK was dropped
by the network : We wakeup the listening server when first ACK comes from client,
instead of really wait the request.

I think being able to count pure-acks would be slighly better, and cost nothing.


retrans is the number of SYN-RECV (re)sent, while req_acks would count number of
pure ACK received.

Those numbers, in an ideal world should be related, but could differ in real world ?


^ permalink raw reply

* Re: [PATCH 0/2] Add implementation of CCID4 into the DCCP test tree
From: Gerrit Renker @ 2009-10-16  6:07 UTC (permalink / raw)
  To: Ivo Calado; +Cc: dccp, netdev
In-Reply-To: <4AD4B89C.6010704@embedded.ufcg.edu.br>

| These patches add implementation of CCID4 into the DCCP test tree.
Thank you for sending these.

I have taken all three series,
 * 4-part TFRC-SP receiver set,
 * 4-part TFRC-SP sender set,
 * 2-part CCID-4 implementation set,
and replaced the old ccid4 subtree of the DCCP test tree.


You can checkout the whole CCID-4 subtree via
    git://eden-feed.erg.abdn.ac.uk/dccp_exp    => subtree 'ccid4'

and view (or download) the CCID-4 patch series at
    http://eden-feed.erg.abdn.ac.uk/cgi-bin/gitweb.cgi?p=dccp_exp.git;a=log;h=ccid4

These are all your patches, no edits apart from removing some trailing whitespace.
If you make changes to your patches please send them relative to this tree.

So far I have only verified that they build cleanly, will be back next week after
some more tests.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox