Netdev List


* [net-next 4/6] i40e: whitespace fixes
From: Aaron Brown @ 2014-01-14  8:49 UTC
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Aaron Brown
In-Reply-To: <1389689394-22369-1-git-send-email-aaron.f.brown@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

Fix some whitespace and comment issues.

Change-ID: I1587599e50ce66fd389965720e86f9e331d86643
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h            | 1 -
 drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h | 4 ++--
 drivers/net/ethernet/intel/i40e/i40e_common.c     | 1 -
 drivers/net/ethernet/intel/i40e/i40e_main.c       | 3 +--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c       | 4 ++--
 5 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index c05984d..91b0052 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -93,7 +93,6 @@
 #define I40E_CURRENT_NVM_VERSION_HI 0x2
 #define I40E_CURRENT_NVM_VERSION_LO 0x30
 
-
 /* magic for getting defines into strings */
 #define STRINGIFY(foo)  #foo
 #define XSTRINGIFY(bar) STRINGIFY(bar)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
index c009eb4..be61a47 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
@@ -1195,8 +1195,8 @@ struct i40e_aqc_add_remove_cloud_filters_element_data {
 		} v4;
 		struct {
 			u8 data[16];
-			} v6;
-		} ipaddr;
+		} v6;
+	} ipaddr;
 	__le16 flags;
 #define I40E_AQC_ADD_CLOUD_FILTER_SHIFT                 0
 #define I40E_AQC_ADD_CLOUD_FILTER_MASK                  (0x3F << \
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 0b5a75c..aedc71b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -162,7 +162,6 @@ i40e_status i40e_aq_queue_shutdown(struct i40e_hw *hw,
 	return status;
 }
 
-
 /**
  * i40e_init_shared_code - Initialize the shared code
  * @hw: pointer to hardware structure
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 14e4f4a..6a5d4ca 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -356,7 +356,6 @@ static struct rtnl_link_stats64 *i40e_get_netdev_stats_struct(
 	struct rtnl_link_stats64 *vsi_stats = i40e_get_vsi_stats_struct(vsi);
 	int i;
 
-
 	if (test_bit(__I40E_DOWN, &vsi->state))
 		return stats;
 
@@ -3603,7 +3602,7 @@ static int i40e_vsi_get_bw_info(struct i40e_vsi *vsi)
 
 	/* Get the VSI level BW configuration per TC */
 	aq_ret = i40e_aq_query_vsi_ets_sla_config(hw, vsi->seid, &bw_ets_config,
-					          NULL);
+						  NULL);
 	if (aq_ret) {
 		dev_info(&pf->pdev->dev,
 			 "couldn't get pf vsi ets bw config, err %d, aq_err %d\n",
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index a089ac1..f57a8f8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -977,8 +977,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 
 	rx_desc = I40E_RX_DESC(rx_ring, i);
 	qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
-	rx_status = (qword & I40E_RXD_QW1_STATUS_MASK)
-				>> I40E_RXD_QW1_STATUS_SHIFT;
+	rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
+		    I40E_RXD_QW1_STATUS_SHIFT;
 
 	while (rx_status & (1 << I40E_RX_DESC_STATUS_DD_SHIFT)) {
 		union i40e_rx_desc *next_rxd;
-- 
1.8.5.GIT


* [net-next 3/6] i40e: make message meaningful
From: Aaron Brown @ 2014-01-14  8:49 UTC
  To: davem
  Cc: Mitch Williams, netdev, gospo, sassmann, Jesse Brandeburg,
	Aaron Brown
In-Reply-To: <1389689394-22369-1-git-send-email-aaron.f.brown@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

Make this message mean something, rather than just spitting out a VSI id
without any context whatsoever.

Change-ID: Iafb906c6db46d4b5dcbe84adc9ed44730d08bd42
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 3868c11..53069c0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -386,8 +386,8 @@ static int i40e_alloc_vsi_res(struct i40e_vf *vf, enum i40e_vsi_type type)
 		vf->lan_vsi_index = vsi->idx;
 		vf->lan_vsi_id = vsi->id;
 		dev_info(&pf->pdev->dev,
-			 "LAN VSI index %d, VSI id %d\n",
-			 vsi->idx, vsi->id);
+			 "VF %d assigned LAN VSI index %d, VSI id %d\n",
+			 vf->vf_id, vsi->idx, vsi->id);
 		/* If the port VLAN has been configured and then the
 		 * VF driver was removed then the VSI port VLAN
 		 * configuration was destroyed.  Check if there is
-- 
1.8.5.GIT


* [net-next 2/6] i40e: associate VMDq queue with VM type
From: Aaron Brown @ 2014-01-14  8:49 UTC
  To: davem
  Cc: Shannon Nelson, netdev, gospo, sassmann, Jesse Brandeburg,
	Aaron Brown
In-Reply-To: <1389689394-22369-1-git-send-email-aaron.f.brown@intel.com>

From: Shannon Nelson <shannon.nelson@intel.com>

Fix a bug where the queue was not associated with the right set-up
within the hardware.  The fix is to use the right QTX_CTL VSI type
when associating it to the VSI.

Change-ID: I65ef6c5a8205601c640a6593e4b7e78d6ba45545
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/i40e/Module.symvers | 0
 drivers/net/ethernet/intel/i40e/i40e_main.c    | 5 ++++-
 drivers/net/ethernet/intel/i40e/i40e_type.h    | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/intel/i40e/Module.symvers

diff --git a/drivers/net/ethernet/intel/i40e/Module.symvers b/drivers/net/ethernet/intel/i40e/Module.symvers
new file mode 100644
index 0000000..e69de29
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ad04da2..14e4f4a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2206,7 +2206,10 @@ static int i40e_configure_tx_ring(struct i40e_ring *ring)
 	}
 
 	/* Now associate this queue with this PCI function */
-	qtx_ctl = I40E_QTX_CTL_PF_QUEUE;
+	if (vsi->type == I40E_VSI_VMDQ2)
+		qtx_ctl = I40E_QTX_CTL_VM_QUEUE;
+	else
+		qtx_ctl = I40E_QTX_CTL_PF_QUEUE;
 	qtx_ctl |= ((hw->pf_id << I40E_QTX_CTL_PF_INDX_SHIFT) &
 		    I40E_QTX_CTL_PF_INDX_MASK);
 	wr32(hw, I40E_QTX_CTL(pf_q), qtx_ctl);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 80cf240..ccfc52d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -75,6 +75,7 @@ typedef void (*I40E_ADMINQ_CALLBACK)(struct i40e_hw *, struct i40e_aq_desc *);
 
 /* bitfields for Tx queue mapping in QTX_CTL */
 #define I40E_QTX_CTL_VF_QUEUE	0x0
+#define I40E_QTX_CTL_VM_QUEUE	0x1
 #define I40E_QTX_CTL_PF_QUEUE	0x2
 
 /* debug masks - set these bits in hw->debug_mask to control output */
-- 
1.8.5.GIT
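
The queue-type selection in this patch can be illustrated outside the driver. What follows is a minimal userspace sketch, not driver code: the I40E_QTX_CTL_* queue values are the ones shown in the patch, while `demo_build_qtx_ctl`, `enum demo_vsi_type`, and the `DEMO_PF_INDX_*` shift and mask are hypothetical stand-ins chosen for illustration. The real field positions live in the driver's register definitions and are only assumed here.

```c
#include <stdint.h>

/* Queue-type values as they appear in the patch (i40e_type.h). */
#define I40E_QTX_CTL_VF_QUEUE 0x0
#define I40E_QTX_CTL_VM_QUEUE 0x1
#define I40E_QTX_CTL_PF_QUEUE 0x2

/* Assumed field layout, for illustration only: PF index in bits 2..5. */
#define DEMO_PF_INDX_SHIFT 2
#define DEMO_PF_INDX_MASK  (0xFu << DEMO_PF_INDX_SHIFT)

/* Hypothetical stand-in for the driver's VSI type enum. */
enum demo_vsi_type { DEMO_VSI_MAIN, DEMO_VSI_VMDQ2 };

/* Build the value that would be written to the QTX_CTL register:
 * pick the queue type from the VSI type, then OR in the PF index.
 */
uint32_t demo_build_qtx_ctl(enum demo_vsi_type type, uint32_t pf_id)
{
	uint32_t qtx_ctl;

	if (type == DEMO_VSI_VMDQ2)
		qtx_ctl = I40E_QTX_CTL_VM_QUEUE;
	else
		qtx_ctl = I40E_QTX_CTL_PF_QUEUE;

	qtx_ctl |= (pf_id << DEMO_PF_INDX_SHIFT) & DEMO_PF_INDX_MASK;
	return qtx_ctl;
}
```

Under these assumed field positions, `demo_build_qtx_ctl(DEMO_VSI_VMDQ2, 3)` yields 0xD, whereas always selecting the PF queue type, as the code did before the patch, would give 0xE for the same PF index.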


* [net-next 1/6] i40e: remove extra register write
From: Aaron Brown @ 2014-01-14  8:49 UTC
  To: davem
  Cc: Mitch Williams, netdev, gospo, sassmann, Jesse Brandeburg,
	Aaron Brown
In-Reply-To: <1389689394-22369-1-git-send-email-aaron.f.brown@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

This write is done at the end of VF reset and should not be performed here.

Change-ID: I4d89813b68c6173184293868a6f26cf559bc2405
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index d04a776..3868c11 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1792,7 +1792,7 @@ int i40e_vc_process_vf_msg(struct i40e_pf *pf, u16 vf_id, u32 v_opcode,
 			local_vf_id, v_opcode, msglen);
 		return ret;
 	}
-	wr32(hw, I40E_VFGEN_RSTAT1(local_vf_id), I40E_VFR_VFACTIVE);
+
 	switch (v_opcode) {
 	case I40E_VIRTCHNL_OP_VERSION:
 		ret = i40e_vc_get_version_msg(vf);
-- 
1.8.5.GIT


* [net-next 0/6] Intel Wired LAN Driver Updates
From: Aaron Brown @ 2014-01-14  8:49 UTC
  To: davem; +Cc: Aaron Brown, netdev, gospo, sassmann

This series contains updates to i40e that are primarily minor fixes or 
general cleanup.

Shannon fixes a bug where the VMDq queue is not associated with the
right setup within the hardware.

Mitch provides a patch removing an extra register write that is done
at the end of VF reset, and another one adding meaningful context to
a message.

Jesse cleans up whitespace, comments, and parentheses.

Catherine bumps the version.

Mitch Williams (2):
  1/6 i40e: remove extra register write
  3/6 i40e: make message meaningful

Shannon Nelson (1):
  2/6 i40e: associate VMDq queue with VM type

Jesse Brandeburg (2):
  4/6 i40e: whitespace fixes
  5/6 i40e: trivial cleanup

Catherine Sullivan (1):
  6/6 i40e: Bump version number

 drivers/net/ethernet/intel/i40e/Module.symvers     |  0
 drivers/net/ethernet/intel/i40e/i40e.h             |  1 -
 drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h  |  4 ++--
 drivers/net/ethernet/intel/i40e/i40e_common.c      |  1 -
 drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c     |  3 +--
 drivers/net/ethernet/intel/i40e/i40e_main.c        | 10 ++++++----
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        |  4 ++--
 drivers/net/ethernet/intel/i40e/i40e_type.h        |  1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  6 +++---
 9 files changed, 15 insertions(+), 15 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/i40e/Module.symvers

-- 
1.8.5.GIT


* Re: [PATCH net-next] tun/macvtap: limit the packets queued through rcvbuf
From: Jason Wang @ 2014-01-14  8:45 UTC
  To: Michael S. Tsirkin
  Cc: davem, netdev, linux-kernel, Vlad Yasevich, John Fastabend,
	Stephen Hemminger, Herbert Xu
In-Reply-To: <20140114082516.GB1101@redhat.com>

On 01/14/2014 04:25 PM, Michael S. Tsirkin wrote:
> On Tue, Jan 14, 2014 at 02:53:07PM +0800, Jason Wang wrote:
>> We used to limit the number of packets queued through tx_queue_len. This
>> has several issues:
>>
>> - tx_queue_len controls the qdisc queue length; simply reusing it to
>>   control the packets queued by the device may cause confusion.
>> - After commit 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 ("macvtap: Add
>>   support of packet capture on macvtap device."), an unexpected qdisc
>>   caused by a non-zero tx_queue_len will lead to qdisc lock contention
>>   for multiqueue devices.
>> - What we really want is to limit the total amount of memory occupied, not
>>   the number of packets.
>>
>> So this patch tries to solve the above issues by using the socket rcvbuf to
>> limit the packets that can be queued for tun/macvtap. This is done by using
>> sock_queue_rcv_skb() instead of a direct call to skb_queue_tail(). Also, two
>> new ioctl()s are introduced for userspace to change the rcvbuf, as we
>> have done for sndbuf.
>>
>> With this fix, we can safely change the tx_queue_len of macvtap to
>> zero. This will make multiqueue work without extra lock contention.
>>
>> Cc: Vlad Yasevich <vyasevic@redhat.com>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: John Fastabend <john.r.fastabend@intel.com>
>> Cc: Stephen Hemminger <stephen@networkplumber.org>
>> Cc: Herbert Xu <herbert@gondor.apana.org.au>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> No, I don't think we can change userspace-visible behaviour like that.
>
> This will break any existing user that tries to control
> queue length through sysfs, netlink or device ioctl.

But it looks like a buggy API, since tx_queue_len should control the qdisc
queue length rather than the device itself. If we really want to preserve
the behaviour, how about using a new feature flag and changing the
behaviour only when the device is created (TUNSETIFF) with the new flag?
>
> Take a look at my patch in msg ID 20140109071721.GD19559@redhat.com
> which gives one way to set tx_queue_len to zero without
> breaking userspace.

If I read the patch correctly, it leaves no way for a user who really
wants to change the qdisc queue length for tun.
>
>
>> ---
>>  drivers/net/macvtap.c       | 31 ++++++++++++++++++++---------
>>  drivers/net/tun.c           | 48 +++++++++++++++++++++++++++++++++------------
>>  include/uapi/linux/if_tun.h |  3 +++
>>  3 files changed, 60 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index a2c3a89..c429c56 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -292,9 +292,6 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>>  	if (!q)
>>  		return RX_HANDLER_PASS;
>>  
>> -	if (skb_queue_len(&q->sk.sk_receive_queue) >= dev->tx_queue_len)
>> -		goto drop;
>> -
>>  	skb_push(skb, ETH_HLEN);
>>  
>>  	/* Apply the forward feature mask so that we perform segmentation
>> @@ -310,8 +307,10 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>>  			goto drop;
>>  
>>  		if (!segs) {
>> -			skb_queue_tail(&q->sk.sk_receive_queue, skb);
>> -			goto wake_up;
>> +			if (sock_queue_rcv_skb(&q->sk, skb))
>> +				goto drop;
>> +			else
>> +				goto wake_up;
>>  		}
>>  
>>  		kfree_skb(skb);
>> @@ -319,11 +318,17 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>>  			struct sk_buff *nskb = segs->next;
>>  
>>  			segs->next = NULL;
>> -			skb_queue_tail(&q->sk.sk_receive_queue, segs);
>> +			if (sock_queue_rcv_skb(&q->sk, segs)) {
>> +				skb = segs;
>> +				skb->next = nskb;
>> +				goto drop;
>> +			}
>> +
>>  			segs = nskb;
>>  		}
>>  	} else {
>> -		skb_queue_tail(&q->sk.sk_receive_queue, skb);
>> +		if (sock_queue_rcv_skb(&q->sk, skb))
>> +			goto drop;
>>  	}
>>  
>>  wake_up:
>> @@ -333,7 +338,7 @@ wake_up:
>>  drop:
>>  	/* Count errors/drops only here, thus don't care about args. */
>>  	macvlan_count_rx(vlan, 0, 0, 0);
>> -	kfree_skb(skb);
>> +	kfree_skb_list(skb);
>>  	return RX_HANDLER_CONSUMED;
>>  }
>>  
>> @@ -414,7 +419,7 @@ static void macvtap_dellink(struct net_device *dev,
>>  static void macvtap_setup(struct net_device *dev)
>>  {
>>  	macvlan_common_setup(dev);
>> -	dev->tx_queue_len = TUN_READQ_SIZE;
>> +	dev->tx_queue_len = 0;
>>  }
>>  
>>  static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
>> @@ -469,6 +474,7 @@ static int macvtap_open(struct inode *inode, struct file *file)
>>  	sock_init_data(&q->sock, &q->sk);
>>  	q->sk.sk_write_space = macvtap_sock_write_space;
>>  	q->sk.sk_destruct = macvtap_sock_destruct;
>> +	q->sk.sk_rcvbuf = TUN_RCVBUF;
>>  	q->flags = IFF_VNET_HDR | IFF_NO_PI | IFF_TAP;
>>  	q->vnet_hdr_sz = sizeof(struct virtio_net_hdr);
>>  
>> @@ -1040,6 +1046,13 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
>>  		q->sk.sk_sndbuf = u;
>>  		return 0;
>>  
>> +	case TUNSETRCVBUF:
>> +		if (get_user(u, up))
>> +			return -EFAULT;
>> +
>> +		q->sk.sk_rcvbuf = u;
>> +		return 0;
>> +
>>  	case TUNGETVNETHDRSZ:
>>  		s = q->vnet_hdr_sz;
>>  		if (put_user(s, sp))
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 09f6662..7a08fa3 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -177,6 +177,7 @@ struct tun_struct {
>>  
>>  	int			vnet_hdr_sz;
>>  	int			sndbuf;
>> +	int			rcvbuf;
>>  	struct tap_filter	txflt;
>>  	struct sock_fprog	fprog;
>>  	/* protected by rtnl lock */
>> @@ -771,17 +772,6 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>>  	if (!check_filter(&tun->txflt, skb))
>>  		goto drop;
>>  
>> -	if (tfile->socket.sk->sk_filter &&
>> -	    sk_filter(tfile->socket.sk, skb))
>> -		goto drop;
>> -
>> -	/* Limit the number of packets queued by dividing txq length with the
>> -	 * number of queues.
>> -	 */
>> -	if (skb_queue_len(&tfile->socket.sk->sk_receive_queue)
>> -			  >= dev->tx_queue_len / tun->numqueues)
>> -		goto drop;
>> -
>>  	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
>>  		goto drop;
>>  
>> @@ -798,7 +788,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>>  	nf_reset(skb);
>>  
>>  	/* Enqueue packet */
>> -	skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);
>> +	if (sock_queue_rcv_skb(tfile->socket.sk, skb))
>> +		goto drop;
>>  
>>  	/* Notify and wake up reader process */
>>  	if (tfile->flags & TUN_FASYNC)
>> @@ -1668,6 +1659,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>>  
>>  		tun->filter_attached = false;
>>  		tun->sndbuf = tfile->socket.sk->sk_sndbuf;
>> +		tun->rcvbuf = tfile->socket.sk->sk_rcvbuf;
>>  
>>  		spin_lock_init(&tun->lock);
>>  
>> @@ -1837,6 +1829,17 @@ static void tun_set_sndbuf(struct tun_struct *tun)
>>  	}
>>  }
>>  
>> +static void tun_set_rcvbuf(struct tun_struct *tun)
>> +{
>> +	struct tun_file *tfile;
>> +	int i;
>> +
>> +	for (i = 0; i < tun->numqueues; i++) {
>> +		tfile = rtnl_dereference(tun->tfiles[i]);
>> +		tfile->socket.sk->sk_rcvbuf = tun->rcvbuf;
>> +	}
>> +}
>> +
>>  static int tun_set_queue(struct file *file, struct ifreq *ifr)
>>  {
>>  	struct tun_file *tfile = file->private_data;
>> @@ -1878,7 +1881,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
>>  	struct ifreq ifr;
>>  	kuid_t owner;
>>  	kgid_t group;
>> -	int sndbuf;
>> +	int sndbuf, rcvbuf;
>>  	int vnet_hdr_sz;
>>  	unsigned int ifindex;
>>  	int ret;
>> @@ -2061,6 +2064,22 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
>>  		tun_set_sndbuf(tun);
>>  		break;
>>  
>> +	case TUNGETRCVBUF:
>> +		rcvbuf = tfile->socket.sk->sk_rcvbuf;
>> +		if (copy_to_user(argp, &rcvbuf, sizeof(rcvbuf)))
>> +			ret = -EFAULT;
>> +		break;
>> +
>> +	case TUNSETRCVBUF:
>> +		if (copy_from_user(&rcvbuf, argp, sizeof(rcvbuf))) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		tun->rcvbuf = rcvbuf;
>> +		tun_set_rcvbuf(tun);
>> +		break;
>> +
>>  	case TUNGETVNETHDRSZ:
>>  		vnet_hdr_sz = tun->vnet_hdr_sz;
>>  		if (copy_to_user(argp, &vnet_hdr_sz, sizeof(vnet_hdr_sz)))
>> @@ -2139,6 +2158,8 @@ static long tun_chr_compat_ioctl(struct file *file,
>>  	case TUNSETTXFILTER:
>>  	case TUNGETSNDBUF:
>>  	case TUNSETSNDBUF:
>> +	case TUNGETRCVBUF:
>> +	case TUNSETRCVBUF:
>>  	case SIOCGIFHWADDR:
>>  	case SIOCSIFHWADDR:
>>  		arg = (unsigned long)compat_ptr(arg);
>> @@ -2204,6 +2225,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>>  
>>  	tfile->sk.sk_write_space = tun_sock_write_space;
>>  	tfile->sk.sk_sndbuf = INT_MAX;
>> +	tfile->sk.sk_rcvbuf = TUN_RCVBUF;
>>  
>>  	file->private_data = tfile;
>>  	set_bit(SOCK_EXTERNALLY_ALLOCATED, &tfile->socket.flags);
>> diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
>> index e9502dd..8e04657 100644
>> --- a/include/uapi/linux/if_tun.h
>> +++ b/include/uapi/linux/if_tun.h
>> @@ -22,6 +22,7 @@
>>  
>>  /* Read queue size */
>>  #define TUN_READQ_SIZE	500
>> +#define TUN_RCVBUF	(512 * PAGE_SIZE)
>>  
>>  /* TUN device flags */
>>  #define TUN_TUN_DEV 	0x0001	
> That's about 16x less than we were able to queue previously
> by default.
> How can you be sure this won't break any applications?
>

OK, we can change it back to 500 * 65535, but I'm not sure whether this
value (about 32M) is too big for a single socket.
>> @@ -58,6 +59,8 @@
>>  #define TUNSETQUEUE  _IOW('T', 217, int)
>>  #define TUNSETIFINDEX	_IOW('T', 218, unsigned int)
>>  #define TUNGETFILTER _IOR('T', 219, struct sock_fprog)
>> +#define TUNGETRCVBUF   _IOR('T', 220, int)
>> +#define TUNSETRCVBUF   _IOW('T', 221, int)
>>  
>>  /* TUNSETIFF ifr flags */
>>  #define IFF_TUN		0x0001
>> -- 
>> 1.8.3.2
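
The difference between the two limiting schemes discussed in this thread can be sketched in userspace. The model below is a hypothetical illustration, not kernel code: `struct demo_queue`, `demo_enqueue`, and `demo_fill` are invented names, PAGE_SIZE is assumed to be 4096, and the byte check (reject once the memory already charged has reached the limit) is a simplified assumption about how a rcvbuf-style limit behaves. The kernel charges the skb's true size rather than the payload length, so real numbers will differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a device receive queue. */
struct demo_queue {
	size_t pkts;      /* packets currently queued */
	size_t bytes;     /* memory charged to the queue */
	size_t max_pkts;  /* packet-count limit (tx_queue_len style), 0 = unused */
	size_t max_bytes; /* byte limit (rcvbuf style), 0 = unused */
};

/* Returns true if the packet is queued, false if it would be dropped. */
bool demo_enqueue(struct demo_queue *q, size_t pkt_bytes)
{
	if (q->max_pkts && q->pkts >= q->max_pkts)
		return false;
	/* Assumed rcvbuf-style check: compare memory already charged. */
	if (q->max_bytes && q->bytes >= q->max_bytes)
		return false;

	q->pkts++;
	q->bytes += pkt_bytes;
	return true;
}

/* Enqueue equal-sized packets until the first drop; return how many fit. */
size_t demo_fill(struct demo_queue *q, size_t pkt_bytes)
{
	size_t n = 0;

	/* Refuse configurations that could never hit a limit. */
	if (!q->max_pkts && (!q->max_bytes || pkt_bytes == 0))
		return 0;

	while (demo_enqueue(q, pkt_bytes))
		n++;
	return n;
}
```

In this model, a byte limit of 512 * 4096 admits 33 packets of 65535 bytes, while a count limit of 500 admits 500 of them (500 * 65535 = 32,767,500 bytes, roughly 31 MiB). That is approximately the 16x gap raised in the review. With 1500-byte packets the same byte limit admits 1399 packets, which is the sense in which a byte budget bounds memory rather than packet count.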


* [PATCH net-next v2 3/3] bonding: update bonding.txt for primary description
From: Ding Tianhong @ 2014-01-14  8:41 UTC
  To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 Documentation/networking/bonding.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index a4d925e..5cdb229 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -657,7 +657,8 @@ primary
 	one slave is preferred over another, e.g., when one slave has
 	higher throughput than another.
 
-	The primary option is only valid for active-backup mode.
+	The primary option is only valid for active-backup (1),
+	balance-tlb (5) and balance-alb (6) modes.
 
 primary_reselect
 
-- 
1.8.0


* [PATCH net-next v2 1/3] bonding: update the primary slave when changing slave's name
From: Ding Tianhong @ 2014-01-14  8:41 UTC
  To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev

If a slave's name changes and the bond's primary parameter is set,
the bond should handle the situation in one of two ways:

1) If the slave was the primary slave, clear the primary slave
   and reselect the active slave.
2) If the slave's new name is the same as the bond's primary, set the
   slave as the primary slave and reselect the active slave.

Thanks to Veaceslav for the suggestion.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_main.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e06c445..64e25d5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2860,9 +2860,29 @@ static int bond_slave_netdev_event(unsigned long event,
 		 */
 		break;
 	case NETDEV_CHANGENAME:
-		/*
-		 * TODO: handle changing the primary's name
+		/* Handle changing the slave's name:
+		 * 1) If the slave was primary slave,
+		 * clean the primary slave and reselect
+		 * active slave.
+		 * 2) If the slave's new name is same as
+		 * bond primary, set the slave as primary
+		 * slave and reselect active slave.
 		 */
+		if (slave == bond->primary_slave ||
+		    !strcmp(bond->params.primary, slave_dev->name)) {
+			if (bond->primary_slave) {
+				pr_info("%s: Setting primary slave to None.\n",
+					bond->dev->name);
+				bond->primary_slave = NULL;
+			} else {
+				pr_info("%s: Setting %s as primary slave.\n",
+					bond->dev->name, slave_dev->name);
+				bond->primary_slave = slave;
+			}
+			write_lock_bh(&bond->curr_slave_lock);
+			bond_select_active_slave(bond);
+			write_unlock_bh(&bond->curr_slave_lock);
+		}
 		break;
 	case NETDEV_FEAT_CHANGE:
 		bond_compute_features(bond);
-- 
1.8.0
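
The two rename cases listed in the commit message can be sketched in userspace as follows. This is a hypothetical model that follows the description in the commit message rather than the exact branch structure of the patch: `struct demo_bond`, `struct demo_slave`, `demo_handle_rename`, and `DEMO_IFNAMSIZ` are invented for illustration, and a simple counter stands in for reselecting the active slave under the bond's lock.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16 /* assumed to mirror the kernel's IFNAMSIZ */

/* Hypothetical stand-ins for the bonding structures. */
struct demo_slave {
	char name[DEMO_IFNAMSIZ];
};

struct demo_bond {
	char primary[DEMO_IFNAMSIZ];      /* configured primary name, "" if unset */
	struct demo_slave *primary_slave; /* current primary slave, or NULL */
	int reselects;                    /* counts active-slave reselections */
};

/* Call after slave->name has been updated.
 * Returns 1 if the primary slave changed, 0 otherwise.
 */
int demo_handle_rename(struct demo_bond *bond, struct demo_slave *slave)
{
	int matches = bond->primary[0] != '\0' &&
		      strcmp(bond->primary, slave->name) == 0;

	if (slave == bond->primary_slave && !matches) {
		/* Case 1: the primary slave was renamed away. */
		bond->primary_slave = NULL;
	} else if (slave != bond->primary_slave && matches) {
		/* Case 2: a slave was renamed to the configured primary. */
		bond->primary_slave = slave;
	} else {
		return 0;
	}

	bond->reselects++; /* stands in for reselecting the active slave */
	return 1;
}
```

Keeping the two cases as separate branches makes it explicit that a rename unrelated to the configured primary leaves the bond untouched and triggers no reselection.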


* [PATCH net-next v2 0/3] bonding: fix primary problem for bonding
From: Ding Tianhong @ 2014-01-14  8:41 UTC
  To: Jay Vosburgh, Veaceslav Falico, Netdev, David S. Miller

If a slave's name changes and the bond's primary parameter is set,
the bond should handle the situation in one of two ways:

1) If the slave was the primary slave, clear the primary slave
   and reselect the active slave.
2) If the slave's new name is the same as the bond's primary, set the
   slave as the primary slave and reselect the active slave.

If the new primary does not match any slave in the bond, the bond should
record it in params, clear the primary slave, and select a new active slave.

Update bonding.txt for primary description.

Ding Tianhong (3):
  bonding: update the primary slave when changing slave's name
  bonding: clean the primary slave if there is no slave matching new
    primary
  bonding: update bonding.txt for primary description.

 Documentation/networking/bonding.txt |  3 ++-
 drivers/net/bonding/bond_main.c      | 24 ++++++++++++++++++++++--
 drivers/net/bonding/bond_options.c   |  6 ++++++
 3 files changed, 30 insertions(+), 3 deletions(-)

-- 
1.8.0


* [PATCH net-next v2 2/3] bonding: clean the primary slave if there is no slave matching new primary
From: Ding Tianhong @ 2014-01-14  8:41 UTC
  To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev

If the new primary does not match any slave in the bond, the bond should
record it in params, clear the primary slave, and select a new active slave.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_options.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 945a666..0ee0bfe 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -512,6 +512,12 @@ int bond_option_primary_set(struct bonding *bond, const char *primary)
 		}
 	}

+	if (bond->primary_slave) {
+		pr_info("%s: Setting primary slave to None.\n",
+			bond->dev->name);
+		bond->primary_slave = NULL;
+		bond_select_active_slave(bond);
+	}
 	strncpy(bond->params.primary, primary, IFNAMSIZ);
 	bond->params.primary[IFNAMSIZ - 1] = 0;

-- 
1.8.0
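
The no-match behaviour described above can be modelled in userspace. The sketch below is hypothetical and only mirrors the description: `struct opt_bond`, `struct opt_slave`, `opt_primary_set`, and `OPT_IFNAMSIZ` are invented names, the slave list is a plain array, and the reselection of the active slave is left out.

```c
#include <stddef.h>
#include <string.h>

#define OPT_IFNAMSIZ 16 /* assumed to mirror the kernel's IFNAMSIZ */

/* Hypothetical stand-ins for the bonding structures. */
struct opt_slave {
	char name[OPT_IFNAMSIZ];
};

struct opt_bond {
	char primary[OPT_IFNAMSIZ];      /* recorded primary name */
	struct opt_slave *slaves;        /* array of enslaved devices */
	size_t nslaves;
	struct opt_slave *primary_slave; /* current primary slave, or NULL */
};

/* Record the requested primary name (truncated and NUL-terminated). */
static void opt_record_primary(struct opt_bond *bond, const char *primary)
{
	strncpy(bond->primary, primary, OPT_IFNAMSIZ);
	bond->primary[OPT_IFNAMSIZ - 1] = '\0';
}

/* Returns 1 if a slave matched the new primary, 0 if the name was only
 * recorded and the current primary slave was cleared.
 */
int opt_primary_set(struct opt_bond *bond, const char *primary)
{
	size_t i;

	for (i = 0; i < bond->nslaves; i++) {
		if (strncmp(bond->slaves[i].name, primary, OPT_IFNAMSIZ) == 0) {
			bond->primary_slave = &bond->slaves[i];
			opt_record_primary(bond, primary);
			return 1;
		}
	}

	/* No slave matches: clear the old primary but remember the name. */
	bond->primary_slave = NULL;
	opt_record_primary(bond, primary);
	return 0;
}
```

Recording the name even when nothing matches means a device that is enslaved or renamed later can still be picked up as the primary slave.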


* Re: [PATCH net-next] tun/macvtap: limit the packets queued through rcvbuf
From: Michael S. Tsirkin @ 2014-01-14  8:25 UTC
  To: Jason Wang
  Cc: davem, netdev, linux-kernel, Vlad Yasevich, John Fastabend,
	Stephen Hemminger, Herbert Xu
In-Reply-To: <1389682387-28601-1-git-send-email-jasowang@redhat.com>

On Tue, Jan 14, 2014 at 02:53:07PM +0800, Jason Wang wrote:
> We used to limit the number of packets queued through tx_queue_len. This
> has several issues:
> 
> - tx_queue_len controls the qdisc queue length; simply reusing it to
>   control the packets queued by the device may cause confusion.
> - After commit 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 ("macvtap: Add
>   support of packet capture on macvtap device."), an unexpected qdisc
>   caused by a non-zero tx_queue_len will lead to qdisc lock contention
>   for multiqueue devices.
> - What we really want is to limit the total amount of memory occupied, not
>   the number of packets.
> 
> So this patch tries to solve the above issues by using the socket rcvbuf to
> limit the packets that can be queued for tun/macvtap. This is done by using
> sock_queue_rcv_skb() instead of a direct call to skb_queue_tail(). Also, two
> new ioctl()s are introduced for userspace to change the rcvbuf, as we
> have done for sndbuf.
> 
> With this fix, we can safely change the tx_queue_len of macvtap to
> zero. This will make multiqueue work without extra lock contention.
> 
> Cc: Vlad Yasevich <vyasevic@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: John Fastabend <john.r.fastabend@intel.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

No, I don't think we can change userspace-visible behaviour like that.

This will break any existing user that tries to control
queue length through sysfs, netlink or device ioctl.

Take a look at my patch in msg ID 20140109071721.GD19559@redhat.com
which gives one way to set tx_queue_len to zero without
breaking userspace.



> ---
>  drivers/net/macvtap.c       | 31 ++++++++++++++++++++---------
>  drivers/net/tun.c           | 48 +++++++++++++++++++++++++++++++++------------
>  include/uapi/linux/if_tun.h |  3 +++
>  3 files changed, 60 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index a2c3a89..c429c56 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -292,9 +292,6 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>  	if (!q)
>  		return RX_HANDLER_PASS;
>  
> -	if (skb_queue_len(&q->sk.sk_receive_queue) >= dev->tx_queue_len)
> -		goto drop;
> -
>  	skb_push(skb, ETH_HLEN);
>  
>  	/* Apply the forward feature mask so that we perform segmentation
> @@ -310,8 +307,10 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>  			goto drop;
>  
>  		if (!segs) {
> -			skb_queue_tail(&q->sk.sk_receive_queue, skb);
> -			goto wake_up;
> +			if (sock_queue_rcv_skb(&q->sk, skb))
> +				goto drop;
> +			else
> +				goto wake_up;
>  		}
>  
>  		kfree_skb(skb);
> @@ -319,11 +318,17 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb)
>  			struct sk_buff *nskb = segs->next;
>  
>  			segs->next = NULL;
> -			skb_queue_tail(&q->sk.sk_receive_queue, segs);
> +			if (sock_queue_rcv_skb(&q->sk, segs)) {
> +				skb = segs;
> +				skb->next = nskb;
> +				goto drop;
> +			}
> +
>  			segs = nskb;
>  		}
>  	} else {
> -		skb_queue_tail(&q->sk.sk_receive_queue, skb);
> +		if (sock_queue_rcv_skb(&q->sk, skb))
> +			goto drop;
>  	}
>  
>  wake_up:
> @@ -333,7 +338,7 @@ wake_up:
>  drop:
>  	/* Count errors/drops only here, thus don't care about args. */
>  	macvlan_count_rx(vlan, 0, 0, 0);
> -	kfree_skb(skb);
> +	kfree_skb_list(skb);
>  	return RX_HANDLER_CONSUMED;
>  }
>  
> @@ -414,7 +419,7 @@ static void macvtap_dellink(struct net_device *dev,
>  static void macvtap_setup(struct net_device *dev)
>  {
>  	macvlan_common_setup(dev);
> -	dev->tx_queue_len = TUN_READQ_SIZE;
> +	dev->tx_queue_len = 0;
>  }
>  
>  static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
> @@ -469,6 +474,7 @@ static int macvtap_open(struct inode *inode, struct file *file)
>  	sock_init_data(&q->sock, &q->sk);
>  	q->sk.sk_write_space = macvtap_sock_write_space;
>  	q->sk.sk_destruct = macvtap_sock_destruct;
> +	q->sk.sk_rcvbuf = TUN_RCVBUF;
>  	q->flags = IFF_VNET_HDR | IFF_NO_PI | IFF_TAP;
>  	q->vnet_hdr_sz = sizeof(struct virtio_net_hdr);
>  
> @@ -1040,6 +1046,13 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
>  		q->sk.sk_sndbuf = u;
>  		return 0;
>  
> +	case TUNSETRCVBUF:
> +		if (get_user(u, up))
> +			return -EFAULT;
> +
> +		q->sk.sk_rcvbuf = u;
> +		return 0;
> +
>  	case TUNGETVNETHDRSZ:
>  		s = q->vnet_hdr_sz;
>  		if (put_user(s, sp))
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 09f6662..7a08fa3 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -177,6 +177,7 @@ struct tun_struct {
>  
>  	int			vnet_hdr_sz;
>  	int			sndbuf;
> +	int			rcvbuf;
>  	struct tap_filter	txflt;
>  	struct sock_fprog	fprog;
>  	/* protected by rtnl lock */
> @@ -771,17 +772,6 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>  	if (!check_filter(&tun->txflt, skb))
>  		goto drop;
>  
> -	if (tfile->socket.sk->sk_filter &&
> -	    sk_filter(tfile->socket.sk, skb))
> -		goto drop;
> -
> -	/* Limit the number of packets queued by dividing txq length with the
> -	 * number of queues.
> -	 */
> -	if (skb_queue_len(&tfile->socket.sk->sk_receive_queue)
> -			  >= dev->tx_queue_len / tun->numqueues)
> -		goto drop;
> -
>  	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
>  		goto drop;
>  
> @@ -798,7 +788,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
>  	nf_reset(skb);
>  
>  	/* Enqueue packet */
> -	skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);
> +	if (sock_queue_rcv_skb(tfile->socket.sk, skb))
> +		goto drop;
>  
>  	/* Notify and wake up reader process */
>  	if (tfile->flags & TUN_FASYNC)
> @@ -1668,6 +1659,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>  
>  		tun->filter_attached = false;
>  		tun->sndbuf = tfile->socket.sk->sk_sndbuf;
> +		tun->rcvbuf = tfile->socket.sk->sk_rcvbuf;
>  
>  		spin_lock_init(&tun->lock);
>  
> @@ -1837,6 +1829,17 @@ static void tun_set_sndbuf(struct tun_struct *tun)
>  	}
>  }
>  
> +static void tun_set_rcvbuf(struct tun_struct *tun)
> +{
> +	struct tun_file *tfile;
> +	int i;
> +
> +	for (i = 0; i < tun->numqueues; i++) {
> +		tfile = rtnl_dereference(tun->tfiles[i]);
> +		tfile->socket.sk->sk_rcvbuf = tun->rcvbuf;
> +	}
> +}
> +
>  static int tun_set_queue(struct file *file, struct ifreq *ifr)
>  {
>  	struct tun_file *tfile = file->private_data;
> @@ -1878,7 +1881,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
>  	struct ifreq ifr;
>  	kuid_t owner;
>  	kgid_t group;
> -	int sndbuf;
> +	int sndbuf, rcvbuf;
>  	int vnet_hdr_sz;
>  	unsigned int ifindex;
>  	int ret;
> @@ -2061,6 +2064,22 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
>  		tun_set_sndbuf(tun);
>  		break;
>  
> +	case TUNGETRCVBUF:
> +		rcvbuf = tfile->socket.sk->sk_rcvbuf;
> +		if (copy_to_user(argp, &rcvbuf, sizeof(rcvbuf)))
> +			ret = -EFAULT;
> +		break;
> +
> +	case TUNSETRCVBUF:
> +		if (copy_from_user(&rcvbuf, argp, sizeof(rcvbuf))) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		tun->rcvbuf = rcvbuf;
> +		tun_set_rcvbuf(tun);
> +		break;
> +
>  	case TUNGETVNETHDRSZ:
>  		vnet_hdr_sz = tun->vnet_hdr_sz;
>  		if (copy_to_user(argp, &vnet_hdr_sz, sizeof(vnet_hdr_sz)))
> @@ -2139,6 +2158,8 @@ static long tun_chr_compat_ioctl(struct file *file,
>  	case TUNSETTXFILTER:
>  	case TUNGETSNDBUF:
>  	case TUNSETSNDBUF:
> +	case TUNGETRCVBUF:
> +	case TUNSETRCVBUF:
>  	case SIOCGIFHWADDR:
>  	case SIOCSIFHWADDR:
>  		arg = (unsigned long)compat_ptr(arg);
> @@ -2204,6 +2225,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>  
>  	tfile->sk.sk_write_space = tun_sock_write_space;
>  	tfile->sk.sk_sndbuf = INT_MAX;
> +	tfile->sk.sk_rcvbuf = TUN_RCVBUF;
>  
>  	file->private_data = tfile;
>  	set_bit(SOCK_EXTERNALLY_ALLOCATED, &tfile->socket.flags);
> diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
> index e9502dd..8e04657 100644
> --- a/include/uapi/linux/if_tun.h
> +++ b/include/uapi/linux/if_tun.h
> @@ -22,6 +22,7 @@
>  
>  /* Read queue size */
>  #define TUN_READQ_SIZE	500
> +#define TUN_RCVBUF	(512 * PAGE_SIZE)
>  
>  /* TUN device flags */
>  #define TUN_TUN_DEV 	0x0001	

That's about 16x less than we were able to queue previously
by default.
How can you be sure this won't break any applications?

> @@ -58,6 +59,8 @@
>  #define TUNSETQUEUE  _IOW('T', 217, int)
>  #define TUNSETIFINDEX	_IOW('T', 218, unsigned int)
>  #define TUNGETFILTER _IOR('T', 219, struct sock_fprog)
> +#define TUNGETRCVBUF   _IOR('T', 220, int)
> +#define TUNSETRCVBUF   _IOW('T', 221, int)
>  
>  /* TUNSETIFF ifr flags */
>  #define IFF_TUN		0x0001
> -- 
> 1.8.3.2


* Re: [PATCH v4 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Jan Kaluža @ 2014-01-14  8:25 UTC (permalink / raw)
  To: Casey Schaufler, davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: rgb-H+wXaHxf7aLQT0dZR+AlfA, netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML,
	eparis-H+wXaHxf7aLQT0dZR+AlfA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, tj-DgEjT+Ai2ygdnm+yROfE0A,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <52D44206.2000906-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>

On 01/13/2014 08:44 PM, Casey Schaufler wrote:
> On 1/13/2014 12:01 AM, Jan Kaluza wrote:
>> Hi,
>>
>> this patchset against net-next (applies also to linux-next) adds 3 new types
>> of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
>
> How about the group list, while you're at it?

That would of course be possible, but I would rather start with these
three patches before adding more features, because I'm not sure there
is consensus on accepting them. I have no problem with introducing the
group list later.

>>
>> Server-like processes in many cases need credentials and other
>> metadata of the peer, to decide if the calling process is allowed to
>> request a specific action, or the server just wants to log away this
>> type of information for auditing tasks.
>>
>> The current practice to retrieve such process metadata is to look that
>> information up in procfs with the $PID received over SCM_CREDENTIALS.
>> This is sufficient for long-running tasks, but introduces a race which
>> cannot be worked around for short-living processes; the calling
>> process and all the information in /proc/$PID/ is gone before the
>> receiver of the socket message can look it up.
>>
>> Changes introduced in this patchset can also increase performance
>> of such server-like processes, because current way of opening and
>> parsing /proc/$PID/* files is much more expensive than receiving these
>> metadata using SCM.
>>
>> Changes in v4:
>> - Rebased to work with the latest net-next tree
>>
>> Changes in v3:
>> - Better description of patches (Thanks to Kay Sievers)
>>
>> Changes in v2:
>> - use PATH_MAX instead of PAGE_SIZE in SCM_CGROUP patch
>> - describe each patch individually
>>
>> Jan Kaluza (3):
>>    Send loginuid and sessionid in SCM_AUDIT
>>    Send comm and cmdline in SCM_PROCINFO
>>    Send cgroup_path in SCM_CGROUP
>>
>>   include/linux/socket.h |  9 ++++++
>>   include/net/af_unix.h  | 10 ++++++
>>   include/net/scm.h      | 67 ++++++++++++++++++++++++++++++++++++++--
>>   net/core/scm.c         | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/unix/af_unix.c     | 70 ++++++++++++++++++++++++++++++++++++++++++
>>   5 files changed, 237 insertions(+), 2 deletions(-)
>>
>


* Re: [PATCH RFC v2 0/13] vti4: prepare namespace and interfamily support.
From: Steffen Klassert @ 2014-01-14  7:51 UTC (permalink / raw)
  To: Christophe Gouault; +Cc: netdev, Saurabh Mohan
In-Reply-To: <52CC2714.5080600@6wind.com>

On Tue, Jan 07, 2014 at 05:11:00PM +0100, Christophe Gouault wrote:
> 
> Sorry for my late comments, I had to delay my tests due to Christmas and
> New Year's celebrations.

Sorry for the delay on my side, I had to set up a test case
for vti with namespaces first.

> 
> I have a few comments about your proposed patches:
> 
> In input, the vti tunnel processing does not follow the usual tunnel
> processing. Conventionally, the packets are first decapsulated, then
> only the skbuff interface is changed to the tunnel interface. In the vti
> code, the interface is changed before IPsec decryption, hence before
> decapsulation.
> 
> It results in a configuration asymmetry when we later support cross
> netns: the outer SAs and SPs must be defined in the outer netns, while
> the inner SAs and SPs must be defined in the inner netns.

You are absolutely right here. I'll change this to do the namespace
transition after the decapsulation in the vti_rcv_cb() callback.
Then both inbound and outbound states/policies must be defined in the
outer namespace. I'll send another RFC version of this patchset within
the next few days.

Thanks for pointing this out!


* Re: [Patch net-next] bridge: move br_net_exit() to br.c
From: David Miller @ 2014-01-14  7:43 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, stephen
In-Reply-To: <1389391127-3881-1-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Fri, 10 Jan 2014 13:58:47 -0800

> And it can become static.
> 
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied, thanks.


* [PATCH net-next 06/10] vxlan: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chains indicate that vxlan_fdb_parse() is
under rtnl_lock protection. So if we use __dev_get_by_index()
instead of dev_get_by_index() to look up the interface in it,
we avoid having to change the interface reference counter.

rtnetlink_rcv()
  rtnl_lock()
  netlink_rcv_skb()
    rtnl_fdb_add()
      vxlan_fdb_add()
        vxlan_fdb_parse()
  rtnl_unlock()

rtnetlink_rcv()
  rtnl_lock()
  netlink_rcv_skb()
    rtnl_fdb_del()
      vxlan_fdb_del()
        vxlan_fdb_parse()
  rtnl_unlock()

Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 drivers/net/vxlan.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 481f85d..8c40802 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -741,10 +741,9 @@ static int vxlan_fdb_parse(struct nlattr *tb[], struct vxlan_dev *vxlan,
 		if (nla_len(tb[NDA_IFINDEX]) != sizeof(u32))
 			return -EINVAL;
 		*ifindex = nla_get_u32(tb[NDA_IFINDEX]);
-		tdev = dev_get_by_index(net, *ifindex);
+		tdev = __dev_get_by_index(net, *ifindex);
 		if (!tdev)
 			return -EADDRNOTAVAIL;
-		dev_put(tdev);
 	} else {
 		*ifindex = 0;
 	}
-- 
1.7.9.5


* [PATCH net-next 01/10] Drivers: Staging: cxt1e1: use __dev_get_by_name instead of dev_get_by_name to find interfaces
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain shows that both do_reset() and do_del_chan()
are protected by rtnl_lock. If we use __dev_get_by_name() instead of
dev_get_by_name() to look up the interfaces in them, we avoid having
to change the interface reference counter.

dev_ioctl()
  rtnl_lock()
  dev_ifsioc()
    c4_ioctl()
      do_reset()
      do_del_chan()

Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 drivers/staging/cxt1e1/linux.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/cxt1e1/linux.c b/drivers/staging/cxt1e1/linux.c
index 9b48373..4a08e16 100644
--- a/drivers/staging/cxt1e1/linux.c
+++ b/drivers/staging/cxt1e1/linux.c
@@ -770,9 +770,9 @@ do_del_chan (struct net_device *musycc_dev, void *data)
     if (cp.channum > 999)
         return -EINVAL;
     snprintf (buf, sizeof(buf), CHANNAME "%d", cp.channum);
-    if (!(dev = dev_get_by_name (&init_net, buf)))
-        return -ENOENT;
-    dev_put (dev);
+	dev = __dev_get_by_name(&init_net, buf);
+	if (!dev)
+		return -ENODEV;
     ret = do_deluser (dev, 1);
     if (ret)
         return ret;
@@ -792,19 +792,18 @@ do_reset (struct net_device *musycc_dev, void *data)
         char        buf[sizeof (CHANNAME) + 3];
 
         sprintf (buf, CHANNAME "%d", i);
-        if (!(ndev = dev_get_by_name(&init_net, buf)))
-            continue;
+	ndev = __dev_get_by_name(&init_net, buf);
+	if (!ndev)
+		continue;
         priv = dev_to_hdlc (ndev)->priv;
 
         if ((unsigned long) (priv->ci) ==
             (unsigned long) (netdev_priv(musycc_dev)))
         {
             ndev->flags &= ~IFF_UP;
-            dev_put (ndev);
             netif_stop_queue (ndev);
             do_deluser (ndev, 1);
-        } else
-            dev_put (ndev);
+	}
     }
     return 0;
 }
-- 
1.7.9.5


* [PATCH net-next 03/10] eql: use __dev_get_by_name instead of dev_get_by_name to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain indicates that eql_ioctl(), eql_enslave(),
eql_emancipate(), eql_g_slave_cfg() and eql_s_slave_cfg() are
protected by rtnl_lock. So if we use __dev_get_by_name() instead
of dev_get_by_name() to look up the interfaces in them, we avoid
having to change the interface reference counters.

dev_ioctl()
  rtnl_lock()
    dev_ifsioc()
      eql_ioctl()
        eql_enslave()
        eql_emancipate()
        eql_g_slave_cfg()
        eql_s_slave_cfg()
  rtnl_unlock()

Additionally, we change their return values from -EINVAL to
-ENODEV when the interface is not found.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 drivers/net/eql.c |   95 +++++++++++++++++++++++------------------------------
 1 file changed, 42 insertions(+), 53 deletions(-)

diff --git a/drivers/net/eql.c b/drivers/net/eql.c
index f219d38..7a79b60 100644
--- a/drivers/net/eql.c
+++ b/drivers/net/eql.c
@@ -395,6 +395,7 @@ static int __eql_insert_slave(slave_queue_t *queue, slave_t *slave)
 		if (duplicate_slave)
 			eql_kill_one_slave(queue, duplicate_slave);
 
+		dev_hold(slave->dev);
 		list_add(&slave->list, &queue->all_slaves);
 		queue->num_slaves++;
 		slave->dev->flags |= IFF_SLAVE;
@@ -413,39 +414,35 @@ static int eql_enslave(struct net_device *master_dev, slaving_request_t __user *
 	if (copy_from_user(&srq, srqp, sizeof (slaving_request_t)))
 		return -EFAULT;
 
-	slave_dev  = dev_get_by_name(&init_net, srq.slave_name);
-	if (slave_dev) {
-		if ((master_dev->flags & IFF_UP) == IFF_UP) {
-			/* slave is not a master & not already a slave: */
-			if (!eql_is_master(slave_dev) &&
-			    !eql_is_slave(slave_dev)) {
-				slave_t *s = kmalloc(sizeof(*s), GFP_KERNEL);
-				equalizer_t *eql = netdev_priv(master_dev);
-				int ret;
-
-				if (!s) {
-					dev_put(slave_dev);
-					return -ENOMEM;
-				}
-
-				memset(s, 0, sizeof(*s));
-				s->dev = slave_dev;
-				s->priority = srq.priority;
-				s->priority_bps = srq.priority;
-				s->priority_Bps = srq.priority / 8;
-
-				spin_lock_bh(&eql->queue.lock);
-				ret = __eql_insert_slave(&eql->queue, s);
-				if (ret) {
-					dev_put(slave_dev);
-					kfree(s);
-				}
-				spin_unlock_bh(&eql->queue.lock);
-
-				return ret;
-			}
+	slave_dev = __dev_get_by_name(&init_net, srq.slave_name);
+	if (!slave_dev)
+		return -ENODEV;
+
+	if ((master_dev->flags & IFF_UP) == IFF_UP) {
+		/* slave is not a master & not already a slave: */
+		if (!eql_is_master(slave_dev) && !eql_is_slave(slave_dev)) {
+			slave_t *s = kmalloc(sizeof(*s), GFP_KERNEL);
+			equalizer_t *eql = netdev_priv(master_dev);
+			int ret;
+
+			if (!s)
+				return -ENOMEM;
+
+			memset(s, 0, sizeof(*s));
+			s->dev = slave_dev;
+			s->priority = srq.priority;
+			s->priority_bps = srq.priority;
+			s->priority_Bps = srq.priority / 8;
+
+			spin_lock_bh(&eql->queue.lock);
+			ret = __eql_insert_slave(&eql->queue, s);
+			if (ret)
+				kfree(s);
+
+			spin_unlock_bh(&eql->queue.lock);
+
+			return ret;
 		}
-		dev_put(slave_dev);
 	}
 
 	return -EINVAL;
@@ -461,24 +458,20 @@ static int eql_emancipate(struct net_device *master_dev, slaving_request_t __use
 	if (copy_from_user(&srq, srqp, sizeof (slaving_request_t)))
 		return -EFAULT;
 
-	slave_dev = dev_get_by_name(&init_net, srq.slave_name);
-	ret = -EINVAL;
-	if (slave_dev) {
-		spin_lock_bh(&eql->queue.lock);
-
-		if (eql_is_slave(slave_dev)) {
-			slave_t *slave = __eql_find_slave_dev(&eql->queue,
-							      slave_dev);
+	slave_dev = __dev_get_by_name(&init_net, srq.slave_name);
+	if (!slave_dev)
+		return -ENODEV;
 
-			if (slave) {
-				eql_kill_one_slave(&eql->queue, slave);
-				ret = 0;
-			}
+	ret = -EINVAL;
+	spin_lock_bh(&eql->queue.lock);
+	if (eql_is_slave(slave_dev)) {
+		slave_t *slave = __eql_find_slave_dev(&eql->queue, slave_dev);
+		if (slave) {
+			eql_kill_one_slave(&eql->queue, slave);
+			ret = 0;
 		}
-		dev_put(slave_dev);
-
-		spin_unlock_bh(&eql->queue.lock);
 	}
+	spin_unlock_bh(&eql->queue.lock);
 
 	return ret;
 }
@@ -494,7 +487,7 @@ static int eql_g_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
 	if (copy_from_user(&sc, scp, sizeof (slave_config_t)))
 		return -EFAULT;
 
-	slave_dev = dev_get_by_name(&init_net, sc.slave_name);
+	slave_dev = __dev_get_by_name(&init_net, sc.slave_name);
 	if (!slave_dev)
 		return -ENODEV;
 
@@ -510,8 +503,6 @@ static int eql_g_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
 	}
 	spin_unlock_bh(&eql->queue.lock);
 
-	dev_put(slave_dev);
-
 	if (!ret && copy_to_user(scp, &sc, sizeof (slave_config_t)))
 		ret = -EFAULT;
 
@@ -529,7 +520,7 @@ static int eql_s_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
 	if (copy_from_user(&sc, scp, sizeof (slave_config_t)))
 		return -EFAULT;
 
-	slave_dev = dev_get_by_name(&init_net, sc.slave_name);
+	slave_dev = __dev_get_by_name(&init_net, sc.slave_name);
 	if (!slave_dev)
 		return -ENODEV;
 
@@ -548,8 +539,6 @@ static int eql_s_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
 	}
 	spin_unlock_bh(&eql->queue.lock);
 
-	dev_put(slave_dev);
-
 	return ret;
 }
 
-- 
1.7.9.5


* [PATCH net-next 10/10] net: nl80211: __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

As __cfg80211_rdev_from_attrs(), nl80211_dump_wiphy_parse() and
nl80211_set_wiphy() are all under rtnl_lock protection,
__dev_get_by_index() should be used instead of dev_get_by_index()
to look up the interface in them, which lets us avoid changing the
interface reference counter.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/wireless/nl80211.c |  100 +++++++++++++++++-------------------------------
 1 file changed, 36 insertions(+), 64 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index b4f40fe..0d4d7fd 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -165,7 +165,7 @@ __cfg80211_rdev_from_attrs(struct net *netns, struct nlattr **attrs)
 
 	if (attrs[NL80211_ATTR_IFINDEX]) {
 		int ifindex = nla_get_u32(attrs[NL80211_ATTR_IFINDEX]);
-		netdev = dev_get_by_index(netns, ifindex);
+		netdev = __dev_get_by_index(netns, ifindex);
 		if (netdev) {
 			if (netdev->ieee80211_ptr)
 				tmp = wiphy_to_dev(
@@ -173,8 +173,6 @@ __cfg80211_rdev_from_attrs(struct net *netns, struct nlattr **attrs)
 			else
 				tmp = NULL;
 
-			dev_put(netdev);
-
 			/* not wireless device -- return error */
 			if (!tmp)
 				return ERR_PTR(-EINVAL);
@@ -1656,7 +1654,7 @@ static int nl80211_dump_wiphy_parse(struct sk_buff *skb,
 		struct cfg80211_registered_device *rdev;
 		int ifidx = nla_get_u32(tb[NL80211_ATTR_IFINDEX]);
 
-		netdev = dev_get_by_index(sock_net(skb->sk), ifidx);
+		netdev = __dev_get_by_index(sock_net(skb->sk), ifidx);
 		if (!netdev)
 			return -ENODEV;
 		if (netdev->ieee80211_ptr) {
@@ -1664,7 +1662,6 @@ static int nl80211_dump_wiphy_parse(struct sk_buff *skb,
 				netdev->ieee80211_ptr->wiphy);
 			state->filter_wiphy = rdev->wiphy_idx;
 		}
-		dev_put(netdev);
 	}
 
 	return 0;
@@ -1987,7 +1984,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 	if (info->attrs[NL80211_ATTR_IFINDEX]) {
 		int ifindex = nla_get_u32(info->attrs[NL80211_ATTR_IFINDEX]);
 
-		netdev = dev_get_by_index(genl_info_net(info), ifindex);
+		netdev = __dev_get_by_index(genl_info_net(info), ifindex);
 		if (netdev && netdev->ieee80211_ptr)
 			rdev = wiphy_to_dev(netdev->ieee80211_ptr->wiphy);
 		else
@@ -2015,32 +2012,24 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 			rdev, nla_data(info->attrs[NL80211_ATTR_WIPHY_NAME]));
 
 	if (result)
-		goto bad_res;
+		return result;
 
 	if (info->attrs[NL80211_ATTR_WIPHY_TXQ_PARAMS]) {
 		struct ieee80211_txq_params txq_params;
 		struct nlattr *tb[NL80211_TXQ_ATTR_MAX + 1];
 
-		if (!rdev->ops->set_txq_params) {
-			result = -EOPNOTSUPP;
-			goto bad_res;
-		}
+		if (!rdev->ops->set_txq_params)
+			return -EOPNOTSUPP;
 
-		if (!netdev) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		if (!netdev)
+			return -EINVAL;
 
 		if (netdev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP &&
-		    netdev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		    netdev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO)
+			return -EINVAL;
 
-		if (!netif_running(netdev)) {
-			result = -ENETDOWN;
-			goto bad_res;
-		}
+		if (!netif_running(netdev))
+			return -ENETDOWN;
 
 		nla_for_each_nested(nl_txq_params,
 				    info->attrs[NL80211_ATTR_WIPHY_TXQ_PARAMS],
@@ -2051,12 +2040,12 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 				  txq_params_policy);
 			result = parse_txq_params(tb, &txq_params);
 			if (result)
-				goto bad_res;
+				return result;
 
 			result = rdev_set_txq_params(rdev, netdev,
 						     &txq_params);
 			if (result)
-				goto bad_res;
+				return result;
 		}
 	}
 
@@ -2065,7 +2054,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 				nl80211_can_set_dev_channel(wdev) ? wdev : NULL,
 				info);
 		if (result)
-			goto bad_res;
+			return result;
 	}
 
 	if (info->attrs[NL80211_ATTR_WIPHY_TX_POWER_SETTING]) {
@@ -2076,19 +2065,15 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 		if (!(rdev->wiphy.features & NL80211_FEATURE_VIF_TXPOWER))
 			txp_wdev = NULL;
 
-		if (!rdev->ops->set_tx_power) {
-			result = -EOPNOTSUPP;
-			goto bad_res;
-		}
+		if (!rdev->ops->set_tx_power)
+			return -EOPNOTSUPP;
 
 		idx = NL80211_ATTR_WIPHY_TX_POWER_SETTING;
 		type = nla_get_u32(info->attrs[idx]);
 
 		if (!info->attrs[NL80211_ATTR_WIPHY_TX_POWER_LEVEL] &&
-		    (type != NL80211_TX_POWER_AUTOMATIC)) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		    (type != NL80211_TX_POWER_AUTOMATIC))
+			return -EINVAL;
 
 		if (type != NL80211_TX_POWER_AUTOMATIC) {
 			idx = NL80211_ATTR_WIPHY_TX_POWER_LEVEL;
@@ -2097,7 +2082,7 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 
 		result = rdev_set_tx_power(rdev, txp_wdev, type, mbm);
 		if (result)
-			goto bad_res;
+			return result;
 	}
 
 	if (info->attrs[NL80211_ATTR_WIPHY_ANTENNA_TX] &&
@@ -2105,10 +2090,8 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 		u32 tx_ant, rx_ant;
 		if ((!rdev->wiphy.available_antennas_tx &&
 		     !rdev->wiphy.available_antennas_rx) ||
-		    !rdev->ops->set_antenna) {
-			result = -EOPNOTSUPP;
-			goto bad_res;
-		}
+		    !rdev->ops->set_antenna)
+			return -EOPNOTSUPP;
 
 		tx_ant = nla_get_u32(info->attrs[NL80211_ATTR_WIPHY_ANTENNA_TX]);
 		rx_ant = nla_get_u32(info->attrs[NL80211_ATTR_WIPHY_ANTENNA_RX]);
@@ -2116,17 +2099,15 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 		/* reject antenna configurations which don't match the
 		 * available antenna masks, except for the "all" mask */
 		if ((~tx_ant && (tx_ant & ~rdev->wiphy.available_antennas_tx)) ||
-		    (~rx_ant && (rx_ant & ~rdev->wiphy.available_antennas_rx))) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		    (~rx_ant && (rx_ant & ~rdev->wiphy.available_antennas_rx)))
+			return -EINVAL;
 
 		tx_ant = tx_ant & rdev->wiphy.available_antennas_tx;
 		rx_ant = rx_ant & rdev->wiphy.available_antennas_rx;
 
 		result = rdev_set_antenna(rdev, tx_ant, rx_ant);
 		if (result)
-			goto bad_res;
+			return result;
 	}
 
 	changed = 0;
@@ -2134,30 +2115,27 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 	if (info->attrs[NL80211_ATTR_WIPHY_RETRY_SHORT]) {
 		retry_short = nla_get_u8(
 			info->attrs[NL80211_ATTR_WIPHY_RETRY_SHORT]);
-		if (retry_short == 0) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		if (retry_short == 0)
+			return -EINVAL;
+
 		changed |= WIPHY_PARAM_RETRY_SHORT;
 	}
 
 	if (info->attrs[NL80211_ATTR_WIPHY_RETRY_LONG]) {
 		retry_long = nla_get_u8(
 			info->attrs[NL80211_ATTR_WIPHY_RETRY_LONG]);
-		if (retry_long == 0) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		if (retry_long == 0)
+			return -EINVAL;
+
 		changed |= WIPHY_PARAM_RETRY_LONG;
 	}
 
 	if (info->attrs[NL80211_ATTR_WIPHY_FRAG_THRESHOLD]) {
 		frag_threshold = nla_get_u32(
 			info->attrs[NL80211_ATTR_WIPHY_FRAG_THRESHOLD]);
-		if (frag_threshold < 256) {
-			result = -EINVAL;
-			goto bad_res;
-		}
+		if (frag_threshold < 256)
+			return -EINVAL;
+
 		if (frag_threshold != (u32) -1) {
 			/*
 			 * Fragments (apart from the last one) are required to
@@ -2187,10 +2165,8 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 		u32 old_frag_threshold, old_rts_threshold;
 		u8 old_coverage_class;
 
-		if (!rdev->ops->set_wiphy_params) {
-			result = -EOPNOTSUPP;
-			goto bad_res;
-		}
+		if (!rdev->ops->set_wiphy_params)
+			return -EOPNOTSUPP;
 
 		old_retry_short = rdev->wiphy.retry_short;
 		old_retry_long = rdev->wiphy.retry_long;
@@ -2218,10 +2194,6 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info)
 			rdev->wiphy.coverage_class = old_coverage_class;
 		}
 	}
-
- bad_res:
-	if (netdev)
-		dev_put(netdev);
 	return result;
 }
 
-- 
1.7.9.5


* [PATCH net-next 09/10] can: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

As cgw_create_job() is always under rtnl_lock protection,
__dev_get_by_index() should be used instead of dev_get_by_index() to
look up the interfaces in it, which lets us avoid changing the
interface reference counter.

Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/can/gw.c |   15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/net/can/gw.c b/net/can/gw.c
index 88c8a39..ac31891 100644
--- a/net/can/gw.c
+++ b/net/can/gw.c
@@ -839,21 +839,21 @@ static int cgw_create_job(struct sk_buff *skb,  struct nlmsghdr *nlh)
 	if (!gwj->ccgw.src_idx || !gwj->ccgw.dst_idx)
 		goto out;
 
-	gwj->src.dev = dev_get_by_index(&init_net, gwj->ccgw.src_idx);
+	gwj->src.dev = __dev_get_by_index(&init_net, gwj->ccgw.src_idx);
 
 	if (!gwj->src.dev)
 		goto out;
 
 	if (gwj->src.dev->type != ARPHRD_CAN)
-		goto put_src_out;
+		goto out;
 
-	gwj->dst.dev = dev_get_by_index(&init_net, gwj->ccgw.dst_idx);
+	gwj->dst.dev = __dev_get_by_index(&init_net, gwj->ccgw.dst_idx);
 
 	if (!gwj->dst.dev)
-		goto put_src_out;
+		goto out;
 
 	if (gwj->dst.dev->type != ARPHRD_CAN)
-		goto put_src_dst_out;
+		goto out;
 
 	gwj->limit_hops = limhops;
 
@@ -862,11 +862,6 @@ static int cgw_create_job(struct sk_buff *skb,  struct nlmsghdr *nlh)
 	err = cgw_register_filter(gwj);
 	if (!err)
 		hlist_add_head_rcu(&gwj->list, &cgw_list);
-
-put_src_dst_out:
-	dev_put(gwj->dst.dev);
-put_src_out:
-	dev_put(gwj->src.dev);
 out:
 	if (err)
 		kmem_cache_free(cgw_cache, gwj);
-- 
1.7.9.5


* [PATCH net-next 08/10] caif: __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain indicates that chnl_net_open() is under
rtnl_lock protection, as __dev_open() is protected by rtnl_lock.
So if __dev_get_by_index() is used instead of dev_get_by_index()
to look up the interface in it, we avoid having to change the
interface reference counter.

__dev_open()
  chnl_net_open()

Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/caif/chnl_net.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/caif/chnl_net.c b/net/caif/chnl_net.c
index 7344a8f..4589ff67 100644
--- a/net/caif/chnl_net.c
+++ b/net/caif/chnl_net.c
@@ -285,7 +285,7 @@ static int chnl_net_open(struct net_device *dev)
 				goto error;
 		}
 
-		lldev = dev_get_by_index(dev_net(dev), llifindex);
+		lldev = __dev_get_by_index(dev_net(dev), llifindex);
 
 		if (lldev == NULL) {
 			pr_debug("no interface?\n");
@@ -307,7 +307,6 @@ static int chnl_net_open(struct net_device *dev)
 		mtu = min_t(int, dev->mtu, lldev->mtu - (headroom + tailroom));
 		mtu = min_t(int, GPRS_PDP_MTU, mtu);
 		dev_set_mtu(dev, mtu);
-		dev_put(lldev);
 
 		if (mtu < 100) {
 			pr_warn("CAIF Interface MTU too small (%d)\n", mtu);
-- 
1.7.9.5


* [PATCH net-next 07/10] batman-adv: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain indicates that batadv_is_on_batman_iface()
is always under rtnl_lock protection, as call_netdevice_notifier()
is protected by rtnl_lock. So if __dev_get_by_index() is used rather
than dev_get_by_index() to look up the interface in it, we avoid
having to change the interface reference counter.

call_netdevice_notifier()
  batadv_hard_if_event()
    batadv_hardif_add_interface()
      batadv_is_valid_iface()
        batadv_is_on_batman_iface()

Cc: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/batman-adv/hard-interface.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index bebd46c..115d14e 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -86,15 +86,13 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
 		return false;
 
 	/* recurse over the parent device */
-	parent_dev = dev_get_by_index(&init_net, net_dev->iflink);
+	parent_dev = __dev_get_by_index(&init_net, net_dev->iflink);
 	/* if we got a NULL parent_dev there is something broken.. */
 	if (WARN(!parent_dev, "Cannot find parent device"))
 		return false;
 
 	ret = batadv_is_on_batman_iface(parent_dev);
 
-	if (parent_dev)
-		dev_put(parent_dev);
 	return ret;
 }
 
-- 
1.7.9.5


* [PATCH net-next 05/10] decnet: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

From the following call chain we can see that dn_cache_getroute() is
protected by rtnl_lock. So if we use __dev_get_by_index() instead
of dev_get_by_index() to look up the interface in it, we avoid having
to change the interface reference counter.

rtnetlink_rcv()
  rtnl_lock()
    netlink_rcv_skb()
      dn_cache_getroute()
  rtnl_unlock()

Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/decnet/dn_route.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index ad2efa5..22390e4 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1666,12 +1666,12 @@ static int dn_cache_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
 
 	if (fld.flowidn_iif) {
 		struct net_device *dev;
-		if ((dev = dev_get_by_index(&init_net, fld.flowidn_iif)) == NULL) {
+		dev = __dev_get_by_index(&init_net, fld.flowidn_iif);
+		if (!dev) {
 			kfree_skb(skb);
 			return -ENODEV;
 		}
 		if (!dev->dn_ptr) {
-			dev_put(dev);
 			kfree_skb(skb);
 			return -ENODEV;
 		}
@@ -1693,8 +1693,6 @@ static int dn_cache_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
 		err = dn_route_output_key((struct dst_entry **)&rt, &fld, 0);
 	}
 
-	if (skb->dev)
-		dev_put(skb->dev);
 	skb->dev = NULL;
 	if (err)
 		goto out_free;
-- 
1.7.9.5

* [PATCH net-next 04/10] dcb: use __dev_get_by_name instead of dev_get_by_name to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain indicates that dcb_doit() is protected
by rtnl_lock. So if we use __dev_get_by_name() instead of
dev_get_by_name() to find the interface in it, we avoid changing
the interface reference counter.

rtnetlink_rcv()
  rtnl_lock()
  netlink_rcv_skb()
    dcb_doit()
  rtnl_unlock()

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/dcb/dcbnl.c |   15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 66fbe19..5536444 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -1688,21 +1688,17 @@ static int dcb_doit(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if (!tb[DCB_ATTR_IFNAME])
 		return -EINVAL;
 
-	netdev = dev_get_by_name(net, nla_data(tb[DCB_ATTR_IFNAME]));
+	netdev = __dev_get_by_name(net, nla_data(tb[DCB_ATTR_IFNAME]));
 	if (!netdev)
 		return -ENODEV;
 
-	if (!netdev->dcbnl_ops) {
-		ret = -EOPNOTSUPP;
-		goto out;
-	}
+	if (!netdev->dcbnl_ops)
+		return -EOPNOTSUPP;
 
 	reply_skb = dcbnl_newmsg(fn->type, dcb->cmd, portid, nlh->nlmsg_seq,
 				 nlh->nlmsg_flags, &reply_nlh);
-	if (!reply_skb) {
-		ret = -ENOBUFS;
-		goto out;
-	}
+	if (!reply_skb)
+		return -ENOBUFS;
 
 	ret = fn->cb(netdev, nlh, nlh->nlmsg_seq, tb, reply_skb);
 	if (ret < 0) {
@@ -1714,7 +1710,6 @@ static int dcb_doit(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 	ret = rtnl_unicast(reply_skb, net, portid);
 out:
-	dev_put(netdev);
 	return ret;
 }
 
-- 
1.7.9.5

* [PATCH net-next 02/10] bonding: use __dev_get_by_name instead of dev_get_by_name to find interface
From: Ying Xue @ 2014-01-14  7:41 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-1-git-send-email-ying.xue@windriver.com>

The following call chain indicates that bond_do_ioctl() is protected
by rtnl_lock. If we use __dev_get_by_name() instead of
dev_get_by_name() to find the interface in it, we avoid changing
the interface reference counter.

dev_ioctl()
  rtnl_lock()
  dev_ifsioc()
    bond_do_ioctl()
  rtnl_unlock()

Additionally, restructure bond_do_ioctl() to return early when the
slave device is not found, which makes it more readable.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 drivers/net/bonding/bond_main.c |   49 ++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e06c445..a69afbf 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3213,37 +3213,34 @@ static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd
 	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 
-	slave_dev = dev_get_by_name(net, ifr->ifr_slave);
+	slave_dev = __dev_get_by_name(net, ifr->ifr_slave);
 
 	pr_debug("slave_dev=%p:\n", slave_dev);
 
 	if (!slave_dev)
-		res = -ENODEV;
-	else {
-		pr_debug("slave_dev->name=%s:\n", slave_dev->name);
-		switch (cmd) {
-		case BOND_ENSLAVE_OLD:
-		case SIOCBONDENSLAVE:
-			res = bond_enslave(bond_dev, slave_dev);
-			break;
-		case BOND_RELEASE_OLD:
-		case SIOCBONDRELEASE:
-			res = bond_release(bond_dev, slave_dev);
-			break;
-		case BOND_SETHWADDR_OLD:
-		case SIOCBONDSETHWADDR:
-			bond_set_dev_addr(bond_dev, slave_dev);
-			res = 0;
-			break;
-		case BOND_CHANGE_ACTIVE_OLD:
-		case SIOCBONDCHANGEACTIVE:
-			res = bond_option_active_slave_set(bond, slave_dev);
-			break;
-		default:
-			res = -EOPNOTSUPP;
-		}
+		return -ENODEV;
 
-		dev_put(slave_dev);
+	pr_debug("slave_dev->name=%s:\n", slave_dev->name);
+	switch (cmd) {
+	case BOND_ENSLAVE_OLD:
+	case SIOCBONDENSLAVE:
+		res = bond_enslave(bond_dev, slave_dev);
+		break;
+	case BOND_RELEASE_OLD:
+	case SIOCBONDRELEASE:
+		res = bond_release(bond_dev, slave_dev);
+		break;
+	case BOND_SETHWADDR_OLD:
+	case SIOCBONDSETHWADDR:
+		bond_set_dev_addr(bond_dev, slave_dev);
+		res = 0;
+		break;
+	case BOND_CHANGE_ACTIVE_OLD:
+	case SIOCBONDCHANGEACTIVE:
+		res = bond_option_active_slave_set(bond, slave_dev);
+		break;
+	default:
+		res = -EOPNOTSUPP;
 	}
 
 	return res;
-- 
1.7.9.5

* [PATCH net-next 00/10] use appropriate APIs to get interfaces
From: Ying Xue @ 2014-01-14  7:40 UTC (permalink / raw)
  To: davem
  Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
	socketcan, johannes, netdev, linux-kernel

Under rtnl_lock protection, we should use __dev_get_by_name() and
__dev_get_by_index() rather than dev_get_by_name() and
dev_get_by_index() to find interfaces, because the former do not
take a reference and so avoid changing the interface reference
counter.
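
To illustrate the difference, here is a minimal userspace sketch. It
is not kernel code: struct mock_dev, mock_rtnl_lock() and the other
mock_* names are hypothetical stand-ins, and a plain flag models
rtnl_lock. It only shows why a lookup made with the lock held needs
no matching put on any exit path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct mock_dev {
	int ifindex;
	int refcnt;
};

/* A flag models rtnl_lock; a real lock would be a mutex. */
static bool mock_rtnl_held;
static struct mock_dev mock_devs[] = { { 1, 1 }, { 2, 1 } };

static void mock_rtnl_lock(void)
{
	assert(!mock_rtnl_held);
	mock_rtnl_held = true;
}

static void mock_rtnl_unlock(void)
{
	assert(mock_rtnl_held);
	mock_rtnl_held = false;
}

static struct mock_dev *mock_find(int ifindex)
{
	size_t i;

	for (i = 0; i < sizeof(mock_devs) / sizeof(mock_devs[0]); i++)
		if (mock_devs[i].ifindex == ifindex)
			return &mock_devs[i];
	return NULL;
}

/* Modeled on __dev_get_by_index(): lock must be held, no reference taken. */
static struct mock_dev *mock_dev_lookup_locked(int ifindex)
{
	assert(mock_rtnl_held);
	return mock_find(ifindex);
}

/* Modeled on dev_get_by_index(): takes a reference the caller must drop. */
static struct mock_dev *mock_dev_lookup_get(int ifindex)
{
	struct mock_dev *dev = mock_find(ifindex);

	if (dev)
		dev->refcnt++;
	return dev;
}

/* Modeled on dev_put(): drops a reference taken by mock_dev_lookup_get(). */
static void mock_dev_put(struct mock_dev *dev)
{
	dev->refcnt--;
}

/* A handler that is always called with the mock lock held. */
static int mock_handler(int ifindex)
{
	struct mock_dev *dev = mock_dev_lookup_locked(ifindex);

	if (!dev)
		return -1;	/* nothing to put on the error path */
	return dev->refcnt;	/* unchanged by the lookup */
}
```

A caller would wrap mock_handler() in mock_rtnl_lock() and
mock_rtnl_unlock(), mirroring the call chains quoted in the
individual patches.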

Ying Xue (10):
  Drivers: Staging: cxt1e1: use __dev_get_name instead of dev_get_name
    to find interfaces
  bonding: use __dev_get_by_name instead of dev_get_by_name to find
    interface
  eql: use __dev_get_by_name instead of dev_get_by_name to find
    interface
  dcb: use __dev_get_by_name instead of dev_get_by_name to find
    interface
  decnet: use __dev_get_by_index instead of dev_get_by_index to find
    interface
  vxlan: use __dev_get_by_index instead of dev_get_by_index to find
    interface
  batman-adv: use __dev_get_by_index instead of dev_get_by_index to
    find interface
  caif: __dev_get_by_index instead of dev_get_by_index to find
    interface
  can: use __dev_get_by_index instead of dev_get_by_index to find
    interface
  net: nl80211: __dev_get_by_index instead of dev_get_by_index to find
    interface

 drivers/net/bonding/bond_main.c |   49 +++++++++----------
 drivers/net/eql.c               |   95 ++++++++++++++++---------------------
 drivers/net/vxlan.c             |    3 +-
 drivers/staging/cxt1e1/linux.c  |   15 +++---
 net/batman-adv/hard-interface.c |    4 +-
 net/caif/chnl_net.c             |    3 +-
 net/can/gw.c                    |   15 ++----
 net/dcb/dcbnl.c                 |   15 ++----
 net/decnet/dn_route.c           |    6 +--
 net/wireless/nl80211.c          |  100 ++++++++++++++-------------------------
 10 files changed, 123 insertions(+), 182 deletions(-)

-- 
1.7.9.5
