Netdev List
* Re: KASAN: use-after-free Read in vhost_chr_write_iter
From: Michael S. Tsirkin @ 2018-05-21 14:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: DaeRyong Jeong, kvm, virtualization, netdev, linux-kernel,
	byoungyoung, kt0755, bammanag
In-Reply-To: <fb27c1fd-5172-252a-cb8f-b53927a26d06@redhat.com>

On Mon, May 21, 2018 at 10:38:10AM +0800, Jason Wang wrote:
> 
> 
> On 2018-05-18 17:24, Jason Wang wrote:
> > 
> > 
> > On 2018-05-17 21:45, DaeRyong Jeong wrote:
> > > We report the crash: KASAN: use-after-free Read in vhost_chr_write_iter
> > > 
> > > This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
> > > version of Syzkaller), which we describe more at the end of this
> > > report. Our analysis shows that the race occurs when invoking two
> > > syscalls concurrently, write$vnet and ioctl$VHOST_RESET_OWNER.
> > > 
> > > 
> > > Analysis:
> > > We think the concurrent execution of vhost_process_iotlb_msg() and
> > > vhost_dev_cleanup() causes the crash.
> > > Both functions can run concurrently (please see the call sequence below),
> > > and there may be a race on dev->iotlb.
> > > If the switch occurs right after vhost_dev_cleanup() frees
> > > dev->iotlb, vhost_process_iotlb_msg() still sees the non-NULL value
> > > and keeps executing without returning -EFAULT. Consequently, a
> > > use-after-free occurs.
> > > 
> > > 
> > > Thread interleaving:
> > > CPU0 (vhost_process_iotlb_msg)                CPU1 (vhost_dev_cleanup)
> > > (In the case of both VHOST_IOTLB_UPDATE and
> > > VHOST_IOTLB_INVALIDATE)
> > > =====                            =====
> > >                             vhost_umem_clean(dev->iotlb);
> > > if (!dev->iotlb) {
> > >             ret = -EFAULT;
> > >                 break;
> > > }
> > >                             dev->iotlb = NULL;
> > > 
> > > 
> > > Call Sequence:
> > > CPU0
> > > =====
> > > vhost_net_chr_write_iter
> > >     vhost_chr_write_iter
> > >         vhost_process_iotlb_msg
> > > 
> > > CPU1
> > > =====
> > > vhost_net_ioctl
> > >     vhost_net_reset_owner
> > >         vhost_dev_reset_owner
> > >             vhost_dev_cleanup
> > 
> > Thanks a lot for the analysis.
> > 
> > This could be addressed by simply protecting it with the dev mutex.
> > 
> > Will post a patch.
> > 
> 
> Could you please help to test the attached patch? I've done some smoke
> testing.
> 
> Thanks

> From 88328386f3f652e684ee33dc4cf63dcaed871aea Mon Sep 17 00:00:00 2001
> From: Jason Wang <jasowang@redhat.com>
> Date: Fri, 18 May 2018 17:33:27 +0800
> Subject: [PATCH] vhost: synchronize IOTLB message with dev cleanup
> 
> DaeRyong Jeong reports a race between vhost_dev_cleanup() and
> vhost_process_iotlb_msg():
> 
> Thread interleaving:
> CPU0 (vhost_process_iotlb_msg)			CPU1 (vhost_dev_cleanup)
> (In the case of both VHOST_IOTLB_UPDATE and
> VHOST_IOTLB_INVALIDATE)
> =====						=====
> 						vhost_umem_clean(dev->iotlb);
> if (!dev->iotlb) {
> 	        ret = -EFAULT;
> 		        break;
> }
> 						dev->iotlb = NULL;
> 
> The reason is that we don't synchronize between them. Fix this by
> protecting vhost_process_iotlb_msg() with the dev mutex.
> 
> Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
> Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")

Long term we might want to move iotlb into vqs
so that messages can be processed in parallel.
Not sure how to do it yet.

> ---
>  drivers/vhost/vhost.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index f3bd8e9..f0be5f3 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -981,6 +981,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>  {
>  	int ret = 0;
>  
> +	mutex_lock(&dev->mutex);
>  	vhost_dev_lock_vqs(dev);
>  	switch (msg->type) {
>  	case VHOST_IOTLB_UPDATE:
> @@ -1016,6 +1017,8 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>  	}
>  
>  	vhost_dev_unlock_vqs(dev);
> +	mutex_unlock(&dev->mutex);
> +
>  	return ret;
>  }
>  ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
> -- 
> 2.7.4
> 
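
Editor's sketch of the locking pattern in the patch above. This is a userspace analogue, not the vhost code: struct demo_dev and the demo_* functions are hypothetical names, a pthread mutex stands in for dev->mutex, and it assumes the cleanup path runs under that same mutex. The point is only that the NULL check and the use of the table sit in one critical section, so cleanup cannot free the table between them.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct vhost_dev. */
struct demo_dev {
	pthread_mutex_t mutex;	/* plays the role of dev->mutex */
	int *iotlb;		/* plays the role of dev->iotlb */
};

/* Set up the mutex and allocate the table; 0 on success, -ENOMEM on failure. */
int demo_dev_init(struct demo_dev *dev)
{
	pthread_mutex_init(&dev->mutex, NULL);
	dev->iotlb = calloc(16, sizeof(*dev->iotlb));
	return dev->iotlb ? 0 : -ENOMEM;
}

/* Message path: the NULL check and the access share one critical section. */
int demo_process_msg(struct demo_dev *dev, int value)
{
	int ret = 0;

	pthread_mutex_lock(&dev->mutex);
	if (!dev->iotlb)
		ret = -EFAULT;
	else
		dev->iotlb[0] = value;
	pthread_mutex_unlock(&dev->mutex);
	return ret;
}

/* Cleanup path: free and clear the pointer under the same mutex. */
void demo_dev_cleanup(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->mutex);
	free(dev->iotlb);
	dev->iotlb = NULL;
	pthread_mutex_unlock(&dev->mutex);
}
```

With this ordering, a message processed after cleanup sees the NULL pointer and returns -EFAULT instead of touching freed memory.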


* Re: [PATCH net-next 5/7] net: dsa: qca8k: Allow overwriting CPU port setting
From: Andrew Lunn @ 2018-05-21 14:46 UTC (permalink / raw)
  To: Michal Vokáč
  Cc: netdev, michal.vokac, linux-kernel, devicetree, f.fainelli,
	vivien.didelot, mark.rutland, robh+dt, davem
In-Reply-To: <1526909293-56377-6-git-send-email-michal.vokac@ysoft.com>

On Mon, May 21, 2018 at 03:28:11PM +0200, Michal Vokáč wrote:
> Implement adjust_link function that allows to overwrite default CPU port
> setting using fixed-link device tree subnode.
> 
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH net-next 1/7] net: dsa: qca8k: Add QCA8334 binding documentation
From: Andrew Lunn @ 2018-05-21 14:47 UTC (permalink / raw)
  To: Michal Vokáč
  Cc: netdev, michal.vokac, linux-kernel, devicetree, f.fainelli,
	vivien.didelot, mark.rutland, robh+dt, davem
In-Reply-To: <1526909293-56377-2-git-send-email-michal.vokac@ysoft.com>

On Mon, May 21, 2018 at 03:28:07PM +0200, Michal Vokáč wrote:
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Hi Michal

It would be good to document that fixed-link can be used.

   Andrew


* Re: [PATCH net-next 6/7] net: dsa: qca8k: Replace GPL boilerplate by SPDX
From: Andrew Lunn @ 2018-05-21 14:47 UTC (permalink / raw)
  To: Michal Vokáč
  Cc: netdev, michal.vokac, linux-kernel, devicetree, f.fainelli,
	vivien.didelot, mark.rutland, robh+dt, davem
In-Reply-To: <1526909293-56377-7-git-send-email-michal.vokac@ysoft.com>

On Mon, May 21, 2018 at 03:28:12PM +0200, Michal Vokáč wrote:
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [PATCH net-next 7/7] net: dsa: qca8k: Remove rudundant parentheses
From: Andrew Lunn @ 2018-05-21 14:48 UTC (permalink / raw)
  To: Michal Vokáč
  Cc: netdev, michal.vokac, linux-kernel, devicetree, f.fainelli,
	vivien.didelot, mark.rutland, robh+dt, davem
In-Reply-To: <1526909293-56377-8-git-send-email-michal.vokac@ysoft.com>

On Mon, May 21, 2018 at 03:28:13PM +0200, Michal Vokáč wrote:
> Fix warning reported by checkpatch.
> 
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


* Re: [net-next PATCH v2 2/4] net: Enable Tx queue selection based on Rx queues
From: Tom Herbert @ 2018-05-21 14:51 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Amritha Nambiar, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck, Sridhar Samudrala, Eric Dumazet,
	Hannes Frederic Sowa
In-Reply-To: <CAF=yD-JghZY5NN6cHGdHeOTs8xb9KF=mQ=J2P49ojrvp+MsD8w@mail.gmail.com>

On Sat, May 19, 2018 at 1:27 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Sat, May 19, 2018 at 4:13 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Fri, May 18, 2018 at 12:03 AM, Tom Herbert <tom@herbertland.com> wrote:
>>> On Tue, May 15, 2018 at 6:26 PM, Amritha Nambiar
>>> <amritha.nambiar@intel.com> wrote:
>>>> This patch adds support to pick Tx queue based on the Rx queue map
>>>> configuration set by the admin through the sysfs attribute
>>>> for each Tx queue. If the user configuration for receive
>>>> queue map does not apply, then the Tx queue selection falls back
>>>> to CPU map based selection and finally to hashing.
>>>>
>>>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>> ---
>
>>>> +static int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
>>>> +{
>>>> +#ifdef CONFIG_XPS
>>>> +       enum xps_map_type i = XPS_MAP_RXQS;
>>>> +       struct xps_dev_maps *dev_maps;
>>>> +       struct sock *sk = skb->sk;
>>>> +       int queue_index = -1;
>>>> +       unsigned int tci = 0;
>>>> +
>>>> +       if (sk && sk->sk_rx_queue_mapping <= dev->real_num_rx_queues &&
>>>> +           dev->ifindex == sk->sk_rx_ifindex)
>>>> +               tci = sk->sk_rx_queue_mapping;
>>>> +
>>>> +       rcu_read_lock();
>>>> +       while (queue_index < 0 && i < __XPS_MAP_MAX) {
>>>> +               if (i == XPS_MAP_CPUS)
>>>
>>> This while loop typifies exactly why I don't think the XPS maps should
>>> be an array.
>>
>> +1
>
> as a matter of fact, as enabling both cpu and rxqueue map at the same
> time makes no sense, only one map is needed at any one time. The
> only difference is in how it is indexed. It should probably not be possible
> to configure both at the same time. Keeping a single map probably also
> significantly simplifies patch 1/4.

Willem,

I think it might make sense to have them both. Maybe one application
is spin polling and needs this, while others might be happy with
normal CPU mappings as default.

Tom


* Re: [PATCH net 3/4] virtio-net: reset num_buf to 1 after linearizing packet
From: Michael S. Tsirkin @ 2018-05-21 14:59 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtualization, netdev, linux-kernel
In-Reply-To: <1526891706-18516-4-git-send-email-jasowang@redhat.com>

On Mon, May 21, 2018 at 04:35:05PM +0800, Jason Wang wrote:
> If we successfully linearize the packet, num_buf is set to zero,
> which is wrong since we now have only 1 buffer to be used, e.g. in
> the error path of receive_mergeable(). A zero num_buf will lead the code
> to try to pop the buffers of the next packet and drop it. Fix this by
> setting num_buf to 1 if we successfully linearize the packet.
> 
> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/virtio_net.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 6260d65..165a922 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -722,6 +722,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  						      &len);
>  			if (!xdp_page)
>  				goto err_xdp;
> +			num_buf = 1;

So this is tweaked here for the benefit of err_skb below.
That's confusing and we won't remember to change it
if we change the error handling.

How about fixing the error path?


-        while (--num_buf) {
+        while (num_buf-- > 1) {

Seems more robust to me.


>  			offset = VIRTIO_XDP_HEADROOM;
>  		} else {
>  			xdp_page = page;
> -- 
> 2.7.4
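
Editor's sketch comparing the two error-path loop conditions discussed above. It only counts iterations. It assumes num_buf is a 16-bit unsigned value (the virtio header field it is read from is 16 bits wide), and the function names are hypothetical. The real loop also stops early when no more buffers can be fetched, which this sketch ignores.

```c
#include <stdint.h>

/* Iterations of the existing error-path loop: while (--num_buf) */
unsigned int drain_iters_old(uint16_t num_buf)
{
	unsigned int iters = 0;

	/* With num_buf == 0 the pre-decrement wraps to 65535. */
	while (--num_buf)
		iters++;
	return iters;
}

/* Iterations of the suggested loop: while (num_buf-- > 1) */
unsigned int drain_iters_new(uint16_t num_buf)
{
	unsigned int iters = 0;

	/* The comparison uses the value before the decrement. */
	while (num_buf-- > 1)
		iters++;
	return iters;
}
```

For num_buf of 1 or more the two loops run the same number of times (num_buf - 1). They differ only at zero: the old form wraps around and keeps going, while the suggested form does nothing, which is why it tolerates a zero num_buf without the extra assignment.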


* Re: [PATCH net-next] net:sched: add action inheritdsfield to skbmod
From: Jamal Hadi Salim @ 2018-05-21 15:00 UTC (permalink / raw)
  To: Fu, Qiaobin; +Cc: davem@davemloft.net, netdev@vger.kernel.org, Michel Machado
In-Reply-To: <DA5C727C-BAE1-4355-B67C-5F9C3769CA30@bu.edu>

On 21/05/18 10:42 AM, Fu, Qiaobin wrote:
> Hi Jamal,
> 
> I've tested my patch before publishing it here, and Nishanth is going to test it further with version 2 of the GKprio. I'm going to push a patch to the iproute2 repository to add support for "inheritdsfield".
> 

Thanks. I already acked the kernel patch. It looks good on its own.

Would you consider adding one or more tdc tests as well?

cheers,
jamal


* Re: [PATCH net 4/4] virito-net: fix leaking page for gso packet during mergeable XDP
From: Michael S. Tsirkin @ 2018-05-21 15:01 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, John Fastabend, linux-kernel, virtualization
In-Reply-To: <1526891706-18516-5-git-send-email-jasowang@redhat.com>

On Mon, May 21, 2018 at 04:35:06PM +0800, Jason Wang wrote:
> We need to drop the refcnt of xdp_page if we see a gso packet. Otherwise
> it will be leaked. Fix this by moving the check of gso packet above
> the linearizing logic.
> 
> Cc: John Fastabend <john.fastabend@gmail.com>
> Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

typo in subject

> ---
>  drivers/net/virtio_net.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 165a922..f8db809 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -707,6 +707,14 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  		void *data;
>  		u32 act;
>  
> +		/* Transient failure which in theory could occur if
> +		 * in-flight packets from before XDP was enabled reach
> +		 * the receive path after XDP is loaded. In practice I
> +		 * was not able to create this condition.

BTW we should probably drop the last sentence. It says in theory, should be enough.

> +		 */
> +		if (unlikely(hdr->hdr.gso_type))
> +			goto err_xdp;
> +
>  		/* This happens when rx buffer size is underestimated
>  		 * or headroom is not enough because of the buffer
>  		 * was refilled before XDP is set. This should only
> @@ -728,14 +736,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			xdp_page = page;
>  		}
>  
> -		/* Transient failure which in theory could occur if
> -		 * in-flight packets from before XDP was enabled reach
> -		 * the receive path after XDP is loaded. In practice I
> -		 * was not able to create this condition.
> -		 */
> -		if (unlikely(hdr->hdr.gso_type))
> -			goto err_xdp;
> -
>  		/* Allow consuming headroom but reserve enough space to push
>  		 * the descriptor on if we get an XDP_TX return code.
>  		 */
> -- 
> 2.7.4
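
Editor's sketch of why moving the check matters. This is a simplified userspace model, not the driver code: the demo_* names are hypothetical, malloc/free stand in for page allocation and put_page(), and a counter tracks pages that are still held. It assumes, as the commit message states, that the error exit does not release the linearized page.

```c
#include <stdlib.h>

static int live_pages;	/* pages allocated and not yet released */

static void *demo_get_page(void)
{
	void *p = malloc(64);

	if (p)
		live_pages++;
	return p;
}

static void demo_put_page(void *p)
{
	if (p) {
		free(p);
		live_pages--;
	}
}

int demo_live_pages(void)
{
	return live_pages;
}

/* Old ordering: linearize (allocate) first, reject GSO packets afterwards. */
int demo_rx_check_late(int gso_type)
{
	void *xdp_page = demo_get_page();

	if (!xdp_page)
		return -1;
	if (gso_type)
		return -1;	/* error exit never releases xdp_page: leak */
	demo_put_page(xdp_page);
	return 0;
}

/* New ordering: reject GSO packets before anything is allocated. */
int demo_rx_check_early(int gso_type)
{
	void *xdp_page;

	if (gso_type)
		return -1;	/* nothing held yet, nothing to leak */
	xdp_page = demo_get_page();
	if (!xdp_page)
		return -1;
	demo_put_page(xdp_page);
	return 0;
}
```

Calling demo_rx_check_late() with a nonzero gso_type leaves demo_live_pages() at 1, while demo_rx_check_early() leaves it unchanged.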


* Re: [PATCH net 1/4] virtio-net: correctly redirect linearized packet
From: Michael S. Tsirkin @ 2018-05-21 15:03 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtualization, netdev, linux-kernel
In-Reply-To: <1526891706-18516-2-git-send-email-jasowang@redhat.com>

On Mon, May 21, 2018 at 04:35:03PM +0800, Jason Wang wrote:
> After a linearized packet was redirected by XDP, we should not go for
> the err path which will try to pop buffers for the next packet and
> increase the drop counter. Fix this by just dropping the page refcnt
> for the original page.
> 
> Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
> Reported-by: David Ahern <dsahern@gmail.com>
> Tested-by: David Ahern <dsahern@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 770422e..c15d240 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -787,7 +787,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			}
>  			*xdp_xmit = true;
>  			if (unlikely(xdp_page != page))
> -				goto err_xdp;
> +				put_page(page);
>  			rcu_read_unlock();
>  			goto xdp_xmit;
>  		default:
> -- 
> 2.7.4


* Re: [PATCH net 2/4] virtio-net: correctly transmit XDP buff after linearizing
From: Michael S. Tsirkin @ 2018-05-21 15:03 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtualization, netdev, linux-kernel, John Fastabend
In-Reply-To: <1526891706-18516-3-git-send-email-jasowang@redhat.com>

On Mon, May 21, 2018 at 04:35:04PM +0800, Jason Wang wrote:
> We should not go for the error path after successfully transmitting an
> XDP buffer after linearizing, since the error path may try to pop and
> drop the next packet and increase the drop counters. Fix this by simply
> dropping the refcnt of the original page and going for the xmit path.
> 
> Fixes: 72979a6c3590 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c15d240..6260d65 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -775,7 +775,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			}
>  			*xdp_xmit = true;
>  			if (unlikely(xdp_page != page))
> -				goto err_xdp;
> +				put_page(page);
>  			rcu_read_unlock();
>  			goto xdp_xmit;
>  		case XDP_REDIRECT:
> -- 
> 2.7.4


* Re: [PATCH net 0/4] Fix several issues of virtio-net mergeable XDP
From: Michael S. Tsirkin @ 2018-05-21 15:04 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtualization, netdev, linux-kernel
In-Reply-To: <1526891706-18516-1-git-send-email-jasowang@redhat.com>

On Mon, May 21, 2018 at 04:35:02PM +0800, Jason Wang wrote:
> Hi:
> 
> Please review the patches that try to fix several issues of
> virtio-net mergeable XDP.
> 
> Thanks

I think we should do 3/4 differently.
The rest looks good, and probably needed on stable.

Thanks!

> Jason Wang (4):
>   virtio-net: correctly redirect linearized packet
>   virtio-net: correctly transmit XDP buff after linearizing
>   virtio-net: reset num_buf to 1 after linearizing packet
>   virito-net: fix leaking page for gso packet during mergeable XDP
> 
>  drivers/net/virtio_net.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> -- 
> 2.7.4


* Re: [net-next PATCH v2 2/4] net: Enable Tx queue selection based on Rx queues
From: Willem de Bruijn @ 2018-05-21 15:12 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Amritha Nambiar, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck, Sridhar Samudrala, Eric Dumazet,
	Hannes Frederic Sowa
In-Reply-To: <CALx6S36h=gGb1LkLuJ80DUrE=m+FhbcQ0AD94AdtEUvxJfHf=g@mail.gmail.com>

On Mon, May 21, 2018 at 10:51 AM, Tom Herbert <tom@herbertland.com> wrote:
> On Sat, May 19, 2018 at 1:27 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Sat, May 19, 2018 at 4:13 PM, Willem de Bruijn
>> <willemdebruijn.kernel@gmail.com> wrote:
>>> On Fri, May 18, 2018 at 12:03 AM, Tom Herbert <tom@herbertland.com> wrote:
>>>> On Tue, May 15, 2018 at 6:26 PM, Amritha Nambiar
>>>> <amritha.nambiar@intel.com> wrote:
>>>>> This patch adds support to pick Tx queue based on the Rx queue map
>>>>> configuration set by the admin through the sysfs attribute
>>>>> for each Tx queue. If the user configuration for receive
>>>>> queue map does not apply, then the Tx queue selection falls back
>>>>> to CPU map based selection and finally to hashing.
>>>>>
>>>>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>>>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>>>> ---
>>
>>>>> +static int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
>>>>> +{
>>>>> +#ifdef CONFIG_XPS
>>>>> +       enum xps_map_type i = XPS_MAP_RXQS;
>>>>> +       struct xps_dev_maps *dev_maps;
>>>>> +       struct sock *sk = skb->sk;
>>>>> +       int queue_index = -1;
>>>>> +       unsigned int tci = 0;
>>>>> +
>>>>> +       if (sk && sk->sk_rx_queue_mapping <= dev->real_num_rx_queues &&
>>>>> +           dev->ifindex == sk->sk_rx_ifindex)
>>>>> +               tci = sk->sk_rx_queue_mapping;
>>>>> +
>>>>> +       rcu_read_lock();
>>>>> +       while (queue_index < 0 && i < __XPS_MAP_MAX) {
>>>>> +               if (i == XPS_MAP_CPUS)
>>>>
>>>> This while loop typifies exactly why I don't think the XPS maps should
>>>> be an array.
>>>
>>> +1
>>
>> as a matter of fact, as enabling both cpu and rxqueue map at the same
>> time makes no sense, only one map is needed at any one time. The
>> only difference is in how it is indexed. It should probably not be possible
>> to configure both at the same time. Keeping a single map probably also
>> significantly simplifies patch 1/4.
>
> Willem,
>
> I think it might make sense to have them both. Maybe one application
> is spin polling and needs this, while others might be happy with
> normal CPU mappings as default.

Some entries in the rx_queue table have queue_pair affinity
configured, the others return -1 to fall through to the cpu
affinity table?

I guess that implies flow steering to those special purpose
queues. I wonder whether this would be used in practice.
It does make the code more complex by having to duplicate
the map lookup logic (mostly, patch 1/4).
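
Editor's sketch of the fall-through being discussed: try the rx-queue map first, fall back to the CPU map when the entry is -1, and finally fall back to hashing. This is a simplified model, not the XPS code. demo_pick_tx_queue() and its parameters are hypothetical, the maps are plain arrays with -1 meaning "no affinity configured", and n_txq is assumed to be nonzero.

```c
#include <stddef.h>

/*
 * Pick a Tx queue: rx-queue map first, then CPU map, then hash.
 * A map entry of -1 means "not configured, fall through".
 */
int demo_pick_tx_queue(const int *rxq_map, size_t n_rxq, size_t rxq,
		       const int *cpu_map, size_t n_cpu, size_t cpu,
		       unsigned int hash, unsigned int n_txq)
{
	/* 1) rx-queue based selection, if this rx queue has an entry */
	if (rxq_map && rxq < n_rxq && rxq_map[rxq] >= 0)
		return rxq_map[rxq];

	/* 2) CPU based selection as the default mapping */
	if (cpu_map && cpu < n_cpu && cpu_map[cpu] >= 0)
		return cpu_map[cpu];

	/* 3) last resort: spread flows by hash */
	return (int)(hash % n_txq);
}
```

Keeping both maps means the lookup is written twice with different indexes, which is the duplication mentioned above. A single map would keep only one of the first two steps.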


* Re: [PATCH] bpf: check NULL for sk_to_full_sk()
From: Eric Dumazet @ 2018-05-21 15:17 UTC (permalink / raw)
  To: YueHaibing, ast, daniel; +Cc: linux-kernel, netdev
In-Reply-To: <20180521075558.11968-1-yuehaibing@huawei.com>



On 05/21/2018 12:55 AM, YueHaibing wrote:
> like commit df39a9f106d5 ("bpf: check NULL for sk_to_full_sk() return value"),
> we should check sk_to_full_sk return value against NULL.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  include/linux/bpf-cgroup.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index 30d15e6..fd3fbeb 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -91,7 +91,7 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
>  	int __ret = 0;							       \
>  	if (cgroup_bpf_enabled && sk && sk == skb->sk) {		       \
>  		typeof(sk) __sk = sk_to_full_sk(sk);			       \
> -		if (sk_fullsock(__sk))					       \
> +		if (__sk && sk_fullsock(__sk))				       \
>  			__ret = __cgroup_bpf_run_filter_skb(__sk, skb,	       \
>  						      BPF_CGROUP_INET_EGRESS); \
>  	}								       \
> 

Why is this needed ???


* Re: [PATCH net-next 3/7] net: dsa: qca8k: Enable RXMAC when bringing up a port
From: Florian Fainelli @ 2018-05-21 15:17 UTC (permalink / raw)
  To: Michal Vokáč, netdev, michal.vokac
  Cc: linux-kernel, devicetree, vivien.didelot, andrew, mark.rutland,
	robh+dt, davem
In-Reply-To: <1526909293-56377-4-git-send-email-michal.vokac@ysoft.com>



On 05/21/2018 06:28 AM, Michal Vokáč wrote:
> When a port is brought up/down do not enable/disable only the TXMAC
> but the RXMAC as well. This is essential for the CPU port to work.
> 
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Should this have:

Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family")?
-- 
Florian


* Re: [PATCH net-next 4/7] net: dsa: qca8k: Force CPU port to its highest bandwidth
From: Florian Fainelli @ 2018-05-21 15:19 UTC (permalink / raw)
  To: Michal Vokáč, netdev, michal.vokac
  Cc: linux-kernel, devicetree, vivien.didelot, andrew, mark.rutland,
	robh+dt, davem
In-Reply-To: <1526909293-56377-5-git-send-email-michal.vokac@ysoft.com>



On 05/21/2018 06:28 AM, Michal Vokáč wrote:
> By default autonegotiation is enabled to configure MAC on all ports.
> For the CPU port autonegotiation can not be used so we need to set
> some sensible defaults manually.
> 
> This patch forces the default setting of the CPU port to 1000Mbps/full
> duplex which is the chip maximum capability.
> 
> Also correct size of the bit field used to configure link speed.
> 
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Likewise, would we not want to have a:

Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family")

tag here as well?
-- 
Florian


* Re: [PATCH net-next 5/7] net: dsa: qca8k: Allow overwriting CPU port setting
From: Florian Fainelli @ 2018-05-21 15:20 UTC (permalink / raw)
  To: Michal Vokáč, netdev, michal.vokac
  Cc: linux-kernel, devicetree, vivien.didelot, andrew, mark.rutland,
	robh+dt, davem
In-Reply-To: <1526909293-56377-6-git-send-email-michal.vokac@ysoft.com>



On 05/21/2018 06:28 AM, Michal Vokáč wrote:
> Implement adjust_link function that allows to overwrite default CPU port
> setting using fixed-link device tree subnode.
> 
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH net-next 6/7] net: dsa: qca8k: Replace GPL boilerplate by SPDX
From: Florian Fainelli @ 2018-05-21 15:20 UTC (permalink / raw)
  To: Michal Vokáč, netdev, michal.vokac
  Cc: linux-kernel, devicetree, vivien.didelot, andrew, mark.rutland,
	robh+dt, davem
In-Reply-To: <1526909293-56377-7-git-send-email-michal.vokac@ysoft.com>



On 05/21/2018 06:28 AM, Michal Vokáč wrote:
> Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

I don't know if we need all people who contributed to that driver to
agree on that, this is not a license change, so it should be okay I presume?

-- 
Florian


* Re: [PATCH net-next 7/7] net: dsa: qca8k: Remove rudundant parentheses
From: Florian Fainelli @ 2018-05-21 15:21 UTC (permalink / raw)
  To: Michal Vokáč, netdev, michal.vokac
  Cc: linux-kernel, devicetree, vivien.didelot, andrew, mark.rutland,
	robh+dt, davem
In-Reply-To: <1526909293-56377-8-git-send-email-michal.vokac@ysoft.com>



On 05/21/2018 06:28 AM, Michal Vokáč wrote:
> Fix warning reported by checkpatch.

Nit in the subject: should be redundant, with that:

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
From: David Miller @ 2018-05-21 15:47 UTC (permalink / raw)
  To: jasowang; +Cc: netdev, linux-kernel, mst, hannes, edumazet
In-Reply-To: <1526648443-24128-1-git-send-email-jasowang@redhat.com>

From: Jason Wang <jasowang@redhat.com>
Date: Fri, 18 May 2018 21:00:43 +0800

> We return -EIO on device down but cannot raise EPOLLOUT after it is
> brought up. This may confuse users like vhost, which expects tuntap to
> raise EPOLLOUT to re-enable its TX routine after tuntap is down. This
> can be easily reproduced by transmitting packets from a VM while
> bringing the tap device down and up. Fix this by setting
> SOCKWQ_ASYNC_NOSPACE on -EIO.
> 
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

I'm not so sure what to do with this patch.

Like Michael says, this flag bit is only checked upon transmit which
may or may not happen after this point.  It doesn't seem to be
guaranteed.


* Re: [PATCH v2 net] stmmac: strip vlan tag on reception only for 8021q tagged frames
From: David Miller @ 2018-05-21 15:48 UTC (permalink / raw)
  To: eladv6; +Cc: makita.toshiaki, netdev, peppe.cavallaro, alexandre.torgue
In-Reply-To: <20180517.124356.1373521143004050823.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Thu, 17 May 2018 12:43:56 -0400 (EDT)

> Giuseppe and Alexandre, please review this patch.

If nobody thinks this patch is important enough to actually
review, I'm tossing it.

Sorry.


* Re: [patch net-next] nfp: flower: set sysfs link to device for representors
From: David Miller @ 2018-05-21 15:49 UTC (permalink / raw)
  To: jiri
  Cc: netdev, jakub.kicinski, simon.horman, dirk.vandermerwe,
	john.hurley, pieter.jansenvanvuuren, oss-drivers
In-Reply-To: <20180517100520.23971-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 17 May 2018 12:05:20 +0200

> From: Jiri Pirko <jiri@mellanox.com>
> 
> Do this so the sysfs has "device" link correctly set.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Please sort out the non-PF representor issue with Or and Jakub.

Thanks.


* Re: [PATCH net-next 0/2] net: sfp: small improvements
From: David Miller @ 2018-05-21 15:51 UTC (permalink / raw)
  To: antoine.tenart
  Cc: linux, netdev, linux-kernel, thomas.petazzoni, maxime.chevallier,
	gregory.clement, miquel.raynal, nadavh, stefanc, ymarkman, mw
In-Reply-To: <20180517082907.14420-1-antoine.tenart@bootlin.com>

From: Antoine Tenart <antoine.tenart@bootlin.com>
Date: Thu, 17 May 2018 10:29:05 +0200

> This series was part of the mvpp2 phylink one but as we reworked it to
> use fixed-link on the DB boards, the SFP commits weren't needed
> anymore for our use case. Two of the three patches still are needed I
> believe (I ditched the one about non-wired SFP cages), so they are sent
> here in a separate series.

Based upon the discussion of patch #1, it seems there is a desire to make
the i2c-bus property mandatory since it isn't clear if access to the SFP
module without it is really all that doable.


* Re: [PATCH net-next] sctp: add support for SCTP_REUSE_PORT sockopt
From: Marcelo Ricardo Leitner @ 2018-05-21 15:51 UTC (permalink / raw)
  To: Michael Tuexen; +Cc: Neil Horman, Xin Long, network dev, linux-sctp, davem
In-Reply-To: <43A7D2C9-DFCE-4ADA-9ABB-B7ACD78C210B@fh-muenster.de>

On Mon, May 21, 2018 at 04:09:31PM +0200, Michael Tuexen wrote:
> > On 21. May 2018, at 15:48, Neil Horman <nhorman@tuxdriver.com> wrote:
> > 
> > On Mon, May 21, 2018 at 02:16:56PM +0200, Michael Tuexen wrote:
> >>> On 21. May 2018, at 13:39, Neil Horman <nhorman@tuxdriver.com> wrote:
> >>> 
> >>> On Sun, May 20, 2018 at 10:54:04PM -0300, Marcelo Ricardo Leitner wrote:
> >>>> On Sun, May 20, 2018 at 08:50:59PM -0400, Neil Horman wrote:
> >>>>> On Sat, May 19, 2018 at 03:44:40PM +0800, Xin Long wrote:
> >>>>>> This feature is actually already supported by sk->sk_reuse which can be
> >>>>>> set by SO_REUSEADDR. But it's not working exactly as RFC6458 demands in
> >>>>>> section 8.1.27, like:
> >>>>>> 
> >>>>>> - This option only supports one-to-one style SCTP sockets
> >>>>>> - This socket option must not be used after calling bind()
> >>>>>>   or sctp_bindx().
> >>>>>> 
> >>>>>> Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
> >>>>>> Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
> >>>>>> work in linux.
> >>>>>> 
> >>>>>> This patch reuses sk->sk_reuse and works pretty much as SO_REUSEADDR,
> >>>>>> just with some extra setup limitations that are needed when it is being
> >>>>>> enabled.
> >>>>>> 
> >>>>>> "It should be noted that the behavior of the socket-level socket option
> >>>>>> to reuse ports and/or addresses for SCTP sockets is unspecified", so it
> >>>>>> leaves SO_REUSEADDR as is for the compatibility.
> >>>>>> 
> >>>>>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> >>>>>> ---
> >>>>>> include/uapi/linux/sctp.h |  1 +
> >>>>>> net/sctp/socket.c         | 48 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> 2 files changed, 49 insertions(+)
> >>>>>> 
> >>>>> A few things:
> >>>>> 
> >>>>> 1) I agree with Tom, this feature is a complete duplication of the SK_REUSEPORT
> >>>>> socket option.  I understand that this is an implementation of the option in the
> >>>>> RFC, but it's definitely a duplication of a feature, which makes several things
> >>>>> really messy.
> >>>>> 
> >>>>> 2) The overloading of the sk_reuse option is a bad idea, for several reasons.
> >>>>> Chief among them is the behavioral interference between this patch and the
> >>>>> SO_REUSEADDR socket level option, that also sets this feature.  If you set
> >>>>> sk_reuse via SO_REUSEADDR, you will set the SCTP port reuse feature regardless
> >>>>> of the bind or 1:1/1:m state of the socket.  Vice versa, if you set this socket
> >>>>> option via the SCTP_PORT_REUSE option you will inadvertently turn on address
> >>>>> reuse for the socket.  We can't do that.
> >>>> 
> >>>> Given your comments, going a bit further here, one other big
> >>>> implication is that a port would never be able to be considered to
> >>>> fully meet SCTP standards regarding reuse because a rogue application
> >>>> may always abuse of the socket level opt to gain access to the port.
> >>>> 
> >>>> IOW, the patch allows the application to use such restrictions against
> >>>> itself and nothing else, which undermines the patch idea.
> >>>> 
> >>> Agreed.
> >>> 
> >>>> I lack the knowledge on why the SCTP option was proposed in the RFC. I
> >>>> guess they had a good reason to add the restriction on 1:1/1:m style.
> >>>> Does the usage of the current option imply any risk to SCTP sockets? If
> >>>> yes, that would give some grounds for going forward with the SCTP
> >>>> option.
> >>>> 
> >>> I'm also not privy to why the sctp option was proposed, though I expect that the
> >>> lack of standardization of SO_REUSEPORT probably had something to do with it.
> >>> As for the reasoning behind restriction to only 1:1 sockets, if I had to guess,
> >>> I would say it is likely because it creates ordering difficulty at the application
> >>> level.
> >>> 
> >>> CC-ing Michael Tuxen, who I believe had some input on this RFC.  Hopefully he
> >>> can shed some light on this.
> >> Dear all,
> >> 
> >> the reason this was added is to have a specified way to allow a system to
> >> behave like a client and server making use of the INIT collision.
> >> 
> >> For 1-to-many style sockets you can do this by creating a socket, binding it,
> >> calling listen on it and trying to connect to the peer.
> >> 
> >> For 1-to-1 style sockets you need two sockets for it. One listener and one
> >> you use to connect (and close it in case of failure, open a new one...).
> >> 
> >> It was not clear if one can achieve this with SO_REUSEPORT and/or SO_REUSEADDR
> >> on all platforms. We left that unspecified.
> >> 
> >> I hope this makes the intention clearer.
> >> 
> > I think it makes the intention clearer yes, but it unfortunately does nothing in
> > my mind to clarify how the implementation should best handle the potential
> > overlap in functionality.  What I see here is that we have two functional paths
> > (the SO_REUSEPORT path and the SCTP_PORT_REUSE path), which may or may not
> > (depending on the OS implementation) achieve the same functional goal (allowing
> > multiple sockets to share a port while allowing one socket to listen and the
> > other connect to a remote peer).  If both implementations do the same thing on a
> > given platform, we can just alias one to another and be done, but if they
> > don't then we have to implement both paths and either ensure that the
> > SO_REUSEPORT path is a no-op/error return for SCTP sockets, or ensure that each
> > path implements a distinct feature set that is clearly documented.
> > 
> > That said, I think we may be in luck.  Looking at the connect and listen paths,
> > it appears to me that:
> > 
> > 1) Sockets ignore SO_REUSEPORT in the connect and listen paths (save for any
> > autobinding) so it would appear that the intent of the SCTP rfc can be honored
> > via SO_REUSEPORT on linux.  
> > 
> > 2) SO_REUSEPORT prevents changing state after a bind has occurred, so we can honor
> > that part of the SCTP RFC.
> > 
> > The only part that is unaccounted for is the restriction in SCTP_REUSE_PORT
> > that 1:M sockets aren't allowed to enable port reuse.
> > However, I think the implication from Michael's description above is that port
> > reuse on a 1:M socket is implicit because a single socket can connect and listen
> > in that use case, rather than there being a danger to doing so.
> > 
> > As such, I would propose that we implement this socket option by simply setting
> > the sk->sk_reuseport field in the sock structure, and document the fact that
> > linux does not restrict port reuse from 1:M sockets.
> > 
> > Thoughts?
> Sounds acceptable to me...

+1

> 
> Best regards
> Michael
> > Neil
> > 
> 
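
For readers following the thread, Neil's proposal above (have the SCTP
option simply set sk->sk_reuseport, with no 1:M restriction) can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not kernel code: struct sock_model,
set_port_reuse_model(), may_share_port() and the -EINVAL return are all
hypothetical names and choices, and the sharing check is deliberately
simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the relevant bits of struct sock. */
struct sock_model {
	bool reuseport;            /* stand-in for sk->sk_reuseport */
	unsigned short bound_port; /* 0 while the socket is unbound */
};

/*
 * Model of the proposed option handler: it only flips the reuseport
 * flag and applies to 1:1 and 1:M sockets alike. Refusing the change
 * once the socket is bound is an assumption of this sketch, and so is
 * the error code.
 */
static int set_port_reuse_model(struct sock_model *sk, int val)
{
	assert(sk != NULL);
	if (sk->bound_port)
		return -EINVAL;
	sk->reuseport = (val != 0);
	return 0;
}

/*
 * Simplified sharing rule: two sockets may sit on the same port only
 * if both asked for port reuse. Real kernels check more than this.
 */
static bool may_share_port(const struct sock_model *a,
			   const struct sock_model *b)
{
	return a->reuseport && b->reuseport;
}
```

In this model a listener and a connecting socket can share a port only
after both have enabled the option before binding, which is the use case
Michael described.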

^ permalink raw reply

* Re: [PATCH net] sctp: fix the issue that flags are ignored when using kernel_connect
From: Marcelo Ricardo Leitner @ 2018-05-21 15:52 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Neil Horman, mkubecek
In-Reply-To: <4863916c3e574b0d860725466d7d4a2f445fbe5b.1526805550.git.lucien.xin@gmail.com>

On Sun, May 20, 2018 at 04:39:10PM +0800, Xin Long wrote:
> Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags
> param can't be passed into its proto .connect where these flags are really
> needed.
> 
> sctp works around it by getting flags from socket file in __sctp_connect.
> It works for connecting from userspace, as inherently the user sock has
> socket file and it passes f_flags as the flags param into the proto_ops
> .connect.
> 
> However, the sock created by sock_create_kern doesn't have a socket file,
> and it passes the flags (like O_NONBLOCK) by using the flags param in
> kernel_connect, which calls proto_ops .connect later.
> 
> So to fix it, this patch defines a new proto_ops .connect for sctp,
> sctp_inet_connect, which calls __sctp_connect() directly with this
> flags param. After this, the sctp's proto .connect can be removed.
> 
> Note that sctp_inet_connect doesn't need to do some checks that are not
> needed for sctp, which makes things better than with inet_dgram_connect.
> 
> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

> ---
>  include/net/sctp/sctp.h |  2 ++
>  net/sctp/ipv6.c         |  2 +-
>  net/sctp/protocol.c     |  2 +-
>  net/sctp/socket.c       | 51 +++++++++++++++++++++++++++++++++----------------
>  4 files changed, 39 insertions(+), 18 deletions(-)
> 
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 28b996d..35498e6 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -103,6 +103,8 @@ void sctp_addr_wq_mgmt(struct net *, struct sctp_sockaddr_entry *, int);
>  /*
>   * sctp/socket.c
>   */
> +int sctp_inet_connect(struct socket *sock, struct sockaddr *uaddr,
> +		      int addr_len, int flags);
>  int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
>  int sctp_inet_listen(struct socket *sock, int backlog);
>  void sctp_write_space(struct sock *sk);
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index 4224711..0cd2e76 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -1006,7 +1006,7 @@ static const struct proto_ops inet6_seqpacket_ops = {
>  	.owner		   = THIS_MODULE,
>  	.release	   = inet6_release,
>  	.bind		   = inet6_bind,
> -	.connect	   = inet_dgram_connect,
> +	.connect	   = sctp_inet_connect,
>  	.socketpair	   = sock_no_socketpair,
>  	.accept		   = inet_accept,
>  	.getname	   = sctp_getname,
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index d685f84..6bf0a99 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -1012,7 +1012,7 @@ static const struct proto_ops inet_seqpacket_ops = {
>  	.owner		   = THIS_MODULE,
>  	.release	   = inet_release,	/* Needs to be wrapped... */
>  	.bind		   = inet_bind,
> -	.connect	   = inet_dgram_connect,
> +	.connect	   = sctp_inet_connect,
>  	.socketpair	   = sock_no_socketpair,
>  	.accept		   = inet_accept,
>  	.getname	   = inet_getname,	/* Semantics are different.  */
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 80835ac..ae7e7c6 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -1086,7 +1086,7 @@ static int sctp_setsockopt_bindx(struct sock *sk,
>   */
>  static int __sctp_connect(struct sock *sk,
>  			  struct sockaddr *kaddrs,
> -			  int addrs_size,
> +			  int addrs_size, int flags,
>  			  sctp_assoc_t *assoc_id)
>  {
>  	struct net *net = sock_net(sk);
> @@ -1104,7 +1104,6 @@ static int __sctp_connect(struct sock *sk,
>  	union sctp_addr *sa_addr = NULL;
>  	void *addr_buf;
>  	unsigned short port;
> -	unsigned int f_flags = 0;
>  
>  	sp = sctp_sk(sk);
>  	ep = sp->ep;
> @@ -1254,13 +1253,7 @@ static int __sctp_connect(struct sock *sk,
>  	sp->pf->to_sk_daddr(sa_addr, sk);
>  	sk->sk_err = 0;
>  
> -	/* in-kernel sockets don't generally have a file allocated to them
> -	 * if all they do is call sock_create_kern().
> -	 */
> -	if (sk->sk_socket->file)
> -		f_flags = sk->sk_socket->file->f_flags;
> -
> -	timeo = sock_sndtimeo(sk, f_flags & O_NONBLOCK);
> +	timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
>  
>  	if (assoc_id)
>  		*assoc_id = asoc->assoc_id;
> @@ -1348,7 +1341,7 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
>  				      sctp_assoc_t *assoc_id)
>  {
>  	struct sockaddr *kaddrs;
> -	int err = 0;
> +	int err = 0, flags = 0;
>  
>  	pr_debug("%s: sk:%p addrs:%p addrs_size:%d\n",
>  		 __func__, sk, addrs, addrs_size);
> @@ -1367,7 +1360,13 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
>  	if (err)
>  		goto out_free;
>  
> -	err = __sctp_connect(sk, kaddrs, addrs_size, assoc_id);
> +	/* in-kernel sockets don't generally have a file allocated to them
> +	 * if all they do is call sock_create_kern().
> +	 */
> +	if (sk->sk_socket->file)
> +		flags = sk->sk_socket->file->f_flags;
> +
> +	err = __sctp_connect(sk, kaddrs, addrs_size, flags, assoc_id);
>  
>  out_free:
>  	kvfree(kaddrs);
> @@ -4397,16 +4396,26 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname,
>   * len: the size of the address.
>   */
>  static int sctp_connect(struct sock *sk, struct sockaddr *addr,
> -			int addr_len)
> +			int addr_len, int flags)
>  {
> -	int err = 0;
> +	struct inet_sock *inet = inet_sk(sk);
>  	struct sctp_af *af;
> +	int err = 0;
>  
>  	lock_sock(sk);
>  
>  	pr_debug("%s: sk:%p, sockaddr:%p, addr_len:%d\n", __func__, sk,
>  		 addr, addr_len);
>  
> +	/* We may need to bind the socket. */
> +	if (!inet->inet_num) {
> +		if (sk->sk_prot->get_port(sk, 0)) {
> +			release_sock(sk);
> +			return -EAGAIN;
> +		}
> +		inet->inet_sport = htons(inet->inet_num);
> +	}
> +
>  	/* Validate addr_len before calling common connect/connectx routine. */
>  	af = sctp_get_af_specific(addr->sa_family);
>  	if (!af || addr_len < af->sockaddr_len) {
> @@ -4415,13 +4424,25 @@ static int sctp_connect(struct sock *sk, struct sockaddr *addr,
>  		/* Pass correct addr len to common routine (so it knows there
>  		 * is only one address being passed.
>  		 */
> -		err = __sctp_connect(sk, addr, af->sockaddr_len, NULL);
> +		err = __sctp_connect(sk, addr, af->sockaddr_len, flags, NULL);
>  	}
>  
>  	release_sock(sk);
>  	return err;
>  }
>  
> +int sctp_inet_connect(struct socket *sock, struct sockaddr *uaddr,
> +		      int addr_len, int flags)
> +{
> +	if (addr_len < sizeof(uaddr->sa_family))
> +		return -EINVAL;
> +
> +	if (uaddr->sa_family == AF_UNSPEC)
> +		return -EOPNOTSUPP;
> +
> +	return sctp_connect(sock->sk, uaddr, addr_len, flags);
> +}
> +
>  /* FIXME: Write comments. */
>  static int sctp_disconnect(struct sock *sk, int flags)
>  {
> @@ -8724,7 +8745,6 @@ struct proto sctp_prot = {
>  	.name        =	"SCTP",
>  	.owner       =	THIS_MODULE,
>  	.close       =	sctp_close,
> -	.connect     =	sctp_connect,
>  	.disconnect  =	sctp_disconnect,
>  	.accept      =	sctp_accept,
>  	.ioctl       =	sctp_ioctl,
> @@ -8767,7 +8787,6 @@ struct proto sctpv6_prot = {
>  	.name		= "SCTPv6",
>  	.owner		= THIS_MODULE,
>  	.close		= sctp_close,
> -	.connect	= sctp_connect,
>  	.disconnect	= sctp_disconnect,
>  	.accept		= sctp_accept,
>  	.ioctl		= sctp_ioctl,
> -- 
> 2.1.0
> 
> 
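
To illustrate the behavior change described in the commit message, here
is a minimal userspace model of how the connect timeout is chosen before
and after the patch. It is a sketch under assumptions, not the kernel
code: struct file_model, struct sock_model, sndtimeo_model() and the two
timeo_*() helpers are hypothetical stand-ins that mirror the logic
visible in the diff.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct sock. */
struct file_model {
	unsigned int f_flags;
};

struct sock_model {
	long sndtimeo;           /* stand-in for sk->sk_sndtimeo */
	struct file_model *file; /* NULL for sock_create_kern() sockets */
};

/* Same idea as sock_sndtimeo(): a non-blocking call waits for 0. */
static long sndtimeo_model(const struct sock_model *sk, int noblock)
{
	assert(sk != NULL);
	return noblock ? 0 : sk->sndtimeo;
}

/* Before the patch: only the socket file's f_flags are consulted. */
static long timeo_before_patch(const struct sock_model *sk, int flags)
{
	unsigned int f_flags = 0;

	(void)flags; /* the caller's flags never reach this point */
	if (sk->file)
		f_flags = sk->file->f_flags;
	return sndtimeo_model(sk, (f_flags & O_NONBLOCK) != 0);
}

/* After the patch: the flags argument is used directly. */
static long timeo_after_patch(const struct sock_model *sk, int flags)
{
	return sndtimeo_model(sk, (flags & O_NONBLOCK) != 0);
}
```

For a socket without a file, passing O_NONBLOCK has no effect in the
"before" helper and yields a zero timeout in the "after" helper, which is
the kernel_connect case the patch fixes.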

^ permalink raw reply
