From: Pavel Emelyanov <xemul@parallels.com>
To: Brian Haley <brian.haley@hp.com>
Cc: Linux Netdev List <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
Date: Tue, 23 Oct 2012 00:28:33 +0400 [thread overview]
Message-ID: <5085AC71.7060904@parallels.com> (raw)
In-Reply-To: <50858ABD.2000206@hp.com>
On 10/22/2012 10:04 PM, Brian Haley wrote:
> On 10/19/2012 05:55 AM, Pavel Emelyanov wrote:
>> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
>> cannot be get via sockopt API. The only way we can find the device id a
>> socket is bound to is via sock-diag interface. But the diag works only on
>> hashed sockets, while the opt in question can be set for yet unhashed one.
>>
>> That said, in order to know what device a socket is bound to (we do want
>> to know this in checkpoint-restore project) I propose to make this option
>> getsockopt-able and report the respective device index.
>>
>> Another solution to the problem might be to teach the sock-diag reporting
>> info on unhashed sockets. Should I go this way instead?
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>>
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 8a146cf..c49412c 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>> case SO_NOFCS:
>> v.val = sock_flag(sk, SOCK_NOFCS);
>> break;
>> + case SO_BINDTODEVICE:
>> + v.val = sk->sk_bound_dev_if;
>> + break;
>> default:
>> return -ENOPROTOOPT;
>> }
>
> Doesn't this make the set and get non-symmetrical? For example, setsockopt()
> would take "eth0", but getsockopt() would return 2.
It will, but since device name and index are two equal device "IDs" I assumed
it would be OK.
However, some comments inline.
> The following patch would return a string, or -ENODEV if not set.
>
> -Brian
>
> ---
>
> Change getsockopt(SO_BINDTODEVICE) to be symmetrical with setsockopt() by
> returning the interface name as a string.
>
> Signed-off-by: Brian Haley <brian.haley@hp.com>
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index c49412c..69b9d92 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -505,7 +505,8 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
> }
> EXPORT_SYMBOL(sk_dst_check);
>
> -static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
> +static int sock_setbindtodevice(struct sock *sk, char __user *optval,
> + int optlen)
> {
> int ret = -ENOPROTOOPT;
> #ifdef CONFIG_NETDEVICES
> @@ -562,6 +563,49 @@ out:
> return ret;
> }
>
> +static int sock_getbindtodevice(struct sock *sk, char __user *optval,
> + int __user *optlen, int len)
> +{
> + int ret = -ENOPROTOOPT;
> +#ifdef CONFIG_NETDEVICES
> + struct net *net = sock_net(sk);
> + struct net_device *dev;
> + char devname[IFNAMSIZ];
> +
> + ret = 0;
> + if (sk->sk_bound_dev_if == 0)
> + goto out;
It will return 0 if device is not set, thus making it impossible to detect
this situation.
> + ret = -EINVAL;
> + if (len < IFNAMSIZ)
> + goto out;
> + if (len > IFNAMSIZ)
> + len = IFNAMSIZ;
> +
> + rcu_read_lock();
> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
> + if (dev)
> + strcpy(dev->name, devname);
> + rcu_read_unlock();
> + ret = -ENODEV;
> + if (!dev)
> + goto out;
> +
> + ret = -EFAULT;
> + if (copy_to_user(optval, devname, len))
> + goto out;
> +
> + if (put_user(len, optlen))
> + goto out;
What's the point in reporting IFNAMSIZ to the userspace always, taking
into account that this constant is exported there anyway?
> + ret = 0;
> +
> +out:
> +#endif
> +
> + return ret;
> +}
> +
> static inline void sock_valbool_flag(struct sock *sk, int bit, int valbool)
> {
> if (valbool)
> @@ -589,7 +633,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
> */
>
> if (optname == SO_BINDTODEVICE)
> - return sock_bindtodevice(sk, optval, optlen);
> + return sock_setbindtodevice(sk, optval, optlen);
>
> if (optlen < sizeof(int))
> return -EINVAL;
> @@ -1074,9 +1118,10 @@ int sock_getsockopt(struct socket *sock, int level, int
> optname,
> case SO_NOFCS:
> v.val = sock_flag(sk, SOCK_NOFCS);
> break;
> +
> case SO_BINDTODEVICE:
> - v.val = sk->sk_bound_dev_if;
> - break;
> + return sock_getbindtodevice(sk, optval, optlen, len);
> +
> default:
> return -ENOPROTOOPT;
> }
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> .
>
next prev parent reply other threads:[~2012-10-22 20:28 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-19 9:55 [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable Pavel Emelyanov
2012-10-19 10:34 ` Eric Dumazet
2012-10-22 0:43 ` David Miller
2012-10-22 10:20 ` Pavel Emelyanov
2012-10-22 18:04 ` Brian Haley
2012-10-22 20:28 ` Pavel Emelyanov [this message]
2012-10-22 21:23 ` Brian Haley
2012-10-22 20:45 ` Eric Dumazet
2012-10-22 21:10 ` Pavel Emelyanov
2012-10-22 21:20 ` Brian Haley
2012-10-22 21:37 ` Eric Dumazet
2012-10-22 21:47 ` Brian Haley
2012-10-22 21:52 ` Eric Dumazet
2012-10-22 21:58 ` Ben Hutchings
2012-10-23 21:43 ` Brian Haley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5085AC71.7060904@parallels.com \
--to=xemul@parallels.com \
--cc=brian.haley@hp.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.