From: Laurent Vivier <laurent@vivier.eu>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Riku Voipio <riku.voipio@iki.fi>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH] linux-user: manage SOCK_PACKET socket type.
Date: Tue, 27 Oct 2015 11:47:38 +0100 [thread overview]
Message-ID: <562F564A.5040902@vivier.eu> (raw)
In-Reply-To: <562EEAD4.7070706@vivier.eu>
Le 27/10/2015 04:09, Laurent Vivier a écrit :
>
>
> Le 26/10/2015 15:40, Peter Maydell a écrit :
>> On 6 October 2015 at 18:11, Laurent Vivier <laurent@vivier.eu> wrote:
>>> This is obsolete, but if we want to use dhcp with some distros (like debian
>>> ppc 8.2 jessie), we need it.
>>>
>>> At the bind level, we are not able to know the socket type so we try to
>>> guess it by analyzing the name. We manage only the case "ethX",
>>> "ethX" in spk_device is similar to set htons(0x6574) in sll_protocol in the
>>> normal case, and as this protocol does not exist, it's ok.
>>>
>>> SOCK_PACKET uses network endian to encode protocol in socket()
>>>
>>> in PACKET(7) :
>>> protocol is the IEEE 802.3 protocol
>>> number in network order. See the <linux/if_ether.h> include file for a
>>> list of allowed protocols. When protocol is set to htons(ETH_P_ALL)
>>> then all protocols are received. All incoming packets of that protocol
>>> type will be passed to the packet socket before they are passed to the
>>> protocols implemented in the kernel.
>>>
>>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>>> ---
>>> This patch is a remix of an old patch sent in 2012:
>>> https://patchwork.ozlabs.org/patch/208892/
>>>
>>> linux-user/syscall.c | 33 +++++++++++++++++++++++++++------
>>> 1 file changed, 27 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>>> index 64be431..71cc1e2 100644
>>> --- a/linux-user/syscall.c
>>> +++ b/linux-user/syscall.c
>>> @@ -111,6 +111,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
>>> #include <linux/route.h>
>>> #include <linux/filter.h>
>>> #include <linux/blkpg.h>
>>> +#include <linux/if_packet.h>
>>> #include "linux_loop.h"
>>> #include "uname.h"
>>>
>>> @@ -1198,11 +1199,20 @@ static inline abi_long target_to_host_sockaddr(struct sockaddr *addr,
>>> memcpy(addr, target_saddr, len);
>>> addr->sa_family = sa_family;
>>> if (sa_family == AF_PACKET) {
>>> - struct target_sockaddr_ll *lladdr;
>>> + /* Manage an obsolete case :
>>> + * if socket type is SOCK_PACKET, bind by name otherwise by index
>>> + * but we are not able to know socket type, so check if the name
>>> + * is usable...
>>> + * see linux/net/packet/af_packet.c: packet_bind_spkt()
>>> + */
>>> + if (strncmp((char *)((struct sockaddr_pkt *)addr)->spkt_device,
>>> + "eth", 3) != 0) {
>>> + struct target_sockaddr_ll *lladdr;
>>
>> This confuses me. The packet(7) manpage suggests there are two flavours
>> of packet socket:
>> (1) legacy AF_INET + SOCK_PACKET
>> (2) new style AF_PACKET + SOCK_RAW / SOCK_DGRAM
>>
>> but this comment suggests it's trying to handle AF_PACKET + SOCK_PACKET ?
>
> In fact, I've started not from the man page, but from a non working dhcp
> client, originally with a m68k target and etch-m68k distro, and I've met
> again this problem on a ppc target and jessie distro.
>
>>
>> If AF_PACKET was introduced as the new way of doing things, it's not
>> clear why it would be the family type used in the legacy approach's
>> sockaddr_pkt (though there seems to be code around that does this).
>> I suspect that 2.0 kernels just didn't check af_family here at all.
>
> by digging into the code, I have found:
>
> $ apt-get source isc-dhcp-client
> $ vi isc-dhcp-4.3.1/common/lpf.c
>
> ...
> 72 int if_register_lpf (info)
> 73 struct interface_info *info;
> 74 {
> ...
> 79 if ((sock = socket(PF_PACKET, SOCK_PACKET,
> 80 htons((short)ETH_P_ALL))) < 0) {
> ...
>
> So we can see socket() is used with domain PF_PACKET and type SOCK_PACKET.
>
> Next, the interface name is put into the sa_data of sockaddr, and bind()
> is used with AF_PACKET:
>
> 94 /* Bind to the interface name */
> 95 memset (&sa, 0, sizeof sa);
> 96 sa.sa_family = AF_PACKET;
> 97 strncpy (sa.sa_data, (const char *)info -> ifp, sizeof
> sa.sa_data);
> 98 if (bind (sock, &sa, sizeof sa)) {
>
> ifp is initialized from a list of all discovered interfaces in
> common/discover.c:
> ...
> 238 int
> 239 begin_iface_scan(struct iface_conf_list *ifaces) {
> ...
> 283 if (ioctl(ifaces->sock, SIOCGLIFCONF, &ifaces->conf) < 0) {
> ...
> 918 void
> 919 discover_interfaces(int state) {
> ...
> 924 struct interface_info *tmp;
> ...
> 939 if (!begin_iface_scan(&ifaces)) {
> ...
> 955 while (next_iface(&info, &err, &ifaces)) {
> 956
> 957 /* See if we've seen an interface that matches this
> one. */
> 958 for (tmp = interfaces; tmp; tmp = tmp->next) {
> 959 if (!strcmp(tmp->name, info.name))
> 960 break;
> 961 }
>
> ...
> 987 strcpy(tmp->name, info.name);
> ...
> 1050 }
> ...
> 1063 for (tmp = interfaces ; tmp != NULL ; tmp = tmp->next) {
> 1064 if (tmp->ifp == NULL) {
> 1065 struct ifreq *tif;
> 1066
> 1067 tif = (struct ifreq *)dmalloc(sizeof(struct
> ifreq),
> 1068 MDL);
> 1069 if (tif == NULL)
> 1070 log_fatal("no space for ifp mockup.");
> 1071 strcpy(tif->ifr_name, tmp->name);
> 1072 tmp->ifp = tif;
> 1073 }
> 1074 }
>
>>
>> The code in the kernel's packet_recvmsg fills in the spkt_family
>> field with the netdevice's type field, which is an ARPHRD_* constant
>> as far as I can tell (I could well be wrong here).
>
> kernel 4.3.0-rc3, net/packet/af_packet.c:
>
> 2961
> 2962 static int packet_bind_spkt(struct socket *sock, struct sockaddr
> *uaddr,
> 2963 int addr_len)
> 2964 {
> 2965 struct sock *sk = sock->sk;
> 2966 char name[15];
> 2967 struct net_device *dev;
> 2968 int err = -ENODEV;
> 2969
> 2970 /*
> 2971 * Check legality
> 2972 */
> 2973
> 2974 if (addr_len != sizeof(struct sockaddr))
> 2975 return -EINVAL;
> 2976 strlcpy(name, uaddr->sa_data, sizeof(name));
> 2977
> 2978 dev = dev_get_by_name(sock_net(sk), name);
> 2979 if (dev)
> 2980 err = packet_do_bind(sk, dev, pkt_sk(sk)->num);
> 2981 return err;
> 2982 }
> ...
> 4246 static const struct proto_ops packet_ops_spkt = {
> 4247 .family = PF_PACKET,
> ...
> 4250 .bind = packet_bind_spkt,
> ...
> 3022
> 3023 static int packet_create(struct net *net, struct socket *sock,
> int protocol,
> 3024 int kern)
> ...
> 3045 if (sock->type == SOCK_PACKET)
> 3046 sock->ops = &packet_ops_spkt;
> ...
>
>> Odd to have code in target_to_host_sockaddr and not in
>> host_to_target_sockaddr.
>
> In the case of host_to_target_sockaddr(), there is no "if (sa_family ==
> AF_PACKET) {" as it was in target_to_host_sockaddr(), it's why I didn't
> add it (and I don't like to add code I don't test).
>
>>
>>> - lladdr = (struct target_sockaddr_ll *)addr;
>>> - lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex);
>>> - lladdr->sll_hatype = tswap16(lladdr->sll_hatype);
>>> + lladdr = (struct target_sockaddr_ll *)addr;
>>> + lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex);
>>> + lladdr->sll_hatype = tswap16(lladdr->sll_hatype);
>>> + }
>>> }
>>> unlock_user(target_saddr, target_addr, 0);
>>>
>>> @@ -2509,7 +2519,12 @@ static abi_long do_socketcall(int num, abi_ulong vptr)
>>> /* now when we have the args, actually handle the call */
>>> switch (num) {
>>> case SOCKOP_socket: /* domain, type, protocol */
>>> - return do_socket(a[0], a[1], a[2]);
>>> + if (a[0] == AF_PACKET ||
>>> + a[1] == TARGET_SOCK_PACKET) {
>>> + return do_socket(a[0], a[1], tswap16(a[2]));
>>> + } else {
>>> + return do_socket(a[0], a[1], a[2]);
>>> + }
>>> case SOCKOP_bind: /* sockfd, addr, addrlen */
>>> return do_bind(a[0], a[1], a[2]);
>>> case SOCKOP_connect: /* sockfd, addr, addrlen */
>>> @@ -7500,7 +7515,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>> #endif
>>> #ifdef TARGET_NR_socket
>>> case TARGET_NR_socket:
>>> - ret = do_socket(arg1, arg2, arg3);
>>> + if (arg1 == AF_PACKET ||
>>> + arg2 == TARGET_SOCK_PACKET) {
>>> + /* in this case, socket() needs a network endian short */
>>> + ret = do_socket(arg1, arg2, tswap16(arg3));
>>> + } else {
>>> + ret = do_socket(arg1, arg2, arg3);
>>> + }
>>
>> This doesn't make sense to me. The argument to socket()
>> is passed in via a register; so if the guest code correctly
>> passes the protocol value as htons(whatever) then that will
>> be the value in arg3 and we do not need to swap anything.
>>
>> I see we had this discussion about the previous version of the
>> patch too... and my argument that we don't need the tswap16
>> in the socketcall code path either still makes sense to me.
>
> I agree with you, I think I have confused socketcall() that passes
> parameters by memory, and socket() that passes parameters by registers.
> I will remove this part and resend a patch.
And for the socketcall part, we need the tswap16():
for instance,
int a = htons(0x0003);
On a LE host:
a = 0x00000300
On a BE host:
a = 0x00000003
If the guest is BE, it will put in memory:
0x00 0x00 0x00 0x03
Then a LE host, will read:
int b = 0x03000000
but get_user_ual() in do_socketcall() will byte-swap it and put
0x00000003 in a[2].
so without the byte-swap, we call do_socket(..., 0x0003),
whereas the syscall is waiting for htons(0x0003) -> 0x0300 as we are on
LE host.
I'm sorry, I can't explain that better...
>>
>>> fd_trans_unregister(ret);
>>> break;
>>> #endif
>>> --
>>> 2.4.3
>>
>> thanks
>> -- PMM
>>
>
> Thank you for your comments,
> Laurent
>
next prev parent reply other threads:[~2015-10-27 10:47 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-06 17:11 [Qemu-devel] [PATCH] linux-user: manage SOCK_PACKET socket type Laurent Vivier
2015-10-26 14:40 ` Peter Maydell
2015-10-27 3:09 ` Laurent Vivier
2015-10-27 10:47 ` Laurent Vivier [this message]
2015-10-27 11:35 ` Peter Maydell
2015-10-27 11:39 ` Peter Maydell
2015-10-27 11:49 ` Laurent Vivier
2015-10-27 11:52 ` Peter Maydell
2015-10-27 11:56 ` Laurent Vivier
2015-10-27 11:54 ` Laurent Vivier
2015-10-27 11:50 ` Peter Maydell
2015-10-27 11:54 ` Laurent Vivier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=562F564A.5040902@vivier.eu \
--to=laurent@vivier.eu \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=riku.voipio@iki.fi \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.