All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laurent Vivier <laurent@vivier.eu>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Riku Voipio <riku.voipio@iki.fi>,
	QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH] linux-user: manage SOCK_PACKET socket type.
Date: Tue, 27 Oct 2015 04:09:08 +0100	[thread overview]
Message-ID: <562EEAD4.7070706@vivier.eu> (raw)
In-Reply-To: <CAFEAcA91uAwxoM934xg3EKLrxNSNdA496cxb8C1TdmmSsVn5Kg@mail.gmail.com>



Le 26/10/2015 15:40, Peter Maydell a écrit :
> On 6 October 2015 at 18:11, Laurent Vivier <laurent@vivier.eu> wrote:
>> This is obsolete, but if we want to use dhcp with some distros (like debian
>> ppc 8.2 jessie), we need it.
>>
>> At the bind level, we are not able to know the socket type so we try to
>> guess it by analyzing the name. We manage only the case "ethX",
>> "ethX" in spk_device is similar to set htons(0x6574) in sll_protocol in the
>> normal case, and as this protocol does not exist, it's ok.
>>
>> SOCK_PACKET uses network endian to encode protocol in socket()
>>
>> in PACKET(7) :
>>                                  protocol is the  IEEE  802.3  protocol
>> number in network order.  See the <linux/if_ether.h> include file for a
>> list of allowed protocols.  When protocol is  set  to  htons(ETH_P_ALL)
>> then all protocols are received.  All incoming packets of that protocol
>> type will be passed to the packet socket before they are passed to  the
>> protocols implemented in the kernel.
>>
>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>> ---
>> This patch is a remix of an old patch sent in 2012:
>> https://patchwork.ozlabs.org/patch/208892/
>>
>>  linux-user/syscall.c | 33 +++++++++++++++++++++++++++------
>>  1 file changed, 27 insertions(+), 6 deletions(-)
>>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index 64be431..71cc1e2 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -111,6 +111,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
>>  #include <linux/route.h>
>>  #include <linux/filter.h>
>>  #include <linux/blkpg.h>
>> +#include <linux/if_packet.h>
>>  #include "linux_loop.h"
>>  #include "uname.h"
>>
>> @@ -1198,11 +1199,20 @@ static inline abi_long target_to_host_sockaddr(struct sockaddr *addr,
>>      memcpy(addr, target_saddr, len);
>>      addr->sa_family = sa_family;
>>      if (sa_family == AF_PACKET) {
>> -       struct target_sockaddr_ll *lladdr;
>> +        /* Manage an obsolete case :
>> +         * if socket type is SOCK_PACKET, bind by name otherwise by index
>> +         * but we are not able to know socket type, so check if the name
>> +         * is usable...
>> +         * see linux/net/packet/af_packet.c: packet_bind_spkt()
>> +         */
>> +        if (strncmp((char *)((struct sockaddr_pkt *)addr)->spkt_device,
>> +                    "eth", 3) != 0) {
>> +            struct target_sockaddr_ll *lladdr;
> 
> This confuses me. The packet(7) manpage suggests there are two flavours
> of packet socket:
>  (1) legacy AF_INET + SOCK_PACKET
>  (2) new style AF_PACKET + SOCK_RAW / SOCK_DGRAM
> 
> but this comment suggests it's trying to handle AF_PACKET + SOCK_PACKET ?

In fact, I've started not from the man page, but from a non working dhcp
client, originally with a m68k target and etch-m68k distro, and I've met
again this problem on a ppc target and jessie distro.

> 
> If AF_PACKET was introduced as the new way of doing things, it's not
> clear why it would be the family type used in the legacy approach's
> sockaddr_pkt (though there seems to be code around that does this).
> I suspect that 2.0 kernels just didn't check af_family here at all.

by digging into the code, I have found:

$ apt-get source isc-dhcp-client
$ vi isc-dhcp-4.3.1/common/lpf.c

...
 72 int if_register_lpf (info)
 73         struct interface_info *info;
 74 {
...
 79         if ((sock = socket(PF_PACKET, SOCK_PACKET,
 80                            htons((short)ETH_P_ALL))) < 0) {
...

So we can see socket() is used with domain PF_PACKET and type SOCK_PACKET.

Next, the interface name is put into the sa_data of sockaddr, and bind()
is used with AF_PACKET:

 94         /* Bind to the interface name */
 95         memset (&sa, 0, sizeof sa);
 96         sa.sa_family = AF_PACKET;
 97         strncpy (sa.sa_data, (const char *)info -> ifp, sizeof
sa.sa_data);
 98         if (bind (sock, &sa, sizeof sa)) {

ifp is initialized from a list of all discovered interfaces in
common/discover.c:
...
 238 int
 239 begin_iface_scan(struct iface_conf_list *ifaces) {
...
 283         if (ioctl(ifaces->sock, SIOCGLIFCONF, &ifaces->conf) < 0) {
...
 918 void
 919 discover_interfaces(int state) {
...
 924         struct interface_info *tmp;
...
 939         if (!begin_iface_scan(&ifaces)) {
...
 955         while (next_iface(&info, &err, &ifaces)) {
 956
 957                 /* See if we've seen an interface that matches this
one. */
 958                 for (tmp = interfaces; tmp; tmp = tmp->next) {
 959                         if (!strcmp(tmp->name, info.name))
 960                                 break;
 961                 }

...
 987                         strcpy(tmp->name, info.name);
...
1050         }
...
1063         for (tmp = interfaces ; tmp != NULL ; tmp = tmp->next) {
1064                 if (tmp->ifp == NULL) {
1065                         struct ifreq *tif;
1066
1067                         tif = (struct ifreq *)dmalloc(sizeof(struct
ifreq),
1068                                                       MDL);
1069                         if (tif == NULL)
1070                                 log_fatal("no space for ifp mockup.");
1071                         strcpy(tif->ifr_name, tmp->name);
1072                         tmp->ifp = tif;
1073                 }
1074         }

> 
> The code in the kernel's packet_recvmsg fills in the spkt_family
> field with the netdevice's type field, which is an ARPHRD_* constant
> as far as I can tell (I could well be wrong here).

kernel 4.3.0-rc3, net/packet/af_packet.c:

   2961
   2962 static int packet_bind_spkt(struct socket *sock, struct sockaddr
*uaddr,
   2963                             int addr_len)
   2964 {
   2965         struct sock *sk = sock->sk;
   2966         char name[15];
   2967         struct net_device *dev;
   2968         int err = -ENODEV;
   2969
   2970         /*
   2971          *      Check legality
   2972          */
   2973
   2974         if (addr_len != sizeof(struct sockaddr))
   2975                 return -EINVAL;
   2976         strlcpy(name, uaddr->sa_data, sizeof(name));
   2977
   2978         dev = dev_get_by_name(sock_net(sk), name);
   2979         if (dev)
   2980                 err = packet_do_bind(sk, dev, pkt_sk(sk)->num);
   2981         return err;
   2982 }
...
   4246 static const struct proto_ops packet_ops_spkt = {
   4247         .family =       PF_PACKET,
...
   4250         .bind =         packet_bind_spkt,
...
   3022
   3023 static int packet_create(struct net *net, struct socket *sock,
int protocol,
   3024                          int kern)
...
   3045         if (sock->type == SOCK_PACKET)
   3046                 sock->ops = &packet_ops_spkt;
...

> Odd to have code in target_to_host_sockaddr and not in
> host_to_target_sockaddr.

In the case of host_to_target_sockaddr(), there is no "if (sa_family ==
AF_PACKET) {" as it was in target_to_host_sockaddr(), it's why I didn't
add it (and I don't like to add code I don't test).

> 
>> -       lladdr = (struct target_sockaddr_ll *)addr;
>> -       lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex);
>> -       lladdr->sll_hatype = tswap16(lladdr->sll_hatype);
>> +            lladdr = (struct target_sockaddr_ll *)addr;
>> +            lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex);
>> +            lladdr->sll_hatype = tswap16(lladdr->sll_hatype);
>> +        }
>>      }
>>      unlock_user(target_saddr, target_addr, 0);
>>
>> @@ -2509,7 +2519,12 @@ static abi_long do_socketcall(int num, abi_ulong vptr)
>>      /* now when we have the args, actually handle the call */
>>      switch (num) {
>>      case SOCKOP_socket: /* domain, type, protocol */
>> -        return do_socket(a[0], a[1], a[2]);
>> +        if (a[0] == AF_PACKET ||
>> +            a[1] == TARGET_SOCK_PACKET) {
>> +            return do_socket(a[0], a[1], tswap16(a[2]));
>> +        } else {
>> +            return do_socket(a[0], a[1], a[2]);
>> +        }
>>      case SOCKOP_bind: /* sockfd, addr, addrlen */
>>          return do_bind(a[0], a[1], a[2]);
>>      case SOCKOP_connect: /* sockfd, addr, addrlen */
>> @@ -7500,7 +7515,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>  #endif
>>  #ifdef TARGET_NR_socket
>>      case TARGET_NR_socket:
>> -        ret = do_socket(arg1, arg2, arg3);
>> +        if (arg1 == AF_PACKET ||
>> +            arg2 == TARGET_SOCK_PACKET) {
>> +            /* in this case, socket() needs a network endian short */
>> +            ret = do_socket(arg1, arg2, tswap16(arg3));
>> +        } else {
>> +            ret = do_socket(arg1, arg2, arg3);
>> +        }
> 
> This doesn't make sense to me. The argument to socket()
> is passed in via a register; so if the guest code correctly
> passes the protocol value as htons(whatever) then that will
> be the value in arg3 and we do not need to swap anything.
> 
> I see we had this discussion about the previous version of the
> patch too... and my argument that we don't need the tswap16
> in the socketcall code path either still makes sense to me.

I agree with you, I think I have confused socketcall() that passes
parameters by memory, and socket() that passes parameters by registers.
I will remove this part and resend a patch.

> 
>>          fd_trans_unregister(ret);
>>          break;
>>  #endif
>> --
>> 2.4.3
> 
> thanks
> -- PMM
> 

Thank you for your comments,
Laurent

  reply	other threads:[~2015-10-27  3:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-06 17:11 [Qemu-devel] [PATCH] linux-user: manage SOCK_PACKET socket type Laurent Vivier
2015-10-26 14:40 ` Peter Maydell
2015-10-27  3:09   ` Laurent Vivier [this message]
2015-10-27 10:47     ` Laurent Vivier
2015-10-27 11:35       ` Peter Maydell
2015-10-27 11:39         ` Peter Maydell
2015-10-27 11:49           ` Laurent Vivier
2015-10-27 11:52             ` Peter Maydell
2015-10-27 11:56               ` Laurent Vivier
2015-10-27 11:54         ` Laurent Vivier
2015-10-27 11:50     ` Peter Maydell
2015-10-27 11:54       ` Laurent Vivier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=562EEAD4.7070706@vivier.eu \
    --to=laurent@vivier.eu \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=riku.voipio@iki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.