From mboxrd@z Thu Jan  1 00:00:00 1970
From: "santosh.shilimkar@oracle.com" <santosh.shilimkar@oracle.com>
Subject: Re: [PATCH v2 net-next 1/3] rds: Changing IP address internal
 representation to struct in6_addr
Date: Fri, 6 Jul 2018 07:55:18 -0700
Message-ID: <116715eb-12f0-94d9-279c-b06b89d7dac7@oracle.com>
References: <cover.1530086216.git.ka-cheong.poon@oracle.com>
 <693463ed87c5956eed05a6266c4edf354ab3d51a.1530086216.git.ka-cheong.poon@oracle.com>
 <c5d7497c-2761-8941-19df-dcae087d10ce@oracle.com>
 <6a7b0951-d950-1555-a5b2-622d6f7b9f8f@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: davem@davemloft.net, rds-devel@oss.oracle.com
To: Ka-Cheong Poon <ka-cheong.poon@oracle.com>, netdev@vger.kernel.org
Return-path: <netdev-owner@vger.kernel.org>
Received: from userp2120.oracle.com ([156.151.31.85]:40102 "EHLO
        userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753450AbeGFOz3 (ORCPT
        <rfc822;netdev@vger.kernel.org>); Fri, 6 Jul 2018 10:55:29 -0400
In-Reply-To: <6a7b0951-d950-1555-a5b2-622d6f7b9f8f@oracle.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 7/6/18 2:08 AM, Ka-Cheong Poon wrote:
> On 07/06/2018 01:58 AM, Santosh Shilimkar wrote:
> 

> 
>>> diff --git a/net/rds/connection.c b/net/rds/connection.c
>>> index abef75d..ca72563 100644
>>> --- a/net/rds/connection.c
>>> +++ b/net/rds/connection.c
>>
>>> @@ -142,9 +151,12 @@ static void __rds_conn_path_init(struct 
>>> rds_connection *conn,
>>>    * are torn down as the module is removed, if ever.
>>>    */
>>>   static struct rds_connection *__rds_conn_create(struct net *net,
>>> -                        __be32 laddr, __be32 faddr,
>>> -                       struct rds_transport *trans, gfp_t gfp,
>>> -                       int is_outgoing)
>>> +                        const struct in6_addr *laddr,
>>> +                        const struct in6_addr *faddr,
>>> +                        struct rds_transport *trans,
>>> +                        gfp_t gfp,
>>> +                        int is_outgoing,
>>> +                        int dev_if)
>> Am just wondering if we can handle local link address case differently
>> instead of pushing the interface index everywhere. Did you think about
>> any alternative. This can also avoid bunch of changes again and hence
>> the question. Am trying to see if we can minimize the changes to
>> absolute must have to support IPv6.
> 
> 
> If link local address is supported, scoped ID must be used.
> Unless we remove the support of link local address, we need
> to keep scope ID.
>
Ok.

> 
>>> diff --git a/net/rds/ib.h b/net/rds/ib.h
>>> index a6f4d7d..12f96b3 100644
>>> --- a/net/rds/ib.h
>>> +++ b/net/rds/ib.h
> 
>>> +union rds_ib_conn_priv {
>>> +    struct rds_ib_connect_private    ricp_v4;
>>> +    struct rds6_ib_connect_private    ricp_v6;
>>>   };
>> This change was invetiable. Just add a comment saying the priviate
>> data size for v6 vs v4 is  different. Also for IPv6 priviate data,
>> I suggest add some resrve fields for any future extensions so
>> that things can be added without breaking wire protocol. With IPv4,
>> we ended up in bit of mess.
> 
> 
> Will add some comments.  Unfortunately the IB private data
> exchange has only a limited space.  I don't think there is
> any more space left for reserved field.  I think we should
> think of a different way to handle extensions in future.
>
There is enough space. You can send upto 512 bytes iirc. Please
add some reserve fields and ping me if you run into issues.

>>> diff --git a/net/rds/send.c b/net/rds/send.c
>>> index 94c7f74..6ed2e92 100644
>>> --- a/net/rds/send.c
>>> +++ b/net/rds/send.c
> 
>>> @@ -1081,27 +1085,59 @@ int rds_sendmsg(struct socket *sock, struct 
>>> msghdr *msg, size_t payload_len)
>>>           goto out;
>>>       }
>>> -    if (msg->msg_namelen) {
>>> -        /* XXX fail non-unicast destination IPs? */
>>> -        if (msg->msg_namelen < sizeof(*usin) || usin->sin_family != 
>>> AF_INET) {
>>> +    namelen = msg->msg_namelen;
>>> +    if (namelen != 0) {
>>> +        if (namelen < sizeof(*usin)) {
>>> +            ret = -EINVAL;
>>> +            goto out;
>>> +        }
>>> +        switch (namelen) {
>>> +        case sizeof(*usin):
>>> +            if (usin->sin_family != AF_INET ||
>>> +                usin->sin_addr.s_addr == htonl(INADDR_ANY) ||
>>> +                usin->sin_addr.s_addr == htonl(INADDR_BROADCAST) ||
>>> +                IN_MULTICAST(ntohl(usin->sin_addr.s_addr))) {
>>> +                ret = -EINVAL;
>> The idea was to fail non-unicast IP someday but can you not add this
>> change as part of v6 series. We can look at it later unless you need
>> this change for v6. Again the question is mainly to add only necessary
>> check and leave the existing behavior as is so please re-check all below
>> checks too.
> 
> 
> This is to match the same IPv6 check.  In IPv6 code, it checks
> if the address is a unicast address.  I can remove the IPv4
> checks but if an app does send to a multicast address, the
> connection will be stuck forever.
>
OK. Lets keep it then.

> 
>>> diff --git a/net/rds/transport.c b/net/rds/transport.c
>>> index 0b188dd..c9788db 100644
>>> --- a/net/rds/transport.c
>>> +++ b/net/rds/transport.c
>>
>>> +    if (ipv6_addr_v4mapped(addr)) {
>>
>> Dave already commented on ipv6_addr_v4mapped(). Apart from
>> some of those comments questions, rest of the patch looks
>> really good and nicely done. Will also look at your
>> subsequent two patches and try to send you comments in next
>> few days.
> 
> 
> Do you mean using mapped address to create IPv4 connections?
> I already have it in my work space.  Will be in v3.

yeah. Thanks !!