All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlad Yasevich <vladislav.yasevich@hp.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Miller <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	yjwei@cn.fujitsu.com, netdev@vger.kernel.org
Subject: Re: [PATCH] inet6: Fix paramater issue of inet6_csk_xmit
Date: Fri, 01 Aug 2008 10:50:17 -0400	[thread overview]
Message-ID: <489322A9.3080906@hp.com> (raw)
In-Reply-To: <20080801131453.GA19851@gondor.apana.org.au>

Herbert Xu wrote:
> On Fri, Aug 01, 2008 at 07:50:41PM +0800, Herbert Xu wrote:
>> So we should be able to simply replace ipfragok with skb->local_df.
>> The only trouble is that SCTP doesn't seem to keep its PMTU setting
>> in the usual spot so we'll need to fix that before this can be done.
> 
> OK I think this should work:
> 
> sctp: Drop ipfargok in sctp_xmit function
> 
> The ipfragok flag controls whether the packet may be fragmented
> either on the local host on beyond.  The latter is only valid on
> IPv4.
> 
> In fact, we never want to do the latter even on IPv4 when PMTU is
> enabled.  This is because even though we can't fragment packets

I think you mean that we can't "re-fragment packets".  The reason
for that is how SCTP handles sequence numbering,  Instead of numbering
bytes, it number messages.  Thus once the number is assigned, you can't
re-fragment it as that would result in different sequence numbers.

> within SCTP due to the prtocol's inherent faults, we can still
> fragment it at IP layer.  By setting the DF bit we will improve
> the PMTU process.
> 
> RFC 2960 only says that we SHOULD clear the DF bit in this case,
> so we're compliant even if we set the DF bit.

RFC 4960 doesn't mention DF bit anywhere, so that's not an issue.

> 
> Once we make this change, we only need to control the local
> fragmentation.  There is already a bit in the skb which controls
> that, local_df.  So this patch sets that instead of using the
> ipfragok argument.
> 
> The only complication is that there isn't a struct sock object
> per transport, so for IPv4 we have to resort to changing the
> pmtudisc field for every packet.  This should be safe though
> as the protocol is single-threaded.
> 
> Note that after this patch we can remove ipfragok from the rest
> of the stack too.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 535a18f..ab1c472 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -524,8 +524,7 @@ static inline void sctp_ssn_skip(struct sctp_stream *stream, __u16 id,
>   */
>  struct sctp_af {
>  	int		(*sctp_xmit)	(struct sk_buff *skb,
> -					 struct sctp_transport *,
> -					 int ipfragok);
> +					 struct sctp_transport *);
>  	int		(*setsockopt)	(struct sock *sk,
>  					 int level,
>  					 int optname,
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index a238d68..11dab07 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -195,8 +195,7 @@ out:
>  }
>  
>  /* Based on tcp_v6_xmit() in tcp_ipv6.c. */
> -static int sctp_v6_xmit(struct sk_buff *skb, struct sctp_transport *transport,
> -			int ipfragok)
> +static int sctp_v6_xmit(struct sk_buff *skb, struct sctp_transport *transport)
>  {
>  	struct sock *sk = skb->sk;
>  	struct ipv6_pinfo *np = inet6_sk(sk);
> @@ -231,7 +230,10 @@ static int sctp_v6_xmit(struct sk_buff *skb, struct sctp_transport *transport,
>  
>  	SCTP_INC_STATS(SCTP_MIB_OUTSCTPPACKS);
>  
> -	return ip6_xmit(sk, skb, &fl, np->opt, ipfragok);
> +	if (!(tp->param_flags & SPP_PMTUD_ENABLE))
> +		skb->local_df = 1;
> +
> +	return ip6_xmit(sk, skb, &fl, np->opt, 0);
>  }
>  
>  /* Returns the dst cache entry for the given source and destination ip
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 4568464..0dc4a7d 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -586,10 +586,8 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>  	SCTP_DEBUG_PRINTK("***sctp_transmit_packet*** skb len %d\n",
>  			  nskb->len);
>  
> -	if (tp->param_flags & SPP_PMTUD_ENABLE)
> -		(*tp->af_specific->sctp_xmit)(nskb, tp, packet->ipfragok);
> -	else
> -		(*tp->af_specific->sctp_xmit)(nskb, tp, 1);
> +	nskb->local_df = packet->ipfragok;
> +	(*tp->af_specific->sctp_xmit)(nskb, tp);
>  
>  out:
>  	packet->size = packet->overhead;
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index a6e0818..80b8efb 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -862,16 +862,25 @@ static int sctp_inet_supported_addrs(const struct sctp_sock *opt,
>  
>  /* Wrapper routine that calls the ip transmit routine. */
>  static inline int sctp_v4_xmit(struct sk_buff *skb,
> -			       struct sctp_transport *transport, int ipfragok)
> +			       struct sctp_transport *transport)
>  {
> +	struct inet_sock *inet = inet_sk(skb->sk);
> +
>  	SCTP_DEBUG_PRINTK("%s: skb:%p, len:%d, "
>  			  "src:%u.%u.%u.%u, dst:%u.%u.%u.%u\n",
>  			  __func__, skb, skb->len,
>  			  NIPQUAD(skb->rtable->rt_src),
>  			  NIPQUAD(skb->rtable->rt_dst));
>  
> +	/*
> +	 * You'd think with a protocol this big they could afford
> +	 * a struct sock per transport.
> +	 */

Please don't add comments like this.  It does not contribute anything.

> +	inet->pmtudisc = transport->param_flags & SPP_PMTUD_ENABLE ?
> +			 IP_PMTUDISC_DO : IP_PMTUDISC_DONT;

If you insist on doing this, you need to save the old value and restore
it after ip_queue_xmit(). 

The reason is that PMTU discovery change on a specific transport or
association should not affect the socket, since there could be multiple
associations on a given socket.  Each association has it's own control.

Thanks
-vlad

> +
>  	SCTP_INC_STATS(SCTP_MIB_OUTSCTPPACKS);
> -	return ip_queue_xmit(skb, ipfragok);
> +	return ip_queue_xmit(skb, 0);
>  }
>  
>  static struct sctp_af sctp_af_inet;
> 
> Cheers,


  reply	other threads:[~2008-08-01 14:50 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-31  9:56 [PATCH] inet6: Fix paramater issue of inet6_csk_xmit Wei Yongjun
2008-08-01  3:48 ` David Miller
2008-08-01  5:17   ` Wei Yongjun
2008-08-01  9:11     ` Herbert Xu
2008-08-01  9:18       ` David Miller
2008-08-01  9:25         ` Herbert Xu
2008-08-01 11:50           ` Herbert Xu
2008-08-01 13:14             ` Herbert Xu
2008-08-01 14:50               ` Vlad Yasevich [this message]
2008-08-02  0:30                 ` Herbert Xu
2008-08-02 23:26                   ` Vlad Yasevich
2008-08-03  2:01                     ` Herbert Xu
2008-08-03  6:56                     ` David Miller
2008-08-04  1:03               ` Vlad Yasevich
2008-08-04  1:55                 ` Herbert Xu
2008-08-04  2:58                   ` Wei Yongjun
2008-08-04  3:05                     ` Herbert Xu
2008-08-04  4:16                       ` David Miller
2008-08-04  4:15                   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=489322A9.3080906@hp.com \
    --to=vladislav.yasevich@hp.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=netdev@vger.kernel.org \
    --cc=yjwei@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.