Netdev List


* Re: [PATCH] iproute2: xfrm state add abort issue
From: Sohny Thomas @ 2013-09-28 13:24 UTC (permalink / raw)
  To: David Laight; +Cc: stephen, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7361@saturn3.aculab.com>

On Friday 27 September 2013 01:56 PM, David Laight wrote:
>> ip xfrm state add causes a SIGABRT due to a strncpy_chk .
>> This happens since strncpy doesn't account for the '\0' .
>> I have fixed this using sizeof  instead of strlen .
>>
>> There is a redhat bug which documents this issue
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=982761
>>
>> Signed-off-by: Sohny Thomas <sohthoma@in.ibm.com>
>>
>> --------------
>>
>> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
>> index 389942c..7dd8799 100644
>> --- a/ip/xfrm_state.c
>> +++ b/ip/xfrm_state.c
>> @@ -117,7 +117,7 @@ static int xfrm_algo_parse(struct xfrm_algo *alg,
>> enum xfrm_attr_type_t type,
>>                               char *name, char *key, char *buf, int max)
>>     {
>>            int len;
>> -       int slen = strlen(key);
>> +       int slen = sizeof(key);
>
> you definitely don't want sizeof(key) - that is either 4 or 8.
Oh damn, my bad.
I think I will go with strlen(key) + 1,

or I will pass slen + 1 to strncpy.

Regards,
Sohny
>
> 	David
>
>
>
>
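A minimal userspace sketch of the point David raises, not the iproute2 fix itself: sizeof applied to a char * parameter yields the pointer size, while copying a C string together with its terminator needs strlen(key) + 1 bytes. The helper names key_len_wrong, key_len_right and copy_key are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sizeof on a pointer parameter gives the pointer
 * size (typically 4 or 8), not the length of the string it points to. */
size_t key_len_wrong(const char *key)
{
	(void)key;
	return sizeof(key);
}

/* Hypothetical helper: bytes needed to hold the string plus its '\0'. */
size_t key_len_right(const char *key)
{
	return strlen(key) + 1;
}

/* Hypothetical helper: copy key into buf only if it fits, terminator
 * included. Returns 0 on success, -1 if buf is too small. */
int copy_key(char *buf, size_t max, const char *key)
{
	size_t need = strlen(key) + 1;

	if (need > max)
		return -1;
	memcpy(buf, key, need);
	return 0;
}
```

Checking the required size against the destination before copying avoids both the truncated terminator and the fortified-strncpy abort, whichever length expression ends up in the real patch.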

^ permalink raw reply

* Re: [PATCH v1 net-next] net: pkt_sched: PIE AQM scheme
From: Eric Dumazet @ 2013-09-28 17:03 UTC (permalink / raw)
  To: Vijay Subramanian; +Cc: netdev, davem, shemminger, Mythili Prabhu, Dave Taht
In-Reply-To: <1380333383-9507-1-git-send-email-subramanian.vijay@gmail.com>

On Fri, 2013-09-27 at 18:56 -0700, Vijay Subramanian wrote:

> +static int pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> +{
> +	struct pie_sched_data *q = qdisc_priv(sch);
> +
> +	if (unlikely(qdisc_qlen(sch) >= sch->limit))
> +		goto out;
> +

...

> +out:
> +	q->stats.overlimit++;
> +	qdisc_drop(skb, sch);
> +	return NET_XMIT_CN;  /*indicate congestion*/
> +}


If a Qdisc drops a packet because sch->limit is hit, you must:

return qdisc_drop(skb, sch);

So that NET_XMIT_DROP is returned, not NET_XMIT_CN.

This packet was dropped for sure.

vi +96 net/sched/sch_api.c

   ---enqueue

   enqueue returns 0, if packet was enqueued successfully.
   If packet (this one or another one) was dropped, it returns
   not zero error code.
   NET_XMIT_DROP        - this packet dropped
     Expected action: do not backoff, but wait until queue will clear.
   NET_XMIT_CN          - probably this packet enqueued, but another one dropped.
     Expected action: backoff or ignore
   NET_XMIT_POLICED     - dropped by police.
     Expected action: backoff or error to real-time apps.
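The return-code convention quoted above can be illustrated with a small userspace model. This is only a sketch: the TOY_XMIT_* values, struct toy_queue, toy_drop() and toy_enqueue() are hypothetical stand-ins and not kernel APIs. The one property taken from the review is that the drop helper itself returns the "dropped" code, so the enqueue path can return its result directly.

```c
#include <stddef.h>

/* Stand-in values; the kernel defines the real NET_XMIT_* codes. */
enum { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1, TOY_XMIT_CN = 2 };

/* Hypothetical toy queue: it only counts packets. */
struct toy_queue {
	size_t qlen;
	size_t limit;
	size_t drops;
};

/* Mirrors the shape of qdisc_drop(): account for the drop and return
 * the DROP code so the caller can write "return toy_drop(q);". */
int toy_drop(struct toy_queue *q)
{
	q->drops++;
	return TOY_XMIT_DROP;
}

/* The packet being enqueued is the one dropped when the limit is hit,
 * so the caller sees DROP rather than CN. */
int toy_enqueue(struct toy_queue *q)
{
	if (q->qlen >= q->limit)
		return toy_drop(q);
	q->qlen++;
	return TOY_XMIT_SUCCESS;
}
```

With a limit of one packet, the first toy_enqueue() call succeeds and the second reports TOY_XMIT_DROP, which tells the sender that its own packet was lost.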

^ permalink raw reply

* Re: [PATCH v1 net-next] net: pkt_sched: PIE AQM scheme
From: Stephen Hemminger @ 2013-09-28 17:06 UTC (permalink / raw)
  To: Vijay Subramanian; +Cc: netdev
In-Reply-To: <1902752B0C92F943AB7EA9EE13E2DEEC1272C74DD1@HQ1-EXCH02.corp.brocade.com>

Thanks for submitting this. It is in OK shape, but I still
see several issues.

> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index f2624b5..2fb6e6d 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -787,4 +787,30 @@ struct tc_fq_qd_stats {
>         __u32   throttled_flows;
>         __u32   pad;
>  };
> +
> +/*PIE*/
> +enum {
> +       TCA_PIE_UNSPEC,
> +       TCA_PIE_TARGET,
> +       TCA_PIE_LIMIT,
> +       TCA_PIE_TUPDATE,
> +       TCA_PIE_ALPHA,
> +       TCA_PIE_BETA,
> +       TCA_PIE_ECN,
> +       TCA_PIE_BYTEMODE,
> +       __TCA_PIE_MAX
> +};
> +#define TCA_PIE_MAX   (__TCA_PIE_MAX - 1)
> +
> +struct tc_pie_xstats {
> +       __u32 prob;             /* current probability */
> +       __u32 delay;            /* current delay in ms */
> +       __u32 avg_dq_rate;      /* current average dq_rate in bits/pie_time */
> +       __u32 packets_in;       /* total number of packets enqueued */
> +       __u32 dropped;          /* packets dropped due to pie_action */
> +       __u32 overlimit;        /* dropped due to lack of space in queue */
> +       __u32 maxq;             /* maximum queue size */
> +       __u32 ecn_mark;         /* packets marked with ecn*/
> +};
> +
>  #endif
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index c03a32a..7b32e58 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -286,6 +286,17 @@ config NET_SCH_FQ
> 
>           If unsure, say N.
> 
> +config NET_SCH_PIE
> +        tristate "Proportianal Enhanced Controller AQM (PIE)"

Spelling should be: Proportional 


> diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c
> new file mode 100644
> index 0000000..cfcfde9
> --- /dev/null
> +++ b/net/sched/sch_pie.c
> @@ -0,0 +1,623 @@
> +/* Copyright (C) 2013 Cisco Systems, Inc, 2013.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
> + * USA.
> + *
> + * Author: Vijay Subramanian <vijaynsu@cisco.com>
> + * Author: Mythili Prabhu <mysuryan@cisco.com>
> + *
> + * ECN support is added by Naeem Khademi <naeemk@ifi.uio.no>
> + * University of Oslo, Norway.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/errno.h>
> +#include <linux/skbuff.h>
> +#include <net/pkt_sched.h>
> +#include <net/inet_ecn.h>
> +
> +#define PIE_DEFAULT_QUEUE_LIMIT 200    /* in packets */
> +#define QUEUE_THRESHOLD (5000)
Useless parentheses here.

> +#define DQCOUNT_INVALID -1
> +#define THRESHOLD_PKT_SIZE     1500
> +#define MAX_INT_VALUE  0xffffffff
> +#define MAX_INT_VALUE_CAP  (0xffffffff >> 8)
> +
> +typedef u32 pie_time_t;
> +typedef s32 pie_tdiff_t;
> +#define PIE_SHIFT 10
> +#define MS2PIETIME(a) ((a * NSEC_PER_MSEC) >> PIE_SHIFT)
> +#define PIE_TIME_PER_SEC  ((NSEC_PER_SEC >> PIE_SHIFT))
> +

I would prefer that all packet schedulers use the same set of clock
routines (psched), rather than each inventing its own wrapper for the
high-resolution clock.

> +static inline pie_time_t pie_get_time(void)
> +{
> +       u64 ns = ktime_to_ns(ktime_get());
> +       return ns >> PIE_SHIFT;
> +}
> +
> +static inline u32 pie_time_to_ms(pie_time_t val)
> +{
> +       u64 valms = ((u64) val << PIE_SHIFT);
> +
> +       do_div(valms, NSEC_PER_MSEC);
> +       return (u32) valms;
> +}

Psched has all this.

> +/* parameters used*/
> +struct pie_params {
> +       pie_time_t target;      /* user specified target delay in pietime*/
> +       pie_time_t tupdate;     /* frequency with which the timer fires*/
                                                                         ^ space before end of comment
> +       u32 limit;              /* number of packets that can be enqueued */
> +       u32 alpha;              /* alpha and beta are between -4 and 4 */
> +       u32 beta;               /* and are used for shift relative to 1 */
> +       bool ecn;               /* true if ecn is enabled */
> +       bool bytemode;          /* to scale drop early prob based on pkt size */
> +};
> +
> +/* variables used*/
> +struct pie_vars {
> +       u32 prob;               /* probability but scaled by u32 limit. */
> +       pie_time_t burst_time;
> +       pie_time_t qdelay;
> +       pie_time_t qdelay_old;
> +       u32 dq_count;           /* measured in bytes */
> +       pie_time_t dq_tstamp;   /* drain rate */
> +       u32 avg_dq_rate;        /* bytes per pietime tick, scaled by 8 */
> +       u32 qlen_old;           /* in bytes */
> +};
> +
> +struct pie_stats {
> +       u32 packets_in;         /* total number of packets enqueued */
> +       u32 dropped;            /* packets dropped due to pie_action */
> +       u32 overlimit;          /* dropped due to lack of space in queue */
> +       u32 maxq;               /* maximum queue size */
> +       u32 ecn_mark;           /* packets marked with ECN */
> +};
> +
> +static void pie_params_init(struct pie_params *params)
> +{
> +       memset(params, 0, sizeof(*params));

Unnecessary: the structure is already zeroed when created by qdisc_alloc().
Call chain:
      qdisc_create
              qdisc_alloc
              ops->init => pie_init
                 pie_params_init
      
> +       params->alpha = 2;
> +       params->beta = 20;
> +       params->tupdate = MS2PIETIME(30);       /* 30 ms */
> +       params->limit = PIE_DEFAULT_QUEUE_LIMIT;
> +       params->target = MS2PIETIME(20);        /* 20 ms */
> +       params->ecn = false;
> +       params->bytemode = false;
> +}


> +
> +static void pie_vars_init(struct pie_vars *vars)
> +{
> +       memset(vars, 0, sizeof(*vars));
Ditto, already zeroed.

> +       vars->dq_count = DQCOUNT_INVALID;
> +       vars->avg_dq_rate = 0;
> +       /* default of 100 ms in pietime  */
> +       vars->burst_time = MS2PIETIME(100);
> +}
> +
> +static void pie_stats_init(struct pie_stats *stats)
> +{
> +       memset(stats, 0, sizeof(*stats));
> +}
> +
Ditto, already zeroed.

> +struct pie_sched_data {
> +       struct pie_params params;
> +       struct pie_vars vars;
> +       struct pie_stats stats;
> +       struct timer_list adapt_timer;
> +};

Standard practice is to put data structures before code.

> +
> +static inline bool drop_early(struct Qdisc *sch, u32 packet_size)

inline is not needed on a static function. Let the compiler decide.

> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       u32 rnd;
> +       u32 local_prob = q->vars.prob;
> +
> +
> +       /* If there is still burst allowance left or delay is much below target
> +        * not due to heavy dropping, skip random early drop
> +        */
> +       if (q->vars.burst_time > 0)
> +               return false;
> +
> +       /* If current delay is less than half of target, and
> +        * if drop prob is low already, disable early_drop
> +        */
> +       if ((q->vars.qdelay < q->params.target / 2)
> +           && (q->vars.prob < MAX_INT_VALUE / 5))
> +               return false;
> +
> +       /* If we have fewer than 2 packets, disable drop_early,
> +        * similar to min_th in RED
> +        */
> +       if (sch->qstats.backlog < 2 * 1500)
> +               return false;
> +
> +       /* If bytemode is turned on, use packet size to compute new
> +        * probablity. Smaller packets will have lower drop prob in this case
> +        */
> +       if (q->params.bytemode) {
> +               /* If packet_size is greater than THRESHOLD_PKT_SIZE,
> +                * we cap the probability to the maximum value
> +                */
> +               if (packet_size <= THRESHOLD_PKT_SIZE) {
> +                       local_prob =
> +                           (local_prob / THRESHOLD_PKT_SIZE) * packet_size;
> +               } else {
> +                       local_prob = MAX_INT_VALUE_CAP;
> +               }
> +
> +       } else {
> +               local_prob = q->vars.prob;
> +       }
> +
> +       rnd = net_random() % MAX_INT_VALUE_CAP;
> +       if (rnd < local_prob)
> +               return true;
> +
> +       return false;
> +}
> +
> +static int pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +
> +       if (unlikely(qdisc_qlen(sch) >= sch->limit))
> +               goto out;
> +
> +       if (!drop_early(sch, skb->len)) {
> +               /* we can enqueue the packet */
> +               q->stats.packets_in++;
> +
> +               if (qdisc_qlen(sch) > q->stats.maxq)
> +                       q->stats.maxq = qdisc_qlen(sch);
> +
> +               return qdisc_enqueue_tail(skb, sch);
> +       } else if (q->params.ecn && INET_ECN_set_ce(skb) &&
> +                  (q->vars.prob <= MAX_INT_VALUE / 10)) {
> +                       /* If packet is ecn capable, mark it if drop probability
> +                        * is lower than 10%, else drop it.
> +                        */
> +                       q->stats.ecn_mark++;
> +                       return qdisc_enqueue_tail(skb, sch);
> +       }
> +out:
> +       q->stats.overlimit++;
> +       qdisc_drop(skb, sch);
> +       return NET_XMIT_CN;  /*indicate congestion*/
> +}
> +
> +static const struct nla_policy pie_policy[TCA_PIE_MAX + 1] = {
> +       [TCA_PIE_TARGET] = {.type = NLA_U32},
                                             ^ space before }
> +       [TCA_PIE_LIMIT] = {.type = NLA_U32},
> +       [TCA_PIE_TUPDATE] = {.type = NLA_U32},
> +       [TCA_PIE_ALPHA] = {.type = NLA_U32},
> +       [TCA_PIE_BETA] = {.type = NLA_U32},
> +       [TCA_PIE_ECN] = {.type = NLA_U32},
> +       [TCA_PIE_BYTEMODE] = {.type = NLA_U32},
              Looks prettier if all = are aligned

> +};
> +
> +static int pie_change(struct Qdisc *sch, struct nlattr *opt)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       struct nlattr *tb[TCA_PIE_MAX + 1];
> +       unsigned int qlen;
> +       int err;
> +
> +       if (!opt)
> +               return -EINVAL;
> +
> +       err = nla_parse_nested(tb, TCA_PIE_MAX, opt, pie_policy);
> +       if (err < 0)
> +               return err;
> +
> +       sch_tree_lock(sch);
> +
> +       /* convert from microseconds to pietime */
> +       if (tb[TCA_PIE_TARGET]) {
> +               /* target is in us */
> +               u32 target = nla_get_u32(tb[TCA_PIE_TARGET]);
> +               /* convert to pietime */
> +               q->params.target = ((u64) target * NSEC_PER_USEC) >> PIE_SHIFT;
> +       }
> +
> +       if (tb[TCA_PIE_TUPDATE]) {
> +               /* tupdate is in us */
> +               u32 tupdate = nla_get_u32(tb[TCA_PIE_TUPDATE]);
> +               /* convert to pietime */
> +               q->params.tupdate =
> +                   ((u64) tupdate * NSEC_PER_USEC) >> PIE_SHIFT;
> +       }
> +
> +       if (tb[TCA_PIE_LIMIT]) {
> +               u32 limit = nla_get_u32(tb[TCA_PIE_LIMIT]);
> +               q->params.limit = limit;
> +               sch->limit = limit;
> +       }
> +
> +       if (tb[TCA_PIE_ALPHA])
> +               q->params.alpha = nla_get_u32(tb[TCA_PIE_ALPHA]);
> +
> +       if (tb[TCA_PIE_BETA])
> +               q->params.beta = nla_get_u32(tb[TCA_PIE_BETA]);
> +
> +       if (tb[TCA_PIE_ECN])
> +               q->params.ecn = nla_get_u32(tb[TCA_PIE_ECN]);
> +
> +       if (tb[TCA_PIE_BYTEMODE])
> +               q->params.bytemode = nla_get_u32(tb[TCA_PIE_BYTEMODE]);
> +
> +       /* Drop excess packets if new limit is lower */
> +       qlen = sch->q.qlen;
> +       while (sch->q.qlen > sch->limit) {
> +               struct sk_buff *skb = __skb_dequeue(&sch->q);
> +
> +               sch->qstats.backlog -= qdisc_pkt_len(skb);
> +               qdisc_drop(skb, sch);
> +       }
> +       qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
> +
> +       sch_tree_unlock(sch);
> +       return 0;
> +}
> +
> +static int pie_process_dequeue(struct Qdisc *sch, struct sk_buff *skb)
> +{
> +
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       int qlen = sch->qstats.backlog; /* current queue size in bytes */
> +
> +       /* If current queue is about 10 packets or more and dq_count is unset
> +        *  we have enough packets to calculate the drain rate. Save
> +        *  current time as dq_tstamp and start measurement cycle.
> +        */
> +
> +       if (qlen >= QUEUE_THRESHOLD && q->vars.dq_count == DQCOUNT_INVALID) {
> +               q->vars.dq_tstamp = pie_get_time();
> +               q->vars.dq_count = 0;
> +       }
> +
> +       /*  Calculate the average drain rate from this value.  If queue length
> +        *  has receded to a small value viz., <= QUEUE_THRESHOLD bytes,reset
> +        *  the dq_count to -1 as we don't have enough packets to calculate the
> +        *  drain rate anymore The following if block is entered only when we
> +        *  have a substantial queue built up (QUEUE_THRESHOLD bytes or more)
> +        *  and we calculate the drain rate for the threshold here.  dq_count is
> +        *  in bytes, time difference in pie_time, hence rate is in
> +        *  bytes/pie_time.
> +        */
> +
> +       if (q->vars.dq_count != DQCOUNT_INVALID) {

I prefer not having a blank line after each block comment when it refers
directly to the code below (it saves screen space when reading).

> +
> +               q->vars.dq_count += skb->len;
> +
> +               if (q->vars.dq_count >= QUEUE_THRESHOLD) {
> +                       pie_time_t now = pie_get_time();
> +                       pie_tdiff_t dtime = now - q->vars.dq_tstamp;
> +                       u64 count = q->vars.dq_count << 3;  /* scale by 8*/

This shift doesn't do what you expect. Since dq_count is 32 bits wide,
the compiler does a 32-bit shift (losing any high bits) and only then
widens the result when assigning it to count.
So either make count 32 bits and skip the expensive 64-bit divide,
or cast q->vars.dq_count to 64 bits before shifting.
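A small userspace sketch of the difference, assuming a platform where unsigned int is 32 bits wide. The function names scale_by_8_narrow and scale_by_8_wide are made up for this illustration and do not appear in the patch.

```c
#include <stdint.h>

/* The shift is evaluated in 32 bits, so the top three bits of dq_count
 * are discarded before the result is widened to 64 bits. */
uint64_t scale_by_8_narrow(uint32_t dq_count)
{
	uint64_t count = dq_count << 3;

	return count;
}

/* Casting first makes the shift itself a 64-bit operation, so no bits
 * are lost. */
uint64_t scale_by_8_wide(uint32_t dq_count)
{
	uint64_t count = (uint64_t)dq_count << 3;

	return count;
}
```

For small inputs the two functions agree; they diverge once dq_count reaches 2^29, where the narrow version wraps around.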

> +
> +                       if (dtime == 0)
> +                               return 0;
> +
> +                       /* dtime has overflowed */
> +                       if (dtime < 0)
> +                               dtime = -dtime;
> +
> +                       do_div(count, dtime);
> +
> +                       if (q->vars.avg_dq_rate == 0)
> +                               q->vars.avg_dq_rate = count;
> +                       else
> +                               q->vars.avg_dq_rate =
> +                                   (q->vars.avg_dq_rate -
> +                                    (q->vars.avg_dq_rate >> 3)) + (count >> 3);
> +
> +                       /* If the queue has receded below the threshold, we hold
> +                        * on to the last drain rate calculated, else we reset
> +                        * dq_count to 0 to re-enter the if block when the next
> +                        * packet is dequeued
> +                        */
> +
> +                       if (qlen < QUEUE_THRESHOLD)
> +                               q->vars.dq_count = DQCOUNT_INVALID;
> +                       else {
> +                               q->vars.dq_count = 0;
> +                               q->vars.dq_tstamp = pie_get_time();
> +                       }
> +
> +                       if (q->vars.burst_time > 0) {
> +                               if (q->vars.burst_time > dtime)
> +                                       q->vars.burst_time -= dtime;
> +                               else
> +                                       q->vars.burst_time = 0;
> +                       }
> +               }
> +
> +       }
> +       return 0;
> +}
> +
> +static void calculate_probability(struct Qdisc *sch)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       int qlen = sch->qstats.backlog; /* queue size in bytes */
> +       pie_time_t qdelay = 0;  /* in pietime */
> +       pie_time_t qdelay_old = q->vars.qdelay; /* in pietime */
> +       s32 delta = 0;          /* signed difference */
> +       u32 oldprob;
> +       u32 alpha, beta;
> +       bool update_prob = true;        /* Should probability be updated? */
> +

Big problem, this is called without locks??
timer -> pie_timer -> calculate_probability

When you add locks you also have to worry about deadlock
on shutdown.

> +       q->vars.qdelay_old = q->vars.qdelay;
> +
> +       if (q->vars.avg_dq_rate > 0)
> +               qdelay = (qlen << 3) / q->vars.avg_dq_rate;
> +       else
> +               qdelay = 0;
> +
> +       /* If qdelay is zero and qlen is not, it means qlen is very small, less
> +        * than dequeue_rate, so we do not update probabilty in this round
> +        */
> +       if (qdelay == 0 && qlen != 0)
> +               update_prob = false;
> +
> +       /* Add ranges for alpha and beta, more aggressive for high dropping
> +        * mode and gentle steps for light dropping mode
> +        * In light dropping mode, take gentle steps; in medium dropping mode,
> +        * take medium steps; in high dropping mode, take big steps.
> +        */
> +       if (q->vars.prob < MAX_INT_VALUE / 100) {
> +               alpha =
> +                   (q->params.alpha * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 7;
> +               beta =
> +                   (q->params.beta * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 7;
> +       } else if (q->vars.prob < MAX_INT_VALUE / 10) {
> +               alpha =
> +                   (q->params.alpha * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 5;
> +               beta =
> +                   (q->params.beta * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 5;
> +       } else {
> +               alpha =
> +                   (q->params.alpha * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 4;
> +               beta =
> +                   (q->params.beta * (MAX_INT_VALUE / PIE_TIME_PER_SEC)) >> 4;
> +       }
> +
> +       /* alpha and beta should be between 0 and 32, in multiples of 1/16
> +        */
> +       delta += alpha * ((qdelay - q->params.target));
> +       delta += beta * ((qdelay - qdelay_old));
> +
> +       oldprob = q->vars.prob;
> +
> +       /* addition to ensure we increase probability in steps of no
> +        *  more than 2%
> +        */
> +
> +       if (delta > (s32) (MAX_INT_VALUE * 2 / 100)
> +           && q->vars.prob >= MAX_INT_VALUE / 10) {
> +               delta = MAX_INT_VALUE * 2 / 100;
> +       }
> +
> +       /*  Non-linear drop
> +        *  Tune drop probability to increase quickly for high delays
> +        *  (250ms and above)
> +        *  250ms is derived through experiments and provides error protection
> +        */
> +
> +       if (qdelay > (MS2PIETIME(250)))
> +               delta += (2 * MAX_INT_VALUE) / 100;
> +
> +       q->vars.prob += delta;
> +
> +       if (delta > 0) {
> +               /* prevent overflow */
> +               if (q->vars.prob < oldprob) {
> +                       q->vars.prob = MAX_INT_VALUE;
> +                       /* Prevent normalization error
> +                        * If probability is the maximum value already,
> +                        * we normalize it here, and skip the
> +                        * check to do a non-linear drop in the next section
> +                        */
> +                       update_prob = false;
> +               }
> +       } else {
> +               /* prevent underflow */
> +               if (q->vars.prob > oldprob)
> +                       q->vars.prob = 0;
> +       }
> +
> +       /* Non-linear drop in probability */
> +       /* Reduce drop probability quickly if delay is 0 for 2 consecutive
> +        * Tupdate periods
> +        */
> +       if ((qdelay == 0) && (qdelay_old == 0) && update_prob)
> +               q->vars.prob = (q->vars.prob * 98) / 100;
> +
> +       q->vars.qdelay = qdelay;
> +       q->vars.qlen_old = qlen;
> +
> +       /* we restart the measurement cycle if the following conditions are met
> +        *  1. If the delay has been low for 2 consecutive Tupdate periods
> +        *  2. Calculated drop probability is zero
> +        *  3. We have atleast one estimate for the avg_dq_rate ie.,
> +        *     is a non-zero value
> +        */
> +       if ((q->vars.qdelay < q->params.target / 2)
> +           && (q->vars.qdelay_old < q->params.target / 2)
> +           && (q->vars.prob == 0)
> +           && q->vars.avg_dq_rate > 0)
> +               pie_vars_init(&q->vars);
> +
> +       return;
> +}

Don't do empty return at end of function. It is unnecessary.


> +
> +static inline void pie_timer(unsigned long arg)

Since this is a callback it can't be inlined anyway

> +{
> +       struct Qdisc *sch = (struct Qdisc *)arg;
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       u64 tup;
Does this really have to be 64 bit?

> +
> +       calculate_probability(sch);
> +       /* reset the timer to fire after 'tupdate' us,
> +        * tupdate is currently in pie_time
> +        * mod_timer expects time to be in jiffies
> +        */
> +       /* convert from pietime to nsecs to ms*/
> +       tup = pie_time_to_ms(q->params.tupdate);
> +       tup = (tup * HZ) / (1000);      /* and then to  jiffies */

Useless parentheses around (1000).
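On the 64-bit question, a rough sketch of the same conversion done in 32-bit arithmetic. The helper name ms_to_ticks and its hz parameter are hypothetical stand-ins for the patch's use of HZ. Whether 32 bits is enough depends on the largest tupdate the qdisc accepts, which the patch does not bound.

```c
#include <stdint.h>

/* Hypothetical helper: convert milliseconds to timer ticks for a given
 * tick rate (hz) using only 32-bit arithmetic. The product ms * hz must
 * stay below 2^32 for the result to be exact. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t hz)
{
	return ms * hz / 1000;
}
```

For the default 30 ms update interval this gives 30 ticks at 1000 ticks per second and 7 at 250. In kernel code, msecs_to_jiffies() performs this kind of conversion.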

> +
> +       mod_timer(&q->adapt_timer, jiffies + tup);
> +
> +       return;
> +
> +}
> +
> +static int pie_init(struct Qdisc *sch, struct nlattr *opt)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +
> +       pie_params_init(&q->params);
> +       pie_vars_init(&q->vars);
> +       pie_stats_init(&q->stats);
> +       sch->limit = q->params.limit;
> +       setup_timer(&q->adapt_timer, pie_timer, (unsigned long)sch);
> +       add_timer(&q->adapt_timer);

This is wrong: you haven't set the timer's expiration (it will be 0),
so the timer will fire.

> +
> +       if (opt) {
> +               int err = pie_change(sch, opt);
> +
> +               if (err)
> +                       return err;
> +       }
> +
> +       return 0;
> +}
> +
> +static int pie_dump(struct Qdisc *sch, struct sk_buff *skb)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       struct nlattr *opts;
> +
> +       opts = nla_nest_start(skb, TCA_OPTIONS);
> +       if (opts == NULL)
> +               goto nla_put_failure;
> +
> +       /* convert target and tupdate from pietime to us */
> +       if (nla_put_u32(skb, TCA_PIE_TARGET,
> +                       pie_time_to_ms(q->params.target) * 1000L) ||
> +           nla_put_u32(skb, TCA_PIE_LIMIT,
> +                       sch->limit) ||
> +           nla_put_u32(skb, TCA_PIE_TUPDATE,
> +                       pie_time_to_ms(q->params.tupdate) * 1000L) ||
> +           nla_put_u32(skb, TCA_PIE_ALPHA,
> +                       q->params.alpha) ||
> +           nla_put_u32(skb, TCA_PIE_BETA, q->params.beta) ||
> +           nla_put_u32(skb, TCA_PIE_ECN, q->params.ecn) ||
> +           nla_put_u32(skb, TCA_PIE_BYTEMODE, q->params.bytemode))
> +               goto nla_put_failure;
> +
> +       return nla_nest_end(skb, opts);
> +
> +nla_put_failure:
> +       nla_nest_cancel(skb, opts);
> +       return -1;
> +
> +}
> +
> +static int pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       struct tc_pie_xstats st = {
> +               .prob = q->vars.prob,
> +               .delay = pie_time_to_ms(q->vars.qdelay) * 1000, /* in us*/
> +               /* unscale and return dq_rate in bytes per sec*/
> +               .avg_dq_rate = q->vars.avg_dq_rate * (PIE_TIME_PER_SEC/8),
> +               .packets_in = q->stats.packets_in,
> +               .overlimit = q->stats.overlimit,
> +               .maxq = q->stats.maxq,
> +               .dropped = q->stats.dropped,
> +               .ecn_mark = q->stats.ecn_mark,
> +       };
Personal preference: I prefer an aligned initialization style.

> +
> +       return gnet_stats_copy_app(d, &st, sizeof(st));
> +}
> +
> +static inline struct sk_buff *pie_qdisc_dequeue(struct Qdisc *sch)
> +{
> +       struct sk_buff *skb;
> +       skb = __qdisc_dequeue_head(sch, &sch->q);
> +
> +       if (!skb)
> +               return NULL;
> +
> +       pie_process_dequeue(sch, skb);
> +
> +       return skb;
> +}
> +
> +static void pie_reset(struct Qdisc *sch)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +       qdisc_reset_queue(sch);
> +       pie_vars_init(&q->vars);
> +
> +       return;
> +}
> +
> +static void pie_destroy(struct Qdisc *sch)
> +{
> +       struct pie_sched_data *q = qdisc_priv(sch);
> +
> +       del_timer_sync(&q->adapt_timer);
> +}
> +
> +static struct Qdisc_ops pie_qdisc_ops __read_mostly = {
> +       .id = "pie",
> +       .priv_size = sizeof(struct pie_sched_data),
> +
> +       .enqueue = pie_qdisc_enqueue,
> +       .dequeue = pie_qdisc_dequeue,
> +       .peek = qdisc_peek_dequeued,
> +       .init = pie_init,
> +       .destroy = pie_destroy,
> +       .reset = pie_reset,
> +       .change = pie_change,
> +       .dump = pie_dump,
> +       .dump_stats = pie_dump_stats,
> +       .owner = THIS_MODULE,
> +};
> +
> +static int __init pie_module_init(void)
> +{
> +       return register_qdisc(&pie_qdisc_ops);
> +}
> +
> +static void __exit pie_module_exit(void)
> +{
> +       unregister_qdisc(&pie_qdisc_ops);
> +}
> +
> +module_init(pie_module_init);
> +module_exit(pie_module_exit);
> +
> +MODULE_DESCRIPTION
> +       ("PIE (Proportional Intergal controller Enhanced) scheduler");
Keep this on one line, ignore any complaints from checkpatch about it.

> +MODULE_AUTHOR("Vijay Subramanian");
> +MODULE_AUTHOR("Mythili Prabhu");
> +MODULE_LICENSE("GPL");

Please fix and resubmit. This is just a first-pass review; there are
probably more detailed issues that others will see.

^ permalink raw reply

* Re: [PATCH RESEND] iproute2: GRE over IPv6 tunnel support.
From: Stephen Hemminger @ 2013-09-28 17:14 UTC (permalink / raw)
  To: Dmitry Kozlov; +Cc: Hannes Frederic Sowa, Templin, Fred L, netdev
In-Reply-To: <20130928113251.75738a49@comp1>

On Sat, 28 Sep 2013 11:32:51 +0400
Dmitry Kozlov <xeb@mail.ru> wrote:

> GRE over IPv6 tunnel support.
> 
> Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
> ---
>  ip/Makefile    |   3 +-
>  ip/ip6tunnel.c | 131 ++++++++++++++++---
>  ip/iplink.c    |   7 +-
>  ip/link_gre6.c | 398 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 516 insertions(+), 23 deletions(-)
>  create mode 100644 ip/link_gre6.c
> 

Please send an additional patch to update the man page.

^ permalink raw reply

* Re: [PATCH v1 net-next] net: pkt_sched: PIE AQM scheme
From: Eric Dumazet @ 2013-09-28 17:19 UTC (permalink / raw)
  To: Vijay Subramanian; +Cc: netdev, davem, shemminger, Mythili Prabhu, Dave Taht
In-Reply-To: <1380333383-9507-1-git-send-email-subramanian.vijay@gmail.com>


> +MODULE_DESCRIPTION
> +	("PIE (Proportional Intergal controller Enhanced) scheduler");

Intergal -> Integral

^ permalink raw reply

* Re: [PATCH 11/12] netfilter: Remove extern from function prototypes
From: Jan Engelhardt @ 2013-09-28 19:17 UTC (permalink / raw)
  To: Joe Perches
  Cc: netdev, David S. Miller, linux-kernel, Pablo Neira Ayuso,
	Patrick McHardy, Jozsef Kadlecsik, netfilter-devel, netfilter,
	coreteam
In-Reply-To: <de1106130366672acb936422f4da7cbb1aafbda1.1379961014.git.joe@perches.com>

On Monday 2013-09-23 20:37, Joe Perches wrote:

>There are a mix of function prototypes with and without extern
>in the kernel sources.  Standardize on not using extern for
>function prototypes.
>
>Function prototypes don't need to be written with extern.
>extern is assumed by the compiler.  Its use is as unnecessary as
>using auto to declare automatic/local variables in a block.

Or you could just extern all functions for consistency with variables.
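
For readers following the thread, the equivalence being argued over can be
sketched in a few lines of C. The function names are hypothetical and are not
taken from the netfilter headers; the point is only that a prototype with and
without extern declares the same external-linkage function, so the two forms
can be mixed freely as redeclarations.

```c
/* Two prototypes, one written without extern and one with it. */
int add_implicit(int a, int b);
extern int add_explicit(int a, int b);

/*
 * Redeclaring each with the opposite spelling is accepted by the compiler,
 * because both forms declare a function with external linkage.
 */
extern int add_implicit(int a, int b);
int add_explicit(int a, int b);

/* Definitions; the storage-class keyword is not needed here either. */
int add_implicit(int a, int b)
{
	return a + b;
}

int add_explicit(int a, int b)
{
	return a + b;
}
```

Variables differ: at file scope "int x;" is a tentative definition while
"extern int x;" is only a declaration, which is why extern remains necessary
for variables in headers.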

^ permalink raw reply

* [PATCH net-next] bonding: RCUify bond_set_rx_mode()
From: Veaceslav Falico @ 2013-09-28 19:18 UTC (permalink / raw)
  To: netdev; +Cc: joe.lawrence, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not
always the case:

RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
 [<ffffffff81651ca5>] dump_stack+0x54/0x74
 [<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
 [<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
 [<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
 [<ffffffff81558020>] dev_mc_add+0x10/0x20
 [<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
 [<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
 [<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
 [<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...

Fix this by using RCU primitives.

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d5c3153..996d196 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3393,20 +3393,21 @@ static void bond_set_rx_mode(struct net_device *bond_dev)
 	struct list_head *iter;
 	struct slave *slave;
 
-	ASSERT_RTNL();
 
+	rcu_read_lock();
 	if (USES_PRIMARY(bond->params.mode)) {
-		slave = rtnl_dereference(bond->curr_active_slave);
+		slave = rcu_dereference(bond->curr_active_slave);
 		if (slave) {
 			dev_uc_sync(slave->dev, bond_dev);
 			dev_mc_sync(slave->dev, bond_dev);
 		}
 	} else {
-		bond_for_each_slave(bond, slave, iter) {
+		bond_for_each_slave_rcu(bond, slave, iter) {
 			dev_uc_sync_multiple(slave->dev, bond_dev);
 			dev_mc_sync_multiple(slave->dev, bond_dev);
 		}
 	}
+	rcu_read_unlock();
 }
 
 static int bond_neigh_init(struct neighbour *n)
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH net-next] xen-netfront: convert to GRO API and advertise this feature
From: David Miller @ 2013-09-28 19:38 UTC (permalink / raw)
  To: wei.liu2; +Cc: netdev, xen-devel, abchak, ian.campbell
In-Reply-To: <1379779543-27122-1-git-send-email-wei.liu2@citrix.com>

From: Wei Liu <wei.liu2@citrix.com>
Date: Sat, 21 Sep 2013 17:05:43 +0100

> @@ -1371,7 +1373,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
>  	netif_napi_add(netdev, &np->napi, xennet_poll, 64);
>  	netdev->features        = NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>  				  NETIF_F_GSO_ROBUST;
> -	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO;
> +	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO |
> +				  NETIF_F_GRO;

Please post a new version of this patch with the feedback you've been
given integrated, in particular with this part removed because it is
not necessary.

Ian, please review the patch when Wei posts it.

Thanks.

^ permalink raw reply

* Re: [PATCH 2/2] drivers: net: vmxnet3 : vmxnet3_drv.c: removed checkaptch warning related to msleep()
From: David Miller @ 2013-09-28 19:38 UTC (permalink / raw)
  To: avi.kp.137; +Cc: sbhatewara, pv-drivers, netdev, linux-kernel
In-Reply-To: <1379866187-3158-1-git-send-email-avi.kp.137@gmail.com>


I see only patch #2 and #3.

Sort out why only 2 of the 3 patches were posted, and resend them
all.

Thank you.

^ permalink raw reply

* Re: [PATCH v1] USBNET: fix handling padding packet
From: David Miller @ 2013-09-28 19:45 UTC (permalink / raw)
  To: ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, oneukum-l3A5Bk7waGM,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1379941175-10500-1-git-send-email-ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>

From: Ming Lei <ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Date: Mon, 23 Sep 2013 20:59:35 +0800

> Commit 638c5115a7949(USBNET: support DMA SG) introduces DMA SG
> if the usb host controller is capable of building packet from
> discontinuous buffers, but missed handling padding packet when
> building DMA SG.
> 
> This patch attaches the pre-allocated padding packet at the
> end of the sg list, so padding packet can be sent to device
> if drivers require that.
> 
> Reported-by: David Laight <David.Laight-JxhZ9S5GRejQT0dZR+AlfA@public.gmane.org>
> Acked-by: Oliver Neukum <oliver-GvhC2dPhHPQdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Ming Lei <ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>

Applied, thanks.

I still think the suggestion to disable scatter gather for
devices with the padding issue was the most sane approach
to solve this.

I guess people like supporting complicated crap and excess
code for things that pretty much do not exist, or at best
are not prominent enough to cater for at all.

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next] net ipv4: Convert ipv4.ip_local_port_range to be per netns
From: David Miller @ 2013-09-28 19:52 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev
In-Reply-To: <87fvswt5m5.fsf@tw-ebiederman.twitter.com>

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sun, 22 Sep 2013 23:27:30 -0700

> 
> - Move sysctl_local_ports from a global variable into struct netns_ipv4.
> - Modify inet_get_local_port_range to take a struct net.
> - Manually expand inet_get_local_range into ipv4_local_port_range
>   because I do not know the struct net.
> - Move the initialization of sysctl_local_ports into
>   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c
> 
> Originally-by: Samya <samya@twitter.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] ipv6: Not need to set fl6.flowi6_flags as zero
From: David Miller @ 2013-09-28 19:52 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1379919359-3032-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Mon, 23 Sep 2013 14:55:59 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> setting fl6.flowi6_flags as zero after memset is redundant, Remove it.
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH v5] IPv6 NAT: Do not drop DNATed 6to4/6rd packets
From: David Miller @ 2013-09-28 19:57 UTC (permalink / raw)
  To: hannes; +Cc: catab, netdev, yoshfuji, joe
In-Reply-To: <20130924213606.GB4446@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 24 Sep 2013 23:36:06 +0200

> On Mon, Sep 23, 2013 at 11:04:19PM +0300, Catalin(ux) M. BOIE wrote:
>> When a router is doing  DNAT for 6to4/6rd packets the latest anti-spoofing
>> patch (218774dc) will drop them because the IPv6 address embedded
>> does not match the IPv4 destination. This patch will allow them to
>> pass by testing if we have an address that matches on 6to4/6rd interface.
>> I have been hit by this problem using Fedora and IPV6TO4_IPV4ADDR.
>> Also, log the dropped packets (with rate limit).
>> 
>> Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro>
> 
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Applied, but Catalin please strictly refer to changes in the following
precise format:

	commit $SHA1_ID ("Commit message header line text")

Because SHA1_IDs are ambiguous, especially when the change in question
is backported into various -stable branches.

The only way to resolve the ambiguity is to provide the commit message
text (in parentheses and double quotes).

^ permalink raw reply

* Re: [PATCH net-next] net ipv4: Convert ipv4.ip_local_port_range to be per netns
From: David Miller @ 2013-09-28 20:07 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev
In-Reply-To: <20130928.155228.211244914284752204.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Sat, 28 Sep 2013 15:52:28 -0400 (EDT)

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Sun, 22 Sep 2013 23:27:30 -0700
> 
>> 
>> - Move sysctl_local_ports from a global variable into struct netns_ipv4.
>> - Modify inet_get_local_port_range to take a struct net.
>> - Manually expand inet_get_local_range into ipv4_local_port_range
>>   because I do not know the struct net.
>> - Move the initialization of sysctl_local_ports into
>>   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c
>> 
>> Originally-by: Samya <samya@twitter.com>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Applied.

I had to revert, you didn't thoroughly build test this.

security/selinux/hooks.c: In function ‘selinux_socket_bind’:
security/selinux/hooks.c:3933:4: error: incompatible type for argument 1 of ‘inet_get_local_port_range’
In file included from security/selinux/hooks.c:53:0:
include/net/ip.h:206:6: note: expected ‘struct net *’ but argument is of type ‘struct lsm_network_audit’

And when you repost make sure to deal with the space vs. TAB
issues pointed out to you.

Thanks.

^ permalink raw reply

* Re: [PATCH] ipv6: Fix preferred_lft not updating in some cases
From: Hannes Frederic Sowa @ 2013-09-28 20:28 UTC (permalink / raw)
  To: Paul Marks; +Cc: netdev, davem, yoshfuji, Lorenzo Colitti
In-Reply-To: <CAHaKRvJDZJjuv4sALmQAotk5EUMfYPiLN=8_noWCRQYOW+bxSA@mail.gmail.com>

On Fri, Sep 27, 2013 at 01:28:06PM -0700, Paul Marks wrote:
> On Fri, Sep 27, 2013 at 1:16 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > On Wed, Sep 25, 2013 at 03:12:55PM -0700, Paul Marks wrote:
> >> -                                     if (prefered_lft != ifp->prefered_lft) {
> >
> > Wouldn't the easiest solution be to just drop this if and execute the two
> > lines below unconditionally?
> 
> Yes, that's also correct.  But is it not better to have simpler code
> than shorter diffs?  Should we transliterate English to C, or think
> about what the algorithm is actually doing?  The fact that this bug
> has gone unnoticed provides some evidence that the code may have been
> too complicated.

I don't care about the length of diffs or shorter code. I would favour
a transliteration here because it makes verification easier (at least
for me). The algorithm is not that complex and I guess the bug has been
unnoticed because nobody ran into problems and cared until now.

So, why not get rid of update_lft then?

> >> +                             const u32 minimum_lft = min(
> >> +                                     stored_lft, (u32)MIN_VALID_LIFETIME);
> >> +                             valid_lft = max(valid_lft, minimum_lft);
> >
> > Quick question: Don't we need a prefered_lft = min(preferred_lft, valid_lft)
> > here?
> 
> The invariant is (preferred_lft <= valid_lft), and valid_lft can only
> get bigger, so I don't think there's a problem.

Ah, I got confused. Missed in the last case that it got tested earlier in the
function. Your code looks correct regarding every rule.

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks,

  Hannes
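
The clamp in the quoted hunk above can be sketched as a small stand-alone
function. This is a user-space illustration, not the addrconf code:
clamp_valid_lft and the helper names are hypothetical, and MIN_VALID_LIFETIME
is assumed here to be two hours in seconds.

```c
#include <stdint.h>

/* Assumption for this sketch: two hours, expressed in seconds. */
#define MIN_VALID_LIFETIME (2 * 3600)

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Return the valid lifetime to store, given the advertised value and the
 * remaining stored lifetime. The result is never smaller than the advertised
 * value, so preferred_lft <= valid_lft still holds afterwards.
 */
static uint32_t clamp_valid_lft(uint32_t valid_lft, uint32_t stored_lft)
{
	uint32_t minimum_lft = min_u32(stored_lft, (uint32_t)MIN_VALID_LIFETIME);

	return max_u32(valid_lft, minimum_lft);
}
```

With a stored lifetime of one day, an advertised lifetime of 60 seconds is
raised to the two-hour floor; with only 600 seconds left, the stored value is
kept; an advertised value above the floor passes through unchanged.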

^ permalink raw reply

* Re: [PATCH net-next] net ipv4: Convert ipv4.ip_local_port_range to be per netns
From: Eric W. Biederman @ 2013-09-28 20:32 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20130928.160753.1218915059639502436.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: David Miller <davem@davemloft.net>
> Date: Sat, 28 Sep 2013 15:52:28 -0400 (EDT)
>
>> From: ebiederm@xmission.com (Eric W. Biederman)
>> Date: Sun, 22 Sep 2013 23:27:30 -0700
>> 
>>> 
>>> - Move sysctl_local_ports from a global variable into struct netns_ipv4.
>>> - Modify inet_get_local_port_range to take a struct net.
>>> - Manually expand inet_get_local_range into ipv4_local_port_range
>>>   because I do not know the struct net.
>>> - Move the initialization of sysctl_local_ports into
>>>   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c
>>> 
>>> Originally-by: Samya <samya@twitter.com>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> 
>> Applied.
>
> I had to revert, you didn't thoroughly build test this.

My apologies, I was pushing a little too hard that day.  v3 will be
coming with the fix.

> security/selinux/hooks.c: In function ‘selinux_socket_bind’:
> security/selinux/hooks.c:3933:4: error: incompatible type for argument 1 of ‘inet_get_local_port_range’
> In file included from security/selinux/hooks.c:53:0:
> include/net/ip.h:206:6: note: expected ‘struct net *’ but argument is of type ‘struct lsm_network_audit’
>
> And when you repost make sure to deal with the space vs. TAB
> issues pointed out to you.

Definitely.

Eric

^ permalink raw reply

* Re: IPv6 path MTU discovery broken
From: Hannes Frederic Sowa @ 2013-09-28 20:33 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: netdev, edumazet
In-Reply-To: <20130927201420.GB12043@sesse.net>

Hello!

On Fri, Sep 27, 2013 at 10:14:20PM +0200, Steinar H. Gunderson wrote:
> So the “packet too big” packets really look like they're being ignored.
> However, they _do_ reach the kernel somehow, since Icmp6InPktTooBigs
> seems to increase.
> 
> Could this be related somehow to the packets coming from 2001:67c:29f4::31,
> while the default route is to a link-local address? (An RPF issue?) This used
> to work (although it was often flaky for me) in 3.10 and before. I can't
> easily bisect, though, as I don't boot this machine too often.

This looks like a bug and should definitely get fixed. There should be
no RPF issue. May I have a look at your /proc/net/ipv6_route?

Thanks,

  Hannes

^ permalink raw reply

* MUTUAL PROJECT
From: jing01lee @ 2013-09-28 20:07 UTC (permalink / raw)
  To: Recipients

Hello

I have a business proposal for you. There is no risks involved.
Pls reply for briefs. 
Mr Lee

^ permalink raw reply

* Re: IPv6 path MTU discovery broken
From: Steinar H. Gunderson @ 2013-09-28 20:51 UTC (permalink / raw)
  To: netdev, edumazet
In-Reply-To: <20130928203318.GC23654@order.stressinduktion.org>

On Sat, Sep 28, 2013 at 10:33:18PM +0200, Hannes Frederic Sowa wrote:
>> Could this be related somehow to the packets coming from 2001:67c:29f4::31,
>> while the default route is to a link-local address? (An RPF issue?) This used
>> to work (although it was often flaky for me) in 3.10 and before. I can't
>> easily bisect, though, as I don't boot this machine too often.
> This looks like a bug and should definitely get fixed. There should be
> no RPF issue. May I have a look at your /proc/net/ipv6_route?

Hi,

I removed all the “weird” routes, and confirmed it fixed the problem.
However, upon adding them back again, the problem was still gone
(despite flushing the route cache).

This means that the issue has gone back to being intermittent, which is of
course the worst kind of bug to trace down. :-) I'll dump
/proc/net/ipv6_route and send you once I see the bug manifest itself again,
OK?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply

* [PATCH net-next ] net ipv4: Convert ipv4.ip_local_port_range to be per netns v3
From: Eric W. Biederman @ 2013-09-28 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, nicolas.dichtel
In-Reply-To: <8738ovtea9.fsf_-_@tw-ebiederman.twitter.com>


- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
  of the callers.
- Move the initialization of sysctl_local_ports into
   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to today's net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
  This patch now successfully passes an allmodconfig build.
  Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/infiniband/core/cma.c   |    2 +-
 drivers/net/vxlan.c             |    2 +-
 include/net/ip.h                |    6 +----
 include/net/netns/ipv4.h        |    6 +++++
 net/ipv4/inet_connection_sock.c |   20 +++++----------
 net/ipv4/inet_hashtables.c      |    2 +-
 net/ipv4/ping.c                 |    4 +--
 net/ipv4/sysctl_net_ipv4.c      |   52 +++++++++++++++++++++++++--------------
 net/ipv4/udp.c                  |    2 +-
 net/openvswitch/vport-vxlan.c   |    2 +-
 net/sctp/socket.c               |    2 +-
 security/selinux/hooks.c        |    2 +-
 12 files changed, 56 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index dab4b41..a082fd9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2294,7 +2294,7 @@ static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
 	int low, high, remaining;
 	unsigned int rover;
 
-	inet_get_local_port_range(&low, &high);
+	inet_get_local_port_range(&init_net, &low, &high);
 	remaining = (high - low) + 1;
 	rover = net_random() % remaining + low;
 retry:
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index d1292fe..c376be7 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2089,7 +2089,7 @@ static void vxlan_setup(struct net_device *dev)
 	vxlan->age_timer.function = vxlan_cleanup;
 	vxlan->age_timer.data = (unsigned long) vxlan;
 
-	inet_get_local_port_range(&low, &high);
+	inet_get_local_port_range(dev_net(dev), &low, &high);
 	vxlan->port_min = low;
 	vxlan->port_max = high;
 	vxlan->dst_port = htons(vxlan_port);
diff --git a/include/net/ip.h b/include/net/ip.h
index c1f192b..c82e6ec 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -203,11 +203,7 @@ static inline void snmp_mib_free(void __percpu *ptr[SNMP_ARRAY_SZ])
 	}
 }
 
-extern struct local_ports {
-	seqlock_t	lock;
-	int		range[2];
-} sysctl_local_ports;
-void inet_get_local_port_range(int *low, int *high);
+void inet_get_local_port_range(struct net *net, int *low, int *high);
 
 extern unsigned long *sysctl_local_reserved_ports;
 static inline int inet_is_reserved_local_port(int port)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index bf2ec22..5dbd232 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -15,6 +15,10 @@ struct fib_rules_ops;
 struct hlist_head;
 struct fib_table;
 struct sock;
+struct local_ports {
+	seqlock_t	lock;
+	int		range[2];
+};
 
 struct netns_ipv4 {
 #ifdef CONFIG_SYSCTL
@@ -62,6 +66,8 @@ struct netns_ipv4 {
 	int sysctl_icmp_ratemask;
 	int sysctl_icmp_errors_use_inbound_ifaddr;
 
+	struct local_ports sysctl_local_ports;
+
 	int sysctl_tcp_ecn;
 
 	kgid_t sysctl_ping_group_range[2];
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 6acb541..7ac7aa1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -29,27 +29,19 @@ const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown timer value\n";
 EXPORT_SYMBOL(inet_csk_timer_bug_msg);
 #endif
 
-/*
- * This struct holds the first and last local port number.
- */
-struct local_ports sysctl_local_ports __read_mostly = {
-	.lock = __SEQLOCK_UNLOCKED(sysctl_local_ports.lock),
-	.range = { 32768, 61000 },
-};
-
 unsigned long *sysctl_local_reserved_ports;
 EXPORT_SYMBOL(sysctl_local_reserved_ports);
 
-void inet_get_local_port_range(int *low, int *high)
+void inet_get_local_port_range(struct net *net, int *low, int *high)
 {
 	unsigned int seq;
 
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
-		*low = sysctl_local_ports.range[0];
-		*high = sysctl_local_ports.range[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+		*low = net->ipv4.sysctl_local_ports.range[0];
+		*high = net->ipv4.sysctl_local_ports.range[1];
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 EXPORT_SYMBOL(inet_get_local_port_range);
 
@@ -116,7 +108,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 		int remaining, rover, low, high;
 
 again:
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 		smallest_rover = rover = net_random() % remaining + low;
 
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 7bd8983..2779037 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -494,7 +494,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		u32 offset = hint + port_offset;
 		struct inet_timewait_sock *tw = NULL;
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 
 		local_bh_disable();
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index d7d9882..fc8c95d 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -237,11 +237,11 @@ static void inet_get_ping_group_range_net(struct net *net, kgid_t *low,
 	unsigned int seq;
 
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
 		*low = data[0];
 		*high = data[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 
 
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 540279f..c08f096 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -43,12 +43,12 @@ static int ip_ping_group_range_min[] = { 0, 0 };
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
 
 /* Update system visible IP port range */
-static void set_local_port_range(int range[2])
+static void set_local_port_range(struct net *net, int range[2])
 {
-	write_seqlock(&sysctl_local_ports.lock);
-	sysctl_local_ports.range[0] = range[0];
-	sysctl_local_ports.range[1] = range[1];
-	write_sequnlock(&sysctl_local_ports.lock);
+	write_seqlock(&net->ipv4.sysctl_local_ports.lock);
+	net->ipv4.sysctl_local_ports.range[0] = range[0];
+	net->ipv4.sysctl_local_ports.range[1] = range[1];
+	write_sequnlock(&net->ipv4.sysctl_local_ports.lock);
 }
 
 /* Validate changes from /proc interface. */
@@ -56,6 +56,8 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 				 void __user *buffer,
 				 size_t *lenp, loff_t *ppos)
 {
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_local_ports.range);
 	int ret;
 	int range[2];
 	struct ctl_table tmp = {
@@ -66,14 +68,15 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 		.extra2 = &ip_local_port_range_max,
 	};
 
-	inet_get_local_port_range(range, range + 1);
+	inet_get_local_port_range(net, &range[0], &range[1]);
+
 	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 
 	if (write && ret == 0) {
 		if (range[1] < range[0])
 			ret = -EINVAL;
 		else
-			set_local_port_range(range);
+			set_local_port_range(net, range);
 	}
 
 	return ret;
@@ -83,23 +86,27 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
 static void inet_get_ping_group_range_table(struct ctl_table *table, kgid_t *low, kgid_t *high)
 {
 	kgid_t *data = table->data;
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_ping_group_range);
 	unsigned int seq;
 	do {
-		seq = read_seqbegin(&sysctl_local_ports.lock);
+		seq = read_seqbegin(&net->ipv4.sysctl_local_ports.lock);
 
 		*low = data[0];
 		*high = data[1];
-	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+	} while (read_seqretry(&net->ipv4.sysctl_local_ports.lock, seq));
 }
 
 /* Update system visible IP port range */
 static void set_ping_group_range(struct ctl_table *table, kgid_t low, kgid_t high)
 {
 	kgid_t *data = table->data;
-	write_seqlock(&sysctl_local_ports.lock);
+	struct net *net =
+		container_of(table->data, struct net, ipv4.sysctl_ping_group_range);
+	write_seqlock(&net->ipv4.sysctl_local_ports.lock);
 	data[0] = low;
 	data[1] = high;
-	write_sequnlock(&sysctl_local_ports.lock);
+	write_sequnlock(&net->ipv4.sysctl_local_ports.lock);
 }
 
 /* Validate changes from /proc interface. */
@@ -475,13 +482,6 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
-		.procname	= "ip_local_port_range",
-		.data		= &sysctl_local_ports.range,
-		.maxlen		= sizeof(sysctl_local_ports.range),
-		.mode		= 0644,
-		.proc_handler	= ipv4_local_port_range,
-	},
-	{
 		.procname	= "ip_local_reserved_ports",
 		.data		= NULL, /* initialized in sysctl_ipv4_init */
 		.maxlen		= 65536,
@@ -854,6 +854,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "ip_local_port_range",
+		.maxlen		= sizeof(init_net.ipv4.sysctl_local_ports.range),
+		.data		= &init_net.ipv4.sysctl_local_ports.range,
+		.mode		= 0644,
+		.proc_handler	= ipv4_local_port_range,
+	},
+	{
 		.procname	= "tcp_mem",
 		.maxlen		= sizeof(init_net.ipv4.sysctl_tcp_mem),
 		.mode		= 0644,
@@ -888,6 +895,8 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 			&net->ipv4.sysctl_ping_group_range;
 		table[7].data =
 			&net->ipv4.sysctl_tcp_ecn;
+		table[8].data =
+			&net->ipv4.sysctl_local_ports.range;
 
 		/* Don't export sysctls to unprivileged users */
 		if (net->user_ns != &init_user_ns)
@@ -901,6 +910,13 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 	net->ipv4.sysctl_ping_group_range[0] = make_kgid(&init_user_ns, 1);
 	net->ipv4.sysctl_ping_group_range[1] = make_kgid(&init_user_ns, 0);
 
+	/*
+	 * Set defaults for local port range
+	 */
+	seqlock_init(&net->ipv4.sysctl_local_ports.lock);
+	net->ipv4.sysctl_local_ports.range[0] =  32768;
+	net->ipv4.sysctl_local_ports.range[1] =  61000;
+
 	tcp_init_mem(net);
 
 	net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 74d2c95..460a4c1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -219,7 +219,7 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
 		unsigned short first, last;
 		DECLARE_BITMAP(bitmap, PORTS_PER_CHAIN);
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 
 		rand = net_random();
diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c
index a481c03..56e22b7 100644
--- a/net/openvswitch/vport-vxlan.c
+++ b/net/openvswitch/vport-vxlan.c
@@ -173,7 +173,7 @@ static int vxlan_tnl_send(struct vport *vport, struct sk_buff *skb)
 
 	skb->local_df = 1;
 
-	inet_get_local_port_range(&port_min, &port_max);
+	inet_get_local_port_range(net, &port_min, &port_max);
 	src_port = vxlan_src_port(port_min, port_max, skb);
 
 	err = vxlan_xmit_skb(vxlan_port->vs, rt, skb,
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 911b71b..72046b9 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -5890,7 +5890,7 @@ static long sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 		int low, high, remaining, index;
 		unsigned int rover;
 
-		inet_get_local_port_range(&low, &high);
+		inet_get_local_port_range(sock_net(sk), &low, &high);
 		remaining = (high - low) + 1;
 		rover = net_random() % remaining + low;
 
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a5091ec..568c769 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3929,7 +3929,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		if (snum) {
 			int low, high;
 
-			inet_get_local_port_range(&low, &high);
+			inet_get_local_port_range(sock_net(sk), &low, &high);
 
 			if (snum < max(PROT_SOCK, low) || snum > high) {
 				err = sel_netport_sid(sk->sk_protocol,
-- 
1.7.10.4
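
The shape of the change above, a port range stored per namespace and read
under a sequence counter, can be sketched in user-space C. This is not the
kernel seqlock API: struct netns and the functions below are hypothetical
stand-ins built on C11 atomics, and the sketch assumes a single writer at a
time (the kernel seqlock serialises writers with a spinlock).

```c
#include <stdatomic.h>

/* Per-namespace port range guarded by a sequence counter. */
struct local_ports {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	atomic_int range[2];
};

/* Hypothetical stand-in for the namespace object. */
struct netns {
	struct local_ports ports;
};

/* Give each namespace its own copy of the defaults used in the patch. */
static void netns_init(struct netns *ns)
{
	atomic_init(&ns->ports.seq, 0);
	atomic_init(&ns->ports.range[0], 32768);
	atomic_init(&ns->ports.range[1], 61000);
}

/* Writer: make the counter odd, update both values, make it even again. */
static void set_local_port_range(struct netns *ns, int low, int high)
{
	atomic_fetch_add(&ns->ports.seq, 1);
	atomic_store(&ns->ports.range[0], low);
	atomic_store(&ns->ports.range[1], high);
	atomic_fetch_add(&ns->ports.seq, 1);
}

/* Reader: retry until both values were read with no write in between. */
static void get_local_port_range(struct netns *ns, int *low, int *high)
{
	unsigned int start;

	do {
		start = atomic_load(&ns->ports.seq);
		*low = atomic_load(&ns->ports.range[0]);
		*high = atomic_load(&ns->ports.range[1]);
	} while ((start & 1u) || atomic_load(&ns->ports.seq) != start);
}
```

Changing the range in one struct netns leaves every other instance at its
defaults, which is the per-namespace behaviour the patch is after.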

^ permalink raw reply related

* [PATCH 1/3] tg3: add support a phy at an address different than 01
From: Hauke Mehrtens @ 2013-09-28 21:15 UTC (permalink / raw)
  To: davem; +Cc: nsujir, mchan, netdev, Hauke Mehrtens

When phylib was in use tg3 only searched at address 01 on the mdio
bus and did not work with any other address. On the BCM4705 SoCs the
switch is connected as a PHY behind the MAC driven by tg3 and it is at
PHY address 30 in most cases. This is a preparation patch to allow
support for such switches.

phy_addr is set to TG3_PHY_MII_ADDR for all devices that use
phylib, so this should not change any behavior.
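
As an illustration of why the fixed address mattered, the probe mask used in
the diff below can be computed in isolation. In phylib, a set bit in phy_mask
tells the bus scan to skip that address, so ~(1 << addr) probes exactly one
address. The helper names here are hypothetical and not part of tg3 or phylib.

```c
#include <stdint.h>

/* Mask with every bit set except the one for phy_addr (valid for 0..31). */
static uint32_t phy_probe_mask(unsigned int phy_addr)
{
	return (uint32_t)~(UINT32_C(1) << phy_addr);
}

/* An address is probed only when its bit in the mask is clear. */
static int phy_addr_is_probed(uint32_t mask, unsigned int addr)
{
	return !(mask & (UINT32_C(1) << addr));
}
```

With address 01 the mask is 0xfffffffd, so a switch sitting at address 30 is
never scanned; with address 30 the mask becomes 0xbfffffff. The sketch uses an
unsigned constant so that shifting by 31 stays well defined.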

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 drivers/net/ethernet/broadcom/tg3.c |   38 +++++++++++++++++------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 221a181..853a05e 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -1375,7 +1375,7 @@ static int tg3_mdio_read(struct mii_bus *bp, int mii_id, int reg)
 
 	spin_lock_bh(&tp->lock);
 
-	if (tg3_readphy(tp, reg, &val))
+	if (__tg3_readphy(tp, mii_id, reg, &val))
 		val = -EIO;
 
 	spin_unlock_bh(&tp->lock);
@@ -1390,7 +1390,7 @@ static int tg3_mdio_write(struct mii_bus *bp, int mii_id, int reg, u16 val)
 
 	spin_lock_bh(&tp->lock);
 
-	if (tg3_writephy(tp, reg, val))
+	if (__tg3_writephy(tp, mii_id, reg, val))
 		ret = -EIO;
 
 	spin_unlock_bh(&tp->lock);
@@ -1408,7 +1408,7 @@ static void tg3_mdio_config_5785(struct tg3 *tp)
 	u32 val;
 	struct phy_device *phydev;
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 	switch (phydev->drv->phy_id & phydev->drv->phy_id_mask) {
 	case PHY_ID_BCM50610:
 	case PHY_ID_BCM50610M:
@@ -1533,7 +1533,7 @@ static int tg3_mdio_init(struct tg3 *tp)
 	tp->mdio_bus->read     = &tg3_mdio_read;
 	tp->mdio_bus->write    = &tg3_mdio_write;
 	tp->mdio_bus->reset    = &tg3_mdio_reset;
-	tp->mdio_bus->phy_mask = ~(1 << TG3_PHY_MII_ADDR);
+	tp->mdio_bus->phy_mask = ~(1 << tp->phy_addr);
 	tp->mdio_bus->irq      = &tp->mdio_irq[0];
 
 	for (i = 0; i < PHY_MAX_ADDR; i++)
@@ -1554,7 +1554,7 @@ static int tg3_mdio_init(struct tg3 *tp)
 		return i;
 	}
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	if (!phydev || !phydev->drv) {
 		dev_warn(&tp->pdev->dev, "No PHY devices\n");
@@ -1964,7 +1964,7 @@ static void tg3_setup_flow_control(struct tg3 *tp, u32 lcladv, u32 rmtadv)
 	u32 old_tx_mode = tp->tx_mode;
 
 	if (tg3_flag(tp, USE_PHYLIB))
-		autoneg = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]->autoneg;
+		autoneg = tp->mdio_bus->phy_map[tp->phy_addr]->autoneg;
 	else
 		autoneg = tp->link_config.autoneg;
 
@@ -2000,7 +2000,7 @@ static void tg3_adjust_link(struct net_device *dev)
 	u8 oldflowctrl, linkmesg = 0;
 	u32 mac_mode, lcl_adv, rmt_adv;
 	struct tg3 *tp = netdev_priv(dev);
-	struct phy_device *phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	struct phy_device *phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	spin_lock_bh(&tp->lock);
 
@@ -2089,7 +2089,7 @@ static int tg3_phy_init(struct tg3 *tp)
 	/* Bring the PHY back to a known state. */
 	tg3_bmcr_reset(tp);
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	/* Attach the MAC to the PHY. */
 	phydev = phy_connect(tp->dev, dev_name(&phydev->dev),
@@ -2116,7 +2116,7 @@ static int tg3_phy_init(struct tg3 *tp)
 				      SUPPORTED_Asym_Pause);
 		break;
 	default:
-		phy_disconnect(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		phy_disconnect(tp->mdio_bus->phy_map[tp->phy_addr]);
 		return -EINVAL;
 	}
 
@@ -2134,7 +2134,7 @@ static void tg3_phy_start(struct tg3 *tp)
 	if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 		return;
 
-	phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+	phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 	if (tp->phy_flags & TG3_PHYFLG_IS_LOW_POWER) {
 		tp->phy_flags &= ~TG3_PHYFLG_IS_LOW_POWER;
@@ -2154,13 +2154,13 @@ static void tg3_phy_stop(struct tg3 *tp)
 	if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 		return;
 
-	phy_stop(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+	phy_stop(tp->mdio_bus->phy_map[tp->phy_addr]);
 }
 
 static void tg3_phy_fini(struct tg3 *tp)
 {
 	if (tp->phy_flags & TG3_PHYFLG_IS_CONNECTED) {
-		phy_disconnect(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		phy_disconnect(tp->mdio_bus->phy_map[tp->phy_addr]);
 		tp->phy_flags &= ~TG3_PHYFLG_IS_CONNECTED;
 	}
 }
@@ -4034,7 +4034,7 @@ static int tg3_power_down_prepare(struct tg3 *tp)
 			struct phy_device *phydev;
 			u32 phyid, advertising;
 
-			phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+			phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 			tp->phy_flags |= TG3_PHYFLG_IS_LOW_POWER;
 
@@ -11922,7 +11922,7 @@ static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_ethtool_gset(phydev, cmd);
 	}
 
@@ -11989,7 +11989,7 @@ static int tg3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_ethtool_sset(phydev, cmd);
 	}
 
@@ -12144,7 +12144,7 @@ static int tg3_nway_reset(struct net_device *dev)
 	if (tg3_flag(tp, USE_PHYLIB)) {
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		r = phy_start_aneg(tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR]);
+		r = phy_start_aneg(tp->mdio_bus->phy_map[tp->phy_addr]);
 	} else {
 		u32 bmcr;
 
@@ -12260,7 +12260,7 @@ static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam
 		u32 newadv;
 		struct phy_device *phydev;
 
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 
 		if (!(phydev->supported & SUPPORTED_Pause) ||
 		    (!(phydev->supported & SUPPORTED_Asym_Pause) &&
@@ -13696,7 +13696,7 @@ static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		struct phy_device *phydev;
 		if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
 			return -EAGAIN;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		return phy_mii_ioctl(phydev, ifr, cmd);
 	}
 
@@ -17635,7 +17635,7 @@ static int tg3_init_one(struct pci_dev *pdev,
 
 	if (tp->phy_flags & TG3_PHYFLG_IS_CONNECTED) {
 		struct phy_device *phydev;
-		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
+		phydev = tp->mdio_bus->phy_map[tp->phy_addr];
 		netdev_info(dev,
 			    "attached PHY driver [%s] (mii_bus:phy_addr=%s)\n",
 			    phydev->drv->name, dev_name(&phydev->dev));
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/3] ssb: provide phy address for Gigabit Ethernet driver
From: Hauke Mehrtens @ 2013-09-28 21:15 UTC (permalink / raw)
  To: davem; +Cc: nsujir, mchan, netdev, Hauke Mehrtens
In-Reply-To: <1380402928-11480-1-git-send-email-hauke@hauke-m.de>

Add a function that provides the PHY address to be used by the
Gigabit Ethernet driver connected to ssb.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 include/linux/ssb/ssb_driver_gige.h |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/ssb/ssb_driver_gige.h b/include/linux/ssb/ssb_driver_gige.h
index 86a12b0..0688472 100644
--- a/include/linux/ssb/ssb_driver_gige.h
+++ b/include/linux/ssb/ssb_driver_gige.h
@@ -108,6 +108,16 @@ static inline int ssb_gige_get_macaddr(struct pci_dev *pdev, u8 *macaddr)
 	return 0;
 }
 
+/* Get the device phy address */
+static inline int ssb_gige_get_phyaddr(struct pci_dev *pdev)
+{
+	struct ssb_gige *dev = pdev_to_ssb_gige(pdev);
+	if (!dev)
+		return -ENODEV;
+
+	return dev->dev->bus->sprom.et0phyaddr;
+}
+
 extern int ssb_gige_pcibios_plat_dev_init(struct ssb_device *sdev,
 					  struct pci_dev *pdev);
 extern int ssb_gige_map_irq(struct ssb_device *sdev,
@@ -174,6 +184,10 @@ static inline int ssb_gige_get_macaddr(struct pci_dev *pdev, u8 *macaddr)
 {
 	return -ENODEV;
 }
+static inline int ssb_gige_get_phyaddr(struct pci_dev *pdev)
+{
+	return -ENODEV;
+}
 
 #endif /* CONFIG_SSB_DRIVER_GIGE */
 #endif /* LINUX_SSB_DRIVER_GIGE_H_ */
-- 
1.7.10.4

^ permalink raw reply related
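
The helper added in the patch above returns either the SPROM field
et0phyaddr or -ENODEV. A caller therefore has to tell a negative error
code apart from a usable address. Below is a minimal userspace sketch of
that pattern, not the driver code from this series. The struct
definitions are cut-down stand-ins for the kernel types, and
example_get_phyaddr(), example_pick_phy_addr(), EXAMPLE_DEFAULT_PHY_ADDR
and EXAMPLE_PHY_MAX_ADDR are hypothetical names. The fallback value and
the range check are assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; only the fields used here. */
struct ssb_sprom  { unsigned char et0phyaddr; };
struct ssb_bus    { struct ssb_sprom sprom; };
struct ssb_device { struct ssb_bus *bus; };
struct ssb_gige   { struct ssb_device *dev; };

/* Assumed values for this sketch, not taken from the patch. */
#define EXAMPLE_DEFAULT_PHY_ADDR 1
#define EXAMPLE_PHY_MAX_ADDR     32

/* Same shape as the helper in the patch: address on success, -ENODEV otherwise. */
static int example_get_phyaddr(const struct ssb_gige *dev)
{
	if (!dev)
		return -ENODEV;

	return dev->dev->bus->sprom.et0phyaddr;
}

/*
 * Hypothetical caller: use the SPROM value when it is a plausible MDIO
 * address, otherwise fall back to the default.
 */
static int example_pick_phy_addr(const struct ssb_gige *dev)
{
	int addr = example_get_phyaddr(dev);

	if (addr < 0 || addr >= EXAMPLE_PHY_MAX_ADDR)
		return EXAMPLE_DEFAULT_PHY_ADDR;

	return addr;
}
```

Returning int rather than u8 is what lets the error code and the
address share one return value. The et0phyaddr field is unsigned, so a
successful call never returns a negative number.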
