Netdev List
 help / color / mirror / Atom feed
* Re: [patch v3 00/12] IPVS: SIP Persistence Engine
From: Julian Anastasov @ 2010-10-02  9:00 UTC (permalink / raw)
  To: Simon Horman
  Cc: lvs-devel, netdev, netfilter, netfilter-devel, Jan Engelhardt,
	Stephen Hemminger, Wensong Zhang, Patrick McHardy
In-Reply-To: <20101002023056.536217499@akiko.akashicho.tokyo.vergenet.net>


 	Hello,

On Sat, 2 Oct 2010, Simon Horman wrote:

> This patch series adds load-balancing of UDP SIP based on Call-ID to
> IPVS as well as a frame-work for extending IPVS to handle alternate
> persistence requirements.
>
> REVISIONS
>
> This is v3 of the patch series with addresses serveral problems
> raised by Julian Anastasov on the lvs-devel mailing list.

 	No other obvious problems in v3 1-12. So,
Acked-by: Julian Anastasov <ja@ssi.bg>

 	May be next days I'll test my changes on top of PE v3

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH net-next V2] net: dynamic ingress_queue allocation
From: Jarek Poplawski @ 2010-10-02  9:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: hadi, David Miller, netdev
In-Reply-To: <1285941388.2641.175.camel@edumazet-laptop>

On Fri, Oct 01, 2010 at 03:56:28PM +0200, Eric Dumazet wrote:
> Le vendredi 01 octobre 2010 ?? 07:45 -0400, jamal a écrit :
> > On Fri, 2010-10-01 at 00:58 +0200, Eric Dumazet wrote:
> > > Hi Jamal
> > > 
> > > Here is the dynamic allocation I promised. I lightly tested it, could
> > > you review it please ?
> > > Thanks !
> > > 
> > > [PATCH net-next2.6] net: dynamic ingress_queue allocation
> > > 
> > > ingress being not used very much, and net_device->ingress_queue being
> > > quite a big object (128 or 256 bytes), use a dynamic allocation if
> > > needed (tc qdisc add dev eth0 ingress ...)
> > 
> > I agree with the principle that it is valuable in making it dynamic for
> > people who dont want it; but, but (like my kid would say, sniff, sniff)
> > you are making me sad saying it is not used very much ;-> It is used
> > very much in my world. My friend Jarek uses it;->

Thanks Jamal, my friend, I'm really glad! (And sad ;-)

> 
> ;)
> 
> > 
> > > +#ifdef CONFIG_NET_CLS_ACT
> > 
> > I think appropriately this should be NET_SCH_INGRESS (everywhere else as
> > well).
> > 
> 
> I first thought of this, and found it would add a new dependence on
> vmlinux :
> 
> If someone wants to add NET_SCH_INGRESS module, he would need to
> recompile whole kernel and reboot.
> 
> This is probably why ing_filter() and handle_ing() are enclosed with
> CONFIG_NET_CLS_ACT, not CONFIG_NET_SCH_INGRESS.
> 
> Since struct net_dev only holds one pointer (after this patch), I
> believe its better to use same dependence.
> 
> > 
> > > +static inline struct netdev_queue *dev_ingress_queue(struct net_device *dev)
> > > +{
> > > +#ifdef CONFIG_NET_CLS_ACT
> > > +	return dev->ingress_queue;
> > > +#else
> > > +	return NULL;
> > > +#endif
> > 
> > Above, if you just returned dev->ingress_queue wouldnt it always be 
> > NULL if it was not allocated?
> > 
> 
> ingress_queue is not defined in "struct net_device *dev" if 
> !CONFIG_NET_CLS_ACT
> 
> Returning NULL here permits dead code elimination by compiler.
> 
> Then, probably nobody unset CONFIG_NET_CLS_ACT, so we can probably avoid
> this preprocessor stuff.
> 
> > 
> > > @@ -2737,7 +2734,9 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
> > >  					 struct packet_type **pt_prev,
> > >  					 int *ret, struct net_device *orig_dev)
> > >  {
> > > -	if (skb->dev->ingress_queue.qdisc == &noop_qdisc)
> > > +	struct netdev_queue *rxq = dev_ingress_queue(skb->dev);
> > > +
> > > +	if (!rxq || rxq->qdisc == &noop_qdisc)
> > >  		goto out;
> > 
> > I stared at above a little longer since this is the only fast path
> > affected; is it a few more cycles now for people who love ingress?
> 
> I see, this adds an indirection and a conditional branch, but this
> should be in cpu cache and well predicted.
> 
> I thought adding a fake "struct netdev_queue" object, with a qdisc
> pointing to noop_qdisc. But this would slow down a bit non ingress
> users ;)
> 
> For people not using ingress, my solution removes an access to an extra
> cache line. So network latency is improved a bit when cpu caches are
> full of user land data.
> 
> > 
> > > @@ -690,6 +693,8 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
> > >  		    (new && new->flags & TCQ_F_INGRESS)) {
> > >  			num_q = 1;
> > >  			ingress = 1;
> > > +			if (!dev_ingress_queue(dev))
> > > +				return -ENOENT;
> > >  		}
> > >  
> > 
> > The above looks clever but worries me because it changes the old flow.
> > If you have time,  the following tests will alleviate my fears
> > 
> > 1) compile support for ingress and add/delete ingress qdisc
> 
> This worked for me, but I dont know complex setups.
> 
> > 2) Dont compile support and add/delete ingress qdisc
> 
> tc gives an error (a bit like !CONFIG_NET_SCH_INGRESS)
> 
> # tc qdisc add dev eth0 ingress
> RTNETLINK answers: No such file or directory
> # tc -s -d qdisc show dev eth0
> qdisc mq 0: root 
>  Sent 636 bytes 10 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
> 
> 
> > 3) Compile ingress as a module and add/delete ingress qdisc
> > 
> > 
> 
> Seems to work like 1)
> 
> > Other than that excellent work Eric. And you can add my
> > Acked/reviewed-by etc.
> > 
> > BTW, did i say i like your per-cpu stats stuff? It applies nicely to
> > qdiscs, actions etc ;->
> 
> I took a look at ifb as suggested by Stephen but could not see trivial
> changes (LLTX or per-cpu stats), since central lock is needed I am
> afraid. And qdisc are the same, stats updates are mostly free as we
> dirtied cache line for the lock.
> 
> Thanks Jamal !
> 
> Here is the V2, with two #ifdef removed.
> 
> 
> [PATCH net-next V2] net: dynamic ingress_queue allocation
> 
> ingress being not used very much, and net_device->ingress_queue being
> quite a big object (128 or 256 bytes), use a dynamic allocation if
> needed (tc qdisc add dev eth0 ingress ...)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  include/linux/netdevice.h |   11 ++++++--
>  net/core/dev.c            |   48 +++++++++++++++++++++++++++---------
>  net/sched/sch_api.c       |   40 ++++++++++++++++++++----------
>  net/sched/sch_generic.c   |   36 +++++++++++++++------------
>  4 files changed, 92 insertions(+), 43 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index ceed347..4f86009 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -986,8 +986,7 @@ struct net_device {
>  	rx_handler_func_t	*rx_handler;
>  	void			*rx_handler_data;
>  
> -	struct netdev_queue	ingress_queue; /* use two cache lines */
> -
> +	struct netdev_queue	*ingress_queue;
>  /*
>   * Cache lines mostly used on transmit path
>   */
> @@ -1115,6 +1114,14 @@ static inline void netdev_for_each_tx_queue(struct net_device *dev,
>  		f(dev, &dev->_tx[i], arg);
>  }
>  
> +
> +static inline struct netdev_queue *dev_ingress_queue(struct net_device *dev)
> +{
> +	return dev->ingress_queue;
> +}
> +
> +extern struct netdev_queue *dev_ingress_queue_create(struct net_device *dev);
> +
>  /*
>   * Net namespace inlines
>   */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a313bab..e3bb8c9 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2702,11 +2702,10 @@ EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook);
>   * the ingress scheduler, you just cant add policies on ingress.
>   *
>   */
> -static int ing_filter(struct sk_buff *skb)
> +static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
>  {
>  	struct net_device *dev = skb->dev;
>  	u32 ttl = G_TC_RTTL(skb->tc_verd);
> -	struct netdev_queue *rxq;
>  	int result = TC_ACT_OK;
>  	struct Qdisc *q;
>  
> @@ -2720,8 +2719,6 @@ static int ing_filter(struct sk_buff *skb)
>  	skb->tc_verd = SET_TC_RTTL(skb->tc_verd, ttl);
>  	skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
>  
> -	rxq = &dev->ingress_queue;
> -
>  	q = rxq->qdisc;
>  	if (q != &noop_qdisc) {
>  		spin_lock(qdisc_lock(q));
> @@ -2737,7 +2734,9 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
>  					 struct packet_type **pt_prev,
>  					 int *ret, struct net_device *orig_dev)
>  {
> -	if (skb->dev->ingress_queue.qdisc == &noop_qdisc)
> +	struct netdev_queue *rxq = dev_ingress_queue(skb->dev);
> +
> +	if (!rxq || rxq->qdisc == &noop_qdisc)
>  		goto out;
>  
>  	if (*pt_prev) {
> @@ -2745,7 +2744,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
>  		*pt_prev = NULL;
>  	}
>  
> -	switch (ing_filter(skb)) {
> +	switch (ing_filter(skb, rxq)) {
>  	case TC_ACT_SHOT:
>  	case TC_ACT_STOLEN:
>  		kfree_skb(skb);
> @@ -4932,15 +4931,17 @@ static void __netdev_init_queue_locks_one(struct net_device *dev,
>  					  struct netdev_queue *dev_queue,
>  					  void *_unused)
>  {
> -	spin_lock_init(&dev_queue->_xmit_lock);
> -	netdev_set_xmit_lockdep_class(&dev_queue->_xmit_lock, dev->type);
> -	dev_queue->xmit_lock_owner = -1;
> +	if (dev_queue) {
> +		spin_lock_init(&dev_queue->_xmit_lock);
> +		netdev_set_xmit_lockdep_class(&dev_queue->_xmit_lock, dev->type);
> +		dev_queue->xmit_lock_owner = -1;
> +	}
>  }
>  
>  static void netdev_init_queue_locks(struct net_device *dev)
>  {
>  	netdev_for_each_tx_queue(dev, __netdev_init_queue_locks_one, NULL);
> -	__netdev_init_queue_locks_one(dev, &dev->ingress_queue, NULL);
> +	__netdev_init_queue_locks_one(dev, dev_ingress_queue(dev), NULL);

Is dev_ingress_queue(dev) not NULL anytime here?

>  }
>  
>  unsigned long netdev_fix_features(unsigned long features, const char *name)
> @@ -5447,16 +5448,37 @@ static void netdev_init_one_queue(struct net_device *dev,
>  				  struct netdev_queue *queue,
>  				  void *_unused)
>  {
> -	queue->dev = dev;
> +	if (queue)
> +		queue->dev = dev;
>  }
>  
>  static void netdev_init_queues(struct net_device *dev)
>  {
> -	netdev_init_one_queue(dev, &dev->ingress_queue, NULL);
> +	netdev_init_one_queue(dev, dev_ingress_queue(dev), NULL);

Is dev_ingress_queue(dev) not NULL anytime here?

>  	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
>  	spin_lock_init(&dev->tx_global_lock);
>  }
>  
> +struct netdev_queue *dev_ingress_queue_create(struct net_device *dev)
> +{
> +	struct netdev_queue *queue = dev_ingress_queue(dev);
> +
> +#ifdef CONFIG_NET_CLS_ACT
> +	if (queue)
> +		return queue;
> +	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
> +	if (!queue)
> +		return NULL;
> +	netdev_init_one_queue(dev, queue, NULL);
> +	__netdev_init_queue_locks_one(dev, queue, NULL);
> +	queue->qdisc = &noop_qdisc;
> +	queue->qdisc_sleeping = &noop_qdisc;
> +	smp_wmb();

Why don't we need smp_rmb() in handle_ing()?

> +	dev->ingress_queue = queue;
> +#endif
> +	return queue;
> +}
> +
>  /**
>   *	alloc_netdev_mq - allocate network device
>   *	@sizeof_priv:	size of private data to allocate space for
> @@ -5559,6 +5581,8 @@ void free_netdev(struct net_device *dev)
>  
>  	kfree(dev->_tx);
>  
> +	kfree(dev_ingress_queue(dev));
> +
>  	/* Flush device addresses */
>  	dev_addr_flush(dev);
>  
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index b802078..8635110 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -240,7 +240,10 @@ struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
>  	if (q)
>  		goto out;
>  
> -	q = qdisc_match_from_root(dev->ingress_queue.qdisc_sleeping, handle);
> +	if (!dev_ingress_queue(dev))
> +		goto out;
> +	q = qdisc_match_from_root(dev_ingress_queue(dev)->qdisc_sleeping,
> +				  handle);

I'd prefer:
 +	if (dev_ingress_queue(dev))
 +		q = qdisc_match_from_root(dev_ingress_queue(dev)->qdisc_sleeping,

>  out:
>  	return q;
>  }
> @@ -690,6 +693,8 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>  		    (new && new->flags & TCQ_F_INGRESS)) {
>  			num_q = 1;
>  			ingress = 1;
> +			if (!dev_ingress_queue(dev))
> +				return -ENOENT;

Is this test really needed here?

>  		}
>  
>  		if (dev->flags & IFF_UP)
> @@ -701,7 +706,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>  		}
>  
>  		for (i = 0; i < num_q; i++) {
> -			struct netdev_queue *dev_queue = &dev->ingress_queue;
> +			struct netdev_queue *dev_queue = dev_ingress_queue(dev);
>  
>  			if (!ingress)
>  				dev_queue = netdev_get_tx_queue(dev, i);
> @@ -979,7 +984,8 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
>  					return -ENOENT;
>  				q = qdisc_leaf(p, clid);
>  			} else { /* ingress */
> -				q = dev->ingress_queue.qdisc_sleeping;
> +				if (dev_ingress_queue(dev))
> +					q = dev_ingress_queue(dev)->qdisc_sleeping;
>  			}
>  		} else {
>  			q = dev->qdisc;
> @@ -1044,7 +1050,8 @@ replay:
>  					return -ENOENT;
>  				q = qdisc_leaf(p, clid);
>  			} else { /*ingress */
> -				q = dev->ingress_queue.qdisc_sleeping;
> +				if (dev_ingress_queue_create(dev))
> +					q = dev_ingress_queue(dev)->qdisc_sleeping;

I wonder if doing dev_ingress_queue_create() just before qdisc_create()
(and the test here) isn't more readable.

>  			}
>  		} else {
>  			q = dev->qdisc;
> @@ -1123,11 +1130,14 @@ replay:
>  create_n_graft:
>  	if (!(n->nlmsg_flags&NLM_F_CREATE))
>  		return -ENOENT;
> -	if (clid == TC_H_INGRESS)
> -		q = qdisc_create(dev, &dev->ingress_queue, p,
> -				 tcm->tcm_parent, tcm->tcm_parent,
> -				 tca, &err);
> -	else {
> +	if (clid == TC_H_INGRESS) {
> +		if (dev_ingress_queue(dev))
> +			q = qdisc_create(dev, dev_ingress_queue(dev), p,
> +					 tcm->tcm_parent, tcm->tcm_parent,
> +					 tca, &err);
> +		else
> +			err = -ENOENT;
> +	} else {
>  		struct netdev_queue *dev_queue;
>  
>  		if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue)
> @@ -1304,8 +1314,10 @@ static int tc_dump_qdisc(struct sk_buff *skb, struct netlink_callback *cb)
>  		if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx) < 0)
>  			goto done;
>  
> -		dev_queue = &dev->ingress_queue;
> -		if (tc_dump_qdisc_root(dev_queue->qdisc_sleeping, skb, cb, &q_idx, s_q_idx) < 0)
> +		dev_queue = dev_ingress_queue(dev);
> +		if (dev_queue &&
> +		    tc_dump_qdisc_root(dev_queue->qdisc_sleeping, skb, cb,
> +				       &q_idx, s_q_idx) < 0)
>  			goto done;
>  
>  cont:
> @@ -1595,8 +1607,10 @@ static int tc_dump_tclass(struct sk_buff *skb, struct netlink_callback *cb)
>  	if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t) < 0)
>  		goto done;
>  
> -	dev_queue = &dev->ingress_queue;
> -	if (tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb, &t, s_t) < 0)
> +	dev_queue = dev_ingress_queue(dev);
> +	if (dev_queue &&
> +	    tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb,
> +				&t, s_t) < 0)
>  		goto done;
>  
>  done:
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 545278a..c42dec5 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -721,16 +721,18 @@ static void transition_one_qdisc(struct net_device *dev,
>  				 struct netdev_queue *dev_queue,
>  				 void *_need_watchdog)
>  {
> -	struct Qdisc *new_qdisc = dev_queue->qdisc_sleeping;
> -	int *need_watchdog_p = _need_watchdog;
> +	if (dev_queue) {
> +		struct Qdisc *new_qdisc = dev_queue->qdisc_sleeping;
> +		int *need_watchdog_p = _need_watchdog;
>  
> -	if (!(new_qdisc->flags & TCQ_F_BUILTIN))
> -		clear_bit(__QDISC_STATE_DEACTIVATED, &new_qdisc->state);
> +		if (!(new_qdisc->flags & TCQ_F_BUILTIN))
> +			clear_bit(__QDISC_STATE_DEACTIVATED, &new_qdisc->state);
>  
> -	rcu_assign_pointer(dev_queue->qdisc, new_qdisc);
> -	if (need_watchdog_p && new_qdisc != &noqueue_qdisc) {
> -		dev_queue->trans_start = 0;
> -		*need_watchdog_p = 1;
> +		rcu_assign_pointer(dev_queue->qdisc, new_qdisc);
> +		if (need_watchdog_p && new_qdisc != &noqueue_qdisc) {
> +			dev_queue->trans_start = 0;
> +			*need_watchdog_p = 1;
> +		}
>  	}
>  }
>  
> @@ -753,7 +755,7 @@ void dev_activate(struct net_device *dev)
>  
>  	need_watchdog = 0;
>  	netdev_for_each_tx_queue(dev, transition_one_qdisc, &need_watchdog);
> -	transition_one_qdisc(dev, &dev->ingress_queue, NULL);
> +	transition_one_qdisc(dev, dev_ingress_queue(dev), NULL);

I'd prefer here and similarly later:

 +	if (dev_ingress_queue(dev))
 +		transition_one_qdisc(dev, dev_ingress_queue(dev), NULL);

to show NULL dev_queue is only legal in this one case.

Cheers,
Jarek P.

>  
>  	if (need_watchdog) {
>  		dev->trans_start = jiffies;
> @@ -768,7 +770,7 @@ static void dev_deactivate_queue(struct net_device *dev,
>  	struct Qdisc *qdisc_default = _qdisc_default;
>  	struct Qdisc *qdisc;
>  
> -	qdisc = dev_queue->qdisc;
> +	qdisc = dev_queue ? dev_queue->qdisc : NULL;
>  	if (qdisc) {
>  		spin_lock_bh(qdisc_lock(qdisc));
>  
> @@ -812,7 +814,7 @@ static bool some_qdisc_is_busy(struct net_device *dev)
>  void dev_deactivate(struct net_device *dev)
>  {
>  	netdev_for_each_tx_queue(dev, dev_deactivate_queue, &noop_qdisc);
> -	dev_deactivate_queue(dev, &dev->ingress_queue, &noop_qdisc);
> +	dev_deactivate_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
>  
>  	dev_watchdog_down(dev);
>  
> @@ -830,15 +832,17 @@ static void dev_init_scheduler_queue(struct net_device *dev,
>  {
>  	struct Qdisc *qdisc = _qdisc;
>  
> -	dev_queue->qdisc = qdisc;
> -	dev_queue->qdisc_sleeping = qdisc;
> +	if (dev_queue) {
> +		dev_queue->qdisc = qdisc;
> +		dev_queue->qdisc_sleeping = qdisc;
> +	}
>  }
>  
>  void dev_init_scheduler(struct net_device *dev)
>  {
>  	dev->qdisc = &noop_qdisc;
>  	netdev_for_each_tx_queue(dev, dev_init_scheduler_queue, &noop_qdisc);
> -	dev_init_scheduler_queue(dev, &dev->ingress_queue, &noop_qdisc);
> +	dev_init_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
>  
>  	setup_timer(&dev->watchdog_timer, dev_watchdog, (unsigned long)dev);
>  }
> @@ -847,7 +851,7 @@ static void shutdown_scheduler_queue(struct net_device *dev,
>  				     struct netdev_queue *dev_queue,
>  				     void *_qdisc_default)
>  {
> -	struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
> +	struct Qdisc *qdisc = dev_queue ? dev_queue->qdisc_sleeping : NULL;
>  	struct Qdisc *qdisc_default = _qdisc_default;
>  
>  	if (qdisc) {
> @@ -861,7 +865,7 @@ static void shutdown_scheduler_queue(struct net_device *dev,
>  void dev_shutdown(struct net_device *dev)
>  {
>  	netdev_for_each_tx_queue(dev, shutdown_scheduler_queue, &noop_qdisc);
> -	shutdown_scheduler_queue(dev, &dev->ingress_queue, &noop_qdisc);
> +	shutdown_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
>  	qdisc_destroy(dev->qdisc);
>  	dev->qdisc = &noop_qdisc;
>  
> 
> 

^ permalink raw reply

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
From: Arnaud Ebalard @ 2010-10-02 10:17 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, herbert, yoshfuji, netdev
In-Reply-To: <20100929.201618.241458205.davem@davemloft.net>

Hi,

David Miller <davem@davemloft.net> writes:

>> +static int mip6_iro_src_reject(struct xfrm_state *x, struct sk_buff *skb, struct flowi *fl)
>> +{
>> +	int err = 0;
>> +
>> +	/* XXX We may need some reject handler at some point but it is not
>> +	 * critical yet: see xfrm_secpath_reject() in net/xfrm/xfrm_policy.c
>> +	 * and aslo what mip6_destopt_reject() implements */
>> +
>> +	printk("XXX FIXME: mip6_iro_src_reject() called\n");
>
> pr_debug() or pr_err() or get rid of it altogher and use WARN_ON() or
> similar.

I will take a look at this reject handler tomorrow (implement or remove it).


>> +	spin_lock(&x->lock);
>> +	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
>> +	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
>> +		err = -ENOENT;
>> +	spin_unlock(&x->lock);
>
> What are you actually protecting with this lock?  The moment you drop
> it the x->coaddr can change which changes the result you should return
> here.
>
> I suspect you either don't need the lock, or you need to lock at a higher
> level.

I basically trusted RH2 input handler code and reused it as a basis:

  static int mip6_rthdr_input(struct xfrm_state *x, struct sk_buff *skb)
  {
  	struct ipv6hdr *iph = ipv6_hdr(skb);
  	struct rt2_hdr *rt2 = (struct rt2_hdr *)skb->data;
  	int err = rt2->rt_hdr.nexthdr;
  
  	spin_lock(&x->lock);
  	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
  	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
  		err = -ENOENT;
  	spin_unlock(&x->lock);
  
  	return err;
  }

*At that time*, I considered the lock useful to prevent changes on coaddr
during the two checks, i.e. to make it coherent. But I think you are
right and I see no reason for the lock not to be at a higher level:
I may have missed somthing but AFAICT, from a look at the code, there is
nothing preventing x->coaddr to  be updated (via xfrm_sa_update()) just
before or just after the checks.

I took a look at the callers for mip6 handlers and if I am not mistaken
there is *only* xfrm6_input_addr() because xfrm_input() only handles
esp, ah and ipcomp extension headers and not mip6-related ones
(i.e. only IPsec-related ones, those with a SPI). Here is a snippet of
the interesting (lock-wise) part of xfrm6_input_addr():

>	for (i = 0; i < 3; i++) {
>
>               <....snip....>
>
> 		spin_lock(&x->lock);
> 
> 		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
> 		    likely(x->km.state == XFRM_STATE_VALID) &&
> 		    !xfrm_state_check_expire(x)) {
> 			spin_unlock(&x->lock);
> 			if (x->type->input(x, skb) > 0) {
> 				/* found a valid state */
> 				break;
> 			}
> 		} else
> 			spin_unlock(&x->lock);
> 
> 		xfrm_state_put(x);
> 		x = NULL;
> 	}
> 
> 	if (!x) {
> 		XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
> 		xfrm_audit_state_notfound_simple(skb, AF_INET6);
> 		goto drop;
> 	}
> 
> 	skb->sp->xvec[skb->sp->len++] = x;
> 
> 	spin_lock(&x->lock);
> 
> 	x->curlft.bytes += skb->len;
> 	x->curlft.packets++;
> 
> 	spin_unlock(&x->lock);

and I see no reason not to keep the lock we have on the state until the
end of the function when the state is valid (when we break), instead of
releasing it to get it again later. Something like the following would
allow removing the spin_lock()/spin_unlock() calls from all mip6 input
handlers (mip6_{destopt,rthdr,iro_src,iro_dst}_input()):

> 		spin_lock(&x->lock);
>
> 		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
> 		    likely(x->km.state == XFRM_STATE_VALID) &&
> 		    !xfrm_state_check_expire(x)) {
> 			if (x->type->input(x, skb) > 0) {
> 				/* found a valid state */
> 				break;
> 			} 
> 		}
>
> 		spin_unlock(&x->lock);
>
> 		xfrm_state_put(x);
> 		x = NULL;
> 	}
> 
> 	if (!x) {
> 		XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
> 		xfrm_audit_state_notfound_simple(skb, AF_INET6);
> 		goto drop;
> 	}
> 
> 	skb->sp->xvec[skb->sp->len++] = x;
> 
> 	x->curlft.bytes += skb->len;
> 	x->curlft.packets++;
> 
> 	spin_unlock(&x->lock);

If this is ok, I will add a patch to the set to do that and also remove
the locks from the input handlers I introduce.

Cheers,

a+

^ permalink raw reply

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
From: Herbert Xu @ 2010-10-02 10:32 UTC (permalink / raw)
  To: Arnaud Ebalard; +Cc: David Miller, eric.dumazet, yoshfuji, netdev
In-Reply-To: <87iq1k99vk.fsf@small.ssi.corp>

On Sat, Oct 02, 2010 at 12:17:35PM +0200, Arnaud Ebalard wrote:
>
> and I see no reason not to keep the lock we have on the state until the
> end of the function when the state is valid (when we break), instead of
> releasing it to get it again later. Something like the following would
> allow removing the spin_lock()/spin_unlock() calls from all mip6 input
> handlers (mip6_{destopt,rthdr,iro_src,iro_dst}_input()):

No I moved the state lock down precisely because it should not
be taken at a higher level as that breaks asynchronous IPsec
processing and the fact that it isn't needed in most places.

If your code needs it then you should take it rather than impose
it on real IPsec users.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [Patch] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.
From: Robin Holt @ 2010-10-02 11:24 UTC (permalink / raw)
  To: Willy Tarreau, linux-kernel, netdev
  Cc: David S. Miller, Alexey Kuznetsov, Pekka Savola (ipv6),
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <20101002112405.951704198@gulag1.americas.sgi.com>

[-- Attachment #1: sysctl_tcp_udp_mem_max_overflows --]
[-- Type: text/plain, Size: 2746 bytes --]

Subject: [Patch] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

On a 16TB x86_64 machine, sysctl_tcp_mem[2], sysctl_udp_mem[2], and
sysctl_sctp_mem[2] can integer overflow.  Set limit such that they are
maximized without overflowing.

Signed-off-by: Robin Holt <holt@sgi.com>
To: Willy Tarreau <w@1wt.eu>
To: linux-kernel@vger.kernel.org
To: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>

---

 net/ipv4/tcp.c |    4 +++-
 net/ipv4/udp.c |    4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Index: pv1010932/net/ipv4/tcp.c
===================================================================
--- pv1010932.orig/net/ipv4/tcp.c	2010-10-02 06:11:59.737449853 -0500
+++ pv1010932/net/ipv4/tcp.c	2010-10-02 06:12:35.445454593 -0500
@@ -3271,12 +3271,14 @@ void __init tcp_init(void)
 
 	/* Set the pressure threshold to be a fraction of global memory that
 	 * is up to 1/2 at 256 MB, decreasing toward zero with the amount of
-	 * memory, with a floor of 128 pages.
+	 * memory, with a floor of 128 pages, and a ceiling that prevents an
+	 * integer overflow.
 	 */
 	nr_pages = totalram_pages - totalhigh_pages;
 	limit = min(nr_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
 	limit = (limit * (nr_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
 	limit = max(limit, 128UL);
+	limit = min(limit, INT_MAX * 4UL / 3 / 2);
 	sysctl_tcp_mem[0] = limit / 4 * 3;
 	sysctl_tcp_mem[1] = limit;
 	sysctl_tcp_mem[2] = sysctl_tcp_mem[0] * 2;
Index: pv1010932/net/ipv4/udp.c
===================================================================
--- pv1010932.orig/net/ipv4/udp.c	2010-10-02 06:11:59.737449853 -0500
+++ pv1010932/net/ipv4/udp.c	2010-10-02 06:12:35.453453784 -0500
@@ -2167,12 +2167,14 @@ void __init udp_init(void)
 	udp_table_init(&udp_table, "UDP");
 	/* Set the pressure threshold up by the same strategy of TCP. It is a
 	 * fraction of global memory that is up to 1/2 at 256 MB, decreasing
-	 * toward zero with the amount of memory, with a floor of 128 pages.
+	 * toward zero with the amount of memory, with a floor of 128 pages,
+	 * and a ceiling that prevents an integer overflow.
 	 */
 	nr_pages = totalram_pages - totalhigh_pages;
 	limit = min(nr_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
 	limit = (limit * (nr_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
 	limit = max(limit, 128UL);
+	limit = min(limit, INT_MAX * 4UL / 3 / 2);
 	sysctl_udp_mem[0] = limit / 4 * 3;
 	sysctl_udp_mem[1] = limit;
 	sysctl_udp_mem[2] = sysctl_udp_mem[0] * 2;

^ permalink raw reply

* Re: [PATCH 2.6.35.7] net: Fix the condition passed to sk_wait_event()
From: Nagendra Tomar @ 2010-10-02 11:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, davem, linux-kernel
In-Reply-To: <1286008441.2582.851.camel@edumazet-laptop>



--- On Sat, 2/10/10, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> > Just wondering why you remove the test on sk->err
> ?
> > 
> > We want to break the loop If sk->sk_err is set, or
> state is ESTABLISHED
> > or CLOSE_WAIT.
> 
> Hmm, reading the code again, I can see sk_err is tested in
> the loop, so
> your code is better (sk_stream_wait_connect() returns an
> error after
> your patch, instead of returning 0)

Exactly.

> 
> Could you please split your patch in two patches ?
> 

ok, I'll send it soon.

Thanks,
Tomar




      

^ permalink raw reply

* checkentry function
From: Nicola Padovano @ 2010-10-02 11:59 UTC (permalink / raw)
  To: netfilter-devel, netdev

Hello there.
I've written checkentry function to check my new target, in this way:

[CHECK_ENTRY_CODE]
static bool xt_tarpit_check(const char *tablename, const void *entry,
                            const struct xt_target *target, void *targinfo,
                            unsigned int hook_mask)
{
 if (strcmp(tablename, "filter"))   {
    printk(KERN_INFO "DEBUG: the tablename (not FILTER) is %s\n",tablename);
    return false;
  }
return true;
}
[/CHECK_ENTRY_CODE]

but it doesn't work.
In fact if I do:

iptables -A INPUT -t filter -s 192.168.0.1 -p tcp -j TAR

the printk prints this message: DEBUG: the tablename (not FILTER) is: �%H �

so: in the tablename i haven't the string "filter"...what' the matter?

-- 
Nicola Padovano
e-mail: nicola.padovano@gmail.com
web: http://npadovano.altervista.org

"My only ambition is not be anything at all; it seems the most
sensible thing" (C. Bukowski)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] net: Fix the condition passed to sk_wait_event()
From: Nagendra Tomar @ 2010-10-02 12:08 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, davem

This is the first part of the split up patch submitted in http://www.spinics.net/lists/netdev/msg142617.html

This patch fixes the sk_wait_event() condition in the sk_stream_wait_connect() function. With this change, we correctly check for the TCPF_ESTABLISHED and TCPF_CLOSE_WAIT states and avoid potentially returning success when there is an error on the socket.

Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com>
---
--- linux-2.6.35.7/net/core/stream.c.orig	2010-03-24 09:30:00.000000000 +0530
+++ linux-2.6.35.7/net/core/stream.c	2010-03-24 09:30:17.000000000 +0530
@@ -73,9 +73,8 @@ int sk_stream_wait_connect(struct sock *
 		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
 		sk->sk_write_pending++;
 		done = sk_wait_event(sk, timeo_p,
-				     !sk->sk_err &&
-				     !((1 << sk->sk_state) &
-				       ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)));
+				     ((1 << sk->sk_state) &
+				       (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)));
 		finish_wait(sk_sleep(sk), &wait);
 		sk->sk_write_pending--;
 	} while (!done);

^ permalink raw reply

* Re: [PATCH net-next V2] net: dynamic ingress_queue allocation
From: jamal @ 2010-10-02 12:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jarek Poplawski, David Miller, netdev
In-Reply-To: <1285941388.2641.175.camel@edumazet-laptop>

On Fri, 2010-10-01 at 15:56 +0200, Eric Dumazet wrote: 
> Le vendredi 01 octobre 2010 à 07:45 -0400, jamal a écrit :

> 
> I first thought of this, and found it would add a new dependence on
> vmlinux :
> 
> If someone wants to add NET_SCH_INGRESS module, he would need to
> recompile whole kernel and reboot.
> 
> This is probably why ing_filter() and handle_ing() are enclosed with
> CONFIG_NET_CLS_ACT, not CONFIG_NET_SCH_INGRESS.
> 
> Since struct net_dev only holds one pointer (after this patch), I
> believe its better to use same dependence.
> 

Ok - I just noticed that CONFIG_NET_SCH_INGRESS depends on 
CONFIG_NET_CLS_ACT - probably my fault too. I have a use case
for having just CONFIG_NET_SCH_INGRESS without need for
CONFIG_NET_CLS_ACT; but you are right to keep in the current
spirit it is fair to keep it as is; i wonder if it needs to be fixed.

> > 
> > > +static inline struct netdev_queue *dev_ingress_queue(struct net_device *dev)
> > > +{
> > > +#ifdef CONFIG_NET_CLS_ACT
> > > +	return dev->ingress_queue;
> > > +#else
> > > +	return NULL;
> > > +#endif
> > 
> > Above, if you just returned dev->ingress_queue wouldnt it always be 
> > NULL if it was not allocated?
> > 
> 
> ingress_queue is not defined in "struct net_device *dev" if 
> !CONFIG_NET_CLS_ACT
> 
> Returning NULL here permits dead code elimination by compiler.
> 
> Then, probably nobody unset CONFIG_NET_CLS_ACT, so we can probably avoid
> this preprocessor stuff.
> 

Ok


> I see, this adds an indirection and a conditional branch, but this
> should be in cpu cache and well predicted.

Ok - good enough for me.

> I thought adding a fake "struct netdev_queue" object, with a qdisc
> pointing to noop_qdisc. But this would slow down a bit non ingress
> users ;)

nod.

> For people not using ingress, my solution removes an access to an extra
> cache line. So network latency is improved a bit when cpu caches are
> full of user land data.

Who are these lunatics running without ingress anyways?

> > 1) compile support for ingress and add/delete ingress qdisc
> 
> This worked for me, but I dont know complex setups.
> 

I think if you covered the basic case, should be equivalent
to any other case. Maybe a replace after you do an add etc.

> > 2) Dont compile support and add/delete ingress qdisc
> 
> tc gives an error (a bit like !CONFIG_NET_SCH_INGRESS)
> 
> # tc qdisc add dev eth0 ingress
> RTNETLINK answers: No such file or directory
> # tc -s -d qdisc show dev eth0
> qdisc mq 0: root 
>  Sent 636 bytes 10 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
> 

nice

> 
> > 3) Compile ingress as a module and add/delete ingress qdisc
> > 
> Seems to work like 1)
> 

ok then, nothing else to bitch about.

Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>

> I took a look at ifb as suggested by Stephen but could not see trivial
> changes (LLTX or per-cpu stats), since central lock is needed I am
> afraid. 

I wasnt thinking ifb - but ifb (and probably other netdevs) could
benefit a tiny bit by rearranging net_stats struct rx and tx parts into
different cache lines since these are typically updated in different
paths. In kernel, rx path for ifb is guaranteed to run in one cpu at a
time but tx could be many cpus (so at least for rx path theres benefit)

> And qdisc are the same, stats updates are mostly free as we
> dirtied cache line for the lock.

ok - qdiscs are easier because a qdisc instance can be accessed by one
cpu at a time. actions could be configured to be shared by many filters;
in such a mode they could be accessed across many cpus.

cheers,
jamal


^ permalink raw reply

* Re: [PATCH 2/2] net: Fix the condition passed to sk_wait_event()
From: Nagendra Tomar @ 2010-10-02 12:16 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, davem

This is the second part of the split up patch submitted in http://www.spinics.net/lists/netdev/msg142617.html

This fixes the condition (3rd arg) passed to sk_wait_event() in sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory() causes the following soft lockup in tcp_sendmsg() when the global tcp memory pool has exhausted. 

>>> snip <<<

localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429]
localhost kernel: CPU 3:
localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200]  [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200
localhost kernel: 
localhost kernel: Call Trace:
localhost kernel:  [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0
localhost kernel:  [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140
localhost kernel:  [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170
localhost kernel:  [vfs_write+0x185/0x190] vfs_write+0x185/0x190
localhost kernel:  [sys_write+0x50/0x90] sys_write+0x50/0x90
localhost kernel:  [system_call+0x7e/0x83] system_call+0x7e/0x83

>>> snip <<<


What is happening is, that the sk_wait_event() condition passed from
sk_stream_wait_memory() evaluates to true for the case of tcp global memory
exhaustion. This is because both sk_stream_memory_free() and vm_wait are true
which causes sk_wait_event() to *not* call schedule_timeout(). 
Hence sk_stream_wait_memory() returns immediately to the caller w/o sleeping.
This causes the caller to again try allocation, which again fails and again
calls sk_stream_wait_memory(), and so on.


Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com>
---
--- linux-2.6.35.7/net/core/stream.c.orig	2010-03-24 09:31:00.000000000 +0530
+++ linux-2.6.35.7/net/core/stream.c	2010-03-24 09:31:08.000000000 +0530
@@ -143,10 +143,9 @@ int sk_stream_wait_memory(struct sock *s
 
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		sk->sk_write_pending++;
-		sk_wait_event(sk, &current_timeo, !sk->sk_err &&
-						  !(sk->sk_shutdown & SEND_SHUTDOWN) &&
-						  sk_stream_memory_free(sk) &&
-						  vm_wait);
+		sk_wait_event(sk, &current_timeo, sk->sk_err ||
+						  (sk->sk_shutdown & SEND_SHUTDOWN) ||
+						  (sk_stream_memory_free(sk) && !vm_wait));
 		sk->sk_write_pending--;
 
 		if (vm_wait) {



      

^ permalink raw reply

* [PATCH] sysctl: fix min/max handling in __do_proc_doulongvec_minmax()
From: Eric Dumazet @ 2010-10-02 13:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Robin Holt, Willy Tarreau, David S. Miller, netdev,
	James Morris, Hideaki YOSHIFUJI, Pekka Savola (ipv6), netdev,
	James Morris, Hideaki YOSHIFUJI, Pekka Savola (ipv6),
	Patrick McHardy, Alexey Kuznetsov

When proc_doulongvec_minmax() is used with an array of longs,
and no min/max check requested (.extra1 or .extra2 being NULL), we
dereference a NULL pointer for the second element of the array.

Noticed while doing some changes in network stack for the "16TB problem"

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 kernel/sysctl.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index f88552c..4fba86d 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2500,7 +2500,8 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 				break;
 			if (neg)
 				continue;
-			if ((min && val < *min) || (max && val > *max))
+			if ((table->extra1 && val < *min) ||
+			    (table->extra2 && val > *max))
 				continue;
 			*i = val;
 		} else {



^ permalink raw reply related

* Re: [Patch] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.
From: Eric Dumazet @ 2010-10-02 13:22 UTC (permalink / raw)
  To: Robin Holt, Andrew Morton
  Cc: Willy Tarreau, linux-kernel, netdev, David S. Miller,
	Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <20101002112419.248437367@gulag1.americas.sgi.com>

Le samedi 02 octobre 2010 à 06:24 -0500, Robin Holt a écrit :
> pièce jointe document texte brut (sysctl_tcp_udp_mem_max_overflows)
> Subject: [Patch] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.
> 
> On a 16TB x86_64 machine, sysctl_tcp_mem[2], sysctl_udp_mem[2], and
> sysctl_sctp_mem[2] can integer overflow.  Set limit such that they are
> maximized without overflowing.
> 
> Signed-off-by: Robin Holt <holt@sgi.com>
> To: Willy Tarreau <w@1wt.eu>
> To: linux-kernel@vger.kernel.org
> To: netdev@vger.kernel.org
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> 
> ---
> 
>  net/ipv4/tcp.c |    4 +++-
>  net/ipv4/udp.c |    4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)

Hi Mr SixteenTB

Strange, you mention sctp in changelog but I cant see the patch.

We could apply your patch (with sctp changes) for stable.

IMHO it would be better in long term to switch all these limits from
"int" to "long", now we have atomic_long_t primitives that have same
runtime cost than atomic_t ones. I am pretty sure one of your customer
will need more than 5TB of memory for tcp/udp buffers before year 2020,
dont you think ? (some further scalability works probably needed.)

Something like this (boot tested on my dev machine, not a 16TB one, as
you guessed ;) ) :

Based on linux-2.6 for your convenience, should probably sit first in
David net-next-2.6 ...

Note : this needs the "sysctl: fix min/max handling in
__do_proc_doulongvec_minmax()" I sent to Andrew some minutes ago.
or this triggers a BUG :

"echo "378912 505216 758989782400000" >/proc/sys/net/ipv4/tcp_mem


Thanks !

[PATCH] net: avoid limits overflow

Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use long "instead" of "int", now
atomic_long_t primitives are available for free.

Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/dn.h               |    2 +-
 include/net/sock.h             |    4 ++--
 include/net/tcp.h              |    6 +++---
 include/net/udp.h              |    4 ++--
 net/core/sock.c                |   14 +++++++-------
 net/decnet/af_decnet.c         |    2 +-
 net/decnet/sysctl_net_decnet.c |    4 ++--
 net/ipv4/proc.c                |    8 ++++----
 net/ipv4/sysctl_net_ipv4.c     |    5 ++---
 net/ipv4/tcp.c                 |    4 ++--
 net/ipv4/tcp_input.c           |   11 +++++++----
 net/ipv4/udp.c                 |    4 ++--
 net/sctp/protocol.c            |    2 +-
 net/sctp/socket.c              |    4 ++--
 net/sctp/sysctl.c              |    4 ++--
 15 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/include/net/dn.h b/include/net/dn.h
index e5469f7..a514a3c 100644
--- a/include/net/dn.h
+++ b/include/net/dn.h
@@ -225,7 +225,7 @@ extern int decnet_di_count;
 extern int decnet_dr_count;
 extern int decnet_no_fc_max_cwnd;
 
-extern int sysctl_decnet_mem[3];
+extern long sysctl_decnet_mem[3];
 extern int sysctl_decnet_wmem[3];
 extern int sysctl_decnet_rmem[3];
 
diff --git a/include/net/sock.h b/include/net/sock.h
index adab9dc..5d84d86 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -762,7 +762,7 @@ struct proto {
 
 	/* Memory pressure */
 	void			(*enter_memory_pressure)(struct sock *sk);
-	atomic_t		*memory_allocated;	/* Current allocated memory. */
+	atomic_long_t		*memory_allocated;	/* Current allocated memory. */
 	struct percpu_counter	*sockets_allocated;	/* Current number of sockets. */
 	/*
 	 * Pressure flag: try to collapse.
@@ -771,7 +771,7 @@ struct proto {
 	 * is strict, actions are advisory and have some latency.
 	 */
 	int			*memory_pressure;
-	int			*sysctl_mem;
+	long			*sysctl_mem;
 	int			*sysctl_wmem;
 	int			*sysctl_rmem;
 	int			max_header;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3e4b33e..3f05403 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -224,7 +224,7 @@ extern int sysctl_tcp_fack;
 extern int sysctl_tcp_reordering;
 extern int sysctl_tcp_ecn;
 extern int sysctl_tcp_dsack;
-extern int sysctl_tcp_mem[3];
+extern long sysctl_tcp_mem[3];
 extern int sysctl_tcp_wmem[3];
 extern int sysctl_tcp_rmem[3];
 extern int sysctl_tcp_app_win;
@@ -247,7 +247,7 @@ extern int sysctl_tcp_cookie_size;
 extern int sysctl_tcp_thin_linear_timeouts;
 extern int sysctl_tcp_thin_dupack;
 
-extern atomic_t tcp_memory_allocated;
+extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
 extern int tcp_memory_pressure;
 
@@ -280,7 +280,7 @@ static inline bool tcp_too_many_orphans(struct sock *sk, int shift)
 	}
 
 	if (sk->sk_wmem_queued > SOCK_MIN_SNDBUF &&
-	    atomic_read(&tcp_memory_allocated) > sysctl_tcp_mem[2])
+	    atomic_long_read(&tcp_memory_allocated) > sysctl_tcp_mem[2])
 		return true;
 	return false;
 }
diff --git a/include/net/udp.h b/include/net/udp.h
index a184d34..e686e01 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -105,10 +105,10 @@ static inline struct udp_hslot *udp_hashslot2(struct udp_table *table,
 
 extern struct proto udp_prot;
 
-extern atomic_t udp_memory_allocated;
+extern atomic_long_t udp_memory_allocated;
 
 /* sysctl variables for udp */
-extern int sysctl_udp_mem[3];
+extern long sysctl_udp_mem[3];
 extern int sysctl_udp_rmem_min;
 extern int sysctl_udp_wmem_min;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index ef30e9d..a2af957 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1646,10 +1646,10 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
 {
 	struct proto *prot = sk->sk_prot;
 	int amt = sk_mem_pages(size);
-	int allocated;
+	long allocated;
 
 	sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
-	allocated = atomic_add_return(amt, prot->memory_allocated);
+	allocated = atomic_long_add_return(amt, prot->memory_allocated);
 
 	/* Under limit. */
 	if (allocated <= prot->sysctl_mem[0]) {
@@ -1707,7 +1707,7 @@ suppress_allocation:
 
 	/* Alas. Undo changes. */
 	sk->sk_forward_alloc -= amt * SK_MEM_QUANTUM;
-	atomic_sub(amt, prot->memory_allocated);
+	atomic_long_sub(amt, prot->memory_allocated);
 	return 0;
 }
 EXPORT_SYMBOL(__sk_mem_schedule);
@@ -1720,12 +1720,12 @@ void __sk_mem_reclaim(struct sock *sk)
 {
 	struct proto *prot = sk->sk_prot;
 
-	atomic_sub(sk->sk_forward_alloc >> SK_MEM_QUANTUM_SHIFT,
+	atomic_long_sub(sk->sk_forward_alloc >> SK_MEM_QUANTUM_SHIFT,
 		   prot->memory_allocated);
 	sk->sk_forward_alloc &= SK_MEM_QUANTUM - 1;
 
 	if (prot->memory_pressure && *prot->memory_pressure &&
-	    (atomic_read(prot->memory_allocated) < prot->sysctl_mem[0]))
+	    (atomic_long_read(prot->memory_allocated) < prot->sysctl_mem[0]))
 		*prot->memory_pressure = 0;
 }
 EXPORT_SYMBOL(__sk_mem_reclaim);
@@ -2445,12 +2445,12 @@ static char proto_method_implemented(const void *method)
 
 static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 {
-	seq_printf(seq, "%-9s %4u %6d  %6d   %-3s %6u   %-3s  %-10s "
+	seq_printf(seq, "%-9s %4u %6d  %6ld   %-3s %6u   %-3s  %-10s "
 			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
 		   proto->name,
 		   proto->obj_size,
 		   sock_prot_inuse_get(seq_file_net(seq), proto),
-		   proto->memory_allocated != NULL ? atomic_read(proto->memory_allocated) : -1,
+		   proto->memory_allocated != NULL ? atomic_long_read(proto->memory_allocated) : -1L,
 		   proto->memory_pressure != NULL ? *proto->memory_pressure ? "yes" : "no" : "NI",
 		   proto->max_header,
 		   proto->slab == NULL ? "no" : "yes",
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index d6b93d1..a76b78d 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -155,7 +155,7 @@ static const struct proto_ops dn_proto_ops;
 static DEFINE_RWLOCK(dn_hash_lock);
 static struct hlist_head dn_sk_hash[DN_SK_HASH_SIZE];
 static struct hlist_head dn_wild_sk;
-static atomic_t decnet_memory_allocated;
+static atomic_long_t decnet_memory_allocated;
 
 static int __dn_setsockopt(struct socket *sock, int level, int optname, char __user *optval, unsigned int optlen, int flags);
 static int __dn_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen, int flags);
diff --git a/net/decnet/sysctl_net_decnet.c b/net/decnet/sysctl_net_decnet.c
index be3eb8e..28f8b5e 100644
--- a/net/decnet/sysctl_net_decnet.c
+++ b/net/decnet/sysctl_net_decnet.c
@@ -38,7 +38,7 @@ int decnet_log_martians = 1;
 int decnet_no_fc_max_cwnd = NSP_MIN_WINDOW;
 
 /* Reasonable defaults, I hope, based on tcp's defaults */
-int sysctl_decnet_mem[3] = { 768 << 3, 1024 << 3, 1536 << 3 };
+long sysctl_decnet_mem[3] = { 768 << 3, 1024 << 3, 1536 << 3 };
 int sysctl_decnet_wmem[3] = { 4 * 1024, 16 * 1024, 128 * 1024 };
 int sysctl_decnet_rmem[3] = { 4 * 1024, 87380, 87380 * 2 };
 
@@ -324,7 +324,7 @@ static ctl_table dn_table[] = {
 		.data = &sysctl_decnet_mem,
 		.maxlen = sizeof(sysctl_decnet_mem),
 		.mode = 0644,
-		.proc_handler = proc_dointvec,
+		.proc_handler = proc_doulongvec_minmax
 	},
 	{
 		.procname = "decnet_rmem",
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 4ae1f20..1b48eb1 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -59,13 +59,13 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
 	local_bh_enable();
 
 	socket_seq_show(seq);
-	seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %d\n",
+	seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %ld\n",
 		   sock_prot_inuse_get(net, &tcp_prot), orphans,
 		   tcp_death_row.tw_count, sockets,
-		   atomic_read(&tcp_memory_allocated));
-	seq_printf(seq, "UDP: inuse %d mem %d\n",
+		   atomic_long_read(&tcp_memory_allocated));
+	seq_printf(seq, "UDP: inuse %d mem %ld\n",
 		   sock_prot_inuse_get(net, &udp_prot),
-		   atomic_read(&udp_memory_allocated));
+		   atomic_long_read(&udp_memory_allocated));
 	seq_printf(seq, "UDPLITE: inuse %d\n",
 		   sock_prot_inuse_get(net, &udplite_prot));
 	seq_printf(seq, "RAW: inuse %d\n",
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index d96c1da..e91911d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -398,7 +398,7 @@ static struct ctl_table ipv4_table[] = {
 		.data		= &sysctl_tcp_mem,
 		.maxlen		= sizeof(sysctl_tcp_mem),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.proc_handler	= proc_doulongvec_minmax
 	},
 	{
 		.procname	= "tcp_wmem",
@@ -602,8 +602,7 @@ static struct ctl_table ipv4_table[] = {
 		.data		= &sysctl_udp_mem,
 		.maxlen		= sizeof(sysctl_udp_mem),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &zero
+		.proc_handler	= proc_doulongvec_minmax,
 	},
 	{
 		.procname	= "udp_rmem_min",
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f115ea6..e88d7a0 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -282,7 +282,7 @@ int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT;
 struct percpu_counter tcp_orphan_count;
 EXPORT_SYMBOL_GPL(tcp_orphan_count);
 
-int sysctl_tcp_mem[3] __read_mostly;
+long sysctl_tcp_mem[3] __read_mostly;
 int sysctl_tcp_wmem[3] __read_mostly;
 int sysctl_tcp_rmem[3] __read_mostly;
 
@@ -290,7 +290,7 @@ EXPORT_SYMBOL(sysctl_tcp_mem);
 EXPORT_SYMBOL(sysctl_tcp_rmem);
 EXPORT_SYMBOL(sysctl_tcp_wmem);
 
-atomic_t tcp_memory_allocated;	/* Current allocated memory. */
+atomic_long_t tcp_memory_allocated;	/* Current allocated memory. */
 EXPORT_SYMBOL(tcp_memory_allocated);
 
 /*
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b55f60f..0f56fb4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -259,8 +259,11 @@ static void tcp_fixup_sndbuf(struct sock *sk)
 	int sndmem = tcp_sk(sk)->rx_opt.mss_clamp + MAX_TCP_HEADER + 16 +
 		     sizeof(struct sk_buff);
 
-	if (sk->sk_sndbuf < 3 * sndmem)
-		sk->sk_sndbuf = min(3 * sndmem, sysctl_tcp_wmem[2]);
+	if (sk->sk_sndbuf < 3 * sndmem) {
+		sk->sk_sndbuf = 3 * sndmem;
+		if (sk->sk_sndbuf > sysctl_tcp_wmem[2])
+			sk->sk_sndbuf = sysctl_tcp_wmem[2];
+	}
 }
 
 /* 2. Tuning advertised window (window_clamp, rcv_ssthresh)
@@ -396,7 +399,7 @@ static void tcp_clamp_window(struct sock *sk)
 	if (sk->sk_rcvbuf < sysctl_tcp_rmem[2] &&
 	    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK) &&
 	    !tcp_memory_pressure &&
-	    atomic_read(&tcp_memory_allocated) < sysctl_tcp_mem[0]) {
+	    atomic_long_read(&tcp_memory_allocated) < sysctl_tcp_mem[0]) {
 		sk->sk_rcvbuf = min(atomic_read(&sk->sk_rmem_alloc),
 				    sysctl_tcp_rmem[2]);
 	}
@@ -4870,7 +4873,7 @@ static int tcp_should_expand_sndbuf(struct sock *sk)
 		return 0;
 
 	/* If we are under soft global TCP memory pressure, do not expand.  */
-	if (atomic_read(&tcp_memory_allocated) >= sysctl_tcp_mem[0])
+	if (atomic_long_read(&tcp_memory_allocated) >= sysctl_tcp_mem[0])
 		return 0;
 
 	/* If we filled the congestion window, do not expand.  */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fb23c2e..ecfdf2e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -110,7 +110,7 @@
 struct udp_table udp_table __read_mostly;
 EXPORT_SYMBOL(udp_table);
 
-int sysctl_udp_mem[3] __read_mostly;
+long sysctl_udp_mem[3] __read_mostly;
 EXPORT_SYMBOL(sysctl_udp_mem);
 
 int sysctl_udp_rmem_min __read_mostly;
@@ -119,7 +119,7 @@ EXPORT_SYMBOL(sysctl_udp_rmem_min);
 int sysctl_udp_wmem_min __read_mostly;
 EXPORT_SYMBOL(sysctl_udp_wmem_min);
 
-atomic_t udp_memory_allocated;
+atomic_long_t udp_memory_allocated;
 EXPORT_SYMBOL(udp_memory_allocated);
 
 #define MAX_UDP_PORTS 65536
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 5027b83..bf95400 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -90,7 +90,7 @@ static struct sctp_af *sctp_af_v6_specific;
 struct kmem_cache *sctp_chunk_cachep __read_mostly;
 struct kmem_cache *sctp_bucket_cachep __read_mostly;
 
-int sysctl_sctp_mem[3];
+long sysctl_sctp_mem[3];
 int sysctl_sctp_rmem[3];
 int sysctl_sctp_wmem[3];
 
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index ca44917..cc03b44 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -109,12 +109,12 @@ static void sctp_sock_migrate(struct sock *, struct sock *,
 static char *sctp_hmac_alg = SCTP_COOKIE_HMAC_ALG;
 
 extern struct kmem_cache *sctp_bucket_cachep;
-extern int sysctl_sctp_mem[3];
+extern long sysctl_sctp_mem[3];
 extern int sysctl_sctp_rmem[3];
 extern int sysctl_sctp_wmem[3];
 
 static int sctp_memory_pressure;
-static atomic_t sctp_memory_allocated;
+static atomic_long_t sctp_memory_allocated;
 struct percpu_counter sctp_sockets_allocated;
 
 static void sctp_enter_memory_pressure(struct sock *sk)
diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
index 832590b..50cb57f 100644
--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -54,7 +54,7 @@ static int sack_timer_max = 500;
 static int addr_scope_max = 3; /* check sctp_scope_policy_t in include/net/sctp/constants.h for max entries */
 static int rwnd_scale_max = 16;
 
-extern int sysctl_sctp_mem[3];
+extern long sysctl_sctp_mem[3];
 extern int sysctl_sctp_rmem[3];
 extern int sysctl_sctp_wmem[3];
 
@@ -203,7 +203,7 @@ static ctl_table sctp_table[] = {
 		.data		= &sysctl_sctp_mem,
 		.maxlen		= sizeof(sysctl_sctp_mem),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
+		.proc_handler	= proc_doulongvec_minmax
 	},
 	{
 		.procname	= "sctp_rmem",



^ permalink raw reply related

* [PATCH 4/4] drivers/atm/idt77252.c: Remove unnecessary error check
From: Julia Lawall @ 2010-10-02 13:59 UTC (permalink / raw)
  To: Chas Williams; +Cc: kernel-janitors, linux-atm-general, netdev, linux-kernel

This code does not call deinit_card(card); in an error case, as done in
other error-handling code in the same function.  But actually, the called
function init_sram can only return 0, so there is no need for the error
check at all.

A simplified version of the sematic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
@r@
statement S1,S2,S3;
constant C1,C2,C3;
@@

*if (...)
 {... S1 return -C1;}
...
*if (...)
 {... when != S1
    return -C2;}
...
*if (...)
 {... S1 return -C3;}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>

---
 drivers/atm/idt77252.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index 1679cbf..08ccde6 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -3410,8 +3410,7 @@ init_card(struct atm_dev *dev)
 
 	writel(readl(SAR_REG_CFG) | conf, SAR_REG_CFG);
 
-	if (init_sram(card) < 0)
-		return -1;
+	init_sram(card);
 
 /********************************************************************/
 /*  A L L O C   R A M   A N D   S E T   V A R I O U S   T H I N G S */

^ permalink raw reply related

* Re: [PATCH 4/4] drivers/atm/idt77252.c: Remove unnecessary error check
From: walter harms @ 2010-10-02 14:09 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Chas Williams, kernel-janitors, linux-atm-general, netdev,
	linux-kernel
In-Reply-To: <1286027958-7333-4-git-send-email-julia@diku.dk>



Julia Lawall schrieb:
> This code does not call deinit_card(card); in an error case, as done in
> other error-handling code in the same function.  But actually, the called
> function init_sram can only return 0, so there is no need for the error
> check at all.
> 


did you set init_sram() to void ?
re,
 wh




> A simplified version of the sematic match that finds this problem is as
> follows: (http://coccinelle.lip6.fr/)
> 
> // <smpl>
> @r exists@
> @r@
> statement S1,S2,S3;
> constant C1,C2,C3;
> @@
> 
> *if (...)
>  {... S1 return -C1;}
> ...
> *if (...)
>  {... when != S1
>     return -C2;}
> ...
> *if (...)
>  {... S1 return -C3;}
> // </smpl>
> 
> Signed-off-by: Julia Lawall <julia@diku.dk>
> 
> ---
>  drivers/atm/idt77252.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
> index 1679cbf..08ccde6 100644
> --- a/drivers/atm/idt77252.c
> +++ b/drivers/atm/idt77252.c
> @@ -3410,8 +3410,7 @@ init_card(struct atm_dev *dev)
>  
>  	writel(readl(SAR_REG_CFG) | conf, SAR_REG_CFG);
>  
> -	if (init_sram(card) < 0)
> -		return -1;
> +	init_sram(card);
>  
>  /********************************************************************/
>  /*  A L L O C   R A M   A N D   S E T   V A R I O U S   T H I N G S */
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply

* Re: [PATCH 4/4] drivers/atm/idt77252.c: Remove unnecessary error check
From: Julia Lawall @ 2010-10-02 14:21 UTC (permalink / raw)
  To: walter harms
  Cc: Chas Williams, kernel-janitors, linux-atm-general, netdev,
	linux-kernel
In-Reply-To: <4CA73D26.7090109@bfs.de>

On Sat, 2 Oct 2010, walter harms wrote:

> 
> 
> Julia Lawall schrieb:
> > This code does not call deinit_card(card); in an error case, as done in
> > other error-handling code in the same function.  But actually, the called
> > function init_sram can only return 0, so there is no need for the error
> > check at all.
> > 
> 
> 
> did you set init_sram() to void ?

No, that indeed seems like a reasonable change.  Patch shortly.

julia

^ permalink raw reply

* Attention Beneficiary,
From: Barrister.Brandon  Moore @ 2010-10-02  9:38 UTC (permalink / raw)



Attention Beneficiary,

Are you ready to pick up this $5,000.00 sent to you today. We have
concluded to effect your Compensation payment of $850.000.00 through
Western Union Money Transfer for $5,000.00 daily until your $850.000.00 is
completely transfered. Meanwhile, Western Union Office has $5,000.00 in
your name today. So contact Western Union Agent below to pick up this
$5,000 now:

Contact person: Mr.Kings Igwe ,
Western Union Office
E-mail:western.union.ofice@ufo.tc
Tele phone Number: +229-97844712.
Ask him to give you the MTCN, Sender's  name, question and answer to pick
the $5,000.00. Also you should send to him your following information.

1. Your Full Name...............................
2. Your Address.................................
3. Your Marital Status..........................
4. Occupation and Age...........................
5. Your Mobile Numbers/Fax......................
6. Your International Passport/Driving License..
7. Country .....................................

Regards,
Barrister.Brandon  Moore


 
- ILMCI Mail -
Fast and Clean Email by www.ILMCI.com


^ permalink raw reply

* Re: [PATCH 4/4] drivers/atm/idt77252.c: Remove unnecessary error check
From: Julia Lawall @ 2010-10-02 14:37 UTC (permalink / raw)
  To: walter harms
  Cc: Chas Williams, kernel-janitors, linux-atm-general, netdev,
	linux-kernel
In-Reply-To: <4CA73D26.7090109@bfs.de>

This code does not call deinit_card(card); in an error case, as done in
other error-handling code in the same function.  But actually, the called
function init_sram can only return 0, so there is no need for the error
check at all.

init_sram is also given a void return type, and its single return statement
at the end of the function is dropped.

A simplified version of the sematic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
@r@
statement S1,S2,S3;
constant C1,C2,C3;
@@

*if (...)
 {... S1 return -C1;}
...
*if (...)
 {... when != S1
    return -C2;}
...
*if (...)
 {... S1 return -C3;}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>

---
 drivers/atm/idt77252.c              |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index 1679cbf..bce5732 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -3152,7 +3152,7 @@ deinit_card(struct idt77252_dev *card)
 }
 
 
-static int __devinit
+static void __devinit
 init_sram(struct idt77252_dev *card)
 {
 	int i;
@@ -3298,7 +3298,6 @@ init_sram(struct idt77252_dev *card)
 	       SAR_REG_RXFD);
 
 	IPRINTK("%s: SRAM initialization complete.\n", card->name);
-	return 0;
 }
 
 static int __devinit
@@ -3410,8 +3409,7 @@ init_card(struct atm_dev *dev)
 
 	writel(readl(SAR_REG_CFG) | conf, SAR_REG_CFG);
 
-	if (init_sram(card) < 0)
-		return -1;
+	init_sram(card);
 
 /********************************************************************/
 /*  A L L O C   R A M   A N D   S E T   V A R I O U S   T H I N G S */

^ permalink raw reply related

* [PATCH nf-next] ipt_LOG: add bufferisation to call printk() once
From: Eric Dumazet @ 2010-10-02 15:07 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, Netfilter Development Mailinglist

ipt_LOG & ip6t_LOG use lot of calls to printk() and use a lock in a hope
several cpus wont mix their output in syslog.

printk() being very expensive [1], its better to call it once, on a
prebuilt and complete line. Also, with mixed IPv4 and IPv6 trafic,
separate IPv4/IPv6 locks dont avoid garbage.

I used an allocation of a 1024 bytes structure, sort of seq_printf() but
with a fixed size limit.
Use a static buffer if dynamic allocation failed.

Emit a once time alert if buffer size happens to be too short.

[1]: printk() has various features like printk_delay()...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/netfilter/xt_log.h |   54 ++++++++++
 net/ipv4/netfilter/ipt_LOG.c   |  145 ++++++++++++++--------------
 net/ipv6/netfilter/ip6t_LOG.c  |  157 +++++++++++++++----------------
 3 files changed, 206 insertions(+), 150 deletions(-)

diff --git a/include/net/netfilter/xt_log.h b/include/net/netfilter/xt_log.h
index e69de29..0dfb34a 100644
--- a/include/net/netfilter/xt_log.h
+++ b/include/net/netfilter/xt_log.h
@@ -0,0 +1,54 @@
+#define S_SIZE (1024 - (sizeof(unsigned int) + 1))
+
+struct sbuff {
+	unsigned int	count;
+	char		buf[S_SIZE + 1];
+};
+static struct sbuff emergency, *emergency_ptr = &emergency;
+
+static int sb_add(struct sbuff *m, const char *f, ...)
+{
+	va_list args;
+	int len;
+
+	if (likely(m->count < S_SIZE)) {
+		va_start(args, f);
+		len = vsnprintf(m->buf + m->count, S_SIZE - m->count, f, args);
+		va_end(args);
+		if (likely(m->count + len < S_SIZE)) {
+			m->count += len;
+			return 0;
+		}
+	}
+	m->count = S_SIZE;
+	printk_once(KERN_ERR KBUILD_MODNAME " please increase S_SIZE\n");
+	return -1;
+}
+
+static struct sbuff *sb_open(void)
+{
+	struct sbuff *m = kmalloc(sizeof(*m), GFP_ATOMIC);
+
+	if (unlikely(!m)) {
+		local_bh_disable();
+		do {
+			m = xchg(&emergency_ptr, NULL);
+		} while (!m);
+	}
+	m->count = 0;
+	return m;
+}
+
+static void sb_close(struct sbuff *m)
+{
+	m->buf[m->count] = 0;
+	printk("%s\n", m->buf);
+
+	if (likely(m != &emergency))
+		kfree(m);
+	else {
+		xchg(&emergency_ptr, m);
+		local_bh_enable();
+	}
+}
+
diff --git a/net/ipv4/netfilter/ipt_LOG.c b/net/ipv4/netfilter/ipt_LOG.c
index 915fc17..72ffc8f 100644
--- a/net/ipv4/netfilter/ipt_LOG.c
+++ b/net/ipv4/netfilter/ipt_LOG.c
@@ -24,16 +24,15 @@
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter_ipv4/ipt_LOG.h>
 #include <net/netfilter/nf_log.h>
+#include <net/netfilter/xt_log.h>
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>");
 MODULE_DESCRIPTION("Xtables: IPv4 packet logging to syslog");
 
-/* Use lock to serialize, so printks don't overlap */
-static DEFINE_SPINLOCK(log_lock);
-
 /* One level of recursion won't kill us */
-static void dump_packet(const struct nf_loginfo *info,
+static void dump_packet(struct sbuff *m,
+			const struct nf_loginfo *info,
 			const struct sk_buff *skb,
 			unsigned int iphoff)
 {
@@ -48,32 +47,32 @@ static void dump_packet(const struct nf_loginfo *info,
 
 	ih = skb_header_pointer(skb, iphoff, sizeof(_iph), &_iph);
 	if (ih == NULL) {
-		printk("TRUNCATED");
+		sb_add(m, "TRUNCATED");
 		return;
 	}
 
 	/* Important fields:
 	 * TOS, len, DF/MF, fragment offset, TTL, src, dst, options. */
 	/* Max length: 40 "SRC=255.255.255.255 DST=255.255.255.255 " */
-	printk("SRC=%pI4 DST=%pI4 ",
+	sb_add(m, "SRC=%pI4 DST=%pI4 ",
 	       &ih->saddr, &ih->daddr);
 
 	/* Max length: 46 "LEN=65535 TOS=0xFF PREC=0xFF TTL=255 ID=65535 " */
-	printk("LEN=%u TOS=0x%02X PREC=0x%02X TTL=%u ID=%u ",
+	sb_add(m, "LEN=%u TOS=0x%02X PREC=0x%02X TTL=%u ID=%u ",
 	       ntohs(ih->tot_len), ih->tos & IPTOS_TOS_MASK,
 	       ih->tos & IPTOS_PREC_MASK, ih->ttl, ntohs(ih->id));
 
 	/* Max length: 6 "CE DF MF " */
 	if (ntohs(ih->frag_off) & IP_CE)
-		printk("CE ");
+		sb_add(m, "CE ");
 	if (ntohs(ih->frag_off) & IP_DF)
-		printk("DF ");
+		sb_add(m, "DF ");
 	if (ntohs(ih->frag_off) & IP_MF)
-		printk("MF ");
+		sb_add(m, "MF ");
 
 	/* Max length: 11 "FRAG:65535 " */
 	if (ntohs(ih->frag_off) & IP_OFFSET)
-		printk("FRAG:%u ", ntohs(ih->frag_off) & IP_OFFSET);
+		sb_add(m, "FRAG:%u ", ntohs(ih->frag_off) & IP_OFFSET);
 
 	if ((logflags & IPT_LOG_IPOPT) &&
 	    ih->ihl * 4 > sizeof(struct iphdr)) {
@@ -85,15 +84,15 @@ static void dump_packet(const struct nf_loginfo *info,
 		op = skb_header_pointer(skb, iphoff+sizeof(_iph),
 					optsize, _opt);
 		if (op == NULL) {
-			printk("TRUNCATED");
+			sb_add(m, "TRUNCATED");
 			return;
 		}
 
 		/* Max length: 127 "OPT (" 15*4*2chars ") " */
-		printk("OPT (");
+		sb_add(m, "OPT (");
 		for (i = 0; i < optsize; i++)
-			printk("%02X", op[i]);
-		printk(") ");
+			sb_add(m, "%02X", op[i]);
+		sb_add(m, ") ");
 	}
 
 	switch (ih->protocol) {
@@ -102,7 +101,7 @@ static void dump_packet(const struct nf_loginfo *info,
 		const struct tcphdr *th;
 
 		/* Max length: 10 "PROTO=TCP " */
-		printk("PROTO=TCP ");
+		sb_add(m, "PROTO=TCP ");
 
 		if (ntohs(ih->frag_off) & IP_OFFSET)
 			break;
@@ -111,41 +110,41 @@ static void dump_packet(const struct nf_loginfo *info,
 		th = skb_header_pointer(skb, iphoff + ih->ihl * 4,
 					sizeof(_tcph), &_tcph);
 		if (th == NULL) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
 
 		/* Max length: 20 "SPT=65535 DPT=65535 " */
-		printk("SPT=%u DPT=%u ",
+		sb_add(m, "SPT=%u DPT=%u ",
 		       ntohs(th->source), ntohs(th->dest));
 		/* Max length: 30 "SEQ=4294967295 ACK=4294967295 " */
 		if (logflags & IPT_LOG_TCPSEQ)
-			printk("SEQ=%u ACK=%u ",
+			sb_add(m, "SEQ=%u ACK=%u ",
 			       ntohl(th->seq), ntohl(th->ack_seq));
 		/* Max length: 13 "WINDOW=65535 " */
-		printk("WINDOW=%u ", ntohs(th->window));
+		sb_add(m, "WINDOW=%u ", ntohs(th->window));
 		/* Max length: 9 "RES=0x3F " */
-		printk("RES=0x%02x ", (u8)(ntohl(tcp_flag_word(th) & TCP_RESERVED_BITS) >> 22));
+		sb_add(m, "RES=0x%02x ", (u8)(ntohl(tcp_flag_word(th) & TCP_RESERVED_BITS) >> 22));
 		/* Max length: 32 "CWR ECE URG ACK PSH RST SYN FIN " */
 		if (th->cwr)
-			printk("CWR ");
+			sb_add(m, "CWR ");
 		if (th->ece)
-			printk("ECE ");
+			sb_add(m, "ECE ");
 		if (th->urg)
-			printk("URG ");
+			sb_add(m, "URG ");
 		if (th->ack)
-			printk("ACK ");
+			sb_add(m, "ACK ");
 		if (th->psh)
-			printk("PSH ");
+			sb_add(m, "PSH ");
 		if (th->rst)
-			printk("RST ");
+			sb_add(m, "RST ");
 		if (th->syn)
-			printk("SYN ");
+			sb_add(m, "SYN ");
 		if (th->fin)
-			printk("FIN ");
+			sb_add(m, "FIN ");
 		/* Max length: 11 "URGP=65535 " */
-		printk("URGP=%u ", ntohs(th->urg_ptr));
+		sb_add(m, "URGP=%u ", ntohs(th->urg_ptr));
 
 		if ((logflags & IPT_LOG_TCPOPT) &&
 		    th->doff * 4 > sizeof(struct tcphdr)) {
@@ -158,15 +157,15 @@ static void dump_packet(const struct nf_loginfo *info,
 						iphoff+ih->ihl*4+sizeof(_tcph),
 						optsize, _opt);
 			if (op == NULL) {
-				printk("TRUNCATED");
+				sb_add(m, "TRUNCATED");
 				return;
 			}
 
 			/* Max length: 127 "OPT (" 15*4*2chars ") " */
-			printk("OPT (");
+			sb_add(m, "OPT (");
 			for (i = 0; i < optsize; i++)
-				printk("%02X", op[i]);
-			printk(") ");
+				sb_add(m, "%02X", op[i]);
+			sb_add(m, ") ");
 		}
 		break;
 	}
@@ -177,9 +176,9 @@ static void dump_packet(const struct nf_loginfo *info,
 
 		if (ih->protocol == IPPROTO_UDP)
 			/* Max length: 10 "PROTO=UDP "     */
-			printk("PROTO=UDP " );
+			sb_add(m, "PROTO=UDP " );
 		else	/* Max length: 14 "PROTO=UDPLITE " */
-			printk("PROTO=UDPLITE ");
+			sb_add(m, "PROTO=UDPLITE ");
 
 		if (ntohs(ih->frag_off) & IP_OFFSET)
 			break;
@@ -188,13 +187,13 @@ static void dump_packet(const struct nf_loginfo *info,
 		uh = skb_header_pointer(skb, iphoff+ih->ihl*4,
 					sizeof(_udph), &_udph);
 		if (uh == NULL) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
 
 		/* Max length: 20 "SPT=65535 DPT=65535 " */
-		printk("SPT=%u DPT=%u LEN=%u ",
+		sb_add(m, "SPT=%u DPT=%u LEN=%u ",
 		       ntohs(uh->source), ntohs(uh->dest),
 		       ntohs(uh->len));
 		break;
@@ -221,7 +220,7 @@ static void dump_packet(const struct nf_loginfo *info,
 			    [ICMP_ADDRESSREPLY] = 12 };
 
 		/* Max length: 11 "PROTO=ICMP " */
-		printk("PROTO=ICMP ");
+		sb_add(m, "PROTO=ICMP ");
 
 		if (ntohs(ih->frag_off) & IP_OFFSET)
 			break;
@@ -230,19 +229,19 @@ static void dump_packet(const struct nf_loginfo *info,
 		ich = skb_header_pointer(skb, iphoff + ih->ihl * 4,
 					 sizeof(_icmph), &_icmph);
 		if (ich == NULL) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
 
 		/* Max length: 18 "TYPE=255 CODE=255 " */
-		printk("TYPE=%u CODE=%u ", ich->type, ich->code);
+		sb_add(m, "TYPE=%u CODE=%u ", ich->type, ich->code);
 
 		/* Max length: 25 "INCOMPLETE [65535 bytes] " */
 		if (ich->type <= NR_ICMP_TYPES &&
 		    required_len[ich->type] &&
 		    skb->len-iphoff-ih->ihl*4 < required_len[ich->type]) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
@@ -251,35 +250,35 @@ static void dump_packet(const struct nf_loginfo *info,
 		case ICMP_ECHOREPLY:
 		case ICMP_ECHO:
 			/* Max length: 19 "ID=65535 SEQ=65535 " */
-			printk("ID=%u SEQ=%u ",
+			sb_add(m, "ID=%u SEQ=%u ",
 			       ntohs(ich->un.echo.id),
 			       ntohs(ich->un.echo.sequence));
 			break;
 
 		case ICMP_PARAMETERPROB:
 			/* Max length: 14 "PARAMETER=255 " */
-			printk("PARAMETER=%u ",
+			sb_add(m, "PARAMETER=%u ",
 			       ntohl(ich->un.gateway) >> 24);
 			break;
 		case ICMP_REDIRECT:
 			/* Max length: 24 "GATEWAY=255.255.255.255 " */
-			printk("GATEWAY=%pI4 ", &ich->un.gateway);
+			sb_add(m, "GATEWAY=%pI4 ", &ich->un.gateway);
 			/* Fall through */
 		case ICMP_DEST_UNREACH:
 		case ICMP_SOURCE_QUENCH:
 		case ICMP_TIME_EXCEEDED:
 			/* Max length: 3+maxlen */
 			if (!iphoff) { /* Only recurse once. */
-				printk("[");
-				dump_packet(info, skb,
+				sb_add(m, "[");
+				dump_packet(m, info, skb,
 					    iphoff + ih->ihl*4+sizeof(_icmph));
-				printk("] ");
+				sb_add(m, "] ");
 			}
 
 			/* Max length: 10 "MTU=65535 " */
 			if (ich->type == ICMP_DEST_UNREACH &&
 			    ich->code == ICMP_FRAG_NEEDED)
-				printk("MTU=%u ", ntohs(ich->un.frag.mtu));
+				sb_add(m, "MTU=%u ", ntohs(ich->un.frag.mtu));
 		}
 		break;
 	}
@@ -292,19 +291,19 @@ static void dump_packet(const struct nf_loginfo *info,
 			break;
 
 		/* Max length: 9 "PROTO=AH " */
-		printk("PROTO=AH ");
+		sb_add(m, "PROTO=AH ");
 
 		/* Max length: 25 "INCOMPLETE [65535 bytes] " */
 		ah = skb_header_pointer(skb, iphoff+ih->ihl*4,
 					sizeof(_ahdr), &_ahdr);
 		if (ah == NULL) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
 
 		/* Length: 15 "SPI=0xF1234567 " */
-		printk("SPI=0x%x ", ntohl(ah->spi));
+		sb_add(m, "SPI=0x%x ", ntohl(ah->spi));
 		break;
 	}
 	case IPPROTO_ESP: {
@@ -312,7 +311,7 @@ static void dump_packet(const struct nf_loginfo *info,
 		const struct ip_esp_hdr *eh;
 
 		/* Max length: 10 "PROTO=ESP " */
-		printk("PROTO=ESP ");
+		sb_add(m, "PROTO=ESP ");
 
 		if (ntohs(ih->frag_off) & IP_OFFSET)
 			break;
@@ -321,25 +320,25 @@ static void dump_packet(const struct nf_loginfo *info,
 		eh = skb_header_pointer(skb, iphoff+ih->ihl*4,
 					sizeof(_esph), &_esph);
 		if (eh == NULL) {
-			printk("INCOMPLETE [%u bytes] ",
+			sb_add(m, "INCOMPLETE [%u bytes] ",
 			       skb->len - iphoff - ih->ihl*4);
 			break;
 		}
 
 		/* Length: 15 "SPI=0xF1234567 " */
-		printk("SPI=0x%x ", ntohl(eh->spi));
+		sb_add(m, "SPI=0x%x ", ntohl(eh->spi));
 		break;
 	}
 	/* Max length: 10 "PROTO 255 " */
 	default:
-		printk("PROTO=%u ", ih->protocol);
+		sb_add(m, "PROTO=%u ", ih->protocol);
 	}
 
 	/* Max length: 15 "UID=4294967295 " */
 	if ((logflags & IPT_LOG_UID) && !iphoff && skb->sk) {
 		read_lock_bh(&skb->sk->sk_callback_lock);
 		if (skb->sk->sk_socket && skb->sk->sk_socket->file)
-			printk("UID=%u GID=%u ",
+			sb_add(m, "UID=%u GID=%u ",
 				skb->sk->sk_socket->file->f_cred->fsuid,
 				skb->sk->sk_socket->file->f_cred->fsgid);
 		read_unlock_bh(&skb->sk->sk_callback_lock);
@@ -347,7 +346,7 @@ static void dump_packet(const struct nf_loginfo *info,
 
 	/* Max length: 16 "MARK=0xFFFFFFFF " */
 	if (!iphoff && skb->mark)
-		printk("MARK=0x%x ", skb->mark);
+		sb_add(m, "MARK=0x%x ", skb->mark);
 
 	/* Proto    Max log string length */
 	/* IP:      40+46+6+11+127 = 230 */
@@ -364,7 +363,8 @@ static void dump_packet(const struct nf_loginfo *info,
 	/* maxlen = 230+   91  + 230 + 252 = 803 */
 }
 
-static void dump_mac_header(const struct nf_loginfo *info,
+static void dump_mac_header(struct sbuff *m,
+			    const struct nf_loginfo *info,
 			    const struct sk_buff *skb)
 {
 	struct net_device *dev = skb->dev;
@@ -378,7 +378,7 @@ static void dump_mac_header(const struct nf_loginfo *info,
 
 	switch (dev->type) {
 	case ARPHRD_ETHER:
-		printk("MACSRC=%pM MACDST=%pM MACPROTO=%04x ",
+		sb_add(m, "MACSRC=%pM MACDST=%pM MACPROTO=%04x ",
 		       eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest,
 		       ntohs(eth_hdr(skb)->h_proto));
 		return;
@@ -387,17 +387,17 @@ static void dump_mac_header(const struct nf_loginfo *info,
 	}
 
 fallback:
-	printk("MAC=");
+	sb_add(m, "MAC=");
 	if (dev->hard_header_len &&
 	    skb->mac_header != skb->network_header) {
 		const unsigned char *p = skb_mac_header(skb);
 		unsigned int i;
 
-		printk("%02x", *p++);
+		sb_add(m, "%02x", *p++);
 		for (i = 1; i < dev->hard_header_len; i++, p++)
-			printk(":%02x", *p);
+			sb_add(m, ":%02x", *p);
 	}
-	printk(" ");
+	sb_add(m, " ");
 }
 
 static struct nf_loginfo default_loginfo = {
@@ -419,11 +419,12 @@ ipt_log_packet(u_int8_t pf,
 	       const struct nf_loginfo *loginfo,
 	       const char *prefix)
 {
+	struct sbuff *m = sb_open();
+
 	if (!loginfo)
 		loginfo = &default_loginfo;
 
-	spin_lock_bh(&log_lock);
-	printk("<%d>%sIN=%s OUT=%s ", loginfo->u.log.level,
+	sb_add(m, "<%d>%sIN=%s OUT=%s ", loginfo->u.log.level,
 	       prefix,
 	       in ? in->name : "",
 	       out ? out->name : "");
@@ -434,20 +435,20 @@ ipt_log_packet(u_int8_t pf,
 
 		physindev = skb->nf_bridge->physindev;
 		if (physindev && in != physindev)
-			printk("PHYSIN=%s ", physindev->name);
+			sb_add(m, "PHYSIN=%s ", physindev->name);
 		physoutdev = skb->nf_bridge->physoutdev;
 		if (physoutdev && out != physoutdev)
-			printk("PHYSOUT=%s ", physoutdev->name);
+			sb_add(m, "PHYSOUT=%s ", physoutdev->name);
 	}
 #endif
 
 	/* MAC logging for input path only. */
 	if (in && !out)
-		dump_mac_header(loginfo, skb);
+		dump_mac_header(m, loginfo, skb);
+
+	dump_packet(m, loginfo, skb, 0);
 
-	dump_packet(loginfo, skb, 0);
-	printk("\n");
-	spin_unlock_bh(&log_lock);
+	sb_close(m);
 }
 
 static unsigned int
diff --git a/net/ipv6/netfilter/ip6t_LOG.c b/net/ipv6/netfilter/ip6t_LOG.c
index 0a07ae7..09c8889 100644
--- a/net/ipv6/netfilter/ip6t_LOG.c
+++ b/net/ipv6/netfilter/ip6t_LOG.c
@@ -23,6 +23,7 @@
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter_ipv6/ip6_tables.h>
 #include <net/netfilter/nf_log.h>
+#include <net/netfilter/xt_log.h>
 
 MODULE_AUTHOR("Jan Rekorajski <baggins@pld.org.pl>");
 MODULE_DESCRIPTION("Xtables: IPv6 packet logging to syslog");
@@ -32,11 +33,9 @@ struct in_device;
 #include <net/route.h>
 #include <linux/netfilter_ipv6/ip6t_LOG.h>
 
-/* Use lock to serialize, so printks don't overlap */
-static DEFINE_SPINLOCK(log_lock);
-
 /* One level of recursion won't kill us */
-static void dump_packet(const struct nf_loginfo *info,
+static void dump_packet(struct sbuff *m,
+			const struct nf_loginfo *info,
 			const struct sk_buff *skb, unsigned int ip6hoff,
 			int recurse)
 {
@@ -55,15 +54,15 @@ static void dump_packet(const struct nf_loginfo *info,
 
 	ih = skb_header_pointer(skb, ip6hoff, sizeof(_ip6h), &_ip6h);
 	if (ih == NULL) {
-		printk("TRUNCATED");
+		sb_add(m, "TRUNCATED");
 		return;
 	}
 
 	/* Max length: 88 "SRC=0000.0000.0000.0000.0000.0000.0000.0000 DST=0000.0000.0000.0000.0000.0000.0000.0000 " */
-	printk("SRC=%pI6 DST=%pI6 ", &ih->saddr, &ih->daddr);
+	sb_add(m, "SRC=%pI6 DST=%pI6 ", &ih->saddr, &ih->daddr);
 
 	/* Max length: 44 "LEN=65535 TC=255 HOPLIMIT=255 FLOWLBL=FFFFF " */
-	printk("LEN=%Zu TC=%u HOPLIMIT=%u FLOWLBL=%u ",
+	sb_add(m, "LEN=%Zu TC=%u HOPLIMIT=%u FLOWLBL=%u ",
 	       ntohs(ih->payload_len) + sizeof(struct ipv6hdr),
 	       (ntohl(*(__be32 *)ih) & 0x0ff00000) >> 20,
 	       ih->hop_limit,
@@ -78,35 +77,35 @@ static void dump_packet(const struct nf_loginfo *info,
 
 		hp = skb_header_pointer(skb, ptr, sizeof(_hdr), &_hdr);
 		if (hp == NULL) {
-			printk("TRUNCATED");
+			sb_add(m, "TRUNCATED");
 			return;
 		}
 
 		/* Max length: 48 "OPT (...) " */
 		if (logflags & IP6T_LOG_IPOPT)
-			printk("OPT ( ");
+			sb_add(m, "OPT ( ");
 
 		switch (currenthdr) {
 		case IPPROTO_FRAGMENT: {
 			struct frag_hdr _fhdr;
 			const struct frag_hdr *fh;
 
-			printk("FRAG:");
+			sb_add(m, "FRAG:");
 			fh = skb_header_pointer(skb, ptr, sizeof(_fhdr),
 						&_fhdr);
 			if (fh == NULL) {
-				printk("TRUNCATED ");
+				sb_add(m, "TRUNCATED ");
 				return;
 			}
 
 			/* Max length: 6 "65535 " */
-			printk("%u ", ntohs(fh->frag_off) & 0xFFF8);
+			sb_add(m, "%u ", ntohs(fh->frag_off) & 0xFFF8);
 
 			/* Max length: 11 "INCOMPLETE " */
 			if (fh->frag_off & htons(0x0001))
-				printk("INCOMPLETE ");
+				sb_add(m, "INCOMPLETE ");
 
-			printk("ID:%08x ", ntohl(fh->identification));
+			sb_add(m, "ID:%08x ", ntohl(fh->identification));
 
 			if (ntohs(fh->frag_off) & 0xFFF8)
 				fragment = 1;
@@ -120,7 +119,7 @@ static void dump_packet(const struct nf_loginfo *info,
 		case IPPROTO_HOPOPTS:
 			if (fragment) {
 				if (logflags & IP6T_LOG_IPOPT)
-					printk(")");
+					sb_add(m, ")");
 				return;
 			}
 			hdrlen = ipv6_optlen(hp);
@@ -132,10 +131,10 @@ static void dump_packet(const struct nf_loginfo *info,
 				const struct ip_auth_hdr *ah;
 
 				/* Max length: 3 "AH " */
-				printk("AH ");
+				sb_add(m, "AH ");
 
 				if (fragment) {
-					printk(")");
+					sb_add(m, ")");
 					return;
 				}
 
@@ -146,13 +145,13 @@ static void dump_packet(const struct nf_loginfo *info,
 					 * Max length: 26 "INCOMPLETE [65535
 					 *  bytes] )"
 					 */
-					printk("INCOMPLETE [%u bytes] )",
+					sb_add(m, "INCOMPLETE [%u bytes] )",
 					       skb->len - ptr);
 					return;
 				}
 
 				/* Length: 15 "SPI=0xF1234567 */
-				printk("SPI=0x%x ", ntohl(ah->spi));
+				sb_add(m, "SPI=0x%x ", ntohl(ah->spi));
 
 			}
 
@@ -164,10 +163,10 @@ static void dump_packet(const struct nf_loginfo *info,
 				const struct ip_esp_hdr *eh;
 
 				/* Max length: 4 "ESP " */
-				printk("ESP ");
+				sb_add(m, "ESP ");
 
 				if (fragment) {
-					printk(")");
+					sb_add(m, ")");
 					return;
 				}
 
@@ -177,23 +176,23 @@ static void dump_packet(const struct nf_loginfo *info,
 				eh = skb_header_pointer(skb, ptr, sizeof(_esph),
 							&_esph);
 				if (eh == NULL) {
-					printk("INCOMPLETE [%u bytes] )",
+					sb_add(m, "INCOMPLETE [%u bytes] )",
 					       skb->len - ptr);
 					return;
 				}
 
 				/* Length: 16 "SPI=0xF1234567 )" */
-				printk("SPI=0x%x )", ntohl(eh->spi) );
+				sb_add(m, "SPI=0x%x )", ntohl(eh->spi) );
 
 			}
 			return;
 		default:
 			/* Max length: 20 "Unknown Ext Hdr 255" */
-			printk("Unknown Ext Hdr %u", currenthdr);
+			sb_add(m, "Unknown Ext Hdr %u", currenthdr);
 			return;
 		}
 		if (logflags & IP6T_LOG_IPOPT)
-			printk(") ");
+			sb_add(m, ") ");
 
 		currenthdr = hp->nexthdr;
 		ptr += hdrlen;
@@ -205,7 +204,7 @@ static void dump_packet(const struct nf_loginfo *info,
 		const struct tcphdr *th;
 
 		/* Max length: 10 "PROTO=TCP " */
-		printk("PROTO=TCP ");
+		sb_add(m, "PROTO=TCP ");
 
 		if (fragment)
 			break;
@@ -213,40 +212,40 @@ static void dump_packet(const struct nf_loginfo *info,
 		/* Max length: 25 "INCOMPLETE [65535 bytes] " */
 		th = skb_header_pointer(skb, ptr, sizeof(_tcph), &_tcph);
 		if (th == NULL) {
-			printk("INCOMPLETE [%u bytes] ", skb->len - ptr);
+			sb_add(m, "INCOMPLETE [%u bytes] ", skb->len - ptr);
 			return;
 		}
 
 		/* Max length: 20 "SPT=65535 DPT=65535 " */
-		printk("SPT=%u DPT=%u ",
+		sb_add(m, "SPT=%u DPT=%u ",
 		       ntohs(th->source), ntohs(th->dest));
 		/* Max length: 30 "SEQ=4294967295 ACK=4294967295 " */
 		if (logflags & IP6T_LOG_TCPSEQ)
-			printk("SEQ=%u ACK=%u ",
+			sb_add(m, "SEQ=%u ACK=%u ",
 			       ntohl(th->seq), ntohl(th->ack_seq));
 		/* Max length: 13 "WINDOW=65535 " */
-		printk("WINDOW=%u ", ntohs(th->window));
+		sb_add(m, "WINDOW=%u ", ntohs(th->window));
 		/* Max length: 9 "RES=0x3C " */
-		printk("RES=0x%02x ", (u_int8_t)(ntohl(tcp_flag_word(th) & TCP_RESERVED_BITS) >> 22));
+		sb_add(m, "RES=0x%02x ", (u_int8_t)(ntohl(tcp_flag_word(th) & TCP_RESERVED_BITS) >> 22));
 		/* Max length: 32 "CWR ECE URG ACK PSH RST SYN FIN " */
 		if (th->cwr)
-			printk("CWR ");
+			sb_add(m, "CWR ");
 		if (th->ece)
-			printk("ECE ");
+			sb_add(m, "ECE ");
 		if (th->urg)
-			printk("URG ");
+			sb_add(m, "URG ");
 		if (th->ack)
-			printk("ACK ");
+			sb_add(m, "ACK ");
 		if (th->psh)
-			printk("PSH ");
+			sb_add(m, "PSH ");
 		if (th->rst)
-			printk("RST ");
+			sb_add(m, "RST ");
 		if (th->syn)
-			printk("SYN ");
+			sb_add(m, "SYN ");
 		if (th->fin)
-			printk("FIN ");
+			sb_add(m, "FIN ");
 		/* Max length: 11 "URGP=65535 " */
-		printk("URGP=%u ", ntohs(th->urg_ptr));
+		sb_add(m, "URGP=%u ", ntohs(th->urg_ptr));
 
 		if ((logflags & IP6T_LOG_TCPOPT) &&
 		    th->doff * 4 > sizeof(struct tcphdr)) {
@@ -260,15 +259,15 @@ static void dump_packet(const struct nf_loginfo *info,
 						ptr + sizeof(struct tcphdr),
 						optsize, _opt);
 			if (op == NULL) {
-				printk("OPT (TRUNCATED)");
+				sb_add(m, "OPT (TRUNCATED)");
 				return;
 			}
 
 			/* Max length: 127 "OPT (" 15*4*2chars ") " */
-			printk("OPT (");
+			sb_add(m, "OPT (");
 			for (i =0; i < optsize; i++)
-				printk("%02X", op[i]);
-			printk(") ");
+				sb_add(m, "%02X", op[i]);
+			sb_add(m, ") ");
 		}
 		break;
 	}
@@ -279,9 +278,9 @@ static void dump_packet(const struct nf_loginfo *info,
 
 		if (currenthdr == IPPROTO_UDP)
 			/* Max length: 10 "PROTO=UDP "     */
-			printk("PROTO=UDP " );
+			sb_add(m, "PROTO=UDP " );
 		else	/* Max length: 14 "PROTO=UDPLITE " */
-			printk("PROTO=UDPLITE ");
+			sb_add(m, "PROTO=UDPLITE ");
 
 		if (fragment)
 			break;
@@ -289,12 +288,12 @@ static void dump_packet(const struct nf_loginfo *info,
 		/* Max length: 25 "INCOMPLETE [65535 bytes] " */
 		uh = skb_header_pointer(skb, ptr, sizeof(_udph), &_udph);
 		if (uh == NULL) {
-			printk("INCOMPLETE [%u bytes] ", skb->len - ptr);
+			sb_add(m, "INCOMPLETE [%u bytes] ", skb->len - ptr);
 			return;
 		}
 
 		/* Max length: 20 "SPT=65535 DPT=65535 " */
-		printk("SPT=%u DPT=%u LEN=%u ",
+		sb_add(m, "SPT=%u DPT=%u LEN=%u ",
 		       ntohs(uh->source), ntohs(uh->dest),
 		       ntohs(uh->len));
 		break;
@@ -304,7 +303,7 @@ static void dump_packet(const struct nf_loginfo *info,
 		const struct icmp6hdr *ic;
 
 		/* Max length: 13 "PROTO=ICMPv6 " */
-		printk("PROTO=ICMPv6 ");
+		sb_add(m, "PROTO=ICMPv6 ");
 
 		if (fragment)
 			break;
@@ -312,18 +311,18 @@ static void dump_packet(const struct nf_loginfo *info,
 		/* Max length: 25 "INCOMPLETE [65535 bytes] " */
 		ic = skb_header_pointer(skb, ptr, sizeof(_icmp6h), &_icmp6h);
 		if (ic == NULL) {
-			printk("INCOMPLETE [%u bytes] ", skb->len - ptr);
+			sb_add(m, "INCOMPLETE [%u bytes] ", skb->len - ptr);
 			return;
 		}
 
 		/* Max length: 18 "TYPE=255 CODE=255 " */
-		printk("TYPE=%u CODE=%u ", ic->icmp6_type, ic->icmp6_code);
+		sb_add(m, "TYPE=%u CODE=%u ", ic->icmp6_type, ic->icmp6_code);
 
 		switch (ic->icmp6_type) {
 		case ICMPV6_ECHO_REQUEST:
 		case ICMPV6_ECHO_REPLY:
 			/* Max length: 19 "ID=65535 SEQ=65535 " */
-			printk("ID=%u SEQ=%u ",
+			sb_add(m, "ID=%u SEQ=%u ",
 				ntohs(ic->icmp6_identifier),
 				ntohs(ic->icmp6_sequence));
 			break;
@@ -334,35 +333,35 @@ static void dump_packet(const struct nf_loginfo *info,
 
 		case ICMPV6_PARAMPROB:
 			/* Max length: 17 "POINTER=ffffffff " */
-			printk("POINTER=%08x ", ntohl(ic->icmp6_pointer));
+			sb_add(m, "POINTER=%08x ", ntohl(ic->icmp6_pointer));
 			/* Fall through */
 		case ICMPV6_DEST_UNREACH:
 		case ICMPV6_PKT_TOOBIG:
 		case ICMPV6_TIME_EXCEED:
 			/* Max length: 3+maxlen */
 			if (recurse) {
-				printk("[");
-				dump_packet(info, skb, ptr + sizeof(_icmp6h),
-					    0);
-				printk("] ");
+				sb_add(m, "[");
+				dump_packet(m, info, skb,
+					    ptr + sizeof(_icmp6h), 0);
+				sb_add(m, "] ");
 			}
 
 			/* Max length: 10 "MTU=65535 " */
 			if (ic->icmp6_type == ICMPV6_PKT_TOOBIG)
-				printk("MTU=%u ", ntohl(ic->icmp6_mtu));
+				sb_add(m, "MTU=%u ", ntohl(ic->icmp6_mtu));
 		}
 		break;
 	}
 	/* Max length: 10 "PROTO=255 " */
 	default:
-		printk("PROTO=%u ", currenthdr);
+		sb_add(m, "PROTO=%u ", currenthdr);
 	}
 
 	/* Max length: 15 "UID=4294967295 " */
 	if ((logflags & IP6T_LOG_UID) && recurse && skb->sk) {
 		read_lock_bh(&skb->sk->sk_callback_lock);
 		if (skb->sk->sk_socket && skb->sk->sk_socket->file)
-			printk("UID=%u GID=%u ",
+			sb_add(m, "UID=%u GID=%u ",
 				skb->sk->sk_socket->file->f_cred->fsuid,
 				skb->sk->sk_socket->file->f_cred->fsgid);
 		read_unlock_bh(&skb->sk->sk_callback_lock);
@@ -370,10 +369,11 @@ static void dump_packet(const struct nf_loginfo *info,
 
 	/* Max length: 16 "MARK=0xFFFFFFFF " */
 	if (!recurse && skb->mark)
-		printk("MARK=0x%x ", skb->mark);
+		sb_add(m, "MARK=0x%x ", skb->mark);
 }
 
-static void dump_mac_header(const struct nf_loginfo *info,
+static void dump_mac_header(struct sbuff *m,
+			    const struct nf_loginfo *info,
 			    const struct sk_buff *skb)
 {
 	struct net_device *dev = skb->dev;
@@ -387,7 +387,7 @@ static void dump_mac_header(const struct nf_loginfo *info,
 
 	switch (dev->type) {
 	case ARPHRD_ETHER:
-		printk("MACSRC=%pM MACDST=%pM MACPROTO=%04x ",
+		sb_add(m, "MACSRC=%pM MACDST=%pM MACPROTO=%04x ",
 		       eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest,
 		       ntohs(eth_hdr(skb)->h_proto));
 		return;
@@ -396,7 +396,7 @@ static void dump_mac_header(const struct nf_loginfo *info,
 	}
 
 fallback:
-	printk("MAC=");
+	sb_add(m, "MAC=");
 	if (dev->hard_header_len &&
 	    skb->mac_header != skb->network_header) {
 		const unsigned char *p = skb_mac_header(skb);
@@ -408,19 +408,19 @@ fallback:
 			p = NULL;
 
 		if (p != NULL) {
-			printk("%02x", *p++);
+			sb_add(m, "%02x", *p++);
 			for (i = 1; i < len; i++)
-				printk(":%02x", p[i]);
+				sb_add(m, ":%02x", p[i]);
 		}
-		printk(" ");
+		sb_add(m, " ");
 
 		if (dev->type == ARPHRD_SIT) {
 			const struct iphdr *iph =
 				(struct iphdr *)skb_mac_header(skb);
-			printk("TUNNEL=%pI4->%pI4 ", &iph->saddr, &iph->daddr);
+			sb_add(m, "TUNNEL=%pI4->%pI4 ", &iph->saddr, &iph->daddr);
 		}
 	} else
-		printk(" ");
+		sb_add(m, " ");
 }
 
 static struct nf_loginfo default_loginfo = {
@@ -442,22 +442,23 @@ ip6t_log_packet(u_int8_t pf,
 		const struct nf_loginfo *loginfo,
 		const char *prefix)
 {
+	struct sbuff *m = sb_open();
+
 	if (!loginfo)
 		loginfo = &default_loginfo;
 
-	spin_lock_bh(&log_lock);
-	printk("<%d>%sIN=%s OUT=%s ", loginfo->u.log.level,
-		prefix,
-		in ? in->name : "",
-		out ? out->name : "");
+	sb_add(m, "<%d>%sIN=%s OUT=%s ", loginfo->u.log.level,
+	       prefix,
+	       in ? in->name : "",
+	       out ? out->name : "");
 
 	/* MAC logging for input path only. */
 	if (in && !out)
-		dump_mac_header(loginfo, skb);
+		dump_mac_header(m, loginfo, skb);
+
+	dump_packet(m, loginfo, skb, skb_network_offset(skb), 1);
 
-	dump_packet(loginfo, skb, skb_network_offset(skb), 1);
-	printk("\n");
-	spin_unlock_bh(&log_lock);
+	sb_close(m);
 }
 
 static unsigned int



^ permalink raw reply related

* Re: [ANNOUNCE] ipset-4.4 released
From: Christophe Ngo Van Duc @ 2010-10-02 15:39 UTC (permalink / raw)
  To: netdev
In-Reply-To: <alpine.DEB.2.00.1010012214250.19669@blackhole.kfki.hu>

Thanks so much,

I was just facing that very same problem of mixed src,dst with ipportiphash.

Christophe

On Fri, Oct 1, 2010 at 6:02 PM, Jozsef Kadlecsik
<kadlec@blackhole.kfki.hu> wrote:
> Hi,
>
> ipset-4.4 has just been released with one important fix and some small
> corrections:
>
> Kernel part changes:
>  - The ipporthash, ipportiphash and ipportnethash set types did
>    not work with mixed "src" and "dst" direction parameters of the "set"
>    and  "SET" iptables match and target (reported by Dash Four)
>  - Errorneous semaphore handling in error path fixed (reported by
>    Jan Engelhardt, bugzilla id 668)
>
> Userspace changes:
>   - Manpage fix to make it clear how ipset works on setlist type
>     of sets (John Brendler, bugzilla id 640)
>
> Please note, if you use ipporthash, ipportiphash or ipportnethash type of
> sets with mixed direction parameters of the "set" and "SET" iptables match
> and target, then you must upgrade.
>
> You can download the new version from the usual places:
>        http://ipset.netfilter.org
>        git://git.netfilter.org/ipset.git
>
> Best regards,
> Jozsef
> -
> E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
> PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address : KFKI Research Institute for Particle and Nuclear Physics
>          H-1525 Budapest 114, POB. 49, Hungary
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* [PATCH net-next V3] net: dynamic ingress_queue allocation
From: Eric Dumazet @ 2010-10-02 16:11 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: hadi, David Miller, netdev
In-Reply-To: <20101002093255.GA2049@del.dom.local>

Le samedi 02 octobre 2010 à 11:32 +0200, Jarek Poplawski a écrit :
> On Fri, Oct 01, 2010 at 03:56:28PM +0200, Eric Dumazet wrote:
>  
> >  static void netdev_init_queue_locks(struct net_device *dev)
> >  {
> >  	netdev_for_each_tx_queue(dev, __netdev_init_queue_locks_one, NULL);
> > -	__netdev_init_queue_locks_one(dev, &dev->ingress_queue, NULL);
> > +	__netdev_init_queue_locks_one(dev, dev_ingress_queue(dev), NULL);
> 
> Is dev_ingress_queue(dev) not NULL anytime here?

Yes, but I felt this could be removed later. If you feel its OK right
now, I am OK too ;)

> 
> >  }
> >  
> >  unsigned long netdev_fix_features(unsigned long features, const char *name)
> > @@ -5447,16 +5448,37 @@ static void netdev_init_one_queue(struct net_device *dev,
> >  				  struct netdev_queue *queue,
> >  				  void *_unused)
> >  {
> > -	queue->dev = dev;
> > +	if (queue)
> > +		queue->dev = dev;
> >  }
> >  
> >  static void netdev_init_queues(struct net_device *dev)
> >  {
> > -	netdev_init_one_queue(dev, &dev->ingress_queue, NULL);
> > +	netdev_init_one_queue(dev, dev_ingress_queue(dev), NULL);
> 
> Is dev_ingress_queue(dev) not NULL anytime here?
> 

Yes

> >  	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
> >  	spin_lock_init(&dev->tx_global_lock);
> >  }
> >  
> > +struct netdev_queue *dev_ingress_queue_create(struct net_device *dev)
> > +{
> > +	struct netdev_queue *queue = dev_ingress_queue(dev);
> > +
> > +#ifdef CONFIG_NET_CLS_ACT
> > +	if (queue)
> > +		return queue;
> > +	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
> > +	if (!queue)
> > +		return NULL;
> > +	netdev_init_one_queue(dev, queue, NULL);
> > +	__netdev_init_queue_locks_one(dev, queue, NULL);
> > +	queue->qdisc = &noop_qdisc;
> > +	queue->qdisc_sleeping = &noop_qdisc;
> > +	smp_wmb();
> 
> Why don't we need smp_rmb() in handle_ing()?
> 

I only wanted to see if Al Viro was using ingress on its Alpha
machine ;)

I am going to use regular rcu api to ease code understanding :)


> > +	dev->ingress_queue = queue;
> > +#endif
> > +	return queue;
> > +}
> > +
> >  /**
> >   *	alloc_netdev_mq - allocate network device
> >   *	@sizeof_priv:	size of private data to allocate space for
> > @@ -5559,6 +5581,8 @@ void free_netdev(struct net_device *dev)
> >  
> >  	kfree(dev->_tx);
> >  
> > +	kfree(dev_ingress_queue(dev));
> > +
> >  	/* Flush device addresses */
> >  	dev_addr_flush(dev);
> >  
> > diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> > index b802078..8635110 100644
> > --- a/net/sched/sch_api.c
> > +++ b/net/sched/sch_api.c
> > @@ -240,7 +240,10 @@ struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
> >  	if (q)
> >  		goto out;
> >  
> > -	q = qdisc_match_from_root(dev->ingress_queue.qdisc_sleeping, handle);
> > +	if (!dev_ingress_queue(dev))
> > +		goto out;
> > +	q = qdisc_match_from_root(dev_ingress_queue(dev)->qdisc_sleeping,
> > +				  handle);
> 


> I'd prefer:
>  +	if (dev_ingress_queue(dev))
>  +		q = qdisc_match_from_root(dev_ingress_queue(dev)->qdisc_sleeping,
> 

Yes

> >  out:
> >  	return q;
> >  }
> > @@ -690,6 +693,8 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
> >  		    (new && new->flags & TCQ_F_INGRESS)) {
> >  			num_q = 1;
> >  			ingress = 1;
> > +			if (!dev_ingress_queue(dev))
> > +				return -ENOENT;
> 
> Is this test really needed here?

To avoid a NULL dereference some lines later.
Do I have a guarantee its not NULL here ?

> 
> >  		}
> >  
> >  		if (dev->flags & IFF_UP)
> > @@ -701,7 +706,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
> >  		}
> >  
> >  		for (i = 0; i < num_q; i++) {
> > -			struct netdev_queue *dev_queue = &dev->ingress_queue;
> > +			struct netdev_queue *dev_queue = dev_ingress_queue(dev);
> >  
> >  			if (!ingress)
> >  				dev_queue = netdev_get_tx_queue(dev, i);
> > @@ -979,7 +984,8 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
> >  					return -ENOENT;
> >  				q = qdisc_leaf(p, clid);
> >  			} else { /* ingress */
> > -				q = dev->ingress_queue.qdisc_sleeping;
> > +				if (dev_ingress_queue(dev))
> > +					q = dev_ingress_queue(dev)->qdisc_sleeping;
> >  			}
> >  		} else {
> >  			q = dev->qdisc;
> > @@ -1044,7 +1050,8 @@ replay:
> >  					return -ENOENT;
> >  				q = qdisc_leaf(p, clid);
> >  			} else { /*ingress */
> > -				q = dev->ingress_queue.qdisc_sleeping;
> > +				if (dev_ingress_queue_create(dev))
> > +					q = dev_ingress_queue(dev)->qdisc_sleeping;
> 
> I wonder if doing dev_ingress_queue_create() just before qdisc_create()
> (and the test here) isn't more readable.

Sorry, I dont understand. I want to create ingress_queue only if user
wants it. If we setup (egress) trafic shaping, no need to setup
ingress_queue.

> >  
> > @@ -753,7 +755,7 @@ void dev_activate(struct net_device *dev)
> >  
> >  	need_watchdog = 0;
> >  	netdev_for_each_tx_queue(dev, transition_one_qdisc, &need_watchdog);
> > -	transition_one_qdisc(dev, &dev->ingress_queue, NULL);
> > +	transition_one_qdisc(dev, dev_ingress_queue(dev), NULL);
> 
> I'd prefer here and similarly later:
> 
>  +	if (dev_ingress_queue(dev))
>  +		transition_one_qdisc(dev, dev_ingress_queue(dev), NULL);
> 
> to show NULL dev_queue is only legal in this one case.

OK, thanks a lot for the extended review Jarek (and Jamal of course)

Here is the V3 then.

[PATCH net-next V3] net: dynamic ingress_queue allocation

ingress being not used very much, and net_device->ingress_queue being
quite a big object (128 or 256 bytes), use a dynamic allocation if
needed (tc qdisc add dev eth0 ingress ...)

dev_ingress_queue(dev) helper should be used only with RTNL taken.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
V3: add rcu notations & address Jarek comments
 include/linux/netdevice.h |    2 -
 include/linux/rtnetlink.h |    8 ++++++
 net/core/dev.c            |   34 ++++++++++++++++++++++-------
 net/sched/sch_api.c       |   42 ++++++++++++++++++++++++------------
 net/sched/sch_generic.c   |   12 ++++++----
 5 files changed, 71 insertions(+), 27 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ceed347..92d81ed 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -986,7 +986,7 @@ struct net_device {
 	rx_handler_func_t	*rx_handler;
 	void			*rx_handler_data;
 
-	struct netdev_queue	ingress_queue; /* use two cache lines */
+	struct netdev_queue __rcu *ingress_queue;
 
 /*
  * Cache lines mostly used on transmit path
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 68c436b..0bb7b48 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -6,6 +6,7 @@
 #include <linux/if_link.h>
 #include <linux/if_addr.h>
 #include <linux/neighbour.h>
+#include <linux/netdevice.h>
 
 /* rtnetlink families. Values up to 127 are reserved for real address
  * families, values above 128 may be used arbitrarily.
@@ -769,6 +770,13 @@ extern int lockdep_rtnl_is_held(void);
 #define rtnl_dereference(p)					\
 	rcu_dereference_check(p, lockdep_rtnl_is_held())
 
+static inline struct netdev_queue *dev_ingress_queue(struct net_device *dev)
+{
+	return rtnl_dereference(dev->ingress_queue);
+}
+
+extern struct netdev_queue *dev_ingress_queue_create(struct net_device *dev);
+
 extern void rtnetlink_init(void);
 extern void __rtnl_unlock(void);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index a313bab..b078ec8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2702,11 +2702,10 @@ EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook);
  * the ingress scheduler, you just cant add policies on ingress.
  *
  */
-static int ing_filter(struct sk_buff *skb)
+static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
 {
 	struct net_device *dev = skb->dev;
 	u32 ttl = G_TC_RTTL(skb->tc_verd);
-	struct netdev_queue *rxq;
 	int result = TC_ACT_OK;
 	struct Qdisc *q;
 
@@ -2720,8 +2719,6 @@ static int ing_filter(struct sk_buff *skb)
 	skb->tc_verd = SET_TC_RTTL(skb->tc_verd, ttl);
 	skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
 
-	rxq = &dev->ingress_queue;
-
 	q = rxq->qdisc;
 	if (q != &noop_qdisc) {
 		spin_lock(qdisc_lock(q));
@@ -2737,7 +2734,9 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
 					 struct packet_type **pt_prev,
 					 int *ret, struct net_device *orig_dev)
 {
-	if (skb->dev->ingress_queue.qdisc == &noop_qdisc)
+	struct netdev_queue *rxq = rcu_dereference(skb->dev->ingress_queue);
+
+	if (!rxq || rxq->qdisc == &noop_qdisc)
 		goto out;
 
 	if (*pt_prev) {
@@ -2745,7 +2744,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
 		*pt_prev = NULL;
 	}
 
-	switch (ing_filter(skb)) {
+	switch (ing_filter(skb, rxq)) {
 	case TC_ACT_SHOT:
 	case TC_ACT_STOLEN:
 		kfree_skb(skb);
@@ -4940,7 +4939,6 @@ static void __netdev_init_queue_locks_one(struct net_device *dev,
 static void netdev_init_queue_locks(struct net_device *dev)
 {
 	netdev_for_each_tx_queue(dev, __netdev_init_queue_locks_one, NULL);
-	__netdev_init_queue_locks_one(dev, &dev->ingress_queue, NULL);
 }
 
 unsigned long netdev_fix_features(unsigned long features, const char *name)
@@ -5452,11 +5450,29 @@ static void netdev_init_one_queue(struct net_device *dev,
 
 static void netdev_init_queues(struct net_device *dev)
 {
-	netdev_init_one_queue(dev, &dev->ingress_queue, NULL);
 	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
 	spin_lock_init(&dev->tx_global_lock);
 }
 
+struct netdev_queue *dev_ingress_queue_create(struct net_device *dev)
+{
+	struct netdev_queue *queue = dev_ingress_queue(dev);
+
+#ifdef CONFIG_NET_CLS_ACT
+	if (queue)
+		return queue;
+	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+	if (!queue)
+		return NULL;
+	netdev_init_one_queue(dev, queue, NULL);
+	__netdev_init_queue_locks_one(dev, queue, NULL);
+	queue->qdisc = &noop_qdisc;
+	queue->qdisc_sleeping = &noop_qdisc;
+	rcu_assign_pointer(dev->ingress_queue, queue);
+#endif
+	return queue;
+}
+
 /**
  *	alloc_netdev_mq - allocate network device
  *	@sizeof_priv:	size of private data to allocate space for
@@ -5559,6 +5575,8 @@ void free_netdev(struct net_device *dev)
 
 	kfree(dev->_tx);
 
+	kfree(rcu_dereference_raw(dev->ingress_queue));
+
 	/* Flush device addresses */
 	dev_addr_flush(dev);
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index b802078..b22ca2d 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -240,7 +240,10 @@ struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle)
 	if (q)
 		goto out;
 
-	q = qdisc_match_from_root(dev->ingress_queue.qdisc_sleeping, handle);
+	if (dev_ingress_queue(dev))
+		q = qdisc_match_from_root(
+			dev_ingress_queue(dev)->qdisc_sleeping,
+			handle);
 out:
 	return q;
 }
@@ -690,6 +693,8 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 		    (new && new->flags & TCQ_F_INGRESS)) {
 			num_q = 1;
 			ingress = 1;
+			if (!dev_ingress_queue(dev))
+				return -ENOENT;
 		}
 
 		if (dev->flags & IFF_UP)
@@ -701,7 +706,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 		}
 
 		for (i = 0; i < num_q; i++) {
-			struct netdev_queue *dev_queue = &dev->ingress_queue;
+			struct netdev_queue *dev_queue = dev_ingress_queue(dev);
 
 			if (!ingress)
 				dev_queue = netdev_get_tx_queue(dev, i);
@@ -979,7 +984,8 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
 					return -ENOENT;
 				q = qdisc_leaf(p, clid);
 			} else { /* ingress */
-				q = dev->ingress_queue.qdisc_sleeping;
+				if (dev_ingress_queue(dev))
+					q = dev_ingress_queue(dev)->qdisc_sleeping;
 			}
 		} else {
 			q = dev->qdisc;
@@ -1043,8 +1049,9 @@ replay:
 				if ((p = qdisc_lookup(dev, TC_H_MAJ(clid))) == NULL)
 					return -ENOENT;
 				q = qdisc_leaf(p, clid);
-			} else { /*ingress */
-				q = dev->ingress_queue.qdisc_sleeping;
+			} else { /* ingress */
+				if (dev_ingress_queue_create(dev))
+					q = dev_ingress_queue(dev)->qdisc_sleeping;
 			}
 		} else {
 			q = dev->qdisc;
@@ -1123,11 +1130,14 @@ replay:
 create_n_graft:
 	if (!(n->nlmsg_flags&NLM_F_CREATE))
 		return -ENOENT;
-	if (clid == TC_H_INGRESS)
-		q = qdisc_create(dev, &dev->ingress_queue, p,
-				 tcm->tcm_parent, tcm->tcm_parent,
-				 tca, &err);
-	else {
+	if (clid == TC_H_INGRESS) {
+		if (dev_ingress_queue(dev))
+			q = qdisc_create(dev, dev_ingress_queue(dev), p,
+					 tcm->tcm_parent, tcm->tcm_parent,
+					 tca, &err);
+		else
+			err = -ENOENT;
+	} else {
 		struct netdev_queue *dev_queue;
 
 		if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue)
@@ -1304,8 +1314,10 @@ static int tc_dump_qdisc(struct sk_buff *skb, struct netlink_callback *cb)
 		if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx) < 0)
 			goto done;
 
-		dev_queue = &dev->ingress_queue;
-		if (tc_dump_qdisc_root(dev_queue->qdisc_sleeping, skb, cb, &q_idx, s_q_idx) < 0)
+		dev_queue = dev_ingress_queue(dev);
+		if (dev_queue &&
+		    tc_dump_qdisc_root(dev_queue->qdisc_sleeping, skb, cb,
+				       &q_idx, s_q_idx) < 0)
 			goto done;
 
 cont:
@@ -1595,8 +1607,10 @@ static int tc_dump_tclass(struct sk_buff *skb, struct netlink_callback *cb)
 	if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t) < 0)
 		goto done;
 
-	dev_queue = &dev->ingress_queue;
-	if (tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb, &t, s_t) < 0)
+	dev_queue = dev_ingress_queue(dev);
+	if (dev_queue &&
+	    tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb,
+				&t, s_t) < 0)
 		goto done;
 
 done:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 545278a..3d57681 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -753,7 +753,8 @@ void dev_activate(struct net_device *dev)
 
 	need_watchdog = 0;
 	netdev_for_each_tx_queue(dev, transition_one_qdisc, &need_watchdog);
-	transition_one_qdisc(dev, &dev->ingress_queue, NULL);
+	if (dev_ingress_queue(dev))
+		transition_one_qdisc(dev, dev_ingress_queue(dev), NULL);
 
 	if (need_watchdog) {
 		dev->trans_start = jiffies;
@@ -812,7 +813,8 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 void dev_deactivate(struct net_device *dev)
 {
 	netdev_for_each_tx_queue(dev, dev_deactivate_queue, &noop_qdisc);
-	dev_deactivate_queue(dev, &dev->ingress_queue, &noop_qdisc);
+	if (dev_ingress_queue(dev))
+		dev_deactivate_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
 
 	dev_watchdog_down(dev);
 
@@ -838,7 +840,8 @@ void dev_init_scheduler(struct net_device *dev)
 {
 	dev->qdisc = &noop_qdisc;
 	netdev_for_each_tx_queue(dev, dev_init_scheduler_queue, &noop_qdisc);
-	dev_init_scheduler_queue(dev, &dev->ingress_queue, &noop_qdisc);
+	if (dev_ingress_queue(dev))
+		dev_init_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
 
 	setup_timer(&dev->watchdog_timer, dev_watchdog, (unsigned long)dev);
 }
@@ -861,7 +864,8 @@ static void shutdown_scheduler_queue(struct net_device *dev,
 void dev_shutdown(struct net_device *dev)
 {
 	netdev_for_each_tx_queue(dev, shutdown_scheduler_queue, &noop_qdisc);
-	shutdown_scheduler_queue(dev, &dev->ingress_queue, &noop_qdisc);
+	if (dev_ingress_queue(dev))
+		shutdown_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc);
 	qdisc_destroy(dev->qdisc);
 	dev->qdisc = &noop_qdisc;
 



^ permalink raw reply related

* Urgent Write Back
From: Dr. Raymond Kuo Fung CH'IEN. @ 2010-10-02 16:12 UTC (permalink / raw)



Dear Friend,

I am Dr. Raymond Kuo Fung CHIEN Executive  Director and Chief Financial
Officer of the operations of the Hang Seng Bank Ltd.I have a secured
business suggestion for you. Before the U.S and Iraqi war our client Mr
Fayez Abdulaziz Mohammed who was with the Iraqi forces and also a business
man made a numbered fixed deposit for 18 calendar months, with a value of
Thirty  Million United State Dollars($30,000,000.00) only in my
branch.Upon maturity several notices was sent to him but there was no
word. Again after the war another notification was sent and still no
response came from him. We later find out that Mr Fayez Abdulaziz Mohammed
and his family had been killed during the war in bomb blast that hit their
home.

 After further investigation it was also discovered that Mr Fayez
Abdulaziz Mohammed did not declare any next of kin in his official papers
including the paper work of his bank deposit. And he also confided in me
the last time he was at my office that no one except me knew of his
deposit in my bank. So,Thirty Million United State
Dollars($30,000,000.00)is still lying in my bank and no one will ever
come forward to claim it.What bothers me most is that according to the
laws of my country at the expiration of 8 years the funds will revert to
the ownership of the Hong Kong Government if nobody applies to claim the
funds.

Against this backdrop, my suggestion to you is that I will like you as a
foreigner to stand as the next of kin to Mr Fayez Abdulaziz Mohammed so
that you will be able to receive his funds.

        WHAT IS TO BE DONE:

I want you to know that I have had everything planned out so that we shall
come out successful.I have contacted an attorney that will prepare the
necessary document that will back you up as the next of kin to Mr Fayez
Abdulaziz Mohammed, all that is required from you at this stage is for you
to provide me with your Full Names and Address so that the attorney can
commence his job. After you have been made the next of kin to the late Mr
Fayez Abdulaziz Mohammed, the Attorney will also file in for claims on
your behalf and secure the necessary approval and letter of probate in
your favor for the move of the funds to an account that will be provided
by you.There is no risk involved at all in the matter as we are going to
adopt a legalized method and the attorney will prepare all the necessary
documents. Please endeavor to observe utmost discretion in all matters
concerning this issue.Once the funds have been transferred to your
nominated bank account we shall share in the ratio of 70% for me, 30% for
you, Should you be interested please send me the following informations
below,

1. Full name and Age.
2. Occupation
3. Private/office phone number.
4. Current residential address.

And finally after that i shall provide you with more details of this
transaction.Your earliest response to this letter will be highly
appreciated. Write back if intrested at
Email:drraymondkuochien3001@gmail.com  or
Email:drraymondkuochien@w.cn

Kind Regards,
Dr.Raymond Kuo Fung CHIEN.



-----------------------------------------------------------------------------------------------------------------------------
Please consider the environmental impact of needlessly printing this e-mail. Epyllion dreams new world with green environment.



^ permalink raw reply

* Urgent Write Back
From: Dr. Raymond Kuo Fung CH'IEN. @ 2010-10-02 17:02 UTC (permalink / raw)



Dear Friend,

I am Dr. Raymond Kuo Fung CHIEN Executive  Director and Chief Financial
Officer of the operations of the Hang Seng Bank Ltd.I have a secured
business suggestion for you. Before the U.S and Iraqi war our client Mr
Fayez Abdulaziz Mohammed who was with the Iraqi forces and also a business
man made a numbered fixed deposit for 18 calendar months, with a value of
Thirty  Million United State Dollars($30,000,000.00) only in my
branch.Upon maturity several notices was sent to him but there was no
word. Again after the war another notification was sent and still no
response came from him. We later find out that Mr Fayez Abdulaziz Mohammed
and his family had been killed during the war in bomb blast that hit their
home.

 After further investigation it was also discovered that Mr Fayez
Abdulaziz Mohammed did not declare any next of kin in his official papers
including the paper work of his bank deposit. And he also confided in me
the last time he was at my office that no one except me knew of his
deposit in my bank. So,Thirty Million United State
Dollars($30,000,000.00)is still lying in my bank and no one will ever
come forward to claim it.What bothers me most is that according to the
laws of my country at the expiration of 8 years the funds will revert to
the ownership of the Hong Kong Government if nobody applies to claim the
funds.

Against this backdrop, my suggestion to you is that I will like you as a
foreigner to stand as the next of kin to Mr Fayez Abdulaziz Mohammed so
that you will be able to receive his funds.

        WHAT IS TO BE DONE:

I want you to know that I have had everything planned out so that we shall
come out successful.I have contacted an attorney that will prepare the
necessary document that will back you up as the next of kin to Mr Fayez
Abdulaziz Mohammed, all that is required from you at this stage is for you
to provide me with your Full Names and Address so that the attorney can
commence his job. After you have been made the next of kin to the late Mr
Fayez Abdulaziz Mohammed, the Attorney will also file in for claims on
your behalf and secure the necessary approval and letter of probate in
your favor for the move of the funds to an account that will be provided
by you.There is no risk involved at all in the matter as we are going to
adopt a legalized method and the attorney will prepare all the necessary
documents. Please endeavor to observe utmost discretion in all matters
concerning this issue.Once the funds have been transferred to your
nominated bank account we shall share in the ratio of 70% for me, 30% for
you, Should you be interested please send me the following informations
below,

1. Full name and Age.
2. Occupation
3. Private/office phone number.
4. Current residential address.

And finally after that i shall provide you with more details of this
transaction.Your earliest response to this letter will be highly
appreciated. Write back if intrested at
Email:drraymondkuochien3001@gmail.com  or
Email:drraymondkuochien@w.cn

Kind Regards,
Dr.Raymond Kuo Fung CHIEN.



-----------------------------------------------------------------------------------------------------------------------------
Please consider the environmental impact of needlessly printing this e-mail. Epyllion dreams new world with green environment.



^ permalink raw reply

* [PATCH] ibmtr: fix tr%d in dmesg
From: Meelis Roos @ 2010-10-02 18:35 UTC (permalink / raw)
  To: netdev

ibmtr and ibmtr_cs show tr%d in dmesg after alloc_netdev() but before 
register_netdev(). Fix it like e100 does - put a different name into 
dev->name until the device gets registered. I/O port seems to be unique
enough and is available for the time of printk messages. With the fix, 
dmesg shows

ibmtr@0a20: ISA P&P 16/4 Adapter/A (short) | 16/4 ISA-16 Adapter found
ibmtr@0a20: using irq 3, PIOaddr a20, 64K shared RAM.
ibmtr@0a20: Hardware address : 00:40:aa:a9:03:98
ibmtr@0a20: Shared RAM paging disabled. ti->page_mask 0
ibmtr@0a20: Maximum Receive Internet Protocol MTU 16Mbps: 16344, 4Mbps: 6104
tr0: port 0xa20, irq 3,  mmio 0xd4850000, sram 0xd0000, hwaddr=00:40:aa:a9:03:98

Signed-off-by: Meelis Roos <mroos@linux.ee>

diff --git a/drivers/net/tokenring/ibmtr.c b/drivers/net/tokenring/ibmtr.c
index 91e6c78..5de281f 100644
--- a/drivers/net/tokenring/ibmtr.c
+++ b/drivers/net/tokenring/ibmtr.c
@@ -368,6 +368,7 @@ int __devinit ibmtr_probe_card(struct net_device *dev)
 {
 	int err = ibmtr_probe(dev);
 	if (!err) {
+		strcpy(dev->name, "tr%d");
 		err = register_netdev(dev);
 		if (err)
 			ibmtr_cleanup_card(dev);
@@ -699,6 +700,7 @@ static int __devinit ibmtr_probe1(struct net_device *dev, int PIOaddr)
 		printk(version);
 	}
 #endif /* !PCMCIA */
+	sprintf(dev->name, "ibmtr@%04x", PIOaddr);
 	DPRINTK("%s %s found\n",
 		channel_def[cardpresent - 1], adapter_def(ti->adapter_type));
 	DPRINTK("using irq %d, PIOaddr %hx, %dK shared RAM.\n",

-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply related

* INVESTMENT PROJECT IN YOUR COUNTRY. (HOW CAN I INVEST IN YOUR COUNTRY)?
From: kunihira barbara @ 2010-10-01  7:59 UTC (permalink / raw)


Dear Sir, 
 
I am Mr.Sam Gapke, The Managing Director of the REA. I am interested in investing the sum of US $20M (Twenty Million US Dollars) in any Lucrative Business in the area related to your Business line. Thanks for anticipate cooperation, your urgent response will be highly appreciated. 
Feel free to contact me with this my personal email address:sam.pk01@gmail.com
Best regards, 
 
Mr. Sam Gapke

^ permalink raw reply

* Re: problem with flowi structure
From: David Miller @ 2010-10-02 20:18 UTC (permalink / raw)
  To: nicola.padovano
  Cc: jengelh, eric.dumazet, aijazbaig1, netfilter-devel, netdev
In-Reply-To: <AANLkTinObV+176prSdCL0RoVx7a74quUM5H58xR2_WLG@mail.gmail.com>

From: Nicola Padovano <nicola.padovano@gmail.com>
Date: Sat, 2 Oct 2010 10:08:56 +0200

> i.e. like a wildcard?

Please do not top-post, because when you top-post people need
to scroll down and read the context you're replying to, then
scroll back up to read your reply.

If, instead of top-posting, you reply after the context you're
replying to, people need to do less work to read your email.

Thanks.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox