* Re: [PATCH net-next-2.6] macvlan: Use compare_ether_addr_64bits()
From: David Miller @ 2009-09-02 0:43 UTC (permalink / raw)
To: eric.dumazet; +Cc: kaber, netdev
In-Reply-To: <4A9D41BD.9000406@gmail.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 01 Sep 2009 17:46:05 +0200
> To speed up Ethernet address comparisons, we can use compare_ether_addr_64bits()
> (all operands are guaranteed to be at least 8 bytes long)
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
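
For readers unfamiliar with the trick, here is a minimal user-space sketch of the idea behind compare_ether_addr_64bits(): load both addresses as one 64-bit word, XOR them, and discard the two padding bytes. The function name mac_differs_64bits() is hypothetical, the byte-order test relies on the GCC/Clang __BYTE_ORDER__ macro, and little-endian is assumed when that macro is absent. It is an illustration of the technique, not the kernel implementation.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of compare_ether_addr_64bits().
 * Both buffers must be at least 8 bytes long, although only the first
 * 6 bytes (the MAC address) take part in the comparison.
 * Returns 0 when the addresses are equal, nonzero otherwise.
 */
static int mac_differs_64bits(const uint8_t a[8], const uint8_t b[8])
{
	uint64_t x, y, fold;

	memcpy(&x, a, sizeof(x));	/* alignment-safe 64-bit load */
	memcpy(&y, b, sizeof(y));
	fold = x ^ y;			/* nonzero bits mark differing bytes */

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return (fold >> 16) != 0;	/* padding sits in the low 16 bits */
#else
	return (fold << 16) != 0;	/* padding sits in the high 16 bits */
#endif
}
```

This is why the patch description insists that all operands are at least 8 bytes long: the 64-bit load reads two bytes past the address even though they are masked out afterwards.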
* Re: [PATCH net-next-2.6] bonding: use compare_ether_addr_64bits() in ALB
From: David Miller @ 2009-09-02 0:44 UTC (permalink / raw)
To: jpirko; +Cc: eric.dumazet, netdev, fubar
In-Reply-To: <20090901181834.GB3209@psychotron.redhat.com>
From: Jiri Pirko <jpirko@redhat.com>
Date: Tue, 1 Sep 2009 20:18:35 +0200
> Tue, Sep 01, 2009 at 06:31:18PM CEST, eric.dumazet@gmail.com wrote:
>>We can speed up Ethernet address comparisons using compare_ether_addr_64bits()
>>instead of memcmp(). We make sure all operands are at least 8 bytes long and
>>16-bit aligned (or better, long-word aligned if possible)
>>
>>Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
...
> Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Applied.
* Re: [PATCH] netns: embed ip6_dst_ops directly
From: David Miller @ 2009-09-02 0:44 UTC (permalink / raw)
To: adobriyan; +Cc: netdev, benjamin.thery, dlezcano
In-Reply-To: <20090829113449.GA3067@x200.localdomain>
From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Sat, 29 Aug 2009 15:34:49 +0400
> struct net::ipv6.ip6_dst_ops is separately dynamically allocated,
> but there is no fundamental reason for it. Embed it directly into
> struct netns_ipv6.
>
> For that:
> * move struct dst_ops into separate header to fix circular dependencies
> I honestly tried not to, it's pretty impossible to do other way
> * drop dynamic allocation, allocate together with netns
>
> For a change, remove struct dst_ops::dst_net, it's deducible
> by using container_of() given dst_ops pointer.
>
> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Applied.
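
As an illustration of why dst_net becomes redundant once the ops structure is embedded, here is a minimal user-space sketch of the container_of() step. The struct names ending in _sketch and the helper dst_ops_to_net() are hypothetical stand-ins, not the kernel definitions; only the pointer arithmetic is the point.

```c
#include <stddef.h>

/*
 * Same idea as the kernel's container_of(): step back from a pointer
 * to an embedded member to the structure that contains it.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct dst_ops_sketch {
	int family;
};

struct netns_ipv6_sketch {
	int sysctl;
	struct dst_ops_sketch ip6_dst_ops;	/* embedded, not a pointer */
};

struct net_sketch {
	int id;
	struct netns_ipv6_sketch ipv6;
};

/* Recover the owning namespace from a pointer to its embedded ops. */
static struct net_sketch *dst_ops_to_net(struct dst_ops_sketch *ops)
{
	return container_of(ops, struct net_sketch, ipv6.ip6_dst_ops);
}
```

The lookup only works because the ops structure lives inside the namespace structure; with a separately allocated object there would be no fixed offset to subtract.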
* Re: [PATCH] tun: reuse struct sock fields
From: David Miller @ 2009-09-02 0:44 UTC (permalink / raw)
To: mst; +Cc: m.s.tsirkin, netdev, herbert
In-Reply-To: <20090830170442.GA3482@redhat.com>
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Sun, 30 Aug 2009 20:04:42 +0300
> As tun always has an embedded struct sock,
> use sk and sk_receive_queue fields instead of
> duplicating them in tun_struct.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Applied.
* Re: [PATCH] sky2: fix management of driver LED
From: David Miller @ 2009-09-02 0:44 UTC (permalink / raw)
To: shemminger; +Cc: mikem, netdev, rene.mayrhofer, leitner
In-Reply-To: <20090831103141.77598705@nehalam>
From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Mon, 31 Aug 2009 10:31:41 -0700
> Observed by Mike McCormack.
>
> The LED bit here is just a software controlled value used to
> turn on one of the LED's on some boards. The register value was wrong,
> which could have been causing some power control issues.
> Get rid of the problematic define and use the correct mask.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Applied.
* Re: [PATCH] sky2: Create buffer alloc and free helpers
From: David Miller @ 2009-09-02 0:44 UTC (permalink / raw)
To: mikem; +Cc: netdev
In-Reply-To: <4A9D1FDF.4020407@ring3k.org>
From: Mike McCormack <mikem@ring3k.org>
Date: Tue, 01 Sep 2009 22:21:35 +0900
> Refactor two similar sections of code that free buffers into one.
> Only call tx_init if all buffer allocations succeed.
>
> Signed-off-by: Mike McCormack <mikem@ring3k.org>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Applied.
* Re: [PATCH] sky2: Use 32bit read to read Y2_VAUX_AVAIL
From: David Miller @ 2009-09-02 0:45 UTC (permalink / raw)
To: shemminger; +Cc: mikem, netdev
In-Reply-To: <20090901085757.47b5c09c@nehalam>
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 1 Sep 2009 08:57:57 -0700
> On Tue, 01 Sep 2009 22:54:27 +0900
> Mike McCormack <mikem@ring3k.org> wrote:
>
>> B0_CTST is a 24bit register according to the vendor driver (sk98lin).
>> A 16bit read on B0_CTST will always return 0 for Y2_VAUX_AVAIL (1<<16),
>> so use a 32bit read when testing Y2_VAUX_AVAIL
>>
>> Signed-off-by: Mike McCormack <mikem@ring3k.org>
...
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Applied.
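
The width problem described in the patch can be shown with a small user-space sketch. The accessors reg_read16() and reg_read32() are hypothetical stand-ins for the driver's register reads, and the sketch assumes a 16-bit read returns only bits 0..15 of the register; the Y2_VAUX_AVAIL bit position is the one quoted above.

```c
#include <stdint.h>

#define Y2_VAUX_AVAIL (1u << 16)	/* bit position quoted in the patch */

/* Hypothetical accessors: a 16-bit read only ever returns bits 0..15. */
static uint16_t reg_read16(uint32_t reg)
{
	return (uint16_t)reg;
}

static uint32_t reg_read32(uint32_t reg)
{
	return reg;
}

/* Always 0: bit 16 cannot survive the truncation to 16 bits. */
static int vaux_avail_16(uint32_t reg)
{
	return (reg_read16(reg) & Y2_VAUX_AVAIL) != 0;
}

/* Sees bit 16 because the full register width is read. */
static int vaux_avail_32(uint32_t reg)
{
	return (reg_read32(reg) & Y2_VAUX_AVAIL) != 0;
}
```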
* Re: [PATCH 0/2] RTO connection timeout calculation
From: David Miller @ 2009-09-02 0:45 UTC (permalink / raw)
To: damian; +Cc: netdev
In-Reply-To: <4A9D82DC.50508@tvk.rwth-aachen.de>
From: Damian Lukowski <damian@tvk.rwth-aachen.de>
Date: Tue, 01 Sep 2009 22:23:56 +0200
> this series of patches implements some changes concerning the RTO
> retransmission timeouts.
>
> 1) retransmits_timed_out() gets a short comment and
> more meaningful names.
> 2) The sysctl documentation is being updated.
>
> Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Looks good to me, all applied.
Thanks.
* Re: [PATCH] net: make neigh_ops constant
From: David Miller @ 2009-09-02 0:45 UTC (permalink / raw)
To: shemminger; +Cc: netdev
In-Reply-To: <20090901141319.6ce685e9@nehalam>
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 1 Sep 2009 14:13:19 -0700
>
> These tables are never modified at runtime. Move to read-only
> section.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Applied.
* Re: [PATCH] au1000_eth: possible NULL dereference of aup->mii_bus->irq in au1000_probe()
From: David Miller @ 2009-09-02 0:45 UTC (permalink / raw)
To: f.fainelli; +Cc: roel.kluin, netdev, akpm
In-Reply-To: <200908311439.01781.f.fainelli@gmail.com>
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 31 Aug 2009 14:38:55 +0200
> Hello Roel,
>
> Le lundi 31 août 2009 10:40:15, Roel Kluin a écrit :
>> aup->mii_bus->irq allocation may fail, prevent a dereference of NULL.
>
> Good catch.
>
>>
>> Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
>
> Acked-by: Florian Fainelli <florian@openwrt.org>
Applied.
* Re: [PATCH] net: sk_free() should be allowed right after sk_alloc()
From: David Miller @ 2009-09-02 0:50 UTC (permalink / raw)
To: eric.dumazet; +Cc: jarkao2, netdev
In-Reply-To: <4A9BBE12.4080206@gmail.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 31 Aug 2009 14:12:02 +0200
> David Miller a écrit :
>> From: Jarek Poplawski <jarkao2@gmail.com>
>> Date: Mon, 31 Aug 2009 09:30:19 +0000
>>
>>> On Mon, Aug 31, 2009 at 11:15:36AM +0200, Eric Dumazet wrote:
>>>> From: Jarek Poplawski <jarkao2@gmail.com>
>>>>
>>>> After commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
>>>> (net: No more expensive sock_hold()/sock_put() on each tx)
>>>> sk_free() frees socks conditionally and depends
>>>> on sk_wmem_alloc beeing set e.g. in sock_init_data(). But in some
>>> Very nice, but I hope David could fix btw. my "beeing" misspelling.
>>
>> I will :-)
>
> Thanks :)
Applied to net-2.6, thanks everyone.
* Re: [PATCH] [V3] net: add Xilinx emac lite device driver
From: David Miller @ 2009-09-02 0:51 UTC (permalink / raw)
To: michal.simek
Cc: john.linn, netdev, linuxppc-dev, jgarzik, grant.likely, jwboyer,
john.williams, sadanan
In-Reply-To: <4A9BCDB4.8040406@petalogix.com>
From: Michal Simek <michal.simek@petalogix.com>
Date: Mon, 31 Aug 2009 15:18:44 +0200
> I see that John's patch has wrong file permission
> -rwxr-xr-x xilinx_emaclite.c
...
> should be 644.
>
> Please fix it in your repo.
Done, thanks!
* [PATCH]: tasklet_hrtimer revert...
From: David Miller @ 2009-09-02 1:00 UTC (permalink / raw)
To: netdev
I'll push this into net-2.6 shortly...
pkt_sched: Revert tasklet_hrtimer changes.
These are full of unresolved problems, mainly that conversions don't
work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers
tasklets can't be killed from softirq context.
And when a qdisc gets reset, that's exactly what we need to do here.
We'll work this out in the net-next-2.6 tree and if warranted we'll
backport that work to -stable.
This reverts the following 3 changesets:
a2cb6a4dd470d7a64255a10b843b0d188416b78f
("pkt_sched: Fix bogon in tasklet_hrtimer changes.")
38acce2d7983632100a9ff3fd20295f6e34074a8
("pkt_sched: Convert CBQ to tasklet_hrtimer.")
ee5f9757ea17759e1ce5503bdae2b07e48e32af9
("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer")
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/net/pkt_sched.h | 4 ++--
net/sched/sch_api.c | 10 +++++-----
net/sched/sch_cbq.c | 25 +++++++++++--------------
3 files changed, 18 insertions(+), 21 deletions(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 7eafb8d..82a3191 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -61,8 +61,8 @@ psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
}
struct qdisc_watchdog {
- struct tasklet_hrtimer timer;
- struct Qdisc *qdisc;
+ struct hrtimer timer;
+ struct Qdisc *qdisc;
};
extern void qdisc_watchdog_init(struct qdisc_watchdog *wd, struct Qdisc *qdisc);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 92e6f3a..24d17ce 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -458,7 +458,7 @@ EXPORT_SYMBOL(qdisc_warn_nonwc);
static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
{
struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
- timer.timer);
+ timer);
wd->qdisc->flags &= ~TCQ_F_THROTTLED;
__netif_schedule(qdisc_root(wd->qdisc));
@@ -468,8 +468,8 @@ static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
void qdisc_watchdog_init(struct qdisc_watchdog *wd, struct Qdisc *qdisc)
{
- tasklet_hrtimer_init(&wd->timer, qdisc_watchdog,
- CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ hrtimer_init(&wd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ wd->timer.function = qdisc_watchdog;
wd->qdisc = qdisc;
}
EXPORT_SYMBOL(qdisc_watchdog_init);
@@ -485,13 +485,13 @@ void qdisc_watchdog_schedule(struct qdisc_watchdog *wd, psched_time_t expires)
wd->qdisc->flags |= TCQ_F_THROTTLED;
time = ktime_set(0, 0);
time = ktime_add_ns(time, PSCHED_TICKS2NS(expires));
- tasklet_hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
+ hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
}
EXPORT_SYMBOL(qdisc_watchdog_schedule);
void qdisc_watchdog_cancel(struct qdisc_watchdog *wd)
{
- tasklet_hrtimer_cancel(&wd->timer);
+ hrtimer_cancel(&wd->timer);
wd->qdisc->flags &= ~TCQ_F_THROTTLED;
}
EXPORT_SYMBOL(qdisc_watchdog_cancel);
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 149b040..d5798e1 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -163,7 +163,7 @@ struct cbq_sched_data
psched_time_t now_rt; /* Cached real time */
unsigned pmask;
- struct tasklet_hrtimer delay_timer;
+ struct hrtimer delay_timer;
struct qdisc_watchdog watchdog; /* Watchdog timer,
started when CBQ has
backlog, but cannot
@@ -503,8 +503,6 @@ static void cbq_ovl_delay(struct cbq_class *cl)
cl->undertime = q->now + delay;
if (delay > 0) {
- struct hrtimer *ht;
-
sched += delay + cl->penalty;
cl->penalized = sched;
cl->cpriority = TC_CBQ_MAXPRIO;
@@ -512,12 +510,12 @@ static void cbq_ovl_delay(struct cbq_class *cl)
expires = ktime_set(0, 0);
expires = ktime_add_ns(expires, PSCHED_TICKS2NS(sched));
- ht = &q->delay_timer.timer;
- if (hrtimer_try_to_cancel(ht) &&
- ktime_to_ns(ktime_sub(hrtimer_get_expires(ht),
- expires)) > 0)
- hrtimer_set_expires(ht, expires);
- hrtimer_restart(ht);
+ if (hrtimer_try_to_cancel(&q->delay_timer) &&
+ ktime_to_ns(ktime_sub(
+ hrtimer_get_expires(&q->delay_timer),
+ expires)) > 0)
+ hrtimer_set_expires(&q->delay_timer, expires);
+ hrtimer_restart(&q->delay_timer);
cl->delayed = 1;
cl->xstats.overactions++;
return;
@@ -593,7 +591,7 @@ static psched_tdiff_t cbq_undelay_prio(struct cbq_sched_data *q, int prio,
static enum hrtimer_restart cbq_undelay(struct hrtimer *timer)
{
struct cbq_sched_data *q = container_of(timer, struct cbq_sched_data,
- delay_timer.timer);
+ delay_timer);
struct Qdisc *sch = q->watchdog.qdisc;
psched_time_t now;
psched_tdiff_t delay = 0;
@@ -623,7 +621,7 @@ static enum hrtimer_restart cbq_undelay(struct hrtimer *timer)
time = ktime_set(0, 0);
time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
- tasklet_hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
+ hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
}
sch->flags &= ~TCQ_F_THROTTLED;
@@ -1216,7 +1214,7 @@ cbq_reset(struct Qdisc* sch)
q->tx_class = NULL;
q->tx_borrowed = NULL;
qdisc_watchdog_cancel(&q->watchdog);
- tasklet_hrtimer_cancel(&q->delay_timer);
+ hrtimer_cancel(&q->delay_timer);
q->toplevel = TC_CBQ_MAXLEVEL;
q->now = psched_get_time();
q->now_rt = q->now;
@@ -1399,8 +1397,7 @@ static int cbq_init(struct Qdisc *sch, struct nlattr *opt)
q->link.minidle = -0x7FFFFFFF;
qdisc_watchdog_init(&q->watchdog, sch);
- tasklet_hrtimer_init(&q->delay_timer, cbq_undelay,
- CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ hrtimer_init(&q->delay_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
q->delay_timer.function = cbq_undelay;
q->toplevel = TC_CBQ_MAXLEVEL;
q->now = psched_get_time();
--
1.6.4.2
* Re: [PATCH 16/19] intel: convert drivers to netdev_tx_t
From: Jeff Kirsher @ 2009-09-02 1:03 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, netdev, e1000-devel
In-Reply-To: <20090901055130.042455994@vyatta.com>
On Mon, Aug 31, 2009 at 22:50, Stephen Hemminger<shemminger@vyatta.com> wrote:
> Get rid of some bogus return wrapping as well.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Patch looks fine.
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* Re: [PATCH] [net-next] tcp: avoid sending zero TSval
From: David Miller @ 2009-09-02 1:16 UTC (permalink / raw)
To: opurdila; +Cc: netdev
In-Reply-To: <200908312044.56235.opurdila@ixiacom.com>
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Mon, 31 Aug 2009 20:44:56 +0300
> Per RFC1323, zero TSecr is considered invalid. Thus we must avoid when
> possible sending a zero TSval.
>
> Currently, we use the least significant 32 bits of jiffies to fill in
> TSval. But that can wrap around to zero (in 5 minutes after reboot,
> and every 49 days after that in the worst case).
>
> This patch approximates a wrapped-around zero TSval to 1. This is better
> than emitting a value which will be ignored.
>
> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Ok, I've changed my mind again. I think we need to go with
a solution like this.
Even if we could somehow justify allowing zero timestamps,
I just checked some other stacks and all of them ignore zero
tsecr values. So we can't make that kind of change no matter
what.
This patch needs some changes.
We have to adjust the tests we make against tsecr.
If we bump up a zero jiffies to one in an advertised timestamp,
then we get back a tsecr value of one, and jiffies is still
zero, we should use a comparison value of one not zero.
This is not trivial. You might think it's OK to handle all of
this by just adjusting the definition of tcp_time_stamp but that
gets used by a lot of other things in the stack so those side
effects need to be analyzed.
Grepping around also shows that we also have some code that doesn't
handle jiffies wraparound at all, f.e. check out the rcv_tsecr tests
in net/ipv4/tcp_lp.c :-/
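
To make the two concerns above concrete, here is a minimal user-space sketch. All three helper names are hypothetical and nothing here is taken from the TCP stack; the sketch only assumes a free-running 32-bit tick counter. It shows (a) bumping a zero clock value to one before advertising it, (b) using the same adjusted value when comparing against an echoed timestamp, and (c) a wraparound-safe ordering test.

```c
#include <stdint.h>

/* Hypothetical helper: never advertise a zero timestamp value. */
static uint32_t tsval_from_clock(uint32_t now)
{
	return now ? now : 1;
}

/*
 * Hypothetical RTT sample in ticks. The echoed value is compared
 * against the adjusted clock, not the raw one, so an echo of 1 while
 * the raw clock still reads 0 yields 0 ticks rather than a huge value.
 */
static uint32_t rtt_ticks(uint32_t now, uint32_t tsecr)
{
	return tsval_from_clock(now) - tsecr;
}

/* Wraparound-safe "a is before b" test for 32-bit tick counters. */
static int ts_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}
```

A plain `a < b` comparison would give the wrong answer for the pair (0xFFFFFFF0, 0x10), which is the kind of wraparound bug the tcp_lp.c remark points at.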
* Re: [PATCH,v2] Re: e1000e: why does pci_enable_pcie_error_reporting() fail on my hp2510p?
From: Jeff Kirsher @ 2009-09-02 1:18 UTC (permalink / raw)
To: Frans Pop; +Cc: Danny Feng, Netdev, linux-kernel, David Miller
In-Reply-To: <200908210848.39377.elendil@planet.nl>
On Thu, Aug 20, 2009 at 23:48, Frans Pop<elendil@planet.nl> wrote:
> On Friday 21 August 2009, Danny Feng wrote:
>> You may also need to silence pci_disable_pcie_error_reporting,
>> otherwise rmmod/shutdown, you will get
>>
>> e1000e 0000:00:19.0: pci_disable_pcie_error_reporting failed
>
> Yes, thanks. Exactly the same thing there. Updated patch below.
>
>
> From: Frans Pop <elendil@planet.nl>
> Subject: net: Don't report an error if devices don't support AER
>
> The only error returned by pci_{en,dis}able_pcie_error_reporting() is
> -EIO which simply means that Advanced Error Reporting is not supported.
> There is no need to report that, so remove the error check from e1000e,
> igb and ixgbe.
>
> Signed-off-by: Frans Pop <elendil@planet.nl>
>
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index fa92a68..d67798f 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -4983,12 +4983,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
> goto err_pci_reg;
>
> /* AER (Advanced Error Reporting) hooks */
> - err = pci_enable_pcie_error_reporting(pdev);
> - if (err) {
> - dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
> - "0x%x\n", err);
> - /* non-fatal, continue */
> - }
> + pci_enable_pcie_error_reporting(pdev);
>
> pci_set_master(pdev);
> /* PCI config space info */
> @@ -5301,9 +5296,6 @@ static void __devexit e1000_remove(struct pci_dev *pdev)
>
> /* AER disable */
> err = pci_disable_pcie_error_reporting(pdev);
> - if (err)
> - dev_err(&pdev->dev,
> - "pci_disable_pcie_error_reporting failed 0x%x\n", err);
>
> pci_disable_device(pdev);
> }
> diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
> index adb09d3..1533d6f 100644
> --- a/drivers/net/igb/igb_main.c
> +++ b/drivers/net/igb/igb_main.c
> @@ -1232,12 +1232,7 @@ static int __devinit igb_probe(struct pci_dev *pdev,
> if (err)
> goto err_pci_reg;
>
> - err = pci_enable_pcie_error_reporting(pdev);
> - if (err) {
> - dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
> - "0x%x\n", err);
> - /* non-fatal, continue */
> - }
> + pci_enable_pcie_error_reporting(pdev);
>
> pci_set_master(pdev);
> pci_save_state(pdev);
> @@ -1668,9 +1663,6 @@ static void __devexit igb_remove(struct pci_dev *pdev)
> free_netdev(netdev);
>
> err = pci_disable_pcie_error_reporting(pdev);
> - if (err)
> - dev_err(&pdev->dev,
> - "pci_disable_pcie_error_reporting failed 0x%x\n", err);
>
> pci_disable_device(pdev);
> }
> diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
> index 77b0381..777556d 100644
> --- a/drivers/net/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ixgbe/ixgbe_main.c
> @@ -5430,12 +5430,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
> goto err_pci_reg;
> }
>
> - err = pci_enable_pcie_error_reporting(pdev);
> - if (err) {
> - dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
> - "0x%x\n", err);
> - /* non-fatal, continue */
> - }
> + pci_enable_pcie_error_reporting(pdev);
>
> pci_set_master(pdev);
> pci_save_state(pdev);
> @@ -5795,9 +5790,6 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
> free_netdev(netdev);
>
> err = pci_disable_pcie_error_reporting(pdev);
> - if (err)
> - dev_err(&pdev->dev,
> - "pci_disable_pcie_error_reporting failed 0x%x\n", err);
>
> pci_disable_device(pdev);
> }
Patch looks fine
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* Re: [PATCH resend] drop_monitor: fix trace_napi_poll_hit()
From: David Miller @ 2009-09-02 1:20 UTC (permalink / raw)
To: xiaoguangrong; +Cc: nhorman, yjwei, netdev, linux-kernel
In-Reply-To: <4A9B6963.5090207@cn.fujitsu.com>
From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Date: Mon, 31 Aug 2009 14:10:43 +0800
> The net_dev of backlog napi is NULL, like below:
>
> __get_cpu_var(softnet_data).backlog.dev == NULL
>
> So, we should check it in napi tracepoint's probe function
>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Applied to net-next-2.6, thanks.
* Re: [PATCH] drop_monitor: make last_rx timestamp private
From: David Miller @ 2009-09-02 1:21 UTC (permalink / raw)
To: nhorman; +Cc: netdev, eric.dumazet
In-Reply-To: <20090831195847.GA6506@hmsreliant.think-freely.org>
From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 31 Aug 2009 15:58:47 -0400
> It was recently pointed out to me that the last_rx field of the net_device
> structure wasn't updated regularly. In fact only the bonding driver really uses
> it currently. Since the drop_monitor code relies on the last_rx field to detect
> drops on receive in hardware, we need to find a more reliable way to rate limit
> our drop checks (so that we don't check for drops on every frame received, which
> would be inefficient). This patch makes a last_rx timestamp that is private to
> the drop monitor code and is updated for every device that we track.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Neil, this doesn't apply to net-next-2.6:
> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
> index 9d66fa9..34a05ce 100644
> --- a/net/core/drop_monitor.c
> +++ b/net/core/drop_monitor.c
...
> @@ -179,18 +180,21 @@ static void trace_napi_poll_hit(struct napi_struct *napi)
> {
> struct dm_hw_stat_delta *new_stat;
>
> - /*
> - * Ratelimit our check time to dm_hw_check_delta jiffies
> - */
> - if (!time_after(jiffies, napi->dev->last_rx + dm_hw_check_delta))
> - return;
>
> rcu_read_lock();
> list_for_each_entry_rcu(new_stat, &hw_stats_list, list) {
In net-next-2.6 this test reads:
/*
* Ratelimit our check time to dm_hw_check_delta jiffies
*/
if (!napi->dev ||
!time_after(jiffies, napi->dev->last_rx + dm_hw_check_delta))
return;
and you must retain the napi->dev NULL check there as otherwise
the list traversal tests will blindly dereference that pointer.
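
The shape of that guard can be sketched in user space as follows. The struct names, time_after_sketch() and should_check_drops() are hypothetical; the macro follows the usual wraparound-safe definition of a "time after" test, and the sketch is only meant to show why the NULL test has to come first.

```c
#include <stddef.h>

/* True if a is after b, robust against counter wraparound. */
#define time_after_sketch(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical stand-ins for net_device and napi_struct. */
struct dev_sketch {
	unsigned long last_rx;
};

struct napi_sketch {
	struct dev_sketch *dev;		/* NULL for the backlog napi */
};

/* Returns 1 when the drop check should run, 0 when it is skipped. */
static int should_check_drops(const struct napi_sketch *napi,
			      unsigned long now, unsigned long delta)
{
	/* The NULL test short-circuits before dev is dereferenced. */
	if (!napi->dev ||
	    !time_after_sketch(now, napi->dev->last_rx + delta))
		return 0;
	return 1;
}
```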
* Re: [PATCH resend] tracing/events: convert NAPI's tracepoint via TRACE_EVENT
From: David Miller @ 2009-09-02 1:26 UTC (permalink / raw)
To: rostedt
Cc: mingo, xiaoguangrong, nhorman, fweisbec, yjwei, netdev,
linux-kernel
In-Reply-To: <alpine.DEB.2.00.0908311407060.13931@gandalf.stny.rr.com>
From: Steven Rostedt <rostedt@goodmis.org>
Date: Mon, 31 Aug 2009 14:09:04 -0400 (EDT)
>
> On Mon, 31 Aug 2009, Ingo Molnar wrote:
>
>>
>> * Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:
>>
>> > - Convert NAPI's tracepoint via TRACE_EVENT macro, the output information
>> > like below:
>>
>> I think as long as it does not touch tracing infrastructure (which
>> your patches dont do in their current form) this should be
>> done/merged via the networking tree.
>
> I agree, all changes that are in include/trace/events/ and trace point
> usage can stay within the subsystem tree.
>
>>
>> [ There might be some small collisions in define_trace.h (because
>> these tracepoints move from legacy to new-style TRACE_EVENT()
>> form) but that's OK. ]
>
> But changes to anything in include/trace or kernel/trace need to go
> through the tracing subsystem. This includes changes to define_trace.h.
This patch can't be split up, so I'm wondering how you suggest to
handle this patch given that you have declared that define_trace.h
changes aren't to go through the subsystem tree?
If we do the define_trace.h change only, we break the build
(lack of macro defined for the trace).
If we do only the other parts of his patch, we get a duplicate
definition.
And keep in mind that Neil and Xiao are probably going to want to do
work on top of this to the networking bits. Thus if we put this patch
here into the tracing tree, I'll have to develop a dependency on the
tracing tree and I think that will go over like a fart in a spacesuit
with the -next crowd and Stephen Rothwell in particular.
Please advise.
* Re: [PATCH net-next-2.6] ip: Report qdisc packet drops
From: David Miller @ 2009-09-02 1:41 UTC (permalink / raw)
To: eric.dumazet; +Cc: cl, sri, dlstevens, netdev, niv, mtk.manpages
In-Reply-To: <4A9BBD8E.2010303@gmail.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 31 Aug 2009 14:09:50 +0200
> Re-reading again this stuff, I realized ip6_push_pending_frames()
> was not updating IPSTATS_MIB_OUTDISCARDS, even if IP_RECVERR was set.
>
> May I suggest following path :
>
> 1) Correct ip6_push_pending_frames() to properly
> account for dropped-by-qdisc frames when IP_RECVERR is set
Your patch is applied to net-next-2.6, thanks!
> 2) Submit a patch to account for qdisc-dropped frames in SNMP counters
> but still return a OK to user application, to not break them ?
Sounds good.
I think if you sample random UDP applications, you will find that such
errors will upset them terribly, make them log tons of crap to
/var/log/messages et al., and consume tons of CPU.
And in such cases silent ignoring of drops is entirely appropriate and
optimal, which supports our current behavior.
If we are to make such applications "more sophisticated", such
converted apps can be identified simply by their use of IP_RECVERR.
If you want to be notified of all asynchronous errors we can detect,
you use this, end of story. It is the only way to handle this
situation without breaking the world.
As usual, Alexey Kuznetsov's analysis of this situation is timeless,
accurate, and wise. And he understood all of this 10+ years ago.
* [PATCH 0/5] Adds implementation of TFRC-SP on DCCP test tree
From: Ivo Calado @ 2009-09-02 2:43 UTC (permalink / raw)
To: dccp; +Cc: netdev
These patches add an implementation of TFRC-SP at the receiver side and
are targeted at the DCCP branch.
Patch #1: First Patch on TFRC-SP. Copy base files from TFRC
Patch #2: Implement loss counting on TFRC-SP receiver
Patch #3: Implement TFRC-SP calc of mean length of loss intervals
according to section 3 of RFC 4828
Patch #4: Adds options DROPPED PACKETS and LOSS INTERVALS to receiver
Patch #5: Updating documentation accordingly
Following these patches, we'll be sending the sender side of TFRC-SP.
Once this code is integrated on the branch, we can proceed adding the
CCID4 code that uses this new TFRC-SP.
--
Ivo Augusto Andrade Rocha Calado
MSc. Candidate
Embedded Systems and Pervasive Computing Lab - http://embedded.ufcg.edu.br
Systems and Computing Department - http://www.dsc.ufcg.edu.br
Electrical Engineering and Informatics Center - http://www.ceei.ufcg.edu.br
Federal University of Campina Grande - http://www.ufcg.edu.br
PGP: 0x03422935
Quidquid latine dictum sit, altum viditur.
* [PATCH 1/5] First Patch on TFRC-SP. Copy base files from TFRC
From: Ivo Calado @ 2009-09-02 2:44 UTC (permalink / raw)
To: dccp; +Cc: netdev
In-Reply-To: <cb00fa210909011735l5038c28eofba84b3976902964@mail.gmail.com>
First patch on TFRC-SP. Copies the TFRC files and renames their symbols
with the infix "_sp".
Also updates Kconfig and the init/exit code. An #ifndef guard was added to
headers whose symbols are shared with TFRC and were not renamed, so they
don't get included twice.
Following the rule #8 in Documentation/SubmittingPatches the patch is
stored at http://embedded.ufcg.edu.br/~ivocalado/dccp/patches_tfrc_sp/tfrc_sp_receiver_00.patch
* [PATCH 2/5] Implement loss counting on TFRC-SP receiver
From: Ivo Calado @ 2009-09-02 2:44 UTC (permalink / raw)
To: dccp; +Cc: netdev
In-Reply-To: <cb00fa210909011735m4f81224dg1736767ba6f51ceb@mail.gmail.com>
Implement loss counting on TFRC-SP receiver. Consider transmission's
hole size as loss count.
Changes:
- Adds field li_losses to tfrc_loss_interval to track loss count per interval
- Adds field num_losses to tfrc_rx_hist, used to store loss count per
loss event
- Adds dccp_loss_count function to net/dccp/dccp.h, responsible for
loss count using sequence numbers
Signed-off-by: Ivo Calado, Erivaldo Xavier, Leandro Sales
<ivocalado@embedded.ufcg.edu.br>, <desadoc@gmail.com>,
<leandroal@gmail.com>
Index: b/net/dccp/ccids/lib/loss_interval_sp.c
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-26 21:50:23.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-26 22:51:32.000000000 -0300
@@ -184,6 +184,7 @@
s64 len = dccp_delta_seqno(cur->li_seqno, cong_evt_seqno);
if ((len <= 0)||(!tfrc_lh_closed_check(cur,
cong_evt->tfrchrx_ccval)))
{
+ cur->li_losses += rh->num_losses;
return false;
}
@@ -201,6 +202,7 @@
cur->li_seqno = cong_evt_seqno;
cur->li_ccval = cong_evt->tfrchrx_ccval;
cur->li_is_closed = false;
+ cur->li_losses = rh->num_losses;
if (++lh->counter == 1)
lh->i_mean = cur->li_length = (*calc_first_li)(sk);
Index: b/net/dccp/ccids/lib/loss_interval_sp.h
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-26 21:30:11.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-26 22:52:20.000000000 -0300
@@ -30,12 +30,14 @@
* @li_ccval: The CCVal belonging to @li_seqno
* @li_is_closed: Whether @li_seqno is older than 1 RTT
* @li_length: Loss interval sequence length
+ * @li_losses: Number of losses counted on this interval
*/
struct tfrc_loss_interval {
u64 li_seqno:48,
li_ccval:4,
li_is_closed:1;
u32 li_length;
+ u32 li_losses;
};
/**
Index: b/net/dccp/ccids/lib/packet_history_sp.c
===================================================================
--- a/net/dccp/ccids/lib/packet_history_sp.c 2009-08-26 21:46:36.000000000 -0300
+++ b/net/dccp/ccids/lib/packet_history_sp.c 2009-08-26 22:55:01.000000000 -0300
@@ -236,6 +236,7 @@
if (likely(dccp_delta_seqno(s2, s3) > 0)) { /* S2 < S3 */
h->loss_count = 3;
tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h,
3), skb, n3);
+ h->num_losses = dccp_loss_count(s2, s3, n3);
return 1;
}
@@ -248,6 +249,7 @@
tfrc_rx_hist_swap(h, 2, 3);
tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h,
2), skb, n3);
h->loss_count = 3;
+ h->num_losses = dccp_loss_count(s1, s3, n3);
return 1;
}
@@ -283,6 +285,7 @@
h->loss_start = tfrc_rx_hist_index(h, 3);
tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h, 1), skb, n3);
h->loss_count = 3;
+ h->num_losses = dccp_loss_count(s0, s3, n3);
return 1;
}
Index: b/net/dccp/ccids/lib/packet_history_sp.h
===================================================================
--- a/net/dccp/ccids/lib/packet_history_sp.h 2009-08-26 21:40:55.000000000 -0300
+++ b/net/dccp/ccids/lib/packet_history_sp.h 2009-08-26 22:55:58.000000000 -0300
@@ -101,6 +101,7 @@
* @packet_size: Packet size in bytes (as per RFC 3448, 3.1)
* @bytes_recvd: Number of bytes received since @bytes_start
* @bytes_start: Start time for counting @bytes_recvd
+ * @num_losses: Number of losses contained on this loss event
*/
struct tfrc_rx_hist {
struct tfrc_rx_hist_entry *ring[TFRC_NDUPACK + 1];
@@ -113,6 +114,7 @@
u32 packet_size,
bytes_recvd;
ktime_t bytes_start;
+ u8 num_losses;
};
/**
Index: b/net/dccp/dccp.h
===================================================================
--- a/net/dccp/dccp.h 2009-08-25 20:21:45.000000000 -0300
+++ b/net/dccp/dccp.h 2009-08-26 22:59:10.000000000 -0300
@@ -168,6 +168,21 @@
return (u64)delta <= ndp + 1;
}
+static inline u64 dccp_loss_count(const u64 s1, const u64 s2, const u64 ndp)
+{
+ s64 delta, count;
+
+ delta = dccp_delta_seqno(s1, s2);
+ WARN_ON(delta < 0);
+
+ count = ndp + 1;
+ count -= delta;
+
+ count = (count > 0)? count: 0;
+
+ return (u64) count;
+}
+
enum {
DCCP_MIB_NUM = 0,
DCCP_MIB_ACTIVEOPENS, /* ActiveOpens */
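
Since the loss count above is derived from a signed distance between 48-bit sequence numbers, a minimal user-space sketch of that circular arithmetic may help. The names seq48_delta() and seq48_hole() are hypothetical, and this is not the code from the patch; it only illustrates how a distance modulo 2^48 is turned into a signed value and how a hole size (the sequence numbers strictly between two received packets) follows from it.

```c
#include <stdint.h>

#define SEQ48_MASK 0xFFFFFFFFFFFFULL	/* DCCP sequence numbers are 48 bits */

/* Signed distance from s1 to s2 on the 48-bit sequence ring. */
static int64_t seq48_delta(uint64_t s1, uint64_t s2)
{
	uint64_t d = (s2 - s1) & SEQ48_MASK;

	/* Values in the upper half of the ring are negative distances. */
	if (d & 0x800000000000ULL)
		return -(int64_t)((~d + 1) & SEQ48_MASK);
	return (int64_t)d;
}

/* Number of sequence numbers strictly between s1 and s2 (0 if none). */
static uint64_t seq48_hole(uint64_t s1, uint64_t s2)
{
	int64_t delta = seq48_delta(s1, s2);

	return delta > 1 ? (uint64_t)(delta - 1) : 0;
}
```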
* [PATCH 3/5] Implement TFRC-SP calc of mean length of loss intervals accordingly to section 3 of RFC 4828
From: Ivo Calado @ 2009-09-02 2:45 UTC (permalink / raw)
To: dccp; +Cc: netdev
In-Reply-To: <cb00fa210909011735kb74904bsc34058b725f9f5e9@mail.gmail.com>
Implement TFRC-SP calc of mean length of loss intervals according to
section 3 of RFC 4828
Changes:
- Modify tfrc_sp_lh_calc_i_mean header, now receiving the current
ccval, so it can determine
if a loss interval is too recent
- Consider number of losses in each loss interval
- Only consider open loss interval if it is at least 2 rtt old
- Changes function signatures as necessary
Signed-off-by: Ivo Calado, Erivaldo Xavier, Leandro Sales
<ivocalado@embedded.ufcg.edu.br>, <desadoc@gmail.com>,
<leandroal@gmail.com>
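
For orientation, the calculation being modified is the weighted average of loss interval lengths from RFC 5348, section 5.4. The following is a minimal user-space sketch of that baseline only, without the TFRC-SP adjustments this patch adds. The names lh_weights and loss_interval_mean() are hypothetical, and the weights (1, 1, 1, 1, 0.8, 0.6, 0.4, 0.2) are scaled by 10 to keep the arithmetic in integers.

```c
#include <stdint.h>

/* Weights from RFC 5348, section 5.4, scaled by 10. */
static const uint32_t lh_weights[8] = { 10, 10, 10, 10, 8, 6, 4, 2 };

/*
 * intervals[0] is the most recent (possibly still open) loss interval.
 * n is the number of valid entries; the sketch handles 2 <= n <= 9 and
 * returns 0 for anything else.
 */
static uint32_t loss_interval_mean(const uint32_t *intervals, int n)
{
	uint32_t tot0 = 0, tot1 = 0, w_tot = 0;
	int i, k = n - 1;

	if (k <= 0 || k > 8)
		return 0;

	for (i = 0; i <= k; i++) {
		if (i < k) {	/* I_tot0 covers intervals 0..k-1 */
			tot0 += intervals[i] * lh_weights[i];
			w_tot += lh_weights[i];
		}
		if (i > 0)	/* I_tot1 covers intervals 1..k */
			tot1 += intervals[i] * lh_weights[i - 1];
	}

	/* Taking the maximum ignores a short, still-open newest interval. */
	return (tot0 > tot1 ? tot0 : tot1) / w_tot;
}
```

With intervals {10, 100, 100} the result is 100, because the short open interval only lowers I_tot0 and the maximum picks I_tot1; with {300, 100, 100} the long open interval raises the mean to 200.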
Index: b/net/dccp/ccids/lib/loss_interval_sp.c
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-26 23:28:27.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-26 23:53:32.000000000 -0300
@@ -66,10 +66,11 @@
}
}
-static void tfrc_sp_lh_calc_i_mean(struct tfrc_loss_hist *lh)
+static void tfrc_sp_lh_calc_i_mean(struct tfrc_loss_hist *lh, __u8 curr_ccval)
{
u32 i_i, i_tot0 = 0, i_tot1 = 0, w_tot = 0;
int i, k = tfrc_lh_length(lh) - 1; /* k is as in rfc3448bis, 5.4 */
+ u32 losses;
if (k <= 0)
return;
@@ -77,6 +78,14 @@
for (i = 0; i <= k; i++) {
i_i = tfrc_lh_get_interval(lh, i);
+ if (SUB16(curr_ccval, tfrc_lh_get_loss_interval(lh,i)->li_ccval) <= 8)
+ {
+ losses = tfrc_lh_get_loss_interval(lh,i)->li_losses;
+
+ if (losses > 0)
+ i_i = div64_u64(i_i, losses);
+ }
+
if (i < k) {
i_tot0 += i_i * tfrc_lh_weights[i];
w_tot += tfrc_lh_weights[i];
@@ -86,6 +95,12 @@
}
lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+ BUG_ON(w_tot == 0);
+ if (SUB16(curr_ccval, tfrc_lh_get_loss_interval(lh,0)->li_ccval) > 8) {
+ lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+ } else {
+ lh->i_mean = i_tot1 / w_tot;
+ }
}
/**
@@ -126,7 +141,7 @@
return;
cur->li_length = len;
- tfrc_sp_lh_calc_i_mean(lh);
+ tfrc_sp_lh_calc_i_mean(lh, dccp_hdr(skb)->dccph_ccval);
}
/* RFC 4342, 10.2: test for the existence of packet with sequence number S */
@@ -145,7 +160,7 @@
* Updates I_mean and returns 1 if a new interval has in fact been added to @lh.
*/
bool tfrc_sp_lh_interval_add(struct tfrc_loss_hist *lh, struct tfrc_rx_hist *rh,
- u32 (*calc_first_li)(struct sock *), struct sock *sk)
+ u32 (*calc_first_li)(struct sock *), struct sock *sk, __u8 ccval)
{
struct tfrc_loss_interval *cur = tfrc_lh_peek(lh);
struct tfrc_rx_hist_entry *cong_evt;
@@ -214,7 +229,7 @@
if (lh->counter > (2*LIH_SIZE))
lh->counter -= LIH_SIZE;
- tfrc_sp_lh_calc_i_mean(lh);
+ tfrc_sp_lh_calc_i_mean(lh, ccval);
}
return true;
Index: b/net/dccp/ccids/lib/loss_interval_sp.h
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-26 22:52:20.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-26 23:44:20.000000000 -0300
@@ -71,7 +71,7 @@
#endif
extern bool tfrc_sp_lh_interval_add(struct tfrc_loss_hist *, struct tfrc_rx_hist *,
- u32 (*first_li)(struct sock *), struct sock *);
+ u32 (*first_li)(struct sock *), struct sock *, __u8 ccval);
extern void tfrc_sp_lh_update_i_mean(struct tfrc_loss_hist *lh,
struct sk_buff *);
extern void tfrc_sp_lh_cleanup(struct tfrc_loss_hist *lh);
Index: b/net/dccp/ccids/lib/packet_history_sp.c
===================================================================
--- a/net/dccp/ccids/lib/packet_history_sp.c 2009-08-26 22:55:01.000000000 -0300
+++ b/net/dccp/ccids/lib/packet_history_sp.c 2009-08-26 23:49:59.000000000 -0300
@@ -359,7 +359,7 @@
/*
* Update Loss Interval database and recycle RX records
*/
- new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk);
+ new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk, dccp_hdr(skb)->dccph_ccval);
__three_after_loss(h);
} else if (dccp_data_packet(skb) && dccp_skb_is_ecn_ce(skb)) {
@@ -368,7 +368,7 @@
* the RFC considers ECN marks - a future implementation may
* find it useful to also check ECN marks on non-data packets.
*/
- new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk);
+ new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk, dccp_hdr(skb)->dccph_ccval);
/*
* Also combinations of loss and ECN-marks (as per the warning)
* are not supported. The permutations of loss combined with or
^ permalink raw reply
* [PATCH 4/5] Adds options DROPPED PACKETS and LOSS INTERVALS to receiver
From: Ivo Calado @ 2009-09-02 2:45 UTC (permalink / raw)
To: dccp; +Cc: netdev
In-Reply-To: <cb00fa210909011736w7fc7245cq22a04171f525ec8@mail.gmail.com>
Add the DROPPED PACKETS and LOSS INTERVALS options to the receiver. This
patch adds the mechanism that gathers information about loss intervals and
stores it for the later construction of these two options.
Changes:
- Add tfrc_loss_data and tfrc_loss_data_entry, structures that record
  loss interval information
- Add dccp_skb_is_ecn_ect0 and dccp_skb_is_ecn_ect1, so that ECN can be
  checked and used in the Loss Intervals option, which reports the ECN
  nonce sum
- Add tfrc_sp_update_li_data, which updates the information about loss intervals
- Add tfrc_sp_ld_prepare_data, which fills the option fields of
  tfrc_loss_data with the current values
- Add a field of type struct tfrc_loss_data to struct tfrc_hc_rx_sock
Signed-off-by: Ivo Calado, Erivaldo Xavier, Leandro Sales
<ivocalado@embedded.ufcg.edu.br>, <desadoc@gmail.com>,
<leandroal@gmail.com>
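The option buffer in this patch reserves 9 bytes per loss interval, that is, three 24-bit fields. A byte-wise way of writing such fields is sketched below as an illustration only. put_be24(), struct li_entry and pack_loss_interval() are hypothetical names, and placing the ECN nonce bit in the top bit of the second field follows my reading of the Loss Intervals option layout in RFC 4342, section 8.6.1, so please check it against the RFC.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low 24 bits of v at p in network (big-endian) byte order. */
static void put_be24(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)((v >> 16) & 0xFF);
	p[1] = (uint8_t)((v >> 8) & 0xFF);
	p[2] = (uint8_t)(v & 0xFF);
}

/* Hypothetical userspace counterpart of tfrc_loss_data_entry. */
struct li_entry {
	uint32_t lossless_length;  /* 24 bits used */
	uint32_t loss_length;      /* 23 bits used */
	uint32_t data_length;      /* 24 bits used */
	uint8_t  ecn_nonce_sum;    /* 1 bit used */
};

/*
 * Serialise one loss interval into 9 bytes:
 *   bytes 0..2  lossless length
 *   bytes 3..5  ECN nonce bit (top bit) combined with the loss length
 *   bytes 6..8  data length
 * Returns the number of bytes written.
 */
static size_t pack_loss_interval(uint8_t *out, const struct li_entry *e)
{
	uint32_t e_and_loss = ((uint32_t)(e->ecn_nonce_sum & 0x1) << 23) |
			      (e->loss_length & 0x7FFFFF);

	put_be24(out, e->lossless_length & 0xFFFFFF);
	put_be24(out + 3, e_and_loss);
	put_be24(out + 6, e->data_length & 0xFFFFFF);
	return 9;
}
```

Writing byte by byte avoids both the htonl() shifting and the store of a 32-bit expression through a u8 pointer, and the nonce bit is combined with the loss length using OR rather than AND.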
Index: b/net/dccp/ccids/lib/packet_history_sp.c
===================================================================
--- a/net/dccp/ccids/lib/packet_history_sp.c 2009-08-26 23:49:59.000000000 -0300
+++ b/net/dccp/ccids/lib/packet_history_sp.c 2009-08-27 22:36:43.000000000 -0300
@@ -339,10 +339,12 @@
*/
bool tfrc_sp_rx_congestion_event(struct tfrc_rx_hist *h,
struct tfrc_loss_hist *lh,
- struct sk_buff *skb, const u64 ndp,
- u32 (*first_li)(struct sock *), struct sock *sk)
+ struct tfrc_loss_data *ld,
+ struct sk_buff *skb, const u64 ndp,
+ u32 (*first_li)(struct sock *), struct sock *sk)
{
bool new_event = false;
+ bool new_loss = false;
if (tfrc_sp_rx_hist_duplicate(h, skb))
return 0;
@@ -355,11 +357,12 @@
__one_after_loss(h, skb, ndp);
} else if (h->loss_count != 2) {
DCCP_BUG("invalid loss_count %d", h->loss_count);
- } else if (__two_after_loss(h, skb, ndp)) {
+ } else if ((new_loss = __two_after_loss(h, skb, ndp))) {
/*
* Update Loss Interval database and recycle RX records
*/
new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk, dccp_hdr(skb)->dccph_ccval);
+ tfrc_sp_update_li_data(ld, h, skb, new_loss, new_event);
__three_after_loss(h);
} else if (dccp_data_packet(skb) && dccp_skb_is_ecn_ce(skb)) {
@@ -384,6 +387,8 @@
}
}
+ tfrc_sp_update_li_data(ld, h, skb, new_loss, new_event);
+
/*
* Update moving-average of `s' and the sum of received payload bytes.
*/
Index: b/net/dccp/ccids/lib/loss_interval_sp.c
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-26 23:53:32.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.c 2009-08-27 22:36:43.000000000 -0300
@@ -14,9 +14,89 @@
#include "tfrc_sp.h"
static struct kmem_cache *tfrc_lh_slab __read_mostly;
+static struct kmem_cache *tfrc_ld_slab __read_mostly;
+
/* Loss Interval weights from [RFC 3448, 5.4], scaled by 10 */
static const int tfrc_lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };
+/*
+ * Allocation routine for new entries of loss interval data
+ */
+static struct tfrc_loss_data_entry* tfrc_ld_add_new(struct tfrc_loss_data* ld)
+{
+ struct tfrc_loss_data_entry* new = kmem_cache_alloc(tfrc_ld_slab, GFP_ATOMIC);
+
+ if(new == NULL)
+ return NULL;
+
+ memset(new, 0, sizeof(struct tfrc_loss_data_entry));
+
+ new->next = ld->head;
+ ld->head = new;
+ ld->counter++;
+
+ return new;
+}
+
+void tfrc_sp_ld_cleanup(struct tfrc_loss_data *ld)
+{
+ struct tfrc_loss_data_entry *next, *h = ld->head;
+
+ if(!h)
+ return;
+
+ while(h)
+ {
+ next = h->next;
+ kmem_cache_free(tfrc_ld_slab, h);
+ h = next;
+ }
+
+ ld->head = NULL;
+ ld->counter = 0;
+}
+
+void tfrc_sp_ld_prepare_data(u8 loss_count, struct tfrc_loss_data* ld)
+{
+ u8* li_ofs, *d_ofs;
+ struct tfrc_loss_data_entry* e;
+ u16 count;
+
+ li_ofs = &ld->loss_intervals_opts[0];
+ d_ofs = &ld->drop_opts[0];
+
+ count = 0;
+ e = ld->head;
+
+ *li_ofs = loss_count + 1;
+ li_ofs++;
+
+ while (e != NULL) {
+
+ if(count<TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH)
+ {
+ *li_ofs = ((htonl(e->lossless_length)&0x00FFFFFF)<<8);
+ li_ofs += 3;
+ *li_ofs = ((e->ecn_nonce_sum&0x1)<<31)&(htonl((e->loss_length&0x00FFFFFF))<<8);
+ li_ofs += 3;
+ *li_ofs = ((htonl(e->data_length)&0x00FFFFFF)<<8);
+ li_ofs += 3;
+ }
+
+ if(count<TFRC_DROP_OPT_MAX_LENGTH)
+ {
+ *d_ofs = (htonl(e->drop_count)&0x00FFFFFF)<<8;
+ d_ofs += 3;
+ }
+
+ if((count>=TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH)&&(count>=TFRC_DROP_OPT_MAX_LENGTH))
+ break;
+
+ count++;
+ e = e->next;
+ }
+}
+
/* implements LIFO semantics on the array */
static inline u8 LIH_INDEX(const u8 ctr)
{
@@ -235,13 +315,166 @@
return true;
}
+void tfrc_sp_update_li_data(struct tfrc_loss_data *ld, struct tfrc_rx_hist *rh, struct sk_buff *skb, bool new_loss, bool new_event)
+{
+ struct tfrc_loss_data_entry* new, *h;
+
+ if(!dccp_data_packet(skb))
+ return;
+
+ if (ld->head == NULL)
+ {
+ new = tfrc_ld_add_new(ld);
+ if (unlikely(new == NULL)) {
+ DCCP_CRIT("Cannot allocate new loss data registry.");
+ return;
+ }
+
+ if (new_loss)
+ {
+ new->drop_count = rh->num_losses;
+ new->lossless_length = 1;
+ new->loss_length = rh->num_losses;
+
+ if (dccp_data_packet(skb))
+ new->data_length = 1;
+
+ if(dccp_data_packet(skb) && dccp_skb_is_ecn_ect1(skb))
+ new->ecn_nonce_sum = 1;
+ else
+ new->ecn_nonce_sum = 0;
+ }
+ else
+ {
+ new->drop_count = 0;
+ new->lossless_length = 1;
+ new->loss_length = 0;
+
+ if (dccp_data_packet(skb))
+ new->data_length = 1;
+
+ if(dccp_data_packet(skb) && dccp_skb_is_ecn_ect1(skb))
+ new->ecn_nonce_sum = 1;
+ else
+ new->ecn_nonce_sum = 0;
+ }
+
+ return;
+ }
+
+ if (new_event)
+ {
+ new = tfrc_ld_add_new(ld);
+ if (unlikely(new == NULL)) {
+ DCCP_CRIT("Cannot allocate new loss data registry. Cleaning up.");
+ tfrc_sp_ld_cleanup(ld);
+ return;
+ }
+
+ new->drop_count = rh->num_losses;
+ new->lossless_length = (ld->last_loss_count - rh->loss_count);
+ new->loss_length = rh->num_losses;
+
+ new->ecn_nonce_sum = 0;
+ new->data_length = 0;
+
+ while (ld->last_loss_count > rh->loss_count)
+ {
+ ld->last_loss_count--;
+
+ if (ld->sto_is_data&(1 << (ld->last_loss_count)))
+ {
+ new->data_length++;
+
+ if (ld->sto_ecn&(1 << (ld->last_loss_count)))
+ new->ecn_nonce_sum = !new->ecn_nonce_sum;
+ }
+ }
+
+ return;
+ }
+
+ h = ld->head;
+
+ if (rh->loss_count > ld->last_loss_count)
+ {
+ ld->last_loss_count = rh->loss_count;
+
+ if (dccp_data_packet(skb))
+ ld->sto_is_data |= (1 << (ld->last_loss_count - 1));
+
+ if (dccp_skb_is_ecn_ect1(skb))
+ ld->sto_ecn |= (1 << (ld->last_loss_count - 1));
+
+ return;
+ }
+
+ if (new_loss)
+ {
+ h->drop_count += rh->num_losses;
+ h->lossless_length = (ld->last_loss_count - rh->loss_count);
+ h->loss_length += h->lossless_length + rh->num_losses;
+
+ h->ecn_nonce_sum = 0;
+ h->data_length = 0;
+
+ while (ld->last_loss_count > rh->loss_count)
+ {
+ ld->last_loss_count--;
+
+ if (ld->sto_is_data&(1 << (ld->last_loss_count)))
+ {
+ h->data_length++;
+
+ if (ld->sto_ecn&(1 << (ld->last_loss_count)))
+ h->ecn_nonce_sum = !h->ecn_nonce_sum;
+ }
+ }
+
+ return;
+ }
+
+ if (ld->last_loss_count > rh->loss_count)
+ {
+ while (ld->last_loss_count > rh->loss_count)
+ {
+ ld->last_loss_count--;
+
+ h->lossless_length++;
+
+ if (ld->sto_is_data&(1 << (ld->last_loss_count)))
+ {
+ h->data_length++;
+
+ if (ld->sto_ecn&(1 << (ld->last_loss_count)))
+ h->ecn_nonce_sum = !h->ecn_nonce_sum;
+ }
+ }
+
+ return;
+ }
+
+ h->lossless_length++;
+
+ if(dccp_data_packet(skb))
+ {
+ h->data_length++;
+
+ if (dccp_skb_is_ecn_ect1(skb))
+ h->ecn_nonce_sum = !h->ecn_nonce_sum;
+ }
+}
+
int __init tfrc_sp_li_init(void)
{
tfrc_lh_slab = kmem_cache_create("tfrc_sp_li_hist",
sizeof(struct tfrc_loss_interval), 0,
SLAB_HWCACHE_ALIGN, NULL);
+ tfrc_ld_slab = kmem_cache_create("tfrc_sp_li_data",
+ sizeof(struct tfrc_loss_data_entry), 0,
+ SLAB_HWCACHE_ALIGN, NULL);
- if((tfrc_lh_slab != NULL))
+ if((tfrc_lh_slab != NULL)&&(tfrc_ld_slab != NULL))
return 0;
if(tfrc_lh_slab != NULL)
@@ -250,6 +483,12 @@
tfrc_lh_slab = NULL;
}
+ if(tfrc_ld_slab != NULL)
+ {
+ kmem_cache_destroy(tfrc_ld_slab);
+ tfrc_ld_slab = NULL;
+ }
+
return -ENOBUFS;
}
@@ -259,4 +498,9 @@
kmem_cache_destroy(tfrc_lh_slab);
tfrc_lh_slab = NULL;
}
+
+ if (tfrc_ld_slab != NULL) {
+ kmem_cache_destroy(tfrc_ld_slab);
+ tfrc_ld_slab = NULL;
+ }
}
Index: b/net/dccp/ccids/lib/loss_interval_sp.h
===================================================================
--- a/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-26 23:44:20.000000000 -0300
+++ b/net/dccp/ccids/lib/loss_interval_sp.h 2009-08-27 22:37:40.000000000 -0300
@@ -67,12 +67,44 @@
return min(lh->counter, (u8)LIH_SIZE);
}
-struct tfrc_rx_hist;
#endif
+struct tfrc_loss_data_entry {
+ struct tfrc_loss_data_entry *next;
+ u32 lossless_length:24;
+ u8 ecn_nonce_sum:1;
+ u32 loss_length:24;
+ u32 data_length:24;
+ u32 drop_count:24;
+};
+
+#define TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH 28
+#define TFRC_DROP_OPT_MAX_LENGTH 84
+
+struct tfrc_loss_data {
+ struct tfrc_loss_data_entry *head;
+ u16 counter;
+ u8 loss_intervals_opts[2 + TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH*9];
+ u8 drop_opts[1 + TFRC_DROP_OPT_MAX_LENGTH*3];
+ u8 last_loss_count;
+ u8 sto_ecn;
+ u8 sto_is_data;
+};
+
+static inline void tfrc_ld_init(struct tfrc_loss_data* ld)
+{
+ memset(ld, 0, sizeof(struct tfrc_loss_data));
+}
+
+struct tfrc_rx_hist;
+
extern bool tfrc_sp_lh_interval_add(struct tfrc_loss_hist *, struct tfrc_rx_hist *,
u32 (*first_li)(struct sock *), struct sock *, __u8 ccval);
+extern void tfrc_sp_update_li_data(struct tfrc_loss_data *, struct tfrc_rx_hist *,
+ struct sk_buff *, bool new_loss, bool new_event);
extern void tfrc_sp_lh_update_i_mean(struct tfrc_loss_hist *lh,
struct sk_buff *);
extern void tfrc_sp_lh_cleanup(struct tfrc_loss_hist *lh);
+extern void tfrc_sp_ld_cleanup(struct tfrc_loss_data *ld);
+extern void tfrc_sp_ld_prepare_data(u8 loss_count, struct tfrc_loss_data* ld);
#endif /* _DCCP_LI_HIST_SP_ */
Index: b/net/dccp/ccids/lib/tfrc_ccids_sp.h
===================================================================
--- a/net/dccp/ccids/lib/tfrc_ccids_sp.h 2009-08-27 00:50:46.000000000 -0300
+++ b/net/dccp/ccids/lib/tfrc_ccids_sp.h 2009-08-27 22:36:43.000000000 -0300
@@ -128,6 +128,7 @@
* @tstamp_last_feedback - Time at which last feedback was sent
* @hist - Packet history (loss detection + RTT sampling)
* @li_hist - Loss Interval database
+ * @li_data - Loss Interval data for options
* @p_inverse - Inverse of Loss Event Rate (RFC 4342, sec. 8.5)
*/
struct tfrc_hc_rx_sock {
@@ -137,6 +138,7 @@
ktime_t tstamp_last_feedback;
struct tfrc_rx_hist hist;
struct tfrc_loss_hist li_hist;
+ struct tfrc_loss_data li_data;
#define p_inverse li_hist.i_mean
};
Index: b/net/dccp/ccids/lib/packet_history_sp.h
===================================================================
--- a/net/dccp/ccids/lib/packet_history_sp.h 2009-08-26 22:55:58.000000000 -0300
+++ b/net/dccp/ccids/lib/packet_history_sp.h 2009-08-27 22:36:43.000000000 -0300
@@ -200,6 +200,7 @@
extern bool tfrc_sp_rx_congestion_event(struct tfrc_rx_hist *h,
struct tfrc_loss_hist *lh,
+ struct tfrc_loss_data *ld,
struct sk_buff *skb, const u64 ndp,
u32 (*first_li)(struct sock *sk),
struct sock *sk);
Index: b/net/dccp/dccp.h
===================================================================
--- a/net/dccp/dccp.h 2009-08-26 22:59:10.000000000 -0300
+++ b/net/dccp/dccp.h 2009-08-27 22:36:43.000000000 -0300
@@ -403,6 +403,16 @@
return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_CE;
}
+static inline bool dccp_skb_is_ecn_ect0(const struct sk_buff *skb)
+{
+ return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_ECT_0;
+}
+
+static inline bool dccp_skb_is_ecn_ect1(const struct sk_buff *skb)
+{
+ return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_ECT_1;
+}
+
/* RFC 4340, sec. 7.7 */
static inline int dccp_non_data_packet(const struct sk_buff *skb)
{
^ permalink raw reply