* [PATCH] ipvs: A couple of fixes and cleanups
@ 2008-08-10 11:07 sven.wegener
2008-08-10 11:07 ` [PATCH 1/9] ipvs: Fix possible deadlock in sync code sven.wegener
` (9 more replies)
0 siblings, 10 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
Hi guys,
here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
possible deadlock fixes. One introduced by my last sync daemon work, which
hasn't hit any stable kernel yet. The other one is in the estimator code and
goes back to at leat since we started working with git for the kernel. The
latter I think qualifies for -stable.
I've pushed the changes (8123b42..2e45552) based on davem's net tree here
git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
Sven Wegener (9):
ipvs: Fix possible deadlock in sync code
ipvs: Fix possible deadlock in estimator code
ipvs: Use ARRAY_SIZE()
ipvs: Use list_empty() instead of open-coding the same functionality
ipvs: Initialize schedulers' struct list_head at compile time
ipvs: Annotate init functions with __init
ipvs: Mark net_vs_ctl_path const
ipvs: Embed estimator object into stats object
ipvs: No need to zero out ip_vs_stats during initialization
include/net/ip_vs.h | 26 ++++++++--
net/ipv4/ipvs/ip_vs_app.c | 2 +-
net/ipv4/ipvs/ip_vs_conn.c | 2 +-
net/ipv4/ipvs/ip_vs_ctl.c | 12 ++--
net/ipv4/ipvs/ip_vs_dh.c | 2 +-
net/ipv4/ipvs/ip_vs_est.c | 116 ++++++++++++++----------------------------
net/ipv4/ipvs/ip_vs_lblc.c | 2 +-
net/ipv4/ipvs/ip_vs_lblcr.c | 2 +-
net/ipv4/ipvs/ip_vs_lc.c | 2 +-
net/ipv4/ipvs/ip_vs_nq.c | 2 +-
net/ipv4/ipvs/ip_vs_proto.c | 2 +-
net/ipv4/ipvs/ip_vs_rr.c | 2 +-
net/ipv4/ipvs/ip_vs_sched.c | 4 +-
net/ipv4/ipvs/ip_vs_sed.c | 2 +-
net/ipv4/ipvs/ip_vs_sh.c | 2 +-
net/ipv4/ipvs/ip_vs_sync.c | 4 +-
net/ipv4/ipvs/ip_vs_wlc.c | 2 +-
net/ipv4/ipvs/ip_vs_wrr.c | 2 +-
18 files changed, 84 insertions(+), 104 deletions(-)
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/9] ipvs: Fix possible deadlock in sync code
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 11:07 ` [PATCH 2/9] ipvs: Fix possible deadlock in estimator code sven.wegener
` (8 subsequent siblings)
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
Commit 998e7a76804b7a273a0460c2cdd5a51fa9856717 ("ipvs: Use kthread_run()
instead of doing a double-fork via kernel_thread()") introduced a possible
deadlock in the sync code. We need to use the _bh versions for the lock, as the
lock is also accessed from a bottom half.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_sync.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
index 45e9bd9..a652da2 100644
--- a/net/ipv4/ipvs/ip_vs_sync.c
+++ b/net/ipv4/ipvs/ip_vs_sync.c
@@ -904,9 +904,9 @@ int stop_sync_thread(int state)
* progress of stopping the master sync daemon.
*/
- spin_lock(&ip_vs_sync_lock);
+ spin_lock_bh(&ip_vs_sync_lock);
ip_vs_sync_state &= ~IP_VS_STATE_MASTER;
- spin_unlock(&ip_vs_sync_lock);
+ spin_unlock_bh(&ip_vs_sync_lock);
kthread_stop(sync_master_thread);
sync_master_thread = NULL;
} else if (state == IP_VS_STATE_BACKUP) {
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/9] ipvs: Fix possible deadlock in estimator code
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
2008-08-10 11:07 ` [PATCH 1/9] ipvs: Fix possible deadlock in sync code sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 13:58 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 3/9] ipvs: Use ARRAY_SIZE() sven.wegener
` (7 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
There is a slight chance for a deadlock in the estimator code. We can't call
del_timer_sync() while holding our lock, as the timer might be active and
spinning for the lock on another cpu. Work around this issue by using
try_to_del_timer_sync() and releasing the lock. We could actually delete the
timer outside of our lock, as the add and kill functions are only every called
from userspace via [gs]etsockopt() and are serialized by a mutex, but better
make this explicit.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: stable <stable@kernel.org>
---
net/ipv4/ipvs/ip_vs_est.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_est.c b/net/ipv4/ipvs/ip_vs_est.c
index bc04eed..1d6e58e 100644
--- a/net/ipv4/ipvs/ip_vs_est.c
+++ b/net/ipv4/ipvs/ip_vs_est.c
@@ -170,8 +170,11 @@ void ip_vs_kill_estimator(struct ip_vs_stats *stats)
kfree(est);
killed++;
}
- if (killed && est_list == NULL)
- del_timer_sync(&est_timer);
+ while (killed && !est_list && try_to_del_timer_sync(&est_timer) < 0) {
+ write_unlock_bh(&est_lock);
+ cpu_relax();
+ write_lock_bh(&est_lock);
+ }
write_unlock_bh(&est_lock);
}
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/9] ipvs: Use ARRAY_SIZE()
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
2008-08-10 11:07 ` [PATCH 1/9] ipvs: Fix possible deadlock in sync code sven.wegener
2008-08-10 11:07 ` [PATCH 2/9] ipvs: Fix possible deadlock in estimator code sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 11:07 ` [PATCH 4/9] ipvs: Use list_empty() instead of open-coding the same functionality sven.wegener
` (6 subsequent siblings)
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
include/net/ip_vs.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index cbb59eb..e980416 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -440,7 +440,7 @@ struct ip_vs_app
*/
extern const char *ip_vs_proto_name(unsigned proto);
extern void ip_vs_init_hash_table(struct list_head *table, int rows);
-#define IP_VS_INIT_HASH_TABLE(t) ip_vs_init_hash_table(t, sizeof(t)/sizeof(t[0]))
+#define IP_VS_INIT_HASH_TABLE(t) ip_vs_init_hash_table((t), ARRAY_SIZE((t)))
#define IP_VS_APP_TYPE_FTP 1
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 4/9] ipvs: Use list_empty() instead of open-coding the same functionality
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (2 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 3/9] ipvs: Use ARRAY_SIZE() sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 11:07 ` [PATCH 5/9] ipvs: Initialize schedulers' struct list_head at compile time sven.wegener
` (5 subsequent siblings)
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_sched.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_sched.c b/net/ipv4/ipvs/ip_vs_sched.c
index b647673..a46ad9e 100644
--- a/net/ipv4/ipvs/ip_vs_sched.c
+++ b/net/ipv4/ipvs/ip_vs_sched.c
@@ -184,7 +184,7 @@ int register_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
write_lock_bh(&__ip_vs_sched_lock);
- if (scheduler->n_list.next != &scheduler->n_list) {
+ if (!list_empty(&scheduler->n_list)) {
write_unlock_bh(&__ip_vs_sched_lock);
ip_vs_use_count_dec();
IP_VS_ERR("register_ip_vs_scheduler(): [%s] scheduler "
@@ -229,7 +229,7 @@ int unregister_ip_vs_scheduler(struct ip_vs_scheduler *scheduler)
}
write_lock_bh(&__ip_vs_sched_lock);
- if (scheduler->n_list.next == &scheduler->n_list) {
+ if (list_empty(&scheduler->n_list)) {
write_unlock_bh(&__ip_vs_sched_lock);
IP_VS_ERR("unregister_ip_vs_scheduler(): [%s] scheduler "
"is not in the list. failed\n", scheduler->name);
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 5/9] ipvs: Initialize schedulers' struct list_head at compile time
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (3 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 4/9] ipvs: Use list_empty() instead of open-coding the same functionality sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 11:07 ` [PATCH 6/9] ipvs: Annotate init functions with __init sven.wegener
` (4 subsequent siblings)
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
No need to do it at runtime and this saves a couple of bytes in the text
section.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_dh.c | 2 +-
net/ipv4/ipvs/ip_vs_lblc.c | 2 +-
net/ipv4/ipvs/ip_vs_lblcr.c | 2 +-
net/ipv4/ipvs/ip_vs_lc.c | 2 +-
net/ipv4/ipvs/ip_vs_nq.c | 2 +-
net/ipv4/ipvs/ip_vs_rr.c | 2 +-
net/ipv4/ipvs/ip_vs_sed.c | 2 +-
net/ipv4/ipvs/ip_vs_sh.c | 2 +-
net/ipv4/ipvs/ip_vs_wlc.c | 2 +-
net/ipv4/ipvs/ip_vs_wrr.c | 2 +-
10 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_dh.c b/net/ipv4/ipvs/ip_vs_dh.c
index 8afc150..fa66824 100644
--- a/net/ipv4/ipvs/ip_vs_dh.c
+++ b/net/ipv4/ipvs/ip_vs_dh.c
@@ -233,6 +233,7 @@ static struct ip_vs_scheduler ip_vs_dh_scheduler =
.name = "dh",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_dh_scheduler.n_list),
.init_service = ip_vs_dh_init_svc,
.done_service = ip_vs_dh_done_svc,
.update_service = ip_vs_dh_update_svc,
@@ -242,7 +243,6 @@ static struct ip_vs_scheduler ip_vs_dh_scheduler =
static int __init ip_vs_dh_init(void)
{
- INIT_LIST_HEAD(&ip_vs_dh_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_dh_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_lblc.c b/net/ipv4/ipvs/ip_vs_lblc.c
index 0efa3db..7a6a319 100644
--- a/net/ipv4/ipvs/ip_vs_lblc.c
+++ b/net/ipv4/ipvs/ip_vs_lblc.c
@@ -539,6 +539,7 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler =
.name = "lblc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_lblc_scheduler.n_list),
.init_service = ip_vs_lblc_init_svc,
.done_service = ip_vs_lblc_done_svc,
.update_service = ip_vs_lblc_update_svc,
@@ -550,7 +551,6 @@ static int __init ip_vs_lblc_init(void)
{
int ret;
- INIT_LIST_HEAD(&ip_vs_lblc_scheduler.n_list);
sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
ret = register_ip_vs_scheduler(&ip_vs_lblc_scheduler);
if (ret)
diff --git a/net/ipv4/ipvs/ip_vs_lblcr.c b/net/ipv4/ipvs/ip_vs_lblcr.c
index 8e3bbeb..c234e73 100644
--- a/net/ipv4/ipvs/ip_vs_lblcr.c
+++ b/net/ipv4/ipvs/ip_vs_lblcr.c
@@ -728,6 +728,7 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
.name = "lblcr",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_lblcr_scheduler.n_list),
.init_service = ip_vs_lblcr_init_svc,
.done_service = ip_vs_lblcr_done_svc,
.update_service = ip_vs_lblcr_update_svc,
@@ -739,7 +740,6 @@ static int __init ip_vs_lblcr_init(void)
{
int ret;
- INIT_LIST_HEAD(&ip_vs_lblcr_scheduler.n_list);
sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
ret = register_ip_vs_scheduler(&ip_vs_lblcr_scheduler);
if (ret)
diff --git a/net/ipv4/ipvs/ip_vs_lc.c b/net/ipv4/ipvs/ip_vs_lc.c
index ac9f08e..ebcdbf7 100644
--- a/net/ipv4/ipvs/ip_vs_lc.c
+++ b/net/ipv4/ipvs/ip_vs_lc.c
@@ -98,6 +98,7 @@ static struct ip_vs_scheduler ip_vs_lc_scheduler = {
.name = "lc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_lc_scheduler.n_list),
.init_service = ip_vs_lc_init_svc,
.done_service = ip_vs_lc_done_svc,
.update_service = ip_vs_lc_update_svc,
@@ -107,7 +108,6 @@ static struct ip_vs_scheduler ip_vs_lc_scheduler = {
static int __init ip_vs_lc_init(void)
{
- INIT_LIST_HEAD(&ip_vs_lc_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_lc_scheduler) ;
}
diff --git a/net/ipv4/ipvs/ip_vs_nq.c b/net/ipv4/ipvs/ip_vs_nq.c
index a46bf25..92f3a67 100644
--- a/net/ipv4/ipvs/ip_vs_nq.c
+++ b/net/ipv4/ipvs/ip_vs_nq.c
@@ -136,6 +136,7 @@ static struct ip_vs_scheduler ip_vs_nq_scheduler =
.name = "nq",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_nq_scheduler.n_list),
.init_service = ip_vs_nq_init_svc,
.done_service = ip_vs_nq_done_svc,
.update_service = ip_vs_nq_update_svc,
@@ -145,7 +146,6 @@ static struct ip_vs_scheduler ip_vs_nq_scheduler =
static int __init ip_vs_nq_init(void)
{
- INIT_LIST_HEAD(&ip_vs_nq_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_nq_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_rr.c b/net/ipv4/ipvs/ip_vs_rr.c
index c8db12d..358110d 100644
--- a/net/ipv4/ipvs/ip_vs_rr.c
+++ b/net/ipv4/ipvs/ip_vs_rr.c
@@ -94,6 +94,7 @@ static struct ip_vs_scheduler ip_vs_rr_scheduler = {
.name = "rr", /* name */
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_rr_scheduler.n_list),
.init_service = ip_vs_rr_init_svc,
.done_service = ip_vs_rr_done_svc,
.update_service = ip_vs_rr_update_svc,
@@ -102,7 +103,6 @@ static struct ip_vs_scheduler ip_vs_rr_scheduler = {
static int __init ip_vs_rr_init(void)
{
- INIT_LIST_HEAD(&ip_vs_rr_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_rr_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_sed.c b/net/ipv4/ipvs/ip_vs_sed.c
index 2a7d313..77663d8 100644
--- a/net/ipv4/ipvs/ip_vs_sed.c
+++ b/net/ipv4/ipvs/ip_vs_sed.c
@@ -138,6 +138,7 @@ static struct ip_vs_scheduler ip_vs_sed_scheduler =
.name = "sed",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_sed_scheduler.n_list),
.init_service = ip_vs_sed_init_svc,
.done_service = ip_vs_sed_done_svc,
.update_service = ip_vs_sed_update_svc,
@@ -147,7 +148,6 @@ static struct ip_vs_scheduler ip_vs_sed_scheduler =
static int __init ip_vs_sed_init(void)
{
- INIT_LIST_HEAD(&ip_vs_sed_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_sed_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_sh.c b/net/ipv4/ipvs/ip_vs_sh.c
index b8fdfac..7b979e2 100644
--- a/net/ipv4/ipvs/ip_vs_sh.c
+++ b/net/ipv4/ipvs/ip_vs_sh.c
@@ -230,6 +230,7 @@ static struct ip_vs_scheduler ip_vs_sh_scheduler =
.name = "sh",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_sh_scheduler.n_list),
.init_service = ip_vs_sh_init_svc,
.done_service = ip_vs_sh_done_svc,
.update_service = ip_vs_sh_update_svc,
@@ -239,7 +240,6 @@ static struct ip_vs_scheduler ip_vs_sh_scheduler =
static int __init ip_vs_sh_init(void)
{
- INIT_LIST_HEAD(&ip_vs_sh_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_sh_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_wlc.c b/net/ipv4/ipvs/ip_vs_wlc.c
index 772c3cb..9b0ef86 100644
--- a/net/ipv4/ipvs/ip_vs_wlc.c
+++ b/net/ipv4/ipvs/ip_vs_wlc.c
@@ -126,6 +126,7 @@ static struct ip_vs_scheduler ip_vs_wlc_scheduler =
.name = "wlc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_wlc_scheduler.n_list),
.init_service = ip_vs_wlc_init_svc,
.done_service = ip_vs_wlc_done_svc,
.update_service = ip_vs_wlc_update_svc,
@@ -135,7 +136,6 @@ static struct ip_vs_scheduler ip_vs_wlc_scheduler =
static int __init ip_vs_wlc_init(void)
{
- INIT_LIST_HEAD(&ip_vs_wlc_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_wlc_scheduler);
}
diff --git a/net/ipv4/ipvs/ip_vs_wrr.c b/net/ipv4/ipvs/ip_vs_wrr.c
index 1d6932d..0d86a79 100644
--- a/net/ipv4/ipvs/ip_vs_wrr.c
+++ b/net/ipv4/ipvs/ip_vs_wrr.c
@@ -212,6 +212,7 @@ static struct ip_vs_scheduler ip_vs_wrr_scheduler = {
.name = "wrr",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+ .n_list = LIST_HEAD_INIT(ip_vs_wrr_scheduler.n_list),
.init_service = ip_vs_wrr_init_svc,
.done_service = ip_vs_wrr_done_svc,
.update_service = ip_vs_wrr_update_svc,
@@ -220,7 +221,6 @@ static struct ip_vs_scheduler ip_vs_wrr_scheduler = {
static int __init ip_vs_wrr_init(void)
{
- INIT_LIST_HEAD(&ip_vs_wrr_scheduler.n_list);
return register_ip_vs_scheduler(&ip_vs_wrr_scheduler) ;
}
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 6/9] ipvs: Annotate init functions with __init
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (4 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 5/9] ipvs: Initialize schedulers' struct list_head at compile time sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 18:32 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 7/9] ipvs: Mark net_vs_ctl_path const sven.wegener
` (3 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
Being able to discard these functions saves a couple of bytes at runtime. The
cleanup functions can't be annotated with __exit as they are also called from
init functions.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_app.c | 2 +-
net/ipv4/ipvs/ip_vs_conn.c | 2 +-
net/ipv4/ipvs/ip_vs_ctl.c | 2 +-
net/ipv4/ipvs/ip_vs_proto.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_app.c b/net/ipv4/ipvs/ip_vs_app.c
index 1f1897a..201b8ea 100644
--- a/net/ipv4/ipvs/ip_vs_app.c
+++ b/net/ipv4/ipvs/ip_vs_app.c
@@ -608,7 +608,7 @@ int ip_vs_skb_replace(struct sk_buff *skb, gfp_t pri,
}
-int ip_vs_app_init(void)
+int __init ip_vs_app_init(void)
{
/* we will replace it with proc_net_ipvs_create() soon */
proc_net_fops_create(&init_net, "ip_vs_app", 0, &ip_vs_app_fops);
diff --git a/net/ipv4/ipvs/ip_vs_conn.c b/net/ipv4/ipvs/ip_vs_conn.c
index f8bdae4..44a6872 100644
--- a/net/ipv4/ipvs/ip_vs_conn.c
+++ b/net/ipv4/ipvs/ip_vs_conn.c
@@ -965,7 +965,7 @@ static void ip_vs_conn_flush(void)
}
-int ip_vs_conn_init(void)
+int __init ip_vs_conn_init(void)
{
int idx;
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 9a5ace0..df13333 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2306,7 +2306,7 @@ static struct nf_sockopt_ops ip_vs_sockopts = {
};
-int ip_vs_control_init(void)
+int __init ip_vs_control_init(void)
{
int ret;
int idx;
diff --git a/net/ipv4/ipvs/ip_vs_proto.c b/net/ipv4/ipvs/ip_vs_proto.c
index 876714f..6c93ffa 100644
--- a/net/ipv4/ipvs/ip_vs_proto.c
+++ b/net/ipv4/ipvs/ip_vs_proto.c
@@ -190,7 +190,7 @@ ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
}
-int ip_vs_protocol_init(void)
+int __init ip_vs_protocol_init(void)
{
char protocols[64];
#define REGISTER_PROTOCOL(p) \
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 7/9] ipvs: Mark net_vs_ctl_path const
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (5 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 6/9] ipvs: Annotate init functions with __init sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 11:07 ` [PATCH 8/9] ipvs: Embed estimator object into stats object sven.wegener
` (2 subsequent siblings)
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
include/net/ip_vs.h | 2 +-
net/ipv4/ipvs/ip_vs_ctl.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index e980416..c8ee9b8 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -620,7 +620,7 @@ extern int sysctl_ip_vs_expire_quiescent_template;
extern int sysctl_ip_vs_sync_threshold[2];
extern int sysctl_ip_vs_nat_icmp_send;
extern struct ip_vs_stats ip_vs_stats;
-extern struct ctl_path net_vs_ctl_path[];
+extern const struct ctl_path net_vs_ctl_path[];
extern struct ip_vs_service *
ip_vs_service_get(__u32 fwmark, __u16 protocol, __be32 vaddr, __be16 vport);
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index df13333..999d884 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -1589,7 +1589,7 @@ static struct ctl_table vs_vars[] = {
{ .ctl_name = 0 }
};
-struct ctl_path net_vs_ctl_path[] = {
+const struct ctl_path net_vs_ctl_path[] = {
{ .procname = "net", .ctl_name = CTL_NET, },
{ .procname = "ipv4", .ctl_name = NET_IPV4, },
{ .procname = "vs", },
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 8/9] ipvs: Embed estimator object into stats object
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (6 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 7/9] ipvs: Mark net_vs_ctl_path const sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-11 12:16 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 9/9] ipvs: No need to zero out ip_vs_stats during initialization sven.wegener
2008-08-10 18:35 ` [PATCH] ipvs: A couple of fixes and cleanups Sven Wegener
9 siblings, 1 reply; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
There's no reason for dynamically allocating an estimator object for every
stats object. Directly embed an estimator object into every stats object and
switch to using the kernel-provided list implementation. This makes the code
much simpler and faster, as we do not need to traverse the list of all
estimators to find the one belonging to a stats object.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
include/net/ip_vs.h | 22 ++++++++-
net/ipv4/ipvs/ip_vs_ctl.c | 2 +-
net/ipv4/ipvs/ip_vs_est.c | 117 +++++++++++++++------------------------------
3 files changed, 59 insertions(+), 82 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index c8ee9b8..ff5b02a 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -140,8 +140,24 @@ struct ip_vs_seq {
/*
- * IPVS statistics object
+ * IPVS statistics objects
*/
+struct ip_vs_estimator {
+ struct list_head list;
+
+ u32 last_conns;
+ u32 last_inpkts;
+ u32 last_outpkts;
+ u64 last_inbytes;
+ u64 last_outbytes;
+
+ u32 cps;
+ u32 inpps;
+ u32 outpps;
+ u32 inbps;
+ u32 outbps;
+};
+
struct ip_vs_stats
{
__u32 conns; /* connections scheduled */
@@ -157,6 +173,8 @@ struct ip_vs_stats
__u32 outbps; /* current out byte rate */
spinlock_t lock; /* spin lock */
+
+ struct ip_vs_estimator est; /* estimator */
};
struct dst_entry;
@@ -659,7 +677,7 @@ extern void ip_vs_sync_conn(struct ip_vs_conn *cp);
/*
* IPVS rate estimator prototypes (from ip_vs_est.c)
*/
-extern int ip_vs_new_estimator(struct ip_vs_stats *stats);
+extern void ip_vs_new_estimator(struct ip_vs_stats *stats);
extern void ip_vs_kill_estimator(struct ip_vs_stats *stats);
extern void ip_vs_zero_estimator(struct ip_vs_stats *stats);
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 999d884..d651bce 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -684,8 +684,8 @@ ip_vs_zero_stats(struct ip_vs_stats *stats)
{
spin_lock_bh(&stats->lock);
memset(stats, 0, (char *)&stats->lock - (char *)stats);
- spin_unlock_bh(&stats->lock);
ip_vs_zero_estimator(stats);
+ spin_unlock_bh(&stats->lock);
}
/*
diff --git a/net/ipv4/ipvs/ip_vs_est.c b/net/ipv4/ipvs/ip_vs_est.c
index 1d6e58e..3088175 100644
--- a/net/ipv4/ipvs/ip_vs_est.c
+++ b/net/ipv4/ipvs/ip_vs_est.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/interrupt.h>
#include <linux/sysctl.h>
+#include <linux/list.h>
#include <net/ip_vs.h>
@@ -44,28 +45,11 @@
*/
-struct ip_vs_estimator
-{
- struct ip_vs_estimator *next;
- struct ip_vs_stats *stats;
-
- u32 last_conns;
- u32 last_inpkts;
- u32 last_outpkts;
- u64 last_inbytes;
- u64 last_outbytes;
-
- u32 cps;
- u32 inpps;
- u32 outpps;
- u32 inbps;
- u32 outbps;
-};
+static void estimation_timer(unsigned long arg);
-
-static struct ip_vs_estimator *est_list = NULL;
-static DEFINE_RWLOCK(est_lock);
-static struct timer_list est_timer;
+static LIST_HEAD(est_list);
+static DEFINE_SPINLOCK(est_lock);
+static DEFINE_TIMER(est_timer, estimation_timer, 0, 0);
static void estimation_timer(unsigned long arg)
{
@@ -76,9 +60,9 @@ static void estimation_timer(unsigned long arg)
u64 n_inbytes, n_outbytes;
u32 rate;
- read_lock(&est_lock);
- for (e = est_list; e; e = e->next) {
- s = e->stats;
+ spin_lock(&est_lock);
+ list_for_each_entry(e, &est_list, list) {
+ s = container_of(e, struct ip_vs_stats, est);
spin_lock(&s->lock);
n_conns = s->conns;
@@ -114,19 +98,16 @@ static void estimation_timer(unsigned long arg)
s->outbps = (e->outbps+0xF)>>5;
spin_unlock(&s->lock);
}
- read_unlock(&est_lock);
+ spin_unlock(&est_lock);
mod_timer(&est_timer, jiffies + 2*HZ);
}
-int ip_vs_new_estimator(struct ip_vs_stats *stats)
+void ip_vs_new_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *est;
+ struct ip_vs_estimator *est = &stats->est;
- est = kzalloc(sizeof(*est), GFP_KERNEL);
- if (est == NULL)
- return -ENOMEM;
+ INIT_LIST_HEAD(&est->list);
- est->stats = stats;
est->last_conns = stats->conns;
est->cps = stats->cps<<10;
@@ -142,62 +123,40 @@ int ip_vs_new_estimator(struct ip_vs_stats *stats)
est->last_outbytes = stats->outbytes;
est->outbps = stats->outbps<<5;
- write_lock_bh(&est_lock);
- est->next = est_list;
- if (est->next == NULL) {
- setup_timer(&est_timer, estimation_timer, 0);
- est_timer.expires = jiffies + 2*HZ;
- add_timer(&est_timer);
- }
- est_list = est;
- write_unlock_bh(&est_lock);
- return 0;
+ spin_lock_bh(&est_lock);
+ if (list_empty(&est_list))
+ mod_timer(&est_timer, jiffies + 2 * HZ);
+ list_add(&est->list, &est_list);
+ spin_unlock_bh(&est_lock);
}
void ip_vs_kill_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *est, **pest;
- int killed = 0;
-
- write_lock_bh(&est_lock);
- pest = &est_list;
- while ((est=*pest) != NULL) {
- if (est->stats != stats) {
- pest = &est->next;
- continue;
- }
- *pest = est->next;
- kfree(est);
- killed++;
- }
- while (killed && !est_list && try_to_del_timer_sync(&est_timer) < 0) {
- write_unlock_bh(&est_lock);
+ struct ip_vs_estimator *est = &stats->est;
+
+ spin_lock_bh(&est_lock);
+ list_del(&est->list);
+ while (list_empty(&est_list) && try_to_del_timer_sync(&est_timer) < 0) {
+ spin_unlock_bh(&est_lock);
cpu_relax();
- write_lock_bh(&est_lock);
+ spin_lock_bh(&est_lock);
}
- write_unlock_bh(&est_lock);
+ spin_unlock_bh(&est_lock);
}
void ip_vs_zero_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *e;
-
- write_lock_bh(&est_lock);
- for (e = est_list; e; e = e->next) {
- if (e->stats != stats)
- continue;
-
- /* set counters zero */
- e->last_conns = 0;
- e->last_inpkts = 0;
- e->last_outpkts = 0;
- e->last_inbytes = 0;
- e->last_outbytes = 0;
- e->cps = 0;
- e->inpps = 0;
- e->outpps = 0;
- e->inbps = 0;
- e->outbps = 0;
- }
- write_unlock_bh(&est_lock);
+ struct ip_vs_estimator *est = &stats->est;
+
+ /* set counters zero, caller must hold the stats->lock lock */
+ est->last_conns = 0;
+ est->last_inpkts = 0;
+ est->last_outpkts = 0;
+ est->last_inbytes = 0;
+ est->last_outbytes = 0;
+ est->cps = 0;
+ est->inpps = 0;
+ est->outpps = 0;
+ est->inbps = 0;
+ est->outbps = 0;
}
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 9/9] ipvs: No need to zero out ip_vs_stats during initialization
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (7 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 8/9] ipvs: Embed estimator object into stats object sven.wegener
@ 2008-08-10 11:07 ` sven.wegener
2008-08-10 18:35 ` [PATCH] ipvs: A couple of fixes and cleanups Sven Wegener
9 siblings, 0 replies; 22+ messages in thread
From: sven.wegener @ 2008-08-10 11:07 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
From: Sven Wegener <sven.wegener@stealer.net>
It's a global variable and automatically initialized to zero. And now we can
also initialize the lock at compile time.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_ctl.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index d651bce..cfb1d20 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -1784,7 +1784,9 @@ static const struct file_operations ip_vs_info_fops = {
#endif
-struct ip_vs_stats ip_vs_stats;
+struct ip_vs_stats ip_vs_stats = {
+ .lock = __SPIN_LOCK_UNLOCKED(ip_vs_stats.lock),
+};
#ifdef CONFIG_PROC_FS
static int ip_vs_stats_show(struct seq_file *seq, void *v)
@@ -2333,8 +2335,6 @@ int __init ip_vs_control_init(void)
INIT_LIST_HEAD(&ip_vs_rtable[idx]);
}
- memset(&ip_vs_stats, 0, sizeof(ip_vs_stats));
- spin_lock_init(&ip_vs_stats.lock);
ip_vs_new_estimator(&ip_vs_stats);
/* Hook the defense timer */
--
1.5.6.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/9] ipvs: Fix possible deadlock in estimator code
2008-08-10 11:07 ` [PATCH 2/9] ipvs: Fix possible deadlock in estimator code sven.wegener
@ 2008-08-10 13:58 ` Sven Wegener
0 siblings, 0 replies; 22+ messages in thread
From: Sven Wegener @ 2008-08-10 13:58 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> From: Sven Wegener <sven.wegener@stealer.net>
>
> There is a slight chance for a deadlock in the estimator code. We can't call
> del_timer_sync() while holding our lock, as the timer might be active and
> spinning for the lock on another cpu. Work around this issue by using
> try_to_del_timer_sync() and releasing the lock. We could actually delete the
> timer outside of our lock, as the add and kill functions are only every called
> from userspace via [gs]etsockopt() and are serialized by a mutex, but better
> make this explicit.
>
> Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
> Cc: stable <stable@kernel.org>
> ---
> net/ipv4/ipvs/ip_vs_est.c | 7 +++++--
> 1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/ipvs/ip_vs_est.c b/net/ipv4/ipvs/ip_vs_est.c
> index bc04eed..1d6e58e 100644
> --- a/net/ipv4/ipvs/ip_vs_est.c
> +++ b/net/ipv4/ipvs/ip_vs_est.c
> @@ -170,8 +170,11 @@ void ip_vs_kill_estimator(struct ip_vs_stats *stats)
> kfree(est);
> killed++;
> }
> - if (killed && est_list == NULL)
> - del_timer_sync(&est_timer);
> + while (killed && !est_list && try_to_del_timer_sync(&est_timer) < 0) {
> + write_unlock_bh(&est_lock);
> + cpu_relax();
> + write_lock_bh(&est_lock);
> + }
> write_unlock_bh(&est_lock);
> }
>
As a side note, just noticed that this opens another race condition, if we
leave the mutex serializing the [gs]etsockopt calls out:
If the timer reschedules itself, after we just removed the last estimator,
but we failed to deactivate the timer due to it being active, we have the
chance of add_timer() on already pending timer during creating a new
estimator. This will then trigger the BUG_ON() in add_timer(). Thanks to
the mutex this can't happen currently.
The "ipvs: Embed estimator object into stats object" patch will move to
code from add_timer() to mod_timer() so at the the end of this series
we're safe.
Sven
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 6/9] ipvs: Annotate init functions with __init
2008-08-10 11:07 ` [PATCH 6/9] ipvs: Annotate init functions with __init sven.wegener
@ 2008-08-10 18:32 ` Sven Wegener
0 siblings, 0 replies; 22+ messages in thread
From: Sven Wegener @ 2008-08-10 18:32 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
Being able to discard these functions saves a couple of bytes at runtime. The
cleanup functions can't be annotated with __exit as they are also called from
init functions.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
net/ipv4/ipvs/ip_vs_app.c | 2 +-
net/ipv4/ipvs/ip_vs_conn.c | 2 +-
net/ipv4/ipvs/ip_vs_ctl.c | 2 +-
net/ipv4/ipvs/ip_vs_proto.c | 4 ++--
4 files changed, 5 insertions(+), 5 deletions(-)
There's one more, register_ip_vs_protocol() is also only used during
initialization.
diff --git a/net/ipv4/ipvs/ip_vs_app.c b/net/ipv4/ipvs/ip_vs_app.c
index 1f1897a..201b8ea 100644
--- a/net/ipv4/ipvs/ip_vs_app.c
+++ b/net/ipv4/ipvs/ip_vs_app.c
@@ -608,7 +608,7 @@ int ip_vs_skb_replace(struct sk_buff *skb, gfp_t pri,
}
-int ip_vs_app_init(void)
+int __init ip_vs_app_init(void)
{
/* we will replace it with proc_net_ipvs_create() soon */
proc_net_fops_create(&init_net, "ip_vs_app", 0, &ip_vs_app_fops);
diff --git a/net/ipv4/ipvs/ip_vs_conn.c b/net/ipv4/ipvs/ip_vs_conn.c
index f8bdae4..44a6872 100644
--- a/net/ipv4/ipvs/ip_vs_conn.c
+++ b/net/ipv4/ipvs/ip_vs_conn.c
@@ -965,7 +965,7 @@ static void ip_vs_conn_flush(void)
}
-int ip_vs_conn_init(void)
+int __init ip_vs_conn_init(void)
{
int idx;
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 9a5ace0..df13333 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2306,7 +2306,7 @@ static struct nf_sockopt_ops ip_vs_sockopts = {
};
-int ip_vs_control_init(void)
+int __init ip_vs_control_init(void)
{
int ret;
int idx;
diff --git a/net/ipv4/ipvs/ip_vs_proto.c b/net/ipv4/ipvs/ip_vs_proto.c
index 876714f..6099a88 100644
--- a/net/ipv4/ipvs/ip_vs_proto.c
+++ b/net/ipv4/ipvs/ip_vs_proto.c
@@ -43,7 +43,7 @@ static struct ip_vs_protocol *ip_vs_proto_table[IP_VS_PROTO_TAB_SIZE];
/*
* register an ipvs protocol
*/
-static int __used register_ip_vs_protocol(struct ip_vs_protocol *pp)
+static int __used __init register_ip_vs_protocol(struct ip_vs_protocol *pp)
{
unsigned hash = IP_VS_PROTO_HASH(pp->protocol);
@@ -190,7 +190,7 @@ ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
}
-int ip_vs_protocol_init(void)
+int __init ip_vs_protocol_init(void)
{
char protocols[64];
#define REGISTER_PROTOCOL(p) \
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
` (8 preceding siblings ...)
2008-08-10 11:07 ` [PATCH 9/9] ipvs: No need to zero out ip_vs_stats during initialization sven.wegener
@ 2008-08-10 18:35 ` Sven Wegener
2008-08-11 0:50 ` Simon Horman
2008-08-11 12:19 ` Sven Wegener
9 siblings, 2 replies; 22+ messages in thread
From: Sven Wegener @ 2008-08-10 18:35 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> Hi guys,
>
> here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> possible deadlock fixes. One introduced by my last sync daemon work, which
> hasn't hit any stable kernel yet. The other one is in the estimator code and
> goes back to at leat since we started working with git for the kernel. The
> latter I think qualifies for -stable.
>
> I've pushed the changes (8123b42..2e45552) based on davem's net tree here
>
> git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
I've included the register_ip_vs_protocol() annotation. Changes are now
8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
posting again.
Sven
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-10 18:35 ` [PATCH] ipvs: A couple of fixes and cleanups Sven Wegener
@ 2008-08-11 0:50 ` Simon Horman
2008-08-11 6:43 ` Sven Wegener
2008-08-11 12:19 ` Sven Wegener
1 sibling, 1 reply; 22+ messages in thread
From: Simon Horman @ 2008-08-11 0:50 UTC (permalink / raw)
To: Sven Wegener; +Cc: wensong, ja, netdev, lvs-devel
On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
>
> > Hi guys,
> >
> > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > possible deadlock fixes. One introduced by my last sync daemon work, which
> > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > goes back to at leat since we started working with git for the kernel. The
> > latter I think qualifies for -stable.
> >
> > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> >
> > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
>
> I've included the register_ip_vs_protocol() annotation. Changes are now
> 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> posting again.
Hi Sven,
all these changes seem fine to me.
Acked-by: Simon Horman <horms@verge.net.au>
With regards to ip_vs_zero_stats(), it uses
memset(stats, 0, (char *)&stats->lock - (char *)stats);
to clear stats and then calls ip_vs_zero_estimator(), which uses
est->last_conns = 0;
est->last_inpkts = 0;
...
to clear stats->est.
I wonder if it would be cleaner to either clear
stats->... directly in ip_vs_zero_stats(), or use
memset in ip_vs_zero_estimator()?
Something like this...
----------------------------------------------------------------
ipvs: Use memset to clear estimator
This makes the code in ip_vs_zero_estimator() a lot closer
to the style of its parent, ip_vs_zero_stats().
Signed-off-by: Simon Horman <horms@verge.net.au>
Index: net-2.6/include/net/ip_vs.h
===================================================================
--- net-2.6.orig/include/net/ip_vs.h 2008-08-11 10:43:23.000000000 +1000
+++ net-2.6/include/net/ip_vs.h 2008-08-11 10:43:23.000000000 +1000
@@ -143,8 +143,6 @@ struct ip_vs_seq {
* IPVS statistics objects
*/
struct ip_vs_estimator {
- struct list_head list;
-
u32 last_conns;
u32 last_inpkts;
u32 last_outpkts;
@@ -156,6 +154,8 @@ struct ip_vs_estimator {
u32 outpps;
u32 inbps;
u32 outbps;
+
+ struct list_head list;
};
struct ip_vs_stats
Index: net-2.6/net/ipv4/ipvs/ip_vs_est.c
===================================================================
--- net-2.6.orig/net/ipv4/ipvs/ip_vs_est.c 2008-08-11 10:43:23.000000000 +1000
+++ net-2.6/net/ipv4/ipvs/ip_vs_est.c 2008-08-11 10:43:23.000000000 +1000
@@ -149,14 +149,5 @@ void ip_vs_zero_estimator(struct ip_vs_s
struct ip_vs_estimator *est = &stats->est;
/* set counters zero, caller must hold the stats->lock lock */
- est->last_conns = 0;
- est->last_inpkts = 0;
- est->last_outpkts = 0;
- est->last_inbytes = 0;
- est->last_outbytes = 0;
- est->cps = 0;
- est->inpps = 0;
- est->outpps = 0;
- est->inbps = 0;
- est->outbps = 0;
+ memset(est, 0, (char *)&est->list - (char *)est);
}
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 0:50 ` Simon Horman
@ 2008-08-11 6:43 ` Sven Wegener
2008-08-11 6:56 ` Simon Horman
0 siblings, 1 reply; 22+ messages in thread
From: Sven Wegener @ 2008-08-11 6:43 UTC (permalink / raw)
To: Simon Horman; +Cc: wensong, ja, netdev, lvs-devel
On Mon, 11 Aug 2008, Simon Horman wrote:
> On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> >
> > > Hi guys,
> > >
> > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > goes back to at leat since we started working with git for the kernel. The
> > > latter I think qualifies for -stable.
> > >
> > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > >
> > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> >
> > I've included the register_ip_vs_protocol() annotation. Changes are now
> > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > posting again.
>
> Hi Sven,
>
> all these changes seem fine to me.
>
> Acked-by: Simon Horman <horms@verge.net.au>
>
> With regards to ip_vs_zero_stats(), it uses
>
> memset(stats, 0, (char *)&stats->lock - (char *)stats);
>
> to clear stats and then calls ip_vs_zero_estimator(), which uses
>
> est->last_conns = 0;
> est->last_inpkts = 0;
> ...
>
> to clear stats->est.
>
> I wonder if it would be cleaner to either clear
> stats->... directly in ip_vs_zero_stats(), or use
> memset in ip_vs_zero_estimator()?
Yeah, I wondered about the same. memset is probably simpler, but direct
assignment makes it more obvious what is changed. Thinking about it, I'd
prefer direct assignment, when not setting a complete structure to zero
and there are not more than a handful lines needed to do it with direct
assignment. But I'm fine with either way here. If we prefer the memset
way, we should add a comment to both structures, saying that nobody should
add anything non-statistic before the member we use to get the size.
Also the estimator and statistics structures can be optimized to avoid
some padding. I'll include these changes and resend.
Sven
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 6:43 ` Sven Wegener
@ 2008-08-11 6:56 ` Simon Horman
2008-08-11 7:19 ` Simon Horman
2008-08-11 8:19 ` Sven Wegener
0 siblings, 2 replies; 22+ messages in thread
From: Simon Horman @ 2008-08-11 6:56 UTC (permalink / raw)
To: Sven Wegener; +Cc: wensong, ja, netdev, lvs-devel
On Mon, Aug 11, 2008 at 08:43:54AM +0200, Sven Wegener wrote:
> On Mon, 11 Aug 2008, Simon Horman wrote:
>
> > On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> > > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> > >
> > > > Hi guys,
> > > >
> > > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > > goes back to at leat since we started working with git for the kernel. The
> > > > latter I think qualifies for -stable.
> > > >
> > > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > > >
> > > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> > >
> > > I've included the register_ip_vs_protocol() annotation. Changes are now
> > > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > > posting again.
> >
> > Hi Sven,
> >
> > all these changes seem fine to me.
> >
> > Acked-by: Simon Horman <horms@verge.net.au>
> >
> > With regards to ip_vs_zero_stats(), it uses
> >
> > memset(stats, 0, (char *)&stats->lock - (char *)stats);
> >
> > to clear stats and then calls ip_vs_zero_estimator(), which uses
> >
> > est->last_conns = 0;
> > est->last_inpkts = 0;
> > ...
> >
> > to clear stats->est.
> >
> > I wonder if it would be cleaner to either clear
> > stats->... directly in ip_vs_zero_stats(), or use
> > memset in ip_vs_zero_estimator()?
>
> Yeah, I wondered about the same. memset is probably simpler, but direct
> assignment makes it more obvious what is changed. Thinking about it, I'd
> prefer direct assignment, when not setting a complete structure to zero
> and there are not more than a handful lines needed to do it with direct
> assignment. But I'm fine with either way here. If we prefer the memset
> way, we should add a comment to both structures, saying that nobody should
> add anything non-statistic before the member we use to get the size.
To be honest I prefer direct assignment too. I think it is less fragile
as the structures can be re-ordered without effecting how clear works.
I'll post a (trivial) patch shortly.
> Also the estimator and statistics structures can be optimized to avoid
> some padding. I'll include these changes and resend.
Great.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 6:56 ` Simon Horman
@ 2008-08-11 7:19 ` Simon Horman
2008-08-11 8:19 ` Sven Wegener
1 sibling, 0 replies; 22+ messages in thread
From: Simon Horman @ 2008-08-11 7:19 UTC (permalink / raw)
To: Sven Wegener; +Cc: wensong, ja, netdev, lvs-devel
On Mon, Aug 11, 2008 at 04:56:48PM +1000, Simon Horman wrote:
> On Mon, Aug 11, 2008 at 08:43:54AM +0200, Sven Wegener wrote:
> > On Mon, 11 Aug 2008, Simon Horman wrote:
> >
> > > On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> > > > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > > > goes back to at leat since we started working with git for the kernel. The
> > > > > latter I think qualifies for -stable.
> > > > >
> > > > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > > > >
> > > > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> > > >
> > > > I've included the register_ip_vs_protocol() annotation. Changes are now
> > > > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > > > posting again.
> > >
> > > Hi Sven,
> > >
> > > all these changes seem fine to me.
> > >
> > > Acked-by: Simon Horman <horms@verge.net.au>
> > >
> > > With regards to ip_vs_zero_stats(), it uses
> > >
> > > memset(stats, 0, (char *)&stats->lock - (char *)stats);
> > >
> > > to clear stats and then calls ip_vs_zero_estimator(), which uses
> > >
> > > est->last_conns = 0;
> > > est->last_inpkts = 0;
> > > ...
> > >
> > > to clear stats->est.
> > >
> > > I wonder if it would be cleaner to either clear
> > > stats->... directly in ip_vs_zero_stats(), or use
> > > memset in ip_vs_zero_estimator()?
> >
> > Yeah, I wondered about the same. memset is probably simpler, but direct
> > assignment makes it more obvious what is changed. Thinking about it, I'd
> > prefer direct assignment, when not setting a complete structure to zero
> > and there are not more than a handful lines needed to do it with direct
> > assignment. But I'm fine with either way here. If we prefer the memset
> > way, we should add a comment to both structures, saying that nobody should
> > add anything non-statistic before the member we use to get the size.
>
> To be honest I prefer direct assignment too. I think it is less fragile
> as the structures can be re-ordered without effecting how clear works.
> I'll post a (trivial) patch shortly.
And here it is...
--------------------------------------------------
ipvs: Explictly clear ip_vs_status members
In order to align the coding styles of ip_vs_zero_stats() and
its child-function ip_vs_zero_estimator(), clear ip_vs_status
members explicitlty rather than doing a limited memset().
This was chosen over modifying ip_vs_zero_estimator() to use
memset() as it is more robust against changes in members
in the relevant structures. memset() would be prefered if
all members of the structure were to be cleared.
Cc: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
Index: net-2.6/net/ipv4/ipvs/ip_vs_ctl.c
===================================================================
--- net-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c 2008-08-11 17:15:04.000000000 +1000
+++ net-2.6/net/ipv4/ipvs/ip_vs_ctl.c 2008-08-11 17:15:07.000000000 +1000
@@ -683,8 +683,21 @@ static void
ip_vs_zero_stats(struct ip_vs_stats *stats)
{
spin_lock_bh(&stats->lock);
- memset(stats, 0, (char *)&stats->lock - (char *)stats);
+
+ stats->conns = 0;
+ stats->inpkts = 0;
+ stats->outpkts = 0;
+ stats->inbytes = 0;
+ stats->outbytes = 0;
+
+ stats->cps = 0;
+ stats->inpps = 0;
+ stats->outpps = 0;
+ stats->inbps = 0;
+ stats->outbps = 0;
+
ip_vs_zero_estimator(stats);
+
spin_unlock_bh(&stats->lock);
}
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 6:56 ` Simon Horman
2008-08-11 7:19 ` Simon Horman
@ 2008-08-11 8:19 ` Sven Wegener
2008-08-11 9:31 ` Simon Horman
1 sibling, 1 reply; 22+ messages in thread
From: Sven Wegener @ 2008-08-11 8:19 UTC (permalink / raw)
To: Simon Horman; +Cc: wensong, ja, netdev, lvs-devel
On Mon, 11 Aug 2008, Simon Horman wrote:
> On Mon, Aug 11, 2008 at 08:43:54AM +0200, Sven Wegener wrote:
> > On Mon, 11 Aug 2008, Simon Horman wrote:
> >
> > > On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> > > > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > > > goes back to at leat since we started working with git for the kernel. The
> > > > > latter I think qualifies for -stable.
> > > > >
> > > > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > > > >
> > > > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> > > >
> > > > I've included the register_ip_vs_protocol() annotation. Changes are now
> > > > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > > > posting again.
> > >
> > > Hi Sven,
> > >
> > > all these changes seem fine to me.
> > >
> > > Acked-by: Simon Horman <horms@verge.net.au>
> > >
> > > With regards to ip_vs_zero_stats(), it uses
> > >
> > > memset(stats, 0, (char *)&stats->lock - (char *)stats);
> > >
> > > to clear stats and then calls ip_vs_zero_estimator(), which uses
> > >
> > > est->last_conns = 0;
> > > est->last_inpkts = 0;
> > > ...
> > >
> > > to clear stats->est.
> > >
> > > I wonder if it would be cleaner to either clear
> > > stats->... directly in ip_vs_zero_stats(), or use
> > > memset in ip_vs_zero_estimator()?
> >
> > Yeah, I wondered about the same. memset is probably simpler, but direct
> > assignment makes it more obvious what is changed. Thinking about it, I'd
> > prefer direct assignment, when not setting a complete structure to zero
> > and there are not more than a handful lines needed to do it with direct
> > assignment. But I'm fine with either way here. If we prefer the memset
> > way, we should add a comment to both structures, saying that nobody should
> > add anything non-statistic before the member we use to get the size.
>
> To be honest I prefer direct assignment too. I think it is less fragile
> as the structures can be re-ordered without effecting how clear works.
> I'll post a (trivial) patch shortly.
>
> > Also the estimator and statistics structures can be optimized to avoid
> > some padding. I'll include these changes and resend.
>
> Great.
Sadly the layout of the stats structure is shared with user space, so this
can't be changed easily without breaking ABI.
Sven
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 8:19 ` Sven Wegener
@ 2008-08-11 9:31 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2008-08-11 9:31 UTC (permalink / raw)
To: Sven Wegener; +Cc: wensong, ja, netdev, lvs-devel
On Mon, Aug 11, 2008 at 10:19:09AM +0200, Sven Wegener wrote:
> On Mon, 11 Aug 2008, Simon Horman wrote:
>
> > On Mon, Aug 11, 2008 at 08:43:54AM +0200, Sven Wegener wrote:
> > > On Mon, 11 Aug 2008, Simon Horman wrote:
> > >
> > > > On Sun, Aug 10, 2008 at 08:35:48PM +0200, Sven Wegener wrote:
> > > > > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> > > > >
> > > > > > Hi guys,
> > > > > >
> > > > > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > > > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > > > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > > > > goes back to at leat since we started working with git for the kernel. The
> > > > > > latter I think qualifies for -stable.
> > > > > >
> > > > > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > > > > >
> > > > > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> > > > >
> > > > > I've included the register_ip_vs_protocol() annotation. Changes are now
> > > > > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > > > > posting again.
> > > >
> > > > Hi Sven,
> > > >
> > > > all these changes seem fine to me.
> > > >
> > > > Acked-by: Simon Horman <horms@verge.net.au>
> > > >
> > > > With regards to ip_vs_zero_stats(), it uses
> > > >
> > > > memset(stats, 0, (char *)&stats->lock - (char *)stats);
> > > >
> > > > to clear stats and then calls ip_vs_zero_estimator(), which uses
> > > >
> > > > est->last_conns = 0;
> > > > est->last_inpkts = 0;
> > > > ...
> > > >
> > > > to clear stats->est.
> > > >
> > > > I wonder if it would be cleaner to either clear
> > > > stats->... directly in ip_vs_zero_stats(), or use
> > > > memset in ip_vs_zero_estimator()?
> > >
> > > Yeah, I wondered about the same. memset is probably simpler, but direct
> > > assignment makes it more obvious what is changed. Thinking about it, I'd
> > > prefer direct assignment, when not setting a complete structure to zero
> > > and there are not more than a handful lines needed to do it with direct
> > > assignment. But I'm fine with either way here. If we prefer the memset
> > > way, we should add a comment to both structures, saying that nobody should
> > > add anything non-statistic before the member we use to get the size.
> >
> > To be honest I prefer direct assignment too. I think it is less fragile
> > as the structures can be re-ordered without effecting how clear works.
> > I'll post a (trivial) patch shortly.
> >
> > > Also the estimator and statistics structures can be optimized to avoid
> > > some padding. I'll include these changes and resend.
> >
> > Great.
>
> Sadly the layout of the stats structure is shared with user space, so this
> can't be changed easily without breaking ABI.
un-Great :-(
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 8/9] ipvs: Embed estimator object into stats object
2008-08-10 11:07 ` [PATCH 8/9] ipvs: Embed estimator object into stats object sven.wegener
@ 2008-08-11 12:16 ` Sven Wegener
0 siblings, 0 replies; 22+ messages in thread
From: Sven Wegener @ 2008-08-11 12:16 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
There's no reason for dynamically allocating an estimator object for every
stats object. Directly embed an estimator object into every stats object and
switch to using the kernel-provided list implementation. This makes the code
much simpler and faster, as we do not need to traverse the list of all
estimators to find the one belonging to a stats object. There's no need to use
an rwlock, as we only have one reader. Also reorder the members of the
estimator structure slightly to avoid padding overhead. This can't be done
with the stats object as the members are currently copied to our user space
object via memcpy() and changing it would break ABI.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 28 ++++++++++-
net/ipv4/ipvs/ip_vs_ctl.c | 2 +-
net/ipv4/ipvs/ip_vs_est.c | 117 +++++++++++++++------------------------------
3 files changed, 65 insertions(+), 82 deletions(-)
Changes since last post: Include a comment on the requirement that no
member should precede the lock in ip_vs_stats. Reorder the member of
ip_vs_estimator to avoid padding.
diff --git a/include/net/ip_vs.h
b/include/net/ip_vs.h index c8ee9b8..7312c3d 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -140,8 +140,24 @@ struct ip_vs_seq {
/*
- * IPVS statistics object
+ * IPVS statistics objects
*/
+struct ip_vs_estimator {
+ struct list_head list;
+
+ u64 last_inbytes;
+ u64 last_outbytes;
+ u32 last_conns;
+ u32 last_inpkts;
+ u32 last_outpkts;
+
+ u32 cps;
+ u32 inpps;
+ u32 outpps;
+ u32 inbps;
+ u32 outbps;
+};
+
struct ip_vs_stats
{
__u32 conns; /* connections scheduled */
@@ -156,7 +172,15 @@ struct ip_vs_stats
__u32 inbps; /* current in byte rate */
__u32 outbps; /* current out byte rate */
+ /*
+ * Don't add anything before the lock, because we use memcpy() to copy
+ * the members before the lock to struct ip_vs_stats_user in
+ * ip_vs_ctl.c.
+ */
+
spinlock_t lock; /* spin lock */
+
+ struct ip_vs_estimator est; /* estimator */
};
struct dst_entry;
@@ -659,7 +683,7 @@ extern void ip_vs_sync_conn(struct ip_vs_conn *cp);
/*
* IPVS rate estimator prototypes (from ip_vs_est.c)
*/
-extern int ip_vs_new_estimator(struct ip_vs_stats *stats);
+extern void ip_vs_new_estimator(struct ip_vs_stats *stats);
extern void ip_vs_kill_estimator(struct ip_vs_stats *stats);
extern void ip_vs_zero_estimator(struct ip_vs_stats *stats);
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 999d884..d651bce 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -684,8 +684,8 @@ ip_vs_zero_stats(struct ip_vs_stats *stats)
{
spin_lock_bh(&stats->lock);
memset(stats, 0, (char *)&stats->lock - (char *)stats);
- spin_unlock_bh(&stats->lock);
ip_vs_zero_estimator(stats);
+ spin_unlock_bh(&stats->lock);
}
/*
diff --git a/net/ipv4/ipvs/ip_vs_est.c b/net/ipv4/ipvs/ip_vs_est.c
index 1d6e58e..5a20f93 100644
--- a/net/ipv4/ipvs/ip_vs_est.c
+++ b/net/ipv4/ipvs/ip_vs_est.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/interrupt.h>
#include <linux/sysctl.h>
+#include <linux/list.h>
#include <net/ip_vs.h>
@@ -44,28 +45,11 @@
*/
-struct ip_vs_estimator
-{
- struct ip_vs_estimator *next;
- struct ip_vs_stats *stats;
-
- u32 last_conns;
- u32 last_inpkts;
- u32 last_outpkts;
- u64 last_inbytes;
- u64 last_outbytes;
-
- u32 cps;
- u32 inpps;
- u32 outpps;
- u32 inbps;
- u32 outbps;
-};
+static void estimation_timer(unsigned long arg);
-
-static struct ip_vs_estimator *est_list = NULL;
-static DEFINE_RWLOCK(est_lock);
-static struct timer_list est_timer;
+static LIST_HEAD(est_list);
+static DEFINE_SPINLOCK(est_lock);
+static DEFINE_TIMER(est_timer, estimation_timer, 0, 0);
static void estimation_timer(unsigned long arg)
{
@@ -76,9 +60,9 @@ static void estimation_timer(unsigned long arg)
u64 n_inbytes, n_outbytes;
u32 rate;
- read_lock(&est_lock);
- for (e = est_list; e; e = e->next) {
- s = e->stats;
+ spin_lock(&est_lock);
+ list_for_each_entry(e, &est_list, list) {
+ s = container_of(e, struct ip_vs_stats, est);
spin_lock(&s->lock);
n_conns = s->conns;
@@ -114,19 +98,16 @@ static void estimation_timer(unsigned long arg)
s->outbps = (e->outbps+0xF)>>5;
spin_unlock(&s->lock);
}
- read_unlock(&est_lock);
+ spin_unlock(&est_lock);
mod_timer(&est_timer, jiffies + 2*HZ);
}
-int ip_vs_new_estimator(struct ip_vs_stats *stats)
+void ip_vs_new_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *est;
+ struct ip_vs_estimator *est = &stats->est;
- est = kzalloc(sizeof(*est), GFP_KERNEL);
- if (est == NULL)
- return -ENOMEM;
+ INIT_LIST_HEAD(&est->list);
- est->stats = stats;
est->last_conns = stats->conns;
est->cps = stats->cps<<10;
@@ -142,62 +123,40 @@ int ip_vs_new_estimator(struct ip_vs_stats *stats)
est->last_outbytes = stats->outbytes;
est->outbps = stats->outbps<<5;
- write_lock_bh(&est_lock);
- est->next = est_list;
- if (est->next == NULL) {
- setup_timer(&est_timer, estimation_timer, 0);
- est_timer.expires = jiffies + 2*HZ;
- add_timer(&est_timer);
- }
- est_list = est;
- write_unlock_bh(&est_lock);
- return 0;
+ spin_lock_bh(&est_lock);
+ if (list_empty(&est_list))
+ mod_timer(&est_timer, jiffies + 2 * HZ);
+ list_add(&est->list, &est_list);
+ spin_unlock_bh(&est_lock);
}
void ip_vs_kill_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *est, **pest;
- int killed = 0;
-
- write_lock_bh(&est_lock);
- pest = &est_list;
- while ((est=*pest) != NULL) {
- if (est->stats != stats) {
- pest = &est->next;
- continue;
- }
- *pest = est->next;
- kfree(est);
- killed++;
- }
- while (killed && !est_list && try_to_del_timer_sync(&est_timer) < 0) {
- write_unlock_bh(&est_lock);
+ struct ip_vs_estimator *est = &stats->est;
+
+ spin_lock_bh(&est_lock);
+ list_del(&est->list);
+ while (list_empty(&est_list) && try_to_del_timer_sync(&est_timer) < 0) {
+ spin_unlock_bh(&est_lock);
cpu_relax();
- write_lock_bh(&est_lock);
+ spin_lock_bh(&est_lock);
}
- write_unlock_bh(&est_lock);
+ spin_unlock_bh(&est_lock);
}
void ip_vs_zero_estimator(struct ip_vs_stats *stats)
{
- struct ip_vs_estimator *e;
-
- write_lock_bh(&est_lock);
- for (e = est_list; e; e = e->next) {
- if (e->stats != stats)
- continue;
-
- /* set counters zero */
- e->last_conns = 0;
- e->last_inpkts = 0;
- e->last_outpkts = 0;
- e->last_inbytes = 0;
- e->last_outbytes = 0;
- e->cps = 0;
- e->inpps = 0;
- e->outpps = 0;
- e->inbps = 0;
- e->outbps = 0;
- }
- write_unlock_bh(&est_lock);
+ struct ip_vs_estimator *est = &stats->est;
+
+ /* set counters zero, caller must hold the stats->lock lock */
+ est->last_inbytes = 0;
+ est->last_outbytes = 0;
+ est->last_conns = 0;
+ est->last_inpkts = 0;
+ est->last_outpkts = 0;
+ est->cps = 0;
+ est->inpps = 0;
+ est->outpps = 0;
+ est->inbps = 0;
+ est->outbps = 0;
}
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-10 18:35 ` [PATCH] ipvs: A couple of fixes and cleanups Sven Wegener
2008-08-11 0:50 ` Simon Horman
@ 2008-08-11 12:19 ` Sven Wegener
2008-08-11 12:57 ` Simon Horman
1 sibling, 1 reply; 22+ messages in thread
From: Sven Wegener @ 2008-08-11 12:19 UTC (permalink / raw)
To: wensong, horms, ja; +Cc: netdev, lvs-devel
On Sun, 10 Aug 2008, Sven Wegener wrote:
> On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
>
> > Hi guys,
> >
> > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > possible deadlock fixes. One introduced by my last sync daemon work, which
> > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > goes back to at leat since we started working with git for the kernel. The
> > latter I think qualifies for -stable.
> >
> > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> >
> > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
>
> I've included the register_ip_vs_protocol() annotation. Changes are now
> 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> posting again.
OK, this should be final now. I've pushed the changes 8123b42..e93615d to
git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
Simon, as you see I've included your patch and added your acked-by onto
the other patches.
Simon Horman (1):
ipvs: Explictly clear ip_vs_stats members
Sven Wegener (9):
ipvs: Fix possible deadlock in sync code
ipvs: Fix possible deadlock in estimator code
ipvs: Use ARRAY_SIZE()
ipvs: Use list_empty() instead of open-coding the same functionality
ipvs: Initialize schedulers' struct list_head at compile time
ipvs: Annotate init functions with __init
ipvs: Mark net_vs_ctl_path const
ipvs: Embed estimator object into stats object
ipvs: No need to zero out ip_vs_stats during initialization
include/net/ip_vs.h | 32 ++++++++++--
net/ipv4/ipvs/ip_vs_app.c | 2 +-
net/ipv4/ipvs/ip_vs_conn.c | 2 +-
net/ipv4/ipvs/ip_vs_ctl.c | 27 +++++++---
net/ipv4/ipvs/ip_vs_dh.c | 2 +-
net/ipv4/ipvs/ip_vs_est.c | 116 ++++++++++++++----------------------------
net/ipv4/ipvs/ip_vs_lblc.c | 2 +-
net/ipv4/ipvs/ip_vs_lblcr.c | 2 +-
net/ipv4/ipvs/ip_vs_lc.c | 2 +-
net/ipv4/ipvs/ip_vs_nq.c | 2 +-
net/ipv4/ipvs/ip_vs_proto.c | 4 +-
net/ipv4/ipvs/ip_vs_rr.c | 2 +-
net/ipv4/ipvs/ip_vs_sched.c | 4 +-
net/ipv4/ipvs/ip_vs_sed.c | 2 +-
net/ipv4/ipvs/ip_vs_sh.c | 2 +-
net/ipv4/ipvs/ip_vs_sync.c | 4 +-
net/ipv4/ipvs/ip_vs_wlc.c | 2 +-
net/ipv4/ipvs/ip_vs_wrr.c | 2 +-
18 files changed, 105 insertions(+), 106 deletions(-)
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] ipvs: A couple of fixes and cleanups
2008-08-11 12:19 ` Sven Wegener
@ 2008-08-11 12:57 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2008-08-11 12:57 UTC (permalink / raw)
To: Sven Wegener; +Cc: wensong, ja, netdev, lvs-devel
On Mon, Aug 11, 2008 at 02:19:47PM +0200, Sven Wegener wrote:
> On Sun, 10 Aug 2008, Sven Wegener wrote:
>
> > On Sun, 10 Aug 2008, sven.wegener@stealer.net wrote:
> >
> > > Hi guys,
> > >
> > > here come a couple of fixes and cleanups for IPVS. Worth mentioning are the two
> > > possible deadlock fixes. One introduced by my last sync daemon work, which
> > > hasn't hit any stable kernel yet. The other one is in the estimator code and
> > > goes back to at leat since we started working with git for the kernel. The
> > > latter I think qualifies for -stable.
> > >
> > > I've pushed the changes (8123b42..2e45552) based on davem's net tree here
> > >
> > > git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
> >
> > I've included the register_ip_vs_protocol() annotation. Changes are now
> > 8123b42..7ead17b. Diffstat has changed slightly, but is probably not worth
> > posting again.
>
> OK, this should be final now. I've pushed the changes 8123b42..e93615d to
>
> git://git.stealer.net/linux-2.6.git stealer/ipvs/for-davem
>
> Simon, as you see I've included your patch and added your acked-by onto
> the other patches.
Thanks. Everything looks in order to me.
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2008-08-11 12:57 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-10 11:07 [PATCH] ipvs: A couple of fixes and cleanups sven.wegener
2008-08-10 11:07 ` [PATCH 1/9] ipvs: Fix possible deadlock in sync code sven.wegener
2008-08-10 11:07 ` [PATCH 2/9] ipvs: Fix possible deadlock in estimator code sven.wegener
2008-08-10 13:58 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 3/9] ipvs: Use ARRAY_SIZE() sven.wegener
2008-08-10 11:07 ` [PATCH 4/9] ipvs: Use list_empty() instead of open-coding the same functionality sven.wegener
2008-08-10 11:07 ` [PATCH 5/9] ipvs: Initialize schedulers' struct list_head at compile time sven.wegener
2008-08-10 11:07 ` [PATCH 6/9] ipvs: Annotate init functions with __init sven.wegener
2008-08-10 18:32 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 7/9] ipvs: Mark net_vs_ctl_path const sven.wegener
2008-08-10 11:07 ` [PATCH 8/9] ipvs: Embed estimator object into stats object sven.wegener
2008-08-11 12:16 ` Sven Wegener
2008-08-10 11:07 ` [PATCH 9/9] ipvs: No need to zero out ip_vs_stats during initialization sven.wegener
2008-08-10 18:35 ` [PATCH] ipvs: A couple of fixes and cleanups Sven Wegener
2008-08-11 0:50 ` Simon Horman
2008-08-11 6:43 ` Sven Wegener
2008-08-11 6:56 ` Simon Horman
2008-08-11 7:19 ` Simon Horman
2008-08-11 8:19 ` Sven Wegener
2008-08-11 9:31 ` Simon Horman
2008-08-11 12:19 ` Sven Wegener
2008-08-11 12:57 ` Simon Horman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).