* [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
@ 2007-09-21 12:26 Satoshi OSHIMA
2007-09-21 13:10 ` Andi Kleen
0 siblings, 1 reply; 6+ messages in thread
From: Satoshi OSHIMA @ 2007-09-21 12:26 UTC (permalink / raw)
To: netdev; +Cc: haoki, 吉藤 英明
This patch introduces a global variable for UDP memory accounting.
The accounting unit is a page.
signed-off-by: Satoshi Oshima <satoshi.oshima.fk@hitachi.com>
signed-off-by: Hideo Aoki <haoki@redhat.com>
Index: 2.6.23-rc3-udp_limit/include/net/sock.h
===================================================================
--- 2.6.23-rc3-udp_limit.orig/include/net/sock.h
+++ 2.6.23-rc3-udp_limit/include/net/sock.h
@@ -723,6 +723,13 @@ static inline int sk_stream_wmem_schedul
sk_stream_mem_schedule(sk, size, 0);
}
+#define SK_DATAGRAM_MEM_QUANTUM ((int)PAGE_SIZE)
+
+static inline int sk_datagram_pages(int amt)
+{
+ return DIV_ROUND_UP(amt, SK_DATAGRAM_MEM_QUANTUM);
+}
+
/* Used by processes to "lock" a socket state, so that
* interrupts and bottom half handlers won't change it
* from under us. It essentially blocks any incoming
Index: 2.6.23-rc3-udp_limit/include/net/udp.h
===================================================================
--- 2.6.23-rc3-udp_limit.orig/include/net/udp.h
+++ 2.6.23-rc3-udp_limit/include/net/udp.h
@@ -65,6 +65,8 @@ extern rwlock_t udp_hash_lock;
extern struct proto udp_prot;
+extern atomic_t udp_memory_allocated;
+
struct sk_buff;
/*
Index: 2.6.23-rc3-udp_limit/net/ipv4/proc.c
===================================================================
--- 2.6.23-rc3-udp_limit.orig/net/ipv4/proc.c
+++ 2.6.23-rc3-udp_limit/net/ipv4/proc.c
@@ -66,7 +66,8 @@ static int sockstat_seq_show(struct seq_
fold_prot_inuse(&tcp_prot), atomic_read(&tcp_orphan_count),
tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
atomic_read(&tcp_memory_allocated));
- seq_printf(seq, "UDP: inuse %d\n", fold_prot_inuse(&udp_prot));
+ seq_printf(seq, "UDP: inuse %d mem %d\n", fold_prot_inuse(&udp_prot),
+ atomic_read(&udp_memory_allocated));
seq_printf(seq, "UDPLITE: inuse %d\n", fold_prot_inuse(&udplite_prot));
seq_printf(seq, "RAW: inuse %d\n", fold_prot_inuse(&raw_prot));
seq_printf(seq, "FRAG: inuse %d memory %d\n", ip_frag_nqueues,
Index: 2.6.23-rc3-udp_limit/net/ipv4/udp.c
===================================================================
--- 2.6.23-rc3-udp_limit.orig/net/ipv4/udp.c
+++ 2.6.23-rc3-udp_limit/net/ipv4/udp.c
@@ -113,6 +113,10 @@ DEFINE_SNMP_STAT(struct udp_mib, udp_sta
struct hlist_head udp_hash[UDP_HTABLE_SIZE];
DEFINE_RWLOCK(udp_hash_lock);
+atomic_t udp_memory_allocated;
+
+EXPORT_SYMBOL(udp_memory_allocated);
+
static int udp_port_rover;
static inline int __udp_lib_lport_inuse(__u16 num, struct hlist_head udptable[])
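The rounding done by sk_datagram_pages() in the patch above can be reproduced in user space. The following is a minimal sketch, assuming a 4 KiB page; the EXAMPLE_-prefixed names are stand-ins for PAGE_SIZE and DIV_ROUND_UP and are not part of the patch.

```c
/* Assumption: 4 KiB pages. In the kernel the value comes from PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 4096

/* Same rounding as the kernel's DIV_ROUND_UP(): divide, rounding up. */
#define EXAMPLE_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* User-space stand-in for sk_datagram_pages(): bytes -> whole pages. */
static int example_datagram_pages(int amt)
{
	return EXAMPLE_DIV_ROUND_UP(amt, EXAMPLE_PAGE_SIZE);
}
```

Under this assumption a 1500-byte datagram is charged one page, and a 4097-byte datagram is charged two, so small datagrams are accounted at page granularity rather than by their exact size.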
* Re: [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
2007-09-21 12:26 Satoshi OSHIMA
@ 2007-09-21 13:10 ` Andi Kleen
2007-09-27 18:54 ` Hideo AOKI
2007-09-28 13:24 ` Satoshi OSHIMA
0 siblings, 2 replies; 6+ messages in thread
From: Andi Kleen @ 2007-09-21 13:10 UTC (permalink / raw)
To: Satoshi OSHIMA; +Cc: netdev, haoki, 吉藤英明
Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> writes:
> This patch introduces a global variable for UDP memory accounting.
> The accounting unit is a page.
The global variable doesn't seem to be very MP scalable, especially
if you change it for each packet. This will be a very hot cache line,
in the worst case bouncing around a large machine.
Possible alternatives:
- Per CPU variables
- You only change the global on socket creation time (by pre allocating a large
amount) or when the system comes under memory pressure.
- Batching of the global updates for multiple packets [that's a variant
of the previous one, might be still too costly though]
Also for such variables it's usually good to cache line pad them on SMP
to avoid false sharing with something else.
-Andi
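The per-CPU and cache-line-padding suggestions above could look roughly like the following user-space sketch. It uses C11 atomics rather than kernel per-CPU primitives, and the CPU count, the 64-byte line size, and every example_ name are assumptions made for illustration.

```c
#include <stdalign.h>
#include <stdatomic.h>

#define EXAMPLE_NR_CPUS    4   /* hypothetical CPU count */
#define EXAMPLE_CACHE_LINE 64  /* assumed cache line size in bytes */

/* One counter per CPU. The alignment pads each slot to a full cache
 * line, so updates on different CPUs never touch the same line. */
struct example_percpu_pages {
	alignas(EXAMPLE_CACHE_LINE) atomic_int pages;
};

static struct example_percpu_pages example_udp_pages[EXAMPLE_NR_CPUS];

/* Charge (or, with a negative value, uncharge) pages on one CPU's slot. */
static void example_charge(int cpu, int pages)
{
	atomic_fetch_add_explicit(&example_udp_pages[cpu].pages, pages,
				  memory_order_relaxed);
}

/* Reading the total means visiting every slot. */
static int example_total_pages(void)
{
	int sum = 0;

	for (int cpu = 0; cpu < EXAMPLE_NR_CPUS; cpu++)
		sum += atomic_load_explicit(&example_udp_pages[cpu].pages,
					    memory_order_relaxed);
	return sum;
}
```

The update path touches only the local slot, while the cost moves to the reader, which has to walk all slots to obtain a total.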
* Re: [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
2007-09-21 13:10 ` Andi Kleen
@ 2007-09-27 18:54 ` Hideo AOKI
2007-09-28 13:24 ` Satoshi OSHIMA
1 sibling, 0 replies; 6+ messages in thread
From: Hideo AOKI @ 2007-09-27 18:54 UTC (permalink / raw)
To: Andi Kleen; +Cc: Satoshi OSHIMA, netdev, yoshfuji
Hello,
I apologize for not replying sooner.
Andi Kleen wrote:
> Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> writes:
>
>> This patch introduces a global variable for UDP memory accounting.
>> The accounting unit is a page.
>
> The global variable doesn't seem to be very MP scalable, especially
> if you change it for each packet. This will be a very hot cache line,
> in the worst case bouncing around a large machine.
>
> Possible alternatives:
> - Per CPU variables
> - You only change the global on socket creation time (by pre allocating a large
> amount) or when the system comes under memory pressure.
> - Batching of the global updates for multiple packets [that's a variant
> of the previous one, might be still too costly though]
>
> Also for such variables it's usually good to cache line pad them on SMP
> to avoid false sharing with something else.
>
> -Andi
Thank you so much for your suggestions.
The patch basically follows the implementation of
tcp_memory_allocated. However, I agree that it introduces too many
atomic operations, so I'll try batching to reduce their number.
Best regards,
Hideo Aoki
--
Hitachi Computer Products (America) Inc.
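The batching idea mentioned above can be sketched in user space as follows. This is only an illustration under stated assumptions: the flush threshold, the C11 atomics, and all example_ names are hypothetical and do not come from the patch series.

```c
#include <stdatomic.h>

#define EXAMPLE_BATCH_PAGES 16  /* hypothetical flush threshold */

/* Shared counter: every update to it is an atomic operation. */
static atomic_int example_global_pages;

/* Local state, e.g. one per socket or per CPU, updated without atomics. */
struct example_batch {
	int pending;  /* pages accounted locally but not yet published */
};

/* Accumulate locally; publish to the global only once the pending
 * amount reaches the threshold in either direction. */
static void example_batch_charge(struct example_batch *b, int pages)
{
	b->pending += pages;
	if (b->pending >= EXAMPLE_BATCH_PAGES ||
	    b->pending <= -EXAMPLE_BATCH_PAGES) {
		atomic_fetch_add(&example_global_pages, b->pending);
		b->pending = 0;
	}
}
```

With this scheme the global value can lag behind the true total by up to EXAMPLE_BATCH_PAGES - 1 pages per local counter, which is the price paid for fewer atomic operations.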
* Re: [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
2007-09-21 13:10 ` Andi Kleen
2007-09-27 18:54 ` Hideo AOKI
@ 2007-09-28 13:24 ` Satoshi OSHIMA
2007-09-29 3:22 ` Herbert Xu
1 sibling, 1 reply; 6+ messages in thread
From: Satoshi OSHIMA @ 2007-09-28 13:24 UTC (permalink / raw)
To: Andi Kleen; +Cc: netdev, haoki, 吉藤英明, Yumiko SUGITA
Hi,
Thank you for your comment.
Andi Kleen wrote:
> Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> writes:
>
>> This patch introduces a global variable for UDP memory accounting.
>> The accounting unit is a page.
>
> The global variable doesn't seem to be very MP scalable, especially
> if you change it for each packet. This will be a very hot cache line,
> in the worst case bouncing around a large machine.
I understand your point. However, I think the accounting
method I'm proposing is very similar to TCP accounting and
per-socket accounting.
What do you think of it?
> Possible alternatives:
> - Per CPU variables
I'm concerned that sockets and socket buffers are handled on
different CPUs. For example, a socket might be created on CPU-A
while its packets are received on CPU-B.
In addition, the per-CPU variables must be summed whenever the
socket cap is checked, so I'm afraid that per-CPU variables are
also quite costly.
> - You only change the global on socket creation time (by pre allocating a large
> amount) or when the system comes under memory pressure.
> - Batching of the global updates for multiple packets [that's a variant
> of the previous one, might be still too costly though]
>
> Also for such variables it's usually good to cache line pad them on SMP
> to avoid false sharing with something else.
I believe that memory usage accounting should be accurate.
At the moment I don't see how we could obtain accurate
accounting if the global is updated only when the system is
under memory pressure.
However, I revised the patch to avoid some atomic operations.
If I find a good way to avoid more of them, I will add it.
Satoshi Oshima
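The concern about per-CPU variables can be made concrete with a small user-space sketch. It assumes two CPUs and C11 atomics; the example_ names and the limit check are hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>

#define EXAMPLE_CPUS 2  /* hypothetical: CPU-A is 0, CPU-B is 1 */

static atomic_int example_cpu_pages[EXAMPLE_CPUS];

/* Account a delta (positive = charge, negative = uncharge) on one CPU. */
static void example_account(int cpu, int delta)
{
	atomic_fetch_add(&example_cpu_pages[cpu], delta);
}

/* The true total is only known after summing every CPU's counter. */
static int example_sum(void)
{
	int sum = 0;

	for (int cpu = 0; cpu < EXAMPLE_CPUS; cpu++)
		sum += atomic_load(&example_cpu_pages[cpu]);
	return sum;
}

/* Checking a cap therefore costs one read per CPU. */
static int example_under_limit(int limit, int want)
{
	return example_sum() + want <= limit;
}
```

If memory is charged on CPU-A and released on CPU-B, the individual counters drift apart (one positive, one negative) even though their sum stays correct, so no single counter is meaningful by itself and every cap check has to read all of them.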
* [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
@ 2007-09-28 13:40 Satoshi OSHIMA
0 siblings, 0 replies; 6+ messages in thread
From: Satoshi OSHIMA @ 2007-09-28 13:40 UTC (permalink / raw)
To: Andi Kleen, Evgeniy Polyakov, netdev
Cc: Yumiko SUGITA, "青木@RedHat",
吉藤 英明
This patch introduces a global variable for UDP memory accounting.
The accounting unit is a page.
signed-off-by: Satoshi Oshima <satoshi.oshima.fk@hitachi.com>
signed-off-by: Hideo Aoki <haoki@redhat.com>
Index: 2.6.23-rc3-udp_limit/include/net/sock.h
===================================================================
--- 2.6.23-rc3-udp_limit.orig/include/net/sock.h
+++ 2.6.23-rc3-udp_limit/include/net/sock.h
@@ -723,6 +723,13 @@ static inline int sk_stream_wmem_schedul
sk_stream_mem_schedule(sk, size, 0);
}
+#define SK_DATAGRAM_MEM_QUANTUM ((int)PAGE_SIZE)
+
+static inline int sk_datagram_pages(int amt)
+{
+ return DIV_ROUND_UP(amt, SK_DATAGRAM_MEM_QUANTUM);
+}
+
/* Used by processes to "lock" a socket state, so that
* interrupts and bottom half handlers won't change it
* from under us. It essentially blocks any incoming
Index: 2.6.23-rc3-udp_limit/include/net/udp.h
===================================================================
--- 2.6.23-rc3-udp_limit.orig/include/net/udp.h
+++ 2.6.23-rc3-udp_limit/include/net/udp.h
@@ -65,6 +65,8 @@ extern rwlock_t udp_hash_lock;
extern struct proto udp_prot;
+extern atomic_t udp_memory_allocated;
+
struct sk_buff;
/*
Index: 2.6.23-rc3-udp_limit/net/ipv4/proc.c
===================================================================
--- 2.6.23-rc3-udp_limit.orig/net/ipv4/proc.c
+++ 2.6.23-rc3-udp_limit/net/ipv4/proc.c
@@ -66,7 +66,8 @@ static int sockstat_seq_show(struct seq_
fold_prot_inuse(&tcp_prot), atomic_read(&tcp_orphan_count),
tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
atomic_read(&tcp_memory_allocated));
- seq_printf(seq, "UDP: inuse %d\n", fold_prot_inuse(&udp_prot));
+ seq_printf(seq, "UDP: inuse %d mem %d\n", fold_prot_inuse(&udp_prot),
+ atomic_read(&udp_memory_allocated));
seq_printf(seq, "UDPLITE: inuse %d\n", fold_prot_inuse(&udplite_prot));
seq_printf(seq, "RAW: inuse %d\n", fold_prot_inuse(&raw_prot));
seq_printf(seq, "FRAG: inuse %d memory %d\n", ip_frag_nqueues,
Index: 2.6.23-rc3-udp_limit/net/ipv4/udp.c
===================================================================
--- 2.6.23-rc3-udp_limit.orig/net/ipv4/udp.c
+++ 2.6.23-rc3-udp_limit/net/ipv4/udp.c
@@ -113,6 +113,10 @@ DEFINE_SNMP_STAT(struct udp_mib, udp_sta
struct hlist_head udp_hash[UDP_HTABLE_SIZE];
DEFINE_RWLOCK(udp_hash_lock);
+atomic_t udp_memory_allocated;
+
+EXPORT_SYMBOL(udp_memory_allocated);
+
static int udp_port_rover;
static inline int __udp_lib_lport_inuse(__u16 num, struct hlist_head udptable[])
* Re: [RFC/PATCH 2/3] UDP memory usage accounting: accounting unit and variable
2007-09-28 13:24 ` Satoshi OSHIMA
@ 2007-09-29 3:22 ` Herbert Xu
0 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2007-09-29 3:22 UTC (permalink / raw)
To: Satoshi OSHIMA; +Cc: andi, netdev, haoki, yoshfuji, yumiko.sugita.yf
Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com> wrote:
>
> I understand your point. However, I think the accounting
> method I'm proposing is very similar to TCP accounting and
> per-socket accounting.
> What do you think of it?
I think allowing Joe user to stop root from using TCP or UDP
isn't much better than having him hog kernel memory.
So let's fix this properly and add per-user limits rather than
system-wide ones.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt