* [PATCHv2 net-next 0/2] sctp: fully support memory accounting
From: Xin Long @ 2019-04-15 9:15 UTC
To: network dev, linux-sctp
Cc: Marcelo Ricardo Leitner, Neil Horman, davem, Matteo Croce,
Vladis Dronov
This patchset adds sctp memory accounting by using these kernel APIs
on the send side:
- sk_mem_charge()
- sk_mem_uncharge()
- sk_wmem_schedule()
- sk_under_memory_pressure()
- sk_mem_reclaim()
and these on the receive side:
- sk_mem_charge()
- sk_mem_uncharge()
- sk_rmem_schedule()
- sk_under_memory_pressure()
- sk_mem_reclaim()
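As a rough illustration of how these helpers relate to each other, here is a toy userspace model of the forward-allocation bookkeeping. It is a sketch under stated assumptions, not the kernel implementation: every `toy_*` name is hypothetical, the 4 KiB quantum is assumed, and the real helpers also handle memory pressure states and cgroup charging that are left out here.

```c
#include <stdbool.h>

/* Toy model only: all names are hypothetical, none is a kernel symbol. */
#define TOY_QUANTUM 4096L	/* assumed accounting unit (one 4 KiB page) */

struct toy_proto {
	long pages_allocated;	/* pages reserved by all sockets */
	long limit_pages;	/* hard limit, like the last sctp_mem value */
};

struct toy_sock {
	struct toy_proto *prot;
	long forward_alloc;	/* bytes reserved but not yet charged */
};

/* Model of a *_schedule() check: ensure `size` bytes are reserved. */
bool toy_mem_schedule(struct toy_sock *sk, long size)
{
	long pages;

	if (size <= sk->forward_alloc)
		return true;	/* already covered by the reservation */

	/* Reserve whole quanta for the request, if the limit allows. */
	pages = (size + TOY_QUANTUM - 1) / TOY_QUANTUM;
	if (sk->prot->pages_allocated + pages > sk->prot->limit_pages)
		return false;

	sk->prot->pages_allocated += pages;
	sk->forward_alloc += pages * TOY_QUANTUM;
	return true;
}

/* Charging consumes the reservation; uncharging gives it back. */
void toy_mem_charge(struct toy_sock *sk, long size)
{
	sk->forward_alloc -= size;
}

void toy_mem_uncharge(struct toy_sock *sk, long size)
{
	sk->forward_alloc += size;
}

/* Model of reclaim: return whole unused quanta to the shared pool. */
void toy_mem_reclaim(struct toy_sock *sk)
{
	long pages = sk->forward_alloc / TOY_QUANTUM;

	if (pages > 0) {
		sk->forward_alloc -= pages * TOY_QUANTUM;
		sk->prot->pages_allocated -= pages;
	}
}
```

In this model a caller schedules before charging, uncharges when the buffer is freed, and reclaims to hand unused quanta back, which is the order the list above suggests.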
With sctp memory accounting, memory allocation can be limited through
either sysctl:
# sysctl -w net.sctp.sctp_mem="10 20 50"
or cgroup:
# echo $((8<<14)) > \
/sys/fs/cgroup/memory/sctp_mem/memory.kmem.tcp.limit_in_bytes
When the socket is under memory pressure, the send side will block
and wait, while the receive side will renege or drop.
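To put the two example limits above on the same scale, the sketch below converts between pages and bytes. It assumes a 4 KiB page size (check `getconf PAGESIZE` on the target system) and that the sctp_mem fields are counted in pages; the helper names are made up for illustration.

```c
/* Assumption: 4 KiB pages; this differs on some architectures. */
#define ASSUMED_PAGE_SIZE 4096L

/* One sctp_mem field (counted in pages) expressed in bytes. */
long sctp_mem_pages_to_bytes(long pages)
{
	return pages * ASSUMED_PAGE_SIZE;
}

/* A byte limit (as written to the cgroup file) in whole pages. */
long limit_bytes_to_pages(long bytes)
{
	return bytes / ASSUMED_PAGE_SIZE;
}
```

Under that assumption, the sysctl values "10 20 50" correspond to 40960, 81920 and 204800 bytes, and the cgroup value $((8<<14)) is 131072 bytes, or 32 pages.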
v1->v2:
- add the missing Reported-by/Tested-by/Acked-by tags.
Xin Long (2):
sctp: implement memory accounting on tx path
sctp: implement memory accounting on rx path
 include/net/sctp/sctp.h |  2 +-
 net/sctp/sm_statefuns.c |  6 ++++--
 net/sctp/socket.c       | 10 ++++++++--
 net/sctp/ulpevent.c     | 19 ++++++++-----------
 net/sctp/ulpqueue.c     |  3 ++-
 5 files changed, 23 insertions(+), 17 deletions(-)
--
2.1.0
* [PATCHv2 net-next 1/2] sctp: implement memory accounting on tx path
From: Xin Long @ 2019-04-15 9:15 UTC
To: network dev, linux-sctp
Cc: Marcelo Ricardo Leitner, Neil Horman, davem, Matteo Croce,
Vladis Dronov

When sending packets, sk_mem_charge() and sk_mem_uncharge() are already
used to update sk_forward_alloc. We just need to call sk_wmem_schedule()
to check whether the allocated memory should be raised, and call
sk_mem_reclaim() to check whether it should be reduced when the socket
is under memory pressure.

If sk_wmem_schedule() returns false, meaning no more memory may be
allocated, the sender blocks and waits for memory to become available.

Note that, unlike tcp, sctp's wait_for_buf happens before any skb is
allocated, so the memory accounting check is also done beforehand,
against the whole msg_len.

Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 6140471..06c6f4a 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1913,7 +1913,10 @@ static int sctp_sendmsg_to_asoc(struct sctp_association *asoc,
 	if (sctp_wspace(asoc) < (int)msg_len)
 		sctp_prsctp_prune(asoc, sinfo, msg_len - sctp_wspace(asoc));
 
-	if (sctp_wspace(asoc) <= 0) {
+	if (sk_under_memory_pressure(sk))
+		sk_mem_reclaim(sk);
+
+	if (sctp_wspace(asoc) <= 0 || !sk_wmem_schedule(sk, msg_len)) {
 		timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 		err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len);
 		if (err)
@@ -8891,7 +8894,10 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 			goto do_error;
 		if (signal_pending(current))
 			goto do_interrupted;
-		if ((int)msg_len <= sctp_wspace(asoc))
+		if (sk_under_memory_pressure(sk))
+			sk_mem_reclaim(sk);
+		if ((int)msg_len <= sctp_wspace(asoc) &&
+		    sk_wmem_schedule(sk, msg_len))
 			break;
 
 		/* Let another process have a go. Since we are going
--
2.1.0
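The two conditions this patch changes can be restated as plain predicates. The following is a minimal userspace sketch, not kernel code: `struct tx_state` and both function names are hypothetical, and the two fields stand in for the results of sctp_wspace(asoc) and sk_wmem_schedule(sk, msg_len).

```c
#include <stdbool.h>

/* Hypothetical snapshot of what the tx path consults. */
struct tx_state {
	int wspace;		/* stand-in for sctp_wspace(asoc) */
	bool can_schedule;	/* stand-in for sk_wmem_schedule(sk, msg_len) */
};

/* Entry check in sctp_sendmsg_to_asoc() after the patch: must we wait? */
bool tx_must_wait(const struct tx_state *st)
{
	return st->wspace <= 0 || !st->can_schedule;
}

/* Loop exit check in sctp_wait_for_sndbuf() after the patch. */
bool tx_may_proceed(const struct tx_state *st, int msg_len)
{
	return msg_len <= st->wspace && st->can_schedule;
}
```

The sketch makes one asymmetry in the diff easier to see: the sender starts waiting when the send window is exhausted or memory cannot be scheduled, but it only stops waiting once the whole message fits in the window and memory can be scheduled.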
* [PATCHv2 net-next 2/2] sctp: implement memory accounting on rx path
From: Xin Long @ 2019-04-15 9:15 UTC
To: network dev, linux-sctp
Cc: Marcelo Ricardo Leitner, Neil Horman, davem, Matteo Croce,
Vladis Dronov

sk_forward_alloc is already updated on the rx path, but for consistency
we change sctp_skb_set_owner_r() to use sk_mem_charge().

In sctp_eat_data(), checking only sctp_memory_pressure is not enough,
because it doesn't work for mem_cgroup_sockets_enabled, so we change to
use sk_under_memory_pressure().

When the socket is under memory pressure, sk_mem_reclaim() and
sk_rmem_schedule() should be called on both the RENEGE and CHUNK
DELIVERY paths, to exit the memory pressure status as soon as possible.

Note that sk_rmem_schedule() uses datalen to keep things simple there.

Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 include/net/sctp/sctp.h |  2 +-
 net/sctp/sm_statefuns.c |  6 ++++--
 net/sctp/ulpevent.c     | 19 ++++++++-----------
 net/sctp/ulpqueue.c     |  3 ++-
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 1d13ec3..eefdfa5 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -421,7 +421,7 @@ static inline void sctp_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 	/*
 	 * This mimics the behavior of skb_set_owner_r
 	 */
-	sk->sk_forward_alloc -= event->rmem_len;
+	sk_mem_charge(sk, event->rmem_len);
 }
 
 /* Tests if the list has one and only one entry. */
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index c9ae340..7dfc34b 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -6412,13 +6412,15 @@ static int sctp_eat_data(const struct sctp_association *asoc,
 	 * in sctp_ulpevent_make_rcvmsg will drop the frame if we grow our
 	 * memory usage too much
 	 */
-	if (*sk->sk_prot_creator->memory_pressure) {
+	if (sk_under_memory_pressure(sk)) {
 		if (sctp_tsnmap_has_gap(map) &&
 		    (sctp_tsnmap_get_ctsn(map) + 1) == tsn) {
 			pr_debug("%s: under pressure, reneging for tsn:%u\n",
 				 __func__, tsn);
 			deliver = SCTP_CMD_RENEGE;
-		}
+		} else {
+			sk_mem_reclaim(sk);
+		}
 	}
 
 	/*
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 8cb7d98..c2a7478 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -634,8 +634,9 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
 						gfp_t gfp)
 {
 	struct sctp_ulpevent *event = NULL;
-	struct sk_buff *skb;
-	size_t padding, len;
+	struct sk_buff *skb = chunk->skb;
+	struct sock *sk = asoc->base.sk;
+	size_t padding, datalen;
 	int rx_count;
 
 	/*
@@ -646,15 +647,12 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
 	if (asoc->ep->rcvbuf_policy)
 		rx_count = atomic_read(&asoc->rmem_alloc);
 	else
-		rx_count = atomic_read(&asoc->base.sk->sk_rmem_alloc);
+		rx_count = atomic_read(&sk->sk_rmem_alloc);
 
-	if (rx_count >= asoc->base.sk->sk_rcvbuf) {
+	datalen = ntohs(chunk->chunk_hdr->length);
 
-		if ((asoc->base.sk->sk_userlocks & SOCK_RCVBUF_LOCK) ||
-		    (!sk_rmem_schedule(asoc->base.sk, chunk->skb,
-				       chunk->skb->truesize)))
-			goto fail;
-	}
+	if (rx_count >= sk->sk_rcvbuf || !sk_rmem_schedule(sk, skb, datalen))
+		goto fail;
 
 	/* Clone the original skb, sharing the data. */
 	skb = skb_clone(chunk->skb, gfp);
@@ -681,8 +679,7 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
 	 * The sender should never pad with more than 3 bytes. The receiver
 	 * MUST ignore the padding bytes.
 	 */
-	len = ntohs(chunk->chunk_hdr->length);
-	padding = SCTP_PAD4(len) - len;
+	padding = SCTP_PAD4(datalen) - datalen;
 
 	/* Fixup cloned skb with just this chunks data. */
 	skb_trim(skb, chunk->chunk_end - padding - skb->data);
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 5dde921..770ff1f 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -1106,7 +1106,8 @@ void sctp_ulpq_renege(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 			freed += sctp_ulpq_renege_frags(ulpq, needed - freed);
 	}
 	/* If able to free enough room, accept this chunk. */
-	if (freed >= needed) {
+	if (sk_rmem_schedule(asoc->base.sk, chunk->skb, needed) &&
+	    freed >= needed) {
 		int retval = sctp_ulpq_tail_data(ulpq, chunk, gfp);
 		/*
 		 * Enter partial delivery if chunk has not been
--
2.1.0
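Two small pieces of the rx change can be checked in isolation: the padding computed from datalen, and the new admission test in sctp_ulpevent_make_rcvmsg(). The sketch below is a userspace approximation with hypothetical names; `PAD4` is assumed to mirror the round-up-to-4 behaviour of the kernel's SCTP_PAD4(), and `can_schedule` stands in for the result of sk_rmem_schedule(sk, skb, datalen).

```c
#include <stdbool.h>
#include <stddef.h>

/* Round up to a multiple of 4 (assumed equivalent to SCTP_PAD4()). */
#define PAD4(s) (((s) + 3) & ~(size_t)3)

/* Padding bytes that follow a chunk of `datalen` bytes (0 to 3). */
size_t chunk_padding(size_t datalen)
{
	return PAD4(datalen) - datalen;
}

/*
 * Post-patch admission test: the chunk is dropped when the receive
 * buffer is already full or when memory cannot be scheduled.
 */
bool rx_accept(int rx_count, int rcvbuf, bool can_schedule)
{
	return !(rx_count >= rcvbuf || !can_schedule);
}
```

Compared with the removed lines in the diff, the memory check now applies to every chunk: previously sk_rmem_schedule() was only consulted once rx_count had reached sk_rcvbuf.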
* Re: [PATCHv2 net-next 0/2] sctp: fully support memory accounting
From: David Miller @ 2019-04-15 20:37 UTC
To: lucien.xin
Cc: netdev, linux-sctp, marcelo.leitner, nhorman, mcroce, vdronov

From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 15 Apr 2019 17:15:05 +0800

> sctp memory accounting is added in this patchset by using
> these kernel APIs on send side:
>
> - sk_mem_charge()
> - sk_mem_uncharge()
> - sk_wmem_schedule()
> - sk_under_memory_pressure()
> - sk_mem_reclaim()
>
> and these on receive side:
>
> - sk_mem_charge()
> - sk_mem_uncharge()
> - sk_rmem_schedule()
> - sk_under_memory_pressure()
> - sk_mem_reclaim()
>
> With sctp memory accounting, we can limit the memory allocation by
> either sysctl:
>
> # sysctl -w net.sctp.sctp_mem="10 20 50"
>
> or cgroup:
>
> # echo $((8<<14)) > \
> /sys/fs/cgroup/memory/sctp_mem/memory.kmem.tcp.limit_in_bytes
>
> When the socket is under memory pressure, the send side will block
> and wait, while the receive side will renege or drop.
>
> v1->v2:
> - add the missing Reported/Tested/Acked/-bys.

Series applied, thanks Xin.