From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Denys Fedoryshchenko <nuclearcat@nuclearcat.com>,
Eric Dumazet <edumazet@google.com>,
Yuchung Cheng <ycheng@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.10 15/55] tcp: make connect() mem charging friendly
Date: Tue, 24 Mar 2015 16:42:55 +0100 [thread overview]
Message-ID: <20150324154159.393265345@linuxfoundation.org> (raw)
In-Reply-To: <20150324154158.748418668@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 355a901e6cf1b2b763ec85caa2a9f04fbcc4ab4a ]
While working on sk_forward_alloc problems reported by Denys
Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
sk_forward_alloc is negative while connect is in progress.
We can fix this by calling regular sk_stream_alloc_skb() both for the
SYN packet (in tcp_connect()) and the syn_data packet in
tcp_send_syn_data()
Then, tcp_send_syn_data() can avoid copying syn_data as we simply
can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
Instead of open coding memcpy_fromiovecend(), simply use this helper.
This leaves in socket write queue clean fast clone skbs.
This was tested against our fastopen packetdrill tests.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/tcp_output.c | 66 +++++++++++++++++++++-----------------------------
1 file changed, 29 insertions(+), 37 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2871,9 +2871,9 @@ static int tcp_send_syn_data(struct sock
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_fastopen_request *fo = tp->fastopen_req;
- int syn_loss = 0, space, i, err = 0, iovlen = fo->data->msg_iovlen;
- struct sk_buff *syn_data = NULL, *data;
+ int syn_loss = 0, space, err = 0;
unsigned long last_syn_loss = 0;
+ struct sk_buff *syn_data;
tp->rx_opt.mss_clamp = tp->advmss; /* If MSS is not cached */
tcp_fastopen_cache_get(sk, &tp->rx_opt.mss_clamp, &fo->cookie,
@@ -2904,42 +2904,38 @@ static int tcp_send_syn_data(struct sock
/* limit to order-0 allocations */
space = min_t(size_t, space, SKB_MAX_HEAD(MAX_TCP_HEADER));
- syn_data = skb_copy_expand(syn, MAX_TCP_HEADER, space,
- sk->sk_allocation);
- if (syn_data == NULL)
+ syn_data = sk_stream_alloc_skb(sk, space, sk->sk_allocation);
+ if (!syn_data)
goto fallback;
-
- for (i = 0; i < iovlen && syn_data->len < space; ++i) {
- struct iovec *iov = &fo->data->msg_iov[i];
- unsigned char __user *from = iov->iov_base;
- int len = iov->iov_len;
-
- if (syn_data->len + len > space)
- len = space - syn_data->len;
- else if (i + 1 == iovlen)
- /* No more data pending in inet_wait_for_connect() */
- fo->data = NULL;
-
- if (skb_add_data(syn_data, from, len))
- goto fallback;
- }
-
- /* Queue a data-only packet after the regular SYN for retransmission */
- data = pskb_copy(syn_data, sk->sk_allocation);
- if (data == NULL)
+ syn_data->ip_summed = CHECKSUM_PARTIAL;
+ memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
+ if (unlikely(memcpy_fromiovecend(skb_put(syn_data, space),
+ fo->data->msg_iov, 0, space))) {
+ kfree_skb(syn_data);
goto fallback;
- TCP_SKB_CB(data)->seq++;
- TCP_SKB_CB(data)->tcp_flags &= ~TCPHDR_SYN;
- TCP_SKB_CB(data)->tcp_flags = (TCPHDR_ACK|TCPHDR_PSH);
- tcp_connect_queue_skb(sk, data);
- fo->copied = data->len;
+ }
- if (tcp_transmit_skb(sk, syn_data, 0, sk->sk_allocation) == 0) {
+ /* No more data pending in inet_wait_for_connect() */
+ if (space == fo->size)
+ fo->data = NULL;
+ fo->copied = space;
+
+ tcp_connect_queue_skb(sk, syn_data);
+
+ err = tcp_transmit_skb(sk, syn_data, 1, sk->sk_allocation);
+
+ /* Now full SYN+DATA was cloned and sent (or not),
+ * remove the SYN from the original skb (syn_data)
+ * we keep in write queue in case of a retransmit, as we
+ * also have the SYN packet (with no data) in the same queue.
+ */
+ TCP_SKB_CB(syn_data)->seq++;
+ TCP_SKB_CB(syn_data)->tcp_flags = TCPHDR_ACK | TCPHDR_PSH;
+ if (!err) {
tp->syn_data = (fo->copied > 0);
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENACTIVE);
goto done;
}
- syn_data = NULL;
fallback:
/* Send a regular SYN with Fast Open cookie request option */
@@ -2948,7 +2944,6 @@ fallback:
err = tcp_transmit_skb(sk, syn, 1, sk->sk_allocation);
if (err)
tp->syn_fastopen = 0;
- kfree_skb(syn_data);
done:
fo->cookie.len = -1; /* Exclude Fast Open option for SYN retries */
return err;
@@ -2968,13 +2963,10 @@ int tcp_connect(struct sock *sk)
return 0;
}
- buff = alloc_skb_fclone(MAX_TCP_HEADER + 15, sk->sk_allocation);
- if (unlikely(buff == NULL))
+ buff = sk_stream_alloc_skb(sk, 0, sk->sk_allocation);
+ if (unlikely(!buff))
return -ENOBUFS;
- /* Reserve space for headers. */
- skb_reserve(buff, MAX_TCP_HEADER);
-
tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
tp->retrans_stamp = TCP_SKB_CB(buff)->when = tcp_time_stamp;
tcp_connect_queue_skb(sk, buff);
next prev parent reply other threads:[~2015-03-24 15:45 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-24 15:42 [PATCH 3.10 00/55] 3.10.73-stable review Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 01/55] sparc32: destroy_context() and switch_mm() needs to disable interrupts Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 02/55] sparc: semtimedop() unreachable due to comparison error Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 03/55] sparc: perf: Remove redundant perf_pmu_{en|dis}able calls Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 04/55] sparc: perf: Make counting mode actually work Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 05/55] sparc: Touch NMI watchdog when walking cpus and calling printk Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 06/55] sparc64: Fix several bugs in memmove() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 07/55] net: sysctl_net_core: check SNDBUF and RCVBUF for min length Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 08/55] rds: avoid potential stack overflow Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 09/55] inet_diag: fix possible overflow in inet_diag_dump_one_icsk() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 10/55] caif: fix MSG_OOB test in caif_seqpkt_recvmsg() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 11/55] rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 12/55] Revert "net: cx82310_eth: use common match macro" Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 13/55] tcp: fix tcp fin memory accounting Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 14/55] net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Greg Kroah-Hartman
2015-03-24 15:42 ` Greg Kroah-Hartman [this message]
2015-03-24 15:42 ` [PATCH 3.10 17/55] drm/radeon: do a posting read in evergreen_set_irq Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 18/55] drm/radeon: do a posting read in r100_set_irq Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 19/55] drm/radeon: do a posting read in r600_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 20/55] drm/radeon: do a posting read in si_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 21/55] drm/radeon: do a posting read in rs600_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 23/55] fuse: set stolen page uptodate Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 24/55] fuse: notify: dont move pages Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 25/55] virtio_console: init work unconditionally Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 26/55] Change email address for 8250_pci Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 27/55] can: add missing initialisations in CAN related skbuffs Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 28/55] workqueue: fix hang involving racing cancel[_delayed]_work_sync()s for PREEMPT_NONE Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 29/55] tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 30/55] spi: pl022: Fix race in giveback() leading to driver lock-up Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 31/55] ALSA: control: Add sanity checks for user ctl id name string Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 32/55] ALSA: hda - Fix built-in mic on Compaq Presario CQ60 Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 33/55] ALSA: hda - Dont access stereo amps for mono channel widgets Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 34/55] ALSA: hda - Set single_adc_amp flag for CS420x codecs Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 35/55] ALSA: hda - Add workaround for MacBook Air 5,2 built-in mic Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 36/55] ALSA: hda - Treat stereo-to-mono mix properly Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 37/55] regulator: Only enable disabled regulators on resume Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 38/55] regulator: core: Fix enable GPIO reference counting Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 39/55] nilfs2: fix deadlock of segment constructor during recovery Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 40/55] xen-pciback: limit guest control of command register Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 41/55] libsas: Fix Kernel Crash in smp_execute_task Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 42/55] crypto: aesni - fix memory usage in GCM decryption Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 43/55] x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 44/55] x86/fpu: Drop_fpu() should not assume that tsk equals current Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 45/55] x86/vdso: Fix the build on GCC5 Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 46/55] powerpc/smp: Wait until secondaries are active & online Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 47/55] ipvs: add missing ip_vs_pe_put in sync code Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 48/55] ipvs: rerouting to local clients is not needed anymore Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 49/55] ARM: at91: pm: fix at91rm9200 standby Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 50/55] target: Fix reference leak in target_get_sess_cmd() error path Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 51/55] iscsi-target: Avoid early conn_logout_comp for iser connections Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 52/55] target/pscsi: Fix NULL pointer dereference in get_device_type Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 53/55] target: Fix R_HOLDER bit usage for AllRegistrants Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 54/55] target: Allow AllRegistrants to re-RESERVE existing reservation Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 55/55] target: Allow Write Exclusive non-reservation holders to READ Greg Kroah-Hartman
2015-03-25 2:34 ` [PATCH 3.10 00/55] 3.10.73-stable review Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150324154159.393265345@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nuclearcat@nuclearcat.com \
--cc=stable@vger.kernel.org \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox