From: Sasha Levin <sasha.levin@oracle.com>
To: stable@vger.kernel.org, stable-commits@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>,
Sasha Levin <sasha.levin@oracle.com>
Subject: [added to the 3.18 stable tree] tcp: avoid looping in tcp_send_fin()
Date: Mon, 11 May 2015 07:16:51 -0400 [thread overview]
Message-ID: <1431343152-19437-30-git-send-email-sasha.levin@oracle.com> (raw)
In-Reply-To: <1431343152-19437-1-git-send-email-sasha.levin@oracle.com>
From: Eric Dumazet <edumazet@google.com>
This patch has been added to the 3.18 stable tree. If you have any
objections, please let us know.
===============
[ Upstream commit 845704a535e9b3c76448f52af1b70e4422ea03fd ]
Presence of an unbound loop in tcp_send_fin() had always been hard
to explain when analyzing crash dumps involving gigantic dying processes
with millions of sockets.
Lets try a different strategy :
In case of memory pressure, try to add the FIN flag to last packet
in write queue, even if packet was already sent. TCP stack will
be able to deliver this FIN after a timeout event. Note that this
FIN being delivered by a retransmit, it also carries a Push flag
given our current implementation.
By checking sk_under_memory_pressure(), we anticipate that cooking
many FIN packets might deplete tcp memory.
In the case we could not allocate a packet, even with __GFP_WAIT
allocation, then not sending a FIN seems quite reasonable if it allows
to get rid of this socket, free memory, and not block the process from
eventually doing other useful work.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
---
net/ipv4/tcp_output.c | 50 +++++++++++++++++++++++++++++---------------------
1 file changed, 29 insertions(+), 21 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c253999..dc9f925 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2719,7 +2719,8 @@ begin_fwd:
/* We allow to exceed memory limits for FIN packets to expedite
* connection tear down and (memory) recovery.
- * Otherwise tcp_send_fin() could loop forever.
+ * Otherwise tcp_send_fin() could be tempted to either delay FIN
+ * or even be forced to close flow without any FIN.
*/
static void sk_forced_wmem_schedule(struct sock *sk, int size)
{
@@ -2732,33 +2733,40 @@ static void sk_forced_wmem_schedule(struct sock *sk, int size)
sk_memory_allocated_add(sk, amt, &status);
}
-/* Send a fin. The caller locks the socket for us. This cannot be
- * allowed to fail queueing a FIN frame under any circumstances.
+/* Send a FIN. The caller locks the socket for us.
+ * We should try to send a FIN packet really hard, but eventually give up.
*/
void tcp_send_fin(struct sock *sk)
{
+ struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk);
struct tcp_sock *tp = tcp_sk(sk);
- struct sk_buff *skb = tcp_write_queue_tail(sk);
- int mss_now;
- /* Optimization, tack on the FIN if we have a queue of
- * unsent frames. But be careful about outgoing SACKS
- * and IP options.
+ /* Optimization, tack on the FIN if we have one skb in write queue and
+ * this skb was not yet sent, or we are under memory pressure.
+ * Note: in the latter case, FIN packet will be sent after a timeout,
+ * as TCP stack thinks it has already been transmitted.
*/
- mss_now = tcp_current_mss(sk);
-
- if (tcp_send_head(sk) != NULL) {
- TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_FIN;
- TCP_SKB_CB(skb)->end_seq++;
+ if (tskb && (tcp_send_head(sk) || sk_under_memory_pressure(sk))) {
+coalesce:
+ TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;
+ TCP_SKB_CB(tskb)->end_seq++;
tp->write_seq++;
+ if (!tcp_send_head(sk)) {
+ /* This means tskb was already sent.
+ * Pretend we included the FIN on previous transmit.
+ * We need to set tp->snd_nxt to the value it would have
+ * if FIN had been sent. This is because retransmit path
+ * does not change tp->snd_nxt.
+ */
+ tp->snd_nxt++;
+ return;
+ }
} else {
- /* Socket is locked, keep trying until memory is available. */
- for (;;) {
- skb = alloc_skb_fclone(MAX_TCP_HEADER,
- sk->sk_allocation);
- if (skb)
- break;
- yield();
+ skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation);
+ if (unlikely(!skb)) {
+ if (tskb)
+ goto coalesce;
+ return;
}
skb_reserve(skb, MAX_TCP_HEADER);
sk_forced_wmem_schedule(sk, skb->truesize);
@@ -2767,7 +2775,7 @@ void tcp_send_fin(struct sock *sk)
TCPHDR_ACK | TCPHDR_FIN);
tcp_queue_skb(sk, skb);
}
- __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF);
+ __tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF);
}
/* We get here when a process closes a file descriptor (either due to
--
2.1.0
next prev parent reply other threads:[~2015-05-11 11:20 UTC|newest]
Thread overview: 106+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-11 11:16 [added to the 3.18 stable tree] kvm: add a memslot flag for incoherent memory regions Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm, arm64: KVM: allow forced dcache flush on page faults Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm, arm64: KVM: handle potential incoherency of readonly memslots Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Introduce stage2_unmap_vm Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: vgic: move reset initialization into vgic_init_maps() Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: vgic: kick the specific vcpu instead of iterating through all Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Initialize the vgic on-demand when injecting IRQs Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Require in-kernel vgic for the arch timers Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] KVM: arm/arm64: vgic: vgic_init returns -ENODEV when no online vcpu Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm64: KVM: Fix TLB invalidation by IPA/VMID Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm64: KVM: Fix HCR setting for 32bit guests Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Invalidate data cache on unmap Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] ARM: KVM: Fix size check in __coherent_cache_guest_page Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm64: KVM: Do not use pgd_index to index stage-2 pgd Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] arm/arm64: KVM: Keep elrsr/aisr in sync with software model Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] mlx4: Fix tx ring affinity_mask creation Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] net/mlx4_en: Schedule napi when RX buffers allocation fails Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] ipv4: Missing sk_nulls_node_init() in ping_unhash() Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] ip_forward: Drop frames with attached skb->sk Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] net: add skb_checksum_complete_unset Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] ppp: call skb_checksum_complete_unset in ppp_receive_frame Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] tcp: fix possible deadlock in tcp_send_fin() Sasha Levin
2015-05-11 11:16 ` Sasha Levin [this message]
2015-05-11 11:16 ` [added to the 3.18 stable tree] net: do not deplete pfmemalloc reserve Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] net: fix crash in build_skb() Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] x86/asm/decoder: Fix and enforce max instruction size in the insn decoder Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] sched/idle/x86: Optimize unnecessary mwait_idle() resched IPIs Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] KVM: x86: Fix MSR_IA32_BNDCFGS in msrs_to_save Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] Btrfs: fix log tree corruption when fs mounted with -o discard Sasha Levin
2015-05-11 11:16 ` [added to the 3.18 stable tree] btrfs: don't accept bare namespace as a valid xattr Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] Btrfs: fix inode eviction infinite loop after cloning into it Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] Btrfs: fix inode eviction infinite loop after extent_same ioctl Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: gadget: printer: enqueue printer's response for setup request Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: s390: fix handling of write errors in the tpi handler Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: s390: reinjection of irqs can fail " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: s390: Zero out current VMDB of STSI before including level3 data Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: s390: no need to hold the kvm->mutex for floating interrupts Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: s390: fix get_all_floating_irqs Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] s390/hibernate: fix save and restore of kernel text section Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: use slowpath for cross page cached accesses Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] KVM: arm/arm64: check IRQ number on userland injection Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: KVM: Handle MSA Disabled exceptions from guest Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: lose_fpu(): Disable FPU when MSA enabled Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: Malta: Detect and fix bad memsize values Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: asm: asm-eva: Introduce kernel load/store variants Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: Loongson-3: Add IRQF_NO_SUSPEND to Cascade irqaction Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] MIPS: Hibernate: flush TLB entries earlier Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] staging: panel: fix lcd type Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] staging: android: sync: Fix memory corruption in sync_timeline_signal() Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] md/raid0: fix bug with chunksize not a power of 2 Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] cdc-wdm: fix endianness bug in debug statements Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] mmc: sunxi: Use devm_reset_control_get_optional() for reset control Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] spi: imx: read back the RX/TX watermark levels earlier Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] spi: spidev: fix possible arithmetic overflow for multi-transfer message Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] compal-laptop: Fix leaking hwmon device Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] compal-laptop: Check return value of power_supply_register Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ring-buffer: Replace this_cpu_*() with __this_cpu_*() Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] power_supply: twl4030_madc: Check return value of power_supply_register Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] power_supply: lp8788-charger: Fix leaked power supply on probe fail Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] power_supply: ipaq_micro_battery: Fix leaking workqueue Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] power_supply: ipaq_micro_battery: Check return values in probe Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] NFS: fix BUG() crash in notify_change() with patch to chown_common() Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ARM: fix broken hibernation Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ARM: mvebu: Disable CPU Idle on Armada 38x Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ARM: dts: dove: Fix uart[23] reg property Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: musb: core: fix TX/RX endpoint order Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: phy: Find the right match in devm_usb_phy_match Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: define a generic USB_RESUME_TIMEOUT macro Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: musb: use new USB_RESUME_TIMEOUT Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: oxu210hp: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: fusbh200: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: uhci: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: fotg210: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: r8a66597: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: isp116x: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: xhci: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: ehci: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: host: sl811: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] usb: core: hub: " Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] clk: at91: usb: propagate rate modification to the parent clk Sasha Levin
2015-05-15 7:16 ` Boris Brezillon
2015-05-16 0:11 ` Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ALSA: hda - Add dock support for ThinkPad X250 (17aa:2226) Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ALSA: emu10k1: don't deadlock in proc-functions Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T450 Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ALSA: hda - fix "num_steps = 0" error on ALC256 Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ALSA: hda/realtek - Fix Headphone Mic doesn't recording for ALC256 Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] Input: elantech - fix absolute mode setting on some ASUS laptops Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] fs/binfmt_elf.c: fix bug in loading of PIE binaries Sasha Levin
2015-05-11 11:17 ` [added to the 3.18 stable tree] ptrace: fix race between ptrace_resume() and wait_task_stopped() Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] NFC: st21nfcb: Retry i2c_master_send if it returns a negative value Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] rtlwifi: rtl8192cu: Add new USB ID Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] rtlwifi: rtl8192cu: Add new device ID Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] ext4: make fsync to sync parent dir in no-journal for real this time Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] mnt: Improve the umount_tree flags Sasha Levin
2015-05-11 11:18 ` [added to the 3.18 stable tree] mnt: Don't propagate umounts in __detach_mounts Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1431343152-19437-30-git-send-email-sasha.levin@oracle.com \
--to=sasha.levin@oracle.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=stable-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox