linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Tan Hu <tan.hu@zte.com.cn>,
	Jiang Biao <jiang.biao2@zte.com.cn>, Julian Anastasov <ja@ssi.bg>,
	Simon Horman <horms@verge.net.au>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Sasha Levin <alexander.levin@microsoft.com>
Subject: [PATCH 4.4 17/60] ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()
Date: Thu, 13 Sep 2018 15:30:24 +0200	[thread overview]
Message-ID: <20180913131746.176139405@linuxfoundation.org> (raw)
In-Reply-To: <20180913131745.261413581@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tan Hu <tan.hu@zte.com.cn>

[ Upstream commit a53b42c11815d2357e31a9403ae3950517525894 ]

We came across infinite loop in ipvs when using ipvs in docker
env.

When ipvs receives new packets and cannot find an ipvs connection,
it will create a new connection, then if the dest is unavailable
(i.e. IP_VS_DEST_F_AVAILABLE), the packet will be dropped sliently.

But if the dropped packet is the first packet of this connection,
the connection control timer never has a chance to start and the
ipvs connection cannot be released. This will lead to memory leak, or
infinite loop in cleanup_net() when net namespace is released like
this:

    ip_vs_conn_net_cleanup at ffffffffa0a9f31a [ip_vs]
    __ip_vs_cleanup at ffffffffa0a9f60a [ip_vs]
    ops_exit_list at ffffffff81567a49
    cleanup_net at ffffffff81568b40
    process_one_work at ffffffff810a851b
    worker_thread at ffffffff810a9356
    kthread at ffffffff810b0b6f
    ret_from_fork at ffffffff81697a18

race condition:
    CPU1                           CPU2
    ip_vs_in()
      ip_vs_conn_new()
                                   ip_vs_del_dest()
                                     __ip_vs_unlink_dest()
                                       ~IP_VS_DEST_F_AVAILABLE
      cp->dest && !IP_VS_DEST_F_AVAILABLE
      __ip_vs_conn_put
    ...
    cleanup_net  ---> infinite looping

Fix this by checking whether the timer already started.

Signed-off-by: Tan Hu <tan.hu@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/netfilter/ipvs/ip_vs_core.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1809,13 +1809,20 @@ ip_vs_in(struct netns_ipvs *ipvs, unsign
 	if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) {
 		/* the destination server is not available */
 
-		if (sysctl_expire_nodest_conn(ipvs)) {
+		__u32 flags = cp->flags;
+
+		/* when timer already started, silently drop the packet.*/
+		if (timer_pending(&cp->timer))
+			__ip_vs_conn_put(cp);
+		else
+			ip_vs_conn_put(cp);
+
+		if (sysctl_expire_nodest_conn(ipvs) &&
+		    !(flags & IP_VS_CONN_F_ONE_PACKET)) {
 			/* try to expire the connection immediately */
 			ip_vs_conn_expire_now(cp);
 		}
-		/* don't restart its timer, and silently
-		   drop the packet. */
-		__ip_vs_conn_put(cp);
+
 		return NF_DROP;
 	}
 



  parent reply	other threads:[~2018-09-13 13:31 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-13 13:30 [PATCH 4.4 00/60] 4.4.156-stable review Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 01/60] x86/speculation/l1tf: Fix up pte->pfn conversion for PAE Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 02/60] staging: android: ion: fix ION_IOC_{MAP,SHARE} use-after-free Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 03/60] net: bcmgenet: use MAC link status for fixed phy Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 04/60] qlge: Fix netdev features configuration Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 05/60] tcp: do not restart timewait timer on rst reception Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 06/60] vti6: remove !skb->ignore_df check from vti6_xmit() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 07/60] cifs: check if SMB2 PDU size has been padded and suppress the warning Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 08/60] hfsplus: dont return 0 when fill_super() failed Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 09/60] hfs: prevent crash on exit from failed search Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 10/60] fork: dont copy inconsistent signal handler state to child Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 11/60] reiserfs: change j_timestamp type to time64_t Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 12/60] hfsplus: fix NULL dereference in hfsplus_lookup() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 13/60] fat: validate ->i_start before using Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 14/60] scripts: modpost: check memory allocation results Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 15/60] mm/fadvise.c: fix signed overflow UBSAN complaint Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 16/60] fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot() Greg Kroah-Hartman
2018-09-13 13:30 ` Greg Kroah-Hartman [this message]
2018-09-13 13:30 ` [PATCH 4.4 18/60] mfd: sm501: Set coherent_dma_mask when creating subdevices Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 19/60] platform/x86: asus-nb-wmi: Add keymap entry for lid flip action on UX360 Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 20/60] irqchip/bcm7038-l1: Hide cpu offline callback when building for !SMP Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 21/60] net/9p: fix error path of p9_virtio_probe Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 22/60] powerpc: Fix size calculation using resource_size() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 23/60] s390/dasd: fix hanging offline processing due to canceled worker Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 24/60] scsi: aic94xx: fix an error code in aic94xx_init() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 25/60] PCI: mvebu: Fix I/O space end address calculation Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 26/60] dm kcopyd: avoid softlockup in run_complete_job Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 27/60] staging: comedi: ni_mio_common: fix subdevice flags for PFI subdevice Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 28/60] selftests/powerpc: Kill child processes on SIGINT Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 29/60] smb3: fix reset of bytes read and written stats Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 30/60] SMB3: Number of requests sent should be displayed for SMB3 not just CIFS Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 31/60] powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 32/60] btrfs: replace: Reset on-disk dev stats value after replace Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 33/60] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 34/60] btrfs: Dont remove block group that still has pinned down bytes Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 35/60] debugobjects: Make stack check warning more informative Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 36/60] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 37/60] kbuild: make missing $DEPMOD a Warning instead of an Error Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 38/60] irda: Fix memory leak caused by repeated binds of irda socket Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 39/60] irda: Only insert new objects into the global database via setsockopt Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 40/60] Revert "ARM: imx_v6_v7_defconfig: Select ULPI support" Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 41/60] enic: do not call enic_change_mtu in enic_probe Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 42/60] Fixes: Commit cdbf92675fad ("mm: numa: avoid waiting on freed migrated pages") Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 43/60] genirq: Delay incrementing interrupt count if its disabled/pending Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 44/60] irqchip/gic-v3-its: Recompute the number of pages on page size change Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 45/60] irqchip/gicv3-its: Fix memory leak in its_free_tables() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 46/60] irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 47/60] irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar() Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 48/60] irqchip/gic: Make interrupt ID 1020 invalid Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 49/60] ovl: rename is_merge to is_lowest Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 50/60] ovl: override creds with the ones from the superblock mounter Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 51/60] ovl: proper cleanup of workdir Greg Kroah-Hartman
2018-09-13 13:30 ` [PATCH 4.4 52/60] sch_htb: fix crash on init failure Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 53/60] sch_multiq: fix double free " Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 54/60] sch_hhf: fix null pointer dereference " Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 55/60] sch_netem: avoid null pointer deref " Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 56/60] sch_tbf: fix two null pointer dereferences " Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 57/60] mei: me: allow runtime pm for platform with D0i3 Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 58/60] s390/lib: use expoline for all bcr instructions Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 59/60] ASoC: wm8994: Fix missing break in switch Greg Kroah-Hartman
2018-09-13 13:31 ` [PATCH 4.4 60/60] btrfs: use correct compare function of dirty_metadata_bytes Greg Kroah-Hartman
2018-09-13 19:07 ` [PATCH 4.4 00/60] 4.4.156-stable review Nathan Chancellor
2018-09-14 12:49 ` Naresh Kamboju
2018-09-14 14:52 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180913131746.176139405@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=alexander.levin@microsoft.com \
    --cc=horms@verge.net.au \
    --cc=ja@ssi.bg \
    --cc=jiang.biao2@zte.com.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=stable@vger.kernel.org \
    --cc=tan.hu@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).