All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Holger Eitzenberger <holger@eitzenberger.org>
Cc: netfilter-devel <netfilter-devel@vger.kernel.org>,
	Patrick McHardy <kaber@trash.net>
Subject: Re: [OOPS,TPROXY,xt_owner]: Oops accessing socket in owner_mt()
Date: Mon, 28 Apr 2014 17:57:31 +0200	[thread overview]
Message-ID: <20140428155731.GA29816@localhost> (raw)
In-Reply-To: <20140415133220.GG6576@imap.eitzenberger.org>

Hi Holger,

On Tue, Apr 15, 2014 at 03:32:20PM +0200, Holger Eitzenberger wrote:
> Hi all,
> 
> using kernel v8.8.13.15 I see a kernel oops happening in a setup where
> a HTTP using TPROXY is used on a bridge interface.  Also NFQUEUE
> is involved:
> 
> [ 379.046358] Pid: 5847, comm: afcd/258 Tainted: G O
> 3.8.13.15-110.g4be5643-smp64 001 Astaro AG ASG/NSB2189
> [ 379.046358] RIP: 0010:[<ffffffffa0269034>] [<ffffffffa0269034>] owner_mt+0x31/0xad [xt_owner]
> [ 379.046358] RSP: 0018:ffff88016b05b5f0 EFLAGS: 00210246
> [ 379.046358] RAX: 0000000000002000 RBX: ffffc90014306ca8 RCX: ffffc90014306cc8
> [ 379.046358] RDX: ffffc90014306c01 RSI: ffff88016b05b600 RDI: c9443747ad79b9de
> [ 379.046358] RBP: ffffc90014306c38 R08: ffffc90014306b68 R09: ffff88010069b0e0
> [ 379.046358] R10: 0000000000000002 R11: 0000000000001000 R12: ffff8801980f9810
> [ 379.046358] R13: 0000000000000004 R14: ffff8800d4022f00 R15: ffffffff8147faa0
> [ 379.046358] FS: 0000000000000000(0000) GS:ffff88019fc80000(0063)
> knlGS:00000000f5983b70
> [ 379.046358] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> [ 379.046358] CR2: 00000000f5b34fa8 CR3: 000000016b115000 CR4: 00000000000007e0
> [ 379.046358] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 379.046358] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 379.046358] Process afcd/258 (pid: 5847, threadinfo ffff88016b05a000, task
> ffff880196292520)
> [ 379.046358] Stack:
> [ 379.046358] ffffffffa015a3f1 0000000100000001 ffff880101dda1c0
> 00000000000005c8
> [ 379.046358] ffff880101dda1c0 ffff8800db9b1e00 ffffffff812b5f73
> 000000000000000c
> [ 379.046358] 0000000000000004 ffff88019627c000 0000000000000000
> 0000000400000000
> [ 379.046358] Call Trace:
> [ 379.046358] [<ffffffffa015a3f1>] ? ipt_do_table+0x286/0x5f8 [ip_tables]
> [ 379.046358] [<ffffffff812b5f73>] ? tcp_rcv_established+0x594/0x685
> [ 379.046358] [<ffffffff8129b502>] ? nf_iterate+0x42/0x7d
> [ 379.046358] [<ffffffff810b5141>] ? virt_to_head_page+0x9/0x30
> [ 379.046358] [<ffffffff8129c304>] ? nf_reinject+0x9c/0x131
> [ 379.046358] [<ffffffff812a5576>] ? ip_finish_output2+0x2b1/0x2b1
> [ 379.046358] [<ffffffffa03337e0>] ? nfqnl_recv_verdict+0x68/0x2b2
> 
> Actual OOPS is at the time owner match access VFS file on socket, from
> xt_owner.c:owner_mt():
> 
> 	filp = sk->sk_socket->file;
> 		if (filp == NULL) {
> 
> And I think that owner match races with TPROXY calling sock_orphan()
> eventually, e. g. if system is low on memory.  To be verified.
> 
> I think I have fixed the issue by using a read_lock on 
> sk->sk_callback_lock, same lock is used in sock_orphan().

This makes sense to me.

> Also I found xt_LOG.c:dump_sk_uid_gid() to be similar, also
> using sk->sk_callback_lock.
> 
> I have attached the patch I am currently using.  Please check, it
> seems to fix the issue seen.  But still unsure, as not occuring
> often.
> 
>  /Holger
>
>
> xt_owner: fix race with sock_orphan()
> 
> By using a read_lock on sk->sk_callback_lock we can avoid Oops
> due to someone else calling sock_orphan() at same time on
> another CPU, e. g. when using TPROXY.
> 
> Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
> 
> Index: net-next/net/netfilter/xt_owner.c
> ===================================================================
> --- net-next.orig/net/netfilter/xt_owner.c
> +++ net-next/net/netfilter/xt_owner.c
> @@ -32,21 +32,32 @@ static bool
>  owner_mt(const struct sk_buff *skb, struct xt_action_param *par)
>  {
>  	const struct xt_owner_match_info *info = par->matchinfo;
> +	struct sock *sk = skb->sk;
>  	const struct file *filp;
>  
> -	if (skb->sk == NULL || skb->sk->sk_socket == NULL)
> +	if (sk == NULL)

Not your fault, but I think we should also check for ...

        ... || skb->sk->sk_state == TCP_TIME_WAIT)

since early demux was introduced, we may have skb->sk pointing to a
timewait socket.

>  		return (info->match ^ info->invert) == 0;
> -	else if (info->match & info->invert & XT_OWNER_SOCKET)
> +
> +	read_lock_bh(&sk->sk_callback_lock);
> +
> +	if (sk->sk_socket == NULL) {
> +		read_unlock_bh(&sk->sk_callback_lock);
> +		return (info->match ^ info->invert) == 0;
> +	}
> +
> +	if (info->match & info->invert & XT_OWNER_SOCKET)
>  		/*
>  		 * Socket exists but user wanted ! --socket-exists.
>  		 * (Single ampersands intended.)
>  		 */
> -		return false;
> +		goto out_false;
>  
> -	filp = skb->sk->sk_socket->file;
> -	if (filp == NULL)
> +	filp = sk->sk_socket->file;
> +	if (filp == NULL) {
> +		read_unlock_bh(&sk->sk_callback_lock);
>  		return ((info->match ^ info->invert) &
>  		       (XT_OWNER_UID | XT_OWNER_GID)) == 0;
> +	}
>  
>  	if (info->match & XT_OWNER_UID) {
>  		kuid_t uid_min = make_kuid(&init_user_ns, info->uid_min);
> @@ -54,7 +65,7 @@ owner_mt(const struct sk_buff *skb, stru
>  		if ((uid_gte(filp->f_cred->fsuid, uid_min) &&
>  		     uid_lte(filp->f_cred->fsuid, uid_max)) ^
>  		    !(info->invert & XT_OWNER_UID))
> -			return false;
> +			goto out_false;
>  	}
>  
>  	if (info->match & XT_OWNER_GID) {
> @@ -63,10 +74,16 @@ owner_mt(const struct sk_buff *skb, stru
>  		if ((gid_gte(filp->f_cred->fsgid, gid_min) &&
>  		     gid_lte(filp->f_cred->fsgid, gid_max)) ^
>  		    !(info->invert & XT_OWNER_GID))
> -			return false;
> +			goto out_false;
>  	}
>  
> +	read_unlock_bh(&sk->sk_callback_lock);
> +
>  	return true;
> +
> +out_false:
> +	read_unlock_bh(&sk->sk_callback_lock);
> +	return false;
>  }
>  
>  static struct xt_match owner_mt_reg __read_mostly = {


  reply	other threads:[~2014-04-28 15:57 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-15 13:32 [OOPS,TPROXY,xt_owner]: Oops accessing socket in owner_mt() Holger Eitzenberger
2014-04-28 15:57 ` Pablo Neira Ayuso [this message]
2014-04-28 16:20   ` Eric Dumazet
2014-04-28 16:43   ` Holger Eitzenberger
2014-08-11 12:40   ` [OOPS,xt_owner,V2]: " Holger Eitzenberger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140428155731.GA29816@localhost \
    --to=pablo@netfilter.org \
    --cc=holger@eitzenberger.org \
    --cc=kaber@trash.net \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.