Netdev List

* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-06 19:01 UTC
  To: Tim Chen
  Cc: Yan, Zheng, netdev@vger.kernel.org, davem@davemloft.net,
	sfr@canb.auug.org.au, jirislaby@gmail.com, sedat.dilek@gmail.com,
	alex.shi
In-Reply-To: <1315335019.2576.3048.camel@schen9-DESK>

On Tuesday, 06 September 2011 at 11:50 -0700, Tim Chen wrote:
> On Tue, 2011-09-06 at 19:40 +0200, Eric Dumazet wrote:
> > On Tuesday, 06 September 2011 at 09:25 -0700, Tim Chen wrote:
> > > On Sun, 2011-09-04 at 13:44 +0800, Yan, Zheng wrote:
> > > > Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
> > > > in Unix socket's send and receive path) introduced a use-after-free bug.
> > > > It passes the scm reference to the first skb. Subsequent skb(s) may
> > > > reference a freed data structure because the first skb can be destroyed
> > > > by the receiver at any time. The fix is to pass the scm reference to
> > > > the very last skb.
> > > > 
> > > > Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > Reported-by: Jiri Slaby <jirislaby@gmail.com>
> > > > ---
> > > 
> > > Thanks for finding this bug in my original patch.  I missed the case
> > > where the receiving side could have released all the references to the
> > > credential before the send side uses the credential again for
> > > subsequent skbs in the stream, thus causing the problem we saw.  Getting
> > > an extra reference for pid/credentials at the beginning of the stream
> > > and not getting a reference for the last skb is the right approach.
> > > 
> > > Thanks also to Sedat, Valdis and Jiri for their extensive testing to
> > > discover the bug and testing the subsequent fixes. 
> > > 
> > > Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
> > 
> > What happens if message must be split in two skb,
> > first skb is built, queued (without scm reference)
> 
> An extra scm reference is already obtained by scm_send at the
> beginning of unix_stream_sendmsg in Yan Zheng's patch.  So things should
> be okay as long as we use this extra reference from scm_send only
> for the last skb in unix_stream_sendmsg instead of the first skb.
> 
> > 
> > Second skb allocation fails.
> > 
> > The rule about refs/norefs games is: as soon as you put an skb into a
> > list, it should hold all appropriate references if it has pointer(s)
> > to object(s).
> 
> All the skbs put on the list do have a proper reference on pid/scm.  In
> the example you give, the first skb got the reference at this line:
> 
> err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);

This is the current code. We know it's buggy.

I was discussing the code after the proposed patch, not current net-next.

With the patch applied, this line reads:

err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, scm_ref);

So the first skb is sent without a reference taken, as mentioned in the
changelog?

If the second skb cannot be built, we exit this system call with an already
queued skb. The receiver can then access freed memory.
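
The scenario above can be modelled in user space. What follows is only a
minimal sketch, not kernel code: toy_cred, toy_skb, toy_send and the
fail_second_alloc flag are hypothetical names invented for illustration.
The credential is merely marked as released instead of being freed, so
the program itself stays well defined. The credential is assumed to start
with one reference owned by the sending task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a refcounted pid/cred object (hypothetical). */
struct toy_cred {
    int refcnt;
    bool released;      /* set when the last reference is dropped */
};

/* Toy stand-in for a queued skb that points at the credential. */
struct toy_skb {
    struct toy_cred *cred;
    bool holds_ref;     /* does this skb own a reference? */
};

static void cred_get(struct toy_cred *c)
{
    c->refcnt++;
}

static void cred_put(struct toy_cred *c)
{
    if (--c->refcnt == 0)
        c->released = true;
}

/*
 * Model of a send path that splits one message into two skbs. One
 * reference is taken up front and is meant to be handed to the LAST
 * skb. Returns the number of skbs left on the queue.
 */
static int toy_send(struct toy_cred *c, struct toy_skb queue[2],
                    bool fail_second_alloc)
{
    cred_get(c);                    /* reference taken at start of send */

    queue[0].cred = c;              /* first skb queued without a ref */
    queue[0].holds_ref = false;

    if (fail_second_alloc) {
        cred_put(c);                /* error path drops the send ref */
        return 1;                   /* first skb is still queued */
    }

    queue[1].cred = c;              /* last skb inherits the reference */
    queue[1].holds_ref = true;
    return 2;
}

/* True if any queued skb points at a credential already released. */
static bool queue_has_dangling(const struct toy_skb *q, int n)
{
    for (int i = 0; i < n; i++)
        if (q[i].cred != NULL && q[i].cred->released)
            return true;
    return false;
}
```

In this model, calling toy_send() with fail_second_alloc set and then
dropping the sender's own reference with cred_put() (the sender exiting)
leaves a queued skb pointing at a released credential. When the second
allocation succeeds, the last skb keeps the credential alive.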

> 
> The second skb uses the reference already obtained at the beginning of
> unix_stream_sendmsg if the skb allocation is successful:
> 
> err = scm_send(sock, msg, siocb->scm);
> 
> Now if the second skb allocation failed, the extra scm reference will be
> released by scm_destroy in the error handling path.
> 
> > 
> > We should revert 0856a304091b33a and code the thing differently.
> > 
> > Instead of storing pointers to pid and cred in UNIXSKB(), why don't we
> > copy all the needed information? No ref counts at all.
> > 
> > skb->cb[] is large enough.
> > 
> 
> If we can simply copy some information over, that will be ideal and
> will resolve all the scalability problems.  
> 
> However, I don't see other obvious info that we can pass to avoid
> passing pid.  Our current credential is pid and uid based, and requires
> the knowledge of sender's pid to interpret uid to do credentials
> checking.  So without passing the sender pid, I don't see an easy way
> for the receive side to interpret sender uid it got, which is needed in
> user_ns_map_uid function when we call cred_to_ucred.  
> 
> I was trying to make minimal changes to gain some performance.  The
> approach you suggest is great but will probably require many more
> changes to the credentials infrastructure.  Or maybe there is some easy
> way to do it that I don't see.

My approach would basically revert commit 7361c36c too :(

I am sorry, but the only way to avoid taking too many pid/cred references
is to hold the socket lock [i.e. unix_state_lock(other);] for the whole
duration of send().

This way, you really can increment the pid/cred reference only on the last
pushed skb, because no reader can 'catch the first skb' in the meantime.

As soon as unix_state_unlock(other) is called, anything can happen, so
each skb must be self-contained, as I stated in my earlier mail.
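
To make the self-contained idea concrete, here is a hedged user-space
sketch of the copy-by-value alternative raised earlier in the thread: the
sender's identifiers are copied into the skb's control buffer instead of
storing refcounted pointers. toy_unix_cb, toy_pkt, TOY_CB_SIZE,
toy_fill_cb and toy_read_cb are hypothetical names, and the 48-byte
buffer size is an assumption about skb->cb[]. The sketch deliberately
ignores the pid and user-namespace translation problem Tim describes
above.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

#define TOY_CB_SIZE 48  /* assumed size of the per-skb control buffer */

/* Plain values only: nothing here needs a reference count. */
struct toy_unix_cb {
    pid_t pid;
    uid_t uid;
    gid_t gid;
};

/* Compile-time check that the copied data fits in the buffer. */
_Static_assert(sizeof(struct toy_unix_cb) <= TOY_CB_SIZE,
               "credentials must fit in the control buffer");

/* Toy stand-in for an skb carrying only its control buffer. */
struct toy_pkt {
    unsigned char cb[TOY_CB_SIZE];
};

/* Sender side: copy the identifiers into the packet before queueing. */
static void toy_fill_cb(struct toy_pkt *pkt, pid_t pid, uid_t uid,
                        gid_t gid)
{
    struct toy_unix_cb cb = { .pid = pid, .uid = uid, .gid = gid };

    memcpy(pkt->cb, &cb, sizeof(cb));
}

/* Receiver side: read them back without touching sender-owned memory. */
static struct toy_unix_cb toy_read_cb(const struct toy_pkt *pkt)
{
    struct toy_unix_cb cb;

    memcpy(&cb, pkt->cb, sizeof(cb));
    return cb;
}
```

Because the packet owns a copy of every value it needs, it stays valid no
matter when the receiver dequeues it or what the sender does afterwards.
The cost is that the values are a snapshot taken at send time.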
