All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Hemminger <shemminger@vyatta.com>
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: Francesco Ruggeri <fruggeri@aristanetworks.com>,
	netdev@vger.kernel.org, Eric Dumazet <eric.dumazet@gmail.com>,
	David Miller <davem@davemloft.ne>
Subject: Re: Race condition in ipv6 code
Date: Thu, 12 Jan 2012 17:57:58 -0800	[thread overview]
Message-ID: <20120112175758.773c8e85@nehalam.linuxnetplumber.net> (raw)
In-Reply-To: <m1ty40tn83.fsf@fess.ebiederm.org>

On Thu, 12 Jan 2012 17:17:32 -0800
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Francesco Ruggeri <fruggeri@aristanetworks.com> writes:
> 
> > We have hit a race condition in ipv6 code when setting
> > /proc/sys/net/ipv6/conf/*/forwarding. This happens when the syscall
> > has to be restarted.
> >
> > I wonder if anyone else has run into the same issue.
> >
> > The current sequence in addrconf_sysctl_forward() and
> > addrconf_fixup_forwarding()  is as follows:
> > - change the parameter in idev->cnf.forwarding (using proc_dointvec())
> > - try to get the rtnl lock
> > - if cannot get the lock then restore the original value in
> > idev->cnf.forwarding and restart the syscall.
> >
> > While this is going on, the ipv6 code may access idev->cnf.forwarding
> > and get an incorrect value.
> > In our case we were in addrconf_ifdown (holding the rtnl lock)  and
> > calling __ipv6_ifa_notify(RTM_DELADDR, ifa) on the idev->addr_list
> > entries.
> > __ipv6_ifa_notify() only invokes addrconf_leave_anycast() if
> > idev->cnf.forwarding is set. Because a process trying to set
> > forwarding to 0 was stuck in the restart_syscall sequence above
> > flipping the flag on and off, we erroneously read the flag as 0, with
> > the result that addrconf_leave_anycast() was not invoked, some
> > idev->ac_list entries were never released, idev was never freed and
> > kept a reference to its net_device, and the net_device was never freed
> > and caused the "unregister_netdevice: waiting for xxx to become free"
> > message forever. In our case this was a vlan interfaces that was being
> > deleted, so we ended up getting stuck in vlan_ioctl_handler() holding
> > vlan_ioctl_mutex with further bad consequences.
> > The following diffs (for 2.6.38, but the same logic seems to be used
> > in 3.2) address the issue by modifying idev->cnf.forwarding only after
> > the rtnl lock is acquired. There is a similar situation for
> > disable_ipv6.
> > Any comments are appreciated.
> 
> Interesting.  So ultimately the problem is not the syscall restart
> although that exacerbates it, the problem is that we expect 
> idev->cnf.forwarding to be protected by the rtnl_lock and it is not.
> 
> At first read through your patch looks good.  I am a bit worried that
> we have some versions of the value: aka
> net->ipv6.devconf_dflt->forwarding not protected by the rtnl_lock
> and other version of the value protected by the rtnl_lock.
> 
> That just seems confusing.
> 
> We can't hold the rtnl_lock around proc_dointvec because that can sleep
> indefinitely in copy_from_user.  So it looks like your change to create
> a temporary ctl_table and call proc_dointvec seems very reasonable,
> and necessary however we do this.
> 
> I don't know if there are other places that need the rtnl_lock that
> but your patch below looks like it makes things better for all of
> the right reasons.  So on that score.
> 
> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Unless someone wants to volunteer to sort out the impedance mismatch
> between these tunables and the sysctl infrastructure.  I suggest
> you resend this patch to David with [PATCH] in the subject line.
> 
> I would also suggest a little clearer description why
> idev->cnf.forwarding and idev->cnf.disable_ipv6 need rntl_lock
> protection.
> 
> But overall this looks like a pretty obvious bug fix, to the
> problem that we need the rtnl_lock to protect idev->cnf.forwarding,
> and we currently allow updates to idev->cnf.forwarding without
> holding the rtnl_lock. 
> 
> Eric
> 

Looks like a better function (proc_doint_rtnl?) needs to be built
that has the locking in the right place. I.e:
  get value from user
  get lock (with restart)
  do changes
  unlock

  reply	other threads:[~2012-01-13  1:58 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-12  2:13 Race condition in ipv6 code Francesco Ruggeri
2012-01-12  6:31 ` Eric Dumazet
2012-01-12  6:44   ` David Miller
2012-01-12 20:48     ` Francesco Ruggeri
2012-01-13  0:11   ` Eric W. Biederman
2012-01-13  6:02     ` Eric Dumazet
2012-01-13  7:40       ` Eric W. Biederman
2012-01-13 17:04         ` Ben Greear
2012-01-14  5:46           ` Eric W. Biederman
2012-01-14 18:31             ` Ben Greear
2012-01-20  2:54               ` Eric W. Biederman
2012-01-13  1:17 ` Eric W. Biederman
2012-01-13  1:57   ` Stephen Hemminger [this message]
2012-01-13 22:02   ` Francesco Ruggeri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120112175758.773c8e85@nehalam.linuxnetplumber.net \
    --to=shemminger@vyatta.com \
    --cc=davem@davemloft.ne \
    --cc=ebiederm@xmission.com \
    --cc=eric.dumazet@gmail.com \
    --cc=fruggeri@aristanetworks.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.