All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlad Yasevich <vladislav.yasevich@hp.com>
To: linux-sctp@vger.kernel.org
Subject: Is this a possible race condition.
Date: Thu, 17 Apr 2008 16:45:47 +0000	[thread overview]
Message-ID: <48077EBB.20006@hp.com> (raw)

Hi All

Background:

During the course of some stress testing we found an interesting scenario when
setting the socket send buffer fairly low.

The applications we use sets the send buffer low, and them performs a bunch of
short sends followed by a single receive.  The test runs for about 10-12 hours
with the application run a tight loop on each interface of the system.

Once in while this application ends up sitting in sendmsg().  Looking at the kernel
stack dumps (SysRq-T), the task in scheduled out sleeping in sctp_wait_for_sndbuf().
Looking at the memory consumption, everything is 0, i.e. all the skbs have been freed
and there is nothing on any queues.

This got me thinking that maybe wake-ups aren't getting to the socket.  I am working on
instrumenting a module to verify this, but looking through the wake-up logic, I think
we have a interesting race.

Race:

We charge every DATA chunk twice against the socket memory accounting, once in sctp_sendmsg()
and once in sctp_packet_transmit() when we allocated the skb that will hold all the bundled
chunks.  Now, a very important note is that we only perform the second charge against the
sk_wmem_alloc, so for short span of time socket wmem accounting does not match association
wmem accounting.  This is a separate bug, but it shows that the behavior described in the
Background section does not occur when sendbuf_policy is changed.

What I believe is happening is this.  The interface queue becomes full and thus a packet
that was charged during sctp_packet_transmit() is still sitting on this queue, while the
sendmsg() call completes successfully.  The actually draining of the queue happens later
and freeing up of the buffer happen when the interface is cleaning its tx rings.  This
can happen asynchronously from subsequent sendmsg() calls and also without any locking.

So, my supposition is that sock_rfree() and thus sctp_write_space() call
happens without a lock and in parallel with another sendmsg() call.  It appears that
the race is between checking to see if the asoc->wait queue is active and the activation
of this wait queue by the sctp_wait_for_sndbuf().  Thus, if by some very odd coincidence,
this happens to the last message that the application sent, the application we be scheduled
away and never wake up because the wake up logic already ran.

As you can see, the timing of this is very very narrow, and for us it takes about 10-12 hours
to reproduce.  The problem also happens on one or two interfaces out of 13 that are on the
systems and it's never the same interfaces every time.

Did I miss anything?  I was using e1000 as an example of how and when the transmitted skb is
freed and thus above is based on that.

-vlad

             reply	other threads:[~2008-04-17 16:45 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-17 16:45 Vlad Yasevich [this message]
2008-04-18  0:43 ` Is this a possible race condition Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48077EBB.20006@hp.com \
    --to=vladislav.yasevich@hp.com \
    --cc=linux-sctp@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.