Ethernet Bridge development
From: John Morris <john@zultron.com>
To: bridge@lists.linux-foundation.org
Subject: Re: [Bridge] bridge dropping packets
Date: Sat, 30 May 2009 23:43:44 +0800
Message-ID: <4A215430.2000609@zultron.com>
In-Reply-To: <2976.119.40.3.91.1237444151.squirrel@norman.zultron.com>

Someone asked about this, and I've learned a little bit more since:

The real problem was caused by loading the SIP NAT and SIP conntrack
kernel modules.  I assume that disabling the bridge-nf* sysctls helped
because doing so took those modules out of the path of the bridge traffic.
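
For anyone wanting to check those sysctls, here is a minimal sketch. It
assumes the usual /proc/sys/net/bridge location; the function names and the
optional directory argument are only illustrative and not part of any tool.

```shell
#!/bin/sh
# Print "name=value" for every bridge-nf* sysctl under a directory.
# The directory argument defaults to /proc/sys/net/bridge (an assumption
# about where the bridge module exposes them).
show_bridge_nf() {
    dir="${1:-/proc/sys/net/bridge}"
    for f in "$dir"/bridge-nf*; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Write 0 to each bridge-nf* sysctl (needs root when pointed at /proc).
disable_bridge_nf() {
    dir="${1:-/proc/sys/net/bridge}"
    for f in "$dir"/bridge-nf*; do
        [ -f "$f" ] || continue
        echo 0 > "$f"
    done
}
```

Calling show_bridge_nf with no argument lists the live values.  Values
written this way do not survive a reboot, and as noted above this probably
only hides the underlying cause.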

So:

rmmod ip_nat_sip ip_conntrack_sip
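
A slightly more careful version of the same step, as a sketch: it looks for
the helpers first and removes only what is loaded.  The nf_* names are an
assumption to cover newer kernels, where these helpers were renamed; the
function names are illustrative.

```shell
#!/bin/sh
# List any SIP conntrack/NAT helper modules found in a /proc/modules-style
# file.  The file argument defaults to /proc/modules.
sip_helpers_loaded() {
    awk '$1 ~ /^(ip|nf)_(nat|conntrack)_sip$/ { print $1 }' "${1:-/proc/modules}"
}

# Unload whichever helpers are present (needs root).  The NAT helper goes
# first, matching the order of the rmmod command above.
unload_sip_helpers() {
    for m in ip_nat_sip nf_nat_sip ip_conntrack_sip nf_conntrack_sip; do
        if sip_helpers_loaded | grep -qx "$m"; then
            rmmod "$m"
        fi
    done
}
```

Running sip_helpers_loaded on its own is harmless and shows whether the
modules have been loaded again, for example after a firewall restart.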

	John


John Morris wrote:
> Too early to say for sure, but this may have been a case where I should've
> done better at RTFMing.
> 
> http://www.linuxfoundation.org/en/Net:Bridge#No_traffic_gets_trough_.28except_ARP_and_STP.29
> 
> Disabling the /proc/sys/net/bridge/bridge-nf* sysctls may have worked.  I
> don't understand how this could cause some, but not other traffic to be
> dropped.
> 
> At any rate, if this turns out not to be the fix after all, I'll report back.
> 
>     John
> 
> 
> On Wed, March 18, 2009 6:56 pm, John Morris wrote:
>> Same problem again here, this time with a phone from a different vendor.
>> The dom0 had been running VLANs, but these have been removed and the eth0
>> device connected directly to the bridge for testing.
>>
>> Here are some tcpdumps that help illustrate the problem.  In this output,
>> sipura1 is the phone, and pbx0 is the domU.  Pbx0 is connected through the
>> interface vif8.0.  Sergey is the dom0, with a bridge 'bo1br'.
>>
>> [root@sergey ~]# jobs
>> [7]-  Running    tcpdump -i vif8.0 -l -A host sipura1 and not port 5061 \
>>     | sed 's/^/vif8.0-s:       /' &
>> [8]+  Running    tcpdump -i bo1br -l -e -A host sipura1 and not port 5061 \
>>     | sed 's/^/bo1br-s:      /' &
>>
>> Here are some sample packets that are never forwarded from the bridge to
>> vif8.0:
>> [...]
>> bo1br-s:        18:30:37.948378 00:0e:08:ab:6a:78 (oui Unknown) \
>>     > 00:16:ee:68:03:13 (oui Unknown), ethertype IPv4 (0x0800), \
>>     length 543: sipura1.zultron.com.sip > pbx0.zultron.com.sip: \
>>     SIP, length: 501
>> bo1br-s:        [...]REGISTER sip:pbx0.zultron.com SIP/2.0
>> bo1br-s:        Via: SIP/2.0/UD
>> bo1br-s:        18:30:39.948675 00:0e:08:ab:6a:78 (oui Unknown) \
>>     > 00:16:ee:68:03:13 (oui Unknown), ethertype IPv4 (0x0800), \
>>     length 543: sipura1.zultron.com.sip > pbx0.zultron.com.sip: \
>>     SIP, length: 501
>> bo1br-s:        [...]REGISTER sip:pbx0.zultron.com SIP/2.0
>> bo1br-s:        Via: SIP/2.0/UD
>> [...]
>>
>> A ping from pbx0 to sipura1 makes it through just fine, however:
>> [...]
>> vif8.0-s:       18:39:40.986555 IP pbx0.zultron.com > \
>>     sipura1.zultron.com: ICMP echo request, id 2318, seq 5, length 64
>> bo1br-s:        18:39:40.986555 00:16:ee:68:03:13 (oui Unknown) \
>>     > 00:0e:08:ab:6a:78 (oui Unknown), ethertype IPv4 (0x0800), \
>>     length 98: pbx0.ablesky.com > sipura1.ablesky.com: ICMP echo \
>>     request, id 2318, seq 5, length 64
>> bo1br-s:        18:39:40.987507 00:0e:08:ab:6a:78 (oui Unknown) \
>>     > 00:16:ee:68:03:13 (oui Unknown), ethertype IPv4 (0x0800), \
>>     length 98: sipura1.ablesky.com > pbx0.ablesky.com: ICMP echo \
>>     reply, id 2318, seq 5, length 64
>> vif8.0-s:       18:39:40.987516 IP sipura1.ablesky.com > \
>>     pbx0.ablesky.com: ICMP echo reply, id 2318, seq 5, length 64
>>
>> The relevant entries in the MAC table:
>> [root@sergey ~]# brctl showmacs bo1br | grep -e 6a:78 -e 03:13
>>   1     00:0e:08:ab:6a:78       no                26.80
>>   9     00:16:ee:68:03:13       no                 3.38
>>
>> Strangest of all, sipura1, an ATA, has two phone ports, and the software
>> registers them separately, one from port 5060, the other from port 5061.
>> The registration from port 5061 works just fine.  What's more, immediately
>> after a reboot of sergey, the dom0, the phones register fine; it is after
>> some time that the traffic suddenly begins being dropped.
>>
>> Should I be suspecting packet corruption?  Tcpdump seems to be able to
>> recognize the packets just fine.  Are the packets being forwarded out
>> another port?  The dest MACs aren't duplicated on the network, and I've
>> put a tcpdump on each switch port interface just to be sure.  Is it the
>> physical switch that sergey is connected to?  I've moved sergey to another
>> switch to test.  Is it the phone itself?  But different phones from
>> different vendors exhibit the same problem, and sipura1 has the problem on
>> one line, but not the other.  Obviously, I'm missing something here.
>> Thanks for any and all wild suggestions.
>>
>>     John
>>
>>
>> On Tue, March 3, 2009 7:04 pm, John Morris wrote:
>>> We have about 20 IP phones connecting to a Xen-based PBX, and in the past
>>> month or two, a problem has been popping up.
>>>
>>> About once a week, some, but not all, of the phones lose their
>>> registration with the PBX.  The PBX can ping the unregistered phones, and
>>> the phone ARP requests for the PBX IP are answered.  However, the UDP 5060
>>> registration traffic originating from those phones enters the dom0's
>>> bridge and is then dropped; it is never forwarded onto the vif associated
>>> with the pbx.
>>>
>>> Rebooting the dom0 is the only way I've found to fix it so far.  Reloading
>>> the bridge kernel module doesn't seem to solve the problem, though the set
>>> of phones that are unable to register changes (I haven't looked closely to
>>> see if there's a pattern to it).
>>>
>>> There's no packet filtering going on here, and this problem seems to pop
>>> up after random, infrequent intervals.  I've verified that there are no
>>> hosts with duplicate MAC addresses.  I can't for the life of me think of
>>> why some packets from some IPs would be forwarded correctly and others
>>> would not.  Another post in the archives described some similar-sounding
>>> symptoms, but the OP found it to be an MTU-related problem; these packets
>>> are all 356 bytes long, too short to be the problem.
>>>
>>> Thanks-
>>>
>>>         John
>>>
>>> _______________________________________________
>>> Bridge mailing list
>>> Bridge@lists.linux-foundation.org
>>> https://lists.linux-foundation.org/mailman/listinfo/bridge
>>>

