* Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-14 14:59 UTC
To: netdev

I initially mistook this for a traffic control issue, but after
stripping the test beds down to just the MACSec component, I can still
replicate the issue. After approximately 5TB of transfer / 4 billion
packets over a MACSec link, it stops passing traffic. I have now
replicated this on both Ubuntu Server 18.04 with its patched 4.15
kernel and Gentoo with a vanilla 4.18.13 kernel.

As noted before, my test setup consists of two machines with a direct
Ethernet connection to each other. MACSec is set up on the link, along
with a /30 network. I then hammer the link with iperf3 in both
directions. On a gigabit link it takes me one to two days to trip the
bug. Nothing is logged to dmesg. If I remove and re-add the MACSec
link, or reboot the machines, traffic flow resumes. I can replicate
this on physical hardware or in VMs.

How should I proceed to help diagnose and correct this?

To replicate, set up two machines with a direct Ethernet connection.
If simulating in ESXi, set up a dedicated vSwitch with promiscuous
mode, forged transmits and MAC address changes enabled and the MTU
increased to 9000, then set up a dedicated port group and VLAN to
simulate the direct connection.
I use the following script on each host to set up the MACSec
connection, adjusting keys, rxmac, interface and IPs as appropriate:

-----------
#!/bin/bash

# Interfaces:
# dif = Egress physical interface (Dest)
# eif = Encrypted interface
dif=ens224
eif=macsec0

# MACSec Keys:
# txkey = Transmit (Local) key
# rxkey = Receive (Remote) key
# rxmac = Receive (Remote) MAC addy
txkey=60995924232808431491190820961556
rxkey=87345530111733181210202106249824
rxmac=00:0c:29:c5:95:df

# Clear any existing IP config
ifconfig "$dif" 0.0.0.0

# Bring up macsec:
echo "* Enable MACSec"
modprobe macsec
ip link add link "$dif" "$eif" type macsec
ip macsec add "$eif" tx sa 0 pn 1 on key 02 "$txkey"
ip macsec add "$eif" rx address "$rxmac" port 1
ip macsec add "$eif" rx address "$rxmac" port 1 sa 0 pn 1 on key 01 "$rxkey"
ip link set "$eif" type macsec encrypt on

# Bring up the interfaces:
echo "* Light tunnel NICS"
ip link set "$dif" up
ip link set "$eif" up

# Set IP
ifconfig "$eif" 192.168.211.1/30
-----------

Once you can ping across the link, use iperf3 or a similar network
stress tool to flood the link with traffic in both directions and wait
for the bug to trigger.

Josh Coombs
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Sabrina Dubroca @ 2018-10-14 20:25 UTC
To: Josh Coombs; +Cc: netdev

2018-10-14, 10:59:31 -0400, Josh Coombs wrote:
> I initially mistook this for a traffic control issue, but after
> stripping the test beds down to just the MACSec component, I can still
> replicate the issue. After approximately 5TB of transfer / 4 billion
> packets over a MACSec link it stops passing traffic.

I think you're just hitting packet number exhaustion. After 2^32
packets, the packet number would wrap to 0 and start being reused,
which breaks the crypto used by macsec. Before this point, you have to
add a new SA, and tell the macsec device to switch to it.

That's why you should be using wpa_supplicant. It will monitor the
growth of the packet number, and handle the rekey for you.

If you start with a PN already close to exhaustion (say, 4294967000),
you should hit the "bug" very quickly.

> # Bring up macsec:
> echo "* Enable MACSec"
> modprobe macsec
> ip link add link "$dif" "$eif" type macsec
> ip macsec add "$eif" tx sa 0 pn 1 on key 02 "$txkey"

Keep the rest of the configuration, and replace that one with:

    ip macsec add "$eif" tx sa 0 pn 4294967000 on key 02 "$txkey"

to trigger the issue faster.

-- 
Sabrina
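The figures in this exchange can be sanity-checked with a little shell arithmetic. The sketch below assumes a 32-bit packet number per SA, as described above, with the SA becoming unusable once the PN would wrap to 0. The helper name `pn_remaining` is made up for illustration and is not part of iproute2.

```shell
#!/bin/bash
# Rough arithmetic for MACsec packet number (PN) exhaustion.
# Assumption: 32-bit PN per SA; the SA is unusable once the PN would wrap to 0.

PN_SPACE=$(( 1 << 32 ))   # 4294967296 possible PN values

# pn_remaining START_PN: packets that can still be sent on an SA whose
# next PN is START_PN (hypothetical helper for illustration only).
pn_remaining() {
    echo $(( PN_SPACE - $1 ))
}

pn_remaining 1            # fresh SA created with "pn 1"
pn_remaining 4294967000   # SA created close to exhaustion

# Average bytes per packet implied by "approx 5TB / 4 billion packets"
echo $(( 5000000000000 / PN_SPACE ))
```

Starting at PN 4294967000 leaves only 296 packets before the SA is exhausted, and 5TB spread over 2^32 packets works out to roughly 1164 bytes per packet on average, which is plausible for a mixed iperf3 load.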
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-14 20:52 UTC
To: sd; +Cc: netdev

On Sun, Oct 14, 2018 at 4:24 PM Sabrina Dubroca <sd@queasysnail.net> wrote:
>
> 2018-10-14, 10:59:31 -0400, Josh Coombs wrote:
> > I initially mistook this for a traffic control issue, but after
> > stripping the test beds down to just the MACSec component, I can still
> > replicate the issue. After approximately 5TB of transfer / 4 billion
> > packets over a MACSec link it stops passing traffic.
>
> I think you're just hitting packet number exhaustion. After 2^32
> packets, the packet number would wrap to 0 and start being reused,
> which breaks the crypto used by macsec. Before this point, you have to
> add a new SA, and tell the macsec device to switch to it.

I had not considered that. I naively thought that as long as I didn't
specify a replay window, it would roll the PN over on its own and life
would be good. I'll test that theory tomorrow; it should be easy to
prove out.

> That's why you should be using wpa_supplicant. It will monitor the
> growth of the packet number, and handle the rekey for you.

Thank you for the heads up, I'll read up on this as well.

Josh C
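For a static setup like the script in the original report, "add a new SA, and tell the macsec device to switch to it" might look roughly like the sketch below. It is only an illustration under assumptions: the key values, key IDs (03, 04) and SA slot 1 are placeholders, the `run` and `rekey` helpers are hypothetical, and the peer has to perform the mirror-image steps with matching keys. By default it only prints the commands.

```shell
#!/bin/bash
# Sketch of a manual SA rollover before the PN runs out.
# Assumptions: interface name, peer MAC, key IDs and key values are
# placeholders; the peer must install the matching SAs as well.

eif=macsec0
rxmac=00:0c:29:c5:95:df
new_txkey=00112233445566778899aabbccddeeff   # placeholder, use a fresh key
new_rxkey=ffeeddccbbaa99887766554433221100   # placeholder, use a fresh key

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rekey() {
    # 1. Install the new RX SA so traffic on the peer's new SA is accepted.
    run ip macsec add "$eif" rx address "$rxmac" port 1 sa 1 pn 1 on key 03 "$new_rxkey"
    # 2. Install the new TX SA with a fresh key and a fresh PN.
    run ip macsec add "$eif" tx sa 1 pn 1 on key 04 "$new_txkey"
    # 3. Tell the device to transmit on the new SA.
    run ip link set "$eif" type macsec encodingsa 1
}

rekey
```

The new SA needs a new key, not just a reset PN: reusing a PN under the same key is exactly what breaks the crypto. Coordinating this by hand on both ends is error-prone, which is why a key agreement daemon such as wpa_supplicant is the recommended route.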
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-15 15:45 UTC
To: sd; +Cc: netdev

And confirmed: starting with a high packet number results in a very
short testbed run, 296 packets and then nothing, just as you surmised.
Sorry for raising the alarm falsely. It looks like I need to roll my
own build of wpa_supplicant, as the Ubuntu builds don't include the
macsec driver. I haven't tested Gentoo's ebuilds yet to see if they do.

Josh Coombs
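The thread does not show the wpa_supplicant setup that was eventually used, so the following is only a sketch of what a pre-shared-key MKA configuration could look like. It assumes a wpa_supplicant build that includes the macsec_linux driver; the CAK/CKN values are placeholders, and the output path and the `write_macsec_conf` helper are hypothetical.

```shell
#!/bin/bash
# Sketch: write a minimal wpa_supplicant config for MACsec with a
# pre-shared CAK/CKN, then print the command that would start it.
# Assumptions: CAK/CKN are placeholders; wpa_supplicant was built with
# CONFIG_DRIVER_MACSEC_LINUX=y and CONFIG_MACSEC=y.

dif=ens224
conf="${TMPDIR:-/tmp}/macsec-psk.conf"

write_macsec_conf() {
    cat > "$1" <<'EOF'
eapol_version=3
ap_scan=0

network={
    key_mgmt=NONE
    eapol_flags=0
    macsec_policy=1
    mka_cak=0123456789ABCDEF0123456789ABCDEF
    mka_ckn=6162636465666768696A6B6C6D6E6F707172737475767778797A303132333435
}
EOF
}

write_macsec_conf "$conf"

# Printed rather than executed, since it needs root and a real interface.
echo "wpa_supplicant -i $dif -D macsec_linux -c $conf"
```

Both hosts would use the same CAK/CKN pair. wpa_supplicant then creates the macsec device itself and rotates SAs as the packet number grows, so the manual `ip macsec add` steps from the original script are no longer needed.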
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-17 13:45 UTC
To: sd; +Cc: netdev

I've got wpa_supplicant working with macsec on Fedora; my test bed has
shuffled 16 billion packets so far without interruption. I am a bit
concerned that I've just pushed the resource exhaustion issue down the
road, though. Looking at the output of ip macsec show, I see four SAs
for TX and RX, and it appears to negotiate a new pair every 3 to 3.5
billion packets. It doesn't appear to be tearing down old SAs. What
happens when the available SA slots run out?

Joshua Coombs
GWI

office 207-494-2140
www.gwi.net
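Watching per-SA packet numbers in `ip macsec show` output can also be scripted. The sketch below assumes SA lines of the form `0: PN 1041, state on, key ...`, which may differ between iproute2 versions; the `check_pn` helper and the threshold value are arbitrary choices for illustration.

```shell
#!/bin/bash
# Sketch: report SAs whose packet number is close to the 32-bit limit.
# Assumption: SA lines in "ip macsec show" output look like
#   "0: PN 1041, state on, key 0100..."
# Adjust the pattern if your iproute2 prints something different.

PN_SPACE=4294967296                  # 2^32
THRESHOLD=${THRESHOLD:-3000000000}   # arbitrary example threshold

# Reads "ip macsec show"-style text on stdin and prints one line for
# every SA whose PN is at or above the threshold.
check_pn() {
    awk -v space="$PN_SPACE" -v thresh="$THRESHOLD" '
        $2 == "PN" {
            sa = $1; sub(/:$/, "", sa)   # "0:"    -> "0"
            pn = $3; sub(/,$/, "", pn)   # "1041," -> "1041"
            if (pn + 0 >= thresh + 0)
                printf "SA %s PN %s: %.0f packets left\n", sa, pn, space - pn
        }'
}

# Typical use (needs a configured macsec device):
#   ip macsec show macsec0 | check_pn
```

Run periodically, this makes it easy to see the PN climbing on the active SA and a fresh SA taking over, which is the rollover behaviour described in this thread.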
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-17 14:46 UTC
To: sd; +Cc: netdev

I see it reusing SAs, so I'm good.

Joshua Coombs