* [PATCH] test/utils: add more reliable "get remote address" approach
@ 2025-07-03 11:33 Alan Maguire
  2025-07-03 16:43 ` [DTrace-devel] " Eugene Loh
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-07-03 11:33 UTC (permalink / raw)
  To: dtrace; +Cc: dtrace-devel, Alan Maguire

The current approach of looking for remote addresses
is brittle and fails in many environments; it checks the
default route gateway and looks for open ports in the TCP
case.

We can however achieve the same goal reliably by creating
a network namespace on the system and configuring either
IPv4 or IPv6 addresses on the namespaced and local veth
interfaces that support communication between namespaces.
If a tcp port is required start sshd to listen on that port.

Teardown is managed in runtest.sh as signal handling for
timeouts within the test scripts is not working; a trap
function does not trigger for TERM.

Move the get_remote.sh script to test/utils also as it
seems a more natural location.

One issue - this cannot be run on a local system with a VPN
running as the VPN connection is pretty aggressive in
disconnecting/reconnecting when spotting a link-up event
associated with the global netns side of the veth. However
in my experience the remote IP tests do not work reliably
in that environment anyway.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 runtest.sh                                  |  2 +
 test/unittest/ip/get.ipv4remote.pl          | 87 ---------------------
 test/unittest/ip/get.ipv6remote.pl          | 70 -----------------
 test/unittest/ip/tst.ipv4remoteicmp.sh      | 10 +--
 test/unittest/ip/tst.ipv4remotetcp.sh       | 25 ++----
 test/unittest/ip/tst.ipv4remoteudp.sh       |  8 +-
 test/unittest/ip/tst.ipv6remoteicmp.sh      | 16 ++--
 test/unittest/tcp/tst.ipv4remotetcp.sh      | 24 ++----
 test/unittest/tcp/tst.ipv4remotetcpstate.sh | 29 +++----
 test/unittest/udp/tst.ipv4remoteudp.sh      |  4 +-
 test/utils/get_remote.sh                    | 71 +++++++++++++++++
 11 files changed, 116 insertions(+), 230 deletions(-)
 delete mode 100755 test/unittest/ip/get.ipv4remote.pl
 delete mode 100755 test/unittest/ip/get.ipv6remote.pl
 create mode 100755 test/utils/get_remote.sh

diff --git a/runtest.sh b/runtest.sh
index 156e7dec..9f06a499 100755
--- a/runtest.sh
+++ b/runtest.sh
@@ -1473,6 +1473,8 @@ for dt in $dtrace; do
 
         log "\n"
 
+        test/utils/get_remote.sh cleanup
+
         if [[ -n $regression ]]; then
             # If regtesting, we run a second time, with intermediate results
             # displayed, and output redirected to a per-test, per-dtrace
diff --git a/test/unittest/ip/get.ipv4remote.pl b/test/unittest/ip/get.ipv4remote.pl
deleted file mode 100755
index 3cc47d01..00000000
--- a/test/unittest/ip/get.ipv4remote.pl
+++ /dev/null
@@ -1,87 +0,0 @@
-#!/usr/bin/perl -w
-#
-# Oracle Linux DTrace.
-# Copyright (c) 2008, 2017, Oracle and/or its affiliates. All rights reserved.
-# Licensed under the Universal Permissive License v 1.0 as shown at
-# http://oss.oracle.com/licenses/upl.
-#
-
-#
-# get.ipv4remote.pl [tcpport]
-#
-# Find an IPv4 reachable remote host using both ip(8) and ping(8).
-# If a tcpport is specified, return a host that is also listening on this
-# TCP port. Print the local address and the remote address, or an
-# error message if no suitable remote host was found. Exit status is 0 if
-# a host was found. (Note: the only host we check is the gateway. Nobody
-# responds to broadcast pings these days, and portscanning the local net is
-# unfriendly.)
-#
-
-use strict;
-use IO::Socket;
-
-my $TIMEOUT = 3;
-my $tcpport = @ARGV == 1 ? $ARGV[0] : 0;
-
-#
-# Determine gateway IP address
-#
-
-my $local = "";
-my $remote = "";
-my $responsive = "";
-my $up;
-open IP, '/sbin/ip -o -4 route show |' or die "Couldn't run ip route show: $!\n";
-while (<IP>) {
-	next unless /^default /;
-
-	if (/via (\S+)/) {
-		$remote = $1;
-	}
-}
-close IP;
-die "Could not determine gateway router IP address" if $remote eq "";
-
-open IP, "/sbin/ip -o route get to $remote |" or die "Couldn't run ip route get: $!\n";
-while (<IP>) {
-	next unless /^$remote /;
-	if (/src (\S+)/) {
-		$local = $1;
-	}
-}
-close IP;
-die "Could not determine local IP address" if $local eq "";
-
-#
-# See if the rmote host responds to an icmp echo.
-#
-open PING, "/bin/ping -n -s 56 -w $TIMEOUT $remote |" or
-    die "Couldn't run ping: $!\n";
-while (<PING>) {
-	if (/bytes from (.*): /) {
-		my $addr = $1;
-
-		if ($tcpport != 0) {
-			#
-			# Test TCP
-			#
-			my $socket = IO::Socket::INET->new(
-				Proto => "tcp",
-				PeerAddr => $addr,
-				PeerPort => $tcpport,
-				Timeout => $TIMEOUT,
-			);
-			next unless $socket;
-			close $socket;
-		}
-
-		$responsive = $addr;
-		last;
-	}
-}
-close PING;
-die "Can't find a remote host for testing: No suitable response from " .
-    "$remote\n" if $responsive eq "";
-
-print "$local $responsive\n";
diff --git a/test/unittest/ip/get.ipv6remote.pl b/test/unittest/ip/get.ipv6remote.pl
deleted file mode 100755
index b2136c5b..00000000
--- a/test/unittest/ip/get.ipv6remote.pl
+++ /dev/null
@@ -1,70 +0,0 @@
-#!/usr/bin/perl -w
-#
-# Oracle Linux DTrace.
-# Copyright (c) 2008, 2017, Oracle and/or its affiliates. All rights reserved.
-# Licensed under the Universal Permissive License v 1.0 as shown at
-# http://oss.oracle.com/licenses/upl.
-#
-
-#
-# get.ipv6remote.pl
-#
-# Find an IPv6 reachable remote host using both ip(8) and ping(8).
-# Print the local address and the remote address, or print nothing if either
-# no IPv6 interfaces or remote hosts were found. (Remote IPv6 testing is
-# considered optional, and so not finding another IPv6 host is not an error
-# state we need to log.) Exit status is 0 if a host was found.
-#
-
-use strict;
-use IO::Socket;
-
-my $TIMEOUT = 3; # connection timeout
-
-# possible paths for ping6
-$ENV{'PATH'} = "/bin:/usr/bin:/sbin:/usr/sbin:$ENV{'PATH'}";
-
-#
-# Determine local IP address
-#
-my $local = "";
-my $remote = "";
-my $responsive = "";
-my $up;
-open IP, '/sbin/ip -o -6 route show |' or die "Couldn't run ip route show: $!\n";
-while (<IP>) {
-	next unless /^default /;
-
-	if (/via (\S+)/) {
-		$remote = $1;
-	}
-}
-close IP;
-die "Could not determine gateway router IPv6 address" if $remote eq "";
-
-open IP, "/sbin/ip -o route get to $remote |" or die "Couldn't run ip route get: $!\n";
-while (<IP>) {
-	next unless /^$remote /;
-	if (/src (\S+)/) {
-		$local = $1;
-	}
-}
-close IP;
-die "Could not determine local IPv6 address" if $local eq "";
-
-#
-# Find the first remote host that responds to an icmp echo,
-# which isn't a local address.
-#
-open PING, "ping6 -n -s 56 -w $TIMEOUT $remote 2>/dev/null |" or
-    die "Couldn't run ping: $!\n";
-while (<PING>) {
-	if (/bytes from (.*): /) {
-		$responsive = $1;
-		last;
-	}
-}
-close PING;
-exit 2 if $responsive eq "";
-
-print "$local $responsive\n";
diff --git a/test/unittest/ip/tst.ipv4remoteicmp.sh b/test/unittest/ip/tst.ipv4remoteicmp.sh
index c165cbdc..854797a7 100755
--- a/test/unittest/ip/tst.ipv4remoteicmp.sh
+++ b/test/unittest/ip/tst.ipv4remoteicmp.sh
@@ -13,9 +13,7 @@
 #
 # 1. A change to the ip stack breaking expected probe behavior,
 #    which is the reason we are testing.
-# 2. No physical network interface is plumbed and up.
-# 3. The subnet gateway is not reachable.
-# 4. An unrelated ICMP between these hosts was traced by accident.
+# 2. An unrelated ICMP between these hosts was traced by accident.
 #
 
 if (( $# != 1 )); then
@@ -25,18 +23,20 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/get.ipv4remote.pl
+getaddr=$testdir/../../utils/get_remote.sh
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-set -- $($getaddr)
+
+set -- $($getaddr ipv4)
 source="$1"
 dest="$2"
 if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
 	exit 67
 fi
+
 $dtrace $dt_flags -c "$testdir/perlping.pl icmp $dest" -qs /dev/stdin <<EOF | \
     sort -n
 ip:::send
diff --git a/test/unittest/ip/tst.ipv4remotetcp.sh b/test/unittest/ip/tst.ipv4remotetcp.sh
index 577c5668..f164d9d0 100755
--- a/test/unittest/ip/tst.ipv4remotetcp.sh
+++ b/test/unittest/ip/tst.ipv4remotetcp.sh
@@ -13,9 +13,7 @@
 #
 # 1. A change to the ip stack breaking expected probe behavior,
 #    which is the reason we are testing.
-# 2. No physical network interface is plumbed and up.
-# 3. The subnet gateway is not reachable and listening on ssh.
-# 4. An unlikely race causes the unlocked global send/receive
+# 2. An unlikely race causes the unlocked global send/receive
 #    variables to be corrupted.
 #
 # This test performs a TCP connection and checks that at least the
@@ -32,27 +30,20 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/get.ipv4remote.pl
-tcpports="22 80"
-tcpport=""
+getaddr=$testdir/../../utils/get_remote.sh
+tcpport="22"
 dest=""
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-for port in $tcpports ; do
-	res=`$getaddr $port 2>/dev/null`
-	if (( $? == 0 )); then
-		read s d <<< $res
-		tcpport=$port
-		source=$s
-		dest=$d
-		break
-	fi
-done
 
-if [ -z $tcpport ]; then
+set -- $($getaddr ipv4 $tcpport)
+source="$1"
+dest="$2"
+
+if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
 	exit 67
 fi
 
diff --git a/test/unittest/ip/tst.ipv4remoteudp.sh b/test/unittest/ip/tst.ipv4remoteudp.sh
index 3d25e1f5..f88ab35b 100755
--- a/test/unittest/ip/tst.ipv4remoteudp.sh
+++ b/test/unittest/ip/tst.ipv4remoteudp.sh
@@ -13,9 +13,7 @@
 #
 # 1. A change to the ip stack breaking expected probe behavior,
 #    which is the reason we are testing.
-# 2. No physical network interface is plumbed and up.
-# 3. The gateway is not reachable and listening on rpcbind.
-# 4. An unlikely race causes the unlocked global send/receive
+# 2. An unlikely race causes the unlocked global send/receive
 #    variables to be corrupted.
 #
 # This test sends a UDP message using ping and checks that at least the
@@ -31,13 +29,13 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/get.ipv4remote.pl
+getaddr=$testdir/../../utils/get_remote.sh
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-set -- $($getaddr)
+set -- $($getaddr ipv4)
 source="$1"
 dest="$2"
 if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
diff --git a/test/unittest/ip/tst.ipv6remoteicmp.sh b/test/unittest/ip/tst.ipv6remoteicmp.sh
index 90fd48b4..0107a3ae 100755
--- a/test/unittest/ip/tst.ipv6remoteicmp.sh
+++ b/test/unittest/ip/tst.ipv6remoteicmp.sh
@@ -19,7 +19,7 @@
 #
 # @@tags: unstable
 
-# possible paths for ping6
+# possible paths for ping
 export PATH=/bin:/usr/bin:/sbin:/usr/sbin:$PATH
 
 if (( $# != 1 )); then
@@ -29,24 +29,24 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/get.ipv6remote.pl
+getaddr=$testdir/../../utils/get_remote.sh
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-set -- $($getaddr)
+
+set -- $($getaddr ipv6)
 source="$1"
 dest="$2"
+
 if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
 	echo -n "Could not find a local IPv6 interface and a remote IPv6 " >&2
 	echo "host. Aborting test." >&2
 	exit 67
 fi
 
-nolinkdest="$(printf "%s" "$dest" | sed 's,%.*,,')"
-
-$dtrace $dt_flags -c "ping6 -c 6 $dest" -qs /dev/stdin <<EOF | \
+$dtrace $dt_flags -c "ping -6 -c 10 $dest" -qs /dev/stdin <<EOF | \
     gawk '/ip:::/ { print $0 }' | sort -n
 /*
  * We use a size match to include only things that are big enough to
@@ -54,7 +54,7 @@ $dtrace $dt_flags -c "ping6 -c 6 $dest" -qs /dev/stdin <<EOF | \
  */
 
 ip:::send
-/args[2]->ip_saddr == "$source" && args[2]->ip_daddr == "$nolinkdest" &&
+/args[2]->ip_saddr == "$source" && args[2]->ip_daddr == "$dest" &&
     args[5]->ipv6_nexthdr == IPPROTO_ICMPV6 && args[2]->ip_plength > 32/
 {
 	printf("1 ip:::send (");
@@ -64,7 +64,7 @@ ip:::send
 }
 
 ip:::receive
-/args[2]->ip_saddr == "$nolinkdest" && args[2]->ip_daddr == "$source" &&
+/args[2]->ip_saddr == "$dest" && args[2]->ip_daddr == "$source" &&
     args[5]->ipv6_nexthdr == IPPROTO_ICMPV6 && args[2]->ip_plength > 32/
 {
 	printf("2 ip:::receive (");
diff --git a/test/unittest/tcp/tst.ipv4remotetcp.sh b/test/unittest/tcp/tst.ipv4remotetcp.sh
index 333760a1..d8673d4b 100755
--- a/test/unittest/tcp/tst.ipv4remotetcp.sh
+++ b/test/unittest/tcp/tst.ipv4remotetcp.sh
@@ -13,9 +13,7 @@
 #
 # 1. A change to the tcp stack breaking expected probe behavior,
 #    which is the reason we are testing.
-# 2. No physical network interface is plumbed and up.
-# 3. No other hosts on this subnet are reachable and listening on ssh.
-# 4. An unlikely race causes the unlocked global send/receive
+# 2. An unlikely race causes the unlocked global send/receive
 #    variables to be corrupted.
 #
 # This test performs a TCP connection and checks that at least the
@@ -32,9 +30,8 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/../ip/get.ipv4remote.pl
-tcpports="22 80"
-tcpport=""
+getaddr=$testdir/../../utils/get_remote.sh
+tcpport="22"
 dest=""
 
 if [[ ! -x $getaddr ]]; then
@@ -42,18 +39,11 @@ if [[ ! -x $getaddr ]]; then
 	exit 3
 fi
 
-for port in $tcpports ; do
-	res=`$getaddr $port 2>/dev/null`
-	if (( $? == 0 )); then
-		read s d <<< $res
-		tcpport=$port
-		source=$s
-		dest=$d
-		break
-	fi
-done
+set -- $($getaddr ipv4 $tcpport)
+source="$1"
+dest="$2"
 
-if [[ -z $tcpport ]]; then
+if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
 	exit 67
 fi
 
diff --git a/test/unittest/tcp/tst.ipv4remotetcpstate.sh b/test/unittest/tcp/tst.ipv4remotetcpstate.sh
index 74fb4ce3..e9ff218d 100755
--- a/test/unittest/tcp/tst.ipv4remotetcpstate.sh
+++ b/test/unittest/tcp/tst.ipv4remotetcpstate.sh
@@ -17,8 +17,7 @@
 #
 # 1. A change to the ip stack breaking expected probe behavior,
 #    which is the reason we are testing.
-# 2. The remote ssh service is not online.
-# 3. An unlikely race causes the unlocked global send/receive
+# 2. An unlikely race causes the unlocked global send/receive
 #    variables to be corrupted.
 #
 # This test performs a TCP connection to the ssh service (port 22) and
@@ -40,29 +39,21 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/../ip/get.ipv4remote.pl
+getaddr=$testdir/../../utils/get_remote.sh
 client=$testdir/../ip/client.ip.pl
-tcpports="22 80"
-tcpport=""
-dest=""
+tcpport="22"
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-for port in $tcpports ; do
-	res=`$getaddr $port 2>/dev/null`
-	if (( $? == 0 )); then
-		read s d <<< $res
-		tcpport=$port
-		source=$s
-		dest=$d
-		break
-	fi
-done
-
-if [ -z $tcpport ]; then
-	exit 67
+
+set -- $($getaddr ipv4 $tcpport)
+source="$1"
+dest="$2"
+
+if [[ $? -ne 0 ]] || [[ -z $dest ]]; then
+	exit 67
 fi
 
 
diff --git a/test/unittest/udp/tst.ipv4remoteudp.sh b/test/unittest/udp/tst.ipv4remoteudp.sh
index 1c5f2a9a..4fe70f5a 100755
--- a/test/unittest/udp/tst.ipv4remoteudp.sh
+++ b/test/unittest/udp/tst.ipv4remoteudp.sh
@@ -34,14 +34,14 @@ fi
 
 dtrace=$1
 testdir="$(dirname $_test)"
-getaddr=$testdir/../ip/get.ipv4remote.pl
+getaddr=$testdir/../../utils/get_remote.sh
 port=31337
 
 if [[ ! -x $getaddr ]]; then
 	echo "could not find or execute sub program: $getaddr" >&2
 	exit 3
 fi
-read source dest <<<`$getaddr 2>/dev/null`
+read source dest <<<`$getaddr ipv4 2>/dev/null`
 if (( $? != 0 )) || [[ -z $dest ]]; then
 	exit 67
 fi
diff --git a/test/utils/get_remote.sh b/test/utils/get_remote.sh
new file mode 100755
index 00000000..d8a4d450
--- /dev/null
+++ b/test/utils/get_remote.sh
@@ -0,0 +1,71 @@
+#!/bin/bash
+#
+# Oracle Linux DTrace.
+# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+# Licensed under the Universal Permissive License v 1.0 as shown at
+# http://oss.oracle.com/licenses/upl.
+#
+
+#
+# get_remote.sh ipv4|ipv6|cleanup [tcpport]
+#
+# Create (or cleanup) a network namespace with either IPv4 or IPv6
+# address associated.
+#
+# Print the local address and the remote address, or an
+# error message if a failure occurred during setup.
+#
+# If tcpport is specified, start sshd on that port.
+#
+# Exit status is 0 if all succeeded.
+#
+
+cmd=$1
+tcpport=$2
+
+prefix=$(basename $tmpdir)
+netns=${prefix}ns
+veth1=${prefix}v1
+veth2=${prefix}v2
+mtu=1500
+
+set -e
+
+case $cmd in
+cleanup)	pids=$(ip netns pids ${netns} 2>/dev/null)
+		if [[ -n "$pids" ]]; then
+			kill -TERM $pids
+		fi
+		ip netns del ${netns} 2>/dev/null
+		exit 0
+		;;
+   ipv4)	veth1_addr=192.168.168.1
+		veth2_addr=192.168.168.2
+		prefixlen=24
+		family=
+		;;
+   ipv6)	veth1_addr=fd::1
+		veth2_addr=fd::2
+		prefixlen=64
+		family=-6
+		;;
+      *)	echo "Unexpected cmd $cmd" >&2
+		exit 1
+		;;
+esac
+
+ip netns add $netns
+ip link add dev $veth1 mtu $mtu netns $netns type veth \
+   peer name $veth2 mtu $mtu
+ip netns exec $netns ip $family addr add ${veth1_addr}/$prefixlen dev $veth1
+ip netns exec $netns ip link set $veth1 up
+ip addr add ${veth2_addr}/${prefixlen} dev $veth2
+ip link set $veth2 up
+
+if [[ -n "$tcpport" ]]; then
+	sshd=$(which sshd)
+	ip netns exec $netns $sshd -p $tcpport &
+fi
+
+echo "$veth2_addr $veth1_addr"
+exit 0
-- 
2.43.5
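A note for callers of get_remote.sh: in bash, `$?` reflects only the most recent command, so a status check placed after plain assignments such as `dest="$2"` sees the status of the assignment rather than of the helper. The following is a minimal sketch of one way to capture both the output and the exit status. The names `fake_get_remote` and `get_addrs` are hypothetical and exist only so the example runs without root; they are not part of the patch.

```shell
#!/bin/bash
# Hypothetical stand-in for test/utils/get_remote.sh so the sketch runs
# without root: prints "<local> <remote>", or fails for unknown input.
fake_get_remote() {
	case $1 in
	ipv4)	echo "192.168.168.2 192.168.168.1" ;;
	*)	echo "Unexpected cmd $1" >&2; return 1 ;;
	esac
}

# Capture the helper's output and exit status in one step, then split
# the two addresses.  Returns non-zero if the helper failed or if it
# printed fewer than two fields.
get_addrs() {
	local out
	# An assignment from a command substitution returns the status of
	# the substituted command, so "|| return 1" tests the helper itself.
	out=$(fake_get_remote "$@") || return 1
	read -r source dest <<< "$out"
	[[ -n $dest ]]
}

if get_addrs ipv4; then
	echo "local=$source remote=$dest"
else
	exit 67
fi
```

Keeping `local out` on its own line matters here: writing `local out=$(...)` would report the status of `local` instead of the helper.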
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 11:33 [PATCH] test/utils: add more reliable "get remote address" approach Alan Maguire
@ 2025-07-03 16:43 ` Eugene Loh
  2025-07-03 16:59 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Eugene Loh @ 2025-07-03 16:43 UTC (permalink / raw)
  To: Alan Maguire, dtrace; +Cc: dtrace-devel

Reviewed-by: Eugene Loh <eugene.loh@oracle.com>

I confess I don't understand all the details, but it seems like a nice
improvement. Thanks.

I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
patch 3/4 feedback).

I think you need to update Copyright years in the modified files. And...

On 7/3/25 07:33, Alan Maguire via DTrace-devel wrote:

> The current approach of looking for remote addresses
> is brittle and fails in many environments; it checks the
> default route gateway and looks for open ports in the TCP
> case.
>
> We can however achieve the same goal reliably by creating
> a network namespace on the system and configuring either
> IPv4 or IPv6 addresses on the namespaced and local veth
> interfaces that support communication between namespaces.
> If a tcp port is required start sshd to listen on that port.

Maybe a comma after "required"?

> Teardown is managed in runtest.sh as signal handling for
> timeouts within the test scripts is not working; a trap
> function does not trigger for TERM.

I'm having trouble parsing the text before the semicolon. I think I
understand it, but cannot seem to figure out the grammar.
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 16:43 ` [DTrace-devel] " Eugene Loh
@ 2025-07-03 16:59 ` Alan Maguire
  2025-07-03 17:06 ` Eugene Loh
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-07-03 16:59 UTC (permalink / raw)
  To: Eugene Loh, dtrace; +Cc: dtrace-devel

On 03/07/2025 17:43, Eugene Loh wrote:
> Reviewed-by: Eugene Loh <eugene.loh@oracle.com>
>

Thanks for the review!

> I confess I don't understand all the details, but it seems like a nice
> improvement. Thanks.
>

Creating a network namespace essentially gives you an independent TCP/IP
stack on the system; you then connect to it from the main (global network
namespace) via a veth pair; one lives on the network namespace side in
that TCP/IP stack and the other lives on the local system (global
namespace) side; kind of like using a back-to-back cable between two
ethernet cards on different systems, but virtually. The upshot is that
the networking stack sends to the namespaced interface just like it does
for "real" network traffic; that is the benefit it gives us.

> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
> patch 3/4 feedback).
>

Sorry I couldn't find that issue; is this the 5.15 problem with the ip
send probes?

> I think you need to update Copyright years in the modified files. And...
>

Will do, thanks!

> On 7/3/25 07:33, Alan Maguire via DTrace-devel wrote:
>
>> The current approach of looking for remote addresses
>> is brittle and fails in many environments; it checks the
>> default route gateway and looks for open ports in the TCP
>> case.
>>
>> We can however achieve the same goal reliably by creating
>> a network namespace on the system and configuring either
>> IPv4 or IPv6 addresses on the namespaced and local veth
>> interfaces that support communication between namespaces.
>> If a tcp port is required start sshd to listen on that port.
>
> Maybe a comma after "required"?
>

yep, will fix.

>> Teardown is managed in runtest.sh as signal handling for
>> timeouts within the test scripts is not working; a trap
>> function does not trigger for TERM.
>
> I'm having trouble parsing the text before the semicolon. I think I
> understand it, but cannot seem to figure out the grammar.

I'll try and rephrase; basically I tried adding a

trap cleanup TERM

to the test script to catch a SIGTERM when the test timed out;
unfortunately this didn't trigger when tests timed out so we were left
with network namespaces hanging around.

How about

Teardown of network namespaces is managed in the toplevel runtest.sh to
ensure that network namespaces are removed after test completion for all
cases; success, failure and timeout.
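The idea of doing teardown in the runner rather than in each test script can be sketched as below. This is only an illustration under stated assumptions: `teardown` and `run_with_teardown` are hypothetical names, the teardown body just counts calls where the patch would invoke `test/utils/get_remote.sh cleanup`, and coreutils timeout(1) is used for the time limit, which is not necessarily how runtest.sh implements timeouts.

```shell
#!/bin/bash
# Hypothetical teardown hook.  In the patch this role is played by
# "test/utils/get_remote.sh cleanup"; here it only counts invocations
# so that the sketch runs without root.
teardown_calls=0
teardown() {
	teardown_calls=$((teardown_calls + 1))
}

# Run a command under a time limit and always call teardown afterwards,
# whatever the outcome.  Assumes coreutils timeout(1), which exits with
# status 124 when the time limit is reached.
run_with_teardown() {
	local limit=$1 status
	shift
	timeout "$limit" "$@"
	status=$?
	teardown
	return $status
}

run_with_teardown 5 true
echo "success: status=$? teardowns=$teardown_calls"
run_with_teardown 5 false
echo "failure: status=$? teardowns=$teardown_calls"
run_with_teardown 1 sleep 10
echo "timeout: status=$? teardowns=$teardown_calls"
```

Because teardown runs in the caller after the test has ended, it does not depend on the test script receiving or handling any signal.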
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 16:59 ` Alan Maguire
@ 2025-07-03 17:06 ` Eugene Loh
  2025-07-03 18:02 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Eugene Loh @ 2025-07-03 17:06 UTC (permalink / raw)
  To: Alan Maguire, dtrace; +Cc: dtrace-devel

On 7/3/25 12:59, Alan Maguire wrote:

> On 03/07/2025 17:43, Eugene Loh wrote:
>
>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
>> patch 3/4 feedback).
>>
> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
> send probes?

dtrace: failed to compile script /dev/stdin:
".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
inet_ntoa arg#1 (ipaddr_t *):
Unknown type name

>> On 7/3/25 07:33, Alan Maguire via DTrace-devel wrote:
>>
>>> The current approach of looking for remote addresses
>>> is brittle and fails in many environments; it checks the
>>> default route gateway and looks for open ports in the TCP
>>> case.
>>>
>>> We can however achieve the same goal reliably by creating
>>> a network namespace on the system and configuring either
>>> IPv4 or IPv6 addresses on the namespaced and local veth
>>> interfaces that support communication between namespaces.
>>> If a tcp port is required start sshd to listen on that port.
>> Maybe a comma after "required"?
>>
> yep, will fix.
>
>>> Teardown is managed in runtest.sh as signal handling for
>>> timeouts within the test scripts is not working; a trap
>>> function does not trigger for TERM.
>> I'm having trouble parsing the text before the semicolon. I think I
>> understand it, but cannot seem to figure out the grammar.
> I'll try and rephrase; basically I tried adding a
>
> trap cleanup TERM
>
> to the test script to catch a SIGTERM when the test timed out;
> unfortunately this didn't trigger when tests timed out so we were left
> with network namespaces hanging around.
>
> How about
>
> Teardown of network namespaces is managed in the toplevel runtest.sh to
> ensure that network namespaces are removed after test completion for all
> cases; success, failure and timeout.

Great. Or how about a colon instead of semicolon?
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 17:06 ` Eugene Loh
@ 2025-07-03 18:02 ` Alan Maguire
  2025-07-03 18:26 ` Kris Van Hees
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-07-03 18:02 UTC (permalink / raw)
  To: Eugene Loh, dtrace; +Cc: dtrace-devel

On 03/07/2025 18:06, Eugene Loh wrote:
> On 7/3/25 12:59, Alan Maguire wrote:
>
>> On 03/07/2025 17:43, Eugene Loh wrote:
>>
>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
>>> patch 3/4 feedback).
>>>
>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
>> send probes?
>
> dtrace: failed to compile script /dev/stdin:
> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
> inet_ntoa arg#1 (ipaddr_t *):
> Unknown type name
>

Ah, sorry yep I have a fix for that one in the next round. Basically we
need to add it to the core set of typedefs and add a type for a pointer
to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.

>>> On 7/3/25 07:33, Alan Maguire via DTrace-devel wrote:
>>>
>>>> The current approach of looking for remote addresses
>>>> is brittle and fails in many environments; it checks the
>>>> default route gateway and looks for open ports in the TCP
>>>> case.
>>>>
>>>> We can however achieve the same goal reliably by creating
>>>> a network namespace on the system and configuring either
>>>> IPv4 or IPv6 addresses on the namespaced and local veth
>>>> interfaces that support communication between namespaces.
>>>> If a tcp port is required start sshd to listen on that port.
>>> Maybe a comma after "required"?
>>>
>> yep, will fix.
>>
>>>> Teardown is managed in runtest.sh as signal handling for
>>>> timeouts within the test scripts is not working; a trap
>>>> function does not trigger for TERM.
>>> I'm having trouble parsing the text before the semicolon. I think I
>>> understand it, but cannot seem to figure out the grammar.
>> I'll try and rephrase; basically I tried adding a
>>
>> trap cleanup TERM
>>
>> to the test script to catch a SIGTERM when the test timed out;
>> unfortunately this didn't trigger when tests timed out so we were left
>> with network namespaces hanging around.
>>
>> How about
>>
>> Teardown of network namespaces is managed in the toplevel runtest.sh to
>> ensure that network namespaces are removed after test completion for all
>> cases; success, failure and timeout.
>
> Great. Or how about a colon instead of semicolon?

Sure!
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 18:02 ` Alan Maguire
@ 2025-07-03 18:26 ` Kris Van Hees
  2025-07-03 18:41 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Kris Van Hees @ 2025-07-03 18:26 UTC (permalink / raw)
  To: Alan Maguire; +Cc: Eugene Loh, dtrace, dtrace-devel

On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
> On 03/07/2025 18:06, Eugene Loh wrote:
> > On 7/3/25 12:59, Alan Maguire wrote:
> >
> >> On 03/07/2025 17:43, Eugene Loh wrote:
> >>
> >>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
> >>> patch 3/4 feedback).
> >>>
> >> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
> >> send probes?
> >
> > dtrace: failed to compile script /dev/stdin:
> > ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
> > inet_ntoa arg#1 (ipaddr_t *):
> > Unknown type name
> >
>
> Ah, sorry yep I have a fix for that one in the next round. Basically we
> need to add it to the core set of typedefs and add a type for a pointer
> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.

Why can't we rely on the pragma? That is how e.g. the ip provider manages
this I believe?

I'd really rather not add a type like this to the core set of typedefs if we
can avoid it, because it really isn't a core type.

> >>> On 7/3/25 07:33, Alan Maguire via DTrace-devel wrote:
> >>>
> >>>> The current approach of looking for remote addresses
> >>>> is brittle and fails in many environments; it checks the
> >>>> default route gateway and looks for open ports in the TCP
> >>>> case.
> >>>>
> >>>> We can however achieve the same goal reliably by creating
> >>>> a network namespace on the system and configuring either
> >>>> IPv4 or IPv6 addresses on the namespaced and local veth
> >>>> interfaces that support communication between namespaces.
> >>>> If a tcp port is required start sshd to listen on that port.
> >>> Maybe a comma after "required"?
> >>>
> >> yep, will fix.
> >>
> >>>> Teardown is managed in runtest.sh as signal handling for
> >>>> timeouts within the test scripts is not working; a trap
> >>>> function does not trigger for TERM.
> >>> I'm having trouble parsing the text before the semicolon. I think I
> >>> understand it, but cannot seem to figure out the grammar.
> >> I'll try and rephrase; basically I tried adding a
> >>
> >> trap cleanup TERM
> >>
> >> to the test script to catch a SIGTERM when the test timed out;
> >> unfortunately this didn't trigger when tests timed out so we were left
> >> with network namespaces hanging around.
> >>
> >> How about
> >>
> >> Teardown of network namespaces is managed in the toplevel runtest.sh to
> >> ensure that network namespaces are removed after test completion for all
> >> cases; success, failure and timeout.
> >
> > Great. Or how about a colon instead of semicolon?
>
> Sure!
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 18:26 ` Kris Van Hees
@ 2025-07-03 18:41 ` Alan Maguire
  2025-07-03 19:03 ` Kris Van Hees
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-07-03 18:41 UTC (permalink / raw)
  To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel

On 03/07/2025 19:26, Kris Van Hees wrote:
> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
>> On 03/07/2025 18:06, Eugene Loh wrote:
>>> On 7/3/25 12:59, Alan Maguire wrote:
>>>
>>>> On 03/07/2025 17:43, Eugene Loh wrote:
>>>>
>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
>>>>> patch 3/4 feedback).
>>>>>
>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
>>>> send probes?
>>>
>>> dtrace: failed to compile script /dev/stdin:
>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
>>> inet_ntoa arg#1 (ipaddr_t *):
>>> Unknown type name
>>>
>>
>> Ah, sorry yep I have a fix for that one in the next round. Basically we
>> need to add it to the core set of typedefs and add a type for a pointer
>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.
>
> Why can't we rely on the pragma? That is how e.g. the ip provider manages
> this I believe?
>

Unfortunately the #pragma include doesn't do enough; it just defines a
type for ipaddr_t, not a type for a _pointer_ to an ipaddr_t, which is
what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t
typedef to net.d and doing the pointer lookup/addition but that doesn't
work either. Seems we need the core typedef + pointer addition or we hit
this failure.

> I'd really rather not add a type like this to the core set of typedefs if we
> can avoid it, because it really isn't a core type.
>

I can't see another way round this currently unfortunately.

Alan
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-03 18:41 ` Alan Maguire
@ 2025-07-03 19:03 ` Kris Van Hees
  2025-07-03 20:23 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Kris Van Hees @ 2025-07-03 19:03 UTC (permalink / raw)
  To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel

On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote:
> On 03/07/2025 19:26, Kris Van Hees wrote:
> > On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
> >> On 03/07/2025 18:06, Eugene Loh wrote:
> >>> On 7/3/25 12:59, Alan Maguire wrote:
> >>>
> >>>> On 03/07/2025 17:43, Eugene Loh wrote:
> >>>>
> >>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
> >>>>> patch 3/4 feedback).
> >>>>>
> >>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
> >>>> send probes?
> >>>
> >>> dtrace: failed to compile script /dev/stdin:
> >>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
> >>> inet_ntoa arg#1 (ipaddr_t *):
> >>> Unknown type name
> >>>
> >>
> >> Ah, sorry yep I have a fix for that one in the next round. Basically we
> >> need to add it to the core set of typedefs and add a type for a pointer
> >> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.
> >
> > Why can't we rely on the pragma? That is how e.g. the ip provider manages
> > this I believe?
> >
>
> Unfortunately the #pragma include doesn't do enough; it just defines a
> type for ipaddr_t, not a type for a _pointer_ to an ipaddr_t, which is
> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t
> typedef to net.d and doing the pointer lookup/addition but that doesn't
> work either. Seems we need the core typedef + pointer addition or we hit
> this failure.

Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d,
you should be set. That is what I did in my preliminary tcp provider impl.
I do believe that works.

Either way, we use inet_ntoa() in the ip.d translators and that works with
that typedef in the file, so this really ought to work.

> > I'd really rather not add a type like this to the core set of typedefs if we
> > can avoid it, because it really isn't a core type.
> >
>
> I can't see another way round this currently unfortunately.
>
> Alan
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-03 19:03 ` Kris Van Hees @ 2025-07-03 20:23 ` Alan Maguire 2025-07-03 20:59 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Alan Maguire @ 2025-07-03 20:23 UTC (permalink / raw) To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel On 03/07/2025 20:03, Kris Van Hees wrote: > On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: >> On 03/07/2025 19:26, Kris Van Hees wrote: >>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: >>>> On 03/07/2025 18:06, Eugene Loh wrote: >>>>> On 7/3/25 12:59, Alan Maguire wrote: >>>>> >>>>>> On 03/07/2025 17:43, Eugene Loh wrote: >>>>>> >>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the >>>>>>> patch 3/4 feedback). >>>>>>> >>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip >>>>>> send probes? >>>>> >>>>> dtrace: failed to compile script /dev/stdin: >>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>> Unknown type name >>>>> >>>> >>>> Ah, sorry yep I have a fix for that one in the next round. Basically we >>>> need to add it to the core set of typedefs and add a type for a pointer >>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>> >>> Why can't we rely on the pragma? That is how e.g. the ip provider manages >>> this I believe? >>> >> >> Unfortunately the #pragma include doesn't do enough; it just defines a >> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t >> typedef to net.d and doing the pointer lookup/addition but that doesn't >> work either. Seems we need the core typedef + pointer addition or we hit >> this failure. > > Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > you should be set. 
That is what I did in my preliminary tcp provider impl. > I do believe that works. Either way, we use inet_ntoa() in the ip.d > translators and that works with that typedef in the file, so this really ought > to work. > Yep, I tried that in the v2 patch series; Eugene hit the undefined error in one test and I now hit it consistently for all tcp/ip tests unfortunately with "typedef __be32 ipaddr_t;" in net.d. My assumption (probably wrong) is that the include of the library does happen but nothing triggers the pointer type generation for "ipaddr_t *" in the CTF dict. If there was a way to force that type generation at the .d file level that would be great, but I'm not sure I see a way currently. Alan >>> I'd really rather not add a type like this to the core set of typedefs we >>> can avoid it, because it really isn't a core type. >>> >> >> I can't see another way round this currently unfortunately. >> >> Alan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-03 20:23 ` Alan Maguire @ 2025-07-03 20:59 ` Kris Van Hees 2025-07-03 22:36 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Kris Van Hees @ 2025-07-03 20:59 UTC (permalink / raw) To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: > On 03/07/2025 20:03, Kris Van Hees wrote: > > On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: > >> On 03/07/2025 19:26, Kris Van Hees wrote: > >>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: > >>>> On 03/07/2025 18:06, Eugene Loh wrote: > >>>>> On 7/3/25 12:59, Alan Maguire wrote: > >>>>> > >>>>>> On 03/07/2025 17:43, Eugene Loh wrote: > >>>>>> > >>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the > >>>>>>> patch 3/4 feedback). > >>>>>>> > >>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip > >>>>>> send probes? > >>>>> > >>>>> dtrace: failed to compile script /dev/stdin: > >>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > >>>>> inet_ntoa arg#1 (ipaddr_t *): > >>>>> Unknown type name > >>>>> > >>>> > >>>> Ah, sorry yep I have a fix for that one in the next round. Basically we > >>>> need to add it to the core set of typedefs and add a type for a pointer > >>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > >>> > >>> Why can't we rely on the pragma? That is how e.g. the ip provider manages > >>> this I believe? > >>> > >> > >> Unfortunately the #pragma include doesn't do enough; it just defines a > >> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > >> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t > >> typedef to net.d and doing the pointer lookup/addition but that doesn't > >> work either. 
Seems we need the core typedef + pointer addition or we hit > >> this failure. > > > > Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > > you should be set. That is what I did in my priliminary tcp provider impl. > > I do believe that works. Either way, we use inet_ntoa() in the ip.d > > translators and that works with that typedef in the file, so this really ought > > to work. > Yep, I tried that in the v2 patch series; Eugene hit the undefined error > in one test and I now hit it consistently for all tcp/ip tests > unfortunately with "typedef __be32 ipaddr_t;" in net.d. > > My assumption (probably wrong) is that the include of the library does > happen but nothing triggers the pointer type generation for "ipaddr *" > in the CTF dict. If there was a way to force that type generation at the > .d file level that would be great, not sure I see a way currently tho. Well, like I said, it does work for ip.d so I don't see why this would be any different. I'll have a look and see if I can figure something out. Kris ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-03 20:59 ` Kris Van Hees @ 2025-07-03 22:36 ` Kris Van Hees 2025-07-07 16:32 ` Alan Maguire 0 siblings, 1 reply; 21+ messages in thread From: Kris Van Hees @ 2025-07-03 22:36 UTC (permalink / raw) To: Kris Van Hees; +Cc: Alan Maguire, Eugene Loh, dtrace, dtrace-devel On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: > On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: > > On 03/07/2025 20:03, Kris Van Hees wrote: > > > On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: > > >> On 03/07/2025 19:26, Kris Van Hees wrote: > > >>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: > > >>>> On 03/07/2025 18:06, Eugene Loh wrote: > > >>>>> On 7/3/25 12:59, Alan Maguire wrote: > > >>>>> > > >>>>>> On 03/07/2025 17:43, Eugene Loh wrote: > > >>>>>> > > >>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the > > >>>>>>> patch 3/4 feedback). > > >>>>>>> > > >>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip > > >>>>>> send probes? > > >>>>> > > >>>>> dtrace: failed to compile script /dev/stdin: > > >>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > > >>>>> inet_ntoa arg#1 (ipaddr_t *): > > >>>>> Unknown type name > > >>>>> > > >>>> > > >>>> Ah, sorry yep I have a fix for that one in the next round. Basically we > > >>>> need to add it to the core set of typedefs and add a type for a pointer > > >>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > > >>> > > >>> Why can't we rely on the pragma? That is how e.g. the ip provider manages > > >>> this I believe? > > >>> > > >> > > >> Unfortunately the #pragma include doesn't do enough; it just defines a > > >> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > > >> what we need as a parameter to inet_ntoa(). 
I tried adding the ipaddr_t > > >> typedef to net.d and doing the pointer lookup/addition but that doesn't > > >> work either. Seems we need the core typedef + pointer addition or we hit > > >> this failure. > > > > > > Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > > > you should be set. That is what I did in my priliminary tcp provider impl. > > > I do believe that works. Either way, we use inet_ntoa() in the ip.d > > > translators and that works with that typedef in the file, so this really ought > > > to work. > > > Yep, I tried that in the v2 patch series; Eugene hit the undefined error > > in one test and I now hit it consistently for all tcp/ip tests > > unfortunately with "typedef __be32 ipaddr_t;" in net.d. > > > > My assumption (probably wrong) is that the include of the library does > > happen but nothing triggers the pointer type generation for "ipaddr *" > > in the CTF dict. If there was a way to force that type generation at the > > .d file level that would be great, not sure I see a way currently tho. > > Well, like I said, it does work for ip.d so I don't see why this would be > any different. I'll have a look and see if I can figure something out. Looking into this more, I think the problem is simply that you did not sync all the dlibs for the various kernel versions with the updated ip.d, net.d, and tcp.d files. So, if the kernel on the OL8 instance you test on does not have your change, it will fail. Also, I do not understand why you removed the pragma #pragma D depends_on provider tcp from tcp.d. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-03 22:36 ` Kris Van Hees @ 2025-07-07 16:32 ` Alan Maguire 2025-07-07 16:53 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Alan Maguire @ 2025-07-07 16:32 UTC (permalink / raw) To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel On 03/07/2025 23:36, Kris Van Hees wrote: > On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: >> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: >>> On 03/07/2025 20:03, Kris Van Hees wrote: >>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: >>>>> On 03/07/2025 19:26, Kris Van Hees wrote: >>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: >>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: >>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: >>>>>>>> >>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: >>>>>>>>> >>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the >>>>>>>>>> patch 3/4 feedback). >>>>>>>>>> >>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip >>>>>>>>> send probes? >>>>>>>> >>>>>>>> dtrace: failed to compile script /dev/stdin: >>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>>>>> Unknown type name >>>>>>>> >>>>>>> >>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we >>>>>>> need to add it to the core set of typedefs and add a type for a pointer >>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>>>>> >>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages >>>>>> this I believe? >>>>>> >>>>> >>>>> Unfortunately the #pragma include doesn't do enough; it just defines a >>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >>>>> what we need as a parameter to inet_ntoa(). 
I tried adding the ipaddr_t >>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't >>>>> work either. Seems we need the core typedef + pointer addition or we hit >>>>> this failure. >>>> >>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, >>>> you should be set. That is what I did in my priliminary tcp provider impl. >>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d >>>> translators and that works with that typedef in the file, so this really ought >>>> to work. >> >>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error >>> in one test and I now hit it consistently for all tcp/ip tests >>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. >>> >>> My assumption (probably wrong) is that the include of the library does >>> happen but nothing triggers the pointer type generation for "ipaddr *" >>> in the CTF dict. If there was a way to force that type generation at the >>> .d file level that would be great, not sure I see a way currently tho. >> >> Well, like I said, it does work for ip.d so I don't see why this would be >> any different. I'll have a look and see if I can figure something out. > > Looking into this more, I think the problem is simply that you did not sync > all the dlibs for the various kernel versions with the updated ip.d, net.d, and > tcp.d files. So, if the kernel on the OL8 instance you test on does not have > your change, it will fail. > No, don't think that's it; the .d files that matched the kernel I tested on (6.10) were synced; the use of the 6.10 .d files was visible in the error message. The problem appears to be around the fact that tcp.d uses the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in translated types) it does not have any other mention of ipaddr_t. Adding an explicit cast in tcp.d to the argument to inet_ntoa() to ipaddr_t * resolves the issue without having to add ipaddr_t to the core type list. 
Further debug logging shows that the actual type addition happens in an order which can potentially cause problems in some cases; for example with the above changes I see:

libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/errno.d sorted (1/2)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/io.d sorted (3/4)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/ip.d sorted (5/6)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/lockstat.d sorted (7/8)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/tcp.d sorted (10/11)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/udp.d sorted (12/13)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/net.d sorted (9/14)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/pcap.d sorted (15/16)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/procfs.d sorted (17/18)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/regs.d sorted (19/20)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/sched.d sorted (21/22)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/signal.d sorted (23/24)
libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/unistd.d sorted (25/26)
libdtrace DEBUG 1751904270: typedef processorid_t added as id 2147483677
libdtrace DEBUG 1751904270: typedef psetid_t added as id 2147483678
libdtrace DEBUG 1751904270: typedef chipid_t added as id 2147483679
libdtrace DEBUG 1751904270: typedef lgrp_id_t added as id 2147483680
libdtrace DEBUG 1751904270: typedef cpuinfo_t added as id 2147483682
libdtrace DEBUG 1751904270: typedef cpuinfo_t_p added as id 2147483684
libdtrace DEBUG 1751904270: typedef time_t added as id 2147483688
libdtrace DEBUG 1751904270: typedef timestruc_t added as id 2147483690
libdtrace DEBUG 1751904270: typedef lwpsinfo_t added as id 2147483695
libdtrace DEBUG 1751904270: typedef taskid_t added as id 2147483696
libdtrace DEBUG 1751904270: typedef dprojid_t added as id 2147483697
libdtrace DEBUG 1751904270: typedef poolid_t added as id 2147483698
libdtrace DEBUG 1751904270: typedef zoneid_t added as id 2147483699
libdtrace DEBUG 1751904270: typedef psinfo_t added as id 2147490324

[edit: following typedefs come from net.d:]

libdtrace DEBUG 1751904270: typedef conninfo_t added as id 2147490329
libdtrace DEBUG 1751904270: typedef netstackid_t added as id 2147490330
libdtrace DEBUG 1751904270: typedef ipaddr_t added as id 2147490331
libdtrace DEBUG 1751904270: typedef in6_addr_t added as id 2147490332
libdtrace DEBUG 1751904270: typedef pktinfo_t added as id 2147490334
libdtrace DEBUG 1751904270: typedef csinfo_t added as id 2147490336
libdtrace DEBUG 1751904270: skipping library /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: program requires provider udp

[edit: following typedefs come from tcp.d:]

libdtrace DEBUG 1751904270: typedef tcpinfo_t added as id 2147490338
libdtrace DEBUG 1751904270: typedef tcpsinfo_t added as id 2147490340
libdtrace DEBUG 1751904270: typedef tcplsinfo_t added as id 2147490342

[edit: following typedefs come from ip.d:]

libdtrace DEBUG 1751904270: typedef ipinfo_t added as id 2147490350
libdtrace DEBUG 1751904270: typedef ifinfo_t added as id 2147490352
libdtrace DEBUG 1751904270: typedef ipv4info_t added as id 2147490361
libdtrace DEBUG 1751904270: typedef ipv6info_t added as id 2147490369
libdtrace DEBUG 1751904270: typedef void_ip_t added as id 2147490370
libdtrace DEBUG 1751904270: typedef __dtrace_tcp_void_ip_t added as id 2147490371
libdtrace DEBUG 1751904270: typedef caddr_t added as id 2147490380
libdtrace DEBUG 1751904270: typedef bufinfo_t added as id 2147490382
libdtrace DEBUG 1751904270: typedef devinfo_t added as id 2147490385

Notice that the ip.d typedefs are added after the tcp ones, despite tcp.d having a "#pragma D depends_on provider ip.d". So it seems like the "depends_on library net.d" was honoured above, but not the "depends_on provider ip.d".
Moving the core net-generic definitions to net.d alone was not enough however; including the explicit casts to the argument to inet_ntoa() (to ipaddr_t *) in tcp.d was needed also. > Also, I do not understand why you removed the pragma > #pragma D depends_on provider tcp > from tcp.d. Yeah, not sure why that got removed; I'll add it again for v3. Alan ^ permalink raw reply [flat|nested] 21+ messages in thread
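The ordering observation in the debug log above can be checked mechanically rather than by eye. What follows is a minimal sketch and not part of the patch series: the helper names typedef_order and added_before are hypothetical, and the only thing assumed about libdtrace is the "typedef NAME added as id N" line format quoted in the message.

```shell
#!/bin/bash
# Print the names of typedefs in the order libdtrace reported adding them.
# Input: libdtrace debug output on stdin, in the format quoted in the thread:
#   libdtrace DEBUG <timestamp>: typedef <name> added as id <id>
typedef_order() {
    awk '$1 == "libdtrace" && $2 == "DEBUG" && $4 == "typedef" && $6 == "added" { print $5 }'
}

# Succeed (exit 0) only if typedef $1 appears in the log on stdin
# before typedef $2. Fails if either name is missing.
added_before() {
    typedef_order | awk -v a="$1" -v b="$2" '
        $0 == a && !pa { pa = NR }    # first position of a
        $0 == b && !pb { pb = NR }    # first position of b
        END { exit !(pa && pb && pa < pb) }'
}
```

Fed the log quoted in the message, "added_before tcpinfo_t ipinfo_t" would succeed, which is the tcp-before-ip ordering pointed out there, while "added_before ipaddr_t tcpinfo_t" would confirm that the net.d typedef is in place before the tcp.d ones.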
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-07 16:32 ` Alan Maguire @ 2025-07-07 16:53 ` Kris Van Hees 2025-07-07 18:14 ` Alan Maguire 0 siblings, 1 reply; 21+ messages in thread From: Kris Van Hees @ 2025-07-07 16:53 UTC (permalink / raw) To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: > On 03/07/2025 23:36, Kris Van Hees wrote: > > On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: > >> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: > >>> On 03/07/2025 20:03, Kris Van Hees wrote: > >>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: > >>>>> On 03/07/2025 19:26, Kris Van Hees wrote: > >>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: > >>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: > >>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: > >>>>>>>> > >>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: > >>>>>>>>> > >>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the > >>>>>>>>>> patch 3/4 feedback). > >>>>>>>>>> > >>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip > >>>>>>>>> send probes? > >>>>>>>> > >>>>>>>> dtrace: failed to compile script /dev/stdin: > >>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > >>>>>>>> inet_ntoa arg#1 (ipaddr_t *): > >>>>>>>> Unknown type name > >>>>>>>> > >>>>>>> > >>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we > >>>>>>> need to add it to the core set of typedefs and add a type for a pointer > >>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > >>>>>> > >>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages > >>>>>> this I believe? 
> >>>>>> > >>>>> > >>>>> Unfortunately the #pragma include doesn't do enough; it just defines a > >>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > >>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t > >>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't > >>>>> work either. Seems we need the core typedef + pointer addition or we hit > >>>>> this failure. > >>>> > >>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > >>>> you should be set. That is what I did in my priliminary tcp provider impl. > >>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d > >>>> translators and that works with that typedef in the file, so this really ought > >>>> to work. > >> > >>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error > >>> in one test and I now hit it consistently for all tcp/ip tests > >>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. > >>> > >>> My assumption (probably wrong) is that the include of the library does > >>> happen but nothing triggers the pointer type generation for "ipaddr *" > >>> in the CTF dict. If there was a way to force that type generation at the > >>> .d file level that would be great, not sure I see a way currently tho. > >> > >> Well, like I said, it does work for ip.d so I don't see why this would be > >> any different. I'll have a look and see if I can figure something out. > > > > Looking into this more, I think the problem is simply that you did not sync > > all the dlibs for the various kernel versions with the updated ip.d, net.d, and > > tcp.d files. So, if the kernel on the OL8 instance you test on does not have > > your change, it will fail. > > > > No, don't think that's it; the .d files that matched the kernel I tested > on (6.10) were synced; the use of the 6.10 .d files was visible in the > error message. 
The problem appears to be around the fact that tcp.d uses > the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in > translated types) it does not have any other mention of ipaddr_t. > Adding an explicit cast in tcp.d to the argument to inet_ntoa() to > ipaddr_t * resolves the issue without having to add ipaddr_t to the core > type list. Can you reproduce this at will? Can you give me specifics on OL version, kernel version, etc? I'd like to be able to reproduce what you see, because so far, all I tried actually works once the ipaddr_t typedef is in net.d. > Further debug logging shows that the actual type addition happens in an > order which can cause potentially cause problems in some cases; for > example with the above changes I see: > > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/errno.d > sorted (1/2) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/io.d sorted (3/4) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/ip.d sorted (5/6) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/lockstat.d > sorted (7/8) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/tcp.d sorted > (10/11) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/udp.d sorted > (12/13) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/net.d sorted > (9/14) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/pcap.d sorted > (15/16) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/procfs.d > sorted (17/18) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/regs.d sorted > (19/20) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/sched.d > sorted (21/22) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/signal.d > sorted (23/24) > libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/unistd.d > sorted (25/26) > libdtrace DEBUG 1751904270: typedef processorid_t added as id 2147483677 > libdtrace DEBUG 1751904270: typedef psetid_t added as id 
2147483678 > libdtrace DEBUG 1751904270: typedef chipid_t added as id 2147483679 > libdtrace DEBUG 1751904270: typedef lgrp_id_t added as id 2147483680 > libdtrace DEBUG 1751904270: typedef cpuinfo_t added as id 2147483682 > libdtrace DEBUG 1751904270: typedef cpuinfo_t_p added as id 2147483684 > libdtrace DEBUG 1751904270: typedef time_t added as id 2147483688 > libdtrace DEBUG 1751904270: typedef timestruc_t added as id 2147483690 > libdtrace DEBUG 1751904270: typedef lwpsinfo_t added as id 2147483695 > libdtrace DEBUG 1751904270: typedef taskid_t added as id 2147483696 > libdtrace DEBUG 1751904270: typedef dprojid_t added as id 2147483697 > libdtrace DEBUG 1751904270: typedef poolid_t added as id 2147483698 > libdtrace DEBUG 1751904270: typedef zoneid_t added as id 2147483699 > libdtrace DEBUG 1751904270: typedef psinfo_t added as id 2147490324 > > [edit: following typedefs come from net.d:] > > libdtrace DEBUG 1751904270: typedef conninfo_t added as id 2147490329 > libdtrace DEBUG 1751904270: typedef netstackid_t added as id 2147490330 > libdtrace DEBUG 1751904270: typedef ipaddr_t added as id 2147490331 > libdtrace DEBUG 1751904270: typedef in6_addr_t added as id 2147490332 > libdtrace DEBUG 1751904270: typedef pktinfo_t added as id 2147490334 > libdtrace DEBUG 1751904270: typedef csinfo_t added as id 2147490336 > libdtrace DEBUG 1751904270: skipping library > /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: > program requires provider udp > > [edit: followig typedefs come from tcp.d:] > > libdtrace DEBUG 1751904270: typedef tcpinfo_t added as id 2147490338 > libdtrace DEBUG 1751904270: typedef tcpsinfo_t added as id 2147490340 > libdtrace DEBUG 1751904270: typedef tcplsinfo_t added as id 2147490342 > > [edit: following typedefs come from ip.d;] > > libdtrace DEBUG 1751904270: typedef ipinfo_t added as id 2147490350 > libdtrace DEBUG 1751904270: typedef ifinfo_t added as id 2147490352 > libdtrace DEBUG 1751904270: typedef ipv4info_t added 
as id 2147490361 > libdtrace DEBUG 1751904270: typedef ipv6info_t added as id 2147490369 > libdtrace DEBUG 1751904270: typedef void_ip_t added as id 2147490370 > libdtrace DEBUG 1751904270: typedef __dtrace_tcp_void_ip_t added as id > 2147490371 > libdtrace DEBUG 1751904270: typedef caddr_t added as id 2147490380 > libdtrace DEBUG 1751904270: typedef bufinfo_t added as id 2147490382 > libdtrace DEBUG 1751904270: typedef devinfo_t added as id 2147490385 > > > Notice that the ip.d typedefs are added after the tcp ones, despite > tcp.d having a "#pragma D depends_on provider ip.d . So it seems like > the "depends_on library net.d" was honoured above, but not the > "depends_on provider ip.d". Moving the core net-generic definitions to Ah, but depends_on provider is different from depends_on library. The dependency on the provider means that DTrace will not load the .d file if the provider is not available. The dependency on a library file affects the sorting, because it indicates that the given library file needs to be loaded before the one that specifies the depends_on. > net.d alone was not enough however; including the explicit casts to the > argument to inet_ntoa() (to ipaddr_t *) in tcp.d was needed also. > > Also, I do not understand why you removed the pragma > > #pragma D depends_on provider tcp > > from tcp.d. > > Yeah, not sure why that got removed; I'll add it again for v3. > > Alan ^ permalink raw reply [flat|nested] 21+ messages in thread
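The distinction drawn in the message above between the two pragmas can be modelled with a toy sketch. It is not libdtrace's implementation: the load_order function and its three-column input format are made up for illustration, and tsort(1) stands in for whatever sorting libdtrace actually performs. It also does not model what happens to libraries that depend on a skipped one.

```shell
#!/bin/bash
# Toy model, not libdtrace code. Each stdin line is one dependency
# (hypothetical format):
#   <lib> library <other>    <other> must be loaded before <lib>
#   <lib> provider <name>    <lib> is skipped unless <name> is available
# $1 is a space-separated list of available providers.
load_order() {
    local have=" $1 " spec lib missing
    spec=$(cat)
    # Only "library" lines create ordering edges for tsort. The extra
    # "lib lib" self-pair tells tsort the item exists without ordering it.
    # "provider" lines never affect the order; they only gate loading,
    # which is checked per library in the loop below.
    printf '%s\n' "$spec" |
        awk '$2 == "library" { print $3, $1 } NF { print $1, $1 }' |
        tsort |
        while read -r lib; do
            missing=$(printf '%s\n' "$spec" |
                awk -v lib="$lib" -v have="$have" \
                    '$1 == lib && $2 == "provider" && !index(have, " " $3 " ") { print $3 }')
            if [ -z "$missing" ]; then
                echo "$lib"
            fi
        done
}
```

With input lines such as "tcp.d library net.d", "tcp.d provider tcp", "ip.d library net.d" and "ip.d provider ip", and "ip tcp" as the available providers, net.d comes out first, but nothing fixes the relative order of ip.d and tcp.d. In this model, getting ip.d loaded ahead of tcp.d would take a library dependency, not a provider one.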
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-07 16:53 ` Kris Van Hees @ 2025-07-07 18:14 ` Alan Maguire 2025-07-07 19:55 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Alan Maguire @ 2025-07-07 18:14 UTC (permalink / raw) To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel On 07/07/2025 17:53, Kris Van Hees wrote: > On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: >> On 03/07/2025 23:36, Kris Van Hees wrote: >>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: >>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: >>>>> On 03/07/2025 20:03, Kris Van Hees wrote: >>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: >>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote: >>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: >>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: >>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: >>>>>>>>>> >>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: >>>>>>>>>>> >>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the >>>>>>>>>>>> patch 3/4 feedback). >>>>>>>>>>>> >>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip >>>>>>>>>>> send probes? >>>>>>>>>> >>>>>>>>>> dtrace: failed to compile script /dev/stdin: >>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>>>>>>> Unknown type name >>>>>>>>>> >>>>>>>>> >>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we >>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer >>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>>>>>>> >>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages >>>>>>>> this I believe? 
>>>>>>>> >>>>>>> >>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a >>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t >>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't >>>>>>> work either. Seems we need the core typedef + pointer addition or we hit >>>>>>> this failure. >>>>>> >>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, >>>>>> you should be set. That is what I did in my priliminary tcp provider impl. >>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d >>>>>> translators and that works with that typedef in the file, so this really ought >>>>>> to work. >>>> >>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error >>>>> in one test and I now hit it consistently for all tcp/ip tests >>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. >>>>> >>>>> My assumption (probably wrong) is that the include of the library does >>>>> happen but nothing triggers the pointer type generation for "ipaddr *" >>>>> in the CTF dict. If there was a way to force that type generation at the >>>>> .d file level that would be great, not sure I see a way currently tho. >>>> >>>> Well, like I said, it does work for ip.d so I don't see why this would be >>>> any different. I'll have a look and see if I can figure something out. >>> >>> Looking into this more, I think the problem is simply that you did not sync >>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and >>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have >>> your change, it will fail. >>> >> >> No, don't think that's it; the .d files that matched the kernel I tested >> on (6.10) were synced; the use of the 6.10 .d files was visible in the >> error message. 
The problem appears to be around the fact that tcp.d uses >> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in >> translated types) it does not have any other mention of ipaddr_t. >> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to >> ipaddr_t * resolves the issue without having to add ipaddr_t to the core >> type list. > > Can you reproduce this at will? Can you give me specifics on OL version, > kernel version, etc? I'd like to be able to reproduce what you see, because > so far, all I tried actually works once the ipaddr_t typedef is in net.d. > Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that environment. The extra casts for the inet_ntoa() parameters that I mention above are needed in tcp.d to get things to work properly for me. I pushed a branch to https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken that illustrates the failure. Relative to devel, it consists of 6 commits: (1) the v2 of the remote IP address change (ensuring the remote address tests won't fail); (2-4) a few prep patches for the tcp provider; (5) the tcp provider patch (in a v3 work-in-progress form); and finally (6) the top-level commit, which removes the casts I added to tcp.d in the previous "tcp: new provider" commit. With that change in place on my system, the previously-passing IP tests start failing. If I "git reset --hard HEAD~1" on that branch (reestablishing those ipaddr_t * casts) and rebuild, the failures go away for me. 
>> Further debug logging shows that the actual type addition happens in an
>> order which can potentially cause problems in some cases; for
>> example with the above changes I see:
>>
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/errno.d sorted (1/2)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/io.d sorted (3/4)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/ip.d sorted (5/6)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/lockstat.d sorted (7/8)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/tcp.d sorted (10/11)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/udp.d sorted (12/13)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/net.d sorted (9/14)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/pcap.d sorted (15/16)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/procfs.d sorted (17/18)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/regs.d sorted (19/20)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/sched.d sorted (21/22)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/signal.d sorted (23/24)
>> libdtrace DEBUG 1751904270: library /usr/lib64/dtrace/6.10/unistd.d sorted (25/26)
>> libdtrace DEBUG 1751904270: typedef processorid_t added as id 2147483677
>> libdtrace DEBUG 1751904270: typedef psetid_t added as id 2147483678
>> libdtrace DEBUG 1751904270: typedef chipid_t added as id 2147483679
>> libdtrace DEBUG 1751904270: typedef lgrp_id_t added as id 2147483680
>> libdtrace DEBUG 1751904270: typedef cpuinfo_t added as id 2147483682
>> libdtrace DEBUG 1751904270: typedef cpuinfo_t_p added as id 2147483684
>> libdtrace DEBUG 1751904270: typedef time_t added as id 2147483688
>> libdtrace DEBUG 1751904270: typedef timestruc_t added as id 2147483690
>> libdtrace DEBUG 1751904270: typedef lwpsinfo_t added as id 2147483695
>> libdtrace DEBUG 1751904270: typedef taskid_t added as id 2147483696
>> libdtrace DEBUG 1751904270: typedef dprojid_t added as id 2147483697
>> libdtrace DEBUG 1751904270: typedef poolid_t added as id 2147483698
>> libdtrace DEBUG 1751904270: typedef zoneid_t added as id 2147483699
>> libdtrace DEBUG 1751904270: typedef psinfo_t added as id 2147490324
>>
>> [edit: following typedefs come from net.d:]
>>
>> libdtrace DEBUG 1751904270: typedef conninfo_t added as id 2147490329
>> libdtrace DEBUG 1751904270: typedef netstackid_t added as id 2147490330
>> libdtrace DEBUG 1751904270: typedef ipaddr_t added as id 2147490331
>> libdtrace DEBUG 1751904270: typedef in6_addr_t added as id 2147490332
>> libdtrace DEBUG 1751904270: typedef pktinfo_t added as id 2147490334
>> libdtrace DEBUG 1751904270: typedef csinfo_t added as id 2147490336
>> libdtrace DEBUG 1751904270: skipping library /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: program requires provider udp
>>
>> [edit: following typedefs come from tcp.d:]
>>
>> libdtrace DEBUG 1751904270: typedef tcpinfo_t added as id 2147490338
>> libdtrace DEBUG 1751904270: typedef tcpsinfo_t added as id 2147490340
>> libdtrace DEBUG 1751904270: typedef tcplsinfo_t added as id 2147490342
>>
>> [edit: following typedefs come from ip.d:]
>>
>> libdtrace DEBUG 1751904270: typedef ipinfo_t added as id 2147490350
>> libdtrace DEBUG 1751904270: typedef ifinfo_t added as id 2147490352
>> libdtrace DEBUG 1751904270: typedef ipv4info_t added as id 2147490361
>> libdtrace DEBUG 1751904270: typedef ipv6info_t added as id 2147490369
>> libdtrace DEBUG 1751904270: typedef void_ip_t added as id 2147490370
>> libdtrace DEBUG 1751904270: typedef __dtrace_tcp_void_ip_t added as id 2147490371
>> libdtrace DEBUG 1751904270: typedef caddr_t added as id 2147490380
>> libdtrace DEBUG 1751904270: typedef bufinfo_t added as id 2147490382
>> libdtrace DEBUG 1751904270: typedef devinfo_t added as id 2147490385
>>
>>
>> Notice that the ip.d typedefs are added after the tcp ones, despite
>> tcp.d having a "#pragma D depends_on provider ip.d". So it seems like
>> the "depends_on library net.d" was honoured above, but not the
>> "depends_on provider ip.d". Moving the core net-generic definitions to
>
> Ah, but depends_on provider is different from depends_on library. The
> dependency on the provider means that DTrace will not load the .d file if
> the provider is not available. The dependency on a library file affects
> the sorting, because it indicates that the given library file needs to be
> loaded before the one that specifies the depends_on.
>

okay, good to know, thanks!

>> net.d alone was not enough however; including the explicit casts to the
>> argument to inet_ntoa() (to ipaddr_t *) in tcp.d was needed also.
>>
> Also, I do not understand why you removed the pragma
>>> #pragma D depends_on provider tcp
>>> from tcp.d.
>>
>> Yeah, not sure why that got removed; I'll add it again for v3.
>>
>> Alan

^ permalink raw reply [flat|nested] 21+ messages in thread
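[Editorial sketch, not part of the thread.] The distinction drawn above between "depends_on library" (affects load order) and "depends_on provider" (only gates whether a file is loaded) can be modelled roughly as below. This is a toy model, not libdtrace's actual algorithm: the dependency pairs, the available_providers list, and the rule that each library needs the provider matching its own file name are all simplifying assumptions made for illustration. The skip message merely imitates the wording of the debug log quoted above.

```shell
#!/bin/sh
# Toy model only: "depends_on library" edges, one "A B" pair per line,
# meaning A must be loaded before B. tsort(1) produces a valid order.
load_order() {
    tsort <<'EOF'
net.d ip.d
net.d tcp.d
net.d udp.d
EOF
}

# Assumption for the sketch: these providers exist on the system.
available_providers="ip tcp"

loaded=""
for lib in $(load_order); do
    # Hypothetical rule: foo.d requires provider "foo"; net.d requires none.
    provider=${lib%.d}
    if [ "$lib" = "net.d" ]; then
        loaded="$loaded $lib"
    else
        case " $available_providers " in
        *" $provider "*) loaded="$loaded $lib" ;;
        # A missing provider skips the library; it never reorders anything.
        *) echo "skipping library $lib: program requires provider $provider" ;;
        esac
    fi
done
echo "load order:$loaded"
```

In this model net.d always comes first because of the library edges, while udp.d is dropped because its provider is absent; the relative order of ip.d and tcp.d is left to tsort, which matches the point that a provider dependency gives no ordering guarantee.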
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-07 18:14 ` Alan Maguire @ 2025-07-07 19:55 ` Kris Van Hees 2025-07-07 21:51 ` Alan Maguire 0 siblings, 1 reply; 21+ messages in thread From: Kris Van Hees @ 2025-07-07 19:55 UTC (permalink / raw) To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote: > On 07/07/2025 17:53, Kris Van Hees wrote: > > On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: > >> On 03/07/2025 23:36, Kris Van Hees wrote: > >>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: > >>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: > >>>>> On 03/07/2025 20:03, Kris Van Hees wrote: > >>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: > >>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote: > >>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: > >>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: > >>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: > >>>>>>>>>> > >>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: > >>>>>>>>>>> > >>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the > >>>>>>>>>>>> patch 3/4 feedback). > >>>>>>>>>>>> > >>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip > >>>>>>>>>>> send probes? > >>>>>>>>>> > >>>>>>>>>> dtrace: failed to compile script /dev/stdin: > >>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > >>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): > >>>>>>>>>> Unknown type name > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we > >>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer > >>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > >>>>>>>> > >>>>>>>> Why can't we rely on the pragma? That is how e.g. 
the ip provider manages > >>>>>>>> this I believe? > >>>>>>>> > >>>>>>> > >>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a > >>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > >>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t > >>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't > >>>>>>> work either. Seems we need the core typedef + pointer addition or we hit > >>>>>>> this failure. > >>>>>> > >>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > >>>>>> you should be set. That is what I did in my priliminary tcp provider impl. > >>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d > >>>>>> translators and that works with that typedef in the file, so this really ought > >>>>>> to work. > >>>> > >>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error > >>>>> in one test and I now hit it consistently for all tcp/ip tests > >>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. > >>>>> > >>>>> My assumption (probably wrong) is that the include of the library does > >>>>> happen but nothing triggers the pointer type generation for "ipaddr *" > >>>>> in the CTF dict. If there was a way to force that type generation at the > >>>>> .d file level that would be great, not sure I see a way currently tho. > >>>> > >>>> Well, like I said, it does work for ip.d so I don't see why this would be > >>>> any different. I'll have a look and see if I can figure something out. > >>> > >>> Looking into this more, I think the problem is simply that you did not sync > >>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and > >>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have > >>> your change, it will fail. 
> >>> > >> > >> No, don't think that's it; the .d files that matched the kernel I tested > >> on (6.10) were synced; the use of the 6.10 .d files was visible in the > >> error message. The problem appears to be around the fact that tcp.d uses > >> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in > >> translated types) it does not have any other mention of ipaddr_t. > >> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to > >> ipaddr_t * resolves the issue without having to add ipaddr_t to the core > >> type list. > > > > Can you reproduce this at will? Can you give me specifics on OL version, > > kernel version, etc? I'd like to be able to reproduce what you see, because > > so far, all I tried actually works once the ipaddr_t typedef is in net.d. > > > > Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel > + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that > environment. The extra casts for the inet_ntoa() parameters that I > mention above are needed in tcp.d to get things to work properly for me. > > I pushed a branch to > > https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken > > that illustrates the failure. > > Relative to devel, it consists of 6 commits > > 1: the v2 of the remote IP address change (ensuring the remote address > tests won't fail); > 2-4: a few prep patches for the tcp provider; and > 5: the tcp provider patch (in a v3 work-in-progress form); and finally > 6: the top-level commit then removes the casts I added to tcp.d in the > previous "tcp: new provider" commit. With that change in place on my > system, the previously-passing IP tests start failing. > > If I "git reset --hard HEAD~1" on that branch (reestablishing those > ipaddr_t * casts) and rebuild, the failures go away for me. 
I tested your tree on Debian with the 6.15 kernel, and this is the result:

$ uname -a
Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
$ cat test/log/current/runtest.sum
dtrace: Oracle D 2.0
This is DTrace 2.0.1
dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc

test/unittest/tcp/tst.ipv4localtcp.sh: PASS.
test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS.
test/unittest/tcp/tst.ipv4remotetcp.sh: PASS.
test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS.
test/unittest/tcp/tst.ipv6localtcp.sh: PASS.
test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS.
6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP)

I will try to get 6.15 on an OL9 instance and try there, but either way, I
have a feeling there is a binutils (libctf) discrepancy somewhere? What
version of binutils is installed on your system (nm -V)?

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-07 19:55 ` Kris Van Hees @ 2025-07-07 21:51 ` Alan Maguire 2025-07-08 1:34 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Alan Maguire @ 2025-07-07 21:51 UTC (permalink / raw) To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel On 07/07/2025 20:55, Kris Van Hees wrote: > On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote: >> On 07/07/2025 17:53, Kris Van Hees wrote: >>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: >>>> On 03/07/2025 23:36, Kris Van Hees wrote: >>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: >>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: >>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote: >>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: >>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote: >>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: >>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: >>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the >>>>>>>>>>>>>> patch 3/4 feedback). >>>>>>>>>>>>>> >>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip >>>>>>>>>>>>> send probes? >>>>>>>>>>>> >>>>>>>>>>>> dtrace: failed to compile script /dev/stdin: >>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>>>>>>>>> Unknown type name >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we >>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer >>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>>>>>>>>> >>>>>>>>>> Why can't we rely on the pragma? 
That is how e.g. the ip provider manages >>>>>>>>>> this I believe? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a >>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t >>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't >>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit >>>>>>>>> this failure. >>>>>>>> >>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, >>>>>>>> you should be set. That is what I did in my priliminary tcp provider impl. >>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d >>>>>>>> translators and that works with that typedef in the file, so this really ought >>>>>>>> to work. >>>>>> >>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error >>>>>>> in one test and I now hit it consistently for all tcp/ip tests >>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. >>>>>>> >>>>>>> My assumption (probably wrong) is that the include of the library does >>>>>>> happen but nothing triggers the pointer type generation for "ipaddr *" >>>>>>> in the CTF dict. If there was a way to force that type generation at the >>>>>>> .d file level that would be great, not sure I see a way currently tho. >>>>>> >>>>>> Well, like I said, it does work for ip.d so I don't see why this would be >>>>>> any different. I'll have a look and see if I can figure something out. >>>>> >>>>> Looking into this more, I think the problem is simply that you did not sync >>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and >>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have >>>>> your change, it will fail. 
>>>>> >>>> >>>> No, don't think that's it; the .d files that matched the kernel I tested >>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the >>>> error message. The problem appears to be around the fact that tcp.d uses >>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in >>>> translated types) it does not have any other mention of ipaddr_t. >>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to >>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core >>>> type list. >>> >>> Can you reproduce this at will? Can you give me specifics on OL version, >>> kernel version, etc? I'd like to be able to reproduce what you see, because >>> so far, all I tried actually works once the ipaddr_t typedef is in net.d. >>> >> >> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel >> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that >> environment. The extra casts for the inet_ntoa() parameters that I >> mention above are needed in tcp.d to get things to work properly for me. >> >> I pushed a branch to >> >> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken >> >> that illustrates the failure. >> >> Relative to devel, it consists of 6 commits >> >> 1: the v2 of the remote IP address change (ensuring the remote address >> tests won't fail); >> 2-4: a few prep patches for the tcp provider; and >> 5: the tcp provider patch (in a v3 work-in-progress form); and finally >> 6: the top-level commit then removes the casts I added to tcp.d in the >> previous "tcp: new provider" commit. With that change in place on my >> system, the previously-passing IP tests start failing. >> >> If I "git reset --hard HEAD~1" on that branch (reestablishing those >> ipaddr_t * casts) and rebuild, the failures go away for me. 
> > I tested your tree on Debian with the 6.15 kernel, and this is the result: > > $ uname -a > Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > $ cat test/log/current/runtest.sum > dtrace: Oracle D 2.0 > This is DTrace 2.0.1 > dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > > test/unittest/tcp/tst.ipv4localtcp.sh: PASS. > test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS. > test/unittest/tcp/tst.ipv4remotetcp.sh: PASS. > test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS. > test/unittest/tcp/tst.ipv6localtcp.sh: PASS. > test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS. > 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP) > > I will try to get 6.15 on an OL9 instance and try there, but either way, I > have a feeling there is a binutils (libctf) discrepancy somewhere? What could be; see below.. > version of binutils is installed on your system (nm -V)? $ nm -V GNU nm version 2.35.2-42.0.1.el9 Copyright (C) 2020 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or (at your option) any later version. This program has absolutely no warranty. Let me know if you need any more info. Thanks! Alan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-07 21:51 ` Alan Maguire @ 2025-07-08 1:34 ` Kris Van Hees 2025-07-08 17:19 ` Alan Maguire 0 siblings, 1 reply; 21+ messages in thread From: Kris Van Hees @ 2025-07-08 1:34 UTC (permalink / raw) To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote: > On 07/07/2025 20:55, Kris Van Hees wrote: > > On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote: > >> On 07/07/2025 17:53, Kris Van Hees wrote: > >>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: > >>>> On 03/07/2025 23:36, Kris Van Hees wrote: > >>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: > >>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: > >>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote: > >>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: > >>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote: > >>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: > >>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: > >>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the > >>>>>>>>>>>>>> patch 3/4 feedback). > >>>>>>>>>>>>>> > >>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip > >>>>>>>>>>>>> send probes? > >>>>>>>>>>>> > >>>>>>>>>>>> dtrace: failed to compile script /dev/stdin: > >>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > >>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): > >>>>>>>>>>>> Unknown type name > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. 
Basically we > >>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer > >>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > >>>>>>>>>> > >>>>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages > >>>>>>>>>> this I believe? > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a > >>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > >>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t > >>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't > >>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit > >>>>>>>>> this failure. > >>>>>>>> > >>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > >>>>>>>> you should be set. That is what I did in my priliminary tcp provider impl. > >>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d > >>>>>>>> translators and that works with that typedef in the file, so this really ought > >>>>>>>> to work. > >>>>>> > >>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error > >>>>>>> in one test and I now hit it consistently for all tcp/ip tests > >>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. > >>>>>>> > >>>>>>> My assumption (probably wrong) is that the include of the library does > >>>>>>> happen but nothing triggers the pointer type generation for "ipaddr *" > >>>>>>> in the CTF dict. If there was a way to force that type generation at the > >>>>>>> .d file level that would be great, not sure I see a way currently tho. > >>>>>> > >>>>>> Well, like I said, it does work for ip.d so I don't see why this would be > >>>>>> any different. I'll have a look and see if I can figure something out. 
> >>>>> > >>>>> Looking into this more, I think the problem is simply that you did not sync > >>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and > >>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have > >>>>> your change, it will fail. > >>>>> > >>>> > >>>> No, don't think that's it; the .d files that matched the kernel I tested > >>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the > >>>> error message. The problem appears to be around the fact that tcp.d uses > >>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in > >>>> translated types) it does not have any other mention of ipaddr_t. > >>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to > >>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core > >>>> type list. > >>> > >>> Can you reproduce this at will? Can you give me specifics on OL version, > >>> kernel version, etc? I'd like to be able to reproduce what you see, because > >>> so far, all I tried actually works once the ipaddr_t typedef is in net.d. > >>> > >> > >> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel > >> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that > >> environment. The extra casts for the inet_ntoa() parameters that I > >> mention above are needed in tcp.d to get things to work properly for me. > >> > >> I pushed a branch to > >> > >> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken > >> > >> that illustrates the failure. > >> > >> Relative to devel, it consists of 6 commits > >> > >> 1: the v2 of the remote IP address change (ensuring the remote address > >> tests won't fail); > >> 2-4: a few prep patches for the tcp provider; and > >> 5: the tcp provider patch (in a v3 work-in-progress form); and finally > >> 6: the top-level commit then removes the casts I added to tcp.d in the > >> previous "tcp: new provider" commit. 
With that change in place on my > >> system, the previously-passing IP tests start failing. > >> > >> If I "git reset --hard HEAD~1" on that branch (reestablishing those > >> ipaddr_t * casts) and rebuild, the failures go away for me. > > > > I tested your tree on Debian with the 6.15 kernel, and this is the result: > > > > $ uname -a > > Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > > $ cat test/log/current/runtest.sum > > dtrace: Oracle D 2.0 > > This is DTrace 2.0.1 > > dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > > libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > > Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > > testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > > > > test/unittest/tcp/tst.ipv4localtcp.sh: PASS. > > test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS. > > test/unittest/tcp/tst.ipv4remotetcp.sh: PASS. > > test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS. > > test/unittest/tcp/tst.ipv6localtcp.sh: PASS. > > test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS. > > 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP) > > > > I will try to get 6.15 on an OL9 instance and try there, but either way, I > > have a feeling there is a binutils (libctf) discrepancy somewhere? What > > could be; see below.. > > > version of binutils is installed on your system (nm -V)? > > $ nm -V > GNU nm version 2.35.2-42.0.1.el9 > Copyright (C) 2020 Free Software Foundation, Inc. > This program is free software; you may redistribute it under the terms of > the GNU General Public License version 3 or (at your option) any later > version. > This program has absolutely no warranty. > > Let me know if you need any more info. Thanks! > > Alan Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing, the tests work. 
$ nm -V
GNU nm version 2.35.2-63.0.1.el9
Copyright (C) 2020 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.

So I think you need to yum update your system?

^ permalink raw reply [flat|nested] 21+ messages in thread
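[Editorial sketch, not part of the thread.] The two "nm -V" outputs quoted in this exchange differ only in the package release (2.35.2-42.0.1.el9 versus 2.35.2-63.0.1.el9). A small way to compare such strings is shown below. It assumes the first line of output has the form "GNU nm version X", as in the thread (other distributions may print a different banner), and relies on "sort -V" from GNU coreutils. The nm_version helper is a name made up for this sketch. Whether the older build actually explains the failure is not established by this comparison.

```shell
#!/bin/sh
# Extract the version from the first line of "nm -V"-style output.
# Assumes the "GNU nm version X" banner format seen in the thread.
nm_version() {
    sed -n '1s/^GNU nm version //p'
}

# Banner lines as quoted in the thread (failing and working systems).
failing=$(printf 'GNU nm version 2.35.2-42.0.1.el9\n' | nm_version)
working=$(printf 'GNU nm version 2.35.2-63.0.1.el9\n' | nm_version)

# sort -V orders version strings; the first line is the older one.
older=$(printf '%s\n' "$failing" "$working" | sort -V | head -n 1)

if [ "$older" = "$failing" ] && [ "$failing" != "$working" ]; then
    echo "binutils $failing is older than $working"
fi
```

On a live system the same helper could be fed from "nm -V | nm_version" instead of the hard-coded banner lines.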
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach 2025-07-08 1:34 ` Kris Van Hees @ 2025-07-08 17:19 ` Alan Maguire 2025-07-08 17:30 ` Kris Van Hees 0 siblings, 1 reply; 21+ messages in thread From: Alan Maguire @ 2025-07-08 17:19 UTC (permalink / raw) To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel On 08/07/2025 02:34, Kris Van Hees wrote: > On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote: >> On 07/07/2025 20:55, Kris Van Hees wrote: >>> On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote: >>>> On 07/07/2025 17:53, Kris Van Hees wrote: >>>>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote: >>>>>> On 03/07/2025 23:36, Kris Van Hees wrote: >>>>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote: >>>>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote: >>>>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote: >>>>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote: >>>>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote: >>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote: >>>>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote: >>>>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the >>>>>>>>>>>>>>>> patch 3/4 feedback). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip >>>>>>>>>>>>>>> send probes? >>>>>>>>>>>>>> >>>>>>>>>>>>>> dtrace: failed to compile script /dev/stdin: >>>>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>>>>>>>>>>> Unknown type name >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. 
Basically we >>>>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer >>>>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>>>>>>>>>>> >>>>>>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages >>>>>>>>>>>> this I believe? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a >>>>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >>>>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t >>>>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't >>>>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit >>>>>>>>>>> this failure. >>>>>>>>>> >>>>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, >>>>>>>>>> you should be set. That is what I did in my priliminary tcp provider impl. >>>>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d >>>>>>>>>> translators and that works with that typedef in the file, so this really ought >>>>>>>>>> to work. >>>>>>>> >>>>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error >>>>>>>>> in one test and I now hit it consistently for all tcp/ip tests >>>>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. >>>>>>>>> >>>>>>>>> My assumption (probably wrong) is that the include of the library does >>>>>>>>> happen but nothing triggers the pointer type generation for "ipaddr *" >>>>>>>>> in the CTF dict. If there was a way to force that type generation at the >>>>>>>>> .d file level that would be great, not sure I see a way currently tho. >>>>>>>> >>>>>>>> Well, like I said, it does work for ip.d so I don't see why this would be >>>>>>>> any different. I'll have a look and see if I can figure something out. 
>>>>>>> >>>>>>> Looking into this more, I think the problem is simply that you did not sync >>>>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and >>>>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have >>>>>>> your change, it will fail. >>>>>>> >>>>>> >>>>>> No, don't think that's it; the .d files that matched the kernel I tested >>>>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the >>>>>> error message. The problem appears to be around the fact that tcp.d uses >>>>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in >>>>>> translated types) it does not have any other mention of ipaddr_t. >>>>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to >>>>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core >>>>>> type list. >>>>> >>>>> Can you reproduce this at will? Can you give me specifics on OL version, >>>>> kernel version, etc? I'd like to be able to reproduce what you see, because >>>>> so far, all I tried actually works once the ipaddr_t typedef is in net.d. >>>>> >>>> >>>> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel >>>> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that >>>> environment. The extra casts for the inet_ntoa() parameters that I >>>> mention above are needed in tcp.d to get things to work properly for me. >>>> >>>> I pushed a branch to >>>> >>>> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken >>>> >>>> that illustrates the failure. >>>> >>>> Relative to devel, it consists of 6 commits >>>> >>>> 1: the v2 of the remote IP address change (ensuring the remote address >>>> tests won't fail); >>>> 2-4: a few prep patches for the tcp provider; and >>>> 5: the tcp provider patch (in a v3 work-in-progress form); and finally >>>> 6: the top-level commit then removes the casts I added to tcp.d in the >>>> previous "tcp: new provider" commit. 
With that change in place on my
>>>> system, the previously-passing IP tests start failing.
>>>>
>>>> If I "git reset --hard HEAD~1" on that branch (reestablishing those
>>>> ipaddr_t * casts) and rebuild, the failures go away for me.
>>>
>>> I tested your tree on Debian with the 6.15 kernel, and this is the result:
>>>
>>> $ uname -a
>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
>>> $ cat test/log/current/runtest.sum
>>> dtrace: Oracle D 2.0
>>> This is DTrace 2.0.1
>>> dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>> libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
>>> testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>>
>>> test/unittest/tcp/tst.ipv4localtcp.sh: PASS.
>>> test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS.
>>> test/unittest/tcp/tst.ipv4remotetcp.sh: PASS.
>>> test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS.
>>> test/unittest/tcp/tst.ipv6localtcp.sh: PASS.
>>> test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS.
>>> 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP)
>>>
>>> I will try to get 6.15 on an OL9 instance and try there, but either way, I
>>> have a feeling there is a binutils (libctf) discrepancy somewhere? What
>>
>> could be; see below..
>>
>>> version of binutils is installed on your system (nm -V)?
>>
>> $ nm -V
>> GNU nm version 2.35.2-42.0.1.el9
>> Copyright (C) 2020 Free Software Foundation, Inc.
>> This program is free software; you may redistribute it under the terms of
>> the GNU General Public License version 3 or (at your option) any later
>> version.
>> This program has absolutely no warranty.
>>
>> Let me know if you need any more info. Thanks!
>>
>> Alan
>
>
> Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing,
> the tests work.
>
> $ nm -V
> GNU nm version 2.35.2-63.0.1.el9
> Copyright (C) 2020 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or (at your option) any later version.
> This program has absolutely no warranty.
>
> So I think you need to yum update your system?

I think I may have found another clue to why it's happening. I tried on
a gcc-toolset-14 -built system, with

$ nm -V
GNU nm version 2.41-3.el9
Copyright (C) 2023 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later
version.
This program has absolutely no warranty.


Now I can run the following fine:

# build/dtrace -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END { printa(@c); }'
dtrace: description 'ip:::send ' matched 2 probes

However, if I add a syslibdir path - as the tests do when they execute -
I see

$ build/dtrace -xsyslibdir=$(pwd)/build/dlibs -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END { printa(@c); }'
dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END { printa(@c); }: "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name

using -xdebug I see a lot less types added in the error case; i.e. only
the following are added when processing .d files:

libdtrace DEBUG 1751994411: typedef conninfo_t added as id 2147483678
libdtrace DEBUG 1751994411: typedef netstackid_t added as id 2147483679
libdtrace DEBUG 1751994411: typedef ipaddr_t added as id 2147483683
libdtrace DEBUG 1751994411: typedef in6_addr_t added as id 2147483695
libdtrace DEBUG 1751994411: typedef pktinfo_t added as id 2147483697
libdtrace DEBUG 1751994411: typedef csinfo_t added as id 2147483699
libdtrace DEBUG 1751994411: typedef tcpinfo_t added as id 2147483701
libdtrace DEBUG 1751994411: typedef tcpsinfo_t added as id 2147483703
libdtrace DEBUG 1751994411: typedef tcplsinfo_t added as id 2147483705


versus the good case:

libdtrace DEBUG 1751994399: typedef processorid_t added as id 2147483677
libdtrace DEBUG 1751994399: typedef psetid_t added as id 2147483678
libdtrace DEBUG 1751994399: typedef chipid_t added as id 2147483679
libdtrace DEBUG 1751994399: typedef lgrp_id_t added as id 2147483680
libdtrace DEBUG 1751994399: typedef cpuinfo_t added as id 2147483682
libdtrace DEBUG 1751994399: typedef cpuinfo_t_p added as id 2147483684
libdtrace DEBUG 1751994399: typedef time_t added as id 2147483688
libdtrace DEBUG 1751994399: typedef timestruc_t added as id 2147483690
libdtrace DEBUG 1751994399: typedef lwpsinfo_t added as id 2147483695
libdtrace DEBUG 1751994399: typedef taskid_t added as id 2147483696
libdtrace DEBUG 1751994399: typedef dprojid_t added as id 2147483697
libdtrace DEBUG 1751994399: typedef poolid_t added as id 2147483698
libdtrace DEBUG 1751994399: typedef zoneid_t added as id 2147483699
libdtrace DEBUG 1751994399: typedef psinfo_t added as id 2147490324
libdtrace DEBUG 1751994399: typedef conninfo_t added as id 2147490329
libdtrace DEBUG 1751994399: typedef netstackid_t added as id 2147490330
libdtrace DEBUG 1751994399: typedef ipaddr_t added as id 2147490331
libdtrace DEBUG 1751994399: typedef in6_addr_t added as id 2147490332
libdtrace DEBUG 1751994399: typedef pktinfo_t added as id 2147490334
libdtrace DEBUG 1751994399: typedef csinfo_t added as id 2147490336
libdtrace DEBUG 1751994399: skipping library /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: program requires provider udp
libdtrace DEBUG 1751994399: typedef tcpinfo_t added as id 2147490338
libdtrace DEBUG 1751994399: typedef tcpsinfo_t added as id 2147490340
libdtrace DEBUG 1751994399: typedef tcplsinfo_t added as id 2147490342
libdtrace DEBUG 1751994399: typedef ipinfo_t added as id 2147490350
libdtrace DEBUG 1751994399: typedef ifinfo_t added as id 2147490352
libdtrace DEBUG 1751994399: typedef ipv4info_t added as id 2147490361
libdtrace DEBUG 1751994399: typedef ipv6info_t added as id 2147490369
libdtrace DEBUG 1751994399: typedef void_ip_t added as id 2147490370
libdtrace DEBUG 1751994399: typedef __dtrace_tcp_void_ip_t added as id 2147490371
libdtrace DEBUG 1751994399: typedef caddr_t added as id 2147490380
libdtrace DEBUG 1751994399: typedef bufinfo_t added as id 2147490382
libdtrace DEBUG 1751994399: typedef devinfo_t added as id 2147490385

So it looks like sched.d wasn't processed for example, but weirdly in
the failing case net.d (containing the typedef ipaddr_t) and tcp.d were.

The actual error comes later though, after processing kernel/module BTF:

dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END { printa(@c); }: "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name

And so I checked for differences in the build/dlibs files versus what is
installed, and found none. Maybe the above might help reproduce this at
least. Thanks!

Alan

^ permalink raw reply	[flat|nested] 21+ messages in thread
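The comparison of the two -xdebug typedef lists in the message above can be done mechanically rather than by eye. The following is only a minimal sketch, assuming the debug output of each run has already been saved to a file; the helper name missing_typedefs and the file names bad.log and good.log are hypothetical and not part of dtrace-utils. It relies only on the "libdtrace DEBUG <timestamp>: typedef <name> added as id <id>" line format quoted above.

```shell
# Hypothetical helper (not part of dtrace-utils): print every typedef name
# that is reported in the second debug log but not in the first.
# Only lines of the form
#   libdtrace DEBUG <timestamp>: typedef <name> added as id <id>
# are considered; all other debug lines are ignored.
missing_typedefs() {
    awk '
        $1 == "libdtrace" && $2 == "DEBUG" && $4 == "typedef" {
            if (FILENAME == ARGV[1]) {
                seen[$5] = 1            # first log: remember the name
            } else if (!($5 in seen)) {
                print $5                # second log: name absent from first
                seen[$5] = 1            # report each name only once
            }
        }' "$1" "$2"
}

# Usage (file names are placeholders for the saved debug output):
#   missing_typedefs bad.log good.log
```

Run against the failing and passing logs, this would list the typedefs that only the passing run registered, which may help narrow down which .d libraries were not processed.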
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-08 17:19 ` Alan Maguire
@ 2025-07-08 17:30 ` Kris Van Hees
  2025-07-08 19:04 ` Alan Maguire
  0 siblings, 1 reply; 21+ messages in thread
From: Kris Van Hees @ 2025-07-08 17:30 UTC (permalink / raw)
  To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel

On Tue, Jul 08, 2025 at 06:19:25PM +0100, Alan Maguire wrote:
> On 08/07/2025 02:34, Kris Van Hees wrote:
> > On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote:
> >> On 07/07/2025 20:55, Kris Van Hees wrote:
> >>> On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote:
> >>>> On 07/07/2025 17:53, Kris Van Hees wrote:
> >>>>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote:
> >>>>>> On 03/07/2025 23:36, Kris Van Hees wrote:
> >>>>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote:
> >>>>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote:
> >>>>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote:
> >>>>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote:
> >>>>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote:
> >>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
> >>>>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote:
> >>>>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
> >>>>>>>>>>>>>>>> patch 3/4 feedback).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
> >>>>>>>>>>>>>>> send probes?
> >>>>>>>>>>>>>> > >>>>>>>>>>>>>> dtrace: failed to compile script /dev/stdin: > >>>>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of > >>>>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): > >>>>>>>>>>>>>> Unknown type name > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we > >>>>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer > >>>>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. > >>>>>>>>>>>> > >>>>>>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages > >>>>>>>>>>>> this I believe? > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a > >>>>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is > >>>>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t > >>>>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't > >>>>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit > >>>>>>>>>>> this failure. > >>>>>>>>>> > >>>>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, > >>>>>>>>>> you should be set. That is what I did in my priliminary tcp provider impl. > >>>>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d > >>>>>>>>>> translators and that works with that typedef in the file, so this really ought > >>>>>>>>>> to work. > >>>>>>>> > >>>>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error > >>>>>>>>> in one test and I now hit it consistently for all tcp/ip tests > >>>>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. > >>>>>>>>> > >>>>>>>>> My assumption (probably wrong) is that the include of the library does > >>>>>>>>> happen but nothing triggers the pointer type generation for "ipaddr *" > >>>>>>>>> in the CTF dict. 
If there was a way to force that type generation at the > >>>>>>>>> .d file level that would be great, not sure I see a way currently tho. > >>>>>>>> > >>>>>>>> Well, like I said, it does work for ip.d so I don't see why this would be > >>>>>>>> any different. I'll have a look and see if I can figure something out. > >>>>>>> > >>>>>>> Looking into this more, I think the problem is simply that you did not sync > >>>>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and > >>>>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have > >>>>>>> your change, it will fail. > >>>>>>> > >>>>>> > >>>>>> No, don't think that's it; the .d files that matched the kernel I tested > >>>>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the > >>>>>> error message. The problem appears to be around the fact that tcp.d uses > >>>>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in > >>>>>> translated types) it does not have any other mention of ipaddr_t. > >>>>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to > >>>>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core > >>>>>> type list. > >>>>> > >>>>> Can you reproduce this at will? Can you give me specifics on OL version, > >>>>> kernel version, etc? I'd like to be able to reproduce what you see, because > >>>>> so far, all I tried actually works once the ipaddr_t typedef is in net.d. > >>>>> > >>>> > >>>> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel > >>>> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that > >>>> environment. The extra casts for the inet_ntoa() parameters that I > >>>> mention above are needed in tcp.d to get things to work properly for me. > >>>> > >>>> I pushed a branch to > >>>> > >>>> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken > >>>> > >>>> that illustrates the failure. 
> >>>> > >>>> Relative to devel, it consists of 6 commits > >>>> > >>>> 1: the v2 of the remote IP address change (ensuring the remote address > >>>> tests won't fail); > >>>> 2-4: a few prep patches for the tcp provider; and > >>>> 5: the tcp provider patch (in a v3 work-in-progress form); and finally > >>>> 6: the top-level commit then removes the casts I added to tcp.d in the > >>>> previous "tcp: new provider" commit. With that change in place on my > >>>> system, the previously-passing IP tests start failing. > >>>> > >>>> If I "git reset --hard HEAD~1" on that branch (reestablishing those > >>>> ipaddr_t * casts) and rebuild, the failures go away for me. > >>> > >>> I tested your tree on Debian with the 6.15 kernel, and this is the result: > >>> > >>> $ uname -a > >>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > >>> $ cat test/log/current/runtest.sum > >>> dtrace: Oracle D 2.0 > >>> This is DTrace 2.0.1 > >>> dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > >>> libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > >>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux > >>> testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc > >>> > >>> test/unittest/tcp/tst.ipv4localtcp.sh: PASS. > >>> test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS. > >>> test/unittest/tcp/tst.ipv4remotetcp.sh: PASS. > >>> test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS. > >>> test/unittest/tcp/tst.ipv6localtcp.sh: PASS. > >>> test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS. > >>> 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP) > >>> > >>> I will try to get 6.15 on an OL9 instance and try there, but either way, I > >>> have a feeling there is a binutils (libctf) discrepancy somewhere? What > >> > >> could be; see below.. > >> > >>> version of binutils is installed on your system (nm -V)? 
> >> > >> $ nm -V > >> GNU nm version 2.35.2-42.0.1.el9 > >> Copyright (C) 2020 Free Software Foundation, Inc. > >> This program is free software; you may redistribute it under the terms of > >> the GNU General Public License version 3 or (at your option) any later > >> version. > >> This program has absolutely no warranty. > >> > >> Let me know if you need any more info. Thanks! > >> > >> Alan > > > > > > Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing, > > the tests work. > > > > $ nm -V > > GNU nm version 2.35.2-63.0.1.el9 > > Copyright (C) 2020 Free Software Foundation, Inc. > > This program is free software; you may redistribute it under the terms of > > the GNU General Public License version 3 or (at your option) any later version. > > This program has absolutely no warranty. > > > > So I think you need to yum update your system? > > I think I may have found another clue to why it's happening. I tried on > a gcc-toolset-14 -built system, with > > $ nm -V > GNU nm version 2.41-3.el9 > Copyright (C) 2023 Free Software Foundation, Inc. > This program is free software; you may redistribute it under the terms of > the GNU General Public License version 3 or (at your option) any later > version. > This program has absolutely no warranty. 
> > > Now I can run the following fine: > > # build/dtrace -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ { > @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END { > printa(@c); }' > dtrace: description 'ip:::send ' matched 2 probes > > However, if I add a syslibdir path - as the tests do when they execute - > I see > > $ build/dtrace -xsyslibdir=$(pwd)/build/dlibs -n 'ip:::send > /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr, > args[4]->ipv4_protocol] = count(); } END { printa(@c); }' > dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol == > IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); > } END { printa(@c); }: > "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to > resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name > > using -xdebug I see a lot less types added in the error case; i.e. only > the following are added when processing .d files: > > libdtrace DEBUG 1751994411: typedef conninfo_t added as id 2147483678 > libdtrace DEBUG 1751994411: typedef netstackid_t added as id 2147483679 > libdtrace DEBUG 1751994411: typedef ipaddr_t added as id 2147483683 > libdtrace DEBUG 1751994411: typedef in6_addr_t added as id 2147483695 > libdtrace DEBUG 1751994411: typedef pktinfo_t added as id 2147483697 > libdtrace DEBUG 1751994411: typedef csinfo_t added as id 2147483699 > libdtrace DEBUG 1751994411: typedef tcpinfo_t added as id 2147483701 > libdtrace DEBUG 1751994411: typedef tcpsinfo_t added as id 2147483703 > libdtrace DEBUG 1751994411: typedef tcplsinfo_t added as id 2147483705 > > > versus the good case: > > libdtrace DEBUG 1751994399: typedef processorid_t added as id 2147483677 > libdtrace DEBUG 1751994399: typedef psetid_t added as id 2147483678 > libdtrace DEBUG 1751994399: typedef chipid_t added as id 2147483679 > libdtrace DEBUG 1751994399: typedef lgrp_id_t added as id 2147483680 > libdtrace DEBUG 1751994399: typedef cpuinfo_t added as id 2147483682 
> libdtrace DEBUG 1751994399: typedef cpuinfo_t_p added as id 2147483684 > libdtrace DEBUG 1751994399: typedef time_t added as id 2147483688 > libdtrace DEBUG 1751994399: typedef timestruc_t added as id 2147483690 > libdtrace DEBUG 1751994399: typedef lwpsinfo_t added as id 2147483695 > libdtrace DEBUG 1751994399: typedef taskid_t added as id 2147483696 > libdtrace DEBUG 1751994399: typedef dprojid_t added as id 2147483697 > libdtrace DEBUG 1751994399: typedef poolid_t added as id 2147483698 > libdtrace DEBUG 1751994399: typedef zoneid_t added as id 2147483699 > libdtrace DEBUG 1751994399: typedef psinfo_t added as id 2147490324 > libdtrace DEBUG 1751994399: typedef conninfo_t added as id 2147490329 > libdtrace DEBUG 1751994399: typedef netstackid_t added as id 2147490330 > libdtrace DEBUG 1751994399: typedef ipaddr_t added as id 2147490331 > libdtrace DEBUG 1751994399: typedef in6_addr_t added as id 2147490332 > libdtrace DEBUG 1751994399: typedef pktinfo_t added as id 2147490334 > libdtrace DEBUG 1751994399: typedef csinfo_t added as id 2147490336 > libdtrace DEBUG 1751994399: skipping library > /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: > program requires provider udp > libdtrace DEBUG 1751994399: typedef tcpinfo_t added as id 2147490338 > libdtrace DEBUG 1751994399: typedef tcpsinfo_t added as id 2147490340 > libdtrace DEBUG 1751994399: typedef tcplsinfo_t added as id 2147490342 > libdtrace DEBUG 1751994399: typedef ipinfo_t added as id 2147490350 > libdtrace DEBUG 1751994399: typedef ifinfo_t added as id 2147490352 > libdtrace DEBUG 1751994399: typedef ipv4info_t added as id 2147490361 > libdtrace DEBUG 1751994399: typedef ipv6info_t added as id 2147490369 > libdtrace DEBUG 1751994399: typedef void_ip_t added as id 2147490370 > libdtrace DEBUG 1751994399: typedef __dtrace_tcp_void_ip_t added as id > 2147490371 > libdtrace DEBUG 1751994399: typedef caddr_t added as id 2147490380 > libdtrace DEBUG 1751994399: typedef bufinfo_t added 
as id 2147490382
> libdtrace DEBUG 1751994399: typedef devinfo_t added as id 2147490385
>
> So it looks like sched.d wasn't processed for example, but weirdly in
> the failing case net.d (containing the typedef ipaddr_t) and tcp.d were.
>
> The actual error comes later though, after processing kernel/module BTF:
>
> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
> } END { printa(@c); }:
> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>
> And so I checked for differences in the build/dlibs files versus what is
> installed, and found none. Maybe the above might help reproduce this at
> least. Thanks!

I'll have a look, but when you are using a locally built dtrace, you should
use ./build/run-dtrace so that the correct paths are set up for libdtrace.so
and the dlibs to be found. Otherwise, you end up using the locally built
frontend (dtrace) with the installed libdtrace.so and dlibs. And even when
passing the -xsyslibdir, you still end up using the installed libdtrace.so,
so your testing is not based on the locally built dtrace.

	Kris

^ permalink raw reply	[flat|nested] 21+ messages in thread
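One way to see the mismatch described above (a locally built frontend picking up the installed libdtrace.so) is to ask the dynamic linker which library the binary would resolve in the current environment. This is only a sketch, assuming build/dtrace is dynamically linked against libdtrace as the message implies; the helper name libdtrace_path is hypothetical, and how run-dtrace itself sets up its paths is not shown or assumed here.

```shell
# Hypothetical helper (not part of dtrace-utils): read `ldd` output on stdin
# and print the path each libdtrace shared object resolves to, or
# "not found" when the dynamic linker cannot locate it.
libdtrace_path() {
    awk '$1 ~ /^libdtrace\.so/ && $2 == "=>" {
        print ($3 == "not" ? "not found" : $3)
    }'
}

# Usage, from the top of the source tree:
#   ldd build/dtrace | libdtrace_path
```

A result under a system directory such as /usr/lib64 would suggest the installed library is being used, while a path inside the build tree would suggest the locally built one. Since ldd reflects the environment it is run in, the check is only meaningful when made in the same environment as the dtrace invocation being debugged.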
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-08 17:30 ` Kris Van Hees
@ 2025-07-08 19:04 ` Alan Maguire
  2025-07-08 20:13 ` Kris Van Hees
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2025-07-08 19:04 UTC (permalink / raw)
  To: Kris Van Hees; +Cc: Eugene Loh, dtrace, dtrace-devel

On 08/07/2025 18:30, Kris Van Hees wrote:
> On Tue, Jul 08, 2025 at 06:19:25PM +0100, Alan Maguire wrote:
>> On 08/07/2025 02:34, Kris Van Hees wrote:
>>> On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote:
>>>> On 07/07/2025 20:55, Kris Van Hees wrote:
>>>>> On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote:
>>>>>> On 07/07/2025 17:53, Kris Van Hees wrote:
>>>>>>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote:
>>>>>>>> On 03/07/2025 23:36, Kris Van Hees wrote:
>>>>>>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote:
>>>>>>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote:
>>>>>>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote:
>>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote:
>>>>>>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote:
>>>>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
>>>>>>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote:
>>>>>>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
>>>>>>>>>>>>>>>>>> patch 3/4 feedback).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
>>>>>>>>>>>>>>>>> send probes?
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> dtrace: failed to compile script /dev/stdin: >>>>>>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of >>>>>>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *): >>>>>>>>>>>>>>>> Unknown type name >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we >>>>>>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer >>>>>>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages >>>>>>>>>>>>>> this I believe? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a >>>>>>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is >>>>>>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t >>>>>>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't >>>>>>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit >>>>>>>>>>>>> this failure. >>>>>>>>>>>> >>>>>>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d, >>>>>>>>>>>> you should be set. That is what I did in my priliminary tcp provider impl. >>>>>>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d >>>>>>>>>>>> translators and that works with that typedef in the file, so this really ought >>>>>>>>>>>> to work. >>>>>>>>>> >>>>>>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error >>>>>>>>>>> in one test and I now hit it consistently for all tcp/ip tests >>>>>>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d. >>>>>>>>>>> >>>>>>>>>>> My assumption (probably wrong) is that the include of the library does >>>>>>>>>>> happen but nothing triggers the pointer type generation for "ipaddr *" >>>>>>>>>>> in the CTF dict. 
If there was a way to force that type generation at the >>>>>>>>>>> .d file level that would be great, not sure I see a way currently tho. >>>>>>>>>> >>>>>>>>>> Well, like I said, it does work for ip.d so I don't see why this would be >>>>>>>>>> any different. I'll have a look and see if I can figure something out. >>>>>>>>> >>>>>>>>> Looking into this more, I think the problem is simply that you did not sync >>>>>>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and >>>>>>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have >>>>>>>>> your change, it will fail. >>>>>>>>> >>>>>>>> >>>>>>>> No, don't think that's it; the .d files that matched the kernel I tested >>>>>>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the >>>>>>>> error message. The problem appears to be around the fact that tcp.d uses >>>>>>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in >>>>>>>> translated types) it does not have any other mention of ipaddr_t. >>>>>>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to >>>>>>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core >>>>>>>> type list. >>>>>>> >>>>>>> Can you reproduce this at will? Can you give me specifics on OL version, >>>>>>> kernel version, etc? I'd like to be able to reproduce what you see, because >>>>>>> so far, all I tried actually works once the ipaddr_t typedef is in net.d. >>>>>>> >>>>>> >>>>>> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel >>>>>> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that >>>>>> environment. The extra casts for the inet_ntoa() parameters that I >>>>>> mention above are needed in tcp.d to get things to work properly for me. >>>>>> >>>>>> I pushed a branch to >>>>>> >>>>>> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken >>>>>> >>>>>> that illustrates the failure. 
>>>>>> >>>>>> Relative to devel, it consists of 6 commits >>>>>> >>>>>> 1: the v2 of the remote IP address change (ensuring the remote address >>>>>> tests won't fail); >>>>>> 2-4: a few prep patches for the tcp provider; and >>>>>> 5: the tcp provider patch (in a v3 work-in-progress form); and finally >>>>>> 6: the top-level commit then removes the casts I added to tcp.d in the >>>>>> previous "tcp: new provider" commit. With that change in place on my >>>>>> system, the previously-passing IP tests start failing. >>>>>> >>>>>> If I "git reset --hard HEAD~1" on that branch (reestablishing those >>>>>> ipaddr_t * casts) and rebuild, the failures go away for me. >>>>> >>>>> I tested your tree on Debian with the 6.15 kernel, and this is the result: >>>>> >>>>> $ uname -a >>>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux >>>>> $ cat test/log/current/runtest.sum >>>>> dtrace: Oracle D 2.0 >>>>> This is DTrace 2.0.1 >>>>> dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc >>>>> libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc >>>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux >>>>> testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc >>>>> >>>>> test/unittest/tcp/tst.ipv4localtcp.sh: PASS. >>>>> test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS. >>>>> test/unittest/tcp/tst.ipv4remotetcp.sh: PASS. >>>>> test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS. >>>>> test/unittest/tcp/tst.ipv6localtcp.sh: PASS. >>>>> test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS. >>>>> 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP) >>>>> >>>>> I will try to get 6.15 on an OL9 instance and try there, but either way, I >>>>> have a feeling there is a binutils (libctf) discrepancy somewhere? What >>>> >>>> could be; see below.. >>>> >>>>> version of binutils is installed on your system (nm -V)? 
>>>> >>>> $ nm -V >>>> GNU nm version 2.35.2-42.0.1.el9 >>>> Copyright (C) 2020 Free Software Foundation, Inc. >>>> This program is free software; you may redistribute it under the terms of >>>> the GNU General Public License version 3 or (at your option) any later >>>> version. >>>> This program has absolutely no warranty. >>>> >>>> Let me know if you need any more info. Thanks! >>>> >>>> Alan >>> >>> >>> Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing, >>> the tests work. >>> >>> $ nm -V >>> GNU nm version 2.35.2-63.0.1.el9 >>> Copyright (C) 2020 Free Software Foundation, Inc. >>> This program is free software; you may redistribute it under the terms of >>> the GNU General Public License version 3 or (at your option) any later version. >>> This program has absolutely no warranty. >>> >>> So I think you need to yum update your system? >> >> I think I may have found another clue to why it's happening. I tried on >> a gcc-toolset-14 -built system, with >> >> $ nm -V >> GNU nm version 2.41-3.el9 >> Copyright (C) 2023 Free Software Foundation, Inc. >> This program is free software; you may redistribute it under the terms of >> the GNU General Public License version 3 or (at your option) any later >> version. >> This program has absolutely no warranty. 
>>
>>
>> Now I can run the following fine:
>>
>> # build/dtrace -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ {
>> @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END {
>> printa(@c); }'
>> dtrace: description 'ip:::send ' matched 2 probes
>>
>> However, if I add a syslibdir path - as the tests do when they execute -
>> I see
>>
>> $ build/dtrace -xsyslibdir=$(pwd)/build/dlibs -n 'ip:::send
>> /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr,
>> args[4]->ipv4_protocol] = count(); } END { printa(@c); }'
>> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
>> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
>> } END { printa(@c); }:
>> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
>> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>>
>> using -xdebug I see a lot less types added in the error case; i.e. only
>> the following are added when processing .d files:
>>
>> libdtrace DEBUG 1751994411: typedef conninfo_t added as id 2147483678
>> libdtrace DEBUG 1751994411: typedef netstackid_t added as id 2147483679
>> libdtrace DEBUG 1751994411: typedef ipaddr_t added as id 2147483683
>> libdtrace DEBUG 1751994411: typedef in6_addr_t added as id 2147483695
>> libdtrace DEBUG 1751994411: typedef pktinfo_t added as id 2147483697
>> libdtrace DEBUG 1751994411: typedef csinfo_t added as id 2147483699
>> libdtrace DEBUG 1751994411: typedef tcpinfo_t added as id 2147483701
>> libdtrace DEBUG 1751994411: typedef tcpsinfo_t added as id 2147483703
>> libdtrace DEBUG 1751994411: typedef tcplsinfo_t added as id 2147483705
>>
>>
>> versus the good case:
>>
>> libdtrace DEBUG 1751994399: typedef processorid_t added as id 2147483677
>> libdtrace DEBUG 1751994399: typedef psetid_t added as id 2147483678
>> libdtrace DEBUG 1751994399: typedef chipid_t added as id 2147483679
>> libdtrace DEBUG 1751994399: typedef lgrp_id_t added as id 2147483680
>> libdtrace DEBUG 1751994399: typedef cpuinfo_t added as id 2147483682
>> libdtrace DEBUG 1751994399: typedef cpuinfo_t_p added as id 2147483684
>> libdtrace DEBUG 1751994399: typedef time_t added as id 2147483688
>> libdtrace DEBUG 1751994399: typedef timestruc_t added as id 2147483690
>> libdtrace DEBUG 1751994399: typedef lwpsinfo_t added as id 2147483695
>> libdtrace DEBUG 1751994399: typedef taskid_t added as id 2147483696
>> libdtrace DEBUG 1751994399: typedef dprojid_t added as id 2147483697
>> libdtrace DEBUG 1751994399: typedef poolid_t added as id 2147483698
>> libdtrace DEBUG 1751994399: typedef zoneid_t added as id 2147483699
>> libdtrace DEBUG 1751994399: typedef psinfo_t added as id 2147490324
>> libdtrace DEBUG 1751994399: typedef conninfo_t added as id 2147490329
>> libdtrace DEBUG 1751994399: typedef netstackid_t added as id 2147490330
>> libdtrace DEBUG 1751994399: typedef ipaddr_t added as id 2147490331
>> libdtrace DEBUG 1751994399: typedef in6_addr_t added as id 2147490332
>> libdtrace DEBUG 1751994399: typedef pktinfo_t added as id 2147490334
>> libdtrace DEBUG 1751994399: typedef csinfo_t added as id 2147490336
>> libdtrace DEBUG 1751994399: skipping library
>> /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10:
>> program requires provider udp
>> libdtrace DEBUG 1751994399: typedef tcpinfo_t added as id 2147490338
>> libdtrace DEBUG 1751994399: typedef tcpsinfo_t added as id 2147490340
>> libdtrace DEBUG 1751994399: typedef tcplsinfo_t added as id 2147490342
>> libdtrace DEBUG 1751994399: typedef ipinfo_t added as id 2147490350
>> libdtrace DEBUG 1751994399: typedef ifinfo_t added as id 2147490352
>> libdtrace DEBUG 1751994399: typedef ipv4info_t added as id 2147490361
>> libdtrace DEBUG 1751994399: typedef ipv6info_t added as id 2147490369
>> libdtrace DEBUG 1751994399: typedef void_ip_t added as id 2147490370
>> libdtrace DEBUG 1751994399: typedef __dtrace_tcp_void_ip_t added as id
>> 2147490371
>> libdtrace DEBUG 1751994399: typedef caddr_t added as id 2147490380
>> libdtrace DEBUG 1751994399: typedef bufinfo_t added as id 2147490382
>> libdtrace DEBUG 1751994399: typedef devinfo_t added as id 2147490385
>>
>> So it looks like sched.d wasn't processed for example, but weirdly in
>> the failing case net.d (containing the typedef ipaddr_t) and tcp.d were.
>>
>> The actual error comes later though, after processing kernel/module BTF:
>>
>> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
>> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
>> } END { printa(@c); }:
>> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
>> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>>
>> And so I checked for differences in the build/dlibs files versus what is
>> installed, and found none. Maybe the above might help reproduce this at
>> least. Thanks!
>
> I'll have a look, but when you are using a locally built dtrace, you should use
> ./build/run-dtrace so that the correct paths are set up for libdtrace.so and
> the dlibs to be found. Otherwise, you end up using the locally built frontend
> (dtrace) with the installed libdtrace.so and dlibs. And even when passing
> the -xsyslibdir, you still end up using the installed libdtrace.so, so your
> testing is not based on the locally built dtrace.
>
> Kris

thanks; tried with build/run-dtrace with same result. However by adding
some debug logging I think I've discovered the root cause; the order of
.d file sorting seems to be different in the build/dlibs versus
/usr/lib64/dtrace case, and the problem is that tcp.d actually
implicitly relies on ip.d for ipinfo_t. We get lucky in the sort order for
/usr/lib64/dtrace, and because an ipaddr_t * gets added during ip.d
processing, by the time we lookup "ipaddr_t *" in tcp.d it's already in
the D CTF dict.
I _think_ the ipaddr_t * gets added as a side effect of the fact that
there are fields of type ipaddr_t in the translated ipv4info_t in ip.d.

However, in the problematic case with build/run-dtrace, net.d is still
loaded first, and then tcp.d is loaded immediately after without an
intervening load of ip.d. As a result we have no "ipaddr_t *", hence

dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
} END { printa(@c); }:
"/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name

To fix this I think the right answer is to change the dependency tcp.d
has on ip.d, from

#pragma D depends_on provider ip

to

#pragma D depends_on library ip.d

This is needed for other reasons (ipinfo_t declaration for example), but
with that change the problem is resolved.

Alan
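
The load-order reasoning above can be illustrated with a small shell sketch. It is not libdtrace's own sorting code; it only shows why an explicit "depends_on library" line pins the relative order of two .d files, while a "depends_on provider" line does not. The function names dlib_edges and dlib_order are hypothetical, and the sketch assumes each pragma sits on its own line in exactly the form quoted in this thread.

```sh
#!/bin/sh
# Sketch only: derive a load order for D library files from their
# "#pragma D depends_on library" lines using tsort(1).
# dlib_edges and dlib_order are hypothetical helper names.

# Print "dependency dependent" pairs for every .d file in directory $1.
dlib_edges() {
    for f in "$1"/*.d; do
        [ -e "$f" ] || continue
        name=$(basename "$f")
        # A pair of identical names tells tsort the file exists
        # without imposing any ordering on it.
        echo "$name $name"
        # Only library dependencies create an ordering constraint;
        # "depends_on provider" lines are deliberately ignored here.
        sed -n 's/^#pragma D depends_on library \([^ ]*\).*$/\1/p' "$f" |
        while read -r dep; do
            echo "$dep $name"
        done
    done
}

# Print the .d files of directory $1 in an order that respects the
# library dependencies found by dlib_edges.
dlib_order() {
    dlib_edges "$1" | tsort
}
```

Running dlib_order against a dlibs directory (for example the build/dlibs/6.10 directory mentioned above) prints one file name per line. In this model, a tcp.d that only declares "depends_on provider ip" has no edge to ip.d, so whether ip.d comes first is left to chance; adding "depends_on library ip.d" creates that edge.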
* Re: [DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
  2025-07-08 19:04 ` Alan Maguire
@ 2025-07-08 20:13 ` Kris Van Hees
From: Kris Van Hees @ 2025-07-08 20:13 UTC
To: Alan Maguire; +Cc: Kris Van Hees, Eugene Loh, dtrace, dtrace-devel

On Tue, Jul 08, 2025 at 08:04:41PM +0100, Alan Maguire wrote:

<< omitted >>

> thanks; tried with build/run-dtrace with same result. However by adding
> some debug logging I think I've discovered the root cause; the order of
> .d file sorting seems to be different in the build/dlibs versus
> /usr/lib64/dtrace case, and the problem is that tcp.d actually
> implicitly relies on ip.d for ipinfo_t . We get lucky in the sort order for
> /usr/lib64/dtrace, and because an ipaddr_t * gets added during ip.d
> processing, by the time we lookup "ipaddr_t *" in tcp.d it's already in
> the D CTF dict. I _think_ the ipaddr_t * gets added as a side effect of
> the fact that there are fields of type ipaddr_t in the translated
> ipv4info_t in ip.d
>
> However in the problematic case with build/run-dtrace , net.d is still
> loaded first, and then tcp.d is loaded immediately after without an
> intervening load of ip.d. As a result we have no "ipaddr_t *", hence
>
> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
> } END { printa(@c); }:
> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>
> To fix this I think the right answer is to change the dependency tcp.d
> has on ip.d, from
>
> #pragma D depends_on provider ip
>
> to
>
> #pragma D depends_on library ip.d
>
> This is needed for other reasons (ipinfo_t declaration for example), but
> with that change the problem is resolved.

Sounds like a good solution!
Thread overview: 21+ messages
2025-07-03 11:33 [PATCH] test/utils: add more reliable "get remote address" approach Alan Maguire
2025-07-03 16:43 ` [DTrace-devel] " Eugene Loh
2025-07-03 16:59 ` Alan Maguire
2025-07-03 17:06 ` Eugene Loh
2025-07-03 18:02 ` Alan Maguire
2025-07-03 18:26 ` Kris Van Hees
2025-07-03 18:41 ` Alan Maguire
2025-07-03 19:03 ` Kris Van Hees
2025-07-03 20:23 ` Alan Maguire
2025-07-03 20:59 ` Kris Van Hees
2025-07-03 22:36 ` Kris Van Hees
2025-07-07 16:32 ` Alan Maguire
2025-07-07 16:53 ` Kris Van Hees
2025-07-07 18:14 ` Alan Maguire
2025-07-07 19:55 ` Kris Van Hees
2025-07-07 21:51 ` Alan Maguire
2025-07-08 1:34 ` Kris Van Hees
2025-07-08 17:19 ` Alan Maguire
2025-07-08 17:30 ` Kris Van Hees
2025-07-08 19:04 ` Alan Maguire
2025-07-08 20:13 ` Kris Van Hees