All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kris Van Hees <kris.van.hees@oracle.com>
To: Kris Van Hees <kris.van.hees@oracle.com>
Cc: Eugene Loh <eugene.loh@oracle.com>,
	Alan Maguire <alan.maguire@oracle.com>,
	dtrace@lists.linux.dev, dtrace-devel@oss.oracle.com
Subject: Re: [PATCH v2 3/4] dtrace: add tcp provider
Date: Thu, 3 Jul 2025 11:38:57 -0400	[thread overview]
Message-ID: <aGakEUw2wYzngot5@oracle.com> (raw)
In-Reply-To: <aGah5oXmjKCjhI7L@oracle.com>

On Thu, Jul 03, 2025 at 11:29:42AM -0400, Kris Van Hees wrote:
> Not sure if this would confuse things or help, but here is an early attempt I
> did on implementing a tcp provider:
> 
> 	kvh/2.0-branch-dev-tcp

But... it is only on an internal repo since I never finished it :)  But perhaps
it can help with review.  Probably not worth pushing to github because I do not
think what I did is useful anymore in view of Alan's work.

> >From my recollection it was missing stuff, since it was WIP compared to what
> Alan has worked on.
> 
> On Wed, Jul 02, 2025 at 08:02:00PM -0400, Eugene Loh wrote:
> > On 7/2/25 11:06, Alan Maguire wrote:
> > 
> > > On 02/07/2025 00:16, Eugene Loh wrote:
> > > > On most VMs,
> > > >      test/unittest/tcp/tst.ipv4remotetcp.sh
> > > >      test/unittest/tcp/tst.ipv4remotetcpstate.sh
> > > > xfail due to missing remote.  Are we okay with "shrugging our shoulders"
> > > > like that?
> > > Yeah, I don't think the remote test is robust enough. Specifically in
> > > OCI it seems to always fail. I'd suggest we replace it with creating a
> > > network namespace with IP addresses configured on top of veths to
> > > simulate the remote case, the codepaths will be the same. I've done this
> > > in other test suites and it works well.
> > 
> > Sounds great (if "we" is "you", haha).
> > 
> > > > Meanwhile, my one non-OCI VM ran those tests.  The first test passes.
> > > > The second one consistently reports
> > > >      -tcp:::state-change to time-wait - yes
> > > >      +tcp:::state-change to time-wait - no
> > > I hit some of these failure during development; adding the
> > > fbt::tcp_time_wait:entry probe helped. Is that inlined or something
> > > perhaps (grep tcp_time_wait /proc/kallsyms)?
> > 
> > On the VM in question:
> > 
> > # grep -w tcp_time_wait /proc/kallsyms
> > ffffffff92ad25b0 T tcp_time_wait
> > # dtrace -lP fbt |& grep tcp_time_wait
> > 49373        fbt           vmlinux                     tcp_time_wait return
> > 49372        fbt           vmlinux                     tcp_time_wait entry
> > # dtrace -lP rawfbt |& grep tcp_time_wait
> > 51079     rawfbt           vmlinux                     tcp_time_wait return
> > 51078     rawfbt           vmlinux                     tcp_time_wait entry
> > 
> > > > and occasionally reports stuff like
> > > >      dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
> > > > invalid address (0x1fc0c0000000000) at BPF pc 287
> > > >      dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
> > > > invalid address (0x225b80000000000) at BPF pc 287
> > > > 
> > > ah, ok there must be a null deref somewhere. Haven't seen this before;
> > > what kernel version/arch is this?
> > 
> > 5.15.0-300.161.13.el9uek.x86_64
> > 
> > FWIW, I can comment out all probes in tcp other than:
> > 
> >         { "send", DTRACE_PROBESPEC_NAME,
> > "rawfbt::ip_send_unicast_reply:entry" },
> > 
> > Then I run
> > 
> > dtrace -c "$testdir/client.ip.pl tcp $dest $tcpport" -qn 'tcp:::send
> > /args[2]->ip_saddr == "'$source'"/ { tcpsend++; }'
> > 
> > The disassembly shows that I look up args[2] using dt_bvar_args() (including
> > checking for a fault).  Then we try to dereference args[2]->ip_saddr.  We
> > first check the pointer is non NULL.  Then we call dt_cg_load_scalar() to
> > bpf_probe_read() from the desired location.  This call is problematic.
> > 
> > > > The non-remote tests fail on OL8 UEK6 (x86 and arm).
> > > >      dtrace: failed to compile script /dev/stdin:
> > > >      ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
> > > > inet_ntoa arg#1 (ipaddr_t *):
> > > >      Unknown type name
> > > > 
> > > This is a weird failure; I see it on some systems but not on others.
> > > In tcp.d we have
> > > 
> > > #pragma D depends_on library net.d
> > > 
> > > which contains the typedef for ipaddr_t ; it seems that's not enough to
> > > pull in the typedef reliably. I suspect there is a timing element
> > > involved here in when the net.d library is included. Perhaps there is a
> > > better way to define ipaddr_t ; would using a builtin typedef in
> > > _dtrace_typedefs_32/64 work better perhaps?
> > 
> > Don't know.
> > 
> > > > The probe names are
> > > >      tcp:ip:*:*        Solaris
> > > >      tcp:vmlinux:*:*   DTv1
> > > >      tcp:vmlinux::*    with this patch (that is, no more function)
> > > > I guess precedents have already been set for other SDT providers;  so,
> > > > okay.  Just noting for my own sake.
> > > > Meanwhile, the typed args[] have changed in number and type from Solaris> to DTv1 to this patch.  Does that merit discussion?
> > > Hmm, that's not intentional (aside from the additional INBOUND/OUTBOUND
> > > etc which we use to help inform translation).
> > 
> > Worth mentioning somewhere?
> > 
> > > Do you see other changes aside from them? Thanks!
> > 
> > This is what I have for typed args[] for tcp probes.
> > 
> > The typed probe arguments for probes
> >         accept-[refused|established]
> >         connect-[refused|established|request]
> >         receive
> > are the same as for send.
> > 
> > The typed probe arguments for state-change may be different.
> > 
> > So, the typed probe arguments are (wide screen, fixed-width font):
> > 
> > args[0]:      args[1]:      args[2]:      args[3]: args[4]:     
> > args[5]:      args[6]:      args[7]:
> > 
> >             send Solaris         pktinfo_t *   csinfo_t * ipinfo_t *   
> > tcpsinfo_t *  tcpinfo_t *
> >             send DTv1            (unknown)     (unknown) (unknown)    
> > (unknown)     (unknown)     (unknown) int           int
> >             send DTv2            pktinfo_t *   csinfo_t * ipinfo_t *   
> > tcpsinfo_t *  tcpinfo_t *   int tcplsinfo_t * int
> > 
> >             state-change Solaris void          csinfo_t * void         
> > tcpsinfo_t *  void          tcplsinfo_t *
> >             state-change DTv1    (unknown)     (unknown) (unknown)    
> > (unknown)     (unknown)     (unknown) int           int
> >             state-change DTv2    void      *   csinfo_t * void     *   
> > tcpsinfo_t *  void      *   void * tcplsinfo_t * int
> > 
> > Here, "DTv1" refers to legacy DTrace on Linux.  I guess we can ignore that. 
> > By "DTv2" I mean your patch.  For state-change, Solaris calls some things
> > "void" (not "void *") and tcplsinfo_t* moves from args[5] to args[6].

  reply	other threads:[~2025-07-03 15:39 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-10 13:58 [PATCH v2 0/4] DTrace TCP provider Alan Maguire
2025-06-10 13:58 ` [PATCH v2 1/4] dtrace: move get_member() to dt_cg.c Alan Maguire
2025-07-01 18:23   ` Eugene Loh
2025-06-10 13:58 ` [PATCH v2 2/4] dt_impl: bump number of TSLOTS to 8 Alan Maguire
2025-07-01 18:31   ` Eugene Loh
2025-07-02 14:52     ` Alan Maguire
2025-07-02 20:22       ` Eugene Loh
2025-07-03 15:18         ` Alan Maguire
2025-06-10 13:58 ` [PATCH v2 3/4] dtrace: add tcp provider Alan Maguire
2025-07-01 23:16   ` Eugene Loh
2025-07-02 15:06     ` Alan Maguire
2025-07-03  0:02       ` Eugene Loh
2025-07-03 15:03         ` Alan Maguire
2025-07-03 15:29         ` Kris Van Hees
2025-07-03 15:38           ` Kris Van Hees [this message]
2025-07-03 19:55   ` Eugene Loh
2025-07-07 18:44   ` Kris Van Hees
2025-06-10 13:58 ` [PATCH v2 4/4] dtrace: sync dlibs with tcp.d, ip.d and net.d changes Alan Maguire
2025-07-01 19:08 ` [PATCH v2 0/4] DTrace TCP provider Eugene Loh
2025-07-01 19:27   ` Kris Van Hees
2025-07-02 14:52     ` Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aGakEUw2wYzngot5@oracle.com \
    --to=kris.van.hees@oracle.com \
    --cc=alan.maguire@oracle.com \
    --cc=dtrace-devel@oss.oracle.com \
    --cc=dtrace@lists.linux.dev \
    --cc=eugene.loh@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.