* [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof()
@ 2025-08-26 8:12 Alan Maguire
2025-08-26 17:40 ` Nick Alcock
2025-09-02 18:05 ` Eugene Loh
0 siblings, 2 replies; 5+ messages in thread
From: Alan Maguire @ 2025-08-26 8:12 UTC (permalink / raw)
To: dtrace; +Cc: dtrace-devel, Alan Maguire, Eugene Loh
The tcp provider uses dt_cg_tramp_get_member() to retrieve the
offset of the sk_protocol field in struct sock. However it
returns the wrong value on UEK6 since it is an 8-bit bitfield.
From pahole we see:
unsigned int __sk_flags_offset[0]; /* 560 0 */
unsigned int sk_padding:1; /* 560: 0 4 */
unsigned int sk_kern_sock:1; /* 560: 1 4 */
unsigned int sk_no_check_tx:1; /* 560: 2 4 */
unsigned int sk_no_check_rx:1; /* 560: 3 4 */
unsigned int sk_userlocks:4; /* 560: 4 4 */
unsigned int sk_protocol:8; /* 560: 8 4 */
In other words it is really at offset 561 but because we just
lookup the member offset and not the member type offset we get the
wrong value for the sk_protoocol.
This in turn causes tcp state-change probes (and in-progress UDP
probes) to not fire since we verify that sk_protocol == IPPROTO_TCP.
The fix is to look up the member _type_ offset and add it to the
bit offset we get for the member itself. With this in place the
state-change probes fire, but the local tcp tests still fail due
to separate issues with the tcp:::accept-established probe.
This issue is not seen on more recent kernels because sk_protocol
becomes a __u16 as the number of protocols exceeds 256.
Reported-by: Eugene Loh <eugene.loh@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
libdtrace/dt_cg.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
index cd9e7f4e..7abb5bc6 100644
--- a/libdtrace/dt_cg.c
+++ b/libdtrace/dt_cg.c
@@ -1959,6 +1959,8 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
dtrace_typeinfo_t sym;
ctf_file_t *ctfp;
ctf_membinfo_t ctm;
+ ctf_encoding_t cte;
+ int offset;
if (dtrace_lookup_by_type(yypcb->pcb_hdl, DTRACE_OBJ_EVERY, structname,
&sym))
@@ -1973,6 +1975,13 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
longjmp(yypcb->pcb_jmpbuf, EDT_NOCTF);
}
+ offset = ctm.ctm_offset;
+
+ /* a bitfield may have an additional > 8 bit offset which means we need
+ * to adjust the reported byte offset.
+ */
+ if (ctf_type_encoding(ctfp, ctm.ctm_type, &cte) != CTF_ERR)
+ offset += cte.cte_offset;
if (sizep || ldopp) {
uint_t ldop;
@@ -1982,7 +1991,7 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
*ldopp = ldop;
}
- return (ctm.ctm_offset / NBBY);
+ return (offset / NBBY);
}
static void
dt_cg_act_breakpoint(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
--
2.43.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof()
2025-08-26 8:12 [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof() Alan Maguire
@ 2025-08-26 17:40 ` Nick Alcock
2025-08-28 11:50 ` Alan Maguire
2025-09-02 18:05 ` Eugene Loh
1 sibling, 1 reply; 5+ messages in thread
From: Nick Alcock @ 2025-08-26 17:40 UTC (permalink / raw)
To: Alan Maguire; +Cc: dtrace, dtrace-devel, Eugene Loh
On 26 Aug 2025, Alan Maguire verbalised:
> The tcp provider uses dt_cg_tramp_get_member() to retrieve the
> offset of the sk_protocol field in struct sock. However it
> returns the wrong value on UEK6 since it is an 8-bit bitfield.
> From pahole we see:
>
> unsigned int __sk_flags_offset[0]; /* 560 0 */
> unsigned int sk_padding:1; /* 560: 0 4 */
> unsigned int sk_kern_sock:1; /* 560: 1 4 */
> unsigned int sk_no_check_tx:1; /* 560: 2 4 */
> unsigned int sk_no_check_rx:1; /* 560: 3 4 */
> unsigned int sk_userlocks:4; /* 560: 4 4 */
> unsigned int sk_protocol:8; /* 560: 8 4 */
>
> In other words it is really at offset 561 but because we just
> lookup the member offset and not the member type offset we get the
> wrong value for the sk_protoocol.
Excellent description! It's annoying as hell that BTF still has this
redundant representation :( DTrace literally always got this wrong,
right back to the year dot.
> The fix is to look up the member _type_ offset and add it to the
> bit offset we get for the member itself. With this in place the
> state-change probes fire, but the local tcp tests still fail due
> to separate issues with the tcp:::accept-established probe.
This is certainly necessary (and your code looks good), but we also need
to ensure that the load is suitably shifted and masked, particularly for
bitfields that cross machine words. DTrace never used to do this
either... e.g. the size returned by dt_cg_ldsize() assumes that the
entire bitfield can be read in one load op, which depending on its
offset may not be true.
It looks like dt_cg_field_get might get this *mostly* right (once you
delete the #if 0'ed stuff -- but I'd be suspicious until I'd tested it
on a big-endian platform), but even that assumes it only needs to do one
load. As soon as a bitfield crosses machine words, I suspect we'll only
get the first half of it :( we might need to do a second load to fetch
the second half of the bitfield, then a shift and | to combine the two,
and DTrace has never had any code for that.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof()
2025-08-26 17:40 ` Nick Alcock
@ 2025-08-28 11:50 ` Alan Maguire
0 siblings, 0 replies; 5+ messages in thread
From: Alan Maguire @ 2025-08-28 11:50 UTC (permalink / raw)
To: Nick Alcock; +Cc: dtrace, dtrace-devel, Eugene Loh
On 26/08/2025 18:40, Nick Alcock wrote:
> On 26 Aug 2025, Alan Maguire verbalised:
>
>> The tcp provider uses dt_cg_tramp_get_member() to retrieve the
>> offset of the sk_protocol field in struct sock. However it
>> returns the wrong value on UEK6 since it is an 8-bit bitfield.
>> From pahole we see:
>>
>> unsigned int __sk_flags_offset[0]; /* 560 0 */
>> unsigned int sk_padding:1; /* 560: 0 4 */
>> unsigned int sk_kern_sock:1; /* 560: 1 4 */
>> unsigned int sk_no_check_tx:1; /* 560: 2 4 */
>> unsigned int sk_no_check_rx:1; /* 560: 3 4 */
>> unsigned int sk_userlocks:4; /* 560: 4 4 */
>> unsigned int sk_protocol:8; /* 560: 8 4 */
>>
>> In other words it is really at offset 561 but because we just
>> lookup the member offset and not the member type offset we get the
>> wrong value for the sk_protoocol.
>
> Excellent description! It's annoying as hell that BTF still has this
> redundant representation :( DTrace literally always got this wrong,
> right back to the year dot.
>
>> The fix is to look up the member _type_ offset and add it to the
>> bit offset we get for the member itself. With this in place the
>> state-change probes fire, but the local tcp tests still fail due
>> to separate issues with the tcp:::accept-established probe.
>
> This is certainly necessary (and your code looks good), but we also need
> to ensure that the load is suitably shifted and masked, particularly for
> bitfields that cross machine words. DTrace never used to do this
> either... e.g. the size returned by dt_cg_ldsize() assumes that the
> entire bitfield can be read in one load op, which depending on its
> offset may not be true.
>
> It looks like dt_cg_field_get might get this *mostly* right (once you
> delete the #if 0'ed stuff -- but I'd be suspicious until I'd tested it
> on a big-endian platform), but even that assumes it only needs to do one
> load. As soon as a bitfield crosses machine words, I suspect we'll only
> get the first half of it :( we might need to do a second load to fetch
> the second half of the bitfield, then a shift and | to combine the two,
> and DTrace has never had any code for that.
thanks for taking a look! To be honest I punted on bitfield handling as
currently we don't have any cases where we are looking up bitfield
values that require shifting. The problem is we have the machinery for
offsetof calculation in dt_cg_ctf_offsetof() and it just returns byte
offset and load size, then after we do the probe_read_kernel(). We'd
need to repeat some of the operations done for the offsetof cacluation
to determine if/how we bitshift, or alternatively have
dt_cg_ctf_offsetof() return more info to consumers.
Perhaps the better option would be to rework dt_cg_ctf_offsetof() to be
a wrapper to a function that gives us this additional info; that would
require less changes overall to consumers of that function that don't
need to handle bitfields. I'm thinking something like this:
int
dt_cg_ctf_bit_offsetof(const char *structname, const char *membername,
size_t *ldsizep, uint_t *ldopp, size_t *bitszp,
int relaxed)
where the returned offset, *bitsizep are in bits (*ldsizep would still
be in bytes). This would allow us to figure out what shifting is
required in dt_cg_tramp_get_member().
Either that or I can just add a comment that we don't need bitshifts yet!
Thanks!
Alan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof()
2025-08-26 8:12 [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof() Alan Maguire
2025-08-26 17:40 ` Nick Alcock
@ 2025-09-02 18:05 ` Eugene Loh
2025-09-03 14:11 ` Alan Maguire
1 sibling, 1 reply; 5+ messages in thread
From: Eugene Loh @ 2025-09-02 18:05 UTC (permalink / raw)
To: Alan Maguire, dtrace; +Cc: dtrace-devel
So far as I can tell:
*) https://oss.oracle.com/pipermail/dtrace-devel/2025-August/006562.html
Nick commented on the patch
*) https://oss.oracle.com/pipermail/dtrace-devel/2025-August/006573.html
Alan responded to Nick
Since this patch is related to observed test failures, can we complete
this patch review? E.g., perhaps Nick can respond to Alan's last
message, maybe even agreeing with Alan's proposal simply to add a
comment that we don't need bitshifts yet. If so, Alan can post a v2 of
the patch and then Nick or I could add our "Reviewed-by"?
On 8/26/25 04:12, Alan Maguire wrote:
> The tcp provider uses dt_cg_tramp_get_member() to retrieve the
> offset of the sk_protocol field in struct sock. However it
> returns the wrong value on UEK6 since it is an 8-bit bitfield.
> From pahole we see:
>
> unsigned int __sk_flags_offset[0]; /* 560 0 */
> unsigned int sk_padding:1; /* 560: 0 4 */
> unsigned int sk_kern_sock:1; /* 560: 1 4 */
> unsigned int sk_no_check_tx:1; /* 560: 2 4 */
> unsigned int sk_no_check_rx:1; /* 560: 3 4 */
> unsigned int sk_userlocks:4; /* 560: 4 4 */
> unsigned int sk_protocol:8; /* 560: 8 4 */
>
> In other words it is really at offset 561 but because we just
> lookup the member offset and not the member type offset we get the
> wrong value for the sk_protoocol.
>
> This in turn causes tcp state-change probes (and in-progress UDP
> probes) to not fire since we verify that sk_protocol == IPPROTO_TCP.
>
> The fix is to look up the member _type_ offset and add it to the
> bit offset we get for the member itself. With this in place the
> state-change probes fire, but the local tcp tests still fail due
> to separate issues with the tcp:::accept-established probe.
>
> This issue is not seen on more recent kernels because sk_protocol
> becomes a __u16 as the number of protocols exceeds 256.
>
> Reported-by: Eugene Loh <eugene.loh@oracle.com>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
> libdtrace/dt_cg.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> index cd9e7f4e..7abb5bc6 100644
> --- a/libdtrace/dt_cg.c
> +++ b/libdtrace/dt_cg.c
> @@ -1959,6 +1959,8 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
> dtrace_typeinfo_t sym;
> ctf_file_t *ctfp;
> ctf_membinfo_t ctm;
> + ctf_encoding_t cte;
> + int offset;
>
> if (dtrace_lookup_by_type(yypcb->pcb_hdl, DTRACE_OBJ_EVERY, structname,
> &sym))
> @@ -1973,6 +1975,13 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
>
> longjmp(yypcb->pcb_jmpbuf, EDT_NOCTF);
> }
> + offset = ctm.ctm_offset;
> +
> + /* a bitfield may have an additional > 8 bit offset which means we need
> + * to adjust the reported byte offset.
> + */
> + if (ctf_type_encoding(ctfp, ctm.ctm_type, &cte) != CTF_ERR)
> + offset += cte.cte_offset;
>
> if (sizep || ldopp) {
> uint_t ldop;
> @@ -1982,7 +1991,7 @@ dt_cg_ctf_offsetof(const char *structname, const char *membername,
> *ldopp = ldop;
> }
>
> - return (ctm.ctm_offset / NBBY);
> + return (offset / NBBY);
> }
> static void
> dt_cg_act_breakpoint(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof()
2025-09-02 18:05 ` Eugene Loh
@ 2025-09-03 14:11 ` Alan Maguire
0 siblings, 0 replies; 5+ messages in thread
From: Alan Maguire @ 2025-09-03 14:11 UTC (permalink / raw)
To: Eugene Loh, dtrace; +Cc: dtrace-devel
On 02/09/2025 19:05, Eugene Loh wrote:
> So far as I can tell:
>
> *) https://oss.oracle.com/pipermail/dtrace-devel/2025-August/006562.html
> Nick commented on the patch
>
> *) https://oss.oracle.com/pipermail/dtrace-devel/2025-August/006573.html
> Alan responded to Nick
>
> Since this patch is related to observed test failures, can we complete
> this patch review? E.g., perhaps Nick can respond to Alan's last
> message, maybe even agreeing with Alan's proposal simply to add a
> comment that we don't need bitshifts yet. If so, Alan can post a v2 of
> the patch and then Nick or I could add our "Reviewed-by"?
>
sounds good to me; if Nick is happy with that plan I'll send v2 ASAP.
Thanks!
Alan
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-09-03 14:11 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-26 8:12 [PATCH] cg: fix offset for > 8 bit bitfields in dt_cg_ctf_offsetof() Alan Maguire
2025-08-26 17:40 ` Nick Alcock
2025-08-28 11:50 ` Alan Maguire
2025-09-02 18:05 ` Eugene Loh
2025-09-03 14:11 ` Alan Maguire
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox