public inbox for linux-sparse@vger.kernel.org
 help / color / mirror / Atom feed
* sparse feature request: nocast integer types
@ 2023-11-09  5:03 Christoph Hellwig
       [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
  2023-11-27 12:51 ` Dan Carpenter
  0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2023-11-09  5:03 UTC (permalink / raw)
  To: linux-sparse; +Cc: linux-xfs

Hi dear spearse developers,

in a lot of kernel code we have integer types that store offsets and
length in certain units (typically 512 byte disk "sectors", file systems
block sizes, and some weird variations of the same), and we had a fair
amount of bugs beause people get confused about which ones to use.

I wonder if it is possible to add an attribute (say nocast) that works
similar to __attribute__((bitwise)) in that it disallows mixing this
type with other integer types, but unlike __attribute__((bitwise))
allows all the normal arithmetics on it?  That way we could annotate
all the normal conversion helpers with __force overrides and check
where people are otherwise mixing these types.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
       [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
@ 2023-11-09  5:30   ` Christoph Hellwig
       [not found]     ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2023-11-09  5:30 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Christoph Hellwig, linux-sparse, linux-xfs

On Thu, Nov 09, 2023 at 06:21:05AM +0100, Luc Van Oostenryck wrote:
> Such 'nocast' attribute already exists and seems to do more or less what
> you would like:
> See Sparse docs at https://sparse.docs.kernel.org/en/latest/annotations.html
> :
> nocast <https://sparse.docs.kernel.org/en/latest/annotations.html#nocast>
> 
> This attribute is similar to bitwise but in a much weaker form. It warns
> about explicit or implicit casting to different types. However, it doesn’t
> warn about the mixing with other types and it easily gets lost: you can add
> plain integers to __nocast integer types and the result will be plain
> integers.

Hmm, that's a little suboptimal.  But still a lot better than nothing.
I'll see what I can do with them.

Thanks a lot!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
       [not found]     ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com>
@ 2023-11-09  5:44       ` Christoph Hellwig
  2023-11-09 17:57         ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2023-11-09  5:44 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Christoph Hellwig, linux-sparse, linux-xfs

On Thu, Nov 09, 2023 at 06:43:30AM +0100, Luc Van Oostenryck wrote:
> >
> >> Hmm, that's a little suboptimal.  But still a lot better than nothing.
> >> I'll see what I can do with them.
> >>
> >> Thanks a lot!
> >
> > Yes, something in-between is most probably needed and be clearly specified:
> - does it need to define a new type like done for bitwise?
>   Or can it be used like a specifier/attribute
> - when a warning is required: only out-casting?
> - when __force is needed?

For my use case it'd treat it exactly like __bitwise except for also
allowing arithmetics on it.  Bonus for always allowing 0 withou explicit
__force cast just like __bitwise.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
  2023-11-09  5:44       ` Christoph Hellwig
@ 2023-11-09 17:57         ` Linus Torvalds
  0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2023-11-09 17:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Luc Van Oostenryck, linux-sparse, linux-xfs

On Wed, 8 Nov 2023 at 21:45, Christoph Hellwig <hch@infradead.org> wrote:
>
> For my use case it'd treat it exactly like __bitwise except for also
> allowing arithmetics on it.  Bonus for always allowing 0 withou explicit
> __force cast just like __bitwise.

It's too long for me to really remember, but I think that was the
*intention* of the "nocast" attribute originally.

The whole "nocast" thing goes back to pretty early in sparse (first
few weeks of it, in fact), and is almost entirely undocumented.

The commit that starts parsing it doesn't actually even mention it -
it mentions the *other* attributes it also starts parsing, but not
'nocast'.

Iirc, what happened was that one of the opriginal goals was that I
really wanted to have that "warn about any implicit integer casts",
and added "nocast" quite early, before the sparse type system was even
very strong. But it ended up being almost just a placeholder.

And then later on, Al came in, and he did the much stronger 'bitwise'
thing, which was immediately useful for all the byte order handling,
and 'nocast' ended up falling by the wayside.

Anyway, I think what you ask for is what 'nocast' was always supposed
to be, but just plain isn't.

                   Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
  2023-11-09  5:03 sparse feature request: nocast integer types Christoph Hellwig
       [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
@ 2023-11-27 12:51 ` Dan Carpenter
  2023-11-27 16:05   ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Dan Carpenter @ 2023-11-27 12:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-sparse, linux-xfs, smatch

On Wed, Nov 08, 2023 at 09:03:34PM -0800, Christoph Hellwig wrote:
> Hi dear spearse developers,
> 
> in a lot of kernel code we have integer types that store offsets and
> length in certain units (typically 512 byte disk "sectors", file systems
> block sizes, and some weird variations of the same), and we had a fair
> amount of bugs beause people get confused about which ones to use.
> 
> I wonder if it is possible to add an attribute (say nocast) that works
> similar to __attribute__((bitwise)) in that it disallows mixing this
> type with other integer types, but unlike __attribute__((bitwise))
> allows all the normal arithmetics on it?  That way we could annotate
> all the normal conversion helpers with __force overrides and check
> where people are otherwise mixing these types.

I started writing something like this in Smatch for tying variables to
a specific unit.

https://github.com/error27/smatch/blob/master/smatch_units.c

But unfortunately, it doesn't actually work.  The problem is that once
I said x is a byte, then if you have y = x then I would store that in
the database.  If the first "x is a byte" assessment was wrong then the
misinformation gets amplified times 100 and can't be purged without
a mass delete.

The second problem is that Smatch automatically determines that a struct
foo->bar is a byte unit or whatever.  Which generally works, but
sometimes fails catastrophically.  For example, it's not true to
all the registers are used to store byte units.  But some code does
store bytes there and now Smatch thinks the everything stored in
registers is in bytes.

My plan was to go through the false positives and manually edit out
stuff like this.  The problem is that it's a lot of work and I haven't
done it.  I did a similar thing for tracking user data and that works
pretty decently these days.  So it's doable.

I tend to avoid manual annotations, but here it could be good.  Manually
annotating things would avoid the false positives (at the expense of
missing bugs).

I'd prefer an annotation that had the type of the unit built in.

Creating an annotation like that is difficult because you have to
coordinate with GCC and Clang etc.  In the mean time, I could just
create a table in smatch which has stuff like:

	{ "(struct foo)->member", &byte_units },

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
  2023-11-27 12:51 ` Dan Carpenter
@ 2023-11-27 16:05   ` Christoph Hellwig
  2023-11-27 17:26     ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2023-11-27 16:05 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Christoph Hellwig, linux-sparse, linux-xfs, smatch

On Mon, Nov 27, 2023 at 03:51:05PM +0300, Dan Carpenter wrote:
> My plan was to go through the false positives and manually edit out
> stuff like this.  The problem is that it's a lot of work and I haven't
> done it.  I did a similar thing for tracking user data and that works
> pretty decently these days.  So it's doable.

Yes, doing it without specific annotations seems like a pain.  I did a
little prototype with the existing sparse __nocast for one xfs type that
is not very heavily used, and it actually worked pretty good.

The major painpoint is that 0 isn't treated special, but with that 
fixed the amount of churn is mangable.

The next big thing is our stupid 64-bit divison helpers (do_div & co),
which require helpers to do that case. I'm actually kinda tempted to
propose that we drop 32-bit support for xfs to get rid of that and a
lot of other ugly things because of do_div.  That is unless we can
finally agree that the libgcc division helpes might not be great but
good enough that we don't want to inflict do_div on folks unless they
want to optize that case, which would be even better.

Linus, any commens on that?

> I'd prefer an annotation that had the type of the unit built in.

Annotating the type seems really hard.  I think the sparse concept
of simply not alowing different of these types to be mixed is good
enough without needing to know the actual unit in the type system.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: sparse feature request: nocast integer types
  2023-11-27 16:05   ` Christoph Hellwig
@ 2023-11-27 17:26     ` Linus Torvalds
  0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2023-11-27 17:26 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Dan Carpenter, linux-sparse, linux-xfs, smatch

On Mon, 27 Nov 2023 at 08:50, Christoph Hellwig <hch@infradead.org> wrote:
>
> Yes, doing it without specific annotations seems like a pain.  I did a
> little prototype with the existing sparse __nocast for one xfs type that
> is not very heavily used, and it actually worked pretty good.
>
> The major painpoint is that 0 isn't treated special, but with that
> fixed the amount of churn is mangable.

I would suggest trying to just treat "__bitwise" as the "nocast" type.
And note that doing a

   typedef uXX __bitwise new_integer_type;

will make a *specific* new integer type that is only compatible with
itself (so not other bitwise types).

Of course, that only works if you are then willing to just use
accessor functions when you actually want to do arithmetic on the
values. If you use a *lot* of arithmetic - as opposed to just passing
values around - it is too painful.

> The next big thing is our stupid 64-bit divison helpers (do_div & co),
> which require helpers to do that case. I'm actually kinda tempted to
> propose that we drop 32-bit support for xfs to get rid of that and a
> lot of other ugly things because of do_div.  That is unless we can
> finally agree that the libgcc division helpes might not be great but
> good enough that we don't want to inflict do_div on folks unless they
> want to optize that case, which would be even better.
>
> Linus, any commens on that?

Some architectures do that, but honestly, we've had *horrendous*
numbers of cases where people did 64x64 divisions without ever
realizing what they did. Having it cause compile failures on x86 -
even if it silently works elsewhere - means that they do get caught
fairly quickly.

Just go to lore, and search for "__divdi3". You will find *tons* of
kernel test robot reports for completely idiotic cases that then never
get to me because of this.

IOW, you may not see it in the resulting kernel, but the reason you
don't see it is *exactly* because we require that do_div() dance for
64-bit divides.

The least one I see is literally from less than a week ago, when media
code tried to do this:

        i = v / (1LL << fraction_bits);

without realizing that a signed 64-bit division is truly *horrendous* here.

So you may never see this, but it's a *constant* churn, and we really
are better for it.  It's not some historical artifact, it's a "the
kernel test robot finds stuff weekly".

                 Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-11-27 17:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-09  5:03 sparse feature request: nocast integer types Christoph Hellwig
     [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
2023-11-09  5:30   ` Christoph Hellwig
     [not found]     ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com>
2023-11-09  5:44       ` Christoph Hellwig
2023-11-09 17:57         ` Linus Torvalds
2023-11-27 12:51 ` Dan Carpenter
2023-11-27 16:05   ` Christoph Hellwig
2023-11-27 17:26     ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox