* sparse feature request: nocast integer types
@ 2023-11-09 5:03 Christoph Hellwig
[not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
2023-11-27 12:51 ` Dan Carpenter
0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2023-11-09 5:03 UTC (permalink / raw)
To: linux-sparse; +Cc: linux-xfs
Hi dear spearse developers,
in a lot of kernel code we have integer types that store offsets and
length in certain units (typically 512 byte disk "sectors", file systems
block sizes, and some weird variations of the same), and we had a fair
amount of bugs beause people get confused about which ones to use.
I wonder if it is possible to add an attribute (say nocast) that works
similar to __attribute__((bitwise)) in that it disallows mixing this
type with other integer types, but unlike __attribute__((bitwise))
allows all the normal arithmetics on it? That way we could annotate
all the normal conversion helpers with __force overrides and check
where people are otherwise mixing these types.
^ permalink raw reply [flat|nested] 7+ messages in thread[parent not found: <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>]
* Re: sparse feature request: nocast integer types [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com> @ 2023-11-09 5:30 ` Christoph Hellwig [not found] ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com> 0 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2023-11-09 5:30 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Christoph Hellwig, linux-sparse, linux-xfs On Thu, Nov 09, 2023 at 06:21:05AM +0100, Luc Van Oostenryck wrote: > Such 'nocast' attribute already exists and seems to do more or less what > you would like: > See Sparse docs at https://sparse.docs.kernel.org/en/latest/annotations.html > : > nocast <https://sparse.docs.kernel.org/en/latest/annotations.html#nocast> > > This attribute is similar to bitwise but in a much weaker form. It warns > about explicit or implicit casting to different types. However, it doesn’t > warn about the mixing with other types and it easily gets lost: you can add > plain integers to __nocast integer types and the result will be plain > integers. Hmm, that's a little suboptimal. But still a lot better than nothing. I'll see what I can do with them. Thanks a lot! ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com>]
* Re: sparse feature request: nocast integer types [not found] ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com> @ 2023-11-09 5:44 ` Christoph Hellwig 2023-11-09 17:57 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2023-11-09 5:44 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Christoph Hellwig, linux-sparse, linux-xfs On Thu, Nov 09, 2023 at 06:43:30AM +0100, Luc Van Oostenryck wrote: > > > >> Hmm, that's a little suboptimal. But still a lot better than nothing. > >> I'll see what I can do with them. > >> > >> Thanks a lot! > > > > Yes, something in-between is most probably needed and be clearly specified: > - does it need to define a new type like done for bitwise? > Or can it be used like a specifier/attribute > - when a warning is required: only out-casting? > - when __force is needed? For my use case it'd treat it exactly like __bitwise except for also allowing arithmetics on it. Bonus for always allowing 0 withou explicit __force cast just like __bitwise. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: sparse feature request: nocast integer types 2023-11-09 5:44 ` Christoph Hellwig @ 2023-11-09 17:57 ` Linus Torvalds 0 siblings, 0 replies; 7+ messages in thread From: Linus Torvalds @ 2023-11-09 17:57 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Luc Van Oostenryck, linux-sparse, linux-xfs On Wed, 8 Nov 2023 at 21:45, Christoph Hellwig <hch@infradead.org> wrote: > > For my use case it'd treat it exactly like __bitwise except for also > allowing arithmetics on it. Bonus for always allowing 0 withou explicit > __force cast just like __bitwise. It's too long for me to really remember, but I think that was the *intention* of the "nocast" attribute originally. The whole "nocast" thing goes back to pretty early in sparse (first few weeks of it, in fact), and is almost entirely undocumented. The commit that starts parsing it doesn't actually even mention it - it mentions the *other* attributes it also starts parsing, but not 'nocast'. Iirc, what happened was that one of the opriginal goals was that I really wanted to have that "warn about any implicit integer casts", and added "nocast" quite early, before the sparse type system was even very strong. But it ended up being almost just a placeholder. And then later on, Al came in, and he did the much stronger 'bitwise' thing, which was immediately useful for all the byte order handling, and 'nocast' ended up falling by the wayside. Anyway, I think what you ask for is what 'nocast' was always supposed to be, but just plain isn't. Linus ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: sparse feature request: nocast integer types 2023-11-09 5:03 sparse feature request: nocast integer types Christoph Hellwig [not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com> @ 2023-11-27 12:51 ` Dan Carpenter 2023-11-27 16:05 ` Christoph Hellwig 1 sibling, 1 reply; 7+ messages in thread From: Dan Carpenter @ 2023-11-27 12:51 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-sparse, linux-xfs, smatch On Wed, Nov 08, 2023 at 09:03:34PM -0800, Christoph Hellwig wrote: > Hi dear spearse developers, > > in a lot of kernel code we have integer types that store offsets and > length in certain units (typically 512 byte disk "sectors", file systems > block sizes, and some weird variations of the same), and we had a fair > amount of bugs beause people get confused about which ones to use. > > I wonder if it is possible to add an attribute (say nocast) that works > similar to __attribute__((bitwise)) in that it disallows mixing this > type with other integer types, but unlike __attribute__((bitwise)) > allows all the normal arithmetics on it? That way we could annotate > all the normal conversion helpers with __force overrides and check > where people are otherwise mixing these types. I started writing something like this in Smatch for tying variables to a specific unit. https://github.com/error27/smatch/blob/master/smatch_units.c But unfortunately, it doesn't actually work. The problem is that once I said x is a byte, then if you have y = x then I would store that in the database. If the first "x is a byte" assessment was wrong then the misinformation gets amplified times 100 and can't be purged without a mass delete. The second problem is that Smatch automatically determines that a struct foo->bar is a byte unit or whatever. Which generally works, but sometimes fails catastrophically. For example, it's not true to all the registers are used to store byte units. But some code does store bytes there and now Smatch thinks the everything stored in registers is in bytes. My plan was to go through the false positives and manually edit out stuff like this. The problem is that it's a lot of work and I haven't done it. I did a similar thing for tracking user data and that works pretty decently these days. So it's doable. I tend to avoid manual annotations, but here it could be good. Manually annotating things would avoid the false positives (at the expense of missing bugs). I'd prefer an annotation that had the type of the unit built in. Creating an annotation like that is difficult because you have to coordinate with GCC and Clang etc. In the mean time, I could just create a table in smatch which has stuff like: { "(struct foo)->member", &byte_units }, regards, dan carpenter ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: sparse feature request: nocast integer types 2023-11-27 12:51 ` Dan Carpenter @ 2023-11-27 16:05 ` Christoph Hellwig 2023-11-27 17:26 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2023-11-27 16:05 UTC (permalink / raw) To: Dan Carpenter; +Cc: Christoph Hellwig, linux-sparse, linux-xfs, smatch On Mon, Nov 27, 2023 at 03:51:05PM +0300, Dan Carpenter wrote: > My plan was to go through the false positives and manually edit out > stuff like this. The problem is that it's a lot of work and I haven't > done it. I did a similar thing for tracking user data and that works > pretty decently these days. So it's doable. Yes, doing it without specific annotations seems like a pain. I did a little prototype with the existing sparse __nocast for one xfs type that is not very heavily used, and it actually worked pretty good. The major painpoint is that 0 isn't treated special, but with that fixed the amount of churn is mangable. The next big thing is our stupid 64-bit divison helpers (do_div & co), which require helpers to do that case. I'm actually kinda tempted to propose that we drop 32-bit support for xfs to get rid of that and a lot of other ugly things because of do_div. That is unless we can finally agree that the libgcc division helpes might not be great but good enough that we don't want to inflict do_div on folks unless they want to optize that case, which would be even better. Linus, any commens on that? > I'd prefer an annotation that had the type of the unit built in. Annotating the type seems really hard. I think the sparse concept of simply not alowing different of these types to be mixed is good enough without needing to know the actual unit in the type system. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: sparse feature request: nocast integer types 2023-11-27 16:05 ` Christoph Hellwig @ 2023-11-27 17:26 ` Linus Torvalds 0 siblings, 0 replies; 7+ messages in thread From: Linus Torvalds @ 2023-11-27 17:26 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Dan Carpenter, linux-sparse, linux-xfs, smatch On Mon, 27 Nov 2023 at 08:50, Christoph Hellwig <hch@infradead.org> wrote: > > Yes, doing it without specific annotations seems like a pain. I did a > little prototype with the existing sparse __nocast for one xfs type that > is not very heavily used, and it actually worked pretty good. > > The major painpoint is that 0 isn't treated special, but with that > fixed the amount of churn is mangable. I would suggest trying to just treat "__bitwise" as the "nocast" type. And note that doing a typedef uXX __bitwise new_integer_type; will make a *specific* new integer type that is only compatible with itself (so not other bitwise types). Of course, that only works if you are then willing to just use accessor functions when you actually want to do arithmetic on the values. If you use a *lot* of arithmetic - as opposed to just passing values around - it is too painful. > The next big thing is our stupid 64-bit divison helpers (do_div & co), > which require helpers to do that case. I'm actually kinda tempted to > propose that we drop 32-bit support for xfs to get rid of that and a > lot of other ugly things because of do_div. That is unless we can > finally agree that the libgcc division helpes might not be great but > good enough that we don't want to inflict do_div on folks unless they > want to optize that case, which would be even better. > > Linus, any commens on that? Some architectures do that, but honestly, we've had *horrendous* numbers of cases where people did 64x64 divisions without ever realizing what they did. Having it cause compile failures on x86 - even if it silently works elsewhere - means that they do get caught fairly quickly. Just go to lore, and search for "__divdi3". You will find *tons* of kernel test robot reports for completely idiotic cases that then never get to me because of this. IOW, you may not see it in the resulting kernel, but the reason you don't see it is *exactly* because we require that do_div() dance for 64-bit divides. The least one I see is literally from less than a week ago, when media code tried to do this: i = v / (1LL << fraction_bits); without realizing that a signed 64-bit division is truly *horrendous* here. So you may never see this, but it's a *constant* churn, and we really are better for it. It's not some historical artifact, it's a "the kernel test robot finds stuff weekly". Linus ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-11-27 17:26 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-09 5:03 sparse feature request: nocast integer types Christoph Hellwig
[not found] ` <CAMHZB6G_TZJ_uQGm5an0-bhG8wCxpEQrUCShen7O61Q9arAf+Q@mail.gmail.com>
2023-11-09 5:30 ` Christoph Hellwig
[not found] ` <CAMHZB6H7Y0m2Y-ZD0PMKiGDeo7_sy=scDrzbBbBuUJfuzLK-Lg@mail.gmail.com>
2023-11-09 5:44 ` Christoph Hellwig
2023-11-09 17:57 ` Linus Torvalds
2023-11-27 12:51 ` Dan Carpenter
2023-11-27 16:05 ` Christoph Hellwig
2023-11-27 17:26 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox