* Missing cache considerations in randstruct performance feature
@ 2023-10-06 22:30 Lukas Loidolt
2023-10-07 2:24 ` Kees Cook
2023-10-07 4:12 ` Kees Cook
0 siblings, 2 replies; 5+ messages in thread
From: Lukas Loidolt @ 2023-10-06 22:30 UTC (permalink / raw)
To: keescook, linux-hardening; +Cc: linux-kernel, Daniel Marth
Hello!
I have been looking into the implementation of the "randstruct"
gcc-plugin and noticed a potential bug in its performance version, which
is supposed to limit randomization to cache-line sized groupings of
structure members.
I haven't been able to find too much documentation on this version of
randstruct, but my general understanding of its intended behavior is as
follows:
- in performance mode, randstruct groups structure members into cache
line sized partitions of 64 bytes each
- the order of these partitions is randomized
- the order of structure members within each partition is also randomized
In my tests, however, the performance version behaves more or less like
the full version of randstruct.
For example, testing on a struct of 10 function pointers:
struct test_struct {
	void (*func1)(void);
	void (*func2)(void);
	void (*func3)(void);
	void (*func4)(void);
	void (*func5)(void);
	void (*func6)(void);
	void (*func7)(void);
	void (*func8)(void);
	void (*func9)(void);
	void (*func10)(void);
};
resulted in the following randomized memory layout:
func3 (offset 0)
func5 (offset 8)
func10 (offset 16)
func2 (offset 24)
func1 (offset 32)
func6 (offset 40)
func8 (offset 48)
func7 (offset 56)
func9 (offset 64)
func4 (offset 72)
I would have expected cache-line sized partitions of (up to) 8 pointers,
so that func1 through func8 are adjacent in the final layout, but these
partitions are seemingly not preserved.
Assuming that this is indeed not the intended behavior, the culprit is
line 213 in "randomize_layout_plugin.c"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/gcc-plugins/randomize_layout_plugin.c?id=f291209eca5eba0b4704fa0832af57b12dbc1a02#n213
where
randnum = ranval(prng_state) % (i + 1);
should probably be something like
randnum = size_group[x].start + ranval(prng_state) % size_group[x].length;
After changing this line, cache-line sized partitions are created and
preserved as expected.
However, while structure members within each partition are randomized,
the order of the partitions themselves is not randomized and remains the
same as in the original struct declaration.
I assume that the for loop in lines 200 to 206 is intended to shuffle
the partition_group structures
for (i = num_groups - 1; i > 0; i--) {
	struct partition_group tmp;
	randnum = ranval(prng_state) % (i + 1);
	tmp = size_group[i];
	size_group[i] = size_group[randnum];
	size_group[randnum] = tmp;
}
but the order of the partition_group structs is not written back into
the newtree object, so the randomization from this loop is not reflected
in the final layout.
I would be interested to know if this is an actual issue with the
implementation or if I'm misinterpreting how randstruct is supposed to
work here.
Thanks,
Lukas Loidolt
* Re: Missing cache considerations in randstruct performance feature
From: Kees Cook @ 2023-10-07 2:24 UTC (permalink / raw)
To: Lukas Loidolt; +Cc: linux-hardening, linux-kernel, Daniel Marth

On Sat, Oct 07, 2023 at 12:30:01AM +0200, Lukas Loidolt wrote:
> Hello!
>
> I have been looking into the implementation of the "randstruct" gcc-plugin
> and noticed a potential bug in its performance version, which is supposed to
> limit randomization to cache-line sized groupings of structure members.
> I haven't been able to find too much documentation on this version of
> randstruct, but my general understanding of its intended behavior is as
> follows:
>
> - in performance mode, randstruct groups structure members into cache line
>   sized partitions of 64 bytes each
> - the order of these partitions is randomized
> - the order of structure members within each partition is also randomized
>
> In my tests, however, the performance version behaves more or less like the
> full version of randstruct.
> For example, testing on a struct of 10 function pointers:
>
> struct test_struct {
> 	void (*func1)(void);
> 	void (*func2)(void);
> 	void (*func3)(void);
> 	void (*func4)(void);
> 	void (*func5)(void);
> 	void (*func6)(void);
> 	void (*func7)(void);
> 	void (*func8)(void);
> 	void (*func9)(void);
> 	void (*func10)(void);
> };
>
> resulted in the following randomized memory layout:
>
> func3 (offset 0)
> func5 (offset 8)
> func10 (offset 16)
> func2 (offset 24)
> func1 (offset 32)
> func6 (offset 40)
> func8 (offset 48)
> func7 (offset 56)
> func9 (offset 64)
> func4 (offset 72)

I'd agree; this doesn't look like an expected layout.

> I would have expected cache-line sized partitions of (up to) 8 pointers, so
> that func1 through func8 are adjacent in the final layout, but these
> partitions are seemingly not preserved.

I'd expect the order of func1-func8 to be randomized together, and
func9-func10 order to be randomized together, and then they're pasted
together.

> Assuming that this is indeed not the intended behavior, the culprit is line
> 213 in "randomize_layout_plugin.c"
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/gcc-plugins/randomize_layout_plugin.c?id=f291209eca5eba0b4704fa0832af57b12dbc1a02#n213
>
> where
>
> randnum = ranval(prng_state) % (i + 1);
>
> should probably be something like
>
> randnum = size_group[x].start + ranval(prng_state) % size_group[x].length;

I thought the intent was to handle it in the loop offsets:

	for (i = size_group[x].start + size_group[x].length - 1; i > size_group[x].start; i--) {

But, no, this looks wrong, see below...

> After changing this line, cache-line sized partitions are created and
> preserved as expected.
> However, while structure members within each partition are randomized, the
> order of the partitions themselves is not randomized and remains the same as
> in the original struct declaration.

size_group[] is what holds the initial partitions. I would agree, though,
the group shuffling doesn't seem to do anything.

> I assume that the for loop in lines 200 to 206 is intended to shuffle the
> partition_group structures
>
> for (i = num_groups - 1; i > 0; i--) {
> 	struct partition_group tmp;
> 	randnum = ranval(prng_state) % (i + 1);
> 	tmp = size_group[i];
> 	size_group[i] = size_group[randnum];
> 	size_group[randnum] = tmp;
> }
>
> but the order of the partition_group structs is not written back into the
> newtree object, so the randomization from this loop is not reflected in the
> final layout.

I'd expect the newtree index to start at 0, rather than reusing "i", if the
intent was to move groups around too.

But in fact, the second shuffle loop seems to just not work at all: it's not
shuffling within the group, since it operates from 0 to i, not
size_group[x].start through size_group[x].start + size_group[x].length.

> I would be interested to know if this is an actual issue with the
> implementation or if I'm misinterpreting how randstruct is supposed to work
> here.

I'd agree; this looks totally broken.

-Kees

--
Kees Cook
* Re: Missing cache considerations in randstruct performance feature
From: Kees Cook @ 2023-10-07 4:12 UTC (permalink / raw)
To: Lukas Loidolt; +Cc: linux-hardening, linux-kernel, Daniel Marth

On Sat, Oct 07, 2023 at 12:30:01AM +0200, Lukas Loidolt wrote:
> In my tests, however, the performance version behaves more or less like the
> full version of randstruct.

Can you try this patch?


commit d73a3244700d3c945cedea7e1fb7042243c41e08
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Fri Oct 6 21:09:28 2023 -0700
Commit:     Kees Cook <keescook@chromium.org>
CommitDate: Fri Oct 6 21:09:28 2023 -0700

    randstruct: Fix gcc-plugin performance mode to stay in group

    The performance mode of the gcc-plugin randstruct was shuffling struct
    members outside of the cache-line groups. Limit the range to the
    specified group indexes.

    Cc: linux-hardening@vger.kernel.org
    Reported-by: Lukas Loidolt <e1634039@student.tuwien.ac.at>
    Closes: https://lore.kernel.org/all/f3ca77f0-e414-4065-83a5-ae4c4d25545d@student.tuwien.ac.at
    Signed-off-by: Kees Cook <keescook@chromium.org>

diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
index 951b74ba1b24..178831917f01 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -191,7 +191,7 @@ static void partition_struct(tree *fields, unsigned long length, struct partitio
 
 static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prng_state)
 {
-	unsigned long i, x;
+	unsigned long i, x, index;
 	struct partition_group size_group[length];
 	unsigned long num_groups = 0;
 	unsigned long randnum;
@@ -206,11 +206,14 @@ static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prn
 	}
 
 	for (x = 0; x < num_groups; x++) {
-		for (i = size_group[x].start + size_group[x].length - 1; i > size_group[x].start; i--) {
+		for (index = size_group[x].length - 1; index > 0; index--) {
 			tree tmp;
+
+			i = size_group[x].start + index;
 			if (DECL_BIT_FIELD_TYPE(newtree[i]))
 				continue;
 			randnum = ranval(prng_state) % (i + 1);
+			randnum += size_group[x].start;
 			// we could handle this case differently if desired
 			if (DECL_BIT_FIELD_TYPE(newtree[randnum]))
 				continue;

--
Kees Cook
* Re: Missing cache considerations in randstruct performance feature
From: Lukas Loidolt @ 2023-10-07 10:38 UTC (permalink / raw)
To: Kees Cook; +Cc: linux-hardening, linux-kernel, Daniel Marth

On 07.10.23 06:12, Kees Cook wrote:
> On Sat, Oct 07, 2023 at 12:30:01AM +0200, Lukas Loidolt wrote:
> > In my tests, however, the performance version behaves more or less like the
> > full version of randstruct.
>
> Can you try this patch?
>
> [the patch quoted in full]

I think this is still missing a change in the randnum calculation to use
index instead of i. Without that, randnum can be larger than the length of
newtree, which crashes kernel compilation for me.

diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
index 178831917f01..4b4627e3f2ce 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -212,7 +212,7 @@ static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prn
 			i = size_group[x].start + index;
 			if (DECL_BIT_FIELD_TYPE(newtree[i]))
 				continue;
-			randnum = ranval(prng_state) % (i + 1);
+			randnum = ranval(prng_state) % (index + 1);
 			randnum += size_group[x].start;
 			// we could handle this case differently if desired
 			if (DECL_BIT_FIELD_TYPE(newtree[randnum]))

The patch seems to work after that though. For the previous example, I now
get the following layout:

func1 (offset: 0)
func3 (offset: 8)
func4 (offset: 16)
func6 (offset: 24)
func7 (offset: 32)
func8 (offset: 40)
func5 (offset: 48)
func2 (offset: 56)
func10 (offset: 64)
func9 (offset: 72)

Regarding the shuffling of groups/partitions (rather than just the
randomization of structure members within each partition), I'm not sure if
that was intended at some point, but it might be worth looking into. I'd
assume it would improve randomization without sacrificing performance, and
it's also what the clang implementation of randstruct does.

Lukas Loidolt
* Re: Missing cache considerations in randstruct performance feature
From: Kees Cook @ 2023-10-08 17:04 UTC (permalink / raw)
To: Lukas Loidolt; +Cc: linux-hardening, linux-kernel, Daniel Marth

On Sat, Oct 07, 2023 at 12:38:28PM +0200, Lukas Loidolt wrote:
> On 07.10.23 06:12, Kees Cook wrote:
> > On Sat, Oct 07, 2023 at 12:30:01AM +0200, Lukas Loidolt wrote:
> > > In my tests, however, the performance version behaves more or less like
> > > the full version of randstruct.
> >
> > Can you try this patch?
> >
> > [the patch quoted in full]
>
> I think this is still missing a change in the randnum calculation to use
> index instead of i. Without that, randnum can be larger than the length of
> newtree, which crashes kernel compilation for me.

Oops, yes, I missed that while refactoring my patch to reduce lines changed.

> diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
> index 178831917f01..4b4627e3f2ce 100644
> --- a/scripts/gcc-plugins/randomize_layout_plugin.c
> +++ b/scripts/gcc-plugins/randomize_layout_plugin.c
> @@ -212,7 +212,7 @@ static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prn
>  			i = size_group[x].start + index;
>  			if (DECL_BIT_FIELD_TYPE(newtree[i]))
>  				continue;
> -			randnum = ranval(prng_state) % (i + 1);
> +			randnum = ranval(prng_state) % (index + 1);
>  			randnum += size_group[x].start;
>  			// we could handle this case differently if desired
>  			if (DECL_BIT_FIELD_TYPE(newtree[randnum]))
>
> The patch seems to work after that though. For the previous example, I now
> get the following layout:
>
> func1 (offset: 0)
> func3 (offset: 8)
> func4 (offset: 16)
> func6 (offset: 24)
> func7 (offset: 32)
> func8 (offset: 40)
> func5 (offset: 48)
> func2 (offset: 56)
> func10 (offset: 64)
> func9 (offset: 72)
>
> Regarding the shuffling of groups/partitions (rather than just the
> randomization of structure members within each partition), I'm not sure if
> that was intended at some point, but it might be worth looking into.

Yeah, this is also clearly not working.

> I'd assume it would improve randomization without sacrificing performance,
> and it's also what the clang implementation of randstruct does.

Thanks for testing!

-Kees

--
Kees Cook