* [PATCH v1 0/2] Handle multi-letter extensions starting with caps in riscv,isa
@ 2023-04-26 10:43 Conor Dooley
2023-04-26 10:43 ` [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps Conor Dooley
2023-04-26 10:43 ` [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning Conor Dooley
0 siblings, 2 replies; 15+ messages in thread
From: Conor Dooley @ 2023-04-26 10:43 UTC (permalink / raw)
To: palmer
Cc: conor, conor.dooley, Paul Walmsley, Rob Herring,
Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng,
Yangyu Chen, devicetree, linux-riscv
Hey,
Following on from [1] in which Yangyu reported kernel panics for a
riscv,isa string containing "rv64ima_Zifencei", as the parser got
confused by the capital letter, here's a small change to the parser to
handle invalid extensions starting with capital & the removal of some
inaccurate wording from the dt-binding.
Cheers,
Conor.
1 - https://lore.kernel.org/all/tencent_1647475C9618C390BEC601BE2CC1206D0C07@qq.com/
CC: Paul Walmsley <paul.walmsley@sifive.com>
CC: Palmer Dabbelt <palmer@dabbelt.com>
CC: Rob Herring <robh+dt@kernel.org>
CC: Krzysztof Kozlowski <krzk+dt@kernel.org>
CC: Wende Tan <twd2.me@gmail.com>
CC: Soha Jin <soha@lohu.info>
CC: Hongren Zheng <i@zenithal.me>
CC: Yangyu Chen <cyy@cyyself.name>
CC: devicetree@vger.kernel.org
CC: linux-riscv@lists.infradead.org
Conor Dooley (2):
RISC-V: skip parsing multi-letter extensions starting with caps
dt-bindings: riscv: drop invalid comment about riscv,isa lower-case
reasoning
.../devicetree/bindings/riscv/cpus.yaml | 2 +-
arch/riscv/kernel/cpufeature.c | 17 ++++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 10:43 [PATCH v1 0/2] Handle multi-letter extensions starting with caps in riscv,isa Conor Dooley @ 2023-04-26 10:43 ` Conor Dooley 2023-04-26 12:18 ` Andrew Jones 2023-04-26 10:43 ` [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning Conor Dooley 1 sibling, 1 reply; 15+ messages in thread From: Conor Dooley @ 2023-04-26 10:43 UTC (permalink / raw) To: palmer Cc: conor, conor.dooley, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv Yangyu Chen reported that if an multi-letter extension begins with a capital letter the parser will treat the remainder of that multi-letter extension as single-letter extensions. Certain versions of rocket-chip will export devicetree containing rv64ima_Zifencei, which is parsed by the kernel as rv64imafc. While capital letters in riscv,isa are invalid and the validation of devicetree's isn't the kernel's job, we should behave more gracefully here. Rather than abort parsing on meeting a capital letter, mark the extension as an error & allow the parser to skip ahead to the next extension. Reported-by: Yangyu Chen <cyy@cyyself.name> Link: https://lore.kernel.org/all/tencent_1647475C9618C390BEC601BE2CC1206D0C07@qq.com/ Fixes: 2a31c54be097 ("RISC-V: Minimal parser for "riscv, isa" strings") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> --- arch/riscv/kernel/cpufeature.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 52585e088873..93850540b0b4 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -142,6 +142,10 @@ void __init riscv_fill_hwcap(void) const char *ext_end = isa; bool ext_long = false, ext_err = false; + if (unlikely(!islower(*ext))) { + ext_err = true; + } + switch (*ext) { case 's': /** @@ -156,6 +160,15 @@ void __init riscv_fill_hwcap(void) break; } fallthrough; + case 'S': + case 'X': + case 'Z': + /* + * As the riscv,isa string must be lower-case, + * S, X and Z are not valid characters. Parse + * the invalid extension anyway, to skip ahead + * to the next valid one. + */ case 'x': case 'z': ext_long = true; @@ -185,10 +198,8 @@ void __init riscv_fill_hwcap(void) ++ext_end; break; default: - if (unlikely(!islower(*ext))) { - ext_err = true; + if (unlikely(ext_err)) break; - } /* Find next extension */ if (!isdigit(*isa)) break; -- 2.39.2 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 10:43 ` [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps Conor Dooley @ 2023-04-26 12:18 ` Andrew Jones 2023-04-26 12:47 ` Conor Dooley 0 siblings, 1 reply; 15+ messages in thread From: Andrew Jones @ 2023-04-26 12:18 UTC (permalink / raw) To: Conor Dooley Cc: palmer, conor, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > Yangyu Chen reported that if an multi-letter extension begins with a > capital letter the parser will treat the remainder of that multi-letter > extension as single-letter extensions. I think the problem is that the parser doesn't completely abort when it sees something it doesn't understand. Continuing is risky since it may be possible to compose an invalid string that gets the parser to run off the rails. How about completely aborting, noisily, when the string doesn't match expectations, falling back to a default string such as rv64ima instead. That also ought to get faster corrections of device trees. > > Certain versions of rocket-chip will export devicetree containing > rv64ima_Zifencei, which is parsed by the kernel as rv64imafc. > > While capital letters in riscv,isa are invalid and the validation of > devicetree's isn't the kernel's job, we should behave more gracefully > here. Rather than abort parsing on meeting a capital letter, mark the > extension as an error & allow the parser to skip ahead to the next > extension. > > Reported-by: Yangyu Chen <cyy@cyyself.name> > Link: https://lore.kernel.org/all/tencent_1647475C9618C390BEC601BE2CC1206D0C07@qq.com/ > Fixes: 2a31c54be097 ("RISC-V: Minimal parser for "riscv, isa" strings") > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > arch/riscv/kernel/cpufeature.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c > index 52585e088873..93850540b0b4 100644 > --- a/arch/riscv/kernel/cpufeature.c > +++ b/arch/riscv/kernel/cpufeature.c > @@ -142,6 +142,10 @@ void __init riscv_fill_hwcap(void) > const char *ext_end = isa; > bool ext_long = false, ext_err = false; > > + if (unlikely(!islower(*ext))) { > + ext_err = true; > + } Can drop the {} > + > switch (*ext) { > case 's': > /** > @@ -156,6 +160,15 @@ void __init riscv_fill_hwcap(void) > break; > } > fallthrough; > + case 'S': > + case 'X': > + case 'Z': > + /* > + * As the riscv,isa string must be lower-case, > + * S, X and Z are not valid characters. Parse > + * the invalid extension anyway, to skip ahead > + * to the next valid one. > + */ > case 'x': > case 'z': > ext_long = true; > @@ -185,10 +198,8 @@ void __init riscv_fill_hwcap(void) > ++ext_end; > break; > default: > - if (unlikely(!islower(*ext))) { > - ext_err = true; > + if (unlikely(ext_err)) > break; > - } > /* Find next extension */ > if (!isdigit(*isa)) > break; > -- > 2.39.2 > Thanks, drew _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 12:18 ` Andrew Jones @ 2023-04-26 12:47 ` Conor Dooley 2023-04-26 13:08 ` Andrew Jones 0 siblings, 1 reply; 15+ messages in thread From: Conor Dooley @ 2023-04-26 12:47 UTC (permalink / raw) To: Andrew Jones Cc: palmer, conor, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv [-- Attachment #1.1: Type: text/plain, Size: 2087 bytes --] On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > Yangyu Chen reported that if an multi-letter extension begins with a > > capital letter the parser will treat the remainder of that multi-letter > > extension as single-letter extensions. > > I think the problem is that the parser doesn't completely abort when > it sees something it doesn't understand. Continuing is risky since > it may be possible to compose an invalid string that gets the parser > to run off the rails. Usually I am of the opinion that we should not seek the validate the dt in the kernel, since there are tools for doing so *cough* dt-validate *cough*. This one seemed like low hanging fruit though, since the parser handles having capital letters in any of the other places after the rv##, but falls over pretty badly for this particular issue. In general, I don't think we need to be concerned about anything that fails dt-validate though, you kinda need to trust that that is correct. I'd argue that we might even do too much validation in the parser at present. Is there some attack vector, or ACPI related consideration, that I am unaware of that makes this risky? > How about completely aborting, noisily, when the string doesn't match > expectations, falling back to a default string such as rv64ima instead. > That also ought to get faster corrections of device trees. I did this first actually, but I was afraid that it would cause regressions? If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is invalid and dt-validate would have told you so, but at present that would be parsed as "rv64imafdc_zicbom" which is a perfect description of the hardware in question (since the meaning of i was set before RVI made a hames of things). So that's why I opted to not do some sort of pr_err/BUG()/WARN() and try to keep processing the string. I'm happy to abort entirely on reaching a capital if people feel there's unlikely to be a fallout from that. Cheers, Conor. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] [-- Attachment #2: Type: text/plain, Size: 161 bytes --] _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 12:47 ` Conor Dooley @ 2023-04-26 13:08 ` Andrew Jones 2023-04-26 13:54 ` Andrew Jones 2023-04-26 13:58 ` Conor Dooley 0 siblings, 2 replies; 15+ messages in thread From: Andrew Jones @ 2023-04-26 13:08 UTC (permalink / raw) To: Conor Dooley Cc: palmer, conor, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > capital letter the parser will treat the remainder of that multi-letter > > > extension as single-letter extensions. > > > > I think the problem is that the parser doesn't completely abort when > > it sees something it doesn't understand. Continuing is risky since > > it may be possible to compose an invalid string that gets the parser > > to run off the rails. > > Usually I am of the opinion that we should not seek the validate the dt > in the kernel, since there are tools for doing so *cough* dt-validate > *cough*. This one seemed like low hanging fruit though, since the parser > handles having capital letters in any of the other places after the > rv##, but falls over pretty badly for this particular issue. > > In general, I don't think we need to be concerned about anything that > fails dt-validate though, you kinda need to trust that that is correct. > I'd argue that we might even do too much validation in the parser at > present. > Is there some attack vector, or ACPI related consideration, that I am > unaware of that makes this risky? C language + string processing == potential attack vector > > > How about completely aborting, noisily, when the string doesn't match > > expectations, falling back to a default string such as rv64ima instead. > > That also ought to get faster corrections of device trees. > > I did this first actually, but I was afraid that it would cause > regressions? > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > invalid and dt-validate would have told you so, but at present that > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > the hardware in question (since the meaning of i was set before RVI made > a hames of things). > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > try to keep processing the string. I'm happy to abort entirely on > reaching a capital if people feel there's unlikely to be a fallout from > that. There might be fallout, but the kernel needs to defend itself. IMO, if the kernel doesn't know how to parse something, then it should stop trying to immediately, either with a BUG(), refusing to accept any part of it, by fallbacking back to a default, or by only accepting what it believes it parsed correctly. The third option is probably a reasonable choice in this case. Thanks, drew _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 13:08 ` Andrew Jones @ 2023-04-26 13:54 ` Andrew Jones 2023-04-26 14:37 ` Conor Dooley 2023-04-26 13:58 ` Conor Dooley 1 sibling, 1 reply; 15+ messages in thread From: Andrew Jones @ 2023-04-26 13:54 UTC (permalink / raw) To: Conor Dooley Cc: palmer, conor, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > capital letter the parser will treat the remainder of that multi-letter > > > > extension as single-letter extensions. > > > > > > I think the problem is that the parser doesn't completely abort when > > > it sees something it doesn't understand. Continuing is risky since > > > it may be possible to compose an invalid string that gets the parser > > > to run off the rails. > > > > Usually I am of the opinion that we should not seek the validate the dt > > in the kernel, since there are tools for doing so *cough* dt-validate > > *cough*. This one seemed like low hanging fruit though, since the parser > > handles having capital letters in any of the other places after the > > rv##, but falls over pretty badly for this particular issue. > > > > In general, I don't think we need to be concerned about anything that > > fails dt-validate though, you kinda need to trust that that is correct. > > I'd argue that we might even do too much validation in the parser at > > present. > > Is there some attack vector, or ACPI related consideration, that I am > > unaware of that makes this risky? A bit unrelated to this, but your mention of ACPI made me go look at the approved ECR[1] again for the ISA string. It says "Null-terminated ASCII Instruction Set Architecture (ISA) string for this hart. The format of the ISA string is defined in the RISC-V unprivileged specification." I suppose we can still add additional requirements to an ACPI ISA string which the Linux kernel will parse, but it'll be odd to point people at the DT binding to do that. Maybe we should consider making the parser more complete, possibly by importing it from some reference implementation or something. [1] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view Thanks, drew > > C language + string processing == potential attack vector > > > > > > How about completely aborting, noisily, when the string doesn't match > > > expectations, falling back to a default string such as rv64ima instead. > > > That also ought to get faster corrections of device trees. > > > > I did this first actually, but I was afraid that it would cause > > regressions? > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > invalid and dt-validate would have told you so, but at present that > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > the hardware in question (since the meaning of i was set before RVI made > > a hames of things). > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > try to keep processing the string. I'm happy to abort entirely on > > reaching a capital if people feel there's unlikely to be a fallout from > > that. > > There might be fallout, but the kernel needs to defend itself. IMO, if > the kernel doesn't know how to parse something, then it should stop > trying to immediately, either with a BUG(), refusing to accept any > part of it, by fallbacking back to a default, or by only accepting what > it believes it parsed correctly. > > The third option is probably a reasonable choice in this case. > > Thanks, > drew _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 13:54 ` Andrew Jones @ 2023-04-26 14:37 ` Conor Dooley 2023-04-26 15:01 ` Andrew Jones 2023-04-26 17:11 ` Yangyu Chen 0 siblings, 2 replies; 15+ messages in thread From: Conor Dooley @ 2023-04-26 14:37 UTC (permalink / raw) To: Andrew Jones Cc: Conor Dooley, palmer, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv [-- Attachment #1.1: Type: text/plain, Size: 4738 bytes --] On Wed, Apr 26, 2023 at 03:54:55PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > extension as single-letter extensions. > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > it sees something it doesn't understand. Continuing is risky since > > > > it may be possible to compose an invalid string that gets the parser > > > > to run off the rails. > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > handles having capital letters in any of the other places after the > > > rv##, but falls over pretty badly for this particular issue. > > > > > > In general, I don't think we need to be concerned about anything that > > > fails dt-validate though, you kinda need to trust that that is correct. > > > I'd argue that we might even do too much validation in the parser at > > > present. > > > Is there some attack vector, or ACPI related consideration, that I am > > > unaware of that makes this risky? > > A bit unrelated to this, but your mention of ACPI made me go look at the > approved ECR[1] again for the ISA string. It says "Null-terminated ASCII > Instruction Set Architecture (ISA) string for this hart. The format of the > ISA string is defined in the RISC-V unprivileged specification." I suppose > we can still add additional requirements to an ACPI ISA string which the > Linux kernel will parse, but it'll be odd to point people at the DT > binding to do that. Maybe we should consider making the parser more > complete, possibly by importing it from some reference implementation or > something. Heh, I wonder are we heading for some divergence here then. riscv,isa in a DT is explicitly *not* a match for that due to the backwards-incompatible changes made by RVI to extension definitions since riscv,isa was added to the dt-binding. Clarifying that one is the next patch in my todo list.. ACPI naively saying "it matches the spec" is asking for trouble, since there does not actually appear to be any sort of clarification about which *version* of the spec that may be. At least in the dt-binding, we have a format there, what happens to the ACPI spec if RVI decides that - is a suitable alternative to _ in some future edition? I don't think such a thing is all that likely, but surely you'd like to insulate the ABI from that sort of eventuality? Perhaps the thing to do is to actually take Yangyu's first patch and my second one, since the problem with backwards compatibility doesn't stop the kernel from being more permissive? Cheers, Conor. > > [1] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view > > Thanks, > drew > > > > > C language + string processing == potential attack vector > > > > > > > > > How about completely aborting, noisily, when the string doesn't match > > > > expectations, falling back to a default string such as rv64ima instead. > > > > That also ought to get faster corrections of device trees. > > > > > > I did this first actually, but I was afraid that it would cause > > > regressions? > > > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > > invalid and dt-validate would have told you so, but at present that > > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > > the hardware in question (since the meaning of i was set before RVI made > > > a hames of things). > > > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > > try to keep processing the string. I'm happy to abort entirely on > > > reaching a capital if people feel there's unlikely to be a fallout from > > > that. > > > > There might be fallout, but the kernel needs to defend itself. IMO, if > > the kernel doesn't know how to parse something, then it should stop > > trying to immediately, either with a BUG(), refusing to accept any > > part of it, by fallbacking back to a default, or by only accepting what > > it believes it parsed correctly. > > > > The third option is probably a reasonable choice in this case. > > > > Thanks, > > drew [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] [-- Attachment #2: Type: text/plain, Size: 161 bytes --] _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 14:37 ` Conor Dooley @ 2023-04-26 15:01 ` Andrew Jones 2023-04-26 17:11 ` Yangyu Chen 1 sibling, 0 replies; 15+ messages in thread From: Andrew Jones @ 2023-04-26 15:01 UTC (permalink / raw) To: Conor Dooley Cc: Conor Dooley, palmer, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 03:37:58PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 03:54:55PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > > extension as single-letter extensions. > > > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > > it sees something it doesn't understand. Continuing is risky since > > > > > it may be possible to compose an invalid string that gets the parser > > > > > to run off the rails. > > > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > > handles having capital letters in any of the other places after the > > > > rv##, but falls over pretty badly for this particular issue. > > > > > > > > In general, I don't think we need to be concerned about anything that > > > > fails dt-validate though, you kinda need to trust that that is correct. > > > > I'd argue that we might even do too much validation in the parser at > > > > present. > > > > Is there some attack vector, or ACPI related consideration, that I am > > > > unaware of that makes this risky? > > > > A bit unrelated to this, but your mention of ACPI made me go look at the > > approved ECR[1] again for the ISA string. It says "Null-terminated ASCII > > Instruction Set Architecture (ISA) string for this hart. The format of the > > ISA string is defined in the RISC-V unprivileged specification." I suppose > > we can still add additional requirements to an ACPI ISA string which the > > Linux kernel will parse, but it'll be odd to point people at the DT > > binding to do that. Maybe we should consider making the parser more > > complete, possibly by importing it from some reference implementation or > > something. > > Heh, I wonder are we heading for some divergence here then. riscv,isa in > a DT is explicitly *not* a match for that due to the > backwards-incompatible changes made by RVI to extension definitions > since riscv,isa was added to the dt-binding. Clarifying that one is the > next patch in my todo list.. > > ACPI naively saying "it matches the spec" is asking for trouble, since > there does not actually appear to be any sort of clarification about > which *version* of the spec that may be. At least in the dt-binding, we > have a format there, what happens to the ACPI spec if RVI decides that - > is a suitable alternative to _ in some future edition? I don't think > such a thing is all that likely, but surely you'd like to insulate the > ABI from that sort of eventuality? Agreed. I'll raise this concern with the ACPI people. > > Perhaps the thing to do is to actually take Yangyu's first patch and my > second one, since the problem with backwards compatibility doesn't stop > the kernel from being more permissive? I guess so? Every time you wake up and see a parser, you get six more weeks of Winter. drew _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 14:37 ` Conor Dooley 2023-04-26 15:01 ` Andrew Jones @ 2023-04-26 17:11 ` Yangyu Chen 2023-04-26 17:47 ` Conor Dooley 1 sibling, 1 reply; 15+ messages in thread From: Yangyu Chen @ 2023-04-26 17:11 UTC (permalink / raw) To: conor Cc: ajones, conor.dooley, cyy, devicetree, i, krzk+dt, linux-riscv, palmer, paul.walmsley, robh+dt, soha, twd2.me Hi, On Wed, 26 Apr 2023 15:37:58 +0100, Conor Dooley wrote: > Perhaps the thing to do is to actually take Yangyu's first patch and my > second one, since the problem with backwards compatibility doesn't stop > the kernel from being more permissive? How about taking my first patch[1] since the ECR[2] mentioned that the format of the ISA string is defined in the RISC-V unprivileged specification? However, I think we still need to stop the parser if some characters that the parser is not able to handle as Andrew Jones mentioned in the previous mail[3]. In this case, we still need to add some code to stop parsing if any error happens. In my humble opinion, backward compatibility can be solved by backports to LTS kernels. I think the better option is to allow using uppercase letters in the device-tree document to make the parser coherent with RISC-V ISA specification but recommend using all lowercase letters for better backward compatibility. [1] https://lore.kernel.org/all/tencent_63090269FF399AE30AC774848C344EF2F10A@qq.com/ [2] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view [3] https://lore.kernel.org/all/d7t6nxmblmyqriogs4bxakpse3qc7pc6cczjnhmkpk4kjwvgcb@3aihh3erdn6p/ Thanks, Yangyu Chen _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 17:11 ` Yangyu Chen @ 2023-04-26 17:47 ` Conor Dooley 0 siblings, 0 replies; 15+ messages in thread From: Conor Dooley @ 2023-04-26 17:47 UTC (permalink / raw) To: Yangyu Chen Cc: ajones, conor.dooley, devicetree, i, krzk+dt, linux-riscv, palmer, paul.walmsley, robh+dt, soha, twd2.me [-- Attachment #1.1: Type: text/plain, Size: 2484 bytes --] On Thu, Apr 27, 2023 at 01:11:00AM +0800, Yangyu Chen wrote: > Hi, > > On Wed, 26 Apr 2023 15:37:58 +0100, Conor Dooley wrote: > > Perhaps the thing to do is to actually take Yangyu's first patch and my > > second one, since the problem with backwards compatibility doesn't stop > > the kernel from being more permissive? > > How about taking my first patch[1] since the ECR[2] mentioned that > the format of the ISA string is defined in the RISC-V unprivileged > specification? That is what I suggested, no? Your 1/2 with a revised subject noting that ACPI may need it, rather than talking about DT. See also my comments to Drew about the "perils" of letting undefined spec versions have control over your ABI. > However, I think we still need to stop the parser if > some characters that the parser is not able to handle as Andrew Jones > mentioned in the previous mail[3]. In this case, we still need to add > some code to stop parsing if any error happens. Currently it skips extensions that are not valid, but keeps parsing. Apart from the case where SZX are capital letters it "should" either parse into something resembling correct or skip. If we start parsing in a case-invariant manner, I don't see any immediately problem with what we currently have. I just don't really get what we need to "protect" the kernel from. If someone controls the dtb, I think what they do to the isa string is probably the least of our worries. > In my humble opinion, backward compatibility can be solved by backports > to LTS kernels. The binding might lie in Linux, but that does not mean we can fix the problem by backporting parser changes to stable. There are other users and Linux kernels that would pre-date the change that we would be inflicting this relaxation on for no benefit at all. U-Boot for example does a case-sensitive comparison. > I think the better option is to allow using uppercase > letters in the device-tree document to make the parser coherent with > RISC-V ISA specification but recommend using all lowercase letters for > better backward compatibility. Any addition of uppercase to that binding will get my NAK. There is no realistic benefit to doing so, only potential for disruption. DT generators should emit DT that complies with bindings ¯\_(ツ)_/¯. I'll go take a proper look at your 1/2 from the other day. I had a comment about it that I didn't leave, but will do so now. Thanks, Conor. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] [-- Attachment #2: Type: text/plain, Size: 161 bytes --] _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 13:08 ` Andrew Jones 2023-04-26 13:54 ` Andrew Jones @ 2023-04-26 13:58 ` Conor Dooley 2023-04-26 14:27 ` Andrew Jones 1 sibling, 1 reply; 15+ messages in thread From: Conor Dooley @ 2023-04-26 13:58 UTC (permalink / raw) To: Andrew Jones Cc: Conor Dooley, palmer, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv [-- Attachment #1.1: Type: text/plain, Size: 3676 bytes --] On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > capital letter the parser will treat the remainder of that multi-letter > > > > extension as single-letter extensions. > > > > > > I think the problem is that the parser doesn't completely abort when > > > it sees something it doesn't understand. Continuing is risky since > > > it may be possible to compose an invalid string that gets the parser > > > to run off the rails. > > > > Usually I am of the opinion that we should not seek the validate the dt > > in the kernel, since there are tools for doing so *cough* dt-validate > > *cough*. This one seemed like low hanging fruit though, since the parser > > handles having capital letters in any of the other places after the > > rv##, but falls over pretty badly for this particular issue. > > > > In general, I don't think we need to be concerned about anything that > > fails dt-validate though, you kinda need to trust that that is correct. > > I'd argue that we might even do too much validation in the parser at > > present. > > Is there some attack vector, or ACPI related consideration, that I am > > unaware of that makes this risky? > > C language + string processing == potential attack vector Right. I thought there was some specific scenario that you had in mind, rather than the "obvious" "parsing strings is bad". What I was wondering is whether the devicetree is an attack vector you actually have to care about? I had thought it wasn't, hence asking. > > > How about completely aborting, noisily, when the string doesn't match > > > expectations, falling back to a default string such as rv64ima instead. > > > That also ought to get faster corrections of device trees. > > > > I did this first actually, but I was afraid that it would cause > > regressions? > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > invalid and dt-validate would have told you so, but at present that > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > the hardware in question (since the meaning of i was set before RVI made > > a hames of things). After thinking about it a bit cycling home, I don't actually think that this would be a regression. If your dt is not valid, then that's your problem, not ours :) Valid dt will be parsed correctly before and after such a change, so I think that that is actually okay. The dt-binding exists for a reason, and can be pointed to if anyone claims this is a regression I think. > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > try to keep processing the string. I'm happy to abort entirely on > > reaching a capital if people feel there's unlikely to be a fallout from > > that. > > There might be fallout, but the kernel needs to defend itself. IMO, if > the kernel doesn't know how to parse something, then it should stop > trying to immediately, either with a BUG(), refusing to accept any > part of it, by fallbacking back to a default, or by only accepting what > it believes it parsed correctly. > > The third option is probably a reasonable choice in this case. My patch could be interpreted as meeting the criteria for option 3. I think you instead mean "stop parsing at that point & only report the extensions seen prior to the first bad one"? Cheers, Conor. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] [-- Attachment #2: Type: text/plain, Size: 161 bytes --] _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps 2023-04-26 13:58 ` Conor Dooley @ 2023-04-26 14:27 ` Andrew Jones 0 siblings, 0 replies; 15+ messages in thread From: Andrew Jones @ 2023-04-26 14:27 UTC (permalink / raw) To: Conor Dooley Cc: Conor Dooley, palmer, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 02:58:45PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > extension as single-letter extensions. > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > it sees something it doesn't understand. Continuing is risky since > > > > it may be possible to compose an invalid string that gets the parser > > > > to run off the rails. > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > handles having capital letters in any of the other places after the > > > rv##, but falls over pretty badly for this particular issue. > > > > > > In general, I don't think we need to be concerned about anything that > > > fails dt-validate though, you kinda need to trust that that is correct. > > > I'd argue that we might even do too much validation in the parser at > > > present. > > > Is there some attack vector, or ACPI related consideration, that I am > > > unaware of that makes this risky? > > > > C language + string processing == potential attack vector > > Right. I thought there was some specific scenario that you had in mind, > rather than the "obvious" "parsing strings is bad". Yup, I only pointed out the obvious since it's always possible, at least for me, to lose sight of the forest for the trees. > What I was wondering is whether the devicetree is an attack vector you > actually have to care about? I had thought it wasn't, hence asking. Nope, I haven't put any thought into this potential attack vector beyond this discussion. > > > > > How about completely aborting, noisily, when the string doesn't match > > > > expectations, falling back to a default string such as rv64ima instead. > > > > That also ought to get faster corrections of device trees. > > > > > > I did this first actually, but I was afraid that it would cause > > > regressions? > > > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > > invalid and dt-validate would have told you so, but at present that > > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > > the hardware in question (since the meaning of i was set before RVI made > > > a hames of things). > > After thinking about it a bit cycling home, I don't actually think that > this would be a regression. If your dt is not valid, then that's your > problem, not ours :) > Valid dt will be parsed correctly before and after such a change, so I > think that that is actually okay. > The dt-binding exists for a reason, and can be pointed to if anyone > claims this is a regression I think. I agree. > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > > try to keep processing the string. I'm happy to abort entirely on > > > reaching a capital if people feel there's unlikely to be a fallout from > > > that. > > > > There might be fallout, but the kernel needs to defend itself. IMO, if > > the kernel doesn't know how to parse something, then it should stop > > trying to immediately, either with a BUG(), refusing to accept any > > part of it, by fallbacking back to a default, or by only accepting what > > it believes it parsed correctly. > > > > The third option is probably a reasonable choice in this case. > > My patch could be interpreted as meeting the criteria for option 3. > I think you instead mean "stop parsing at that point & only report the > extensions seen prior to the first bad one"? Right. Thanks, drew _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning 2023-04-26 10:43 [PATCH v1 0/2] Handle multi-letter extensions starting with caps in riscv,isa Conor Dooley 2023-04-26 10:43 ` [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps Conor Dooley @ 2023-04-26 10:43 ` Conor Dooley 2023-04-26 12:20 ` Andrew Jones 2023-04-27 16:06 ` Rob Herring 1 sibling, 2 replies; 15+ messages in thread From: Conor Dooley @ 2023-04-26 10:43 UTC (permalink / raw) To: palmer Cc: conor, conor.dooley, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv "Ease of parsing" may have been the initial argument for keeping this string in lower-case, but parsers may have been written that expect lower-case only. For example, the one in released kernels currently does not behave correctly for multi-letter extensions that begin with a capital letter. Allowing upper-case here brings about no benefit but would break compatibility between new devicetrees and older kernels. Drop the comment to avoid confusing people. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> --- Documentation/devicetree/bindings/riscv/cpus.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml index f24cf9601c6e..9e273a3264e3 100644 --- a/Documentation/devicetree/bindings/riscv/cpus.yaml +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml @@ -86,7 +86,7 @@ properties: While the isa strings in ISA specification are case insensitive, letters in the riscv,isa string must be all - lowercase to simplify parsing. + lowercase. $ref: "/schemas/types.yaml#/definitions/string" pattern: ^rv(?:64|32)imaf?d?q?c?b?k?j?p?v?h?(?:[hsxz](?:[a-z])+)?(?:_[hsxz](?:[a-z])+)*$ -- 2.39.2 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning 2023-04-26 10:43 ` [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning Conor Dooley @ 2023-04-26 12:20 ` Andrew Jones 2023-04-27 16:06 ` Rob Herring 1 sibling, 0 replies; 15+ messages in thread From: Andrew Jones @ 2023-04-26 12:20 UTC (permalink / raw) To: Conor Dooley Cc: palmer, conor, Paul Walmsley, Rob Herring, Krzysztof Kozlowski, Wende Tan, Soha Jin, Hongren Zheng, Yangyu Chen, devicetree, linux-riscv On Wed, Apr 26, 2023 at 11:43:25AM +0100, Conor Dooley wrote: > "Ease of parsing" may have been the initial argument for keeping this > string in lower-case, but parsers may have been written that expect > lower-case only. > For example, the one in released kernels currently does not behave > correctly for multi-letter extensions that begin with a capital letter. > Allowing upper-case here brings about no benefit but would break > compatibility between new devicetrees and older kernels. > > Drop the comment to avoid confusing people. > > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > Documentation/devicetree/bindings/riscv/cpus.yaml | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml > index f24cf9601c6e..9e273a3264e3 100644 > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml > @@ -86,7 +86,7 @@ properties: > > While the isa strings in ISA specification are case > insensitive, letters in the riscv,isa string must be all > - lowercase to simplify parsing. > + lowercase. > $ref: "/schemas/types.yaml#/definitions/string" > pattern: ^rv(?:64|32)imaf?d?q?c?b?k?j?p?v?h?(?:[hsxz](?:[a-z])+)?(?:_[hsxz](?:[a-z])+)*$ > > -- > 2.39.2 > Reviewed-by: Andrew Jones <ajones@ventanamicro.com> _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning 2023-04-26 10:43 ` [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning Conor Dooley 2023-04-26 12:20 ` Andrew Jones @ 2023-04-27 16:06 ` Rob Herring 1 sibling, 0 replies; 15+ messages in thread From: Rob Herring @ 2023-04-27 16:06 UTC (permalink / raw) To: Conor Dooley Cc: Krzysztof Kozlowski, Soha Jin, palmer, Wende Tan, conor, Paul Walmsley, Hongren Zheng, Yangyu Chen, linux-riscv, devicetree, Rob Herring On Wed, 26 Apr 2023 11:43:25 +0100, Conor Dooley wrote: > "Ease of parsing" may have been the initial argument for keeping this > string in lower-case, but parsers may have been written that expect > lower-case only. > For example, the one in released kernels currently does not behave > correctly for multi-letter extensions that begin with a capital letter. > Allowing upper-case here brings about no benefit but would break > compatibility between new devicetrees and older kernels. > > Drop the comment to avoid confusing people. > > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > Documentation/devicetree/bindings/riscv/cpus.yaml | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Acked-by: Rob Herring <robh@kernel.org> _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-04-27 16:06 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-04-26 10:43 [PATCH v1 0/2] Handle multi-letter extensions starting with caps in riscv,isa Conor Dooley 2023-04-26 10:43 ` [PATCH v1 1/2] RISC-V: skip parsing multi-letter extensions starting with caps Conor Dooley 2023-04-26 12:18 ` Andrew Jones 2023-04-26 12:47 ` Conor Dooley 2023-04-26 13:08 ` Andrew Jones 2023-04-26 13:54 ` Andrew Jones 2023-04-26 14:37 ` Conor Dooley 2023-04-26 15:01 ` Andrew Jones 2023-04-26 17:11 ` Yangyu Chen 2023-04-26 17:47 ` Conor Dooley 2023-04-26 13:58 ` Conor Dooley 2023-04-26 14:27 ` Andrew Jones 2023-04-26 10:43 ` [PATCH v1 2/2] dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning Conor Dooley 2023-04-26 12:20 ` Andrew Jones 2023-04-27 16:06 ` Rob Herring
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox