[PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.

rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
@ 2025-02-28 13:25 Guillaume Gomez
  0 siblings, 0 replies; 12+ messages in thread
From: Guillaume Gomez @ 2025-02-28 13:25 UTC (permalink / raw)
  To: ojeda, rust-for-linux; +Cc: Guillaume Gomez

Before this patch, the code was using very hacky methods in order to retrieve
doctests, modify them as needed and then concatenate all of them in one file.

Now, with this new flag, it instead asks rustdoc to provide the doctests
code with their associated information such as file path and line number.
---
 rust/Makefile                   |   8 +-
 scripts/json.rs                 | 235 ++++++++++++++++++++++++++++++++
 scripts/rustdoc_test_builder.rs | 115 ++++++++++------
 scripts/rustdoc_test_gen.rs     |  17 +--
 4 files changed, 323 insertions(+), 52 deletions(-)
 create mode 100644 scripts/json.rs

diff --git a/rust/Makefile b/rust/Makefile
index ea3849eb7..237ed23f8 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -171,14 +171,14 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
 	rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
 	mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
 	OBJTREE=$(abspath $(objtree)) \
-	$(RUSTDOC) --test $(rust_flags) \
+	$(RUSTDOC) --output-format=doctest $(rust_flags) \
 		-L$(objtree)/$(obj) --extern ffi --extern kernel \
 		--extern build_error --extern macros \
 		--extern bindings --extern uapi \
-		--no-run --crate-name kernel -Zunstable-options \
+		--crate-name kernel -Zunstable-options \
 		--sysroot=/dev/null \
-		--test-builder $(objtree)/scripts/rustdoc_test_builder \
-		$< $(rustdoc_test_kernel_quiet); \
+		$< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
+	cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
 	$(objtree)/scripts/rustdoc_test_gen
 
 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000..aff24bfd9
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) -> Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`: found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object, found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char) -> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index e5894652f..9b6bc1c1d 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -15,58 +15,93 @@
 //! from that. For the moment, we generate ourselves a new name, `{file}_{number}` instead, in
 //! the `gen` script (done there since we need to be aware of all the tests in a given file).
 
+use std::fs::create_dir_all;
 use std::io::Read;
 
-fn main() {
-    let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+use json::JsonValue;
 
-    // Find the generated function name looking for the inner function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines (when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
+mod json;
 
-    // Qualify `Result` to avoid the collision with our own `Result` coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl core::fmt::Debug> {{"),
-        &format!("{rustdoc_function_name}() -> core::result::Result<(), impl core::fmt::Debug> {{"),
-    );
+fn generate_doctest(file: &str, line: i32, doctest_code: &str) {
+    let file = file
+        .strip_suffix(".rs")
+        .unwrap_or(file)
+        .strip_prefix("../rust/kernel/")
+        .unwrap_or(file)
+        .replace('/', "_");
+    let path = format!("rust/test/doctests/kernel/{file}-{line}.rs");
 
+    // We replace the `Result` if needed.
+    let doctest_code = doctest_code.replace(
+        "fn main() { fn _inner() -> Result<",
+        "fn main() { fn _inner() -> core::result::Result<",
+    );
     // For tests that get generated with `Result`, like above, `rustdoc` generates an `unwrap()` on
     // the return value to check there were no returned errors. Instead, we use our assert macro
     // since we want to just fail the test, not panic the kernel.
     //
     // We save the result in a variable so that the failed assertion message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value = {rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
+    let doctest_code = doctest_code.replace(
+        "} _inner().unwrap() }",
+        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
     );
+    std::fs::write(path, doctest_code.as_bytes()).unwrap();
+}
+
+fn main() {
+    let mut stdin = std::io::stdin().lock();
+    let mut body = String::new();
+    stdin.read_to_string(&mut body).unwrap();
+
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&body).unwrap() else {
+        panic!("Expected an object")
+    };
+    if let Some(JsonValue::Number(format_version)) = rustdoc.get("format_version") {
+        if *format_version != 1 {
+            panic!("unsupported rustdoc format version: {format_version}");
+        }
+    } else {
+        panic!("missing `format_version` field");
+    }
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("missing `doctests` field");
+    };
+
+    // We ignore the error since it will fail when generating doctests below if the folder doesn't
+    // exist.
+    let _ = create_dir_all("rust/test/doctests/kernel");
 
-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+    let mut nb_generated = 0;
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };
 
-    let path = format!("rust/test/doctests/kernel/{name}");
+        // We check if we need to skip this test by checking it's a rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) = doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            } else if let Some(JsonValue::String(ignore)) = attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+        if let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::String(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("doctest_code"),
+        ) {
+            generate_doctest(file, *line, doctest_code);
+            nb_generated += 1;
+        }
+    }
 
-    std::fs::write(path, body.as_bytes()).unwrap();
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index 5ebd42ae4..fd6635bbf 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -48,7 +48,7 @@
 fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut Vec<PathBuf>, file: &str) -> &'a str {
     valid_paths.clear();
 
-    let potential_components: Vec<&str> = file.strip_suffix("_rs").unwrap().split('_').collect();
+    let potential_components: Vec<&str> = file.split('_').collect();
 
     find_candidates(srctree, valid_paths, Path::new(""), &potential_components);
     fn find_candidates(
@@ -87,8 +87,8 @@ fn find_candidates(
 
     assert!(
         valid_paths.len() > 0,
-        "No path candidates found. This is likely a bug in the build system, or some files went \
-        away while compiling."
+        "No path candidates found for `{file}`. This is likely a bug in the build system, or some \
+        files went away while compiling.",
     );
 
     if valid_paths.len() > 1 {
@@ -97,8 +97,8 @@ fn find_candidates(
             eprintln!("    {path:?}");
         }
         panic!(
-            "Several path candidates found, please resolve the ambiguity by renaming a file or \
-            folder."
+            "Several path candidates found for `{file}`, please resolve the ambiguity by renaming \
+            a file or folder."
         );
     }
 
@@ -126,12 +126,13 @@ fn main() {
     let mut valid_paths: Vec<PathBuf> = Vec::new();
     let mut real_path: &str = "";
     for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern (see description in
+        // The `name` follows the `{file}_{line}` pattern (see description in
         // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
         let name = path.file_name().unwrap().to_str().unwrap().to_string();
 
-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) = name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
+        // Extract the `file` and the `line`, discarding the extension.
+        let (file, line) = name.rsplit_once('-').unwrap();
+        let line = line.split('.').next().unwrap();
 
         // Generate an ID sequence ("test number") for each one in the file.
         if file == last_file {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
@ 2025-02-28 13:29 Guillaume Gomez
  2025-02-28 15:43 ` Miguel Ojeda
  0 siblings, 1 reply; 12+ messages in thread
From: Guillaume Gomez @ 2025-02-28 13:29 UTC (permalink / raw)
  To: ojeda, rust-for-linux, linux-kernel; +Cc: Guillaume Gomez

Before this patch, the code was using very hacky methods in order to retrieve
doctests, modify them as needed and then concatenate all of them in one file.

Now, with this new flag, it instead asks rustdoc to provide the doctests
code with their associated information such as file path and line number.
---
 rust/Makefile                   |   8 +-
 scripts/json.rs                 | 235 ++++++++++++++++++++++++++++++++
 scripts/rustdoc_test_builder.rs | 115 ++++++++++------
 scripts/rustdoc_test_gen.rs     |  17 +--
 4 files changed, 323 insertions(+), 52 deletions(-)
 create mode 100644 scripts/json.rs

diff --git a/rust/Makefile b/rust/Makefile
index ea3849eb7..237ed23f8 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -171,14 +171,14 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
 	rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
 	mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
 	OBJTREE=$(abspath $(objtree)) \
-	$(RUSTDOC) --test $(rust_flags) \
+	$(RUSTDOC) --output-format=doctest $(rust_flags) \
 		-L$(objtree)/$(obj) --extern ffi --extern kernel \
 		--extern build_error --extern macros \
 		--extern bindings --extern uapi \
-		--no-run --crate-name kernel -Zunstable-options \
+		--crate-name kernel -Zunstable-options \
 		--sysroot=/dev/null \
-		--test-builder $(objtree)/scripts/rustdoc_test_builder \
-		$< $(rustdoc_test_kernel_quiet); \
+		$< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
+	cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
 	$(objtree)/scripts/rustdoc_test_gen
 
 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000..aff24bfd9
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) -> Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`: found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object, found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char) -> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index e5894652f..9b6bc1c1d 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -15,58 +15,93 @@
 //! from that. For the moment, we generate ourselves a new name, `{file}_{number}` instead, in
 //! the `gen` script (done there since we need to be aware of all the tests in a given file).
 
+use std::fs::create_dir_all;
 use std::io::Read;
 
-fn main() {
-    let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+use json::JsonValue;
 
-    // Find the generated function name looking for the inner function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines (when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
+mod json;
 
-    // Qualify `Result` to avoid the collision with our own `Result` coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl core::fmt::Debug> {{"),
-        &format!("{rustdoc_function_name}() -> core::result::Result<(), impl core::fmt::Debug> {{"),
-    );
+fn generate_doctest(file: &str, line: i32, doctest_code: &str) {
+    let file = file
+        .strip_suffix(".rs")
+        .unwrap_or(file)
+        .strip_prefix("../rust/kernel/")
+        .unwrap_or(file)
+        .replace('/', "_");
+    let path = format!("rust/test/doctests/kernel/{file}-{line}.rs");
 
+    // We replace the `Result` if needed.
+    let doctest_code = doctest_code.replace(
+        "fn main() { fn _inner() -> Result<",
+        "fn main() { fn _inner() -> core::result::Result<",
+    );
     // For tests that get generated with `Result`, like above, `rustdoc` generates an `unwrap()` on
     // the return value to check there were no returned errors. Instead, we use our assert macro
     // since we want to just fail the test, not panic the kernel.
     //
     // We save the result in a variable so that the failed assertion message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value = {rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
+    let doctest_code = doctest_code.replace(
+        "} _inner().unwrap() }",
+        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
     );
+    std::fs::write(path, doctest_code.as_bytes()).unwrap();
+}
+
+fn main() {
+    let mut stdin = std::io::stdin().lock();
+    let mut body = String::new();
+    stdin.read_to_string(&mut body).unwrap();
+
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&body).unwrap() else {
+        panic!("Expected an object")
+    };
+    if let Some(JsonValue::Number(format_version)) = rustdoc.get("format_version") {
+        if *format_version != 1 {
+            panic!("unsupported rustdoc format version: {format_version}");
+        }
+    } else {
+        panic!("missing `format_version` field");
+    }
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("missing `doctests` field");
+    };
+
+    // We ignore the error since it will fail when generating doctests below if the folder doesn't
+    // exist.
+    let _ = create_dir_all("rust/test/doctests/kernel");
 
-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+    let mut nb_generated = 0;
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };
 
-    let path = format!("rust/test/doctests/kernel/{name}");
+        // We check if we need to skip this test by checking it's a rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) = doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            } else if let Some(JsonValue::String(ignore)) = attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+        if let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::String(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("doctest_code"),
+        ) {
+            generate_doctest(file, *line, doctest_code);
+            nb_generated += 1;
+        }
+    }
 
-    std::fs::write(path, body.as_bytes()).unwrap();
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index 5ebd42ae4..fd6635bbf 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -48,7 +48,7 @@
 fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut Vec<PathBuf>, file: &str) -> &'a str {
     valid_paths.clear();
 
-    let potential_components: Vec<&str> = file.strip_suffix("_rs").unwrap().split('_').collect();
+    let potential_components: Vec<&str> = file.split('_').collect();
 
     find_candidates(srctree, valid_paths, Path::new(""), &potential_components);
     fn find_candidates(
@@ -87,8 +87,8 @@ fn find_candidates(
 
     assert!(
         valid_paths.len() > 0,
-        "No path candidates found. This is likely a bug in the build system, or some files went \
-        away while compiling."
+        "No path candidates found for `{file}`. This is likely a bug in the build system, or some \
+        files went away while compiling.",
     );
 
     if valid_paths.len() > 1 {
@@ -97,8 +97,8 @@ fn find_candidates(
             eprintln!("    {path:?}");
         }
         panic!(
-            "Several path candidates found, please resolve the ambiguity by renaming a file or \
-            folder."
+            "Several path candidates found for `{file}`, please resolve the ambiguity by renaming \
+            a file or folder."
         );
     }
 
@@ -126,12 +126,13 @@ fn main() {
     let mut valid_paths: Vec<PathBuf> = Vec::new();
     let mut real_path: &str = "";
     for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern (see description in
+        // The `name` follows the `{file}_{line}` pattern (see description in
         // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
         let name = path.file_name().unwrap().to_str().unwrap().to_string();
 
-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) = name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
+        // Extract the `file` and the `line`, discarding the extension.
+        let (file, line) = name.rsplit_once('-').unwrap();
+        let line = line.split('.').next().unwrap();
 
         // Generate an ID sequence ("test number") for each one in the file.
         if file == last_file {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
  2025-02-28 13:29 Guillaume Gomez
@ 2025-02-28 15:43 ` Miguel Ojeda
  2025-02-28 16:33   ` Guillaume Gomez
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Miguel Ojeda @ 2025-02-28 15:43 UTC (permalink / raw)
  To: Guillaume Gomez; +Cc: ojeda, rust-for-linux, linux-kernel

On Fri, Feb 28, 2025 at 2:29 PM Guillaume Gomez
<guillaume1.gomez@gmail.com> wrote:
>
> Before this patch, the code was using very hacky methods in order to retrieve
> doctests, modify them as needed and then concatenate all of them in one file.
>
> Now, with this new flag, it instead asks rustdoc to provide the doctests
> code with their associated information such as file path and line number.

Thanks for doing this (and the `rustdoc` side of it!), and welcome!

Normally in commit messages we need to give more details:

  - A Signed-off-by is required (please see for what that implies).

    https://docs.kernel.org/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin

  - The commit message should explain the motivation, i.e. removing
the 2 unstable `rustdoc` features, and moving to the new,
intended-to-be-stable one (which one? etc.), and so on.

  - Also, a bit of context (or at least linking to it) would be good,
e.g. explaining that we were working on removing unstable features,
and that `rustdoc` provided a new flag for the kernel, ideally with
links to the relevant Rust tracking issues/PRs (typically with a
"Link: " tag), plus which version the unstable feature is available,
etc. (well, when we know for sure, i.e. I understand that we may need
to tweak it still).

Anyway, those are procedural details we can sort out later :) What is
most important I think are the following two notes, please see below.

> -       $(RUSTDOC) --test $(rust_flags) \
> +       $(RUSTDOC) --output-format=doctest $(rust_flags) \
>                 -L$(objtree)/$(obj) --extern ffi --extern kernel \
>                 --extern build_error --extern macros \
>                 --extern bindings --extern uapi \
> -               --no-run --crate-name kernel -Zunstable-options \
> +               --crate-name kernel -Zunstable-options \
>                 --sysroot=/dev/null \
> -               --test-builder $(objtree)/scripts/rustdoc_test_builder \
> -               $< $(rustdoc_test_kernel_quiet); \
> +               $< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
> +       cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
>         $(objtree)/scripts/rustdoc_test_gen

We currently support versions 1.78+, so we will need to do these
things conditionally. I can help with that, so don't worry too much
about it for the moment (e.g. we have `rustc-option` and other helpers
to pick the right flags given a version etc.).

Similarly, we will also need to tell the script whether to use the old
or the new way, too.

> +    // We replace the `Result` if needed.
> +    let doctest_code = doctest_code.replace(
> +        "fn main() { fn _inner() -> Result<",
> +        "fn main() { fn _inner() -> core::result::Result<",
> +    );

Hmm... I was hoping we could do something on the `rustdoc` side to
remove these hacks altogether since we are going to have a "proper
flag" for it, i.e. to avoid relying on the particular "format" of the
tests somehow.

I wrote about a bit of that here:

    https://github.com/rust-lang/rust/pull/134531#discussion_r1894610592

> +    let doctest_code = doctest_code.replace(
> +        "} _inner().unwrap() }",
> +        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
>      );
> +    std::fs::write(path, doctest_code.as_bytes()).unwrap();
> +}

Same for this.

> +    } else {
> +        panic!("missing `format_version` field");
> +    }

`expect` maybe?

> @@ -87,8 +87,8 @@ fn find_candidates(
>
>      assert!(
>          valid_paths.len() > 0,
> -        "No path candidates found. This is likely a bug in the build system, or some files went \
> -        away while compiling."
> +        "No path candidates found for `{file}`. This is likely a bug in the build system, or some \
> +        files went away while compiling.",
>      );
>
>      if valid_paths.len() > 1 {
> @@ -97,8 +97,8 @@ fn find_candidates(
>              eprintln!("    {path:?}");
>          }
>          panic!(
> -            "Several path candidates found, please resolve the ambiguity by renaming a file or \
> -            folder."
> +            "Several path candidates found for `{file}`, please resolve the ambiguity by renaming \
> +            a file or folder."
>          );
>      }

These two bits could go in a first patch, I think, though it isn't a big deal.

Thanks again for getting us out of unstable in `rustdoc`!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
  2025-02-28 15:43 ` Miguel Ojeda
@ 2025-02-28 16:33   ` Guillaume Gomez
  2025-02-28 16:38   ` Guillaume Gomez
       [not found]   ` <CAAOQCfSEUnp8U3+6amWCd6+yPrAmPy6gsjJnQdrqmpER5md3kA@mail.gmail.com>
  2 siblings, 0 replies; 12+ messages in thread
From: Guillaume Gomez @ 2025-02-28 16:33 UTC (permalink / raw)
  To: Miguel Ojeda; +Cc: ojeda, rust-for-linux, linux-kernel

Thanks for the detailed answer!

I added the missing "Signed-of-by" in the commit message.

I added some more context in the commit message, hopefully I didn't
miss anything.

> We currently support versions 1.78+, so we will need to do these
> things conditionally. I can help with that, so don't worry too much
> about it for the moment (e.g. we have `rustc-option` and other helpers
> to pick the right flags given a version etc.).

I'll definitely need some help here. I'm not sure current stable already
has this change so until then, we'll need a beta/nightly version to run
these tests.

> Hmm... I was hoping we could do something on the `rustdoc` side to
> remove these hacks altogether since we are going to have a "proper
> flag" for it, i.e. to avoid relying on the particular "format" of the
> tests somehow.

I opened https://github.com/rust-lang/rust/pull/137807 to resolve
this problem.

> > +    let doctest_code = doctest_code.replace(
> > +        "} _inner().unwrap() }",
> > +        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
> >      );
> > +    std::fs::write(path, doctest_code.as_bytes()).unwrap();
> > +}
>
> Same for this.

For this one I'll need to check first if it can be done "safely" (ie,
so unexpected side
effect).

> > +    } else {
> > +        panic!("missing `format_version` field");
> > +    }
>
> `expect` maybe?

I don't think `expect` would work in any of the cases in this file.
What I suggest
is to add methods on `JsonValue` in a future patch which would allow to reduce
code in this file (and call `expect` too).

> These two bits could go in a first patch, I think, though it isn't a big deal.

You're absolutely right, removing them and sending a separate patch.

Here is the updated patch:


The goal of this patch is to remove the use of 2 unstable rustdoc features
(`--no-run` and `--test-builder`) and replace it with a stable feature:
`--output-format=doctest`, which was added in
https://github.com/rust-lang/rust/pull/134531.

Before this patch, the code was using very hacky methods in order to retrieve
doctests, modify them as needed and then concatenate all of them in one file.

Now, with this new flag, it instead asks rustdoc to provide the doctests
code with their associated information such as file path and line number.

Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
---
 rust/Makefile                   |   8 +-
 scripts/json.rs                 | 235 ++++++++++++++++++++++++++++++++
 scripts/rustdoc_test_builder.rs | 115 ++++++++++------
 scripts/rustdoc_test_gen.rs     |  11 +-
 4 files changed, 320 insertions(+), 49 deletions(-)
 create mode 100644 scripts/json.rs

diff --git a/rust/Makefile b/rust/Makefile
index ea3849eb7..237ed23f8 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -171,14 +171,14 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
  rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
  mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
  OBJTREE=$(abspath $(objtree)) \
- $(RUSTDOC) --test $(rust_flags) \
+ $(RUSTDOC) --output-format=doctest $(rust_flags) \
  -L$(objtree)/$(obj) --extern ffi --extern kernel \
  --extern build_error --extern macros \
  --extern bindings --extern uapi \
- --no-run --crate-name kernel -Zunstable-options \
+ --crate-name kernel -Zunstable-options \
  --sysroot=/dev/null \
- --test-builder $(objtree)/scripts/rustdoc_test_builder \
- $< $(rustdoc_test_kernel_quiet); \
+ $< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
+ cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
  $(objtree)/scripts/rustdoc_test_gen

 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000..aff24bfd9
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when
parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) ->
Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we
put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the
iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`:
found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) ->
Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) ->
Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object,
found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char)
-> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after
content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index e5894652f..ba17e9db9 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -15,58 +15,93 @@
 //! from that. For the moment, we generate ourselves a new name,
`{file}_{number}` instead, in
 //! the `gen` script (done there since we need to be aware of all the
tests in a given file).

+use std::fs::create_dir_all;
 use std::io::Read;

-fn main() {
-    let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+use json::JsonValue;

-    // Find the generated function name looking for the inner
function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn
_doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn
_doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl
core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines
(when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
+mod json;

-    // Qualify `Result` to avoid the collision with our own `Result`
coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl
core::fmt::Debug> {{"),
-        &format!("{rustdoc_function_name}() ->
core::result::Result<(), impl core::fmt::Debug> {{"),
-    );
+fn generate_doctest(file: &str, line: i32, doctest_code: &str) {
+    let file = file
+        .strip_suffix(".rs")
+        .unwrap_or(file)
+        .strip_prefix("../rust/kernel/")
+        .unwrap_or(file)
+        .replace('/', "_");
+    let path = format!("rust/test/doctests/kernel/{file}-{line}.rs");

+    // We replace the `Result` if needed.
+    let doctest_code = doctest_code.replace(
+        "fn main() { fn _inner() -> Result<",
+        "fn main() { fn _inner() -> core::result::Result<",
+    );
     // For tests that get generated with `Result`, like above,
`rustdoc` generates an `unwrap()` on
     // the return value to check there were no returned errors.
Instead, we use our assert macro
     // since we want to just fail the test, not panic the kernel.
     //
     // We save the result in a variable so that the failed assertion
message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value =
{rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
+    let doctest_code = doctest_code.replace(
+        "} _inner().unwrap() }",
+        "} let test_return_value = _inner();
assert!(test_return_value.is_ok()); }",
     );
+    std::fs::write(path, doctest_code.as_bytes()).unwrap();
+}
+
+fn main() {
+    let mut stdin = std::io::stdin().lock();
+    let mut body = String::new();
+    stdin.read_to_string(&mut body).unwrap();
+
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&body).unwrap() else {
+        panic!("Expected an object")
+    };
+    if let Some(JsonValue::Number(format_version)) =
rustdoc.get("format_version") {
+        if *format_version != 1 {
+            panic!("unsupported rustdoc format version: {format_version}");
+        }
+    } else {
+        panic!("missing `format_version` field");
+    }
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("`doctests` field is missing or has the wrong type");
+    };
+
+    // We ignore the error since it will fail when generating
doctests below if the folder doesn't
+    // exist.
+    let _ = create_dir_all("rust/test/doctests/kernel");

-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+    let mut nb_generated = 0;
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };

-    let path = format!("rust/test/doctests/kernel/{name}");
+        // We check if we need to skip this test by checking it's a
rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) =
doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            } else if let Some(JsonValue::String(ignore)) =
attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+        if let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::String(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("doctest_code"),
+        ) {
+            generate_doctest(file, *line, doctest_code);
+            nb_generated += 1;
+        }
+    }

-    std::fs::write(path, body.as_bytes()).unwrap();
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index 5ebd42ae4..66e665ea0 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -48,7 +48,7 @@
 fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut
Vec<PathBuf>, file: &str) -> &'a str {
     valid_paths.clear();

-    let potential_components: Vec<&str> =
file.strip_suffix("_rs").unwrap().split('_').collect();
+    let potential_components: Vec<&str> = file.split('_').collect();

     find_candidates(srctree, valid_paths, Path::new(""),
&potential_components);
     fn find_candidates(
@@ -88,7 +88,7 @@ fn find_candidates(
     assert!(
         valid_paths.len() > 0,
         "No path candidates found. This is likely a bug in the build
system, or some files went \
-        away while compiling."
+        away while compiling.",
     );

     if valid_paths.len() > 1 {
@@ -126,12 +126,13 @@ fn main() {
     let mut valid_paths: Vec<PathBuf> = Vec::new();
     let mut real_path: &str = "";
     for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern
(see description in
+        // The `name` follows the `{file}_{line}` pattern (see description in
         // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
         let name = path.file_name().unwrap().to_str().unwrap().to_string();

-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) =
name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
+        // Extract the `file` and the `line`, discarding the extension.
+        let (file, line) = name.rsplit_once('-').unwrap();
+        let line = line.split('.').next().unwrap();

         // Generate an ID sequence ("test number") for each one in the file.
         if file == last_file {
-- 
2.48.1


Le ven. 28 févr. 2025 à 16:43, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> a écrit :
>
> On Fri, Feb 28, 2025 at 2:29 PM Guillaume Gomez
> <guillaume1.gomez@gmail.com> wrote:
> >
> > Before this patch, the code was using very hacky methods in order to retrieve
> > doctests, modify them as needed and then concatenate all of them in one file.
> >
> > Now, with this new flag, it instead asks rustdoc to provide the doctests
> > code with their associated information such as file path and line number.
>
> Thanks for doing this (and the `rustdoc` side of it!), and welcome!
>
> Normally in commit messages we need to give more details:
>
>   - A Signed-off-by is required (please see for what that implies).
>
>     https://docs.kernel.org/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin
>
>   - The commit message should explain the motivation, i.e. removing
> the 2 unstable `rustdoc` features, and moving to the new,
> intended-to-be-stable one (which one? etc.), and so on.
>
>   - Also, a bit of context (or at least linking to it) would be good,
> e.g. explaining that we were working on removing unstable features,
> and that `rustdoc` provided a new flag for the kernel, ideally with
> links to the relevant Rust tracking issues/PRs (typically with a
> "Link: " tag), plus which version the unstable feature is available,
> etc. (well, when we know for sure, i.e. I understand that we may need
> to tweak it still).
>
> Anyway, those are procedural details we can sort out later :) What is
> most important I think are the following two notes, please see below.
>
> > -       $(RUSTDOC) --test $(rust_flags) \
> > +       $(RUSTDOC) --output-format=doctest $(rust_flags) \
> >                 -L$(objtree)/$(obj) --extern ffi --extern kernel \
> >                 --extern build_error --extern macros \
> >                 --extern bindings --extern uapi \
> > -               --no-run --crate-name kernel -Zunstable-options \
> > +               --crate-name kernel -Zunstable-options \
> >                 --sysroot=/dev/null \
> > -               --test-builder $(objtree)/scripts/rustdoc_test_builder \
> > -               $< $(rustdoc_test_kernel_quiet); \
> > +               $< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
> > +       cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
> >         $(objtree)/scripts/rustdoc_test_gen
>
> We currently support versions 1.78+, so we will need to do these
> things conditionally. I can help with that, so don't worry too much
> about it for the moment (e.g. we have `rustc-option` and other helpers
> to pick the right flags given a version etc.).
>
> Similarly, we will also need to tell the script whether to use the old
> or the new way, too.
>
> > +    // We replace the `Result` if needed.
> > +    let doctest_code = doctest_code.replace(
> > +        "fn main() { fn _inner() -> Result<",
> > +        "fn main() { fn _inner() -> core::result::Result<",
> > +    );
>
> Hmm... I was hoping we could do something on the `rustdoc` side to
> remove these hacks altogether since we are going to have a "proper
> flag" for it, i.e. to avoid relying on the particular "format" of the
> tests somehow.
>
> I wrote about a bit of that here:
>
>     https://github.com/rust-lang/rust/pull/134531#discussion_r1894610592
>
> > +    let doctest_code = doctest_code.replace(
> > +        "} _inner().unwrap() }",
> > +        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
> >      );
> > +    std::fs::write(path, doctest_code.as_bytes()).unwrap();
> > +}
>
> Same for this.
>
> > +    } else {
> > +        panic!("missing `format_version` field");
> > +    }
>
> `expect` maybe?
>
> > @@ -87,8 +87,8 @@ fn find_candidates(
> >
> >      assert!(
> >          valid_paths.len() > 0,
> > -        "No path candidates found. This is likely a bug in the build system, or some files went \
> > -        away while compiling."
> > +        "No path candidates found for `{file}`. This is likely a bug in the build system, or some \
> > +        files went away while compiling.",
> >      );
> >
> >      if valid_paths.len() > 1 {
> > @@ -97,8 +97,8 @@ fn find_candidates(
> >              eprintln!("    {path:?}");
> >          }
> >          panic!(
> > -            "Several path candidates found, please resolve the ambiguity by renaming a file or \
> > -            folder."
> > +            "Several path candidates found for `{file}`, please resolve the ambiguity by renaming \
> > +            a file or folder."
> >          );
> >      }
>
> These two bits could go in a first patch, I think, though it isn't a big deal.
>
> Thanks again for getting us out of unstable in `rustdoc`!
>
> Cheers,
> Miguel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
  2025-02-28 15:43 ` Miguel Ojeda
  2025-02-28 16:33   ` Guillaume Gomez
@ 2025-02-28 16:38   ` Guillaume Gomez
       [not found]   ` <CAAOQCfSEUnp8U3+6amWCd6+yPrAmPy6gsjJnQdrqmpER5md3kA@mail.gmail.com>
  2 siblings, 0 replies; 12+ messages in thread
From: Guillaume Gomez @ 2025-02-28 16:38 UTC (permalink / raw)
  To: miguel.ojeda.sandonis
  Cc: guillaume1.gomez, linux-kernel, ojeda, rust-for-linux

Thanks for the detailed answer!

I added the missing "Signed-of-by" in the commit message.

I added some more context in the commit message, hopefully I didn't
miss anything.

> We currently support versions 1.78+, so we will need to do these
> things conditionally. I can help with that, so don't worry too much
> about it for the moment (e.g. we have `rustc-option` and other helpers
> to pick the right flags given a version etc.).

I'll definitely need some help here. I'm not sure current stable already
has this change so until then, we'll need a beta/nightly version to run
these tests.

> Hmm... I was hoping we could do something on the `rustdoc` side to
> remove these hacks altogether since we are going to have a "proper
> flag" for it, i.e. to avoid relying on the particular "format" of the
> tests somehow.

I opened https://github.com/rust-lang/rust/pull/137807 to resolve
this problem.

> > +    let doctest_code = doctest_code.replace(
> > +        "} _inner().unwrap() }",
> > +        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
> >      );
> > +    std::fs::write(path, doctest_code.as_bytes()).unwrap();
> > +}
>
> Same for this.

For this one I'll need to check first if it can be done "safely" (ie,
so unexpected side
effect).

> > +    } else {
> > +        panic!("missing `format_version` field");
> > +    }
>
> `expect` maybe?

I don't think `expect` would work in any of the cases in this file.
What I suggest
is to add methods on `JsonValue` in a future patch which would allow to reduce
code in this file (and call `expect` too).

> These two bits could go in a first patch, I think, though it isn't a big deal.

You're absolutely right, removing them and sending a separate patch.

Here is the updated patch:


The goal of this patch is to remove the use of 2 unstable rustdoc features
(`--no-run` and `--test-builder`) and replace it with a stable feature:
`--output-format=doctest`, which was added in
https://github.com/rust-lang/rust/pull/134531.

Before this patch, the code was using very hacky methods in order to retrieve
doctests, modify them as needed and then concatenate all of them in one file.

Now, with this new flag, it instead asks rustdoc to provide the doctests
code with their associated information such as file path and line number.

Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
---
 rust/Makefile                   |   8 +-
 scripts/json.rs                 | 235 ++++++++++++++++++++++++++++++++
 scripts/rustdoc_test_builder.rs | 115 ++++++++++------
 scripts/rustdoc_test_gen.rs     |  11 +-
 4 files changed, 320 insertions(+), 49 deletions(-)
 create mode 100644 scripts/json.rs

diff --git a/rust/Makefile b/rust/Makefile
index ea3849eb7..237ed23f8 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -171,14 +171,14 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
 	rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
 	mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
 	OBJTREE=$(abspath $(objtree)) \
-	$(RUSTDOC) --test $(rust_flags) \
+	$(RUSTDOC) --output-format=doctest $(rust_flags) \
 		-L$(objtree)/$(obj) --extern ffi --extern kernel \
 		--extern build_error --extern macros \
 		--extern bindings --extern uapi \
-		--no-run --crate-name kernel -Zunstable-options \
+		--crate-name kernel -Zunstable-options \
 		--sysroot=/dev/null \
-		--test-builder $(objtree)/scripts/rustdoc_test_builder \
-		$< $(rustdoc_test_kernel_quiet); \
+		$< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
+	cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
 	$(objtree)/scripts/rustdoc_test_gen
 
 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000..aff24bfd9
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) -> Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`: found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object, found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char) -> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index e5894652f..ba17e9db9 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -15,58 +15,93 @@
 //! from that. For the moment, we generate ourselves a new name, `{file}_{number}` instead, in
 //! the `gen` script (done there since we need to be aware of all the tests in a given file).
 
+use std::fs::create_dir_all;
 use std::io::Read;
 
-fn main() {
-    let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+use json::JsonValue;
 
-    // Find the generated function name looking for the inner function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines (when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
+mod json;
 
-    // Qualify `Result` to avoid the collision with our own `Result` coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl core::fmt::Debug> {{"),
-        &format!("{rustdoc_function_name}() -> core::result::Result<(), impl core::fmt::Debug> {{"),
-    );
+fn generate_doctest(file: &str, line: i32, doctest_code: &str) {
+    let file = file
+        .strip_suffix(".rs")
+        .unwrap_or(file)
+        .strip_prefix("../rust/kernel/")
+        .unwrap_or(file)
+        .replace('/', "_");
+    let path = format!("rust/test/doctests/kernel/{file}-{line}.rs");
 
+    // We replace the `Result` if needed.
+    let doctest_code = doctest_code.replace(
+        "fn main() { fn _inner() -> Result<",
+        "fn main() { fn _inner() -> core::result::Result<",
+    );
     // For tests that get generated with `Result`, like above, `rustdoc` generates an `unwrap()` on
     // the return value to check there were no returned errors. Instead, we use our assert macro
     // since we want to just fail the test, not panic the kernel.
     //
     // We save the result in a variable so that the failed assertion message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value = {rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
+    let doctest_code = doctest_code.replace(
+        "} _inner().unwrap() }",
+        "} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }",
     );
+    std::fs::write(path, doctest_code.as_bytes()).unwrap();
+}
+
+fn main() {
+    let mut stdin = std::io::stdin().lock();
+    let mut body = String::new();
+    stdin.read_to_string(&mut body).unwrap();
+
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&body).unwrap() else {
+        panic!("Expected an object")
+    };
+    if let Some(JsonValue::Number(format_version)) = rustdoc.get("format_version") {
+        if *format_version != 1 {
+            panic!("unsupported rustdoc format version: {format_version}");
+        }
+    } else {
+        panic!("missing `format_version` field");
+    }
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("`doctests` field is missing or has the wrong type");
+    };
+
+    // We ignore the error since it will fail when generating doctests below if the folder doesn't
+    // exist.
+    let _ = create_dir_all("rust/test/doctests/kernel");
 
-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+    let mut nb_generated = 0;
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };
 
-    let path = format!("rust/test/doctests/kernel/{name}");
+        // We check if we need to skip this test by checking it's a rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) = doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            } else if let Some(JsonValue::String(ignore)) = attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+        if let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::String(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("doctest_code"),
+        ) {
+            generate_doctest(file, *line, doctest_code);
+            nb_generated += 1;
+        }
+    }
 
-    std::fs::write(path, body.as_bytes()).unwrap();
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index 5ebd42ae4..66e665ea0 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -48,7 +48,7 @@
 fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut Vec<PathBuf>, file: &str) -> &'a str {
     valid_paths.clear();
 
-    let potential_components: Vec<&str> = file.strip_suffix("_rs").unwrap().split('_').collect();
+    let potential_components: Vec<&str> = file.split('_').collect();
 
     find_candidates(srctree, valid_paths, Path::new(""), &potential_components);
     fn find_candidates(
@@ -88,7 +88,7 @@ fn find_candidates(
     assert!(
         valid_paths.len() > 0,
         "No path candidates found. This is likely a bug in the build system, or some files went \
-        away while compiling."
+        away while compiling.",
     );
 
     if valid_paths.len() > 1 {
@@ -126,12 +126,13 @@ fn main() {
     let mut valid_paths: Vec<PathBuf> = Vec::new();
     let mut real_path: &str = "";
     for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern (see description in
+        // The `name` follows the `{file}_{line}` pattern (see description in
         // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
         let name = path.file_name().unwrap().to_str().unwrap().to_string();
 
-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) = name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
+        // Extract the `file` and the `line`, discarding the extension.
+        let (file, line) = name.rsplit_once('-').unwrap();
+        let line = line.split('.').next().unwrap();
 
         // Generate an ID sequence ("test number") for each one in the file.
         if file == last_file {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
       [not found]   ` <CAAOQCfSEUnp8U3+6amWCd6+yPrAmPy6gsjJnQdrqmpER5md3kA@mail.gmail.com>
@ 2025-02-28 16:48     ` Miguel Ojeda
  2025-02-28 16:55       ` Guillaume Gomez
  0 siblings, 1 reply; 12+ messages in thread
From: Miguel Ojeda @ 2025-02-28 16:48 UTC (permalink / raw)
  To: Guillaume Gomez; +Cc: ojeda, rust-for-linux, linux-kernel

On Fri, Feb 28, 2025 at 5:32 PM Guillaume Gomez
<guillaume1.gomez@gmail.com> wrote:
>
> I'll definitely need some help here. I'm not sure current stable already
> has this change so until then, we'll need a beta/nightly version to run
> these tests.

Yeah, we will need to wait until the "final" version of the flag is in
a stable version (the flag would not need to be "stable" itself, nor
the compiler released, but yeah, we need to know the version).

> I opened https://github.com/rust-lang/rust/pull/137807 to resolve
> this problem.

Thanks!

That would still force us to have all the "hardcoded knowledge" about
the `rustdoc` output (apart from the JSON schema), right?

i.e. my idea was to try to see if we could avoid matching on "strings"
as much as possible, and trying to have enough metadata (properly
encoded in the JSON somehow), so that we need to avoid searching for
e.g. `main()` etc.; and instead generate everything else needed on our
side, customized for the kernel case.

> I don't think `expect` would work in any of the cases in this file. What I suggest
> is to add methods on `JsonValue` in a future patch which would allow to reduce
> code in this file (and call `expect` too).

Yeah, sorry, when I saw the `Some(...) ... else panic!` I replied too
quickly -- in this case, I don't think it matters to have a custom
error for the "wrong JSON type" case as we discussed offline since
nobody should be seeing the error to begin with, so it is fine.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling.
  2025-02-28 16:48     ` Miguel Ojeda
@ 2025-02-28 16:55       ` Guillaume Gomez
  0 siblings, 0 replies; 12+ messages in thread
From: Guillaume Gomez @ 2025-02-28 16:55 UTC (permalink / raw)
  To: Miguel Ojeda; +Cc: ojeda, rust-for-linux, linux-kernel

> That would still force us to have all the "hardcoded knowledge" about
> the `rustdoc` output (apart from the JSON schema), right?

That reduces it, but doesn't solve the problem completely. The second part
is more tricky.

> i.e. my idea was to try to see if we could avoid matching on "strings"
> as much as possible, and trying to have enough metadata (properly
> encoded in the JSON somehow), so that we need to avoid searching for
> e.g. `main()` etc.; and instead generate everything else needed on our
> side, customized for the kernel case.

I think in this case, the only issue is that we're calling `_inner().unwrap()`
right? So we could go around this issue by not generating this code
directly in the doctest output and instead add a new field:

* return_result_fn_name: Contains the name of the wrapping function
  returning the `Result` type. If this field is present, it means that the
  doctest is returning a `Result` and that we need to add in the code:
  `assert!({return_result_fn_name}.is_ok())`.

But even that is tricky because it's part of another block (`{}`) so the users
would need to add an extra `}` after the call. Anyway, I think it'll need to
be discussed more in details in another discussion on how to best
implement it for both sides.

Le ven. 28 févr. 2025 à 17:48, Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> a écrit :
>
> On Fri, Feb 28, 2025 at 5:32 PM Guillaume Gomez
> <guillaume1.gomez@gmail.com> wrote:
> >
> > I'll definitely need some help here. I'm not sure current stable already
> > has this change so until then, we'll need a beta/nightly version to run
> > these tests.
>
> Yeah, we will need to wait until the "final" version of the flag is in
> a stable version (the flag would not need to be "stable" itself, nor
> the compiler released, but yeah, we need to know the version).
>
> > I opened https://github.com/rust-lang/rust/pull/137807 to resolve
> > this problem.
>
> Thanks!
>
> That would still force us to have all the "hardcoded knowledge" about
> the `rustdoc` output (apart from the JSON schema), right?
>
> i.e. my idea was to try to see if we could avoid matching on "strings"
> as much as possible, and trying to have enough metadata (properly
> encoded in the JSON somehow), so that we need to avoid searching for
> e.g. `main()` etc.; and instead generate everything else needed on our
> side, customized for the kernel case.
>
> > I don't think `expect` would work in any of the cases in this file. What I suggest
> > is to add methods on `JsonValue` in a future patch which would allow to reduce
> > code in this file (and call `expect` too).
>
> Yeah, sorry, when I saw the `Some(...) ... else panic!` I replied too
> quickly -- in this case, I don't think it matters to have a custom
> error for the "wrong JSON type" case as we discussed offline since
> nobody should be seeing the error to begin with, so it is fine.
>
> Cheers,
> Miguel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
@ 2025-06-17 14:00 Guillaume Gomez
  2025-06-18  5:50 ` kernel test robot
  2025-10-09  9:14 ` Miguel Ojeda
  0 siblings, 2 replies; 12+ messages in thread
From: Guillaume Gomez @ 2025-06-17 14:00 UTC (permalink / raw)
  To: linux-kernel, rust-for-linux, ojeda, guillaume1.gomez

The goal of this patch is to remove the use of 2 unstable
rustdoc features (`--no-run` and `--test-builder`) and replace it with a
stable feature: `--output-format=doctest`, which was added in
https://github.com/rust-lang/rust/pull/134531.

Before this patch, the code was using very hacky methods in order to retrieve
doctests, modify them as needed and then concatenate all of them in one file.

Now, with this new flag, it instead asks rustdoc to provide the doctests
code with their associated information such as file path and line number.

Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
---
 rust/Makefile                   |   7 +-
 scripts/json.rs                 | 235 ++++++++++++++++++++++++++++++++
 scripts/rustdoc_test_builder.rs | 128 +++++++++++------
 scripts/rustdoc_test_gen.rs     |  11 +-
 4 files changed, 329 insertions(+), 52 deletions(-)
 create mode 100644 scripts/json.rs

diff --git a/rust/Makefile b/rust/Makefile
index 27dec7904c3a..b643514221b3 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -205,14 +205,15 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
 	rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
 	mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
 	OBJTREE=$(abspath $(objtree)) \
-	$(RUSTDOC) --test $(filter-out --remap-path-prefix=%,$(rust_flags)) \
+	$(RUSTDOC) --output-format=doctest $(filter-out --remap-path-prefix=%,$(rust_flags)) \
 		-L$(objtree)/$(obj) --extern ffi --extern pin_init \
 		--extern kernel --extern build_error --extern macros \
 		--extern bindings --extern uapi \
-		--no-run --crate-name kernel -Zunstable-options \
+		--crate-name kernel -Zunstable-options \
 		--sysroot=/dev/null \
 		--test-builder $(objtree)/scripts/rustdoc_test_builder \
-		$< $(rustdoc_test_kernel_quiet); \
+		$< $(rustdoc_test_kernel_quiet) > rustdoc.json; \
+	cat rustdoc.json | $(objtree)/scripts/rustdoc_test_builder; \
 	$(objtree)/scripts/rustdoc_test_gen
 
 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000000..aff24bfd9213
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) -> Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`: found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object, found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char) -> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index f7540bcf595a..f4cb9457a705 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -15,60 +15,100 @@
 //! from that. For the moment, we generate ourselves a new name, `{file}_{number}` instead, in
 //! the `gen` script (done there since we need to be aware of all the tests in a given file).
 
+use std::collections::HashMap;
 use std::io::Read;
+use std::fs::create_dir_all;
 
-fn main() {
-    let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+use json::JsonValue;
 
-    // Find the generated function name looking for the inner function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl ::core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines (when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
+mod json;
 
-    // Qualify `Result` to avoid the collision with our own `Result` coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl ::core::fmt::Debug> {{"),
-        &format!(
-            "{rustdoc_function_name}() -> ::core::result::Result<(), impl ::core::fmt::Debug> {{"
-        ),
-    );
+fn generate_doctest(file: &str, line: i32, doctest_code: &HashMap<String, JsonValue>) -> bool {
+    // FIXME: Once let chain feature is stable, please use it instead.
+    let Some(JsonValue::Object(wrapper)) = doctest_code.get("wrapper") else { return false };
+    let Some(JsonValue::String(before)) = wrapper.get("before") else { return false };
+    let Some(JsonValue::String(after)) = wrapper.get("after") else { return false };
+    let Some(JsonValue::String(code)) = doctest_code.get("code") else { return false };
+    let Some(JsonValue::String(crate_level_code)) = doctest_code.get("crate_level") else { return false };
 
-    // For tests that get generated with `Result`, like above, `rustdoc` generates an `unwrap()` on
+    // For tests that get generated with `Result`, `rustdoc` generates an `unwrap()` on
     // the return value to check there were no returned errors. Instead, we use our assert macro
     // since we want to just fail the test, not panic the kernel.
     //
     // We save the result in a variable so that the failed assertion message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value = {rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
-    );
+    let after = if let Some(JsonValue::Bool(true)) = wrapper.get("returns_result") {
+        "\n} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }"
+    } else {
+        after.as_str()
+    };
+    let code = format!("{crate_level_code}\n{before}\n{code}{after}\n");
+
+    let file = file
+        .strip_suffix(".rs")
+        .unwrap_or(file)
+        .strip_prefix("../rust/kernel/")
+        .unwrap_or(file)
+        .replace('/', "_");
+    let path = format!("rust/test/doctests/kernel/{file}-{line}.rs");
+
+    std::fs::write(path, code.as_bytes()).unwrap();
+    true
+}
+
+fn main() {
+    let mut stdin = std::io::stdin().lock();
+    let mut body = String::new();
+    stdin.read_to_string(&mut body).unwrap();
+
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&body).unwrap() else {
+        panic!("Expected an object")
+    };
+    if let Some(JsonValue::Number(format_version)) = rustdoc.get("format_version") {
+        if *format_version != 2 {
+            panic!("unsupported rustdoc format version: {format_version}");
+        }
+    } else {
+        panic!("missing `format_version` field");
+    }
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("`doctests` field is missing or has the wrong type");
+    };
 
-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+    // We ignore the error since it will fail when generating doctests below if the folder doesn't
+    // exist.
+    let _ = create_dir_all("rust/test/doctests/kernel");
 
-    let path = format!("rust/test/doctests/kernel/{name}");
+    let mut nb_generated = 0;
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };
+        // We check if we need to skip this test by checking it's a rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) = doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            } else if let Some(JsonValue::String(ignore)) = attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+        if let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::Object(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("doctest_code"),
+        ) {
+            if generate_doctest(file, *line, doctest_code) {
+                nb_generated += 1;
+            }
+        }
+    }
 
-    std::fs::write(path, body.as_bytes()).unwrap();
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index 1ca253594d38..b23fc7c91f54 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -48,7 +48,7 @@
 fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut Vec<PathBuf>, file: &str) -> &'a str {
     valid_paths.clear();
 
-    let potential_components: Vec<&str> = file.strip_suffix("_rs").unwrap().split('_').collect();
+    let potential_components: Vec<&str> = file.split('_').collect();
 
     find_candidates(srctree, valid_paths, Path::new(""), &potential_components);
     fn find_candidates(
@@ -88,7 +88,7 @@ fn find_candidates(
     assert!(
         valid_paths.len() > 0,
         "No path candidates found for `{file}`. This is likely a bug in the build system, or some \
-        files went away while compiling."
+        files went away while compiling.",
     );
 
     if valid_paths.len() > 1 {
@@ -126,12 +126,13 @@ fn main() {
     let mut valid_paths: Vec<PathBuf> = Vec::new();
     let mut real_path: &str = "";
     for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern (see description in
+        // The `name` follows the `{file}_{line}` pattern (see description in
         // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
         let name = path.file_name().unwrap().to_str().unwrap().to_string();
 
-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) = name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
+        // Extract the `file` and the `line`, discarding the extension.
+        let (file, line) = name.rsplit_once('-').unwrap();
+        let line = line.split('.').next().unwrap();
 
         // Generate an ID sequence ("test number") for each one in the file.
         if file == last_file {
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
  2025-06-17 14:00 [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling Guillaume Gomez
@ 2025-06-18  5:50 ` kernel test robot
  2025-06-18  6:19   ` Miguel Ojeda
  2025-10-09  9:14 ` Miguel Ojeda
  1 sibling, 1 reply; 12+ messages in thread
From: kernel test robot @ 2025-06-18  5:50 UTC (permalink / raw)
  To: Guillaume Gomez, linux-kernel, rust-for-linux, ojeda; +Cc: oe-kbuild-all

Hi Guillaume,

kernel test robot noticed the following build errors:

[auto build test ERROR on rust/rust-next]
[also build test ERROR on shuah-kselftest/kunit shuah-kselftest/kunit-fixes linus/master v6.16-rc2 next-20250617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Guillaume-Gomez/Use-new-output-format-doctest-rustdoc-command-line-flag-to-improve-doctest-handling/20250617-220502
base:   https://github.com/Rust-for-Linux/linux rust-next
patch link:    https://lore.kernel.org/r/20250617140032.1133337-2-guillaume1.gomez%40gmail.com
patch subject: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
config: x86_64-rhel-9.4-rust (https://download.01.org/0day-ci/archive/20250618/202506181334.f0STx6ta-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
rustc: rustc 1.78.0 (9b00956e5 2024-04-29)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250618/202506181334.f0STx6ta-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506181334.f0STx6ta-lkp@intel.com/

All errors (new ones prefixed by >>):

   PATH=/opt/cross/clang-18/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   INFO PATH=/opt/cross/rustc-1.78.0-bindgen-0.65.1/cargo/bin:/opt/cross/clang-18/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   /usr/bin/timeout -k 100 12h /usr/bin/make KCFLAGS= -Wno-error=return-type -Wreturn-type -funsigned-char -Wundef W=1 --keep-going LLVM=1 -j32 -C source O=/kbuild/obj/consumer/x86_64-rhel-9.4-rust ARCH=x86_64 SHELL=/bin/bash rustfmtcheck
   make: Entering directory '/kbuild/src/consumer'
   make[1]: Entering directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
>> Diff in scripts/rustdoc_test_builder.rs at line 16:
    //! the `gen` script (done there since we need to be aware of all the tests in a given file).
    
    use std::collections::HashMap;
   -use std::io::Read;
    use std::fs::create_dir_all;
   +use std::io::Read;
    
    use json::JsonValue;
    
   Diff in scripts/rustdoc_test_builder.rs at line 25:
    
    fn generate_doctest(file: &str, line: i32, doctest_code: &HashMap<String, JsonValue>) -> bool {
        // FIXME: Once let chain feature is stable, please use it instead.
   -    let Some(JsonValue::Object(wrapper)) = doctest_code.get("wrapper") else { return false };
   -    let Some(JsonValue::String(before)) = wrapper.get("before") else { return false };
   -    let Some(JsonValue::String(after)) = wrapper.get("after") else { return false };
   -    let Some(JsonValue::String(code)) = doctest_code.get("code") else { return false };
   -    let Some(JsonValue::String(crate_level_code)) = doctest_code.get("crate_level") else { return false };
   +    let Some(JsonValue::Object(wrapper)) = doctest_code.get("wrapper") else {
   +        return false;
   +    };
   +    let Some(JsonValue::String(before)) = wrapper.get("before") else {
   +        return false;
   +    };
   +    let Some(JsonValue::String(after)) = wrapper.get("after") else {
   +        return false;
   +    };
   +    let Some(JsonValue::String(code)) = doctest_code.get("code") else {
   +        return false;
   +    };
   +    let Some(JsonValue::String(crate_level_code)) = doctest_code.get("crate_level") else {
   +        return false;
   +    };
    
        // For tests that get generated with `Result`, `rustdoc` generates an `unwrap()` on
        // the return value to check there were no returned errors. Instead, we use our assert macro
   make[1]: Leaving directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
   make[2]: *** [Makefile:1825: rustfmt] Error 123
   make[2]: Target 'rustfmtcheck' not remade because of errors.
   make[1]: *** [Makefile:248: __sub-make] Error 2
   make[1]: Target 'rustfmtcheck' not remade because of errors.
   make: *** [Makefile:248: __sub-make] Error 2
   make: Target 'rustfmtcheck' not remade because of errors.
   make: Leaving directory '/kbuild/src/consumer'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
  2025-06-18  5:50 ` kernel test robot
@ 2025-06-18  6:19   ` Miguel Ojeda
  0 siblings, 0 replies; 12+ messages in thread
From: Miguel Ojeda @ 2025-06-18  6:19 UTC (permalink / raw)
  To: kernel test robot
  Cc: Guillaume Gomez, linux-kernel, rust-for-linux, ojeda,
	oe-kbuild-all

On Wed, Jun 18, 2025 at 7:51 AM kernel test robot <lkp@intel.com> wrote:
>
> kernel test robot noticed the following build errors:

It is just `rustfmt`, so no need to send a new version just for this.

By the way, from a quick look, it indeed looks like with this now we
can finally avoid relying on implementation details on the kernel
side, so this is great, thanks!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
  2025-06-17 14:00 [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling Guillaume Gomez
  2025-06-18  5:50 ` kernel test robot
@ 2025-10-09  9:14 ` Miguel Ojeda
  2025-10-18  7:18   ` David Gow
  1 sibling, 1 reply; 12+ messages in thread
From: Miguel Ojeda @ 2025-10-09  9:14 UTC (permalink / raw)
  To: Guillaume Gomez, David Gow, Brendan Higgins, Rae Moar
  Cc: linux-kernel, ojeda, rust-for-linux, kunit-dev, linux-kselftest

On Tue, 17 Jun 2025 16:00:33 +0200 Guillaume Gomez <guillaume1.gomez@gmail.com> wrote:
>
> The goal of this patch is to remove the use of 2 unstable
> rustdoc features (`--no-run` and `--test-builder`) and replace it with a
> stable feature: `--output-format=doctest`, which was added in
> https://github.com/rust-lang/rust/pull/134531.
>
> Before this patch, the code was using very hacky methods in order to retrieve
> doctests, modify them as needed and then concatenate all of them in one file.
>
> Now, with this new flag, it instead asks rustdoc to provide the doctests
> code with their associated information such as file path and line number.
>
> Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
> ---

(Procedural bit: normally we provide a changelog between versions after
this `---` line so that reviewers now what changed so far.)

I finally took a look at this again, so I rebased it and got:

    thread 'main' panicked at scripts/rustdoc_test_gen.rs:92:15:
    No path candidates found for `rust_kernel_alloc_allocator.rs`.This is likely a bug in the build system, or some files went away while compiling.

which brings me to the bigger point: the main reason to have the new
output format is to avoid all these hacks, including the "find the real
path back to the original file" hack here. More generally, to avoid the
2 scripts approach.

So now we can finally get rid of all that and simplify. That is, we can
just merge it all in a single script that reads the JSON and builds the
result directly, since now we have everything we need (originally I
needed the 2 scripts approach since `rustdoc` executed the test builder
once per test so I had to somehow collect the results).

i.e. no more hundreds of generated files/processes, just a simple pipe.

Anyway, just to check we had everything we needed, I did a quick try --
please see the draft patch below.

I gave it a go -- please see the draft patch below. The diff w.r.t. your
patch would be something like +217 -341, i.e. we get rid of quite a lot
of lines. I added as well some more context in the commit message, and
put the right docs in the unified script. This also improves the sorting
of the tests (it now follows the line number better).

We still have to preserve the support for the old compilers, so what I
think I will do is just have the new script separately, keeping the old
ones as-is until we can remove them when we upgrade the minimum for e.g.
the next Debian Stable.

Cc'ing David and KUnit, since this is closer to getting ready -- please
let me know if this raises alarms for anyone.

Thanks!

Cheers,
Miguel

From 4aa4581e9004cb95534805f73fdae56c454b3d1d Mon Sep 17 00:00:00 2001
From: Guillaume Gomez <guillaume1.gomez@gmail.com>
Date: Tue, 17 Jun 2025 16:00:33 +0200
Subject: [PATCH] [TODO] rust: use new `rustdoc`'s `--output-format=doctest`

The goal of this patch is to remove the use of 2 unstable `rustdoc`
features (`--no-run` and `--test-builder`) and replace it with a future
stable feature: `--output-format=doctest` [1].

Before this patch, the KUnit Rust doctests generation needed to employ
several hacks in order to retrieve doctests, modify them as needed and
then concatenate all of them in one file. In particular, it required
using two scripts: one that got run as a test builder by `rustdoc` in
order to extract the data and another that collected the results of all
those processes.

We requested upstream `rustdoc` a feature to get `rustdoc` to generate
the information directly -- one that would also be designed to eventually
be made stable. This resulted in the `--output-format=doctest` flag,
which makes all the information neatly available as a JSON output,
including filenames, line numbers, doctest test bodies and so on.

Thus take advantage of the new flag, which in turn allows to just use
a single script that gets piped that JSON output from the compiler and
uses it to directly build the generated files to be run by KUnit.

Link: https://github.com/rust-lang/rust/issues/134529 [1]
Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
Co-developed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
---
 rust/Makefile                   |  12 +-
 scripts/.gitignore              |   1 -
 scripts/Makefile                |   2 -
 scripts/json.rs                 | 235 +++++++++++++++++++++++++
 scripts/remove-stale-files      |   2 +
 scripts/rustdoc_test_builder.rs | 300 ++++++++++++++++++++++++++------
 scripts/rustdoc_test_gen.rs     | 265 ----------------------------
 7 files changed, 485 insertions(+), 332 deletions(-)
 create mode 100644 scripts/json.rs
 delete mode 100644 scripts/rustdoc_test_gen.rs

diff --git a/rust/Makefile b/rust/Makefile
index 23c7ae905bd2..93bc456e3576 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -57,7 +57,6 @@ RUST_LIB_SRC ?= $(rustc_sysroot)/lib/rustlib/src/rust/library
 ifneq ($(quiet),)
 rust_test_quiet=-q
 rustdoc_test_quiet=--test-args -q
-rustdoc_test_kernel_quiet=>/dev/null
 endif

 core-cfgs = \
@@ -224,21 +223,20 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
 	rm -rf $(objtree)/$(obj)/test/doctests/kernel; \
 	mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
 	OBJTREE=$(abspath $(objtree)) \
-	$(RUSTDOC) --test $(filter-out --remap-path-prefix=%,$(rust_flags)) \
+	$(RUSTDOC) $(filter-out --remap-path-prefix=%,$(rust_flags)) \
 		-L$(objtree)/$(obj) --extern ffi --extern pin_init \
 		--extern kernel --extern build_error --extern macros \
 		--extern bindings --extern uapi \
-		--no-run --crate-name kernel -Zunstable-options \
+		--crate-name kernel -Zunstable-options \
 		--sysroot=/dev/null \
+		--output-format=doctest \
 		$(rustdoc_modifiers_workaround) \
-		--test-builder $(objtree)/scripts/rustdoc_test_builder \
-		$< $(rustdoc_test_kernel_quiet); \
-	$(objtree)/scripts/rustdoc_test_gen
+		$< | $(objtree)/scripts/rustdoc_test_builder

 %/doctests_kernel_generated.rs %/doctests_kernel_generated_kunit.c: \
     $(src)/kernel/lib.rs $(obj)/kernel.o \
     $(objtree)/scripts/rustdoc_test_builder \
-    $(objtree)/scripts/rustdoc_test_gen FORCE
+    FORCE
 	+$(call if_changed,rustdoc_test_kernel)

 # We cannot use `-Zpanic-abort-tests` because some tests are dynamic,
diff --git a/scripts/.gitignore b/scripts/.gitignore
index c2ef68848da5..6e6ab7b8f496 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -7,7 +7,6 @@
 /module.lds
 /recordmcount
 /rustdoc_test_builder
-/rustdoc_test_gen
 /sign-file
 /sorttable
 /target.json
diff --git a/scripts/Makefile b/scripts/Makefile
index 46f860529df5..71c7d9dcd95b 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -10,7 +10,6 @@ hostprogs-always-$(CONFIG_ASN1)				+= asn1_compiler
 hostprogs-always-$(CONFIG_MODULE_SIG_FORMAT)		+= sign-file
 hostprogs-always-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE)	+= insert-sys-cert
 hostprogs-always-$(CONFIG_RUST_KERNEL_DOCTESTS)		+= rustdoc_test_builder
-hostprogs-always-$(CONFIG_RUST_KERNEL_DOCTESTS)		+= rustdoc_test_gen

 ifneq ($(or $(CONFIG_X86_64),$(CONFIG_X86_32)),)
 always-$(CONFIG_RUST)					+= target.json
@@ -23,7 +22,6 @@ endif
 hostprogs += generate_rust_target
 generate_rust_target-rust := y
 rustdoc_test_builder-rust := y
-rustdoc_test_gen-rust := y

 HOSTCFLAGS_sorttable.o = -I$(srctree)/tools/include
 HOSTLDLIBS_sorttable = -lpthread
diff --git a/scripts/json.rs b/scripts/json.rs
new file mode 100644
index 000000000000..aff24bfd9213
--- /dev/null
+++ b/scripts/json.rs
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! JSON parser used to parse rustdoc output when retrieving doctests.
+
+use std::collections::HashMap;
+use std::iter::Peekable;
+use std::str::FromStr;
+
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) enum JsonValue {
+    Object(HashMap<String, JsonValue>),
+    String(String),
+    Number(i32),
+    Bool(bool),
+    Array(Vec<JsonValue>),
+    Null,
+}
+
+fn parse_ident<I: Iterator<Item = char>>(
+    iter: &mut I,
+    output: JsonValue,
+    ident: &str,
+) -> Result<JsonValue, String> {
+    let mut ident_iter = ident.chars().skip(1);
+
+    loop {
+        let i = ident_iter.next();
+        if i.is_none() {
+            return Ok(output);
+        }
+        let c = iter.next();
+        if i != c {
+            if let Some(c) = c {
+                return Err(format!("Unexpected character `{c}` when parsing `{ident}`"));
+            }
+            return Err(format!("Missing character when parsing `{ident}`"));
+        }
+    }
+}
+
+fn parse_string<I: Iterator<Item = char>>(iter: &mut I) -> Result<JsonValue, String> {
+    let mut out = String::new();
+
+    while let Some(c) = iter.next() {
+        match c {
+            '\\' => {
+                let Some(c) = iter.next() else { break };
+                match c {
+                    '"' | '\\' | '/' => out.push(c),
+                    'b' => out.push(char::from(0x8u8)),
+                    'f' => out.push(char::from(0xCu8)),
+                    't' => out.push('\t'),
+                    'r' => out.push('\r'),
+                    'n' => out.push('\n'),
+                    _ => {
+                        // This code doesn't handle codepoints so we put the string content as is.
+                        out.push('\\');
+                        out.push(c);
+                    }
+                }
+            }
+            '"' => {
+                return Ok(JsonValue::String(out));
+            }
+            _ => out.push(c),
+        }
+    }
+    Err(format!("Unclosed JSON string `{out}`"))
+}
+
+fn parse_number<I: Iterator<Item = char>>(
+    iter: &mut Peekable<I>,
+    digit: char,
+) -> Result<JsonValue, String> {
+    let mut nb = String::new();
+
+    nb.push(digit);
+    loop {
+        // We peek next character to prevent taking it from the iterator in case it's a comma.
+        if matches!(iter.peek(), Some(',' | '}' | ']')) {
+            break;
+        }
+        let Some(c) = iter.next() else { break };
+        if c.is_whitespace() {
+            break;
+        } else if !c.is_ascii_digit() {
+            return Err(format!("Error when parsing number `{nb}`: found `{c}`"));
+        }
+        nb.push(c);
+    }
+    i32::from_str(&nb)
+        .map(|nb| JsonValue::Number(nb))
+        .map_err(|error| format!("Invalid number: `{error}`"))
+}
+
+fn parse_array<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = Vec::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed array".to_string());
+        };
+        if c.is_whitespace() {
+            continue;
+        } else if c == ']' {
+            break;
+        }
+        values.push(parse(iter, c)?);
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == ']' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Array(values))
+}
+
+fn parse_object<I: Iterator<Item = char>>(iter: &mut Peekable<I>) -> Result<JsonValue, String> {
+    let mut values = HashMap::new();
+
+    'main: loop {
+        let Some(c) = iter.next() else {
+            return Err("Unclosed object".to_string());
+        };
+        let key;
+        if c.is_whitespace() {
+            continue;
+        } else if c == '"' {
+            let JsonValue::String(k) = parse_string(iter)? else {
+                unreachable!()
+            };
+            key = k;
+        } else if c == '}' {
+            break;
+        } else {
+            return Err(format!("Expected `\"` when parsing Object, found `{c}`"));
+        }
+
+        // We then get the `:` separator.
+        loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else if c == ':' {
+                break;
+            } else {
+                return Err(format!(
+                    "Expected `:` after key, found `{c}` when parsing object"
+                ));
+            }
+        }
+        // Then the value.
+        let value = loop {
+            let Some(c) = iter.next() else {
+                return Err(format!("Missing value after key `{key}`"));
+            };
+            if c.is_whitespace() {
+                continue;
+            } else {
+                break parse(iter, c)?;
+            }
+        };
+
+        if values.contains_key(&key) {
+            return Err(format!("Duplicated key `{key}`"));
+        }
+        values.insert(key, value);
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else if c == ',' {
+                break;
+            } else if c == '}' {
+                break 'main;
+            } else {
+                return Err(format!("Unexpected `{c}` when parsing array"));
+            }
+        }
+    }
+    Ok(JsonValue::Object(values))
+}
+
+fn parse<I: Iterator<Item = char>>(iter: &mut Peekable<I>, c: char) -> Result<JsonValue, String> {
+    match c {
+        '{' => parse_object(iter),
+        '"' => parse_string(iter),
+        '[' => parse_array(iter),
+        't' => parse_ident(iter, JsonValue::Bool(true), "true"),
+        'f' => parse_ident(iter, JsonValue::Bool(false), "false"),
+        'n' => parse_ident(iter, JsonValue::Null, "null"),
+        c => {
+            if c.is_ascii_digit() || c == '-' {
+                parse_number(iter, c)
+            } else {
+                Err(format!("Unexpected `{c}` character"))
+            }
+        }
+    }
+}
+
+impl JsonValue {
+    pub(crate) fn parse(input: &str) -> Result<Self, String> {
+        let mut iter = input.chars().peekable();
+        let mut value = None;
+
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            }
+            value = Some(parse(&mut iter, c)?);
+            break;
+        }
+        while let Some(c) = iter.next() {
+            if c.is_whitespace() {
+                continue;
+            } else {
+                return Err(format!("Unexpected character `{c}` after content"));
+            }
+        }
+        if let Some(value) = value {
+            Ok(value)
+        } else {
+            Err("Empty content".to_string())
+        }
+    }
+}
diff --git a/scripts/remove-stale-files b/scripts/remove-stale-files
index 6e39fa8540df..190dee6b50e8 100755
--- a/scripts/remove-stale-files
+++ b/scripts/remove-stale-files
@@ -26,3 +26,5 @@ rm -f scripts/selinux/genheaders/genheaders
 rm -f *.spec

 rm -f lib/test_fortify.log
+
+rm -f scripts/rustdoc_test_gen
diff --git a/scripts/rustdoc_test_builder.rs b/scripts/rustdoc_test_builder.rs
index f7540bcf595a..dd65bb670d25 100644
--- a/scripts/rustdoc_test_builder.rs
+++ b/scripts/rustdoc_test_builder.rs
@@ -1,74 +1,260 @@
 // SPDX-License-Identifier: GPL-2.0

-//! Test builder for `rustdoc`-generated tests.
+//! Generates KUnit tests from `rustdoc`-generated doctests.
 //!
-//! This script is a hack to extract the test from `rustdoc`'s output. Ideally, `rustdoc` would
-//! have an option to generate this information instead, e.g. as JSON output.
+//! KUnit passes a context (`struct kunit *`) to each test, which should be forwarded to the other
+//! KUnit functions and macros.
 //!
-//! The `rustdoc`-generated test names look like `{file}_{line}_{number}`, e.g.
-//! `...path_rust_kernel_sync_arc_rs_42_0`. `number` is the "test number", needed in cases like
-//! a macro that expands into items with doctests is invoked several times within the same line.
+//! However, we want to keep this as an implementation detail because:
 //!
-//! However, since these names are used for bisection in CI, the line number makes it not stable
-//! at all. In the future, we would like `rustdoc` to give us the Rust item path associated with
-//! the test, plus a "test number" (for cases with several examples per item) and generate a name
-//! from that. For the moment, we generate ourselves a new name, `{file}_{number}` instead, in
-//! the `gen` script (done there since we need to be aware of all the tests in a given file).
+//!   - Test code should not care about the implementation.
+//!
+//!   - Documentation looks worse if it needs to carry extra details unrelated to the piece
+//!     being described.
+//!
+//!   - Test code should be able to define functions and call them, without having to carry
+//!     the context.
+//!
+//!   - Later on, we may want to be able to test non-kernel code (e.g. `core` or third-party
+//!     crates) which likely use the standard library `assert*!` macros.
+//!
+//! For this reason, instead of the passed context, `kunit_get_current_test()` is used instead
+//! (i.e. `current->kunit_test`).
+//!
+//! Note that this means other threads/tasks potentially spawned by a given test, if failing, will
+//! report the failure in the kernel log but will not fail the actual test. Saving the pointer in
+//! e.g. a `static` per test does not fully solve the issue either, because currently KUnit does
+//! not support assertions (only expectations) from other tasks. Thus leave that feature for
+//! the future, which simplifies the code here too. We could also simply not allow `assert`s in
+//! other tasks, but that seems overly constraining, and we do want to support them, eventually.

-use std::io::Read;
+use std::{
+    fs::File,
+    io::{BufWriter, Read, Write},
+};
+
+use json::JsonValue;
+
+mod json;

 fn main() {
     let mut stdin = std::io::stdin().lock();
-    let mut body = String::new();
-    stdin.read_to_string(&mut body).unwrap();
+    let mut rustdoc_json = String::new();
+    stdin.read_to_string(&mut rustdoc_json).unwrap();

-    // Find the generated function name looking for the inner function inside `main()`.
-    //
-    // The line we are looking for looks like one of the following:
-    //
-    // ```
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_28_0() {
-    // fn main() { #[allow(non_snake_case)] fn _doctest_main_rust_kernel_file_rs_37_0() -> Result<(), impl ::core::fmt::Debug> {
-    // ```
-    //
-    // It should be unlikely that doctest code matches such lines (when code is formatted properly).
-    let rustdoc_function_name = body
-        .lines()
-        .find_map(|line| {
-            Some(
-                line.split_once("fn main() {")?
-                    .1
-                    .split_once("fn ")?
-                    .1
-                    .split_once("()")?
-                    .0,
-            )
-            .filter(|x| x.chars().all(|c| c.is_alphanumeric() || c == '_'))
-        })
-        .expect("No test function found in `rustdoc`'s output.");
-
-    // Qualify `Result` to avoid the collision with our own `Result` coming from the prelude.
-    let body = body.replace(
-        &format!("{rustdoc_function_name}() -> Result<(), impl ::core::fmt::Debug> {{"),
-        &format!(
-            "{rustdoc_function_name}() -> ::core::result::Result<(), impl ::core::fmt::Debug> {{"
-        ),
+    let JsonValue::Object(rustdoc) = JsonValue::parse(&rustdoc_json).unwrap() else {
+        panic!("Expected an object")
+    };
+
+    let Some(JsonValue::Number(format_version)) = rustdoc.get("format_version") else {
+        panic!("missing `format_version` field");
+    };
+    assert!(
+        *format_version == 2,
+        "unsupported rustdoc format version: {format_version}"
     );

-    // For tests that get generated with `Result`, like above, `rustdoc` generates an `unwrap()` on
-    // the return value to check there were no returned errors. Instead, we use our assert macro
-    // since we want to just fail the test, not panic the kernel.
+    let Some(JsonValue::Array(doctests)) = rustdoc.get("doctests") else {
+        panic!("`doctests` field is missing or has the wrong type");
+    };
+
+    let mut nb_generated = 0;
+    let mut number = 0;
+    let mut last_file = "";
+    let mut rust_tests = String::new();
+    let mut c_test_declarations = String::new();
+    let mut c_test_cases = String::new();
+    for doctest in doctests {
+        let JsonValue::Object(doctest) = doctest else {
+            unreachable!()
+        };
+
+        // We check if we need to skip this test by checking it's a rust code and it's not ignored.
+        if let Some(JsonValue::Object(attributes)) = doctest.get("doctest_attributes") {
+            if attributes.get("rust") != Some(&JsonValue::Bool(true)) {
+                continue;
+            }
+            if let Some(JsonValue::String(ignore)) = attributes.get("ignore") {
+                if ignore != "None" {
+                    continue;
+                }
+            }
+        }
+
+        let (
+            Some(JsonValue::String(file)),
+            Some(JsonValue::Number(line)),
+            Some(JsonValue::String(name)),
+            Some(JsonValue::Object(doctest_code)),
+        ) = (
+            doctest.get("file"),
+            doctest.get("line"),
+            doctest.get("name"),
+            doctest.get("doctest_code"),
+        )
+        else {
+            continue;
+        };
+
+        let (
+            Some(JsonValue::String(code)),
+            Some(JsonValue::String(crate_level_code)),
+            Some(JsonValue::Object(wrapper)),
+        ) = (
+            doctest_code.get("code"),
+            doctest_code.get("crate_level"),
+            doctest_code.get("wrapper"),
+        )
+        else {
+            continue;
+        };
+
+        let (Some(JsonValue::String(before)), Some(JsonValue::String(after))) =
+            (wrapper.get("before"), wrapper.get("after"))
+        else {
+            continue;
+        };
+
+        // For tests that get generated with `Result`, `rustdoc` generates an `unwrap()` on
+        // the return value to check there were no returned errors. Instead, we use our assert macro
+        // since we want to just fail the test, not panic the kernel.
+        //
+        // We save the result in a variable so that the failed assertion message looks nicer.
+        let after = if let Some(JsonValue::Bool(true)) = wrapper.get("returns_result") {
+            "\n} let test_return_value = _inner(); assert!(test_return_value.is_ok()); }"
+        } else {
+            after.as_str()
+        };
+
+        let body = format!("{crate_level_code}\n{before}\n{code}{after}\n");
+        nb_generated += 1;
+
+        // Generate an ID sequence ("test number") for each one in the file.
+        if file == last_file {
+            number += 1;
+        } else {
+            number = 0;
+            last_file = file;
+        }
+
+        // Generate a KUnit name (i.e. test name and C symbol) for this test.
+        //
+        // We avoid the line number, like `rustdoc` does, to make things slightly more stable for
+        // bisection purposes. However, to aid developers in mapping back what test failed, we will
+        // print a diagnostics line in the KTAP report.
+        let kunit_name = format!(
+            "rust_doctest_{}_{number}",
+            file.replace('/', "_").replace('.', "_")
+        );
+
+        // Calculate how many lines before `main` function (including the `main` function line).
+        let body_offset = body
+            .lines()
+            .take_while(|line| !line.contains("fn main() {"))
+            .count()
+            + 1;
+
+        use std::fmt::Write;
+        write!(
+            rust_tests,
+            r#"/// Generated `{name}` KUnit test case from a Rust documentation test.
+#[no_mangle]
+pub extern "C" fn {kunit_name}(__kunit_test: *mut ::kernel::bindings::kunit) {{
+    /// Overrides the usual [`assert!`] macro with one that calls KUnit instead.
+    #[allow(unused)]
+    macro_rules! assert {{
+        ($cond:expr $(,)?) => {{{{
+            ::kernel::kunit_assert!(
+                "{kunit_name}", "{file}", __DOCTEST_ANCHOR - {line}, $cond
+            );
+        }}}}
+    }}
+
+    /// Overrides the usual [`assert_eq!`] macro with one that calls KUnit instead.
+    #[allow(unused)]
+    macro_rules! assert_eq {{
+        ($left:expr, $right:expr $(,)?) => {{{{
+            ::kernel::kunit_assert_eq!(
+                "{kunit_name}", "{file}", __DOCTEST_ANCHOR - {line}, $left, $right
+            );
+        }}}}
+    }}
+
+    // Many tests need the prelude, so provide it by default.
+    #[allow(unused)]
+    use ::kernel::prelude::*;
+
+    // Unconditionally print the location of the original doctest (i.e. rather than the location in
+    // the generated file) so that developers can easily map the test back to the source code.
     //
-    // We save the result in a variable so that the failed assertion message looks nicer.
-    let body = body.replace(
-        &format!("}} {rustdoc_function_name}().unwrap() }}"),
-        &format!("}} let test_return_value = {rustdoc_function_name}(); assert!(test_return_value.is_ok()); }}"),
-    );
+    // This information is also printed when assertions fail, but this helps in the successful cases
+    // when the user is running KUnit manually, or when passing `--raw_output` to `kunit.py`.
+    //
+    // This follows the syntax for declaring test metadata in the proposed KTAP v2 spec, which may
+    // be used for the proposed KUnit test attributes API. Thus hopefully this will make migration
+    // easier later on.
+    ::kernel::kunit::info(fmt!("    # {kunit_name}.location: {file}:{line}\n"));
+
+    /// The anchor where the test code body starts.
+    #[allow(unused)]
+    static __DOCTEST_ANCHOR: i32 = ::core::line!() as i32 + {body_offset} + 1;
+    {{
+        {body}
+        main();
+    }}
+}}
+
+"#
+        )
+        .unwrap();
+
+        write!(c_test_declarations, "void {kunit_name}(struct kunit *);\n").unwrap();
+        write!(c_test_cases, "    KUNIT_CASE({kunit_name}),\n").unwrap();
+    }
+
+    if nb_generated == 0 {
+        panic!("No test function found in `rustdoc`'s output.");
+    }
+
+    let rust_tests = rust_tests.trim();
+    let c_test_declarations = c_test_declarations.trim();
+    let c_test_cases = c_test_cases.trim();
+
+    write!(
+        BufWriter::new(File::create("rust/doctests_kernel_generated.rs").unwrap()),
+        r#"//! `kernel` crate documentation tests.
+
+const __LOG_PREFIX: &[u8] = b"rust_doctests_kernel\0";
+
+{rust_tests}
+"#
+    )
+    .unwrap();
+
+    write!(
+        BufWriter::new(File::create("rust/doctests_kernel_generated_kunit.c").unwrap()),
+        r#"/*
+ * `kernel` crate documentation tests.
+ */
+
+#include <kunit/test.h>
+
+{c_test_declarations}
+
+static struct kunit_case test_cases[] = {{
+    {c_test_cases}
+    {{ }}
+}};

-    // Figure out a smaller test name based on the generated function name.
-    let name = rustdoc_function_name.split_once("_rust_kernel_").unwrap().1;
+static struct kunit_suite test_suite = {{
+    .name = "rust_doctests_kernel",
+    .test_cases = test_cases,
+}};

-    let path = format!("rust/test/doctests/kernel/{name}");
+kunit_test_suite(test_suite);

-    std::fs::write(path, body.as_bytes()).unwrap();
+MODULE_LICENSE("GPL");
+"#
+    )
+    .unwrap();
 }
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
deleted file mode 100644
index c8f9dc2ab976..000000000000
--- a/scripts/rustdoc_test_gen.rs
+++ /dev/null
@@ -1,265 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-
-//! Generates KUnit tests from saved `rustdoc`-generated tests.
-//!
-//! KUnit passes a context (`struct kunit *`) to each test, which should be forwarded to the other
-//! KUnit functions and macros.
-//!
-//! However, we want to keep this as an implementation detail because:
-//!
-//!   - Test code should not care about the implementation.
-//!
-//!   - Documentation looks worse if it needs to carry extra details unrelated to the piece
-//!     being described.
-//!
-//!   - Test code should be able to define functions and call them, without having to carry
-//!     the context.
-//!
-//!   - Later on, we may want to be able to test non-kernel code (e.g. `core` or third-party
-//!     crates) which likely use the standard library `assert*!` macros.
-//!
-//! For this reason, instead of the passed context, `kunit_get_current_test()` is used instead
-//! (i.e. `current->kunit_test`).
-//!
-//! Note that this means other threads/tasks potentially spawned by a given test, if failing, will
-//! report the failure in the kernel log but will not fail the actual test. Saving the pointer in
-//! e.g. a `static` per test does not fully solve the issue either, because currently KUnit does
-//! not support assertions (only expectations) from other tasks. Thus leave that feature for
-//! the future, which simplifies the code here too. We could also simply not allow `assert`s in
-//! other tasks, but that seems overly constraining, and we do want to support them, eventually.
-
-use std::{
-    fs,
-    fs::File,
-    io::{BufWriter, Read, Write},
-    path::{Path, PathBuf},
-};
-
-/// Find the real path to the original file based on the `file` portion of the test name.
-///
-/// `rustdoc` generated `file`s look like `sync_locked_by_rs`. Underscores (except the last one)
-/// may represent an actual underscore in a directory/file, or a path separator. Thus the actual
-/// file might be `sync_locked_by.rs`, `sync/locked_by.rs`, `sync_locked/by.rs` or
-/// `sync/locked/by.rs`. This function walks the file system to determine which is the real one.
-///
-/// This does require that ambiguities do not exist, but that seems fair, especially since this is
-/// all supposed to be temporary until `rustdoc` gives us proper metadata to build this. If such
-/// ambiguities are detected, they are diagnosed and the script panics.
-fn find_real_path<'a>(srctree: &Path, valid_paths: &'a mut Vec<PathBuf>, file: &str) -> &'a str {
-    valid_paths.clear();
-
-    let potential_components: Vec<&str> = file.strip_suffix("_rs").unwrap().split('_').collect();
-
-    find_candidates(srctree, valid_paths, Path::new(""), &potential_components);
-    fn find_candidates(
-        srctree: &Path,
-        valid_paths: &mut Vec<PathBuf>,
-        prefix: &Path,
-        potential_components: &[&str],
-    ) {
-        // The base case: check whether all the potential components left, joined by underscores,
-        // is a file.
-        let joined_potential_components = potential_components.join("_") + ".rs";
-        if srctree
-            .join("rust/kernel")
-            .join(prefix)
-            .join(&joined_potential_components)
-            .is_file()
-        {
-            // Avoid `srctree` here in order to keep paths relative to it in the KTAP output.
-            valid_paths.push(
-                Path::new("rust/kernel")
-                    .join(prefix)
-                    .join(joined_potential_components),
-            );
-        }
-
-        // In addition, check whether each component prefix, joined by underscores, is a directory.
-        // If not, there is no need to check for combinations with that prefix.
-        for i in 1..potential_components.len() {
-            let (components_prefix, components_rest) = potential_components.split_at(i);
-            let prefix = prefix.join(components_prefix.join("_"));
-            if srctree.join("rust/kernel").join(&prefix).is_dir() {
-                find_candidates(srctree, valid_paths, &prefix, components_rest);
-            }
-        }
-    }
-
-    match valid_paths.as_slice() {
-        [] => panic!(
-            "No path candidates found for `{file}`. This is likely a bug in the build system, or \
-            some files went away while compiling."
-        ),
-        [valid_path] => valid_path.to_str().unwrap(),
-        valid_paths => {
-            use std::fmt::Write;
-
-            let mut candidates = String::new();
-            for path in valid_paths {
-                writeln!(&mut candidates, "    {path:?}").unwrap();
-            }
-            panic!(
-                "Several path candidates found for `{file}`, please resolve the ambiguity by \
-                renaming a file or folder. Candidates:\n{candidates}",
-            );
-        }
-    }
-}
-
-fn main() {
-    let srctree = std::env::var("srctree").unwrap();
-    let srctree = Path::new(&srctree);
-
-    let mut paths = fs::read_dir("rust/test/doctests/kernel")
-        .unwrap()
-        .map(|entry| entry.unwrap().path())
-        .collect::<Vec<_>>();
-
-    // Sort paths.
-    paths.sort();
-
-    let mut rust_tests = String::new();
-    let mut c_test_declarations = String::new();
-    let mut c_test_cases = String::new();
-    let mut body = String::new();
-    let mut last_file = String::new();
-    let mut number = 0;
-    let mut valid_paths: Vec<PathBuf> = Vec::new();
-    let mut real_path: &str = "";
-    for path in paths {
-        // The `name` follows the `{file}_{line}_{number}` pattern (see description in
-        // `scripts/rustdoc_test_builder.rs`). Discard the `number`.
-        let name = path.file_name().unwrap().to_str().unwrap().to_string();
-
-        // Extract the `file` and the `line`, discarding the `number`.
-        let (file, line) = name.rsplit_once('_').unwrap().0.rsplit_once('_').unwrap();
-
-        // Generate an ID sequence ("test number") for each one in the file.
-        if file == last_file {
-            number += 1;
-        } else {
-            number = 0;
-            last_file = file.to_string();
-
-            // Figure out the real path, only once per file.
-            real_path = find_real_path(srctree, &mut valid_paths, file);
-        }
-
-        // Generate a KUnit name (i.e. test name and C symbol) for this test.
-        //
-        // We avoid the line number, like `rustdoc` does, to make things slightly more stable for
-        // bisection purposes. However, to aid developers in mapping back what test failed, we will
-        // print a diagnostics line in the KTAP report.
-        let kunit_name = format!("rust_doctest_kernel_{file}_{number}");
-
-        // Read the test's text contents to dump it below.
-        body.clear();
-        File::open(path).unwrap().read_to_string(&mut body).unwrap();
-
-        // Calculate how many lines before `main` function (including the `main` function line).
-        let body_offset = body
-            .lines()
-            .take_while(|line| !line.contains("fn main() {"))
-            .count()
-            + 1;
-
-        use std::fmt::Write;
-        write!(
-            rust_tests,
-            r#"/// Generated `{name}` KUnit test case from a Rust documentation test.
-#[no_mangle]
-pub extern "C" fn {kunit_name}(__kunit_test: *mut ::kernel::bindings::kunit) {{
-    /// Overrides the usual [`assert!`] macro with one that calls KUnit instead.
-    #[allow(unused)]
-    macro_rules! assert {{
-        ($cond:expr $(,)?) => {{{{
-            ::kernel::kunit_assert!(
-                "{kunit_name}", "{real_path}", __DOCTEST_ANCHOR - {line}, $cond
-            );
-        }}}}
-    }}
-
-    /// Overrides the usual [`assert_eq!`] macro with one that calls KUnit instead.
-    #[allow(unused)]
-    macro_rules! assert_eq {{
-        ($left:expr, $right:expr $(,)?) => {{{{
-            ::kernel::kunit_assert_eq!(
-                "{kunit_name}", "{real_path}", __DOCTEST_ANCHOR - {line}, $left, $right
-            );
-        }}}}
-    }}
-
-    // Many tests need the prelude, so provide it by default.
-    #[allow(unused)]
-    use ::kernel::prelude::*;
-
-    // Unconditionally print the location of the original doctest (i.e. rather than the location in
-    // the generated file) so that developers can easily map the test back to the source code.
-    //
-    // This information is also printed when assertions fail, but this helps in the successful cases
-    // when the user is running KUnit manually, or when passing `--raw_output` to `kunit.py`.
-    //
-    // This follows the syntax for declaring test metadata in the proposed KTAP v2 spec, which may
-    // be used for the proposed KUnit test attributes API. Thus hopefully this will make migration
-    // easier later on.
-    ::kernel::kunit::info(fmt!("    # {kunit_name}.location: {real_path}:{line}\n"));
-
-    /// The anchor where the test code body starts.
-    #[allow(unused)]
-    static __DOCTEST_ANCHOR: i32 = ::core::line!() as i32 + {body_offset} + 1;
-    {{
-        {body}
-        main();
-    }}
-}}
-
-"#
-        )
-        .unwrap();
-
-        write!(c_test_declarations, "void {kunit_name}(struct kunit *);\n").unwrap();
-        write!(c_test_cases, "    KUNIT_CASE({kunit_name}),\n").unwrap();
-    }
-
-    let rust_tests = rust_tests.trim();
-    let c_test_declarations = c_test_declarations.trim();
-    let c_test_cases = c_test_cases.trim();
-
-    write!(
-        BufWriter::new(File::create("rust/doctests_kernel_generated.rs").unwrap()),
-        r#"//! `kernel` crate documentation tests.
-
-const __LOG_PREFIX: &[u8] = b"rust_doctests_kernel\0";
-
-{rust_tests}
-"#
-    )
-    .unwrap();
-
-    write!(
-        BufWriter::new(File::create("rust/doctests_kernel_generated_kunit.c").unwrap()),
-        r#"/*
- * `kernel` crate documentation tests.
- */
-
-#include <kunit/test.h>
-
-{c_test_declarations}
-
-static struct kunit_case test_cases[] = {{
-    {c_test_cases}
-    {{ }}
-}};
-
-static struct kunit_suite test_suite = {{
-    .name = "rust_doctests_kernel",
-    .test_cases = test_cases,
-}};
-
-kunit_test_suite(test_suite);
-
-MODULE_LICENSE("GPL");
-"#
-    )
-    .unwrap();
-}

base-commit: 0d97f2067c166eb495771fede9f7b73999c67f66
--
2.51.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling
  2025-10-09  9:14 ` Miguel Ojeda
@ 2025-10-18  7:18   ` David Gow
  0 siblings, 0 replies; 12+ messages in thread
From: David Gow @ 2025-10-18  7:18 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Guillaume Gomez, Brendan Higgins, Rae Moar, linux-kernel,
	rust-for-linux, kunit-dev, linux-kselftest

[-- Attachment #1: Type: text/plain, Size: 4317 bytes --]

Thanks Guillaume, Miguel.

This is looking good to me.

On Thu, 9 Oct 2025 at 17:14, Miguel Ojeda <ojeda@kernel.org> wrote:
>
> On Tue, 17 Jun 2025 16:00:33 +0200 Guillaume Gomez <guillaume1.gomez@gmail.com> wrote:
> >
> > The goal of this patch is to remove the use of 2 unstable
> > rustdoc features (`--no-run` and `--test-builder`) and replace it with a
> > stable feature: `--output-format=doctest`, which was added in
> > https://github.com/rust-lang/rust/pull/134531.
> >
> > Before this patch, the code was using very hacky methods in order to retrieve
> > doctests, modify them as needed and then concatenate all of them in one file.
> >
> > Now, with this new flag, it instead asks rustdoc to provide the doctests
> > code with their associated information such as file path and line number.
> >
> > Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
> > ---
>
> (Procedural bit: normally we provide a changelog between versions after
> this `---` line so that reviewers now what changed so far.)
>
> I finally took a look at this again, so I rebased it and got:
>
>     thread 'main' panicked at scripts/rustdoc_test_gen.rs:92:15:
>     No path candidates found for `rust_kernel_alloc_allocator.rs`.This is likely a bug in the build system, or some files went away while compiling.
>
> which brings me to the bigger point: the main reason to have the new
> output format is to avoid all these hacks, including the "find the real
> path back to the original file" hack here. More generally, to avoid the
> 2 scripts approach.
>
> So now we can finally get rid of all that and simplify. That is, we can
> just merge it all in a single script that reads the JSON and builds the
> result directly, since now we have everything we need (originally I
> needed the 2 scripts approach since `rustdoc` executed the test builder
> once per test so I had to somehow collect the results).
>
> i.e. no more hundreds of generated files/processes, just a simple pipe.

Yeah, this definitely seems the right way -- even though outputting
the generated rustdoc.json to a file is sometimes useful for debugging
it. But unless this actually breaks often (which hopefully it will
less frequently with this setup), just passing things around directly
in a pipe is neater.

>
> Anyway, just to check we had everything we needed, I did a quick try --
> please see the draft patch below.
>
> I gave it a go -- please see the draft patch below. The diff w.r.t. your
> patch would be something like +217 -341, i.e. we get rid of quite a lot
> of lines. I added as well some more context in the commit message, and
> put the right docs in the unified script. This also improves the sorting
> of the tests (it now follows the line number better).
>
> We still have to preserve the support for the old compilers, so what I
> think I will do is just have the new script separately, keeping the old
> ones as-is until we can remove them when we upgrade the minimum for e.g.
> the next Debian Stable.

Agreed: we definitely need to keep the old way running for a little
bit longer. Whether or not it makes sense to have both versions
running in parallel, or just delay this until we no-longer need the
old implementation is a matter of taste, I suspect. Having just one
"active" implementation at a time has a bunch of advantages
(particularly in the case there's some inconsistency in, e.g., test
names between compiler versions) -- it's probably what I'd go for
(mostly because it's easier), but I've no objection to having both
running in parallel, assuming it doesn't cause any more issues than
the code getting more complicated. And I personally don't feel that
having both is likely to be a big enough burden to be worth bumping
the compiler version alone.

>
> Cc'ing David and KUnit, since this is closer to getting ready -- please
> let me know if this raises alarms for anyone.
>

Apart from the fact that it breaks the older compilers, this looks
like a definite improvement. (I was a little worried about adding a
whole json parser in, but it's nicely small and self-contained, so I
actually quite like it.)

So I'm happy with this from the KUnit side, so long as we either don't
remove support for the old system/old compilers, or delay merging this
until we can drop support for those altogether.

Cheers,
-- David

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5281 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-10-18  7:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-17 14:00 [PATCH 1/1] Use new `--output-format=doctest` rustdoc command line flag to improve doctest handling Guillaume Gomez
2025-06-18  5:50 ` kernel test robot
2025-06-18  6:19   ` Miguel Ojeda
2025-10-09  9:14 ` Miguel Ojeda
2025-10-18  7:18   ` David Gow
  -- strict thread matches above, loose matches on Subject: below --
2025-02-28 13:29 Guillaume Gomez
2025-02-28 15:43 ` Miguel Ojeda
2025-02-28 16:33   ` Guillaume Gomez
2025-02-28 16:38   ` Guillaume Gomez
     [not found]   ` <CAAOQCfSEUnp8U3+6amWCd6+yPrAmPy6gsjJnQdrqmpER5md3kA@mail.gmail.com>
2025-02-28 16:48     ` Miguel Ojeda
2025-02-28 16:55       ` Guillaume Gomez
2025-02-28 13:25 Guillaume Gomez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).